Unresponsive or slow DNS lookups causing mail flow problems


This article applies to:

  • Trustwave MailMarshal (SEG)

Symptoms:

  • Unresponsive or slow DNS Blocklist lookups causing mail flow problems
  • Slow DNS Host Validation
  • MailMarshal Receiver is slow to respond

Causes:

DNS Blocklists (also called Real-time Blocklists or RBLs) such as SpamCop or Spamhaus can sometimes become unresponsive, slow, or completely unavailable. Even a delay of several seconds per message can quickly cause a severe backlog of email to build up in the MailMarshal Incoming queue.

If host validation (PTR lookup) is enabled, a slow local DNS server can also cause issues.

To check DNS response:

  1. Telnet into the MailMarshal server on port 25 (see article Q10879 for help with Telnet).
  2. If the 220 response is delayed, disable DNS blocklist rules and host validation, and then attempt Telnet again.
    • If the response is quicker, DNS is likely the problem.
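
The Telnet check above can also be timed from a script, which makes it easier to compare the delay before and after disabling DNS blocklist rules and host validation. The following is a minimal sketch, not a Trustwave tool: the function name `time_smtp_banner` and the host name in the usage comment are placeholders chosen for illustration.

```python
import socket
import time


def time_smtp_banner(host, port=25, timeout=60.0):
    """Connect to an SMTP listener and time how long the greeting takes.

    Returns (elapsed_seconds, banner_text). Raises socket.timeout if no
    greeting arrives within `timeout` seconds.
    """
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # The server speaks first; a healthy Receiver sends "220 ..." promptly.
        banner = sock.recv(1024)
    elapsed = time.monotonic() - start
    return elapsed, banner.decode("ascii", errors="replace").strip()


# Example usage (hypothetical host name):
#   elapsed, banner = time_smtp_banner("mailmarshal.example.com")
#   print(f"{elapsed:.2f}s  {banner}")
```

Run it once with the DNS features enabled and once with them disabled. A large drop in the elapsed time points to DNS as the likely cause.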

Information:

There are a number of steps you can take to minimize DNS-related delays.

Use a good DNS server

The first item to check is that MailMarshal is using a DNS server that is local, well connected and responsive.

  • MailMarshal uses the DNS servers you set in the Configurator or Management Console, NOT the servers you set in Windows networking settings.
    • Be cautious when specifying a third-party DNS server (such as Google Public DNS) with blocklists. Excessive requests can cause the third-party server to deny requests.
    • For example, Spamhaus specifically recommends against using Google DNS with their service.
  • MailMarshal has a DNS caching framework (implemented in the Controller service), which can help reduce repetitive DNS lookups.
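
To judge whether a particular DNS server is responsive for blocklist lookups, you can time a single query sent directly to it. The sketch below uses only the Python standard library and is an illustration under stated assumptions: the function names are hypothetical, the server address in the usage comment is a placeholder for the DNS server configured in MailMarshal, and 127.0.0.2 is the test address that DNS blocklists conventionally list. A blocklist query name is the IPv4 address with its octets reversed, followed by the blocklist zone.

```python
import ipaddress
import secrets
import socket
import struct
import time


def dnsbl_query_name(ip, zone):
    """Build the blocklist query name: reversed IPv4 octets plus the zone."""
    octets = ipaddress.IPv4Address(ip).packed
    return ".".join(str(b) for b in reversed(octets)) + "." + zone


def build_dns_query(name, qtype=1):
    """Build a minimal DNS query packet (qtype 1 = A record, class IN)."""
    txid = secrets.randbits(16)
    # Flags 0x0100: standard query with recursion desired; one question.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def time_dns_query(server, name, port=53, timeout=5.0):
    """Send one UDP query to `server`; return seconds taken, or None.

    None means no reply arrived within `timeout`, or the reply's
    transaction ID did not match the query.
    """
    query = build_dns_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        sock.sendto(query, (server, port))
        try:
            reply, _ = sock.recvfrom(512)
        except socket.timeout:
            return None
        elapsed = time.monotonic() - start
    return elapsed if reply[:2] == query[:2] else None


# Example usage (placeholder DNS server address):
#   name = dnsbl_query_name("127.0.0.2", "bl.spamcop.net")
#   print(name, time_dns_query("192.0.2.53", name))
```

This measures only the round trip for one uncached query and does not interpret the answer. Repeating it a few times against each candidate DNS server gives a rough comparison of responsiveness.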

RBL specific settings

You should generally use at most two RBL providers. The Marshal IP Reputation Service (maintained by Trustwave) should be one of them.

Notes:

Current versions of MailMarshal have a DNS caching framework, which can help reduce DNS lookups.

Retry settings

You can configure RBL retries for Engine-based lookups. However, since the Controller handles all lookups, these settings are generally not useful or required.

The MailMarshal XML files support several DNS Blocklist settings. To shorten the Real-Time Block List (RBL) timeout and keep mail flowing during an RBL failure or slowdown, use the "LookUpRetry" setting. It sets the number of retries when checking DNS Blocklist servers, after which MailMarshal times out and continues processing the message. The default is 4, which gives a timeout of 40 seconds. The timeout for each value is:

LookUpRetry = 1 ....gives 5 second timeout
LookUpRetry = 2 ....gives 10 second timeout
LookUpRetry = 3 ....gives 20 second timeout
LookUpRetry = 4 ....gives 40 second timeout

The timeout value quoted is per IP address, NOT simply per RBL. Because MailMarshal parses the message header for IP addresses, each message is typically checked against multiple IP addresses. A message with three IPs in the header, using LookUpRetry = 4, can take up to 120 seconds (3 x 40) to time out. To minimize mail delays caused by problematic RBL servers, use LookUpRetry = 1. If an RBL server times out, MailMarshal backs off from that RBL for one minute. During that time, it allows ALL email through without checking it against that database for the RBL rule.
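
The arithmetic above can be sketched as a small calculation. This is an illustration only: the function names are hypothetical, the timeout values are taken directly from the table in this article, and the result covers a single RBL rule.

```python
# Timeout in seconds for each LookUpRetry value, as listed in this article.
LOOKUP_TIMEOUT_SECONDS = {1: 5, 2: 10, 3: 20, 4: 40}


def lookup_timeout(lookup_retry):
    """Return the per-IP timeout in seconds for a LookUpRetry value (1 to 4)."""
    if lookup_retry not in LOOKUP_TIMEOUT_SECONDS:
        raise ValueError("this article only documents LookUpRetry values 1 to 4")
    return LOOKUP_TIMEOUT_SECONDS[lookup_retry]


def worst_case_delay(lookup_retry, ip_count):
    """Worst-case delay for one RBL rule: the timeout applies per IP address."""
    return lookup_timeout(lookup_retry) * ip_count


# Three IPs in the header with the default LookUpRetry of 4:
print(worst_case_delay(4, 3))  # 120
# The same message with LookUpRetry = 1:
print(worst_case_delay(1, 3))  # 15
```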


Open one of your DNS Blocklist XML files, for example SpamCop.xml or SpamHaus.xml (by default located in the Config folder under the MailMarshal install folder), and find the following entry:

<Evaluations>
<Eval Name="SpamCop" Enabled="true" Score="60" Type="DNSLookup" Description="IP Listed on SpamCop" Data="bl.spamcop.net" SkipFirstIPs="0" Except="DNSBlacklistExclusions" />
</Evaluations> 

To set a LookUpRetry value:

  1. Edit the Eval element to add a LookUpRetry attribute:
    <Evaluations>
    <Eval Name="SpamCop" Enabled="true" Score="60" Type="DNSLookup" Description="IP Listed on SpamCop" LookUpRetry="1" Data="bl.spamcop.net" SkipFirstIPs="0" Except="DNSBlacklistExclusions" />
    </Evaluations>

    The position of the LookUpRetry attribute relative to the other attributes is not important.

  2. Save the updated XML file and reload the rules on the server. 
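
If you maintain several blocklist XML files, the edit in step 1 can be scripted. The following is a rough sketch using Python's standard `xml.etree.ElementTree` module, not a Trustwave-supplied tool. The function name `set_lookup_retry` is hypothetical, and the sketch assumes the file content parses as well-formed XML with Eval elements like the one shown above. ElementTree does not preserve comments or the original whitespace layout, so keep a backup of the original file and review the output before using it.

```python
import xml.etree.ElementTree as ET


def set_lookup_retry(xml_text, retries):
    """Return xml_text with LookUpRetry set on every DNSLookup Eval element."""
    root = ET.fromstring(xml_text)
    for eval_element in root.iter("Eval"):
        # Only touch DNS blocklist evaluations; leave other types unchanged.
        if eval_element.get("Type") == "DNSLookup":
            eval_element.set("LookUpRetry", str(retries))
    return ET.tostring(root, encoding="unicode")


# Example using the fragment shown in this article:
sample = (
    '<Evaluations>'
    '<Eval Name="SpamCop" Enabled="true" Score="60" Type="DNSLookup" '
    'Description="IP Listed on SpamCop" Data="bl.spamcop.net" '
    'SkipFirstIPs="0" Except="DNSBlacklistExclusions" />'
    '</Evaluations>'
)
print(set_lookup_retry(sample, 1))
```

After writing the result back to the file, reload the rules on the server as described in step 2.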

 

This article was previously published as:
NETIQKB41092

Last Modified 5/1/2020.
https://support.trustwave.com/kb/KnowledgebaseArticle10789.aspx