This article applies to:
- Trustwave MailMarshal (SEG) 7.1 and above
- Blended Threat analysis
Question:
- What URLs does MailMarshal rewrite for Blended Threat analysis?
- What URLs does MailMarshal exclude from rewriting for Blended Threat analysis?
Information:
In SEG 7.1 and above, Blended Threat analysis is implemented by rewriting web URLs found in email messages. When a user clicks a rewritten URL, the URL is submitted and scanned in real time by the Trustwave Blended Threat Service. To rewrite URLs, use the Rule Action "Rewrite URLs in the message for Blended Threat Scanning."
This article provides detailed information about what URLs are included and excluded from rewriting.
Email subject
A URL in the subject of an email message will be rewritten following the same rules as for the plain text body (see below).
- Note: Subject lines may be truncated to 255 characters by email clients, in accordance with RFCs, and the truncated result may not be usable as a link.
Plain text email body
In the plain text body of an email message, text that is likely to be converted to a clickable link by an email client will be rewritten.
- Text such as http://some.domain.com
- Text such as www.domain.com
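As a rough illustration of the kind of text described above, the following Python sketch finds strings that an email client would probably turn into links. The pattern and the function name `find_link_like_text` are assumptions made for this example; the detection logic MailMarshal actually uses is not documented here and is likely more thorough.

```python
import re

# Assumed pattern: an http(s):// prefix or a bare "www." prefix, followed by
# characters up to the next whitespace, angle bracket, or double quote.
LINK_LIKE = re.compile(r"\b(?:https?://|www\.)[^\s<>\"]+", re.IGNORECASE)


def find_link_like_text(body: str) -> list[str]:
    """Return substrings of a plain text body that a client might linkify."""
    # Trailing sentence punctuation is trimmed, since clients usually
    # leave it out of the generated link.
    return [m.group(0).rstrip(".,;:!?)") for m in LINK_LIKE.finditer(body)]
```

For example, `find_link_like_text("See http://some.domain.com and www.domain.com.")` picks out both strings without the final period.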
HTML email body
In the HTML body of an email message the following are rewritten:
- Well formatted HTML links and image map links
- Malformed HTML links (such as unclosed tags or non-quoted URLs), and links outside the HTML body tag.
- Plain text URL strings (not HTML links) are converted to clickable links using the rewrite service.
  - MailMarshal takes this action because many email clients convert plain text URL strings to clickable links on the fly, even within HTML bodies.
RTF email body
The RTF body of a message (if any) is not scanned by the URL rewriting logic.
Attachments
Most attachments are not scanned by the URL rewriting logic. For instance, Microsoft Office documents and PDFs are not scanned.
Attached email messages are treated in the same way as parent messages. The subject, plain text, and HTML bodies of attached messages are checked and URLs are rewritten.
Global Exclusions
The following URLs are never rewritten:
- Reserved Top Level Domain (TLD) entries such as .test (reserved by RFC 2606) and .local
- Top Level Domains not currently assigned by IANA.
- Invalid domains or TLDs.
- Domain entries with double byte characters in the name.
- FTP URLs
  - Only HTTP and HTTPS URLs are rewritten and submitted.
- IP addresses with no path part (these are not converted to clickable links by email clients).
- IP address URLs in private network ranges as defined in RFC 1918 and RFC 4193.
  - 10.0.0.0/8
  - 172.16.0.0/12
  - 192.168.0.0/16
  - fc00::/7
- Localhost and loopback addresses (localhost or 127.0.0.1 or ::1).
- Email addresses or mailto: links.
- MailMarshal Local Domains.
  - The exact local domain (localdomain.com).
  - The local domain with WWW. prepended (www.localdomain.com).
  - HTTP and HTTPS variants of the above.
- URLs containing a username and password.
- A list of exclusions maintained by Trustwave and updated through the SpamCensor automatic update process.
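Several of the exclusions above can be checked mechanically. The Python sketch below is only an approximation for testing your own expectations, not the product's implementation. The function name `is_globally_excluded` is hypothetical, the function expects a URL with an explicit scheme, and it treats an empty path or a bare "/" as "no path part", which is an assumption. TLD validity, double byte characters, Local Domains, and the Trustwave-maintained list are not modeled.

```python
import ipaddress
from urllib.parse import urlsplit

# Private ranges listed in this article (RFC 1918 and RFC 4193).
PRIVATE_NETS = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "fc00::/7")
]


def is_globally_excluded(url: str) -> bool:
    """Approximate some of the global exclusion rules for an absolute URL."""
    parts = urlsplit(url)
    # Only HTTP and HTTPS are rewritten; FTP, mailto:, and others are skipped.
    if parts.scheme.lower() not in ("http", "https"):
        return True
    # URLs that carry both a username and a password.
    if parts.username is not None and parts.password is not None:
        return True
    host = parts.hostname or ""
    if host == "localhost":
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Host is a domain name; domain-based rules are not modeled here.
        return False
    # Note: is_loopback covers all of 127.0.0.0/8 as well as ::1.
    if addr.is_loopback or any(addr in net for net in PRIVATE_NETS):
        return True
    # Assumption: an IP address with an empty path counts as "no path part".
    return parts.path in ("", "/")
```

With these assumptions, `http://10.1.2.3/path` and `ftp://example.com/file` are reported as excluded, while `http://www.example.com/page` is not.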
User Maintained Exclusion List
You can exclude additional domains by editing an exclusion list in the MailMarshal Configurator (see Tools > MailMarshal Properties > Blended Threats).
- In this list you can enter domain names (but not path parts or protocols). You can use the * wildcard one or more times anywhere in an entry, but no other wildcards or substitutions are supported. Note that the entries are domain names, not URLs.
- Entering a domain name excludes all items in that domain.
  - Entering example.com will exclude example.com/path/file.html
- A wildcard * at the end of an entry matches that server name under any domain suffix.
  - Entering www.example* will exclude www.example.com and www.example.co.uk
- Entering *.example.com does not exclude the root domain example.com
  - If you want to exclude a root domain and all its subdomains, you must make two entries.
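To preview how a set of entries is likely to behave, the wildcard rules above can be approximated in a few lines of Python. This is a sketch under assumptions, not the matching code used by MailMarshal: the helper names `entry_to_regex` and `is_user_excluded` are hypothetical, matching is assumed to be case-insensitive, and each entry is assumed to match the whole host name (which is consistent with the need for two entries to cover a root domain and its subdomains).

```python
import re
from urllib.parse import urlsplit


def entry_to_regex(entry: str) -> re.Pattern:
    """Turn an exclusion entry into a regex; '*' matches any run of characters."""
    # Everything except '*' is escaped so that dots are matched literally.
    pieces = [re.escape(piece) for piece in entry.split("*")]
    return re.compile(".*".join(pieces), re.IGNORECASE)


def is_user_excluded(url: str, entries: list[str]) -> bool:
    """Return True if the URL's host name fully matches any exclusion entry."""
    # Entries are domain names, so only the host part of the URL is compared.
    absolute = url if "://" in url else "http://" + url
    host = urlsplit(absolute).hostname or ""
    return any(entry_to_regex(e).fullmatch(host) for e in entries)
```

Under these assumptions, the entry `*.example.com` matches `mail.example.com` but not `example.com`, so covering both requires the two entries `example.com` and `*.example.com`.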
Inclusions
The following types of URLs are rewritten for scanning:
- URLs that include a port (www.domain.com:80).
- URLs in IPv6 format (http://[3ffe:2a00:100:7031::1]).
- URLs prefixed with HTTP:// or HTTPS:// whose host is obfuscated (for instance, an IPv4 address written as a single integer or in octal notation).
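To show what an obfuscated IPv4 address looks like in practice, the Python sketch below decodes two common forms back to dotted-decimal notation. It illustrates the obfuscation only; how MailMarshal recognizes such URLs is not described in this article, and the function name `decode_obfuscated_ipv4` is hypothetical. Only the single-integer form and the dotted form with leading-zero (octal) parts are handled.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlsplit


def decode_obfuscated_ipv4(url: str) -> Optional[str]:
    """Return the dotted-decimal address hidden in the host, or None."""
    host = urlsplit(url).hostname or ""
    try:
        if host.isascii() and host.isdigit():
            # Single integer form, read as decimal: 134744072 is 8.8.8.8
            return str(ipaddress.IPv4Address(int(host)))
        octets = host.split(".")
        is_numeric = len(octets) == 4 and all(
            o.isascii() and o.isdigit() for o in octets
        )
        has_octal = any(len(o) > 1 and o[0] == "0" for o in octets)
        if is_numeric and has_octal:
            # Parts with a leading zero are read as octal: 010 is 8
            values = [
                int(o, 8) if len(o) > 1 and o[0] == "0" else int(o)
                for o in octets
            ]
            return str(ipaddress.IPv4Address(".".join(map(str, values))))
    except ValueError:
        # Out-of-range values or invalid octal digits.
        return None
    return None
```

For instance, both `http://134744072/` and `http://010.010.010.010/` decode to `8.8.8.8`, while an ordinary host such as `www.example.com` returns `None`.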
Known issues and constraints
- Text that can be interpreted either as a file name with an extension or as a URL (such as Filename.INFO) will be rewritten.