Limitations of File Type Identification in Standard Rules


This article applies to:

  • WebMarshal

Question:

  • What are the limitations of file type identification in Standard Rules?
  • Why are files identified as the wrong type in Standard Rules?

Information:

WebMarshal offers File Type identification in both Standard Rules and Content Analysis Rules. 

  • Standard Rules are evaluated during file retrieval, based on partial files or header information. The amount of information available for analysis can vary from request to request.
  • Content Analysis Rules are evaluated once the entire file is available.

Although type identification is included in Standard Rules for backward compatibility, Trustwave recommends moving all File Type conditions to Content Analysis Rules.

This recommendation is particularly important for sub-types, such as encrypted variants of archive types. Sub-types are harder to recognize and are particularly likely to be incorrectly or incompletely identified in Standard Rules. In addition, some types listed as not detected in Standard Rules are actually sub-types, so a file of such a type might be detected as its base type instead (for instance, an Encrypted ZIP file might be detected as a ZIP file).
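The base type versus sub-type problem can be illustrated with a short Python sketch. This is not WebMarshal's identification logic; the function names are hypothetical and the sketch only assumes the standard ZIP layout, in which the entry list that carries each entry's encryption flag is stored at the end of the file. A check that sees only the first bytes can report "ZIP", while recognizing the encrypted sub-type needs the complete file.

```python
import io
import zipfile

ZIP_LOCAL_HEADER = b"PK\x03\x04"  # signature at the start of a typical ZIP file


def identify_from_prefix(prefix: bytes) -> str:
    """Hypothetical header-only check: sees just the first bytes of a download."""
    if prefix.startswith(ZIP_LOCAL_HEADER):
        return "ZIP"
    return "unknown"


def identify_from_whole_file(data: bytes) -> str:
    """Hypothetical full-content check: needs the complete file."""
    if not zipfile.is_zipfile(io.BytesIO(data)):
        return "unknown"
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        # zipfile reads the entry list from the end of the archive.
        # Bit 0 of an entry's general purpose flags marks it as encrypted.
        if any(info.flag_bits & 0x1 for info in archive.infolist()):
            return "Encrypted ZIP"
    return "ZIP"
```

For an encrypted archive, the first function would still return "ZIP", which mirrors the base type result described above.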

Also note that identifying uploaded files by type requires Content Analysis Rules in most cases, because the files must first be unpacked from a multipart web form.
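To show why an uploaded file cannot be typed from the start of the request, the following Python sketch unpacks a multipart form body with the standard library email parser. It is an illustration only, not how WebMarshal processes uploads; the function name extract_uploads is hypothetical. The file's own bytes appear only after the part boundary and part headers, so the first bytes of the request body are never the file signature.

```python
from email import message_from_bytes


def extract_uploads(content_type: str, body: bytes) -> dict[str, bytes]:
    """Return {filename: file bytes} for each file part of a multipart form body."""
    # The email parser needs the Content-Type header (with its boundary) up front.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    message = message_from_bytes(raw)
    uploads = {}
    if message.is_multipart():
        for part in message.get_payload():
            filename = part.get_filename()
            if filename:  # plain form fields carry no filename
                uploads[filename] = part.get_payload(decode=True)
    return uploads
```

Only after this unpacking step can each extracted payload be examined for its type.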

The following types are more likely to be incorrectly identified in Standard Rules. Because of the nature of the identification, other types could also be misidentified.

  • Word Document
  • Excel Document
  • PowerPoint Document
  • PowerPoint Slideshow
  • Microsoft Publisher
  • Microsoft Project
  • Outlook Mail Message
  • Visio
  • Microsoft Works
  • MSI Installer package
  • Word2007 Document with IRM
  • OLE Files
  • Encrypted Acrobat PDF document
  • Protected Acrobat PDF document
  • Self-extracting archive
  • Encrypted archive
  • Office 2007 files (all applications)
  • Open Office files
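One likely reason these types are hard to tell apart from header data is that many of them share a container format. Legacy Microsoft Office documents and MSI packages are OLE compound files, while Office 2007 and Open Office files are ZIP packages. The Python sketch below is an illustration under that assumption and not WebMarshal's method; the function name is hypothetical. It shows that the leading signature reveals only the container, not the application type.

```python
OLE_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE compound file header
ZIP_SIGNATURE = b"PK\x03\x04"  # ZIP local file header


def container_from_header(header: bytes) -> str:
    """Hypothetical check: name the container format seen in the first bytes."""
    if header.startswith(OLE_SIGNATURE):
        # Word, Excel, PowerPoint, and MSI files all begin with these bytes.
        return "OLE container"
    if header.startswith(ZIP_SIGNATURE):
        # Office 2007 and Open Office files look like plain ZIP archives here.
        return "ZIP container"
    return "unknown"
```

Distinguishing a Word document from an Excel document, for example, requires reading the streams or package parts inside the container, which generally means having more of the file than the header.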

Last Modified 4/1/2020.
https://support.trustwave.com/kb/KnowledgebaseArticle12839.aspx