What is XMagic?

XMagic is an alternate implementation of file typing based on Magic numbers or tests. XMagic was inspired by the venerable file(1) command.

Where can I get the current XMagic file?

The current XMagic file is available here.

Why should I use file typing via XMagic?

Baselining a system before deploying it is a good security practice because the information collected can be used to determine which directories and files have changed since deployment. If no prior baseline exists and the system is compromised, determining what changed becomes much harder. One approach is to construct and baseline a virgin system that approximates the original system. A snapshot of the compromised system can then be compared to the virgin baseline. In theory, this technique could reduce the number of files that must be reviewed to a very small set. In practice, it is good for eliminating system files that typically do not and should not change, but the number of files that remain is usually quite large. So large, in fact, that some additional technique is needed to break the list of unknown/suspicious files into manageable, prioritized subsets. This is where file typing via XMagic can help. By sorting the list of unknown/suspicious files by type, the practitioner can focus on groups that are relevant to the investigation, such as scripts and executables.

In a workbench environment, file type information can be collected using existing tools (e.g., the standard file(1) command). In an operational environment, it's more efficient and less invasive (think timestamps) to collect all desired attributes, including file type, in a single pass. This drove the decision to add Magic support to FTimes.

What is the format of XMagic?

The best way to learn about XMagic is to start by reading the magic(5) man page, which is available here. Another good resource is the Magic file that ships with file(1); study it to understand how Magic works in practice. The Magic file that shipped with file(1) 4.17 can be found here. After you've done that, take a look at the next question.

What are the differences between XMagic and Magic?

Since XMagic support was built from scratch, it requires a modified form of the Magic file that ships with file(1). The most significant differences between XMagic and Magic are as follows:

  • The test operator/value pair has been split into separate fields.
  • XMagic supports regular expression Magic via Perl Compatible Regular Expressions (PCRE). The associated test operators are as follows:
       =~  The expression must match
       !~  The expression must not match
    
  • XMagic supports block-based entropy calculations. The associated test types are: row_entropy_1 and row_entropy_2
  • XMagic supports block-based average calculations. The associated test types are: row_average_1 and row_average_2
  • XMagic supports block-based percent calculations for various ctype(3) character classes. The associated test types are: percent_ctype_alnum, percent_ctype_alpha, percent_ctype_ascii, percent_ctype_cntrl, percent_ctype_digit, percent_ctype_lower, percent_ctype_print, percent_ctype_punct, percent_ctype_space, and percent_ctype_upper
  • XMagic supports block-based hash calculations. The associated test types are: md5 and sha1
  • XMagic also supports several different test operators for all of its block-based tests. These operators are listed and described here:
       []  (greater than or equal to) and (less than or equal to)
       [)  (greater than or equal to) and (less than)
       (]  (greater than) and (less than or equal to)
       ()  (greater than) and (less than)
       ][  (less than or equal to) or (greater than or equal to)
       ](  (less than or equal to) or (greater than)
       )[  (less than) or (greater than or equal to)
       )(  (less than) or (greater than)
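
The eight operators above read like interval notation: a bracket includes its endpoint, a parenthesis excludes it, and the reversed forms select values that fall outside the range. The following Python sketch shows one way that logic could be expressed. The function name range_test and the way the two bounds are passed in are assumptions made for illustration; they are not XMagic's parser or its value syntax.

```python
import operator

# "Inside" operators: (lower-bound test) AND (upper-bound test).
_INSIDE = {
    "[]": (operator.ge, operator.le),
    "[)": (operator.ge, operator.lt),
    "(]": (operator.gt, operator.le),
    "()": (operator.gt, operator.lt),
}

# "Outside" operators: (lower-bound test) OR (upper-bound test).
_OUTSIDE = {
    "][": (operator.le, operator.ge),
    "](": (operator.le, operator.gt),
    ")[": (operator.lt, operator.ge),
    ")(": (operator.lt, operator.gt),
}


def range_test(op, value, low, high):
    """Hypothetical helper: apply one range operator to a computed value."""
    if op in _INSIDE:
        low_cmp, high_cmp = _INSIDE[op]
        return low_cmp(value, low) and high_cmp(value, high)
    if op in _OUTSIDE:
        low_cmp, high_cmp = _OUTSIDE[op]
        return low_cmp(value, low) or high_cmp(value, high)
    raise ValueError("unknown range operator: %r" % (op,))
```

For example, range_test("[)", 7.5, 7.5, 8.0) is true because the lower bound is inclusive, while range_test("()", 7.5, 7.5, 8.0) is false.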
    

Currently, several file(1) types/operators are not supported by XMagic. Some of the unsupported types (e.g., string/[Bbc] and search/<number>) are not necessary because equivalent Magic incantations can be crafted using regular expressions. However, we are planning to implement support for missing types/operators where it makes sense to do so.

While the test operator/value difference is minor, it removes ambiguities (e.g., '!<arch>'), simplifies parser code, and allows operators to exceed file(1)'s one-character length restriction. When the 'x' operator has been specified (meaning there is no test to perform), a single hyphen, '-', is inserted in the value field to act as a place holder. The following example shows where to insert the implied test operator. Note that if a test operator was not supplied in the standard Magic description, the implied operator is '='.

   Magic:   0   string         \037\235              compress'd data

  XMagic:   0   string     =   \037\235              compress'd data

This example shows where to insert the place holder when the test value is to be ignored:

   Magic:   >6  byte       x                         type %c

  XMagic:   >6  byte       x    -                    type %c
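
The two conversions above can be sketched as a small Python helper that splits a Magic test field into separate operator and value fields. This is only an illustration: the name split_test, the set of one-character operators, and the rule that any leading operator character is taken as the operator are assumptions, not the behavior of any real converter.

```python
from typing import Tuple

# Assumed set of one-character Magic test operators (illustration only).
OPERATORS = ("=", "!", "<", ">", "&", "^")


def split_test(test: str) -> Tuple[str, str]:
    """Hypothetical helper: return (operator, value) for an XMagic line."""
    if test == "x":
        # No test to perform: a hyphen fills the value field.
        return ("x", "-")
    if test[:1] in OPERATORS:
        # An explicit operator was supplied; peel it off the value.
        return (test[0], test[1:])
    # No operator supplied: the implied operator is '='.
    return ("=", test)
```

Under these assumptions, split_test("\\037\\235") yields ('=', '\\037\\235') and split_test("x") yields ('x', '-'). Note that split_test("!<arch>") yields ('!', '<arch>'), which is exactly the kind of ambiguity that a separate operator field avoids.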

The next two examples show how to convert a series of string/[Bbc] tests to equivalent regexp tests:

   Magic:   0   string/B   =    \=pod\n              Perl POD document
   Magic:   0   string/B   =    \n\=pod\n            Perl POD document
   Magic:   0   string/B   =    \=head1\             Perl POD document
   Magic:   0   string/B   =    \n\=head1\           Perl POD document
   Magic:   0   string/B   =    \=head2\             Perl POD document
   Magic:   0   string/B   =    \n\=head2\           Perl POD document

  XMagic:   0   regexp     =~   ^\n?=(?:pod\n|head[12])   Perl POD document

   Magic:   0   string/cB  =    \<DOCTYPE\ html      HTML document text
   Magic:   0   string/cb  =    \<head               HTML document text
   Magic:   0   string/cb  =    \<title              HTML document text
   Magic:   0   string/cb  =    \<html               HTML document text

  XMagic:   0   regexp     =~   (?i)^\s*<(?:DOCTYPE[\x20\t]+html|head|html|title)   HTML document text
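
To see how a single regular expression can stand in for a series of string tests, here is a short Python sketch with patterns modeled on the two XMagic tests above. Python's re module is used as a stand-in for PCRE, on the assumption that these particular patterns behave the same way in both; the HTML alternation is written as a non-capturing group. The classify function is hypothetical and is not how FTimes evaluates XMagic.

```python
import re

# Patterns modeled on the XMagic regexp tests (bytes, anchored at offset 0).
POD = re.compile(rb"^\n?=(?:pod\n|head[12])")
HTML = re.compile(rb"(?i)^\s*<(?:DOCTYPE[\x20\t]+html|head|html|title)")


def classify(block: bytes):
    """Hypothetical helper: return a description, or None if nothing matches."""
    if POD.search(block):
        return "Perl POD document"
    if HTML.search(block):
        return "HTML document text"
    return None
```

With these patterns, classify(b"\n=head1 NAME") returns the POD description and classify(b"  <HTML>") returns the HTML description, while a block such as b"plain text" returns None.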

This example shows how to convert a search/<number> test to an equivalent regexp:<number> test (Note: the current maximum <number> for XMagic is 128):

   Magic:   0   search/20  =    foo                  The venerable %s document

  XMagic:   0   regexp:20  =~   foo                  The venerable %s document

This example shows how to use the block-based test types to harvest various topographical information:


  XMagic:   0    byte                     x  -  512
  XMagic:   >&0  row_entropy_1:512        x  -  \b|%f
  XMagic:   >&0  row_entropy_2:512        x  -  \b|%f
  XMagic:   >&0  row_average_1:512        x  -  \b|%f
  XMagic:   >&0  row_average_2:512        x  -  \b|%f
  XMagic:   >&0  percent_ctype_alnum:512  x  -  \b|%f
  XMagic:   >&0  percent_ctype_alpha:512  x  -  \b|%f
  XMagic:   >&0  percent_ctype_ascii:512  x  -  \b|%f
  XMagic:   >&0  percent_ctype_cntrl:512  x  -  \b|%f
  XMagic:   >&0  percent_ctype_digit:512  x  -  \b|%f
  XMagic:   >&0  percent_ctype_lower:512  x  -  \b|%f
  XMagic:   >&0  percent_ctype_print:512  x  -  \b|%f
  XMagic:   >&0  percent_ctype_punct:512  x  -  \b|%f
  XMagic:   >&0  percent_ctype_space:512  x  -  \b|%f
  XMagic:   >&0  percent_ctype_upper:512  x  -  \b|%f
  XMagic:   >&0  sha1:512                 x  -  \b|%s
  XMagic:   >&0  md5:512                  x  -  \b|%s
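
For readers unfamiliar with these measurements, the following Python sketch computes comparable values for the first 512 bytes of a buffer. It is a rough illustration only. It assumes that entropy means Shannon entropy over single-byte values, that average means the mean byte value, and that percentages run from 0 to 100; it makes no attempt to distinguish the _1 and _2 variants. All function names are hypothetical.

```python
import hashlib
import math
from collections import Counter


def block_entropy(block: bytes) -> float:
    """Shannon entropy in bits per byte (assumed definition)."""
    if not block:
        return 0.0
    total = len(block)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(block).values())


def block_average(block: bytes) -> float:
    """Mean byte value (assumed definition)."""
    return sum(block) / len(block) if block else 0.0


def block_percent(block: bytes, predicate) -> float:
    """Percentage of bytes for which predicate(byte) is true."""
    if not block:
        return 0.0
    return 100.0 * sum(1 for b in block if predicate(b)) / len(block)


def is_digit(b: int) -> bool:
    return 0x30 <= b <= 0x39          # '0'..'9', as isdigit(3) in the C locale


def is_print(b: int) -> bool:
    return 0x20 <= b <= 0x7E          # as isprint(3) in the C locale


def profile(data: bytes, size: int = 512) -> dict:
    """Collect a few per-block attributes for the first `size` bytes."""
    block = data[:size]
    return {
        "entropy": block_entropy(block),
        "average": block_average(block),
        "percent_digit": block_percent(block, is_digit),
        "percent_print": block_percent(block, is_print),
        "md5": hashlib.md5(block).hexdigest(),
        "sha1": hashlib.sha1(block).hexdigest(),
    }
```

A block of identical bytes has an entropy of zero, while a block in which all 256 byte values appear equally often reaches the maximum of eight bits per byte. That spread is what makes entropy useful for spotting compressed or encrypted data.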

XMagic does not support signed comparisons -- all integer comparisons are unsigned. As such, the parser does not recognize the 'u' prefix on the data type field. An example of this prefix can be found in the tcpdump Magic.

   Magic:   0   ubelong         0xa1b2c3d4           tcpdump capture file (big-endian)

  XMagic:   0   belong     =    0xa1b2c3d4           tcpdump capture file (big-endian)
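
The reason signedness matters for a value like this can be shown with a short Python sketch. The helper name read_belong is hypothetical; the sketch simply reads four bytes in big-endian order and interprets them as an unsigned integer, which is the only interpretation XMagic uses.

```python
import struct

TCPDUMP_MAGIC = 0xA1B2C3D4


def read_belong(data: bytes, offset: int = 0) -> int:
    """Hypothetical helper: read a big-endian 32-bit unsigned integer."""
    (value,) = struct.unpack_from(">I", data, offset)
    return value


header = bytes.fromhex("a1b2c3d4")

# Unsigned interpretation compares equal to the Magic value.
unsigned_value = read_belong(header)

# A signed interpretation of the same bytes is negative, so a naive
# signed comparison against 0xa1b2c3d4 would fail.
(signed_value,) = struct.unpack(">i", header)
```

Here unsigned_value equals TCPDUMP_MAGIC, whereas signed_value is negative because the high bit of the first byte is set.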

How do I use XMagic?

XMagic is supported in both map and dig modes of operation, but the usage differs slightly between them.

  • To use XMagic in mapauto mode, place the xmagic file in the current working directory or /usr/local/ftimes/etc/xmagic (c:\ftimes\etc\xmagic for MS Windows).
  • To use XMagic in map{lean,full} modes, you can specify an alternate location using the MagicFile control. To change the predefined XMagic location, edit the value for XMAGIC_DEFAULT_LOCATION in xmagic.h, and recompile.
  • To use XMagic in dig{auto,lean,full} modes, you must assign the path of the XMagic file to the DigStringXMagic control. Note that XMagic is not strictly limited to block typing in dig mode. It can also be used to harvest various topographical information and enumerate well-known structures.

Why does XMagic sometimes report different results than file(1)?

Except for special files, FTimes does not support any built-in Magic. Because of this, FTimes will often report 'unknown' where file(1) reports something like 'ASCII English text'. The main reason for this is that built-in Magic is not based on Magic tests. Rather, it's based on logic that scans the input buffer (character by character) and attempts to make an educated guess as to what the underlying data "looks" like. Whether to support built-in Magic is currently an open question.
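
To give a feel for the difference, the following Python sketch shows the general shape of a character-by-character guess. It is a deliberately simplified stand-in and not file(1)'s actual logic: the function names, the set of accepted control characters, and the fallback to 'unknown' are assumptions for illustration.

```python
def looks_like_ascii_text(buf: bytes) -> bool:
    """Scan every byte and decide whether the buffer "looks" like text."""
    if not buf:
        return False
    for b in buf:
        if b in (0x09, 0x0A, 0x0D):   # tab, newline, carriage return
            continue
        if 0x20 <= b <= 0x7E:         # printable ASCII
            continue
        return False                  # anything else rules out plain text
    return True


def describe(buf: bytes, magic_match=None, use_builtin=False) -> str:
    """Hypothetical: prefer a Magic test result, then an optional guess."""
    if magic_match is not None:
        return magic_match
    if use_builtin and looks_like_ascii_text(buf):
        return "ASCII text"
    return "unknown"
```

With use_builtin left off, describe(b"hello world\n") returns 'unknown', which mirrors what happens when no Magic test matches and no built-in guess is available.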

What is the future of XMagic?

Currently, XMagic is in a transitional state. We're looking for a better way to standardize existing tests. We're also looking for a way to abbreviate Magic descriptions. Two ideas that have come up are a unique numbering scheme and a MIB-like structure. In either case, we think that there should be a unique mapping between a particular Magic test and its identifier.

Along those same lines, it would be nice if the user could specify the level of detail that will be provided upon a match. For example, sometimes it would be sufficient to know that a file (e.g., aliases.db) is a "Berkeley DB Hash file" instead of "Berkeley DB Hash file (Version 2, Little Endian, Bucket Size 8192, Bucket Shift 13, Directory Size 256, Segment Size 256, Segment Shift 8, Overflow Point 1, Last Freed 2, Max Bucket 1, High Mask 0x3, Low Mask 0x1, Fill Factor 65536, Number of Keys 20)".

Another planned change is to move to URL encoding for string values. The current format allows for escapes (e.g., '\ ' for a space) and octal character representations. This makes parsing strings more complex than it needs to be, and has led to broken Magic incantations.
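
As a rough idea of what URL-encoded string values could look like, here is a Python sketch built on the standard urllib.parse module. Which characters would actually be encoded is not specified above; encoding everything outside the unreserved set is an assumption, and the helper names are hypothetical.

```python
from urllib.parse import quote, unquote_to_bytes


def encode_value(raw: bytes) -> str:
    """Percent-encode every byte outside the URL unreserved set."""
    return quote(raw, safe="")


def decode_value(text: str) -> bytes:
    """Recover the original bytes from a percent-encoded value."""
    return unquote_to_bytes(text)
```

Under this scheme the compress magic '\037\235' would be written as '%1F%9D', and a value containing a space needs no backslash escape because the space is written as '%20'. Decoding is a single, unambiguous pass.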

Copyright 2000-2014 The FTimes Project, All Rights Reserved.