ftguess

decalage2 edited this page Mar 26, 2026 · 1 revision

ftguess is a Python module to determine the type of a file based on its contents (not its extension or filename). It can be used as a command-line tool or as a Python library.

It is part of the python-oletools package.

Main Features

  • Identifies the file type and container format from file content (magic bytes, internal structure)
  • Recognises OLE-based formats (Word 97-2003, Excel 97-2003, PowerPoint 97-2003, MSI, ...)
  • Recognises OpenXML/ZIP-based formats (Word 2007+, Excel 2007+, PowerPoint 2007+, XPS, ...)
  • Recognises RTF, OneNote, PNG, Windows PE executables, and generic ZIP archives
  • Reports the application, container type, file type, MIME content-type, and PRONOM PUID
  • For OLE files, reports the root CLSID and its known name
  • Supports scanning multiple files, recursive directory traversal, and files inside zip archives
  • Can be used as a Python library from your own applications
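
To make the idea of content-based detection concrete, here is a minimal sketch that checks a handful of well-known file signatures using only the standard library. It is an illustration of the general technique, not ftguess's actual logic, which also inspects internal structure (for example, to tell a .docx from a generic ZIP). The names `SIGNATURES` and `sniff_container` are hypothetical.

```python
# Illustrative only: a few well-known file signatures (magic bytes).
# This is NOT how ftguess is implemented; names here are hypothetical.
SIGNATURES = [
    (b"\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1", "OLE/CFB"),
    (b"PK\x03\x04", "ZIP (possibly OpenXML)"),
    (b"{\\rtf", "RTF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"MZ", "Windows PE"),
]


def sniff_container(data):
    """Return a coarse container label based on the first bytes of data."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "Unknown"
```

A renamed file is still recognised, because the extension is never consulted: the bytes of a Word 97-2003 document saved as `report.txt` still start with the OLE signature.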

Supported File Types

| File Type | Extensions |
| --- | --- |
| MS Word 6-7 | .doc, .dot |
| MS Word 97-2003 | .doc, .dot |
| MS Word 2007+ Document | .docx |
| MS Word 2007+ Macro-Enabled Document | .docm |
| MS Word 2007+ Template | .dotx |
| MS Word 2007+ Macro-Enabled Template | .dotm |
| MS Excel 5.0/95 | .xls, .xlt, .xla |
| MS Excel 97-2003 | .xls, .xlt, .xla |
| MS Excel 2007+ Workbook | .xlsx |
| MS Excel 2007+ Macro-Enabled Workbook | .xlsm |
| MS Excel 2007+ Binary Workbook | .xlsb |
| MS Excel 2007+ Template | .xltx |
| MS Excel 2007+ Macro-Enabled Template | .xltm |
| MS Excel 2007+ Macro-Enabled Add-in | .xlam |
| MS PowerPoint 97-2003 | .ppt, .pps, .pot |
| MS PowerPoint 2007+ Presentation | .pptx |
| MS PowerPoint 2007+ Slideshow | .ppsx |
| MS PowerPoint 2007+ Macro-Enabled Presentation | .pptm |
| MS PowerPoint 2007+ Macro-Enabled Slideshow | .ppsm |
| MS OneNote | .one |
| Windows Installer | .msi |
| XPS | .xps |
| RTF | .rtf, .doc |
| PNG | .png |
| Windows PE Executable / DLL | .exe, .dll, .sys, .scr |
| Generic ZIP Archive | .zip |
| Generic OLE/CFB File | |

Usage

ftguess [options] <filename> [filename2 ...]

Options

-r                  find files recursively in subdirectories
-z PASSWORD         if the file is a zip archive, open first file from it
                    using the provided password
-f PATTERN          if the file is a zip archive, file(s) to open within it
                    (wildcards * and ? supported, default: *)
-l LEVEL            logging level: debug/info/warning/error/critical
                    (default: warning)

Example

$ ftguess sample.docx

ftguess 0.60.2 on Python 3.10.0 - http://decalage.info/python/oletools
THIS IS WORK IN PROGRESS - Check updates regularly!

File       : sample.docx
File Type  : MS Word 2007+ Document
Description: MS Word 2007+ Document (.docx)
Application: MS Word
Container  : OpenXML
Content-type(s) :
PUID       : None

How to use ftguess in your Python applications

Import FileTypeGuesser and pass either a file path or raw bytes:

from oletools.ftguess import FileTypeGuesser, FTYPE, CONTAINER, APP

# From a file path:
ftg = FileTypeGuesser(filepath='document.docx')
ftg.close()  # always close when done

# From bytes in memory:
with open('document.docx', 'rb') as f:
    data = f.read()
ftg = FileTypeGuesser(data=data)
ftg.close()  # always close when done

Key attributes on the FileTypeGuesser object

| Attribute | Description |
| --- | --- |
| ftg.ftype | The matched FType_* class (see constants below) |
| ftg.filetype | String constant from FTYPE, e.g. 'Word2007_DOCX' |
| ftg.container | String constant from CONTAINER, e.g. 'OpenXML' |
| ftg.application | String constant from APP, e.g. 'MS Word' |
| ftg.root_clsid | Root CLSID for OLE files (string), or None |
| ftg.root_clsid_name | Human-readable name for the root CLSID, or None |
| ftg.main_part_content_type | Content-type of the main part for OpenXML files, or None |
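
As a possible way to work with these attributes, the sketch below collects them into a plain dict, which is convenient for logging or JSON output. `summarize`, `guess_summary`, `REPORTED_ATTRS`, and the `demo` stand-in are hypothetical names, not part of oletools. `guess_summary` relies only on the `close()` method documented on this page, wrapped in `contextlib.closing` so the guesser is closed even if an error occurs.

```python
from contextlib import closing
from types import SimpleNamespace

# Attribute names taken from the table above.
REPORTED_ATTRS = (
    "filetype",
    "container",
    "application",
    "root_clsid",
    "root_clsid_name",
    "main_part_content_type",
)


def summarize(ftg):
    """Collect the documented attributes of a guesser-like object into a dict."""
    return {name: getattr(ftg, name, None) for name in REPORTED_ATTRS}


def guess_summary(path):
    """Open path with FileTypeGuesser, summarize it, and always close it."""
    # Imported lazily so the rest of the sketch runs without oletools installed.
    from oletools.ftguess import FileTypeGuesser
    with closing(FileTypeGuesser(filepath=path)) as ftg:
        return summarize(ftg)


# Stand-in object with made-up values, used only to try summarize() without a file.
demo = SimpleNamespace(container="OpenXML", application="MS Word", root_clsid=None)
```

Attributes missing from the object are reported as None rather than raising, so the same helper works on partial stand-ins in tests.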

Convenience methods

| Method | Returns True if… |
| --- | --- |
| ftg.is_ole() | The container is OLE/CFB |
| ftg.is_openxml() | The container is OpenXML (ZIP-based) |
| ftg.is_word() | The file is any MS Word format |
| ftg.is_excel() | The file is any MS Excel format |
| ftg.is_powerpoint() | The file is any MS PowerPoint format |
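
One way these methods might be combined is a small triage step that routes a file to the right downstream parser. The following is a sketch under that assumption: `office_family`, `container_kind`, and the `fake_xlsx` stand-in are hypothetical and only mirror the method names listed above.

```python
from types import SimpleNamespace


def office_family(ftg):
    """Map a guesser-like object to 'word', 'excel', 'powerpoint', or 'other'."""
    if ftg.is_word():
        return "word"
    if ftg.is_excel():
        return "excel"
    if ftg.is_powerpoint():
        return "powerpoint"
    return "other"


def container_kind(ftg):
    """Map a guesser-like object to 'ole', 'openxml', or 'other'."""
    if ftg.is_ole():
        return "ole"
    if ftg.is_openxml():
        return "openxml"
    return "other"


# Stand-in exposing the same method names (hypothetical, for demonstration only).
fake_xlsx = SimpleNamespace(
    is_ole=lambda: False,
    is_openxml=lambda: True,
    is_word=lambda: False,
    is_excel=lambda: True,
    is_powerpoint=lambda: False,
)
```

Passing a real FileTypeGuesser instance instead of the stand-in should work the same way, since only the documented methods are called.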

Example: checking file type programmatically

from oletools.ftguess import FileTypeGuesser, FTYPE

ftg = FileTypeGuesser(filepath='sample.doc')
try:
    if ftg.is_ole():
        print('OLE container detected')
        if ftg.root_clsid:
            print('Root CLSID:', ftg.root_clsid, '-', ftg.root_clsid_name)

    if ftg.filetype == FTYPE.WORD97:
        print('This is a Word 97-2003 document, may contain VBA macros')

    if ftg.ftype.may_contain_vba:
        print('This file type may contain VBA macros')
finally:
    # Close the guesser even if one of the checks above raises
    ftg.close()

FType_Base class attributes

Each FType_* class (accessible via ftg.ftype) exposes:

| Attribute | Description |
| --- | --- |
| name | Short name of the file type |
| longname | Full descriptive name |
| extensions | List of typical file extensions |
| content_types | List of MIME content-types |
| PUID | PRONOM Unique ID |
| may_contain_vba | True if VBA macros are possible in this format |
| may_contain_xlm | True if XLM (Excel 4) macros are possible |
| may_contain_ole | True if embedded OLE objects are possible |
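
The three `may_contain_*` flags lend themselves to a quick check of which kinds of active content are worth scanning for. Below is a minimal sketch, assuming only the attribute names in the table above. `active_content_flags`, `describe`, and `fake_ftype` are hypothetical, and the stand-in's values are chosen for illustration rather than copied from the library.

```python
from types import SimpleNamespace

# Flag names taken from the table above.
FLAGS = ("may_contain_vba", "may_contain_xlm", "may_contain_ole")


def active_content_flags(ftype):
    """Return the names of the may_contain_* flags that are true on ftype."""
    return [flag for flag in FLAGS if getattr(ftype, flag, False)]


def describe(ftype):
    """Build a one-line description from longname and extensions."""
    exts = ", ".join(ftype.extensions) or "no typical extension"
    return "{} ({})".format(ftype.longname, exts)


# Stand-in for an FType_* class; the values are illustrative, not the library's.
fake_ftype = SimpleNamespace(
    longname="MS Excel 97-2003",
    extensions=[".xls", ".xlt", ".xla"],
    may_contain_vba=True,
    may_contain_xlm=True,
    may_contain_ole=True,
)
```

In real use, `ftg.ftype` would be passed in place of the stand-in, for example `active_content_flags(ftg.ftype)`.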

Constants

Use the FTYPE, CONTAINER, and APP classes for comparisons:

from oletools.ftguess import FTYPE, CONTAINER, APP

# FTYPE examples: FTYPE.WORD97, FTYPE.EXCEL2007_XLSM, FTYPE.RTF, FTYPE.UNKNOWN, ...
# CONTAINER examples: CONTAINER.OLE, CONTAINER.OpenXML, CONTAINER.RTF, CONTAINER.ZIP, ...
# APP examples: APP.MSWORD, APP.MSEXCEL, APP.MSPOWERPOINT, APP.UNKNOWN, ...

python-oletools documentation