nobodd.fs

The nobodd.fs module contains the FatFileSystem class which is the primary entry point for reading FAT file-systems. Constructed with a buffer object representing a memory mapping of the file-system, the class will determine whether the format is FAT12, FAT16, or FAT32. The root attribute provides a Path-like object representing the root directory of the file-system.

>>> from nobodd.disk import DiskImage
>>> from nobodd.fs import FatFileSystem
>>> img = DiskImage('test-gpt.img')
>>> fs = FatFileSystem(img.partitions[1].data)
>>> fs.fat_type
'fat16'
>>> fs.root
FatPath(<FatFileSystem label='TEST' fat_type='fat16'>, '/')

Warning

At the time of writing, the implementation is strictly not thread-safe. Attempting to write to the file-system from multiple threads (whether in separate instances or not) is likely to result in corruption. Attempting to write to the file-system from one thread while reading from another will result in undefined behaviour, including incorrect reads.

Warning

The implementation will not handle certain “obscure” extensions to FAT, such as sub-directory style roots on FAT-12/16. It will attempt to warn about these and abort if they are found.

FatFileSystem

class nobodd.fs.FatFileSystem(mem, atime=False, encoding='iso-8859-1')[source]

Represents a FAT file-system, contained at the start of the buffer object mem.

This class supports the FAT-12, FAT-16, and FAT-32 formats, and will automatically determine which to use from the headers found at the start of mem. The type in use may be queried from fat_type. Of primary use is the root attribute which provides a FatPath instance representing the root directory of the file-system.

Instances can (and should) be used as a context manager; exiting the context will call the close() method implicitly. If certain header bits are set, DamagedFileSystem and DirtyFileSystem warnings may be generated upon opening.

If atime is False, the default, then accesses to files will not update the atime field in file meta-data (when the underlying mem mapping is writable). Finally, encoding specifies the character set used for decoding and encoding DOS short filenames.

close()[source]

Releases the memory references derived from the buffer the instance was constructed with. This method is idempotent.

open_dir(cluster)[source]

Opens the sub-directory in the specified cluster, returning a FatDirectory instance representing it.

Warning

This method is intended for internal use by the FatPath class.

open_entry(index, entry, mode='rb')[source]

Opens the specified entry, which must be a DirectoryEntry instance belonging to index, an instance of FatDirectory. Returns a FatFile instance associated with the specified entry. This permits writes to the file to be properly recorded in the corresponding directory entry.

Warning

This method is intended for internal use by the FatPath class.

open_file(cluster, mode='rb')[source]

Opens the file at the specified cluster, returning a FatFile instance representing it with the specified mode. Note that the FatFile instance returned by this method has no directory entry associated with it.

Warning

This method is intended for internal use by the FatPath class, specifically for “files” underlying the sub-directory structure which do not have an associated size (other than that dictated by their FAT chain of clusters).

property atime

If the underlying mapping is writable, then atime (last access time) will be updated upon reading the content of files, when this property is True (the default is False).

property clusters

A FatClusters sequence representing the clusters containing the data stored in the file-system.

Warning

This attribute is intended for internal use by the FatFile class, but may be useful for low-level exploration or manipulation of FAT file-systems.

property fat

A FatTable sequence representing the FAT table itself.

Warning

This attribute is intended for internal use by the FatFile class, but may be useful for low-level exploration or manipulation of FAT file-systems.

property fat_type

Returns a str indicating the type of FAT file-system present. Returns one of “fat12”, “fat16”, or “fat32”.

property label

Returns the label from the header of the file-system. This is an ASCII string up to 11 characters long.

property readonly

Returns True if the underlying buffer is read-only.

property root

Returns a FatPath instance (a Path-like object) representing the root directory of the FAT file-system. For example:

from nobodd.disk import DiskImage
from nobodd.fs import FatFileSystem

with DiskImage('test.img') as img:
    with FatFileSystem(img.partitions[1].data) as fs:
        print('ls /')
        for p in fs.root.iterdir():
            print(p.name)

Note

This is intended to be the primary entry-point for querying and manipulating the file-system at the high level. Only use the fat and clusters attributes, and the various “open” methods if you want to explore or manipulate the file-system at a low level.

property sfn_encoding

The encoding used for short (8.3) filenames. This defaults to “iso-8859-1” but unfortunately there’s no way of determining the correct codepage for these.
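To illustrate why the choice of encoding matters, the following sketch decodes short-filename bytes under two code pages using only the standard library. The byte strings are made up for the example and are not taken from any real image.

```python
# Illustrative only: the same name is stored as different bytes depending on
# the code page in use, so decoding with the wrong one garbles the name.
raw_latin1 = b'CAF\xc9'   # 'CAFÉ' encoded in ISO-8859-1
raw_cp437 = b'CAF\x90'    # 'CAFÉ' encoded in DOS code page 437

name_latin1 = raw_latin1.decode('iso-8859-1')
name_cp437 = raw_cp437.decode('cp437')

# Decoding CP437 bytes with the default encoding yields a different string
mismatched = raw_cp437.decode('iso-8859-1')
print(name_latin1, name_cp437, repr(mismatched))
```

If the short filenames in an image look wrong, passing a different encoding to the FatFileSystem constructor may help, although which code page is correct has to be known from elsewhere.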

FatFile

class nobodd.fs.FatFile(fs, start, mode='rb', index=None, entry=None)[source]

Represents an open file from a FatFileSystem.

You should never need to construct this instance directly. Instead it (or wrapped variants of it) is returned by the open() method of FatPath instances. For example:

from nobodd.disk import DiskImage
from nobodd.fs import FatFileSystem

with DiskImage('test.img') as img:
    with FatFileSystem(img.partitions[1].data) as fs:
        path = fs.root / 'bar.txt'
        with path.open('r', encoding='utf-8') as f:
            print(f.read())

Instances can (and should) be used as context managers to implicitly close references upon exiting the context. Instances are readable, seekable, and writable, depending on their opening mode and the nature of the underlying FatFileSystem.

As a derivative of io.RawIOBase, all the usual I/O methods should be available.

close()[source]

Flush and close the IO object.

This method has no effect if the file is already closed.

classmethod from_cluster(fs, start, mode='rb')[source]

Construct a FatFile from a FatFileSystem, fs, and a start cluster. The optional mode is equivalent to the built-in open() function.

Files constructed via this method do not have an associated directory entry. As a result, their size is assumed to be the full size of their cluster chain. This is typically used for the “file” backing a FatSubDirectory.

Warning

This method is intended for internal use by the FatPath class.

classmethod from_entry(fs, index, entry, mode='rb')[source]

Construct a FatFile from a FatFileSystem, fs, a FatDirectory, index, and a DirectoryEntry, entry. The optional mode is equivalent to the built-in open() function.

Files constructed via this method have an associated directory entry which will be updated if/when reads or writes occur (updating atime, mtime, and size fields).

Warning

This method is intended for internal use by the FatPath class.

readable()[source]

Return whether object was opened for reading.

If False, read() will raise OSError.

readall()[source]

Read until EOF, using multiple read() calls.

seek(pos, whence=0)[source]

Change the stream position to the given byte offset.

  • offset – the stream position, relative to whence (the pos argument in the signature above)

  • whence – the relative position to seek from

The offset is interpreted relative to the position indicated by whence. Values for whence are:

  • os.SEEK_SET or 0 – start of stream (the default); offset should be zero or positive

  • os.SEEK_CUR or 1 – current stream position; offset may be negative

  • os.SEEK_END or 2 – end of stream; offset is usually negative

Return the new absolute position.
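The three whence values can be tried out without a disk image. In this sketch io.BytesIO stands in for an open FatFile, on the assumption that FatFile follows the standard io seek() contract described above.

```python
import io
import os

# io.BytesIO is a stand-in for a FatFile; it is not part of nobodd
f = io.BytesIO(b'0123456789')

start = f.seek(3, os.SEEK_SET)     # absolute offset from the start
current = f.seek(2, os.SEEK_CUR)   # relative to the current position
end = f.seek(-1, os.SEEK_END)      # relative to the end of the stream
last = f.read()                    # reads the final byte only
print(start, current, end, last)
```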

seekable()[source]

Return whether object supports random access.

If False, seek(), tell() and truncate() will raise OSError. This method may need to do a test seek().

truncate(size=None)[source]

Truncate file to size bytes.

File pointer is left unchanged. Size defaults to the current IO position as reported by tell(). Return the new size.
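A short sketch of the default-size behaviour, again using io.BytesIO as a stand-in for a writable FatFile (an assumption; the real class also has to release clusters in the FAT, which is not modelled here):

```python
import io

# io.BytesIO is a stand-in for a writable FatFile
f = io.BytesIO(b'0123456789')
f.seek(4)

new_size = f.truncate()    # size defaults to the current position
position = f.tell()        # the file pointer has not moved
remaining = f.getvalue()   # only the first four bytes survive
print(new_size, position, remaining)
```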

writable()[source]

Return whether object was opened for writing.

If False, write() will raise OSError.

Exceptions and Warnings

exception nobodd.fs.FatWarning[source]

Base class for warnings issued by FatFileSystem.

exception nobodd.fs.DirtyFileSystem[source]

Raised when opening a FAT file-system that has the “dirty” flag set in the second entry of the FAT.

exception nobodd.fs.DamagedFileSystem[source]

Raised when opening a FAT file-system that has the I/O errors flag set in the second entry of the FAT.

exception nobodd.fs.OrphanedLongFilename[source]

Raised when a LongFilenameEntry is found with a mismatched checksum, terminal flag, out of order index, etc. This usually indicates an orphaned entry as the result of a non-LFN aware file-system driver manipulating a directory.

exception nobodd.fs.BadLongFilename[source]

Raised when a LongFilenameEntry is unambiguously corrupted, e.g. including a non-zero cluster number, in a way that would not be caused by a non-LFN aware file-system driver.
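Because these are warnings rather than errors, callers decide how to treat them with the standard warnings module. The sketch below uses local stand-in classes and a hypothetical open_fs() helper so that it runs without an image; that the specific warnings derive from FatWarning is an assumption based on its description as the base class.

```python
import warnings

# Stand-ins mirroring the names above; the real classes live in nobodd.fs
class FatWarning(Warning):
    pass

class DirtyFileSystem(FatWarning):
    pass

def open_fs():
    # Hypothetical stand-in for constructing a FatFileSystem on a dirty image
    warnings.warn(DirtyFileSystem('file-system has the dirty bit set'))
    return 'fs'

# Option 1: record the warnings and inspect them afterwards
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    fs = open_fs()
categories = [w.category for w in caught]

# Option 2: escalate the whole family to exceptions
with warnings.catch_warnings():
    warnings.simplefilter('error', category=FatWarning)
    try:
        open_fs()
        escalated = False
    except FatWarning:
        escalated = True
```

With the real library, the same filters would wrap the FatFileSystem constructor call instead of open_fs().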

Internal Classes and Functions

You should never need to interact with these classes directly; use FatFileSystem instead. These classes exist to enumerate and manipulate the FAT, and different types of root directory under FAT-12, FAT-16, and FAT-32, and sub-directories (which are common across FAT types).

class nobodd.fs.FatTable[source]

Abstract MutableSequence class representing the FAT table itself.

This is the basis for Fat12Table, Fat16Table, and Fat32Table. While all the implementations are potentially mutable (if the underlying memory mapping is writable), only direct replacement of FAT entries is valid. Insertion and deletion will raise TypeError.

A concrete class is constructed by FatFileSystem (based on the type of FAT format found). The chain() method is used by FatFile (and indirectly FatSubDirectory) to discover the chain of clusters that make up a file (or sub-directory). The free() method is used by writable FatFile instances to find the next free cluster to write to. The mark_free() and mark_end() methods are used to mark a cluster as being free or as the terminal cluster of a file.

chain(start)[source]

Generator method which yields all the clusters in the chain starting at start.
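As a rough illustration of chain following, the sketch below walks a toy FAT-16 table held in a plain list. It is not nobodd's implementation; the function name, the list-based table, and the loop guard are assumptions made for the example, while the max_valid default matches the Fat16Table value listed further down.

```python
def chain(fat, start, min_valid=2, max_valid=0xFFEF):
    """Yield each cluster in the chain beginning at *start*."""
    cluster = start
    seen = set()
    while min_valid <= cluster <= max_valid:
        if cluster in seen:
            # Guard added for the sketch: a damaged FAT can contain loops
            raise ValueError('loop in FAT chain')
        seen.add(cluster)
        yield cluster
        # Each FAT entry holds the number of the next cluster in the chain
        cluster = fat[cluster]

# Toy table: entries 0 and 1 are reserved, 0xFFFF marks the end of a chain
fat = [0xFFF8, 0xFFFF, 5, 0, 0xFFFF, 4, 0]
clusters = list(chain(fat, 2))
print(clusters)
```

Here the file starting at cluster 2 continues at cluster 5, then at cluster 4, where the end mark terminates the walk.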

free()[source]

Generator that scans the FAT for free clusters, yielding each as it is found. Iterating to the end of this generator raises OSError with the code ENOSPC (out of space).
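The behaviour described above can be sketched with a list-based table as follows. This is an illustration under the assumption that a free cluster is one whose FAT entry is 0 (as stated for mark_free()); the function and variable names are invented for the example.

```python
import errno
import os

def free(fat, min_valid=2):
    """Yield the number of each free cluster, then fail with ENOSPC."""
    for cluster in range(min_valid, len(fat)):
        if fat[cluster] == 0:
            yield cluster
    # Reaching this point means the caller asked for more than exists
    raise OSError(errno.ENOSPC, os.strerror(errno.ENOSPC))

fat = [0xFFF8, 0xFFFF, 5, 0, 0xFFFF, 4, 0]
scanner = free(fat)
first = next(scanner)
second = next(scanner)
try:
    next(scanner)
    exhausted = False
except OSError as exc:
    exhausted = exc.errno == errno.ENOSPC
```

A writer can therefore keep calling next() on the generator as it needs clusters, and let the OSError propagate when the file-system is full.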

abstract get_all(cluster)[source]

Returns the value of cluster in all copies of the FAT, as a tuple (naturally, under normal circumstances, these should all be equal).

insert(cluster, value)[source]

Raises TypeError; the FAT length is immutable.

mark_end(cluster)[source]

Marks cluster as the end of a chain. The value used to indicate the end of a chain is specific to the FAT size.

mark_free(cluster)[source]

Marks cluster as free (this simply sets cluster to 0 in the FAT).

class nobodd.fs.Fat12Table(mem, fat_size, info_mem=None)[source]

Concrete child of FatTable for FAT-12 file-systems.

min_valid = 2
max_valid = 4079
end_mark = 4095

get_all(cluster)[source]

Returns the value of cluster in all copies of the FAT, as a tuple (naturally, under normal circumstances, these should all be equal).
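FAT-12 entries are 12 bits wide, so two entries share three bytes. The following sketch shows the standard way of unpacking one entry from a single copy of the table; it is not taken from nobodd, and fat12_entry is a hypothetical helper name.

```python
def fat12_entry(table, cluster):
    """Return the 12-bit FAT entry for *cluster* from the bytes in *table*."""
    offset = cluster + cluster // 2   # 1.5 bytes per entry
    word = int.from_bytes(table[offset:offset + 2], 'little')
    # Odd entries occupy the high 12 bits of the word, even ones the low 12
    return word >> 4 if cluster % 2 else word & 0xFFF

# Four packed entries: 0xFF8, 0xFFF, 0x003, 0xFFF
table = bytes([0xF8, 0xFF, 0xFF, 0x03, 0xF0, 0xFF])
entries = [fat12_entry(table, n) for n in range(4)]
print([hex(e) for e in entries])
```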

class nobodd.fs.Fat16Table(mem, fat_size, info_mem=None)[source]

Concrete child of FatTable for FAT-16 file-systems.

min_valid = 2
max_valid = 65519
end_mark = 65535

get_all(cluster)[source]

Returns the value of cluster in all copies of the FAT, as a tuple (naturally, under normal circumstances, these should all be equal).

class nobodd.fs.Fat32Table(mem, fat_size, info_mem=None)[source]

Concrete child of FatTable for FAT-32 file-systems.

min_valid = 2
max_valid = 268435439
end_mark = 268435455

free()[source]

Generator that scans the FAT for free clusters, yielding each as it is found. Iterating to the end of this generator raises OSError with the code ENOSPC (out of space).

get_all(cluster)[source]

Returns the value of cluster in all copies of the FAT, as a tuple (naturally, under normal circumstances, these should all be equal).

class nobodd.fs.FatClusters(mem, cluster_size)[source]

MutableSequence representing the clusters of the file-system itself.

While the sequence is mutable, clusters cannot be deleted or inserted, only read and (if the underlying buffer is writable) re-written.
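A minimal sketch of such a sequence is shown below. It assumes the conventional FAT numbering in which the first data cluster is number 2; whether FatClusters indexes in exactly this way is not stated above, and the class name and layout here are invented for the example.

```python
class Clusters:
    """Fixed-length view of a buffer as equally sized clusters."""

    def __init__(self, mem, cluster_size):
        self._mem = memoryview(mem)
        self._cs = cluster_size

    def __len__(self):
        return len(self._mem) // self._cs

    def _span(self, cluster):
        # Assumption: data clusters are numbered from 2, as in the FAT
        if not 2 <= cluster < len(self) + 2:
            raise IndexError(cluster)
        start = (cluster - 2) * self._cs
        return slice(start, start + self._cs)

    def __getitem__(self, cluster):
        return self._mem[self._span(cluster)]

    def __setitem__(self, cluster, value):
        # Fails if the buffer is read-only or value is the wrong size
        self._mem[self._span(cluster)] = value

data = bytearray(b'A' * 4 + b'B' * 4 + b'C' * 4)
clusters = Clusters(data, cluster_size=4)
second = bytes(clusters[3])
clusters[4] = b'ZZZZ'
```

Reads return a view onto the underlying buffer rather than a copy, so rewriting a cluster changes the buffer in place.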

insert(cluster, value)[source]

Raises TypeError; the FS length is immutable.

property readonly

Returns True if the underlying buffer is read-only.

property size

Returns the size (in bytes) of clusters in the file-system.

class nobodd.fs.FatDirectory[source]

An abstract MutableMapping representing a FAT directory. The mapping is ostensibly from filename to DirectoryEntry instances, but there are several oddities to be aware of.

In VFAT, many files effectively have two filenames: the original DOS “short” filename (SFN hereafter) and the VFAT “long” filename (LFN hereafter). All files have an SFN; any file may optionally have an LFN. The SFN is stored in the DirectoryEntry which records details of the file (mode, size, cluster, etc). The optional LFN is stored in leading LongFilenameEntry records.

Even when LongFilenameEntry records do not precede a DirectoryEntry, the file may still have an LFN that differs from the SFN in case only, recorded by flags in the DirectoryEntry. Naturally, some files still only have one filename because the LFN doesn’t vary in case from the SFN, e.g. the special directory entries “.” and “..”, and anything which conforms to original DOS naming rules like “README.TXT”.

For the purposes of listing files, most FAT implementations (including this one) ignore the SFNs. Hence, iterating over this mapping will not yield the SFNs as keys (unless the SFN is equal to the LFN), and they are not counted in the length of the mapping. However, for the purposes of testing existence, opening, etc., FAT implementations allow the use of SFNs. Hence, testing for membership, or manipulating entries via the SFN will work with this mapping, and will implicitly manipulate the associated LFNs (e.g. deleting an entry via an SFN key will also delete the associated LFN key).

In other words, if a file has a distinct LFN and SFN, it has two entries in the mapping (a “visible” LFN entry, and an “invisible” SFN entry). Further, note that FAT is case retentive (for LFNs; SFNs are folded uppercase), but not case sensitive. Hence, membership tests and retrieval from this mapping are case insensitive with regard to keys.

Finally, note that the values in the mapping are always instances of DirectoryEntry. LongFilenameEntry instances are neither accepted nor returned; these are managed internally.
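The "visible" and "invisible" key behaviour can be modelled with a small in-memory class. This is a toy illustration of the semantics only; ToyDirectory, its add() method, and the string values used as entries are all invented for the example and say nothing about how FatDirectory stores its data.

```python
class ToyDirectory:
    """Case-insensitive mapping with visible long names and hidden short names."""

    def __init__(self):
        self._entries = {}   # folded LFN -> (LFN as stored, entry)
        self._sfn = {}       # folded SFN -> folded LFN

    def add(self, lfn, sfn, entry):
        self._entries[lfn.casefold()] = (lfn, entry)
        if sfn.casefold() != lfn.casefold():
            self._sfn[sfn.casefold()] = lfn.casefold()

    def _resolve(self, name):
        key = name.casefold()
        return self._sfn.get(key, key)

    def __contains__(self, name):
        return self._resolve(name) in self._entries

    def __getitem__(self, name):
        return self._entries[self._resolve(name)][1]

    def __delitem__(self, name):
        key = self._resolve(name)
        del self._entries[key]
        # Removing an entry also removes its hidden short-name key
        self._sfn = {s: l for s, l in self._sfn.items() if l != key}

    def __iter__(self):
        return (lfn for lfn, _ in self._entries.values())

    def __len__(self):
        return len(self._entries)

d = ToyDirectory()
d.add('default.config', 'DEFAUL~1.CON', 'entry-1')
d.add('README.TXT', 'README.TXT', 'entry-2')
names = list(d)
```

Iteration yields only the long names and preserves their case, while lookups succeed through either name in any case.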

MAX_SFN_SUFFIX = 65535

_clean_entries()[source]

Find and remove all deleted entries from the directory.

The method scans the directory for all directory entries and long filename entries which start with 0xE5, indicating a deleted entry, and overwrites them with later (not deleted) entries. Trailing entries are then zeroed out. The return value is the new offset of the terminal entry.
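The compaction described above can be sketched over a bytearray of raw 32-byte entries. This is a simplified illustration rather than nobodd's code: it treats every entry as an opaque 32-byte record and assumes a first byte of 0x00 marks the terminal entry.

```python
ENTRY_SIZE = 32   # every FAT directory entry occupies 32 bytes

def clean_entries(mem):
    """Compact live entries to the front; return the terminal entry offset."""
    write = 0
    for read in range(0, len(mem), ENTRY_SIZE):
        entry = bytes(mem[read:read + ENTRY_SIZE])
        if entry[0] == 0x00:    # terminal entry: nothing valid follows
            break
        if entry[0] == 0xE5:    # deleted entry: drop it
            continue
        mem[write:write + ENTRY_SIZE] = entry
        write += ENTRY_SIZE
    # Zero everything after the last live entry
    mem[write:] = bytes(len(mem) - write)
    return write

def make(first):
    # Hypothetical helper: a 32-byte record with the given first byte
    return bytes([first]) + b'x' * 31

directory = bytearray(make(0x41) + make(0xE5) + make(0x42) + bytes(32))
terminal = clean_entries(directory)
```

After the call, the two live entries sit at offsets 0 and 32, and the terminal entry has moved from offset 96 to offset 64.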

_get_names(filename)[source]

Given a filename, generate an appropriately encoded long filename (encoded in little-endian UCS-2), short filename (encoded in the file-system’s SFN encoding), extension, and the case attributes. The result is a 4-tuple: lfn, sfn, ext, attr.

lfn, sfn, and ext will be bytes strings, and attr will be an int. If filename is capable of being represented as a short filename only (potentially with non-zero case attributes), lfn in the result will be zero-length.

_get_unique_sfn(prefix, ext)[source]

Given prefix and ext, which are str, of the short filename prefix and extension, find a suffix that is unique in the directory (amongst both long and short filenames, because these are still in the same namespace).

For example, in a directory containing default.config (which has shortname DEFAUL~1.CON), given the filename and extension default.conf, this function will return the str DEFAUL~2.CON.

Because the search requires enumeration of the whole directory, which is expensive, an artificial limit of MAX_SFN_SUFFIX is enforced. If this is reached, the search will terminate with an OSError with code ENOSPC (out of space).
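A simplified sketch of the suffix search follows. It assumes the usual convention of truncating the prefix so that the prefix plus "~N" fits in eight characters, and takes the set of existing names as an argument instead of enumerating a real directory; unique_sfn and its parameters are hypothetical.

```python
import errno
import os

MAX_SFN_SUFFIX = 65535

def unique_sfn(prefix, ext, existing):
    """Return the first PREFIX~N.EXT name that is not in *existing*."""
    taken = {name.upper() for name in existing}
    for n in range(1, MAX_SFN_SUFFIX + 1):
        suffix = f'~{n}'
        # Shorten the prefix so that prefix + suffix fits in 8 characters
        candidate = prefix[:8 - len(suffix)] + suffix
        if ext:
            candidate += '.' + ext
        if candidate.upper() not in taken:
            return candidate
    raise OSError(errno.ENOSPC, os.strerror(errno.ENOSPC))

name = unique_sfn('DEFAULT', 'CON', {'default.config', 'DEFAUL~1.CON'})
print(name)
```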

_group_entries()[source]

Generator which yields an offset, and a sequence of LongFilenameEntry and DirectoryEntry instances.

Each tuple yielded represents a single (extant, non-deleted) file or directory with its long-filename entries at the start, and the directory entry as the final element. The offset associated with the sequence is the offset of the directory entry (not its preceding long filename entries). In other words, for a file with three long-filename entries, the following might be yielded:

(160, [
    <LongFilenameEntry>,
    <LongFilenameEntry>,
    <LongFilenameEntry>,
    <DirectoryEntry>
])

This indicates that the directory entry is at offset 160, preceded by long filename entries at offsets 128, 96, and 64.

abstract _iter_entries()[source]

Abstract generator that is expected to yield successive offsets and the entries at those offsets as DirectoryEntry instances or LongFilenameEntry instances, as appropriate.

All instances must be yielded, in the order they appear on disk, regardless of whether they represent deleted, orphaned, corrupted, terminal, or post-terminal entries.

_join_lfn_entries(entries, checksum, sequence=0, lfn=b'')[source]

Given entries, a sequence of LongFilenameEntry instances, decode the long filename encoded within them, ensuring that all the invariants (sequence number, checksums, terminal flag, etc.) are obeyed.

Returns the decoded (str) long filename, or None if no valid long filename can be found. Emits various warnings if invalid entries are encountered during decoding, including OrphanedLongFilename and BadLongFilename.

_prefix_entries(filename, entry)[source]

Given entry, a DirectoryEntry, generate the LongFilenameEntry instances (if any) that are necessary to associate entry with the specified filename.

This function merely constructs the instances, ensuring the (many, convoluted!) rules are followed, including that the short filename, if one is generated, is unique in this directory, and the long filename is encoded and check-summed appropriately.

Note

The filename and ext fields of entry are ignored by this method. The only filename that is considered is the one explicitly passed in which becomes the basis for the long filename entries and the short filename stored within the entry itself.

The return value is the sequence of long filename entries and the modified directory entry in the order they should appear on disk.

_split_entries(entries)[source]

Given entries, a sequence of LongFilenameEntry instances, ending with a single DirectoryEntry (as would typically be found in a FAT directory index), return the decoded long filename, short filename, and the directory entry record as a 3-tuple.

If no long filename entries are present, the long filename will be equal to the short filename (but may have lower-case parts).

Note

This function also carries out several checks, including the filename checksum, that all checksums match, that the number of entries is valid, etc. Any violations found may raise warnings including OrphanedLongFilename and BadLongFilename.
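For reference, the filename checksum used by VFAT long filename entries is a one-byte rotate-and-add over the 11 bytes of the short filename (8 name bytes followed by 3 extension bytes, space padded). The sketch below shows the standard algorithm; it is not copied from nobodd, and lfn_checksum is a hypothetical name.

```python
def lfn_checksum(sfn):
    """Return the VFAT checksum of an 11-byte short filename."""
    if len(sfn) != 11:
        raise ValueError('short filename must be exactly 11 bytes')
    total = 0
    for byte in sfn:
        # Rotate the running total right by one bit, then add the byte
        total = (((total & 1) << 7) + (total >> 1) + byte) & 0xFF
    return total

checksum = lfn_checksum(b'README  TXT')
```

Every long filename entry belonging to a file carries this value, so a mismatch against the following directory entry indicates the long name no longer belongs to it.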

abstract _update_entry(offset, entry)[source]

Abstract method which is expected to (re-)write entry (a DirectoryEntry or LongFilenameEntry instance) at the specified offset in the directory.

items() → a set-like object providing a view on D's items[source]

values() → an object providing a view on D's values[source]

class nobodd.fs.FatRoot(mem, encoding)[source]

An abstract derivative of FatDirectory representing the (fixed-size) root directory of a FAT-12 or FAT-16 file-system. Must be constructed with mem, which is a buffer object covering the root directory clusters, and encoding, which is taken from FatFileSystem.sfn_encoding. The Fat12Root and Fat16Root classes are (trivial) concrete derivatives of this.

class nobodd.fs.FatSubDirectory(fs, start, encoding)[source]

A concrete derivative of FatDirectory representing a sub-directory in a FAT file-system (of any type). Must be constructed with fs (a FatFileSystem instance), start (the first cluster of the sub-directory), and encoding, which is taken from FatFileSystem.sfn_encoding.

class nobodd.fs.Fat12Root(mem, encoding)[source]

Concrete, trivial derivative of FatRoot which simply declares the root as belonging to a FAT-12 file-system.

fat_type = 'fat12'

class nobodd.fs.Fat16Root(mem, encoding)[source]

Concrete, trivial derivative of FatRoot which simply declares the root as belonging to a FAT-16 file-system.

fat_type = 'fat16'

class nobodd.fs.Fat32Root(fs, start, encoding)[source]

This is a trivial derivative of FatSubDirectory because, in FAT-32, the root directory is represented by the same structure as a regular sub-directory.

nobodd.fs.fat_type(mem)[source]

Given a FAT file-system at the start of the buffer mem, determine its type, and decode its headers. Returns a four-tuple containing:

nobodd.fs.fat_type_from_count(bpb, ebpb, ebpb_fat32)[source]

Derives the type of the FAT file-system when it cannot be determined directly from the bpb and ebpb headers (the BIOSParameterBlock, and ExtendedBIOSParameterBlock respectively).

Uses known limits on the number of clusters to derive the type of FAT in use. Returns one of the strings “fat12”, “fat16”, or “fat32”.
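The limits in question are presumably those given in Microsoft's FAT specification, which classifies a volume purely by its count of data clusters. A sketch of that rule is below; that nobodd applies exactly these thresholds is an assumption, and the function takes a pre-computed cluster count rather than the header structures.

```python
def fat_type_from_cluster_count(clusters):
    """Classify a FAT volume by its number of data clusters."""
    # Thresholds from the Microsoft FAT specification
    if clusters < 4085:
        return 'fat12'
    if clusters < 65525:
        return 'fat16'
    return 'fat32'

small = fat_type_from_cluster_count(2847)     # e.g. a 1.44MB floppy
large = fat_type_from_cluster_count(1000000)
```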