Alternate Data Streams and the NTFS file system
Nephi Johnson, BYU, CS345 Summer 2009
Overview
The general word for an Alternate Data Stream is a file system fork.
File system forks are found in Macintosh (data, resource, and named forks), Windows (alternate data streams), and Novell environments (multiple data streams).
A file system fork stores extra variable-length information with a file, alongside the normal data stream.
Windows ADS is Microsoft’s counterpart to the Macintosh resource fork.
Alternate Data Streams
Alternate Data Streams are used to store additional information with a file, such as a thumbnail image or other descriptive metadata.
Some applications legitimately use them to store metadata about a file
Malware and viruses (not to mention hackers) use them to hide files and executables
Anti-virus software doesn’t always look in alternate data streams
An ADS is invisible to the user without the use of special tools
(Note: in Windows Vista and later, the dir /R command will display alternate data streams.)
“This feature permits related data to be managed as a single unit... For example, a graphics program can store a thumbnail image of a bitmap in a named data stream within the NTFS file containing the image.” --Microsoft
NTFS Overview
In order to understand how alternate data streams work, you must understand the NTFS file system.
NTFS is often compared with the FAT file system. Instead of a file allocation table, NTFS has a Master File Table (MFT). It also keeps a partial copy of the MFT (akin to FAT2) that is used for backup and recovery.
An MFT is similar to a FAT, but the roles differ: instead of directories storing info about the files and files containing only data, an NTFS directory stores only its own attributes, and each file’s MFT entry contains all the info about that file.
This is the basic layout of an NTFS partition
NTFS Overview - MFT
When a volume is formatted with NTFS, an MFT file is created with the first 16 entries reserved for use as metadata for the file system (shown on next slide).
Windows reserves 12.5% of available disk space (the “MFT Zone”) for future growth of the MFT. User data is placed in this zone only after the rest of the volume has been used.
Entry sizes in the MFT are determined by the cluster size. Entries have this general format
NTFS Overview – MFT Metadata
System File | File Name | MFT Record # | Purpose of the File
Master file table | $Mft | 0 | Contains one base file record for each file and folder on an NTFS volume. If the allocation information for a file or folder is too large to fit within a single record, other file records are allocated as well.
Master file table 2 | $MftMirr | 1 | A duplicate image of the first four records of the MFT. This file guarantees access to the MFT in case of a single-sector failure.
Log file | $LogFile | 2 | Contains a list of transaction steps used for NTFS recoverability. Log file size depends on volume size and can be as large as 4 MB. It is used by Windows 2000 to restore consistency to NTFS after a system failure.
Volume | $Volume | 3 | Contains information about the volume, such as the volume label and the volume version.
Attribute definitions | $AttrDef | 4 | A table of attribute names, numbers, and descriptions.
Root file name index | $ | 5 | The root folder.
Cluster bitmap | $Bitmap | 6 | A representation of the volume showing which clusters are in use.
Boot sector | $Boot | 7 | Includes the BPB used to mount the volume and additional bootstrap loader code used if the volume is bootable.
Bad cluster file | $BadClus | 8 | Contains bad clusters for the volume.
Security file | $Secure | 9 | Contains unique security descriptors for all files within a volume.
Upcase table | $Upcase | 10 | Converts lowercase characters to matching Unicode uppercase characters.
NTFS extension file | $Extend | 11 | Used for various optional extensions such as quotas, reparse point data, and object identifiers.
(none) | (none) | 12-15 | Reserved for future use.
NTFS Overview – Files and Dirs
Each entry in the MFT describes a file or dir and is a collection of attributes (yes, even the file data) [see next slide]
If the total size of a file’s attributes (remember, attributes also include the file data) is smaller than the record size in the MFT (1 KB), the entire file will be stored in the MFT.
If an attribute’s value(s) are small enough to fit inside the MFT entry, then that attribute is called resident. (Filenames and timestamps are always resident.)
Otherwise, some attributes are made non-resident and a pointer to a new data run or extent is stored in the Attribute List attribute. The actual values of the non-resident attributes are stored in the extent.
An extent is a contiguous “run” of clusters used to store an attribute’s data.
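The resident/non-resident split described above can be illustrated with a small toy model. This is only a sketch under loud assumptions: the function name place_attributes, the attribute sizes, and the "evict the largest attribute first" policy are all hypothetical, and record and attribute header overhead is ignored. It is not how NTFS actually chooses which attributes to move.

```python
RECORD_SIZE = 1024  # bytes per MFT record, the 1 KB figure quoted above
ALWAYS_RESIDENT = {"Standard Information", "File Name"}  # timestamps, names


def place_attributes(sizes, record_size=RECORD_SIZE):
    """Split attributes into (resident, non_resident) dictionaries.

    sizes maps an attribute name to the size of its value in bytes.
    Toy policy (an assumption, not the real NTFS algorithm): while the
    resident values do not fit in the record, move the largest movable
    attribute out to an extent.
    """
    resident = dict(sizes)
    non_resident = {}
    # "Fits" means strictly smaller than the record size, as in the text.
    while sum(resident.values()) >= record_size:
        movable = [name for name in resident if name not in ALWAYS_RESIDENT]
        if not movable:
            raise ValueError("always-resident attributes exceed the record")
        biggest = max(movable, key=lambda name: resident[name])
        non_resident[biggest] = resident.pop(biggest)
    return resident, non_resident


if __name__ == "__main__":
    # Illustrative sizes only; a small file stays entirely inside the MFT.
    print(place_attributes({"Standard Information": 72, "File Name": 90, "Data": 300}))
    # A large unnamed data attribute is pushed out to an extent.
    print(place_attributes({"Standard Information": 72, "File Name": 90, "Data": 5000}))
```

In this model a 300-byte file never leaves its MFT record, while the 5000-byte data attribute becomes non-resident and only its location would remain in the record.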
NTFS Overview – File/Dir Attr.
Attribute Type | Description
Standard Information | Information such as access mode (read-only, read/write, and so forth), timestamp, and link count.
Attribute List | Locations of all attribute records that do not fit in the MFT record.
File Name | A repeatable attribute for both long and short file names. The long name of the file can be up to 255 Unicode characters. The short name is the 8.3, case-insensitive name for the file. Additional names, or hard links, required by POSIX can be included as additional file name attributes.
Data | File data. NTFS supports multiple data attributes per file. Each file typically has one unnamed data attribute. A file can also have one or more named data attributes.
Object ID | A volume-unique file identifier. Used by the distributed link tracking service. Not all files have object identifiers.
Logged Tool Stream | Similar to a data stream, but operations are logged to the NTFS log file just like NTFS metadata changes. This attribute is used by EFS.
Reparse Point | Used for mounted drives. This is also used by Installable File System (IFS) filter drivers to mark certain files as special to that driver.
Index Root | Used to implement folders and other indexes.
Index Allocation | Used to implement the B-tree structure for large folders and other large indexes.
Bitmap | Used to implement the B-tree structure for large folders and other large indexes.
Volume Information | Used only in the $Volume system file. Contains the volume version.
NTFS Overview – Files and Dirs 2
Each entry in the MFT can contain one unnamed $DATA attribute and multiple named $DATA attributes (yes, this includes directories!)
Each of these data attributes is commonly called a data stream
Any stream that is not the default data attribute (unnamed) is called an Alternate Data Stream.
Without special tools, the only way to completely delete an ADS is to delete the file or directory itself. However, an ADS can be overwritten to make the old data inaccessible through normal means.
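A named stream is addressed as file:stream, or in full as file:stream:$DATA, and the unnamed default stream as file::$DATA. The sketch below separates a path into its file part and its stream name using that syntax. The helper names split_stream_path and alternate_streams are made up for illustration, and the parsing is deliberately simplified (for example, a one-letter file name followed by a colon would be read as a drive letter).

```python
import ntpath


def split_stream_path(path):
    """Split 'file', 'file:stream' or 'file:stream:$DATA' into (file, stream).

    The returned stream name is '' for the unnamed (default) data stream.
    """
    # ntpath.split keeps a drive-letter colon (C:) out of the last component.
    folder, leaf = ntpath.split(path)
    parts = leaf.split(":")
    if len(parts) > 3:
        raise ValueError("too many ':' separators in " + repr(leaf))
    if len(parts) == 3 and parts[2].upper() != "$DATA":
        raise ValueError("only $DATA streams are handled in this sketch")
    stream = parts[1] if len(parts) > 1 else ""
    return ntpath.join(folder, parts[0]), stream


def alternate_streams(stream_names):
    """Given the names of a file's $DATA attributes, keep the alternate ones."""
    return [name for name in stream_names if name != ""]


if __name__ == "__main__":
    print(split_stream_path("report.txt:notes"))
    print(split_stream_path("report.txt::$DATA"))
    print(alternate_streams(["", "notes", "thumb"]))
```

Anything with a non-empty stream name counts as an Alternate Data Stream in the sense used above; the empty name is the ordinary file contents.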
Alternate Data Streams in action!
But here are some pictures just in case. (Notice: no change in file size and no indication of the alternate stream.)
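For readers without the pictures, a minimal Python sketch of the same effect follows. It assumes Windows and a working directory on an NTFS volume. The file name ads_demo.txt, the stream name hidden, and the helper names are made up for illustration. On other systems the colon has no special meaning and the second open() would simply create a separate file, so the demo is skipped there.

```python
import os


def stream_path(file_path, stream_name):
    """Build the 'file:stream' path used to address a named stream."""
    return f"{file_path}:{stream_name}"


def demo(file_path="ads_demo.txt"):
    """Write a visible file plus a hidden stream; return (size, hidden text)."""
    with open(file_path, "w") as f:
        f.write("visible contents")
    # On NTFS this writes to a named $DATA attribute of the same file.
    with open(stream_path(file_path, "hidden"), "w") as f:
        f.write("tucked away in an alternate data stream")
    with open(stream_path(file_path, "hidden")) as f:
        hidden = f.read()
    # The reported size covers the unnamed stream only.
    return os.path.getsize(file_path), hidden


if __name__ == "__main__" and os.name == "nt":
    size, hidden = demo()
    print("size reported for the file:", size)
    print("contents of the hidden stream:", hidden)
```

The reported size reflects only the visible contents, which matches the observation that nothing about the file hints at the extra stream.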
Questions
Files and Directories in NTFS ...
are completely different from each other
are collections of attributes
work exactly like they do in FAT
don’t exist
Entries in the MFT can have multiple data streams because...
they can have multiple file names
sectors are larger than most clusters
they can have multiple data attributes
of non-resident attributes
Answers
Files and Directories in NTFS ...
are collections of attributes
Entries in the MFT can have multiple data streams because...
they can have multiple data attributes