Skip to content

Instantly share code, notes, and snippets.

Last active March 25, 2022 12:25
Show Gist options
  • Save danzek/624d5c3aaa349f0e846281c75c2ec371 to your computer and use it in GitHub Desktop.
Save danzek/624d5c3aaa349f0e846281c75c2ec371 to your computer and use it in GitHub Desktop.
Some important articles on Windows/NTFS

Important articles about Windows/NTFS

This also contains quotes from the articles in case they are moved/deleted/etc.

A file time is a 64-bit value that represents the number of 100-nanosecond intervals that have elapsed since 12:00 A.M. January 1, 1601 Coordinated Universal Time (UTC). The system records file times when applications create, access, and write to files.

The NTFS file system stores time values in UTC format, so they are not affected by changes in time zone or daylight saving time. The FAT file system stores time values based on the local time of the computer. For example, a file that is saved at 3:00pm PST in Washington is seen as 6:00pm EST in New York on an NTFS volume, but it is seen as 3:00pm EST in New York on a FAT volume.

Time stamps are updated at various times and for various reasons. The only guarantee about a file time stamp is that the file time is correctly reflected when the handle that makes the change is closed.

Not all file systems can record creation and last access times, and not all file systems record them in the same manner. For example, the resolution of create time on FAT is 10 milliseconds, while write time has a resolution of 2 seconds and access time has a resolution of 1 day, so it is really the access date. The NTFS file system delays updates to the last access time for a file by up to 1 hour after the last access.

By default, Windows caches file data that is read from disks and written to disks. This implies that read operations read file data from an area in system memory known as the system file cache, rather than from the physical disk. Correspondingly, write operations write file data to the system file cache rather than to the disk, and this type of cache is referred to as a write-back cache. Caching is managed per file object.

Caching occurs under the direction of the cache manager, which operates continuously while Windows is running. File data in the system file cache is written to the disk at intervals determined by the operating system, and the memory previously used by that file data is freed—this is referred to as flushing the cache. The policy of delaying the writing of the data to the file and holding it in the cache until the cache is flushed is called lazy writing, and it is triggered by the cache manager at a determinate time interval. The time at which a block of file data is flushed is partially based on the amount of time it has been stored in the cache and the amount of time since the data was last accessed in a read operation. This ensures that file data that is frequently read will stay accessible in the system file cache for the maximum amount of time....

The frequency at which flushing occurs is an important consideration that balances system performance with system reliability. If the system flushes the cache too often, the number of large write operations flushing incurs will degrade system performance significantly. If the system is not flushed often enough, then the likelihood is greater that either system memory will be depleted by the cache, or a sudden system failure (such as a loss of power to the computer) will happen before the flush. In the latter instance, the cached data will be lost.

To ensure that the right amount of flushing occurs, the cache manager spawns a process every second called a lazy writer. The lazy writer process queues one-eighth of the pages that have not been flushed recently to be written to disk. It constantly reevaluates the amount of data being flushed for optimal system performance, and if more data needs to be written it queues more data. Lazy writers do not flush temporary files, because the assumption is that they will be deleted by the application or system.

Some applications, such as virus-checking software, require that their write operations be flushed to disk immediately; Windows provides this ability through write-through caching. A process enables write-through caching for a specific I/O operation by passing the FILE_FLAG_WRITE_THROUGH flag into its call to CreateFile. With write-through caching enabled, data is still written into the cache, but the cache manager writes the data immediately to disk rather than incurring a delay by using the lazy writer. A process can also force a flush of a file it has opened by calling the FlushFileBuffers function.

File system metadata is always cached. Therefore, to store any metadata changes to disk, the file must either be flushed or be opened with FILE_FLAG_WRITE_THROUGH.

The Microsoft Windows products listed at the beginning of this artice contain file system tunneling capabilities to enable compatibility with programs that rely on file systems being able to hold onto file meta-info for a short period of time. This occurs after deletion or renaming and re-introducing a new directory entry with that meta-info (if a create or rename occurs to cause a file of that name to appear again in a short period of time).

When a name is removed from a directory (rename or delete), its short/long name pair and creation time are saved in a cache, keyed by the name that was removed. When a name is added to a directory (rename or create), the cache is searched to see if there is information to restore. The cache is effective per instance of a directory. If a directory is deleted, the cache for it is removed.

These paired operations can cause tunneling on "name."

  • delete(name)/create(name)
  • delete(name)/rename(source, name)
  • rename(name, newname)/create(name)
  • rename(name, newname)/rename(source, name)

The idea is to mimic the behavior MS-DOS programs expect when they use the safe save method. They copy the modified data to a temporary file, delete the original and rename the temporary to the original. This should seem to be the original file when complete. Windows performs tunneling on both FAT and NTFS file systems to ensure long/short file names are retained when 16-bit applications perform this safe save operation.

Tunneling cache time can be adjusted from the default time of 15 seconds, or if tunneling capabilities are undesirable, it can be disabled by adding a value in the Windows Registry.

If tunneling is disabled, applications that use this safe save method can lose the name they are unaware of, usually the LFN, and the rediscovery of shortcut targets could be impaired since the creation timestamps cannot remain constant for files manipulated by such apps. Creation timestamp maintenance is possible in the absence of tunneling if an application is smart enough. The same is not true for the long filenames....

  • HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem\MaximumTunnelEntryAgeInSeconds
  • HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem\MaximumTunnelEntries (set to 0 to disable tunneling)

One of the file system features you may find yourself surprised by is tunneling, wherein the creation timestamp and short/long names of a file are taken from a file that existed in the directory previously. In other words, if you delete some file "File with long name.txt" and then create a new file with the same name, that new file will have the same short name and the same creation time as the original file. You can read this KB article for details on what operations are sensitive to tunnelling.

Why does tunneling exist at all?

When you use a program to edit an existing file, then save it, you expect the original creation timestamp to be preserved, since you're editing a file, not creating a new one. But internally, many programs save a file by performing a combination of save, delete, and rename operations (such as the ones listed in the linked article), and without tunneling, the creation time of the file would seem to change even though from the end user's point of view, no file got created.

As another example of the importance of tunneling, consider that file "File with long name.txt", whose short name is say "FILEWI~1.TXT". You load this file into a program that is not long-filename-aware and save it. It deletes the old "FILEWI~1.TXT" and creates a new one with the same name. Without tunnelling, the associated long name of the file would be lost. Instead of a friendly long name, the file name got corrupted into this thing with squiggly marks. Not good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment