@kythyria
Created February 28, 2019 09:42
Falsehoods even programmers believe about filesystems
-----------------------------------------------------
I haven't collected evidence that someone has baked all of these misconceptions into a single program,
but boy howdy are some of them widespread.
1. Deleting and recreating a file is exactly the same as using `ftruncate` on it.
* Not on FAT, NTFS, or anything suitable as a Linux rootfs.
+ Mozilla Thunderbird at least used to assume this about its profile directory.
+ Many programs, such as editors, assume you can delete a file and rename a temp file into its place.
+ Windows even has magic to try and pretend this works.
2. If you can perform directory enumeration on a path, you cannot use that path as a regular file.
(aka, files aren't directories)
* Not true on WebDAV. Unclear if it applies to POSIX.
* Not true on ReiserFS 4
3. What `open()` does to directories is defined at all.
* WebDAV explicitly doesn't define what happens if you GET a directory.
* I have no damn idea whether this is the case on POSIX
4. Filenames are in a particular encoding.
* Encoding varies between OSes and configurations.
5. Filenames are valid strings in that encoding.
* If you mount a filesystem with a different encoding, you can get invalid filenames.
6. Filenames cannot contain certain characters.
* WebDAV _again_ (HTTP lets you escape any character).
7. Those characters are `/` and NUL.
* RISC OS permits `/` in filenames but not `.`. HTTP allows escaping it. Win32 dislikes a bunch more.
+ I've forgotten this one myself.
8. What do you mean, `.` is the directory separator?
* RISC OS again
* This is how IMAP directories are stored in maildirs, and IMAP itself reflects this.
9. There is a 1:1 correspondence between drive letters and filesystems.
* Nope, between making drive letters be aliases for paths, SMB/CIFS shares which don't
necessarily have drive letters at all, and being able to use NTFS directories as mount
points, even Windows doesn't guarantee this.
10. Each mounted filesystem is attached to a directory in an existing filesystem.
* Not on non-POSIX systems.
11. There is exactly one directory entry per file.
* Not on POSIX or Windows systems.
12. There is exactly one name by which a file can be referred to, period.
* Also not on POSIX or Windows, for a very long time.
+ Naive disk usage summarisers run into this.
13. Okay, but at least directories have this property?
* Symlinks, bind mounts, junction points, and WebDAV are really conspiring to ruin your day.
14. You can delete/move a file while it's open.
* Not necessarily on Windows.
+ Log rotators generally do this
+ So do Unix-style program updates
+ Taking advantage of this is sometimes advised as a way to prevent snapshots being written to.
15. You can't delete open files.
* Not necessarily true even on Windows.
16. Moving things within a filesystem is a reasonably fast operation, or even atomic.
* Not on Plan 9 from Bell Labs (no inter-directory move) or Amazon S3 (moving a directory is
actually a bulk rename).
+ Maildirs assume atomicity of one file at a time
17. Deleting files increases available space.
* Not on WORM media, or union filesystems where the file is in a readonly location. Nor if
the file is part of more than one ZFS snapshot, or has hardlinks.
* Not if the delete command integrates "recycle bin" functionality
18. All nameable locations can be written to.
* Not true on any filesystem with access controls.
+ Too many Windows 9x programs to count.
19. If a file is readable, all the containing directories are enumerable too.
* Not necessarily on POSIXy systems. Configurable on Windows.
21. Anything that has a drive letter (on Windows) or is mounted in the tree (on POSIX) is on fast,
reasonably local, reliable media.
* Optical media has long seek times, rotating media sometimes has to spin up, and even local
networks can fail.
+ Bash tends to hang if operations like chdir are slow.
+ Anything that assumes operations like chdir can't be slow at all.
+ IPFS' ambitions to have a FUSE driver that can be used as a filesystem rather than IPC imply this.
+ Old implementations of NFS could lock up the entire machine if some other machine didn't respond
in a timely manner.
+ Linux smbfs at least used to lock up the calling process if you tried to chdir after a connection
failure.
22. Filenames are case-sensitive.
* Not on Windows by default.
23. Filenames are case-insensitive.
* Not on POSIX.
+ Mozilla Thunderbird has in the past assumed this about its profile directory (IDK if it still does).
24. Requesting a file be copied is the same as reading from the source and writing to the destination.
* `cp foo bar` is different to `cat <foo >bar` if bar is a directory or has different ACLs.
* Windows `CopyFile()` also tries to copy ACLs
* catting one file to another will clobber ctime
* And clobber the destination even if the source is invalid
+ http://www.hpl.hp.com/techreports/2003/HPL-2003-222.html includes this assumption in the introduction.
25. Enumerating a directory only shows you its children
* Not on POSIX
+ Everyone who globs `.*` gets bitten by this.
26. Renaming a file causes open file handles to still point to it.
* Not on WebDAV (file handles are a pretence of the client).
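
Falsehoods 1, 11 and 17 can be poked at together with one hard link. The following is a minimal sketch, assuming a POSIX-like system, Python 3, and a temp directory on a filesystem that supports hard links. The function name `demo` and the file names are made up for illustration; nothing here comes from the programs mentioned above.

```python
import os
import tempfile


def demo(workdir):
    """Compare truncating in place with delete-and-recreate, observed through a hard link."""
    results = {}
    for strategy in ("truncate", "recreate"):
        path = os.path.join(workdir, strategy + ".txt")
        alias = os.path.join(workdir, strategy + ".alias")

        with open(path, "w") as f:
            f.write("original contents")
        os.link(path, alias)  # second directory entry for the same file (falsehood 11)
        inode_before = os.stat(path).st_ino

        if strategy == "truncate":
            os.truncate(path, 0)     # same file, now empty; the alias sees the change
        else:
            os.remove(path)          # drops one name; the data lives on via the alias
            open(path, "w").close()  # a brand-new file that merely reuses the old name

        results[strategy] = {
            "same_inode": os.stat(path).st_ino == inode_before,
            "alias_size": os.stat(alias).st_size,
            "link_count": os.stat(alias).st_nlink,
        }
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for strategy, observed in demo(tmp).items():
            print(strategy, observed)
```

On a typical Linux filesystem I'd expect the truncate case to keep the inode and empty the alias too, while the recreate case yields a different inode and leaves the alias holding the old contents. That also means the `os.remove` freed no space at all (falsehood 17). Other platforms and filesystems may well behave differently, which is rather the point.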