@sharadhr
Last active April 17, 2024 10:04
Home, Not So Sweet Home

$HOME, Not So Sweet $HOME

Preface

This was supposed to be a blog post, but I have neither the knowledge, nor the time, nor the energy to set up a nice statically-generated blog like everyone else does on Hacker News or proggit. Of course, I want one, but I also want to use a fully .NET Core-based static site generator, and therefore have experimented with Statiq.Dev Web. However, it turns out that I still need to pick up non-trivial HTML and CSS to make it look like anything but a website from the 1990s.

For now, though, GitHub-flavoured Markdown handles almost all use-cases I can think of (embedded images, this spoiler you've opened, complicated lists, in-line HTML for Ctrl, in-line $\LaTeX{}$ for $\text{maths} \equiv \text{fun}$...), and GitHub Gists renders Markdown as formatted text anyway, so what else do I need? I can even pin these gists to my GitHub profile, which is also very nice. If I ever do prepare a blog, it should be fairly straightforward to migrate these posts to it: they're just Markdown, after all. The only thing (so far) I've noticed that is missing is auto-numbered headings, although that's apparently a matter of CSS styling.


I've been meaning to write this for a very long time, more or less ever since I started using Linux properly four years ago, discovered ls -A, and realised what a clutter ~ was. At the same time, I noticed that the situation on my Windows install was even worse. What really inspired me to get started was the post 'Everything that uses configuration files should report where they're located'. I resonate strongly with the author's sentiment and, more importantly, dislike 'opinionated' software that does what its author thinks is best instead of following system conventions.

1 What is $HOME to you?

Home. It is where you're supposed to be most comfortable. It is your place of refuge, and a sanctum from the mess and chaos of the outside world. It is where you have complete liberty over everything: what you do, when you do things, how you do them, what things you have. It is where these things are supposed to be where you want them, and how you want them to be.

You buy things to decorate, maintain, and improve your home: paintings, photographs, vases, light fixtures, sofas, chairs, ovens, vacuum cleaners. You have your own volition to put these wherever you want, and set them up however you want. Your vacuum cleaner in a wardrobe? Sure. Or in the washroom, or in the service balcony, or the backyard. You can put its dust bag anywhere you want, too.

What if there existed a vacuum cleaner that stopped working if you plugged it into a different socket to the one the manufacturer set out in the manual? What if it stopped working if you changed where you put its dust bag, or swapped it out for a new dust bag? Would you buy this vacuum cleaner? Suppose that this vacuum cleaner was given to you for free, anyway. Would you still use it, but live with the compromise that your home is not exactly how you want it?

If you answer 'maybe', or even 'yes', then welcome to the world of software, where your home is violated on a regular basis. Software spews configuration files, temporary and cache files, generated and save files, catalogue files, and downloaded files everywhere. There is little rhyme or reason to any of it: many applications never tell the user they are putting these files HERE or THERE, and if the user wants to move them around, they aren't given a choice in the matter. Applications regularly disregard platform conventions (many ironies abound, to be detailed below); even if these conventions are clearly documented, some go out of their way to do their own thing. Some applications pretend that everything is immaculate by hiding their mess (and they typically don't do a great job of it: look at the many Linux-first programs ported to Windows).

It would irritate anyone. I am particularly compulsive about software doing what I'd like it to do, and about following platform conventions. Conventions, standards, and protocols exist so everyone is on the same page, and there is a common 'language' for software to communicate with. When software authors break them just for the sake of 'opinionation', or because they feel like it (a worse justification, in my opinion), it only leads to much exasperation on the users' end, because it comes as a surprise to them, and software should not be surprising. It ought to be reliable, reproducible, and follow expectations.

This post is a detailed discussion of user profiles, their directories, and how they are, to put it bluntly, in total disarray on Windows and Linux (I haven't used a Mac in ages, but I assume the situation is very similar there, too). Applications treat the user profile as a dumping ground, and any user with a reasonably wide list of installed software will find their user profile very difficult to traverse after some time in use. There are platform conventions and attempts to standardise things on more open-source platforms, but a lot of developers resolutely refuse to change the behaviour of their software for a variety of reasons (some less valid than others).

The first part is a deep dive into user profiles on Linux and Windows, and the conventions that have been established on these platforms over the years. The second section details how they are broken on each platform, and why they are broken.

This is a bit of a soapbox, but I hope developers read this, and at least attempt to fix their software so that home directories are cleaner, and users have easier lives maintaining and using their computers.

If you want to skip all the setup drudgery, go to §3.

2 Setup

Before I begin, there are some platform-specific details and setup to be discussed, as well as phrasing conventions. I will use the terms 'home directory' and 'user profile directory' somewhat interchangeably in this post, whereas 'user profile' means the home directory itself and its contents—including user-specific configuration and data—combined.

2.1 Linux

Linux is famously fragmented, but even so, there exist some conventions for user profiles, especially for desktop environments. As far as I've seen, Linux user profiles are typically created in /home/<username>. The directory path may be chosen during the out-of-box experience (OOBE)/first-time setup of a Linux distro or entirely manually, with useradd -d, which writes to /etc/passwd. Sometimes, /home might occupy an altogether separate partition/sub-volume (if the user is using Btrfs, for instance).

2.1.1 $HOME

Regardless of its location, the environment variable $HOME is set by a login process or graphical display manager (e.g. login, gdm, sddm, etc.) upon login, based on values previously set in /etc/passwd. This file is plaintext, but it may also be read using the POSIX API available on Linux:

#include <cstdio>
#include <pwd.h>
#include <sys/types.h>
#include <unistd.h>

auto main() -> int
{
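    // look up the password database entry for the calling process's real user ID
    auto const pw = getpwuid(getuid());
    if (pw != nullptr) {
        std::printf("User name: %s\n", pw->pw_name);
        // uid_t and gid_t are unsigned on Linux, so print them as unsigned
        std::printf("User ID: %u\n", static_cast<unsigned>(pw->pw_uid));
        std::printf("Group ID: %u\n", static_cast<unsigned>(pw->pw_gid));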
        std::printf("Home directory: %s\n", pw->pw_dir);
        std::printf("Login shell: %s\n", pw->pw_shell);

        return 0;
    } else {
        return 1;
    }
}

Strictly speaking, though, there is no real concept of a home directory per se on Linux/UNIX (hereafter referred to as *nix), and they aren't treated any differently by the OS (unlike Windows, as seen below). The pw_dir member variable is just the initial working directory of the login shell and any subsequent shells started by the corresponding user; it could technically point to any directory that has read permissions for said user. The pwd.h manual states as much.

Most shells and desktop environments parse ~ as an alias to $HOME; running cd without any command-line arguments also navigates to $HOME.
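As a minimal sketch of how a program might resolve the home directory itself, the function below prefers $HOME and only falls back to the pw_dir entry when the variable is unset or empty. The name home_directory is hypothetical, and the fallback order is an assumption about what a well-behaved program would want, not something the OS mandates.

```cpp
#include <cstdlib>
#include <optional>
#include <string>

#include <pwd.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: prefer $HOME (which the user may have overridden),
// and fall back to the pw_dir entry of the password database otherwise.
auto home_directory() -> std::optional<std::string>
{
    if (auto const* env = std::getenv("HOME"); env != nullptr && *env != '\0') {
        return std::string{env};
    }
    if (auto const* pw = getpwuid(getuid()); pw != nullptr && pw->pw_dir != nullptr) {
        return std::string{pw->pw_dir};
    }
    // neither source is available; let the caller decide what to do
    return std::nullopt;
}
```

Returning std::optional keeps the 'no home directory at all' case explicit, instead of silently writing to the current working directory.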

2.1.2 Dot-files and dot-directories

On *nix, prefixing a file or directory name with a full-stop (.) excludes said path, by default, from listings in userland utilities such as ls and graphical file managers. These files are considered 'hidden', although no special meaning is given to them by the filesystem itself (unlike on Windows). This is a convention dating to the earliest days of UNIX.

Most Linux users' home directories will contain a collection of these dot-files and dot-directories, and they are typically used to set user-specific configuration for almost all programs on *nix. They typically reside in the top level of $HOME; for instance, the Vim configuration is in $HOME/.vimrc. If you do want to list hidden files with ls, use ls -a or ls -A.
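To illustrate that 'hidden' here is purely a naming convention applied by userland tools, here is a rough sketch of a directory lister that skips dot-entries unless asked otherwise. The names is_dot_entry and list_names are made up for this example; the behaviour with show_hidden set is roughly that of ls -A, since std::filesystem::directory_iterator never yields . and .. itself.

```cpp
#include <algorithm>
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// A name counts as 'hidden' purely because it starts with a full-stop;
// nothing is stored in the filesystem to mark it as such.
auto is_dot_entry(fs::path const& path) -> bool
{
    auto const name = path.filename().string();
    return !name.empty() && name.front() == '.';
}

// Lists the names directly under `dir`, sorted; dot-entries are skipped
// unless `show_hidden` is set.
auto list_names(fs::path const& dir, bool show_hidden) -> std::vector<std::string>
{
    auto names = std::vector<std::string>{};
    for (auto const& entry : fs::directory_iterator{dir}) {
        if (show_hidden || !is_dot_entry(entry.path())) {
            names.push_back(entry.path().filename().string());
        }
    }
    std::sort(names.begin(), names.end());
    return names;
}
```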

2.1.3 XDG Base Directories

XDG

XDG stands for 'Cross-Desktop Group'.

The XDG Base Directory specification defines several environment variables expanding to subdirectories of $HOME in an attempt to standardise dot-files and dot-directories. I summarise them below:

| Variable | Default value | Details |
| --- | --- | --- |
| $XDG_DATA_HOME | $HOME/.local/share | User-specific data files; e.g. program databases, caches that persist through multiple program runs, search indices, 'Trash' directory for desktop environments. |
| $XDG_CONFIG_HOME | $HOME/.config | User-specific configuration files, including .*rc and .*env files; VS Code settings.json. |
| $XDG_STATE_HOME | $HOME/.local/state | User-specific state files, such as terminal history files. |
| $XDG_CACHE_HOME | $HOME/.cache | Caches limited to single runs of a program, but can extend to persistent caches, e.g. user-installed package manager caches for pip, pacman AUR wrappers, vcpkg, etc. |

Notice the specification provides for the scenario that these environment variables are not defined:

If $XDG_CONFIG_HOME is either not set or empty, a default equal to $HOME/.config should be used.
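The fallback rule quoted above is small enough to sketch in a few lines. The helper below is hypothetical (xdg_base_dir is not part of any library); it also ignores relative paths, since the specification requires these variables to hold absolute paths and asks implementations to treat relative ones as invalid.

```cpp
#include <cstdlib>
#include <filesystem>
#include <optional>
#include <string>

namespace fs = std::filesystem;

// Returns the directory named by `variable` (e.g. "XDG_CONFIG_HOME"), or
// $HOME/<fallback> when the variable is unset, empty, or not absolute.
// Yields std::nullopt only if $HOME is unavailable as well.
auto xdg_base_dir(char const* variable, fs::path const& fallback) -> std::optional<fs::path>
{
    if (auto const* value = std::getenv(variable); value != nullptr && *value != '\0') {
        if (auto const path = fs::path{value}; path.is_absolute()) {
            return path;
        }
    }
    if (auto const* home = std::getenv("HOME"); home != nullptr && *home != '\0') {
        return fs::path{home} / fallback;
    }
    return std::nullopt;
}
```

A program would then call, say, xdg_base_dir("XDG_CONFIG_HOME", ".config") or xdg_base_dir("XDG_STATE_HOME", ".local/state") once at start-up, and append its own sub-directory to the result.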

2.1.4 xdg-user-dirs

Since Linux does not provide an 'official' userland environment (unlike Windows), the XDG people have again made an effort to set up 'Windows-style' user directories with localisation, called xdg-user-dirs. Both GNOME and KDE Plasma Desktop require it. This tool is configured with a straightforward shell-style file at $XDG_CONFIG_HOME/user-dirs.dirs. The default directories generated on an English-language install are:

  • Desktop
  • Documents
  • Downloads
  • Music
  • Pictures
  • Public
  • Templates
  • Videos

As always, the relevant article at the Arch wiki is very useful.
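For programs that want to honour these user directories without shelling out, the user-dirs.dirs file is simple enough to read directly. Below is a rough sketch that parses a single line of the usual form XDG_DESKTOP_DIR="$HOME/Desktop"; the function name parse_user_dir is hypothetical, and the parser assumes only that shape (a leading $HOME or an absolute path inside double quotes), skipping comments and anything else.

```cpp
#include <optional>
#include <string>
#include <string_view>
#include <utility>

// Parses one line of the form  XDG_DESKTOP_DIR="$HOME/Desktop"
// Returns {name, path} with a leading $HOME replaced by `home`, or
// std::nullopt for comments, blank lines, and anything not in that shape.
auto parse_user_dir(std::string_view line, std::string_view home)
    -> std::optional<std::pair<std::string, std::string>>
{
    auto const start = line.find_first_not_of(" \t");
    if (start == std::string_view::npos || line[start] == '#') {
        return std::nullopt;
    }
    line.remove_prefix(start);

    auto const equals = line.find("=\"");
    auto const close = line.rfind('"');
    if (equals == std::string_view::npos || close == std::string_view::npos || close < equals + 2) {
        return std::nullopt;
    }

    auto const name = line.substr(0, equals);
    auto value = line.substr(equals + 2, close - (equals + 2));

    // only a leading $HOME is expanded; this is not a general shell parser
    constexpr auto home_prefix = std::string_view{"$HOME"};
    auto path = std::string{};
    if (value.substr(0, home_prefix.size()) == home_prefix) {
        path = std::string{home};
        value.remove_prefix(home_prefix.size());
    }
    path += value;
    return std::pair{std::string{name}, path};
}
```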

2.1.5 systemd-homed

systemd-homed allows Linux administrators to create and manage user profiles, and optionally encrypt them. It also has roughly equivalent functionality to Active Directory and roaming user profiles on Windows, where user profiles may be encrypted and stored remotely on some server, for retrieval by terminals upon login.

2.2 Windows

Windows and Microsoft commentary

I hold several very controversial opinions about Microsoft and Windows that would at best, raise many hackers' eyebrows, and at worst, get me called an astroturfer or Microsoft shill. These are best left to a separate blog post.

User profiles on Windows are a rather complex matter, which is not necessarily unwelcome. Most of it is configurable, fairly well-documented, and there is more than one way to do things, although some are clearly better than others.

2.2.1 Environment variables

Modern Windows (i.e. Vista and later) also has environment variables for important directories, including those within the user profile directory. Many are listed here (Microsoft's official documentation is a little less forthcoming), and I reproduce a few in the table below.

| Variable | Value | Details |
| --- | --- | --- |
| %SYSTEMDRIVE% | Usually C: | C: is not necessarily always the Windows install drive. |
| %WINDIR% | %SYSTEMDRIVE%\Windows | Windows is not necessarily the install directory for Windows itself. There are legacy directories like WINNT, WINNT35, WTSRV. |
| %SYSTEMROOT% | %WINDIR% | |
| %PROGRAMFILES% | %SYSTEMDRIVE%\Program Files | Default system-wide installation location for programs. Requires administrator permission to modify. |
| %PROGRAMFILES(X86)% | %SYSTEMDRIVE%\Program Files (x86) | Only on 64-bit installs of Windows. 32-bit-only programs go here. |
| %PROGRAMDATA% | %SYSTEMDRIVE%\ProgramData | System-wide program configuration and storage; e.g. default configurations, logs, etc. |
| %USERNAME% | Set in the registry | Can be changed in Control Panel or Settings. |
| %USERPROFILE% | Usually %SYSTEMDRIVE%\Users\%USERNAME%; see discussion below | |
| %APPDATA% | %USERPROFILE%\AppData\Roaming | User-specific roaming application data that may be exported between PCs; e.g. Google Chrome user profiles, Steam accounts; likely anything that requires an internet sign-in. |
| %LOCALAPPDATA% | %USERPROFILE%\AppData\Local | User-specific, PC-specific application data; sometimes includes the entire applications themselves if installed only for the user, e.g. Google Chrome and MiKTeX installed by a non-admin user. |

These environment variables may be accessed from C or C++ with getenv(), or any other language-specific API.

2.2.2 User profile creation and naming

This is a mess on Windows. Until Windows 8, user profile creation was straightforward, and only local accounts could be created (not including Active Directory roaming profiles). Since then, however, OneDrive (previously SkyDrive) and Microsoft account integration has thoroughly road-rolled over this simplicity. Today, creating a local, non-connected user profile in the OOBE in Windows 11 is a particularly convoluted process, and may require Windows 11 Pro, which is usually not shipped with consumer devices.

This digression is to discuss what home directory name users end up with: on a profile connected to a Microsoft account, the user home directory has a name that is the first five characters, in lowercase, of the email address or name provided. For instance, in the first case, I would have %SYSTEMDRIVE%\Users\shara, but on a 'local account', it would be %SYSTEMDRIVE%\Users\Sharadh (mine is actually the latter—I am obsessive about this, so I created a local account first, and then synced it to OneDrive). Like many things Windows, configuring the user profile directory after account creation requires editing registry keys using another administrator account.

2.2.3 Libraries and KNOWNFOLDERIDs

Upon user profile creation, Windows automatically sets up several directories, which are also libraries, inside the home directory:

  • 3D Objects
  • Contacts
  • Desktop
  • Documents
  • Downloads
  • Music
  • Saved Games
  • Videos

These libraries are initially created under %USERPROFILE%, but may be moved (right-click → Properties → Location tab → Move...) by the user at their own discretion. The Windows API provides enumerated GUIDs called KNOWNFOLDERIDs that map to these directories, as well as most of the system directories listed in §2.2.1 (Environment variables). These may be queried with SHGetKnownFolderPath, like so:

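#include <cstdio>
#include <cwchar>   // std::wprintf
#include <ShlObj.h> // SHGetKnownFolderPath

#pragma comment(lib, "shell32.lib") // link shell32.lib
#pragma comment(lib, "ole32.lib")   // link ole32.lib for CoTaskMemFree

auto main() -> int
{
    auto path = PWSTR{};

    // get desktop path; return type HRESULT
    auto const result = SHGetKnownFolderPath(FOLDERID_Desktop, KF_FLAG_DEFAULT, nullptr, &path);
    if (SUCCEEDED(result)) {
        std::wprintf(L"%ls\n", path);
    }

    // the caller owns the returned buffer and must free it, whether or not the call succeeded
    CoTaskMemFree(path);
    return SUCCEEDED(result) ? 0 : 1;
}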

In .NET, the Environment.GetFolderPath method, together with the Environment.SpecialFolder enum, returns the same value, although it is missing the Vista changes detailed below, because these constants are set to the CSIDL values instead of the KNOWNFOLDERIDs. There is a proposal to add these to .NET 8.

> [System.Environment]::GetFolderPath([System.Environment+SpecialFolder]::Desktop)
D:\Libraries\Desktop

I have set all my libraries to point to D:\Libraries\<library>, because my C: drive is rather small and dedicated to the OS and important programs only, whereas D: is significantly larger. As the output above shows, the graphical settings for the various libraries update the locations returned by the native Windows API too. Hereafter, I will use the Windows shell shortcuts to describe library locations.

2.2.3.1 Hidden folders on Windows

In particular, the parent directory of shell:AppData and shell:Local AppData (that is, %USERPROFILE%\AppData) is a hidden directory. Unlike *nix, 'hidden' on Windows has special semantics accorded by the filesystem (usually NTFS). This can be set either graphically (right-click → Properties → General tab → Hidden checkbox under Attributes), or retrieved/set programmatically in the Windows API, using GetFileAttributes and SetFileAttributes. These functions return or accept a bitwise-ORed file attribute constant, as demonstrated below:

#include <cstdio>
#include <cwchar> // std::wcslen
#include <Windows.h>
#include <pathcch.h>
#include <ShlObj.h>

#pragma comment(lib, "Ole32.lib") // CoTaskMemFree
#pragma comment(lib, "Pathcch.lib")
#pragma comment(lib, "Shell32.lib")

auto main() -> int
{
    auto path = PWSTR{};

    // the returned buffer must be freed with CoTaskMemFree, even on failure
    if (auto const get_path_result = SHGetKnownFolderPath(FOLDERID_RoamingAppData, KF_FLAG_DEFAULT, nullptr, &path);
        FAILED(get_path_result)) {
        CoTaskMemFree(path);
        return 1;
    }

    std::printf("SHGetKnownFolderPath succeeded, path is %ls\n", path);

    // get parent path of %APPDATA%, i.e. `%USERPROFILE%\AppData`;
    // the second argument is the buffer size in characters, including the terminator
    if (auto const remove_leaf_result = PathCchRemoveFileSpec(path, std::wcslen(path) + 1);
        FAILED(remove_leaf_result)) {
        CoTaskMemFree(path);
        return 1;
    }

    auto const attributes = GetFileAttributesW(path);
    auto const error = GetLastError(); // capture before making further API calls
    CoTaskMemFree(path);

    if (attributes == INVALID_FILE_ATTRIBUTES) {
        std::printf("GetFileAttributes failed with error code %lu\n", error);
        return 1;
    } else if (attributes & FILE_ATTRIBUTE_HIDDEN) {
        std::printf("%%APPDATA%%\\.. is hidden\n");
        return 0;
    } else {
        std::printf("%%APPDATA%%\\.. is not hidden; something is wrong\n");
        return 1;
    }
}

2.2.4 Windows XP to Vista changes

A Brief History of Windows Profiles is a great article, but I want to focus on one important section. Windows 2000 and XP used %SYSTEMDRIVE%\Documents and Settings\<username> for the home directory, which was moved to %SYSTEMDRIVE%\Users\<username> with Vista and later. Windows Vista also introduced the above-mentioned KNOWNFOLDERIDs, which superseded the older CSIDL enumeration (although the latter is still available, and used by the .NET API).

Many environment variables like %APPDATA% were also redirected to the current locations (the previous ones were a mouthful: %SYSTEMDRIVE%\Documents and Settings\<username>\Application Data for roaming data, and %SYSTEMDRIVE%\Documents and Settings\<username>\Local Settings\Application Data for local data). New known folders were added, such as Downloads and Saved Games; the My⎵ prefix was removed from My Documents and My Music; locations such as Start Menu were moved (in this case, to %APPDATA%\Microsoft\Windows\Start Menu).

The reason for this move is anyone's guess, but I feel the $\geq$ Vista convention makes a lot more sense than the $\leq$ Windows XP one. There is an old guide for sysadmins migrating from Windows XP to Vista, and there is another blog post for Windows application developers: Where Should I Store my Data and Configuration Files if I Target Multiple OS Versions?

As an aside, there is an apocryphal tale about why programs are stored in Program Files and not ProgramFiles or Program_Files: apparently Microsoft wanted to force programmers to write code defensively, and handle spaces in paths without crashing, so they made the program install location itself have spaces in it (this then raises the question: why ProgramData?). Windows also has decent localisation (at least amongst Indo-European languages employing the Latin script): set the system language to German, for instance, and Program Files is now Programme; it is Programfiler in Norwegian.

2.2.5 HKEY_CURRENT_USER registry hive

On Windows, many user-specific configurations live in the HKEY_CURRENT_USER registry hive (abbreviated as HKCU), which is backed by the file %USERPROFILE%\NTUSER.DAT. This includes changes made in both Control Panel and Windows Settings, as well as settings for Microsoft programs like the Office suite. This hive is synchronised across terminals on roaming user profiles (such as with Active Directory and domains).

2.3 Linux and Windows equivalents

So far, I've discussed user profiles on Linux and Windows. There are some rough equivalents between the two, which ought to be useful for developers aiming to write well-behaved cross-platform applications:

| Use-case | Linux | Windows |
| --- | --- | --- |
| Program install directory | /bin | %PROGRAMFILES% or %PROGRAMFILES(X86)% |
| Headers | /usr/include | Shipped with the program, or available with Windows SDK |
| System/program libraries | /usr/lib | %WINDIR%\System32 or shipped with the program; Visual C++ and .NET redistributables pre-installed or installed on-demand |
| Default/system-wide program configuration | /etc | %PROGRAMDATA% |
| System-wide logs | /var/log | %PROGRAMDATA% |
| Per-user program configuration | $XDG_CONFIG_HOME | %APPDATA% or %LOCALAPPDATA% |
| Per-user program data | $XDG_DATA_HOME | %LOCALAPPDATA% |
| Per-user program cache | $XDG_CACHE_HOME | %APPDATA% |

3 Conventions, and why they're broken

3.1 Respecting the user's choice and expectations

Okay; that was a pretty long introduction, but I haven't really gotten into why these user profile conventions have been established in the first place. It seems like pointless bike-shedding—discussing where user data ought to be saved—but there are real issues which said conventions attempt to solve.

These specifications don't exist merely to 'put things somewhere' for the hell of it. Disparate applications from a wide variety of developers and backgrounds which implement these specifications will have only a single place to write to (and read from), and a single point of backup, which reduces user workload. As I mentioned in the introduction, software ought not to be surprising or unnecessarily opinionated. Users see a specification, or some clearly-labelled pre-generated folder in their home directory, and expect software to adhere to these specifications and write to a sensible location.

GitHub user Lyle Hanson (@lhanson) puts it more clearly than I could, in the issue for Vim:

The benefit of respecting the specification isn't just to put it "somewhere else", it's to put it where I (the user) want it, without having to repeat myself every time I install anything.

... I may eventually get around to trying to back up and/or version control my configuration files without hauling around everything else on my drive and I'll be delighted that most of them seem to be in ~/.config. When I have to add exceptions and regex matches for every program which stores its files in what seem to me like a jumble of random locations, the fact that a given program has done it that way for 30 years doesn't make my job any easier or less frustrating than for a program that was written last week without knowing any better.

... From a UX point of view, I'd rather express my preferences at most once and be able to leverage reasonable assumptions later as my file management habits evolve than to have to express my preferences $n \times m$ times, where $n$ is the number of programs I use and $m$ is the number of unique configuration options each of them exposes to specify where all of their files go.

... It's not about simply appeasing a subset of Vim users who for some reason have glommed onto some new-fangled specification, it's about respecting users in general and making things easier for them by default.

I'd like to tack on to his point about 'respecting users': just because the user's home directory and its contents have full read-write-execute permissions for the user does not mean that software executed by said user should have free rein over the directory.

3.2 So, what now?

The default assumption by a programmer might be:

Let's put config and data files in the same directory as the program.

This assumption immediately breaks on most modern desktop operating systems, because programs are typically installed to locations which require elevated write permissions (e.g. /bin, %PROGRAMFILES%), which means neither users nor their processes can write there willy-nilly. Furthermore, most desktop OSs are multi-user, which means handling several different configurations and data for multiple users. Even Android, a smartphone OS, has supported multiple user profiles since 5.0 Lollipop, which means providing ways for developers to handle different user profiles.

From my experiments, apps and their data are completely sandboxed per-user on Android, so apps installed by one user can be neither seen nor accessed by another user. This is workable on desktop OSs, except that it still doesn't really solve the backup problem: do we just sync the entire app and its contents to a server? Many programs on desktop OSs are several gigabytes large, and naïvely syncing this much data is a waste of bandwidth. And it hearkens back to the point mentioned above: any backup utility or the user will have to handle a polynomially large number of application-user-configuration combinations.

3.3 Screw your conventions, we've always done it this way

We come to the real crux of the matter: why and how developers break the above-mentioned conventions and standards. Developers cite only a few reasons for not wanting to adhere to conventions, and the following are lifted verbatim from their respective issue trackers (many closed as 'won't fix'):

Arduino:

It's a change to an existing and established behaviour, a breaking change for all existing users too.

Bash:

There's no reason to change historical behavior here. All the world is not Linux.

OpenSSH (archived page, because the original bug report is inaccessible to guest users):

No. OpenSSH (and it's ancestor ssh-1.x) have a 17 year history of using ~/.ssh. This location is baked into innumerable users' brains, millions of happily working configurations and countless tools. Changing the location of our configuration would require a very strong justification and following a trend of desktop applications (of which OpenSSH is not) is not sufficient.

RenderDoc:

I've thought about it but I don't want to accept this change. The additional complexity and bug surface is not worth the value added by the feature. ... I don't want to add dependency on those environment variables.

Flatpak:

There is no actual problem here.

In a nutshell, these reasons are:

  • ignorance, i.e. the developer doesn't know the specification exists;
  • arrogance, i.e. 'my way is correct, I don't care what my users say because they're stupid', or even 'it works on my machine';
  • fear of introducing change for change's sake, and breaking user workflows;
  • fear of introducing complexity in handling these conventions, especially if the program is multi-platform.

3.3.1 Ignorance

This is more understandable than the rest, and is easily mitigated, especially if the developer is responsive and accepting of changes. Nothing else to say here, honestly; if you don't know the platform conventions, you don't; hopefully you pick them up and rewrite your software to be a good citizen of the platform you're targeting.

3.3.2 Arrogance

This is the least acceptable, and it reveals an ugly superiority complex. If developers don't respect their users, then why even bother releasing software publicly, except to flex and collect bragging rights? Many open-source authors angrily retort: 'develop it yourself if you want $x$ feature'. Yes, there are dire warnings about 'as-is', and 'no warranty' for 'merchantability' and 'fitness for a purpose', and I'm not claiming that open-source software authors are legally obliged to make quality-of-life changes. The social contract, however, is that said developers and maintainers listen to user feedback, judge it on its own merits, and implement frequently-asked-for features or fixes.

3.3.3 Fear of change and complexity

I'm not going to quote Heraclitus or Benjamin Franklin here ('change is the only constant'), but software should be developed to fit users' needs, and platform conventions. As software is developed, it changes, doesn't it?

In all honesty, it seems like this fear is a function of what is being changed: developers tend to view adding fancy new features with delight, but see more menial tasks like properly handling paths and correct cross-platform behaviour with disdain. If your program is going to be cross-platform, you will have to handle the inherent complexity in supporting all those platforms, and again, be a good citizen on those platforms.

Merely using cross-platform frameworks (like Electron or Qt) is not a panacea: if your program writes to a non-canonical location in the user's home directory, it ought to tell the user where it is writing to, and what it is writing. Case in point: Visual Studio Code is a fairly big problem (see below).

4 Platform-specific issues

4.1 Linux

Before I proceed, I provide a listing of my own $HOME on my Arch Linux install:

My $HOME
% ls -A1 --group-directories-first $HOME
.android
.astropy
.cache
.cgdb
.config
.dotnet
.enthought
.gitkraken
.gnome
.gnupg
.icons
.java
.local
.miktex
.mozilla
.npm
.nuget
.omnisharp
.pki
.pytest_cache
.renpy
.rustup
.sonarlint
.sqlsecrets
.ssh
.stellarium
.swt
.templateengine
.thunderbird
.vcpkg
.vscode
.zotero
Desktop
Documents
Downloads
Music
OneDrive
Pictures
Public
Templates
Videos
'VirtualBox VMs'
Zotero
bin
enthought
.Xauthority
.bash_history
.fonts.conf
.gtkrc-2.0
.lesshst
.nvidia-settings-rc
.ocamlinit
.python_history
.utop-history
.viminfo
.vimrc
.wget-hsts
vkvia.html

Of the above directories and files, I created exactly two manually: OneDrive, and bin. The rest (excluding the XDG directories) were either auto-generated by xdg-user-dirs (which is okay), or created and written to by non-compliant software.

The link above discussing the origin of dot-files makes it clear that the current behaviour was possibly a mistake made by the original developers of UNIX. Even if it wasn't, the result today is a messy litter of dot-files dumped all over a user's home directory. The XDG Base Directory (hereafter, XBD) specification is more than a decade old—an eternity in computing terms—and yet, there is a veritable parade of very well-known programs that refuse to follow its guidelines: see the quotes above in §3.3.

The Arch Wiki has a list of programs that are XBD-compliant by default, those that may be forced to comply after user intervention, and those with hard-coded non-XBD paths. The latter two lists combined are almost twice as long as the compliant list, and include some very prominent *nix-first software like Bash, Vim, OpenSSH, and Firefox.

There's little more to say here: there exists a specification, many programs adhere to it, and many don't. Fragmentation is only useful in grenades, and not in software specifications.

For the record, people have found XBD compliance painful enough that someone developed a tool dedicated to finding non-compliant programs, and suggesting user-side workarounds: xdg-ninja (written in Haskell, by the way, which is nice).
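To get a feel for the scale of the problem on your own machine, a crude check is enough: list every dot-entry at the top level of a directory that is not one of the default XDG base directories. The sketch below does only that. It is not how xdg-ninja works, the name stray_dot_entries is made up for this example, and it assumes the XDG variables are left at their defaults.

```cpp
#include <algorithm>
#include <array>
#include <filesystem>
#include <string>
#include <string_view>
#include <vector>

namespace fs = std::filesystem;

// Top-level names implied by the default XDG base directories:
// $HOME/.config, $HOME/.cache, and $HOME/.local (data and state).
constexpr auto xdg_default_names = std::array<std::string_view, 3>{".config", ".cache", ".local"};

// Returns the sorted names of dot-entries directly under `home` that are
// not one of the default XDG base directories.
auto stray_dot_entries(fs::path const& home) -> std::vector<std::string>
{
    auto strays = std::vector<std::string>{};
    for (auto const& entry : fs::directory_iterator{home}) {
        auto const name = entry.path().filename().string();
        auto const is_xdg = std::find(xdg_default_names.begin(), xdg_default_names.end(), name)
            != xdg_default_names.end();
        if (!name.empty() && name.front() == '.' && !is_xdg) {
            strays.push_back(name);
        }
    }
    std::sort(strays.begin(), strays.end());
    return strays;
}
```

Not every name this reports is necessarily a program's fault (a user may well create dot-files by hand), so treat the result as a list of candidates to investigate.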

4.2 Windows

Windows... Ah, Windows. Bastion of backwards compatibility, keeping icons and settings options from Windows NT 3.1 around in Windows 11... And the source of all your problems.

As on Linux, let me provide a listing of my %USERPROFILE%:

My %USERPROFILE%
> gci $env:USERPROFILE -Name
.android
.cache
.config
.dlv
.dotnet
.eclipse
.fop
.gk
.gnupg
.gradle
.librarymanager
.m2
.matplotlib
.ms-ad
.nuget
.omnisharp
.platformio
.sonarlint
.ssh
.templateengine
.thumbnails
.vscode
ansel
Calibre Library
Contacts
dotTraceSnapshots
Heaven
recovered
RenPy
source
Tracing
Zotero
.cortex-debug
.git-for-windows-updater
.gitconfig
.kdiff3rc
.lesshst
Sti_Trace.log

Yep, dot-files... in Windows. Windows does not automatically hide dot-files, whether in Explorer, dir, or Get-ChildItem. Notice that the above listing is missing AppData: that's because it is properly hidden by the filesystem, as mentioned in §2.2.3.1. Software written for *nix first frequently ignores this mechanism and relies on the dot prefix instead: notice .gitconfig, .ssh... There's even the full set of XDG base directories, which is a *nix specification!

Developers either assume Windows operates the same as *nix, or don't care about Windows much ('second-class citizen'), and again, couch any changes to fix this as 'introducing unnecessary complexity'. Honestly, one would instead think the change is fairly straightforward: test which OS the program is running on using #ifdefs and OS-specific macros, and delegate to the appropriate function. Here's a wiki listing some of these macros. Better still, use the ecosystem's build system to configure builds for different platforms, and shift this test to configure and compile-time instead of run-time.
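To make the 'fairly straightforward' claim concrete, here is a minimal sketch of that dispatch. It is written in Python rather than with C-style #ifdefs, so the test happens at run-time via sys.platform. The helper name config_dir and its parameters are hypothetical, and the directories chosen (%APPDATA% on Windows, ~/Library/Application Support on macOS, $XDG_CONFIG_HOME or ~/.config elsewhere) are one reasonable reading of each platform's convention, not the only one.

```python
import os
import sys
from pathlib import Path


def config_dir(app_name, platform=sys.platform, env=None, home=None):
    """Pick a per-user configuration directory for app_name (illustrative helper)."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    if platform == "win32":
        # Windows: roaming application data; assume the usual default if %APPDATA% is unset.
        base = Path(env.get("APPDATA") or home / "AppData" / "Roaming")
    elif platform == "darwin":
        # macOS: the per-user Library directory.
        base = home / "Library" / "Application Support"
    else:
        # Other *nix: XDG config home, falling back to ~/.config when unset or empty.
        base = Path(env.get("XDG_CONFIG_HOME") or home / ".config")
    return base / app_name
```

Calling config_dir("example-app") with no other arguments uses the current platform and environment; the extra parameters exist only so that each branch can be exercised on any machine.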

Some semi-compliant software hard-codes library paths, assuming user libraries will remain in the default locations, i.e. sub-directories of %USERPROFILE%. This is not true, and users can change their locations as mentioned above. Notice there's a Contacts directory in the listing above: this was created by KDE Connect.

That said, not all the above dot-directories are created by traditionally *nix programs, which is a good segue to the next section...

4.2.1 The Microsoft irony

Notice .dotnet, .nuget, .omnisharp, .templateengine, .vscode, and source. Microsoft itself is a lousy citizen of Windows. source is repeatedly created by Visual Studio (not Code), especially on a fresh install. The rest are open-source projects with active, on-going issues related to home directory pollution.

It feels like developers who have never natively used Windows or its stack work on and contribute to Microsoft software without liaising with veteran Windows teams. Maybe that's why Windows 11 looks so...inspired... by macOS (which is an aberration; the macOS windowing system sucks).

4.2.2 Video games

Video games on Windows deserve a special heading of their own, because they are particularly egregious offenders. In my opinion, game devs really have no excuse: almost all games primarily target Windows, or are ported by dedicated teams who know Windows inside out, and more importantly, develop using an entirely native toolchain. Even Unreal and Unity have functionality to call out to native code, and developers still don't use it to properly handle save data. The issue is such that the PCGamingWiki has dedicated 'save file location' and 'config file location' headings for every single game in its database.

Most video games seem to like writing their save files, screenshots, and settings data into Documents\ or Documents\My Games, which is a holdover from Windows XP. Look, I get it, many game engines are old; they trace their lineage to engines first written in the 1990s and early 2000s. But if developers can add ray-tracing updates to a 16-year-old game (Portal with RTX), surely they can spend a couple of hours fixing this too.

Here is a listing of my Documents library, which illustrates the problem:

> gci D:\Libraries\Documents -Name
3DMark
4A Games
Anno 1404
Anno 1404 Venecia
Anno 1404 Venedig
Anno 1404 Venezia
Anno 1404 Venice
Anno 1404 Venise
ANNO 1404 Wenecja
Anno 1800
Assassin's Creed Odyssey
Assassin's Creed Unity
Battlefield 1
Blackmagic Design
Custom Office Templates
Dell
en-GB
FeedbackHub
Graphics
Horizon Zero Dawn
IISExpress
Larian Studios
MATLAB
MAXON
My Data Sources
My Digital Editions
My Games
My Spore Creations
My Web Sites
OnScreen Control
Outlook Files
PowerShell
PowerToys
Rockstar Games
Shadow of the Tomb Raider
Sound recordings
Steam Cloud
The Witcher 3
The Witcher 3 Mod Manager
Visual Studio 2019
Visual Studio 2022
WindowsPowerShell
Witcher 2
Wolfram Mathematica
Zoom
_Documents
enc_cert.pfx
mods.settings

Anno 1404, what the hell are you doing? Six different (localised) directories, all containing the same paths.

Honestly speaking, I doubt game developers are going to change to Saved Games, and this is partially Microsoft's fault, again. The folder is listed neither in the default File Explorer Libraries view nor in the This PC view; users have to navigate to it manually. Its rarity is demonstrated by the ratio of games using either Documents or Documents\My Games to those using Saved Games: 32:4 on my computer (the four games using Saved Games are Cyberpunk 2077, Metro: Exodus, Kingdom Come: Deliverance, and **). Some games forgo Documents altogether, and put the save files directly with the game files, or even in %APPDATA%.

This answer on the Game Development StackExchange is relevant:

The best reason I can think of is to reduce cost of customer support. The problem is that %APPDATA% and %LOCALAPPDATA% are hidden directories, and nobody seems to know about %UserProfile%\Saved Games. Placing the files in Documents, studios can save money on support calls that ask how to backup savegames, or how to migrate the savegames from one machine to another.

Notice even here that Microsoft is polluting Documents with Custom Office Templates, FeedbackHub, My Data Sources, Outlook Files, Visual Studio 20XX, PowerShell, PowerToys, and WindowsPowerShell (yes; there are two PowerShells). The folder contains everything but documents now, and it is so cluttered that I keep a _Documents subfolder for my own stuff: files I explicitly created via a save dialog box or the command-line. The leading underscore is a necessary evil, as it means the directory is listed first when sorted by name in File Explorer.

In a nutshell... A hopeless situation. There's no saving Windows user profiles when Microsoft itself doesn't adhere to its own conventions.

5 Solutions

So if things are so 'hopeless' as I put it, then why bother with this article? I think any change towards compliance is better than no change, that's all.

On *nix, the answer is straightforward: get everyone to adhere to the XDG Base Directory specification. Of course, 'get everyone to' is doing a lot of work in that sentence: it involves submitting an issue, convincing maintainers that this is worthwhile, possibly writing code that satisfies the specification correctly (including the fall-back directories), and enduring all the review drudgery before things are finally merged.
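For a sense of how small the 'writing code' part can be, here is a rough sketch of resolving the four per-user XDG base directories with their fall-backs. The function name xdg_base_dir and its env and home parameters are hypothetical, added for illustration. The defaults and the rule that a relative path in one of these variables is ignored follow my reading of the specification.

```python
import os
from pathlib import Path

# Fall-back locations, relative to $HOME, used when a variable is unset or empty.
_XDG_DEFAULTS = {
    "XDG_CONFIG_HOME": ".config",
    "XDG_DATA_HOME": ".local/share",
    "XDG_STATE_HOME": ".local/state",
    "XDG_CACHE_HOME": ".cache",
}


def xdg_base_dir(variable, env=None, home=None):
    """Resolve one XDG base directory, applying the fall-back when needed."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    value = env.get(variable, "")
    # Unset, empty, and relative values are all treated as invalid.
    if value and Path(value).is_absolute():
        return Path(value)
    return home / _XDG_DEFAULTS[variable]
```

A program would then store its settings under xdg_base_dir("XDG_CONFIG_HOME") / "its-own-name" instead of a dot-file in the home directory. This sketch deliberately leaves out XDG_RUNTIME_DIR and the system-wide search paths (XDG_CONFIG_DIRS, XDG_DATA_DIRS), which have different rules.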

On Windows, the problems arise from within, and I sincerely don't see a solution. Thankfully, many non-compliant projects are fully open-source, and contributing to them, or up-voting issues, will hopefully help. Game developers are famously opaque (especially given how lucrative the industry is as a whole), and as mentioned above, they do have genuine reasons for saving stuff in Documents.

That being said, there is still value in attempting to fix this on Windows. The specification is clear, some developers clearly know about it, and it is a good starting point for new projects (Microsoft is apparently promoting Rust now over C++ for green-field Windows development, so maybe there's hope yet).

5.1 Home, sweet home

We have so many real-life analogues in our computers: from files, folders, and rubbish bins, to the 'desktop' metaphor. Surely we can extend the concepts of 'clutter' and 'cleanup', too? Let's be good citizens of the platforms we develop for, and give our users choices and control over their data, and where it is stored. I'd like a nice vacuum cleaner for my home directory, please.

6 Relevant blog posts

Here are several blog posts and articles that inspired me and were useful for my post—some of which were linked above:

@mordae

mordae commented Aug 17, 2023

The social contract, however, is that said developers and maintainers listen to and judges user feedback on their own merits, and implements frequently-asked-for features or fixes.

You probably meant:

... and ~~implements~~ merges frequently-asked-for features or fixes.

@ChoHag

ChoHag commented Aug 17, 2023

3.2.4 Sod off kid, you're wrong and you don't know enough to know why yet.

Except that was already covered by the quotes in 3.3, which came before 3.2.*.

Great work explaining how we need to be better at organising!

@takluyver

Thanks, this was an interesting read. 👍

A few years ago, I was heavily involved in IPython, and we chose to reject calls to follow the XDG spec on Linux, in favour of our traditional .ipython folder. We certainly weren't ignorant of the XDG spec, and I hope we weren't unduly arrogant. Avoiding change and complexity was the rationale, but I think that's worth unpacking a bit more.

You seem to be discussing change & complexity largely as a drawback for developers, and suggesting, not unreasonably, that we should just get over it and put in the work. But this also affects users if they ever want to edit or inspect their config file: it's much easier to say 'it's in ~/.ipython (unless you've deliberately changed that)' than to describe different locations based on the platform and some system config. Of course you can have a command to find it (there is one: ipython locate), but that's still an added layer of complexity for novice users compared to a single default folder. E.g. some users on Windows might need to launch the right kind of command prompt to get that command to work.

The quote from SSH and your points about save game folders on Windows are kind of the same thing: the 'clean' solution is not necessarily the best one for (all) users. And unsurprisingly, different developers prioritise different things.

There's also a sense that the XDG base directories are designed for desktop applications - that is the D in XDG, after all. It's worst for the 'runtime' directory, which you don't cover here - this can be deleted under programs left running in screen/tmux - but that adds to a general sense that XDG is 'not for us'.

@rollcat

rollcat commented Aug 17, 2023

This is a very nice write-up. However regarding OpenSSH's reasoning for not respecting $XDG_* variables, they do have an exceptionally strong argument, that doesn't fit your list (ignorance / arrogance / fear of change / complexity).

It's the fact that you have yourself a bootstrapping problem: there's more than one way to set XDG_CONFIG_HOME (and friends) in a manner that's effective for the user, and for the entire duration of their session. The user may choose to do that in a way that's impossible for sshd(8) to honour; e.g. as an export directive in .profile, .bashrc, etc.

Most importantly, the login process is outlined in the man page: all authentication steps must be successfully completed before any code is run as the target user. How would sshd find the authorized_keys file?

To generalise the problem, every system of coordinates needs some absolute zero point. The hard drive has to have a boot sector or a partition table; the kernel needs to run /sbin/init; login(8) has to consult /etc/passwd; and so on. You can't leap into the future to learn the next step, that's the entire point of bootstrapping.

@JoaoCostaIFG

I really dislike how the home directory looks with all those directories. I know this doesn't solve all problems, but I've had success with boxxy. From its description:

boxxy (case-sensitive) is a tool for boxing up misbehaving Linux applications and forcing them to put their files and directories in the right place, without symlinks!

@soc

soc commented Aug 17, 2023

@sharadhr: On *nix, the answer is straightforward: get everyone to adhere to the XDG Base Directory specification.

A good approach is to make the $HOME directory read-only. Then file issues based on applications failing while trying to write their garbage to $HOME. Let them figure out the right solution themselves.

An even better approach is to write a library (or two) as an easily adoptable solution for application developers that addresses these issues consistently and for all platforms. This also helps with new applications behaving correctly from the start – instead of having to retroactively get mistakes corrected.

I got higher impact this way than with tickets/changes/PRs to individual applications.

@takluyver XDG base directories are designed for desktop applications

Oh, just fuck off with this boomer shit of spewing nonsense where you know it's nonsense and I know you know it's nonsense. Can't believe people are still dragging out this delusional argument in 2023. And yeah, the arrogance the author complains about is on full display with you.

Retroactively coming up with "reasons" while it's completely obvious to everyone that people like you just couldn't be bothered to do things correctly ... stop embarrassing yourself.

Stop acting like you are special. You aren't. Stop acting like your use-case is special. It isn't.

@rollcat: To generalise the problem, every system of coordinates needs some absolute zero point. The hard drive has to have a boot sector or a partition table; the kernel needs to run /sbin/init; login(8) has to consult /etc/passwd; and so on. You can't leap into the future to learn the next step, that's the entire point of bootstrapping.

The solution is to get the config dir into /etc/passwd – either by extending the existing columns, or by repurposing one of the existing columns. Ugly, painful, but the only viable option.

@rollcat

rollcat commented Aug 17, 2023

@soc:

The solution is to get the config dir into /etc/passwd – either by extending the existing columns, or by repurposing one of the existing columns. Ugly, painful, but the only viable option.

  1. This would complicate the remote login process, something that by definition is security-critical.
  2. We already have plenty of pretty good mechanisms for setting up the user environment, like .profile, .bashrc, PAM, and others (often OS-specific).
  3. You'd have to decide on scope. Do you want the ability to set arbitrary environment variables, or only the specific values strictly relevant to the login process? Have you considered every authentication method currently in use, and how it would affect e.g. PAM (aka the surreal horror), BSD auth, factotum?
  4. This is asking for every system that OpenSSH runs on to implement your extension - good luck getting everyone in the same room to agree on the details.

On top of that, I think this is a net loss on flexibility and usability:

  1. This is less flexible than setting the value in my .profile, e.g. I'd like to detect whether I'm on macOS, set XDG_CONFIG_HOME to ~/Library/Application Support, and check the whole thing into my dotfiles repository.
  2. It creates friction when "moving in" to a new machine. My typical process is: 1. install git, 2. clone a git repo. The authorized_keys file is in there - there is no step 3.

To further illustrate my point: I would like your proposal to include the ability to specify an alternative location for /etc/passwd.

@WorldMaker

Honestly speaking, I doubt game developers are going to change to Saved Games, and this is partially Microsoft's fault, again.

This is still such an interesting mistake from "Games for Windows Live" in Vista. Microsoft tried to build an Xbox-like certification path for games on Windows, which was a good idea but implemented so poorly in Vista and tied up in too many things at once. The carrot for game developers to do the extra work to pass certification by doing things like only using the Saved Games folder for game saves was supposed to be things like Xbox-style cloud saves and access to Xbox-style network servers and Xbox cross-play/cross-saves. Cloud saves never quite worked right, PC developers generally prefer to run their own network infrastructure, including cross-play/cross-saves, and weren't really ready to see cross-play/cross-save at the time (Microsoft was a few years ahead of that curve). Instead, the only "carrot" most game developers might even notice was the strange and cumbersome "Games Explorer" not-quite-a-Windows-folder/not-quite-a-launcher that most users hated, didn't understand, or didn't know existed.

PC games probably could still use a good certification path to clean up the mess of their save games and some of their filesystem security problems. (Turn on something like Microsoft Defender's "Ransomware Protection" tool and you find all kinds of crazy file access things that games do.) Certification was a good idea. Microsoft seems unlikely to discover the right carrots or sticks, especially now that both Epic and Valve seem adamant to fight them on trying anything to clean up the mess.

@seqizz

seqizz commented Aug 17, 2023

I'd like a nice vacuum cleaner for my home directory, please.

It might be your lucky day

@soc

soc commented Aug 17, 2023

@rollcat Thanks, but I have done my homework before posting. I'd suggest you do the same.

We already have plenty of pretty good mechanisms for setting up the user environment, like .profile, .bashrc [...]

This is where your complete argument collapses.

To further illustrate my point:

You have no point.

I would like your proposal to include the ability to specify an alternative location for /etc/passwd.

I hope you are able to realize the contradiction with your first quoted sentence yourself.

@rollcat

rollcat commented Aug 18, 2023

@soc you did not articulate a single point in your response. If you disagree with anything I wrote, you need to commit more effort than saying an equivalent of "surely you can see why you're wrong".

@ChoHag

ChoHag commented Aug 18, 2023

Bah I should have said nothing. It was always going to fall on deaf ears anyway.

Now I'm subscribed to immature devbros vacuously bickering over something they can't change and I don't care about, and the unsubscribe button is replaced with an admonition to join in with the excitement on a phone I don't have.

old man yells at cloud

@sharadhr
Author

sharadhr commented Aug 18, 2023

@ChoHag: your own comment was—I daresay—more 'vacuous' and immature than the comments you're riling over.

In order, your comment was—

  1. An argumentum ad verecundiam, or argument from authority, complete with a swear. Your only comment was that I was wrong, and you didn't bother elaborating on why or how I was wrong;
  2. a snide comment about what was obviously a typo, and thereafter:
  3. a straw man dripping with sarcasm, as though said typos completely invalidated the rest of my post.

You made the bed you're sleeping in, re: the rest of the comments.

@MarcusJohnson91

Solution:

store configuration files with the app in a user specific sub directory, then when the app is uninstalled the useless config files can be removed too.

for example:

/Applications/Google/Chrome.appl/Configurations/Marcus/Chrome.(json|xml|ini).

and security can be enforced by setting the permissions on that user specific sub folder too.

@SarreqTeryx

SarreqTeryx commented Aug 26, 2023

2.2.1 - I never understood the need for "Program Files (x86)". It should be done on the app's folder name if it's really necessary to differentiate your x86 and x64 apps. Or, for that matter, why the "Files" portion of the folder name is needed?
c:\Programs\My App (x64) or c:\Programs\My App (x86) rather than c:\Program Files (x86)\My App or c:\Program Files\My App

My sanity says this should be the correct folder layout for Windows:
c: or whatever windows is installed on
-\Programs
--\My App 1 (x86)
--\My App 2 (x64)
--\AppData ALL shared program/app data goes here, NEVER an entire app
---\Temp ALL systemwide temp files should go here, not in random places in the file system
---\Templates ALL shared templates here
--\Windows Apps I have no problem with this one, as they're (mostly) monolithic
-\Users
--\Current User
---\Data All user-specific settings and temp files should be here, NEVER an entire app (I'm looking at you, Chrome)
----\Temp
----\Contacts
----\Desktop
----\Favorites
----\Save Games
----\Searches
----\StartUp
----\Templates
---\Home the below would only be defaults
----\Documents
----\Downloads
----\Music
----\Videos

As far as Windows variables go, the current set suck. I'd like to see better ones:
a user's (Marlin's) %USER% folder should be c:\Users\Marlin, while their %HOME% folder should be c:\Users\Marlin\home
%SYSDRIVE% = Windows' installation drive
%PROGRAMS% = %SYSDRIVE%:\Programs
%SYSROOT% = Windows installation folder, %WINDIR% is redundant for no reason
%APPDATA% = %PROGRAMS%\AppData, equivalent to old %APPDATA% & %LOCALAPPDATA% minus the allowance of storing whole damn apps here (Chrome, Firefox 🤬🤯)

4.2 - Contacts is a default Windows folder. I don't know why or what for exactly, but it's there. KDE Connect is correct in using it, but Windows created it.
