@pbhj
Last active October 16, 2023 21:51
Investigation into shader cache files under the AMD directory in Windows 10

%localappdata%\AMD\DxCache\ files

An investigation

This file is an unfinished, stream-of-consciousness record of what I did when investigating whether sharing the AMD shader caches between Windows 10 users would save anything. It struck me that the files might be identical across user accounts, as my understanding was that shader cache files depend on a combination of the software being run and the GPU. In the case of Steam, shader cache files are downloaded and then compiled locally; compilation appears to happen on the CPU. Reports say that, on the Steam Deck, Valve has chosen a unified shader cache folder shared across users. On Windows 10, the DirectX cache files amounted to ~5GB per user, across 4 users who play a closely matched selection of games, so I wondered whether I could save compilation time and/or disk space for any of those files.

Methodology

  1. First I emptied all the files I could from each of the obvious cache directories under %localappdata%\AMD\ for each of two users. [Some files can't be deleted because they are in use by the GPU; Win10 reports them as "memory mapped".]
  2. Then I logged in as each user, ran Steam, opened "Civ V" and started a single-player game without changing any settings.
  3. Then I used WinMerge to compare the two users' AppData\Local\AMD directories. There were no identical files.
  4. Then I chose same-named files at random and compared their contents.
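The directory comparison in step 3 can also be scripted. What follows is a minimal sketch, not the method actually used (that was WinMerge): it pairs files by relative path and compares SHA-256 hashes. The function names and the example paths in the trailing comment are hypothetical, and files the driver holds open may raise PermissionError when read on Windows.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large cache files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_cache_dirs(dir_a: Path, dir_b: Path) -> dict:
    """Classify files by relative path: only in A, only in B, identical, different."""
    files_a = {p.relative_to(dir_a): p for p in dir_a.rglob("*") if p.is_file()}
    files_b = {p.relative_to(dir_b): p for p in dir_b.rglob("*") if p.is_file()}
    shared = files_a.keys() & files_b.keys()
    identical = {n for n in shared if sha256_of(files_a[n]) == sha256_of(files_b[n])}
    return {
        "only_a": sorted(files_a.keys() - shared),
        "only_b": sorted(files_b.keys() - shared),
        "identical": sorted(identical),
        "different": sorted(shared - identical),
    }


# Hypothetical usage (paths are placeholders, adjust to the real profiles):
# result = compare_cache_dirs(Path(r"C:\Users\USER1\AppData\Local\AMD"),
#                             Path(r"C:\Users\USER2\AppData\Local\AMD"))
# print(len(result["identical"]), "identical,", len(result["different"]), "different")
```

Hashing only answers "identical or not"; it says nothing about how close two differing files are, which is what the byte-level comparison further down is for.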

Cache types

Sources tell me DX9Cache is for DirectX 9, DxCache for DirectX 10/11 and DxcCache for DirectX 12. The parent \AMD folder holds other cache files, and possibly other duplicates too.

Details of comparison

An example .parc file shared by users:

C:\Users\USER1\AppData\Local\AMD\DxCache\86946259a90b8ea3.9bb0bb19bef0be85.7b1f799a.dad1858.0.parc

File differences

This file differs between users only in that each copy has a different 6-byte sequence (the example below shows five bytes per user). The sequence may be a user-profile identifier (although it differs from file to file) or a time-based identifier (in which case files could still be shared between profiles):

USER1: c9 ae a3 95 e7
USER2: 97 c6 8b f5 e7
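To locate such differences without eyeballing a hex view, a small helper can report every run of differing bytes between two same-named files. This is a sketch under the assumption that both files fit in memory; `diff_runs` is a hypothetical name, not part of any tool used here.

```python
def diff_runs(a: bytes, b: bytes) -> list[tuple[int, int]]:
    """Return (offset, length) for each run of consecutive differing bytes.

    Bytes are compared position by position over the common length; if the
    lengths differ, the surplus bytes are reported as one extra final run.
    """
    runs = []
    start = None
    n = min(len(a), len(b))
    for i in range(n):
        if a[i] != b[i]:
            if start is None:
                start = i  # a new run of differences begins here
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        runs.append((start, n - start))
    if len(a) != len(b):
        runs.append((n, abs(len(a) - len(b))))
    return runs


# Hypothetical usage with two copies of the same cache file:
# runs = diff_runs(open(path_user1, "rb").read(), open(path_user2, "rb").read())
```

A reported run can be shorter than the field that actually differs: if two values happen to share a byte at the same position, that byte is not counted as a difference.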

File similarities

Each .parc file (for both users) starts and ends with this byte sequence:

23 d8 fa e7 0f 5f 47 be 8b d1 48 f5 d8 f0 b4 a7

The same sequence also appears within the file, where it is always preceded by:

d9 01

File packing format

My best guess is that the file is packed like this:

<header> <data-1> <user> <end-marker> <header> <data-2> <user> <end-marker> <header>

where an example disk file would be:

23 d8 fa e7 0f 5f 47 be 8b d1 48 f5 d8 f0 b4 a7
.. .. .. <data-1> .. .. .. 
c9 ae a3 95 e7 
d9 01 
23 d8 fa e7 0f 5f 47 be 8b d1 48 f5 d8 f0 b4 a7 
.. .. .. <data-2> .. .. .. 
c9 ae a3 95 e7 
d9 01 
23 d8 fa e7 0f 5f 47 be 8b d1 48 f5 d8 f0 b4 a7

Some same-named .parc files are nearly identical across users; they differ only in the 6-byte sequences immediately preceding the section end marker.

Other .parc files, of the same name, are similar but have large sections of difference interspersed with nearly-identical sections (as in the preceding paragraph).
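The guessed layout can be tested against real files with a small parser. The sketch below assumes the layout really is header, data, 6 identifier bytes, end marker, repeated, with one closing header; none of that is confirmed. `split_blocks` and its `id_len` parameter are hypothetical names, and the parser would be fooled if the byte pattern "d9 01" followed by the header ever occurred inside the data itself.

```python
HEADER = bytes.fromhex("23d8fae70f5f47be8bd148f5d8f0b4a7")
END_MARKER = bytes.fromhex("d901")


def split_blocks(blob: bytes, id_len: int = 6) -> list[dict]:
    """Split a .parc blob into blocks under the guessed layout.

    Each block is assumed to be: HEADER, data, id bytes, END_MARKER,
    with one extra HEADER closing the file.
    """
    if id_len < 1:
        raise ValueError("id_len must be at least 1")
    if not (blob.startswith(HEADER) and blob.endswith(HEADER)):
        raise ValueError("blob does not start and end with the header sequence")
    blocks = []
    sep = END_MARKER + HEADER  # what is assumed to terminate every block
    pos = len(HEADER)
    while pos < len(blob):
        end = blob.find(sep, pos)
        if end == -1:
            raise ValueError(f"no end marker + header found after offset {pos}")
        body = blob[pos:end]
        blocks.append({"data": body[:-id_len], "id": body[-id_len:]})
        pos = end + len(sep)
    return blocks
```

If the guess is right, running this on the same-named file from two profiles should give equal "data" values and unequal "id" values for the nearly identical files; a ValueError or unequal data would count against the guess.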

Smallest files

The smallest files have 84 bytes (as reported by Windows 10). They appear to be sparse files, placeholders created to store GPU data should it be needed.

An example is C:\Users\USER1\AppData\Local\AMD\DxCache\af2a1ac9a5884689.9bb0bb19bef0be85.7b1f799a.dad1858.2.parc. As in the "File packing format" section, the file starts with the <header> sequence above. The "<user>" section differs from the one in the previous file (suggesting it may not be a user identifier); in this file it is as follows:

<user-1>: 18 5a cb 8c e7 e1
<user-2>: 4a e7 a5 7f f5 e7

The file ends with "d9 01", which above I called <end-marker>, and then the <header> sequence.

Thus, the data part of the files is:

01 00 00 00 02 00 00 00 34 00 00 00 9a 79 1f 7b 5b 23 fd 70 ec ce 88 63 17 70 96 f0 d6 60 63 a5 7a 64 dd 46 4f 54 52 00 00 00 00 

To reiterate, this "data part" is identical across both users' files.

Note that across files, the data section, but only in the first block, starts with what I'll call <data-header>:

01 00 00 00 02 00 00 00 34 00 00 00 9a 79 1f 7b
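One way to look at these 16 bytes is as four little-endian 32-bit unsigned integers. That reading is purely an assumption (nothing here documents the field layout), and `decode_data_header` is a hypothetical helper.

```python
import struct

DATA_HEADER = bytes.fromhex("01 00 00 00 02 00 00 00 34 00 00 00 9a 79 1f 7b")


def decode_data_header(raw: bytes) -> tuple[int, int, int, int]:
    """Interpret 16 bytes as four little-endian uint32 values (an assumption)."""
    return struct.unpack("<4I", raw[:16])


fields = decode_data_header(DATA_HEADER)
print([hex(f) for f in fields])  # ['0x1', '0x2', '0x34', '0x7b1f799a']
```

Read this way, the fourth value is 0x7b1f799a, which is the same hex string as the third dot-separated component of the DxCache file names quoted in this document (for example ...9bb0bb19bef0be85.7b1f799a.dad1858.0.parc). That may be a coincidence, but it hints that part of the file name is also stored inside the file.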

Revised file format

My [new] best guess is that the file is packed like this:

<header> <data-header> <data-1> <user> <end-marker> <header> <data-2> <user> <end-marker> <header>

where an example disk file would be:

23 d8 fa e7 0f 5f 47 be 8b d1 48 f5 d8 f0 b4 a7 
01 00 00 00 02 00 00 00 34 00 00 00 9a 79 1f 7b
.. .. .. <data-1> .. .. ..
c9 ae a3 95 e7
d9 01
23 d8 fa e7 0f 5f 47 be 8b d1 48 f5 d8 f0 b4 a7
.. .. .. <data-2> .. .. ..
c9 ae a3 95 e7 
d9 01 
23 d8 fa e7 0f 5f 47 be 8b d1 48 f5 d8 f0 b4 a7

What next?

From this point it strikes me there are a few things to try:

  1. Can files that are nearly identical be shared across users? Will the GPU driver use those files, or will they just be invalidated and re-created?
  2. If files can't be shared as they are, could they be shared after a minor change, such as replacing one user's <user> byte sequence with the other user's?
  3. Look at file naming. The files have the same names across Windows user profiles, so the code that names these files might offer information on shareability. Maybe the 6-byte "<user>" sequences are arbitrary choices, for example; this might be documented.
  4. Look at how the cache files are created as a series: the names end .0.parc, .1.parc, .2.parc, .3.parc, .4.parc with identical prefixes; for example a DxcCache filename might be f02ab639eb2d1d03.73d4f86af224526.7170d1f5.6b8ae907.0.parc (see the Appendix for filename examples).

If either kind of sharing is possible, can it be used to save processing time and/or disk space?
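The byte replacement in item 2 could be tried with something like the sketch below. It assumes the identifier always sits directly before the end marker and the next header, as in the guessed layout, and it is unknown whether the driver validates cache files in some other way (a checksum, say), so a patched file may simply be discarded. `swap_id_bytes` is a hypothetical name; the result should be written to a copy, never over the original.

```python
HEADER = bytes.fromhex("23d8fae70f5f47be8bd148f5d8f0b4a7")
END_MARKER = bytes.fromhex("d901")


def swap_id_bytes(blob: bytes, old_id: bytes, new_id: bytes) -> tuple[bytes, int]:
    """Replace old_id with new_id wherever it directly precedes END_MARKER + HEADER.

    Returns the patched bytes and the number of replacements made.
    """
    if len(old_id) != len(new_id):
        raise ValueError("identifiers must have the same length")
    needle = old_id + END_MARKER + HEADER
    count = blob.count(needle)
    patched = blob.replace(needle, new_id + END_MARKER + HEADER)
    return patched, count
```

Anchoring the search on the end marker and header keeps the replacement from touching the same bytes if they happen to occur inside a data section. Because the identifier differs from file to file, `old_id` and `new_id` would have to be read from each pair of files rather than hard-coded.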

Other cache files

OglCache

This folder includes .parc files too; they had the same names across users, but their contents were not similar.

A sequence from the file was:

<bh:11>Generated by the Advanced Micro Devices, Inc., Proprietary GPU Shader Compiler.<bh:00><bh:00>_amdgpu_cs_main<bh:10><bh:00><bh:ef>shdr_intrl_tbl
<bh:03><bh:06><bh:00><bh:f5><bh:02>&<bh:12><bh:00><bh:ee><bh:06><bh:13><bh:f0><bh:06><bh:03><bh:11><bh:11><bh:04><bh:00><bh:15><bh:03><bh:18><bh:00><bh:13><bh:10><bh:08><bh:00><bh:00><bh:10><bh:03><bh:f3><bh:15><bh:f3><bh:03><bh:00><bh:00> <bh:00><bh:00><bh:00>AMDGPU<bh:00><bh:00><bh:82><bh:ae>amdpal.version<bh:92><bh:03><bh:00><bh:b0><bh:12><bh:00><bh:f0>8pipelines<bh:91><bh:8a><bh:a5>.type<bh:a2>Cs<bh:ae>.resource_hash<bh:cf><bh:dc><bh:04>|<bh:18><bh:fc><bh:b9><bh:da><bh:0e><bh:b0>.spill_threshold<bh:cd><bh:ff><bh:ff><bh:b0>.user_]<bh:03><bh:a1>_limit<bh:01><bh:a8>.s<bh:07><bh:01>0s<bh:81><bh:a8>n<bh:03><bh:b2>pute<bh:83><bh:b0>.api_<bh:18><bh:00><bh:01>T<bh:00><bh:f3><bh:1c><bh:92><bh:cf><bh:a8><bh:9e><bh:c9><bh:bc>&<bh:e2>J<bh:8f><bh:cf><bh:ce><bh:ab><bh:d4>m<bh:8e><bh:9c><bh:0b>_<bh:b1>.hardware_mapping<bh:91><bh:a3>.cs<bh:af>O<bh:00>@_sub<bh:a0><bh:00><bh:96><bh:a7>Unknown<bh:b0>/<bh:00>pstages<bh:81>.<bh:00><bh:fb><bh:02><bh:de><bh:00><bh:1b><bh:ac>.entry_point<bh:af><bh:81><bh:01><bh:f5><bh:01><bh:ab>.sgpr_count<bh:18><bh:ab>.v
<bh:00><bh:f4>
<bh:05><bh:a9>.lds_size<bh:00><bh:b4>.scratch_memory<bh:16><bh:00><bh:80>frontendq<bh:00>"ck<bh:16><bh:00>k<bh:b3>.back<bh:15><bh:00>b<bh:b6>.perf<bh:15><bh:01>bbuffer<bh:18><bh:00><bh:10><bh:aa>-<bh:01><bh:d4>s_uavs<bh:c3><bh:ac>.writ<bh:0e><bh:00><bh:12><bh:b4><bh:1a><bh:00>0appK<bh:00><bh:e1>consume<bh:c2><bh:af>.wavew<bh:00><bh:01>A<bh:00>' <bh:b2>n<bh:01>0reg)<bh:01>@<bh:dc><bh:00> <bh:ce>a<bh:09><bh:01><bh:05><bh:00><bh:16><bh:00><bh:0b><bh:00><bh:0f><bh:05><bh:00>t <bh:b7>..<bh:02><bh:d0>adgroup_dimen<bh:82><bh:02><bh:f1><bh:1a>s<bh:93><bh:08><bh:08><bh:01><bh:ab>.float_mode<bh:cc><bh:c0><bh:ae>.fp16_overflow<bh:c2><bh:aa>.ieee<bh:1d><bh:00>b<bh:c2><bh:a9>.wgp<bh:0b><bh:00><bh:f2><bh:13><bh:ac>.mem_ordered<bh:c2><bh:b1>.forward_progress<bh:c2><bh:ab>$<bh:01><bh:00><bh:e8><bh:01><bh:d5>s<bh:03><bh:a8>.excp_en<bh:00><bh:ab><bh:d5><bh:01><bh:f0><bh:04>en<bh:c2><bh:ad>.trap_present<bh:c2><bh:b0>d<bh:02>Bred_<bh:10><bh:02>Ant<bh:00><bh:ad>z<bh:01>0s_pJ<bh:00><bh:f0><bh:10>e<bh:00><bh:af>.checksum_value<bh:ce><bh:ce>K\\<bh:01><bh:aa>.regist<bh:eb><bh:02><bh:f4><bh:00><bh:cd>.B<bh:00><bh:b7>.internal_V<bh:03><bh:03><bh:ec><bh:02><bh:f4><bh:03><bh:8a><bh:ff><bh:eb>SA<bh:f4><bh:be>%<bh:cf>?{U<bh:e4><bh:01>QO<bh:a1><bh:b2><bh:1a><bh:03><bh:15>_C<bh:00><bh:90><bh:85><bh:af>.tidig_<bh:1a><bh:00><bh:00><bh:84><bh:00><bh:d3><bh:01><bh:aa>.tgid_x_en<bh:c3><bh:0c><bh:00><bh:17>y<bh:0c><bh:00><bh:10>z<bh:0c><bh:00>A<bh:ab>.tg<bh:1b><bh:02><bh:00><bh:d6><bh:00><bh:10><bh:a4>f<bh:03><bh:7f><bh:a6>OpenGL

The strings in here (which I would have extracted with the strings utility if I'd been doing this on Kubuntu Linux) give us a few clues; looking up "amdpal" leads to this informative page: https://llvm.org/docs/AMDGPUUsage.html#amdpal.
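Where the strings utility isn't to hand, a few lines of Python do roughly the same job. This is a sketch that only looks for runs of printable ASCII; `ascii_strings` is a hypothetical helper and the minimum length of 4 is an arbitrary default.

```python
import re


def ascii_strings(blob: bytes, min_len: int = 4) -> list[str]:
    """Return runs of at least min_len printable ASCII characters found in blob."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(blob)]


# Hypothetical usage on an OglCache file:
# for s in ascii_strings(open(path_to_parc, "rb").read()):
#     print(s)
```

On the dump above this would pull out fragments such as "_amdgpu_cs_main" and "amdpal.version". Many of the metadata keys appear truncated in the dump, so expect partial words as well.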

Aside on LLVM

LLVM originally stood for Low Level Virtual Machine, but it is now a set of compiler and toolchain technologies used for many different instruction set architectures; among other things, it optimises compiled code.

Wikipedia says:

"Graphics code within the OpenGL stack can be left in intermediate representation and then compiled when run on the target machine. On systems with high-end graphics processing units (GPUs), the resulting code remains quite thin, passing the instructions on to the GPU with minimal changes. On systems with low-end GPUs, LLVM will compile optional procedures that run on the local central processing unit (CPU) that emulate instructions that the GPU cannot run internally. LLVM improved performance on low-end machines using Intel GMA chipsets. A similar system was developed under the Gallium3D LLVMpipe, and incorporated into the GNOME shell to allow it to run without a proper 3D hardware driver loaded." (https://en.wikipedia.org/wiki/LLVM)

Specific game example

Counter Strike 2 (CS2)

CS2 from Valve reportedly uses DX11 and Vulkan as its graphics APIs. So I'm going to:

  1. Play CS2 on two different user accounts, then look for similar names and file contents in DxCache\ and VkCache\.
  2. Delete the cache files CS2 created, play CS2 and record the new cache files (e.g. copy the folder to an archive); then delete the cache files again, play again, and compare the newly created files with the prior copies.

Appendix

Cache file naming

As part of the precursor to this investigation I made a quick PowerShell one-liner to record directory listings in a dated file:

$stamp = (Get-Date).ToString("yyyyMMdd-HHmmss"); cd "$env:LOCALAPPDATA\AMD"; gci -Recurse | % { "{0};{1};{2};{3}" -f $_.FullName, $_.CreationTime, $_.LastWriteTime, $_.Attributes } | Out-File "AMD-listing-$stamp.txt"

Example output, which includes filenames, taken from that file, for some of the AMD cache folders:

C:\Users\USER1\AppData\Local\AMD\DX9Cache\30cd7d59.bin;03/08/2022 22:15:56;08/09/2023 01:01:14;Archive
C:\Users\USER1\AppData\Local\AMD\DX9Cache\5c12ee40.bin;16/10/2023 18:37:01;16/10/2023 18:37:01;Archive
C:\Users\USER1\AppData\Local\AMD\DX9Cache\cbe92ddc.bin;16/10/2023 16:05:41;16/10/2023 16:35:09;Archive
[...]
C:\Users\USER1\AppData\Local\AMD\DxCache\1fa88efdfcc8ed74.9bb0bb19bef0be85.7b1f799a.dad1858.0.parc;16/10/2023 18:00:11;16/10/2023 18:31:15;Archive
C:\Users\USER1\AppData\Local\AMD\DxCache\20b3f865ffb81ec1.1449ada0e32d5eaa.7b1f799a.dad1858.0.parc;16/10/2023 18:36:59;16/10/2023 18:38:47;Archive
C:\Users\USER1\AppData\Local\AMD\DxCache\696a51c21c785da5.9bb0bb19bef0be85.7b1f799a.dad1858.0.parc;16/10/2023 16:03:14;16/10/2023 16:03:14;Archive
[...]
C:\Users\USER1\AppData\Local\AMD\DxcCache\86946259a90b8ea3.73d4f86af224526.7170d1f5.6b8ae907.0.parc;16/10/2023 17:09:59;16/10/2023 17:09:59;Archive
C:\Users\USER1\AppData\Local\AMD\DxcCache\86946259a90b8ea3.73d4f86af224526.7963d102.bb2d92cd.0.parc;16/10/2023 17:09:59;16/10/2023 17:09:59;Archive
C:\Users\USER1\AppData\Local\AMD\DxcCache\86946259a90b8ea3.73d4f86af224526.f5cd4141.87672893.0.parc;16/10/2023 17:09:59;16/10/2023 17:09:59;Archive
C:\Users\USER1\AppData\Local\AMD\DxcCache\f02ab639eb2d1d03.73d4f86af224526.7170d1f5.6b8ae907.0.parc;16/10/2023 17:50:09;16/10/2023 17:50:09;Archive
C:\Users\USER1\AppData\Local\AMD\DxcCache\f02ab639eb2d1d03.73d4f86af224526.7963d102.bb2d92cd.0.parc;16/10/2023 17:50:09;16/10/2023 17:50:09;Archive
C:\Users\USER1\AppData\Local\AMD\DxcCache\f02ab639eb2d1d03.73d4f86af224526.f5cd4141.87672893.0.parc;16/10/2023 17:50:09;16/10/2023 17:50:09;Archive
C:\Users\USER1\AppData\Local\AMD\OglCache\3A4E7E5B258632E20EC74A8D7D0A5.parc;16/10/2023 17:53:51;16/10/2023 17:53:51;Archive
C:\Users\USER1\AppData\Local\AMD\OglCache\4F7CA528A25339022E20EC74A8D7D0A5.parc;16/10/2023 16:03:10;16/10/2023 16:03:10;Archive
C:\Users\USER1\AppData\Local\AMD\OglCache\6A28F8E34684AD62E20EC74A8D7D0A5.parc;16/10/2023 16:03:11;16/10/2023 16:03:11;Archive
[...]
C:\Users\USER1\AppData\Local\AMD\VkCache\6A28F8E34684AD65B78F1EB3BE2D2FD.parc;16/10/2023 17:14:59;16/10/2023 17:14:59;Archive
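To explore the naming scheme, the listing lines can be parsed and grouped by any dot-separated component of the file name. This is a sketch that assumes the semicolon-separated format shown above; `parse_listing_line` and `group_by_component` are hypothetical names, and what each component means is not known.

```python
from collections import defaultdict
from pathlib import PureWindowsPath


def parse_listing_line(line: str) -> dict:
    """Split one 'path;created;written;attributes' line into its fields."""
    path, created, written, attrs = line.strip().rsplit(";", 3)
    p = PureWindowsPath(path)
    return {
        "folder": p.parent.name,      # e.g. DxCache, DxcCache, OglCache
        "name": p.name,
        "parts": p.name.split("."),   # dot-separated name components
        "created": created,           # kept as text; the format is locale-dependent
        "written": written,
        "attributes": attrs,
    }


def group_by_component(lines: list[str], index: int) -> dict[str, list[str]]:
    """Group file names by the value of their index-th dot-separated component."""
    groups = defaultdict(list)
    for line in lines:
        rec = parse_listing_line(line)
        if len(rec["parts"]) > index:
            groups[rec["parts"][index]].append(rec["name"])
    return dict(groups)
```

Grouping the DxcCache lines above by component 1, for instance, puts every file under 73d4f86af224526, while component 0 separates them into the 86946259a90b8ea3 and f02ab639eb2d1d03 sets. Running the same grouping on listings from two profiles would show which components, if any, vary per user.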