@alfredopalhares
Last active February 15, 2024 12:15
Restoring a PDF file from chrome cache

So I was trying to do my duty as a good citizen by paying taxes on stuff I already own.

In this particular case, it meant going to the website and trying to generate a PDF document that would allow me to pay for said stuff. After 2 unsuccessful attempts to generate the PDF (I was greeted by 500s), I finally got the PDF open in Chromium (using the PDF preview), and while I was trying to save it, the browser crashed.

I then quickly restarted Chromium and tried to generate the PDF again, only to find out that I couldn't: the website only allows you to generate and download the document once. This is a critical document, used both as proof and as the means to pay your taxes. It doesn't make much sense that you can only download it once, but such is life.

I've been suffering from this crash bug for quite a while, but it had never affected me in this way.

The file wasn't in ~/Downloads, so I tried browsing chrome://cache, and after a while I found the entry for the PDF HTTP request. Here is a small excerpt:

/full/path/of/the/request/
HTTP/1.1 200 OK
Date: Thu, 30 Apr 2015 14:40:44 GMT
Cache-Control: must-revalidate, post-check=0, pre-check=0
Pragma: public
Content-Length: 52831
Expires: 0
Content-Type: application/pdf
Age: 0
Via: AX-CACHE-2.7:243
00000000: d4 13 00 00 03 07 05 00 e4 22 2e 5d 88 73 2e 00  .........".].s..
00000010: 18 86 38 5d 88 73 2e 00 da 00 00 00 48 54 54 50  ..8].s......HTTP
00000020: 2f 31 2e 31 20 32 30 30 20 4f 4b 00 44 61 74 65  /1.1 200 OK.Date
00000030: 3a 20 54 68 75 2c 20 33 30 20 41 70 72 20 32 30  : Thu, 30 Apr 20
00000040: 31 35 20 31 34 3a 34 30 3a 34 34 20 47 4d 54 00  15 14:40:44 GMT.
00000050: 43 61 63 68 65 2d 43 6f 6e 74 72 6f 6c 3a 20 6d  Cache-Control: m
00000060: 75 73 74 2d 72 65 76 61 6c 69 64 61 74 65 2c 20  ust-revalidate, 
00000070: 70 6f 73 74 2d 63 68 65 63 6b 3d 30 2c 20 70 72  post-check=0, pr
00000080: 65 2d 63 68 65 63 6b 3d 30 00 50 72 61 67 6d 61  e-check=0.Pragma
00000090: 3a 20 70 75 62 6c 69 63 00 43 6f 6e 74 65 6e 74  : public.Content
000000a0: 2d 4c 65 6e 67 74 68 3a 20 35 32 38 33 31 00 45  -Length: 52831.E
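
In the excerpt above, the readable headers sit after a short binary prefix and are separated by NUL bytes (the `00` between `OK` and `Date`). A minimal Python sketch for pulling them out of such a blob follows. It relies only on what this excerpt shows, not on any documented Chromium cache format, and the function name `parse_cached_headers` is made up for illustration.

```python
def parse_cached_headers(raw):
    """Split a NUL-separated header blob into a status line and a header dict.

    Assumption: the blob looks like the excerpt above, i.e. some binary
    prefix, then "HTTP/..." followed by NUL-terminated "Name: value" fields.
    """
    start = raw.find(b"HTTP/")
    if start == -1:
        raise ValueError("no HTTP status line found")
    # Drop empty fields produced by consecutive NUL bytes.
    fields = [f for f in raw[start:].split(b"\x00") if f]
    status = fields[0].decode("latin-1")
    headers = {}
    for field in fields[1:]:
        name, sep, value = field.partition(b":")
        if not sep:
            break  # stop at the first field that is not "Name: value"
        headers[name.decode("latin-1").strip().lower()] = value.decode("latin-1").strip()
    return status, headers


if __name__ == "__main__":
    # Shortened stand-in for the header dump above (binary prefix + headers).
    sample = (b"\xd4\x13\x00\x00"
              b"HTTP/1.1 200 OK\x00"
              b"Content-Length: 52831\x00"
              b"Content-Type: application/pdf\x00\x00")
    status, headers = parse_cached_headers(sample)
    print(status)
    print(headers["content-type"], headers["content-length"])
```

The `Content-Type` and `Content-Length` values are the useful part here: they identify the entry as the PDF and tell you how many bytes the body should have.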

But this is the full hex dump of the cache entry, HTTP response headers included, not only the PDF file. To understand what was what, I opened a few PDF files with vim and noticed they all start with %PDF-<version> and end with %%EOF. So with vim I isolated the section of the hex dump between those two markers:

00000000: 25 50 44 46 2d 31 2e 34 0a 25 e2 e3 cf d3 0a 32  %PDF-1.4.%.....2
00000010: 20 30 20 6f 62 6a 20 3c 3c 2f 4c 65 6e 67 74 68   0 obj <</Length
00000020: 20 35 32 2f 46 69 6c 74 65 72 2f 46 6c 61 74 65   52/Filter/Flate
00000030: 44 65 63 6f 64 65 3e 3e 73 74 72 65 61 6d 0a 78  Decode>>stream.x
00000040: 9c 2b e4 72 0a e1 32 36 53 b0 30 30 d5 b3 34 57  .+.r..26S.00..4W
00000050: 08 49 e1 72 0d e1 0a e4 2a 54 30 54 30 00 42 08  .I.r....*T0T0.B.
00000060: 99 9c ab a0 1f 91 66 a8 e0 92 af 10 c8 05 00 08  ......f.........

All the way to:

0000cdf0: 31 66 33 31 65 63 31 61 39 65 65 32 38 62 39 31  1f31ec1a9ee28b91
0000ce00: 36 33 64 32 35 33 63 35 31 66 66 36 33 35 3e 3c  63d253c51ff635><
0000ce10: 37 63 32 30 62 36 33 65 63 30 35 37 62 62 65 35  7c20b63ec057bbe5
0000ce20: 30 61 66 30 66 30 62 38 33 35 34 31 35 37 30 31  0af0f0b835415701
0000ce30: 3e 5d 2f 49 6e 66 6f 20 33 39 20 30 20 52 2f 53  >]/Info 39 0 R/S
0000ce40: 69 7a 65 20 34 30 3e 3e 0a 73 74 61 72 74 78 72  ize 40>>.startxr
0000ce50: 65 66 0a 35 31 38 38 30 0a 25 25 45 4f 46 0a     ef.51880.%%EOF.
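
The isolation step was done by hand in vim. If the raw bytes of the cache entry are available instead of a textual dump, the same idea can be sketched in Python: keep everything from the first `%PDF-` through the last `%%EOF`. This is only an illustration of the marker-based approach, `carve_pdf` is a hypothetical name, and it assumes the body is stored uncompressed and unencrypted.

```python
def carve_pdf(blob):
    """Return the bytes from the first %PDF- header through the last %%EOF."""
    start = blob.find(b"%PDF-")
    if start == -1:
        raise ValueError("no %PDF- header found")
    end = blob.rfind(b"%%EOF")
    if end == -1 or end < start:
        raise ValueError("no %%EOF marker found after the header")
    end += len(b"%%EOF")
    # Keep one trailing end-of-line sequence if present, as in the dump above.
    if blob[end:end + 2] == b"\r\n":
        end += 2
    elif blob[end:end + 1] in (b"\n", b"\r"):
        end += 1
    return blob[start:end]


if __name__ == "__main__":
    # Hypothetical entry: header bytes, a tiny fake PDF body, trailing junk.
    entry = b"HTTP/1.1 200 OK\x00Age: 0\x00" + b"%PDF-1.4\n...\n%%EOF\n" + b"\x00\x00"
    print(carve_pdf(entry))
```

Searching for the last `%%EOF` rather than the first matters because a PDF that was updated incrementally can contain more than one such marker.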

To restore the file I used xxd:

xxd -r pdf.hex test.pdf
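
If xxd is not at hand, the reverse conversion can be approximated in Python. The sketch below assumes lines shaped like the dumps above (hex offset, colon, space-separated byte pairs, two spaces, then the ASCII column); `hexdump_to_bytes` is a hypothetical helper, not part of xxd.

```python
def hexdump_to_bytes(dump):
    """Rebuild binary data from xxd-style lines such as
    '00000000: 25 50 44 46 ...  %PDF'.

    Assumption: the hex column is separated from the ASCII column by at
    least two spaces, as in the dumps above.
    """
    out = bytearray()
    for line in dump.splitlines():
        offset_text, sep, rest = line.partition(":")
        if not sep:
            continue  # blank line or no offset prefix
        try:
            offset = int(offset_text.strip(), 16)
        except ValueError:
            continue  # not a dump line
        # Everything before the first double space is the hex column.
        hex_part = rest.lstrip().partition("  ")[0]
        chunk = bytes.fromhex(hex_part)
        end = offset + len(chunk)
        if len(out) < end:
            out.extend(b"\x00" * (end - len(out)))  # zero-fill any gap
        out[offset:end] = chunk  # place the bytes at their stated offset
    return bytes(out)


if __name__ == "__main__":
    first_line = "00000000: 25 50 44 46 2d 31 2e 34 0a 25 e2 e3 cf d3 0a 32  %PDF-1.4.%.....2"
    print(hexdump_to_bytes(first_line))
```

To process a whole dump, read the text of pdf.hex, pass it to the function, and write the result to test.pdf in binary mode.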

Then tested with:

$ file test.pdf 
test.pdf: PDF document, version 1.4
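
A size check is a useful complement to `file`. The last line of the isolated dump starts at offset 0xce50 (52816) and holds 15 bytes, so the recovered file should be 52816 + 15 = 52831 bytes, which matches the Content-Length header in the cached response. A small Python sketch of that check, with `check_recovered_pdf` as a made-up name:

```python
def check_recovered_pdf(data, expected_length):
    """Return a list of problems found in the recovered bytes (empty if none)."""
    problems = []
    if not data.startswith(b"%PDF-"):
        problems.append("missing %PDF- header")
    if not data.rstrip(b"\r\n").endswith(b"%%EOF"):
        problems.append("missing %%EOF trailer")
    if len(data) != expected_length:
        problems.append("size %d != Content-Length %d" % (len(data), expected_length))
    return problems


if __name__ == "__main__":
    # Size implied by the dump: offset of the last line plus its 15 bytes.
    print(0xce50 + 15)  # 52831
    # Tiny stand-in for the real file, which is not reproduced here.
    print(check_recovered_pdf(b"%PDF-1.4\n%%EOF\n", 15))
```

For the real file, pass the contents of test.pdf and 52831 as the expected length. These checks only confirm the markers and the size, not that every object inside the PDF is intact.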

It all seems OK now: I can open the file with any PDF viewer and move on with life.

@maximRnback

THANK YOU

@nikhilgeo

Thanks, was looking for the same

@dexhunter

What if the content is encrypted? Sorry but I didn't find %PDF
