@brl
Last active August 12, 2018 21:21

The HRC_pass..zip documents

On June 20, 2016, the Guccifer 2.0 website announced that documents would be published the following day at 10 a.m. (ET):

I’d like to announce the next piece of docs from DNC.

I found something like a dossier on Hillary Clinton on the its server. It’s a heavy folder of docs that will attract your attention. You’ll like it.

Expect it. I’ll publish them on June 21 at 10 a.m (ET).

On June 21, 2016, a zip archive of files was published on the Guccifer 2.0 WordPress site:

This’s time to keep my word and here’re the docs I promised you.

It’s not a report in one file, it’s a big folder of docs devoted to Hillary Clinton that I found on the DNC server.

The DNC collected all info about the attacks on Hillary Clinton and prepared the ways of her defense, memos, etc., including the most sensitive issues like email hacks.

Links were provided to five different file hosting sites, but at present only the Mediafire link is still active:

http://www.mediafire.com/?79a6zy27q9ung

The Mediafire page for this archive contains an interesting fact:

This file was uploaded from France on June 21, 2016 at 4:51 AM

The file uploaded to Mediafire is called HRC_pass..zip (two dots). It is a zip archive encrypted with the password #GucCi2/0; when extracted, it contains 261 documents in a single directory.

Timestamp Metadata

NTFS Timestamps

The Zip file format allows each entry to carry optional metadata in extra (or "extensible data") fields. A field is either one of the types described in the official specification (see section 4.5.2) or a type defined by the vendor of a particular implementation.

The HRC_pass..zip archive contains one such field, the NTFS extra field (ID 0x000A), which records the NTFS FILETIME timestamps (modification, creation and access) of each archived file.
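To make the FILETIME values easier to read, here is a minimal sketch that splits one into Unix seconds and the leftover 100ns ticks. It assumes bash with 64-bit arithmetic; the function name `filetime_split` is my own and not part of any tool. A FILETIME counts 100ns ticks since 1601-01-01 00:00:00 UTC, which is 11644473600 seconds before the Unix epoch.

```bash
#!/bin/bash
# Split a hex NTFS FILETIME into "unix_seconds.ticks".
# filetime_split is a hypothetical helper name used for illustration.
filetime_split() {
    local ticks=$(( 16#$1 ))                          # 100ns ticks since 1601-01-01 UTC
    local secs=$(( ticks / 10000000 - 11644473600 ))  # shift to the Unix epoch
    # The remainder is printed as seven digits, i.e. units of 100ns.
    printf '%d.%07d\n' "$secs" $(( ticks % 10000000 ))
}

# Mtime of the 'HRC/' directory entry from the zipdetails output
filetime_split 01D1CB34305B8923   # prints 1466455318.0498211
```

The seconds part corresponds to Mon Jun 20 20:41:58 2016 UTC. Note that the fractional part is 0.0498211 s: zipdetails prints it as an unpadded nanosecond count (49821100ns).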

Which Zip utilities add these timestamps?

Windows Explorer does not create these records and does not use them when unpacking a zip file.

WinZip can optionally add and interpret these records if an advanced configuration option is enabled. I don't know whether that option is enabled by default.

I examined the source code for 7-Zip and I believe it always adds these records. 7-Zip also writes a fixed "version made by" value of 63 (0x3F, displayed as 6.3), which matches the value in this archive:

const Byte kMadeByProgramVersion = 63;

4E641DB Created Zip Spec      3F '6.3'
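The link between the decimal 63 in the 7-Zip source and the '6.3' shown by zipdetails is the Zip convention that the version byte divided by 10 is the major version and the remainder is the minor version. A small sketch of that decoding, with a helper name (`zip_made_by_version`) that is my own:

```bash
#!/bin/bash
# Decode the low byte of the Zip "version made by" field, given as hex.
# zip_made_by_version is a hypothetical helper name used for illustration.
zip_made_by_version() {
    local value=$(( 16#$1 ))
    echo "$(( value / 10 )).$(( value % 10 ))"   # major.minor
}

zip_made_by_version 3F   # 0x3F = 63, prints 6.3
```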

Examining timestamps and determining timezone

The zipdetails utility can be used to display the extra metadata information in a Zip archive.

First we'll set the timezone environment variable to UTC. The output is very long and has been trimmed to contain only the relevant parts.

$ TZ=UTC zipdetails HRC_pass..zip

4E641D7 CENTRAL HEADER #1     02014B50

4E641E3 Last Mod Time         48D47D40 'Mon Jun 20 15:42:00 2016'
4E64205 Filename              'HRC/'
4E64209 Extra ID #0001        000A 'NTFS FileTimes'
4E64215   Mtime               01D1CB34305B8923 'Mon Jun 20 20:41:58 2016 49821100ns'
4E6421D   Ctime               01D1CB34305B8923 'Mon Jun 20 20:41:58 2016 49821100ns'
4E64225   Atime               01D1CB2650C7A00F 'Mon Jun 20 19:02:39 2016 491073500ns'

Two representations of the same modification time:

15:42:00              <-- Local MS-DOS DateTime representation
20:41:58.0498211      <-- UTC high precision NTFS FILETIME representation

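The MS-DOS value is stored in local time and the FILETIME value is stored in UTC. Apart from the MS-DOS value being rounded up to the next two-second boundary, the two differ by exactly five hours, so the local timezone on the machine that created the archive was UTC-5.

Each archived file has a modification date of April 26, 2016, and its local MS-DOS value is likewise five hours behind the UTC FILETIME value.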

4E6422D CENTRAL HEADER #2     02014B50
4E64239 Last Mod Time         489A6211 'Tue Apr 26 12:16:34 2016'
4E6425B Filename              'HRC/04.29.15 CGEP.docx'
4E64271 Extra ID #0001        000A 'NTFS FileTimes'
4E6427D   Mtime               01D19FDF61EE8500 'Tue Apr 26 17:16:34 2016 0ns'
4E64285   Ctime               01D1CB2650C8639E 'Mon Jun 20 19:02:39 2016 496079800ns'
4E6428D   Atime               01D1CB2650C8639E 'Mon Jun 20 19:02:39 2016 496079800ns'
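The offset can also be worked out mechanically from the raw hex values shown above. The following is a rough sketch rather than the method used for this analysis: the function names are my own, it assumes bash with 64-bit arithmetic, and it assumes a whole-hour offset between -12 and +12 hours. It compares only the time of day, and it rounds to the nearest hour because the MS-DOS value can run up to two seconds ahead.

```bash
#!/bin/bash
# All function names here are hypothetical helpers for illustration.

# Seconds since local midnight stored in an MS-DOS DateTime (hex).
# Low 16 bits: 5 bits hour, 6 bits minute, 5 bits seconds/2.
dos_time_of_day() {
    local t=$(( 16#$1 & 0xFFFF ))
    echo $(( (t >> 11) * 3600 + ((t >> 5) & 63) * 60 + (t & 31) * 2 ))
}

# Seconds since UTC midnight for an NTFS FILETIME (hex).
filetime_time_of_day() {
    local ticks=$(( 16#$1 ))
    echo $(( (ticks / 10000000 - 11644473600) % 86400 ))
}

# Whole-hour UTC offset implied by one (DateTime, FILETIME) pair.
utc_offset_hours() {
    local diff=$(( $(dos_time_of_day "$1") - $(filetime_time_of_day "$2") ))
    # Fold into -12h..+12h in case the two values straddle midnight.
    if (( diff > 43200 )); then diff=$(( diff - 86400 )); fi
    if (( diff < -43200 )); then diff=$(( diff + 86400 )); fi
    # Round to the nearest hour (bash division truncates toward zero).
    if (( diff < 0 )); then
        echo $(( (diff - 1800) / 3600 ))
    else
        echo $(( (diff + 1800) / 3600 ))
    fi
}

utc_offset_hours 48D47D40 01D1CB34305B8923   # 'HRC/' directory: prints -5
utc_offset_hours 489A6211 01D19FDF61EE8500   # first file: prints -5
```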

So now we'll set the TZ variable to the correct value to reconcile the timestamps. (Yes, this is the correct TZ value for UTC-5: POSIX TZ strings invert the sign, so UTC+5 means five hours behind UTC.)

$ TZ=UTC+5 zipdetails HRC_pass..zip

4E641D7 CENTRAL HEADER #1     02014B50
4E641E3 Last Mod Time         48D47D40 'Mon Jun 20 15:42:00 2016'
4E64205 Filename              'HRC/'
4E64215   Mtime               01D1CB34305B8923 'Mon Jun 20 15:41:58 2016 49821100ns'
4E6421D   Ctime               01D1CB34305B8923 'Mon Jun 20 15:41:58 2016 49821100ns'
4E64225   Atime               01D1CB2650C7A00F 'Mon Jun 20 14:02:39 2016 491073500ns'

First file in archive:

4E6422D CENTRAL HEADER #2     02014B50
4E64239 Last Mod Time         489A6211 'Tue Apr 26 12:16:34 2016'
4E6425B Filename              'HRC/04.29.15 CGEP.docx'
4E6427D   Mtime               01D19FDF61EE8500 'Tue Apr 26 12:16:34 2016 0ns'
4E64285   Ctime               01D1CB2650C8639E 'Mon Jun 20 14:02:39 2016 496079800ns'
4E6428D   Atime               01D1CB2650C8639E 'Mon Jun 20 14:02:39 2016 496079800ns'

Last file in archive:

4E6BA38 CENTRAL HEADER #106   02014B50
4E6BA44 Last Mod Time         489A636E 'Tue Apr 26 12:27:28 2016'
4E6BA66 Filename              'HRC/WJC Speeches.xlsx'
4E6BA87   Mtime               01D19FE0E7BF0000 'Tue Apr 26 12:27:28 2016 0ns'
4E6BA8F   Ctime               01D1CB265466344D 'Mon Jun 20 14:02:45 2016 563502100ns'
4E6BA97   Atime               01D1CB265466344D 'Mon Jun 20 14:02:45 2016 563502100ns'
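The inverted sign in the TZ value is easy to get wrong, so here is a quick way to check it. This sketch assumes GNU date (for the `-d @seconds` syntax); `render_in_tz` is my own helper name, and 1466455318 is my own conversion of the directory Mtime 01D1CB34305B8923 to Unix seconds.

```bash
#!/bin/bash
# Render a Unix timestamp under a given POSIX TZ string (GNU date assumed).
# render_in_tz is a hypothetical helper name used for illustration.
render_in_tz() {
    TZ="$1" date -d "@$2" '+%Y-%m-%d %H:%M:%S'
}

render_in_tz UTC   1466455318   # 2016-06-20 20:41:58
render_in_tz UTC+5 1466455318   # 2016-06-20 15:41:58  (five hours behind UTC)
render_in_tz UTC-5 1466455318   # 2016-06-21 01:41:58  (five hours ahead of UTC)
```

Only the UTC+5 rendering lines up with the MS-DOS value of 15:42:00 recorded for the directory.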

Conversion from FILETIME (NTFS) to DateTime (MS-DOS)

The FILETIME stamps have 100ns resolution, while DateTime values have only 2-second resolution.

DateTime values are used for last modification timestamps on FAT filesystems and also for the standard last modification time field in Zip archives.

It's important to understand how the conversion to DateTime values is performed, because there are several possibilities and the actual conversion is the least intuitive one. The conversion always rounds UP to the next two-second value, even if the high-resolution value is only a fraction of a second past the previous two-second value. For example, the directory entry above has a FILETIME of 15:41:58 plus a fraction of a second (local time), and it is stored as a DateTime of 15:42:00.

Timestamps are always rounded up and never down, so that a timestamp never goes backwards in time when a file is copied into a Zip archive or onto a FAT filesystem.
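The rounding rule described above can be sketched in a few lines. This is an illustration of the rule, not code taken from Windows or 7-Zip; `filetime_round_up_2s` is my own name, and the sketch assumes the local offset is a whole number of even seconds, so that two-second boundaries coincide in UTC and local time.

```bash
#!/bin/bash
# Round a hex NTFS FILETIME up to the next two-second boundary and
# print the result as Unix seconds.
# filetime_round_up_2s is a hypothetical helper name used for illustration.
filetime_round_up_2s() {
    local ticks=$(( 16#$1 ))
    local step=20000000                                    # two seconds in 100ns ticks
    local rounded=$(( (ticks + step - 1) / step * step ))  # integer ceiling
    echo $(( rounded / 10000000 - 11644473600 ))
}

filetime_round_up_2s 01D1CB34305B8923   # 1466455320: 58.0498211s becomes :42:00
filetime_round_up_2s 01D19FDF61EE8500   # 1461690994: already on a boundary, unchanged
```

The first result is two seconds later than the truncated FILETIME (1466455318), which matches the 15:41:58 versus 15:42:00 difference seen for the directory entry.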

Extracting metadata

The output produced by zipdetails is not convenient to process for further analysis, so the following shell script was used to massage the data into a more useful form.

You can download the script and the output files here

#!/bin/bash

if [ $# -ne 1 ]; then
    echo "Path to HRC_pass..zip needed"
    exit 1
fi

FILENAME="$1"

# quote the path so that directories containing spaces still work
TYPE=$(file --brief --mime-type "${FILENAME}")
if [ "$TYPE" != "application/zip" ]; then
    echo "File ${FILENAME} does not seem to be a Zip archive: ${TYPE}"
    exit 1
fi

mkdir -p data

TZ=UTC+5 zipdetails "${FILENAME}" > data/zipdetails

#             $1       $2                        $3
#           4E64315   Ctime               01D1CB2650CAD5D2 'Mon Jun 20 19:02:39
#
#     'Ctime' lines        | hex field, prepend 0x |    print as a decimal value
#
grep Ctime data/zipdetails |  awk '{print "0x"$3}' | xargs printf "%d\n" > data/ctimes

zipinfo -1 "${FILENAME}" > data/names

# there are some non utf-8 characters in filenames
iconv -f ISO-8859-15 -t UTF-8 data/names > data/names.utf8

# remove 2 lines of header and one line of footer
zipinfo -T "${FILENAME}" | head -n -1 | tail -n +3 > data/zipinfo

# column 7 is the MS-DOS timestamp, column 4 is the uncompressed size
awk '{print $7}' data/zipinfo > data/dostimes
awk '{print $4}' data/zipinfo > data/filesizes

# join 4 data files into a csv file, drop the first line/entry for the directory
paste -d"," data/ctimes data/dostimes data/filesizes data/names.utf8 | tail -n +2 > data/hrc-metadata-ctimes.csv
paste -d"," data/dostimes data/filesizes data/names | tail -n +2 | sort > data/hrc-metadata-dostimes.csv