@bontchev
Last active Jul 6, 2020
#!/usr/bin/env python
# -*- coding: utf-8 -*-

from __future__ import print_function

from re import compile
from sys import stdout, stderr, exit
from argparse import ArgumentParser

# Third-party dependencies; stop with a hint if either one is missing.
try:
    from virustotal import VirusTotal
except ImportError:
    print('Could not import module "virustotal"; try "pip install virustotal".', file=stderr)
    exit(1)
try:
    from ratelimiter import RateLimiter
except ImportError:
    print('Could not import module "ratelimiter"; try "pip install ratelimiter".', file=stderr)
    exit(1)

__description__ = 'Converts MD5 to SHA256 hashes using VirusTotal.'
__license__ = 'GPL'
__uri__ = 'https://gist.github.com/bontchev/8a53787a37862c3dc11d1dddff143c3e'
__VERSION__ = '1.0.0'
__author__ = 'Vesselin Bontchev'
__email__ = 'vbontchev@yahoo.com'
api_key = 'Your VirusTotal API key here'


def process_hash(hash, output_file, use_sha1, verbose, rate_limiter, v):
    """Look up one MD5 hash on VirusTotal; return True on error."""
    # Anchor the pattern so that longer hex strings are rejected.
    hex_digits_md5 = compile('[0-9a-fA-F]{32}$')
    if not hex_digits_md5.match(hash):
        print('Bad MD5 hash: "{}".'.format(hash), file=stderr)
        return True
    if verbose:
        print('Processing {}...'.format(hash))
    with rate_limiter:
        try:
            report = v.get(hash)
            if report is None:
                # Unknown to VirusTotal: output the MD5 hash alone.
                print('{}'.format(hash), file=output_file)
                return False
            report.join()
            assert report.done
            new_hash = report.sha1 if use_sha1 else report.sha256
            print('{}\t{}'.format(hash.upper(), new_hash.upper()), file=output_file)
            output_file.flush()
        except Exception as e:
            print('Error: {}'.format(e), file=stderr)
            return True
    return False


def process_file(hash_file, output_file, use_sha1, verbose, rate_limiter, v):
    """Process every hash listed in a text file; return True on error."""
    error_level = False
    try:
        with open(hash_file, 'r') as f:
            lines = f.readlines()
        for line in lines:
            line = line.strip()
            # Skip empty lines and comments.
            if not line:
                continue
            if line[0] == '#':
                continue
            # Only the first word of the line is the hash.
            line = line.split()[0]
            if process_hash(line, output_file, use_sha1, verbose, rate_limiter, v):
                error_level = True
    except Exception as e:
        print('Error: {}.'.format(e), file=stderr)
        return True
    return error_level


def get_options():
    parser = ArgumentParser(description=__description__)
    parser.add_argument('-v', '--version', action='version',
                        version='%(prog)s version {}'.format(__VERSION__))
    parser.add_argument('-a', '--apikey', default=api_key,
                        help='VirusTotal API key')
    parser.add_argument('-b', '--verbose', action='store_true',
                        help='Display the hashes as they are processed')
    parser.add_argument('-r', '--rate', type=int, default=4,
                        help='Requests per minute (default: 4)')
    parser.add_argument('-s', '--sha1', action='store_true',
                        help='Use SHA1 instead of SHA256')
    parser.add_argument('-o', '--output', dest='outputfile', default=None,
                        help='Output file name (default: stdout)')
    parser.add_argument('fileOrMD5', nargs='+', help='@File or MD5 hash')
    return parser.parse_args()


def main():
    args = get_options()
    # A VirusTotal API key is exactly 64 hexadecimal digits.
    hex_digits_api = compile('[0-9a-fA-F]{64}$')
    if not hex_digits_api.match(args.apikey):
        print('API key not set correctly.', file=stderr)
        exit(1)
    output_file = stdout
    error_level = 0
    if args.outputfile is not None:
        output_file = open(args.outputfile, 'w')
    rate_limiter = RateLimiter(max_calls=args.rate, period=60)
    v = VirusTotal(args.apikey)
    for hash in args.fileOrMD5:
        # A leading '@' marks a file of hashes rather than a single hash.
        if hash.startswith('@'):
            if process_file(hash[1:], output_file, args.sha1, args.verbose, rate_limiter, v):
                error_level = 1
        elif process_hash(hash, output_file, args.sha1, args.verbose, rate_limiter, v):
            error_level = 1
    if args.outputfile is not None:
        output_file.close()
    exit(error_level)


if __name__ == '__main__':
    main()

@bontchev bontchev commented Jul 6, 2020

md5tosha.py - Converting MD5 hashes to SHA hashes using VirusTotal

Introduction

Don't you hate it when a malware report lists MD5 hashes of the malware as
indicators of compromise (IoCs) instead of SHA256 hashes? MD5 is an obsolete
and insecure hash function and should no longer be used. Besides, many
honeypots store the files uploaded by attackers under names derived from
their SHA hashes, so it is easier to check whether your honeypot has already
seen a piece of malware if you have its SHA hash.

Sadly, it is not possible to compute the SHA hash from the MD5 hash. However,
VirusTotal computes and stores different kinds of hashes for the malware
uploaded to it and, if the sample of the malware has been uploaded there,
it is possible to obtain any of its hashes by knowing some other hash of it.
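
To see why the sample itself is needed, note that every digest is computed
from the file's bytes, never from another digest. A minimal sketch using
Python's standard hashlib module; the helper name `digests` and the sample
bytes are illustrative assumptions and are not part of the script:

```python
import hashlib


def digests(data):
    """Return the MD5, SHA1 and SHA256 hex digests of the same bytes."""
    return {
        'md5': hashlib.md5(data).hexdigest(),
        'sha1': hashlib.sha1(data).hexdigest(),
        'sha256': hashlib.sha256(data).hexdigest(),
    }


if __name__ == '__main__':
    # Placeholder bytes standing in for the contents of a sample.
    for name, value in digests(b'example sample contents').items():
        print('{}\t{}'.format(name, value.upper()))
```

A service that holds the sample can compute all of these once and index them
against each other, which is what makes the lookup possible.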

Doing this manually is a rather irritating and time-consuming process,
especially for large batches of hashes, so I have written this small Python
script to automate it.

Dependencies

The script depends on two external Python modules: virustotal and
ratelimiter. Both can be installed with pip:

pip install virustotal
pip install ratelimiter

In addition, you need a VirusTotal API key. One can be obtained by registering
a free account there. Once obtained, the key can be put in the script (in the
variable api_key), or it can be specified at run-time from the command line
by using the -a option. Please keep in mind that the API keys of the free
accounts are limited to 4 queries per minute.
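
Before contacting VirusTotal, the script checks that the key consists of 64
hexadecimal digits, so the placeholder text left in api_key is rejected. A
rough sketch of such a check, assuming Python 3 (for `re.fullmatch`); the
helper name `looks_like_api_key` is hypothetical:

```python
import re

# 64 hexadecimal digits, upper or lower case.
API_KEY_PATTERN = re.compile(r'[0-9a-fA-F]{64}')


def looks_like_api_key(key):
    """Return True if key has the shape of a VirusTotal API key."""
    # fullmatch requires the whole string to match, not just a prefix.
    return API_KEY_PATTERN.fullmatch(key) is not None


if __name__ == '__main__':
    print(looks_like_api_key('Your VirusTotal API key here'))
```

This only checks the shape of the key; whether VirusTotal accepts it is known
only after the first query.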

Also note that the script can find out the SHA hash only if the malware with
the corresponding MD5 hash has been uploaded to VirusTotal.

Usage

md5tosha.py [-h] [-v] [-a APIKEY] [-b] [-r RATE] [-s] [-o OUTPUTFILE]
            fileOrMD5 [fileOrMD5 ...]

Converts MD5 to SHA256 hashes using VirusTotal.

positional arguments:
  fileOrMD5             @File or MD5 hash

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  -a APIKEY, --apikey APIKEY
                        VirusTotal API key
  -b, --verbose         Display the hashes as they are processed
  -r RATE, --rate RATE  Requests per minute (default: 4)
  -s, --sha1            Use SHA1 instead of SHA256
  -o OUTPUTFILE, --output OUTPUTFILE
                        Output file name (default: stdout)

The script takes as arguments one or more MD5 hashes or file names. If a file
name is specified, it must be a text file and must be specified with a @
character prepended to its name.

The format of the file is the following:

# Lines starting with a '#' character are comments
        # Leading (and trailing) space is ignored
# Lines can be empty

# The first word of a non-comment, non-empty line MUST be an MD5 hash
44D88612FEA8A8F36DE82E1278ABB02F        Everything after the first word is ignored
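
The parsing rules above can be sketched in a few lines of Python. The
function name `extract_hashes` is an assumption made for this illustration;
the script applies the same rules inside process_file:

```python
def extract_hashes(lines):
    """Yield the first word of every non-empty, non-comment line."""
    for line in lines:
        line = line.strip()  # leading and trailing space is ignored
        if not line or line.startswith('#'):
            continue  # empty line or comment
        yield line.split()[0]  # everything after the first word is ignored


if __name__ == '__main__':
    sample = [
        "# Lines starting with a '#' character are comments\n",
        "        # Leading (and trailing) space is ignored\n",
        "\n",
        "44D88612FEA8A8F36DE82E1278ABB02F        Everything after is ignored\n",
    ]
    print(list(extract_hashes(sample)))
    # prints ['44D88612FEA8A8F36DE82E1278ABB02F']
```

Note that this step does not validate the word it extracts; the script checks
separately that it is a well-formed MD5 hash.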

The script takes the following command-line options:

-h, --help Displays usage information

-v, --version Displays the version number of the script

-a APIKEY, --apikey APIKEY Specifies a VirusTotal API key

-b, --verbose Displays to stdout each MD5 hash as it is being processed

-r RATE, --rate RATE Specifies the rate at which the queries should be sent
to VirusTotal. Use this option if you have a paid account (and an API key from
it) that allows more frequent queries

-s, --sha1 Displays the SHA1 hash of the file instead of the default SHA256

-o OUTPUTFILE, --output OUTPUTFILE Stores the results into the specified
output file
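
The script delegates the throttling behind -r to the ratelimiter module,
creating RateLimiter(max_calls=RATE, period=60) and entering it as a context
manager around each query. Purely to illustrate the idea, here is a
stdlib-only sketch of a sliding-window limiter, assuming Python 3. The class
name `SlidingWindowLimiter` and its `clock` and `sleep` parameters are
assumptions made for this example; this is not the ratelimiter module's
implementation:

```python
import time
from collections import deque


class SlidingWindowLimiter(object):
    """Allow at most max_calls entries per period (in seconds)."""

    def __init__(self, max_calls, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock    # injectable for testing
        self.sleep = sleep    # injectable for testing
        self.calls = deque()  # start times of the most recent calls

    def __enter__(self):
        now = self.clock()
        # Forget calls that have already left the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            # Window is full: wait until the oldest call leaves it.
            self.sleep(self.period - (now - self.calls[0]))
            self.calls.popleft()
        self.calls.append(self.clock())
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        return False  # never swallow exceptions
```

With `limiter = SlidingWindowLimiter(4)`, wrapping each query in
`with limiter:` lets the first four through immediately and makes the fifth
wait until a minute has passed since the first.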

If VirusTotal has a sample of the file whose MD5 hash is being queried, the
script will output a line containing the MD5 hash and the SHA hash, separated
with a tab. If VirusTotal does not have a sample, the script will output only
the MD5 hash on the line.
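
Output in this format is easy to consume from another program. A minimal
sketch, with `parse_result_line` being a hypothetical helper name and the
hashes in the usage example being placeholders rather than real results:

```python
def parse_result_line(line):
    """Split one output line into (md5, sha); sha is None if not found."""
    fields = line.rstrip('\n').split('\t')
    md5 = fields[0]
    # A second field is present only if VirusTotal had the sample.
    sha = fields[1] if len(fields) > 1 else None
    return md5, sha


if __name__ == '__main__':
    found = 'A' * 32 + '\t' + 'B' * 64 + '\n'  # placeholder hashes
    missing = 'C' * 32 + '\n'
    for line in (found, missing):
        md5, sha = parse_result_line(line)
        print(md5, 'not on VirusTotal' if sha is None else sha)
```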
