Skip to content

Instantly share code, notes, and snippets.

@DanilAndreev
Last active February 13, 2024 09:13
Show Gist options
  • Save DanilAndreev/77036b5f7a7dc656c54aacb31140c22c to your computer and use it in GitHub Desktop.
Save DanilAndreev/77036b5f7a7dc656c54aacb31140c22c to your computer and use it in GitHub Desktop.
Rainbow tables (hash-chains)
# Author: Andrieiev Danil Maksimovich
# danssg08@gmail.com | https://github.com/DanilAndreev
import table
import postgresql
import string
import hashlib
def hash_func(word: str) -> str:
"""
hash_func - function, designed to hash password.
:param word: Input password.
:return: Hashed password.
"""
return hashlib.md5(bytes(word, "utf-8")).hexdigest()
# Generating alphabet. We will use lowercase english letters.
ALPHABET: list = list(string.ascii_lowercase)
# Rainbow table record chain length.
CHAIN_LENGTH = 1000
WORD_LENGTH = 6
# Generate rainbow table of 30000 records with chain length 1000 for 6 letters passwords.
# You should execute this function once for generating table.
table.generate(
quantity=3000,
hash_func=hash_func,
add_to_table=postgresql.add_to_table,
chain_length=CHAIN_LENGTH,
alphabet=ALPHABET,
word_length=WORD_LENGTH
)
password: str = "banana"
password_hash: str = hash_func(password)
# Calculating password from rainbow table.
found_password = table.get_password(
hash_string=password_hash,
hash_func=hash_func,
get_from_table=postgresql.get_from_table,
chain_length=CHAIN_LENGTH,
word_length=WORD_LENGTH,
alphabet=ALPHABET
)
print("Password is:", found_password)
# Author: Andrieiev Danil Maksimovich
# danssg08@gmail.com | https://github.com/DanilAndreev
# Here declared bridge functions for storing rainbow table in PostgreSQL database.
import psycopg2
CONNECTION = psycopg2.connect(
host="localhost",
database="postgres",
user="postgres",
password="postgres",
port=5432
)
def get_from_table(end_word: str) -> list:
"""
get_from_table - function, designed to get records from rainbow table by end_word.
:param end_word: End word in chain.
:return: Start words of the chain with input end_word.
"""
cursor = CONNECTION.cursor()
try:
cursor.execute("select start_word from rainbow_table where end_word = %s", [end_word])
return [x[0] for x in cursor.fetchall()]
except Exception as exception:
print("Error reading from table:", exception)
finally:
cursor.close()
def add_to_table(start_word: str, end_word: str) -> bool:
"""
add_to_table
:param start_word: Start word of the chain.
:param end_word: End word of the chain.
:return: True if record was saved and False if not.
"""
cursor = CONNECTION.cursor()
try:
cursor.execute("insert into rainbow_table (start_word, end_word) values (%s, %s)", (start_word, end_word))
CONNECTION.commit()
return True
except Exception as exception:
print("Failed to push:", start_word, exception)
CONNECTION.rollback()
return False
finally:
cursor.close()

Rainbow tables

Introduction

A lot of water has flowed under the bridge since then, when the theme of rainbow tables was at its peak, but, nevertheless, the idea is entertaining. It is about this kind of intricate tables that I want to tell you.

Theory

Brute force method

Let's figure out what, how and why. The task is: A certain hash is sent to the input of the program, at the output we must get the password from which this hash was taken.

To simplify the example, we will assume that passwords are only six characters long and can consist exclusively of lowercase Latin characters.

The most obvious hack is brute force. Its essence is to take a hash from every possible permutation of characters in the password until it matches the desired hash. Based on an experiment, my computer generated hashes for 100000000 ten character passwords approximately 1.11691 sec. Let's count the number of different ten character passwords consisting of upper and lower latin letters and numbers using the combinatorial formula for the number of placements.

 k    10       n!        26!
A  = A   = ────────── = ───── = 21 ⋅ 22 ⋅ 23 ⋅ 24 ⋅ 25 ⋅ 26 = 165765600
 n    62    (n - k)!     20!

Now let's calculate the maximum time it takes to crack:

  1.11691
─────────── ⋅ 19275223968000 = 215287 seconds ≈ 60 hours 
 100000000

Not bad, but too long for a single password. Please note that our search alphabet consists of only 26 characters, this is essential reduces the number of combinations compared to real passwords.

The following thought comes to mind: what if you generate hashes for all passwords in advance and then search as you search words in the dictionary. No sooner said than counted. Let's assume that words are encoded in ASCII and a letter occupies one byte. Password will take 10 bytes. We will use md5 as a hash, it will take 32 bytes. By simple calculations we get: approximately 736 terabytes. We got into extremes again...

Rainbow table

The rainbow table is a special variant of lookup tables for inverting cryptographic hash functions using the mechanism of the reasonable compromise between table lookup time and memory usage.

        H                  R1          H                  R2          H                  R3
banana ───> 21e74d8ec2bde ───> friend ───> 009310b2075cf ───> potato ───> 2ef83e4ddb0aa ───> freeze

        H                  R1          H                  R2          H                  R3
beezer ───> 21e74d8ec2bde ───> jumper ───> 009310b2075cf ───> expand ───> 2ef83e4ddb0aa ───> luxury

        H                  R1          H                  R2          H                  R3
reflex ───> 21e74d8ec2bde ───> vertex ───> 009310b2075cf ───> anyway ───> 2ef83e4ddb0aa ───> mayday

Diagram of a rainbow table with a chain length of three. R1, R2, R3 - reduction functions, H - hashing function.

Creating a table

Suppose we have a finite set of P passwords. So, the idea is this - we will take a random password from P, then by we take a hash from it and apply another reduction function to the result. We repeat this operation k times.

        H                  R1          H                  R2          H                  R3
qwerty ───> 21e74d9ec2bae ───> sdgcds ───> 0093272b205cf ───> kfpsen ───> 2ef82e4ddb1aa ───> zlfesp

The requirement for the reduction function is to return values from the same alphabet as passwords.

Now we have a chain that can be recalculated and contains k passwords. We construct n such chains and write only the opening and closing passwords to the table:

start_word end_word
banana freeze
beezer luxury
reflex mayday

Table usage

Now, let's figure out how to crack the notorious hash using this table.

For each hash, the value of which we want to reverse, calculate the sequence R (… R (H (R (h)))…). If any of intermediate values will coincide with some end of any chain, we take the beginning of this chain and we restore it completely. It is highly likely that the complete chain will contain the hash value h, and the preceding one the password will be the one he is looking for.

Now let's figure out what this means. We have a hash h that needs to be reversed. Our task is to assume that this hash is located on some of the reductions and build the chain to the end. We will go with the k reduction to the first. When we will get the end of the chain for each of the assumptions, we will look for the resulting password in our table among end_word. If there is a match, we will take start_word of the matched record in the table and restore the entire chain and look for the desired hash in it. The previous password in the chain is likely to be the desired one.

It is worth noting that the restored chain does not always contain the desired hash value h. This is possible with a collision occurs between the hashing function H or the reduction function R. In this case, there may be false triggers, they must be ignored and the search continued.

If, upon reaching the zero position, we still did not find the password, it means that it did not appear in the pre-calculated chains.

Problems

Simple hash chains have several disadvantages. The most serious is the possibility of merging two chains into one (generation the same value in different chains). All values generated after the merge will be the same in both chains, which narrows the number of passwords covered. Since the pre-generated chains are not completely saved, it is impossible to effectively compare all generated values with each other. As a rule, the absence of collisions in the hash function H is taken care of by the encryption side of the passwords, so the main problem lies in the reduction function R.

Another serious problem is the selection of such an R function that will generate passwords with the required coverage and distribution. The limitation of the output alphabet is a serious limitation to the choice of such a function.

Preparing

First, let's define the length of the password to be cracked and the input alphabet. Offered as input the alphabet use lowercase latin letters. Their number is 26 letters. Password length, let it be equal to 6 characters. We will write the code in python for clarity.

import string

ALPHABET: list = list(string.ascii_lowercase)
WORD_LENGTH = 6

Using the combinatorial formula for determining the number of placements, we calculate the number of possible password combinations:

 k    6        n!        26!
A  = A   = ────────── = ───── = 21 ⋅ 22 ⋅ 23 ⋅ 24 ⋅ 25 ⋅ 26 = 165765600
 n    26    (n - k)!     20!

So, from a six-character password from lowercase letters of the Latin alphabet, you can get 165765600 different password combinations.

For example, we will use the md5 hashing algorithm.

Reducers

About the reduction function. It should turn the hash into some password of a given length of characters in the input alphabet. It is proposed to read the hash bit by bit, as a number, and take modulo by the length of the alphabet. Thus, we get the number symbols in the alphabet. Further, in order to shift the values, it is necessary to completely divide the hash by the length of the alphabet. We do this until we collect characters for one word.

import string


def reduce(hash_string: str, iteration: int, alphabet=None, word_length: int = 6) -> str:
    if alphabet is None:
        alphabet = list(string.ascii_letters)

    # Shifting input hash value by iteration and modulo by 2^40.
    value = (int(hash_string, 16) + iteration) % (2 ** 40)
    result = []
    for i in range(word_length):
        # Getting modulo by alphabet length. Result number will be between 0 and len(alphabet).
        mod = value % len(alphabet)
        # Dividing value by alphabet length.
        value //= len(alphabet)
        # Getting symbol from input alphabet by calculated value in range from 0 to len(alphabet).
        result.append(alphabet[mod])
    # Generating word from calculated symbols list.
    return "".join(result)

Basic rainbow table generation

First, let's create a function that will accept the password hash, the initial reduction iteration, and return the trailing password for the given chain.

import reduction
from collections import Callable


def get_end_of_chain(
        hash_string: str,
        hash_func: Callable,
        chain_length: int = 1000,
        start_step: int = 0,
        word_length: int = 6,
        alphabet=None
) -> str:
    word: str = ""
    for i in range(start_step, chain_length):
        word = reduction.reduce(
            hash_string=hash_string,
            iteration=i,
            alphabet=alphabet,
            word_length=word_length
        )
        hash_string = hash_func(word)
    return word

Function arguments:

  • hash_string: str - Input hash of the previous password in the chain.
  • hash_func: Callable - Hash function.
  • chain_length: int - Rainbow table chain length.
  • start_step: int - The reduction iteration to start with. Default - 0.
  • word_length: int - The length of the desired password.
  • alphabet: list - Input alphabet.

So let's try to generate a rainbow table. We will generate a random six character password, which will become the beginning of the chain, and count the entire chain for it. The password obtained after calculating the chain will be the closing one. In the table, we need to put the opening chain and closing passwords.

from collections import Callable
from random import Random
import string


def generate(
        quantity: int,
        hash_func: Callable,
        add_to_table: Callable,
        chain_length: int = 1000,
        alphabet=None,
        word_length: int = 6
) -> None:
    if alphabet is None:
        alphabet = list(string.ascii_letters)

    random = Random()
    i = 0
    while i < quantity:
        start_word: str = "".join([alphabet[random.randint(0, len(alphabet) - 1)] for i in range(word_length)])
        end_word: str = get_end_of_chain(
            hash_string=hash_func(start_word),
            hash_func=hash_func,
            chain_length=chain_length,
            word_length=word_length,
            alphabet=alphabet
        )
        if add_to_table(start_word, end_word):
            i += 1

Function arguments:

  • quantity: int - Target number of records in the table.
  • hash_func: Callable - Hash function.
  • add_to_table: Callable - Callback function to add record to table.
  • chain_length: int - Rainbow table chain length.
  • word_length: int - Target length of passwords.
  • alphabet: list - Input alphabet.

For example, I generated a table with the following parameters:

  • Alphabet - [a, b, c, d, e, f, g, h, i, j ,k, l, m, n, o, p, q, r, s, t u, v, w, x, y, z]
  • Chain length - 5000
  • Password length - 6
  • Records quantity - 40000
Result:

The table was generated in 36 minutes 21 seconds on one thread.

Hash cracking

The table is generated, the tea has already cooled down 10 times. It's time to use it. The password is calculated as follows:

  1. At the input, we receive the hash H, which must be reversed (find the password from which the hash was obtained).
  2. We assume that the resulting hash is the penultimate link in the chain of length n.
  3. Compute the n-1 reduction for it. At the output of the reduction, we get a password.
  4. We are looking for the received password in the table by trailing passwords (end_word).
  5. If there is a match, we take the password that opens the chain from the table record and build a full chain for it until we find the password we need in it. We do this for all matches until we find the password.
  6. If the password is not found, we apply a reduction to the input hash H n-2, hash the result and apply an n-1 reduction to it. We perform steps 4 and 5 again. And so on in a circle until we reach zero reduction or find a password
  7. If the password is not found, the table does not contain the information required to reverse the input hash H.
from collections import Callable
import reduction


def get_password(
        hash_string: str,
        hash_func: Callable,
        get_from_table: Callable,
        chain_length: int = 1000,
        word_length: int = 6,
        alphabet=None
) -> str:
    for i in range(chain_length - 1, -1, -1):
        end_word: str = get_end_of_chain(
            hash_string=hash_string,
            hash_func=hash_func,
            chain_length=chain_length,
            start_step=i,
            word_length=word_length,
            alphabet=alphabet
        )
        entries: list = get_from_table(end_word)
        if len(entries):
            for word in entries:
                for step in range(0, chain_length):
                    current_hash: str = hash_func(word)
                    if hash_string == current_hash:
                        return word
                    word = reduction.reduce(
                        hash_string=current_hash,
                        iteration=step,
                        alphabet=alphabet,
                        word_length=word_length
                    )

Function arguments:

  • hash_string: str - Target number of records in the table.
  • hash_func: Callable - Hash function.
  • get_from_table: Callable - Callback function for finding records in a table by end_word parameter.
  • chain_length: int - Rainbow table chain length.
  • word_length: int - Target length of passwords.
  • alphabet: list - Input alphabet.

Tests:

So, we randomly generate a six-character password from the characters of the input alphabet, calculate its hash and try to spin the password back. We do this 1500 times, along the way measuring some parameters.

Result:
  • Average password computation time: 1 minute 24 seconds.
  • Average time for successful password calculation: 37.5 seconds.
  • Hacking probability: 47.34%.

The tests took approximately 36 hours.

Conclusion

In total, a rainbow table does not guarantee a 100% chance of hash cracking, but it significantly reduces costs. Fundamental part is a reduction function, the quality of the table depends on it.

Python is not the fastest language, but it is great for prototyping and learning algorithms. For maximum efficiency, you should implement this algorithm, for example, in C or C ++.

Also, it does not hurt to improve the reduction function so that it provides the most uniform distribution values.
In addition to that, here are some thoughts on how you can parallelize your key guessing process:

  • It is possible to parallelize the steps of checking the n-th reduction.
  • It is better to perform several reductions in one instance of the stream in order to optimize the cost of initializing the stream.

Thanks for attention! :)

Credits

Author: Danil Andreev

# Author: Andrieiev Danil Maksimovich
# danssg08@gmail.com | https://github.com/DanilAndreev
import string
def reduce(hash_string: str, iteration: int, alphabet=None, word_length: int = 6) -> str:
"""
reduce - reduces input hash to next chain password.
:param hash_string: Input hash string.
:param iteration: Current iteration step.
:param alphabet: Input alphabet.
:param word_length: Target password length.
:returns: Reduced password.
"""
if alphabet is None:
alphabet = list(string.ascii_letters)
# Shifting input hash value by iteration and modulo by 2^40.
value = (int(hash_string, 16) + iteration) % (2 ** 40)
result = []
for i in range(word_length):
# Getting modulo by alphabet length. Result number will be between 0 and len(alphabet).
mod = value % len(alphabet)
# Dividing value by alphabet length.
value //= len(alphabet)
# Getting symbol from input alphabet by calculated value in range from 0 to len(alphabet).
result.append(alphabet[mod])
# Generating word from calculated symbols list.
return "".join(result)
# Author: Andrieiev Danil Maksimovich
# danssg08@gmail.com | https://github.com/DanilAndreev
from collections import Callable
from random import Random
import string
import reduction
def get_end_of_chain(
hash_string: str,
hash_func: Callable,
chain_length: int = 1000,
start_step: int = 0,
word_length: int = 6,
alphabet=None
) -> str:
"""
get_end_of_chain - function designed to get end password in the chain.
:param hash_string: Input hash string to reduce.
:param hash_func: Hash function.
:param chain_length: Max chain length.
:param start_step: Start step of reduction to generate next chain links.
:param word_length: Target password length.
:param chain_length: Max chain length.
:param alphabet: Input alphabet.
:return: Last chain link password.
"""
word: str = ""
for i in range(start_step, chain_length):
word = reduction.reduce(
hash_string=hash_string,
iteration=i,
alphabet=alphabet,
word_length=word_length
)
hash_string = hash_func(word)
return word
def generate(
quantity: int,
hash_func: Callable,
add_to_table: Callable,
chain_length: int = 1000,
alphabet=None,
word_length: int = 6
) -> None:
"""
generate - function, designed to generate rainbow table.
:param quantity: Target quantity of table records to generate.
:param hash_func: Callback for getting hash from the password.
:param add_to_table: Callback for adding records to rainbow table.
:param chain_length: Target chain length for one record.
:param alphabet: Input alphabet.
:param word_length: Password length.
"""
if alphabet is None:
alphabet = list(string.ascii_letters)
random = Random()
i = 0
while i < quantity:
start_word: str = "".join([alphabet[random.randint(0, len(alphabet) - 1)] for i in range(word_length)])
end_word: str = get_end_of_chain(
hash_string=hash_func(start_word),
hash_func=hash_func,
chain_length=chain_length,
word_length=word_length,
alphabet=alphabet
)
if add_to_table(start_word, end_word):
i += 1
def get_password(
hash_string: str,
hash_func: Callable,
get_from_table: Callable,
chain_length: int = 1000,
word_length: int = 6,
alphabet=None
) -> str:
"""
get_password - function, designed to calculate password from rainbow table.
:param chain_length: Target chain length.
:param get_from_table: Callback fro getting records from table by end_word.
:param hash_func: Callback for getting hash from the password.
:param hash_string: Input hash of the password.
:param word_length: Target password length.
:param alphabet: Input alphabet.
:return: Target password
"""
# Decreasing reduction start.
for i in range(chain_length - 1, -1, -1):
# Calculating end of chain for selected reduction start.
end_word: str = get_end_of_chain(
hash_string=hash_string,
hash_func=hash_func,
chain_length=chain_length,
start_step=i,
word_length=word_length,
alphabet=alphabet
)
# Looking for coincidence in the table.
entries: list = get_from_table(end_word)
if len(entries):
# For each coincidence.
for word in entries:
# Restoring entire chain.
for step in range(0, chain_length):
current_hash: str = hash_func(word)
# If hash matches - congratulations, we have found password!
if hash_string == current_hash:
return word
word = reduction.reduce(
hash_string=current_hash,
iteration=step,
alphabet=alphabet,
word_length=word_length
)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment