Skip to content
{{ message }}

Instantly share code, notes, and snippets.

# hellman/writeup.ipynb

Created May 3, 2020
De1CTF 2020 - Mini Pure Plus
 { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# De1CTF 2020 - Mini Purε Plus" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Intro\n", "\n", "In this challenge, we have a simple 48-bit *block cipher*:\n", "```py\n", "ROUND = 16\n", "F = ffield.FField(24)\n", "\n", "def Mul(q1,q2):\n", " return F.Multiply(q1,q2)\n", "\n", "def fbox(x):\n", " return Mul(Mul(x,x),x)\n", "\n", "def enc_block(l, r, keys):\n", " for i in range(ROUND):\n", " l , r = r , l ^ fbox(keys[i] ^ r)\n", " l , r = r , l\n", " return l, r\n", "```\n", "\n", "It is a 16-round Feistel Network with the *cube function* as the Feistel function, over the field \$GF(2^{24})\$ (each branch). It is a small-scale variant of the \$\\mathcal{PURE}\$ cipher, which in turn is a simplification of the KN-cipher (by Knudsen and Nyberg [](https://link.springer.com/article/10.1007/BF00204800)), where the round function is the composition of an extension to an odd-degree field, the cube function, and the truncation to the original field. Both ciphers are interesting in that they are provably secure against linear and differential cryptanalysis, due to the use of the cube function. However, both ciphers were shown to be very weak against higher-order differential and interpolation attacks by Knudsen and Jacobsen [](https://link.springer.com/chapter/10.1007/BFb0052332)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## First attempt\n", "\n", "We are given a set of \$2^{24}\$ plaintext-ciphertext pairs, where plaintexts have form \$(\\mathtt{777777},\\mathtt{*})\$, and \$*\$ goes over all possible values. This allows to simplify interpolation attacks, since we have only one variable in the input.\n", "\n", "In a meet-in-the-middle interpolation attack, we choose an intermediate state and express it as a polynomial \$g(x)\$ in the (right half of the) input, and as a polynomial \$h(y,z)\$ in the ciphertext halves. We then know that\n", "\$\$\n", "g(x) = h(y,z)\n", "\$\$\n", "for all \$x,y,z\$ in the known plaintext-ciphertext pairs. Then, we can interpolate \$g\$ and \$h\$.\n", "\n", "Unfortunately, the original paper does not mention a particular method, except Lagrange interpolation for the univariate case, which is non-trivial to extend to the multivariate case. A simple method is to evaluate all possible monomials (which could obtained e.g. via sampling) on the plaintext-ciphertext pairs to obtain a vector of values for each monomials, and find a linear combination that sums to zero. However, this method has a close to cubic complexity (in the number of unknown monomials).\n", "\n", "*Please let me know if there is an efficient method for multivariate interpolation.*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's count unknown monomials in the actual cipher's polynomials." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2020-05-03T20:42:36.502353Z", "iopub.status.busy": "2020-05-03T20:42:36.501693Z", "iopub.status.idle": "2020-05-03T20:42:36.548393Z", "shell.execute_reply": "2020-05-03T20:42:36.547297Z", "shell.execute_reply.started": "2020-05-03T20:42:36.502241Z" } }, "outputs": [], "source": [ "from sage.all import * # 9.0\n", "from tqdm import tqdm\n", "\n", "# boilerplate\n", "x = GF(2).polynomial_ring().gen()\n", "mod = x**24 + x**16 + x**15 + x**14 + x**13 + x**10 + x**9 + x**7 + x**5 + x**3 + 1\n", "F = GF(2**24, name='a', modulus=mod)\n", "toF = F.fetch_int\n", "fromF = lambda v: v.integer_representation()" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2020-05-03T20:42:36.550432Z", "iopub.status.busy": "2020-05-03T20:42:36.549843Z", "iopub.status.idle": "2020-05-03T20:42:36.573447Z", "shell.execute_reply": "2020-05-03T20:42:36.572596Z", "shell.execute_reply.started": "2020-05-03T20:42:36.550339Z" } }, "outputs": [], "source": [ "R = PolynomialRing(F, names='x,y')\n", "x, y = R.gens()\n", "\n", "ks = [toF(100 * i + 44) for i in range(16)]" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "execution": { "iopub.execute_input": "2020-05-03T20:42:36.574748Z", "iopub.status.busy": "2020-05-03T20:42:36.574383Z", "iopub.status.idle": "2020-05-03T20:42:55.132046Z", "shell.execute_reply": "2020-05-03T20:42:55.131474Z", "shell.execute_reply.started": "2020-05-03T20:42:36.574702Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "after 1 rounds: unknown: 0 3 all: 1 4\n", "after 2 rounds: unknown: 3 7 all: 4 8\n", "after 3 rounds: unknown: 7 21 all: 8 22\n", "after 4 rounds: unknown: 21 61 all: 22 62\n", "after 5 rounds: unknown: 61 212 all: 62 214\n", "after 6 rounds: unknown: 212 705 all: 214 706\n", "after 7 rounds: unknown: 705 2170 all: 706 2172\n", "after 8 rounds: unknown: 2170 6544 all: 2172 6546\n", "after 9 rounds: unknown: 6544 19649 all: 6546 19652\n", "after 10 rounds: unknown: 19649 59028 all: 19652 59030\n", "\n" ] } ], "source": [ "# forward direction\n", "l = toF(0x777777)\n", "r = x\n", "l = F.fetch_int(0x777777)\n", "for i in range(10):\n", " l += (r + ks[i])**3\n", " l, r = r, l\n", " # assume coefficient 1 is always independent of the keys\n", " lmonos = [mono for c, mono in l if c != 1]\n", " rmonos = [mono for c, mono in r if c != 1]\n", " print(f\"after {i+1:2} rounds:\",\n", " f\" unknown: {len(lmonos):6} {len(rmonos):6}\",\n", " f\" all: {len(list(l)):6} {len(list(r)):6}\")\n", "print()" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "execution": { "iopub.execute_input": "2020-05-03T20:42:55.132988Z", "iopub.status.busy": "2020-05-03T20:42:55.132710Z", "iopub.status.idle": "2020-05-03T20:43:28.405884Z", "shell.execute_reply": "2020-05-03T20:43:28.404945Z", "shell.execute_reply.started": "2020-05-03T20:42:55.132950Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "after 1 rounds: unknown: 3 0 all: 5 1\n", "after 2 rounds: unknown: 13 3 all: 17 5\n", "after 3 rounds: unknown: 48 13 all: 56 17\n", "after 4 rounds: unknown: 335 48 all: 358 56\n", "after 5 rounds: unknown: 2233 335 all: 2303 358\n", "after 6 rounds: unknown: 25889 2233 all: 26171 2303\n", "\n" ] } ], "source": [ "# backward direction\n", "l = x\n", "r = y\n", "l, r = r, l\n", "for i in range(6):\n", " l, r = r, l\n", " l += (r + ks[i])**3\n", " # assume coefficient 1 is always independent of the keys\n", " lmonos = [mono for c, mono in l if c != 1]\n", " rmonos = [mono for c, mono in r if c != 1]\n", " print(f\"after {i+1:2} rounds:\",\n", " f\" unknown: {len(lmonos):6} {len(rmonos):6}\",\n", " f\" all: {len(list(l)):6} {len(list(r)):6}\")\n", "print()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Clearly, the ciphertext-side polynomial grows very fast, because it is bivariate. In principle, the meet-in-the-middle attack is already possible:\n", "\n", "- the left side of the 10-round plaintext polynomial has 19649 unknown coefficients;\n", "- the left side of the 6-round ciphertext polynomial has 25889 unknown coefficients;\n", "\n", "The total number of unknowns is 45538, which is much less than the number of available pairs (which is \$2^{24}\$). However, interpolating such huge polynomials using the naive linear-algebra method is rather expensive. Only the matrix itself would take more than 5 GB to store. While practical, this is not a fast enough short-term solution." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Improved Method\n", "\n", "How can we avoid the huge interpolation step? Let's combine plaintext-ciphertext pairs and sum the corresponding equations \$g(x)=h(y,z)\$. Since we have full codebook in \$x\$, we can choose arbitrary structures in \$x\$. The goal then would be to simplify or even get rid of \$g(x)\$ (ensure it sums to zero), such that only the sum of \$h(y,z)\$ is left as unknown.\n", "\n", "The idea is to sum equations with \$x\$ going over a subgroup of \$GF(2^{24})^*\$ of order higher than the degree of \$g\$ (see e.g. Proposition 2 of [](https://eprint.iacr.org/2020/188)). Let \$w\$ be an \$d\$-th root of unity, \$\\deg{g}(2^{24}-1)/3\$, so the right side does not sum to a constant. The equation that we obtain is basically the same:\n", "\$\$\n", "\\sum_{i=0}^{(2^{24}-1)/3-1} (l_{i'}^3 + k_{15}l_{i'}^2 + k_{15}^2l_{i'} + k_{15}^3 + r_{i'}) = g(0) = (l_j^3 + k_{15}l_j^2 + k_{15}^2l_j + k_{15}^3 + r_j),\n", "\$\$\n", "where \$i'\$ is the index of the plaintext-ciphertext pair with plaintext \$(\\mathtt{777777}, aw^i)\$, \$w=g^3\$, and \$a\$ is one of \$1,g,g^2\$; \$j\$ is the index of the pair with plaintext \$(\\mathtt{777777}, 0)\$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Summing up\n", "\n", "The final attack is as follows: we compute the polynomial in \$k_{15}\$ as in the above equation by summing ciphertext-based values over a 1/3rd of the given dataset. The resulting polynomial of degree=2 (\$k_{15}^3\$ is summed an even number of times) must have the last subkey as a root.\n", "\n", "In order to recover the second-to-last subkey, we can simply repeat the attack by decrypting the last round and attacking 15 rounds, by conveniently setting \$d=(2^{24}-1)/9\$.\n", "\n", "The challenge authors also provide information about the subkeys in a form of the result of multiplication of the subkey vector (of length 16) by a random \$16\\times 14\$ matrix over some prime finite field. This allows to recover all subkeys given any two of them.\n", "\n", "## The code" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "execution": { "iopub.execute_input": "2020-05-03T20:43:28.407727Z", "iopub.status.busy": "2020-05-03T20:43:28.407121Z", "iopub.status.idle": "2020-05-03T20:44:17.998106Z", "shell.execute_reply": "2020-05-03T20:44:17.997536Z", "shell.execute_reply.started": "2020-05-03T20:43:28.407637Z" } }, "outputs": [], "source": [ "# read the dataset\n", "enc = {}\n", "for lp, lc in zip(open(\"pt.txt\"), open(\"ct.txt\")):\n", " line = int(lp, 16)\n", " px = int(line & 0xffffff)\n", " \n", " line = int(lc, 16)\n", " cl = int(line >> 24)\n", " cr = int(line & 0xffffff)\n", " enc[px] = cl, cr" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "execution": { "iopub.execute_input": "2020-05-03T20:44:17.998907Z", "iopub.status.busy": "2020-05-03T20:44:17.998666Z", "iopub.status.idle": "2020-05-03T20:48:37.407971Z", "shell.execute_reply": "2020-05-03T20:48:37.406333Z", "shell.execute_reply.started": "2020-05-03T20:44:17.998876Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "100%|██████████| 5592406/5592406 [04:19<00:00, 21562.45it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "[16137118, 1102165, 14344588, 0]\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "# recover the last subkey\n", "w = F.gen()**3\n", "poly = vector(F, 4)\n", "for ii in tqdm(range(-1, (2**24-1)//3)):\n", " if ii == -1: \n", " r0 = F(0)\n", " elif ii == 0:\n", " r0 = F(1)\n", " \n", " # get ciphertext, undo unswap\n", " r, l = map(toF, enc[fromF(r0)])\n", "\n", " poly += vector(F, [l**3 + r, l**2, l, 1])\n", " r0 *= w\n", "print([fromF(v) for v in poly])" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "execution": { "iopub.execute_input": "2020-05-03T20:48:37.409872Z", "iopub.status.busy": "2020-05-03T20:48:37.409391Z", "iopub.status.idle": "2020-05-03T20:48:37.435974Z", "shell.execute_reply": "2020-05-03T20:48:37.435255Z", "shell.execute_reply.started": "2020-05-03T20:48:37.409812Z" } }, "outputs": [ { "data": { "text/plain": [ "[4122190, 14944194]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Rk = PolynomialRing(F, names='k')\n", "k = Rk.gen()\n", "rpoly = sum(c*k**i for i, c in enumerate(poly))\n", "K15cands = [r for r, _ in rpoly.roots()]\n", "[fromF(v) for v in K15cands]" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "execution": { "iopub.execute_input": "2020-05-03T20:48:37.437420Z", "iopub.status.busy": "2020-05-03T20:48:37.436910Z", "iopub.status.idle": "2020-05-03T20:51:42.796709Z", "shell.execute_reply": "2020-05-03T20:51:42.796269Z", "shell.execute_reply.started": "2020-05-03T20:48:37.437378Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "100%|██████████| 1864136/1864136 [01:32<00:00, 20118.78it/s]\n", " 0%| | 1949/1864136 [00:00<01:35, 19488.02it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "candidate pair (7100038, 4122190)\n", "candidate pair (7904001, 4122190)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "100%|██████████| 1864136/1864136 [01:32<00:00, 20113.97it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "candidate pair (9208218, 14944194)\n", "candidate pair (11612732, 14944194)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "K1415 = []\n", "# recover the pre-last subkey\n", "for K15 in K15cands:\n", " w = F.gen()**9\n", " poly = vector(F, 4)\n", " for ii in tqdm(range(-1, (2**24-1)//9)):\n", " if ii == -1: \n", " r0 = F(0)\n", " elif ii == 0:\n", " r0 = F(1)\n", "\n", " # get ciphertext, undo unswap\n", " r, l = map(toF, enc[fromF(r0)])\n", " # undo last round\n", " l, r = r, l\n", " l += (r + K15)**3\n", "\n", " poly += vector(F, [l**3 + r, l**2, l, 1])\n", " r0 *= w\n", " \n", " rpoly = sum(c*k**i for i, c in enumerate(poly))\n", " K14cands = [r for r, _ in rpoly.roots()]\n", " for K14 in K14cands:\n", " K1415.append((fromF(K14), fromF(K15)))\n", " print(\"candidate pair\", K1415[-1])" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "execution": { "iopub.execute_input": "2020-05-03T20:51:42.797613Z", "iopub.status.busy": "2020-05-03T20:51:42.797371Z", "iopub.status.idle": "2020-05-03T20:51:42.826001Z", "shell.execute_reply": "2020-05-03T20:51:42.825465Z", "shell.execute_reply.started": "2020-05-03T20:51:42.797580Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Subkeys recovered: [16359893, 9091260, 11254674, 353718, 5395716, 9319892, 2360013, 12784246, 9857353, 2940944, 964650, 3296014, 7022345, 198188, 9208218, 14944194]\n" ] } ], "source": [ "# use the hint to recover all subkeys\n", "import re\n", "array = re.findall(r\"\\d+\", open(\"data.txt\").read())\n", "\n", "R = GF(next_prime(2**24))\n", "\n", "for K14, K15 in K1415:\n", " m = matrix(R, 17, 14, array)\n", " v = -m[-3] * K14 -m[-2] * K15 + m[-1]\n", " m = m[:-3]\n", " \n", " ks = m.solve_left(v)\n", " ks = list(map(int, list(ks) + [K14, K15]))\n", "\n", " # 7777776f4593 -> 04e0b9dd5f76\n", " l, r = toF(0x777777), toF(0x6f4593)\n", " for i in range(16):\n", " l += (r + toF(int(ks[i])))**3\n", " l, r = r, l\n", " l, r = fromF(r), fromF(l)\n", " if l == 0x04e0b9 and r == 0xdd5f76:\n", " print(\"Subkeys recovered:\", ks)\n", " break " ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "execution": { "iopub.execute_input": "2020-05-03T20:51:42.827096Z", "iopub.status.busy": "2020-05-03T20:51:42.826797Z", "iopub.status.idle": "2020-05-03T20:51:42.841972Z", "shell.execute_reply": "2020-05-03T20:51:42.841457Z", "shell.execute_reply.started": "2020-05-03T20:51:42.827054Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "b'De1CTF{6a2ddcc3-c729-48f8-b5e0-7574c46a2846}\\x04\\x04\\x04\\x04'\n" ] } ], "source": [ "# decrypt the flag\n", "ct = open(\"flag.txt\").read()\n", "pt = b\"\"\n", "for i in range(0, len(ct), 12):\n", " l = toF(int(ct[i:i+6], 16))\n", " r = toF(int(ct[i+6:i+12], 16))\n", " l, r = r, l\n", " for j in range(15, -1, -1):\n", " l, r = r, l\n", " l += (r + toF(int(ks[j])))**3\n", " pt += bytes.fromhex(\"%06x\" % fromF(l))\n", " pt += bytes.fromhex(\"%06x\" % fromF(r))\n", "print(pt)" ] } ], "metadata": { "kernelspec": { "display_name": "SageMath 9.0", "language": "sage", "name": "sagemath3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 4 }
 #!/usr/bin/env python3 # coding=utf-8 import os,random,sys,string import binascii from pyfinite import ffield from FLAG import flag import sympy import numpy as np from Crypto.Util.number import long_to_bytes ROUND = 16 F = ffield.FField(24) flag = ''.join(['%02x' % b for b in flag.encode(encoding="utf-8")]) def pad(plain): pad_length = (6 - len(plain) % 6 ) % 6 return plain + chr(pad_length) * pad_length def genkeys(): keys=[] for _ in range(ROUND): key = os.urandom(3) key_int = int(binascii.hexlify(key),16) keys.append(key_int) return keys def Mul(q1,q2): return F.Multiply(q1,q2) def fbox(x): return Mul(Mul(x,x),x) def enc_block(plain,keys): l = int(binascii.hexlify(plain[:3]),16) r = int(binascii.hexlify(plain[3:]),16) for i in range(ROUND): l , r = r , l ^ fbox(keys[i] ^ r) l , r = r , l result = (hex((l << 24) + r)[2:]).rjust(12,'0') return result def encrypt(plain, keys): p = '' for i in range(0,len(plain),2): p += chr(int(plain[i:i+2],16)) plain = pad(p).encode(encoding='utf-8') cipher = '' for i in range(0,len(plain),6): cipher += enc_block(plain[i:i+6],keys) return cipher keys = genkeys() with open('pt.txt','r') as ptfile: with open('ct.txt','w') as ctfile: while True: pt = ptfile.readline().rstrip('\n') if pt == '': break ct = encrypt(pt,keys) ctfile.write(ct + '\n') p = sympy.nextprime(2**24) arr = np.random.randint(0,p,(ROUND,ROUND-2),dtype='int64') keys = np.array(keys) res = np.mod(np.dot(keys,arr),p) with open('data.txt','w') as f: f.write('Array: ' + str(arr) + '\n') f.write('Result: ' + str(res) + '\n') with open('flag.txt','w') as f: ct = encrypt(flag,keys) f.write(ct)
to join this conversation on GitHub. Already have an account? Sign in to comment