Skip to content

Instantly share code, notes, and snippets.

@ky28059
Last active June 2, 2024 22:35
Show Gist options
  • Save ky28059/c5cdfd92faf2b3938497dfcf6bb219dd to your computer and use it in GitHub Desktop.
Save ky28059/c5cdfd92faf2b3938497dfcf6bb219dd to your computer and use it in GitHub Desktop.

TJCTF 2024 — golf-hard

regex below par? note that this challenge has five levels.

nc tjc.tf 31627

We're given a Regex "quiz" with 5 levels. After passing all 5, we get the flag.

#!/usr/local/bin/python3.11

import regex  # le duh
import os
import tomllib
from tabulate import tabulate
from importlib import import_module
from itertools import zip_longest as zp


def check_regex(pattern, matches, nonmatches):
    try:
        re = regex.compile(pattern, flags=regex.V1)
    except:
        print("nope")
        return False

    for text in matches:
        if not re.search(text):
            print(f"whoops, didn't match on {text}")
            return False

    for text in nonmatches:
        if re.search(text):
            print(f"whoops, matched on {text}")
            return False

    return True


def main():
    for dir in sorted(os.listdir("challenges")):
        tomlpath = os.sep.join(["challenges", dir, "info.toml"])
        with open(tomlpath, "rb") as f:
            info = tomllib.load(f)

        matches = info["matches"]
        nonmatches = info["nonmatches"]
        length = info["length"]

        print(info["description"])
        print(
            tabulate(
                zp(matches, nonmatches),
                [
                    "Match on all of these:",
                    "But none of these:    ",
                ],
                disable_numparse=True,
                tablefmt="simple",
            )
        )
        print(f"\nMaximum allowable length is {length}\n")

        # include some test cases that may be inconvenient to display
        # only some challenges have extra tests
        # fear not, the intended pattern will remain the same
        ext = import_module(f"challenges.{dir}.extensions")
        matches.extend(ext.more_matches())
        nonmatches.extend(ext.more_nonmatches())

        pattern = input("> ")
        if len(pattern) > length:
            print(f"whoops, too long")
            return

        if not check_regex(pattern, matches, nonmatches):
            return

    print(open("flag.txt").read())


if __name__ == "__main__":
    main()

Since the server compiles regex using regex instead of re, we can use fancier regular expression extensions like recursion.

Warmup

We just want to match all words starting with a. This is pretty simple; just use ^ to force the match to start at the start of the string, then look for the first a:

^a
1. "Warmup"
    This one's pretty straightforward.

Match on all of these:    But none of these:
------------------------  ------------------------
ampasimenite              jasmone
anchorable                decisivenesses
aconic                    backoff
antistrophic              whitebark
abrade                    physogastrism
arroya                    shavee
apoplex                   hanoverian
ayahs                     weatherstripped
arock                     naturalize
anconoid                  lophotrichous
anglophile                nonprecedential
acalephan                 tjctf
amaze                     shepherdia
adelphe                   waymark
anno                      picnicker
amiable                   bottleholder
aliquid                   porule
achromatin                diagraming
aircheck                  solifuge
antigenically             phrenetically

Maximum allowable length is 2

> ^a

Subtraction

The first regex relying on PCRE-style group-insertion shenanigans. We can take some inspiration from this StackOverflow: essentially, we can split the first string of x's into two captured groups, with the second string of xs being the first group and the resulting sum being the second.

^(.*)(.*)-\1=\2$
2. "Subtraction"
    Think addition, but not.

Match on all of these:                    But none of these:
----------------------------------------  -------------------------------------------
xxxxxxxxxxxxx-xxx=xxxxxxxxxx              xxx-xxxx=xxxxxxxxxx
xxxxxxx-x=xxxxxx                          xxxxxxxxxxxx-xxxxxxxxx=xx
xxxxxxxxxxxxxxxx-xxxxxxxxxxxxx=xxx        xxxx-xxxxxxxxxxxxx=
x-x=                                      xxxxxxxxxxxxxxx-xxxxxxx=xxxxxxx
xxxx-xxxx=                                xxxxxxxxxxx-xxxxxxx=xxxxxxxxxxxxxx
xx-=xx                                    xxxxxxxxxxxxxx-xxx=xxxxxxxxxxxx
xxxxxxxxxxx-xxxxx=xxxxxx                  xxxx-xxxx=xxxxxxxxxxxxxxxxxx
xxxxxxxxxx-xxxxx=xxxxx                    xxxxxxxxxxxxxxx-xxx=xxxxxxxxxxx
-=                                        xxxxxxxxxxxxx-xxxxxxx=
xxxxxxxxxxxxxxxxx-xxxxxxxxxxx=xxxxxx      xxxxxxx-xxxxxxxx=xxxxxxxxxxx
xxxxxxxxxxxxx-xxxxxxxxxxxxx=              x-xxxxxxxxxxxxxxxxxxxx=
xxxxxxxxxxxxxxxxxx-xxxxxxxxxxxxxxx=xxx    xxxxxxxxxxxxxxx-xxxxxxxxxxxxxxx=xxxxxxxxxx
xxxxxxxxxxx-xxxxxxxxxxx=                  xxxxx-xxxxxxxxxx=xx
xxxxxxxxxxxxxxxxx-xxxxx=xxxxxxxxxxxx      xxxxxxxxxxx-xxxxxxxxxxxxxx=xxxxxx
xxxxxxxxx-xx=xxxxxxx                      xxxxxxxxxxxxx-=xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxx-xxxxxxxxxxxx=xxxxxx    xxxxxxxx-xxxxxxxxxx=xxxxxxxxxxxxxxxxx
x-=x                                      xxxxxxx-xxxxxxx=xxxxxxxxxxxxxx
xxxxxxxx-xxxxx=xxx                        -xxxxxxxxxxxxxxxxxxxx=
xxxxxxxxxx-xxxxxx=xxxx                    xxxxxxxxxxxxxxxxxxxx-xxxxxxxxxxxxxxxx=x
xxxxxxxxxxxxxx-xxxxxx=xxxxxxxx            xxxxxxxx-xxxxxxxx=xxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxx-xxxxxxxxxxxxx=xxxxx    xxxxxx-xxxxxxxxxxxxxxxxxxx=xxxxxxxx
xxxxxxxxxxxxxxx-xxxx=xxxxxxxxxxx          xx-xxxxxxxxxxxxxxx=xx
xxxxxxxxxxxxxxxx-xxxxxxx=xxxxxxxxx        xxxxxxxx-xxxxxxxxxxxxxxxxxxxx=xxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxx-x=xxxxxxxxxxxxxxxxxx  xxxxxxxxx-xx=xxxxxxxxxxxxxxxxx

Maximum allowable length is 16

> ^(.*)(.*)-\1=\2$

Parity

The first regex relying on PCRE recursion shenanigans. We can again take inspiration from a StackOverflow post for this: ignoring the .NET solution using balancing groups, we can edit the recursive solution to use angled brackets and ignore text between brackets to trim down its length.

^(<(?1)*>)+$
3. "Parity"
    This one's supposed to be impossible in the general case.
    The tests will enforce a solution for the general case.

Match on all of these:                            But none of these:
------------------------------------------------  ------------------------------------------------
<>                                                <
<<>>                                              >
<><>                                              <>>
<<><>>                                            >><>>
<<<<>>>>                                          <><><<
<<<>>><>                                          <<>>><>
<<><><<>>>                                        <<>>>>><
<><<<><>>>                                        >><>>>><<
<<<><><>>>                                        <>><<>><><<<<>>><<
<<<<<>>><>>>                                      <<>>>>><>><<>><<><<
<<<<<<>>>>>>                                      >>>>><><><><<<>>><>>
<<<<<<<>>>>>>>                                    ><><<<><><>>>>>><<><<
<<<>><><<<>>>>                                    <<<<<><<<><>><>>><>>>><>>>
<<<<<<<<<<>>>>>>>>>>                              >>>><>>><><><<<<<>>><>><<><>>><><<>
<<<<<<<<<<>>>>>>>>>><>                            <>><><>><>><<><>><>>><><>>><><>><><>><
<><<<>><<>>><<<<>><>>>                            <<<><>>>>><<><<><><<<<>>><<<>>>>>>><>>>
<<><><><<<<<><<>>>>>><>>                          ><<>><<<<><><<<><>><<>><<><>>><>>>>>><<>
<<<<><>>><><><>><><><><>                          >>><<<<<<<<<<>>>><<<><<><>><<<><><><<<<>>
<<>><<>><<>><<<<<<<>>>>><>>>                      <<><>><<>><>>>>><>><><<><>>><<<<<><>><<<<><>
<<<<<<>>>>>><<<<<<<>>><>>>>>                      <>><<<><<><<<<<>>><><<<<<<<><><<>><><><<>>>>
<<<<>><><>>><<><<>><<<<>>><>>>                    ><><<<>>><<><<><>>><<<<>><>><><<>>><><<><<<<
<<<><<>><<<>><<<>>><>>>><<<>><>>                  ><><<<<><<<<<>><<><<><>>>><><<><><><>><<<><><
<><<><><><<<>>><<<>><>>><<<<><>>>>                >>><<>>><><<<<><><><<<<>>>><<<>>><>><<<<<<<<<>
<<<<<<>>><>><><><>><>><<<>><><<<>>>>              <>><><>>><<>>><>>><>>><<<<>>>><>>><<<<<>><>><>><
<<<>><><>><><<><><<<>>>><<<>>><<<><<>>>><><<<>>>  <>>>><<<><>><<><<<><<<><>>>>>><>>><>>>>><<<><<<>

Maximum allowable length is 12

> ^(<(?1)*>)+$

Topsy Turvy

Another recursive regex, this time finding palindromes. Because this is a well known / solved problem, we can just use the solution from this StackOverflow answer to pass.

^((.)(?1)\2|.?)$
4. "Topsy Turvy"
    This one's another famously impossible regex challenge.
    Have fun solving it!

Match on all of these:     But none of these:
-------------------------  ------------------------
i                          cashew
dad                        hamleteer
abba                       electromagnetic
kayak                      zimbabwe
refer                      xerox
civic                      epiphanizing
pullup                     detected
racecar                    tahinis
rotator                    foolproof
detartrated                sailboat
wassamassaw                racecars
tattarrattat               redeemer
satireveritas              woodrow
dogeeseseegod              bathtub
neveroddoreven             butterfly
madaminedenimadam          scarabs
anutforajaroftuna          alabama
wasitacaroracatisaw        yamaha
amanaplanacanalpanama      engage
satorarepotenetoperarotas  nonattributiveness

Maximum allowable length is 18

> ^((.)(?1)\2|.?)$

Multiplication

Since this isn't exactly a "well known regex problem", we'll have to cook up a solution ourselves. We can leverage PCRE capture groups and recursion to capture the first operand, then "paste" it on the right of the equals for each character in the second. After a bit of fanagling, we can come up with

^(x*)\*(x(?2)\1|=)$

and get the flag:

5. "Multiplication"
    Seems self-explanatory.

Match on all of these:                       But none of these:
-------------------------------------------  ------------------------------------------
xxx*xxxxx=xxxxxxxxxxxxxxx                    x*x=xx
xxxxx*xxxx=xxxxxxxxxxxxxxxxxxxx              xxxxxxxx*xxxx=xxxxxxxx
xxxxx*xxx=xxxxxxxxxxxxxxx                    x*xxxxxxxx=xxxxxxx
xx*xx=xxxx                                   xxxxxxx*xxxxx=xxxxxxxxx
xx*xxxx=xxxxxxxx                             xxxxxxxx*xxx=xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxx*xxxxxx=xxxxxxxxxxxxxxxxxx                xxxxx*xxxxx=xxxxxxxxxxxxxx
xxx*xxxxx=xxxxxxxxxxxxxxx                    xxx*xxxxxx=xxxx
xx*xxxxx=xxxxxxxxxx                          xx*xxxx=xxxxxxxxxx
xx*xxx=xxxxxx                                xxxx*xxxxx=xx
xxxxx*xxxxxx=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx  xxx*xxx=xxx
x*xx=xx                                      xxxx*xxxxxxxx=xxxxxxxxxxxxxxxxxxxx
xxxxxx*xxxx=xxxxxxxxxxxxxxxxxxxxxxxx         *x=x
x*x=x                                        xxxxxx*xxxxxxx=xxxxxxxxxxxxxxxxxxxxxxxxx
xxx*xxx=xxxxxxxxx                            xx*xxxxxx=xxxxxxxxx
xxxx*xxxxxx=xxxxxxxxxxxxxxxxxxxxxxxx         xxxxxxxx*xxxxxx=xxxxxxxxxxxxxxxxxxxxxx
xxxx*xx=xxxxxxxx                             xxxx*xxxxxxx=xxxxxxxxxxxxxxxxxxxxxxxxxx
*x=                                          xxxxxxxx*xx=xxxxxxxxxxxxxxx
xxxxx*xxxxxx=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx  x*xxxxxxx=x
xxx*xx=xxxxxx                                xx*x=x
xxxx*xxxxxx=xxxxxxxxxxxxxxxxxxxxxxxx         xx*xxxxx=xxxxxx

Maximum allowable length is 20

> ^(x*)\*(x(?2)\1|=)$


From the moment I understood the weakness of my code, it disgusted me. I craved the strength and certainty of ascii. I aspired to the purity of the Blessed Regex.

Your kind cling to your code, as if it will not have bugs and fail you. One day the crude boilerplate you call a script will wither, and you will beg my kind to save you. But I am already saved, for the Regex is immortal...

tjctf{even_in_death_I_serve_the_PCRE_Standard_3ceb7afc}
@VinhPham2106
Copy link

coolio

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment