Skip to content

Instantly share code, notes, and snippets.

@jejones3141
Last active January 19, 2020 02:09
Show Gist options
  • Save jejones3141/34720b687dcb96551f613cfc40e7bb93 to your computer and use it in GitHub Desktop.
Save jejones3141/34720b687dcb96551f613cfc40e7bb93 to your computer and use it in GitHub Desktop.
Solution for Jan 3 2020 fivethirtyeight.com Riddler Classic
# Jan 3 2020 Riddler Classic -- find "Spelling Bee" puzzle w/max total word score
# Here we're trying to take Peter Norvig's solution a step further by noting that
# the subsets (independent of a particular honeycomb) can be precalculated, too.
# Does that turn out faster than running combinations()? We'll see.
# We're now representing the center by its index in the honeycomb string.
from collections import Counter
def signature(w):
return ''.join(sorted(set(w)))
def isPangram(sig):
return len(sig) == 7
# Word and signature length constraints are inherent in the game.
# The Riddler added the "no 's'" requirement.
def wordConstraints(w):
return len(w) > 3 and "s" not in w and len(signature(w)) <= 7
def wordScore(w, sig):
lw = len(w)
return 1 if lw == 4 else (lw + (7 if isPangram(sig) else 0))
def runEncode(bitmap):
runs = []
len = 0
for i in range(7):
if bitmap & (1 << i):
if len == 0:
pos = i
len += 1
elif len > 0:
runs.append((pos, pos + len))
len = 0
if len > 0:
runs.append((pos, pos + len))
return runs
def rleDict():
rd = {}
for center in range(7):
centerBit = 1 << center
lsbMask = centerBit - 1
msbMask = 63 ^ (centerBit | lsbMask)
rd[center] = [runEncode((n & msbMask) << 1 | centerBit | (n & lsbMask))
for n in range(64)]
return rd
def subsets(puzzle, rd):
(honeycomb, center) = puzzle
for rle in rd[center]:
yield ''.join(honeycomb[m : n] for (m, n) in rle)
def bestPuzzle():
sp = Counter()
sigs = set()
with open("enable1.txt", "r") as f:
for w in filter(wordConstraints, f.read().splitlines()):
sig = signature(w)
sigs.add(sig)
sp[sig] += wordScore(w, sig)
rd = rleDict()
def gameScore(puzzle):
return sum(sp[s] for s in subsets(puzzle, rd))
puzzles = ((h, c) for h in filter(isPangram, sigs) for c in range(7))
return max(((gameScore(p), p) for p in puzzles), key = lambda x:x[0])
@jejones3141
Copy link
Author

We now no longer presume the best score is necessarily a pangram--for example, the input might contain an attempt to transcribe a scream: aaaaaaaaaaaaaaaaaaaaaaaaaah!

@jejones3141
Copy link
Author

OK... we think we understand now how puzzles are scored, and we merge word lists for signatures such that one is a prefix of another.

@jejones3141
Copy link
Author

Added comments, and tried to be more concise and Pythonic (though we did define a function with a lambda; sorry, Guido).

@jejones3141
Copy link
Author

We've given up on merging, at least for now, and chug through the possible (string, center) pairs looking for the highest scoring pair.

@jejones3141
Copy link
Author

I should've created a repository for this one, but I thought it would be easy, so into a gist it went.
This is how I'll leave it for now; it's been fed through the Black Python formatter. My goal was to write something clean and concise to solve the specified problem. It will take some tweaks and hoisting of functions to allow experimentation.

History: after a very brief fling with grinding through 7 * (25 choose 7) possible puzzles, I decided to go the other way, starting with creating a dictionary with what I called a word "signature"--the word with duplicate letters deleted--as key. The corresponding value was a list of all the words with that signature. The idea I had was to merge dictionary entries where possible, but the center gets in the way. I was still beating my head against that when a friend pointed me at Assistant Professor Laurent Lessard's solution, and I took his approach, iterating over the possible puzzle strings and possible center characters.

It works out quite nicely and the dictionary proved useful. Rather than testing individual words, you can test a signature--if it passes muster, all words with that signature will work. For what it's worth, my desktop computer has an AMD FX-8370. I haven't had the patience to run it repeatedly to get an average run time; individual runs vary. The last run I did took 3:40.

@jejones3141
Copy link
Author

As the Bond movie title says, "Never Say Never Again". I saw the Hacker News post about the Python profiler Scalene (emeryberger/scalene right here on GitHub), and I had to look again. The overwhelming majority of the time taken is in the assignment to matches, so I did another thing Lessard did, namely notice once ahead of time for each center which signatures include it (well, he does words instead of signatures).

The time mentioned in a previous comment came from printing asctime() before and after a call to bestPuzzle(). Looking at the times from scalene (I had to add a line with a call to bestPuzzle() so there was something to measure), the previous version, according to scalene, took 326 seconds, or 5:26. The current version, again according to scalene, takes 218 seconds, or 3:38. (Profilers do slow programs down, but scalene's overhead is impressively low.)

@jejones3141
Copy link
Author

OK, this is the last version in Python for me. Peter Norvig did a seriously good job of speeding things up. His code finds the Riddler solution in a shade under two seconds. The major insight is that if you create a dictionary mapping letter sets (what we called the "signature" above) to the sum of values of all the words with that letter set, you can save a bunch of work later.

The current version of the gist uses Norvig's method, except... it occurred to me that you could precalculate the subsets, too, though they'd have to have a form not tied to a particular honeycomb. Would that save any more time?

The answer: I have yet to come up with a good way to take those precalculated sets and a honeycomb and generate the pertinent letter sets that's faster than just using itertools.combinations(), no precalculations, as Norvig does. One can use the above code to solve the Riddler Classic problem... but, at least on my computer, according to the scalene profiler, it takes 6.34 seconds.

Maybe a Go version...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment