Robyn Speer rspeer

View how-to-make-a-racist-ai-without-really-trying.ipynb
View snekmaze2.p8
function _init()
-- tiles to move per frame
-- don't make this more than 1
fstep = 1/8
-- step counter
-- it can overflow, that's fine
step = 0
trailpos = 0
This file contains code that, when run on Python 2.7.5 or earlier, creates
a string that should not exist: u'\Udeadbeef'. That's a single "character"
that's illegal in Python because it's outside the valid Unicode range.
It then uses it to crash various things in the Python standard library and
corrupt a database.
On Python 3... well, this file is full of syntax errors on Python 3. But
if you were to change the print statements and byte literals and stuff:
rspeer / countmerge.awk
Last active Jun 20, 2018
Given a sorted file where each line is a key and a count, merge adjacent lines with the same key by adding their counts.
View countmerge.awk
# Given a tab-separated, sorted file where each line is a key and a count,
# merge adjacent lines with the same key by adding their counts.
# Initialize the current count.
# We use the empty string as a sentinel value, indicating that we haven't
# seen a key yet. We won't output a total for the empty string.
key = ""
count = 0
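
For comparison, the same idea (summing the counts of adjacent lines that share a key) can be sketched in Python. This is not the gist's awk code, only a minimal illustration that assumes tab-separated input, integer counts, and keys that contain no tab; the name `merge_counts` is hypothetical.

```python
from itertools import groupby


def merge_counts(lines):
    """Yield (key, total) for each run of adjacent lines sharing a key.

    Assumes each line looks like "key<TAB>count" with an integer count,
    and that the input is already sorted so equal keys are adjacent.
    """
    # Split every non-blank line into [key, count].
    parsed = (line.rstrip("\n").split("\t") for line in lines if line.strip())
    # groupby only merges *adjacent* equal keys, which matches sorted input.
    for key, group in groupby(parsed, key=lambda fields: fields[0]):
        yield key, sum(int(fields[1]) for fields in group)


if __name__ == "__main__":
    sample = ["apple\t2\n", "apple\t3\n", "banana\t1\n"]
    for key, total in merge_counts(sample):
        print(f"{key}\t{total}")
```

Because `groupby` only looks at neighbouring items, unsorted input would produce several totals for the same key, which is why the sorted-file precondition matters.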
>>> from wordfreq import tokenize, word_frequency
>>> tokenize('电影放映机', 'zh')
['电影', '放映', '机']
>>> word_frequency('电影放映机', 'zh')
>>> word_frequency('programme', 'en')
/* This Rust code scans through the Common Crawl, looking for text that's
* not English. I suspect I may learn much later that it's terrible,
* unidiomatic Rust, but it would take me months to learn what good Rust is.
* We depend on some external libraries:
* - html5ever: an HTML parser (we only use its low-level tokenizer)
* - encoding: handles text in all the encodings that WHATWG recognizes
* - string_cache: interns a bunch of frequently-used strings, like tag names -- necessary to use
* the html5ever tokenizer
rspeer / aaaa.html
Created Mar 14, 2016
Overflowing the stack of Text.HTML.TagSoup with a straightforward HTML file
View aaaa.html
View dominion-rnn-cards.txt
$3, Action
Trash this card. If you do, gain a Silver per 5 cards it, and put them into your hand.
$5, Action, Duration
View description.txt
^ marks the name of the card.
The column with all the @ signs indicates the cost and type. I probably missed some because I was impatiently editing a file I had already.
A = Action, T = Treasure, V = Victory, a = Attack, R = Reaction, v = traVeler, D = Duration, E = Event, r = Ruins.
| indicates a line break, and --- indicates a horizontal line.
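
The one-letter type codes listed above can be kept in a small lookup table. The sketch below is an illustration only: the mapping comes from the description, but the assumption that a card's types appear as a plain run of letters (such as `Aa`) and the names `CARD_TYPE_CODES` and `decode_types` are hypothetical.

```python
# Mapping of one-letter codes to card types, as given in the description.
CARD_TYPE_CODES = {
    "A": "Action",
    "T": "Treasure",
    "V": "Victory",
    "a": "Attack",
    "R": "Reaction",
    "v": "Traveler",
    "D": "Duration",
    "E": "Event",
    "r": "Ruins",
}


def decode_types(codes):
    """Translate a string of type codes into a list of type names.

    Assumes `codes` is a plain run of letters; an unknown letter
    raises KeyError rather than being silently skipped.
    """
    return [CARD_TYPE_CODES[code] for code in codes]


if __name__ == "__main__":
    print(decode_types("Aa"))
```

The codes are case-sensitive, so `a` (Attack) and `A` (Action) must not be normalized before lookup.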
>>> import wordfreq, langcodes
>>> def legible_list(lst):
...     return('\N{LEFT-TO-RIGHT MARK}, '.join(lst))
>>> for lang in sorted(wordfreq.available_languages()):
...     language_name = langcodes.get(lang).language_name('en')
...     top_ten = legible_list(wordfreq.top_n_list(lang, 10))
...     print('%-3s %-12s %s' % (lang, language_name, top_ten))