David Rosson gartenfeld

@gartenfeld
gartenfeld / download_section_metadata.sh
Last active September 29, 2023 23:49
Downloading section JSON data
# First, set up ssh for Puhti
# Use ssh-copy-id to add your key
ssh-copy-id yourcscusername@puhti.csc.fi
# This way, you don't have to type your password for every command
# Make a .txt file with one file name (relative) per line
# Copy this file to remote
scp ./download_list.txt yourcscusername@puhti.csc.fi:.
# Log in
gartenfeld / syllable_frequency.txt
Created June 16, 2016 19:24
Frequency of Syllables in English
ðə 23038047
ə 18735224
ˈtu 12418461
ˈænd 11260533
ˈʌv 10968008
ɪn 10738928
li 5689929
ˈðæt 5416929
ˈaɪ 4683241
ˈfɔr 4678692
gartenfeld / multi_find.py
Last active November 8, 2022 02:38
BeautifulSoup with multiple criteria.
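# Match <div> elements whose class is either 'first' or 'second'
soup.find_all('div', {'class': ['first', 'second']})
# Match on tag name, id, and class together.
# For HTML documents, tag.get("class") returns a list, so test membership.
soup.find_all(lambda tag: tag.name == "div" and
                          tag.get("id") == "examples_box" and
                          "term-subsec" in tag.get("class", []))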
gartenfeld / arpabet.txt
Last active July 26, 2022 11:44
Frequency distribution of syllables using CMU dictionary and COCA.
AA ɑ
AA0 ɑ
AA1 ɑ
AA2 ɑ
AE æ
AE0 æ
AE1 æ
AE2 æ
AH ə
AH0 ə
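As a rough illustration of how a mapping like the one above could be applied, the sketch below converts a sequence of ARPAbet symbols to IPA. It covers only the rows visible in this preview. The function name `arpabet_to_ipa` and the choice to pass unknown symbols through unchanged are assumptions, not part of the gist.

```python
# Only the rows visible in the preview above; the full file contains more.
ARPABET_TO_IPA = {
    "AA": "ɑ", "AA0": "ɑ", "AA1": "ɑ", "AA2": "ɑ",
    "AE": "æ", "AE0": "æ", "AE1": "æ", "AE2": "æ",
    "AH": "ə", "AH0": "ə",
}


def arpabet_to_ipa(phonemes):
    """Map each ARPAbet symbol to IPA, leaving unknown symbols as-is (assumed behaviour)."""
    return "".join(ARPABET_TO_IPA.get(p, p) for p in phonemes)


if __name__ == "__main__":
    print(arpabet_to_ipa(["AE1", "AH0"]))
```

Stress digits are kept as separate keys here because the file lists them that way. Stripping the digit before lookup would be an alternative if stress is not needed.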
gartenfeld / mac_dictionary.sh
Created July 12, 2020 12:56
Search for word patterns
egrep ".*word$" /usr/share/dict/words
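For comparison, a minimal Python sketch of the same kind of search, applied to an in-memory list instead of `/usr/share/dict/words`. The helper name `match_words` and the sample words are made up for illustration.

```python
import re


def match_words(pattern, words):
    """Return the words in which `pattern` matches somewhere, as grep does per line."""
    regex = re.compile(pattern)
    return [w for w in words if regex.search(w)]


# Hypothetical sample; in practice the words would be read from a word-list file.
sample = ["password", "wordy", "sword", "keyword"]
print(match_words(r".*word$", sample))
```

Because the pattern ends with `$`, only entries that end in "word" are kept. The leading `.*` is redundant for a search-style match, but it mirrors the original command.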

Install MacPorts

Optionally update MacPorts: sudo port selfupdate

Download the port tree of HFST

  • Unpack the port tree
  • Add the port tree dir file:///.../hfst-macport/ to /opt/local/etc/macports/sources.conf
  • sudo port install hfst
gartenfeld / jwt_user.js
Last active May 3, 2020 15:09
User decoder
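const server = {}; // ...
const cookieParser = require('cookie-parser');
const jwt = require('jsonwebtoken');

// Parse cookies so that req.cookies is populated
server.express.use(cookieParser());

// Decode the JWT from the cookie and attach the user ID to the request
server.express.use((req, res, next) => {
  const { token } = req.cookies;
  if (token) {
    // jwt.verify throws if the token is invalid or expired
    const { userId } = jwt.verify(token, process.env.APP_JWT_SECRET);
    req.userId = userId;
  }
  next();
});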
gartenfeld / flat_to_mongo.py
Last active April 17, 2020 05:35
Import data from a flat file into MongoDB.
import sys
import re
import codecs  # Unicode support
# Note: Connection was removed in PyMongo 3.0; newer versions use MongoClient
from pymongo import Connection  # For the DB connection
from pymongo.errors import ConnectionFailure  # For catching exceptions

def main():
    # MongoDB connection
    try:
gartenfeld / de_word_final_ngrams.csv
Last active October 5, 2019 12:57
German Word-Final N-Grams
reverse, ending, length, is_word, ngram_weight, m_raw, f_raw, n_raw, m_freq, f_freq, n_freq, highest, predicted, correct, top
e, e, 1, 0, 473.2622, 306, 3562, 175, 0.0652, 0.8713, 0.0635, 0.8713, f, 3562, Woche, Seite, Frage
t, t, 1, 0, 416.4539, 1178, 1573, 873, 0.3479, 0.4234, 0.2287, 0.4234, f, 1573, Stadt, Arbeit, Welt
g, g, 1, 0, 369.4735, 727, 2307, 94, 0.3026, 0.676, 0.0214, 0.676, f, 2307, Regierung, Entscheidung, Zeitung
r, r, 1, 0, 362.7837, 2421, 336, 394, 0.7634, 0.1102, 0.1264, 0.7634, m, 2421, September, Oktober, November
re, er, 2, 0, 283.5243, 2063, 146, 280, 0.8272, 0.0535, 0.1193, 0.8272, m, 2063, September, Oktober, November
gn, ng, 2, 0, 278.8847, 197, 2301, 57, 0.0941, 0.8921, 0.0138, 0.8921, f, 2301, Regierung, Entscheidung, Zeitung
gnu, ung, 3, 0, 250.8859, 18, 2299, 0, 0.0093, 0.9907, 0, 0.9907, f, 2299, Regierung, Entscheidung, Zeitung
n, n, 1, 0, 234.9137, 671, 1019, 590, 0.297, 0.4714, 0.2316, 0.4714, f, 1019, Million, Information, Region
l, l, 1, 0, 136.552, 585, 175, 445, 0.4508,
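The header names 15 columns, but each complete row above has 17 fields because the last column, `top`, holds several example words. A minimal sketch of one way to read such a row, assuming the trailing fields all belong to `top`, follows. The function names are hypothetical and the sample row is copied from the preview.

```python
import csv
import io

HEADER = ("reverse, ending, length, is_word, ngram_weight, m_raw, f_raw, n_raw, "
          "m_freq, f_freq, n_freq, highest, predicted, correct, top")
ROW = ("gnu, ung, 3, 0, 250.8859, 18, 2299, 0, 0.0093, 0.9907, 0, 0.9907, "
       "f, 2299, Regierung, Entscheidung, Zeitung")


def parse_row(header_line, row_line):
    """Parse one row, folding all trailing fields into the last column (assumption)."""
    reader = csv.reader(io.StringIO(header_line + "\n" + row_line),
                        skipinitialspace=True)
    header, row = list(reader)
    n = len(header) - 1
    record = dict(zip(header[:n], row[:n]))
    record[header[n]] = row[n:]  # e.g. the list of example words under "top"
    return record


def most_frequent_gender(record):
    """Return 'm', 'f' or 'n', whichever has the largest *_freq value."""
    freqs = {g: float(record[g + "_freq"]) for g in "mfn"}
    return max(freqs, key=freqs.get)


if __name__ == "__main__":
    rec = parse_row(HEADER, ROW)
    print(rec["ending"], most_frequent_gender(rec), rec["top"])
```

In the rows shown, `predicted` coincides with the gender whose `*_freq` value is largest, and `highest` equals that value. Whether this holds for the whole file is not something the preview can confirm.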
gartenfeld / ffmpeg_timelapse.sh
Created May 31, 2019 10:04
Generate time-lapse
# 0.1 makes it 10x as fast
ffmpeg -i input.mp4 -filter:v "setpts=0.1*PTS" output.mp4
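The `setpts` multiplier is the reciprocal of the desired speed-up, which is why 0.1 gives 10x. A small sketch of that arithmetic, building the same command for an arbitrary speed-up without running it, follows. The function name `timelapse_command` is an assumption for illustration.

```python
import shlex


def timelapse_command(src, dst, speedup):
    """Build the ffmpeg argument list for a video sped up by `speedup` times."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    factor = 1 / speedup  # e.g. 10x faster -> timestamps scaled by 0.1
    return ["ffmpeg", "-i", src, "-filter:v", f"setpts={factor:g}*PTS", dst]


if __name__ == "__main__":
    # Print a shell-quoted version of the command; nothing is executed here.
    print(shlex.join(timelapse_command("input.mp4", "output.mp4", 10)))
```

The list form can be passed to `subprocess.run` directly, which avoids shell quoting issues with the `*` in the filter expression.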