James Tauber jtauber

## gist:8b2fa862d43ecedf7536
# frustratingly, we need to have "real" gcc to build the i386-elf binutils/gcc

brew install gcc

# $PROJECT_HOME should be the parent directory under which you'll download
# everything and build the toolchain

cd "$PROJECT_HOME"
mkdir toolchain

## keybase.md

      
jtauber / keybase.md
Created January 19, 2021 16:16

Keybase proof

I hereby claim:

I am jtauber on github.
I am jtauber (https://keybase.io/jtauber) on keybase.
I have a public key ASBKXRd38Pg3fZKJJkLQ9TG3sLxxs17UAcX1zhjDiL3cpQo

To claim this, I am signing this object:

  
## .block
license: mit

## old-norse-on-macos.md

      
jtauber / old-norse-on-macos.md
Last active October 14, 2018 18:46

Moved to https://digitaltolkien.com/old-norse-keyboard-guide/

  
## gist:ed07e0fd15ecdc5394755d3e0c9304f8
import unicodedata

# Combining marks for the three Greek accents.
VARIA = "\u0300"
OXIA = "\u0301"
PERISPOMENI = "\u0342"

ACCENTS = [VARIA, OXIA, PERISPOMENI]


def strip_accents(s):
    # Decompose, drop the accent marks, then recompose.
    return unicodedata.normalize("NFKC", "".join(
        c for c in unicodedata.normalize("NFD", s) if c not in ACCENTS
    ))

## tokenize_01.py
# Opens the file with the given filename for reading and puts the resultant
# file object in the variable `f`.
f = open("OCR Output linebreaks removed.txt")

# `f.read()` reads the file and returns a string.
# `.split()` splits that string on whitespace and returns a list of strings.
# `for A in B:` iterates over the list B and runs the indented block with each
# list item in the variable A.
for token in f.read().split():
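
The preview stops at the `for` line, so the loop body is not shown. As a minimal sketch of the same read-and-split pattern, the block below prints each token. The helper name `tokens_in_file` and the temporary sample file are assumptions made so the sketch runs anywhere; they are not part of the original script.

```python
import tempfile
from pathlib import Path


def tokens_in_file(path):
    """Return the whitespace-separated tokens of the file at `path`."""
    # A `with` block closes the file even if reading fails.
    with open(path, encoding="utf-8") as f:
        # `.split()` with no argument splits on any run of whitespace,
        # including newlines and tabs, and never yields empty strings.
        return f.read().split()


# Hypothetical stand-in for the OCR output file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("A duck made her nest\nunder some   leaves.\n", encoding="utf-8")
    for token in tokens_in_file(sample):
        print(token)
```

Punctuation stays attached to the neighbouring word (`leaves.` is one token), because splitting happens on whitespace only.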

## recipes.py
import unicodedata

### strip specific accents

def strip_accents(w):
    return unicodedata.normalize("NFC", "".join(
        ch
        for ch in unicodedata.normalize("NFD", w)
        if ch not in ["\u0300", "\u0301", "\u0342"]
    ))
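
A short usage sketch of the recipe above, restated so that it runs on its own. The sample words are illustrative choices, not taken from the gist. Only the three listed combining marks are removed, so other diacritics such as breathing marks survive the round trip through NFD and NFC.

```python
import unicodedata

# The three combining marks the recipe removes:
# grave (varia), acute (oxia) and perispomeni.
ACCENTS = {"\u0300", "\u0301", "\u0342"}


def strip_accents(w):
    # NFD splits each letter from its combining marks.
    decomposed = unicodedata.normalize("NFD", w)
    # Drop the accents, then recompose whatever marks remain with NFC.
    kept = "".join(ch for ch in decomposed if ch not in ACCENTS)
    return unicodedata.normalize("NFC", kept)


# Illustrative sample words (an assumption, not from the original gist).
for word in ["λόγος", "τὸν", "πνεῦμα", "ἄνθρωπος"]:
    print(word, "->", strip_accents(word))
```

In the last word the acute is stripped but the smooth breathing on the alpha is kept, which is the point of filtering specific marks rather than all combining characters.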


## mean_log_frequency.py
#!/usr/bin/env python3

from collections import defaultdict
from math import log

from pysblgnt import morphgnt_rows

items_by_target = defaultdict(list)
count_by_item = defaultdict(int)
total_item_count = 0
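
The preview ends before any computation, so the following is only a guess at what the three accumulators are for, not the script's actual logic. It assumes the input is a sequence of `(target, item)` pairs (for example a verse reference and a lemma) and takes the mean natural log of each item's relative frequency per target. The function name `mean_log_frequency` and the `rows` argument are hypothetical, and no `pysblgnt` call is used.

```python
from collections import defaultdict
from math import log


def mean_log_frequency(rows):
    """Map each target to the mean log relative frequency of its items.

    `rows` is an iterable of (target, item) pairs (an assumed input shape).
    """
    items_by_target = defaultdict(list)
    count_by_item = defaultdict(int)
    total_item_count = 0

    # First pass: group items by target and count every item occurrence.
    for target, item in rows:
        items_by_target[target].append(item)
        count_by_item[item] += 1
        total_item_count += 1

    # Second pass: average log(count / total) over each target's items.
    return {
        target: sum(log(count_by_item[i] / total_item_count) for i in items) / len(items)
        for target, items in items_by_target.items()
    }


# Hypothetical toy data: "a" occurs 3 times out of 4, "b" once.
sample = [("t1", "a"), ("t1", "a"), ("t2", "a"), ("t2", "b")]
for target, score in mean_log_frequency(sample).items():
    print(target, round(score, 3))
```

Under this reading, a target made up of common items scores closer to zero, and one with rare items scores more negative.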

## 01_the_ugly_duckling.txt
The Ugly Duckling.

A duck made her nest under some leaves.
She sat on the eggs to keep them warm.
At last the eggs broke, one after the other. Little ducks came out.
Only one egg was left. It was a very large one.
At last it broke, and out came a big, ugly duckling.
"What a big duckling!" said the old duck. "He does not look like us. Can he be a turkey?--We will see. If he does not like the water, he is not a duck."

The next day the mother duck took her ducklings to the pond.

## gist:6331304

Matthew

5997c5997
< 40011026003 N- ----VSM- πατήρ, πατήρ
---
> 40011026003 N- ----NSM- πατήρ, πατήρ
10483c10483
< 40018012006 RI ----DSM- τινι τις
---