Kevin Scannell kscanne
@kscanne
kscanne / neamhchaighdeanach.txt
Created October 16, 2023 13:07
Variant verbal nouns in FGB, by frequency (131 million words in this corpus)
feiscint 3038
leanacht 2425
cuartú 1373
áiteamh 1366
inseacht 1306
geallúint 1259
tógaint 876
leanstan 844
pilleadh 828
toiseacht 788
@kscanne
kscanne / varf-in-fgb
Last active October 5, 2023 11:32
Variant feminine
aga^1, m.
aicearra, m.
áiméar, m.
ainm, m.
aiséirí, m.
aistear, m.
aiteann, m.
ancaire^1, m.
anfa, m.
ara^2, m.
cat seal/ig7 | egrep '^[^ ]+[.] n[bf]' | sed 's/[.].*//' | sort | uniq -c | sort -r -n | egrep -v ' 1 ' | sed 's/^ *[0-9]* //' | sort
ábhar
acht
acra
aingeal
aird
aire
airí
aiteacht
aithne
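As a rough illustration of what the shell pipeline above computes, here is a minimal Python sketch. It assumes each input line has the form `headword. tag` (for example `ábhar. nf1`), which is inferred from the regular expressions in the command and not confirmed by the gist. The function name `repeated_noun_headwords` and the sample tags are hypothetical.

```python
import re
from collections import Counter

# Same test as the egrep step: a headword with no spaces, a dot,
# a space, then a tag starting with "nb" or "nf".
ENTRY = re.compile(r"^[^ ]+[.] n[bf]")


def repeated_noun_headwords(lines):
    """Return headwords that occur on more than one nb/nf line."""
    counts = Counter(
        line.split(".", 1)[0]      # like sed 's/[.].*//': keep text before the first dot
        for line in lines
        if ENTRY.match(line)
    )
    # Keep only headwords seen at least twice (the pipeline drops count 1).
    return sorted(word for word, n in counts.items() if n > 1)


if __name__ == "__main__":
    # Made-up sample lines; the tags here are illustrative only.
    sample = [
        "ábhar. nf1",
        "ábhar. nf1",
        "acht. nf3",
        "acht. nb2",
        "aiseach. a1",
        "ama. nf4",
    ]
    for word in repeated_noun_headwords(sample):
        print(word)
```

One difference to keep in mind: Python's `sorted` orders strings by code point, so accented headwords sort after unaccented ones, whereas the shell `sort` follows the active locale.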
@kscanne
kscanne / gan.txt
Created December 14, 2022 15:29
A conversation with ChatGPT about Irish grammar
KPS:
Should I lenite a proper name in Irish after the preposition "gan"? For example, should it be "Chuaigh muid ann gan Cáit" or "Chuaigh muid ann gan Cháit"?
ChatGPT:
In Irish, the general rule is that a proper noun (such as a person's name) should not be lenited after the preposition "gan." So in your example sentence, it should be "Chuaigh muid ann gan Cáit."
KPS:
### Keybase proof
I hereby claim:
* I am kscanne on github.
* I am kscanne (https://keybase.io/kscanne) on keybase.
* I have a public key ASDwIxpS-I6l0azzeS0OPCEMq2aNsYg7JTTf6c17f-Zpsgo
To claim this, I am signing this object:
@kscanne
kscanne / feature-upos.txt
Created April 25, 2021 13:46
Permitted UPOS/Feature combinations for Irish treebank
$ egrep -h '^[0-9]+' *.conllu | cut -f 6 | egrep '..' | tr "|" "\n" | sed 's/=.*//' | sort -u | while read x; do echo; echo "$x..."; egrep "^[0-9]+.*[^A-Za-z]$x=" *.conllu | cut -f 4 | sort | uniq -c | sort -r -n; done
Abbr...
300 PROPN
101 NOUN
18 ADV
18 ADJ
5 X
4 SYM
4 NUM
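The same tally can be sketched in Python for readers who prefer not to decode the one-liner. This is a minimal sketch under stated assumptions: it relies on the standard CoNLL-U column order (UPOS in column 4, FEATS in column 6), and it looks for feature names only in the FEATS column, whereas the shell command matches `name=` anywhere on the line. The function name `upos_by_feature` is hypothetical.

```python
from collections import Counter, defaultdict


def upos_by_feature(lines):
    """Map each feature name to a Counter of the UPOS tags it occurs with."""
    table = defaultdict(Counter)
    for line in lines:
        # Token lines start with a digit; comments and blank lines do not.
        if not line[:1].isdigit():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 6 or cols[5] == "_":
            continue                       # no features on this token
        upos, feats = cols[3], cols[5]
        for pair in feats.split("|"):      # e.g. "Abbr=Yes|Number=Sing"
            name = pair.split("=", 1)[0]
            table[name][upos] += 1
    return table


def print_table(table):
    """Print counts per feature, most frequent UPOS first."""
    for name in sorted(table):
        print()
        print(f"{name}...")
        for upos, n in table[name].most_common():
            print(f"{n} {upos}")
```

To run it over a treebank, one could pass the lines of each `.conllu` file, for example `upos_by_feature(open(path, encoding="utf-8"))`, and then call `print_table` on the result.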
@kscanne
kscanne / fixed
Created April 15, 2021 18:39
Possibly need fixed deprel
ga_idt-ud-dev.conllu
Line 664 — i gcuideachta
Line 1781 — D' ainneoin
Line 3957 — Le linn
Line 6511 — i láthair
Line 7657 — In aice
Line 7692 — i gcaitheamh
Line 11020 — i measc
Line 11295 — i measc
@kscanne
kscanne / frith-aidiachtai.txt
Created November 20, 2020 18:49
Antonyms
aontreoch —— déthreo
ar fónamh, folláin —— breoite, easlán, i do luí tinn, othrasach, tinn, tinnlag, éagrua
ainhidriúil —— hidriúil
dofheicthe —— ar amharc, feicseanach, infheicthe, le feiceáil, ris, sofheicthe
buan, gan stad, leanúnach, seasta —— bearnach, briste, neamhleanúnach
gainmheach —— créúil, marlach
neamhghairmiúil, tuata —— gairmiúil, proifisiúnta
corr, corraiceach —— cothrom, réidh
slán, sábháilte, treabhar —— contúirteach, eisinnill, i gcontúirt, i mbaol, neamhshlán, éadaingean, éislinneach
dofhaighte —— ann, ar fáil, faoi réir, infhaighte, inúsáide
ábhann. nf1
agar. nf1
aireamh. nf1
áirí. nb4
airleacán. nf1
aiseach. a1
aithneach. a1
aitiú. nf
ama. nf4
ana. nb4
@kscanne
kscanne / litreacha.txt
Created January 8, 2020 08:45
Letter frequency
88966225 a
54840907 i
44737799 h
44507829 n
33382025 r
31318119 e
28412299 t
28054379 s
25577640 c
23458466 o
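A table in this `count letter` layout could be produced along the following lines. This is only a sketch: the gist does not say how the counts were computed, so the choices here (lowercasing the text, counting accented vowels such as `á` separately from `a`, ignoring non-letters) are assumptions, and the function names are hypothetical.

```python
from collections import Counter


def letter_frequencies(text):
    """Count alphabetic characters after lowercasing; accents are kept distinct."""
    return Counter(ch for ch in text.lower() if ch.isalpha())


def format_counts(counts):
    """Render counts as 'count letter' lines, most frequent first."""
    return [f"{n} {letter}" for letter, n in counts.most_common()]


if __name__ == "__main__":
    # Sentence taken from the grammar question quoted earlier on this page.
    counts = letter_frequencies("Chuaigh muid ann gan Cáit")
    print("\n".join(format_counts(counts)))
```

For a corpus too large to hold in memory, the same `Counter` can be updated line by line with `counts.update(...)` instead of reading the whole text at once.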