## Given a Google Drive file id and the filename it should be saved as, fetch and save the file.
fileid="$1"
filename="$2"
curl -c ./cookie -s -L "https://drive.google.com/uc?export=download&id=${fileid}" > /dev/null
curl -Lb ./cookie "https://drive.google.com/uc?export=download&confirm=$(awk '/download/ {print $NF}' ./cookie)&id=${fileid}" -o "${filename}"
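The second request depends on pulling a confirmation token out of the cookie jar, which the script does by printing the last field of a line that mentions "download". A minimal Python sketch of that extraction step alone follows. The function names `confirm_token` and `download_url` and the sample cookie line are hypothetical, and the sketch does not check whether Google Drive still sets such a cookie.

```python
from urllib.parse import urlencode


def confirm_token(cookie_text):
    """Return the last whitespace-separated field of the first line that
    contains 'download', or None if no line matches.

    This mirrors awk '/download/ {print $NF}', except that awk would print
    the last field of every matching line, not only the first.
    """
    for line in cookie_text.splitlines():
        if "download" in line:
            fields = line.split()
            if fields:
                return fields[-1]
    return None


def download_url(fileid, token):
    """Build the second-request URL used by the script above."""
    query = urlencode({"export": "download", "confirm": token, "id": fileid})
    return "https://drive.google.com/uc?" + query


# Hypothetical cookie-jar line in the tab-separated format curl writes.
sample = "# Netscape HTTP Cookie File\n" \
         ".drive.google.com\tTRUE\t/uc\tTRUE\t0\tdownload_warning_123\tAbCd\n"
token = confirm_token(sample)
url = download_url("FILEID", token)
```

Returning `None` when nothing matches makes the "no token" case explicit, whereas the shell version silently substitutes an empty string.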
apertium-kaz$ echo "Біздің компания да осы процеске белсене қатысқысы келеді." | apertium-destxt -n | apertium -f none -d . kaz-morph | cg-conv -la | apertium-retxt | python3 ~/src/sourceforge-apertium/branches/kaz-tagger/kaz_tagger.py | vislcg3 -g apertium-kaz.kaz.rlx | python3 ../ud-scripts/vislcg3-to-conllu.py "" 2> /dev/null | python3 ../ud-scripts/conllu-feats.py apertium-kaz.kaz.udx 2> /dev/null | python3 ../ud-scripts/conllu-nospaceafter.py 2> /dev/null
# sent_id = :1:0
# text = Біздің компания да осы процеске белсене қатысқысы келеді.
1	Біздің	біз	NOUN	n	Case=Gen	2	nmod:poss	_	_
2-3	компания да	_	_	_	_	_	_	_	_
2	компания	компания	NOUN	n	Case=Nom	7	nsubj	_	_
3	да	да	ADV	postadv	_	7	X	_	_
4	осы	осы	NOUN	n	Case=Nom	5	obj	_	_
5	процеске	процесс	NOUN	n	Case=Dat	7	obl	_	_
6	белсене	белсен	VERB	v	Aspect=Imp|VerbForm=Cov	7	X	_	_
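The pipeline's output is CoNLL-U, which specifies ten tab-separated columns per token line (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC). As a rough illustration of how such lines can be read back, here is a small sketch. The names `Token`, `parse_token_line`, `parse_feats`, and `is_multiword_range` are hypothetical and are not taken from the ud-scripts used above.

```python
from collections import namedtuple

# Column names follow the CoNLL-U specification.
FIELDS = ["id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc"]
Token = namedtuple("Token", FIELDS)


def parse_token_line(line):
    """Split one CoNLL-U token line into its ten tab-separated columns."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != len(FIELDS):
        raise ValueError("expected 10 columns, got %d" % len(cols))
    return Token(*cols)


def parse_feats(feats):
    """Turn 'Aspect=Imp|VerbForm=Cov' into a dict; '_' means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def is_multiword_range(token):
    """IDs such as '2-3' mark a surface token spanning several words."""
    return "-" in token.id


verb = parse_token_line(
    "6\tбелсене\tбелсен\tVERB\tv\tAspect=Imp|VerbForm=Cov\t7\tX\t_\t_")
span = parse_token_line("2-3\tкомпания да\t_\t_\t_\t_\t_\t_\t_\t_")
```

Splitting strictly on tabs matters here because a FORM such as `компания да` contains a space.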
#!/usr/bin/env python3
# Compatible with Python 2.7 and 3.2+, can be used either as a module
# or a standalone executable.
#
# Copyright 2017, 2018 Institute of Formal and Applied Linguistics (UFAL),
# Faculty of Mathematics and Physics, Charles University, Czech Republic.
#
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this
## A script to scrape all listings on this site:
## https://www.point2homes.com/US/Land-For-Sale/NH/Coos-County.html
##
## into the following csv format:
##
## Name,Address,Amount,Acres,Type,Misc
##
## e.g.
## Name,Address,Amount,Acres,Type,Misc
## "L52 Cloutier, Stark, NH","Stark, NH","$27,500","5.16","5 days on Point2 Homes"
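Because names, addresses, and amounts contain commas, every value in a data row has to be quoted for the file to stay parseable. The following sketch shows one way to produce that layout with Python's `csv` module. The function `listings_to_csv` and the sample listing are hypothetical placeholders, and the scraping step itself is not shown.

```python
import csv
import io

COLUMNS = ["Name", "Address", "Amount", "Acres", "Type", "Misc"]


def listings_to_csv(listings):
    """Serialise a list of dicts into the CSV layout described above:
    an unquoted header line followed by fully quoted data rows."""
    buf = io.StringIO()
    buf.write(",".join(COLUMNS) + "\n")
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for listing in listings:
        # Missing keys become empty fields so every row has six columns.
        writer.writerow([listing.get(col, "") for col in COLUMNS])
    return buf.getvalue()


# Hypothetical listing, not taken from the site.
example = [{
    "Name": "Example Lot, Sometown, NH",
    "Address": "Sometown, NH",
    "Amount": "$10,000",
    "Acres": "1.00",
    "Type": "Land",
}]
output = listings_to_csv(example)
```

Filling missing keys with empty strings keeps the column count fixed, which makes rows with absent values easier to spot downstream.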
#lang rash
;; ASSUME: - this script is placed into apertium-all/
;; - hfst-covtest is in the PATH
;; USAGE: racket apertium-turkic-bilingual-stats.rkt > /tmp/bilingual 2>&1
;; REQUIRES: racket
;; rash (install with "raco pkg install rash")
(provide MONOLINGUAL BILINGUAL)
nog: commit 6f65e512b45e04ef9f177ea8e1adf6ba26cb648e
stems: 1367
bible coverage
Number of tokenised words in the corpus: 189329
Coverage: 81.88%
Top unknown words in the corpus:
343 Масих
341 а
306 Раббий
233 Кие
#!/usr/bin/env bash
## Downloads a "pages-articles-multistream.xml.bz2" Wikipedia dump:
## - for the language LANG (iso2 or iso3 code),
## - from day DATE (in yyyymmdd format or "latest")
## makes a frequency list out of it,
## measures MODE's coverage on that frequency list,
## and compares it with the coverage of the previous revision of that mode.
##
## USAGE: ./test-cov-on-wiki.sh <lang> <date> <mode>
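The frequency-list and coverage steps described in the header can be sketched in a few lines of Python. This is only an illustration under assumptions: the names `frequency_list`, `coverage`, and `coverage_change` are hypothetical, the regex tokenisation is a stand-in for whatever the script actually does, and `is_known` is a placeholder for running the analyser of a given mode.

```python
import re
from collections import Counter


def frequency_list(text):
    """Count word tokens in plain text (simple regex tokenisation,
    an assumption made for this sketch)."""
    return Counter(re.findall(r"\w+", text))


def coverage(freq, is_known):
    """Percentage of corpus tokens whose word form is_known() accepts."""
    total = sum(freq.values())
    if total == 0:
        return 0.0
    known = sum(count for word, count in freq.items() if is_known(word))
    return 100.0 * known / total


def coverage_change(freq, old_is_known, new_is_known):
    """Coverage of the new revision minus coverage of the old one."""
    return coverage(freq, new_is_known) - coverage(freq, old_is_known)


freq = frequency_list("a a b b c c d d")
# Sets of known forms stand in for two revisions of an analyser.
old_revision = {"a"}
new_revision = {"a", "b"}
delta = coverage_change(freq, old_revision.__contains__,
                        new_revision.__contains__)
```

Working from the frequency list means each distinct form is checked once and weighted by its count, which is cheaper than analysing every token of the dump.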
## A little script to test morphology/morphophonology, originally written by spectie
## for apertium-chv.
##
## USAGE: python3 test.py <lang code>
##        python3 test.py <lang code> <tsv file>
## where tsv file is a tsv file with three columns:
## 1. direction restriction, which is one of _ (no restriction), > (test generation only) or
##    < (test analysis only).
## 2. lexical form
## 3. surface form
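To make the three-column format concrete, the sketch below parses one test line and decides which directions should be exercised. It is not the body of test.py, which is not shown here. The names `parse_test_line` and `directions_to_test` and the sample forms are hypothetical.

```python
VALID_DIRECTIONS = ("_", ">", "<")


def parse_test_line(line):
    """Split a TSV test line into (direction, lexical form, surface form)."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 3:
        raise ValueError("expected 3 columns, got %d" % len(cols))
    direction, lexical, surface = cols
    if direction not in VALID_DIRECTIONS:
        raise ValueError("unknown direction restriction: %r" % direction)
    return direction, lexical, surface


def directions_to_test(direction):
    """Map the restriction symbol to the checks that should run:
    '_' runs both, '>' generation only, '<' analysis only."""
    return {
        "generation": direction in ("_", ">"),
        "analysis": direction in ("_", "<"),
    }


# Hypothetical test line; the forms are placeholders, not real test data.
direction, lexical, surface = parse_test_line(">\tword<n><pl>\twords\n")
checks = directions_to_test(direction)
```

Rejecting unknown restriction symbols early keeps a typo in the first column from silently disabling a test.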
#!/usr/bin/env bash
git clone https://github.com/IlnarSelimcan/apertium-quality.git
cd apertium-quality/mwtools/python3
sudo python3 ./setup.py install
cd ../../
./autogen.sh && make && sudo make install
cd ../../../
#!/usr/bin/env bash
git clone https://github.com/apertium/apertium-quality.git
cd apertium-quality/mwtools/python3
sudo python3 ./setup.py install
cd ../../
./autogen.sh && make && sudo make install
cd ../../../