Ilnar Salimzianov IlnarSelimcan

View get-gdrive-file
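## Given a Google Drive file id and the filename it should be saved as, fetch the file and save it.
fileid="$1"
filename="$2"
# First request: only store the cookies that Google Drive sets, discard the response body.
curl -c ./cookie -s -L "https://drive.google.com/uc?export=download&id=${fileid}" > /dev/null
# Second request: pass the token found in the cookie file as the confirm parameter and save the file.
curl -Lb ./cookie "https://drive.google.com/uc?export=download&confirm=$(awk '/download/ {print $NF}' ./cookie)&id=${fileid}" -o "${filename}"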
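The awk call in the second request prints the last whitespace-separated field of every cookie-file line that contains "download". A minimal Python sketch of that extraction step, assuming the cookie file is in the tab-separated Netscape format that curl writes; the function name `extract_confirm_token` and the sample cookie line are made up for illustration:

```python
def extract_confirm_token(cookie_text):
    """Mimic `awk '/download/ {print $NF}'` on the contents of a cookie file.

    Returns the last whitespace-separated field of every line that
    contains "download", joined by newlines (empty string if none match).
    """
    tokens = []
    for line in cookie_text.splitlines():
        fields = line.split()
        if "download" in line and fields:
            tokens.append(fields[-1])
    return "\n".join(tokens)


if __name__ == "__main__":
    # Hypothetical cookie jar contents; the cookie name and value are invented.
    sample = (
        "# Netscape HTTP Cookie File\n"
        ".drive.google.com\tTRUE\t/uc\tTRUE\t0\tdownload_warning_123\tAbCd\n"
    )
    print(extract_confirm_token(sample))
```

If no line matches, the result is an empty string, which corresponds to the shell script sending an empty confirm parameter.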
View gist:b5868d645b63b182100b46b1bd5dfde1
apertium-kaz$ echo "Біздің компания да осы процеске белсене қатысқысы келеді." | apertium-destxt -n | apertium -f none -d . kaz-morph | cg-conv -la | apertium-retxt | python3 ~/src/sourceforge-apertium/branches/kaz-tagger/kaz_tagger.py | vislcg3 -g apertium-kaz.kaz.rlx | python3 ../ud-scripts/vislcg3-to-conllu.py "" 2> /dev/null | python3 ../ud-scripts/conllu-feats.py apertium-kaz.kaz.udx 2> /dev/null | python3 ../ud-scripts/conllu-nospaceafter.py 2> /dev/null
# sent_id = :1:0
# text = Біздің компания да осы процеске белсене қатысқысы келеді.
1 Біздің біз NOUN n Case=Gen 2 nmod:poss _ _
2-3 компания да _ _ _ _ _ _ _ _
2 компания компания NOUN n Case=Nom 7 nsubj _ _
3 да да ADV postadv _ 7 X _ _
4 осы осы NOUN n Case=Nom 5 obj _ _
5 процеске процесс NOUN n Case=Dat 7 obl _ _
6 белсене белсен VERB v Aspect=Imp|VerbForm=Cov 7 X _ _
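The rows above follow the ten-column CoNLL-U layout (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC), shown here with the tab separators flattened to spaces. A small sketch of reading one such row back into named fields, assuming the real pipeline output is tab-separated; `parse_conllu_line` is a hypothetical helper and not part of the ud-scripts used in the command:

```python
CONLLU_FIELDS = (
    "id", "form", "lemma", "upos", "xpos",
    "feats", "head", "deprel", "deps", "misc",
)


def parse_conllu_line(line):
    """Parse one CoNLL-U line into a dict keyed by column name.

    Returns None for blank lines and for comment lines starting with '#'.
    Raises ValueError if a token line does not have exactly ten columns.
    """
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    values = line.split("\t")
    if len(values) != len(CONLLU_FIELDS):
        raise ValueError("expected 10 tab-separated columns, got %d" % len(values))
    return dict(zip(CONLLU_FIELDS, values))


if __name__ == "__main__":
    # First token row of the output above, with tabs restored.
    row = "1\tБіздің\tбіз\tNOUN\tn\tCase=Gen\t2\tnmod:poss\t_\t_"
    token = parse_conllu_line(row)
    print(token["form"], token["lemma"], token["head"], token["deprel"])
```

Splitting on tabs rather than on any whitespace matters for multiword-token rows such as `2-3`, whose FORM column contains a space.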
View conll18_ud_eval_lax.py
#!/usr/bin/env python3
# Compatible with Python 2.7 and 3.2+, can be used either as a module
# or a standalone executable.
#
# Copyright 2017, 2018 Institute of Formal and Applied Linguistics (UFAL),
# Faculty of Mathematics and Physics, Charles University, Czech Republic.
#
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this
IlnarSelimcan / scrape_coos_county.py
Created Feb 7, 2020
An example of me scraping a website using Python3 (with Requests & BeautifulSoup libraries)
View scrape_coos_county.py
## A script to scrape all listings on this site:
## https://www.point2homes.com/US/Land-For-Sale/NH/Coos-County.html
##
## into the following csv format:
##
## Name,Address,Amount,Acres,Type,Misc
##
## e.g.
## Name,Address,Amount,Acres,Type,Misc
## "L52 Cloutier, Stark, NH","Stark, NH","$27,500","5.16","5 days on Point2 Homes"
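The target format above can be produced with Python's standard `csv` module, which quotes any field containing a comma (as in "$27,500"). A minimal sketch, not the scraper itself: the function name `listings_to_csv` and the sample listing (including its "Land" type) are placeholders, and the Requests/BeautifulSoup scraping step is left out:

```python
import csv
import io

FIELDNAMES = ["Name", "Address", "Amount", "Acres", "Type", "Misc"]


def listings_to_csv(listings):
    """Serialise a list of listing dicts into CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(listings)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder listing; the values are illustrative, not scraped data.
    example = {
        "Name": "Example Lot, Stark, NH",
        "Address": "Stark, NH",
        "Amount": "$27,500",
        "Acres": "5.16",
        "Type": "Land",
        "Misc": "5 days on Point2 Homes",
    }
    print(listings_to_csv([example]), end="")
```

The default quoting mode only quotes fields that need it; passing `quoting=csv.QUOTE_ALL` to `DictWriter` would quote every field instead.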
View apertium-turkic-bilingual-stats.rkt
#lang rash
;; ASSUME: - this script is placed into apertium-all/
;; - hfst-covtest is in the PATH
;; USAGE: racket apertium-turkic-bilingual-stats.rkt > /tmp/bilingual 2>&1
;; REQUIRES: racket
;; rash (install with "raco pkg install rash")
(provide MONOLINGUAL BILINGUAL)
View gist:54cc2ab1fc4b6bdc9991f97a0d8a3b33
nog: commit 6f65e512b45e04ef9f177ea8e1adf6ba26cb648e
stems: 1367
bible coverage
Number of tokenised words in the corpus: 189329
Coverage: 81.88%
Top unknown words in the corpus:
343 Масих
341 а
306 Раббий
233 Кие
View test-cov-on-wiki.sh
#!/usr/bin/env bash
## Downloads a "pages-articles-multistream.xml.bz2" Wikipedia dump:
## - for the language LANG (iso2 or iso3 code),
## - from day DATE (in yyyymmdd format or "latest")
## makes a frequency list out of it,
## measures MODE's coverage on that frequency list,
## and compares it with the coverage of the previous revision of that mode.
##
## USAGE: ./test-cov-on-wiki.sh <lang> <date> <mode>
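The steps described in this header (build a frequency list, measure coverage on it, compare two revisions) can be sketched in Python. This is an illustration under stated assumptions, not what the shell script does: tokenisation is plain whitespace splitting, coverage is taken to be the share of running tokens that the analyser recognises, and the two "revisions" are stand-in word sets rather than real Apertium modes.

```python
from collections import Counter


def frequency_list(text):
    """Count how often each whitespace-separated token occurs."""
    return Counter(text.split())


def weighted_coverage(freqs, is_known):
    """Percentage of running tokens whose word form is_known() accepts."""
    total = sum(freqs.values())
    known = sum(count for word, count in freqs.items() if is_known(word))
    return 100.0 * known / total if total else 0.0


if __name__ == "__main__":
    # Toy corpus and toy lexicons standing in for two revisions of a mode.
    freqs = frequency_list("a b a c a b d")
    previous_revision = {"a"}
    current_revision = {"a", "b"}
    old = weighted_coverage(freqs, previous_revision.__contains__)
    new = weighted_coverage(freqs, current_revision.__contains__)
    print("previous: %.2f%%  current: %.2f%%  change: %+.2f" % (old, new, new - old))
```

Weighting by frequency means that one frequent unknown word lowers coverage more than many rare ones, which is why listing the top unknown words is a useful guide for what to add next.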
View gist:05b3bd768a476307df5280af4c1411a7
## A little script to test morphology/morphophonology, originally written by spectie
## for apertium-chv.
##
## USAGE: python3 test.py <lang code>
## python3 test.py <lang code> <tsv file>
## where tsv file is a tsv file with three columns:
## 1. direction restriction, which is either _ (no restriction), > (test generation only) or
## < (test analysis only)
## 2. lexical form
## 3. surface form
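A short sketch of how the three-column file described above could be read and split by direction. It only illustrates the file format; the function name `load_tests` and the sample rows are invented, and the part of the test script that actually runs the analyser and generator is not shown:

```python
def load_tests(tsv_text):
    """Split three-column TSV rows into analysis and generation test cases.

    Each row is: direction, lexical form, surface form.
    '_' puts the pair in both lists, '>' in generation only, '<' in analysis only.
    Returns (analysis_tests, generation_tests) as lists of (input, expected) pairs.
    """
    analysis_tests, generation_tests = [], []
    for line in tsv_text.splitlines():
        columns = line.split("\t")
        if len(columns) != 3:
            continue  # skip blank or malformed rows
        direction, lexical, surface = columns
        if direction in ("_", ">"):
            generation_tests.append((lexical, surface))
        if direction in ("_", "<"):
            analysis_tests.append((surface, lexical))
    return analysis_tests, generation_tests


if __name__ == "__main__":
    # Invented rows, using English placeholders instead of real test data.
    sample = "_\tcat<n><pl>\tcats\n>\tdog<n><pl>\tdogs\n<\tox<n><pl>\toxen\n"
    analysis, generation = load_tests(sample)
    print(analysis)
    print(generation)
```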
View install-apertium-quality.sh
#!/usr/bin/env bash
git clone https://github.com/IlnarSelimcan/apertium-quality.git
cd apertium-quality/mwtools/python3
sudo python3 ./setup.py install
cd ../../
./autogen.sh && make && sudo make install
cd ../../../
View install-apertium-quality.sh
#!/usr/bin/env bash
git clone https://github.com/apertium/apertium-quality.git
cd apertium-quality/mwtools/python3
sudo python3 ./setup.py install
cd ../../
./autogen.sh && make && sudo make install
cd ../../../