Created September 20, 2019 03:11
nog: commit 6f65e512b45e04ef9f177ea8e1adf6ba26cb648e | |
stems: 1367 | |
bible coverage | |
Number of tokenised words in the corpus: 189329 | |
Coverage: 81.88% | |
Top unknown words in the corpus: | |
343 Масих | |
341 а | |
306 Раббий | |
233 Кие | |
230 иман | |
194 аркалы | |
189 баьриси | |
176 Масихтинъ | |
148 Петер | |
146 Паул | |
143 А | |
128 дува | |
127 Раббийдинъ | |
123 Рух | |
120 солай | |
116 оьким | |
105 баьрисин | |
85 сокталары | |
85 болынъыз | |
83 Масихке | |
Translation time: 1.595435380935669 seconds | |
bible corpus size (tokens): 138010 ../../../data4apertium/corpora/bible/nog.txt | |
sah: commit 46b66f6f3e90a766d13647d00e1a6bcf03f1b25e | |
stems: 9505 | |
bible coverage | |
Error: Malformed input stream. | |
Number of tokenised words in the corpus: 1107 | |
Coverage: 91.06% | |
Top unknown words in the corpus: | |
5 Апостоллар | |
3 Евангелие | |
2 Спасскай | |
2 миссионердар | |
2 Аланд | |
1 Annotation | |
1 НОВЫЙ | |
1 ЗАВЕТ | |
1 якутском | |
1 языке | |
1 in | |
1 c | |
1 эҕэрдэлиибин | |
1 Тэнгри | |
1 чочуобунаны | |
1 сүрэхтэнэр | |
1 бэргэһэлэнэр | |
1 Дежнев | |
1 Абакайааданы | |
1 бэргэһэлээбит | |
Translation time: 0.040879249572753906 seconds | |
bible corpus size (tokens): 146801 ../../../data4apertium/corpora/bible/sah.txt | |
wikipedia coverage | |
Error: Malformed input stream. | |
Number of tokenised words in the corpus: 3672 | |
Coverage: 90.36% | |
Top unknown words in the corpus: | |
5 Орест | |
4 якутского | |
4 языка | |
3 Вейсенбург | |
2 Ромулус | |
2 киириитиэр | |
2 саарар | |
2 фон | |
2 монастырыгар | |
2 рукопись | |
2 этилэр | |
2 биллибитэ | |
2 К | |
2 Пеллерин | |
2 Кустуктуурап | |
2 буоланнар | |
2 Максимовы | |
2 Маҥаачыйа | |
2 Поликарпов | |
2 сорохторун | |
Translation time: 0.08163261413574219 seconds | |
wikipedia corpus size (tokens): 5082510 wiki.txt | |
chv: commit 16c6cacbb54cd238566d5da2b7a807085bb9d6cd | |
stems: 62530 | |
bible coverage | |
Number of tokenised words in the corpus: 196268 | |
Coverage: 94.08% | |
Top unknown words in the corpus: | |
1432 Иисус | |
272 Эй | |
248 Иисуса | |
136 Святой | |
125 Иоанн | |
86 Моисей | |
86 пӗтӗмпех | |
75 кирек | |
67 тӳрре | |
61 шыва | |
61 Симон | |
60 Пилат | |
56 Давид | |
54 тунине | |
53 Павела | |
52 Ирод | |
50 самантрах | |
49 ҫавнашкалах | |
48 Аминь | |
48 пулнӑран | |
Translation time: 3.941494941711426 seconds | |
bible corpus size (tokens): 133632 ../../../data4apertium/corpora/bible/chv.txt | |
wikipedia coverage | |
Error: Malformed input stream. | |
Number of tokenised words in the corpus: 686 | |
Coverage: 92.57% | |
Top unknown words in the corpus: | |
2 Хуть | |
2 тĕрлĕрен | |
2 идиомсем | |
2 тĕпченин | |
2 каяп | |
1 мĕшĕнче | |
1 нимпех | |
1 уйралса | |
1 талккăшпех | |
1 ыттисенчен | |
1 идиомсенчен | |
1 паллăраххисем | |
1 этнографилле | |
1 ыттисемшĕн | |
1 радиокăларăмсемпе | |
1 телепередачăсем | |
1 тулăшĕнче | |
1 кодлăхĕсем | |
1 кодлăхĕсене | |
1 Ăнлантаркăч | |
Translation time: 0.07455682754516602 seconds | |
wikipedia corpus size (tokens): 295582 wiki.txt | |
kum: commit 162d6e69a4e860d4057489f2fcd97105ce99d9d9 | |
stems: 4949 | |
bible coverage | |
Number of tokenised words in the corpus: 207468 | |
Coverage: 93.33% | |
Top unknown words in the corpus: | |
191 Месигьни | |
144 оьзлени | |
90 ягьудилени | |
90 ягьудилер | |
79 Месигьге | |
79 Я | |
76 таби | |
69 Месигьден | |
62 каламын | |
62 чакъы | |
59 оьзлеге | |
58 сужда | |
57 Шолайлыкъда | |
55 инкар | |
55 Устаз | |
55 Къанунну | |
54 эсе | |
50 я | |
49 сюннет | |
45 ягьуди | |
Translation time: 2.8214516639709473 seconds | |
bible corpus size (tokens): 153845 ../../../data4apertium/corpora/bible/kum.txt | |
kaa: commit 85249552c43627efee4c754ccc7954ac4c8a953e | |
stems: 28474 | |
bible coverage | |
Number of tokenised words in the corpus: 190814 | |
Coverage: 93.79% | |
Top unknown words in the corpus: | |
520 Muxaddes | |
408 ytkeni | |
353 Masix | |
346 A | |
214 z | |
183 Masixtıń | |
172 ǵo | |
161 Petr | |
148 Pavel | |
142 atırǵan | |
139 zi | |
108 Háy | |
106 muxaddes | |
105 Ruwx | |
85 atanaq | |
81 Masixqa | |
75 haq | |
75 ziniń | |
68 bolǵanlıqtan | |
65 Erusalimge | |
Translation time: 4.4775168895721436 seconds | |
bible corpus size (tokens): 145429 ../../../data4apertium/corpora/bible/kaa.txt | |
wikipedia coverage | |
Error: Malformed input stream. | |
Number of tokenised words in the corpus: 4 | |
Coverage: 100.00% | |
Top unknown words in the corpus: | |
Translation time: 0.016776323318481445 seconds | |
wikipedia corpus size (tokens): 337430 wiki.txt | |
tuk: commit c1493edb237396bcc0432743c9ca16b6439fc541 | |
stems: 2986 | |
bible coverage | |
Number of tokenised words in the corpus: 598585 | |
Coverage: 70.38% | |
Top unknown words in the corpus: | |
3337 Reb | |
2415 Rebbiň | |
2359 Ol | |
2228 Men | |
2132 ol | |
1424 olar | |
1366 Olar | |
1309 olaryň | |
1070 oňa | |
1056 Sen | |
1053 Rebbe | |
1048 men | |
1041 olary | |
923 meniň | |
850 Meniň | |
828 seniň | |
782 maňa | |
772 siz | |
768 çünki | |
761 Eý | |
Translation time: 5.525491237640381 seconds | |
bible corpus size (tokens): 401307 ../../../data4apertium/corpora/bible/tuk.txt | |
wikipedia coverage | |
Error: Malformed input stream. | |
Number of tokenised words in the corpus: 129 | |
Coverage: 82.17% | |
Top unknown words in the corpus: | |
2 inženerligi | |
1 Sahypa | |
1 Kimýä | |
1 ähtimal | |
1 şire | |
1 çyg | |
1 tagam | |
1 splawy | |
1 guýmak | |
1 garylmak | |
1 tebigat | |
1 iň | |
1 wajyp | |
1 olaryň | |
1 üýtgeýişleri | |
1 üýtgeýişleriň | |
1 tabyn | |
1 baradaky | |
1 algoritm | |
1 giňişleýin | |
Translation time: 0.010648488998413086 seconds | |
wikipedia corpus size (tokens): 2021374 wiki.txt | |
bak: commit 2cb89f7bc78526f47da6c6de1b1420475ca85b58 | |
stems: 56463 | |
bible coverage | |
Number of tokenised words in the corpus: 197315 | |
Coverage: 94.29% | |
Top unknown words in the corpus: | |
144 Һөйөнөслө | |
114 китте | |
94 имандаштар | |
75 үҙҙәре | |
71 шундай | |
63 алдына | |
58 ҡыуып | |
54 арҡысаҡҡа | |
53 киткән | |
52 фарисейҙар | |
51 бөтөнөһө | |
50 ҡисса | |
49 дусар | |
49 өҫтөнән | |
48 Ирод | |
48 береһенә | |
48 Имандаштар | |
46 Йәһүҙә | |
46 халҡы | |
45 Яҡуб | |
Translation time: 4.478304386138916 seconds | |
bible corpus size (tokens): 145707 ../../../data4apertium/corpora/bible/bak.txt | |
wikipedia coverage | |
Error: Malformed input stream. | |
Number of tokenised words in the corpus: 474 | |
Coverage: 96.20% | |
Top unknown words in the corpus: | |
4 власы | |
2 ө | |
2 н | |
1 ноябрендә | |
1 октябрендә | |
1 суверенитеты | |
1 февраленән | |
1 ТӨРКСОЙ | |
1 Халҡы | |
1 ын | |
1 тауҙарының | |
1 Ямантау | |
1 мәмерйәләре | |
Translation time: 0.0639498233795166 seconds | |
wikipedia corpus size (tokens): 17443832 wiki.txt | |
kaz: commit 04f31c3d337e1fa69420b6ffbcab7cc826490032 | |
stems: 37801 | |
bible coverage | |
Number of tokenised words in the corpus: 210008 | |
Coverage: 98.09% | |
Top unknown words in the corpus: | |
66 яһудилер | |
65 парызшылдар | |
63 немесе | |
50 Пилат | |
49 Жохан | |
41 Яһудея | |
36 яһудилердің | |
32 Ғалилея | |
29 Қорынттықтарга | |
27 Лұқа | |
24 Марқа | |
23 Яһудилердің | |
23 Філіп | |
23 дұғай | |
20 Менмін | |
20 Барнаба | |
19 тағзым | |
17 Ыбырайымға | |
17 Тоқтының | |
16 Тімоте | |
Translation time: 9.840543031692505 seconds | |
bible corpus size (tokens): 151631 ../../../data4apertium/corpora/bible/kaz.txt | |
wikipedia coverage | |
Error: Malformed input stream. | |
Number of tokenised words in the corpus: 2889 | |
Coverage: 96.16% | |
Top unknown words in the corpus: | |
3 оңт | |
2 тайпалық | |
2 жоспарлық | |
2 Беловеж | |
2 респ | |
2 жоғ | |
2 өкілеттігі | |
2 мореналық | |
2 тен | |
2 ке | |
2 сағ | |
1 Бaтысында | |
1 төмeнгі | |
1 Мұхитқа | |
1 жəне | |
1 aлуан | |
1 Хaлықтың | |
1 православты | |
1 номинал | |
1 Стан | |
Translation time: 0.38750314712524414 seconds | |
wikipedia corpus size (tokens): 33782767 wiki.txt | |
tur: commit 1e6e3b4d3fce24e0aa18342051dc6fe8533da679 | |
stems: 22652 | |
bible coverage | |
Number of tokenised words in the corpus: 481376 | |
Coverage: 93.90% | |
Top unknown words in the corpus: | |
829 nın | |
809 ın | |
627 a | |
407 na | |
289 ı | |
268 i | |
267 nun | |
203 dan | |
187 Kâhin | |
179 ndan | |
174 yı | |
161 nı | |
159 Irmağı | |
154 Filistliler | |
152 Efrayim | |
151 nin | |
142 Yoav | |
137 Manaşşe | |
127 Moav | |
122 nde | |
Translation time: 11.723340511322021 seconds | |
bible corpus size (tokens): 309293 ../../../data4apertium/corpora/bible/tur.txt | |
wikipedia coverage | |
Error: Malformed input stream. | |
Number of tokenised words in the corpus: 16711 | |
Coverage: 89.29% | |
Top unknown words in the corpus: | |
134 ın | |
124 Temuçin | |
40 nin | |
38 ı | |
38 a | |
36 i | |
34 Camuka | |
33 Jin | |
33 nın | |
29 Cuci | |
27 Harezmşah | |
22 Cebe | |
17 Mukhulai | |
17 Subutay | |
16 Yesügey | |
14 Börte | |
14 Alaeddin | |
13 Höelin | |
13 Suphi | |
12 Şira | |
Translation time: 0.6075348854064941 seconds | |
wikipedia corpus size (tokens): 54337641 wiki.txt | |
tat: commit 3c854811ec1251b005529a2da9df8a2d81b93680 | |
stems: 59755 | |
bible coverage | |
Number of tokenised words in the corpus: 196538 | |
Coverage: 98.61% | |
Top unknown words in the corpus: | |
52 фарисейләр | |
30 кайберләре | |
29 Corinthians | |
23 Фарисейләр | |
22 Revelation | |
21 1st | |
20 Петернең | |
17 кайберәүләр | |
17 Тимуте | |
16 2nd | |
15 Антиухеягә | |
14 Һанани | |
14 Әгрип | |
13 Яһүдиядә | |
13 кинаяле | |
13 саддукейлар | |
13 Петергә | |
13 Леви | |
13 Көрнили | |
13 Фисте | |
Translation time: 7.186840295791626 seconds | |
bible corpus size (tokens): 144953 ../../../data4apertium/corpora/bible/tat.txt | |
wikipedia coverage | |
Error: Malformed input stream. | |
Number of tokenised words in the corpus: 26 | |
Coverage: 69.23% | |
Top unknown words in the corpus: | |
1 Tatarça | |
1 İnternet | |
1 tulısınça | |
1 İnternetı | |
1 Respublikası | |
1 däwlät | |
1 tellärendä | |
1 tatar | |
Translation time: 0.12969040870666504 seconds | |
wikipedia corpus size (tokens): 6884329 wiki.txt | |
gag: commit 3c6bf03fcdcb84bf35831d6e5ebc39f28e74087c | |
stems: 6470 | |
bible coverage | |
Number of tokenised words in the corpus: 1 | |
Coverage: 100.00% | |
Top unknown words in the corpus: | |
Translation time: 0.027973413467407227 seconds | |
bible corpus size (tokens): | |
wikipedia coverage | |
Number of tokenised words in the corpus: 478 | |
Coverage: 93.72% | |
Top unknown words in the corpus: | |
2 Aarı | |
2 mikrotemaları | |
2 abzaț | |
1 notoc | |
1 noeditsection | |
1 Ağrı | |
1 viridis | |
1 gruz | |
1 აფხაზეთი | |
1 kismi | |
1 Topraaın | |
1 sunnü | |
1 mikrotema | |
1 abzațın | |
1 ercääz | |
1 başlıkları | |
1 Rhodeus | |
1 sericeus | |
1 akarlarda | |
1 göllerde | |
Translation time: 0.03628849983215332 seconds | |
wikipedia corpus size (tokens): 123741 wiki.txt | |
uzb: commit b5f2b1242b784271ed4c30acee83048e97e109bc | |
stems: 36684 | |
bible coverage | |
Number of tokenised words in the corpus: 198551 | |
Coverage: 95.11% | |
Top unknown words in the corpus: | |
185 ko | |
71 g | |
61 so | |
48 cho | |
41 Kimki | |
36 lasizlar | |
31 bilasizlar | |
30 emasmi | |
28 qiladigan | |
28 Acts | |
27 go | |
26 lur | |
26 Shoul | |
24 emasman | |
23 vahiy | |
23 III | |
22 Isha | |
22 yozilganidek | |
22 Revelation | |
21 qilasizlar | |
Translation time: 2.853121280670166 seconds | |
bible corpus size (tokens): 131151 ../../../data4apertium/corpora/bible/uzb.txt | |
wikipedia coverage | |
Error: Malformed input stream. | |
Number of tokenised words in the corpus: 4163 | |
Coverage: 90.37% | |
Top unknown words in the corpus: | |
8 sr | |
7 dan | |
6 srlarda | |
5 Tuproqqalʼa | |
4 2ming | |
4 Herirud | |
3 MDH | |
3 shyo | |
3 reytingida | |
3 GFP | |
3 Tajan | |
3 3ming | |
3 xorazmiylar | |
3 Kot | |
2 Global | |
2 vertolyot | |
2 reytingga | |
2 Murgʻob | |
2 Gekatey | |
2 Gerodot | |
Translation time: 0.07757258415222168 seconds | |
wikipedia corpus size (tokens): 9183827 wiki.txt | |
crh: commit e64d105662e9f4776368fbfdf47de37635e557bf | |
stems: 13631 | |
bible coverage | |
Number of tokenised words in the corpus: 118941 | |
Coverage: 37.10% | |
Top unknown words in the corpus: | |
1566 ве | |
1022 Иса | |
910 бир | |
880 деди | |
729 ичюн | |
677 исе | |
624 эди | |
578 да | |
556 деп | |
505 Мен | |
501 бу | |
464 де | |
420 адам | |
352 оларгъа | |
339 не | |
336 Онынъ | |
319 Алланынъ | |
315 сонъ | |
296 Бу | |
285 оны | |
Translation time: 0.3830137252807617 seconds | |
bible corpus size (tokens): 82456 ../../../data4apertium/corpora/bible/crh.txt | |
wikipedia coverage | |
Number of tokenised words in the corpus: 5006 | |
Coverage: 92.75% | |
Top unknown words in the corpus: | |
12 Amdi | |
10 Abibulla | |
8 Odabaş | |
5 Giraybay | |
5 nemse | |
4 Ablây | |
4 افغانستان | |
4 Afġānistān | |
3 ci | |
3 Abeşistan | |
3 Aluston | |
2 ac | |
2 ae | |
2 af | |
2 ag | |
2 ai | |
2 am | |
2 ao | |
2 Antarktidanıñ | |
2 au | |
Translation time: 0.2028791904449463 seconds | |
wikipedia corpus size (tokens): 173704 wiki.txt | |
kir: commit caec6e6e4bd6e33be07b36820bd06ca349423497 | |
stems: 15886 | |
bible coverage | |
Number of tokenised words in the corpus: 201319 | |
Coverage: 94.95% | |
Top unknown words in the corpus: | |
946 Кудай | |
597 Кудайдын | |
305 Кудайга | |
94 Кудайды | |
90 Кудайдан | |
58 Ысман | |
43 Ыбрайым | |
38 Жүйүт | |
38 расмисинен | |
29 Corinthians | |
28 Барнап | |
28 Шабыл | |
27 чөмүлдүрүү | |
24 Кудайыбыз | |
23 Ыбрайымдын | |
23 чөмүлүү | |
22 Revelation | |
21 аян | |
21 алышпады | |
21 1st | |
Translation time: 6.5446202754974365 seconds | |
bible corpus size (tokens): 148445 ../../../data4apertium/corpora/bible/kir.txt | |
wikipedia coverage | |
Error: Malformed input stream. | |
Number of tokenised words in the corpus: 9198 | |
Coverage: 91.01% | |
Top unknown words in the corpus: | |
15 С | |
10 К | |
7 комплекстик | |
7 Н | |
6 закон | |
6 Кыргызжер | |
5 П | |
5 В | |
5 масштабдагы | |
5 Д | |
5 Памир | |
5 каганаты | |
4 Ч | |
4 Гумбольдт | |
4 Риттер | |
4 чарбалык | |
4 Тескей | |
4 ч | |
4 Арсланбаб | |
4 жаңгак | |
Translation time: 0.3299267292022705 seconds | |
wikipedia corpus size (tokens): 8321420 wiki.txt | |
tyv: commit f89afc383b03e30638da11fb296d93d7da2f7a0e | |
stems: 11845 | |
bible coverage | |
Number of tokenised words in the corpus: 219959 | |
Coverage: 96.11% | |
Top unknown words in the corpus: | |
82 Бижилгеде | |
62 бараалгакчылары | |
49 бараалгакчызы | |
45 угаадыглыг | |
45 шыдажып | |
41 израиль | |
41 экиртип | |
37 бузуттуг | |
34 согур | |
33 эккеп | |
32 эрлик | |
32 дирлип | |
32 хевин | |
29 Аминь | |
29 Варнава | |
29 Corinthians | |
28 шыдамык | |
26 доңгая | |
26 алгыржып | |
26 соп | |
Translation time: 3.4303221702575684 seconds | |
bible corpus size (tokens): 156126 ../../../data4apertium/corpora/bible/tyv.txt | |
wikipedia coverage | |
Number of tokenised words in the corpus: 413 | |
Coverage: 94.19% | |
Top unknown words in the corpus: | |
4 quot | |
2 калмык | |
1 х | |
1 талакы | |
1 глобус | |
1 шиштейин | |
1 Бажин | |
1 субурган | |
1 топограф | |
1 Каррутерс | |
1 Михаилның | |
1 мрамор | |
1 силбип | |
1 үндүсүн | |
1 тоолдап | |
1 Гэсэр | |
1 эпостуң | |
1 1990чч | |
1 ля | |
1 минор | |
Translation time: 0.0222933292388916 seconds | |
wikipedia corpus size (tokens): 337589 wiki.txt | |
uig: commit 7ce96d726371b41ac371745ae5267fa246557144 | |
stems: 25385 | |
bible coverage | |
Number of tokenised words in the corpus: 1 | |
Coverage: 100.00% | |
Top unknown words in the corpus: | |
Translation time: 0.062395572662353516 seconds | |
bible corpus size (tokens): | |
wikipedia coverage | |
Error: Malformed input stream. | |
Number of tokenised words in the corpus: 203 | |
Coverage: 86.21% | |
Top unknown words in the corpus: | |
3 كومپۇتەر | |
1 ۋىكىپىدىيە | |
1 ۋىكىپېدىيەنىڭ | |
1 نۇسخاسىغا | |
1 نەۋرۇز | |
1 بايرامى | |
1 بیلگیسايار | |
1 مۈھەندیسلیغی | |
1 ھەلقی | |
1 نھايیتی | |
1 ۋەئی | |
1 شلیتیشیمیزگە | |
1 ھیتاي | |
1 كیشیلیریمیزئو | |
1 قۇۋاتقان | |
1 يیتیشیۋاتقان | |
1 مما | |
1 بیزنیڭئو | |
1 مۇمی | |
1 سەلیشتۇرغاندا | |
Translation time: 0.0732121467590332 seconds | |
wikipedia corpus size (tokens): 1791416 wiki.txt | |
aze: commit da572614b8d54f1caef8d0eabe11ca2e669edf42 | |
stems: 11583 | |
bible coverage | |
Number of tokenised words in the corpus: 753060 | |
Coverage: 56.79% | |
Top unknown words in the corpus: | |
12538 və | |
3642 Rəbb | |
3569 də | |
2634 ilə | |
2451 Rəbbin | |
2397 görə | |
2021 idi | |
1971 hər | |
1834 Çünki | |
1813 oğlu | |
1729 Mən | |
1624 isə | |
1347 etdi | |
1277 qədər | |
1265 İsa | |
1207 Allah | |
1151 çünki | |
1136 Allahın | |
1080 Ey | |
940 yanına | |
Translation time: 6.365848779678345 seconds | |
bible corpus size (tokens): 534925 ../../../data4apertium/corpora/bible/aze.txt | |
wikipedia coverage | |
Number of tokenised words in the corpus: 1 | |
Coverage: 100.00% | |
Top unknown words in the corpus: | |
Translation time: 0.014212846755981445 seconds | |
wikipedia corpus size (tokens): 0 wiki.txt | |
kjh: commit b624f2e589f1421d15c962506943abd360894742 | |
stems: 710 | |
bible coverage | |
Number of tokenised words in the corpus: 175448 | |
Coverage: 47.63% | |
Top unknown words in the corpus: | |
1926 паза | |
1210 Иисус | |
1085 даа | |
979 тізең | |
957 тіп | |
851 дее | |
720 ниме | |
681 ӱчӱн | |
659 прай | |
625 Хан | |
607 Че | |
577 теен | |
525 че | |
497 хада | |
436 нимес | |
426 парған | |
342 киліп | |
334 нооза | |
329 Аннаңар | |
321 ағаа | |
Translation time: 0.7867178916931152 seconds | |
bible corpus size (tokens): 137272 ../../../data4apertium/corpora/bible/kjh.txt | |
krc: commit e9f9e9c1406aae02a973147cce1b1f49e21b282c | |
stems: 8551 | |
bible coverage | |
Number of tokenised words in the corpus: 193680 | |
Coverage: 85.41% | |
Top unknown words in the corpus: | |
802 Исса | |
444 Иссаны | |
416 Масих | |
313 Раббий | |
287 юсюнден | |
242 Кесини | |
217 кесини | |
168 санга | |
156 Раббийни | |
148 муну | |
144 Масихни | |
123 Иссагъа | |
121 жууапха | |
119 Муну | |
115 Пауул | |
113 жууап | |
110 этигиз | |
109 махтау | |
96 Нюр | |
96 Раббийибиз | |
Translation time: 1.7533786296844482 seconds | |
bible corpus size (tokens): 142337 ../../../data4apertium/corpora/bible/krc.txt | |
wikipedia coverage | |
Number of tokenised words in the corpus: 1 | |
Coverage: 100.00% | |
Top unknown words in the corpus: | |
Translation time: 0.01462864875793457 seconds | |
wikipedia corpus size (tokens): 0 wiki.txt | |
ota: commit 788a52c454b3e6e815df89e7b09069da16a40a44 | |
stems: 77 | |
bible coverage | |
Number of tokenised words in the corpus: 1 | |
Coverage: 100.00% | |
Top unknown words in the corpus: | |
Translation time: 0.0038390159606933594 seconds | |
bible corpus size (tokens): | |
alt: commit 9fd53d1efb6e6556848ccb6695d7d7a597164f73 | |
stems: 182 | |
bible coverage | |
Number of tokenised words in the corpus: 194244 | |
Coverage: 61.89% | |
Top unknown words in the corpus: | |
354 ончо | |
335 1 | |
313 12 | |
302 13 | |
300 9 | |
293 14 | |
290 8 | |
289 6 | |
288 11 | |
286 2 | |
284 10 | |
283 3 | |
281 4 | |
280 5 | |
272 15 | |
272 Кайракан | |
268 7 | |
261 ажыра | |
258 18 | |
256 17 | |
Translation time: 0.8320250511169434 seconds | |
bible corpus size (tokens): 133151 ../../../data4apertium/corpora/bible/alt.txt | |
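
The report above follows a regular layout: a `xxx: commit <hash>` header per language, then a `bible coverage` or `wikipedia coverage` section containing a token count and a `Coverage:` percentage. The following is a minimal sketch of how such a report could be read back into a table. It is not part of the script that produced the report; the names `parse_report` and `reliable` and the `min_tokens` threshold are hypothetical and chosen only for illustration.

```python
import re

# Patterns taken from the line shapes visible in the report itself.
HEADER = re.compile(r"^(\w{3}): commit ([0-9a-f]{40})")
SECTION = re.compile(r"(bible|wikipedia) coverage$")
TOKENS = re.compile(r"Number of tokenised words in the corpus: (\d+)")
COVERAGE = re.compile(r"^Coverage: ([\d.]+)%")


def parse_report(text):
    """Return {lang: {corpus: (tokens, coverage_percent)}} for a report string."""
    results = {}
    lang = corpus = tokens = None
    for raw in text.splitlines():
        # Drop the trailing "| |" residue left by the gist table rendering.
        line = raw.rstrip(" |")
        if m := HEADER.match(line):
            lang, corpus = m.group(1), None
            results[lang] = {}
        elif m := SECTION.search(line):
            # search() also catches a section name fused onto the previous line.
            corpus = m.group(1)
        elif m := TOKENS.search(line):
            # search() also catches counts fused after an error message.
            tokens = int(m.group(1))
        elif (m := COVERAGE.match(line)) and lang and corpus:
            results[lang][corpus] = (tokens, float(m.group(1)))
    return results


def reliable(results, min_tokens=1000):
    """Keep only runs with at least min_tokens tokens (threshold is an assumption)."""
    return {
        lang: {c: v for c, v in runs.items() if v[0] >= min_tokens}
        for lang, runs in results.items()
    }
```

Filtering by token count is one way to set aside runs such as those reporting a single token at 100.00% coverage, where the percentage says little about the analyser.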