@jessielw
Last active December 22, 2021 21:36
Python Dictionary of ISO-639-2 Codes
iso_639_2_codes_dictionary = {'English': 'eng',
'Chinese': 'chi',
'French': 'fre',
'Japanese': 'jpn',
'Spanish': 'spa',
'German': 'ger',
'Korean': 'kor',
'Russian': 'rus',
'Afar': 'aar',
'Abkhazian': 'abk',
'Achinese': 'ace',
'Acoli': 'ach',
'Adangme': 'ada',
'Adyghe; Adygei': 'ady',
'Afro-Asiatic languages': 'afa',
'Afrihili': 'afh',
'Afrikaans': 'afr',
'Ainu': 'ain',
'Akan': 'aka',
'Akkadian': 'akk',
'Aleut': 'ale',
'Algonquian languages': 'alg',
'Southern Altai': 'alt',
'Amharic': 'amh',
'English, Old (ca.450-1100)': 'ang',
'Angika': 'anp',
'Apache languages': 'apa',
'Arabic': 'ara',
'Official Aramaic (700-300 BCE); Imperial Aramaic (700-300 BCE)': 'arc',
'Aragonese': 'arg',
'Armenian': 'arm',
'Mapudungun; Mapuche': 'arn',
'Arapaho': 'arp',
'Artificial languages': 'art',
'Arawak': 'arw',
'Assamese': 'asm',
'Asturian; Bable; Leonese; Asturleonese': 'ast',
'Athapascan languages': 'ath',
'Australian languages': 'aus',
'Avaric': 'ava',
'Avestan': 'ave',
'Awadhi': 'awa',
'Aymara': 'aym',
'Azerbaijani': 'aze',
'Banda languages': 'bad',
'Bamileke languages': 'bai',
'Bashkir': 'bak',
'Baluchi': 'bal',
'Bambara': 'bam',
'Balinese': 'ban',
'Basque': 'baq',
'Basa': 'bas',
'Baltic languages': 'bat',
'Beja; Bedawiyet': 'bej',
'Belarusian': 'bel',
'Bemba': 'bem',
'Bengali': 'ben',
'Berber languages': 'ber',
'Bhojpuri': 'bho',
'Bihari languages': 'bih',
'Bikol': 'bik',
'Bini; Edo': 'bin',
'Bislama': 'bis',
'Siksika': 'bla',
'Bantu languages': 'bnt',
'Tibetan': 'tib',
'Bosnian': 'bos',
'Braj': 'bra',
'Breton': 'bre',
'Batak languages': 'btk',
'Buriat': 'bua',
'Buginese': 'bug',
'Bulgarian': 'bul',
'Burmese': 'bur',
'Blin; Bilin': 'byn',
'Caddo': 'cad',
'Central American Indian languages': 'cai',
'Galibi Carib': 'car',
'Catalan; Valencian': 'cat',
'Caucasian languages': 'cau',
'Cebuano': 'ceb',
'Celtic languages': 'cel',
'Czech': 'cze',
'Chamorro': 'cha',
'Chibcha': 'chb',
'Chechen': 'che',
'Chagatai': 'chg',
'Chuukese': 'chk',
'Mari': 'chm',
'Chinook jargon': 'chn',
'Choctaw': 'cho',
'Chipewyan; Dene Suline': 'chp',
'Cherokee': 'chr',
'Church Slavic; Old Slavonic': 'chu',
'Chuvash': 'chv',
'Cheyenne': 'chy',
'Chamic languages': 'cmc',
'Montenegrin': 'cnr',
'Coptic': 'cop',
'Cornish': 'cor',
'Corsican': 'cos',
'Creoles and pidgins, English based': 'cpe',
'Creoles and pidgins, French-based': 'cpf',
'Creoles and pidgins, Portuguese-based': 'cpp',
'Cree': 'cre',
'Crimean Tatar; Crimean Turkish': 'crh',
'Creoles and pidgins': 'crp',
'Kashubian': 'csb',
'Cushitic languages': 'cus',
'Welsh': 'wel',
'Dakota': 'dak',
'Danish': 'dan',
'Dargwa': 'dar',
'Land Dayak languages': 'day',
'Delaware': 'del',
'Slave (Athapascan)': 'den',
'Dogrib': 'dgr',
'Dinka': 'din',
'Divehi; Dhivehi; Maldivian': 'div',
'Dogri': 'doi',
'Dravidian languages': 'dra',
'Lower Sorbian': 'dsb',
'Duala': 'dua',
'Dutch, Middle (ca.1050-1350)': 'dum',
'Dutch; Flemish': 'dut',
'Dyula': 'dyu',
'Dzongkha': 'dzo',
'Efik': 'efi',
'Egyptian (Ancient)': 'egy',
'Ekajuk': 'eka',
'Elamite': 'elx',
'English, Middle (1100-1500)': 'enm',
'Esperanto': 'epo',
'Estonian': 'est',
'Ewe': 'ewe',
'Ewondo': 'ewo',
'Fang': 'fan',
'Faroese': 'fao',
'Fanti': 'fat',
'Fijian': 'fij',
'Filipino; Pilipino': 'fil',
'Finnish': 'fin',
'Finno-Ugrian languages': 'fiu',
'Fon': 'fon',
'French, Middle (ca.1400-1600)': 'frm',
'French, Old (842-ca.1400)': 'fro',
'Northern Frisian': 'frr',
'Eastern Frisian': 'frs',
'Western Frisian': 'fry',
'Fulah': 'ful',
'Friulian': 'fur',
'Ga': 'gaa',
'Gayo': 'gay',
'Gbaya': 'gba',
'Germanic languages': 'gem',
'Georgian': 'geo',
'Geez': 'gez',
'Gilbertese': 'gil',
'Gaelic; Scottish Gaelic': 'gla',
'Irish': 'gle',
'Galician': 'glg',
'Manx': 'glv',
'German, Middle High (ca.1050-1500)': 'gmh',
'German, Old High (ca.750-1050)': 'goh',
'Gondi': 'gon',
'Gorontalo': 'gor',
'Gothic': 'got',
'Grebo': 'grb',
'Greek, Ancient (to 1453)': 'grc',
'Greek, Modern (1453-)': 'gre',
'Guarani': 'grn',
'Swiss German; Alemannic; Alsatian': 'gsw',
'Gujarati': 'guj',
"Gwich'in": 'gwi',
'Haida': 'hai',
'Haitian; Haitian Creole': 'hat',
'Hausa': 'hau',
'Hawaiian': 'haw',
'Hebrew': 'heb',
'Herero': 'her',
'Hiligaynon': 'hil',
'Himachali languages; Western Pahari languages': 'him',
'Hindi': 'hin',
'Hittite': 'hit',
'Hmong; Mong': 'hmn',
'Hiri Motu': 'hmo',
'Croatian': 'hrv',
'Upper Sorbian': 'hsb',
'Hungarian': 'hun',
'Hupa': 'hup',
'Iban': 'iba',
'Igbo': 'ibo',
'Icelandic': 'ice',
'Ido': 'ido',
'Sichuan Yi; Nuosu': 'iii',
'Ijo languages': 'ijo',
'Inuktitut': 'iku',
'Interlingue; Occidental': 'ile',
'Iloko': 'ilo',
'Interlingua (International Auxiliary Language Association)': 'ina',
'Indic languages': 'inc',
'Indonesian': 'ind',
'Indo-European languages': 'ine',
'Ingush': 'inh',
'Inupiaq': 'ipk',
'Iranian languages': 'ira',
'Iroquoian languages': 'iro',
'Italian': 'ita',
'Javanese': 'jav',
'Lojban': 'jbo',
'Judeo-Persian': 'jpr',
'Judeo-Arabic': 'jrb',
'Kara-Kalpak': 'kaa',
'Kabyle': 'kab',
'Kachin; Jingpho': 'kac',
'Kalaallisut; Greenlandic': 'kal',
'Kamba': 'kam',
'Kannada': 'kan',
'Karen languages': 'kar',
'Kashmiri': 'kas',
'Kanuri': 'kau',
'Kawi': 'kaw',
'Kazakh': 'kaz',
'Kabardian': 'kbd',
'Khasi': 'kha',
'Khoisan languages': 'khi',
'Central Khmer': 'khm',
'Khotanese; Sakan': 'kho',
'Kikuyu; Gikuyu': 'kik',
'Kinyarwanda': 'kin',
'Kirghiz; Kyrgyz': 'kir',
'Kimbundu': 'kmb',
'Konkani': 'kok',
'Komi': 'kom',
'Kongo': 'kon',
'Kosraean': 'kos',
'Kpelle': 'kpe',
'Karachay-Balkar': 'krc',
'Karelian': 'krl',
'Kru languages': 'kro',
'Kurukh': 'kru',
'Kuanyama; Kwanyama': 'kua',
'Kumyk': 'kum',
'Kurdish': 'kur',
'Kutenai': 'kut',
'Ladino': 'lad',
'Lahnda': 'lah',
'Lamba': 'lam',
'Lao': 'lao',
'Latin': 'lat',
'Latvian': 'lav',
'Lezghian': 'lez',
'Limburgan; Limburger; Limburgish': 'lim',
'Lingala': 'lin',
'Lithuanian': 'lit',
'Mongo': 'lol',
'Lozi': 'loz',
'Luxembourgish; Letzeburgesch': 'ltz',
'Luba-Lulua': 'lua',
'Luba-Katanga': 'lub',
'Ganda': 'lug',
'Luiseno': 'lui',
'Lunda': 'lun',
'Luo (Kenya and Tanzania)': 'luo',
'Lushai': 'lus',
'Macedonian': 'mac',
'Madurese': 'mad',
'Magahi': 'mag',
'Marshallese': 'mah',
'Maithili': 'mai',
'Makasar': 'mak',
'Malayalam': 'mal',
'Mandingo': 'man',
'Maori': 'mao',
'Austronesian languages': 'map',
'Marathi': 'mar',
'Masai': 'mas',
'Moksha': 'mdf',
'Mandar': 'mdr',
'Mende': 'men',
'Irish, Middle (900-1200)': 'mga',
"Mi'kmaq; Micmac": 'mic',
'Minangkabau': 'min',
'Uncoded languages': 'mis',
'Mon-Khmer languages': 'mkh',
'Malagasy': 'mlg',
'Maltese': 'mlt',
'Manchu': 'mnc',
'Manipuri': 'mni',
'Manobo languages': 'mno',
'Mohawk': 'moh',
'Mongolian': 'mon',
'Mossi': 'mos',
'Malay': 'may',
'Multiple languages': 'mul',
'Munda languages': 'mun',
'Creek': 'mus',
'Mirandese': 'mwl',
'Marwari': 'mwr',
'Mayan languages': 'myn',
'Erzya': 'myv',
'Nahuatl languages': 'nah',
'North American Indian languages': 'nai',
'Neapolitan': 'nap',
'Nauru': 'nau',
'Navajo; Navaho': 'nav',
'Ndebele, South; South Ndebele': 'nbl',
'Ndebele, North; North Ndebele': 'nde',
'Ndonga': 'ndo',
'Low German; Low Saxon; German, Low; Saxon, Low': 'nds',
'Nepali': 'nep',
'Nepal Bhasa; Newari': 'new',
'Nias': 'nia',
'Niger-Kordofanian languages': 'nic',
'Niuean': 'niu',
'Norwegian Nynorsk; Nynorsk, Norwegian': 'nno',
'Bokmal, Norwegian; Norwegian Bokmal': 'nob',
'Nogai': 'nog',
'Norse, Old': 'non',
'Norwegian': 'nor',
"N'Ko": 'nqo',
'Pedi; Sepedi; Northern Sotho': 'nso',
'Nubian languages': 'nub',
'Classical Newari; Old Newari; Classical Nepal Bhasa': 'nwc',
'Chichewa; Chewa; Nyanja': 'nya',
'Nyamwezi': 'nym',
'Nyankole': 'nyn',
'Nyoro': 'nyo',
'Nzima': 'nzi',
'Occitan (post 1500)': 'oci',
'Ojibwa': 'oji',
'Oriya': 'ori',
'Oromo': 'orm',
'Osage': 'osa',
'Ossetian; Ossetic': 'oss',
'Turkish, Ottoman (1500-1928)': 'ota',
'Otomian languages': 'oto',
'Papuan languages': 'paa',
'Pangasinan': 'pag',
'Pahlavi': 'pal',
'Pampanga; Kapampangan': 'pam',
'Panjabi; Punjabi': 'pan',
'Papiamento': 'pap',
'Palauan': 'pau',
'Persian, Old (ca.600-400 B.C.)': 'peo',
'Persian': 'per',
'Philippine languages': 'phi',
'Phoenician': 'phn',
'Pali': 'pli',
'Polish': 'pol',
'Pohnpeian': 'pon',
'Portuguese': 'por',
'Prakrit languages': 'pra',
'Provencal, Old (to 1500); Occitan, Old (to 1500)': 'pro',
'Pushto; Pashto': 'pus',
'Quechua': 'que',
'Rajasthani': 'raj',
'Rapanui': 'rap',
'Rarotongan; Cook Islands Maori': 'rar',
'Romance languages': 'roa',
'Romansh': 'roh',
'Romany': 'rom',
'Romanian; Moldavian; Moldovan': 'rum',
'Rundi': 'run',
'Aromanian; Arumanian; Macedo-Romanian': 'rup',
'Sandawe': 'sad',
'Sango': 'sag',
'Yakut': 'sah',
'South American Indian languages': 'sai',
'Salishan languages': 'sal',
'Samaritan Aramaic': 'sam',
'Sanskrit': 'san',
'Sasak': 'sas',
'Santali': 'sat',
'Sicilian': 'scn',
'Scots': 'sco',
'Selkup': 'sel',
'Semitic languages': 'sem',
'Irish, Old (to 900)': 'sga',
'Sign Languages': 'sgn',
'Shan': 'shn',
'Sidamo': 'sid',
'Sinhala; Sinhalese': 'sin',
'Siouan languages': 'sio',
'Sino-Tibetan languages': 'sit',
'Slavic languages': 'sla',
'Slovak': 'slo',
'Slovenian': 'slv',
'Southern Sami': 'sma',
'Northern Sami': 'sme',
'Sami languages': 'smi',
'Lule Sami': 'smj',
'Inari Sami': 'smn',
'Samoan': 'smo',
'Skolt Sami': 'sms',
'Shona': 'sna',
'Sindhi': 'snd',
'Soninke': 'snk',
'Sogdian': 'sog',
'Somali': 'som',
'Songhai languages': 'son',
'Sotho, Southern': 'sot',
'Spanish; Castilian': 'spa',
'Albanian': 'alb',
'Sardinian': 'srd',
'Sranan Tongo': 'srn',
'Serbian': 'srp',
'Serer': 'srr',
'Nilo-Saharan languages': 'ssa',
'Swati': 'ssw',
'Sukuma': 'suk',
'Sundanese': 'sun',
'Susu': 'sus',
'Sumerian': 'sux',
'Swahili': 'swa',
'Swedish': 'swe',
'Classical Syriac': 'syc',
'Syriac': 'syr',
'Tahitian': 'tah',
'Tai languages': 'tai',
'Tamil': 'tam',
'Tatar': 'tat',
'Telugu': 'tel',
'Timne': 'tem',
'Tereno': 'ter',
'Tetum': 'tet',
'Tajik': 'tgk',
'Tagalog': 'tgl',
'Thai': 'tha',
'Tigre': 'tig',
'Tigrinya': 'tir',
'Tiv': 'tiv',
'Tokelau': 'tkl',
'Klingon; tlhIngan-Hol': 'tlh',
'Tlingit': 'tli',
'Tamashek': 'tmh',
'Tonga (Nyasa)': 'tog',
'Tonga (Tonga Islands)': 'ton',
'Tok Pisin': 'tpi',
'Tsimshian': 'tsi',
'Tswana': 'tsn',
'Tsonga': 'tso',
'Turkmen': 'tuk',
'Tumbuka': 'tum',
'Tupi languages': 'tup',
'Turkish': 'tur',
'Altaic languages': 'tut',
'Tuvalu': 'tvl',
'Twi': 'twi',
'Tuvinian': 'tyv',
'Udmurt': 'udm',
'Ugaritic': 'uga',
'Uighur; Uyghur': 'uig',
'Ukrainian': 'ukr',
'Umbundu': 'umb',
'Undetermined': 'und',
'Urdu': 'urd',
'Uzbek': 'uzb',
'Vai': 'vai',
'Venda': 'ven',
'Vietnamese': 'vie',
'Volapuk': 'vol',
'Votic': 'vot',
'Wakashan languages': 'wak',
'Wolaitta; Wolaytta': 'wal',
'Waray': 'war',
'Washo': 'was',
'Sorbian languages': 'wen',
'Walloon': 'wln',
'Wolof': 'wol',
'Kalmyk; Oirat': 'xal',
'Xhosa': 'xho',
'Yao': 'yao',
'Yapese': 'yap',
'Yiddish': 'yid',
'Yoruba': 'yor',
'Yupik languages': 'ypk',
'Zapotec': 'zap',
'Blissymbols; Blissymbolics; Bliss': 'zbl',
'Zenaga': 'zen',
'Standard Moroccan Tamazight': 'zgh',
'Zhuang; Chuang': 'zha',
'Zande languages': 'znd',
'Zulu': 'zul',
'Zuni': 'zun',
'No linguistic content; Not applicable': 'zxx',
'Zaza; Dimili; Dimli; Kirdki; Kirmanjki; Zazaki': 'zza'}
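
Many keys in the dictionary bundle several names into one string separated by semicolons (for example 'Dutch; Flemish'), so a direct lookup of a single name such as 'Flemish' would miss. The sketch below shows one possible way to build a per-alias index and a code-to-name index. The helper names `build_alias_index` and `build_reverse_index` are hypothetical and not part of the original gist, and `sample_codes` is only a short excerpt standing in for the full dictionary.

```python
# Short excerpt standing in for the full iso_639_2_codes_dictionary.
sample_codes = {
    'English': 'eng',
    'Spanish': 'spa',
    'Spanish; Castilian': 'spa',
    'Dutch; Flemish': 'dut',
    'Catalan; Valencian': 'cat',
}


def build_alias_index(codes):
    """Map every individual name (lowercased) to its code.

    Keys such as 'Dutch; Flemish' are split on ';' so that each
    alias can be looked up on its own.
    """
    index = {}
    for names, code in codes.items():
        for alias in names.split(';'):
            alias = alias.strip().lower()
            if alias:
                # setdefault keeps the first code seen for an alias.
                index.setdefault(alias, code)
    return index


def build_reverse_index(codes):
    """Map each code back to the first key that lists it."""
    reverse = {}
    for names, code in codes.items():
        reverse.setdefault(code, names)
    return reverse


alias_index = build_alias_index(sample_codes)
reverse_index = build_reverse_index(sample_codes)

print(alias_index.get('flemish'))   # dut
print(reverse_index.get('spa'))     # Spanish
```

To use this with the real data, pass `iso_639_2_codes_dictionary` in place of `sample_codes`. Lowercasing the aliases is a design choice made here for case-insensitive lookups; drop the `.lower()` call if exact-case matching is preferred.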