
@vjt
Created May 24, 2014 10:33

Welcome in many languages (in JSON format)

Today I wanted to implement a multi-language greeter for an application, and I ended up on http://www.omniglot.com/language/phrases/welcome.htm

I extracted all the welcomes from the page using jQuery, serialized them as JSON and processed them using Ruby:

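# String#blank? comes from ActiveSupport, not plain Ruby
require 'active_support/core_ext/object/blank'

# x is assumed to hold the parsed [language, raw text] pairs
welcomes = x.inject({}) do |h, (lang, val)|
  # drop parenthesized notes, then split alternatives on " / " or newlines
  vals = val.gsub(/\(.*?\)/, '').split(%r{\s+/\s+|\n})
  # strip prefixes/suffixes like "sg - ", literal translations and ">" notes
  vals = vals.map { |v| v.strip.gsub(/^\w+ -\s+|- \w+$|^lit.*|>.+$/, '').strip }
  h.update(lang => vals.reject(&:blank?))
end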

The result is available to download as welcome.json from this gist.

😃

[["Afrikaans",["Welkom"]],["Albanian",["Mirë se vjen"]],["Alsatian",["Wellkumma","Willkumme"]],["Amharic",["in kwahn deh-na meh tah in kwahn deh-na meh tash"]],["Arabic(Egyptian)",["أهلاً و سهلاً"]],["Arabic(Lebanese)",["Ahla w sahla"]],["Arabic(Modern Standard)",["أهلاً و سهلاً"]],["Aragonese",["Bienveniu","Bienvenius"]],["Armenian(Eastern)",["Բարի գալուստ!"]],["Armenian(Western)",["Բարի՜ եկաք:"]],["Aromanian",["Ghini vinit! Ghini vinishi!"]],["Assamese",["আদৰিণ"]],["Asturian",["Bienveníu","Bienvenida"]],["Azeri",["Xoş gəlmişsiniz!"]],["Banjar",["Salamat datang"]],["Basque",["Ongi etorri"]],["Batak",["Menjuah-juah! Horas!"]],["Belarusian",["Вiтаем","Прывiтанне"]],["Bengali",["স্বাগতম"]],["Bhojpuri",["स्वागत बा"]],["Bosnian",["Dobrodošli"]],["Breton",["Degemer mat"]],["Bulgarian",["Добре дошъл","Добре дошла","Добре дошли","Добре заварил","Добре заварила","Добре заварили","Добре дошло"]],["Catalan",["Benvingut ,","Benvinguda ,","Benvinguts ,","Benvingudes"]],["Cebuano",["Maayong pag-abot"]],["Chamorro",["Bien binidu","Buen binidu"]],["Chechen",["марша вагIийла хьо","марша йагIийла хьо","марша дагIийла шу"]],["Cherokee",["ᎤᎵᎮᎵᏍᏗ"]],["Chinese(Cantonese)",["歡迎"]],["Chinese(Hakka)",["歡迎"]],["Chinese(Mandarin)",["歡迎光臨 [欢迎光临]"]],["Chinese(Shanghainese)",["欢迎"]],["Chinese(Taiwanese)",["歡迎光臨"]],["Cornish",["Dynnargh dhis Dynnargh dhywgh"]],["Cuyonon",["Malipayeng Pag-abot!"]],["Czech",["Vítáme tĕ","Vítáme vás"]],["Danish",["Velkommen"]],["Dutch",["Welkom"]],["Efik",["A me di o"]],["Esperanto",["Bonvenon"]],["Estonian",["Tere tulemast"]],["Ewe",["Woezor"]],["Faroese",["Vælkomin"]],["Finnish",["Tervetuloa"]],["French",["Bienvenue"]],["Frisian (North)",["Wäljkiimen"]],["Frisian (West)",["Wolkom"]],["Friulian",["Agradît","Benvignût"]],["Galician",["Benvido","Benvida"]],["German",["Willkommen"]],["Georgian",["კეთილი იყოს თქვენი","შენი მობრძანება"]],["Greek (Modern)",["Καλώς Ορίσατε","Καλώς Όρισες","Καλώς Ήλθατε","Καλώς Ήλθες","Καλώς Ήρθατε","Καλώς 
Ήρθες"]],["Greenlandic",["Tikilluarit","Tikilluaritsi"]],["Guarani",["Eguahé porá"]],["Gujarati",["પધારો"]],["Haitian Creole",["Byen venu","V byenvini","N bèlantre"]],["Hausa",["Sannu da zuwa"]],["Hawaiian",["Aloha"]],["Hebrew",["ברוך הבא","ברוכים הבאים"]],["Hindi",["स्वागत","सवागत हैं"]],["Hungarian",["Üdvözlet; Isten hozott ; Isten hozta"]],["Icelandic",["Velkomin","Velkominn"]],["Igbo",["nno; dalụ"]],["Indonesian",["Selamat datang"]],["Iñupiaq",["Qaimarutin"]],["Inuktitut",["ᑐᙵᓱ"]],["Irish (Gaelic)",["Fáilte,","Tá fáilte romhat","romhaibh","Fáilte romhat isteach!","Céad míle fáilte"]],["Italian",["Benvenuto","Benvenuti","Benvenuta","Benvenute"]],["Japanese",["ようこそ"]],["Javanese",["Sugeng rawuh"]],["Jèrriais",["Séyiz les beinv'nu!"]],["Kannada",["ಸುಸ್ವಾಗತ"]],["Kazakh",["Қош келдіңіз!"]],["Khmer",["សូមស្វាគមន៍"]],["Kinyarwanda",["Murakaza neza"]],["Korean",["환영합니다"]],["Kurdish (Kurmanji)",["Be kher hati"]],["Kurdish (Sorani)",["Bi xêr bî"]],["Kyrgyz",["Kosh kelinizder"]],["Lakota Sioux",["Taŋyáŋ yahí Taŋyáŋ yahípi"]],["Lao",["ຍິນດີຕ້ອນຮັບ"]],["Latin",["Salve"]],["Latvian",["Laipni lūdzam"]],["Limburgish",["Wilkóm"]],["Lithuanian",["Sveiki atvykę"]],["Lojban",["coi ro do"]],["Lozi",["Mu amuhezwi"]],["Luxembourgish",["Wëllkomm"]],["Macedonian",["Добредојде","Добредојдовте"]],["Malagasy",["Tonga soa e"]],["Malay",["Selamat datang"]],["Malayalam",["സ്വാഗതം"]],["Maltese",["Merħba"]],["Manx (Gaelic)",["Failt, Failt royd"]],["Māori",["Haere mai","Nau mai"]],["Marathi",["स्वागत आहे"]],["Mazanderani",["خش بهمونی"]],["Mi'kmaq",["Weltasualuleg"]],["Mongolian",["Тавтай морилогтун"]],["Nahuatl",["Ximopanōltih"]],["Ndebele (Northern)",["Siyalemukela"]],["Nepali",["स्वागतम्"]],["Newari / Nepal Bhasa",["लसकुस"]],["Norwegian",["Velkommen"]],["Occitan",["Benvengut!","Benvenguda!","Planvengut!","Planvenguda!"]],["Oriya",["Swaagata"]],["Papiamento",["Bon bini"]],["Pashto",["پخير‏"]],["Persian (Farsi)",["خوش آمدید"]],["Polish",["Witam Cię","Witamy Cię","Witam Was","Witamy 
Was","Witam","Witamy","Witaj","Witajcie"]],["Portuguese",["Bem-vindo", "Bem-vinda", "Bem-vindos"]],["Portuguese (Brazilian)",["Bem-vindo","Bem-vinda ,","Bem-vindos"]],["Punjabi",["ਜੀ ਆਇਆ ਨੂੰ।","جی آیاں نُوں"]],["Quechua",["Haykuykuy!"]],["Romanian",["Bine ai venit","Bine ați venit"]],["Romansh",["Bainvegni"]],["Russian",["Добро пожаловать!"]],["Samoan",["Afio mai","Susu mai","Maliu mai"]],["Sardinian(Logudorese)",["Ennidos"]],["Scots",["Walcome","Welcum"]],["Scottish Gaelic",["Fàilte","Ceud mìle fàilte"]],["Serbian",["Добродошли"]],["Sesotho",["Kena ka kgotso! Kenang ka kgotso!"]],["Shona",["Mauya"]],["Sicilian",["Binvinutu"]],["Sinhala",["සාදරයෙන් පිලිගන්නවා"]],["Slovak",["Vitaj","Vitajte"]],["Slovenian",["Dobrodošli"]],["Somali",["Soo dhowow"]],["Spanish",["Bienvenido","Bienvenidos"]],["Swahili",["Karibu Karibuni"]],["Swedish",["Välkommen","Välkomna"]],["Swiss German",["Wilkomme"]],["Tagalog",["Maligayang pagdating","Mabuhay"]],["Tahitian",["Maeva","Mānava"]],["Tamil",["வாங்க"]],["Tatar",["Räxim itegez"]],["Telugu",["సుస్వాగతం"]],["Tetum",["Ksolok Bodik Mai","Bemvindu"]],["Thai",["ยินดีต้อนรับ"]],["Tibetan",["ཕེབས་པར་དགའ་བསུ་ཞུ།"]],["Tigrinya",["መርሓባ","እንቋዕ ብደሐን መጻእካ m","እንቋዕ ብደሐን መጻእኪ f","እንቋዕ ብድሐን ጸናሕካ m","እንቋዕ ብድሐን ጸናሕኪ f"]],["Tok Pisin",["Welkam"]],["Tongan",["Talitali fiefia"]],["Tswana",["O amogetswe Le amogetswe"]],["Turkish",["Hoş geldin Hoş geldiniz"]],["Ukrainian",["Ласкаво просимо","Вітаємо"]],["Urdu",["خوش آمديد"]],["Uzbek",["Xush kelibsiz"]],["Venetian",["Benvignùo","Benvegnù","Benvegnesto","Benvegnùa","Benvegnesta","Benvegnùi","Benvegnesti","Benvegnùe","Benvegneste"]],["Vietnamese",["Hoan nghênh","Được tiếp đãi ân cần"]],["Volapük",["Vekömö"]],["Võro",["Tere tulõmast"]],["Walloon",["Benvnuwe"]],["Welsh",["Croeso"]],["Wolof",["Merhbe"]],["Xhosa",["Siya namkela nonke"]],["Yiddish",["sg","ברוך־הבא","pl","ברוכים־הבאים"]],["Yorùbá",["Ẹ ku abọ"]],["Yucatec Maya",["Kíimak 'oolal"]],["Zulu",["Ngiyakwemukela","Ngiyanemukela"]]]
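
For the greeter use case mentioned at the top, the array of [language, phrases] pairs above could be consumed roughly as follows. This is only a minimal sketch: `load_welcomes` and `greet` are hypothetical helper names, the fallback string is an assumption, and the inline `SAMPLE` is a short excerpt in the same shape as the data rather than the full file.

```ruby
require 'json'

# Short excerpt in the same shape as welcome.json:
# an array of [language, [phrase, ...]] pairs.
SAMPLE = '[["Afrikaans",["Welkom"]],' \
         '["Italian",["Benvenuto","Benvenuti"]],' \
         '["Zulu",["Ngiyakwemukela","Ngiyanemukela"]]]'

# Hypothetical helper: parse the JSON text and turn the pairs
# into a Hash keyed by language name.
def load_welcomes(json)
  JSON.parse(json).to_h
end

# Hypothetical helper: return the first phrase listed for a language,
# or the fallback when the language is unknown or has no phrases.
def greet(welcomes, lang, fallback = 'Welcome')
  welcomes.fetch(lang, []).first || fallback
end

welcomes = load_welcomes(SAMPLE)
puts greet(welcomes, 'Italian')
puts greet(welcomes, 'Klingon')
```

To run this against the real data, assuming welcome.json has been downloaded into the working directory, pass `File.read('welcome.json')` to `load_welcomes` instead of `SAMPLE`.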