@dingsdax
Created August 11, 2011 20:51
Jim Breen's WWWJDIC regexp
# source: http://stackoverflow.com/questions/3002650/parsing-dictionary-entries-with-regex
#
# example_url: http://www.csse.monash.edu.au/~jwb/cgi-bin/wwwjdic.cgi?1ZUJ%E5%85%88%E7%94%9F
# output:
# 先生 [せんせい] /(n) (1) teacher/master/doctor/(suf) (2) with names of teachers, etc. as an honorific/(P)/
# 先生に就く [せんせいにつく] /(exp,v5k) to study under (a teacher)/
# 先生の述 [せんせいのじゅつ] /(n) teachers statement (expounding)/
# 先生方 [せんせいがた] /(n) doctors/teachers/
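# regexp: headword, then the reading in [brackets], then everything between the first and last "/"
# `entries` is expected to be an array of output lines like the ones above
# flat_map returns one flat list of hashes instead of a one-element array per line
dictionary = entries.flat_map do |entry|
  entry.scan(/(.*) \[(.*)\] \/(.*)\//).map do |(headword, kana, definition)|
    { headword: headword, kana: kana, definition: definition }
  end
end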
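
The snippet above assumes `entries` is already defined. As a minimal sketch of how it could be run end to end, the following feeds the pattern two of the sample lines quoted in the header comments. The helper name `parse_entry`, the constant `ENTRY_PATTERN`, and the `glosses` key are hypothetical and not part of the original gist. Splitting the definition on "/" is also an assumption: it treats every slash as a gloss separator, which may not hold for every WWWJDIC entry.

```ruby
# Sample lines copied from the output quoted in the header comments;
# a real run would read them from the WWWJDIC response instead.
entries = [
  "先生 [せんせい] /(n) (1) teacher/master/doctor/(suf) (2) with names of teachers, etc. as an honorific/(P)/",
  "先生方 [せんせいがた] /(n) doctors/teachers/"
]

# Same pattern as in the gist: headword, reading in brackets, slash-delimited definition.
ENTRY_PATTERN = /(.*) \[(.*)\] \/(.*)\//

# Hypothetical helper: returns a hash for a matching line, or nil when the
# line does not have the "headword [reading] /definition/" shape.
def parse_entry(line)
  match = ENTRY_PATTERN.match(line)
  return nil unless match

  headword, kana, definition = match.captures
  # Assumption: every "/" inside the definition separates two glosses.
  { headword: headword, kana: kana, glosses: definition.split("/") }
end

# Drop lines that did not match instead of keeping nil placeholders.
dictionary = entries.map { |line| parse_entry(line) }.compact

dictionary.each do |entry|
  puts "#{entry[:headword]} (#{entry[:kana]}): #{entry[:glosses].join('; ')}"
end
```

Lines without a bracketed reading do not match this pattern and are skipped here, so a caller that needs those entries would have to handle them separately.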