@tboeghk
Created August 18, 2014 13:32
Downloads and creates a German word list for use in Solr
# download the raw word list and save it as ./wortliste
wget -O wortliste "http://repo.or.cz/w/wortliste.git/blob_plain/master:/wortliste"
# easy part: take the first ";"-separated field of every line
awk -F ';' '{print $1;}' wortliste > list_words_dictionary.de.txt
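# harder part: split compound entries at "=" and append the lowercased parts ("-" is last in the tr set so it stays a literal)
grep -v ";-2-;" wortliste | awk -F ';' '{print $2;}' | tr -d '[·|<>]-' | grep "=" |
  awk -F '=' '{print $1"\n"$2"\n"$3"\n"$4;}' | grep . | sort -u |
  tr '[:upper:]' '[:lower:]' >> list_words_dictionary.de.txt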
# reset: reuse the name wortliste for the combined list (this overwrites the download)
mv list_words_dictionary.de.txt wortliste
# lowercase the whole list
tr '[:upper:]' '[:lower:]' < wortliste > list_words_dictionary.de.txt
# generate one SQL insert statement per word
cat list_words_dictionary.de.txt |awk -F '*' '{print "insert into lists (type, entry, modified_by, modified_at) values ('\''words-dictionary'\'', '\''"$1"'\'', '\''TorstenKoester'\'', now());";}' > list_words_dictionary.de.sql
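
The compound-splitting step is easier to check in isolation on a few lines than on the full download. The following is a minimal sketch, not the script's exact pipeline: `split_compounds` is a hypothetical helper name, the sample lines are made up in the `word;hyphenated form` shape that the script's field handling implies, and `tr '=' '\n'` is swapped in for the fixed four-field `awk` print so that compounds with any number of parts are handled.

```shell
#!/bin/sh
# Hypothetical helper: reads "word;hyphenated=form" lines on stdin and
# prints the lowercased compound parts, one per line, without duplicates.
split_compounds() {
  awk -F ';' '{ print $2 }' |      # keep the hyphenated form (second field)
    tr -d '[·|<>]-' |              # drop hyphenation markers; "-" last = literal
    grep '=' |                     # keep only entries with a compound boundary
    tr '=' '\n' |                  # one compound part per line
    grep . |                       # drop empty lines
    tr '[:upper:]' '[:lower:]' |   # lowercase before de-duplicating
    sort -u
}

# Made-up sample input, not taken from the real wortliste file.
printf '%s\n' \
  'Fernsehturm;Fern<seh=turm' \
  'Haus;Haus' \
  'Autobahnkreuz;Au-to=bahn=kreuz' | split_compounds
```

Lowercasing before `sort -u` means parts that differ only in case collapse into one entry, which the original order (`sort -u` first, then `tr`) does not guarantee.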