@jmealo
Created August 6, 2015 14:33
#!/bin/bash
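# Stop if the tsearch_data directory is missing, so nothing is downloaded into the wrong place
cd /usr/share/postgresql/9.4/tsearch_data || exit 1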
wget https://stop-words.googlecode.com/files/stop-words-collection-2011.11.21.zip
unzip stop-words-collection-2011.11.21.zip
wget http://src.chromium.org/svn/trunk/deps/third_party/hunspell_dictionaries/en_US.dic
wget http://src.chromium.org/svn/trunk/deps/third_party/hunspell_dictionaries/en_US.dic_delta
wget http://src.chromium.org/svn/trunk/deps/third_party/hunspell_dictionaries/en_US.aff -O en_us.affix
# Remove the first line of the Hunspell .dic file (the word count)
sed -i 1d en_US.dic
# Concat the dic and dic_delta, sort alphabetically and remove the leading blank line (leaves the ending newline intact)
cat en_US.dic en_US.dic_delta | sort > en_us.dict
sed -i 1d en_us.dict
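# Merge the Google stop words into english.stop, keeping one copy of each word.
# Write to a temp file first: redirecting straight into english.stop would truncate it before it is read.
cat stop-words/stop-words-english3-google.txt english.stop | sort -u > english.stop.tmp && mv english.stop.tmp english.stop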
# clean up
rm -rf stop-words-collection-2011.11.21.zip stop-words project-information.txt en_US*
chown -R postgres:postgres *
sudo -u postgres psql -c "CREATE TEXT SEARCH DICTIONARY ispell_en_us (template = ispell, dictfile = en_us, afffile = en_us, stopwords = english);"
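
The stop-word merge step is the easiest one to get wrong: a shell redirect into a file that is also being read empties it first, and `uniq -u` drops every word that appears in more than one list instead of keeping one copy. A minimal sketch of a safer merge follows. The function name `merge_stop_words` is hypothetical and not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): merge one or more
# stop-word lists into a single sorted, de-duplicated file.
# Usage: merge_stop_words OUTPUT INPUT [INPUT...]
# OUTPUT may also be one of the inputs.
merge_stop_words() {
    local out="$1"
    shift
    local tmp
    tmp="$(mktemp)" || return 1
    # sort -u keeps exactly one copy of each word.
    # LC_ALL=C gives a byte-wise ordering that does not depend on the system locale.
    if LC_ALL=C sort -u "$@" > "$tmp"; then
        # All inputs have been fully read by now, so overwriting OUTPUT is safe.
        # Using cat instead of mv keeps the owner and mode of an existing OUTPUT.
        cat "$tmp" > "$out"
    fi
    local status=$?
    rm -f "$tmp"
    return "$status"
}
```

In the script above it would be called as `merge_stop_words english.stop stop-words/stop-words-english3-google.txt english.stop`, before the clean-up step removes the `stop-words` directory.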