@Alix-Martin
Created December 18, 2014 13:07
Elasticsearch analyzer: invalid stopword syntax that should work
# Remove old data
curl -XDELETE "http://localhost:9200/french"
# Create the index with a custom analyzer.
# This syntax is extrapolated from
# http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/using-stopwords.html
#
# but an analyzer of type "custom" apparently ignores a top-level "stopwords"
# setting; the stopwords have to be defined as a token filter instead.
curl -XPOST "http://localhost:9200/french/" -d '
{
  "settings": {
    "index": {
      "analysis": {
        "tokenizer": {
          "host_tokenizer": {
            "type": "pattern",
            "pattern": "[a-zA-Z0-9]+",
            "group": 0
          }
        },
        "analyzer": {
          "host_analyzer": {
            "type": "custom",
            "tokenizer": "host_tokenizer",
            "stopwords": ["www", "fr", "com"]
          }
        }
      }
    }
  }
}'
# Analyze a sample host name: the stopwords "www", "fr", and "com" are not removed
curl -XGET 'http://localhost:9200/french/_analyze?analyzer=host_analyzer' -d 'www.google.fr.com'
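# A possible fix, sketched under the assumption stated above: a "custom"
# analyzer only picks up stopwords through a token filter of type "stop".
# The filter name "host_stop" is made up for this example, and the request
# syntax mirrors the Elasticsearch 1.x-style calls used in this script.
curl -XDELETE "http://localhost:9200/french"
curl -XPUT "http://localhost:9200/french/" -d '
{
  "settings": {
    "index": {
      "analysis": {
        "tokenizer": {
          "host_tokenizer": {
            "type": "pattern",
            "pattern": "[a-zA-Z0-9]+",
            "group": 0
          }
        },
        "filter": {
          "host_stop": {
            "type": "stop",
            "stopwords": ["www", "fr", "com"]
          }
        },
        "analyzer": {
          "host_analyzer": {
            "type": "custom",
            "tokenizer": "host_tokenizer",
            "filter": ["host_stop"]
          }
        }
      }
    }
  }
}'
# With the "stop" filter attached, only the token "google" should remain.
# Note that the "stop" filter is case-sensitive by default, so "WWW" would
# still pass through unless a lowercase filter is added before it.
curl -XGET 'http://localhost:9200/french/_analyze?analyzer=host_analyzer' -d 'www.google.fr.com'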