@vthacker
Created February 4, 2013 05:15
Word Delimiter Token Filter
curl -XPUT localhost:9200/test -d '{
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1,
    "analysis": {
      "filter": {
        "myWordDelimiter": {"type": "word_delimiter"}
      },
      "analyzer": {
        "myAnalyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": ["lowercase", "myWordDelimiter", "stop", "unique"]
        }
      }
    }
  }
}'
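The custom `myWordDelimiter` filter above relies entirely on defaults. The `word_delimiter` type also accepts explicit options such as `generate_word_parts`, `generate_number_parts`, `split_on_case_change`, and `preserve_original`; a sketch of a filter with those options spelled out (the index name `test3` and the chosen values are illustrative, not part of the original gist):

```shell
# Hypothetical variant: same custom filter, but with word_delimiter
# options set explicitly instead of relying on defaults.
curl -XPUT localhost:9200/test3 -d '{
  "settings": {
    "analysis": {
      "filter": {
        "myWordDelimiter": {
          "type": "word_delimiter",
          "generate_word_parts": true,
          "generate_number_parts": true,
          "split_on_case_change": true,
          "preserve_original": true
        }
      },
      "analyzer": {
        "myAnalyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": ["lowercase", "myWordDelimiter", "stop", "unique"]
        }
      }
    }
  }
}'
```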
curl -XPUT localhost:9200/test2 -d '{
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1,
    "analysis": {
      "filter": {
        "myWordDelimiter": {"type": "word_delimiter"}
      },
      "analyzer": {
        "myAnalyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": ["lowercase", "word_delimiter", "stop", "unique"]
        }
      }
    }
  }
}'
curl -XGET 'localhost:9200/test/_analyze?analyzer=myAnalyzer&pretty=true' -d 'world-class player wouldn\u0027t'
curl -XGET 'localhost:9200/test2/_analyze?analyzer=myAnalyzer&pretty=true' -d 'world-class player wouldn\u0027t'
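One quoting detail worth noting: inside shell single quotes, `\u0027` is not an escape sequence, so Elasticsearch receives the six literal characters `\u0027` — which is why the second analysis below emits the fragments `u`, `0027`, and `t`. To send an actual apostrophe from bash or zsh, ANSI-C quoting (`$'...'`) can be used instead (a sketch of the quoting difference, independent of Elasticsearch):

```shell
# Single quotes pass \u0027 through verbatim (six characters).
literal='wouldn\u0027t'
# ANSI-C quoting interprets \' as a real apostrophe.
ansi=$'wouldn\'t'
echo "$literal"   # -> wouldn\u0027t
echo "$ansi"      # -> wouldn't
```

With that, the analyze call could be issued as `curl -XGET 'localhost:9200/test2/_analyze?analyzer=myAnalyzer&pretty=true' -d $'world-class player wouldn\'t'` to test an actual apostrophe rather than the literal escape.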

vthacker commented Feb 4, 2013

varun@varun:~$ curl -XGET 'localhost:9200/test/_analyze?analyzer=myAnalyzer&pretty=true' -d 'world-class player wouldn\u0027t'
{
  "tokens" : [ {
    "token" : "player",
    "start_offset" : 12,
    "end_offset" : 18,
    "type" : "word",
    "position" : 2
  } ]
}

varun@varun:~$ curl -XGET 'localhost:9200/test2/_analyze?analyzer=myAnalyzer&pretty=true' -d 'world-class player wouldn\u0027t'
{
  "tokens" : [ {
    "token" : "world",
    "start_offset" : 0,
    "end_offset" : 5,
    "type" : "word",
    "position" : 1
  }, {
    "token" : "class",
    "start_offset" : 6,
    "end_offset" : 11,
    "type" : "word",
    "position" : 2
  }, {
    "token" : "player",
    "start_offset" : 12,
    "end_offset" : 18,
    "type" : "word",
    "position" : 3
  }, {
    "token" : "wouldn",
    "start_offset" : 19,
    "end_offset" : 25,
    "type" : "word",
    "position" : 4
  }, {
    "token" : "u",
    "start_offset" : 26,
    "end_offset" : 27,
    "type" : "word",
    "position" : 5
  }, {
    "token" : "0027",
    "start_offset" : 27,
    "end_offset" : 31,
    "type" : "word",
    "position" : 6
  }, {
    "token" : "t",
    "start_offset" : 31,
    "end_offset" : 32,
    "type" : "word",
    "position" : 7
  } ]
}