@benediktarnold
Created April 3, 2017 10:56
Reproduce a bug in Elasticsearch regarding the html_strip char filter in combination with highlighting
# Start from a clean slate: drop the test index if it exists.
curl -XDELETE http://localhost:9200/test
# Create the index with a custom analyzer that strips HTML before tokenizing.
curl -XPUT http://localhost:9200/test -H 'Content-Type: application/json' -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "char_filter": ["my_char_filter"]
        }
      },
      "char_filter": {
        "my_char_filter": {
          "type": "html_strip"
        }
      }
    }
  },
  "mappings": {
    "document": {
      "properties": {
        "text": { "type": "text", "analyzer": "my_analyzer" }
      }
    }
  }
}'
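# Index three documents that contain the same token inside different HTML markup.
curl -XPUT http://localhost:9200/test/document/1 -H 'Content-Type: application/json' -d '{"text": "<a href=mytoken>mytoken</a>"}'
curl -XPUT http://localhost:9200/test/document/2 -H 'Content-Type: application/json' -d '{"text": "mytoken<br> lorem ipsum"}'
curl -XPUT http://localhost:9200/test/document/3 -H 'Content-Type: application/json' -d '{"text": "<h1>mytoken</h1> lorem ipsum dolor sit"}'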
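# Make the documents searchable right away instead of waiting for the periodic refresh.
curl -XPOST http://localhost:9200/test/_refresh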
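# Search the test index for the token and request highlighting on the text field.
curl http://localhost:9200/test/_search -H 'Content-Type: application/json' -d '{
  "query": {
    "match": { "text": "mytoken" }
  },
  "highlight": {
    "fields": {
      "text": {}
    }
  }
}' | json_pp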
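
To help narrow down where the highlighting goes wrong, the analyzer's output can be inspected directly. The sketch below assumes the test index created above still exists. It uses the _analyze API to print the tokens and character offsets that my_analyzer produces for the text of document 3. The ES_URL and ANALYZE_BODY variable names are not part of the original script and are introduced here for illustration only.

```shell
# Base URL of the node; defaults to the address used in the script above.
ES_URL="${ES_URL:-http://localhost:9200}"

# Same text as document 3, run through the custom analyzer from the index settings.
ANALYZE_BODY='{"analyzer": "my_analyzer", "text": "<h1>mytoken</h1> lorem ipsum dolor sit"}'

# Print each token together with its start_offset and end_offset.
curl -s -XGET "$ES_URL/test/_analyze" -H 'Content-Type: application/json' -d "$ANALYZE_BODY" | json_pp
```

Comparing the reported start_offset and end_offset of mytoken with the position of the highlight tags in the search response may show whether the two disagree.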