Created
September 16, 2011 12:44
Proximity and phrase search highlighting using fast-vector-highlighter vs. highlighter
1) Indexing with default settings (automatic index creation and type mapping -> default/plain highlighter)
curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '{
    "user" : "kimchy",
    "post_date" : "2009-11-15T14:12:12",
    "message" : "trying out Elastic and Search"
}'
curl -XPUT 'http://localhost:9200/twitter/tweet/2' -d '{
    "user" : "kimchy",
    "post_date" : "2009-11-15T14:12:12",
    "message" : "trying out Search and Elastic"
}'
Proximity search:
curl -XGET 'http://localhost:9200/twitter/tweet/_search' -d '{
    "query" : {
        "query_string" : {
            "default_field" : "message",
            "query" : "\"Elastic Search\"~4"
        }
    },
    "highlight" : {
        "fields" : {
            "message" : {}
        }
    }
}
'
Result (both matches have highlighting snippets):
{
  "took": 320,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.18985549,
    "hits": [
      {
        "_index": "twitter",
        "_type": "tweet",
        "_id": "1",
        "_score": 0.18985549,
        "_source": {
          "user": "kimchy",
          "post_date": "2009-11-15T14:12:12",
          "message": "trying out Elastic and Search"
        },
        "highlight": {
          "message": [
            "trying out <em>Elastic</em> and <em>Search</em>"
          ]
        }
      },
      {
        "_index": "twitter",
        "_type": "tweet",
        "_id": "2",
        "_score": 0.13424811,
        "_source": {
          "user": "kimchy",
          "post_date": "2009-11-15T14:12:12",
          "message": "trying out Search and Elastic"
        },
        "highlight": {
          "message": [
            "trying out <em>Search</em> and <em>Elastic</em>"
          ]
        }
      }
    ]
  }
}
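Why both documents match with a slop of 4 can be checked with a little arithmetic. The sketch below is a simplified model of Lucene's sloppy phrase matching for a two-term query, not the actual implementation. The token positions are an assumption (default analyzer, "and" dropped as a stop word but its position kept), and `sloppy_width` is a hypothetical helper name.

```shell
# Simplified model: a two-term phrase "A B"~slop matches when
# |(posA - 0) - (posB - 1)| <= slop, where 0 and 1 are the positions
# of A and B inside the query itself.
sloppy_width() {
    # $1 = position of "elastic" in the document, $2 = position of "search"
    d=$(( $1 - ($2 - 1) ))
    if [ "$d" -lt 0 ]; then d=$(( 0 - d )); fi
    echo "$d"
}

# Assumed positions: trying=0 out=1 <term>=2 (and=3, dropped) <term>=4
doc1=$(sloppy_width 2 4)   # "trying out Elastic and Search"
doc2=$(sloppy_width 4 2)   # "trying out Search and Elastic"
echo "doc 1 needs slop $doc1, doc 2 needs slop $doc2"
```

Under these assumptions both widths stay within the slop of 4, which is consistent with the two hits returned above; the reversed order in document 2 simply costs more slop.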
Exact phrase search:
curl -XGET 'http://localhost:9200/twitter/tweet/_search' -d '{
    "query" : {
        "query_string" : {
            "default_field" : "message",
            "query" : "\"Elastic and Search\""
        }
    },
    "highlight" : {
        "fields" : {
            "message" : {}
        }
    }
}
'
Result (highlighting returned, present on non-stop words):
{
  "took": 250,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.30685282,
    "hits": [
      {
        "_index": "twitter",
        "_type": "tweet",
        "_id": "1",
        "_score": 0.30685282,
        "_source": {
          "user": "kimchy",
          "post_date": "2009-11-15T14:12:12",
          "message": "trying out Elastic and Search"
        },
        "highlight": {
          "message": [
            "trying out <em>Elastic</em> and <em>Search</em>"
          ]
        }
      }
    ]
  }
}
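The second setup creates the index explicitly, so the index that was auto-created in part 1 presumably has to be removed first; the gist does not show this step. A sketch, assuming the same local node (`ES_URL` and `index_url` are variables introduced here for illustration):

```shell
# Base URL of the node; defaults to the address used throughout this gist.
ES_URL="${ES_URL:-http://localhost:9200}"
index_url="$ES_URL/twitter/"

# Delete Index API: removes the index together with its documents and mappings.
curl -s --max-time 5 -XDELETE "$index_url" || echo "could not reach $ES_URL"
```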
2) Indexing with "term_vector" ("term_vector" : "with_positions_offsets" -> fast-vector-highlighter)
curl -XPUT 'http://localhost:9200/twitter/'
curl -XPUT 'http://localhost:9200/twitter/tweet/_mapping' -d '{
    "tweet": {
        "properties": {
            "message": {
                "type": "string",
                "store": "yes",
                "term_vector": "with_positions_offsets"
            }
        }
    }
}'
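Since the fast-vector-highlighter only comes into play when the field carries term vectors with positions and offsets, it can be worth confirming that the mapping was stored before indexing any documents. A sketch using the Get Mapping API, assuming the same local node (`ES_URL` and `mapping_url` are names introduced here for illustration):

```shell
# Base URL of the node; defaults to the address used throughout this gist.
ES_URL="${ES_URL:-http://localhost:9200}"
mapping_url="$ES_URL/twitter/tweet/_mapping?pretty=true"

# The response should list "term_vector" : "with_positions_offsets" under "message".
curl -s --max-time 5 -XGET "$mapping_url" || echo "could not reach $ES_URL"
```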
curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '{
    "user" : "kimchy",
    "post_date" : "2009-11-15T14:12:12",
    "message" : "trying out Elastic and Search"
}'
curl -XPUT 'http://localhost:9200/twitter/tweet/2' -d '{
    "user" : "kimchy",
    "post_date" : "2009-11-15T14:12:12",
    "message" : "trying out Search and Elastic"
}'
Proximity search:
curl -XGET 'http://localhost:9200/twitter/tweet/_search' -d '{
    "query" : {
        "query_string" : {
            "default_field" : "message",
            "query" : "\"Elastic Search\"~4"
        }
    },
    "highlight" : {
        "fields" : {
            "message" : {}
        }
    }
}
'
Result (both documents match, but only the one whose terms appear in query order has a highlighting snippet):
{
  "took": 273,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.2169777,
    "hits": [
      {
        "_index": "twitter",
        "_type": "tweet",
        "_id": "1",
        "_score": 0.2169777,
        "_source": {
          "user": "kimchy",
          "post_date": "2009-11-15T14:12:12",
          "message": "trying out Elastic and Search"
        },
        "highlight": {
          "message": [
            "g out <em>Elastic</em> and <em>Search</em> "
          ]
        }
      },
      {
        "_index": "twitter",
        "_type": "tweet",
        "_id": "2",
        "_score": 0.15342641,
        "_source": {
          "user": "kimchy",
          "post_date": "2009-11-15T14:12:12",
          "message": "trying out Search and Elastic"
        }
      }
    ]
  }
}
Exact phrase search:
curl -XGET 'http://localhost:9200/twitter/tweet/_search' -d '{
    "query" : {
        "query_string" : {
            "default_field" : "message",
            "query" : "\"Elastic and Search\""
        }
    },
    "highlight" : {
        "fields" : {
            "message" : {}
        }
    }
}
'
Result (the document matches, but no highlighting is returned at all):
{
  "took": 11,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.30685282,
    "hits": [
      {
        "_index": "twitter",
        "_type": "tweet",
        "_id": "1",
        "_score": 0.30685282,
        "_source": {
          "user": "kimchy",
          "post_date": "2009-11-15T14:12:12",
          "message": "trying out Elastic and Search"
        }
      }
    ]
  }
}
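The four searches above differ only in the query string, so rerunning the comparison after a mapping change is easier with a small wrapper. This is a convenience sketch, not part of the original reproduction; `build_body`, `ES_URL`, and the two `body_*` variables are hypothetical names, and the node address is assumed to be the one used throughout.

```shell
# Base URL of the node; defaults to the address used throughout this gist.
ES_URL="${ES_URL:-http://localhost:9200}"

# Prints the search body used above for a given query_string query.
# $1 must already carry JSON-escaped quotes, e.g. \"Elastic Search\"~4
build_body() {
    printf '{"query":{"query_string":{"default_field":"message","query":"%s"}},"highlight":{"fields":{"message":{}}}}\n' "$1"
}

body_proximity=$(build_body '\"Elastic Search\"~4')
body_phrase=$(build_body '\"Elastic and Search\"')

# Run both searches against the same endpoint and print the raw responses.
for body in "$body_proximity" "$body_phrase"; do
    curl -s --max-time 5 -XGET "$ES_URL/twitter/tweet/_search?pretty=true" -d "$body" \
        || echo "could not reach $ES_URL"
done
```

Running it once against the default mapping and once against the "term_vector" mapping reproduces the four result sets in one pass each.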