@pulkitsinghal
Created February 28, 2012 13:59
Ramping Up on ElasticSearch Query DSL
// 1) Let's start with the simplest query, which you can run in
// the head plugin for ElasticSearch, located at the URL:
// http://localhost:9200/_plugin/head/
{
  "query": {
    "match_all": {}
  }
}
// 2) Before forming queries with the ES Query DSL syntax alone,
// let's try a few queries that stay within the comfort zone
// of those who are already familiar with the query syntax of
// Lucene and/or Solr
{
  "query": {
    "query_string": {
      "query": "+camera +laptop",
      "use_dis_max": true
    }
  }
}
// 3) Let's try it again with slightly different syntax. It returns
// exactly the same number of total hits as the previous query.
{
  "query": {
    "query_string": {
      "query": "camera AND laptop",
      "use_dis_max": true
    }
  }
}
// 4) What if we want to specify exactly which fields should be
// searched by the query? Here's how:
{
  "query": {
    "query_string": {
      "fields": [
        "name",
        "shortDescription"
      ],
      "query": "+camera +laptop",
      "use_dis_max": true
    }
  }
}
// 5) Someone who is not yet well-versed in JSON may ask: what if I want
// to search inside just one field rather than several?
// Simple, list only that one field:
{
  "query": {
    "query_string": {
      "fields": [
        "name"
      ],
      "query": "+camera +laptop",
      "use_dis_max": true
    }
  }
}
// 6) That's all good but now we want to limit the fields coming back
// in the response. It would be nice to specify exactly the fields
// that we want to retrieve:
{
  "fields": [
    "name",
    "shortDescription",
    "longDescription"
  ],
  "query": {
    "query_string": {
      "fields": [
        "name"
      ],
      "query": "+camera +laptop",
      "use_dis_max": true
    }
  }
}
// 7) Now let's try something different: a range facet that groups
// our results so far into price buckets.
//
// WARNING: Make sure to POST to the exact index:
// http://localhost:9200/my_index/_search/
// and not against any and all indices:
// http://localhost:9200/_search/
// Otherwise, other indices that don't share the same fields may
// cause an exception like the following:
// org.elasticsearch.search.facet.FacetPhaseExecutionException:
// Facet [range1]: No mapping found for key_field [regularPrice]
{
  "fields": [
    "name",
    "upc",
    "salePrice",
    "regularPrice"
  ],
  "query": {
    "query_string": {
      "fields": [
        "name"
      ],
      "query": "+camera +laptop",
      "use_dis_max": true
    }
  },
  "facets": {
    "range1": {
      "range": {
        "regularPrice": [
          {
            "to": 100
          },
          {
            "from": 100,
            "to": 200
          },
          {
            "from": 200,
            "to": 300
          },
          {
            "from": 300
          }
        ]
      }
    }
  }
}
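// To get a feel for which bucket a given price lands in, the range
// logic can be mimicked client-side. This is only a sketch, not ES
// code: it assumes "from" is inclusive and "to" is exclusive, and the
// sample prices and function names are made up for illustration.

```python
# Same ranges as the "range1" facet in the request above.
RANGES = [
    {"to": 100},
    {"from": 100, "to": 200},
    {"from": 200, "to": 300},
    {"from": 300},
]


def bucket_index(price, ranges=RANGES):
    """Return the index of the first range containing price, or None.

    Assumption: "from" is inclusive and "to" is exclusive.
    """
    for i, r in enumerate(ranges):
        lower_ok = "from" not in r or price >= r["from"]
        upper_ok = "to" not in r or price < r["to"]
        if lower_ok and upper_ok:
            return i
    return None


def count_by_range(prices, ranges=RANGES):
    """Count how many prices fall into each range."""
    counts = [0] * len(ranges)
    for price in prices:
        i = bucket_index(price, ranges)
        if i is not None:
            counts[i] += 1
    return counts


# Hypothetical regularPrice values, not taken from any real index.
sample_prices = [49.99, 100, 199.99, 250, 300, 1200]
counts = count_by_range(sample_prices)
print(counts)
```

// Under that assumption a price of exactly 100 counts toward the
// 100-200 bucket, not the bucket that ends at 100.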
// 8) The total number of results is simple to find, but what about getting
// the results themselves back? Retrieving them in an orderly, paginated
// manner is easy with "from" (the offset) and "size" (the page length):
//
// 8.1) Get the results starting from offset 0 and get a total of 2 entries
{
  "from": 0,
  "size": 2,
  "fields": [
    "name"
  ],
  "query": {
    "match_all": {}
  }
}
// 8.2) Get the results starting from offset 2 and get a total of 2 entries
{
  "from": 2,
  "size": 2,
  "fields": [
    "name"
  ],
  "query": {
    "match_all": {}
  }
}
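// The two requests above differ only in "from", so a small helper can
// build the body for any page. A minimal sketch: page_request and its
// zero-based page argument are my own naming, not part of ES.

```python
import json


def page_request(page, page_size, fields=("name",)):
    """Build a match_all search body for a zero-based page number."""
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size must be > 0")
    return {
        "from": page * page_size,  # offset of the first hit on this page
        "size": page_size,         # number of hits to return
        "fields": list(fields),
        "query": {"match_all": {}},
    }


first = page_request(0, 2)   # same body as 8.1
second = page_request(1, 2)  # same body as 8.2
print(json.dumps(second, indent=2))
```

// The resulting dictionaries can be serialized with json.dumps and
// POSTed to the _search URL of your index.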
// 9) If your index doesn't change often, then neither will the position of the
// results returned by a query. Results are ordered by the score assigned to
// each one, which reflects how closely Lucene/ES thinks it matches
// your query.
//
// Let's say you want random results, what then? How can you ask ES to change the
// score and return different results for every query? Here's how:
{
  "from": 0,
  "size": 2,
  "fields": [
    "name",
    "upc"
  ],
  "query": {
    "custom_score": {
      "query": {
        "match_all": {}
      },
      "script": "random()"
    }
  }
}
// In this example, the key and value facet fields are separated:
// the range buckets are filled based on the key field, while the
// averages etc. are calculated from the value field. The
// assumption here (I suppose) is that documents which have the
// key field present will also have the value field present (optimistic?)
{
  "fields": [
    "name",
    "upc",
    "salePrice",
    "regularPrice"
  ],
  "query": {
    "query_string": {
      "fields": [
        "name"
      ],
      "query": "+camera +laptop",
      "use_dis_max": true
    }
  },
  "facets": {
    "range1": {
      "range": {
        "key_field": "regularPrice",
        "value_field": "salePrice",
        "ranges": [
          {
            "to": 100
          },
          {
            "from": 100,
            "to": 200
          },
          {
            "from": 200,
            "to": 300
          },
          {
            "from": 300
          }
        ]
      }
    }
  }
}
// Histogram facets are a good fit when the ranges that you would specify
// happen to fall at regular intervals anyway.
{
  "fields": [
    "name",
    "upc",
    "salePrice",
    "regularPrice"
  ],
  "query": {
    "query_string": {
      "fields": [
        "name"
      ],
      "query": "+camera +laptop",
      "use_dis_max": true
    }
  },
  "facets": {
    "histo1": {
      "histogram": {
        "field": "regularPrice",
        "interval": 100
      }
    }
  }
}
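// The histogram buckets can be reasoned about the same way. A rough
// sketch, assuming each non-negative value is keyed by the start of
// its interval, i.e. floor(value / interval) * interval. The helper
// names and sample prices are made up for illustration.

```python
from collections import Counter


def histogram_key(value, interval):
    """Return the start of the interval that contains value.

    Assumption: values are non-negative, so floor division gives the
    bucket start.
    """
    return int(value // interval) * interval


def histogram(values, interval):
    """Count values per bucket key, with keys in ascending order."""
    counts = Counter(histogram_key(v, interval) for v in values)
    return dict(sorted(counts.items()))


# Hypothetical regularPrice values, not taken from any real index.
sample_prices = [49.99, 100, 199.99, 250, 300, 1200]
buckets = histogram(sample_prices, 100)
print(buckets)
```

// Unlike the explicit ranges used earlier, nothing collapses into a
// single "300 and up" bucket: a price of 1200 gets its own key.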
// The following uses the same buckets, but returns more metrics per bucket
// because the statistics (total, mean, etc.) are computed from value_field.
{
  "fields": [
    "name",
    "upc",
    "salePrice",
    "regularPrice"
  ],
  "query": {
    "query_string": {
      "fields": [
        "name"
      ],
      "query": "+camera +laptop",
      "use_dis_max": true
    }
  },
  "facets": {
    "histo1": {
      "histogram": {
        "key_field": "regularPrice",
        "value_field": "salePrice",
        "interval": 100
      }
    }
  }
}
// This is a really exciting example: we bucket our query results with
// a date histogram facet, which spares us from figuring out which items
// fall in which year by startDate.
// All the work is done by ES! Can you say $profit$ :)
{
  "fields": [
    "name",
    "upc",
    "salePrice",
    "regularPrice"
  ],
  "query": {
    "query_string": {
      "query": "+camera +laptop",
      "use_dis_max": true
    }
  },
  "facets": {
    "histo1": {
      "date_histogram": {
        "key_field": "startDate",
        "value_field": "startDate",
        "interval": "year"
      }
    }
  }
}
// What if you want to figure out whether there are really good sales
// going on in the area that you are interested in?
// Then statistical facets should get your motor started.
{
  "query": {
    "query_string": {
      "fields": [
        "name",
        "description"
      ],
      "query": "+camera +laptop",
      "use_dis_max": true
    }
  },
  "facets": {
    "stat1": {
      "statistical": {
        "field": "regularPrice"
      }
    },
    "stat2": {
      "statistical": {
        "field": "salePrice"
      }
    }
  }
}
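// To see how the two statistical facets could answer the "good sales"
// question, here is a client-side sketch of the kind of numbers they
// report (count, total, min, max, mean, sum_of_squares, variance,
// std_deviation). Assumptions: the variance is the population
// variance, and the prices below are invented for illustration.

```python
import math


def statistical(values):
    """Compute summary statistics similar to a statistical facet."""
    values = list(values)
    count = len(values)
    if count == 0:
        return {"count": 0}
    total = sum(values)
    mean = total / count
    sum_of_squares = sum(v * v for v in values)
    # Assumption: population variance, E[x^2] - (E[x])^2.
    variance = sum_of_squares / count - mean * mean
    return {
        "count": count,
        "total": total,
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "sum_of_squares": sum_of_squares,
        "variance": variance,
        "std_deviation": math.sqrt(max(variance, 0.0)),
    }


# Hypothetical prices for two matching products.
stat1 = statistical([200.0, 400.0])  # regularPrice
stat2 = statistical([150.0, 300.0])  # salePrice

# Average discount implied by the two means.
discount = 1 - stat2["mean"] / stat1["mean"]
print(f"average discount: {discount:.0%}")
```

// Comparing the mean of stat2 against the mean of stat1 gives a quick
// read on how deep the discounts are across the matching documents.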