amulyakashyap09/elastic_search_query.md

## elastic_search_query.md

      
Terminologies


We will use the following information throughout this article:


index_name : customers
index_type : personal
Each customer document has the fields name, age, gender, email, phone, address, city and state.

INFO Queries


List all the nodes present in the cluster
curl -XGET 'http://localhost:9200/_cat/nodes?v&pretty'
List all the indices present in the cluster
curl -XGET 'http://localhost:9200/_cat/indices?v&pretty'
Check the health of your Elasticsearch cluster
curl -XGET 'http://localhost:9200/_cat/health?v&pretty'

CRUD Queries


This section covers how to create and delete indexes, and how to create, read, update and delete documents in them. Each query is shown with its syntax and an example you can try.


Create a new index

Syntax
curl -XPUT 'http://localhost:9200/<index_name>?pretty'
Example
curl -XPUT 'http://localhost:9200/customers?pretty'


Delete an index

Syntax
curl -XDELETE 'http://localhost:9200/<index_name>?pretty'
Example
curl -XDELETE 'http://localhost:9200/customers?pretty'


Create a new document

Syntax
curl -XPUT 'http://localhost:9200/<index_name>/<_type>/<doc_uniq_id>?pretty' -H 'Content-Type: application/json' -d '{"key1":"val1","key2":"val2","key3":"val3"}'
Example
curl -XPUT 'http://localhost:9200/customers/personal/1?pretty' -H 'Content-Type: application/json' -d '{"name":"Amulya","age":25,"gender":"male","email":"amulyakashyap@gmail.com","phone":"9559004779","address":"Kurla Mumbai Maharashtra","city":"Mumbai","state":"Maharashtra"}'
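
Hand-typing JSON inside shell quotes is easy to get wrong, since JSON needs double quotes around keys and strings while the shell needs the whole body quoted as one argument. As a rough sketch (the helper name `curl_command` is made up for this illustration, and Python is used only to generate the command), the body can be serialized with `json.dumps` and quoted with `shlex.quote`:

```python
import json
import shlex

def curl_command(method, url, body=None):
    """Return a shell-safe curl command string for an Elasticsearch request."""
    parts = ["curl", "-X" + method, url]
    if body is not None:
        # json.dumps always emits double-quoted keys and strings, as JSON requires.
        parts += ["-H", "Content-Type: application/json", "-d", json.dumps(body)]
    # shlex.quote wraps each argument so the shell passes it through unchanged.
    return " ".join(shlex.quote(p) for p in parts)

customer = {"name": "Amulya", "age": 25, "city": "Mumbai", "state": "Maharashtra"}
command = curl_command("PUT", "http://localhost:9200/customers/personal/1?pretty", customer)
print(command)
```

The printed line can be pasted into a terminal once a node is running on localhost:9200.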


Retrieve a whole document

Syntax
curl -XGET 'http://localhost:9200/<index_name>/<_type>/<doc_uniq_id>?pretty'
Example
curl -XGET 'http://localhost:9200/customers/personal/1?pretty'


Retrieve a partial document (only selected fields)

Syntax
curl -XGET 'http://localhost:9200/<index_name>/<_type>/<_id>?pretty&_source=field1,field2,field3'
Example
curl -XGET 'http://localhost:9200/customers/personal/1?pretty&_source=name,age,gender'


Update a whole document

Syntax
curl -XPUT 'http://localhost:9200/<index_name>/<_type>/<doc_uniq_id>?pretty' -H 'Content-Type: application/json' -d '{"key1":"val2","key2":"val3","key3":"val4"}'
Example
curl -XPUT 'http://localhost:9200/customers/personal/1?pretty' -H 'Content-Type: application/json' -d '{"name":"Amulya","age":27,"gender":"male","email":"amulyakashyap09@gmail.com","phone":"9559974779","address":"Andheri Mumbai Maharashtra","city":"Mumbai","state":"Maharashtra"}'


Update a document partially (only specific fields)

Syntax
curl -XPOST 'http://localhost:9200/<index_name>/<_type>/<doc_uniq_id>/_update?pretty' -H 'Content-Type: application/json' -d '{"doc":{"<field_name>": <new_value>}}'
Example
curl -XPOST 'http://localhost:9200/customers/personal/1/_update?pretty' -H 'Content-Type: application/json' -d '{"doc":{"age": 27}}'


Delete a document

Syntax
curl -XDELETE 'http://localhost:9200/<index_name>/<_type>/<doc_uniq_id>?pretty'
Example
curl -XDELETE 'http://localhost:9200/customers/personal/1?pretty'


SCRIPT


You can also perform arithmetic on a field in an update request by using the script clause.


Syntax
curl -XPOST 'http://localhost:9200/<index_name>/<_type>/<doc_uniq_id>/_update?pretty' -H 'Content-Type: application/json' -d '{"script": "ctx._source.<field_name> <mathematical_operator> <value>"}'
Example
curl -XPOST 'http://localhost:9200/customers/personal/1/_update?pretty' -H 'Content-Type: application/json' -d '{"script":"ctx._source.age *= 2"}'

BULK


Bulk operations allow you to perform multiple operations in Elasticsearch in one request.


_MGET | Fetch documents from multiple indexes and types in one request


Multiple-index bulk fetch | A single _mget request can fetch documents from several indexes

Syntax
curl -XGET 'http://localhost:9200/_mget?pretty' -H 'Content-Type: application/json' -d '{"docs":[{"_index": <value>,"_type": <value>,"_id": <value>},{"_index": <value>,"_type": <value>,"_id": <value>}]}'


Single-index, multiple-type bulk fetch | When the index is given in the URL, each entry only needs its type and id

Syntax
curl -XGET 'http://localhost:9200/<index_name>/_mget?pretty' -H 'Content-Type: application/json' -d '{"docs": [{"_type": <value>,"_id": <value>},{"_type": <value>,"_id": <value>}]}'


Single-index, single-type bulk fetch | When both the index and the type are given in the URL, each entry only needs its id

Syntax
curl -XGET 'http://localhost:9200/<index_name>/<_type>/_mget?pretty' -H 'Content-Type: application/json' -d '{"docs": [{"_id": <value>},{"_id": <value>}]}'
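
The three _mget body shapes differ only in which keys each entry carries. A small sketch, assuming a hypothetical helper `mget_body` and a made-up second index `orders` with type `online`, that builds the `docs` array from (index, type, id) tuples:

```python
import json

def mget_body(docs):
    """Build an _mget body from (index, doc_type, doc_id) tuples.

    index or doc_type may be None when the URL already names them.
    """
    entries = []
    for index, doc_type, doc_id in docs:
        entry = {}
        if index is not None:
            entry["_index"] = index
        if doc_type is not None:
            entry["_type"] = doc_type
        entry["_id"] = doc_id
        entries.append(entry)
    return {"docs": entries}

# "orders" / "online" are illustrative names, not part of this article's data.
across_indexes = mget_body([("customers", "personal", "1"), ("orders", "online", "7")])
ids_only = mget_body([(None, None, "1"), (None, None, "2")])
print(json.dumps(across_indexes))
print(json.dumps(ids_only))
```

The resulting JSON strings can be passed to curl with -d, as in the syntax lines above.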


_BULK | Perform multiple operations in single request


Multi-Operation Query | A single _bulk request can mix different operations. The body is newline-delimited JSON: an action line, followed by a source line where the action needs one, with no enclosing braces and a newline after the last line

Syntax

curl -XPOST 'http://localhost:9200/<_index>/<_type>/_bulk?pretty' -H 'Content-Type: application/json' -d '
{"index": {"_id": <doc_id>}}
{<key1>:<value1>, <key2>:<value2>, <key3>:<value3>}
{"delete": {"_id": <doc_id>}}
{"create": {"_id": <doc_id>}}
{<key1>:<value1>, <key2>:<value2>, <key3>:<value3>}
{"update": {"_id": <doc_id>}}
{"doc": {<key1>:<value1>, <key2>:<value2>}}
'
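
A _bulk body is easiest to get right when it is generated: every action and every source document sits on its own line, and the body ends with a newline. The sketch below assumes a made-up helper `bulk_body` that takes (action, metadata, source) triples; it only builds the text and does not send anything.

```python
import json

def bulk_body(actions):
    """Serialize (action, metadata, source) triples to newline-delimited JSON."""
    lines = []
    for action, meta, source in actions:
        lines.append(json.dumps({action: meta}))
        if source is not None:  # "delete" has no source line
            lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"  # the trailing newline is required

body = bulk_body([
    ("index", {"_id": "1"}, {"name": "Amulya", "age": 25}),
    ("delete", {"_id": "2"}, None),
    ("update", {"_id": "1"}, {"doc": {"age": 27}}),
])
print(body, end="")
```

Note that an update action carries its changes under a doc key, just like the single-document _update request.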


Read bulk data from a JSON file | We will cover this in a future article.


SEARCHING COMPONENTS


Before querying Elasticsearch for records, we need to know a few things. Elasticsearch has many clauses that are combined in different ways to get the desired results. The two main ones are:


QUERY - Works on the concept of relevance scoring and returns the best-scoring documents first. It takes more time because a score is computed for each individual matching document. The higher the score, the more relevant the result.


Filters - A filter gives a yes/no answer on whether a document should be included in the results. Filters are faster than queries because they only check whether a document matches at all, not how well it matches. They suit structured data and checks such as range queries and exact matches.


The score calculation mentioned above is based on TF, IDF and FLN. We will cover these in a separate chapter; here is a brief overview of the terms:
TF - Term Frequency - How often does the term appear in the field?
- More often, more relevant
Example, searching for great:
1) Amulya is a great person
2) Amulya is a great and really great and super great person
- Output:
- TF for Statement (2) will be higher
IDF - Inverse Document Frequency - How often does the term appear in the index?
- More often, less relevant
FLN - Field Length Norm - How long is the field which was searched?
- Longer fields, less relevant
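
To make the three factors concrete, here is a minimal sketch using the classic Lucene TF/IDF formulas: square root of the count for TF, 1 + ln(numDocs / (docFreq + 1)) for IDF, and 1 / sqrt(number of terms) for the field length norm. This is an assumption about the scoring model: recent Elasticsearch versions default to BM25 instead, and the real engine works on analyzed terms and stores norms at reduced precision. The function names are illustrative only.

```python
import math

def term_frequency(term, text):
    """Classic Lucene TF: square root of how often the term occurs in the field."""
    count = text.lower().split().count(term.lower())
    return math.sqrt(count)

def inverse_document_frequency(term, docs):
    """Classic Lucene IDF: 1 + ln(numDocs / (docFreq + 1))."""
    doc_freq = sum(1 for d in docs if term.lower() in d.lower().split())
    return 1 + math.log(len(docs) / (doc_freq + 1))

def field_length_norm(text):
    """Classic Lucene norm: 1 / sqrt(number of terms in the field)."""
    return 1 / math.sqrt(len(text.split()))

docs = [
    "Amulya is a great person",
    "Amulya is a great and really great and super great person",
]
for doc in docs:
    print(round(term_frequency("great", doc), 3), round(field_length_norm(doc), 3))
print(round(inverse_document_frequency("great", docs), 3))
```

The second statement gets the higher TF for great, but also the lower field length norm because it is longer, which is why the factors are combined rather than used alone.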

SIMPLE SEARCH QUERY


A very simple search query to begin with, to check that some documents are returned.


Explanation : In the example below, we are searching for wyoming across all the fields in the customer documents.

Syntax
curl -XGET "localhost:9200/<_index>/_search?q=<keyword>&pretty"
Example
curl -XGET "localhost:9200/customers/_search?q=wyoming&pretty"


Explanation : In the example below, we are searching for wyoming in the state field of the customer documents.

Syntax
curl -XGET "localhost:9200/<_index>/_search?q=<field>:<keyword>&pretty"
Example
curl -XGET "localhost:9200/customers/_search?q=state:wyoming&pretty"
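
When the keyword contains spaces or reserved characters, the q parameter has to be URL-encoded. A small sketch, assuming a hypothetical helper `search_url`, that builds both forms of the URI search shown above with `urllib.parse.urlencode`:

```python
from urllib.parse import urlencode

def search_url(index, keyword, field=None, host="http://localhost:9200"):
    """Build a URI-search URL; with no field, q is not tied to a single field."""
    q = keyword if field is None else f"{field}:{keyword}"
    # urlencode percent-encodes reserved characters such as ':' and spaces.
    return f"{host}/{index}/_search?" + urlencode({"q": q, "pretty": "true"})

all_fields = search_url("customers", "wyoming")
state_only = search_url("customers", "wyoming", field="state")
print(all_fields)
print(state_only)
```

The colon between field and keyword is sent as %3A, which the server decodes back to state:wyoming.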


SORTING


You can sort your search results in ascending or descending order.


Explanation : In the below example, we're querying wyoming across all the fields and sorting the result by age of the customers in descending order.

Syntax
curl -XGET "localhost:9200/<_index>/_search?q=<keyword>&sort=<field>:<order>&pretty"
Example
curl -XGET "localhost:9200/customers/_search?q=wyoming&sort=age:desc&pretty"


SKIP/FROM and LIMIT/SIZE


These parameters paginate the results: from sets how many results to skip from the start, and size limits how many results are returned.


Explanation : In the example below, we search for wyoming across all the fields, skip the first five results (positions 0-4) and return at most 20 customers.


Syntax
curl -XGET "localhost:9200/<_index>/_search?q=<keyword>&from=<number>&size=<number>&pretty"


Example
curl -XGET "localhost:9200/customers/_search?q=wyoming&from=5&size=20&pretty"
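
For page-by-page browsing, from is usually derived from a page number. A minimal sketch, assuming 1-based page numbers and a made-up helper `page_params`:

```python
def page_params(page, size):
    """Translate a 1-based page number into from/size search parameters."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return {"from": (page - 1) * size, "size": size}

for page in (1, 2, 3):
    print(page, page_params(page, 20))
```

Keep in mind that, by default, from + size may not exceed 10,000 (the index.max_result_window setting), so this approach suits shallow pagination only.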


EXPLAIN


The explain API shows how Elasticsearch computes the score of a specific document for a query. This gives useful feedback on why a document matched or didn't match the query.


Explanation : In the example below, we ask how the customer document with id 1 is scored by a query for kentucky in the state field. The response says whether the document matched and breaks down the relevance score calculation.


Syntax
curl -XGET "localhost:9200/<_index>/<_type>/<_id>/_explain?q=<field>:<keyword>&pretty"


Example
curl -XGET "localhost:9200/customers/personal/1/_explain?q=state:kentucky&pretty"


QUERY


The query context was introduced above; the syntax and example here show how it is used in practice.


Explanation : In the example below, we match every document in the index, sort the results by customer age in descending order, limit the result count to 20 and return only selected fields.

_source restricts the returned documents to the listed fields.
query matches documents against the specified condition.
match_all is the simplest clause; it matches every document in the index.
sort orders the documents by a field in the specified order.
size limits the number of documents returned.


Syntax
  curl -H 'Content-Type: application/json' -XGET "localhost:9200/<index_name>/_search?pretty" -d '
  {
    "query": {"match_all":{}},
    "sort":{<field_name>: {"order": <order>}},
    "size": <number>,
    "_source": ["field1","field2","field3"]
  }'


Example
  curl -H 'Content-Type: application/json' -XGET "localhost:9200/customers/_search?pretty" -d '
  {
    "query": {"match_all":{}},
    "sort":{"age": {"order": "desc"}},
    "size": 20,
    "_source": ["name","age","gender"]
  }'
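
Request bodies should be strict JSON: keys in double quotes and no comma after the last member. Whether a given Elasticsearch version tolerates small deviations is not something to rely on, so it can help to check a body locally before sending it. A rough sketch with Python's json module (the helper name `check_body` is made up):

```python
import json

def check_body(text):
    """Return the parsed body, or a short error message if it is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        return f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"

good = '{"query": {"match_all": {}}, "sort": {"age": {"order": "desc"}}, "size": 20}'
trailing_comma = '{"query": {"match_all": {}}, "size": 20,}'
unquoted_key = '{"sort": {age: {"order": "desc"}}}'

for body in (good, trailing_comma, unquoted_key):
    print(check_body(body))
```

The first body parses to a dictionary; the other two produce an error message that points at the offending position.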


TERM QUERY


Term query matches the exact term as it is stored in the index. Avoid using it against text fields: their values are analyzed (for example, split into words and lowercased) at index time, while the term in the query is not analyzed.


Explanation : In the example below, we are searching for the keyword amulya. It finds documents whose name field contains amulya as an individual term.


Example


curl -H 'Content-Type: application/json' -XGET "localhost:9200/customers/_search?pretty" -d '
{
  "_source": ["name","age","gender"],
  "query": {"term":{"name":"amulya"}}
}'
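
Why a term query can miss a value that is plainly in the document becomes clearer with a toy model. The sketch below uses a very rough stand-in for the standard analyzer (split on non-alphanumeric characters, then lowercase); the real analyzer is more sophisticated, and all function names here are illustrative.

```python
import re

def analyze(text):
    """Rough stand-in for the standard analyzer: split on non-alphanumerics, lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]

def term_matches(indexed_text, term):
    """A term query compares the term as-is with the indexed tokens."""
    return term in analyze(indexed_text)

def match_matches(indexed_text, query):
    """A match query analyzes the query text first, then compares tokens."""
    return any(token in analyze(indexed_text) for token in analyze(query))

name = "Amulya Kashyap"
print(analyze(name))
print(term_matches(name, "amulya"), term_matches(name, "Amulya"))
print(match_matches(name, "Amulya"))
```

Under this model the term amulya matches, the term Amulya does not, and a match query for Amulya matches because the query text is lowercased first.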

REGEX QUERY


Regexp query matches the terms of a field against a regular expression. The pattern is anchored, so it must match the whole term.


Explanation : In the example below, we search for documents whose name field contains a term made up only of letters and digits, with no special characters. The _source includes and excludes patterns select which fields are returned: fields starting with n are included and fields starting with a are excluded.


Example


curl -H 'Content-Type: application/json' -XGET "localhost:9200/customers/_search?pretty" -d '
{
  "_source": {
    "includes": ["n*"],
    "excludes": ["a*"]
  },
  "query": {
   "regexp" : {
      "name" : "[0-9A-Za-z]+"
    }
  }
}'
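
The anchoring rule is the part that most often surprises people coming from other regex tools. As an approximation (Python's re syntax is not identical to the Lucene regular expression syntax Elasticsearch uses), `re.fullmatch` behaves the same way: the pattern has to cover the whole term. The helper name is made up.

```python
import re

pattern = re.compile(r"[0-9A-Za-z]+")

def regexp_matches(term):
    """Mimic the anchored behaviour: the pattern must cover the whole term."""
    return pattern.fullmatch(term) is not None

for term in ("amulya", "amulya09", "amulya_09", "a.k"):
    print(term, regexp_matches(term))
```

Terms containing an underscore or a dot fail here even though part of them matches, because the pattern is applied to the entire term.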

WILDCARD QUERY


Wildcard query matches the terms of a field against a pattern in which * stands for any sequence of characters and ? stands for any single character.


Explanation : In the example below, we search for documents whose name field contains a term starting with amulya. The _source includes and excludes patterns select which fields are returned.


Example


curl -H 'Content-Type: application/json' -XGET "localhost:9200/customers/_search?pretty" -d '
{
  "_source": {
    "includes": ["n*"],
    "excludes": ["a*"]
  },
  "query": {
    "wildcard" : {
      "name" : "amulya*"
    }
  }
}'

FUZZY QUERY


Fuzzy matching finds terms that are within a small edit distance of the search term, so it tolerates typos.


Explanation : In the example below, we search for documents that contain beautiful in the name field, even though the query misspells it as beutiful. Fuzziness can be 0, 1, 2 or AUTO, as required.


Example


curl -H 'Content-Type: application/json' -XGET "localhost:9200/customers/_search?pretty" -d '
{
  "_source": {
    "includes": ["n*"],
    "excludes": ["a*"]
  },
  "query": {
    "match" : {
      "name" : {
        "query": "beutiful",
        "fuzziness": "AUTO"
      }
    }
  }
}'
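
Fuzziness is an edit distance budget, and at most two edits are allowed, so it is worth checking how far a typo really is from the intended word. The sketch below uses plain Levenshtein distance (Elasticsearch also counts a swap of two adjacent characters as one edit by default, which this simplified version does not) and the default AUTO thresholds; the function names are illustrative.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def auto_fuzziness(term):
    """Edits allowed by AUTO with its default thresholds (3 and 6)."""
    if len(term) < 3:
        return 0
    return 1 if len(term) < 6 else 2

for typo in ("beutiful", "beutifell"):
    distance = edit_distance(typo, "beautiful")
    print(typo, distance, distance <= auto_fuzziness(typo))
```

A misspelling such as beutiful is one edit away from beautiful and is within the budget, while beutifell is three edits away and would not match even with the maximum fuzziness of 2.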

RANGE QUERY


Range query lets us search for values within a range, such as documents between two dates or numbers between two bounds.


Explanation: In the example below, we retrieve all customers aged between 20 and 60 years, both bounds included.
  curl -H 'Content-Type: application/json' -XGET "localhost:9200/customers/_search?pretty" -d '
  {
    "query": {
      "bool":{
        "must":{"match_all":{}},
        "filter":{
          "range": {
            "age": {
              "gte": 20,
              "lte": 60
            }
          }
        }
      }
    }
  }'

FULL TEXT SEARCH


Full text search is a more advanced way to search a database. It quickly finds all instances of a term (word) in a table without having to scan rows and without having to know which column the term is stored in. It works by using text indexes. In Elasticsearch, the clauses for it are match, match_phrase, match_phrase_prefix and multi_match. We'll cover each of them with an explanation and an example. A few clauses have been skipped and will be covered in the advanced Elasticsearch article.


match - standard full text query
Explanation : In the example below, we perform a full text search on the name field. With the or operator, a document matches if it contains either amulya or kashyap.
curl -H 'Content-Type: application/json' -XGET "localhost:9200/customers/_search?pretty" -d '
{
  "query": {
    "match":{
      "name":{
        "query": "amulya kashyap",
        "operator": "or"
      }
    }
  }
}'


match_phrase - for matching exact phrases
Explanation : In the example below, we search for the exact phrase Amulya Kashyap in the name field of the documents.
curl -H 'Content-Type: application/json' -XGET "localhost:9200/customers/_search?pretty" -d '
{
  "query": {
    "match_phrase":{
      "name": "Amulya Kashyap"
    }
  }
}'


match_phrase_prefix - poor man’s autocomplete
Explanation : In the example below, we search for customers whose name starts with amu.
curl -H 'Content-Type: application/json' -XGET "localhost:9200/customers/_search?pretty" -d '
{
  "query": {
    "match_phrase_prefix":{
      "name": "amu"
    }
  }
}'


multi_match - lets you search for the same string in multiple fields.

Multi match can have one of these types:

best_fields
most_fields
cross_fields
phrase
phrase_prefix


Explanation: In the example below, we search for amulya in several fields at once, which can give more relevant results. In the background, a match clause is executed for every field specified.

The multi_match types will be covered in the advanced Elasticsearch article, so we can skip them for now.

curl -H 'Content-Type: application/json' -XGET 'localhost:9200/customers/_search?pretty' -d '
{
  "query": {
      "multi_match" : {
        "query":      "amulya",
        "fields":     [ "name", "state", "email", "city" ]
      }
  }
}'


BOOLEAN SEARCH


A query that matches documents by combining other query clauses. It takes a more-matches-is-better approach: the scores from each matching must or should clause are added together to give the final _score of each document. It is built from one or more clauses, each with a typed occurrence:
- must - the clause must match in matching documents, and it contributes to the score.
- must_not - the clause must not match in matching documents.
- filter - the clause must match in matching documents. However, unlike must, it does not contribute to the score.
- should - the clause may or may not match. Matching should clauses raise the score, and minimum_should_match sets how many of them are required.

```
curl -H 'Content-Type: application/json' -XGET "localhost:9200/customers/_search?pretty" -d '
{
  "query": {
    "bool" : {
      "must" : {
        "term" : { "name" : "amulya" }
      },
      "filter": {
        "term" : { "city" : "mumbai" }
      },
      "must_not" : {
        "range" : {
          "age" : { "gte" : 10, "lte" : 40 }
        }
      },
      "should" : [
        { "wildcard" : { "email" : "am*" } }
      ],
      "minimum_should_match" : 1,
      "boost" : 1.0
    }
  }
}'
```
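
How the four occurrence types combine can be modelled with ordinary predicates. The sketch below covers matching only, not scoring, and uses made-up sample documents and a made-up helper `bool_matches`; it is meant as a mental model rather than a description of how Elasticsearch is implemented.

```python
def bool_matches(doc, must=(), filters=(), must_not=(), should=(), minimum_should_match=0):
    """Evaluate bool-query matching (not scoring) with plain Python predicates."""
    if not all(clause(doc) for clause in must):
        return False
    if not all(clause(doc) for clause in filters):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    matched_should = sum(1 for clause in should if clause(doc))
    return matched_should >= minimum_should_match

# Illustrative documents only; they are not part of the article's data set.
customers = [
    {"name": "amulya", "city": "mumbai", "age": 45, "email": "amulya@example.com"},
    {"name": "amulya", "city": "mumbai", "age": 25, "email": "amulya@example.com"},
    {"name": "amulya", "city": "pune", "age": 45, "email": "amulya@example.com"},
]
hits = [c for c in customers if bool_matches(
    c,
    must=[lambda d: d["name"] == "amulya"],
    filters=[lambda d: d["city"] == "mumbai"],
    must_not=[lambda d: 10 <= d["age"] <= 40],
    should=[lambda d: d["email"].startswith("am")],
    minimum_should_match=1,
)]
print(hits)
```

Only the first customer survives: the second falls inside the must_not age range and the third fails the city filter.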