@rsrini7
Forked from peschlowp/elasticsearch-sense-examples
Last active August 29, 2015 14:17
###################################################
###################################################
# Crash course
###################################################
###################################################
###################################################
# Quickstart
###################################################
# Delete the "books" index, just in case it is still there from previous experiments.
DELETE /books
# Create an empty "books" index.
PUT /books
# Add a book of type "tech" to the index.
PUT /books/tech/1
{
"author": "Joshua Bloch",
"title": "Effective Java",
"date": "2008-05-08",
"text": "Are you looking for a deeper understanding of the Java programming language so that you can write code that is clearer, more correct, more robust, and more reusable? Look no further!"
}
# Retrieve the book again by ID.
GET /books/tech/1
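# Optionally, fetch only the document source without the metadata envelope (a sketch, assuming an Elasticsearch 1.x cluster like the rest of this script).
GET /books/tech/1/_source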
# Add another book.
PUT /books/tech/2
{
"author": "Robert C. Martin",
"title": "Clean Code",
"date": "2008-08-01",
"text": "Even bad code can function. But if code isn't clean, it can bring a development organization to its knees. Every year, countless hours and significant resources are lost because of poorly written code. But it doesn't have to be that way."
}
# Retrieve both books by ID.
GET /books/tech/_mget
{
"ids": ["1", "2"]
}
# Add one more book.
PUT /books/tech/3
{
"author": "Brian Goetz",
"title": "Java Concurrency in Practice",
"date": "2006-05-09",
"text": "Writing correct programs is hard; writing correct concurrent programs is harder."
}
# Search for books with "clean" in the title.
GET /books/tech/_search
{
"query": {
"match": {"title": "clean"}
}
}
# Search for books with "code" in the text.
GET /books/tech/_search
{
"query": {
"match": {"text": "code"}
}
}
# Only retrieve some fields.
GET /books/tech/_search
{
"query": {
"match": {"text": "code"}
},
"fields": ["author", "title"]
}
### Talk about scoring.
# Search for multiple fields. Here: Books with "code" in the text and "java" in the title.
GET /books/tech/_search
{
"query": {
"bool": {
"must": [
{"match": {"text": "code"}},
{"match": {"title": "java"}}
]
}
}
}
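# A possible variation (a sketch, not taken from the original session): books with "code" in the text but without "java" in the title. With the three books above, this should leave only "Clean Code".
GET /books/tech/_search
{
"query": {
"bool": {
"must": [
{"match": {"text": "code"}}
],
"must_not": [
{"match": {"title": "java"}}
]
}
}
}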
# Search for books with "resource" in the text. Won't match: the text contains "resources", and the default standard analyzer applies no stemming.
GET /books/tech/_search
{
"query": {
"match": {"text": "resource"}
}
}
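# For comparison (a sketch, assuming the dynamically created mapping from the steps above): the exact indexed term "resources" should match.
GET /books/tech/_search
{
"query": {
"match": {"text": "resources"}
}
}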
###################################################
# Analysis
###################################################
### Talk about the concept of inverted index and what it means for indexing and querying.
# Whitespace analyzer.
GET /books/_analyze?analyzer=whitespace&text=I like good books about BigData
# Standard analyzer. Also lowercases.
GET /books/_analyze?analyzer=standard&text=I like good books about BigData
# English analyzer. Stems words. Here: books -> book.
GET /books/_analyze?analyzer=english&text=I like good books about BigData
# Custom analyzer.
GET /books/_analyze?tokenizer=keyword&filters=lowercase&text=I like good books about BigData
# Custom analyzer with word delimiter filter (wrong order: lowercasing first removes the case change in "BigData" that the word delimiter would split on).
GET /books/_analyze?tokenizer=keyword&filters=lowercase,word_delimiter&text=I like good books about BigData
# Custom analyzer with word delimiter filter (correct order).
GET /books/_analyze?tokenizer=keyword&filters=word_delimiter,lowercase&text=I like good books about BigData
# Analyzer used by a field of the index.
GET /books/_analyze?field=text&text=I like good books about BigData
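# Another combination worth trying (a sketch; it previews a filter chain similar to the one used for titles in the Mapping section): whitespace tokenizer, then word delimiter, then lowercase.
GET /books/_analyze?tokenizer=whitespace&filters=word_delimiter,lowercase&text=I like good books about BigData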
### Show inquisitor plugin.
### Show reference guide to analysis. As an example maybe Word Delimiter filter.
###################################################
# Mapping
###################################################
# Get the current mappings of the books index.
GET /books/_mapping
# Close the index in order to update a non-dynamic setting.
POST /books/_close
# Create a custom analyzer.
PUT /books/_settings
{
"settings": {
"analysis": {
"filter": {
"my_edge_ngram_filter": {
"type": "edgeNGram",
"min_gram": "1",
"max_gram": "6"
}
},
"analyzer": {
"my_title_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": ["word_delimiter", "my_edge_ngram_filter", "lowercase"]
}
}
}
}
}
# Open the index again.
POST /books/_open
# Test the analyzer.
GET /books/_analyze?analyzer=my_title_analyzer&text=I like good books about BigData
# Try to update the analyzer of the title field. Won't work: the field is already mapped, and the analyzer of an existing field cannot be changed in place.
PUT /books/tech/_mapping
{
"tech": {
"properties": {
"title": {
"type": "string",
"analyzer": "my_title_analyzer"
}
}
}
}
# Delete the old mapping (and with it, the documents).
DELETE /books/tech/_mapping
# Try again.
PUT /books/tech/_mapping
{
"tech": {
"properties": {
"title": {
"type": "string",
"analyzer": "my_title_analyzer"
}
}
}
}
# Check the mapping.
GET /books/_mapping
# Add the books again. Note: Bulk action.
POST /books/tech/_bulk
{"index": {"_id": "1"}}
{"author": "Joshua Bloch", "title": "Effective Java", "date": "2008-05-08", "text": "Are you looking for a deeper understanding of the Java programming language so that you can write code that is clearer, more correct, more robust, and more reusable? Look no further!"}
{"index": {"_id": "2"}}
{"author": "Robert C. Martin", "title":"Clean Code", "date":"2008-08-01", "text":"Even bad code can function. But if code isn't clean, it can bring a development organization to its knees. Every year, countless hours and significant resources are lost because of poorly written code. But it doesn't have to be that way."}
{"index": {"_id": "3"}}
{"author": "Brian Goetz", "title": "Java Concurrency in Practice", "date": "2006-05-09", "text":"Writing correct programs is hard; writing correct concurrent programs is harder."}
# Check the mapping again.
GET /books/_mapping
# See if it works.
GET /books/_search
{
"query": {
"match": {"title": "effec"}
}
}
# Umm, why does this also match? We configured a max_gram of 6, but this query term has 7 characters.
GET /books/_search
{
"query": {
"match": {"title": "effecti"}
}
}
### Check with inquisitor.
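# Without the inquisitor plugin, the analyze API gives a hint (a sketch, assuming the my_title_analyzer defined above): list the tokens that the analyzer produces for the query text.
GET /books/_analyze?analyzer=my_title_analyzer&text=effecti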
# NGrams are created for the query, too. We don't want this. See how, as it is now, "clean" will also match "Concurrency" in a book title.
GET /books/_search
{
"query": {
"match": {"title": "clean"}
}
}
# So let's split up the analyzers for indexing and querying.
DELETE /books
PUT /books
POST /books/_close
PUT /books/_settings
{
"settings": {
"analysis": {
"filter": {
"my_edge_ngram_filter": {
"type": "edgeNGram",
"min_gram": "1",
"max_gram": "6"
}
},
"analyzer": {
"my_title_index_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": ["word_delimiter", "my_edge_ngram_filter", "lowercase"]
},
"my_title_search_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": ["word_delimiter", "lowercase"]
}
}
}
}
}
POST /books/_open
# And now configure separate index and search analyzers for the title field.
PUT /books/tech/_mapping
{
"tech": {
"properties": {
"title": {
"type": "string",
"index_analyzer": "my_title_index_analyzer",
"search_analyzer": "my_title_search_analyzer"
}
}
}
}
POST /books/tech/_bulk
{"index": {"_id": "1"}}
{"author": "Joshua Bloch", "title": "Effective Java", "date": "2008-05-08", "text": "Are you looking for a deeper understanding of the Java programming language so that you can write code that is clearer, more correct, more robust, and more reusable? Look no further!"}
{"index": {"_id": "2"}}
{"author": "Robert C. Martin", "title":"Clean Code", "date":"2008-08-01", "text":"Even bad code can function. But if code isn't clean, it can bring a development organization to its knees. Every year, countless hours and significant resources are lost because of poorly written code. But it doesn't have to be that way."}
{"index": {"_id": "3"}}
{"author": "Brian Goetz", "title": "Java Concurrency in Practice", "date": "2006-05-09", "text":"Writing correct programs is hard; writing correct concurrent programs is harder."}
# And let's check again.
GET /books/_search
{
"query": {
"match": {"title": "effecti"}
}
}
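# As a cross-check (a sketch, assuming the two analyzers defined above): the search analyzer should leave the query text as a single token, and a six-character prefix such as "effect" should still match.
GET /books/_analyze?analyzer=my_title_search_analyzer&text=effecti
GET /books/_search
{
"query": {
"match": {"title": "effect"}
}
}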
### Confirm with Inquisitor.
### Note: Usually, this is the way you evolve your mapping, and when you are done you store it on index creation.
# A "final" mapping might be:
DELETE /books
PUT /books
{
"settings": {
"analysis": {
"filter": {
"my_edge_ngram_filter": {
"type": "edgeNGram",
"min_gram": "1",
"max_gram": "6"
}
},
"analyzer": {
"my_title_index_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": ["word_delimiter", "my_edge_ngram_filter", "lowercase"]
},
"my_title_search_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": ["word_delimiter", "lowercase"]
}
}
}
},
"mappings": {
"tech": {
"properties": {
"author": {
"type": "string",
"analyzer": "standard"
},
"title": {
"type": "string",
"index_analyzer": "my_title_index_analyzer",
"search_analyzer": "my_title_search_analyzer"
},
"date": {
"type": "date"
},
"text": {
"type": "string",
"analyzer": "english"
}
}
}
}
}
POST /books/tech/_bulk
{"index": {"_id": "1"}}
{"author": "Joshua Bloch", "title": "Effective Java", "date": "2008-05-08", "text": "Are you looking for a deeper understanding of the Java programming language so that you can write code that is clearer, more correct, more robust, and more reusable? Look no further!"}
{"index": {"_id": "2"}}
{"author": "Robert C. Martin", "title":"Clean Code", "date":"2008-08-01", "text":"Even bad code can function. But if code isn't clean, it can bring a development organization to its knees. Every year, countless hours and significant resources are lost because of poorly written code. But it doesn't have to be that way."}
{"index": {"_id": "3"}}
{"author": "Brian Goetz", "title": "Java Concurrency in Practice", "date": "2006-05-09", "text":"Writing correct programs is hard; writing correct concurrent programs is harder."}
# Example with stemming, as we have now configured the English analyzer for the text field.
GET /books/tech/_search
{
"query": {
"match": {"text": "resource"}
}
}
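# To see why this matches now (a sketch using the analyze API): with the English analyzer on the text field, "resource" and "resources" should be reduced to the same stem.
GET /books/_analyze?field=text&text=resource resources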
### Confirm with Inquisitor.
###################################################
# Search features
###################################################
# Highlighting.
GET /books/tech/_search
{
"query": {
"match": {"text": "resource"}
},
"highlight": {
"fields": { "text": {}}
}
}
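# Highlighting with custom markers (a sketch; pre_tags and post_tags replace the default em tags, and the bracket markers are arbitrary examples).
GET /books/tech/_search
{
"query": {
"match": {"text": "resource"}
},
"highlight": {
"pre_tags": ["[["],
"post_tags": ["]]"],
"fields": {"text": {}}
}
}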
# Suggestions. Here: May help to fix a typo.
GET /books/tech/_search
{
"query": {
"match": { "text": "claen"}
},
"suggest": {
"my-suggestion-1": {
"text": "claen",
"term": { "field": "text"}
}
}
}
# More like this. Here: Title "Effective Java".
GET /books/tech/_search
{
"query": {
"more_like_this": {
"fields": ["title"],
"ids": ["1"],
"min_term_freq": 1,
"min_doc_freq": 1
}
}
}
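# Percolator. Register a query under ID 1 (this must be a PUT; a GET would only try to fetch the registered query).
PUT /books/.percolator/1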
{
"query" : {
"match": {"title" : "java"}
}
}
# Percolate a new document. Note: Indexing needs to be done separately.
GET /books/tech/_percolate
{
"doc": {
"author": ["Charlie Hunt", "Binu John"],
"title": "Java Performance",
"date": "2011-10-14",
"text": "Improvements in the Java platform and new multicore/multiprocessor hardware have made it possible to dramatically improve the performance and scalability of Java software."
}
}
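# An already indexed document can be percolated as well (a sketch, assuming book 1 from the bulk request above and a query registered under /books/.percolator).
GET /books/tech/1/_percolate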
# Aggregation.
GET /books/tech/_search?search_type=count
{
"aggs": {
"year_of_publication": {
"date_histogram": {
"field": "date",
"interval": "year"
}
}
}
}
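# Aggregations run over the hits of a query (a sketch, assuming the same date field): restrict the histogram to books that mention "code" in the text and format the bucket keys as years.
GET /books/tech/_search?search_type=count
{
"query": {
"match": {"text": "code"}
},
"aggs": {
"year_of_publication": {
"date_histogram": {
"field": "date",
"interval": "year",
"format": "yyyy"
}
}
}
}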
###################################################
# Sharding + Replication
###################################################
### Show that documents live in different shards. Introduce "routing" to assign documents to shards.
### Explain the advantages of sharding and replication.
### Start up three more cluster nodes. Show what happens in Marvel or Head plugin.
### Discuss number of shards/replicas.
# Add more replicas.
PUT /books/_settings
{
"index": {"number_of_replicas": 2}
}
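# To follow along without Marvel or Head (a sketch using the cat and cluster health APIs): list the shards of the index and check whether all replicas are assigned.
GET /_cat/shards/books?v
GET /_cluster/health/books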
### Shut down a node. See what happens.
### Talk about master role.
###################################################
###################################################
# The road to expertise
###################################################
###################################################
###################################################
# Map a field multiple times
###################################################
# Start all over.
DELETE /books
PUT /books
{
"settings": {
"analysis": {
"filter": {
"my_edge_ngram_filter": {
"type": "edgeNGram",
"min_gram": "1",
"max_gram": "6"
}
},
"analyzer": {
"my_title_index_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": ["word_delimiter", "my_edge_ngram_filter", "lowercase"]
},
"my_title_search_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": ["word_delimiter", "lowercase"]
}
}
}
},
"mappings": {
"tech": {
"properties": {
"author": {
"type": "string",
"analyzer": "standard"
},
"title": {
"type": "string",
"index_analyzer": "my_title_index_analyzer",
"search_analyzer": "my_title_search_analyzer"
},
"date": {
"type": "date"
},
"text": {
"type": "string",
"analyzer": "english"
}
}
}
}
}
POST /books/tech/_bulk
{"index": {"_id": "1"}}
{"author": "Joshua Bloch", "title": "Effective Java", "date": "2008-05-08", "text": "Are you looking for a deeper understanding of the Java programming language so that you can write code that is clearer, more correct, more robust, and more reusable? Look no further!"}
{"index": {"_id": "2"}}
{"author": "Robert C. Martin", "title":"Clean Code", "date":"2008-08-01", "text":"Even bad code can function. But if code isn't clean, it can bring a development organization to its knees. Every year, countless hours and significant resources are lost because of poorly written code. But it doesn't have to be that way."}
{"index": {"_id": "3"}}
{"author": "Brian Goetz", "title": "Java Concurrency in Practice", "date": "2006-05-09", "text":"Writing correct programs is hard; writing correct concurrent programs is harder."}
# See what a search for "programming" brings.
GET /books/tech/_search
{
"query": {
"match": {"text": "programming"}
},
"highlight": {
"fields": {"text": {}}
}
}
# Map a field twice.
DELETE /books
PUT /books
{
"settings": {
"analysis": {
"filter": {
"my_edge_ngram_filter": {
"type": "edgeNGram",
"min_gram": "1",
"max_gram": "6"
}
},
"analyzer": {
"my_title_index_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": ["word_delimiter", "my_edge_ngram_filter", "lowercase"]
},
"my_title_search_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": ["word_delimiter", "lowercase"]
}
}
}
},
"mappings": {
"tech": {
"properties": {
"author": {
"type": "string",
"analyzer": "standard"
},
"title": {
"type": "string",
"index_analyzer": "my_title_index_analyzer",
"search_analyzer": "my_title_search_analyzer"
},
"date": {
"type": "date"
},
"text": {
"type": "string",
"analyzer": "english",
"fields": {
"standard": {
"type": "string",
"analyzer": "standard"
}
}
}
}
}
}
}
POST /books/tech/_bulk
{"index": {"_id": "1"}}
{"author": "Joshua Bloch", "title": "Effective Java", "date": "2008-05-08", "text": "Are you looking for a deeper understanding of the Java programming language so that you can write code that is clearer, more correct, more robust, and more reusable? Look no further!"}
{"index": {"_id": "2"}}
{"author": "Robert C. Martin", "title":"Clean Code", "date":"2008-08-01", "text":"Even bad code can function. But if code isn't clean, it can bring a development organization to its knees. Every year, countless hours and significant resources are lost because of poorly written code. But it doesn't have to be that way."}
{"index": {"_id": "3"}}
{"author": "Brian Goetz", "title": "Java Concurrency in Practice", "date": "2006-05-09", "text":"Writing correct programs is hard; writing correct concurrent programs is harder."}
# Example with higher score for documents that match both fields.
GET /books/tech/_search
{
"query": {
"bool": {
"should": [
{"match": {"text": "programming"}},
{"match": {"text.standard": "programming"}}
]
}
},
"highlight": {
"fields": {"text": {}}
}
}
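# The same idea in compact form (a sketch, assuming Elasticsearch 1.1 or later, where multi_match accepts a type): most_fields combines the scores of all matching fields.
GET /books/tech/_search
{
"query": {
"multi_match": {
"query": "programming",
"type": "most_fields",
"fields": ["text", "text.standard"]
}
}
}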
###################################################
# Explain
###################################################
# Same query as before but with score explanation.
GET /books/tech/_search?explain
{
"query": {
"bool": {
"should": [
{"match": {"text": "programming"}},
{"match": {"text.standard": "programming"}}
]
}
},
"highlight": {
"fields": {"text": {}}
}
}
# Query validation.
GET /books/tech/_validate/query?explain
{
"query": {
"bool": {
"should": [
{"match": {"text": "programming"}},
{"match": {"text.standard": "programming"}}
]
}
}
}
###################################################
# Filters
###################################################
# Let's say we have a publisher field.
PUT /books/tech/_mapping
{
"tech": {
"properties": {
"publisher": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
# Re-index a book with publisher information.
PUT /books/tech/1
{
"author": "Joshua Bloch",
"title": "Effective Java",
"date": "2008-05-08",
"text": "Are you looking for a deeper understanding of the Java programming language so that you can write code that is clearer, more correct, more robust, and more reusable? Look no further!",
"publisher": "Addison Wesley"
}
# Combine a query with a filter for publisher and a date filter.
GET /books/tech/_search
{
"query": {
"filtered": {
"query": {
"match": {"text": "program"}
},
"filter": {
"bool": {
"must": [
{"term": {"publisher": "Addison Wesley"}},
{"range": {"date": {"gte": "2008-01-01"}}}
]
}
}
}
}
}
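# Term filters are not analyzed (a sketch to illustrate): since publisher is mapped as not_analyzed, a differently cased value should return no hits.
GET /books/tech/_search
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"term": {"publisher": "addison wesley"}
}
}
}
}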
###################################################
# Updates
###################################################
# Update the title of a book.
POST /books/tech/2/_update
{
"doc": {
"title": "Clean Code: A Handbook of Agile Software Craftsmanship"
}
}
GET /books/tech/2
# Add two comments to a book.
POST /books/tech/1/_update
{
"doc": {
"comments": [
{
"author": "Patrick Peschlow",
"text": "Great book!"
},
{
"author": "Daniel Schneller",
"text": "I don't like it."
}
]
}
}
GET /books/tech/1
###################################################
# Relations
###################################################
# See what a search for those comments might bring. It matches, although the author and the text belong to two different comments.
GET /books/tech/_search
{
"query": {
"bool": {
"must": [
{"match": {"comments.author": "Patrick Peschlow"}},
{"match": {"comments.text": "I don't like it."}}
]
}
}
}
# Let's create a new mapping with type "nested" for comments.
DELETE /books
PUT /books
{
"settings": {
"analysis": {
"filter": {
"my_edge_ngram_filter": {
"type": "edgeNGram",
"min_gram": "1",
"max_gram": "6"
}
},
"analyzer": {
"my_title_index_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": ["word_delimiter", "my_edge_ngram_filter", "lowercase"]
},
"my_title_search_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": ["word_delimiter", "lowercase"]
}
}
}
},
"mappings": {
"tech": {
"properties": {
"author": {
"type": "string",
"analyzer": "standard"
},
"title": {
"type": "string",
"index_analyzer": "my_title_index_analyzer",
"search_analyzer": "my_title_search_analyzer"
},
"date": {
"type": "date"
},
"text": {
"type": "string",
"analyzer": "english"
},
"comments": {
"type": "nested"
}
}
}
}
}
POST /books/tech/_bulk
{"index": {"_id": "1"}}
{"author": "Joshua Bloch", "title": "Effective Java", "date": "2008-05-08", "text": "Are you looking for a deeper understanding of the Java programming language so that you can write code that is clearer, more correct, more robust, and more reusable? Look no further!"}
{"index": {"_id": "2"}}
{"author": "Robert C. Martin", "title":"Clean Code", "date":"2008-08-01", "text":"Even bad code can function. But if code isn't clean, it can bring a development organization to its knees. Every year, countless hours and significant resources are lost because of poorly written code. But it doesn't have to be that way."}
{"index": {"_id": "3"}}
{"author": "Brian Goetz", "title": "Java Concurrency in Practice", "date": "2006-05-09", "text":"Writing correct programs is hard; writing correct concurrent programs is harder."}
POST /books/tech/1/_update
{
"doc": {
"comments": [
{
"author": "Patrick Peschlow",
"text": "Great book!"
},
{
"author": "Daniel Schneller",
"text": "I don't like it."
}
]
}
}
GET /books/tech/1
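# Now search again. No result anymore, because nested comments are indexed as separate hidden documents that a plain query does not reach.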
GET /books/tech/_search
{
"query": {
"bool": {
"must": [
{"match": {"comments.author": "Patrick Peschlow"}},
{"match": {"comments.text": "I don't like it."}}
]
}
}
}
# Do a nested search.
GET /books/tech/_search
{
"query": {
"nested": {
"path": "comments",
"query": {
"bool": {
"must": [
{"match": {"comments.author": "Patrick Peschlow"}},
{"match": {"comments.text": "Great book!"}}
]
}
}
}
}
}
# Double-check. Shouldn't match the bad combination.
GET /books/tech/_search
{
"query": {
"nested": {
"path": "comments",
"query": {
"bool": {
"must": [
{"match": {"comments.author": "Patrick Peschlow"}},
{"match": {"comments.text": "I don't like it."}}
]
}
}
}
}
}
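# One more check (a sketch, assuming the two comments added above): author and text of the same comment should match.
GET /books/tech/_search
{
"query": {
"nested": {
"path": "comments",
"query": {
"bool": {
"must": [
{"match": {"comments.author": "Daniel Schneller"}},
{"match": {"comments.text": "I don't like it."}}
]
}
}
}
}
}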
### Talk about parent/child.
###################################################
# Alias
###################################################
# Create an alias "literature" for the books index.
POST /_aliases
{
"actions": [
{
"add": {
"index": "books",
"alias": "literature"
}
}
]
}
# List the aliases and search via the alias.
GET /_aliases
GET /literature/tech/_search
{
"query": {
"match_all": {}
}
}
# Create a filtered alias that only exposes books published on a given day.
POST /_aliases
{
"actions": [
{
"add": {
"index": "books",
"alias": "thatday",
"filter": {
"term": {"date": "2006-05-09"}
}
}
}
]
}
# Search via the filtered alias. Should only return "Java Concurrency in Practice".
GET /thatday/_search
{
"query": {
"match_all": {}
}
}
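# Aliases can be removed with the same API (a sketch, assuming the "thatday" alias created above).
POST /_aliases
{
"actions": [
{
"remove": {
"index": "books",
"alias": "thatday"
}
}
]
}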