@peschlowp
Last active November 10, 2015 09:58
Elasticsearch Sense contents to demonstrate various basic and advanced features of Elasticsearch. Just copy and paste them into Sense and run the statements from top to bottom.
###################################################
###################################################
# Crash course
###################################################
###################################################
###################################################
# Quickstart
###################################################
# Delete the "books" index, just in case it is still there from previous experiments.
DELETE /books
# Create an empty "books" index.
PUT /books
# Add a book of type "tech" to the index.
PUT /books/tech/1
{
"author": "Joshua Bloch",
"title": "Effective Java",
"date": "2008-05-08",
"text": "Are you looking for a deeper understanding of the Java programming language so that you can write code that is clearer, more correct, more robust, and more reusable? Look no further!"
}
# Retrieve the book again by ID.
GET /books/tech/1
# Add another book.
PUT /books/tech/2
{
"author": "Robert C. Martin",
"title": "Clean Code",
"date": "2008-08-01",
"text": "Even bad code can function. But if code isn't clean, it can bring a development organization to its knees. Every year, countless hours and significant resources are lost because of poorly written code. But it doesn't have to be that way."
}
# Retrieve both books by ID.
GET /books/tech/_mget
{
"ids": ["1", "2"]
}
# Add one more book.
PUT /books/tech/3
{
"author": "Brian Goetz",
"title": "Java Concurrency in Practice",
"date": "2006-05-09",
"text": "Writing correct programs is hard; writing correct concurrent programs is harder."
}
# Search for books with "clean" in the title.
GET /books/tech/_search
{
"query": {
"match": {"title": "clean"}
}
}
# Search for books with "code" in the text.
GET /books/tech/_search
{
"query": {
"match": {"text": "code"}
}
}
# Only retrieve some fields.
GET /books/tech/_search
{
"query": {
"match": {"text": "code"}
},
"fields": ["author", "title"]
}
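# A possible alternative to "fields", assuming Elasticsearch 1.0 or later: source filtering.
# This sketch returns only the listed parts of the original JSON document.
GET /books/tech/_search
{
"query": {
"match": {"text": "code"}
},
"_source": ["author", "title"]
}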
### Talk about scoring.
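# A minimal sketch of how scoring can be influenced. Both clauses are optional, but a match in the title
# counts more than a match in the text. The boost value of 2 is an arbitrary choice for illustration.
# Compare the "_score" values in the response with those of the plain match queries above.
GET /books/tech/_search
{
"query": {
"bool": {
"should": [
{"match": {"title": {"query": "code", "boost": 2}}},
{"match": {"text": "code"}}
]
}
}
}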
# Search for multiple fields. Here: Books with "code" in the text and "java" in the title.
GET /books/tech/_search
{
"query": {
"bool": {
"must": [
{"match": {"text": "code"}},
{"match": {"title": "java"}}
]
}
}
}
# Search for books with "resource" in the text. Won't match because no stemming is applied.
GET /books/tech/_search
{
"query": {
"match": {"text": "resource"}
}
}
###################################################
# Analysis
###################################################
### Talk about the concept of inverted index and what it means for indexing and querying.
# Whitespace analyzer.
GET /books/_analyze?analyzer=whitespace&text=I like good books about BigData
# Standard analyzer. Also lowercases.
GET /books/_analyze?analyzer=standard&text=I like good books about BigData
# English analyzer. Stems words. Here: books -> book.
GET /books/_analyze?analyzer=english&text=I like good books about BigData
# Custom analyzer.
GET /books/_analyze?tokenizer=keyword&filters=lowercase&text=I like good books about BigData
# Custom analyzer with word delimiter filter (wrong order).
GET /books/_analyze?tokenizer=keyword&filters=lowercase,word_delimiter&text=I like good books about BigData
# Custom analyzer with word delimiter filter (correct order).
GET /books/_analyze?tokenizer=keyword&filters=word_delimiter,lowercase&text=I like good books about BigData
# Analyzer used by a field of the index.
GET /books/_analyze?field=text&text=I like good books about BigData
### Show inquisitor plugin.
### Show reference guide to analysis. As an example maybe Word Delimiter filter.
###################################################
# Mapping
###################################################
# Get the current mappings of the books index.
GET /books/_mapping
# Close the index in order to update a non-dynamic setting.
POST /books/_close
# Create a custom analyzer.
PUT /books/_settings
{
"settings": {
"analysis": {
"filter": {
"my_edge_ngram_filter": {
"type": "edgeNGram",
"min_gram": "1",
"max_gram": "6"
}
},
"analyzer": {
"my_title_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": ["word_delimiter", "my_edge_ngram_filter", "lowercase"]
}
}
}
}
}
# Open the index again.
POST /books/_open
# Test the analyzer.
GET /books/_analyze?analyzer=my_title_analyzer&text=I like good books about BigData
# Try to update the analyzer of the title field. Won't work because the field is already mapped.
PUT /books/tech/_mapping
{
"tech": {
"properties": {
"title": {
"type": "string",
"analyzer": "my_title_analyzer"
}
}
}
}
# Delete the old mapping (and with it, the documents).
DELETE /books/tech/_mapping
# Try again.
PUT /books/tech/_mapping
{
"tech": {
"properties": {
"title": {
"type": "string",
"analyzer": "my_title_analyzer"
}
}
}
}
# Check the mapping.
GET /books/_mapping
# Add the books again. Note: Bulk action.
POST /books/tech/_bulk
{"index": {"_id": "1"}}
{"author": "Joshua Bloch", "title": "Effective Java", "date": "2008-05-08", "text": "Are you looking for a deeper understanding of the Java programming language so that you can write code that is clearer, more correct, more robust, and more reusable? Look no further!"}
{"index": {"_id": "2"}}
{"author": "Robert C. Martin", "title":"Clean Code", "date":"2008-08-01", "text":"Even bad code can function. But if code isn't clean, it can bring a development organization to its knees. Every year, countless hours and significant resources are lost because of poorly written code. But it doesn't have to be that way."}
{"index": {"_id": "3"}}
{"author": "Brian Goetz", "title": "Java Concurrency in Practice", "date": "2006-05-09", "text":"Writing correct programs is hard; writing correct concurrent programs is harder."}
# Check the mapping again.
GET /books/_mapping
# See if it works.
GET /books/_search
{
"query": {
"match": {"title": "effec"}
}
}
# Umm, why does this also match? We had a max_gram of 6, but this term has 7 characters.
GET /books/_search
{
"query": {
"match": {"title": "effecti"}
}
}
### Check with inquisitor.
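# If the Inquisitor plugin is not installed, the analyze API can serve as a rough substitute.
# This sketch runs the query term through the same analyzer that the title field uses.
# The output should contain only the prefixes of up to six characters ("e", "ef", ..., "effect"),
# and each of them also exists in the index for "Effective".
GET /books/_analyze?analyzer=my_title_analyzer&text=effecti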
# NGrams are created for the query, too. We don't want this. See how, as it is now, "clean" will also match "Concurrency" in a book title.
GET /books/_search
{
"query": {
"match": {"title": "clean"}
}
}
# So let's split up the analyzers for indexing and querying.
DELETE /books
PUT /books
POST /books/_close
PUT /books/_settings
{
"settings": {
"analysis": {
"filter": {
"my_edge_ngram_filter": {
"type": "edgeNGram",
"min_gram": "1",
"max_gram": "6"
}
},
"analyzer": {
"my_title_index_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": ["word_delimiter", "my_edge_ngram_filter", "lowercase"]
},
"my_title_search_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": ["word_delimiter", "lowercase"]
}
}
}
}
}
POST /books/_open
# And now configure separate index and search analyzers for the title field.
PUT /books/tech/_mapping
{
"tech": {
"properties": {
"title": {
"type": "string",
"index_analyzer": "my_title_index_analyzer",
"search_analyzer": "my_title_search_analyzer"
}
}
}
}
POST /books/tech/_bulk
{"index": {"_id": "1"}}
{"author": "Joshua Bloch", "title": "Effective Java", "date": "2008-05-08", "text": "Are you looking for a deeper understanding of the Java programming language so that you can write code that is clearer, more correct, more robust, and more reusable? Look no further!"}
{"index": {"_id": "2"}}
{"author": "Robert C. Martin", "title":"Clean Code", "date":"2008-08-01", "text":"Even bad code can function. But if code isn't clean, it can bring a development organization to its knees. Every year, countless hours and significant resources are lost because of poorly written code. But it doesn't have to be that way."}
{"index": {"_id": "3"}}
{"author": "Brian Goetz", "title": "Java Concurrency in Practice", "date": "2006-05-09", "text":"Writing correct programs is hard; writing correct concurrent programs is harder."}
# And let's check again.
GET /books/_search
{
"query": {
"match": {"title": "effecti"}
}
}
### Confirm with Inquisitor.
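# Without Inquisitor, the two analyzers can be compared directly with the analyze API.
# The index analyzer should produce edge n-grams for each word of the title,
# while the search analyzer should leave the query term as a single lowercased token.
GET /books/_analyze?analyzer=my_title_index_analyzer&text=Effective Java
GET /books/_analyze?analyzer=my_title_search_analyzer&text=effecti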
### Note: Usually, this is the way you evolve your mapping, and when you are done you store it on index creation.
# A "final" mapping might be:
DELETE /books
PUT /books
{
"settings": {
"analysis": {
"filter": {
"my_edge_ngram_filter": {
"type": "edgeNGram",
"min_gram": "1",
"max_gram": "6"
}
},
"analyzer": {
"my_title_index_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": ["word_delimiter", "my_edge_ngram_filter", "lowercase"]
},
"my_title_search_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": ["word_delimiter", "lowercase"]
}
}
}
},
"mappings": {
"tech": {
"properties": {
"author": {
"type": "string",
"analyzer": "standard"
},
"title": {
"type": "string",
"index_analyzer": "my_title_index_analyzer",
"search_analyzer": "my_title_search_analyzer"
},
"date": {
"type": "date"
},
"text": {
"type": "string",
"analyzer": "english"
}
}
}
}
}
POST /books/tech/_bulk
{"index": {"_id": "1"}}
{"author": "Joshua Bloch", "title": "Effective Java", "date": "2008-05-08", "text": "Are you looking for a deeper understanding of the Java programming language so that you can write code that is clearer, more correct, more robust, and more reusable? Look no further!"}
{"index": {"_id": "2"}}
{"author": "Robert C. Martin", "title":"Clean Code", "date":"2008-08-01", "text":"Even bad code can function. But if code isn't clean, it can bring a development organization to its knees. Every year, countless hours and significant resources are lost because of poorly written code. But it doesn't have to be that way."}
{"index": {"_id": "3"}}
{"author": "Brian Goetz", "title": "Java Concurrency in Practice", "date": "2006-05-09", "text":"Writing correct programs is hard; writing correct concurrent programs is harder."}
# Example with stemming, as we have now configured the English analyzer for the text field.
GET /books/tech/_search
{
"query": {
"match": {"text": "resource"}
}
}
### Confirm with Inquisitor.
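# A quick check without Inquisitor: analyze the singular and the plural with the analyzer of the text field.
# With the English analyzer configured, both words should be reduced to the same stem,
# which is why the query "resource" now finds the document containing "resources".
GET /books/_analyze?field=text&text=resource resources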
###################################################
# Search features
###################################################
# Highlighting.
GET /books/tech/_search
{
"query": {
"match": {"text": "resource"}
},
"highlight": {
"fields": { "text": {}}
}
}
# Suggestions. Here: May help to fix a typo.
GET /books/tech/_search
{
"query": {
"match": { "text": "claen"}
},
"suggest": {
"my-suggestion-1": {
"text": "claen",
"term": { "field": "text"}
}
}
}
# More like this. Here: Title "Effective Java".
GET /books/tech/_search
{
"query": {
"more_like_this": {
"fields": ["title"],
"ids": ["1"],
"min_term_freq": 1,
"min_doc_freq": 1
}
}
}
# Percolator. Register a query under the ID 1.
PUT /books/.percolator/1
{
"query" : {
"match": {"title" : "java"}
}
}
# Percolate a new document. Note: Indexing needs to be done separately.
GET /books/tech/_percolate
{
"doc": {
"author": ["Charlie Hunt", "Binu John"],
"title": "Java Performance",
"date": "2011-10-14",
"text": "Improvements in the Java platform and new multicore/multiprocessor hardware have made it possible to dramatically improve the performance and scalability of Java software."
}
}
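# A sketch of percolating a document that is already in the index, assuming the 1.x percolate API.
# Book 1 ("Effective Java") should be reported as a match for the registered query.
GET /books/tech/1/_percolate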
# Aggregation.
GET /books/tech/_search?search_type=count
{
"aggs": {
"year_of_publication": {
"date_histogram": {
"field": "date",
"interval": "year"
}
}
}
}
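# A variant of the same aggregation, sketched under two assumptions: "size": 0 is used instead of
# search_type=count to suppress the hits, and the "format" parameter renders each bucket key as a plain year.
GET /books/tech/_search
{
"size": 0,
"aggs": {
"year_of_publication": {
"date_histogram": {
"field": "date",
"interval": "year",
"format": "yyyy"
}
}
}
}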
###################################################
# Sharding + Replication
###################################################
### Show that documents live in different shards. Introduce "routing" to assign documents to shards.
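# A possible way to look at shards from within Sense. The first request lists the shards of the index
# together with their document counts. The second one shows which shard a routing value maps to
# (by default, the document ID is used as the routing value). The third one restricts a search to that shard.
GET /_cat/shards/books?v
GET /books/_search_shards?routing=1
GET /books/tech/_search?routing=1
{
"query": {
"match_all": {}
}
}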
### Explain the advantages of sharding and replication.
### Start up three more cluster nodes. Show what happens in Marvel or Head plugin.
### Discuss number of shards/replicas.
# Add more replicas.
PUT /books/_settings
{
"index": {"number_of_replicas": 2}
}
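# Check whether the new replicas could be allocated. The status should stay "yellow"
# as long as the cluster has fewer nodes than copies of each shard.
GET /_cluster/health/books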
### Shut down a node. See what happens.
### Talk about master role.
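# Without Marvel or Head, the cat API can show the cluster nodes and the currently elected master.
GET /_cat/nodes?v
GET /_cat/master?v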
###################################################
###################################################
# The road to expertise
###################################################
###################################################
###################################################
# Map a field multiple times
###################################################
# Start all over.
DELETE /books
PUT /books
{
"settings": {
"analysis": {
"filter": {
"my_edge_ngram_filter": {
"type": "edgeNGram",
"min_gram": "1",
"max_gram": "6"
}
},
"analyzer": {
"my_title_index_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": ["word_delimiter", "my_edge_ngram_filter", "lowercase"]
},
"my_title_search_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": ["word_delimiter", "lowercase"]
}
}
}
},
"mappings": {
"tech": {
"properties": {
"author": {
"type": "string",
"analyzer": "standard"
},
"title": {
"type": "string",
"index_analyzer": "my_title_index_analyzer",
"search_analyzer": "my_title_search_analyzer"
},
"date": {
"type": "date"
},
"text": {
"type": "string",
"analyzer": "english"
}
}
}
}
}
POST /books/tech/_bulk
{"index": {"_id": "1"}}
{"author": "Joshua Bloch", "title": "Effective Java", "date": "2008-05-08", "text": "Are you looking for a deeper understanding of the Java programming language so that you can write code that is clearer, more correct, more robust, and more reusable? Look no further!"}
{"index": {"_id": "2"}}
{"author": "Robert C. Martin", "title":"Clean Code", "date":"2008-08-01", "text":"Even bad code can function. But if code isn't clean, it can bring a development organization to its knees. Every year, countless hours and significant resources are lost because of poorly written code. But it doesn't have to be that way."}
{"index": {"_id": "3"}}
{"author": "Brian Goetz", "title": "Java Concurrency in Practice", "date": "2006-05-09", "text":"Writing correct programs is hard; writing correct concurrent programs is harder."}
# See what a search for "programming" brings.
GET /books/tech/_search
{
"query": {
"match": {"text": "programming"}
},
"highlight": {
"fields": {"text": {}}
}
}
# Map a field twice.
DELETE /books
PUT /books
{
"settings": {
"analysis": {
"filter": {
"my_edge_ngram_filter": {
"type": "edgeNGram",
"min_gram": "1",
"max_gram": "6"
}
},
"analyzer": {
"my_title_index_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": ["word_delimiter", "my_edge_ngram_filter", "lowercase"]
},
"my_title_search_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": ["word_delimiter", "lowercase"]
}
}
}
},
"mappings": {
"tech": {
"properties": {
"author": {
"type": "string",
"analyzer": "standard"
},
"title": {
"type": "string",
"index_analyzer": "my_title_index_analyzer",
"search_analyzer": "my_title_search_analyzer"
},
"date": {
"type": "date"
},
"text": {
"type": "string",
"analyzer": "english",
"fields": {
"standard": {
"type": "string",
"analyzer": "standard"
}
}
}
}
}
}
}
POST /books/tech/_bulk
{"index": {"_id": "1"}}
{"author": "Joshua Bloch", "title": "Effective Java", "date": "2008-05-08", "text": "Are you looking for a deeper understanding of the Java programming language so that you can write code that is clearer, more correct, more robust, and more reusable? Look no further!"}
{"index": {"_id": "2"}}
{"author": "Robert C. Martin", "title":"Clean Code", "date":"2008-08-01", "text":"Even bad code can function. But if code isn't clean, it can bring a development organization to its knees. Every year, countless hours and significant resources are lost because of poorly written code. But it doesn't have to be that way."}
{"index": {"_id": "3"}}
{"author": "Brian Goetz", "title": "Java Concurrency in Practice", "date": "2006-05-09", "text":"Writing correct programs is hard; writing correct concurrent programs is harder."}
# Example with higher score for documents that match both fields.
GET /books/tech/_search
{
"query": {
"bool": {
"should": [
{"match": {"text": "programming"}},
{"match": {"text.standard": "programming"}}
]
}
},
"highlight": {
"fields": {"text": {}}
}
}
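# The same idea as a more compact sketch, assuming a version that supports the "most_fields" type
# of the multi_match query (1.1 or later). Scores of both sub-fields are combined,
# so an exact match of "programming" should again rank higher than a match on the stem only.
GET /books/tech/_search
{
"query": {
"multi_match": {
"query": "programming",
"type": "most_fields",
"fields": ["text", "text.standard"]
}
}
}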
###################################################
# Explain
###################################################
# Same query as before but with score explanation.
GET /books/tech/_search?explain
{
"query": {
"bool": {
"should": [
{"match": {"text": "programming"}},
{"match": {"text.standard": "programming"}}
]
}
},
"highlight": {
"fields": {"text": {}}
}
}
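# To look at one document only, the explain API can be used instead of the search parameter.
# This sketch asks why (or why not) book 1 matches a query on the text field.
GET /books/tech/1/_explain
{
"query": {
"match": {"text": "programming"}
}
}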
# Query validation.
GET /books/tech/_validate/query?explain
{
"query": {
"bool": {
"should": [
{"match": {"text": "programming"}},
{"match": {"text.standard": "programming"}}
]
}
}
}
###################################################
# Filters
###################################################
# Let's say we have a publisher field.
PUT /books/tech/_mapping
{
"tech": {
"properties": {
"publisher": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
# Re-index a book with publisher information.
PUT /books/tech/1
{
"author": "Joshua Bloch",
"title": "Effective Java",
"date": "2008-05-08",
"text": "Are you looking for a deeper understanding of the Java programming language so that you can write code that is clearer, more correct, more robust, and more reusable? Look no further!",
"publisher": "Addison Wesley"
}
# Combine a query with a filter for publisher and a date filter.
GET /books/tech/_search
{
"query": {
"filtered": {
"query": {
"match": {"text": "program"}
},
"filter": {
"bool": {
"must": [
{"term": {"publisher": "Addison Wesley"}},
{"range": {"date": {"gte": "2008-01-01"}}}
]
}
}
}
}
}
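# A filter can also be used on its own. In this sketch the query part is left out, so every document
# that passes the filter is returned. Note that the term filter is not analyzed:
# the value has to match the not_analyzed publisher field exactly, including case.
GET /books/tech/_search
{
"query": {
"filtered": {
"filter": {
"term": {"publisher": "Addison Wesley"}
}
}
}
}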
###################################################
# Updates
###################################################
# Update the title of a book.
POST /books/tech/2/_update
{
"doc": {
"title": "Clean Code: A Handbook of Agile Software Craftsmanship"
}
}
GET /books/tech/2
# Add two comments to a book.
POST /books/tech/1/_update
{
"doc": {
"comments": [
{
"author": "Patrick Peschlow",
"text": "Great book!"
},
{
"author": "Daniel Schneller",
"text": "I don't like it."
}
]
}
}
GET /books/tech/1
###################################################
# Relations
###################################################
# See what a search for those comments might bring. It matches, although no single comment combines this author with this text.
GET /books/tech/_search
{
"query": {
"bool": {
"must": [
{"match": {"comments.author": "Patrick Peschlow"}},
{"match": {"comments.text": "I don't like it."}}
]
}
}
}
# Let's create a new mapping with type "nested" for comments.
DELETE /books
PUT /books
{
"settings": {
"analysis": {
"filter": {
"my_edge_ngram_filter": {
"type": "edgeNGram",
"min_gram": "1",
"max_gram": "6"
}
},
"analyzer": {
"my_title_index_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": ["word_delimiter", "my_edge_ngram_filter", "lowercase"]
},
"my_title_search_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": ["word_delimiter", "lowercase"]
}
}
}
},
"mappings": {
"tech": {
"properties": {
"author": {
"type": "string",
"analyzer": "standard"
},
"title": {
"type": "string",
"index_analyzer": "my_title_index_analyzer",
"search_analyzer": "my_title_search_analyzer"
},
"date": {
"type": "date"
},
"text": {
"type": "string",
"analyzer": "english"
},
"comments": {
"type": "nested"
}
}
}
}
}
POST /books/tech/_bulk
{"index": {"_id": "1"}}
{"author": "Joshua Bloch", "title": "Effective Java", "date": "2008-05-08", "text": "Are you looking for a deeper understanding of the Java programming language so that you can write code that is clearer, more correct, more robust, and more reusable? Look no further!"}
{"index": {"_id": "2"}}
{"author": "Robert C. Martin", "title":"Clean Code", "date":"2008-08-01", "text":"Even bad code can function. But if code isn't clean, it can bring a development organization to its knees. Every year, countless hours and significant resources are lost because of poorly written code. But it doesn't have to be that way."}
{"index": {"_id": "3"}}
{"author": "Brian Goetz", "title": "Java Concurrency in Practice", "date": "2006-05-09", "text":"Writing correct programs is hard; writing correct concurrent programs is harder."}
POST /books/tech/1/_update
{
"doc": {
"comments": [
{
"author": "Patrick Peschlow",
"text": "Great book!"
},
{
"author": "Daniel Schneller",
"text": "I don't like it."
}
]
}
}
GET /books/tech/1
# Now search again. No result anymore!
GET /books/tech/_search
{
"query": {
"bool": {
"must": [
{"match": {"comments.author": "Patrick Peschlow"}},
{"match": {"comments.text": "I don't like it."}}
]
}
}
}
# Do a nested search.
GET /books/tech/_search
{
"query": {
"nested": {
"path": "comments",
"query": {
"bool": {
"must": [
{"match": {"comments.author": "Patrick Peschlow"}},
{"match": {"comments.text": "Great book!"}}
]
}
}
}
}
}
# Double-check. Shouldn't match the bad combination.
GET /books/tech/_search
{
"query": {
"nested": {
"path": "comments",
"query": {
"bool": {
"must": [
{"match": {"comments.author": "Patrick Peschlow"}},
{"match": {"comments.text": "I don't like it."}}
]
}
}
}
}
}
### Talk about parent/child.
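# A minimal parent/child sketch, assuming the 1.x mapping API. The type name "review" and its fields
# "reviewer" and "comment" are made up for this illustration and are not part of the mappings above.
# First declare "tech" as the parent type of "review".
PUT /books/review/_mapping
{
"review": {
"_parent": {"type": "tech"}
}
}
# Index a review as a child of book 1. The parent parameter also routes the child to the shard of its parent.
PUT /books/review/1?parent=1
{
"reviewer": "Patrick Peschlow",
"comment": "Great book!"
}
# Find books that have at least one review containing "great".
GET /books/tech/_search
{
"query": {
"has_child": {
"type": "review",
"query": {
"match": {"comment": "great"}
}
}
}
}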
###################################################
# Alias
###################################################
POST /_aliases
{
"actions": [
{
"add": {
"index": "books",
"alias": "literature"
}
}
]
}
GET /_aliases
GET /literature/tech/_search
{
"query": {
"match_all": {}
}
}
POST /_aliases
{
"actions": [
{
"add": {
"index": "books",
"alias": "thatday",
"filter": {
"term": {"date": "2006-05-09"}
}
}
}
]
}
GET /thatday/_search
{
"query": {
"match_all": {}
}
}
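# Aliases can be removed with the same API. This sketch drops the filtered alias again;
# the index itself and the "literature" alias are not affected.
POST /_aliases
{
"actions": [
{
"remove": {
"index": "books",
"alias": "thatday"
}
}
]
}
GET /_aliases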