@geekpete
Last active October 27, 2016 04:39
Elasticsearch Dynamic Mapping Examples
# dynamic mapping, true vs false vs strict
#
# First review the short section of documentation around this functionality as it will aid in understanding:
# https://www.elastic.co/guide/en/elasticsearch/reference/2.4/dynamic.html
#
# This entire example can be pasted into Sense (called Console, in the Dev Tools section of Kibana, since 5.0)
# and run one command at a time. Beware that this will create indices in the Elasticsearch cluster that Kibana is pointed at;
# it should not affect existing indices unless they share the my_index_dyn* names used here.
#
# TODO: add some examples of how document fields are searchable/aggregatable or not.
# TODO: add a _source includes example as well and explain why this is an antipattern for preventing fields from being indexed.
# ensure indices from previous test runs are removed first (on a first run these return a harmless 404 index_not_found_exception)
DELETE /my_index_dynfalse
DELETE /my_index_dyntrue
DELETE /my_index_dynstrict
# create 3 indices based on the documentation example, identical apart from the type-level dynamic setting: false, true or strict.
PUT /my_index_dynfalse
{
  "mappings": {
    "my_type": {
      "dynamic": false,
      "properties": {
        "user": {
          "properties": {
            "name": {
              "type": "string"
            },
            "social_networks": {
              "dynamic": true,
              "properties": {}
            }
          }
        }
      }
    }
  }
}
PUT /my_index_dyntrue
{
  "mappings": {
    "my_type": {
      "dynamic": true,
      "properties": {
        "user": {
          "properties": {
            "name": {
              "type": "string"
            },
            "social_networks": {
              "dynamic": true,
              "properties": {}
            }
          }
        }
      }
    }
  }
}
PUT /my_index_dynstrict
{
  "mappings": {
    "my_type": {
      "dynamic": "strict",
      "properties": {
        "user": {
          "properties": {
            "name": {
              "type": "string"
            },
            "social_networks": {
              "dynamic": true,
              "properties": {}
            }
          }
        }
      }
    }
  }
}
# check our mappings are as expected
GET /my_index_dyn*/_mapping
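# Now let's insert a "correct" doc into each index: one that only uses fields every mapping allows (user.social_networks is dynamic in all three, so twitter is accepted everywhere)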
PUT /my_index_dynfalse/my_type/user1
{
  "user": {
    "name": "John Smith",
    "social_networks": {
      "twitter": "@jsmith"
    }
  }
}
PUT /my_index_dyntrue/my_type/user1
{
  "user": {
    "name": "John Smith",
    "social_networks": {
      "twitter": "@jsmith"
    }
  }
}
PUT /my_index_dynstrict/my_type/user1
{
  "user": {
    "name": "John Smith",
    "social_networks": {
      "twitter": "@jsmith"
    }
  }
}
# let's take a look at the contents of those indices to verify those newly created docs
GET /my_index_dyn*/_search
# now let's see the difference in how each index behaves when inserting a doc with a new, unmapped field called "address" inside the "user" object (which inherits the type-level dynamic setting)
PUT /my_index_dynfalse/my_type/user2
{
  "user": {
    "name": "Terrence Trailer",
    "social_networks": {
      "twitter": "@ttrailer"
    },
    "address": "123 trailer park lane"
  }
}
PUT /my_index_dyntrue/my_type/user2
{
  "user": {
    "name": "Terrence Trailer",
    "social_networks": {
      "twitter": "@ttrailer"
    },
    "address": "123 trailer park lane"
  }
}
PUT /my_index_dynstrict/my_type/user2
{
  "user": {
    "name": "Terrence Trailer",
    "social_networks": {
      "twitter": "@ttrailer"
    },
    "address": "123 trailer park lane"
  }
}
# the strict index throws an error (a strict_dynamic_mapping_exception), the other two allow the new document, so let's see what the resulting docs look like.
GET /my_index_dyn*/_search
# since only the non-strict indices allowed the doc in, the "user2" doc is present in both of them and identical.
# But let's see what the mappings of our indices look like after that last doc insert
GET /my_index_dyn*/_mapping
# We now see that my_index_dyntrue has a new "address" field in its mapping (under "user"), whereas the other two indices do not.
# my_index_dynfalse allowed the document in but left its mapping unchanged: since dynamic mapping is disabled, it quietly ignored the new field.
# The strict index loudly complained and refused to even insert the document.
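# A narrower check, as a sketch (assuming the user2 docs above were indexed): the get-field-mapping API
# should report a mapping for user.address only for my_index_dyntrue.
GET /my_index_dyn*/_mapping/field/user.address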
# With dynamic set to false, Elasticsearch ignores fields not specified in the mapping (unless an inner object overrides the parent setting with its own dynamic: true, as user.social_networks does here). This means those fields are not indexed, so they cannot be searched or aggregated on.
# The data from these fields is still kept in _source, though, unless configured otherwise (e.g. _source is turned off).
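# A minimal sketch of how to verify this, assuming the user2 docs above have been indexed and the indices have refreshed:
# a match query on user.address should find user2 in my_index_dyntrue but return no hits in my_index_dynfalse,
# even though the address is visible in the _source of both.
GET /my_index_dyntrue/_search
{
  "query": {
    "match": {
      "user.address": "trailer"
    }
  }
}
GET /my_index_dynfalse/_search
{
  "query": {
    "match": {
      "user.address": "trailer"
    }
  }
}
# user.social_networks sets its own dynamic: true, so its sub-fields should be searchable even in my_index_dynfalse
GET /my_index_dynfalse/_search
{
  "query": {
    "match": {
      "user.social_networks.twitter": "@ttrailer"
    }
  }
}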
# Keeping the complete original document in _source this way avoids the overhead of analysing/indexing the unneeded fields. One example use case is a "results" field holding structured data that could vary a lot in shape or be very large. You might not want this data in your mapping, or to search/aggregate on it, but you might still want to retrieve it in full from _source after your search finds the document using other fields.
# The _source field also lets you reindex the documents into a new mapping, say if you realise you do want one of the previously ignored fields to become searchable/aggregatable. This avoids dragging all the data back down from your primary data store (such as a database), which would typically be slower than a reindex.
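# A sketch of such a reindex, assuming Elasticsearch 2.3 or later (where the _reindex API is available).
# my_index_dynfalse_v2 is a hypothetical index name used only for illustration.
# First create a new index whose mapping includes the previously ignored user.address field.
DELETE /my_index_dynfalse_v2
PUT /my_index_dynfalse_v2
{
  "mappings": {
    "my_type": {
      "dynamic": false,
      "properties": {
        "user": {
          "properties": {
            "name": {
              "type": "string"
            },
            "address": {
              "type": "string"
            },
            "social_networks": {
              "dynamic": true,
              "properties": {}
            }
          }
        }
      }
    }
  }
}
# then copy the docs across; they are rebuilt from the _source kept in my_index_dynfalse
POST /_reindex
{
  "source": {
    "index": "my_index_dynfalse"
  },
  "dest": {
    "index": "my_index_dynfalse_v2"
  }
}
# user.address should now be searchable in the new index
GET /my_index_dynfalse_v2/_search
{
  "query": {
    "match": {
      "user.address": "trailer"
    }
  }
}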
### _source include rules for indexes ###
# So what if we try to stop those unwanted dynamic fields in docs from "going in" by using the _source include/exclude functionality (https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-source-field.html#include-exclude)? Let's try it. (Spoiler alert: this approach won't do what you think either and doesn't stop those fields going in.)
# We'll just focus on creating a new index with dynamic false for this test, since we know the strict index we created will reject the doc with the extra field. We'll include the existing fields, using a wildcard for social_networks since it is still a dynamic object that accepts any new sub-field. Listing only includes means everything else is excluded from _source.
# just make sure it's not there already from previous testing runs
DELETE /my_index_dynfalse_sourceincludes
# then add the index mapping
PUT /my_index_dynfalse_sourceincludes
{
  "mappings": {
    "my_type": {
      "dynamic": false,
      "_source": {
        "includes": [
          "user.name",
          "user.social_networks.*"
        ]
      },
      "properties": {
        "user": {
          "properties": {
            "name": {
              "type": "string"
            },
            "social_networks": {
              "dynamic": true,
              "properties": {}
            }
          }
        }
      }
    }
  }
}
# now let's try to index our doc with the extra address field into it
PUT /my_index_dynfalse_sourceincludes/my_type/user2
{
  "user": {
    "name": "Terrence Trailer",
    "social_networks": {
      "twitter": "@ttrailer"
    },
    "address": "123 trailer park lane"
  }
}
# and let's see what the inserted doc looks like now
GET /my_index_dynfalse_sourceincludes/_search
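# Nice! So that pesky address field is no longer present... but wait, the "address" field has only been prevented from being included in _source. The _source includes/excludes settings control what is stored in _source, not what gets indexed. In this particular index it is dynamic: false, not the _source filter, that keeps address out of the index. With dynamic: true the field would still be mapped and searchable while being silently missing from _source, so it could not be retrieved or reindexed later. A search on user.address is the way to test this.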
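# A sketch of that test. Against the index above, dynamic: false means address was never indexed, so this search should return no hits.
GET /my_index_dynfalse_sourceincludes/_search
{
  "query": {
    "match": {
      "user.address": "trailer"
    }
  }
}
# To see what _source includes does on its own, here is a hypothetical variant of the index
# (my_index_dyntrue_sourceincludes is an assumed name) that is identical apart from dynamic: true.
DELETE /my_index_dyntrue_sourceincludes
PUT /my_index_dyntrue_sourceincludes
{
  "mappings": {
    "my_type": {
      "dynamic": true,
      "_source": {
        "includes": [
          "user.name",
          "user.social_networks.*"
        ]
      },
      "properties": {
        "user": {
          "properties": {
            "name": {
              "type": "string"
            },
            "social_networks": {
              "dynamic": true,
              "properties": {}
            }
          }
        }
      }
    }
  }
}
PUT /my_index_dyntrue_sourceincludes/my_type/user2
{
  "user": {
    "name": "Terrence Trailer",
    "social_networks": {
      "twitter": "@ttrailer"
    },
    "address": "123 trailer park lane"
  }
}
# This search should find user2 via the address field, yet the _source returned for it should not contain address:
# the field went into the index, it just cannot be retrieved.
GET /my_index_dyntrue_sourceincludes/_search
{
  "query": {
    "match": {
      "user.address": "trailer"
    }
  }
}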