glenacota/elastic_exercise_indexing_and_mapping

## elastic_exercise_indexing_and_mapping
# ** EXAM OBJECTIVES: INDEXING DATA + MAPPINGS AND TEXT ANALYSIS **
# (remove, if present, any `hamlet*` index and index template)
# Create the index `hamlet_raw`, with one primary shard and four replicas
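One possible request for this step, sketched in Python so the body can be checked before it is pasted into the Kibana console. It assumes the standard `number_of_shards` and `number_of_replicas` index settings; the variable name is illustrative.

```python
import json

# Index settings: one primary shard, four replica copies of it.
create_index_body = {
    "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 4,
    }
}

# Print the request in the form the Kibana console accepts.
print("PUT hamlet_raw")
print(json.dumps(create_index_body, indent=2))
```

On a single-node cluster the four replicas cannot be allocated, so the index is expected to report yellow health.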
# Index in `hamlet_raw` a document that satisfies the following criteria: (i) has id "1"; (ii) has default type; (iii) has a field `line` with value "To be, or not to be: that is the question"
# Update the document with id "1" by adding the field `line_number` with value "3.1.64"
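A minimal sketch of a partial update, assuming the typeless `_update` endpoint of Elasticsearch 7.x. Fields placed under `doc` are merged into the stored document, so the existing `line` field is kept.

```python
import json

# Partial update: only the listed fields are added or overwritten.
update_body = {"doc": {"line_number": "3.1.64"}}

print("POST hamlet_raw/_update/1")
print(json.dumps(update_body, indent=2))
```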
# Index in `hamlet_raw` a new document without specifying any id. The fields of this document are: (i) `text_entry` with value "Whether tis nobler in the mind to suffer"; (ii) `line_number` with value "3.1.66"
# Update the previous document by setting `line_number` to "3.1.65"
# (in one request) Update all documents in `hamlet_raw` by adding a new field `speaker` with value "HAMLET"
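One way to do this in a single request is `_update_by_query` with a Painless script. The sketch below passes the value through `params`, which is a choice made here rather than a requirement of the exercise. With no `query` in the body, every document in the index is matched.

```python
import json

# Script applied to every document matched by the (implicit match_all) query.
update_all_body = {
    "script": {
        "lang": "painless",
        "source": "ctx._source.speaker = params.speaker",
        "params": {"speaker": "HAMLET"},
    }
}

print("POST hamlet_raw/_update_by_query")
print(json.dumps(update_all_body, indent=2))
```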
# Update the document with id "1" by renaming the field `line` to `text_entry`
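A rename can be expressed as a scripted update that removes the old key and assigns its value to the new one. This is a sketch of one approach, assuming the 7.x typeless `_update` endpoint; the Python dict at the end only mirrors the intended effect and is not part of the request.

```python
import json

# Painless: Map.remove returns the removed value, so one statement both
# deletes `line` and assigns its value to `text_entry`.
rename_body = {
    "script": {
        "lang": "painless",
        "source": "ctx._source.text_entry = ctx._source.remove('line')",
    }
}
print("POST hamlet_raw/_update/1")
print(json.dumps(rename_body, indent=2))

# The same move on a plain Python dict, to show the intended effect.
source = {
    "line": "To be, or not to be: that is the question",
    "line_number": "3.1.64",
}
source["text_entry"] = source.pop("line")
print(source)
```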
# Delete the `hamlet_raw` index
# Create the index template `hamlet_template`, which satisfies the following criteria: (i) it matches the index patterns "hamlet_*" and "hamlet-*"; (ii) it allocates one primary shard and no replicas for each matching index
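A sketch of the template body, assuming the legacy `_template` endpoint that Elasticsearch 7.x exposes. Newer versions also offer composable templates under `_index_template`, where the settings sit under a `template` key; that variant is not shown here.

```python
import json

# Legacy index template: applied at index creation to any matching name.
template_body = {
    "index_patterns": ["hamlet_*", "hamlet-*"],
    "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0,
    },
}

print("PUT _template/hamlet_template")
print(json.dumps(template_body, indent=2))
```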
# Create two indices named `hamlet2` and `hamlet_test`. Verify that `hamlet_template` applied only to `hamlet_test`
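Why the template should apply to only one of the two names can be checked offline. The sketch below uses Python's `fnmatchcase` as a stand-in for Elasticsearch's wildcard matching, which is an assumption that holds for simple `*` patterns like these. On the cluster, comparing the output of `GET hamlet2,hamlet_test/_settings` would be one way to confirm it.

```python
from fnmatch import fnmatchcase

# The two patterns declared in hamlet_template.
patterns = ["hamlet_*", "hamlet-*"]

def template_applies(index_name):
    """Return True if any template pattern matches the index name."""
    return any(fnmatchcase(index_name, pattern) for pattern in patterns)

for name in ("hamlet2", "hamlet_test"):
    print(name, template_applies(name))
```

`hamlet2` has neither `_` nor `-` after `hamlet`, so neither pattern matches it.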
# (in one request) Delete the `hamlet2` and `hamlet_test` indices
# Update `hamlet_template` by defining a mapping that satisfies the following criteria: (i) the type is "_doc"; (ii) has three fields named `speaker`, `line_number` and `text_entry`; (iii) `speaker` and `line_number` map to an unanalysed string; (iv) `text_entry` is a text associated with the "english" analyzer
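A sketch of the updated template, assuming Elasticsearch 7.x, where `_doc` is the implicit type and the mapping is written without a type name. `PUT _template` replaces the whole template, so the patterns and settings are repeated. "Unanalysed string" is read here as the `keyword` type.

```python
import json

template_body = {
    "index_patterns": ["hamlet_*", "hamlet-*"],
    "settings": {"number_of_shards": 1, "number_of_replicas": 0},
    "mappings": {
        "properties": {
            # keyword fields are indexed as a single, unanalysed term
            "speaker": {"type": "keyword"},
            "line_number": {"type": "keyword"},
            # full-text field analysed with the built-in english analyzer
            "text_entry": {"type": "text", "analyzer": "english"},
        }
    },
}

print("PUT _template/hamlet_template")
print(json.dumps(template_body, indent=2))
```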
# Create the index `hamlet_1`, and populate it by running the _bulk command with the request-body below
{"index":{"_index":"hamlet_1","_id":0}}
{"line_number":"1.1.1","speaker":"BERNARDO","text_entry":"Whos there?"}
{"index":{"_index":"hamlet_1","_id":1}}
{"line_number":"1.1.2","speaker":"FRANCISCO","text_entry":"Nay, answer me: stand, and unfold yourself."}
{"index":{"_index":"hamlet_1","_id":2}}
{"line_number":"1.1.3","speaker":"BERNARDO","text_entry":"Long live the king!"}
{"index":{"_index":"hamlet_1","_id":3}}
{"line_number":"1.2.1","speaker":"KING CLAUDIUS","text_entry":"Though yet of Hamlet our dear brothers death"}
{"index":{"_index":"hamlet_1","_id":4}}
{"line_number":"1.2.2","speaker":"KING CLAUDIUS","text_entry":"The memory be green, and that it us befitted"}
{"index":{"_index":"hamlet_1","_id":5}}
{"line_number":"1.3.1","speaker":"LAERTES","text_entry":"My necessaries are embarkd: farewell:"}
{"index":{"_index":"hamlet_1","_id":6}}
{"line_number":"1.3.4","speaker":"LAERTES","text_entry":"But let me hear from you."}
{"index":{"_index":"hamlet_1","_id":7}}
{"line_number":"1.3.5","speaker":"OPHELIA","text_entry":"Do you doubt that?"}
{"index":{"_index":"hamlet_1","_id":8}}
{"line_number":"1.4.1","speaker":"HAMLET","text_entry":"The air bites shrewdly; it is very cold."}
{"index":{"_index":"hamlet_1","_id":9}}
{"line_number":"1.4.2","speaker":"HORATIO","text_entry":"It is a nipping and an eager air."}
{"index":{"_index":"hamlet_1","_id":10}}
{"line_number":"1.4.3","speaker":"HAMLET","text_entry":"What hour now?"}
{"index":{"_index":"hamlet_1","_id":11}}
{"line_number":"1.5.2","speaker":"Ghost","text_entry":"Mark me."}
{"index":{"_index":"hamlet_1","_id":12}}
{"line_number":"1.5.3","speaker":"HAMLET","text_entry":"I will."}
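The request body above is in the newline-delimited JSON format that `_bulk` expects: one action line, then one source line, with a newline after the last line. The sketch below rebuilds the first two documents to show that layout; `to_bulk_body` is a hypothetical helper, not part of any client library.

```python
import json

# First two documents of the hamlet_1 bulk body, as (id, source) pairs.
docs = [
    (0, {"line_number": "1.1.1", "speaker": "BERNARDO",
         "text_entry": "Whos there?"}),
    (1, {"line_number": "1.1.2", "speaker": "FRANCISCO",
         "text_entry": "Nay, answer me: stand, and unfold yourself."}),
]

def to_bulk_body(index, docs):
    """Serialise documents as action/source line pairs ending in a newline."""
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"

bulk_body = to_bulk_body("hamlet_1", docs)
print("PUT hamlet_1")
print("POST _bulk")
print(bulk_body, end="")
```

When the body is sent outside the Kibana console, the request is normally given the `application/x-ndjson` content type.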
# Create the index `hamlet_2`, and populate it by running the _bulk command with the request-body below
{"index":{"_index":"hamlet_2","_id":14}}
{"line_number":"2.1.1","speaker":"LORD POLONIUS","text_entry":"Give him this money and these notes, Reynaldo."}
{"index":{"_index":"hamlet_2","_id":15}}
{"line_number":"2.1.2","speaker":"REYNALDO","text_entry":"I will, my lord."}
{"index":{"_index":"hamlet_2","_id":16}}
{"line_number":"2.1.3","speaker":"LORD POLONIUS","text_entry":"You shall do marvellous wisely, good Reynaldo,"}
{"index":{"_index":"hamlet_2","_id":17}}
{"line_number":"2.1.4","speaker":"LORD POLONIUS","text_entry":"Before you visit him, to make inquire"}
{"index":{"_index":"hamlet_2","_id":18}}
{"line_number":"2.2.1","speaker":"KING CLAUDIUS","text_entry":"Welcome, dear Rosencrantz and Guildenstern!"}
{"index":{"_index":"hamlet_2","_id":19}}
{"line_number":"2.2.2","speaker":"KING CLAUDIUS","text_entry":"Moreover that we much did long to see you,"}
{"index":{"_index":"hamlet_2","_id":20}}
{"line_number":"2.2.3","speaker":"KING CLAUDIUS","text_entry":"The need we have to use you did provoke"}
# Create an alias named `hamlet` that maps both `hamlet_1` and `hamlet_2`
# Verify that the `hamlet` alias groups 20 documents
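The expected total follows from the ids in the two bulk bodies: `hamlet_1` holds ids 0 to 12 and `hamlet_2` holds ids 14 to 20. A short check of that arithmetic, followed by the count request that could be run against the alias:

```python
# Document ids taken from the two bulk bodies in this exercise.
hamlet_1_ids = range(0, 13)   # ids 0..12
hamlet_2_ids = range(14, 21)  # ids 14..20

expected_total = len(hamlet_1_ids) + len(hamlet_2_ids)

print("GET hamlet/_count")
print("expected count:", expected_total)
```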
# Configure `hamlet_1` to be the write index of the `hamlet` alias
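An alias that points to more than one index cannot accept writes until one of them is flagged as the write index. A sketch of the `_aliases` actions, assuming the alias is (re)declared for both indices in the same request:

```python
import json

aliases_body = {
    "actions": [
        # Writes sent to the alias are routed to hamlet_1.
        {"add": {"index": "hamlet_1", "alias": "hamlet",
                 "is_write_index": True}},
        # hamlet_2 stays reachable through the alias for reads only.
        {"add": {"index": "hamlet_2", "alias": "hamlet"}},
    ]
}

print("POST _aliases")
print(json.dumps(aliases_body, indent=2))
```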
# Index in `hamlet` a document with id "13", default type, and the following fields: (i) `text_entry` with value "My hour is almost come,"; (ii) `line_number` with value "1.5.4"; (iii) `speaker`, with value "Ghost"
# Update the mapping of `hamlet_template`, satisfying the following criteria: (i) remove the definitions of the `line_number` and `speaker` fields; (ii) disable aggregations for `text_entry`; (iii) dynamically assign an integer type to any field starting with "number_"; (iv) by default, dynamically map strings to unanalysed strings
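A sketch of one reading of these criteria, assuming Elasticsearch 7.x legacy templates. Text fields already have `fielddata` disabled by default, so setting it to `false` only makes criterion (ii) explicit. The dynamic template names `number_fields` and `strings_as_keywords` are made up for this example.

```python
import json

template_body = {
    "index_patterns": ["hamlet_*", "hamlet-*"],
    "settings": {"number_of_shards": 1, "number_of_replicas": 0},
    "mappings": {
        # Dynamic templates are evaluated in order; the first match wins.
        "dynamic_templates": [
            {"number_fields": {
                "match": "number_*",
                "mapping": {"type": "integer"},
            }},
            {"strings_as_keywords": {
                "match_mapping_type": "string",
                "mapping": {"type": "keyword"},
            }},
        ],
        "properties": {
            "text_entry": {
                "type": "text",
                "analyzer": "english",
                "fielddata": False,  # no aggregations on this text field
            }
        },
    },
}

print("PUT _template/hamlet_template")
print(json.dumps(template_body, indent=2))
```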
# Create the index `hamlet_3`, and populate it by running the _bulk command with the request-body below
{"index":{"_index":"hamlet_3","_id":21}}
{"line_number":"3.1.4","speaker":"KING CLAUDIUS","text_entry":"With turbulent and dangerous lunacy?"}
{"index":{"_index":"hamlet_3","_id":22}}
{"line_number":"3.1.5","speaker":"ROSENCRANTZ","text_entry":"He does confess he feels himself distracted;"}
{"index":{"_index":"hamlet_3","_id":23}}
{"line_number":"3.1.64","speaker":"HAMLET","text_entry":"To be, or not to be: that is the question:"}
{"index":{"_index":"hamlet_3","_id":24}}
{"line_number":"3.1.65","speaker":"HAMLET","text_entry":"Whether tis nobler in the mind to suffer"}
{"index":{"_index":"hamlet_3","_id":25}}
{"line_number":"3.1.66","speaker":"HAMLET","text_entry":"The slings and arrows of outrageous fortune,"}
{"index":{"_index":"hamlet_3","_id":26}}
{"line_number":"3.1.67","speaker":"HAMLET","text_entry":"Or to take arms against a sea of troubles,"}
{"index":{"_index":"hamlet_3","_id":27}}
{"line_number":"3.1.68","speaker":"HAMLET","text_entry":"And by opposing end them? To die: to sleep;"}
{"index":{"_index":"hamlet_3","_id":28}}
{"line_number":"3.1.69","speaker":"HAMLET","text_entry":"No more; and by a sleep to say we end"}
# Store in the cluster state a new script named `control_reindex_batch`, which checks whether the `reindexBatch` field exists in a document. If it does, the script increments the field value by a parameter named `increment`; otherwise, the script sets the field value to 1
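A sketch of the stored script request, together with a Python function that mirrors the same branch logic so it can be tried locally. The Painless source is one possible wording; the Python mirror is purely illustrative and is not sent to the cluster.

```python
import json

painless_source = (
    "if (ctx._source['reindexBatch'] != null) {"
    " ctx._source['reindexBatch'] += params.increment;"
    " } else {"
    " ctx._source['reindexBatch'] = 1;"
    " }"
)
script_body = {"script": {"lang": "painless", "source": painless_source}}

print("PUT _scripts/control_reindex_batch")
print(json.dumps(script_body, indent=2))

def control_reindex_batch(source, increment):
    """Python mirror of the Painless logic, for local experimentation."""
    if source.get("reindexBatch") is not None:
        source["reindexBatch"] += increment
    else:
        source["reindexBatch"] = 1
    return source

print(control_reindex_batch({"speaker": "HAMLET"}, 1))
print(control_reindex_batch({"speaker": "HAMLET", "reindexBatch": 1}, 10))
```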
# Reindex `hamlet` into `hamlet_3`, satisfying the following criteria: (i) disable refreshes of `hamlet_3` during the operation; (ii) apply the `control_reindex_batch` script with the `increment` parameter set to 1; (iii) reindex in two parallel slices
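A sketch of the two requests involved. "Disable refreshes" is interpreted here as setting `refresh_interval` to `-1` on the destination index before reindexing, which is an assumption about what the exercise intends. The stored script is referenced by `id`, and parallelism is requested with the `slices` URL parameter.

```python
import json

# Step 1: turn off periodic refreshes on the destination index.
settings_body = {"index": {"refresh_interval": "-1"}}

# Step 2: copy the documents behind the alias, applying the stored script.
reindex_body = {
    "source": {"index": "hamlet"},
    "dest": {"index": "hamlet_3"},
    "script": {
        "id": "control_reindex_batch",
        "params": {"increment": 1},
    },
}

print("PUT hamlet_3/_settings")
print(json.dumps(settings_body, indent=2))
print("POST _reindex?slices=2")
print(json.dumps(reindex_body, indent=2))
```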
# Update all documents in `hamlet_3` by initialising the `reindexBatch` field to 1, if not present
# (in one request) Add `hamlet_3` to the alias `hamlet`, and delete the `hamlet_1` and `hamlet_2` indices
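The `_aliases` endpoint accepts a list of actions, which is what makes a single request possible here. A sketch combining one `add` action with two `remove_index` actions:

```python
import json

aliases_body = {
    "actions": [
        {"add": {"index": "hamlet_3", "alias": "hamlet"}},
        # remove_index deletes the index itself, not just its alias entry.
        {"remove_index": {"index": "hamlet_1"}},
        {"remove_index": {"index": "hamlet_2"}},
    ]
}

print("POST _aliases")
print(json.dumps(aliases_body, indent=2))
```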
# Update all documents in `hamlet_3` by running the `control_reindex_batch` script with an `increment` of 10
# Remove from `hamlet_3` the documents that have "KING CLAUDIUS" as `speaker`
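A sketch using `_delete_by_query` with a `term` query. It assumes `speaker` is mapped as `keyword` in `hamlet_3` (for instance through a dynamic template), so that the exact value is matched as a single term.

```python
import json

delete_body = {
    "query": {
        # term matches the exact, unanalysed value of a keyword field
        "term": {"speaker": "KING CLAUDIUS"}
    }
}

print("POST hamlet_3/_delete_by_query")
print(json.dumps(delete_body, indent=2))
```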
# Store in the cluster state a new ingest pipeline named `split_act_scene_line`, which satisfies the following criteria: (i) it splits the value of `line_number` by using dots as the separator; (ii) it stores the split values into three new numeric fields, named `number_act`, `number_scene`, and `number_line`, respectively
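One possible pipeline, sketched under these assumptions: the `split` processor writes its result to a temporary field (named `line_parts` here, a made-up name), a `script` processor converts the three parts to integers, and a `remove` processor drops the temporary field. The `separator` is a regular expression, so the dot is escaped. The Python function mirrors the transformation for a local check.

```python
import json

pipeline_body = {
    "description": "Split line_number into act, scene and line numbers",
    "processors": [
        {"split": {"field": "line_number", "separator": "\\.",
                   "target_field": "line_parts"}},
        {"script": {
            "lang": "painless",
            "source": (
                "ctx.number_act = Integer.parseInt(ctx.line_parts[0]);"
                " ctx.number_scene = Integer.parseInt(ctx.line_parts[1]);"
                " ctx.number_line = Integer.parseInt(ctx.line_parts[2]);"
            ),
        }},
        {"remove": {"field": "line_parts"}},
    ],
}

print("PUT _ingest/pipeline/split_act_scene_line")
print(json.dumps(pipeline_body, indent=2))

def split_act_scene_line(doc):
    """Python mirror of the pipeline, for local experimentation."""
    act, scene, line = (int(part) for part in doc["line_number"].split("."))
    return {**doc, "number_act": act, "number_scene": scene, "number_line": line}

print(split_act_scene_line({"line_number": "3.1.64"}))
```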
# Update all documents in `hamlet_3` using the `split_act_scene_line` pipeline
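Existing documents can be passed through an ingest pipeline by naming it in the `pipeline` URL parameter of `_update_by_query`. A minimal sketch that builds the request line; no body is needed when every document should be processed.

```python
from urllib.parse import urlencode

# Name of the pipeline stored in the previous step.
query_string = urlencode({"pipeline": "split_act_scene_line"})
path = f"hamlet_3/_update_by_query?{query_string}"

print(f"POST {path}")
```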