- Elasticsearch Internals
- Lucene Indexing
- Java Library
- Shard
- A shard is an instance of Lucene
- max # of documents in a shard is Integer.MAX_VALUE - 128
- clients do not refer to shards directly (use the index instead)
- What's in a shard?
- indexed document gets analyzed, put in buffer
- buffer defaults to 10% of node heap (set with indices.memory.index_buffer_size)
- buffer is shared by all shards allocated to that node
- after buffer is full, or some time passes (refresh_interval), a "lucene flush" occurs
- documents are not searchable until flushed to a "segment"
- refresh parameter on index, update, delete, and bulk requests
- default: false
- true: force a lucene flush
- wait_for: wait for next flush
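- a minimal sketch of `wait_for` on a write request (index and field names are hypothetical):

```
PUT my_index/_doc/1?refresh=wait_for
{
  "title": "my first blog post"
}
```

- the request does not return until the next refresh has made the document searchable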
- Understanding Segments
- shard = Lucene index = collection of segments
- think of segment as an immutable "mini index"
- Each segment is a fully independent index
- Each segment is composed of a number of files
- Segment files are written to disk
- Segment files are never updated (read only)
- a search request is distributed to the shards and on each shard it is performed sequentially over the segments
- Segment Merges
- segments are periodically merged into larger segments
- keeps index size and the number of segments manageable
- deleted documents get expunged during a merge
- force a merge with POST index_name/_forcemerge
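- for example (hypothetical index name; `max_num_segments` is optional):

```
POST my_index/_forcemerge?max_num_segments=1
```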
- Elasticsearch Indexing
- index = 1 or more shards
- a shard consists of segments
- segments are fsynced to disk during a Lucene commit
- a relatively heavy operation
- should not be performed after every index or delete operation
- until a commit writes segment data to disk, it is susceptible to loss
- to prevent this data loss, ES implements a transaction log for each shard
- the transaction log is written at the same time as the in-memory buffer
- in the event of a failure, translog will be replayed
- force es flush with POST index_name/_flush
- synced flush
- performs a normal flush, then adds a generated unique marker (sync_id) to all shards
- sync_id provides a quick way to check if two shards are identical.
- happens every minute or so
- consider doing it before a cluster restart
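- a sketch of both flush variants (hypothetical index name):

```
POST my_index/_flush

POST my_index/_flush/synced
```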
- Doc Values
- a data structure that stores the values of a document on disk in a column-oriented fashion, at index time
- which makes sorting and aggregations faster
- Caching
- one cache per node that is shared by all shards
- LRU eviction policy
- it only caches queries inside a filter context
- a static node-level setting
- indices.queries.cache.size: "5%"
- at the segment level, the cache uses bit sets
- an array of bits, where each position represents a document
- super efficient
- Only built for segments that are "big enough"
- Shard request cache
- one cache per node that is shared by all shards
- LRU eviction policy
- Data modification invalidates the cache
- Good fit for indices that you don't write to anymore
- search requests return results almost instantly
- static setting that must be configured on every node:
- indices.request.cache.size: "5%"
- enabled on every index by default
- Field Modeling
- use cases for granular fields
- suppose a version number
- allow queries like 5.4 or 6.x
- instead of simple version number, index major, minor, and bug fix number + "display name"
- now we can search at any desired level
- Range data types introduced in 5.2
- integer_range
- float_range
- long_range
- double_range
- date_range
- ip_range
- use range query
- intersects
- contains
- within
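- a sketch of a range field plus a range query using a `relation` (names hypothetical; ES 6.x single `_doc` mapping type assumed):

```
PUT my_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "supported_versions": { "type": "integer_range" }
      }
    }
  }
}

GET my_index/_search
{
  "query": {
    "range": {
      "supported_versions": {
        "gte": 5,
        "lte": 6,
        "relation": "within"
      }
    }
  }
}
```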
- Mapping parameters
- https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-params.html
- index: false (field will still be in _source)
- doc_values: false (field will still be in _source, field can still be used in queries, but not aggregations)
- enabled: false (can not query or aggregate, will still be in _source)
- disabling _all
- _all will be removed from future versions of ES
- cannot use it in new indices as of ES 6
- disable with "_all":{"enabled": false} in mappings
- copy_to
- use case example: multiple fields that represent location (city, region, country), want to search them in a single query
- could just run a bool query across the fields
- or copy all the values to a single array field during indexing
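- a sketch of the location example (field names hypothetical):

```
PUT my_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "city":    { "type": "text", "copy_to": "full_location" },
        "region":  { "type": "text", "copy_to": "full_location" },
        "country": { "type": "text", "copy_to": "full_location" },
        "full_location": { "type": "text" }
      }
    }
  }
}
```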
- null: treated as if the field has no value
- the average of 5.0 and null is 5.0
- use null_value to set a default
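- a sketch of `null_value` (hypothetical field; the substitute value must match the field's type):

```
PUT my_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "rating": { "type": "float", "null_value": 0.0 }
      }
    }
  }
}
```

- explicit `null` values are then indexed as 0.0 and count toward averages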
- coercing data
- disable with coerce:false
- _meta field
- custom metadata in your mapping
- Dynamic templates
- dealing with a large number of fields
- dynamic field names not known at the time of mapping definition
- define a field's mapping based on
- datatype
- name of the field
- for example, map f_* as float (see the sketch below)
- path to the field
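- a sketch of the f_* example (template and index names are hypothetical):

```
PUT my_index
{
  "mappings": {
    "_doc": {
      "dynamic_templates": [
        {
          "floats_by_name": {
            "match": "f_*",
            "mapping": { "type": "float" }
          }
        }
      ]
    }
  }
}
```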
- controlling dynamic fields
- strict mappings
- by default, if a document is indexed with an unexpected field, the mapping is dynamically modified
- in prod, you will likely define mappings prior to indexing
- you probably do not want your mappings to change
- dynamic: strict
- do not index a doc that contains fields not already in mapping
- you can still modify the mapping directly
- other values:
- true, default behavior
- false, new fields will be ignored
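- a sketch of a strict mapping (hypothetical index and field):

```
PUT my_index
{
  "mappings": {
    "_doc": {
      "dynamic": "strict",
      "properties": {
        "title": { "type": "text" }
      }
    }
  }
}
```

- indexing a document with any other field is now rejected, but you can still PUT new fields into the mapping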
- Fixing Data
- Tools for fixing data
- Painless is a scripting language designed specifically for use with ES
- fast, secure, groovy-like syntax
- supports all of Java's data types and a subset of the Java API
- use cases
- accessing document fields
- update/delete a field in a document
- perform computations on fields before returning them
- customize the score of a doc
- working with aggregations
- Ingest node
- execute a script within an ingest pipeline
- Reindex API
- manipulate data as it is getting reindexed
- Basic example: updating a field
- increment number of views on a blog post
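- a sketch, assuming a blogs index whose docs have a views field:

```
POST blogs/_doc/1/_update
{
  "script": {
    "lang": "painless",
    "source": "ctx._source.views += params.increment",
    "params": { "increment": 1 }
  }
}
```

- passing the value as a parameter lets the compiled script be reused from the cache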
- Can be defined inline, or stored (POST /_scripts/script_name)
- scripts are cached (inline or stored)
- new script can evict a cached script
- default cache is 100 scripts
- configure using script.cache.max_size and script.cache.expire
- compilation can be expensive
- default: 75 compilations per 5-minute window (75/5m)
- configured by script.max_compilations_rate
- avoid by using parameters
- Languages like Groovy, Javascript, and Python are no longer available as of ES6
- Reindex API
- take data from one index, optionally manipulate, and write to a new one.
- Updated by query
- like the Reindex API, but updates documents in place instead of writing to a new index
- delete by query
- like SQL's DELETE ... WHERE
- fixing mappings
- you cannot change the data type in a mapping
- must define a new index with the desired data types
- POST _reindex, defining source and dest indices
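- a minimal sketch (index names are hypothetical, matching the blogs example below):

```
POST _reindex
{
  "source": { "index": "blogs" },
  "dest": { "index": "blogs_fixed" }
}
```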
- reindexing tips
- conflicts parameter
- picking up mapping changes
- you can always add fields, but updating the mapping won't update already-indexed documents
- POST blogs_fixed/_update_by_query
- reindex batch pattern: use a field with a batch ID number; to continue after a failure, use _update_by_query to finish where you left off
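- a sketch of the batch pattern (field name and batch number are hypothetical); conflicts=proceed keeps going past version conflicts:

```
POST blogs_fixed/_update_by_query?conflicts=proceed
{
  "query": {
    "bool": {
      "must_not": { "term": { "reindexBatch": 2 } }
    }
  },
  "script": {
    "source": "ctx._source.reindexBatch = params.batch",
    "params": { "batch": 2 }
  }
}
```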
- fixing fields
- splitting a field
- Ingest Node
- provide the ability to preprocess a doc right before it gets indexed
- intercepts an index or bulk API request
- applies transformations
- passes the document back to the index or bulk API
- when indexing a doc, you specify a pipeline
- centralized, pipelines stored in cluster state
- by default, all nodes can ingest. node.ingest=true
- pipeline
- a set of processors
- similar to a filter in Logstash
- has read and write access to documents that pass through the pipeline
- defined with PUT on the Ingest API
- description
- processors
- on_failure
- simulate a pipeline with _simulate endpoint
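- a sketch with a hypothetical pipeline name (lowercase and set are real processors); _simulate runs it without indexing anything:

```
PUT _ingest/pipeline/my_pipeline
{
  "description": "lowercase the email field",
  "processors": [
    { "lowercase": { "field": "email" } }
  ],
  "on_failure": [
    { "set": { "field": "error", "value": "{{ _ingest.on_failure_message }}" } }
  ]
}

POST _ingest/pipeline/my_pipeline/_simulate
{
  "docs": [
    { "_source": { "email": "USER@EXAMPLE.COM" } }
  ]
}

PUT my_index/_doc/1?pipeline=my_pipeline
{
  "email": "USER@EXAMPLE.COM"
}
```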
- Advanced Search & Aggregations
- Searching for patterns
- wildcard query is a term-based search that contains a wildcard
- `*` = anything
- `?` = any single character
- can be expensive, especially if you have leading wildcards
- regex query
- expensive
- powerful
- based on lucene regex engine https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-regexp-query.html#regexp-syntax
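- minimal sketches of both (index and field names hypothetical):

```
GET my_index/_search
{
  "query": { "wildcard": { "title": "elast*" } }
}

GET my_index/_search
{
  "query": { "regexp": { "title": "elast[a-z]+" } }
}
```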
- dealing with null values
- documents will likely have missing or null fields
- exists query
- wrap exists in a must_not to find docs that are missing a field
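- for example, docs missing an author field (names hypothetical):

```
GET my_index/_search
{
  "query": {
    "bool": {
      "must_not": { "exists": { "field": "author" } }
    }
  }
}
```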
- scripted queries
- script queries must compute a boolean value
- test a query with the _execute API
- contexts: filter, or scoring
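- a sketch of a script query in a filter context (hypothetical views field):

```
GET my_index/_search
{
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": {
            "lang": "painless",
            "source": "doc['views'].value > params.min",
            "params": { "min": 10 }
          }
        }
      }
    }
  }
}
```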
- scripted fields
- add fields to a query response that are generated in a script
- script_fields in query body
- expensive, consider adding as a field, calculating on ingest
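- a script_fields sketch (hypothetical numeric views field):

```
GET my_index/_search
{
  "query": { "match_all": {} },
  "script_fields": {
    "views_doubled": {
      "script": {
        "lang": "painless",
        "source": "doc['views'].value * params.factor",
        "params": { "factor": 2 }
      }
    }
  }
}
```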
- Search templates
- like stored procedures
- mustache templates, script.lang=mustache
- does not have if/else logic
- but you can define a section that gets skipped if the parameter is false or not defined
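- a sketch of storing and invoking a template (names hypothetical); note that conditional sections like {{#param}}...{{/param}} require the source to be a string rather than a JSON object:

```
POST _scripts/blog_search
{
  "script": {
    "lang": "mustache",
    "source": {
      "query": {
        "match": { "title": "{{query_string}}" }
      }
    }
  }
}

GET blogs/_search/template
{
  "id": "blog_search",
  "params": { "query_string": "elasticsearch" }
}
```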
- percentiles aggregation
- returns a default set of percentiles (1, 5, 25, 50, 75, 95, 99), but this can be overridden
- percentile_ranks (reverse percentiles)
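- a percentiles sketch with custom percents (index and field names hypothetical):

```
GET logs/_search
{
  "size": 0,
  "aggs": {
    "load_time_pcts": {
      "percentiles": { "field": "load_time", "percents": [50, 95, 99] }
    }
  }
}
```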
- top_hits
- missing
- scripted aggregations
- combine fields for a terms agg
- significant terms aggregation
- terms aggregations + noise filter
- discards the commonly-occurring terms that a plain terms agg would return
- low frequency terms in the background data pop out as high frequency terms in the foreground data
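- a sketch where the query defines the foreground set (index and field names hypothetical):

```
GET news/_search
{
  "query": { "match": { "body": "crime" } },
  "size": 0,
  "aggs": {
    "significant_topics": {
      "significant_terms": { "field": "tags" }
    }
  }
}
```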
- Pipeline aggregations
- performs computations on the results of other aggregations
- the input of a pipeline agg is the output of another agg
- use the buckets_path parameter to reference the values in the other aggregations
- executed on the coordinating node, after the results of the input agg are collected
- https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline.html
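- a sketch: max_bucket over a date_histogram (index and field names hypothetical):

```
GET sales/_search
{
  "size": 0,
  "aggs": {
    "sales_per_month": {
      "date_histogram": { "field": "date", "interval": "month" },
      "aggs": {
        "monthly_sum": { "sum": { "field": "price" } }
      }
    },
    "best_month": {
      "max_bucket": { "buckets_path": "sales_per_month>monthly_sum" }
    }
  }
}
```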
- Cluster Management
- Architecture Recap
- largest unit is a cluster
- cluster made up of nodes
- each node runs an ES process, and typically there is a 1:1 relationship between a server and a node
- indexes and shards
- cluster contains one or more indices
- each index is sharded, and its shards are distributed over nodes
- ES automatically distributes the shards evenly across the nodes
- maybe you want more control
- your servers' specifications
- which nodes shards are allocated to
- data management strategy? some data on less expensive hardware?
- Dedicated Nodes
- several roles a node can have
- master eligible
- data
- ingest
- machine learning
- coordinating
- nodes can be dedicated only to a single role
- by default, a node is master-eligible, data, and ingest
- node.master=true
- node.data=true
- node.ingest=true
- why use?
- dedicated master eligible nodes
- focus on cluster state
- machines with low CPU, RAM, and disk resources
- dedicated data nodes
- fast disks, lots of ram
- can focus on data storage and processing client requests
- dedicated ingest nodes
- can focus on data processing
- machines with low disk, medium RAM, and high CPU
- coordinating-only node
- machines with low disk, medium/high RAM, and medium/high CPU resources
- behave like smart load balancers
- perform the gather/reduce phase of search requests
- lightens the load on data nodes
- coordinating + ingest not uncommon
- Hot/Warm architecture
- you can configure the dedicated data nodes in your cluster to use a hot/warm arch
- useful for scenarios where you want to control which nodes perform indexing vs query handling
- fine-grained control over data allocation
- dedicated data nodes can be used as
- hot nodes for indexing
- warm nodes for older, read-only indices
- larger amounts of data may require additional nodes to meet performance requirements
- Shard Filtering
- the ability to control which nodes the shards for an index are allocated to
- use node.attr to tag your nodes
- use index.routing.allocation to assign indexes to nodes
- three types of rules for assigning indexes to node
- index.routing.allocation.include
- index.routing.allocation.exclude
- index.routing.allocation.require
- PUT _settings on index to configure
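- a sketch, assuming nodes were started with a node.attr.temperature attribute:

```
PUT my_index/_settings
{
  "index.routing.allocation.require.temperature": "hot"
}
```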
- Shard filtering for Hardware
- you can use multiple tags, like "temperature" and "size"
- example: use nodes that are medium or small, but not hot.
- make sure you don't exclude all nodes!
- https://www.elastic.co/guide/en/elasticsearch/reference/6.6/ilm-policy-definition.html
- Shard allocation awareness
- make nodes aware of where they are (racks, availability zone)
- you can make ES aware of the physical configuration of your hardware
- cluster.routing.allocation.awareness.attributes
- useful when more than one node shares the same resource (disk, host machine, network switch, rack, etc)
- Forced Awareness
- suppose you have a rack or zone that fails
- the remaining rack will try to reassign all the missing replicas
- the single rack might not be able to handle that type of volume
- you might not want ES to panic when you restart a rack
- stay yellow during a temporary event
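- a sketch of forced awareness over two zones (attribute name and values are hypothetical):

```
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "zone",
    "cluster.routing.allocation.awareness.force.zone.values": "zone1,zone2"
  }
}
```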
- Capacity Planning
- designing for scale
- default settings can take you a long way
- proper design can make scaling easier
- one shard does not scale well
- two shards can scale if we add a node
- shard overallocation
- if you are expecting your cluster to grow, it's good to plan for that by overallocating shards
- num shards > num nodes
- be careful
- unit of scaling is the shard
- but you should not have too many
- especially under-utilized shards
- each shard uses resources from the machine
- if you have too many, it will seriously slow things down
- too many shards, ES will be slow
- "too much"
- a little overallocation is good
- a "kajillion" shards is not
- a shard can optimally hold from 10-40GB
- optimal depends on the use case
- a 1GB shard is under-utilized
- https://www.elastic.co/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster
- do not overshard
- business req: 1GB per day, 6 months retention, ~180GB
- could easily have this in 10 shards
- common scenario
- 3 different logs
- 1 index per day each day
- 5 shards
- 6 months retention
- ~2700 shards! (too many)
- index templates!
- rollover API
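- a rollover sketch, assuming a logs-write alias whose underlying index name ends in -000001 so ES can auto-increment:

```
POST logs-write/_rollover
{
  "conditions": {
    "max_age": "1d",
    "max_docs": 10000000
  }
}
```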
- so how many shards should I configure?
- no simple formula
- "it depends!"
- too many factors involved:
- use case: metrics, logging, search, api etc
- hardware
- num of docs
- size and complexity of docs
- before trying to determine your capacity, determine your SLA's
- docs/second
- queries/second
- maximum response time for queries
- get some production data
- actual documents
- actual queries
- actual mappings
- create a one-node (one shard?) cluster
- push it until it breaks (breaks SLA's)
- Rally automates this: https://esrally.readthedocs.io/en/stable/quickstart.html
- run it in docker
- number of primary shards
- not an exact science, but you can
- estimate the total amount of data for your index
- leave room for growth
- divide by the maximum capacity of a single shard
- the number of primary shards is usually determined by indexing speed
- for searching, the total number of shards is the measurement (agnostic of index)
- note we are not talking about replicas here, just number of primary shards
- measure indexing capacity of a node
- index documents in a similar way as your app will
- index in parallel until error 429
- use this to calculate the number of docs/second
- you can calculate how to scale your nodes for ingestion based on your SLA
- scaling with replicas
- adding replicas provides an additional level of scaling
- we know replicas provide high availability
- they also provide scaling for reads and searches (IF YOU ADD ADDITIONAL HARDWARE)
- cost of replicas:
- slower indexing speed (although replicas are indexed in parallel)
- more storage on disk
- larger associated heap memory footprint of the extra shard(s)
- scaling with indices
- using multiple indices also provides scaling
- if you need to add capacity, consider just creating a new index
- then search across both indices to search new and old data
- you could even define a single alias for the multiple indices
- searching 1 index with 50 shards is equivalent to searching 50 indices with 1 shard each
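- a sketch of the alias approach above (index and alias names hypothetical):

```
POST _aliases
{
  "actions": [
    { "add": { "index": "logs-2019.01", "alias": "logs" } },
    { "add": { "index": "logs-2019.02", "alias": "logs" } }
  ]
}
```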
- Capacity planning use cases
- understand
- what your data looks like
- how that data is going to be searched
- two common use cases are:
- searching fixed-size data: a large dataset that may grow slowly
- care about relevancy, not so much timeliness
- planning
- size your indices based on the maximum shard capacity vs the amount of data you anticipate
- if you need to increase capacity, either reindex or add multiple indices (preferably not often)
- easy to increase throughput by adding nodes and replicas
- time based data: grows rapidly, like logs, social media streams, time-based events
- all docs have a timestamp, and likely do not change
- planning
- searching usually involves a timestamp
- older docs become less important
- data ingestion is a key factor
- typically a lot of data coming in
- you do not want indexing to become a bottleneck (it has to keep up)
- common pattern: time-based indices
- new index per day, week, month, or year
- add the date to the name of the index
- search using index wildcards or aliases
- data ingestion
- hardcoding the index name in the app is not scalable
- you can use date math in your index name, like <logstash-{now/d}>
- update aliases with curator: https://www.elastic.co/guide/en/elasticsearch/client/curator/5.6/index.html
- aliases
- simple to define
- need a tool to update the alias
- no need to redeploy your application
- managing time-based indices
- for optimal ingest rates
- spread shards of your active index over as many nodes as possible
- optimal search
- shrink older indices down to the optimal number of shards
- close indices that are no longer being searched
- use hot/warm architecture
- hot for indexing, warm for querying
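- a _shrink sketch (index and node names hypothetical); the index must first be made read-only with all shards co-located on one node:

```
PUT logs-2019.01/_settings
{
  "index.blocks.write": true,
  "index.routing.allocation.require._name": "warm_node_1"
}

POST logs-2019.01/_shrink/logs-2019.01-shrunk
{
  "settings": { "index.number_of_shards": 1 }
}
```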
- Document Modeling
- the need for doc modeling
- in ES, you typically denormalize your data
- flat world has advantages
- indexing and searching is fast
- no need to join tables or lock rows
- sometimes relationships matter
- we need to bridge the gap between normal and flat
- four common techniques for managing relational data in ES
- denormalization
- store redundant copies of data in each document
- example, tweet + user
- optimizing for reads
- no mechanism for keeping denormalized data up to date; avoid denormalizing data that is likely to change frequently
- when denormalized data does change, ensure that there is one and only one authoritative source for the data
- if you do denormalize data that might change, it can help to also denormalize a field that does not change, like an _id.
- try not to maintain state in ES, instead try to record events
- app-side joins (run multiple queries)
- not recommended
- nested objects (for working with arrays of objects)
- parent/child relationships
- the need for nested types
- ES flattens objects (tags.key, tags.value)
- allows arrays of objects to be indexed and queried independently of each other
- use nested if you need to maintain the relationship of each object in the array
- nested objects are stored separately
- nested types
- querying a nested type
- {query: {nested: {path: tags, query:{}}}}
- inner_hits to see why a document matched
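- a sketch of a nested query with inner_hits, using the tags.key/tags.value example above (index name hypothetical):

```
GET my_index/_search
{
  "query": {
    "nested": {
      "path": "tags",
      "query": {
        "bool": {
          "must": [
            { "match": { "tags.key": "color" } },
            { "match": { "tags.value": "red" } }
          ]
        }
      },
      "inner_hits": {}
    }
  }
}
```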
- the nested aggregation
- puts nested objects in a bucket
- useful for performing sub-aggregations
- parent/child relationship
- 1-N relationships
- updating a nested object requires reindexing the root object and a complete reindexing of all its nested objects. nested objects are a READ optimization.
- using a join datatype, you can completely separate two objects while maintaining their relationship
- parent and child are completely separate documents
- the parent can be updated without reindexing the children
- children can be added/changed/deleted without affecting the parent or other children.
- it's a write optimization.
- configured in your mappings using the join datatype
- parent and all children must live in the same shard
- overrides the default shard routing algorithm, uses the parent's _id
- therefore, every time you refer to a child document, you must specify its parent's id
- defining the relationship
- define the mapping, type: join.
- index some parent documents
- index some child documents, ?routing=parent_id
- index/_search will return parents and children
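- a sketch with a hypothetical question/answer relation (ES 6.x single _doc type assumed):

```
PUT my_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "my_join_field": {
          "type": "join",
          "relations": { "question": "answer" }
        }
      }
    }
  }
}

PUT my_index/_doc/1
{
  "text": "parent (question) doc",
  "my_join_field": "question"
}

PUT my_index/_doc/2?routing=1
{
  "text": "child (answer) doc",
  "my_join_field": { "name": "answer", "parent": "1" }
}
```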
- the has_child query
- has_child type: relation_name
- filter that accepts a query and the child type to run against
- results in parent document
- the has_parent query
- matches children whose parent matches the query
- always need id of parent when you query children
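- sketches of both queries against the hypothetical question/answer mapping above:

```
GET my_index/_search
{
  "query": {
    "has_child": {
      "type": "answer",
      "query": { "match": { "text": "child" } }
    }
  }
}

GET my_index/_search
{
  "query": {
    "has_parent": {
      "parent_type": "question",
      "query": { "match": { "text": "parent" } }
    }
  }
}
```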
- Kibana has very limited support for nested types and parent/child relationships
- Monitoring and Alerting
- Stats API
- Task management API
- shows how busy the server is
- _cat API
- wrapper around many JSON APIs
- command-line tool friendly
- helpful for simple monitoring
- v parameter to show headers
- h parameter to specify column, * for all
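- for example (these are real _cat column names):

```
GET _cat/indices?v&h=index,docs.count,store.size

GET _cat/nodes?v&h=name,heap.percent,cpu
```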
- Diagnosing performance issues
- thread pool queue
- GET _nodes/thread_pool
- GET _nodes/stats/thread_pool
- when a queue is full, error 429
- _cat/thread_pool?v
- a full queue can be good or bad
- OK if bulk indexing is faster than ES can handle
- bad if search queue is full
- the hot_threads API
- allows you to view current hot threads on each node
- GET _nodes/hot_threads
- GET _nodes/<node_id>/hot_threads
- The indexing slow log
- captures long-running index operations to a log file
- logs indexing events that take longer than configured thresholds
- defaults set in log4j2.properties
- can be overridden in index settings, index.indexing.slowlog
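- a sketch of overriding thresholds per index (threshold values are hypothetical):

```
PUT my_index/_settings
{
  "index.indexing.slowlog.threshold.index.warn": "10s",
  "index.indexing.slowlog.threshold.index.info": "5s"
}
```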
- search slow logs
- disabled by default
- usefulness can be limited since it logs per shard
- Packetbeat may be a better solution
- Profile API
- just set profile to true in your query
- profiler response: hard to read, info on each shard that processed the search
- nice support in Kibana
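- for example (index and field names hypothetical):

```
GET my_index/_search
{
  "profile": true,
  "query": { "match": { "title": "elasticsearch" } }
}
```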
- Elastic monitoring component
- use elasticsearch to monitor elasticsearch
- xpack.monitoring.collection.enabled defaults to false
- the stats of all nodes are indexed into Elasticsearch
- common settings
- xpack.monitoring.collection.indices (which indices to collect from)
- .interval: how often samples are collected, default 10s
- .duration: how long before indices created by monitoring are automatically deleted. default 7d
- https://www.elastic.co/guide/en/elasticsearch/reference/current/monitoring-settings.html
- dedicated monitoring cluster
- reduce the load and storage on your other clusters
- access to monitoring even when other clusters are unhealthy
- separate security levels for monitoring and production clusters
- if elastic security is enabled, then provide credentials
- you can also use HTTP exporters, xpack.monitoring.exporters
- monitoring multiple clusters
- many clusters can be monitored with a single monitoring cluster (gold/platinum)
- monitoring UI
- you can see heap pressure
- Clusters dashboard
- Configure alerts
- for unexpected or extreme situations
- elastic alerting
- watch for changes or anomalies in your data
- perform the necessary actions in response
- for example:
- open a helpdesk ticket when servers are running out of free space
- track network activity to detect malicious activity
- send immediate notification if nodes leave the cluster
- watch
- trigger: when the watch is executed
- input
- loads data into the watch payload
- condition
- whether the watch actions are executed
- actions
- actions to execute if condition is true
- https://www.elastic.co/guide/en/elastic-stack-overview/current/actions.html
- PUT _xpack/watcher/watch/match_name
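- a minimal sketch, mirroring the cluster-health watch example from the Watcher docs (watch name hypothetical):

```
PUT _xpack/watcher/watch/cluster_health_watch
{
  "trigger": { "schedule": { "interval": "10s" } },
  "input": {
    "http": {
      "request": { "host": "localhost", "port": 9200, "path": "/_cluster/health" }
    }
  },
  "condition": {
    "compare": { "ctx.payload.status": { "eq": "red" } }
  },
  "actions": {
    "log_error": {
      "logging": { "text": "Cluster health is RED" }
    }
  }
}
```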
- watch executions are stored in ES too (.watcher-history/_search)
- users can have watcher-specific role
- From Dev to Production
- disabling dynamic indexes
- put a doc into an index that doesn't exist, and it'll be created
- disable in a prod environment
- PUT _cluster/settings
- action.auto_create_index: false
- You can whitelist certain patterns, usually time-series indexes
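- a sketch (patterns are hypothetical; a leading + allows, - denies):

```
PUT _cluster/settings
{
  "persistent": {
    "action.auto_create_index": "+logs-*,-*"
  }
}
```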
- dev vs prod mode
- HTTP vs. Transport
- HTTP: how APIs are exposed
- ports in 9200-9299 range
- transport: used for internal communication between nodes in cluster
- ports in 9300-9399 range
- binds to localhost by default
- dev mode: if it does not bind to an external interface
- prod mode: if it does bind transport to an external interface
- bootstrap checks
- dev mode: any bootstrap checks that fail will generate warnings
- prod mode: will refuse to start
- best practices
- avoid running over WAN links between datacenters
- not officially supported by elastic
- try to have 0 or few hops between nodes
- If you have multiple network cards, separate transport and HTTP traffic
- bind to different interfaces
- use separate firewall rules for each kind of traffic
- use long-lived HTTP connections
- client libraries support this
- use a proxy/load-balancer
- storage best practices
- segments are immutable, so the write amplification factor approaches one and is a non-issue
- local disk is king!
- avoid network storage
- Elasticsearch does not need redundant storage
- replicas provide HA in software
- local disks are better than SAN
- RAID 1/5/10 is not necessary
- if you have multiple disks in the same server, use RAID 0 or path.data
- RAID 0
- splits data evenly across two or more disks
- perfect distribution of the data across the disks
- if you lose one disk, you lose them all
- path.data
- distribute index across multiple SSD's
- potential for an unbalanced distribution
- if you lose one disk, the data on the other disks are preserved
- SSD: use the noop or deadline I/O scheduler in the OS https://www.elastic.co/guide/en/elasticsearch/guide/current/hardware.html#_disks
- spinning disks are OK for warm nodes, but discourage concurrent merges
- index.merge.scheduler.max_thread_count: 1
- Trim your SSD's https://www.elastic.co/blog/is-your-elasticsearch-trimmed
- Hardware selection
- in general, choose medium machines over large machines
- loss of a large node has greater impact
- prefer six servers with 4 CPUs, 64GB RAM, and 4×1TB drives
- avoid two servers with 12 CPUs, 256GB RAM, and 12×1TB drives
- avoid running multiple nodes on one server
- one ES instance can fully consume a machine
- larger machines can be helpful as warm nodes
- configure shard allocation filtering as previously discussed.
- Cloud strategies
- use discovery plugin
- dynamically configure the unicast hosts
- span the cluster across more than one AZ
- prefer ephemeral storage over network storage
- EBS: maybe provision IOPS?
- snapshot to cloud storage with repository plugins
- use shard awareness and forced awareness
- avoid instances marked with low networking performance.
- Throttles
- relocation and recovery are throttled to ensure these tasks do not have a negative impact
- Recovery
- for faster recovery, temporarily increase the number of concurrent recoveries
- _cluster/settings
- cluster.routing.allocation.node_concurrent_recoveries
- Relocation
- for faster rebalancing of shards, increase cluster.routing.allocation.cluster_concurrent_rebalance
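- a sketch of bumping both throttles temporarily (values are hypothetical):

```
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.node_concurrent_recoveries": 5,
    "cluster.routing.allocation.cluster_concurrent_rebalance": 4
  }
}
```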
- JVM configuration
- only 64-bit JVMs are supported
- in general, give half the machine's memory to the JVM
- configured in config/jvm.options
- what goes in the heap?
- indexing buffer
- completion suggester
- cluster state
- caches
- node query cache
- shard query cache
- fielddata
- and more
- HEAP size
- default is 1GB, not high enough for production
- configure with Xms (min heap) and Xmx(max heap)
- like -Xms8g -Xmx8g
- set Xmx to no more than 50% of your physical RAM
- rule of thumb for setting the JVM heap
- do not exceed more than 30GB of memory (to not exceed the compressed ordinary object pointers limit)
- https://www.elastic.co/blog/a-heap-of-trouble
- common causes of poor query performance
- use filters to benefit from caching
- limit the scope of aggregations by adding a query
- use the sampler bucket aggregation
- don't try to use ES like a relational database
- consider denormalization, or polyglot architecture
- avoid parent/child relationships
- too many shards
- more shards == slower query performance
- the default of 5 shards can be too high for some scenarios
- having thousands of 100MB indices with 5 shards each is not good
- solution
- shards can be fairly large (40GB is good)
- create as few of them as needed
- unnecessary scripting
- try to run scripts at index time instead of query time
- regex are slow
- use them less
- be clever with indexing: if you need to search the end of a string, try the reverse token filter
- Cross-cluster search
- allow any node to act as a client for executing queries across multiple clusters
- introduced in ES 5.3, replaces "tribe node"
- remote clusters are configured in the cluster settings
- using the cluster.remote property
- seeds is a list of nodes in the remote cluster used to retrieve the cluster state when registering the remote cluster
- search with GET cluster_name:index-name/_search
- search multiple clusters with GET index_name, cluster_name:blog/_search
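- a sketch (cluster name and seed address are hypothetical; the cluster.remote form assumes a recent 6.x release):

```
PUT _cluster/settings
{
  "persistent": {
    "cluster.remote.cluster_two.seeds": ["10.0.1.1:9300"]
  }
}

GET cluster_two:blogs/_search
{
  "query": { "match_all": {} }
}
```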
- how it works
- from a search execution perspective, there is no difference between local indices and remote indices
- as long as local nodes can reach remote nodes
- coordinating node gets size hits from each shard (local and remote) and performs reduction
- top hits are fetched from the relevant shards and returned to client
- you can use wildcards in cluster names
- Overview of Upgrades
- Versions are major.minor.patch/maintenance
- ES can use indices created in the previous major version, but older indices must be reindexed or deleted
- 6.x can use indices created in 5.x
- 6.x cannot use indices created in 2.x or before
- 5.x can use indices created in 2.x
- 5.x cannot use indices created in 1.x or before
- ES will fail to start if an incompatible index is found
- modern versions, you can do rolling upgrades
- https://www.elastic.co/products/upgrade_guide
- Cluster restart strategies
- rolling restart
- zero downtime
- reads and writes continue to operate normally
- Full cluster
- can be faster
- steps for a rolling restart
- stop non-essential indexing (if possible)
- disable shard allocation
- transient: cluster.routing.allocation.enable: none
- POST _flush/synced
- stop and update one node
- restart the node
- check node health
- re-enable shard allocation and wait
- check cluster health
- go back to "disable shard allocation"
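- a sketch of the allocation toggle used in the steps above (setting it back to null restores the default):

```
PUT _cluster/settings
{
  "transient": { "cluster.routing.allocation.enable": "none" }
}

POST _flush/synced

PUT _cluster/settings
{
  "transient": { "cluster.routing.allocation.enable": null }
}
```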
- full cluster restart
- stop indexing
- disable shard allocation
- make sure to use "persistent"
- synced flush
- shutdown and update all nodes
- start all dedicated master nodes
- wait for master election
- start the other nodes
- wait for yellow
- re-enable shard allocation
- Resources