Optimize your Elasticsearch/OpenSearch indexing performance with these key adjustments:
- Java Heap Size:
  - Default: Varies.
  - Recommended: 50% of RAM.
  - Example: For 32GB RAM, set heap size to 16GB.
- Flush Translog Threshold:
  - Default: `512MB`.
  - Recommended: Increase to 25% of Java heap.
  - Example: For a 16GB heap, set to `4GB`.
- Index Refresh Interval:
  - Default: `1s`.
  - Recommended: Increase during heavy indexing. Disable it or set it to `30s`.
  - Example: `"index.refresh_interval": "30s"`.
- Index Buffer Size:
  - Default: 10% of JVM memory.
  - Recommended: Increase up to 25% for heavy indexing.
  - Example: For a 16GB heap, up to `4GB`.
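The heap, translog, and index buffer figures above all derive from total RAM, so the arithmetic can be sketched in a few lines. This is a minimal illustration, not a sizing tool: the function name `indexing_memory_plan` and its parameters are hypothetical, and the fractions are simply the recommended values from the list.

```python
def indexing_memory_plan(ram_gb: float,
                         heap_fraction: float = 0.5,
                         translog_fraction: float = 0.25,
                         buffer_fraction: float = 0.25) -> dict:
    """Return heap, translog flush threshold, and index buffer sizes in GB.

    Fractions mirror the recommendations in the list (assumed defaults):
    heap = 50% of RAM, translog threshold and index buffer = 25% of heap.
    """
    if ram_gb <= 0:
        raise ValueError("ram_gb must be positive")
    heap_gb = ram_gb * heap_fraction
    return {
        "heap_gb": heap_gb,
        "translog_flush_threshold_gb": heap_gb * translog_fraction,
        "index_buffer_gb": heap_gb * buffer_fraction,
    }


plan = indexing_memory_plan(32)
print(plan)
# {'heap_gb': 16.0, 'translog_flush_threshold_gb': 4.0, 'index_buffer_gb': 4.0}
```

For a 32GB node this reproduces the examples in the list: a 16GB heap, with up to 4GB each for the translog flush threshold and the index buffer.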
- Concurrent Merges (`max_merge_count`):
  - Default: Varies.
  - Recommended: Increase if experiencing index throttling.
  - Example: `"index.merge.scheduler.max_merge_count": 6`.
- Shard Distribution:
  - Formula: Number of shards = k * (Number of data nodes), where k is a positive integer, so that shards spread evenly across the nodes.
  - Example: For 8 nodes, with k=3, ensure 24 shards in the index.
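The shard formula can be turned into a create-index request body. The sketch below is illustrative only: `shard_count` and `create_index_body` are hypothetical helper names, and the body uses the standard `number_of_shards` index setting.

```python
def shard_count(data_nodes: int, k: int) -> int:
    """Number of primary shards as a whole multiple of the data node count."""
    if data_nodes < 1 or k < 1:
        raise ValueError("data_nodes and k must be positive integers")
    return k * data_nodes


def create_index_body(data_nodes: int, k: int) -> dict:
    """Request body for creating an index with the computed shard count."""
    return {"settings": {"index": {"number_of_shards": shard_count(data_nodes, k)}}}


print(shard_count(8, 3))        # 24
print(create_index_body(8, 3))  # {'settings': {'index': {'number_of_shards': 24}}}
```

Because the shard count is fixed when the index is created, this choice is best made before bulk loading starts.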
- Setting Replica Count to Zero:
  - Concern: Potential data loss during node failures.
  - Example: Set `"index.number_of_replicas": 0` during heavy indexing, then revert once indexing completes.
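The refresh interval and replica changes are usually applied together before a bulk load and reverted afterwards. The following sketch builds the two settings bodies that would be sent to the index `_settings` endpoint. The function names are hypothetical, and the restore value of one replica is an assumption; use whatever replica count the index had before the load.

```python
import json


def bulk_load_settings(refresh_interval: str = "30s") -> dict:
    """Temporary settings for heavy indexing: slower refresh, no replicas."""
    return {"index": {"refresh_interval": refresh_interval,
                      "number_of_replicas": 0}}


def restore_settings(replicas: int = 1, refresh_interval=None) -> dict:
    """Settings to apply after the load.

    A refresh_interval of None serializes to JSON null, which resets the
    setting to its default. replicas=1 is an assumed original value.
    """
    return {"index": {"refresh_interval": refresh_interval,
                      "number_of_replicas": replicas}}


print(json.dumps(bulk_load_settings()))
print(json.dumps(restore_settings()))
```

Keeping the restore body next to the bulk-load body makes it harder to forget the revert step, which matters because an index with zero replicas has no redundancy if a node fails.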
- Optimal Bulk Request Size:
  - Start: 5 MiB to 15 MiB.
  - Adjust: Increase the size gradually until throughput stops improving.
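One way to experiment with bulk request sizes is to group documents into newline-delimited bulk bodies that stay under a byte limit. This is a minimal sketch under stated assumptions: `bulk_bodies` is a hypothetical helper, it only uses the `index` action, and it does not send anything over the network.

```python
import json
from typing import Iterable, Iterator


def bulk_bodies(index: str, docs: Iterable[dict],
                max_bytes: int = 5 * 1024 * 1024) -> Iterator[bytes]:
    """Yield bulk request bodies (NDJSON) no larger than max_bytes each.

    A single document larger than max_bytes is still emitted on its own.
    """
    batch, size = [], 0
    for doc in docs:
        action = json.dumps({"index": {"_index": index}})
        entry = f"{action}\n{json.dumps(doc)}\n".encode("utf-8")
        # Start a new body if adding this entry would exceed the limit.
        if batch and size + len(entry) > max_bytes:
            yield b"".join(batch)
            batch, size = [], 0
        batch.append(entry)
        size += len(entry)
    if batch:
        yield b"".join(batch)


docs = [{"n": i} for i in range(10)]
bodies = list(bulk_bodies("logs", docs, max_bytes=100))
print(len(bodies), [len(b) for b in bodies])
```

Starting with `max_bytes` at 5 MiB and raising it between runs while measuring throughput follows the tuning approach described in the list.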
- Instance Type with SSD:
  - Use SSD-backed instances (e.g., AWS I3) for superior ingestion performance.
- Reduce Response Size:
  - Use `filter_path` to limit response data.
  - Example: `?filter_path=-took,-items.*._index`.
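To show where the `filter_path` parameter goes, the sketch below builds a bulk endpoint URL with the filter from the example. The helper name `bulk_url` and the `http://localhost:9200` address are assumptions for illustration.

```python
from urllib.parse import urlencode


def bulk_url(base_url: str,
             filter_path: str = "-took,-items.*._index") -> str:
    """Build a _bulk URL whose response omits the filtered fields."""
    # Keep ',' and '*' unescaped so the filter stays readable in the URL.
    query = urlencode({"filter_path": filter_path}, safe=",*")
    return f"{base_url.rstrip('/')}/_bulk?{query}"


print(bulk_url("http://localhost:9200"))
# http://localhost:9200/_bulk?filter_path=-took,-items.*._index
```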
- Compression Codecs (OpenSearch 2.9+):
  - Default: LZ4.
  - Recommended: `zstd` or `zstd_no_dict` for up to 14% better throughput and 30% better storage efficiency.
  - Example: `"index.codec": "zstd"`.
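Because `index.codec` is a static setting, it is normally supplied when the index is created (or while the index is closed). The sketch below builds such a create-index body. The function name `create_index_with_codec` is hypothetical, and the allowed set lists only the codec values this example assumes are available on an OpenSearch 2.9+ cluster.

```python
# Codec values assumed for this sketch: the two standard ones plus the
# zstd variants named in the list.
ALLOWED_CODECS = {"default", "best_compression", "zstd", "zstd_no_dict"}


def create_index_with_codec(codec: str = "zstd") -> dict:
    """Request body for creating an index with the given compression codec."""
    if codec not in ALLOWED_CODECS:
        raise ValueError(f"unsupported codec: {codec!r}")
    return {"settings": {"index": {"codec": codec}}}


print(create_index_with_codec())  # {'settings': {'index': {'codec': 'zstd'}}}
```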
Each adjustment aims to balance performance with operational safety. Monitor impacts closely, especially when modifying settings like translog flush thresholds and replica counts.