rupeshtiwari/indexing performance in opensearch.md

## indexing performance in opensearch.md

      
Optimize your Elasticsearch/OpenSearch indexing performance with these key adjustments:


Java Heap Size:

Default: Varies.
Recommended: 50% of RAM, leaving the rest for the operating system's file cache.
Example: For 32GB RAM, set the heap size to 16GB.
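The sizing rule above can be sketched as a small helper. This is only an illustration: the function names are hypothetical, and the 32GB ceiling is an assumption based on the commonly cited guidance to keep the heap below roughly that size, not something stated in this note.

```python
def recommended_heap_gb(ram_gb: float, cap_gb: float = 32.0) -> float:
    """Return half of physical RAM, limited to cap_gb (an assumed ceiling)."""
    if ram_gb <= 0:
        raise ValueError("ram_gb must be positive")
    return min(ram_gb / 2, cap_gb)


def jvm_heap_flags(heap_gb: float) -> list[str]:
    """Format matching -Xms/-Xmx flags so the heap is fixed at one size."""
    size_mb = int(heap_gb * 1024)
    return [f"-Xms{size_mb}m", f"-Xmx{size_mb}m"]


# 32GB of RAM gives a 16GB heap, as in the example above.
print(jvm_heap_flags(recommended_heap_gb(32)))
```

Setting the minimum and maximum to the same value avoids heap resizing while the node is running.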


Flush Translog Threshold:

Default: 512MB.
Recommended: Increase to 25% of the Java heap.
Example: For a 16GB heap, set "index.translog.flush_threshold_size": "4gb".
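As a rough sketch, the 25% rule can be turned into a settings body for the index settings API. The helper name is hypothetical, and the code only builds the JSON; sending it (for example with a PUT to the index's _settings endpoint) is left to whatever client is in use.

```python
import json


def translog_settings(heap_gb: float, fraction: float = 0.25) -> dict:
    """Build a settings body that sets the translog flush threshold
    to a fraction of the Java heap, expressed in megabytes."""
    if heap_gb <= 0 or not 0 < fraction <= 1:
        raise ValueError("heap_gb must be positive and fraction in (0, 1]")
    size_mb = int(heap_gb * 1024 * fraction)
    return {"index": {"translog": {"flush_threshold_size": f"{size_mb}mb"}}}


# A 16GB heap yields a 4096mb (4GB) threshold.
print(json.dumps(translog_settings(16)))
```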


Index Refresh Interval:

Default: 1s.
Recommended: Increase during heavy indexing. Set to 30s or more, or disable refresh entirely with -1.
Example: "index.refresh_interval": "30s".
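A minimal sketch of the two settings bodies involved, one for the bulk load and one to restore the default afterward. The helper name is hypothetical; passing None is used here to mean "disable refresh".

```python
from typing import Optional


def refresh_interval_body(seconds: Optional[int]) -> dict:
    """Build a settings body for index.refresh_interval.

    None disables periodic refresh ("-1"); otherwise the value is
    formatted in seconds, for example "30s".
    """
    if seconds is not None and seconds <= 0:
        raise ValueError("seconds must be positive, or None to disable")
    value = "-1" if seconds is None else f"{seconds}s"
    return {"index": {"refresh_interval": value}}


during_load = refresh_interval_body(30)   # relaxed interval for heavy indexing
after_load = refresh_interval_body(1)     # back to the 1s default
print(during_load, after_load)
```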


Index Buffer Size:

Default: 10% of JVM heap.
Recommended: Increase to up to 25% for heavy indexing.
Example: For a 16GB heap, up to 4GB ("indices.memory.index_buffer_size": "25%").


Concurrent Merges (max_merge_count):

Default: Varies.
Recommended: Increase if experiencing index throttling.
Example: "index.merge.scheduler.max_merge_count": 6.


Shard Distribution:

Formula: Number of shards = k * (Number of data nodes), where k is the number of shards per node.
Example: For 8 nodes with k=3, create the index with 24 shards.
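The point of the formula is that a shard count that is a multiple of the node count spreads evenly. The sketch below, with hypothetical helper names, computes the shard count and shows how a given total would divide across nodes in the best case.

```python
def shard_count(data_nodes: int, k: int) -> int:
    """Number of shards = k * number of data nodes."""
    if data_nodes < 1 or k < 1:
        raise ValueError("data_nodes and k must be at least 1")
    return k * data_nodes


def shards_per_node(total_shards: int, data_nodes: int) -> list[int]:
    """Best-case spread of shards across nodes, to spot uneven layouts."""
    base, extra = divmod(total_shards, data_nodes)
    return [base + 1 if i < extra else base for i in range(data_nodes)]


total = shard_count(data_nodes=8, k=3)
print(total, shards_per_node(total, 8))   # even: every node holds 3 shards
print(shards_per_node(20, 8))             # uneven: some nodes hold 3, others 2
```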


Setting Replica Count to Zero:

Concern: With no replicas, a node failure during indexing can cause data loss.
Example: Set "index.number_of_replicas": 0 during heavy indexing, then restore the original value once indexing is done.
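Because forgetting to restore replicas is the main risk here, one option is to wrap the bulk load in a context manager that always reverts the setting. This is a sketch under assumptions: `put_settings` is a hypothetical callable standing in for whatever function sends a settings body to the cluster, and the demo below just records the calls in a list.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def replicas_disabled(
    put_settings: Callable[[dict], None], restore_to: int = 1
) -> Iterator[None]:
    """Set number_of_replicas to 0 for the duration of the block,
    then restore it even if the block raises."""
    put_settings({"index": {"number_of_replicas": 0}})
    try:
        yield
    finally:
        put_settings({"index": {"number_of_replicas": restore_to}})


# Demo: record the settings bodies instead of calling a real cluster.
calls: list = []
with replicas_disabled(calls.append, restore_to=1):
    calls.append("bulk indexing happens here")
print(calls)
```

The `finally` clause is the design choice that matters: the replica count is restored whether the load succeeds or fails.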


Optimal Bulk Request Size:

Start: 5 MiB to 15 MiB.
Then: Increase the request size gradually until indexing throughput stops improving.
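To experiment with request sizes, documents can be grouped into newline-delimited `_bulk` bodies capped at a byte budget. The following is a minimal sketch; the function name and the `logs` index name are hypothetical, and only index actions are generated.

```python
import json
from typing import Iterable, Iterator


def bulk_bodies(
    docs: Iterable[dict], index: str, max_bytes: int = 5 * 1024 * 1024
) -> Iterator[str]:
    """Yield NDJSON _bulk bodies of at most max_bytes (UTF-8) each.

    A single document larger than max_bytes is still emitted, alone,
    in its own body.
    """
    action = json.dumps({"index": {"_index": index}})
    lines: list[str] = []
    size = 0
    for doc in docs:
        pair = f"{action}\n{json.dumps(doc)}\n"
        pair_size = len(pair.encode("utf-8"))
        # Flush the current body before it would exceed the budget.
        if lines and size + pair_size > max_bytes:
            yield "".join(lines)
            lines, size = [], 0
        lines.append(pair)
        size += pair_size
    if lines:
        yield "".join(lines)


# Tiny budget just to show the splitting; start real runs at 5 MiB.
sample = [{"n": i} for i in range(100)]
bodies = list(bulk_bodies(sample, "logs", max_bytes=200))
print(len(bodies), "requests")
```

Raising `max_bytes` step by step while measuring throughput follows the tuning approach described above.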


Instance Type with SSD:

Use SSD-backed instances (e.g., AWS I3) for superior ingestion performance.


Reduce Response Size:

Use filter_path to limit response data.
Example: ?filter_path=-took,-items.*._index.
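As an illustration, the query string can be assembled from a list of exclusion patterns. The helper name is hypothetical; commas and asterisks are left unescaped only so the result matches the example above.

```python
from urllib.parse import urlencode


def bulk_path(filter_path: list[str]) -> str:
    """Build a _bulk request path with a filter_path query parameter.

    Patterns prefixed with "-" exclude fields from the response.
    """
    query = urlencode({"filter_path": ",".join(filter_path)}, safe=",*")
    return f"/_bulk?{query}"


print(bulk_path(["-took", "-items.*._index"]))
```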


Compression Codecs (OpenSearch 2.9+):

Default: LZ4.
Recommended: zstd or zstd_no_dict for up to 14% better throughput and up to 30% better storage efficiency.
Example: "index.codec": "zstd".


Each adjustment aims to balance performance with operational safety. Monitor impacts closely, especially when modifying settings like translog flush thresholds and replica counts.

References:

Tuning your cluster for indexing speed
How can I improve the indexing performance on my Amazon OpenSearch Service cluster?