@lextoumbourou
Last active May 10, 2019 10:20
HBase Heap Size Calculator
def get_regionserver_heap_size(
    storage_capacity_in_gb,
    region_max_filesize=10737418240,
    memstore_flush_size=134217728,
    replication_factor=3,
    memstore_heap_fraction=0.4
):
    """
    Calculate the RegionServer heap size (in GB) required for a given storage capacity.

    Source: http://hadoop-hbase.blogspot.com.au/2013/01/hbase-region-server-memory-sizing.html

    Args:
        storage_capacity_in_gb (int): Total raw disk capacity per box, in GB.
        region_max_filesize (int): Region size in bytes (hbase.hregion.max.filesize)
        memstore_flush_size (int): Memstore flush size in bytes (hbase.hregion.memstore.flush.size)
        replication_factor (int): HDFS replication factor (dfs.replication)
        memstore_heap_fraction (float): Memstore heap fraction (hbase.regionserver.global.memstore.size)

    Returns:
        float: Required heap size in GB.
    """
    one_gb_in_bytes = 1073741824.0
    storage_capacity_in_bytes = storage_capacity_in_gb * one_gb_in_bytes
    # HDFS replication divides the raw capacity; each region holds up to region_max_filesize bytes.
    number_of_regions = (1.0 * storage_capacity_in_bytes / replication_factor) / region_max_filesize
    # Budget one memstore_flush_size per region, with memstores limited to memstore_heap_fraction of the heap.
    return number_of_regions * memstore_flush_size / memstore_heap_fraction / one_gb_in_bytes
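
The same relationship can be read in the other direction: given a fixed heap, how much raw disk can one RegionServer usefully address? The sketch below simply rearranges the formula used by get_regionserver_heap_size. The function name get_max_storage_capacity_in_gb is hypothetical and not part of the original gist; the defaults are copied from the function above.

```python
def get_max_storage_capacity_in_gb(
    heap_size_in_gb,
    region_max_filesize=10737418240,
    memstore_flush_size=134217728,
    replication_factor=3,
    memstore_heap_fraction=0.4
):
    """
    Hypothetical inverse of get_regionserver_heap_size (an illustration,
    not part of the original gist): raw disk capacity in GB that a
    RegionServer with the given heap can address.
    """
    # Bytes of raw disk supported per byte of heap.
    # float() keeps the division exact on both Python 2 and Python 3.
    disk_to_heap_ratio = (
        float(region_max_filesize) / memstore_flush_size
        * replication_factor
        * memstore_heap_fraction
    )
    return heap_size_in_gb * disk_to_heap_ratio


if __name__ == "__main__":
    # With the default settings, a 16 GB heap maps to roughly 1536 GB of raw disk.
    print(get_max_storage_capacity_in_gb(16))
```

With the defaults the ratio works out to about 96 bytes of raw disk per byte of heap, so a box with far more disk than that ratio allows would need a larger region size, a smaller flush size, or a bigger heap.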