According to Sean, the memory allocator currently in use is libtcmalloc-minimal4 2.4-0ubuntu5.16.04.1.
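A quick way to confirm the installed tcmalloc package and the running Ceph release on a host (assuming Ubuntu 16.04, as the package version string suggests):
# Confirm the installed tcmalloc package version
dpkg -l libtcmalloc-minimal4
# Confirm the Ceph release on the host
ceph --version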
jewel: 10.2.7
Test | Type | Write IOPS (4k) | Read IOPS (4k) |
---|---|---|---|
4k-rbd_cache_true | rand rw | 10777 | 13604 |
4k-rbd_cache_false | rand rw | 10539 | 28141 |
4k-rbd_writethrough_cache | rand rw | 660 | 19887 |
jewel: 10.2.10 (client/server)
Test | Type | Write IOPS (4k) | Read IOPS (4k) |
---|---|---|---|
4k-jewel-latest-rbd_cache_true | rand rw | 9009 | 20537 |
4k-jewel_latest_rbd_cache_false | rand rw | 10592 | 29959 |
4k-jewel_latest_rbd_cache_writethrough_true | rand rw | 5412 | 11345 |
Allocator | Version | Cache | Type | Write IOPS (4k) | Read IOPS (4k) | Spreadsheet |
---|---|---|---|---|---|---|
tcmalloc | 2.4 | 128MB cache | rand rw | 10355 | 87073 | Spreadsheet row 18 |
tcmalloc | 2.4 | 256MB cache | rand rw | 9926 | 30893 | Spreadsheet row 10 |
jemalloc | 3.6 | – | rand rw | 14034 | 30165 | Spreadsheet row 2 |
Note: These tests were all run using rbd_cache=false with attached volumes
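For reference, rbd_cache is normally set in the [client] section of ceph.conf on the client/compute hosts, and the tcmalloc cache sizes in the table correspond to the thread cache environment variable shipped in /etc/default/ceph; a minimal sketch with example values (paths and values are illustrative, not captured from the test environment):
# /etc/ceph/ceph.conf on the client hosts
[client]
rbd cache = false
# /etc/default/ceph on the OSD hosts: 128MB tcmalloc thread cache (134217728 bytes)
TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728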
- jemalloc is not part of the standard Ceph install (the default packages use tcmalloc).
- According to testing done by Red Hat, jemalloc uses significantly more memory than tcmalloc: on the order of 200MB to 300MB more per OSD process under normal use, and about 400MB more during recovery. In our lab tests, however, memory usage only increased by approximately 100MB.
We are looking for attached volume IOPS to increase, while watching for increased memory usage under both everyday workloads and during a recovery.
Metrics to gather prior to the jemalloc change (Is rpc-maas installed on the staging environment?)
- Attached volume IOPS (we should already have IOPS and memory metrics); see the baseline collection sketch after this list
- Perform a recovery test to gather timing and memory usage
- Define a valid recovery test with RPC support
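A minimal sketch for capturing the memory baseline by hand, assuming standard ceph-osd processes on the OSD hosts (where rpc-maas is present, the same metrics should come from there):
# Record resident memory (RSS, in KB) for every ceph-osd process on the host
ps -C ceph-osd -o pid,rss,etime,cmd
# The attached volume IOPS baseline comes from the fio test documented below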
Install/Setup jemalloc
# Install new memory allocator
sudo apt-get install libjemalloc1 libjemalloc-dev
# Uncomment "#LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1" in /etc/default/ceph
# Restart ceph services on the host
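After uncommenting, the active line in /etc/default/ceph should read (using the Ubuntu library path shown above):
LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1
# Restart the OSD daemons so they pick up the preload (assuming systemd-managed OSDs on Ubuntu 16.04)
sudo systemctl restart ceph-osd.target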
- Verify ceph is now using jemalloc
lsof -E | grep malloc
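An alternative check, assuming a running ceph-osd process, is to confirm libjemalloc is mapped into its address space:
# Pick one ceph-osd PID and look for libjemalloc in its memory maps
pgrep -f ceph-osd | head -1 | xargs -I{} grep jemalloc /proc/{}/maps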
Volume performance increase test
- Create several VMs and attach a volume to each using rbd_cache = none
- Run fio to verify the performance gain against the documented baseline tests above
- Review the OSD memory profile.
Recovery (track time of operation and memory footprint)
- Run the same recovery operation done in Step 2.
- Review timing and memory usage.
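The exact recovery operation still needs to be defined with RPC support (noted above); purely as an illustration, a commonly used exercise is to mark one OSD out, let the cluster rebalance, and time it. The OSD id below is a placeholder:
# Mark an OSD out and watch recovery progress; note the wall-clock time to HEALTH_OK
ceph osd out 12
ceph -w
# Capture OSD memory during recovery
ps -C ceph-osd -o pid,rss,cmd
# Bring the OSD back in and time the backfill as well
ceph osd in 12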
Execute fio test
fio 4k-randrw.fio | tee <situation>.txt
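One way this might be run from inside a test VM against the attached volume (the device path and output file name below are placeholders; the job file's <device> placeholder is filled in first):
# Point the job file at the attached volume, then run and capture the output
sed -i 's|<device>|/dev/vdb|' 4k-randrw.fio
fio 4k-randrw.fio | tee jemalloc-rbd_cache_none-4k.txt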
fio test file
$ cat 4k-randrw.fio
[global]
bs=4k
iodepth=128
direct=1
ioengine=libaio
randrepeat=0
group_reporting
time_based
runtime=60
filesize=10G
[4k-randwrite]
rw=randwrite
stonewall
filename=<device>
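The job file name and the write/read IOPS reported in the tables suggest a matching read job is run as well, but only the randwrite section is shown above. A sketch of what the companion section might look like, reusing the same global parameters (this is an assumption, not the original file):
[4k-randread]
rw=randread
stonewall
filename=<device>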
Andy has submitted a pull request to ceph-ansible to add a flag to install and configure jemalloc.
Ceph default packages use tcmalloc.
Red Hat presentation https://www.youtube.com/watch?v=oxixZPSTzDQ&feature=youtu.be
"For flash optimized configurations, we found jemalloc providing best possible performance without performance degradation over time."
http://tracker.ceph.com/projects/ceph/wiki/Tuning_for_All_Flash_Deployments