@dormando
Created July 16, 2012 04:01
stock memcached with slab rebalancer turbo edition

Pseudo-random eviction

https://github.com/dormando/memcached/tree/slab_boost (the code lives on this branch because it hasn't been merged into mainline yet)

I've made some small modifications to the slab rebalancing system to make it more aggressive. You can now reassign memory between classes as often as you like, rather than having to wait for a class to use up all the new items you just moved to it. The rebalancer and automover are now actually separate threads, instead of separate jobs sharing one thread. The rebalancer waits for a kickoff from a "slabs_reassign" command and immediately chews through memory. That command may be kicked off manually, by the slow automover (now known as automove=1), or by a new pants-on-fire "move slab page on first eviction" mode (automove=2).
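To kick the rebalancer off by hand, a client sends a reassign request over the text protocol. The sketch below assumes the mainline command form `slabs reassign <source class> <dest class>` and a one-line reply that starts with a keyword such as `OK` or `BUSY`; this branch may behave differently, and the helper names are made up for illustration.

```python
import socket


def reassign_command(src_class: int, dst_class: int) -> bytes:
    """Build the text-protocol line that asks for one page to be moved."""
    return f"slabs reassign {src_class} {dst_class}\r\n".encode("ascii")


def reply_keyword(reply: bytes) -> str:
    """Return the first word of the server reply, e.g. 'OK' or 'BUSY'."""
    text = reply.decode("ascii", "replace").strip()
    return text.split(" ", 1)[0] if text else ""


def reassign(host: str, port: int, src_class: int, dst_class: int) -> str:
    """Send one reassign request and return the reply keyword."""
    with socket.create_connection((host, port), timeout=2) as conn:
        conn.sendall(reassign_command(src_class, dst_class))
        return reply_keyword(conn.recv(1024))
```

Against a local server started with `-o slab_reassign`, `reassign("localhost", 11211, 4, 12)` would ask for one page to move from class 4 to class 12.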

memcached v1.4.13-slab_boost with slab_reassign,slab_automove=2

+---------------------------------------------------------+
| size | calls |  slab  |  slab  | slab  | slab  | server |
|      |       |   100  |   1K   |  10K  | 100K  | error  |
+---------------------------------------------------------+
| 100  |  10K  |   64   |   0    |   0   |   0   |   0    |
+---------------------------------------------------------+
|  1K  |  1K   |   1    |   64   |   0   |   0   |   0    |
+---------------------------------------------------------+
| 10K  |  100  |   1    |   1    |   64  |   0   |   0    |
+---------------------------------------------------------+
| 100K |  10   |   1    |   1    |   1   |   64  |  67    |
+---------------------------------------------------------+
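The page counts in this table fit a very simple model, sketched below. It is a toy, not the branch's code, and it rests on three assumptions of mine: the server has a 64-page budget, a class always gets its first page even when the budget is spent, and a donor class never gives up its last page. The 67 server errors in the last row are not modelled.

```python
def simulate(classes, page_limit=64):
    """Toy model of automove=2: each phase writes to one class until it owns
    every page it can get. Returns one page-count snapshot per phase."""
    pages = {name: 0 for name in classes}
    snapshots = []
    for active in classes:
        # Assumption: the first page of a class is granted regardless of the limit.
        if pages[active] == 0:
            pages[active] = 1
        # Take fresh pages while the overall budget allows it.
        while sum(pages.values()) < page_limit:
            pages[active] += 1
        # After that, every eviction pulls a page from a donor class.
        # Assumption: a donor always keeps its last page.
        for donor in classes:
            if donor == active:
                continue
            while pages[donor] > 1:
                pages[donor] -= 1
                pages[active] += 1
        snapshots.append(dict(pages))
    return snapshots


for row in simulate(["100", "1K", "10K", "100K"]):
    print(row)
```

Under those assumptions each phase ends with 64 pages in the class being written to and a single page left in every class used earlier, which is the staircase the table shows.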

(For reference: I haven't run these myself, but I'm inlining the results from twemcache v2.4.0.) From: https://github.com/twitter/twemcache/blob/master/notes/random_eviction.md

+---------------------------------------------------------+
| size | calls |  slab  |  slab  | slab  | slab  | server |
|      |       |   100  |   1K   |  10K  | 100K  | error  |
+---------------------------------------------------------+
| 100  |  10K  |   64   |   0    |   0   |   0   |   0    |
+---------------------------------------------------------+
|  1K  |  1K   |   15   |   49   |   0   |   0   |   0    |
+---------------------------------------------------------+
| 10K  |  100  |   6    |   4    |   54  |   0   |   0    |
+---------------------------------------------------------+
| 100K |  10   |   2    |   1    |   10  |   51  |   0    |
+---------------------------------------------------------+

I would, however, strongly recommend that people not run with slab_automove=2 enabled. I've been on a lot of your machines and seen a lot of your stats output: people generally run memcached with every slab class evicting all the time. In that state, enabling this option will fling memory about randomly and ruin your hit rate.
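One way to see whether a server is in that state is to look at the per-class eviction counters. A minimal sketch, assuming the `STAT items:<class>:evicted <count>` lines that mainline memcached prints for `stats items`; the function name and the sample text are made up for illustration.

```python
def evicting_classes(stats_items_output: str) -> dict:
    """Map slab class id -> eviction count, for classes that have evicted."""
    evicted = {}
    for line in stats_items_output.splitlines():
        parts = line.split()
        if len(parts) != 3 or parts[0] != "STAT":
            continue
        fields = parts[1].split(":")
        # Match exactly "items:<class>:evicted", not evicted_time and friends.
        if len(fields) == 3 and fields[0] == "items" and fields[2] == "evicted":
            count = int(parts[2])
            if count > 0:
                evicted[int(fields[1])] = count
    return evicted


# Hypothetical output, not taken from a real server.
sample = (
    "STAT items:4:evicted 120\r\n"
    "STAT items:4:evicted_time 3\r\n"
    "STAT items:12:evicted 0\r\n"
    "END\r\n"
)
print(evicting_classes(sample))  # {4: 120}
```

If most classes show up in the result, pages would be pulled back and forth between them on every eviction, which is the hit-rate problem described above.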

However, if you are highly strict about not having evictions ever, and you need to respond fast to changes in memory requirements, this branch is the shit.

Details

memcached v1.4.13-slab_boost

$ memcached -o slab_reassign,slab_automove=2

10K requests, 100 byte size, 100 conns, 100MB data

$ ./mcperf --num-conns=100 --conn-rate=1000 --sizes=0.01 --num-calls=10000

Total: connections 100 requests 1000000 responses 1000000 test-duration 50.860 s

Connection rate: 2.0 conn/s (508.6 ms/conn <= 100 concurrent connections)
Connection time [ms]: avg 47912.0 min 40682.7 max 50755.9 stddev 2460.22
Connect time [ms]: avg 2.0 min 0.1 max 16.1 stddev 3.45

Request rate: 19662.0 req/s (0.1 ms/req)
Request size [B]: avg 129.0 min 129.0 max 129.0 stddev 0.00

Response rate: 19662.0 rsp/s (0.1 ms/rsp)
Response size [B]: avg 8.0 min 8.0 max 8.0 stddev 0.00
Response time [ms]: avg 4.8 min 0.0 max 90.1 stddev 0.01
Response time [ms]: p25 1.0 p50 2.0 p75 2.0
Response time [ms]: p95 21.0 p99 31.0 p999 42.0
Response type: stored 1000000 not_stored 0 exists 0 not_found 0
Response type: num 0 deleted 0 end 0 value 0
Response type: error 0 client_error 0 server_error 0

Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 ftab-full 0 addrunavail 0 other 0

CPU time [s]: user 18.95 system 7.39 (user 37.3% system 14.5% total 51.8%)
Net I/O: bytes 130.7 MB rate 2630.6 KB/s (21.5*10^6 bps)

$ printf "stats slabs\r\n" | nc localhost 11211 | grep "total_pages"
STAT 4:total_pages 64
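The run headings can be cross-checked against the mcperf flags with a little arithmetic. The sketch below assumes, judging by the headings and the reported request sizes, that the bare number given to `--sizes` is a rate whose reciprocal is the item size in bytes; the function name is made up.

```python
def workload(num_conns: int, num_calls: int, sizes_rate: float):
    """Return (total requests, item size in bytes, total payload bytes)."""
    item_bytes = round(1 / sizes_rate)  # assumption: size = 1 / rate
    requests = num_conns * num_calls
    return requests, item_bytes, requests * item_bytes


# First run: --num-conns=100 --num-calls=10000 --sizes=0.01
print(workload(100, 10000, 0.01))  # (1000000, 100, 100000000)
```

All four runs come out to 100,000,000 payload bytes, which is the "100MB data" in each heading.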

1K requests, 1000 byte size, 100 conns, 100MB data

$ ./mcperf --num-conns=100 --conn-rate=1000 --sizes=0.001 --num-calls=1000

Total: connections 100 requests 100000 responses 100000 test-duration 5.909 s

Connection rate: 16.9 conn/s (59.1 ms/conn <= 100 concurrent connections)
Connection time [ms]: avg 5456.5 min 4272.3 max 5805.3 stddev 359.05
Connect time [ms]: avg 1.2 min 0.1 max 9.9 stddev 1.84

Request rate: 16924.0 req/s (0.1 ms/req)
Request size [B]: avg 1030.0 min 1030.0 max 1030.0 stddev 0.00

Response rate: 16924.0 rsp/s (0.1 ms/rsp)
Response size [B]: avg 8.0 min 8.0 max 8.0 stddev 0.00
Response time [ms]: avg 5.5 min 0.0 max 50.1 stddev 0.01
Response time [ms]: p25 1.0 p50 2.0 p75 3.0
Response time [ms]: p95 24.0 p99 32.0 p999 40.0
Response type: stored 100000 not_stored 0 exists 0 not_found 0
Response type: num 0 deleted 0 end 0 value 0
Response type: error 0 client_error 0 server_error 0

Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 ftab-full 0 addrunavail 0 other 0

CPU time [s]: user 2.23 system 0.84 (user 37.7% system 14.2% total 52.0%)
Net I/O: bytes 99.0 MB rate 17155.4 KB/s (140.5*10^6 bps)

$ printf "stats slabs\r\n" | nc localhost 11211 | grep "total_pages"
STAT 4:total_pages 1
STAT 12:total_pages 64

100 requests, 10000 byte size, 100 conns, 100MB data

$ ./mcperf --num-conns=100 --conn-rate=1000 --sizes=0.0001 --num-calls=100

Total: connections 100 requests 10000 responses 10000 test-duration 0.374 s

Connection rate: 267.2 conn/s (3.7 ms/conn <= 100 concurrent connections)
Connection time [ms]: avg 286.7 min 169.5 max 329.6 stddev 29.23
Connect time [ms]: avg 1.5 min 0.0 max 8.3 stddev 1.83

Request rate: 26717.5 req/s (0.0 ms/req)
Request size [B]: avg 10031.0 min 10031.0 max 10031.0 stddev 0.00

Response rate: 26717.5 rsp/s (0.0 ms/rsp)
Response size [B]: avg 8.0 min 8.0 max 8.0 stddev 0.00
Response time [ms]: avg 2.9 min 0.0 max 40.2 stddev 0.01
Response time [ms]: p25 1.0 p50 1.0 p75 2.0
Response time [ms]: p95 13.0 p99 28.0 p999 37.0
Response type: stored 10000 not_stored 0 exists 0 not_found 0
Response type: num 0 deleted 0 end 0 value 0
Response type: error 0 client_error 0 server_error 0

Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 ftab-full 0 addrunavail 0 other 0

CPU time [s]: user 0.08 system 0.11 (user 21.4% system 29.4% total 50.8%)
Net I/O: bytes 95.7 MB rate 261930.9 KB/s (2145.7*10^6 bps)

$ printf "stats slabs\r\n" | nc localhost 11211 | grep "total_pages"
STAT 4:total_pages 1
STAT 12:total_pages 1
STAT 22:total_pages 64

10 requests, 100000 byte size, 100 conns, 100MB data

$ ./mcperf --num-conns=100 --conn-rate=1000 --sizes=0.00001 --num-calls=10

Total: connections 100 requests 1000 responses 1000 test-duration 0.440 s

Connection rate: 227.5 conn/s (4.4 ms/conn <= 99 concurrent connections)
Connection time [ms]: avg 214.8 min 83.3 max 344.5 stddev 39.82
Connect time [ms]: avg 0.3 min 0.0 max 1.9 stddev 0.40

Request rate: 2275.1 req/s (0.4 ms/req)
Request size [B]: avg 100032.0 min 100032.0 max 100032.0 stddev 0.00

Response rate: 2275.1 rsp/s (0.4 ms/rsp)
Response size [B]: avg 10.3 min 8.0 max 43.0 stddev 8.76
Response time [ms]: avg 21.4 min 0.1 max 221.2 stddev 0.03
Response time [ms]: p25 3.0 p50 12.0 p75 23.0
Response time [ms]: p95 88.0 p99 121.0 p999 131.0
Response type: stored 933 not_stored 0 exists 0 not_found 0
Response type: num 0 deleted 0 end 0 value 0
Response type: error 0 client_error 0 server_error 67

Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 ftab-full 0 addrunavail 0 other 0

CPU time [s]: user 0.08 system 0.10 (user 18.2% system 22.8% total 41.0%)
Net I/O: bytes 95.4 MB rate 222270.9 KB/s (1820.8*10^6 bps)
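This is the only run with failed stores, so it is worth pulling the counts out of the report. A minimal sketch that parses the "Response type:" lines as name/count pairs; the function name is made up, and the sample lines are copied from the output above.

```python
def response_counts(mcperf_output: str) -> dict:
    """Collect the name/count pairs from mcperf 'Response type:' lines."""
    prefix = "Response type:"
    counts = {}
    for line in mcperf_output.splitlines():
        if not line.startswith(prefix):
            continue
        tokens = line[len(prefix):].split()
        # Tokens alternate between a response name and its count.
        for name, value in zip(tokens[0::2], tokens[1::2]):
            counts[name] = int(value)
    return counts


report = """\
Response type: stored 933 not_stored 0 exists 0 not_found 0
Response type: num 0 deleted 0 end 0 value 0
Response type: error 0 client_error 0 server_error 67
"""
counts = response_counts(report)
print(counts["server_error"] / sum(counts.values()))  # 0.067
```

So 67 of the 1000 sets, or 6.7%, came back as server errors in this run.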

$ printf "stats slabs\r\n" | nc localhost 11211 | grep "total_pages"
STAT 4:total_pages 1
STAT 12:total_pages 1
STAT 22:total_pages 1
STAT 32:total_pages 64
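For scripting these checks instead of eyeballing grep output, the `total_pages` lines can be turned into a mapping. A minimal sketch; the function name is made up, and the sample is the final output above.

```python
def total_pages(stats_slabs_output: str) -> dict:
    """Map slab class id -> total_pages from 'stats slabs' output."""
    pages = {}
    for line in stats_slabs_output.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "STAT" and parts[1].endswith(":total_pages"):
            class_id = parts[1].split(":", 1)[0]
            pages[int(class_id)] = int(parts[2])
    return pages


final = (
    "STAT 4:total_pages 1\n"
    "STAT 12:total_pages 1\n"
    "STAT 22:total_pages 1\n"
    "STAT 32:total_pages 64\n"
)
print(total_pages(final))  # {4: 1, 12: 1, 22: 1, 32: 64}
```

Classes 4, 12, 22 and 32 appear to be the ones holding the 100 byte, 1K, 10K and 100K items, matching the columns of the summary table.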