DiskCache Cache Benchmarks

Accurately measuring performance is a difficult task. The benchmarks on this page are synthetic in the sense that they were designed to stress getting, setting, and deleting items repeatedly. Measurements in production systems are much harder to reproduce reliably, so take the following data with a grain of salt. Performance is a stated feature of DiskCache, so we would be remiss not to include comparisons.

The source for all benchmarks can be found under the “tests” directory in the source code repository. Measurements are reported by percentile: median, 90th percentile, 99th percentile, and maximum, along with total time and miss rate. The average is not reported as it's less useful in response-time scenarios. Each process in the benchmark executes 100,000 operations with ten times as many sets as deletes and ten times as many gets as sets.
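
The exact scripts live under the “tests” directory; a minimal sketch of such an operation mix (the `make_workload` helper, key space, and seed here are illustrative, not the benchmark's actual code) might look like:

```python
import random

def make_workload(total=100_000, seed=0):
    """Generate a synthetic mix with roughly ten times as many sets as
    deletes and ten times as many gets as sets (a 100:10:1 ratio)."""
    rng = random.Random(seed)
    ops = []
    for _ in range(total):
        roll = rng.random()
        key = rng.randrange(1000)
        if roll < 100 / 111:        # ~90% gets
            ops.append(('get', key))
        elif roll < 110 / 111:      # ~9% sets
            ops.append(('set', key))
        else:                       # ~1% deletes
            ops.append(('delete', key))
    return ops

ops = make_workload()
counts = {kind: 0 for kind in ('get', 'set', 'delete')}
for kind, _ in ops:
    counts[kind] += 1
```

Replaying the same operation list against each backend keeps the comparison fair.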

Each comparison includes Memcached and Redis with default client and server settings. Note that these backends work differently: they communicate over the localhost network. They also require a server process to be run and maintained. All keys and values are short byte strings to reduce the network impact.

Single Access

The single access workload starts one worker process, which performs all operations. No concurrent cache access occurs.

Get

[Image: core-p1-get.png]

The graph above displays cache access latency at three percentiles. Notice that DiskCache is faster than highly optimized memory-backed server solutions.

Set

[Image: core-p1-set.png]

The graph above displays cache store latency at three percentiles. The cost of writing to disk is higher but still sub-millisecond. All data in DiskCache is persistent.

Delete

[Image: core-p1-delete.png]

The graph above displays cache delete latency at three percentiles. As above, deletes require disk writes, but latency is still sub-millisecond.

Timing Data

Not all data is easily displayed in the graphs above. The miss rate, maximum latency, and total latency are recorded below.

Timings for diskcache.Cache
Action   Count   Miss     Median        P90        P99        Max      Total
get      88966   9705   12.159us   17.166us   28.849us  174.999us     1.206s
set       9021      0   68.903us   93.937us  188.112us   10.297ms  875.907ms
delete    1012    104   47.207us   66.042us  128.031us    7.160ms   89.599ms
Total    98999                                                        2.171s

The generated workload includes a ~1% cache miss rate. All items were stored with no expiry. The miss rate is due entirely to gets after deletes.
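
The pattern is easy to see with a plain dict standing in for the cache: since nothing expires, a get can only miss when it follows a delete of the same key.

```python
cache = {}                  # a plain dict stands in for the cache

cache['alpha'] = 1          # set
hit = cache.get('alpha')    # get -> 1 (hit)
cache.pop('alpha', None)    # delete
miss = cache.get('alpha')   # get -> None (miss: a get after a delete)
```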

Timings for diskcache.FanoutCache(shards=4, timeout=1.0)
Action   Count   Miss     Median        P90        P99        Max      Total
get      88966   9705   15.020us   20.027us   33.855us  437.021us     1.425s
set       9021      0   71.049us  100.136us  203.133us    9.186ms  892.262ms
delete    1012    104   48.161us   69.141us  129.952us    5.216ms   87.294ms
Total    98999                                                        2.405s

The high maximum store latency is likely an artifact of disk/OS interactions.

Timings for diskcache.FanoutCache(shards=8, timeout=0.010)
Action   Count   Miss     Median        P90        P99        Max      Total
get      88966   9705   15.020us   20.027us   34.094us  627.995us     1.420s
set       9021      0   72.956us  100.851us  203.133us    9.623ms  927.824ms
delete    1012    104   50.783us   72.002us  132.084us    8.396ms   78.898ms
Total    98999                                                        2.426s

Notice the low overhead of the FanoutCache. Increasing the number of shards from four to eight has a negligible impact on performance.
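
FanoutCache routes each key to one of its shards by hashing, and each shard is an independent cache, which is why extra shards add so little single-process overhead. A rough sketch of the routing idea (`crc32` is an illustrative stand-in here; the library's actual hash function differs):

```python
from zlib import crc32

def shard_for(key, shards=8):
    # Deterministically route a key to one of `shards` sub-caches.
    # The same key always lands on the same shard.
    return crc32(key.encode('utf-8')) % shards

routes = {key: shard_for(key) for key in ('alpha', 'beta', 'gamma', 'delta')}
```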

Timings for pylibmc.Client
Action   Count   Miss     Median        P90        P99        Max      Total
get      88966   9705   25.988us   29.802us   41.008us  139.952us     2.388s
set       9021      0   27.895us   30.994us   40.054us   97.990us  254.248ms
delete    1012    104   25.988us   29.087us   38.147us   89.169us   27.159ms
Total    98999                                                        2.669s

Memcached performance is low latency and very stable.

Timings for redis.StrictRedis
Action   Count   Miss     Median        P90        P99        Max      Total
get      88966   9705   44.107us   54.121us   73.910us  204.086us     4.125s
set       9021      0   45.061us   56.028us   75.102us  237.942us  427.197ms
delete    1012    104   44.107us   54.836us   72.002us  126.839us   46.771ms
Total    98999                                                        4.599s

Redis performance is roughly half that of Memcached. DiskCache performs better than Redis for get operations at every reported percentile, up to and including the maximum.

Concurrent Access

The concurrent access workload starts eight worker processes, each performing different, interleaved operations. None of these benchmarks saturated all the processors.
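
The benchmark workers are separate processes; as a self-contained illustration of how their operation streams interleave, the sketch below substitutes threads for processes and a plain locked dict for the cache (the 100:10:1 operation mix matches the workload described earlier; everything else is illustrative):

```python
import random
import threading

cache = {}               # stand-in for the shared cache
lock = threading.Lock()  # one lock, like a single unsharded cache

def worker(seed, ops=1000):
    # Each worker runs its own randomized mix of operations, so the
    # gets, sets, and deletes of all eight workers interleave.
    rng = random.Random(seed)
    for _ in range(ops):
        key = rng.randrange(100)
        roll = rng.random()
        with lock:
            if roll < 0.90:
                cache.get(key)         # get (may miss)
            elif roll < 0.99:
                cache[key] = b'value'  # set
            else:
                cache.pop(key, None)   # delete

workers = [threading.Thread(target=worker, args=(seed,)) for seed in range(8)]
for thread in workers:
    thread.start()
for thread in workers:
    thread.join()
```

With every write funneled through one lock, writers queue behind each other, which is exactly the contention that sharding addresses below.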

Get

[Image: core-p8-get.png]

Under heavy load, DiskCache gets are very low latency. At the 90th percentile, they are less than half the latency of Memcached.

Set

[Image: core-p8-set.png]

Stores are much slower under load and benefit greatly from sharding. Latencies in excess of five milliseconds are not displayed. With one shard allocated per worker, latency is within an order of magnitude of memory-backed server solutions.

Delete

[Image: core-p8-delete.png]

Again, deletes require disk writes. Only FanoutCache, with one shard allocated per worker, performs well.

Timing Data

Not all data is easily displayed in the graphs above. The miss rate, maximum latency, and total latency are recorded below.

Timings for diskcache.Cache
Action   Count   Miss     Median        P90        P99        Max      Total
get     712546  71214   15.974us   23.127us   40.054us    4.953ms    12.349s
set      71530      0   94.891us    1.328ms   21.307ms     1.846s   131.728s
delete    7916    807   65.088us    1.278ms   19.610ms     1.244s    13.811s
Total   791992                                                      157.888s

Notice the unacceptably high maximum store and delete latency. Without sharding, cache writers block each other. By default, Cache objects raise a timeout error after sixty seconds.
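
The timeout behavior can be sketched with a standard-library analogue: a `threading.Lock` stands in for the cache's write lock, and the `Timeout` class below is a stand-in for the library's timeout error, not its actual API.

```python
import threading

write_lock = threading.Lock()  # stand-in for the cache's writer lock

class Timeout(Exception):
    """Stand-in for the cache's timeout error."""

def guarded_write(timeout):
    # A writer waits up to `timeout` seconds for the lock; if another
    # writer still holds it, the write is abandoned with an error.
    if not write_lock.acquire(timeout=timeout):
        raise Timeout(f'gave up after {timeout}s')
    try:
        pass  # ... perform the disk write here ...
    finally:
        write_lock.release()

timed_out = False
write_lock.acquire()           # simulate a slow writer holding the lock
try:
    guarded_write(timeout=0.01)
except Timeout:
    timed_out = True
finally:
    write_lock.release()
```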

Timings for diskcache.FanoutCache(shards=4, timeout=1.0)
Action   Count   Miss     Median        P90        P99        Max      Total
get     712546  71623   19.073us   35.048us   59.843us   12.980ms    16.849s
set      71530      0  108.004us    1.313ms    9.176ms  333.361ms    50.821s
delete    7916    767   73.195us    1.264ms    9.033ms  108.232ms     4.964s
Total   791992                                                       72.634s

Here FanoutCache uses four shards to distribute writes. That reduces the maximum latency by a factor of ten. Note the miss rate is variable due to the interleaved operations of concurrent workers.

Timings for diskcache.FanoutCache(shards=8, timeout=0.010)
Action   Count   Miss     Median        P90        P99        Max      Total
get     712546  71106   25.034us   47.922us  101.089us    9.015ms    22.336s
set      71530     39  134.945us    1.324ms    5.763ms   16.027ms    33.347s
delete    7916    775   88.930us    1.267ms    5.017ms   13.732ms     3.308s
Total   791992                                                       58.991s

With one shard allocated per worker and a low timeout, the maximum latency is more reasonable and corresponds to the specified 10 millisecond timeout. Some set and delete operations were therefore canceled and recorded as cache misses. The miss rate due to timeout is about 0.01% so our success rate is four-nines or 99.99%.
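
A back-of-the-envelope check of that rate, counting only the 39 timed-out sets visible in the table (a handful of delete timeouts would nudge the figure slightly):

```python
# Timed-out operations recorded as misses: the set row shows 39.
timeouts = 39
total_ops = 791_992          # total operations across all workers
rate = timeouts / total_ops  # a small fraction of one percent
```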

Timings for pylibmc.Client
Action   Count   Miss     Median        P90        P99        Max      Total
get     712546  72043   83.923us  107.050us  123.978us  617.027us    61.824s
set      71530      0   84.877us  108.004us  124.931us  312.090us     6.283s
delete    7916    796   82.970us  105.858us  123.024us  288.963us  680.970ms
Total   791992                                                       68.788s

Memcached performance is low latency and stable even under heavy load. Notice that, in total, cache gets are about three times slower than with FanoutCache. The superior performance of get operations puts the overall performance of DiskCache ahead of Memcached.

Timings for redis.StrictRedis
Action   Count   Miss     Median        P90        P99        Max      Total
get     712546  72093  138.044us  169.039us  212.908us  151.121ms   101.197s
set      71530      0  138.998us  169.992us  216.007us    1.200ms    10.173s
delete    7916    752  136.137us  167.847us  211.954us    1.059ms     1.106s
Total   791992                                                      112.476s

Redis performance is roughly half that of Memcached. Beware the impact of persistence settings on your Redis performance. Depending on your use of logging and snapshotting, maximum latency may increase significantly.