Java Caching, Turbo Charged
JavaDevRoom, FOSDEM 2015
Jens Wilke, headissue GmbH
twitter.com/cruftex
github.com/cruftex
http://cache2k.org
cache2k Overview
● Started in 2000 as an in-house product and evolving since
● Focus on in-memory (on-heap) caching (persistence and off-heap
storage are on the way)
● Research on optimized performance / modern eviction policies
● Open sourced 2013
● Contains features not found in (all) cache products, e.g.:
– On time expiry
– Extensive statistics
– Support for exceptions and nulls
– Blocking fetch for multiple requests on the same key
(read through configuration)
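The "blocking fetch" feature listed above can be illustrated with a minimal sketch. This is not how cache2k implements it; the class `ReadThroughCache` and its `loadCount()` helper are hypothetical names chosen for illustration. The sketch relies on `ConcurrentHashMap.computeIfAbsent`, which invokes the loader at most once per absent key and lets concurrent callers for the same key wait for that single load.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical illustration class, not part of cache2k.
class ReadThroughCache<K, V> {
  private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();
  private final Function<K, V> source;
  private final AtomicInteger loadCount = new AtomicInteger();

  ReadThroughCache(Function<K, V> source) {
    this.source = source;
  }

  // Concurrent callers asking for the same absent key wait until the
  // single in-flight load completes, then all see the same value.
  V get(K key) {
    return map.computeIfAbsent(key, k -> {
      loadCount.incrementAndGet();
      return source.apply(k);
    });
  }

  // Number of times the cache source was actually invoked.
  int loadCount() {
    return loadCount.get();
  }
}
```

The sketch has no eviction or expiry, and unlike the feature list above it does not cache nulls or exceptions: `computeIfAbsent` stores nothing when the loader returns null.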
Eviction Algorithms
flickr:alexander
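LRU
[Figure: list of entries 1 to 7; after an access, entry 4 is moved to the front of the list. The entry at the opposite end is the LRU entry.]
cache access => move to front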
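A minimal sketch of the "move to front on access" idea, assuming the JDK's `LinkedHashMap` in access-order mode as the underlying list. The class name `LruCache` is hypothetical, and the sketch is not thread safe. Note that `LinkedHashMap` keeps the most recently used entry at the end of its iteration order, so "front" here is only conceptual.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration class; not thread safe.
class LruCache<K, V> extends LinkedHashMap<K, V> {
  private final int capacity;

  LruCache(int capacity) {
    // Third argument true = access order: every get() or put() moves
    // the entry to the most-recently-used end of the internal list.
    super(16, 0.75f, true);
    this.capacity = capacity;
  }

  // Called by LinkedHashMap after each insert; returning true evicts
  // the least recently used entry.
  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    return size() > capacity;
  }
}
```

Every access modifies the list, which is why a plain LRU needs exclusive access on each cache hit.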
CLOCK
[Figure: entries arranged in a circle, each marked 1=hit or 0=no hit; a hand points into the circle.]
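The CLOCK scheme can be sketched as follows. This is a simplified, single-threaded illustration and not the cache2k code: the class `ClockCache` is a hypothetical name, and a fixed array stands in for the cyclic list. An access only sets the hit bit; on eviction the hand advances, clearing hit bits until it finds an entry whose bit is 0.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; single threaded, array instead of a cyclic list.
class ClockCache<K, V> {
  private final K[] keys;
  private final V[] values;
  private final boolean[] hit;
  private final Map<K, Integer> index = new HashMap<>();
  private int hand = 0;

  @SuppressWarnings("unchecked")
  ClockCache(int capacity) {
    keys = (K[]) new Object[capacity];
    values = (V[]) new Object[capacity];
    hit = new boolean[capacity];
  }

  V get(K key) {
    Integer slot = index.get(key);
    if (slot == null) {
      return null;
    }
    hit[slot] = true; // the only work done on a cache hit
    return values[slot];
  }

  void put(K key, V value) {
    Integer slot = index.get(key);
    if (slot != null) {
      values[slot] = value;
      return;
    }
    // Advance the hand, giving entries with a set hit bit a second chance.
    while (keys[hand] != null && hit[hand]) {
      hit[hand] = false;
      hand = (hand + 1) % keys.length;
    }
    if (keys[hand] != null) {
      index.remove(keys[hand]); // evict the entry under the hand
    }
    keys[hand] = key;
    values[hand] = value;
    hit[hand] = false;
    index.put(key, hand);
    hand = (hand + 1) % keys.length; // new entry ends up just behind the hand
  }

  boolean contains(K key) {
    return index.containsKey(key);
  }
}
```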
Improving on LRU...
protect the working set
● For completeness: Least frequently used
– LFU
– LRFU
– …
● Split set of entries into cold and hot, to protect the working set
– 2Q
– LIRS
– ARC – Adaptive Replacement Cache
● Nimrod Megiddo and Dharmendra S. Modha (Usenix 2003) – patented by
IBM
– Clock-Pro
● Song Jiang, Feng Chen and Xiaodong Zhang (Usenix 2005)
cold set hot set
Improving on LRU...
history of seen entries
● Keep an LRU list of the evicted keys
● If seen again, insert directly into hot set
cold set hot set
ghost set (only keys)
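The cold / hot / ghost idea above can be sketched as follows. This is a deliberately simplified illustration, closest in spirit to 2Q, and not the algorithm used by ARC, Clock-Pro or cache2k. The class `GhostCache`, its three size parameters, and the choice to use plain LRU lists for all three sets are assumptions made for the example; values are assumed to be non-null.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;

// Hypothetical, simplified illustration; single threaded.
class GhostCache<K, V> {
  private final int coldSize;
  private final int hotSize;
  private final int ghostSize;
  private final LinkedHashMap<K, V> cold = new LinkedHashMap<>(16, 0.75f, true);
  private final LinkedHashMap<K, V> hot = new LinkedHashMap<>(16, 0.75f, true);
  private final LinkedHashSet<K> ghost = new LinkedHashSet<>(); // evicted keys only

  GhostCache(int coldSize, int hotSize, int ghostSize) {
    this.coldSize = coldSize;
    this.hotSize = hotSize;
    this.ghostSize = ghostSize;
  }

  V get(K key) {
    V v = hot.get(key);
    return v != null ? v : cold.get(key);
  }

  void put(K key, V value) {
    if (hot.containsKey(key)) {
      hot.put(key, value);
    } else if (cold.containsKey(key)) {
      cold.put(key, value);
    } else if (ghost.remove(key)) {
      // Key was evicted recently and is seen again: insert directly into the hot set.
      hot.put(key, value);
      if (hot.size() > hotSize) {
        removeEldest(hot.keySet());
      }
    } else {
      cold.put(key, value);
      if (cold.size() > coldSize) {
        // Remember the evicted key (not the value) in the ghost set.
        ghost.add(removeEldest(cold.keySet()));
        if (ghost.size() > ghostSize) {
          removeEldest(ghost);
        }
      }
    }
  }

  // Removes and returns the first (least recently used) element.
  private static <T> T removeEldest(Iterable<T> c) {
    Iterator<T> it = c.iterator();
    T eldest = it.next();
    it.remove();
    return eldest;
  }

  boolean isHot(K key) {
    return hot.containsKey(key);
  }

  boolean isCached(K key) {
    return hot.containsKey(key) || cold.containsKey(key);
  }
}
```

A long scan of keys that are each used once only churns the cold set, so entries in the hot set (the working set) stay protected.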
Clock-Pro+
[Figure: two clocks, "Hot" and "Cold", each with its own hand; every entry carries a hit counter (0 to 5 hits in the example).]
Clock-Pro+ Evaluation
– Only inexpensive operation on access,
no exclusive access needed
– Better efficiency than LRU for most analyzed workloads
– Downsides
● Eviction overhead increases when possible hitrates get high
(e.g. 3 entries scanned per eviction at 50% hitrate, 10 entries
scanned at 95%)
● High complexity, no straightforward implementation by the
book, lots of tuning needed (and possible)
– Still missing:
● Optimal selection of cold / hot space sizes
Benchmarks
flickr:bantam10
Benchmark Setup
● Cache implementations:
– cache2k Version 0.21 (to be released next week)
– EHCache Version 2.9.0
– Guava 18
– Infinispan 7.1.0.CR2
● Oracle JRE 1.8-25
● Hardware
– Intel(R) Core(TM) i7-2620M CPU @ 2.70GHz
Test workload
– Keys and values are integers
– Read through configuration, the cache source
just returns the key
– Not a practical workload: it emphasizes the caching overhead
// run the benchmark
Integer[] trace = ….
for (Integer v : trace) {
  cache.get(v);
}
// Implementation of cache source
public Integer get(Integer o) {
  incrementMissCount();
  return o;
}
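The loop and cache source above can be turned into a self-contained sketch. Everything here is an assumption for illustration: the class `TraceBenchmark`, the `randomTrace` generator and its seed, and the `LinkedHashMap`-based LRU that stands in for the benchmarked caches. It is not the code from the cache2k-benchmarks repository.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

// Hypothetical harness; the LRU stand-in replaces the benchmarked caches.
class TraceBenchmark {
  static int missCount;

  // Cache source as on the slide: count the miss and return the key.
  static Integer load(Integer key) {
    missCount++;
    return key;
  }

  // Trace of uniformly random keys from [0, range).
  static Integer[] randomTrace(int length, int range, long seed) {
    Random rnd = new Random(seed);
    Integer[] trace = new Integer[length];
    for (int i = 0; i < length; i++) {
      trace[i] = rnd.nextInt(range);
    }
    return trace;
  }

  // Replays the trace against a read-through LRU cache, prints the
  // runtime and returns the hitrate in percent.
  static double run(Integer[] trace, int capacity) {
    missCount = 0;
    Map<Integer, Integer> cache =
        new LinkedHashMap<Integer, Integer>(16, 0.75f, true) {
          @Override
          protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
            return size() > capacity;
          }
        };
    long start = System.nanoTime();
    for (Integer v : trace) {
      if (cache.get(v) == null) {
        cache.put(v, load(v));
      }
    }
    long millis = (System.nanoTime() - start) / 1_000_000;
    System.out.println("runtime: " + millis + " ms");
    return 100.0 * (trace.length - missCount) / trace.length;
  }
}
```

A call such as `TraceBenchmark.run(TraceBenchmark.randomTrace(3_000_000, 1000, 42L), 500)` replays uniformly random keys from 1000 values against a cache with a capacity of 500. A single timed run like this is only a rough indication; it ignores JIT warm-up and timer resolution.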
Runtime for artificial traces
3 million requests on cache with 500 capacity
Except Hits2000: cache with 2000 capacity
Hits: repeat different 500 values
Random: random select from 1000 values
Eff90 / Eff95: random trace with approx.
90% and 95% hitrate on LRU
[Chart: Runtime of 3 million cache requests; y-axis: runtime in seconds (0 to 6); series: cache2k/CLOCK, cache2k/CP+, cache2k/ARC, EHCache, Infinispan, Guava]
Runtime for mostly hits
[Chart: Runtime of 3 million cache hits; y-axis: runtime in seconds (0 to 2); series: HashMap+Counter, cache2k/CLOCK, cache2k/CP+, cache2k/ARC, EHCache, Infinispan, Guava]
The first four times for Hits:
20ms, 50ms, 50ms, 70ms
Runtime with two threads
[Chart: 3 million cache requests Eff95 per thread count; y-axis: runtime in seconds (0 to 2.5); series: cache2k/CLOCK, cache2k/CP+, cache2k/ARC, EHCache, Infinispan, Guava]
Some CPU consuming
computation is done on
cache miss
Eff95Threads2:
Same trace executed in
separate thread
with index offset
Hitrate comparison -
Artificial traces
[Chart: Hitrate of 3 million cache requests; y-axis: 0 to 100 (the axis label on the slide reads "runtime in seconds"); series: cache2k/CLOCK, cache2k/CP+, cache2k/ARC, EHCache, Infinispan, Guava]
Hitrate comparison -
Multi2 trace
[Chart: Hitrates for Multi2 trace; y-axis: 0 to 80; series: OPT, LRU, CLOCK, CP+, ARC, EHCache, Infinispan, Guava, RAND]
Hitrates comparison -
Web12 trace
[Chart: Hitrates for Web12 trace; y-axis: 0 to 90; series: OPT, LRU, CLOCK, CP+, ARC, EHCache, Infinispan, Guava, RAND]
Hitrate comparison -
Sprite trace
[Chart: Hitrates for Sprite trace; y-axis: 0 to 100; series: OPT, LRU, CLOCK, CP+, ARC, EHCache, Infinispan, Guava, RAND]
Take away
● The goal:
– Eviction algorithm doing better than LRU
– Self tuning / adapting
– Minimal overhead on cache access
Clock-Pro+ is almost there
Get involved...
● Try it: cache2k is on Maven Central
● Source on GitHub:
● http://github.com/headissue/cache2k
● http://github.com/headissue/cache2k-benchmarks
● Ask questions on Stack Overflow!
Thanks & Enjoy Life!
http://cruftex.net http://cache2k.org


Editor's Notes

  1. Hello! Jens
  2. When the cache becomes full: which entry to remove? Also called the replacement policy. It is the heart of the cache.
  3. Implementation: Each access moves the entry to the front of a doubly linked list. Eviction: The LRU entry is at the tail of the list. Evaluation: Simple. List manipulation needs exclusive access. Does not work well for some workloads, especially: not scan resistant! Around since 1965.
  4. Implementation: Cache entries are kept in a cyclic linked list. The hand points into the list and is moved forward for eviction. Insert: The entry is inserted before the hand. Eviction: Move the hand; reset the hit bit, or evict the entry if its hit bit is 0. Evaluation: Inexpensive operation on access: just set the hit bit. Usually not as effective as LRU. Not scan resistant.
  5. Implementation: Three clocks for cold, hot and ghost entries. Increment the hit counter on access. Eviction: "Shuffle" entries between cold and hot and select the entry with the lowest hits for eviction. Insert: Check for a ghost, then insert into the cold or hot set.
  6. 3 million accesses on a 500 entry cache. First values are: 30ms, 50ms, 50ms, 70ms. 32x more effective than Infinispan. The ARC implementation uses synchronized for the LRU operation. Single-threaded benchmark. Java 8 does a good job of optimizing synchronized. GC time for boxed integer keys is significant. A benchmark is (always) questionable: timer resolution!
  7. About 95% hitrate. The second thread executes the same trace with an offset. More realistic: the cache source does more work (generates 1000 random numbers per request). cache2k uses no segmentation; the hitrate and the cost of generating the cached value influence the possible concurrency.