The new Ehcache 2.0 and Hibernate SPI
Presentation Transcript
Caching, and what’s new in Ehcache 2 and Hibernate Caching Provider
Thursday, 17 June 2010
Why Cache?
Reasons to cache:
Offload - reducing the amount of resources consumed and hence cost
Performance - increasing the speed of processing
Scale out - distributed caching is a leading scale-out architecture
www.terracotta.org 2
Truncating the request-response Loop
[Diagram: load balancers in front of HTTPD servers and application servers with Ehcache, backed by Terracotta Servers (stripe 1 ... stripe n) and a MySQL database.]
Amdahl’s Law
Amdahl's law, after Gene Amdahl, is used to find the system speed up
from a speed up in part of the system.
System speed up = 1 / ((1 - Proportion Sped Up) + Proportion Sped Up / Speed up)
To apply Amdahl’s law you must measure the components of system
time and the before and after effect of the perf change made.
It is thus an empirical approach.
The other approach, which is not recommended, is “When all you have is a
hammer, every problem looks like a nail”. Often the problem turns out to be
something completely new. However, very few developers take the time to
make careful measurements.
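As a worked illustration of the formula above, here is a minimal Java sketch. The class name, method name and the measured figures (60% of request time spent in the database, made 10 times faster by a cache) are hypothetical and chosen only for the arithmetic; they do not come from the presentation.

```java
public class AmdahlExample {
    /**
     * Overall system speed up when part of the total time is sped up.
     *
     * @param proportionSpedUp fraction of total time affected, between 0 and 1
     * @param speedUp          factor by which that fraction becomes faster
     */
    public static double systemSpeedUp(double proportionSpedUp, double speedUp) {
        return 1.0 / ((1.0 - proportionSpedUp) + proportionSpedUp / speedUp);
    }

    public static void main(String[] args) {
        // Hypothetical measurement: 60% of request time is database access,
        // and caching makes that part 10 times faster.
        double overall = systemSpeedUp(0.60, 10.0);
        System.out.println("Overall speed up: " + overall);
    }
}
```

With these assumed numbers the system as a whole gets roughly 2.2 times faster, not 10 times, because the 40% of time that was not sped up now dominates.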
Cache Efficiency
cache efficiency = cache hits / total requests (hits + misses)
➡ High efficiency = high offload
➡ High efficiency = high performance
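A small Java sketch of how this ratio could be tracked, reading the denominator as all cache lookups (hits plus misses). The class `HitRatioCounter` is hypothetical and is not part of Ehcache, which exposes its own statistics.

```java
import java.util.concurrent.atomic.AtomicLong;

public class HitRatioCounter {
    private final AtomicLong hits = new AtomicLong();
    private final AtomicLong misses = new AtomicLong();

    public void recordHit() { hits.incrementAndGet(); }

    public void recordMiss() { misses.incrementAndGet(); }

    /** Hits divided by all lookups; 0.0 when nothing has been looked up yet. */
    public double efficiency() {
        long h = hits.get();
        long total = h + misses.get();
        return total == 0 ? 0.0 : (double) h / total;
    }

    public static void main(String[] args) {
        HitRatioCounter counter = new HitRatioCounter();
        for (int i = 0; i < 90; i++) counter.recordHit();
        for (int i = 0; i < 10; i++) counter.recordMiss();
        // 90 hits out of 100 lookups
        System.out.println("Cache efficiency: " + counter.efficiency());
    }
}
```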
Why does caching work?
Locality of Reference
Pareto Distributions
Locality of Reference
Many computer systems exhibit the phenomenon of locality of reference.
Data that is near other data or has just been used is more likely to be
used again.
Temporal locality - refers to the reuse of specific data and/or resources
within relatively small time durations.
Spatial locality - refers to the use of data elements within relatively close
storage locations
e.g. this is the reason for hierarchical memory design in computers
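Temporal locality is what makes a least-recently-used (LRU) eviction policy effective: recently used entries are the ones most likely to be used again. The sketch below builds a tiny LRU cache on the JDK's `LinkedHashMap` in access order. It only illustrates the idea; it is not how Ehcache implements eviction, and the class name `LruCache` is made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public LruCache(int maxEntries) {
        // accessOrder = true: iteration order runs from least to most recently used
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; evict the least recently used entry when over capacity
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // "a" becomes the most recently used entry
        cache.put("c", "3"); // over capacity: "b", the least recently used, is evicted
        System.out.println(cache.keySet());
    }
}
```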
Pareto Distributions
Chris Anderson, of Wired Magazine, coined the term The Long Tail to
refer to Ecommerce systems.
The mathematical term is a Pareto Distribution aka Power Law
Distribution.
Another Problem...
But...
What if the data set is too large to fit in the cache?
What about staleness of data?
Coherency with SOR
Coherency with the system of record (SOR) is classically solved with
automatic expiry. Ehcache has both time to live (TTL) and time to idle (TTI).
Better:
Eternal caching plus a cache invalidation protocol
Write-through or write-behind caches. The SOR gets updated in-line with the
cache. Hibernate read-write and transactional strategies are examples.
Also Ehcache CacheWriter
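To make the difference between TTL and TTI concrete, here is a minimal Java sketch of an expiry check. It is an illustration only, not Ehcache's implementation: the class and method names are hypothetical, and the sketch assumes that a setting of 0 disables the corresponding check.

```java
public class ExpiryCheck {
    /**
     * Returns true when an entry should be treated as expired.
     * TTL counts from creation; TTI counts from the last access.
     * A setting of 0 disables that check (an assumption of this sketch).
     */
    public static boolean isExpired(long createdAtMs, long lastAccessedAtMs, long nowMs,
                                    long timeToLiveSeconds, long timeToIdleSeconds) {
        boolean ttlExpired = timeToLiveSeconds > 0
                && nowMs - createdAtMs >= timeToLiveSeconds * 1000L;
        boolean ttiExpired = timeToIdleSeconds > 0
                && nowMs - lastAccessedAtMs >= timeToIdleSeconds * 1000L;
        return ttlExpired || ttiExpired;
    }

    public static void main(String[] args) {
        long created = 0L;
        long lastAccessed = 100_000L; // last read 100 s after creation

        // TTL 120 s, TTI 300 s, checked 110 s after creation: neither limit reached
        System.out.println(isExpired(created, lastAccessed, 110_000L, 120, 300));

        // Checked 130 s after creation: TTL exceeded even though the entry was read recently
        System.out.println(isExpired(created, lastAccessed, 130_000L, 120, 300));
    }
}
```

TTL bounds how stale an entry can become relative to the SOR, while TTI only removes entries nobody is reading, so TTI alone does not bound staleness for hot entries.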
Why run a cluster?
Availability, most often n+1 redundancy
Scale out
But this creates a new cascade of problems:
• N * problem
• Cluster coherency problem
• CAP theorem limits
N * Problem
On a single node, work is done once and then cached. Cache hits offload the work.
But in a cluster the work must be done N times, where N is the number of
nodes.
The solution to the N * problem is a replicated or distributed cache:
replicated cache - the data is copied to each node. All data is held in
each node.
distributed cache - the most used data is held in a node. The balance of
the data is held outside the application node.
Cluster Coherency Problem
Each cache makes independent cache changes. The caches become
different
Solved partly by a replicated or distributed cache
But, without locking, race conditions will cause incoherencies
Solution is a coherent, distributed or replicated cache
CAP Theorem and PACELC
The CAP theorem, also known as Brewer's theorem, states that it is
impossible for a distributed computer system to simultaneously provide
all three of the following guarantees: consistency, availability and
tolerance to partition.
A better explanation of the tradeoffs is PACELC: if there is a partition (P),
how does the system trade off availability against consistency (A and C);
else (E), when the system is running as normal in the absence of
partitions, how does the system trade off latency (L) against
consistency (C)?
There is no right answer, but the properties to be traded off will be
different for different applications. So the solution must be configurable.
1. http://dbmsmusings.blogspot.com/2010/04/problems-with-cap-and-yahoos-little.html
About Ehcache
The world's most widely used Java cache
Founded in 2003
Apache 2.0 License
Integrated by lots of projects, products
Hibernate Provider implemented 2003
Web Caching 2004
Distributed Caching 2006
Greg Luck becomes co-spec lead of JSR107
JCACHE (JSR107) implementation 2007
REST and SOAP APIs 2008
SourceForge Project of the Month March 2009
Acquired by Terracotta 2009; Integrated with Terracotta Server
Ehcache 2.0 March 2010
Forrester Wave “Leader” May 2010
Ehcache 2
Ehcache before Terracotta
Ehcache after Terracotta
Adding a specific ehcache.xml
ehcache.xml:
<ehcache>
  <defaultCache
      maxElementsInMemory="10000"
      eternal="false"
      timeToLiveSeconds="120"
  />
  <cache name="org.hibernate.cache.UpdateTimestampsCache"
      maxElementsInMemory="10000"
      timeToIdleSeconds="300"
  />
  <cache name="org.hibernate.cache.StandardQueryCache"
      maxElementsInMemory="10000"
      timeToIdleSeconds="300"
  />
</ehcache>
Ehcache 2 - New Features
Hibernate 3.3+ Caching SPI
Old SPI was heavily synchronized and not well suited to
clusters
New SPI uses RegionFactory (Ehcache implements it as EhCacheRegionFactory)
Fully cluster safe with Terracotta Server Array
Unification of the Ehcache and Terracotta 3.2 providers
JTA, i.e. the transactional caching strategy
JTA
Cache as an XAResource
Detects most common Transaction Managers
Others configurable
Works with Spring, EJB and manual transactions
Ehcache 2 - New Features
Write-behind
Offloads Databases with high write workloads
CacheWriter Interface to implement
cache.putWithWriter(...) and cache.removeWithWriter(...)
Write-through and Write-behind modes
Batching, coalescing and very configurable
Standalone with in-memory write-behind queue.
TSA with HA, durability and distributed workload balancing
Bulk Loading
incoherent mode for startup or periodic cache loading
10x faster
No change to the API (put, load etc).
setNodeCoherent(), isNodeCoherent(), isClusterCoherent(), waitUntilClusterCoherent()
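The write-behind idea from this slide (update the cache immediately, defer the database write, then batch and coalesce) can be sketched in plain Java. This is not the Ehcache CacheWriter API. `WriteBehindSketch`, its fields and the in-memory list standing in for the database are all hypothetical, and a real write-behind queue would flush from a background thread rather than on demand.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class WriteBehindSketch {
    private final Map<String, String> cache = new HashMap<>();
    // Pending writes keyed by entry key, so repeated updates coalesce to the latest value
    private final Map<String, String> pending = new LinkedHashMap<>();
    // Stand-in for the database: each list element is one batch written to it
    public final List<Map<String, String>> batchesWritten = new ArrayList<>();

    public synchronized void put(String key, String value) {
        cache.put(key, value);   // the cache is updated immediately
        pending.put(key, value); // the database write is deferred
    }

    public synchronized String get(String key) {
        return cache.get(key);
    }

    /** Writes all pending changes as one batch and returns how many rows it contained. */
    public synchronized int flush() {
        if (pending.isEmpty()) {
            return 0;
        }
        batchesWritten.add(new LinkedHashMap<>(pending));
        int rows = pending.size();
        pending.clear();
        return rows;
    }

    public static void main(String[] args) {
        WriteBehindSketch writeBehind = new WriteBehindSketch();
        writeBehind.put("owner:1", "Smith");
        writeBehind.put("owner:1", "Smythe"); // coalesced with the previous write
        writeBehind.put("owner:2", "Jones");
        int rows = writeBehind.flush();
        System.out.println(rows + " rows in " + writeBehind.batchesWritten.size() + " batch(es)");
    }
}
```

Three puts reach the cache at once, but the stand-in database sees a single batch of two rows. That reduction is the offload write-behind is meant to provide for write-heavy workloads.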
Ehcache 2 - New Features ...cont.
New CAP configurability – per cache basis
coherent – run coherent or incoherent (faster)
synchronousWrites – true for HA, false is faster
copyOnRead – true to stop interactions between threads outside of the cache
Cluster events – notification of partition and reconnection
NonStopCache - decorated cache favouring availability
UnlockedReadsView - decorated cache favouring speed
Management
Dynamic Configuration of common cache configs from JMX and DevConsole
New web-based Monitoring with UI and API
Ehcache 2.0 Monitoring Options
JMX
• is built in to Ehcache but...
• JMX needs to use portmap
• Slow
• Machines may be headless
Terracotta Dev Console (if using Terracotta)
Ehcache Console, new in 2.1
Including Web Services
Simple + Performant + Coherent + HA + Scalable
[Diagram: applications with Ehcache connected to Terracotta Servers; PHP, C, C# and Ruby apps connect over REST/HTTP to Ehcache Server running in a web container.]
Terracotta Developer Console
Cache hit ratios
Hit/miss rates
Hits on the database
Cache puts
Detailed efficiency of cache regions
Dramatically simplifies tuning and operations, and shows the database offload.
Ehcache Console
Web based
Configuration
Efficiency
Memory Use
Comes with supported versions
API to connect Operations Monitoring
Exploring Hibernate Caching:
Spring Pet Clinic
Code - Spring Pet Clinic
Pet Clinic Domain Model
[Diagram: Spring PetClinic domain model. Domain objects: Vets, Vet, Specialty, Owner, Pet, PetType, Visit.]
Code
Steps:
Configure PetClinic for Hibernate
Configure Hibernate for second-level cache
Configure hbm file for caching
Update query code to add caching
Optional but recommended:
add ehcache.xml to WEB-INF/classes
specify cache regions and config
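The "configure Hibernate for second-level cache" step usually comes down to a handful of properties. The Java sketch below collects them in a `java.util.Properties` object. The property names and the region factory class are given as commonly documented for Hibernate 3.3+ with Ehcache 2.0, so treat them as assumptions and check them against the versions in use. The class `SecondLevelCacheSettings` is hypothetical.

```java
import java.util.Properties;

public class SecondLevelCacheSettings {
    /** Builds the cache-related Hibernate properties (names assumed for Hibernate 3.3+ / Ehcache 2.0). */
    public static Properties hibernateCacheProperties() {
        Properties props = new Properties();
        // Turn on the second-level cache and the query cache
        props.setProperty("hibernate.cache.use_second_level_cache", "true");
        props.setProperty("hibernate.cache.use_query_cache", "true");
        // Select the Ehcache implementation of the Hibernate caching SPI
        props.setProperty("hibernate.cache.region.factory_class",
                "net.sf.ehcache.hibernate.EhCacheRegionFactory");
        return props;
    }

    public static void main(String[] args) {
        // Print the settings in key=value form, e.g. for pasting into hibernate.properties
        hibernateCacheProperties().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The remaining steps happen outside these properties: entities are marked cacheable in the hbm mapping files, and individual queries opt in to the query cache in code.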
Standalone Performance
Put Performance
Ehcache in-process vs Memcached
REST Performance
[Chart: Ehcache Server vs Memcache for Get, Multi-Get, Put and Remove; vertical axis from 0 to 4000.]
Source: MemcacheBench with Java clients. Time for 10,000 operations
Ehcache with Terracotta vs the Rest
Application
Tests done with Owners = 25K and 125K, which translates to total objects of 0.3 M and 1.5 M
Minimal tuning.
Cluster Configuration:
8 Client JVMs (1.75G Heap)
1 (+0) Terracotta Servers (6G Heap)
MySQL: sales18.
Ehcache with Terracotta vs the Rest
Ehcache
Replicated with RMI not included because not coherent
Single TSA Server
15 threads and some with 100 threads
IMDG
15 threads
Cache deployed in Partitioned Mode
Tests were also done with Replicated, which did well for small cache sizes but failed to complete with larger cache sizes. So, it is not included.
memcached
15 threads
1 server
Hibernate - Read Only TPS
Test Source
The code behind the benchmarks is in the Terracotta Community SVN repository.
Download https://svn.terracotta.org/repo/forge/projects/ehcacheperf/
(Terracotta Community Login Required)
Performance Conclusions
With Hibernate, Using Spring Pet Clinic
After app servers and DBs tuned by independent 3rd parties
30-95% database load reduction
80 times read-only performance of MySQL
Notably lower latency
1.5 ms versus 120 ms for database (25k)
What about NoSQL?
Ehcache + Terracotta configured with persistence gives you a NoSQL
store with limited features i.e. no search
TerraStore - new open source project from Sergio Bossa is a document
oriented NoSQL store based on Terracotta and ful
Wrap Up
Q&A
Please ask any questions you have in the Q&A window.