Overview of the ehcache

2011.12.02
chois79

Contents
• About Caches
• Why caching works
• Will an Application Benefit from Caching?
• How much will an application speed up?
• About Ehcache
• Features of Ehcache
• Key Concepts of Ehcache
• Using Ehcache
• Distributed Ehcache Architecture
• References

About Caches
• In Wiktionary
– A store of things that will be required in future and can be
retrieved rapidly

• In computer science
– A collection of temporary data which either duplicates data
located elsewhere or is the result of a computation

– The data can be repeatedly accessed inexpensively

Why caching works
• Locality of Reference
– Data that is near other data or has just been used is more likely to be used
again

• The Long Tail

A small number of items may make up the
bulk of sales. – Chris Anderson

– One form of a Power Law distribution is the Pareto distribution (80:20 rule)
– If 20% of objects are used 80% of the time and a way can be found to
reduce the cost of obtaining that 20%, then system performance will improve

Will an Application Benefit from Caching?
CPU-bound Application
• The time taken principally depends on the speed of the CPU
and main memory
• Speeding up
– Improving algorithm performance
– Parallelizing the computations across multiple CPUs or multiple
machines
– Upgrading the CPU speed

• The role of caching
– Temporarily store the results of computations that may be reused
• e.g. DB cache, large web pages that have a high rendering cost.

I/O-bound Application
• The time taken to complete a computation depends principally
on the rate at which data can be obtained
• Speeding up
– Hard disks speed up access by caching blocks in their own memory
• There is no Moore’s law for hard disks.

– Increase the network bandwidth

• The role of cache
– Web page caching, for pages generated from databases
– Data Access object caching

Increased Application Scalability

• Databases can do on the order of 100 expensive queries per second
– Caching may be able to reduce the workload required

How much will an application speed up?
(Amdahl’s Law)

• Depends on a multitude of factors
– How many times a cached piece of data can be, and is, reused by the
application

– The proportion of the response time that is alleviated by
caching

• Amdahl’s Law
– Speedup = 1 / ((1 – P) + P / S)
– P: proportion of the response time that is sped up
– S: speedup of that proportion

Amdahl’s Law Example
(Speed up from a Database Level Cache)
Un-cached page time: 2 seconds
Database time: 1.5 seconds
Cache retrieval time: 2ms
Proportion: 75% (1.5/2)
The expected system speedup is thus:
1 / (( 1 – 0.75) + 0.75 / (1500/2))
= 1 / (0.25 + 0.75/750)
= 3.98 times system speedup
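
The arithmetic above can be checked with a few lines of Java. This is only a sketch: the class and method names (AmdahlSpeedup, speedup) are made up for illustration, and the inputs are the numbers from this example.

```java
// Sketch: Amdahl's Law applied to the database-level cache example.
// AmdahlSpeedup and speedup are illustrative names, not part of any library.
class AmdahlSpeedup {

    /**
     * @param p proportion of the total time that is sped up (0..1)
     * @param s speedup factor applied to that proportion
     * @return overall system speedup
     */
    static double speedup(double p, double s) {
        return 1.0 / ((1.0 - p) + p / s);
    }

    public static void main(String[] args) {
        double p = 1.5 / 2.0;     // database time / un-cached page time = 0.75
        double s = 1500.0 / 2.0;  // 1500 ms database time / 2 ms cache retrieval = 750
        System.out.println(speedup(p, s)); // roughly 3.98
    }
}
```

Note that even with S = 750 the overall gain stays below 4, because the 25% of the page time that is not database time is untouched by the cache.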

About Ehcache
• Open source, standards-based cache used to boost performance

• Primarily an in-process cache
• Scales from in-process with one or more nodes through to a mixed
in-process/out-of-process configuration with terabyte-sized caches
• For applications needing a coherent distributed cache, Ehcache uses
the open source Terracotta Server Array

• Java-based Cache, Available under an Apache 2 license

• The Wikimedia Foundation uses Ehcache to improve the performance
of its wiki projects

Features of Ehcache(1/2)
• Fast and Light Weight
– Fast, Simple API
– Small footprint: Ehcache 2.2.3 is 668 KB, making it convenient to package
– Minimal dependencies: only dependency on SLF4J

• Scalable
– Provides Memory and Disk store for scalability into gigabytes
– Scalable to hundreds of nodes with the Terracotta Server Array

• Flexible
– Supports Object or Serializable caching
– Provides LRU, LFU and FIFO cache eviction policies
– Provides Memory and Disk stores

Features of Ehcache(2/2)
• Standards Based
– Full implementation of JSR107 JCACHE API

• Application Persistence
– Persistent disk store which stores data between VM restarts

• JMX Enabled
• Distributed Caching
– Clustered caching via Terracotta
– Replicated caching via RMI, JGroups, or JMS

• Cache Server
– RESTful, SOAP cache Server

• Search
– Standalone and distributed search using a fluent query language

Key Concepts of Ehcache
Key Classes
• CacheManager
– Manages caches

• Ehcache
– All caches implement the Ehcache interface
– A cache has a name and attributes
– Cache elements are stored in the memory store; optionally they also overflow
to a disk store

• Element
– An atomic entry in a cache
– Has key and value
– Put into and removed from caches

Usage patterns: Cache-aside
• Application code uses the cache directly
• Order
– Application code consults the cache first
– If the cache contains the data, then return the data directly
– Otherwise, the application code must fetch the data from the system-of-record,
store the data in the cache, then return it.
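
The order above can be sketched in plain Java. This is not Ehcache code: a ConcurrentHashMap stands in for the cache, and the systemOfRecord function and the sorReads counter are hypothetical names used only to make the flow visible.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Cache-aside sketch: the application checks the cache itself and, on a miss,
// loads from the system-of-record (SoR) and stores the result.
class CacheAsideExample {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> systemOfRecord; // hypothetical SoR lookup
    int sorReads = 0; // illustration only; not thread-safe

    CacheAsideExample(Function<String, String> systemOfRecord) {
        this.systemOfRecord = systemOfRecord;
    }

    String read(String key) {
        String value = cache.get(key);      // 1. consult the cache first
        if (value != null) {
            return value;                   // 2. hit: return the data directly
        }
        sorReads++;
        value = systemOfRecord.apply(key);  // 3. miss: fetch from the SoR
        if (value != null) {
            cache.put(key, value);          // 4. store in the cache
        }
        return value;                       // 5. return
    }
}
```

Usage would look like `new CacheAsideExample(key -> "value-of-" + key).read("a")`; a second read of the same key is served from the map without touching the SoR.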


Usage patterns: Read-through
• Mimics the structure of the cache-aside pattern when reading data
• The difference
– Must implement the CacheEntryFactory interface to instruct the cache how to
read objects on a cache miss
– Must wrap the Ehcache instance with an instance of SelfPopulatingCache
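
The idea can be sketched without the Ehcache classes. In the sketch below, EntryLoader and ReadThroughCache are hypothetical stand-ins for the roles played by the entry factory and the self-populating wrapper; they are not Ehcache's API.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for an entry factory: tells the cache how to load on a miss.
interface EntryLoader<K, V> {
    V load(K key);
}

// Read-through sketch: callers only ever ask the cache; the cache calls the loader.
class ReadThroughCache<K, V> {
    private final ConcurrentMap<K, V> store = new ConcurrentHashMap<>();
    private final EntryLoader<K, V> loader;

    ReadThroughCache(EntryLoader<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        // ConcurrentHashMap.computeIfAbsent applies the loader at most once per
        // missing key, even when several threads ask for the same key at once.
        return store.computeIfAbsent(key, loader::load);
    }
}
```

Compared with cache-aside, the miss-handling logic lives in one place (the loader) instead of being repeated at every call site.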


Usage patterns: Write-through and behind
• Mimics the structure of the cache-aside pattern when writing data
• The difference
– Must implement the CacheWriter interface and configure the cache for write-through or
write-behind
– A write-through cache writes data to the system-of-record in the same thread of execution
– A write-behind cache queues the data to be written at a later time
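
The difference between the two modes can be sketched in plain Java. RecordWriter and WritingCache are hypothetical names, not Ehcache classes, and the sketch is single-threaded: a real write-behind cache would drain the queue from a background thread rather than through an explicit flush() call.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical writer callback, standing in for the role a cache writer plays.
interface RecordWriter {
    void write(String key, String value);
}

class WritingCache {
    private final Map<String, String> cache = new HashMap<>();
    private final Queue<String[]> pending = new ArrayDeque<>(); // queued {key, value} pairs
    private final RecordWriter writer;
    private final boolean writeBehind;

    WritingCache(RecordWriter writer, boolean writeBehind) {
        this.writer = writer;
        this.writeBehind = writeBehind;
    }

    void put(String key, String value) {
        cache.put(key, value);
        if (writeBehind) {
            pending.add(new String[] {key, value}); // write-behind: defer the SoR write
        } else {
            writer.write(key, value);               // write-through: same thread, before returning
        }
    }

    // Writes all queued entries to the system-of-record, in the order they were put.
    void flush() {
        String[] entry;
        while ((entry = pending.poll()) != null) {
            writer.write(entry[0], entry[1]);
        }
    }

    int pendingWrites() {
        return pending.size();
    }
}
```

The trade-off is visible in put(): write-through pays the SoR latency on every write, while write-behind returns immediately but leaves a window in which the SoR is behind the cache.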


Usage patterns: Cache-as-SoR
• Delegate SoR reading and writing activities to the cache
• To implement, use a combination of the following patterns
– Read-through
– Write-through or write-behind

• Advantages
– Less cluttered application code
– Easily choose between write-through or write-behind strategies
– Allow the cache to solve the “thundering-herd” problem

• Disadvantages
– Less directly visible code-path

Storage Options: Memory Store
• Suitable Element Types
– All Elements are suitable for placement in the Memory Store

• Characteristics
– Thread safe for use by multiple concurrent threads
– Backed by LinkedHashMap (JDK 1.4 and later)
• LinkedHashMap: Hash table and linked list implementation of the Map interface

– Fast

• Memory Use, Spooling and Expiry Strategy
– Least Recently Used (LRU): default
– Least Frequently Used (LFU)
– First In First Out (FIFO)
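
To show how a LinkedHashMap can provide LRU eviction, here is a minimal sketch. It illustrates the general technique only and is not Ehcache's Memory Store implementation; LruMap is a made-up name, and unlike the Memory Store it is not thread-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// LRU sketch on top of LinkedHashMap (not thread-safe).
class LruMap<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruMap(int maxEntries) {
        // accessOrder = true: entries are ordered from least to most recently accessed
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; returning true evicts the least recently used entry
        return size() > maxEntries;
    }
}
```

With a capacity of 2, putting "a" and "b", reading "a", then putting "c" evicts "b", because "a" was touched more recently.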

Storage Options: Big-Memory Store
• Pure Java product from Terracotta that permits caches to use an additional type of
memory store outside the object heap. (Packaged for use in Enterprise Ehcache)
– Not subject to Java GC
– 100 times faster than the Disk Store
– Allows very large caches to be created (tested up to 350GB)

• Two implementations
– Only Serializable cache keys and values can be placed in the store, similar to the Disk Store
– Serialization and deserialization take place on putting to and getting from the store
• Around 10 times slower than the Memory Store
• The memory store holds the hottest subset of data from the off-heap store, already in deserialized form

• Suitable Element Types
– Only Elements which are serializable can be placed in the off-heap store
– Any non-serializable Elements will be removed and a WARNING-level log message emitted
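
Whether a value is eligible for a serializing store can be checked with standard Java serialization. The sketch below only illustrates the Serializable requirement; SerializableCheck and trySerialize are made-up names, and it does not reproduce how Ehcache itself stores or rejects elements.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

class SerializableCheck {

    // Returns the serialized bytes, or null if the value cannot be serialized.
    static byte[] trySerialize(Object value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value); // throws NotSerializableException for non-serializable values
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {   // NotSerializableException is an IOException
            return null;
        }
    }
}
```

A String serializes fine, while a plain `new Object()` does not, so only the former could live in a store that keeps data in serialized form.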

Storage Options: Disk Store
• The Disk Store is optional
• Suitable Element Type
– Only Elements which are serializable can be placed in the Disk Store
– Any non-serializable Elements will be removed and a WARNING-level
log message emitted

• Eviction
– The LFU algorithm is used and it is not configurable or changeable

• Persistence
– Controlled by the diskPersistent configuration
– If false or omitted, the disk store will not persist between CacheManager restarts

Replicated Caching
• Ehcache has a pluggable cache replication scheme
– RMI, JGroups, JMS

• Using a Cache Server
– To achieve shared data, all JVMs read from and write to a Cache Server

• Notification Strategies
– If the Element is not available anywhere else, then the element itself should form the payload
of the notification


Search APIs
• Allows you to execute arbitrarily complex queries against either a standalone
cache or a Terracotta clustered cache with pre-built indexes
• Searchable attributes may be extracted from both keys and values
• Attribute Extractors
– Attributes are extracted from keys or values
– This is done during search or, if using Distributed Ehcache, on put() into the
cache using AttributeExtractors
– Supported types
• Boolean, Byte, Character, Double, Float, Integer, Long, Short, String, Enum, java.util.Date,
java.sql.Date

Using Ehcache
General-Purpose Caching
• Local Cache
• Configuration
– Place the Ehcache jar into your class-path
– Configure ehcache.xml and place it in your class-path
– Optionally, configure an appropriate logging level

[Figure: web servers, each running the application with a local Ehcache, in front of a DB]

Using Ehcache
Cache Server
• Support for RESTful and SOAP APIs
• Redundant, Scalable with client hash-based routing
– The client can be implemented in any language
– The client must work out a partitioning scheme
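
One simple partitioning scheme a client could use is hash-modulo routing, sketched below. HashRouter and the server names are hypothetical; this is not part of the Cache Server or any client library.

```java
import java.util.ArrayList;
import java.util.List;

// Client-side routing sketch: pick a cache server from the key's hash.
class HashRouter {
    private final List<String> servers; // hypothetical Cache Server addresses

    HashRouter(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        this.servers = new ArrayList<>(servers);
    }

    String serverFor(String key) {
        // floorMod keeps the index non-negative even when hashCode() is negative
        int index = Math.floorMod(key.hashCode(), servers.size());
        return servers.get(index);
    }
}
```

Every client that uses the same server list and the same hash sends a given key to the same server. The weakness of plain modulo routing is that changing the number of servers remaps most keys.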


Using Ehcache
Integrate with other solutions
• Hibernate

• Java EE Servlet Caching

• JCache style caching

• Spring, Cocoon, Acegi and other frameworks

Distributed Ehcache Architecture
(Logical View)
• Distributed Ehcache combines an in-process Ehcache with the Terracotta Server Array

• The data is split between an Ehcache node(L1) and the Terracotta Server Array(L2)
– The L1 can hold as much data as is comfortable

– The L2 always holds a complete copy of all cache data

– The L1 acts as a hot-set of recently used data

(Ehcache topologies)
• Standalone
– The cache data set is held in the application node
– Any other application nodes are independent with no communication
between them

• Distributed Ehcache
– The data is held in a Terracotta server Array with a subset of recently used
data held in each application cache node

• Replicated
– The cached data set is held in each application node and data is copied or
invalidated across the cluster without locking
– Replication can be either asynchronous or synchronous
– The only consistency mode available is weak consistency

(Network View)
• From a network topology point of view, Distributed Ehcache consists of
– Ehcache node(L1)
• The Ehcache library is present in each app
• An Ehcache instance, running in-process sits in each JVM

– Terracotta Server Array(L2)
• Each Ehcache instance maintains a connection with one or more Terracotta Servers
• Consistent hashing is used by the Ehcache nodes to store and retrieve cache data
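
As a reminder of what consistent hashing means in general, here is a minimal hash-ring sketch. It is a textbook version with made-up names (ConsistentHashRing, virtual nodes, a simple bit-mixing hash) and makes no claim about how Ehcache or the Terracotta Server Array implement it.

```java
import java.util.Map;
import java.util.TreeMap;

// Consistent-hashing sketch: servers and keys are hashed onto the same ring;
// a key belongs to the first server point at or after its own hash.
class ConsistentHashRing {
    private final TreeMap<Integer, String> ring = new TreeMap<>();
    private final int virtualNodes; // points per server, to even out the distribution

    ConsistentHashRing(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    void addServer(String server) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(server + "#" + i), server);
        }
    }

    void removeServer(String server) {
        for (int i = 0; i < virtualNodes; i++) {
            // two-argument remove: only drops the point if it still belongs to this server
            ring.remove(hash(server + "#" + i), server);
        }
    }

    String serverFor(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no servers on the ring");
        }
        Map.Entry<Integer, String> entry = ring.ceilingEntry(hash(key));
        // wrap around to the first point when the key hashes past the last one
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // Mixes the bits of String.hashCode(); a production ring would use a stronger hash.
    private static int hash(String s) {
        int h = s.hashCode();
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }
}
```

The property that matters is that removing a server only moves the keys that lived on that server; every other key keeps its placement.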


(Memory Hierarchy View)
• Each in-process Ehcache instance
– Heap memory
– Off-heap memory(Big Memory)

• The Terracotta Server Array
– Heap memory
– Off-heap memory
– Disk storage (optional, for persistence)


Ehcache in-process compared with
Memcached

References
• Ehcache User Guide
– http://ehcache.org/documentation

• Ehcache Architecture, Features And Usage patterns
– Greg Luck, 2009 JavaOne Session 2007
