What is a Cache?
• Term introduced at IBM in the 1960s
• A copy of the real data that offers faster (and/or cheaper) access
Caching is everywhere
• Hardware (CPU, HDD)
• Operating systems (for example, RAM as a disk cache)
• Web browsers, Proxy Servers
• Applications
• DNS, DBMS, File Systems (NFS, SMB), …
Why do we use it?
• Shorter application response time
• Reduced load on the main data source, or reduced processing time
Main principles
• Don’t execute code unless you need to
• Get data from the fastest place you can
• Don’t get the same data twice
What do we cache?
• Storage requests
• Network requests
• Data from expensive calculations
• Static data
• Whole web-pages
• …
Terminology
Common Terminology
• Cache hit
• Cache miss
• Hit ratio (hit rate)
• Storage cost
• Retrieval cost
• Invalidation
• Replacement policy
• Optimal replacement policy
Cache Hit/Miss
Hit Ratio
• 45 cache hits
• 10 cache misses
• Hit ratio: 45*100/(45+10) ≈ 82%
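The arithmetic on this slide can be written as a tiny helper. This is only a sketch: the function name `hit_ratio` and the choice to report 0% when there have been no lookups are assumptions made for illustration.

```python
def hit_ratio(hits: int, misses: int) -> float:
    """Return the share of lookups served from the cache, in percent."""
    total = hits + misses
    if total == 0:
        return 0.0  # assumption: report 0% when nothing has been looked up yet
    return hits * 100 / total


print(round(hit_ratio(45, 10)))  # 82
```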
Storage Cost
• Cost of storing an item in the cache
• Basically, the space needed to store a single item in the cache
Retrieval Cost
• Cost of retrieving an item from the cache
• Can include network load in a distributed cache, algorithm overhead, ...
Invalidation
• Keeping the cache up to date: removing stale data and/or replacing it with current data
Replacement (Eviction) Policy
• Heuristic used to select the entry to evict when there is no room left for a new item
Optimal Replacement Policy
(Theoretical Optimum)
• The most efficient caching algorithm would be to always discard the information that will not be needed for the longest time in the future. This optimal result is referred to as Bélády's optimal algorithm or the clairvoyant algorithm. Since it is generally impossible to predict how far in the future information will be needed, this is generally not implementable in practice. The practical minimum can be calculated only after experimentation, and one can compare the effectiveness of the actually chosen cache algorithm with the optimal minimum.
-- Wikipedia
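Although the clairvoyant policy cannot run online, it can be simulated after the fact on a recorded access trace, which is how the "optimal minimum" in the quote is obtained. The following is a minimal sketch under that assumption; `belady_misses` and the sample trace are hypothetical names and data, and the linear scan of the future is chosen for clarity rather than speed.

```python
def belady_misses(trace, capacity):
    """Count misses of the clairvoyant policy on a fully known access trace."""
    cache = set()
    misses = 0
    for i, key in enumerate(trace):
        if key in cache:
            continue  # hit
        misses += 1
        if len(cache) >= capacity:
            future = trace[i + 1:]

            def next_use(k):
                # Distance to the next access, or infinity if never used again.
                return future.index(k) if k in future else float("inf")

            # Evict the cached key that is needed farthest in the future.
            cache.remove(max(cache, key=next_use))
        cache.add(key)
    return misses


print(belady_misses(list("abcabdab"), capacity=3))  # 4
```

Running a practical policy over the same trace and comparing its miss count against this number shows how far that policy is from the theoretical optimum.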
Simple Time-based
• Invalidates each entry a fixed period after it was stored (Time-To-Live, TTL)
• Fast
• Not adaptive
• Not scan resistant
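A minimal sketch of TTL invalidation, assuming lazy expiry (an entry is checked only when it is read). The class name `TTLCache` and the injectable `clock` parameter are illustrative choices, not part of the presentation; the clock is injectable only so the behaviour can be tested without sleeping.

```python
import time


class TTLCache:
    """Each entry expires a fixed number of seconds after it was stored."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._data = {}  # key -> (value, expiry time)

    def put(self, key, value):
        self._data[key] = (value, self.clock() + self.ttl)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._data[key]  # expired: invalidate lazily on read
            return default
        return value  # reads do not extend the lifetime
```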
Absolute Time-based
• Data is invalidated at fixed points in time (such as every 5 minutes, each day at 13:00, …)
• Fast
• Not adaptive
• Not scan resistant
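One way to sketch the "every N minutes" case is to split wall-clock time into fixed windows and flush the whole cache when a window boundary has passed. The class name `PeriodicFlushCache`, the flush-everything behaviour, and the injectable `clock` are assumptions made for this example.

```python
import time


class PeriodicFlushCache:
    """Drops all entries whenever the clock crosses a fixed period boundary."""

    def __init__(self, period_seconds, clock=time.time):
        self.period = period_seconds
        self.clock = clock
        self._data = {}
        self._window = self._current_window()

    def _current_window(self):
        # Index of the current period since the epoch (e.g. 5-minute slot).
        return int(self.clock() // self.period)

    def _flush_if_boundary_passed(self):
        window = self._current_window()
        if window != self._window:
            self._data.clear()
            self._window = window

    def put(self, key, value):
        self._flush_if_boundary_passed()
        self._data[key] = value

    def get(self, key, default=None):
        self._flush_if_boundary_passed()
        return self._data.get(key, default)
```

Unlike TTL, an entry's lifetime here depends on when it was stored relative to the boundary: an item stored 10 seconds before the boundary lives for only 10 seconds.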
Sliding Time-based
• Keeps each item's last-usage time
• Invalidates entries after a specified idle period (Time-To-Idle, TTI)
• Fast
• Adaptive
• Not scan resistant
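A minimal sketch of sliding expiration, assuming that only successful reads count as "usage". The class name `TTICache` and the injectable `clock` are illustrative, not taken from the presentation.

```python
import time


class TTICache:
    """An entry expires after `idle_seconds` without being read."""

    def __init__(self, idle_seconds, clock=time.monotonic):
        self.idle = idle_seconds
        self.clock = clock
        self._data = {}  # key -> [value, last-usage time]

    def put(self, key, value):
        self._data[key] = [value, self.clock()]

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        now = self.clock()
        if now - entry[1] >= self.idle:
            del self._data[key]  # idle for too long: invalidate
            return default
        entry[1] = now  # each hit slides the expiry window forward
        return entry[0]
```

This is where the "adaptive" label comes from: frequently read items keep renewing themselves, while unused ones age out.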
Other Algorithms
• Other popular algorithm types include:
– Dependency-based
– Rule-based
Replacement Policies
(Eviction Policies)
FIFO (First in First out)
• Simple queue of items
• Order never changes
• When the cache is full, items are discarded in the order they were added
• Fast with very small overhead
• Not adaptive
• Not scan resistant
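A minimal FIFO sketch using `collections.OrderedDict` as the queue. The class name `FIFOCache` is hypothetical, and a capacity of at least 1 is assumed.

```python
from collections import OrderedDict


class FIFOCache:
    """Evicts entries strictly in insertion order (capacity >= 1 assumed)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()  # oldest insertion first

    def get(self, key, default=None):
        return self._data.get(key, default)  # reads never change the order

    def put(self, key, value):
        if key not in self._data and len(self._data) >= self.capacity:
            self._data.popitem(last=False)  # discard the oldest insertion
        self._data[key] = value  # updating an existing key keeps its position
```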
LFU (Least Frequently Used)
• Keeps track of item usage count
• Items with lowest counter value are discarded first
• Fast
• Adaptive
• Not scan resistant
• Variations:
– Perfect LFU: keeps usage info for all items, whether or not they are in the cache
– In-cache LFU: keeps usage info only for items currently in the cache
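A minimal sketch of the in-cache LFU variation. The class name `LFUCache` is hypothetical; counting a `put` as one usage and scanning all counters to find the victim are simplifications chosen for brevity (production implementations typically use frequency buckets to avoid the O(n) scan).

```python
class LFUCache:
    """In-cache LFU: counters exist only for cached keys (capacity >= 1 assumed)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._values = {}
        self._counts = {}  # key -> usage count

    def get(self, key, default=None):
        if key not in self._values:
            return default
        self._counts[key] += 1
        return self._values[key]

    def put(self, key, value):
        if key not in self._values and len(self._values) >= self.capacity:
            # O(n) scan for the lowest counter; ties go to the oldest insertion.
            victim = min(self._counts, key=self._counts.get)
            del self._values[victim]
            del self._counts[victim]  # the counter is forgotten on eviction
        self._values[key] = value
        self._counts[key] = self._counts.get(key, 0) + 1  # a put counts as a use
```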
LRU (Least Recently Used)
• Tracks each item's last-usage time
• The item with the oldest last-usage time is discarded first
• Fast
• Adaptive
• Not scan resistant
• Also known as LRU/1 or LRU-1
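A minimal LRU sketch. Instead of storing timestamps, it keeps entries ordered by recency in a `collections.OrderedDict`, which gives the same eviction order. The class name `LRUCache` is hypothetical, and a capacity of at least 1 is assumed.

```python
from collections import OrderedDict


class LRUCache:
    """Evicts the least recently used entry (capacity >= 1 assumed)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()  # least recently used first

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        elif len(self._data) >= self.capacity:
            self._data.popitem(last=False)  # discard the least recently used
        self._data[key] = value
```

For memoizing function calls, Python's standard library also provides `functools.lru_cache`.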
LRU/2 (Least Recently Used/2)
• Items are added to the main cache the second time they are accessed
• Discards the item whose penultimate (second-to-last) access is least recent
• Not especially fast
• Adaptive
• Scan resistant
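A simplified sketch of the penultimate-access rule. Note the assumptions: unlike the description above, this version admits an item on its first access and instead treats a once-accessed item as having an infinitely old penultimate access, so such items are evicted first; it also forgets history on eviction. The class name `LRU2Cache` and the logical tick counter are illustrative choices.

```python
import itertools


class LRU2Cache:
    """Evicts the key whose second-to-last access is oldest (capacity >= 1 assumed)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._tick = itertools.count()  # logical clock instead of wall time
        self._values = {}
        self._history = {}  # key -> (penultimate access, last access)

    def _touch(self, key):
        never = float("-inf")
        _, last = self._history.get(key, (never, never))
        self._history[key] = (last, next(self._tick))

    def get(self, key, default=None):
        if key not in self._values:
            return default
        self._touch(key)
        return self._values[key]

    def put(self, key, value):
        if key not in self._values and len(self._values) >= self.capacity:
            # Tuples compare by penultimate access first, then by last access.
            victim = min(self._history, key=self._history.get)
            del self._values[victim]
            del self._history[victim]
        self._values[key] = value
        self._touch(key)
```

With capacity 2, after `put("a")`, `get("a")`, `put("b")`, `put("c")`, this sketch evicts `b` (touched once), whereas plain LRU would evict `a`. That is the scan resistance: a burst of one-off keys cannot push out items that have been used repeatedly.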
MRU (Most Recently Used)
• In contrast to LRU, discards the most recently used items first
• Fast
• Not adaptive
• Scan resistant
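A minimal MRU sketch. It is the same recency-ordered structure as an LRU cache, but the victim is taken from the opposite end. The class name `MRUCache` is hypothetical, and a capacity of at least 1 is assumed.

```python
from collections import OrderedDict


class MRUCache:
    """Evicts the most recently used entry (capacity >= 1 assumed)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()  # most recently used last

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        elif len(self._data) >= self.capacity:
            self._data.popitem(last=True)  # discard the most recently used
        self._data[key] = value
```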
Random
• Discards a random item
• Very fast
• Almost no overhead
• Not adaptive
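A minimal random-replacement sketch. The class name `RandomCache` and the optional `rng` parameter (useful for seeding in tests) are assumptions for this example; copying the keys into a list on each eviction is a shortcut, and a real implementation would keep an indexable key array to make eviction O(1).

```python
import random


class RandomCache:
    """Evicts a uniformly random entry when full (capacity >= 1 assumed)."""

    def __init__(self, capacity, rng=None):
        self.capacity = capacity
        self._rng = rng or random.Random()
        self._data = {}

    def __len__(self):
        return len(self._data)

    def get(self, key, default=None):
        return self._data.get(key, default)  # no bookkeeping on reads

    def put(self, key, value):
        if key not in self._data and len(self._data) >= self.capacity:
            victim = self._rng.choice(list(self._data))  # O(n) copy, sketch only
            del self._data[victim]
        self._data[key] = value
```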
Summary
• The policies above are popular paging-based policies (storage and retrieval costs are treated as equal for all elements)
• Other policies can take into account:
– Retrieval cost
– Storage cost
– Both
– ...