8. What is Redis
Redis is an open source, BSD licensed, advanced key-value
cache and store. It is often referred to as a data structure
server since keys can contain strings, hashes, lists, sets,
sorted sets, bitmaps and hyperloglogs.[1]
[1] http://redis.io
11. Persistence
• Datasets can be saved to disk.
1. RDB (Redis DB) snapshots (default)
RDB is a very compact single-file point-in-time
representation of the Redis dataset.
2. AOF (Append Only File) logs
AOF is a simple text log of write operations.
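The difference between the two approaches can be illustrated with a toy store. This is only a sketch under stated assumptions: the class `ToyStore`, its JSON file formats and the file names are invented for illustration and are not the formats Redis uses.

```python
import json
import os
import tempfile


class ToyStore:
    """Toy in-memory key-value store (hypothetical, not Redis's on-disk formats)."""

    def __init__(self, snapshot_path, log_path):
        self.snapshot_path = snapshot_path
        self.log_path = log_path
        self.data = {}

    def set(self, key, value):
        self.data[key] = value
        # AOF-style: append every write operation to a log file.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(["set", key, value]) + "\n")

    def save_snapshot(self):
        # RDB-style: dump the whole dataset to a single file at one point in time.
        with open(self.snapshot_path, "w", encoding="utf-8") as f:
            json.dump(self.data, f)

    def load_from_snapshot(self):
        with open(self.snapshot_path, encoding="utf-8") as f:
            self.data = json.load(f)

    def load_from_log(self):
        # Rebuild the dataset by replaying the logged writes in order.
        self.data = {}
        with open(self.log_path, encoding="utf-8") as log:
            for line in log:
                op, key, value = json.loads(line)
                if op == "set":
                    self.data[key] = value


def demo():
    """Return (data restored from snapshot, data restored from log)."""
    with tempfile.TemporaryDirectory() as tmp:
        store = ToyStore(os.path.join(tmp, "dump.json"), os.path.join(tmp, "ops.log"))
        store.set("a", "1")
        store.save_snapshot()
        store.set("b", "2")  # written after the snapshot was taken

        from_snapshot = ToyStore(store.snapshot_path, store.log_path)
        from_snapshot.load_from_snapshot()
        from_log = ToyStore(store.snapshot_path, store.log_path)
        from_log.load_from_log()
        return from_snapshot.data, from_log.data
```

In this sketch the snapshot only contains what existed when it was taken, while replaying the log also recovers the later write.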
12. Virtual Memory
The goal of the VM subsystem is to free memory by transferring Redis
objects from memory to disk.
[Diagram: keys are kept in memory; values are moved to a swap file in storage]
18. Redis keys are binary safe: any binary sequence can be used
as a key, from a string like "foo" to the content of a JPEG file. The empty
string is also a valid key.
Keys usually follow a schema, for example:
object-type:id (such as user:1000)
19. • Binary-safe strings.
• Lists: collections of string elements sorted according to the order of
insertion. They are basically linked lists.
• Sets: collections of unique, unsorted string elements.
• Sorted sets: similar to Sets, but every string element is associated with
a floating-point number value, called the score.
• Hashes, which are maps composed of fields associated with values. Both
the field and the value are strings.
• Bit arrays (or simply bitmaps): it is possible, using special commands, to
handle String values like an array of bits: you can set and clear individual
bits, count all the bits set to 1, find the first set or unset bit, and so forth.
• HyperLogLogs: a probabilistic data structure used to
estimate the cardinality of a set.
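The bit array idea from the list above can be sketched in plain Python. This is a minimal illustration, not Redis code: the class `ToyBitmap` and its method names are assumptions chosen to resemble the SETBIT, GETBIT and BITCOUNT commands, and it follows the Redis convention that bit 0 is the most significant bit of the first byte.

```python
class ToyBitmap:
    """Treats a growable byte string as an array of bits (hypothetical helper)."""

    def __init__(self):
        self.buf = bytearray()

    def setbit(self, offset, value):
        """Set or clear the bit at offset; return the previous bit value."""
        byte_index, bit_index = divmod(offset, 8)
        if byte_index >= len(self.buf):
            # Grow the string with zero bytes, as a string value would be padded.
            self.buf.extend(b"\x00" * (byte_index + 1 - len(self.buf)))
        mask = 0x80 >> bit_index  # bit 0 is the most significant bit of a byte
        old = 1 if self.buf[byte_index] & mask else 0
        if value:
            self.buf[byte_index] |= mask
        else:
            self.buf[byte_index] &= ~mask & 0xFF
        return old

    def getbit(self, offset):
        """Return the bit at offset; bits beyond the end read as 0."""
        byte_index, bit_index = divmod(offset, 8)
        if byte_index >= len(self.buf):
            return 0
        return 1 if self.buf[byte_index] & (0x80 >> bit_index) else 0

    def bitcount(self):
        """Count all bits set to 1."""
        return sum(bin(byte).count("1") for byte in self.buf)
```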
23. Cluster
[Diagram: three master nodes connected by the Cluster Bus. Master A holds hash slots 0-5500, Master B 5501-11000, Master C 11001-16383.]
Every Redis Cluster node requires two open TCP ports:
• The normal Redis TCP port, used to serve clients.
• A high port used for the Cluster bus, a node-to-node
communication channel that uses a binary protocol.
A given key "foo" is at slot:
slot = crc16("foo") mod NUMBER_SLOTS (NUMBER_SLOTS = 16384)
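The slot formula can be sketched as follows. The sketch assumes the CRC16 variant used by Redis Cluster (CCITT/XMODEM, polynomial 0x1021, initial value 0) and ignores hash tags in braces; the function names are chosen for illustration only.

```python
NUMBER_SLOTS = 16384


def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16 (XMODEM variant): polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to a hash slot; hash tags ({...}) are not handled in this sketch."""
    return crc16_xmodem(key.encode("utf-8")) % NUMBER_SLOTS
```

A client can use such a function to decide which master to contact for a key, given a map of slot ranges to nodes.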
25. Cluster
• All nodes are directly connected with a service channel: TCP baseport+10000, for example 6379 -> 16379.
• The node-to-node protocol is binary, optimized for bandwidth and speed.
• Clients talk to nodes as usual, using the ASCII protocol, with minor additions.
• Nodes don't proxy queries.
[Diagram: three clients connecting directly to the cluster nodes]
26. Cluster
• Dummy Client
1. Client => A: GET foo
2. A => Client: -MOVED 8 192.168.5.21:6391
3. Client => B: GET foo
4. B => Client: "bar"
• Smart Client
1. Client => A: CLUSTER SLOTS
2. A => Client: ... a map of hash slots -> nodes
3. Client => B: GET foo
4. B => Client: "bar"
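The dummy-client exchange can be sketched without a real cluster. Everything here is a simplified assumption: `FakeNode` is a hypothetical stand-in for a node, replies are plain Python strings, and the address of node A in the usage below is invented.

```python
def parse_moved(reply):
    """Parse '-MOVED <slot> <host>:<port>' into (slot, host, port), else None."""
    parts = reply.split()
    if len(parts) != 3 or parts[0] != "-MOVED":
        return None
    host, _, port = parts[2].rpartition(":")
    return int(parts[1]), host, int(port)


class FakeNode:
    """Hypothetical stand-in for a cluster node; not a real Redis connection."""

    def __init__(self, data=None, redirect=None):
        self.data = data or {}
        self.redirect = redirect  # reply sent for keys this node does not hold

    def get(self, key):
        return self.data.get(key, self.redirect)


def dummy_get(nodes, start, key, max_redirects=5):
    """Follow -MOVED replies until a node answers; return (value, address)."""
    address = start
    for _ in range(max_redirects):
        reply = nodes[address].get(key)
        moved = parse_moved(reply) if isinstance(reply, str) else None
        if moved is None:
            return reply, address
        _, host, port = moved
        address = (host, port)  # retry against the node named in the redirect
    raise RuntimeError("too many redirects")
```

A smart client would avoid the extra round trip by fetching the slot map once and computing the target node itself.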
58. Conclusion
Redis represents a different evolution path among key-value DBs:
values can contain more complex data types, with atomic
operations defined on those data types.
Redis is an in-memory database that persists to disk, so it
represents a different trade-off: very high write and read
speed is achieved, with the limitation that data sets can't be
larger than memory.
59. Memcached
• What is Memcached?
• Where is it used?
• How is it used?
• How does it work internally?
• A Cluster of Memcached Servers
• Memcached and CAP Theorem
• Memcached vs Redis
60. What is Memcached?
Memcached is a free and open-source general-purpose
distributed in-memory object caching system intended to be
used for web applications.
Its primary goal is to cache the data returned from database
interactions and/or the results of costly operations and
API calls.
61. History & Characteristics
• Originally written in Perl in 2003 for LiveJournal.com,
then rewritten in C
• First release in May 2003
• Released under Revised BSD License
• Platform-independent (Windows, Linux, BSD, UNIX, Mac)
• Client APIs in many different languages:
• C, C++, Java, PHP, Perl, Python, Ruby, Windows .NET, Lua, …
• Used by many giant web sites: Facebook, YouTube, Reddit, Twitter,
Wikipedia, Flickr, etc.
62. Where is it used? (Architectural view)
Memcached resides between the front-end web tier and the back-end database tier.
The front-end web tier first tries to fetch data from Memcached
and then interacts with the database in case of a cache miss.
63. How is it used? (Conceptual view)
From a programmer’s point of view, Memcached is a
collection of key-value items stored in memory.
Keys are unique identifiers with a possible length of up to 250
characters.
Values can have any format including raw binary data and
they can contain data up to 1 MB (by default).
64. How is it used? (Conceptual view)
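Example (in Perl):
use strict;
use warnings;
use Cache::Memcached;

# Connect to two Memcached servers.
my $cache = Cache::Memcached->new({
    servers => ["192.168.0.10:11211", "192.168.0.20:11211"],
});

my $id = 42;  # placeholder user id, assumed for this example

# Try the cache first; on a miss, build the object and cache it.
my $user_info = $cache->get("user:$id");
if (!$user_info) {
    $user_info = User->new($id);  # User is the application's own class
    $cache->set("user:$id", $user_info);
}
print $user_info->name;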
65. Some Common Operations
set(key, val)
add(key, val): like set but only works if the key does not
already exist
replace(key, val): like set but only works if the key already
exists
get(key)
delete(key)
incr(key)
decr(key)
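The semantics of these operations can be sketched with a dictionary. `ToyCache` is a hypothetical in-memory model, not a Memcached client; the `delta` parameter and the return values are assumptions made for the sketch, which follows Memcached's behaviour of failing incr/decr on a missing key and never decrementing below zero.

```python
class ToyCache:
    """In-memory model of the basic operations (hypothetical, no networking)."""

    def __init__(self):
        self.items = {}

    def set(self, key, val):
        self.items[key] = val
        return True

    def add(self, key, val):
        # Like set, but only works if the key does not already exist.
        if key in self.items:
            return False
        self.items[key] = val
        return True

    def replace(self, key, val):
        # Like set, but only works if the key already exists.
        if key not in self.items:
            return False
        self.items[key] = val
        return True

    def get(self, key):
        return self.items.get(key)

    def delete(self, key):
        if key in self.items:
            del self.items[key]
            return True
        return False

    def incr(self, key, delta=1):
        if key not in self.items:
            return None  # a missing key is not created
        self.items[key] = int(self.items[key]) + delta
        return self.items[key]

    def decr(self, key, delta=1):
        if key not in self.items:
            return None
        self.items[key] = max(0, int(self.items[key]) - delta)  # never below 0
        return self.items[key]
```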
66. How does it work internally? (Physical view)
• Characteristics of data items
• Internal data storage and memory management
• Cluster topology
67. Characteristics of data items
• Each data item consists of a key, value, optional flags and
an optional expiration time (TTL)
• An item is removed from the cache once its TTL has expired,
or when the cache is full and the replacement algorithm
selects it
• Memcached uses a Least Recently Used (LRU) policy to free
space in the cache
68. Internal data storage and memory management
Memcached has 4 primary internal data structures:
• A hash table to store and locate cache items
• A Least Recently Used (LRU) list to specify cache item
removal (eviction) when the cache is full
• A cache item data structure for storing key, flags, data and
pointers
• A slab allocator which handles memory management for
cache items
69. Hash Table
• The hash table is an array whose elements are buckets
• Each bucket is a singly linked list of cache items
• Each cache item stores the pointer that links it into its bucket's list
70. LRU Data Structure
• LRU is used to determine which cache item to evict
• The LRU is a doubly linked list holding all cache items of a particular slab class
• Each slab class has its own LRU data structure
• The LRU pointers are stored in each cache item's data structure
• Each access or modification of a cache item updates its position in its LRU list
• The last item (tail) of each list is the one removed in the event of eviction
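The eviction behaviour described in this list can be sketched with Python's `OrderedDict`, which keeps items in a doubly linked list internally. This is an illustration of the LRU policy only, with a hypothetical `LRUCache` class and an item-count capacity (assumed to be at least 1) instead of Memcached's per-slab-class memory limits.

```python
from collections import OrderedDict


class LRUCache:
    """LRU list sketch: first entry is the head (most recent), last is the tail."""

    def __init__(self, capacity):
        self.capacity = capacity  # maximum number of items, assumed >= 1
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key, last=False)  # an access moves the item to the head
        return self.items[key]

    def set(self, key, value):
        """Store a value; return the evicted key, or None if nothing was evicted."""
        evicted = None
        if key not in self.items and len(self.items) >= self.capacity:
            evicted, _ = self.items.popitem(last=True)  # evict the tail
        self.items[key] = value
        self.items.move_to_end(key, last=False)  # a modification also moves to head
        return evicted
```

Moving an item and evicting the tail are both constant-time operations on a doubly linked list, which is why the structure suits this policy.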
71. Slab Allocator
Problems with malloc() and free() functions for memory management:
• Poor performance
• Memory fragmentation after some time
To overcome these problems, Memcached uses a Slab Allocator:
• The allowed amount of memory is allocated all at once at startup time
• The memory is divided into pages of equal size, and each page is assigned to a slab class
• Each page is divided into smaller pieces for data storage called chunks; all chunks of a
slab class have the same size
• The slab allocator takes care of pages and chunks
• A cache item is stored in the smallest chunk that is bigger than or equal to the item
72. Slab Allocator
Example:
2048 bytes of allowed memory are split into 4 slab classes (one 512-byte page each), and
each page is divided into chunks:
slab class 1: 8 chunks of 64 B  (512 bytes)
slab class 2: 4 chunks of 128 B (512 bytes)
slab class 3: 2 chunks of 256 B (512 bytes)
slab class 4: 1 chunk of 512 B  (512 bytes)
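Choosing a slab class for an item in this example can be sketched as a search for the smallest chunk size that fits. The function name and return shape are assumptions for illustration; the chunk sizes are the ones from the example above.

```python
import bisect

# Chunk sizes of slab classes 1 to 4 from the example, in ascending order.
CHUNK_SIZES = [64, 128, 256, 512]


def pick_slab_class(item_size, chunk_sizes=CHUNK_SIZES):
    """Return (slab class number, chunk size) for an item, or None if it is too large."""
    index = bisect.bisect_left(chunk_sizes, item_size)  # first chunk size >= item size
    if index == len(chunk_sizes):
        return None
    return index + 1, chunk_sizes[index]
```

For instance, a 100-byte item lands in slab class 2 and leaves 28 bytes of its 128-byte chunk unused, which is the price paid for avoiding fragmentation.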
73. Slab Allocator
To see the slab classes, run Memcached with the -vv option:
$ memcached -m 64 -vv
Output:
slab class 1: chunk size 96 perslab 10922
slab class 2: chunk size 120 perslab 8738
slab class 3: chunk size 152 perslab 6898
slab class 4: chunk size 192 perslab 5461
...
slab class 42: chunk size 1048576 perslab 1
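The first lines of this output can be reproduced with a short calculation. The sketch assumes a first chunk size of 96 bytes (taken from the output above), a growth factor of 1.25, 8-byte alignment and 1 MB pages; these parameters are assumptions that happen to match the listed values, and the function is not Memcached's own code.

```python
def slab_classes(first_size=96, factor=1.25, page_size=1024 * 1024, align=8, count=4):
    """Return a list of (chunk size, chunks per page) for the first `count` classes."""
    classes = []
    size = first_size
    for _ in range(count):
        if size % align:
            size += align - size % align  # round up to the alignment boundary
        classes.append((size, page_size // size))
        size = int(size * factor)  # next class is about 25% larger
    return classes


for number, (chunk, perslab) in enumerate(slab_classes(), start=1):
    print(f"slab class {number}: chunk size {chunk} perslab {perslab}")
```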
74. Handling Client Requests
• Client requests are handled with libevent and assigned to threads in Memcached
• Each thread waits to acquire the lock to enter the critical section, where it
processes the hash table and updates the LRU data structure
• Then, the thread leaves the critical section and sends the result back to the client
→ Every internal operation in Memcached has a cost of O(1).
→ The critical section is a performance bottleneck since its execution is serial.
75. A Cluster of Memcached Servers
• It is possible to have a number of different Memcached instances on different machines.
• The instances are unaware of one another and do not communicate with each other.
• The client API determines which server to use by applying a hash function to the item's key.
• Servers are grouped together with consistent hashing.
• A server can have a weight to affect its selection (default: 1).
• In the event of failure, the client can use other servers.
[Diagram: the client computes hash(key) to select a server]
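Client-side server selection with consistent hashing can be sketched as a hash ring. This is a minimal illustration, not the algorithm of any particular client library: the class `HashRing`, the use of SHA-256 and the number of ring points per unit of weight are all assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hashing ring mapping keys to servers (hypothetical sketch)."""

    def __init__(self, servers, points_per_weight=100):
        # servers: dict of "host:port" -> integer weight (default weight is 1)
        ring = []
        for server, weight in servers.items():
            for i in range(points_per_weight * weight):
                ring.append((self._hash(f"{server}#{i}"), server))
        ring.sort()
        self._ring = ring
        self._hashes = [point for point, _ in ring]

    @staticmethod
    def _hash(text):
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def server_for(self, key):
        """Return the server owning the first ring point after hash(key)."""
        if not self._ring:
            raise ValueError("no servers configured")
        index = bisect.bisect_right(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[index][1]


ring = HashRing({"192.168.0.10:11211": 1, "192.168.0.20:11211": 1})
print(ring.server_for("user:42"))
```

With this layout, removing a server only remaps the keys that server owned; keys on the remaining servers keep their assignment, which is what lets a client fall back to other servers after a failure without invalidating the whole cache.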
76. Memcached and CAP Theorem
✓ Consistency: All clients see the same data at the same time.
× Availability: Some clients might not be able to access Memcached at
some time.
✓ Partition tolerance: Memcached continues to work in case of arbitrary
message loss or failure of other servers.
77. Redis vs Memcached
Redis                                        | Memcached
Persistent storage                           | Transient caching
In memory, able to persist to disk           | In-memory caching only
Appropriate for complex data structures      | Appropriate for simple key-value pairs
(hashes, lists, sets, etc.)                  |
Master/Slave replication                     | No support for replication
High performance for large volumes of read   | High performance only for large volumes of
and write operations                         | read operations
Key length limit: 512 MB                     | Key length limit: 250 bytes
Open source (BSD license)                    | Open source (Revised BSD license)
78. “Memory is the new disk, disk is the new tape.”
—Jim Gray (1944 – 2007)