Redis
Everything you always wanted to know
about Redis but were afraid to ask

Carlos Abalde
carlos.abalde@gmail.com
March 2014
Introduction
Remote dictionary server
INSERT INTO…
$ redis-cli -n 0
127.0.0.1:6379> SET the-answer 42
OK
127.0.0.1:6379> QUIT
rulo:~$
SELECT * FROM…
$ redis-cli -n 0
127.0.0.1:6379> GET the-answer
"42"
127.0.0.1:6379> QUIT
rulo:~$
The end
The end
Agenda
I. Introduction
  ‣ Context, popular Redis users, latest releases…
II. Redis 101
  ‣ Basics, scripting, some examples…
III. Mastering Redis
  ‣ Persistence, replication, performance, sharding…
I. Introduction

http://www.flickr.com/photos/verino77/5616332196/
NoSQL / NoREL mess
๏ Document DBs
  ‣ MongoDB, CouchDB, Riak…
๏ Graph DBs
  ‣ Neo4j, FlockDB…
๏ Column oriented DBs
  ‣ HBase, Cassandra, BigTable…
๏ Key-Value DBs
  ‣ Memcache, MemcacheDB, Redis, Voldemort, Dynamo…
Who’s behind Redis?
๏ Created by Salvatore Sanfilippo
  ‣ http://antirez.com
  ‣ @antirez at Twitter
๏ Currently sponsored by Pivotal
  ‣ Sponsored by VMware before May 2013
Who’s using Redis? I
Who’s using Redis? II
๏ The architecture Twitter uses to deal with 150M active users, 300K QPS, a 22 MB/S Firehose, and send tweets in under 5 seconds. High Scalability (2013)▸
๏ Storing hundreds of millions of simple key-value pairs in Redis. Instagram Engineering Blog (2012)▸
๏ The Instagram architecture Facebook bought for a cool billion dollars. High Scalability (2012)▸
๏ Facebook’s Instagram: making the switch to Cassandra from Redis, a 75% ‘insta’ savings. Planet Cassandra (2013)▸
Who’s using Redis? III
๏ Highly available real time push notifications and you. Flickr Engineering Blog (2012)▸
๏ Using Redis as a secondary index for MySQL. Flickr Engineering Blog (2013)▸
๏ How we made GitHub fast. The GitHub Blog (2009)▸
๏ Real world Redis. Agora Games (2012)▸
๏ Disqus discusses migration from Redis to Cassandra for horizontal Scalability. Planet Cassandra (2013)▸
Memory is the new disk
๏ BSD licensed in-memory data structure server
  ‣ Strings, hashes, lists, sets…
๏ Optional durability
๏ Bindings to almost all relevant languages
“Memory is the new disk, disk is the new tape”
— Jim Gray
A fight against complexity
๏ Simple & robust foundations
  ‣ Single threaded
  ‣ No map-reduce, no indexes, no vector clocks, no Paxos, no Merkle trees, no gossip protocols…
๏ Blazingly fast
  ‣ Implemented in C (20K LoC for the 2.2 release)
  ‣ No dependencies
A fight against complexity
…
5. We’re against complexity. We believe designing systems
is a fight against complexity. […] Most of the time the
best way to fight complexity is by not creating it at all.
…
The Redis Manifesto▸
Most popular K-V DB
๏ Currently most popular key-value DB▸
๏ Redis 1.0 (April’09) ↝ Redis 2.8.6 (March’14)

Google Trends▸
Latest releases I
๏ Redis 2.6 (October’12)
  ‣ Lua scripting
  ‣ New commands
  ‣ Millisecond-precision expires
  ‣ Unlimited number of clients
  ‣ Improved AOF generation
Latest releases II
๏ Redis 2.8 (November’13)
  ‣ Redis 2.7 with the clustering code removed
  ‣ Partial resynchronization with slaves
  ‣ IPv6 support
  ‣ Config rewriting
  ‣ Key-space change notifications via Pub/Sub
Latest releases III
๏ Redis 3.0
  ‣ Next beta release planned for March’14
  ‣ Redis Cluster
  ‣ Speed improvements under certain workloads
Commands
๏ redis-server
๏ redis-cli
  ‣ Command line interface
๏ redis-benchmark
  ‣ Benchmarking utility
๏ redis-check-dump & redis-check-aof
  ‣ Utilities for checking corrupted RDB/AOF files
Performance
Sample benchmark

๏ Redis 2.6.14
๏ Intel Xeon CPU E5520 @ 2.27GHz
๏ 50 simultaneous clients performing 2M requests
๏ Loopback interface
๏ Key space of 1M keys
Performance
No pipelining

$ redis-benchmark -r 1000000 -n 2000000 -t get,set,lpush,lpop -q

SET: 122556.53 requests per second
GET: 123601.76 requests per second
LPUSH: 136752.14 requests per second
LPOP: 132424.03 requests per second
Performance
16 commands per pipeline

$ redis-benchmark -r 1000000 -n 2000000 -t get,set,lpush,lpop -P 16 -q

SET: 552028.75 requests per second
GET: 707463.75 requests per second
LPUSH: 767459.75 requests per second
LPOP: 770119.38 requests per second
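
The gain from pipelining in the two runs above can be quantified with a little arithmetic. The following is a minimal sketch in Python (chosen here only for illustration; it does not talk to Redis). The throughput figures are copied from the benchmark output above, and the helper name `speedups` is made up for this example.

```python
# Throughput (requests per second) copied from the two benchmark runs above.
no_pipelining = {"SET": 122556.53, "GET": 123601.76, "LPUSH": 136752.14, "LPOP": 132424.03}
pipelined_16 = {"SET": 552028.75, "GET": 707463.75, "LPUSH": 767459.75, "LPOP": 770119.38}


def speedups(before, after):
    """Return the throughput ratio after/before for every benchmarked command."""
    return {cmd: after[cmd] / before[cmd] for cmd in before}


if __name__ == "__main__":
    for cmd, ratio in speedups(no_pipelining, pipelined_16).items():
        print(f"{cmd}: {ratio:.1f}x")
```

On this particular machine every command lands somewhere between 4x and 6x, well short of 16x: pipelining removes round trips, not the per-command work on the server.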
Summary
✓ Simple
✓ Fast
✓ Predictable
✓ Widely supported
✓ Reliable
✓ Lightweight
II. Redis 101

http://www.flickr.com/photos/caseycanada/2058552752/
Overview
๏ Family of fundamental data structures
  ‣ Strings and string containers
  ‣ Accessed / indexed by key
  ‣ Directly exposed: no abstraction layers
๏ Rich set of atomic operations over the structures
  ‣ Detailed reference using big-O notation for complexities
๏ Basic publish / subscribe infrastructure
Keys
๏ Arbitrary strings (binary safe, not limited to ASCII)
  ‣ Define some format convention and adhere to it
  ‣ Key length matters!
๏ Multiple name spaces are available
  ‣ Separate DBs indexed by an integer value
    - SELECT command
    - Multiple DBs vs. single DB + key prefixes
๏ Keys can expire automatically
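
A key format convention is easiest to keep when it lives in one helper. The sketch below assumes the widespread (but not mandatory) `object-type:id:field` convention with `:` as separator; the function name `make_key` and the sample keys are hypothetical.

```python
def make_key(*parts):
    """Join key components with ':' so every key follows the same convention."""
    return ":".join(str(part) for part in parts)


if __name__ == "__main__":
    key = make_key("user", 1000, "followers")
    # Key length matters: every stored key carries these bytes in memory.
    print(key, len(key))
```

The same helper can carry a prefix per application, which is the "single DB + key prefixes" alternative to using several numbered databases.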
Data structures I
๏ Strings
  ‣ Caching, counters, realtime metrics…
๏ Hashes
  ‣ “Object” storage…
๏ Lists
  ‣ Logs, queues, message passing…
Data structures II
๏ Sets
  ‣ Membership, tracking…
๏ Sorted sets
  ‣ Leaderboards, activity feeds…

RTFM, please :) ▸
Publish / Subscribe
Overview
๏ Classic pattern decoupling publishers & subscribers
  ‣ You can subscribe to channels; when someone publishes to a channel matching your interests, Redis will send it to you
  ‣ SUBSCRIBE, UNSUBSCRIBE & PUBLISH commands
๏ Fire and forget notifications
  ‣ Not suitable for reliable off-line notification of events
๏ Pattern-matching subscriptions
  ‣ PSUBSCRIBE & PUNSUBSCRIBE commands
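
The fire-and-forget behavior can be illustrated with a toy in-memory broker. This is a sketch of the idea only, not Redis code: `ToyBroker` and its methods are hypothetical, and Python's `fnmatchcase` is used as an approximation of Redis glob-style patterns (the two are similar for simple patterns such as `news.*` but are not identical).

```python
from fnmatch import fnmatchcase


class ToyBroker:
    """In-memory stand-in for the publish/subscribe part of a server."""

    def __init__(self):
        self.channels = {}  # channel name -> list of callbacks
        self.patterns = {}  # glob pattern -> list of callbacks

    def subscribe(self, channel, callback):
        self.channels.setdefault(channel, []).append(callback)

    def psubscribe(self, pattern, callback):
        self.patterns.setdefault(pattern, []).append(callback)

    def publish(self, channel, message):
        """Deliver to current subscribers only; return how many received it."""
        receivers = list(self.channels.get(channel, []))
        for pattern, callbacks in self.patterns.items():
            if fnmatchcase(channel, pattern):
                receivers.extend(callbacks)
        for callback in receivers:
            callback(channel, message)
        return len(receivers)  # nothing is stored for later delivery


if __name__ == "__main__":
    broker = ToyBroker()
    print(broker.publish("news.sports", "nobody is listening"))
    broker.psubscribe("news.*", lambda channel, message: print(channel, message))
    print(broker.publish("news.sports", "goal"))
```

A message published while nobody is subscribed is simply dropped, which is why the pattern is unsuitable for reliable off-line notification.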
Publish / Subscribe
Key-space notifications
๏ Available since Redis 2.8
  ‣ Disabled in the default configuration
  ‣ Key-space vs. key-event notifications
๏ Delay of key expiration events
  ‣ Expired events are generated when Redis deletes the key; not when the TTL is consumed
    - Lazy (i.e. on access time) key eviction
    - Background key eviction process
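
The difference between key-space and key-event notifications is only in which part goes into the channel name and which part is the message. The sketch below builds both pairs for one event, following the `__keyspace@<db>__:<key>` and `__keyevent@<db>__:<event>` channel naming used by Redis; the function name `notifications_for` is hypothetical, and the feature has to be enabled on the server through the `notify-keyspace-events` setting.

```python
def notifications_for(db, key, event):
    """Return the (channel, message) pairs published for one event on one key."""
    return [
        (f"__keyspace@{db}__:{key}", event),  # key-space: channel per key, message is the event
        (f"__keyevent@{db}__:{event}", key),  # key-event: channel per event, message is the key
    ]


if __name__ == "__main__":
    for channel, message in notifications_for(0, "mykey", "expired"):
        print(channel, message)
```

A client interested in every expiration in database 0 would therefore subscribe to the single key-event channel rather than to one key-space channel per key.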
Pipelining
๏ Redis pipelines are just an RTT optimization
  ‣ Deliver multiple commands together without waiting for replies
  ‣ Fetch all replies in a single step
    - Server needs to buffer all replies!
๏ Pipelines are NOT transactional or atomic
๏ Redis scripting FTW!
  ‣ Much more flexible alternative
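
On the wire, a pipeline is nothing more than several encoded commands written back to back before any reply is read. The sketch below encodes commands in the Redis serialization protocol (an array of bulk strings per command) and concatenates them into one buffer; the helper names are hypothetical and no socket is opened.

```python
def encode_command(*args):
    """Encode one command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = str(arg).encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def encode_pipeline(commands):
    """Concatenate several encoded commands so they can be sent in a single write."""
    return b"".join(encode_command(*command) for command in commands)


if __name__ == "__main__":
    payload = encode_pipeline([("SET", "the-answer", 42), ("GET", "the-answer")])
    print(payload)
```

Nothing in the buffer marks the commands as a unit, which is why a pipeline saves round trips but gives no atomicity: other clients' commands may run between them.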
Transactions
๏ Or, more precisely, “transactions”
  ‣ Commands are executed as an atomic & single isolated operation
    - Partial execution is possible due to pre/post EXEC failures!
  ‣ Rollback is not supported!
๏ MULTI, EXEC & DISCARD commands
  ‣ Conditional EXEC with WATCH
๏ Redis scripting FTW!
  ‣ Redis transactions are complex and cumbersome
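
The conditional EXEC offered by WATCH is an optimistic check-and-set: the queued commands run only if none of the watched keys changed in the meantime, and the client retries otherwise. The following toy model shows that control flow in plain Python. `ToyStore` is a hypothetical in-memory stand-in, not a Redis client, and it tracks a per-key write counter purely for illustration.

```python
class ToyStore:
    """In-memory stand-in for a server supporting watch/exec semantics."""

    def __init__(self):
        self.data = {}
        self.versions = {}  # key -> number of writes seen so far

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value
        self.versions[key] = self.versions.get(key, 0) + 1

    def watch(self, *keys):
        """Remember the current version of each watched key."""
        return {key: self.versions.get(key, 0) for key in keys}

    def exec(self, watched, queued):
        """Apply queued (key, value) writes unless a watched key changed."""
        if any(self.versions.get(key, 0) != version for key, version in watched.items()):
            return None  # mirrors the nil reply of an aborted EXEC
        for key, value in queued:
            self.set(key, value)
        return ["OK"] * len(queued)


def increment(store, key):
    """Optimistic loop: read, queue the write, retry if the key was touched."""
    while True:
        watched = store.watch(key)
        current = int(store.get(key) or 0)
        if store.exec(watched, [(key, str(current + 1))]) is not None:
            return current + 1
```

The retry loop lives in the client, which is part of what makes this style cumbersome compared with running the same logic in a server-side script.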
Scripting
Overview I
๏ Added in Redis 2.6
๏ Uses the Lua 5.1 programming language▸
  ‣ Base, Table, String, Math & Debug libraries
  ‣ Built-in support for JSON and MessagePack
  ‣ No global variables
  ‣ redis.{call(), pcall()}
  ‣ redis.{error_reply(), status_reply(), log()}
Scripting
Overview II
๏ Scripts are atomic, like any other command
๏ Scripts add minimal overhead
  ‣ Shared Lua context
๏ Scripts are replicated on slaves by sending the script (i.e. not the resulting commands)
  ‣ Single thread
  ‣ Scripts are required to be pure functions
๏ Maximum execution time vs. atomic execution
Scripting
What is fixed with scripting?
๏ Server side manipulation of data
๏ Minimizes latency
  ‣ No round trip delay
๏ Maximizes CPU usage
  ‣ Less parsing
  ‣ Fewer OS system calls
๏ Simpler & faster alternative to WATCH
Scripting
Scripts vs. Stored procedures
๏ Stored procedures are evil
๏ Backend logic should be 100% application side
  ‣ No hidden behaviors
  ‣ No crazy version management
๏ Redis keys are explicitly declared as parameters of the script
  ‣ Cluster friendly
  ‣ Hashed scripts
Scripting
Hello world!

> EVAL "return redis.call('SET', KEYS[1], ARGV[1])" 1 foo 42
OK

> GET foo
"42"
Scripting
DECREMENT-IF-GREATER-THAN
EVAL "
  -- GET yields false for a missing key; tonumber() then turns it into nil
  local res = redis.call('GET', KEYS[1]);
  if res ~= nil then
    res = tonumber(res);
    if res ~= nil and res > tonumber(ARGV[1]) then
      res = redis.call('DECR', KEYS[1]);
    end
  end
  return res" 1 foo 100
Scripting
Some more commands
๏ EVALSHA sha1 nkeys key [key…] arg [arg…]
  ‣ Client libraries optimistically use EVALSHA
    - On NOSCRIPT error, EVAL is used
  ‣ Automatic version management
๏ SCRIPT LOAD script
  ‣ Cached scripts are not flushed until server restart (or an explicit SCRIPT FLUSH)
  ‣ Ensures EVALSHA will not fail (e.g. MULTI/EXEC)
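
The optimistic EVALSHA strategy can be sketched without a server. Scripts are identified by the SHA1 digest of their source, so a client can compute the digest locally, try EVALSHA, and fall back to EVAL only when the server does not know the script yet. In the sketch below `ToyScriptCache`, `NoScriptError` and `run_script` are hypothetical names, and the toy "server" only records which scripts it has seen instead of executing Lua.

```python
import hashlib


def script_sha1(script):
    """Digest that identifies a script: SHA1 of its source text, in hex."""
    return hashlib.sha1(script.encode("utf-8")).hexdigest()


class NoScriptError(Exception):
    """Raised by the toy server when EVALSHA refers to an unknown script."""


class ToyScriptCache:
    """Stand-in for the server-side script cache (it does not run Lua)."""

    def __init__(self):
        self.scripts = {}

    def evalsha(self, sha1):
        if sha1 not in self.scripts:
            raise NoScriptError("NOSCRIPT No matching script")
        return "ran " + sha1[:8]

    def eval(self, script):
        sha1 = script_sha1(script)
        self.scripts[sha1] = script  # EVAL also leaves the script in the cache
        return "ran " + sha1[:8]


def run_script(server, script):
    """Try EVALSHA first; fall back to EVAL on a NOSCRIPT error."""
    try:
        return server.evalsha(script_sha1(script)), "EVALSHA"
    except NoScriptError:
        return server.eval(script), "EVAL"
```

Because the digest changes whenever the script text changes, a new version of a script never collides with the old one, which is the "automatic version management" mentioned above.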
Dangerous commands
๏ KEYS pattern
๏ SAVE
๏ FLUSHALL & FLUSHDB
๏ CONFIG
Some examples I
๏ 11 common web use cases solved in Redis▸
๏ How to take advantage of Redis just adding it to your stack▸
๏ A case study: design and implementation of a simple Twitter clone using only PHP and Redis▸
๏ Scaling Crashlytics: building analytics on Redis 2.6▸
Some examples II
๏ Fast, easy, realtime metrics using Redis bitmaps▸
๏ Redis - NoSQL data store▸
๏ Auto complete with Redis▸
๏ Multi user high performance web chat▸
III. Mastering Redis

http://www.fotolia.com/id/19245921
Persistence
Overview
๏ The whole dataset needs to fit in memory
  ‣ Very high read & write rates
  ‣ Durability is optional
  ‣ Optimal & simple memory and disk representations
๏ What if Redis runs out of memory?
  ‣ Swapping
    - Performance degradation
  ‣ Hit maxmemory limit
    - Failed writes or eviction policy
Persistence
Snapshotting — RDB
๏ Periodic asynchronous point-in-time dump to disk
  ‣ Every S seconds if at least C changes occurred
  ‣ Fast service restarts
๏ Possible data loss during a crash
๏ Compact files
๏ Minimal overhead during operation
๏ Huge data sets may experience short delays during fork()
๏ Copy-on-write fork() semantics
  ‣ 2x memory problem
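
The snapshot trigger is a simple rule over (seconds, changes) pairs: a dump starts as soon as any configured save point has both its time window elapsed and its minimum number of changes reached. The sketch below models that rule; the function name is hypothetical, and the three save points are example values matching the `save 900 1`, `save 300 10` and `save 60 10000` lines found in the stock redis.conf of that era.

```python
# Example save points as (seconds, changes) pairs; adjust to the real configuration.
SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]


def should_snapshot(seconds_since_last_save, changes_since_last_save, save_points):
    """True when at least one save point has both of its thresholds reached."""
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )


if __name__ == "__main__":
    print(should_snapshot(61, 10000, SAVE_POINTS))  # busy instance, one minute elapsed
    print(should_snapshot(61, 9, SAVE_POINTS))      # too few changes so far
```

Everything written since the last successful dump is what can be lost in a crash, so tighter save points trade more frequent fork() calls for a smaller loss window.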
Persistence
Append only file — AOF
๏ Journal file logging every write operation
  ‣ Configurable fsync frequency: speed vs. safety
  ‣ Commands replayed when server restarts
๏ Not as compact as RDB
  ‣ Safe background AOF file rewrite using fork()
๏ Overhead during operation depends on fsync behavior
๏ Recommended to use both RDB + AOF
  ‣ RDB is the way to go for backups & disaster recovery
Security
๏ Designed for trusted clients in trusted environments
  ‣ No users, no access control, no connection filtering…
๏ Basic unencrypted AUTH command
  ‣ requirepass s3cr3t
๏ Command renaming
  ‣ rename-command FLUSHALL f1u5hc0mm4nd
  ‣ rename-command FLUSHALL ""
Replication
Overview I
๏ One master, multiple slaves
  ‣ Scalability & redundancy
    - Client side failover, eviction, query routing…
  ‣ Lightweight master
๏ Slaves are able to accept other slave connections
๏ Non-blocking in the master, but blocking on the slaves
๏ Asynchronous but periodically acknowledged
Replication
Overview II
๏ Automatic slave reconnection
๏ Partial resynchronization: PSYNC vs. SYNC
  ‣ RDB snapshots are used during initial SYNC
๏ Read-write slaves
  ‣ slave-read-only no
  ‣ Ephemeral data storage
๏ Minimum replication factor
Replication
Some commands & configuration
๏ Trivial setup
  ‣ slaveof <host> <port>
  ‣ SLAVEOF [<host> <port> | NO ONE]
๏ Some more configuration tips
  ‣ slave-serve-stale-data [yes|no]
  ‣ repl-ping-slave-period <seconds>
  ‣ masterauth <password>
Replication
Final tips

๏ Inconsistencies are possible when using some eviction policy in a replicated setup
  ‣ Set slave’s maxmemory to 0
Performance
General tips
๏ Fast CPUs with large caches and not many cores
๏ Do not invest in expensive fast memory modules
๏ Avoid virtual machines
๏ Use UNIX domain sockets when possible
๏ Aggregate commands when possible
๏ Keep the number of client connections low
Performance
Advanced optimization

๏ Special encoding of small aggregate data types
๏ 32 vs. 64 bit instances
๏ Consider using bit & byte level operations
๏ Use hashes when possible
๏ Always check big-O notation complexities
Performance
Understanding metrics I
๏ redis-cli --latency
  ‣ Typical latency for a 1 Gbit/s network is 200 μs
  ‣ SLOWLOG GET
  ‣ Monitor number of client connections and consider using a multiplexing proxy
  ‣ Improve memory management
Performance
Understanding metrics II
๏ redis-cli INFO | grep …
๏ used_memory
  ‣ Usually lower than used_memory_rss
    - Used memory as seen by the OS
  ‣ Swapping risk when approaching 45% / 95%
  ‣ Reduce Redis footprint when possible
Performance
Understanding metrics III
๏ total_commands_processed
  ‣ Use multi-argument commands, scripts and pipelines when possible
๏ mem_fragmentation_ratio
  ‣ used_memory_rss ÷ used_memory
  ‣ Execute SHUTDOWN SAVE and restart the instance
  ‣ Consider alternative memory allocators
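
The fragmentation ratio can be recomputed from the raw INFO output, which is plain text made of `field:value` lines grouped under `# Section` headers. The sketch below parses such text and divides the two memory fields; the function names are hypothetical and the sample values are invented round numbers, not measurements.

```python
def parse_info(text):
    """Turn INFO-style 'field:value' lines into a dict, skipping section headers."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def fragmentation_ratio(fields):
    """used_memory_rss divided by used_memory, as defined above."""
    return int(fields["used_memory_rss"]) / int(fields["used_memory"])


# Invented sample in the INFO text format; real output has many more fields.
SAMPLE = """\
# Memory
used_memory:1000000
used_memory_rss:1500000
"""

if __name__ == "__main__":
    print(fragmentation_ratio(parse_info(SAMPLE)))
```

In practice the text would come from `redis-cli INFO`; a ratio well above 1 points at allocator fragmentation, while a ratio below 1 suggests part of the dataset has been swapped out.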
Performance
Understanding metrics IV

๏ evicted_keys
  ‣ Keys removed when hitting maxmemory limit
  ‣ Increase maxmemory when possible
  ‣ Reduce Redis footprint when possible
  ‣ Consider sharding
Redis pools
๏ Redis has an extremely small footprint and is lightweight
๏ Multiple Redis instances per node
  ‣ Full CPU usage
  ‣ Mitigated RDB 2x memory problem
  ‣ Fine tuned instances
๏ How to use multiple instances?
  ‣ Sharding
  ‣ Specialized instances
Redis Sentinel
Overview I
๏ Official Redis HA / failover solution
  ‣ Periodically check liveness of Redis instances
  ‣ On master failure, choose slave & promote to master
  ‣ Notify clients & slaves about the new master
๏ Multiple Sentinels
  ‣ Complex distributed system
  ‣ Gossip, quorum & leader election algorithms
Redis Sentinel
Overview II
๏ Work in progress, not ready for production
๏ Master pub/sub capabilities
  ‣ Auto discovery of other sentinels & slaves
  ‣ Notification of master failover
๏ Explicit client support required
๏ Redis Sentinel is a monitoring system with support for automatic failover. It does not turn Redis into a distributed data store. CAP discussions do not apply▸
Redis Sentinel
Apache ZooKeeper
๏ Set of primitives to ease building distributed systems
  ‣ http://zookeeper.apache.org
  ‣ Handling of network partitions, leader election, quorum management…
  ‣ Replicated, highly available, well-known…
๏ Ad-hoc Redis HA alternative to Sentinel
  ‣ Explicit client implementation required
Redis Cluster
๏ Long term project to be released in Redis 3.0
๏ High performance & linearly scalable complex distributed DB
  ‣ Sharding across multiple nodes
  ‣ Graceful handling of network partitions
๏ Implemented subset
  ‣ Commands dealing with multiple keys, etc. not supported
  ‣ Multiple databases are not supported
๏ Key hash tags
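
Hash tags let related keys land on the same node: when a key contains a non-empty `{...}` section, only the text between the first `{` and the following `}` is hashed. The sketch below follows the rule as described in the Redis Cluster specification (CRC16 of the hashed part, modulo 16384 slots). It relies on the assumption that Python's `binascii.crc_hqx` with an initial value of 0 computes the same CRC16 variant; the function names are hypothetical.

```python
import binascii


def hash_tag(key):
    """Return the part of the key that is hashed, honoring a non-empty {tag}."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:  # a closing brace exists and the tag is not empty
            return key[start + 1:end]
    return key


def key_slot(key):
    """Map a key to one of the 16384 hash slots."""
    return binascii.crc_hqx(hash_tag(key).encode("utf-8"), 0) % 16384


if __name__ == "__main__":
    for key in ("{user1000}.following", "{user1000}.followers", "user1000"):
        print(key, key_slot(key))
```

Keys sharing a tag always map to the same slot, which is what makes a limited form of multi-key operation possible in a sharded setup.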
Sharding
Overview I
๏ Distribute data into multiple Redis instances
  ‣ Allows much larger databases
  ‣ Allows scaling the computational power
๏ Data distribution strategies
  ‣ Directory based
  ‣ Ranges
  ‣ Hash + modulo
  ‣ Consistent hashing
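
The last two strategies can be compared in a few lines. The sketch below is an illustration under assumptions, not how any particular Redis client or proxy implements sharding: `modulo_shard` uses CRC32 modulo the number of shards, and `HashRing` is a textbook consistent-hash ring with virtual nodes, using SHA1 to place points on the ring. All names and the number of virtual nodes are hypothetical.

```python
import bisect
import hashlib
import zlib


def modulo_shard(key, n_shards):
    """Hash + modulo: simple, but changing n_shards remaps most keys."""
    return zlib.crc32(key.encode("utf-8")) % n_shards


class HashRing:
    """Consistent hashing: each node owns many points on a ring of hash values."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas
        self.ring = []  # sorted list of (point, node)
        for node in nodes:
            self.add(node)

    @staticmethod
    def _point(label):
        return int(hashlib.sha1(label.encode("utf-8")).hexdigest(), 16)

    def add(self, node):
        """Insert the virtual nodes of a new instance into the ring."""
        for i in range(self.replicas):
            bisect.insort(self.ring, (self._point(f"{node}#{i}"), node))

    def node_for(self, key):
        """Pick the first ring point at or after the key's hash, wrapping around."""
        index = bisect.bisect(self.ring, (self._point(key), ""))
        return self.ring[index % len(self.ring)][1]
```

With the ring, adding an instance only takes keys away from existing instances and hands them to the new one; no key moves between two old instances, which keeps resharding traffic bounded.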
Sharding
Overview II
๏ Data distribution responsibility
  ‣ Client side
  ‣ Proxy assisted
  ‣ Query routing
๏ Do I really need sharding?
  ‣ Very unlikely CPU becomes bottleneck with Redis
  ‣ 500K requests per second!
Sharding
Disadvantages

๏ Multi-key commands are not supported
๏ Multi-key transactions are not supported
๏ Sharding unit is the key
๏ More complex client logic
๏ Complex to scale up/down when used as a store
Sharding
Presharding
๏ Hard to scale up/down sharded databases
  ‣ But data storage needs may vary over time
๏ Take advantage of small Redis footprint
  ‣ Think big!
๏ Redis replication allows moving instances with minimal downtime
Sharding
Twemproxy overview I
๏ Redis Cluster is currently not production ready
  ‣ Mix between query routing & client side partitioning
๏ Not all Redis clients support sharding
๏ Automatic sharding Redis & Memcache (ASCII) proxy
  ‣ Developed by Twitter & Apache 2.0 licensed
  ‣ https://github.com/twitter/twemproxy/
  ‣ Single threaded & extremely fast
Sharding
Twemproxy overview II
๏ Also known as nutcracker
๏ Connection multiplexer pipelining requests and responses
  ‣ Original motivation
๏ No bottleneck or single point of failure
๏ Optional node ejection
  ‣ Only useful when using Redis as a cache
Sharding
Why Twemproxy?
๏ Multiplexed persistent server connections
๏ Automatic sharding and protocol pipelining
๏ Multiple distribution algorithms supporting nicknames
๏ Simple dumb clients
๏ Automatic fault tolerance capabilities
๏ Zero copy
Sharding
Why not Twemproxy?
๏ Extra network hop
  ‣ Pipelining is your friend
๏ Not all commands supported
  ‣ Transactions
  ‣ Pub / Sub
๏ HA not supported
  ‣ Redis Sentinel Twemproxy agent
Thanks!

http://www.flickr.com/photos/62337512@N00/3958637561/
