RAJESH KUMAR
1
NOSQL
 Non relational databases
 Existed since 60s, made popular in early 2000s
 Do not follow the tabular/relational structure of RDBMSs
 Major motivation and selling point is horizontal scaling
 Generally Compromise on consistency
 Lack true ACID properties
2
TYPES OF NOSQL
 Key-Value
 Redis, Dynamo, Hazelcast, Memcached, Ehcache, Oracle NoSQL
 Column
 Cassandra, HBase, Vertica
 Document
 MongoDB, Couchbase, DocumentDB
 Graph
 ArangoDB, Neo4j, Giraph, OrientDB
3
KEY VALUE DB
 Store values as associative arrays (Map or Dictionary)
 Simplest of data models
 Many provide an eventual consistency model and store values in serialized form
 Some also support ordering of keys
4
DOCUMENT DB
 Document oriented DBs are also key value pair
 But the value is in form of document
 Generally document is in some encoding e.g. JSON, XML etc
5
COLUMNAR DB
 A column is a name/value pair; the value is the data stored
 Each value carries a timestamp that acts as its version
 The value with the latest timestamp is the most recent one
6
GRAPH DB
 Values are kept in vertices in graphs
 Vertices are interconnected with other vertices by relations
7
CHARACTERISTICS
8
Data Model   Performance   Scalability   Flexibility   Complexity
Key Value    High          High          High          None
Columnar     High          High          Moderate      Low
Document     High          High          High          Moderate
Graph        Moderate*     Moderate*     High          High
NOSQL USE CASES
9
 Huge data
 Massive write performance
 Low latency
 Flexible schema
 Parallel computing
BREAK
10
INTRODUCTION
 Redis is an open source, in-memory data structure store.
 BSD license
 Redis can be used as a database, cache and a message broker.
 Supports following data structures
 Strings, hashes, lists, sets and sorted sets
 Redis is written in ANSI C
11
OS SUPPORT
 Developed and tested on POSIX systems
 Unix, Mac OS
 Works on Solaris, but support not provided
 No official redis for Windows
 But Microsoft Open Tech group develops and maintains a 64 bit port of Redis
 https://github.com/MSOpenTech/redis
 Current stable release is 3.2 (4.0 beta is available too)
12
SETTING UP
 Install redis on your machine
 redis-server
 Starts the redis server
 Can be configured as windows service
 redis-cli
 Starts the redis command line
13
SETTING UP NODE REDIS
 Install node
 Install npm
 npm install redis
 Write code in js file and run with node
14
VERIFY INSTALLATION
 redis-cli --version
 ping
 echo "Hello World"
 redis-server --version
 quit
 Exits redis-cli and returns to the shell
15
GETTING HELP
 Help <command> or visit https://redis.io/commands
 Shows the information about command
 Summary
 Version
 Group
16
KEY AND VALUES
 A key or a string value can be at most 512 MB
 Shorter keys save memory, but keep them readable
 Keys are binary safe
 Any byte sequence can be a key, even binary data
 An empty string can be key
17
BASIC COMMANDS
 Set, get
 Incr, incrby, decr, decrby
 Mset, mget
 Exists, del
 Type
 Expire, pexpire, ttl, pttl
18
HANDSON
BREAK
20
QUESTIONS?
21
SYSTEM COMMANDS
 FLUSHALL
 Removes all keys from all databases
 FLUSHDB
 Removes all keys from current database
 SELECT
 Select a database
22
SYSTEM COMMANDS CONTINUED
 DBSIZE
 Count of keys in current database
 SLOWLOG
 Shows the commands whose execution took longer than slowlog-log-slower-than (in microseconds)
 TIME
 Returns the current server time as seconds and microseconds.
23
DATA STRUCTURES
 Redis supports following data structures
 String,
 Hash,
 List,
 Set,
 Sorted set
24
LIST
 Redis lists are linked lists and not arrays
 Lpush adds the element to the head
 Rpush adds the element to the tail
 Lrange command extracts ranges of elements from lists
 Rpop, lpop pop elements from tail and head respectively
 Twitter takes the latest tweets posted by users into Redis lists.
25
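The head and tail behaviour of the list commands can be tried without a server. This is a minimal sketch, assuming Python's collections.deque as a stand-in for a Redis list; lpush, rpush and lrange here are hypothetical local helpers named after the commands, not a client API.

```python
from collections import deque


def lpush(lst, *values):
    """Model of LPUSH: each value is pushed onto the head in turn."""
    for value in values:
        lst.appendleft(value)
    return len(lst)


def rpush(lst, *values):
    """Model of RPUSH: each value is appended to the tail in turn."""
    for value in values:
        lst.append(value)
    return len(lst)


def lrange(lst, start, stop):
    """Model of LRANGE: stop is inclusive, negative indexes count from the tail."""
    n = len(lst)
    if start < 0:
        start = max(n + start, 0)
    if stop < 0:
        stop = n + stop
    if stop < 0:
        return []
    return list(lst)[start:stop + 1]


timeline = deque()
rpush(timeline, "tweet-1")
lpush(timeline, "tweet-2", "tweet-3")   # the newest item ends up at the head
latest_two = lrange(timeline, 0, 1)
```

A deque gives constant-time pushes and pops at both ends, which is the property the slide attributes to Redis lists.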
HANDSON
BLOCKING QUEUE
 To implement blocking popping actions on a list
 Brpop
 Blpop
 The commands have a timeout option as well
 0 means indefinitely
 Rpoplpush, brpoplpush
27
HANDSON
HASHES
 Hset, hget
 Hmset, hmget
 Hashes with small values are represented very memory efficiently
30
HASHES COMMANDS
 HDEL, HEXISTS
 HGETALL
 HKEYS, HVALS
 HLEN, HSTRLEN, HINCRBY, HINCRBYFLOAT
 HSETNX
 HSCAN
31
HANDSON
SETS
 Unordered and unique collection of strings
 SADD, SMEMBERS
 SINTER, SUNION, SDIFF,
 SDIFFSTORE, SINTERSTORE, SUNIONSTORE
 SRANDMEMBER, SCARD
 SREM, SPOP, SMOVE, SSCAN
33
HANDSON
SORTED SETS
 Unique collection of strings, kept ordered by score
 Every member has an associated score that determines its position
 ZADD, ZRANGE
 ZCARD, ZCOUNT, ZRANK
 ZREVRANK, ZSCORE, ZSCAN
 ZREMRANGEBYRANK, ZREMRANGEBYSCORE, ZREMRANGEBYLEX
35
HANDSON
QUESTION?
 What happens when the same element is added twice?
 It is stored only once; if the score is different, the score is updated.
37
BITMAPS
 Not an actual data type
 Operations defined on the strings
 As strings are binary safe blobs of bits
 SETBIT, GETBIT,
 BITCOUNT, BITOP, BITPOS
 BITFIELD
38
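Because bitmaps are only operations on a string, they can be modelled on a plain byte buffer. The sketch below is an illustration, not Redis code: setbit, getbit and bitcount are hypothetical helpers that mimic the commands of the same name, with bit 0 taken as the most significant bit of the first byte.

```python
def setbit(buf: bytearray, offset: int, value: int) -> int:
    """Model of SETBIT: set one bit and return its previous value."""
    byte_index, bit_index = divmod(offset, 8)
    if byte_index >= len(buf):
        # The string grows with zero bytes until the offset fits.
        buf.extend(b"\x00" * (byte_index + 1 - len(buf)))
    mask = 0x80 >> bit_index
    previous = 1 if buf[byte_index] & mask else 0
    if value:
        buf[byte_index] |= mask
    else:
        buf[byte_index] &= ~mask & 0xFF
    return previous


def getbit(buf: bytearray, offset: int) -> int:
    """Model of GETBIT: bits beyond the end of the string read as 0."""
    byte_index, bit_index = divmod(offset, 8)
    if byte_index >= len(buf):
        return 0
    return 1 if buf[byte_index] & (0x80 >> bit_index) else 0


def bitcount(buf: bytearray) -> int:
    """Model of BITCOUNT over the whole string."""
    return sum(bin(byte).count("1") for byte in buf)
```

A typical use is one bit per user id, for example marking the users who logged in today and counting them with bitcount.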
HANDSON
HYPERLOGLOGS (HLLS)
 Probabilistic data structure to count unique things
 Just like counting members of a set
 Approximate: the standard error is under 1%
 But HLLs require very little memory
 It never stores the elements themselves, only a compact state from which the count is estimated
 PFADD, PFCOUNT, PFMERGE
40
HANDSON
BREAK
42
QUESTIONS?
43
CONFIGURATION
 The Redis configuration is kept in the redis.conf file
 To override the default config
 redis-server /path/to/redis.conf
 redis-server --port 9999 --slaveof 127.0.0.1 6379
 Configurations can also be set at runtime
 CONFIG SET, CONFIG GET, CONFIG REWRITE
44
INFO AND STATISTICS
 Info
 server: General information about the Redis server
 clients: Client connections section
 memory: Memory consumption related information
 persistence: RDB and AOF related information
 stats: General statistics
 replication: Master/slave replication information
45
INFO AND STATISTICS CONTINUED
 Info
 cpu: CPU consumption statistics
 commandstats: Redis command statistics
 cluster: Redis Cluster section
 keyspace: Database related statistics
 all: Return all sections
 default: Return only the default set of sections
46
HANDSON
RESP
 Redis Serialization Protocol (RESP) is a text based protocol used by clients and servers to communicate with each other
 Follows a request-response model and operates over TCP
 Except when pipelining and in Pub/Sub
 Redis Cluster nodes use a different protocol to communicate among themselves
48
RESP WORKING
 Clients send commands to a server
 The server replies with one of the RESP types
 For Simple Strings the first byte of the reply is "+"
 For Errors the first byte of the reply is "-"
 For Integers the first byte of the reply is ":"
 For Bulk Strings the first byte of the reply is "$"
 For Arrays the first byte of the reply is "*"
49
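The first-byte rule above is enough to write a small reply parser. This is a sketch for illustration, assuming the RESP2 reply types listed on the slide; parse_reply and RespError are hypothetical names and do not come from any client library.

```python
class RespError(Exception):
    """Represents a RESP error reply (first byte "-")."""


def parse_reply(data: bytes):
    """Parse one RESP reply from the start of data.

    Returns (value, remaining_bytes).
    """
    line, _, rest = data.partition(b"\r\n")
    kind, payload = line[:1], line[1:]
    if kind == b"+":                      # Simple String
        return payload.decode(), rest
    if kind == b"-":                      # Error (returned, not raised)
        return RespError(payload.decode()), rest
    if kind == b":":                      # Integer
        return int(payload), rest
    if kind == b"$":                      # Bulk String, length-prefixed
        length = int(payload)
        if length == -1:                  # null bulk string
            return None, rest
        return rest[:length], rest[length + 2:]
    if kind == b"*":                      # Array of nested replies
        count = int(payload)
        if count == -1:                   # null array
            return None, rest
        items = []
        for _ in range(count):
            item, rest = parse_reply(rest)
            items.append(item)
        return items, rest
    raise ValueError("unknown RESP type byte: %r" % kind)
```

Bulk strings are read by length rather than by searching for the line ending, which is what makes the protocol binary safe.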
DISTRIBUTED LOCKING
 Distributed locks are a very useful primitive in many environments where different processes must operate on shared resources in a mutually exclusive way.
 A distributed lock manager (DLM) runs on every machine in a cluster, with an identical copy of a cluster-wide lock database.
 The DLM synchronizes access to shared resources.
50
REDIS DISTRIBUTED LOCKING
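 Redis suggests the Redlock algorithm for implementing a DLM
 Redlock works as follows
 Gets the current time in milliseconds
 Sequentially tries to acquire the lock on all N instances, using the same key and a random value, with a short timeout per instance
 The lock is acquired only if a majority (N/2 + 1) of the instances granted it and the elapsed time is less than the lock validity time
 Otherwise it releases the lock on all instances
 Retries after a random delay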
51
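The acquisition steps can be sketched against in-memory stand-ins instead of real servers. Everything below is an assumption made for illustration: FakeInstance models one independent Redis master, set_nx_px models SET with the NX and PX options, and the clock drift allowance of the full algorithm is left out. As in the Redlock description, the lock counts as acquired only when a majority of instances grant it within the validity time.

```python
import time
import uuid


class FakeInstance:
    """In-memory stand-in for one independent Redis master (hypothetical)."""

    def __init__(self, up=True):
        self.up = up
        self.locks = {}  # resource -> (token, expiry on the monotonic clock)

    def set_nx_px(self, resource, token, ttl_ms):
        """Model of SET resource token NX PX ttl_ms."""
        if not self.up:
            raise ConnectionError("instance unreachable")
        now = time.monotonic()
        held = self.locks.get(resource)
        if held is not None and held[1] > now:
            return False
        self.locks[resource] = (token, now + ttl_ms / 1000)
        return True

    def release(self, resource, token):
        """Delete the key only if it still holds our token."""
        if self.up and self.locks.get(resource, (None,))[0] == token:
            del self.locks[resource]


def acquire(instances, resource, ttl_ms):
    """Return (token, validity_ms) on success, or None on failure."""
    token = uuid.uuid4().hex          # random value unique to this attempt
    start = time.monotonic()
    granted = 0
    for instance in instances:        # try every instance in sequence
        try:
            if instance.set_nx_px(resource, token, ttl_ms):
                granted += 1
        except ConnectionError:
            pass                      # an unreachable instance is skipped
    elapsed_ms = (time.monotonic() - start) * 1000
    validity_ms = ttl_ms - elapsed_ms
    if granted >= len(instances) // 2 + 1 and validity_ms > 0:
        return token, validity_ms
    for instance in instances:        # failed: undo any partial acquisition
        instance.release(resource, token)
    return None
```

With five instances the lock survives two failures; a third failure makes a majority impossible, so the attempt is rolled back.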
REDLOCK IMPLEMENTATION
 You do not have to implement Redlock
 Many open source implementations are already available
 Java
 https://github.com/redisson/redisson
 Node
 https://github.com/mike-marcacci/node-redlock
52
REDISSON EXAMPLE JAVA
54
NODE REDLOCK EXAMPLE
55
 Install Redlock
 npm install redlock
NODE REDLOCK EXAMPLE CONTINUED
56
HANDSON
BREAK
58
QUESTIONS?
59
AUTOCOMPLETE
60
 Find the rank of the typed prefix, e.g. fo (all words are stored in one sorted set with the same score, so they are ordered lexicographically)
 zrank zset fo
 Returns the rank of fo. It runs in O(log n)
 zrange zset <rank of fo +1> -1
 Returns all values that come after fo
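The two commands can be modelled with a sorted Python list standing in for a sorted set whose members all share one score. This is a sketch under assumptions: zrank, zrange and complete are hypothetical local helpers, the prefix itself must be a member of the set, and the scan stops at the first word that no longer starts with the prefix.

```python
import bisect

# Stand-in for a sorted set with equal scores: members in lexicographic order.
WORDS = sorted(["bar", "fo", "foo", "foobar", "football", "fork", "zap"])


def zrank(zset, member):
    """Model of ZRANK: binary search, None when the member is absent."""
    index = bisect.bisect_left(zset, member)
    if index < len(zset) and zset[index] == member:
        return index
    return None


def zrange(zset, start, stop):
    """Model of ZRANGE with an inclusive stop; -1 means the last element."""
    if stop < 0:
        stop = len(zset) + stop
    return zset[start:stop + 1]


def complete(zset, prefix):
    """Return the members that follow prefix and still start with it."""
    rank = zrank(zset, prefix)
    if rank is None:
        return []
    matches = []
    for word in zrange(zset, rank + 1, -1):
        if not word.startswith(prefix):
            break
        matches.append(word)
    return matches
```

For ASCII words Python's string ordering matches the byte-wise ordering Redis uses for members with equal scores.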
PIPELINING
 Sending multiple commands at once, instead of one by one
 Since Redis follows a request-response architecture, sending commands one by one means waiting for each response before sending the next command
 Every request-response pair costs one Round Trip Time (RTT); pipelining cuts the number of round trips
 It is driven by the client; the server needs no special configuration
 Clients buffer the commands and send them together
61
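A rough calculation shows why the round trips dominate. The model below is an idealized assumption, not a benchmark: every command costs a fixed execution time, and a pipelined batch pays the round trip only once. The function names and the figures are illustrative.

```python
def sequential_us(commands: int, rtt_us: int, exec_us: int) -> int:
    """Total time when each command waits for its own reply."""
    return commands * (rtt_us + exec_us)


def pipelined_us(commands: int, rtt_us: int, exec_us: int) -> int:
    """Total time when all commands travel in one batch (idealized)."""
    return rtt_us + commands * exec_us


# Assumed figures: 10,000 commands, 500 microseconds RTT, 10 microseconds each.
one_by_one = sequential_us(10_000, 500, 10)
batched = pipelined_us(10_000, 500, 10)
```

Under these assumed figures the sequential run spends almost all of its time waiting on the network, while the batched run is bounded by command execution.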
PIPELINING JAVA EXAMPLE
 Most client libraries support redis pipelining e.g. Java, Node etc.
62
PIPELINING COMMAND LINE
 redis-cli --pipe
 Makes the command line run in pipeline mode
 e.g. cat commands.txt | redis-cli --pipe will run all commands in commands.txt
 Commands have to be encoded in redis protocol
 e.g. Set key value is represented as follows
 "*3\r\n$3\r\nSET\r\n$3\r\nkey\r\n$5\r\nvalue\r\n"
63
RESP PROTOCOL SYNTAX
*3<cr><lf>
$3<cr><lf>
SET<cr><lf>
$3<cr><lf>
key<cr><lf>
$5<cr><lf>
value<cr><lf>
64
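The layout above (an array header, then one length-prefixed bulk string per argument) is simple to generate. This is a minimal sketch; encode_command is a hypothetical helper name, and arguments that are not bytes are assumed to be UTF-8 text.

```python
def encode_command(*args) -> bytes:
    """Encode one command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        # The length prefix counts bytes, not characters.
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


payload = b"".join(
    encode_command("SET", "key:%d" % i, "value") for i in range(3)
)
```

Writing such a payload to a file produces input in the format that cat commands.txt | redis-cli --pipe expects.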
PIPELINING INTERNALS
 redis-cli --pipe sends data as fast as possible and reads replies when they are available
 Once there is no more data to read from the input, it sends a special ECHO command with a random 20-byte string
 Once this final command is sent, the code receiving replies starts matching them against these 20 bytes
 When the matching reply is reached it can exit with success.
65
HANDSON
BREAK
67
QUESTIONS?
68
PUB SUB
 Redis supports a publisher subscriber model
 Messages are published to a channel
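 Publishers and subscribers are completely decoupled
 A message is delivered only to the subscribers connected at that moment; otherwise it is discarded
 Pub/Sub is not scoped to a database: a message published on db 10 is heard by a subscriber on db 1
 Pub/Sub supports pattern matching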
69
PUB SUB COMMANDS
 PUBLISH
 SUBSCRIBE
 PSUBSCRIBE
 UNSUBSCRIBE
 PUNSUBSCRIBE
 PUBSUB
70
JAVA PUB SUB EXAMPLE
 https://dzone.com/articles/redis-pubsub-using-spring
71
HANDSON
MEMORY OPTIMIZATIONS
 Redis encodes several data types automatically
 It’s a cpu vs memory trade-off
 The max number of entries and the max element size for the compact encoding can be configured
 If an encoded value exceeds the configured max size, Redis automatically converts it to the normal encoding
 32-bit Redis instances use less memory per key than 64-bit ones (pointers are smaller), but are limited to 4 GB of memory
73
MEMORY OPTIMIZATIONS CONTINUED
 Zip list values bigger than 1024 are generally very slow
 Try to keep size of keys smaller, wherever possible
 Zip lists with hundreds of thousands of entries are generally very slow
 Make sure you benchmark wherever required
 list-max-ziplist-entries 512
 list-max-ziplist-value 64
74
TRANSACTIONS
 All commands are serialized and executed sequentially
 No other command can be executed in the middle of transaction commands
 Either all of the commands or none are processed
 A Redis transaction is atomic
75
TRANSACTION USAGE
 MULTI command starts the transaction
 A number of Redis commands are issued; the server queues them instead of executing them
 EXEC executes all the queued commands
 DISCARD flushes the queued commands and terminates the transaction
 The queued commands are not executed
76
TRANSACTION USAGE CONTINUED
 WATCH is used to provide the Check-And-Set behavior
SET mykey 13
WATCH mykey
MULTI
// Some other client sets mykey to 19
SET mykey 29
EXEC
// EXEC returns nil: the transaction is aborted because mykey changed after WATCH
77
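The check-and-set behaviour can be modelled in memory to see why EXEC aborts. This sketch is an assumption about the observable behaviour only: Store and Client are hypothetical classes, and the per-key version counter is an illustration device, not how Redis tracks watched keys internally.

```python
class Store:
    """Shared key space with a write counter per key."""

    def __init__(self):
        self.data = {}
        self.version = {}

    def set(self, key, value):
        self.data[key] = value
        self.version[key] = self.version.get(key, 0) + 1


class Client:
    """One connection with WATCH / MULTI / EXEC state."""

    def __init__(self, store):
        self.store = store
        self.watched = {}   # key -> version seen at WATCH time
        self.queue = None   # None outside MULTI, a list inside it

    def watch(self, key):
        self.watched[key] = self.store.version.get(key, 0)

    def multi(self):
        self.queue = []

    def set(self, key, value):
        if self.queue is not None:
            self.queue.append((key, value))   # queued, not executed yet
            return "QUEUED"
        self.store.set(key, value)
        return "OK"

    def exec(self):
        queued, self.queue = self.queue, None
        changed = any(self.store.version.get(key, 0) != seen
                      for key, seen in self.watched.items())
        self.watched = {}                     # EXEC always clears the watches
        if changed:
            return None                       # aborted, nothing is applied
        results = []
        for key, value in queued:
            self.store.set(key, value)
            results.append("OK")
        return results
```

The usual client pattern is a loop: WATCH, read, MULTI, write, EXEC, and start over whenever EXEC reports an abort.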
TRANSACTION USAGE CONTINUED
 UNWATCH flushes all the watched keys of the connection
 Calling EXEC or DISCARD automatically flushes all watches
 A Lua script is executed atomically, like a transaction
78
TRANSACTION QUEUE ERROR
 A command may fail to be queued
 Wrong number of arguments, wrong command name
 Memory limit reached
 So there will be an error before EXEC is called
 If such an error occurs, the transaction is cancelled
 Most clients do this themselves; since Redis 2.6.5 the server also refuses to execute the transaction
79
TRANSACTION EXEC ERROR
 A command may fail after EXEC is called
 e.g. calling a list operation against a string value
 The other commands are still processed
 Only the failing command returns an error
 The transaction is not cancelled
80
ROLLBACK
 Redis does not support rollbacks
 For simplicity of design and performance considerations
81
HANDSON
BREAK
83
QUESTIONS?
84
PARTITIONING
 Partitioning is the process of splitting your data into multiple instances
 Client side partitioning
 Clients select the right node where to write or read a given key
 Proxy assisted partitioning
 clients send requests to a proxy and proxy selects the node
 twemproxy is an open source proxy developed by Twitter for Redis
 https://github.com/twitter/twemproxy
85
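Client side partitioning comes down to a deterministic function from key to node. The sketch below is one simple way to do it, assuming a fixed list of hypothetical node addresses and CRC32 as the hash; real clients often use consistent hashing instead, so that fewer keys move when nodes are added.

```python
import zlib

# Hypothetical node addresses, for illustration only.
NODES = ["10.0.0.1:6379", "10.0.0.2:6379", "10.0.0.3:6379"]


def pick_node(key: str, nodes=NODES) -> str:
    """Map a key to a node by hashing it modulo the number of nodes."""
    return nodes[zlib.crc32(key.encode("utf-8")) % len(nodes)]


placement = {key: pick_node(key) for key in ("user:1", "user:2", "user:3")}
```

Every client must use the same node list and hash function, otherwise two clients will look for the same key on different nodes.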
PARTITIONING CONTINUED
 Query routing
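 The client sends the request to any instance, and the instance makes sure it reaches the right node
 Redis Cluster uses a hybrid form of query routing: the instance does not forward the request, it redirects the client to the right node
 It is the preferred way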
86
SHARDING
 Sharding your database involves
 breaking up your big database into many, much smaller databases that
 share nothing and
 can be spread across multiple servers.
 Same thing as horizontal partitioning
 e.g. splitting a customer database geographically
87
CLUSTERING
 Redis Cluster provides a way to run a Redis installation where data is automatically sharded across multiple Redis nodes.
 Redis Cluster also provides some degree of availability during partitions,
 ability to continue the operations when some nodes fail
 the cluster stops operating when a majority of master nodes are unavailable
88
SETTING UP CLUSTER
 Using the create cluster script
 Easy, good for practice
 We will try this
 Manually setting up the cluster
 Creating production grade cluster
 Every node needs two open TCP ports: the client port (e.g. 6379) and the cluster bus port, which is the client port plus 10000 (e.g. 16379)
89
CREATE CLUSTER
 gem install redis
 create-cluster start
 create-cluster create (press yes if asked)
 Creates a six-node Redis cluster
 redis-cli with the -c switch can be used to connect to the cluster
 create-cluster stop
90
CLUSTER-CLI REDIRECTION
 redis-cli -c -p 30001
 set foo bar
 set hello world
 get foo
 get hello
 Check how you are redirected to various nodes
91
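The redirections follow from how Redis Cluster assigns keys: each key maps to one of 16384 hash slots, computed as the CRC16 of the key modulo 16384, and each node serves a range of slots. The sketch below reproduces that mapping; crc16 and key_hash_slot are illustrative names, and the hash tag handling (only the part between the first { and the following } is hashed when it is non-empty) follows the cluster specification.

```python
def crc16(data: bytes) -> int:
    """CRC16, XMODEM variant: polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_hash_slot(key: str) -> int:
    """Return the cluster hash slot (0..16383) for a key."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            data = data[start + 1:end]   # hash only the tag
    return crc16(data) % 16384
```

Here key_hash_slot("foo") is 12182 and key_hash_slot("hello") is 866, so in a cluster whose slots are split across several masters the two keys are likely to live on different nodes.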
RESHARDING
 Redis does not do automatic resharding
 redis-trib.rb check 127.0.0.1:30001
 redis-trib.rb reshard 127.0.0.1:30001
 How many slots do you want to move (from 1 to 16384)?
 Destination Node
 Source Nodes (There can be more than one)
92
RESHARDING CONTINUED
 Validate
 redis-trib.rb check 127.0.0.1:30001
 Non interactive version
 redis-trib.rb reshard --from <node-id> --to <node-id> --slots <number of slots> --yes <host>:<port>
93
HANDSON
BREAK
95
QUESTIONS?
96
INDEXING
 Natively, Redis offers only primary key access
 When secondary indexes are required, they can be stored in sorted sets
 You have a Hash
 We can create an index on a field and store it in Sorted Set
 Retrieve key value based on the secondary index
 Retrieve hash objects based on the key retrieved
97
INDEXING EXAMPLE
 HMSET user:1 id 1 username antirez age 38
 HMSET user:2 id 2 username maria age 42
 HMSET user:3 id 3 username jbaird age 33
 ZADD age 38 1
 ZADD age 42 2
 ZADD age 33 3
 ZRANGEBYSCORE age 20 40
 Will return 3 and 1 (ordered by score)
 HGETALL user:3
 HGETALL user:1
98
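The two-step lookup (query the index, then fetch the hashes) can be modelled with dictionaries. This is a sketch with hypothetical helpers: users stands in for the hashes, age_index for the sorted set, and zrangebyscore mimics the command by returning members in score order, so the member with age 33 comes before the one with age 38.

```python
# Stand-ins for the hashes user:1..user:3 and the sorted set "age".
users = {
    "user:1": {"id": "1", "username": "antirez", "age": "38"},
    "user:2": {"id": "2", "username": "maria", "age": "42"},
    "user:3": {"id": "3", "username": "jbaird", "age": "33"},
}
age_index = {"1": 38, "2": 42, "3": 33}   # member -> score


def zrangebyscore(zset, low, high):
    """Model of ZRANGEBYSCORE: members with low <= score <= high, by score."""
    ordered = sorted(zset.items(), key=lambda pair: (pair[1], pair[0]))
    return [member for member, score in ordered if low <= score <= high]


def users_by_age(low, high):
    """Step 1: read ids from the index. Step 2: fetch each hash by key."""
    return [users["user:" + uid] for uid in zrangebyscore(age_index, low, high)]
```

The index is maintained by the application, so every write to a user's age has to update both the hash and the sorted set.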
HANDSON
QUESTIONS?
100
CLIENT COMMANDS
 Client Kill
 CLIENT KILL addr 127.0.0.1:6379 type slave
 Client List
 Returns information and statistics about the client connections to the server
 Client Pause
 Client Pause is a connection control command able to suspend all the Redis clients for the specified amount of time (in milliseconds).
101
CLIENT COMMANDS
 Client Reply
 To completely disable replies from the Redis server
 Client SetName
 To set name of client
102
HANDSON
QUESTIONS?
104
PERSISTENCE
 Redis supports two persistence formats
 Append Only File (AOF)
 RDB
105
AOF
 Human readable,
 Every write command is appended to the log in the Redis protocol format,
 Slower
106
RDB
 Binary format,
 Point-in-time snapshot,
 Uses LZF compression, compact
 Faster than AOF
 You have to define save points
 Snapshots are usually taken every five minutes or more
107
RDB VS AOF
 If both are enabled, Redis uses the AOF to rebuild the dataset on restart
 RDB allows faster restarts with big datasets compared to AOF
 RDB can lose the writes made since the last snapshot; AOF typically loses at most about one second of writes (with fsync every second)
 If dataset is big then RDB takes more time for backup than AOF
 RDB forks a child process to write the snapshot, and the fork can pause the server briefly when the dataset is big; AOF rewrites also run in the background
108
RDB OR AOF
 If data is more important then use AOF
 If recovery speed is important then use RDB
 Redis is working on a new persistence strategy which will use advantages of both RDB and AOF
109
HANDSON
BREAK
111
QUESTIONS?
112
REDIS-CLI
 redis-cli --stat
 A new line is printed every second with memory usage, clients connected etc
 redis-cli --bigkeys
 Scans the entire keyspace to find biggest keys and average sizes per key type
 redis-cli monitor
 Will print all the commands received by an instance
113
REDIS-CLI CONTINUED
 redis-cli --latency
 Measures the time a PING command takes to get a reply
 redis-cli --rdb /tmp/dump.rdb
 Copies the RDB dump to specified location
 redis-cli --slave
 Lets you inspect what the master sends to slaves, used for debugging
114
HANDSON
QUESTIONS?
116
SECURITY
 Designed to be accessed by trusted clients in trusted environments
 Should not be exposed on public ip addresses
 In protected mode, can only be accessed from loopback interfaces
 Redis does not support any access control
 Supports simple password authentication via the requirepass directive in redis.conf
 Possible to rename or disable certain commands
117
ENCRYPTION
 Redis does not support encryption or SSL
 There are some open source libraries that can encrypt
communication to Redis instances
 e.g. Spiped http://www.tarsnap.com/spiped.html
118
HANDSON
QUESTIONS?
120
CONNECTION HANDLING
 Maximum number of clients
 default 10000 clients can connect simultaneously
 maxclients can be specified in redis.conf
 Output buffer
 No limit for normal clients; pub/sub clients default to a 32 MB hard limit, or 8 MB sustained for 60 seconds
 Slaves default to a 256 MB hard limit, or 64 MB sustained for 60 seconds
 client-output-buffer-limit can be set
121
CONNECTION HANDLING CONTINUED
 Query Buffer
 Limited to 1 GB per client, which is very high and generally not reached
 Client timeout
 By default client is not timed out
 timeout value in seconds can be specified, not applied to pub/sub clients
122
HANDSON
QUESTIONS?
124
SENTINEL
 Sentinel provides high availability for Redis
 Helps a Redis deployment resist certain failures without human intervention
 Sentinel capabilities
 Monitoring,
 Notification,
 Automatic failover and
 Configuration provider
125
SENTINEL CONTINUED
 There are multiple sentinel instances in a cluster
 Multiple Sentinel instances co-operate for decision making
 A failure is detected only when multiple Sentinels agree that a master is no longer available
 This lowers the probability of false positives.
 Sentinel works even if not all the Sentinel processes are working
 Helps avoid becoming single point of failure
126
SENTINEL CONFIGURATION
sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 60000
sentinel failover-timeout mymaster 180000
sentinel parallel-syncs mymaster 1
127
SENTINEL QUORUM
 Quorum is the number of Sentinels that need to agree that a master is unreachable
 Quorum is defined in sentinel.conf (the last argument of sentinel monitor)
 Quorum is only used to detect the failure
 To perform a failover, one of the Sentinels needs to be elected leader
 Leader is elected with the vote of majority of Sentinel processes
128
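The difference between the quorum and the majority is easy to mix up, so here it is as plain arithmetic. This sketch only encodes the two rules on the slide; the function names are hypothetical and the Sentinel leader election itself is not modelled.

```python
def majority(total_sentinels: int) -> int:
    """Smallest number of Sentinels that forms a majority."""
    return total_sentinels // 2 + 1


def failure_detected(agreeing: int, quorum: int) -> bool:
    """The master is flagged as down once quorum Sentinels agree."""
    return agreeing >= quorum


def failover_authorized(agreeing: int, quorum: int,
                        reachable: int, total_sentinels: int) -> bool:
    """A failover also needs a majority of Sentinels to elect a leader."""
    return (failure_detected(agreeing, quorum)
            and reachable >= majority(total_sentinels))
```

With five Sentinels and a quorum of 2, two agreeing Sentinels are enough to detect the failure, but the failover only starts if at least three Sentinels are reachable.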
RUNNING SENTINEL
 There are two ways to run Sentinel
 redis-sentinel /path/to/sentinel.conf
 redis-server /path/to/sentinel.conf --sentinel
 sentinel.conf is mandatory
 Sentinel runs on 26379 port by default
129
RUNNING SENTINEL CONTINUED
 sentinel master mymaster
 Will show information about the mymaster instance
 sentinel slaves mymaster
 sentinel sentinels mymaster
130
TESTING FAILOVER
 redis-cli -p 6379 DEBUG sleep 60
 Will make redis sleep for 60 seconds, no longer reachable
 After some time if you ask sentinel about master
 SENTINEL get-master-addr-by-name mymaster
 You will get a different master
131
SENTINEL BEST PRACTICES
 Deploy at least three Sentinels
 Sentinels can be deployed on same node as master and slaves
132
HANDSON
BREAK
134
QUESTIONS?
135
REDIS USE CASES
 Real time delivery architecture at Twitter
 https://www.infoq.com/presentations/Real-Time-Delivery-Twitter
 GitHub repositories
 https://github.com/blog/530-how-we-made-github-fast
 Pinterest Follower model
 https://engineering.pinterest.com/blog/building-follower-model-scratch
136
REDIS USE CASES CONTINUED
 Craigslist
 https://blog.zawodny.com/2011/02/26/redis-sharding-at-craigslist/
 Stack Exchange
 https://goo.gl/Xvqtq0
 Flickr
 https://goo.gl/EVni0g
137
QUESTIONS?
138
RAJESH KUMAR
Thanks a lot
Rajesh.Jangra@gmail.com

Redis

  • 1.
  • 2.
    NOSQL  Non relationaldatabases  Existed since 60s, made popular in early 2000s  Do not follow the tabular/relational structure of RDBMSs  Major motivation and selling point is horizontal scaling  Generally Compromise on consistency  Lack true ACID properties 2
  • 3.
    TYPES OF NOSQL Key-Value  Redis, Dynamo, Hazelcast, Memcached, Ehcache, Oracle NoSQL  Column  Cassandra, Hbase, Vertica  Document  MongoDB, Couchbase, DocumentDB  Graph  ArangoDB, Neo4J, Giraph, OrientDB 3
  • 4.
    KEY VALUE DB Store values as associative arrays (Map or Dictionary)  Simplest of data models  Provide eventual consistency model and serialization  Also support ordering of keys 4
  • 5.
    DOCUMENT DB  Documentoriented DBs are also key value pair  But the value is in form of document  Generally document is in some encoding e.g. JSON, XML etc 5
  • 6.
    COLUMNAR DB  Columnrefers to the value stored  Timestamp acts as the key  Latest timestamp has the most recent value 6
  • 7.
    GRAPH DB  Valuesare kept in vertices in graphs  Vertices are interconnected with other vertices by relations 7
  • 8.
    CHARACTERISTICS 8 Data Model PerformanceScalability Flexibility Complexity Key Value High High High None Columnar High High Moderate Low Document High High High Moderate Graph Moderate* Moderate* High High
  • 9.
    NOSQL USE CASES 9 Huge data  Massive write performance  Low latency  Flexible schema  Parallel computing
  • 10.
  • 11.
    INTRODUCTION  Redis isan open source, in memory data structure.  BSD license  Redis can be used as a database, cache and a message broker.  Supports following data structures  Strings, hashes, lists, sets and ordered sets  Redis is written in ANSI C 11
  • 12.
    OS SUPPORT  Developedand tested on POSIX systems  Unix, Mac OS  Works on Solaris, but support not provided  No official redis for Windows  But Microsoft Open Tech group develops and maintains a 64 bit port of Redis  https://github.com/MSOpenTech/redis  Current stable release is 3.2 (4.0 beta is available too) 12
  • 13.
    SETTING UP  Installredis on your machine  redis-server  Starts the redis server  Can be configured as windows service  redis-cli  Starts the redis command line 13
  • 14.
    SETTING UP NODEREDIS  Install node  Install npm  npm install redis  Write code in js file and run with node 14
  • 15.
    VERIFY INSTALLATION  redis-cli--version  ping  echo “Hello World”  redis-server --version  quit  To quit to command line 15
  • 16.
    GETTING HELP  Help<command> or visit https://redis.io/commands  Shows the information about command  Summary  Version  Group 16
  • 17.
    KEY AND VALUES Keys and values can be maximum 512MB  Smaller keys are a smart idea, but stick with a balance  Keys are binary safe  Any string even if binary can be key  An empty string can be key 17
  • 18.
    BASIC COMMANDS  Set,get  Incr, incrby, decr, decrby  Mset, mget  Exists, del  Type  Expire, pexpire, ttl, pttl 18
  • 19.
  • 20.
  • 21.
  • 22.
    SYSTEM COMMANDS  FLUSHALL Removes all keys from all databases  FLUSHDB  Removes all keys from current database  SELECT  Select a database 22
  • 23.
    SYSTEM COMMANDS CONTINUED DBSIZE  Count of keys in current database  SLOWLOG  Gets the command which executes slower than slowlog-log-slower-than  TIME  Gets the current time in seconds and microsecond. 23
  • 24.
    DATA STRUCTURES  Redissupports following data structures  String,  Hash,  List,  Set,  Sorted set 24
  • 25.
    LIST  Redis listsare linked lists and not arrays  Lpush adds the element to the head  Rpush adds the element to the tail  Lrange command extracts ranges of elements from lists  Rpop, lpop pop elements from tail and head respectively  Twitter takes the latest tweets posted by users into Redis lists. 25
  • 26.
  • 27.
    BLOCKING QUEUE  Toimplement blocking popping actions on a list  Brpop  Blpop  The commands have a timeout option as well  0 means indefinitely  Rpoplpush, brpoplpush 27
  • 28.
    LIST AS BLOCKINGQUEUE  To implement blocking popping actions on a list  Brpop  Blpop  The commands have a timeout option as well  0 means indefinitely  Rpoplpush, brpoplpush 28
  • 29.
  • 30.
    HASHES  Hset, hget Hmset, hmget  Hashes with small values are represented very memory efficiently 30
  • 31.
    HASHES COMMANDS  HDEL,HEXISTS  HGETALL  HKEYS, HVALS  HLEN, HSTRLEN, HINCRBY, HINCRBYFLOAT  HSETNX  HSCAN 31
  • 32.
  • 33.
    SETS  Unordered andunique collection of strings  SADD, SMEMBERS  SINTER, SUNION, SDIFF,  SDIFFSTORE, SINTERSTORE, SUNIONSTORE  SRANDMEMBER, SCARD  SREM, SPOP, SMOVE, SSCAN 33
  • 34.
  • 35.
    SORTED SETS  Unordered,sorted and unique collection of strings  A field score is associated to aid in sorting  ZADD, ZRANGE  ZCARD, ZCOUNT, ZRANK  ZREVRANK, ZSCORE, ZSCAN  ZREMRANGEBYRANK, ZREMRANGEBYSCORE, ZREMRANGEBYLEX 35
  • 36.
  • 37.
    QUESTION?  What willadding an element twice will do?  If the score is different it will be updated. 37
  • 38.
    BITMAPS  Not anactual data type  Operations defined on the strings  As strings are binary safe blobs of bits  SETBIT, GETBIT,  BITCOUNT, BITOP, BITOPS  BITFIELD 38
  • 39.
  • 40.
    HYPERLOGLOGS (HLLS)  Probabilisticdata structure to count unique things  Just like counting members of a set  Not very exact (around 1% error)  But HLLs require very little memory  Actually it never add elements, Just maintains the count  PFADD, PFCOUNT, PFMERGE 40
  • 41.
  • 42.
  • 43.
  • 44.
    CONFIGURATION  Redis configurationlies in redis.conf file  To override the default config  redis-server /path/to/redis.conf  redis-server --port 9999 --slaveof 127.0.0.1 6379  Configurations can also be set at runtime  CONFIG SET, CONFIG GET, CONFIG REWRITE 44
  • 45.
    INFO AND STATISTICS Info  server: General information about the Redis server  clients: Client connections section  memory: Memory consumption related information  persistence: RDB and AOF related information  stats: General statistics  replication: Master/slave replication information 45
  • 46.
    INFO AND STATISTICSCONTINUED  Info  cpu: CPU consumption statistics  commandstats: Redis command statistics  cluster: Redis Cluster section  keyspace: Database related statistics  all: Return all sections  default: Return only the default set of sections 46
  • 47.
  • 48.
    RESP  Redis SerializationProtocol is a text based protocol used by clients and servers to communicate to each other  Follows request response architecture and operates over TCP  Except when pipelining and Pub/Sub  Redis clusters use a different protocol for communication among nodes 48
  • 49.
    RESP WORKING  Clientssend commands to a server  The server replies with one of the RESP types  For Simple Strings the first byte of the reply is "+"  For Errors the first byte of the reply is "-"  For Integers the first byte of the reply is ":"  For Bulk Strings the first byte of the reply is "$"  For Arrays the first byte of the reply is "*" 49
  • 50.
    DISTRIBUTED LOCKING  Distributedlocks are a very useful primitive in many environments where different processes must operate with shared resources in a mutually exclusive way.  A distributed lock manager runs in every machine in a cluster, with an identical copy of a cluster-wide lock database.  DLM synchronize the access to shared resource. 50
  • 51.
    REDIS DISTRIBUTED LOCKING Redis suggests to implement Redlock algorithm for DLM  Redlock works as follows  Gets the current time in milliseconds  Sequentially tries to acquire locks on all instances within timeout  If fail to acquire all locks within timeout then releases all acquired locks  Retries after some time 51
  • 52.
    REDLOCK IMPLEMENTATION  Youdo not have to implement Redlock  Many open source implementations are already available  Java  https://github.com/redisson/redisson  Node  https://github.com/mike-marcacci/node-redlock 52
  • 53.
    REDLOCK IMPLEMENTATION  Youdo not have to implement Redlock  Many open source implementations are already available  Java  https://github.com/redisson/redisson  Node  https://github.com/mike-marcacci/node-redlock 53
  • 54.
  • 55.
    NODE REDLOCK EXAMPLE 55 Install Redlock  npm install redlock
  • 56.
  • 57.
  • 58.
  • 59.
  • 60.
    AUTOCOMPLETE 60  Find therank of word typed e.g. fo  zrank zset fo  Will provide the rank of fo. It runs on O(log n)  zrange zset <rank of fo +1> -1  Will provide all values coming after fo
  • 61.
    PIPELINING  Sending multiplecommands at once, Instead of one by one  Since redis follows request response architecture, sending commands one by one will result in responses  Every response has Round Trip Time associated with it, pipelining cuts RTT  It’s a client side phenomenon and redis server has nothing to do with it  Clients buffer the commands at TCP stack and send at once 61
  • 62.
    PIPELINING JAVA EXAMPLE Most client libraries support redis pipelining e.g. Java, Node etc. 62
  • 63.
    PIPELINING COMMAND LINE redis-cli --pipe  Makes the command line run in pipeline mode  e.g. cat commands.txt | redis-cli --pipe will run all commands in commands.txt  Commands have to be encoded in redis protocol  e.g. Set key value is represented as follows  "*3rn$3rnSETrn$3rnkeyrn$5rnvaluern" 63
  • 64.
  • 65.
    PIPELINING INTERNALS  redis-cli--pipe tries to sends and reads data when available  Once there is no more data to read from input, it sends a special ECHO command with a random 20 bytes string  Once this special final command is sent, the code receiving replies starts to match replies with this 20 bytes.  When the matching reply is reached it can exit with success. 65
  • 66.
  • 67.
  • 68.
  • 69.
    PUB SUB  Redissupports a publisher subscriber model  Messages are published to a channel  Publisher and Subscribers are completely detached  Any message not consumed is discarded  DB Scoping does not work on Pub Sub  Pub Sub support pattern matching 69
  • 70.
    PUB SUB COMMANDS PUBLISH  SUBSCRIBE  PSUBSCRIBE  UNSUBSCRIBE  PUNSUBSCRIBE  PUBSUB 70
  • 71.
    JAVA PUB SUBEXAMPLE  https://dzone.com/articles/redis-pubsub-using-spring 71
  • 72.
  • 73.
    MEMORY OPTIMIZATIONS  Redisencodes several data types automatically  It’s a cpu vs memory trade-off  Max number of entries and max size of the key can be specified  If an encoded value will overflow the configured max size, Redis will automatically convert it into normal encoding  Redis 32 bit instances are faster than 64 bit instances 73
  • 74.
    MEMORY OPTIMIZATIONS CONTINUED Zip list values bigger than 1024 are generally very slow  Try to keep size of keys smaller, wherever possible  Zip list entries into hundred thousands generally are very slow  Make sure you benchmark wherever required  list-max-ziplist-entries 512  list-max-ziplist-value 64 74
  • 75.
    TRANSACTIONS  All commandsare serialized and executed sequentially  No other command can be executed in the middle of transaction commands  Either all of the commands or none are processed  A Redis transaction is atomic 75
  • 76.
    TRANSACTION USAGE  MULTIcommand starts the transaction  A number of Redis commands are issued  EXEC will send commands to server to be executed  DISCARD will flush the commands and transaction is terminated  Commands will not be sent to server for execution 76
  • 77.
    TRANSACTION USAGE CONTINUED  WATCH is used to provide check-and-set (CAS) behavior  SET mykey 13  WATCH mykey  MULTI  (some other client sets mykey to 19)  SET mykey 29  EXEC (returns nil: the transaction is aborted because mykey changed after WATCH)
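The check-and-set idea behind WATCH can be modelled without a server. The sketch below is a toy model under the assumption that every write bumps a per-key version; `ToyStore` and its methods are hypothetical and are not how Redis is implemented internally.

```python
class ToyStore:
    """Tiny in-memory model of WATCH / MULTI / EXEC for a single watched key."""

    def __init__(self):
        self.data = {}
        self.versions = {}  # key -> number of writes seen so far

    def set(self, key, value):
        self.data[key] = value
        self.versions[key] = self.versions.get(key, 0) + 1

    def watch(self, key):
        """Remember the key's version at WATCH time."""
        return (key, self.versions.get(key, 0))

    def execute(self, watched, queued):
        """Apply the queued writes only if the watched key is unchanged."""
        key, seen_version = watched
        if self.versions.get(key, 0) != seen_version:
            return None  # like EXEC returning nil
        for k, v in queued:
            self.set(k, v)
        return ["OK"] * len(queued)


store = ToyStore()
store.set("mykey", 13)

token = store.watch("mykey")
store.set("mykey", 19)  # another client writes in between
first_try = store.execute(token, [("mykey", 29)])  # aborted

token = store.watch("mykey")  # the usual pattern: watch again and retry
second_try = store.execute(token, [("mykey", 29)])
```

Clients typically wrap this watch, read, queue, execute sequence in a retry loop that repeats until EXEC succeeds.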
  • 78.
    TRANSACTION USAGE CONTINUED  UNWATCH flushes all watched keys for the connection  Calling EXEC or DISCARD automatically flushes all watches  Lua scripts are executed atomically, much like a transaction
  • 79.
    TRANSACTION QUEUE ERROR  A command fails to be queued  Wrong number of arguments, wrong command name  Memory limit reached  The error is reported before EXEC is called  If such an error occurs, the transaction is cancelled  True for most clients and Redis 2.6+
  • 80.
    TRANSACTION EXEC ERROR  A command fails after EXEC is called  e.g. calling a list operation against a string value  The successful commands are still processed  The failing command is not processed  The transaction is not cancelled
  • 81.
    ROLLBACK  Redis does not support rollbacks  This is for simplicity of design and for performance
  • 82.
  • 83.
  • 84.
  • 85.
    PARTITIONING  Partitioning is the process of splitting your data across multiple instances  Client-side partitioning  Clients select the node on which to read or write a given key  Proxy-assisted partitioning  Clients send requests to a proxy and the proxy selects the node  twemproxy is an open source proxy for Redis developed by Twitter  https://github.com/twitter/twemproxy
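Client-side partitioning can be as simple as hashing the key and taking it modulo the number of nodes. A minimal sketch, assuming three nodes whose addresses are made up for illustration:

```python
import zlib

# Hypothetical node addresses; any stable list of instances would do.
NODES = ["redis-a:6379", "redis-b:6379", "redis-c:6379"]


def node_for(key: str, nodes=NODES) -> str:
    """Pick the node responsible for a key: hash the key, then take it modulo N."""
    return nodes[zlib.crc32(key.encode("utf-8")) % len(nodes)]


# Every client that uses the same node list maps a key to the same node.
placement = {key: node_for(key) for key in ("user:1", "user:2", "cart:77")}
```

The weakness of plain modulo hashing is that adding or removing a node remaps most keys; consistent hashing is a common way to limit that movement.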
  • 86.
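    PARTITIONING CONTINUED  Query routing  The client sends a request to a random instance, and the instance makes sure it reaches the right node  Redis Cluster implements a hybrid form of query routing: the instance redirects the client to the right node instead of forwarding the request itself  It is the preferred way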
  • 87.
    SHARDING  Sharding your database involves breaking up your big database into many much smaller databases that share nothing and can be spread across multiple servers  Same thing as horizontal partitioning  e.g. splitting a customer database geographically
  • 88.
    CLUSTERING  Redis Cluster provides a way to run a Redis installation where data is automatically sharded across multiple Redis nodes  Redis Cluster also provides some degree of availability during partitions  It can continue operating when some nodes fail  The cluster stops operating when the majority of masters fail
  • 89.
    SETTING UP CLUSTER  Using the create-cluster script  Easy, good for practice  We will try this  Manually setting up the cluster  For creating a production-grade cluster  Each node needs two open ports: the client port (e.g. 6379) and the cluster bus port, which is the client port plus 10000 (e.g. 16379)
  • 90.
    CREATE CLUSTER  gem install redis  create-cluster start  create-cluster create (answer yes if asked)  Creates six Redis nodes  redis-cli with the -c switch can be used to connect to the cluster  create-cluster stop
  • 91.
    CLUSTER-CLI REDIRECTION  redis-cli -c -p 30001  set foo bar  set hello world  get foo  get hello  Check how you are redirected to various nodes
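The redirections happen because Redis Cluster maps every key to one of 16384 hash slots, computed as CRC16 of the key modulo 16384, and each node owns a range of slots. A minimal sketch of that computation, including the hash tag rule; the function names are illustrative:

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (the XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Return the cluster hash slot (0..16383) for a key."""
    data = key.encode("utf-8")
    # Hash tags: if the key contains a non-empty {...} section,
    # only the part between the braces is hashed.
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    return crc16_xmodem(data) % 16384


print(key_slot("foo"))    # 12182
print(key_slot("hello"))  # 866
```

Because `foo` and `hello` fall in different slots, they usually live on different masters, which is why redis-cli in cluster mode reports a redirection between the two commands. Keys that share a hash tag, such as `{user1000}.following` and `{user1000}.followers`, always land in the same slot.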
  • 92.
    RESHARDING  Redis does not do automatic resharding  redis-trib.rb check 127.0.0.1:30001  redis-trib.rb reshard 127.0.0.1:30001  The reshard command asks: How many slots do you want to move (from 1 to 16384)?  Destination node  Source nodes (there can be more than one)
  • 93.
    RESHARDING CONTINUED  Validate with redis-trib.rb check 127.0.0.1:30001  Non-interactive version  redis-trib.rb reshard --from <node-id> --to <node-id> --slots <number of slots> --yes <host>:<port>
  • 94.
  • 95.
  • 96.
  • 97.
    INDEXING  Natively, Redis offers only primary key access  If secondary indexes are required, they can be stored in sorted sets  Suppose you have a hash per object  Create an index on a field and store it in a sorted set, with the field value as the score  Retrieve the keys based on the secondary index  Retrieve the hash objects based on the keys retrieved
  • 98.
    INDEXING EXAMPLE  HMSET user:1 id 1 username antirez age 38  HMSET user:2 id 2 username maria age 42  HMSET user:3 id 3 username jbaird age 33  ZADD age 38 1  ZADD age 42 2  ZADD age 33 3  ZRANGEBYSCORE age 20 40  Will return 3 and 1 (ordered by score)  HGETALL user:3  HGETALL user:1
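The same two-step lookup can be modelled in plain Python, with a dict standing in for the hashes and a sorted list of (score, member) pairs standing in for the sorted set. This is a toy model of the idea, not a Redis client, and `ids_by_age` is a hypothetical helper.

```python
# Hashes: key -> field/value pairs, as stored by HMSET.
users = {
    "user:1": {"id": "1", "username": "antirez", "age": "38"},
    "user:2": {"id": "2", "username": "maria", "age": "42"},
    "user:3": {"id": "3", "username": "jbaird", "age": "33"},
}

# Secondary index: (score, member) pairs kept in score order, like ZADD age <age> <id>.
age_index = sorted((int(user["age"]), user["id"]) for user in users.values())


def ids_by_age(low: int, high: int):
    """Model of ZRANGEBYSCORE age low high: members whose score is in [low, high]."""
    return [member for score, member in age_index if low <= score <= high]


ids = ids_by_age(20, 40)                           # step 1: query the index
matches = [users["user:" + uid] for uid in ids]    # step 2: fetch the hashes
print([user["username"] for user in matches])      # ['jbaird', 'antirez']
```

The index has to be maintained by the application: whenever an age changes, the corresponding sorted set entry must be updated as well.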
  • 99.
  • 100.
  • 101.
    CLIENT COMMANDS  CLIENT KILL  CLIENT KILL addr 127.0.0.1:6379 type slave  CLIENT LIST  Returns information and statistics about the client connections to the server  CLIENT PAUSE  A connection control command that suspends all Redis clients for the specified amount of time (in milliseconds)
  • 102.
    CLIENT COMMANDS  CLIENT REPLY  Can completely disable replies from the Redis server (CLIENT REPLY OFF)  CLIENT SETNAME  Sets a name for the current connection
  • 103.
  • 104.
  • 105.
    PERSISTENCE  Redis supports two persistence formats  Append Only File (AOF)  RDB
  • 106.
    AOF  Human readable: every write command is logged in the text-based Redis protocol format  Slower
  • 107.
    RDB  Binary format, point-in-time snapshot  Uses LZF compression, compact  Faster than AOF  You have to define save points  Snapshots are typically taken every 5 minutes or more, so the most recent writes can be lost
  • 108.
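    RDB VS AOF  If both are enabled, the AOF is used to rebuild the dataset on restart  RDB allows faster restarts with big datasets compared to AOF  RDB can lose the writes made since the last snapshot; AOF with the default appendfsync everysec loses at most about one second of writes  RDB snapshots fork a child process, and the fork can stall the server briefly when the dataset is big  BGSAVE and the AOF rewrite both run in the background; only the SAVE command blocks the server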
  • 109.
    RDB OR AOF  If data durability is more important, use AOF  If recovery speed is more important, use RDB  Redis is working on a new persistence strategy that will combine the advantages of both RDB and AOF
  • 110.
  • 111.
  • 112.
  • 113.
    REDIS-CLI  redis-cli --stat  Prints a new line every second with memory usage, connected clients, etc.  redis-cli --bigkeys  Scans the entire keyspace to find the biggest keys and the average sizes per key type  redis-cli monitor  Prints all the commands received by an instance
  • 114.
    REDIS-CLI CONTINUED  redis-cli --latency  Measures the time PING commands take to get a reply  redis-cli --rdb /tmp/dump.rdb  Copies the RDB dump to the specified location  redis-cli --slave  Lets you inspect what the master sends to its slaves, used for debugging
  • 115.
  • 116.
  • 117.
    SECURITY  Designed to be accessed by trusted clients inside trusted environments  Should not be exposed on public IP addresses  In protected mode, it can only be accessed from loopback interfaces  Redis does not support access control  Supports limited authentication through requirepass in redis.conf  Certain commands can be renamed or disabled with rename-command
  • 118.
    ENCRYPTION  Redis does not support encryption or SSL  Some open source tools can encrypt communication to Redis instances  e.g. spiped http://www.tarsnap.com/spiped.html
  • 119.
  • 120.
  • 121.
    CONNECTION HANDLING  Maximum number of clients  By default 10000 clients can connect simultaneously  maxclients can be specified in redis.conf  Output buffer  No limit for normal clients  Pub/Sub clients: 32 MB hard limit, or 8 MB sustained for 60 seconds  Slaves: 256 MB hard limit, or 64 MB sustained for 60 seconds  client-output-buffer-limit can be set in redis.conf
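Each `client-output-buffer-limit` entry has a hard limit, a soft limit and a number of seconds: a client is dropped as soon as its buffer reaches the hard limit, or when it stays at or above the soft limit continuously for that long. The sketch below models this rule over a series of (time, buffer size) samples; `should_disconnect` is a hypothetical helper, and the exact boundary handling (>= versus >) is an assumption.

```python
MB = 1024 * 1024


def should_disconnect(samples, hard, soft, soft_seconds):
    """Decide whether a client would be dropped.

    samples: list of (time_in_seconds, buffer_size_in_bytes), in time order.
    A limit of 0 means "no limit", as in redis.conf.
    """
    over_soft_since = None
    for t, size in samples:
        if hard and size >= hard:
            return True  # hard limit: disconnect immediately
        if soft and size >= soft:
            if over_soft_since is None:
                over_soft_since = t
            if t - over_soft_since >= soft_seconds:
                return True  # stayed over the soft limit for too long
        else:
            over_soft_since = None  # dropped below the soft limit: reset the clock
    return False


# Default Pub/Sub class limits: 32 MB hard, 8 MB soft for 60 seconds.
pubsub = dict(hard=32 * MB, soft=8 * MB, soft_seconds=60)

burst = should_disconnect([(0, 33 * MB)], **pubsub)
slow_consumer = should_disconnect([(0, 10 * MB), (30, 10 * MB), (60, 10 * MB)], **pubsub)
recovered = should_disconnect([(0, 10 * MB), (30, 1 * MB), (90, 10 * MB)], **pubsub)
```

With these inputs the burst and the slow consumer are dropped, while the client that drains its buffer in between is kept.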
  • 122.
    CONNECTION HANDLING CONTINUED  Query buffer  Limited to 1 GB, which is very high and generally not reached  Client timeout  By default clients are never timed out  A timeout value in seconds can be specified; it is not applied to Pub/Sub clients
  • 123.
  • 124.
  • 125.
    SENTINEL  Sentinel provides high availability for Redis  Helps deployments resist certain failures without human intervention  Sentinel capabilities  Monitoring  Notification  Automatic failover  Configuration provider
  • 126.
    SENTINEL CONTINUED  A deployment runs multiple Sentinel instances  Multiple Sentinel instances cooperate in decision making  Multiple Sentinels must agree that a master is no longer available  This lowers the probability of false positives  Sentinel works even if not all the Sentinel processes are working  This keeps Sentinel itself from becoming a single point of failure
  • 127.
    SENTINEL CONFIGURATION  sentinel monitor mymaster 127.0.0.1 6379 2  sentinel down-after-milliseconds mymaster 60000  sentinel failover-timeout mymaster 180000  sentinel parallel-syncs mymaster 1
  • 128.
    SENTINEL QUORUM  Quorum is the number of Sentinels that need to agree that the master is unreachable  Quorum is defined in sentinel.conf, as the last argument of the sentinel monitor line  Quorum is only used to detect the failure  To perform a failover, one of the Sentinels needs to be elected leader  The leader is elected with the votes of the majority of Sentinel processes
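The difference between quorum (detecting the failure) and majority (authorizing the failover) is easy to get wrong, so here is a small arithmetic sketch. The function `sentinel_status` is hypothetical, and it assumes the elected leader needs votes from the larger of the majority and the quorum.

```python
def sentinel_status(total: int, reachable: int, quorum: int):
    """Return (failure_detected, failover_authorized) for a group of Sentinels.

    total:     number of Sentinel processes that are configured
    reachable: Sentinels that are up, can talk to each other, and agree the master is down
    quorum:    value from the 'sentinel monitor' line
    """
    majority = total // 2 + 1
    failure_detected = reachable >= quorum
    failover_authorized = reachable >= max(majority, quorum)
    return failure_detected, failover_authorized


# Five Sentinels, quorum 2:
print(sentinel_status(total=5, reachable=2, quorum=2))  # (True, False)
print(sentinel_status(total=5, reachable=3, quorum=2))  # (True, True)
```

With only two of five Sentinels left, the master is flagged as down but no failover starts, because a leader cannot collect a majority of votes.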
  • 129.
    RUNNING SENTINEL  There are two ways to run Sentinel  redis-sentinel /path/to/sentinel.conf  redis-server /path/to/sentinel.conf --sentinel  sentinel.conf is mandatory  Sentinel listens on port 26379 by default
  • 130.
    RUNNING SENTINEL CONTINUED  sentinel master mymaster  Shows information about the mymaster instance  sentinel slaves mymaster  sentinel sentinels mymaster
  • 131.
    TESTING FAILOVER  redis-cli -p 6379 DEBUG sleep 60  Makes Redis sleep for 60 seconds, so it is no longer reachable  After some time, ask Sentinel about the master  SENTINEL get-master-addr-by-name mymaster  You will get a different master
  • 132.
    SENTINEL BEST PRACTICES  Deploy at least three Sentinels  Sentinels can be deployed on the same nodes as the master and slaves
  • 133.
  • 134.
  • 135.
  • 136.
    REDIS USE CASES  Real-time delivery architecture at Twitter  https://www.infoq.com/presentations/Real-Time-Delivery-Twitter  GitHub repositories  https://github.com/blog/530-how-we-made-github-fast  Pinterest follower model  https://engineering.pinterest.com/blog/building-follower-model-scratch
  • 137.
    REDIS USE CASES CONTINUED  Craigslist  https://blog.zawodny.com/2011/02/26/redis-sharding-at-craigslist/  Stack Exchange  https://goo.gl/Xvqtq0  Flickr  https://goo.gl/EVni0g
  • 138.
  • 139.
    RAJESH KUMAR Thanks a lot Rajesh.Jangra@gmail.com