SlideShare a Scribd company logo
1 of 24
Download to read offline
"Little Server of Awesome"


       2011 Dvir Volk

      Software Architect, Do@
   dvir@doat.com http://doat.com
What is redis
● Memcache-ish in-memory key/value store
● But it's also persistent!
● And it also has very cool value types:
    ○ lists
    ○ sets
    ○ sorted sets
    ○ hash tables
    ○ append-able buffers
● Open source; very helpful and friendly community.
  Development is very active and responsive to requests.
● Sponsored by VMWare
● Used in the real world: github, craigslist, engineyard, ...
● Used heavily in do@ as a front-end database, search, geo
  resolving
Key Features and Cool Stuff
● All data is in memory (almost)
● All data is eventually persistent (But can be immediately)
● Handles huge workloads easily
● Mostly O(1) behavior
● Ideal for write-heavy workloads
● Support for atomic operations
● Supports for transactions
● Has pub/sub functionality
● Tons of client libraries for all major languages
● Single threaded, uses aync. IO
● Internal scripting with LUA coming soon
A little benchmark

This is on my laptop (core i7 @2.2Ghz)

 ● SET: 187265.92 requests per second
 ● GET: 185185.17 requests per second
 ● INCR: 190114.06 requests per second


 ● LPUSH: 190114.06 requests per second
 ● LPOP: 187090.73 requests per second

 ●
 ● SADD: 186567.16 requests per second
 ● SPOP: 185873.61 requests per second
Scaling it up

 ● Master-slave replication out of the box

 ● Slaves can be made masters on the fly

 ● Currently does not support "real" clustered mode....

 ● ... But Redis-Cluster to be released soon

 ● You can manually shard it client side

 ● Single threaded - run num_cores/2 instances on the same
   machine
Persistence
● All data is synchronized to disk - eventually or immediately
● Pick your risk level Vs. performance
● Data is either dumped in a forked process, or written as a
  append-only change-log (AOF)
● Append-only mode supports transactional disk writes so you
  can lose no data (cost: 99% speed loss :) )
● AOF files get huge, but redis can minimize them on the fly.
● You can save the state explicitly, background or blocking

● Default configuration:
   ○ Save after 900 sec (15 min) if at least 1 key changed
   ○ Save after 300 sec (5 min) if at least 10 keys changed
   ○ Save after 60 sec if at least 10000 keys changed
Virtual Memory

● If your database is too big - redis can handle swapping on
  its own.

● Keys remain in memory and least used values are swapped
  to disk.

● Swapping IO happens in separate threads

● But if you need this - don't use redis, or get a bigger
  machine ;)
Show me the features!

Now let's see the key featurs:

 ● Get/Set/Incr - strings/numbers
 ● Lists
 ● Sets
 ● Sorted Sets
 ● Hash Tables
 ● PubSub
 ● SORT
 ● Transactions

We'll use redis-cli for the examples.
Some of the output has been modified for readability.
The basics...
Get/Sets - nothing fancy. Keys are strings, anything goes - just quote spaces.
redis> SET foo "bar"
OK
redis> GET foo
"bar"

You can atomically increment numbers
redis> SET bar 337
OK
redis> INCRBY bar 1000
(integer) 1337

Getting multiple values at once
redis> MGET foo bar
1. "bar"
2. "1337"

Keys are lazily expired
redis> EXPIRE foo 1
(integer) 1
redis> GET foo
(nil)
Be careful with EXPIRE - re-setting a value without re-expiring it will remove the
expiration!
Atomic Operations
GETSET puts a different value inside a key, retriving the old one
redis> SET foo bar
OK
redis> GETSET foo baz
"bar"
redis> GET foo
"baz"

SETNX sets a value only if it does not exist
redis> SETNX foo bar
*OK*
redis> SETNX foo baz
*FAILS*

SETNX + Timestamp => Named Locks! w00t!
redis> SETNX myLock <current_time>
OK
redis> SETNX myLock <new_time>
*FAILS*

Note that If the locking client crashes that might cause some problems, but it can be solved
easily.
List operations
  ● Lists are your ordinary linked lists.
  ● You can push and pop at both sides, extract range, resize,
    etc.
  ● Random access and ranges at O(N)! :-(
redis> LPUSH foo bar
(integer) 1

redis> LPUSH foo baz
(integer) 2

redis> LRANGE foo 0 2
1. "baz"
2. "bar"

redis> LPOP foo
"baz"

      ● BLPOP: Blocking POP - wait until a list has elements and pop them. Useful for realtime stuff.
redis> BLPOP baz 10 [seconds]
..... We wait!
Set operations
  ● Sets are... well, sets of unique values w/ push, pop, etc.
  ● Sets can be intersected/diffed /union'ed server side.
  ● Can be useful as keys when building complex schemata.
redis> SADD foo bar
(integer) 1
redis> SADD foo baz
(integer) 1
redis> SMEMBERS foo
["baz", "bar"]

redis> SADD foo2 baz // << another set
(integer) 1
redis> SADD foo2 raz
(integer) 1

redis> SINTER foo foo2 // << only one common element
1. "baz"
redis> SUNION foo foo2 // << UNION
["raz", "bar", "baz"]
Sorted Sets
 ● Same as sets, but with score per element
 ● Ranked ranges, aggregation of scores on INTERSECT
 ● Can be used as ordered keys in complex schemata
 ● Think timestamps, inverted index, geohashing, ip ranges
 redis> ZADD foo 1337 hax0r       redis> ZRANGE foo 0 10
 (integer) 1                      1. "luser"
 redis> ZADD foo 100 n00b         2. "hax0r"
 (integer) 1                      3. "n00b"
 redis> ZADD foo 500 luser
 (integer) 1                      redis> ZREVRANGE foo 0 10
                                  1. "n00b"
 redis> ZSCORE foo n00b           2. "hax0r"
 "100"                            3. "luser"

 redis> ZINCRBY foo 2000 n00b
 "2100"

 redis> ZRANK foo n00b
 (integer) 2
Hashes
 ● Hash tables as values
 ● Think of an object store with atomic access to object
   members

 redis> HSET foo bar 1             redis> HINCRBY foo bar 1
 (integer) 1                       (integer) 2
 redis> HSET foo baz 2
 (integer) 1                       redis> HGET foo bar
 redis> HSET foo foo foo           "2"
 (integer) 1
                                   redis> HKEYS foo
 redis> HGETALL foo                1. "bar"
 {                                 2. "baz"
   "bar": "1",                     3. "foo"
   "baz": "2",
   "foo": "foo"
 }
PubSub - Publish/Subscribe
 ● Clients can subscribe to channels or patterns and receive
   notifications when messages are sent to channels.
 ● Subscribing is O(1), posting messages is O(n)
 ● Think chats, Comet applications: real-time analytics, twitter
 redis> subscribe feed:joe feed:moe feed:
 boe

 //now we wait
 ....                                        redis> publish feed:joe "all your base are
                           <<<<<----------   belong to me"
 1. "message"                                (integer) 1 //received by 1
 2. "feed:joe"
 3. "all your base are belong to me"
SORT FTW!
  ● Key redis awesomeness
  ● Sort SETs or LISTS using external values, and join values
    in one go:

SORT key
SORT key BY pattern (e.g. sort userIds BY user:*->age)
SORT key BY pattern GET othervalue

SORT userIds BY user:*->age GET user:*->name

  ● ASC|DESC, LIMIT available, results can be stored, sorting
    can be numeric or alphabetic

  ● Keep in mind that it's blocking and redis is single threaded.
    Maybe put a slave aside if you have big SORTs
Transactions
  ● MULTI, ...., EXEC: Easy because of the single thread.
  ● All commands are executed after EXEC, block and return
    values for the commands as a list.
  ● Example:
redis> MULTI
OK
redis> SET "foo" "bar"
QUEUED
redis> INCRBY "num" 1
QUEUED
redis> EXEC
1) OK
2) (integer) 1

  ● Transactions can be discarded with DISCARD.

  ● WATCH allows you to lock keys while you are queuing your
    transaction, and avoid race conditions.
Gotchas, Lessons Learned
● Memory fragmentation can be a problem with some usage
  patterns. Alternative allocators (jemalloc, tcmalloc) ease
  that.

● Horrible bug with Ubuntu 10.x servers and amazon EC2
  machines [resulted in long, long nights at the office...]

● 64 bit instances consume much much more RAM.

● Master/Slave sync far from perfect.

● DO NOT USE THE KEYS COMMAND!!!

● vm.overcommit_memory = 1

● Use MONITOR to see what's going on
Example: *Very* Simple Social Feed
#let's add a couple of followers
>>> client.rpush('user:1:followers', 2)
>>> numFollowers = client.rpush('user:1:followers', 3)
>>> msgId = client.incr('messages:id') #ATOMIC OPERATION

#add a message
>>> client.hmset('messages:%s' % msgId, {'text': 'hello world', 'user': 1})

#distribute to followers
>>> followers = client.lrange('user:1:followers', 0, numFollowers)

>>> pipe = client.pipeline()
>>> for f in followers:
  pipe.rpush('user:%s:feed' % f, msgId)
>>> pipe.execute()

>>> msgId = client.incr('messages:id') #increment id
#....repeat...repeat..repeat..repeat..
#now get user 2's feed
>>> client.sort(name = 'user:2:feed', get='messages:*->text')
['hello world', 'foo bar']
Other use case ideas
● Geo Resolving with geohashing
● Implemented and opened by yours truly https://github.com/doat/geodis

● Real time analytics
● use ZSET, SORT, INCR of values

● API Key and rate management
● Very fast key lookup, rate control counters using INCR

● Real time game data
● ZSETs for high scores, HASHES for online users, etc

● Database Shard Index
● map key => database id. Count size with SETS

● Comet - no polling ajax
● use BLPOP or pub/sub

● Queue Server
● resque - a large portion of redis' user base
Melt - My little evil master-plan
● We wanted freakin' fast access to data on the front-end.

● but our ability to cache personalized and query bound data
  is limited.

● Redis to the rescue!

● But we still want the data to be in an RDBMs.

● So we made a framework to "melt the borders" between
  them...
Introducing melt
● ALL front end data is in RAM, denormalized and optimized for
  speed. Front end talks only to Redis.

● We use Redis' set features as keys and scoring vectors.

● All back end data is on mysql, with a manageable normalized
  schema. The admin talks only to MySQL.

● A sync queue in the middle keeps both ends up to date.

● A straightforward ORM is used to manage and sync the data.

● Automates indexing in Redis, generates models from MySQL.

● Use the same model on both ends, or create conversions.

● Central Id generator.
Melt - an example:
#syncing objects:
with MySqlStore:
  users = Users.get({Users.id: Int(1,2,3,4)})
  with RedisStore:
     for user in users:
        Users.save(user)


#pushing a new feed item from front to back:
with RedisStore:
  #create an object - any object!
  feedItem = FeedItem(userId, title, time.time())
  #use the model to save it
  Feed.save(feedItem)
  #now just tell the queue to put it on the other side
  SyncQueue.pushItem(action = 'update', model = FeedItem,
    source = 'redis', dest = 'mysql',
     id = feedItem.id)

Coming soon to a github near you! :)
More resources

Redis' website:
http://redis.io

Excellent and more detailed presentation by Simon Willison:
http://simonwillison.net/static/2010/redis-tutorial/

Much more complex twitter clone:
http://code.google.com/p/redis/wiki/TwitterAlikeExample

Full command reference:
http://code.google.com/p/redis/wiki/CommandReference

More Related Content

What's hot

From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.Taras Matyashovsky
 
DNS Security Presentation ISSA
DNS Security Presentation ISSADNS Security Presentation ISSA
DNS Security Presentation ISSASrikrupa Srivatsan
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explainedconfluent
 
APACHE KAFKA / Kafka Connect / Kafka Streams
APACHE KAFKA / Kafka Connect / Kafka StreamsAPACHE KAFKA / Kafka Connect / Kafka Streams
APACHE KAFKA / Kafka Connect / Kafka StreamsKetan Gote
 
Spark shuffle introduction
Spark shuffle introductionSpark shuffle introduction
Spark shuffle introductioncolorant
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkFlink Forward
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for ExperimentationGleb Kanterov
 
Thousands of Threads and Blocking I/O
Thousands of Threads and Blocking I/OThousands of Threads and Blocking I/O
Thousands of Threads and Blocking I/OGeorge Cao
 
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsApache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsAlluxio, Inc.
 
Building Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta LakeBuilding Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta LakeFlink Forward
 
RocksDB detail
RocksDB detailRocksDB detail
RocksDB detailMIJIN AN
 
HBase in Practice
HBase in PracticeHBase in Practice
HBase in Practicelarsgeorge
 
HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseHBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseenissoz
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiDatabricks
 
Apache Hadoop YARN: best practices
Apache Hadoop YARN: best practicesApache Hadoop YARN: best practices
Apache Hadoop YARN: best practicesDataWorks Summit
 
Unique ID generation in distributed systems
Unique ID generation in distributed systemsUnique ID generation in distributed systems
Unique ID generation in distributed systemsDave Gardner
 
Deep Dive into Apache Kafka
Deep Dive into Apache KafkaDeep Dive into Apache Kafka
Deep Dive into Apache Kafkaconfluent
 
Build Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBBuild Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBScyllaDB
 

What's hot (20)

Apache Spark Architecture
Apache Spark ArchitectureApache Spark Architecture
Apache Spark Architecture
 
From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.
 
Intro to HBase
Intro to HBaseIntro to HBase
Intro to HBase
 
DNS Security Presentation ISSA
DNS Security Presentation ISSADNS Security Presentation ISSA
DNS Security Presentation ISSA
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explained
 
APACHE KAFKA / Kafka Connect / Kafka Streams
APACHE KAFKA / Kafka Connect / Kafka StreamsAPACHE KAFKA / Kafka Connect / Kafka Streams
APACHE KAFKA / Kafka Connect / Kafka Streams
 
Spark shuffle introduction
Spark shuffle introductionSpark shuffle introduction
Spark shuffle introduction
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in Flink
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for Experimentation
 
Thousands of Threads and Blocking I/O
Thousands of Threads and Blocking I/OThousands of Threads and Blocking I/O
Thousands of Threads and Blocking I/O
 
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsApache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic Datasets
 
Building Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta LakeBuilding Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta Lake
 
RocksDB detail
RocksDB detailRocksDB detail
RocksDB detail
 
HBase in Practice
HBase in PracticeHBase in Practice
HBase in Practice
 
HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseHBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBase
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and Hudi
 
Apache Hadoop YARN: best practices
Apache Hadoop YARN: best practicesApache Hadoop YARN: best practices
Apache Hadoop YARN: best practices
 
Unique ID generation in distributed systems
Unique ID generation in distributed systemsUnique ID generation in distributed systems
Unique ID generation in distributed systems
 
Deep Dive into Apache Kafka
Deep Dive into Apache KafkaDeep Dive into Apache Kafka
Deep Dive into Apache Kafka
 
Build Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBBuild Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDB
 

Similar to Introduction to Redis

Introduction to redis - version 2
Introduction to redis - version 2Introduction to redis - version 2
Introduction to redis - version 2Dvir Volk
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to RedisRizky Abdilah
 
REDIS intro and how to use redis
REDIS intro and how to use redisREDIS intro and how to use redis
REDIS intro and how to use redisKris Jeong
 
Redis SoCraTes 2014
Redis SoCraTes 2014Redis SoCraTes 2014
Redis SoCraTes 2014steffenbauer
 
Paris Redis Meetup Introduction
Paris Redis Meetup IntroductionParis Redis Meetup Introduction
Paris Redis Meetup IntroductionGregory Boissinot
 
MongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: ShardingMongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: ShardingMongoDB
 
Managing terabytes: When Postgres gets big
Managing terabytes: When Postgres gets bigManaging terabytes: When Postgres gets big
Managing terabytes: When Postgres gets bigSelena Deckelmann
 
BlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for CephBlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for CephSage Weil
 
Managing terabytes: When PostgreSQL gets big
Managing terabytes: When PostgreSQL gets bigManaging terabytes: When PostgreSQL gets big
Managing terabytes: When PostgreSQL gets bigSelena Deckelmann
 
MySQL 5.7 in a Nutshell
MySQL 5.7 in a NutshellMySQL 5.7 in a Nutshell
MySQL 5.7 in a NutshellEmily Ikuta
 
Redis modules 101
Redis modules 101Redis modules 101
Redis modules 101Dvir Volk
 
What's new in Redis v3.2
What's new in Redis v3.2What's new in Redis v3.2
What's new in Redis v3.2Itamar Haber
 
Redis - Usability and Use Cases
Redis - Usability and Use CasesRedis - Usability and Use Cases
Redis - Usability and Use CasesFabrizio Farinacci
 
BlueStore, A New Storage Backend for Ceph, One Year In
BlueStore, A New Storage Backend for Ceph, One Year InBlueStore, A New Storage Backend for Ceph, One Year In
BlueStore, A New Storage Backend for Ceph, One Year InSage Weil
 

Similar to Introduction to Redis (20)

Introduction to redis - version 2
Introduction to redis - version 2Introduction to redis - version 2
Introduction to redis - version 2
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
REDIS intro and how to use redis
REDIS intro and how to use redisREDIS intro and how to use redis
REDIS intro and how to use redis
 
Redis SoCraTes 2014
Redis SoCraTes 2014Redis SoCraTes 2014
Redis SoCraTes 2014
 
Paris Redis Meetup Introduction
Paris Redis Meetup IntroductionParis Redis Meetup Introduction
Paris Redis Meetup Introduction
 
Bluestore
BluestoreBluestore
Bluestore
 
Bluestore
BluestoreBluestore
Bluestore
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
Redis introduction
Redis introductionRedis introduction
Redis introduction
 
MongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: ShardingMongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: Sharding
 
Kyotoproducts
KyotoproductsKyotoproducts
Kyotoproducts
 
Managing terabytes: When Postgres gets big
Managing terabytes: When Postgres gets bigManaging terabytes: When Postgres gets big
Managing terabytes: When Postgres gets big
 
BlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for CephBlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for Ceph
 
Managing terabytes: When PostgreSQL gets big
Managing terabytes: When PostgreSQL gets bigManaging terabytes: When PostgreSQL gets big
Managing terabytes: When PostgreSQL gets big
 
MySQL 5.7 in a Nutshell
MySQL 5.7 in a NutshellMySQL 5.7 in a Nutshell
MySQL 5.7 in a Nutshell
 
Redis modules 101
Redis modules 101Redis modules 101
Redis modules 101
 
Programar para GPUs
Programar para GPUsProgramar para GPUs
Programar para GPUs
 
What's new in Redis v3.2
What's new in Redis v3.2What's new in Redis v3.2
What's new in Redis v3.2
 
Redis - Usability and Use Cases
Redis - Usability and Use CasesRedis - Usability and Use Cases
Redis - Usability and Use Cases
 
BlueStore, A New Storage Backend for Ceph, One Year In
BlueStore, A New Storage Backend for Ceph, One Year InBlueStore, A New Storage Backend for Ceph, One Year In
BlueStore, A New Storage Backend for Ceph, One Year In
 

More from Dvir Volk

Searching Billions of Documents with Redis
Searching Billions of Documents with RedisSearching Billions of Documents with Redis
Searching Billions of Documents with RedisDvir Volk
 
Boosting Machine Learning with Redis Modules and Spark
Boosting Machine Learning with Redis Modules and SparkBoosting Machine Learning with Redis Modules and Spark
Boosting Machine Learning with Redis Modules and SparkDvir Volk
 
Tales Of The Black Knight - Keeping EverythingMe running
Tales Of The Black Knight - Keeping EverythingMe runningTales Of The Black Knight - Keeping EverythingMe running
Tales Of The Black Knight - Keeping EverythingMe runningDvir Volk
 
10 reasons to be excited about go
10 reasons to be excited about go10 reasons to be excited about go
10 reasons to be excited about goDvir Volk
 
Kicking ass with redis
Kicking ass with redisKicking ass with redis
Kicking ass with redisDvir Volk
 
Introduction to Thrift
Introduction to ThriftIntroduction to Thrift
Introduction to ThriftDvir Volk
 

More from Dvir Volk (7)

RediSearch
RediSearchRediSearch
RediSearch
 
Searching Billions of Documents with Redis
Searching Billions of Documents with RedisSearching Billions of Documents with Redis
Searching Billions of Documents with Redis
 
Boosting Machine Learning with Redis Modules and Spark
Boosting Machine Learning with Redis Modules and SparkBoosting Machine Learning with Redis Modules and Spark
Boosting Machine Learning with Redis Modules and Spark
 
Tales Of The Black Knight - Keeping EverythingMe running
Tales Of The Black Knight - Keeping EverythingMe runningTales Of The Black Knight - Keeping EverythingMe running
Tales Of The Black Knight - Keeping EverythingMe running
 
10 reasons to be excited about go
10 reasons to be excited about go10 reasons to be excited about go
10 reasons to be excited about go
 
Kicking ass with redis
Kicking ass with redisKicking ass with redis
Kicking ass with redis
 
Introduction to Thrift
Introduction to ThriftIntroduction to Thrift
Introduction to Thrift
 

Recently uploaded

WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 

Recently uploaded (20)

WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 

Introduction to Redis

  • 1. "Little Server of Awesome" 2011 Dvir Volk Software Architect, Do@ dvir@doat.com http://doat.com
  • 2. What is redis ● Memcache-ish in-memory key/value store ● But it's also persistent! ● And it also has very cool value types: ○ lists ○ sets ○ sorted sets ○ hash tables ○ append-able buffers ● Open source; very helpful and friendly community. Development is very active and responsive to requests. ● Sponsored by VMWare ● Used in the real world: github, craigslist, engineyard, ... ● Used heavily in do@ as a front-end database, search, geo resolving
  • 3. Key Features and Cool Stuff ● All data is in memory (almost) ● All data is eventually persistent (But can be immediately) ● Handles huge workloads easily ● Mostly O(1) behavior ● Ideal for write-heavy workloads ● Support for atomic operations ● Supports for transactions ● Has pub/sub functionality ● Tons of client libraries for all major languages ● Single threaded, uses aync. IO ● Internal scripting with LUA coming soon
  • 4. A little benchmark This is on my laptop (core i7 @2.2Ghz) ● SET: 187265.92 requests per second ● GET: 185185.17 requests per second ● INCR: 190114.06 requests per second ● LPUSH: 190114.06 requests per second ● LPOP: 187090.73 requests per second ● ● SADD: 186567.16 requests per second ● SPOP: 185873.61 requests per second
  • 5. Scaling it up ● Master-slave replication out of the box ● Slaves can be made masters on the fly ● Currently does not support "real" clustered mode.... ● ... But Redis-Cluster to be released soon ● You can manually shard it client side ● Single threaded - run num_cores/2 instances on the same machine
  • 6. Persistence ● All data is synchronized to disk - eventually or immediately ● Pick your risk level Vs. performance ● Data is either dumped in a forked process, or written as a append-only change-log (AOF) ● Append-only mode supports transactional disk writes so you can lose no data (cost: 99% speed loss :) ) ● AOF files get huge, but redis can minimize them on the fly. ● You can save the state explicitly, background or blocking ● Default configuration: ○ Save after 900 sec (15 min) if at least 1 key changed ○ Save after 300 sec (5 min) if at least 10 keys changed ○ Save after 60 sec if at least 10000 keys changed
  • 7. Virtual Memory ● If your database is too big - redis can handle swapping on its own. ● Keys remain in memory and least used values are swapped to disk. ● Swapping IO happens in separate threads ● But if you need this - don't use redis, or get a bigger machine ;)
  • 8. Show me the features! Now let's see the key featurs: ● Get/Set/Incr - strings/numbers ● Lists ● Sets ● Sorted Sets ● Hash Tables ● PubSub ● SORT ● Transactions We'll use redis-cli for the examples. Some of the output has been modified for readability.
  • 9. The basics... Get/Sets - nothing fancy. Keys are strings, anything goes - just quote spaces. redis> SET foo "bar" OK redis> GET foo "bar" You can atomically increment numbers redis> SET bar 337 OK redis> INCRBY bar 1000 (integer) 1337 Getting multiple values at once redis> MGET foo bar 1. "bar" 2. "1337" Keys are lazily expired redis> EXPIRE foo 1 (integer) 1 redis> GET foo (nil) Be careful with EXPIRE - re-setting a value without re-expiring it will remove the expiration!
  • 10. Atomic Operations GETSET puts a different value inside a key, retriving the old one redis> SET foo bar OK redis> GETSET foo baz "bar" redis> GET foo "baz" SETNX sets a value only if it does not exist redis> SETNX foo bar *OK* redis> SETNX foo baz *FAILS* SETNX + Timestamp => Named Locks! w00t! redis> SETNX myLock <current_time> OK redis> SETNX myLock <new_time> *FAILS* Note that If the locking client crashes that might cause some problems, but it can be solved easily.
  • 11. List operations ● Lists are your ordinary linked lists. ● You can push and pop at both sides, extract range, resize, etc. ● Random access and ranges at O(N)! :-( redis> LPUSH foo bar (integer) 1 redis> LPUSH foo baz (integer) 2 redis> LRANGE foo 0 2 1. "baz" 2. "bar" redis> LPOP foo "baz" ● BLPOP: Blocking POP - wait until a list has elements and pop them. Useful for realtime stuff. redis> BLPOP baz 10 [seconds] ..... We wait!
  • 12. Set operations ● Sets are... well, sets of unique values w/ push, pop, etc. ● Sets can be intersected/diffed /union'ed server side. ● Can be useful as keys when building complex schemata. redis> SADD foo bar (integer) 1 redis> SADD foo baz (integer) 1 redis> SMEMBERS foo ["baz", "bar"] redis> SADD foo2 baz // << another set (integer) 1 redis> SADD foo2 raz (integer) 1 redis> SINTER foo foo2 // << only one common element 1. "baz" redis> SUNION foo foo2 // << UNION ["raz", "bar", "baz"]
  • 13. Sorted Sets ● Same as sets, but with score per element ● Ranked ranges, aggregation of scores on INTERSECT ● Can be used as ordered keys in complex schemata ● Think timestamps, inverted index, geohashing, ip ranges redis> ZADD foo 1337 hax0r redis> ZRANGE foo 0 10 (integer) 1 1. "luser" redis> ZADD foo 100 n00b 2. "hax0r" (integer) 1 3. "n00b" redis> ZADD foo 500 luser (integer) 1 redis> ZREVRANGE foo 0 10 1. "n00b" redis> ZSCORE foo n00b 2. "hax0r" "100" 3. "luser" redis> ZINCRBY foo 2000 n00b "2100" redis> ZRANK foo n00b (integer) 2
  • 14. Hashes ● Hash tables as values ● Think of an object store with atomic access to object members redis> HSET foo bar 1 redis> HINCRBY foo bar 1 (integer) 1 (integer) 2 redis> HSET foo baz 2 (integer) 1 redis> HGET foo bar redis> HSET foo foo foo "2" (integer) 1 redis> HKEYS foo redis> HGETALL foo 1. "bar" { 2. "baz" "bar": "1", 3. "foo" "baz": "2", "foo": "foo" }
  • 15. PubSub - Publish/Subscribe ● Clients can subscribe to channels or patterns and receive notifications when messages are sent to channels. ● Subscribing is O(1), posting messages is O(n) ● Think chats, Comet applications: real-time analytics, twitter redis> subscribe feed:joe feed:moe feed: boe //now we wait .... redis> publish feed:joe "all your base are <<<<<---------- belong to me" 1. "message" (integer) 1 //received by 1 2. "feed:joe" 3. "all your base are belong to me"
  • 16. SORT FTW! ● Key redis awesomeness ● Sort SETs or LISTS using external values, and join values in one go: SORT key SORT key BY pattern (e.g. sort userIds BY user:*->age) SORT key BY pattern GET othervalue SORT userIds BY user:*->age GET user:*->name ● ASC|DESC, LIMIT available, results can be stored, sorting can be numeric or alphabetic ● Keep in mind that it's blocking and redis is single threaded. Maybe put a slave aside if you have big SORTs
  • 17. Transactions ● MULTI, ...., EXEC: Easy because of the single thread. ● All commands are executed after EXEC, block and return values for the commands as a list. ● Example: redis> MULTI OK redis> SET "foo" "bar" QUEUED redis> INCRBY "num" 1 QUEUED redis> EXEC 1) OK 2) (integer) 1 ● Transactions can be discarded with DISCARD. ● WATCH allows you to lock keys while you are queuing your transaction, and avoid race conditions.
  • 18. Gotchas, Lessons Learned ● Memory fragmentation can be a problem with some usage patterns. Alternative allocators (jemalloc, tcmalloc) ease that. ● Horrible bug with Ubuntu 10.x servers and amazon EC2 machines [resulted in long, long nights at the office...] ● 64 bit instances consume much much more RAM. ● Master/Slave sync far from perfect. ● DO NOT USE THE KEYS COMMAND!!! ● vm.overcommit_memory = 1 ● Use MONITOR to see what's going on
  • 19. Example: *Very* Simple Social Feed #let's add a couple of followers >>> client.rpush('user:1:followers', 2) >>> numFollowers = client.rpush('user:1:followers', 3) >>> msgId = client.incr('messages:id') #ATOMIC OPERATION #add a message >>> client.hmset('messages:%s' % msgId, {'text': 'hello world', 'user': 1}) #distribute to followers >>> followers = client.lrange('user:1:followers', 0, numFollowers) >>> pipe = client.pipeline() >>> for f in followers: pipe.rpush('user:%s:feed' % f, msgId) >>> pipe.execute() >>> msgId = client.incr('messages:id') #increment id #....repeat...repeat..repeat..repeat.. #now get user 2's feed >>> client.sort(name = 'user:2:feed', get='messages:*->text') ['hello world', 'foo bar']
  • 20. Other use case ideas ● Geo Resolving with geohashing ● Implemented and opened by yours truly https://github.com/doat/geodis ● Real time analytics ● use ZSET, SORT, INCR of values ● API Key and rate management ● Very fast key lookup, rate control counters using INCR ● Real time game data ● ZSETs for high scores, HASHES for online users, etc ● Database Shard Index ● map key => database id. Count size with SETS ● Comet - no polling ajax ● use BLPOP or pub/sub ● Queue Server ● resque - a large portion of redis' user base
  • 21. Melt - My little evil master-plan ● We wanted freakin' fast access to data on the front-end. ● but our ability to cache personalized and query bound data is limited. ● Redis to the rescue! ● But we still want the data to be in an RDBMs. ● So we made a framework to "melt the borders" between them...
  • 22. Introducing melt ● ALL front end data is in RAM, denormalized and optimized for speed. Front end talks only to Redis. ● We use Redis' set features as keys and scoring vectors. ● All back end data is on mysql, with a manageable normalized schema. The admin talks only to MySQL. ● A sync queue in the middle keeps both ends up to date. ● A straightforward ORM is used to manage and sync the data. ● Automates indexing in Redis, generates models from MySQL. ● Use the same model on both ends, or create conversions. ● Central Id generator.
  • 23. Melt - an example: #syncing objects: with MySqlStore: users = Users.get({Users.id: Int(1,2,3,4)}) with RedisStore: for user in users: Users.save(user) #pushing a new feed item from front to back: with RedisStore: #create an object - any object! feedItem = FeedItem(userId, title, time.time()) #use the model to save it Feed.save(feedItem) #now just tell the queue to put it on the other side SyncQueue.pushItem(action = 'update', model = FeedItem, source = 'redis', dest = 'mysql', id = feedItem.id) Coming soon to a github near you! :)
  • 24. More resources Redis' website: http://redis.io Excellent and more detailed presentation by Simon Willison: http://simonwillison.net/static/2010/redis-tutorial/ Much more complex twitter clone: http://code.google.com/p/redis/wiki/TwitterAlikeExample Full command reference: http://code.google.com/p/redis/wiki/CommandReference