SlideShare a Scribd company logo
Kicking Ass With



Redis for real world problems
Dvir Volk, Chief Architect, Everything.me (@dvirsky)
O HAI! I CAN HAS REDIS?
Extremely Quick introduction to Redis
● Key => Data Structure server
● In memory, with persistence
● Extremely fast and versatile
● Rapidly growing (Instagr.am, Craigslist, Youporn ....)
● Open Source, awesome community
● Used as the primary data source in Everything.me:
    ○ Relational Data
    ○ Queueing
    ○ Caching
    ○ Machine Learning
    ○ Text Processing and search
    ○ Geo Stuff
Key => { Data Structures }

         "I'm a Plain Text String!"                    Strings/Blobs/Bitmaps

         Key1                     Val1
                                                       Hash Tables (objects!)
         Key2                     Val 2


 Key       C        B         B           A      C     Linked Lists


            A           B            C         D
                                                       Sets


                                                       Sorted Sets
          A: 0.1     B: 0.3        C: 500     D: 500
Redis is like Lego for Data
● Yes, It can be used as a simple KV store.
● But to really Use it, you need to think of it as a tool set.
● You have a nail - redis is a hammer building toolkit.
● That can make almost any kind of hammer.
● Learning how to efficiently model your problem is the
  Zen of Redis.
● Here are a few examples...
Pattern 1: Simple, Fast, Object Store
Our problem:
● Very fast object store that scales up well.
● High write throughput.
● Atomic manipulation of object members.

Possible use cases:
● Online user data (session, game state)
● Social Feed
● Shopping Cart
● Anything, really...
Storing users as HASHes

              email      john@domain.com

              name       John
    users:1
              Password   aebc65feae8b

              id         1



              email      Jane@domain.com

              name       Jane
    users:2
              Password   aebc65ab117b

              id         2
Redis Pattern 1
● Each object is saved as a HASH.
● Hash objects are { key=> string/number }
● No JSON & friends serialization overhead.
● Complex members and relations are stored
  as separate HASHes.
● Atomic set / increment / getset members.
● Use INCR for centralized incremental ids.
● Load objects with HGETALL / HMGET
Objects as Hashes
class User(RedisObject):                   > INCR users:id
    def __init__(email, name, password):   (integer) 1
        self.email = email
        self.name = name                   > HMSET "users:1"
        self.password = password                "email" "user@domain.com"
        self.id = self.createId()               "name" "John"
                                                "password" "1234"
user = User('user@domain.com', 'John',     OK
'1234)                                     > HGETALL "users:1"
                                           { "email": "user@domain.com", ... }
user.save()
Performance with growing data
Pattern 2: Object Indexing
The problem:
● We want to index the objects we saved by
  various criteria.
● We want to rank and sort them quickly.
● We want to be able to update an index
  quickly.
Use cases:
● Tagging
● Real-Time score tables
● Social Feed Views
Indexing with Sorted Sets

                                    k:users:email
          email   john@domain.com
                                    user:1 => 1789708973
          name    John
users:1
          score   300               user:2 => 2361572523

          id      1
                                    ....



          email   Jane@domain.com
                                    k:users:score
          name    Jane
                                    user:2 => 250
users:2
          score   250
                                    user:1 => 300
          id      2

                                    user:3 => 300
Redis Pattern
●   Indexes are sorted sets (ZSETs)
●   Access by value O(1), by score O(log(N)). plus ranges.
●   Sorted Sets map { value => score (double) }
●   So we map { objectId => score }
●   For numerical members, the value is the score
●   For string members, the score is a hash of the string.

● Fetching is done with ZRANGEBYSCORE
● Ranges with ZRANGE / ZRANGEBYSCORE on
  numeric values only (or very short strings)
● Deleting is done with ZREM
● Intersecting keys is possible with ZINTERSTORE
● Each class' objects have a special sorted set for ids.
Automatic Keys for objects
class User(RedisObject):                 > ZADD k:users:email 238927659283691 "1"
                                         1
    _keySpec = KeySpec(
        UnorderedKey('email'),           > ZADD k:users:name 9283498696113 "1"
        UnorderedKey('name'),            1
        OrderedNumericalKey('points')    > ZADD k:users:points 300 "1"
    )
    ....                                 1
                                         > ZREVRANGE k:users:points 0 20 withscores
#creating the users - now with points    1) "1"
user = User('user@domain.com', 'John',   2) "300"
'1234', points = 300)
                                         > ZRANGEBYSCORE k:users:email 238927659283691
                                         238927659283691
#saving auto-indexes                     1) "1"
user.save()
                                         redis 127.0.0.1:6379> HGETALL users:1
                                         { .. }
#range query on rank
users = User.getByRank(0,20)


#get by name
users = User.get(name = 'John')
Pattern 3: Unique Value Counter
The problem:
● We want an efficient way to measure
  cardinality of a set of objects over time.
● We may want it in real time.
● We don't want huge overhead.
Use Cases:
● Daily/Monthly Unique users
● Split by OS / country / whatever
● Real Time online users counter
Bitmaps to the rescue
Redis Pattern
● Redis strings can be treated as bitmaps.
● We keep a bitmap for each time slot.
● We use BITSET offset=<object id>
● the size of a bitmap is max_id/8 bytes
● Cardinality per slot with BITCOUNT (2.6)
● Fast bitwise operations - OR / AND / XOR
  between time slots with BITOP
● Aggregate and save results periodically.
● Requires sequential object ids - or mapping
  of (see incremental ids)
Counter API (with redis internals)
counter = BitmapCounter('uniques', timeResolutions=(RES_DAY,))

#sampling current users
counter.add(userId)
> BITSET uniques:day:1339891200 <userId> 1
#Getting the unique user count for today
counter.getCount(time.time())
> BITCOUNT uniques:day:1339891200
 
#Getting the the weekly unique users in the past week
timePoints = [now() - 86400*i for i in xrange(7, 0, -1)]
counter.aggregateCounts(timePoints, counter.OP_TOTAL)
> BITOP OR tmp_key uniques:day:1339891200 uniques:day:1339804800 ....
> BITCOUNT tmp_key
 
 
 
Pattern 4: Geo resolving
The Problem:
● Resolve lat,lon to real locations
● Find locations of a certain class (restaurants)
  near me
● IP2Location search

Use Cases:
● Find a user's City, ZIP code, Country, etc.
● Find the user's location by IP
A bit about geohashing
● Converts (lat,lon) into a single 64 bit hash (and back)
● The closer points are, their common prefix is generally bigger.
● Trimming more lower bits describes a larger bounding box.
● example:
   ○ Tel Aviv (32.0667, 34.7667) =>
      14326455945304181035
   ○ Netanya (32.3336, 34.8578) =>
      14326502174498709381
● We can use geohash as scores
  in sorted sets.
● There are drawbacks such as
  special cases near lat/lon 0.
Redis Pattern
● Let's index cities in a sorted set:
   ○ { cityId => geohash(lat,lon) }
● We convert the user's {lat,lon} into a goehash too.
● Using ZRANGEBYSCORE we find the N larger and N
  smaller elements in the set:
   ○ ZRANGEBYSCORE <user_hash> +inf 0 8
   ○ ZREVRANGEBYSCORE <user_hash> -inf 0 8
● We use the scores as lat,lons again to find distance.
● We find the closest city, and load it.
● We can save bounding rects for more precision.
● The same can be done for ZIP codes, venues, etc.
● IP Ranges are indexed on a sorted set, too.
Other interesting use cases
●   Distributed Queue
     ○ Workers use blocking pop (BLPOP) on a list.
     ○ Whenever someone pushes a task to the list (RPUSH) it will be
        popped by exactly one worker.

●   Push notifications / IM
     ○ Use redis PubSub objects as messaging channels between users.
     ○ Combine with WebSocket to push messages to Web Browsers, a-la
       googletalk.

●   Machine learning
    ○ Use redis sorted sets as a fast storage for feature vectors, frequency
       counts, probabilities, etc.
    ○ Intersecting sorted sets can yield SUM(scores) - think log(P(a)) + log
       (P(b))
Get the sources
Implementations of most of the examples in this
slideshow:
https://github.com/EverythingMe/kickass-redis

Geo resolving library:
http://github.com/doat/geodis

Get redis at http://redis.io

More Related Content

What's hot

Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
Mike Dirolf
 
Redis Introduction
Redis IntroductionRedis Introduction
Redis IntroductionAlex Su
 
Intro to HBase
Intro to HBaseIntro to HBase
Intro to HBase
alexbaranau
 
Introduction to redis
Introduction to redisIntroduction to redis
Introduction to redis
NexThoughts Technologies
 
Caching solutions with Redis
Caching solutions   with RedisCaching solutions   with Redis
Caching solutions with Redis
George Platon
 
redis basics
redis basicsredis basics
redis basics
Manoj Kumar
 
Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm Chandler Huang
 
Introduction to redis
Introduction to redisIntroduction to redis
Introduction to redis
Tanu Siwag
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcached
Jurriaan Persyn
 
Redis cluster
Redis clusterRedis cluster
Redis cluster
iammutex
 
Redis data modeling examples
Redis data modeling examplesRedis data modeling examples
Redis data modeling examples
Terry Cho
 
Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDB
MongoDB
 
A whirlwind tour of the LLVM optimizer
A whirlwind tour of the LLVM optimizerA whirlwind tour of the LLVM optimizer
A whirlwind tour of the LLVM optimizer
Nikita Popov
 
12-Step Program for Scaling Web Applications on PostgreSQL
12-Step Program for Scaling Web Applications on PostgreSQL12-Step Program for Scaling Web Applications on PostgreSQL
12-Step Program for Scaling Web Applications on PostgreSQL
Konstantin Gredeskoul
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
TO THE NEW | Technology
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
Knoldus Inc.
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveDataWorks Summit
 
Learning Rust the Hard Way for a Production Kafka + ScyllaDB Pipeline
Learning Rust the Hard Way for a Production Kafka + ScyllaDB PipelineLearning Rust the Hard Way for a Production Kafka + ScyllaDB Pipeline
Learning Rust the Hard Way for a Production Kafka + ScyllaDB Pipeline
ScyllaDB
 
DB12c: All You Need to Know About the Resource Manager
DB12c: All You Need to Know About the Resource ManagerDB12c: All You Need to Know About the Resource Manager
DB12c: All You Need to Know About the Resource Manager
Maris Elsins
 
Redis overview for Software Architecture Forum
Redis overview for Software Architecture ForumRedis overview for Software Architecture Forum
Redis overview for Software Architecture Forum
Christopher Spring
 

What's hot (20)

Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Redis Introduction
Redis IntroductionRedis Introduction
Redis Introduction
 
Intro to HBase
Intro to HBaseIntro to HBase
Intro to HBase
 
Introduction to redis
Introduction to redisIntroduction to redis
Introduction to redis
 
Caching solutions with Redis
Caching solutions   with RedisCaching solutions   with Redis
Caching solutions with Redis
 
redis basics
redis basicsredis basics
redis basics
 
Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm
 
Introduction to redis
Introduction to redisIntroduction to redis
Introduction to redis
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcached
 
Redis cluster
Redis clusterRedis cluster
Redis cluster
 
Redis data modeling examples
Redis data modeling examplesRedis data modeling examples
Redis data modeling examples
 
Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDB
 
A whirlwind tour of the LLVM optimizer
A whirlwind tour of the LLVM optimizerA whirlwind tour of the LLVM optimizer
A whirlwind tour of the LLVM optimizer
 
12-Step Program for Scaling Web Applications on PostgreSQL
12-Step Program for Scaling Web Applications on PostgreSQL12-Step Program for Scaling Web Applications on PostgreSQL
12-Step Program for Scaling Web Applications on PostgreSQL
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep Dive
 
Learning Rust the Hard Way for a Production Kafka + ScyllaDB Pipeline
Learning Rust the Hard Way for a Production Kafka + ScyllaDB PipelineLearning Rust the Hard Way for a Production Kafka + ScyllaDB Pipeline
Learning Rust the Hard Way for a Production Kafka + ScyllaDB Pipeline
 
DB12c: All You Need to Know About the Resource Manager
DB12c: All You Need to Know About the Resource ManagerDB12c: All You Need to Know About the Resource Manager
DB12c: All You Need to Know About the Resource Manager
 
Redis overview for Software Architecture Forum
Redis overview for Software Architecture ForumRedis overview for Software Architecture Forum
Redis overview for Software Architecture Forum
 

Viewers also liked

High-Volume Data Collection and Real Time Analytics Using Redis
High-Volume Data Collection and Real Time Analytics Using RedisHigh-Volume Data Collection and Real Time Analytics Using Redis
High-Volume Data Collection and Real Time Analytics Using Redis
cacois
 
Scaling Crashlytics: Building Analytics on Redis 2.6
Scaling Crashlytics: Building Analytics on Redis 2.6Scaling Crashlytics: Building Analytics on Redis 2.6
Scaling Crashlytics: Building Analytics on Redis 2.6Crashlytics
 
Redis in Practice
Redis in PracticeRedis in Practice
Redis in Practice
Noah Davis
 
Redis Use Patterns (DevconTLV June 2014)
Redis Use Patterns (DevconTLV June 2014)Redis Use Patterns (DevconTLV June 2014)
Redis Use Patterns (DevconTLV June 2014)
Itamar Haber
 
Redis data design by usecase
Redis data design by usecaseRedis data design by usecase
Redis data design by usecase
Kris Jeong
 
Everything you always wanted to know about Redis but were afraid to ask
Everything you always wanted to know about Redis but were afraid to askEverything you always wanted to know about Redis but were afraid to ask
Everything you always wanted to know about Redis but were afraid to ask
Carlos Abalde
 

Viewers also liked (6)

High-Volume Data Collection and Real Time Analytics Using Redis
High-Volume Data Collection and Real Time Analytics Using RedisHigh-Volume Data Collection and Real Time Analytics Using Redis
High-Volume Data Collection and Real Time Analytics Using Redis
 
Scaling Crashlytics: Building Analytics on Redis 2.6
Scaling Crashlytics: Building Analytics on Redis 2.6Scaling Crashlytics: Building Analytics on Redis 2.6
Scaling Crashlytics: Building Analytics on Redis 2.6
 
Redis in Practice
Redis in PracticeRedis in Practice
Redis in Practice
 
Redis Use Patterns (DevconTLV June 2014)
Redis Use Patterns (DevconTLV June 2014)Redis Use Patterns (DevconTLV June 2014)
Redis Use Patterns (DevconTLV June 2014)
 
Redis data design by usecase
Redis data design by usecaseRedis data design by usecase
Redis data design by usecase
 
Everything you always wanted to know about Redis but were afraid to ask
Everything you always wanted to know about Redis but were afraid to askEverything you always wanted to know about Redis but were afraid to ask
Everything you always wanted to know about Redis but were afraid to ask
 

Similar to Kicking ass with redis

Database madness with_mongoengine_and_sql_alchemy
Database madness with_mongoengine_and_sql_alchemyDatabase madness with_mongoengine_and_sql_alchemy
Database madness with_mongoengine_and_sql_alchemyJaime Buelta
 
NoSQL Infrastructure
NoSQL InfrastructureNoSQL Infrastructure
NoSQL Infrastructure
Server Density
 
MongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: ShardingMongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: ShardingMongoDB
 
Replication and Replica Sets
Replication and Replica SetsReplication and Replica Sets
Replication and Replica Sets
MongoDB
 
AWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDB
AWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDBAWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDB
AWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDB
Amazon Web Services
 
OpenTSDB 2.0
OpenTSDB 2.0OpenTSDB 2.0
OpenTSDB 2.0
HBaseCon
 
Couchbase Korea User Group 2nd Meetup #2
Couchbase Korea User Group 2nd Meetup #2Couchbase Korea User Group 2nd Meetup #2
Couchbase Korea User Group 2nd Meetup #2
won min jang
 
MongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & AnalyticsMongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & Analytics
Server Density
 
C++ process new
C++ process newC++ process new
C++ process new
敬倫 林
 
DynamodbDB Deep Dive
DynamodbDB Deep DiveDynamodbDB Deep Dive
DynamodbDB Deep Dive
Amazon Web Services
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
Amazon Web Services
 
MongoDB - A Document NoSQL Database
MongoDB - A Document NoSQL DatabaseMongoDB - A Document NoSQL Database
MongoDB - A Document NoSQL DatabaseRuben Inoto Soto
 
Webinar: Replication and Replica Sets
Webinar: Replication and Replica SetsWebinar: Replication and Replica Sets
Webinar: Replication and Replica Sets
MongoDB
 
2012 mongo db_bangalore_roadmap_new
2012 mongo db_bangalore_roadmap_new2012 mongo db_bangalore_roadmap_new
2012 mongo db_bangalore_roadmap_newMongoDB
 
Cassandra 3.0 Awesomeness
Cassandra 3.0 AwesomenessCassandra 3.0 Awesomeness
Cassandra 3.0 Awesomeness
Jon Haddad
 
Building a Scalable Inbox System with MongoDB and Java
Building a Scalable Inbox System with MongoDB and JavaBuilding a Scalable Inbox System with MongoDB and Java
Building a Scalable Inbox System with MongoDB and Java
antoinegirbal
 
Deep Dive: Amazon DynamoDB
Deep Dive: Amazon DynamoDBDeep Dive: Amazon DynamoDB
Deep Dive: Amazon DynamoDB
Amazon Web Services
 
Replication and Replica Sets
Replication and Replica SetsReplication and Replica Sets
Replication and Replica SetsMongoDB
 
Timothy N. Tsvetkov, Rails 3.1
Timothy N. Tsvetkov, Rails 3.1Timothy N. Tsvetkov, Rails 3.1
Timothy N. Tsvetkov, Rails 3.1
Evil Martians
 
Managing Social Content with MongoDB
Managing Social Content with MongoDBManaging Social Content with MongoDB
Managing Social Content with MongoDB
MongoDB
 

Similar to Kicking ass with redis (20)

Database madness with_mongoengine_and_sql_alchemy
Database madness with_mongoengine_and_sql_alchemyDatabase madness with_mongoengine_and_sql_alchemy
Database madness with_mongoengine_and_sql_alchemy
 
NoSQL Infrastructure
NoSQL InfrastructureNoSQL Infrastructure
NoSQL Infrastructure
 
MongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: ShardingMongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: Sharding
 
Replication and Replica Sets
Replication and Replica SetsReplication and Replica Sets
Replication and Replica Sets
 
AWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDB
AWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDBAWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDB
AWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDB
 
OpenTSDB 2.0
OpenTSDB 2.0OpenTSDB 2.0
OpenTSDB 2.0
 
Couchbase Korea User Group 2nd Meetup #2
Couchbase Korea User Group 2nd Meetup #2Couchbase Korea User Group 2nd Meetup #2
Couchbase Korea User Group 2nd Meetup #2
 
MongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & AnalyticsMongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & Analytics
 
C++ process new
C++ process newC++ process new
C++ process new
 
DynamodbDB Deep Dive
DynamodbDB Deep DiveDynamodbDB Deep Dive
DynamodbDB Deep Dive
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
 
MongoDB - A Document NoSQL Database
MongoDB - A Document NoSQL DatabaseMongoDB - A Document NoSQL Database
MongoDB - A Document NoSQL Database
 
Webinar: Replication and Replica Sets
Webinar: Replication and Replica SetsWebinar: Replication and Replica Sets
Webinar: Replication and Replica Sets
 
2012 mongo db_bangalore_roadmap_new
2012 mongo db_bangalore_roadmap_new2012 mongo db_bangalore_roadmap_new
2012 mongo db_bangalore_roadmap_new
 
Cassandra 3.0 Awesomeness
Cassandra 3.0 AwesomenessCassandra 3.0 Awesomeness
Cassandra 3.0 Awesomeness
 
Building a Scalable Inbox System with MongoDB and Java
Building a Scalable Inbox System with MongoDB and JavaBuilding a Scalable Inbox System with MongoDB and Java
Building a Scalable Inbox System with MongoDB and Java
 
Deep Dive: Amazon DynamoDB
Deep Dive: Amazon DynamoDBDeep Dive: Amazon DynamoDB
Deep Dive: Amazon DynamoDB
 
Replication and Replica Sets
Replication and Replica SetsReplication and Replica Sets
Replication and Replica Sets
 
Timothy N. Tsvetkov, Rails 3.1
Timothy N. Tsvetkov, Rails 3.1Timothy N. Tsvetkov, Rails 3.1
Timothy N. Tsvetkov, Rails 3.1
 
Managing Social Content with MongoDB
Managing Social Content with MongoDBManaging Social Content with MongoDB
Managing Social Content with MongoDB
 

More from Dvir Volk

RediSearch
RediSearchRediSearch
RediSearch
Dvir Volk
 
Searching Billions of Documents with Redis
Searching Billions of Documents with RedisSearching Billions of Documents with Redis
Searching Billions of Documents with Redis
Dvir Volk
 
Boosting Machine Learning with Redis Modules and Spark
Boosting Machine Learning with Redis Modules and SparkBoosting Machine Learning with Redis Modules and Spark
Boosting Machine Learning with Redis Modules and Spark
Dvir Volk
 
Redis modules 101
Redis modules 101Redis modules 101
Redis modules 101
Dvir Volk
 
Tales Of The Black Knight - Keeping EverythingMe running
Tales Of The Black Knight - Keeping EverythingMe runningTales Of The Black Knight - Keeping EverythingMe running
Tales Of The Black Knight - Keeping EverythingMe running
Dvir Volk
 
10 reasons to be excited about go
10 reasons to be excited about go10 reasons to be excited about go
10 reasons to be excited about go
Dvir Volk
 
Introduction to Thrift
Introduction to ThriftIntroduction to Thrift
Introduction to ThriftDvir Volk
 

More from Dvir Volk (7)

RediSearch
RediSearchRediSearch
RediSearch
 
Searching Billions of Documents with Redis
Searching Billions of Documents with RedisSearching Billions of Documents with Redis
Searching Billions of Documents with Redis
 
Boosting Machine Learning with Redis Modules and Spark
Boosting Machine Learning with Redis Modules and SparkBoosting Machine Learning with Redis Modules and Spark
Boosting Machine Learning with Redis Modules and Spark
 
Redis modules 101
Redis modules 101Redis modules 101
Redis modules 101
 
Tales Of The Black Knight - Keeping EverythingMe running
Tales Of The Black Knight - Keeping EverythingMe runningTales Of The Black Knight - Keeping EverythingMe running
Tales Of The Black Knight - Keeping EverythingMe running
 
10 reasons to be excited about go
10 reasons to be excited about go10 reasons to be excited about go
10 reasons to be excited about go
 
Introduction to Thrift
Introduction to ThriftIntroduction to Thrift
Introduction to Thrift
 

Recently uploaded

Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
RTTS
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 

Recently uploaded (20)

Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 

Kicking ass with redis

  • 1. Kicking Ass With Redis for real world problems Dvir Volk, Chief Architect, Everything.me (@dvirsky)
  • 2. O HAI! I CAN HAS REDIS? Extremely Quick introduction to Redis ● Key => Data Structure server ● In memory, with persistence ● Extremely fast and versatile ● Rapidly growing (Instagr.am, Craigslist, Youporn ....) ● Open Source, awesome community ● Used as the primary data source in Everything.me: ○ Relational Data ○ Queueing ○ Caching ○ Machine Learning ○ Text Processing and search ○ Geo Stuff
  • 3. Key => { Data Structures } "I'm a Plain Text String!" Strings/Blobs/Bitmaps Key1 Val1 Hash Tables (objects!) Key2 Val 2 Key C B B A C Linked Lists A B C D Sets Sorted Sets A: 0.1 B: 0.3 C: 500 D: 500
  • 4. Redis is like Lego for Data ● Yes, It can be used as a simple KV store. ● But to really Use it, you need to think of it as a tool set. ● You have a nail - redis is a hammer building toolkit. ● That can make almost any kind of hammer. ● Learning how to efficiently model your problem is the Zen of Redis. ● Here are a few examples...
  • 5. Pattern 1: Simple, Fast, Object Store Our problem: ● Very fast object store that scales up well. ● High write throughput. ● Atomic manipulation of object members. Possible use cases: ● Online user data (session, game state) ● Social Feed ● Shopping Cart ● Anything, really...
  • 6. Storing users as HASHes email john@domain.com name John users:1 Password aebc65feae8b id 1 email Jane@domain.com name Jane users:2 Password aebc65ab117b id 2
  • 7. Redis Pattern 1 ● Each object is saved as a HASH. ● Hash objects are { key=> string/number } ● No JSON & friends serialization overhead. ● Complex members and relations are stored as separate HASHes. ● Atomic set / increment / getset members. ● Use INCR for centralized incremental ids. ● Load objects with HGETALL / HMGET
  • 8. Objects as Hashes class User(RedisObject): > INCR users:id def __init__(email, name, password): (integer) 1 self.email = email self.name = name > HMSET "users:1" self.password = password "email" "user@domain.com" self.id = self.createId() "name" "John" "password" "1234" user = User('user@domain.com', 'John', OK '1234) > HGETALL "users:1" { "email": "user@domain.com", ... } user.save()
  • 10. Pattern 2: Object Indexing The problem: ● We want to index the objects we saved by various criteria. ● We want to rank and sort them quickly. ● We want to be able to update an index quickly. Use cases: ● Tagging ● Real-Time score tables ● Social Feed Views
  • 11. Indexing with Sorted Sets k:users:email email john@domain.com user:1 => 1789708973 name John users:1 score 300 user:2 => 2361572523 id 1 .... email Jane@domain.com k:users:score name Jane user:2 => 250 users:2 score 250 user:1 => 300 id 2 user:3 => 300
  • 12. Redis Pattern ● Indexes are sorted sets (ZSETs) ● Access by value O(1), by score O(log(N)). plus ranges. ● Sorted Sets map { value => score (double) } ● So we map { objectId => score } ● For numerical members, the value is the score ● For string members, the score is a hash of the string. ● Fetching is done with ZRANGEBYSCORE ● Ranges with ZRANGE / ZRANGEBYSCORE on numeric values only (or very short strings) ● Deleting is done with ZREM ● Intersecting keys is possible with ZINTERSTORE ● Each class' objects have a special sorted set for ids.
  • 13. Automatic Keys for objects class User(RedisObject): > ZADD k:users:email 238927659283691 "1" 1 _keySpec = KeySpec( UnorderedKey('email'), > ZADD k:users:name 9283498696113 "1" UnorderedKey('name'), 1 OrderedNumericalKey('points') > ZADD k:users:points 300 "1" ) .... 1 > ZREVRANGE k:users:points 0 20 withscores #creating the users - now with points 1) "1" user = User('user@domain.com', 'John', 2) "300" '1234', points = 300) > ZRANGEBYSCORE k:users:email 238927659283691 238927659283691 #saving auto-indexes 1) "1" user.save() redis 127.0.0.1:6379> HGETALL users:1 { .. } #range query on rank users = User.getByRank(0,20) #get by name users = User.get(name = 'John')
  • 14. Pattern 3: Unique Value Counter The problem: ● We want an efficient way to measure cardinality of a set of objects over time. ● We may want it in real time. ● We don't want huge overhead. Use Cases: ● Daily/Monthly Unique users ● Split by OS / country / whatever ● Real Time online users counter
  • 15. Bitmaps to the rescue
  • 16. Redis Pattern ● Redis strings can be treated as bitmaps. ● We keep a bitmap for each time slot. ● We use BITSET offset=<object id> ● the size of a bitmap is max_id/8 bytes ● Cardinality per slot with BITCOUNT (2.6) ● Fast bitwise operations - OR / AND / XOR between time slots with BITOP ● Aggregate and save results periodically. ● Requires sequential object ids - or mapping of (see incremental ids)
  • 17. Counter API (with redis internals) counter = BitmapCounter('uniques', timeResolutions=(RES_DAY,)) #sampling current users counter.add(userId) > BITSET uniques:day:1339891200 <userId> 1 #Getting the unique user count for today counter.getCount(time.time()) > BITCOUNT uniques:day:1339891200   #Getting the the weekly unique users in the past week timePoints = [now() - 86400*i for i in xrange(7, 0, -1)] counter.aggregateCounts(timePoints, counter.OP_TOTAL) > BITOP OR tmp_key uniques:day:1339891200 uniques:day:1339804800 .... > BITCOUNT tmp_key      
  • 18. Pattern 4: Geo resolving The Problem: ● Resolve lat,lon to real locations ● Find locations of a certain class (restaurants) near me ● IP2Location search Use Cases: ● Find a user's City, ZIP code, Country, etc. ● Find the user's location by IP
  • 19. A bit about geohashing ● Converts (lat,lon) into a single 64 bit hash (and back) ● The closer points are, their common prefix is generally bigger. ● Trimming more lower bits describes a larger bounding box. ● example: ○ Tel Aviv (32.0667, 34.7667) => 14326455945304181035 ○ Netanya (32.3336, 34.8578) => 14326502174498709381 ● We can use geohash as scores in sorted sets. ● There are drawbacks such as special cases near lat/lon 0.
  • 20. Redis Pattern ● Let's index cities in a sorted set: ○ { cityId => geohash(lat,lon) } ● We convert the user's {lat,lon} into a goehash too. ● Using ZRANGEBYSCORE we find the N larger and N smaller elements in the set: ○ ZRANGEBYSCORE <user_hash> +inf 0 8 ○ ZREVRANGEBYSCORE <user_hash> -inf 0 8 ● We use the scores as lat,lons again to find distance. ● We find the closest city, and load it. ● We can save bounding rects for more precision. ● The same can be done for ZIP codes, venues, etc. ● IP Ranges are indexed on a sorted set, too.
  • 21. Other interesting use cases ● Distributed Queue ○ Workers use blocking pop (BLPOP) on a list. ○ Whenever someone pushes a task to the list (RPUSH) it will be popped by exactly one worker. ● Push notifications / IM ○ Use redis PubSub objects as messaging channels between users. ○ Combine with WebSocket to push messages to Web Browsers, a-la googletalk. ● Machine learning ○ Use redis sorted sets as a fast storage for feature vectors, frequency counts, probabilities, etc. ○ Intersecting sorted sets can yield SUM(scores) - think log(P(a)) + log (P(b))
  • 22. Get the sources Implementations of most of the examples in this slideshow: https://github.com/EverythingMe/kickass-redis Geo resolving library: http://github.com/doat/geodis Get redis at http://redis.io