Riding redis @ask.fm

Slides from LatJug

    Riding redis @ask.fm Presentation Transcript

    • RIDING REDIS @ ASK.FM
    • 60M REGISTERED USERS, 200M MONTHLY UNIQUE USERS, 450M PAGEVIEWS A DAY
    • 600 QUESTIONS / SECOND, 20K REQUESTS / SECOND, 50K LIKES / MINUTE, 250K REGISTRATIONS / DAY
    • ~600 SERVERS: ~300 RUBY NODES, ~50 JAVA NODES, ~35 REDIS NODES, ~90 MySQL NODES
    • COMPARISON
      ASK.FM: 13B PAGEVIEWS, 200M UNIQUE USERS, 32M QUESTIONS ANSWERED
      TUMBLR: 13B PAGEVIEWS*, 300M UNIQUE USERS**, 76.9M POSTS CREATED*
      TWITTER: 130B PAGEVIEWS***, N/A UNIQUE USERS, 340M TWEETS CREATED***
      * http://www.tumblr.com/press
      ** http://yahoo.tumblr.com/post/50902111638/tumblr-yahoo
      *** http://en.wikipedia.org/wiki/Twitter
    • REDIS
    • REDIS: Open-source, Key-value database, In-memory, Data-structure server
    • REDIS AUTHOR: Salvatore Sanfilippo (@antirez)
    • WHO IS USING REDIS? TWITTER, INSTAGRAM, STACK OVERFLOW, BLIZZARD
    • 6 SCENARIOS, 6 REDIS DATA-STRUCTURES
    • COUNTERS: STRING; MESSAGES QUEUE: LIST; USER ACTIVITY MATRIX: HASH; FUNCTIONAL SWITCHES: BIT OPERATIONS; THE WALL: SORTED SET; REAL-TIME MONITORING: PUB / SUB
    • Scenario 1 : Counters
    • GLOBAL COUNTERS: ACTIVE USERS, QUESTIONS ASKED TODAY, TOTAL LIKES, REGISTRATIONS THIS MONTH
    • NAIVE SOLUTION: SELECT count(*) FROM users WHERE state='active';
    • REDIS STRING: MOST BASIC REDIS TYPE, BINARY SAFE, UP TO 512 MB. OPERATIONS: INCR, DECR
    • EXAMPLE
      @redis.incr "users/active/total"
      ...
      @redis.decr "users/active/total"
      users/active/total: 61 771 480
      users/inactive/total: 6 943 366
    • USERS COUNTERS
    • EXAMPLE #2: USERS COUNTERS
      @redis.incr "user/#{@user.id}/answers"
      ..
      @redis.decr "user/#{@user.id}/likes"
      ..
      @redis.incr "user/#{@user.id}/gifts"
      user/:user_id/answers: 57
      user/:user_id/likes: 13 905
      user/:user_id/gifts: 27
    • COUNTERS CONSISTENCY
    • CONSISTENCY = 1 / PERFORMANCE
    • REDIS TTL (SOLVED MOST CASES FOR US)
      @redis.expire "user/#{@user.id}/inbox_questions", 1.day
    • SCALABILITY
    • SCALABILITY (2)
      @redis_shard_map = {
        0 => Redis.new(:host=>"redis_host_0"),
        1 => Redis.new(:host=>"redis_host_1"),
        ...,
        7 => Redis.new(:host=>"redis_host_7")
      }
      @redis = @redis_shard_map[@user.id % 8]
      @redis.incr "user/#{@user.id}/answers"
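
The shard lookup on the slide above boils down to a modulo over the user id. A minimal sketch of that routing rule follows, with no Redis connection involved. `SHARD_COUNT` mirrors the 8 hosts on the slide, while `shard_index`, `shard_host` and `counter_key` are hypothetical helper names introduced here for illustration only.

```ruby
# Sketch of modulo sharding by user id (assumed helper names, not from the talk).
SHARD_COUNT = 8

# Picks the shard number that owns all keys of a given user.
def shard_index(user_id, shard_count = SHARD_COUNT)
  user_id % shard_count
end

# Host name for a user's shard, following the redis_host_N pattern on the slide.
def shard_host(user_id)
  "redis_host_#{shard_index(user_id)}"
end

# Builds a per-user counter key in the format used by the counter examples.
def counter_key(user_id, counter)
  "user/#{user_id}/#{counter}"
end
```

One property worth keeping in mind: with plain modulo routing, changing the number of shards remaps most user ids to a different host, which is presumably why consistent hashing comes up later in the deck.
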
    • Scenario 2 : Messages Queue
    • THIRD PARTY INTERACTION
    • RUBY BLOCKING MODEL
    • Resque to the rescue https://github.com/resque/resque
    • RESQUE: REDIS-BACKED LIBRARY FOR ● CREATING BACKGROUND JOBS ● PLACING THOSE JOBS ON MULTIPLE QUEUES ● AND PROCESSING THEM LATER
    • REDIS LIST: SIMPLY LISTS OF STRINGS, MAX LENGTH 2^32 - 1, CONSTANT TIME INSERTION. OPERATIONS: LPUSH, LPOP, RPUSH, RPOP, LREM, LRANGE
    • WORKFLOW
    • Why Resque?
    • ADDING A NEW JOB
      Resque.enqueue(PostToFacebookJob, "Hello world")
    • O(1) COMPLEXITY
      @redis.lpush 'queue:facebook', '{ "class":"PostToFacebookJob", "args":["Hello world"] }'
    • GETTING A JOB FROM QUEUE, O(1)
      @redis.rpop "queue:facebook"
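
The LPUSH / RPOP pair above gives first-in, first-out ordering: jobs enter at the head of the list and leave from the tail. The sketch below models only that ordering and the JSON payload shape shown on the slide. `ListQueue`, `encode_job` and `decode_job` are hypothetical names, and the array-backed class is an in-memory stand-in, not Resque's or Redis's implementation.

```ruby
require 'json'

# In-memory stand-in for one Redis list; mirrors LPUSH / RPOP ordering only.
class ListQueue
  def initialize
    @items = []
  end

  # Like LPUSH: insert at the head and return the new length.
  def lpush(value)
    @items.unshift(value)
    @items.length
  end

  # Like RPOP: remove from the tail; nil when the list is empty.
  def rpop
    @items.pop
  end
end

# Encodes a job as on the slide: a class name plus an args array.
def encode_job(job_class, *args)
  JSON.generate('class' => job_class, 'args' => args)
end

# Decodes a payload back into a Hash with 'class' and 'args' keys.
def decode_job(payload)
  JSON.parse(payload)
end
```

A worker loop would repeatedly call `rpop`, decode the payload, and dispatch on the `'class'` field; the oldest job always comes out first.
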
    • Scenario 3 : User activity matrix
    • TIME-BASED RESTRICTIONS: ROBOTS, CRAWLERS, SPAM
    • USER ACTIONS: POST, ASK A QUESTION, LOGIN, LIKE, GET, PROFILE VIEW
    • REQUIREMENTS: LIMIT ACTIVITY PER MINUTE, PER HOUR, PER DAY
    • REDIS HASH: MAPS BETWEEN STRING FIELDS AND VALUES, MAX PAIRS 2^32 - 1. OPERATIONS: HGET, HGETALL, HSET, HDEL
    • STRUCTURE
      KEY                                   LIKES  QUESTIONS  REQUESTS
      user/:uid/date/:today/minute/:minute  12     3          27
      user/:uid/date/:today/hour/:hour      34     15         113
      user/:uid/date/:today/day/:day        158    22         529
    • EXAMPLE: REGISTER ACTIVITY
      def register_users_like!(user_id)
        # minute_key, hour_key and day_key are the per-window hash keys of this user
        @redis.hincrby minute_key, "likes", 1
        @redis.hincrby hour_key, "likes", 1
        @redis.hincrby day_key, "likes", 1
      end
    • EXAMPLE: ALLOWED?
      def allowed_to_like_questions?(user_id)
        # HGET returns a string, or nil when the field is missing, hence to_i
        per_minute = @redis.hget(minute_key, "likes").to_i
        per_hour = @redis.hget(hour_key, "likes").to_i
        per_day = @redis.hget(day_key, "likes").to_i
        per_minute < LIKES_PER_MINUTE_THRESHOLD &&
          per_hour < LIKES_PER_HOUR_THRESHOLD &&
          per_day < LIKES_PER_DATE_THRESHOLD
      end
    • CLEANUP
      @redis.expire minute_key, 1.minute
      @redis.expire hour_key, 1.hour
      @redis.expire day_key, 1.day
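
To make the activity matrix concrete, here is a self-contained sketch of the register / allowed? pair against an in-memory stand-in for Redis hashes. Everything named here is an assumption for illustration: the `LIMITS` values, the exact date and time formats inside the keys, and the `ActivityStore`, `activity_keys`, `register_action!` and `allowed?` names do not come from the talk. The stand-in also has no expiry, which the real setup handles with EXPIRE.

```ruby
# Hypothetical per-window limits; the slides do not give the real thresholds.
LIMITS = { minute: 10, hour: 100, day: 500 }.freeze

# Builds the three hash keys from the STRUCTURE slide for a given moment.
# The strftime formats are an assumption about what :today, :minute etc. hold.
def activity_keys(user_id, now)
  base = "user/#{user_id}/date/#{now.strftime('%Y-%m-%d')}"
  {
    minute: "#{base}/minute/#{now.strftime('%H:%M')}",
    hour:   "#{base}/hour/#{now.strftime('%H')}",
    day:    "#{base}/day/#{now.strftime('%d')}"
  }
end

# In-memory stand-in for HINCRBY / HGET on Redis hashes (no expiry).
class ActivityStore
  def initialize
    @hashes = Hash.new { |h, k| h[k] = Hash.new(0) }
  end

  def hincrby(key, field, by)
    @hashes[key][field] += by
  end

  def hget(key, field)
    @hashes[key][field]
  end
end

# Counts one action in the minute, hour and day windows.
def register_action!(store, user_id, action, now)
  activity_keys(user_id, now).each_value { |key| store.hincrby(key, action, 1) }
end

# True only while the action count is below the limit in every window.
def allowed?(store, user_id, action, now)
  activity_keys(user_id, now).all? do |window, key|
    store.hget(key, action).to_i < LIMITS[window]
  end
end
```

Because the minute is part of the key, a new minute starts from a fresh, empty hash, so the per-minute limit resets without any explicit cleanup step in the check itself.
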
    • SCALABILITY: SCALE BY USER_ID, DATE, PHASE OF THE MOON
    • CONSISTENT HASHING
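
The slide names consistent hashing without showing it, so the following is a generic sketch of the technique rather than the setup used at ask.fm. Each node is placed on a ring at several hashed points, and a key belongs to the first point at or after its own hash. The `HashRing` class, the MD5-based hash, and the replica count of 100 are all assumptions made for illustration.

```ruby
require 'digest/md5'

# Minimal consistent-hash ring (illustrative; node names are placeholders).
class HashRing
  def initialize(nodes, replicas = 100)
    @ring = {}
    nodes.each do |node|
      # Several virtual points per node spread the load more evenly.
      replicas.times { |i| @ring[hash_of("#{node}:#{i}")] = node }
    end
    @points = @ring.keys.sort
  end

  # Returns the node owning the first ring point at or after the key's hash,
  # wrapping around to the first point when the hash is past the last one.
  def node_for(key)
    h = hash_of(key)
    point = @points.bsearch { |pt| pt >= h } || @points.first
    @ring[point]
  end

  private

  # 32-bit integer taken from the first 8 hex digits of the MD5 digest.
  def hash_of(value)
    Digest::MD5.hexdigest(value.to_s)[0, 8].to_i(16)
  end
end
```

The useful property compared with `user_id % 8` is that adding a node only moves keys onto the new node; keys never shuffle between the existing nodes.
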
    • Scenario 4 : Functional switches
    • REQUIREMENTS: TURN ON/OFF ANY FEATURE ON SITE
    • FUNCTIONALITY: PHOTO ANSWERS, VIDEO ANSWERS, WALL, STREAM, DATABASE SHARDS, SET POSTS PER PAGE
    • BIT OPERATIONS: SETBIT, GETBIT, BITCOUNT
    • EXAMPLE
      @redis.setbit common_settings_key, WALL_ENABLED, 1
    • SCALABILITY: MASTER / SLAVE REPLICATION
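
A small sketch of how feature switches map onto bit operations, using an integer as an in-memory stand-in for the Redis key. Only `WALL_ENABLED` appears in the slides; the other offsets and the `FeatureBits` class are hypothetical. Redis numbers bits within a string differently from an integer, so this models the behavior of SETBIT, GETBIT and BITCOUNT, not their storage layout.

```ruby
# Bit offsets per feature. Only WALL_ENABLED is named in the slides;
# the other two constants and all numeric values are assumptions.
WALL_ENABLED          = 0
PHOTO_ANSWERS_ENABLED = 1
VIDEO_ANSWERS_ENABLED = 2

# In-memory stand-in for SETBIT / GETBIT / BITCOUNT on a single key.
class FeatureBits
  def initialize
    @bits = 0
  end

  # Like SETBIT: stores 0 or 1 at the offset and returns the previous bit.
  def setbit(offset, value)
    previous = getbit(offset)
    @bits = value == 1 ? (@bits | (1 << offset)) : (@bits & ~(1 << offset))
    previous
  end

  # Like GETBIT: offsets that were never set read as 0.
  def getbit(offset)
    (@bits >> offset) & 1
  end

  # Like BITCOUNT: number of bits currently set to 1.
  def bitcount
    @bits.to_s(2).count('1')
  end
end
```

A feature check then becomes a single `getbit(WALL_ENABLED) == 1` test, and `bitcount` reports how many switches are on.
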
    • Scenario 5 : The Wall
    • THE WALL
    • INITIAL REQUIREMENT: SHOW FRIENDS' POSTS
    • SOLUTION: SELECT * FROM questions q LEFT JOIN followships f ON (q.user_id = f.friend_id) WHERE f.user_id = :my_user_id ORDER BY q.answered_at LIMIT 0,25
    • LATER REQUIREMENTS: LIKES INTRODUCED; SHOW RETWEETS; UNIQUENESS OF ANSWERS; ORDERED BY FIRST OCCURRENCE; PAGINATION NEEDED; DO NOT SHOW OWN POSTS; SHOW RETWEETS SINCE STARTED FOLLOWING A FRIEND
    • MORE REQUIREMENTS: DO NOT SHOW RETWEETS IF ANSWERER OR RETWEETER IS DISABLED; SHOW LATEST FRIENDS WHO LIKED A QUESTION
    • OUR SOLUTION
    • IDEA: STORE SEPARATE SET OF QUESTIONS FOR EVERY USER
    • REDIS SORTED SET: NON REPEATING COLLECTIONS OF STRINGS; EACH MEMBER ASSOCIATED WITH SCORE; QUICKLY ACCESS ELEMENTS IN ORDER; FAST EXISTENCE TEST; FAST ACCESS TO ELEMENTS IN THE MIDDLE
    • STRUCTURE
      user/:user_id/wall
      SCORE:  score_1        score_2        ...  score_N
      MEMBER: question_id_1  question_id_2  ...  question_id_N
      score = timestamp when the question_id first occurred in the set
    • OPERATIONS
      ● GET USERS WALL
        ○ ZREVRANGEBYSCORE - O(log(N)+M)
      ● USER ANSWERED A QUESTION
        ○ ZADD - O(log(N))
      ● LIKE
        ○ ZRANK - O(log(N))
        ○ ZADD - O(log(N))
      ● REMOVE ANSWER
        ○ ZREM - O(M*log(N))
    • CLEANUP: GUARANTEED 1000-1500 POSTS ON WALL; PERIODICALLY CALL ZCARD, ZREMRANGEBYRANK
    • USER_ID SHARDING
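
The wall logic described above can be sketched without a Redis server: one sorted collection per user, scored by the time a question first appeared, read newest first, and trimmed periodically. The `Wall` class below is an in-memory stand-in with hypothetical method names. It mirrors the check-then-add step (ZRANK followed by ZADD), the reverse-ordered read, and the ZCARD plus ZREMRANGEBYRANK cleanup, but not their complexity guarantees, and ties on equal timestamps are broken by id here rather than by Redis's ordering.

```ruby
# In-memory stand-in for one user's wall (a Redis sorted set).
class Wall
  def initialize(max_size = 1000)
    @scores = {}            # question_id => timestamp of first occurrence
    @max_size = max_size
  end

  # Adds a question only if it is not on the wall yet, so the score stays
  # the time of first occurrence. Returns true when the member was added.
  def add(question_id, timestamp)
    return false if @scores.key?(question_id)
    @scores[question_id] = timestamp
    true
  end

  # Removes a question; returns true when it was present.
  def remove(question_id)
    !@scores.delete(question_id).nil?
  end

  # One page of question ids, newest first.
  def page(offset, count)
    @scores.sort_by { |id, ts| [-ts, id] }.map(&:first)[offset, count] || []
  end

  # Drops the oldest entries beyond max_size and returns how many were dropped.
  def trim!
    excess = @scores.size - @max_size
    return 0 if excess <= 0
    @scores.sort_by { |id, ts| [ts, id] }.first(excess).each { |id, _| @scores.delete(id) }
    excess
  end
end
```

Pagination falls out of `page(offset, count)`, and uniqueness falls out of the set semantics: a question liked by several friends still appears once, at the position of its first occurrence.
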
    • Scenario 6 : Real time monitoring
    • REQUIREMENT: PATTERN DETECTION
    • HUMAN vs MACHINE
    • INITIAL SOLUTION: MySQL TABLE PULL
    • REDIS PUB/SUB: PUBLISH / SUBSCRIBE MESSAGING PARADIGM; ALLOWS PATTERN-MATCHING SUBSCRIPTIONS. OPERATIONS: SUBSCRIBE, UNSUBSCRIBE, PUBLISH
    • SCHEMA
    • MODERATORS PANEL
    • SCALING: Time complexity: O(N+M) where N is the number of clients subscribed to the receiving channel and M is the total number of subscribed patterns (by any client).
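
The O(N+M) figure is easier to see in a toy broker: publishing delivers to the N subscribers of the exact channel and then has to test the channel name against all M registered patterns. The `Broker` class below is an in-memory sketch, not Redis's implementation. The channel names in the usage are hypothetical, and `File.fnmatch` is used as an approximation of glob-style pattern subscriptions (PSUBSCRIBE).

```ruby
# In-memory stand-in for publish / subscribe with pattern subscriptions.
class Broker
  def initialize
    @channels = Hash.new { |h, k| h[k] = [] }  # exact channel => handlers
    @patterns = []                             # [glob pattern, handler] pairs
  end

  # Like SUBSCRIBE: the handler receives (channel, message).
  def subscribe(channel, &handler)
    @channels[channel] << handler
  end

  # Like PSUBSCRIBE: glob-style pattern such as "activity.*".
  def psubscribe(pattern, &handler)
    @patterns << [pattern, handler]
  end

  # Delivers to the exact subscribers, then checks every registered pattern.
  # Returns the number of handlers that received the message.
  def publish(channel, message)
    receivers = @channels.key?(channel) ? @channels[channel] : []
    matched = @patterns.select { |pattern, _| File.fnmatch(pattern, channel) }
                       .map { |_, handler| handler }
    (receivers + matched).each { |handler| handler.call(channel, message) }
    receivers.size + matched.size
  end
end
```

A moderators panel in this model would be one pattern subscriber listening to every activity channel, while narrower consumers subscribe to single channels.
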
    • So, why Redis?
    • WHY REDIS? SIMPLE, FAST, FLEXIBLE, ROBUST, FREE
    • WHAT'S MISSING: CLUSTERING
    • Not covered?
    • NOT COVERED: SETS, LUA SCRIPTING, TRANSACTIONS, PIPELINING
    • JAVA
    • Questions