Riding redis @ask.fm
Slides from LatJug

Presentation Transcript

  • RIDING REDIS @ASK.FM
  • 60M REGISTERED USERS, 200M MONTHLY UNIQUE USERS, 450M PAGEVIEWS A DAY
  • 600 QUESTIONS / SECOND, 20K REQUESTS / SECOND, 50K LIKES / MINUTE, 250K REGISTRATIONS / DAY
  • ~600 SERVERS: ~300 RUBY NODES, ~50 JAVA NODES, ~35 REDIS NODES, ~90 MySQL NODES
  • COMPARISON
    SITE      PAGEVIEWS   UNIQUE USERS   CONTENT
    ASK.FM    13B         200M           32M QUESTIONS ANSWERED
    TUMBLR    13B*        300M**         76.9M POSTS CREATED*
    TWITTER   130B***     N/A            340M TWEETS CREATED***
    * http://www.tumblr.com/press
    ** http://yahoo.tumblr.com/post/50902111638/tumblr-yahoo
    *** http://en.wikipedia.org/wiki/Twitter
  • REDIS
  • REDIS: Open-source, Key-value database, In-memory, Data-structure server
  • REDIS AUTHOR: Salvatore Sanfilippo (@antirez)
  • WHO IS USING REDIS? TWITTER, INSTAGRAM, STACK OVERFLOW, BLIZZARD
  • 6 SCENARIOS, 6 REDIS DATA-STRUCTURES
  • SCENARIO → REDIS FEATURE
    COUNTERS → STRING
    MESSAGE QUEUE → LIST
    USER ACTIVITY MATRIX → HASH
    FUNCTIONAL SWITCHES → BIT OPERATIONS
    THE WALL → SORTED SET
    REAL-TIME MONITORING → PUB / SUB
  • Scenario 1 : Counters
  • GLOBAL COUNTERS: ACTIVE USERS, QUESTIONS ASKED TODAY, TOTAL LIKES, REGISTRATIONS THIS MONTH
  • NAIVE SOLUTION: SELECT count(*) FROM users WHERE state='active';
  • REDIS STRING: MOST BASIC REDIS TYPE, BINARY SAFE, UP TO 512 MB. OPERATIONS: INCR, DECR
  • EXAMPLE
    @redis.incr "users/active/total"
    ...
    @redis.decr "users/active/total"
    users/active/total      61 771 480
    users/inactive/total     6 943 366
  • USER COUNTERS
  • EXAMPLE #2: USER COUNTERS
    @redis.incr "user/#{@user.id}/answers"
    ..
    @redis.decr "user/#{@user.id}/likes"
    ..
    @redis.incr "user/#{@user.id}/gifts"
    user/:user_id/answers   57
    user/:user_id/likes     13 905
    user/:user_id/gifts     27
  • COUNTERS CONSISTENCY
  • CONSISTENCY = 1 / PERFORMANCE
  • REDIS TTL (SOLVED MOST CASES FOR US)
    @redis.expire "user/#{@user.id}/inbox_questions", 1.day
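One possible reading of the TTL slide is that a counter is allowed to expire and is then recomputed from the database, so any drift is bounded by the TTL. The sketch below assumes that reading. `FakeTtlStore` and `cached_count` are hypothetical names, and the store is an in-memory stand-in that only mimics the `get` and `setex` calls of a Redis client.

```ruby
# Hypothetical in-memory store with get / setex and a manual clock,
# used here so the example runs without a Redis server.
class FakeTtlStore
  attr_accessor :now

  def initialize
    @data = {}
    @now = 0
  end

  def get(key)
    value, expires_at = @data[key]
    return nil if value.nil? || @now >= expires_at
    value
  end

  def setex(key, ttl, value)
    @data[key] = [value.to_s, @now + ttl]
    "OK"
  end
end

# Returns the cached counter, or recomputes it from the block
# (for example a SQL COUNT) and caches it for ttl seconds.
def cached_count(store, key, ttl)
  cached = store.get(key)
  return cached.to_i if cached
  fresh = yield
  store.setex(key, ttl, fresh)
  fresh
end

store = FakeTtlStore.new
db_calls = 0
count = lambda do
  cached_count(store, "user/7/inbox_questions", 86_400) { db_calls += 1; 5 }
end

count.call            # miss: runs the block
count.call            # hit: served from the store
store.now += 86_400   # one day later the key has expired
count.call            # miss again
puts db_calls         # 2
```

The trade-off is the one the deck names: a longer TTL means fewer recomputations but a longer window in which the counter may be stale.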
  • SCALABILITY
  • SCALABILITY (2)
    @redis_shard_map = {
      0 => Redis.new(:host => "redis_host_0"),
      1 => Redis.new(:host => "redis_host_1"),
      ...,
      7 => Redis.new(:host => "redis_host_7")
    }
    @redis = @redis_shard_map[@user.id % 8]
    @redis.incr "user/#{@user.id}/answers"
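The modulo routing on the slide can be tried without live servers. In this minimal sketch the host names follow the slide, while `SHARD_COUNT`, `SHARD_HOSTS`, and `shard_host_for` are hypothetical names introduced only for illustration.

```ruby
# Hypothetical helper: maps a user id to one of 8 shard hosts,
# mirroring `@redis_shard_map[@user.id % 8]` from the slide.
SHARD_COUNT = 8
SHARD_HOSTS = (0...SHARD_COUNT).map { |i| "redis_host_#{i}" }.freeze

def shard_host_for(user_id)
  SHARD_HOSTS[user_id % SHARD_COUNT]
end

puts shard_host_for(42)  # 42 % 8 == 2, so redis_host_2
```

Routing is cheap and stateless, but changing the shard count changes `user_id % count` for most users, so their keys would have to move.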
  • Scenario 2 : Message Queue
  • THIRD PARTY INTERACTION
  • RUBY BLOCKING MODEL
  • Resque to the rescue https://github.com/resque/resque
  • RESQUE: REDIS-BACKED LIBRARY FOR
    ● CREATING BACKGROUND JOBS
    ● PLACING THOSE JOBS ON MULTIPLE QUEUES
    ● AND PROCESSING THEM LATER
  • REDIS LIST: SIMPLY LISTS OF STRINGS, MAX LENGTH 2^32 - 1, CONSTANT-TIME INSERTION AT HEAD OR TAIL. OPERATIONS: LPUSH, LPOP, RPUSH, RPOP, LREM, LRANGE
  • WORKFLOW
  • Why Resque?
  • ADDING A NEW JOB
    Resque.enqueue(PostToFacebookJob, "Hello world")
  • O(1) COMPLEXITY
    @redis.lpush 'queue:facebook', '{ "class":"PostToFacebookJob", "args":["Hello world"] }'
  • GETTING A JOB FROM QUEUE, O(1)
    @redis.rpop "queue:facebook"
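Pushing at the head with LPUSH and popping at the tail with RPOP yields first-in, first-out order. The sketch below models that pair with a plain Ruby array so it runs without Redis. `FakeList` is a hypothetical stand-in, and the payload shape is copied from the slides.

```ruby
require "json"

# In-memory stand-in for a single Redis list (illustration only).
# The head of the list is index 0, the tail is the last index.
class FakeList
  def initialize
    @items = []
  end

  # LPUSH: insert at the head and return the new length.
  def lpush(value)
    @items.unshift(value)
    @items.length
  end

  # RPOP: remove from the tail; nil when the list is empty.
  def rpop
    @items.pop
  end
end

queue = FakeList.new
queue.lpush(JSON.generate("class" => "PostToFacebookJob", "args" => ["first"]))
queue.lpush(JSON.generate("class" => "PostToFacebookJob", "args" => ["second"]))

job = JSON.parse(queue.rpop)
puts job["args"].first  # the oldest job comes out first
```

A worker loop would repeat the pop, look up the class named in the payload, and call it with `args`.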
  • Scenario 3 : User activity matrix
  • ROBOTS CRAWLERS SPAM TIME-BASED RESTRICTIONS
  • POST ASK A QUESTION LOGIN LIKE GET PROFILE VIEW USER ACTIONS
  • REQUIREMENTS: LIMIT ACTIVITY PER MINUTE, PER HOUR, PER DAY
  • REDIS HASH: MAPS BETWEEN STRING FIELDS AND VALUES, MAX PAIRS 2^32 - 1. OPERATIONS: HGET, HGETALL, HSET, HDEL
  • STRUCTURE
    KEY                                     LIKES   QUESTIONS   REQUESTS
    user/:uid/date/:today/minute/:minute    12      3           27
    user/:uid/date/:today/hour/:hour        34      15          113
    user/:uid/date/:today/day/:day          158     22          529
  • EXAMPLE: REGISTER ACTIVITY
    def register_users_like!(user_id)
      # The field name 'likes' must match the one read in allowed_to_like_questions?
      @redis.hincrby minute_key, 'likes', 1
      @redis.hincrby hour_key, 'likes', 1
      @redis.hincrby day_key, 'likes', 1
    end
  • EXAMPLE: ALLOWED?
    def allowed_to_like_questions?(user_id)
      # HGET returns a string (or nil), so convert before comparing
      per_minute = @redis.hget(minute_key, 'likes').to_i
      per_hour = @redis.hget(hour_key, 'likes').to_i
      per_day = @redis.hget(day_key, 'likes').to_i
      per_minute < LIKES_PER_MINUTE_THRESHOLD &&
        per_hour < LIKES_PER_HOUR_THRESHOLD &&
        per_day < LIKES_PER_DATE_THRESHOLD
    end
  • CLEANUP
    @redis.expire minute_key, 1.minute
    @redis.expire hour_key, 1.hour
    @redis.expire day_key, 1.day
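The register, check, and cleanup slides can be combined into one small limiter. This is a sketch under stated assumptions: the thresholds are invented, `LikeLimiter` and `FakeStore` are hypothetical names, and the key layout is an interpretation of the STRUCTURE slide. `FakeStore` is an in-memory stand-in that mimics `hincrby`, `hget`, and `expire` and ignores TTLs.

```ruby
# Hypothetical in-memory stand-in for the three hash calls used below.
class FakeStore
  def initialize
    @hashes = Hash.new { |h, k| h[k] = Hash.new(0) }
  end

  def hincrby(key, field, by)
    @hashes[key][field] += by
  end

  # Like HGET, returns a string, or nil when the field is missing.
  def hget(key, field)
    @hashes[key].key?(field) ? @hashes[key][field].to_s : nil
  end

  # TTLs are ignored in this stand-in.
  def expire(_key, _seconds)
    true
  end
end

class LikeLimiter
  # Illustrative thresholds, not ask.fm's real limits.
  LIMITS = { minute: 10, hour: 100, day: 500 }.freeze
  TTL    = { minute: 60, hour: 3600, day: 86_400 }.freeze

  def initialize(store)
    @store = store
  end

  def register_like!(user_id, now = Time.now)
    keys(user_id, now).each do |window, key|
      @store.hincrby(key, "likes", 1)
      @store.expire(key, TTL[window])
    end
  end

  def allowed_to_like?(user_id, now = Time.now)
    keys(user_id, now).all? do |window, key|
      @store.hget(key, "likes").to_i < LIMITS[window]
    end
  end

  private

  # One hash per user and time bucket, following the STRUCTURE slide.
  def keys(user_id, now)
    date = now.strftime("%Y-%m-%d")
    {
      minute: "user/#{user_id}/date/#{date}/minute/#{now.strftime('%H:%M')}",
      hour:   "user/#{user_id}/date/#{date}/hour/#{now.strftime('%H')}",
      day:    "user/#{user_id}/date/#{date}/day/#{now.day}"
    }
  end
end

limiter = LikeLimiter.new(FakeStore.new)
now = Time.at(0).utc
10.times { limiter.register_like!(7, now) }
puts limiter.allowed_to_like?(7, now)       # false: minute limit reached
puts limiter.allowed_to_like?(7, now + 60)  # true: a new minute bucket
```

Because each bucket has its own key, old buckets never need to be reset. They simply expire.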
  • SCALABILITY SCALE BY USER_ID DATE PHASE OF THE MOON
  • CONSISTENT HASHING
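The slide only names consistent hashing. As a minimal sketch of the general technique, not of ask.fm's implementation, the ring below places each node at many points on a hash circle and assigns a key to the first point at or after the key's own hash. `HashRing`, the replica count, and the node names are assumptions made for illustration.

```ruby
require "digest"

# Minimal hash ring. Node names and the replica count are illustrative.
class HashRing
  def initialize(nodes, replicas: 100)
    @ring = {}
    nodes.each do |node|
      replicas.times { |i| @ring[point("#{node}:#{i}")] = node }
    end
    @points = @ring.keys.sort
  end

  # The first ring point at or after the key's hash owns the key;
  # wrap around to the first point when none is found.
  def node_for(key)
    h = point(key.to_s)
    owner = @points.bsearch { |pt| pt >= h } || @points.first
    @ring[owner]
  end

  private

  # 32-bit position on the ring, taken from the MD5 hex digest.
  def point(str)
    Digest::MD5.hexdigest(str)[0, 8].to_i(16)
  end
end

old_ring = HashRing.new(%w[redis_host_0 redis_host_1 redis_host_2])
new_ring = HashRing.new(%w[redis_host_0 redis_host_1 redis_host_2 redis_host_3])

ids = (1..1000).to_a
moved = ids.count { |id| old_ring.node_for(id) != new_ring.node_for(id) }
puts "#{moved} of #{ids.size} keys changed node"
```

Every key that changes owner moves to the newly added node, while keys on the other nodes stay put. With modulo sharding, by contrast, adding a node reshuffles most keys.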
  • Scenario 4 : Functional switches
  • REQUIREMENTS: TURN ON/OFF ANY FEATURE ON SITE
  • PHOTO ANSWERS VIDEO ANSWERS WALL STREAM DATABASE SHARDS SET POSTS PER PAGE FUNCTIONALITY
  • BIT OPERATIONS: SETBIT, GETBIT, BITCOUNT
  • EXAMPLE: @redis.setbit common_settings_key, WALL_ENABLED, 1
  • SCALABILITY: MASTER / SLAVE REPLICATION
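The switches from Scenario 4 map each feature to a bit offset within one key. SETBIT accepts only 0 or 1 as the bit value and returns the previous bit. The sketch below models that behaviour in memory. `FakeBits`, the key name, and every offset other than `WALL_ENABLED` are hypothetical.

```ruby
# Hypothetical flag offsets; only WALL_ENABLED appears on the slide.
WALL_ENABLED          = 0
PHOTO_ANSWERS_ENABLED = 1
VIDEO_ANSWERS_ENABLED = 2

# In-memory stand-in for SETBIT / GETBIT / BITCOUNT (illustration only).
class FakeBits
  def initialize
    @bits = Hash.new { |h, k| h[k] = Hash.new(0) }  # key => { offset => bit }
  end

  def setbit(key, offset, value)
    raise ArgumentError, "bit must be 0 or 1" unless [0, 1].include?(value)
    previous = @bits[key][offset]
    @bits[key][offset] = value
    previous  # like Redis, return the old bit
  end

  def getbit(key, offset)
    @bits[key][offset]
  end

  # Number of bits set to 1 under the key.
  def bitcount(key)
    @bits[key].values.sum
  end
end

store = FakeBits.new
key = "settings/common"
store.setbit(key, WALL_ENABLED, 1)
store.setbit(key, VIDEO_ANSWERS_ENABLED, 1)
store.setbit(key, VIDEO_ANSWERS_ENABLED, 0)  # switch the feature off again

puts store.getbit(key, WALL_ENABLED) == 1  # true
puts store.bitcount(key)                   # 1
```

Since all switches live in one small key, every request can read them cheaply, which fits the replication note on the scalability slide.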
  • Scenario 5 : The Wall
  • THE WALL
  • INITIAL REQUIREMENT: SHOW FRIENDS' POSTS
  • SOLUTION
    SELECT * FROM questions q
    LEFT JOIN followships f ON (q.user_id = f.friend_id)
    WHERE f.user_id = :my_user_id
    ORDER BY q.answered_at
    LIMIT 0,25
  • LATER REQUIREMENTS: LIKES INTRODUCED; SHOW RETWEETS; UNIQUENESS OF ANSWERS; ORDERED BY FIRST OCCURRENCE; PAGINATION NEEDED; DO NOT SHOW OWN POSTS; SHOW RETWEETS SINCE STARTED FOLLOWING A FRIEND
  • MORE REQUIREMENTS: DO NOT SHOW RETWEETS IF ANSWERER OR RETWEETER IS DISABLED; SHOW LATEST FRIENDS WHO LIKED A QUESTION
  • OUR SOLUTION
  • IDEA: STORE A SEPARATE SET OF QUESTIONS FOR EVERY USER
  • REDIS SORTED SET: NON-REPEATING COLLECTIONS OF STRINGS; EACH MEMBER ASSOCIATED WITH A SCORE; QUICKLY ACCESS ELEMENTS IN ORDER; FAST EXISTENCE TEST; FAST ACCESS TO ELEMENTS IN THE MIDDLE
  • STRUCTURE
    user/:user_id/wall
    SCORE:   score_1         score_2         ...   score_N
    MEMBER:  question_id_1   question_id_2   ...   question_id_N
    score = timestamp when the question_id first occurred in the set
  • OPERATIONS
    ● GET USERS WALL
      ○ ZREVRANGEBYSCORE - O(log(N)+M)
    ● USER ANSWERED A QUESTION
      ○ ZADD - O(log(N))
    ● LIKE
      ○ ZRANK - O(log(N))
      ○ ZADD - O(log(N))
    ● REMOVE ANSWER
      ○ ZREM - O(M*log(N))
  • CLEANUP: GUARANTEED 1000-1500 POSTS ON WALL; PERIODICALLY CALL ZCARD, ZREMRANGEBYRANK
  • USER_ID SHARDING
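The wall operations from Scenario 5 can be sketched end to end. The helpers below take any object that responds to `zadd`, `zrank`, `zrevrange`, `zcard`, and `zremrangebyrank`. `FakeZSet` is a hypothetical in-memory stand-in, `WALL_LIMIT` is shrunk for the demo, and reading by rank with ZREVRANGE is swapped in for the ZREVRANGEBYSCORE call named on the slide to keep the stand-in short.

```ruby
# In-memory stand-in for a few sorted-set commands (demo only).
class FakeZSet
  def initialize
    @sets = Hash.new { |h, k| h[k] = {} }  # key => { member => score }
  end

  def zadd(key, score, member)
    added = !@sets[key].key?(member)
    @sets[key][member] = score
    added
  end

  # Ascending rank of the member, or nil when it is absent.
  def zrank(key, member)
    ordered(key).index(member)
  end

  def zrevrange(key, start, stop)
    ordered(key).reverse[start..stop] || []
  end

  def zcard(key)
    @sets[key].size
  end

  # Removes members whose ascending rank is within start..stop.
  def zremrangebyrank(key, start, stop)
    doomed = ordered(key)[start..stop] || []
    doomed.each { |member| @sets[key].delete(member) }
    doomed.size
  end

  private

  def ordered(key)
    @sets[key].sort_by { |member, score| [score, member] }.map(&:first)
  end
end

WALL_LIMIT = 3  # the slide keeps 1000-1500 posts; small here for the demo

def wall_key(user_id)
  "user/#{user_id}/wall"
end

# Add a question only if absent, so the score stays the first-seen timestamp.
def push_to_wall(store, user_id, question_id, timestamp)
  key = wall_key(user_id)
  store.zadd(key, timestamp, question_id) if store.zrank(key, question_id).nil?
end

# Drop the oldest entries once the wall exceeds WALL_LIMIT.
def trim_wall(store, user_id)
  key = wall_key(user_id)
  excess = store.zcard(key) - WALL_LIMIT
  store.zremrangebyrank(key, 0, excess - 1) if excess > 0
end

# Newest first, one page at a time.
def read_wall(store, user_id, page: 0, per_page: 25)
  store.zrevrange(wall_key(user_id), page * per_page, (page + 1) * per_page - 1)
end

store = FakeZSet.new
push_to_wall(store, 1, "q1", 100)
push_to_wall(store, 1, "q2", 200)
push_to_wall(store, 1, "q1", 300)  # a later like: ignored, q1 keeps score 100
push_to_wall(store, 1, "q3", 400)
push_to_wall(store, 1, "q4", 500)
trim_wall(store, 1)
p read_wall(store, 1)  # ["q4", "q3", "q2"]
```

The ZRANK check followed by ZADD is two round trips and is not atomic. Redis versions that support the NX option of ZADD can fold the check into a single command.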
  • Scenario 6 : Real time monitoring
  • PATTERN DETECTION REQUIREMENT
  • HUMAN vs MACHINE
  • MySQL TABLE PULL INITIAL SOLUTION
  • REDIS PUB/SUB: PUBLISH / SUBSCRIBE MESSAGING PARADIGM; ALLOWS PATTERN-MATCHING SUBSCRIPTIONS. OPERATIONS: SUBSCRIBE, UNSUBSCRIBE, PUBLISH
  • SCHEMA
  • MODERATORS PANEL
  • SCALING: PUBLISH time complexity is O(N+M), where N is the number of clients subscribed to the receiving channel and M is the total number of subscribed patterns (by any client).
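The O(N+M) cost of PUBLISH is easier to see in a toy model: the broker visits the N subscribers of the channel and then tests every one of the M registered patterns. `ToyBroker` is a hypothetical in-process class, not a Redis client, and the channel names are invented. Glob matching is approximated with Ruby's `File.fnmatch`.

```ruby
# Toy in-process broker that mimics PUBLISH fan-out (not a Redis client).
class ToyBroker
  def initialize
    @channels = Hash.new { |h, k| h[k] = [] }  # channel => callbacks
    @patterns = []                             # [glob, callback] pairs
  end

  def subscribe(channel, &callback)
    @channels[channel] << callback
  end

  def psubscribe(glob, &callback)
    @patterns << [glob, callback]
  end

  # Visits the N subscribers of the channel, then all M patterns.
  # Returns how many subscribers received the message.
  def publish(channel, message)
    receivers = 0
    @channels.fetch(channel, []).each do |callback|
      callback.call(channel, message)
      receivers += 1
    end
    @patterns.each do |glob, callback|
      next unless File.fnmatch(glob, channel)
      callback.call(channel, message)
      receivers += 1
    end
    receivers
  end
end

broker = ToyBroker.new
seen = []
broker.subscribe("events.like") { |_channel, msg| seen << "exact:#{msg}" }
broker.psubscribe("events.*")   { |channel, _msg| seen << "pattern:#{channel}" }

puts broker.publish("events.like", "user 7 liked 42")  # 2
puts broker.publish("events.login", "user 7")          # 1
```

Every publish pays for all registered patterns, even those that do not match, which is why a large number of pattern subscriptions slows down every channel.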
  • So, why Redis?
  • WHY REDIS? SIMPLE, FAST, FLEXIBLE, ROBUST, FREE
  • WHAT'S MISSING: CLUSTERING
  • Not covered?
  • NOT COVERED: SETS, LUA SCRIPTING, TRANSACTIONS, PIPELINING
  • JAVA
  • Questions