Indexing Thousands of Writes per Second with Redis (RailsConf 2011)
Paul Dix
paul@pauldix.net
@pauldix
http://pauldix.net
I’m Paul Dix
I wrote this book
Benchmark Solutions (who I work for; we’re hiring, duh! email: paul@benchmarksolutions.com)
Before we get to the talk... (had a spiel about the suit)
That bastard stole my thunder!
Señor Software Engineer (you don’t think of a suit-wearing badass)
I work in...
Finance
Vice President (the janitors, cleaning staff, and the 18 year old intern get this title too...)
Finance + VP + Suit = douchebag. How could I wear anything but a suit?
Distraction (http://www.flickr.com/photos/33562486@N07/4288275204/)
Bet (http://www.flickr.com/photos/11448492@N07/2825781502/)
Bar (http://www.flickr.com/photos/11448492@N07/2825781502/)
@flavorjones, coauthor of Nokogiri (credit: @ebiltwin)
JSON vs. XML
XML Sucks Hard
JSON is teh awesome
XML parsing S-L-O-W
10x slower
Mike called BS
A bet!
and I was like: “sure, for a beer”
and Mike was all like: “ok, but that’s lame”
“let’s make it interesting. Loser wears my daughter’s fairy wings during your talk”
Sure, that’ll be funny and original...
Dr. Nic in fairy wings
That bastard stole my thunder!
Nic may have done it as part of the talk, but he didn’t lose a bet... put wings on in red-faced shame. So who won?
credit: @jonathanpberger
Nokogiri ~ 6.8x slower
REXML (ActiveRecord.from_xml) ~ 400x slower
Lesson: Always use JSON
Lesson: Don’t make bar bets
However, the bet said nothing about my slides
Aaron Patterson, father of Nokogiri (3 slides with @tenderlove’s picture? wtf?!!)
Called Mike: “Nokogiri’s mother”
Fairy Godmother?
Lesson: Learn Photoshop (this shit is embarrassing)
Anyway, the point of the suit...
take me seriously, dammit!
On to the actual talk...
it’s about...
Redis
Sustained write load of ~5k per second
Redis + other datastores = bad assery
@flavorjones (and maybe about Mike being some kind of virgin mother) credit: @ebiltwin
Lesson: Be specific about the terms of a bet (because at least someone can use Photoshop)
Who’s used Redis?
NoSQL
Key/Value Store
Created by Salvatore Sanfilippo (@antirez)
Data structure store
Basics
Keys

require "redis"
redis = Redis.new
redis.set("mike", "grants wishes")  # => "OK"
redis.get("mike")                   # => "grants wishes"

Counters

redis.incr("fairy_references")        # => 1
redis.decr("dignity")                 # => -1
redis.incrby("fairy_references", 23)  # => 24
redis.decrby("dignity", 56)           # => -57

Expiration

redis.expire("mike", 120)
redis.expireat("mike", 1.day.from_now.midnight.to_i)

Hashes

redis.hset("paul", "has_wings", true)
redis.hget("paul", "has_wings")  # => "true"
redis.hmset("paul", :location, "Baltimore", :twitter, "@pauldix")
redis.hgetall("paul")  # => { "has_wings" => "true",
                       #      "location"  => "Baltimore",
                       #      "twitter"   => "@pauldix" }
redis.hlen("paul")     # => 3

Lists

redis.lpush("events", "first")       # => 1
redis.lpush("events", "second")      # => 2
redis.lrange("events", 0, -1)        # => ["second", "first"]
redis.rpush("events", "third")       # => 3
redis.lrange("events", 0, -1)        # => ["second", "first", "third"]
redis.lpop("events")                 # => "second"
redis.lrange("events", 0, -1)        # => ["first", "third"]
redis.rpoplpush("events", "fourth")  # => "third"

Sets

redis.sadd("user_ids", 1)       # => true
redis.scard("user_ids")         # => 1
redis.smembers("user_ids")      # => ["1"]
redis.sismember("user_ids", 1)  # => true
redis.srem("user_ids", 1)       # => true

Sets Continued

# know_paul ["1", "3", "4"]
# know_mike ["3", "5"]
redis.sinter("know_paul", "know_mike")  # => ["3"]
redis.sdiff("know_paul", "know_mike")   # => ["1", "4"]
redis.sdiff("know_mike", "know_paul")   # => ["5"]
redis.sunion("know_paul", "know_mike")  # => ["1", "3", "4", "5"]

Sorted Sets

redis.zadd("wish_counts", 2, "paul")  # => true
redis.zcard("wish_counts")            # => 1
redis.zscore("wish_counts", "paul")   # => "2"
redis.zrem("wish_counts", "paul")     # => true

Sorted Sets Continued

redis.zadd("wish_counts", 12, "rubyland")
redis.zrange("wish_counts", 0, -1)
# => ["paul", "rubyland"]
redis.zrange("wish_counts", 0, -1, :with_scores => true)
# => ["paul", "2", "rubyland", "12"]
redis.zrevrange("wish_counts", 0, -1)
# => ["rubyland", "paul"]

Sorted Sets Continued

redis.zrevrangebyscore("wish_counts", "+inf", "-inf")
# => ["rubyland", "paul"]
redis.zrevrangebyscore("wish_counts", "+inf", "10")
# => ["rubyland"]
redis.zrevrangebyscore("wish_counts", "+inf", "-inf", :limit => [0, 1])
# => ["rubyland"]
Lesson: Keeping examples consistent with a stupid story is hard
There’s more: pub/sub, transactions, more commands. Not covered here, leave me alone.
Crazy Fast
Faster than a greased       cheetah
or a Delorean with 1.21       gigawatts
OMG Scaling Sprinkles!
No Wishes Granted (f-you, f-ball!)
Lesson: Getting someone to pose is easier (also, learn Photoshop)
Still monolithic (not horizontally scalable, oh noes!)
Can shard in the client like memcached (I know, haters, you can do this)
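A minimal sketch of what client-side sharding can look like, assuming the redis-rb gem’s Redis::Distributed class and two local nodes (my addition, not from the slides): keys are spread across nodes by consistent hashing, the same way memcached clients do it.

# Hypothetical two-node setup. Redis::Distributed ships with the redis-rb gem
# and hashes each key onto one of the listed nodes.
require "redis"
require "redis/distributed"

redis = Redis::Distributed.new([
  "redis://127.0.0.1:6379",
  "redis://127.0.0.1:6380"
])

redis.hset("bonds|1", "bid_price", 96.01)  # lands on whichever node owns "bonds|1"
redis.sadd("bond_ids", 1)                  # may land on a different node

Note the caveat that comes back later: commands touching multiple keys (SINTER, SUNION, SORT with :by/:get) only work when all the keys involved live on the same node.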
Still not highly available
Still susceptible to      partitions
However, it’s wicked cool
Why Index with Redis?
Don’t (you probably don’t need it) http://www.flickr.com/photos/34353483@N00/205467442/
and you’re all like, “Paul, ... But I have to SCALE!”
No you don’t
Trust me, I’m wearing a suit
that means I have authority and... I know shit
and still you cry: “But no, really...”
Sad SQL is Sad (thousands of writes per second? No me gusta!)
ok, fine.
My Use Cases
40k unique things
Updating every 10 seconds
Plus other updates...
Average write load of 3k-5k writes per second
LVC (last value cache)

redis.hset("bonds|1", "bid_price", 96.01)
redis.hset("bonds|1", "ask_price", 97.53)
redis.hset("bonds|2", "bid_price", 90.50)
redis.hset("bonds|2", "ask_price", 92.25)
redis.sadd("bond_ids", 1)
redis.sadd("bond_ids", 2)
Index on the fly
SORT

redis.sort("bond_ids",
  :by => "bonds|*->bid_price")  # => ["2", "1"]
redis.sort("bond_ids",
  :by => "bonds|*->bid_price",
  :get => "bonds|*->bid_price")  # => ["90.5", "96.01"]
redis.sort("bond_ids",
  :by => "bonds|*->bid_price",
  :get => ["bonds|*->bid_price", "#"])  # => ["90.5", "2", "96.01", "1"]

SORT Continued

redis.sort("bond_ids",
  :by => "bonds|*->bid_price",
  :limit => [0, 1])  # => ["2"]
redis.sort("bond_ids",
  :by => "bonds|*->bid_price",
  :order => "desc")  # => ["1", "2"]
redis.sort("bond_ids",
  :by => "bonds|*->ask_price")  # => ["1", "2"]
redis.sort("bond_ids",
  :by => "bonds|*->ask_price",
  :store => "bond_ids_sorted_by_ask_price",
  :expire => 300)  # => 2

Getting Records

# note that prices (high write volume data) come from elsewhere (not the SQL db)
ids = redis_sort_results.map {|id| id.to_i}
bonds = Bond.find(ids)
bond_ids_to_bond = {}
bonds.each do |bond|
  bond_ids_to_bond[bond.id] = bond
end
results = ids.map do |id|
  bond_ids_to_bond[id]
end

Getting From Redis

redis.hset("bonds|2", "values", data.to_json)
raw_json = redis.sort("bond_ids",
  :by => "bonds|*->bid_price",
  :get => "bonds|*->values")
results = raw_json.map do |json|
  DataObject.new(JSON.parse(json))
end
# However, then you have to worry about keeping the two data stores in sync.
# We’ll talk about that later.
Pre-Indexing
Rolling Index
Last n Events
Activity Log
News Feed
Use a List

# O(1) constant time to add, O(start + n) for reading
N = 500
size = redis.lpush("bond_trades|1", trade_id)
# roll the index
redis.rpop("bond_trades|1") if size > N
# get results
redis.lrange("bond_trades|1", 0, 49)

Indexing Events Since Time T

Using a List

redis.lpush("bond_trades|1|2011-05-19-10", trade_id)
redis.lrange("bond_trades|1|2011-05-19-10", 0, -1)

results = redis.pipelined do
  redis.lrange("bond_trades|1|2011-05-19-10", 0, -1)
  redis.lrange("bond_trades|1|2011-05-19-09", 0, -1)
end.flatten

Rolling the Index

# when something trades
redis.sadd("bonds_traded|2011-05-19-10", bond_id)

# cron task to remove old data
traded_ids = redis.smembers("bonds_traded|2011-05-19-10")
keys = traded_ids.map do |id|
  "bond_trades|#{id}|2011-05-19-10"
end
keys << "bonds_traded|2011-05-19-10"
redis.del(*keys)

Using a Sorted Set

# time-based rolling index using a sorted set
# O(log(n)) writes, O(log(n) + M) reads
redis.zadd("bond_trades|1", Time.now.to_i, trade_id)

# last 20 trades
redis.zrevrange("bond_trades|1", 0, 19)

# trades in the last hour
redis.zrevrangebyscore("bond_trades|1", "+inf", 1.hour.ago.to_i)

Rolling the Index

# cron task to roll the index
bond_ids = redis.smembers("bond_ids")
remove_since_time = 24.hours.ago.to_i
redis.pipelined do
  bond_ids.each do |id|
    redis.zremrangebyscore(
      "bond_trades|#{id}", "-inf", remove_since_time)
  end
end

Or roll on read or write

redis.zadd("bond_trades|1", Time.now.to_i, trade_id)
redis.zremrangebyscore("bond_trades|1", "-inf", 24.hours.ago.to_i)

Indexing N Values

redis.zadd("highest_follower_counts", 2300, 20)
redis.zadd("lowest_follower_counts", 2300, 20)

# rolling the indexes
# keep the lowest N
size = redis.zcard("lowest_follower_counts")
redis.zremrangebyrank("lowest_follower_counts", N, -1) if size > N

# keep the highest N
size = redis.zcard("highest_follower_counts")
redis.zremrangebyrank("highest_follower_counts", 0, size - N - 1) if size > N
Rolling requires more roundtrips: 2 roundtrips (avoidable only with complex pipelining)
Roll indexes with only one trip
Tweet to @antirez that you want scripting
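Until server-side scripting exists, one way to get close to a single trip is to batch the add and the trim together. A minimal sketch, assuming redis-rb’s multi block and a hypothetical index_trade helper (not from the slides):

# Hypothetical helper: ZADD and ZREMRANGEBYSCORE are queued and sent in a
# single network roundtrip; MULTI/EXEC also makes the pair atomic on the server.
def index_trade(redis, bond_id, trade_id)
  redis.multi do
    redis.zadd("bond_trades|#{bond_id}", Time.now.to_i, trade_id)
    redis.zremrangebyscore("bond_trades|#{bond_id}", "-inf", 24.hours.ago.to_i)
  end
end

It’s still two commands on the server side, which is why scripting is the thing to ask for.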
Keeping it Consistent
create/update/destroy
No transactions: application logic (database transactions can’t help you here, you’ll have to put the index updates into your application logic)
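A minimal sketch of what that application logic might look like, assuming a Rails model and a hypothetical $redis connection (my addition, not from the slides): write the SQL row first, then update the Redis index in callbacks.

# Hypothetical model: if the process dies between the SQL write and the Redis
# write, the recovery script described below closes the gap.
class Trade < ActiveRecord::Base
  after_create  :index_in_redis
  after_destroy :remove_from_redis

  def index_in_redis
    $redis.zadd("bond_trades|#{bond_id}", created_at.to_i, id)
  end

  def remove_from_redis
    $redis.zrem("bond_trades|#{bond_id}", id)
  end
end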
Disaster Recovery
Two failure scenarios
Web app dies
Redis server dies
Could result in index   inconsistency
Simple recovery script
Write Index Times

redis.set("last_bond_trade_indexed", trade.created_at.to_i)

Restore Each Index

time_int = redis.get("last_bond_trade_indexed").to_i
index_time = Time.at(time_int)
trades = Trade.where(
  "created_at > :index_time AND created_at <= :now",
  {:index_time => index_time, :now => Time.now})
trades.each do |trade|
  trade.index_in_redis
end
# With the list version you have to run this while not writing new data.
# The set version can be made to run while writing new data.
Our scale
Single Process
Easy to scale (consistent hashing), but set intersection, union, and diff won’t work across servers, and SORT won’t work unless all the keys involved fall on the same server
Works like a champ
Final Thoughts
Use only if you have to!
Index the minimum to keep memory footprint down: use rolling indexes, don’t keep more shit in memory than you need. Users won’t page through 20 pages of results, so don’t store that many.
Plan for disaster and consistency checking
Finally...
Look at my circle, bitches!
Lesson: Never trust a guy in a suit not to pull a fast one on you
Thanks!
Paul Dix
paul@pauldix.net
@pauldix
http://pauldix.net