A look at NoSQL solutions implemented at RTV Slovenia and elsewhere: the problems we are trying to solve, and an introduction to solving them with Redis.
Talk given at #wwwh @ Ljubljana, 30.1.2013 by me, Tit Petric
NoSQL: why it’s here
SQL:
- Slow query performance
- Concurrency / locking
- Hard to scale (even harder for writes, storage)
Typical problems
- Session storage
- Statistics (high write to read ratio)
- Modifying schema on large data sets
Tit Petric / Twitter @titpetric
NoSQL: memcache
Memcache (RTV 2008-Present)
- Pro: stability, speed
- Pro: simple text protocol (a binary protocol was added later, to our annoyance)
- Love/Hate: Scaling out reads/writes
- Con: Persistence
- Con: Replication
- Con: Key eviction is not a global LRU/TTL; it depends on slab allocation
NoSQL: sharedance
Sharedance (2009-2011)
- Pro: Persistent KV storage
- Pro: Simple text protocol (wrote a client in Lua)
- Con: Had to patch daemon to handle eviction load (1 key = 1 file, and filesystems cope badly with that many small files)
- Con: Had to use special ReiserFS filesystem on deployment
NoSQL: uses at RTV Slovenia
To-do (2013-?)
- Memcached protocol translator to Redis
- Look at twemcached, avoid client based sharding
- Implement webdis deployment
- Redis scripting with Lua
NoSQL: Redis data types
Redis is still a Key/Value store!
But the values can be of different types:
- Strings (essentially the same as memcache)
- Hashes (nested k/v pairs)
- Lists (simple arrays)
- Sets (unique arrays)
- Sorted sets (weighted unique arrays)
LET'S SEE SOME EXAMPLES!
Redis data types: Strings
Limiting Google bot crawl rate
“A 503 (Service Unavailable) error will result in fairly
frequent retrying. To temporarily suspend crawling, it is
recommended to serve a 503 HTTP result code.”
SETNX – set a key if it doesn’t exist
EXPIRE – expire a key after TTL seconds
INCR – increment value by one
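The three commands above combine into a fixed-window request counter. What follows is a minimal sketch, not the code shown in the talk. It assumes a redis-py style client exposing `setnx`, `expire` and `incr`; the key name, limit and window are made-up values, and `InMemoryStore` is a hypothetical stand-in so the example runs without a Redis server.

```python
import time


class InMemoryStore:
    """Hypothetical stand-in for a Redis client (setnx/expire/incr only)."""

    def __init__(self):
        self.data = {}      # key -> integer value
        self.deadline = {}  # key -> absolute expiry time

    def _purge(self, key):
        # Drop the key once its TTL has passed, as Redis would.
        if key in self.deadline and time.time() >= self.deadline[key]:
            self.data.pop(key, None)
            self.deadline.pop(key, None)

    def setnx(self, key, value):
        self._purge(key)
        if key in self.data:
            return False
        self.data[key] = int(value)
        return True

    def expire(self, key, seconds):
        self._purge(key)
        if key not in self.data:
            return False
        self.deadline[key] = time.time() + seconds
        return True

    def incr(self, key):
        self._purge(key)
        self.data[key] = self.data.get(key, 0) + 1
        return self.data[key]


def allow_crawl(client, key="crawl:googlebot", limit=5, window=1):
    """Return True to serve the page, False to answer with HTTP 503."""
    # SETNX creates the counter only for the first hit in a window ...
    if client.setnx(key, 0):
        # ... and EXPIRE ends that window after `window` seconds.
        client.expire(key, window)
    # INCR returns the new count, so every request sees its own position.
    return client.incr(key) <= limit


store = InMemoryStore()
results = [allow_crawl(store, limit=3, window=60) for _ in range(5)]
print(results)  # [True, True, True, False, False]
```

On a real server, a crash between SETNX and EXPIRE would leave a counter without a TTL. Redis 2.6.12 and later can close that gap with a single `SET key 0 NX EX <seconds>`.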
Redis data types: Hashes
News ratings
HMSET – set multiple hash fields / values
HGETALL – get all fields and values
HINCRBY – increment integer value of a hash field
Notes on the vote data example:
- Why expire?
- Race condition. We need HMSETNX (Redis only has HSETNX, for a single field).
- Could be better.
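One way around the missing HMSETNX is to skip the initialisation step altogether: HINCRBY creates the hash and the field when they do not exist yet, starting from 0. The sketch below is an illustration under assumptions, not the code from the talk. The key pattern `news:<id>:rating` and the field names `votes` and `score` are invented, and `InMemoryHashes` is a hypothetical stand-in for a redis-py style client.

```python
class InMemoryHashes:
    """Hypothetical stand-in for a Redis client (hincrby/hgetall only)."""

    def __init__(self):
        self.hashes = {}

    def hincrby(self, key, field, amount=1):
        # Like Redis, a missing hash or field starts at 0.
        fields = self.hashes.setdefault(key, {})
        fields[field] = fields.get(field, 0) + amount
        return fields[field]

    def hgetall(self, key):
        return dict(self.hashes.get(key, {}))


def rate_news(client, news_id, stars):
    """Record one vote worth `stars` points for a news item."""
    key = f"news:{news_id}:rating"
    client.hincrby(key, "votes", 1)
    client.hincrby(key, "score", stars)


def average_rating(client, news_id):
    """Average stars per vote, or 0.0 when nobody has voted yet."""
    data = client.hgetall(f"news:{news_id}:rating")
    votes = int(data.get("votes", 0))
    return int(data.get("score", 0)) / votes if votes else 0.0


store = InMemoryHashes()
for stars in (5, 4, 3):
    rate_news(store, 42, stars)
print(store.hgetall("news:42:rating"), average_rating(store, 42))
```

With redis-py itself, creating the client with `decode_responses=True` makes HGETALL return `str` field names, which is what `average_rating` expects.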
Other use cases include:
- User data, partial data retrieval
- select username, realname, birthday from users where id=?
- HMGET users:$id username realname birthday
- Using SORT (list|set) BY hash values
- Don’t use hashes to store sessions. Expiry and eviction work on whole KEYS, not on individual hash fields!
Redis data types: Lists
Any kind of information log (statistics,…)
LPUSH – push values to the beginning of the list
RPUSH – push values to the end of the list
LRANGE – get a range of values
LTRIM – trim a list to the specified range
LLEN – get the length of the list
Collecting statistics
We can skip the database completely
Process data into SQL database
Well, it’s a way to scale writes to SQL.
The processing job can be down for a long time, because:
- Back-of-the-envelope calculation for Redis memory use: 100M keys use about 16 GB of RAM
- Logs get processed in small chunks (200 items), avoiding memory limits. The chunk size could be increased by a lot.
- We also use sharding, so writes are distributed per $table
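The write path described above can be sketched as follows: web requests RPUSH log records onto one list per table, and a worker drains each list in chunks of 200 with LRANGE and LTRIM. This is a sketch under assumptions, not the production code. The `log:<table>` key pattern and the `write_rows` callback are invented, and `InMemoryLists` is a hypothetical stand-in that only supports the index forms used here.

```python
class InMemoryLists:
    """Hypothetical stand-in for a Redis client (rpush/lrange/ltrim/llen)."""

    def __init__(self):
        self.lists = {}

    def rpush(self, key, *values):
        self.lists.setdefault(key, []).extend(values)
        return len(self.lists[key])

    def lrange(self, key, start, stop):
        items = self.lists.get(key, [])
        # Redis ranges are inclusive; only stop >= 0 or stop == -1 is handled.
        end = len(items) if stop == -1 else stop + 1
        return items[start:end]

    def ltrim(self, key, start, stop):
        self.lists[key] = self.lrange(key, start, stop)

    def llen(self, key):
        return len(self.lists.get(key, []))


def log_hit(client, table, row):
    """Web request side: append one record, no SQL involved."""
    client.rpush(f"log:{table}", row)


def drain_chunk(client, table, write_rows, chunk=200):
    """Worker side: move up to `chunk` records into SQL, return the count."""
    key = f"log:{table}"
    rows = client.lrange(key, 0, chunk - 1)
    if rows:
        write_rows(table, rows)           # e.g. one multi-row INSERT
        client.ltrim(key, len(rows), -1)  # drop only what was written
    return len(rows)


store = InMemoryLists()
for i in range(450):
    log_hit(store, "pageviews", f"hit-{i}")

written = []
sizes = []
while store.llen("log:pageviews"):
    sizes.append(
        drain_chunk(store, "pageviews", lambda table, rows: written.extend(rows))
    )
print(sizes, len(written))  # [200, 200, 50] 450
```

With a single worker per list this stays consistent while requests keep arriving: producers append at the tail and the worker only trims from the head. If the worker is down, records simply accumulate in the list until it returns.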
Redis data types: Sets
Set values are UNIQUE
SADD – Add one or more members to a set
Perfect use case: set intersection with SINTERSTORE, to find duplicates.
MySQL is too slow for this, even with indexes…
SET Intersection in MySQL
List1 = first table of data
List2 = second table of data
Bulk transfer MySQL data to redis
Via: http://dcw.ca/blog/2013/01/02/mysql-to-redis-in-one-step/
SET Intersection in Redis
Much faster, without indexes!
0.118 seconds in Redis vs. 1.35 seconds in MySQL (+0.36 for the index)
Roughly 11x faster on the query alone, about 15x when the index time is counted!
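The Redis side of this comparison comes down to two SADD calls and one SINTERSTORE. The sketch below illustrates the idea and is not the benchmark code. The key names `list1`, `list2` and `dupes` and the sample values are made up, and `InMemorySets` is a hypothetical stand-in for a redis-py style client.

```python
class InMemorySets:
    """Hypothetical stand-in for a Redis client (sadd/sinterstore/smembers)."""

    def __init__(self):
        self.sets = {}

    def sadd(self, key, *members):
        target = self.sets.setdefault(key, set())
        before = len(target)
        target.update(members)
        return len(target) - before  # number of newly added members

    def sinterstore(self, dest, *keys):
        result = set.intersection(*(self.sets.get(k, set()) for k in keys))
        self.sets[dest] = result
        return len(result)

    def smembers(self, key):
        return set(self.sets.get(key, set()))


def find_duplicates(client, first, second):
    """Load both data sets and return the values present in both."""
    client.sadd("list1", *first)
    client.sadd("list2", *second)
    # The intersection is computed server-side and stored under "dupes".
    client.sinterstore("dupes", "list1", "list2")
    return client.smembers("dupes")


store = InMemorySets()
dupes = find_duplicates(
    store,
    ["ana", "bor", "cene", "ana"],  # repeated values collapse in a set
    ["cene", "dana", "ana"],
)
print(sorted(dupes))  # ['ana', 'cene']
```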
Other possible uses for sets:
• Common friends between A and B
• Friend suggestions (You might know…)
• People currently online …
Redis data types: Sets vs. Sorted sets
OK, a typical use case in SQL:
select title, content from news order by stamp desc limit 0,10
#1) Use SORT from Redis + HMGET
#2) Use sorted sets (ZSET type)
Redis data types: Sorted sets
Sorted sets by time with a PK auto_increment? NO!
• Most read news items (sort by views)
• Order comments by comment rating
• Friends by most friends in common
Order comments by rating
ZINCRBY – increase/decrease score of item
ZRANGE – return portion of sorted set, ASC
ZREVRANGE – portion of sorted set, DESC
Sort comments by rating! With pagination!
ZRANGE – return portion of sorted set, ASC
ZREVRANGE – portion of sorted set, DESC
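Rating and pagination fit in two small functions: ZINCRBY adjusts a comment's score, and ZREVRANGE with a start/stop window returns one page, best first. This is a sketch under assumptions, not the code from the talk. The `comments:<article>` key pattern and the page size are invented, the `zincrby(key, amount, member)` argument order follows redis-py 3.x, and `InMemoryZSets` is a hypothetical stand-in that only handles non-negative indexes.

```python
class InMemoryZSets:
    """Hypothetical stand-in for a Redis client (zincrby/zrevrange only)."""

    def __init__(self):
        self.zsets = {}

    def zincrby(self, key, amount, member):
        scores = self.zsets.setdefault(key, {})
        scores[member] = scores.get(member, 0.0) + amount
        return scores[member]

    def zrevrange(self, key, start, stop):
        scores = self.zsets.get(key, {})
        # Highest score first; ties in reverse lexicographic order, like Redis.
        ordered = sorted(scores, key=lambda m: (scores[m], m), reverse=True)
        return ordered[start:stop + 1]


def vote(client, article_id, comment_id, delta):
    """Raise (or, with a negative delta, lower) a comment's rating."""
    return client.zincrby(f"comments:{article_id}", delta, comment_id)


def top_comments(client, article_id, page=1, per_page=2):
    """Return one page of comment ids, highest rated first."""
    start = (page - 1) * per_page
    stop = start + per_page - 1  # ZREVRANGE bounds are inclusive
    return client.zrevrange(f"comments:{article_id}", start, stop)


store = InMemoryZSets()
votes = [("c1", 1), ("c2", 5), ("c3", 3), ("c2", -1), ("c4", 2), ("c5", 1)]
for comment_id, delta in votes:
    vote(store, 7, comment_id, delta)

pages = [top_comments(store, 7, page=n) for n in (1, 2, 3)]
print(pages)  # [['c2', 'c3'], ['c4', 'c5'], ['c1']]
```

Against a real server the returned members are bytes unless the redis-py client is created with `decode_responses=True`.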
Scaling Redis deployment
SLAVEOF [host] [port]
Starts replicating from [host]:[port], making this instance a slave
SLAVEOF NO ONE
Promote instance to MASTER role
The phpredis client does not implement sharding by itself! But …
- Master / Multi-slave scaling is easy to do
- Failover for reads is easy, node ejection possible
- Client deploys still take time – twemproxy is an option
- Twemproxy also provides sharding support, and speaks the Memcached protocol as well
- Want to see what Redis is doing? Issue the “MONITOR” command.
- Stale data is better than no data, we still consider Redis volatile
- FlushDB = rebuild cache, we tolerate data loss
Redis: Q & A section
Questions and answers!
Follow me on Twitter: @titpetric
Read our tech blog: http://foreach.org