Redis and Python: It's PB & J time
by Josiah Carlson
@dr_josiah
dr-josiah.blogspot.com
bit.ly/redis-in-action
What will be covered
• Who am I?
• What is Redis?
• Why Redis with Python?
• Cool stuff you can do by combining them
Who am I?
• A Python user for 12+ years
• Former python-dev bike-shedder
• Former maintainer of Python async sockets libraries
• Author of a few small open-source projects
  o rpqueue, parse-crontab, async_http, timezone-utils, PyPE
• Worked at some cool places you've never heard of (Networks In Motion, Ad.ly)
• Cool places you have (Google)
• And cool places you will (ChowNow)
• Heavy user of Redis
• Author of the upcoming Redis in Action
What is Redis?
• In-memory database/data structure server
  o Limited to main memory; vm and diskstore defunct
• Persistence via snapshot or append-only file
• Support for master/slave replication (multiple slaves and slave chaining supported)
  o No master-master, don't even try
  o Client-side sharding
  o Cluster is in progress
• Five data structures + publish/subscribe
  o Strings, Lists, Sets, Hashes, Sorted Sets (ZSETs)
• Server-side scripting with Lua in Redis 2.6
What is Redis? (compared to other databases/caches)
• Memcached
  o in-memory, no persistence, counters, strings, very fast, multi-threaded
• Redis
  o in-memory, optionally persisted, data structures, very fast, server-side scripting, single-threaded
• MongoDB
  o on-disk, speed inversely related to data integrity, bson, master/slave, sharding, multi-master, server-side mapreduce, database-level locking
• Riak
  o on-disk, pluggable data stores, multi-master sharding, RESTful API, server-side map-reduce, (Erlang + C)
• MySQL/PostgreSQL
  o on-disk/in-memory, pluggable data stores, master/slave, sharding, stored procedures, ...
What is Redis? (Strings)
• Really scalars of a few different types
  o Character strings: concatenate values to the end, get/set individual bits, get/set byte ranges
  o Integers (platform long int): increment/decrement, auto "casting"
  o Floats (IEEE 754 FP Double): increment/decrement, auto "casting"
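A minimal sketch of a few of the string commands above, using redis-py. `conn` is assumed to be a client created with `redis.Redis(decode_responses=True)` so replies come back as `str`; the function name and the key `demo:string` are made up for illustration.

```python
def string_examples(conn, key="demo:string"):
    """Run a few string commands and return what was read back."""
    conn.set(key, "hello")
    conn.append(key, " world")             # concatenate to the end of the value
    first_word = conn.getrange(key, 0, 4)  # byte range, both ends inclusive
    conn.set(key, 10)                      # stored as a string, usable as an integer
    total = conn.incr(key, 5)              # the "auto-casting" increment
    return first_word, total
```

Against a live server this should return `("hello", 15)`: the first five bytes of the appended string, then the counter after the increment.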
What is Redis? (Lists)
• Doubly-linked list of character strings
  o Push/pop from both ends
  o [Blocking] pop from multiple lists
  o [Blocking] pop from one list, push on another
  o Get/set/search for item in a list
  o Sortable
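One common use of push plus blocking pop is a simple work queue. The sketch below assumes `conn` is a redis-py client; the function names and the JSON encoding of tasks are illustrative choices, not something the slides prescribe.

```python
import json

def enqueue(conn, queue, task):
    """Push a JSON-encoded task onto the right end of the list."""
    conn.rpush(queue, json.dumps(task))

def dequeue(conn, queue, timeout=1):
    """Blocking pop from the left end; returns None if nothing arrives in time."""
    item = conn.blpop([queue], timeout=timeout)
    if item is None:
        return None
    _key, payload = item  # BLPOP replies with (list name, value)
    return json.loads(payload)
```

Because `blpop` accepts several list names, the same `dequeue` shape extends to the "pop from multiple lists" case by passing more than one queue.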
What is Redis? (Sets)
• Unique unordered sequence of character strings
  o Backed by a hash table
  o Add, remove, check membership, pop, random pop
  o Set intersection, union, difference
  o Sortable
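As a rough illustration of add plus intersection, here is a sketch that tracks unique visitors per day. `conn` is assumed to be a redis-py client; the `visitors:` key prefix and the function names are hypothetical.

```python
def record_visit(conn, day, user_id):
    """Add the user to that day's set; True if this is their first visit of the day."""
    return conn.sadd("visitors:" + day, user_id) == 1

def returning_visitors(conn, day_a, day_b):
    """Users seen on both days, computed server-side with SINTER."""
    return conn.sinter("visitors:" + day_a, "visitors:" + day_b)
```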
What is Redis? (Hashes)
• Key-value mapping inside a key
  o Get/Set/Delete single/multiple
  o Increment values by ints/floats
  o Bulk fetch of Keys/Values/Both
  o Sort of like a small version of Redis that only supports strings/ints/floats
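A small sketch of a hash used as a per-user record, touching set, increment, and bulk fetch. `conn` is assumed to be a redis-py client; the `user:` key layout and the field names are invented for the example.

```python
def record_login(conn, user_id, name):
    """Store the user's name and bump their login counter; returns the new count."""
    key = "user:%s" % user_id
    conn.hset(key, "name", name)           # set a single field
    return conn.hincrby(key, "logins", 1)  # integer increment inside the hash

def load_user(conn, user_id):
    """Bulk-fetch every field and value of the user's hash."""
    return conn.hgetall("user:%s" % user_id)
```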
What is Redis? (Sorted Sets - ZSETs)
• Like a Hash, with members and scores, scores limited to float values
  o Get, set, delete, increment
  o Can be accessed by the sorted order of the (score, member) pair: by score, by index
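To make "increment" and "access by index" concrete, here is a leaderboard sketch. It assumes `conn` is a redis-py 3.0+ client, where the argument order is `zincrby(name, amount, value)`; the function names are illustrative.

```python
def add_points(conn, board, member, points):
    """Increment a member's score, creating the member if needed."""
    return conn.zincrby(board, points, member)

def top_n(conn, board, n=3):
    """Highest-scoring members first, as (member, score) pairs."""
    return conn.zrevrange(board, 0, n - 1, withscores=True)
```

`top_n` reads by index in descending score order; a range-by-score query such as `zrangebyscore` would cover the "by score" access path instead.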
What is Redis? (Publish/Subscribe)
• Readers subscribe to "channels" (exact strings or patterns)
• Writers publish to channels, broadcasting to all subscribers
• Messages are transient
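A hedged sketch of both sides with redis-py. `conn` is assumed to be a redis-py client and `pubsub` the object returned by `conn.pubsub()` after calling `subscribe(...)` or `psubscribe(...)` on it; the function names are made up for illustration.

```python
def publish_event(conn, channel, text):
    """Broadcast to current subscribers; returns how many clients received it."""
    return conn.publish(channel, text)

def collect_messages(pubsub, limit):
    """Block on an already-subscribed pubsub object until `limit` payloads arrive."""
    payloads = []
    for message in pubsub.listen():
        # Skip subscribe/unsubscribe confirmations; keep real messages only.
        if message["type"] not in ("message", "pmessage"):
            continue
        payloads.append(message["data"])
        if len(payloads) >= limit:
            break
    return payloads
```

Because messages are transient, anything published before the reader has subscribed is simply never delivered to it, so the subscriber needs to be set up first.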
Why Redis with Python?
• The power of Python lies in:
  o Reasonably sane syntax/semantics
  o Easy manipulation of data and data structures
  o Large and growing community
• Redis also has:
  o Reasonably sane syntax/semantics
  o Easy manipulation of data and data structures
  o Medium-sized and growing community
  o Available as remote server: like a remote IPython, only for data; so useful, people have asked for a library version
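Per-hour and Per-day hit counters

    import redis

    def process_lines(prefix, logfile):
        conn = redis.Redis()
        with open(logfile, 'rb') as lines:
            # parse_line is assumed to be defined elsewhere and to return an
            # object with .timestamp (a datetime) and .path
            for log in map(parse_line, lines):
                time = log.timestamp.isoformat()
                hour = time.partition(':')[0]  # date plus hour
                day = time.partition('T')[0]   # date only
                # redis-py 3.0+ argument order: zincrby(name, amount, value)
                conn.zincrby(prefix + hour, 1, log.path)
                conn.zincrby(prefix + day, 1, log.path)
                # keep hourly counters for 7 days, daily counters for 30 days
                conn.expire(prefix + hour, 7*86400)
                conn.expire(prefix + day, 30*86400)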
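The slides call `parse_line` without showing it. Purely as an assumption, here is one way it could look for web server logs in Common Log Format; the `LogEntry` tuple, the regular expression, and the log format itself are hypothetical and not taken from the talk.

```python
import re
from collections import namedtuple
from datetime import datetime

# Hypothetical record type exposing the two attributes the counters use.
LogEntry = namedtuple("LogEntry", ["timestamp", "path"])

# Assumed input shape: ... [10/Oct/2012:13:55:36 -0700] "GET /index.html HTTP/1.1" ...
_LINE = re.compile(r'\[(?P<ts>[^\]]+)\] "\S+ (?P<path>\S+)')

def parse_line(line):
    """Parse one log line (bytes or str) into a LogEntry."""
    if isinstance(line, bytes):
        line = line.decode("utf-8", "replace")
    match = _LINE.search(line)
    if match is None:
        raise ValueError("unrecognized log line: %r" % line)
    timestamp = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
    return LogEntry(timestamp, match.group("path"))
```

With a timestamp like the one in the comment, `isoformat()` starts with `2012-10-10T13:55:36`, so splitting at the first `:` gives the hour bucket `2012-10-10T13` and splitting at `T` gives the day bucket `2012-10-10`.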
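Per-hour and Per-day hit counters (with pipelines for speed)

    import redis

    def process_lines(prefix, logfile):
        # non-transactional pipeline: commands are only batched, not wrapped in MULTI/EXEC
        pipe = redis.Redis().pipeline(transaction=False)
        with open(logfile, 'rb') as lines:
            # parse_line is assumed to be defined elsewhere and to return an
            # object with .timestamp (a datetime) and .path
            for i, log in enumerate(map(parse_line, lines)):
                time = log.timestamp.isoformat()
                hour = time.partition(':')[0]  # date plus hour
                day = time.partition('T')[0]   # date only
                # redis-py 3.0+ argument order: zincrby(name, amount, value)
                pipe.zincrby(prefix + hour, 1, log.path)
                pipe.zincrby(prefix + day, 1, log.path)
                pipe.expire(prefix + hour, 7*86400)
                pipe.expire(prefix + day, 30*86400)
                if not i % 1000:
                    pipe.execute()  # flush a batch every 1000 lines
        pipe.execute()  # flush whatever is left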