Python Redis talk



This is a talk that I gave on July 20, 2012 at the Southern California Python Interest Group meetup at Cross Campus, with food and drinks provided by Graph Effect.

Published in: Technology


  1. Redis and Python by Josiah Carlson @dr_josiah
  2. Redis and Python; It's PB & J time by Josiah Carlson @dr_josiah
  3. What will be covered
     • Who am I?
     • What is Redis?
     • Why Redis with Python?
     • Cool stuff you can do by combining them
  4. Who am I?
     • A Python user for 12+ years
     • Former python-dev bike-shedder
     • Former maintainer of Python async sockets libraries
     • Author of a few small open-source projects
        o rpqueue, parse-crontab, async_http, timezone-utils, PyPE
     • Worked at some cool places you've never heard of (Networks In Motion, ...)
     • Cool places you have (Google)
     • And cool places you will (ChowNow)
     • Heavy user of Redis
     • Author of the upcoming Redis in Action
  5. What is Redis?
     • In-memory database/data structure server
        o Limited to main memory; vm and diskstore defunct
     • Persistence via snapshot or append-only file
     • Support for master/slave replication (multiple slaves and slave chaining supported)
        o No master-master, don't even try
        o Client-side sharding
        o Cluster is in progress
     • Five data structures + publish/subscribe
        o Strings, Lists, Sets, Hashes, Sorted Sets (ZSETs)
     • Server-side scripting with Lua in Redis 2.6
  6. What is Redis? (compared to other databases/caches)
     • Memcached
        o in-memory, no persistence, counters, strings, very fast, multi-threaded
     • Redis
        o in-memory, optionally persisted, data structures, very fast, server-side scripting, single-threaded
     • MongoDB
        o on-disk, speed inversely related to data integrity, bson, master/slave, sharding, multi-master, server-side mapreduce, database-level locking
     • Riak
        o on-disk, pluggable data stores, multi-master sharding, RESTful API, server-side map-reduce, (Erlang + C)
     • MySQL/PostgreSQL
        o on-disk/in-memory, pluggable data stores, master/slave, sharding, stored procedures, ...
  7. What is Redis? (Strings)
     • Really scalars of a few different types
        o Character strings
           - concatenate values to the end
           - get/set individual bits
           - get/set byte ranges
        o Integers (platform long int)
           - increment/decrement
           - auto "casting"
        o Floats (IEEE 754 FP Double)
           - increment/decrement
           - auto "casting"
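The auto "casting" mentioned on this slide can be pictured with a small pure-Python model. This is only a sketch of the semantics, not Redis or redis-py: a plain dict stands in for the server, and `store` and `incr_by` are hypothetical names introduced for illustration.

```python
# Pure-Python model of a counter that is stored as a string.
# `store` and `incr_by` are illustrative names, not part of any library.

def incr_by(store, key, amount=1):
    """Read the stored string as an integer, add `amount`, store it back as a string."""
    current = int(store.get(key, "0"))   # a missing key is treated as 0
    updated = current + amount
    store[key] = str(updated)            # the value is a string again at rest
    return updated

store = {"hits": "41"}
incr_by(store, "hits")         # the string "41" is read as the integer 41
incr_by(store, "visits", 5)    # a missing key starts from 0
```

A value that does not parse as an integer raises `ValueError` in this model; the point is only that the stored value stays a string and is interpreted as a number when incremented.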
  8. What is Redis? (Lists)
     • Doubly-linked list of character strings
        o Push/pop from both ends
        o [Blocking] pop from multiple lists
        o [Blocking] pop from one list, push on another
        o Get/set/search for item in a list
        o Sortable
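The "pop from one list, push on another" operation (the RPOPLPUSH command, with BRPOPLPUSH as its blocking form) can be modeled in-process with `collections.deque`. This is a non-blocking sketch under the assumption that two deques stand in for two Redis lists; in Redis the move happens atomically on the server. The names `pending` and `in_progress` are hypothetical.

```python
from collections import deque

def rpoplpush(source, destination):
    """Pop from the right end of `source` and push onto the left end of `destination`."""
    if not source:
        return None                  # nothing to move from an empty list
    item = source.pop()              # take from the tail of the source
    destination.appendleft(item)     # put on the head of the destination
    return item

pending = deque(["job-1", "job-2", "job-3"])
in_progress = deque()
moved = rpoplpush(pending, in_progress)
```

Moving an item to a second list instead of simply popping it means the item is still recorded somewhere while a worker is handling it.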
  9. What is Redis? (Sets)
     • Unique unordered sequence of character strings
        o Backed by a hash table
        o Add, remove, check membership, pop, random pop
        o Set intersection, union, difference
        o Sortable
  10. What is Redis? (Hashes)
      • Key-value mapping inside a key
         o Get/Set/Delete single/multiple
         o Increment values by ints/floats
         o Bulk fetch of Keys/Values/Both
         o Sort-of like a small version of Redis that only supports strings/ints/floats
  11. What is Redis? (Sorted Sets - ZSETs)
      • Like a Hash, with members and scores, scores limited to float values
         o Get, set, delete, increment
         o Can be accessed by the sorted order of the (score, member) pair
            - By score
            - By index
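The ordering rule on this slide, sorted by the (score, member) pair and readable by score or by index, can be sketched with a plain dict. This is a pure-Python model rather than Redis itself; the function names and the sample paths are hypothetical, and only non-negative indexes are handled.

```python
# Pure-Python model of a sorted set: a dict mapping member -> float score.

def ordered(zset):
    """Members sorted by the (score, member) pair; ties on score fall back to the member."""
    return [m for m, s in sorted(zset.items(), key=lambda kv: (kv[1], kv[0]))]

def range_by_index(zset, start, stop):
    """Members at positions start..stop (inclusive) in sorted order."""
    return ordered(zset)[start:stop + 1]

def range_by_score(zset, low, high):
    """Members whose score lies in [low, high], in sorted order."""
    return [m for m in ordered(zset) if low <= zset[m] <= high]

scores = {"/about": 2.0, "/index": 10.0, "/faq": 2.0}
first_two = range_by_index(scores, 0, 1)
busy = range_by_score(scores, 5, 20)
```

Here `/about` and `/faq` share a score, so their relative order is decided by comparing the member strings.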
  12. What is Redis? (Publish/Subscribe)
      • Readers subscribe to "channels" (exact strings or patterns)
      • Writers publish to channels, broadcasting to all subscribers
      • Messages are transient
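The three bullets above can be illustrated with a tiny in-process broker. This is a sketch only: `Broker` is a hypothetical class, lists stand in for subscriber connections, and `fnmatch` glob matching is used as an approximation of pattern subscriptions (exact names and patterns are matched the same way here for brevity).

```python
from fnmatch import fnmatchcase

class Broker:
    """In-process stand-in for publish/subscribe; not Redis."""

    def __init__(self):
        self.subscribers = []            # list of (pattern, inbox) pairs

    def subscribe(self, pattern):
        inbox = []                       # messages delivered to this subscriber
        self.subscribers.append((pattern, inbox))
        return inbox

    def publish(self, channel, message):
        delivered = 0
        for pattern, inbox in self.subscribers:
            if fnmatchcase(channel, pattern):
                inbox.append((channel, message))
                delivered += 1
        return delivered                 # number of subscribers reached

broker = Broker()
missed = broker.publish("news.sports", "nobody is listening yet")
exact = broker.subscribe("news.sports")
wildcard = broker.subscribe("news.*")
reached = broker.publish("news.sports", "kickoff")
broker.publish("news.tech", "release")
```

The first message is published before anyone subscribes and is never stored, which is what "transient" means on the slide: a subscriber only sees messages sent while it is listening.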
  13. Why Redis with Python?
      • The power of Python lies in:
         o Reasonably sane syntax/semantics
         o Easy manipulation of data and data structures
         o Large and growing community
      • Redis also has:
         o Reasonably sane syntax/semantics
         o Easy manipulation of data and data structures
         o Medium-sized and growing community
         o Available as remote server
            - Like a remote IPython, only for data
            - So useful, people have asked for a library version
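  14. Per-hour and Per-day hit counters

          from itertools import imap  # Python 2; lazy version of map()
          import redis

          def process_lines(prefix, logfile):
              conn = redis.Redis()
              # parse_line is assumed to return an object with .timestamp and .path
              for log in imap(parse_line, open(logfile, 'rb')):
                  time = log.timestamp.isoformat()
                  hour = time.partition(':')[0]   # e.g. 2012-07-20T19
                  day = time.partition('T')[0]    # e.g. 2012-07-20
                  # redis-py 2.x argument order: zincrby(name, value, amount=1)
                  conn.zincrby(prefix + hour, log.path)
                  conn.zincrby(prefix + day, log.path)
                  conn.expire(prefix + hour, 7*86400)   # hourly counters live 7 days
                  conn.expire(prefix + day, 30*86400)   # daily counters live 30 days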
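The key naming used by the counters on this slide can be checked in isolation, without a Redis server. The sketch below pulls the two `partition` calls into a helper; `counter_keys` and the `hits:` prefix are hypothetical names used only for illustration.

```python
from datetime import datetime

def counter_keys(prefix, timestamp):
    """Return the (per-hour, per-day) key names for one log timestamp."""
    iso = timestamp.isoformat()        # e.g. 2012-07-20T19:30:05
    hour = iso.partition(":")[0]       # everything before the first colon
    day = iso.partition("T")[0]        # everything before the date/time separator
    return prefix + hour, prefix + day

keys = counter_keys("hits:", datetime(2012, 7, 20, 19, 30, 5))
# keys == ("hits:2012-07-20T19", "hits:2012-07-20")
```

Every hit in the same hour lands in the same sorted set, with the request path as the member and the hit count as its score.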
  15. Per-hour and Per-day hit counters (with pipelines for speed)

          from itertools import imap  # Python 2; lazy version of map()
          import redis

          def process_lines(prefix, logfile):
              # pipeline(False): non-transactional, commands are only batched
              pipe = redis.Redis().pipeline(False)
              for i, log in enumerate(imap(parse_line, open(logfile, 'rb'))):
                  time = log.timestamp.isoformat()
                  hour = time.partition(':')[0]
                  day = time.partition('T')[0]
                  pipe.zincrby(prefix + hour, log.path)
                  pipe.zincrby(prefix + day, log.path)
                  pipe.expire(prefix + hour, 7*86400)
                  pipe.expire(prefix + day, 30*86400)
                  if not i % 1000:
                      pipe.execute()   # send the queued commands in one round trip
              pipe.execute()           # flush whatever is left
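The benefit of pipelining is fewer network round trips. The sketch below counts them with a recording stand-in; `RecordingPipeline` and `count_paths` are hypothetical names, not redis-py classes, and nothing is sent anywhere. Unlike the slide, this version counts items from 1, so the first flush happens after a full batch rather than after the first line.

```python
class RecordingPipeline:
    """Stand-in that queues commands and counts flushes; not redis-py."""

    def __init__(self):
        self.queued = []        # commands waiting to be sent
        self.round_trips = 0    # number of non-empty flushes
        self.sent = 0           # total commands flushed

    def zincrby(self, *args):
        self.queued.append(("ZINCRBY",) + args)

    def execute(self):
        if self.queued:
            self.round_trips += 1
            self.sent += len(self.queued)
        self.queued = []

def count_paths(pipe, key, paths, batch_size=1000):
    """Queue one increment per path and flush every `batch_size` commands."""
    for i, path in enumerate(paths, 1):
        pipe.zincrby(key, path)
        if i % batch_size == 0:
            pipe.execute()
    pipe.execute()              # flush the final partial batch

pipe = RecordingPipeline()
count_paths(pipe, "hits:2012-07-20", ["/index"] * 2500)
```

With 2,500 commands and a batch size of 1,000, the stand-in records three flushes instead of 2,500 individual requests.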
  16. Simple task queue - add/run items

          import json
          import redis

          def add_item(queue, name, *args, **kwargs):
              redis.Redis().rpush(queue,
                  json.dumps([name, args, kwargs]))

          def execute_one(queues):
              # blpop returns a (queue, item) pair, or None after the 30-second timeout
              item = redis.Redis().blpop(queues, 30)
              if item is None:
                  return
              name, args, kwargs = json.loads(item[1])
              REGISTRY[name](*args, **kwargs)
  17. Simple task queue - register tasks

          REGISTRY = {}

          def task(queue):
              def wrapper(function):
                  def defer(*args, **kwargs):
                      add_item(queue, name, *args, **kwargs)
                  name = function.__name__
                  if name in REGISTRY:
                      raise Exception(
                          "Duplicate callback %s"%(name,))
                  REGISTRY[name] = function
                  return defer
              # Used as @task('high'): queue is a name, so return the decorator
              if isinstance(queue, str):
                  return wrapper
              # Used as a bare @task: the argument is the function itself
              function, queue = queue, 'default'
              return wrapper(function)
  18. Simple task queue - register tasks

          @task('high')
          def do_something(arg):
              pass

          @task
          def do_something_else(arg):
              pass
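To see the task queue slides working end to end without a server, the sketch below swaps Redis lists for in-memory deques. The decorator logic follows the slides; the `QUEUES` dict, the `RESULTS` list and the two sample tasks are hypothetical additions, and `execute_one` here does not block.

```python
import json
from collections import deque

QUEUES = {}      # queue name -> deque of JSON strings (stand-in for Redis lists)
REGISTRY = {}    # task name -> original function

def add_item(queue, name, *args, **kwargs):
    QUEUES.setdefault(queue, deque()).append(json.dumps([name, args, kwargs]))

def execute_one(queues):
    """Run the first waiting item, checking the queues in the order given."""
    for queue in queues:
        pending = QUEUES.get(queue)
        if pending:
            name, args, kwargs = json.loads(pending.popleft())
            return REGISTRY[name](*args, **kwargs)
    return None

def task(queue):
    def wrapper(function):
        name = function.__name__
        if name in REGISTRY:
            raise Exception("Duplicate callback %s" % (name,))
        REGISTRY[name] = function
        def defer(*args, **kwargs):
            add_item(queue, name, *args, **kwargs)
        return defer
    if isinstance(queue, str):          # used as @task('high')
        return wrapper
    function, queue = queue, "default"  # used as a bare @task
    return wrapper(function)

RESULTS = []

@task("high")
def record(value):
    RESULTS.append(value)

@task
def record_default(value):
    RESULTS.append(("default", value))

record_default("b")                  # enqueued on "default", not run yet
record("a")                          # enqueued on "high"
execute_one(["high", "default"])     # "high" is checked first
execute_one(["high", "default"])
```

Calling a decorated function only enqueues a JSON description of the call; the work happens when a worker calls `execute_one`, and listing `high` first gives that queue priority.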
  19. Cool stuff to do...
      • Reddit
      • Caching
      • Cookies
      • Analytics
      • Configuration management
      • Autocomplete
      • Distributed locks
      • Counting Semaphores
      • Task queues
      • Publish/Subscribe
      • Messaging
      • Search engines
      • Ad targeting
      • Twitter
      • Chat rooms
      • Job search
      • ...
  20. Thank you. Questions?