Ruby on Redis

Making an application horizontally scalable in 30 minutes. This presentation describes how a linear processing application (mail merge) can be converted into a horizontally scalable one using Redis, and provides some context on why a multi-process approach is preferable to a multi-threaded approach.


Ruby on Redis: Presentation Transcript

  • Ruby on Redis. Pascal Weemaels, Koen Handekyn. Oct 2013
  • Target: create a ZIP file of PDFs based on a CSV data file
    ‣ Linear version
    ‣ Making it scale with Redis
    (pipeline: parse csv -> create pdf x N -> zip)
  • Step 1 (linear): parse CSV
    ‣ std lib: require 'csv'
    ‣ docs = CSV.read("#{DATA}.csv")
  • Simple templating with string interpolation
    ‣ Merge data into HTML:

        template = File.new('invoice.html').read
        html = eval("<<QQQ\n#{template}\nQQQ")

    ‣ invoice.html:

        <div class="title">INVOICE #{invoice_nr}</div>
        <div class="address">
          #{name}<br/>
          #{street}<br/>
          #{zip} #{city}<br/>
        </div>
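The eval-a-heredoc trick runs arbitrary Ruby embedded in the template. As an aside (not part of the talk), the same merge can be done with the stdlib ERB class; the template text and sample values below are made up for illustration:

```ruby
require 'erb'

# Hypothetical invoice template; <%= %> plays the role of #{} interpolation.
template = ERB.new(<<~HTML)
  <div class="title">INVOICE <%= invoice_nr %></div>
  <div class="address"><%= name %><br/><%= street %><br/><%= zip %> <%= city %></div>
HTML

# Sample data standing in for one CSV line.
invoice_nr, name, street, zip, city = 42, "Pascal", "Main St 1", "9820", "Merelbeke"

html = template.result(binding)
puts html
```

Unlike eval on raw heredocs, ERB does not execute whatever Ruby happens to sit in the data being merged.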
  • Step 1 (linear): create PDF
    ‣ Prince XML via the princely gem (http://www.princexml.com)

        p = Princely.new
        p.add_style_sheets('invoice.css')
        p.pdf_from_string(html)
  • Step 1 (linear): create ZIP

        Zip::ZipOutputStream.open(zipfile_name) do |zos|
          files.each do |file, content|
            zos.new_entry(file)
            zos.puts content
          end
        end
  • Full Code (linear version)

        require 'csv'
        require 'princely'
        require 'zip/zip'

        DATA_FILE = ARGV[0]
        DATA_FILE_BASE_NAME = File.basename(DATA_FILE, ".csv")

        # create a pdf document from a csv line
        def create_pdf(invoice_nr, name, street, zip, city)
          template = File.new('../resources/invoice.html').read
          html = eval("<<WTFMF\n#{template}\nWTFMF")
          p = Princely.new
          p.add_style_sheets('../resources/invoice.css')
          p.pdf_from_string(html)
        end

        # zip files from hash
        def create_zip(files_h)
          zipfile_name = "../out/#{DATA_FILE_BASE_NAME}.#{Time.now.to_s}.zip"
          Zip::ZipOutputStream.open(zipfile_name) do |zos|
            files_h.each do |name, content|
              zos.put_next_entry "#{name}.pdf"
              zos.puts content
            end
          end
          zipfile_name
        end

        # load data from csv
        docs = CSV.read(DATA_FILE) # array of arrays

        # create a pdf for each line in the csv and put it in a hash
        files_h = docs.inject({}) do |files_h, doc|
          files_h[doc[0]] = create_pdf(*doc)
          files_h
        end

        # zip all pdf's from the hash
        create_zip files_h
  • DEMO
  • Step 2: from linear... (parse csv -> create pdf x N -> zip)
  • Step 2: ...to parallel. Threads? (parse csv -> create pdf in parallel -> zip)
  • Multi Threaded
    ‣ Advantage
      • Lightweight (minimal overhead)
    ‣ Challenges (or why it is hard)
      • Hard to code: most data structures are not thread safe by default; they need synchronized access
      • Hard to test: different execution paths and timings
      • Hard to maintain
    ‣ Limitation
      • Single machine: not a solution for horizontal scalability beyond the multi-core CPU
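As a quick illustration of the "not thread safe by default" point (my sketch, not from the slides), even a plain counter incremented from several threads needs a Mutex, because `counter += 1` is a read-modify-write sequence:

```ruby
# Sketch: a shared counter updated by many threads.
# Without the Mutex, increments can be lost on Ruby implementations
# without a global interpreter lock; the lock makes it safe everywhere.
counter = 0
lock = Mutex.new

threads = 8.times.map do
  Thread.new do
    1000.times { lock.synchronize { counter += 1 } }
  end
end
threads.each(&:join)

puts counter # 8000
```

Every shared structure in a threaded design needs this kind of care, which is exactly the "hard to code" cost listed above.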
  • Step 2: ...to parallel (parse csv -> ? -> create pdf x N -> zip)
  • Multi Process
    ‣ Scales across machines
    ‣ Advanced support for debugging and monitoring at the OS level
    ‣ Simpler (code, testing, debugging, ...)
    ‣ BUT: slightly more overhead
  • But all this assumes "shared state across processes": File System? SQL? MemCached? Terra Cotta? ... or something else.
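To see why shared state needs an external store at all (my sketch, not from the slides): a forked child gets a copy of the parent's memory, so its writes never reach the parent:

```ruby
# Sketch: process memory is copied on fork, not shared.
counter = 0

pid = fork do
  counter += 1   # increments the child's private copy only
  exit!(0)       # exit the child without running at_exit handlers
end
Process.wait(pid)

puts counter # still 0: the parent's copy was never touched
```

Any value that must be visible to all workers therefore has to live outside the processes, which is where Redis comes in.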
  • Hello Redis
    ‣ Shared memory key-value store with high-level data structure support
      • String (String, Int, Float)
      • Hash (Map, Dictionary)
      • List (Queue)
      • Set
      • ZSet (ordered by member or score)
  • About Redis
    ‣ Single threaded: one thread to serve them all
    ‣ Everything (must fit) in memory
    ‣ "Transactions" (MULTI/EXEC)
    ‣ Expiring keys
    ‣ Lua scripting
    ‣ Publish-subscribe
    ‣ Auto create and destroy
    ‣ Pipelining
    ‣ But... full clustering (master-master) is not available (yet)
  • Hello Redis: redis-cli
    ‣ set name "pascal" => "pascal"
    ‣ incr counter => 1
    ‣ incr counter => 2
    ‣ hset pascal name "pascal"
    ‣ hset pascal address "merelbeke"
    ‣ sadd persons pascal
    ‣ smembers persons => [pascal]
    ‣ keys *
    ‣ type pascal => hash
    ‣ lpush todo "read" => 1
    ‣ lpush todo "eat" => 2
    ‣ lpop todo => "eat"
    ‣ rpoplpush todo done => "read"
    ‣ lrange done 0 -1 => "read"
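The list commands above are what later turns a Redis list into a work queue: LPUSH adds on the left, LPOP/RPOP (or the blocking BRPOP) take from an end. A Redis-free analogy with a plain Ruby Array (my illustration, not from the talk):

```ruby
# Sketch: Redis list semantics modeled with a Ruby Array.
queue = []

queue.unshift("read")  # lpush todo "read" -> ["read"]
queue.unshift("eat")   # lpush todo "eat"  -> ["eat", "read"]

left  = queue.shift    # lpop  -> "eat"  (left end)
right = queue.pop      # rpop  -> "read" (right end)
puts left, right
```

Pushing on one end and popping from the other gives first-in-first-out behavior, which is all a work queue needs.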
  • Let Redis distribute (parse csv -> create-pdf worker processes -> zip)
  • Spread the work (1): parse csv, push the data onto a queue, keep a counter; create-pdf worker processes consume; zip at the end
  • Ruby on Redis
    ‣ Put PDF-creation input data on a queue and do the counter bookkeeping:

        docs.each do |doc|
          data = YAML::dump(doc)
          r.lpush 'pdf:queue', data
          r.incr "ctr" # bookkeeping
        end
  • Create PDFs (2): workers consume the queue with data and fill a hash with pdfs
  • Ruby on Redis
    ‣ Read PDF input data from the queue, do the counter bookkeeping, put each created PDF in a Redis hash, and signal when ready:

        while (true)
          _, msg = r.brpop 'pdf:queue'
          doc = YAML::load(msg)
          # name of hash, key = docname, value = pdf
          r.hset('pdf:pdfs', doc[0], create_pdf(*doc))
          ctr = r.decr 'ctr'
          r.rpush "ready", "done" if ctr == 0
        end
  • Zip when done (3): wait on the ready signal, then zip the hash with pdfs
  • Ruby on Redis
    ‣ Wait for the ready signal, fetch all PDFs, and zip them:

        r.brpop "ready"             # wait for signal
        pdfs = r.hgetall 'pdf:pdfs' # fetch hash
        create_zip pdfs             # zip it
  • More parallelism: several input files in flight at once, each with its own counter, ready signal, and hash of pdfs
  • Ruby on Redis
    ‣ Put PDF-creation input data on a queue, tagged with a UUID per input file, and do the counter bookkeeping:

        # unique id for this input file
        UUID = SecureRandom.uuid
        docs.each do |doc|
          data = YAML::dump([UUID, doc])
          r.lpush 'pdf:queue', data
          r.incr "ctr:#{UUID}" # bookkeeping
        end
  • Ruby on Redis
    ‣ Read PDF input data from the queue, do the counter bookkeeping, and put each created PDF in a Redis hash keyed by the UUID:

        while (true)
          _, msg = r.brpop 'pdf:queue'
          uuid, doc = YAML::load(msg)
          r.hset(uuid, doc[0], create_pdf(*doc))
          ctr = r.decr "ctr:#{uuid}"
          r.rpush "ready:#{uuid}", "done" if ctr == 0
        end
  • Ruby on Redis
    ‣ Wait for the ready signal, fetch all PDFs, and zip them:

        r.brpop "ready:#{UUID}" # wait for signal
        pdfs = r.hgetall(UUID)  # fetch hash
        create_zip(pdfs)        # zip it
  • Full Code (distributed version). The LINEAR column repeats the full code shown earlier, unchanged. Key functions (create_pdf and create_zip) remain unchanged; only the distribution code is new.

    MAIN:

        require 'csv'
        require 'zip/zip'
        require 'redis'
        require 'yaml'
        require 'securerandom'

        # zip files from hash
        def create_zip(files_h)
          zipfile_name = "../out/#{DATA_FILE_BASE_NAME}.#{Time.now.to_s}.zip"
          Zip::ZipOutputStream.open(zipfile_name) do |zos|
            files_h.each do |name, content|
              zos.put_next_entry "#{name}.pdf"
              zos.puts content
            end
          end
          zipfile_name
        end

        DATA_FILE = ARGV[0]
        DATA_FILE_BASE_NAME = File.basename(DATA_FILE, ".csv")
        UUID = SecureRandom.uuid

        r = Redis.new
        my_counter = "ctr:#{UUID}"

        # load data from csv
        docs = CSV.read(DATA_FILE) # array of arrays

        docs.each do |doc| # distribute!
          r.lpush 'pdf:queue', YAML::dump([UUID, doc])
          r.incr my_counter
        end

        r.brpop "ready:#{UUID}" # collect!
        create_zip(r.hgetall(UUID))

        # clean up
        r.del my_counter
        r.del UUID
        puts "All done!"

    WORKER:

        require 'redis'
        require 'princely'
        require 'yaml'

        # create a pdf document from a csv line
        def create_pdf(invoice_nr, name, street, zip, city)
          template = File.new('../resources/invoice.html').read
          html = eval("<<WTFMF\n#{template}\nWTFMF")
          p = Princely.new
          p.add_style_sheets('../resources/invoice.css')
          p.pdf_from_string(html)
        end

        r = Redis.new
        while (true)
          _, msg = r.brpop 'pdf:queue'
          uuid, doc = YAML::load(msg)
          r.hset(uuid, doc[0], create_pdf(*doc))
          ctr = r.decr "ctr:#{uuid}"
          r.rpush "ready:#{uuid}", "done" if ctr == 0
        end
  • DEMO 2
  • Multi-language participants: any producer or worker with a Redis client can join the same queue, counters, and hashes of pdfs
  • Conclusions
    ‣ From linear to multi-process distributed is easy with Redis
    ‣ Shared memory with high-level data structures
    ‣ Atomic counter for bookkeeping
    ‣ Queue for work distribution
    ‣ Queue as signal
    ‣ Hash for result sets
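The queue/counter/signal pattern from the conclusions can be simulated inside one process with stdlib stand-ins (a sketch only, not from the talk: Thread::Queue plays the Redis list, a Mutex-guarded integer plays INCR/DECR):

```ruby
# Sketch of the distribution pattern with stdlib stand-ins.
jobs    = Queue.new   # "pdf:queue" (LPUSH/BRPOP)
ready   = Queue.new   # "ready" signal list
results = {}          # "pdf:pdfs" hash
lock    = Mutex.new
counter = 0           # "ctr" (INCR/DECR)

docs = [["1", "Alice"], ["2", "Bob"], ["3", "Carol"]]

# producer: enqueue work and count it (lpush + incr)
docs.each { |doc| lock.synchronize { counter += 1 }; jobs << doc }

# workers: consume, store the "pdf", decrement, signal when the counter hits 0
workers = 2.times.map do
  Thread.new do
    while (doc = (jobs.pop(true) rescue nil))
      pdf = "pdf-for-#{doc[1]}" # stands in for create_pdf
      remaining = lock.synchronize { results[doc[0]] = pdf; counter -= 1 }
      ready << :done if remaining == 0
    end
  end
end
workers.each(&:join)

ready.pop # brpop "ready": block until the last worker signals
puts results.keys.sort.join(",") # 1,2,3
```

Swapping these stand-ins for Redis keys is what lets the same pattern span many processes and machines.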