Ruby on Redis	

Pascal Weemaels	

Koen Handekyn	

Oct 2013
Target

Create a Zip file of PDFs
based on a CSV data file

‣  Linear version

‣  Making it scale with Redis


[flow diagram: parse csv → create pdf (× N) → zip]
Step 1: linear

‣  Parse CSV

•  std lib: require 'csv'

•  docs = CSV.read("#{DATA}.csv")
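
A minimal sketch of what that gives you, assuming a hypothetical invoices.csv whose rows carry invoice_nr, name, street, zip, city (the order create_pdf expects later):

require 'csv'

# hypothetical invoices.csv, one invoice per row:
#   2013-001,Pascal,Main Street 1,9820,Merelbeke
#   2013-002,Koen,Station Road 5,9000,Gent
docs = CSV.read('invoices.csv')   # => array of arrays, one per line
docs.each do |invoice_nr, name, street, zip, city|
  puts "#{invoice_nr}: #{name}, #{street}, #{zip} #{city}"
end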
Simple Templating with String Interpolation

‣  invoice.html

<<Q
<div class="title">
INVOICE #{invoice_nr}
</div>
<div class="address">
#{name}</br>
#{street}</br>
#{zip} #{city}</br>
</div>
Q

‣  Merge data into HTML

•  template = File.new('invoice.html').read

•  html = eval("<<QQQ\n#{template}\nQQQ")
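
A self-contained sketch of the trick with hypothetical values (QQQ is just an arbitrary heredoc delimiter): eval wraps the template text in a heredoc, so Ruby's normal #{...} interpolation fills in the placeholders from the local variables in scope.

invoice_nr = '2013-001'   # hypothetical values
name, street, zip, city = 'Pascal', 'Main Street 1', '9820', 'Merelbeke'

# single-quoted, so nothing is interpolated yet
template = '<div class="title">INVOICE #{invoice_nr}</div>'

# wrap the template in a heredoc and eval it: interpolation happens now
html = eval("<<QQQ\n#{template}\nQQQ")
puts html   # => <div class="title">INVOICE 2013-001</div>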
Step 1: linear

‣  Create PDF

•  Prince XML via the princely gem

•  http://www.princexml.com

•  p = Princely.new
   p.add_style_sheets('invoice.css')
   p.pdf_from_string(html)
Step 1: linear

‣  Create ZIP

•  Zip::ZipOutputStream.open(zipfile_name) do |zos|
     files.each do |file, content|
       zos.put_next_entry(file)
       zos.puts content
     end
   end
Full Code

require 'csv'
require 'princely'
require 'zip/zip'

DATA_FILE = ARGV[0]
DATA_FILE_BASE_NAME = File.basename(DATA_FILE, ".csv")

# create a pdf document from a csv line
def create_pdf(invoice_nr, name, street, zip, city)
  template = File.new('../resources/invoice.html').read
  html = eval("<<WTFMF\n#{template}\nWTFMF")
  p = Princely.new
  p.add_style_sheets('../resources/invoice.css')
  p.pdf_from_string(html)
end

# zip files from hash
def create_zip(files_h)
  zipfile_name = "../out/#{DATA_FILE_BASE_NAME}.#{Time.now.to_s}.zip"
  Zip::ZipOutputStream.open(zipfile_name) do |zos|
    files_h.each do |name, content|
      zos.put_next_entry "#{name}.pdf"
      zos.puts content
    end
  end
  zipfile_name
end

# load data from csv
docs = CSV.read(DATA_FILE) # array of arrays

# create a pdf for each line in the csv
# and put it in a hash
files_h = docs.inject({}) do |files_h, doc|
  files_h[doc[0]] = create_pdf(*doc)
  files_h
end

# zip all pdf's from the hash
create_zip files_h

DEMO
Step 2: from linear ...

[flow diagram: parse csv → create pdf (× N) → zip]
Step 2: ... to parallel

[diagram: parse csv → several create pdf steps running in parallel → zip; candidate mechanism: Threads?]
Multi Threaded

‣  Advantage

•  Lightweight (minimal overhead)

‣  Challenges (or why it is hard)

•  Hard to code: most data structures are not thread safe by default; they
   need synchronized access (see the sketch after this list)

•  Hard to test: different execution paths, timings

•  Hard to maintain

‣  Limitation

•  Single machine - not a solution for horizontal scalability
   beyond the multi-core CPU
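
A minimal sketch of the thread-based variant, assuming create_pdf, create_zip and docs from the linear version; the shared hash needs a Mutex because Ruby's Hash is not safe for concurrent writes:

require 'thread'

files_h = {}
mutex   = Mutex.new

threads = docs.map do |doc|
  Thread.new(doc) do |d|
    pdf = create_pdf(*d)                       # heavy work per invoice
    mutex.synchronize { files_h[d[0]] = pdf }  # synchronized access to the shared hash
  end
end
threads.each(&:join)                           # wait for all PDFs

create_zip files_h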
Step 2: ... to parallel

[diagram: parse csv → several create pdf steps running in parallel → zip; candidate mechanism: ?]
Multi Process

•  Scales across machines

•  Advanced support for debugging and monitoring at the OS level

•  Simpler (code, testing, debugging, ...)

•  Slightly more overhead

BUT
But

all this assumes
"shared state across processes"

[diagram: parse csv → create pdf (× N) → zip, all reading and writing shared state]

Candidates for the shared state: Memcached? SQL? File System? Terracotta? ... OR ...
Hello Redis	

‣  Shared Memory Key Value Store with
High Level Data Structure support 	

•  String (String, Int, Float)	

•  Hash (Map, Dictionary) 	

•  List (Queue) 	

•  Set 	

•  ZSet (ordered by member or score)
About Redis

•  Single threaded: 1 thread to serve them all

•  (fit) Everything in memory

•  "Transactions" (MULTI/EXEC)

•  Expiring keys

•  Lua scripting

•  Publisher-Subscriber

•  Auto Create and Destroy

•  Pipelining

•  But ... full clustering (master-master) is not available (yet)
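
A small sketch of a few of these features through the redis-rb gem (assuming a Redis server on the default localhost:6379):

require 'redis'

r = Redis.new

# expiring keys
r.set 'session:42', 'data'
r.expire 'session:42', 60          # key disappears after 60 seconds

# "transaction": MULTI ... EXEC, commands applied atomically
r.multi do |tx|
  tx.set  'name', 'pascal'
  tx.incr 'counter'
end

# pipelining: batch commands into one round trip
r.pipelined do |pipe|
  10.times { |i| pipe.lpush 'todo', "task #{i}" }
end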
Hello Redis

‣  redis-cli

•  set name "pascal"          = OK
•  incr counter               = 1
•  incr counter               = 2
•  hset pascal name "pascal"
•  hset pascal address "merelbeke"
•  sadd persons pascal
•  smembers persons           = [pascal]
•  keys *
•  type pascal                = hash
•  lpush todo "read"          = 1
•  lpush todo "eat"           = 2
•  lpop todo                  = "eat"
•  rpoplpush todo done        = "read"
•  lrange done 0 -1           = ["read"]
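
The same session through redis-rb, the client used in the rest of the deck (a sketch, assuming a local Redis):

require 'redis'
r = Redis.new

r.set 'name', 'pascal'              # => "OK"
r.incr 'counter'                    # => 1
r.incr 'counter'                    # => 2
r.hset 'pascal', 'name', 'pascal'
r.hset 'pascal', 'address', 'merelbeke'
r.sadd 'persons', 'pascal'
r.smembers 'persons'                # => ["pascal"]
r.type 'pascal'                     # => "hash"
r.lpush 'todo', 'read'              # => 1
r.lpush 'todo', 'eat'               # => 2
r.lpop 'todo'                       # => "eat"
r.rpoplpush 'todo', 'done'          # => "read"
r.lrange 'done', 0, -1              # => ["read"]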
Let Redis Distribute

[diagram: a "parse csv" process hands work to several "create pdf" processes through Redis; a "zip" process collects the results]
Spread the Work

[diagram, step 1: the "parse csv" process pushes each document onto a Redis queue with data and increments a counter; the "create pdf" processes consume from that queue, the "zip" process waits]
Ruby on Redis

‣  Put the PDF-create input data on a queue and do the counter bookkeeping

docs.each do |doc|
  data = YAML::dump(doc)
  r.lpush 'pdf:queue', data
  r.incr 'ctr'          # bookkeeping
end
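
For context, the snippet assumes the setup below (a sketch; the complete MAIN script appears in the Full Code slide later):

require 'csv'
require 'yaml'
require 'redis'

r    = Redis.new            # assumes a Redis server on localhost:6379
docs = CSV.read(ARGV[0])    # same array of arrays as in the linear version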
Create PDF's

[diagram, step 2: each "create pdf" process pops from the queue with data, stores the generated PDF in a hash with pdfs, and decrements the counter]
Ruby on Redis

‣  Read PDF input data from the queue, do the counter bookkeeping,
   put each created PDF in a Redis hash, and signal when ready

while (true)
  _, msg = r.brpop 'pdf:queue'
  doc = YAML::load(msg)
  # name of hash, key = docname, value = pdf
  r.hset('pdf:pdfs', doc[0], create_pdf(*doc))
  ctr = r.decr 'ctr'
  r.rpush 'ready', 'done' if ctr == 0   # DECR is atomic: exactly one worker sees 0
end
Zip When Done

[diagram, step 3: when the counter reaches zero, a "ready" signal is pushed; the "zip" process wakes up and reads the hash with pdfs]
Ruby on Redis

‣  Wait for the ready signal,
   fetch all PDFs,
   and zip them

r.brpop 'ready'               # wait for signal
pdfs = r.hgetall 'pdf:pdfs'   # fetch hash
create_zip pdfs               # zip it
More Parallelism

[diagram: several "parse csv" / "zip" jobs share one queue with data and one pool of "create pdf" workers; each job gets its own counter, its own hash with pdfs, and its own ready queue]
Ruby on Redis

‣  Put the PDF-create input data on a queue and do the counter bookkeeping

# unique id for this input file
UUID = SecureRandom.uuid
docs.each do |doc|
  data = YAML::dump([UUID, doc])
  r.lpush 'pdf:queue', data
  r.incr "ctr:#{UUID}"      # bookkeeping
end
Ruby on Redis

‣  Read PDF input data from the queue, do the counter bookkeeping,
   and put each created PDF in a Redis hash

while (true)
  _, msg = r.brpop 'pdf:queue'
  uuid, doc = YAML::load(msg)
  r.hset(uuid, doc[0], create_pdf(*doc))
  ctr = r.decr "ctr:#{uuid}"
  r.rpush "ready:#{uuid}", 'done' if ctr == 0
end
Ruby on Redis

‣  Wait for the ready signal,
   fetch all PDFs,
   and zip them

r.brpop "ready:#{UUID}"       # wait for signal
pdfs = r.hgetall(UUID)        # fetch this job's hash
create_zip(pdfs)              # zip it
Full Code

require 'csv'
require 'princely'
require 'zip/zip'

DATA_FILE = ARGV[0]
DATA_FILE_BASE_NAME = File.basename(DATA_FILE, ".csv")

# create a pdf document from a csv line
def create_pdf(invoice_nr, name, street, zip, city)
  template = File.new('../resources/invoice.html').read
  html = eval("<<WTFMF\n#{template}\nWTFMF")
  p = Princely.new
  p.add_style_sheets('../resources/invoice.css')
  p.pdf_from_string(html)
end

# zip files from hash
def create_zip(files_h)
  zipfile_name = "../out/#{DATA_FILE_BASE_NAME}.#{Time.now.to_s}.zip"
  Zip::ZipOutputStream.open(zipfile_name) do |zos|
    files_h.each do |name, content|
      zos.put_next_entry "#{name}.pdf"
      zos.puts content
    end
  end
  zipfile_name
end

# load data from csv
docs = CSV.read(DATA_FILE) # array of arrays

# create a pdf for each line in the csv
# and put it in a hash
files_h = docs.inject({}) do |files_h, doc|
  files_h[doc[0]] = create_pdf(*doc)
  files_h
end

# zip all pdf's from the hash
create_zip files_h

LINEAR


require 'csv'
require 'zip/zip'
require 'redis'
require 'yaml'
require 'securerandom'

# zip files from hash
def create_zip(files_h)
  zipfile_name = "../out/#{DATA_FILE_BASE_NAME}.#{Time.now.to_s}.zip"
  Zip::ZipOutputStream.open(zipfile_name) do |zos|
    files_h.each do |name, content|
      zos.put_next_entry "#{name}.pdf"
      zos.puts content
    end
  end
  zipfile_name
end

DATA_FILE = ARGV[0]
DATA_FILE_BASE_NAME = File.basename(DATA_FILE, ".csv")
UUID = SecureRandom.uuid

r = Redis.new
my_counter = "ctr:#{UUID}"

# load data from csv
docs = CSV.read(DATA_FILE) # array of arrays

docs.each do |doc| # distribute!
  r.lpush 'pdf:queue', YAML::dump([UUID, doc])
  r.incr my_counter
end

r.brpop "ready:#{UUID}" # collect!
create_zip(r.hgetall(UUID))

# clean up
r.del my_counter
r.del UUID
puts "All done!"

MAIN


require 'redis'
require 'princely'
require 'yaml'

# create a pdf document from a csv line
def create_pdf(invoice_nr, name, street, zip, city)
  template = File.new('../resources/invoice.html').read
  html = eval("<<WTFMF\n#{template}\nWTFMF")
  p = Princely.new
  p.add_style_sheets('../resources/invoice.css')
  p.pdf_from_string(html)
end

r = Redis.new
while (true)
  _, msg = r.brpop 'pdf:queue'
  uuid, doc = YAML::load(msg)
  r.hset(uuid, doc[0], create_pdf(*doc))
  ctr = r.decr "ctr:#{uuid}"
  r.rpush "ready:#{uuid}", 'done' if ctr == 0
end

WORKER


Key functions (create_pdf and create_zip) remain unchanged;
only the distribution code is new.

DEMO 2
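
A hedged sketch of how the pieces could be started together; worker.rb and main.rb are hypothetical file names for the WORKER and MAIN scripts above:

# launch.rb - hypothetical launcher: start a few workers, then run one job
WORKERS = 3

worker_pids = WORKERS.times.map { Process.spawn('ruby', 'worker.rb') }

# each main.rb run is an independent job with its own UUID,
# so several can run concurrently against the same worker pool
system('ruby', 'main.rb', 'invoices.csv')

worker_pids.each { |pid| Process.kill('TERM', pid) }  # workers loop forever; stop them
Process.waitall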
Multi Language Participants

[diagram: the same queue with data, counters, and hashes with pdfs, with "parse csv" / "zip" and "create pdf" participants that need not all be written in the same language]
Conclusions

From Linear To Multi Process Distributed
is easy with
Redis Shared Memory High Level Data Structures

Atomic Counter for bookkeeping
Queue for work distribution
Queue as Signal
Hash for result sets
