Ruby on Redis	

Pascal Weemaels	

Koen Handekyn	

Oct 2013
Target	

Create a Zip file of PDFs
based on a CSV data file

‣  Linear version	

‣  Making it scale with Redis	


[Diagram: parse csv → create pdf, create pdf, ..., create pdf → zip]

Step 1: linear 	

‣  Parse CSV
•  std lib: require 'csv'
•  docs = CSV.read("#{DATA}.csv")
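
A minimal sketch of the shape `CSV.read` returns, using `CSV.parse` on an inline string so it runs without a data file. The sample rows are made up, and the column order (invoice number, name, street, zip, city) is an assumption based on the `create_pdf` signature used later in the deck.

```ruby
require 'csv'

# Hypothetical sample data; the deck itself reads "#{DATA}.csv" from disk.
SAMPLE_ROWS = <<~ROWS
  1001,Pascal,Main Street 1,9820,Merelbeke
  1002,Koen,Station Road 2,9000,Gent
ROWS

# CSV.parse yields the same shape as CSV.read: an array of arrays of strings.
docs = CSV.parse(SAMPLE_ROWS)

docs.each do |invoice_nr, name, street, zip, city|
  puts "#{invoice_nr}: #{name}, #{street}, #{zip} #{city}"
end
```

Every field comes back as a string, which is why the later slides can pass a row straight to `create_pdf(*doc)`.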
Simple Templating with String Interpolation	

‣  Merge data into HTML
•  template = File.new('invoice.html').read
•  html = eval("<<QQQ\n#{template}\nQQQ")

invoice.html

<<Q
<div class="title">
INVOICE #{invoice_nr}
</div>
<div class="address">
#{name}</br>
#{street}</br>
#{zip} #{city}</br>
</div>
Q
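
The heredoc trick above can be tried without any files. This is a sketch under assumptions: the inline `TEMPLATE` stands in for invoice.html, and `render` is a hypothetical helper name, not something from the deck.

```ruby
# Hypothetical inline template standing in for invoice.html.
# Single quotes keep #{...} literal until render time.
TEMPLATE = [
  '<div class="title">INVOICE #{invoice_nr}</div>',
  '<div class="address">#{name}, #{city}</div>'
].join("\n")

def render(template, invoice_nr, name, city)
  # Wrap the template text in a heredoc and evaluate it; the heredoc body
  # is interpolated against this method's local variables.
  eval("<<QQQ\n#{template}\nQQQ\n")
end

html = render(TEMPLATE, 1001, 'Pascal', 'Merelbeke')
puts html
```

Because `eval` runs whatever Ruby the template contains, this approach only suits templates you control. The standard library's ERB is a common alternative when that is a concern.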
Step 1: linear 	

‣  Create PDF
•  Prince XML using the princely gem
•  http://www.princexml.com
•  p = Princely.new
   p.add_style_sheets('invoice.css')
   p.pdf_from_string(html)
Step 1: linear	

‣  Create ZIP
•  Zip::ZipOutputStream.open(zipfile_name) do |zos|
     files.each do |file, content|
       zos.put_next_entry(file)
       zos.puts content
     end
   end
Full Code
	

require 'csv'
require 'princely'
require 'zip/zip'

DATA_FILE = ARGV[0]
DATA_FILE_BASE_NAME = File.basename(DATA_FILE, ".csv")

# create a pdf document from a csv line
def create_pdf(invoice_nr, name, street, zip, city)
  template = File.new('../resources/invoice.html').read
  html = eval("<<WTFMF\n#{template}\nWTFMF")
  p = Princely.new
  p.add_style_sheets('../resources/invoice.css')
  p.pdf_from_string(html)
end

# zip files from hash
def create_zip(files_h)
  zipfile_name = "../out/#{DATA_FILE_BASE_NAME}.#{Time.now.to_s}.zip"
  Zip::ZipOutputStream.open(zipfile_name) do |zos|
    files_h.each do |name, content|
      zos.put_next_entry "#{name}.pdf"
      zos.puts content
    end
  end
  zipfile_name
end

# load data from csv
docs = CSV.read(DATA_FILE) # array of arrays

# create a pdf for each line in the csv
# and put it in a hash
files_h = docs.inject({}) do |files_h, doc|
  files_h[doc[0]] = create_pdf(*doc)
  files_h
end

# zip all PDFs from the hash
create_zip files_h

DEMO
Step 2: from linear ...	

[Diagram: parse csv → create pdf, create pdf, ..., create pdf → zip]

Step 2: ...to parallel	

[Diagram: parse csv → create pdf, create pdf, create pdf (side by side) → zip. Threads?]

Multi Threaded	

‣  Advantage
•  Lightweight (minimal overhead)
‣  Challenges (or why it is hard)
•  Hard to code: most data structures are not thread-safe by default; they need synchronized access
•  Hard to test: different execution paths, timings
•  Hard to maintain
‣  Limitation
•  Single machine: not a solution for horizontal scalability beyond the multi-core CPU
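
To make "synchronized access" concrete, here is a small sketch with Ruby's built-in `Thread` and `Mutex`. The worker count, key names and fake payloads are invented for illustration; the deck itself does not show a threaded version.

```ruby
# Shared state touched by several threads.
results = {}
lock = Mutex.new

threads = (1..4).map do |worker_id|
  Thread.new do
    25.times do |i|
      key = "doc-#{worker_id}-#{i}"
      # Without the lock, concurrent writes to the shared hash are not
      # guaranteed to be safe on every Ruby implementation.
      lock.synchronize { results[key] = "pdf bytes for #{key}" }
    end
  end
end

threads.each(&:join)
puts results.size
```

Every structure shared between threads needs this kind of guard, which is the coding and testing burden the slide refers to.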
Step 2: ...to parallel	

[Diagram: parse csv → create pdf ×3 → zip, labelled "?"]

Multi Process	

•  Scales across machines
•  Advanced support for debugging and monitoring at the OS level
•  Simpler (code, testing, debugging, ...)
•  Slightly more overhead

BUT
But	

all this assumes
"shared state across processes"

[Diagram: parse csv → shared state → create pdf, create pdf, create pdf → shared state → zip]

Shared state options shown: MemCached, SQL?, File System, … OR … Terra Cotta

Hello Redis	

‣  Shared Memory Key Value Store with High Level Data Structure support
•  String (String, Int, Float)
•  Hash (Map, Dictionary)
•  List (Queue)
•  Set
•  ZSet (sorted set: ordered by score, then by member)
About Redis	

•  Single threaded: 1 thread to serve them all
•  (fit) Everything in memory
•  "Transactions" (MULTI/EXEC)
•  Expiring keys
•  Lua scripting
•  Publisher-Subscriber
•  Auto Create and Destroy
•  Pipelining
•  But … full clustering (master-master) is not available (yet)
Hello Redis	

‣  redis-cli
•  set name "pascal" = OK
•  incr counter = 1
•  incr counter = 2
•  hset pascal name "pascal"
•  hset pascal address "merelbeke"
•  sadd persons pascal
•  smembers persons = [pascal]
•  keys *
•  type pascal = hash
•  lpush todo "read" = 1
•  lpush todo "eat" = 2
•  lpop todo = "eat"
•  rpoplpush todo done = "read"
•  lrange done 0 -1 = "read"
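
The list replies above follow from which end of the list each command touches. The sketch below models that with plain Ruby arrays; `ListModel` is a hypothetical teaching class, not the redis gem and not how Redis is implemented.

```ruby
# Plain-Ruby model of three Redis list commands. This is NOT the redis gem;
# it only mimics head/tail behaviour with arrays (index 0 is the head).
class ListModel
  def initialize
    @lists = Hash.new { |hash, key| hash[key] = [] }
  end

  # LPUSH: add at the head, return the new length.
  def lpush(key, value)
    @lists[key].unshift(value).size
  end

  # LPOP: remove and return the head.
  def lpop(key)
    @lists[key].shift
  end

  # RPOPLPUSH: move the tail of src to the head of dst, return the element.
  def rpoplpush(src, dst)
    value = @lists[src].pop
    @lists[dst].unshift(value) unless value.nil?
    value
  end

  # LRANGE key 0 -1: the whole list, head first.
  def lrange_all(key)
    @lists[key].dup
  end
end

m = ListModel.new
m.lpush('todo', 'read')
m.lpush('todo', 'eat')
first = m.lpop('todo')                # head is the most recently pushed item
moved = m.rpoplpush('todo', 'done')   # tail of todo moves to done
```

Pushing on the left and popping on the right gives first-in, first-out order, which is how the later slides use a list as a work queue.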
Let Redis Distribute	

[Diagram: parse csv, create pdf (several) and zip, each running as its own process]

Spread the Work	

[Diagram, step 1: the parse csv process, a counter and a Queue with data; create pdf processes and the zip process]

Ruby on Redis	

‣  Put PDF Create Input data on a Queue and do the counter bookkeeping

docs.each do |doc|
  data = YAML::dump(doc)
  r.lpush 'pdf:queue', data
  r.incr 'ctr' # bookkeeping
end

Create PDFs

[Diagram, steps 1 and 2: parse csv process, counter, Queue with data, create pdf processes, Hash with pdfs, zip process]

Ruby on Redis	

‣  Read PDF input data from the Queue, do the counter bookkeeping, put each created PDF in a Redis hash, and signal when ready

while (true)
  _, msg = r.brpop 'pdf:queue'
  doc = YAML::load(msg)
  # name of hash, key=docname, value=pdf
  r.hset('pdf:pdfs', doc[0], create_pdf(*doc))
  ctr = r.decr 'ctr'
  r.rpush 'ready', 'done' if ctr == 0
end

Zip When Done	

[Diagram, step 3: parse csv process, create pdf processes, Hash with pdfs, ready signal, zip process]

Ruby on Redis	

‣  Wait for the ready signal, fetch all PDFs, and zip them

r.brpop 'ready' # wait for signal
pdfs = r.hgetall 'pdf:pdfs' # fetch hash
create_zip pdfs # zip it
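
The three steps (queue and count, work and signal, collect) can be walked through without a Redis server. The sketch below is an illustration under assumptions: `FakeRedis` is a hypothetical in-memory stand-in that only mimics the command names used on the slides, its `brpop` does not block, and `fake_pdf` replaces the real `create_pdf`.

```ruby
require 'yaml'

# In-memory stand-in for the handful of Redis commands used on the slides.
# Hypothetical helper for illustration only: no networking, no blocking.
class FakeRedis
  def initialize
    @lists = Hash.new { |h, k| h[k] = [] }
    @hashes = Hash.new { |h, k| h[k] = {} }
    @counters = Hash.new(0)
  end

  def lpush(key, value)
    @lists[key].unshift(value).size
  end

  def rpush(key, value)
    @lists[key].push(value).size
  end

  # The real BRPOP blocks until an element arrives; this one returns nil
  # when the list is empty so the sketch can stop.
  def brpop(key)
    return nil if @lists[key].empty?
    [key, @lists[key].pop]
  end

  def incr(key)
    @counters[key] += 1
  end

  def decr(key)
    @counters[key] -= 1
  end

  def hset(key, field, value)
    @hashes[key][field] = value
  end

  def hgetall(key)
    @hashes[key].dup
  end
end

# Placeholder for create_pdf: returns a string instead of PDF bytes.
def fake_pdf(invoice_nr, name)
  "PDF for invoice #{invoice_nr} (#{name})"
end

r = FakeRedis.new
docs = [%w[1001 Pascal], %w[1002 Koen]]

# Step 1: queue the work and count it.
docs.each do |doc|
  r.lpush 'pdf:queue', YAML.dump(doc)
  r.incr 'ctr'
end

# Step 2: worker loop; ends when the queue is empty instead of blocking.
while (item = r.brpop('pdf:queue'))
  _, msg = item
  doc = YAML.load(msg)
  r.hset('pdf:pdfs', doc[0], fake_pdf(*doc))
  r.rpush 'ready', 'done' if r.decr('ctr') == 0
end

# Step 3: the collector sees the signal and fetches every result.
signal = r.brpop('ready')
pdfs = r.hgetall('pdf:pdfs')
```

Because the sketch runs the three steps one after another in a single process, it cannot show timing effects between real processes, for example a worker decrementing the counter before the producer has finished counting.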
More Parallelism 	

[Diagram: parse csv and zip, several ready signals, counters and hashes with pdfs, one Queue with data, create pdf workers]

Ruby on Redis	

‣  Put PDF Create Input data on a Queue and do the counter bookkeeping

# unique id for this input file
UUID = SecureRandom.uuid
docs.each do |doc|
  data = YAML::dump([UUID, doc])
  r.lpush 'pdf:queue', data
  r.incr "ctr:#{UUID}" # bookkeeping
end
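
A short sketch of what travels over the queue in this version: the job id and the row, packed into one YAML message. The row contents and the variable names `job_id`, `counter_key` and `ready_key` are illustrative assumptions; only the `ctr:` and `ready:` prefixes come from the slides.

```ruby
require 'securerandom'
require 'yaml'

job_id = SecureRandom.uuid           # plays the role of UUID on the slide
doc = %w[1001 Pascal Merelbeke]      # hypothetical CSV row

# What the producer puts on the queue: job id and row in one YAML message.
message = YAML.dump([job_id, doc])

# What a worker gets back after YAML.load: the same two values.
uuid, row = YAML.load(message)

# Per-job key names derived from the id, as used for counter and signal.
counter_key = "ctr:#{uuid}"
ready_key = "ready:#{uuid}"
puts counter_key, ready_key
```

Carrying the id inside each message is what lets any worker file its result under the right job without knowing which producer sent it.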
Ruby on Redis	

‣  Read PDF input data from the Queue, do the counter bookkeeping, and put each created PDF in a Redis hash

while (true)
  _, msg = r.brpop 'pdf:queue'
  uuid, doc = YAML::load(msg)
  r.hset(uuid, doc[0], create_pdf(*doc))
  ctr = r.decr "ctr:#{uuid}"
  r.rpush "ready:#{uuid}", 'done' if ctr == 0
end

Ruby on Redis	

‣  Wait for the ready signal, fetch all PDFs, and zip them

r.brpop "ready:#{UUID}" # wait for signal
pdfs = r.hgetall(UUID) # fetch hash
create_zip(pdfs) # zip it
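
To see why per-job keys allow several input files to share one queue, here is a sketch with plain Ruby collections standing in for the Redis keys. It is an assumption-laden illustration: the job ids `job-a` and `job-b` replace real UUIDs, the rows are invented, and everything runs in one process.

```ruby
# Plain Ruby collections stand in for the Redis keys (assumption: no Redis
# server here). One shared queue; counter, result hash and ready flag per job.
queue    = []
counters = Hash.new(0)
results  = Hash.new { |h, k| h[k] = {} }
ready    = []

jobs = {
  'job-a' => [%w[1 Ann], %w[2 Bob]],
  'job-b' => [%w[7 Cleo]]
}

# Producers: every job pushes onto the same queue and counts its own items.
jobs.each do |job_id, docs|
  docs.each do |doc|
    queue.unshift([job_id, doc])
    counters["ctr:#{job_id}"] += 1
  end
end

# Worker: takes items from the tail, files each result under its job id,
# and signals a job as soon as that job's counter reaches zero.
until queue.empty?
  job_id, doc = queue.pop
  results[job_id][doc[0]] = "PDF for #{doc[1]}"
  counters["ctr:#{job_id}"] -= 1
  ready << "ready:#{job_id}" if counters["ctr:#{job_id}"].zero?
end
```

Each job receives its own ready signal and its own result hash, so a collector waiting on one job never picks up PDFs that belong to another.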
Full Code
	

LINEAR

require 'csv'
require 'princely'
require 'zip/zip'

DATA_FILE = ARGV[0]
DATA_FILE_BASE_NAME = File.basename(DATA_FILE, ".csv")

# create a pdf document from a csv line
def create_pdf(invoice_nr, name, street, zip, city)
  template = File.new('../resources/invoice.html').read
  html = eval("<<WTFMF\n#{template}\nWTFMF")
  p = Princely.new
  p.add_style_sheets('../resources/invoice.css')
  p.pdf_from_string(html)
end

# zip files from hash
def create_zip(files_h)
  zipfile_name = "../out/#{DATA_FILE_BASE_NAME}.#{Time.now.to_s}.zip"
  Zip::ZipOutputStream.open(zipfile_name) do |zos|
    files_h.each do |name, content|
      zos.put_next_entry "#{name}.pdf"
      zos.puts content
    end
  end
  zipfile_name
end

# load data from csv
docs = CSV.read(DATA_FILE) # array of arrays

# create a pdf for each line in the csv
# and put it in a hash
files_h = docs.inject({}) do |files_h, doc|
  files_h[doc[0]] = create_pdf(*doc)
  files_h
end

# zip all PDFs from the hash
create_zip files_h


MAIN

require 'csv'
require 'zip/zip'
require 'redis'
require 'yaml'
require 'securerandom'

# zip files from hash
def create_zip(files_h)
  zipfile_name = "../out/#{DATA_FILE_BASE_NAME}.#{Time.now.to_s}.zip"
  Zip::ZipOutputStream.open(zipfile_name) do |zos|
    files_h.each do |name, content|
      zos.put_next_entry "#{name}.pdf"
      zos.puts content
    end
  end
  zipfile_name
end

DATA_FILE = ARGV[0]
DATA_FILE_BASE_NAME = File.basename(DATA_FILE, ".csv")
UUID = SecureRandom.uuid

r = Redis.new
my_counter = "ctr:#{UUID}"

# load data from csv
docs = CSV.read(DATA_FILE) # array of arrays

docs.each do |doc| # distribute!
  r.lpush 'pdf:queue', YAML::dump([UUID, doc])
  r.incr my_counter
end

r.brpop "ready:#{UUID}" # collect!
create_zip(r.hgetall(UUID))

# clean up
r.del my_counter
r.del UUID
puts "All done!"


WORKER

require 'redis'
require 'princely'
require 'yaml'

# create a pdf document from a csv line
def create_pdf(invoice_nr, name, street, zip, city)
  template = File.new('../resources/invoice.html').read
  html = eval("<<WTFMF\n#{template}\nWTFMF")
  p = Princely.new
  p.add_style_sheets('../resources/invoice.css')
  p.pdf_from_string(html)
end

r = Redis.new
while (true)
  _, msg = r.brpop 'pdf:queue'
  uuid, doc = YAML::load(msg)
  r.hset(uuid, doc[0], create_pdf(*doc))
  ctr = r.decr "ctr:#{uuid}"
  r.rpush "ready:#{uuid}", 'done' if ctr == 0
end


Key functions (create pdf and create zip) remain unchanged.
Distribution code highlighted
DEMO 2
Multi Language Participants	

[Diagram: parse csv, zip and create pdf participants around the shared Queue with data, counters and hashes with pdfs]

Conclusions	

From Linear To Multi Process Distributed	

Is easy with	

Redis Shared Memory High Level Data Structures	

	

Atomic Counter for bookkeeping	

Queue for work distribution	

Queue as Signal	

Hash for result sets


Ruby on Redis

  • 1. Ruby on Redis Pascal Weemaels Koen Handekyn Oct 2013
  • 2. Target Create a Zip file of PDF’s based on a CSV data file ‣  Linear version ‣  Making it scale with Redis parse csv create pdf create pdf ... create pdf zip
  • 3. Step 1: linear ‣  Parse CSV •  std lib : require ‘csv’ •  docs = CSV.read("#{DATA}.csv")
  • 4. Simple Templating with String Interpolation invoice.html <<Q <div class="title"> INVOICE #{invoice_nr} ‣  Merge data into HTML •  template = File.new('invoice.html'). read •  html = eval("<<QQQn#{template} nQQQ”) </div> <div class="address"> #{name}</br> #{street}</br> #{zip} #{city}</br> </div> Q
  • 5. Step 1: linear ‣  Create PDF •  prince xml using princely gem •  http://www.princexml.com •  p = Princely.new p.add_style_sheets('invoice.css') p.pdf_from_string(html)
  • 6. Step 1: linear ‣  Create ZIP •  Zip::ZipOutputstream. open(zipfile_name)do |zos| files.each do |file, content| zos.new_entry(file) zos.puts content end end
  • 7. Full Code require 'csv'! require 'princely'! require 'zip/zip’! ! DATA_FILE = ARGV[0]! DATA_FILE_BASE_NAME = File.basename(DATA_FILE, ".csv”)! ! # create a pdf document from a csv line! def create_pdf(invoice_nr, name, street, zip, city)! template = File.new('../resources/invoice.html').read! html = eval("<<WTFMFn#{template}nWTFMF")! p = Princely.new! p.add_style_sheets('../resources/invoice.css')! p.pdf_from_string(html)! end! ! # zip files from hash ! def create_zip(files_h)! zipfile_name = "../out/#{DATA_FILE_BASE_NAME}.#{Time.now.to_s}.zip"! Zip::ZipOutputStream.open(zipfile_name) do |zos|! files_h.each do |name, content|! zos.put_next_entry "#{name}.pdf"! zos.puts content! end! end! zipfile_name! end! ! # load data from csv! docs = CSV.read(DATA_FILE) # array of arrays! ! # create a pdf for each line in the csv ! # and put it in a hash! files_h = docs.inject({}) do |files_h, doc|! files_h[doc[0]] = create_pdf(*doc)! files_h! end! ! # zip all pfd's from the hash ! create_zip files_h! !
  • 9. Step 2: from linear ... parse csv create pdf create pdf ... create pdf zip
  • 10. Step 2: ...to parallel parse csv create pdf create pdf zip Threads ? create pdf
  • 11. Multi Threaded ‣  Advantage •  Lightweight (minimal overhead) ‣  Challenges (or why is it hard) •  Hard to code: most data structures are not thread safe by default, they need synchronized access •  Hard to test: different execution paths , timings •  Hard to maintain ‣  Limitation •  single machine - not a solution for horizontal scalability beyond the multi core cpu
  • 12. Step 2: ...to parallel parse csv ? create pdf create pdf zip create pdf
  • 13. Multi Process • scale across machines •  advanced support for debugging and monitoring at the OS level • simpler (code, testing, debugging, ...) •  slightly more overhead BUT
  • 14. But all this assumes “shared state across processes” MemCached parse csv SQL? shared state create pdf create pdf create pdf shared state File System zip … OR … Terra Cotta
  • 15. Hello Redis ‣  Shared Memory Key Value Store with High Level Data Structure support •  String (String, Int, Float) •  Hash (Map, Dictionary) •  List (Queue) •  Set •  ZSet (ordered by member or score)
  • 16. About Redis •  Single threaded : 1 thread to serve them all •  (fit) Everything in memory •  “Transactions” (multi exec) •  Expiring keys •  LUA Scripting •  Publisher-Subscriber •  Auto Create and Destroy •  Pipelining •  But … full clustering (master-master) is not available (yet)
  • 17. Hello Redis ‣  redis-cli •  •  •  •  set name “pascal” = “pascal” incr counter = 1 incr counter = 2 hset pascal name “pascal” •  hset pascal address “merelbeke” •  •  sadd persons pascal smembers persons = [pascal] •  •  •  •  •  •  •  keys * type pascal = hash lpush todo “read” = 1 lpush todo “eat” = 2 lpop todo = “eat” rpoplpush todo done = “read” lrange done 0 -1 = “read”
  • 18. Let Redis Distribute (diagram: parse csv process, create pdf processes, ..., zip; each step runs in its own process)
  • 19. Spread the Work (diagram, step 1: the parse csv process fills the queue with data and the counter; the create pdf processes read from the queue)
  • 20. Ruby on Redis ‣ Put the PDF creation input data on a Queue and do the counter bookkeeping
      docs.each do |doc|
        data = YAML::dump(doc)
        r.lpush 'pdf:queue', data
        r.incr 'ctr' # bookkeeping
      end
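Redis lists hold strings, which is why each CSV row is serialized before it goes on the queue. A small sketch of that round trip, using a made-up row:

```ruby
require 'yaml'

# Hypothetical CSV row (CSV.read yields arrays of strings like this one).
doc = ['1001', 'Pascal', 'Main Street 1', '9820', 'Merelbeke']

data = YAML.dump(doc)       # a String, suitable as a Redis list element
restored = YAML.load(data)  # what a worker gets back after popping it

puts restored == doc
```

The numeric-looking fields come back as strings because YAML quotes them on dump, so `create_pdf(*doc)` sees the same arguments on both sides of the queue.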
  • 21. Create PDF’s (diagram, step 2: the create pdf processes take data from the queue, update the counter, and store their results in the hash with pdfs)
  • 22. Ruby on Redis ‣ Read PDF input data from the Queue, put each created PDF in a Redis hash, do the counter bookkeeping, and signal when ready
      while (true)
        _, msg = r.brpop 'pdf:queue'
        doc = YAML::load(msg)
        # name of hash, key=docname, value=pdf
        r.hset('pdf:pdfs', doc[0], create_pdf(*doc))
        ctr = r.decr 'ctr'
        r.rpush 'ready', 'done' if ctr == 0
      end
  • 23. Zip When Done (diagram, step 3: the ready signal tells the zip step to fetch the hash with pdfs)
  • 24. Ruby on Redis ‣ Wait for the ready signal, fetch all pdf's, and zip them
      r.brpop 'ready'              # wait for signal
      pdfs = r.hgetall 'pdf:pdfs'  # fetch hash
      create_zip pdfs              # zip it
  • 25. More Parallelism (diagram: several parse csv / zip pairs share one queue with data and the create pdf processes; each input file has its own counter, hash with pdfs, and ready signal)
  • 26. Ruby on Redis ‣ Put the PDF creation input data on a Queue and do the counter bookkeeping
      # unique id for this input file
      UUID = SecureRandom.uuid
      docs.each do |doc|
        data = YAML::dump([UUID, doc])
        r.lpush 'pdf:queue', data
        r.incr "ctr:#{UUID}" # bookkeeping
      end
  • 27. Ruby on Redis ‣ Read PDF input data from the Queue, put each created PDF in a Redis hash, and do the counter bookkeeping
      while (true)
        _, msg = r.brpop 'pdf:queue'
        uuid, doc = YAML::load(msg)
        r.hset(uuid, doc[0], create_pdf(*doc))
        ctr = r.decr "ctr:#{uuid}"
        r.rpush "ready:#{uuid}", 'done' if ctr == 0
      end
  • 28. Ruby on Redis ‣ Wait for the ready signal, fetch all pdf's, and zip them
      r.brpop "ready:#{UUID}"  # wait for signal
      pdfs = r.hgetall(UUID)   # fetch this file's hash (the workers store pdfs under the UUID)
      create_zip(pdfs)         # zip it
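The three steps above can be tried without a Redis server using the sketch below. Everything in it is an assumption for illustration: `MiniRedis` is a hypothetical in-process stand-in that mimics only the handful of commands used on the slides, the rows are made up, and a string replaces the real PDF. It also shows one possible variation: the counter is set to the number of documents before anything is queued, because with an `incr` per `lpush` a fast worker could decrement to zero while the main process is still enqueueing.

```ruby
require 'securerandom'
require 'yaml'

# Hypothetical in-process stand-in for the Redis commands used on the slides.
# It is NOT a Redis client; with a real server, Redis.new would take its place.
class MiniRedis
  def initialize
    @lock = Mutex.new
    @cond = ConditionVariable.new
    @data = {}
  end

  def lpush(key, value)
    @lock.synchronize do
      (@data[key] ||= []).unshift(value)
      @cond.broadcast
    end
  end

  def rpush(key, value)
    @lock.synchronize do
      (@data[key] ||= []).push(value)
      @cond.broadcast
    end
  end

  # Blocks until the list has an element, then pops from the right.
  def brpop(key)
    @lock.synchronize do
      @cond.wait(@lock) while (@data[key] ||= []).empty?
      [key, @data[key].pop]
    end
  end

  def set(key, value)
    @lock.synchronize { @data[key] = value.to_i }
  end

  def decr(key)
    @lock.synchronize { @data[key] = @data.fetch(key, 0) - 1 }
  end

  def hset(key, field, value)
    @lock.synchronize { (@data[key] ||= {})[field] = value }
  end

  def hgetall(key)
    @lock.synchronize { (@data[key] || {}).dup }
  end
end

r = MiniRedis.new
uuid = SecureRandom.uuid
docs = [%w[1001 Pascal], %w[1002 Koen], %w[1003 Ann]] # made-up rows

# Workers: same loop as on the slide, with a string standing in for the PDF.
workers = Array.new(2) do
  Thread.new do
    loop do
      _, msg = r.brpop('pdf:queue')
      id, doc = YAML.load(msg)
      r.hset(id, doc[0], "PDF #{doc[0]}")
      r.rpush("ready:#{id}", 'done') if r.decr("ctr:#{id}") == 0
    end
  end
end

# Main: publish the expected count first, then distribute the work.
r.set("ctr:#{uuid}", docs.size)
docs.each { |doc| r.lpush('pdf:queue', YAML.dump([uuid, doc])) }

r.brpop("ready:#{uuid}")  # wait for the signal
pdfs = r.hgetall(uuid)    # fetch this file's results
workers.each(&:kill)
puts pdfs.keys.sort.inspect
```

Because each worker stores its PDF before decrementing, the decrement that reaches zero is guaranteed to come after every `hset` for that file.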
  • 29. Full Code
      LINEAR

      require 'csv'
      require 'princely'
      require 'zip/zip'

      DATA_FILE = ARGV[0]
      DATA_FILE_BASE_NAME = File.basename(DATA_FILE, ".csv")

      # create a pdf document from a csv line
      def create_pdf(invoice_nr, name, street, zip, city)
        template = File.new('../resources/invoice.html').read
        html = eval("<<WTFMF\n#{template}\nWTFMF")
        p = Princely.new
        p.add_style_sheets('../resources/invoice.css')
        p.pdf_from_string(html)
      end

      # zip files from hash
      def create_zip(files_h)
        zipfile_name = "../out/#{DATA_FILE_BASE_NAME}.#{Time.now.to_s}.zip"
        Zip::ZipOutputStream.open(zipfile_name) do |zos|
          files_h.each do |name, content|
            zos.put_next_entry "#{name}.pdf"
            zos.puts content
          end
        end
        zipfile_name
      end

      # load data from csv
      docs = CSV.read(DATA_FILE) # array of arrays

      # create a pdf for each line in the csv
      # and put it in a hash
      files_h = docs.inject({}) do |files_h, doc|
        files_h[doc[0]] = create_pdf(*doc)
        files_h
      end

      # zip all pdf's from the hash
      create_zip files_h

      MAIN

      require 'csv'
      require 'zip/zip'
      require 'redis'
      require 'yaml'
      require 'securerandom'

      # zip files from hash
      def create_zip(files_h)
        zipfile_name = "../out/#{DATA_FILE_BASE_NAME}.#{Time.now.to_s}.zip"
        Zip::ZipOutputStream.open(zipfile_name) do |zos|
          files_h.each do |name, content|
            zos.put_next_entry "#{name}.pdf"
            zos.puts content
          end
        end
        zipfile_name
      end

      DATA_FILE = ARGV[0]
      DATA_FILE_BASE_NAME = File.basename(DATA_FILE, ".csv")
      UUID = SecureRandom.uuid

      r = Redis.new
      my_counter = "ctr:#{UUID}"

      # load data from csv
      docs = CSV.read(DATA_FILE) # array of arrays

      docs.each do |doc| # distribute!
        r.lpush 'pdf:queue', YAML::dump([UUID, doc])
        r.incr my_counter
      end

      r.brpop "ready:#{UUID}" # collect!
      create_zip(r.hgetall(UUID))

      # clean up
      r.del my_counter
      r.del UUID
      puts "All done!"

      WORKER

      require 'redis'
      require 'princely'
      require 'yaml'

      # create a pdf document from a csv line
      def create_pdf(invoice_nr, name, street, zip, city)
        template = File.new('../resources/invoice.html').read
        html = eval("<<WTFMF\n#{template}\nWTFMF")
        p = Princely.new
        p.add_style_sheets('../resources/invoice.css')
        p.pdf_from_string(html)
      end

      r = Redis.new
      while (true)
        _, msg = r.brpop 'pdf:queue'
        uuid, doc = YAML::load(msg)
        r.hset(uuid, doc[0], create_pdf(*doc))
        ctr = r.decr "ctr:#{uuid}"
        r.rpush "ready:#{uuid}", 'done' if ctr == 0
      end

      Key functions (create_pdf and create_zip) remain unchanged. Distribution code highlighted.
  • 31. Multi Language Participants (diagram: parse csv and zip, the queue with data, per-file counters and hashes with pdfs, and the create pdf processes)
  • 32. Conclusions
      From Linear to Multi Process Distributed is easy with Redis Shared Memory High Level Data Structures:
      • Atomic Counter for bookkeeping
      • Queue for work distribution
      • Queue as Signal
      • Hash for result sets