Scaling Web Applications with Background Jobs:
Takeaways from Generating a Huge PDF
Lydia Cupery
All Things Open, 2023
Hi! Nice to meet you.
Lydia Cupery
lydia-cupery.com
The Problem
(Screenshot: application page with a “Download Invoices” button)
Fetch the invoice data from the database
Transform data to shape of PDF data
Write the Invoice PDF for each customer
Combine all Invoice PDFs
Put completed PDF somewhere client can access
CLIENT -> SERVER: write customer invoices PDF
SERVER -> CLIENT: reference to created PDF
Write the Invoice PDF for each customer
Combine all Invoice PDFs
First attempt: create one large PDF.
Write each customer statement to the main PDF.
It works, but it takes way too long.
Okay, what if we create all the PDFs at the same time, and then merge them?
We’ll use Promise.all and store all the individual PDFs in memory.
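A minimal sketch of this attempt, assuming a hypothetical `renderInvoicePdf` helper that stands in for the real PDF renderer (it only returns placeholder bytes here). It shows why memory grows: every render starts at once and all results are held until the last one finishes.

```typescript
// Hypothetical stand-in for a real PDF renderer: returns placeholder bytes.
async function renderInvoicePdf(customerId: string): Promise<Uint8Array> {
  return new TextEncoder().encode(`PDF for ${customerId}`);
}

// Start every render at once. Promise.all resolves only when all renders
// have finished, so every rendered PDF sits in memory at the same time.
async function renderAllAtOnce(customerIds: string[]): Promise<Uint8Array[]> {
  return Promise.all(customerIds.map((id) => renderInvoicePdf(id)));
}
```

With a handful of customers this is fine; with thousands, the array of in-memory PDFs is what can exhaust the dyno's memory.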
What if we processed two or three invoices at a time, and wrote each of those as PDFs to the file system?
Then, we could merge those PDFs together with pdf-lib.
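The batching idea can be sketched as follows. `toBatches`, `processInBatches`, and the `writeInvoice` callback are illustrative names, not from the talk; `writeInvoice` is assumed to render one invoice, write it to disk, and return the file path. The merge step with pdf-lib is not shown.

```typescript
// Split a list into consecutive batches of at most `size` items.
function toBatches<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new RangeError("size must be at least 1");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Process one batch at a time; within a batch, invoices are written
// concurrently. At most `batchSize` invoices are in flight at any moment.
async function processInBatches(
  customerIds: string[],
  batchSize: number,
  writeInvoice: (customerId: string) => Promise<string>
): Promise<string[]> {
  const paths: string[] = [];
  for (const batch of toBatches(customerIds, batchSize)) {
    const written = await Promise.all(batch.map((id) => writeInvoice(id)));
    paths.push(...written);
  }
  return paths;
}
```

This caps memory use, but the batches run one after another, so the total time stays high.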
What if we used an external service? Is there any external service that could help us out with our bottlenecks?
Write the Invoice PDF for each customer
Combine all Invoice PDFs
…perhaps for combining PDFs!
Cuts down on the time to combine PDFs, but generating the individual PDFs and writing the PDFs to the file system still takes too long.
Takes Too Long: Generating & writing PDFs to the file system takes too long.
VS
Uses Too Much Memory: Generate too many PDFs in parallel, use a lot of memory, and the dyno runs out of memory.
What if we keep increasing memory, and upgrade our dynos to ones with more memory?
invoice report please!
report.pdf
The system could still run out of memory and start failing as soon as more than one user tries to generate a PDF at once.
invoice report please!
invoice report please!
ERROR
please load X page!
ERROR
report.pdf
Make it a Background Job!
“Background jobs can dramatically improve the scalability of a web app by enabling it to offload slow or CPU-intensive tasks from its front-end.”
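Browser -> Web Server: request invoices PDF
Web Server -> Background Service: schedule generate invoice PDF
Web Server -> Browser: in-progress
Background Service: generate invoice PDF
Browser -> Web Server: is it done yet?
Web Server -> Browser: nope!
Browser -> Web Server: is it done yet?
Web Server -> Browser: nope!
Browser -> Web Server: is it done yet?
Web Server -> Browser: yes! here it is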
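The browser side of this exchange can be sketched as a polling loop. This is an illustration under assumptions: `JobStatus`, `pollUntilDone`, and the `checkStatus` callback are hypothetical names, and in a real app `checkStatus` would call a status endpoint on the web server.

```typescript
type JobStatus = { state: "in-progress" } | { state: "done"; url: string };

// Ask "is it done yet?" until the job is done or the attempts run out.
async function pollUntilDone(
  checkStatus: () => Promise<JobStatus>,
  intervalMs: number,
  maxAttempts: number
): Promise<string> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await checkStatus();
    if (status.state === "done") {
      return status.url; // "yes! here it is"
    }
    // "nope!" Wait a little before asking again.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("job did not finish within the allowed attempts");
}
```

Each poll is a short request, so the web server is never tied up while the PDF is being generated.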
Architecture Overview
web process (web dyno)
data store (Redis)
library to implement queue system on top of Redis (BullMQ)
worker processes (worker dynos)
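To make the roles concrete, here is a toy in-memory stand-in for the data store and queue library. It is not BullMQ's API; `InMemoryQueue` and `workOnce` are hypothetical names that only illustrate who calls what: the web process adds jobs and reads their state, and worker processes claim and complete them.

```typescript
type Job<T> = { id: number; data: T; state: "waiting" | "active" | "completed" };

// In-memory stand-in for the data store plus queue library.
class InMemoryQueue<T> {
  private jobs: Job<T>[] = [];
  private nextId = 1;

  // Called by the web process: record the job and return immediately.
  add(data: T): Job<T> {
    const job: Job<T> = { id: this.nextId++, data, state: "waiting" };
    this.jobs.push(job);
    return job;
  }

  // Called by a worker process: claim the oldest waiting job, if any.
  takeNext(): Job<T> | undefined {
    const job = this.jobs.find((j) => j.state === "waiting");
    if (job) job.state = "active";
    return job;
  }

  // Called by the web process when the browser asks for job status.
  get(id: number): Job<T> | undefined {
    return this.jobs.find((j) => j.id === id);
  }
}

// One iteration of a worker loop: take a job, run the handler, mark it done.
// Returns false when there was nothing to do.
function workOnce<T>(queue: InMemoryQueue<T>, handler: (data: T) => void): boolean {
  const job = queue.takeNext();
  if (!job) return false;
  handler(job.data);
  job.state = "completed";
  return true;
}
```

In the real setup the job list lives in Redis, so web and worker dynos can be separate processes that share it.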
With a Background Job You Can…
With background jobs you can… Speed Things Up
Sorted Customers with Invoices
● fetch invoice data, generate invoices (customerIds: A-E)
● fetch invoice data, generate invoices (customerIds: F-O)
● fetch invoice data, generate invoices (customerIds: P-Z)
● combine generated invoices
JOB QUEUE
Split: customer list / WORKER_COUNT (customers A-E, customers F-O, customers P-Z)
● generate invoices (customerIds: A-E)
● generate invoices (customerIds: F-O)
● generate invoices (customerIds: P-Z)
● combine generated invoices
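One way to read "customer list / WORKER_COUNT" is as a chunk size: divide the sorted list into roughly equal contiguous ranges, one per worker. The sketch below assumes that reading; `splitIntoChunks` is an illustrative name.

```typescript
// Split `items` into at most `parts` contiguous chunks of near-equal size.
function splitIntoChunks<T>(items: T[], parts: number): T[][] {
  if (parts < 1) throw new RangeError("parts must be at least 1");
  const chunkSize = Math.ceil(items.length / parts);
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += chunkSize) {
    chunks.push(items.slice(i, i + chunkSize));
  }
  return chunks;
}
```

Because the list is sorted before it is split, each chunk covers a contiguous range of customers, so the combined PDF can keep the original order by concatenating the chunk outputs in sequence.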
JOB QUEUE
● fetch data (A-E), write invoices, upload “batch invoices xxx - 1”
● fetch data (F-O), write invoices, upload “batch invoices xxx - 2”
● fetch data (P-Z), write invoices, upload “batch invoices xxx - 3”
● combine S3 files to “batch invoices xxx”, upload “batch invoices xxx”
Job
● Partial Job -> output
● Partial Job -> output
● Partial Job -> output
● Combine Partial Job Outputs
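The partial-job pattern can be sketched as a fan-out followed by a combine step. Everything here is a placeholder: `runPartialJob` only returns the name of the file it would upload, and `combineOutputs` only lists the parts it would merge. In the talk's setup the partial jobs run on separate workers rather than in one process.

```typescript
type PartialOutput = { part: number; key: string; invoiceCount: number };

// Hypothetical partial job: would fetch data, write invoices, and upload
// one batch file. Here it only reports the key it would have uploaded.
async function runPartialJob(
  batchName: string,
  part: number,
  customerIds: string[]
): Promise<PartialOutput> {
  return { part, key: `${batchName} - ${part}`, invoiceCount: customerIds.length };
}

// Hypothetical combine step: would merge the uploaded parts, in part order,
// into a single file named after the batch.
function combineOutputs(
  batchName: string,
  outputs: PartialOutput[]
): { key: string; parts: string[] } {
  const ordered = [...outputs].sort((a, b) => a.part - b.part);
  return { key: batchName, parts: ordered.map((o) => o.key) };
}

// Fan out one partial job per chunk, then combine once all have finished.
async function runJob(batchName: string, chunks: string[][]) {
  const outputs = await Promise.all(
    chunks.map((ids, index) => runPartialJob(batchName, index + 1, ids))
  );
  return combineOutputs(batchName, outputs);
}
```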
With background jobs you can… Show Progress
Progress Indication
With no background job (assuming the server doesn’t time out or run out of memory):
Browser -> Web Server: request invoice PDF
Web Server: generate pdf…
Web Server -> Browser: here it is!
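Indicating Progress
Browser -> Web Server: request invoices PDF
Web Server -> Background Service: schedule generate invoice PDF
Web Server -> Browser: in-progress
Background Service: generate invoice PDF (jobs)
Browser -> Web Server: is it done yet?
Web Server -> Browser: nope! 30% there
Browser -> Web Server: is it done yet?
Web Server -> Browser: nope! 70% there
Browser -> Web Server: is it done yet?
Web Server -> Browser: yes! here it is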
Generate Invoices Job
(input, updateProgress) =>
● fetch data to generate invoices
● updateProgress(0.1)
● generate the invoices
○ updateProgress each time an invoice is generated
● updateProgress(0.9)
● upload generated file to S3
● updateProgress(1)
JOB PROGRESS (job 1, as the steps run): .1, .2, .4, .6, .8, .9, 1
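The job outline above could look roughly like this in code. The `Steps` callbacks are hypothetical placeholders for the real fetch, write, and upload logic, and spreading the 0.1 to 0.9 range evenly across invoices is an assumption about how the per-invoice updates are computed.

```typescript
type UpdateProgress = (fraction: number) => void;

// Hypothetical hooks for the real work; the names are illustrative only.
type Steps = {
  fetchData: (customerIds: string[]) => Promise<string[]>; // one record per invoice
  writeInvoice: (record: string) => Promise<void>;
  upload: () => Promise<void>;
};

async function generateInvoicesJob(
  input: { customerIds: string[] },
  updateProgress: UpdateProgress,
  steps: Steps
): Promise<void> {
  const records = await steps.fetchData(input.customerIds);
  updateProgress(0.1);

  let written = 0;
  for (const record of records) {
    await steps.writeInvoice(record);
    written++;
    // Spread the range from 0.1 to 0.9 evenly over the invoices.
    updateProgress(0.1 + 0.8 * (written / records.length));
  }
  updateProgress(0.9);

  await steps.upload();
  updateProgress(1);
}
```

Passing `updateProgress` in as a parameter keeps the job body independent of the queue library that stores the progress value.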
view job progress
updateProgress(X) updates job progress
aggregate progress across jobs
requests job progress
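The slides do not say how progress is aggregated across partial jobs. A simple option, assumed here, is an unweighted average of each job's progress fraction; `aggregateProgress` is an illustrative name.

```typescript
// Average the progress (0 to 1) of all partial jobs.
// An empty list counts as no progress.
function aggregateProgress(partialProgress: number[]): number {
  if (partialProgress.length === 0) return 0;
  const total = partialProgress.reduce((sum, p) => sum + p, 0);
  return total / partialProgress.length;
}
```

An unweighted average works well when the partial jobs cover similar numbers of customers; uneven chunks would call for weighting by chunk size.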
With background jobs you can… Support Simultaneous Users
Before background jobs…
invoice report please!
invoice report please!
ERROR
please load X page!
ERROR
report.pdf
invoice report please!
please load X page!
working on that!
progress?
progress?
invoice report please!
working on that!
progress?
here you go!
With background jobs you can… Save Jobs for Later
invoice report please!
working on that!
progress?
progress?
invoice report please!
working on that!
progress?
With background jobs you can… Have Fewer Timeouts/Errors
Communicating with an External Service
With no background job:
Browser -> Web Server: send out customer emails, please!
Web Server -> Mailgun: send these emails, please!
TIMEOUT
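Browser -> Web Server: send out customer emails, please!
Web Server -> Background Service: send these emails, please!
Web Server -> Browser: in-progress
Background Service -> Mailgun: send these emails, please!
Browser -> Web Server: is it done yet?
Web Server -> Browser: nope! 30% there
Browser -> Web Server: is it done yet?
Web Server -> Browser: nope! 70% there
Browser -> Web Server: is it done yet?
Web Server -> Browser: yes - all emails are sent!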
Recap - With a background job you can…
speed things up
show progress
support simultaneous users
have fewer timeouts/errors
save jobs for later
Should you use a background job?
You might want a background job for…
CPU-intensive jobs
I/O-intensive jobs
Jobs communicating externally
Scheduled jobs
Tips
Struggling with app responsiveness? Try a background job.
Don’t reinvent the wheel. Use a library with a robust queueing system.
Does speed matter? It probably does. Parallelize.
The job queue makes it easy to show users progress. Do so!
Find the optimal number of workers and optimal amount of resources per worker (see next slide…)
6 x 1X dynos: $25 x 6 = $150, 512 MB x 6 = 3 GB
3 x 2X dynos: $50 x 3 = $150, 1 GB x 3 = 3 GB
1 x Perf M dyno: $250 x 1 = $250, 2.5 GB x 1 = 2.5 GB
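The comparison on this slide is plain arithmetic, sketched below with the slide's figures. `DynoOption` and `totals` are illustrative names, and the prices and memory sizes are taken from the slide as-is rather than from current pricing.

```typescript
type DynoOption = {
  name: string;
  count: number;
  pricePerDyno: number; // dollars, as listed on the slide
  memoryGbPerDyno: number;
};

// Total cost and total memory for running `count` dynos of one type.
function totals(option: DynoOption): { totalCost: number; totalMemoryGb: number } {
  return {
    totalCost: option.pricePerDyno * option.count,
    totalMemoryGb: option.memoryGbPerDyno * option.count,
  };
}

const options: DynoOption[] = [
  { name: "1X", count: 6, pricePerDyno: 25, memoryGbPerDyno: 0.5 },
  { name: "2X", count: 3, pricePerDyno: 50, memoryGbPerDyno: 1 },
  { name: "Perf M", count: 1, pricePerDyno: 250, memoryGbPerDyno: 2.5 },
];

for (const option of options) {
  const { totalCost, totalMemoryGb } = totals(option);
  console.log(`${option.count} x ${option.name}: $${totalCost}, ${totalMemoryGb} GB`);
}
```

Six 1X dynos and three 2X dynos cost the same and provide the same total memory, so the choice between them comes down to how much memory each individual worker needs and how many jobs should run in parallel.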
What about the PDF?
Fetch the invoice data from the database
Transform data to shape of PDF data
Write the Invoice PDF for each customer
Combine all Invoice PDFs
Put completed PDF somewhere client can access

Fetch list of customers with invoices from the database (do not need to fetch all invoice data)
On each worker dyno:
● Fetch invoice data for customers (fetching 1/10 amount of data)
● Transform fetched data to shape of PDF data (transforming 1/10 amount of data)
● Write the Invoice PDF for each customer (writing PDF for 1/10 customers)
Combine all Invoice PDFs
Put completed PDF somewhere client can access
Thank You!
Lydia Cupery
lydia-cupery.com

Scaling Web Applications with Background

  • 1. Scaling Web Applications with Background Jobs: Takeaways from Generating a Huge PDF Lydia Cupery All Things Open, 2023
  • 2. Hi! Nice to meet you. Lydia Cupery lydia-cupery.com
  • 5. Motion design Physical Computing Download Invoices Fetch the invoice data from the database Transform data to shape of PDF data Put completed PDF somewhere client can access Write the Invoice PDF for each customer Combine all Invoice PDFs SERVER write customer invoices PDF reference to created PDF CLIENT
  • 7. Fetch the invoice data from the database Transform data Put completed PDF somewhere client can access Write the Invoice PDF for each customer Combine all Invoice PDFs
  • 8. Write the Invoice PDF for each customer Combine all Invoice PDFs
  • 9. First attempt - create one large PDF. Write each customer statement to the main pdf. It works but it takes way too long.
  • 10. Okay, what if we create all the PDFs at the same time? And then merge them? We’ll use a Promise.all and store all the individual PDFs in memory.
  • 11. What if we processed two or three invoices at a time, and wrote each of those as PDFs to the file system? Then, we could merge those PDFs together with pdf-lib.
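Slides 10 and 11 contrast starting every PDF at once with Promise.all against handling two or three invoices at a time. The second approach might look roughly like the TypeScript sketch below. `mapWithLimit` and `renderInvoice` are hypothetical names that do not come from the talk, and `renderInvoice` only simulates the work; a real version would render a PDF and write it to the file system.

```typescript
// Hypothetical helper: run `task` over `items` with at most `limit` calls in flight.
async function mapWithLimit<T, R>(
  items: readonly T[],
  limit: number,
  task: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  if (limit < 1) throw new Error("limit must be at least 1");
  const results: R[] = new Array(items.length);
  let next = 0;

  // Each runner claims the next unprocessed index until none are left.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await task(items[index], index);
    }
  }

  const runnerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}

// Stand-in for rendering one invoice PDF (assumption: returns the written file name).
async function renderInvoice(customerId: string): Promise<string> {
  await new Promise<void>((resolve) => setTimeout(resolve, 5));
  return `invoice-${customerId}.pdf`;
}
```

Calling `mapWithLimit(customerIds, 3, renderInvoice)` keeps at most three invoices in progress, which bounds memory use at the cost of a longer total run time than an unbounded Promise.all.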
  • 12. What if we used an external service? Is there any external service that could help us out with our bottlenecks? Write the Invoice PDF for each customer Combine all Invoice PDFs …perhaps for combining PDFs!
  • 13. Cuts down on the time to combine PDFs, but generating the individual PDFs and writing the PDFs to the file system still takes too long.
  • 14. Takes Too Long: generating and writing PDFs to the file system takes too long. VS Uses Too Much Memory: generating too many PDFs in parallel uses a lot of memory, and the dyno runs out of memory.
  • 15. What if we keep increasing memory, and upgrade our dynos to ones with more memory? invoice report please! report.pdf
  • 16. The system could still run out of memory and start failing as soon as more than one user tries to generate a PDF at once. invoice report please! invoice report please! ERROR please load X page! ERROR report.pdf
  • 17. Make it a Background Job!
  • 18. “Background jobs can dramatically improve the scalability of a web app by enabling it to offload slow or CPU-intensive tasks from its front-end.”
  • 19. Browser Web Server Background Service request invoices PDF schedule generate invoice PDF in-progress generate invoice PDF is it done yet? nope! is it done yet? nope! is it done yet? yes! here it is
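The request, schedule, and "is it done yet?" exchange on slide 19 can be sketched as follows. This is only an illustration under stated assumptions: `InMemoryJobStore` is a hypothetical in-memory stand-in for the queue and data store (Redis with BullMQ in the talk), `pollUntilDone` plays the browser, and failure handling is left out.

```typescript
// Status the (hypothetical) web server reports for a background job.
type JobStatus =
  | { state: "in-progress" }
  | { state: "done"; fileUrl: string };

// In-memory stand-in for the job queue's data store.
class InMemoryJobStore {
  private readonly jobs = new Map<string, JobStatus>();
  private nextId = 1;

  // Record the job as in-progress, start the work, and return the id right away.
  schedule(work: () => Promise<string>): string {
    const id = `job-${this.nextId++}`;
    this.jobs.set(id, { state: "in-progress" });
    void work().then((fileUrl) => {
      this.jobs.set(id, { state: "done", fileUrl });
    });
    return id;
  }

  status(id: string): JobStatus {
    const status = this.jobs.get(id);
    if (status === undefined) throw new Error(`unknown job: ${id}`);
    return status;
  }
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Client side: ask "is it done yet?" until the job reports a file or we give up.
async function pollUntilDone(
  store: InMemoryJobStore,
  jobId: string,
  intervalMs: number,
  maxAttempts: number,
): Promise<string> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = store.status(jobId);
    if (status.state === "done") return status.fileUrl;
    await sleep(intervalMs);
  }
  throw new Error(`job ${jobId} did not finish after ${maxAttempts} checks`);
}
```

The point of the pattern is that `schedule` returns immediately, so the web request finishes quickly while the slow work continues elsewhere.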
  • 21. web process (web dyno) data store (redis) library to implement queue system on top of redis (BullMQ) worker processes (worker dynos)
  • 22. With a Background Job You Can…
  • 23. With background jobs you can… Speed Things Up
  • 24. Browser Web Server Background Service request invoices PDF schedule generate invoice PDF in-progress generate invoice PDF is it done yet? nope! is it done yet? nope! is it done yet? yes! here it is
  • 25. Sorted Customers with Invoices fetch invoice data generate invoices customerIds: A-E fetch invoice data generate invoices customerIds: F-O fetch invoice data generate invoices customerIds: P-Z combine generated invoices JOB QUEUE customers A-E customers F-O customers P-Z customer list/ WORKER_COUNT
  • 26. generate invoices customerIds: A-E generate invoices customerIds: F-O generate invoices customerIds: P-Z combine generated invoices JOB QUEUE fetch data (A-E) write invoices upload “batch invoices xxx - 1” fetch data (F-O) write invoices upload “batch invoices xxx - 2” fetch data (P-Z) write invoices upload “batch invoices xxx - 3” combine s3 files to “batch invoices xxx” upload “batch invoices xxx”
  • 27. Job Partial Job Partial Job Partial Job Combine Partial Job Outputs output output output
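Slides 25 to 27 split the sorted customer list by WORKER_COUNT, run one partial job per batch, and combine the partial outputs. A minimal sketch of that split-and-combine shape is below. All function names are hypothetical, and the "pages" are plain strings standing in for generated PDFs; the talk uploads batch files to S3 and merges them there.

```typescript
// Split a sorted list into contiguous batches, one per worker.
// May return fewer batches than workers when the list is short.
function partitionForWorkers<T>(sorted: readonly T[], workerCount: number): T[][] {
  if (!Number.isInteger(workerCount) || workerCount < 1) {
    throw new Error("workerCount must be a positive integer");
  }
  const batchSize = Math.ceil(sorted.length / workerCount);
  const batches: T[][] = [];
  for (let start = 0; start < sorted.length; start += batchSize) {
    batches.push(sorted.slice(start, start + batchSize));
  }
  return batches;
}

// Output of one partial job; batchIndex records where it belongs in the final file.
interface PartialOutput {
  batchIndex: number;
  pages: string[];
}

// Stand-in for "fetch data, write invoices, upload batch" for one batch of customers.
function runPartialJob(batchIndex: number, customerIds: readonly string[]): PartialOutput {
  return { batchIndex, pages: customerIds.map((id) => `invoice:${id}`) };
}

// Combine step: partial jobs can finish in any order, so sort by batchIndex first.
function combinePartialOutputs(outputs: readonly PartialOutput[]): string[] {
  const ordered = [...outputs].sort((a, b) => a.batchIndex - b.batchIndex);
  const combined: string[] = [];
  for (const output of ordered) {
    combined.push(...output.pages);
  }
  return combined;
}
```

Keeping the batches contiguous over an already sorted list means the combined output stays sorted as long as the combine step respects batch order.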
  • 28. With background jobs you can… Show Progress
  • 29. Progress Indication With no background job (assuming the server doesn’t time out or run out of memory) : Browser Web Server request invoice PDF here it is! generate pdf…
  • 31. Browser Web Server Background Service request invoices PDF schedule generate invoice PDF in-progress generate invoice PDF jobs is it done yet? nope! 30% there is it done yet? nope! 70% there is it done yet? yes! here it is
  • 32. Generate Partial Invoices Job (input, updateProgress) => ● fetch data to generate invoices ● updateProgress(0.1) JOB PROGRESS 1 .1
  • 33. Generate Invoices Job (input, updateProgress) => ● fetch data to generate invoices ● updateProgress(0.1) ● generate the invoices ○ updateProgress each time an invoice generated JOB PROGRESS 1 .2
  • 34. Generate Invoices Job (input, updateProgress) => ● fetch data to generate invoices ● updateProgress(0.1) ● generate the invoices ○ updateProgress each time an invoice generated JOB PROGRESS 1 .4
  • 35. Generate Invoices Job (input, updateProgress) => ● fetch data to generate invoices ● updateProgress(0.1) ● generate the invoices ○ updateProgress each time an invoice generated JOB PROGRESS 1 .6
  • 36. Generate Invoices Job (input, updateProgress) => ● fetch data to generate invoices ● updateProgress(0.1) ● generate the invoices ○ updateProgress each time an invoice generated JOB PROGRESS 1 .8
  • 37. Generate Invoices Job (input, updateProgress) => ● fetch data to generate invoices ● updateProgress(0.1) ● generate the invoices ○ updateProgress each time an invoice generated ● updateProgress(0.9) JOB PROGRESS 1 .9
  • 38. Generate Invoices Job (input, updateProgress) => ● fetch data to generate invoices ● updateProgress(0.1) ● generate the invoices ○ updateProgress each time an invoice generated ● updateProgress(0.9) ● upload generated file to s3 ● updateProgress(1) JOB PROGRESS 1 1
  • 39. view job progress updateProgress(X) updates job progress aggregate progress across jobs requests job progress
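The progress milestones on slides 32 to 39 (0.1 after fetching data, rising while invoices are generated, 0.9 before upload, 1 after upload) could be computed along these lines. The `JobState` shape, the linear scaling between 0.1 and 0.9, and the unweighted average across jobs are all assumptions made for illustration; the slides only say that progress is aggregated across jobs.

```typescript
// Hypothetical snapshot of one partial job's state.
interface JobState {
  fetched: boolean; // invoice data has been fetched
  generated: number; // invoices generated so far
  total: number; // invoices this job must generate
  uploaded: boolean; // generated file has been uploaded
}

// Map a job's state to a progress value between 0 and 1.
function jobProgress(state: JobState): number {
  if (state.uploaded) return 1;
  if (!state.fetched) return 0;
  if (state.total <= 0) return 0.9; // nothing to generate, only the upload remains
  const fraction = Math.min(Math.max(state.generated / state.total, 0), 1);
  // Generation moves progress linearly from 0.1 to 0.9.
  return 0.1 + 0.8 * fraction;
}

// Overall progress shown to the user: unweighted mean across all jobs (assumption).
function aggregateProgress(progresses: readonly number[]): number {
  if (progresses.length === 0) return 0;
  let sum = 0;
  for (const progress of progresses) {
    sum += progress;
  }
  return sum / progresses.length;
}
```

If batches differ a lot in size, weighting each job by its invoice count would give a smoother overall number than a plain mean.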
  • 40. With background jobs you can… Support Simultaneous Users
  • 41. Before background jobs… invoice report please! invoice report please! ERROR please load X page! ERROR report.pdf
  • 42. invoice report please! please load X page! working on that! progress? progress? invoice report please! working on that! progress? here you go!
  • 43. With background jobs you can… Save Jobs for Later
  • 44. invoice report please! working on that! progress? progress? invoice report please! working on that! progress?
  • 45. With background jobs you can… Have Fewer Timeouts/Errors
  • 46. Communicating with an External Service With no background job: Browser Web Server send out customer emails, please! TIMEOUT Mailgun send these emails, please!
  • 47. Browser Web Server send out customer emails, please! send these emails, please! in-progress Mailgun is it done yet? nope! 30% there is it done yet? nope! 70% there is it done yet? yes - all emails are sent! Background Service send these emails, please!
  • 48. Recap - With a background job you can… speed things up, show progress, support simultaneous users, have fewer timeouts/errors, save jobs for later
  • 49. Should you use a background job?
  • 50. You might want a background job for… CPU-intensive jobs Jobs communicating externally I/O intensive jobs Scheduled jobs
  • 51. Tips
  • 52. Struggling with app responsiveness? Try a background job.
  • 53. Don’t reinvent the wheel. Use a library with a robust queueing system.
  • 54. Does speed matter? It probably does. Parallelize.
  • 55. The job queue makes it easy to show users progress. Do so!
  • 56. Find the optimal number of workers and optimal amount of resources per worker (see next slide…)
  • 57. 1X dynos: $25 x 6 = $150, 512 MB x 6 = 3 GB. 2X dynos: $50 x 3 = $150, 1 GB x 3 = 3 GB. Perf M dyno: $250 x 1 = $250, 2.5 GB x 1 = 2.5 GB.
  • 59. Fetch the invoice data from the database Transform data to shape of PDF data Put completed PDF somewhere client can access Write the Invoice PDF for each customer Combine all Invoice PDFs
  • 60. Fetch list of customers with invoices from the database Transform fetched data to shape of PDF data Put completed PDF somewhere client can access Write the Invoice PDF for each customer Combine all Invoice PDFs do not need to fetch all invoice data Fetch invoice data for customers fetching 1/10 amount of data transforming 1/10 amount of data writing PDF for 1/10 customers worker dyno worker dyno