Huy Nguyen
CTO, Co-founder - Holistics.io
Building A Job Queue System
with PostgreSQL & Ruby
Grokking TechTalk #24 - Background Job Queues
Ho Chi Minh City - March 2018
About Me
Education:
● Pho Thong Nang Khieu, Tin 04-07
● National University of Singapore (NUS), Computer Science Major.
Work:
● Software Engineer Intern, SenseGraphics (Stockholm, Sweden)
● Software Engineer Intern, Facebook (California, US)
● Data Infrastructure Engineer, Viki (Singapore)
Now:
● Co-founder & CTO, Holistics Software
● Co-founder, Grokking Vietnam
huy@holistics.io facebook.com/huy bit.ly/huy-linkedin
Some Talks I’ve Given
● Building Analytics Infrastructure at Viki
● Why PostgreSQL for Analytics Infrastructure
● PostgreSQL Internals: Discussing on Uber Moving From PostgreSQL to MySQL
● Now: Building A Job Queue System with PostgreSQL & Ruby
Background
B2B SaaS Web Application.
Connect to customer’s DB → run queries → wait for results → display to end users
Requirements
● Reliability: each job should be processed only once and never missed; job pickup order; retry mechanism.
● Job persistence: store job info and track each job’s statistics (run duration, start time, end time, etc.).
● Multi-tenancy: each customer (Customer A, Customer B, Customer C, ...) has its own queue slots.
Queue Architecture
[Diagram: per-customer queue slots (Customer A, Customer B, Customer C) in the Holistics Job Queue Layer, feeding a pool of Sidekiq workers]
● Ruby + Rails
● PostgreSQL for DB
● Sidekiq Background Job
class DataReport < ApplicationRecord
  include Queuable

  def execute
    # compose and run this report
    return values
  end
end

report = DataReport.find(123)

# normal: execute synchronously, this returns the return value of `execute` method
report_results = report.execute

# execute asynchronously, this returns a job ID (int)
job_id = report.async.execute
Storing Jobs Data
CREATE TABLE jobs (
  id INTEGER PRIMARY KEY,
  source_type VARCHAR,
  source_method VARCHAR,
  source_id INTEGER,
  args JSONB DEFAULT '{}',
  status VARCHAR,
  start_time TIMESTAMP,
  queued_time TIMESTAMP,
  end_time TIMESTAMP,
  created_at TIMESTAMP,
  stats JSONB DEFAULT '{}',
  tenant_id INTEGER
);
Job Status:
created → queued → running → success / failure
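The status lifecycle above can be written down as a transition table, which makes it easy to reject an invalid status change before it is saved. This is only a sketch: the `JobStatus` module and its method names are hypothetical and do not come from the slides.

```ruby
# Hypothetical helper: encodes created -> queued -> running -> success / failure.
module JobStatus
  TRANSITIONS = {
    'created' => ['queued'],
    'queued'  => ['running'],
    'running' => ['success', 'failure'],
    'success' => [],
    'failure' => []
  }.freeze

  # True when a job may move from one status to another.
  def self.valid_transition?(from, to)
    TRANSITIONS.fetch(from, []).include?(to)
  end

  # Terminal statuses are known statuses with no outgoing transitions.
  def self.terminal?(status)
    TRANSITIONS.fetch(status, nil) == []
  end
end
```

For example, `JobStatus.valid_transition?('created', 'running')` is false because a job has to pass through `queued` first.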
-- find out how many jobs are running per queue, so we know if it's full
WITH running_jobs_per_queue AS (
  SELECT
    tenant_id,
    count(1) AS running_jobs
  FROM jobs
  WHERE (status = 'running' OR status = 'queued') -- running or queued
    AND created_at > NOW() - INTERVAL '6 HOURS' -- ignore jobs past 6 hours
  GROUP BY 1
),
-- find out queues that are full
full_queues AS (
  SELECT R.tenant_id
  FROM running_jobs_per_queue R
  LEFT JOIN tenant_queues Q ON R.tenant_id = Q.tenant_id
  WHERE R.running_jobs >= Q.num_slots
)
SELECT id
FROM jobs
WHERE status = 'created'
  AND tenant_id NOT IN (SELECT tenant_id FROM full_queues)
ORDER BY id ASC
LIMIT 1
FOR UPDATE SKIP LOCKED
SQL to claim next job
Select the next job whose customer still has available queue slots.
Skip over rows that have already been locked by another transaction (SKIP LOCKED).
Acquire a row-level lock upon selecting (FOR UPDATE).
CREATE TABLE tenant_queues (
  id INTEGER PRIMARY KEY,
  tenant_id INTEGER,
  num_slots INTEGER
);
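Queuing next job
Each job is claimed in a transaction.
Upon finding the next job, change its state and send it over to Sidekiq.
class Job < ApplicationRecord
  # Claims the next available job, marks it as queued and hands it to Sidekiq.
  # Returns the job ID, or nil when no job can be claimed.
  def self.queue_next_job
    job_id = transaction do
      # queue_sql holds the "SQL to claim next job" statement
      ret = connection.execute(queue_sql)
      next nil if ret.values.empty?

      id = ret.values.first.first.to_i
      # update! sets and saves the status in one call and raises on failure
      find(id).update!(status: 'queued')
      id
    end

    # send to background worker once the status change is committed
    JobWorker.perform_async(job_id) if job_id
    job_id
  end
end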
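Generic Sidekiq job worker
# simplified code
class JobWorker
  include Sidekiq::Worker

  def perform(job_id)
    job = Job.find(job_id)
    job.update!(status: 'running')

    # rebuild the source object, e.g. DataReport.find(123)
    obj = job.source_type.constantize.find(job.source_id)
    # assumption: `args` holds keyword arguments; an empty hash passes none
    obj.public_send(job.source_method, **job.args.symbolize_keys)

    job.update!(status: 'success')
  rescue StandardError
    # job is nil if the lookup itself failed
    job&.update(status: 'error')
  ensure
    # a slot may have freed up, so try to claim the next job
    Job.queue_next_job
  end
end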
This runs inside a Sidekiq worker (in the background).
It pulls the relevant record from the database and constructs the object.
It then invokes the method with the relevant parameters.
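The lookup-and-invoke step can be tried out without Rails or Sidekiq. In the sketch below, `FakeReport` and `dispatch` are hypothetical stand-ins, `Object.const_get` plays the role of Rails' `constantize`, and the job's `args` are assumed to be a hash of keyword arguments; the slides do not say how arguments are actually passed.

```ruby
# Hypothetical stand-in for an ActiveRecord model such as DataReport.
class FakeReport
  attr_reader :id

  def initialize(id)
    @id = id
  end

  # Stub for ActiveRecord's find: builds an instance instead of querying a database.
  def self.find(id)
    new(id)
  end

  def execute(limit: 10)
    "report #{id} with limit #{limit}"
  end
end

# Rebuilds the source object from a job record and invokes the stored method.
# Assumption: job[:args] is a hash of keyword arguments with string keys.
def dispatch(job)
  klass = Object.const_get(job[:source_type]) # plain-Ruby counterpart of constantize
  obj = klass.find(job[:source_id])
  kwargs = job[:args].transform_keys(&:to_sym)
  obj.public_send(job[:source_method], **kwargs)
end

job = { source_type: 'FakeReport', source_method: 'execute', source_id: 123, args: { 'limit' => 5 } }
puts dispatch(job) # prints "report 123 with limit 5"
```

Using `public_send` rather than `send` keeps the worker from calling private methods by name, which matters when the method name comes from a database row.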
Supervisor vs non-supervisor

          OTHER JOB QUEUE                        HOLISTICS JOB QUEUE
Master    Dedicated process to receive request   SQL + inline with existing Rails or Sidekiq process
Workers   Dedicated processes or threads         Pass over to Sidekiq
The .async keyword
Easily switch between synchronous and asynchronous execution.
No need to write dedicated worker code, thanks to Ruby’s metaprogramming.
class DataReport < ApplicationRecord
  include Queuable

  def execute
    # compose and run this report
    return values
  end
end

report = DataReport.find(123)

# normal: execute synchronously, this returns the return value of `execute` method
report_results = report.execute

# execute asynchronously, this returns a job ID (int)
job_id = report.async.execute
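The slides do not show how `Queuable` is implemented. One plausible way to get this behaviour is a proxy object that uses `method_missing` to record the call instead of running it. The sketch below is a guess at the technique, not the Holistics code: `AsyncProxy` is a hypothetical class, the `JOBS` array stands in for the `jobs` table, and `DataReport` is a plain Ruby class here so the example runs without Rails.

```ruby
# In-memory stand-in for the jobs table (assumption: the real system inserts a row).
JOBS = []

# Hypothetical proxy: captures a method call as a job record instead of running it.
class AsyncProxy
  def initialize(target)
    @target = target
  end

  # Any method the target responds to is recorded as a 'created' job.
  def method_missing(name, **args)
    return super unless @target.respond_to?(name)

    job = {
      id: JOBS.size + 1,
      source_type: @target.class.name,
      source_method: name.to_s,
      source_id: @target.id,
      args: args,
      status: 'created'
    }
    JOBS << job
    job[:id] # callers get the job ID back
  end

  def respond_to_missing?(name, include_private = false)
    @target.respond_to?(name) || super
  end
end

# Hypothetical version of the Queuable mixin.
module Queuable
  def async
    AsyncProxy.new(self)
  end
end

class DataReport
  include Queuable
  attr_reader :id

  def initialize(id)
    @id = id
  end

  def execute
    "results for report #{id}"
  end
end

report = DataReport.new(123)
puts report.execute       # runs synchronously and prints the results
puts report.async.execute # records a job and prints its ID
```

Because the proxy stores the class name, record ID and method name, a generic worker can later rebuild the object and call the same method, so no per-job worker class is needed.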
Summary
● Why Reinvent The Wheel? Or did we?
● What we have now.
[Diagram: Customer A, Customer B and Customer C served by the Holistics Job Queue Layer, which passes jobs to Sidekiq workers]
Other Job Queue Using PostgreSQL: que
● que, an open-source job queue written for Ruby & PostgreSQL.
● Uses advisory locks.
● Learn more: https://github.com/chanks/que
Q&As?
Jobs @ Holistics
Looking for comrades to join our small team.
Why?
● Global, enterprise product used by well-known companies (Grab, Traveloka, ...)
● Small, lean team that moves fast
Positions:
● Backend Engineer
● Front-end Engineer
● Technical Sales Engineer
● Product Manager
holistics.io/careers
