Stream Upload And
Asynchronous Job
Processing System
Lê Bá Minh – minhlb@vng.com.vn
Technical Manager – Zalo Team - VNG
Agenda
• 1/ Why do we need an asynchronous job processing system?
• 2/ How does it work?
• 3/ Applications
• 4/ Q&A
Parallel Stream Upload
• Data is split into chunks
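As a rough illustration of chunked upload, the sketch below splits a payload into numbered chunks and reassembles them even when they arrive out of order. The `Chunk` fields, the function names, and the chunk size are assumptions made for this example; the slides do not describe Zalo's actual chunk format.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Chunk:
    """One piece of an upload (hypothetical layout, not Zalo's format)."""
    upload_id: str
    index: int      # position of this chunk within the upload
    total: int      # number of chunks in the whole upload
    payload: bytes


def split_into_chunks(upload_id: str, data: bytes, chunk_size: int) -> List[Chunk]:
    """Split data into fixed-size chunks; the last chunk may be shorter."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Ceiling division; an empty upload still yields one (empty) chunk.
    total = max(1, -(-len(data) // chunk_size))
    return [
        Chunk(upload_id, i, total, data[i * chunk_size:(i + 1) * chunk_size])
        for i in range(total)
    ]


def reassemble(chunks: Iterable[Chunk]) -> bytes:
    """Rebuild the original data from chunks received in any order."""
    received = list(chunks)
    if not received:
        raise ValueError("no chunks received")
    total = received[0].total
    by_index = {c.index: c.payload for c in received}
    missing = [i for i in range(total) if i not in by_index]
    if missing:
        raise ValueError(f"missing chunks: {missing}")
    return b"".join(by_index[i] for i in range(total))
```

Because each chunk carries its own index and the total count, chunks can be sent over parallel connections and the receiver can tell when the upload is complete.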
Facts
• Zalo Stream Upload
  • Continuous background voice upload
  • Background image upload
  • …
• Facts (now)
  • 1M voice uploads/day
  • 800K images/day
  • Peak: 500 chunks/second
• Expected:
  • Scalable (more than 5,000 chunks/second)
  • High performance
What we need
• Asynchronous job processing system
[Diagram: two flows, each with Collect Data, Processing Data, and Response stages; the second flow also includes Workers.]
What we need
• Asynchronous job processing system
• Batch jobs
• Big data jobs
• Highly reliable: no job is lost
• Distributed job processing workers
• High performance
• Persistent
• Load balancing, fail-over, recoverable
Open-source solutions
• Shared-memory workers
  • All workers on one physical server
  • No fail-over
  • Not scalable
• Gearman
  • Good, but does not completely fit our requirements
  • No batch job support
  • Not fully reliable (jobs can be lost)
  • Load balancing is incomplete
  • Unstable above 2,000 jobs/second
Zalo Async Job Processing System
[Diagram: two Clients and Worker 1, Worker 2, Worker 3 connect to the Job Server over TCP; connection labels: Short Connection, Long Connection. Job Server components: Worker Manager, Job Caching, Job Manager, Persistent Manager, Job Clean-Up. Storage: Z Database.]
Implementation
• C/C++ for Job Server
• C/C++, Java for client and workers
• Binary Protocol
• Z-Database
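The slides do not specify the binary protocol, so the following is only a sketch of one common approach: a fixed-size header followed by a length-prefixed payload. The header layout (1-byte message type, 8-byte job id, 4-byte payload length, network byte order) and the names `encode_frame` and `decode_frame` are hypothetical.

```python
import struct
from typing import Tuple

# Hypothetical header: message type (1 byte), job id (8 bytes),
# payload length (4 bytes), all in network byte order.
HEADER = struct.Struct("!BQI")


def encode_frame(msg_type: int, job_id: int, payload: bytes) -> bytes:
    """Serialize one message as header + payload."""
    return HEADER.pack(msg_type, job_id, len(payload)) + payload


def decode_frame(buf: bytes) -> Tuple[int, int, bytes, bytes]:
    """Parse one frame from the start of buf.

    Returns (msg_type, job_id, payload, remaining_bytes).
    Raises ValueError if buf does not yet hold a complete frame.
    """
    if len(buf) < HEADER.size:
        raise ValueError("incomplete header")
    msg_type, job_id, length = HEADER.unpack_from(buf, 0)
    end = HEADER.size + length
    if len(buf) < end:
        raise ValueError("incomplete payload")
    return msg_type, job_id, buf[HEADER.size:end], buf[end:]
```

A length prefix lets the receiver find message boundaries on a TCP stream, which delivers bytes without preserving the sender's write boundaries.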
Job State
[Diagram: job state machine. States: Started, Queuing, Processing, Failed, Time Out, Finished. Transitions: Deliver to Worker, Worker ACK Failed, Worker ACK Finished, No ACK.]
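The job states and transition labels on this slide can be sketched as a small transition table. The pairing of each label with a source and target state is inferred from the label names, and the `REQUEUE` event (sending a failed or timed-out job back to the queue) is an assumption added for this example; the slide text does not show what happens after Failed or Time Out.

```python
from enum import Enum, auto


class State(Enum):
    QUEUING = auto()
    PROCESSING = auto()
    FAILED = auto()
    TIMED_OUT = auto()
    FINISHED = auto()


class Event(Enum):
    DELIVER_TO_WORKER = auto()
    WORKER_ACK_FINISHED = auto()
    WORKER_ACK_FAILED = auto()
    NO_ACK = auto()
    REQUEUE = auto()  # assumption: not shown in the slide text


# (current state, event) -> next state
TRANSITIONS = {
    (State.QUEUING, Event.DELIVER_TO_WORKER): State.PROCESSING,
    (State.PROCESSING, Event.WORKER_ACK_FINISHED): State.FINISHED,
    (State.PROCESSING, Event.WORKER_ACK_FAILED): State.FAILED,
    (State.PROCESSING, Event.NO_ACK): State.TIMED_OUT,
    (State.FAILED, Event.REQUEUE): State.QUEUING,      # assumed retry path
    (State.TIMED_OUT, Event.REQUEUE): State.QUEUING,   # assumed retry path
}


def next_state(state: State, event: Event) -> State:
    """Return the next job state, rejecting transitions not in the table."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {state.name} on {event.name}") from None
```

Note that re-queueing a job after a missing ACK can deliver the same job twice if the first worker was merely slow, which would be one way duplicates arise.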
Job Type
• Single Job
  • Simple task
  • Delivered immediately
• Batch Job
  • Multiple tasks
  • Delivered once all tasks have been received
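A minimal sketch of the batch rule, assuming each task arrives tagged with a batch id, its index, and the total task count. The `BatchCollector` class and its method are hypothetical names for illustration, not part of the system described in the slides. A single job is the special case `total == 1`, which is released immediately.

```python
from typing import Dict, List, Optional


class BatchCollector:
    """Holds tasks until every task of a batch has been received."""

    def __init__(self) -> None:
        # batch_id -> {task index -> task payload}
        self._pending: Dict[str, Dict[int, bytes]] = {}

    def add_task(self, batch_id: str, index: int, total: int,
                 task: bytes) -> Optional[List[bytes]]:
        """Record one task.

        Returns the ordered task list once the batch is complete,
        otherwise None (the batch is not yet deliverable).
        """
        if not 0 <= index < total:
            raise ValueError("index out of range")
        tasks = self._pending.setdefault(batch_id, {})
        tasks[index] = task  # a repeated index simply overwrites
        if len(tasks) < total:
            return None
        del self._pending[batch_id]
        return [tasks[i] for i in range(total)]
```

In a persistent deployment the pending map would need to survive a server restart; this in-memory version only shows the delivery condition.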
Deployment
[Diagram: Job Server 1 and Job Server 2 (synchronized), a Business Server, and Worker 1, Worker 2, Worker 3.]
Applications
• Used for all asynchronous job processing in Zalo: voice upload, image upload, feed processing…
• Benchmark (single server)
  • 50K images/second (640x480)
  • 50K voices/second (30 s)
• Advantages
  • Batch jobs
  • Jobs are never lost
  • Workers can restart or stop at any time
  • Fail-over, load balancing, quick recovery from failure
• Issue
  • Job duplication (handled by the worker)
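One way a worker might handle duplicate deliveries is to remember which job ids it has already completed and return the stored result instead of running the job again. The sketch below assumes every job carries a unique id; `IdempotentWorker` is a hypothetical name, and the in-memory dictionary stands in for whatever shared or persistent store a real deployment would need.

```python
from typing import Callable, Dict


class IdempotentWorker:
    """Runs each job id at most once, even if it is delivered repeatedly."""

    def __init__(self, handler: Callable[[bytes], bytes]) -> None:
        self._handler = handler
        # job_id -> result of the first successful run
        self._results: Dict[str, bytes] = {}

    def process(self, job_id: str, payload: bytes) -> bytes:
        if job_id in self._results:
            # Duplicate delivery: skip the work, reuse the earlier result.
            return self._results[job_id]
        result = self._handler(payload)
        # Stored only after the handler succeeds, so a failed job
        # can still be retried later.
        self._results[job_id] = result
        return result
```

This only deduplicates within one worker process; duplicates delivered to different workers would need a shared record of completed job ids.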
Q&A
Stream upload and asynchronous job processing  in large scale systems
