Stream upload and asynchronous job processing in large scale systems

884 views
775 views

Published on

Presentation at Barcamp Saigon 2013 - RMIT 7th July
Presenter: Lê Bá Minh (VNG)

Published in: Technology, Business
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
884
On SlideShare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
10
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Stream upload and asynchronous job processing in large scale systems

  1. 1. Stream Upload And Asynchronous Job Processing System Lê Bá Minh – minhlb@vng.com.vn Technical Manager – Zalo Team - VNG
  2. 2. Agenda • 1/ Why we need an Asynchronous Job Processing System? • 2/ How it works ? • 3/ Application • 4/ Q &A
  3. 3. Parallel Stream Upload • Data is separated in chunks
  4. 4. Facts • Zalo Stream Upload • Background continuous Voice Upload • Background Image upload • … • Facts (now) • 1M voices /day • 800K images /day • Peak: 500 Chunks/second • Expect: • Scalable (more than 5000 chunks/second) • High performance
  5. 5. What we need • Asynchronous Job processing System Collect Data Processing Data Response Collect Data Processing DataResponse Workers
  6. 6. What we need • Asynchronous Job processing System • Batch Job • Big data job • High Reliable: No job missed • Distributed job processing workers • High performance • Persistent • Load balancing, Failed over, Recoverable
  7. 7. Open-source solutions • Share-memory workers • All workers in one physical server • No fail-over • Un-scalable • Gearman • Good but not completely fit our requirement • No Batch Job support • Not full reliable (lost job) • Not full load-balance • Un-stable if more than 2000 jobs/second
  8. 8. Zalo Asyn Job Processing System Client Client Worker 1 Worker 2 Worker 3 Z Database Short Connection Long Connection TCP TCP Worker Manager Job Caching Job Manager Persistent Manager Job Clean-Up Job Server TCP TCP TCP
  9. 9. Implementation • C/C++ for Job Server • C/C++, Java for client and workers • Binary Protocol • Z-Database
  10. 10. Job State Queuing Processing Failed Time Out Finished Deliver to Worker Worker ACK Failed Worker ACK Finished No ACK Started
  11. 11. Job Type • Single Job • Simple task • Immediately deliver • Batch Job • Multiple tasks • Deliver when received all tasks
  12. 12. Deployment Job Server 1 Job Server 2 Synchronized Business Server Worker 1 Worker 2 Worker 3
  13. 13. Applications • Using for all Asynchronous job processing in Zalo: voice upload, image upload, feed processing… • Benchmark (single server) • 50K images/seconds (640x480) • 50k voices/seconds (30s) • Advantages • Batch Jobs • Never lost job • Worker can restart or stop any time • Fail-over, Load Balancing, Quick recover in failure • Issue • Job duplication (handled by worker)
  14. 14. Q&A

×