Turn, Turn, Turn, Good Luck Comes Along: The Great Cloud Transcoding Battle! (轉轉轉好運旺來一起來之雲端轉檔大作戰!)

In this talk, I will introduce Mass, a Python package that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, (visualization), failure handling, command-line integration, and much more.

The purpose of Mass is to address all the plumbing typically associated with long-running batch processes. You want to chain many tasks and automate them, and failures will happen. These tasks can be anything, but they are typically long-running jobs such as video encoding, computer graphics rendering, database dumps, or complex algorithms on big data.

The main features of Mass:

* No centralized server.
* Pythonic code snippets to represent a job
* On-premise, cloud-based, or even mixed Mass workers are allowed.

Let's start your mass job and enjoy it!
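
To make "dependency resolution" concrete, here is a minimal sketch of how a nested job description could be flattened into execution stages. It borrows the Job/Task/Action vocabulary from the slides, but `TaskNode`, `ActionNode`, and `plan` are hypothetical names invented for this illustration; this is not Mass's implementation, and it runs entirely in-process without Amazon SWF.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class ActionNode:
    """A leaf unit of work, routed to a worker by role (hypothetical model)."""
    role: str
    params: dict = field(default_factory=dict)


@dataclass
class TaskNode:
    """A named group of children; children run in order unless parallel=True."""
    title: str
    children: List[Union["TaskNode", ActionNode]] = field(default_factory=list)
    parallel: bool = False


def plan(node) -> List[List[ActionNode]]:
    """Flatten a tree into stages; actions inside one stage may run concurrently."""
    if isinstance(node, ActionNode):
        return [[node]]
    child_plans = [plan(child) for child in node.children]
    if not node.parallel:
        # Sequential: all stages of one child finish before the next child starts.
        return [stage for child in child_plans for stage in child]
    # Parallel (simplified to lockstep): merge the i-th stage of every child.
    depth = max((len(child) for child in child_plans), default=0)
    return [
        [action for child in child_plans if i < len(child) for action in child[i]]
        for i in range(depth)
    ]


# The encoding workflow from the slides: check, then transcode, then package.
job = TaskNode("Encoding Workflow", [
    TaskNode("Check", [ActionNode("check", {"source": "in.mpg"})]),
    TaskNode("Transcode", [
        TaskNode("Transcode 1080p", [ActionNode("transcode", {"resolution": "1080p"})]),
        TaskNode("Transcode 720p", [ActionNode("transcode", {"resolution": "720p"})]),
    ], parallel=True),
    TaskNode("Package", [
        TaskNode("Package for Android", [ActionNode("package", {"protocol": "dash"})]),
        TaskNode("Package for iOS", [ActionNode("package", {"protocol": "hls"})]),
    ], parallel=True),
])

stages = [[action.role for action in stage] for stage in plan(job)]
print(stages)
```

Under these assumptions the workflow reduces to three stages: the source check, then both transcodes together, then both packaging actions together. A real scheduler would also have to handle failures and retries, which this sketch leaves out.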

Published in: Engineering


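  1. Turn, Turn, Turn, Good Luck Comes Along: The Great Cloud Transcoding Battle! 林進錕, Senior Engineer
  2. Who am I, and where am I from? • My name is 林進錕 • I currently live in Yilan, where I grow rice and vegetables • KKStream is hiring!
  3. Outline: • Introduce the adaptive streaming encoding workflow • Review current solutions • Introduce Mass • Future work • Q&A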
  4. Adaptive Streaming Encoding Workflow (diagram): Source check (Check) → Transcode (1080p) → Package (Android)
  5. Things are not as simple as a fool imagines (Taiwanese saying)
  6. Adaptive Streaming Encoding Workflow (diagram): Source check (Check) → Transcode (1080p, 720p, 480p, 240p) → Package (Android, iOS, Edge)
  7. Adaptive Streaming Encoding Workflow (diagram): Source check (Check) → Transcode (1080p, 720p, 480p, 240p) → Package (Android, iOS, Edge)
  8. First, look for outside help
  9. Review current solutions: • Gearman • Luigi • Others
  10. Review current solutions: • Gearman • Luigi • Others
  11. Gearman: • Requires managing a Job Server (diagram: Job Server, Worker, Worker)
  12. Gearman

          # install gearman job server
          $ sudo apt-get update
          $ sudo apt-get upgrade
          $ sudo apt-get install gcc autoconf bison flex libtool make curl
          $ sudo apt-get install libboost-all-dev libcurl4-openssl-dev libevent-dev uuid-dev
          $ cd ~
          $ wget https://launchpad.net/gearmand/1.2/1.1.12/+download/gearmand-1.1.12.tar.gz
          $ tar -xvf gearmand-1.1.12.tar.gz
          $ cd gearmand-1.1.12
          $ ./configure
          $ sudo make
          $ sudo make install
          $ sudo apt-get install gearman-job-server
          $ sudo pecl install gearman
          $ sudo nano /etc/php5/conf.d/gearman.ini
          $ sudo service apache2 restart

          # install gearman manager
          $ cd /opt
          $ git clone https://github.com/brianlmoon/GearmanManager.git gearman-manager
          $ cd gearman-manager
          $ ./install/install.sh
          $ mkdir /tmp/phpsource
          $ cd /tmp/phpsource
          $ apt-get source php5
          $ cd /tmp/phpsource/php5-*/ext/pcntl
          $ phpize
          $ ./configure
          $ make
          $ cd modules
          $ cp pcntl.so /usr/lib/php5/<long number like 20121212>/
          $ chmod 644 /usr/lib/php5/<long number like 20121212>/pcntl.so

  13. Gearman: • Written from the Worker's point of view, so the Workflow is not visible
  14. Gearman (diagram: Check, Transcode, Package)

          from gearman import GearmanWorker
          from gearman import GearmanClient

          def check(worker, job):
              # check source (do_check is defined elsewhere)
              do_check(job.data)

              # start transcode task: the next step is hard-coded inside the worker
              client = GearmanClient()
              client.submit_job('transcode', job.data)
              return

  15. Review current solutions: • Gearman • Luigi • Others
  16. Luigi: • Requires managing a Job Server • Written from the Task's point of view, so the Workflow is not visible (diagram: Job Server, Worker, Worker)
  17. Luigi (diagram: Check, Transcode, Package)

          import luigi

          # check() and transcode() are defined elsewhere
          class Check(luigi.Task):
              def run(self):
                  return check()

          class Transcode(luigi.Task):
              def requires(self):
                  # each Task only knows its own upstream dependency
                  return Check()

              def run(self):
                  return transcode()

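  18. Review current solutions: • Gearman • Luigi • Others (Tractor, Celery, Afanasy, etc.): they require managing a Job Server and do not support complex Jobs
  19. None of them seems to fit very well
  20. What We Need: • No Job Server to manage: decentralized, self-managing, and we want to sleep well at night (diagram: Job Server, Worker, Worker)
  21. What We Need: • Describe the Encoding Workflow from a top-down point of view, neither per Worker nor per Task (diagram: Check, Transcode, Package)
  22. What We Need: • Adjust the types and number of Workers as needed: source-check machines can be weaker; transcode machines can be stronger and more numerous; machines can be added on the fly to speed up transcoding (diagram: Check, Transcode, Package)
  23. Things are definitely not as simple as a fool imagines
  24. Heaven looks after fools (Taiwanese saying)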
  25. Mass: https://github.com/KKBOX/mass
  26. Main features: • No Job Server to maintain; the masses (Mass) govern themselves. Based on Amazon SWF (Simple Workflow). No centralized server, no database, almost free • Describe the Workflow top-down, from the Job's point of view. Pythonic code snippets to represent a job • Worker types and counts can be adjusted dynamically. On-premise, cloud-based, or even mixed Mass workers are allowed • Pure Python solution (Python 3)
  27. Definitions: • Job, Task, and Action describe a job • A Worker executes Actions (diagram: Encoding Workflow (Job); Source check (Check) → Transcode (Task) with 1080p (SubTask) and 720p (SubTask) → Package (Android))
  28. Mass Worker (start from plain functions)

          def check(source):
              pass


          def transcode(source, resolution):
              pass


          def package(transcoded, protocol):
              pass

  29. Mass Worker (register each function under a role)

          from mass.scheduler.swf import SWFWorker
          worker = SWFWorker()

          @worker.role('check')
          def check(source):
              pass

          @worker.role('transcode')
          def transcode(source, resolution):
              pass

          @worker.role('package')
          def package(transcoded, protocol):
              pass

          worker.start()

  30. Mass Worker (per-role worker counts for each instance type)

          # Compute Optimized Instance
          worker.start({
              'check': 0,
              'transcode': 4,
              'package': 0
          })

          # General Purpose Instance
          worker.start({
              'check': 1,
              'transcode': 0,
              'package': 1
          })

  31. Mass Job

          from mass import Job, Task, Action, submit

          with Job(title='Encoding Workflow') as job:
              with Task(title='Check'):
                  Action(source='/path/to/source.mpg',
                         _role='check')

          # worker
          @worker.role('check')
          def check(source):
              pass

  32. Mass Job

          from mass import Job, Task, Action, submit

          with Job(title='Encoding Workflow') as job:
              with Task(title='Check'):
                  Action(source='/path/to/source.mpg', _role='check')
              with Task(title='Transcode', parallel=True):
                  with Task(title='Transcode 1080p'):
                      Action(source='/path/to/source.mpg',
                             resolution='1080p', _role='transcode')
                  with Task(title='Transcode 720p'):
                      Action(source='/path/to/source.mpg',
                             resolution='720p', _role='transcode')
              with Task(title='Package', parallel=True):
                  with Task(title='Package for Android'):
                      Action(transcoded='/path/to/transcoded.mpg',
                             protocol='dash', _role='package')
                  with Task(title='Package for iOS'):
                      Action(transcoded='/path/to/transcoded.mpg',
                             protocol='hls', _role='package')

          submit(job)

  33. Mass Job (the same listing as slide 32)
  34. Generic Worker

          import subprocess
          from mass.scheduler.swf import SWFWorker
          worker = SWFWorker()

          @worker.role('shell')
          def check(cmd):
              # check_call lives in subprocess, not os; it raises on a non-zero exit code
              subprocess.check_call(cmd, shell=True)

          worker.start()

  35. Generic Worker

          from mass import Job, Task, Action, submit

          with Job(title='Encoding Workflow') as job:
              with Task(title='Transcode', parallel=True):
                  with Task(title='Transcode 360p'):
                      Action(cmd='ffmpeg -i in.mp4 -s 640x360 out.mp4',
                             _role='shell')
                  with Task(title='Transcode 720p'):
                      Action(cmd='ffmpeg -i in.mp4 -s 1280x720 out.mp4',
                             _role='shell')

          submit(job)

  36. Error Handling

          from mass import LogHandler
          from slacker import Slacker

          # NOTE: the slide does not show where log_handler is defined;
          # presumably it is a LogHandler instance.
          @log_handler.logger('error')
          def log_to_slack(msg):
              slack = Slacker('<your-slack-api-token>')
              slack.chat.post_message('#general', msg)

  37. Examples: • About 50,000 hours of video have been encoded • 3,000 hours of video per day: about 100,000 Tasks with 800 workers
  38. Future work: • AWS Lambda support • Error handling by exact Exception type • Notifications when a Job/Task/Cmd starts or ends • Available resource capacity of Actions and Workers • Job Viewer
  39. Q & A
  40. Thank you!
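
The pairing between `@worker.role(...)` on the worker side and `_role=...` on the job side (slides 29 to 35) can be illustrated with a small in-process registry. This is only a sketch of the idea: `LocalWorker`, `execute`, `SIZES`, and the `out_<resolution>.mp4` output name are hypothetical and are not part of Mass, whose `SWFWorker` receives its work through Amazon SWF instead. The frame sizes are the ones used in the slides' ffmpeg commands.

```python
from typing import Callable, Dict


class LocalWorker:
    """In-process stand-in for a role-based worker (hypothetical, not Mass's SWFWorker)."""

    def __init__(self) -> None:
        self._roles: Dict[str, Callable[..., object]] = {}

    def role(self, name: str):
        """Decorator that registers a function as the handler for a role."""
        def register(func):
            self._roles[name] = func
            return func
        return register

    def execute(self, action: dict):
        """Run one action: '_role' selects the handler, the rest become keyword arguments."""
        params = dict(action)
        role = params.pop("_role")
        if role not in self._roles:
            raise LookupError(f"no handler registered for role {role!r}")
        return self._roles[role](**params)


worker = LocalWorker()

# Resolution name -> frame size, taken from the ffmpeg commands on the slides.
SIZES = {"360p": "640x360", "720p": "1280x720"}


@worker.role("transcode")
def transcode(source: str, resolution: str) -> list:
    # Build the ffmpeg argument list instead of running it, so the sketch has no side effects.
    # The output file name is an assumption made for this example.
    return ["ffmpeg", "-i", source, "-s", SIZES[resolution], f"out_{resolution}.mp4"]


command = worker.execute(
    {"source": "in.mp4", "resolution": "720p", "_role": "transcode"}
)
print(command)
```

Because the job only names a role and its parameters, any machine that has registered a handler for that role can pick up the action, which is what lets worker types and counts vary per instance.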
