Media Processing in the Cloud

Talk from AWS Media & Entertainment Summit (10/7/2011). Also at http://aws.amazon.com/mediaevent/

Published in: Technology, Business

Media Processing in the Cloud

  1. 1. Media Processing in the Cloud. Deepak Singh, Principal Product Manager
  2. 2. media today
  3. 3. lots of content
  4. 4. professional content Image: chapter.one
  5. 5. Image: Tonz
  6. 6. everything in between
  7. 7. streaming
  8. 8. devices
  9. 9. Image: pennstatelive
  10. 10. Image: NASA
  11. 11. higher resolutions
  12. 12. 3D
  13. 13. so what?
  14. 14. lots and lots and lots and lots and lots and lots and lots and lots and lots of data
  15. 15. lots of compute
  16. 16. how can the cloud help?
  17. 17. let us orchestrate a processing application
  18. 18. I want to process data, for example encode movies, … (diagram: Job Queue, EC2 Instance, Data Processing)
  19. 19. … and store the results in S3. (diagram: Job Queue, EC2 Instance, S3, Data Processing)
  20. 20. And I want to be notified on completion (diagram: Job Queue, EC2 Instance, Result e-mail, S3, Data Processing)
  21. 21. I like using SNS, because … (diagram: Job Queue, EC2 Instance, Topic, Result e-mail, S3, Data Processing)
  22. 22. I like using SNS, because … I can integrate other systems via SQS, or HTTP(S) web-hooks (diagram: Job Queue, EC2 Instance, Topic, Result e-mail, S3, Data Processing, SQS, HTTP)
  23. 23. I should be done ... right? (diagram: Job Queue, EC2 Instance, Topic, Result e-mail, S3, Data Processing)
  24. 24. Not quite – I’m also cost conscious … Don’t want to pay anything when there’s no work (diagram: Autoscaling Job Queue Charge empty $0 Topic Result e-mail S3 Data Processing)
  25. 25. Need a controller … (diagram: Capacity Control (on-demand launch) Autoscaling Job Queue Charge $0 Topic Result e-mail S3 Data Processing)
  26. 26. Measure … Detect work … (diagram: Capacity Control (on-demand launch) CW Alarm CW Metric #msgs Autoscaling Job Queue Charge $0 Topic Result e-mail S3 Data Processing)
  27. 27. When there’s a message queued … (diagram: Capacity Control (on-demand launch) CW Alarm Autoscaling Policy add instance CW Metric if #msg > 0 #msgs Autoscaling Job Queue Charge $0 Topic Result e-mail S3 Data Processing)
  28. 28. When there’s a message queued … start an EC2 instance (diagram: Capacity Control (on-demand launch) CW Alarm Autoscaling Policy add instance CW Metric if #msg > 0 #msgs Autoscaling Job Queue EC2 Instance Topic Result e-mail S3 Data Processing)
  29. 29. ... or 5000 EC2 instances (diagram: Capacity Control (on-demand launch) CW Alarm Autoscaling Policy keep adding instances CW Metric if #msg > 0 #msgs Autoscaling Job Queue 5000 EC2 Instances Topic Result e-mail S3 Data Processing)
  30. 30. When the work is done … (diagram: Capacity Control (on-demand launch) CW Alarm Autoscaling Policy add instance CW Metric if #msg > 0 #msgs Autoscaling Job Queue EC2 Instance Topic Result e-mail S3 Data Processing)
  31. 31. When the work is done … Terminate the EC2 instance (diagram: Capacity Control (on-demand launch) CW Alarm Autoscaling Policy add instance CW Metric if #msg > 0 #msgs Autoscaling Job Queue TERM’D Topic Result e-mail S3 Data Processing)
  32. 32. For full production use, add IAM Users, Permissions, Policies, Operational Alarm, etc. (diagram: Capacity Control (on-demand launch) CW Alarm Autoscaling Policy add instance CW Metric if #msg > 0 #msgs Autoscaling Job Queue EC2 Instance Topic Result e-mail S3 IAM User IAM User IAM IAM Permissions IAM Permissions IAM Permissions IAM Policy Policy Data Processing)
  33. 33. On Demand SQS Worker Example. No glue-scripting (or UI code) required … We let you focus on the business problem! (diagram: Capacity Control (on-demand launch) CW Alarm Autoscaling Policy add instance CW Metric if #msg > 0 #msgs Autoscaling Job Queue EC2 Instance Topic Result e-mail S3 Data Processing)
  34. 34. Nested templates: enable custom abstractions (diagram: Job Queue S3 Job Queue S3 Job Queue S3 Job Queue S3)
  35. 35. One substack per e.g. video format to encode. Parent template/stack, parameterized by e.g. format/resolution (diagram: Job: Render X, Topic; Job Queue, Encode for low-res, …, S3, S3; Job Queue, Encode for high-res, …, S3, S3; Job Queue, S3, S3; Job Queue, S3, S3)
  36. 36. 4
  37. 37. 1. Infrastructure
  38. 38. ec2-run-instances
  39. 39. on demand global secure
  40. 40. programmable
  41. 41. elastic
  42. 42. instance types
  43. 43. t1.micro, standard (m1), high memory (m2), high CPU (c1)
  44. 44. high performance
  45. 45. cluster computing
  46. 46. MPI
  47. 47. bandwidth intensive
  48. 48. Cluster Compute Instance
  49. 49. 2*Intel Xeon 5570, 23 GB RAM, 1.7 TB disk, HVM
  50. 50. 10 GigE
  51. 51. Placement Group
  52. 52. full bisection, Placement group
  53. 53. linpack
  54. 54. Cores 7040, Rmax 41.82, Rpeak 82.51
  55. 55. 231
  56. 56. 451
  57. 57. WIEN2K Parallel Performance. H size 56,000 (25GB); Runtime (16x8 processors): Local (Infiniband) 3h:48, Cloud (10Gbps) 1h:30 ($40). 1200 atom unit cell; SCALAPACK+MPI diagonalization, matrix size 50k-100k. Credit: K. Jorissen, F. D. Villa, and J. J. Rehr (U. Washington)
  58. 58. HPC is evolving
  59. 59. 2*Intel Xeon 5570, 22 GB RAM, 1.7 TB disk, HVM, 2*NVidia M2050
  60. 60. optimizing costs
  61. 61. on-demand
  62. 62. reserved
  63. 63. spot
  64. 64. 2. Orchestration
  65. 65. AWS CloudFormation
  66. 66. bootstrap
  67. 67. Cloud Init
  68. 68. chef/puppet
  69. 69. familiar tools
  70. 70. Oracle Grid Engine
  71. 71. LSF
  72. 72. Condor
  73. 73. combining worlds
  74. 74. MIT Starcluster
  75. 75. $ starcluster start mycluster
          $ starcluster listclusters
  76. 76. http://www.bioteam.net/2011/03/dude-you-got-some-chef-in-my-starcluster/
  77. 77. 30,472 cores
  78. 78. $1279/hr
  79. 79. Big Data
  80. 80. Amazon Elastic MapReduce
  81. 81. S3: Input data
  82. 82. S3: Input data, Code; Elastic MapReduce
  83. 83. S3: Input data, Code; Elastic MapReduce; Name node
  84. 84. S3: Input data, Code; Elastic MapReduce; Name node; Elastic cluster
  85. 85. S3: Input data, Code; Elastic MapReduce; Name node; Elastic cluster; HDFS
  86. 86. S3: Input data, Code; Elastic MapReduce; Name node; Elastic cluster; HDFS; Queries + BI via JDBC, Pig, Hive
  87. 87. S3: Input data, Code; Elastic MapReduce; Name node; Elastic cluster; HDFS; Queries + BI via JDBC, Pig, Hive; Output: S3 + SimpleDB
  88. 88. 3. Applications
  89. 89. the layer of innovation
  90. 90. NASA JPL
  91. 91. Netflix needed to transcode 17,000 titles (80TB of data) to support the launch of Sony PS3. They provisioned 1200 Amazon EC2 instances and completed the transcoding process in just days. Source: Adrian Cockcroft (Netflix)
  92. 92. Source: Adrian Cockcroft (Netflix)
  93. 93. http://vimeo.com/judpratt
  94. 94. “Our tests have shown more than 90 percent scaling efficiency on clusters of up to 128 GPUs each”
  95. 95. 4. People
  96. 96. constraints everywhere
  97. 97. Hardware: CPU, storage, memory; Data management: Collections, datasets, provenance; Software: parallelization, optimization; Availability: Backup, redundant, replicated; Cost: Small
  98. 98. where should we optimize?
  99. 99. Image: Pieter Musterd
  100. 100. removing constraints
  101. 101. undifferentiated heavy lifting
  102. 102. focus on value
  103. 103. faster
  104. 104. more
  105. 105. rendering, compositing, transcoding
  106. 106. creating art
  107. 107. Image: Chris Dagdigian
  108. 108. 4
  109. 109. 1. Infrastructure
  110. 110. 2. Orchestration
  111. 111. 3. Applications
  112. 112. 4. People
  113. 113. creating content
  114. 114. deesingh@amazon.com, Twitter: @mndoci, http://slideshare.net/mndoci, http://mndoci.github.com. Inspiration and ideas from Matt Wood & Larry Lessig. Credit: Oberazzi under a CC-BY-NC-SA license
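
The on-demand worker pattern walked through in slides 18 to 31 (job queue, workers, results store, completion topic, and a capacity controller that scales to zero) can be sketched in a few lines. This is only an illustrative in-memory simulation, not the deck's CloudFormation template and not AWS API code: `Topic`, `desired_capacity`, `run_pipeline`, and the `max_instances` cap are hypothetical names introduced here, and the queue, bucket, and topic are plain Python stand-ins for SQS, S3, and SNS.

```python
from collections import deque


class Topic:
    """In-memory stand-in for an SNS topic: fans each message out to all subscribers."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, message):
        for callback in self.subscribers:
            callback(message)


def desired_capacity(queued_messages, max_instances):
    """Scaling rule from the slides: run instances while #msgs > 0, otherwise none.

    Capping at the queue length and at max_instances is an assumption of this sketch.
    """
    if queued_messages <= 0:
        return 0
    return min(queued_messages, max_instances)


def run_pipeline(jobs, process, max_instances=5):
    """Drain the job queue, store results, notify on completion, then scale to zero."""
    queue = deque(jobs)      # stand-in for the SQS job queue
    bucket = {}              # stand-in for the S3 results bucket
    topic = Topic()
    notifications = []
    topic.subscribe(notifications.append)  # e.g. the result e-mail subscriber
    capacity_log = []        # instance count chosen at each control tick

    while True:
        instances = desired_capacity(len(queue), max_instances)
        capacity_log.append(instances)
        if instances == 0:
            break            # empty queue: nothing is running, so nothing is billed
        for _ in range(instances):  # each simulated instance handles one job per tick
            job = queue.popleft()
            bucket[job] = process(job)
            topic.publish(f"done: {job}")
    return bucket, notifications, capacity_log


if __name__ == "__main__":
    results, notes, capacity = run_pipeline(
        ["movie-a", "movie-b", "movie-c"], process=str.upper, max_instances=2
    )
    print(results)
    print(notes)
    print(capacity)
```

In this simulation the capacity log ends at zero once the queue is drained, which mirrors the "Charge $0" state on slide 24. In a real deployment the control loop would be replaced by the CloudWatch alarm and Auto Scaling policy shown in the diagrams.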
