Genie - Hadoop Platform as a Service at Netflix

Recently on our tech blog, we discussed the architecture of our petabyte-scale data warehouse in the cloud (http://nflx.it/XoySYR). Salient features include the use of Amazon's Simple Storage Service (S3) as our "source of truth", leveraging the elasticity of the cloud to run multiple dynamically resizable Hadoop clusters to support various workloads, and our implementation of a horizontally scalable Hadoop Platform as a Service called "Genie". In this presentation, we will focus on Genie, which provides job and resource management for the Hadoop ecosystem in the cloud, and is the core service that the various components of the enterprise ecosystem at Netflix use to integrate with Hadoop in the cloud. From the perspective of the end user, Genie abstracts away the physical details of various (potentially transient) Hadoop resources in the cloud, and provides RESTful APIs to submit and monitor Hadoop, Hive and Pig jobs without having to install any Hadoop clients. We will describe how Genie is used in production at Netflix to process hundreds of terabytes of data every day, running thousands of ETL (extract, transform, load) jobs, plus hundreds of ad-hoc jobs from our visualization tools and our web interface. Finally, we will discuss our plans for open sourcing Genie.

Published in: Technology, Business

  1. Genie – Hadoop Platform as a Service at Netflix. Sriram Krishnan, Hadoop Summit, June 26, 2013
  2. Netflix does Hadoop
  3. Netflix does Hadoop at scale
  4. Netflix does Hadoop at scale*
  5. Netflix does Hadoop at scale in the cloud
  6. S3 as the Cloud Data Warehouse
  7. Multiple Hadoop Clusters (diagram layers: Cloud Data Warehouse; Hadoop (EMR) Clusters)
  8. Data Platform as a Service (diagram layers: Cloud Data Warehouse; Hadoop (EMR) Clusters; Hadoop Platform as a Service; Job Execution; Resource Configuration & Management; Metadata Service (Franklin))
  9. Large Ecosystem of Clients & Tools (diagram layers: Cloud Data Warehouse; Hadoop (EMR) Clusters; Hadoop Platform as a Service; Job Execution; Resource Configuration & Management; Metadata Service (Franklin))
  10. Why Genie?
      • Simple API for job submission and management
      • Accessible from the data center and the cloud
      • Abstraction of physical details of back-end Hadoop clusters
  11. What Genie is Not
      • A workflow scheduler, such as Oozie
      • A task scheduler, such as fair share or capacity schedulers
      • An end-to-end resource management tool
  12. Genie: Job Execution
      • API to run Hadoop, Hive and Pig jobs
      • Auto-magic submission of jobs to the right Hadoop cluster
      • Abstracting away cluster details from clients
  13. Genie: Resource Configuration
      • API for management of cluster metadata
      • Status: up, out of service, or terminated
      • Site-specific Hadoop, Hive and Pig configurations
      • Cluster naming/tagging for job submissions
  14. Architecture diagram (Netflix OSS, http://netflix.github.com)
      • Components: Eureka Service, Eureka Client, Ribbon, Karyon, Archaius, Servo, Python API, Hadoop, Hive, Pig
      • Actors: End-users, Admins, Client
      • Arrow labels: Registers service; Discovers service; Invokes (submits job); Launches job; Launches cluster(s); Registers cluster
  15. Genie: Job Execution (REST call)
      • Job Type: {hadoop, hive, pig}
      • File dependencies (script, UDFs, etc.)
      • Command-line arguments
      • Schedule: {adhoc, sla}
      • Configuration: {prod, test, unittest}
  16. Genie: Job Execution
      • Response: job ID*
      • * Used to query status, get outputs, kill job
  17. Genie Job Details (screenshot labels: Job ID; Script to execute; Standard output and error; Pig logs; Job conf directory)
  18. Genie – Use Cases Enabled at Netflix
      • Running nightly short-lived "bonus" clusters to augment ETL processing
      • Re-routing traffic between clusters
      • "Red/black" pushes for clusters
      • Attaching stand-alone gateways to clusters
      • Running 100% of all SLA jobs, and a high percentage of ad-hoc jobs
  19. Nightly Short-lived Bonus Clusters
      • Execution Service, Configuration Service
      • Prod SLA Cluster: Schedule: sla; Configurations: prod
  20. Nightly Short-lived Bonus Clusters
      • Execution Service, Configuration Service
      • Bonus Cluster: Schedule: bonus; Configurations: prod
      • Prod SLA Cluster: Schedule: sla; Configurations: prod
      • {Schedule=bonus, Configuration=prod}
  21. Nightly Short-lived Bonus Clusters
      • Execution Service, Configuration Service
      • Bonus Cluster: Schedule: bonus; Configurations: prod; Status: OUT_OF_SERVICE
      • Prod SLA Cluster: Schedule: sla; Configurations: prod
      • {Schedule=sla, Configuration=prod}
  22. Nightly Short-lived Bonus Clusters
      • Execution Service, Configuration Service
      • Bonus Cluster: Schedule: bonus; Configurations: prod; Status: TERMINATED
      • Prod SLA Cluster: Schedule: sla; Configurations: prod
      • {Schedule=sla, Configuration=prod}
  23. Rerouting Traffic Between Clusters
      • Execution Service, Configuration Service
      • Ad-hoc Cluster: Schedule: adhoc; Configurations: prod, test
      • Prod SLA Cluster: Schedule: sla; Configurations: prod
      • {Schedule=sla, Configuration=prod}
  24. Rerouting Traffic Between Clusters
      • Execution Service, Configuration Service
      • Ad-hoc Cluster: Schedule: adhoc, sla; Configurations: prod, test
      • Prod SLA Cluster: Schedule: sla; Configurations: prod; Status: OUT_OF_SERVICE
      • {Schedule=sla, Configuration=prod}
  25. Rerouting Traffic Between Clusters
      • Execution Service, Configuration Service
      • Ad-hoc Cluster: Schedule: adhoc; Configurations: prod, test
      • Prod SLA Cluster: Schedule: sla; Configurations: prod; Status: UP
      • {Schedule=sla, Configuration=prod}
  26. "Red/Black" Pushes for Clusters
      • Execution Service, Configuration Service
      • Prod SLA Cluster: Schedule: sla; Configurations: prod; Status: UP
      • {Schedule=sla, Configuration=prod}
  27. "Red/Black" Pushes for Clusters
      • Execution Service, Configuration Service
      • Prod SLA Cluster: Schedule: sla; Configurations: prod; Status: OUT_OF_SERVICE
      • Prod SLA Cluster: Schedule: sla; Configurations: prod; Status: UP
      • {Schedule=sla, Configuration=prod}
  28. "Red/Black" Pushes for Clusters
      • Execution Service, Configuration Service
      • Prod SLA Cluster: Schedule: sla; Configurations: prod; Status: TERMINATED
      • Prod SLA Cluster: Schedule: sla; Configurations: prod; Status: UP
      • {Schedule=sla, Configuration=prod}
  29. Genie Usage at Netflix
      • Usage statistics brought to you by "Sherlock"
      • Pig job to gather Hadoop job statistics
      • Tableau-based visualization
  30. Genie Deployment in the Cloud
      • Asgard is also part of Netflix OSS
      • https://github.com/Netflix/asgard
  31. Auto Scaling in the Cloud
  32. Genie is now part of Netflix OSS!
      • http://techblog.netflix.com/2013/06/genie-is-out-of-bottle.html
      • Clone it on GitHub at: https://github.com/Netflix/genie
      • Still "version 0" – work in progress!
      • All contributions and feedback welcome!
      • Come talk to us and check out live demos at the Netflix Booth
  33. Watching Pigs Fly with the Netflix Hadoop Toolkit
  34. Thank you! We're hiring!
      • Sriram Krishnan
      • Home: http://www.netflix.com
      • Jobs: http://jobs.netflix.com
      • Tech Blog: http://techblog.netflix.com/
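The rerouting slides (23 to 25) suggest that a job is matched to a cluster by its schedule and configuration tags, and that only clusters whose status is UP receive work. The Python below is a minimal sketch of that idea and is not Genie's actual implementation or API: the `Cluster` class, the `select_cluster` function, the cluster names, and the first-match rule are all assumptions made for illustration. The slides do not say how Genie chooses when several clusters match.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Cluster:
    """Hypothetical cluster record holding the metadata shown on the slides."""
    name: str
    schedules: Set[str]
    configurations: Set[str]
    status: str = "UP"  # one of: UP, OUT_OF_SERVICE, TERMINATED


def select_cluster(clusters: List[Cluster], schedule: str,
                   configuration: str) -> Optional[Cluster]:
    """Return the first UP cluster tagged with both values, or None.

    Picking the first match is an assumption of this sketch.
    """
    for cluster in clusters:
        if (cluster.status == "UP"
                and schedule in cluster.schedules
                and configuration in cluster.configurations):
            return cluster
    return None


# Slide 23: SLA jobs go to the Prod SLA cluster.
prod_sla = Cluster("prod-sla", {"sla"}, {"prod"})
adhoc = Cluster("adhoc", {"adhoc"}, {"prod", "test"})
registry = [prod_sla, adhoc]
before = select_cluster(registry, "sla", "prod").name

# Slide 24: Prod SLA is taken out of service and the ad-hoc
# cluster is additionally tagged with the "sla" schedule.
prod_sla.status = "OUT_OF_SERVICE"
adhoc.schedules.add("sla")
during = select_cluster(registry, "sla", "prod").name

# Slide 25: Prod SLA is back UP and the extra tag is removed.
prod_sla.status = "UP"
adhoc.schedules.discard("sla")
after = select_cluster(registry, "sla", "prod").name

print(before, during, after)
```

In this sketch the client request never changes: it always asks for `{Schedule=sla, Configuration=prod}`, and only the cluster metadata is edited. That mirrors the abstraction the slides describe, where clients do not need to know which physical cluster runs their job.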