May 2012 HUG: Oozie: Towards a scalable Workflow Management System for Hadoop



During the past three years, Oozie has become the de facto workflow scheduling system for Hadoop. Oozie has proven itself as a scalable, secure, and multi-tenant service. Oozie stably processes more than 45% of the jobs run across more than 25 Hadoop clusters at Yahoo. At the same time, adoption in other enterprises has increased substantially since Oozie was contributed to the Apache community. We attribute these achievements to design decisions described in a paper that was selected to be presented at a workshop during the ACM/SIGMOD conference. This presentation covers the key architectural design choices described in the paper. Operational metrics will be used to illustrate production experience at Yahoo, and we will also include a quick tutorial.



  1. Oozie: Towards a Scalable Workflow Management System for Hadoop
     Mohammad Islam and Virag Kothari
  2. Accepted Paper
     • Workshop in ACM/SIGMOD, May 2012.
     • It is a team effort!
       Mohammad Islam, Angelo Huang, Mohamed Battisha, Michelle Chiang, Santhosh Srinivasan, Craig Peters, Andreas Neumann, Alejandro Abdelnur
  3. Presentation Workflow
     (Workflow diagram with nodes: Oozie Tutorial, Design Decisions, Results, Questions?, Address Question, END)
  4. Installing Oozie
     Step 1: Download the Oozie tarball
       curl -O
     Step 2: Unpack the tarball
       tar -xzvf <PATH_TO_OOZIE_TAR>
     Step 3: Run the setup script
       bin/ -hadoop 0.20.200 ${HADOOP_HOME} -extjs /tmp/ext-2.2.zip
     Step 4: Start Oozie
       bin/oozie-start.sh
     Step 5: Check status of Oozie
       bin/oozie admin -oozie http://localhost:11000/oozie -status
  5. Running an Example
     • Standalone Map-Reduce job
       $ hadoop jar /usr/joe/hadoop-examples.jar org.myorg.wordcount inputDir outputDir
     • Using Oozie
       Example DAG: Start → MapReduce wordcount; on OK → End; on ERROR → Kill
       Workflow.xml:
         <workflow-app name=..>
           <start..>
           <action>
             <map-reduce>
             ……
             ……
         </workflow-app>
  6. Example Workflow
     <action name='wordcount'>
       <map-reduce>
         <configuration>
           <property>
             <name>mapred.mapper.class</name>
             <value>org.myorg.WordCount.Map</value>
           </property>
           <property>
             <name>mapred.reducer.class</name>
             <value>org.myorg.WordCount.Reduce</value>
           </property>
           <property>
             <name>mapred.input.dir</name>
             <value>/usr/joe/inputDir</value>
           </property>
           <property>
             <name>mapred.output.dir</name>
             <value>/usr/joe/outputDir</value>
           </property>
         </configuration>
       </map-reduce>
     </action>
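The configuration on this slide is a flat list of name/value properties nested under `<map-reduce>`. As a rough sketch of that structure (not part of the original deck), the Python below generates the same element tree with the standard library. The helper name `build_wordcount_action` is hypothetical, and the input path is written as absolute here, which is an assumption.

```python
import xml.etree.ElementTree as ET


def build_wordcount_action(props):
    """Build an <action> element wrapping a <map-reduce> configuration.

    `props` maps Hadoop property names to values; each entry becomes one
    <property><name>..</name><value>..</value></property> element.
    """
    action = ET.Element("action", {"name": "wordcount"})
    map_reduce = ET.SubElement(action, "map-reduce")
    configuration = ET.SubElement(map_reduce, "configuration")
    for name, value in props.items():
        prop = ET.SubElement(configuration, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return action


# Property names and classes are taken from the slide; paths assumed absolute.
PROPS = {
    "mapred.mapper.class": "org.myorg.WordCount.Map",
    "mapred.reducer.class": "org.myorg.WordCount.Reduce",
    "mapred.input.dir": "/usr/joe/inputDir",
    "mapred.output.dir": "/usr/joe/outputDir",
}

action = build_wordcount_action(PROPS)
print(ET.tostring(action, encoding="unicode"))
```

A complete workflow would also need the surrounding start, end, and kill nodes shown in the DAG on the previous slide; this sketch covers only the action fragment.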
  7. A Workflow Application
     Three components required for a Workflow:
     1) Workflow.xml: Contains job definition
     2) Libraries: optional ‘lib/’ directory contains .jar/.so files
     3) Properties file:
        • Parameterization of Workflow xml
        • Mandatory property is
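To illustrate what "parameterization of Workflow xml" means in practice, here is a small Python sketch that reads key=value pairs and substitutes them into `${...}` placeholders. It uses Python's `string.Template` as a stand-in and is not Oozie's own expression evaluator; the parameter names `inputDir` and `outputDir` and the properties text are hypothetical.

```python
from string import Template


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


# Hypothetical workflow fragment with two placeholders.
WORKFLOW_TEMPLATE = Template(
    "<property><name>mapred.input.dir</name><value>${inputDir}</value></property>\n"
    "<property><name>mapred.output.dir</name><value>${outputDir}</value></property>"
)

# Hypothetical properties file content.
PROPERTIES_TEXT = """
# values supplied at submission time
inputDir=/usr/joe/inputDir
outputDir=/usr/joe/outputDir
"""

props = parse_properties(PROPERTIES_TEXT)
rendered = WORKFLOW_TEMPLATE.substitute(props)
print(rendered)
```

The point of the split is that the same workflow definition can be reused with different inputs by changing only the properties file.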
  8. Workflow Submission
     Run Workflow Job
       $ oozie job -run http://localhost:11000/oozie/
       Workflow ID: 00123-123456-oozie-wrkf-W
     Check Workflow Job Status
       $ oozie job -info 00123-123456-oozie-wrkf-W -oozie http://localhost:11000/oozie/
       -----------------------------------------------------------------------
       Workflow Name: test-wf
       App Path: hdfs://localhost:11000/user/your_id/oozie/
       Workflow job status [RUNNING]
       ...
       ------------------------------------------------------------------------
  9. Key Features and Design Decisions
     • Multi-tenant
     • Security
       – Authenticate every request
       – Pass appropriate token to Hadoop job
     • Scalability
       – Vertical: Add extra memory/disk
       – Horizontal: Add machines
  10. Oozie Job Processing
      (Diagram: End user, Oozie Server, and Hadoop job; labels: Oozie Security, Hadoop Access Secure, Kerberos)
  11. Oozie-Hadoop Security
      (Same diagram: End user, Oozie Server, and Hadoop job; labels: Oozie Security, Hadoop Access Secure, Kerberos)
  12. Oozie-Hadoop Security
      • Oozie is a multi-tenant system
      • Job can be scheduled to run later
      • Oozie submits/maintains the Hadoop jobs
      • Hadoop needs a security token for each request
      Question: Who should provide the security token to Hadoop, and how?
  13. Oozie-Hadoop Security Contd.
      • Answer: Oozie
      • How?
        – Hadoop considers Oozie as a super-user
        – Hadoop does not check the end-user credential
        – Hadoop only checks the credential of the Oozie process
      • BUT the Hadoop job is executed as the end-user.
      • Oozie utilizes the doAs() functionality of Hadoop.
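The super-user arrangement described on this slide can be pictured with a toy model: the cluster checks only the identity of the submitting process, while the job itself carries the end user's identity. The Python below is a conceptual sketch only. It does not call Hadoop, and `SUPER_USERS`, `JobContext`, and `submit_as` are hypothetical names invented for illustration.

```python
from dataclasses import dataclass

# Hypothetical allow-list of processes trusted to act on behalf of others.
SUPER_USERS = {"oozie"}


@dataclass(frozen=True)
class JobContext:
    authenticated_as: str  # identity the cluster actually verified
    run_as: str            # identity the job executes under


def submit_as(process_user, end_user):
    """Accept a job from a trusted process and run it as the end user.

    Only `process_user` is checked; `end_user` is taken on trust, which is
    why the trusted process must verify its own callers first.
    """
    if process_user not in SUPER_USERS:
        raise PermissionError(f"{process_user} may not impersonate {end_user}")
    return JobContext(authenticated_as=process_user, run_as=end_user)


ctx = submit_as("oozie", "joe")
print(ctx)
```

Because the end user's identity is never checked by the cluster in this model, the burden of authenticating users falls on the trusted process, which is the topic of the slides that follow.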
  14. User-Oozie Security
      (Diagram: End user, Oozie Server, and Hadoop job; labels: Oozie Security, Hadoop Access Secure, Kerberos)
  15. Why Oozie Security?
      • One user should not modify another user’s job
      • Hadoop doesn’t authenticate the end-user
      • Oozie has to verify its user before passing the job to Hadoop
  16. How does Oozie Support Security?
      • Built-in authentication
        – Kerberos
        – Non-secured (default)
      • Design Decision
        – Pluggable authentication
        – Easy to include new type of authentication
        – Yahoo supports 3 types of authentication.
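One common way to make authentication pluggable is a small interface plus a registry keyed by authentication type, so a new scheme can be added without touching the request-handling code. The Python sketch below shows that general pattern under stated assumptions. It is not Oozie's actual plug-in interface, and every class, function, and request field name here is hypothetical.

```python
from abc import ABC, abstractmethod


class Authenticator(ABC):
    """Interface every authentication plug-in implements."""

    @abstractmethod
    def authenticate(self, request):
        """Return the verified user name, or None if verification fails."""


class SimpleAuthenticator(Authenticator):
    """Non-secured mode: trust the user name supplied in the request."""

    def authenticate(self, request):
        return request.get("user.name")


class TokenAuthenticator(Authenticator):
    """Example of an added plug-in: look the user up by a shared token."""

    def __init__(self, tokens):
        self._tokens = dict(tokens)

    def authenticate(self, request):
        return self._tokens.get(request.get("token"))


# Registry maps an authentication type to a factory producing a plug-in.
REGISTRY = {"simple": SimpleAuthenticator}


def register(name, factory):
    REGISTRY[name] = factory


def authenticate(auth_type, request):
    user = REGISTRY[auth_type]().authenticate(request)
    if user is None:
        raise PermissionError("authentication failed")
    return user


# Adding a new type is one registration call; callers are unchanged.
register("token", lambda: TokenAuthenticator({"s3cr3t": "joe"}))
print(authenticate("simple", {"user.name": "joe"}))
print(authenticate("token", {"token": "s3cr3t"}))
```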
  17. Job Submission to Hadoop
      • Oozie is designed to handle thousands of jobs at the same time
      • Question: Should the Oozie server
        – Submit the Hadoop job directly?
        – Wait for it to finish?
      • Answer: No
  18. Job Submission Contd.
      • Reason
        – Resource constraints: A single Oozie process can’t simultaneously create thousands of threads, one for each Hadoop job. (Scaling limitation)
        – Isolation: Running user code on the Oozie server might destabilize Oozie
      • Design Decision
        – Create a launcher Hadoop job
        – Execute the actual user job from the launcher.
        – Wait asynchronously for the job to finish.
  19. Job Submission to Hadoop
      (Diagram: Oozie Server outside the Hadoop Cluster; inside the cluster, the Job Tracker, the Launcher Mapper, and the Actual M/R Job; interactions numbered 1 to 5)
  20. Job Submission Contd.
      • Advantages
        – Horizontal scalability: If load increases, add machines to the Hadoop cluster
        – Stability: Isolation of user code and system process
      • Disadvantages
        – An extra map-slot is occupied by each job.
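The launcher design from the preceding slides can be sketched with a thread pool standing in for the Hadoop cluster: the server hands a launcher to the pool, returns immediately, and learns the outcome through a callback rather than a blocked thread. This is a simplified illustration, not Oozie's implementation. `ToyServer`, `launcher`, and the status strings are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def launcher(user_job):
    """Runs on the 'cluster' and executes the user's code away from the server."""
    return user_job()


class ToyServer:
    """Tracks job status without dedicating a server thread to each job."""

    def __init__(self, cluster):
        self._cluster = cluster
        self._lock = threading.Lock()
        self.status = {}

    def submit(self, job_id, user_job):
        with self._lock:
            self.status[job_id] = "RUNNING"
        # Hand the launcher to the cluster and return without waiting.
        future = self._cluster.submit(launcher, user_job)
        future.add_done_callback(lambda done: self._on_done(job_id, done))
        return future

    def _on_done(self, job_id, future):
        # Invoked when the job finishes; a user-code failure stays contained.
        outcome = "KILLED" if future.exception() else "SUCCEEDED"
        with self._lock:
            self.status[job_id] = outcome


# The pool plays the role of the Hadoop cluster in this sketch.
with ThreadPoolExecutor(max_workers=2) as cluster:
    server = ToyServer(cluster)
    server.submit("wf-1", lambda: 42)
    server.submit("wf-2", lambda: 1 / 0)  # failing user code

print(server.status)
```

In this model the cost noted on the slide shows up as the pool worker that each launcher occupies while the user job runs.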
  21. Production Setup
      • Total number of nodes: 42K+
      • Total number of clusters: 25+
      • Total number of processed jobs ≈ 750K/month
      • Data presented from two clusters
      • Each of them has nearly 4K nodes
      • Total number of users per cluster = 50
  22. Oozie Usage Pattern @ Y!
      (Bar chart: Distribution of Job Types on Production Clusters; x-axis: job type (fs, java, map-reduce, pig); y-axis: percentage; series: #1 Cluster, #2 Cluster)
      • Pig and Java are the most popular
      • Pure Map-Reduce jobs are fewer
  23. Experimental Setup
      • Number of nodes: 7
      • Number of map-slots: 28
      • 4 Core, RAM: 16 GB
      • 64 bit RHEL
      • Oozie Server
        – 3 GB RAM
        – Internal queue size = 10 K
        – # Worker threads = 300
  24. Job Acceptance
      (Chart: Workflow Acceptance Rate; x-axis: number of submission threads, 2 to 640; y-axis: workflows accepted per minute)
      Observation: Oozie can accept a large number of jobs
  25. Time Line of an Oozie Job
      (Timeline: User submits job, Oozie submits to Hadoop, Job completes at Hadoop, Job completes at Oozie. Preparation Overhead spans the first two events; Completion Overhead spans the last two.)
      Total Oozie Overhead = Preparation + Completion
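The overhead definition on this slide reduces to two timestamp differences. The short Python sketch below works through it; the `Timeline` class is hypothetical and the timestamps are invented for illustration, not measurements from the talk.

```python
from dataclasses import dataclass


@dataclass
class Timeline:
    """Four event times for one job, in milliseconds from submission."""

    user_submits: int
    oozie_submits_to_hadoop: int
    hadoop_completes: int
    oozie_completes: int

    def preparation(self):
        # Time between the user's submission and the hand-off to Hadoop.
        return self.oozie_submits_to_hadoop - self.user_submits

    def completion(self):
        # Time between Hadoop finishing and Oozie marking the job complete.
        return self.oozie_completes - self.hadoop_completes

    def total_overhead(self):
        return self.preparation() + self.completion()


# Made-up example values.
example = Timeline(
    user_submits=0,
    oozie_submits_to_hadoop=400,
    hadoop_completes=30_400,
    oozie_completes=30_900,
)
print(example.preparation(), example.completion(), example.total_overhead())
```

Note that the time Hadoop spends running the job (30,000 ms in this made-up example) is excluded from the overhead by definition.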
  26. Oozie Overhead
      (Chart: Per Action Overhead; x-axis: number of actions per workflow (1, 5, 10, 50); y-axis: overhead in milliseconds)
      Observation: Oozie overhead is less when multiple actions are in the same workflow.
  27. Oozie Futures
      • Scalability
        – Hot-Hot/Load balancing service
        – Replace SQL DB with Zookeeper
      • Improved Usability
      • Extend the benchmarking scope
      • Monitoring WS API
  28. Take Away
      • Oozie is
        – Easier to use
        – Scalable
        – Secure and multi-tenant
  29. Q&A
      Mohammad K Islam, Virag Kothari
      kamrul@yahoo-