Oozie @ Riot Games

Enterprise workflows in Hadoop using Oozie @ Riot Games. Simple use cases and lessons learned from our platform growth.


  1. 1. RIOT GAMES SOME CATCHY STATEMENT ABOUT WORKFLOWS AND YORDLES MATT GOEKE
  2. 2. INTRODUCTION 1 2 3 4 5 6 7
  3. 3. INTRO1 2 3 4 5 6 7 ABOUT THE SPEAKER
  4. 4. •  Previous workflow architecture •  What Oozie is •  How we incorporated Oozie – Relational Data Pipeline – Non-relational Data Pipeline •  Lessons learned •  Where we’re headed THIS PRESENTATION IS ABOUT…1 2 3 4 5 6 7 INTRO
  5. 5. •  Developer and publisher of League of Legends •  Founded 2006 by gamers for gamers •  Player experience focused – Needless to say, data is pretty important to understanding the player experience! WHO is RIOT GAMES?1 2 3 4 5 6 7 INTRO
  6. 6. 1 2 3 4 5 6 7 INTRO LEAGUE OF LEGENDS
  7. 7. ARCHITECTURE 1 2 3 4 5 6 7
  8. 8. ClientMobile WWW 1 2 3 4 5 6 7 Architecture HIGH LEVEL OVERVIEW
  10. 10. 1 2 3 4 5 6 7 Architecture WHY WORKFLOWS? •  Retry a series of jobs in the event of failure •  Execute jobs at a specific time or when data is available •  Correctly order job execution based on resolved dependencies •  Provide a common framework for communication and execution of production processes •  Use the workflow to couple resources instead of having a monolithic code base
  11. 11. 1 2 3 4 5 6 7 Architecture PREVIOUS ARCHITECTURE Tableau Hive Data Warehouse CRON + Pentaho + Custom ETL + Sqoop MySQLPentaho Analysts EUROPE Audit Plat LoL KOREA Audit Plat LoL NORTH AMERICA Audit Plat LoL Business Analyst
  12. 12. 1 2 3 4 5 6 7 Architecture ISSUES WITH PREVIOUS PROCESS •  All of the ETL processes were run on one node which limited concurrency •  If our main runner execution died then the whole ETL for that day would need to be restarted •  No reporting of what was run or the configuration of the ETL without log diving on the actual machine •  No retries (outside of native MR tasks) and no good way to rerun a previous config if the underlying code has been changed
  13. 13. 1 2 3 4 5 6 7 Architecture PREVIOUS ARCHITECTURE Tableau Hive Data Warehouse CRON + Pentaho + Custom ETL + Sqoop MySQLPentaho Analysts EUROPE Audit Plat LoL KOREA Audit Plat LoL NORTH AMERICA Audit Plat LoL Business Analyst
  14. 14. 1 2 3 4 5 6 7 Architecture SOLUTION Tableau Hive Data Warehouse Oozie MySQLPentaho Analysts EUROPE Audit Plat LoL KOREA Audit Plat LoL NORTH AMERICA Audit Plat LoL Business Analyst
  15. 15. OOZIE 1 2 3 4 5 6 7
  16. 16. Oozie 1 2 3 4 5 6 7 WHAT IS OOZIE? •  Oozie is a workflow scheduler system to manage Apache Hadoop jobs •  Oozie is integrated with the rest of the Hadoop stack supporting several types of Hadoop jobs out of the box as well as system specific jobs •  Oozie is a scalable, reliable and extensible system
  17. 17. Oozie 1 2 3 4 5 6 7 WHY OOZIE? No need to create custom hooks for job submission NATIVE HADOOP INTEGRATION Jobs are spread across available mappers HORIZONTALLY SCALABLE The project has strong community backing and has committers from several companies OPEN SOURCE Logging and debugging is extremely quick with the web console and SQL VERBOSE REPORTING
  18. 18. Oozie 1 2 3 4 5 6 7 HADOOP ECOSYSTEM
  19. 19. Oozie 1 2 3 4 5 6 7 HADOOP ECOSYSTEM HDFS
  20. 20. Oozie 1 2 3 4 5 6 7 HADOOP ECOSYSTEM MAPREDUCE HDFS
  21. 21. Oozie 1 2 3 4 5 6 7 HADOOP ECOSYSTEM PIG SQOOP HIVE MAPREDUCE HDFS JAVA
  22. 22. Oozie 1 2 3 4 5 6 7 HADOOP ECOSYSTEM OOZIE PIG SQOOP HIVE MAPREDUCE HDFS JAVA
  23. 23. 1 2 3 4 5 6 7 Oozie LAYERS OF OOZIE Action (1..N) Workflow Coordinator (1..N) Bundle Bundle Coord Action WF Job MR / Pig / Java / Hive / Sqoop
  25. 25. Oozie 1 2 3 4 5 6 7 WORKFLOW ACTION: JAVA <action name="java-node"> <java> <job-tracker>foo:9001</job-tracker> <name-node>bar:9000</name-node> <main-class>com.riotgames.MyMainClass</main-class> <java-opts>-Dfoo</java-opts> <arg>bar</arg> </java> <ok to="next"/> <error to="error"/> </action> •  Workflow actions are the most granular unit of work
  26. 26. Oozie 1 2 3 4 5 6 7 WORKFLOW ACTION: JAVA <action name="java-node"> <java> <job-tracker>foo:9001</job-tracker> <name-node>bar:9000</name-node> <main-class>com.riotgames.MyMainClass</main-class> <java-opts>-Dfoo</java-opts> <arg>bar</arg> </java> <ok to="next"/> <error to="error"/> </action> 1 java-node 1 •  Workflow actions are the most granular unit of work
  27. 27. Oozie 1 2 3 4 5 6 7 WORKFLOW ACTION: JAVA <action name="java-node"> <java> <job-tracker>foo:9001</job-tracker> <name-node>bar:9000</name-node> <main-class>com.riotgames.MyMainClass</main-class> <java-opts>-Dfoo</java-opts> <arg>bar</arg> </java> <ok to="next"/> <error to="error"/> </action> 1 2 next java-node 1 2 •  Workflow actions are the most granular unit of work
  28. 28. Oozie 1 2 3 4 5 6 7 WORKFLOW ACTION: JAVA <action name="java-node"> <java> <job-tracker>foo:9001</job-tracker> <name-node>bar:9000</name-node> <main-class>com.riotgames.MyMainClass</main-class> <java-opts>-Dfoo</java-opts> <arg>bar</arg> </java> <ok to="next"/> <error to="error"/> </action> 1 2 3 next java-node error Error 1 2 3 •  Workflow actions are the most granular unit of work
  29. 29. Oozie 1 2 3 4 5 6 7 WORKFLOW ACTION: MAPREDUCE <action name="myfirstHadoopJob"> <map-reduce> <job-tracker>foo:9001</job-tracker> <name-node>bar:9000</name-node> <prepare> <delete path="hdfs://foo:9000/usr/foo/output-data"/> </prepare> <job-xml>/myfirstjob.xml</job-xml> <configuration> <property> <name>mapred.input.dir</name> <value>/usr/foo/input-data</value> </property> <property> <name>mapred.output.dir</name> <value>/usr/foo/output-data</value> </property> <property> <name>mapred.reduce.tasks</name> <value>${firstJobReducers}</value> </property> </configuration> </map-reduce> <ok to="myNextAction"/> <error to="errorCleanup"/> </action>
  30. 30. Oozie 1 2 3 4 5 6 7 WORKFLOW ACTION: MAPREDUCE <action name="myfirstHadoopJob"> <map-reduce> <job-tracker>foo:9001</job-tracker> <name-node>bar:9000</name-node> <prepare> <delete path="hdfs://foo:9000/usr/foo/output-data"/> </prepare> <job-xml>/myfirstjob.xml</job-xml> <configuration> <property> <name>mapred.input.dir</name> <value>/usr/foo/input-data</value> </property> <property> <name>mapred.output.dir</name> <value>/usr/foo/output-data</value> </property> <property> <name>mapred.reduce.tasks</name> <value>${firstJobReducers}</value> </property> </configuration> </map-reduce> <ok to="myNextAction"/> <error to="errorCleanup"/> </action> •  Each action has a type and each type has a defined set of key:values that can be used to configure it
  31. 31. Oozie 1 2 3 4 5 6 7 WORKFLOW ACTION: MAPREDUCE <action name="myfirstHadoopJob"> <map-reduce> <job-tracker>foo:9001</job-tracker> <name-node>bar:9000</name-node> <prepare> <delete path="hdfs://foo:9000/usr/foo/output-data"/> </prepare> <job-xml>/myfirstjob.xml</job-xml> <configuration> <property> <name>mapred.input.dir</name> <value>/usr/foo/input-data</value> </property> <property> <name>mapred.output.dir</name> <value>/usr/foo/output-data</value> </property> <property> <name>mapred.reduce.tasks</name> <value>${firstJobReducers}</value> </property> </configuration> </map-reduce> <ok to="myNextAction"/> <error to="errorCleanup"/> </action> •  Each action has a type and each type has a defined set of key:values that can be used to configure it The action must also specify which action to transition to on success or failure
  32. 32. 1 2 3 4 5 6 7 Oozie LAYERS OF OOZIE Action (1..N) Workflow Coordinator (1..N) Bundle Bundle Coord Action WF Job MR / Pig / Java / Hive / Sqoop
  33. 33. 1 2 3 4 5 6 7 Oozie THE WORKFLOW ENGINE Start End fork join MapReduce Java Sqoop Hive HDFS Shell decision •  Oozie runs workflows in the form of DAGs (directed acyclic graphs) •  Each element in this workflow is an action •  Some node types are processed internally by Oozie vs. farmed out to the cluster
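To make the fork/join pattern in this diagram concrete, here is a minimal workflow sketch in the same XML style as the other slides; the node names and the choice of a MapReduce plus a Hive branch are illustrative, not taken from the deck:

    <workflow-app name="fork-join-wf" xmlns="uri:oozie:workflow:0.1">
      <start to="forking"/>
      <!-- fork starts both branches in parallel; join waits for all of them -->
      <fork name="forking">
        <path start="mr-node"/>
        <path start="hive-node"/>
      </fork>
      <action name="mr-node"> ... <ok to="joining"/> <error to="fail"/> </action>
      <action name="hive-node"> ... <ok to="joining"/> <error to="fail"/> </action>
      <join name="joining" to="end"/>
      <kill name="fail"> <message>A parallel branch failed</message> </kill>
      <end name="end"/>
    </workflow-app>

The control nodes (start, end, kill, fork, join, decision) are the ones resolved inside the Oozie server; only the actions are submitted to the cluster.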
  34. 34. 1 2 3 4 5 6 7 Oozie WORKFLOW EXAMPLE <workflow-app name="sample-wf" xmlns="uri:oozie:workflow:0.1"> <start to="java-node"/> <action name="java-node"> ... </action> <end name="end"/> <kill name="fail"/> </workflow-app> •  This workflow will run the action defined as java-node
  35. 35. 1 2 3 4 5 6 7 Oozie WORKFLOW EXAMPLE <workflow-app name="sample-wf" xmlns="uri:oozie:workflow:0.1"> <start to="java-node"/> <action name="java-node"> ... </action> <end name="end"/> <kill name="fail"/> </workflow-app> start java-node •  This workflow will run the action defined as java-node 1 1
  36. 36. 1 2 3 4 5 6 7 Oozie WORKFLOW EXAMPLE <workflow-app name="sample-wf" xmlns="uri:oozie:workflow:0.1"> <start to="java-node"/> <action name="java-node"> ... </action> <end name="end"/> <kill name="fail"/> </workflow-app> start end java-node •  This workflow will run the action defined as java-node 1 2 1 2
  37. 37. 1 2 3 4 5 6 7 Oozie WORKFLOW EXAMPLE <workflow-app name="sample-wf" xmlns="uri:oozie:workflow:0.1"> <start to="java-node"/> <action name="java-node"> ... </action> <end name="end"/> <kill name="fail"/> </workflow-app> start end java-node fail Error •  This workflow will run the action defined as java-node 1 2 3 1 2 3
  38. 38. 1 2 3 4 5 6 7 Oozie LAYERS OF OOZIE Action (1..N) Workflow Coordinator (1..N) Bundle Bundle Coord Action WF Job MR / Pig / Java / Hive / Sqoop
  39. 39. 1 2 3 4 5 6 7 Oozie COORDINATOR •  Oozie coordinators can execute workflows based on time and data dependencies •  Each coordinator specifies a workflow to execute once its trigger criteria are met •  Coordinators can pass variables to the workflow layer allowing for dynamic resolution Client Oozie Coordinator Oozie Workflow Oozie Server Hadoop
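The slides that follow show a pure time trigger, but the data-dependency side mentioned above is expressed with datasets and input-events. A hedged sketch, with the dataset name, URI template and initial instance invented for illustration:

    <coordinator-app name="data-trigger-coord" frequency="${coord:days(1)}"
        start="${COORD_START}" end="${COORD_END}" timezone="UTC"
        xmlns="uri:oozie:coordinator:0.1">
      <datasets>
        <dataset name="logs" frequency="${coord:days(1)}"
            initial-instance="2013-01-01T00:00Z" timezone="UTC">
          <uri-template>hdfs://bar:9000/data/logs/${YEAR}/${MONTH}/${DAY}</uri-template>
        </dataset>
      </datasets>
      <!-- the workflow only materializes once the current day's directory exists -->
      <input-events>
        <data-in name="input" dataset="logs">
          <instance>${coord:current(0)}</instance>
        </data-in>
      </input-events>
      <action>
        <workflow>
          <app-path>hdfs://bar:9000/user/hadoop/oozie/app/test_job</app-path>
          <configuration>
            <!-- passing a resolved variable down to the workflow layer -->
            <property>
              <name>inputDir</name>
              <value>${coord:dataIn('input')}</value>
            </property>
          </configuration>
        </workflow>
      </action>
    </coordinator-app>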
  40. 40. 1 2 3 4 5 6 7 Oozie EXAMPLE COORDINATOR <?xml version="1.0" ?><coordinator-app end="${COORD_END}" frequency="${coord:hours(1)}" name="test_job_coord" start="${COORD_START}" timezone="UTC" xmlns="uri:oozie:coordinator:0.1"> <action> <workflow> <app-path>hdfs://bar:9000/user/hadoop/oozie/app/test_job</app-path> </workflow> </action> </coordinator-app> •  This coordinator will run every hour and invoke the workflow found in the /test_job folder
  43. 43. 1 2 3 4 5 6 7 Oozie LAYERS OF OOZIE Action (1..N) Workflow Coordinator (1..N) Bundle Bundle Coord Action WF Job MR / Pig / Java / Hive / Sqoop
  44. 44. Oozie 1 2 3 4 5 6 7 BUNDLE Client Oozie Coordinator Oozie Workflow Oozie Server Hadoop Oozie Coordinator Oozie Workflow Oozie Bundle •  Bundles are higher-level abstractions that batch a set of coordinators together •  There is no explicit dependency between coordinators within a bundle, but a bundle can be used to more formally define a data pipeline
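A minimal bundle sketch that groups two coordinators; the names and app paths are illustrative:

    <bundle-app name="pipeline-bundle" xmlns="uri:oozie:bundle:0.1">
      <coordinator name="extract-coord">
        <app-path>hdfs://bar:9000/user/hadoop/oozie/app/extract</app-path>
      </coordinator>
      <coordinator name="transform-coord">
        <app-path>hdfs://bar:9000/user/hadoop/oozie/app/transform</app-path>
      </coordinator>
    </bundle-app>

Starting, suspending or killing the bundle then applies that operation to every coordinator inside it.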
  45. 45. 1 2 3 4 5 6 7 Oozie THE INTERFACE Multiple ways to interact with Oozie: •  Web Console (read only) •  CLI •  Java client •  Web Service Endpoints •  Directly with the DB using SQL The Java client and CLI are just abstractions over the web service endpoints, so it is easy to extend this functionality in your own apps.
  46. 46. 1 2 3 4 5 6 7 Oozie PIECES OF A DEPLOYABLE The list of components that are needed for a scheduled workflow: •  Coordinator.xml Contains the scheduler definition and path to workflow.xml •  Workflow.xml Contains the job definition •  Libraries Optional jar files •  Properties file (also possible through WS call) Initial parameterization and mandatory specification of coordinator path
  47. 47. 1 2 3 4 5 6 7 Oozie JOB.PROPERTIES NAME_NODE=hdfs://foo:9000 JOB_TRACKER=bar:9001 oozie.libpath=${NAME_NODE}/user/hadoop/oozie/share/lib oozie.coord.application.path=${NAME_NODE}/user/hadoop/oozie/app/test_job Important note: •  Any variable put into the job.properties will be inherited by the coordinator / workflow •  E.g. Given the key:value workflow_name=test_job you can access it using ${workflow_name}
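For example, given workflow_name=test_job in job.properties, the workflow definition itself can be parameterized with that variable; a tiny sketch:

    <!-- ${workflow_name} resolves to "test_job" at submission time -->
    <workflow-app name="${workflow_name}" xmlns="uri:oozie:workflow:0.1">
      ...
    </workflow-app>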
  48. 48. 1 2 3 4 5 6 7 Oozie COORDINATOR SUBMISSION •  Deploy the workflow and coordinator to HDFS $ hadoop fs -put test_job oozie/app/ •  Submit and run the coordinator job $ oozie job -run -config job.properties •  Check the coordinator status on the web console
  49. 49. 1 2 3 4 5 6 7 Oozie WEB CONSOLE
  50. 50. 1 2 3 4 5 6 7 Oozie WEB CONSOLE: COORDINATORS
  51. 51. 1 2 3 4 5 6 7 Oozie WEB CONSOLE: COORDINATOR DETAILS
  52. 52. WEB CONSOLE: JOB DETAILS1 2 3 4 5 6 7 Oozie
  53. 53. WEB CONSOLE: JOB DAG1 2 3 4 5 6 7 Oozie
  54. 54. WEB CONSOLE: JOB DETAILS1 2 3 4 5 6 7 Oozie
  55. 55. WEB CONSOLE: ACTION DETAILS1 2 3 4 5 6 7 Oozie
  56. 56. JOB TRACKER1 2 3 4 5 6 7 Oozie
  57. 57. A USE CASE: HOURLY JOBS1 2 3 4 5 6 7 Oozie Replace a current CRON job that runs a bash script once a day (6): •  The shell will execute a Java main which pulls data from a filestream (1), dumps it to HDFS and then runs a MapReduce job on the files (2). It will then email a person when the report is done (3). •  It should start within X amount of time (4) •  It should complete within Y amount of time (5) •  It should retry Z times on failure (automatic)
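On requirement Z (retries): later Oozie workflow schemas (0.5 and onward) let you declare retries per action via retry-max and retry-interval attributes; older versions, likely including the one behind this deck, handle retries at the server level. A hedged sketch of the attribute form:

    <!-- retry-interval is in minutes; both attributes sit on the action element -->
    <action name="java-node" retry-max="3" retry-interval="10">
      ...
    </action>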
  58. 58. WORKFLOW.XML1 2 3 4 5 6 7 Oozie <workflow-app name="filestream_wf" xmlns="uri:oozie:workflow:0.1"> <start to="java-node"/> <action name="java-node"> <java> <job-tracker>foo:9001</job-tracker> <name-node>bar:9000</name-node> <main-class>org.foo.bar.PullFileStream</main-class> <arg>argument1</arg> </java> <ok to="mr-node"/> <error to="fail"/> </action> <action name="mr-node"> <map-reduce> <job-tracker>foo:9001</job-tracker> <name-node>bar:9000</name-node> <configuration> ... </configuration> </map-reduce> <ok to="email-node"/> <error to="fail"/> </action> ... ... <action name="email-node"> <email xmlns="uri:oozie:email-action:0.1"> <to>customer@foo.bar</to> <cc>employee@foo.bar</cc> <subject>Email notification</subject> <body>The wf completed</body> </email> <ok to="myotherjob"/> <error to="errorcleanup"/> </action> <end name="end"/> <kill name="fail"/> </workflow-app>
  59. 59. WORKFLOW.XML1 2 3 4 5 6 7 Oozie <workflow-app name="filestream_wf" xmlns="uri:oozie:workflow:0.1"> <start to="java-node"/> <action name="java-node"> <java> <job-tracker>foo:9001</job-tracker> <name-node>bar:9000</name-node> <main-class>org.foo.bar.PullFileStream</main-class> <arg>argument1</arg> </java> <ok to="mr-node"/> <error to="fail"/> </action> <action name="mr-node"> <map-reduce> <job-tracker>foo:9001</job-tracker> <name-node>bar:9000</name-node> <configuration> ... </configuration> </map-reduce> <ok to="email-node"/> <error to="fail"/> </action> ... ... <action name="email-node"> <email xmlns="uri:oozie:email-action:0.1"> <to>customer@foo.bar</to> <cc>employee@foo.bar</cc> <subject>Email notification</subject> <body>The wf completed</body> </email> <ok to="myotherjob"/> <error to="errorcleanup"/> </action> <end name="end"/> <kill name="fail"/> </workflow-app> 1
  60. 60. WORKFLOW.XML1 2 3 4 5 6 7 Oozie <workflow-app name="filestream_wf" xmlns="uri:oozie:workflow:0.1"> <start to="java-node"/> <action name="java-node"> <java> <job-tracker>foo:9001</job-tracker> <name-node>bar:9000</name-node> <main-class>org.foo.bar.PullFileStream</main-class> <arg>argument1</arg> </java> <ok to="mr-node"/> <error to="fail"/> </action> <action name="mr-node"> <map-reduce> <job-tracker>foo:9001</job-tracker> <name-node>bar:9000</name-node> <configuration> ... </configuration> </map-reduce> <ok to="email-node"/> <error to="fail"/> </action> ... ... <action name="email-node"> <email xmlns="uri:oozie:email-action:0.1"> <to>customer@foo.bar</to> <cc>employee@foo.bar</cc> <subject>Email notification</subject> <body>The wf completed</body> </email> <ok to="myotherjob"/> <error to="errorcleanup"/> </action> <end name="end"/> <kill name="fail"/> </workflow-app> 1 2
  61. 61. WORKFLOW.XML1 2 3 4 5 6 7 Oozie <workflow-app name="filestream_wf" xmlns="uri:oozie:workflow:0.1"> <start to="java-node"/> <action name="java-node"> <java> <job-tracker>foo:9001</job-tracker> <name-node>bar:9000</name-node> <main-class>org.foo.bar.PullFileStream</main-class> <arg>argument1</arg> </java> <ok to="mr-node"/> <error to="fail"/> </action> <action name="mr-node"> <map-reduce> <job-tracker>foo:9001</job-tracker> <name-node>bar:9000</name-node> <configuration> ... </configuration> </map-reduce> <ok to="email-node"/> <error to="fail"/> </action> ... ... <action name="email-node"> <email xmlns="uri:oozie:email-action:0.1"> <to>customer@foo.bar</to> <cc>employee@foo.bar</cc> <subject>Email notification</subject> <body>The wf completed</body> </email> <ok to="myotherjob"/> <error to="errorcleanup"/> </action> <end name="end"/> <kill name="fail"/> </workflow-app> 1 2 3
  62. 62. COORDINATOR.XML1 2 3 4 5 6 7 Oozie <?xml version="1.0" ?><coordinator-app end="${COORD_END}" frequency="${coord:days(1)}" name="daily_job_coord" start="${COORD_START}" timezone="UTC" xmlns="uri:oozie:coordinator:0.1" xmlns:sla="uri:oozie:sla:0.1"> <action> <workflow> <app-path>hdfs://bar:9000/user/hadoop/oozie/app/test_job</app-path> </workflow> <sla:info> <sla:nominal-time>${coord:nominalTime()}</sla:nominal-time> <sla:should-start>${X * MINUTES}</sla:should-start> <sla:should-end>${Y * MINUTES}</sla:should-end> <sla:alert-contact>foo@bar.com</sla:alert-contact> </sla:info> </action> </coordinator-app>
  63. 63. COORDINATOR.XML1 2 3 4 5 6 7 Oozie <?xml version="1.0" ?><coordinator-app end="${COORD_END}" frequency="${coord:days(1)}" name="daily_job_coord" start="${COORD_START}" timezone="UTC" xmlns="uri:oozie:coordinator:0.1" xmlns:sla="uri:oozie:sla:0.1"> <action> <workflow> <app-path>hdfs://bar:9000/user/hadoop/oozie/app/test_job</app-path> </workflow> <sla:info> <sla:nominal-time>${coord:nominalTime()}</sla:nominal-time> <sla:should-start>${X * MINUTES}</sla:should-start> <sla:should-end>${Y * MINUTES}</sla:should-end> <sla:alert-contact>foo@bar.com</sla:alert-contact> </sla:info> </action> </coordinator-app> 4,5
  64. 64. COORDINATOR.XML1 2 3 4 5 6 7 Oozie <?xml version="1.0" ?><coordinator-app end="${COORD_END}" frequency="${coord:days(1)}" name="daily_job_coord" start="${COORD_START}" timezone="UTC" xmlns="uri:oozie:coordinator:0.1" xmlns:sla="uri:oozie:sla:0.1"> <action> <workflow> <app-path>hdfs://bar:9000/user/hadoop/oozie/app/test_job</app-path> </workflow> <sla:info> <sla:nominal-time>${coord:nominalTime()}</sla:nominal-time> <sla:should-start>${X * MINUTES}</sla:should-start> <sla:should-end>${Y * MINUTES}</sla:should-end> <sla:alert-contact>foo@bar.com</sla:alert-contact> </sla:info> </action> </coordinator-app> 6 4,5
  65. 65. WORKFLOWS @ 1 2 3 4 5 6 7
  66. 66. Use Case 1 1 2 3 4 5 6 7 USE CASE 1 – Global Data Means Global Data Problems
  67. 67. Use Case 1 WORKFLOWS: RELATIONAL1 2 3 4 5 6 7 Tableau Hive Data Warehouse Oozie MySQLPentaho Analysts EUROPE Audit Plat LoL KOREA Audit Plat LoL NORTH AMERICA Audit Plat LoL Business Analyst
  69. 69. Use Case 1 WORKFLOWS: RELATIONAL1 2 3 4 5 6 7 Hive Final Tables provide more descriptive column naming and native type conversions REGION X Audit Plat LoL Hive Staging Transform Temp Tables map 1:1 with DB table meta Extract Oozie Actions
  70. 70. Use Case 1 WORKFLOWS: RELATIONAL1 2 3 4 5 6 7 Hive Final Tables provide more descriptive column naming and native type conversions REGION X Audit Plat LoL Hive Staging Transform Temp Tables map 1:1 with DB table meta Extract Oozie Actions 1. [Java] Check the partitions for the table and pull the latest date found. Write the key:value pair for the latest date back out to a properties file so that it can be referenced by the rest of the workflow.
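Step 1 is most naturally built on the Java action's capture-output mechanism: the main class writes a Properties file to the path Oozie passes in the oozie.action.output.properties system property, and later nodes read it back through EL. A hedged sketch; the main class name and the latest_date key are assumptions, not from the deck:

    <action name="initialize-node">
      <java>
        <job-tracker>foo:9001</job-tracker>
        <name-node>bar:9000</name-node>
        <!-- hypothetical main class that writes latest_date=YYYY-MM-DD to the
             properties file named by oozie.action.output.properties -->
        <main-class>com.riotgames.CheckPartitions</main-class>
        <capture-output/>
      </java>
      <ok to="sqoop-node"/>
      <error to="fail"/>
    </action>
    <!-- downstream nodes can then use ${wf:actionData('initialize-node')['latest_date']} -->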
  71. 71. Use Case 1 WORKFLOWS: RELATIONAL1 2 3 4 5 6 7 Hive Final Tables provide more descriptive column naming and native type conversions REGION X Audit Plat LoL Hive Staging Transform Temp Tables map 1:1 with DB table meta Extract Oozie Actions 2. [Sqoop] If the table is flagged as dynamically partitioned, pull data from the table from the latest partition (referencing the output of the Java node) through today's date. If not, pull data just for the current date.
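A hedged sketch of what such a Sqoop action could look like; the connect string, table and column names are invented, and the <arg> form is used because Oozie's <command> element splits naively on whitespace:

    <action name="sqoop-node">
      <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>foo:9001</job-tracker>
        <name-node>bar:9000</name-node>
        <arg>import</arg>
        <arg>--connect</arg>
        <arg>jdbc:mysql://region-db/plat</arg>
        <arg>--table</arg>
        <arg>match_history</arg>
        <!-- pull only rows newer than the latest partition found by the Java node;
             ${latest_date} is assumed to be propagated from that node's output -->
        <arg>--where</arg>
        <arg>updated_at >= '${latest_date}'</arg>
        <arg>--target-dir</arg>
        <arg>/user/hadoop/staging/match_history</arg>
      </sqoop>
      <ok to="hive-node"/>
      <error to="fail"/>
    </action>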
  72. 72. Use Case 1 WORKFLOWS: RELATIONAL1 2 3 4 5 6 7 Hive Final Tables provide more descriptive column naming and native type conversions REGION X Audit Plat LoL Hive Staging Transform Temp Tables map 1:1 with DB table meta Extract Oozie Actions 3. [Hive] Copy the updated partitions of the table from the staging DB to the prod DB while also performing column name and type conversions.
  73. 73. Use Case 1 WORKFLOWS: RELATIONAL1 2 3 4 5 6 7 Hive Final Tables provide more descriptive column naming and native type conversions REGION X Audit Plat LoL Hive Staging Transform Temp Tables map 1:1 with DB table meta Extract Oozie Actions 4. [Java] Grab row counts for both source and Hive across the dates pulled. Write this, as well as some other meta, out to an audit DB for reporting. Validation
  74. 74. Use Case 1 AUDITING1 2 3 4 5 6 7 •  We have a Tableau report pointing at the output audit data for a rapid high level view of the health of our ETLs
  75. 75. Use Case 1 SINGLE TABLE ACTION FLOW1 2 3 4 5 6 7 Initialize-node Sqoop-node Oozie-node Extraction actions
  76. 76. Use Case 1 SINGLE TABLE ACTION FLOW1 2 3 4 5 6 7 End Initialize-node Hive-node Audit-node Sqoop-node Oozie-node Start •  This action flow is done once per table Extraction actions Transform workflow
  77. 77. Use Case 1 SINGLE TABLE ACTION FLOW1 2 3 4 5 6 7 End Initialize-node Hive-node Audit-node Sqoop-node Oozie-node Start •  This action flow is done once per table Extraction actions Transform workflow •  The Oozie action allows us to asynchronously run the Hive staging->prod action and the auditing action. It is a Java action which uses the Oozie Java client and submits key:value pairs to another workflow.
  78. 78. Use Case 1 FULL SCHEMA WORKFLOW1 2 3 4 5 6 7 End Start
  79. 79. Table 1 Extraction actions Use Case 1 FULL SCHEMA WORKFLOW1 2 3 4 5 6 7 End Start Table 1 Transform workflow
  80. 80. Table 1 Extraction actions Use Case 1 FULL SCHEMA WORKFLOW1 2 3 4 5 6 7 End Start Table 1 Transform workflow Table 2 Extraction actions Table 2 Transform workflow
  81. 81. Table 1 Extraction actions Use Case 1 FULL SCHEMA WORKFLOW1 2 3 4 5 6 7 End Start Table 1 Transform workflow Table 2 Extraction actions Table 2 Transform workflow Table N Extraction actions Table N Transform workflow •  We have one of these workflows per schema •  Different schemas have a different number of tables (e.g., ranging from 5 to 20) •  We could fork and do each of these table extractions in parallel but we are trying to limit the I/O load we create on the sources
  82. 82. Use Case 1 COORDINATORS1 2 3 4 5 6 7 Schema 1 Workflow Schema 1 Coordinator •  We have one coordinator per schema workflow •  Currently coordinators are staged in groups based on schema type. Schema 2 Workflow Schema 2 Coordinator Schema N Workflow Schema N Coordinator
  83. 83. •  20+ Regions •  5+ DBs per region •  5-20 Tables per DB 20 * 5 * 12(avg) = ~1200 tables! Use Case 1 IMPORTANT NUMBERS1 2 3 4 5 6 7
  85. 85. •  Not if you have a good deployment pipeline! Use Case 1 TOO UNWIELDY?1 2 3 4 5 6 7
  86. 86. Use Case 1 DEPLOYMENT STACK1 2 3 4 5 6 7
  87. 87. Use Case 1 DEPLOYMENT STACK: JAVA1 2 3 4 5 6 7 •  The java project compiles into the library that is used by the workflows •  It also contains some custom functionality for interacting with the Oozie WS endpoints / Oozie DB Tables
  88. 88. Use Case 1 DEPLOYMENT STACK: PYTHON1 2 3 4 5 6 7 •  The Python project dynamically generates all of our workflow/coordinator XML files. It has multiple YML configs which hold the meta associated with all of the tables. It also interacts with a DB table for the various DB connection meta.
  89. 89. Use Case 1 DEPLOYMENT STACK: GITHUB1 2 3 4 5 6 7 •  GitHub houses all of the Big Data group’s code bases no matter the language.
  90. 90. Use Case 1 DEPLOYMENT STACK: JENKINS1 2 3 4 5 6 7 •  Jenkins polls GitHub and builds either set of artifacts (Java lib / tar containing workflows/coordinators) whenever it detects changes. It deploys the build artifacts to a simple mount point.
  91. 91. Use Case 1 DEPLOYMENT STACK: CHEF1 2 3 4 5 6 7 •  The Chef cookbook will check for the version declared for both sets of artifacts and grab them from the mount point. It runs a shell script which deploys the extracted workflows/coordinators and mounts the jar lib file.
  92. 92. •  20+ Regions •  5+ DBs per region •  5-20 Tables per DB 20 * 5 * 12(avg) = ~1200 tables! Use Case 1 IMPORTANT NUMBERS1 2 3 4 5 6 7
  95. 95. •  20+ Regions •  5+ DBs per region •  5-20 Tables per DB 20 * 5 * 12(avg) = ~1200 tables per day! 1 person < 5 hours a week! Use Case 1 IMPORTANT NUMBERS1 2 3 4 5 6 7
  96. 96. Use Case 2 USE CASE 2 – Dashboarding Cloud Data1 2 3 4 5 6 7
  97. 97. Use Case 2 WORKFLOWS: NON-RELATIONAL1 2 3 4 5 6 7 DashboardHive Data Warehouse Honu Analysts Business Analyst Client Mobile WWW Self Service App (Workflow and Meta)
  99. 99. WORKFLOWS: NON-RELATIONAL1 2 3 4 5 6 7 External Queue Amazon SQS is a message queue we use for asynchronous communication HONU SOURCE TABLES Audit Plat LoL Honu Derived Message Derived Tables are filtered datasets joined from 1 or more sources Transform Oozie ActionsUse Case 2
  100. 100. WORKFLOWS: NON-RELATIONAL1 2 3 4 5 6 7 External Queue Amazon SQS is a message queue we use for asynchronous communication HONU SOURCE TABLES Audit Plat LoL Honu Derived Message Derived Tables are filtered datasets joined from 1 or more sources Transform Oozie Actions 1. [Java] Check that the required partitions for the derived query exist and contain data. Send a message to an SNS endpoint if a partition exists but contains no rows. Use Case 2
  101. 101. WORKFLOWS: NON-RELATIONAL1 2 3 4 5 6 7 External Queue Amazon SQS is a message queue we use for asynchronous communication HONU SOURCE TABLES Audit Plat LoL Honu Derived Message Derived Tables are filtered datasets joined from 1 or more sources Transform Oozie Actions 2. [Hive] Perform the table transformation query on the selected partition(s). This query can filter any subset of source columns and join any number of source tables. Use Case 2
  102. 102. WORKFLOWS: NON-RELATIONAL1 2 3 4 5 6 7 External Queue Amazon SQS is a message queue we use for asynchronous communication HONU SOURCE TABLES Audit Plat LoL Honu Derived Message Derived Tables are filtered datasets joined from 1 or more sources Transform Oozie Actions 3. [Java] Send an SQS message to an external queue based on the consumer type. Consumers will pull from these queues regularly and update the various dashboard artifacts. Use Case 2
  103. 103. WORKFLOWS: NON-RELATIONAL1 2 3 4 5 6 7 •  End result is that our dashboards get updated either hourly or daily depending on the workflow Use Case 2
  104. 104. LESSONS 1 2 3 4 5 6 7
  105. 105. LESSONS LESSON #1 Distros and Versioning •  If you choose to go with a distro for your Hadoop stack, be extremely vigilant about upgrading to the latest versions whenever possible. You will receive a lot more community support and far fewer headaches if you are not running into bugs that were patched in trunk over a year ago! 1 2 3 4 5 6 7
  106. 106. LESSONS LESSON #2 Solidify Deployment •  The usefulness of Oozie can degrade as complexity creeps into your pipeline. If you do not work towards an automated deployment pipeline at the early stages of your development, you will quickly find maintenance costs rising significantly over time. 1 2 3 4 5 6 7
  107. 107. LESSONS LESSON #3 Extend Capabilities •  Don’t feel limited to using tools based on the supplied APIs. Feel free to implement harnesses that extend capabilities and submit them back to the community – we will welcome it with open arms :) 1 2 3 4 5 6 7
  108. 108. LESSONS LESSON #4 Ask for Help! •  Oozie is an open source project and is gaining new members/organizations every day. Don’t spend multiple hours trying to solve an issue that many of us have already worked through. •  There is also a large amount of documentation in both the wikis AND the archived listserv responses – leverage them both! 1 2 3 4 5 6 7
  109. 109. THE FUTURE 1 2 3 4 5 6 7
  110. 110. 1 2 3 4 5 6 7 CONTINUE INCREASING VELOCITY THE FUTURE June 2012 → July 2013: MySQL tables: 180 → 1200 • Pipeline events/day: 0 → 7+ billion • Workflows: CRON + Pentaho → Oozie • Environment: datacenter → DC + AWS • SLA: 1 day → 2 hours • Event tracking: 2+ weeks (DB update), dependencies on DBA + ETL + Tools teams, downtime (3h min.) → 10 minutes, self-service, no downtime
  111. 111. OUR IMMEDIATE GOALS1 2 3 4 5 6 7 THE FUTURE •  Improve Self-service workflow & tooling •  Realtime event aggregation •  Global Data Infrastructure •  Replace legacy audit/event logging services
  112. 112. CHALLENGE: MAKE IT GLOBAL •  Data centers across the globe since latency has a huge effect on gameplay → log data scattered around the world •  Large presence in Asia -- some areas (e.g., PH) have bandwidth challenges or expensive bandwidth 1 2 3 4 5 6 7 THE FUTURE
  113. 113. CHALLENGE: WE HAVE BIG DATA +  chat logs +  detailed gameplay event tracking +  so on…. 1 2 3 4 5 6 7 500G DAILY STRUCTURED DATA > 7PB GAME EVENT DATA 3MM SUBSCRIBERS 448+ MM VIEWS RIOT YOUTUBE CHANNEL THE FUTURE
  114. 114. OUR AUDACIOUS GOALS Have deep, real-time understanding of our systems from player experience and operational standpoints 1 2 3 4 5 6 7 Have ability to identify, understand and react to meaningful trends in real time Build a world-class data and analytics organization •  Deeply understand players across the globe •  Apply that understanding to improve games for players •  Deeply understand our entire ecosystem, including social media THE FUTURE
  115. 115. SHAMELESS HIRING PLUG1 2 3 4 5 6 7 THE FUTURE Like most everybody else at this conference… we’re hiring! PLAYER EXPERIENCE FIRST CHALLENGE CONVENTION FOCUS ON TALENT AND TEAM TAKE PLAY SERIOUSLY STAY HUNGRY, STAY HUMBLE THE RIOT MANIFESTO
  116. 116. SHAMELESS HIRING PLUG1 2 3 4 5 6 7 And yes, you can play games at work. It’s encouraged! THE FUTURE
  117. 117. MATT GOEKE mgoeke@riotgames.com THANK YOU! QUESTIONS?
