Workflow on Hadoop Using Oozie (Hadoop Summit 2010)


Yahoo Developer Network

Hadoop Summit 2010, Developers Track: Workflow on Hadoop Using Oozie. Alejandro Abdelnur, Yahoo!


Yahoo! Workflow Engine for Hadoop

Session Agenda

Oozie 1: Workflow

Users Experience

Some Numbers [Chart: labels Map-Red, Pig, File System, Java, Sub-Workflow; values 23%, 30%, 19%, 18%, 4%]

The First Year …

… The First Year …

… The First Year …

… The First Year

RULE for Oozie Workflows

Oozie 2: Coordinator [Diagram: Coordinator app, f, IN, Workflow, OUT]

Use Cases: Data Pipelines [Diagram: WS, LOG, PH1, PH2; frequencies f (5min) and f (60min); LOG and PH1 instances at 1:05, 1:10, 1:15, 2:00; PH2 instance at 2:00; timelines 01JAN to 31DEC]

Coordinator Applications

Coordinator Input and Output Data [Diagram: frequencies f i (5min), f j (60min), f o (60min); PH1 instances at 1:05, 1:10, 1:15, 2:00; PH2 instance at 2:00; timeline 01JAN to 31DEC; instance expressions ${current(-11)}, ${current(-10)}, ${current(-9)}, ${current(0)}; IN, Workflow, OUT]
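
The `${current(n)}` expressions in the diagram above select dataset instances relative to a coordinator action's nominal time. The arithmetic can be sketched in Python as follows. This is a minimal sketch assuming a fixed-frequency dataset: the function name `current`, the year 2010, and the midnight initial instance are illustrative assumptions that do not come from the slides, and the sketch ignores time zones and daylight saving.

```python
from datetime import datetime, timedelta


def current(nominal, initial_instance, frequency, n):
    """Return the n-th dataset instance relative to the latest
    instance at or before the nominal time (n = 0 is that instance,
    negative n looks back in time)."""
    # Number of whole dataset periods elapsed since the first instance.
    index = (nominal - initial_instance) // frequency
    return initial_instance + (index + n) * frequency


# Hypothetical setup: a 5-minute input dataset starting at midnight,
# consumed by a coordinator action with a nominal time of 2:00.
initial = datetime(2010, 1, 1, 0, 0)
input_frequency = timedelta(minutes=5)
nominal_time = datetime(2010, 1, 1, 2, 0)

# current(-11) through current(0): the twelve 5-minute instances
# that make up the hour ending at the nominal time.
inputs = [current(nominal_time, initial, input_frequency, n)
          for n in range(-11, 1)]

for instance in inputs:
    print(instance.strftime("%H:%M"))
```

With these assumed values the twelve selected instances run from 1:05 to 2:00, which matches the instance times shown in the diagram.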

Daylight Saving is Evil

What is Next?

Getting Oozie

Questions?
