Oozie Summit 2011
Presented at Hadoop Summit

Transcript

  • 1. Oozie: Scheduling Workflows on the Grid
    Mohammad K Islam
    kamrul@yahoo-inc.com

  • 2. Agenda
    • Oozie Overview
    • Oozie 3.x features:
      – Bundle
      – Scalability
      – Usability
    • Challenges
    • Future Plan
    • Q&A


  • 3. Overview: Workflow
    • Oozie executes workflows defined as a DAG of jobs.
    • Supported job types include Map-Reduce, Pipes, Streaming, Pig, custom Java code, etc.
    • Introduced in Oozie 1.x.
    [Diagram: example workflow DAG with start, fork/join, M/R streaming, M/R, Pig, FS, and Java action nodes, a MORE/ENOUGH decision node, and end]
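A workflow DAG like the one in the diagram is written in Oozie's workflow XML. The sketch below is a minimal illustration and not taken from the talk; the application name, node names, and the fork/join shape are assumptions:

```xml
<!-- Minimal workflow sketch: fork into an M/R and a Pig action, then join.
     All names, paths, and parameters here are illustrative. -->
<workflow-app name="demo-wf" xmlns="uri:oozie:workflow:0.1">
    <start to="my-fork"/>
    <fork name="my-fork">
        <path start="mr-node"/>
        <path start="pig-node"/>
    </fork>
    <action name="mr-node">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.input.dir</name>
                    <value>${input}</value>
                </property>
            </configuration>
        </map-reduce>
        <ok to="my-join"/>
        <error to="fail"/>
    </action>
    <action name="pig-node">
        <pig>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <script>process.pig</script>
        </pig>
        <ok to="my-join"/>
        <error to="fail"/>
    </action>
    <join name="my-join" to="end"/>
    <kill name="fail">
        <message>Workflow failed</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

Control nodes (start, fork, join, decision, kill, end) define the DAG shape; action nodes carry the actual Hadoop job definitions.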

  • 4. Overview: Coordinator
    • Oozie executes workflows based on:
      – Time dependency (frequency)
      – Data dependency
    • Introduced in Oozie 2.x.
    [Diagram: an Oozie client calls the Oozie server via the WS API; the Oozie coordinator checks data availability and triggers Oozie workflows on Hadoop]
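A coordinator combining both a time trigger (frequency) and a data dependency can be sketched as follows; the application name, dataset, date range, and paths are illustrative assumptions, not from the slides:

```xml
<!-- Coordinator sketch: run a workflow daily, but only once the day's
     input dataset exists. Names, times, and paths are illustrative. -->
<coordinator-app name="demo-coord" frequency="${coord:days(1)}"
                 start="2011-06-01T00:00Z" end="2011-12-31T00:00Z"
                 timezone="UTC" xmlns="uri:oozie:coordinator:0.1">
    <datasets>
        <dataset name="logs" frequency="${coord:days(1)}"
                 initial-instance="2011-06-01T00:00Z" timezone="UTC">
            <uri-template>hdfs://bar.corp:9000/data/logs/${YEAR}/${MONTH}/${DAY}</uri-template>
        </dataset>
    </datasets>
    <input-events>
        <data-in name="input" dataset="logs">
            <instance>${coord:current(0)}</instance>
        </data-in>
    </input-events>
    <action>
        <workflow>
            <app-path>hdfs://bar.corp:9000/usr/tucu/wordcount-wf</app-path>
        </workflow>
    </action>
</coordinator-app>
```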

  • 5. Oozie 3.x: Bundle
    • Users can define and execute a bunch of coordinator applications.
    • Users can start/stop/suspend/resume/rerun at the bundle level.
    • Benefits: easy to maintain and control large data-pipeline applications for the Service Engineering team.
    [Diagram: an Oozie client calls the Oozie server via the WS API; a bundle groups coordinators, which check data availability and trigger Oozie workflows on Hadoop]
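A bundle definition is essentially a list of coordinator applications to be controlled as a unit. The sketch below is illustrative; the bundle name, coordinator names, and paths are assumptions:

```xml
<!-- Bundle sketch: two coordinators grouped so they can be started,
     stopped, suspended, resumed, or rerun together. Paths illustrative. -->
<bundle-app name="demo-bundle" xmlns="uri:oozie:bundle:0.1">
    <coordinator name="coord-ingest">
        <app-path>hdfs://bar.corp:9000/apps/coord-ingest</app-path>
    </coordinator>
    <coordinator name="coord-report">
        <app-path>hdfs://bar.corp:9000/apps/coord-report</app-path>
    </coordinator>
</bundle-app>
```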

  • 6. Oozie Abstraction Layers
    • Layer 1: Bundle
    • Layer 2: Coordinator jobs (Coord Job 1, Coord Job 2)
    • Layer 3: Coordinator actions, each running a workflow job composed of Pig, M/R, and FS jobs
  • 7. Enhanced Stability and Scalability
    • Issue:
      – At very high load, Oozie becomes slow.
      – Accounts for 90% of total Oozie support incidents.
    • Reason:
      – Lots of active but non-progressing jobs.
      – Oozie's internal queue is full.
    • Resolution:
      – Throttle the number of active jobs per coordinator.
      – Put the job into a timeout state.
      – Enforce uniqueness of Oozie queue elements.


  • 8. Improved Usability
    • Issue:
      – Coordinator job status is not intuitive and causes confusion for Oozie users.
    • Reason:
      – Status SUCCEEDED doesn't mean the job is successful!!
      – Status PREMATER is for Oozie internal use only, but it was exposed to users.
    • Resolution:
      – Redesign coordinator status.

  • 9. Coordinator Status Redesign
    • Current: SUSPENDED, KILLED, PREP, PREMATER, RUNNING, SUCCEEDED, FAILED
    • New: SUSPENDED, KILLED, SUCCEEDED, PREP, RUNNING, DONE_WITH_ERROR, PAUSED, FAILED

  • 10. The Second Year ...
    • Number of releases:
      – Feature releases: 3
      – Patches: 9
    • Backward compatibility is strongly maintained.
    • No need to resubmit jobs if Oozie is restarted.
    • Code overhaul:
      – Re-designed the command pattern to avoid DB connection leaks and to improve DB connection usage.

  • 11. Oozie Usages
    • Y! internal usage:
      – Total number of users: 377
      – Total number of processed jobs ≈ 600K/month
    • External downloads:
      – 1500+ in the last 8 months from GitHub
      – A large number of downloads maintained by 3rd-party packaging.



  • 12. Oozie Usages Cont.
    • User community:
      – Membership:
        • Y! internal - 265
        • External - 163
      – Messages (approximate):
        • Y! internal - 9/day
        • External - 7/day

  • 13. Challenges 1: Data Availability Check
    • Issue:
      – Currently checks directories every minute (polling based).
      – Increases NN overhead and does not scale well.
    • Reason: no metadata system with an appropriate notification mechanism.
    • Planned resolution: integrate with the HCatalog metadata system.

  • 14. Challenges 2: Adaptability to Hadoop
    • Issue: if the Hadoop NN or JT is down, Oozie submits the job and it obviously fails. User intervention is required when the Hadoop server is back.
    • Impact: inconvenient for Oozie users. For example, if Hadoop is restarted on Friday night, the job will not run until the next Monday.
    • Planned resolution: graceful handling of Hadoop downtime:
      – If Hadoop is down, block submission.
      – When Hadoop becomes available:
        • Submit the blocked jobs.
        • Auto-resubmit the untraced jobs.


  • 15. Challenges 3: Horizontally Scalable
    • Issue: one instance of Oozie cannot efficiently handle a very large number of jobs (say 100K/hour). In addition, Oozie doesn't support load balancing.
    • Reason: Oozie's internal task queue is not synchronized across multiple Oozie instances.
    • Planned resolution: use ZooKeeper for coordination.
    • Benefits: as the load increases, add extra Oozie servers.

  • 16. Future Plan
    • Automatic failover: using ZooKeeper.
    • Monitoring: rich WS API for application monitoring/alerting.
    • Improved usability:
      – DistCp action
      – Hive action
    • Asynchronous data processing.
    • Incremental data processing.
    • Apache migration: work initiated.


  • 17. Q&A
    • GitHub link: http://yahoo.github.com/oozie
    • Mailing list: Oozie-users@yahoogroups.com
    Mohammad K Islam
    kamrul@yahoo-inc.com

  • 18. Backup
Slides

  • 19. Oozie Workflow Application
    • Contents:
      – A workflow.xml file
      – Resource files, config files, and Pig scripts
      – All necessary JAR and native library files
    • Parameters:
      – The workflow.xml is parameterized; parameters can be propagated to map-reduce, pig & ssh jobs.
    • Deployment:
      – In a directory in the HDFS of the Hadoop cluster where the Hadoop & Pig jobs will run.
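The parameterization described above is typically driven by a properties file supplied at job submission. The sketch below is illustrative, with assumed values matching the wordcount example in the deployment slide; the host names and property values are not from the talk:

```properties
# Illustrative job.properties sketch. Values defined here are
# substituted into workflow.xml via ${...} expressions and can be
# propagated to map-reduce, pig, and ssh actions.
nameNode=hdfs://bar.corp:9000
jobTracker=foo.corp:9001
oozie.wf.application.path=${nameNode}/usr/tucu/wordcount-wf
input=/data/2008/input
output=/data/2008/output
```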

  • 20. Oozie: Running a Workflow Job (cmd)
    Workflow application deployment:
    $ hadoop fs -mkdir hdfs://usr/tucu/wordcount-wf
    $ hadoop fs -mkdir hdfs://usr/tucu/wordcount-wf/lib
    $ hadoop fs -copyFromLocal workflow.xml wordcount.xml hdfs://usr/tucu/wordcount-wf
    $ hadoop fs -copyFromLocal hadoop-examples.jar hdfs://usr/tucu/wordcount-wf/lib
    Workflow job execution:
    $ oozie run -o http://foo.corp:8080/oozie -a hdfs://bar.corp:9000/usr/tucu/wordcount-wf input=/data/2008/input output=/data/2008/output
    Workflow job id [1234567890-wordcount-wf]
    Workflow job status:
    $ oozie status -o http://foo.corp:8080/oozie -j 1234567890-wordcount-wf
    Workflow job status [RUNNING]
    ...