Advanced Oozie

Alexey Yakubovich

Published in: Technology

  1. ADVANCED OOZIE. Alexey Yakubovich
  2. In addition to the book “Professional Hadoop Solutions” by Boris Lublinsky, Kevin T. Smith and Alexey Yakubovich
  3. New cases
     • Use cases
       • Organizing all data processing steps top-down
       • Regular data ingestion
       • Regular data transformation
       • Regular report generation
     • Extensions
       • File movement on HDFS (synchronous Java action)
       • Data transfer (synchronous: FTP, SSH)
       • Logging / monitoring (beyond the Oozie console)
  4. New & rediscovered Oozie features
     • 1. JMS notifications (job life cycle, SLA)
       http://oozie.apache.org/docs/4.0.0/DG_JMSNotifications.html
     • 2. Overriding the launcher
       https://github.com/yahoo/oozie/blob/master/examples/src/main/java/org/apache/oozie/example/DemoPigMain.java
     • 3. Unit testing Oozie with MiniOozie
       http://oozie.apache.org/docs/4.0.0/ENG_MiniOozie.html
  5. JMS notifications
     • “Push” JMS notifications for action status, SLA met and SLA miss
     • Needs a “JMS broker” to interpret the notifications, e.g. Apache ActiveMQ
     • Needs a “JMS notification configuration” in oozie-site.xml:
       • oozie.services.ext
       • oozie.services.EventHandlerService…
       • oozie.jms.producer.connection.properties (topic)
     • Notification types
       • Job status: start, success, failure, suspended …
       • SLA: (start | end | duration) combined with (met | miss)
     • Message format: javax.jms.TextMessage with Oozie job-specific headers
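As far as I recall from the Oozie 4.0 install documentation, the value of oozie.jms.producer.connection.properties is one string of JNDI settings, with `;` between entries and `#` between key and value. The sketch below splits such a string into a map, which is what a consumer needs before it can look up the broker. The class `JmsConnectionProps` and the sample value are assumptions for illustration, not part of Oozie.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not an Oozie class): splits a "key#value;key#value" string.
class JmsConnectionProps {

    // Sample value in the assumed format; host and port are placeholders.
    static final String EXAMPLE =
            "java.naming.factory.initial#org.apache.activemq.jndi.ActiveMQInitialContextFactory;"
            + "java.naming.provider.url#tcp://localhost:61616;"
            + "connectionFactoryNames#ConnectionFactory";

    static Map<String, String> parse(String value) {
        Map<String, String> props = new LinkedHashMap<>();
        for (String entry : value.split(";")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate a trailing ';'
            }
            int sep = trimmed.indexOf('#');
            if (sep <= 0) {
                throw new IllegalArgumentException("Expected key#value, got: " + trimmed);
            }
            // Everything before the first '#' is the key, the rest is the value.
            props.put(trimmed.substring(0, sep).trim(), trimmed.substring(sep + 1).trim());
        }
        return props;
    }
}
```

The resulting map can be copied into a java.util.Properties object and handed to a JNDI InitialContext on the consumer side.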
  6. Overriding the launcher (cross-cutting concerns)
     • Regular Pig job launcher: org.apache.oozie.action.hadoop.PigMain
       Reminder: the action executor makes all preparations for submitting the action as Hadoop job(s). In particular, the PigMain executor invokes the Pig runtime on an Edge (Gateway) node.
         public class SpecialPigExec extends PigMain {
             // e.g. logging, external services (security, transactions)
         }
     • Oozie workflow
         <action name="pig-special">
           <pig>
             …
             <property>
               <name>oozie.launcher.action.main.class</name>
               <value>… SpecialPigExec</value>
             </property>
             …
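A minimal sketch of the wrapping idea on slide 6, assuming the launcher starts the configured class through its static main method. To keep it compilable without the Oozie jars, `StandInPigMain` replaces org.apache.oozie.action.hadoop.PigMain; the event list and the `demo` helper are hypothetical and exist only to make the call order visible.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for org.apache.oozie.action.hadoop.PigMain (assumption: a static main entry point).
class StandInPigMain {
    static final List<String> EVENTS = new ArrayList<>();

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            throw new IllegalArgumentException("no Pig script given");
        }
        EVENTS.add("pig:" + String.join(" ", args));
    }
}

// Hypothetical launcher that adds cross-cutting steps around the regular one.
class SpecialPigExec extends StandInPigMain {

    public static void main(String[] args) throws Exception {
        EVENTS.add("before");          // e.g. logging, security check
        try {
            StandInPigMain.main(args); // delegate to the regular launcher
            EVENTS.add("success");     // e.g. notify an external service
        } catch (Exception e) {
            EVENTS.add("failure");     // e.g. report the failure
            throw e;                   // let the action still fail
        }
    }

    // Demo helper: runs the launcher and returns the recorded call order.
    static List<String> demo(String... args) {
        EVENTS.clear();
        try {
            main(args);
        } catch (Exception e) {
            // already recorded as "failure"
        }
        return new ArrayList<>(EVENTS);
    }
}
```

With the real PigMain on the classpath, the class name would go into oozie.launcher.action.main.class as shown on the slide.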
  7. Unit testing Oozie with MiniOozie
     • MiniOozieTestCase is a JUnit test class
     • Allows testing of workflow and coordinator applications
     • Tests a workflow directly from the IDE (Eclipse for sure)
     • Does not require access to a cluster or a running Oozie server
     • Runs against the local file system
     • Tested on Linux and Mac OS X, configured with Maven (simple)
     • Needs most (all) Oozie libraries
     • Action choice is restricted: Java actions are straightforward, others can be “simulated”
     • I can’t tell whether it can be combined with PigUnit and Hive standalone mode.
