More Related Content Similar to Luigi Presentation at OSCON 2013 Similar to Luigi Presentation at OSCON 2013 (20) Luigi Presentation at OSCON 201324. Luigi-AggregateArtists
Run on the command line:
$ python dataflow.py AggregateArtists
DEBUG: Checking if AggregateArtists() is complete
INFO: Scheduled AggregateArtists()
DEBUG: Checking if Streams() is complete
INFO: Done scheduling tasks
DEBUG: Asking scheduler for work...
DEBUG: Pending tasks: 1
INFO: [pid 74375] Running AggregateArtists()
INFO: [pid 74375] Done AggregateArtists()
DEBUG: Asking scheduler for work...
INFO: Done
INFO: There are no more tasks to run at this time
27. Runningitall...
DEBUG: Checking if ArtistToplistToDatabase() is complete
INFO: Scheduled ArtistToplistToDatabase()
DEBUG: Checking if Top10Artists() is complete
INFO: Scheduled Top10Artists()
DEBUG: Checking if AggregateArtists() is complete
INFO: Scheduled AggregateArtists()
DEBUG: Checking if Streams() is complete
INFO: Done scheduling tasks
DEBUG: Asking scheduler for work...
DEBUG: Pending tasks: 3
INFO: [pid 74811] Running AggregateArtists()
INFO: [pid 74811] Done AggregateArtists()
DEBUG: Asking scheduler for work...
DEBUG: Pending tasks: 2
INFO: [pid 74811] Running Top10Artists()
INFO: [pid 74811] Done Top10Artists()
DEBUG: Asking scheduler for work...
DEBUG: Pending tasks: 1
INFO: [pid 74811] Running ArtistToplistToDatabase()
INFO: Done writing, importing at 2013-03-13 15:41:09.407138
INFO: [pid 74811] Done ArtistToplistToDatabase()
DEBUG: Asking scheduler for work...
INFO: Done
INFO: There are no more tasks to run at this time
50. “Oozieexample”
<workflow-app xmlns='uri:oozie:workflow:0.1' name='processDir'>
<start to='getDirInfo' />
<!-- STEP ONE -->
<action name='getDirInfo'>
<!--writes 2 properties: dir.num-files: returns -1 if dir doesn't exist,
otherwise returns # of files in dir dir.age: returns -1 if dir doesn't exist,
otherwise returns age of dir in days -->
<java>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<main-class>com.navteq.oozie.GetDirInfo</main-class>
<arg>${inputDir}</arg>
<capture-output />
</java>
<ok to="makeIngestDecision" />
<error to="fail" />
</action>
<!-- STEP TWO -->
<decision name="makeIngestDecision">
<switch>
<!-- empty or doesn't exist -->
<case to="end">
${wf:actionData('getDirInfo')['dir.num-files'] lt 0 ||
(wf:actionData('getDirInfo')['dir.age'] lt 1 and
wf:actionData('getDirInfo')['dir.num-files'] lt 24)}
</case>
<!-- # of files >= 24 -->
<case to="ingest">
${wf:actionData('getDirInfo')['dir.num-files'] gt 23 ||
wf:actionData('getDirInfo')['dir.age'] gt 6}
</case>
<default to="sendEmail"/>
</switch>
</decision>
<!--EMAIL-->
<action name="sendEmail">
<java>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<main-class>com.navteq.oozie.StandaloneMailer</main-class>
<arg>probedata2@navteq.com</arg>
<arg>gregory.titievsky@navteq.com</arg>
<arg>${inputDir}</arg>
<arg>${wf:actionData('getDirInfo')['dir.num-files']}</arg>
<arg>${wf:actionData('getDirInfo')['dir.age']}</arg>