Azkaban and Pig at LinkedIn

Description of using Pig with the Azkaban workflow scheduler for Hadoop

Published in: Technology

  1. Azkaban and Pig
     Richard Park, Russell Jurney
     LinkedIn Search, Network, Analytics
  2. Installing & Running Azkaban
     wget http://github.com/downloads/azkaban/azkaban/azkaban-0.04.tar.gz
     tar -xvzf azkaban-0.04.tar.gz
     mkdir /some-dir/azkaban-jobs
     cd azkaban-0.04
     bin/azkaban-server.sh --job-dir /some-dir/azkaban-jobs
  3. Azkaban @ localhost:8080
  4. Pig Configuration
     myproject.properties (global configuration):
       hadoop.job.ugi=rjurney,hadoop
       udf.import.list=org.apache.pig.builtin.,com.linkedin.pig.,com.linkedin…
     cc_0_compute_title_counts.job (Pig job):
       type=pig
       pig.script=cc_0_compute_title_counts.pig
     cc_1_reverse_engineer_durak_3000.job (Pig job with a dependency):
       type=pig
       pig.script=cc_1_reverse_engineer_durak_3000.pig
       dependencies=cc_0_compute_title_counts
  5. Running Job
     > bin/run-jobs.sh --job-dir /some-dir/azkaban-jobs my-job
  6. Scheduling Jobs
  7. Viewing Jobs
  8. Editing Jobs
  9. Azkaban Pig Docs
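
The job files on the Pig Configuration slide can be laid out with a few lines of shell. What follows is a minimal sketch, not the authors' setup: `JOB_DIR` is a hypothetical scratch directory standing in for the real job directory, the `udf.import.list` line is omitted because its value is truncated on the slide, and the `tsort` step is only an illustration of the order implied by the `dependencies=` lines, not how Azkaban itself resolves them.

```shell
#!/bin/sh
# Sketch only: recreate the job files from the Pig Configuration slide.
# JOB_DIR is a hypothetical scratch location; use your real job directory instead.
JOB_DIR="${JOB_DIR:-$(mktemp -d)}"
mkdir -p "$JOB_DIR"

# Global configuration. The udf.import.list value is truncated on the slide,
# so it is left out here rather than guessed.
cat > "$JOB_DIR/myproject.properties" <<'EOF'
hadoop.job.ugi=rjurney,hadoop
EOF

# Pig job with no dependencies.
cat > "$JOB_DIR/cc_0_compute_title_counts.job" <<'EOF'
type=pig
pig.script=cc_0_compute_title_counts.pig
EOF

# Pig job that depends on the job above.
cat > "$JOB_DIR/cc_1_reverse_engineer_durak_3000.job" <<'EOF'
type=pig
pig.script=cc_1_reverse_engineer_durak_3000.pig
dependencies=cc_0_compute_title_counts
EOF

# Illustration: emit one "dependency job" pair per edge and let tsort order them.
# A job without dependencies is paired with itself so it still appears in the output.
ORDER=$(
  for f in "$JOB_DIR"/*.job; do
    job=$(basename "$f" .job)
    deps=$(sed -n 's/^dependencies=//p' "$f" | tr ',' ' ')
    if [ -z "$deps" ]; then
      echo "$job $job"
    else
      for d in $deps; do echo "$d $job"; done
    fi
  done | tsort
)
echo "$ORDER"
```

With these two jobs, the printed order lists cc_0_compute_title_counts before cc_1_reverse_engineer_durak_3000, which matches the dependency declared on the slide.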
