9/24/2018 Automating Hadoop Jobs Using Rundeck |
https://acadgild.com/blog/automating-hadoop-jobs-using-rundeck 1/12
Automating Hadoop Jobs Using Rundeck
kiran • April 6, 2017
Rundeck is open source software for automating routine operational
procedures. It is developed on GitHub as a project called rundeck by
SimplifyOps and the Rundeck community.
Following are some of its features:
Web API
Distributed command execution
Pluggable execution system (SSH by default)
Multistep workflows
Job execution with on-demand or scheduled runs
Graphical web console for command and job execution
Role-based access control policy with support for LDAP/ActiveDirectory
History and auditing logs
Open integration with external host inventory tools
Command line interface tools
In our previous blog post, we showed how to schedule a Hadoop job in
Rundeck. In this post, we demonstrate how to automate a
Hadoop/Hive/Pig job using Rundeck so that the job runs on a
daily or even a monthly basis.
We recommend going through our previous posts on Rundeck for the
installation steps and for how to schedule a Hadoop job.
Let us start with project creation. We will put a list of Hive queries in a file
and then configure a job that runs them automatically every day.
To create a new project, click on New Project and provide the
necessary details, such as the project name and description, as shown in the
screenshot below.
Also, check the option Require File Exists in the Resource Model Source and
click on Save.
Now, scroll to the end of the page and click on Create. Your Rundeck
project will be created, and you will see the project screen as
shown below.
Now click on Create Job at the right corner and then on New Job. Fill in the
necessary details, such as the job name and description, as shown in the
screenshot below:
Next, go to the Workflow section and select the options of your
choice. We have selected the following options:
If a step fails: Stop at the failed step
Strategy: Sequential
To provide the command to run, go to the Add step section and select the option
Command.
Here, you need to run the Hive query file containing a set of Hive
queries. Our Hive queries are shown below.
We receive our employee details in the file emp.csv in HDFS on a daily
basis. So, we create an HQL file with the following content and
name it hive_query.hql.
-- Create the database only on the first run.
CREATE DATABASE IF NOT EXISTS employee;
USE employee;
-- Create the table only on the first run.
CREATE EXTERNAL TABLE IF NOT EXISTS employee_test (
    id STRING,
    first_name STRING,
    last_name STRING,
    email STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
-- Load the daily file from HDFS into the table.
LOAD DATA INPATH '/emp.csv' INTO TABLE employee_test;
The command used to run this script in the command line is shown below:
hive -f hive_query.hql
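
Because LOAD DATA INPATH moves the file out of its HDFS source path, a scheduled run finds nothing to load if the new emp.csv has not arrived yet. A minimal sketch of a guard for this step is given below; the function name run_load_step is hypothetical and not part of our job, and it assumes the hdfs and hive commands are on the PATH of the user that Rundeck runs the step as. Returning a non-zero status makes the step fail, so with "Stop at the failed step" selected the workflow halts before the count step.

```shell
#!/bin/sh
# Hypothetical wrapper for the first workflow step; a sketch, not part of the original job.

# Run the load script only when the input file is present in HDFS.
# Arguments: $1 = HDFS path of the input file, $2 = HQL script to run.
run_load_step() {
    input_path="$1"
    hql_file="$2"

    # `hdfs dfs -test -e` exits with status 0 when the path exists.
    if ! hdfs dfs -test -e "$input_path"; then
        echo "input file $input_path not found in HDFS; skipping load" >&2
        return 1
    fi

    # The exit status of hive becomes the exit status of the step.
    hive -f "$hql_file"
}

# Example call, using the path and file name from this post:
# run_load_step /emp.csv hive_query.hql
```

If you save this function in a script on the node, the Command step can call the script instead of calling hive directly.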
 
After entering the command, click on Save. If you want to run another
query after this, you can do so by adding another step.
After loading the data, we want to count the number of employees present. We
can do this by using the following Hive query:
select count(*) from employee.employee_test
We save this query in a file named hive_emp.hql. In the command that runs
it, we append >emp_cnt.txt so that the shell redirects the query's output
into the file emp_cnt.txt. We enter this command as the next step in our
workflow, as shown in the screenshot below:
hive -f hive_emp.hql>emp_cnt.txt
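
Note that > truncates emp_cnt.txt on every run, so each daily run replaces the previous day's count. If you want to keep one count per day, a small variation is to put the run date in the file name. The sketch below illustrates this; the helper names count_file_for and run_count_step and the emp_cnt_DATE.txt naming scheme are our assumptions for illustration, not something Rundeck or Hive requires.

```shell
#!/bin/sh
# Hypothetical helpers for the second workflow step; a sketch under the assumptions above.

# Build the output file name for a given date (YYYY-MM-DD).
count_file_for() {
    printf 'emp_cnt_%s.txt\n' "$1"
}

# Run the count query and write its output to a file named after today's date.
run_count_step() {
    out_file=$(count_file_for "$(date +%Y-%m-%d)")
    hive -f hive_emp.hql > "$out_file"
}

# Example call from the Command step:
# run_count_step
```

For a run on April 6, 2017, the count would land in emp_cnt_2017-04-06.txt, and earlier files stay untouched.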
To automate this job, select the following option:
Schedule or Run Repeated: Yes
Rundeck offers two ways to define the schedule: a simple form and a Unix
crontab-style expression. After selecting the option you need, scroll to the
bottom of the page and click on Create.
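
If you choose the crontab option, keep in mind that the expression is not laid out exactly like a classic Unix crontab entry. As far as we know, Rundeck's scheduler is built on Quartz, whose expressions start with a seconds field and end with an optional year field, so treat the field order below as an assumption to verify against your Rundeck version. The helper name daily_cron is hypothetical. For comparison, a classic five-field Unix crontab entry for 08:30 every day is 30 8 * * *.

```shell
#!/bin/sh
# Build a Quartz-style cron expression for a job that runs once a day.
# Assumed field order: seconds minutes hours day-of-month month day-of-week year
# Arguments: $1 = hour (0-23), $2 = minute (0-59).
daily_cron() {
    hour="$1"
    minute="$2"
    # '?' in day-of-month means "no specific value"; '*' in day-of-week means every day.
    printf '0 %s %s ? * * *\n' "$minute" "$hour"
}

daily_cron 8 30   # prints: 0 30 8 ? * * *
```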
After clicking on Create, you will be redirected to the Job page. Beside your
job, you can see the time remaining until its next run.
You can also see your job definition in the Definition tab as shown below:
In this demo, we have changed the scheduled time so that only 2 minutes
remain before the job runs. After 2 minutes, the job will run automatically.
You can track the job status in the Activity for this job section below. Here,
we have four tabs: running, recent, failed, and by you.
After 2 minutes, you can see in the running tab that your job is running.
Once your job starts, it is assigned an execution
number, which you can use to track the job's running status and its complete
console output. In the screenshot displayed below, you can see that our job
is running.
The execution number here is 21. In the recent tab, we can see the list of all
the succeeded and failed executions. Now, we will look up execution number
21 and check its console output.
We can see that our job has run successfully. We can check the output in the
Log Output tab. Here, you can see the console output for both steps.
Now, we will check the output in the file emp_cnt.txt.
In the above screenshot, you can see that there are 6000 employees in the
company to date. As scheduled, the same job will run automatically the next
day, and the count will be saved.
Once the job completes successfully, you can see the countdown to the next
execution, as shown in the screenshot below:
We hope this blog helped you in automating your Hadoop jobs using
Rundeck. Keep visiting our website, www.acadgild.com, for more updates on
Big Data training and other technologies.
