WDABT 2016 – BHARATHIAR
UNIVERSITY
K.Santhiya , Ph.d Research Scholar , Dr.V.Bhuvaneswari, Asst.Professor, Dept. of Comp. Appll., Bharathiar
University,- WDABT 2016
TAKE A CLOSER LOOK AT
PRESENTED BY
K.SANTHIYA
PH.D RESEARCH SCHOLAR
UNDER THE GUIDANCE OF
DR.V.BHUVANESWARI
ASSISTANT PROFESSOR
DEPARTMENT OF COMPUTER APPLICATIONS
BHARATHIAR UNIVERSITY
AGENDA
• MAPREDUCE
– ANALOGY
– EXECUTION
– HADOOP INTERACTION
– BUILD MAPREDUCE PROGRAM IN ECLIPSE
• YARN
– YARN DEFINITION
– YARN REAL LIFE CONNECT
– YARN INFRASTRUCTURE
WHY MAPREDUCE
MAP REDUCE
REAL TIME USES OF MAP REDUCE
MR – REAL LIFE CONNECT
MAP REDUCE – ANALOGY
MAP REDUCE – ANALOGY CONTD.,
MAP REDUCE EXAMPLE
MAP EXECUTION
MAP EXECUTION – DISTRIBUTED TWO NODE ENVIRONMENT
MAPREDUCE JOBS
HADOOP JOB WORK INTERACTION
CHARACTERISTICS OF MR
• MapReduce is designed to handle very large-scale data, in the range of petabytes and exabytes.
• It works well on write-once, read-many data, also known as WORM data.
• MapReduce allows parallelism without mutexes, because each task works on its own split of the data and shares no mutable state with other tasks.
• Map and Reduce tasks typically run on the same cluster nodes that store the data.
• Operations are scheduled near the data, as data locality is preferred.
• Commodity hardware and storage are leveraged in MapReduce.
• The runtime takes care of splitting and moving data for operations.
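The point about parallelism without mutexes can be illustrated with a small sketch in plain Java. It does not use the Hadoop API; the class name LockFreeSplitsSketch and its methods are hypothetical, and a parallel stream stands in for the cluster. Each split is mapped into its own private partial result, and the partial results are merged only after the map work has finished, so no lock is needed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical illustration class; not part of Hadoop.
class LockFreeSplitsSketch {
    // "Map" step: one split is processed on its own and yields a private partial count.
    static Map<String, Integer> mapSplit(String split) {
        Map<String, Integer> partial = new TreeMap<>();
        for (String word : split.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                partial.merge(word, 1, Integer::sum);
            }
        }
        return partial;
    }

    // "Reduce" step: merge the partial counts after all map calls have returned.
    static Map<String, Integer> reduce(List<Map<String, Integer>> partials) {
        Map<String, Integer> total = new TreeMap<>();
        for (Map<String, Integer> partial : partials) {
            partial.forEach((word, count) -> total.merge(word, count, Integer::sum));
        }
        return total;
    }

    static Map<String, Integer> run(List<String> splits) {
        // Splits are processed in parallel. No mutex is required because
        // no map call reads or writes another call's data.
        List<Map<String, Integer>> partials = splits.parallelStream()
                .map(LockFreeSplitsSketch::mapSplit)
                .collect(Collectors.toList());
        return reduce(partials);
    }

    public static void main(String[] args) {
        // prints {big=2, cluster=1, data=2}
        System.out.println(run(Arrays.asList("big data big", "data cluster")));
    }
}
```

In a real cluster the runtime, not the program, decides where each split is processed; the sketch only shows why independent splits need no shared locks.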
BUSINESS SCENARIO
SET UP ENVIRONMENT
SMALL DATA AND BIG DATA
UPLOADING SMALL & BIG DATA
BUILD MAPREDUCE PROGRAM
MAPREDUCE DEMO
• We will run an example that computes the value of pi, which is a computation-intensive program. The first argument indicates how many maps to create; here, we use 10 mappers. The second argument indicates how many samples are generated per map; here, we take 100 random samples. So this program uses 10 multiplied by 100, that is, 1,000 random points to estimate pi. We could increase the samples per map from 100 to 10 million to improve accuracy.
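The arithmetic behind the demo can be sketched in plain Java. This is not the code of Hadoop's bundled example, which has its own sampling scheme; it is a minimal Monte Carlo version under the assumption that each "map" draws pseudo-random points in the unit square and counts those inside the quarter circle. The names PiEstimateSketch, countInside and estimatePi are hypothetical.

```java
import java.util.Random;

// Hypothetical illustration class; not Hadoop's pi example.
class PiEstimateSketch {
    // One "map": draw samples in the unit square and count those inside the quarter circle.
    static long countInside(long samples, long seed) {
        Random random = new Random(seed);
        long inside = 0;
        for (long i = 0; i < samples; i++) {
            double x = random.nextDouble();
            double y = random.nextDouble();
            if (x * x + y * y <= 1.0) {
                inside++;
            }
        }
        return inside;
    }

    // The "reduce": add the per-map counts and scale the ratio by 4.
    static double estimatePi(int numMaps, long samplesPerMap) {
        long inside = 0;
        for (int m = 0; m < numMaps; m++) {
            inside += countInside(samplesPerMap, m); // the map index doubles as the seed
        }
        return 4.0 * inside / ((double) numMaps * samplesPerMap);
    }

    public static void main(String[] args) {
        // 10 maps x 100 samples = 1,000 points: a rough estimate.
        System.out.println(estimatePi(10, 100));
        // 10 maps x 1,000,000 samples: a much closer estimate.
        System.out.println(estimatePi(10, 1_000_000));
    }
}
```

The two calls in main mirror the two arguments of the demo: the number of maps and the number of samples per map.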
HADOOP MR REQUIREMENTS
Create a New Project: Step 1
Create a New Project: Step 2
Create a New Project: Step 3
Create a New Project: Step 4
Create a New Project: Step 5
CHECKING HADOOP ENVIRONMENT FOR MAPREDUCE
Build a MR Application Using Eclipse
and Run in Hadoop Cluster
Let’s build a MapReduce Java program in Eclipse and then run it on our Hadoop cluster. In this demo, we will run Eclipse on the Windows development machine, and our Hadoop cluster will be on Ubuntu.
1. First, let’s launch Eclipse.
2. Enter the workspace location.
3. Click OK.
4. The Eclipse window will open.
5. Close the welcome screen of Eclipse.
6. Select the New menu item.
7. Select Java Project.
8. The New Java Project window opens.
9. We will build a WordCount program here to count the number of times each word occurs in a particular file.
Build a MR Application Using Eclipse
and Run in Hadoop Cluster Contd.,
10. Enter the name of the project as ‘WordCount’ and click Finish.
11. Right-click the WordCount project in the panel on the left.
12. Select New and then Class.
13. The New Java Class window opens.
14. Enter the name of the class as ‘WordCount’.
15. Click Finish.
16. Now, let’s copy the WordCount program from the MapReduce tutorial on Hadoop’s website. You may go to Hadoop’s documentation or go directly to the link shown.
17. Copy the source code for the WordCount program and paste it into the WordCount class.
18. You will notice a lot of compilation errors. Let’s fix the build path now. Select the project WordCount.
Build a MR Application Using Eclipse
and Run in Hadoop Cluster Contd.,
19. Select the Project menu item.
20. Click Properties.
21. In the Libraries tab, click Add External JARs.
22. Browse to the unpacked Hadoop directory and go to the share/hadoop/mapreduce directory.
23. Select the hadoop-mapreduce-client-core and hadoop-mapreduce-client-common JAR files.
24. Now, go to the share/hadoop/common directory.
25. Select the hadoop-common JAR file.
26. The compilation errors should be gone by now.
27. Let’s now look at the various portions of this program.
28. The usual Java imports are at the top of the program.
29. Further, there are Hadoop and MapReduce related import statements. Select the Description column header.
Build a MR Application Using Eclipse
and Run in Hadoop Cluster Contd.,
30. In the main method, we begin by setting up the configuration of the MapReduce job.
31. We set the Mapper class.
32. We set the Combiner class.
33. Similarly, we set the Reducer class.
34. We set the output key class.
35. We also set the output value class.
36. We set the input path for the source dataset.
37. We set the output path to the location where the results are desired.
38. Our Mapper class extends Mapper.
39. It has a map method, which takes a key and a value as arguments and writes its output through the context.
40. In the WordCount logic, we just tokenize each line on whitespace and extract the individual words.
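The map logic described in steps 38 to 40 can be sketched without the Hadoop libraries. In this plain Java version the class TokenizerMapSketch is hypothetical, and a list of pairs stands in for the Hadoop context: adding a pair to the list plays the role of writing a key and value to the context. The actual program from the Hadoop tutorial uses Hadoop's own key and value types instead of String and Integer.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical illustration class; a plain-Java stand-in for a WordCount mapper.
class TokenizerMapSketch {
    // Stand-in for map(key, value, context): the key is the line's offset,
    // the value is the line itself, and "emitted" replaces the context.
    static void map(long offset, String line, List<Map.Entry<String, Integer>> emitted) {
        StringTokenizer tokens = new StringTokenizer(line); // splits on whitespace by default
        while (tokens.hasMoreTokens()) {
            // Emit (word, 1) for every word found in the line.
            emitted.add(new SimpleEntry<String, Integer>(tokens.nextToken(), 1));
        }
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        map(0L, "to be or not to be", emitted);
        // prints [to=1, be=1, or=1, not=1, to=1, be=1]
        System.out.println(emitted);
    }
}
```

Note that the mapper does no counting of its own; repeated words are emitted repeatedly and are added up later.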
Build a MR Application Using Eclipse
and Run in Hadoop Cluster Contd.,
41. Our Reducer class similarly extends Reducer.
42. The reduce method takes a key and an iterable list of values as arguments.
43. The final output is again written as key-value pairs.
44. Select the New menu item.
45. Let’s now build and export a JAR file to run this program on a Hadoop cluster. Click the File menu and then Export.
46. The Export window opens.
47. Expand Java.
48. Select JAR file.
49. Click the Next button.
50. Enter the path and name of the JAR. In this case, let’s name it ‘WordCount.jar’.
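The reduce logic from steps 41 to 43 can be sketched in the same plain Java style. The class SumReduceSketch is hypothetical and does not use the Hadoop API; a map from each word to its list of counts stands in for the grouped input that the framework hands to the reducer after the shuffle.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class; a plain-Java stand-in for a WordCount reducer.
class SumReduceSketch {
    // Stand-in for reduce(key, values, context): add up all counts for one word.
    static int reduce(String word, Iterable<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Apply reduce to every group and collect the output as key-value pairs.
    static Map<String, Integer> reduceAll(Map<String, List<Integer>> grouped) {
        Map<String, Integer> output = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            output.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return output;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        grouped.put("be", Arrays.asList(1, 1));
        grouped.put("not", Arrays.asList(1));
        grouped.put("to", Arrays.asList(1, 1));
        // prints {be=2, not=1, to=2}
        System.out.println(reduceAll(grouped));
    }
}
```

Because addition is associative, the same summing logic can also serve as the Combiner mentioned in the driver steps, which pre-adds counts on the map side before they are sent to the reducers.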
Build a MR Application Using Eclipse
and Run in Hadoop Cluster Contd.,
51. Make sure that you select the project.
52. Now, let’s transfer this JAR to the Hadoop cluster. If you are using Windows, you can use any SCP or FTP client such as WinSCP. Log in with WinSCP using the IP address of the Hadoop Ubuntu machine.
53. Enter the username of the Hadoop machine.
54. Enter the password.
55. Select the WordCount.jar file from the local Windows machine.
56. Using WinSCP, you can drag and drop it to the Ubuntu machine in the panel on the right.
57. The Copy window opens.
58. Click the Copy button.
Build a MR Application Using Eclipse
and Run in Hadoop Cluster Contd.,
59. Now, run the WordCount program in the Hadoop cluster using the hadoop jar command. Specify the input file on which WordCount is to be applied and also the output path for the results.
60. View the results in the output directory.
61. You will notice a file with a name like part-r-00000.
62. View the contents of this output file using the hadoop fs -cat command.
63. The output will have a count of each word’s occurrence in the input dataset.
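If the counts are needed in another Java program, the lines of the output file can be read back as pairs. The sketch below assumes the job uses Hadoop's default text output, in which each line holds a key and a value separated by a tab. The class PartFileLineSketch and the sample line are hypothetical.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical illustration class for reading one line of WordCount output.
class PartFileLineSketch {
    // Split a "word<TAB>count" line into its key and its numeric value.
    static Map.Entry<String, Integer> parse(String line) {
        int tab = line.lastIndexOf('\t');
        if (tab < 0) {
            throw new IllegalArgumentException("expected a tab-separated line: " + line);
        }
        String word = line.substring(0, tab);
        int count = Integer.parseInt(line.substring(tab + 1).trim());
        return new SimpleEntry<String, Integer>(word, count);
    }

    public static void main(String[] args) {
        // prints hadoop=3
        System.out.println(parse("hadoop\t3"));
    }
}
```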
WHY YARN ?
YARN: Yet Another Resource Negotiator
WHAT IS YARN ?
YARN is Hadoop’s resource manager. It was created by separating the resource management function from the processing engine of MapReduce. It monitors and manages workloads, maintains a multi-tenant environment, manages the high availability features of Hadoop, and implements security controls.
YARN – REAL LIFE CONNECT
• Limitations of MapReduce
• Architected by Yahoo
• Hadoop 2.0 provides a broader ecosystem with
– Spark for iterative processing
– Storm for stream processing
– MapReduce for batch processing
YARN INFRASTRUCTURE

Hadoop map reduce

  • 1. WDABT 2016 – BHARATHIAR UNIVERSITY K.Santhiya , Ph.d Research Scholar , Dr.V.Bhuvaneswari, Asst.Professor, Dept. of Comp. Appll., Bharathiar University,- WDABT 2016
  • 2. TAkE A CloSER look AT PRESENTED BY K.SANTHIYA PH.D RESEARCH SCHolAR UNDER THE GUIDANCE of DR.V.BHUVANESWARI ASSISTANT PRofESSoR DEPARTmENT of ComPUTER APPlICATIoNS BHARATHIAR UNIVERSITY K.Santhiya , Ph.d Research Scholar , Dr.V.Bhuvaneswari, Asst.Professor, Dept. of Comp. Appll., Bharathiar University,- WDABT 2016
  • 3. AGENDA • MAPREDUCE •  ANALOGY •  EXECUTION •  HADOOP INTERACTION •  BUILD MAPREDUCE PROGRAM IN ECLIPSE YARN •  YARN DEFINITION •  YARN REAL LIFE CONNECT •  YARN INRASTRUCTURE K.Santhiya , Ph.d Research Scholar , Dr.V.Bhuvaneswari, Asst.Professor, Dept. of Comp. Appll., Bharathiar University,- WDABT 2016
  • 4. WHY mAPREDUCE K.Santhiya , Ph.d Research Scholar , Dr.V.Bhuvaneswari, Asst.Professor, Dept. of Comp. Appll., Bharathiar University,- WDABT 2016
  • 5. mAP REDUCE K.Santhiya , Ph.d Research Scholar , Dr.V.Bhuvaneswari, Asst.Professor, Dept. of Comp. Appll., Bharathiar University,- WDABT 2016
  • 6. REAl TImE USES of mAP REDUCE K.Santhiya , Ph.d Research Scholar , Dr.V.Bhuvaneswari, Asst.Professor, Dept. of Comp. Appll., Bharathiar University,- WDABT 2016
  • 7. MR REAL – LIFE CONNECT K.Santhiya , Ph.d Research Scholar , Dr.V.Bhuvaneswari, Asst.Professor, Dept. of Comp. Appll., Bharathiar University,- WDABT 2016
  • 8. MAP REDUCE - ANALOGY K.Santhiya , Ph.d Research Scholar , Dr.V.Bhuvaneswari, Asst.Professor, Dept. of Comp. Appll., Bharathiar University,- WDABT 2016
  • 9. MAP REDUCE – ANALOGY CONTD., K.Santhiya , Ph.d Research Scholar , Dr.V.Bhuvaneswari, Asst.Professor, Dept. of Comp. Appll., Bharathiar University,- WDABT 2016
  • 10. MAP REDUCE EXAMPLE K.Santhiya , Ph.d Research Scholar , Dr.V.Bhuvaneswari, Asst.Professor, Dept. of Comp. Appll., Bharathiar University,- WDABT 2016
  • 11. MAP EXECUTION K.Santhiya , Ph.d Research Scholar , Dr.V.Bhuvaneswari, Asst.Professor, Dept. of Comp. Appll., Bharathiar University,- WDABT 2016
  • 12. MAP EXECUTION – DISTRIBUTED TWO NODE ENVIRONMENT K.Santhiya , Ph.d Research Scholar , Dr.V.Bhuvaneswari, Asst.Professor, Dept. of Comp. Appll., Bharathiar University,- WDABT 2016
  • 13. MAPREDUCE JOBS K.Santhiya , Ph.d Research Scholar , Dr.V.Bhuvaneswari, Asst.Professor, Dept. of Comp. Appll., Bharathiar University,- WDABT 2016
  • 14. HADOOP JOB WORK INTERACTION K.Santhiya , Ph.d Research Scholar , Dr.V.Bhuvaneswari, Asst.Professor, Dept. of Comp. Appll., Bharathiar University,- WDABT 2016
  • 15. CHARACTERISTICS OF MR • MapReduce is designed to handle very large scale data in the range of petabytes and exabytes. • It works well on write once and read many data, also known as WORM data. • MapReduce allows parallelism without mutexes. • The Map and Reduce operations are performed by the same processor. • Operations are provisioned near the data as data locality is preferred. • Commodity hardware and storage is leveraged in MapReduce. • The runtime takes care of splitting and moving data for operations. K.Santhiya , Ph.d Research Scholar , Dr.V.Bhuvaneswari, Asst.Professor, Dept. of Comp. Appll., Bharathiar University,- WDABT 2016
  • 16. BUSINESS SCENARIO
  • 17. SET UP ENVIRONMENT
  • 18. SMALL DATA AND BIG DATA
  • 19. UPLOADING SMALL & BIG DATA
  • 20. BUILD MAPREDUCE PROGRAM
  • 21. MAPREDUCE DEMO
    • We will run an example that computes the value of pi, which is a computation-intensive program. The first argument indicates how many maps to create; here, we use 10 mappers. The second argument indicates how many samples are generated per map; here, we take 100 random samples. So this program uses 10 × 100 = 1,000 random points to estimate pi. We could increase the samples per map from 100 to 10 million to improve accuracy.
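The arithmetic of the demo (10 maps × 100 samples = 1,000 points) can be sketched in plain Java. This is an illustration under assumptions, not the code of the Hadoop example: the class name `PiEstimateSketch`, the seed, and the use of `java.util.Random` are all hypothetical choices, and the bundled example may use a different sampling scheme.

```java
import java.util.Random;

// Hypothetical plain-Java stand-in for the demo; not the Hadoop example's code.
final class PiEstimateSketch {

    // "Map" step: one mapper throws `samples` random points into the unit
    // square and counts how many land inside the quarter circle of radius 1.
    static long countInside(int samples, long seed) {
        Random random = new Random(seed);
        long inside = 0;
        for (int i = 0; i < samples; i++) {
            double x = random.nextDouble();
            double y = random.nextDouble();
            if (x * x + y * y <= 1.0) {
                inside++;
            }
        }
        return inside;
    }

    // "Reduce" step: add the per-mapper counts and scale by 4, because the
    // quarter circle covers pi/4 of the unit square.
    static double estimatePi(int maps, int samplesPerMap, long baseSeed) {
        Random seeds = new Random(baseSeed); // one independent seed per mapper
        long inside = 0;
        for (int m = 0; m < maps; m++) {
            inside += countInside(samplesPerMap, seeds.nextLong());
        }
        return 4.0 * inside / ((long) maps * samplesPerMap);
    }

    public static void main(String[] args) {
        // Same arguments as the demo: 10 maps, 100 samples per map.
        System.out.println(estimatePi(10, 100, 42L));
    }
}
```

With only 1,000 points the estimate is coarse; raising `samplesPerMap` narrows the error, which is the accuracy gain the slide refers to.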
  • 22. HADOOP MR REQUIREMENTS
  • 23. Create a New Project : Step 1
  • 24. Create a New Project : Step 2
  • 25. Create a New Project : Step 3
  • 26. Create a New Project : Step 4
  • 27. Create a New Project : Step 5
  • 28. CHECKING HADOOP ENVIRONMENT FOR MAPREDUCE
  • 29. Build a MR Application Using Eclipse and Run in Hadoop Cluster
    Let’s build a MapReduce Java program in Eclipse and then run it in our Hadoop cluster. In this demo, we will run Eclipse on the Windows development machine, and our Hadoop cluster will be on Ubuntu.
    1. First, let’s launch Eclipse.
    2. Enter the workspace location.
    3. Click OK.
    4. The Eclipse window will open.
    5. Close the welcome screen of Eclipse.
    6. Select the New menu item.
    7. Select Java Project.
    8. The New Java Project window opens.
    9. We will build a WordCount program here to count the number of times each word occurs in a particular file.
  • 30. Build a MR Application Using Eclipse and Run in Hadoop Cluster Contd.,
    10. Enter the name of the project as ‘WordCount’ and click Finish.
    11. Right-click the WordCount project in the panel on the left.
    12. Select New and then Class.
    13. The New Java Class window opens.
    14. Enter the name of the class as ‘WordCount’.
    15. Click Finish.
    16. Now, let’s copy the WordCount program from the MapReduce tutorial on Hadoop’s website. You may go to Hadoop’s documentation or go directly to the link shown.
    17. Copy the source code for the WordCount program.
    18. You will notice a lot of compilation errors. Let’s fix the build path now. Select the project WordCount.
  • 31. Build a MR Application Using Eclipse and Run in Hadoop Cluster Contd.,
    19. Select the Project menu item.
    20. Click Properties.
    21. In Libraries, add external JARs.
    22. Browse to the unpacked Hadoop directory and go to the share/hadoop/mapreduce directory.
    23. Select the Hadoop MapReduce client core and Hadoop MapReduce client common JAR files.
    24. Now, go to the share/hadoop/common directory.
    25. Select the Hadoop common JAR file.
    26. The compilation errors should be gone by now.
    27. Let’s now look at the various portions of this program.
    28. The usual Java imports are at the top of the program.
    29. Further, there are Hadoop and MapReduce related import statements. Select the Description column header.
  • 32. Build a MR Application Using Eclipse and Run in Hadoop Cluster Contd.,
    30. In the main method, we begin by setting the configuration of the MapReduce job.
    31. We set the name of the Mapper class.
    32. We set the name of the Combiner class.
    33. Similarly, there is a Reducer class.
    34. We can set the output key class.
    35. We can also set the output value class.
    36. Also, set the input data path for the source dataset.
    37. Set the output path to the location where the results are desired.
    38. Our Mapper class extends Mapper.
    39. It has a map method, which takes a key and a value as arguments and uses a context.
    40. In the WordCount logic, we just tokenize each line by the space character and extract the individual words.
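The map logic in steps 38 to 40 can be sketched without any Hadoop dependency. The class `MapStepSketch` is a hypothetical plain-Java stand-in, not the Mapper from the tutorial: it returns the emitted pairs as a list, whereas a real Mapper would write them to its context.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical plain-Java stand-in for the Mapper; it uses no Hadoop classes.
final class MapStepSketch {

    // Mirrors the map logic described above: split one input line on
    // whitespace and emit a (word, 1) pair for every token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            emitted.add(new SimpleEntry<String, Integer>(tokens.nextToken(), 1));
        }
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println(map("to be or not to be"));
    }
}
```

Note that repeated words are emitted repeatedly, each with the value 1; adding them up is left to the combiner and reducer.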
  • 33. Build a MR Application Using Eclipse and Run in Hadoop Cluster Contd.,
    41. Our Reducer class similarly extends Reducer.
    42. The reduce method takes a key and an iterable list of values as arguments.
    43. The final output is again written as key-value pairs.
    44. Select the New menu item.
    45. Let’s now build and export a JAR file to run this program on a Hadoop cluster. Click the File menu and then Export.
    46. The Export window opens.
    47. Expand Java.
    48. Select JAR file.
    49. Click the Next button.
    50. Enter the path and name of the JAR. In this case, let’s name it ‘WordCount.jar’.
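The reduce logic in steps 41 to 43 can be sketched the same way. `ReduceStepSketch` is a hypothetical plain-Java stand-in rather than the tutorial's Reducer: it receives one key with all the values grouped under it and returns a single key-value pair holding the sum.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Arrays;
import java.util.Map;

// Hypothetical plain-Java stand-in for the Reducer; it uses no Hadoop classes.
final class ReduceStepSketch {

    // Mirrors the reduce logic described above: add up every value that was
    // grouped under one key and return the result as a (key, total) pair.
    static Map.Entry<String, Integer> reduce(String key, Iterable<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return new SimpleImmutableEntry<String, Integer>(key, sum);
    }

    public static void main(String[] args) {
        // "be" appeared twice in the input, so the mappers emitted two 1s.
        System.out.println(reduce("be", Arrays.asList(1, 1)));
    }
}
```

Because addition is associative, the same function can also play the role of the combiner mentioned in step 32, pre-summing values before they reach the reducer.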
  • 34. Build a MR Application Using Eclipse and Run in Hadoop Cluster Contd.,
    51. Make sure that you select the project.
    52. Now, let’s transfer this JAR to the Hadoop cluster. If you are using Windows, you can use any SCP or FTP client, such as WinSCP. Log in to WinSCP using the IP address of the Hadoop Ubuntu cluster.
    53. Enter the username of the Hadoop machine.
    54. Enter the password.
    55. Select the WordCount.jar file from the local Windows machine.
    56. Using WinSCP, you can drag and drop it to the Ubuntu machine in the panel on the right.
    57. The Copy window opens.
    58. Click the Copy button.
  • 35. Build a MR Application Using Eclipse and Run in Hadoop Cluster Contd.,
    59. Now, run the WordCount program in the Hadoop cluster using the hadoop jar command. Specify the input file on which WordCount is to be applied and also the output result path.
    60. View the results in the output directory.
    61. You will notice a file whose name begins with ‘part’ (typically part-r-00000).
    62. View the contents of this output file using the hadoop fs -cat command.
    63. The output will have a count of each word’s occurrence in the input dataset.
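Once the output file has been printed, its lines can be read back into a map for further checks. The sketch below assumes each line holds a word and its count separated by a tab, which is the usual layout of text output; the class name `WordCountOutputSketch` and the sample lines are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; assumes one "word<TAB>count" pair per output line.
final class WordCountOutputSketch {

    // Parses output lines into a map that preserves the order of the file.
    static Map<String, Integer> parse(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            int tab = line.lastIndexOf('\t');
            if (tab < 0) {
                continue; // skip blank or malformed lines
            }
            String word = line.substring(0, tab);
            int count = Integer.parseInt(line.substring(tab + 1).trim());
            counts.put(word, count);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(parse(Arrays.asList("be\t2", "not\t1", "to\t2")));
    }
}
```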
  • 36. WHY YARN? YARN: Yet Another Resource Negotiator
  • 37. WHAT IS YARN?
    YARN is a resource manager. It was created by separating the processing engine and the management function of MapReduce. It monitors and manages workloads, maintains a multi-tenant environment, manages the high availability features of Hadoop, and implements security controls.
  • 38. YARN – REAL LIFE CONNECT
    • Limitations of MapReduce
    • Architected by Yahoo
    • Hadoop 2.0 provides a broader ecosystem with
      – Spark for iterative processing
      – Storm for stream processing
      – Hadoop for batch processing
  • 39. YARN INFRASTRUCTURE