What is Hadoop and how does it work?


Hadoop is gaining interest all over the world. To get comfortable with this technology, check this presentation. It explains the basics of Hadoop and the workflow of a cluster.

Published in: Technology

  1. Cnu Federer (image source: hadoop.apache.org)
  2. Cnu Federer, ExplainToday.blogspot.com. What is Hadoop?
     ● A powerful framework for processing big data
     ● Parallel processing and distributed storage
     Diagram: Big Data → Hadoop → analytics, recommendations, insights
  3. Where does it come from?
     ● Evolved from Google's MapReduce and the Google File System (GFS)
     ● Developed as an open source Apache project
  4. Why is it significant?
     ● Data is growing rapidly
     ● Need for proper analytics
     ● Saves power and time
     ● Traditional methods cannot keep up with this volume of data
  5. Key terms in Hadoop
     ● Name Node
       – Master machine that stores metadata about the datanodes (which datanode holds which data)
     ● Resource Manager (Job Tracker in older Hadoop versions)
       – Manages available resources (the datanodes' memory and processing power)
     These two are considered the masters.
  6. Key terms (contd.)
     ● Data Node
       – Stores data and runs MapReduce tasks
       – We can add as many as we want
     ● Secondary Name Node
       – Periodically takes image files (checkpoints) from the Name Node
       – Useful in recovering from a Name Node failure
       – Reduces the burden on the Name Node
  7. Key terms (contd.)
     ● HDFS
       – Hadoop Distributed File System
       – Each machine has its own local file system, but HDFS is distributed and available to all machines
     ● History Server
       – Saves the history of jobs run on the data nodes
  8. What is MapReduce?
     ● A software framework used to process data
     ● Introduced by Google
     ● Map and Reduce are its two phases
     Diagram: Data → Mapping phase → key-value pairs → Reducing phase → Results
  9. How does MapReduce work? (image source: Google)
     Example: counting the number of times a word occurs
  10. Hadoop workflow
     Diagram: Name Node, Resource Manager, four DataNodes, History Server, and Secondary Name Node, with the steps numbered 1 to 6
  11. How does Hadoop work? (workflow steps 1, 2, 3)
     ➔ Store data in HDFS across all the nodes
     ➔ The Name Node stores the metadata of the datanodes
     ➔ A task is given to the Hadoop cluster
     ➔ The Resource Manager checks with the Name Node which datanode has which data
  12. How does Hadoop work? (contd., workflow steps 4, 5)
     ➔ Based on the Name Node's inputs, the Resource Manager gives MapReduce tasks to the data nodes
     ➔ The data nodes perform MapReduce and store the job history in the History Server
     ➔ After the tasks have completed, the results are collected and given back to the user
  13. Commercial products
     ● CDH (Cloudera Distribution Including Apache Hadoop)
     ● IBM InfoSphere BigInsights
     ● MapR Apache Hadoop distributions
     ● Hortonworks Hadoop distributions
     ● ...and many more
  14. References
     ● http://en.wikipedia.org/wiki/Apache_Hadoop
     ● http://hadoop.apache.org/
     ● http://www-01.ibm.com/software/data/infosphere/hadoop/
  15. Cnu Federer (tweet @cnufederer). TRY AND LEARN
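
The word-count example mentioned in the MapReduce slides can be sketched in plain Python. This is only an illustration of the two phases, not Hadoop code: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample sentences are assumptions made for this sketch, and everything runs in one process rather than on a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) key-value pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all values by key; a real framework does this between the phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: add up the counts collected for a single word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical input, one string per input split.
    sample = ["deer bear river", "car car river", "deer car bear"]
    print(word_count(sample))
```

On a real cluster each call to map_phase would run on the data node that holds that piece of the input, and the grouping step would move data between machines; here the three functions simply run one after another.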

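The workflow described in the "How does Hadoop work?" slides can also be imitated with a toy model. The sketch below is an assumption-laden simplification, not the Hadoop API: store_blocks, schedule and run_job are hypothetical names, each block is kept on exactly one node (real HDFS replicates blocks, three copies by default), and the "task" is just counting the words in a block.

```python
def store_blocks(blocks, datanodes):
    """Spread blocks across datanodes round-robin.

    Returns the Name Node style metadata (block id -> node name) and the
    per-node storage (node name -> {block id: data}).
    """
    metadata = {}
    storage = {node: {} for node in datanodes}
    for index, (block_id, data) in enumerate(blocks.items()):
        node = datanodes[index % len(datanodes)]
        storage[node][block_id] = data
        metadata[block_id] = node
    return metadata, storage


def schedule(metadata):
    """Resource Manager role: give each node the blocks it already holds."""
    tasks = {}
    for block_id, node in metadata.items():
        tasks.setdefault(node, []).append(block_id)
    return tasks


def run_job(blocks, datanodes):
    """Store the data, schedule the tasks, run them, and collect the result."""
    metadata, storage = store_blocks(blocks, datanodes)
    tasks = schedule(metadata)
    history = []  # stands in for the History Server
    total = 0
    for node, block_ids in tasks.items():
        for block_id in block_ids:
            count = len(storage[node][block_id].split())
            history.append((node, block_id, count))
            total += count
    return total, history


if __name__ == "__main__":
    # Hypothetical blocks and node names.
    total, history = run_job({"b0": "a b c", "b1": "d e", "b2": "f"}, ["dn1", "dn2"])
    print(total, history)
```

The point of schedule is the one the slides make: the Resource Manager uses the Name Node's metadata so that work is sent to the node that already stores the data, instead of moving the data to the work.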