What is hadoop and how it works?

Hadoop is gaining interest all over the world. To get comfortable with this technology, check this presentation. It explains the basics of Hadoop and the working flow of a cluster.

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

  • Thanks. Got to know more about the Hadoop software platform.
  • Hi
    I read this post two times.
    I like it so much, please try to keep posting.
    Let me introduce other material that may be good for our community.

    Hadoop online training

What is hadoop and how it works? Presentation Transcript

  • Cnu Federer
    Image source: hadoop.apache.org
  • Cnu Federer, ExplainToday.blogspot.com
    What is Hadoop?
    ● A powerful framework for processing big data
    ● Parallel processing and distributed storage
    Diagram: Big Data → Hadoop → Analytics, Recommendations, Insights
  • Where does it come from?
    ● Evolved from Google's MapReduce and the Google File System (GFS)
    ● Later developed as an open-source Apache project
  • Why is it significant?
    ● Data is growing rapidly
    ● Need for proper analytics
    ● Saves power and time
    ● Traditional methods struggle at this scale
  • Key terms in Hadoop
    ● NameNode
      – Master machine that stores the HDFS metadata, including which DataNodes hold which data
    ● Resource Manager (Job Tracker)
      – Manages the available resources (DataNode memory and processing power)
    These two are considered the masters.
  • Key terms (contd.)
    ● DataNode
      – Stores data and runs MapReduce tasks
      – We can add as many as we want
    ● Secondary NameNode
      – Takes periodic checkpoints of the NameNode's image files
      – Useful in recovering from a NameNode failure
      – Reduces the burden on the NameNode
  • Key terms (contd.)
    ● HDFS
      – Hadoop Distributed File System
      – Each machine has its own local file system, but HDFS is distributed and available to all machines
    ● History Server
      – Saves the history of the jobs run on the DataNodes
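To make the NameNode and DataNode roles above more concrete, here is a minimal Python sketch of the kind of metadata a NameNode keeps: which DataNodes hold which block of a file. It is a toy model, not Hadoop's code. The function name `plan_blocks`, the 128 MB block size, the replication factor of 3, and the round-robin placement are all assumptions made for illustration; real HDFS placement is rack-aware and configurable.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # assumed block size (128 MB)
REPLICATION = 3                 # assumed number of copies per block


def plan_blocks(file_size, datanodes, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Return a toy block map: block index -> list of DataNodes holding a copy."""
    if replication > len(datanodes):
        raise ValueError("need at least as many DataNodes as replicas")
    num_blocks = -(-file_size // block_size)  # ceiling division
    block_map = {}
    for block in range(num_blocks):
        # Simplified round-robin placement (an assumption for this sketch).
        block_map[block] = [
            datanodes[(block + r) % len(datanodes)] for r in range(replication)
        ]
    return block_map


if __name__ == "__main__":
    nodes = ["dn1", "dn2", "dn3", "dn4"]  # hypothetical DataNode names
    for block, holders in plan_blocks(300 * 1024 * 1024, nodes).items():
        print(block, holders)
```

Because every block lives on several DataNodes, losing one machine does not lose data, and adding DataNodes simply gives the placement loop more machines to spread blocks over.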
  • What is MapReduce?
    ● A software framework used to process data
    ● Introduced by Google
    ● Map and Reduce are its two phases
    Diagram: Data → Mapping phase → Key-value pairs → Reducing phase → Results
  • How does MapReduce work?
    Example: counting the number of times each word occurs
    Image source: Google
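The word-count example can be sketched in plain Python to show what the two phases do. This runs in a single process and only imitates the data flow (map, group by key, reduce); it does not use Hadoop's API, and the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are made up for this illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) key-value pair for every word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all values by key, as happens between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["Deer Bear River", "Car Car River", "Deer Car Bear"]))
```

On a real cluster the map calls would run in parallel on the machines that hold the input, and the grouping step would move the pairs across the network to the reducers.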
  • Hadoop workflow
    Diagram: NameNode, Resource Manager, four DataNodes, History Server and Secondary NameNode, with steps numbered 1 to 6
  • How does Hadoop work? (steps 1 to 3)
    ➔ Data is stored in HDFS across all the nodes
    ➔ The NameNode stores the metadata about the DataNodes
    ➔ A task is given to the Hadoop cluster
    ➔ The Resource Manager checks with the NameNode to find which DataNode has which data
  • How does Hadoop work? (contd., steps 4 and 5)
    ➔ Based on the NameNode's input, the Resource Manager gives MapReduce tasks to the DataNodes
    ➔ The DataNodes perform MapReduce and record the job history in the History Server
    ➔ After the tasks have completed, the results are collected and returned to the user
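The scheduling step in this workflow, where the Resource Manager uses the NameNode's information to send each task to a machine that already holds the data, can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not Hadoop's scheduler: the function `schedule_map_tasks`, the block names, and the idea of a fixed number of free task slots per DataNode are all hypothetical.

```python
def schedule_map_tasks(block_locations, free_slots):
    """Assign one map task per block, preferring a DataNode that holds the block.

    block_locations: block id -> list of DataNodes holding a copy (NameNode view).
    free_slots: DataNode -> number of tasks it can still accept (assumed model).
    """
    slots = dict(free_slots)  # copy so the caller's dict is not modified
    assignment = {}
    for block, holders in block_locations.items():
        # Prefer a node that already stores the block (data locality).
        local = [node for node in holders if slots.get(node, 0) > 0]
        # Otherwise fall back to any node with a free slot.
        candidates = local or [node for node, free in slots.items() if free > 0]
        if not candidates:
            raise RuntimeError("no free task slots left in the cluster")
        chosen = candidates[0]
        slots[chosen] -= 1
        assignment[block] = chosen
    return assignment


if __name__ == "__main__":
    locations = {"b0": ["dn1", "dn2"], "b1": ["dn1"], "b2": ["dn3"]}
    print(schedule_map_tasks(locations, {"dn1": 1, "dn2": 1, "dn3": 1}))
```

Moving the computation to the data, rather than the data to the computation, is what keeps network traffic low when the input is very large.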
  • Commercial products
    ● CDH (Cloudera Distribution Including Apache Hadoop)
    ● IBM InfoSphere BigInsights
    ● MapR Apache Hadoop distributions
    ● Hortonworks Hadoop distributions
    ● ... and many more
  • References
    ● http://en.wikipedia.org/wiki/Apache_Hadoop
    ● http://hadoop.apache.org/
    ● http://www-01.ibm.com/software/data/infosphere/hadoop/
  • Cnu Federer (tweet @cnufederer)
    TRY AND LEARN