Deep dive hadoop

Transcript of "Deep dive hadoop"

  1. “Deep dive”
  2. Key takeaways
     • NameNode is critical to the cluster
     • NameNode is not the same as SecondaryNameNode; the SecondaryNameNode is not a backup, etc.
     • Clients access cluster nodes directly
     • NameNode doesn’t take part in data transfer
  3. Map-Reduce
  4. YARN – Map-Reduce
  5. Key takeaways
     • YARN promotes the Hadoop cluster to a “universal computational cluster”
     • Map-Reduce is just one application running on the cluster
     • Since Hadoop 2.0, Hadoop is not just Map-Reduce
  6. High Availability
     • Issues in Hadoop 1.x
       – NameNode SPOF
       – Problems with cluster maintenance
       – “Split-brain” scenario
       – “Shoot me in the HEAD”
     • Solutions:
       – NFS
       – Facebook’s “Avatar Node”
       – Hadoop 2.0
     • Things to consider
       – Cold, warm or hot standby
       – Manual, semi-automated or automated failover
  7. Hadoop 2.0 HA – Key points
     • Hadoop HA affects more than just HDFS
     • Provides semi-automated or automatic failover
     • Simplifies cluster maintenance
     • Complicates node installation
     • Makes cluster operations more complicated
  8. Cluster processes
     Hadoop
     • NameNode
     • SecondaryNameNode
     • DataNode
     • ResourceManager
     • NodeManager
     • ZooKeeper
     • JournalNode
     • ZKFailoverController
     • History server
     HBase
     • HMaster
     • RegionServer
     • ZooKeeper
  9. Service Profiles – Node Roles
  10. Takeaway
     • HDFS should be “SAFE”
       – Possibility to protect data
     • ResourceManager is now the SPOF
       – We may not be able to process data in the cluster
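
The takeaway on slide 2, that the NameNode serves metadata while clients exchange data with the DataNodes themselves, can be illustrated with a small toy model. This is only a sketch and not the Hadoop API: the classes `ToyNameNode` and `ToyDataNode`, the function `read_file`, the block ids and the path `/demo.txt` are all hypothetical names chosen for illustration.

```python
class ToyDataNode:
    """Hypothetical stand-in for a DataNode: it stores block contents."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}        # block_id -> bytes
        self.reads_served = 0   # counts how often a client read from this node

    def read_block(self, block_id):
        self.reads_served += 1
        return self.blocks[block_id]


class ToyNameNode:
    """Hypothetical stand-in for a NameNode: it holds metadata only."""

    def __init__(self):
        self.block_map = {}     # path -> ordered list of (block_id, datanode)

    def get_block_locations(self, path):
        # Returns locations only; no file content ever passes through here.
        return list(self.block_map[path])


def read_file(namenode, path):
    """Client-side read: ask the NameNode where blocks are, then fetch them."""
    data = b""
    for block_id, datanode in namenode.get_block_locations(path):
        data += datanode.read_block(block_id)   # direct client-to-DataNode read
    return data


# Assumed example data: one file split into two blocks on two nodes.
dn1, dn2 = ToyDataNode("dn1"), ToyDataNode("dn2")
dn1.blocks["blk_1"] = b"hello "
dn2.blocks["blk_2"] = b"hadoop"

nn = ToyNameNode()
nn.block_map["/demo.txt"] = [("blk_1", dn1), ("blk_2", dn2)]

content = read_file(nn, "/demo.txt")
print(content)
```

In this model, losing `nn` makes every file unreachable even though all block data is still on the DataNodes, which is one way to read the "critical to the cluster" bullet.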

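
The split-brain concern on slide 6 and the automated failover mentioned on slide 7 can be sketched in the same toy style. The example assumes a simplified rule: an unhealthy active node is fenced before a standby is promoted, so that two nodes never act as active at once. The names `ToyHaNameNode`, `active_nodes` and `failover` are hypothetical, and the logic is an illustration rather than how Hadoop's failover controller is actually implemented.

```python
class ToyHaNameNode:
    """Hypothetical NameNode with an HA state and a health flag."""

    def __init__(self, name, state="standby"):
        self.name = name
        self.state = state      # "active", "standby" or "fenced"
        self.healthy = True


def active_nodes(nodes):
    """Names of all nodes currently in the active state."""
    return [n.name for n in nodes if n.state == "active"]


def failover(nodes):
    """Fence an unhealthy active node, then promote one healthy standby.

    Fencing happens BEFORE promotion, so there is never a moment
    with two active nodes (the split-brain case).
    """
    for node in nodes:
        if node.state == "active" and not node.healthy:
            node.state = "fenced"
    if not active_nodes(nodes):
        for node in nodes:
            if node.state == "standby" and node.healthy:
                node.state = "active"
                break
    return active_nodes(nodes)


nn1 = ToyHaNameNode("nn1", state="active")
nn2 = ToyHaNameNode("nn2")
cluster = [nn1, nn2]

before = failover(cluster)   # nn1 is healthy, so nothing changes
nn1.healthy = False          # simulate a NameNode failure
after = failover(cluster)    # nn1 is fenced and nn2 takes over
print(before, after)
```

A manual or semi-automated setup would correspond to an operator calling `failover` by hand; an automated one would call it from a monitoring loop.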