
What is Hadoop Cluster? Hadoop Cluster Setup and Architecture | Edureka


YouTube Link: https://youtu.be/aBCDy-dJE0Y

**Big Data Hadoop Certification Training: https://www.edureka.co/big-data-hadoop-training-certification **
This Edureka PPT on Hadoop Cluster provides detailed knowledge about Hadoop and its architecture. The accompanying video will help you set up a multi-node cluster on your own. This PPT covers the following topics:
What is a Hadoop Cluster?
Advantages of a Hadoop Cluster
Facebook’s Hadoop Cluster
Hadoop Cluster Architecture
Setting up a Hadoop Cluster
Hadoop Cluster Management System

Follow us to never miss an update in the future.
YouTube: https://www.youtube.com/user/edurekaIN
Instagram: https://www.instagram.com/edureka_learning/
Facebook: https://www.facebook.com/edurekaIN/
Twitter: https://twitter.com/edurekain
LinkedIn: https://www.linkedin.com/company/edureka
Castbox: https://castbox.fm/networks/505?country=in

Published in: Technology

  1. 1. HADOOP CLUSTER
  2. 2. www.edureka.co
     • WHAT IS A HADOOP CLUSTER?
     • ADVANTAGES OF HADOOP CLUSTER
     • FACEBOOK HADOOP CLUSTER
     • ARCHITECTURE OF HADOOP CLUSTER
     • SETTING UP A HADOOP CLUSTER
     • MANAGING A HADOOP CLUSTER
  3. 3. WHAT IS A HADOOP CLUSTER?
  4. 4. WHAT IS A CLUSTER?
  5. 5. WHAT IS A CLUSTER? Computer Cluster
     • A computer cluster is a set of loosely or tightly connected computers.
     • They work together so that, in many respects, they can be viewed as a single system.
     • Each node in a computer cluster is set to perform the same task, controlled and scheduled by software.
  6. 6. WHAT IS A HADOOP CLUSTER? [Diagram: a Master node and Slave nodes]
  7. 7. WHAT IS A HADOOP CLUSTER? Hadoop Cluster
     • A Hadoop cluster is a set of connected commodity computers.
     • They work together so that, in many respects, they can be viewed as a single system.
     • Each node in a Hadoop cluster is set to perform the same task, controlled and scheduled by the Master.
  8. 8. HADOOP CLUSTER ADVANTAGES
  9. 9. Advantages of Hadoop Cluster. The major advantages of a Hadoop cluster are as follows:
     • Scalable
     • Cost effective
     • Flexible
     • Fast
     • Resilient to failure
  10. 10. FACEBOOK HADOOP CLUSTER
  11. 11. FACEBOOK HADOOP CLUSTER
     • Facebook’s cluster is known as the beefiest Hadoop cluster.
     • It runs on 4,000 machines and stores hundreds of millions of gigabytes of data.
     • Facebook launched in 2004.
     • Facebook has 2.38 billion accounts.
  12. 12. Facebook Hadoop Cluster
     • Developers can write MapReduce programs in any language.
     • SQL has been integrated to process extensive data sets.
     • Use cases range from data warehousing, searching, log processing, and recommendation systems to video and image analysis.
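Slide 12 notes that MapReduce programs can be written in any language. A minimal sketch of the idea in Python follows, assuming the classic word-count job; the function names `mapper`, `reducer`, and `run_local` are hypothetical, and the job is simulated in a single process rather than submitted to a cluster.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Map phase: emit a (word, 1) pair for every word in the input."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def reducer(sorted_pairs):
    """Reduce phase: sum the counts for each word.

    Expects pairs sorted by key, which is how a shuffle/sort step
    would hand them to the reducer.
    """
    for word, group in groupby(sorted_pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def run_local(lines):
    """Simulate map -> shuffle/sort -> reduce in one process (no cluster)."""
    shuffled = sorted(mapper(lines), key=itemgetter(0))
    return dict(reducer(shuffled))


if __name__ == "__main__":
    print(run_local(["Hadoop stores data", "Hadoop processes data"]))
```

On a real cluster the map and reduce functions would run on different machines, with the framework handling the sort and the data movement between them.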
  13. 13. Facebook Growing Big With Big Data
  14. 14. FACEBOOK HADOOP CLUSTER ARCHITECTURE [Diagram labels: Web servers, Scribe MidTier, Filers, Production Hive-Hadoop Cluster, Adhoc Hive-Hadoop Cluster, Oracle RAC, Federated MySQL, Scribe Hadoop Cluster, Hive Replication]
  15. 15. HADOOP CLUSTER ARCHITECTURE
  16. 16. HADOOP CLUSTER ARCHITECTURE [Diagram labels: Hadoop, HDFS, YARN, Master, Slave, NameNode, Secondary NameNode, Resource Manager, DataNode, NodeManager]
  17. 17. HADOOP CLUSTER ARCHITECTURE [Diagram: NameNode and DataNodes]
  18. 18. HADOOP CLUSTER ARCHITECTURE
     Name Node
     • The master daemon that manages the Data Nodes.
     • Records the metadata of all the files.
     • Receives a heartbeat and a block report from the Data Nodes.
     Data Node
     • Slave daemons that run on the slave machines.
     • The actual data is stored on the Data Nodes.
     • Responsible for serving read and write requests.
     [Diagram: NameNode and Secondary NameNode; labels: FS-image, Edit Log, Edit Log (New), FS-image (Final)]
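The heartbeat and block-report bookkeeping described on slide 18 can be sketched as a toy model. This is not Hadoop code: the class `NameNodeSketch`, its method names, and the 30-second timeout are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class NameNodeSketch:
    """Toy model of a NameNode's view of its DataNodes (hypothetical)."""
    timeout_s: float = 30.0  # assumed value, chosen only for this sketch
    last_heartbeat: dict = field(default_factory=dict)  # node -> timestamp
    block_map: dict = field(default_factory=dict)       # block id -> set of nodes

    def heartbeat(self, node, now):
        """Record that `node` was alive at time `now` (in seconds)."""
        self.last_heartbeat[node] = now

    def block_report(self, node, block_ids):
        """Replace what is known about the blocks stored on `node`."""
        for holders in self.block_map.values():
            holders.discard(node)
        for block_id in block_ids:
            self.block_map.setdefault(block_id, set()).add(node)

    def live_nodes(self, now):
        """Nodes whose last heartbeat falls within the timeout."""
        return {n for n, t in self.last_heartbeat.items()
                if now - t <= self.timeout_s}

    def under_replicated(self, now, replication=3):
        """Blocks with fewer live replicas than the target replication."""
        live = self.live_nodes(now)
        return sorted(b for b, holders in self.block_map.items()
                      if len(holders & live) < replication)
```

If three nodes report `blk_1` and one of them then stops sending heartbeats, `under_replicated` lists `blk_1` once that node's last heartbeat is older than the timeout.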
  19. 19. HADOOP CLUSTER ARCHITECTURE
     • YARN (Yet Another Resource Negotiator) makes it possible to run non-MapReduce applications.
     • The YARN framework is responsible for cluster resource management.
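To make "cluster resource management" concrete, here is a toy allocation loop. It is only a sketch under stated assumptions: the function `allocate_containers` and its greedy "most free memory first" rule are hypothetical and are not how YARN's own schedulers decide placements.

```python
def allocate_containers(free_mb, requests_mb):
    """Greedy toy allocation: try the node with the most free memory.

    free_mb:     {node name: free memory in MB}
    requests_mb: list of requested container sizes in MB
    Returns a list of (request size, node or None) in request order.
    """
    free = dict(free_mb)  # copy so the caller's dict is not mutated
    placements = []
    for size in requests_mb:
        node = max(free, key=free.get) if free else None
        if node is not None and free[node] >= size:
            free[node] -= size
            placements.append((size, node))
        else:
            placements.append((size, None))  # nothing can host it right now
    return placements


if __name__ == "__main__":
    print(allocate_containers({"nm1": 4096, "nm2": 2048},
                              [2048, 2048, 2048, 1024]))
```

The point of the sketch is the separation of concerns: a central manager tracks free capacity per node and hands out containers, regardless of what kind of application runs inside them.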
  20. 20. HADOOP CLUSTER ARCHITECTURE [Diagram labels: Hadoop Cluster Master, Core Switches, Rack Switches, Computer 1 to Computer n in each rack]
  21. 21. HADOOP CLUSTER ARCHITECTURE [Diagram: Blocks 1 to 5, each stored as three copies spread across Nodes 1 to 4]
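The diagram on slide 21 shows five blocks, each stored three times. The arithmetic behind such a layout can be sketched as follows, assuming a 128 MB block size and a replication factor of 3 (the usual HDFS defaults; the slide itself does not state them). The function name `block_layout` is hypothetical.

```python
import math


def block_layout(file_size_mb, block_size_mb=128, replication=3):
    """Return block count, last block size, replica count, and raw storage."""
    if file_size_mb <= 0:
        raise ValueError("file_size_mb must be positive")
    blocks = math.ceil(file_size_mb / block_size_mb)
    # The final block only holds the remainder; it is not padded.
    last_block_mb = file_size_mb - (blocks - 1) * block_size_mb
    return {
        "blocks": blocks,
        "last_block_mb": last_block_mb,
        "replicas": blocks * replication,
        "raw_storage_mb": file_size_mb * replication,
    }


if __name__ == "__main__":
    print(block_layout(600))
```

Under these assumptions a 600 MB file splits into five blocks (four of 128 MB and one of 88 MB), which gives fifteen stored replicas.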
  22. 22. HADOOP CLUSTER ARCHITECTURE
     • The Rack Awareness Algorithm reduces latency and provides fault tolerance by replicating data blocks.
     • It states that the first replica of a block is stored on the local rack and the next two replicas are stored on a different (remote) rack.
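The placement rule on slide 22 can be sketched as below. This follows the slide's wording rather than Hadoop's source code: the function `place_replicas`, the `topology` layout, and the choice to treat the writer's own node as the local-rack replica are assumptions made for illustration.

```python
import random


def place_replicas(topology, writer_node, rng=None):
    """Pick three nodes: one on the writer's rack, two on one remote rack.

    topology: {rack name: [node names]}
    """
    rng = rng or random.Random()
    local_rack = next(
        (rack for rack, nodes in topology.items() if writer_node in nodes),
        None,
    )
    if local_rack is None:
        raise ValueError(f"unknown node: {writer_node}")
    # Only remote racks with at least two nodes can take replicas 2 and 3.
    remote_racks = [rack for rack, nodes in topology.items()
                    if rack != local_rack and len(nodes) >= 2]
    if not remote_racks:
        raise ValueError("need a remote rack with at least two nodes")
    remote_rack = rng.choice(remote_racks)
    second, third = rng.sample(topology[remote_rack], 2)
    return [writer_node, second, third]


if __name__ == "__main__":
    racks = {"rack1": ["n1", "n2"], "rack2": ["n3", "n4"], "rack3": ["n5", "n6"]}
    print(place_replicas(racks, "n1"))
```

Keeping two replicas on a single remote rack limits cross-rack traffic during the write, while still leaving a copy available if the whole local rack fails.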
  23. 23. HADOOP CLUSTER ARCHITECTURE
  24. 24. HADOOP CLUSTER ARCHITECTURE
  25. 25. HADOOP CLUSTER ARCHITECTURE
  26. 26. HADOOP CLUSTER ARCHITECTURE
  27. 27. HADOOP CLUSTER ARCHITECTURE
  28. 28. HADOOP CLUSTER ARCHITECTURE
  29. 29. SET UP A HADOOP CLUSTER
  30. 30. MANAGE A HADOOP CLUSTER
  31. 31. MANAGE A HADOOP CLUSTER
  32. 32.
     • Hadoop offers both a command-line interface and an API.
     • It does not require any specific tool for management and monitoring.
     • Some options are available, such as: 1. Ambari 2. Hortonworks
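As an illustration of the "CLI and API" point on slide 32: a directory listing can be requested on the command line with `hdfs dfs -ls /user/demo`, or over HTTP through the WebHDFS REST interface. The sketch below only builds the request URL. The helper `webhdfs_url`, the host name, and the path are hypothetical, and port 9870 assumes the Hadoop 3 default for the NameNode web endpoint.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, path, op, port=9870, **params):
    """Build a WebHDFS REST URL for an absolute HDFS path.

    Nothing is sent over the network; the caller decides how to issue
    the request (GET for LISTSTATUS, for example).
    """
    if not path.startswith("/"):
        raise ValueError("path must be absolute")
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


if __name__ == "__main__":
    # Rough REST counterpart of: hdfs dfs -ls /user/demo
    print(webhdfs_url("namenode.example.com", "/user/demo", "LISTSTATUS"))
```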
  33. 33. AMBARI CLUSTER ARCHITECTURE [Diagram labels: Ambari Web, Ambari Web Server, Database, Clusters, an Agent on each Host]
  34. 34. AMBARI OVERVIEW
  35. 35. www.edureka.co
