Hadoop
Part 1
Agenda
• Hadoop Ecosystem
• Services
• Key Ideas
• Architecture
• Usage Patterns
• Tuning Parameters
Hadoop Ecosystem
Ecosystem – Key Services
• HDFS
• YARN (vs. Mesos)
• MR (vs. Tez)
• Hive
• Zookeeper
• Kafka
HDFS – Key Ideas
• Distributed
• Divide files into big blocks and distribute across the cluster
• Replication
• Store multiple replicas of each block for reliability. Enables fault-tolerance.
• Write Once, Read Many times (WORM)
• Blocks are immutable
• Data locality
• Programs can locate the replicas of each block and schedule work close to the data
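As a concrete illustration, a client can ask the NameNode where the replicas of each block live; this is the information schedulers use for data locality. A minimal sketch with the Hadoop Java FileSystem API (the file path is hypothetical):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockLocations {
  public static void main(String[] args) throws Exception {
    // Picks up fs.defaultFS etc. from core-site.xml on the classpath.
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    Path file = new Path("/data/example.txt"); // hypothetical file
    FileStatus status = fs.getFileStatus(file);

    // One BlockLocation per block; getHosts() lists the DataNodes holding
    // replicas -- this is what schedulers use to place computation.
    for (BlockLocation block : fs.getFileBlockLocations(status, 0, status.getLen())) {
      System.out.printf("offset=%d length=%d hosts=%s%n",
          block.getOffset(), block.getLength(),
          String.join(",", block.getHosts()));
    }
    fs.close();
  }
}
```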
HDFS - Architecture
HDFS - Metadata
YARN – Key Ideas
• Separation of Concerns
• Resource Management, Job Scheduling / Monitoring.
• Schedulers and Queues
• Shared Clusters
• Locality awareness
• Rack awareness; file-to-block mapping
• Support for diverse programming models
YARN Architecture
YARN – Schedulers and Queues
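As an illustration of queues on a shared cluster, a hedged sketch that lists queue capacities through the YarnClient API (assumes a reachable ResourceManager configured in yarn-site.xml):

```java
import org.apache.hadoop.yarn.api.records.QueueInfo;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class ListQueues {
  public static void main(String[] args) throws Exception {
    YarnConfiguration conf = new YarnConfiguration(); // reads yarn-site.xml
    YarnClient yarn = YarnClient.createYarnClient();
    yarn.init(conf);
    yarn.start();

    // Queues partition shared-cluster capacity among tenants/workloads;
    // the pluggable scheduler (Capacity/Fair) enforces these shares.
    for (QueueInfo q : yarn.getAllQueues()) {
      System.out.printf("queue=%s capacity=%.2f used=%.2f%n",
          q.getQueueName(), q.getCapacity(), q.getCurrentCapacity());
    }
    yarn.stop();
  }
}
```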
MapReduce – Key Ideas
• It is a parallel programming model
• Key Interfaces/steps
• Map
• Combine, Partition, Shuffle & Sort
• Reduce
• Counters
• Backup tasks for stragglers
MapReduce - Execution
MapReduce - Examples
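For reference, a condensed sketch along the lines of the standard Hadoop WordCount tutorial; counters, partitioning, and shuffle & sort happen in the framework around these two callbacks:

```java
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Map: emit (word, 1) for every token in the input split.
class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
  private static final IntWritable ONE = new IntWritable(1);
  private final Text word = new Text();

  @Override
  public void map(Object key, Text value, Context context)
      throws IOException, InterruptedException {
    StringTokenizer itr = new StringTokenizer(value.toString());
    while (itr.hasMoreTokens()) {
      word.set(itr.nextToken());
      context.write(word, ONE);
    }
  }
}

// Reduce: sum the counts for each word. The same class can be set as the
// combiner, since summing is associative and commutative.
class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
  private final IntWritable result = new IntWritable();

  @Override
  public void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable v : values) sum += v.get();
    result.set(sum);
    context.write(key, result);
  }
}
```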
Tez – Key Ideas
• Expressiveness of DAG
• Dynamically adapting the execution
• Runtime graph re-configuration
• Automatic Partition cardinality estimation
• Scheduling Optimizations
• Container re-use and sessions
• Avoiding re-computation
Tez – Key Ideas
Tez – DAG
Tez – API
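A hedged sketch of wiring a two-vertex DAG with the Tez Java API, modeled on the Tez WordCount example; example.TokenProcessor and example.SumProcessor are hypothetical user-written processor classes:

```java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.tez.dag.api.DAG;
import org.apache.tez.dag.api.Edge;
import org.apache.tez.dag.api.ProcessorDescriptor;
import org.apache.tez.dag.api.Vertex;
import org.apache.tez.runtime.library.conf.OrderedPartitionedKVEdgeConfig;
import org.apache.tez.runtime.library.partitioner.HashPartitioner;

public class WordCountDag {
  static DAG build(int numPartitions) {
    // Vertices name the processing steps (map-like and reduce-like work).
    Vertex tokenizer = Vertex.create("Tokenizer",
        ProcessorDescriptor.create("example.TokenProcessor"));
    Vertex summation = Vertex.create("Summation",
        ProcessorDescriptor.create("example.SumProcessor"), numPartitions);

    // A scatter-gather edge: partitioned, sorted key/value movement,
    // the Tez equivalent of MapReduce's shuffle between the two vertices.
    OrderedPartitionedKVEdgeConfig shuffle = OrderedPartitionedKVEdgeConfig
        .newBuilder(Text.class.getName(), IntWritable.class.getName(),
            HashPartitioner.class.getName())
        .build();

    return DAG.create("WordCount")
        .addVertex(tokenizer)
        .addVertex(summation)
        .addEdge(Edge.create(tokenizer, summation,
            shuffle.createDefaultEdgeProperty()));
  }
}
```

At runtime Tez can re-configure this graph, e.g. adjust the partition cardinality of the Summation vertex based on observed data sizes.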
Hive – Key Ideas
• Segregation of Concerns
• Query parsing, planning, and execution; storage handling with SerDes
• SQL
• ORC (Optimized Row Columnar) file format
• CBO (Cost-Based Optimizer)
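A small illustration tying ORC and the CBO together over Hive's JDBC interface (connection URL, credentials, and table are placeholders):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class HiveOrcExample {
  public static void main(String[] args) throws Exception {
    // HiveServer2 JDBC URL; host, port, database, and user are placeholders.
    String url = "jdbc:hive2://localhost:10000/default";
    try (Connection conn = DriverManager.getConnection(url, "user", "");
         Statement stmt = conn.createStatement()) {
      // ORC provides columnar storage with built-in statistics and indexes.
      stmt.execute("CREATE TABLE IF NOT EXISTS clicks "
          + "(user_id BIGINT, url STRING) STORED AS ORC");
      // Column statistics feed the cost-based optimizer (CBO).
      stmt.execute("ANALYZE TABLE clicks COMPUTE STATISTICS FOR COLUMNS");
    }
  }
}
```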
Hive – Architecture
Hive – Query & Plan
Hive – ORC
Hive – CBO
Hive – CBO
Zookeeper – Key Ideas
• It is a wait-free coordination service
• Ordering guarantees:
• Linearizable writes: all requests that update the state of ZooKeeper are serializable and respect precedence.
• FIFO client order: all requests from a given client are executed in the order in which they were sent by the client.
• Atomic Broadcast
• Replicated database (a copy is held in-memory)
• A key/value table with hierarchical keys (namespace like a filesystem)
Zookeeper – Architecture and Zab
Zookeeper – Client API
• create
• delete
• exists
• getData
• setData
• getChildren
• sync
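A minimal sketch of this API with the ZooKeeper Java client library (connect string and paths are placeholders):

```java
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class ZkBasics {
  public static void main(String[] args) throws Exception {
    // Session-level watcher; connect string is a placeholder.
    ZooKeeper zk = new ZooKeeper("localhost:2181", 3000,
        (WatchedEvent e) -> System.out.println("event: " + e));

    // Znodes form a filesystem-like hierarchy of key/value entries.
    if (zk.exists("/config", false) == null) {
      zk.create("/config", "v1".getBytes(),
          ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
    }
    byte[] data = zk.getData("/config", true, null); // true = set a watch
    System.out.println(new String(data));
    zk.setData("/config", "v2".getBytes(), -1);      // -1 = any version
    zk.close();
  }
}
```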
Zookeeper – Synchronization Primitives
The ZooKeeper API can be used to implement more powerful primitives; the ZooKeeper service knows nothing about them, since they are implemented entirely at the client using its API. Some examples:
• Configuration Management
• Rendezvous
• Group Membership
• Locks
• Simple locks
• Simple Locks without Herd Effect
• Read/Write locks
• Double Barrier
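As an illustration of the "locks without herd effect" recipe above, a hedged sketch using ephemeral sequential znodes: each waiter watches only its immediate predecessor, so a release wakes exactly one client instead of all of them.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class ZkLock {
  private final ZooKeeper zk;
  private final String dir; // e.g. "/locks/mylock" (must already exist)
  private String node;

  public ZkLock(ZooKeeper zk, String dir) { this.zk = zk; this.dir = dir; }

  public void lock() throws Exception {
    node = zk.create(dir + "/lock-", new byte[0],
        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
    while (true) {
      List<String> children = zk.getChildren(dir, false);
      Collections.sort(children);
      String mine = node.substring(dir.length() + 1);
      int idx = children.indexOf(mine);
      if (idx == 0) return; // lowest sequence number holds the lock

      // Watch only the predecessor; proceed when it disappears.
      CountDownLatch latch = new CountDownLatch(1);
      String prev = dir + "/" + children.get(idx - 1);
      if (zk.exists(prev, event -> latch.countDown()) == null) continue;
      latch.await();
    }
  }

  public void unlock() throws Exception {
    zk.delete(node, -1); // the ephemeral node also vanishes if the session dies
  }
}
```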
Zookeeper – Applications
• Hadoop uses it for automatic fail-over of Hadoop HDFS Namenode and for the high
availability of YARN ResourceManager.
• HBase uses it for master election, lease management of region servers, and other
communication between region servers.
• Storm uses it for leader election, leader discovery, and preserving most of its state (not files).
• Spark uses it for leader election and some state storage.
• Kafka uses it for maintaining consumption relationships and other use cases such as broker and consumer-group membership.
• Solr uses it for leader election and centralized configuration.
• Mesos uses it for fault-tolerant replicated master.
• Neo4j uses ZooKeeper for write master selection and read slave coordination.
• Cloudera Search uses ZooKeeper for centralized configuration management.
Kafka – Key Ideas
• Distributed Messaging System
• Stateless broker
• Partitioned topics
• Consumer groups
• Consumers coordinate among themselves in a decentralized fashion, using ZooKeeper.
• Guarantees at-least-once delivery.
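A minimal producer sketch. The deck's paper describes the original Scala clients, so this uses the current Java client as a stand-in; broker address and topic name are placeholders.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ProduceExample {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092"); // placeholder
    props.put("key.serializer",
        "org.apache.kafka.common.serialization.StringSerializer");
    props.put("value.serializer",
        "org.apache.kafka.common.serialization.StringSerializer");

    try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
      // Records with the same key land in the same partition of the topic,
      // preserving per-key ordering.
      producer.send(new ProducerRecord<>("page-views", "user-42", "/index.html"));
    }
  }
}
```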
Kafka – Architecture
Kafka – Performance
Kafka – Examples
• Message broker
• Log aggregation
• Operational Monitoring
• Website activity tracking (original use case)
• Stream processing (by itself)
• External commit log
APPENDIX
Hadoop – HDP – Timeline
YARN - Tuning – Memory Configurations
YARN - Tuning – Memory Configurations
YARN - Tuning – Memory Configurations
Alternatives - Mesos vs YARN
“While Mesos and YARN both have schedulers at two levels, there are two very
significant differences. First, Mesos is an offer-based resource manager, whereas
YARN has a request-based approach. YARN allows the AM to ask for resources
based on various criteria including locations, allows the requester to modify future
requests based on what was given and on current usage. Our approach was
necessary to support the location based allocation. Second, instead of a per-job
intraframework scheduler, Mesos leverages a pool of central schedulers (e.g.,
classic Hadoop or MPI). YARN enables late binding of containers to tasks, where
each individual job can perform local optimizations, and seems more amenable to
rolling upgrades (since each job can run on a different version of the framework).
On the other side, per-job ApplicationMaster might result in greater overhead than
the Mesos approach.”
References - Papers
• HDFS - http://static.googleusercontent.com/media/research.google.com/en//archive/gfs-sosp2003.pdf
• YARN - http://web.eecs.umich.edu/~mosharaf/Readings/YARN.pdf
• MapReduce - http://static.googleusercontent.com/media/research.google.com/en//archive/mapreduce-osdi04.pdf
• Hive - http://infolab.stanford.edu/~ragho/hive-icde2010.pdf
• Hive - http://web.cse.ohio-state.edu/hpcs/WWW/HTML/publications/papers/TR-14-2.pdf
• Tez - http://dl.acm.org/citation.cfm?id=2742790
• Zookeeper - http://static.cs.brown.edu/courses/cs227/archives/2012/papers/replication/hunt.pdf
• Zookeeper - https://www.datadoghq.com/wp-content/uploads/2016/04/zab.totally-ordered-broadcast-protocol.2008.pdf
• Kafka - http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf
References – Documentation Links/Articles
• User-defined functions, table-generating functions, aggregation functions - https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF
• Windowing functions - https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics
• ORC - https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC
• Kafka – Zookeeper usage - https://cwiki.apache.org/confluence/display/KAFKA/Kafka+data+structures+in+Zookeeper
References - Slides
• Tez - http://www.slideshare.net/Hadoop_Summit/w-1205phall1saha
Editor's Notes

1. http://hortonworks.com/products/data-center/hdp/ https://slider.incubator.apache.org/design/architecture.html
Broad categories – access, integration, operations, tools.
Accumulo – a sorted, distributed key/value store with cell-based access control; a low-latency, large-table data storage and retrieval system with cell-level security. Accumulo is based on Google's Bigtable and runs on YARN, the data operating system of Hadoop; YARN gives visualization and analysis applications predictable access to data in Accumulo.
Slider – a YARN application for deploying non-YARN-enabled applications in a YARN cluster. Slider consists of a YARN ApplicationMaster (the "Slider AM") and a client application that communicates with YARN and the Slider AM via remote procedure calls and/or REST requests. The client offers command-line access as well as low-level API access for test purposes. The deployed application must be a program that can run across a pool of YARN-managed servers, dynamically locating its peers; it is not Slider's responsibility to configure the peer servers, apart from some initial application-specific instance configuration.
HAWQ – a Hadoop-native SQL query engine that combines the key technological advantages of an MPP database with the scalability and convenience of Hadoop. HAWQ reads data from and writes data to HDFS natively. Features: robust ANSI SQL compliance (SQL-92, SQL-99, SQL-2003, OLAP extensions); full transaction capability and consistency guarantees (ACID); standard connectivity (JDBC/ODBC); Hadoop-native from storage (HDFS) and resource management (YARN) to deployment (Ambari); support for most third-party tools (Tableau, SAS, et al.).
2. Paper: http://static.googleusercontent.com/media/research.google.com/en//archive/gfs-sosp2003.pdf Immutable – no random writes yet; editing a file in place (e.g., "vi hdfs://file") is not possible. DataNodes handle read and write requests, i.e., an HDFS client talks directly to DataNodes.
  3. https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html
4. http://hortonworks.com/blog/hdfs-metadata-directories-explained/
fsimage – an fsimage file contains the complete state of the file system at a point in time. Every file-system modification is assigned a unique, monotonically increasing transaction ID; an fsimage file represents the file-system state after all modifications up to a specific transaction ID.
edits – an edits file is a log that lists each file-system change (file creation, deletion, or modification) made after the most recent fsimage.
NameNode in_use.lock – a lock file held by the NameNode process, used to prevent multiple NameNode processes from starting up and concurrently modifying the directory.
DataNode in_use.lock – a lock file held by the DataNode process, used for the same purpose.
5. http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html The fundamental idea of YARN is to split the functionalities of resource management and job scheduling/monitoring into separate daemons: a global ResourceManager (RM) and a per-application ApplicationMaster (AM). An application is either a single job or a DAG of jobs. The ResourceManager and the NodeManager form the data-computation framework. The ResourceManager is the ultimate authority that arbitrates resources among all the applications in the system. The NodeManager is the per-machine framework agent that is responsible for containers, monitoring their resource usage (CPU, memory, disk, network) and reporting the same to the ResourceManager/Scheduler. The per-application ApplicationMaster is, in effect, a framework-specific library tasked with negotiating resources from the ResourceManager and working with the NodeManager(s) to execute and monitor the tasks. http://web.eecs.umich.edu/~mosharaf/Readings/YARN.pdf A shared cluster can execute any distributed workload, versus purpose-built clusters for one or a few types of workloads (e.g., Vertica).
6. The ResourceManager has two main components: Scheduler and ApplicationsManager. The Scheduler is responsible for allocating resources to applications. It is a pure scheduler: it performs no monitoring or tracking of application status, and it offers no guarantees about restarting tasks that fail due to application or hardware failures. The Scheduler performs its scheduling function based on the resource requirements of the applications; it does so based on the abstract notion of a resource Container, which incorporates elements such as memory, CPU, disk, and network. The Scheduler has a pluggable policy that is responsible for partitioning the cluster resources among the various queues, applications, etc.; the CapacityScheduler and the FairScheduler are examples of plug-ins. The ApplicationsManager is responsible for accepting job submissions, negotiating the first container for executing the application-specific ApplicationMaster, and providing the service for restarting the ApplicationMaster container on failure. The per-application ApplicationMaster has the responsibility of negotiating appropriate resource containers from the Scheduler, tracking their status, and monitoring progress.
7. http://web.eecs.umich.edu/~mosharaf/Readings/YARN.pdf From the paper – "We ran experiments on a small (10 machine) cluster, to highlight the potential impact of work-preserving preemption. The cluster runs CapacityScheduler, configured with two queues A and B, respectively entitled to 80% and 20% of the capacity. A MapReduce job is submitted in the smaller queue B, and after a few minutes another MapReduce job is submitted in the larger queue A. In the graph, we show the capacity assigned to each queue under three configurations: 1) no capacity is offered to a queue beyond its guarantee (fixed capacity) 2) queues may consume 100% of the cluster capacity, but no preemption is performed, and 3) queues may consume 100% of the cluster capacity, but containers may be preempted. Work-preserving preemption allows the scheduler to overcommit resources for queue B without worrying about starving applications in queue A. When applications in queue A request resources, the scheduler issues preemption requests, which are serviced by the ApplicationMaster by checkpointing its tasks and yielding containers. This allows queue A to obtain all its guaranteed capacity (80% of cluster) in a few seconds, as opposed to case (2) in which the capacity rebalancing takes about 20 minutes. Finally, since the preemption we use is checkpoint-based and does not waste work, the job running in B can restart tasks from where they left off, and it does so efficiently"
8. http://static.googleusercontent.com/media/research.google.com/en//archive/mapreduce-osdi04.pdf The users of MapReduce specify the number of reduce tasks/output files that they desire (R). Partitioning – data gets partitioned across reducer tasks using a partitioning function on the intermediate key. A default partitioning function is provided that uses hashing (e.g., "hash(key) mod R"). In some cases, however, it is useful to partition data by some other function of the key. For example, sometimes the output keys are URLs, and we want all entries for a single host to end up in the same output file. To support situations like this, the user of the MapReduce library can provide a special partitioning function. For example, using "hash(Hostname(urlkey)) mod R" as the partitioning function causes all URLs from the same host to end up in the same output file.
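A hedged sketch of such a custom partitioner with the Hadoop Java API (class name and key/value types are illustrative):

```java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Routes all URLs from the same host to the same reducer, analogous to the
// "hash(Hostname(urlkey)) mod R" example from the MapReduce paper.
public class HostPartitioner extends Partitioner<Text, IntWritable> {
  @Override
  public int getPartition(Text urlKey, IntWritable value, int numPartitions) {
    String host = java.net.URI.create(urlKey.toString()).getHost();
    if (host == null) host = urlKey.toString(); // fall back for odd keys
    // Mask the sign bit so the modulo result is always a valid partition.
    return (host.hashCode() & Integer.MAX_VALUE) % numPartitions;
  }
}
```

It would be wired in with job.setPartitionerClass(HostPartitioner.class) when configuring the job.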
  9. http://static.googleusercontent.com/media/research.google.com/en//archive/mapreduce-osdi04.pdf
  10. WordCount Example Pictures taken from paper - http://static.googleusercontent.com/media/research.google.com/en//archive/mapreduce-osdi04.pdf
11. Paper: http://dl.acm.org/citation.cfm?id=2742790
Dynamically adapting the execution: runtime graph re-configuration; automatic partition-cardinality estimation (the number of reducers can be tuned at runtime based on the number of partitions and/or other parameters).
Scheduling optimizations: some tasks within a vertex can be started early, e.g., in the shuffle operation mentioned above.
Avoiding re-computation through an in-memory cache of intermediate results, e.g., avoiding re-building and re-broadcasting smaller tables in Hive map-joins.
Other ideas: data source initializer, Hive dynamic partition pruning, speculation (speculative additional processing tasks run for stragglers).
  12. Paper: http://dl.acm.org/citation.cfm?id=2742790
  13. Paper: http://dl.acm.org/citation.cfm?id=2742790
14. SQL – complex data types; user-defined functions, table-generating functions, aggregation functions: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF ; windowing functions: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics
  15. https://cwiki.apache.org/confluence/display/Hive/Tutorial#Tutorial-MultiTable/FileInserts
16. Actual data of a leaf column are stored in a data stream. To assist the reader of an ORC file, the metadata of a column are also stored in metadata streams. In the column tree, internal columns (internal nodes of the tree) are used to record metadata, e.g., the length of an array, and so they do not have data streams. Data values of a column are logically broken down into multiple index groups, each with a fixed number of values (configurable, with a default of 10,000).
Sparse indexes – data statistics: used to avoid reading unnecessary data from HDFS; they are created while writing an ORC file. Examples: number of values, the minimum value, the maximum value, the sum, and the length. In an ORC file, data statistics have three levels. File-level statistics (recorded at the end of the file) are used in query optimization and to answer simple aggregation queries. Stripe-level statistics for every column are used to analyze which stripes are needed to evaluate a query; unneeded stripes are not read from HDFS. Inside a stripe, statistics are recorded for every index group, and unnecessary index groups are not read from HDFS. (An index group containing a small number of values can provide more fine-grained statistics about a column; however, the size of the data statistics increases.)
Position pointers – when reading an ORC file, the reader needs to know two kinds of positions to perform efficient reads. Because an ORC file can contain multiple stripes and an HDFS block can contain multiple stripes, position pointers are needed to efficiently locate the starting point of a stripe; those pointers are stored in the file footer of the ORC file. Because a column in a stripe has multiple logical index groups, the starting points of every index group in the metadata streams and data streams are also needed.
ORC enables vectorized query execution, which streamlines operations by processing a block of 1024 (configurable) rows at a time instead of one row at a time.
  17. http://hortonworks.com/blog/hive-0-14-cost-based-optimizer-cbo-technical-overview/
  18. http://hortonworks.com/blog/hive-0-14-cost-based-optimizer-cbo-technical-overview/
19. http://static.cs.brown.edu/courses/cs227/archives/2012/papers/replication/hunt.pdf Atomic Broadcast – all requests that update ZooKeeper state are forwarded to the leader. The leader executes the request and broadcasts the change through Zab (an atomic broadcast protocol). The server that receives the client request responds to the client when it delivers the corresponding state change. Zab uses by default simple majority quorums to decide on a proposal, so Zab, and thus ZooKeeper, can only work if a majority of servers are correct (i.e., with 2f + 1 servers we can tolerate f failures). TCP is used for transport so that message order is maintained by the network, which allows a simpler implementation.
20. Zab – the protocol used while the atomic broadcast service is operational is called broadcast mode. It resembles a simple two-phase commit: a leader proposes a request, collects votes, and finally commits. The two-phase commit protocol is simplified as there are no aborts; followers either acknowledge the leader's proposal or they abandon the leader. The lack of aborts also means the leader can commit once a quorum of servers acknowledges the proposal, rather than waiting for all servers to respond. The broadcast protocol uses FIFO (TCP) channels for all communications. By using FIFO channels, preserving the ordering guarantees becomes very easy: messages are delivered in order through FIFO channels, and as long as messages are processed as they are received, order is preserved. The simplified two-phase commit by itself cannot handle leader failures, so a recovery mode is added to handle them.
21. create(path, data, flags): Creates a znode with path name path, stores data[] in it, and returns the name of the new znode. flags enables a client to select the type of znode (regular or ephemeral) and to set the sequential flag; delete(path, version): Deletes the znode path if that znode is at the expected version; exists(path, watch): Returns true if the znode with path name path exists, and returns false otherwise. The watch flag enables a client to set a watch on the znode; getData(path, watch): Returns the data and meta-data, such as version information, associated with the znode. The watch flag works in the same way as it does for exists(), except that ZooKeeper does not set the watch if the znode does not exist; setData(path, data, version): Writes data[] to znode path if the version number is the current version of the znode; getChildren(path, watch): Returns the set of names of the children of a znode; sync(path): Waits for all updates pending at the start of the operation to propagate to the server that the client is connected to. The path is currently ignored.
  22. Configuration Management configuration is stored in a znode, zc. Processes start up with the full pathname of zc. Starting processes obtain their configuration by reading zc with the watch flag set to true. If the configuration in zc is ever updated, the processes are notified and read the new configuration, again setting the watch flag to true. Note that in this scheme, as in most others that use watches, watches are used to make sure that a process has the most recent information. For example, if a process watching zc is notified of a change to zc and before it can issue a read for zc there are three more changes to zc, the process does not receive three more notification events. This does not affect the behavior of the process, since those three events would have simply notified the process of something it already knows: the information it has for zc is stale. Rendezvous Sometimes in distributed systems, it is not always clear a priori what the final system configuration will look like. For example, a client may want to start a master process and several worker processes, but the starting processes is done by a scheduler, so the client does not know ahead of time information such as addresses and ports that it can give the worker processes to connect to the master. This scenario is handled using a rendezvous znode, zr, which is a node created by the client. The client passes the full pathname of zr as a startup parameter of the master and worker processes. When the master starts it fills in zr with information about addresses and ports it is using. When workers start, they read zr with watch set to true. If zr has not been filled in yet, the worker waits to be notified when zr is updated. If zr is an ephemeral node, master and worker processes can watch for zr to be deleted and clean themselves up when the client ends.
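A hedged sketch of that configuration-management pattern with the ZooKeeper Java client (the znode path is a placeholder). Watches are one-shot, so each read re-arms the watch; coalesced notifications are harmless because every read fetches the latest data anyway:

```java
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.Watcher.Event.EventType;
import org.apache.zookeeper.ZooKeeper;

// Read znode zc with a watch set; on a change notification, re-read and
// re-arm the watch, so the process always ends up with fresh configuration.
public class ConfigWatcher implements Watcher {
  private final ZooKeeper zk;
  private final String zc; // e.g. "/app/config" (placeholder path)

  public ConfigWatcher(ZooKeeper zk, String zc) { this.zk = zk; this.zc = zc; }

  public byte[] read() throws Exception {
    return zk.getData(zc, this, null); // passing 'this' re-arms the watch
  }

  @Override
  public void process(WatchedEvent event) {
    if (event.getType() == EventType.NodeDataChanged) {
      try {
        // Intermediate versions may be skipped; only the latest matters.
        byte[] latest = read();
        System.out.println("config updated: " + new String(latest));
      } catch (Exception e) {
        e.printStackTrace();
      }
    }
  }
}
```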
23. https://cwiki.apache.org/confluence/display/KAFKA/Kafka+data+structures+in+Zookeeper Kafka's ZooKeeper use cases: (1) detecting the addition and removal of brokers and consumers, (2) triggering a rebalance process in each consumer when the above events happen, and (3) maintaining the consumption relationships and keeping track of the consumed offset of each partition. See also http://www.ibm.com/developerworks/library/bd-zookeeper/#resources and http://hortonworks.com/blog/fault-tolerant-nimbus-in-apache-storm/
24. http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf
Stateless broker – the information about how much each consumer has consumed is maintained not by the broker but by the consumer itself; a consumer can deliberately rewind to an old offset and re-consume data.
Partitioned topics – topics consist of one or more partitions that are ordered, immutable sequences of messages. Since writes to a partition are sequential, this design greatly reduces the number of hard-disk seeks (with their resulting latency).
Consumer groups – each consumer group consists of one or more consumers that jointly consume a set of subscribed topics, i.e., each message is delivered to only one of the consumers within the group. Different consumer groups each independently consume the full set of subscribed messages, and no coordination is needed across consumer groups. The consumers within the same group can be in different processes or on different machines. A partition within a topic is the smallest unit of parallelism: at any given time, all messages from one partition are consumed only by a single consumer within each consumer group. Exactly-once delivery typically requires two-phase commits and is not necessary for our applications.
  25. Unlike typical messaging systems, a message stored in Kafka doesn’t have an explicit message id. Instead, each message is addressed by its logical offset in the log. This avoids the overhead of maintaining auxiliary, seek-intensive random-access index structures that map the message ids to the actual message locations. Message ids are increasing but not consecutive. To compute the id of the next message, Kafka adds the length of the current message to its id. A consumer always consumes messages from a particular partition sequentially. If the consumer acknowledges a particular message offset, it implies that the consumer has received all messages prior to that offset in the partition. Under the covers, the consumer is issuing asynchronous pull requests to the broker to have a buffer of data ready for the application to consume. Each client pull request contains the offset of the message from which the consumption begins and an acceptable number of bytes to fetch. Each broker keeps in memory a sorted list of offsets, including the offset of the first message in every segment file. The broker locates the segment file where the requested message resides by searching the offset list, and sends the data back to the consumer. After a consumer receives a message, it computes the offset of the next message to consume and uses it in the next pull request.
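A sketch of offset-based consumption and rewinding with the current Java consumer (the paper-era client differed; broker address and topic are placeholders):

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class RewindExample {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092"); // placeholder
    props.put("group.id", "replayer");
    props.put("key.deserializer",
        "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer",
        "org.apache.kafka.common.serialization.StringDeserializer");

    try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
      TopicPartition tp = new TopicPartition("page-views", 0);
      consumer.assign(Collections.singletonList(tp));
      consumer.seek(tp, 0L); // rewind to an old offset and re-consume
      ConsumerRecords<String, String> records =
          consumer.poll(Duration.ofSeconds(1));
      for (ConsumerRecord<String, String> r : records)
        System.out.printf("offset=%d value=%s%n", r.offset(), r.value());
    }
  }
}
```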
26. https://kafka.apache.org/documentation.html#uses External commit log – the log helps replicate data between nodes and acts as a re-syncing mechanism for failed nodes to restore their data.
27. http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_installing_manually_book/content/determine-hdp-memory-config.html Steps: run the YARN utility script; apply the reserved-memory recommendations; determine the maximum number of containers allowed per node; determine the amount of RAM per container.
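A rough Java paraphrase of that guide's arithmetic; the formula, the reserved-memory figure, and the example node values are assumptions based on the linked documentation, not authoritative settings:

```java
// Pick a container count from cores, disks, and usable RAM, then derive
// per-container RAM and the corresponding YARN memory properties.
public class YarnMemoryCalc {
  public static void main(String[] args) {
    int totalRamGb = 64, cores = 16, disks = 8; // example node (assumed)
    int reservedGb = 8;                         // OS + other services (assumed)
    double minContainerGb = 2.0;                // assumed min container size

    double usable = totalRamGb - reservedGb;
    int containers = (int) Math.min(2.0 * cores,
        Math.min(1.8 * disks, usable / minContainerGb));
    double ramPerContainerGb = Math.max(minContainerGb, usable / containers);

    System.out.printf("yarn.nodemanager.resource.memory-mb = %.0f%n",
        containers * ramPerContainerGb * 1024);
    System.out.printf("yarn.scheduler.minimum-allocation-mb = %.0f%n",
        ramPerContainerGb * 1024);
  }
}
```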