SlideShare a Scribd company logo
Presented By: Sanjay Sharma Advanced Hadoop Tuning and Optimizations Download the Whitepaper: Deriving Intelligence from Large Data  Using Hadoop and Applying Analytics at  http://bit.ly/cNCCGj
Hadoop- The Good/Bad/Ugly ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Download the Whitepaper: Deriving Intelligence from Large Data  Using Hadoop and Applying Analytics at  http://bit.ly/cNCCGj
Configuration parameters
Compression ,[object Object],[object Object],[object Object],[object Object],[object Object],Download the Whitepaper: Deriving Intelligence from Large Data  Using Hadoop and Applying Analytics at  http://bit.ly/cNCCGj
Speculative Execution ,[object Object],[object Object],[object Object],[object Object],[object Object],Download the Whitepaper: Deriving Intelligence from Large Data  Using Hadoop and Applying Analytics at  http://bit.ly/cNCCGj
Number of Maps/Reducers ,[object Object],[object Object],[object Object],[object Object],[object Object],Download the Whitepaper: Deriving Intelligence from Large Data  Using Hadoop and Applying Analytics at  http://bit.ly/cNCCGj
File block size ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Download the Whitepaper: Deriving Intelligence from Large Data  Using Hadoop and Applying Analytics at  http://bit.ly/cNCCGj
Sort size ,[object Object],[object Object],[object Object],[object Object],[object Object],Download the Whitepaper: Deriving Intelligence from Large Data  Using Hadoop and Applying Analytics at  http://bit.ly/cNCCGj
Sort factor ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Download the Whitepaper: Deriving Intelligence from Large Data  Using Hadoop and Applying Analytics at  http://bit.ly/cNCCGj
JVM reuse ,[object Object],[object Object],[object Object],Download the Whitepaper: Deriving Intelligence from Large Data  Using Hadoop and Applying Analytics at  http://bit.ly/cNCCGj
Reduce parallel copies ,[object Object],[object Object],[object Object],[object Object],Download the Whitepaper: Deriving Intelligence from Large Data  Using Hadoop and Applying Analytics at  http://bit.ly/cNCCGj
The Other Threads ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Download the Whitepaper: Deriving Intelligence from Large Data  Using Hadoop and Applying Analytics at  http://bit.ly/cNCCGj
Other hotspots
Revelation-Temporary space ,[object Object],[object Object],[object Object],[object Object],Download the Whitepaper: Deriving Intelligence from Large Data  Using Hadoop and Applying Analytics at  http://bit.ly/cNCCGj
Java- JVM ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Logging ,[object Object],[object Object],[object Object],[object Object],[object Object],Download the Whitepaper: Deriving Intelligence from Large Data  Using Hadoop and Applying Analytics at  http://bit.ly/cNCCGj
Static Data strategies ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Download the Whitepaper: Deriving Intelligence from Large Data  Using Hadoop and Applying Analytics at  http://bit.ly/cNCCGj
Debugging and profiling - Arun C Murthy ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Tuning - Arun C Murthy ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Next steps ,[object Object],[object Object],[object Object],[object Object],[object Object],Download the Whitepaper: Deriving Intelligence from Large Data  Using Hadoop and Applying Analytics at  http://bit.ly/cNCCGj
Conclusion ,[object Object],Download the Whitepaper: Deriving Intelligence from Large Data  Using Hadoop and Applying Analytics at  http://bit.ly/cNCCGj
References ,[object Object],[object Object],[object Object],Download the Whitepaper: Deriving Intelligence from Large Data  Using Hadoop and Applying Analytics at  http://bit.ly/cNCCGj

More Related Content

What's hot

Hive: Loading Data
Hive: Loading DataHive: Loading Data
Hive: Loading Data
Benjamin Leonhardi
 
What's New in Apache Hive
What's New in Apache HiveWhat's New in Apache Hive
What's New in Apache Hive
DataWorks Summit
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization Opportunities
Databricks
 
Deep dive into stateful stream processing in structured streaming by Tathaga...
Deep dive into stateful stream processing in structured streaming  by Tathaga...Deep dive into stateful stream processing in structured streaming  by Tathaga...
Deep dive into stateful stream processing in structured streaming by Tathaga...
Databricks
 
Reduce Storage Costs by 5x Using The New HDFS Tiered Storage Feature
Reduce Storage Costs by 5x Using The New HDFS Tiered Storage Feature Reduce Storage Costs by 5x Using The New HDFS Tiered Storage Feature
Reduce Storage Costs by 5x Using The New HDFS Tiered Storage Feature
DataWorks Summit
 
Hadoop Operations - Best Practices from the Field
Hadoop Operations - Best Practices from the FieldHadoop Operations - Best Practices from the Field
Hadoop Operations - Best Practices from the Field
DataWorks Summit
 
Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...
Databricks
 
Cosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle ServiceCosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle Service
Databricks
 
Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDB
MongoDB
 
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
HostedbyConfluent
 
MapReduce Scheduling Algorithms
MapReduce Scheduling AlgorithmsMapReduce Scheduling Algorithms
MapReduce Scheduling Algorithms
Leila panahi
 
Performance Troubleshooting Using Apache Spark Metrics
Performance Troubleshooting Using Apache Spark MetricsPerformance Troubleshooting Using Apache Spark Metrics
Performance Troubleshooting Using Apache Spark Metrics
Databricks
 
A Deep Dive into Structured Streaming: Apache Spark Meetup at Bloomberg 2016
A Deep Dive into Structured Streaming:  Apache Spark Meetup at Bloomberg 2016 A Deep Dive into Structured Streaming:  Apache Spark Meetup at Bloomberg 2016
A Deep Dive into Structured Streaming: Apache Spark Meetup at Bloomberg 2016
Databricks
 
Apache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & InternalsApache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & Internals
Anton Kirillov
 
Yarn by default (Spark on YARN)
Yarn by default (Spark on YARN)Yarn by default (Spark on YARN)
Yarn by default (Spark on YARN)
Ferran Galí Reniu
 
Introduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of LuceneIntroduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of Lucene
Rahul Jain
 
YARN Federation
YARN Federation YARN Federation
Handling Redis failover with ZooKeeper
Handling Redis failover with ZooKeeperHandling Redis failover with ZooKeeper
Handling Redis failover with ZooKeeperryanlecompte
 
Introduction to apache spark
Introduction to apache spark Introduction to apache spark
Introduction to apache spark
Aakashdata
 
Apache Spark Overview
Apache Spark OverviewApache Spark Overview
Apache Spark Overview
Vadim Y. Bichutskiy
 

What's hot (20)

Hive: Loading Data
Hive: Loading DataHive: Loading Data
Hive: Loading Data
 
What's New in Apache Hive
What's New in Apache HiveWhat's New in Apache Hive
What's New in Apache Hive
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization Opportunities
 
Deep dive into stateful stream processing in structured streaming by Tathaga...
Deep dive into stateful stream processing in structured streaming  by Tathaga...Deep dive into stateful stream processing in structured streaming  by Tathaga...
Deep dive into stateful stream processing in structured streaming by Tathaga...
 
Reduce Storage Costs by 5x Using The New HDFS Tiered Storage Feature
Reduce Storage Costs by 5x Using The New HDFS Tiered Storage Feature Reduce Storage Costs by 5x Using The New HDFS Tiered Storage Feature
Reduce Storage Costs by 5x Using The New HDFS Tiered Storage Feature
 
Hadoop Operations - Best Practices from the Field
Hadoop Operations - Best Practices from the FieldHadoop Operations - Best Practices from the Field
Hadoop Operations - Best Practices from the Field
 
Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...
 
Cosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle ServiceCosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle Service
 
Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDB
 
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
 
MapReduce Scheduling Algorithms
MapReduce Scheduling AlgorithmsMapReduce Scheduling Algorithms
MapReduce Scheduling Algorithms
 
Performance Troubleshooting Using Apache Spark Metrics
Performance Troubleshooting Using Apache Spark MetricsPerformance Troubleshooting Using Apache Spark Metrics
Performance Troubleshooting Using Apache Spark Metrics
 
A Deep Dive into Structured Streaming: Apache Spark Meetup at Bloomberg 2016
A Deep Dive into Structured Streaming:  Apache Spark Meetup at Bloomberg 2016 A Deep Dive into Structured Streaming:  Apache Spark Meetup at Bloomberg 2016
A Deep Dive into Structured Streaming: Apache Spark Meetup at Bloomberg 2016
 
Apache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & InternalsApache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & Internals
 
Yarn by default (Spark on YARN)
Yarn by default (Spark on YARN)Yarn by default (Spark on YARN)
Yarn by default (Spark on YARN)
 
Introduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of LuceneIntroduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of Lucene
 
YARN Federation
YARN Federation YARN Federation
YARN Federation
 
Handling Redis failover with ZooKeeper
Handling Redis failover with ZooKeeperHandling Redis failover with ZooKeeper
Handling Redis failover with ZooKeeper
 
Introduction to apache spark
Introduction to apache spark Introduction to apache spark
Introduction to apache spark
 
Apache Spark Overview
Apache Spark OverviewApache Spark Overview
Apache Spark Overview
 

Similar to Advanced Hadoop Tuning and Optimization - Hadoop Consulting

Apache Hadoop India Summit 2011 talk "Hadoop Map-Reduce Programming & Best Pr...
Apache Hadoop India Summit 2011 talk "Hadoop Map-Reduce Programming & Best Pr...Apache Hadoop India Summit 2011 talk "Hadoop Map-Reduce Programming & Best Pr...
Apache Hadoop India Summit 2011 talk "Hadoop Map-Reduce Programming & Best Pr...Yahoo Developer Network
 
Hadoop online-training
Hadoop online-trainingHadoop online-training
Hadoop online-training
Geohedrick
 
Hw09 Production Deep Dive With High Availability
Hw09   Production Deep Dive With High AvailabilityHw09   Production Deep Dive With High Availability
Hw09 Production Deep Dive With High AvailabilityCloudera, Inc.
 
Hadoop live online training
Hadoop live online trainingHadoop live online training
Hadoop live online training
Harika583
 
Hadoop and Mapreduce Introduction
Hadoop and Mapreduce IntroductionHadoop and Mapreduce Introduction
Hadoop and Mapreduce Introduction
rajsandhu1989
 
Power Hadoop Cluster with AWS Cloud
Power Hadoop Cluster with AWS CloudPower Hadoop Cluster with AWS Cloud
Power Hadoop Cluster with AWS Cloud
Edureka!
 
Big Data and Hadoop in Cloud - Leveraging Amazon EMR
Big Data and Hadoop in Cloud - Leveraging Amazon EMRBig Data and Hadoop in Cloud - Leveraging Amazon EMR
Big Data and Hadoop in Cloud - Leveraging Amazon EMR
Vijay Rayapati
 
Hadoop Interview Questions and Answers by rohit kapa
Hadoop Interview Questions and Answers by rohit kapaHadoop Interview Questions and Answers by rohit kapa
Hadoop Interview Questions and Answers by rohit kapa
kapa rohit
 
Hadoop Summit Amsterdam 2014: Capacity Planning In Multi-tenant Hadoop Deploy...
Hadoop Summit Amsterdam 2014: Capacity Planning In Multi-tenant Hadoop Deploy...Hadoop Summit Amsterdam 2014: Capacity Planning In Multi-tenant Hadoop Deploy...
Hadoop Summit Amsterdam 2014: Capacity Planning In Multi-tenant Hadoop Deploy...
Sumeet Singh
 
Learn what is Hadoop-and-BigData
Learn  what is Hadoop-and-BigDataLearn  what is Hadoop-and-BigData
Learn what is Hadoop-and-BigData
Thanusha154
 
Introduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to HadoopIntroduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to Hadoop
GERARDO BARBERENA
 
Hadoop Network Performance profile
Hadoop Network Performance profileHadoop Network Performance profile
Hadoop Network Performance profilepramodbiligiri
 
Hadoop performance optimization tips
Hadoop performance optimization tipsHadoop performance optimization tips
Hadoop performance optimization tips
Subhas Kumar Ghosh
 
Hadoop tutorial hand-outs
Hadoop tutorial hand-outsHadoop tutorial hand-outs
Hadoop tutorial hand-outspardhavi reddy
 
Harnessing Hadoop and Big Data to Reduce Execution Times
Harnessing Hadoop and Big Data to Reduce Execution TimesHarnessing Hadoop and Big Data to Reduce Execution Times
Harnessing Hadoop and Big Data to Reduce Execution Times
David Tjahjono,MD,MBA(UK)
 
Seminar Presentation Hadoop
Seminar Presentation HadoopSeminar Presentation Hadoop
Seminar Presentation HadoopVarun Narang
 
Schedulers optimization to handle multiple jobs in hadoop cluster
Schedulers optimization to handle multiple jobs in hadoop clusterSchedulers optimization to handle multiple jobs in hadoop cluster
Schedulers optimization to handle multiple jobs in hadoop cluster
Shivraj Raj
 
Harnessing Hadoop: Understanding the Big Data Processing Options for Optimizi...
Harnessing Hadoop: Understanding the Big Data Processing Options for Optimizi...Harnessing Hadoop: Understanding the Big Data Processing Options for Optimizi...
Harnessing Hadoop: Understanding the Big Data Processing Options for Optimizi...
Cognizant
 
Hadoop training-in-hyderabad
Hadoop training-in-hyderabadHadoop training-in-hyderabad
Hadoop training-in-hyderabad
sreehari orienit
 

Similar to Advanced Hadoop Tuning and Optimization - Hadoop Consulting (20)

Apache Hadoop India Summit 2011 talk "Hadoop Map-Reduce Programming & Best Pr...
Apache Hadoop India Summit 2011 talk "Hadoop Map-Reduce Programming & Best Pr...Apache Hadoop India Summit 2011 talk "Hadoop Map-Reduce Programming & Best Pr...
Apache Hadoop India Summit 2011 talk "Hadoop Map-Reduce Programming & Best Pr...
 
Hadoop online-training
Hadoop online-trainingHadoop online-training
Hadoop online-training
 
Hw09 Production Deep Dive With High Availability
Hw09   Production Deep Dive With High AvailabilityHw09   Production Deep Dive With High Availability
Hw09 Production Deep Dive With High Availability
 
Hadoop live online training
Hadoop live online trainingHadoop live online training
Hadoop live online training
 
Hadoop and Mapreduce Introduction
Hadoop and Mapreduce IntroductionHadoop and Mapreduce Introduction
Hadoop and Mapreduce Introduction
 
Power Hadoop Cluster with AWS Cloud
Power Hadoop Cluster with AWS CloudPower Hadoop Cluster with AWS Cloud
Power Hadoop Cluster with AWS Cloud
 
Big Data and Hadoop in Cloud - Leveraging Amazon EMR
Big Data and Hadoop in Cloud - Leveraging Amazon EMRBig Data and Hadoop in Cloud - Leveraging Amazon EMR
Big Data and Hadoop in Cloud - Leveraging Amazon EMR
 
Hadoop Interview Questions and Answers by rohit kapa
Hadoop Interview Questions and Answers by rohit kapaHadoop Interview Questions and Answers by rohit kapa
Hadoop Interview Questions and Answers by rohit kapa
 
Hadoop Summit Amsterdam 2014: Capacity Planning In Multi-tenant Hadoop Deploy...
Hadoop Summit Amsterdam 2014: Capacity Planning In Multi-tenant Hadoop Deploy...Hadoop Summit Amsterdam 2014: Capacity Planning In Multi-tenant Hadoop Deploy...
Hadoop Summit Amsterdam 2014: Capacity Planning In Multi-tenant Hadoop Deploy...
 
Learn what is Hadoop-and-BigData
Learn  what is Hadoop-and-BigDataLearn  what is Hadoop-and-BigData
Learn what is Hadoop-and-BigData
 
Introduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to HadoopIntroduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to Hadoop
 
Hadoop Network Performance profile
Hadoop Network Performance profileHadoop Network Performance profile
Hadoop Network Performance profile
 
Hadoop performance optimization tips
Hadoop performance optimization tipsHadoop performance optimization tips
Hadoop performance optimization tips
 
Hadoop
HadoopHadoop
Hadoop
 
Hadoop tutorial hand-outs
Hadoop tutorial hand-outsHadoop tutorial hand-outs
Hadoop tutorial hand-outs
 
Harnessing Hadoop and Big Data to Reduce Execution Times
Harnessing Hadoop and Big Data to Reduce Execution TimesHarnessing Hadoop and Big Data to Reduce Execution Times
Harnessing Hadoop and Big Data to Reduce Execution Times
 
Seminar Presentation Hadoop
Seminar Presentation HadoopSeminar Presentation Hadoop
Seminar Presentation Hadoop
 
Schedulers optimization to handle multiple jobs in hadoop cluster
Schedulers optimization to handle multiple jobs in hadoop clusterSchedulers optimization to handle multiple jobs in hadoop cluster
Schedulers optimization to handle multiple jobs in hadoop cluster
 
Harnessing Hadoop: Understanding the Big Data Processing Options for Optimizi...
Harnessing Hadoop: Understanding the Big Data Processing Options for Optimizi...Harnessing Hadoop: Understanding the Big Data Processing Options for Optimizi...
Harnessing Hadoop: Understanding the Big Data Processing Options for Optimizi...
 
Hadoop training-in-hyderabad
Hadoop training-in-hyderabadHadoop training-in-hyderabad
Hadoop training-in-hyderabad
 

More from Impetus Technologies

Data Warehouse Modernization Webinar Series- Critical Trends, Implementation ...
Data Warehouse Modernization Webinar Series- Critical Trends, Implementation ...Data Warehouse Modernization Webinar Series- Critical Trends, Implementation ...
Data Warehouse Modernization Webinar Series- Critical Trends, Implementation ...
Impetus Technologies
 
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix WebinarFuture-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Impetus Technologies
 
Building Real-time Streaming Apps in Minutes- Impetus Webinar
Building Real-time Streaming Apps in Minutes- Impetus WebinarBuilding Real-time Streaming Apps in Minutes- Impetus Webinar
Building Real-time Streaming Apps in Minutes- Impetus Webinar
Impetus Technologies
 
Smart Enterprise Big Data Bus for the Modern Responsive Enterprise- StreamAna...
Smart Enterprise Big Data Bus for the Modern Responsive Enterprise- StreamAna...Smart Enterprise Big Data Bus for the Modern Responsive Enterprise- StreamAna...
Smart Enterprise Big Data Bus for the Modern Responsive Enterprise- StreamAna...
Impetus Technologies
 
Impetus White Paper- Handling Data Corruption in Elasticsearch
Impetus White Paper- Handling  Data Corruption  in ElasticsearchImpetus White Paper- Handling  Data Corruption  in Elasticsearch
Impetus White Paper- Handling Data Corruption in Elasticsearch
Impetus Technologies
 
Real-world Applications of Streaming Analytics- StreamAnalytix Webinar
Real-world Applications of Streaming Analytics- StreamAnalytix WebinarReal-world Applications of Streaming Analytics- StreamAnalytix Webinar
Real-world Applications of Streaming Analytics- StreamAnalytix Webinar
Impetus Technologies
 
Real-world Applications of Streaming Analytics- StreamAnalytix Webinar
Real-world Applications of Streaming Analytics- StreamAnalytix WebinarReal-world Applications of Streaming Analytics- StreamAnalytix Webinar
Real-world Applications of Streaming Analytics- StreamAnalytix Webinar
Impetus Technologies
 
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...
Impetus Technologies
 
Accelerating Hadoop Solution Lifecycle and Improving ROI- Impetus On-demand W...
Accelerating Hadoop Solution Lifecycle and Improving ROI- Impetus On-demand W...Accelerating Hadoop Solution Lifecycle and Improving ROI- Impetus On-demand W...
Accelerating Hadoop Solution Lifecycle and Improving ROI- Impetus On-demand W...
Impetus Technologies
 
Deep Learning: Evolution of ML from Statistical to Brain-like Computing- Data...
Deep Learning: Evolution of ML from Statistical to Brain-like Computing- Data...Deep Learning: Evolution of ML from Statistical to Brain-like Computing- Data...
Deep Learning: Evolution of ML from Statistical to Brain-like Computing- Data...
Impetus Technologies
 
SPARK USE CASE- Distributed Reinforcement Learning for Electricity Market Bi...
SPARK USE CASE-  Distributed Reinforcement Learning for Electricity Market Bi...SPARK USE CASE-  Distributed Reinforcement Learning for Electricity Market Bi...
SPARK USE CASE- Distributed Reinforcement Learning for Electricity Market Bi...
Impetus Technologies
 
Enterprise Ready Android and Manageability- Impetus Webcast
Enterprise Ready Android and Manageability- Impetus WebcastEnterprise Ready Android and Manageability- Impetus Webcast
Enterprise Ready Android and Manageability- Impetus Webcast
Impetus Technologies
 
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...
Impetus Technologies
 
Leveraging NoSQL Database Technology to Implement Real-time Data Architecture...
Leveraging NoSQL Database Technology to Implement Real-time Data Architecture...Leveraging NoSQL Database Technology to Implement Real-time Data Architecture...
Leveraging NoSQL Database Technology to Implement Real-time Data Architecture...
Impetus Technologies
 
Maturity of Mobile Test Automation: Approaches and Future Trends- Impetus Web...
Maturity of Mobile Test Automation: Approaches and Future Trends- Impetus Web...Maturity of Mobile Test Automation: Approaches and Future Trends- Impetus Web...
Maturity of Mobile Test Automation: Approaches and Future Trends- Impetus Web...
Impetus Technologies
 
Big Data Analytics with Storm, Spark and GraphLab
Big Data Analytics with Storm, Spark and GraphLabBig Data Analytics with Storm, Spark and GraphLab
Big Data Analytics with Storm, Spark and GraphLab
Impetus Technologies
 
Webinar maturity of mobile test automation- approaches and future trends
Webinar  maturity of mobile test automation- approaches and future trendsWebinar  maturity of mobile test automation- approaches and future trends
Webinar maturity of mobile test automation- approaches and future trendsImpetus Technologies
 
Next generation analytics with yarn, spark and graph lab
Next generation analytics with yarn, spark and graph labNext generation analytics with yarn, spark and graph lab
Next generation analytics with yarn, spark and graph lab
Impetus Technologies
 
The Shared Elephant - Hadoop as a Shared Service for Multiple Departments – I...
The Shared Elephant - Hadoop as a Shared Service for Multiple Departments – I...The Shared Elephant - Hadoop as a Shared Service for Multiple Departments – I...
The Shared Elephant - Hadoop as a Shared Service for Multiple Departments – I...
Impetus Technologies
 
Performance Testing of Big Data Applications - Impetus Webcast
Performance Testing of Big Data Applications - Impetus WebcastPerformance Testing of Big Data Applications - Impetus Webcast
Performance Testing of Big Data Applications - Impetus Webcast
Impetus Technologies
 

More from Impetus Technologies (20)

Data Warehouse Modernization Webinar Series- Critical Trends, Implementation ...
Data Warehouse Modernization Webinar Series- Critical Trends, Implementation ...Data Warehouse Modernization Webinar Series- Critical Trends, Implementation ...
Data Warehouse Modernization Webinar Series- Critical Trends, Implementation ...
 
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix WebinarFuture-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
 
Building Real-time Streaming Apps in Minutes- Impetus Webinar
Building Real-time Streaming Apps in Minutes- Impetus WebinarBuilding Real-time Streaming Apps in Minutes- Impetus Webinar
Building Real-time Streaming Apps in Minutes- Impetus Webinar
 
Smart Enterprise Big Data Bus for the Modern Responsive Enterprise- StreamAna...
Smart Enterprise Big Data Bus for the Modern Responsive Enterprise- StreamAna...Smart Enterprise Big Data Bus for the Modern Responsive Enterprise- StreamAna...
Smart Enterprise Big Data Bus for the Modern Responsive Enterprise- StreamAna...
 
Impetus White Paper- Handling Data Corruption in Elasticsearch
Impetus White Paper- Handling  Data Corruption  in ElasticsearchImpetus White Paper- Handling  Data Corruption  in Elasticsearch
Impetus White Paper- Handling Data Corruption in Elasticsearch
 
Real-world Applications of Streaming Analytics- StreamAnalytix Webinar
Real-world Applications of Streaming Analytics- StreamAnalytix WebinarReal-world Applications of Streaming Analytics- StreamAnalytix Webinar
Real-world Applications of Streaming Analytics- StreamAnalytix Webinar
 
Real-world Applications of Streaming Analytics- StreamAnalytix Webinar
Real-world Applications of Streaming Analytics- StreamAnalytix WebinarReal-world Applications of Streaming Analytics- StreamAnalytix Webinar
Real-world Applications of Streaming Analytics- StreamAnalytix Webinar
 
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...
 
Accelerating Hadoop Solution Lifecycle and Improving ROI- Impetus On-demand W...
Accelerating Hadoop Solution Lifecycle and Improving ROI- Impetus On-demand W...Accelerating Hadoop Solution Lifecycle and Improving ROI- Impetus On-demand W...
Accelerating Hadoop Solution Lifecycle and Improving ROI- Impetus On-demand W...
 
Deep Learning: Evolution of ML from Statistical to Brain-like Computing- Data...
Deep Learning: Evolution of ML from Statistical to Brain-like Computing- Data...Deep Learning: Evolution of ML from Statistical to Brain-like Computing- Data...
Deep Learning: Evolution of ML from Statistical to Brain-like Computing- Data...
 
SPARK USE CASE- Distributed Reinforcement Learning for Electricity Market Bi...
SPARK USE CASE-  Distributed Reinforcement Learning for Electricity Market Bi...SPARK USE CASE-  Distributed Reinforcement Learning for Electricity Market Bi...
SPARK USE CASE- Distributed Reinforcement Learning for Electricity Market Bi...
 
Enterprise Ready Android and Manageability- Impetus Webcast
Enterprise Ready Android and Manageability- Impetus WebcastEnterprise Ready Android and Manageability- Impetus Webcast
Enterprise Ready Android and Manageability- Impetus Webcast
 
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...
 
Leveraging NoSQL Database Technology to Implement Real-time Data Architecture...
Leveraging NoSQL Database Technology to Implement Real-time Data Architecture...Leveraging NoSQL Database Technology to Implement Real-time Data Architecture...
Leveraging NoSQL Database Technology to Implement Real-time Data Architecture...
 
Maturity of Mobile Test Automation: Approaches and Future Trends- Impetus Web...
Maturity of Mobile Test Automation: Approaches and Future Trends- Impetus Web...Maturity of Mobile Test Automation: Approaches and Future Trends- Impetus Web...
Maturity of Mobile Test Automation: Approaches and Future Trends- Impetus Web...
 
Big Data Analytics with Storm, Spark and GraphLab
Big Data Analytics with Storm, Spark and GraphLabBig Data Analytics with Storm, Spark and GraphLab
Big Data Analytics with Storm, Spark and GraphLab
 
Webinar maturity of mobile test automation- approaches and future trends
Webinar  maturity of mobile test automation- approaches and future trendsWebinar  maturity of mobile test automation- approaches and future trends
Webinar maturity of mobile test automation- approaches and future trends
 
Next generation analytics with yarn, spark and graph lab
Next generation analytics with yarn, spark and graph labNext generation analytics with yarn, spark and graph lab
Next generation analytics with yarn, spark and graph lab
 
The Shared Elephant - Hadoop as a Shared Service for Multiple Departments – I...
The Shared Elephant - Hadoop as a Shared Service for Multiple Departments – I...The Shared Elephant - Hadoop as a Shared Service for Multiple Departments – I...
The Shared Elephant - Hadoop as a Shared Service for Multiple Departments – I...
 
Performance Testing of Big Data Applications - Impetus Webcast
Performance Testing of Big Data Applications - Impetus WebcastPerformance Testing of Big Data Applications - Impetus Webcast
Performance Testing of Big Data Applications - Impetus Webcast
 

Recently uploaded

Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
Kumud Singh
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
Neo4j
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
Data structures and Algorithms in Python.pdf
Data structures and Algorithms in Python.pdfData structures and Algorithms in Python.pdf
Data structures and Algorithms in Python.pdf
TIPNGVN2
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
Matthew Sinclair
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
DianaGray10
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
James Anderson
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Vladimir Iglovikov, Ph.D.
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 

Recently uploaded (20)

Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
Data structures and Algorithms in Python.pdf
Data structures and Algorithms in Python.pdfData structures and Algorithms in Python.pdf
Data structures and Algorithms in Python.pdf
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 

Advanced Hadoop Tuning and Optimization - Hadoop Consulting

  • 1. Presented By: Sanjay Sharma Advanced Hadoop Tuning and Optimizations Download the Whitepaper: Deriving Intelligence from Large Data Using Hadoop and Applying Analytics at http://bit.ly/cNCCGj
  • 2.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 14.
  • 15.
  • 16.
  • 17.
  • 18.
  • 19.
  • 20.
  • 21.
  • 22.

Editor's Notes

  1. e.g. a node has 8GB main memory + 8 core CPU + swap space Maximum memory required by a task ~ 500MB Memory required by Tasktracker, Datanode and other processes ~ (1 + 1 +1) = 3GB Maximum tasks that can be run = (8-3) GB/500MB = 10 Number of map or reduce task (out of the maximum tasks) can be decided on the basis of memory usage and computation complexities of the tasks. The memory available to each task (JVM) is controlled by mapred.child.java.opts property. The default is –Xmx200m (200 MB). Other JVM options can also be provided in this property.