© 2015 Enterprise Integration News, Inc.
Introduction
Visibility & Optimization for Hadoop
Sean Suchter
Co-Founder and CEO
Agenda
Making Hadoop just work better for varied workloads
Details top challenges in adopting Hadoop
How Pepperdata automatically improves performance, visibility, controls
Bio
A pioneer in production-ready Hadoop
15+ years web search and big data; focus on huge scale, huge impact products
Started the Silicon Valley branch of Microsoft’s Bing engineering & product team
©2015 Pepperdata
VISIBILITY & OPTIMIZATION FOR HADOOP
©2014 Pepperdata
AGENDA
CHALLENGES USING HADOOP
HOW PEPPERDATA ADDRESSES THESE CHALLENGES
Q&A and NEXT STEPS
HADOOP CHALLENGES YOU FACE DAILY
Severity scale: Tactical, Strategic, Very Strategic
Call from your CEO asking “WTF is happening?!?”
Can’t make SEC filing
EOQ and can’t send the invoice
Critical feature on your website is broken!
Online ad impression data unavailable
External customer reports unavailable
Users complaining
Customer churn metrics unavailable
Revenue report doesn’t complete
Have to buy more servers
End user SLAs compromised
Finding root-cause of problems is manual
Low priority jobs taking over the cluster
Ad hoc jobs interfere with production jobs
HBase & MapReduce contention
Rogue jobs hammer cluster performance
Cluster seems near maximum capacity
Developers can’t submit new jobs
HADOOP WASTES VALUABLE CAPACITY
[Chart: physical hardware resource over time, comparing theoretical maximum usage (reservation) with actual physical capacity used]
1. Production clusters are sized for peak SLA with lots of headroom, so capacity is wasted
2. Ad-hoc jobs consume capacity from high-priority jobs, so companies run them on a separate cluster
3. Hadoop’s allocations are predefined and static, resulting in wasted capacity
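The gap between reservation and actual use described above can be put into numbers with a small calculation. The following is a minimal sketch, not Pepperdata's method: the `Sample` type, the `wasted_fraction` function, and the sample figures are all hypothetical and exist only to illustrate the idea of "reserved but idle" capacity.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One point in time on a hypothetical cluster (illustrative only)."""
    reserved_gb: float  # memory reserved by static allocations
    used_gb: float      # memory physically in use


def wasted_fraction(samples, capacity_gb):
    """Average share of physical capacity that is reserved but sits idle."""
    if not samples or capacity_gb <= 0:
        raise ValueError("need at least one sample and a positive capacity")
    # Idle capacity at each sample is whatever was reserved beyond actual use.
    idle = sum(max(s.reserved_gb - s.used_gb, 0.0) for s in samples)
    return idle / (capacity_gb * len(samples))


# Made-up numbers for a 100 GB cluster sampled three times.
samples = [Sample(90, 40), Sample(95, 55), Sample(100, 70)]
print(f"{wasted_fraction(samples, 100):.0%} of capacity reserved but idle")
```

With these invented samples the reservation stays near the cluster maximum while real use is much lower, which is the pattern the chart on this slide depicts.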
MORE AND MORE WASTED CAPACITY
Over time, more and more clusters are built to isolate the different workloads
Production Cluster, Ad Hoc Cluster, Priority Job Cluster, HBase Cluster, Bulk Load Cluster
But they are full of “holes”!
PEPPERDATA MAKES HADOOP WORK BETTER
FINE-GRAINED VISIBILITY
Monitor CPU, RAM, I/O, network per task, job, user, group
Identify bottlenecks in real-time or at any moment historically
TOTAL PREDICTABILITY
SLA enforcement for true multi-tenancy: dynamically adjusts resource usage
Set policies to protect high-priority jobs
30-50% GREATER THROUGHPUT ON ALREADY HIGHLY TUNED CLUSTERS
Reclaims wasted capacity: use all true hardware capacity
Run more jobs with our Dynamic Capacity Creation
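Per-task monitoring that rolls up to job, user, or group, as listed under fine-grained visibility above, amounts to grouping task-level measurements by a chosen key. The sketch below illustrates that roll-up only; the record fields (`task`, `job`, `user`, `cpu_s`, `ram_gb`) and the `usage_by` helper are assumptions made for this example and do not reflect Pepperdata's data model or API.

```python
from collections import defaultdict


def usage_by(records, key):
    """Sum hypothetical per-task CPU seconds and RAM by the given key."""
    totals = defaultdict(lambda: {"cpu_s": 0.0, "ram_gb": 0.0})
    for r in records:
        bucket = totals[r[key]]
        bucket["cpu_s"] += r["cpu_s"]
        bucket["ram_gb"] += r["ram_gb"]
    return dict(totals)


# Invented task-level measurements for illustration.
records = [
    {"task": "t1", "job": "etl", "user": "ana", "cpu_s": 12.0, "ram_gb": 2.0},
    {"task": "t2", "job": "etl", "user": "ana", "cpu_s": 8.0, "ram_gb": 1.5},
    {"task": "t3", "job": "adhoc", "user": "raj", "cpu_s": 30.0, "ram_gb": 4.0},
]

by_user = usage_by(records, "user")
by_job = usage_by(records, "job")
print(by_user)
print(by_job)
```

The same records answer both "which user is consuming the cluster?" and "which job is the bottleneck?" simply by changing the grouping key.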
PEPPERDATA REAL-TIME ARCHITECTURE
VISIBILITY
Delivers real-time, granular visibility into resource consumption by user, job, and task
CONTROL
Allows user-defined prioritization of Hadoop jobs and automatically allocates resources to ensure jobs run safely
CAPACITY
Reclaims wasted capacity and allows mixed workloads to be shared on a single cluster
[Diagram labels: Developer, Analyst, Financial Report, Product, ETL; Pepperdata Dashboard; Policies; Hadoop Configuration; your existing Hadoop (MapReduce, HBase, etc.); Job Tracker / Resource Manager (Scheduler & YARN)]
FINE-GRAINED VISIBILITY INTO THE CLUSTER
FINE-GRAINED VISIBILITY INTO YOUR CLUSTER
EASILY PINPOINT BOTTLENECKS IN THE CLUSTER
NEXT STEPS
Like what you saw? Want to learn more?
Visit pepperdata.com for more product information.
Visit pepperdata.com/demo to request a free demo from one of our technical experts!
THANK YOU
Questions & Answers
What is the form factor of Pepperdata, and how long does it take to install?
How do we make sure Pepperdata ‘agents’ are where they need to be -- and working?
Sean Suchter
Co-Founder and CEO
We have mixed workloads that often force us to overprovision Hadoop resources.
Does Pepperdata help us deal with this by allowing Hadoop to adjust dynamically?
Sean Suchter
Co-Founder and CEO
Given Pepperdata’s intelligent and dynamic environment, how does that impact the way we do Hadoop prep or set-up?
Sean Suchter
Co-Founder and CEO
How much Hadoop cluster resource does Pepperdata use?
Sean Suchter
Co-Founder and CEO
How do customers use the Pepperdata dashboard? Where is it hosted?
Sean Suchter
Co-Founder and CEO
How is the Pepperdata approach different from YARN?
Sean Suchter
Co-Founder and CEO
Please detail some customer successes from using Pepperdata with Hadoop.
Sean Suchter
Co-Founder and CEO
For More Information
Pepperdata – Rely on Hadoop
http://pepperdata.com/
Visibility Capacity
Control Technology
Learn More About Pepperdata
Product
http://pepperdata.com/products/
Real-Time Architecture
http://pepperdata.com/products/#pd-technology
Benefits
http://pepperdata.com/benefits/
Blog
http://pepperdata.com/blog/
Other Pepperdata Resources (Whitepapers & Case Studies)
http://pepperdata.com/resources/
Request a Demo
http://pepperdata.com/demo/

More Related Content

What's hot

Engaging with Cloudera & Morning Wrap Up
Engaging with Cloudera & Morning Wrap UpEngaging with Cloudera & Morning Wrap Up
Engaging with Cloudera & Morning Wrap UpCloudera, Inc.
 
RightScale Roadtrip - Accelerate to Cloud
RightScale Roadtrip - Accelerate to CloudRightScale Roadtrip - Accelerate to Cloud
RightScale Roadtrip - Accelerate to CloudRightScale
 
Driving Splunk Adoption and Proficiency
Driving Splunk Adoption and ProficiencyDriving Splunk Adoption and Proficiency
Driving Splunk Adoption and ProficiencySplunk
 
Get Started with Big Data in the Cloud ASAP
Get Started with Big Data in the Cloud ASAPGet Started with Big Data in the Cloud ASAP
Get Started with Big Data in the Cloud ASAPHortonworks
 
Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Cloudera, Inc.
 
Operationalizing Data Analytics
Operationalizing Data AnalyticsOperationalizing Data Analytics
Operationalizing Data AnalyticsVMware Tanzu
 
Redis rise of Dataops
Redis rise of DataopsRedis rise of Dataops
Redis rise of Dataopslandoop
 
Big Data Taiwan 2014 Keynote 2: Hadoop and the Future of Data Management
Big Data Taiwan 2014 Keynote 2: Hadoop and the Future of Data ManagementBig Data Taiwan 2014 Keynote 2: Hadoop and the Future of Data Management
Big Data Taiwan 2014 Keynote 2: Hadoop and the Future of Data ManagementEtu Solution
 
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...Cloudera, Inc.
 
The Rise of DataOps: Making Big Data Bite Size with DataOps
The Rise of DataOps: Making Big Data Bite Size with DataOpsThe Rise of DataOps: Making Big Data Bite Size with DataOps
The Rise of DataOps: Making Big Data Bite Size with DataOpsDelphix
 
Four Strategies to Create a DevOps Culture & System that Favors Innovation & ...
Four Strategies to Create a DevOps Culture & System that Favors Innovation & ...Four Strategies to Create a DevOps Culture & System that Favors Innovation & ...
Four Strategies to Create a DevOps Culture & System that Favors Innovation & ...Amazon Web Services
 
Improve The Planner Experience With Groovy
Improve The Planner Experience With GroovyImprove The Planner Experience With Groovy
Improve The Planner Experience With GroovyKyle Goodfriend
 
e-IT exec lunch - "It's all about data" - 25 May '16
e-IT exec lunch - "It's all about data" - 25 May '16e-IT exec lunch - "It's all about data" - 25 May '16
e-IT exec lunch - "It's all about data" - 25 May '16Devin Deen
 
Ensuring Cloud Native Success: The Greenfield Journey
Ensuring Cloud Native Success: The Greenfield JourneyEnsuring Cloud Native Success: The Greenfield Journey
Ensuring Cloud Native Success: The Greenfield JourneyVMware Tanzu
 
Data Center Migration Essentials - Adam Saint-Prix Tim Wong
Data Center Migration Essentials - Adam Saint-Prix Tim WongData Center Migration Essentials - Adam Saint-Prix Tim Wong
Data Center Migration Essentials - Adam Saint-Prix Tim WongAtlassian
 
Velocity 2014 Tool Chain Choices
Velocity 2014 Tool Chain ChoicesVelocity 2014 Tool Chain Choices
Velocity 2014 Tool Chain ChoicesMark Sigler
 

What's hot (17)

Engaging with Cloudera & Morning Wrap Up
Engaging with Cloudera & Morning Wrap UpEngaging with Cloudera & Morning Wrap Up
Engaging with Cloudera & Morning Wrap Up
 
RightScale Roadtrip - Accelerate to Cloud
RightScale Roadtrip - Accelerate to CloudRightScale Roadtrip - Accelerate to Cloud
RightScale Roadtrip - Accelerate to Cloud
 
Driving Splunk Adoption and Proficiency
Driving Splunk Adoption and ProficiencyDriving Splunk Adoption and Proficiency
Driving Splunk Adoption and Proficiency
 
Get Started with Big Data in the Cloud ASAP
Get Started with Big Data in the Cloud ASAPGet Started with Big Data in the Cloud ASAP
Get Started with Big Data in the Cloud ASAP
 
Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8
 
Operationalizing Data Analytics
Operationalizing Data AnalyticsOperationalizing Data Analytics
Operationalizing Data Analytics
 
Redis rise of Dataops
Redis rise of DataopsRedis rise of Dataops
Redis rise of Dataops
 
Big Data Taiwan 2014 Keynote 2: Hadoop and the Future of Data Management
Big Data Taiwan 2014 Keynote 2: Hadoop and the Future of Data ManagementBig Data Taiwan 2014 Keynote 2: Hadoop and the Future of Data Management
Big Data Taiwan 2014 Keynote 2: Hadoop and the Future of Data Management
 
Ramesh kutumbaka resume
Ramesh kutumbaka resumeRamesh kutumbaka resume
Ramesh kutumbaka resume
 
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...
 
The Rise of DataOps: Making Big Data Bite Size with DataOps
The Rise of DataOps: Making Big Data Bite Size with DataOpsThe Rise of DataOps: Making Big Data Bite Size with DataOps
The Rise of DataOps: Making Big Data Bite Size with DataOps
 
Four Strategies to Create a DevOps Culture & System that Favors Innovation & ...
Four Strategies to Create a DevOps Culture & System that Favors Innovation & ...Four Strategies to Create a DevOps Culture & System that Favors Innovation & ...
Four Strategies to Create a DevOps Culture & System that Favors Innovation & ...
 
Improve The Planner Experience With Groovy
Improve The Planner Experience With GroovyImprove The Planner Experience With Groovy
Improve The Planner Experience With Groovy
 
e-IT exec lunch - "It's all about data" - 25 May '16
e-IT exec lunch - "It's all about data" - 25 May '16e-IT exec lunch - "It's all about data" - 25 May '16
e-IT exec lunch - "It's all about data" - 25 May '16
 
Ensuring Cloud Native Success: The Greenfield Journey
Ensuring Cloud Native Success: The Greenfield JourneyEnsuring Cloud Native Success: The Greenfield Journey
Ensuring Cloud Native Success: The Greenfield Journey
 
Data Center Migration Essentials - Adam Saint-Prix Tim Wong
Data Center Migration Essentials - Adam Saint-Prix Tim WongData Center Migration Essentials - Adam Saint-Prix Tim Wong
Data Center Migration Essentials - Adam Saint-Prix Tim Wong
 
Velocity 2014 Tool Chain Choices
Velocity 2014 Tool Chain ChoicesVelocity 2014 Tool Chain Choices
Velocity 2014 Tool Chain Choices
 

Similar to Visibility and Optimization for Hadoop

Sneak Peak into Self-Service, Cross-Enterprise, Job Scheduling with CA Worklo...
Sneak Peak into Self-Service, Cross-Enterprise, Job Scheduling with CA Worklo...Sneak Peak into Self-Service, Cross-Enterprise, Job Scheduling with CA Worklo...
Sneak Peak into Self-Service, Cross-Enterprise, Job Scheduling with CA Worklo...CA Technologies
 
Transform You Business with Big Data and Hortonworks
Transform You Business with Big Data and HortonworksTransform You Business with Big Data and Hortonworks
Transform You Business with Big Data and HortonworksHortonworks
 
Transform Your Business with Big Data and Hortonworks
Transform Your Business with Big Data and Hortonworks Transform Your Business with Big Data and Hortonworks
Transform Your Business with Big Data and Hortonworks Pactera_US
 
Level Up – How to Achieve Hadoop Acceleration
Level Up – How to Achieve Hadoop AccelerationLevel Up – How to Achieve Hadoop Acceleration
Level Up – How to Achieve Hadoop AccelerationInside Analysis
 
Hadoop as an Analytic Platform: Why Not?
Hadoop as an Analytic Platform: Why Not?Hadoop as an Analytic Platform: Why Not?
Hadoop as an Analytic Platform: Why Not?Inside Analysis
 
Automate Hadoop Jobs with Real World Business Impact
Automate Hadoop Jobs with Real World Business ImpactAutomate Hadoop Jobs with Real World Business Impact
Automate Hadoop Jobs with Real World Business ImpactCA Technologies
 
Bridging the Big Data Gap in the Software-Driven World
Bridging the Big Data Gap in the Software-Driven WorldBridging the Big Data Gap in the Software-Driven World
Bridging the Big Data Gap in the Software-Driven WorldCA Technologies
 
Game Changed – How Hadoop is Reinventing Enterprise Thinking
Game Changed – How Hadoop is Reinventing Enterprise ThinkingGame Changed – How Hadoop is Reinventing Enterprise Thinking
Game Changed – How Hadoop is Reinventing Enterprise ThinkingInside Analysis
 
Big Data & SQL: The On-Ramp to Hadoop
Big Data & SQL: The On-Ramp to Hadoop Big Data & SQL: The On-Ramp to Hadoop
Big Data & SQL: The On-Ramp to Hadoop Inside Analysis
 
Has Traditional MDM Finally Met its Match?
Has Traditional MDM Finally Met its Match?Has Traditional MDM Finally Met its Match?
Has Traditional MDM Finally Met its Match?Inside Analysis
 
The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...
The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...
The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...Platfora
 
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...Cloudera, Inc.
 
Rediscover Software Development Edward Hieatt Web Summit 2014
Rediscover Software Development Edward Hieatt Web Summit 2014Rediscover Software Development Edward Hieatt Web Summit 2014
Rediscover Software Development Edward Hieatt Web Summit 2014VMware Tanzu
 
A modern, flexible approach to Hadoop implementation incorporating innovation...
A modern, flexible approach to Hadoop implementation incorporating innovation...A modern, flexible approach to Hadoop implementation incorporating innovation...
A modern, flexible approach to Hadoop implementation incorporating innovation...DataWorks Summit
 
A Tighter Weave – How YARN Changes the Data Quality Game
A Tighter Weave – How YARN Changes the Data Quality GameA Tighter Weave – How YARN Changes the Data Quality Game
A Tighter Weave – How YARN Changes the Data Quality GameInside Analysis
 
Enrich a 360-degree Customer View with Splunk and Apache Hadoop
Enrich a 360-degree Customer View with Splunk and Apache HadoopEnrich a 360-degree Customer View with Splunk and Apache Hadoop
Enrich a 360-degree Customer View with Splunk and Apache HadoopHortonworks
 
Hadoop 2.0 - Solving the Data Quality Challenge
Hadoop 2.0 - Solving the Data Quality ChallengeHadoop 2.0 - Solving the Data Quality Challenge
Hadoop 2.0 - Solving the Data Quality ChallengeInside Analysis
 
Hot Technologies of 2013: Hadoop 2.0
Hot Technologies of 2013: Hadoop 2.0Hot Technologies of 2013: Hadoop 2.0
Hot Technologies of 2013: Hadoop 2.0Inside Analysis
 
HP Vertica and MapR Webinar: Building a Business Case for SQL-on-Hadoop
HP Vertica and MapR Webinar: Building a Business Case for SQL-on-HadoopHP Vertica and MapR Webinar: Building a Business Case for SQL-on-Hadoop
HP Vertica and MapR Webinar: Building a Business Case for SQL-on-HadoopMapR Technologies
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopHortonworks
 

Similar to Visibility and Optimization for Hadoop (20)

Sneak Peak into Self-Service, Cross-Enterprise, Job Scheduling with CA Worklo...
Sneak Peak into Self-Service, Cross-Enterprise, Job Scheduling with CA Worklo...Sneak Peak into Self-Service, Cross-Enterprise, Job Scheduling with CA Worklo...
Sneak Peak into Self-Service, Cross-Enterprise, Job Scheduling with CA Worklo...
 
Transform You Business with Big Data and Hortonworks
Transform You Business with Big Data and HortonworksTransform You Business with Big Data and Hortonworks
Transform You Business with Big Data and Hortonworks
 
Transform Your Business with Big Data and Hortonworks
Transform Your Business with Big Data and Hortonworks Transform Your Business with Big Data and Hortonworks
Transform Your Business with Big Data and Hortonworks
 
Level Up – How to Achieve Hadoop Acceleration
Level Up – How to Achieve Hadoop AccelerationLevel Up – How to Achieve Hadoop Acceleration
Level Up – How to Achieve Hadoop Acceleration
 
Hadoop as an Analytic Platform: Why Not?
Hadoop as an Analytic Platform: Why Not?Hadoop as an Analytic Platform: Why Not?
Hadoop as an Analytic Platform: Why Not?
 
Automate Hadoop Jobs with Real World Business Impact
Automate Hadoop Jobs with Real World Business ImpactAutomate Hadoop Jobs with Real World Business Impact
Automate Hadoop Jobs with Real World Business Impact
 
Bridging the Big Data Gap in the Software-Driven World
Bridging the Big Data Gap in the Software-Driven WorldBridging the Big Data Gap in the Software-Driven World
Bridging the Big Data Gap in the Software-Driven World
 
Game Changed – How Hadoop is Reinventing Enterprise Thinking
Game Changed – How Hadoop is Reinventing Enterprise ThinkingGame Changed – How Hadoop is Reinventing Enterprise Thinking
Game Changed – How Hadoop is Reinventing Enterprise Thinking
 
Big Data & SQL: The On-Ramp to Hadoop
Big Data & SQL: The On-Ramp to Hadoop Big Data & SQL: The On-Ramp to Hadoop
Big Data & SQL: The On-Ramp to Hadoop
 
Has Traditional MDM Finally Met its Match?
Has Traditional MDM Finally Met its Match?Has Traditional MDM Finally Met its Match?
Has Traditional MDM Finally Met its Match?
 
The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...
The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...
The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...
 
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
 
Rediscover Software Development Edward Hieatt Web Summit 2014
Rediscover Software Development Edward Hieatt Web Summit 2014Rediscover Software Development Edward Hieatt Web Summit 2014
Rediscover Software Development Edward Hieatt Web Summit 2014
 
A modern, flexible approach to Hadoop implementation incorporating innovation...
A modern, flexible approach to Hadoop implementation incorporating innovation...A modern, flexible approach to Hadoop implementation incorporating innovation...
A modern, flexible approach to Hadoop implementation incorporating innovation...
 
A Tighter Weave – How YARN Changes the Data Quality Game
A Tighter Weave – How YARN Changes the Data Quality GameA Tighter Weave – How YARN Changes the Data Quality Game
A Tighter Weave – How YARN Changes the Data Quality Game
 
Enrich a 360-degree Customer View with Splunk and Apache Hadoop
Enrich a 360-degree Customer View with Splunk and Apache HadoopEnrich a 360-degree Customer View with Splunk and Apache Hadoop
Enrich a 360-degree Customer View with Splunk and Apache Hadoop
 
Hadoop 2.0 - Solving the Data Quality Challenge
Hadoop 2.0 - Solving the Data Quality ChallengeHadoop 2.0 - Solving the Data Quality Challenge
Hadoop 2.0 - Solving the Data Quality Challenge
 
Hot Technologies of 2013: Hadoop 2.0
Hot Technologies of 2013: Hadoop 2.0Hot Technologies of 2013: Hadoop 2.0
Hot Technologies of 2013: Hadoop 2.0
 
HP Vertica and MapR Webinar: Building a Business Case for SQL-on-Hadoop
HP Vertica and MapR Webinar: Building a Business Case for SQL-on-HadoopHP Vertica and MapR Webinar: Building a Business Case for SQL-on-Hadoop
HP Vertica and MapR Webinar: Building a Business Case for SQL-on-Hadoop
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside Hadoop
 

Recently uploaded

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 

Recently uploaded (20)

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Salesforce Community Group Quito, Salesforce 101
Visibility and Optimization for Hadoop

  • 1. © 2015 Enterprise Integration News, Inc. Introduction: Visibility & Optimization for Hadoop. Sean Suchter, Co-Founder and CEO. Agenda: Making Hadoop just work better for varied workloads; details top challenges in adopting Hadoop; how Pepperdata automatically improves performance, visibility, controls. Bio: A pioneer in production-ready Hadoop; 15+ years web search and big data; focus on huge scale, huge impact products; started the Silicon Valley branch of Microsoft’s Bing engineering & product team.
  • 2. ©2015 Pepperdata VISIBILITY & OPTIMIZATION FOR HADOOP
  • 3. ©2014 Pepperdata AGENDA: CHALLENGES USING HADOOP; HOW PEPPERDATA ADDRESSES THESE CHALLENGES; Q&A and NEXT STEPS
  • 4. HADOOP CHALLENGES YOU FACE DAILY (scale: Tactical, Strategic, Very Strategic): Call from your CEO asking “WTF is happening?!?”; Can’t make SEC filing; EOQ and can’t send the invoice; Critical feature on your website is broken!; Online ad impression data unavailable; External customer reports unavailable; Users complaining; Customer churn metrics unavailable; Revenue report doesn’t complete; Have to buy more servers; End user SLAs compromised; Finding root-cause of problems is manual; Low priority jobs taking over the cluster; Ad hoc jobs interfere with production jobs; HBase & MapReduce contention; Rogue jobs hammer cluster performance; Cluster seems near maximum capacity; Developers can’t submit new jobs
  • 5. HADOOP WASTES VALUABLE CAPACITY [Chart: physical hardware resource over time, comparing theoretical maximum usage (reservation) with actual physical capacity used] 1. Production clusters are sized for peak SLA with lots of headroom, so capacity is wasted. 2. Ad-hoc jobs consume capacity from high-priority jobs, so companies run them on a separate cluster. 3. Hadoop’s allocations are predefined and static, resulting in wasted capacity.
  • 6. MORE AND MORE WASTED CAPACITY Over time, more and more clusters are built to isolate the different workloads: Production Cluster, Ad Hoc Cluster, Priority Job Cluster, HBase Cluster, Bulk Load Cluster. But they are full of “holes”!
  • 7. PEPPERDATA MAKES HADOOP WORK BETTER. FINE-GRAINED VISIBILITY: Monitor CPU, RAM, I/O, network per task, job, user, group; identify bottlenecks in real-time or at any moment historically. TOTAL PREDICTABILITY: SLA enforcement for true multi-tenancy: dynamically adjusts resource usage; set policies to protect high-priority jobs. 30-50% GREATER THROUGHPUT ON ALREADY HIGHLY TUNED CLUSTERS: Reclaims wasted capacity: use all true hardware capacity; run more jobs with our Dynamic Capacity Creation.
  • 8. PEPPERDATA REAL-TIME ARCHITECTURE. VISIBILITY: Delivers real-time, granular visibility into resource consumption by user, job, and task. CONTROL: Allows user-defined prioritization of Hadoop jobs and automatically allocates resources to ensure jobs run safely. CAPACITY: Reclaims wasted capacity and allows mixed workloads to be shared on a single cluster. [Diagram labels: Developer; Analyst; Financial Report; Product; ETL; Pepperdata Dashboard; Policies; Hadoop Configuration; YOUR EXISTING HADOOP: MapReduce, HBase, etc.; Job Tracker / Resource Manager (Scheduler & YARN)]
  • 9. FINE-GRAINED VISIBILITY INTO THE CLUSTER
  • 10. FINE-GRAINED VISIBILITY INTO YOUR CLUSTER
  • 11. EASILY PINPOINT BOTTLENECKS IN THE CLUSTER
  • 12. PEPPERDATA MAKES HADOOP WORK BETTER. FINE-GRAINED VISIBILITY: Monitor CPU, RAM, I/O, network per task, job, user, group; identify bottlenecks in real-time or at any moment historically. TOTAL PREDICTABILITY: SLA enforcement for true multi-tenancy: dynamically adjusts resource usage; set policies to protect high-priority jobs. 30-50% GREATER THROUGHPUT ON ALREADY HIGHLY TUNED CLUSTERS: Reclaims wasted capacity: use all true hardware capacity; run more jobs with our Dynamic Capacity Creation.
  • 13. NEXT STEPS Like what you saw? Want to learn more? Visit pepperdata.com for more product information. Visit pepperdata.com/demo to request a free demo from one of our technical experts!
  • 14. THANK YOU
  • 15. Questions & Answers
  • 16. What is the form factor of Pepperdata, and how long does it take to install? How do we make sure Pepperdata ‘agents’ are where they need to be -- and working? Sean Suchter Co-Founder and CEO
  • 17. PEPPERDATA REAL-TIME ARCHITECTURE. VISIBILITY: Delivers real-time, granular visibility into resource consumption by user, job, and task. CONTROL: Allows user-defined prioritization of Hadoop jobs and automatically allocates resources to ensure jobs run safely. CAPACITY: Reclaims wasted capacity and allows mixed workloads to be shared on a single cluster. [Diagram labels: Developer; Analyst; Financial Report; Product; ETL; Pepperdata Dashboard; Policies; Hadoop Configuration; YOUR EXISTING HADOOP: MapReduce, HBase, etc.; Job Tracker / Resource Manager (Scheduler & YARN)]
  • 18. We have mixed workloads that often force us to overprovision Hadoop resources. Does Pepperdata help us deal with this by allowing Hadoop to adjust dynamically? Sean Suchter Co-Founder and CEO
  • 19. Given Pepperdata’s intelligent and dynamic environment, how does that impact the way we do Hadoop prep or set-up? Sean Suchter Co-Founder and CEO
  • 20. How much Hadoop cluster resource does Pepperdata use? Sean Suchter Co-Founder and CEO
  • 21. How do customers use the Pepperdata dashboard? Where is it hosted? Sean Suchter Co-Founder and CEO
  • 22. PEPPERDATA REAL-TIME ARCHITECTURE. VISIBILITY: Delivers real-time, granular visibility into resource consumption by user, job, and task. CONTROL: Allows user-defined prioritization of Hadoop jobs and automatically allocates resources to ensure jobs run safely. CAPACITY: Reclaims wasted capacity and allows mixed workloads to be shared on a single cluster. [Diagram labels: Developer; Analyst; Financial Report; Product; ETL; Pepperdata Dashboard; Policies; Hadoop Configuration; YOUR EXISTING HADOOP: MapReduce, HBase, etc.; Job Tracker / Resource Manager (Scheduler & YARN)]
  • 23. How is the Pepperdata approach different from YARN? Sean Suchter Co-Founder and CEO
  • 24. Please detail some customer successes from using Pepperdata with Hadoop. Sean Suchter Co-Founder and CEO
  • 25. For More Information. Pepperdata – Rely on Hadoop: http://pepperdata.com/ (Visibility, Capacity, Control, Technology). Learn More About Pepperdata: Product http://pepperdata.com/products/; Real-Time Architecture http://pepperdata.com/products/#pd-technology; Benefits http://pepperdata.com/benefits/; Blog http://pepperdata.com/blog/; Other Pepperdata Resources (Whitepapers & Case Studies) http://pepperdata.com/resources/; Request a Demo http://pepperdata.com/demo/