Grid Operations



Hadoop Performance at LinkedIn
Allen Wittenauer
Grid Computing Architect


©2012 LinkedIn Corporation. All Rights Reserved.
“I have never seen a Hadoop cluster that was
             legitimately CPU bound.”
                -- Milind Bhandarkar



X5650 - 6 Core @ 2.67 GHz




“I have only seen one Hadoop cluster that was
            legitimately CPU bound.”
               -- Milind Bhandarkar



Why do we have such high CPU usage?




We do a lot of Graph Theory.




Ticket to Ride




   Ticket To Ride is a registered trademark of Days of Wonder


Social Graph




2nd Degree Connection




We under-commit our memory.




Our Hadoop Software Needs... The Plan...

  Tasks
     – 2 GB of RAM = 1 GB of JVM Heap, .5-1 GB for non-heap
     – (Typically) 1 Super Active Thread


  TaskTracker
     – 1.5 GB of RAM = 1 GB of JVM Heap, .5GB for non-heap
     – 1-4 Super Active Threads


  DataNode
     – 1.5 GB of RAM = 1 GB of JVM Heap, .5GB for non-heap
     – 1-4 Super Active Threads


  RAM: 3GB + (task count * 2GB) + OS needs
  Threads: 8 + (task count) + OS needs
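
The two formulas above can be worked through as a small calculation. This is only a sketch: the function and class names are hypothetical, the thread figure assumes the upper end (4 each) of the "1-4" range for the TaskTracker and DataNode, and "OS needs" is left out because the slides do not quantify it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeBudget:
    ram_gb: float   # Hadoop-only RAM, excluding OS needs
    threads: int    # Hadoop-only busy threads, excluding OS needs


def hadoop_node_budget(task_count: int) -> NodeBudget:
    """Apply the planning formulas from the slide (hypothetical helper)."""
    daemon_ram_gb = 1.5 + 1.5   # TaskTracker + DataNode, 1.5 GB each
    daemon_threads = 4 + 4      # assumed worst case of the 1-4 range, per daemon
    return NodeBudget(
        ram_gb=daemon_ram_gb + task_count * 2.0,  # 2 GB per task
        threads=daemon_threads + task_count,      # 1 busy thread per task
    )


if __name__ == "__main__":
    # Task counts here are illustrative inputs.
    for tasks in (12, 14):
        budget = hadoop_node_budget(tasks)
        print(f"{tasks} tasks: {budget.ram_gb:.0f} GB RAM, "
              f"{budget.threads} threads (plus OS needs)")
```

Keeping OS needs outside the function keeps the sketch faithful to the slide, which treats them as a separate, unspecified term.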


Our Hadoop Software Needs... The Reality

  Task Counts
     – Westmere (5650): 6 Cores+HT = 12 Tasks
     – Sandy Bridge (2640): 6 Cores+HT = 14 Tasks


  Most of our tasks leave at most .5 GB free
     – = combined -> very large buffer & cache




We don’t have as many disks per node.




Typical Hadoop Node Out in the Wild

  Most users don’t know their actual needs
     – Vendor advice... play it safe!


  Significantly more memory
     – “For the future!”
     – Badly written code
  Significantly more disk
     – “Hadoop is IO intensive!”
     – “Greater task locality!”


  Greater performance... but is it worth the cost?



What Happens With Fewer Disks?

  Physical footprint requirements are smaller
  Linux buffers & caches are more efficient
     – More per disk
     – Fewer to manage
  Spindle count DOES matter... but the price/perf isn’t there for our
   workflows.
     – From a few years ago & based on store.sun.com prices (so not “real”)...

     Nodes/Cores   RAM/Bus   Disks   Time (minutes)   HW Cost*
     3/24          16/half   8       254.98           $37827
     3/24          24/full   8       244.50           $38817
     3/24          16/half   4       257.38           $21456
     3/24          24/full   4       246.82           $22986
     6/48          16/half   4       126.98           $42912
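
The price/performance argument can be checked directly from the table. The sketch below only re-computes ratios from the figures shown; the tuple layout and the `compare` helper are hypothetical, and the prices remain the "not real" list prices the slide warns about.

```python
# (nodes/cores, ram/bus, disks, minutes, hw_cost_usd), copied from the table above
RUNS = [
    ("3/24", "16/half", 8, 254.98, 37827),
    ("3/24", "24/full", 8, 244.50, 38817),
    ("3/24", "16/half", 4, 257.38, 21456),
    ("3/24", "24/full", 4, 246.82, 22986),
    ("6/48", "16/half", 4, 126.98, 42912),
]


def compare(baseline, candidate):
    """Return (fractional runtime change, hardware cost change in USD)."""
    time_change = candidate[3] / baseline[3] - 1.0
    cost_change = candidate[4] - baseline[4]
    return time_change, cost_change


if __name__ == "__main__":
    eight_disk = RUNS[0]
    for label, run in (("4 disks, same nodes", RUNS[2]),
                       ("4 disks, twice the nodes", RUNS[4])):
        time_change, cost_change = compare(eight_disk, run)
        print(f"{label}: runtime {time_change:+.1%}, cost {cost_change:+d} USD")
```

Read this way, halving the disks on the same three nodes barely changes the runtime while cutting the hardware cost substantially, and spending slightly more than the 8-disk price on twice as many 4-disk nodes roughly halves the runtime.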

LinkedIn Node Configuration

  No RAID controller
     – Extra cost, and worse performance, when doing JBOD


  6 Drives
     – Still fits in 1U w/SATA drives
     – ~same perf as 8 drives


  Less metal = lower cost




Rack Level View

  If we assume we can use 40U in a rack then:
     – More CPUs
     – Just as many HDs
     – More Network
     – Potentially more RAM




We care about file system tuning.




LinkedIn Hadoop Disk/File Systems

  noatime Enabled

  writeback Enabled

  Each Disk (except root) Partitions:
     – Swap
     – MapReduce Spill Space
     – HDFS


  Delayed Commits
     – Why write once when you can do ganged writes more efficiently?
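
As a rough illustration of how these settings might be expressed, the sketch below builds an `/etc/fstab` line. It rests on assumptions the slides do not confirm: that the file system is ext3/ext4, that "writeback" means the `data=writeback` journaling mode, and that "delayed commits" means a longer `commit=` interval. The device, mount point, 30-second interval, and helper name are all hypothetical.

```python
def fstab_entry(device: str, mount_point: str, commit_seconds: int = 30) -> str:
    """Build an fstab line for a data partition (hypothetical helper).

    Assumes ext4; noatime, data=writeback and commit=<seconds> are
    standard ext3/ext4 mount options. The 30 s default is illustrative,
    not a value taken from the slides.
    """
    options = ",".join([
        "noatime",                   # skip access-time updates on reads
        "data=writeback",            # journal metadata only
        f"commit={commit_seconds}",  # flush the journal less often
    ])
    # Final two fields: no dump, no boot-time fsck ordering.
    return f"{device} {mount_point} ext4 {options} 0 0"


if __name__ == "__main__":
    # Hypothetical device and mount point for one HDFS partition.
    print(fstab_entry("/dev/sdb3", "/grid/1/hdfs"))
```

Both `data=writeback` and a longer commit interval trade crash-consistency guarantees for throughput, so they are normally applied only to data partitions whose contents are replicated elsewhere.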




We care about job tuning.




LinkedIn Job Tuning Guidelines

  All jobs get reviewed prior to going to production.

  Task times should be between 5 and 15 minutes.

  Jobs should have fewer than 10,000 tasks.

  Jobs should be smart about the number and size of the files they generate.
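
The numeric guidelines above lend themselves to an automated pre-review check. The following is a minimal sketch, not LinkedIn's actual review tooling: the function name and its input (a list of per-task runtimes in minutes) are hypothetical, and only the thresholds come from the slide.

```python
def review_job(task_minutes, max_tasks=10_000, min_minutes=5.0, max_minutes=15.0):
    """Return a list of guideline violations for one job (hypothetical helper)."""
    findings = []

    # Guideline: fewer than 10,000 tasks per job.
    if len(task_minutes) >= max_tasks:
        findings.append(f"{len(task_minutes)} tasks; expected fewer than {max_tasks}")

    # Guideline: each task should run for 5 to 15 minutes.
    too_short = sum(1 for m in task_minutes if m < min_minutes)
    too_long = sum(1 for m in task_minutes if m > max_minutes)
    if too_short:
        findings.append(f"{too_short} task(s) shorter than {min_minutes} minutes")
    if too_long:
        findings.append(f"{too_long} task(s) longer than {max_minutes} minutes")

    return findings


if __name__ == "__main__":
    for finding in review_job([1.0, 7.5, 20.0]):
        print("WARN:", finding)
```

The file count and file size guideline is not covered here because the slide gives no threshold for it.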




... and the result?




Why is LinkedIn Running so Hot?

  We do a lot of non-MapReduce work.

  RAM buffers and caches allow us to offset a lot of disk IO.

  We audit our jobs.

  As a result, our CPUs are actually busy.




©2012 LinkedIn Corporation. All Rights Reserved.                 GRID OPERATIONS
©2012 LinkedIn Corporation. All Rights Reserved.   BUSINESS OPERATIONS

More Related Content

What's hot

How to Increase Performance of Your Hadoop Cluster
How to Increase Performance of Your Hadoop ClusterHow to Increase Performance of Your Hadoop Cluster
How to Increase Performance of Your Hadoop Cluster
Altoros
 
Hadoop 101
Hadoop 101Hadoop 101
Hadoop 101
EMC
 
Moving from C#/.NET to Hadoop/MongoDB
Moving from C#/.NET to Hadoop/MongoDBMoving from C#/.NET to Hadoop/MongoDB
Moving from C#/.NET to Hadoop/MongoDB
MongoDB
 
White paper hadoop performancetuning
White paper hadoop performancetuningWhite paper hadoop performancetuning
White paper hadoop performancetuning
Anil Reddy
 
Improving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationImproving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux Configuration
DataWorks Summit
 
Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)
Kathleen Ting
 
Introduction to hadoop administration jk
Introduction to hadoop administration   jkIntroduction to hadoop administration   jk
Introduction to hadoop administration jk
Edureka!
 
Apache Spark Introduction @ University College London
Apache Spark Introduction @ University College LondonApache Spark Introduction @ University College London
Apache Spark Introduction @ University College London
Vitthal Gogate
 
Hadoop Operations at LinkedIn
Hadoop Operations at LinkedInHadoop Operations at LinkedIn
Hadoop Operations at LinkedIn
DataWorks Summit
 
Tune hadoop
Tune hadoopTune hadoop
Tune hadoop
Jason Shao
 
Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster Recovery
Cloudera, Inc.
 
The Hadoop Ecosystem
The Hadoop EcosystemThe Hadoop Ecosystem
The Hadoop Ecosystem
J Singh
 
Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)
mundlapudi
 
Pptx present
Pptx presentPptx present
Pptx present
Nitish Bhardwaj
 
Big Data Performance and Capacity Management
Big Data Performance and Capacity ManagementBig Data Performance and Capacity Management
Big Data Performance and Capacity Management
rightsize
 
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry TrendsBig Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Esther Kundin
 
Treasure Data on The YARN - Hadoop Conference Japan 2014
Treasure Data on The YARN - Hadoop Conference Japan 2014Treasure Data on The YARN - Hadoop Conference Japan 2014
Treasure Data on The YARN - Hadoop Conference Japan 2014
Ryu Kobayashi
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
Ran Ziv
 
Hello OpenStack, Meet Hadoop
Hello OpenStack, Meet HadoopHello OpenStack, Meet Hadoop
Hello OpenStack, Meet Hadoop
DataWorks Summit
 
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
tcloudcomputing-tw
 

What's hot (20)

How to Increase Performance of Your Hadoop Cluster
How to Increase Performance of Your Hadoop ClusterHow to Increase Performance of Your Hadoop Cluster
How to Increase Performance of Your Hadoop Cluster
 
Hadoop 101
Hadoop 101Hadoop 101
Hadoop 101
 
Moving from C#/.NET to Hadoop/MongoDB
Moving from C#/.NET to Hadoop/MongoDBMoving from C#/.NET to Hadoop/MongoDB
Moving from C#/.NET to Hadoop/MongoDB
 
White paper hadoop performancetuning
White paper hadoop performancetuningWhite paper hadoop performancetuning
White paper hadoop performancetuning
 
Improving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationImproving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux Configuration
 
Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)
 
Introduction to hadoop administration jk
Introduction to hadoop administration   jkIntroduction to hadoop administration   jk
Introduction to hadoop administration jk
 
Apache Spark Introduction @ University College London
Apache Spark Introduction @ University College LondonApache Spark Introduction @ University College London
Apache Spark Introduction @ University College London
 
Hadoop Operations at LinkedIn
Hadoop Operations at LinkedInHadoop Operations at LinkedIn
Hadoop Operations at LinkedIn
 
Tune hadoop
Tune hadoopTune hadoop
Tune hadoop
 
Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster Recovery
 
The Hadoop Ecosystem
The Hadoop EcosystemThe Hadoop Ecosystem
The Hadoop Ecosystem
 
Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)
 
Pptx present
Pptx presentPptx present
Pptx present
 
Big Data Performance and Capacity Management
Big Data Performance and Capacity ManagementBig Data Performance and Capacity Management
Big Data Performance and Capacity Management
 
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry TrendsBig Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
 
Treasure Data on The YARN - Hadoop Conference Japan 2014
Treasure Data on The YARN - Hadoop Conference Japan 2014Treasure Data on The YARN - Hadoop Conference Japan 2014
Treasure Data on The YARN - Hadoop Conference Japan 2014
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
 
Hello OpenStack, Meet Hadoop
Hello OpenStack, Meet HadoopHello OpenStack, Meet Hadoop
Hello OpenStack, Meet Hadoop
 
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
 

Similar to Hadoop Performance at LinkedIn

Large Scale Performance Monitoring for ElasticSearch, HBase, Solr, SenseiDB, ...
Large Scale Performance Monitoring for ElasticSearch, HBase, Solr, SenseiDB, ...Large Scale Performance Monitoring for ElasticSearch, HBase, Solr, SenseiDB, ...
Large Scale Performance Monitoring for ElasticSearch, HBase, Solr, SenseiDB, ...
Sematext Group, Inc.
 
Kafka at half the price with JBOD setup
Kafka at half the price with JBOD setupKafka at half the price with JBOD setup
Kafka at half the price with JBOD setup
Dong Lin
 
BDM37: Hadoop in production – the war stories by Nikolaï Grigoriev, Principal...
BDM37: Hadoop in production – the war stories by Nikolaï Grigoriev, Principal...BDM37: Hadoop in production – the war stories by Nikolaï Grigoriev, Principal...
BDM37: Hadoop in production – the war stories by Nikolaï Grigoriev, Principal...
Big Data Montreal
 
Bigdata and Hadoop with Docker
Bigdata and Hadoop with DockerBigdata and Hadoop with Docker
Bigdata and Hadoop with Docker
haridasnss
 
Right time Vs real time
Right time Vs real timeRight time Vs real time
Right time Vs real time
Murphy Choy
 
HugNov14
HugNov14HugNov14
HugNov14
Adam Faris
 
Turbocharging php applications with zend server (workshop)
Turbocharging php applications with zend server (workshop)Turbocharging php applications with zend server (workshop)
Turbocharging php applications with zend server (workshop)
Eric Ritchie
 
Light-weighted HDFS disaster recovery
Light-weighted HDFS disaster recoveryLight-weighted HDFS disaster recovery
Light-weighted HDFS disaster recovery
DataWorks Summit
 
Data lake – On Premise VS Cloud
Data lake – On Premise VS CloudData lake – On Premise VS Cloud
Data lake – On Premise VS Cloud
Idan Tohami
 
Intoduction to OrientDB
Intoduction to OrientDBIntoduction to OrientDB
Intoduction to OrientDB
Abdelmawla Mohamed
 
Complex Ephemeral Caching With Redis: Jeff Pollard
Complex Ephemeral Caching With Redis: Jeff PollardComplex Ephemeral Caching With Redis: Jeff Pollard
Complex Ephemeral Caching With Redis: Jeff Pollard
Redis Labs
 
OpenStack Days Krakow
OpenStack Days KrakowOpenStack Days Krakow
OpenStack Days Krakow
Veronika Smidova
 
Turbocharging php applications with zend server
Turbocharging php applications with zend serverTurbocharging php applications with zend server
Turbocharging php applications with zend server
Eric Ritchie
 
Unleashing the power of Scrum and Kanban together - Best of Both Worlds!!
Unleashing the power of Scrum and Kanban together - Best of Both Worlds!!Unleashing the power of Scrum and Kanban together - Best of Both Worlds!!
Unleashing the power of Scrum and Kanban together - Best of Both Worlds!!
Nitin Ramrakhyani
 
Performance tuning PHP on IBMi
Performance tuning PHP on IBMiPerformance tuning PHP on IBMi
Performance tuning PHP on IBMi
Zend by Rogue Wave Software
 
Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You?
Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You?  Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You?
Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You?
EMC
 
Serverless Go at BuzzBird
Serverless Go at BuzzBirdServerless Go at BuzzBird
Serverless Go at BuzzBird
Vladislav Supalov
 
MySQL Performance Best Practices
MySQL Performance Best PracticesMySQL Performance Best Practices
MySQL Performance Best Practices
Olivier DASINI
 
An Active and Hybrid Storage System for Data-intensive Applications
An Active and Hybrid Storage System for Data-intensive ApplicationsAn Active and Hybrid Storage System for Data-intensive Applications
An Active and Hybrid Storage System for Data-intensive Applications
Xiao Qin
 
Making Sense of Big data with Hadoop
Making Sense of Big data with HadoopMaking Sense of Big data with Hadoop
Making Sense of Big data with Hadoop
Gwen (Chen) Shapira
 

Similar to Hadoop Performance at LinkedIn (20)

Large Scale Performance Monitoring for ElasticSearch, HBase, Solr, SenseiDB, ...
Large Scale Performance Monitoring for ElasticSearch, HBase, Solr, SenseiDB, ...Large Scale Performance Monitoring for ElasticSearch, HBase, Solr, SenseiDB, ...
Large Scale Performance Monitoring for ElasticSearch, HBase, Solr, SenseiDB, ...
 
Kafka at half the price with JBOD setup
Kafka at half the price with JBOD setupKafka at half the price with JBOD setup
Kafka at half the price with JBOD setup
 
BDM37: Hadoop in production – the war stories by Nikolaï Grigoriev, Principal...
BDM37: Hadoop in production – the war stories by Nikolaï Grigoriev, Principal...BDM37: Hadoop in production – the war stories by Nikolaï Grigoriev, Principal...
BDM37: Hadoop in production – the war stories by Nikolaï Grigoriev, Principal...
 
Bigdata and Hadoop with Docker
Bigdata and Hadoop with DockerBigdata and Hadoop with Docker
Bigdata and Hadoop with Docker
 
Right time Vs real time
Right time Vs real timeRight time Vs real time
Right time Vs real time
 
HugNov14
HugNov14HugNov14
HugNov14
 
Turbocharging php applications with zend server (workshop)
Turbocharging php applications with zend server (workshop)Turbocharging php applications with zend server (workshop)
Turbocharging php applications with zend server (workshop)
 
Light-weighted HDFS disaster recovery
Light-weighted HDFS disaster recoveryLight-weighted HDFS disaster recovery
Light-weighted HDFS disaster recovery
 
Data lake – On Premise VS Cloud
Data lake – On Premise VS CloudData lake – On Premise VS Cloud
Data lake – On Premise VS Cloud
 
Intoduction to OrientDB
Intoduction to OrientDBIntoduction to OrientDB
Intoduction to OrientDB
 
Complex Ephemeral Caching With Redis: Jeff Pollard
Complex Ephemeral Caching With Redis: Jeff PollardComplex Ephemeral Caching With Redis: Jeff Pollard
Complex Ephemeral Caching With Redis: Jeff Pollard
 
OpenStack Days Krakow
OpenStack Days KrakowOpenStack Days Krakow
OpenStack Days Krakow
 
Turbocharging php applications with zend server
Turbocharging php applications with zend serverTurbocharging php applications with zend server
Turbocharging php applications with zend server
 
Unleashing the power of Scrum and Kanban together - Best of Both Worlds!!
Unleashing the power of Scrum and Kanban together - Best of Both Worlds!!Unleashing the power of Scrum and Kanban together - Best of Both Worlds!!
Unleashing the power of Scrum and Kanban together - Best of Both Worlds!!
 
Performance tuning PHP on IBMi
Performance tuning PHP on IBMiPerformance tuning PHP on IBMi
Performance tuning PHP on IBMi
 
Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You?
Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You?  Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You?
Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You?
 
Serverless Go at BuzzBird
Serverless Go at BuzzBirdServerless Go at BuzzBird
Serverless Go at BuzzBird
 
MySQL Performance Best Practices
MySQL Performance Best PracticesMySQL Performance Best Practices
MySQL Performance Best Practices
 
An Active and Hybrid Storage System for Data-intensive Applications
An Active and Hybrid Storage System for Data-intensive ApplicationsAn Active and Hybrid Storage System for Data-intensive Applications
An Active and Hybrid Storage System for Data-intensive Applications
 
Making Sense of Big data with Hadoop
Making Sense of Big data with HadoopMaking Sense of Big data with Hadoop
Making Sense of Big data with Hadoop
 

More from Allen Wittenauer

2019-09-10: Testing Contributions at Scale
2019-09-10: Testing Contributions at Scale2019-09-10: Testing Contributions at Scale
2019-09-10: Testing Contributions at Scale
Allen Wittenauer
 
2018-08-23 Apache Yetus: Precommit
2018-08-23 Apache Yetus: Precommit2018-08-23 Apache Yetus: Precommit
2018-08-23 Apache Yetus: Precommit
Allen Wittenauer
 
Apache Yetus: Intro to Precommit for HBase Contributors
Apache Yetus: Intro to Precommit for HBase ContributorsApache Yetus: Intro to Precommit for HBase Contributors
Apache Yetus: Intro to Precommit for HBase Contributors
Allen Wittenauer
 
Apache Yetus: Helping Solve the Last Mile Problem
Apache Yetus: Helping Solve the Last Mile ProblemApache Yetus: Helping Solve the Last Mile Problem
Apache Yetus: Helping Solve the Last Mile Problem
Allen Wittenauer
 
Apache Hadoop Shell Rewrite
Apache Hadoop Shell RewriteApache Hadoop Shell Rewrite
Apache Hadoop Shell Rewrite
Allen Wittenauer
 
Let's Talk Operations! (Hadoop Summit 2014)
Let's Talk Operations! (Hadoop Summit 2014)Let's Talk Operations! (Hadoop Summit 2014)
Let's Talk Operations! (Hadoop Summit 2014)
Allen Wittenauer
 
Apache Hadoop for System Administrators
Apache Hadoop for System AdministratorsApache Hadoop for System Administrators
Apache Hadoop for System Administrators
Allen Wittenauer
 
Hadoop 24/7
Hadoop 24/7Hadoop 24/7
Hadoop 24/7
Allen Wittenauer
 

More from Allen Wittenauer (8)

2019-09-10: Testing Contributions at Scale
2019-09-10: Testing Contributions at Scale2019-09-10: Testing Contributions at Scale
2019-09-10: Testing Contributions at Scale
 
2018-08-23 Apache Yetus: Precommit
2018-08-23 Apache Yetus: Precommit2018-08-23 Apache Yetus: Precommit
2018-08-23 Apache Yetus: Precommit
 
Apache Yetus: Intro to Precommit for HBase Contributors
Apache Yetus: Intro to Precommit for HBase ContributorsApache Yetus: Intro to Precommit for HBase Contributors
Apache Yetus: Intro to Precommit for HBase Contributors
 
Apache Yetus: Helping Solve the Last Mile Problem
Apache Yetus: Helping Solve the Last Mile ProblemApache Yetus: Helping Solve the Last Mile Problem
Apache Yetus: Helping Solve the Last Mile Problem
 
Apache Hadoop Shell Rewrite
Apache Hadoop Shell RewriteApache Hadoop Shell Rewrite
Apache Hadoop Shell Rewrite
 
Let's Talk Operations! (Hadoop Summit 2014)
Let's Talk Operations! (Hadoop Summit 2014)Let's Talk Operations! (Hadoop Summit 2014)
Let's Talk Operations! (Hadoop Summit 2014)
 
Apache Hadoop for System Administrators
Apache Hadoop for System AdministratorsApache Hadoop for System Administrators
Apache Hadoop for System Administrators
 
Hadoop 24/7
Hadoop 24/7Hadoop 24/7
Hadoop 24/7
 

Recently uploaded

Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Alpen-Adria-Universität
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
saastr
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
akankshawande
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Safe Software
 
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStrDeep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
saastr
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 
dbms calicut university B. sc Cs 4th sem.pdf
dbms  calicut university B. sc Cs 4th sem.pdfdbms  calicut university B. sc Cs 4th sem.pdf
dbms calicut university B. sc Cs 4th sem.pdf
Shinana2
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
Ivanti
 
Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)
Jakub Marek
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
Tatiana Kojar
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
Chart Kalyan
 
Operating System Used by Users in day-to-day life.pptx
Operating System Used by Users in day-to-day life.pptxOperating System Used by Users in day-to-day life.pptx
Operating System Used by Users in day-to-day life.pptx
Pravash Chandra Das
 
Recommendation System using RAG Architecture
Recommendation System using RAG ArchitectureRecommendation System using RAG Architecture
Recommendation System using RAG Architecture
fredae14
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
SitimaJohn
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
tolgahangng
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
DanBrown980551
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
Wouter Lemaire
 

Recently uploaded (20)

Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStrDeep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
dbms calicut university B. sc Cs 4th sem.pdf
dbms  calicut university B. sc Cs 4th sem.pdfdbms  calicut university B. sc Cs 4th sem.pdf
dbms calicut university B. sc Cs 4th sem.pdf
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday

Hadoop Performance at LinkedIn

  • 1. Grid Operations Hadoop Performance at LinkedIn Allen Wittenauer Grid Computing Architect ©2012 LinkedIn Corporation. All Rights Reserved.
  • 3. “I have never seen a Hadoop cluster that was legitimately CPU bound.” -- Milind Bhandarkar
  • 4. X5650 - 6 Core @ 2.67 GHz
  • 5. X5650 - 6 Core @ 2.67 GHz
  • 6. “I have only seen one Hadoop cluster that was legitimately CPU bound.” -- Milind Bhandarkar
  • 7. Why do we have such high CPU usage?
  • 8. We do a lot of Graph Theory.
  • 9. Ticket to Ride (Ticket To Ride is a registered trademark of Days of Wonder)
  • 10. Social Graph
  • 11. 2nd Degree Connection
  • 12. We under-commit our memory.
  • 13. Our Hadoop Software Needs... The Plan...
      Tasks
        – 2 GB of RAM = 1 GB of JVM Heap, .5-1 GB for non-heap
        – (Typically) 1 Super Active Thread
      TaskTracker
        – 1.5 GB of RAM = 1 GB of JVM Heap, .5 GB for non-heap
        – 1-4 Super Active Threads
      DataNode
        – 1.5 GB of RAM = 1 GB of JVM Heap, .5 GB for non-heap
        – 1-4 Super Active Threads
      RAM: 3 GB + (task count * 2 GB) + OS needs
      Threads: 8 + (task count) + OS needs
  • 14. Our Hadoop Software Needs... The Reality
      Task Counts
        – Westmere (5650): 6 Cores+HT = 12 Tasks
        – Sandy Bridge (2640): 6 Cores+HT = 14 Tasks
      Most of our tasks leave at most .5 GB free
        – = combined -> very large buffer & cache
  • 15. We don’t have as many disks per node.
  • 16. Typical Hadoop Node Out in the Wild
      Most users don’t know their actual needs
        – Vendor advice... play it safe!
      Significantly more memory
        – “For the future!”
        – Badly written code
      Significantly more disk
        – “Hadoop is IO intensive!”
        – “Greater task locality!”
      Greater performance... but is it worth the cost...
  • 17. What Happens With Fewer Disks?
      Physical footprint requirements are smaller
      Linux buffers & caches are more efficient
        – More per disk
        – Fewer to manage
      Spindle count DOES matter... but the price/perf isn’t there for our workflows.
        – From a few years ago & based on store.sun.com prices (so not “real”)...

        Nodes/Cores  RAM/Bus  Disks  Time In Minutes  HW Cost*
        3/24         16/half  8      254.98           $37827
        3/24         24/full  8      244.50           $38817
        3/24         16/half  4      257.38           $21456
        3/24         24/full  4      246.82           $22986
        6/48         16/half  4      126.98           $42912
  • 18. LinkedIn Node Configuration
      No RAID controller
        – More cost for negative perf when doing JBOD
      6 Drives
        – Still fits in 1U w/SATA drives
        – ~same perf as 8 drives
      Less metal = cheaper cost
  • 19. Rack Level View
      If we assume we can use 40U in a rack then:
        – More CPUs
        – Just as many HDs
        – More Network
        – Potentially more RAM
  • 20. We care about file system tuning.
  • 21. LinkedIn Hadoop Disk/File Systems
      noatime Enabled
      writeback Enabled
      Each Disk (except root) Partitions:
        – Swap
        – MapReduce Spill Space
        – HDFS
      Delayed Commits
        – Why write once when you can do ganged writes more efficiently?
  • 22. We care about job tuning.
  • 23. LinkedIn Job Tuning Guidelines
      All jobs get reviewed prior to going to production.
      Task times should be between 5 and 15 minutes.
      Jobs should have fewer than 10,000 tasks.
      Jobs should be smart about the number and size of the files they generate.
  • 24. ... and the result?
  • 25. Why is LinkedIn Running so Hot?
      We do a lot of non-MapReduce work.
      RAM buffers and caches allow us to offset a lot of disk IO.
      We audit our jobs.
      As a result, our CPUs are actually busy.
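The sizing rule from slides 13 and 14 (RAM: 3 GB + task count * 2 GB + OS needs; Threads: 8 + task count + OS needs) can be worked through for the two task counts the slides give. The sketch below is only an illustration: the function name `node_budget` and its parameters are hypothetical, the 3 GB and 8 threads are read as TaskTracker plus DataNode at their upper bounds, and "OS needs" is left at zero because the slides give no figure for it.

```python
# Per-process figures taken from slide 13 ("The Plan").
TASK_RAM_GB = 2.0      # 1 GB JVM heap + up to 1 GB non-heap per task
DAEMON_RAM_GB = 1.5    # TaskTracker and DataNode, each
DAEMON_THREADS = 4     # upper end of "1-4 super active threads" per daemon


def node_budget(task_count, os_ram_gb=0.0, os_threads=0):
    """Return (ram_gb, threads) for one node.

    os_ram_gb and os_threads stand in for "OS needs"; the slides do not
    quantify them, so they default to zero here (an assumption).
    """
    ram_gb = 2 * DAEMON_RAM_GB + task_count * TASK_RAM_GB + os_ram_gb
    threads = 2 * DAEMON_THREADS + task_count + os_threads
    return ram_gb, threads


if __name__ == "__main__":
    # Task counts from slide 14 ("The Reality").
    for cpu, tasks in (("Westmere (5650)", 12), ("Sandy Bridge (2640)", 14)):
        ram_gb, threads = node_budget(tasks)
        print(f"{cpu}: {tasks} tasks -> {ram_gb:.1f} GB RAM, "
              f"{threads} threads, before OS needs")
```

Under these assumptions a 12-task Westmere node plans for 27 GB and 20 threads before OS needs, and a 14-task Sandy Bridge node for 31 GB and 22 threads.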
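The price/performance argument on slide 17 can be made concrete with a little arithmetic over the table's own numbers. The sketch below uses cost multiplied by runtime as a rough price/perf score (lower is better); that metric is an assumption chosen for illustration, not something the slides define, and the names `Config` and `cost_time` are hypothetical.

```python
from collections import namedtuple

Config = namedtuple("Config", "nodes cores ram bus disks minutes cost_usd")

# Rows copied from the slide 17 table (store.sun.com prices, so not "real").
CONFIGS = [
    Config(3, 24, 16, "half", 8, 254.98, 37827),
    Config(3, 24, 24, "full", 8, 244.50, 38817),
    Config(3, 24, 16, "half", 4, 257.38, 21456),
    Config(3, 24, 24, "full", 4, 246.82, 22986),
    Config(6, 48, 16, "half", 4, 126.98, 42912),
]


def cost_time(cfg):
    """Dollar-minutes: an assumed price/perf score, lower is better."""
    return cfg.cost_usd * cfg.minutes


# Going from 8 disks to 4 on the 3-node, 16/half configuration.
eight_disk, four_disk = CONFIGS[0], CONFIGS[2]
slowdown = four_disk.minutes / eight_disk.minutes
cost_ratio = four_disk.cost_usd / eight_disk.cost_usd

best = min(CONFIGS, key=cost_time)

if __name__ == "__main__":
    for cfg in sorted(CONFIGS, key=cost_time):
        print(f"{cfg.nodes}/{cfg.cores} {cfg.ram}/{cfg.bus} {cfg.disks} disks: "
              f"{cost_time(cfg):,.0f} dollar-minutes")
    print(f"4 vs 8 disks: {slowdown:.3f}x runtime at {cost_ratio:.2f}x cost")
```

On these figures, halving the disk count costs about 1% in runtime while cutting hardware cost by over 40%, and the six-node, four-disk row scores best on the assumed metric.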