Hadoop Operations at LinkedIn

Person at Effective Machines
Mar. 20, 2013


Editor's Notes

  1. Goals:
     - Fix the performance
     - Make the system operationally sound
  2. Goals:
     - Corporate decision to switch to Linux
     - Start prep for security
  3. We use Cobbler to control our kickstart installs. Key features:
     * template engine
     * snippet system
     * RPM repo sync
     * both command-line and programmable APIs
     * most importantly, great support for a “netboot always” environment
     This means our hosts always try to boot from the network first and fall back to local disk only if that fails. We generally re-install a machine after a disk failure so that it starts from a clean slate, clearing out any accumulated cruft and restoring host-specific parts such as Kerberos keytabs. What may be surprising is that our kickstart environment serves primarily to do three things:
     * partition disks
     * get enough of the OS installed to troubleshoot a broken kickstart
     * bootstrap our configuration management tool
  4. Born out of the HPC community in 2004. Python. BSD license. Love the community. Works with everything, not just the Hadoop ecosystem. Services-based methodology with conflict resolution. Awesome reporting engine.
  5. Goals:
     - Deploy secure Hadoop
     - Reduce user friction
  6. A talk in and of itself. Highlights:
     - another cultural shift
     - finding many bugs in what was considered stable code
     - forking the Kerberos web filter due to poor code quality
  7. Goals:
     - What do we need for the next 4 years?
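
The "netboot always" flow described in note 3 can be sketched as a small decision function. This is only an illustrative model, not Cobbler's API or the tooling used in the talk: the `Host` fields, the `reinstall_requested` flag, and the function names are all hypothetical, chosen to show network-first boot with local disk as the fallback and the three jobs the kickstart environment performs.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class BootAction(Enum):
    REINSTALL = "reinstall via kickstart"
    LOCAL_DISK = "boot from local disk"


@dataclass
class Host:
    name: str
    network_boot_ok: bool      # did the network boot attempt reach the boot server?
    reinstall_requested: bool  # hypothetical flag, e.g. set after a disk failure


# The three jobs the notes give for the kickstart environment, in order.
KICKSTART_STAGES = (
    "partition disks",
    "install enough of the OS to troubleshoot a broken kickstart",
    "bootstrap the configuration management tool",
)


def choose_boot_action(host: Host) -> BootAction:
    """Network boot is always tried first; local disk is the fallback."""
    if not host.network_boot_ok:
        # Network boot failed, so fall back to whatever is on local disk.
        return BootAction.LOCAL_DISK
    if host.reinstall_requested:
        # The boot server hands out the installer for a clean-slate rebuild.
        return BootAction.REINSTALL
    # Network boot worked but no rebuild is wanted: continue to local disk.
    return BootAction.LOCAL_DISK


def boot_plan(host: Host) -> List[str]:
    """Return the ordered steps a host goes through on this boot."""
    action = choose_boot_action(host)
    if action is BootAction.REINSTALL:
        return list(KICKSTART_STAGES)
    return [action.value]
```

Because every boot passes through the network first, a rebuild in this model is a matter of setting one flag on the boot server and power-cycling the host; nothing needs to be changed on the machine itself.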