SlideShare a Scribd company logo
The Facebook Data team
Infrastructure and insight

Jeff Hammerbacher
Engineering Manager
January 16, 2008
Agenda

    Motivations
1




    People
2




    Products
3




    Philosophy
4
Motivations

    Source data living on a horizontally partitioned MySQL tier
▪

    Intensive historical analysis difficult
▪

    Difficult to assess impact of changes to the site
▪




    First try: Reporting and Analysis
▪

    Second try: Data Infrastructure and Data Insight
▪

    Third try: Data
▪
People
Infrastructure
    Rama Ramasamy
▪

    Suresh Antony
▪

    Avinash Lakshman
▪

    Hao Liu
▪

    Joydeep Sen Sarma
▪

    Ashish Thusoo
▪

    Pete Wyckoff
▪
People
Insight
    Rupen Parikh
▪

    Itamar Rosenn
▪

    Ravi Grover
▪

    Roddy Lindsay
▪

    Cameron Marlow
▪

    Danny Ferrante
▪

    Ding Zhou
▪
Products
Infrastructure
    Hive
▪

        Hive Shell
    ▪

        Pyhive
    ▪

        Apidictor
    ▪

    Nest
▪

    ???
▪
Products
Insight
     Experimentation Platform
 ▪

     Argus
 ▪

     Insights
 ▪

     InvEst
 ▪

     Growth Report
 ▪



     More coming soon...
 ▪
(c) 2008 Facebook, Inc. or its licensors.  quot;Facebookquot; is a registered trademark of Facebook, Inc.. All rights reserved. 1.0

More Related Content

Similar to 20080115yahoobrickhouse

20080115yahoobrickhouse
20080115yahoobrickhouse20080115yahoobrickhouse
20080115yahoobrickhouse
Jeff Hammerbacher
 
Cloud_Big_Data_Analytics_Mobile_Social_modern_internet_scale_business_models_...
Cloud_Big_Data_Analytics_Mobile_Social_modern_internet_scale_business_models_...Cloud_Big_Data_Analytics_Mobile_Social_modern_internet_scale_business_models_...
Cloud_Big_Data_Analytics_Mobile_Social_modern_internet_scale_business_models_...
John Sing
 
Text Analytics Summit 2009 - Roddy Lindsay - "Social Media, Happiness, Petaby...
Text Analytics Summit 2009 - Roddy Lindsay - "Social Media, Happiness, Petaby...Text Analytics Summit 2009 - Roddy Lindsay - "Social Media, Happiness, Petaby...
Text Analytics Summit 2009 - Roddy Lindsay - "Social Media, Happiness, Petaby...
guest5b1607
 
2016 06-07 data driven production
2016 06-07 data driven production2016 06-07 data driven production
2016 06-07 data driven production
Mark Reynolds
 
DevDay 2016: Dave Farley - The Rationale for Continuous Delivery
DevDay 2016: Dave Farley - The Rationale for Continuous DeliveryDevDay 2016: Dave Farley - The Rationale for Continuous Delivery
DevDay 2016: Dave Farley - The Rationale for Continuous Delivery
DevDay Dresden
 
Ruby, rails, no sql and big data
Ruby, rails, no sql and big dataRuby, rails, no sql and big data
Ruby, rails, no sql and big dataJohn Repko
 
Data Science Provenance: From Drug Discovery to Fake Fans
Data Science Provenance: From Drug Discovery to Fake FansData Science Provenance: From Drug Discovery to Fake Fans
Data Science Provenance: From Drug Discovery to Fake Fans
Jameel Syed
 
Sgd research-damr-19-may-2018
Sgd research-damr-19-may-2018Sgd research-damr-19-may-2018
Sgd research-damr-19-may-2018
Sanjeev Deshmukh
 
All thingspython@pivotal
All thingspython@pivotalAll thingspython@pivotal
All thingspython@pivotal
Srivatsan Ramanujam
 
20080528dublinpt2
20080528dublinpt220080528dublinpt2
20080528dublinpt2
Jeff Hammerbacher
 
Data mining with Rattle For R
Data mining with Rattle For RData mining with Rattle For R
Data mining with Rattle For R
Akhil Anil
 

Similar to 20080115yahoobrickhouse (11)

20080115yahoobrickhouse
20080115yahoobrickhouse20080115yahoobrickhouse
20080115yahoobrickhouse
 
Cloud_Big_Data_Analytics_Mobile_Social_modern_internet_scale_business_models_...
Cloud_Big_Data_Analytics_Mobile_Social_modern_internet_scale_business_models_...Cloud_Big_Data_Analytics_Mobile_Social_modern_internet_scale_business_models_...
Cloud_Big_Data_Analytics_Mobile_Social_modern_internet_scale_business_models_...
 
Text Analytics Summit 2009 - Roddy Lindsay - "Social Media, Happiness, Petaby...
Text Analytics Summit 2009 - Roddy Lindsay - "Social Media, Happiness, Petaby...Text Analytics Summit 2009 - Roddy Lindsay - "Social Media, Happiness, Petaby...
Text Analytics Summit 2009 - Roddy Lindsay - "Social Media, Happiness, Petaby...
 
2016 06-07 data driven production
2016 06-07 data driven production2016 06-07 data driven production
2016 06-07 data driven production
 
DevDay 2016: Dave Farley - The Rationale for Continuous Delivery
DevDay 2016: Dave Farley - The Rationale for Continuous DeliveryDevDay 2016: Dave Farley - The Rationale for Continuous Delivery
DevDay 2016: Dave Farley - The Rationale for Continuous Delivery
 
Ruby, rails, no sql and big data
Ruby, rails, no sql and big dataRuby, rails, no sql and big data
Ruby, rails, no sql and big data
 
Data Science Provenance: From Drug Discovery to Fake Fans
Data Science Provenance: From Drug Discovery to Fake FansData Science Provenance: From Drug Discovery to Fake Fans
Data Science Provenance: From Drug Discovery to Fake Fans
 
Sgd research-damr-19-may-2018
Sgd research-damr-19-may-2018Sgd research-damr-19-may-2018
Sgd research-damr-19-may-2018
 
All thingspython@pivotal
All thingspython@pivotalAll thingspython@pivotal
All thingspython@pivotal
 
20080528dublinpt2
20080528dublinpt220080528dublinpt2
20080528dublinpt2
 
Data mining with Rattle For R
Data mining with Rattle For RData mining with Rattle For R
Data mining with Rattle For R
 

More from Jeff Hammerbacher

20091027genentech
20091027genentech20091027genentech
20091027genentech
Jeff Hammerbacher
 
Mårten Mickos's presentation "Open Source: Why Freedom Makes a Better Busines...
Mårten Mickos's presentation "Open Source: Why Freedom Makes a Better Busines...Mårten Mickos's presentation "Open Source: Why Freedom Makes a Better Busines...
Mårten Mickos's presentation "Open Source: Why Freedom Makes a Better Busines...Jeff Hammerbacher
 
20081030linkedin
20081030linkedin20081030linkedin
20081030linkedin
Jeff Hammerbacher
 
20081022cca
20081022cca20081022cca
20081022cca
Jeff Hammerbacher
 

More from Jeff Hammerbacher (20)

20120223keystone
20120223keystone20120223keystone
20120223keystone
 
20100714accel
20100714accel20100714accel
20100714accel
 
20100608sigmod
20100608sigmod20100608sigmod
20100608sigmod
 
20100513brown
20100513brown20100513brown
20100513brown
 
20100423sage
20100423sage20100423sage
20100423sage
 
20100418sos
20100418sos20100418sos
20100418sos
 
20100301icde
20100301icde20100301icde
20100301icde
 
20100201hplabs
20100201hplabs20100201hplabs
20100201hplabs
 
20100128ebay
20100128ebay20100128ebay
20100128ebay
 
20091203gemini
20091203gemini20091203gemini
20091203gemini
 
20091203gemini
20091203gemini20091203gemini
20091203gemini
 
20091110startup2startup
20091110startup2startup20091110startup2startup
20091110startup2startup
 
20091030nasajpl
20091030nasajpl20091030nasajpl
20091030nasajpl
 
20091027genentech
20091027genentech20091027genentech
20091027genentech
 
Mårten Mickos's presentation "Open Source: Why Freedom Makes a Better Busines...
Mårten Mickos's presentation "Open Source: Why Freedom Makes a Better Busines...Mårten Mickos's presentation "Open Source: Why Freedom Makes a Better Busines...
Mårten Mickos's presentation "Open Source: Why Freedom Makes a Better Busines...
 
20090622 Velocity
20090622 Velocity20090622 Velocity
20090622 Velocity
 
20090422 Www
20090422 Www20090422 Www
20090422 Www
 
20090309berkeley
20090309berkeley20090309berkeley
20090309berkeley
 
20081030linkedin
20081030linkedin20081030linkedin
20081030linkedin
 
20081022cca
20081022cca20081022cca
20081022cca
 

Recently uploaded

GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...
ThomasParaiso2
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
nkrafacyberclub
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 

Recently uploaded (20)

GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 

20080115yahoobrickhouse

  • 1.
  • 2. The Facebook Data team Infrastructure and insight Jeff Hammerbacher Engineering Manager January 16, 2008
  • 3. Agenda Motivations 1 People 2 Products 3 Philosophy 4
  • 4. Motivations Source data living on a horizontally partitioned MySQL tier ▪ Intensive historical analysis difficult ▪ Difficult to assess impact of changes to the site ▪ First try: Reporting and Analysis ▪ Second try: Data Infrastructure and Data Insight ▪ Third try: Data ▪
  • 5. People Infrastructure Rama Ramasamy ▪ Suresh Antony ▪ Avinash Lakshman ▪ Hao Liu ▪ Joydeep Sen Sarma ▪ Ashish Thusoo ▪ Pete Wyckoff ▪
  • 6. People Insight Rupen Parikh ▪ Itamar Rosenn ▪ Ravi Grover ▪ Roddy Lindsay ▪ Cameron Marlow ▪ Danny Ferrante ▪ Ding Zhou ▪
  • 7. Products Infrastructure Hive ▪ Hive Shell ▪ Pyhive ▪ Apidictor ▪ Nest ▪ ??? ▪
  • 8. Products Insight Experimentation Platform ▪ Argus ▪ Insights ▪ InvEst ▪ Growth Report ▪ More coming soon... ▪
  • 9. (c) 2008 Facebook, Inc. or its licensors.  quot;Facebookquot; is a registered trademark of Facebook, Inc.. All rights reserved. 1.0