20080115yahoobrickhouse

final cut

Published in: Technology, Education

  1. The Facebook Data team: Infrastructure and insight
     Jeff Hammerbacher, Engineering Manager
     January 16, 2008
  2. Agenda
     1. Motivations
     2. People
     3. Products
     4. Philosophy
  3. Motivations
     ▪ Source data living on a horizontally partitioned MySQL tier
     ▪ Intensive historical analysis difficult
     ▪ Difficult to assess impact of changes to the site
     ▪ First try: Reporting and Analysis
     ▪ Second try: Data Infrastructure and Data Insight
     ▪ Third try: Data
  4. People: Infrastructure
     ▪ Rama Ramasamy
     ▪ Suresh Antony
     ▪ Avinash Lakshman
     ▪ Hao Liu
     ▪ Joydeep Sen Sarma
     ▪ Ashish Thusoo
     ▪ Pete Wyckoff
  5. People: Insight
     ▪ Rupen Parikh
     ▪ Itamar Rosenn
     ▪ Ravi Grover
     ▪ Roddy Lindsay
     ▪ Cameron Marlow
     ▪ Danny Ferrante
     ▪ Ding Zhou
  6. Products: Infrastructure
     ▪ Hive
     ▪ Hive Shell
     ▪ Pyhive
     ▪ Apidictor
     ▪ Nest
     ▪ ???
  7. Products: Insight
     ▪ Experimentation Platform
     ▪ Argus
     ▪ Insights
     ▪ InvEst
     ▪ Growth Report
     ▪ More coming soon...
  8. Philosophy
     ▪ Data management as important as data analysis
     ▪ Know your data center: compute, storage, interconnect
     ▪ Shared nothing means the interconnect is the bottleneck
     ▪ Engineers are not analysts
     ▪ You must write code
     ▪ Don’t collect data without a purpose
     ▪ Basic statistics is more useful than advanced machine learning
     ▪ Visualization is as important as basic statistics
     ▪ Analysis and not reporting
     ▪ Developing products might be better than YAS
  9. (c) 2008 Facebook, Inc. or its licensors. "Facebook" is a registered trademark of Facebook, Inc. All rights reserved. 1.0
