20080115yahoobrickhouse

final cut

Published in: Technology, Education

Transcript

  • 1. The Facebook Data team: Infrastructure and insight
      Jeff Hammerbacher, Engineering Manager
      January 16, 2008
  • 2. Agenda
      1 Motivations
      2 People
      3 Products
      4 Philosophy
  • 3. Motivations
      ▪ Source data living on a horizontally partitioned MySQL tier
      ▪ Intensive historical analysis difficult
      ▪ Difficult to assess impact of changes to the site
      ▪ First try: Reporting and Analysis
      ▪ Second try: Data Infrastructure and Data Insight
      ▪ Third try: Data
  • 4. People: Infrastructure
      ▪ Rama Ramasamy
      ▪ Suresh Antony
      ▪ Avinash Lakshman
      ▪ Hao Liu
      ▪ Joydeep Sen Sarma
      ▪ Ashish Thusoo
      ▪ Pete Wyckoff
  • 5. People: Insight
      ▪ Rupen Parikh
      ▪ Itamar Rosenn
      ▪ Ravi Grover
      ▪ Roddy Lindsay
      ▪ Cameron Marlow
      ▪ Danny Ferrante
      ▪ Ding Zhou
  • 6. Products: Infrastructure
      ▪ Hive
      ▪ Hive Shell
      ▪ Pyhive
      ▪ Apidictor
      ▪ Nest
      ▪ ???
  • 7. Products: Insight
      ▪ Experimentation Platform
      ▪ Argus
      ▪ Insights
      ▪ InvEst
      ▪ Growth Report
      ▪ More coming soon...
  • 8. Philosophy
      ▪ Data management as important as data analysis
      ▪ Know your data center: compute, storage, interconnect
      ▪ Shared nothing means the interconnect is the bottleneck
      ▪ Engineers are not analysts
      ▪ You must write code
      ▪ Don’t collect data without a purpose
      ▪ Basic statistics is more useful than advanced machine learning
      ▪ Visualization is as important as basic statistics
      ▪ Analysis and not reporting
      ▪ Developing products might be better than YAS
  • 9. (c) 2008 Facebook, Inc. or its licensors. "Facebook" is a registered trademark of Facebook, Inc. All rights reserved. 1.0
