Facebook Retrospective - Big data-world-europe-2012


A retrospective on building, running and using the Hadoop stack at Facebook.


Transcript

  • 1. Data Infrastructure at Facebook: A retrospective
      Joydeep Sen Sarma
      Ex-Facebook DI Lead, Founder Qubole
  • 2. Intro
      • File/Database Systems developer (ex-NetApp/Oracle)
      • Yahoo (2005-07), Facebook (2007-11)
      • @Facebook:
          – SysAdmin: operated massive Hadoop/Hive installs
          – Architect: conceived/wrote Apache Hive. Made HBase@FB happen
          – Herded cats: first manager of Data Infra team
          – IT engineer/DBA: built ETL tools, warehouse/reporting for FB Virtual Currency
          – Vested my stock options!
      • Founder Qubole Inc. (2011-)
  • 3. What not to do: Yahoo
      • Want to add ‘feed’ in warehouse? → Fill form, schmooze PM, wait 2 months.
      • Want to justify project? → Take $100M, double count 5 times.
      • Hard to find out what data exists in the company; silos
      • Lots of grand architecture, but no progress
  • 4. Goals going in
      • Universal ability to log data and compute against it
      • Build infrastructure for data processing
          – Help people help themselves
          – Get out of the way
      • Done is better than perfect, Move Fast.
          – Iterate, Fix Failures Fast, Do everything twice
  • 5. State of the Union
      • Sep, 2007:
          – Use Case: compute relationship strength between friends
          – Data Sets: user graph, interaction and page-view logs
          – ~10TB cluster…
      • July, 2011:
          – Ads reporting/data-mining, News Feed ranking, Spam classification, PYMK, Search Indexing, Entitization, Sentiment Analysis, Fraud Analysis ..
          – ~10k queries a day, hundreds of users, scores concurrent
          – 50PB cluster, 15 engineers/ops in total manning.
  • 6. User Feedback
      • Ex-Yahoo Senior Director, Ads Product Mgmt.: "I haven't done SQL for ages - but I can use this stuff easily"
      • Ex-Yahoo Data Scientist: "This is so amazing. That all data is stored in one place and I can get access instantly without having to wait months and contact multiple groups/silos"
      • Ex-Paypal Fraud Analyst: "So much better data and infrastructure than I have ever had in the past"
  • 7. Key Highways
      • Hive
          – Centrally managed Hadoop service, no setup
          – SQL is easy, add scripts for map-reduce
          – Browser based query wizards for SQL dummies
              • Download results to Excel
              • Schedule queries periodically with a few clicks
      • Scribe
          – Just log data using Scribe from any application
          – Dead simple to add attributes to user page views
          – Easy to pull data from RDBMS
  • 8. Key Highways
      • Simple Workflow authoring system (Databee)
      • Reporting is easy
          – Provision MySQL Data-marts in hours
          – Easy self-service charting/dashboarding software
      • Data Explorer
          – Wiki like system for documenting tables, columns, types
          – Keyword Search, find table authors, users
          – Help people help people
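
The "SQL is easy, add scripts for map-reduce" point under Hive can be illustrated with a streaming script. Hive's TRANSFORM clause pipes rows to an external program as tab-separated lines on stdin and reads tab-separated lines back from stdout. The sketch below is only an illustration of that contract, not anything from Facebook's codebase: the column layout (user_id, url), the function names and the domain-extraction logic are all assumptions.

```python
import io

def transform_line(line):
    """Turn one tab-separated (user_id, url) row into (user_id, domain).

    The two-column layout is a hypothetical example, not from the slides.
    """
    user_id, url = line.rstrip("\n").split("\t")
    # Drop the scheme if present, then keep everything before the first slash.
    domain = url.split("://", 1)[-1].split("/", 1)[0]
    return f"{user_id}\t{domain}"

def run(stdin, stdout):
    """Stream rows from stdin to stdout, one transformed row per input row."""
    for line in stdin:
        if line.strip():  # skip blank lines
            stdout.write(transform_line(line) + "\n")

# Local demonstration with in-memory streams instead of real stdin/stdout.
demo_in = io.StringIO("42\thttp://example.com/a/b\n7\thttps://example.org/\n")
demo_out = io.StringIO()
run(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

As a real script, the file would end with `run(sys.stdin, sys.stdout)` and be referenced from a query of the form `SELECT TRANSFORM(user_id, url) USING 'python extract_domain.py' AS (user_id, domain) FROM page_views`, where the script and table names are again hypothetical.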
  • 9. Democracies – Ugh!
      “Democracy may not be the perfect … but it is better than the alternatives.”
      “The family that poops together stays together”
  • 10. Maintaining Order
      • Hadoop Fair Scheduler
          – Guarantee resources to projects/users. Share excess capacity
      • Multiple Compute tiers
          – Production, Large Ad-hoc, Small Ad-hoc, Local-mode queries
      • Kill the bad guys
          – Code to hunt down bad queries/apps
          – Track cpu/disk usage – go after biggies
      • Ban assault rifles
          – Basic ACLs – can’t delete important tables, directories
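
The "guarantee resources, share excess capacity" idea can be sketched in a few lines. This is a deliberately simplified model, not the Hadoop Fair Scheduler's actual algorithm (which also handles weights, preemption and more): every pool first receives its guaranteed minimum, capped at what it actually asks for, and leftover slots are then handed out round-robin to pools that still want more. The pool names and numbers are made up for illustration.

```python
def fair_share(total_slots, pools):
    """Allocate slots to pools.

    pools maps a pool name to (guaranteed_min, demand).
    Assumes the guarantees together do not exceed total_slots.
    Returns a dict mapping pool name to allocated slots.
    """
    # Step 1: each pool gets its guarantee, but never more than it demands.
    alloc = {name: min(guar, demand) for name, (guar, demand) in pools.items()}
    spare = total_slots - sum(alloc.values())

    # Step 2: share the excess one slot at a time among pools with unmet demand.
    while spare > 0:
        hungry = [n for n, (_, demand) in pools.items() if alloc[n] < demand]
        if not hungry:
            break  # everyone is satisfied; remaining slots stay idle
        for name in hungry:
            if spare == 0:
                break
            alloc[name] += 1
            spare -= 1
    return alloc

# Hypothetical pools: (guaranteed minimum, current demand).
example = fair_share(100, {"ads": (40, 70), "feed": (30, 10), "adhoc": (0, 50)})
print(example)
```

In this example the "feed" pool uses only 10 of its 30 guaranteed slots, so the unused capacity flows to "ads" and "adhoc" rather than sitting idle.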
  • 11. Why did we succeed? All Hail Data Consolidation
      (9pm, FB Hack Night) Ads Engineering Director: “Hey Joy, I want to join user fb-currency purchases with friend request data to test a thesis – pointers?”
  • 12. Hadoop
      • Cheap
          – Can consolidate everything.
          – We made it cheaper (RCFile, HDFS-RAID)
      • Reduces governance cost
          – Only worry about really really large stuff.
          – Fewer data replication processes to manage
      • Separates compute from storage
          – Most legacy vendors don’t get this
      • Disk Based analytic systems degrade gracefully
          – No tipping point (vs. in-memory only)
          – Ability to catch up, go back into the past (vs. real-time stream processing only)
  • 13. Things we missed
  • 14. Things we missed
      • SLOOOOOOW
          – Extensive work on FB Hadoop repo for faster scheduling
          – Make testing faster (approx. queries)
          – Watch @Qubole
      • SQL as rope
          – Need higher level templates. Don’t need 10 versions of a 30-day moving average calculator
      • Duplication of queries/jobs
          – How to discover if there’s existing summaries?
          – People help people, but still ..
      • Didn’t build enough APIs
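
The "higher level templates" point is easy to make concrete: a moving average is the kind of logic that gets rewritten per team when it could be one shared, parameterized helper. The following is a minimal sketch of such a helper in plain Python, assuming daily values arrive in date order; the function name and signature are illustrative and not taken from any Facebook tool.

```python
from collections import deque

def moving_average(values, window=30):
    """Yield the trailing mean over the most recent `window` values.

    For the first few items, the mean covers only the values seen so far.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque()
    running_sum = 0.0
    for v in values:
        recent.append(v)
        running_sum += v
        if len(recent) > window:
            # Drop the value that just fell out of the window.
            running_sum -= recent.popleft()
        yield running_sum / len(recent)

# Small demonstration with a 2-item window.
print(list(moving_average([1, 2, 3, 4], window=2)))
```

One helper with a `window` parameter covers the 7-day, 30-day and 90-day variants that would otherwise each be a separate hand-written query.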
  • 15. Final Words
      • It’s not the software, stupid
          – Software is easy to write and fix
          – Can be slow
      • It’s the service that matters
          – Making everything work seamlessly
          – Ability to fix/improve things FAST
  • 16. Q&A