Hpca2012 facebook keynote
Social Networking at Scale



Sanjeev Kumar
Facebook
Outline
 1   What makes scaling Facebook challenging?

 2   Evolution of Software Architecture

 3   Evolution of Datacenter Architecture
845M users worldwide
[Figure: timeline with 2004, 2005, 2006, 2009, and 2010]
▪  500M daily active users
▪  700B minutes spent on the site every month
▪  30B pieces of content shared each month
▪  2.5M sites using social plugins
What makes scaling Facebook challenging?
▪  Massive    scale
▪  Social   Graph is central to everything on the site
▪  Rapidly   evolving product
▪  Complex    Infrastructure
Traditional websites
[Figure: users Bob, Beth, Julie, Sue, Dan, and Erin, each with a separate store for their own data]
Horizontally scalable
Social Graph




      People are only one dimension of the social graph
Facebook: The data is interconnected
Common operation: Query the social graph
[Figure: requests from Bob, Beth, and Erin all reach the same pool of servers]
Social Graph Cont’d
▪  Highly   connected
 ▪    4.74 average degree-of-separation between users on Facebook
 ▪    Made denser by our connections to places, interests, etc.
▪  Examples   of Queries on Social Graph
 ▪    What are the most interesting updates from my connections?
 ▪    Who are my connections in real-life who I am not connected to on
      Facebook?
 ▪    What are the most relevant events tonight near me and related to my
      interests? Or that my friends are going to?
Social Graph Cont’d
▪  System   Implications of Social Graph
 ▪    Expensive to query
 ▪    Difficult to partition
 ▪    Highly customized for each user
 ▪    Large working sets (Fat tail)
What makes scaling Facebook challenging?
▪  Massive    scale
▪  Social   Graph: Querying is expensive at every level
▪  Rapidly   evolving product
▪  Complex    Infrastructure
Product Launches
[Figure: user growth from 0M in 2004 to 800M in 2011, annotated with product launches]
▪  2004 to 2007: Sign Up, New Apps (February 2004, 2004/2005), NewsFeed (2006), Platform launch (2007)
▪  2008: Translations
▪  2009: The Stream
▪  2010: Open Graph, Social Plugins, Photos Update, Places, Mobile Event, Groups, Messages
▪  2010 to 2011: New Profile, Questions, iPad App, Video Calling, Music, Timeline, Unified Mobile Sites
Rapidly evolving product
▪  Facebook       is a platform
 ▪    External developers are innovating as well
▪  One      integrated product
 ▪    Changes in one part have major implications on other parts
      ▪    E.g., Timeline surfaces some of the older photos

▪  System       Implications
 ▪    Build for flexibility (avoid premature optimizations)
 ▪    Revisit design tradeoffs (they might have changed)
What makes scaling Facebook challenging?
▪  Massive    scale
▪  Social   Graph: Querying is expensive at every level
▪  Rapidly   evolving product
▪  Complex    Infrastructure
Complex infrastructure
▪  Large   number of Software components
 ▪    Multiple Storage systems
 ▪    Multiple Caching Systems
 ▪    100s of specialized services
▪  Often   deploy cutting-edge hardware
 ▪    At our scale, we are early adopters of new hardware
▪  Failure   is routine
▪  Systems    implications
 ▪    Keep things as simple as possible
Outline
 1   What makes scaling Facebook challenging?

 2   Evolution of Software Architecture

 3   Evolution of Datacenter Architecture
Evolution of the Software Architecture
Evolution of each of these 4 tiers
                          Web Tier



Cache Tier                             Services Tier




                        Storage Tier
Evolution of the Software Architecture
Evolution of Web Tier
                          Web Tier



Cache Tier                             Services Tier




                        Storage Tier
Web Tier
▪  Stateless   request processing
 ▪    Gather Data: from storage tiers
 ▪    Transform: Ranking (for Relevance) and Filtering (for Privacy)
 ▪    Presentation: Generate HTML
▪  Runs   PHP code
 ▪    Widely used for web development
 ▪    Dynamically typed scripting language
▪  Integrated product → One single source tree for the entire code
 ▪    Same “binary” on every web tier box
▪  Scalability:   Efficiently process each request
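The three stages of stateless request processing listed above can be sketched roughly as follows. This is an illustrative Python sketch, not the PHP code the web tier runs; the `Story` type, the in-memory `_STORE` standing in for the cache and storage tiers, and the scoring and privacy fields are all assumptions made for the example.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class Story:
    author: str
    text: str
    score: float            # hypothetical relevance score
    visible_to: frozenset   # hypothetical privacy list of viewer names


# Hypothetical stand-in for the cache and storage tiers.
_STORE = [
    Story("beth", "Hiking <today>", 0.9, frozenset({"bob", "erin"})),
    Story("erin", "New job", 0.7, frozenset({"beth"})),
    Story("dan", "Lunch", 0.2, frozenset({"bob"})),
]


def gather(viewer):
    """Gather: fetch candidate stories; the web server keeps no state."""
    return list(_STORE)


def transform(viewer, stories):
    """Transform: filter for privacy, then rank for relevance."""
    allowed = [s for s in stories if viewer in s.visible_to]
    return sorted(allowed, key=lambda s: s.score, reverse=True)


def present(stories):
    """Presentation: generate HTML, escaping user-supplied text."""
    items = "".join(
        f"<li>{escape(s.author)}: {escape(s.text)}</li>" for s in stories
    )
    return f"<ul>{items}</ul>"


def handle_request(viewer):
    return present(transform(viewer, gather(viewer)))
```

Because `handle_request` depends only on its argument and the backing tiers, any web server can serve any request, which is what lets the same "binary" run on every box.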
Generation 1: Zend Interpreter for PHP
▪  Reasonably     fast (for an interpreter)
▪  Rapid   development
 ▪    Don’t have to recompile during testing
▪  But:   at scale, performance matters


[Figure: bar chart of Relative Execution Time (axis 0 to 45) for C++, Java, C#, Ocaml, Ruby, Python, and PHP Zend]
Generation 2: HipHop Compiler for PHP
[Figure: bar chart of Relative Execution Time (axis 0 to 45) for C++, Java, C#, Ocaml, Ruby, Python, PHP Zend, and PHP HipHop]

▪  Technically       challenging, Impressive gains, Still room for improvement
▪  But:      takes time to compile (slows down development)
  ▪    Solution: HipHop interpreter
       ▪    But: Interpreter and compiler sometimes disagree
       ▪        Performance Gains are slowing. Can we improve performance further?
Generation 3: HipHop Virtual Machine
[Figure: PHP → Parser → AST → Bytecode Generator (with Optimizer) → Bytecode → HHVM Interpreter and HHVM JIT]



▪  Best   of both worlds
 ▪    Common path, well-specified bytecode semantics
 ▪    Potential performance upside from dynamic specialization
▪  Work-In-Progress
Web Tier Facts
▪  Execution   time only a small factor in user-perceived performance
 ▪    Can potentially use less powerful processors
 ▪    Throughput matters more than latency (True for other tiers as well)
▪  Memory    management (allocation/free) is a significant remaining cost
 ▪    Copy-on-Write in HipHop implementation
▪  Poor   Instruction Cache Performance
 ▪    Partly due to the one massive binary
▪  Web    load predictable in aggregate
 ▪    Can use less dynamic techniques to save power
 ▪    Potentially even turn off machines. The impact on failure rates is an open question.
Evolution of the Software Architecture
Evolution of Storage Tier
                            Web Tier



Cache Tier                             Services Tier




                       Storage Tier
Evolution of a Storage Tier
▪  Multiple      storage systems at Facebook
 ▪    MySQL
 ▪    HBase (NoSQL)
 ▪    Haystack (for BLOBs)
▪  Case      Study: BLOB storage
 ▪    BLOB: Binary Large Objects (Photos, Videos, Email attachments, etc.)
      ▪    Large files, No updates/appends, Sequential reads
 ▪    More than 100 petabytes
 ▪    250 million photos uploaded per day
Generation 1: Commercial Filers
▪  New Photos Product
▪  First build it the easy way
  ▪    Commercial Storage Tier + HTTP server
  ▪    Each Photo is stored as a separate file
▪  Quickly up and running
  ▪    Reliably Store and Serve Photos
▪  But: Inefficient
  ▪    Limited by IO rate and not storage density
  ▪    Average 10 IOs to serve each photo
  ▪    Wasted IO to traverse the directory structure
[Figure: NFS Storage]
Generation 2: Gen 1 Optimized
▪  Optimization Example:
 ▪    Cache NFS handles to reduce wasted IO operations
▪  Reduce the number of IO operations per photo by 3X
▪  But:
 ▪    Still expensive: High end storage boxes
 ▪    Still inefficient: Still IO bound and wasting IOs
[Figure: NFS Storage Optimized. A read traverses directory inode (owner info, size, timestamps, blocks) → directory data (inode #, filename) → file inode (owner info, size, timestamps, blocks) → data]
Generation 3: Haystack [OSDI’10]
▪  Custom Solution
 ▪    Commodity Storage Hardware
 ▪    Optimized for 1 IO operation per request
      ▪    File system on top of a file system
      ▪    Compact Index in memory
      ▪    Metadata and data laid out contiguously
▪  Efficient from IO perspective
▪  But:
 ▪    Problem has changed now
[Figure: a Superblock followed by Needle 1, Needle 2, Needle 3; each needle holds Magic No, Key, Flags, Photo, and Checksum. Single Disk IO to read/write a photo]
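The idea of a compact in-memory index over one large file, with metadata and data laid out contiguously, can be sketched as below. This is a minimal illustration and not the Haystack on-disk format: the `TinyHaystack` class, the `MAGIC` value, and the header and footer layouts are assumptions, and an in-memory `io.BytesIO` stands in for the volume file.

```python
import io
import struct
import zlib

MAGIC = 0xFACE                    # hypothetical magic number
HEADER = struct.Struct("<IQBI")   # magic, key, flags, payload size
FOOTER = struct.Struct("<I")      # CRC32 checksum of the payload


class TinyHaystack:
    def __init__(self, volume=None):
        # One large append-only volume instead of one file per photo.
        self.volume = volume if volume is not None else io.BytesIO()
        self.index = {}  # key -> (offset, needle length), kept in memory

    def put(self, key, photo, flags=0):
        offset = self.volume.seek(0, io.SEEK_END)
        needle = (
            HEADER.pack(MAGIC, key, flags, len(photo))
            + photo
            + FOOTER.pack(zlib.crc32(photo))
        )
        self.volume.write(needle)
        self.index[key] = (offset, len(needle))

    def get(self, key):
        # The index lookup costs no IO; the needle is fetched in one read.
        offset, length = self.index[key]
        self.volume.seek(offset)
        needle = self.volume.read(length)
        magic, stored_key, _flags, size = HEADER.unpack_from(needle)
        photo = needle[HEADER.size:HEADER.size + size]
        (checksum,) = FOOTER.unpack_from(needle, HEADER.size + size)
        if magic != MAGIC or stored_key != key or zlib.crc32(photo) != checksum:
            raise IOError("corrupt needle")
        return photo
```

Since the index maps a key straight to an offset and length, no directory or inode traversal is needed, which is where the earlier generations spent most of their IO.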
Generation 4: Tiered Storage
▪  Usage      characteristics
 ▪    Fat tail of accesses: everyone has friends :)
 ▪    A large fraction of the tier is no longer IO limited (new)
      ▪    Storing efficiency matters much more than serving efficiency

▪  Approach:       Tiered Storage
      ▪    Last layer optimized for storage efficiency and durability
      ▪    Fronted by caching tier optimized for serving efficiency

▪  Work-In-Progress
BLOB Storage Facts
▪  Hot   and Warm data. Little cold data.
▪  Low   CPU utilization
 ▪    Single digit percentages
▪  Fixed   memory need
 ▪    Enough for the index
 ▪    Little use for anything more
▪  Next   generation will use denser storage systems
 ▪    Do we even bother with hardware RAID?
 ▪    Details to be publicly released soon
Evolution of the Software Architecture
Evolution of Cache Tier
                            Web Tier



Cache Tier                               Services Tier




                          Storage Tier
First few Generations: Memcache

[Figure: Web Tier, Cache Tier: Memcache, Storage Tier]
▪  Look-Aside Cache
▪  Key-Value Store
▪  Does one thing very well
▪  Does little else
▪  Improved performance by 10X
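A look-aside cache leaves the caller responsible for talking to both the cache and the storage tier. The sketch below shows one common form of that pattern; the `LookAsideCache` and `Database` classes are hypothetical in-memory stand-ins, and invalidating the key after a write is an assumption about the write path, not something stated on the slide.

```python
class LookAsideCache:
    """Opaque key-value store: it knows nothing about the values."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


class Database:
    """Hypothetical storage tier that counts reads for illustration."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.reads = 0

    def read(self, key):
        self.reads += 1
        return self.rows[key]

    def write(self, key, value):
        self.rows[key] = value


def web_read(cache, db, key):
    # The web tier checks the cache first and fills it on a miss.
    value = cache.get(key)
    if value is None:
        value = db.read(key)
        cache.set(key, value)
    return value


def web_write(cache, db, key, value):
    # The web tier writes to storage, then drops the stale cache entry.
    db.write(key, value)
    cache.delete(key)
```

Note that `web_read` and `web_write` live in the web tier, so the storage hierarchy is visible to every caller.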
Memcache limitations
▪  “Values”   are opaque
 ▪    End up moving huge amounts of data across the network




▪  Storage   hierarchy exposed to web tier
 ▪    Harder to explore alternative storage solutions
 ▪    Harder to keep consistent
 ▪    Harder to protect the storage tier from thundering herds
Alternative Caching Tier: Tao

[Figure: Web Tier, Cache Tier: Tao, Storage Tier]
1. Has a data model
2. Write-Through Cache
3. Abstracts the storage tier
Tao Cont’d
▪  Data        Model
  ▪    Objects (Nodes)
  ▪    Associations (edges)
  ▪    Have “type” and data

▪  Simple        graph operations on them
  ▪    Efficient: Content-aware
       ▪    Can be performed on the caching tier

▪  In   production for a couple of years
  ▪    Serving a big portion of data accesses
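A cache that understands objects and associations, and that writes through to storage, might look roughly like the sketch below. The names `GraphStore`, `GraphCache`, and their methods are hypothetical and are not Tao's API; the storage tier is reduced to plain dictionaries.

```python
from collections import defaultdict


class GraphStore:
    """Hypothetical storage tier, hidden behind the cache."""

    def __init__(self):
        self.objects = {}                 # id -> (type, data)
        self.assocs = defaultdict(list)   # (source id, type) -> [dest ids]


class GraphCache:
    def __init__(self, store):
        self.store = store
        self.objects = {}
        self.assocs = {}

    def add_object(self, oid, otype, data):
        # Write-through: update storage and cache in the same operation.
        self.store.objects[oid] = (otype, data)
        self.objects[oid] = (otype, data)

    def get_object(self, oid):
        if oid not in self.objects:
            self.objects[oid] = self.store.objects[oid]
        return self.objects[oid]

    def add_assoc(self, src, atype, dst):
        key = (src, atype)
        self.store.assocs[key].append(dst)
        if key in self.assocs:            # keep a cached edge list current
            self.assocs[key].append(dst)

    def get_assocs(self, src, atype):
        key = (src, atype)
        if key not in self.assocs:
            self.assocs[key] = list(self.store.assocs.get(key, []))
        return list(self.assocs[key])

    def assoc_count(self, src, atype):
        # A simple graph operation answered on the caching tier.
        return len(self.get_assocs(src, atype))
```

Callers only see `GraphCache`, so the store behind it could be swapped without touching them, and an operation such as `assoc_count` returns a number instead of shipping the whole edge list across the network.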
Tao opens up possibilities
▪  Alternate      storage systems
 ▪    Multiple storage systems
      ▪    To accommodate different use cases (access patterns)



▪  Even      more powerful Graph operations


▪  Multi-Tiered      caching
Cache Tier Facts
▪  Memcache

 ▪    Low CPU utilization
 ▪    Little use for Flash since it is bottlenecked on network
▪  Tao

 ▪    Much higher CPU load
 ▪    Will continue to increase as it supports more complex operations
 ▪    Could use Flash in a multi-tiered cache hierarchy
Evolution of the Software Architecture
Evolution of Services Tier
                         Web Tier



Cache Tier                            Services Tier




                       Storage Tier
Life before Services
Example: Wish your friend a Happy Birthday
[Figure: Web Tier, Cache Tier, Storage Tier]
Inefficient and Messy
•  Potentially access hundreds of machines
•  Solution: Nightly cron jobs
•  Issues with corner cases
What about more complex problems?
Solution: Build Specialized Services
A more complex service: News Feed
Aggregation of your friends’ activity
One of many (100s) services at Facebook
News Feed Product characteristics
▪  Real-time   distribution
 ▪    Along edges on the Social Graph
▪  Writer   can potentially broadcast to very large audience




▪  Reader   wants different & dynamic ways to filter data
 ▪    Average user has 1000s of stories per day from friends/pages
 ▪    Friend list, Recency, Aggregation, Ranking, etc.
News Feed Service
[Figure: Service: News Feed, receiving User Update (Write) and Query (Read)]
▪  Build   and maintain an index: Distributed
▪  Rank:   Multiple ranking algorithms
Two approaches: Push vs. Pull
▪  Push approach
  ▪    Distribute actions by reader
  ▪    Write broadcasts, read one location
▪  Pull approach
  ▪    Distribute actions by writer
  ▪    Write one location, read gathers


▪  Pull   model is preferred because
  ▪    More dynamic: Easier to iterate
  ▪    “In a social graph, the number of incoming edges is much smaller than the
       outgoing ones.”


[Figure: example of 9,000,000 outgoing edges vs. 621 incoming edges]
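The difference between the two approaches comes down to where the fan-out happens. The sketch below is a toy comparison under simplifying assumptions: `PushFeed` and `PullFeed` are hypothetical classes, the graph is a plain dictionary, and `writes` simply counts stored copies.

```python
from collections import defaultdict


class PushFeed:
    """Distribute by reader: a post is copied into every follower's inbox."""

    def __init__(self, followers):
        self.followers = followers        # writer -> list of readers
        self.inbox = defaultdict(list)
        self.writes = 0

    def post(self, writer, story):
        for reader in self.followers.get(writer, ()):
            self.inbox[reader].append((writer, story))
            self.writes += 1

    def read(self, reader):
        return list(self.inbox.get(reader, ()))   # one location


class PullFeed:
    """Distribute by writer: a post is stored once, reads gather."""

    def __init__(self, following):
        self.following = following        # reader -> list of writers
        self.outbox = defaultdict(list)
        self.writes = 0

    def post(self, writer, story):
        self.outbox[writer].append(story)         # one location
        self.writes += 1

    def read(self, reader):
        return [
            (writer, story)
            for writer in self.following.get(reader, ())
            for story in self.outbox.get(writer, ())
        ]
```

With a writer who has millions of followers, `PushFeed.post` does millions of writes while `PullFeed.post` does one; the pull side pays instead on each read, where the number of writers to gather from is typically far smaller.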
News Feed Service: Big Picture
[Figure: Service: News Feed, made of Aggregators and Leafs, receiving User Update (Write) and Query (Read)]
▪  Pull   Model
  ▪    Leafs: One copy of the entire index. Stored in memory (Soft state)
  ▪    Aggregators: Aggregate results on the read path (Stateless)
News Feed Service: Writes
[Figure: Service: News Feed, made of Aggregators and Leafs, receiving User Update (Write) and Query (Read)]
▪  On   User update (Write)
 ▪    Index sharded by Writer
 ▪    Need to update one leaf
News Feed Service: Reads
[Figure: Service: News Feed, made of Aggregators and Leafs, receiving User Update (Write) and Query (Read)]
▪  On   Query (Read)
 ▪    Query all leafs
 ▪    Then do aggregation/ranking
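The write and read paths described on these slides can be sketched as one leaf set and a stateless aggregation function. This is an illustrative sketch only: the `Leaf` and `LeafSet` classes, the CRC32-based sharding, and ranking purely by timestamp are assumptions, not the service's actual design.

```python
import heapq
from zlib import crc32


class Leaf:
    """Holds one shard of the index in memory (soft state)."""

    def __init__(self):
        self.by_writer = {}

    def add(self, writer, timestamp, story):
        self.by_writer.setdefault(writer, []).append((timestamp, writer, story))

    def query(self, writers, limit):
        hits = [e for w in writers for e in self.by_writer.get(w, ())]
        return heapq.nlargest(limit, hits)


class LeafSet:
    """One full copy of the index, sharded by writer across leaves."""

    def __init__(self, size):
        self.leaves = [Leaf() for _ in range(size)]

    def leaf_for(self, writer):
        return self.leaves[crc32(writer.encode()) % len(self.leaves)]

    def write(self, writer, timestamp, story):
        # A user update touches exactly one leaf.
        self.leaf_for(writer).add(writer, timestamp, story)


def aggregate(leaf_set, friends, limit=10):
    # A query fans out to every leaf, then merges and ranks the results.
    partial = [leaf.query(friends, limit) for leaf in leaf_set.leaves]
    return heapq.nlargest(limit, (e for hits in partial for e in hits))
```

`aggregate` holds no state between calls, so adding read capacity means adding more machines that run it.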
News Feed Service: Scalability
[Figure: Service: News Feed, made of Aggregators and Leafs, receiving User Update (Write) and Query (Read)]
▪  1000s   of machines
 ▪    Leafs: Multiple sets. Each set (10s of machines) has the entire index
 ▪    Aggregators: Stateless. Scale with load.
News Feed Service: Reliability
▪  Dealing      with (daily) failures
 ▪    Large number of failure types
      ▪    Hardware/software
      ▪    Servers/Networks
      ▪    Intermittent/Permanent
      ▪    Local/Global
▪  Keep      the software architecture simple
 ▪    Stateless components are a plus
▪  For     example, on read requests:
 ▪    If a leaf is inaccessible, failover the request to a different set
 ▪    If an aggregator is inaccessible, just pick another
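The read-path failover rule can be sketched as a loop over leaf sets. This is a minimal illustration; `LeafUnavailable`, `LeafSet`, `read_with_failover`, and the `healthy` and `broken` leaf stand-ins are hypothetical names, and real failure detection (timeouts, retries) is left out.

```python
class LeafUnavailable(Exception):
    """Raised when a leaf cannot be reached."""


class LeafSet:
    def __init__(self, name, leaves):
        self.name = name
        self.leaves = leaves   # callables that take a query

    def query_all(self, query):
        # A read needs every leaf in the set; any failure aborts the set.
        return [leaf(query) for leaf in self.leaves]


def healthy(query):
    return f"hits:{query}"


def broken(query):
    raise LeafUnavailable("leaf is down")


def read_with_failover(leaf_sets, query):
    last_error = None
    for leaf_set in leaf_sets:
        try:
            return leaf_set.name, leaf_set.query_all(query)
        except LeafUnavailable as err:
            last_error = err   # fail over to the next full copy
    raise LeafUnavailable("no complete leaf set available") from last_error
```

Because each set holds the entire index, retrying the whole request on another set is enough; no partial results need to be reconciled.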
News Feed Service Facts
▪  Number of leafs dominates the number of aggregators
 ▪    Reads are more expensive than writes
 ▪    Every read (query) involves one aggregator and every leaf in the set
▪  Very high network load between aggregator and leafs
 ▪    Important to keep a full leaf set within a single rack of machines
 ▪    Uses Flash on leafs to ensure this
Evolution of the Software Architecture
Summary
▪  Web Tier: HipHop Compiler & VM
▪  Cache Tier: Memcache & Tao
▪  Services Tier: News Feed
▪  Storage Tier: BLOB Storage
Outline
 1   What makes scaling Facebook challenging?

 2   Evolution of Software Architecture

 3   Evolution of Datacenter Architecture
Recall: Characteristics of Facebook
▪  Massive   Scale
▪  Social   Graph
 ▪    Expensive to query
 ▪    Hard to partition
 ▪    Large working set (Fat tail)
▪  Product   is rapidly evolving
▪  Hardware    failures are routine
Implications
▪  On      Datacenters
 ▪    Small number of massive datacenters (currently 4)
▪  On      Servers
 ▪    Minimize the “classes” (single digit) of machines deployed
      ▪    Web Tier, Cache Tier, Storage Tier, and a couple of special configurations
▪  Started      with
 ▪    Leased datacenters + Standard server configurations from vendors
▪  Moving       to
 ▪    Custom built datacenters + custom servers
 ▪    Continue to rely on a small number of machine “classes”
Servers
[Figure: Data Center (Electrical, Mechanical, Battery Cabinet, Triplet Rack) and Server (Chassis, AMD Motherboard, Intel Motherboard, Power Supply)]
Evaporative cooling system
Open Compute
▪  Custom    datacenters & servers
▪  Minimizes   power loss
 ▪    PUE of 1.07
▪  Vanity   Free design
 ▪    Designed for ease of operations
▪  Designs   are open-sourced
 ▪    More on the way
Outline
 1   What makes scaling Facebook challenging?

 2   Evolution of Software Architecture

 3   Evolution of Datacenter Architecture


               Questions?
(c) 2009 Facebook, Inc. or its licensors. "Facebook" is a registered trademark of Facebook, Inc. All rights reserved.

More Related Content

Viewers also liked

Realtime Apache Hadoop at Facebook
Realtime Apache Hadoop at FacebookRealtime Apache Hadoop at Facebook
Realtime Apache Hadoop at Facebook
parallellabs
 
Ipython server(Jupyter Server) 만들기
Ipython server(Jupyter Server) 만들기Ipython server(Jupyter Server) 만들기
Ipython server(Jupyter Server) 만들기
Hyun-sik Yoo
 
intercentros abril 2010
intercentros abril 2010intercentros abril 2010
intercentros abril 2010
oscargaliza
 
LA CRÓNICA 691
LA CRÓNICA 691LA CRÓNICA 691
Premiadas fotciencia13
Premiadas fotciencia13Premiadas fotciencia13
Premiadas fotciencia13
profesdelCarmen
 
Action type fr
Action type frAction type fr
Action type fr
slatefr
 
Was macht machen.de - Beispiele
Was macht machen.de - BeispieleWas macht machen.de - Beispiele
Was macht machen.de - Beispiele
Michael Leibrecht
 
Cómo enseñar ciencias
Cómo enseñar cienciasCómo enseñar ciencias
Cómo enseñar ciencias
Raul Herrera
 
Batch plant maintenance
Batch plant maintenanceBatch plant maintenance
Batch plant maintenance
Daniia Roxio
 
Presentación de Moodle
Presentación de MoodlePresentación de Moodle
Presentación de Moodle
cruizgaray
 
REDES NEURONALES
REDES NEURONALESREDES NEURONALES
REDES NEURONALES
Joan Luis Avalos Caycho
 
Individual and societal risk
Individual and societal riskIndividual and societal risk
Individual and societal risk
Sruthi Madhu
 
Data flow vs. procedural programming: How to put your algorithms into Flink
Data flow vs. procedural programming: How to put your algorithms into FlinkData flow vs. procedural programming: How to put your algorithms into Flink
Data flow vs. procedural programming: How to put your algorithms into Flink
Mikio L. Braun
 
El cambio
El cambioEl cambio
El cambio
memoop
 
The influence-of-prayer-coping-on-patients
The influence-of-prayer-coping-on-patientsThe influence-of-prayer-coping-on-patients
The influence-of-prayer-coping-on-patients
Theo Theo Herbots the voice from #Tienen
 

Viewers also liked (15)

Realtime Apache Hadoop at Facebook
Realtime Apache Hadoop at FacebookRealtime Apache Hadoop at Facebook
Realtime Apache Hadoop at Facebook
 
Ipython server(Jupyter Server) 만들기
Ipython server(Jupyter Server) 만들기Ipython server(Jupyter Server) 만들기
Ipython server(Jupyter Server) 만들기
 
intercentros abril 2010
intercentros abril 2010intercentros abril 2010
intercentros abril 2010
 
LA CRÓNICA 691
LA CRÓNICA 691LA CRÓNICA 691
LA CRÓNICA 691
 
Premiadas fotciencia13
Premiadas fotciencia13Premiadas fotciencia13
Premiadas fotciencia13
 
Action type fr
Action type frAction type fr
Action type fr
 
Was macht machen.de - Beispiele
Was macht machen.de - BeispieleWas macht machen.de - Beispiele
Was macht machen.de - Beispiele
 
Cómo enseñar ciencias
Cómo enseñar cienciasCómo enseñar ciencias
Cómo enseñar ciencias
 
Batch plant maintenance
Batch plant maintenanceBatch plant maintenance
Batch plant maintenance
 
Presentación de Moodle
Presentación de MoodlePresentación de Moodle
Presentación de Moodle
 
REDES NEURONALES
REDES NEURONALESREDES NEURONALES
REDES NEURONALES
 
Individual and societal risk
Individual and societal riskIndividual and societal risk
Individual and societal risk
 
Data flow vs. procedural programming: How to put your algorithms into Flink
Data flow vs. procedural programming: How to put your algorithms into FlinkData flow vs. procedural programming: How to put your algorithms into Flink
Data flow vs. procedural programming: How to put your algorithms into Flink
 
El cambio
El cambioEl cambio
El cambio
 
The influence-of-prayer-coping-on-patients
The influence-of-prayer-coping-on-patientsThe influence-of-prayer-coping-on-patients
The influence-of-prayer-coping-on-patients
 

Similar to Hpca2012 facebook keynote

Android apps promotion and ads optimization in Japan market
Android apps promotion and ads optimization in Japan marketAndroid apps promotion and ads optimization in Japan market
Android apps promotion and ads optimization in Japan market
01Booster
 
一秒間にソーシャルとモバイルで何が起きているか
一秒間にソーシャルとモバイルで何が起きているか一秒間にソーシャルとモバイルで何が起きているか
一秒間にソーシャルとモバイルで何が起きているか
Jun Kaneko
 
MozCamp 2009
MozCamp 2009MozCamp 2009
MozCamp 2009
David Tenser
 
10/17開催:ITS世界会議東京2013アトリウム企画 「ITSお役立ちアプリ大集合」 発表資料 by jig.jp
10/17開催:ITS世界会議東京2013アトリウム企画 「ITSお役立ちアプリ大集合」 発表資料 by jig.jp10/17開催:ITS世界会議東京2013アトリウム企画 「ITSお役立ちアプリ大集合」 発表資料 by jig.jp
10/17開催:ITS世界会議東京2013アトリウム企画 「ITSお役立ちアプリ大集合」 発表資料 by jig.jp
Taisuke Fukuno
 
Game Engines and Middleware (2011)
Game Engines and Middleware (2011)Game Engines and Middleware (2011)
Game Engines and Middleware (2011)
Mark DeLoura
 
"Converged Communications -- Impact and Requirements on future handsets
"Converged Communications -- Impact and Requirements on future handsets"Converged Communications -- Impact and Requirements on future handsets
"Converged Communications -- Impact and Requirements on future handsets
John Loughney
 
Mobile driving Internet to the masses - Mobile Internet World 2012
Mobile driving Internet to the masses - Mobile Internet World 2012Mobile driving Internet to the masses - Mobile Internet World 2012
Mobile driving Internet to the masses - Mobile Internet World 2012
Rob Van Den Dam
 
Ict Education &amp; Job Trends July 2011
Ict Education &amp; Job Trends July 2011Ict Education &amp; Job Trends July 2011
Ict Education &amp; Job Trends July 2011
Garry Roberton
 
Delivering Web to Mobile
Delivering Web to MobileDelivering Web to Mobile
Delivering Web to Mobile
The University of Manchester
 
Why MT Matters
Why MT MattersWhy MT Matters
Why MT Matters
Kirti Vashee
 
Ict education & job trends may 2012
Ict education  & job trends may 2012Ict education  & job trends may 2012
Ict education & job trends may 2012
Garry Roberton
 
Media Drive Viewpoint 2010年10月號
Media Drive Viewpoint 2010年10月號Media Drive Viewpoint 2010年10月號
Media Drive Viewpoint 2010年10月號
Mooi Hsieh
 
Who pays for mobile broadband 2.0
Who pays for mobile broadband 2.0Who pays for mobile broadband 2.0
Who pays for mobile broadband 2.0
Dr. Kim (Kyllesbech Larsen)
 
Mobile data consumption by smartphone users
Mobile data consumption by smartphone usersMobile data consumption by smartphone users
Mobile data consumption by smartphone users
skripnikov
 
SNS Based Project Management Communication
SNS Based Project Management CommunicationSNS Based Project Management Communication
SNS Based Project Management Communication
Peter Kim
 
UNESCO | Touch and Mobile Technologies for the Classroom session 4
UNESCO | Touch and Mobile Technologies for the Classroom session 4UNESCO | Touch and Mobile Technologies for the Classroom session 4
UNESCO | Touch and Mobile Technologies for the Classroom session 4
Giorgio Ungania
 
The Mobile Data Challenge (by Economist Intelligence)
The Mobile Data Challenge (by Economist Intelligence)The Mobile Data Challenge (by Economist Intelligence)
The Mobile Data Challenge (by Economist Intelligence)
Kirill Smirnov
 
Yahoo Nielsen 2011 internet usage philippines
Yahoo Nielsen 2011 internet usage philippinesYahoo Nielsen 2011 internet usage philippines
Yahoo Nielsen 2011 internet usage philippines
Ray Evangelista
 
Digital Philippines 2011 Yahoo - Nielsen Net Index Highlights
Digital Philippines 2011 Yahoo - Nielsen Net Index HighlightsDigital Philippines 2011 Yahoo - Nielsen Net Index Highlights
Digital Philippines 2011 Yahoo - Nielsen Net Index Highlights
Janette Toral
 
Digitalk Martin Blinder - 260911
Digitalk Martin Blinder - 260911Digitalk Martin Blinder - 260911
Digitalk Martin Blinder - 260911
Martin Blinder
 

Similar to Hpca2012 facebook keynote (20)

Android apps promotion and ads optimization in Japan market
Android apps promotion and ads optimization in Japan marketAndroid apps promotion and ads optimization in Japan market
Android apps promotion and ads optimization in Japan market
 
一秒間にソーシャルとモバイルで何が起きているか
一秒間にソーシャルとモバイルで何が起きているか一秒間にソーシャルとモバイルで何が起きているか
一秒間にソーシャルとモバイルで何が起きているか
 
MozCamp 2009
MozCamp 2009MozCamp 2009
MozCamp 2009
 
10/17開催:ITS世界会議東京2013アトリウム企画 「ITSお役立ちアプリ大集合」 発表資料 by jig.jp
10/17開催:ITS世界会議東京2013アトリウム企画 「ITSお役立ちアプリ大集合」 発表資料 by jig.jp10/17開催:ITS世界会議東京2013アトリウム企画 「ITSお役立ちアプリ大集合」 発表資料 by jig.jp
10/17開催:ITS世界会議東京2013アトリウム企画 「ITSお役立ちアプリ大集合」 発表資料 by jig.jp
 
Game Engines and Middleware (2011)
Game Engines and Middleware (2011)Game Engines and Middleware (2011)
Game Engines and Middleware (2011)
 
"Converged Communications -- Impact and Requirements on future handsets
"Converged Communications -- Impact and Requirements on future handsets"Converged Communications -- Impact and Requirements on future handsets
"Converged Communications -- Impact and Requirements on future handsets
 
Hpca2012 facebook keynote

  • 2. Social Networking at Scale Sanjeev Kumar Facebook
  • 3. Outline 1 What makes scaling Facebook challenging? 2 Evolution of Software Architecture 3 Evolution of Datacenter Architecture
  • 4. 845M users worldwide [Growth timeline: 2004, 2005, 2006, 2009, 2010] ▪  500M daily active users ▪  700B minutes spent on the site every month ▪  30B pieces of content shared each month ▪  2.5M sites using social plugins
  • 5. What makes scaling Facebook challenging? ▪  Massive scale ▪  Social Graph is central to everything on the site ▪  Rapidly evolving product ▪  Complex Infrastructure
  • 6. Traditional websites [Diagram: each user (Bob, Beth, Julie, Sue, Dan, Erin) shown next to that user’s own data] Horizontally scalable
  • 7. Social Graph People are only one dimension of the social graph
  • 8. Facebook: The data is interconnected ▪  Common operation: Query the social graph [Diagram labels: Bob, Beth, Erin; Servers]
  • 9. Social Graph Cont’d ▪  Highly connected ▪  4.74 average degree-of-separation between users on Facebook ▪  Made denser by our connections to places, interests, etc. ▪  Examples of Queries on Social Graph ▪  What are the most interesting updates from my connections? ▪  Who are my connections in real-life who I am not connected to on Facebook? ▪  What are the most relevant events tonight near me and related to my interests? Or that my friends are going to?
  • 10. Social Graph Cont’d ▪  System Implications of Social Graph ▪  Expensive to query ▪  Difficult to partition ▪  Highly customized for each user ▪  Large working sets (Fat tail)
  • 11. What makes scaling Facebook challenging? ▪  Massive scale ▪  Social Graph: Querying is expensive at every level ▪  Rapidly evolving product ▪  Complex Infrastructure
  • 12. Product Launches [Timeline figure: users (0M to 800M) against years (2004 to 2011), annotated with launches including Sign Up, New Apps, NewsFeed, Platform launch, Translations, The Stream, Open Graph, Social Plugins, Places, Photos Update, Mobile Event, Unified Mobile Sites, Groups, New Profile, Messages, Questions, iPad App, Video Calling, Music, Timeline]
  • 13. Rapidly evolving product ▪  Facebook is a platform ▪  External developers are innovating as well ▪  One integrated product ▪  Changes in one part have major implications on other parts ▪  For example, Timeline surfaces some of the older photos ▪  System Implications ▪  Build for flexibility (avoid premature optimizations) ▪  Revisit design tradeoffs (they might have changed)
  • 14. What makes scaling Facebook challenging? ▪  Massive scale ▪  Social Graph: Querying is expensive at every level ▪  Rapidly evolving product ▪  Complex Infrastructure
  • 15. Complex infrastructure ▪  Large number of Software components ▪  Multiple Storage systems ▪  Multiple Caching Systems ▪  100s of specialized services ▪  Often deploy cutting-edge hardware ▪  At our scale, we are early adopters of new hardware ▪  Failure is routine ▪  Systems implications ▪  Keep things as simple as possible
  • 16. Outline 1 What makes scaling Facebook challenging? 2 Evolution of Software Architecture 3 Evolution of Datacenter Architecture
  • 17. Evolution of the Software Architecture Evolution of each of these 4 tiers Web Tier Cache Tier Services Tier Storage Tier
  • 18. Evolution of the Software Architecture Evolution of Web Tier Web Tier Cache Tier Services Tier Storage Tier
  • 19. Web Tier ▪  Stateless request processing ▪  Gather Data: from storage tiers ▪  Transform: Ranking (for Relevance) and Filtering (for Privacy) ▪  Presentation: Generate HTML ▪  Runs PHP code ▪  Widely used for web development ▪  Dynamically typed scripting language ▪  Integrated product → One single source tree for the entire code ▪  Same “binary” on every web tier box ▪  Scalability: Efficiently process each request
  • 20. Generation 1: Zend Interpreter for PHP ▪  Reasonably fast (for an interpreter) ▪  Rapid development ▪  Don’t have to recompile during testing ▪  But: at scale, performance matters [Bar chart: Relative Execution Time, axis 0 to 45, for C++, Java, C#, Ocaml, Ruby, Python, PHP Zend]
  • 21. Generation 2: HipHop Compiler for PHP [Bar chart: Relative Execution Time, axis 0 to 45, for C++, Java, C#, Ocaml, Ruby, Python, PHP Zend, PHP HipHop] ▪  Technically challenging, Impressive gains, Still room for improvement ▪  But: takes time to compile (slows down development) ▪  Solution: HipHop interpreter ▪  But: Interpreter and compiler sometimes disagree ▪  Performance Gains are slowing. Can we improve performance further?
  • 22. Generation 3: HipHop Virtual Machine [Diagram components: PHP, Parser, AST, Bytecode Generator, Bytecode, Optimizer, HHVM Interpreter, HHVM JIT] ▪  Best of both worlds ▪  Common path, well-specified bytecode semantics ▪  Potential performance upside from dynamic specialization ▪  Work-In-Progress
  • 23. Web Tier Facts ▪  Execution time only a small factor in user-perceived performance ▪  Can potentially use less powerful processors ▪  Throughput matters more than latency (True for other tiers as well) ▪  Memory management (allocation/free) is a significant remaining cost ▪  Copy-on-Write in HipHop implementation ▪  Poor Instruction Cache Performance ▪  Partly due to the one massive binary ▪  Web load predictable in aggregate ▪  Can use less dynamic techniques to save power ▪  Potentially even turn off machines. Failure rates are an open question.
  • 24. Evolution of the Software Architecture Evolution of Storage Tier Web Tier Cache Tier Services Tier Storage Tier
  • 25. Evolution of a Storage Tier ▪  Multiple storage systems at Facebook ▪  MySQL ▪  HBase (NoSQL) ▪  Haystack (for BLOBs) ▪  Case Study: BLOB storage ▪  BLOB: Binary Large Objects (Photos, Videos, Email attachments, etc.) ▪  Large files, No updates/appends, Sequential reads ▪  More than 100 petabytes ▪  250 million photos uploaded per day
  • 26. Generation 1: Commercial Filers ▪  New Photos Product ▪  First build it the easy way ▪  Commercial Storage Tier + HTTP server ▪  Each Photo is stored as a separate file ▪  Quickly up and running ▪  Reliably Store and Serve Photos ▪  But: Inefficient ▪  Limited by IO rate and not storage density ▪  Average 10 IOs to serve each photo ▪  Wasted IO to traverse the directory structure [Diagram: NFS Storage]
  • 27. Generation 2: Gen 1 Optimized ▪  Optimization Example: ▪  Cache NFS handles to reduce wasted IO operations ▪  Reduce the number of IO operations per photo by 3X ▪  But: ▪  Still expensive: High end storage boxes ▪  Still inefficient: Still IO bound and wasting IOs [Diagram: NFS Storage Optimized. Directory inode (owner info, size, timestamps, blocks); directory data (inode #, filename); file inode (owner info, size, timestamps, blocks); data]
  • 28. Generation 3: Haystack [OSDI’10] ▪  Custom Solution ▪  Commodity Storage Hardware ▪  Optimized for 1 IO operation per request ▪  File system on top of a file system ▪  Compact Index in memory ▪  Metadata and data laid out contiguously ▪  Efficient from IO perspective ▪  But: ▪  Problem has changed now [Diagram: Superblock followed by Needle 1, Needle 2, Needle 3; needle fields shown: Magic No, Key, Flags, Photo, Checksum] Single Disk IO to read/write a photo
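The Haystack slide combines a compact in-memory index with metadata and data laid out contiguously, so a photo can be fetched with one read. A minimal sketch of that idea follows. It is not Facebook's implementation: the `NeedleStore` class, the header layout, and the CRC32 checksum are assumptions made for illustration, and an in-memory `BytesIO` stands in for the on-disk volume.

```python
import io
import struct
import zlib

# Hypothetical needle header: key (8 bytes), flags (1 byte), payload size (4 bytes).
HEADER = struct.Struct(">QBI")
CHECKSUM = struct.Struct(">I")


class NeedleStore:
    """Append-only blob volume with an in-memory index (illustrative only)."""

    def __init__(self, volume=None):
        # BytesIO stands in for one large file on a commodity disk.
        self.volume = volume if volume is not None else io.BytesIO()
        self.index = {}  # key -> (offset, total needle length)

    def put(self, key, photo, flags=0):
        # Header, payload, and checksum are written contiguously at the end.
        offset = self.volume.seek(0, io.SEEK_END)
        needle = (HEADER.pack(key, flags, len(photo)) + photo
                  + CHECKSUM.pack(zlib.crc32(photo)))
        self.volume.write(needle)
        self.index[key] = (offset, len(needle))

    def get(self, key):
        offset, length = self.index[key]   # memory lookup, no disk IO
        self.volume.seek(offset)
        needle = self.volume.read(length)  # the single read for this request
        stored_key, _flags, size = HEADER.unpack_from(needle)
        photo = needle[HEADER.size:HEADER.size + size]
        (checksum,) = CHECKSUM.unpack_from(needle, HEADER.size + size)
        if stored_key != key or checksum != zlib.crc32(photo):
            raise IOError("corrupt needle")
        return photo
```

Because the index maps a key directly to an offset and length, no directory or inode traversal is needed on the read path, which is the contrast the slides draw with the NFS-based generations.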
  • 29. Generation 4: Tiered Storage ▪  Usage characteristics ▪  Fat tail of accesses: everyone has friends :) ▪  A large fraction of the tier is no longer IO limited (new) ▪  Storing efficiency matters much more than serving efficiency ▪  Approach: Tiered Storage ▪  Last layer optimized for storage efficiency and durability ▪  Fronted by caching tier optimized for serving efficiency ▪  Work-In-Progress
  • 30. BLOB Storage Facts ▪  Hot and Warm data. Little cold data. ▪  Low CPU utilization ▪  Single digit percentages ▪  Fixed memory need ▪  Enough for the index ▪  Little use for anything more ▪  Next generation will use denser storage systems ▪  Do we even bother with hardware RAID? ▪  Details to be publicly released soon
  • 31. Evolution of the Software Architecture Evolution of Cache Tier Web Tier Cache Tier Services Tier Storage Tier
  • 32. First few Generations: Memcache ▪  Look-Aside Cache ▪  Key-Value Store ▪  Does one thing very well ▪  Does little else ▪  Improved performance by 10X [Diagram: Web Tier, Cache Tier: Memcache, Storage Tier]
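In a look-aside cache, the web tier itself checks the cache, falls back to storage on a miss, and then fills the cache. The sketch below illustrates that pattern with plain dictionaries standing in for memcache and the storage tier. The `WebTierClient` name and the invalidate-on-write policy are assumptions for illustration, not details from the talk.

```python
class WebTierClient:
    """Look-aside caching as seen from the web tier (illustrative only)."""

    def __init__(self, storage):
        self.storage = storage  # dict standing in for the storage tier
        self.cache = {}         # dict standing in for the key-value cache
        self.storage_reads = 0

    def read(self, key):
        if key in self.cache:          # hit: storage is never contacted
            return self.cache[key]
        self.storage_reads += 1        # miss: the client goes to storage
        value = self.storage[key]
        self.cache[key] = value        # and fills the cache itself
        return value

    def write(self, key, value):
        self.storage[key] = value      # the client writes to storage directly
        self.cache.pop(key, None)      # then invalidates the cached copy


storage = {"user:1": "Bob"}
client = WebTierClient(storage)
client.read("user:1")
client.read("user:1")   # served from cache
```

Note that the client code knows about both tiers, which matches the later point that the storage hierarchy is exposed to the web tier.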
  • 33. Memcache limitations ▪  “Values” are opaque ▪  End up moving huge amounts of data across the network ▪  Storage hierarchy exposed to web tier ▪  Harder to explore alternative storage solutions ▪  Harder to keep consistent ▪  Harder to protect the storage tier from thundering herds
  • 34. Alternative Caching Tier: Tao ▪  1. Has a data model ▪  2. Write-Through Cache ▪  3. Abstracts the storage tier [Diagram: Web Tier, Cache Tier: Tao, Storage Tier]
  • 35. Tao Cont’d ▪  Data Model ▪  Objects (Nodes) ▪  Associations (edges) ▪  Have “type” and data ▪  Simple graph operations on them ▪  Efficient: Content-aware ▪  Can be performed on the caching tier ▪  In production for a couple of years ▪  Serving a big portion of data accesses
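The Tao slides describe a write-through cache with a data model of typed objects and associations, where simple graph operations run on the caching tier. A minimal sketch under those stated properties follows. The class and method names (`GraphStore`, `GraphCache`, `assoc_get`, `assoc_count`) are hypothetical and are not Tao's actual API.

```python
class GraphStore:
    """Stand-in for the storage tier (illustrative only)."""

    def __init__(self):
        self.objects = {}  # object id -> (type, data)
        self.assocs = {}   # (source id, assoc type) -> list of destination ids


class GraphCache:
    """Write-through cache that understands objects and associations."""

    def __init__(self, store):
        self.store = store
        self.objects = {}
        self.assocs = {}

    def put_object(self, oid, otype, data):
        self.store.objects[oid] = (otype, data)  # write through to storage
        self.objects[oid] = (otype, data)        # and keep the cache current

    def get_object(self, oid):
        if oid not in self.objects:
            self.objects[oid] = self.store.objects[oid]
        return self.objects[oid]

    def add_assoc(self, src, atype, dst):
        key = (src, atype)
        self.store.assocs.setdefault(key, []).append(dst)
        if key in self.assocs:                   # update in place, no invalidation
            self.assocs[key].append(dst)

    def assoc_get(self, src, atype):
        key = (src, atype)
        if key not in self.assocs:
            # Copy so cache and storage never share one list object.
            self.assocs[key] = list(self.store.assocs.get(key, []))
        return self.assocs[key]

    def assoc_count(self, src, atype):
        # A simple graph operation answered on the caching tier.
        return len(self.assoc_get(src, atype))
```

Callers only see `GraphCache`, so the storage system behind it could be swapped without changing web tier code, which is the abstraction benefit the slides point to.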
  • 36. Tao opens up possibilities ▪  Alternate storage systems ▪  Multiple storage systems ▪  To accommodate different use case (access patterns) ▪  Even more powerful Graph operations ▪  Multi-Tiered caching
  • 37. Cache Tier Facts ▪  Memcache ▪  Low CPU utilization ▪  Little use for Flash since it is bottlenecked on network ▪  Tao ▪  Much higher CPU load ▪  Will continue to increase as it supports more complex operations ▪  Could use Flash in a multi-tiered cache hierarchy
  • 38. Evolution of the Software Architecture Evolution of Services Tier Web Tier Cache Tier Services Tier Storage Tier
  • 39. Life before Services ▪  Example: Wish your friend a Happy Birthday ▪  Inefficient and Messy ▪  Potentially access hundreds of machines ▪  Solution: Nightly cron jobs ▪  Issues with corner cases ▪  What about more complex problems? ▪  Solution: Build Specialized Services [Diagram: Web Tier, Cache Tier, Storage Tier]
  • 40. A more complex service: News Feed Aggregation of your friends’ activity One of many (100s) services at Facebook
  • 41. News Feed Product characteristics ▪  Real-time distribution ▪  Along edges on the Social Graph ▪  Writer can potentially broadcast to very large audience ▪  Reader wants different & dynamic ways to filter data ▪  Average user has 1000s of stories per day from friends/pages ▪  Friend list, Recency, Aggregation, Ranking, etc.
  • 42. News Feed Service [Diagram: User Update (Write) and Query (Read) go to Service: News Feed] ▪  Build and maintain an index: Distributed ▪  Rank: Multiple ranking algorithms
  • 43. Two approaches: Push vs. Pull ▪  Push approach ▪  Distribute actions by reader ▪  Write broadcasts, read one location ▪  Pull approach ▪  Distribute actions by writer ▪  Write one location, read gathers ▪  Pull model is preferred because ▪  More dynamic: Easier to iterate ▪  “In a social graph, the number of incoming edges is much smaller than the outgoing ones.” [Figure labels: 9,000,000 and 621]
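The difference between the two approaches can be sketched by counting storage writes per post. In the toy model below, push copies each story into every reader's inbox, while pull stores it once under the writer and gathers at read time. The `PushFeed` and `PullFeed` classes and the example follower counts are assumptions for illustration only.

```python
from collections import defaultdict


class PushFeed:
    """Distribute by reader: a write broadcasts, a read hits one location."""

    def __init__(self, followers):
        self.followers = followers      # writer -> set of readers
        self.inbox = defaultdict(list)  # reader -> stories
        self.write_ops = 0

    def post(self, writer, story):
        for reader in self.followers.get(writer, ()):
            self.inbox[reader].append(story)
            self.write_ops += 1

    def read(self, reader):
        return list(self.inbox[reader])


class PullFeed:
    """Distribute by writer: a write hits one location, a read gathers."""

    def __init__(self, followers):
        self.following = defaultdict(set)  # reader -> writers they follow
        for writer, readers in followers.items():
            for reader in readers:
                self.following[reader].add(writer)
        self.outbox = defaultdict(list)    # writer -> stories
        self.write_ops = 0

    def post(self, writer, story):
        self.outbox[writer].append(story)
        self.write_ops += 1

    def read(self, reader):
        return [s for w in sorted(self.following[reader]) for s in self.outbox[w]]


followers = {"popular_page": {"reader%d" % i for i in range(1000)}}
push, pull = PushFeed(followers), PullFeed(followers)
push.post("popular_page", "hello")
pull.post("popular_page", "hello")
print(push.write_ops, pull.write_ops)
```

Both models return the same story to any reader; the pull model simply defers the work to read time, where filtering and ranking can be changed without rewriting stored inboxes.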
  • 44. News Feed Service: Big Picture [Diagram: User Update (Write) and Query (Read) go to Service: News Feed, made up of Aggregators and Leafs] ▪  Pull Model ▪  Leafs: One copy of the entire index. Stored in memory (Soft state) ▪  Aggregators: Aggregate results on the read path (Stateless)
  • 45. News Feed Service: Writes [Same diagram: Aggregators and Leafs] ▪  On User update (Write) ▪  Index sharded by Writer ▪  Need to update one leaf
  • 46. News Feed Service: Reads [Same diagram: Aggregators and Leafs] ▪  On Query (Read) ▪  Query all leafs ▪  Then do aggregation/ranking
  • 47. News Feed Service: Scalability [Same diagram: Aggregators and Leafs] ▪  1000s of machines ▪  Leafs: Multiple sets. Each set (10s of machines) has the entire index ▪  Aggregators: Stateless. Scale with load.
  • 48. News Feed Service: Reliability ▪  Dealing with (daily) failures ▪  Large number of failure types ▪  Hardware/software ▪  Servers/Networks ▪  Intermittent/Permanent ▪  Local/Global ▪  Keep the software architecture simple ▪  Stateless components are a plus ▪  For example, on read requests: ▪  If a leaf is inaccessible, failover the request to a different set ▪  If an aggregator is inaccessible, just pick another
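The write path, read path, and failover behaviour described on the News Feed slides can be sketched together. In this toy version, each leaf set holds one full copy of the index sharded by writer, a write touches one leaf per set, a read queries every leaf in one set, and the aggregator moves to a different set when a leaf is unreachable. The class names, the modulo sharding, and ranking by recency are assumptions made for illustration.

```python
import heapq


class Leaf:
    """Holds the in-memory index shard for a subset of writers."""

    def __init__(self):
        self.stories = {}  # writer id -> list of (timestamp, text)
        self.up = True

    def add(self, writer, timestamp, text):
        self.stories.setdefault(writer, []).append((timestamp, text))

    def query(self, writers):
        if not self.up:
            raise ConnectionError("leaf unreachable")
        return [s for w in writers for s in self.stories.get(w, [])]


class LeafSet:
    """One full copy of the index, sharded by writer across its leaves."""

    def __init__(self, size):
        self.leaves = [Leaf() for _ in range(size)]

    def write(self, writer, timestamp, text):
        # A write touches exactly one leaf in this set.
        self.leaves[writer % len(self.leaves)].add(writer, timestamp, text)

    def read(self, writers):
        # A read fans out to every leaf in the set.
        return [s for leaf in self.leaves for s in leaf.query(writers)]


def aggregate(leaf_sets, friends, limit):
    """Stateless aggregator: query one leaf set, fail over to the next on error."""
    for leaf_set in leaf_sets:
        try:
            stories = leaf_set.read(friends)
        except ConnectionError:
            continue  # try a different full copy of the index
        return heapq.nlargest(limit, stories)  # rank by recency (newest first)
    raise ConnectionError("no leaf set available")
```

Because the aggregator keeps no state, any aggregator can serve any query, and losing a leaf costs only a retry against another set.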
  • 49. News Feed Service Facts ▪  Number of leafs dominates the number of aggregators ▪  Reads are more expensive than writes ▪  Every read (query) involves one aggregator and every leaf in the set ▪  Very high network load between aggregator and leafs ▪  Important to keep a full leaf set within a single rack of machines ▪  Uses Flash on leafs to ensure this
  • 50. Evolution of the Software Architecture: Summary ▪  Web Tier: HipHop Compiler & VM ▪  Cache Tier: Memcache & Tao ▪  Services Tier: News Feed ▪  Storage Tier: BLOB Storage
  • 51. Outline 1 What makes scaling Facebook challenging? 2 Evolution of Software Architecture 3 Evolution of Datacenter Architecture
  • 52. Recall: Characteristics of Facebook ▪  Massive Scale ▪  Social Graph ▪  Expensive to query ▪  Hard to partition ▪  Large working set (Fat tail) ▪  Product is rapidly evolving ▪  Hardware failures are routine
  • 53. Implications ▪  On Datacenters ▪  Small number of massive datacenters (currently 4) ▪  On Servers ▪  Minimize the “classes” (single digit) of machines deployed ▪  Web Tier, Cache Tier, Storage Tier, and a couple of special configurations ▪  Started with ▪  Leased datacenters + Standard server configurations from vendors ▪  Moving to ▪  Custom built datacenters + custom servers ▪  Continue to rely on a small number of machine “classes”
  • 54. Servers [Diagram. Data Center: Electrical, Mechanical, Battery Cabinet, Triplet Rack. Server: Chassis, AMD Motherboard, Intel Motherboard, Power Supply]
  • 57. Open Compute ▪  Custom datacenters & servers ▪  Minimizes power loss ▪  PUE of 1.07 ▪  Vanity Free design ▪  Designed for ease of operations ▪  Designs are open-sourced ▪  More on the way
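The 1.07 figure on this slide reads as a power usage effectiveness (PUE) ratio, which is total facility power divided by the power delivered to IT equipment. A short worked example of what that ratio implies follows. The 1000 kW IT load is a made-up number for illustration and does not come from the talk.

```python
def pue(total_facility_kw, it_equipment_kw):
    """Power usage effectiveness: total facility power over IT equipment power."""
    return total_facility_kw / it_equipment_kw


def facility_power_kw(it_equipment_kw, pue_value):
    """Total power drawn by the facility for a given IT load and PUE."""
    return it_equipment_kw * pue_value


def overhead_fraction(pue_value):
    """Non-IT power (cooling, power conversion, etc.) as a fraction of IT power."""
    return pue_value - 1.0


# Hypothetical 1000 kW of servers at the PUE quoted on the slide.
it_load_kw = 1000.0
total_kw = facility_power_kw(it_load_kw, 1.07)
print(round(total_kw, 1), round(overhead_fraction(1.07), 2))
```

At this ratio, roughly 7% extra power on top of the IT load goes to everything that is not a server, which is the sense in which the design minimizes power loss.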
  • 58. Outline 1 What makes scaling Facebook challenging? 2 Evolution of Software Architecture 3 Evolution of Datacenter Architecture Questions?
  • 59. (c) 2009 Facebook, Inc. or its licensors. "Facebook" is a registered trademark of Facebook, Inc. All rights reserved. 1.0