Learning and Development presents

    Open Talk Series

A series of illuminating talks and interactions that open our minds to new
ideas and concepts; that make us look for newer or better ways of doing what
we did; or point us to exciting things we have never done before. A range of
topics on Technology, Business, Fun and Life.

Be part of the learning experience at Aditi.

Join the talks. It's free.
Free as in freedom at work, not free beer.

It's not training. It's a mind-opener.

Speak at these events. Or bring an expert/friend to talk.

Mail OpenTalk@aditi.com with topic and availability.
HOW TO ENJOY AN OPEN TALK
                        • Bring coffee & friends
                        • Switch OFF mobile
                        • Switch ON mind
                        • Sign attendance sheet
                        • SHARE your wisdom
                        • QUESTION notions
                        • THANK the Talker
                        • SPREAD the good word
facebook architecture
Sundararajan Subramanian
Image Copyright: facebook
facebook in 20 Minutes
                        • 2.7M Photos
                        • 10.2M Comments
                        • 4.6M Messages

                        • Shared links: 1,000,000
                        • Tagged photos: 1,323,000
                        • Event invites sent out: 1,484,000
                        • Wall posts: 1,587,000
                        • Status updates: 1,851,000
                        • Friend requests accepted: 1,972,000
                        • Photos uploaded: 2,716,000
                        • Comments: 10,208,000
                        • Messages: 4,632,000

Statistics
What is Facebook
Technical challenges
Front End
Data arch
Services architecture
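To put the 20-minute figures in perspective, they can be converted to average per-second rates. This is only a back-of-the-envelope sketch: the dictionary restates the numbers from the slide, and the names `activity` and `per_second` are made up for this example.

```python
# Activity counts from the slide, all measured over one 20-minute window.
WINDOW_SECONDS = 20 * 60

activity = {
    "Shared links": 1_000_000,
    "Tagged photos": 1_323_000,
    "Event invites sent out": 1_484_000,
    "Wall posts": 1_587_000,
    "Status updates": 1_851_000,
    "Friend requests accepted": 1_972_000,
    "Photos uploaded": 2_716_000,
    "Messages": 4_632_000,
    "Comments": 10_208_000,
}


def per_second(count, window=WINDOW_SECONDS):
    """Average number of events per second over the window."""
    return count / window


rates = {name: per_second(count) for name, count in activity.items()}

for name, rate in rates.items():
    print(f"{name:26s} {rate:8,.0f} per second")
```

Even as averages, these rates show why concurrency and data volume dominate the design discussion that follows.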
facebook in 20 Minutes

[Diagram: Direct Friendship; Friends of Friends]
What is facebook
                        • A social graph
                        • Friends, friends of friends, somewhere in the
                          network.
                        • Friends can comment, like, read your posts
                        • Friends of friends can just read
                        • Facebook messages – chat/email/SMS
                        • Near real-time updates
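The friend / friend-of-friend rules above can be sketched over a tiny in-memory graph. The graph data and the function names (`relationship`, `can_read`, `can_comment`) are hypothetical and only illustrate the rule on the slide; they say nothing about how Facebook stores or checks permissions.

```python
# Hypothetical toy graph: user id -> set of direct friends (symmetric).
friends = {
    1: {2, 3},
    2: {1, 4},
    3: {1},
    4: {2, 5},
    5: {4},
}


def relationship(viewer, owner):
    """Classify viewer relative to owner: self, friend, friend_of_friend or none."""
    if viewer == owner:
        return "self"
    direct = friends.get(owner, set())
    if viewer in direct:
        return "friend"
    # A friend of a friend shares at least one direct friend with the owner.
    if direct & friends.get(viewer, set()):
        return "friend_of_friend"
    return "none"


def can_read(viewer, owner):
    """Friends and friends of friends can read."""
    return relationship(viewer, owner) in {"self", "friend", "friend_of_friend"}


def can_comment(viewer, owner):
    """Only direct friends can comment or like."""
    return relationship(viewer, owner) in {"self", "friend"}
```

For example, user 4 is a friend of a friend of user 1 (through user 2), so `can_read(4, 1)` is true while `can_comment(4, 1)` is false.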
Technical Challenges

                        Challenges                        Ok to Live with
                        • High Concurrency                • Not Mission Critical
                        • High Data Volumes               • Cached data is fine
                        • Multilevel Hierarchical data    • Write Failures are tolerable
The Data – (Illustrational)
                        Everything is a hash lookup

                        User ID    Friends with    User Name    Age    Bio    Interests
                        1          2,3,4           XYZ          ..     ..     ..
                        2          1               ..           ..     ..     ..

                        Challenges                           Solutions
                        The Relational Nature of the data    No Constraints, No Joins in MySQL
                        Data Volumes                         Write Through cache implementation
                        Concurrency                          Hash Ring based architecture
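The slide names a "hash ring based architecture" without further detail. One common reading is consistent hashing, sketched below. The class `HashRing`, the node names, and the use of SHA-256 with 100 virtual nodes per server are assumptions made for this illustration, not details from the talk.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, replicas=100):
        ring = []
        for node in nodes:
            # Each node is placed on the ring many times to even out the load.
            for i in range(replicas):
                ring.append((self._hash(f"{node}#{i}"), node))
        ring.sort()
        self._ring = ring
        self._hashes = [h for h, _ in ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)

    def node_for(self, key):
        """Return the first node clockwise from the key's position on the ring."""
        idx = bisect.bisect(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
owner = ring.node_for("user:1")
```

The property that matters for a cache tier is that removing one node only remaps the keys that node owned; keys on the other nodes stay where they were.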
facebook – Data Partition initial thoughts
        • Horizontal partitioning based on
          Networks.
                        – Harvard
                        – Stanford
                        – Carnegie
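Partitioning by network can be sketched as a simple routing table. The shard names, the user-to-network mapping, and the helper functions are hypothetical; the slide only says that data was split horizontally by network.

```python
# Hypothetical mapping of network -> database shard name.
SHARD_BY_NETWORK = {
    "Harvard": "db-harvard",
    "Stanford": "db-stanford",
    "Carnegie": "db-carnegie",
}

# Hypothetical users, each belonging to exactly one network.
USER_NETWORK = {1: "Harvard", 2: "Stanford", 3: "Harvard", 4: "Carnegie"}


def shard_for_user(user_id):
    """Route a user to the shard that holds their whole network."""
    return SHARD_BY_NETWORK[USER_NETWORK[user_id]]


def users_on_shard(shard):
    """List the users stored on one shard."""
    return sorted(u for u in USER_NETWORK if shard_for_user(u) == shard)
```

With this layout a lookup inside one network touches a single shard, while a friendship between users 1 and 2 would span two shards.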
facebook – Photos – HayStack
        • Each file read required a minimum
          of 3 I/Os in a typical file system
        • CDNs: not a solution
        • Haystack is a customized storage
          system, which minimizes the
          amount of file metadata and
          involves only 1 I/O for each file
          read.
        • Haystack caches extensive data
          in its main memory
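The idea of keeping metadata in memory so that each photo read costs a single I/O can be sketched as follows. `NeedleStore` is a made-up name, and an in-memory `io.BytesIO` buffer stands in for one large on-disk volume file; the real Haystack format is not described in the slides.

```python
import io


class NeedleStore:
    """Toy append-only store: many photos in one file, index kept in memory."""

    def __init__(self):
        self._volume = io.BytesIO()  # stands in for one large on-disk file
        self._index = {}             # photo_id -> (offset, size), held in RAM

    def put(self, photo_id, data):
        """Append the photo to the volume and remember where it lives."""
        offset = self._volume.seek(0, io.SEEK_END)
        self._volume.write(data)
        self._index[photo_id] = (offset, len(data))

    def get(self, photo_id):
        """Look up the location in memory, then do one read against the volume."""
        offset, size = self._index[photo_id]
        self._volume.seek(offset)
        return self._volume.read(size)


store = NeedleStore()
store.put("p1", b"first photo bytes")
store.put("p2", b"second photo")
```

Because the offset and size come from the in-memory index, no directory or inode lookups are needed before the data read.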
facebook – HayStack

[Diagram: HayStack Interface, HayStack Cache and HayStack Directory above two groups of Logical Drives, each group backed by three PD units]

                           http://CDN/Cache/Machine id/(Logical volume, Photo)
Facebook – Serving the Photo - Haystack




Facebook – Scribe – Logging

[Diagram: three groups of Nodes, each with Scribe; Central Scribe Server; Dashboards; HBase]

                        // Build a batch with one log entry and send it to Scribe.
                        // $conn is assumed to be an already connected Scribe client.
                        $messages = array();
                        $entry = new LogEntry;
                        $entry->category = "buckettest";
                        $entry->message = "something very";
                        $messages[] = $entry;
                        $result = $conn->Log($messages);
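The node-to-central flow in the Scribe diagram can be sketched as a store-and-forward logger. This is not the Scribe API: `CentralLog` and `LocalLogger` are hypothetical stand-ins, and the behaviour of buffering locally while the central server is unreachable is an assumption of this sketch rather than something the slide states.

```python
from collections import defaultdict


class CentralLog:
    """Stand-in for the central server: groups messages by category."""

    def __init__(self):
        self.by_category = defaultdict(list)
        self.available = True

    def log(self, entries):
        if not self.available:
            return False
        for category, message in entries:
            self.by_category[category].append(message)
        return True


class LocalLogger:
    """Stand-in for the per-node logger: buffers locally, forwards in batches."""

    def __init__(self, central):
        self.central = central
        self.buffer = []

    def log(self, category, message):
        self.buffer.append((category, message))
        self.flush()

    def flush(self):
        # Entries stay buffered while the central server is unreachable.
        if self.buffer and self.central.log(self.buffer):
            self.buffer = []


central = CentralLog()
node = LocalLogger(central)
central.available = False
node.log("buckettest", "first")    # buffered, central is down
central.available = True
node.log("buckettest", "second")   # both entries are forwarded now
```

The application code never blocks on the central server, which fits the earlier point that write failures are tolerable.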
facebook – Services – Thrift
                        • Lightweight software framework for cross-language
                          development
                        • Developers need not worry about serialization,
                          connection handling and threading
                        • Supported bindings:
                           – C++, PHP, Python, Java, Ruby, Erlang, Perl,
                             Haskell
                        • Transports: simple interface to I/O
                        • Protocols: serialization format
                           – TBinaryProtocol, TJSONProtocol
                        • Servers
                           – Non-blocking, async, single-threaded,
                             multi-threaded
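The split between transports (how bytes move) and protocols (how values are serialized) can be shown without the Thrift library itself. Everything below is a hypothetical toy: `ToyTransport`, `ToyBinaryProtocol` and `ToyJsonProtocol` are not Thrift classes, and the wire formats are invented for the example.

```python
import io
import json
import struct


class ToyTransport:
    """Transport: moves raw bytes (here, into an in-memory buffer)."""

    def __init__(self):
        self._buf = io.BytesIO()

    def write(self, data):
        self._buf.write(data)

    def getvalue(self):
        return self._buf.getvalue()


class ToyBinaryProtocol:
    """Protocol: 4-byte signed id, 4-byte length, then the UTF-8 name."""

    def encode(self, user_id, name):
        raw = name.encode("utf-8")
        return struct.pack(">iI", user_id, len(raw)) + raw

    def decode(self, data):
        user_id, length = struct.unpack_from(">iI", data)
        return user_id, data[8:8 + length].decode("utf-8")


class ToyJsonProtocol:
    """Protocol: the same record encoded as JSON text."""

    def encode(self, user_id, name):
        return json.dumps({"id": user_id, "name": name}).encode("utf-8")

    def decode(self, data):
        obj = json.loads(data.decode("utf-8"))
        return obj["id"], obj["name"]


def send(transport, protocol, user_id, name):
    """The caller picks a protocol and a transport independently."""
    transport.write(protocol.encode(user_id, name))


results = {}
for protocol in (ToyBinaryProtocol(), ToyJsonProtocol()):
    transport = ToyTransport()
    send(transport, protocol, 7, "XYZ")
    results[type(protocol).__name__] = protocol.decode(transport.getvalue())
```

Swapping the protocol changes the bytes on the wire but not the calling code, which is the separation the bullet list describes.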
facebook – Memcache
                        • In-memory distributed hash table
                        • “hot” data from MySQL stored in cache
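A write-through cache in front of a database can be sketched with two dictionaries. `WriteThroughCache` is a hypothetical class: one dict stands in for MySQL and one for memcache, and the `store_reads` counter exists only to make cache hits visible.

```python
class WriteThroughCache:
    """Writes go to the store and the cache together; reads try the cache first."""

    def __init__(self, store):
        self.store = store    # stands in for MySQL
        self.cache = {}       # stands in for memcache
        self.store_reads = 0

    def put(self, key, value):
        self.store[key] = value   # persist first...
        self.cache[key] = value   # ...then update the cache in the same operation

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        self.store_reads += 1     # cache miss: fall back to the store
        value = self.store[key]
        self.cache[key] = value
        return value


db = {"user:2": "cold row"}
users = WriteThroughCache(db)
users.put("user:1", "XYZ")
```

A value written through `put` is served from the cache on the next read, while `user:2` costs one store read the first time and none afterwards.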

facebook – front end – PHP
                        • Opcode optimization
                        • APC improvements (Alternative PHP Cache)
                           – Lazy loading
                           – Cache priming
                        • Custom extensions
                           – Memcache client extension
                           – Serialization format
                           – Logging, stats collection, monitoring
                           – Asynchronous event-handling mechanism
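Lazy loading and cache priming are two ways of filling the same cache. The sketch below shows the general idea only; `CodeCache` and its fake "compile" step are hypothetical and do not reproduce how APC works internally.

```python
class CodeCache:
    """Toy cache of 'compiled' units, keyed by file name."""

    def __init__(self, compile_fn):
        self._compile = compile_fn
        self._cache = {}
        self.compilations = 0

    def _load(self, name):
        self.compilations += 1
        self._cache[name] = self._compile(name)

    def prime(self, names):
        """Cache priming: fill the cache up front, before the first request."""
        for name in names:
            if name not in self._cache:
                self._load(name)

    def get(self, name):
        """Lazy loading: compile a unit only when it is first needed."""
        if name not in self._cache:
            self._load(name)
        return self._cache[name]


cache = CodeCache(lambda name: f"opcodes({name})")
cache.prime(["index.php", "profile.php"])
```

Primed entries are served without further work, while anything outside the primed set is compiled once on first use.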
facebook – front end – Hip Hop
                        • Source code transformer
                        • Static analysis, type inference, code
                          generation
                        • Easier to write extensions
                        • Significantly cuts down on CPU and
                          memory usage
facebook – front end – Hip Hop



facebook – front end – BigPipe
                        BigPipe first breaks web pages into multiple chunks called pagelets

                        • Request Parsing: web server parses and sanity-checks the request
                        • Data Fetching: web server fetches data from storage tier
                        • Markup Generation: web server generates HTML markup
                        • Network Transport: response is transferred
                        • CSS downloading
                        • DOM tree construction
                        • JavaScript downloading
                        • JS execution
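The pagelet idea can be sketched as a generator that sends the page skeleton first and then streams each pagelet as soon as its data is ready. The pagelet names, the thread pool, and the client-side `fill` function referenced in the generated script tags are all hypothetical; the slides do not show BigPipe's actual wire format.

```python
import json
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical pagelets: name -> function that produces its HTML.
PAGELETS = {
    "newsfeed": lambda: "<ul><li>story</li></ul>",
    "chat": lambda: "<div>2 friends online</div>",
    "ads": lambda: "<div>sponsored</div>",
}


def render_page():
    """Yield the page skeleton first, then each pagelet as soon as it is ready."""
    placeholders = "".join(f'<div id="{name}"></div>' for name in PAGELETS)
    yield f"<html><body>{placeholders}"
    with ThreadPoolExecutor() as pool:
        futures = {pool.submit(fn): name for name, fn in PAGELETS.items()}
        for future in as_completed(futures):
            name = futures[future]
            # Each chunk is a script that fills in its placeholder on the client.
            yield f'<script>fill("{name}", {json.dumps(future.result())});</script>'
    yield "</body></html>"


chunks = list(render_page())
```

The browser can start downloading CSS and building the DOM from the first chunk while the server is still fetching data for the remaining pagelets, so the pipeline stages overlap instead of running strictly one after another.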
facebook – Technology Stack

       Front End
                             Big Pipe, Hip Hop
                             PHP – Custom compiler / Cache implementations
                             Linux – Custom Kernel Extensions

       Scribe, Thrift
                             Service Aggregators
                             Service 1, Service 2, Service 3, Service 4

       Data Store
                             MemCache – Write Through Cache implementation
                             Cassandra, MySQL, HBase, HayStack
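The service-aggregator layer in the stack can be sketched as a function that calls several backend services in parallel and tolerates individual failures. The three services, their return values, and the one-second timeout are hypothetical choices for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def profile_service(user_id):
    return {"name": "XYZ"}


def friends_service(user_id):
    return {"friends": [2, 3, 4]}


def ads_service(user_id):
    # Simulates a backend that is currently failing.
    raise TimeoutError("ads backend did not answer")


SERVICES = {"profile": profile_service, "friends": friends_service, "ads": ads_service}


def aggregate(user_id):
    """Call every service in parallel; a failed service yields None for its part."""
    with ThreadPoolExecutor(max_workers=len(SERVICES)) as pool:
        futures = {name: pool.submit(fn, user_id) for name, fn in SERVICES.items()}
        result = {}
        for name, future in futures.items():
            try:
                result[name] = future.result(timeout=1.0)
            except Exception:
                result[name] = None  # not mission critical: degrade gracefully
        return result


page_data = aggregate(1)
```

The page still gets its profile and friends data even though the ads service failed, which matches the "not mission critical" trade-off listed under the technical challenges.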
facebook – Messages Infrastructure




facebook - Messages



facebook – Cells

[Diagram: a Cell with an Application Server Cluster of Node 1, Node 2, Node 3, Node 4 ... Node n around the ZooKeeper Controller Machines; Metadata Store]
facebook – Cells
                        • They help scale incrementally while
                          limiting failure scenarios
                        • Easy upgrades
                        • Metadata store failures affect only a few
                          users
                        • Easy rollout
                        • Flexibility to host cells in different data
                          centers with multi-homing for disaster
                          recovery
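The way cells limit the blast radius of a failure can be sketched with a user-to-cell directory. The `Cell` class, the least-loaded placement rule, and the cell names are hypothetical; the slides do not describe how users are assigned to cells.

```python
class Cell:
    """A self-contained unit serving a fixed set of users."""

    def __init__(self, name):
        self.name = name
        self.healthy = True
        self.users = set()


cells = {name: Cell(name) for name in ("cell-1", "cell-2")}
directory = {}  # hypothetical user -> cell lookup


def assign(user_id):
    """Place a new user in the least-loaded cell and record it in the directory."""
    cell = min(cells.values(), key=lambda c: (len(c.users), c.name))
    cell.users.add(user_id)
    directory[user_id] = cell.name
    return cell.name


def affected_users():
    """Users whose cell is currently unhealthy."""
    return sorted(u for u, name in directory.items() if not cells[name].healthy)


for uid in range(1, 5):
    assign(uid)
cells["cell-3"] = Cell("cell-3")   # scale out: existing users stay where they are
assign(5)
cells["cell-1"].healthy = False    # simulate one cell failing
```

Adding `cell-3` does not move any existing user, and the failure of `cell-1` touches only the users placed there.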
Take away – for our applications
          • Truly parallel, asynchronous AJAX pages
              – ASP.NET UpdatePanels are a HOAX
          • Appropriate usage of client-side technology
          • Cache – Cache – Cache
              – Write-through caches are way better
              – AppFabric cache / Memcache
          • High normalization is not needed
              – Store denormalized views – materialized views
          • Parallel services and service aggregators
          • Fault-tolerant applications
          • Asynchronous processing
          • 1 sec response time is too SLOW
References
             •   http://facebook.com/engineering
             •   www.infoq.com
             •   www.highscalability.com
             •   www.stackoverflow.com
             •   www.thrift.org
Keep Learning

For suggestions on topics, feedback, etc., contact OpenTalk@aditi.com

Facebook Architecture - Breaking it Open
