SlideShare a Scribd company logo
1 of 18
Download to read offline
High Scalability by Example –
                                          How can Web Architecture scale like
                                          Facebook, Twitter?




                                          JAX 2011, 03.05.2011
                                          Robert Mederer, Markus Bukowski
    Copyright © 2011 Accenture All Rights Reserved. Accenture, its logo, and High Performance Delivered are trademarks of Accenture.




Speaker

                             Robert Mederer                                                                     Markus Bukowski
                             Lead Technology Architect                                                          Senior Software Engineer
                              Anni-Albers-Straße 11                                                              Kaistraße 20
                                                                                                                 40221 Düsseldorf
                              80807 München
                                                                                                                 Mobil:    +49-175-57-64217
                              Mobil:    +49-175-57-68012
                                                                                                                 markus.bukowski@accenture.com
                              robert.mederer@accenture.com




Experience
2000 - 2005: Technology Architect and Software                                      2006 - 2009: Software Engineer in several projects
Engineer in several projects                                                        (during studies) for mainly telecommunication
                                                                                    companies
2006: Technical Architecture Lead, Integration and
Execution Architecture for Location-Based Service                                   2009 - 2011: Senior Software Engineer in several
Provider                                                                            projects

2009: Technical Architecture Lead, Frontend and                                     2009/2010: Coach and Software Engineer for a
Execution Architecture Government Agency                                            Health Insurance Fund

2009/2010: Technical Architecture and front-office                                  2011: Technical Architecture Lead during the
integration build lead, Integration and Execution                                   development of a Document Management Solution
Architecture Financial Services Agency                                              for a Government Agency

2011: Architect and Performance Engineer for
Location Based Services Platform
Copyright © 2010 Accenture. All rights reserved.                                                                                                 2
Accenture
High performance achieved

Company Profile                                    Worldwide Revenues $21.6 billion
•  Global management consulting,                   (in US$ billion, as of August 31, 2010)
   technology services and outsourcing                                          Communications &
   company                                            Resources                 High Tech
•  215.000 employees
•  Rank 47 among the                                              3.88
                                                                                4.83
   “Best Global Brands 2008”
•  Top 100 Employer
                                                               4.32
•  28 of the DAX-30-Companies                                                      2.98
•  96 of the Fortune-Global-100                                                             Public
•  More than three-quarters of the Fortune-        Financial             5.53               Service
                                                   Services
   Global-500
•  87 of our Top 100-clients have been with                           Products
   us for 10 or more years



Copyright © 2010 Accenture. All rights reserved.                                                      3




Local Accenture … ???

Geographic unit
•  Austria
•  Switzerland
•  Germany
Employees                                                                         Berlin
•  Ca. 6000                                                       Düsseldorf
•  We are hiring!
Exciting Technology work                                          Frankfurt
•  Large scale projects
   (100+ people / multiple years)
                                                                                Munich
•  Most challenging requirements
   –  Stock Exchange / Banking / Trading Systems                                           Vienna
   –  AEMS Mobility Platform
   –  Large Scale Web Applications                                Zurich
      (> 1M page views / day)
   –  Batch Architectures


Copyright © 2010 Accenture. All rights reserved.                                                      4
Agenda


•  Introduction
   – Scalability & CAP Theorem
   – Traditional websites & Social media websites
   – High Scalability in Numbers
•  Accenture – High Scalability by Example
•  Architecture at Internet Scale
•  Common concepts of Scalability
•  Conclusion




Copyright © 2011 Accenture. All rights reserved.                                             5




Scalability



 A system’s capacity to uphold the same performance
               under heavier volumes.

(Patterns for Performance and Operability: Building and Testing Enterprise Software, Chris
                                  Ford et. al., 2008)




Copyright © 2011 Accenture. All rights reserved.                                             6
Vertical Scalability


•  Is achieved by increasing the capacity of a
   single node
   – CPU,
   – Memory,
   – Bandwidth, …
•  Simple Process
   – Application is generally not affected by
     those changes

•  Classical Example are Super Computers
   like
   – HP Integrity Superdome
   – IBM Mainframe
                                                                       Source: Hewlett-Packard




Copyright © 2011 Accenture. All rights reserved.                                                      7




Horizontal Scalability


•  Application is spread on a cluster with
   several nodes
•  Nodes can be added to scale out
•  Produces overhead
   – Keep cluster consistent
   – Node error detection and handling
   – Communication between nodes
•  May be used to increase reliability and
   availability

•  Distributed Systems and Programs like
   – SETI@Home                                     Source: Space Sciences Laboratory, U.C. Berkeley


   – World Wide Web
   – Domain Name Service

Copyright © 2011 Accenture. All rights reserved.                                                      8
CAP Theorem (Brewer‘s Theorem)


                                                                             •  Consistency – all clients see the
                                                                                same data at the same time
                           Consistency                                       •  Availability – all clients can find all
                                                                                data even in presence of failure
                                                                             •  Partition Tolerance – system works
                                                                                even when one node failed


   Partition
                                                     Availability
   Tolerance




                                                    Impossible
Source: PODC-keynote, Towards Robust Distributed Systems, Dr. Eric A. Brewer, 2000
Copyright © 2011 Accenture. All rights reserved.                                                                           9




CAP Theorem

Normally, two of these properties for any shared-data system
                  Consistency + Availability
    C             •  High data integrity
 P      A         •  Single site, cluster database, LDAP, etc.
                  •  2-phase commit, data replication, etc.


          C                       Consistency + Partition
                                  •  Distributed database, distributed locking, etc.
   P             A                •  Pessimistic locking, etc.

                                  Availability + Partition
          C                       •  High scalability
   P             A                •  Distributed cache, DNS, etc.
                                  •  Optimistic locking, expiration/leases (timeout), etc.

Source: “Architecting Cloudy Applications”, David Chou
Copyright © 2011 Accenture. All rights reserved.                                                                          10
Agenda


•  Introduction
   – Scalability & CAP Theorem
   – Traditional websites & Social media websites
   – High Scalability in Numbers
•  Accenture – High Scalability by Example
•  Architecture at Internet Scale
•  Common concepts of Scalability
•  Conclusion




Copyright © 2011 Accenture. All rights reserved.                                 11




Traditional Websites




(Scaling the Social Graph: Infrastructure at Facebook, Jason Sobel, QCon 2011)

Copyright © 2011 Accenture. All rights reserved.                                 12
Social Media Websites




(Scaling the Social Graph: Infrastructure at Facebook, Jason Sobel, QCon 2011)

Copyright © 2011 Accenture. All rights reserved.                                 13




Agenda


•  Introduction
   – Scalability & CAP Theorem
   – Traditional websites & Social media websites
   – High Scalability in Numbers
•  Accenture – High Scalability by Example
•  Architecture at Internet Scale
•  Common concepts of Scalability
•  Conclusion




Copyright © 2011 Accenture. All rights reserved.                                 14
High Scalability in Numbers

The World Is Obsessed With Facebook (Video)




Source: http://www.youtube.com/watch?v=xJXOavGwAW8
Copyright © 2011 Accenture. All rights reserved.                                                            15




High Scalability in Numbers

 facebook                                                                        twitter

 Active users                                                500.000.000 New accounts per day        500.000



 Active users logging in                                     250.000.000 Tweets per day          140.000.000
 daily                                                                   (a year ago)            (50.000.000)


 Active mobile users                                         250.000.000 Max tweets per second         6.939



 In 20 minutes …


 Links shared                                                   1.000.000

 Status updates                                                 1.851.000


 Photos uploaded                                                2.716.000

Source: http://www.facebook.com/press/info.php?statistics, http://www.allfacebook.com/
 Comments made                                                10.208.000                                    16
Copyright © 2011 Accenture. All rights reserved.
High Scalability in Numbers

 LinkedIn
 Members                                                                                          >100.000.000
 Connections between Members                                                                      1.300.000.000


 Visitors per month                                                                                128.000.000
 % of global internet users who visit LinkedIn daily                                                        4%




Source: http://blog.linkedin.com, http://press.linkedin.com/about/, LinkedIn Demographics 20111

Copyright © 2011 Accenture. All rights reserved.                                                              17




Agenda


•  Introduction
•  Accenture – High Scalability by Example
   – The Royal Wedding Website
   – US Based Professional Sport League .com Website
•  Accenture – High Scalability by Example
•  Architecture at Internet Scale
•  Common concepts of Scalability
•  Conclusion




Copyright © 2011 Accenture. All rights reserved.                                                              18
Agenda


•  Introduction
•  Accenture – High Scalability by Example
•  Architecture at Internet Scale
   – Facebook
   – Twitter
   – LinkedIn
   – Challenges
•  Common concepts of Scalability
•  Conclusion


Copyright © 2011 Accenture. All rights reserved.              19




Facebook – Key Facts


•  Biggest Social Network

•  People are linked together
•  People share and exchange information in real-time

•  Facebook provides a
   – A personalized News Feed containing news about friends
   – Picture Service including people tagging / linking
   – Messaging Service to stay in connect with friends
   – Platform to play Games with friends




Copyright © 2011 Accenture. All rights reserved.              20
Facebook – Architecture (1)


•  Facebook „main“ architecture is based                                                                  Memcache
   on a LAMP stack
•  The application logic resides in Web
   Server layer (PHP based)
•  The application layer takes care of the
   data distribution

•  Some services do not fit (C++, Java)
   – Search                                                                                                  MySQL
   – Ads
   – People You May Know
                                                                              Invalidation logic is implemented in
   – Multifeed
                                                                              the application!



Copyright © 2011 Accenture. All rights reserved.                                                                     21




Facebook – Architecture (2)

Multifeed / Newsfeed
•  Distributed System

•  Every user gets
   – 45 best rated updates
   –  on every reload

•  Every DB-Update results in
   a Scribe Notification
   –  Leaf Nodes / Server are
     notified



Source: “High Performance at Massive Scale – Lessons learned at Facebook”, Jeff Rothschild, 2009
Inspired By: “Architecting Cloudy Applications”, David Chou, 2010
Copyright © 2011 Accenture. All rights reserved.                                                                     22
Agenda


•  Introduction
•  Accenture – High Scalability by Example
•  Architecture at Internet Scale
   – Facebook
   – Twitter
   – LinkedIn
   – Challenges
•  Common concepts of Scalability
•  Conclusion


Copyright © 2011 Accenture. All rights reserved.   23




Twitter – Key Facts


•  Twitter is a real-time information network

•  Twitter‘s key characteristics
   – Tweets are used for sharing information
   – Timeline of tweets must be kept
   – A Social Graph is maintained to deliver
     information




Copyright © 2011 Accenture. All rights reserved.   24
Twitter - Architecture


                                                          Load Balancers
Web
Frontend
                                                         Apache mod_proxy

Apps &                                                                                 Distributed
                                                          Rails (Unicorn)              Cache
Services

                                        Flock                memcached       Kestrel   Queues
Partitioned
Data                                                                                   Distributed
                                                     MySQL           Cassandra
                                                                                       Storage

                                                             Daemons
Source: “Architecting Cloudy Applications”, David Chou
Copyright © 2011 Accenture. All rights reserved.                                                 25




Agenda


•  Introduction
•  Accenture – High Scalability by Example
•  Architecture at Internet Scale
   – Facebook
   – Twitter
   – LinkedIn
   – Challenges
•  Common concepts of Scalability
•  Conclusion


Copyright © 2011 Accenture. All rights reserved.                                                 26
LinkedIn – Key Facts


 •  A social media networking platform to find
    connections to recommended job candidates,
    industry experts and business partners
 •  99% pure Java

 LinkedIn – Applications
 •  Groups
 •  Job market
 •  Mailing
 •  Profiles (Friends and Companies)
 •  Mobile Applications (Android, iPhone,
    Blackberry, Palm)




 Copyright © 2011 Accenture. All rights reserved.                                                                                         27




 LinkedIn - Architecture

Web Frontend
                                                                                                  LinkedIn
                                                                                                  Spring Search
                                                                                                  Grails MVC
                                                                                                  Queuing Spring
                                                                                                  Caching / Messaging
                                                                                                  DWR
       Spring MVC                                   Grails                          DWR              Spring Framework (IoC, Web
                                                                                                       Lucene Searchprocessing AOP,
                                                                                                       Model-View-Controller DI, for
                                                                                                       Asynchronous Engine
                                                                                                  •  Direct Web Remoting (AJAX for
                                                                                                       Voldemort
                                                                                                       Open-source web application
                                                                                                       Frameworktriggered writes
                                                                                                  •  Java Library) Concerns,..) by
                                                                                                     Separation of open "coding
                                                                                                       Aims to bring the source in
                                                                                                          •  Apache combines high-
                                                                                                          •  Voldemort
                                                                                                                user
Apps & Services                                                                                   •  RPC library toparadigm to the text
                                                                                                       Used performance, full-featured
                                                                                                     LinkedIn extensions with Groovy
                                                                                                                memory caching
                                                                                                                for
                                                                                                       convention" call Java functions
                                                                                                                search engine library written
                                                                                                                storage system so that a
      Service        Service                 LinkedIn Spring              Service    Service         from JavaScript and (lin:bean,
                                                                                                          •  XML Vocabulary to call
                                                                                                  StorageentirelyLoad of Page not
                                                                                                                Initial
                                                                                                  •  Why Grails? incaching tier is
                                                                                                                  Valdemort
                                                                                                                  Hadoop
                                                                                                                separate Java
                                                                                                  Storagerequired a more from Javaweb
                                                                                                                  Sensei
                                                                                                     JavaScript functions productive
                                                                                                          •  lin:component, …)
                                                                                                                Needed software
                                                                                                       Hadoop isOperations framework
                                                                                                                Pull a
                                                                                                  •  ZoieProperty Editors a major
                                                                                                       Valdemort used as
                                                                                                          •  Forframework
                                                                                                  •  Used supports data-intensive
                                                                                                       distributed database
                                                                                                       distributed key-value data cache
                Lucene
    Zoie                         Bobo        Caching         Voldemort    ehcache      Memcache   •  that realtime indexing/search
                                                                                                       ehcache
                                                                                                          •  …
                                                                                                          •  New kind of application context
              Search Index                                                                                •  Asynchronous Communication
                                                                                                       for• the social graph to (map/
                                                                                                       horizontallyprototyping
                                                                                                                Rapid
                                                                                                  •  distributed applications source,
                                                                                                                system scaledopen
                                                                                                          •  for life-cycle management with
                                                                                                                Ehcache is an
                                                                                                          •  Push Operations with and
                                                                                                                demonstrate feasibility
                                                                                                  •  reduce /Consistency cachehigh
                                                                                                                standards-based
                                                                                                       relaxed distributed file system)  used
                                                                         Queuing / Messaging
                                                                                                       Bobo productivity gain
                                                                                                       Advantage:
                                                                                                             components
                                                                                                       durabilityis usedwith engine offload
                                                                                                          •  …
                                                                                                                to boost performance,
                                                                                                                     guarantees majorjava/
                                                                                                  •  Hadoop database and simplify the
                                                                                                                Integration as a
                                                                                                                faceted search existing for
                                                                              ActiveMQ                          the concurrent read
                                                                                                                Fast
                                                                                                                Socialkey-value data
                                                                                                                        Graph
                                                                                                       distributed index building store
                                                                                                          •  spring-based solutions
                                                                                                                scalability
                                                                                                                Fast
                                                                                                          •  handles semi-structured and
                                                                                                  •  for the social graph
                                                                                                       Memcache structured data
Storage
                                                                                                  •  Bottleneck:
                                                                                                          •  Distributed Cache
               MySQL                    Sensei                  Hadoop              Voldemort             •  Read data
                                                                                                          •  Index building

• Keep core data ACID
• Keep replicated and cached data BASE
• Cache data on a cheap memory (memcached)
• Use a hint to route the client to his / her’s ACID data
 Source: Project Voldemort, Jay Kreps            Source: http://sna-projects.com
 Copyright © 2011 Accenture. All rights reserved.                                                                                         28
Agenda


•  Introduction
•  Accenture – High Scalability by Example
•  Architecture at Internet Scale
•  Common concepts of Scalability
•  Conclusion




Copyright © 2011 Accenture. All rights reserved.                                                 29




Common concepts of Scalability




                                                         parallelization

              asynchronous                                                       idempotent
                                                         7 Habits of              operations
                                                            Good
           partitioned data                              Distributed           fault-tolerance
                                                          Systems

                 optimistic concurrency                                    shared nothing

Source: “Architecting Cloudy Applications”, David Chou
Source: highscalability.com
Copyright © 2011 Accenture. All rights reserved.                                                 30
Common concepts of Scalability
    Hybrid architectures


• Scale-out (horizontal)                           • Scale-up (vertical)
  – BASE                                             – ACID
  – focus on “commit”                                – availability first
  – optimistic locking                               – pessimistic locking
  – shared nothing                                   – transactional
  – maximize scalability                             – favor accuracy/consistency


                                 Most distributed systems realize both




Copyright © 2011 Accenture. All rights reserved.                                    31




Agenda


•  Introduction
•  Accenture – High Scalability by Example
•  Architecture at Internet Scale
•  Common concepts of Scalability
•  Conclusion
   –  Questionnaire
   –  Déjà vu - Common concepts
   –  Don’t forget the Infrastructure




Copyright © 2011 Accenture. All rights reserved.                                    32
Conclusion
                 Questionnaire


•  Is there a need to scale my application?
   –  Vertical scaling is more easy to achieve
   –  Use horizontal scaling only when required

•  Is there a plan to proof your designed solution?
   –  Plan to do a lot of realistic Proof-of-Concepts

•  Is there a one size fits all solution?
   –  NO!

•  How important is ACID?
   – Is BASE enough?
   – Can a NoSQL solution be used?
Copyright © 2011 Accenture. All rights reserved.             33




Conclusion
      Déjà vu - Common concepts


•  Asynchronous Processing
   – Keep Facebook, Twitter and LinkedIn in mind
   –  Asynchronous writes and even UI

•  Data Partioining
   –  You must understand your data
   –  Is a hybrid solution as used by LinkedIn applicable?

•  Design To Failure
   –  Lessons learned from Amazon‘s cloud solution outage




Copyright © 2011 Accenture. All rights reserved.             34
Conclusion
      Don’t forget the Infrastructure


•  Disk IO compared to RAM is slow
      Avoid „slow“ IO access whenever possible
      Caching, Caching, Caching
      Bulk Updates (Batch) over small updates

•  Network Latency
      Keep in mind that there is a network
      Where are the users located?
      How are your users connected? mobile application?
      Is there a need for a datacenter distribution?

•  Hardware configuration
   –  Is the CPU Power Saving working right?
Copyright © 2011 Accenture. All rights reserved.          35




Questions ?




Copyright © 2011 Accenture. All rights reserved.          36

More Related Content

What's hot

DevOps and the DBA- 24 Hours of Pass
DevOps and the DBA-  24 Hours of PassDevOps and the DBA-  24 Hours of Pass
DevOps and the DBA- 24 Hours of PassKellyn Pot'Vin-Gorman
 
Azure Storage Revisited
Azure Storage RevisitedAzure Storage Revisited
Azure Storage RevisitedJoel Cochran
 
LinkedIn Data Infrastructure Slides (Version 2)
LinkedIn Data Infrastructure Slides (Version 2)LinkedIn Data Infrastructure Slides (Version 2)
LinkedIn Data Infrastructure Slides (Version 2)Sid Anand
 
Moving to Web 2.0 - Best Practices for Business and Application Migration
Moving to Web 2.0 - Best Practices for Business and Application MigrationMoving to Web 2.0 - Best Practices for Business and Application Migration
Moving to Web 2.0 - Best Practices for Business and Application Migrationanilmadugula
 
Randy Shoup eBays Architectural Principles
Randy Shoup eBays Architectural PrinciplesRandy Shoup eBays Architectural Principles
Randy Shoup eBays Architectural Principlesdeimos
 
Planning your Migration for SharePoint 2010
Planning your Migration for SharePoint 2010Planning your Migration for SharePoint 2010
Planning your Migration for SharePoint 2010cScape
 
Rethink your architecture - Marten Deinum
Rethink your architecture - Marten DeinumRethink your architecture - Marten Deinum
Rethink your architecture - Marten DeinumNLJUG
 
JBoye Presentation: WCM Trends for 2010
JBoye Presentation: WCM Trends for 2010JBoye Presentation: WCM Trends for 2010
JBoye Presentation: WCM Trends for 2010David Nuescheler
 
Effective admin and development in iib
Effective admin and development in iibEffective admin and development in iib
Effective admin and development in iibm16k
 
Responsive Web Design ~ Best Practices for Maximizing ROI
Responsive Web Design ~ Best Practices for Maximizing ROIResponsive Web Design ~ Best Practices for Maximizing ROI
Responsive Web Design ~ Best Practices for Maximizing ROIJuan Carlos Duron
 
The Top 10 Things Oracle UCM Users Need To Know About WebLogic
The Top 10 Things Oracle UCM Users Need To Know About WebLogicThe Top 10 Things Oracle UCM Users Need To Know About WebLogic
The Top 10 Things Oracle UCM Users Need To Know About WebLogicBrian Huff
 
Build highly scalable_low_latency_applications
Build highly scalable_low_latency_applicationsBuild highly scalable_low_latency_applications
Build highly scalable_low_latency_applicationsShivnarayan Varma
 
windows server 2012 R2
windows server 2012 R2windows server 2012 R2
windows server 2012 R2Gol D Roger
 
Successful Migration to SharePoint 2013 - Planning Considerations & Migration...
Successful Migration to SharePoint 2013 - Planning Considerations & Migration...Successful Migration to SharePoint 2013 - Planning Considerations & Migration...
Successful Migration to SharePoint 2013 - Planning Considerations & Migration...Roberto Vazquez Delgado
 
Databases & Microsoft SQL Server
Databases & Microsoft SQL ServerDatabases & Microsoft SQL Server
Databases & Microsoft SQL ServerMahmoud Abdallah
 
Active/Active Database Solutions with Log Based Replication in xDB 6.0
Active/Active Database Solutions with Log Based Replication in xDB 6.0Active/Active Database Solutions with Log Based Replication in xDB 6.0
Active/Active Database Solutions with Log Based Replication in xDB 6.0EDB
 
Developing modular, polyglot applications with Spring (SpringOne India 2012)
Developing modular, polyglot applications with Spring (SpringOne India 2012)Developing modular, polyglot applications with Spring (SpringOne India 2012)
Developing modular, polyglot applications with Spring (SpringOne India 2012)Chris Richardson
 

What's hot (20)

DevOps and the DBA- 24 Hours of Pass
DevOps and the DBA-  24 Hours of PassDevOps and the DBA-  24 Hours of Pass
DevOps and the DBA- 24 Hours of Pass
 
Azure Storage Revisited
Azure Storage RevisitedAzure Storage Revisited
Azure Storage Revisited
 
LinkedIn Data Infrastructure Slides (Version 2)
LinkedIn Data Infrastructure Slides (Version 2)LinkedIn Data Infrastructure Slides (Version 2)
LinkedIn Data Infrastructure Slides (Version 2)
 
How to prepare for your SharePoint upgrade
How to prepare for your SharePoint upgradeHow to prepare for your SharePoint upgrade
How to prepare for your SharePoint upgrade
 
Moving to Web 2.0 - Best Practices for Business and Application Migration
Moving to Web 2.0 - Best Practices for Business and Application MigrationMoving to Web 2.0 - Best Practices for Business and Application Migration
Moving to Web 2.0 - Best Practices for Business and Application Migration
 
Randy Shoup eBays Architectural Principles
Randy Shoup eBays Architectural PrinciplesRandy Shoup eBays Architectural Principles
Randy Shoup eBays Architectural Principles
 
Planning your Migration for SharePoint 2010
Planning your Migration for SharePoint 2010Planning your Migration for SharePoint 2010
Planning your Migration for SharePoint 2010
 
DevOps and DBA- Delphix
DevOps and DBA-  DelphixDevOps and DBA-  Delphix
DevOps and DBA- Delphix
 
Rethink your architecture - Marten Deinum
Rethink your architecture - Marten DeinumRethink your architecture - Marten Deinum
Rethink your architecture - Marten Deinum
 
JBoye Presentation: WCM Trends for 2010
JBoye Presentation: WCM Trends for 2010JBoye Presentation: WCM Trends for 2010
JBoye Presentation: WCM Trends for 2010
 
Effective admin and development in iib
Effective admin and development in iibEffective admin and development in iib
Effective admin and development in iib
 
Responsive Web Design ~ Best Practices for Maximizing ROI
Responsive Web Design ~ Best Practices for Maximizing ROIResponsive Web Design ~ Best Practices for Maximizing ROI
Responsive Web Design ~ Best Practices for Maximizing ROI
 
The Top 10 Things Oracle UCM Users Need To Know About WebLogic
The Top 10 Things Oracle UCM Users Need To Know About WebLogicThe Top 10 Things Oracle UCM Users Need To Know About WebLogic
The Top 10 Things Oracle UCM Users Need To Know About WebLogic
 
DevOps for the DBA- Jax Style!
DevOps for the DBA-  Jax Style!DevOps for the DBA-  Jax Style!
DevOps for the DBA- Jax Style!
 
Build highly scalable_low_latency_applications
Build highly scalable_low_latency_applicationsBuild highly scalable_low_latency_applications
Build highly scalable_low_latency_applications
 
windows server 2012 R2
windows server 2012 R2windows server 2012 R2
windows server 2012 R2
 
Successful Migration to SharePoint 2013 - Planning Considerations & Migration...
Successful Migration to SharePoint 2013 - Planning Considerations & Migration...Successful Migration to SharePoint 2013 - Planning Considerations & Migration...
Successful Migration to SharePoint 2013 - Planning Considerations & Migration...
 
Databases & Microsoft SQL Server
Databases & Microsoft SQL ServerDatabases & Microsoft SQL Server
Databases & Microsoft SQL Server
 
Active/Active Database Solutions with Log Based Replication in xDB 6.0
Active/Active Database Solutions with Log Based Replication in xDB 6.0Active/Active Database Solutions with Log Based Replication in xDB 6.0
Active/Active Database Solutions with Log Based Replication in xDB 6.0
 
Developing modular, polyglot applications with Spring (SpringOne India 2012)
Developing modular, polyglot applications with Spring (SpringOne India 2012)Developing modular, polyglot applications with Spring (SpringOne India 2012)
Developing modular, polyglot applications with Spring (SpringOne India 2012)
 

Similar to High Scalability by Example – How can Web-Architecture scale like Facebook, Twitter, YouTube, etc.

Things you should know about Scalability!
Things you should know about Scalability!Things you should know about Scalability!
Things you should know about Scalability!Robert Mederer
 
Portfolio Sameer
Portfolio SameerPortfolio Sameer
Portfolio Sameersamahmedksa
 
Career Portfolio Sameer Ahmed
Career Portfolio Sameer AhmedCareer Portfolio Sameer Ahmed
Career Portfolio Sameer Ahmedsamahmedksa
 
Destination Marketing Open Source and Cloud Presentation
Destination Marketing Open Source and Cloud PresentationDestination Marketing Open Source and Cloud Presentation
Destination Marketing Open Source and Cloud PresentationIsaac Christoffersen
 
Telecom Clouds crossing borders, Chet Golding, Zefflin Systems
Telecom Clouds crossing borders, Chet Golding, Zefflin SystemsTelecom Clouds crossing borders, Chet Golding, Zefflin Systems
Telecom Clouds crossing borders, Chet Golding, Zefflin SystemsSriram Subramanian
 
OCSL - Migrating to a Virtualised Modern Desktop June 2013
OCSL - Migrating to a Virtualised Modern Desktop June 2013OCSL - Migrating to a Virtualised Modern Desktop June 2013
OCSL - Migrating to a Virtualised Modern Desktop June 2013OCSL
 
CWIN17 london becoming cloud native part 2 - guy martin docker
CWIN17 london   becoming cloud native part 2 - guy martin dockerCWIN17 london   becoming cloud native part 2 - guy martin docker
CWIN17 london becoming cloud native part 2 - guy martin dockerCapgemini
 
Akshay guleria cloud_dev_ops_infra
Akshay guleria cloud_dev_ops_infraAkshay guleria cloud_dev_ops_infra
Akshay guleria cloud_dev_ops_infraAkshay Guleria
 
CIS Infrastructure Group Don Mori Profile 2016
CIS Infrastructure Group Don Mori Profile 2016CIS Infrastructure Group Don Mori Profile 2016
CIS Infrastructure Group Don Mori Profile 2016Don Mori
 
AWS Webcast - Datacenter Migration to AWS
AWS Webcast - Datacenter Migration to AWSAWS Webcast - Datacenter Migration to AWS
AWS Webcast - Datacenter Migration to AWSAmazon Web Services
 
Startups: Streit, Scaleup - introduction and product demo
Startups: Streit, Scaleup - introduction and product demoStartups: Streit, Scaleup - introduction and product demo
Startups: Streit, Scaleup - introduction and product demoCloudOps Summit
 
Sybrant Technologies Company Presentation
Sybrant Technologies Company PresentationSybrant Technologies Company Presentation
Sybrant Technologies Company Presentationmanimsquare
 
DECIDE H2020 Brochure
DECIDE H2020 BrochureDECIDE H2020 Brochure
DECIDE H2020 BrochureDECIDEH2020
 
Resume_Achhar_Kalia
Resume_Achhar_KaliaResume_Achhar_Kalia
Resume_Achhar_KaliaAchhar Kalia
 
Oracle - Soluções do device ao Datacenter
Oracle - Soluções do device ao DatacenterOracle - Soluções do device ao Datacenter
Oracle - Soluções do device ao DatacenterGeneXus
 

Similar to High Scalability by Example – How can Web-Architecture scale like Facebook, Twitter, YouTube, etc. (20)

Things you should know about Scalability!
Things you should know about Scalability!Things you should know about Scalability!
Things you should know about Scalability!
 
Jobs in the Cloud
 Jobs in the Cloud Jobs in the Cloud
Jobs in the Cloud
 
Portfolio Sameer
Portfolio SameerPortfolio Sameer
Portfolio Sameer
 
Agcv
AgcvAgcv
Agcv
 
Career Portfolio Sameer Ahmed
Career Portfolio Sameer AhmedCareer Portfolio Sameer Ahmed
Career Portfolio Sameer Ahmed
 
Destination Marketing Open Source and Cloud Presentation
Destination Marketing Open Source and Cloud PresentationDestination Marketing Open Source and Cloud Presentation
Destination Marketing Open Source and Cloud Presentation
 
Telecom Clouds crossing borders, Chet Golding, Zefflin Systems
Telecom Clouds crossing borders, Chet Golding, Zefflin SystemsTelecom Clouds crossing borders, Chet Golding, Zefflin Systems
Telecom Clouds crossing borders, Chet Golding, Zefflin Systems
 
OCSL - Migrating to a Virtualised Modern Desktop June 2013
OCSL - Migrating to a Virtualised Modern Desktop June 2013OCSL - Migrating to a Virtualised Modern Desktop June 2013
OCSL - Migrating to a Virtualised Modern Desktop June 2013
 
CWIN17 london becoming cloud native part 2 - guy martin docker
CWIN17 london   becoming cloud native part 2 - guy martin dockerCWIN17 london   becoming cloud native part 2 - guy martin docker
CWIN17 london becoming cloud native part 2 - guy martin docker
 
Akshay guleria cloud_dev_ops_infra
Akshay guleria cloud_dev_ops_infraAkshay guleria cloud_dev_ops_infra
Akshay guleria cloud_dev_ops_infra
 
CIS Infrastructure Group Don Mori Profile 2016
CIS Infrastructure Group Don Mori Profile 2016CIS Infrastructure Group Don Mori Profile 2016
CIS Infrastructure Group Don Mori Profile 2016
 
AWS Webcast - Datacenter Migration to AWS
AWS Webcast - Datacenter Migration to AWSAWS Webcast - Datacenter Migration to AWS
AWS Webcast - Datacenter Migration to AWS
 
Startups: Streit, Scaleup - introduction and product demo
Startups: Streit, Scaleup - introduction and product demoStartups: Streit, Scaleup - introduction and product demo
Startups: Streit, Scaleup - introduction and product demo
 
I Tech Art
I Tech ArtI Tech Art
I Tech Art
 
Sybrant Technologies Company Presentation
Sybrant Technologies Company PresentationSybrant Technologies Company Presentation
Sybrant Technologies Company Presentation
 
DECIDE H2020 Brochure
DECIDE H2020 BrochureDECIDE H2020 Brochure
DECIDE H2020 Brochure
 
Chiranjib resume
Chiranjib resumeChiranjib resume
Chiranjib resume
 
Resume_Achhar_Kalia
Resume_Achhar_KaliaResume_Achhar_Kalia
Resume_Achhar_Kalia
 
Oracle - Soluções do device ao Datacenter
Oracle - Soluções do device ao DatacenterOracle - Soluções do device ao Datacenter
Oracle - Soluções do device ao Datacenter
 
Views from the Unified Communications Summit
Views from the Unified Communications SummitViews from the Unified Communications Summit
Views from the Unified Communications Summit
 

Recently uploaded

TrustArc Webinar - How to Live in a Post Third-Party Cookie World
TrustArc Webinar - How to Live in a Post Third-Party Cookie WorldTrustArc Webinar - How to Live in a Post Third-Party Cookie World
TrustArc Webinar - How to Live in a Post Third-Party Cookie WorldTrustArc
 
Webinar: The Art of Prioritizing Your Product Roadmap by AWS Sr PM - Tech
Webinar: The Art of Prioritizing Your Product Roadmap by AWS Sr PM - TechWebinar: The Art of Prioritizing Your Product Roadmap by AWS Sr PM - Tech
Webinar: The Art of Prioritizing Your Product Roadmap by AWS Sr PM - TechProduct School
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
 
March Patch Tuesday
March Patch TuesdayMarch Patch Tuesday
March Patch TuesdayIvanti
 
Stobox 4: Revolutionizing Investment in Real-World Assets Through Tokenization
Stobox 4: Revolutionizing Investment in Real-World Assets Through TokenizationStobox 4: Revolutionizing Investment in Real-World Assets Through Tokenization
Stobox 4: Revolutionizing Investment in Real-World Assets Through TokenizationStobox
 
Novo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4jNovo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4jNeo4j
 
Where developers are challenged, what developers want and where DevEx is going
Where developers are challenged, what developers want and where DevEx is goingWhere developers are challenged, what developers want and where DevEx is going
Where developers are challenged, what developers want and where DevEx is goingFrancesco Corti
 
UiPath Studio Web workshop series - Day 1
UiPath Studio Web workshop series  - Day 1UiPath Studio Web workshop series  - Day 1
UiPath Studio Web workshop series - Day 1DianaGray10
 
Q4 2023 Quarterly Investor Presentation - FINAL - v1.pdf
Q4 2023 Quarterly Investor Presentation - FINAL - v1.pdfQ4 2023 Quarterly Investor Presentation - FINAL - v1.pdf
Q4 2023 Quarterly Investor Presentation - FINAL - v1.pdfTejal81
 
SIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENT
SIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENTSIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENT
SIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENTxtailishbaloch
 
2024.03.12 Cost drivers of cultivated meat production.pdf
2024.03.12 Cost drivers of cultivated meat production.pdf2024.03.12 Cost drivers of cultivated meat production.pdf
2024.03.12 Cost drivers of cultivated meat production.pdfThe Good Food Institute
 
IT Service Management (ITSM) Best Practices for Advanced Computing
IT Service Management (ITSM) Best Practices for Advanced ComputingIT Service Management (ITSM) Best Practices for Advanced Computing
IT Service Management (ITSM) Best Practices for Advanced ComputingMAGNIntelligence
 
From the origin to the future of Open Source model and business
From the origin to the future of  Open Source model and businessFrom the origin to the future of  Open Source model and business
From the origin to the future of Open Source model and businessFrancesco Corti
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightSafe Software
 
Top 10 Squarespace Development Companies
Top 10 Squarespace Development CompaniesTop 10 Squarespace Development Companies
Top 10 Squarespace Development CompaniesTopCSSGallery
 
Introduction - IPLOOK NETWORKS CO., LTD.
Introduction - IPLOOK NETWORKS CO., LTD.Introduction - IPLOOK NETWORKS CO., LTD.
Introduction - IPLOOK NETWORKS CO., LTD.IPLOOK Networks
 
Technical SEO for Improved Accessibility WTS FEST
Technical SEO for Improved Accessibility  WTS FESTTechnical SEO for Improved Accessibility  WTS FEST
Technical SEO for Improved Accessibility WTS FESTBillieHyde
 
The Importance of Indoor Air Quality (English)
The Importance of Indoor Air Quality (English)The Importance of Indoor Air Quality (English)
The Importance of Indoor Air Quality (English)IES VE
 
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptxGraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptxNeo4j
 
AI Workshops at Computers In Libraries 2024
AI Workshops at Computers In Libraries 2024AI Workshops at Computers In Libraries 2024
AI Workshops at Computers In Libraries 2024Brian Pichman
 

Recently uploaded (20)

TrustArc Webinar - How to Live in a Post Third-Party Cookie World
TrustArc Webinar - How to Live in a Post Third-Party Cookie WorldTrustArc Webinar - How to Live in a Post Third-Party Cookie World
TrustArc Webinar - How to Live in a Post Third-Party Cookie World
 
Webinar: The Art of Prioritizing Your Product Roadmap by AWS Sr PM - Tech
Webinar: The Art of Prioritizing Your Product Roadmap by AWS Sr PM - TechWebinar: The Art of Prioritizing Your Product Roadmap by AWS Sr PM - Tech
Webinar: The Art of Prioritizing Your Product Roadmap by AWS Sr PM - Tech
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
March Patch Tuesday
March Patch TuesdayMarch Patch Tuesday
March Patch Tuesday
 
Stobox 4: Revolutionizing Investment in Real-World Assets Through Tokenization
Stobox 4: Revolutionizing Investment in Real-World Assets Through TokenizationStobox 4: Revolutionizing Investment in Real-World Assets Through Tokenization
Stobox 4: Revolutionizing Investment in Real-World Assets Through Tokenization
 
Novo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4jNovo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4j
 
Where developers are challenged, what developers want and where DevEx is going
Where developers are challenged, what developers want and where DevEx is goingWhere developers are challenged, what developers want and where DevEx is going
Where developers are challenged, what developers want and where DevEx is going
 
UiPath Studio Web workshop series - Day 1
UiPath Studio Web workshop series  - Day 1UiPath Studio Web workshop series  - Day 1
UiPath Studio Web workshop series - Day 1
 
Q4 2023 Quarterly Investor Presentation - FINAL - v1.pdf
Q4 2023 Quarterly Investor Presentation - FINAL - v1.pdfQ4 2023 Quarterly Investor Presentation - FINAL - v1.pdf
Q4 2023 Quarterly Investor Presentation - FINAL - v1.pdf
 
SIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENT
SIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENTSIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENT
SIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENT
 
2024.03.12 Cost drivers of cultivated meat production.pdf
2024.03.12 Cost drivers of cultivated meat production.pdf2024.03.12 Cost drivers of cultivated meat production.pdf
2024.03.12 Cost drivers of cultivated meat production.pdf
 
IT Service Management (ITSM) Best Practices for Advanced Computing
IT Service Management (ITSM) Best Practices for Advanced ComputingIT Service Management (ITSM) Best Practices for Advanced Computing
IT Service Management (ITSM) Best Practices for Advanced Computing
 
From the origin to the future of Open Source model and business
From the origin to the future of  Open Source model and businessFrom the origin to the future of  Open Source model and business
From the origin to the future of Open Source model and business
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and Insight
 
Top 10 Squarespace Development Companies
Top 10 Squarespace Development CompaniesTop 10 Squarespace Development Companies
Top 10 Squarespace Development Companies
 
Introduction - IPLOOK NETWORKS CO., LTD.
Introduction - IPLOOK NETWORKS CO., LTD.Introduction - IPLOOK NETWORKS CO., LTD.
Introduction - IPLOOK NETWORKS CO., LTD.
 
Technical SEO for Improved Accessibility WTS FEST
Technical SEO for Improved Accessibility  WTS FESTTechnical SEO for Improved Accessibility  WTS FEST
Technical SEO for Improved Accessibility WTS FEST
 
The Importance of Indoor Air Quality (English)
The Importance of Indoor Air Quality (English)The Importance of Indoor Air Quality (English)
The Importance of Indoor Air Quality (English)
 
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptxGraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
 
AI Workshops at Computers In Libraries 2024
AI Workshops at Computers In Libraries 2024AI Workshops at Computers In Libraries 2024
AI Workshops at Computers In Libraries 2024
 

High Scalability by Example – How can Web-Architecture scale like Facebook, Twitter, YouTube, etc.

  • 1. High Scalability by Example – How can Web Architecture scale like Facebook, Twitter? JAX 2011, 03.05.2011 Robert Mederer, Markus Bukowski Copyright © 2011 Accenture All Rights Reserved. Accenture, its logo, and High Performance Delivered are trademarks of Accenture. Speaker Robert Mederer Markus Bukowski Lead Technology Architect Senior Software Engineer Anni-Albers-Straße 11 Kaistraße 20 40221 Düsseldorf 80807 München Mobil: +49-175-57-64217 Mobil: +49-175-57-68012 markus.bukowski@accenture.com robert.mederer@accenture.com Experience 2000 - 2005: Technology Architect and Software 2006 - 2009: Software Engineer in several projects Engineer in several projects (during studies) for mainly telecommunication companies 2006: Technical Architecture Lead, Integration and Execution Architecture for Location-Based Service 2009 - 2011: Senior Software Engineer in several Provider projects 2009: Technical Architecture Lead, Frontend and 2009/2010: Coach and Software Engineer for a Execution Architecture Government Agency Health Insurance Fund 2009/2010: Technical Architecture and front-office 2011: Technical Architecture Lead during the integration build lead, Integration and Execution development of a Document Management Solution Architecture Financial Services Agency for a Government Agency 2011: Architect and Performance Engineer for Location Based Services Platform Copyright © 2010 Accenture. All rights reserved. 2
  • 2. Accenture High performance achieved Company Profile Worldwide Revenues $21.6 billion •  Global management consulting, (in US$ billion, as of August 31, 2010) technology services and outsourcing Communications & company Resources High Tech •  215.000 employees •  Rank 47 among the 3.88 4.83 “Best Global Brands 2008” •  Top 100 Employer 4.32 •  28 of the DAX-30-Companies 2.98 •  96 of the Fortune-Global-100 Public •  More than three-quarters of the Fortune- Financial 5.53 Service Services Global-500 •  87 of our Top 100-clients have been with Products us for 10 or more years Copyright © 2010 Accenture. All rights reserved. 3 Local Accenture … ??? Geographic unit •  Austria •  Switzerland •  Germany Employees Berlin •  Ca. 6000 Düsseldorf •  We are hiring! Exciting Technology work Frankfurt •  Large scale projects (100+ people / multiple years) Munich •  Most challenging requirements –  Stock Exchange / Banking / Trading Systems Vienna –  AEMS Mobility Platform –  Large Scale Web Applications Zurich (> 1M page views / day) –  Batch Architectures Copyright © 2010 Accenture. All rights reserved. 4
  • 3. Agenda •  Introduction – Scalability & CAP Theorem – Traditional websites & Social media websites – High Scalability in Numbers •  Accenture – High Scalability by Example •  Architecture at Internet Scale •  Common concepts of Scalability •  Conclusion Copyright © 2011 Accenture. All rights reserved. 5 Scalability A system’s capacity to uphold the same performance under heavier volumes. (Patterns for Performance and Operability: Building and Testing Enterprise Software, Chris Ford et. al., 2008) Copyright © 2011 Accenture. All rights reserved. 6
  • 4. Vertical Scalability •  Is achieved by increasing the capacity of a single node – CPU, – Memory, – Bandwidth, … •  Simple Process – Application is generally not affected by those changes •  Classical Example are Super Computers like – HP Integrity Superdome – IBM Mainframe Source: Hewlett-Packard Copyright © 2011 Accenture. All rights reserved. 7 Horizontal Scalability •  Application is spread on a cluster with several nodes •  Nodes can be added to scale out •  Produces overhead – Keep cluster consistent – Node error detection and handling – Communication between nodes •  May be used to increase reliability and availability •  Distributed Systems and Programs like – SETI@Home Source: Space Sciences Laboratory, U.C. Berkeley – World Wide Web – Domain Name Service Copyright © 2011 Accenture. All rights reserved. 8
  • 5. CAP Theorem (Brewer‘s Theorem) •  Consistency – all clients see the same data at the same time Consistency •  Availability – all clients can find all data even in presence of failure •  Partition Tolerance – system works even when one node failed Partition Availability Tolerance Impossible Source: PODC-keynote, Towards Robust Distributed Systems, Dr. Eric A. Brewer, 2000 Copyright © 2011 Accenture. All rights reserved. 9 CAP Theorem Normally, two of these properties for any shared-data system Consistency + Availability C •  High data integrity P A •  Single site, cluster database, LDAP, etc. •  2-phase commit, data replication, etc. C Consistency + Partition •  Distributed database, distributed locking, etc. P A •  Pessimistic locking, etc. Availability + Partition C •  High scalability P A •  Distributed cache, DNS, etc. •  Optimistic locking, expiration/leases (timeout), etc. Source: “Architecting Cloudy Applications”, David Chou Copyright © 2011 Accenture. All rights reserved. 10
  • 6. Agenda •  Introduction – Scalability & CAP Theorem – Traditional websites & Social media websites – High Scalability in Numbers •  Accenture – High Scalability by Example •  Architecture at Internet Scale •  Common concepts of Scalability •  Conclusion Copyright © 2011 Accenture. All rights reserved. 11 Traditional Websites (Scaling the Social Graph: Infrastructure at Facebook, Jason Sobel, QCon 2011) Copyright © 2011 Accenture. All rights reserved. 12
  • 7. Social Media Websites (Scaling the Social Graph: Infrastructure at Facebook, Jason Sobel, QCon 2011) Copyright © 2011 Accenture. All rights reserved. 13 Agenda •  Introduction – Scalability & CAP Theorem – Traditional websites & Social media websites – High Scalability in Numbers •  Accenture – High Scalability by Example •  Architecture at Internet Scale •  Common concepts of Scalability •  Conclusion Copyright © 2011 Accenture. All rights reserved. 14
  • 8. High Scalability in Numbers The World Is Obsessed With Facebook (Video) Source: http://www.youtube.com/watch?v=xJXOavGwAW8 Copyright © 2011 Accenture. All rights reserved. 15 High Scalability in Numbers facebook twitter Active users 500.000.000 New accounts per day 500.000 Active users logging in 250.000.000 Tweets per day 140.000.000 daily (a year ago) (50.000.000) Active mobile users 250.000.000 Max tweets per second 6.939 In 20 minutes … Links shared 1.000.000 Status updates 1.851.000 Photos uploaded 2.716.000 Source: http://www.facebook.com/press/info.php?statistics, http://www.allfacebook.com/ Comments made 10.208.000 16 Copyright © 2011 Accenture. All rights reserved.
  • 9. High Scalability in Numbers LinkedIn Members >100.000.000 Connections between Members 1.300.000.000 Visitors per month 128.000.000 % of global internet users who visit LinkedIn daily 4% Source: http://blog.linkedin.com, http://press.linkedin.com/about/, LinkedIn Demographics 20111 Copyright © 2011 Accenture. All rights reserved. 17 Agenda •  Introduction •  Accenture – High Scalability by Example – The Royal Wedding Website – US Based Professional Sport League .com Website •  Accenture – High Scalability by Example •  Architecture at Internet Scale •  Common concepts of Scalability •  Conclusion Copyright © 2011 Accenture. All rights reserved. 18
  • 10. Agenda •  Introduction •  Accenture – High Scalability by Example •  Architecture at Internet Scale – Facebook – Twitter – LinkedIn – Challenges •  Common concepts of Scalability •  Conclusion Copyright © 2011 Accenture. All rights reserved. 19 Facebook – Key Facts •  Biggest Social Network •  People are linked together •  People share and exchange information in real-time •  Facebook provides a – A personalized News Feed containing news about friends – Picture Service including people tagging / linking – Messaging Service to stay in connect with friends – Platform to play Games with friends Copyright © 2011 Accenture. All rights reserved. 20
  • 11. Facebook – Architecture (1) •  Facebook „main“ architecture is based Memcache on a LAMP stack •  The application logic resides in Web Server layer (PHP based) •  The application layer takes care of the data distribution •  Some services do not fit (C++, Java) – Search MySQL – Ads – People You May Know Invalidation logic is implemented in – Multifeed the application! Copyright © 2011 Accenture. All rights reserved. 21 Facebook – Architecture (2) Multifeed / Newsfeed •  Distributed System •  Every user gets – 45 best rated updates –  on every reload •  Every DB-Update results in a Scribe Notification –  Leaf Nodes / Server are notified Source: “High Performance at Massive Scale – Lessons learned at Facebook”, Jeff Rothschild, 2009 Inspired By: “Architecting Cloudy Applications”, David Chou, 2010 Copyright © 2011 Accenture. All rights reserved. 22
  • 12. Agenda •  Introduction •  Accenture – High Scalability by Example •  Architecture at Internet Scale – Facebook – Twitter – LinkedIn – Challenges •  Common concepts of Scalability •  Conclusion Copyright © 2011 Accenture. All rights reserved. 23 Twitter – Key Facts •  Twitter is a real-time information network •  Twitter‘s key characteristics – Tweets are used for sharing information – Timeline of tweets must be kept – A Social Graph is maintained to deliver information Copyright © 2011 Accenture. All rights reserved. 24
  • 13. Twitter - Architecture Load Balancers Web Frontend Apache mod_proxy Apps & Distributed Rails (Unicorn) Cache Services Flock memcached Kestrel Queues Partitioned Data Distributed MySQL Cassandra Storage Daemons Source: “Architecting Cloudy Applications”, David Chou Copyright © 2011 Accenture. All rights reserved. 25 Agenda •  Introduction •  Accenture – High Scalability by Example •  Architecture at Internet Scale – Facebook – Twitter – LinkedIn – Challenges •  Common concepts of Scalability •  Conclusion Copyright © 2011 Accenture. All rights reserved. 26
  • 14. LinkedIn – Key Facts •  A social media networking platform to find connections to recommended job candidates, industry experts and business partners •  99% pure Java LinkedIn – Applications •  Groups •  Job market •  Mailing •  Profiles (Friends and Companies) •  Mobile Applications (Android, iPhone, Blackberry, Palm) Copyright © 2011 Accenture. All rights reserved. 27 LinkedIn - Architecture Web Frontend LinkedIn Spring Search Grails MVC Queuing Spring Caching / Messaging DWR Spring MVC Grails DWR Spring Framework (IoC, Web Lucene Searchprocessing AOP, Model-View-Controller DI, for Asynchronous Engine •  Direct Web Remoting (AJAX for Voldemort Open-source web application Frameworktriggered writes •  Java Library) Concerns,..) by Separation of open "coding Aims to bring the source in •  Apache combines high- •  Voldemort user Apps & Services •  RPC library toparadigm to the text Used performance, full-featured LinkedIn extensions with Groovy memory caching for convention" call Java functions search engine library written storage system so that a Service Service LinkedIn Spring Service Service from JavaScript and (lin:bean, •  XML Vocabulary to call StorageentirelyLoad of Page not Initial •  Why Grails? incaching tier is Valdemort Hadoop separate Java Storagerequired a more from Javaweb Sensei JavaScript functions productive •  lin:component, …) Needed software Hadoop isOperations framework Pull a •  ZoieProperty Editors a major Valdemort used as •  Forframework •  Used supports data-intensive distributed database distributed key-value data cache Lucene Zoie Bobo Caching Voldemort ehcache Memcache •  that realtime indexing/search ehcache •  … •  New kind of application context Search Index •  Asynchronous Communication for• the social graph to (map/ horizontallyprototyping Rapid •  distributed applications source, system scaledopen •  for life-cycle management with Ehcache is an •  Push Operations with and demonstrate feasibility •  reduce /Consistency cachehigh standards-based relaxed distributed file system) used Queuing / Messaging Bobo productivity gain Advantage: components durabilityis usedwith engine offload •  … to boost performance, guarantees majorjava/ •  Hadoop database and simplify the Integration as a faceted search existing for ActiveMQ the concurrent read Fast Socialkey-value data Graph distributed index building store •  spring-based solutions scalability Fast •  handles semi-structured and •  for the social graph Memcache structured data Storage •  Bottleneck: •  Distributed Cache MySQL Sensei Hadoop Voldemort •  Read data •  Index building • Keep core data ACID • Keep replicated and cached data BASE • Cache data on a cheap memory (memcached) • Use a hint to route the client to his / her’s ACID data Source: Project Voldemort, Jay Kreps Source: http://sna-projects.com Copyright © 2011 Accenture. All rights reserved. 28
  • 15. Agenda •  Introduction •  Accenture – High Scalability by Example •  Architecture at Internet Scale •  Common concepts of Scalability •  Conclusion Copyright © 2011 Accenture. All rights reserved. 29 Common concepts of Scalability parallelization asynchronous idempotent 7 Habits of operations Good partitioned data Distributed fault-tolerance Systems optimistic concurrency shared nothing Source: “Architecting Cloudy Applications”, David Chou Source: highscalability.com Copyright © 2011 Accenture. All rights reserved. 30
  • 16. Common concepts of Scalability Hybrid architectures • Scale-out (horizontal) • Scale-up (vertical) – BASE – ACID – focus on “commit” – availability first – optimistic locking – pessimistic locking – shared nothing – transactional – maximize scalability – favor accuracy/consistency Most distributed systems realize both Copyright © 2011 Accenture. All rights reserved. 31 Agenda •  Introduction •  Accenture – High Scalability by Example •  Architecture at Internet Scale •  Common concepts of Scalability •  Conclusion –  Questionnaire –  Déjà vu - Common concepts –  Don’t forget the Infrastructure Copyright © 2011 Accenture. All rights reserved. 32
  • 17. Conclusion Questionnaire •  Is there a need to scale my application? –  Vertical scaling is more easy to achieve –  Use horizontal scaling only when required •  Is there a plan to proof your designed solution? –  Plan to do a lot of realistic Proof-of-Concepts •  Is there a one size fits all solution? –  NO! •  How important is ACID? – Is BASE enough? – Can a NoSQL solution be used? Copyright © 2011 Accenture. All rights reserved. 33 Conclusion Déjà vu - Common concepts •  Asynchronous Processing – Keep Facebook, Twitter and LinkedIn in mind –  Asynchronous writes and even UI •  Data Partioining –  You must understand your data –  Is a hybrid solution as used by LinkedIn applicable? •  Design To Failure –  Lessons learned from Amazon‘s cloud solution outage Copyright © 2011 Accenture. All rights reserved. 34
  • 18. Conclusion Don’t forget the Infrastructure •  Disk IO compared to RAM is slow   Avoid „slow“ IO access whenever possible   Caching, Caching, Caching   Bulk Updates (Batch) over small updates •  Network Latency   Keep in mind that there is a network   Where are the users located?   How are your users connected? mobile application?   Is there a need for a datacenter distribution? •  Hardware configuration –  Is the CPU Power Saving working right? Copyright © 2011 Accenture. All rights reserved. 35 Questions ? Copyright © 2011 Accenture. All rights reserved. 36