Software Architecture Fundamentals 
Tung.Nguyen (tungnq@fsoft.com.vn) 
Solution Architect 
Jun-2014 
Architecture Patterns 
Open Discussion 
-Scalable System Design 
-Facebook Architecture
Architecture Portfolios 
Scalable System Design Principles
Quality Attributes 
Availability 
Performance 
Reliability 
Scalability 
Manageability 
Cost 
Key Quality Attributes 
http://www.aosabook.org/en/distsys.html
"Scalability" is not equivalent to "Raw Performance" 
Understand the environmental workload conditions that the system is designed for 
Understand who your priority customers are 
Scale out, not up 
Keep your code modular and simple 
Don't guess the bottleneck, Measure it 
Plan for growth 
What Should We Focus On? 
http://horicky.blogspot.com/2008/02/scalable-system-design.html
Techniques 
Server Farm/Cluster (real time access) 
Data Partitioning 
Map / Reduce (Batch Parallel Processing) 
Content Delivery Network (Static Cache) 
Cache Engine (Dynamic Cache) 
Resources Pool 
Calculate an approximate result 
Filtering at the source 
Asynchronous Processing 
Common Techniques 
Architecture Portfolios 
Scalable System Design Patterns
Load Balancer 
•A dispatcher determines which worker instance will handle a request based on different policies. 
Scatter and Gather 
•A dispatcher multicasts requests to all workers in a pool. Each worker will compute a local result and send it back to the dispatcher, who will consolidate them into a single response and then send back to the client. 
Result Cache 
•A dispatcher first looks up whether the request has been made before and tries to return the previous result, in order to save the actual execution. 
Shared Space 
•All workers monitor information in the shared space and contribute partial knowledge back to the blackboard. The information is continuously enriched until a solution is reached. 
Pipe and Filter 
•All workers are connected by pipes across which data flows. 
MapReduce 
•Targets batch jobs where disk I/O is the major bottleneck. It uses a distributed file system so that disk I/O can be done in parallel. 
Bulk Synchronous Parallel 
•A lock-step execution across all workers, coordinated by a master. 
Execution Orchestrator 
•An intelligent scheduler / orchestrator schedules ready-to-run tasks (based on a dependency graph) across a cluster of dumb workers. 
8 Commonly Used Scalable System Design Patterns 
Load Balancer (1) 
•There is a dispatcher that determines which worker instance will handle the request based on different policies. 
•The application should ideally be "stateless" so that any worker instance can handle the request.
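A minimal sketch of the dispatcher idea, assuming in-process functions stand in for worker instances. The names `Dispatcher` and `make_worker`, and the two policies shown, are illustrative and not taken from the slides.

```python
import itertools
import random
from typing import Callable, List, Sequence

Worker = Callable[[str], str]


def make_worker(name: str) -> Worker:
    """Return a stateless worker: its reply depends only on the request."""
    def handle(request: str) -> str:
        return f"{name} handled {request}"
    return handle


class Dispatcher:
    """Forwards each request to one worker chosen by a pluggable policy."""

    def __init__(self, workers: Sequence[Worker], policy: str = "round_robin") -> None:
        if not workers:
            raise ValueError("at least one worker is required")
        self._workers: List[Worker] = list(workers)
        self._policy = policy
        self._cycle = itertools.cycle(self._workers)

    def dispatch(self, request: str) -> str:
        if self._policy == "round_robin":
            worker = next(self._cycle)          # take workers in turn
        elif self._policy == "random":
            worker = random.choice(self._workers)
        else:
            raise ValueError(f"unknown policy: {self._policy}")
        return worker(request)


if __name__ == "__main__":
    lb = Dispatcher([make_worker("w1"), make_worker("w2")])
    for req in ["GET /a", "GET /b", "GET /c"]:
        print(lb.dispatch(req))
```

Because the workers keep no per-client state, the dispatcher is free to send consecutive requests from the same client to different workers.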
Load Balancer (2) 
•Multi-Datacenter Architecture 
•3-Tier Architecture 
Rightscale: Cloud_Computing_System_Architecture_Diagrams
Load Balancer (3) 
•Multi-Tier Architecture with Memcached 
Rightscale: Cloud_Computing_System_Architecture_Diagrams
Scatter and Gather 
•This pattern is used in search engines such as Yahoo and Google to handle users' keyword search requests.
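A small sketch of scatter and gather for a keyword search, assuming the index is split into in-memory shards and threads stand in for remote workers. `SHARDS`, `search_shard`, and `scatter_gather` are hypothetical names, and the ranking rule is an assumption made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

# Hypothetical index shards: each worker owns a slice of the documents.
SHARDS: List[Dict[str, str]] = [
    {"doc1": "scalable system design", "doc2": "facebook chat service"},
    {"doc3": "system design patterns", "doc4": "photo storage"},
]


def search_shard(shard: Dict[str, str], keyword: str) -> List[Tuple[str, int]]:
    """Worker: compute a local result (doc id, hit count) for one shard."""
    hits = []
    for doc_id, text in shard.items():
        count = text.split().count(keyword)
        if count:
            hits.append((doc_id, count))
    return hits


def scatter_gather(keyword: str) -> List[Tuple[str, int]]:
    """Dispatcher: send the query to every shard, then merge the local results."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partials = list(pool.map(lambda s: search_shard(s, keyword), SHARDS))
    merged = [hit for partial in partials for hit in partial]
    # Consolidate: highest hit count first, doc id as tie-breaker.
    return sorted(merged, key=lambda h: (-h[1], h[0]))


if __name__ == "__main__":
    print(scatter_gather("design"))
```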
Result Cache 
The dispatcher first looks up whether the request has been made before and tries to return the previous result, in order to save the actual execution.
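The lookup-before-execute step can be sketched as follows, assuming requests are plain strings and the cache is an unbounded in-process dictionary. `CachingDispatcher` is an illustrative name; a real deployment would also need eviction and invalidation, which this sketch leaves out.

```python
from typing import Callable, Dict


class CachingDispatcher:
    """Returns a stored result when the same request was seen before."""

    def __init__(self, execute: Callable[[str], str]) -> None:
        self._execute = execute
        self._cache: Dict[str, str] = {}
        self.executions = 0  # counts how often the real work ran

    def handle(self, request: str) -> str:
        if request in self._cache:
            return self._cache[request]       # cache hit: skip execution
        self.executions += 1
        result = self._execute(request)       # cache miss: do the real work
        self._cache[request] = result
        return result


if __name__ == "__main__":
    dispatcher = CachingDispatcher(lambda r: r.upper())
    print(dispatcher.handle("query"), dispatcher.handle("query"), dispatcher.executions)
```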
Shared Space 
Pipe and Filter 
It is a very common EAI pattern.
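One way to picture pipe and filter is a chain of Python generators, where each generator is a filter and the iteration between them plays the role of the pipe. The log-line example and the filter names below are assumptions chosen for illustration.

```python
from typing import Iterable, Iterator


def strip_blank(lines: Iterable[str]) -> Iterator[str]:
    """Filter 1: trim whitespace and drop empty lines."""
    for line in lines:
        line = line.strip()
        if line:
            yield line


def keep_errors(lines: Iterable[str]) -> Iterator[str]:
    """Filter 2: pass through only lines that start with ERROR."""
    for line in lines:
        if line.startswith("ERROR"):
            yield line


def extract_message(lines: Iterable[str]) -> Iterator[str]:
    """Filter 3: keep the text after the ERROR tag."""
    for line in lines:
        yield line.split("ERROR", 1)[1].strip()


def pipeline(source: Iterable[str]) -> Iterator[str]:
    """Connect the filters; data flows through them one item at a time."""
    return extract_message(keep_errors(strip_blank(source)))


if __name__ == "__main__":
    for message in pipeline(["INFO ok", "", "ERROR disk full"]):
        print(message)
```

Each filter knows only its input and output streams, so filters can be reordered or replaced without touching the others.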
Map Reduce 
The model targets batch jobs where disk I/O is the major bottleneck. It uses a distributed file system so that disk I/O can be done in parallel.
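The shape of a MapReduce job can be shown with a word count that runs in a single process. This is only a sketch of the map, shuffle, and reduce phases: it has no distributed file system and no parallel I/O, and the function names are illustrative.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Map: emit (word, 1) for every word in one input split."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine the values of one key into a single count."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["a b a", "B c"]))
```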
Bulk Synchronous Parallel 
Execution Orchestrator 
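A rough sketch of an orchestrator that hands ready-to-run tasks to a pool of workers according to a dependency graph. It assumes Python 3.9+ for the standard-library `graphlib` module, uses threads as the "dumb workers", and simplifies scheduling by waiting for each batch of ready tasks to finish before asking for the next. `run_tasks` is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set


def run_tasks(
    dependencies: Dict[str, Set[str]],
    tasks: Dict[str, Callable[[], None]],
    workers: int = 2,
) -> List[List[str]]:
    """Run every task after its dependencies; return the batches that ran."""
    sorter = TopologicalSorter(dependencies)  # maps task -> tasks it depends on
    sorter.prepare()
    batches: List[List[str]] = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while sorter.is_active():
            ready = sorted(sorter.get_ready())           # tasks whose inputs are done
            list(pool.map(lambda name: tasks[name](), ready))
            sorter.done(*ready)                          # unlock dependent tasks
            batches.append(ready)
    return batches


if __name__ == "__main__":
    deps = {"load": set(), "clean": {"load"}, "stats": {"clean"}, "plot": {"clean"}}
    jobs = {name: (lambda n=name: print("running", n)) for name in deps}
    print(run_tasks(deps, jobs))
```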
Architecture Portfolios 
Facebook Architecture 
-Facebook Web Site 
-Chat Service @ Facebook 
-Big Data @ Facebook
Facebook Web Site 
•Users 
–More than 400 million active users. 
–50% of active users log in each day. 
–Average user has 130 friends on the site. 
•Activity 
–User spends an average of 55 mins per day. 
–More than 60 million status updates each day. 
–More than 100 million photos uploaded each day. 
•Platform 
–Currently 500K active applications. 
–About 250 apps have more than 1 million users. 
–About 60 million users use FB Connect from external web sites each month. 
Facebook Web Site Statistics 
Challenges 
•High Concurrency 
•High Data Volumes 
•Multilevel Hierarchical Data 
Ok to Live with 
•Not Mission Critical 
•Cached data is fine 
•Write Failures are tolerable 
Facebook Web Site Technical Challenges 
•General Design Principles 
–Use open source where possible 
–Unix philosophy 
•Keep individual components simple yet performant 
•Combine as necessary 
•Concentrate on clean interface points 
–Build everything for scale 
–Try to minimize failure points 
–Simplicity, Simplicity, Simplicity. 
Facebook Web Site Architecture 
Facebook Web Site Architecture 
•Load Balancer 
•Presentation Layer: web servers running PHP, which basically assemble data from the lower layers and present it on the page. 
•Services: backend services are mainly implemented in C++ (other languages can be used). Very fast. 
•Cache: Memcached is used to cache almost everything that is needed to produce a page. 
•Database: an array of MySQL servers is used in an interesting way to store the data. 
•File System: Haystack, an internally developed file system, is used to store photos.
Facebook Web Site Technology Stack 
•Presentation Layer: PHP 
•Issues: 
–High CPU and memory consumption. 
–Interoperability with C++ is challenging. 
–Language doesn’t encourage good programming in the large. 
–Initialization cost of each page scales with size of code base 
Facebook Web Site Web Tier at Facebook 
[Chart: Relative Performance of Language Runtime (lower is better); programming language performance ranking. Languages shown: Ruby, PHP, Perl, Python, Erlang, C#, Java, C++. Bar values as extracted: 70, 40, 38, 21, 6, 3, 2, 1.]
•Optimizing PHP 
–Op-code optimization 
–APC improvements 
•Lazy loading 
•Cache priming 
–Custom extensions 
•Memcache client extension 
•Serialization format 
•Logging, Stats Collection, Monitoring 
•Asynchronous event-handling mechanism 
Facebook Web Site Web Tier at Facebook 
•HipHop 
–Source Code Transformer 
•Transforms PHP into highly optimized C++ and then compiles it using g++ 
–50% reduction in CPU usage compared with Apache + PHP 
–Facebook’s API tier can serve twice the traffic using 30% less CPU 
–It embeds a simple web server built on top of libevent. 
Facebook Web Site Web Tier at Facebook 
https://github.com/facebook/hhvm
•Tornado: Facebook's Real-Time Web Framework for Python 
–Tornado is a relatively simple, non-blocking Web server framework written in Python 
–It is designed to handle thousands of simultaneous connections, making it ideal for real-time Web services. 
Facebook Web Site Web Tier at Facebook 
http://www.tornadoweb.org/
•BigPipe first breaks web pages into multiple chunks called pagelets 
Facebook Web Site Web Tier at Facebook 
•Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering. 
•Alleviates database load 
•Over 25TB of in-memory caching on more than 800 servers 
•Multi-gets used to make the system more efficient 
•Facebook contributed UDP support and performance enhancements to Memcached. 
Facebook Web Site Memcached 
http://memcached.org/
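How a multi-get relieves the database can be sketched with a dictionary-backed stand-in for the cache. `FakeCache` and its `get_multi` / `set_multi` methods are hypothetical and do not represent any particular Memcached client API; `DATABASE` and `load_from_db` are likewise made up for the example.

```python
from typing import Dict, Iterable, List


class FakeCache:
    """Dict-backed stand-in for a Memcached client (not a real client API)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get_multi(self, keys: Iterable[str]) -> Dict[str, str]:
        return {k: self._data[k] for k in keys if k in self._data}

    def set_multi(self, mapping: Dict[str, str]) -> None:
        self._data.update(mapping)


# Hypothetical backing store and a log of the queries sent to it.
DATABASE: Dict[str, str] = {"user:1": "Alice", "user:2": "Bob", "user:3": "Carol"}
db_queries: List[List[str]] = []


def load_from_db(keys: Iterable[str]) -> Dict[str, str]:
    keys = list(keys)
    db_queries.append(keys)
    return {k: DATABASE[k] for k in keys if k in DATABASE}


def fetch(cache: FakeCache, keys: Iterable[str]) -> Dict[str, str]:
    """Ask the cache for all keys at once; query the database only for misses."""
    keys = list(keys)
    found = cache.get_multi(keys)                  # one round trip for every key
    missing = [k for k in keys if k not in found]
    if missing:
        loaded = load_from_db(missing)             # database sees only the misses
        cache.set_multi(loaded)
        found.update(loaded)
    return found


if __name__ == "__main__":
    cache = FakeCache()
    print(fetch(cache, ["user:1", "user:2"]))
    print(fetch(cache, ["user:1", "user:3"]), db_queries)
```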
•MySQL 
–Fast and reliable 
–Thousands of MySQL servers 
•Users randomly distributed across these servers 
–Relational aspect of DB is not used 
•No joins. Logically difficult (data is distributed randomly) 
•Primarily key-value store 
–Customizations 
•Custom partitioning scheme: a global ID is assigned to all data 
•Custom archiving scheme: based on the frequency and recency of data on a per-user basis 
Facebook Web Site MySQL Database 
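The slides do not describe the actual partitioning scheme, so the following is only a sketch under stated assumptions: every record gets a global ID from a single counter, the shard is chosen by taking that ID modulo the shard count, and each shard is a plain key-value table. `ShardedStore` is a hypothetical name.

```python
import itertools
from typing import Dict, List, Optional


class ShardedStore:
    """Key-value rows spread over several shards, addressed by a global ID."""

    def __init__(self, shard_count: int) -> None:
        if shard_count < 1:
            raise ValueError("shard_count must be at least 1")
        self._shards: List[Dict[int, str]] = [{} for _ in range(shard_count)]
        self._next_id = itertools.count(1)   # assumption: one global ID sequence

    def shard_for(self, global_id: int) -> int:
        # Assumption for illustration: modulo mapping from ID to shard.
        return global_id % len(self._shards)

    def put(self, value: str) -> int:
        global_id = next(self._next_id)
        self._shards[self.shard_for(global_id)][global_id] = value
        return global_id

    def get(self, global_id: int) -> Optional[str]:
        return self._shards[self.shard_for(global_id)].get(global_id)


if __name__ == "__main__":
    store = ShardedStore(shard_count=4)
    photo_id = store.put("photo metadata")
    print(photo_id, store.shard_for(photo_id), store.get(photo_id))
```

Since two related records can land on different shards, a lookup by ID stays cheap while a join across records would have to touch several servers, which fits the "no joins" point above.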
Facebook Web Site Memcached & MySQL at Facebook 
•Many services written in C++, Python, Java 
–AdServer 
–Search 
–Network Selector 
–News Feed 
–Blogfeeds 
–CSSParser 
–Mobile 
–ShareScraper 
Facebook Web Site Services 
•Services Philosophy 
–Create a service if required 
•Real overhead for deployment, maintenance, separate code-base 
•Another failure point 
–Create a common framework and toolset that allow for easier creation of services 
•Thrift 
•Scribe 
•ODS, Alerting, Monitoring service 
–Use the right language 
Facebook Web Site Services 
•How to communicate between these? 
•Thrift 
–http://incubator.apache.org/thrift/ 
–Lightweight software framework for cross-language development 
–Provides an IDL and statically generated code 
–Supported bindings: C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, Smalltalk, and Ocaml 
Facebook Web Site Services 
Architecture Portfolios 
Facebook Architecture 
-Facebook Web Site 
-Chat Service @ Facebook 
-Big Data @ Facebook
•Statistics 
–Facebook has 200M active users 
–800+ million user messages / day 
–7+ million active channels at peak 
–1GB+ in / sec at peak 
–100+ channel machines 
•System challenges 
–How does synchronous messaging work on the Web? 
•"Presence" is hard to scale 
–Need a system to queue and deliver messages 
–Millions of connections, mostly idle 
–Need logging, at least between page loads 
–Make it work in Facebook’s environment 
Chat Service @ Facebook Statistics & Challenges 
Chat Service @ Facebook System Overview 
•User Interface 
–Mix of client-side JavaScript and server-side PHP 
–Works around transport errors, browser differences 
–Regular AJAX for sending messages, fetching conversation history 
–Periodic AJAX polling for list of online friends 
–AJAX long-polling for messages (Comet) 
•Back End 
–Discrete responsibilities for each service 
•Communicate via Thrift 
–Channel (Erlang): message queuing and delivery 
•Queue messages in each user’s “channel” 
•Deliver messages as responses to long-polling HTTP requests 
–Presence (C++): aggregates online info in memory (pull-based presence) 
–Chatlogger (C++): stores conversations between page loads 
–Web tier (PHP): serves our vanilla web requests 
Chat Service @ Facebook Message Send 
Chat Service @ Facebook Channel Server 
•One channel per user 
•Web tier delivers messages for that user 
•Channel State: short queue of sequenced messages 
•Long poll for streaming (Comet) 
–Clients make an HTTP request 
–Server replies when a message is ready 
–One active request per browser tab
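The channel behaviour can be sketched in Python with `asyncio`, leaving HTTP out entirely: `long_poll` stands for the handler of one long-polling request and `deliver` for the web tier handing over a message. The production channel servers are written in Erlang; the class name, the queue length, and the timeout handling below are assumptions for illustration.

```python
import asyncio
from typing import List, Tuple


class Channel:
    """One user's channel: a short queue of sequenced messages."""

    def __init__(self, max_queue: int = 50) -> None:
        self._messages: List[Tuple[int, str]] = []
        self._next_seq = 1
        self._max_queue = max_queue
        self._new_message = asyncio.Condition()

    async def deliver(self, text: str) -> int:
        """Queue a message and wake any request waiting on this channel."""
        async with self._new_message:
            seq = self._next_seq
            self._next_seq += 1
            self._messages.append((seq, text))
            del self._messages[:-self._max_queue]   # keep only the newest entries
            self._new_message.notify_all()
            return seq

    async def long_poll(self, after_seq: int, timeout: float) -> List[Tuple[int, str]]:
        """Reply when a message newer than after_seq is ready, or [] on timeout."""
        def pending() -> List[Tuple[int, str]]:
            return [m for m in self._messages if m[0] > after_seq]

        async with self._new_message:
            try:
                await asyncio.wait_for(
                    self._new_message.wait_for(lambda: bool(pending())), timeout
                )
            except asyncio.TimeoutError:
                return []                           # client is expected to poll again
            return pending()


async def demo() -> None:
    channel = Channel()
    waiter = asyncio.create_task(channel.long_poll(after_seq=0, timeout=1.0))
    await channel.deliver("hello")
    print(await waiter)


if __name__ == "__main__":
    asyncio.run(demo())
```

The client passes the last sequence number it has seen, so a reconnecting browser tab picks up exactly the messages it missed as long as they are still in the short queue.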
Architecture Portfolios 
Facebook Architecture 
-Facebook Web Site 
-Chat Service @ Facebook 
-Big Data @ Facebook
Big Data @ Facebook Overview 
Big Data @ Facebook Data Infrastructure Overview 
Data Infrastructure @ FB built on open source technologies: 
•Many committers across all the projects 
•Plans to open source other parts of the data stack 
•Figure out a model to stay in sync and still work at FB speed
Big Data @ Facebook Life of a tag for data infrastructure 
•Technology: 
–Log collection - Scribe 
–Real-time analytics - Puma 
–Batch analytics - Hive 
–Ad hoc analytics - Peregrine 
–Periodic analytics - Nocron
Big Data @ Facebook Warehouse Architecture 
4TB of compressed new data added per day. 
135TB of compressed data scanned per day. 
7500+ Hive jobs on the production cluster per day. 
80K compute hours per day.
•Scribe 
–http://developers.facebook.com/scribe/ 
–Scalable distributed logging framework 
–Useful for logging a wide array of data 
–Simple data model 
–Built on top of Thrift 
•Hive 
–A system for managing and querying structured data, built on top of Hadoop. 
•Map Reduce for execution 
•HDFS for storage 
•Metadata in an RDBMS 
–Key Building Principles: 
•SQL as a familiar data warehousing tool. 
•Extensibility: types, functions, formats, scripts. 
•Scalability and Performance 
•Interoperability. 
Big Data @ Facebook Warehouse Architecture 
Big Data @ Facebook Warehouse Architecture 
•Hive Architecture
Big Data @ Facebook Warehouse Architecture 
•Data Flow Architecture
•Memcache at FB - https://www.facebook.com/video/video.php?v=631826881803 
•http://www.gargasz.info/facebook-discovering-software-architecture/ 
•http://www.infoq.com/presentations/Facebook-Software-Stack 
•http://www.infoq.com/presentations/Scale-at-Facebook 
•http://www.slideshare.net/AditiTechnologies/facebook-architecture-breaking-it-open 
•http://stackoverflow.com/questions/3533948/facebook-architecture 
•http://www.quora.com/Facebook-Engineering/What-is-Facebooks-architecture 
•http://www.slideshare.net/AditiTechnologies/google-architecture-breaking-it-open 
Open Discussion !!! 
cybersecurity notes for mca students for learningVitsRangannavar
 
XpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsXpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsMehedi Hasan Shohan
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number SystemsJheuzeDellosa
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationkaushalgiri8080
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 

Recently uploaded (20)

The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The Ugly
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learning
 
XpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsXpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software Solutions
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 

Architecture Patterns - Open Discussion

  • 1. TOPIC NAME Software Architecture Fundamentals Tung.Nguyen (tungnq@fsoft.com.vn) Solution Architect Jun-2014 Architecture Patterns Open Discussion -Scalable System Design -Facebook Architecture
  • 2. Architecture Portfolios Scalable System Design Principles
  • 3. Quality Attributes Availability Performance Reliability Scalability Manageability Cost Key Quality Attributes 3 http://www.aosabook.org/en/distsys.html
  • 4. "Scalability" is not equivalent to "Raw Performance" Understand the environmental workload conditions that the system is designed for Understand who your priority customers are Scale out, do not scale up Keep your code modular and simple Don't guess the bottleneck; measure it Plan for growth What Should We Focus On? 4 http://horicky.blogspot.com/2008/02/scalable-system-design.html
  • 5. Techniques Server Farm/Cluster (real time access) Data Partitioning Map / Reduce (Batch Parallel Processing) Content Delivery Network (Static Cache) Cache Engine (Dynamic Cache) Resources Pool Calculate an approximate result Filtering at the source Asynchronous Processing Common Techniques 5
  • 6. Architecture Portfolios Scalable System Design Patterns
  • 7. Load Balancer •A dispatcher determines which worker instance will handle a request based on different policies. Scatter and Gather •A dispatcher multicasts requests to all workers in a pool. Each worker computes a local result and sends it back to the dispatcher, which consolidates the results into a single response and sends it back to the client. Result Cache •A dispatcher first looks up whether the request has been made before and tries to return the previous result, in order to save the actual execution. Shared Space •All workers monitor information in the shared space (the blackboard) and contribute partial knowledge back to it. The information is continuously enriched until a solution is reached. Pipe and Filter •All workers are connected by pipes across which data flows. MapReduce •Targets batch jobs where disk I/O is the major bottleneck. It uses a distributed file system so that disk I/O can be done in parallel. Bulk Synchronous Parallel •A lock-step execution across all workers, coordinated by a master. Execution Orchestrator •An intelligent scheduler / orchestrator schedules ready-to-run tasks (based on a dependency graph) across a cluster of dumb workers. 8 Commonly Used Scalable System Design Patterns 7
  • 8. Load Balancer (1) 8 •There is a dispatcher that determines which worker instance will handle the request based on different policies. •The application should best be "stateless" so any worker instance can handle the request.
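As a rough illustration of the dispatcher described on the slide above, here is a minimal sketch that uses round-robin as the policy. The class `RoundRobinDispatcher` and the helper `make_worker` are hypothetical names introduced for this example; the slides do not prescribe a policy or an implementation.

```python
from itertools import cycle


class RoundRobinDispatcher:
    """Dispatches each request to the next worker in a fixed rotation."""

    def __init__(self, workers):
        if not workers:
            raise ValueError("at least one worker is required")
        self._rotation = cycle(workers)

    def dispatch(self, request):
        # Round-robin is only one possible policy (an assumption here).
        worker = next(self._rotation)
        return worker(request)


def make_worker(name):
    # A stateless worker: the response depends only on the request,
    # so any worker instance can serve any request.
    def handle(request):
        return f"{name} handled {request}"
    return handle


dispatcher = RoundRobinDispatcher([make_worker("w1"), make_worker("w2")])
responses = [dispatcher.dispatch(f"req{i}") for i in range(3)]
```

Because the workers keep no per-client state, the dispatcher is free to send consecutive requests from the same client to different workers.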
  • 9. Load Balancer (2) •Multi-Datacenter Architecture 9 •3-Tier Architecture Rightscale: Cloud_Computing_System_Architecture_Diagrams
  • 10. Load Balancer (3) 10 •Multi-Tier Architecture with Memcached Rightscale: Cloud_Computing_System_Architecture_Diagrams
  • 11. Scatter and Gather 11 •This pattern is used in search engines such as Yahoo and Google to handle users' keyword search requests, etc.
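The scatter and gather flow can be sketched as follows, assuming a toy keyword search over in-memory shards. The names `search_shard`, `scatter_gather`, and `SHARDS` are hypothetical, and a thread pool stands in for a pool of remote workers.

```python
from concurrent.futures import ThreadPoolExecutor


def search_shard(shard, keyword):
    """Worker: compute a local result for one shard."""
    return [doc for doc in shard if keyword in doc]


def scatter_gather(shards, keyword):
    """Dispatcher: send the request to every worker, then merge the results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        # Scatter: one task per shard, all with the same keyword.
        partials = pool.map(search_shard, shards, [keyword] * len(shards))
        # Gather: consolidate the local results into a single response.
        merged = [doc for partial in partials for doc in partial]
    return sorted(merged)


# Hypothetical sample data, two shards.
SHARDS = [
    ["scalable system design", "pipe and filter"],
    ["facebook chat system", "map reduce"],
]
hits = scatter_gather(SHARDS, "system")
```

Sorting the merged list is a stand-in for whatever ranking step a real search engine would apply when consolidating results.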
  • 12. Result Cache 12 The dispatcher first looks up whether the request has been made before and tries to return the previous result, in order to save the actual execution.
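A minimal sketch of the lookup-before-execute behaviour, assuming requests are hashable and results never expire. `CachingDispatcher` and its `executions` counter are hypothetical names used only for illustration.

```python
class CachingDispatcher:
    """Returns a stored result when the same request has been seen before."""

    def __init__(self, execute):
        self._execute = execute   # the expensive operation to avoid repeating
        self._cache = {}          # request -> previous result
        self.executions = 0       # counts real executions, for illustration

    def handle(self, request):
        if request in self._cache:
            return self._cache[request]
        self.executions += 1
        result = self._execute(request)
        self._cache[request] = result
        return result


dispatcher = CachingDispatcher(lambda text: text.upper())
first = dispatcher.handle("hello")
second = dispatcher.handle("hello")   # served from the cache
```

A production cache would also need an eviction and invalidation policy, which this sketch leaves out.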
  • 14. Pipe and Filter 14 It is a very common EAI (enterprise application integration) pattern.
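One way to picture pipe and filter is with Python generators, where each filter consumes a stream and yields a transformed stream, and chaining the generators plays the role of the pipes. The filter names below are hypothetical and chosen only for this sketch.

```python
def strip_blank(lines):
    """Filter: drop empty lines and trim surrounding whitespace."""
    for line in lines:
        if line.strip():
            yield line.strip()


def to_upper(lines):
    """Filter: normalise each line to upper case."""
    for line in lines:
        yield line.upper()


# Each filter only knows about its input stream, so filters can be
# recombined in a different order without changing their code.
pipeline = to_upper(strip_blank(["  order created ", "", "order paid"]))
output = list(pipeline)
```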
  • 15. Map Reduce 15 The model targets batch jobs where disk I/O is the major bottleneck. It uses a distributed file system so that disk I/O can be done in parallel.
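The programming model behind MapReduce can be sketched in a single process with a word count. This only illustrates the map, shuffle, and reduce phases; it does not model the distributed file system or parallel disk I/O mentioned on the slide, and the function names are hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["scale out not up", "scale out"])
```

In a real cluster the map calls run on the machines that hold each input block, which is where the parallel disk I/O comes from.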
  • 18. Architecture Portfolios Facebook Architecture -Facebook Web Site -Chat Service @ Facebook -Big Data @ Facebook
  • 20. •Users –More than 400 million active users. –50% of active users log in each day. –Average user has 130 friends on the site. •Activity –User spends an average of 55 mins per day. –More than 60 million status updates each day. –More than 100 million photos uploaded each day. •Platform –Currently 500K active applications. –About 250 apps have more than 1 million users. –About 60 million users use FB Connect from external web sites each month. Facebook Web Site Statistic 20
  • 21. Challenges •High Concurrency •High Data Volumes •Multilevel Hierarchical Data OK to Live with •Not Mission Critical •Cached data is fine •Write Failures are tolerable Facebook Web Site Technical Challenges 21
  • 22. •General Design Principles –Use open source where possible –Unix philosophy •Keep individual components simple yet performant •Combine as necessary •Concentrate on clean interface points –Build everything for scale –Try to minimize failure points –Simplicity, Simplicity, Simplicity. Facebook Web Site Architecture 22
  • 23. Facebook Web Site Architecture 23 Layers: Load Balancer, Presentation Layer, Services, Cache, Database, File System. Presentation Layer: web servers running PHP, which basically assemble data from the lower layers and present it on the page. Services: backend services are mainly implemented in C++ (other languages can be used); very fast. Cache: Memcached is used to cache almost everything that is needed to produce a page. Database: an array of MySQL servers is used in an interesting way to store the data. File System: Haystack, an internally developed file system, is used to store photos.
  • 26. •Presentation Layer: PHP •Issues: –High CPU and memory consumption. –Interoperability with C++ is challenging. –The language doesn’t encourage good programming in the large. –Initialization cost of each page scales with the size of the code base. Facebook Web Site Web Tier at Facebook 26 [Chart: Programming language performance ranking. Relative performance of language runtime (lower is better) for Ruby, PHP, Perl, Python, Erlang, C#, Java, and C++; plotted values 70, 40, 38, 21, 6, 3, 2, 1.]
  • 27. •Optimizing PHP –Op-code optimization –APC improvements •Lazy loading •Cache priming –Custom extensions •Memcache client extension •Serialization format •Logging, Stats Collection, Monitoring •Asynchronous event-handling mechanism Facebook Web Site Web Tier at Facebook 27
  • 28. •HipHop –Source Code Transformer •Transforms PHP into highly optimized C++ and then compiles it using g++ –50% reduction in CPU usage compared with Apache + PHP –Facebook’s API tier can serve twice the traffic using 30% less CPU –It embeds a simple web server built on top of libevent. Facebook Web Site Web Tier at Facebook 28 https://github.com/facebook/hhvm
  • 29. •Tornado: Facebook's Real-Time Web Framework for Python –Tornado is a relatively simple, non-blocking Web server framework written in Python –It is designed to handle thousands of simultaneous connections, making it ideal for real-time Web services. Facebook Web Site Web Tier at Facebook 29 http://www.tornadoweb.org/
  • 30. •BigPipe: first breaks web pages into multiple chunks called pagelets Facebook Web Site Web Tier at Facebook 30
  • 31. •Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering. •Alleviates database load •Over 25TB of in-memory caching on more than 800 servers •Multi-gets are used to make the system more efficient •Facebook contributed UDP support and performance enhancements to Memcached. Facebook Web Site Memcached 31 http://memcached.org/
  • 32. •MySQL –Fast and reliable –Thousands of MySQL servers •Users randomly distributed across these servers –Relational aspect of the DB is not used •No joins: logically difficult because data is distributed randomly •Primarily a key-value store –Customizations •Custom partitioning scheme –Global ID assigned to all data •Custom archiving scheme –Based on frequency and recency of data on a per-user basis Facebook Web Site MySQL Database 32
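To make the idea of a partitioned key-value store concrete, here is a small sketch in which each user's data lives on exactly one shard. The hash-modulo placement, the class `ShardedStore`, and the in-memory dictionaries are all assumptions for illustration; the slides do not describe Facebook's actual partitioning scheme beyond what is listed above.

```python
import hashlib


class ShardedStore:
    """Key-value store split across several shards, keyed by user ID."""

    def __init__(self, shard_count):
        # Each dict stands in for one database server.
        self._shards = [{} for _ in range(shard_count)]

    def shard_for(self, user_id):
        # Assumed placement rule: a stable hash of the user ID, modulo
        # the number of shards. A directory lookup would work as well.
        digest = hashlib.md5(str(user_id).encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % len(self._shards)

    def put(self, user_id, key, value):
        self._shards[self.shard_for(user_id)][(user_id, key)] = value

    def get(self, user_id, key):
        return self._shards[self.shard_for(user_id)].get((user_id, key))


store = ShardedStore(shard_count=4)
store.put(42, "name", "Alice")
```

Since all of a user's rows sit on one shard and different users land on different shards, cross-user joins would have to span servers, which is consistent with the slide's note that joins are not used.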
  • 33. Facebook Web Site Memcached & MySQL at Facebook 33
  • 34. •Many services written in C++, Python, Java –AdServer –Search –Network Selector –News Feed –Blogfeeds –CSSParser –Mobile –ShareScraper Facebook Web Site Services 34
  • 35. •Services Philosophy –Create a service if required •Real overhead for deployment, maintenance, separate code-base •Another failure point –Create a common framework and toolset that allow for easier creation of services •Thrift •Scribe •ODS, Alerting, Monitoring service –Use the right language Facebook Web Site Services 35
  • 36. •How do these services communicate? •Thrift –http://incubator.apache.org/thrift/ –Lightweight software framework for cross-language development –Provides an IDL and statically generates code –Supported bindings: C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, Smalltalk, and OCaml Facebook Web Site Services 36
  • 37. Architecture Portfolios Facebook Architecture -Facebook Web Site -Chat Service @ Facebook -Big Data @ Facebook
  • 38. •Statistic –Facebook has 200M active users –800+ million user messages / day –7+ million active channels at peak –1GB+ in / sec at peak –100+ channel machines •System challenges –How does synchronous messaging work on the Web? •"Presence" is hard to scale –Need a system to queue and deliver messages –Millions of connections, mostly idle –Need logging, at least between page loads –Make it work in Facebook’s environment Chat Service @ Facebook Statistic & Challenges 38
  • 39. Chat Service @ Facebook System Overview 39
  • 40. Chat Service @ Facebook System Overview •User Interface –Mix of client-side JavaScript and server-side PHP –Works around transport errors, browser differences –Regular AJAX for sending messages, fetching conversation history –Periodic AJAX polling for list of online friends –AJAX long-polling for messages (Comet) •Back End –Discrete responsibilities for each service •Communicate via Thrift –Channel (Erlang): message queuing and delivery •Queue messages in each user’s “channel” •Deliver messages as responses to long-polling HTTP requests –Presence (C++): aggregates online info in memory (pull-based presence) –Chatlogger (C++): stores conversations between page loads –Web tier (PHP): serves our vanilla web requests 40
  • 41. Chat Service @ Facebook System Overview 41
  • 42. Chat Service @ Facebook Message Send 42
  • 43. Chat Service @ Facebook Channel Server 43
  • 44. Chat Service @ Facebook Channel Server 44 •One channel per user •Web tier delivers messages for that user •Channel State: short queue of sequenced messages •Long poll for streaming (Comet) –Clients make an HTTP request –Server replies when a message is ready –One active request per browser tab
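The channel behaviour on the slide above (a short queue of sequenced messages, with the server replying to a pending request once a message is ready) can be sketched in-process with a condition variable. The class `Channel`, its method names, and the `max_queue` default are hypothetical; the real channel servers are written in Erlang and speak HTTP, which this sketch does not attempt to reproduce.

```python
import threading


class Channel:
    """One user's channel: a short queue of sequenced messages."""

    def __init__(self, max_queue=10):
        self._messages = []            # list of (sequence number, text)
        self._next_seq = 1
        self._max_queue = max_queue    # assumed bound; must be at least 1
        self._cond = threading.Condition()

    def publish(self, text):
        with self._cond:
            self._messages.append((self._next_seq, text))
            self._next_seq += 1
            # Keep the queue short by dropping the oldest entries.
            del self._messages[:-self._max_queue]
            self._cond.notify_all()

    def poll(self, after_seq, timeout):
        """Long poll: block until a newer message exists or the timeout ends."""
        def pending():
            return [m for m in self._messages if m[0] > after_seq]

        with self._cond:
            self._cond.wait_for(pending, timeout=timeout)
            return pending()


channel = Channel()
# Simulate a message arriving shortly after the client starts waiting.
threading.Timer(0.05, channel.publish, args=("hi",)).start()
received = channel.poll(after_seq=0, timeout=2.0)
```

The client passes the last sequence number it saw, so a reconnecting browser tab can pick up messages that were queued while no request was open.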
  • 45. Architecture Portfolios Facebook Architecture -Facebook Web Site -Chat Service @ Facebook -Big Data @ Facebook
  • 46. Big Data @ Facebook Overview 46
  • 47. Big Data @ Facebook Data Infrastructure Overview 47 Data Infrastructure @ FB built on open source technologies: •Many committers across all the projects •Plans to open source other parts of the data stack •Figure out a model to stay in sync and still work at FB speed
  • 48. Big Data @ Facebook Life of a tag for data infrastructure 48
  • 49. Big Data @ Facebook Life of a tag for data infrastructure 49 •Technology: –Log collection: Scribe –Realtime analytics: Puma –Batch analytics: Hive –Ad hoc analytics: Peregrine –Periodic analytics: Nocron
  • 50. Big Data @ Facebook Warehouse Architecture 50 4TB of compressed new data added per day. 135TB of compressed data scanned per day. 7500+ Hive jobs on the production cluster per day. 80K compute hours per day.
  • 51. •Scribe –http://developers.facebook.com/scribe/ –Scalable distributed logging framework –Useful for logging a wide array of data –Simple data model –Built on top of Thrift •Hive –A system for managing and querying structured data, built on top of Hadoop. •Map Reduce for execution •HDFS for storage •Metadata in an RDBMS –Key Building Principles: •SQL as a familiar data warehousing tool. •Extensibility –Type, Functions, Formats, Scripts. •Scalability and Performance •Interoperability. Big Data @ Facebook Warehouse Architecture 51
  • 52. Big Data @ Facebook Warehouse Architecture 52 •Hive Architecture
  • 53. Big Data @ Facebook Warehouse Architecture 53 •Data Flow Architecture
  • 54. •Memcache at FB - https://www.facebook.com/video/video.php?v=631826881803 •http://www.gargasz.info/facebook-discovering-software-architecture/ •http://www.infoq.com/presentations/Facebook-Software-Stack •http://www.infoq.com/presentations/Scale-at-Facebook •http://www.slideshare.net/AditiTechnologies/facebook-architecture-breaking-it-open •http://stackoverflow.com/questions/3533948/facebook-architecture •http://www.quora.com/Facebook-Engineering/What-is-Facebooks-architecture •http://www.slideshare.net/AditiTechnologies/google-architecture-breaking-it-open Open Discussion !!! 54