SCALING YOUR WEBSITE
Alejandro Marcu
Dutch PHP Conference 2016

Alejandro Marcu
• Started programming Logo at 8 years old
• Then moved to Basic, Turbo Pascal, C++, Java
• 2001 – 2004: Various programming jobs in Argentina
• 2004 – 2008: TopCoder
• 2009 – 2015: Facebook

What You Will Learn Today
• Scalable architecture
• Scaling the database
• Caching
• Introducing new features

Scalable architecture

Single Server
• Hosted or in the cloud
• Web App: Apache/Nginx + PHP
• DB: MySQL, MongoDB, etc.
• Cache: Memcache, Redis
[Diagram: User → Server running the Web App, DB, and Cache]

Scaling Vertically
• More RAM
• More cores or faster CPU
• SSD
• RAID
• Network Interfaces

Functional Partitioning
• Servers can have different hardware specs
• More latency
• Limited growth
[Diagram: User → Data Center, with the Web App, DB, and Cache each on its own server (Server 1, Server 2, Server 3)]

Splitting the Web App
• Web Front End should be a thin presentation layer
• Services
  • Just another class
  • Remote over SOAP, REST, Thrift
• Start simple, plan for scale
[Diagram: Web Front End, iOS App, and Android App call the Back End (Service 1, Service 2, … Service n), which uses the DB and Cache]

Functional Partitioning
• Back end servers can have one or more services
• Some services can be on more than one server
[Diagram: User → Data Center: Web Front End, DB, and Cache on Servers 1 to 3; Back End with Service 1 … Service n on Server 4 … Server k]

Stateless Services
• Don’t store anything locally
• Use external storage (e.g. databases)
• Can use local caching

Stateless Front End
• HTTP Session
  • Cookies
  • External Data Store
• Uploaded Files
  • DFS: GFS, HDFS, GlusterFS
  • Amazon S3

Multiple Front End Servers
Load Balancer:
• Cloud based (Amazon ELB)
• Software (NGINX, HAProxy)
• Hardware (BIG-IP, NetScaler)
[Diagram: User → Load Balancer → Front End (Web FE 1 … Web FE k) → Back End (Service 1 … Service n) → DB and Cache, all inside the Data Center]

Caching static files
• Files that are the same on each request, e.g. jpg, png, css, js, mp3, etc.
• Reverse HTTP Proxy
  • Load balancers usually provide this functionality
• CDN (Content Delivery Network)
  • E.g. Akamai, Amazon CloudFront
  • Pay for usage
  • Multiple locations
[Diagram: the User gets static content from the CDN and dynamic content from the Data Center]

Multiple Data Centers
• Advantages
  • Lower latency for users
  • Reduced disaster risk
  • Economic opportunities
• Challenges
  • Consistency
  • Latency between data centers
  • Bandwidth between data centers

Scaling databases

Scaling relational databases
• Too much data
• Too many reads
• Too many writes
• Want higher availability
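
Replication
• Usually many more reads than writes
• Higher availability
• Read after write can be wrong (a slave may not have applied the write yet)
[Diagram: DB clients read and write (R/W) on the Master and read (R) from the Slaves; the Master feeds the Slaves through binlogs]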
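
A minimal sketch of how a client could route queries in this setup, assuming PHP 8 and a hypothetical `ConnectionRouter` class (not part of any library). Writes go to the master, reads go to a slave, and for a short window after a write the same client keeps reading from the master so that it does not see stale data. The 2-second window is an arbitrary assumption, not a recommendation from the talk.

```php
<?php
// Hypothetical helper, for illustration only: decides which server a query should use.
final class ConnectionRouter
{
    // Time of this client's last write; -INF means "never wrote".
    private float $lastWriteAt = -INF;

    /**
     * @param string   $master        Name of the master (read/write)
     * @param string[] $slaves        Names of the slaves (read only)
     * @param float    $stickySeconds How long reads stay on the master after a write
     */
    public function __construct(
        private string $master,
        private array $slaves,
        private float $stickySeconds = 2.0
    ) {
    }

    /** Writes always go to the master. $now can be injected to make testing easy. */
    public function forWrite(?float $now = null): string
    {
        $this->lastWriteAt = $now ?? microtime(true);
        return $this->master;
    }

    /** Reads go to a random slave, unless this client wrote recently. */
    public function forRead(?float $now = null): string
    {
        $now ??= microtime(true);
        $recentlyWrote = ($now - $this->lastWriteAt) < $this->stickySeconds;
        if ($recentlyWrote || $this->slaves === []) {
            return $this->master;
        }
        return $this->slaves[array_rand($this->slaves)];
    }
}

$router = new ConnectionRouter('master', ['slave1', 'slave2']);
echo $router->forRead(100.0), "\n";  // one of the slaves
echo $router->forWrite(100.0), "\n"; // master
echo $router->forRead(100.5), "\n";  // master (still inside the sticky window)
```

Reading from the master after a write trades some read capacity for consistency; whether that is worth it depends on how much staleness the feature can tolerate.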

Functional Partitioning
• Limited growth
• Can separate unrelated functionality
[Diagram: the User, Post, and Payment tables split between DB 1 and DB 2]

Sharding
• Tables are split into multiple DBs
• Sharding key used to decide which DB, e.g. id
• Sharding function, e.g. db(id) = ((id - 1) % 2) + 1
• Searching becomes more complicated

DB 1          DB 2
id  name      id  name
1   John      2   Louise
3   Jack      4   Bob
5   Anne      6   Marie

Sharding
• E.g., add an extra DB
• New sharding function: db(id) = ((id - 1) % 3) + 1
• Conclusion: modulo is not a good sharding function

Before (2 DBs):
DB 1          DB 2
id  name      id  name
1   John      2   Louise
3   Jack      4   Bob
5   Anne      6   Marie

After (3 DBs; Jack, Bob, Anne, and Marie all change DB):
DB 1          DB 2          DB 3
id  name      id  name      id  name
1   John      2   Louise    3   Jack
4   Bob       5   Anne      6   Marie
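
The reallocation can be checked with a few lines of PHP. This is only a sketch: the function names are made up for illustration, and it uses ((id - 1) % n) + 1 because that is the form that places id 1 in DB 1, as in the tables of this slide.

```php
<?php
// Modulo sharding: maps an id to a DB number in 1..$dbCount.
// (id - 1) is used so that id 1 lands in DB 1, matching the slide's tables.
function db_for_id(int $id, int $dbCount): int
{
    return (($id - 1) % $dbCount) + 1;
}

// Returns the ids whose DB changes when the number of DBs changes.
function moved_ids(array $ids, int $oldCount, int $newCount): array
{
    return array_values(array_filter(
        $ids,
        fn (int $id): bool => db_for_id($id, $oldCount) !== db_for_id($id, $newCount)
    ));
}

$ids = [1, 2, 3, 4, 5, 6];
foreach ($ids as $id) {
    printf("id %d: DB %d -> DB %d\n", $id, db_for_id($id, 2), db_for_id($id, 3));
}
echo 'Moved: ', implode(', ', moved_ids($ids, 2, 3)), "\n"; // Moved: 3, 4, 5, 6
```

Four of the six rows have to be copied to another DB just to add one server, which is the reason behind the slide's conclusion.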

Consistent Sharding
• Consistent sharding needs fewer reallocations

Before (2 DBs):
DB 1          DB 2
id  name      id  name
1   John      2   Louise
3   Jack      4   Bob
5   Anne      6   Marie

After (3 DBs; only Anne and Marie change DB):
DB 1          DB 2          DB 3
id  name      id  name      id  name
1   John      2   Louise    5   Anne
3   Jack      4   Bob       6   Marie
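
One common way to get this behavior is a consistent-hash ring. The sketch below is an assumption about how it could be done, not the method used in the talk: the `HashRing` class, the use of `crc32`, and the 64 points per DB are all illustrative choices (PHP 8 syntax).

```php
<?php
// Illustrative consistent-hash ring; class and method names are hypothetical.
final class HashRing
{
    /** @var array<int, string> ring position => DB name, kept sorted by position */
    private array $ring = [];

    public function __construct(private int $pointsPerDb = 64)
    {
    }

    public function addDb(string $db): void
    {
        // Each DB is placed at several pseudo-random positions ("virtual nodes").
        for ($i = 0; $i < $this->pointsPerDb; $i++) {
            $this->ring[crc32("$db#$i")] = $db;
        }
        ksort($this->ring);
    }

    public function dbFor(int|string $key): string
    {
        if ($this->ring === []) {
            throw new LogicException('No DBs on the ring');
        }
        $position = crc32((string) $key);
        // Walk clockwise: the first DB position at or after the key's position owns it.
        // (A linear scan keeps the sketch short; a binary search would be faster.)
        foreach ($this->ring as $point => $db) {
            if ($point >= $position) {
                return $db;
            }
        }
        // Past the last position: wrap around to the first one.
        return $this->ring[array_key_first($this->ring)];
    }
}

$ring = new HashRing();
$ring->addDb('DB 1');
$ring->addDb('DB 2');

$ids = range(1, 1000);
$before = array_map(fn (int $id): string => $ring->dbFor($id), $ids);

$ring->addDb('DB 3');
$moved = 0;
foreach ($ids as $i => $id) {
    if ($ring->dbFor($id) !== $before[$i]) {
        $moved++;
    }
}
printf("%d of %d ids moved\n", $moved, count($ids));
```

With this scheme an id only ever moves to the newly added DB, never between the existing ones; the exact share that moves depends on the hash values.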

Sharding
• Create many logical DBs
• Distribute them across servers

Server 1: DB 1, DB 2, …, DB 16
Server 2: DB 17, DB 18, …, DB 32

Sharding
• Re-distribute DBs when needed
• Need a function to map DB to server; it can be a configuration

Server 1: DB 1, DB 2, …, DB 16
Server 2: DB 17, DB 18, …, DB 24
Server 3: DB 25, DB 26, …, DB 32
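
A sketch of the two-level lookup, under stated assumptions: the number of logical DBs (32, as on the slide) is fixed up front, so a plain modulo is safe for the id-to-logical-DB step, and only the DB-to-server configuration changes when a server is added. The function names and the range-based configuration format are hypothetical.

```php
<?php
// The number of logical DBs never changes, so ids never move between logical DBs.
const LOGICAL_DB_COUNT = 32;

function logical_db_for_id(int $id): int
{
    return (($id - 1) % LOGICAL_DB_COUNT) + 1;
}

/**
 * Looks up the server that currently hosts a logical DB.
 *
 * @param array<string, array{int, int}> $serverRanges server => [first DB, last DB]
 */
function server_for_db(int $db, array $serverRanges): string
{
    foreach ($serverRanges as $server => [$first, $last]) {
        if ($db >= $first && $db <= $last) {
            return $server;
        }
    }
    throw new OutOfRangeException("No server configured for DB $db");
}

// Hypothetical configurations matching the two-server and three-server layouts.
$twoServers   = ['Server 1' => [1, 16], 'Server 2' => [17, 32]];
$threeServers = ['Server 1' => [1, 16], 'Server 2' => [17, 24], 'Server 3' => [25, 32]];

$db = logical_db_for_id(1050); // (1049 % 32) + 1 = 26
echo "DB $db: ", server_for_db($db, $twoServers), ' -> ', server_for_db($db, $threeServers), "\n";
// DB 26: Server 2 -> Server 3
```

Moving a whole logical DB to a new server is a bulk copy plus a configuration change, with no per-row rehashing.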

Sharding colocation
• Put owned data in the same DB as its owner (e.g. shard by user_id in post table)
• Can execute joins

DB 1:
user            post
id  name        id   user_id  text
1   John        100  1        …
3   Jack        125  1        …
5   Anne        180  3        …

DB 2:
user            post
id  name        id   user_id  text
2   Louise      143  2        …
4   Bob         110  6        …
6   Marie       175  6        …

Sharding fan-out
• Many-to-many relationships are spread out
• To get friends’ names:
  • Get ids
  • Group by DB
  • Query on each DB
• Gets worse with more DBs
• Caching helps a lot
• Needs inverse entries

DB 1:
user            friend
id  name        id1  id2
1   John        1    2
3   Jack        1    4
5   Anne        3    4

DB 2:
user            friend
id  name        id1  id2
2   Louise      2    1
4   Bob         4    1
6   Marie       4    3
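
The "get ids, group by DB, query on each DB" steps can be sketched as follows. Plain arrays stand in for the two databases, the function names are made up, and the list of friend ids [2, 4, 3] is a hypothetical input chosen so that both DBs are hit.

```php
<?php
// Same illustrative sharding function as in the tables: ids 1, 3, 5 -> DB 1; ids 2, 4, 6 -> DB 2.
function db_for_id(int $id, int $dbCount = 2): int
{
    return (($id - 1) % $dbCount) + 1;
}

/** Groups ids by the DB that holds them, so that each DB is queried only once. */
function group_ids_by_db(array $ids, int $dbCount = 2): array
{
    $groups = [];
    foreach ($ids as $id) {
        $groups[db_for_id($id, $dbCount)][] = $id;
    }
    ksort($groups);
    return $groups;
}

// Stand-in for the real databases: DB number => (user id => name), taken from the slide.
$dbs = [
    1 => [1 => 'John', 3 => 'Jack', 5 => 'Anne'],
    2 => [2 => 'Louise', 4 => 'Bob', 6 => 'Marie'],
];

function friend_names(array $friendIds, array $dbs): array
{
    $names = [];
    foreach (group_ids_by_db($friendIds, count($dbs)) as $db => $ids) {
        // In a real system this loop body would be one "WHERE id IN (...)" query per DB.
        foreach ($ids as $id) {
            $names[$id] = $dbs[$db][$id];
        }
    }
    return $names;
}

print_r(group_ids_by_db([2, 4, 3])); // DB 1 => [3], DB 2 => [2, 4]
print_r(friend_names([2, 4, 3], $dbs)); // 3 => Jack, 2 => Louise, 4 => Bob
```

The number of queries grows with the number of DBs touched, which is why the fan-out gets worse as more DBs are added.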

Database scaling
• Replication
  • Scales reads, higher availability
• Functional partitioning
  • Limited scalability
  • Helps across the board
• Sharding
  • Scales reads, writes, and data volume, and helps with availability
• These 3 techniques can be combined

Caching

Caching application data
• Usually required at large scale
• Key-Value stores
  • Set(key, value[, TTL])
  • Get(key)
  • Delete(key)
• Different levels
  • Client side (e.g. in the browser in JS)
  • In the web server (e.g. APC)
  • Distributed cache (e.g. Redis, Memcached)
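
To make the Set/Get/Delete interface concrete, here is a small in-memory stand-in written for illustration (PHP 8 syntax). `ArrayCache` and `cached()` are hypothetical names; a real deployment would call Redis or Memcached instead, and the read-through helper is one common usage pattern rather than something prescribed by the talk.

```php
<?php
// In-memory stand-in for a key-value cache with optional TTL.
final class ArrayCache
{
    /** @var array<string, array{value: mixed, expiresAt: ?float}> */
    private array $items = [];

    /** $ttl is in seconds; null means "never expires". $now is injectable for testing. */
    public function set(string $key, mixed $value, ?float $ttl = null, ?float $now = null): void
    {
        $now ??= microtime(true);
        $this->items[$key] = [
            'value' => $value,
            'expiresAt' => $ttl === null ? null : $now + $ttl,
        ];
    }

    /** Returns the cached value, or null on a miss or once the TTL has passed. */
    public function get(string $key, ?float $now = null): mixed
    {
        $now ??= microtime(true);
        $item = $this->items[$key] ?? null;
        if ($item === null) {
            return null;
        }
        if ($item['expiresAt'] !== null && $now >= $item['expiresAt']) {
            unset($this->items[$key]);
            return null;
        }
        return $item['value'];
    }

    public function delete(string $key): void
    {
        unset($this->items[$key]);
    }
}

// Read path: try the cache first, fall back to the slow source and store the result.
// Note that a cached null is indistinguishable from a miss in this sketch.
function cached(ArrayCache $cache, string $key, float $ttl, callable $load): mixed
{
    $value = $cache->get($key);
    if ($value === null) {
        $value = $load();
        $cache->set($key, $value, $ttl);
    }
    return $value;
}

$cache = new ArrayCache();
$loads = 0;
$load = function () use (&$loads): string {
    $loads++; // counts how often the "database" is hit
    return 'John X, Bob Y, Anne Z';
};
cached($cache, 'user_friends:100', 60.0, $load);
cached($cache, 'user_friends:100', 60.0, $load);
echo $loads, "\n"; // 1: the second call was served from the cache
```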

Caching in the web server
• E.g. APC (Alternative PHP Cache)
• Very fast
• Duplicated caching between web servers
• Expensive to invalidate
• Use sparingly, mostly for global data

Distributed cache
• Examples:
  • Redis
  • Memcached (+ McRouter or libmemcached)
• One or more cache servers, shared use between clients
• Network latency

Distributed cache
Features to consider:
• Replication
• Partitioning
• Separate pools
• Persistence
• Atomic operations

Cache invalidation
• When the value is no longer valid, usually just delete the key
• Example:
  user_friends:100 => 'John X, Bob Y, Anne Z'
• Need to invalidate when:
  • The user adds or removes friends
  • A friend removes the user as a friend
  • A friend changes their name
• Can you tolerate temporary inconsistencies?
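
A sketch of which keys each of those events would delete. Everything here is illustrative: `$cache` is a plain array standing in for the distributed cache, `$friendsOf` is a hypothetical lookup of user id to friend ids, and the function names are made up.

```php
<?php
function friends_key(int $userId): string
{
    return "user_friends:$userId";
}

/** Keys to delete when a friendship between two users is added or removed (both lists change). */
function keys_for_friendship_change(int $userId, int $friendId): array
{
    return [friends_key($userId), friends_key($friendId)];
}

/** Keys to delete when $userId changes name: every friend's cached list contains that name. */
function keys_for_name_change(int $userId, array $friendsOf): array
{
    return array_map('friends_key', $friendsOf[$userId] ?? []);
}

function invalidate(array &$cache, array $keys): void
{
    foreach ($keys as $key) {
        unset($cache[$key]); // deleting is enough: the next read repopulates the entry
    }
}

// Hypothetical data: user 100 has friends 1, 25, and 37; user 1 has friend 100.
$friendsOf = [100 => [1, 25, 37], 1 => [100]];
$cache = [
    'user_friends:100' => 'John X, Bob Y, Anne Z',
    'user_friends:1'   => 'Some User',
];

// User 1 (John X) changes name, so user 100's cached friend list is now stale.
invalidate($cache, keys_for_name_change(1, $friendsOf));
var_dump(array_key_exists('user_friends:100', $cache)); // bool(false)
var_dump(array_key_exists('user_friends:1', $cache));   // bool(true)
```

A name change fans out to one delete per friend, which is where the question about tolerating temporary inconsistencies (for example, relying on a TTL instead) becomes practical.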

Cache versioning
• What happens if you change the structure of the values? Example:
  (old) user_friends:100 => 'John X, Bob Y, Anne Z'
  (new) user_friends:100 => '1:John X, 25:Bob Y, 37:Anne Z'
• New code breaks with old-style values
• Old code breaks with new-style values
• Solution: use versions:
  (old) user_friends:100:1 => 'John X, Bob Y, Anne Z'
  (new) user_friends:100:2 => '1:John X, 25:Bob Y, 37:Anne Z'
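
A minimal sketch of versioned keys, assuming the version lives in a constant that is bumped in the same deploy that changes the value format. The constant, the function names, and the parser are illustrative.

```php
<?php
// Bump this in the same deploy that changes the value format.
const USER_FRIENDS_VERSION = 2;

function user_friends_key(int $userId, int $version = USER_FRIENDS_VERSION): string
{
    return "user_friends:$userId:$version";
}

/** Parses the version 2 format, e.g. '1:John X, 25:Bob Y' => [1 => 'John X', 25 => 'Bob Y']. */
function parse_friends_v2(string $value): array
{
    $friends = [];
    foreach (explode(', ', $value) as $entry) {
        [$id, $name] = explode(':', $entry, 2);
        $friends[(int) $id] = $name;
    }
    return $friends;
}

// Both generations of code can run side by side because they never share a key.
$cache = [
    'user_friends:100:1' => 'John X, Bob Y, Anne Z',         // written by old code
    'user_friends:100:2' => '1:John X, 25:Bob Y, 37:Anne Z', // written by new code
];

echo user_friends_key(100), "\n"; // user_friends:100:2
print_r(parse_friends_v2($cache[user_friends_key(100)]));
```

Entries under the old version are simply never read again by new code; giving them a TTL lets them age out on their own.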

Introducing new features
Objectives:
• A/B testing
• Quickly revert it if needed
• Protect infrastructure
• Ease of development

Introducing new features
Some possibilities:
1. Development branch
2. Feature toggle
3. Percentage Rollout
4. Advanced Rollout

Development Branch
• New branch for the feature, merge when finished
• Can be fine in the early stages
• No extra setup or complexity
• Long-lived branch, may be hard to merge

Feature Toggle
• Can be changed at run time (console or configuration)
• Should distinguish prod from testing
• Allows for intermediate commits
• Code structure:
    if (feature_enabled('homepage_redesign')) {
        new_homepage();
    } else {
        old_homepage();
    }
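
One possible shape for `feature_enabled()`, shown only as a sketch. The slide's version takes just the feature name; here the flag store and the environment are passed in explicitly to keep the example self-contained, and both the array layout and the homepage functions are assumptions.

```php
<?php
// Hypothetical flag store: in practice it could be loaded from a config file,
// a database, or an admin console so that it can change at run time.
$featureFlags = [
    'prod'    => ['homepage_redesign' => false],
    'testing' => ['homepage_redesign' => true],
];

function feature_enabled(string $feature, array $flags, string $environment): bool
{
    // Unknown features and unknown environments are treated as off.
    return $flags[$environment][$feature] ?? false;
}

function new_homepage(): string
{
    return 'new homepage';
}

function old_homepage(): string
{
    return 'old homepage';
}

function render_homepage(array $flags, string $environment): string
{
    return feature_enabled('homepage_redesign', $flags, $environment)
        ? new_homepage()
        : old_homepage();
}

echo render_homepage($featureFlags, 'testing'), "\n"; // new homepage
echo render_homepage($featureFlags, 'prod'), "\n";    // old homepage
```

Defaulting to off means half-finished code can be committed behind the toggle without reaching production users.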

Percentage Rollout
• Dynamically control the percentage of users for a feature
• When increasing the percentage, should include previous users
• Code structure:
    if (feature_enabled('homepage_redesign', $user_id)) {
        new_homepage();
    } else {
        old_homepage();
    }
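
A common way to satisfy "should include previous users" is to hash each user into a stable bucket from 0 to 99 and enable the feature for buckets below the current percentage. The sketch below assumes that approach; the bucket function, the use of `md5`, and the percentages array are illustrative choices, not the talk's implementation.

```php
<?php
// Maps a (feature, user) pair to a stable bucket in 0..99.
// The feature name is part of the hash so that different features pick different users.
function rollout_bucket(string $feature, int $userId): int
{
    // 7 hex digits = 28 bits, which fits in an int on any platform.
    return hexdec(substr(md5("$feature:$userId"), 0, 7)) % 100;
}

/** $percentages maps feature name => percentage of users (0..100) that get the feature. */
function feature_enabled(string $feature, int $userId, array $percentages): bool
{
    $percentage = $percentages[$feature] ?? 0;
    return rollout_bucket($feature, $userId) < $percentage;
}

$users = range(1, 1000);
$at10 = array_filter($users, fn (int $id): bool => feature_enabled('homepage_redesign', $id, ['homepage_redesign' => 10]));
$at50 = array_filter($users, fn (int $id): bool => feature_enabled('homepage_redesign', $id, ['homepage_redesign' => 50]));

printf("10%%: %d users, 50%%: %d users\n", count($at10), count($at50));
var_dump(array_diff($at10, $at50) === []); // bool(true): nobody is dropped when the percentage grows
```

Because a user's bucket never changes, raising the percentage only adds users, and lowering it removes the most recently added ones first.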

Advanced Rollout
Turn on/off features for a percentage of users that:
• Are employees
• Are in another rollout group
• Use a certain language
• Are in a certain country
• Are individually whitelisted or blacklisted
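
These rules could be expressed as data and evaluated per user. The following is a sketch under assumptions: the rule format, the attribute names ('employee', 'country'), and the function names are all hypothetical, and membership in another rollout group is left out for brevity.

```php
<?php
// Stable bucket in 0..99 for a (feature, user) pair.
function rollout_bucket(string $feature, int $userId): int
{
    return hexdec(substr(md5("$feature:$userId"), 0, 7)) % 100;
}

function feature_enabled_for(array $rule, string $feature, array $user): bool
{
    // Individual overrides win over everything else.
    if (in_array($user['id'], $rule['blacklist'] ?? [], true)) {
        return false;
    }
    if (in_array($user['id'], $rule['whitelist'] ?? [], true)) {
        return true;
    }
    // Every listed condition must match the user's attributes.
    foreach ($rule['conditions'] ?? [] as $attribute => $expected) {
        if (($user[$attribute] ?? null) !== $expected) {
            return false;
        }
    }
    // Among matching users, enable the feature for the configured percentage.
    return rollout_bucket($feature, $user['id']) < ($rule['percentage'] ?? 0);
}

// Hypothetical rule: all employees in the Netherlands, plus user 42, minus user 7.
$rule = [
    'conditions' => ['employee' => true, 'country' => 'NL'],
    'percentage' => 100,
    'whitelist'  => [42],
    'blacklist'  => [7],
];

$employee = ['id' => 1, 'employee' => true, 'country' => 'NL'];
$visitor  = ['id' => 2, 'employee' => false, 'country' => 'NL'];
$blocked  = ['id' => 7, 'employee' => true, 'country' => 'NL'];
$invited  = ['id' => 42, 'employee' => false, 'country' => 'AR'];

foreach ([$employee, $visitor, $blocked, $invited] as $user) {
    printf("user %d: %s\n", $user['id'], feature_enabled_for($rule, 'homepage_redesign', $user) ? 'on' : 'off');
}
// user 1: on, user 2: off, user 7: off, user 42: on
```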

Introducing new features
• Some frameworks to check out:
  • Swivel
  • Opensoft/rollout
  • LaunchDarkly
• Don’t forget to clean up the old code paths

Contact Information
amarcu@gmail.com
/alejandro.marcu
/alejandromarcu
@AlejandroMarcu
/in/alejandromarcu

More Related Content

What's hot

Hadoop training institute in bangalore
Hadoop training institute in bangaloreHadoop training institute in bangalore
Hadoop training institute in bangaloreKelly Technologies
 
Domain name system presentation
Domain name system presentationDomain name system presentation
Domain name system presentationAnchit Dhingra
 
Windows Server 2008 R2 Active Directory ADFS Claims Base Identity for Windows...
Windows Server 2008 R2 Active Directory ADFS Claims Base Identity for Windows...Windows Server 2008 R2 Active Directory ADFS Claims Base Identity for Windows...
Windows Server 2008 R2 Active Directory ADFS Claims Base Identity for Windows...Tũi Wichets
 
OpenLDAP - Installation and Configuration
OpenLDAP - Installation and ConfigurationOpenLDAP - Installation and Configuration
OpenLDAP - Installation and ConfigurationWildan Maulana
 
Dhcp, dns and proxy server (1)
Dhcp, dns and proxy server (1)Dhcp, dns and proxy server (1)
Dhcp, dns and proxy server (1)Sahira Khan
 
DSpace: Technical Basics
DSpace: Technical BasicsDSpace: Technical Basics
DSpace: Technical BasicsIryna Kuchma
 
Dns Hardening Linux Os
Dns Hardening   Linux OsDns Hardening   Linux Os
Dns Hardening Linux Osecarrow
 
DSpace Tutorial : Open Source Digital Library
DSpace Tutorial : Open Source Digital LibraryDSpace Tutorial : Open Source Digital Library
DSpace Tutorial : Open Source Digital Libraryrajivkumarmca
 
Hadoop HDFS Architeture and Design
Hadoop HDFS Architeture and DesignHadoop HDFS Architeture and Design
Hadoop HDFS Architeture and Designsudhakara st
 
Dspace configuration on XMLUI DSpace
Dspace configuration on XMLUI DSpaceDspace configuration on XMLUI DSpace
Dspace configuration on XMLUI DSpaceBharat Chaudhari
 
Active directory interview_questions
Active directory interview_questionsActive directory interview_questions
Active directory interview_questionssubhashmr
 
LibLive CD/DVD - Dspace Manual
LibLive CD/DVD - Dspace ManualLibLive CD/DVD - Dspace Manual
LibLive CD/DVD - Dspace ManualVaibhav Gaikwad
 

What's hot (20)

Domain name system
Domain name systemDomain name system
Domain name system
 
The History of DNS
The History of DNSThe History of DNS
The History of DNS
 
Hadoop training institute in bangalore
Hadoop training institute in bangaloreHadoop training institute in bangalore
Hadoop training institute in bangalore
 
Dns server
Dns serverDns server
Dns server
 
Domain name system presentation
Domain name system presentationDomain name system presentation
Domain name system presentation
 
Windows Server 2008 R2 Active Directory ADFS Claims Base Identity for Windows...
Windows Server 2008 R2 Active Directory ADFS Claims Base Identity for Windows...Windows Server 2008 R2 Active Directory ADFS Claims Base Identity for Windows...
Windows Server 2008 R2 Active Directory ADFS Claims Base Identity for Windows...
 
OpenLDAP - Installation and Configuration
OpenLDAP - Installation and ConfigurationOpenLDAP - Installation and Configuration
OpenLDAP - Installation and Configuration
 
Dhcp, dns and proxy server (1)
Dhcp, dns and proxy server (1)Dhcp, dns and proxy server (1)
Dhcp, dns and proxy server (1)
 
Domain Name System (DNS)
Domain Name System (DNS)Domain Name System (DNS)
Domain Name System (DNS)
 
DSpace: Technical Basics
DSpace: Technical BasicsDSpace: Technical Basics
DSpace: Technical Basics
 
Dns Hardening Linux Os
Dns Hardening   Linux OsDns Hardening   Linux Os
Dns Hardening Linux Os
 
DSpace Tutorial : Open Source Digital Library
DSpace Tutorial : Open Source Digital LibraryDSpace Tutorial : Open Source Digital Library
DSpace Tutorial : Open Source Digital Library
 
Hadoop HDFS Architeture and Design
Hadoop HDFS Architeture and DesignHadoop HDFS Architeture and Design
Hadoop HDFS Architeture and Design
 
Dspace configuration on XMLUI DSpace
Dspace configuration on XMLUI DSpaceDspace configuration on XMLUI DSpace
Dspace configuration on XMLUI DSpace
 
Active directory interview_questions
Active directory interview_questionsActive directory interview_questions
Active directory interview_questions
 
D Space Installation
D Space InstallationD Space Installation
D Space Installation
 
LibLive CD/DVD - Dspace Manual
LibLive CD/DVD - Dspace ManualLibLive CD/DVD - Dspace Manual
LibLive CD/DVD - Dspace Manual
 
DNS resolution
DNS resolutionDNS resolution
DNS resolution
 
Introduction to DSpace
Introduction to DSpaceIntroduction to DSpace
Introduction to DSpace
 
Dns2
Dns2Dns2
Dns2
 

Viewers also liked

10 Things You Can Do to Speed Up Your Web App Today
10 Things You Can Do to Speed Up Your Web App Today10 Things You Can Do to Speed Up Your Web App Today
10 Things You Can Do to Speed Up Your Web App TodayChris Love
 
A301 ctu madrid2016-monitoring
A301 ctu madrid2016-monitoringA301 ctu madrid2016-monitoring
A301 ctu madrid2016-monitoringMichael Dawson
 
Gear6 and Scaling Website Performance: Caching Session and Profile Data with...
Gear6 and Scaling Website Performance:  Caching Session and Profile Data with...Gear6 and Scaling Website Performance:  Caching Session and Profile Data with...
Gear6 and Scaling Website Performance: Caching Session and Profile Data with...Gear6
 
Best Practices for Large-Scale Web Sites
Best Practices for Large-Scale Web SitesBest Practices for Large-Scale Web Sites
Best Practices for Large-Scale Web SitesCraig Dickson
 
Zingme practice for building scalable website with PHP
Zingme practice for building scalable website with PHPZingme practice for building scalable website with PHP
Zingme practice for building scalable website with PHPChau Thanh
 
Deep Dive: Scaling Up to Your First 10 Million Users
Deep Dive: Scaling Up to Your First 10 Million UsersDeep Dive: Scaling Up to Your First 10 Million Users
Deep Dive: Scaling Up to Your First 10 Million UsersAmazon Web Services
 
Building a Scalable Architecture for web apps
Building a Scalable Architecture for web appsBuilding a Scalable Architecture for web apps
Building a Scalable Architecture for web appsDirecti Group
 
7 Stages of Scaling Web Applications
7 Stages of Scaling Web Applications7 Stages of Scaling Web Applications
7 Stages of Scaling Web ApplicationsDavid Mitzenmacher
 
Architecture of a Modern Web App
Architecture of a Modern Web AppArchitecture of a Modern Web App
Architecture of a Modern Web Appscothis
 
10 Best Practices of a Best Company to Work For
10 Best Practices of a Best Company to Work For10 Best Practices of a Best Company to Work For
10 Best Practices of a Best Company to Work ForO.C. Tanner
 

Viewers also liked (11)

10 Things You Can Do to Speed Up Your Web App Today
10 Things You Can Do to Speed Up Your Web App Today10 Things You Can Do to Speed Up Your Web App Today
10 Things You Can Do to Speed Up Your Web App Today
 
A301 ctu madrid2016-monitoring
A301 ctu madrid2016-monitoringA301 ctu madrid2016-monitoring
A301 ctu madrid2016-monitoring
 
Gear6 and Scaling Website Performance: Caching Session and Profile Data with...
Gear6 and Scaling Website Performance:  Caching Session and Profile Data with...Gear6 and Scaling Website Performance:  Caching Session and Profile Data with...
Gear6 and Scaling Website Performance: Caching Session and Profile Data with...
 
Best Practices for Large-Scale Web Sites
Best Practices for Large-Scale Web SitesBest Practices for Large-Scale Web Sites
Best Practices for Large-Scale Web Sites
 
Zingme practice for building scalable website with PHP
Zingme practice for building scalable website with PHPZingme practice for building scalable website with PHP
Zingme practice for building scalable website with PHP
 
Deep Dive: Scaling Up to Your First 10 Million Users
Deep Dive: Scaling Up to Your First 10 Million UsersDeep Dive: Scaling Up to Your First 10 Million Users
Deep Dive: Scaling Up to Your First 10 Million Users
 
IoC and Mapper in C#
IoC and Mapper in C#IoC and Mapper in C#
IoC and Mapper in C#
 
Building a Scalable Architecture for web apps
Building a Scalable Architecture for web appsBuilding a Scalable Architecture for web apps
Building a Scalable Architecture for web apps
 
7 Stages of Scaling Web Applications
7 Stages of Scaling Web Applications7 Stages of Scaling Web Applications
7 Stages of Scaling Web Applications
 
Architecture of a Modern Web App
Architecture of a Modern Web AppArchitecture of a Modern Web App
Architecture of a Modern Web App
 
10 Best Practices of a Best Company to Work For
10 Best Practices of a Best Company to Work For10 Best Practices of a Best Company to Work For
10 Best Practices of a Best Company to Work For
 

Similar to Scaling your website

MySQL And Search At Craigslist
MySQL And Search At CraigslistMySQL And Search At Craigslist
MySQL And Search At CraigslistJeremy Zawodny
 
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010Bhupesh Bansal
 
Hadoop and Voldemort @ LinkedIn
Hadoop and Voldemort @ LinkedInHadoop and Voldemort @ LinkedIn
Hadoop and Voldemort @ LinkedInHadoop User Group
 
Clustered Architecture Patterns Delivering Scalability And Availability
Clustered Architecture Patterns Delivering Scalability And AvailabilityClustered Architecture Patterns Delivering Scalability And Availability
Clustered Architecture Patterns Delivering Scalability And AvailabilityConSanFrancisco123
 
Data Grids with Oracle Coherence
Data Grids with Oracle CoherenceData Grids with Oracle Coherence
Data Grids with Oracle CoherenceBen Stopford
 
MongoDB - A next-generation database that lets you create applications never ...
MongoDB - A next-generation database that lets you create applications never ...MongoDB - A next-generation database that lets you create applications never ...
MongoDB - A next-generation database that lets you create applications never ...Ram Murat Sharma
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeDatabricks
 
How to Manage Scale-Out Environments with MariaDB MaxScale
How to Manage Scale-Out Environments with MariaDB MaxScaleHow to Manage Scale-Out Environments with MariaDB MaxScale
How to Manage Scale-Out Environments with MariaDB MaxScaleMariaDB plc
 
Sql Health in a SharePoint environment
Sql Health in a SharePoint environmentSql Health in a SharePoint environment
Sql Health in a SharePoint environmentEnrique Lima
 
Hadoop training in bangalore-kellytechnologies
Hadoop training in bangalore-kellytechnologiesHadoop training in bangalore-kellytechnologies
Hadoop training in bangalore-kellytechnologiesappaji intelhunt
 
Bp106 Worst Practices Final
Bp106   Worst Practices FinalBp106   Worst Practices Final
Bp106 Worst Practices FinalBill Buchan
 
Exchange 2010 High Availability And Storage
Exchange 2010 High Availability And StorageExchange 2010 High Availability And Storage
Exchange 2010 High Availability And StorageHarold Wong
 
Microsoft Exchange Server 2010
Microsoft Exchange Server 2010Microsoft Exchange Server 2010
Microsoft Exchange Server 2010HCL TECHNOLOGIES
 
Building a high-performance data lake analytics engine at Alibaba Cloud with ...
Building a high-performance data lake analytics engine at Alibaba Cloud with ...Building a high-performance data lake analytics engine at Alibaba Cloud with ...
Building a high-performance data lake analytics engine at Alibaba Cloud with ...Alluxio, Inc.
 
Spark to DocumentDB connector
Spark to DocumentDB connectorSpark to DocumentDB connector
Spark to DocumentDB connectorDenny Lee
 
Understanding and building big data Architectures - NoSQL
Understanding and building big data Architectures - NoSQLUnderstanding and building big data Architectures - NoSQL
Understanding and building big data Architectures - NoSQLHyderabad Scalability Meetup
 
Building Scalable and Extendable Data Pipeline for Call of Duty Games: Lesson...
Building Scalable and Extendable Data Pipeline for Call of Duty Games: Lesson...Building Scalable and Extendable Data Pipeline for Call of Duty Games: Lesson...
Building Scalable and Extendable Data Pipeline for Call of Duty Games: Lesson...Yaroslav Tkachenko
 

Similar to Scaling your website (20)

MySQL And Search At Craigslist
MySQL And Search At CraigslistMySQL And Search At Craigslist
MySQL And Search At Craigslist
 
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
 
Hadoop and Voldemort @ LinkedIn
Hadoop and Voldemort @ LinkedInHadoop and Voldemort @ LinkedIn
Hadoop and Voldemort @ LinkedIn
 
Clustered Architecture Patterns Delivering Scalability And Availability
Clustered Architecture Patterns Delivering Scalability And AvailabilityClustered Architecture Patterns Delivering Scalability And Availability
Clustered Architecture Patterns Delivering Scalability And Availability
 
Data Grids with Oracle Coherence
Data Grids with Oracle CoherenceData Grids with Oracle Coherence
Data Grids with Oracle Coherence
 
MongoDB - A next-generation database that lets you create applications never ...
MongoDB - A next-generation database that lets you create applications never ...MongoDB - A next-generation database that lets you create applications never ...
MongoDB - A next-generation database that lets you create applications never ...
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 
How to Manage Scale-Out Environments with MariaDB MaxScale
How to Manage Scale-Out Environments with MariaDB MaxScaleHow to Manage Scale-Out Environments with MariaDB MaxScale
How to Manage Scale-Out Environments with MariaDB MaxScale
 
Sql Health in a SharePoint environment
Sql Health in a SharePoint environmentSql Health in a SharePoint environment
Sql Health in a SharePoint environment
 
Hadoop training in bangalore-kellytechnologies
Hadoop training in bangalore-kellytechnologiesHadoop training in bangalore-kellytechnologies
Hadoop training in bangalore-kellytechnologies
 
Exchange Server 2013 Database and Store Changes
Exchange Server 2013 Database and Store ChangesExchange Server 2013 Database and Store Changes
Exchange Server 2013 Database and Store Changes
 
Ibm db2 case study
Ibm db2 case studyIbm db2 case study
Ibm db2 case study
 
Bp106 Worst Practices Final
Bp106   Worst Practices FinalBp106   Worst Practices Final
Bp106 Worst Practices Final
 
Exchange 2010 High Availability And Storage
Exchange 2010 High Availability And StorageExchange 2010 High Availability And Storage
Exchange 2010 High Availability And Storage
 
Microsoft Exchange Server 2010
Microsoft Exchange Server 2010Microsoft Exchange Server 2010
Microsoft Exchange Server 2010
 
Building a high-performance data lake analytics engine at Alibaba Cloud with ...
Building a high-performance data lake analytics engine at Alibaba Cloud with ...Building a high-performance data lake analytics engine at Alibaba Cloud with ...
Building a high-performance data lake analytics engine at Alibaba Cloud with ...
 
Spark to DocumentDB connector
Spark to DocumentDB connectorSpark to DocumentDB connector
Spark to DocumentDB connector
 
Understanding and building big data Architectures - NoSQL
Understanding and building big data Architectures - NoSQLUnderstanding and building big data Architectures - NoSQL
Understanding and building big data Architectures - NoSQL
 
Building Scalable and Extendable Data Pipeline for Call of Duty Games: Lesson...
Building Scalable and Extendable Data Pipeline for Call of Duty Games: Lesson...Building Scalable and Extendable Data Pipeline for Call of Duty Games: Lesson...
Building Scalable and Extendable Data Pipeline for Call of Duty Games: Lesson...
 

Recently uploaded

Jumark Morit Diezmo- Career portfolio- BPED 3A
Jumark Morit Diezmo- Career portfolio- BPED 3AJumark Morit Diezmo- Career portfolio- BPED 3A
Jumark Morit Diezmo- Career portfolio- BPED 3Ajumarkdiezmo1
 
Crack JAG. Guidance program for entry to JAG Dept. & SSB interview
Crack JAG. Guidance program for entry to JAG Dept. & SSB interviewCrack JAG. Guidance program for entry to JAG Dept. & SSB interview
Crack JAG. Guidance program for entry to JAG Dept. & SSB interviewNilendra Kumar
 
LinkedIn for Your Job Search in April 2024
LinkedIn for Your Job Search in April 2024LinkedIn for Your Job Search in April 2024
LinkedIn for Your Job Search in April 2024Bruce Bennett
 
Protection of Children in context of IHL and Counter Terrorism
Protection of Children in context of IHL and  Counter TerrorismProtection of Children in context of IHL and  Counter Terrorism
Protection of Children in context of IHL and Counter TerrorismNilendra Kumar
 
美国SU学位证,雪城大学毕业证书1:1制作
美国SU学位证,雪城大学毕业证书1:1制作美国SU学位证,雪城大学毕业证书1:1制作
美国SU学位证,雪城大学毕业证书1:1制作ss846v0c
 
办理哈珀亚当斯大学学院毕业证书文凭学位证书
办理哈珀亚当斯大学学院毕业证书文凭学位证书办理哈珀亚当斯大学学院毕业证书文凭学位证书
办理哈珀亚当斯大学学院毕业证书文凭学位证书saphesg8
 
Introduction to Political Parties (1).ppt
Introduction to Political Parties (1).pptIntroduction to Political Parties (1).ppt
Introduction to Political Parties (1).pptSohamChavan9
 
8377877756 Full Enjoy @24/7 Call Girls in Pitampura Delhi NCR
8377877756 Full Enjoy @24/7 Call Girls in Pitampura Delhi NCR8377877756 Full Enjoy @24/7 Call Girls in Pitampura Delhi NCR
8377877756 Full Enjoy @24/7 Call Girls in Pitampura Delhi NCRdollysharma2066
 
AICTE PPT slide of Engineering college kr pete
AICTE PPT slide of Engineering college kr peteAICTE PPT slide of Engineering college kr pete
AICTE PPT slide of Engineering college kr peteshivubhavv
 
LinkedIn Strategic Guidelines April 2024
LinkedIn Strategic Guidelines April 2024LinkedIn Strategic Guidelines April 2024
LinkedIn Strategic Guidelines April 2024Bruce Bennett
 
办澳洲詹姆斯库克大学毕业证成绩单pdf电子版制作修改
办澳洲詹姆斯库克大学毕业证成绩单pdf电子版制作修改办澳洲詹姆斯库克大学毕业证成绩单pdf电子版制作修改
办澳洲詹姆斯库克大学毕业证成绩单pdf电子版制作修改yuu sss
 
Storytelling, Ethics and Workflow in Documentary Photography
Storytelling, Ethics and Workflow in Documentary PhotographyStorytelling, Ethics and Workflow in Documentary Photography
Storytelling, Ethics and Workflow in Documentary PhotographyOrtega Alikwe
 
办理学位证(UoM证书)北安普顿大学毕业证成绩单原版一比一
办理学位证(UoM证书)北安普顿大学毕业证成绩单原版一比一办理学位证(UoM证书)北安普顿大学毕业证成绩单原版一比一
办理学位证(UoM证书)北安普顿大学毕业证成绩单原版一比一A SSS
 
Unlock Your Creative Potential: 7 Skills for Content Creator Evolution
Unlock Your Creative Potential: 7 Skills for Content Creator EvolutionUnlock Your Creative Potential: 7 Skills for Content Creator Evolution
Unlock Your Creative Potential: 7 Skills for Content Creator EvolutionRhazes Ghaisan
 
原版定制copy澳洲查尔斯达尔文大学毕业证CDU毕业证成绩单留信学历认证保障质量
原版定制copy澳洲查尔斯达尔文大学毕业证CDU毕业证成绩单留信学历认证保障质量原版定制copy澳洲查尔斯达尔文大学毕业证CDU毕业证成绩单留信学历认证保障质量
原版定制copy澳洲查尔斯达尔文大学毕业证CDU毕业证成绩单留信学历认证保障质量sehgh15heh
 
定制(SCU毕业证书)南十字星大学毕业证成绩单原版一比一
定制(SCU毕业证书)南十字星大学毕业证成绩单原版一比一定制(SCU毕业证书)南十字星大学毕业证成绩单原版一比一
定制(SCU毕业证书)南十字星大学毕业证成绩单原版一比一z xss
 
定制(Waikato毕业证书)新西兰怀卡托大学毕业证成绩单原版一比一
定制(Waikato毕业证书)新西兰怀卡托大学毕业证成绩单原版一比一定制(Waikato毕业证书)新西兰怀卡托大学毕业证成绩单原版一比一
定制(Waikato毕业证书)新西兰怀卡托大学毕业证成绩单原版一比一Fs
 
Gurgaon Call Girls: Free Delivery 24x7 at Your Doorstep G.G.N = 8377087607
Gurgaon Call Girls: Free Delivery 24x7 at Your Doorstep G.G.N = 8377087607Gurgaon Call Girls: Free Delivery 24x7 at Your Doorstep G.G.N = 8377087607
Gurgaon Call Girls: Free Delivery 24x7 at Your Doorstep G.G.N = 8377087607dollysharma2066
 
办理(Salford毕业证书)索尔福德大学毕业证成绩单原版一比一
办理(Salford毕业证书)索尔福德大学毕业证成绩单原版一比一办理(Salford毕业证书)索尔福德大学毕业证成绩单原版一比一
办理(Salford毕业证书)索尔福德大学毕业证成绩单原版一比一diploma 1
 
AI ppt introduction , advandtage pros and cons.pptx
AI ppt introduction , advandtage pros and cons.pptxAI ppt introduction , advandtage pros and cons.pptx
AI ppt introduction , advandtage pros and cons.pptxdeepakkrlkr2002
 

Recently uploaded (20)

Jumark Morit Diezmo- Career portfolio- BPED 3A
Jumark Morit Diezmo- Career portfolio- BPED 3AJumark Morit Diezmo- Career portfolio- BPED 3A
Jumark Morit Diezmo- Career portfolio- BPED 3A
 
Crack JAG. Guidance program for entry to JAG Dept. & SSB interview
Crack JAG. Guidance program for entry to JAG Dept. & SSB interviewCrack JAG. Guidance program for entry to JAG Dept. & SSB interview
Crack JAG. Guidance program for entry to JAG Dept. & SSB interview
 
LinkedIn for Your Job Search in April 2024
LinkedIn for Your Job Search in April 2024LinkedIn for Your Job Search in April 2024
LinkedIn for Your Job Search in April 2024
 
Protection of Children in context of IHL and Counter Terrorism
Protection of Children in context of IHL and  Counter TerrorismProtection of Children in context of IHL and  Counter Terrorism
Protection of Children in context of IHL and Counter Terrorism
 
美国SU学位证,雪城大学毕业证书1:1制作
美国SU学位证,雪城大学毕业证书1:1制作美国SU学位证,雪城大学毕业证书1:1制作
美国SU学位证,雪城大学毕业证书1:1制作
 
办理哈珀亚当斯大学学院毕业证书文凭学位证书
办理哈珀亚当斯大学学院毕业证书文凭学位证书办理哈珀亚当斯大学学院毕业证书文凭学位证书
办理哈珀亚当斯大学学院毕业证书文凭学位证书
 
Introduction to Political Parties (1).ppt
Introduction to Political Parties (1).pptIntroduction to Political Parties (1).ppt
Introduction to Political Parties (1).ppt
 
8377877756 Full Enjoy @24/7 Call Girls in Pitampura Delhi NCR
8377877756 Full Enjoy @24/7 Call Girls in Pitampura Delhi NCR8377877756 Full Enjoy @24/7 Call Girls in Pitampura Delhi NCR
8377877756 Full Enjoy @24/7 Call Girls in Pitampura Delhi NCR
 
AICTE PPT slide of Engineering college kr pete
AICTE PPT slide of Engineering college kr peteAICTE PPT slide of Engineering college kr pete
AICTE PPT slide of Engineering college kr pete
 
LinkedIn Strategic Guidelines April 2024
LinkedIn Strategic Guidelines April 2024LinkedIn Strategic Guidelines April 2024
LinkedIn Strategic Guidelines April 2024
 
办澳洲詹姆斯库克大学毕业证成绩单pdf电子版制作修改
办澳洲詹姆斯库克大学毕业证成绩单pdf电子版制作修改办澳洲詹姆斯库克大学毕业证成绩单pdf电子版制作修改
办澳洲詹姆斯库克大学毕业证成绩单pdf电子版制作修改
 
Storytelling, Ethics and Workflow in Documentary Photography
Storytelling, Ethics and Workflow in Documentary PhotographyStorytelling, Ethics and Workflow in Documentary Photography
Storytelling, Ethics and Workflow in Documentary Photography
 
办理学位证(UoM证书)北安普顿大学毕业证成绩单原版一比一
办理学位证(UoM证书)北安普顿大学毕业证成绩单原版一比一办理学位证(UoM证书)北安普顿大学毕业证成绩单原版一比一
办理学位证(UoM证书)北安普顿大学毕业证成绩单原版一比一
 
Unlock Your Creative Potential: 7 Skills for Content Creator Evolution
Unlock Your Creative Potential: 7 Skills for Content Creator EvolutionUnlock Your Creative Potential: 7 Skills for Content Creator Evolution
Unlock Your Creative Potential: 7 Skills for Content Creator Evolution
 
原版定制copy澳洲查尔斯达尔文大学毕业证CDU毕业证成绩单留信学历认证保障质量
原版定制copy澳洲查尔斯达尔文大学毕业证CDU毕业证成绩单留信学历认证保障质量原版定制copy澳洲查尔斯达尔文大学毕业证CDU毕业证成绩单留信学历认证保障质量
原版定制copy澳洲查尔斯达尔文大学毕业证CDU毕业证成绩单留信学历认证保障质量
 
定制(SCU毕业证书)南十字星大学毕业证成绩单原版一比一
定制(SCU毕业证书)南十字星大学毕业证成绩单原版一比一定制(SCU毕业证书)南十字星大学毕业证成绩单原版一比一
定制(SCU毕业证书)南十字星大学毕业证成绩单原版一比一
 
定制(Waikato毕业证书)新西兰怀卡托大学毕业证成绩单原版一比一
定制(Waikato毕业证书)新西兰怀卡托大学毕业证成绩单原版一比一定制(Waikato毕业证书)新西兰怀卡托大学毕业证成绩单原版一比一
定制(Waikato毕业证书)新西兰怀卡托大学毕业证成绩单原版一比一
 
Gurgaon Call Girls: Free Delivery 24x7 at Your Doorstep G.G.N = 8377087607
Gurgaon Call Girls: Free Delivery 24x7 at Your Doorstep G.G.N = 8377087607Gurgaon Call Girls: Free Delivery 24x7 at Your Doorstep G.G.N = 8377087607
Gurgaon Call Girls: Free Delivery 24x7 at Your Doorstep G.G.N = 8377087607
 
办理(Salford毕业证书)索尔福德大学毕业证成绩单原版一比一
办理(Salford毕业证书)索尔福德大学毕业证成绩单原版一比一办理(Salford毕业证书)索尔福德大学毕业证成绩单原版一比一
办理(Salford毕业证书)索尔福德大学毕业证成绩单原版一比一
 
AI ppt introduction , advandtage pros and cons.pptx
AI ppt introduction , advandtage pros and cons.pptxAI ppt introduction , advandtage pros and cons.pptx
AI ppt introduction , advandtage pros and cons.pptx
 

Scaling your website

  • 1. SCALING YOUR WEBSITE Alejandro Marcu Dutch PHP Conference 2016
  • 2. 2  Started programming Logo at 8 years old  Then moved to Basic, Turbo Pascal, C++, Java  2001 – 2004 Various programming jobs in Argentina  2004 – 2008: TopCoder  2009 – 2015: Facebook Alejandro Marcu
  • 3. 3  Scalable architecture  Scaling the database  Caching  Introducing new features What You Will Learn Today
  • 5. Single Server
      - Hosted or in the cloud
      - Web App: Apache/Nginx + PHP
      - DB: MySQL, MongoDB, etc.
      - Cache: Memcache, Redis
      [Diagram: User; Server with Web App, DB, Cache]
  • 6. Scaling Vertically
      - More RAM
      - More cores or faster CPU
      - SSD
      - RAID
      - Network Interfaces
  • 7. Functional Partitioning
      - Servers can have different hardware specs
      - More latency
      - Limited growth
      [Diagram: User; Data Center with Server 1, Server 2, Server 3 hosting Web App, DB, Cache]
  • 8. Splitting the Web App
      - Web Front End should be a thin presentation layer
      - Services
          - Just another class
          - Remote over SOAP, REST, Thrift
      - Start simple, plan for scale
      [Diagram: iOS App, Android App, Web Front End; Back End with Service 1, Service 2, ..., Service n; DB; Cache]
  • 9. Functional Partitioning
      - Back end servers can have one or more services
      - Some services can be in more than one server
      [Diagram: User; Data Center with Server 1 (Web Front End), Server 2, Server 3 (DB, Cache), Server 4 ... Server k (Back End: Service 1 ... Service n)]
  • 10. Stateless Services
      - Don't store anything locally
      - Use external storage (e.g. databases)
      - Can use local caching
  • 11. Stateless Front End
      - HTTP Session
          - Cookies
          - External Data Store
      - Uploaded Files
          - DFS: GFS, HDFS, GlusterFS
          - Amazon S3
  • 12. Multiple Front End Servers
      - Load Balancer:
          - Cloud based (Amazon ELB)
          - Software (NGINX, HAProxy)
          - Hardware (BIG-IP, NetScaler)
      [Diagram: User; Load Balancer; Front End (Web FE 1 ... Web FE k); Back End (Service 1 ... Service n); DB; Cache; all inside the Data Center]
  • 13. Caching static files
      - Files that are the same on each request, e.g. jpg, png, css, js, mp3, etc.
      - Reverse HTTP Proxy
          - Load balancers usually provide this functionality
      - CDN (Content Delivery Network)
          - E.g. Akamai, Amazon CloudFront
          - Pay for usage
          - Multiple locations
      [Diagram: User, CDN (static content), Data Center (dynamic content)]
  • 14. Multiple Data Centers
      - Advantages
          - Lower latency for users
          - Reduced disaster risk
          - Economic opportunities
      - Challenges
          - Consistency
          - Latency between data centers
          - Bandwidth between data centers
  • 16. Scaling relational databases
      - Too much data
      - Too many reads
      - Too many writes
      - Want higher availability
  • 17. Replication
      - Usually many more reads than writes
      - Higher availability
      - A read right after a write can be wrong (the slave may not have received the write yet)
      [Diagram: DB clients read and write (R/W) on the Master and read (R) from the Slaves; the Master feeds the Slaves through binlogs]
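One possible way to limit wrong reads after a write is to keep a client on the master for a short time after it writes. The sketch below is only an illustration of that idea: `ReplicaRouter`, the host names and the 5 second window are assumptions, not something the slides prescribe, and a real application would keep the last-write time in the session or a cookie rather than in memory.

```php
<?php
// Hypothetical router: class name, hosts and the sticky window are assumptions.
class ReplicaRouter
{
    private $master;
    private $replicas;
    private $stickySeconds;
    private $lastWriteAt = [];   // client id => time of that client's last write

    public function __construct(string $master, array $replicas, int $stickySeconds = 5)
    {
        $this->master = $master;
        $this->replicas = $replicas;
        $this->stickySeconds = $stickySeconds;
    }

    // All writes go to the master; remember who wrote and when.
    public function hostForWrite(int $clientId, int $now): string
    {
        $this->lastWriteAt[$clientId] = $now;
        return $this->master;
    }

    // Reads go to a random replica unless this client wrote very recently.
    public function hostForRead(int $clientId, int $now): string
    {
        $last = $this->lastWriteAt[$clientId] ?? null;
        if ($last !== null && $now - $last < $this->stickySeconds) {
            return $this->master;   // the replica may not have the write yet
        }
        return $this->replicas[array_rand($this->replicas)];
    }
}
```

Other clients can still see stale data for a moment; this only protects a user from not seeing their own write.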
  • 18. Functional Partitioning
      - Limited growth
      - Can separate unrelated functionality
      [Diagram: tables User, Post, Payment split across DB 1 and DB 2]
  • 19. Sharding
      - Tables are split into multiple DBs
      - Sharding key used to decide which DB, e.g. id
      - Sharding function, e.g. db(id) = ((id - 1) % 2) + 1
      - Searching becomes more complicated
      DB 1 (id, name): 1 John, 3 Jack, 5 Anne
      DB 2 (id, name): 2 Louise, 4 Bob, 6 Marie
  • 20. Sharding
      - E.g., add an extra DB
      - New sharding function: db(id) = ((id - 1) % 3) + 1
      - Conclusion: modulo is not a good sharding function
      Before:
        DB 1 (id, name): 1 John, 3 Jack, 5 Anne
        DB 2 (id, name): 2 Louise, 4 Bob, 6 Marie
      After:
        DB 1 (id, name): 1 John, 4 Bob
        DB 2 (id, name): 2 Louise, 5 Anne
        DB 3 (id, name): 3 Jack, 6 Marie
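The cost of adding a database under modulo sharding can be checked with a few lines. This is a minimal sketch: `db_for` and `moved_ids` are hypothetical helper names, and the function is written as ((id - 1) % n) + 1 so that ids 1 to 6 land in the same databases as in the slide tables.

```php
<?php
// Modulo sharding, written so ids 1..6 land as in the slide tables.
function db_for(int $id, int $dbCount): int
{
    return (($id - 1) % $dbCount) + 1;
}

// Ids whose database changes when the number of databases changes.
function moved_ids(array $ids, int $oldCount, int $newCount): array
{
    return array_values(array_filter($ids, function (int $id) use ($oldCount, $newCount) {
        return db_for($id, $oldCount) !== db_for($id, $newCount);
    }));
}

$moved = moved_ids(range(1, 6), 2, 3);
echo implode(', ', $moved), "\n";   // prints "3, 4, 5, 6": only ids 1 and 2 stay put
```

Four of the six rows have to be copied to another database just to add one server, which is the reason behind the slide's conclusion.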
  • 21. Consistent Sharding
      - Consistent sharding needs fewer reallocations
      Before:
        DB 1 (id, name): 1 John, 3 Jack, 5 Anne
        DB 2 (id, name): 2 Louise, 4 Bob, 6 Marie
      After:
        DB 1 (id, name): 1 John, 3 Jack
        DB 2 (id, name): 2 Louise, 4 Bob
        DB 3 (id, name): 5 Anne, 6 Marie
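The slide does not say how consistent sharding is implemented. One common technique with this property is a consistent-hash ring, sketched below under that assumption. `HashRing`, its methods and the choice of 64 virtual nodes per database are hypothetical; `crc32` is used only as a simple, widely available hash.

```php
<?php
// Sketch of a consistent-hash ring. HashRing and its methods are hypothetical names.
class HashRing
{
    private $ring = [];      // ring position => db name, kept sorted by position
    private $virtualNodes;   // points placed on the ring per db

    public function __construct(int $virtualNodes = 64)
    {
        $this->virtualNodes = $virtualNodes;
    }

    public function addDb(string $db)
    {
        for ($i = 0; $i < $this->virtualNodes; $i++) {
            $this->ring[crc32("$db#$i")] = $db;
        }
        ksort($this->ring);
    }

    public function dbFor($key): string
    {
        if (!$this->ring) {
            throw new RuntimeException('No databases on the ring');
        }
        $hash = crc32((string) $key);
        // Linear scan for clarity; a binary search would be used for large rings.
        foreach ($this->ring as $position => $db) {
            if ($position >= $hash) {
                return $db;   // first point at or after the key's position
            }
        }
        return reset($this->ring);   // wrap around to the start of the ring
    }
}
```

When a database is added, a key either stays where it was or moves to the new database; keys never move between the existing ones.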
  • 22. Sharding
      - Create many logical DBs
      - Distribute them across servers
      Server 1: DB 1, DB 2, ..., DB 16
      Server 2: DB 17, DB 18, ..., DB 32
  • 23. Sharding
      - Re-distribute DBs when needed
      - Need a function to map DB to server; it can be a configuration
      Server 1: DB 1, DB 2, ..., DB 16
      Server 2: DB 17, DB 18, ..., DB 24
      Server 3: DB 25, DB 26, ..., DB 32
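A minimal sketch of the two-level lookup, assuming 32 logical DBs and the three-server layout from the slide. The server names, the `$dbRanges` layout and both function names are hypothetical. Modulo is tolerable for the first step only because the number of logical DBs is fixed; moving a DB to a new server is then a change to the configuration, not to the sharding function.

```php
<?php
const LOGICAL_DBS = 32;   // fixed up front and never changed

// Hypothetical configuration: first and last logical db hosted by each server.
$dbRanges = [
    'server1' => [1, 16],
    'server2' => [17, 24],
    'server3' => [25, 32],
];

// Step 1: id to logical db (stable, independent of the number of servers).
function logical_db(int $id): int
{
    return ($id % LOGICAL_DBS) + 1;
}

// Step 2: logical db to server, driven purely by configuration.
function server_for_db(int $db, array $ranges): string
{
    foreach ($ranges as $server => list($first, $last)) {
        if ($db >= $first && $db <= $last) {
            return $server;
        }
    }
    throw new OutOfRangeException("No server configured for DB $db");
}

echo server_for_db(logical_db(16), $dbRanges), "\n";   // id 16 is in DB 17, on server2
```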
  • 24. Sharding colocation
      - Put owned data in the same DB as its owner (e.g. shard the post table by user_id)
      - Can execute joins
      DB 1 user (id, name): 1 John, 3 Jack, 5 Anne
      DB 1 post (id, user_id, text): (100, 1, …), (125, 1, …), (180, 3, …)
      DB 2 user (id, name): 2 Louise, 4 Bob, 6 Marie
      DB 2 post (id, user_id, text): (143, 2, …), (110, 6, …), (175, 6, …)
  • 25. Sharding fan-out
      - Many-to-many relationships are spread out
      - To get the names of a user's friends:
          - Get ids
          - Group by DB
          - Query on each DB
      - Gets worse with more DBs
      - Caching helps a lot
      - Needs inverse entries
      DB 1 user (id, name): 1 John, 3 Jack, 5 Anne
      DB 1 friend (id1, id2): (1, 2), (1, 4), (3, 4)
      DB 2 user (id, name): 2 Louise, 4 Bob, 6 Marie
      DB 2 friend (id1, id2): (2, 1), (4, 1), (4, 3)
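The three fan-out steps can be sketched as follows. In-memory arrays stand in for the two user shards from the slide, and the function names and the example friend ids are hypothetical; a real implementation would issue one `SELECT ... WHERE id IN (...)` per database.

```php
<?php
// In-memory stand-ins for the two user shards on the slide.
$shards = [
    1 => [1 => 'John', 3 => 'Jack', 5 => 'Anne'],
    2 => [2 => 'Louise', 4 => 'Bob', 6 => 'Marie'],
];

function db_for(int $id): int
{
    return (($id - 1) % 2) + 1;   // matches the placement in the slide tables
}

// Step 2: group the ids by the db that holds them.
function group_by_db(array $ids): array
{
    $groups = [];
    foreach ($ids as $id) {
        $groups[db_for($id)][] = $id;
    }
    return $groups;
}

// Step 3: one lookup per db instead of one per id.
function names_for(array $ids, array $shards): array
{
    $names = [];
    foreach (group_by_db($ids) as $db => $idsOnDb) {
        // Real code would run: SELECT id, name FROM user WHERE id IN (...)
        foreach ($idsOnDb as $id) {
            $names[$id] = $shards[$db][$id];
        }
    }
    return $names;
}

$friendIds = [2, 4, 5];   // hypothetical result of step 1
print_r(names_for($friendIds, $shards));
```

The number of queries grows with the number of databases touched, not with the number of friends, which is why the cost rises as more DBs are added.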
  • 26. Database scaling
      - Replication
          - Scales reads, higher availability
      - Functional partitioning
          - Limited scalability
          - Helps across the board
      - Sharding
          - Scales reads, writes, too much data and helps with availability
      - These 3 techniques can be combined
  • 28. Caching application data
      - Usually required at large scale
      - Key-Value stores
          - Set(key, value[, TTL])
          - Get(key)
          - Delete(key)
      - Different levels
          - Client side (e.g. in the browser in JS)
          - In the web server (e.g. APC)
          - Distributed cache (e.g. Redis, Memcached)
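The three operations can be illustrated with a toy in-process cache. This is a sketch of the interface only: `ArrayCache` is a hypothetical class, not the API of APC, Redis or Memcached, and the injectable clock exists only to make expiry easy to demonstrate.

```php
<?php
// Toy in-process key-value cache; ArrayCache is a hypothetical name.
class ArrayCache
{
    private $items = [];   // key => [value, expiry timestamp or null]
    private $clock;        // callable returning the current Unix time

    public function __construct(?callable $clock = null)
    {
        $this->clock = $clock ?? 'time';
    }

    // A TTL of 0 means the entry never expires.
    public function set(string $key, $value, int $ttl = 0): void
    {
        $expiresAt = $ttl > 0 ? ($this->clock)() + $ttl : null;
        $this->items[$key] = [$value, $expiresAt];
    }

    // Returns null on a miss or when the entry has expired.
    public function get(string $key)
    {
        if (!isset($this->items[$key])) {
            return null;
        }
        [$value, $expiresAt] = $this->items[$key];
        if ($expiresAt !== null && ($this->clock)() >= $expiresAt) {
            unset($this->items[$key]);
            return null;
        }
        return $value;
    }

    public function delete(string $key): void
    {
        unset($this->items[$key]);
    }
}
```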
  • 29. Caching in the web server
      - E.g. APC (Alternative PHP Cache)
      - Very fast
      - Duplicated caching between web servers
      - Expensive to invalidate
      - Use sparingly, mostly for global data
  • 30. Distributed cache
      - Examples:
          - Redis
          - Memcached (+ McRouter or libmemcached)
      - One or more cache servers, shared use between clients
      - Network latency
  • 31. Distributed cache
      - Features to consider:
          - Replication
          - Partitioning
          - Separate pools
          - Persistence
          - Atomic operations
  • 32. Cache invalidation
      - When the value is no longer valid, usually just delete the key
      - Example: user_friends:100 => 'John X, Bob Y, Anne Z'
      - Need to invalidate when:
          - The user adds or removes friends
          - A friend removes him as a friend
          - A friend changes his name
      - Can you tolerate temporary inconsistencies?
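The invalidation cases above can be sketched with in-memory arrays standing in for the database and the cache. `FriendService` and its methods are hypothetical; the point is only which keys get deleted on each kind of change.

```php
<?php
// In-memory stand-ins; a real setup would use a database and Redis or Memcached.
class FriendService
{
    private $names = [];     // user id => name
    private $friends = [];   // user id => list of friend ids
    private $cache = [];     // cache key => cached string

    public function addUser(int $id, string $name): void
    {
        $this->names[$id] = $name;
        $this->friends[$id] = [];
    }

    public function friendNames(int $userId): string
    {
        $key = "user_friends:$userId";
        if (!array_key_exists($key, $this->cache)) {
            // Miss: rebuild the value from the source of truth.
            $list = [];
            foreach ($this->friends[$userId] as $friendId) {
                $list[] = $this->names[$friendId];
            }
            $this->cache[$key] = implode(', ', $list);
        }
        return $this->cache[$key];
    }

    public function addFriend(int $a, int $b): void
    {
        $this->friends[$a][] = $b;
        $this->friends[$b][] = $a;
        // The friendship shows up in both lists, so both cached values are stale.
        unset($this->cache["user_friends:$a"], $this->cache["user_friends:$b"]);
    }

    public function rename(int $userId, string $name): void
    {
        $this->names[$userId] = $name;
        // Every friend's cached list still contains the old name.
        foreach ($this->friends[$userId] as $friendId) {
            unset($this->cache["user_friends:$friendId"]);
        }
    }
}
```

A rename by a user with many friends deletes many keys at once, which is where tolerating a short period of inconsistency (for example through a TTL) can be the cheaper choice.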
  • 33. Cache versioning
      - What happens if you change the structure of the values? Example:
          (old) user_friends:100 => 'John X, Bob Y, Anne Z'
          (new) user_friends:100 => '1:John X, 25:Bob Y, 37:Anne Z'
      - New code breaks with old style keys
      - Old code breaks with new style keys
      - Solution: use versions:
          (old) user_friends:100:1 => 'John X, Bob Y, Anne Z'
          (new) user_friends:100:2 => '1:John X, 25:Bob Y, 37:Anne Z'
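A minimal sketch of versioned keys, using the key layout from the slide. The constant and function names are hypothetical. Because each code version builds and reads only its own keys, old and new code can run side by side during a deploy.

```php
<?php
// Bump this whenever the format of the cached value changes.
const USER_FRIENDS_VERSION = 2;

function user_friends_key(int $userId, int $version = USER_FRIENDS_VERSION): string
{
    return "user_friends:$userId:$version";
}

// Version 2 format from the slide: "1:John X, 25:Bob Y" becomes [1 => 'John X', 25 => 'Bob Y'].
function parse_friends_v2(string $value): array
{
    $friends = [];
    foreach (explode(', ', $value) as $entry) {
        list($id, $name) = explode(':', $entry, 2);
        $friends[(int) $id] = $name;
    }
    return $friends;
}

echo user_friends_key(100), "\n";   // prints "user_friends:100:2"
```

Entries written under the old version are never read again and simply age out if they were stored with a TTL.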
  • 35. Introducing new features
      - Objectives:
          - A/B testing
          - Quickly revert it if needed
          - Protect infrastructure
          - Ease of development
  • 36. Introducing new features
      - Some possibilities:
          1. Development branch
          2. Feature toggle
          3. Percentage Rollout
          4. Advanced Rollout
  • 37. Development Branch
      - New branch for the feature, merge when finished
      - Can be fine in the early stages
      - No extra setup or complexity
      - Long-lived branch, may be hard to merge
  • 38. Feature Toggle
      - Can be changed at run time (console or configuration)
      - Should distinguish prod from testing
      - Allows for intermediate commits
      - Code structure:
            if (feature_enabled('homepage_redesign')) {
                new_homepage();
            } else {
                old_homepage();
            }
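The slide leaves the toggle lookup itself open. One way it could be backed is a per-environment configuration array, sketched here. `FeatureToggles`, the config layout and the environment names `testing` and `prod` are assumptions, not the API of any particular framework.

```php
<?php
// Hypothetical toggle store; the config layout and environment names are assumptions.
class FeatureToggles
{
    private $flags;
    private $environment;

    public function __construct(array $flags, string $environment)
    {
        $this->flags = $flags;
        $this->environment = $environment;
    }

    public function isEnabled(string $feature): bool
    {
        // Unknown features and unknown environments default to off.
        return $this->flags[$feature][$this->environment] ?? false;
    }
}

// Enabled in testing while still switched off in production.
$flags = [
    'homepage_redesign' => ['testing' => true, 'prod' => false],
];
$toggles = new FeatureToggles($flags, 'testing');
echo $toggles->isEnabled('homepage_redesign') ? "new homepage\n" : "old homepage\n";
```

Defaulting to off means a half-finished feature stays hidden if its configuration entry is missing.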
  • 39. Percentage Rollout
      - Dynamically control the percentage of users for a feature
      - When increasing the percentage, should include previous users
      - Code structure:
            if (feature_enabled('homepage_redesign', $user_id)) {
                new_homepage();
            } else {
                old_homepage();
            }
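One way to make sure a higher percentage always includes the users from a lower one is to give each user a stable bucket from 0 to 99 and enable the feature for buckets below the percentage. The following is a sketch under that assumption; the function names are hypothetical and `md5` is used only to spread users evenly, not for security.

```php
<?php
// Hypothetical helper: maps a user to a stable bucket from 0 to 99 per feature.
function rollout_bucket(string $feature, int $userId): int
{
    // Hashing the feature name in keeps different features from
    // always selecting the same group of users.
    return hexdec(substr(md5("$feature:$userId"), 0, 6)) % 100;
}

// A user is in the rollout when their bucket is below the current percentage,
// so raising the percentage only ever adds users.
function feature_enabled_for(string $feature, int $userId, int $percentage): bool
{
    return rollout_bucket($feature, $userId) < $percentage;
}
```

The percentage itself would come from configuration that can be changed at run time, as with a plain feature toggle.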
  • 40. Advanced Rollout
      - Turn on/off features for a percentage of users that:
          - Are employees
          - Are in another rollout group
          - Use a certain language
          - Are in a certain country
      - Individually whitelist or blacklist people
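Targeting rules like these could be expressed as data and checked by a single function. The sketch below is an assumption about how that might look, not the API of any of the rollout frameworks; the rule and user field names are hypothetical, and the "in another rollout group" condition is left out for brevity.

```php
<?php
// Hypothetical rule format; the field names are assumptions, not an existing framework's API.
function rollout_allows(array $rule, array $user): bool
{
    if (in_array($user['id'], $rule['blacklist'] ?? [], true)) {
        return false;   // the blacklist wins over everything else
    }
    if (in_array($user['id'], $rule['whitelist'] ?? [], true)) {
        return true;    // whitelisted users skip all other conditions
    }
    if (!empty($rule['employees_only']) && empty($user['is_employee'])) {
        return false;
    }
    if (isset($rule['countries']) && !in_array($user['country'], $rule['countries'], true)) {
        return false;
    }
    if (isset($rule['languages']) && !in_array($user['language'], $rule['languages'], true)) {
        return false;
    }
    // The user is in the target group; apply the percentage within it.
    $bucket = hexdec(substr(md5($rule['feature'] . ':' . $user['id']), 0, 6)) % 100;
    return $bucket < ($rule['percentage'] ?? 100);
}
```

Letting the blacklist take precedence over the whitelist is a design choice made here so that a single entry can always switch a feature off for a specific person.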
  • 41. Introducing new features
      - Some frameworks to check out:
          - Swivel
          - Opensoft/rollout
          - LaunchDarkly
      - Don't forget to clean up the old code paths