Scalable Web Architecture

13,556 views

Scalable Web Architecture: LAMP, AWS, S3, CloudFront, EC2, caching strategy, database scaling, high availability, fault tolerance, horizontal scalability

Published in: Technology
  • good presentation but has to be more elaborate
  • A good summary, hopefully more in detail.

Scalable Web Architecture

  1. Scalable web architecture: LAMP (& AWS infrastructure), by Aleksandr Tsertkov
  2. LAMP
  3. LAMP: Platform
       • L - Linux / Unix
       • A - Apache / lighttpd / nginx
       • M - MySQL / PostgreSQL / SQLite
       • P - PHP / Python / Perl / Ruby
  4. LAMP: Why LAMP
       • We know it!
       • Proven by very large sites (Facebook, YouTube, LiveJournal, etc.)
       • Plenty of shared experience on the web
       • It's flexible and extensible
       • Easy to find engineers
       • Easy to maintain
       • It's cheap
  5. LAMP: Who uses it
       • Let's take the top 10 Internet sites according to Alexa:
       • Google
       • Yahoo!
       • YouTube
       • Facebook
       • Windows Live
       • MSN
       • Wikipedia
       • Blogger
       • Baidu
       • Yahoo! Japan
     Who uses LAMP? 5 of 10.
  6. System design (SD)
  7. SD: Key points
       • Scalability
       • HA (High Availability)
           • Backup & restore strategy!!!
       • Fault tolerance
       • SN (Shared Nothing)
  8. SD: Load scalability
       • “Load scalability: The ability for a distributed system to easily expand and contract its resource pool to accommodate heavier or lighter loads.” (Wikipedia)
  9. SD: Horizontal scalability
       • Adding new nodes to a system to handle growing load, and removing nodes when load decreases.
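The idea of adding and removing nodes as load changes can be sketched as a small sizing rule. This is only an illustration: the function name, the per-node capacity figure, and the minimum and maximum bounds are assumptions, not values from the slides.

```python
import math


def nodes_needed(requests_per_second, capacity_per_node, min_nodes=1, max_nodes=20):
    """Return how many nodes the pool should have for the current load.

    capacity_per_node is an assumed, measured figure (requests per second
    that one node can serve); min_nodes and max_nodes are assumed bounds.
    """
    wanted = math.ceil(requests_per_second / capacity_per_node)
    # Clamp so the pool never shrinks to zero or grows without limit.
    return max(min_nodes, min(max_nodes, wanted))


if __name__ == "__main__":
    for load in (0, 950, 100000):
        print(load, "req/s ->", nodes_needed(load, capacity_per_node=200), "nodes")
```

A real deployment would usually smooth the load signal over time before acting on it, so that short spikes do not cause nodes to be added and removed repeatedly.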
  10. SD: High Availability
       • Complex: High availability is a system design protocol and associated implementation that ensures a certain absolute degree of operational continuity during a given measurement period.
       • Simple: maximum uptime, minimum downtime.
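To make "maximum uptime, minimum downtime" concrete, an availability target can be converted into a downtime budget for a measurement period. The sketch below assumes a one-year period of 365 days; the function name is illustrative.

```python
def allowed_downtime_hours(availability_percent, period_hours=365 * 24):
    """Hours of downtime permitted in the period at the given availability."""
    return period_hours * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):
        hours = allowed_downtime_hours(target)
        print(f"{target}% availability allows about {hours:.2f} h of downtime per year")
```

Each additional "nine" cuts the yearly budget by a factor of ten: roughly 87.6 hours at 99%, 8.76 hours at 99.9%, and under one hour at 99.99%.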
  11. SD: Fault tolerance
       • The system should continue to function normally even if some of its components fail.
       • No single point of failure.
  12. SD: Shared Nothing
       • “A shared nothing architecture (SN) is a distributed computing architecture in which each node is independent and self-sufficient, and there is no single point of contention across the system.” (Wikipedia)
  13. SD: Shared Nothing
       • Can be achieved on different application layers separately:
       • Database: data partitioning / sharding
       • Cache: memcached client-side partitioning
       • Computing: job queues
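Client-side cache partitioning means the client, not the cache servers, decides which node owns a key, so the nodes never need to talk to each other. The sketch below is a toy model under stated assumptions: plain dictionaries stand in for memcached servers, the class and node names are hypothetical, and the simple hash-modulo placement is chosen for brevity (real clients often use consistent hashing so that adding a node moves fewer keys).

```python
import hashlib


class PartitionedCache:
    """Toy stand-in for a partitioning cache client.

    Each dict plays the role of one independent cache server.
    """

    def __init__(self, node_names):
        self.node_names = list(node_names)
        self.nodes = {name: {} for name in self.node_names}

    def node_for(self, key):
        # Hash the key and map it onto one node; the same key always
        # lands on the same node as long as the node list is unchanged.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % len(self.node_names)
        return self.node_names[index]

    def set(self, key, value):
        self.nodes[self.node_for(key)][key] = value

    def get(self, key):
        return self.nodes[self.node_for(key)].get(key)


if __name__ == "__main__":
    cache = PartitionedCache(["cache-a", "cache-b", "cache-c"])
    cache.set("user:42", {"name": "example"})
    print(cache.node_for("user:42"), cache.get("user:42"))
```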
  14. SD: Typical web architecture
       • Load balancer
       • Several web servers
       • Database server(s)
       • Shared file server (NAS) (if really needed)
  15. SD: Typical web architecture
  16. SD: Typical web architecture
       • Each of these components / layers can be scaled separately.
       • The database is usually the toughest part to scale.
  17. SD: Scaling database
       • Master-slave replication variants:
       • Reads and writes go to the master; slaves serve additional read-only queries
       • Only writes go to the master; all reads go to the slaves
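The two replication variants differ only in whether the master takes part in serving reads. A minimal routing sketch, assuming server names are plain strings and that a query is a read when it starts with SELECT (a simplification; the class name and the `reads_on_master` flag are hypothetical):

```python
import itertools


class ReplicatedRouter:
    """Pick a database server for each query in a master-slave setup."""

    def __init__(self, master, slaves, reads_on_master=False):
        self.master = master
        # Variant 1: the master also serves reads. Variant 2: slaves only.
        read_pool = list(slaves) + ([master] if reads_on_master else [])
        if not read_pool:
            raise ValueError("at least one server must be able to serve reads")
        self._reads = itertools.cycle(read_pool)

    def route(self, sql):
        is_read = sql.lstrip().upper().startswith("SELECT")
        # Writes always go to the master; reads rotate over the read pool.
        return next(self._reads) if is_read else self.master


if __name__ == "__main__":
    router = ReplicatedRouter("master", ["slave1", "slave2"])
    print(router.route("INSERT INTO users VALUES (1)"))
    print(router.route("SELECT * FROM users"))
```

Because replication is typically asynchronous, a read sent to a slave immediately after a write may not see that write yet; applications that need read-your-own-writes behaviour usually send those particular reads to the master.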
  18. SD: Scaling database
       • The master should be a very powerful machine, but sooner or later you will hit its I/O limit.
       • Data partitioning / sharding distributes data across a number of masters, spreading the load between them (horizontal scaling).
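Sharding can be sketched as a function that maps each record to one of several masters. The example below assumes data is partitioned by a numeric user ID with a simple modulo rule; the shard names and function names are hypothetical.

```python
# Hypothetical master names; each one would hold a disjoint slice of the data.
SHARDS = ["db-master-0", "db-master-1", "db-master-2", "db-master-3"]


def shard_for_user(user_id, shards=SHARDS):
    """Return the master that owns all rows for this user."""
    return shards[user_id % len(shards)]


def group_by_shard(user_ids, shards=SHARDS):
    """Group IDs by owning shard so each master receives one batched query."""
    groups = {}
    for user_id in user_ids:
        groups.setdefault(shard_for_user(user_id, shards), []).append(user_id)
    return groups


if __name__ == "__main__":
    print(shard_for_user(10))
    print(group_by_shard([1, 2, 5, 8, 10]))
```

Modulo placement is easy to reason about, but changing the number of shards moves most keys. A lookup table or range-based mapping is a common alternative when shards need to be added later.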
  19. SD: Scaling database
  20. SD: Caching strategy
       • A hierarchy of caches should be used for optimal performance and efficiency.
       • Local memory -> memcached -> local disk
  21. SD: Caching hierarchy
       • App server local in-memory cache for very frequently used items (speeds up script bootstrapping)
       • Distributed cache system (memcached) for caching database queries and as a general-purpose cache
       • App server file cache for large items
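The lookup order local memory -> memcached -> local disk can be sketched as a chain of tiers checked from fastest to slowest. This is a simplified model: dictionaries stand in for all three tiers, the class name is hypothetical, and every value is written to every tier, whereas the slides suggest choosing the tier by item type and size.

```python
class TieredCache:
    """Check caches from fastest to slowest, then fall back to a loader."""

    def __init__(self, tiers):
        # Ordered fastest first; each tier only needs dict-like access.
        self.tiers = tiers

    def get(self, key, load):
        for position, tier in enumerate(self.tiers):
            if key in tier:
                value = tier[key]
                # Copy the hit into the faster tiers that missed.
                for faster in self.tiers[:position]:
                    faster[key] = value
                return value
        # Missed everywhere: compute once and fill all tiers.
        value = load(key)
        for tier in self.tiers:
            tier[key] = value
        return value


if __name__ == "__main__":
    local_memory, shared_cache, disk_cache = {}, {}, {}
    cache = TieredCache([local_memory, shared_cache, disk_cache])
    print(cache.get("greeting", lambda key: key.upper()))
```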
  22. SD: High-CPU app servers
       • For CPU-intensive operations such as audio/video processing, dedicated application servers should be used.
       • Good control over them can be achieved using a job queue.
       • Video: check YouTube Platform ;-)
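A job queue decouples the web tier, which only enqueues work, from the dedicated workers that do the heavy processing. The sketch below models this inside one process with threads and the standard `queue` module; in a real system the queue would be a separate service and the workers separate machines. The function name and the fake "transcode" handler are assumptions, and the handler is assumed not to raise.

```python
import queue
import threading


def run_jobs(jobs, handler, worker_count=2):
    """Process jobs with a fixed pool of workers pulling from one queue."""
    pending = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            job = pending.get()
            if job is None:  # sentinel: no more work for this worker
                return
            outcome = handler(job)
            with results_lock:
                results.append((job, outcome))

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for job in jobs:
        pending.put(job)
    for _ in threads:
        pending.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    done = run_jobs(["a.avi", "b.avi"], lambda name: name.replace(".avi", ".mp4"))
    print(sorted(done))
```

The worker count is the control knob: it caps how much CPU-heavy work runs at once, regardless of how quickly the web tier enqueues jobs.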
  23. SD: Web server optimization
       • General web servers (Apache)
       • Comet web servers
       • Static content web servers
       • A Content Delivery Network (CDN) should be used for static public content.
  24. SD: Static files strategy
       • Network attached storage (NAS)
           • Distributed network file system (Lustre, GlusterFS, MogileFS)
           • Not distributed (NFS)
       • Fault tolerance and data redundancy are required!
  25. SD: Static files strategy
       • A distributed filesystem is complex, but in a perfect world it should give us what we need: performance, redundancy, fault tolerance.
       • Static content web servers can run on the distributed filesystem nodes!
  26. SD: Load balancers
       • Software or hardware load balancers
       • Traffic is distributed between several load balancers using round-robin DNS
       • An HA solution is needed for the load balancers themselves
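Round-robin distribution combined with a basic availability check can be sketched as follows. The class and backend names are hypothetical, and health state is set by hand here; a real balancer would run periodic health checks. Plain round-robin DNS usually has no health awareness at all, which is one reason the load balancers need their own HA solution.

```python
class RoundRobinBalancer:
    """Rotate over backends in order, skipping any marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._next = 0

    def mark_down(self, backend):
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Try each backend at most once per call, starting after the last pick.
        for _ in range(len(self.backends)):
            candidate = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends available")


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["web1", "web2", "web3"])
    balancer.mark_down("web2")
    print([balancer.pick() for _ in range(4)])
```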
  27. SD: Complete picture
  28. Amazon Web Services
  29. AWS: What is AWS?
       • Amazon is not only about books.
       • Amazon Web Services provides an infrastructure web services platform in the cloud.
  30. AWS: Why AWS?
       • Because it has everything we need:
           • EC2: Elastic Compute Cloud
               • EBS: Elastic Block Store
               • CloudWatch: monitoring service with auto scaling
               • Elastic Load Balancing
           • S3: Simple Storage Service
           • CloudFront (CDN)
           • SQS: Simple Queue Service
  31. AWS: EC2
       • Easy to deploy (OS images)
       • Easy to scale up and down on demand (deals with peaks) with Auto Scaling
       • Out-of-the-box monitoring with CloudWatch
       • Out-of-the-box load balancing with Elastic Load Balancing http://aws.amazon.com/loadbalancing
  32. AWS: S3 & CloudFront
       • Out-of-the-box CDN with CloudFront
       • DFS (sort of) with S3
       • Very reliable
  33. AWS: Services on top of AWS
       • Some like AWS so much that they have built their own cloud services on top of it
       • RightScale www.rightscale.com
       • GoGrid www.gogrid.com
  34. AWS: Panacea?
       • AWS is indeed good to start with, since it's fast and cheap.
       • In the long term, if everything goes as expected and profit increases, it might be better to build your own cloud infrastructure and migrate to it at some point.
  35. Thank you!
