
Scalability rules for web sites

This presentation describes some useful rules for scaling web sites.

Published in: Technology

  1. Scalability Rules for Web Sites (Minh Tran, 12/2011)
  2. Distribute Your Work
  3. Distribute Your Work
     • Design to Clone Things (X Axis)
     • Design to Split Different Things (Y Axis)
     • Design to Split Similar Things (Z Axis)
  4. Design to Clone Things (X Axis)
     • When:
       – Databases with a very high read-to-write ratio (5:1 or greater; the higher the better).
       – Any system where transaction growth exceeds data growth.
     • How to use:
       – Simply clone services and implement a load balancer.
       – For databases, ensure the accessing code understands the difference between a read and a write.
  5. Example
     • A reservation system has 400 searches (reads) for every 1 booking (write).
     • How to scale?
       – Scale by creating read-only copies (replicas) of the database.
       – One way is to use a caching tier in front of the database (a highly recommended first step).
       – Most RDBMSs allow replication "out of the box".
       – Master: the primary transactional database (writes).
       – Slave: a read-only copy of the master.
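
The read/write split from slides 4 and 5 can be sketched as a small router that sends writes to the primary and spreads reads across replicas. This is a minimal illustration, not a production driver: the class name, the server names, and the simple "first SQL keyword" test are all assumptions made for the example, and it expects a non-empty statement.

```python
import itertools


class ReadWriteRouter:
    """Route writes to the primary and spread reads across read-only replicas."""

    # Statements treated as writes (an assumption; real code would be stricter).
    WRITE_VERBS = {"INSERT", "UPDATE", "DELETE", "REPLACE"}

    def __init__(self, primary, replicas):
        self.primary = primary
        # Round-robin over replicas; fall back to the primary if there are none.
        self._replicas = itertools.cycle(replicas or [primary])

    def target_for(self, sql):
        """Return the server that should execute this statement."""
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb in self.WRITE_VERBS:
            return self.primary
        return next(self._replicas)


# Hypothetical server names for the reservation-system example.
router = ReadWriteRouter("primary-db", ["replica-1", "replica-2"])
print(router.target_for("SELECT * FROM flights WHERE dest = 'SGN'"))
print(router.target_for("INSERT INTO bookings (flight_id) VALUES (42)"))
```

With 400 searches per booking, almost all traffic lands on the replicas, which is why adding replicas scales this workload.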
  6. Design to Split Different Things (Y Axis)
     • When:
       – Very large data sets where relations between data are not necessary.
       – Large, complex systems where scaling engineering resources requires specialization.
     • How to use:
       – Split up actions by using verbs, or resources by using nouns, or use a mix.
       – Split both the services and the data along the lines defined by the verb/noun approach.
  7. Example: split up by verbs
     • Ecommerce system: Login, Signup, Search, Browse, View, Add-to-cart, Purchase / Buy
  8. Example: split up by nouns
     • Ecommerce system: User Information, Product SKU, Catalog, Inventory
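
One way to picture a verb-based (Y axis) split is a routing table that maps each action to its own service pool. The sketch below assumes the verbs are exposed as URL path prefixes; the paths and pool names are hypothetical and only illustrate the idea.

```python
# Hypothetical mapping from URL path prefix (the verb) to a dedicated pool.
SERVICE_POOLS = {
    "/login": "login-pool",
    "/signup": "signup-pool",
    "/search": "search-pool",
    "/browse": "browse-pool",
    "/cart": "cart-pool",
    "/purchase": "purchase-pool",
}


def pool_for(path, default="web-pool"):
    """Return the service pool that owns the given request path."""
    for prefix, pool in SERVICE_POOLS.items():
        # Match the prefix exactly or as a leading path segment.
        if path == prefix or path.startswith(prefix + "/"):
            return pool
    return default


print(pool_for("/search/hotels"))
print(pool_for("/about"))
```

Each pool can then be scaled, deployed, and owned by a separate team, and the data behind it can be split along the same lines.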
  9. Design to Split Similar Things (Z Axis)
     • When:
       – Very large, similar data sets, such as large and rapidly growing customer bases.
     • How to use:
       – Identify something about the customer:
         · Customer ID
         · Last name
         · Geography
         · Device
       – Split or partition both data and services based on that attribute.
     • Often referred to as sharding or horizontal partitioning.
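
A Z axis split on customer ID can be sketched as a function that maps each ID to one shard. This example assumes a fixed list of shards with made-up names and uses hash-then-modulo placement, which is only one of several possible schemes.

```python
import hashlib

# Hypothetical shard names; a real system would map these to database hosts.
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]


def shard_for(customer_id, shards=SHARDS):
    """Pick the shard that holds all data for this customer."""
    # Use a stable hash: Python's built-in hash() is salted per process for
    # strings, so it would not give the same answer on every server.
    digest = hashlib.sha256(str(customer_id).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


print(shard_for(1001))
print(shard_for(1002))
```

Modulo placement is simple, but changing the number of shards moves most customers to a different shard, so a lookup table or consistent hashing is often preferred when the shard count is expected to grow.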
  10. Use the Right Tool
      • Use Databases Appropriately
      • Actively Use Log Files
  11. Use Databases Appropriately
      • RDBMS
        – Example: Oracle, MySQL…
        – Storage: transactional integrity (ACID)
        – Structure: relational structure within tables
        – Advantages: minimizes data redundancy; improves transaction processing
        – Limitations: scalability (ACID); sharding or partitioning (relational structure)
      • File system
        – Example: GFS, MogileFS, Ceph
        – Storage: no transactional integrity
        – Structure: no relationships
        – Advantages: handles very large amounts of files and data
        – Limitations: conflicting reads and writes over time
  12. Use Databases Appropriately (cont.): NoSQL
      • Key-value stores
        – Example: Memcached, Tokyo Tyrant, Voldemort
        – Structure: single key-value index for data; stored in memory
        – Advantages: significant scaling and performance
        – Limitations: kind of data that can be stored
      • Extensible record stores
        – Example: Google Big Table, Cassandra
        – Structure: row and column data model
        – Advantages: rows are sharded on primary keys (automatic); columns are broken into groups (user definitions)
        – Limitations: asynchronous replication; synchronous replication
      • Document stores
        – Example: CouchDB, Amazon's SimpleDB, Yahoo's PNUTS, MongoDB,…
        – Structure: multi-indexed object model; documents can be aggregated into collections of documents
        – Advantages: can be queried based on many different attributes
        – Limitations: asynchronous replication; ACID
  13. Actively Use Log Files
      • When:
        – Put a process in place that monitors log files and forces people to take action on the issues identified.
      • How to use:
        – Use any number of monitoring tools, from custom scripts to Splunk, to watch your application logs for errors.
        – Export these errors and assign resources to identify and solve each issue.
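
At the "custom script" end of that range, a log monitor can be as small as a function that counts error lines per component so the noisiest problems get assigned first. The log format below (timestamp, level, component, message) and the sample lines are assumptions for illustration; a real script would read from the application's actual log files.

```python
import re
from collections import Counter

# Assumed line format: "<date> <time> <LEVEL> <component> <message>".
ERROR_LINE = re.compile(r"\b(ERROR|CRITICAL)\b\s+(?P<component>[\w.]+)")


def summarize_errors(lines):
    """Count ERROR/CRITICAL lines per component."""
    counts = Counter()
    for line in lines:
        match = ERROR_LINE.search(line)
        if match:
            counts[match.group("component")] += 1
    return counts


# Made-up sample log lines.
sample = [
    "2011-12-01 10:00:01 INFO checkout order placed",
    "2011-12-01 10:00:02 ERROR checkout payment gateway timeout",
    "2011-12-01 10:00:03 ERROR search index unavailable",
    "2011-12-01 10:00:04 ERROR checkout payment gateway timeout",
]

for component, count in summarize_errors(sample).most_common():
    print(component, count)
```

The resulting counts are the kind of output that can be exported to a ticketing system so each recurring error has an owner.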
  14. Use Caching Aggressively
      • Leverage Content Delivery Networks
      • Use Expires Headers
      • Leverage Page Caches
      • Utilize Application Caches
      • Make Use of Object Caches
      • Put Object Caches on Their Own "Tier"
  15. Leverage Content Delivery Networks
      • When:
        – Ensure it is cost justified, and then choose which content is most suitable.
      • How to use:
        – Most CDNs leverage DNS (Domain Name Services or Domain Name Servers) to serve content on your site's behalf.
  16. Use Expires Headers
      • When:
        – All object types need to be considered.
      • How to use:
        – Headers can be set on Web servers or through application code.
      • Example response:
          HTTP Status Code: HTTP/1.1 200 OK
          Date: Thu, 21 Oct 2010 20:03:38 GMT
          Server: Apache/2.2.9 (Fedora)
          X-Powered-By: PHP/5.2.6
          Expires: Mon, 26 Jul 2011 05:00:00 GMT
          Last-Modified: Thu, 21 Oct 2010 20:03:38 GMT
          Cache-Control: no-cache
          Vary: Accept-Encoding, User-Agent
          Transfer-Encoding: chunked
          Content-Type: text/html; charset=UTF-8
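
Setting these headers "through application code" might look like the sketch below, which builds an `Expires` value in HTTP date format together with a matching `Cache-Control: max-age`. The function name and the one-hour lifetime are assumptions for the example. The sketch sets a cacheable `Cache-Control` on purpose: when both headers are present, HTTP caches give `Cache-Control: max-age` precedence over `Expires`.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def caching_headers(max_age_seconds, now=None):
    """Build Expires and Cache-Control headers for a cacheable response."""
    # 'now' must be a UTC-aware datetime so it can be rendered with "GMT".
    now = now or datetime.now(timezone.utc)
    expires = now + timedelta(seconds=max_age_seconds)
    return {
        "Expires": format_datetime(expires, usegmt=True),
        "Cache-Control": f"public, max-age={max_age_seconds}",
    }


# Cache an object for one hour (an arbitrary lifetime for illustration).
for name, value in caching_headers(3600).items():
    print(f"{name}: {value}")
```

The same dictionary can be merged into the response headers of whatever web framework the application uses.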
  17. Leverage Page Caches
      • When:
        – Always.
      • How to use:
        – Choose a caching system and deploy it.
  18. Use Caching Aggressively
      • Leverage Content Delivery Networks
      • Use Expires Headers
      • Leverage Page Caches
      • Make Use of Object Caches
      • Put Object Caches on Their Own "Tier"
  19. Make Use of Object Caches
      • When:
        – Any time you have repetitive queries or computations.
      • How to use:
        – Select any one of the many open source or vendor-supported solutions.
        – Implement the calls in your application code.
      • Some popular caches: Memcached, Ehcache, Apache OJB, NCache
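
"Implement the calls in your application code" usually means a check-the-cache-first pattern around each repetitive query. The sketch below uses an in-process dictionary as a stand-in for a real object cache such as Memcached; the class, its method names, and the TTL default are hypothetical and do not mirror any particular client library.

```python
import time


class ObjectCache:
    """In-process stand-in for an object cache (illustrative only)."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = {}  # key -> (expiry time, value)

    def get_or_compute(self, key, compute):
        """Return the cached value, or compute, store, and return it."""
        entry = self._items.get(key)
        now = self._clock()
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: skip the expensive call
        value = compute()  # cache miss or expired entry
        self._items[key] = (now + self._ttl, value)
        return value


def load_product_from_db():
    # Placeholder for an expensive query; the returned data is made up.
    return {"sku": "ABC-123", "name": "Example product"}


cache = ObjectCache(ttl_seconds=30)
product = cache.get_or_compute("product:ABC-123", load_product_from_db)
product_again = cache.get_or_compute("product:ABC-123", load_product_from_db)
print(product is product_again)
```

Moving this cache out of the application process and onto dedicated servers is what the next slide means by giving object caches their own tier.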
  20. Put Object Caches on Their Own "Tier"
  21. Learn Aggressively
      • Take every opportunity to learn.
      • Be constantly learning from your mistakes as well as your successes.
      • Watch your customers or use A/B testing to determine what works.
      • Use postmortems to learn from incidents and problems in production.
  22. Reference
      • Scalability Rules: 50 Principles for Scaling Web Sites
      • The Art of Scalability: Scalable Web Architecture, Processes, and Organizations for the Modern Enterprise
      • http://www.codefutures.com/database-sharding/
      • http://highscalability.com/unorthodox-approach-database-design-coming-shard