Scalable architecture: the soccerway.com example

"Scalable architecture: the soccerway.com example" - Adam Brodziak - Global Sports Media b.v.

Published in: Technology

  1. Scalable architecture. By Adam Brodziak, Global Sports Media b.v.
  2. Abstract (Adam Brodziak): An overview of modern web application architecture, from hardware infrastructure, through PHP/SQL code, to HTML/CSS markup distribution. All of this spiced up with caching, load balancing and a CDN.
  3. Who is this guy? Lead developer at Global Sports Media; GSM collects and processes sports data; GSM owns the soccerway.com portal; Linux user; interested in frameworks and design patterns; Semantic Web enthusiast; football (soccer) fan.
  4. Topics: The Challenge; Infrastructure; Code; Cache; CDN.
  5. Topics: The Challenge; Infrastructure; Code; Cache; CDN.
  6. Raw numbers: 7 million visits / month; 52 million pageviews / month; 1 billion requests / month; 6 TB of traffic / month; 300k users at peak time; quite a few clients using the same hardware.
  7. Not so much, but... 700 leagues; livescores; game events; match statistics; rankings; editorials.
  8. Traffic growth
  9. The Challenge: loads of data to process (scores, events, stats), in real time (livescores); growing number of visitors; 13K hits/sec at peak time.
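To put the peak figure in context, the monthly request count from slide 6 can be turned into an average rate and compared with the 13K hits/sec from slide 9. This is a back-of-the-envelope sketch: it assumes a 30-day month and treats "requests" and "hits" as the same unit, neither of which the slides state.

```python
# Assumption: a 30-day month; the slides do not say how the month was counted.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60

requests_per_month = 1_000_000_000  # "1 billion requests / month" (slide 6)
peak_hits_per_sec = 13_000          # "13K hits/sec at peak time" (slide 9)

# Average request rate if traffic were spread evenly over the month.
average_hits_per_sec = requests_per_month / SECONDS_PER_MONTH

# How much higher the peak is than that even spread.
peak_to_average = peak_hits_per_sec / average_hits_per_sec

print(f"average: {average_hits_per_sec:.0f} hits/sec")
print(f"peak is about {peak_to_average:.0f}x the average")
```

Under these assumptions the average is roughly 386 hits/sec, so the peak is about 34 times the average. Capacity therefore has to be planned around match-time peaks, not monthly totals.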
  10. 10 servers to run it all
  11. Topics: The Challenge; Infrastructure; Code; Cache; CDN.
  12. It starts with one
  13. Load balancing
  14. Load balancing caveats: don't rely on the local filesystem (temporary files, sessions, logs); avoid assuming an exclusive, single cache (APC, Zend Cache); use distributed session storage (Memcache, database); encapsulate all of the above.
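The "encapsulate" advice on slide 14 can be sketched as a small storage interface that application code depends on, so the backend can move from local files to a shared service without touching callers. The sketch is in Python for brevity. `SessionStore` and `SharedSessionStore` are hypothetical names, and a plain dict stands in for the shared backend (memcached or a database in the talk).

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class SessionStore(ABC):
    """Interface the application codes against (hypothetical name)."""

    @abstractmethod
    def load(self, session_id: str) -> Optional[dict]:
        ...

    @abstractmethod
    def save(self, session_id: str, data: dict) -> None:
        ...


class SharedSessionStore(SessionStore):
    """Stand-in for a shared backend such as memcached or a database.

    The dict passed in plays the role of the remote service, so every web
    node that is given the same dict sees the same sessions.
    """

    def __init__(self, backend: Dict[str, dict]) -> None:
        self._backend = backend

    def load(self, session_id: str) -> Optional[dict]:
        data = self._backend.get(session_id)
        # Return a copy so callers cannot mutate stored state by accident.
        return dict(data) if data is not None else None

    def save(self, session_id: str, data: dict) -> None:
        self._backend[session_id] = dict(data)


# Two "web nodes" behind a load balancer sharing one session backend.
shared_backend: Dict[str, dict] = {}
node_a = SharedSessionStore(shared_backend)
node_b = SharedSessionStore(shared_backend)

node_a.save("abc123", {"logged": True})
print(node_b.load("abc123"))  # the session written via node A is visible on node B
```

Because callers only see `load` and `save`, the same pattern applies to temporary files and logs: anything node-local goes behind an interface whose implementation can be made shared.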
  15. Separate database server
  16. DB replication
  17. Replication caveats: writes only on the master; reads from slaves; data consistency; replication lag. Don't do: $master->query('UPDATE session SET logged = 1'); $slave->query('SELECT logged FROM session');
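One common way around the anti-pattern on slide 17 is to send reads to the master for a short window after a write, so a read never hits a slave that has not yet applied the change. The sketch below illustrates that idea and is not the approach the talk prescribes. `ReplicatedDb`, `FakeConnection` and the `max_lag_seconds` window are hypothetical, and a real application would track the last write per user session, not per process.

```python
import time


class FakeConnection:
    """Hypothetical stand-in for a database connection; records queries."""

    def __init__(self, name):
        self.name = name
        self.queries = []

    def query(self, sql):
        self.queries.append(sql)
        return self.name  # report which server handled the query


class ReplicatedDb:
    """Routes writes to the master and reads to a slave, except shortly
    after a write, when reads also go to the master."""

    def __init__(self, master, slave, max_lag_seconds=1.0, clock=time.monotonic):
        self._master = master
        self._slave = slave
        self._max_lag = max_lag_seconds  # assumed upper bound on replication lag
        self._clock = clock
        self._last_write = None

    def write(self, sql):
        self._last_write = self._clock()
        return self._master.query(sql)

    def read(self, sql):
        recently_wrote = (
            self._last_write is not None
            and self._clock() - self._last_write < self._max_lag
        )
        target = self._master if recently_wrote else self._slave
        return target.query(sql)


# A controllable clock keeps the example deterministic.
now = [0.0]
db = ReplicatedDb(FakeConnection("master"), FakeConnection("slave"),
                  max_lag_seconds=1.0, clock=lambda: now[0])

before_write = db.read("SELECT logged FROM session")  # no recent write: slave
db.write("UPDATE session SET logged = 1")
right_after = db.read("SELECT logged FROM session")   # inside the window: master
now[0] = 5.0
much_later = db.read("SELECT logged FROM session")    # window elapsed: slave
print(before_write, right_after, much_later)
```

The fixed window is only a heuristic. If replication lag can exceed it, stale reads are still possible, so the window has to be chosen from the lag actually observed.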
  18. The whole picture
  19. Topics: The Challenge; Infrastructure; Code; Cache; CDN.
  20. PHP is slow! Yes, but it does not matter: database access is slower; cache over the network is slower; disk access is slower; HTTP requests are slower; web service calls are slower. Discover bottlenecks before blaming PHP.
  21. It's about architecture: heavy tasks in the background (cron, Gearman); pregenerate stuff; move some code to SQL (calculations in queries, stored procedures, triggers); C/C++ or Java for heavy computation; use PHP to glue it together.
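The "heavy tasks in the background" and "pregenerate" points on slide 21 can be illustrated with a tiny job queue: the request path only enqueues work, and a worker renders results ahead of time. This is a single-process Python sketch. The standard-library `queue` and `threading` modules stand in for cron or Gearman, and `render_standings` is a hypothetical placeholder for an expensive task.

```python
import queue
import threading

jobs = queue.Queue()   # stand-in for a job server such as Gearman
pregenerated = {}      # league_id -> rendered HTML, ready to serve


def render_standings(league_id):
    """Hypothetical expensive rendering step."""
    return f"<table>standings for league {league_id}</table>"


def worker():
    """Consume jobs until a None sentinel arrives."""
    while True:
        league_id = jobs.get()
        if league_id is None:
            jobs.task_done()
            break
        pregenerated[league_id] = render_standings(league_id)
        jobs.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()

# The request path only enqueues; it never renders.
for league_id in (1, 2, 3):
    jobs.put(league_id)
jobs.put(None)  # tell the worker to stop

jobs.join()     # wait until every queued job has been processed
thread.join()
print(sorted(pregenerated))
```

Page requests then read from `pregenerated` (or, in production, from a cache or static file) and never pay the rendering cost themselves.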
  22. PHP Frameworks: hundreds of others. Which one to choose?
  23. Framework? Think again! Raw performance matters; support for master-slave replication; multiple layers of cache; working with accelerators (HipHop!); beware of bottlenecks (e.g. a core part of the framework is slow); designed to scale.
  24. Topics: The Challenge; Infrastructure; Code; Cache; CDN.
  25. Cache is everywhere: CPU (L1, L2); disk buffer; Linux filesystem; MySQL; PHP (APC); Smarty; HTTP proxy; browser cache.
  26. Where to cache?
  27. Memory is cheap: pre-generate stuff; store results in memory (APC, memcached); app config in memory (APC with stat=off); increase RAM for MySQL; disk is the new tape.
  28. Memcached to the rescue! Dead simple; key-value; distributed storage pool; automatic invalidation after X sec; no garbage collection invoked; stores arrays, objects, simple values; easy integration.
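The key-value model with "automatic invalidation after X sec" from slide 28 is usually used in a cache-aside pattern: look the key up, and only on a miss compute the value and store it with a time-to-live. The Python sketch below uses an in-process `TtlCache` class as a stand-in for memcached. The class, the `cached` helper and the key name are hypothetical, not a real memcached client API.

```python
import time


class TtlCache:
    """In-process stand-in for a memcached-style key-value store."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired: behave as if it was never stored
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._items[key] = (self._clock() + ttl_seconds, value)


def cached(cache, key, ttl_seconds, compute):
    """Cache-aside lookup. A None result is treated as a miss, so this
    helper cannot cache None values."""
    value = cache.get(key)
    if value is None:
        value = compute()
        cache.set(key, value, ttl_seconds)
    return value


# A controllable clock keeps the example deterministic.
now = [0.0]
cache = TtlCache(clock=lambda: now[0])
calls = []


def load_ranking():
    calls.append(1)  # count how often the "expensive" query runs
    return ["Team A", "Team B"]


cached(cache, "ranking:1", 30, load_ranking)  # miss: computes and stores
cached(cache, "ranking:1", 30, load_ranking)  # hit: served from the cache
now[0] = 31.0
cached(cache, "ranking:1", 30, load_ranking)  # expired: computes again
print(len(calls))  # 2
```

With a time-to-live there is no explicit invalidation code. The trade-off is that readers may see data up to `ttl_seconds` old, so livescores would need a much shorter TTL than editorials.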
  29. Topics: The Challenge; Infrastructure; Code; Cache; CDN.
  30. Reverse proxy: first line of cache; returns content if the resource is up to date; works at the HTTP level; can be integrated into existing infrastructure; can do load balancing; in-memory cache storage; Squid, Nginx, Varnish.
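"Returns content if the resource is up to date" can be illustrated with HTTP validators: the proxy keeps the body together with its ETag and asks the origin whether that ETag is still current. The Python sketch below models the exchange with plain functions and does no real networking. `origin` and `RevalidatingProxy` are hypothetical, and real proxies such as Squid, Nginx or Varnish also apply freshness lifetimes so they can often skip the origin round trip entirely.

```python
import hashlib


def origin(path, if_none_match=None):
    """Hypothetical origin server: returns (status, etag, body)."""
    body = f"<html>page for {path}</html>"
    etag = hashlib.sha256(body.encode()).hexdigest()[:16]
    if if_none_match == etag:
        return 304, etag, None  # Not Modified: no body is sent
    return 200, etag, body


class RevalidatingProxy:
    """Caches bodies by path and revalidates them with the origin."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._store = {}  # path -> (etag, body)
        self.bodies_transferred = 0

    def get(self, path):
        cached = self._store.get(path)
        etag = cached[0] if cached else None
        status, new_etag, body = self._fetch(path, if_none_match=etag)
        if status == 304:
            return cached[1]  # the stored copy is still current
        self.bodies_transferred += 1
        self._store[path] = (new_etag, body)
        return body


proxy = RevalidatingProxy(origin)
first = proxy.get("/matches/today")
second = proxy.get("/matches/today")
print(first == second, proxy.bodies_transferred)  # True 1
```

The second request still reaches the origin, but only a small 304 response comes back, which is how a proxy cuts traffic volume even for content it must keep checking.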
  31. Content Delivery Network: a network of servers, worldwide; automatic load balancing; fast access (low ping time); data redundancy for free; ideal for static resources, but not only; a must-have for worldwide websites.
  32. CDN as reverse proxy: HTTP request / response chain; embraces REST architecture; requests are distributed; reduces latency; lowers traffic volume; increases availability; e.g. Akamai Edge Suite.
  33. CDN at soccerway.com: all of the content is served via the CDN (images, CSS, JS; generated HTML; JSON for Ajax); 90% of traffic via CDN; origin requests only from Europe; site online even if servers are down; can't live without it ;)
  34. Thank you for listening. Questions?
  35. Interested? Contact me: adam@globalsportsmedia.com, www.goldenline.pl/adam-brodziak, www.linkedin.com/in/adambrodziak. We're hiring: web developers; football / sport fans.
