The Web Scale


Tuenti architecture to withstand
1500+ million pageviews / day

  1. 1. The Web Scale: Tuenti architecture to withstand 1500+ million pageviews / day. Guillermo Pérez - Security & Backend Architecture Tech Lead
  2. 2. What is a scalable system?
  3. 3. What is scalability
  4. 4. Some Tuenti stats
  5. 5. Tuenti Stats: 13M users, REALLY ACTIVE; 50%+ active weekly; >1h browsing per DAY!
  6. 6. Tuenti Stats - Each month, over: 40,000 M pageviews; 50,000 M requests; 100 M new photos; 2,000+ TB of served photos - On peaks: 1,600 million pageviews/day; 35,000 requests/second; 6,000 million served photos/day
  7. 7. Tuenti Stats - 1200+ servers: ~500 FEs (frontends), ~300 DBs, ~100 MCs (Memcache), ~100 image servers. Others: Chat, HBase, Queues, Processors...
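As a quick sanity check on the figures in slides 6 and 7, the peak daily numbers can be converted into per-second averages. This is only arithmetic on the quoted values; the even spread of requests across frontends is a simplifying assumption, not something the slides state.

```python
# Back-of-the-envelope rates derived from the slide figures.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

peak_pageviews_per_day = 1_600_000_000  # slide 6: 1,600 million pageviews/day
peak_photos_per_day = 6_000_000_000     # slide 6: 6,000 million served photos/day
peak_requests_per_second = 35_000       # slide 6: 35,000 requests/second
frontends = 500                         # slide 7: ~500 FEs

avg_pageviews_per_second = peak_pageviews_per_day / SECONDS_PER_DAY
avg_photos_per_second = peak_photos_per_day / SECONDS_PER_DAY
# Assumption: requests are spread evenly over all frontends.
requests_per_frontend = peak_requests_per_second / frontends

print(f"{avg_pageviews_per_second:,.0f} pageviews/s averaged over a peak day")
print(f"{avg_photos_per_second:,.0f} photos/s averaged over a peak day")
print(f"{requests_per_frontend:.0f} requests/s per frontend at peak")
```

Daily averages hide the intraday peak, which is why the slide quotes the 35,000 requests/second figure separately.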
  8. 8. How to scale?
  9. 9. No silver bullet
  10. 10. Monitor. Know your tools. Evolve, iterate. Learn.
  11. 11. Monitoring - Your crystal ball! Glimpse of the future. Answer questions - Detect bottlenecks - Detect what needs to be optimized: the 90/10 rule, no premature optimization - Detect bad usages - Detect browser patterns - Detect changes, issues
  12. 12. Monitoring
  13. 13. Monitoring
  14. 14. Monitoring
  15. 15. Monitor. Know your tools. Evolve, iterate. Learn.
  16. 16. Know your tools - Stop reading blogs - Read internals documentation - Test software - Test hardware - Experiment
  17. 17. Know your tools - MySQL (InnoDB) IS fast. photos table (photo_id, user_id, ...): PK photo_id, KEY user_id vs. PK (user_id, photo_id), KEY photo_id. Usage: select * from photos where user_id = X; sorting, covering index. Even NoSQL :) Hardware limits, replication.
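One reading of the slide above is that putting user_id first in the primary key lets the common "photos of user X" query be answered, already sorted, from the primary key itself. The sketch below illustrates that idea with SQLite from the Python standard library, not MySQL: a `WITHOUT ROWID` table is used as a rough stand-in for an InnoDB clustered primary key, and the `title` column, the sample rows, and the index name are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# WITHOUT ROWID makes SQLite store rows in primary-key order, which is
# the closest SQLite analogue to an InnoDB clustered primary key.
conn.execute(
    """
    CREATE TABLE photos (
        user_id  INTEGER NOT NULL,
        photo_id INTEGER NOT NULL,
        title    TEXT,
        PRIMARY KEY (user_id, photo_id)
    ) WITHOUT ROWID
    """
)
# Secondary key for lookups by photo_id alone.
conn.execute("CREATE INDEX photos_by_photo_id ON photos (photo_id)")

conn.executemany(
    "INSERT INTO photos VALUES (?, ?, ?)",
    [(7, 3, "c"), (7, 1, "a"), (8, 2, "b"), (7, 2, "d")],
)

query = "SELECT * FROM photos WHERE user_id = ? ORDER BY photo_id"
rows = conn.execute(query, (7,)).fetchall()
# The last column of each EXPLAIN QUERY PLAN row is the readable detail.
plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query, (7,))]

print(rows)
print(plan)
```

The plan should show a search on the primary key with no separate sort step, because rows for one user are stored together in photo_id order. MySQL's `EXPLAIN` output looks different, so treat this only as an illustration of the key layout.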
  18. 18. Know your tools
  19. 19. Know your tools - Memcache: Tons of persistent TCP conns eat your RAM. UDP performance issues: single thread for UDP; multiport patch proxies. Stresses the network to the max: driver issues, configuration; variable performance with net devices.
  20. 20. Know your tools - NoSQL: Not magic! Good for heavy write loads. Good for data processing. Still needs tweaking: partitioning, schemas.
  21. 21. Monitor. Know your tools. Evolve, iterate. Learn.
  22. 22. Evolve, iterate - All architectures scale up to a certain point - Then you must rethink everything. Then, and only then! Remember premature optimization? Scale != efficient. Future is hard to predict.
  23. 23. Monitor. Know your tools. Evolve, iterate. Learn.
  24. 24. Learn Learn from: Experience Failure Others
  25. 25. Architecture
  26. 26. Architecture - Basic rules: Static: add layers (easy caching). Dynamic: move responsibility to the edges. General: decentralize, redundancy.
  27. 27. Architecture - Design for failure: support disabling; nice degradation, fallbacks; controlled launches - Test with dark launches - Think about storage operations - Be able to migrate live - Focus on your core, use CDNs
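The "support disabling", "fallbacks" and "controlled launches" points above can be sketched as a per-feature switch with a percentage rollout. Everything here (`FeatureFlags`, `with_fallback`, the feature name) is hypothetical and only illustrates the idea; the slides do not describe how Tuenti implemented it.

```python
import zlib


class FeatureFlags:
    """Hypothetical per-feature switches with percentage rollout."""

    def __init__(self):
        self._rollout = {}  # feature name -> percentage of users (0..100)

    def set_rollout(self, feature, percent):
        self._rollout[feature] = max(0, min(100, percent))

    def is_enabled(self, feature, user_id):
        percent = self._rollout.get(feature, 0)  # unknown features are off
        # Stable bucket in 0..99, so a user gets the same answer on every request.
        bucket = zlib.crc32(f"{feature}:{user_id}".encode()) % 100
        return bucket < percent


def with_fallback(flags, feature, user_id, primary, fallback):
    """Run primary if the feature is on and works, otherwise degrade."""
    if not flags.is_enabled(feature, user_id):
        return fallback()
    try:
        return primary()
    except Exception:
        return fallback()


flags = FeatureFlags()
flags.set_rollout("new_photo_feed", 100)


def broken_feed():
    raise RuntimeError("backend down")


result = with_fallback(flags, "new_photo_feed", 42, broken_feed, lambda: "cached feed")
print(result)  # cached feed
```

Setting the rollout to 0 disables the feature for everyone without a deploy, and raising it gradually gives a controlled launch.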
  28. 28. Architecture - Move work to the browser: request routing, templates, cache, prefetch - Move the rest to your FEs: data relations, consistency, privacy and access checks, live migrations, knowledge of the storage infrastructure
  29. 29. Architecture - All teams involved. Frontend: good JS, templating, caching, prefetching. Backend: data design, parallelization, optimizations. Systems: iron benchmarks, tuning, networking.
  30. 30. Dynamic site example
  31. 31. Scaling a website - Setup: 1 server - Bottleneck: CPU - Solution: Add frontends - Changes: Share sessions
  32. 32. Scaling a website - Setup: N frontends, 1 DB - Bottleneck: DB reads - Solution: Add DB slaves - Changes: Split reads to slaves or DB proxy
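Splitting reads to slaves, as in the step above, can be sketched as a small router that sends writes to the master and rotates reads over the slaves. The class name, the host names, and the "starts with SELECT" rule are assumptions made for illustration; a real setup would also need transaction and replication-lag handling.

```python
import itertools


class ReadWriteRouter:
    """Hypothetical router: writes go to the master, reads rotate over slaves."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves) if slaves else None

    def host_for(self, sql):
        words = sql.split()
        is_read = bool(words) and words[0].upper() == "SELECT"
        if is_read and self._slaves is not None:
            return next(self._slaves)
        # Writes, and reads when no slave is configured, go to the master.
        return self.master


router = ReadWriteRouter("db-master", ["db-slave-1", "db-slave-2"])
print(router.host_for("SELECT * FROM photos WHERE user_id = 7"))            # db-slave-1
print(router.host_for("UPDATE photos SET title = 'x' WHERE photo_id = 1"))  # db-master
print(router.host_for("SELECT 1"))                                          # db-slave-2
```

A read issued right after a write may hit a slave that has not replicated the change yet, so code that needs to read its own writes would have to go to the master.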
  33. 33. Scaling a website - Setup: N frontends, 1 DB master + N slaves - Bottleneck: Limited # of slaves, so DB reads - Solution: Chain replication / add cache layer - Changes: Big ones! Adding some caches in certain places is easy, but for a dynamic app, using Memcache as storage makes your DB non-relational.
  34. 34. Scaling a website - Setup: N FEs, 1 DB master + N slaves, caches - Bottleneck: DB writes - Solution: Split tables into DB clusters - Changes: Add some DB abstraction
  35. 35. Scaling a website - Setup: N FEs, N DB clusters, caches - Bottleneck: DB writes on a certain table - Solution: Partition tables - Changes: DB abstraction and big changes. DB no longer relational, more key-based. Partition key limits queries. Denormalization, duplication.
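The remark that the partition key limits queries can be made concrete with a minimal sketch. It assumes a photos table partitioned by user_id with a simple modulo scheme; the partition count, the table naming, and the function names are invented for the example.

```python
NUM_PARTITIONS = 4  # hypothetical; real deployments choose this carefully


def partition_for(user_id):
    """Map the partition key (user_id) to a partition number."""
    return user_id % NUM_PARTITIONS


def table_for(user_id):
    return f"photos_{partition_for(user_id):02d}"


def tables_for_photo_lookup(photo_id, user_id=None):
    """Tables that must be queried to find one photo."""
    if user_id is not None:
        return [table_for(user_id)]
    # Without the partition key the query has to fan out to every partition.
    return [f"photos_{p:02d}" for p in range(NUM_PARTITIONS)]


print(table_for(10))                            # photos_02
print(tables_for_photo_lookup(99, user_id=10))  # one table
print(tables_for_photo_lookup(99))              # all four tables
```

Avoiding that fan-out is one reason for the denormalization the slide mentions: a second mapping keyed by photo_id can store the owning user_id.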
  36. 36. Scaling a website - Setup: N FEs, N partitioned DBs, caches - Bottleneck: Disk space, DB cost - Solution: Archive tables - Changes: DB abstraction + migration scripts
  37. 37. Scaling a website - Setup: N FEs, N partition+archive DBs, cache - Bottleneck: Internal network traffic - Solution: 2-level caches, split services, cache affinity - Changes: Cache abstraction, browsers
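A two-level cache, as suggested above for reducing internal network traffic, can be sketched as an in-process dictionary in front of a shared cache. The class and its counters are hypothetical, and the shared cache is a plain dict standing in for a Memcache cluster.

```python
class TwoLevelCache:
    """Hypothetical local (in-process) cache in front of a shared cache."""

    def __init__(self, shared):
        self.local = {}        # level 1: per-frontend memory, no network
        self.shared = shared   # level 2: stand-in for a Memcache cluster
        self.shared_reads = 0  # counts simulated network round trips

    def get(self, key, load):
        if key in self.local:
            return self.local[key]
        self.shared_reads += 1
        if key in self.shared:
            value = self.shared[key]
        else:
            value = load(key)  # fall through to the database
            self.shared[key] = value
        self.local[key] = value
        return value

    def invalidate(self, key):
        self.local.pop(key, None)
        self.shared.pop(key, None)


shared = {}
cache = TwoLevelCache(shared)
loads = []


def load_profile(key):
    loads.append(key)
    return {"id": key}


cache.get("user:7", load_profile)
cache.get("user:7", load_profile)
print(cache.shared_reads, len(loads))  # 1 1
```

The second lookup never leaves the process. The cost is that `invalidate` only clears the local level of the frontend that calls it, so other frontends can serve stale data until their local entries expire or are cleared.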
  38. 38. Scaling a website - Setup: N FEs, N partition+archive DBs, multilayered cache, services - Bottleneck: Datacenter - Solution: Split services, partition users' data - Changes: Big ones! Greater replication lags, inconsistencies
  39. 39. The Tuenti Backend Framework
  40. 40. Backend Framework - Our mission: Provide an easy-to-use, productive, easy-to-debug, testable, fast, extensible, customizable, deterministic, reusable, instrumented (stats) framework and tools to ease developers' daily work and manage the infrastructure.
  41. 41. Backend Framework - From request routing to storage - Simple layers, clean responsibilities - Clean, organized codebase - Using: convention over configuration, configuration over coding - Queuing system for async execution - Gathering stats from all levels
  42. 42. Backend Framework - Request routing: multiple entry points; fast request parsers route to Agents; data-centric agents; Printers
  43. 43. Backend Framework - Domain API: Expose top-level business actions. Clean, semantic API. No state, no magic, all data in params. Check privacy (the right place!)
  44. 44. Backend Framework - Domain Backend: Implement public/internal business actions. Clean, semantic API. No state, no magic, all data in params. Coordinate transactions. No privacy.
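The split between the Domain API (which checks privacy) and the Domain Backend (which does not) can be sketched as two stateless layers. The class names, the friends-only privacy rule, and the in-memory data are assumptions for illustration; the framework in the slides is written in PHP and its actual interfaces are not shown.

```python
class PermissionDenied(Exception):
    pass


# Hypothetical in-memory data standing in for real storage.
FRIENDS = {1: {2}, 2: {1}, 3: set()}
PHOTOS = {1: ["beach.jpg"], 2: ["party.jpg"], 3: []}


class PhotoBackend:
    """Domain backend: business logic only, no privacy checks, no state."""

    @staticmethod
    def list_photos(owner_id):
        return list(PHOTOS.get(owner_id, []))


class PhotoApi:
    """Domain API: the one place where privacy is enforced."""

    @staticmethod
    def list_photos(viewer_id, owner_id):
        allowed = viewer_id == owner_id or viewer_id in FRIENDS.get(owner_id, set())
        if not allowed:
            raise PermissionDenied(f"user {viewer_id} cannot see photos of {owner_id}")
        return PhotoBackend.list_photos(owner_id)


print(PhotoApi.list_photos(viewer_id=2, owner_id=1))  # ['beach.jpg']
```

Because all data arrives as parameters, both layers are easy to test in isolation, and internal callers that are already authorized can use the backend directly.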
  45. 45. Backend Framework - Domain Storages (ORM-like): Configure storage access for a table: fields, validation, partitioning, primary key, caching techniques, custom queries. Provide access to storage via standard APIs: CRUD actions, Cached Lists, Cached Queries + Custom Data container
  46. 46. Backend Framework - Storage Strategies: CRUD, Cached Lists, Cached Queries. CUD Observers for custom actions.
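A CRUD strategy with a read cache and observers on create, update and delete (CUD) might look roughly like the sketch below. The class, its method names, and the observer signature are all hypothetical; dictionaries stand in for the database table and for Memcache.

```python
class CachedStorage:
    """Hypothetical CRUD storage with a read cache and CUD observers."""

    def __init__(self):
        self._rows = {}       # stand-in for the database table
        self._cache = {}      # stand-in for Memcache
        self._observers = []  # callables invoked as observer(action, key, row)

    def add_observer(self, observer):
        self._observers.append(observer)

    def _notify(self, action, key, row):
        self._cache.pop(key, None)  # every write invalidates the cached row
        for observer in self._observers:
            observer(action, key, row)

    def create(self, key, row):
        self._rows[key] = row
        self._notify("create", key, row)

    def read(self, key):
        if key not in self._cache:
            self._cache[key] = self._rows.get(key)
        return self._cache[key]

    def update(self, key, row):
        self._rows[key] = row
        self._notify("update", key, row)

    def delete(self, key):
        row = self._rows.pop(key, None)
        self._notify("delete", key, row)


events = []
photos = CachedStorage()
photos.add_observer(lambda action, key, row: events.append((action, key)))
photos.create(1, {"title": "beach"})
photos.read(1)
photos.update(1, {"title": "sunset"})
print(photos.read(1), events)
```

Observers give custom actions, such as refreshing a cached list, a single hook that runs on every write without touching the CRUD code itself.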
  47. 47. Backend Framework - Storage Service: Provides access to the different storage services: mysql, memcache, hbase... Coordinates transactions. Abstracts the infrastructure complexities: partitioning, read/write, weights, hosts. Handles transactions.
  48. 48. Backend Framework - Storage Services (concrete ones): Abstract the infrastructure complexities: partitioning, read/write, weights, hosts. API close to the real one: Memcache: set, get, cas... MySQL: insert, select, update...
  49. 49. Backend Framework - Storage Drivers (concrete ones): Read config. Manage PHP drivers. Enhance API.
  50. 50. Love challenges?
  51. 51. We are hiring! Stay tuned for our d...An Tuenti Challenge 2!
  52. 52. ? Thanks! Guillermo Pérez - Security & Backend Architecture Tech Lead Images Creative Commons from flickr:heydanielle, eschipul, deanfotos66, nrbelex, mikolski, fdecomite, guldfisken