Scaling

Scaling: a naïve approach

A look at how to scale an existing monolithic system, and how companies such as Disqus and Eventbrite have done it.

Published in: Technology

Scaling

  1. 1. SCALING Òscar Vilaplana @grimborg http://oscarvilaplana.cat
  2. 2. WHAT’S THIS ABOUT? People Technology Tools
  3. 3. PEOPLE Care Focus Automate & Test. Shared brain Finish & DRY.
  4. 4. TECH Design to clone Separate pieces API Offload everything Measure
  5. 5. VIRTUAL QUEUE Queue Instance Queue Instance Queue Instance Queue Instance
  6. 6. VIRTUAL QUEUE Queue Instance Queue Instance Queue Instance
  7. 7. VIRTUAL QUEUE Queue Instance Queue Instance Queue Instance
  8. 8. VIRTUAL QUEUE Queue Instance Queue Instance Queue Instance Queue Instance
  9. 9. TECH • Design to clone • Separate pieces • API • Offload everything • Measure
  10. 10. TYPES OF TASKS • Realtime • ASAP • When you have time } Async!
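The three task types above can be sketched as a small router. This is a minimal illustration, not the speaker's implementation: it assumes that realtime work runs inline while ASAP and when-you-have-time work is queued and drained later in urgency order. The names `Urgency`, `TaskRouter`, `submit`, and `run_pending` are hypothetical.

```python
import heapq
import itertools
from enum import IntEnum
from typing import Callable, List, Tuple


class Urgency(IntEnum):
    """Hypothetical labels for the three task types on the slide."""
    REALTIME = 0
    ASAP = 1
    WHEN_IDLE = 2


class TaskRouter:
    """Runs realtime tasks inline and queues the rest (assumed reading)."""

    def __init__(self) -> None:
        # Heap entries are (urgency, sequence number, task); the sequence
        # number keeps submission order within the same urgency.
        self._heap: List[Tuple[int, int, Callable[[], object]]] = []
        self._counter = itertools.count()

    def submit(self, urgency: Urgency, task: Callable[[], object]) -> object:
        """Run the task now if it is realtime, otherwise enqueue it."""
        if urgency == Urgency.REALTIME:
            return task()
        heapq.heappush(self._heap, (int(urgency), next(self._counter), task))
        return None

    def run_pending(self) -> List[object]:
        """Drain queued tasks, most urgent first, and return their results."""
        results = []
        while self._heap:
            _, _, task = heapq.heappop(self._heap)
            results.append(task())
        return results
```

In a real system the queued tasks would go to a broker and be run by workers; the in-process heap only stands in for that.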
  11. 11. INSTAGRAM’S FEED • Redis queue per follower. • New media: push to queues • Small chained tasks
  12. 12. INSTAGRAM’S FEED harro wouter orestis siebejan oscar Schedule next batch
  13. 13. SMALL TASKS • 10k followers per task • < 2s • Finer-grained load balancing • Lower penalty of failure/reload
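The fan-out in small batches described above might look roughly like this. It is a sketch under stated assumptions: `batches`, `fan_out`, and the `push` callback are hypothetical names, and a plain loop stands in for the chain of queued tasks in which each task schedules the next batch.

```python
from typing import Callable, Iterator, Sequence

BATCH_SIZE = 10_000  # followers handled per task, as on the slide


def batches(followers: Sequence[str], size: int = BATCH_SIZE) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` followers."""
    for start in range(0, len(followers), size):
        yield followers[start:start + size]


def fan_out(media_id: str, followers: Sequence[str],
            push: Callable[[str, str], None], size: int = BATCH_SIZE) -> int:
    """Push media_id to every follower's queue, one batch at a time.

    Each iteration of the outer loop represents one small task. If a task
    fails, only its own batch needs to be retried.
    Returns the number of batches processed.
    """
    count = 0
    for batch in batches(followers, size):
        for follower in batch:
            push(follower, media_id)
        count += 1
    return count
```

For example, `fan_out("m1", followers, push)` with 25,000 followers would run three such tasks. Smaller batches give the load balancer more, shorter units of work to spread across workers.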
  14. 14. CELERY: REDIS • Good: Fast • Bad: • Polling for task distribution • Messy non-synchronous replication • Memory limits task capacity
  15. 15. CELERY: BEANSTALK • Good: • Fast • Push to consumers • Writes to disk • Bad: • No replication • Only useful for Celery
  16. 16. CELERY: RABBITMQ • Fast • Writes to disk • Low-maintenance synchronous replication • Excellent Celery compatibility • Supports other use cases
  17. 17. RESERVATIONS • UI • Room locking • Room availability • Registration manager • Email, PDF invoice • Payment • Login • …
  18. 18. WE DON’T DO THIS
      def do_everything(request):
          hotel_id = request.GET.hotel_id
          room_number = request.GET.room_number
          # Hold the room mutex for the whole request.
          with room_mutex(hotel_id, room_number):
              room = (session.query(Room)
                      .filter(Room.hotel_id == hotel_id)
                      .filter(Room.room_number == room_number).one())
              if not room.available:
                  return Response("Room not available", template=room_template)
              reservation = Reservation(client=request.client, room=room)
              session.add(reservation)
              room.available = False
              price = ...  # price calculation (elided on the slide)
              payment = Payment(reservation=reservation, price=price)
              session.add(payment)
              session.commit()
              url = payment.get_psp_url()
              return Redirect(url)
  19. 19. BUT WE DO THIS • Frontend UI • Locking rooms • Calculating room availability • Temporarily locking rooms • Payment processing • Mail • PDF invoice generation
  20. 20. BUT WE CAN SCALE!
  21. 21. SCALE DB: HARD • Slaves • Master- Master? • Sharding?
  22. 22. SCALING
  23. 23. MINOR SCALE
  24. 24. MAJOR SCALE
  25. 25. FRONTEND Everything Frontend External payment providers User Everything Frontend Master Read slaves
  26. 26. SPLIT • Responsibility • Stateful/stateless • Type of system
  27. 27. TYPES OF SYSTEMS • Unique (mutex, datastore) • Multiple
  28. 28. TYPES OF TASKS • Realtime • ASAP • When you have time } Async!
  29. 29. SPLIT THIS Everything Frontend External payment providers User Everything Frontend Master Read slaves
  30. 30. AUTONOMOUS SYSTEMS Payment External payment providers Locking Invoice PDF Mailer UI Reservations Manager User Session Storage Datawarehouse Reporting Configuration Payout
  31. 31. CLONABILITY
  32. 32. CLONABILITY
  33. 33. CLONABILITY Frontend
  34. 34. CLONABILITY Everything Frontend External payment providers User Everything Frontend Master Read slaves
  35. 35. WHAT’S IN AN EASY STEP As little change as possible. Reuse. Unintrusive. Measure. Go in the right direction.
  36. 36. SMALL STEPS PROBLEMS? ! Oversells Configuration Reporting Payout Everything Frontend Everything Frontend Everything Frontend Everything Frontend Everything Frontend Everything Frontend
  37. 37. SMALL STEPS PROBLEMS? ! Oversells Configuration Reporting Payout Sessions Room Availability Lock Read Everything Frontend Everything Frontend Everything Frontend Everything Frontend Everything Frontend Everything Frontend
  38. 38. ISOLATED SYSTEM Best technology Decoupled API Testable
  39. 39. SMALL STEPS PROBLEMS? ! Oversells Configuration Reporting Payout Sessions Everything Frontend Config Backend Settings Everything Frontend Everything Frontend Everything Frontend Everything Frontend Everything Frontend Everything Frontend
  40. 40. INITIAL SYSTEM Everything Frontend
  41. 41. INITIAL SYSTEM (MODIFIED) Everything Frontend Sales Sync
  42. 42. INITIAL SYSTEM (MODIFIED) Sales Backend
  43. 43. SMALL STEPS PROBLEMS? ! Oversells Configuration Reporting Payout Sessions Everything Frontend Sales Backend Sales Main DB Everything Frontend Everything Frontend Everything Frontend Everything Frontend Everything Frontend Everything Frontend
  44. 44. SMALL STEPS PROBLEMS? ! Oversells Configuration Reporting Payout Sessions Session Storage Everything Frontend Everything Frontend Everything Frontend Everything Frontend Everything Frontend Everything Frontend
  45. 45. WHEN? • Difficult. • Measure everything. • Find patterns. • Define thresholds. • Design: address as risk. • Don’t overengineer, but don’t ignore.
  46. 46. EVENTBRITE • 2012: $600M ticket sales • Accumulated: $1B
  47. 47. TECHNOLOGY • Monitoring: nagios, ganglia, pingdom • Email: offloaded to StrongMail • Load-balanced read slave pool • Feature flags • Automated server configuration and release with Puppet and Jenkins
  48. 48. TECHNOLOGY • Feature flags • Develop on Vagrant • Celery + RabbitMQ • Virtual customer queue • Big data for reporting, fraud, spam, event recommendations
  49. 49. TECHNOLOGY • Hadoop • Cassandra • HBase • Hive • Separated into independent services
  50. 50. TIPS • Instrument and monitor everything • Lean
  51. 51. HOW BIG? • 2Gb/day database transactions • 3.5Tb/day social data analyzed • 15Gb/day logs
  52. 52. ORDER PROCESSOR • Pub/sub queue with Cassandra and Zookeeper
  53. 53. PUBLISHING Publisher Get queue lock+last batch id Create new batch “process orders 10, 11, 12” Store batch id, release lock
  54. 54. SUBSCRIBING Subscriber Get my latest processed batch id Store result Update my latest processed batch id
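The publishing and subscribing steps above can be sketched in memory. This is only an illustration of the batch-id protocol: a `threading.Lock` stands in for the Zookeeper lock and a dict stands in for Cassandra, and every class and method name here (`BatchQueue`, `Subscriber`, `publish`, `poll`) is an assumption, not Eventbrite's code.

```python
import threading
from typing import Dict, List, Optional, Tuple


class BatchQueue:
    """In-memory stand-in for the shared queue storage (assumption)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()  # stands in for the queue lock
        self._last_batch_id = 0
        self._batches: Dict[int, List[int]] = {}

    def publish(self, order_ids: List[int]) -> int:
        """Get the lock and last batch id, create a batch, store its id."""
        with self._lock:
            batch_id = self._last_batch_id + 1
            self._batches[batch_id] = list(order_ids)
            self._last_batch_id = batch_id
            return batch_id  # the lock is released on leaving the block

    def next_batch_after(self, batch_id: int) -> Optional[int]:
        """Return the id following batch_id, or None if nothing is new."""
        candidate = batch_id + 1
        return candidate if candidate in self._batches else None

    def orders(self, batch_id: int) -> List[int]:
        return list(self._batches[batch_id])


class Subscriber:
    """Tracks its own latest processed batch id, independent of others."""

    def __init__(self, queue: BatchQueue) -> None:
        self.queue = queue
        self.last_processed = 0
        self.results: List[Tuple[int, List[int]]] = []

    def poll(self) -> int:
        """Process every unseen batch; return how many were processed."""
        processed = 0
        while True:
            batch_id = self.queue.next_batch_after(self.last_processed)
            if batch_id is None:
                return processed
            self.results.append((batch_id, self.queue.orders(batch_id)))
            self.last_processed = batch_id  # update only after storing
            processed += 1
```

Because each subscriber keeps its own position, a slow or restarted subscriber can catch up without affecting the publisher or other subscribers.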
  55. 55. SCALING STORAGE • Move to NoSQL • Aggressively move queries to slaves • Different indexes per slave • Better hardware • Optimized tables for large, highly utilized datasets
  56. 56. EMAIL ADDRESSES • Users have many email addresses. • Lookup by email, join to users table
  57. 57. FIRST ATTEMPT
      CREATE TABLE `user_emails` (
        `id` int NOT NULL AUTO_INCREMENT,
        `email_address` varchar(255) NOT NULL,
        -- ... other columns about the user
        `user_id` int, -- foreign key to users
        PRIMARY KEY (`id`), -- MySQL requires the AUTO_INCREMENT column to be a key
        KEY (`email_address`)
      ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
  58. 58. FIRST ATTEMPT
  59. 59. LOOKUP
  60. 60. CAN IT BE IMPROVED?
  61. 61. INDEX VS PK • InnoDB: B+trees, O(log n) • Known user id: index on email not needed. • Small win on lookup: O(1) • Big win on not storing the index.
  62. 62. INNODB INDEXES
  63. 63. HASH TABLE
  64. 64. DISQUS • >165K messages per second • <10ms latency • 1.3B unique visitors • 10B page views • 500M users in discussions • 3M communities • 25M comments
  65. 65. ORIGINAL REALTIME BACKEND • Python + gevent • NginxPushStream • Network IO: great • CPU: choking at peaks • <15ms latency
  66. 66. CURRENT REALTIME BACKEND • Go • Handles all users • Normal load: 3200 connections/machine/sec • <10ms latency • Only 10%-20% CPU
  67. 67. CURRENT REALTIME BACKEND Workers Subscribed to results Push result to user NginxPushStream
  68. 68. TESTING • Test with real traffic • Measure everything
  69. 69. LESSONS • Do work once, distribute results. • Most likely to fail: your code. Don’t reinvent. Keep team small. • End-to-end ACKs are expensive. Avoid. • Understand use cases when load testing. • Tune architecture to scale.
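The first lesson above, "do work once, distribute results", can be illustrated with a small in-process sketch. It assumes a hypothetical `ResultFanout` class and plain callbacks in place of real connections; it is not how Disqus implements its backend.

```python
from typing import Callable, Dict, List


class ResultFanout:
    """Computes a result once per message and hands it to every subscriber."""

    def __init__(self, compute: Callable[[str], str]) -> None:
        self._compute = compute
        self._subscribers: Dict[str, List[Callable[[str], None]]] = {}
        self.compute_calls = 0  # counts how often the expensive step ran

    def subscribe(self, topic: str, callback: Callable[[str], None]) -> None:
        """Register a callback that receives results for the topic."""
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic: str, raw: str) -> int:
        """Run the expensive step once, then deliver the same result to all.

        Returns the number of subscribers that received the result.
        """
        result = self._compute(raw)
        self.compute_calls += 1
        listeners = self._subscribers.get(topic, [])
        for callback in listeners:
            callback(result)
        return len(listeners)
```

The cost of the expensive step stays constant as subscribers are added; only the cheap delivery loop grows with the audience.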
  70. 70. LEARN MORE • Instagram • Braintree • highscalability.com • VelocityConf (youtube, nov 2014 @ bcn?)
  71. 71. QUESTIONS? ANSWERS? THANKS! Òscar Vilaplana @grimborg http://oscarvilaplana.cat
