Scaling Systems: Architectures that grow


It's harder than ever to predict in advance the load your application will need to handle, so how do you design your architecture so you can afford to implement as you go and still be ready for whatever comes your way? It's easy to focus on optimizing each part of your application, but your application architecture determines the options you have for making big leaps in scalability. In this talk we'll cover practical patterns you can build today to meet the needs of rapid development while still creating systems that can scale up and out. Specific code examples will focus on .NET, but the principles apply across many technologies. Real-world systems will be discussed based on our experience helping customers around the world optimize their enterprise applications.

Published in: Software

  1. Scaling Systems: Architectures that Grow • Fundamental Patterns for scaling you can implement incrementally • Kendall Miller
  2. Who Am I? • Kendall Miller • One of the Founders of Gibraltar Software • Small Independent Software Vendor Founded in 2008 • Developers of VistaDB and Loupe • Engineers, not Sales People • Enterprise Systems Architect & Developer since 1995 • BSE in Computer Engineering, University of Illinois Urbana-Champaign (UIUC)
  3. What Do We Do? • Loupe: Advanced logging and analysis of errors, performance, and usage patterns for .NET web apps, desktop apps and services • VistaDB: The easy-to-deploy, SQL Server-compatible, pure .NET embedded database.
  4. Fair Warning
  5. What is Scale? Scaling is the ability to cope and perform under an increasing workload.
  6. What is Scale? Scaling to a load = remaining available while sustaining that load
  7. What is Scale? Being available is really about a request being completed in a period of time.
  8. What is Scale? • Requests per Unit Time • Maximum Request Latency
  9. What’s your Target? (Chart: average daily traffic in Visitors / Day on a log scale from 1.00E+00 to 1.00E+08; labeled entry: Gibraltar Software)
  10. What’s your Target? • 25,000 Visitors/Day = 125,000 Pages/Day • 11 High Traffic Hours/Day = 12,000 Pages/Hour • 12,000 Pages/Hour = 3.3 Pages/Second
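The arithmetic on this slide can be reproduced with a short Python sketch. The figure of 5 pages per visitor is not stated on the slide; it is inferred from 125,000 / 25,000. The function name is illustrative, and the sketch assumes, as the slide appears to, that all traffic lands in the high-traffic hours.

```python
def peak_page_rate(visitors_per_day, pages_per_visitor, high_traffic_hours):
    """Convert a daily visitor count into hourly and per-second page rates."""
    pages_per_day = visitors_per_day * pages_per_visitor
    # Conservative assumption: every page view arrives in the high-traffic hours.
    pages_per_hour = pages_per_day / high_traffic_hours
    pages_per_second = pages_per_hour / 3600
    return pages_per_day, pages_per_hour, pages_per_second


# 5 pages per visitor is inferred from the slide (125,000 / 25,000).
pages_per_day, pages_per_hour, pages_per_second = peak_page_rate(25_000, 5, 11)
print(pages_per_day, round(pages_per_hour), round(pages_per_second, 1))
```

The unrounded hourly figure is about 11,364 pages; the slide rounds it up to 12,000, which is where 3.3 pages/second comes from (12,000 / 3,600).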
  11. Specific Architectures • Gossip • Map Reduce • Tree of Responsibility • Stream Processing • Scalable Storage • Publish/Subscribe • Distributed Queues • Load Balancers + Shared Nothing Units • Load Balancers + Stateless Nodes + Scalable Storage • Content Addressable Networks • General Peer to Peer
  12. ACD/C
  13. ACD/C • Async – Do the work whenever • Caching – Don’t do any work you don’t have to • Distribution – Get as many people to do the work as you can • Consistency – We all agree on these key things
  14. Async • Decouple operations so you do the minimum amount of work in performance critical paths • Queue work that can be completed later to smooth out load • Speculative Execution • Scheduled Requests (Nightly processes)
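A minimal Python sketch of the "queue work that can be completed later" idea, using an in-process queue and one background worker. The names `handle_request` and `worker` are hypothetical, and a real system would more likely use a durable queue shared between machines.

```python
import queue
import threading

work_queue = queue.Queue()
processed = []


def worker():
    """Drain queued work in the background, off the request path."""
    while True:
        order_id = work_queue.get()
        if order_id is None:  # sentinel value: shut the worker down
            work_queue.task_done()
            break
        processed.append(f"processed {order_id}")  # stand-in for the slow work
        work_queue.task_done()


def handle_request(order_id):
    """Performance-critical path: enqueue the work and return immediately."""
    work_queue.put(order_id)
    return "accepted"


thread = threading.Thread(target=worker, daemon=True)
thread.start()
responses = [handle_request(i) for i in range(3)]
work_queue.join()     # demo only: wait until the queued work is finished
work_queue.put(None)  # ask the worker to stop
thread.join()
```

The request handler does a constant, small amount of work regardless of how slow the processing is, and the queue absorbs bursts so the worker can run at its own pace.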
  15. Caching • Save results of earlier work nearby where they are handy to use again later • Apply in front of anything that’s time consuming • Easiest to apply from the left to the right • Simple strategies can be really effective (EF Dump all on update)
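As an illustration of how simple an effective strategy can be, here is a Python sketch of a read-through cache that discards every entry whenever anything is updated, which is one reading of the slide's "dump all on update". The class, the `database` dictionary, and the `update` helper are all hypothetical stand-ins.

```python
class DumpAllCache:
    """Read-through cache that is emptied completely on any update."""

    def __init__(self, load):
        self._load = load      # function that fetches from the slow source
        self._entries = {}
        self.misses = 0

    def get(self, key):
        if key not in self._entries:
            self.misses += 1   # only misses reach the slow source
            self._entries[key] = self._load(key)
        return self._entries[key]

    def invalidate_all(self):
        self._entries.clear()


database = {"p1": "Widget"}  # hypothetical backing store
cache = DumpAllCache(lambda key: database[key])


def update(key, value):
    """Write to the store, then dump the whole cache instead of tracking keys."""
    database[key] = value
    cache.invalidate_all()


first = cache.get("p1")
second = cache.get("p1")   # served from the cache
update("p1", "Gadget")
third = cache.get("p1")    # reloaded after the dump
```

Dumping everything wastes some cached work, but it avoids having to reason about which entries an update might have affected, so stale reads cannot slip through.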
  16. Why Caching? • Loading the world is impractical • Apps ask a lot of repeating questions • Stateless applications even more so • Answers don’t change often • Authoritative information is expensive
  17. Distribution • Distribute requests across multiple systems • Classic web “Scale Out” approach • The less state held, the easier to distribute work • Distributed database = hard • Distributed static content server = easy • Request routing for distribution can serve other availability purposes
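When nodes hold no state, distributing requests can be as simple as taking turns. The following Python sketch shows round-robin routing; the node names and the `RoundRobinBalancer` class are illustrative, not taken from the talk.

```python
import itertools


class RoundRobinBalancer:
    """Hand each request to the next node in a fixed rotation."""

    def __init__(self, nodes):
        self._rotation = itertools.cycle(list(nodes))

    def route(self, request):
        # The request content is ignored: any stateless node can serve it.
        return next(self._rotation)


balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
assignments = [balancer.route(f"req-{i}") for i in range(6)]
```

This only works because no request depends on which node served the previous one, which is the point of the "less state held" bullet.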
  18. Consistency • The degree to which all parties observe the same state of the system at the same time • Scaling inevitably requires compromise • Absolute consistency forces one source of the truth and requires extensive locking to ensure parties agree • The real world doesn’t require the consistency we tend to demand of our systems
  19. Consistency Challenges • Singleton Data Structures (Order numbers..) • State held between the endpoints of a process • Consistent results of queries across partitioned datasets
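The talk does not say how to handle a singleton such as an order-number counter. One common compromise, sketched here in Python as an assumption rather than the speaker's method, is to let each node reserve a block of numbers so the shared counter is touched once per block instead of once per order. All class names are hypothetical.

```python
import threading


class BlockAllocator:
    """The one shared structure: hands out non-overlapping number ranges."""

    def __init__(self, block_size):
        self._block_size = block_size
        self._next_start = 1
        self._lock = threading.Lock()

    def reserve(self):
        with self._lock:  # contention happens once per block, not per order
            start = self._next_start
            self._next_start += self._block_size
        return start, start + self._block_size  # half-open range [start, end)


class NodeOrderNumbers:
    """Per-node numbering drawn from locally reserved blocks.

    Not thread-safe on its own; assumes one caller per node in this sketch.
    """

    def __init__(self, allocator):
        self._allocator = allocator
        self._current, self._end = allocator.reserve()

    def next_number(self):
        if self._current >= self._end:
            self._current, self._end = self._allocator.reserve()
        number = self._current
        self._current += 1
        return number


allocator = BlockAllocator(block_size=3)
node_a = NodeOrderNumbers(allocator)
node_b = NodeOrderNumbers(allocator)
a_numbers = [node_a.next_number() for _ in range(4)]
b_numbers = [node_b.next_number() for _ in range(2)]
```

The numbers stay unique, but they are no longer issued in strict time order across nodes and unused numbers in a block leave gaps. That is the kind of relaxed consistency the previous slide argues the real world usually tolerates.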
  20. Typical Application • Diagram tiers: Client (Web Browser), Server (Web Server), Storage (Database) • Diagram labels: Session State, SSL Session, Log Contention, Memory Allocation/GC, Network Sockets, Request Queue, Transaction Isolation, Reader/Writer Locks, Singleton Data Structures
  21. Caching • Diagram tiers: Client (Web Browser), Server (Web Server), Storage (Database) • Diagram labels: 100%, 50%, 10%, 1%
  22. Distribution • Diagram: four Clients (Web Browser), two Servers (Web Server), one Storage (Database) • Session State and Identity need to be factored out • Partition (Sticky Session) first, then stateless nodes
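One way to get the "partition (sticky session) first" step is to derive the server from the session identifier, so the same session always lands on the same node. This Python sketch assumes a session ID string is available on each request; `sticky_node` and the node names are hypothetical.

```python
import hashlib


def sticky_node(session_id, nodes):
    """Pick a node deterministically from the session ID."""
    # A stable hash is used because Python's built-in hash() of a string
    # varies between processes.
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


nodes = ["web-1", "web-2"]
first_visit = sticky_node("session-abc", nodes)
second_visit = sticky_node("session-abc", nodes)  # same node as first_visit
```

Plain modulo hashing remaps most sessions when the node list changes, which is one reason the slide treats sticky sessions as a first step before factoring state out and moving to stateless nodes.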
  23. Partitioned Storage Zones • Diagram: four Clients (Web Browser), four Servers (Web Server), two Storage (Database) instances
  24. Partitioned Storage Intra-Zone • Diagram: four Clients (Web Browser), four Servers (Web Server) • Diagram labels: Orders, Products, Customer B, Inventory
  25. Asynchronous Processing • Diagram: four Servers (Web Server) • Diagram labels: Orders, Products, Inventory, Order Queue, Order Processing Server
  26. Fresh Problems
  27. Fallacies of Distributed Computing • The network is reliable • Latency is zero • Bandwidth is infinite • The network is secure • Topology doesn’t change • There is one administrator • Transport cost is zero • The network is homogeneous
  28. Fresh Problems: Partial Failures • Diagram: four Clients (Web Browser), two Servers (Web Server), one Storage (Database)
  29. Fresh Problems: Partial Failures • Break system into individual failure zones • Monitor each instance of each zone for problems • Route around bad instances
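The monitor-then-route-around loop can be sketched in a few lines of Python. The `probe` callable stands in for whatever health check a real deployment would use (an HTTP ping, for example); it and the class name are assumptions made for illustration.

```python
class HealthAwareRouter:
    """Round-robin over instances, skipping any that failed the last check."""

    def __init__(self, instances, probe):
        self._instances = list(instances)
        self._probe = probe  # callable: instance -> True when healthy
        self._healthy = set(self._instances)
        self._next = 0

    def run_health_checks(self):
        """Monitoring step: re-probe every instance."""
        self._healthy = {i for i in self._instances if self._probe(i)}

    def route(self):
        """Routing step: return the next healthy instance in rotation."""
        for _ in range(len(self._instances)):
            candidate = self._instances[self._next % len(self._instances)]
            self._next += 1
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy instances available")


down = {"web-2"}  # simulated failure
router = HealthAwareRouter(["web-1", "web-2", "web-3"], lambda i: i not in down)
router.run_health_checks()
routed = [router.route() for _ in range(4)]
```

If `run_health_checks` is never called, the router keeps sending traffic to the failed instance, so the spare capacity does nothing to protect availability.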
  30. Without monitoring, redundancy is worthless
  31. Fresh Problems: Upgrades • Diagram: four Clients (Web Browser), four Servers (Web Server), two Storage (Database) instances
  32. Fresh Problems: Upgrades • Break system into individual upgrade zones • Upgrade each zone – Drain & Stop, Upgrade, Verify. • Cut traffic over to updated zones
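The zone-by-zone sequence on this slide maps naturally onto a loop. In this Python sketch the `upgrade` and `verify` callables are placeholders for real deployment and smoke-test steps, and halting on the first failed verification is an assumption, not something the slide specifies.

```python
def rolling_upgrade(zones, upgrade, verify):
    """Upgrade one zone at a time; only verified zones get traffic back."""
    in_service = set(zones)
    log = []
    for zone in zones:
        in_service.discard(zone)  # drain & stop: take the zone out of rotation
        log.append(("drain", zone))
        upgrade(zone)
        log.append(("upgrade", zone))
        if not verify(zone):
            log.append(("halt", zone))
            return in_service, log  # failed zone stays out; the rest keep serving
        log.append(("verify", zone))
        in_service.add(zone)  # cut traffic over to the updated zone
        log.append(("cutover", zone))
    return in_service, log


versions = {"zone-a": 1, "zone-b": 1}  # hypothetical deployed versions


def upgrade(zone):
    versions[zone] = 2


def verify(zone):
    return versions[zone] == 2


in_service, log = rolling_upgrade(["zone-a", "zone-b"], upgrade, verify)
```

Because only one zone is out of rotation at a time, the remaining zones must be able to carry the full load during the upgrade.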
  33. Design for Software Update From the Start • Don’t forget Data Schemas
  34. Bring It All Home • Don’t worry, we got this.
  35. Bringing Home the Bacon • Testing • Testing • Testing
  36. Critical Lessons Learned • ACD/C • Clear Consistency Strategy • Build in monitoring and management
  37. Thanks! Twitter @KendallMiller Email m Blog