
Software at Scale

Can your app handle an appearance on the front page of TechCrunch? In this talk, we'll compare common design patterns and strategies for building software that can scale to millions of users and beyond, such as concurrency, caching, CDNs, compression, immutability, sharding, partial ordering, and read optimization. We'll discuss why the REST paradigm has become such a natural fit for building web and app backend services, as well as how to test your app for scalability so you can be confident that it will survive an unexpected spike in traffic.


Software at Scale

  1. Why is scale important? [Chart: usage growth from Jan to Dec (0–80,000) plotted against scaling difficulty; the gap between the curves is labeled “Missed opportunity” and a “Permanent scaling need” threshold is marked.] “Do things that don’t scale!” But scale if it’s on the way.
  2. A tale of two startups (“Or how I spent 2013…”) Clipless: Built to scale. v1 developed in 3 months. A PR blast to TechCrunch, AndroidPolice, etc. led to 1700% month-over-month growth. Handled over 10,000 QPS. Acquired 3 months after launch. Shark Tank startup: Scaling ignored. v1 developed in 3 months. When its Shark Tank episode re-aired, the service and website went down almost immediately. Still slow (but steady) growth.
  3. What was different? Clipless (Tomcat, 1-3 Digital Ocean VMs): Load-balanced, replicated servers and DBs. Well-written RESTful API; any server could answer any query. Multithreaded backend. Batched, asynchronous DB operations. Caching by locality and time. Queued network operations. S.T. Startup (Ruby on Rails, Heroku): No load balancing. Replicated DB via Heroku Postgres. Not truly REST; backends kept state. Single-threaded backend (one request blocked the entire Heroku dyno). Direct, blocking DB access. DB caching via ActiveRecord.
  4. Potential Bottlenecks • Client resources • CPU • Memory • I/O • Server resources • Database resources • Open connections • Running queries • Network resources • Bandwidth • Connections / open sockets • Availability (esp. on Wifi / mobile networks)
  5. Potential Bottlenecks: client resources (CPU, memory, I/O). Remedies: profile your algorithms; crunch less data; reuse more old work; offload some processing to the server.
  6. Potential Bottlenecks: server resources (CPU, memory, I/O). Remedies: profile your algorithms; crunch less data; reuse more old work (across users); divide and conquer (“shard”); spin up and balance more servers.
  7. Potential Bottlenecks: database resources (open connections, running queries). Remedies: optimize your queries; use connection pooling; add a second-level cache; reuse more old work (across users); divide and conquer (“shard”); batch DB requests; spin up and replicate more DBs.
  8. Potential Bottlenecks: network resources (bandwidth, connections / open sockets, availability, esp. on Wifi / mobile networks). Remedies: add a local cache; send diffs; compress responses (CPU tradeoff); use connection pooling; batch network requests.
  9. Profiling (Diagnosing the problem) Purpose: find the “hotspots” in your program. Things you care about: • “CPU time” – time spent processing your program’s instructions. • “Memory” – RAM being used to store your program’s data. • “Wall time” – overall time spent waiting for the program. Methods: • Basic: “stopwatch” timing • Advanced: a profiler (e.g. jprof, JProfiler, hprof, NetBeans, Visual Studio)
  10. Stopwatch • Easy: just time methods. Matlab: function [result] = do_something_expensive(data) tic … toc end • In Java, use Guava’s Stopwatch class (start() and stop() methods).
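    A minimal sketch of the stopwatch approach in Java, assuming Guava is on the classpath; doSomethingExpensive is a hypothetical stand-in for the code being measured:

    import com.google.common.base.Stopwatch;
    import java.util.concurrent.TimeUnit;

    public class TimingExample {
        public static void main(String[] args) {
            Stopwatch stopwatch = Stopwatch.createStarted();  // start timing
            doSomethingExpensive();                           // code under measurement
            stopwatch.stop();
            System.out.println("Took " + stopwatch.elapsed(TimeUnit.MILLISECONDS) + " ms");
        }

        // Stand-in for the expensive work being profiled.
        static void doSomethingExpensive() {
            double sum = 0;
            for (int i = 0; i < 50_000_000; i++) sum += Math.sqrt(i);
        }
    }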
  11. Profiler
  12. Strategies
  13. Caching and Reuse “There are only two hard things in Computer Science: cache invalidation and naming things.” --Phil Karlton • Trades space for CPU time. • Look for repetition of input (including subproblems). • Compute a key from the input. • Associate the result with the key. • Important: the algorithm must be a deterministic mapping from input to output. • Important: if you change what the algorithm depends on, update the cache key. [Diagram: a record (Name: Alice, Job: Developer, Salary: 100,000) stored in the cache under the key <Alice, a@co.com>.]
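    As a rough illustration of the pattern above (not the actual Clipless code), a deterministic lookup can be cached in a map keyed on its input; expensiveLookup is a hypothetical stand-in for the real work:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class EmailCache {
        private final Map<String, String> cache = new ConcurrentHashMap<>();

        // Deterministic mapping from input (name) to output (email), so results can be reused.
        public String emailFor(String name) {
            return cache.computeIfAbsent(name, EmailCache::expensiveLookup);
        }

        // Stand-in for a slow directory or database query.
        private static String expensiveLookup(String name) {
            return name.toLowerCase() + "@co.com";
        }
    }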
  14. Computing a Cache Key • Hashing is a good strategy. • Objects.hash (JDK 7) / Objects.hashCode (Guava) • Beware: hashes can collide – sanity-check results! • Searching: • Hash the data. • Query the cache for the hash key. • If found, return the associated value. • If not, query the live service and store the result in the cache. [Diagram: <Alice, a@co.com> hashes to the key 0xAF724…]
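    A sketch of that search loop with an explicit hash key and a collision sanity check; the cache layout and lookupLive call are illustrative assumptions:

    import java.util.Map;
    import java.util.Objects;
    import java.util.concurrent.ConcurrentHashMap;

    public class HashKeyedCache {
        // Keep the original input alongside the result so a hash collision can be detected.
        private static class Entry {
            final String name;
            final String email;
            Entry(String name, String email) { this.name = name; this.email = email; }
        }

        private final Map<Integer, Entry> cache = new ConcurrentHashMap<>();

        public String emailFor(String name) {
            int key = Objects.hash(name);                  // compute the cache key
            Entry hit = cache.get(key);
            if (hit != null && hit.name.equals(name)) {    // sanity check: hashes can collide
                return hit.email;
            }
            String email = lookupLive(name);               // cache miss: query the live service
            cache.put(key, new Entry(name, email));
            return email;
        }

        // Stand-in for the live service call.
        private String lookupLive(String name) {
            return name.toLowerCase() + "@co.com";
        }
    }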
  15. Concurrency Sequential programs run one piece of work after another, taking a lot of time; concurrent programs run the same pieces of work side by side, taking less time. [Diagram: three “Work” blocks laid end to end vs. stacked in parallel.]
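    A minimal sketch of the difference: three independent pieces of work submitted to a thread pool finish in roughly the time of one. The sleep is a stand-in for real work.

    import java.util.Arrays;
    import java.util.List;
    import java.util.concurrent.*;

    public class ConcurrentWork {
        public static void main(String[] args) throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(3);
            List<Callable<Integer>> tasks = Arrays.asList(
                    () -> expensiveStep(1),
                    () -> expensiveStep(2),
                    () -> expensiveStep(3));
            // invokeAll runs the tasks in parallel and waits for all of them:
            // about 1 second of wall time instead of 3 run sequentially.
            for (Future<Integer> result : pool.invokeAll(tasks)) {
                System.out.println(result.get());
            }
            pool.shutdown();
        }

        static int expensiveStep(int n) throws InterruptedException {
            Thread.sleep(1000);   // stand-in for real work
            return n * n;
        }
    }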
  16. Race Conditions Problem: two threads can simultaneously write to the same variables. If you ran this code in two threads: if (x < 1) { x++; } then x would usually end up at 1. But sometimes it would be 2! • Race conditions like that one are among the hardest bugs to find and fix. • Three ways to manage this: • Immutability • Local state • Synchronization • Race conditions only happen when you write to shared, mutable state.
  17. Immutability • General tip: try to minimize the number of states your program can end up in. • Concurrency • REST • (And your programs will just have less state, so you’ll produce fewer bugs.) • Declare variables final where possible, set them in the constructor, and don’t write setters unless you must: // String is an immutable type - can’t change it at runtime. // foo is an immutable variable - can’t reassign it. private final String foo; public Bar(String foo) { this.foo = Preconditions.checkNotNull(foo); }
  18. Local State • Sometimes you need to modify state. • But you can still avoid locking if it’s only visible to you: • Two threads can write to copies of the same data. • Optionally, the copies can be merged back in a single thread afterwards. • (This is how MapReduce works.) Java inner classes help tremendously with this! // Every time you run sendToNetwork, you’ll use a new channel. No shared state! void sendToNetwork() { final Channel channel = new HttpChannel(context); channel.connect(); Thread foo = new Thread() { @Override public void run() { channel.send("I am the jabberwocky"); } }; foo.start(); }
  19. Synchronization • If you do need to write shared state, you need to synchronize access to it. • Last resort: it slows your program and is deadlock-prone. Object lock = new Object(); synchronized (lock) { if (x < 1) { x++; } } Now x is always 1! No interruption is possible between the read and the write. • More advanced: read/write locks (ReentrantReadWriteLock…) • Also check out Java’s “Atomic” classes and “concurrent” collections: • AtomicBoolean, AtomicInteger, … • ConcurrentHashMap…
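    A minimal sketch of the lock-free alternatives mentioned above: AtomicInteger makes the read-check-write a single atomic step, and ConcurrentHashMap handles shared per-key counters without an explicit synchronized block (class and method names are illustrative):

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.AtomicInteger;

    public class LockFreeState {
        private final AtomicInteger x = new AtomicInteger(0);
        private final ConcurrentHashMap<String, Integer> hits = new ConcurrentHashMap<>();

        // Two threads can never both see x < 1 and both increment it:
        // compareAndSet succeeds for exactly one of them.
        public void incrementOnce() {
            x.compareAndSet(0, 1);
        }

        // Atomic per-key counter with no explicit locking.
        public void recordHit(String endpoint) {
            hits.merge(endpoint, 1, Integer::sum);
        }
    }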
  20. Futures • Threads compute asynchronously. • The caller wants some way of knowing the result when it’s ready. • Future: a handle to a result that may or may not be available yet. • future.get(): waits for the result and returns it, with an optional timeout. • Futures allow asynchronous calls to return immediately, and let the program wait for the results when it’s convenient. • Also see Guava’s ListenableFuture. The usual pattern: ThreadPoolExecutor pool; Callable<String> action = new Callable<String>() { @Override public String call() throws NetworkException { return askTheNetworkForMyString(); } }; Future<String> result = pool.submit(action); String myString = result.get(); // Waits until the result is available. Throws if an exception was thrown inside the Callable.
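    A minimal sketch of the same pattern with Guava’s ListenableFuture, assuming Guava is on the classpath; askTheNetworkForMyString is the hypothetical call from the slide:

    import com.google.common.util.concurrent.*;
    import java.util.concurrent.Callable;
    import java.util.concurrent.Executors;

    public class ListenableFutureExample {
        public static void main(String[] args) {
            ListeningExecutorService pool =
                    MoreExecutors.listeningDecorator(Executors.newFixedThreadPool(4));

            Callable<String> action = ListenableFutureExample::askTheNetworkForMyString;
            ListenableFuture<String> result = pool.submit(action);

            // Instead of blocking on get(), register a callback that runs when the result is ready.
            Futures.addCallback(result, new FutureCallback<String>() {
                @Override public void onSuccess(String myString) { System.out.println(myString); }
                @Override public void onFailure(Throwable t) { t.printStackTrace(); }
            }, MoreExecutors.directExecutor());

            pool.shutdown();
        }

        // Stand-in for a real network call.
        static String askTheNetworkForMyString() {
            return "hello from the network";
        }
    }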
  21. REST • A scalable client / server architecture. • Raw sockets are complicated, so REST usually runs over HTTP. • Each HTTP request hits an “endpoint”, which does one thing. e.g. GET http://api.clipless.co/json/deals/near/Times_Square • Principles: • The server does not store state (see immutability). • Responses can be cached (see caching). • The client doesn’t care whether the server is the final endpoint or a proxy. • State usually ends up in the DB; the server communicates with the client using tokens.
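    An illustrative sketch of a stateless GET endpoint using the JDK’s built-in com.sun.net.httpserver (not the actual Clipless stack): each request carries everything the server needs, any instance could answer it, and the response is marked cacheable:

    import com.sun.net.httpserver.HttpServer;
    import java.io.OutputStream;
    import java.net.InetSocketAddress;
    import java.nio.charset.StandardCharsets;

    public class DealsEndpoint {
        public static void main(String[] args) throws Exception {
            HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);

            // GET /json/deals/near/{place} -- no session, no per-client state on the server.
            server.createContext("/json/deals/near/", exchange -> {
                String place = exchange.getRequestURI().getPath()
                        .substring("/json/deals/near/".length());
                byte[] body = ("{\"near\":\"" + place + "\",\"deals\":[]}")
                        .getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().add("Content-Type", "application/json");
                exchange.getResponseHeaders().add("Cache-Control", "public, max-age=60");
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });
            server.start();
        }
    }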
  22. Clipless Architecture [Diagram: clients speak Protobuf over HTTP (10,000 reqs / second) to Apache (mod_proxy_balancer), which load-balances across Tomcat servers with content-addressable caches, backed by MySQL.]
  23. Static Content • Static content (e.g. HTML, images) is highly cacheable. • Easiest way to cache: use a CDN • Akamai, S3, CloudFlare, CloudFront, MaxCDN, … • Cache key: • Some HTTP headers (inc. the Cache-Control header) • The page requested • Last-Modified (e.g. from a “HEAD” request to your server) • Added bonus: most CDNs are “closer” to your users than your server. • Compressing content reduces bandwidth: • Browsers usually support gzip decompression. • Apache, nginx: gzip compression plugins • Javascript / CSS: minification • Images: Google PageSpeed service / CloudFlare • Program data: Protocol Buffers, Thrift • Why use your bandwidth when you can use someone else’s?
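    A minimal sketch of the gzip tradeoff in Java (in practice the web server or CDN handles this): spend some CPU compressing the body to send far fewer bytes over the network.

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.util.zip.GZIPOutputStream;

    public class GzipResponse {
        public static byte[] gzip(String body) throws IOException {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
                gz.write(body.getBytes(StandardCharsets.UTF_8));  // spend CPU here...
            }
            return buffer.toByteArray();                          // ...to send fewer bytes
        }

        public static void main(String[] args) throws IOException {
            String page = "<html>" + "hello ".repeat(1000) + "</html>";
            System.out.println("uncompressed: " + page.length() + " bytes");
            System.out.println("gzipped:      " + gzip(page).length + " bytes");
        }
    }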
  24. Sharding [Diagram: requests for users with names A-L (Alice, Bob) are routed to one server; requests for names M-Z (Mallory) are routed to another.]
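    A minimal sketch of the routing rule in the diagram; the shard hostnames are made up, and a real system would more likely hash the key and take it modulo the shard count (also shown):

    public class ShardRouter {
        private static final String[] SHARDS = { "db-a-l.example.com", "db-m-z.example.com" };

        // Route by the first letter of the username, as in the diagram.
        public static String shardFor(String username) {
            char first = Character.toUpperCase(username.charAt(0));
            return (first <= 'L') ? SHARDS[0] : SHARDS[1];
        }

        // A more even alternative: hash-based sharding over N shards.
        public static int shardIndex(String key, int shardCount) {
            return Math.floorMod(key.hashCode(), shardCount);
        }

        public static void main(String[] args) {
            System.out.println(shardFor("Alice"));    // db-a-l.example.com
            System.out.println(shardFor("Mallory"));  // db-m-z.example.com
        }
    }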
  25. Batching Network Requests: the operation queue / proactor pattern. [Diagram: producers enqueue Work items onto a thread-safe queue and each receives a ListenableFuture<Result>; a worker thread pool drains the queue; a NetworkListener calls queue.resume() onUp and queue.suspend() onDown.]
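    A rough sketch of the operation-queue pattern in the diagram, under the assumption of a single worker draining a thread-safe queue; the class and method names are illustrative, not the Clipless implementation:

    import java.util.concurrent.*;

    public class OperationQueue {
        private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        private final Object lock = new Object();
        private volatile boolean online = true;

        public OperationQueue() {
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        synchronized (lock) {
                            while (!online) lock.wait();   // suspended while the network is down
                        }
                        queue.take().run();                // drain one queued operation
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.setDaemon(true);
            worker.start();
        }

        // Producers enqueue work and immediately get a future for the eventual result.
        public <T> Future<T> submit(Callable<T> task) {
            FutureTask<T> future = new FutureTask<>(task);
            queue.add(future);
            return future;
        }

        public void suspend() { online = false; }          // NetworkListener.onDown
        public void resume() {                             // NetworkListener.onUp
            synchronized (lock) {
                online = true;
                lock.notifyAll();
            }
        }
    }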
  26. How to Test • Mock large amounts of data and measure performance. • Can be automated so you never encounter performance regressions. • Network stress tests: • ab • blitz • loader.io • ulimits • Packet sniffers • Round-trip-time services, e.g. New Relic.
  27. General Principles • Scale when you anticipate the need. • Scale eagerly when it doesn’t take you far out of your way. • CDNs and gzip compression are good examples. • Or when retrofitting will be painful. • A RESTful architecture from the beginning is much easier than tacking it on later! • But caching is usually easy to add later. • Focus on the big improvements: • 80/20 rule • Profile and knock out the biggest CPU / memory hogs first. • Practice and internalize these techniques to reduce scaling costs! • Concurrency is much easier with mastery. • Caching seems much easier with mastery, but often isn’t. • Internalize immutability and you’ll just write better code.
  28. Thanks! Good luck, and always bring mangosteens to acquisition talks.
