The Economies of Scaling Software
Abdelmonaim Remani
@PolymathicCoder
Creative Commons Attribution Non-Commercial License 3.0 Unported

The graphics and logos in this presentation belong to th...
About Me
• Platform Architect at just.me Inc.
• JavaOne RockStar and frequent speaker at many developer events and conf...
http://speakerscore.com/jazoon-scalability
Follow @PolymathicCoder
The Title of the Talk
• The Economies of Scale
• “In microeconomics, economies of scale are the cost

advantages that ente...
Let’s Go!
Blurred Lines…
• Only the enterprise used to worry about scalability
• The rise of social and the abundance of mobile
• An expon...
The Bar Is Higher!
Scalability is everyone’s problem…

What is Scalability?
The Common Definition
• The ability of an application to handle an increasing
amount of work without performance degradati...
A Better Definition
• The ability of an application to gracefully evolve within
the constraints of its ecosystem in order ...
A Black Art!
• Don’t be surprised if
• Your application supports one
million users
• You add one more feature
• 500,000 us...
Latency Is
Your Enemy
Syllogismo
• To scale is to reduce latency
• To reduce latency is to address bottlenecks
• To scale is to address bottlene...
Overcoming
The CPU
Bottleneck
Overcoming the CPU Bottleneck
• Nothing affects the CPU more than the instructions it is
summoned to execute
• This is abo...
A Scalable Architecture
Architecture?
• “Things that people perceive as hard-to-change” -Martin Fowler
• http://martinfowler.com/ieeeSoftware/whoN...
Be Wise… Think Twice…
• Choose the right technologies
• Platform
• Languages
• Frameworks
• Libraries

• Make the right a...
Write Good Code
Write Good Code
• Think your algorithms through and mind their complexity
(Asymptotic Complexity, Cyclomatic Complexity, e...
Quality… Quality… Quality!
• Obsess with testing
• TDD/BDD

• Tools
• Static code analyzers (PMD, FindBugs, etc…)
• Profil...
Know Thy S#!t
• Read
• The Classics (The Mythical Man-Month, etc…)
• GoF’s “Design Patterns”
• Eric Evans’ “D...
The Inevitable
You do all that…
You’ll end up with…

At best…
The fading tradition of making cow dung piles
http://news.ukpha.org/2011/01...
Still better than…

Technical Debt
• What is it?
• The quick-and-dirty you are not proud of
• What you would have done differently had you...
Write Code That Scales Up
Vertical Scaling
• Vertical Scaling (Scaling Up)
• On a single-node system
• Adding more computing resources to the node (...
Parallelism At The Node Level
• Writing concurrent code that executes simultaneously
• Simple business logic within co...
Easier Said Than Done…
• Moore’s Law
• Performance gain is automatically realized by software (Code is
faster on faster ha...
Easier Said Than Done…
• Synchronize state across threads across multiple cores
• Good luck!

• Rely on frameworks and li...
It Gets More Interesting…
• Amdahl’s Law
• Throwing more cores does not necessarily result in performance
gain
• Diminishi...
Miscellaneous
• Leverage Probabilistic data structures and algorithms
• Bloom Filters, Quotient filters, etc…

• Go Reacti...
Write Code That Scales Out
Horizontal Scaling

• Horizontal Scaling
• On a distributed system (A cluster)
• Adding more nodes

• Writing code to harn...
Topology
• A typical cluster consists of
• A number of identical application server nodes behind a load
balancer

Topology
• A typical cluster consists of
• A number of identical application server nodes behind a load
balancer

A number...
Topology
• A typical cluster consists of
• A number of identical application server nodes behind a load
balancer

Identica...
Topology
• A typical cluster consists of
• A number of identical application server nodes behind a load
balancer

Load bal...
Managing State
• Session data
• Session Replication
• Session Affinity / Sticky Session
• Requests from the same client ar...
Parallelism At The Cluster Level
• Leverage Map/Reduce
• “A programming model for processing large data sets
with a parall...
Miscellaneous
• How to HTTPS?
• End at load balancer
• Wildcard SSL

• Distributed Lock Manager (DLM)
• Synchronize access...
Deployment
Deployment
• Multiple Environments
• Development, Test, Stage, and Production
• Automatic Configuration Management

• Prac...
Overcoming
The Storage I/O
Bottleneck
The Storage I/O Bottleneck

• The storage I/O is usually the most significant

The Persistent Datastore
What Datastore to Use?
• Relational of course!
• Normalized schema guaranteeing data integrity
• ACID Transactions
• No...
Mucho Data!
• No other choice but scaling out RDBMS
• Master/Slave clusters
• Sharding

• Failed big time!
• RDBMS is desi...
NoSQL
• A wide range of specialized datastores with the goal of
addressing the challenges of the relational model
• “The w...
Polyglot Persistence
• Within the application
• Data is complex and accessed in many different ways
• Why should we fit it...
Caching
Caching
• A cache is typically a simple key-value data structure
• Instead of incurring the overhead of data retrieval or
...
Caching
• Where to cache?
• On disk
• File System: Slow and sequential access
• DB: A bit better (Data is arranged in stru...
Caching
• How to cache?
• Most caches implement a very simple interface
• Always attempt to get from cache first using a k...
Caching Patterns
• Caching Query Results
• Key: Hash of the query itself
• How about parameterized queries?
• Key: Hash of...
Caching Patterns
• Time-series datasets (Ex. Real-time feed)
• Most of the time pseudo/near real-time is enough
• Use cach...
Caching Gotchas
• Profile your code to assess what to cache, and whether
you need to to begin with
• Stale state might bit...
Featured Solutions
• EhCache
• Memcached
• Oracle Coherence
• Redis
• A persistent NoSQL datastore
• Built-in data struct...
Overcoming
The Network I/O
Bottleneck
The Network I/O Bottleneck

• Network I/O can bring you down just as much

Asynchronous Processing
Asynchronous Processing

• Resource-intensive tasks cannot be handled practically during an
HTTP session
• Synchronous pro...
Asynchronous Processing Patterns
• Pseudo-Asynchronous Processing
• Flow
• Process data / operations in advance
• User req...
Asynchronous Processing Patterns
• True Asynchronous Processing
• Flow
• User request data or operation
• Acknowledge
• Ex...
Techniques
• Leverage Job/Work/Task Queues
• JMS (Java Message Service) – JSR 914
• AMQP (Advanced Message Queuin...
Content Delivery Network
Content Delivery Network (CDN)
• Static content
• Binary (Video, Audio, etc…)
• Web objects (HTML, JavaScript, CSS, etc…)
...
CDN Gotchas
• Dirty Caches
• script.js is a script file deployed on CDN
• Multiple copies of script.js will be replicated ...
CDN Gotchas
• Dirty Caches
• What to do?
• Simply append version number to file names
• script-v1.js, script-v2.js, e...
Domain Name Service
Domain Name Service (DNS)
• Do NOT rely on your free domain name registrar DNS
• Use a scalable DNS solution
• AWS Route ...
Remoting
Remoting
• In a SOA (Service Oriented Architecture)
• RPC calls to multiple services
• Data Exchange (Plain vs. Binary)
• ...
Qualifying
Scalability
Qualifying Scalability
• Instrumentation: Bake it into the code early
• Monitoring
• Health (Application / Infrastructure)...
Disaster
Recovery
When Disaster Hits…
• Goal
• Fault-tolerant system
• Restore service and recover data ASAP in case of a disaster

• Be pro...
Scaling Teams
Scaling Teams
• Hiring
• Always hire top talent
• You are as strong as your weakest link
• Develop a process to bring peop...
Scaling Teams
• Team Structure
• Small is good
• Form ad-hoc teams from pools of Agile breeds
• Product Owners
• Team Memb...
The Take-home
The Take-home Message
• The early-bird gets the worm
• Design to scale from day one
• Plan for capacity early

• Your need...
Take it slow… You’ll get there…
Work smarter not harder…

Questions?
http://speakerscore.com/jazoon-scalability

Thanks for the attention!
Follow @PolymathicCoder

abdelmonaim.remani@gmail.co...
JAZOON'13 - Abdelmonaim Remani - The Economies of Scaling Software

  1. 1. The Economies of Scaling Software Abdelmonaim Remani @PolymathicCoder
  2. 2. Creative Commons Attribution Non-Commercial License 3.0 Unported The graphics and logos in this presentation belong to their rightful owner
  3. 3. About Me • Platform Architect at just.me Inc. • JavaOne RockStar and frequent speaker at many developer events and conferences including JavaOne, JAX, OSCON, OREDEV, 33rd Degree, etc... • Open-source advocate and contributor • Active Community member • The NorCal Java User Group • The Silicon Valley Dart Meetup Bio: http://about.me/PolymathicCoder Twitter: @PolymathicCoder Email: abdelmonaim.remani@gmail.com SlideShare: http://www.slideshare.net/PolymathicCoder/ | @PolymathicCoder
  4. 4. http://speakerscore.com/jazoon-scalability Follow @PolymathicCoder
  5. 5. The Title of the Talk • The Economies of Scale • “In microeconomics, economies of scale are the cost advantages that enterprises obtain due to size [...] often operational efficiency is [...] greater with increasing scale [...]” Wikipedia | @PolymathicCoder
  6. 6. Let’s Go!
  7. 7. Blurred Lines… • Only the enterprise used to worry about scalability • The rise of social and the abundance of mobile • An exponential growth of internet traffic • The creation of a spoiled user-base • I want to see the closest Moroccan restaurants to my current location on a map along with consumer ratings and whether any of my friends have checked in within the last 30 days • The lines are blurred between consumer applications and enterprise applications | @PolymathicCoder
  8. 8. The Bar Is Higher! Scalability is everyone’s problem… | @PolymathicCoder
  9. 9. What is Scalability?
  10. 10. The Common Definition • The ability of an application to handle an increasing amount of work without performance degradation • Not a good definition! It implies: • You’ll need to scale forever • Scalability is relative; It is bound by one’s specific needs • You’ll need to be fully scalable from day one • Scalability is evolutionary; It is a gradual process • There are no external constraints • Unrealistic | @PolymathicCoder
  11. 11. A Better Definition • The ability of an application to gracefully evolve within the constraints of its ecosystem in order to handle the maximum potential amount of work without performance degradation • Work? • Simultaneous requests • Performance degradation? • Increased latency or decreased throughput | @PolymathicCoder
  12. 12. A Black Art! • Don’t be surprised if • Your application supports one million users • You add one more feature • 500,000 user load crashes your system or renders it unusable | @PolymathicCoder
  13. 13. Latency Is Your Enemy
  14. 14. Syllogismo • To scale is to reduce latency • To reduce latency is to address bottlenecks • To scale is to address bottlenecks • The usual suspects • The CPU • The Storage I/O • The Network I/O • Inter-related | @PolymathicCoder
  15. 15. Overcoming The CPU Bottleneck
  16. 16. Overcoming the CPU Bottleneck • Nothing affects the CPU more than the instructions it is summoned to execute • This is about your application • How it is written (Architecture, code base, etc..) • How it is deployed | @PolymathicCoder
  17. 17. A Scalable Architecture
  18. 18. Architecture? • “Things that people perceive as hard-to-change” -Martin Fowler • http://martinfowler.com/ieeeSoftware/whoNeedsArchitect.pdf • Decisions you commit to; the ones you will be stuck with forever | @PolymathicCoder
  19. 19. Be Wise… Think Twice… • Choose the right technologies • Platform • Languages • Frameworks • Libraries • Make the right abstractions • Loosely-coupled components • Functional abstractions • Technical abstractions • Make sure that the latter is subordinate to the former and not the other way around | @PolymathicCoder
  20. 20. Write Good Code
  21. 21. Write Good Code • Think your algorithms through and mind their complexity (Asymptotic Complexity, Cyclomatic Complexity, etc…) • SOLIDify your design • Single Responsibility, Open-Closed, Liskov Substitution, Interface Segregation, and Dependency Inversion • Understand the limitations of your technology and leverage its strengths | @PolymathicCoder
  22. 22. Quality… Quality… Quality! • Obsess with testing • TDD/BDD • Tools • Static code analyzers (PMD, FindBugs, etc…) • Profilers (Detect memory leaks, bottlenecks, etc…) • Etc… | @PolymathicCoder
  23. 23. Know Thy S#!t • Read • The Classics (The Mythical Man-Month, etc…) • GoF’s “Design Patterns” • Eric Evans’ “Domain-Driven Design” • Every book by Martin Fowler • Uncle Bob’s “Clean Code” • Josh Bloch’s “Effective Java” • Brian Goetz’s “Java Concurrency in Practice” • Tech Papers/Blogs • Etc... | @PolymathicCoder
  24. 24. The Inevitable
  25. 25. You do all that… You’ll end up with… At best… The fading tradition of making cow dung piles http://news.ukpha.org/2011/01/the-fading-tradition-of-making-cow-dung-piles/ | @PolymathicCoder
  26. 26. Still better than… | @PolymathicCoder
  27. 27. Technical Debt • What is it? • The quick-and-dirty you are not proud of • What you would have done differently had you had the time • It’s a matter of time before it starts to smell really bad • What to do? • The fact that you recognize it as debt is a good thing in itself • Keep tabs and refactor often • Cut the right corners • Don’t mortgage architecture (Don’t lock yourself out) | @PolymathicCoder
  28. 28. Write Code That Scales Up
  29. 29. Vertical Scaling • Vertical Scaling (Scaling Up) • On a single-node system • Adding more computing resources to the node (Getting a beefier machine) • Writing code to harness the full power of the one node | @PolymathicCoder
  30. 30. Parallelism At The Node Level • Writing concurrent code that executes simultaneously • Simple business logic within containers is already multithreaded • Executing complex business logic within a reasonable time • Break it into smaller steps • Execute them in parallel • Aggregate data back | @PolymathicCoder
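The break / execute / aggregate flow on this slide can be sketched roughly as follows. This is a minimal illustration in Python rather than the Fork/Join or Akka tooling the talk mentions; `split`, `process_chunk`, and `parallel_sum_of_squares` are hypothetical names, and the sum of squares stands in for real business logic.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, parts):
    """Break the work into roughly equal, contiguous chunks."""
    size = max(1, -(-len(items) // parts))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def process_chunk(chunk):
    """Hypothetical unit of business logic: a sum of squares."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(items, workers=4):
    chunks = split(items, workers)                        # 1. break into steps
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(process_chunk, chunks))  # 2. run concurrently
    return sum(partials)                                  # 3. aggregate
```

In CPython, threads mostly help with I/O-bound steps; for CPU-bound work a `ProcessPoolExecutor` from the same module is usually the better fit.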
  31. 31. Easier Said Than Done… • Moore’s Law • Performance gain is automatically realized by software (Code is faster on faster hardware) • Nothing is forever… • The era of the multi-core chip • We need to write code to take advantage of all cores | @PolymathicCoder
  32. 32. Easier Said Than Done… • Synchronize state across threads across multiple cores • Good luck! • Rely on frameworks and libraries (Fork/Join, Akka, etc…) • Go immutable • Not always straightforward or possible • Go functional (Scala, Clojure, etc…) | @PolymathicCoder
  33. 33. It Gets More Interesting… • Amdahl’s Law • Throwing more cores at a problem does not necessarily result in a performance gain • The serial portion of the code caps the speedup • Diminishing returns at some point no matter how many cores you throw in | @PolymathicCoder
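To make the diminishing returns concrete, here is a small sketch of Amdahl's Law, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of cores. The function name `amdahl_speedup` is an assumption for illustration.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Theoretical speedup for a workload under Amdahl's Law."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)
```

With 90% of the code parallelizable, the speedup stays below 10x no matter how many cores are added, because the remaining 10% still runs serially.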
  34. 34. Miscellaneous • Leverage Probabilistic data structures and algorithms • Bloom Filters, Quotient filters, etc… • Go Reactive • http://www.reactivemanifesto.org/ • RxJava, Spring Reactor, etc… | @PolymathicCoder
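As an illustration of a probabilistic data structure, the following is a minimal Bloom filter sketch. The sizes, the use of SHA-256 with a per-hash prefix, and the one-byte-per-bit storage are simplifying assumptions chosen for readability, not a tuned implementation.

```python
import hashlib


class BloomFilter:
    """Set membership with possible false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for clarity

    def _positions(self, item):
        # Derive num_hashes positions by prefixing the item with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means definitely absent; True means probably present.
        return all(self.bits[pos] for pos in self._positions(item))
```

A typical use is to check the filter before an expensive lookup: a negative answer lets you skip the lookup entirely.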
  35. 35. Write Code That Scales Out
  36. 36. Horizontal Scaling • Horizontal Scaling • On a distributed system (A cluster) • Adding more nodes • Writing code to harness the full power of the cluster | @PolymathicCoder
  37. 37. Topology • A typical cluster consists of • A number of identical application server nodes behind a load balancer | @PolymathicCoder
  38. 38. Topology • A typical cluster consists of • A number of identical application server nodes behind a load balancer A number? • It depends on how many you actually need and can afford • Elastic Scaling / Auto-Scaling • The number of live nodes within the cluster shrinks and grows depending on the load • New ones are provisioned or terminated as needed | @PolymathicCoder
  39. 39. Topology • A typical cluster consists of • A number of identical application server nodes behind a load balancer Identical? • Application nodes are cloned off of image files (Ex. AWS EC2 AMIs, etc...) • Configuration Management tools (Chef, Puppet, Salt, etc...) | @PolymathicCoder
  40. 40. Topology • A typical cluster consists of • A number of identical application server nodes behind a load balancer Load balancer? • Load is evenly distributed across live nodes according to some algorithm (Round-Robin typically) | @PolymathicCoder
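Round-robin distribution can be sketched in a few lines. `RoundRobinBalancer` and the node names are hypothetical; a real load balancer would also track node health and remove dead nodes from the rotation.

```python
import itertools


class RoundRobinBalancer:
    """Hands out nodes in a fixed rotation, one per request."""

    def __init__(self, nodes):
        nodes = list(nodes)
        if not nodes:
            raise ValueError("at least one node is required")
        self._rotation = itertools.cycle(nodes)

    def next_node(self):
        return next(self._rotation)
```

For example, with nodes `app-1`, `app-2`, and `app-3`, the fourth request goes back to `app-1`.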
  41. 41. Managing State • Session data • Session Replication • Session Affinity / Sticky Session • Requests from the same client are routed to the same node • When the node dies, the session data dies with it • Shared Session / Distributed Session • Session data is in a “centralized” location • Go Stateless • No session data (Any node would do) | @PolymathicCoder
  42. 42. Parallelism At The Cluster Level • Leverage Map/Reduce • “A programming model for processing large data sets with a parallel, distributed algorithm on a cluster” • Apache Hadoop | @PolymathicCoder
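The Map/Reduce model can be illustrated with the classic word count, run here in a single process. This is only a sketch of the programming model, not of Apache Hadoop's API; on a cluster the map and reduce phases would run on different nodes, and the function names are assumptions.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle(pairs))
```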
  43. 43. Miscellaneous • How to HTTPS? • End at load balancer • Wildcard SSL • Distributed Lock Manager (DLM) • Synchronize access to shared resources • (Google Chubby, Apache Zookeeper, etc…) • Distributed Transactions • X/Open XA | @PolymathicCoder
  44. 44. Deployment
  45. 45. Deployment • Multiple Environments • Development, Test, Stage, and Production • Automatic Configuration Management • Practice Continuous Delivery • Leverage The Cloud • IaaS, PaaS, SaaS, and NaaS | @PolymathicCoder
  46. 46. Overcoming The Storage I/O Bottleneck
  47. 47. The Storage I/O Bottleneck • The storage I/O is usually the most significant | @PolymathicCoder
  48. 48. The Persistent Datastore
  49. 49. What Datastore to Use? • Relational of course! • Normalized schema guaranteeing data integrity • ACID Transactions • Not biased towards specific access patterns • Flexible query language • As datasets grow • Scale up (Buy beefier machines) • Database tuning / query optimization • Create materialized views • De-normalize • Etc… | @PolymathicCoder
  50. 50. Mucho Data! • No other choice but scaling out RDBMS • Master/Slave clusters • Sharding • Failed big time! • RDBMS is designed to run on one machine • Eric Brewer’s CAP Theorem of distributed systems • Pick 2 out of 3: Consistency, Availability, and Partition Tolerance • The relational model is designed to favor CA, hence can never support P | @PolymathicCoder
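Sharding routes each record to one of several database nodes based on a key. A minimal hash-based sketch follows; `shard_for` and the key format are hypothetical, and the slide does not prescribe a particular scheme.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a record key to a shard index in the range [0, num_shards)."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    # A stable hash is used so every application node picks the same shard.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

One known weakness of plain modulo routing is that changing the number of shards remaps most keys, which is one reason resharding a relational cluster tends to be painful.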
  51. 51. NoSQL • A wide range of specialized datastores with the goal of addressing the challenges of the relational model • “The whole point of seeking alternatives is that you need to solve a problem that relational databases are a bad fit for” –Eric Evans • A wide variety • Key-Value Datastores • Columnar Datastores • Document Datastores • Graph Datastores | @PolymathicCoder
  52. 52. Polyglot Persistence • Within the application • Data is complex and accessed in many different ways • Why should we fit it into one storage model? • Polyglot Persistence is about • Leveraging multiple data stores based on the specific way the data is stored and accessed • For more info: • Check out my talk on YouTube from JAX Conf 2012 • “The Rise of NoSQL and Polyglot Persistence” • http://bit.ly/PCWtWi | @PolymathicCoder
  53. 53. Caching
  54. 54. Caching • A cache is typically a simple key-value data structure • Instead of incurring the overhead of data retrieval or computation every time, you check the cache first • You can’t cache everything; caches can be configured to use different eviction algorithms depending on the use case (LRU, LFU, Bélády's Algorithm, etc...) • Use aggressively! • What to cache? • Frequently accessed data (Session data, feeds, etc…) • Results of intensive computations | @PolymathicCoder
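Of the eviction algorithms listed, LRU (least recently used) is probably the easiest to sketch. The class below is a minimal, single-threaded illustration built on `collections.OrderedDict`; the name `LRUCache` and its interface are assumptions.

```python
from collections import OrderedDict


class LRUCache:
    """Bounded key-value cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # drop the oldest entry
```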
  55. 55. Caching • Where to cache? • On disk • File System: Slow and sequential access • DB: A bit better (Data is arranged in structures designed for efficient access, indexes, etc…) • Generally a terrible idea (SSDs make things a bit better) • In-Memory: Fast and random access, but volatile • Something in between: Persistent caches (Redis, etc…) • What type of cache? • Local, Replicated, Distributed, and Clustered | @PolymathicCoder
  56. 56. Caching • How to cache? • Most caches implement a very simple interface • Always attempt to get from cache first using a key • If it is a hit, you saved yourself the overhead • If it is a miss, compute or read from the data store then put in cache for subsequent gets • When you update you can evict stale data • You can set a TTL when you put • Many other common operations... | @PolymathicCoder
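The get-first, fill-on-miss, evict-on-update flow described here can be sketched as follows. A plain dict stands in for both the cache and the data store, and `CacheAsideRepository` and its `store_reads` counter are hypothetical names added only to make the behavior observable.

```python
class CacheAsideRepository:
    """Reads go through the cache; writes evict the stale entry."""

    def __init__(self, store):
        self.store = store      # stand-in for the persistent data store
        self.cache = {}         # stand-in for the cache
        self.store_reads = 0    # counts how often the store was actually hit

    def get(self, key):
        if key in self.cache:           # hit: the overhead is saved
            return self.cache[key]
        self.store_reads += 1           # miss: read from the data store...
        value = self.store[key]
        self.cache[key] = value         # ...and put it in the cache
        return value

    def update(self, key, value):
        self.store[key] = value
        self.cache.pop(key, None)       # evict so the next get is fresh
```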
  57. 57. Caching Patterns • Caching Query Results • Key: Hash of the query itself • How about parameterized queries? • Key: Hash of the query itself + Hash of parameter values • Method/Function Memoization • Key: Method name • How about methods with parameters? • Key: Hash of the method name + Hash of parameter values • Caching Objects • Key: Identity of the object | @PolymathicCoder
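One way to build a key for a parameterized query is sketched below. Note the swap: instead of concatenating two separate hashes as the slide describes, it hashes the query text and the parameter values together in one pass, which serves the same purpose. `query_cache_key` is a hypothetical helper.

```python
import hashlib
import json


def query_cache_key(query, params=()):
    """Derive a cache key from a query string and its parameter values."""
    normalized = " ".join(query.split())  # ignore whitespace differences
    payload = json.dumps([normalized, list(params)], default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

The same idea applies to memoization: substitute the method name for the query text and the call arguments for the parameters.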
  58. 58. Caching Patterns • Time-series datasets (Ex. Real-time feed) • Most of the time pseudo/near real-time is enough • Use caching to throttle access to resources • Cache query result with a t expiry • Fresh data is only read every t | @PolymathicCoder
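Throttling access with an expiry of t can be sketched like this. `TTLCachedValue` is a hypothetical name, `loader` stands for whatever expensive query produces the feed, and the injectable `clock` exists only so the behavior can be exercised without waiting.

```python
import time


class TTLCachedValue:
    """Serves a cached value and reloads it at most once every ttl_seconds."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._expires_at = None  # None means nothing has been loaded yet

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._loader()        # fresh read from the source
            self._expires_at = now + self._ttl
        return self._value
```

However many users request the feed, the underlying resource is read only once per interval.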
  59. 59. Caching Gotchas • Profile your code to assess what to cache, and whether you need to cache at all • Stale state might bite you hard • Incoherence: Inconsistent copies of objects cached with multiple keys • Stale nested aggregates • Network overhead of misses might outweigh the performance gain of hits • Consider writing/updating cache when writing/updating the persistence store | @PolymathicCoder
  60. 60. Featured Solutions • EhCache • Memcached • Oracle Coherence • Redis • A persistent NoSQL datastore • Built-in data structures like sets and lists • Supports intelligent keys and namespaces | @PolymathicCoder
  61. 61. Overcoming The Network I/O Bottleneck
  62. 62. The Network I/O Bottleneck • Network I/O can bring you down just as much | @PolymathicCoder
  63. 63. Asynchronous Processing
  64. 64. Asynchronous Processing • Resource-intensive tasks cannot be handled practically during an HTTP session • Synchronous processing is overused and not necessary most of the time | @PolymathicCoder
  65. 65. Asynchronous Processing Patterns • Pseudo-Asynchronous Processing • Flow • Process data / operations in advance • User requests data or operation • Respond synchronously with pre-processed result • Sometimes not possible (Dynamic content, etc...) | @PolymathicCoder
  66. 66. Asynchronous Processing Patterns • True Asynchronous Processing • Flow • User requests data or operation • Acknowledge • Ex. A REST endpoint that returns a “202 Accepted” HTTP status code • Do processing at your own convenience • Allow the user to check progress • Optionally notify when processing is completed | @PolymathicCoder
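The acknowledge, process later, and check progress flow can be sketched with an in-memory queue. `JobService` is a hypothetical stand-in for a real task queue plus a status store, and the returned 202 merely mirrors the HTTP status code; no web framework is involved.

```python
import uuid
from collections import deque


class JobService:
    """In-memory stand-in for a task queue plus a job-status store."""

    def __init__(self):
        self._queue = deque()
        self._jobs = {}

    def submit(self, func, *args):
        """Accept the work, enqueue it, and acknowledge immediately."""
        job_id = str(uuid.uuid4())
        self._jobs[job_id] = {"status": "pending", "result": None}
        self._queue.append((job_id, func, args))
        return 202, job_id  # mirrors an HTTP "202 Accepted" response

    def status(self, job_id):
        """Let the caller poll for progress."""
        return dict(self._jobs[job_id])

    def run_pending(self):
        """Worker side: drain the queue at the server's convenience."""
        while self._queue:
            job_id, func, args = self._queue.popleft()
            self._jobs[job_id] = {"status": "done", "result": func(*args)}
```

In a real deployment `run_pending` would live in a separate worker process fed by a broker, and the status lookup would sit behind its own endpoint.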
  67. 67. Techniques • Leverage Job/Work/Task Queues • JMS (Java Message Service) – JSR 914 • AMQP (Advanced Message Queuing Protocol): RabbitMQ, ActiveMQ, etc… • AWS SQS • Redis Lists • Etc… • Task Scheduling • Jobs triggered periodically (Cron, Quartz, etc…) • Batch Processing | @PolymathicCoder
  68. 68. Content Delivery Network
  69. 69. Content Delivery Network (CDN) • Static content • Binary (Video, Audio, etc…) • Web objects (HTML, JavaScript, CSS, etc…) • Do NOT serve through your application server • Use a CDN • “A large distributed system of servers deployed in multiple data centers across the internet” • Akamai • AWS CloudFront | @PolymathicCoder
  70. 70. CDN Gotchas • Dirty Caches • script.js is a script file deployed on CDN • Multiple copies of script.js will be replicated across all edge nodes of the CDN • Clients/browsers will keep their own copies of script.js locally • We update script.js • Since the new and old versions have the same URI • New clients will be served the old version by the CDN • Old clients will continue to use the old version from their local cache | @PolymathicCoder
  71. 71. CDN Gotchas • Dirty Caches • What to do? • Simply append version number to file names • script-v1.js, script-v2.js, etc… • Force invalidation of all copies on edge nodes • Set HTTP caching headers properly | @PolymathicCoder
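Appending a version number to a file name is simple to automate in a build step. The helper below is a sketch; `versioned_name` is a hypothetical function that follows the `script-v1.js` naming shown on the slide.

```python
import posixpath


def versioned_name(path, version):
    """Insert '-v<version>' before the file extension of a URL path."""
    stem, ext = posixpath.splitext(path)
    return f"{stem}-v{version}{ext}"
```

Because each release produces a new URI, neither the CDN edge nodes nor browser caches can serve the old file under the new name.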
  72. 72. Domain Name Service
  73. 73. Domain Name Service (DNS) • Do NOT rely on your free domain name registrar DNS • Use a scalable DNS solution • AWS Route 53 • DynECT • UltraDNS • Etc… • Domain Sharding • Browsers limit the number of connections per host (Max of 6 usually) • Creating multiple subdomains (CNAME entries) allows for more resources to be downloaded in parallel • Watch out for: DNS lookup overhead, HTTPS cost, Browser’s Same-Origin Policy, etc… | @PolymathicCoder
  74. 74. Remoting
  75. 75. Remoting • In a SOA (Service Oriented Architecture) • RPC calls to multiple services • Data Exchange (Plain vs. Binary) • SOAP / REST with XML or JSON • Google Protocol Buffers, Apache Thrift, Apache Avro, etc… • Protocol • JMS • HTTP • SPDY | @PolymathicCoder
  76. 76. Qualifying Scalability
  77. 77. Qualifying Scalability • Instrumentation: Bake it into the code early • Monitoring • Health (Application / Infrastructure) • Key Performance Indicators (KPIs) • Number of requests handled, throughput, latency, Apdex, etc ... • Logs • Testing • Load/Stress testing | @PolymathicCoder
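As an example of turning raw latency samples into a KPI, the Apdex score can be computed as (satisfied + tolerating / 2) / total, where a request is satisfied if it completes within a target threshold T and tolerating if it completes within 4T. The function name and the sample values are assumptions for illustration.

```python
def apdex(response_times, threshold):
    """Apdex score in [0, 1] for response times given a target threshold T."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    if not response_times:
        raise ValueError("at least one sample is required")
    satisfied = sum(1 for t in response_times if t <= threshold)
    tolerating = sum(1 for t in response_times if threshold < t <= 4 * threshold)
    return (satisfied + tolerating / 2) / len(response_times)
```

For instance, with a 0.5 second target, two fast requests, two tolerable ones, and one slow one give a score of 0.6.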
  78. 78. Disaster Recovery
  79. 79. When Disaster Hits… • Goal • Fault-tolerant system • Restore service and recover data ASAP in case of a disaster • Be proactive • Develop a Disaster Recovery Plan (DRP) • Practice and test your DRP by doing failure drills | @PolymathicCoder
  80. 80. Scaling Teams
  81. 81. Scaling Teams • Hiring • Always hire top talent • You are as strong as your weakest link • Develop a process to bring people in • Turnkey Hardware/Software Setup (Vagrant, etc...) • Arrange for proper access/accounts • Develop a knowledge base (Architecture documentation, FAQs, etc...) • Development Process • Be Agile • Refine in the spirit of Six Sigma | @PolymathicCoder
  82. 82. Scaling Teams • Team Structure • Small is good • Form ad-hoc teams from pools of Agile breeds • Product Owners • Team Members • Team Lead (Scrum Master) • Engineers • QAs • Architecture Owners • Give them ownership of their DevOps | @PolymathicCoder
  83. 83. The Take-home
  84. 84. The Take-home Message • The early-bird gets the worm • Design to scale from day one • Plan for capacity early • Your needs determine how scalable “your scalable” needs to be • Do not over-engineer • Do not bite off more than you can chew • Building a scalable system is a process • Commit to a road map around bottlenecks • Guided by planned business features • Learn from others’ experiences (Twitter, Netflix, etc...) | @PolymathicCoder
  85. 85. Take it slow… You’ll get there… Work smarter not harder… | @PolymathicCoder
  86. 86. Questions?
  87. 87. http://speakerscore.com/jazoon-scalability Thanks for the attention! Follow @PolymathicCoder abdelmonaim.remani@gmail.com http://blog.polymathiccoder.com
