Architecture for Scale [AppFirst]


It’s one thing to support many data sources with megabytes of data. It’s a completely different problem supporting thousands of data sources with terabytes of data every day. How do you create systems that scale infinitely?

The answer is: you don’t. You cannot design for infinite scalability. Rather, consider a pod approach in which each pod supports a defined capacity. Scalability results from deploying multiple cooperating pods.

Systems that handle extremely large data sources with significant processing requirements are difficult to validate at best. Deploying such a system without well-understood capacity limits is destined for failure.

This was first presented at Cloud Expo NYC.

Published in: Technology

  1. Architecture for Scale: A Case Study. AppFirst, Inc.
  2. Shaun Krueger, Lead Software Engineer
     • Automation, Optimization, and Architecture Design
       o Autopilot software
       o Automated stock trading platform
       o Medical device software
       o Adaptive control
       o Distributed queue technologies
  3. AppFirst Collects, Aggregates, and Correlates Information from Production Applications
     • NYC-based software start-up
     • Application
       o Operational Intelligence, Miss Nothing Data
       o Aggregate data from remote servers
       o Provide information for web apps and APIs
     • A Few Metrics Today
       o 100ks of summaries per minute from 10ks of servers
       o Around a GB per remote server per day, TBs daily
       o Query & retrieve information in < 100 ms
       o Data store for up to 1 year
  4. Simplified Architecture
  5. Design for Scale
     • Micro scale
       o Application Components
     • Macro scale
       o The Entire Service
  6. Micro Scale: Data Processing
     Requirements:
     • Process a constant stream of data
       o 3 snapshots per minute, per remote server
     • Create summaries in real time
       o Up to 1 minute behind wall clock time
     • Provide query results in < 100 ms
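The requirement of 3 snapshots per minute per remote server, combined with the earlier figure of around a GB per remote server per day, translates into ingest load by simple arithmetic. The sketch below is illustrative only: the function name, the exact 1.0 GB figure, and the use of decimal terabytes are assumptions, not AppFirst's numbers.

```python
def ingest_load(servers, snapshots_per_minute=3, gb_per_server_per_day=1.0):
    """Return (snapshots per second, terabytes per day) for a server count.

    gb_per_server_per_day=1.0 is an assumed round number standing in for
    "around a GB per remote server per day".
    """
    snapshots_per_second = servers * snapshots_per_minute / 60
    tb_per_day = servers * gb_per_server_per_day / 1000  # decimal TB
    return snapshots_per_second, tb_per_day


for servers in (100, 1_000, 10_000, 100_000):
    per_second, tb = ingest_load(servers)
    print(f"{servers:>7} servers: {per_second:>7.1f} snapshots/s, {tb:>6.1f} TB/day")
```

Under these assumptions, each tenfold increase in remote servers raises both the snapshot rate and the daily data volume tenfold, which is why the remote servers dominate the load.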
  7. Micro Scale: Efficiency
     We found that:
     • Summaries of the data were needed in order to keep queries < 100 ms
       o Server
       o Process
       o Process sets
       o Topology
     • Time series needed for each summary type
       o Minute
       o Hour
       o Day
     We tried:
     • Flat files
     • Network file systems
     • Distributed file systems
     • Relational databases
     • NoSQL key-value store
     • Memory based SQL databases
     • Distributed shared memory
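The minute, hour, and day time series can be pictured as successive roll-ups of the same points. The slides do not say how AppFirst combines values, so the sketch below assumes a plain sum over fixed-width buckets; `roll_up` and the sample data are hypothetical.

```python
from collections import defaultdict


def roll_up(points, bucket_seconds):
    """Sum (epoch_seconds, value) points into buckets of bucket_seconds.

    Summation is an assumption; averages, maxima, or percentiles would
    follow the same bucketing pattern.
    """
    buckets = defaultdict(float)
    for ts, value in points:
        bucket_start = ts - ts % bucket_seconds
        buckets[bucket_start] += value
    return dict(sorted(buckets.items()))


# Four minute-level points: two in the first hour, one in the second hour,
# and one at the start of the second day.
minute_points = [(0, 1.0), (60, 2.0), (3600, 4.0), (86400, 8.0)]
hourly = roll_up(minute_points, 3600)
daily = roll_up(minute_points, 86400)
```

Precomputing the coarser series trades storage for query time: a query over a year of data reads a few hundred day buckets instead of several hundred thousand minute buckets.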
  8. Micro Scale: We learned the hard way
     “Tape is Dead, Disk is Tape, Flash is Disk, RAM Locality is King”
     Jim Gray, Microsoft, December 2006

  9. Micro Scale: Solution
     Aggregation:
     • HPC pipeline processing model
     • RAM based data model
     • Queues as message bus
     • Stateless processing
     • Adaptive control
     • Queries are fully abstracted
     Horizontal scale may require that you revisit your design
  10. Micro Scale
      We all know we need to scale horizontally
      Stateless
      • Any data processing with any time constraint
      • Processes can be run on any server
      • Processes can be migrated
      • Multiple processes can be added as load varies
      • All data stored in distributed shared memory
      • Message passing between components
      • Send keys and not data
      Cluster
      • Use components that cluster
      • Don’t do backups, use replication
      • Redis, memcached, and HBase can be clustered
      • PostgreSQL, MySQL, and RabbitMQ don’t really cluster
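A minimal single-process sketch of "send keys and not data" with stateless workers follows. It is not AppFirst's implementation: a plain dict stands in for the distributed shared memory, `queue.Queue` stands in for the message bus, and the key format and `cpu_avg` summary are invented for illustration.

```python
import queue

shared_store = {}    # stand-in for distributed shared memory (assumption)
bus = queue.Queue()  # stand-in for the message bus (assumption)


def publish(key, snapshot):
    """Store the snapshot in shared memory and send only its key."""
    shared_store[key] = snapshot
    bus.put(key)


def work_once():
    """Stateless worker step: everything it needs is fetched by key."""
    key = bus.get()
    snapshot = shared_store[key]
    summary_key = f"summary:{key}"
    cpu = snapshot["cpu"]
    shared_store[summary_key] = {"cpu_avg": sum(cpu) / len(cpu)}
    return summary_key


publish("server42:minute0", {"cpu": [10.0, 20.0, 30.0]})
result_key = work_once()
```

Because the worker keeps nothing between calls, any number of copies can run on any server and be added or migrated as load varies; the messages stay small regardless of snapshot size.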
  11. Macro Scale: Application Capacity
      Load:
      • Most significant load impact from remote servers
      • User interaction, APIs, and queries do not load the system as much as remote servers
      • Support 100, 1,000, 10,000, 100,000 remote servers
      Will a design that supports 10,000 remote servers scale to support 100,000 remote servers?
  12. Infinite Scale
      • Paralyzes the design team
      • Fosters bad behavior
      • Unrealistic expectations
      • Developers forced to take unrealistic action
      • But... you don’t want to say no to the business
      • The whole purpose is to add users
      • When the business brings a customer with 10,000 servers you want to say: bring it on
  13. Macro Scale: Capacity
      We started with a snapshot:
      • Supported 1,000 remote servers
      • Micro scale results made it possible to scale out
        o Fairly flexible application component design
      • Scale out to 10,000 remote servers
      • This is a financial calculation
      • Scaled out in linear fashion
        o Data processing
        o Storage
      • Started in linear fashion then determined actual requirements
  14. Macro Scale Solution: The Pod
      Pod Architecture:
      • Segmented infrastructure along the lines of load sources
      • Create infrastructure to support specific load
      • Instantiate additional infrastructure with additional load
      • When a pod gets to 85-90% capacity spin out a new pod
      • Capacity of a pod is a financial calculation
      • Scale within a pod in 1,000 server increments
      • Need to automate the deployment of a pod
      [Diagram: Pod 0, Pod 1]
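The spin-out rule can be sketched as a small allocation loop. The 10,000-server pod capacity and the 85% threshold below are assumed values (the slides say capacity is a financial calculation and give a range of 85-90%), and the `Pod` and `Fleet` classes are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    capacity: int
    servers: int = 0

    def provisioned(self, increment=1000):
        """Capacity provisioned inside the pod, in 1,000-server increments."""
        return -(-self.servers // increment) * increment  # ceiling division


@dataclass
class Fleet:
    pod_capacity: int = 10_000  # assumed; really a financial calculation
    spin_out_percent: int = 85  # assumed; the slides give 85-90%
    pods: list = field(default_factory=list)

    def add_servers(self, count):
        """Place servers, spinning out a new pod at the threshold."""
        limit = self.pod_capacity * self.spin_out_percent // 100
        while count > 0:
            if not self.pods or self.pods[-1].servers >= limit:
                self.pods.append(Pod(self.pod_capacity))
            pod = self.pods[-1]
            placed = min(count, limit - pod.servers)
            pod.servers += placed
            count -= placed


fleet = Fleet()
fleet.add_servers(10_000)  # a customer arrives with 10,000 servers
```

With these assumptions the 10,000 servers land in two pods, and no pod is ever pushed past the load at which it was validated.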
  15. • Write Your Own
        o Adaptive software
        o RabbitMQ replacement
        o Network bridges
      • Metrics are king
        o Business metrics
        o Application metrics
      • Time Series Data
        o Issues relate to a specific time
        o Complete state information for any given minute
        o Don’t know what info is needed before a problem occurs; all data every minute
      • Don’t trust the data
        o Clocks are skewed
        o Encodings fail
        o Save all bad data & replay
        o Think defensive
      • The Pod Rocks
        o Isolated
        o Distributed
        o Located where needed
        o Behind the firewall
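One way to act on "don’t trust the data" is to validate every incoming record and keep the raw bytes of anything that fails, so it can be replayed once the cause is fixed. The sketch below is an assumption-heavy illustration: the JSON wire format, the `timestamp` field, and the 300-second skew tolerance are all hypothetical.

```python
import json

MAX_SKEW_SECONDS = 300  # hypothetical tolerance for clock skew


def ingest(raw, received_at, accepted, quarantine):
    """Accept a record, or save its raw bytes for later replay."""
    try:
        record = json.loads(raw.decode("utf-8"))
        ts = float(record["timestamp"])
    except (UnicodeDecodeError, ValueError, KeyError, TypeError):
        quarantine.append((received_at, raw))  # encoding or format failure
        return False
    if abs(received_at - ts) > MAX_SKEW_SECONDS:
        quarantine.append((received_at, raw))  # sender clock is skewed
        return False
    accepted.append(record)
    return True


accepted, quarantine = [], []
ingest(b'{"timestamp": 1000, "cpu": 5}', 1010.0, accepted, quarantine)
ingest(b"\xff\xfe", 1010.0, accepted, quarantine)              # bad encoding
ingest(b'{"timestamp": 0, "cpu": 5}', 1010.0, accepted, quarantine)  # skewed
```

Quarantining the untouched bytes rather than a partially parsed record means a replay exercises exactly the input that failed.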
  16. Conclusions
      • Stateless Data
        o Key to horizontal scale
      • Disk is tape
        o RAM based design is critical, not optional
      • Cluster
        o Use components that cluster, not just master/slave
      • Design for infinite scale does not work
      • Pod approach is an answer for infinite scale
  17. Thank You! Shaun