New york-breakfast-seminar


This is a presentation that NuoDB CTO Seth Proctor gave at a breakfast seminar.

Seth has 15+ years of experience in the research, design and implementation of scalable systems. His particular focus is on how to make technology scale and how to make users scale effectively with their systems.

Published in: Software

  1. Architecting for the Cloud (Seth Proctor, CTO, @technicallyseth)
  2. What’s unique about “cloud”?
  3. Cloud architecture
       • On-demand
       • Scale-out for capacity & availability
       • Public infrastructure; dynamic provisioning
       • Flexible
       • Commodity
       • Hybrid (public & private)
       • Simple
       • Monitoring & management
       • Platform APIs and automation
       • Resilient
  4. Why a different architecture?
       • Greater capacity
       • Cost-effectiveness
       • Higher availability and better failure handling
       • Lower latencies for global deployment
  5. Challenges
       • Distribution brings challenges
            • Failures are numerous and frequent
            • More difficult to get a global view
            • Security & data lifecycle are harder
            • Everything else about “distributed computing”
       • Still, we can scale most layers
            • Load balancers & name services at the top
            • Horizontally scaled app servers
            • Caches & CDNs for content
            • Redundant disks and object stores
  6. Scaling the database is the real challenge
  7. Migrating to “the cloud”
       • Modern architectures are commodity, on-demand and virtualized
       • Enterprise applications need availability and transactions
       • If cloud-scaling breaks global consistency, migration isn’t possible
  8. Traditional database design
       • RDBMS architectures start at the disk
       • Vertical scale follows
       • Caching helps, but often breaks consistency
       • HA systems become very expensive
       • Schema & operation are hard to evolve
       • Hard to harness commodity infrastructure
       • Not designed to scale out
  9. Common options
       • Replication
            • Active-passive or (gulp) multi-master
            • Replicated data, but visible delays & conflicts
       • Sharding
            • Split one database into many subsets
            • More capacity, but hard to evolve and relate
       • Abandon consistency
            • Push correctness & conflict to the application
            • Simpler core architecture, but painful for applications and hard to reconcile failures
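
The sharding point on slide 9 ("more capacity but hard to evolve") can be illustrated with a minimal Python sketch. It assumes naive hash-modulo routing, which is only one possible scheme and is not specified in the slides; `shard_for`, `moved_fraction` and the key names are hypothetical.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Route a key to a shard with a stable hash (assumed scheme: hash mod N)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def moved_fraction(keys, old_shards: int, new_shards: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)


# Hypothetical keys, for illustration only.
keys = [f"customer-{i}" for i in range(10_000)]
fraction = moved_fraction(keys, old_shards=4, new_shards=5)
print(f"{fraction:.0%} of keys change shard when growing from 4 to 5 shards")
```

With uniformly distributed hashes, a key keeps its shard only when its hash gives the same remainder mod 4 and mod 5, which happens for about one key in five. Adding a single shard under this scheme therefore relocates most of the data, which is one way to read "hard to evolve".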
  10. Consistency
  11. Side-effects
       • Applications are tied to deployment
       • A key motivator for dev-ops
       • Complex for on-demand changes, failures
       • More, independent pieces
       • Harder to interpret failures
       • Complexity
  12. Global operation
       • Many motivations
            • Disaster recovery
            • Lower latency for distributed users
            • Data access & storage residency rules
       • Trade-offs between latencies and safety
       • Storage may be a separate concern from interaction
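
The latency versus safety trade-off on slide 12 can be made concrete with a small Python sketch. It assumes a commit that requests acknowledgements from all replicas in parallel and returns once a chosen number have answered; the slides do not specify a commit protocol, and the function name and round-trip times below are hypothetical.

```python
def commit_latency_ms(rtts_ms, acks_required):
    """Latency of a commit that waits for the fastest `acks_required` replicas.

    Assumes acknowledgements are requested in parallel, so the commit
    completes when the k-th fastest replica has answered.
    """
    if not 1 <= acks_required <= len(rtts_ms):
        raise ValueError("acks_required must be between 1 and the replica count")
    return sorted(rtts_ms)[acks_required - 1]


# Hypothetical round-trip times from one writer to each replica (illustrative).
rtts = {"same-site": 1, "nearby-region": 20, "remote-region": 80}

for k in range(1, len(rtts) + 1):
    latency = commit_latency_ms(list(rtts.values()), k)
    print(f"wait for {k} ack(s): {latency} ms, data on {k} replica(s) at commit")
```

Under these assumptions, each additional copy required at commit time buys safety at the price of the next-slowest round trip, so the most remote replica sets the latency when every location must acknowledge.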
  13. The database is not the disk
  14. Evolution of “operational”
       • Hybrid tasks
            • Hybrid analytics for real-time insight
            • Document & SQL for flexibility
            • Graph views to ask hypothetical questions and to model and track lifecycle
       • These imply that …
            • Data sets are larger
            • Access and cache patterns are variable
            • Latencies have more impact
  15. Global requirements
       • Global operation
            • Active in multiple locations & globally consistent
       • Data residence & governance
            • Where is your data, on disk and in memory?
       • SLAs as policy for automation
            • Reactive or proactive
       • Resilient
       • Multiple models
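
"SLAs as policy for automation" on slide 15 can be sketched as a declarative policy plus a rule that turns an observation into a target. The Python below is a minimal reactive example under assumed names and thresholds (`SlaPolicy`, `desired_peers`, a 50% headroom rule); none of these come from the slides or from any product API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlaPolicy:
    """Hypothetical policy: a latency target and bounds on the peer count."""
    max_p95_latency_ms: float
    min_peers: int
    max_peers: int


def desired_peers(policy: SlaPolicy, current_peers: int, observed_p95_ms: float) -> int:
    """Reactive rule: grow when the SLA is violated, shrink when there is ample headroom."""
    if observed_p95_ms > policy.max_p95_latency_ms:
        target = current_peers + 1
    elif observed_p95_ms < 0.5 * policy.max_p95_latency_ms:  # assumed headroom rule
        target = current_peers - 1
    else:
        target = current_peers
    # Never leave the bounds the operator declared.
    return max(policy.min_peers, min(policy.max_peers, target))


policy = SlaPolicy(max_p95_latency_ms=50, min_peers=2, max_peers=6)
print(desired_peers(policy, current_peers=3, observed_p95_ms=80))  # SLA violated
print(desired_peers(policy, current_peers=3, observed_p95_ms=40))  # within SLA
```

A proactive variant would feed a forecast rather than the latest observation into the same rule; the policy object itself would not need to change.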
  16. Cloud is the evolution from “client-server” to “distributed”
  17. Distributed Database Designs
       • Shared Disk
            • Key idea: Sharing a file system.
            • Example: Oracle RAC, DB2 pureScale
       • Shared-Nothing/Sharded
            • Key idea: Independent databases for disjoint subsets of data.
            • Example: MySQL Cluster and most NoSQL/NewSQL solutions
       • Synchronous Replication
            • Key idea: Committing data transactionally to multiple locations before returning.
            • Example: Google F1
       • Durable Distributed Cache
            • Key idea: Replicating data in memory on-demand.
       • Topology: shown as diagrams on the slide (not captured in the transcript)
  18. A Durable, Distributed Cache
       • Caching puts a database in-memory
            • Optimizations focus on memory, not disk
            • Caches are transient, on-demand and hierarchical by nature
       • Distribution means independence
            • Equivalent peers that coordinate to provide a single logical entity
            • Drives service resiliency
       • Durability provides safety
            • Decisions about replication, location and resource allocation are operational
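
The ideas on slide 18 (in-memory peers that fill on demand, with durability as a separate concern) can be pictured with a toy Python model. This is a single-process sketch under strong simplifying assumptions: no transactions, no concurrency, no networking. It is not NuoDB's implementation, and the class names `DurableStore` and `CachePeer` are hypothetical.

```python
class DurableStore:
    """Stands in for durable storage: the only place data must survive."""

    def __init__(self):
        self._data = {}

    def write(self, key, value):
        self._data[key] = value

    def read(self, key):
        return self._data.get(key)


class CachePeer:
    """Stands in for an in-memory peer that loads objects only when asked."""

    def __init__(self, name, store, peers):
        self.name = name
        self.store = store
        self.peers = peers  # shared list of all live cache peers
        self.cache = {}
        peers.append(self)

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        # On demand: try another peer's memory first, then the durable store.
        for peer in self.peers:
            if peer is not self and key in peer.cache:
                value = peer.cache[key]
                break
        else:
            value = self.store.read(key)
        if value is not None:
            self.cache[key] = value
        return value

    def put(self, key, value):
        # Durability first: persist before the value becomes visible in memory.
        self.store.write(key, value)
        self.cache[key] = value
        # Keep copies already held by other peers up to date.
        for peer in self.peers:
            if peer is not self and key in peer.cache:
                peer.cache[key] = value


store, peers = DurableStore(), []
a = CachePeer("a", store, peers)
b = CachePeer("b", store, peers)
a.put("order:1", "paid")
print(b.get("order:1"))  # "paid", served from peer a's memory

peers.remove(a)          # simulate losing peer a
b.cache.clear()          # and b restarting with an empty cache
c = CachePeer("c", store, peers)
print(c.get("order:1"))  # "paid", recovered from the durable store
```

In this model the peers are interchangeable and hold nothing that cannot be rebuilt, so adding or removing one is an operational decision rather than a data migration.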
  19. Peer to Peer Architecture (diagram)
       • Legend: P = NuoDB database peer process; provisioned, manageable resources; peer-to-peer communications
       • Clients: SQL client, management client
       • Peer layers: SQL front-end, SQL optimizer, transaction handling, object caching, object coordination, durability
       • Storage: S3, disk, ...
  20. NuoDB is designed for
       • Global operations
       • On-demand capacity
       • Continuous availability
       • Policy-driven deployment
       • Multi-tenancy
       • Multiple models
  21. NuoDB is a single, logical service with global consistency
  22. Scaling YCSB
  23. Scaling DBT-2
  24. Summary
       • Look for distributed architectures with on-demand capabilities
       • Layer & abstract to support evolution and react gracefully to failures
       • Assume your needs will evolve; plan with scale in mind
  25. (slide without text)
  26. Questions?