Architecting for the Cloud
Seth Proctor, CTO
@technicallyseth
What’s unique about “cloud”?
Cloud architecture
  On-demand
  Scale-out for capacity & availability
  Public infrastructure; dynamic provisioning
  Flexible
  Commodity
  Hybrid (public & private)
  Simple
  Monitoring & management
  Platform APIs and automation
  Resilient
Why a different architecture?
  Greater capacity
  Cost-effectiveness
  Higher availability and better failure handling
  Lower latencies for global deployment
Challenges
  Distribution brings challenges
  Failures happen frequently
  More difficult to get a global view
  Security & data lifecycle are harder
  Everything else about “distributed computing”
  Still, we can scale most layers
  Load-balancers & name services at the top
  Horizontally-scaled app servers
  Caches & CDNs for content
  Redundant disks and object stores
Scaling the database is the real challenge
Migrating to “the cloud”
  Modern architectures are commodity, on-demand and virtualized
  Enterprise applications need availability and transactions
  If cloud-scaling breaks global consistency, migration isn’t possible
Traditional database design
  RDBMS architectures start at the disk
  Vertical scale follows
  Caching helps, but often breaks consistency
  HA systems become very expensive
  Schema & operations are hard to evolve
  Hard to harness commodity infrastructure
  Not designed to scale-out
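The point that caching often breaks consistency can be sketched in a few lines. This is a minimal illustration, not any product's design: `Database` and `ReadThroughCache` are hypothetical stand-ins, and the cache is deliberately never told about writes that bypass it.

```python
class Database:
    """Stand-in for a single-node database (hypothetical)."""

    def __init__(self):
        self.rows = {}

    def read(self, key):
        return self.rows.get(key)

    def write(self, key, value):
        self.rows[key] = value


class ReadThroughCache:
    """Fills itself on a miss but is never invalidated by later writes."""

    def __init__(self, db):
        self.db = db
        self.entries = {}

    def read(self, key):
        if key not in self.entries:
            self.entries[key] = self.db.read(key)
        return self.entries[key]


db = Database()
cache = ReadThroughCache(db)

db.write("balance", 100)
first = cache.read("balance")   # miss: the cache loads the value from the database
db.write("balance", 40)         # another client updates the database directly
stale = cache.read("balance")   # hit: the cache still returns the old value
fresh = db.read("balance")      # the database itself has the new value
```

After the second write, `stale` and `fresh` disagree. Closing that gap means adding invalidation or coordination, which is where the cost and complexity come from.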
Common options
  Replication
  Active-passive or (gulp) multi-master
  Replicated data but visible delays & conflicts
Sharding
  Split one database into many sub-sets
  More capacity but hard to evolve and relate
  Abandon consistency
  Push correctness & conflict to the application
  Simpler core architecture but painful for applications and hard to reconcile failures
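As a rough illustration of why sharding adds capacity but is hard to evolve, here is a minimal sketch of hash-based shard routing. The function name `shard_for`, the key format and the shard counts are assumptions chosen for the example; real systems often use range or consistent-hash schemes instead.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a stable hash (simple modulo routing)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


keys = [f"customer-{i}" for i in range(1000)]

# Route every key with four shards, then again after adding a fifth shard.
before = {k: shard_for(k, 4) for k in keys}
after = {k: shard_for(k, 5) for k in keys}

# Count the keys whose home shard changed, i.e. the data that must be moved.
moved = sum(1 for k in keys if before[k] != after[k])
```

With modulo routing, most keys change shards when the shard count changes, so growing the cluster implies a large data migration. Queries that relate rows on different shards have to be stitched together by the application.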
Consistency
Side-effects
  Applications are tied to deployment
  A key motivator for dev-ops
  Complex for on-demand changes, failures
  More, independent pieces
  Harder to interpret failures
  Complexity
Global operation
  Many motivations
  Disaster Recovery
  Lower-latency for distributed users
  Data access & storage residency rules
  Trade-offs between latencies and safety
  Storage may be a separate concern from interaction
The database is not the disk
Evolution of “operational”
  Hybrid Tasks
  Hybrid analytics for real-time insight
  Document & SQL for flexibility
  Graph views to ask hypothetical questions and model and track lifecycle
  These imply that …
  Data-sets are larger
  Access and cache patterns are variable
  Latencies have more impact
Global requirements
  Global Operation
  Active in multiple locations & globally consistent
  Data Residence & Governance
  Where is your data on-disk and in-memory?
  SLAs as Policy for Automation
  Reactive or proactive
  Resilient
  Multiple Models
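One way to read "SLAs as Policy for Automation" is as a declarative policy that a management layer compares against the running system. The sketch below is only an illustration under that assumption: `Policy`, `Peer`, `evaluate` and the region names are hypothetical and do not correspond to any real management API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    min_peers_per_region: int           # availability requirement
    allowed_storage_regions: frozenset  # data residency requirement


@dataclass(frozen=True)
class Peer:
    name: str
    region: str
    stores_data: bool


def evaluate(policy, peers, regions):
    """Compare the running peers with the policy and list corrective actions."""
    actions = []
    for region in regions:
        running = sum(1 for p in peers if p.region == region)
        if running < policy.min_peers_per_region:
            missing = policy.min_peers_per_region - running
            actions.append(f"start {missing} peer(s) in {region}")
    for p in peers:
        if p.stores_data and p.region not in policy.allowed_storage_regions:
            actions.append(f"move storage off {p.name} ({p.region})")
    return actions


policy = Policy(min_peers_per_region=2,
                allowed_storage_regions=frozenset({"eu-west"}))
peers = [
    Peer("a", "eu-west", stores_data=True),
    Peer("b", "us-east", stores_data=False),
    Peer("c", "us-east", stores_data=True),
]
actions = evaluate(policy, peers, ["eu-west", "us-east"])
```

Running `evaluate` on a schedule would be the reactive style; feeding it a forecast of the peer list instead of the current one would be a proactive variant.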
Cloud is the evolution from “client-server” to “distributed”
Distributed Database Designs
Approach: Shared Disk
  Key idea: Sharing a file system.
  Example: Oracle RAC, DB2 pureScale
Approach: Shared-Nothing / Sharded
  Key idea: Independent databases for disjoint subsets of data.
  Example: MySQL Cluster and most NoSQL/NewSQL solutions
Approach: Synchronous Replication
  Key idea: Committing data transactionally to multiple locations before returning.
  Example: Google F1
Approach: Durable Distributed Cache
  Key idea: Replicating data in memory on-demand.
[Topology: one diagram per approach]
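The synchronous replication idea, committing to multiple locations before returning, can be sketched as a simplified prepare-then-commit loop. This is an assumption-laden illustration: `Location` and `replicate_synchronously` are hypothetical names, and the sketch ignores failures that occur between the two phases, which real protocols must handle.

```python
class Location:
    """Stand-in for a database replica in one site (hypothetical)."""

    def __init__(self, name, available=True):
        self.name = name
        self.available = available
        self.committed = {}
        self.pending = {}

    def prepare(self, txn_id, writes):
        if not self.available:
            return False
        self.pending[txn_id] = dict(writes)
        return True

    def commit(self, txn_id):
        self.committed.update(self.pending.pop(txn_id))

    def abort(self, txn_id):
        self.pending.pop(txn_id, None)


def replicate_synchronously(txn_id, writes, locations):
    """Return True only after every location has accepted and applied the writes."""
    prepared = []
    for loc in locations:
        if loc.prepare(txn_id, writes):
            prepared.append(loc)
        else:
            # One location cannot take the write: undo the others and fail.
            for p in prepared:
                p.abort(txn_id)
            return False
    for loc in locations:
        loc.commit(txn_id)
    return True


ny = Location("new-york")
ldn = Location("london")
ok = replicate_synchronously(1, {"x": 1}, [ny, ldn])

ldn.available = False
failed = replicate_synchronously(2, {"x": 2}, [ny, ldn])
```

The caller waits for the slowest location on every commit, and a single unavailable location blocks writes. That is the latency versus safety trade-off in concrete form.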
A Durable, Distributed Cache
  Caching puts a database in-memory
  Optimizations focus on memory, not disk
  Caches are transient, on-demand and hierarchical by nature
  Distribution means independence
  Equivalent peers that coordinate to provide a single logical entity
  Drives service resiliency
  Durability provides safety
  Decisions about replication, location and resource allocation are operational
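The bullets above can be made concrete with a toy model: peers cache objects in memory on demand, pull them from each other when possible, and rely on a durable store for safety. This is only a sketch of the general idea, not NuoDB's implementation. `DurableStore` and `CachePeer` are hypothetical, and the sketch omits transactions and concurrency control entirely.

```python
class DurableStore:
    """Stand-in for a durable peer (hypothetical); a dict plays the disk."""

    def __init__(self):
        self.disk = {}

    def persist(self, key, value):
        self.disk[key] = value

    def load(self, key):
        return self.disk.get(key)


class CachePeer:
    """In-memory peer that fills its cache on demand."""

    def __init__(self, name, store):
        self.name = name
        self.store = store
        self.memory = {}
        self.peers = []

    def read(self, key):
        if key in self.memory:
            return self.memory[key]
        for peer in self.peers:          # prefer another peer's memory
            if key in peer.memory:
                self.memory[key] = peer.memory[key]
                return self.memory[key]
        value = self.store.load(key)     # fall back to durable storage
        if value is not None:
            self.memory[key] = value
        return value

    def write(self, key, value):
        self.store.persist(key, value)   # make the change durable first
        self.memory[key] = value
        for peer in self.peers:          # refresh peers that already cache it
            if key in peer.memory:
                peer.memory[key] = value


store = DurableStore()
a, b = CachePeer("a", store), CachePeer("b", store)
a.peers, b.peers = [b], [a]

a.write("k", 1)
cached_on_b_before_read = "k" in b.memory  # b holds nothing it has not asked for
value_on_b = b.read("k")                   # on-demand copy from peer a
a.write("k", 2)                            # b's cached copy is refreshed

c = CachePeer("c", store)                  # a new peer with an empty cache
value_on_c = c.read("k")                   # recovered from the durable store
```

Adding or removing a `CachePeer` changes capacity without touching the data model, which is the sense in which replication and placement become operational decisions.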
Peer to Peer Architecture
[Diagram: NuoDB database peer processes (P) linked by peer-to-peer communications, running on provisioned, manageable resources]
  Storage: S3, Disk, ...
  Clients: SQL Client, Management Client
  Labels: SQL Front-End, SQL Optimizer, Transaction Handling, Object Caching, Object Coordination, Durability
NuoDB is designed for
  Global operations
  On-demand capacity
  Continuous availability
  Policy-driven deployment
  Multi-tenancy
  Multiple models
NuoDB is a single, logical service with global consistency
Scaling YCSB
Scaling DBT-2
Summary
  Look for distributed architectures with on-demand capabilities
  Layer & abstract to support evolution and react gracefully to failures
  Assume your needs will evolve; plan with scale in mind
http://dev.nuodb.com
Questions?

New York breakfast seminar
