Cloud Design Patterns

Cloud Design Patterns:
Prepare your application for Azure
Carlos Mendible

+34 648 76 84 17
carlos.mendible@sogeti.com
Carlos Mendible
Lead Solutions Architect
carlos.mendible.com/blog
@cmendibl3
carlosmendible

Agenda
► Design for the Cloud
► Problem Areas in the Cloud
• Availability
• Data Management
• Design and Implementation
• Messaging
• Management and Monitoring
• Performance and Scalability
• Resiliency
• Security
► Cloud Design Patterns
• Cache-aside
• Circuit Breaker
• Competing Consumers
• CQRS
• Event Sourcing
• Valet Key
• Health Endpoint Monitoring
• Static Content Hosting

Pokemon GO Facts
► Total number of downloads – 100 million (by August 8th, Google Play Mkt)
► Total revenue – $268 million (by August 12th)
► Percentage of iOS users that make in-app purchases – 80%
► Daily Active Users – 20+ million
► Gender split (female/male) – 40/60
http://www.businessofapps.com/pokemon-go-usage-revenue-statistics/

Pokemon GO Architecture?
► Google Cloud Platform
► Java
► NoSQL (BigTable?)
► No offline mode
► No global consistency
► Load Balancing
► Sharding
► Dynamic Scaling

Designing for Cloud
► Multi-tenant
► Distributed system
► Abstraction
► Commodity hardware at Internet scale
► Composed of multiple services

Services at Internet Scale
► Failure is expected
► Latency is a fact of life
► CAP: Consistent, available, and partition tolerant … pick two.
► Upgrade without downtime requires multiple concurrent service versions
► You can’t know what you don’t measure.
► Nothing is like production
► Services should be as simple as possible
► As services scale, cost/resource should decline

Designing for Cloud
Design for Scale (Out)
• Partition application, scale by adding (or removing) resources
• Optimize density by using resources efficiently
• Use the right services for the right job
Design for Availability
• Degrade gracefully, isolate faults, fallback to alternate delivery paths
• Ensure customers (and client devices) can access and use the service
• Services that are “live”, but cannot handle desired/required demand are not available
Design for Operations
• Insight is critical; instrumentation, monitoring and alerting
• Lifecycle management; service operations, configuration and updates
• Know the quality of your end user experience before Twitter does

The Book: Cloud Design Patterns
► http://aka.ms/Cloud-Design-Patterns

Problem Areas in the Cloud
► Availability
► Data Management
► Design and Implementation
► Messaging
► Management and Monitoring
► Performance and Scalability
► Resiliency
► Security

Availability
► Availability defines the proportion of time that the system is functional
and working.
► Cloud applications typically provide users with a service level agreement
(SLA), so they must be designed to maximize availability.

Data Management
► Data is the key element of cloud applications, and influences most of the
quality attributes.
► Data is typically hosted in different locations and across multiple
servers.

Design and Implementation
► Consistency and coherence in component design and deployment
► Maintainability
► Reusability
► Decisions made here have a huge impact on quality and total cost of ownership (TCO)

Messaging
► Messaging infrastructure to connect the components and services
► Loosely coupled.
► Asynchronous messaging.

Management and Monitoring
► Management and monitoring are more difficult in the cloud than in an
on-premises deployment.
► Applications must expose runtime information that administrators and
operators can use to manage and monitor the system.
► Applications must support changing business requirements and customization
without being stopped or redeployed.

Performance and Scalability
► Performance is an indication of the responsiveness of a system to
execute any action within a given time interval.
► Scalability is the ability of a system to handle increases in load
without impact on performance.
► Applications should be able to scale out within limits to meet peaks in
demand, and scale in when demand decreases.

Resiliency
► Resiliency is the ability of a system to gracefully handle and recover
from failures.
► Cloud + Internet: increased likelihood that both transient and more
permanent faults will arise.
► Detecting failures, and recovering quickly and efficiently, is necessary.

Security
► Applications must be designed and deployed in a way that protects
them from malicious attacks, restricts access to only approved users,
and protects sensitive data.

Retry Pattern
Runtime Reconfiguration
Scheduler Agent Supervisor
Sharding Pattern
Compute Resource Consolidation
Throttling
External Configuration Store
Federated Identity
Gatekeeper
Compensating Transaction
Index Table
Leader Election
Materialized View
Pipes and Filters
Priority Queue
Queue-Based Load Leveling
Cloud Design Patterns
Cache-Aside
Circuit Breaker
Competing Consumers
CQRS
Event Sourcing
Health Endpoint Monitoring
Static Content Hosting
Valet Key

Cache-Aside
Context
► Applications use a cache to optimize repeated access to information held in a data store.
► Usually impractical to expect that cached data will always be completely consistent with the data in the data store.
Solution
► Implement a strategy that helps to ensure that the data in the cache is up to date as far as possible.
► Detect and handle situations that arise when the data in the cache has become stale.
► Load data into the cache on demand.
Usage
► When to use
• A cache does not provide native read-through and write-through operations.
• Resource demand is unpredictable.
► When not to use
• When the cached data set is static.
• For caching session state information in a web application hosted in a web farm.
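
The load-on-demand behaviour described above can be sketched in a few lines of Python. This is a minimal illustration, not the book's implementation: the `CacheAside` class, its `load` callback, and the time-to-live used to treat entries as stale are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Tuple


class CacheAside:
    """Load-on-demand cache in front of a slower data store (illustrative only)."""

    def __init__(self, load: Callable[[str], str], ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load            # reads the authoritative value from the store
        self._ttl = ttl_seconds      # entries older than this are treated as stale
        self._clock = clock          # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> str:
        entry = self._entries.get(key)
        if entry is not None and self._clock() - entry[0] < self._ttl:
            return entry[1]                        # cache hit
        value = self._load(key)                    # miss or stale: read the store
        self._entries[key] = (self._clock(), value)
        return value

    def invalidate(self, key: str) -> None:
        # Call after writing to the store so the next read reloads the value.
        self._entries.pop(key, None)
```

The application, not the cache, decides when to read the store and when to invalidate, which is why the pattern fits caches without native read-through or write-through support.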

Circuit Breaker
Context
► Access remote resources and services.
► Partial loss of connectivity.
► Complete failure of a service.
► Pointless for an application to continually retry the operation.
Solution
► Avoid cascading failures.
► Prevent an application repeatedly trying to execute an operation that is likely to fail.
► Detect whether the fault has been resolved.
Usage
► When to use
• Prevent an application from attempting to invoke a remote service or access a shared resource if this operation is highly likely to fail.
► When not to use
• Handling access to local private resources in an application.
• As a substitute for handling exceptions in the business logic of your applications.
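
A rough Python sketch of the closed / open / half-open behaviour follows. The class name, the failure threshold, and the reset timeout are assumptions chosen for illustration; a production breaker would usually add logging, per-exception policies, and thread safety.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._threshold = failure_threshold
        self._reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0                          # consecutive failures seen
        self._opened_at: Optional[float] = None     # None means the circuit is closed

    def call(self, operation: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, so let one trial call through.
        try:
            result = operation()
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self._threshold:
                self._opened_at = self._clock()     # trip, or re-trip after a failed trial
            raise
        self._failures = 0
        self._opened_at = None                      # a success closes the circuit
        return result
```

While the circuit is open, callers fail immediately instead of waiting on a service that is likely to fail, and the trial call after the timeout is how the breaker detects that the fault has been resolved.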

Competing Consumers
Context
► An application running in the cloud may be expected to handle a large number of requests.
► The application passes the requests through a messaging system, and a consumer service handles them asynchronously.
Solution
► Use a message queue to implement the communication channel between the application and the instances of the consumer service.
► Consumer service instances receive messages from the queue and process them.
Usage
► When to use
• Application workload can run asynchronously.
• Tasks are independent and can run in parallel.
• Volume of work is highly variable.
► When not to use
• Not easy to separate the application workload into discrete tasks.
• Tasks must be performed synchronously or in a specific sequence.
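
As an in-process stand-in for a cloud message queue, the sketch below uses Python's standard `queue.Queue` with several worker threads competing for messages. The `run_consumers` helper and its parameters are hypothetical names for this example; a real deployment would use a durable queue service and separate consumer processes.

```python
import queue
import threading
from typing import Callable, List


def run_consumers(tasks: queue.Queue, handle: Callable[[int], int],
                  workers: int = 3) -> List[int]:
    """Drain a pre-filled queue with several competing consumer threads."""
    results: List[int] = []
    lock = threading.Lock()

    def consume() -> None:
        while True:
            try:
                item = tasks.get_nowait()   # each message goes to exactly one consumer
            except queue.Empty:
                return                      # nothing left: this consumer stops
            outcome = handle(item)
            with lock:                      # protect the shared results list
                results.append(outcome)
            tasks.task_done()

    threads = [threading.Thread(target=consume) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Results arrive in no particular order, which is acceptable only because the tasks are independent; that is the condition listed under "When to use".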

CQRS
Context
► CRUD operations are applied to the same representation of an entity.
► Data contention in a collaborative domain.
► Mismatch between the read and write representations of the data.
Solution
► Use of separate query and update models for the data.
► Common to separate the data into different physical stores to maximize performance, scalability, and security.
Usage
► When to use
• Task-based user interfaces.
• Performance of data reads must be tuned separately from data writes.
• Integration with other systems.
► When not to use
• Simple business rules.
• CRUD is sufficient.
• Implementation across the whole system.
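
A small Python sketch of separate update and query models is shown here. The order domain, the `AddItem` command, and both model classes are invented for illustration; the projection is synchronous to keep the example short, whereas separate physical stores are often updated asynchronously.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class AddItem:
    """Command: expresses an intent to change state."""
    order_id: str
    sku: str
    quantity: int


class OrderSummaryReadModel:
    """Query side: a denormalized view shaped for reads."""

    def __init__(self) -> None:
        self._totals: Dict[str, int] = {}

    def project(self, command: AddItem) -> None:
        current = self._totals.get(command.order_id, 0)
        self._totals[command.order_id] = current + command.quantity

    def total_items(self, order_id: str) -> int:
        return self._totals.get(order_id, 0)


class OrderWriteModel:
    """Command side: validates and applies updates, then updates the view."""

    def __init__(self, read_model: OrderSummaryReadModel) -> None:
        self._orders: Dict[str, List[AddItem]] = {}
        self._read_model = read_model

    def handle(self, command: AddItem) -> None:
        if command.quantity <= 0:
            raise ValueError("quantity must be positive")
        self._orders.setdefault(command.order_id, []).append(command)
        self._read_model.project(command)   # synchronous here for simplicity
```

Queries never touch the write model, so each side can be tuned, scaled, and secured on its own.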

Event Sourcing
Context
► CRUD systems perform update operations directly against the data store, which can hurt performance and responsiveness and limit scalability.
► Need to record the details of each operation in a separate log.
Solution
► Handle operations on data that is driven by a sequence of events.
► Use an append-only event store.
Usage
► When to use
• Capture “intent,” “purpose,” or “reason”.
• It’s vital to minimize conflicting updates.
• Restore the state of a system.
• Eventual consistency is acceptable.
► When not to use
• Simple domains.
• Consistency is required.
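
To make the append-only store concrete, here is a minimal Python sketch. The bank-account events, the `EventStore` class, and `replay_balance` are assumptions for the example, not part of the original deck.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Event:
    kind: str      # "deposited" or "withdrawn"
    amount: int


class EventStore:
    """Append-only log: events are added, never updated or deleted."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, event: Event) -> None:
        self._events.append(event)

    def events(self) -> Tuple[Event, ...]:
        return tuple(self._events)   # read-only snapshot of the log


def replay_balance(events: Iterable[Event]) -> int:
    """Rebuild the current state by replaying every event in order."""
    balance = 0
    for event in events:
        if event.kind == "deposited":
            balance += event.amount
        elif event.kind == "withdrawn":
            balance -= event.amount
        else:
            raise ValueError(f"unknown event kind: {event.kind}")
    return balance
```

Replaying a prefix of the log gives the state at an earlier point in time, which is how the pattern supports restoring the state of a system.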

Health Endpoint Monitoring
Context
► It is more difficult to monitor services running in the cloud than it is to monitor on-premises services.
► Services typically depend on other services provided by third parties.
► Ensure the required level of availability (SLA).
Solution
► Implement health monitoring by sending requests to an endpoint on the application.
Usage
► When to use
• Verify availability.
• Check for correct operation.
• Monitoring middle-tier or shared services.
• Complement existing instrumentation.
► Note
• Does not replace the requirement for logging and auditing.
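
One possible shape for such an endpoint, using only the Python standard library, is sketched below. The `/health` path, the `CHECKS` table, and its two probes are hypothetical; real probes would query the database, storage, or third-party services the application depends on.

```python
import json
from http.server import BaseHTTPRequestHandler
from typing import Callable, Dict, Tuple

# Hypothetical dependency probes; each returns True when the dependency is healthy.
CHECKS: Dict[str, Callable[[], bool]] = {
    "database": lambda: True,
    "storage": lambda: True,
}


def evaluate_health(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, bytes]:
    """Run every probe and return an HTTP status code plus a JSON body."""
    results = {}
    for name, probe in checks.items():
        try:
            results[name] = bool(probe())
        except Exception:
            results[name] = False            # a crashing probe counts as unhealthy
    status = 200 if all(results.values()) else 503
    return status, json.dumps(results, sort_keys=True).encode("utf-8")


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path != "/health":
            self.send_error(404)
            return
        status, body = evaluate_health(CHECKS)
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# To serve it locally (port chosen arbitrarily):
#   from http.server import HTTPServer
#   HTTPServer(("127.0.0.1", 8080), HealthHandler).serve_forever()
```

An external monitor can then poll the endpoint and alert on any response other than 200, checking both availability and correct operation of the dependencies.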

Health Endpoint Monitoring Demo

Static Content Hosting
Context
► Requests to download static content.
► Processing cycles can be put to better use.
Solution
► Locating some of an application’s resources and static pages in a storage service.
Usage
► When to use
• Minimize costs related to hosting static content.
• CDN.
• Monitor costs and bandwidth usage.
► When not to use
• The application needs to perform some processing on the static content.
• The volume of static content is very small.

Valet Key
Context
► Client programs and web browsers often need to read and write files or data streams to and from an application’s storage.
► Routing these transfers through the application absorbs valuable resources such as compute, memory, and bandwidth.
► Data stores have the capability to handle upload and download of data.
Solution
► Provide the client with a key or token (valet key) that the data store itself can validate.
► Provides time-limited access to specific resources.
Usage
► When to use
• Maximize performance and scalability.
• Minimize operational cost.
• Clients regularly upload or download data.
► When not to use
• If the application must perform some task on the data before it is stored or before it is sent to the client.
• Audit trails are required, or the number of times a data transfer is performed must be controlled.
• The size of the data must be limited.
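
The idea of a token that the data store can validate on its own can be sketched with an HMAC signature. The token layout, the `SECRET` value, and both function names below are assumptions for illustration only and do not reproduce the token format of any real storage service.

```python
import hashlib
import hmac

SECRET = b"demo-secret"   # hypothetical shared key known to the issuer and the data store


def issue_token(resource: str, expires_at: int, secret: bytes = SECRET) -> str:
    """Sign the resource name and expiry time; the client receives only the token."""
    message = f"{resource}\n{expires_at}".encode("utf-8")
    signature = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return f"{expires_at}.{signature}"


def validate_token(resource: str, token: str, now: int,
                   secret: bytes = SECRET) -> bool:
    """Check performed by the data store: not expired, and signed for this resource."""
    expires_text, _, signature = token.partition(".")
    try:
        expires_at = int(expires_text)
    except ValueError:
        return False
    if now >= expires_at:
        return False                                   # time-limited access has ended
    expected = issue_token(resource, expires_at, secret).partition(".")[2]
    return hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8"))
```

Because the signature covers both the resource and the expiry time, a token issued for one resource cannot be reused for another or extended by the client, and the application stays out of the data path.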
