source: http://www.sfbayacm.org/?p=1394
The specifics of a cloud’s computing architecture may have an impact on application design. This is particularly important in Infrastructure as a Service (IaaS) cloud environments.
This presentation analyzes aspects of the Amazon EC2 IaaS cloud environment that differ from a traditional datacenter and introduces general best practices for ensuring data privacy, storage persistence, and reliable DBMS backup. Best practices for application robustness and scalability on demand are reviewed and are especially significant in leveraging the full potential of an IaaS cloud. The need for a cloud application management and configuration system is briefly reviewed and two alternate approaches to cloud application management are described (RightScale and Kaavo).
2. About HyperStratus
• Silicon Valley-based cloud computing
consultancy
• Founded by executives with deep
experience in corporate IT, enterprise
software, and global consultancy
• We assist clients in establishing cloud
computing strategies, cloud application
architectures, system selection and
implementations
• We also provide cloud computing training
and workshops
3. Topics Covered
• Introduction to Cloud Architecture
• Basic Amazon AWS Concepts
and Considerations
• AWS Cloud Application Design and
Best Practices
5. What is the Cloud?
UC Berkeley RAD Lab Definition
• Huge Resources: The illusion of infinite computing resources available on demand, thereby eliminating the need for Cloud Computing users to plan far ahead for provisioning
• No Commitment: The elimination of an up-front commitment by Cloud users, thereby allowing companies to start small and increase hardware resources only when there is an increase in their needs
• Pay by the Drink: The ability to pay for use of computing resources on a short-term basis as needed (e.g., processors by the hour and storage by the day) and release them as needed
6. Key Cloud Benefits
• Huge Resources: IT agility, as systems can be sized to meet demand. As load scales, system resources are easily obtained to ensure SLAs can be met
• No Commitment: No longer face the tradeoff between overprovisioning (waste of capital) and underprovisioning (waste of users)
• Pay by the Drink: Move IT payments from CAPEX to OPEX. Pay only for actual resources consumed. Tie IT cost to business benefit received
7. Cloud Service Categories
• Infrastructure as a Service (IaaS)
– Amazon EC2
– GoGrid
– Eucalyptus
• Platform as a Service (PaaS)
– Google AppEngine (Python, Java)
– Windows Azure (.Net)
• Software as a Service (SaaS)
– Salesforce.com
– Gmail
8. How the Cloud is Delivered
From more structured (less control) to less structured (more control):
• Public Cloud -- SaaS
• Public Cloud -- PaaS
• Private Cloud -- IaaS
• Public Cloud -- IaaS
9. IaaS Cloud Providers
[Figure: IaaS providers plotted on two axes, Public vs. Private and Isolated vs. Shared, giving four quadrants: Virtual Private Cloud, Public Cloud, Internal Private Cloud, and External Private Cloud. Providers shown: Amazon (AWS), GoGrid, CohesiveFT (VPN Cubed), Rackspace, Amazon VPC (IPsec VPN), IBM, HP, Cisco/VMware, Terremark, HP (EDS), Microsoft, AT&T, 3Tera, Eucalyptus.]
10. Cloud Application Example
• Grows from 1MM to 100+ MM insurance claims/day
in one week
• Traditional solution: $750K new hardware +
$30K/month maintenance/hosting
• Cloud solution: $600/month Amazon Web Services
11. Cloud Taxonomy
Source: Christofer Hoff, Cloud Security
Alliance “Security Guidance for Critical
Areas of Focus in Cloud Computing,” Page
22
• Foundation of cloud is virtualization
• Upper cloud services are incremental to lower cloud services
• Lower level services are key for higher level services
12. IaaS/PaaS in Detail
Components Providers
Adapted: Christofer Hoff, “The Frogs Who Desired a King”
• Amazon AWS EC2 is an IaaS environment with RESTful
Web Services API to allocate & manage resources
13. IaaS/PaaS in Detail
Components Providers
Adapted: Christofer Hoff, “The Frogs Who Desired a King”
• AWS SQS, SimpleDB, and CloudFront are PaaS Middleware
• Google AppEngine and Microsoft Azure are PaaS AppServers
19. IaaS Network Component :
EC2 Regions & Zones
• Amazon EC2 locations are composed of Regions which
contain Availability Zones.
• Regions consist of one or more Availability Zones and are
geographically dispersed in separate areas or countries
– Currently only two Regions: “us-east-1”, “eu-west-1”
• Availability Zones are distinct datacenter locations that
are engineered to be insulated from failures in other
Zones and provide inexpensive, low latency network
connectivity to other Availability Zones in the same
Region
– E.g. “us-east-1a”, “us-east-1b”, …
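The naming convention in the examples above (a Zone name is its Region name plus a letter suffix) can be captured in a small helper. This is a minimal Python sketch, not an AWS API: `region_of` and `same_region` are hypothetical names, and the single-letter suffix is an assumption drawn from the slide's examples.

```python
def region_of(zone: str) -> str:
    """Return the Region part of an Availability Zone name.

    Assumes the zone name is the region name followed by one letter,
    as in the slide's examples ("us-east-1a" -> "us-east-1").
    """
    if len(zone) < 2 or not zone[-1].isalpha() or not zone[-2].isdigit():
        raise ValueError(f"unexpected zone name: {zone!r}")
    return zone[:-1]


def same_region(zone_a: str, zone_b: str) -> bool:
    """True when both zones belong to the same Region."""
    return region_of(zone_a) == region_of(zone_b)
```

A check like `same_region("us-east-1a", "us-east-1b")` is useful before assuming low-latency connectivity between two instances.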
20. IaaS Network Component :
EC2 Regions & Zones (cont)
• Traffic between Availability Zones in a single region is on AWS-controlled
redundant infrastructure
• All traffic between Regions crosses the public Internet over multiple Tier-1 networks
21. IaaS Compute Component:
AWS EC2
• EC2 is based upon Xen Hypervisor (with
significant constraints)
• 1 EC2-CU = CPU capacity of 1.0-1.2 GHz 2007
Opteron or 2007 Xeon
• Compute capacity is defined at granular levels,
i.e., number of CPU cores and “Compute Units”
per core (1 core @ 1 CU up to 8 cores @ 2.5 CU)
• Virtual Memory ranges are 1.7GB, 7.5GB and
15GB depending on instance type
• Default quota of 20 VM instances per account
22. IaaS Compute Component :
EC2 Compute Unit
• Several AWS benchmarks and tests
manage the consistency and predictability
of the performance of an EC2 Compute
Unit
• Over time, there may be several different
types of physical commodity hardware
underlying EC2 instances, but EC2-CU
performance should remain constant
23. EC2 Standard Linux Instance Types
• Small (AWS name: m1.small)
– CPU: 1 EC2-CU (1 virtual core with 1 EC2 Compute Unit)
– Memory: 1.7 GB (917MB swap)
– Storage (unformatted): 170GB instance storage (160GB plus 10GB root partition, 1 spindle)
– Platform: 32-bit
– I/O: Moderate
– Cost/hour: $0.085 ($747 a year or $490.30 a year Reserved)
• Large (AWS name: m1.large)
– CPU: 4 EC2-CU (2 virtual cores with 2 EC2 Compute Units each)
– Memory: 7.5 GB (No swap)
– Storage (unformatted): 910GB instance storage (2 x 450 GB plus 10GB root partition, 3 spindles)
– Platform: 64-bit
– I/O: High
– Cost/hour: $0.34 ($2978 a year or $1961 a year Reserved)
• Extra Large (AWS name: m1.xlarge)
– CPU: 8 EC2-CU (4 virtual cores with 2 EC2 Compute Units each)
– Memory: 15 GB (No swap)
– Storage (unformatted): 1810GB instance storage (4 x 450GB plus 10GB root partition, 5 spindles)
– Platform: 64-bit
– I/O: High
– Cost/hour: $0.68 ($5957 a year or $3922 a year Reserved)
24. EC2 High-CPU Linux Instance Types
• High-CPU Medium (AWS name: c1.medium)
– CPU: 5 EC2-CU (2 virtual cores with 2.5 EC2 Compute Units each)
– Memory: 1.7 GB (917MB swap)
– Storage (unformatted): 370 GB instance storage (360 GB plus 10 GB root partition, 1 spindle)
– Platform: 32-bit
– I/O: Moderate
– Cost/hour: $0.17 ($1489 a year or $981 a year Reserved)
• High-CPU Extra Large (AWS name: c1.xlarge)
– CPU: 20 EC2-CU (8 virtual cores with 2.5 EC2 Compute Units each)
– Memory: 7.5 GB (No swap)
– Storage (unformatted): 1810 GB instance storage (4 x 450 GB plus 10 GB root partition, 5 spindles)
– Platform: 64-bit
– I/O: High
– Cost/hour: $0.68 ($5957 a year or $3922 a year Reserved)
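The EC2-CU totals in the instance tables follow from a simple product: virtual cores times Compute Units per core. The short Python sketch below reproduces that arithmetic using the core counts and per-core units listed above; the dictionary and function names are illustrative only.

```python
# (virtual cores, EC2 Compute Units per core), copied from the instance tables.
INSTANCE_TYPES = {
    "m1.small": (1, 1.0),
    "m1.large": (2, 2.0),
    "m1.xlarge": (4, 2.0),
    "c1.medium": (2, 2.5),
    "c1.xlarge": (8, 2.5),
}


def total_compute_units(instance_type: str) -> float:
    """Total EC2-CU for an instance type: cores multiplied by units per core."""
    cores, units_per_core = INSTANCE_TYPES[instance_type]
    return cores * units_per_core


if __name__ == "__main__":
    for name in INSTANCE_TYPES:
        print(name, total_compute_units(name))
```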
25. IaaS Storage Component :
EC2, EBS, S3
• EC2 Instance Default Local Storage –
ephemeral virtual disks that are an integral part of
the EC2 VM instance
– Range from 170GB to 1.8TB total space, 1 to 5 disks
• Elastic Block Storage – EC2 Additional
persistent disk volumes that can be attached
and mounted on a running VM.
– 1TB max per volume, default quota of 20 volumes
• S3 File storage – Reliable web URL accessible
file-based storage.
– 5GB max per file
26. IaaS Storage Component :
EBS
• An EBS volume is created in a user specified
AWS Availability Zone.
• AWS equivalent of a local SAN RAID Disk and
can only be attached to one running EC2
instance at a time in the same Zone
• Appears to running OS VM as standard disk
drive (e.g. /dev/sdg)
• Must be partitioned and/or formatted with file
system before being mounted
• Higher reliability, lower latency and higher
throughput than Instance Default Storage
• Supports live snapshots to S3
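Because a volume can only be attached to an instance in the same Zone, it can help to check the Zones before attaching. The following is a minimal sketch, assuming `ec2` is a boto3-style EC2 client passed in by the caller (an assumption; the slides do not name a client library). The function name and the default device are illustrative.

```python
def attach_in_same_zone(ec2, volume_id, instance_id, device="/dev/sdg"):
    """Attach an EBS volume only if it lives in the instance's Availability Zone.

    `ec2` is assumed to be a boto3-style EC2 client; it is injected so the
    function can also be exercised with a stub.
    """
    volume = ec2.describe_volumes(VolumeIds=[volume_id])["Volumes"][0]
    reservations = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"]
    instance = reservations[0]["Instances"][0]

    volume_zone = volume["AvailabilityZone"]
    instance_zone = instance["Placement"]["AvailabilityZone"]
    if volume_zone != instance_zone:
        raise ValueError(
            f"volume is in {volume_zone} but instance is in {instance_zone}"
        )
    return ec2.attach_volume(
        VolumeId=volume_id, InstanceId=instance_id, Device=device
    )
```

After a successful attach, the volume still has to be formatted (first use only) and mounted by the operating system.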
27. IaaS Storage Component :
S3
• S3 File storage – Reliable web URL
accessible file storage (e.g.
<bucket>.s3.amazonaws.com/file_1.mpg).
• Buckets are created in user assigned
Regions (e.g. “us-east-1”, “eu-west-1”)
• Unlimited number of index folders and files
(i.e. objects) per bucket, 5GB max per file
• Files in a bucket are replicated to dispersed
Zones in the bucket’s Region
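The URL form shown above (bucket name as a host prefix, file name as the path) can be built with a small helper. This is an illustrative Python sketch; `s3_object_url` is a hypothetical name, and the `http` default scheme is an assumption.

```python
from urllib.parse import quote


def s3_object_url(bucket: str, key: str, scheme: str = "http") -> str:
    """Build a <bucket>.s3.amazonaws.com URL for an object key.

    The key is percent-encoded; "/" is left as-is so folder-like
    prefixes remain readable.
    """
    return f"{scheme}://{bucket}.s3.amazonaws.com/{quote(key)}"
```

For example, `s3_object_url("media", "file_1.mpg")` yields `http://media.s3.amazonaws.com/file_1.mpg`. Whether the URL is actually readable depends on the object's ACL.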
28. IaaS Storage Component :
EC2 Ephemeral Storage Notes
• All Default Local instance storage devices (i.e., non-EBS
EC2 volumes) are ephemeral, and all data on
them is lost when the instance is terminated (or
crashes and cannot be rebooted). Use S3, EBS, or
SimpleDB (SDB) for permanent data.
• Analogous to the file system lifecycle of a Linux
Live-CD that uses RAM drives
• However, default instance storage data is retained
on reboot.
• This is a major EC2 constraint that must be taken
into consideration in an application’s design.
29. IaaS Storage Component :
Default Ephemeral Storage Devices
Location    Description
/dev/sda1   Formatted and mounted as 10GB root (/) on all instance types.
/dev/sda2   Formatted and mounted as /mnt on m1.small (150GB) and c1.medium (350GB) instances
/dev/sda3   Formatted and mounted as /swap on m1.small and c1.medium instances (Size 939MB)
/dev/sdb    Formatted and mounted as /mnt on m1.large, m1.xlarge, and c1.xlarge instances (430GB)
/dev/sdc    Not formatted or mounted on m1.large, m1.xlarge, and c1.xlarge instances (450GB raw)
/dev/sdd    Not formatted or mounted on m1.xlarge and c1.xlarge instances (450GB raw)
/dev/sde    Not formatted or mounted on m1.xlarge and c1.xlarge instances (450GB raw)
30. IaaS Image Component:
EC2 and AMIs
• EC2 saves a bootable VM root image as an
“Amazon Machine Image” (AMI).
• An AMI is digitally signed and encrypted by
the owner using a private X.509 key. AWS has
a copy of the corresponding public X.509
certificate for decrypting an AMI at EC2
Instance “launch” time
• An AMI is equivalent to a “Gold Master”
image of the configured VM for an EC2
instance
• Multiple EC2 instances can be launched from
the same AMI
31. IaaS Image Component :
S3 and AMIs
• EC2 AMIs are stored in S3 as a “bundle” of
segmented 10MB files and EC2 VM instances
are instantiated (launched) from their S3 AMI.
• Users can create their own AMIs from scratch
(P2V); use pre-built public AMIs; or use a pre-
built AMI as a starting point and then add custom
software assets to finalize the desired AMI.
• Updating an EC2 AMI requires a full “bundling”
process and results in an additional AMI,
different than the original one.
32. IaaS Image Component :
EBS and AMIs
• A running EC2 Instance can be imaged as an
EBS-Backed AMI and saved as an EBS
Snapshot.
• Instances launched from these EBS-Backed AMI
snapshots launch much faster and use persistent
default storage.
• Persistent 15GB root file system.
• EBS-Backed instances can be “Stopped” and
“Started” and the contents of the local storage
will persist.
• Caution - If a running instance is
“Terminated”, its EBS volume will be deleted.
35. IaaS Network Component :
EC2 Virtual NIC
• Each EC2 Instance has only one Virtual NIC that
is assigned a dynamic EC2 MAC Address and
internal private IP Address
• AWS VM Prevents network cross-talk among
users
• No visibility beyond individual machine NIC
traffic -- even among correlated machines in the
same application configuration
• Communicating within multi-tier VM
configurations typically involves dynamic
DNS server registration
36. IaaS IPAM/DNS Component :
EC2 IP Addresses & DNS
• No customer control of initial VM IP Address or
DNS name assignments
• EC2 routers map two IP addresses to the EC2
Instance
• dynamic EC2 Private Address (RFC-1918, e.g.
10.x.x.x)
• dynamic EC2 Public Address using Network
Address Translation (NAT) (Note: public address
range belongs to AWS)
• Auto-generated DNS name has IP Address as a
component of the name.
• Fixed Elastic-IP Addresses pre-allocated for an
AWS account and later assigned to a running EC2
instance.
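Since the auto-generated DNS name embeds the IP address, the address can be recovered from the name. The sketch below assumes the commonly seen name shapes `ec2-A-B-C-D.<domain>` for public names and `ip-A-B-C-D.<domain>` for private names; those patterns and the function name are assumptions, not something the slides specify.

```python
import re

# Assumed name shapes: "ec2-72-44-45-204.compute-1.amazonaws.com" (public)
# and "ip-10-251-50-12.ec2.internal" (private).
_EC2_NAME = re.compile(r"^(?:ec2|ip)-(\d+)-(\d+)-(\d+)-(\d+)\.")


def ip_from_ec2_dns_name(name: str) -> str:
    """Extract the dotted-quad IP address embedded in an EC2 DNS name."""
    match = _EC2_NAME.match(name)
    if match is None:
        raise ValueError(f"not an EC2-style DNS name: {name!r}")
    return ".".join(match.groups())
```

Because both the name and the address change when an instance is relaunched, neither should be hard-coded into application configuration.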
37. IaaS Security Component :
EC2 Security Groups & ACLs
• EC2 Security Groups function as network
firewall configurations.
– A Security Group is a named collection of incoming
network traffic rules for an EC2 account.
• Access to each S3 file is controlled by its own
Access Control List (ACL).
– ACL allows READ, WRITE, and FULL CONTROL
(includes access to ACL) privileges on:
• “Everyone”
• “Authenticated Users” (only valid AWS users)
• A list of individual AWS users or groups
38. PaaS Messaging/Queuing Component :
AWS SQS
• Highly Reliable Message Queuing Service with
built-in redundancy within user assigned Regions
• Messages accessible from anywhere via Web
API
• Up to 8 KB of Unicode data per message
• Messages can be retained in queues for up to 4
days
• Messages can be sent and read simultaneously
but FIFO not guaranteed
• Queues can be securely shared with other AWS
accounts and anonymously. Queue sharing can
also be restricted by IP address and time-of-day.
39. PaaS Database Component :
AWS SimpleDB Beta
• Enhanced MyISAM-like database service
• Simple web services interface to create and
store multiple data sets and query your data
• Data is automatically indexed
• Data stored in Region and automatically
replicated to dispersed Zones
• Requests originating from an application
running in same Amazon Region will have
near-LAN latency.
40. PaaS Database Component :
AWS SimpleDB Beta (cont)
• Similar to MyISAM with enhanced features
– No SQL grammar support
– No table JOIN
– Simple WHERE criteria
• 100 domains (tables) quota per account, max
10GB per domain, max 256 attributes (columns)
per row, max 1KB data per attribute (cell)
• Typically used to store App logs, EC2 Instance
configurations, Application state, Instance status,
analytics, indexes to S3 data
• Scale-out is as simple as creating new domains,
rather than building out new servers.
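The per-row limits quoted above (256 attributes per row, 1KB per attribute) are easy to check on the client before a write is attempted. This is a minimal Python sketch using those two numbers from the slide; `check_item` is a hypothetical helper, not part of any SimpleDB client library, and measuring size as UTF-8 bytes is an assumption.

```python
MAX_ATTRIBUTES_PER_ITEM = 256   # from the slide: max attributes (columns) per row
MAX_ATTRIBUTE_BYTES = 1024      # from the slide: max 1KB data per attribute


def check_item(attributes: dict) -> list:
    """Return a list of limit violations for one row; empty means it fits."""
    problems = []
    if len(attributes) > MAX_ATTRIBUTES_PER_ITEM:
        problems.append(
            f"{len(attributes)} attributes exceeds {MAX_ATTRIBUTES_PER_ITEM}"
        )
    for name, value in attributes.items():
        size = len(str(value).encode("utf-8"))  # assumption: size in UTF-8 bytes
        if size > MAX_ATTRIBUTE_BYTES:
            problems.append(f"attribute {name!r} is {size} bytes")
    return problems
```

Values that exceed the attribute limit are candidates for S3, with only the S3 key stored in the domain.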
42. Cloud App Design Attributes
• Abstract Resources: Focus on your needs, not on hardware specs. As your needs change, so should your resources.
• On-Demand Provisioning: Ask for what you need, exactly when you need it. Get rid of it when you don’t need it.
• Scalability: Design should allow for resources to scale up or down depending on usage needs.
• No Up-Front Costs: No contracts or long-term commitments. Pay only for what you use but design for the possibility of enhanced resource usage.
• Dynamism: Each machine instance must be capable of dynamically identifying its configuration and relationship to other resources in the system.
43. AWS Cloud Application Design:
10 Best Practices
1. Build cloud apps, not apps in the cloud
2. Virtualize the application stack
3. Design for failure and nothing fails
4. Design for scalability
5. Loose coupling lets you maximize plug&play
6. Design for dynamism
7. Build Security into every component
8. Leverage native cloud storage options
9. Leverage best cloud Management Tools
10. Don't fear cloud constraints
44. Best Practices:
Don’t Just Build apps in the cloud
[Figure: traditional tiered application stack with Load Balancer, Web Tier, Business Tier, Messaging, and Data Tier, each with back-ups. Source: GigaSpace, “Practical Guide for Developing Enterprise Application on the Cloud”]
• Don’t simply port traditional Apps to the Cloud
• Traditional Application Stacks are architected in functional silos
• Each silo has its own machines, network, management, and support
45. Build Cloud Apps:
Virtualize the Application Stack
[Figure: virtualized application stack with Users, Load Balancer, Web Processing Units, Business Processing Units, and DB. Source: GigaSpace, “Practical Guide for Developing Enterprise Application on the Cloud”]
• Re-factor to use standardized VM containers. Each instance should use
self-discovery, be self configurable, and network independent
• Use cloud standardized Messaging & DB when possible
• Leverage inherent EBS replication and snapshots for DBMS
46. Build Cloud Apps:
Compensate for Ephemeral Storage
• EC2 instance default storage can only be used for
transient data (e.g. intermediate or temp data files).
Don’t use it for archival data logs such as login logs
or error dumps.
– Consider using SDB to store persistent archival data records
that can be associated with a key (e.g. timestamp)
• If OK to recover only from most recent backup, consider
restoring data from S3 at boot-up and backing-up current
data to S3 at shutdown.
• If not OK, use EBS attached volumes for all persistent
file data.
• DBMS should always use EBS volumes
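The restore-at-boot and back-up-at-shutdown option can be sketched as two small functions. This assumes `s3` is a boto3-style S3 client supplied by the caller (an assumption; the slides name no client library); the function names, bucket, and key are hypothetical.

```python
def restore_from_s3(s3, bucket, key, local_path):
    """At boot: pull the most recent backup from S3 onto ephemeral storage."""
    # boto3-style call: download_file(Bucket, Key, Filename)
    s3.download_file(bucket, key, local_path)


def backup_to_s3(s3, bucket, key, local_path):
    """At shutdown: push the current data file from ephemeral storage to S3."""
    # boto3-style call: upload_file(Filename, Bucket, Key)
    s3.upload_file(local_path, bucket, key)
```

Anything written between the last backup and a crash is lost with this approach, which is why the slide limits it to cases where recovering from the most recent backup is acceptable.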
47. Build Cloud Apps:
Compensate for Ephemeral Storage
(cont)
• Consider using soft-links (Linux) to map portions
of the ephemeral Default Storage application file
tree to persistent EBS volumes
– This can be used for archival data logs such as login
logs or error dumps (i.e., /var/log/ files can be soft-linked
to an EBS volume).
• If only small chunks of persistent storage are
needed for each Instance, consider using EBS
volumes exported on EC2 NFS servers.
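The soft-link approach amounts to moving a directory onto the persistent volume and leaving a link at the old path. The Python sketch below shows one way to do that; the function name is hypothetical, and the paths passed in (for example a log directory and a directory on a mounted EBS volume) are the caller's choice.

```python
import os
import shutil


def relocate_to_persistent(ephemeral_dir: str, persistent_dir: str) -> None:
    """Move a directory to persistent storage and soft-link the old path to it."""
    if os.path.islink(ephemeral_dir):
        return  # already relocated on an earlier boot
    if os.path.exists(persistent_dir):
        raise FileExistsError(f"{persistent_dir} already exists")
    parent = os.path.dirname(persistent_dir)
    if parent:
        os.makedirs(parent, exist_ok=True)
    shutil.move(ephemeral_dir, persistent_dir)   # data now lives on the volume
    os.symlink(persistent_dir, ephemeral_dir)    # old path keeps working
```

Applications keep writing to the original path, so no application configuration has to change.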
48. Build Cloud Apps:
Compensate for Dynamic IP Address
• Attach ElasticIP for Internet-facing EC2
instances (e.g. the HAProxy load-balancer
instance)
• Use dynamic DNS registration of EC2
instance internal IP address or use SDB
• EC2 instances should only use the internal
IP address for communicating with each
other (free!).
49. Best Practices:
Design for Failure
• “Everything fails, all the time”, Werner
Vogels, CTO Amazon.com
• Avoid single points of failure
• Assume everything fails, and design
backwards
• Design for failure and your App won’t fail
50. Design for Failure:
What Can Fail in AWS?
• The EC2 Instance may crash
• Portions of Zone may not be accessible (i.e.
internal network problem within Zone)
– EC2 Instance in a Zone may not be launch-able
– EBS volumes in a Zone may not be accessible
• AWS Services in a Region may not be
accessible (very low probability)
– S3 buckets in Region may not be accessible
– SDB domains (tables) in a Region may not be
accessible
– SQS Queues in a Region may not be accessible
51. Design for Failure:
Use Failure Tolerant Features
• Use Elastic IP addresses (or their DNS names)
for consistent and re-mappable routes
• Use multiple EC2 Availability Zones
• Use EBS for persistent file systems and
snapshots.
– Snapshots can be used to restore EBS volumes on other
Zones
– Use rsync for real-time synchronization of EBS volumes
across Zones
• Create multiple DBMS slaves across Availability
Zones
• Use real-time monitoring (Amazon CloudWatch
or RightScale)
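Using multiple Availability Zones implies some fallback logic when one Zone cannot launch instances. A minimal sketch of that idea follows; `launch` is a hypothetical caller-supplied function that starts an instance in a given Zone and raises on failure, so no specific AWS client is assumed.

```python
def launch_with_failover(launch, zones):
    """Try each zone in order; return (zone, result) for the first success.

    `launch` is a caller-supplied callable taking a zone name. Any exception
    it raises is treated as "this zone is unavailable right now".
    """
    errors = {}
    for zone in zones:
        try:
            return zone, launch(zone)
        except Exception as exc:  # broad on purpose: any failure triggers failover
            errors[zone] = exc
    raise RuntimeError(f"launch failed in all zones: {errors}")
```

The same pattern applies to restoring an EBS volume from a snapshot in another Zone when the original Zone is unreachable.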
52. Best Practices:
Design for Scalability
• A scalable architecture is critical to take
advantage of a scalable infrastructure
• No central point of data storage contention
– Shared Nothing
– Sharding
– Distributed Caching
• Loose coupling of processing requestors
and responders
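Sharding, one of the techniques listed above, routes each record to one of several independent stores based on its key. The sketch below uses a hash-modulo scheme, which is one simple option among several and an assumption here; `shard_for` is a hypothetical name.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a record key to a shard index in [0, shard_count).

    Uses a stable hash (not Python's built-in hash(), which is salted per
    process) so every instance routes the same key to the same shard.
    """
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count
```

One known trade-off of plain modulo sharding is that changing `shard_count` remaps most keys, so the shard count is usually fixed up front or replaced with a scheme that tolerates resizing.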
53. Design for Scalability :
Use AWS Elastic Features
• Use Load Balancing on multiple layers:
either your own (e.g. HAProxy EC2
instance) or AWS Elastic Load Balancing
• Use Cloud monitoring systems: either your
own (e.g. CollectD) or AWS CloudWatch
• Use Auto-scaling technology (Free with
CloudWatch)
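Auto-scaling driven by monitoring data boils down to a rule that turns a metric into a target instance count. The sketch below is a simplified, threshold-based illustration, not the AWS Auto Scaling API; the function name, thresholds, and minimum are assumptions, and the maximum of 20 mirrors the default instance quota mentioned earlier.

```python
def desired_instance_count(
    current: int,
    avg_cpu_percent: float,
    scale_up_at: float = 70.0,    # assumed threshold
    scale_down_at: float = 30.0,  # assumed threshold
    minimum: int = 2,             # assumed floor, to avoid a single point of failure
    maximum: int = 20,            # default per-account instance quota
) -> int:
    """Return the instance count to aim for, changing by at most one per step."""
    if avg_cpu_percent > scale_up_at:
        target = current + 1
    elif avg_cpu_percent < scale_down_at:
        target = current - 1
    else:
        target = current
    return max(minimum, min(maximum, target))
```

Stepping by one instance at a time, with a gap between the two thresholds, is a simple way to avoid oscillating between scale-up and scale-down.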
55. Best Practices:
Build Loosely Coupled Systems
• Use Independent components
• Design everything as a Black Box with well
defined inputs and outputs
• Use subsystem de-coupling for Hybrid
models
• Use Load-balanced clusters of Black
Boxes to maximize plug&play
56. Loose Coupling:
Use Message Queues
[Figure: tight coupling, with Controllers A, B, and C connected directly, compared with loose coupling using queues, with Controllers A, B, and C connected through Queues Q1, Q2, and Q3]
• Use a message queue system such as Amazon
SQS or Gearman to pass along requests
• Each message queue consumer can be a
cluster of EC2 instances
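The queue-based pattern can be illustrated locally, with Python's standard `queue.Queue` standing in for SQS or Gearman and threads standing in for a cluster of consumer instances. This is only an in-process sketch under those assumptions; `run_workers` and `handler` are hypothetical names.

```python
import queue
import threading


def run_workers(requests, handler, worker_count=3):
    """Process requests through a shared queue with several consumers.

    The producer only knows the queue, not the workers, so consumers can be
    added or removed without changing the producer.
    """
    work = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            item = work.get()
            if item is None:          # sentinel: no more work for this worker
                break
            outcome = handler(item)
            with results_lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for request in requests:
        work.put(request)
    for _ in threads:
        work.put(None)                # one sentinel per worker
    for thread in threads:
        thread.join()
    return results
```

Results arrive in whatever order the workers finish, so consumers should not rely on message ordering.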
57. Best Practices:
Design for Dynamism
• Don’t assume health or fixed location of
components
• Use designs that are resilient to reboot and re-launch
• Bootstrap your instances based on self-discovery
(e.g. EC2 Metadata API)
– Store configurations in SimpleDB to bootstrap instances
• Enable dynamic configuration
– Store application, subsystem, and EC2 instance state in
SimpleDB so instances can know health of system
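Self-discovery through the EC2 instance metadata service can be sketched as below. The sketch assumes the link-local metadata endpoint is reachable with plain unauthenticated GET requests, which newer instance configurations may not allow; `discover_self` is a hypothetical name, and the `opener` parameter exists only so the code can be exercised off EC2.

```python
from urllib.request import urlopen

# Link-local metadata endpoint, reachable only from inside an instance.
METADATA_URL = "http://169.254.169.254/latest/meta-data/"


def fetch_metadata(path, opener=urlopen, timeout=2):
    """Read one metadata value, e.g. "instance-id" or "local-ipv4"."""
    with opener(METADATA_URL + path, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


def discover_self(opener=urlopen):
    """Collect the facts an instance needs to register itself at boot."""
    return {
        "instance_id": fetch_metadata("instance-id", opener),
        "local_ipv4": fetch_metadata("local-ipv4", opener),
        "zone": fetch_metadata("placement/availability-zone", opener),
    }
```

A boot script could write the returned dictionary to SimpleDB (or a dynamic DNS service) so other instances can find this one by role instead of by fixed address.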
58. Best Practices:
Security in every component
• Use a de-perimeterized security model
• Create distinct network Security Groups for each
Amazon EC2 instance cluster
• Use group-based network rules for controlling
access between components
• Restrict external access to specific IP ranges
• Encrypt data “at-rest” in Amazon S3
• Encrypt data “in-transit” (SSL)
• Consider encrypted EBS file systems for
sensitive data
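Restricting access to specific IP ranges can be reasoned about with a small model of ingress rules. The sketch below is a simplified, local model of default-deny rule matching and not the EC2 Security Group API; the rule fields and function name are assumptions made for illustration.

```python
import ipaddress


def is_allowed(rules, source_ip, port, protocol="tcp"):
    """Return True if any ingress rule admits the connection (default deny).

    Each rule is a dict with hypothetical keys:
    "protocol", "from_port", "to_port", and "cidr".
    """
    address = ipaddress.ip_address(source_ip)
    for rule in rules:
        if (
            rule["protocol"] == protocol
            and rule["from_port"] <= port <= rule["to_port"]
            and address in ipaddress.ip_network(rule["cidr"])
        ):
            return True
    return False
```

For instance, a web cluster might admit port 80 from `0.0.0.0/0` but SSH only from an office range, while a database cluster admits its port only from the web cluster's addresses or group.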
59. Best Practices:
Leverage Storage Solutions
• Amazon S3: large static objects
• Amazon CloudFront: content distribution
• Amazon SimpleDB: simple data
indexing/querying
• Amazon EC2 local disk drive: transient
data
• Amazon EBS: RDBMS persistent storage
+ S3 Snapshots
60. Best Practices:
Leverage Best AWS Mgt Tools
• Management of any but the simplest cloud
application configurations is very cumbersome
without advanced tools.
• RightScale is a script-based instance
provisioning, monitoring, & auto-scaling system
– Supports collaborative sharing & reuse of scripts
• Kaavo Infrastructure & Middleware On Demand
(IMOD) is an “Application Centric Management
System”
– manages a multitier cloud application system as
though it were a monolithic application
61. Best Practices:
Don't fear cloud constraints
• Think “out of the box” and leverage cloud
features to solve EC2 constraints
• Components expect Static IP addresses?
– Boot script for software reconfiguration from
SimpleDB or use Dynamic DNS
• Local data center DBMS has better IOPS?
– Try multiple read-only / sharding / DB
clustering
63. AWS Management Tools:
Basic Tools
• Amazon native AWS tools only leverage
basic AWS API capability
– AWS Management Console
• Firefox plugins are slightly more advanced
– Elasticfox – EC2 Instance, EBS, EIP
management
– S3 Organizer – S3 file upload/download
(similar to ftp plugin)
• CloudBerry Explorer – Windows S3 file
upload/download application, slightly better
than S3 Organizer
64. AWS Management Tools:
Ideal Advanced Tools
• Attaching EBS volumes, EIPs, and other resources
should be scripted and managed by “Cloud Deployment
& Mgmt System” (CDMS)
• CDMS should incorporate standards-based
Performance Monitoring services
• Should incorporate standards-based Event Notification
services
• Should incorporate Auto-scaling configuration services
as remediation of Performance/Load Events
• CDMS should incorporate Administrator Collaboration
allowing sharing and partitioning of admin
responsibilities
65. AWS Management Tools:
Ideal Advanced Tools (cont)
• Allow for automated provisioning of EC2
instances
• Should allow sharing of scripts and
launch/terminate of instances based on group
roles or at least read/write/execute rights.
• Should allow for re-use of generalized scripts
• Should allow for auto-scaling based on dynamic
load evaluation functions
• CDMS should support escalating event
notification to groups of users.
– Should have interfaces to other EMS (e.g. Nagios)
66. AWS Management Tools:
RightScale
• Script-based instance provisioning, monitoring, &
auto-scaling system
• Manages complex deployments involving
multiple instance clusters
• Re-use of version-controlled scripts in different
deployments
• Full automation of auto-scaling, remediation,
notification and automatic configuration
• Cloud application developer and administrator
collaboration framework
68. RightScale Lifecycle Mgmt Pattern
• RightScale uses an Injection Pattern to push
individual command scripts into a running EC2
instance or an entire deployed cluster of
instances
• Boot Scripts are automatically run at Instance
Launch after OS “boot_finished” event
• Operational Scripts are run during automated
Event Handling or manual operations
• Decommissioning Scripts are automatically run
prior to Instance Termination
69. Current RightScale
Cloud Service Monitoring Pattern
Source: 2009 CommunityOne West Conference:
“Practical Cloud Computing Patterns”
• Based on collectd framework
70. AWS Management Tools:
Scalr
• Similar to RightScale features: instance
provisioning, monitoring, & auto-scaling
system
• Less reliant on “on-the-fly” provisioning.
Suite of Scalr AMIs available for common
application configurations.
• Manages complex deployments involving
multiple instance clusters
• Significantly less expensive
• OpenSource code available for local use.
71. AWS Management Tools:
Kaavo IMOD
• “Application Centric Management System”
• Proxy server manages complex multitier cloud
application system as if it were a monolithic application
via IMOD System Definitions
• Quickstart Kaavo provides out of the box System
Definitions for deploying popular multi-tier HA
infrastructure:
• Ruby on Rails, LAMP, Tomcat, JBoss
• IMOD workflow engine monitors application run-time state
events and responds dynamically with user customized Event
Workflows (e.g. MySQL scale-up/scale-down)
72. Q&A :
More Resources
• www.hyperstratus.com
– White Paper:
“Migrating Applications to the Cloud:
An Amazon Web Services Case
Study”
– Cloud Computing Workshops (via Unitek
Education)
– Jorge.Noa@hyperstratus.com