Whether you're a startup getting to profitability or an enterprise optimizing spend, it pays to run cost-efficient architectures on AWS. Building on last year's popular foundation of how to reduce waste and fine-tune your AWS spending, this session reviews a wide range of cost planning, monitoring, and optimization strategies, featuring real-world experience from AWS customer Adobe Systems. With the massive growth of subscribers to Adobe's Creative Cloud, Adobe's footprint in AWS continues to expand. We will discuss the techniques used to optimize and manage costs, while maximizing performance and improving resiliency. 
When traditional application and operating practices are used in cloud deployments, immediate benefits occur in speed of deployment, automation, and transparency of costs. The next step is a re-architecture of the application to be cloud-native, and significant operating cost reductions can help justify this development work. Cloud-native applications are dynamic and use ephemeral resources that customers are only charged for when the resources are in use.
With AWS, you can reduce capital costs, lower your overall bill, and match your expense to your usage. This session describes how to calculate the total cost of ownership (TCO) for deploying solutions on AWS vs. on-premises or at a colocation facility, as well as how to address common pitfalls in building a TCO analysis. The session presents and models customer examples. 
This session is a deep dive into techniques used by successful customers who optimized their use of AWS. Learn tricks and hear tips you can implement right away to reduce waste, choose the most efficient instance, and fine-tune your spending, often with improved performance and a better end-customer experience. We showcase innovative approaches and demonstrate easily applicable methods to save you time and money with Amazon EC2, Amazon S3, and a host of other services.
In this session, you learn how you can leverage AWS services together with third-party storage appliances and gateways to automate your backup and recovery processes so that they are not only less complex and lightweight, but also easy to manage and maintain. We demonstrate how to manage data flow from on-premises systems to the cloud and how to leverage storage gateways. You also learn best practices for quick implementation, reducing TCO, and automating lifecycle management.
In the event of a disaster, you need to be able to recover lost data quickly to ensure business continuity. For critical applications, keeping your time to recover and data loss to a minimum as well as optimizing your overall capital expense can be challenging. This session presents AWS features and services along with Disaster Recovery architectures that you can leverage when building highly available and disaster resilient applications. We will provide recommendations on how to improve your Disaster Recovery plan and discuss example scenarios showing how to recover from a disaster.
•Pay as you go, no up-front investments 
•Low ongoing cost 
•Flexible capacity 
•Speed, agility, and innovation 
•Focus on your business 
•Go global in minutes
Strategy 1: Do nothing
[Diagram] AWS virtuous cycle: Ecosystem, Global Footprint, New Features, New Services, More AWS Usage, More Infrastructure, Economies of Scale, Lower Infrastructure Costs, Reduced Prices, More Customers, Infrastructure Innovation.
45 price reductions since 2006
Strategy 2: Do almost nothing
aws.amazon.com/premiumsupport/trustedadvisor/ 
Free with Business or Enterprise Support
Strategy 3: Optimize Architecture
Cloud-Ready
•Run AWS like a virtual colocation (fork-lift)
•Does not optimize for on-demand (overprovisioned)
•EC2, EBS
•HAProxy on EC2
•MySQL on EC2
•Cassandra, Hadoop on EC2
•ActiveMQ/Redis/Kafka on EC2
•Chef on EC2
Cloud-Aware
•Minor modifications to improve cloud usage
•Automating servers can lower operational burden
•EC2, EBS, S3, CloudFront
•ELB, Route 53 (round-robin)
•Multi-AZ RDS + read replica
•ElastiCache Redis
•OpsWorks
Cloud-Native
•Redesign with AWS in mind (high effort)
•Embrace scalable services (reduce admin)
•Auto Scaling, self-healing
•Route 53 (LBR)
•RDS Aurora, Redshift
•DynamoDB, EMR
•SQS, SNS, Kinesis
•CloudFormation, Elastic Beanstalk
Comparison axes: Development Cost, Scalability/Availability, Management Cost
•Developer, test, training instances 
•Use simple instance start and stop 
•Or tear down and build up all together 
•Instances are disposable 
•Automate, automate, automate: 
–AWS CloudFormation 
–Weekend/off-hours scripts 
–Use tags
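A minimal off-hours shutdown sketch for the bullets above, assuming instances carry a hypothetical Environment=dev tag (adapt the tag key/value and the schedule to your own conventions):

  # Find running dev/test instances by tag...
  ids=$(aws ec2 describe-instances \
    --filters "Name=tag:Environment,Values=dev" "Name=instance-state-name,Values=running" \
    --query "Reservations[].Instances[].InstanceId" --output text)
  # ...and stop them for the weekend; a companion cron entry starts them again on Monday.
  [ -n "$ids" ] && aws ec2 stop-instances --instance-ids $ids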
[Chart] Running instance count over time, annotated: Monday, Friday, end of vacation season. 35% saved.
Automatic resizing of compute clusters based on demand: Amazon CloudWatch metrics trigger the autoscaling policy.
Features:
Control: define minimum and maximum instance pool sizes and when scaling and cool-down occur.
Integrated with Amazon CloudWatch: use metrics gathered by CloudWatch to drive scaling.
Instance types: run Auto Scaling for On-Demand and Spot Instances. Compatible with VPC.

  aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name MyGroup \
    --launch-configuration-name MyConfig \
    --min-size 4 \
    --max-size 200 \
    --availability-zones us-west-2c
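One way to wire the "CloudWatch metrics drive scaling" idea above, sketched with the AWS CLI (group name, policy name, and thresholds are illustrative, not from the talk):

  # Add 2 instances when average CPU across the group stays above 70% for 10 minutes.
  policy_arn=$(aws autoscaling put-scaling-policy \
    --auto-scaling-group-name MyGroup \
    --policy-name cpu-scale-out \
    --scaling-adjustment 2 \
    --adjustment-type ChangeInCapacity \
    --cooldown 300 \
    --query PolicyARN --output text)
  # Point a CloudWatch alarm at the policy so the metric actually triggers scaling.
  aws cloudwatch put-metric-alarm \
    --alarm-name MyGroup-cpu-high \
    --namespace AWS/EC2 --metric-name CPUUtilization \
    --dimensions Name=AutoScalingGroupName,Value=MyGroup \
    --statistic Average --period 300 --evaluation-periods 2 \
    --threshold 70 --comparison-operator GreaterThanThreshold \
    --alarm-actions "$policy_arn"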
Cloud capacity used is maybe half average DC capacity.
Mad scramble to add more DC capacity during launch-phase outages.
Capacity wasted on a failed launch magnifies the losses.
Start: choose an instance that best meets your basic requirements. Start with memory, then choose the closest number of virtual cores. Look at peak IOPS storage requirements.
Tune: change instance size up or down based on monitoring. Use CloudWatch and Trusted Advisor to assess.
Roll out: run multiple instances in multiple Availability Zones.
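For the "Tune" step, a quick CloudWatch query like the following (instance ID and dates are placeholders) shows whether an instance is oversized:

  # Two weeks of hourly CPU utilization for one instance.
  aws cloudwatch get-metric-statistics \
    --namespace AWS/EC2 --metric-name CPUUtilization \
    --dimensions Name=InstanceId,Value=i-12345678 \
    --start-time 2014-11-01T00:00:00Z --end-time 2014-11-14T00:00:00Z \
    --period 3600 --statistics Average Maximum
  # Consistently low averages and maxima suggest a smaller instance type, or fewer instances.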
[Chart] EC2 On-Demand price per hour plotted against vCPU and RAM for General Purpose, Compute-Optimized, Memory-Optimized, and Storage-Optimized instances, ranging from 1 vCPU / 1.7 GB at $0.060 up to 32 vCPU / 244 GB at $6.820. On-Demand prices shown (N. Virginia region); only latest-generation instances (M3, C3) shown where applicable; GPU and micro instances not shown.
More small instances vs. fewer large instances 
29 m3.xlarge 
= 29 x $0.280/hour 
= $8.12/hour 
69 m3.medium 
= 69 x $0.070/hour 
= $4.83/hour 
40% Savings
[Chart] Weekly CPU load vs. number of web servers by week over a year: 50% savings.
Scale up/down by 70%+ 
Move to Load-Based Scaling 
50% Savings
Auto Scaling in the Amazon Cloud 
http://techblog.netflix.com/2012/01/auto-scaling-in-amazon-cloud.html 
Reactive Auto Scaling saves around 50% 
[Chart] Requests vs. servers: 50% savings.
Predictive Auto Scaling saves around 70%: a load prediction drives the autoscaling plan.
Scryer: Netflix's Predictive Auto Scaling Engine 
http://goo.gl/iFefxJ 
70% Savings
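Scryer is Netflix's own predictive engine; a much simpler approximation for workloads with a known daily pattern is schedule-driven capacity via Auto Scaling scheduled actions (group name, sizes, and cron expressions below are examples):

  # Scale out ahead of the weekday morning peak...
  aws autoscaling put-scheduled-update-group-action \
    --auto-scaling-group-name MyGroup \
    --scheduled-action-name scale-out-weekday-morning \
    --recurrence "0 8 * * 1-5" \
    --min-size 8 --max-size 200 --desired-capacity 20
  # ...and back in overnight.
  aws autoscaling put-scheduled-update-group-action \
    --auto-scaling-group-name MyGroup \
    --scheduled-action-name scale-in-overnight \
    --recurrence "0 22 * * *" \
    --min-size 4 --max-size 200 --desired-capacity 6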
[Chart] Break-even points for 1-year and 3-year Reserved Instances vs. On-Demand.
•No Upfront 
You pay nothing upfront but commit to pay for the Reserved Instance over the course of the term, with discounts (typically about 30%) compared to On-Demand. This option is offered with a one-year term. 
•Partial Upfront 
You pay for a portion of the Reserved Instance upfront, then pay for the remainder over the course of the one- or three-year term. This option balances the RI payments between upfront and hourly. 
•All Upfront 
You pay for the entire Reserved Instance term (one or three years) with one upfront payment and get the best effective hourly price compared to On-Demand.
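The break-even math is simple enough to sanity-check on the command line; a sketch with purely hypothetical prices (substitute your actual On-Demand rate and RI quote):

  awk 'BEGIN {
    od = 0.280 * 24 * 365   # one year of On-Demand at a hypothetical $0.280/hour
    ri = 1200               # hypothetical All Upfront payment for a 1-year RI
    printf "1-yr On-Demand: $%.0f  RI: $%.0f  savings: %.0f%%\n", od, ri, 100 * (1 - ri / od)
    printf "Break-even after about %.1f months of steady use\n", 12 * ri / od
  }'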
[Price charts] Example Reserved Instance savings vs. On-Demand: 62% / 77%, 47% / 65%, 39% / 63%.
•Can be moved between AZs 
•Can be moved between EC2-Classic and EC2-VPC platforms 
•Size can be modified within the same instance family
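These changes are made with a single API call; a hedged sketch (reservation ID and target values are placeholders):

  # Move a reservation to another AZ and the EC2-VPC platform, or resize within the family.
  aws ec2 modify-reserved-instances \
    --reserved-instances-ids 11111111-2222-3333-4444-555555555555 \
    --target-configurations AvailabilityZone=us-east-1b,Platform=EC2-VPC,InstanceCount=10,InstanceType=m3.large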
•Price based on supply and demand 
•You choose your maximum price per hour 
•Your instance is started when the Spot price is below your maximum 
•Your instance is terminated when the Spot price rises above your maximum 
•But: you did plan for fault tolerance, didn't you?
[Chart] Spot price history example: On-Demand $0.24/hour vs. Spot prices around $0.028 (11.7%) and $0.026 (10.8%): roughly 90% savings.
•Very dynamic pricing 
•Opportunity to save 80-90% cost 
–But there are risks 
•Different prices per AZ 
•Leverage Auto Scaling! 
–One group with Spot Instances 
–One group with On-Demand 
–Get the best of both worlds 
•Coming soon: 2-minute Spot interruption warnings
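One concrete shape for the "one Spot group plus one On-Demand group" pattern above, sketched as two launch configurations (AMI, instance type, and bid price are examples):

  # Baseline capacity on On-Demand...
  aws autoscaling create-launch-configuration \
    --launch-configuration-name base-ondemand \
    --image-id ami-12345678 --instance-type m3.medium
  # ...and burst capacity bid on the Spot market.
  aws autoscaling create-launch-configuration \
    --launch-configuration-name burst-spot \
    --image-id ami-12345678 --instance-type m3.medium \
    --spot-price "0.03"
  # Attach each launch configuration to its own Auto Scaling group; the On-Demand group
  # keeps serving if Spot capacity is reclaimed.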
•Reduced redundancy storage class 
–99.99% durability vs. 99.999999999% 
–Up to 20% savings 
–Everything that is easy to reproduce 
–Use Amazon SNS lost object notifications 
•Amazon Glacier storage class 
–Same 99.999999999% durability 
–3 to 5 hours restore time 
–Up to 64% savings 
–Archiving, long-term backups, and old data 
•Use life-cycle rules 
64% Savings 
20% Savings
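Lifecycle rules are plain JSON on the bucket; a minimal sketch (bucket name, prefix, and day counts are placeholders) that transitions objects to Glacier and later expires them:

  cat > lifecycle.json <<'EOF'
  {
    "Rules": [
      {
        "ID": "archive-then-expire-logs",
        "Prefix": "logs/",
        "Status": "Enabled",
        "Transitions": [{ "Days": 30, "StorageClass": "GLACIER" }],
        "Expiration": { "Days": 365 }
      }
    ]
  }
  EOF
  aws s3api put-bucket-lifecycle-configuration \
    --bucket my-bucket --lifecycle-configuration file://lifecycle.json
  # Reduced redundancy is chosen per object at upload time, for example:
  # aws s3 cp rebuildable.dat s3://my-bucket/derived/ --storage-class REDUCED_REDUNDANCY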
•Read/write capacity units (CUs) determine most of DynamoDB cost 
•By optimizing CUs, you can save a lot of money 
•But: 
–Need to provision enough capacity to not run into capacity errors 
–Need to prepare for peaks 
–Need to constantly monitor/adjust
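The monitor/adjust loop boils down to UpdateTable calls; a hedged example (table name and capacity values are placeholders):

  aws dynamodb update-table \
    --table-name Bids \
    --provisioned-throughput ReadCapacityUnits=400,WriteCapacityUnits=100
  # Increases take effect quickly; note that DynamoDB limits how many times per day a
  # table's provisioned capacity can be decreased, so scale-downs need to be planned.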
•Use caching to save read capacity units 
–Local RAM caches at app server instances 
–Check out Amazon ElastiCache 
•Think of strategies for optimizing CU use 
–Use multiple tables to support varied access patterns 
–Understand access patterns for time series data 
–Compress large attribute values 
•Use Amazon SQS to buffer over-capacity writes
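A rough shape of the SQS write buffer from the last bullet (queue URL, table name, and the item are placeholders):

  # Producers enqueue items instead of writing to DynamoDB directly...
  aws sqs send-message --queue-url "$QUEUE_URL" \
    --message-body '{"id": {"S": "item-1"}, "value": {"N": "42"}}'
  # ...and a worker drains the queue at a rate that fits the table's provisioned write capacity.
  msg=$(aws sqs receive-message --queue-url "$QUEUE_URL" --max-number-of-messages 1)
  # (extract the body and receipt handle from $msg, then:)
  aws dynamodb put-item --table-name Bids --item '{"id": {"S": "item-1"}, "value": {"N": "42"}}'
  aws sqs delete-message --queue-url "$QUEUE_URL" --receipt-handle "$RECEIPT_HANDLE"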
[Chart] DynamoDB capacity over time, annotated: Caching/Optimization (80% saved), cache flush, Dynamic DynamoDB (20% saved), growth + new features.
•The more you can offload, the less infrastructure you need to maintain, scale, and pay for 
•Three easy ways to offload: 
–Use Amazon CloudFront 
–Introduce caching 
–Leverage existing Amazon web services
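For the CloudFront route, the simplest offload is putting a distribution in front of an existing S3 origin; a hedged one-liner (bucket name is a placeholder, and a full --distribution-config gives finer control):

  aws cloudfront create-distribution \
    --origin-domain-name my-bucket.s3.amazonaws.com \
    --default-root-object index.html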
•Amazon RDS, Amazon DynamoDB or Amazon ElastiCache for Redis, Amazon Redshift 
–Instead of running your own database 
•Amazon CloudSearch 
–Instead of running your own search engine 
•Amazon Elastic Transcoder 
•Amazon Elastic MapReduce 
•Amazon Cognito, Amazon SQS, Amazon SNS, Amazon Simple Workflow Service, Amazon SES, Amazon Kinesis, and more …
November 14, 2014 | Las Vegas 
Adrian Cockcroft @adrianco, Battery Ventures
[Diagram] Data Center Up-Front Costs (@adrianco): Lease Building, Install AC etc., Rack and Stack, Private Cloud SW (ages ago); Run My Stuff (now); Bill (next month).
[Chart] Three years of monthly costs, halving every 18 months = maybe 40% overall savings. Data shown is purely illustrative.
Older m1/m2 families 
•Slower CPUs 
•Higher response times 
•Smaller caches (6 MB) 
•Oldest m1.xlarge: 15G / 8.5 ECU / 35c → 23 ECU/$ 
•Old m2.xlarge: 17G / 6.5 ECU / 25c → 26 ECU/$ 
New m3 family 
•Faster CPUs 
•Lower response times 
•Larger caches (20 MB) 
•Java perf ratio > ECU 
•New m3.xlarge: 15G / 13 ECU / 28c → 46 ECU/$ 
•77% better ECU/$ 
•Deploy fewer instances
Combinations
[Chart] Traditional application using AWS heavy-use reservations (base price is for capacity bought up-front): Base Price 100, Rightsized 70, Seasonal 70, Daily Scaling 70, Reserved 30, Tech Refresh 30, Price Cuts 25.
[Chart] Cloud-native application, partially optimized, light-use reservations: Base Price 100, Rightsized 70, Seasonal 50, Daily Scaling 35, Reserved 25, Tech Refresh 20, Price Cuts 15.
[Chart] Cloud-native application, fully optimized with autoscaling and mixed reservation use: Base Price 100, Rightsized 50, Seasonal 25, Daily Scaling 12, Reserved 8, Tech Refresh 6, Price Cuts 4. Costs 4% of base price over three years!
•Business logic isolation in stateless micro-services 
•Immutable code with instant rollback 
•Autoscaled capacity and deployment updates 
•Distributed across availability zones and regions 
•De-normalized single function NoSQL data stores 
•See over 40 NetflixOSS projects at netflix.github.com 
•Get “technical indigestion” trying to keep up with techblog.netflix.com
AdRoll, an online advertising platform, serves 50 billion impressions a day worldwide with its global retargeting platforms. 
"We spend more on snacks than we do on Amazon DynamoDB." 
Valentino Volonghi, CTO, AdRoll 
•Needed a high-performance, flexible platform to swiftly sync data for a worldwide audience 
•Processes 50 TB of data a day 
•Serves 50 billion impressions a day 
•Stores 1.5 PB of data 
•Worldwide deployment minimizes latency 
AdRoll Uses AWS to Grow by More Than 15,000% in a Year
•Handle 150 TB/day 
•Low response time (<5 ms) 
•1,000,000+ global requests/second 
•100B items
•Memcache 
  + Open source 
  + Mature 
  + Blazingly fast 
  - No strong guarantees 
•Redis 
  + Open source 
  - Storage scale 
  - Not really distributed 
  - Operationally intense 
•HBase (we still use this) 
  + Open source 
  + Maturing quickly 
  + Great scale 
  - Really hard to operate
•Revisiting 1 million writes per second (Netflix): http://techblog.netflix.com/2014/07/revisiting-1-million-writes-per-second.html 
•Mix is 10% writes / 90% reads; 1M ops/sec is total capacity. 
Cassandra vs. DynamoDB: 
•10/90 mix, $/month: Cassandra $287,064; DynamoDB $131,040; delta 219% 
•50/50 mix, $/month: Cassandra $287,064; DynamoDB $280,800; delta ~0% 
•10/90, 3-yr reserved: Cassandra $27,075.60 ($904k upfront); DynamoDB $15,736 ($504k upfront); delta 180% 
•10-person Cassandra ops team: $150k/month (fully loaded) 
•0-person DynamoDB ops team: $0
Data Collection = batch layer; Bidding = speed layer.
[Diagram] Pipeline stages: Data Collection, Data Storage, Global Distribution, Bid Storage, Bidding.
[Diagram] Data collection: US East region, two Availability Zones, Elastic Load Balancing in front of an Auto Scaling group of instances, feeding Amazon S3 and Amazon Kinesis.
[Diagram] Adds global distribution: Apache Storm and DynamoDB in US East alongside Amazon S3 and Amazon Kinesis, with DynamoDB tables in the US West and EU West regions as well.
[Diagram] Data collection and bidding in the US East region: the data collection tier (Elastic Load Balancing, Auto Scaling group of instances, Amazon S3, Amazon Kinesis, Apache Storm, DynamoDB) alongside a separate bidding tier with its own Elastic Load Balancing and Auto Scaling group across two Availability Zones.
[Diagram] Bidding detail: per-ad-network Elastic Load Balancing and Auto Scaling groups of bidders read bid data from DynamoDB, while the data collection pipeline (Amazon S3, Amazon Kinesis, Apache Storm) writes into DynamoDB.
•Data Collection: Amazon EC2, Elastic Load Balancing, Auto Scaling 
•Store: Amazon S3 + Amazon Kinesis 
•Global Distribution: Apache Storm on Amazon EC2 
•Bid Store: DynamoDB 
•Bidding: Amazon EC2, Elastic Load Balancing, Auto Scaling
Cloud-Ready
•Run AWS like a virtual colocation (fork-lift)
•Does not optimize for on-demand (overprovisioned)
•EC2, EBS
•HAProxy on EC2
•MySQL on EC2
•Cassandra, Hadoop on EC2
•ActiveMQ/Redis/Kafka on EC2
•Chef on EC2
Cloud-Aware
•Minor modifications to improve cloud usage
•Automating servers can lower operational burden
•EC2, EBS, S3, CloudFront
•ELB, Route 53 (round-robin)
•Multi-AZ RDS + read replica
•ElastiCache Redis
•OpsWorks
Cloud-Native
•Redesign with AWS in mind (high effort)
•Embrace scalable services (reduce admin)
•Auto Scaling, self-healing
•Route 53 (LBR)
•RDS Aurora, Redshift
•DynamoDB, EMR
•SQS, SNS, Kinesis
•CloudFormation, Elastic Beanstalk
Comparison axes: Development Cost, Scalability/Availability, Management Cost
AWS re:Invent re:Cap - Cost Optimization - Best Practices and Architecture Design Deep Dive - 이원일