AWS's breadth of services and pricing options offers the flexibility to manage your costs effectively while keeping the performance and capacity your business requires. With AWS, you can easily right-size your services, leverage Reserved Instances, and use tools to track and monitor your resources so you always stay on top of how much you're spending. This session covers best practices for cost optimization in large-scale deployments on AWS.
Speaker: Vikrant Yagnick
Head - India Enterprise Support
8. Example Mobile App
• Authenticate users: Manage users and identity providers
• Authorize access: Securely access cloud resources
• Synchronize data: Sync user prefs across devices
• Analyze user behavior: Track active users, engagement, retention
• Run business logic: Run stateless custom code without managing servers
• Store and share media: Store user-generated photos and media, and share them
• Deliver media: Automatically detect mobile devices; deliver content quickly globally
• Send push notifications: Bring users back to your app by sending messages reliably
• Store shared data: Store and query fast NoSQL data across users and devices
• Stream real-time data: Collect real-time clickstream logs and take actions quickly
9. Your Mobile App (AWS Mobile SDK)
• Authenticate users: Amazon Cognito (Identity Broker) (Pay per active user)
• Authorize access: AWS Identity and Access Management (Free)
• Analyze user behavior: Amazon Mobile Analytics (Pay per event)
• Store and share media: Amazon S3 Transfer Manager (Pay for data stored and requests. No compute cost)
• Synchronize data: Amazon Cognito (Sync) (Pay per sync and data)
• Deliver media: Amazon CloudFront (Device Detection) (Pay for data-out and requests)
• Store shared data: Amazon DynamoDB (Object Mapper) (Pay for provisioned IO)
• Stream real-time data: Amazon Kinesis (Recorder) (Pay per shard-hour and request)
• Run business logic: AWS Lambda (Pay per request)
• Send push notifications: Amazon SNS Mobile Push (Pay per notification)
10. Turning off instances
Turn off nonproduction instances
• Look for dev/test, nonproduction instances that
are running always-on and turn them off.
• Lambda + CloudWatch = Automated Scheduling*
* https://aws.amazon.com/premiumsupport/knowledge-center/start-stop-lambda-cloudwatch/
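A rough way to size the benefit of scheduling nonproduction instances is to compare always-on hours with scheduled hours. The sketch below is illustrative only: the hourly rate, instance count, and 12-hours-by-5-days schedule are hypothetical values, not figures from the session.

```python
HOURS_PER_WEEK = 7 * 24  # an always-on instance runs 168 hours per week


def schedule_savings(hourly_rate, instance_count, hours_per_day, days_per_week):
    """Return (always_on_cost, scheduled_cost, savings_fraction) for one week."""
    always_on = hourly_rate * instance_count * HOURS_PER_WEEK
    scheduled = hourly_rate * instance_count * hours_per_day * days_per_week
    savings_fraction = 1 - scheduled / always_on
    return always_on, scheduled, savings_fraction


# Hypothetical dev/test fleet: 10 instances at $0.10/hr,
# running 12 hours a day on 5 weekdays instead of 24x7.
always_on, scheduled, saved = schedule_savings(0.10, 10, 12, 5)
print(f"always-on: ${always_on:.2f}/week, scheduled: ${scheduled:.2f}/week")
print(f"savings: {saved:.0%}")
```

The savings fraction depends only on the schedule (60 of 168 hours here), so the same ratio applies whatever the instance price.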
11. Auto-scaling
Autoscale production
• Use Auto Scaling to scale up and down based on
demand and usage (for example, spikes).
• Think about both horizontal and vertical scaling
and ways to automate them.
• Auto Scaling is not just for instances:
third-party scripts can scale DynamoDB
provisioned IOPS (PIOPS).
• Think about granularity to reduce waste.
12. Utilization and Auto-Scaling: Granularity
More small instances vs. fewer large instances
29 m1.large @ $0.240/hr.
= $6.96
58 vCPU, ~217 GiB RAM
59 m1.small @ $0.06/hr.
= $3.54
59 vCPU, ~100 GiB RAM
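The arithmetic on this slide can be reproduced with a short sketch. The instance counts and hourly prices come from the slide; the per-instance vCPU counts (2 for m1.large, 1 for m1.small) are inferred from the slide's totals, and the function names are illustrative.

```python
def fleet_hourly_cost(count, hourly_rate):
    """Hourly cost of a fleet of identical instances, rounded to cents."""
    return round(count * hourly_rate, 2)


def price_per_vcpu_hour(hourly_rate, vcpus_per_instance):
    """Hourly price of one vCPU for a given instance type."""
    return hourly_rate / vcpus_per_instance


large_fleet = fleet_hourly_cost(29, 0.240)  # 29 m1.large, 58 vCPU in total
small_fleet = fleet_hourly_cost(59, 0.06)   # 59 m1.small, 59 vCPU in total

print(f"m1.large fleet: ${large_fleet:.2f}/hr")
print(f"m1.small fleet: ${small_fleet:.2f}/hr")
print(f"per vCPU-hour: ${price_per_vcpu_hour(0.240, 2):.2f} vs "
      f"${price_per_vcpu_hour(0.06, 1):.2f}")
```

The comparison is on vCPU count only; the large fleet also carries roughly twice the RAM. The smaller instance also gives a finer scaling step ($0.06/hr rather than $0.24/hr per added instance), which is the granularity point of the slide.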
13. Pillar 2: Right-Sizing
Right-sizing
• Select the cheapest instance available
that meets performance requirements.
• Select the latest instance types. They have the latest
hardware and are typically cheaper.
• Look at CPU, RAM, storage, and network
utilization to identify instances that
can be downsized.
• Leverage Amazon CloudWatch metrics and
set up custom RAM metrics.
Rule of thumb: Right size, then reserve.
(But if you’re in a pinch, reserve first.)
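As a minimal sketch of the "look at utilization to identify instances that can be downsized" step, the function below flags instances whose peak CPU stays under a threshold. The 40% threshold, the instance IDs, and the sample values are assumptions for illustration; in practice the samples would come from CloudWatch metrics, and RAM, storage, and network should be checked the same way.

```python
from statistics import mean


def downsize_candidates(cpu_by_instance, peak_threshold=40.0):
    """Return (instance_id, avg_cpu, peak_cpu) for instances whose peak CPU
    utilization (percent) never reached the threshold, lowest peak first."""
    candidates = []
    for instance_id, samples in cpu_by_instance.items():
        if samples and max(samples) < peak_threshold:
            candidates.append((instance_id, round(mean(samples), 1), max(samples)))
    return sorted(candidates, key=lambda c: c[2])


# Hypothetical CPU utilization samples (percent) per instance.
samples = {
    "i-app-1": [12.0, 18.5, 25.0],
    "i-db-1": [55.0, 80.0, 62.5],
    "i-batch-1": [5.0, 9.0, 7.0],
}

for instance_id, avg_cpu, peak_cpu in downsize_candidates(samples):
    print(f"{instance_id}: avg {avg_cpu}%, peak {peak_cpu}%")
```

Using the peak rather than the average is a deliberately conservative choice: an instance with a low average but occasional spikes is not flagged.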
18. EBS Throughput Optimized HDD (st1)
Baseline: 40 MB/s per TB up to 500 MB/s
Capacity: 500 GB to 16 TB
Burst: 250 MB/s per TB up to 500 MB/s
Ideal for large block, high throughput sequential workloads
19. Cold HDD (sc1)
Baseline: 12 MB/s per TB up to 192 MB/s
Capacity: 500 GB to 16 TB
Burst: 80 MB/s per TB up to 250 MB/s
Ideal for sequential throughput workloads such as logging and backup
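The per-TB rates and caps on the st1 and sc1 slides translate into a simple throughput calculation. The sketch below uses only the figures from the two slides; treating 1 TB as 1000 GB is an assumption chosen so that the results match the 20 MB/s and 6 MB/s figures quoted for a 500 GB volume in the notes.

```python
GB_PER_TB = 1000  # assumption: decimal units, matching the 500 GB examples

# Rates in MB/s per TB and caps in MB/s, taken from the st1 and sc1 slides.
VOLUME_TYPES = {
    "st1": {"baseline_per_tb": 40, "baseline_cap": 500,
            "burst_per_tb": 250, "burst_cap": 500},
    "sc1": {"baseline_per_tb": 12, "baseline_cap": 192,
            "burst_per_tb": 80, "burst_cap": 250},
}


def throughput(volume_type, size_gb):
    """Return (baseline, burst) throughput in MB/s for a volume of size_gb."""
    if not 500 <= size_gb <= 16 * GB_PER_TB:
        raise ValueError("capacity must be between 500 GB and 16 TB")
    spec = VOLUME_TYPES[volume_type]
    size_tb = size_gb / GB_PER_TB
    baseline = min(size_tb * spec["baseline_per_tb"], spec["baseline_cap"])
    burst = min(size_tb * spec["burst_per_tb"], spec["burst_cap"])
    return baseline, burst


for volume_type in ("st1", "sc1"):
    for size_gb in (500, 16000):
        baseline, burst = throughput(volume_type, size_gb)
        print(f"{volume_type} {size_gb} GB: baseline {baseline} MB/s, burst {burst} MB/s")
```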
27. About BSE
BSE Limited
• Established in 1875, BSE (formerly known as Bombay Stock Exchange Ltd.) is Asia's first stock exchange and the fastest stock exchange in the world, with a speed of 6 microseconds, and one of India's leading exchange groups.
• BSE provides an efficient and transparent market for trading in equity, debt instruments, derivatives, and mutual funds. It also has a platform for trading in equities of small and medium enterprises (SMEs).
• BSE is a corporatized and demutualised entity, with a broad shareholder base that includes two leading global exchanges, Deutsche Börse and Singapore Exchange, as strategic partners.
28. About BSE (contd.)
BSE Limited
• More than 5,500 companies listed on BSE
• World's No. 1 exchange in terms of number of listed companies
• Market capitalization of USD 1.65 trillion as of November 7, 2016
• BSE provides a host of services such as risk management, clearing, settlement, market data services, and education. It has a global reach, with customers around the world, and a nationwide presence.
• BSE's popular equity index, the S&P BSE SENSEX, is India's most widely tracked stock market benchmark index. It is traded internationally on the EUREX as well as on leading exchanges of the BRCS nations (Brazil, Russia, China and South Africa).
29. Scaling Algo Test Lab
• Trading is becoming algo/programmatic-led.
• Members write algos using MATLAB, Python, etc.
• They want to test these algos with historical data (back testing).
• They want to test these algos with real-time market data (paper trading).
• Members needed an environment for both kinds of testing.
30. Scaling Algo Test Lab (contd.)
• The cost of providing the environment was prohibitive.
• BSE worked with AWS to set up the environment.
• Started with 5 environments.
• Environments start and stop on demand.
• Environments are automatically shut down at 8:00 PM.
• If an environment detects no active session for the last 2 hours, it shuts down.
• Cost lowered by 60%.
• Can now accommodate 60 environments at the same cost.
Stages: Coding, Testing, Pre-Production, Production
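The two shutdown rules described for the Algo Test Lab (a fixed 8:00 PM cutoff and a 2-hour idle limit) can be expressed as a small decision function. This is a sketch of the policy as stated on the slide, not BSE's actual implementation; the function name and the choice to treat any time at or after 8:00 PM as past the cutoff are assumptions.

```python
from datetime import datetime, time, timedelta

SHUTDOWN_TIME = time(20, 0)      # environments are shut down at 8:00 PM
IDLE_LIMIT = timedelta(hours=2)  # no active session for the last 2 hours


def should_shut_down(now, last_active_session):
    """Return True if an environment should be stopped at time `now`."""
    if now.time() >= SHUTDOWN_TIME:
        return True  # past the daily cutoff
    return now - last_active_session >= IDLE_LIMIT  # idle for too long


# Example checks with made-up timestamps.
afternoon = datetime(2016, 11, 7, 15, 0)
print(should_shut_down(afternoon, datetime(2016, 11, 7, 14, 0)))   # active an hour ago
print(should_shut_down(afternoon, datetime(2016, 11, 7, 12, 30)))  # idle for 2.5 hours
```

A function like this could be evaluated on a schedule for each environment, with the stop action issued only when it returns True.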
31. Scaling BSE Stock Ticker
• The current stock ticker was built using proprietary technology.
• Scale is impossible to estimate: about 6,000 concurrent users on normal days, rising to 60,000-100,000 users during a big macro event.
• Infrastructure was proving too costly to provision and keep for peaks.
32. Scaling BSE Stock Ticker (contd.)
Architecture considerations
• Use open source technologies
• Considered polling methods, which proved to be too costly
• Wanted the least amount of management
• Should not have to worry about scaling; it needs to be automatic, irrespective of concurrent user count
• Needs to be more cost-effective and efficient than the on-prem solution
33. Scaling BSE Stock Ticker (contd.)
• Architecture
[Architecture diagram. Components shown: Customer Data Center; Streaming Data Emitter Server; Amazon Kinesis Streams; Kinesis Consumer (EC2 instance); ElastiCache (Redis) with a Redis Pub/Sub channel; Node.js servers on EC2 instances; Auto Scaling; Application Load Balancer.]
35. Pillar 3: Leveraging the
Right Pricing Model
Reserved Instances
Spot Instances
On-Demand
36. Reserved Instances for Always-On Instances
Commitment level
1 year
3 year
AWS services offering
Reserved Instances
Amazon EC2
Amazon RDS
Amazon DynamoDB
Amazon Redshift
Amazon ElastiCache
* Dependent on specific AWS service, size/type, and region
37. Introducing Convertible Reserved Instances
With a Convertible Reserved Instance, you can modify
your existing reservation across:
Instance families
Instance sizes
Operating systems
Tenancy
39. Reserved Instances
Step 1: Reserved Instance Coverage
• Cover always-on resources with standard or
convertible Reserved Instances
Step 2: Increase Reserved Instance
Utilization
• Known architectures: Leverage Standard
Reserved Instance flexibility to increase
utilization.
• Growing or changing architectures: Leverage
Convertible Reserved Instances across
families, sizes, and OS.
• Regional Benefit: Consolidated billing,
reservation not critical
40. Consider Spot for Elastic Workloads
Options
• Spot Fleet to maintain instance availability
• Spot Block durations (1-6 hours) for workloads that must run continuously
Commitment level
• None
* Compared to On Demand price based on specific EC2 instance type, region, and Availability Zone
43. Object Storage Classes on Amazon S3
• Active data: Standard (Hot)
• Infrequently accessed data: Standard - Infrequent Access (Warm)
• Archive data: Amazon Glacier (Cold)
44. Running the Numbers: S3 or S3-IA
Comparing 1 PB of object storage* (monthly cost)
Content Accessed per Month    S3         S3-IA      Savings %
10%                           $24,117    $14,116    41%
50%                           $24,117    $18,350    24%
100%                          $24,117    $23,593    2%
Rule of thumb: Breakeven = 105% Retrieved per Month
* Based on US-East Prices
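The savings percentages and the breakeven rule of thumb can be checked from the monthly figures on this slide. The sketch below assumes the S3-IA cost grows linearly with the share of content retrieved and extrapolates from the 50% and 100% rows; that linear model is an assumption for illustration, not AWS's pricing formula.

```python
S3_STANDARD = 24117                          # monthly cost for 1 PB, from the slide
S3_IA = {10: 14116, 50: 18350, 100: 23593}   # retrieved % -> monthly cost


def savings_pct(standard, infrequent_access):
    """Savings of S3-IA relative to S3 Standard, in percent."""
    return (standard - infrequent_access) / standard * 100


for retrieved, cost in S3_IA.items():
    print(f"{retrieved}% retrieved: {savings_pct(S3_STANDARD, cost):.0f}% savings")

# Cost added per extra percentage point retrieved (linear extrapolation).
slope = (S3_IA[100] - S3_IA[50]) / (100 - 50)
breakeven = 100 + (S3_STANDARD - S3_IA[100]) / slope
print(f"breakeven at about {breakeven:.0f}% retrieved per month")
```

The computed savings round to the 41%, 24%, and 2% shown on the slide, and the extrapolated breakeven lands at roughly 105% retrieved per month.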
50. Automation
1. Identify always on instances.
2. Identify instances to downsize.
3. Identify warm / cold storage.
4. Recommend Reserved
Instances to purchase.
5. Dashboard our status.
6. Report on savings.
51. AWS Trusted Advisor
Helping customers automate best practices (checks) across
cost optimization, security, fault tolerance, and performance
improvement
Red (action recommended)
Orange (investigation recommended)
Green (no problem detected)
52. “We estimate an average
33% monthly savings on our
total AWS spend.”
Amit Vora, CTO for Hungama
61. Bridging the Gap Between Agents & Principals
Principals Agents
CoE
62. Questions Your CoE Should Be Asking
1. How much of our workloads are “steady state”?
2. What’s keeping us from reserving capacity?
3. How are we currently handling our elasticity needs?
4. Have we had a Well Architected/Cloud Operations review
with AWS?
5. How can I be more involved in our process?
63. Cloud Center of Excellence
• Aligning Incentives (Carrots & Sticks)
• Automation
• Reporting
• Control & Governance
• Metrics / KPIs
64. Value Based Optimization Metrics
A company’s overall AWS cost should be evaluated as a unit cost ratio with
respect to another defined metric:
Unit Cost = Total Cost / (Individual or Business Metric)
Examples
• Unit cost per customer or active subscriber
• Unit cost per revenue generated
• Unit cost per product or business unit
• Unit cost per internal user
• Unit cost per experiment
• Unit cost per FTE
Align to
Value Drivers
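The unit cost ratio is straightforward to compute once a business metric is chosen. In the sketch below, the monthly spend and subscriber counts are hypothetical numbers used only to show how the metric can fall even while total spend rises.

```python
def unit_cost(total_cost, metric_value):
    """Unit cost = total AWS cost divided by the chosen business metric."""
    if metric_value <= 0:
        raise ValueError("business metric must be positive")
    return total_cost / metric_value


# Hypothetical months: (total AWS cost in USD, active subscribers).
months = {
    "month 1": (50_000, 200_000),
    "month 2": (60_000, 300_000),
}

for name, (cost, subscribers) in months.items():
    print(f"{name}: ${unit_cost(cost, subscribers):.2f} per active subscriber")
```

In this made-up example total spend grows by 20%, yet cost per active subscriber drops from $0.25 to $0.20, which is why the ratio is a more useful optimization target than the raw bill.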
65. Where to start
Set up a Cloud
Competency Center
Bring in the right
tools
Use metrics to
reinforce behavior
Use AWS Support,
especially
Enterprise Support
Editor's Notes
Cost optimization is a function of the new business model that the cloud has brought about.
Because services are genuinely pay-for-what-you-use, there's a huge opportunity for customers to be lean in what they use and reduce their spend dramatically.
Cost optimization should be done early on.
To give you some context, let's take an example: the table stakes of any app.
Authenticate Users: Let's start with the users of your app. The most important aspect of building a mobile app is delivering an engaging experience. For that, you want to know who the user is. In most cases you would use third-party identity providers like Amazon, Facebook, or Google. However, a login screen often proves to be a point of friction, so you want users to be able to skip authentication and interact with the app directly. At the same time, when users do decide to log in, they expect their preferences, settings, and progress to carry over.
Synchronize Data: Users expect their preferences or profiles to be saved from one session to the next. For example, if you have a game, they expect to resume the game where they left off. To complicate matters, your app or game may be available across platforms: iOS, Android, Fire OS. In that case, users expect their data, preferences, profile, etc. to be automatically synced and available across devices and platforms. For example, with Amazon Instant Video, users can pause a video they are watching on their Kindle Fire and resume it on an iPad.
Store and share assets and media: App stores generally limit the size of an app that can be downloaded over WAN. You would want to store the app's assets in cloud storage so you can reduce the size of the app. In addition, you may want to store your users' data, like pictures and video, in the cloud.
Store shared data: Often you would want to store app data, e.g. settings, as key-value pairs in a NoSQL database and query it for fast access.
Push Notifications: Coming back to user engagement, push notifications are a great way to engage your users. You can use push notifications to remind users of a special ongoing promotion, breaking news, or an update to your app. It's a great way to bring users back to your app.
Analyze App Usage & Track Retention: Once you deploy your app, you would want to track how it is performing. You would want to track usage of your app and how well you are retaining users. Common things to track are active users, session duration, and revenue-related metrics like revenue per daily active user.
Analyze User Behavior: You would also want to track user behavior, or how users interact with your app: do they follow the UX flow you expect, and where do they drop off?
Stream data in real time: You would want to collect large amounts of custom metrics from your app, like clickstream logs, for offline analysis.
Authorize Access: Most importantly, you want to provide secure and authorized access to cloud services.
Now let's see how AWS can help you in each of these areas.
How to build an app
1. Authentication
2. Authorization
3. Data Storage and Delivery (Upload and Download)
4. Data Analytics
5. Data Synchronization
6. Push Notifications
7. Shared Data
8. Stream real-time data
All performance characteristics measured as MB/s throughput – NOT IOPS
Mention 20 MB/s for a 500 GB volume
Data only volumes, cannot be a boot / root volume
Mention Latency model for HDD based volume types:
Will include both seek time and access latency
Double digit ms latency vs single in SSD
high throughput sequential IO workloads:
Log processing
ETL
Data warehousing
Hadoop, Kafka, Vertica
Any workload that thrives on large block, sequential throughput performance
Same platform for both new volume types – just the baseline and burst performance thresholds are different (oh and price, of course)
Cost Optimized throughput
Logging, Backup, Archive
6 MB/s for 500GB volume
The third capability of Elastic Volumes gives you the ability to increase or decrease the provisioned IOPS of io1 volumes.
Amazon S3 Reduced Redundancy
99.99% durability vs. 99.999999999%
Up to 20% savings
Great for everything that is easy to reproduce
Amazon Glacier
Same durability as S3
3 to 5 hours restore time
Up to 89% savings
Great for archiving, long-term backups and old data
Talk about "No Tags, No Instances"
Fine-grained monitoring with Budgets 2.0: set alerts based on tags, services, accounts, API access, forecasts, etc.