Migrating enterprise workloads to AWS


Published in: Technology
  • What period are you amortizing hardware across? 
    Are you using the same RI term? 
    Are you comparing vs. heavy utilization RIs?
    How much buffer capacity are you planning on carrying?  If small, what is your plan if you need to add more?  What if you need less capacity?  What is your plan to be able to scale down costs?
    Are you taking labor into account?  What about maintenance (broken disks, patching hosts, servers going offline, etc.)?
    What are you assuming for network gear?  What if you need to scale beyond a single rack?
    What about availability?  Are you accounting for 2N power?  If not, what happens when you have a power issue to your rack?
    What is your bandwidth peak to average ratio?
    Have you modeled in AWS lowering prices over time? 
    Your purchased gear will never get cheaper – and hosting (power and cooling) is not getting cheaper
  • Smarter Agent powers more highly rated and downloaded real estate app titles in the Android, iPhone and BlackBerry marketplaces than any other provider in the real estate vertical. This includes the #1 and #2 downloaded and rated large franchisor apps, the most highly downloaded and rated independent brokerage office app, and many of the top downloaded Multiple Listing Service (MLS) apps.
  • As many metrics as you can manage
    OS level, database, and application metrics
    Time long running activities, and clearly note runtimes in launch plan
  • Migrating enterprise workloads to AWS

    1. • Why Enterprises Choose AWS
        • Enterprise Application Architectures
        • Seven design principles for AWS
        • Best Practices
        • Migration Approach
        • Calculating Total Cost of Ownership (TCO)
        • Customer Project : Migration lessons learned
        • Next steps
    2. • No Up-Front Capital Expense
        • Pay Only for What You Use
        • Self-Service Infrastructure
        • Easily Scale Up and Down
        • Improve Agility & Time-to-Market
        • Low Cost
        • Deploy
    3. Technology stack: on-premises solution vs. AWS
        • Network. On premises: VPN, MPLS. AWS: AWS VPC, VPN, AWS Direct Connect
        • Security. On premises: Firewalls, NACLs, routing tables, disk encryption, SSL, IDS, IPS. AWS: AWS Security Groups, AWS CloudHSM, NACLs, routing tables, disk encryption, SSL, IDS, IPS
        • Storage. On premises: DAS, SAN, NAS, SSD. AWS: AWS EBS, AWS S3, AWS EC2 Instance storage (SSD), GlusterFS
        • Compute. On premises: Hardware, Virtualization. AWS: AWS EC2
        • Content Delivery. On premises: CDN Solutions. AWS: AWS CloudFront
        • Databases. On premises: DB2, MS SQL Server, MySQL, Oracle, PostgreSQL, MongoDB, Couchbase. AWS: AWS RDS, AWS DynamoDB, DB2, MS SQL Server, MySQL, PostgreSQL, Oracle, MongoDB, Couchbase
        • Load Balancing. On premises: Hardware and software load balancers, HAProxy. AWS: AWS Elastic Load Balancing, software load balancers, HAProxy
        • Scaling. On premises: Hardware and software clustering, Apache ZooKeeper. AWS: AWS Auto Scaling, software clustering, Apache ZooKeeper
        • Domain Name Services. On premises: DNS providers. AWS: AWS Route 53
    4. Technology stack: on-premises solution vs. AWS (continued)
        • Analytics. On premises: Hadoop, Cassandra. AWS: AWS Elastic MapReduce, Hadoop, Cassandra
        • Data Warehousing. On premises: Specialized hardware and software solutions. AWS: AWS Redshift
        • Messaging and workflow. On premises: Messaging and workflow software. AWS: AWS Simple Queue Service, AWS Simple Notification Service, AWS Simple Workflow Service
        • Caching. On premises: Memcached, SAP HANA. AWS: AWS ElastiCache, Memcached, SAP HANA
        • Archiving. On premises: Tape library, off-site tape storage. AWS: AWS Glacier
        • Email. On premises: Email software. AWS: AWS Simple Email Service
        • Identity Management. On premises: LDAP. AWS: AWS IAM, LDAP
        • Deployment. On premises: Chef, Puppet. AWS: AWS AMIs, AWS CloudFormation, AWS OpsWorks, AWS Elastic Beanstalk, Chef, Puppet
        • Management and Monitoring. On premises: CA, BMC, RightScale. AWS: AWS CloudWatch, CA, BMC, RightScale
    5. Seven design principles (1 to 3)
        1. Design for failure and nothing fails
        2. Loose coupling sets you free
        3. Implement elasticity
    6. Seven design principles (4 to 7)
        4. Build security in every layer
        5. Don’t fear constraints
        6. Think parallel
        7. Leverage different storage options
    7. Design for failure
        – Avoid single points of failure
        – Assume everything fails and design backwards
            • Goal: Applications should continue to function even if the underlying physical hardware fails or is removed/replaced.
        [Diagram: App Server, Database Server (Primary), Database Server (Secondary), automatic failover]
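The failover idea on this slide can be sketched in a few lines of Python. This is a minimal illustration only: the `Node` class, the node names, and `pick_active` are hypothetical and are not an AWS or database API. A real deployment would rely on the database's own replication and health checking.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    role: str       # "primary" or "secondary" (hypothetical labels)
    healthy: bool


def pick_active(nodes):
    """Return the node that should serve traffic.

    Prefer a healthy primary; otherwise fail over to a healthy secondary.
    """
    for role in ("primary", "secondary"):
        for node in nodes:
            if node.role == role and node.healthy:
                return node
    raise RuntimeError("no healthy database node available")


nodes = [
    Node("db-primary", "primary", True),
    Node("db-secondary", "secondary", True),
]
active_before = pick_active(nodes).name

nodes[0].healthy = False            # simulate a hardware failure on the primary
active_after = pick_active(nodes).name
```

The application server never hard-codes which database it talks to; it asks for the active node each time, so losing the primary does not require a code or configuration change.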
    8. Loose coupling sets you free
        – Use a queue to pass messages between components
        [Diagram: Web Servers, App Servers, Queue, Video Processing Servers. Decouple tiers with a queue]
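As a rough sketch of decoupling tiers with a queue, the following Python uses the standard library's in-process `queue.Queue` as a stand-in for a managed queue service. The function name `run_pipeline`, the job IDs, and the `transcoded:` prefix are assumptions made for illustration.

```python
import queue
import threading


def run_pipeline(video_ids, worker_count=2):
    """The app tier enqueues jobs; video workers drain the queue independently."""
    jobs = queue.Queue()
    processed = []
    lock = threading.Lock()

    def worker():
        while True:
            video_id = jobs.get()
            if video_id is None:        # sentinel: no more work for this worker
                jobs.task_done()
                return
            with lock:
                processed.append(f"transcoded:{video_id}")
            jobs.task_done()

    workers = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in workers:
        t.start()
    for video_id in video_ids:          # producer side: just enqueue and move on
        jobs.put(video_id)
    for _ in workers:                   # one sentinel per worker
        jobs.put(None)
    jobs.join()
    for t in workers:
        t.join()
    return sorted(processed)


result = run_pipeline(["a", "b", "c"])
```

The producer only knows about the queue, not about how many workers exist, so the worker count can change without touching the producing tier.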
    9. Implement elasticity
        – Elasticity is a fundamental property of the cloud
        – Don’t assume the health, availability, or fixed location of components
        – Use designs that are resilient to reboot and re-launch
        – Bootstrap your instances
            • When an instance launches, it should ask “Who am I and what is my role?”
        – Favor dynamic configuration
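One way to picture the "Who am I and what is my role?" question is a launch script that reads parameters passed to the instance and picks its setup steps from them. The sketch below assumes a simple `key=value` text format and a hypothetical `ROLE_STEPS` table; neither comes from the slides, and the step names are placeholders.

```python
# Hypothetical mapping from role to setup steps (placeholders, not real commands).
ROLE_STEPS = {
    "web": ["install apache", "register with load balancer"],
    "app": ["install jboss", "fetch application code"],
}


def parse_user_data(user_data):
    """Parse key=value lines such as might be passed to an instance at launch."""
    config = {}
    for line in user_data.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    return config


def bootstrap_plan(user_data):
    """Answer 'what is my role?' and return the steps for that role."""
    config = parse_user_data(user_data)
    role = config.get("role")
    if role not in ROLE_STEPS:
        raise ValueError(f"unknown or missing role: {role!r}")
    return role, ROLE_STEPS[role]


role, steps = bootstrap_plan("# launch parameters\nrole=web\nenv=prod\n")
```

Because the role arrives as data at launch time, the same machine image can serve several roles, which is what "favor dynamic configuration" is getting at.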
    10. Build security in every layer
        Security is a shared responsibility. You decide how to:
        – Encrypt data in transit and at rest
        – Enforce principle of least privilege
        – Create distinct, restricted Security Groups for each application role
            • Restrict external access via these security groups
        – Use multi-factor authentication
    11. Don’t fear constraints
        – Need more RAM?
            • Horizontal : Consider distributing load across machines or a shared cache
            • Vertical : Stop the instance, change its type, and restart it
        – Need better IOPS for database?
            • Instead, consider multiple read replicas, sharding, or DB clustering
        – Hardware failed or config got corrupted?
            • “Rip and replace”: simply toss bad instances and instantiate replacements
    12. Think parallel
        – Experiment with parallel architectures
        Same cost (i.e., 4 instance hours), but parallel is 4x faster
        [Chart timeline: Hour 1, Hour 2, Hour 3, Hour 4]
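The arithmetic behind "same cost, 4x faster" can be checked with a short Python sketch. It assumes the work divides perfectly across instances and is billed per instance hour; the hourly rate of 0.10 is a made-up figure, not an AWS price.

```python
import math


def cost_and_duration(total_instance_hours, instances, hourly_rate):
    """Return (cost, wall-clock hours) for perfectly parallelizable work."""
    duration_hours = total_instance_hours / instances
    cost = instances * duration_hours * hourly_rate
    return cost, duration_hours


HOURLY_RATE = 0.10  # hypothetical price per instance hour

serial_cost, serial_hours = cost_and_duration(4, 1, HOURLY_RATE)
parallel_cost, parallel_hours = cost_and_duration(4, 4, HOURLY_RATE)
speedup = serial_hours / parallel_hours
```

One instance for four hours and four instances for one hour consume the same four instance hours, so the bill is the same while the result arrives four times sooner.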
    13. Auto Scaling and Elasticity
        • “AWS enables Netflix to quickly deploy thousands of servers and terabytes of storage within minutes. Users can stream Netflix shows and movies from anywhere in the world, including on the web, on tablets, or on mobile devices such as iPhones.”
        • From 40 EC2 instances to 5k instances after launching the Facebook application
    14. High Availability
        • Within Amazon EC2, Airbnb is using Elastic Load Balancing, which automatically distributes incoming traffic between multiple Amazon EC2 instances.
        • HA using Elastic Load Balancer with Apache-WLS, Oracle WebLogic and Oracle RAC in a multi-AZ configuration
    15. Disaster Recovery
        • Washington Trust Bank and AWS Advanced Consulting Provider IT-Lifeline use the AWS cloud to cut disaster recovery costs, reduce overhead, and improve recovery time in a compliance-driven industry.
        • DiskAgent protects their healthcare industry customers against physical systems damage by storing backed-up records offsite, in multiple Amazon data centers.
    16. VPC
        • Use it…VPC by default for new accounts
        • Database in private subnet
        VPN
        • Redundant connections
        • Consider two Customer Gateways
        • Dynamic routing (BGP) over static (ASA)
        NAT
        • Set up multi-AZ NAT
        IDS/IPS
        • Trend Micro, AlertLogic, Snort
        • Host based
        • Conduct penetration test : prior approval from AWS
        Dedicated, secure connection
        • Direct Connect - 1 Gbps or 10 Gbps
        Fail over
        • ELB : Multi-AZ
        • Route 53 : Geo/region
    17. Next Session
    18. EBS
        • PIOPS (applies to I/O with a block size of 16KB)
        • Stripe using RAID 0, 10, LVM, or ASM
        • RAID 10 (can decrease performance)
        • Snapshot often : Single volume DB
        • 20 TB DB size (potential max) : Depends upon IOPS and instance type (1 Gbps or 10 Gbps)
        Tuning
        • Maintain an average queue length of 1 for every 200 provisioned IOPS in a minute
        • Pre-warm: $ dd if=/dev/md0 of=/dev/null
        • fio, Oracle ORION
        • Database Compression
        File system
        • ext3/4, XFS (less mature)
        • Try different block sizes : start with 64K
        Striping
        • Stripe multiple volumes for more IOPS (e.g., (20) x 2,000 IOPS volumes in RAID0 for 40,000 IOPS)
        • ASM (Oracle) with external redundancy
        • More difficult to Snapshot : Use OSB, database backup solution
        Storage
        • Use Instance storage for temporary storage or database
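The two rules of thumb on this slide (IOPS add up across a RAID 0 stripe, and roughly one unit of queue length per 200 provisioned IOPS) reduce to simple arithmetic. The Python below is a back-of-the-envelope sketch; the function names are made up, and it ignores the instance-level bandwidth limit that the slide also mentions.

```python
def striped_iops(volume_count, iops_per_volume):
    """Aggregate provisioned IOPS of volumes striped together (RAID 0)."""
    return volume_count * iops_per_volume


def target_queue_length(provisioned_iops, iops_per_queue_unit=200):
    """Average queue length to aim for: 1 for every 200 provisioned IOPS."""
    return provisioned_iops / iops_per_queue_unit


total_iops = striped_iops(20, 2000)           # the slide's example: 20 x 2,000 IOPS
queue_length = target_queue_length(total_iops)
single_volume_queue = target_queue_length(2000)
```

For the slide's 20-volume example this gives 40,000 IOPS in aggregate and a target average queue length of 200 across the stripe, or 10 per 2,000 IOPS volume.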
    19. AMIs
        • Use vendor provided
        • Build your own AMI
        Boot Strapping
        • User data/scripts
        • CloudFormation
        • Consider Chef, Puppet, OpsWorks
        EC2
        • EBS optimized, cluster compute and storage optimized instances
        • SSD backed for high performance IO : hi1.4xlarge has 2 TB of SSD attached storage
        • SSD backed, high memory instance for cached database using Oracle Smart Flash Cache: cr1.8xlarge has 240 GB of SSD plus 244 GB of memory and 88 ECUs
        • Turn off (stop) when not using EBS
        • Install software binaries on a separate EBS volume
        https://s3.amazonaws.com/cloudformation-examples/BoostrappingApplicationsWithAWSCloudFormation.pdf
    20. Scaling
        • Vertical Scaling with EC2 : stop instance and change instance type
        • Horizontal scaling for web and application servers : auto scaling
        • Horizontal Scaling for Database with Read Replicas and multi-AZ
            • This will need to be configured using Oracle Active Data Guard, Oracle GoldenGate, or 3rd party technology
        • Amazon CloudWatch : detailed monitoring, custom metrics
        • Amazon Route 53 : Latency based routing to route traffic to the region closest to the user. Requires replicated, sharded, or geo dispersed databases
        HA
        • Elastic IPs and Elastic Network Interfaces (ENIs)
        • Active-passive multi-AZ using Oracle Data Guard or other replication solutions
        • Active-Active multi-AZ using Oracle GoldenGate or other replication solutions
        • Amazon Route 53 : Now supports health checks for multi-region HA
        • ELB : Web and Application Server for multi-AZ HA. Health checks (HTML file) to see if Oracle DB is up and running. Associate ENI / Elastic IP to new Oracle DB.
    21. http://aws.amazon.com/whitepapers
    22. Questions to ask?
        • Is it a technology fit?
        • Is there a pressing business need the migration would address?
        • Is there an immediate or potentially big business impact the migration may have?
        [Diagram: Existing Applications, “No-brainer to move” Apps, Planned Phased Migration]
        Examples
        • Dev/Test applications
        • Self-contained Web Applications
        • Social Media Product Marketing Campaigns
        • Customer Training Sites
        • Video Portals (Transcoding and Hosting)
        • Pre-sales Demo Portal
        • Software Downloads
        • Trial Applications
    23. Proof of concept will answer tons of questions quickly
        • Get your feet wet with Amazon Web Services
            – Learning AWS
            – Build reference architecture
            – Be aware of the security features
        • Build a Prototype/Pilot
            – Build support in your organization
            – Validate the technology
            – Test legacy software in the cloud
            – Perform benchmarks and set expectations
    24. Plan
        • Select apps
        • Test platform
        • Plan migration
        Deploy
        • Migrate data
        • Migrate components
        • Cutover
        Optimize
        • Embrace AWS services
        • Re-factor architecture
    25. [Chart: data transfer options plotted by Data Size* (GBs to TBs) against Data Velocity Required (Hours to Days)]
        * relative to internet bandwidth and latency
        • One-time upload w/ constant delta updates
        • Transfer to S3 Over Internet
        • UDP Transfer Software (e.g., Aspera, Tsunami, …)
        • Attunity Cloudbeam
        • AWS Storage Gateway
        • Riverbed
        • AWS Import / Export
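Because the chart measures data size relative to available bandwidth, a quick estimate of wire time helps decide whether an online transfer is realistic. The Python below is a rough sketch: the 80% link utilization default and the example figures (10 TB over a 100 Mbps link) are assumptions, and it ignores latency and protocol overhead.

```python
def transfer_days(data_terabytes, link_mbps, utilization=0.8):
    """Estimate days needed to push a dataset over a network link.

    data_terabytes: dataset size in decimal terabytes (10**12 bytes)
    link_mbps:      nominal link speed in megabits per second
    utilization:    assumed fraction of the link actually usable
    """
    bits = data_terabytes * 1e12 * 8
    seconds = bits / (link_mbps * 1e6 * utilization)
    return seconds / 86400


days_100mbps = transfer_days(10, 100)     # 10 TB over 100 Mbps
days_1gbps = transfer_days(10, 1000)      # same data over 1 Gbps
```

Under these assumptions 10 TB takes roughly eleven and a half days at 100 Mbps and a little over one day at 1 Gbps, which is the kind of gap that moves a workload from "over the internet" toward a shipped-disk or dedicated-link option.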
    26. Forklift, Embrace, Optimize
        [Chart: Effort, Scalability, and Operational Burden compared across Forklift, Embrace, and Optimize]
        Forklift
        • May be only option for some apps
        • Run AWS like a virtual co-lo (low effort)
        • Does not optimize for on-demand (over-provisioned)
        Embrace AWS
        • Minor modifications to improve cloud usage
        • Automating servers can lower operational burden
        • Leveraging more scalable storage
        Optimize for AWS
        • Re-design with AWS in mind (high effort)
        • Embrace scalable services (reduce admin)
        • Closer to fully utilized resources at all times
    27. Forklift steps:
        Match resources and build AMIs
        • Think about application needs, not server specs
        • Build out custom AMI for application roles
        Convert appliances:
        • Map appliances to AWS services or virtual appliance AMIs
        Deploy supporting components:
        • NAS replacements
        • DNS
        • Domain controllers
        Secure the application components:
        • Use layered security groups to replicate firewalls
        [Diagram: ELB, AMI-1 @ C1.Medium, AMI-2 @ M2.XLarge, AMI-3 @ C1.Medium, AMI-4 @ M1.Large, AMI-5 @ M2.2XLarge, AMI-6 @ M2.XLarge]
    28. Steps to Embrace AWS:
        Rethink storage:
        • Leverage S3 for scalable storage
        • Edge cache with CloudFront
        • Consider RDS for HA RDBMS
        Scale out and in on-demand:
        • Use CloudWatch and Auto-scaling to auto-provision the fleet
        Parallelize processing:
        • Bootstrap AMIs for auto-discovery
        • Pass in bootstrapping parameters
        • Leverage configuration management tools for automated build out
        [Diagram: ELB, Web Tier Auto-scaling Group (Web Servers), App Tier Auto-scaling Group (App Servers), Master Database, Network Filesystem, Domain Controller, DNS, Config Management Server]
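"Scale out and in on-demand" boils down to a rule that maps a monitored metric to a fleet size. The Python below is a simplified sketch of such a rule; the CPU thresholds, the step size of one instance, and the group limits are all hypothetical values chosen for illustration, not defaults of any AWS service.

```python
def desired_capacity(current, avg_cpu_percent,
                     min_size=2, max_size=10,
                     scale_out_at=70.0, scale_in_at=30.0):
    """Return the fleet size a simple threshold policy would ask for.

    Adds one instance when average CPU is above scale_out_at, removes one
    when it is below scale_in_at, and always stays within [min_size, max_size].
    """
    if avg_cpu_percent > scale_out_at:
        target = current + 1
    elif avg_cpu_percent < scale_in_at:
        target = current - 1
    else:
        target = current
    return max(min_size, min(max_size, target))


busy = desired_capacity(current=2, avg_cpu_percent=85.0)
idle = desired_capacity(current=2, avg_cpu_percent=10.0)
capped = desired_capacity(current=10, avg_cpu_percent=95.0)
steady = desired_capacity(current=4, avg_cpu_percent=50.0)
```

The gap between the two thresholds keeps the fleet from flapping when load hovers around a single cut-off, and the minimum size preserves redundancy even when the tier is idle.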
    29. A Phased Migration to AWS - Optimize
        Steps to Optimize for AWS:
        Use Spot where possible to reduce costs
        Re-Rethink storage:
        • Break up datasets across storage solutions based on best fit and scalability
        Parallelize processing:
        • Spread load across multiple resources
        • Decouple components for parallel processing
        Embrace scalable on-demand services
        • Scale out systems with minimal effort
        • Route 53
        • SES, SQS, SNS
        • …
        [Diagram: Route 53, Web Tier Auto-scaling Group (Web Servers), App Tier Auto-scaling Group (App Servers), SQS, EMR, Network Filesystem, Domain Controller, DNS, Config Management Server]
    30. #1 Start with a use case or an application: compare apples to apples, capacity utilization, networking, availability, peak to average, DR costs, power, etc.
        #2 Take all the fixed costs into consideration (don’t forget administration, maintenance and redundancy costs)
        #3 Use updated pricing (compute, storage and bandwidth): price cuts, tiered pricing and volume discounts
        #4 Use variable capacity & reserved instances where they fit the business needs
        #5 Intangible costs: take a closer look at what is built in with AWS (security, elasticity, innovation, flexibility)
    31. DOs
        • 3 or 5 Year Amortization
        • Use 3-Year Heavy RIs or Fixed RIs
        • Use Volume RI Discounts
        • Ratios (VM:Physical, Servers:Racks, People:Servers)
        • Mention Tiered Pricing (Less expensive at every Tier : network IO, storage)
        • Cost Benefits of Automation (Auto scaling, APIs, CloudFormation, OpsWorks, Trusted Advisor, Optimization)
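To compare like with like, hardware and reserved instances should be amortized over the same term and the on-premises side should include running costs. The Python below sketches that calculation. Every figure in it is invented for illustration and says nothing about real hardware or AWS prices; the 730 hours per month is an average-month approximation.

```python
def on_prem_monthly(hardware_cost, amortization_years,
                    monthly_power_cooling, monthly_admin):
    """Monthly cost of owned hardware: amortized purchase plus running costs."""
    return (hardware_cost / (amortization_years * 12)
            + monthly_power_cooling + monthly_admin)


def reserved_monthly(upfront_fee, term_years, hourly_rate, hours_per_month=730):
    """Monthly cost of a reserved instance: amortized upfront fee plus usage."""
    return upfront_fee / (term_years * 12) + hourly_rate * hours_per_month


TERM_YEARS = 3  # use the same period on both sides of the comparison

# All numbers below are hypothetical placeholders.
owned = on_prem_monthly(hardware_cost=7200, amortization_years=TERM_YEARS,
                        monthly_power_cooling=50, monthly_admin=100)
reserved = reserved_monthly(upfront_fee=1800, term_years=TERM_YEARS,
                            hourly_rate=0.05)
```

Leaving out power, cooling, or administration makes the owned side look cheaper than it is, and amortizing hardware over five years while pricing a three-year RI skews the comparison the other way.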
    32. DON’Ts
        • Forget Power/Cooling (compute, storage, shared network)
        • Forget Administration Costs (procurement, design, build, operations, network, security personnel)
        • Forget Rent/Real Estate (building depreciation, taxes, shared services staff)
        • Forget VMware Licensing and Maintenance Costs
        • Forget to mention Cost of “Redundancy”, Multi-AZ Facility
    33. BONUS
        • Time from ordering to procurement (Releasing early = Increased Revenue)
        • Cost of “capacity on shelf” (top of step)
        • Incremental cost of adding an on-premises server when physical space is maxed out
        • Real cost of resource shortfalls (bottom of step)
        • Cost of disappointed or lost customers when unable to scale fast enough
    34. • Trusted Advisor: Draws upon best practices learned from AWS’ aggregated operational history of serving hundreds of thousands of AWS customers. The AWS Trusted Advisor inspects your AWS environment and makes recommendations when opportunities exist to save money, improve system performance, or close security gaps.
        • Apptio: Leader in technology business management (TBM), a new category and discipline backed by global IT leaders that helps you understand the cost, quality, and value of the services you provide.
        • CloudHealth: Delivers business insight for your cloud ecosystem. Designed for management and executive teams to optimize AWS performance and costs.
    35. • Leading provider of white label mobile applications and services to real estate industry
        • Powers more real estate app titles than any other in the real estate vertical
        • Multi-level marketing platform
        © 2013 smartShift. All rights reserved. 9/15/2014
    36. [Architecture diagram spanning two Availability Zones]
        • Ingest: From RETS system (3rd Party protocol, Internet), Windows Downloader Server, PREPROC Oracle 11g DB
        • Database: Primary Oracle 11g DB, Active Standby Oracle 11g DB, Redo Log Shipping, Daily Database Backup, EBS Snapshots
        • Web and application: DNS Provider (R53, DNSMadeEasy), Apache + HAProxy 1, Apache + HAProxy 2, Auto scaling Group (JBoss Node 1, JBoss Node 2, JBoss Node 3, … JBoss Node n), Application Code Bucket
    37. • Choose great partners
        • Understand the cloud capabilities trajectory (rapid pace of innovation)
        • Have a strong methodology
        • Implement rich and detailed monitoring
        • Plan for, and perform, as many launch rehearsals as possible
        • EBS provisioned IOPS works as promised
        • AWS continues to rapidly improve services (4K IOPS now available) and reduce costs
        • Multi-AZ implementation
        • Rehearsed DB restorations
    38. • The cloud-based system operates as expected in terms of performance and cost
        • Cloud costs as per our projection (with the use of reserved instances)
        • Project delivered on budget
        • Operational staff requirements reduced
        • Incidentally, physical infrastructure failed on 07/10/13, which would have resulted in a total service outage
        • Lower overall incident rate
        • Application and storage performance highly consistent
        • Infrastructure now a selling point for the business
    39. Here are some additional resources:
        • Get started with a free trial: http://aws.amazon.com/free
        • White papers: http://aws.amazon.com/whitepapers/
        • Reference Architectures: http://aws.amazon.com/architecture/
        • Enterprise on AWS: http://aws.amazon.com/enterprise-it/
        • Executive level Overview : Extending Your Infrastructure to the AWS Cloud (4 minutes): http://www.youtube.com/watch?v=CsGqu5L_PFI
        • Simple Monthly Pricing Calculator: http://calculator.s3.amazonaws.com/calc5.html
        • TCO Calculator for Web Applications: http://aws.amazon.com/tco-calculator/
        tomlasz@amazon.com