Escape From Amazon: Tips/Techniques for Reducing AWS Dependencies



As more startups use Amazon Web Services, the following scenario is becoming increasingly frequent: the startup is acquired but required by the parent company to move away from AWS and into the parent's own data centers. Given the all-encompassing nature of AWS, this is not a trivial task and requires careful planning at both the application and systems levels. In this presentation, I recount my experiences at Delve, a video publishing SaaS platform, with our post-acquisition migration to Limelight Networks, a global CDN, during a period of tremendous traffic growth. In particular, I share some of the tips and techniques we employed during this process to reduce AWS dependence and evolve to a hybrid private/AWS global architecture that allowed us to compete effectively with other digital video leaders.

Published in: Technology


  1. 1. Escape from Amazon: Tips/Techniques for Reducing AWS Dependence
     Soam Acharya, PhD
     Chief Scientist, Limelight Video Platform
     @soamwork
     Oct 2012
  2. 2. From Amazon
  3. 3. Introduction
     • Without Amazon we wouldn’t be where we are today
     • Audience for this talk:
       – Advanced AWS users
         • Too much of a good thing
         • Have to stop using AWS
       – Beginners
         • Design system to avoid pitfalls
  4. 4. Agenda
     • Why Reduce AWS Dependence?
     • Case Study: Delve, now Limelight Video Platform
       – Who Are We?
       – Our Experiences
         • Pre Migration Status
         • Challenges
         • Current Setup
     • Lessons Learned: Tips/Techniques For Reducing AWS Dependencies & Costs
  5. 5. Why Reduce AWS Dependence?
     • Outages
       – Not limited to a single service
  6. 6. Why Reduce AWS Dependence?
     • Service deprecation
       – SimpleDB
     • Shared public cloud
       – Multi-tenancy issues
     • Business Reasons:
       – “frenemy”, i.e. you compete with Amazon in something
       – Single vendor lock-in
         • Reduces leverage
  7. 7. Why Reduce AWS Dependence? $$$
     • Scenario #1:
       – Startup acquisition
       – Required to migrate
     • Scenario #2:
       – Grow too big for your own good
       – Economical to run your own hardware
  8. 8. Case Study - Limelight Video Platform (LVP)
     • Many world class customers
       – NFL, Sony, QVC, Pokemon, MBC, Hearst, Prudential, Alloy Media etc
     • Global footprint
       – 100+ countries, 5000+ websites
     • Based in Seattle with employees in SF, NYC, LON, LAX
     • Founded in 2006 as Pluggd
     • Pivoted in 2008 as Delve Networks
       – Online Video Platform (OVP)
       – Competes with Ooyala, Brightcove, Kaltura
     • Acquired by Limelight Networks in August 2010
       – Limelight is a global content delivery network
  9. 9. LVP Workflow
     [Diagram: upload, manage, transcode, publish, analytics]
  10. 10. Backend Notes
     • SOA
       – Java, Spring, Hibernate, MySQL, NoSQL, REST etc
  11. 11. Case Study – LVP AWS Usage History
     • Delve Networks:
       – Founded by ex-Amazon folks
       – Started moving to AWS in Summer 2008
         • Used Scalr for cloud management
         • At peak:
           – Several hundred EC2 instances
           – ELB, S3, SimpleDB, EMR, CloudFront, CloudWatch, EBS, SQS
     • Acquired by Limelight Networks in August 2010
       – Migration work started in late Fall 2010
  12. 12. Migration Challenges – AWS Dependence
  13. 13. Migration Challenges – LVP Growth
  14. 14. Migration Challenges - Other
     • LVP
       – Personnel
       – Service interdependencies
       – Growing pains
         • Our own services
         • AWS outages
     • Limelight Integration/Migration Challenges
       – Machines:
         • Obtaining
         • Environment
         • Placement
         • Maintenance
       – Operation philosophies
         • CDN vs SaaS
  15. 15. Current Status
     • Hybrid model
       – Limelight
         • 4 data centers
           – ~400
           – 50+ services/handlers
         • Other infrastructure
           – Hadoop cluster
           – Databases
           – CDN services
       – AWS services
         • Burst into EC2
         • S3, DynamoDB, SimpleDB, SQS, Elastic Map Reduce
           – Work continuing on reducing dependence on these
  16. 16. Tips/Techniques for Reducing AWS Dependency and Costs
     • Machine Placement
     • Caching
     • Parallelization
     • Open Source + Alternative Services
     • Cross service redundancy
     • Miscellaneous tips
  17. 17. Tip: Machine Placement
     • Our strategy: use EC2 as little as possible for steady state
     • Where to put non-EC2 machines?
       – Still need access to other AWS services
         • Weight of data
       – Find data centers as close as possible to target AWS center (N Virginia)
         • Proximity is important
       – S3 files visible from one data center may not be immediately visible from another
       – One data center isn’t enough:
         • Service, geo redundancy
  18. 18. Tip: Machine Placement
     • Limelight POPs:
       – Direct connections to access networks
       – Global fiber-optic interconnect
       – But:
         • POP capacity
         • Placement within POP
         • Shipping ..
  19. 19. Machine Placement - PHX
     • Started off in PHX
     • Close to Limelight HQ
     • S3 download tests conducted every hour over a week
     • Early 2011
  20. 20. Machine Placement – SoftLayer/Houston
     • From SoftLayer in Houston
     • Has peering arrangement with Amazon
  21. 21. Machine Placement - ATL
     • From Atlanta POP
  22. 22. Machine Placement - EC2
     • From EC2 in N Virginia
  23. 23. Machine Placement - IAD
     • From IAD
       – Best non-EC2 performance
       – One external hop away
     • But even within IAD:
       – Machine NIC
       – Switch/Router setup
     • Peering helps
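
The hourly S3 download tests behind the placement comparison could be approximated along these lines. This is a minimal sketch, not the harness used for the talk: `time_download`, `compare_sites`, and the `fetch` callable are hypothetical names, and the actual S3 request is left to the caller so the sketch runs without credentials.

```python
import statistics
import time
from typing import Callable, Dict, List, Tuple


def time_download(fetch: Callable[[], bytes]) -> float:
    """Run one download via `fetch` and return throughput in bytes/second.

    `fetch` is a hypothetical callable that downloads a fixed test object
    (for example from S3) and returns its bytes.
    """
    start = time.perf_counter()
    data = fetch()
    elapsed = time.perf_counter() - start
    return len(data) / elapsed if elapsed > 0 else float("inf")


def compare_sites(samples: Dict[str, List[float]]) -> List[Tuple[str, float]]:
    """Rank candidate sites by median throughput, fastest first.

    `samples` maps a site label (e.g. "PHX", "IAD") to the throughput
    values collected there, one per hourly run.
    """
    ranked = sorted(
        ((statistics.median(values), site) for site, values in samples.items()),
        reverse=True,
    )
    return [(site, median) for median, site in ranked]
```

Each candidate site would run `time_download` on a schedule (hourly over a week in the talk) and the collected samples would then be fed to `compare_sites`. The median is used here so that a single slow run does not dominate the ranking.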
  24. 24. Caching
     • Tip: cache access to AWS services
       – Save on RTT
       – Better redundancy, fault tolerance
       – AWS bandwidth costs
  25. 25. Caching: LVP Analytics Reporting
     [Diagram labels: S3, SimpleDB, DynamoDB; LLNW reporting service + memcached clusters]
     • Need to quickly fetch, assemble analytics reports
     • SimpleDB: charged by usage
  26. 26. Caching: Transcoding
     [Diagram labels: AWS Virginia, IAD, video processing handlers (two groups), S3]
     • Video processors (transcoders, thumbnail processors …) require access to original video
     • Bandwidth out of AWS - $$
  27. 27. Caching: Transcoding
     [Diagram labels: AWS Virginia, IAD, video processing handlers (two groups), S3, LL Proxy]
     • Use Limelight Proxy Caching
  28. 28. Caching: Transcoding
     [Diagram labels: AWS Virginia, IAD, video processing handlers, S3, LL Proxy; another POP with its own video processing handlers and LL Proxy]
     • Additional benefits
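
The caching tip can be sketched as a small read-through cache with a time-to-live. This is an illustrative in-process stand-in, assuming a dictionary in place of the memcached clusters and the Limelight caching proxy mentioned in the slides; `ReadThroughCache` and `loader` are hypothetical names.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class ReadThroughCache:
    """Serve repeated reads locally and call the remote service only on a miss.

    `loader` is a hypothetical callable that performs the expensive remote
    read (for example a SimpleDB or S3 request). Entries expire after
    `ttl_seconds` so stale data is eventually refreshed.
    """

    def __init__(
        self,
        loader: Callable[[Hashable], Any],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh hit: no round trip, no usage charge
        value = self._loader(key)  # miss or expired: one remote read
        self._entries[key] = (now + self._ttl, value)
        return value
```

Every hit avoids a round trip to AWS and, for usage-billed services, the corresponding charge. The injectable `clock` is only there to make expiry easy to exercise in tests.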
  29. 29. Parallelization
     • AWS services are set up to be highly distributed
     • Construct application/systems to parallelize requests:
       – Useful for applications/systems located outside AWS
       – Pipelining to get around large RTTs to AWS
     • Example:
       – Our transcoding
       – Our real time analytics processing
  30. 30. Parallelization – RT Processing
     [Diagram labels: SimpleDB (metadata lookup), “fast” logs, S3, Job Controller, Hadoop process, Reports, SimpleDB]
     • Hadoop process in IAD
  31. 31. Parallelization – RT Processing
     [Diagram labels: SimpleDB (metadata lookup), “fast” logs, S3, Job Controller, Hadoop nodes (h), Reports, SimpleDB]
     • Move to LL Hadoop cluster in PHX
     • Further away from AWS but ….
  32. 32. Parallelization/Caching – RT Processing
     [Diagram labels: SimpleDB, cache, “fast” logs, S3, Job Controller, Hadoop nodes (h), Reports, SimpleDB]
     • Introduce caching into the mix
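
Issuing many requests concurrently, rather than one after another, is one way to hide a large round-trip time to AWS. The sketch below uses a standard-library thread pool; `fetch_all` and `fetch_one` are hypothetical names, and `fetch_one` stands in for whatever remote lookup the application performs.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Hashable, Iterable, TypeVar

V = TypeVar("V")


def fetch_all(
    keys: Iterable[Hashable],
    fetch_one: Callable[[Hashable], V],
    max_workers: int = 16,
) -> Dict[Hashable, V]:
    """Fetch every key concurrently and return a key -> value mapping.

    With sequential requests the total time is roughly N * RTT; with a
    pool of workers it approaches (N / max_workers) * RTT, assuming the
    remote service can absorb the parallel load.
    """
    key_list = list(keys)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs keys with results.
        values = list(pool.map(fetch_one, key_list))
    return dict(zip(key_list, values))
```

Threads suit this case because the work is dominated by waiting on the network. If any `fetch_one` call raises, the exception surfaces when the results are collected.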
  33. 33. Open Source + Alternative Services
     • Moving out of AWS means you have to find alternatives
     • Sometimes involves multiple building blocks
     • Alternatives to
       – SimpleDB
         • MongoDB instances
       – CloudWatch
         • Cloudkick
         • Zabbix
       – S3
         • GlusterFS, Limelight Cloud Storage
       – ELB
       – Public cloud
  34. 34. ELB Alternative
     • Use Limelight’s Traffic Balancer product (DNS-XD)
     • nginx
  35. 35. ELB Alternative II
     • Traffic Balancer also allows geo based request routing
  36. 36. Private Cloud Alternative
     • At AWS:
       – Used Scalr for cloud management
       – Amazon constantly improving own tools
     • At Limelight:
       – Original vision:
         • Use something like Eucalyptus/OpenStack
         • Seamless amalgam of public-private cloud using Scalr
       – Rude reality:
         • Learning curve
         • Price, maintenance
         • Didn’t know internal Limelight processes, network topology
         • Business reality: start migration ASAP
  37. 37. Private Cloud Alternative
     • Opscode’s Chef
       – Infrastructure as code
       – Infrastructure as a service
         • Hosted version of Chef
     • We use Chef for:
       – Node management
       – Service deployment
         • Limelight
         • Starting to use in EC2 as well
  38. 38. Private Cloud Alternative
     • Our infrastructure management model:
       – Recipes:
         • Tomcat service, apache service, java, memcached setup
       – Roles:
         • Use recipes to construct a service
       – Environment:
         • Base, dev, staging, production
       – Node:
         • Environment + roles
     • Difficulties:
       – Rolling deployments
       – Repurposing nodes without virtualization
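
The recipe, role, environment, and node layering above can be modeled in a few lines. This is a plain Python illustration of the composition only; it is not Chef's API or DSL, and the class names and role contents are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Role:
    """A service built from an ordered list of recipe names."""
    name: str
    recipes: List[str]


@dataclass
class Node:
    """A machine described as one environment plus a list of roles."""
    hostname: str
    environment: str  # e.g. "base", "dev", "staging", "production"
    roles: List[Role] = field(default_factory=list)

    def run_list(self) -> List[str]:
        """Flatten the roles into recipes, keeping order and dropping repeats."""
        seen: List[str] = []
        for role in self.roles:
            for recipe in role.recipes:
                if recipe not in seen:
                    seen.append(recipe)
        return seen
```

For example, a node carrying a hypothetical `api` role (`java`, `tomcat`) and a `cache` role (`java`, `memcached`) flattens to `java`, `tomcat`, `memcached`, so a shared recipe is applied once.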
  39. 39. Cross Service Redundancy
     • Backup data
     • Example: we keep copies in S3 of reports stored in SimpleDB, DynamoDB
       – Alternative source if SimpleDB, DynamoDB goes down
       – Also:
         • Easy to copy reports to other alternatives
         • Don’t have to incur additional AWS costs pulling entire corpus out of dbs
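
The pattern of keeping a second copy of each report in another service, and reading from it when the primary is down, could look like the sketch below. It assumes both stores expose a dict-like interface; `RedundantReportStore` is a hypothetical name, and real SimpleDB, DynamoDB, or S3 clients would need thin adapters.

```python
from typing import Any, MutableMapping


class RedundantReportStore:
    """Write each report to two stores and fall back to the copy on read failure.

    `primary` stands in for the database holding reports and `backup` for
    the copy kept in object storage. Both are assumed to behave like
    mutable mappings.
    """

    def __init__(
        self,
        primary: MutableMapping[str, Any],
        backup: MutableMapping[str, Any],
    ) -> None:
        self._primary = primary
        self._backup = backup

    def put(self, report_id: str, report: Any) -> None:
        # Write the backup first so a copy exists even if the primary write fails.
        self._backup[report_id] = report
        self._primary[report_id] = report

    def get(self, report_id: str) -> Any:
        try:
            return self._primary[report_id]
        except Exception:
            # Primary unavailable or missing the item: use the backup copy.
            return self._backup[report_id]
```

Catching every exception keeps the sketch short. A production version would likely distinguish an outage from a missing key and log the fallback.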
  40. 40. Other Miscellaneous Tips
     • S3:
       – Compress files!
         • Save storage costs
         • Less time to transfer over networks
     • Elastic Map Reduce:
       – Multitenancy issues affect performance
         • Time of day
         • Instance type
       – Non cluster compute instances
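
Compressing before upload needs nothing beyond the standard library. A minimal sketch, with hypothetical helper names and the upload itself left out:

```python
import gzip


def compress_for_upload(data: bytes) -> bytes:
    """Gzip the payload before it is stored or sent over the network."""
    return gzip.compress(data)


def restore_after_download(blob: bytes) -> bytes:
    """Reverse compress_for_upload after the object is fetched."""
    return gzip.decompress(blob)


def savings_ratio(data: bytes) -> float:
    """Fraction of bytes saved by compression (0.0 when there is no gain)."""
    if not data:
        return 0.0
    return max(0.0, 1.0 - len(compress_for_upload(data)) / len(data))
```

Text-heavy files such as logs and reports usually shrink a lot, while already-compressed media such as encoded video gains little, so `savings_ratio` on a sample is a cheap way to decide whether the extra CPU time is worthwhile.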
  41. 41. Other Miscellaneous Tips
     • DynamoDB:
       – A big component of DynamoDB bill is read/write provisioning speed
         • Limits on how often provisioning can be changed
         • Can be reduced only once a day
       – Toggle speeds if uploads can be batched
         • Raise write throughput prior to uploading the bulk of our data for the day, then reduce
     [Chart: DynamoDB write speed vs. time during a day, with markers for “Start most of the day’s uploads” and “Complete most of the day’s uploads”]
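
The throughput toggle can be expressed as a small scheduling rule: one raise before the batch window and one reduction after it, which respects the once-a-day limit on reductions described in the slide. The function names, the hour-based window, and the `set_units` callable are assumptions for illustration; the actual provisioning call to DynamoDB is left to the caller.

```python
from typing import Callable


def planned_write_units(
    hour: int,
    upload_start: int,
    upload_end: int,
    bulk_units: int,
    idle_units: int,
) -> int:
    """Return the write capacity wanted at `hour` (0-23).

    Capacity is high only inside [upload_start, upload_end), so a day
    contains exactly one raise and one reduction.
    """
    return bulk_units if upload_start <= hour < upload_end else idle_units


def toggle_write_units(
    hour: int,
    current_units: int,
    set_units: Callable[[int], None],
    upload_start: int,
    upload_end: int,
    bulk_units: int,
    idle_units: int,
) -> int:
    """Apply the plan, calling `set_units` only when the target changes.

    `set_units` is a hypothetical callable wrapping the real provisioning
    request for the table.
    """
    target = planned_write_units(hour, upload_start, upload_end, bulk_units, idle_units)
    if target != current_units:
        set_units(target)
    return target
```

Run from an hourly scheduler, this raises capacity at the start of the upload window and lowers it once at the end, so the higher rate is paid for only while the bulk upload is in progress.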
  42. 42. Q&A