Application Optimized Performance: Choosing the Right Instance (CPN212) | AWS re:Invent 2013

(Presented by Intel)
Each application places a different set of requirements on the underlying infrastructure.
Whether it is web, big data analytics, technical computing, or general enterprise applications, applications are run more efficiently when performance, IO bandwidth, and memory capacity have been custom-tailored for that specific application.
Jason Waxman, GM and VP of Intel’s Cloud Platform Group, looks under the hood at the different types of processors that comprise Amazon Web Services instances and shares insights from Intel IT and industry best practices for right-sizing infrastructure for different application characteristics and capabilities. By leveraging the underlying performance, security capabilities, and flexibility of various instance types, developers can more easily migrate applications into the cloud and drive down TCO for cloud-based services.

Published in: Technology, Business
  • Where can I find more information on the 'Crystal Ridge' platform referenced in slide 43?

  1. 1. Delivering Compelling Experiences: Choosing the Right Instance for Application Optimized Performance Jason Waxman, GM & VP Cloud Platforms Group, Intel Corporation November 14, 2013
  2. 2. Compelling Experiences
      • Themes: Personal, Pervasive, Perceptual (drive growth, predictive analytics, improve healthcare)
      • Big Data & Analytics Infrastructure: 43% CAGR [4]
      • Video/Media (content delivery, video search): 16X in mobile video traffic; 4X in servers for media/graphics
      • Natural Interaction (voice & gestures, personal assistant): 20X growth in speech driven mobile network traffic [1]; >22X in smartphones with gesture recognition features [1]
  3. 3. The High Costs of a Bad Experience: On average, consumers tell 9 people about good experiences and 16 about bad experiences [1]. [1] http://www.digitalservicecloud.com/resources/blog/good-customer-service.html
  4. 4. Delivering Compelling Experiences What’s Required?
  5. 5. Diverse Needs, Common Themes
      • AGILITY: developer requirements are Fast Service Delivery and Elasticity/Scalability; consumer expectation is Personal & Customized
      • EFFICIENCY: developer requirements are Cost to Serve and Services & APIs that reduce headcount required; consumer expectation is Cost Effective
      • RELIABILITY: developer requirements are Service Availability and Privacy/Security; consumer expectations are Stable, Consistent and Privacy/Security
  6. 6. Delivering the best experience
      • AGILITY (SCALE): TAKE ADVANTAGE OF AWS ELASTICITY
      • EFFICIENCY (OPTIMIZE): CHOOSE THE RIGHT INSTANCES
      • RELIABILITY (SECURE): MANAGE RISKS FOR INCREASED DURABILITY
  7. 7. Delivering the best experience
      • AGILITY (SCALE): TAKE ADVANTAGE OF AWS ELASTICITY
      • EFFICIENCY (OPTIMIZE): CHOOSE THE RIGHT INSTANCES
      • RELIABILITY (SECURE): MANAGE RISKS FOR INCREASED DURABILITY
  8. 8. Intel is using Amazon Web Services An Intel Company Intel® Cloud Services Platform
  9. 9. Mashery API Management Service (An Intel Company)
      • 175 Customers
      • 60,000 Applications
      • 215,000 Developers
      • 500,000,000 API calls/day
  10. 10. Scaling enables responsiveness: Mashery relies on AWS elasticity
      • Capacity Planning: Robust development and QA environment to perform load tests and deploy proof-of-concept silos
      • Modular Infrastructure: Loosely coupled single-purpose servers that scale horizontally
      • Globally Distributed: Extend the infrastructure to where your customers are: every AWS Region, reaching every corner of the globe
      • 100X news volumes for some customers at the passing of a celebrity in early 2012
      • From 100 queries/sec to 100,000 queries/sec in a matter of minutes
  11. 11. Kevin Baillie, CEO, Atomic Fiction
  12. 12. Company Overview
      • Atomic Fiction crafts high-end visual effects (VFX) for film and television
      • Specialties include digital environments and character work
      • Staff scales with projects, varying between 15 and 50 artists
      • Known for high-end work, medium volume, low cost
      • Big company infrastructure, small company vibe
      • Developing innovative approaches to reducing technological costs in order to free up resources for experienced artistic talent
  13. 13. Company Overview
  14. 14. Our AWS Story
      • Pixar-sized render farm in minutes. iMac-sized the next.
      • Only pay for what we use
      • No physical limits on computing = no physical limits on creativity
      • Seamless experience through tools like ZYNC
      • Decreased load, thus increased performance, on local filesystem
      • Same price for faster artist turnaround
          • 100 computers for 10 hours = $2,200
          • 1000 computers for 1 hour = $2,200
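The two price points on slide 14 follow from cost being proportional to instance-hours rather than to wall-clock time. A minimal sketch of that arithmetic is below. The rate of $2.20 per instance-hour is not stated on the slide; it is back-computed here from the slide's own figures ($2,200 for 1,000 instance-hours), and `render_cost_dollars` is a hypothetical helper name.

```python
# Assumption: a flat rate back-computed from the slide ($2,200 / 1,000 instance-hours).
# Kept in integer cents so the comparison below is exact.
RATE_CENTS_PER_INSTANCE_HOUR = 220_000 // (100 * 10)


def render_cost_dollars(instances: int, hours: int) -> float:
    """Cost of a render job when billing is proportional to instance-hours."""
    return instances * hours * RATE_CENTS_PER_INSTANCE_HOUR / 100


slow_and_narrow = render_cost_dollars(instances=100, hours=10)
fast_and_wide = render_cost_dollars(instances=1000, hours=1)

print(f"100 instances x 10 h: ${slow_and_narrow:,.0f}")
print(f"1000 instances x 1 h: ${fast_and_wide:,.0f}")
```

Under this model the bill is identical, so the only thing that changes when the farm is made ten times wider is how long the artist waits.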
  15. 15. Problems we’re trying to solve
      • The problem: how do we “unlimit” creativity while staying profitable?
      • Freeing the creative so that directors can achieve their vision
      • Artists need iterations in order to hit “the look” and stay on schedule
      • Need security, task-appropriate stats, and unlimited availability on demand
      • Choosing the right instance: speed vs memory vs cost
          • c1.xlarge: 20 compute units, 7GB RAM. Low cost per hour for lightweight compositing tasks
          • cc2.8xlarge: 88 compute units, 60.5GB RAM. Beefy RAM, good $ per compute unit cost proposition
      • Working with our partners at ZYNC, implementation was plug & play!
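The speed-versus-memory trade-off on slide 15 can be made concrete using only the two specifications the slide gives. The sketch below computes RAM per compute unit and filters by a task's memory need. The `InstanceType` class, the `candidates` helper, and the 16 GB example requirement are illustrative assumptions, not anything Atomic Fiction described.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class InstanceType:
    name: str
    compute_units: int  # compute units as quoted on the slide
    ram_gb: float

    @property
    def ram_per_compute_unit(self) -> float:
        """GB of RAM available per compute unit."""
        return self.ram_gb / self.compute_units


# Figures taken from slide 15.
CATALOG = [
    InstanceType("c1.xlarge", compute_units=20, ram_gb=7.0),
    InstanceType("cc2.8xlarge", compute_units=88, ram_gb=60.5),
]


def candidates(min_ram_gb: float) -> List[InstanceType]:
    """Instance types with enough RAM for a task (hypothetical helper)."""
    return [i for i in CATALOG if i.ram_gb >= min_ram_gb]


for inst in CATALOG:
    print(f"{inst.name}: {inst.ram_per_compute_unit:.2f} GB per compute unit")

# Hypothetical task needing 16 GB: only the larger type qualifies.
print([i.name for i in candidates(16.0)])
```

A lightweight compositing task that fits in 7 GB leaves both types eligible, at which point price per hour decides; the slide does not give prices, so they are left out here.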
  16. 16. Star Trek Into Darkness
  17. 17. Star Trek Into Darkness
  18. 18. Star Trek Into Darkness
  19. 19. Star Trek Into Darkness
  20. 20. Key Learnings/What’s Next?
      • For Star Trek Into Darkness, we achieved exactly what JJ Abrams wanted
      • Key findings:
          • For over 80% of tasks, cc2.8xlarge was fastest & most cost effective
          • Given higher per-hour costs, efficient utilization of high end instances is critical. Partial hours = wasted money!
          • Burstability was critical for hitting deadlines. Ran between 0 and 400 instances simultaneously depending on the needs of the moment.
      • Grew over 200% month-over-month two months in a row
      • Our ideal instance would be inexpensive, high compute power (Intel Xeon E5-2600 v2), medium memory (32-48GB) with nVidia GPU.
      • Next for us: moving even more of our workflow into the cloud!
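The "partial hours = wasted money" finding on slide 20 can be illustrated with a small sketch. It assumes each instance's runtime is billed in whole hours, rounded up; the slide implies this but does not spell out the billing rule, and both function names are hypothetical.

```python
import math


def billed_hours(runtime_minutes: int) -> int:
    """Hours charged, assuming each started hour is billed in full."""
    if runtime_minutes <= 0:
        raise ValueError("runtime must be positive")
    return math.ceil(runtime_minutes / 60)


def wasted_fraction(runtime_minutes: int) -> float:
    """Share of the billed time that was paid for but not used."""
    billed_minutes = billed_hours(runtime_minutes) * 60
    return (billed_minutes - runtime_minutes) / billed_minutes


for minutes in (60, 61, 90, 119):
    print(
        f"{minutes:>3} min -> billed {billed_hours(minutes)} h, "
        f"{wasted_fraction(minutes):.0%} wasted"
    )
```

Under this assumption a job that finishes one minute into its second hour pays for nearly a full idle hour, which is why packing work to end near hour boundaries matters more on expensive instance types.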
  21. 21. Delivering the best experience
      • AGILITY (SCALE): TAKE ADVANTAGE OF AWS ELASTICITY
      • EFFICIENCY (OPTIMIZE): CHOOSE THE RIGHT INSTANCES
      • RELIABILITY (SECURE): MANAGE RISKS FOR INCREASED DURABILITY
  22. 22. Optimize: Choose the right Instance
      • [Chart: workloads and EC2 instance types placed on two axes, CPU & Memory Intensive vs. I/O Intensive, and Higher latency, lower throughput vs. Lower latency, higher throughput]
      • Instance types shown: CR1 Memory Optimized, M3 Standard Instance, G2 GPU Instance, Cluster Graphics Instance, Cluster Compute Instance, M2 Memory Optimized, M1 Standard Instance, C1 Compute Instance, High Memory, Micro Instance
      • Workloads shown: Enterprise Applications, Graphics Rendering, High Performance Computing, Cloud RAN, Content Delivery and Gaming, E-Commerce, Dedicated Hosting, Small Cell, Cold Storage, Edge Routing, Big Data, Storage De-dupe, Storage (Low End), Optimized Networking
      • Processors labeled: E5-2670, E5-2650, X5570
  23. 23. Intel® Cloud Services Platform
      • 175,000 users
      • 5M users by 2014
      • Entire path of production is on AWS
      • 1448 instances…
      • Our Wake Up Moment: one month we spent $300K, 60% of which we later found was wasted
          • We were spinning up instances and forgetting they were on
          • We had larger instances than we actually needed
          • Most instances never went over 10% utilization
  24. 24. Optimize for Efficiency: Select the Right Instance
      • Keys to Success
          • Analyze: “Trusted Advisor”
          • Select the right type of instance for your workload: size & features, # of instances
          • Reserve instances where possible for cost efficiency
      • 108 hrs per week of unutilized instances
      • $100K of savings by turning off non-production instances after hours
      • >60% total savings
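The "108 hrs per week" figure on slide 24 is consistent with a non-production instance that is only needed 60 hours a week, for example 12 hours a day on 5 weekdays. That schedule is an assumption made here for illustration; the slide does not say how the number was derived. A minimal sketch of the arithmetic:

```python
HOURS_PER_WEEK = 7 * 24  # 168


def idle_hours_per_week(active_hours_per_day: int, active_days_per_week: int) -> int:
    """Hours per week an always-on instance runs without being needed."""
    return HOURS_PER_WEEK - active_hours_per_day * active_days_per_week


# Assumed schedule: used 12 hours a day, Monday to Friday.
idle = idle_hours_per_week(active_hours_per_day=12, active_days_per_week=5)
idle_share = idle / HOURS_PER_WEEK

print(f"idle hours per week: {idle}")
print(f"share of the week spent idle: {idle_share:.1%}")
```

With that schedule, switching the instance off after hours removes roughly 64% of its running time, which is at least in the same range as the slide's ">60%" savings figure.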
  25. 25. Steve Litster, PhD. Global Head of Scientific Computing Novartis Institutes for Biomedical Research Accelerating Science
  26. 26. Novartis Institutes for BioMedical Research (NIBR)
      • Unique research strategy driven by patient needs
      • World-class research organization with about 6000 scientists globally
      • Intensifying focus on molecular pathways shared by various diseases
      • Integration of clinical insights with mechanistic understanding of disease
      • Research-to-Development transition redefined through fast and rigorous “proof-of-concept” trials
      • Strategic alliances with academia and biotech strengthen preclinical pipeline
  27. 27. Accelerating the Science
      • Requirements
          • Large Scale Computational Chemistry Simulation
          • Results in under a week
          • Flexible target
          • Ability to run multiple experiments “on-demand”
      • Challenges
          • Sustained access to 50000+ compute cores
          • Ability to monitor and re-launch jobs
          • No additional Capital Expenditure
          • Internal HPCC already running at capacity
      • Job Profile
          • Embarrassingly Parallel
          • CPU Bound
          • Low I/O, Memory and Network requirements
      • [Diagram: Virtual Screening, with a Target Molecule binding site as the "Lock" and Compound Molecules as the "Keys"]
  28. 28. The Cloud: Flexible Science on Flexible Infrastructure. Engineering the right infrastructure for a workload:
      • Software runs the same job many times across instance types
      • Measures the throughput and determines the $ per job
      • Use the instances that provide the best scientific ROI
      • CC2 instance (Intel Xeon® ‘Sandy Bridge’) ran best for this
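The selection procedure on slide 28 (benchmark the same job on several instance types, convert throughput to $ per job, pick the cheapest) can be sketched as follows. The instance names, prices, and throughput numbers are invented placeholders, not Novartis measurements, and `cost_per_job` and `best_instance` are hypothetical helpers.

```python
from typing import Dict, Tuple

# Hypothetical benchmark results: name -> (hourly price in $, measured jobs per hour).
Benchmarks = Dict[str, Tuple[float, float]]


def cost_per_job(hourly_price: float, jobs_per_hour: float) -> float:
    """Dollars spent per completed job at a measured throughput."""
    return hourly_price / jobs_per_hour


def best_instance(benchmarks: Benchmarks) -> str:
    """Name of the instance type with the lowest cost per job."""
    return min(benchmarks, key=lambda name: cost_per_job(*benchmarks[name]))


sample: Benchmarks = {
    "type-a": (0.50, 10.0),  # placeholder: cheap per hour, slow
    "type-b": (2.40, 80.0),  # placeholder: expensive per hour, fast
}

for name, (price, rate) in sample.items():
    print(f"{name}: ${cost_per_job(price, rate):.3f} per job")
print("best:", best_instance(sample))
```

In this made-up sample the instance with the higher hourly price wins because its throughput more than compensates, which is the same pattern the slide reports for CC2.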
  29. 29. Super Computing in the Cloud
      • Metric: Count
          • Compute Hours of Science: 341,700 hours
          • Compute Days of Science: 14,238 days
          • Compute Years of Science: 39 years
          • AWS Instance Count (CC2): 10,600 instances
      • $44 Million infrastructure
      • 10 million compounds screened
      • 39 Drug Design years in 11 hours for a cost of $4,232
      • 3 promising compounds identified
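The rows of the slide 29 table are unit conversions of one number, which a few lines of arithmetic can check. The sketch uses only figures from the slide; the derived cost per compute hour is simply the quoted $4,232 divided by the quoted hours, and the slide does not define what one "compute hour" counts.

```python
COMPUTE_HOURS = 341_700  # from the slide
TOTAL_COST_DOLLARS = 4_232  # from the slide

compute_days = COMPUTE_HOURS / 24
compute_years = compute_days / 365
cost_per_compute_hour = TOTAL_COST_DOLLARS / COMPUTE_HOURS

print(f"compute days:  {compute_days:,.1f}")
print(f"compute years: {compute_years:.1f}")
print(f"cost per compute hour: ${cost_per_compute_hour:.4f}")
```

The day and year figures land on the slide's 14,238 days and 39 years after rounding, and the implied price is a little over one cent per compute hour.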
  30. 30. Key Learnings/What’s Next?
      • Diversity of Life Sciences brings unique challenges
      • Spend the time analyzing and tuning
      • Flexibility, Scalability and Performance
      • Time to rethink and retool
      • Challenge the Science and the Scientist
      • Collaboration
      • Future plans
          • Chemical Universe: 166 Billion cpds ≤ 17 atoms (Extreme scale CPU)
          • Next Generation Sequencing in the Cloud (Extreme CPU, Mem, I/O)
          • “Disruptive” Technologies: Imaging (x10 NGS requirements!)
  31. 31. Delivering the best experience
      • AGILITY (SCALE): TAKE ADVANTAGE OF AWS ELASTICITY
      • EFFICIENCY (OPTIMIZE): CHOOSE THE RIGHT INSTANCES
      • RELIABILITY (SECURE): MANAGE RISKS FOR INCREASED DURABILITY
  32. 32. Optimizing for Security Performance (w/ AES-NI): higher performance saves cost
      • Instance requirements for 400 Mbps OpenSSL performance, m1.xlarge vs. m3.xlarge (Intel internal benchmark)
      • Instance counts shown: 2 [1] and 4 [2]
      • [1] Not required but added for redundancy
      • [2] Requirement is 3.2, but you can’t buy .2, so round up to 4
      • $5.6K/year vs. $10K/year: 50-75% savings by upgrading from m1.xlarge to the more expensive m3.xlarge because of AES-NI
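The sizing rule in slide 32's footnotes (round a fractional requirement up to whole instances, optionally adding a spare for redundancy) can be sketched as below. The per-instance throughputs are assumptions: 125 Mbps is back-computed from the slide's "3.2 instances for 400 Mbps", and 400 Mbps stands in for any instance fast enough to carry the whole load alone. `instances_needed` is a hypothetical helper.

```python
import math


def instances_needed(target_mbps: float, per_instance_mbps: float, spares: int = 0) -> int:
    """Whole instances required to sustain a target throughput, plus optional spares."""
    return math.ceil(target_mbps / per_instance_mbps) + spares


TARGET_MBPS = 400

# Assumed slower instance: 400 / 125 = 3.2, which rounds up to 4.
slower = instances_needed(TARGET_MBPS, per_instance_mbps=125)

# Assumed faster instance: one is enough, a second is added only for redundancy.
faster = instances_needed(TARGET_MBPS, per_instance_mbps=400, spares=1)

print(f"slower instance type: {slower} instances")
print(f"faster instance type: {faster} instances")
```

Because capacity is bought in whole instances, a faster instance with a higher hourly price can still lower the total bill when it cuts the instance count.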
  33. 33. NASDAQ OMX
      • OUR TECHNOLOGY IS USED TO POWER MORE THAN 70 MARKETPLACES IN 50 COUNTRIES
      • OUR GLOBAL PLATFORM CAN HANDLE MORE THAN 1 MILLION MESSAGES/SECOND AT A MEDIAN SPEED OF SUB-55 MICROSECONDS
      • WE LIST ~3300 GLOBAL COMPANIES WORTH $6 TRILLION IN MARKET CAP, REPRESENTING DIVERSE INDUSTRIES AND MANY OF THE WORLD’S MOST WELL-KNOWN AND INNOVATIVE BRANDS
      • FinQloud R3 (Regulatory Record Retention): Cloud Computing Platform Exclusively for Financial Services, POWERED BY AMAZON WEB SERVICES
          • Durable/Available
          • Elastic
          • Security
          • Transparent
          • Cost Effective
  34. 34. NASDAQ OMX Security Protocols
      • Data Classification: Define and enforce what is and is not approved for the Cloud
      • Encryption*: Data Classification at all times (in flight and at rest); SSL for all data
      • Security Audit: Any action someone does in R3 is audited and fully transparent to system admins and regulators
      • AWS Built In Security: IAM, MFA, VPC, Direct Connect private circuits, routing/firewalls, etc.
      • * Highly confidential data must be encrypted and the keys must be stored in HSMs
  35. 35. Technology: What’s Next
      • Intel® Xeon® Processor E5-2600 v2 Family
      • Rack Scale Architecture
      • Software Defined Infrastructure
  36. 36. Diversity of Datacenter Workloads
      • [Chart: workloads and EC2 instance types placed on two axes, CPU & Memory Intensive vs. I/O Intensive, and Higher latency, lower throughput vs. Lower latency, higher throughput]
      • “New” EC2 C3 Compute Optimized w/ latest Intel® Xeon® E5-2600v2 processors
      • “New” EC2 Storage Optimized w/ latest Intel® Xeon® E5-2600v2 processors
      • Other instance types shown: CR1 Memory Optimized, M3 Standard Instance, G2 GPU Instance, Cluster Graphics Instance, Cluster Compute Instance, M2 Memory Optimized, M1 Standard Instance, C1 Compute Instance, High Memory, Micro Instance
      • Workloads shown: Enterprise Applications, Graphics Rendering, High Performance Computing, Cloud RAN, Edge Routing, Big Data, Content Delivery and Gaming, E-Commerce, Dedicated Hosting, Small Cell, Cold Storage, Storage De-dupe, Storage (Low End), Optimized Networking
      • Processors labeled: E5-2680v2, E5-2670v2, E5-2670, E5-2650, X5570
  37. 37. Technology Matters Amazon Web Services discloses instances based on Intel Xeon
  38. 38. New 2013 AWS C3 Compute-Optimized Instance: SC’13 submission of 484 TFLOPs* on 26K cores based on Intel® Xeon® processor E5-2680v2. Powered by the “NEW” Intel® Xeon® E5-2600v2 processor family. (*SC’13 Submission)
  39. 39. The Future of the platform: Intel Rack Scale Architecture
      • Building blocks
          • Orchestration
          • Open Network Platform
          • Storage: PCIe SSD & Caching
          • Photonics & switch fabric
          • Silicon: Intel® Atom™ & Xeon CPU / Mem Modules
      • Innovation
          • Network platform: Flexible & Cost effective
          • Increase utilization thru storage aggregation
          • Extreme Compute and Network bandwidth
          • Platform Flexibility: Increase useful life, and capacity
      • Increases Agility, Efficiency & Reliability
  40. 40. The Future of Infrastructure: Intel’s Approach to Software Defined Infrastructure. A world where the application defines the system.
      • BROADEST ENABLED ECOSYSTEM: Integrated and optimized for all leading commercial and open source operating environments for more seamless Data Center operations
      • EXPOSED & INTEGRATED TELEMETRY: Hardware and infrastructure attributes are exposed and integrated with orchestration software for deeper insight & optimal provisioning management
      • PLATFORM AND ARCHITECTURAL LEADERSHIP: Standards-based compute, network and storage building blocks in Intel’s Rack Scale Architecture drive maximum infrastructure efficiency and flexibility
      • Diagram layers: AMAZON WEB SERVICES (AWS), Service Assurance Manager, Rack Scale Architecture
      • Storage: Scalable Intel® Atom® & Xeon® storage solutions; SSDs with Cache Acceleration; Lustre; NVM/Crystal Ridge
      • Network: Open Network Platforms; Wind River OS; DPDK; Cave Creek; Silicon Photonics
      • Compute: Intel Xeon; Atom® C2000; Intel Xeon Phi; Integrated graphics; TXT
  41. 41. Delivering the best experience
      • AGILITY (SCALE): TAKE ADVANTAGE OF AWS ELASTICITY
      • EFFICIENCY (OPTIMIZE): CHOOSE THE RIGHT INSTANCES
      • RELIABILITY (SECURE): MANAGE RISKS FOR INCREASED DURABILITY
