2017 04-05 aws summit - sydney

Dynamic Infrastructure and The Cloud - Adventures in Keeping Your Application Running…at Scale

Published in: Technology

2017 04-05 aws summit - sydney

  1. 1. Dynamic Infrastructure and The Cloud: Adventures in Keeping Your Application Running…at Scale. Lee Atchison ∙ Senior Director, Strategic Architecture, New Relic, Inc. ∙ Sydney, Australia ∙ leeatchison / @leeatchison. © 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  2. 2. Who am I? Lee Atchison, Senior Director, Strategic Architecture. 30 years in industry: 5 at New Relic (Architect Lead, Cloud, Service Migration); 7 at Amazon Retail & AWS (built first AppStore, AWS Elastic Beanstalk). Specialize in: cloud computing; services & microservices; scalability, availability. leeatchison / @leeatchison
  3. 3. Does this sound familiar…
  4. 4. You had power most of the time. Why are you complaining?
  5. 5. I Hope, I Hope, I Hope the Site Stays Up
  6. 6. Keeping Your App Running…At Scale. Availability… …is more than you think it is.
  7. 7. Does this sound like something you’ve heard recently… …an overheard Ops conversation...
  8. 8. The conversation… “We were wondering how changing a setting on our MySQL database might impact our performance…
  9. 9. The conversation… “We were wondering how changing a setting on our MySQL database might impact our performance… … but we were worried that the change may cause our production database to fail…”
  10. 10. The “scary” overheard conversation… “… but we were worried that the change may cause our production database to fail… … Since we didn’t want to bring down production, we decided to make the change to our backup (replica) database instead…”
  11. 11. The “scary” overheard conversation… “… Since we didn’t want to bring down production, we decided to make the change to our backup (replica, hot standby) database instead… … After all, it wasn’t being used for anything at the moment.”
  12. 12. The “scary” overheard conversation… Until, of course, the backup was needed…
  13. 13. The “scary” overheard conversation… Until, of course, the backup was needed… This was a true story.
  14. 14. Availability can be more subtle, for example…
  15. 15. 300 ms, 1.5 s. Confidential ©2008-15 New Relic, Inc. All rights reserved.
  16. 16. 0.9 s
  18. 18. The Data from Monitoring Your App Dwarfs the Data Inside the App
  19. 19. User Experience; Business Outcome; Servers; Apps; Big Data Problem
  20. 20. High Expectations; Blame Game; Intensity Rises; “The problem must be someone else’s fault”; Panic
  21. 21. What happened?
  22. 22. Need Data at Every Level. Browser / Mobile. Typical Server / Amazon EC2 Instance: • Application & Application Microservices • Server OS • Hardware (virtual)
  23. 23. Low Level Monitoring: Amazon CloudWatch (AWS Console). Stack: Browser / Mobile; Amazon EC2 Instance (Application & Application Microservices; Server OS; Server (Virtual) Hardware). Amazon CloudWatch monitors: • EC2 instance • Virtualization • Hardware • [CPU / Disk / Networking]. Doesn’t know about: • Server OS • Memory / Filesystem • Processes • Configuration • Application - Latency - Error rates
  24. 24. Infrastructure / Application Monitoring: New Relic Application Monitoring and New Relic Infrastructure Monitoring (dashboards), alongside Amazon CloudWatch (AWS Console). Stack: Browser / Mobile; Amazon EC2 Instance (Application & Application Microservices; Server OS; Server (Virtual) Hardware). Monitors (Server): • How O.S. is performing • Configuration changes • Processes • Hardware. Monitors (Application): • App health • App performance • Microservices. Doesn’t know about: • Virtualization
  25. 25. Full Stack Monitoring: New Relic Application Monitoring and New Relic Infrastructure Monitoring (dashboards), with Integrations to Amazon CloudWatch (AWS Console). Stack: Browser / Mobile; Amazon EC2 Instance (Application & Application Microservices; Server OS; Server (Virtual) Hardware). CloudWatch monitors (AWS / CloudWatch): • Visibility into virtualization • CPU / Disk / Networking • 14 AWS Services. New Relic monitors (APM): • CPU / Disk / Networking • Memory / Filesystem • Processes - Infrastructure components - Configuration inventory • Application / Microservices: - Latency - Error rates - App insights
  26. 26. Why Measurement Matters
  27. 27. Success in Software Analytics: Application Performance; Customer Experience; Business Outcome
  28. 28. Keeping Your App Running…At Scale. Availability… …is more than you think it is. Dynamic Cloud… ...makes availability happen.
  29. 29. The Cloud Can Help: Better Data Center; Dynamic Environment. How do we use the cloud to accomplish this?
  30. 30. Better Data Center Better Data Center Dynamic Environment
  31. 31. Cloud as a “Better Data Center”: • Resources are allocated to uses, just like in a data center • Provisioning process is faster • Lifetime of components is relatively long • Capacity planning is still important and still applies
  32. 32. Why use a “Better Data Center”? • Add new capacity (faster) • Improve application availability (redundancy) • Compliance
  33. 33. Dynamic Cloud Better Data Center Dynamic Environment
  34. 34. Cloud as a “Dynamic Tool for Dynamic Apps”: • Use only the resources you need • Allocate / de-allocate resources on the fly • Resource allocation is an integral part of your application architecture
  35. 35. Dynamic Cloud. Resources are: • Allocated • Consumed • De-allocated. Application in charge: the application is aware of and is controlling traditional Ops resources
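
A minimal sketch of what "application in charge" of allocation might look like: the application derives how many instances it needs from its own load and decides whether to allocate or de-allocate. The function names, the queue-depth signal, and the limits are all assumptions made for illustration; they do not come from the talk or from any AWS API.

```python
import math


def desired_instances(queue_depth: int, per_instance_capacity: int,
                      min_instances: int = 1, max_instances: int = 20) -> int:
    """Return how many instances the application wants for the current load.

    queue_depth and per_instance_capacity are hypothetical load signals.
    """
    if per_instance_capacity <= 0:
        raise ValueError("per_instance_capacity must be positive")
    needed = math.ceil(queue_depth / per_instance_capacity)
    # Clamp to the configured floor and ceiling.
    return max(min_instances, min(max_instances, needed))


def scaling_action(current: int, desired: int) -> str:
    """Describe the allocation step that moves current toward desired."""
    if desired > current:
        return f"allocate {desired - current}"
    if desired < current:
        return f"de-allocate {current - desired}"
    return "hold"


if __name__ == "__main__":
    want = desired_instances(queue_depth=950, per_instance_capacity=100)
    print(want, scaling_action(current=4, desired=want))
```

In a real deployment the decision would usually be delegated to a managed scaling policy; the point here is only that the allocate / consume / de-allocate cycle is driven by application-level signals.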
  36. 36. Dynamic Usage Example… Docker Container Age (Count vs. Hours) 1 Hour 200 days 833 days
  37. 37. Dynamic Usage Example… Docker Container Age (by Minute and Hour) 1,200,000 11% under one minute Container age (minutes)
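
The "share of containers under one minute old" figure is a simple ratio over observed container ages. The sketch below shows the calculation on a small, made-up sample; the ages and the function name are hypothetical and are not the data behind the slide.

```python
def fraction_younger_than(ages_minutes, threshold_minutes=1.0):
    """Return the fraction of containers whose age is below the threshold."""
    ages = list(ages_minutes)
    if not ages:
        return 0.0
    short_lived = sum(1 for age in ages if age < threshold_minutes)
    return short_lived / len(ages)


# Hypothetical container ages in minutes, for illustration only.
sample_ages = [0.2, 0.5, 3, 45, 60, 600, 1440, 10080, 0.9, 120]

if __name__ == "__main__":
    share = fraction_younger_than(sample_ages)
    print(f"{share:.0%} of sampled containers lived under one minute")
```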
  38. 38. Dynamic Cloud Technologies. Dynamic Cloud is about scaling and availability: • EC2 Auto Scaling • Mobile / IoT • Dynamic routing • Load balancing • Queues and notifications • Docker
  39. 39. Dynamic Cloud Enables Better Applications Faster. Traditional Data Center: Good. Cloud Data Center: Better. Dynamic Cloud: Best. The way you’ve done things in the past won’t work in the future.
  40. 40. Dynamic Cloud: things happen faster because of… EC2 (server running application/processes); Docker (process running a command); Lambda (function performing a task or operation)
  41. 41. Microcomputing & AWS Lambda • Highly dynamic • Incredibly scalable • No infrastructure to provision • Massively shared infrastructure Also known as: • Functions as a Service (FaaS) • Compute as a Service (CaaS) • Serverless
  42. 42. AWS Lambda. Some resources: S3 Bucket, DynamoDB, API Gateway, SQS. • Takes an event from an AWS resource (a trigger)
  43. 43. AWS Lambda. Some resources: S3 Bucket, DynamoDB, API Gateway, SQS. Lambda script, Lambda instances. • Takes an event from an AWS resource (a trigger) • Creates an instance to execute
  44. 44. AWS Lambda. Some resources: S3 Bucket, DynamoDB, API Gateway, SQS. Lambda script, Lambda instances. • Takes an event from an AWS resource (a trigger) • Creates an instance to execute • Can impact original or different AWS resource
  45. 45. AWS Lambda. Some resources: S3 Bucket, DynamoDB, API Gateway, SQS. Lambda script, Lambda instances. • Takes an event from an AWS resource (a trigger) • Creates an instance to execute • Can impact original or different AWS resource • Any number of instances can run at a time
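
As a rough illustration of the trigger-to-function flow, here is a sketch of a Python Lambda handler for an S3 object notification. The nested `Records` / `s3` / `bucket` / `object` fields follow the S3 event notification format; the handler name, the return value, and the bucket and key in the sample event are assumptions for illustration. The function only reads the event, so it can be run locally without AWS credentials.

```python
import urllib.parse


def handler(event, context):
    """Entry point Lambda invokes once per triggering event.

    Collects the objects named in an S3 notification. A real function
    would go on to read or modify an AWS resource here.
    """
    touched = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        touched.append(f"s3://{bucket}/{key}")
    return {"processed": len(touched), "objects": touched}


if __name__ == "__main__":
    # Hypothetical event, trimmed to the fields the handler uses.
    fake_event = {
        "Records": [
            {"s3": {"bucket": {"name": "example-bucket"},
                    "object": {"key": "reports/april+summary.csv"}}}
        ]
    }
    print(handler(fake_event, None))
```

Because each invocation is independent and keeps no local state, the platform is free to run as many copies at once as there are events.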
  46. 46. Dynamic Cloud: Easier Scaling; Faster Change; Faster Response; Higher Availability
  47. 47. Dynamic Cloud has unique monitoring requirements… How do I track what the dynamic cloud is doing for me (or to me)?
  48. 48. What is a Dynamic Cloud Application? Responsible for the parts you care about: • Application & Application Microservices. Let the cloud manage the rest: • Infrastructure • Allocation/Provisioning • Scaling. Stack: Browser / Mobile; Application & Application Microservices; Provisioning; Server OS; Server (Virtual) Hardware
  49. 49. Monitoring Dynamic Cloud Applications: CloudWatch (AWS Console). Stack: Browser / Mobile; Application & Application Microservices; Provisioning; Server OS; Server (Virtual) Hardware
  50. 50. Application Performance: New Relic Application Monitoring, New Relic Infrastructure Monitoring (dashboards). AWS Infrastructure: CloudWatch (AWS Console). Integrations. Stack: Browser / Mobile; Application & Application Microservices; Provisioning; Server OS; Server (Virtual) Hardware
  51. 51. Application Performance, New Relic monitors: New Relic Application Monitoring, New Relic Infrastructure Monitoring (dashboards). AWS Infrastructure, CloudWatch & AWS monitors: CloudWatch (AWS Console). Integrations. Stack: Browser / Mobile; Application & Application Microservices; Provisioning; Server OS; Server (Virtual) Hardware
  52. 52. How do you monitor this? Stack: Browser / Mobile; Application & Application Microservices; Provisioning; Server OS; Server (Virtual) Hardware
  53. 53. Where did it go? It was just here!! The thing you monitored 10 minutes ago… ...doesn’t exist anymore!?
  54. 54. Monitoring the Dynamic Cloud: • Monitor the cloud components themselves • Monitor the lifecycle of the cloud components • Very different from monitoring traditional data center components
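
One way to picture lifecycle monitoring: instead of tracking a named host, record start and stop events and aggregate them per service, so the numbers survive even after the individual component is gone. The class, method names, and service labels below are hypothetical; this is a sketch of the idea, not New Relic's or CloudWatch's implementation.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class LifecycleTracker:
    """Aggregate component start/stop events by service."""
    # component_id -> (service, start_time_seconds)
    started: dict = field(default_factory=dict)
    # service -> list of observed lifetimes in seconds
    lifetimes: dict = field(default_factory=lambda: defaultdict(list))

    def on_start(self, component_id, service, timestamp):
        self.started[component_id] = (service, timestamp)

    def on_stop(self, component_id, timestamp):
        entry = self.started.pop(component_id, None)
        if entry is None:
            return  # Stop event for a component we never saw start.
        service, start = entry
        self.lifetimes[service].append(timestamp - start)

    def running(self, service):
        return sum(1 for svc, _ in self.started.values() if svc == service)

    def mean_lifetime(self, service):
        samples = self.lifetimes.get(service, [])
        return sum(samples) / len(samples) if samples else None


if __name__ == "__main__":
    tracker = LifecycleTracker()
    tracker.on_start("c1", "checkout", 0)
    tracker.on_start("c2", "checkout", 10)
    tracker.on_stop("c1", 30)
    print(tracker.running("checkout"), tracker.mean_lifetime("checkout"))
```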
  55. 55. Changing World. Previous - STATIC World: Ops
  56. 56. Changing World. Previous - STATIC World: Ops. Now - DYNAMIC World: Dev, Ops
  57. 57. Changing World. Now - DYNAMIC World: Dev, Ops. • We know: • Change is inevitable • We must: • Embrace and drive change • Enabling: • Quicker growth • More reliable growth
  58. 58. Keeping Your App Running…At Scale. Dynamic Cloud… ...makes availability happen. Migration… ...how do I get my app to the cloud?
  59. 59. The Politics of Migration. High Expectations; Blame Game; Intensity Rises; “The problem must be the cloud’s fault”; Pressure to declare victory in the migration. Show me the new apps!!? Promised performance gains? Cost controls? Optimize costs? Why is it taking so long? Migration failure…
  60. 60. Ops Use the Cloud • Move in a controlled way • Learn as you go • Measure everything Does not have to be painful…
  61. 61. Progressions in Cloud Adoption…The Controlled Way. Standard steps most companies follow: Experiment; Secure the Cloud; Enable Servers, Enable SaaS; Enable Value-Added Services; Enable Unique Services; Mandate Cloud Usage
  62. 62. Progressions in Cloud Adoption: Experiment
  63. 63. Progressions in Cloud Adoption. Enterprise IT Cloud Adoption Strategy: Experiment. What is this cloud thing? • Non-invasive, safe technologies - S3 - Perhaps: CloudFront, SQS, SES • Stay away from EC2/Servers • Security: Easy as one-offs • No “Policies” implemented yet • “Just seeing what this is all about”
  64. 64. Progressions in Cloud Adoption: Experiment; Secure the Cloud
  65. 65. Progressions in Cloud Adoption. Enterprise IT Cloud Adoption Strategy: Secure the Cloud. Can we trust the cloud? • IAM (Credentials) • VPC (Secure network) • AWS Direct Connect (just another data center) • Cloud policies begin to be formed • All parts of the company are now involved • Critical evolution point
  66. 66. Progressions in Cloud Adoption: Experiment; Secure the Cloud; Enable Servers, Enable SaaS
  67. 67. Progressions in Cloud Adoption. Enterprise IT Cloud Adoption Strategy: Enable Servers, Enable SaaS. The cloud seems to work pretty well… • EC2 - Basic “data center migration” - Just another server type available… • Multiple AZs/Regions - Part of multi-datacenter resiliency strategy • Independently: SaaS usage increases - Non-critical or internal uses first
  68. 68. Progressions in Cloud Adoption: Experiment; Secure the Cloud; Enable Servers, Enable SaaS; Enable Value-Added Services
  69. 69. Progressions in Cloud Adoption. Enterprise IT Cloud Adoption Strategy: Enable Value-Added Services. Dynamic Cloud becomes a thing… • Managed Databases - RDS, Aurora • Other Managed Services - Elastic Beanstalk, SES, SQS, ElasticSearch
  70. 70. Progressions in Cloud Adoption: Experiment; Secure the Cloud; Enable Servers, Enable SaaS; Enable Value-Added Services; Enable Unique Services
  71. 71. Progressions in Cloud Adoption. Enterprise IT Cloud Adoption Strategy: Enable Unique Services. Dynamic Cloud is deeply ingrained… • High value, Cloud-specific services - Lambda, Kinesis - DynamoDB - SWF, Elastic Transcoder - Redshift • Point of commitment... ...dependent on cloud
  72. 72. Progressions in Cloud Adoption: Experiment; Secure the Cloud; Enable Servers, Enable SaaS; Enable Value-Added Services; Enable Unique Services; Mandate Cloud Usage
  73. 73. Progressions in Cloud Adoption. Enterprise IT Cloud Adoption Strategy: Mandate Cloud Usage. Why do we need our own data centers? • Cloud as a data center replacement • Company is now “all in” with cloud • Netflix…
  74. 74. Progressions in Cloud Adoption. The steps aren’t easy… What is the cloud? Can we trust the cloud? The cloud works pretty well… Dynamic Cloud becomes a thing… Dynamic Cloud is deeply ingrained… Why do we need our own data centers?
  75. 75. Progressions in Cloud Adoption: Experiment; Secure the Cloud; Enable Servers, Enable SaaS; Enable Value-Added Services; Enable Unique Services; Mandate Cloud Usage. Different Companies: Different Speed, Different Needs
  76. 76. Cloud Adoption Strategies. Enterprise IT Cloud Adoption Strategy: • Experiment • Secure the Cloud • Enable Servers, Enable SaaS • Enable Value-Added Services • Enable Unique Services • Mandate Cloud Usage. Application Cloud Adoption Strategy: • Experiment/Peripheral Usage • Cloud Servers • Managed Components • Unique Components • Application Cloud Committed
  77. 77. Cloud Adoption. Corporate Adoption axis: Mandate; Committed; Allow Value-Added; Allow SaaS; Allow Servers; Secure; Experiment. Application Adoption axis: Experiment; Servers; Managed Components; Unique Components; Committed. Regions: First Steps; Non-Critical/Internal Applications; New Applications; Critical Applications; Application Re-Writes. Markers: Step #1, Step #2, Step #3, Step #4
  78. 78. Cloud Adoption, with services placed: S3; IAM; VPC; Non-Integral SaaS; EC2; Integral SaaS; RDS; SES; Lambda; Kinesis. Corporate Adoption axis: Mandate; Committed; Allow Value-Added; Allow SaaS; Allow Servers; Secure; Experiment. Application Adoption axis: Experiment; Servers; Managed Components; Unique Components; Committed. Regions: First Steps; Non-Critical/Internal Applications; New Applications; Critical Applications; Application Re-Writes. Markers: Step #1, Step #2, Step #3, Step #4
  79. 79. Cloud Adoption Center of Gravity: Adoption Sweet Spot; First Steps. Corporate Adoption axis: Mandate; Committed; Allow Value-Added; Allow SaaS; Allow Servers; Secure; Experiment. Application Adoption axis: Experiment; Servers; Managed Components; Unique Components; Committed
  80. 80. Cloud Adoption Center of Gravity: Adoption Sweet Spot, with services placed: S3; IAM; VPC; Non-Integral SaaS; EC2; Integral SaaS; RDS; SES; Lambda; Kinesis. Corporate Adoption axis: Mandate; Committed; Allow Value-Added; Allow SaaS; Allow Servers; Secure; Experiment. Application Adoption axis: Experiment; Servers; Managed Components; Unique Components; Committed
  81. 81. Migrating to the Cloud… How can I be successful?
  82. 82. Adoption Success Strategies: • Understand where your culture is • Consciously plan your acceptance • Drive your cultural change to your desired level • Monitor your adoption • Understand your needs
  83. 83. Monitor Your Adoption: Before Migration. • Baseline application (servers, databases, caches, applications, microservices) • Determine your steady state
  84. 84. Monitor Your Adoption: During Migration. • Incorporate cloud’s internal monitoring • Continue application monitoring • Understand and solve all deviations from steady state…
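
The baseline-then-compare idea from the two slides above can be sketched in a few lines: summarize a pre-migration metric as a mean and standard deviation, then flag readings taken during the migration that fall outside a tolerance band. The three-sigma tolerance, the function names, and the latency samples are assumptions for illustration, not a method prescribed by the talk.

```python
from statistics import mean, stdev


def steady_state(samples):
    """Summarize a pre-migration metric as (mean, standard deviation)."""
    return mean(samples), stdev(samples)


def deviations(baseline, observed, tolerance=3.0):
    """Return observed values more than `tolerance` deviations from the mean."""
    center, spread = baseline
    return [value for value in observed
            if abs(value - center) > tolerance * spread]


# Hypothetical response times in milliseconds.
pre_migration_ms = [210, 198, 205, 202, 195, 208, 201, 199]
during_migration_ms = [204, 197, 340, 206]

if __name__ == "__main__":
    baseline = steady_state(pre_migration_ms)
    print(deviations(baseline, during_migration_ms))
```

The same comparison can be repeated per server, database, cache, and microservice, which is why the baseline needs to be captured at every level before the move.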
  85. 85. The Biggest Role Monitoring Plays In Migration: Pre-migration Feasibility & Benchmarking; Performance Post Migration & During Optimization
  86. 86. Monitor Your Adoption: Continue Monitoring… • Infrastructure is now out of your control - Some cloud-specific concerns (EC2 instance failures, instance degradation) • Dynamic technologies impact our applications - Understand application impact • Ongoing application & infrastructure monitoring is essential
  87. 87. Media/Entertainment. Fairfax Media Limited is a leading multi-platform media company in Australasia, reaching 10.6 million Australians and 2.9 million New Zealanders. “Because we monitored our on-premises systems with New Relic before we migrated them to Amazon Web Services, we were able to identify potential issues and fix them during the migration process.” - Cheesun Choong, Head of Product Platforms. Results: • Reduced diagnosis time from hours to minutes • Migrated to AWS with confidence • Identified underutilized servers to save money
  88. 88. Keeping Your App Running…At Scale. Dynamic Cloud… ...makes availability happen. Migration… ...how do I get my app to the cloud? Availability… …is more than you think it is. Monitor your application and infrastructure
  89. 89. Monitoring just the server: CloudWatch (AWS Console). EC2 Instance: Application & Application Microservices; Server OS; Server (Virtual) Hardware. Worked when rate of change was low…
  90. 90. Dev Ops Dynamic World
  91. 91. Full Stack Monitoring: New Relic Application Monitoring, New Relic Infrastructure Monitoring (dashboards). You need: • Top to bottom monitoring… • Full stack accountability... • Dynamic infrastructure control... Stack: Browser / Mobile; Application & Application Microservices; Provisioning; Server OS; Server (Virtual) Hardware
  92. 92. Digital Fan Experience for Major League Baseball. “New Relic empowers our developers to experiment and work fast without compromising on the quality of the MLB fan experience.” – Sean Curtis, Senior Vice President of Engineering
  93. 93. Panic
  94. 94. Change is speeding up. Traditional Data Center: Good. Cloud Data Center: Better. Dynamic Cloud: Best. Dynamic Cloud enables better applications faster. The way you’ve done things in the past won’t work in the future.
  95. 95. Full Stack Monitoring: New Relic Application Monitoring, New Relic Infrastructure Monitoring (dashboards). Stack: Browser / Mobile; Application & Application Microservices; Provisioning; Server OS; Server (Virtual) Hardware
  96. 96. Thank you. Lee Atchison ∙ Senior Director, Strategic Architecture, New Relic. Architecting for Scale, by Lee Atchison, published by O’Reilly Media. www.architectingforscale.com leeatchison / @leeatchison
  97. 97. Safe Harbor. This document and the information herein (including any information that may be incorporated by reference) is provided for informational purposes only and should not be construed as an offer, commitment, promise or obligation on behalf of New Relic, Inc. (“New Relic”) to sell securities or deliver any product, material, code, functionality, or other feature. Any information provided hereby is proprietary to New Relic and may not be replicated or disclosed without New Relic’s express written permission. Such information may contain forward-looking statements within the meaning of federal securities laws. Any statement that is not a historical fact or refers to expectations, projections, future plans, objectives, estimates, goals, or other characterizations of future events is a forward-looking statement. These forward-looking statements can often be identified as such because the context of the statement will include words such as “believes,” “anticipates,” “expects,” or words of similar import. Actual results may differ materially from those expressed in these forward-looking statements, which speak only as of the date hereof, and are subject to change at any time without notice. Existing and prospective investors, customers and other third parties transacting business with New Relic are cautioned not to place undue reliance on this forward-looking information. The achievement or success of the matters covered by such forward-looking statements are based on New Relic’s current assumptions, expectations, and beliefs and are subject to substantial risks, uncertainties, assumptions, and changes in circumstances that may cause the actual results, performance, or achievements to differ materially from those expressed or implied in any forward-looking statement. Further information on factors that could affect such forward-looking statements is included in the filings we make with the SEC from time to time. Copies of these documents may be obtained by visiting New Relic’s Investor Relations website at http://ir.newrelic.com or the SEC’s website at www.sec.gov. New Relic assumes no obligation and does not intend to update these forward-looking statements, except as required by law. New Relic makes no warranties, expressed or implied, in this document or otherwise, with respect to the information provided.
