© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
How to build a company founded on engineering
principles
Brian Scanlan
STP208
Principal Systems Engineer
Intercom
Agenda
Why principles are useful
Intercom’s engineering principles
Intercom’s engineering principles in practice
Why are principles useful?
A good set of principles allows an organisation to work
off a common mental model
Bad principles?
“We don’t ship bugs”
Good principles!
“We build on AWS”
Example #1
AWS costs
“Our approach to managing AWS costs is REACTIVE
and prioritises taking action against the highest
contributors to our costs as observed in production”
“The complexity of implementing multi-cloud makes this
a decision we don’t even want to contemplate.”
Things we do:
Tag resources
Use Cost Explorer to visualise trends
Things we do:
Work with product team to understand usage
Use a small number of modern instance families
Things we do:
Use autoscaling support for multiple instance types and
purchase options
Example #2
Monolith
“Rebuilding your monolith from scratch using Go
microservices”
“Our monolith was poorly tested and deployed once
every 6 months, and boy did you not want to be in the
office that week (or month)”
“Our monolith kept slowing us down, so we had to
break it apart!”
Majestic monolith
Majestic monolith running on EC2 instances
Supercharging our monolith with serverless
Example #3
Replacing MongoDB with
DynamoDB
@brian_scanlan
Thank you!

Brian Scanlan - Intercom and AWS

Editor's Notes

  1. Here’s what I’m going to cover in my talk: what principles are and why they’re so valuable to an organisation. I’ll go through Intercom’s engineering principles, and finally the principles in practice with respect to our use of AWS. This is the technical part of the talk, but the first two parts are really important!
  2. I’m talking here about principles used by your company, team or organisation. There are good generalised software engineering principles around testability, writing simple code, not repeating yourself etc. They’re good but don’t particularly help your teams build the right thing. Organisation principles are a crucial tool that helps your organisation to learn, grow and establish working patterns for building stuff. Without a common set of principles, how do you know you are building the right thing in the right way? You can make an educated guess. A lot of the time you might be right. You can ask whoever is in charge. That can work fine. You can copy what you’ve done elsewhere, which can also work well in a bunch of cases. You can do something you saw at a cool tech conference where every presentation is full of unicorns, rainbows and every project is wildly successful. Sometimes this can work.
  3. When you get something right in your own organisation, how do you share this information? Principles are a way of encoding successes, helping to repeat the behaviors that led to positive outcomes and avoid the behaviors that led to mistakes. This means that everybody in the organisation, even a new joiner, knows how to read the minds of the most experienced or senior engineers in an engineering organisation. That’s pretty powerful. It also enables them to build stuff in the knowledge that, within reason, as long as it falls roughly within the accepted guidance, they’re going about it in the right way. There can be exceptions of course, but with a set of written-down principles at least you have something to work with and are not just making guesses.
  4. Let’s consider a bunch of bad principles and practices. What might a bad principle look like? Is it a big deal if a bad one exists?
  5. Lol, do you even computer? This is a bad principle because it’s unattainable and not realistic. It is ambitious, sure. It sets a tone of a release being a death march of punishment. No company really wants to do the opposite either. A huge engineering effort might be made to contain bugs and unexpected behaviours, but zero tolerance on bugs sounds like a poor choice. Nobody wants or likes bugs, but they’re unavoidable in the real world.
  6. Ok, so what do good principles look like?
  7. This is simple to understand, opinionated and has opposite cases which can be more than appropriate. Building on AWS may not be appropriate if your business requires sub-millisecond latency to users in one building in Cairo, or you may have invested significantly in local datacentres, or you may require cloud diversity due to business requirements. A principle like this speeds work up because people know what to do without guessing, asking around or doing the opposite thing and finding out later.
  8. Ok, I think I’ve demonstrated that principles are useful. Let’s move on to a few of Intercom’s principles that we use to build product. First, I’m going to explain a little about who Intercom are and what we do.
  9. I work at Intercom as an engineer working with the teams that make up the foundational parts of Intercom such as our use of cloud services, deployments, data storage, security and IT. Intercom has a suite of messaging-first products for businesses to accelerate growth across the customer lifecycle. Businesses use Intercom to do sales, marketing, support and product tours, all through a beautiful and industry-leading messenger. Intercom’s R&D HQ is in Dublin. We’re kind of split-HQed between Dublin and San Francisco.
  10. Here’s our messenger. There are gifs, emojis, and applications embedded in them such as Stripe and Shopify. There’s a pretty big set of backend applications supporting all this functionality, and we’re all in on AWS, which is why I’m here I guess.
  11. As with everything in the real world, our principles are a work in progress. We revisit them every year or two to make sure they still make sense. But we have a good set today that are battle-hardened, tested in the real world and we’re happy to share to the outside world.
  12. So we have a set of principles across our R&D organisation, which is made up of engineering, product and design groups. I’m only going to look at the relevant ones. These are the ones which more clearly apply to the process of building product.
  13. “Ship to learn” is a universal principle. The sooner we ship, the quicker we learn how our product is used. It also means that shipping a feature is just the start of the process. We believe that great products are built by shipping a feature, understanding how it’s really used, and then iterating. We’d rather get something out quicker that’s functional but incomplete. This works well for us. I would not apply this if I were building, I dunno, software for a Boeing 737 Max or something.
  14. “Build in small steps” is a direct instruction to our engineers. Make small changes frequently. Break work down into safer, smaller steps. This doesn’t just refer to changes done via code deploys and pull requests, but also the usual modern “testing in production” techniques such as feature flags. In addition to being iterative and supporting an agile development process, there are secondary benefits such as helping our availability and quality, and again it lets us understand what actually happens when we ship what we’re building. Our environment allows us to frequently and easily ship to production.
  15. We encourage our engineers to be technically conservative. Maybe we could solve the problem using a graph database deployed using Kubernetes backed up onto the blockchain. But first we ask ourselves: “can we solve this problem with tools and techniques that we already know well?” One of our early principles was to “run less software”, meaning we preferred to use a smaller set of software and services, and preferably ones that we didn’t have to operate ourselves. This evolved over time to generally being technically conservative. This doesn’t mean that we don’t build beautiful, functional product, but that our implementation choices are conservative.
  16. We prefer to keep things simple. We will trade off performance, financial cost and perfect abstractions to keep it this way. For example, at our stage of growth, financial optimisation is not *the* goal of our engineering organisation. We’re a successful startup, but not yet an established company. Saving a few thousand dollars won’t make a meaningful difference to whether Intercom is a long-term success, but building and iterating on new features could. So this is definitely one that can change over time, and again may not be appropriate for many other companies.
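A feature flag can be as small as a stable percentage check in front of a new code path. The sketch below is illustrative only and is not Intercom's implementation; the flag name `new_inbox`, the helper `flag_enabled` and the hashing scheme are all assumptions made for the example.

```python
import hashlib


def flag_enabled(flag_name: str, user_id: str, rollout_percent: int) -> bool:
    """Return True if this user falls inside the rollout percentage for a flag.

    Hashing the flag name and user id gives each user a stable bucket from
    0 to 99, so a user who is enabled at 10% stays enabled at 50%.
    """
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < rollout_percent


def render_inbox(user_id: str, rollout_percent: int) -> str:
    # The new code path ships to production dark, then is switched on
    # for a growing slice of users while its behaviour is observed.
    if flag_enabled("new_inbox", user_id, rollout_percent):
        return "new inbox"
    return "old inbox"
```

Raising `rollout_percent` in small increments is one way to make each step small and reversible: setting it back to 0 turns the feature off without a deploy.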
  17. This one isn’t really directly related to the rest of my talk, but I really love it. We’re deliberately positive, optimistic, eager to teach and learn, and welcoming to everybody. This is part of why I love working at Intercom.
  18. That was Intercom’s set. Now we’re going to see what they look like in practice, specifically around our use of AWS.
  19. How do our principles influence the management of AWS Costs?
  20. As I’m sure many of you are aware, understanding your AWS bill can be pretty complex. There’s a whole cottage industry of companies and consultants who are more than happy to take money off you in return for the promise of making your bill smaller and life easier. Knowing what is worthwhile to reduce costs can be difficult and takes time and effort to do well.
  21. So, I am the “costs person” at Intercom. A lot of the time when people have questions about costs, they ask me. I don’t work alone on this but I have been around for a while and working in the area. In order to scale my function in costs and AWS architecture in general, and ensure I don’t lose my mind, I ended up writing down a load of things that were in my head into a document. Like our principles, they are guidance that reflect actual usage of AWS. The document is in no way complete as it doesn’t explain AWS from scratch, just mostly how we use it. I recommend writing something like this as a way of saving loads of time, but also as a way of testing your mental model about how things are actually used in the real world.
  22. Shout out to the open guide to AWS! It’s a community written guide to using AWS in the real world. AWS’s own documentation is reasonably good. This guide is by and for engineers who use AWS. It aims to be a useful, living reference that consolidates links, tips, gotchas, and best practices. It arose from discussion and editing over beers by several engineers who have used AWS extensively. It’s concise and readable. There’s also a really good Slack community! I point folks to this in my doc. I encourage you to read, share and contribute to it! It’s got a load of great cost related info in it too.
  23. Back to my doc, here’s an important quote. My answer to questions from engineers like “should I worry about costs?” or “how much will this feature cost?” is “ship it and we’ll find out when it’s used”. Don’t worry about it, just build. There are some exceptions to this of course, when it’s clearly obvious that there will be a massive infrastructural impact (such as doubling all our Elasticsearch fleets). But the guidance I give to our product engineers is to build and deploy, and learn by running their feature in production rather than do a load of upfront estimation work or optimisation work. The benefit of optimisation work is a lot easier to see after something has been optimised in production, otherwise you’re relying on judgement to be certain. This works with “building in small steps” and “ship to learn”. This works for Intercom. It would not work for an environment under strict financial control or with very tight margins.
  24. We don’t want engineers to worry about the existence of other clouds. Just pretend they don’t exist! The real power of cloud services is tying them all together. They come with a load of overhead such as relatively complex permission models and network designs etc. Smaller SaaS providers such as Honeycomb or Gremlin, Datadog or New Relic don’t require complex configuration to get started and don’t have this barrier, but cloud services do. It’s less confusing and less work to use one cloud provider well and a small selection of single-purpose, world-class SaaS providers to run our business. We are all in on AWS. These decisions tie back to our engineering principles “be technically conservative” and “keep it simple”. There are very few user-facing products whose customers benefit from being hosted on multiple clouds. We don’t want our engineers worrying about this.
  25. So what’s driving our bill? We run large numbers of instances to serve our application, and we also self-host Elasticsearch for many large datasets, so EC2 dominates. Then there’s a load of RDS Aurora MySQL and ElastiCache. To control these it comes down to managing reservations, instance choices and usage of spot; rightsizing and optimising; and managing this centrally rather than getting every team to figure this stuff out themselves.
  26. Here’s our EC2 ratios in terms of dollar spend. Not bad overall. On-demand has grown a little over the last two months, I really need to look at that!!!
  27. The main tools and features we use to manage and control our costs are tagging infrastructure by product feature and team where possible, and Cost Explorer to visualise trends. This is consistent with our principle of keeping things simple. No complicated software involved here.
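To illustrate why tagging matters for a reactive approach, here is a minimal sketch that ranks spend by a tag so the highest contributors surface first. It is a sketch under stated assumptions, not how Cost Explorer works internally: the record layout, the `team` tag key and all figures are made up for the example.

```python
from collections import defaultdict


def top_contributors(line_items, tag_key="team", limit=3):
    """Sum cost per tag value and return the largest contributors first."""
    totals = defaultdict(float)
    for item in line_items:
        # Untagged spend is grouped together so that it stays visible.
        owner = item.get("tags", {}).get(tag_key, "untagged")
        totals[owner] += item["cost_usd"]
    ranked = sorted(totals.items(), key=lambda pair: pair[1], reverse=True)
    return ranked[:limit]


# Made-up figures for illustration only.
sample = [
    {"service": "EC2", "cost_usd": 1200.0, "tags": {"team": "search"}},
    {"service": "RDS", "cost_usd": 800.0, "tags": {"team": "messenger"}},
    {"service": "EC2", "cost_usd": 500.0, "tags": {"team": "messenger"}},
    {"service": "EC2", "cost_usd": 300.0, "tags": {}},
]

print(top_contributors(sample))
```

Keeping an explicit `untagged` bucket is a deliberate choice in this sketch: a growing untagged total is itself a signal that tagging coverage needs attention.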
  28. Understanding the inputs and changes. We’re still small enough so that we can avoid the overhead and abstractions of budgets. Makes reservations a lot easier. Also keeps things simpler. We don’t want perfectly tuned and optimised fleets, we want to spend as little time as possible managing fleets of servers.
  29. Move various workloads to spot. Keep things simple and reactive to real world usage after we have shipped to learn. See how spot performs in the real world. Things have generally stabilised a lot over the last 3 years. In line with our principles. Never want to do complex stuff to participate in bidding wars, AWS now just take care of all of that.
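As a rough illustration of "multiple instance types and purchase options", the helper below builds the kind of mixed instances policy that EC2 Auto Scaling accepts for an Auto Scaling group. The field names follow the EC2 Auto Scaling API as I understand it and should be checked against the current documentation; the launch template name, instance types and capacity numbers are hypothetical and do not come from the talk.

```python
def mixed_instances_policy(launch_template_name, instance_types,
                           on_demand_base, on_demand_percent_above_base):
    """Build a MixedInstancesPolicy dict for an EC2 Auto Scaling group.

    The group may launch any of the listed instance types. A fixed base
    stays on-demand, and the remainder is split between on-demand and spot.
    """
    return {
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": launch_template_name,
                "Version": "$Latest",
            },
            # One override per instance type the group is allowed to use.
            "Overrides": [{"InstanceType": name} for name in instance_types],
        },
        "InstancesDistribution": {
            "OnDemandBaseCapacity": on_demand_base,
            "OnDemandPercentageAboveBaseCapacity": on_demand_percent_above_base,
            # Let AWS choose spot pools by available capacity, no bidding logic.
            "SpotAllocationStrategy": "capacity-optimized",
        },
    }


# Hypothetical values: 2 on-demand instances as a base, then 25% on-demand.
policy = mixed_instances_policy(
    "web-template", ["m5.large", "m5a.large", "m4.large"], 2, 25
)
```

A dict like `policy` would be passed as the `MixedInstancesPolicy` argument when creating or updating the group, for example through an AWS SDK.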
  30. Now I’m going to talk about our monolith architecture and why it works for us. We run a Ruby on Rails monolith. This means that we have a single large application with a lot of code that does many, many things. UIs, APIs, helpers, workers, large numbers of datastores… it does everything. When said out loud at a conference, the word monolith is usually followed by phrases like…
  31. This sounds like a fun talk.
  32. This sounds like a fun talk.
  33. All these imaginary talks are well worth going to and no doubt describe real-life situations. But it’s not what we’ve found at Intercom. Our Ruby on Rails monolith keeps us fast. Yes, we have to invest in it to keep it this way: upgrading Ruby on Rails on a monolith is tough if you haven’t done it in a long time, core modules need refactoring to ensure usable, safe boundaries, and deploying a monolith so that it doesn’t break all the time is hard if you do it infrequently.
  34. So we garden our majestic monolith - continually upgrade the version of Rails, such that we’re now just about to have a permanent test branch running against the development version. We use code owners and well defined boundaries in the code base to stop people overlapping. Give a great experience for out of the box patterns and tooling for logs, metrics, scaling, etc.
  35. Our majestic monolith with the vast majority of our business logic running on EC2 instances, a lot behind ELBs or running async jobs. 243 Auto Scaling Groups, all running different functions or logical separation of different APIs. We have written lots of different services in the past, kind of because we thought it would make us faster. But when it comes to day to day operations, maintenance, having great observability, upgrades, updates etc. teams see that they get more done in the monolith. What we have observed is teams replacing their micro services and folding the function back into the monolith, where it’s typically easier, faster and cheaper. Keep it simple, ship to learn. We revisit our assumptions.
  36. At the same time as reducing our microservices, we have seen increasing use of AWS Lambda functions, not generally to replace parts of the monolith, but to glue different AWS services together and run some simple processing on the data.
  37. Daniel Vassallo, another ex-AWS engineer who is good on twitter, did a good job of describing this pattern recently. “stored procedures for the cloud”. This fits well with a monolith. There’s a danger of important stuff moving out of the monolith and surprising developers, but for simple functions that don’t need deep observability, they’re fast to work with and work well. We haven’t A/B tested this hypothesis, but we’re willing to bet that it will work well over time. We need to invest in the deployment and observability story for running Lambdas alongside our monolith, and make as good an experience as working with the monolith. So I think our setup here is consistent with “be technically conservative”. We prefer to reuse than reinvent.
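A "stored procedure for the cloud" in this sense can be a very small Lambda handler. The sketch below reads a batch of DynamoDB Streams records and collects the ids of users that changed. It is an illustration rather than Intercom's code: the key name `user_id` is an assumption, and a real function would forward the ids to another service instead of returning them.

```python
def handler(event, context):
    """Collect the ids of users that changed in a DynamoDB Streams batch."""
    changed = []
    for record in event.get("Records", []):
        if record.get("eventName") not in ("INSERT", "MODIFY"):
            continue  # REMOVE events are ignored in this sketch
        keys = record["dynamodb"]["Keys"]
        # Stream records use DynamoDB's typed JSON, e.g. {"S": "abc"}.
        changed.append(keys["user_id"]["S"])
    # A real function would now hand these ids to the next service.
    return {"changed_user_ids": changed}
```

The function holds no business logic of its own, which is what keeps the important behaviour in the monolith and limits surprises for developers.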
  38. This was a pretty large project, as our biggest dataset, our users’ users, was stored in MongoDB. MongoDB is not a bad database, in fact it’s excellent and has been doing extremely impressive stuff over the last few years. However, we were using it badly, with large numbers of individual replica sets, complex indexes and unclean ways of interacting with our code base (i.e. direct ORM). So we needed to evolve, and we decided to evolve towards replacing it with DynamoDB, as we only wanted to have to run a distributed database ourselves if we really, really had to. We thought it would be a similar amount of work to evolve our use of MongoDB vs. replacing it with DynamoDB, and we would gain a lot of “run less software” and general simplicity.
  39. We needed to replace a lot of different functionality, including streaming changes to our Elasticsearch clusters, rate limiting and keeping history of user changes. We ended up using DynamoDB streams, Lambda functions and other services to get data to the right place for our Rails monolith to do stuff like rate limit. (The hilarious thing is that it needs to rate limit mostly itself.) One of the big differences between MongoDB and DynamoDB was that we could send diffs to MongoDB, so if one attribute in an update changed we only needed to send that along, and it didn’t seem to be that expensive for MongoDB to handle that. However, for DynamoDB, we had to update the entire doc, and some users were huge. To fix this we ended up breaking down the user documents into multiple related documents, allowing us to have very small documents that are updated frequently and larger ones that aren’t. This helped smooth out rate limits and hot-spots, and improved costs significantly.
  40. Overall we wanted to get to a simpler setup and we got there in the end, and we removed all use of MongoDB from our production environment last week. In a major change like this, we applied “ship to learn” and “build in small steps” ruthlessly, which meant moving slowly with the project, dual-writing for long durations and not being surprised by the differences between DynamoDB and MongoDB. We were also helped significantly by working with our Technical Account Management team and support team in AWS.
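The document split described above can be sketched as follows: frequently updated attributes go into a small item, and everything else goes into a larger item that is rarely written. This is a minimal sketch under assumptions, not the actual Intercom schema. The attribute names in `HOT_ATTRIBUTES` and the `pk`/`sk` key layout are hypothetical.

```python
# Hypothetical names for attributes that change on almost every request.
HOT_ATTRIBUTES = {"last_seen_at", "session_count"}


def split_user(user):
    """Split one user record into a small hot item and a larger cold item.

    Both items share a partition key so they can be read together, while
    frequent updates only rewrite the small item.
    """
    user_id = user["user_id"]
    hot = {"pk": f"user#{user_id}", "sk": "activity"}
    cold = {"pk": f"user#{user_id}", "sk": "profile"}
    for name, value in user.items():
        if name == "user_id":
            continue
        target = hot if name in HOT_ATTRIBUTES else cold
        target[name] = value
    return hot, cold
```

Which attributes count as hot is a judgement call best made from observed write patterns in production, in keeping with "ship to learn".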
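Dual-writing during a migration can be sketched as a thin wrapper that writes to both stores, keeps reading from the old one, and records any differences it sees. The class below is a hypothetical illustration that uses plain dicts in place of MongoDB and DynamoDB clients. It is not the code used in the migration.

```python
class DualWriteStore:
    """Write to both stores, read from the old one, and record mismatches."""

    def __init__(self, old_store, new_store):
        self.old_store = old_store
        self.new_store = new_store
        self.mismatches = []

    def put(self, key, value):
        self.old_store[key] = value  # the old store stays the source of truth
        try:
            self.new_store[key] = value
        except Exception as error:  # a new-store failure must not break users
            self.mismatches.append((key, f"write failed: {error}"))

    def get(self, key):
        old_value = self.old_store.get(key)
        new_value = self.new_store.get(key)
        if old_value != new_value:
            # Differences are logged for investigation, not shown to users.
            self.mismatches.append((key, "read mismatch"))
        return old_value
```

Running in this mode for a long time lets the mismatch list shrink to zero before reads are switched to the new store, which is one way to avoid surprises at cutover.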
  41. Thank you for listening, more than happy to chat to folks during the event, and there’s my twitter handle if you want to continue the discussion online!!