AWS provides:
• A flexible cloud platform
• Different options

Customers appreciate this, but are also asking for:
• Operational best practices
• Ways to apply consistency
• Ideally in checklist format

Creating checklists
Wide range of customers:
• Startups (Open Amplify social media)
• Large enterprises like Shell, or NASA JPL interacting with rovers on Mars from AWS
Wide range of needs:
• Just getting started, maybe a first POC
• Running mission-critical applications
• Complex deployments
• Building sophisticated cloud management strategies

We realized that a single checklist would not meet this diverse range of needs, so we created two operational checklists.

Basic Operations Checklist: for customers just dipping their toe in the cloud.
• Use prior to initial deployment
• Assess an app's use of specific services
• Avoid common first-time implementation mistakes
Covers things like making sure your application is leveraging:
• Basic security
• HA/DR
• Application testing and deployment best practices

Enterprise Operations Checklist: designed to:
• Identify key concepts
• Develop a holistic cloud strategy
• Support sophisticated cloud migrations or deployments
Helps customers strategically approach:
• Billing
• Security
• HA & DR
• Managing changes to their applications and infrastructure

Agenda
• Summarize the Basic Checklist by grouping the checklist questions into related topics
• Provide a quick overview to familiarize you with the breadth and scope of the Enterprise Operations Checklist
• Turn the presentation over to Tom, who will provide some specific examples of the best practices they are using in relation to several of the Enterprise Operations Checklist categories

Quick note: the information that we will discuss today is available on the AWS website under both the whitepaper and architecture centers. You can see the URL to the AWS whitepapers page, where the Operational Checklists for AWS whitepaper is available.

We take the security of our customers extremely seriously and therefore added several basic security questions to help guide our customers to leverage security best practices such as:
• Using Identity & Access Management (IAM) to provide individual access credentials to AWS APIs instead of shared credentials
• Applying security best practices to your EC2 instance operating system:
  • OS user account access credentials
  • Patching, updating, and hardening
• Implementing secure Security Group rules
• Thinking through the security implications of sharing Amazon Machine Images (AMIs)

The Use of Amazon EC2 checklist items cover basic operational best practices for the Amazon Elastic Compute Cloud (EC2) service.
• AWS provides two different classes of EC2 instances based on where the operating system is stored: EBS-backed and instance store-backed.
• And while we are talking about storage, it's a best practice in any environment to separate your OS and application data volumes for data-intensive applications like database servers.
• Additionally, in order to provide a flexible and dynamic environment for our customers, EC2 provides dynamic IP addresses, which can take some getting used to at first. Options for working with them:
  • Elastic IPs
  • Load balancers
  • Dynamic DNS
  • Managing your own static IP assignments in your own Virtual Private Cloud

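As an illustration of the "secure Security Group rules" item, here is a minimal sketch of a rule audit. It does not call any AWS API: the `IngressRule` type, the `PUBLIC_PORTS` allow-list, and the sample rules are all assumptions made for this example, and the policy it encodes (only web ports may be open to the whole internet) is one possible choice rather than AWS guidance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IngressRule:
    """One inbound rule of a security group (hypothetical, simplified model)."""
    protocol: str   # "tcp", "udp", ...
    from_port: int
    to_port: int
    cidr: str       # source range, e.g. "203.0.113.0/24"


# Ports we accept being reachable from anywhere (assumption for this example).
PUBLIC_PORTS = {80, 443}


def risky_rules(rules):
    """Return rules open to 0.0.0.0/0 on any port outside PUBLIC_PORTS."""
    flagged = []
    for rule in rules:
        if rule.cidr != "0.0.0.0/0":
            continue  # restricted source range: not flagged by this check
        ports = range(rule.from_port, rule.to_port + 1)
        if any(port not in PUBLIC_PORTS for port in ports):
            flagged.append(rule)
    return flagged


rules = [
    IngressRule("tcp", 443, 443, "0.0.0.0/0"),     # public HTTPS
    IngressRule("tcp", 22, 22, "0.0.0.0/0"),       # SSH open to the world
    IngressRule("tcp", 22, 22, "203.0.113.0/24"),  # SSH from a known range
]

for rule in risky_rules(rules):
    print(f"review: {rule.protocol} {rule.from_port}-{rule.to_port} from {rule.cidr}")
```

Only the world-open SSH rule is reported; in practice the rule list would come from your own inventory of security groups.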
Another set of checklist items covers high availability, backup, and recovery best practices:
• Regularly back up EC2 instances (e.g. snapshots)
• Fully test your recovery plans
• Deploy critical application components across multiple AZs
• Understand how fail-over will occur across AZs

Another checklist item addresses best practices for mapping customer domain names to AWS ELBs, CloudFront, or S3 buckets:
• DNS "CNAME" records
• Route 53 "Alias" records for ELB

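One reason both record types appear in the checklist: standard DNS does not allow a CNAME at the zone apex (for example `example.com` itself), whereas a Route 53 Alias record can be created there. The sketch below encodes one possible policy built on that fact. The function name `record_type_for` and the "CNAME for subdomains, Alias for the apex" rule are assumptions for illustration; Alias records can also be used for subdomains.

```python
def record_type_for(name: str, zone: str) -> str:
    """Pick "ALIAS" for the zone apex and "CNAME" for any name below it."""
    def normalize(value: str) -> str:
        # Treat "Example.com." and "example.com" as the same name.
        return value.rstrip(".").lower()

    name, zone = normalize(name), normalize(zone)
    if name == zone:
        return "ALIAS"   # a CNAME is not permitted at the apex
    if name.endswith("." + zone):
        return "CNAME"
    raise ValueError(f"{name} is not inside zone {zone}")


for host in ("example.com", "www.example.com"):
    print(host, "->", record_type_for(host, "example.com."))
```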
AWS provides tremendous flexibility for testing:
• Test in parallel
• Low cost: you only pay for what you use
• Like-for-like performance testing: spin up an environment identical to production for an hour or two, bang away at it, then return the capacity with no upfront costs or ongoing commitments

It's quick, easy, powerful, and inexpensive. Please take advantage of this to deploy better tested, more solid applications.

Summarize Basic Checklist
• Intended for new customers or for assessing a specific application prior to deployment
Enterprise Operations Checklist
• Identifies some key high-level concepts
• Sophisticated, multi-application cloud deployments

High-level categories:
• AWS account management, billing and charge back, and cost optimization
• Security at the OS, application, transport, and data-at-rest layers
• Tagging, metadata, and integration with existing asset management systems
• HA & DR pointers and guidance
• Monitoring & Incident Mgmt: CloudWatch, SNS, EC2 instance health APIs

The last two sections deal with various options for managing change and application deployments, at which point I would like to transition over to Tom from Monetate to talk about some of the things they are doing in this area, as well as in some of the other checklist categories.

Thank you for joining us. Hopefully these checklists will help you more consistently implement operational best practices in the AWS cloud. Thank you.

Best Practices: Operational Checklists for the AWS Cloud - AWS NYC Summit 2012
Best Practices: Operational Checklists for the AWS Cloud
Steve Morad – Enterprise Solutions Architect

Operational Checklists
• Customers Appreciate Our Flexibility
• Customers Asked For Operational Best Practices

Basic Operations Checklist
Purpose
• Prior to initial deployment
• Assess an application's use of specific services
• Avoid common first-time implementation mistakes

Enterprise Operations Checklist
Purpose
• Identify key concepts
• Develop a holistic cloud strategy
• Sophisticated cloud migrations or deployments

Basic Operations Checklist
Basic Security Questions
• Nested IAM Users
• Instance Security
• Security Groups
• Sharing AMIs
Operational use of Amazon EC2
• EBS-backed and instance store-backed instances
• Dynamic addressing
• Separate OS & data volumes

Basic Operations Checklist (cont…)
HA, Backup and Recovery
• EC2
• EC2 Instance Snapshots
Mapping Custom Names to AWS
• Route 53

Basic Operations Checklist
Application Deployment and Testing Opportunities

Customer Example
Tom Janofsky
• VP Engineering at Monetate
Monetate
• SaaS provider of marketing agility tools: testing, targeting, and merchandising
• 20% of comScore Black Friday transactions passed through Monetate's platform
• Deployed on AWS for 4 years

Billing & Account Mgmt @ Monetate
Simple Setup
• 1 AWS account for dev, test, accept; 1 account for production
Billing/Charge Back
• Spent much time modeling AWS costs and built a model driven by a single factor (API calls) that is simple to explain and an accurate proxy for actual AWS costs
• No direct billing for AWS usage
Cost Optimization
• Reserved instances for constant load
• Blend of on-demand and spot instances with EMR to reduce costs for intensive data processing

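The slide does not say how the single-factor cost model was built. As one way to picture such a model, the sketch below fits cost ≈ rate × API calls by least squares through the origin and reports the worst relative error against the observed costs. The function names and the monthly figures are made up for illustration and are not Monetate data.

```python
def fit_rate(api_calls, costs):
    """Least-squares slope for cost ~= rate * api_calls (line through the origin)."""
    numerator = sum(x * y for x, y in zip(api_calls, costs))
    denominator = sum(x * x for x in api_calls)
    if denominator == 0:
        raise ValueError("need at least one period with non-zero API calls")
    return numerator / denominator


def worst_relative_error(api_calls, costs, rate):
    """Largest |predicted - actual| / actual across all periods."""
    return max(abs(rate * x - y) / y for x, y in zip(api_calls, costs))


# Made-up monthly figures, purely illustrative.
monthly_api_calls = [100, 200, 300]    # millions of calls
monthly_aws_cost = [1050, 1950, 3000]  # dollars

rate = fit_rate(monthly_api_calls, monthly_aws_cost)
error = worst_relative_error(monthly_api_calls, monthly_aws_cost, rate)
print(f"rate: ${rate:.2f} per million calls, worst error: {error:.1%}")
```

A small worst-case error is what would justify calling the single factor "an accurate proxy"; a large one would suggest the model needs a second driver.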
Security & Access Mgmt @ Monetate
Access Control
• Console access via IAM credentials
• AWS REST API via secret keys
• Network access via ssh public key authentication
• Application access over HTTPS, role-based access control
• Automated tools for granting and revoking privileges and rolling keys
• No PCI or PII data

Application HA/Resilience @ Monetate
• Deployed in 4 availability zones across 2 regions (east and west)
• Routing and failover with DNS-based global traffic management
• Each zone has a consistent configuration
• Custom load balancing with HAProxy
• EIP for public-facing proxies, with automated takeover for failed proxies
• All DBs on EBS volumes, snapshotted

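The slide does not describe how the automated EIP takeover is implemented. The sketch below shows only the planning step of one possible approach: given which proxies are healthy, decide which instance should hold each Elastic IP. The function name `plan_takeover`, the instance IDs, and the addresses are hypothetical; in a real deployment the resulting plan would be applied by re-associating each address through the EC2 API.

```python
def plan_takeover(eip_owner, healthy, standbys):
    """Return a new EIP -> instance mapping with failed owners replaced.

    eip_owner: dict mapping each Elastic IP to the instance currently holding it
    healthy:   set of instance ids that pass health checks
    standbys:  spare proxy instance ids, in order of preference
    """
    in_use = set(eip_owner.values())
    spare = [s for s in standbys if s in healthy and s not in in_use]
    plan = {}
    for eip, owner in eip_owner.items():
        if owner in healthy:
            plan[eip] = owner          # healthy proxy keeps its address
        elif spare:
            plan[eip] = spare.pop(0)   # move the address to a healthy standby
        else:
            plan[eip] = owner          # nothing to fail over to; needs an alert
    return plan


current = {"198.51.100.10": "proxy-a", "198.51.100.11": "proxy-b"}
new_plan = plan_takeover(current, healthy={"proxy-a", "proxy-c"}, standbys=["proxy-c"])
for eip, instance in new_plan.items():
    print(eip, "->", instance)
```

Keeping the planning step free of API calls makes it easy to test the failover decision separately from the code that carries it out.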
Monitoring & Incident Mgmt @ Monetate
• 24x7 internal and external monitoring
• CloudWatch metrics
• Application and OS level monitoring and alerting
• 3rd party notification and escalation tool

Config/Deployment Mgmt @ Monetate
Configuration Management
• Consistent AMI across deployment
• Automated configuration
• Automated patch management
Deployment Management
• Updates applied only to new instances, which are added to the cluster; rollback is to the existing instances
• No downtime for deployment
Testing
• 5x like-for-like production testing

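The deployment approach on this slide (update only new instances, keep the existing ones for rollback) can be pictured with the small in-memory sketch below. The `Cluster` class, the instance names, and the health-check callback are hypothetical and stand in for whatever tooling actually launches instances and shifts traffic.

```python
class Cluster:
    """Tracks which instances serve traffic and which are kept for rollback."""

    def __init__(self, instances):
        self.active = list(instances)  # instances currently receiving traffic
        self.previous = []             # last serving set, kept for rollback

    def deploy(self, new_instances, is_healthy):
        """Switch traffic to new_instances only if every one passes is_healthy."""
        if not all(is_healthy(instance) for instance in new_instances):
            return False               # old instances keep serving, no downtime
        self.previous = self.active
        self.active = list(new_instances)
        return True

    def rollback(self):
        """Return traffic to the instances that served before the last deploy."""
        if self.previous:
            self.active, self.previous = self.previous, []


cluster = Cluster(["app-v1-a", "app-v1-b"])
deployed = cluster.deploy(["app-v2-a", "app-v2-b"], is_healthy=lambda instance: True)
print("deployed:", deployed, "active:", cluster.active)
cluster.rollback()
print("after rollback:", cluster.active)
```

Because the existing instances are never modified, a rollback is just a traffic switch back to them rather than a second deployment.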