AWS Best Practices
1. AWS Primer on
Best Practices and Resource Tagging Convention
Kenichi Shibata
2. Why do we need a naming convention?
• Tagging for Deployment
• Tagging based on application usage
• Tagging based on ownership
• Tagging for cost structure (cost breakdowns can be checked by tag)
• Cost breakdown by tag has to be enabled in AWS Billing (use case: costs for individual clients)
• Tagging for Automation
• All resources are tagged and can then be filtered or queried by tag
• Tagging as context helpers
• Not all resources are tagged appropriately by AWS, which can make it confusing to
understand what a specific resource does
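As a rough illustration of filtering and querying by tag, the sketch below filters an in-memory inventory and also builds the `tag:<key>` filter structure that EC2 describe calls accept. The inventory, the tag keys and the helper names are hypothetical and only serve the example.

```python
def tag_filters(tags):
    """Build EC2-style filters (Name="tag:<key>") from a dict of tag values."""
    return [{"Name": f"tag:{key}", "Values": [value]} for key, value in tags.items()]


def filter_by_tags(resources, wanted):
    """Keep only the resources whose tags contain every wanted key/value pair."""
    return [
        resource
        for resource in resources
        if all(resource["tags"].get(key) == value for key, value in wanted.items())
    ]


# Hypothetical inventory; in practice this would come from the AWS API.
inventory = [
    {"id": "i-001", "tags": {"Environment": "prd", "Owner": "customer1"}},
    {"id": "i-002", "tags": {"Environment": "stg", "Owner": "customer1"}},
    {"id": "i-003", "tags": {"Environment": "prd"}},
]

prd_for_customer = filter_by_tags(inventory, {"Environment": "prd", "Owner": "customer1"})
```

The same dict of wanted tags could be passed through `tag_filters` and handed to a boto3 call such as `describe_instances(Filters=...)` when querying live resources.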
3. Naming Best Practices
• Before we begin, we have to understand some key points in order to better
maintain infrastructure in the cloud
1. Build infrastructure to scale
• This means the infrastructure you handle right now must be easy to hand over quickly
to a new engineer with minimal training, using only the documentation and the context.
• Context should be provided with each resource so that it is possible to identify how the
resource fits in.
2. Automate all/most infrastructure tasks
• Use automation tools like CloudFormation, boto3 or Terraform for provisioning
• If a task is repeated at least three times, we should automate it.
• Orchestration should be done automatically using config/template management tools like Packer
and Ansible.
• Use Infrastructure as Code whenever you run a new cluster
• Tags should be automated
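One way to automate tagging is to generate the tag list in code and refuse to provision anything that lacks the mandatory tags. The sketch below is a minimal example under assumptions: the set of required keys and the maintainer address are hypothetical, and the output uses the `Key`/`Value` list shape that boto3 tagging calls expect.

```python
# Assumption: these three keys are treated as mandatory for every resource.
REQUIRED_TAGS = ("Name", "Environment", "Maintainer")


def build_tags(**values):
    """Return a boto3-style tag list, failing early if a required tag is missing."""
    missing = [key for key in REQUIRED_TAGS if not values.get(key)]
    if missing:
        raise ValueError(f"missing required tags: {', '.join(missing)}")
    return [{"Key": key, "Value": str(value)} for key, value in sorted(values.items())]


tags = build_tags(Name="prd01-web-app", Environment="prd", Maintainer="ops@example.com")
```

The resulting list could be passed as the `Tags` argument of a call such as `create_tags`, or rendered into a Terraform or CloudFormation template.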
4. Deployment Best Practices
3. Centralized jobs
• All cron jobs should be observable and visible to the whole DevOps team. A local cron job on a specific
server is discouraged.
• Centralized jobs help manage the workload across the Ops team and create a sense of teamwork
when rotating tasks. They also give visibility, which ensures quick and efficient action when something
unexpected occurs.
4. Pipelined Tasks
• Tasks like continuous integration and deployment should be pipelined so that it is clear at which stage
a task failed.
• Infrastructure as Code should be the norm when deploying new infrastructure. These deployment
tasks should be pipelined within the centralized, observable pipeline.
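The point of pipelining is that a failure is reported against a named stage. A minimal sketch of that idea follows; the stage names and the `run_pipeline` helper are hypothetical and stand in for whatever CI/CD tool is actually used.

```python
def run_pipeline(stages):
    """Run (name, callable) stages in order and stop at the first failure."""
    completed = []
    for name, step in stages:
        try:
            step()
        except Exception as exc:
            # Report exactly which stage failed and what had already succeeded.
            return {"ok": False, "failed_stage": name, "completed": completed, "error": str(exc)}
        completed.append(name)
    return {"ok": True, "failed_stage": None, "completed": completed, "error": None}


def failing_tests():
    raise RuntimeError("unit tests failed")


result = run_pipeline([
    ("build", lambda: None),
    ("test", failing_tests),
    ("deploy", lambda: None),
])
```

Here the deploy stage never runs, and `result` names the `test` stage as the point of failure.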
5. Containerized/AMI Deployments
• Blue-green deployment should be used for each server in order to test the code in a production-like
setting
• Deployments to each new instance should use a snapshot of a proven instance running an
application that starts on boot
• Running clusters of containers is recommended; however, please keep in mind the overhead of running
containers
5. Security Best Practices
6. Use MFA for all users and set stricter password restrictions
• Every engineer who has access to the AWS Console should have MFA enabled
• The password policy can be configured from the IAM Console. Passwords should contain at
least one uppercase letter and one number.
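The same password policy can also be set from code rather than the console. The sketch below assumes a boto3 IAM client is passed in and uses its `update_account_password_policy` call; the minimum length of 12 is an assumption added for illustration and does not come from the guideline above.

```python
PASSWORD_POLICY = {
    "MinimumPasswordLength": 12,         # assumption: not specified in the guideline
    "RequireUppercaseCharacters": True,  # at least one uppercase letter
    "RequireNumbers": True,              # at least one number
}


def apply_password_policy(iam_client, policy=PASSWORD_POLICY):
    """Apply the account password policy through the given IAM client."""
    iam_client.update_account_password_policy(**policy)
    return policy


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   apply_password_policy(boto3.client("iam"))
```

Passing the client in as a parameter keeps the function easy to test with a stub and avoids hard-coding credentials or regions.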
7. Not everyone should have access to the console
• Developers who only need specific access can use the Command Line Interface
with an AWS access key and secret key instead.
• Use the principle of least privilege and only give access to specific resources. For example,
instead of giving write access to all S3 buckets, only give list access and write access to a
specific S3 bucket.
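The S3 example can be sketched as an IAM policy document generated for one bucket. This is an illustration under assumptions: the bucket name is hypothetical, and both the list and the write permission are scoped to that single bucket.

```python
import json


def bucket_write_policy(bucket):
    """Build a least-privilege policy: list and write access to one bucket only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Writing applies to the objects inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


policy = bucket_write_policy("acmecorp-ecommerce-prd-logs")  # hypothetical bucket
policy_json = json.dumps(policy, indent=2)
```

The JSON string can then be attached to a user, group or role; no other bucket is reachable through this policy.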
8. Turn on CloudTrail
• Get audit logs for every call made to the AWS API with your account credentials
6. Resource Types
• Currently this document covers the following resource types
• EC2
• VPC
• VPC Subnet
• Security Group
• S3
7. EC2
• Name – naming convention – {env/owner(count)-cluster-app}
• e.g. mng01-infra-bastion, prd-web-abcd, acmecorp01-infra-yourapp1
• Environment – stg, dev, prd, mng
• Cluster/Platform – Infra Management, ecommerce, crm, projectcode1 (optional)
• App – app1proxy, app1loadbalancer, app1api
• Tier – Database, Web, API, App, Datastore (for multi-tiered architectures)
• Subnet – {subnet-name}, should be queried from Terraform
• Owner – {customer1} if owned by a third party (optional)
• Maintainer – email
• Architecture – /Diagrams/file.uml, http://wiki (optional)
• Count – 01, 02, 03 (for an ASG add the -asg suffix, e.g. prd01-web-app-asg; otherwise use prd01-web-app)
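The EC2 naming convention can be captured in a small helper so that names are generated rather than typed by hand. This is a minimal sketch; the function name and argument order are hypothetical, and it only reproduces the pattern and examples listed above.

```python
def ec2_name(prefix, app, cluster=None, count=None, asg=False):
    """Build {env/owner(count)-cluster-app}, with an optional -asg suffix."""
    # The count is zero-padded to two digits and glued to the env/owner prefix.
    head = prefix if count is None else f"{prefix}{count:02d}"
    parts = [head, cluster, app, "asg" if asg else None]
    # Optional parts (cluster, asg) are simply left out when not given.
    return "-".join(part for part in parts if part)


bastion = ec2_name("mng", "bastion", cluster="infra", count=1)
web = ec2_name("prd", "abcd", cluster="web")
scaled = ec2_name("prd", "app", cluster="web", count=1, asg=True)
```

These calls yield `mng01-infra-bastion`, `prd-web-abcd` and `prd01-web-app-asg` respectively.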
8. EC2
• Why is the name prefixed with the client instead of the environment when a client is available?
• We assume that clients will always have a production environment provided
to them. If they need to test or run acceptance tests, we can set that up on the
staging or development environments to share costs, encrypting the test data
if need be.
• e.g. mng01-infra-bastion, prd-web-abcd, acmecorp01-infra-yourapp1
9. VPC
• Name – {platform}(count)-{tenancy}-{region}-vpc e.g. warehouse01-default-tokyo-vpc (the region part is only needed for multi-region platforms)
• Tenancy – Default/Dedicated
• Count – 01, 02, 03 (you can also use names instead of numbers)
• Region – Use the common region name, not the official AWS name, for brevity (e.g. tokyo rather than ap-northeast-1)
• Platform – Ecommerce, Warehouse management, CRM, projectcode1
• Most application platforms will live inside a single VPC, so the name of the platform will suffice.
However, if we need to scale to multiple regions, we will need to tag the Region as well to
provide context on where we would like to run this specific platform.
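A sketch of the VPC name as a function follows. It treats the region segment as optional, which is an interpretation of the note that the region only matters once a platform spans several regions; the helper name and defaults are hypothetical.

```python
def vpc_name(platform, tenancy, count=1, region=None):
    """Build {platform}(count)-{tenancy}[-{region}]-vpc in lowercase."""
    parts = [f"{platform}{count:02d}", tenancy]
    if region:
        # Common region name (e.g. "tokyo"), not the official AWS identifier.
        parts.append(region)
    parts.append("vpc")
    return "-".join(parts).lower()


single_region = vpc_name("warehouse", "default")
multi_region = vpc_name("warehouse", "default", count=1, region="tokyo")
```

With these inputs the results are `warehouse01-default-vpc` and `warehouse01-default-tokyo-vpc`.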
10. VPC Subnet
• Name – {environment}-{platform}-{availability}-{accessibility}-subnet
• Accessibility – Private, Public, Secured
• Maintainer – email
• Platform – Ecommerce, Warehouse management, CRM, projectcode1
• Cluster – DB, Web, API, App
• Environment – prd, stg, dev, mng
• Availability (Zone) – Primary (a), Secondary (c), Tertiary (b)
• Some regions only have 'a' and 'c' zones
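The subnet convention lends itself to validation, because accessibility and availability come from small fixed sets. The sketch below assumes the availability segment of the name is written as the zone letter; that choice, along with the helper name, is an assumption and not something the convention above spells out.

```python
# Zone letters as listed above: primary is "a", secondary is "c", tertiary is "b".
ZONES = {"primary": "a", "secondary": "c", "tertiary": "b"}
ACCESSIBILITY = {"private", "public", "secured"}


def subnet_name(environment, platform, availability, accessibility):
    """Build {environment}-{platform}-{availability}-{accessibility}-subnet."""
    if availability not in ZONES:
        raise ValueError(f"unknown availability: {availability}")
    if accessibility not in ACCESSIBILITY:
        raise ValueError(f"unknown accessibility: {accessibility}")
    return f"{environment}-{platform}-{ZONES[availability]}-{accessibility}-subnet"


db_subnet = subnet_name("prd", "ecommerce", "primary", "private")
```

Rejecting unknown values at name-building time keeps typos such as "privat" out of the tag data that later automation relies on.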
11. What is public/private/secured?
• Public subnets are subnets which are directly connected to the internet
through an internet gateway or an egress-only internet gateway
• There is a use case for a public -> nat -> public mapping of subnets if you want to
whitelist all requests from a specific subnet
• Public subnets are best used for internet-facing web servers
• Instances in them need a public IP address or an Elastic IP in order to connect
• HTTPS termination is usually done on a server running in a public subnet
• Private subnets are subnets which are connected to the internet through a
NAT Gateway (network address translation)
• These subnets are usually mapped as private -> nat -> public. However, the
limitation of these subnets is that they cannot serve traffic from public DNS
without port forwarding.
• They are best suited for cache servers, database servers, middleware APIs, and
secured transaction interfaces
12. What is public/private/secured?
• Secured Gateways
• Secured gateways directly connect a specific IP address to another
IP address, without connecting to the rest of the internet. This is usually done
with a Network ACL.
• A good alternative to secured gateways is AWS Direct Connect or an AWS VPN
connection
13. NAT Gateways
• The best practice for running a NAT gateway is to use the AWS NAT Gateway. It
scales with traffic and is easier to set up than a manual EC2 NAT
gateway
• However, please be aware that since AWS NAT Gateway is a fully managed service, it
has limitations as well. Port forwarding, for example, is not supported, nor is
sniffing logs from the gateway
• To get advanced functionality, you need to build the NAT gateway on EC2.
However, using a single EC2 instance as a NAT gateway is an anti-pattern,
because once traffic grows enough the NAT EC2 instance will
become a bottleneck
• It is recommended to have a failover or a load-balanced NAT gateway if
you are going to use EC2
14. S3
• Name – {company-name}-{platform}-{environment}-{application} (S3 bucket names are global and must be
globally unique, so we add the company name or an abbreviation)
• Deployment – manual, automated
• Platform – Ecommerce, Warehouse management, CRM, projectcode1
• Application – Logs, Webapp, Webpage
• Environment – prd, stg, dev, admin
• File Naming – Inside the bucket, please prefix file names with a random or hashed string if there are lots of files.
AWS distributes files across servers based on their names, so varied prefixes spread the load over more
servers and make storing and retrieving faster
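Both S3 rules can be sketched in a few lines. The bucket name check follows the general S3 naming constraints (3 to 63 characters, lowercase letters, digits, dots and hyphens). For file naming, this sketch swaps the random string for a short SHA-256 prefix of the file name, so that the key can be recomputed later; that substitution and the helper names are assumptions.

```python
import hashlib
import re

# 3-63 characters, starting and ending with a lowercase letter or digit.
BUCKET_PATTERN = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def bucket_name(company, platform, environment, application):
    """Build {company-name}-{platform}-{environment}-{application}."""
    name = "-".join([company, platform, environment, application]).lower()
    if not BUCKET_PATTERN.fullmatch(name):
        raise ValueError(f"invalid S3 bucket name: {name}")
    return name


def hashed_key(filename, prefix_len=4):
    """Prefix the object key with the first hex digits of its SHA-256 digest."""
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}/{filename}"


logs_bucket = bucket_name("AcmeCorp", "ecommerce", "prd", "logs")
key = hashed_key("2017-01-01-access.log")  # hypothetical file name
```

Because the prefix is derived from the file name, the same file always maps to the same key, which a purely random string would not guarantee.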
15. Security Group
• Name – {environment}-{tier}-{application}
• Group Name – same as Name (required by AWS)
• Protocol – ICMP, TCP, UDP
• VPC – {vpc-name}
• Tier – Web, App, Cache, DB, NAT
• Environment – Production, Staging, Development
• Application – (if applicable) nginx, mongodb, ssh
• Description – Short description of why this security group is needed