Good practices for designing and implementing IT architecture on AWS
About us
• 11+ years of professional experience in Unix and 7+ years in Cloud Computing administration
• Founder of LCloud ("Linxsys Cloud"), an AWS Partner since 2012 and the first AWS Partner in Poland
• 150+ AWS projects completed
• Enterprise experience:
• Email: jacek.biernat@linxsys.pl
Design for Failure: Highly Available Solution
One of the major reasons for migrating to Cloud Computing:
- Avoid single points of failure
- Assume everything fails
Our goal: the application should continue to function even if the underlying physical hardware fails, is removed, or is replaced
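A minimal sketch of "assume everything fails" on the client side: retry with exponential backoff, then fail over to a redundant endpoint. The function name `call_with_failover`, its parameters, and the endpoint labels are hypothetical and chosen only for illustration; they are not part of the original slides.

```python
import time
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def call_with_failover(
    endpoints: Sequence[str],
    request: Callable[[str], T],
    retries_per_endpoint: int = 2,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Try each redundant endpoint in order, backing off between attempts."""
    last_error: Optional[Exception] = None
    for endpoint in endpoints:
        for attempt in range(retries_per_endpoint):
            try:
                return request(endpoint)
            except ConnectionError as exc:
                # Assumption: only connection-level failures are retried.
                last_error = exc
                sleep(base_delay * (2 ** attempt))
    raise RuntimeError("all endpoints failed") from last_error
```

The `sleep` parameter is injectable so the backoff can be skipped in tests; a real client would usually also add jitter and a total timeout.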
A simple Architecture
Highly Available Environment
Stateless solutions
Without modifying the application code:
- Network-attached storage (NAS) with RAID 1 over the network: DRBD, GlusterFS
- Amazon Elastic File System (EFS)
- Mount an Amazon S3 bucket as a file system: s3fs, s3ql
- Amazon Elastic Load Balancer with sticky sessions (sometimes this is enough)
Stateless solutions
By modifying the application code, move state to:
- Our database tier
- Amazon S3
- Amazon ElastiCache (Redis), Amazon DynamoDB
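The idea of moving state out of the application servers can be sketched as a small session store that any instance can read. This is an illustration under stated assumptions: the class `SessionStore` and its methods are hypothetical, and a plain dictionary stands in for the external backend, which in a real deployment would be something like ElastiCache (Redis) or DynamoDB.

```python
import json
import time
import uuid
from typing import Any, Dict, MutableMapping, Optional


class SessionStore:
    """Keeps session data in a shared backend instead of server memory."""

    def __init__(self, backend: MutableMapping[str, str], ttl_seconds: int = 1800):
        # Assumption: the backend behaves like a string-to-string mapping.
        self._backend = backend
        self._ttl = ttl_seconds

    def create(self, data: Dict[str, Any], now: Optional[float] = None) -> str:
        session_id = uuid.uuid4().hex
        self.save(session_id, data, now)
        return session_id

    def save(self, session_id: str, data: Dict[str, Any],
             now: Optional[float] = None) -> None:
        expires = (time.time() if now is None else now) + self._ttl
        self._backend[session_id] = json.dumps({"expires": expires, "data": data})

    def load(self, session_id: str,
             now: Optional[float] = None) -> Optional[Dict[str, Any]]:
        raw = self._backend.get(session_id)
        if raw is None:
            return None
        record = json.loads(raw)
        current = time.time() if now is None else now
        if record["expires"] <= current:
            # Expired sessions are dropped lazily on read.
            self._backend.pop(session_id, None)
            return None
        return record["data"]
```

Because two `SessionStore` objects built on the same backend see the same sessions, any server behind the load balancer can handle any request, and sticky sessions are no longer required.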
What happens when an AWS Availability Zone (AZ) in EU-West fails?
What happens when the entire EU-West region is affected?
Proposed solutions when a region is unavailable
• RTO and RPO of at most 24 hours
– AWS CloudFormation templates
– Copy backups to a second region
• RTO and RPO of at most 5 minutes
– AWS CloudFormation templates
– Configure data replication between regions
• Keep two active environments (master-master)
– Use a queue solution (Amazon DynamoDB, SQS)
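The three tiers above can be read as a simple decision rule driven by the tighter of the two objectives. The sketch below is one possible encoding, not a method from the slides: the names `DrPlan` and `choose_dr_plan` are hypothetical, and treating the 24-hour and 5-minute figures as hard thresholds is an assumption.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DrPlan:
    name: str
    actions: Tuple[str, ...]


def choose_dr_plan(rto_minutes: float, rpo_minutes: float) -> DrPlan:
    """Pick the cheapest tier that still meets both objectives."""
    tightest = min(rto_minutes, rpo_minutes)
    if tightest >= 24 * 60:
        # Up to a day of downtime or data loss is acceptable.
        return DrPlan(
            "backup-and-restore",
            ("CloudFormation templates", "copy backups to a second region"),
        )
    if tightest >= 5:
        # Minutes matter: data must already be in the second region.
        return DrPlan(
            "cross-region-replication",
            ("CloudFormation templates", "replicate data between regions"),
        )
    # Anything tighter than 5 minutes needs both regions serving traffic.
    return DrPlan(
        "active-active",
        ("two active environments", "queue-based synchronization"),
    )
```

For example, `choose_dr_plan(60, 5)` selects the replication tier, because a 5-minute RPO cannot be met by daily backup copies.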
Implement Elasticity
Don’t assume the health or fixed location of components. Use designs that are resilient to reboot and relaunch.
Standardized Application Stacks
Approaches to designing AMIs
1. Inventory of fully baked AMIs
2. Base AMI with fetch on boot
3. AMI with an agent for a configuration management system
Fully baked AMI
Tools for fully baked AMIs
• AWS Console
• AWS API with scripts
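As an illustration of the "AWS API with scripts" option, the sketch below bakes an AMI from an already configured instance. It assumes a boto3-style EC2 client is passed in (for example `boto3.client("ec2")`); the helper name `bake_ami` and the naming scheme are hypothetical. Only the `create_image` call with its `InstanceId`, `Name`, `Description`, and `NoReboot` parameters comes from the EC2 API.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def bake_ami(
    ec2_client: Any,
    instance_id: str,
    app_name: str,
    built_at: Optional[datetime] = None,
) -> str:
    """Create an AMI from a configured instance and return its image ID."""
    built_at = built_at or datetime.now(timezone.utc)
    # Assumption: a timestamp suffix keeps image names unique per build.
    image_name = f"{app_name}-{built_at:%Y%m%d-%H%M%S}"
    response = ec2_client.create_image(
        InstanceId=instance_id,
        Name=image_name,
        Description=f"Fully baked AMI for {app_name}",
        NoReboot=False,  # allow a reboot so the filesystem is consistent
    )
    return response["ImageId"]
```

Passing the client as an argument keeps the function free of credentials and region settings, and makes it easy to exercise with a stub.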
Base AMI
• Jenkins/TeamCity
• AWS CloudFormation
• Ansible playbooks
• Aminator
AMI with an agent for a management system
Configuration management tools
• Puppet
• Chef
• Ansible
Micro-services and elastic resource pools with AWS
• Each service is decoupled from the rest and deployed individually
• Multiple services run on the same instances
• An automated deployment system takes care of all service lifecycle details
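Running several services on a shared pool of instances comes down to a placement decision. The sketch below uses a simple first-fit strategy over CPU and memory, largest service first. It is an assumption made for illustration, not a description of how any particular AWS scheduler works, and the names `Instance` and `place_services` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Instance:
    name: str
    free_cpu: int  # CPU units still available
    free_mem: int  # memory in MiB still available
    services: List[str] = field(default_factory=list)


def place_services(
    services: Dict[str, Tuple[int, int]],
    instances: List[Instance],
) -> Dict[str, str]:
    """Assign each service (cpu, mem) to the first instance with room."""
    placement: Dict[str, str] = {}
    # Placing the largest services first reduces fragmentation.
    ordered = sorted(services.items(), key=lambda kv: kv[1], reverse=True)
    for service, (cpu, mem) in ordered:
        for instance in instances:
            if instance.free_cpu >= cpu and instance.free_mem >= mem:
                instance.free_cpu -= cpu
                instance.free_mem -= mem
                instance.services.append(service)
                placement[service] = instance.name
                break
        else:
            raise RuntimeError(f"no capacity left for {service}")
    return placement
```

When `place_services` raises, the pool is full, which is the signal an elastic setup would use to add another instance.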
Amazon EC2 Container Service (ECS): a fully managed platform
Thank you for your attention
