AWS architecture
problems while
being fancy
About me
Goran Kopevski
Tech Lead @ Global Savings Group
Agenda
▰ Benefits of using AWS Cloud
▰ Fancy selling point
▰ Common design patterns and problems
What is AWS
Marketing eyes: Amazon Web Services (AWS) is a secure cloud
services platform, offering compute power, database storage,
content delivery and other functionality to help businesses scale
and grow.
Engineer eyes:
▰ Managed services
▰ Easier way for development and deployment
▰ New architecture horizons
But why AWS or any other cloud?
Four fundamental principles for cloud:
▰ Fault tolerant systems
▰ Scalability
▰ Elasticity
▰ Cost effective
What kinds of services do they offer?
The good part
▰ Polished services
▻ EC2
▻ S3
▻ EB
▻ CF
▻ AWS RDS
▻ ….
▰ If a service gains popularity it gets big investment from AWS
Challenges
▰ For the sake of having a “service”, let's roll it out
▰ If service is popular -> invest,
▻ if not -> ignore it :)
▰ Stubbornness and simply ignoring requests
▰ Forcing you to use their vision of cloud services
▻ Workarounds for other scenarios
The fancy smart wording
▰ “I am experienced in using Elastic MapReduce for distributed
cloud processing of large data sets across clusters of
computers using simple programming models”
▰ “I am using DynamoDB, which is a fast and flexible NoSQL
database service for all applications that need consistent,
single-digit millisecond latency at any scale”
The real wording
▰ “I am experienced in using Elastic MapReduce for distributed
cloud processing of large data sets across clusters of
computers using simple programming models”
▰ In normal (real) wording: “I am using Hadoop”
▰ “I am using DynamoDB, which is a fast and flexible NoSQL
database service for all applications that need consistent,
single-digit millisecond latency at any scale”
▰ After some experience: “I am using a simple key-value DB”
AWS API Gateway: The good parts
▰ API Caching
▰ API limiter
▻ Example: max 1000 requests to a specific endpoint
▰ Support for Swagger definitions of endpoints
▰ Good security
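The "API limiter" bullet can be pictured as a token bucket, which is the algorithm API Gateway documents for its throttling. The sketch below is a plain-Python illustration, not the service's implementation; the class name, the `rate` and `burst` parameters and the per-second interpretation of the 1000-request example are assumptions made for illustration.

```python
class TokenBucket:
    """Minimal token bucket: `rate` tokens are added per second, up to `burst`."""

    def __init__(self, rate: float, burst: int) -> None:
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0             # timestamp of the previous call, in seconds

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` may pass."""
        elapsed = now - self.last
        self.last = now
        # Refill according to elapsed time, never above the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # the caller would answer with HTTP 429 here
```

With `TokenBucket(rate=1000, burst=1000)` an endpoint tolerates a short spike of 1000 calls and then settles at 1000 requests per second.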
AWS API Gateway: Challenges
▰ Multipart requests
▻ Encode image in base64 and send it like that
▰ 10 MB payload limit
▻ Use streaming request
▰ Creation of endpoint
▻ Swagger custom parameters
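The base64 workaround for multipart requests, combined with the 10 MB payload limit, could look roughly like this on the client side. This is a minimal sketch: the JSON field names `filename` and `image_base64` are hypothetical, and the size check is done locally before the request is sent.

```python
import base64
import json

# Payload ceiling quoted on the slide (10 MB).
MAX_PAYLOAD_BYTES = 10 * 1024 * 1024


def build_image_payload(image_bytes: bytes, filename: str) -> str:
    """Wrap raw image bytes in a JSON body, base64-encoded."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    body = json.dumps({"filename": filename, "image_base64": encoded})
    # Base64 inflates the data by about one third, so check the encoded size.
    if len(body.encode("utf-8")) > MAX_PAYLOAD_BYTES:
        raise ValueError("payload exceeds the 10 MB limit after encoding")
    return body
```

Because of the base64 overhead, an image of roughly 7.5 MB already hits the limit, which is worth keeping in mind when choosing this workaround.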
Regions problem: The good parts
▰ Regions on every continent
▻ Closer to your clients
▰ Multiple availability zones per region
▰ Main power of the AWS infrastructure
▻ Prerequisite for fault tolerant systems
Regions problem: The challenges
▰ Not all services are available in every region
▻ New services launch first in N. Virginia and Ireland and
only later roll out to other regions
▰ Real world scenarios:
▻ DynamoDB caching
▻ DynamoDB backup
▻ CodePipeline
▻ AWS Fargate
▻ …
▰ https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/
Dynamo: The good parts
▰ Fast write
▰ Fast read
▻ Under some conditions
▰ Autoscaling is managed by AWS
▰ You pay for throughput
▻ number of requests
▻ speed for writing/reading
▻ You can have 100000000…. TB of data
Dynamo: The challenging part!
▰ For every simple query you need to write a lot of code
instead of a one-liner such as:
▰ SELECT * FROM X WHERE Status = 'Published' AND date > :date
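To make the "a lot of code" point concrete, here is a sketch of what the SQL one-liner turns into when expressed as low-level DynamoDB `Query` parameters. It assumes a secondary index with `Status` as partition key and `date` as sort key; the table and index names are passed in and are hypothetical. `run_query` is an injected callable standing in for the real client call, so the pagination loop can be read without an AWS account.

```python
def published_since_query(table: str, index: str, since_iso: str) -> dict:
    """Build Query parameters equivalent to the SQL statement on the slide."""
    return {
        "TableName": table,
        "IndexName": index,
        # "Status" and "date" are reserved words, hence the #placeholders.
        "KeyConditionExpression": "#s = :status AND #d > :since",
        "ExpressionAttributeNames": {"#s": "Status", "#d": "date"},
        "ExpressionAttributeValues": {
            ":status": {"S": "Published"},
            ":since": {"S": since_iso},
        },
    }


def fetch_all(run_query, params: dict) -> list:
    """Follow LastEvaluatedKey until every page has been collected."""
    items = []
    request = dict(params)
    while True:
        page = run_query(request)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return items
        # Results are paged, so the next request resumes after the last key.
        request["ExclusiveStartKey"] = last_key
```

With boto3, `run_query` could be something like `lambda req: client.query(**req)`, which is mentioned here only as one possible wiring.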
Dynamo: Even more challenges
▰ If you want to query by other parameters (not primary key), you need indexes
▻ Dynamo supports up to 5 indexes :)
▰ Versioning does not work with batch write
▻ You need to handle it yourself
▻ https://github.com/bchew/dynamodump
▰ https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Limits.html
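"Handle it yourself" could mean something like the following, assuming that "versioning" here refers to optimistic locking with a version attribute. Batch writes do not accept condition expressions, so each item needs its own conditional put. The attribute name `version` and the helper name are hypothetical; the function only builds the low-level `PutItem` parameters.

```python
def versioned_put(table: str, item: dict, expected_version: int) -> dict:
    """Build PutItem parameters that only succeed if the stored version matches."""
    new_item = dict(item)
    # Bump the version on every successful write.
    new_item["version"] = {"N": str(expected_version + 1)}
    return {
        "TableName": table,
        "Item": new_item,
        # Succeeds for a brand-new item or when nobody else wrote in between.
        "ConditionExpression": "attribute_not_exists(#v) OR #v = :expected",
        "ExpressionAttributeNames": {"#v": "version"},
        "ExpressionAttributeValues": {":expected": {"N": str(expected_version)}},
    }
```

A failed condition signals a concurrent update, at which point the caller would re-read the item and retry.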
Dynamo problems with backup using AWS EMR
▰ The solution provided by AWS was to use another AWS
service (AWS EMR, i.e. Hadoop)
▻ https://aws.amazon.com/blogs/aws/aws-howto-using-amazon-elastic-mapreduce-with-dynamodb/
▰ The bad part: it did not work
consistently
▻ In a test scenario we restored only 80% of the data
Dynamo solution for backup
Since November 2017, AWS has supported DynamoDB table
backup as a managed service
Cloudformation templates
▰ Infrastructure as code
▰ A love-hate relationship
▰ If you use it properly and understand it, you have an incredibly
good tool
▻ If not, you will hate it
▰ An interesting limitation that is easy to miss:
▻ 200 resources max per template
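One way to avoid being surprised by the resource limit is a small check in the build. The sketch below counts the entries under `Resources` in a JSON template and reports how close it is to the limit quoted on the slide. It handles JSON templates only, and the 80% warning threshold is an arbitrary assumption.

```python
import json

MAX_RESOURCES = 200  # limit quoted on the slide


def count_resources(template_text: str) -> int:
    """Count the logical resources declared in a JSON CloudFormation template."""
    template = json.loads(template_text)
    return len(template.get("Resources", {}))


def check_template(template_text: str, limit: int = MAX_RESOURCES,
                   warn_ratio: float = 0.8) -> str:
    """Return 'ok', 'warning' (close to the limit) or 'over'."""
    count = count_resources(template_text)
    if count > limit:
        return "over"
    if count >= limit * warn_ratio:
        return "warning"
    return "ok"
```

A "warning" result is the moment to start splitting the template into several stacks rather than waiting for a failed deployment.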
Lambda λ
Pros:
▰ FaaS
▻ Pay per execution
▰ No scaling problems
▰ Operational management
▰ Faster innovation
Cons:
▰ No control over environment
▰ Lack of operational tools
▰ Architectural complexity
AWS SQS
▰ Amazon Simple Queue Service (SQS) is a fully managed
message queuing service that makes it easy to decouple and
scale microservices, distributed systems, and serverless
applications.
▰ Nearly unlimited number of transactions per second
▻ 120,000 inflight messages in a queue
▰ Only one bad word => limits (maximum message size):
▻ ActiveMQ 8 GB
▻ RabbitMQ 2 GB
▻ AWS SQS 256 KB
▰ https://stackshare.io/stackups/amazon-sqs-vs-kafka-vs-rabbitmq
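A common way to live with the SQS message size limit is to send a pointer instead of the payload when the message is too large. The sketch below shows the idea only: `store` is a plain dictionary standing in for an object store such as S3, and the envelope keys `inline` and `pointer` are hypothetical.

```python
import json
import uuid

SQS_MAX_BYTES = 256 * 1024  # 256 KB message size limit


def prepare_message(body: str, store: dict) -> str:
    """Return a message that fits the limit, offloading large bodies to `store`."""
    envelope = json.dumps({"inline": body})
    # Measure the encoded envelope, since that is what would be sent.
    if len(envelope.encode("utf-8")) <= SQS_MAX_BYTES:
        return envelope
    key = str(uuid.uuid4())
    store[key] = body  # stand-in for uploading the payload elsewhere
    return json.dumps({"pointer": key})
```

The consumer then checks which key is present and, for `pointer`, fetches the real payload before processing.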
Cloudwatch: The good parts
▰ Out of the box integration with AWS
▻ SNS/SQS
▻ Logging
▻ Lambda
▻ ...
▰ Monitoring tool
▰ Support for multiple types of notifications
Cloudwatch & Logging
https://eu-central-1.console.aws.amazon.com/lambda/home?region=eu-central-1#/functions/LogsToElasticsearchEx_deals-es-logging_454597441955?tab=graph
CodePipeline: The good parts
▰ Super easy setup!
▰ Good integration in AWS ecosystem
CodePipeline: The challenges
▰ Integration with third-party tools requires a custom Lambda
▻ Lambda for sonar (community)
▻ Lambda for github (community)
▰ No parameterized builds
▰ Code Pipeline Monitoring
The custom lambda problem
▰ If you need to tune an AWS service to the way you want to
work, the easiest way is a custom Lambda!
▰ Example:
▻ Integration of Sonar with CodePipeline
▻ Integration of Github builds in CodePipeline
▻ Sending logs from Cloudwatch to ElasticSearch
https://forums.aws.amazon.com/thread.jspa?threadID=227681
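For the CloudWatch-to-Elasticsearch example, the core of such a custom Lambda is decoding the subscription event and reshaping it for the bulk API. The sketch below covers only that transformation and leaves out the HTTP call and request signing. The document field names (`@timestamp`, `log_group`, `log_stream`) and the index name are assumptions, and older Elasticsearch versions may also require a `_type` in the action line.

```python
import base64
import gzip
import json


def decode_log_event(event: dict) -> dict:
    """CloudWatch Logs delivers a base64-encoded, gzip-compressed JSON payload."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))


def to_bulk_body(payload: dict, index: str) -> str:
    """Turn the decoded payload into a newline-delimited bulk request body."""
    lines = []
    for log_event in payload.get("logEvents", []):
        # Action line first, then the document itself.
        lines.append(json.dumps({"index": {"_index": index, "_id": log_event["id"]}}))
        lines.append(json.dumps({
            "@timestamp": log_event["timestamp"],
            "message": log_event["message"],
            "log_group": payload.get("logGroup"),
            "log_stream": payload.get("logStream"),
        }))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n" if lines else ""
```

Using the log event `id` as the document id keeps re-deliveries of the same batch from creating duplicate documents.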
AWS ES: The good parts
▰ Managed service
▰ Easy setup
▰ Integration with AWS ecosystem
▻ IAM Roles
▻ Kinesis
▻ EC2 instances
AWS ES: Challenges
▰ Transport protocol is disabled
▰ Only HTTP requests
▻ https://forums.aws.amazon.com/thread.jspa?messageID=784997
▰ Sometimes returns 500 :)
▰ Automatic scaling is not supported out of the box
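Since the service is reachable over HTTP only and occasionally answers with a 500, clients usually end up wrapping calls in a retry. Below is a minimal sketch with exponential backoff; `send` is an injected callable returning a `(status, body)` tuple, and the attempt count and delays are arbitrary assumptions.

```python
import time


def send_with_retry(send, max_attempts: int = 4, base_delay: float = 0.5,
                    sleep=time.sleep):
    """Call `send()` until it returns a non-5xx status or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status < 500:
            return status, body
        if attempt < max_attempts:
            # Wait 0.5 s, 1 s, 2 s, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    # Every attempt failed; hand the last response back to the caller.
    return status, body
```

Retrying only makes sense for idempotent requests; bulk indexing with explicit document ids fits that requirement.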
Conclusion
▰ Consult/Research before choosing specific AWS service
▰ Managing a whole infrastructure is easy with AWS
▰ If you don’t have very specific requirements go with AWS
THANKS!
Any questions?
You can find me at
gkopevski@gmail.com


Editor's Notes

  • #3 A few words about me: Tech Lead at Intertec, part of GSG, responsible for the architecture and well-being of the Travel domains
  • #4 Everything I am going to say today comes from real scenarios that we encountered while using AWS. To be explicit: this is not a hate presentation, but rather a realistic, objective summary of the pros and cons of some AWS services
  • #8 Now what is the good part. The thing I have concluded while working with AWS is
  • #9 Big portfolio And if you are not able to find better workaround you are going to use their
  • #10 I promised that I would be fancy, so let's go: https://aws.amazon.com/emr/
  • #11 I promised that I would be fancy, so let's go: https://aws.amazon.com/emr/ The point here is some devops/architect comes and start ...
  • #12 First let's speak about the service itself: Good (a few good words)
  • #13 First let's speak about the service itself: Good (a few good words)
  • #15 Cor
  • #16 https://www.nordcloud.com/tech-blog/aws-dynamodb-design-considerations
  • #17 https://www.nordcloud.com/tech-blog/aws-dynamodb-design-considerations
  • #18 https://www.nordcloud.com/tech-blog/aws-dynamodb-design-considerations
  • #22 https://cloudonaut.io/cloudformation-vs-terraform/
  • #23 https://www.quora.com/What-is-the-advantages-and-disadvantages-of-using-AWS-Lambda-with-and-without-Serverless
  • #24 This is a really interesting example of not thinking out of the box.
  • #25 https://aws.amazon.com/cloudwatch/
  • #26 Filebeat is an open source file harvester, mostly used to fetch logs files and feed them into logstash. Logstash is a log pipeline tool that accepts inputs from various sources, executes different transformations, and exports the data to various targets. Elasticsearch is a distributed, RESTful search and analytics engine based on the Lucene search engine. Kibana is a visualization layer that works on top of Elasticsearch
  • #27 If you go under the hood and inspect some of the lambdas you will notice https://aws.amazon.com/kinesis/data-firehose/