© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS re:INVENT
Ubisoft: How For Honor Runs Using Amazon ECS
Ralf Mueller
Louis-Michel Gélinas
November 27, 2017
GAM307
Introductions
Ralf Mueller
Online Technical Architect
For Honor
Ubisoft Montréal
Louis-Michel Gélinas
DevOps Team Lead
Game Online Operations
Ubisoft Montréal
Special thanks to our teams!
First closed alpha for Ubisoft
Biggest open beta for Ubisoft—6 million players
For Honor “Tribute” trailer
For Honor: The Journey
• Trailblaze!
• The core beliefs
• When to beautify!
• Bridges and tunnels
Fail fast! Succeed consistently!
The For Honor core beliefs
Fail fast! Succeed consistently!
Development ease
Automation
Managed infrastructure
A beginning
• Limited cloud experience
• Desire to leverage cloud advantages (elasticity, managed services)
• Buy-in from the project
• Limited support from internal partners
• Small team with other tasks
• No option of full continuous delivery because of console constraints
• On-premises systems not ready to interact with off-premises systems
For Honor production block diagram
[Block diagram. Labels: game clients; on-premises DC; all traffic over the public Internet; Amazon CloudFront; Application Load Balancer; front-end ECS; backend ECS; service discovery ECS; ancillary ECS; supporting services; S3 Faction War World State; ElastiCache (Redis); AWS Lambda; Amazon Elasticsearch Service]
Getting in shape
“Hello World” service
• Play application
• AWS Elastic Beanstalk
• Couchbase data layer
• Provisioning using a shell script
• Validation of the tech and methods
# Create Elastic Beanstalk environment template
aws elasticbeanstalk create-configuration-template \
  --application-name ${APPLICATION_NAME} \
  --template-name ${APPLICATION_NAME}-template \
  --solution-stack-name "64bit Amazon Linux 2014.09 v1.2.0 running Docker 1.3.3" \
  --option-settings \
    OptionName=InstanceType,Namespace=aws:autoscaling:launchconfiguration,Value=${INSTANCE_TYPE} \
    OptionName=IamInstanceProfile,Namespace=aws:autoscaling:launchconfiguration,Value=aws-elasticbeanstalk-ec2-role \
    OptionName=EC2KeyName,Namespace=aws:autoscaling:launchconfiguration,Value=${SSH_KEY_NAME} \
    OptionName=EnvironmentType,Namespace=aws:elasticbeanstalk:environment,Value=${ENVIRONMENT_TYPE} \
    OptionName=VPCId,Namespace=aws:ec2:vpc,Value=vpc-58f2bb3d \
    OptionName=Subnets,Namespace=aws:ec2:vpc,Value=subnet-9739b2e0 \
    OptionName=AssociatePublicIpAddress,Namespace=aws:ec2:vpc,Value=false

# Create Elastic Beanstalk environment from template
aws elasticbeanstalk create-environment \
  --application-name ${APPLICATION_NAME} \
  --environment-name ${APPLICATION_NAME}-${ENVIRONMENT_NAME} \
  --template-name ${APPLICATION_NAME}-template \
  --cname-prefix ${APPLICATION_NAME}-${ENVIRONMENT_NAME}

# Wait for Amazon to process the environment creation request
sleep 300 # should be replaced with proper status checks through the AWS API

# Modify the associated security group to restrict access to the newly created application
AWS_SECURITY_GROUP=$(aws ec2 describe-security-groups \
  --filters Name=tag-value,Values=${APPLICATION_NAME}-${ENVIRONMENT_NAME} \
  | awk '/GroupId/ {gsub("\"", ""); print $2}')
aws ec2 authorize-security-group-ingress --group-id ${AWS_SECURITY_GROUP} --protocol tcp --port 22 --cidr xxx.xxx.xxx.0/20
aws ec2 authorize-security-group-ingress --group-id ${AWS_SECURITY_GROUP} --protocol tcp --port 80 --cidr xxx.xxx.xxx.0/20
aws ec2 authorize-security-group-ingress --group-id ${AWS_SECURITY_GROUP} --protocol tcp --port
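The script's own comment notes that the fixed `sleep 300` should give way to real status checks. A minimal Python sketch of that idea follows. It is not the team's code: `wait_until_ready` and the injected `get_status` callable are hypothetical names, and `get_status` is assumed to wrap whatever call reports the Elastic Beanstalk environment status (for example the `Status` field returned by `describe-environments`).

```python
import time
from typing import Callable


def wait_until_ready(
    get_status: Callable[[], str],
    timeout_s: float = 600.0,
    interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll get_status() until it reports "Ready", a terminal state, or a timeout.

    get_status is a hypothetical wrapper supplied by the caller; sleep and clock
    are injectable so the loop can be exercised without real waiting.
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status == "Ready":
            return True
        if status in ("Terminating", "Terminated"):
            # The environment is going away; waiting longer cannot help.
            return False
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Compared with a fixed sleep, the loop returns as soon as the environment is usable and fails explicitly when it never becomes ready.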
A dead end
• Play framework not a good fit
• Elastic Beanstalk too managed
• Couchbase not a good fit
Automation
Managed infrastructure
Development ease
Fail fast! Succeed consistently!
Planning the route
• GO! 1.5 years before launch
• Two non-mission-critical services
  • Faction War
  • Player information enrichment (PIE)
• Immutable Docker images
• Minimize resources for development
• Run everything locally
• Namespace databases
• Multiple UAT environments
• Scale out for production
Setting out at dawn
Vertical slice
Manual AWS CloudFormation setup for basics
• VPC
• ECS clusters
• Security groups
• ElastiCache instances
• Elasticsearch clusters
ECS task and service management using Python scripts
• Emulates a human running aws-cli commands
Setting out at dawn
First success retrospective
Manual AWS CloudFormation setup for basics
- Depends on documentation
- Fear-driven opposition to change
- Manual tweaks untracked
ECS task and service management using Python scripts
- Scripted parts depend on manual setup
- Hard to orchestrate (no rollback, golden path only)
Automation
Managed infrastructure
Development ease
Fail fast! Succeed consistently!
The last mile
• Monitoring and alerting with Datadog
• Logs and metrics
• Load tests
• Track key KPIs
From trail to autobahn
ECS boot sequence fragility
600 ALBs are not practical!
Instance cycling automation
Automate AWS CloudFormation
Proxies vs. tunnels vs. Internet
ECS boot 101
[Diagram. Labels: front-end ASG; backend ASG; ECS agent on each instance]
ECS boot sequence
Registration and autokill
1. yum install -y jq aws-cli
2. Get instance details
3. Register instance in OpsWorks
4. Set up ECS and start agent
• All steps above can fail
• Retries with timeout
• After 5 minutes: auto-terminate
- Step one must not fail!
- We scan clusters: Is ASG instance count equal to cluster instance count?
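The retry, timeout, and auto-terminate behavior described for the boot sequence can be sketched as below. This is an illustration under assumptions rather than the actual boot script: `run_boot_sequence`, `cluster_is_consistent`, the retry delay, and the injected `terminate` callable are all hypothetical, and only the 5-minute budget comes from the slide.

```python
import time
from typing import Callable, Sequence


def run_boot_sequence(
    steps: Sequence[Callable[[], None]],
    terminate: Callable[[], None],
    budget_s: float = 300.0,      # "After 5 minutes: auto-terminate"
    retry_delay_s: float = 10.0,  # assumed value, not from the talk
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Run each boot step in order, retrying failures until the budget runs out.

    If a step still fails when no time is left for another retry, terminate()
    is called so the Auto Scaling group can replace the instance.
    """
    deadline = clock() + budget_s
    for step in steps:
        while True:
            try:
                step()
                break
            except Exception:
                if clock() + retry_delay_s > deadline:
                    terminate()
                    return False
                sleep(retry_delay_s)
    return True


def cluster_is_consistent(asg_instance_count: int, cluster_instance_count: int) -> bool:
    """The external scan: every ASG instance should be registered in the cluster."""
    return asg_instance_count == cluster_instance_count
```

The external count comparison covers the case the in-instance logic cannot: an instance whose very first step failed, so that no retry or auto-terminate code ever ran.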
Not all are created equal
UAT
• 100s of ECS services in 2–5 clusters
• 100–150 environments (changing weekly)
• HAProxy/Route 53 routing
- Single node
- Deployment latency
PROD
• 10s of services on 2 clusters
• 3 static environments (PS4, Xbox One, PC)
• ALBs for scale and reliability
+ Multi-node
+ Seamless deployments
- IP hungry
Not all are created equal
ALB in PROD, as opposed to HAProxy with Route 53 in UAT
• 600+ ALBs not practical
• HAProxy container running on every instance (OpsWorks provisioned)
• Scan instance for running services every minute
  • Check for new services
  • Update Route 53 entries
  • Update local HAProxy config
• Route host header to local container port
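The per-minute scan that updates Route 53 could be reduced to a diff between what was last published and what is running now. The sketch below is an assumption about how that might look, not the team's scanner: `build_route53_changes` is a hypothetical helper, and the use of `A` records with a 60-second TTL is a guess. The dictionaries it returns follow the shape of a Route 53 change batch, so they could be passed as `ChangeBatch={"Changes": changes}` to boto3's `change_resource_record_sets`.

```python
from typing import Dict, List


def build_route53_changes(
    known: Dict[str, str],
    running: Dict[str, str],
    ttl: int = 60,
) -> List[dict]:
    """Return UPSERT changes for host names that are new or point elsewhere.

    known and running both map a host name to the instance IP serving it.
    """
    changes = []
    for host, ip in sorted(running.items()):
        if known.get(host) != ip:
            changes.append({
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": host,
                    "Type": "A",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": ip}],
                },
            })
    return changes
```

Sending only the differences keeps the number of Route 53 API calls low when most of the 100+ environments are unchanged between scans.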
Not all are created equal
frontend http-in
    timeout client 1m
    bind *:80 name http
    bind *:443 name https ssl crt /etc/ssl/cert/acme_com_cert.pem
    acl host_MGW-Team-HERO-PC-UAT-X hdr(host) -i MGW-Team-HERO-PC-UAT-X.acme.com
    use_backend MGW-Team-HERO-PC-UAT-X if host_MGW-Team-HERO-PC-UAT-X
backend MGW-Team-HERO-PC-UAT-X
    balance leastconn
    timeout connect 10s
    timeout server 1m
    option httpclose
    option forwardfor
    cookie JSESSIONID prefix
    server a3dbf14d3a92 172.17.0.9:12551 cookie A check
Repeat for each container on the host (1-40)
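Because the ACL and backend stanzas repeat once per container, the local HAProxy config lends itself to generation. The following is a simplified sketch assuming a hypothetical `Container` record and `render_haproxy` function; it leaves out the TLS bind and the cookie settings from the slide to stay short.

```python
from typing import Iterable, NamedTuple


class Container(NamedTuple):
    service: str       # e.g. "MGW-Team-HERO-PC-UAT-X"
    container_id: str  # short Docker container ID
    ip: str            # container IP on the Docker bridge
    port: int          # port the service listens on


def render_haproxy(containers: Iterable[Container], domain: str = "acme.com") -> str:
    """Render one frontend with a host-header ACL per service, plus one backend each."""
    ordered = sorted(containers)
    lines = [
        "frontend http-in",
        "    timeout client 1m",
        "    bind *:80 name http",
    ]
    for c in ordered:
        lines.append(f"    acl host_{c.service} hdr(host) -i {c.service}.{domain}")
        lines.append(f"    use_backend {c.service} if host_{c.service}")
    for c in ordered:
        lines += [
            "",
            f"backend {c.service}",
            "    balance leastconn",
            "    timeout connect 10s",
            "    timeout server 1m",
            f"    server {c.container_id} {c.ip}:{c.port} check",
        ]
    return "\n".join(lines) + "\n"
```

Sorting the containers makes the output deterministic, so a scanner can compare the new text with the file on disk and reload HAProxy only when something changed.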
Security updates
[Flow diagram. Labels: Terminating; Amazon SNS; Lambda function; Set to draining; Check; Reinvoke; Complete hook]
Security updates
Lambdas fill gaps in offering
• Tag instances with information needed
• Cluster name
• Sleep Lambda before re-invoking (or suffer throttling)
• Don't repeat calls—add results to SNS message
• Inspiration came from this AWS blog post:
https://aws.amazon.com/blogs/compute/how-to-automate-container-instance-draining-in-amazon-ecs/
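A rough sketch of the draining loop these bullets describe is shown below. It is not the team's Lambda: `handle_termination` is a hypothetical function, the `Cluster`, `ContainerInstanceArn`, and `Draining` message fields are assumptions standing in for "tag instances" and "add results to SNS message", and the delay value is a guess. The client methods match boto3's ECS, Auto Scaling, and SNS clients, which are passed in so the logic can be exercised with stand-ins.

```python
import json
import time


def handle_termination(message, ecs, asg, sns, topic_arn, sleep=time.sleep, delay_s=5.0):
    """Drain an ECS container instance before its ASG lifecycle hook completes.

    message is the decoded SNS payload. Cluster, ContainerInstanceArn and
    Draining are assumed extra fields carried along between invocations.
    """
    cluster = message["Cluster"]
    instance_arn = message["ContainerInstanceArn"]

    if not message.get("Draining"):
        ecs.update_container_instances_state(
            cluster=cluster, containerInstances=[instance_arn], status="DRAINING"
        )
        # Record the result in the message so the next invocation skips this call.
        message = dict(message, Draining=True)

    tasks = ecs.list_tasks(cluster=cluster, containerInstance=instance_arn)["taskArns"]
    if tasks:
        # Tasks are still running: pause to avoid throttling, then re-invoke via SNS.
        sleep(delay_s)
        sns.publish(TopicArn=topic_arn, Message=json.dumps(message))
        return "reinvoked"

    asg.complete_lifecycle_action(
        LifecycleHookName=message["LifecycleHookName"],
        AutoScalingGroupName=message["AutoScalingGroupName"],
        LifecycleActionToken=message["LifecycleActionToken"],
        LifecycleActionResult="CONTINUE",
    )
    return "completed"
```

Carrying state in the SNS message keeps the function stateless: each invocation does one check and either hands off to the next invocation or releases the instance.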
Automate AWS CloudFormation stacks
• Manual is scary
• Store AWS CloudFormation stacks in Git
• GitLab CI jobs triggering updates
• Benefit from stack updates (rollback)
• Promote changes from DEV toward PROD safely
Automate AWS CloudFormation stacks
• ECS services and tasks as AWS CloudFormation stacks
• Python code generates stack from template and configs
• Triggers stack update
• Benefit from stack updates (rollback)
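Generating a stack from a small service config could look roughly like the sketch below. The talk does not show the team's generator, so `build_service_stack` and every config key (`name`, `image`, `memory_mib`, `container_port`, `cluster`, `desired_count`) are hypothetical; the resource types and property names are standard CloudFormation ones for ECS.

```python
import json
from typing import Any, Dict


def build_service_stack(config: Dict[str, Any]) -> str:
    """Return a CloudFormation template (JSON text) for one ECS task and service."""
    name = config["name"]
    template = {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": f"ECS task definition and service for {name}",
        "Resources": {
            "TaskDefinition": {
                "Type": "AWS::ECS::TaskDefinition",
                "Properties": {
                    "Family": name,
                    "ContainerDefinitions": [{
                        "Name": name,
                        "Image": config["image"],
                        "Memory": config["memory_mib"],
                        "PortMappings": [{"ContainerPort": config["container_port"]}],
                    }],
                },
            },
            "Service": {
                "Type": "AWS::ECS::Service",
                "Properties": {
                    "Cluster": config["cluster"],
                    "DesiredCount": config["desired_count"],
                    # Ref ties the service to the task definition in this stack.
                    "TaskDefinition": {"Ref": "TaskDefinition"},
                },
            },
        },
    }
    return json.dumps(template, indent=2, sort_keys=True)
```

Because a new image tag changes the task definition inside the template, a deployment becomes an ordinary stack update and inherits CloudFormation's rollback on failure.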
Tunnel vs. proxy vs. internet
UAT on-premises endpoints are not on the public internet
VPC/VPN: Reach internal on-premises endpoints using the VPN
- VPN can flap; it must be monitored
- VPN can become a bottleneck (unsuited for high traffic)
- You need a special DNS configuration to use an on-premises DNS to resolve private domains
+ Works for us
Tunnel vs. proxy vs. internet
UAT on-premises endpoints are not on the public internet
Proxies: Whitelist two Elastic IPs and allow traffic from these to reach protected endpoints
- Need to manage proxies
- Need to whitelist IPs in corporate firewall
+ Worked for other projects
LIVE uses public endpoints on the Internet
For Honor UAT block diagram
[Block diagram. Labels: game clients; on-premises data center; VPN tunnel; HAProxy + Route 53; front-end ECS; backend ECS; service discovery ECS; ancillary ECS; supporting services; Amazon CloudFront; S3 Faction War World State; ElastiCache (Redis); AWS Lambda; Amazon Elasticsearch Service]
Looking back after a break
Automation
Managed infrastructure
Development ease
Fail fast! Succeed consistently!
Lessons learned
• Everything manual is a risk
• Break-even point of automation changes over time
• Validate all changes in noncritical but identical setups
• AWS CloudFormation can have a mind of its own
• Service containers are hard to manage (even with placement constraints)
• Surprising gaps in offerings: Lambdas can duct tape a lot of features cheaply
• Cheap in dev and operations
• Invest in Lambda CI/CD tooling; it can get messy
• Use managed services (Elasticsearch, ElastiCache, SQS, Lambda, etc.)
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

More Related Content

What's hot

What's hot (20)

Oracle Enterprise Solutions on AWS - ENT326 - re:Invent 2017
Oracle Enterprise Solutions on AWS - ENT326 - re:Invent 2017Oracle Enterprise Solutions on AWS - ENT326 - re:Invent 2017
Oracle Enterprise Solutions on AWS - ENT326 - re:Invent 2017
 
NEW LAUNCH! Introducing AWS Fargate - CON214 - re:Invent 2017
NEW LAUNCH! Introducing AWS Fargate - CON214 - re:Invent 2017NEW LAUNCH! Introducing AWS Fargate - CON214 - re:Invent 2017
NEW LAUNCH! Introducing AWS Fargate - CON214 - re:Invent 2017
 
CON203_Driving Innovation with Containers
CON203_Driving Innovation with ContainersCON203_Driving Innovation with Containers
CON203_Driving Innovation with Containers
 
STG330_Case Study How Experian Leverages Amazon EC2, EBS, and S3 with Clouder...
STG330_Case Study How Experian Leverages Amazon EC2, EBS, and S3 with Clouder...STG330_Case Study How Experian Leverages Amazon EC2, EBS, and S3 with Clouder...
STG330_Case Study How Experian Leverages Amazon EC2, EBS, and S3 with Clouder...
 
GPSWKS301_Comprehensive Big Data Architecture Made Easy
GPSWKS301_Comprehensive Big Data Architecture Made EasyGPSWKS301_Comprehensive Big Data Architecture Made Easy
GPSWKS301_Comprehensive Big Data Architecture Made Easy
 
MBL209_Learn How MicroStrategy on AWS is Helping Vivint Solar Deliver Clean E...
MBL209_Learn How MicroStrategy on AWS is Helping Vivint Solar Deliver Clean E...MBL209_Learn How MicroStrategy on AWS is Helping Vivint Solar Deliver Clean E...
MBL209_Learn How MicroStrategy on AWS is Helping Vivint Solar Deliver Clean E...
 
NEW LAUNCH! Introducing Amazon EKS - CON215 - re:Invent 2017
NEW LAUNCH! Introducing Amazon EKS - CON215 - re:Invent 2017NEW LAUNCH! Introducing Amazon EKS - CON215 - re:Invent 2017
NEW LAUNCH! Introducing Amazon EKS - CON215 - re:Invent 2017
 
NET308_VPC Design Scenarios for Real-Life Use Cases
NET308_VPC Design Scenarios for Real-Life Use CasesNET308_VPC Design Scenarios for Real-Life Use Cases
NET308_VPC Design Scenarios for Real-Life Use Cases
 
GPSWKS408-GPS Migrate Your Databases with AWS Database Migration Service and ...
GPSWKS408-GPS Migrate Your Databases with AWS Database Migration Service and ...GPSWKS408-GPS Migrate Your Databases with AWS Database Migration Service and ...
GPSWKS408-GPS Migrate Your Databases with AWS Database Migration Service and ...
 
CON307_Building Effective Container Images
CON307_Building Effective Container ImagesCON307_Building Effective Container Images
CON307_Building Effective Container Images
 
ARC201_Scaling Up to Your First 10 Million Users
ARC201_Scaling Up to Your First 10 Million UsersARC201_Scaling Up to Your First 10 Million Users
ARC201_Scaling Up to Your First 10 Million Users
 
How to Assess Your Organization's Readiness to Migrate at Scale to AWS - ENT2...
How to Assess Your Organization's Readiness to Migrate at Scale to AWS - ENT2...How to Assess Your Organization's Readiness to Migrate at Scale to AWS - ENT2...
How to Assess Your Organization's Readiness to Migrate at Scale to AWS - ENT2...
 
Reinforcement Learning – The Ultimate AI - ARC320 - re:Invent 2017
Reinforcement Learning – The Ultimate AI - ARC320 - re:Invent 2017Reinforcement Learning – The Ultimate AI - ARC320 - re:Invent 2017
Reinforcement Learning – The Ultimate AI - ARC320 - re:Invent 2017
 
CON209_Interstella 8888 Learn How to Use Docker on AWS
CON209_Interstella 8888 Learn How to Use Docker on AWSCON209_Interstella 8888 Learn How to Use Docker on AWS
CON209_Interstella 8888 Learn How to Use Docker on AWS
 
CMP314_Bringing Deep Learning to the Cloud with Amazon EC2
CMP314_Bringing Deep Learning to the Cloud with Amazon EC2CMP314_Bringing Deep Learning to the Cloud with Amazon EC2
CMP314_Bringing Deep Learning to the Cloud with Amazon EC2
 
Monitoring and Troubleshooting in a Serverless World - SRV303 - re:Invent 2017
Monitoring and Troubleshooting in a Serverless World - SRV303 - re:Invent 2017Monitoring and Troubleshooting in a Serverless World - SRV303 - re:Invent 2017
Monitoring and Troubleshooting in a Serverless World - SRV303 - re:Invent 2017
 
NEW LAUNCH! AWS Serverless Application Repository - SRV215 - re:Invent 2017
NEW LAUNCH! AWS Serverless Application Repository - SRV215 - re:Invent 2017NEW LAUNCH! AWS Serverless Application Repository - SRV215 - re:Invent 2017
NEW LAUNCH! AWS Serverless Application Repository - SRV215 - re:Invent 2017
 
SRV314_Building a Serverless Pipeline to Transcode a Two-Hour Video in Minutes
SRV314_Building a Serverless Pipeline to Transcode a Two-Hour Video in MinutesSRV314_Building a Serverless Pipeline to Transcode a Two-Hour Video in Minutes
SRV314_Building a Serverless Pipeline to Transcode a Two-Hour Video in Minutes
 
MBL201_Progressive Web Apps in the Real World
MBL201_Progressive Web Apps in the Real WorldMBL201_Progressive Web Apps in the Real World
MBL201_Progressive Web Apps in the Real World
 
STG203_Get Rid of Tape and Modernize Backup with AWS
STG203_Get Rid of Tape and Modernize Backup with AWSSTG203_Get Rid of Tape and Modernize Backup with AWS
STG203_Get Rid of Tape and Modernize Backup with AWS
 

Similar to GAM307_Ubisoft How For Honor Runs Using Amazon ECS

Similar to GAM307_Ubisoft How For Honor Runs Using Amazon ECS (20)

Batch Processing with Containers on AWS - CON304 - re:Invent 2017
Batch Processing with Containers on AWS - CON304 - re:Invent 2017Batch Processing with Containers on AWS - CON304 - re:Invent 2017
Batch Processing with Containers on AWS - CON304 - re:Invent 2017
 
Deep dive into AWS fargate
Deep dive into AWS fargateDeep dive into AWS fargate
Deep dive into AWS fargate
 
Amazon ECS Deep Dive
Amazon ECS Deep DiveAmazon ECS Deep Dive
Amazon ECS Deep Dive
 
Amazon Amazon Elastic Container Service (Amazon ECS)
Amazon Amazon Elastic Container Service (Amazon ECS)Amazon Amazon Elastic Container Service (Amazon ECS)
Amazon Amazon Elastic Container Service (Amazon ECS)
 
Building Web Apps on AWS
Building Web Apps on AWSBuilding Web Apps on AWS
Building Web Apps on AWS
 
Leo Zhadanovsky - Building Web Apps with AWS CodeStar and AWS Elastic Beansta...
Leo Zhadanovsky - Building Web Apps with AWS CodeStar and AWS Elastic Beansta...Leo Zhadanovsky - Building Web Apps with AWS CodeStar and AWS Elastic Beansta...
Leo Zhadanovsky - Building Web Apps with AWS CodeStar and AWS Elastic Beansta...
 
Infrastructure Security: Your Minimum Security Baseline
Infrastructure Security: Your Minimum Security BaselineInfrastructure Security: Your Minimum Security Baseline
Infrastructure Security: Your Minimum Security Baseline
 
Containers on AWS
Containers on AWSContainers on AWS
Containers on AWS
 
AWS 容器服務入門實務
AWS 容器服務入門實務AWS 容器服務入門實務
AWS 容器服務入門實務
 
Deep Learning Using Caffe2 on AWS - MCL313 - re:Invent 2017
Deep Learning Using Caffe2 on AWS - MCL313 - re:Invent 2017Deep Learning Using Caffe2 on AWS - MCL313 - re:Invent 2017
Deep Learning Using Caffe2 on AWS - MCL313 - re:Invent 2017
 
Stack Mastery: Create and Optimize Advanced AWS CloudFormation Templates - DE...
Stack Mastery: Create and Optimize Advanced AWS CloudFormation Templates - DE...Stack Mastery: Create and Optimize Advanced AWS CloudFormation Templates - DE...
Stack Mastery: Create and Optimize Advanced AWS CloudFormation Templates - DE...
 
Deep Dive into AWS Fargate - CON333 - re:Invent 2017
Deep Dive into AWS Fargate - CON333 - re:Invent 2017Deep Dive into AWS Fargate - CON333 - re:Invent 2017
Deep Dive into AWS Fargate - CON333 - re:Invent 2017
 
Automate and Scale Configuration Management with AWS OpsWorks - DEV331 - re:I...
Automate and Scale Configuration Management with AWS OpsWorks - DEV331 - re:I...Automate and Scale Configuration Management with AWS OpsWorks - DEV331 - re:I...
Automate and Scale Configuration Management with AWS OpsWorks - DEV331 - re:I...
 
Create a Serverless Image Processing Platform - ARC326 - re:Invent 2017
Create a Serverless Image Processing Platform - ARC326 - re:Invent 2017Create a Serverless Image Processing Platform - ARC326 - re:Invent 2017
Create a Serverless Image Processing Platform - ARC326 - re:Invent 2017
 
Deep Dive on Amazon Elastic Container Service (ECS) and Fargate
Deep Dive on Amazon Elastic Container Service (ECS) and FargateDeep Dive on Amazon Elastic Container Service (ECS) and Fargate
Deep Dive on Amazon Elastic Container Service (ECS) and Fargate
 
Interstella 8888: CICD for Containers on AWS - CON319 - re:Invent 2017
Interstella 8888: CICD for Containers on AWS - CON319 - re:Invent 2017Interstella 8888: CICD for Containers on AWS - CON319 - re:Invent 2017
Interstella 8888: CICD for Containers on AWS - CON319 - re:Invent 2017
 
CON319_Interstella GTC CICD for Containers on AWS
CON319_Interstella GTC CICD for Containers on AWSCON319_Interstella GTC CICD for Containers on AWS
CON319_Interstella GTC CICD for Containers on AWS
 
Build a Java Spring Application on Amazon ECS - CON332 - re:Invent 2017
Build a Java Spring Application on Amazon ECS - CON332 - re:Invent 2017Build a Java Spring Application on Amazon ECS - CON332 - re:Invent 2017
Build a Java Spring Application on Amazon ECS - CON332 - re:Invent 2017
 
Compute@Scale
Compute@ScaleCompute@Scale
Compute@Scale
 
Customer Sharing: Trend Micro - Analytic Engine - A common Big Data computati...
Customer Sharing: Trend Micro - Analytic Engine - A common Big Data computati...Customer Sharing: Trend Micro - Analytic Engine - A common Big Data computati...
Customer Sharing: Trend Micro - Analytic Engine - A common Big Data computati...
 

More from Amazon Web Services

Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
Amazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
Amazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
Amazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
Amazon Web Services
 

More from Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

GAM307_Ubisoft How For Honor Runs Using Amazon ECS

  • 1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. AWS re:INVENT Ubisoft: How For Honor Runs Using Amazon ECS R a l f M u e l l e r L o u i s - M i c h e l G é l i n a s N o v e m b e r 2 7 , 2 0 1 7 GAM307
  • 2. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Introductions Ralf Mueller Online Technical Architect For Honor Ubisoft Montréal Louis-Michel Gélinas DevOps Team Lead Game Online Operations Ubisoft Montréal Special thanks to our teams!
  • 3. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  • 4. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. First closed alpha for Ubisoft Biggest open beta for Ubisoft—6 million players
  • 5. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. For Honor “Tribute” trailer
  • 6. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. For Honor: The Journey • Trailblaze! • The core beliefs • When to beautify! • Bridges and tunnels
  • 7. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Fail fast! Succeed consistently!
  • 8. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Fail fast! Succeed consistently!
  • 9. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. The For Honor core beliefs Fail fast! Succeed consistently! Development ease Automation Managed infrastructure
  • 10. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. A beginning • Limited cloud experience • Desire to leverage cloud advantages (elasticity, managed services) • Buy-in from the project • Limited support from internal partners • Small team with other tasks • No option of full continuous delivery because of console constraints • On-premises systems not ready to interact with off-premises systems
  • 11. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. For Honor production block diagram Backend ECS Amazon CloudFront Application Load Balancer S3 Faction War World State ElastiCache (REDIS) AWS Lambda Amazon Elasticsearch Service All traffic over the public Internet Game clients On-premises DC Front-end ECS Service discovery ECS Ancillary ECS Supporting services Application Load Balancer Front-end ECS Backend ECS Game clients
  • 12. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Getting in shape “Hello World” service • Play application • AWS Elastic Beanstalk • Couchbase data layer • Provisioning using a shell script • Validation of the tech and methods #Create Elastic Beanstalk Environment Template aws elasticbeanstalk create-configuration-template --application-name ${APPLICATION_NAME} --template-name ${APPLICATION_NAME}-template --solution-stack-name "64bit Amazon Linux 2014.09 v1.2.0 running Docker 1.3.3" --option-settings OptionName=InstanceType,Namespace=aws:autoscaling:launchconfiguration,Value=${INSTANCE_T YPE} OptionName=IamInstanceProfile,Namespace=aws:autoscaling:launchconfiguration,Value=aws- elasticbeanstalk-ec2-role OptionName=EC2KeyName,Namespace=aws:autoscaling:launchconfiguration,Value=${SSH_KEY_NAME } OptionName=EnvironmentType,Namespace=aws:elasticbeanstalk:environment,Value=${ENVIRONMEN T_TYPE} OptionName=VPCId,Namespace=aws:ec2:vpc,Value=vpc-58f2bb3d OptionName=Subnets,Namespace=aws:ec2:vpc,Value=subnet-9739b2e0 OptionName=AssociatePublicIpAddress,Namespace=aws:ec2:vpc,Value=false #Create Elastic Beanstalk Environment from Template aws elasticbeanstalk create-environment --application-name ${APPLICATION_NAME} -- environment-name ${APPLICATION_NAME}-${ENVIRONMENT_NAME} --template-name ${APPLICATION_NAME}-template --cname-prefix ${APPLICATION_NAME}-${ENVIRONMENT_NAME} #Wait a little moment for Amazon to process environment creation request sleep 300; #should be fixed with proper status checks through AWS API #Modify associated security group to restrict access to the newly created application AWS_SECURITY_GROUP=$(aws ec2 describe-security-groups --filters Name=tag- value,Values=${APPLICATION_NAME}-${ENVIRONMENT_NAME} | awk '/GroupId/ {gsub(""", "");print $2}') aws ec2 authorize-security-group-ingress --group-id ${AWS_SECURITY_GROUP} --protocol tcp --port 22 --cidr xxx.xxx.xxx.0/20 aws ec2 authorize-security-group-ingress --group-id 
${AWS_SECURITY_GROUP} --protocol tcp --port 80 --cidr xxx.xxx.xxx.0/20 aws ec2 authorize-security-group-ingress --group-id ${AWS_SECURITY_GROUP} --protocol tcp --port
  • 13. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. • Play framework not a good fit • Elastic Beanstalk too managed • Couchbase not a good fit A dead end Automation Managed infrastructure Development ease Fail fast! Succeed consistently!
  • 14. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Planning the route • GO! 1.5 years before launch • Two non-mission critical services • Faction War • Player information enrichment (PIE) • Immutable Docker images • Minimize resources for development • Run everything local • Namespace databases • Multiple UAT environments • Scale out for production
  • 15. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Manual AWS CloudFormation setup for basics • VPC • ECS clusters • Security groups • ElastiCache instances • Elasticsearch clusters ECS task and service management using Python scripts • Emulates a human running aws-cli commands Setting out at dawn Vertical slice
  • 16. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Manual AWS CloudFormation setup for basics - Depends on documentation - Fear-driven opposition to change - Manual tweaks untracked ECS task and service management using Python scripts - Scripted parts depend on manual setup - Hard to orchestrate (no rollback, golden path only) Setting out at dawn First success retrospective Automation Managed infrastructure Development ease Fail fast! Succeed consistently!
  • 17. The last mile
      • Monitoring and alerting with Datadog
      • Logs and metrics
      • Load tests
      • Track key KPIs
  • 18. From trail to autobahn
      • ECS boot sequence fragility
      • 600 ALBs are not practical!
      • Instance cycling automation
      • Automate AWS CloudFormation
      • Proxies vs. tunnels vs. Internet
  • 19. ECS boot 101 (diagram labels: Front-end ASG, Backend ASG, ECS Agent ×4)
  • 20. ECS boot sequence: Registration and autokill
      1. yum install -y jq aws-cli
      2. Get instance details
      3. Register instance in OpsWorks
      4. Set up ECS and start agent
      • All steps above can fail
      • Retries with timeout
      • After 5 minutes: auto-terminate
      - Step one must not fail!
      - We scan clusters: Is ASG instance count equal to cluster instance count?
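The retry, timeout, and auto-terminate behaviour listed above can be sketched in Python as follows. This is an illustration under assumptions, not the team's boot script: `run_boot_sequence`, `unregistered_instances`, and the `terminate` hook are hypothetical names, and the 300-second budget simply mirrors the 5 minutes on the slide.

```python
import time


def run_boot_sequence(steps, terminate, budget_s=300.0, retry_delay_s=5.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Run each step in order, retrying failures until the time budget is spent.

    steps is a list of callables; terminate is called once if the budget runs
    out. Returns True if every step succeeded, False if terminate was called.
    """
    deadline = clock() + budget_s
    for step in steps:
        while True:
            try:
                step()
                break  # step succeeded, move on to the next one
            except Exception:
                if clock() + retry_delay_s > deadline:
                    terminate()  # out of time: give up on this instance
                    return False
                sleep(retry_delay_s)
    return True


def unregistered_instances(asg_instance_ids, cluster_instance_ids):
    """Return ASG instances that never joined the ECS cluster, sorted."""
    return sorted(set(asg_instance_ids) - set(cluster_instance_ids))
```

The second helper corresponds to the cluster scan: a non-empty result means the ASG instance count and the cluster instance count disagree.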
  • 21. Not all are created equal
      UAT
          • 100s of ECS services in 2–5 clusters
          • 100–150 environments (changing weekly)
          • HAProxy/Route 53 routing
              - Single node
              - Deployment latency
      PROD
          • 10s of services on 2 clusters
          • 3 static environments (PS4, Xbox One, PC)
          • ALBs for scale and reliability
              + Multi-node
              + Seamless deployments
              - IP hungry
  • 22. Not all are created equal: ALB in PROD, as opposed to Route 53 and HAProxy in UAT
      • 600+ ALBs not practical
      • HAProxy container running on every instance (OpsWorks provisioned)
      • Scan instance for running services every minute
      • Check for new services
      • Update Route 53 entries
      • Update local HAProxy config
      • Route host-header to local container port
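As a rough sketch of the "check for new services, update Route 53 entries" steps, the function below diffs two scans and builds a change batch in the shape that boto3's `change_resource_record_sets` accepts as `ChangeBatch`. The function name, the `known`/`running` inputs, the `acme.com` domain (borrowed from the sample config in this deck), and the choice of an `A` record pointing at the instance IP are all assumptions for illustration.

```python
def route53_changes(known, running, host_ip, domain="acme.com", ttl=60):
    """Build a Route 53 ChangeBatch for services that appeared since the last scan.

    known and running are collections of service names (hypothetical inputs
    from the per-minute scan); host_ip is the address of the instance whose
    HAProxy would receive the traffic. Returns None when nothing changed.
    """
    new_services = sorted(set(running) - set(known))
    if not new_services:
        return None
    return {
        "Changes": [
            {
                "Action": "UPSERT",  # create the record or overwrite a stale one
                "ResourceRecordSet": {
                    "Name": f"{name}.{domain}",
                    "Type": "A",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": host_ip}],
                },
            }
            for name in new_services
        ]
    }
```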
  • 23. Not all are created equal

      frontend http-in
          timeout client 1m
          bind *:80 name http
          bind *:443 name https ssl crt /etc/ssl/cert/acme_com_cert.pem
          acl host_MGW-Team-HERO-PC-UAT-X hdr(host) -i MGW-Team-HERO-PC-UAT-X.acme.com
          use_backend MGW-Team-HERO-PC-UAT-X if host_MGW-Team-HERO-PC-UAT-X

      backend MGW-Team-HERO-PC-UAT-X
          balance leastconn
          timeout connect 10s
          timeout server 1m
          option httpclose
          option forwardfor
          cookie JSESSIONID prefix
          server a3dbf14d3a92 172.17.0.9:12551 cookie A check

      Repeat for each container on the host (1–40)
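"Repeat for each container on the host" suggests the config is generated rather than hand-written. A minimal Python sketch of such a generator follows. The function name `render_haproxy` and the dictionary keys (`service`, `id`, `ip`, `port`) are hypothetical, and the directives are copied from the sample config above rather than from the team's actual tooling.

```python
def render_haproxy(containers, domain="acme.com"):
    """Render one frontend with host-header rules plus one backend per container.

    containers is a list of dicts with hypothetical keys: "service" (host label),
    "id" (container ID), "ip" and "port" (where the container listens).
    """
    lines = [
        "frontend http-in",
        "    timeout client 1m",
        "    bind *:80 name http",
        "    bind *:443 name https ssl crt /etc/ssl/cert/acme_com_cert.pem",
    ]
    # One ACL and routing rule per service, matched on the Host header.
    for c in containers:
        svc = c["service"]
        lines.append(f"    acl host_{svc} hdr(host) -i {svc}.{domain}")
        lines.append(f"    use_backend {svc} if host_{svc}")
    # One backend per container, pointing at its local address and port.
    for c in containers:
        lines += [
            "",
            f"backend {c['service']}",
            "    balance leastconn",
            "    timeout connect 10s",
            "    timeout server 1m",
            "    option httpclose",
            "    option forwardfor",
            "    cookie JSESSIONID prefix",
            f"    server {c['id']} {c['ip']}:{c['port']} cookie A check",
        ]
    return "\n".join(lines) + "\n"
```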
  • 24. Security updates (diagram labels: Terminating, Amazon SNS, Lambda function, Set to draining, Check, Reinvoke, Complete hook)
  • 25. Security updates: Lambdas fill gaps in offering
      • Tag instances with information needed
          • Cluster name
      • Sleep Lambda before re-invoking (or suffer throttling)
      • Don't repeat calls: add results to the SNS message
      • Inspiration came from this AWS blog post: https://aws.amazon.com/blogs/compute/how-to-automate-container-instance-draining-in-amazon-ecs/
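The drain, check, re-invoke loop can be sketched as a single function. This is a simplified illustration under assumptions, not the team's Lambda: the message keys (`cluster`, `container_instance_arn`, `draining`, `hook_name`, `asg_name`, `instance_id`) are hypothetical, and the `ecs`, `autoscaling`, and `sns` arguments stand for boto3-style clients passed in by the caller.

```python
import json
import time


def handle_termination(message, ecs, autoscaling, sns, topic_arn,
                       pause_s=5.0, sleep=time.sleep):
    """One pass of a drain-then-terminate loop driven by SNS re-invocation.

    Returns "reinvoked" if tasks are still running, "completed" otherwise.
    """
    cluster = message["cluster"]
    instance_arn = message["container_instance_arn"]

    if not message.get("draining"):
        ecs.update_container_instances_state(
            cluster=cluster, containerInstances=[instance_arn], status="DRAINING")
        # Record the result in the message so later passes skip this call.
        message = dict(message, draining=True)

    tasks = ecs.list_tasks(cluster=cluster, containerInstance=instance_arn)
    if tasks["taskArns"]:
        sleep(pause_s)  # pause before re-invoking to avoid API throttling
        sns.publish(TopicArn=topic_arn, Message=json.dumps(message))
        return "reinvoked"

    # No tasks left: let the Auto Scaling group finish terminating the instance.
    autoscaling.complete_lifecycle_action(
        LifecycleHookName=message["hook_name"],
        AutoScalingGroupName=message["asg_name"],
        LifecycleActionResult="CONTINUE",
        InstanceId=message["instance_id"],
    )
    return "completed"
```

Passing the clients in as arguments is a design choice made here so the logic can be exercised with fakes; a real Lambda handler would create them once at module load.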
  • 26. Automate AWS CloudFormation stacks
      • Manual is scary
      • Store AWS CloudFormation stacks in Git
      • GitLab CI jobs triggering updates
      • Benefit from stack updates (rollback)
      • Promote changes from DEV toward PROD safely
  • 27. Automate AWS CloudFormation stacks
      • ECS services and tasks as AWS CloudFormation stacks
      • Python code generates stack from template and configs
      • Triggers stack update
      • Benefit from stack updates (rollback)
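A minimal sketch of "Python code generates stack from template and configs" might look like the following. The function name `build_service_stack` and the config keys are hypothetical; the resource types and properties are standard CloudFormation (`AWS::ECS::TaskDefinition`, `AWS::ECS::Service`), but the real generator presumably covered far more settings.

```python
import json


def build_service_stack(config):
    """Return a CloudFormation template (as JSON text) for one ECS service.

    config is a dict with hypothetical keys: "name", "image", "cluster",
    "memory", "container_port" and "desired_count".
    """
    template = {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "TaskDefinition": {
                "Type": "AWS::ECS::TaskDefinition",
                "Properties": {
                    "Family": config["name"],
                    "ContainerDefinitions": [{
                        "Name": config["name"],
                        "Image": config["image"],
                        "Memory": config["memory"],
                        "PortMappings": [
                            {"ContainerPort": config["container_port"]},
                        ],
                    }],
                },
            },
            "Service": {
                "Type": "AWS::ECS::Service",
                "Properties": {
                    "Cluster": config["cluster"],
                    "DesiredCount": config["desired_count"],
                    # Ref ties the service to the task definition in this stack.
                    "TaskDefinition": {"Ref": "TaskDefinition"},
                },
            },
        },
    }
    return json.dumps(template, indent=2, sort_keys=True)
```

The returned text could be passed as `TemplateBody` to a CloudFormation `update_stack` call, which is where the rollback benefit on the slide comes from.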
  • 28. Tunnel vs. proxy vs. internet
      UAT on-premises endpoints are not on the public internet
      VPC/VPN: Reach internal on-premises endpoints using the VPN
          - VPN can flap; it must be monitored
          - VPN can become a bottleneck (unsuited for high traffic)
          - You need a special DNS configuration to use an on-premises DNS to resolve private domains
          + Works for us
  • 29. Tunnel vs. proxy vs. internet
      UAT on-premises endpoints are not on the public internet
      Proxies: Whitelist two Elastic IPs and allow traffic from these to reach protected endpoints
          - Need to manage proxies
          - Need to whitelist IPs in corporate firewall
          + Worked for other projects
      LIVE uses public endpoints on the Internet
  • 30. For Honor UAT block diagram (diagram labels: Backend ECS, Amazon CloudFront, S3, Faction War World State, ElastiCache (Redis), AWS Lambda, Amazon Elasticsearch Service, Game Clients, On-premises data center, Front-end ECS, Service Discovery ECS, Ancillary ECS, Supporting services, VPN Tunnel, HAProxy + Route 53)
  • 31. Looking back after a break
      Core beliefs: Automation, Managed infrastructure, Development ease, Fail fast! Succeed consistently!
  • 32. Lessons learned
      • Everything manual is a risk
      • Break-even point of automation changes over time
      • Validate all changes in noncritical but identical setups
      • AWS CloudFormation can have a mind of its own
      • Service containers are hard to manage (even with placement constraints)
      • Surprising gaps in offerings: Lambdas can duct tape a lot of features cheaply
          • Cheap in dev and operations
          • Invest in Lambda CI/CD tooling; it can get messy
      • Use managed services (Elasticsearch, ElastiCache, SQS, Lambda, etc.)