
GAM307_Ubisoft How For Honor Runs Using Amazon ECS

This session covers how the team at Ubisoft evolved For Honor's infrastructure using Amazon ECS and supporting systems (Amazon CloudFront, Amazon ElastiCache, Amazon Elasticsearch Service, Amazon SQS, and AWS Lambda, with monitoring through DataDog) from a proof of concept to an infrastructure-as-code solution. The team shares war stories about supporting both internal and live environments, and the challenges of bridging cloud and on-premises systems.


  1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. AWS re:Invent • Ubisoft: How For Honor Runs Using Amazon ECS • Ralf Mueller, Louis-Michel Gélinas • November 27, 2017 • GAM307
  2. Introductions • Ralf Mueller, Online Technical Architect, For Honor, Ubisoft Montréal • Louis-Michel Gélinas, DevOps Team Lead, Game Online Operations, Ubisoft Montréal • Special thanks to our teams!
  4. First closed alpha for Ubisoft • Biggest open beta for Ubisoft: 6 million players
  5. For Honor “Tribute” trailer
  6. For Honor: The Journey • Trailblaze! • The core beliefs • When to beautify! • Bridges and tunnels
  7. Fail fast! Succeed consistently!
  8. Fail fast! Succeed consistently!
  9. The For Honor core beliefs • Fail fast! Succeed consistently! • Development ease • Automation • Managed infrastructure
  10. A beginning • Limited cloud experience • Desire to leverage cloud advantages (elasticity, managed services) • Buy-in from the project • Limited support from internal partners • Small team with other tasks • No option of full continuous delivery because of console constraints • On-premises systems not ready to interact with off-premises systems
  11. For Honor production block diagram [diagram; components shown: game clients, Amazon CloudFront, Application Load Balancer, S3, front-end ECS, backend ECS, service discovery ECS, ancillary ECS (supporting services), Faction War World State, ElastiCache (Redis), AWS Lambda, Amazon Elasticsearch Service, on-premises DC] • All traffic over the public Internet
  12. Getting in shape: “Hello World” service • Play application • AWS Elastic Beanstalk • Couchbase data layer • Provisioning using a shell script • Validation of the tech and methods

      # Create Elastic Beanstalk environment template
      aws elasticbeanstalk create-configuration-template \
        --application-name ${APPLICATION_NAME} \
        --template-name ${APPLICATION_NAME}-template \
        --solution-stack-name "64bit Amazon Linux 2014.09 v1.2.0 running Docker 1.3.3" \
        --option-settings \
          OptionName=InstanceType,Namespace=aws:autoscaling:launchconfiguration,Value=${INSTANCE_TYPE} \
          OptionName=IamInstanceProfile,Namespace=aws:autoscaling:launchconfiguration,Value=aws-elasticbeanstalk-ec2-role \
          OptionName=EC2KeyName,Namespace=aws:autoscaling:launchconfiguration,Value=${SSH_KEY_NAME} \
          OptionName=EnvironmentType,Namespace=aws:elasticbeanstalk:environment,Value=${ENVIRONMENT_TYPE} \
          OptionName=VPCId,Namespace=aws:ec2:vpc,Value=vpc-58f2bb3d \
          OptionName=Subnets,Namespace=aws:ec2:vpc,Value=subnet-9739b2e0 \
          OptionName=AssociatePublicIpAddress,Namespace=aws:ec2:vpc,Value=false

      # Create Elastic Beanstalk environment from template
      aws elasticbeanstalk create-environment \
        --application-name ${APPLICATION_NAME} \
        --environment-name ${APPLICATION_NAME}-${ENVIRONMENT_NAME} \
        --template-name ${APPLICATION_NAME}-template \
        --cname-prefix ${APPLICATION_NAME}-${ENVIRONMENT_NAME}

      # Wait a little moment for Amazon to process environment creation request
      sleep 300; # should be fixed with proper status checks through AWS API

      # Modify associated security group to restrict access to the newly created application
      AWS_SECURITY_GROUP=$(aws ec2 describe-security-groups \
        --filters Name=tag-value,Values=${APPLICATION_NAME}-${ENVIRONMENT_NAME} \
        | awk '/GroupId/ {gsub("\"", ""); print $2}')
      aws ec2 authorize-security-group-ingress --group-id ${AWS_SECURITY_GROUP} --protocol tcp --port 22 --cidr xxx.xxx.xxx.0/20
      aws ec2 authorize-security-group-ingress --group-id ${AWS_SECURITY_GROUP} --protocol tcp --port 80 --cidr xxx.xxx.xxx.0/20
      aws ec2 authorize-security-group-ingress --group-id ${AWS_SECURITY_GROUP} --protocol tcp --port
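The script's own comment notes that the fixed `sleep 300` should be replaced by proper status checks. A minimal sketch of such a check follows, in Python since the team's later tooling is described as Python scripts. The helper name `wait_for_status` and its parameters are hypothetical and not from the talk; `get_status` is assumed to wrap whatever call reports the environment state (for Elastic Beanstalk, the `Status` field of `describe-environments`, which reads `Launching` and then `Ready`).

```python
import time
from typing import Callable


def wait_for_status(
    get_status: Callable[[], str],
    wanted: str = "Ready",
    timeout_s: float = 600.0,
    interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll get_status() until it returns `wanted` or the timeout expires.

    Returns True if the wanted status was observed, False on timeout.
    `sleep` and `clock` are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if get_status() == wanted:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Compared with a fixed sleep, polling returns as soon as the environment is ready and fails explicitly when it never becomes ready, instead of continuing with the security group changes regardless.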
  13. A dead end • Play framework not a good fit • Elastic Beanstalk too managed • Couchbase not a good fit [Core beliefs: Automation • Managed infrastructure • Development ease • Fail fast! Succeed consistently!]
  14. Planning the route • GO! 1.5 years before launch • Two non-mission-critical services: Faction War, Player information enrichment (PIE) • Immutable Docker images • Minimize resources for development: run everything local, namespace databases • Multiple UAT environments • Scale out for production
  15. Setting out at dawn: Vertical slice • Manual AWS CloudFormation setup for basics: VPC, ECS clusters, security groups, ElastiCache instances, Elasticsearch clusters • ECS task and service management using Python scripts: emulates a human running aws-cli commands
  16. Setting out at dawn: First success retrospective • Manual AWS CloudFormation setup for basics - Depends on documentation - Fear-driven opposition to change - Manual tweaks untracked • ECS task and service management using Python scripts - Scripted parts depend on manual setup - Hard to orchestrate (no rollback, golden path only) [Core beliefs: Automation • Managed infrastructure • Development ease • Fail fast! Succeed consistently!]
  17. The last mile • Monitoring and alerting with DataDog • Logs and metrics • Load tests • Track key KPIs
  18. From trail to autobahn • ECS boot sequence fragility • 600 ALBs are not practical! • Instance cycling automation • Automate AWS CloudFormation • Proxies vs. tunnels vs. Internet
  19. ECS boot 101 [diagram: front-end ASG and backend ASG, each instance running an ECS agent]
  20. ECS boot sequence: Registration and autokill
      1. yum install -y jq aws-cli
      2. Get instance details
      3. Register instance in OpsWorks
      4. Set up ECS and start agent
      • All steps above can fail
      • Retries with timeout
      • After 5 minutes: auto-terminate
        - Step one must not fail!
        - We scan clusters: Is ASG instance count equal to cluster instance count?
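The retry-then-autokill behaviour and the cluster scan on this slide can be sketched as follows. This is an illustration under stated assumptions, not the team's boot script: the function names, the retry delay, and the idea of passing each boot step as a callable are hypothetical. Only the 5-minute limit and the ASG-versus-cluster comparison come from the slide.

```python
import time
from typing import Callable, Iterable, Sequence, Set


def run_boot_steps(
    steps: Sequence[Callable[[], None]],
    terminate: Callable[[], None],
    deadline_s: float = 300.0,      # "After 5 minutes: auto-terminate"
    retry_delay_s: float = 5.0,     # hypothetical pause between retries
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Run each step in order, retrying failures until a shared deadline.

    If the deadline passes while a step is still failing, call terminate()
    and return False. Return True once every step has succeeded.
    """
    deadline = clock() + deadline_s
    for step in steps:
        while True:
            try:
                step()
                break
            except Exception:
                if clock() >= deadline:
                    terminate()
                    return False
                sleep(retry_delay_s)
    return True


def unregistered_instances(
    asg_instance_ids: Iterable[str], cluster_instance_ids: Iterable[str]
) -> Set[str]:
    """Instances the ASG believes it has but that never joined the ECS cluster."""
    return set(asg_instance_ids) - set(cluster_instance_ids)
```

The external scan matters because an instance whose first step fails may not have the tools to terminate itself; comparing the two instance lists from outside catches that case.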
  21. Not all are created equal
      UAT: 100s of ECS services in 2–5 clusters • 100–150 environments (changing weekly) • HAProxy/Route 53 routing (- Single node, - Deployment latency)
      PROD: 10s of services on 2 clusters • 3 static environments (PS4, Xbox One, PC) • ALBs for scale and reliability (+ Multi-node, + Seamless deployments, - IP hungry)
  22. Not all are created equal: ALB in PROD, as opposed to HAProxy and Route 53 in UAT • 600+ ALBs not practical • HAProxy container running on every instance (OpsWorks provisioned) • Scan instance for running services every minute: check for new services, update Route 53 entries, update local HAProxy config • Route host header to local container port
  23. Not all are created equal

      frontend http-in
          timeout client 1m
          bind *:80 name http
          bind *:443 name https ssl crt /etc/ssl/cert/acme_com_cert.pem
          acl host_MGW-Team-HERO-PC-UAT-X hdr(host) -i MGW-Team-HERO-PC-UAT-X.acme.com
          use_backend MGW-Team-HERO-PC-UAT-X if host_MGW-Team-HERO-PC-UAT-X

      backend MGW-Team-HERO-PC-UAT-X
          balance leastconn
          timeout connect 10s
          timeout server 1m
          option httpclose
          option forwardfor
          cookie JSESSIONID prefix
          server a3dbf14d3a92 172.17.0.9:12551 cookie A check

      Repeat for each container on the host (1-40)
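The per-minute scan regenerates this configuration for every container found on the host. A minimal sketch of that rendering step is shown below. The `Container` type and `render_haproxy` function are hypothetical names introduced for illustration; the talk does not show the team's generator. The directives mirror the slide, and discovering containers and reloading HAProxy are out of scope here.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Container:
    service: str        # also used as the host-header prefix, as on the slide
    container_id: str   # short Docker container ID, used as the server name
    ip: str
    port: int


def render_haproxy(
    containers: Iterable[Container],
    domain: str = "acme.com",
    cert: str = "/etc/ssl/cert/acme_com_cert.pem",
) -> str:
    """Render one frontend plus one backend per container, keyed on host header."""
    front: List[str] = [
        "frontend http-in",
        "    timeout client 1m",
        "    bind *:80 name http",
        f"    bind *:443 name https ssl crt {cert}",
    ]
    back: List[str] = []
    # Sort so the output is stable and config diffs only show real changes.
    for c in sorted(containers, key=lambda c: c.service):
        front.append(f"    acl host_{c.service} hdr(host) -i {c.service}.{domain}")
        front.append(f"    use_backend {c.service} if host_{c.service}")
        back += [
            "",
            f"backend {c.service}",
            "    balance leastconn",
            "    timeout connect 10s",
            "    timeout server 1m",
            "    option httpclose",
            "    option forwardfor",
            "    cookie JSESSIONID prefix",
            f"    server {c.container_id} {c.ip}:{c.port} cookie A check",
        ]
    return "\n".join(front + back) + "\n"
```

Calling `render_haproxy([Container("MGW-Team-HERO-PC-UAT-X", "a3dbf14d3a92", "172.17.0.9", 12551)])` produces a configuration with the same directives as the slide's example.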
  24. Security updates [flow diagram; labels shown: Terminating, Amazon SNS, Lambda function, Set to draining, Check, Reinvoke, Complete hook]
  25. Security updates: Lambdas fill gaps in offering • Tag instances with information needed (cluster name) • Sleep Lambda before re-invoking (or suffer throttling) • Don't repeat calls; add results to SNS message • Inspiration came from this AWS blog post: https://aws.amazon.com/blogs/compute/how-to-automate-container-instance-draining-in-amazon-ecs/
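The drain-check-reinvoke loop from these two slides can be sketched as below. This is an illustration loosely following the approach in the linked blog post, not the team's Lambda. The clients are passed in so the logic can be exercised without AWS access; in a real function they would be the boto3 `ecs`, `sns`, and `autoscaling` clients. The message keys `Cluster`, `ContainerInstanceArn`, and `DrainRequested` are hypothetical; they stand for results computed once and carried in the SNS message so that later invocations do not repeat the lookups.

```python
import json
import time
from typing import Any, Callable, Dict


def handle_termination(
    message: Dict[str, Any],
    ecs: Any,
    sns: Any,
    autoscaling: Any,
    topic_arn: str,
    pause_s: float = 10.0,  # hypothetical pause; the slide only says to sleep
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Drain a terminating container instance, then complete the lifecycle hook."""
    cluster = message["Cluster"]
    instance_arn = message["ContainerInstanceArn"]

    # Request draining only once; remember that in the message itself.
    if not message.get("DrainRequested"):
        ecs.update_container_instances_state(
            cluster=cluster, containerInstances=[instance_arn], status="DRAINING"
        )
        message = {**message, "DrainRequested": True}

    tasks = ecs.list_tasks(cluster=cluster, containerInstance=instance_arn)["taskArns"]
    if tasks:
        # Tasks still running: pause to avoid API throttling, then re-invoke
        # by publishing the enriched message back to the same topic.
        sleep(pause_s)
        sns.publish(TopicArn=topic_arn, Message=json.dumps(message))
        return "reinvoked"

    # Instance is empty: let the Auto Scaling group finish terminating it.
    autoscaling.complete_lifecycle_action(
        LifecycleHookName=message["LifecycleHookName"],
        AutoScalingGroupName=message["AutoScalingGroupName"],
        LifecycleActionResult="CONTINUE",
        InstanceId=message["EC2InstanceId"],
    )
    return "completed"
```

Carrying state in the message keeps each invocation short and stateless, which suits Lambda's execution limits.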
  26. Automate AWS CloudFormation stacks • Manual is scary • Store AWS CloudFormation stacks in Git • GitLab CI jobs triggering updates • Benefit from stack updates (rollback) • Promote changes from DEV toward PROD safely
  27. Automate AWS CloudFormation stacks • ECS services and tasks as AWS CloudFormation stacks • Python code generates stack from template and configs • Triggers stack update • Benefit from stack updates (rollback)
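As a rough illustration of "Python code generates stack from template and configs", the sketch below turns a small per-service config into a CloudFormation template body. The config keys (`name`, `image`, `port`, `cluster`, `memory_mib`, `desired_count`) and the function name are hypothetical; the talk does not show the team's schema. The resource types and properties are standard CloudFormation ones.

```python
import json
from typing import Any, Dict


def render_service_stack(cfg: Dict[str, Any]) -> str:
    """Build a CloudFormation template (JSON text) for one ECS service."""
    name = cfg["name"]
    template = {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": f"ECS service {name}",
        "Resources": {
            "TaskDefinition": {
                "Type": "AWS::ECS::TaskDefinition",
                "Properties": {
                    "Family": name,
                    "ContainerDefinitions": [
                        {
                            "Name": name,
                            "Image": cfg["image"],
                            "Memory": cfg.get("memory_mib", 512),
                            "PortMappings": [{"ContainerPort": cfg["port"]}],
                        }
                    ],
                },
            },
            "Service": {
                "Type": "AWS::ECS::Service",
                "Properties": {
                    "Cluster": cfg["cluster"],
                    "DesiredCount": cfg.get("desired_count", 1),
                    # Ref resolves to the task definition ARN at deploy time.
                    "TaskDefinition": {"Ref": "TaskDefinition"},
                },
            },
        },
    }
    # Sorted keys keep the generated file stable, so Git diffs stay readable.
    return json.dumps(template, indent=2, sort_keys=True)
```

The returned text could then be committed to Git and passed as the template body of a stack update, so a failed deployment rolls back through CloudFormation rather than through custom script logic.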
  28. Tunnel vs. proxy vs. internet • UAT on-premises endpoints are not on the public internet • VPC/VPN: Reach internal on-premises endpoints using the VPN - VPN can flap; it must be monitored - VPN can become a bottleneck (unsuited for high traffic) - You need a special DNS configuration to use an on-premises DNS to resolve private domains + Works for us
  29. Tunnel vs. proxy vs. internet • UAT on-premises endpoints are not on the public internet • Proxies: Whitelist two Elastic IPs and allow traffic from these to reach protected endpoints - Need to manage proxies - Need to whitelist IPs in corporate firewall + Worked for other projects • LIVE uses public endpoints on the Internet
  30. For Honor UAT block diagram [diagram; components shown: game clients, Amazon CloudFront, S3, HAProxy + Route 53, front-end ECS, backend ECS, service discovery ECS, ancillary ECS (supporting services), Faction War World State, ElastiCache (Redis), AWS Lambda, Amazon Elasticsearch Service, VPN tunnel, on-premises data center]
  31. Looking back after a break [Core beliefs: Automation • Managed infrastructure • Development ease • Fail fast! Succeed consistently!]
  32. Lessons learned • Everything manual is a risk • Break-even point of automation changes over time • Validate all changes in noncritical but identical setups • AWS CloudFormation can have a mind of its own • Service containers are hard to manage (even with placement constraints) • Surprising gaps in offerings: Lambdas can duct tape a lot of features cheaply (cheap in dev and operations) • Invest in Lambda CI/CD tooling; it can get messy • Use managed services (Elasticsearch, ElastiCache, SQS, Lambda, etc.)