Building a World in the Clouds: MMO Architecture on AWS (MBL304) | AWS re:Invent 2013

Can you really build the infrastructure required to bring a massively multiplayer online game (MMO) to life in the cloud? This session discusses the evolution of Red 5 Studios' Firefall, a free-to-play MMO. Firefall runs entirely on the AWS platform and allows players from around the world to play together in the cloud. The session covers some of the design decisions made over the last two years: the things that worked well and the things that did not. The session also presents some of the solutions Red 5 implemented to ease the transition from dedicated data center hardware to virtual servers in AWS.

Presentation Transcript

  • Building a World in the Clouds: MMO Architecture on AWS
    Jeffrey Berube, Director of Technical Operations, Red 5 Studios, Inc.
    November 15, 2013
    © 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
  • Overview
    • What is Firefall?
    • Why build in the cloud?
    • Infrastructure Goals
    • Evolution of the Platform
    • Tools for Success
  • What is Firefall?
    • Free-to-play cooperative open world shooter
    • “Shardless” world
    • Instance-based maps
    • Both persistent and transient map types
    • Possible for many instances of a map to exist at the same time
  • Why?
  • Why build in the cloud?
    • Players are unpredictable
      – Forecasts can be (and usually are) wrong
        • Too little hardware, too many players
          – Bad for everyone
        • Too much hardware, too few players
          – Good for players (sort of) but bad for the business
          – What if they don’t stick around?
    • Developers are unpredictable
      – Active development has risks
        • Performance can change drastically
        • New services can “appear” the day of the patch
      – MMOs are ALWAYS being actively developed!
        • (If you want to be successful…)
    • Cyclical player behavior opens up opportunities for significant cost savings
  • Player Graph (36 Hours)
  • Player Graph with Server Overlay
  • Player Graph with Efficient Server Overlay: 35%–37.5% Savings
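The savings on the last graph come from sizing the server fleet to follow the daily player cycle instead of holding peak capacity around the clock. A minimal sketch of that arithmetic follows. The hourly player curve, the fixed per-server capacity, and the function names are all hypothetical and do not come from the talk, so the result is not expected to reproduce the 35%–37.5% figure.

```python
import math


def server_hours(players_per_hour, players_per_server, headroom=0.0):
    """Return (static, elastic) server-hours for one demand curve.

    static  = provision for the peak hour, all day long
    elastic = provision each hour for that hour's demand
    All inputs are illustrative assumptions, not Firefall data.
    """
    needed = [
        max(1, math.ceil(p * (1 + headroom) / players_per_server))
        for p in players_per_hour
    ]
    static = max(needed) * len(needed)
    elastic = sum(needed)
    return static, elastic


def savings_fraction(players_per_hour, players_per_server, headroom=0.0):
    """Fraction of server-hours saved by following the demand curve."""
    static, elastic = server_hours(players_per_hour, players_per_server, headroom)
    return 1 - elastic / static


if __name__ == "__main__":
    # Hypothetical four-point cycle: quiet, ramp-up, peak, ramp-down.
    curve = [100, 200, 400, 200]
    print(server_hours(curve, players_per_server=100))
    print(savings_fraction(curve, players_per_server=100))
```

The flatter the player curve, the smaller the saving, which is why the cyclical behavior shown in the graphs matters.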
  • Infrastructure Goals
    • Deployment and Recovery: How do we make site management better?
      – Expansion
      – Scalability
      – Disaster Recovery
      – Self-Healing
    • Platform: How can the platform make the player experience better?
      – Downtime
      – Player Mobility
  • Infrastructure Goals: Deployment and Recovery
  • Deployment and Recovery Goals
    • Quick regional expansion
      – Traditionally, expansion is a multi-month process
        • Contracts, purchase and shipping, installation, etc.
      – Today, adding a region takes about a week
        • Additional improvements are in the works
    • On-demand scalability
      – Automated scale up and down without* limits
        • *The desired instance sizes may not always be available, however
    • Disaster recovery with minimal downtime
      – Traditionally, DR sites are expensive and are not always properly maintained
      – Our goal is to automate disaster recovery safely
        • We do a lot manually at present
    • Self-healing
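The caveat that desired instance sizes may not always be available suggests that automated scaling needs a fallback order. The sketch below shows one way to express that. The `launch` callable, the `CapacityUnavailable` exception, and the size names are hypothetical stand-ins rather than an AWS API or anything described in the talk.

```python
class CapacityUnavailable(Exception):
    """Raised by the (hypothetical) launcher when a size cannot be provisioned."""


def launch_with_fallback(launch, preferred_sizes):
    """Try each instance size in order of preference.

    `launch` is any callable that takes a size name and either returns a
    server handle or raises CapacityUnavailable. Returns (size, handle).
    """
    tried = []
    for size in preferred_sizes:
        try:
            return size, launch(size)
        except CapacityUnavailable:
            tried.append(size)
    raise CapacityUnavailable("no capacity for any of: " + ", ".join(tried))


def fake_launch(size):
    """Stand-in for a real provisioning call: pretend 'large' is sold out."""
    if size == "large":
        raise CapacityUnavailable(size)
    return "server-on-" + size


if __name__ == "__main__":
    print(launch_with_fallback(fake_launch, ["large", "medium", "small"]))
```

Passing the launcher in as a parameter keeps the fallback logic testable without touching a real cloud account.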
  • Infrastructure Goals: Platform
  • Platform Goals
    • Zero downtime game updates
      – Blue-Green deployment
      – Doesn’t preclude scheduled maintenance
        • Some things are more safely done offline
    • Players can play globally without restrictions
      – Characters won’t be held hostage
        • Player data is available everywhere they want to be
        • We don’t charge players to play with their friends
      – Prefer closest healthy region, however
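The "prefer closest healthy region" rule can be sketched as a small selection function. This is only an illustration: the latency figures, the health set, and the function name are assumptions, and the talk does not say how Red 5 actually measures proximity or health.

```python
def pick_region(latency_ms, healthy):
    """Return the healthy region with the lowest measured latency.

    latency_ms: dict mapping region name -> round-trip time in milliseconds
    healthy:    set of region names currently passing health checks
    Raises LookupError when no healthy region is known.
    """
    candidates = [region for region in latency_ms if region in healthy]
    if not candidates:
        raise LookupError("no healthy region available")
    return min(candidates, key=lambda region: latency_ms[region])


if __name__ == "__main__":
    # Hypothetical measurements for a player on the US west coast.
    latency = {"us-west-2": 25, "us-east-1": 80, "eu-west-1": 160}
    print(pick_region(latency, healthy={"us-west-2", "us-east-1", "eu-west-1"}))
    # If the nearest region is unhealthy, the player falls through to the next one.
    print(pick_region(latency, healthy={"us-east-1", "eu-west-1"}))
```

Because player data is available everywhere, falling through to a farther region costs latency but not access to the character.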
  • Evolution of the Platform
  • The Beginning (March 2011)
    [Architecture diagram]
    • Region: US-West-1 (Availability Zone: b)
    • Diagram labels: INET, CORP, Outside-Game
    • AWS services in use: Elastic Compute Cloud (EC2)
  • Alpha (October 2011)
    [Architecture diagram]
    • Region: US-West-1 (Availability Zone: b)
    • Network zones: Outside-Game, Outside-LB, Inside-LB, Inside-Game, Inside-Core, Inside-App, Inside-DB
    • Diagram labels: INET, Operator, CORP, AWS ELB, Mx, HP, PvP, MM, MD, MCP, Chef, Ic, I, C, Ar, Log, U, Ad
    • AWS services in use: Elastic Compute Cloud (EC2), Simple Storage Service (S3), Elastic Load Balancing (ELB), Simple Queue Service (SQS)
  • Closed Beta (April 2012)
    [Architecture diagram]
    • Region: US-West-1 (Availability Zones: b and c)
    • Network zones: Outside-Game, Outside-LB, Inside-LB, Inside-Game, Inside-Core, Inside-App
    • Diagram labels: INET, Operator, CORP, AWS ELB, AWS RDS, OW, PvP, HP, MM, MD, MCP, Chef, I, Ic, Gr, Log, C, U, Ar, Ad, S
    • AWS services in use: Elastic Compute Cloud (EC2), Simple Storage Service (S3), Elastic Load Balancing (ELB), Simple Queue Service (SQS), Relational Database Service (RDS), ElastiCache
  • Gamescom (August 2012)
    [Architecture diagram]
    • Regions: EU-West-1, US-East-1, US-West-2 (Availability Zone: b)
    • Network zones: Outside-Game, Outside-LB, Inside-Game, Inside-AppTasks, Inside-Core, Inside-App, Inside-DB, VPC
    • Diagram labels: INET, Operator, CORP, HQ, AWS ELB, OW, NPE, PvE, PvP, HP, Task, MD, MM, MCP, Chef, Log, Ic, Gr, C, U, Ar, L, In, Ad, Co, S, A, P, W, I
    • AWS services in use: Elastic Compute Cloud (EC2), Simple Storage Service (S3), Elastic Load Balancing (ELB), Simple Queue Service (SQS), Relational Database Service (RDS), ElastiCache, CloudFront, Virtual Private Cloud (VPC)
  • Open Beta (July 2013)
    [Architecture diagram]
    • Regions: EU-West-1, US-East-1, US-West-2 (Availability Zone: b)
    • Network zones: Outside-Game, Outside-LB, Inside-LB, Inside-Game, Inside-AppTasks, Inside-Core, Inside-App, Inside-DB, Inside-Search, VPC
    • Diagram labels: INET, Operator, CORP, HQ, Ops, AWS ELB, ES, OW, NPE, PvE, PvP, HP, Task, MD, MM, MCP, Chef, Log, Ic, Gr, C, U, Ar, L, In, Ad, Co, S, A, P, W, I
    • AWS services in use: Elastic Compute Cloud (EC2), Simple Storage Service (S3), Elastic Load Balancing (ELB), Simple Queue Service (SQS), CloudFront, Virtual Private Cloud (VPC), Elastic MapReduce (EMR)
  • Today (November 2013)
    [Architecture diagram]
    • Regions: EU-West-1, US-East-1, US-West-2, AP-NorthEast-1, SA-East-1 (Availability Zone: b)
    • Network zones: Outside-Game, Outside-LB, Inside-LB, Inside-Game, Inside-AppTasks, Inside-Core, Inside-App, Inside-DB, Inside-Search, VPC
    • Diagram labels: INET, Operator, CORP, HQ, Ops, AWS ELB, ES, OW, NPE, PvE, PvP, HP, Task, MD, MM, MCP, Chef, Log, Ic, Gr, C, U, Ar, L, In, Ad, Co, S, A, P, W, I
    • AWS services in use: Elastic Compute Cloud (EC2), Simple Storage Service (S3), Elastic Load Balancing (ELB), Simple Queue Service (SQS), CloudFront, Virtual Private Cloud (VPC), Elastic MapReduce (EMR)
  • Tools for Success
  • Third-party Tools
    • Opscode Chef
    • collectd
    • Icinga (Nagios)
    • Graphite
    • Graylog2
    • HAProxy
    • Keepalived
    • elasticsearch
    • Others (Bluepill, Thin, memcached, RabbitMQ, etc.)
  • Third-party Services
    • Dyn
      – Global Load Balancing
      – DNS Failover
    • PagerDuty
      – On-call scheduling
      – Phone calls!
    • Pingdom
      – Transaction monitoring
    • Duo Security
  • Internal Tools
    • Architect
      – Everything heartbeats
      – Service-specific data aggregation
    • Cartographer
      – Builds new game server stacks
      – Replaces failed game server components
      – Scales up (or down) the servers within a pool depending on player demand
    • Dashboards (Everywhere)
      – Multiple data sources
        • Graphite
        • Production databases
        • Business Intelligence databases
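The slides describe Architect and Cartographer only at the bullet level, so the following is a hypothetical sketch of how heartbeats and player demand could feed a single reconciliation step. It is not the actual behavior of either tool. Server names, the timeout, the per-server capacity, and the headroom factor are all assumptions.

```python
import math


def stale_servers(last_heartbeat, now, timeout_s):
    """Names of servers whose last heartbeat is older than timeout_s seconds."""
    return sorted(
        name for name, seen in last_heartbeat.items() if now - seen > timeout_s
    )


def desired_pool_size(players, players_per_server, headroom=0.2, min_servers=1):
    """Servers needed for the current player count plus spare headroom."""
    return max(min_servers, math.ceil(players * (1 + headroom) / players_per_server))


def plan(last_heartbeat, now, timeout_s, players, players_per_server):
    """Decide which servers to replace and how far to grow or shrink the pool.

    Failed servers are replaced one for one, so 'scale_by' is measured
    against the pool size after replacement (the current number of entries).
    """
    failed = stale_servers(last_heartbeat, now, timeout_s)
    target = desired_pool_size(players, players_per_server)
    return {"replace": failed, "scale_by": target - len(last_heartbeat)}


if __name__ == "__main__":
    heartbeats = {"gs-1": 100, "gs-2": 40, "gs-3": 95}  # seconds, hypothetical
    print(plan(heartbeats, now=105, timeout_s=30, players=450, players_per_server=100))
```

A positive `scale_by` means the pool should grow and a negative value means it can shrink, which mirrors the scale up (or down) behavior described for Cartographer.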