Building Scalable Cloud Tech for Mobile / Social Games


For companies looking to build (or buy) cloud tech for mobile + social games. Based on experience gained developing the bitHeads brainCloud platform.

As presented at GDC 2013 in San Francisco, and in May at OIGC in Ottawa.

For more information about brainCloud go to

Published in: Technology

  1. 1. Thursday, 28 March, 13
  2. 2. This session is for...
     • Developers looking to build or acquire cloud tech for their next game project
     • Stakeholders who want to confirm that their cloud tech is ready for deployment
  3. 3. Who am I?
     • Paul Winterhalder, CDO of bitHeads and playBrains
     • bitHeads
       • developers of custom cloud tech and mobile apps
       • worked on the back-ends for Trade Nations and The Simpsons: Tapped Out
     • playBrains
       • for-hire developer of mobile, social, and digital games
       • developer of Eggies, Sideway: New York, Jaws, Madballs, etc.
     • Product Manager for the brainCloud platform
  4. 4. Why is this important?
  5. 5. You don’t want to be a headline... (not this sort of headline anyway)
     • SimCity Launch marred by Server Problems - Forbes, March 5th, 2013
     • Overwhelmed Diablo 3 Servers crash on launch day - Digital Journal, May 15, 2012
     • Curiosity’s servers overwhelmed by number of people chiseling away at cube - Polygon, Nov 7, 2012
     • EA pulls Simpsons iOS Game [over server issues] - Edge, March 6, 2012
  6. 6. Server development is as different from client development as night and day.
  7. 7. Server development concerns...
     Concern     | Server                        | Mobile
     Scalability | thousands of concurrent users | One user
     Reliability | 24-7 uptime                   | Only while the user is using the app!
     Security    | !!!                           | OS concern
  8. 8. Why risk it? Why have server features at all?
     (Because they enhance the game for your users and improve your bottom line.)
  9. 9. Motivation: Key Gameplay
     The very nature of the game dictates that you must be online.
     • multiplayer
     • online worlds
     • server-based gameplay rules
  10. 10. Motivation: Deeper Engagement
      Let the player play how they want and when they want.
      • pick up and play from any device
      • all key info in the cloud:
        • game progress
        • player stats & xp
        • virtual currencies
  11. 11. Motivation: Social Reach
      Cloud tech can ease the burden of adding social features to your games.
      • invites
      • messages
      • gifts
      • leaderboards
      • tournaments
  12. 12. Motivation: Feature Parity
      Every platform is different - cloud tech can aid in delivering the same experience to all players.
      • leaderboards
      • achievements
      • async multiplayer
      • cloud saves
      • etc.
  13. 13. Motivation: Unite Communities
      Cloud tech can unite your player communities across platforms - and across IPs.
      • multiplatform libraries
      • cross-platform multiplayer
      • global stats
      • social identity
      • cross-platform virtual currencies
      • cross-IP identity and rewards
  14. 14. Motivation: Monetization
      Your cloud solution will be a key part of your monetization strategy...
      • cross-platform virtual currencies
      • economy tuning
      • sales & promotions
      • aggregate ads
      • analytics
  15. 15. Motivation: Improved iteration!
      Today’s games must iterate or die!
      • a/b testing
      • gameplay tuning
      • new quests / achievements
      • additional content (DLC)
  16. 16. Cloud-based features
      [Diagram: feature blocks grouped under the headings GAMIFICATION, MULTIPLAYER, SOCIAL, BASE, DATA, E-COMMERCE, and EVENTS. Blocks shown: Player XP & Levels, Missions + Quests, Achievements, Tournaments, Async MP, Identity Management, Leaderboards, Downloader, Global Stats, Deep Entity, Multi-currency, Client Versioning, Gifts & Challenges, Transaction Log, File Access, User Stats, Notifications, Sharding Strategy, E-commerce Integrations, Analytics, Friend Data, Events]
  17. 17. So how does cloud tech help?
      Cloud tech gives you easy access to key infrastructure - but you still need to construct a proper solution.
  18. 18. Key characteristics
      The National Institute of Standards and Technology’s definition of cloud computing identifies “five essential characteristics”:
      • On-demand self-service - administer yourself via the web
      • Broad network access - ubiquitous access via the greater web
      • Resource pooling - shared resources - multi-tenant
      • Rapid elasticity - can scale up and down with demand
      • Measured service - pay for what you use
  19. 19. 3 levels of cloud computing
      • Software as a Service (SaaS) - you access software in the cloud, but don’t need to manage any of it... just use the APIs.
      • Platform as a Service (PaaS) - you rent a development environment that has been specifically engineered with built-in services for scalability, rapid deployment, automated elasticity, etc.
      • Infrastructure as a Service (IaaS) - you rent virtualized hardware in the cloud, and are fully in control of the software that runs upon it.
  20. 20. Cloud services - examples
      • SaaS: Flurry, iCloud, Facebook, brainCloud
      • PaaS:
        • Amazon Elastic Beanstalk (Java, PHP, Python, .NET, and Node.js)
        • Microsoft Azure Cloud Services (Java, Python, Node.js, .NET)
        • Google App Engine (Java, Python, Go, etc.)
      • IaaS: Amazon EC2, Google Compute Engine, Microsoft Azure Virtual Machines
  21. 21. DESIGNING your solution
  22. 22. It’s all about the DATA.
      In the end, it all comes down to how efficiently and reliably your games can store and retrieve data to/from the cloud.
  23. 23. Rise of the NoSQL database
      • Non-relational
      • Store documents as key-value pairs
      • Most are not ACID (BASE instead)
      • Simpler = higher performance
      • Easier to shard
      • Recognize that most queries are about a single player - not across the whole domain of the db
      http://swrpgcronicas.blogspot.com
  24. 24. Database Sharding
      • Non-sharded - AppServers 1-3 all talk to a single NonSharded DB. Not scalable!
      • Vertical sharding - each shard holds a different table (Shard 1: User Profiles, Shard 2: Leaderboards, Shard 3: Users Stats). Simple, but not optimal. Problems arise when one table ends up larger / higher traffic than the others...
      • Horizontal sharding - each shard holds a range of users (Shard 1: Users A-J, Shard 2: Users K-O, Shard 3: Users P-Z). Most scalable!
      Note* - for optimal results you may need to manually implement your own horizontal sharding.
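A manually implemented horizontal shard lookup could look roughly like the sketch below. It assumes the A-J / K-O / P-Z split from the slide; the shard names, the `shard_for_user` function, and the rule that sends non-letter usernames to the last shard are illustrative assumptions, not part of the original talk.

```python
import bisect

# Hypothetical shard layout mirroring the slide: A-J, K-O, P-Z.
# Each entry is (last letter covered by the shard, shard name).
SHARD_RANGES = [("J", "shard1"), ("O", "shard2"), ("Z", "shard3")]
_UPPER_BOUNDS = [upper for upper, _ in SHARD_RANGES]


def shard_for_user(username: str) -> str:
    """Return the name of the shard that stores the given user."""
    if not username:
        raise ValueError("username must be non-empty")
    first = username[0].upper()
    if not ("A" <= first <= "Z"):
        # Assumption: names that do not start with A-Z go to the last shard.
        return SHARD_RANGES[-1][1]
    # Find the first range whose upper bound is >= the first letter.
    index = bisect.bisect_left(_UPPER_BOUNDS, first)
    return SHARD_RANGES[index][1]
```

Range-based splits like this are easy to reason about but can become uneven if usernames cluster on a few letters; hashing the user ID is a common alternative.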
  25. 25. In the land of NoSQL, MongoDB is king.
      Relative popularity of NoSQL database skills, based on LinkedIn skills searches. Did not include a search on BigTable. - blogs.the451group.com
  26. 26. Application Servers
  27. 27. App Servers
      • Build on a PaaS solution if possible - gives you load balancing, elasticity
      • AppServers ideally* should be stateless
      • Use Memcached to cache session state
      • Also consider caching static reference data (like game tuning, rules, etc.) in each server
      [Diagram: Client(s), Load Balancer(s), App Server(s), MongoDB, and Memcached, with a PaaS label]
  28. 28. Communications
      HTTP-based communication is the way to go - nothing else gets through the myriad firewalls and routers that make up the web.
  29. 29. Build a Comms Manager
      • Abstracts communications
      • Responsible for serializing / de-serializing messages *
      • Handles HTTP 503 (service unavailable) errors. Automatic escalating retries.
      • Can queue requests (bundling them)
      • Callback interface to the app for responses
      [Diagram: Client App -> Client Comms Manager -> Application Server]
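The escalating-retry behaviour of a Comms Manager might be sketched as follows. The `send_with_retries` name, the doubling delay schedule, and the injectable `send` and `sleep` callables are assumptions made for illustration; the talk does not specify how retries escalate.

```python
import time
from typing import Callable, Tuple

SERVICE_UNAVAILABLE = 503


def send_with_retries(
    send: Callable[[bytes], Tuple[int, bytes]],
    payload: bytes,
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, bytes]:
    """Call send(payload); on HTTP 503, wait and retry with doubling delays.

    `send` is a hypothetical transport hook returning (status, body).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delay = base_delay
    for _ in range(max_attempts - 1):
        status, body = send(payload)
        if status != SERVICE_UNAVAILABLE:
            return status, body
        sleep(delay)   # back off before the next attempt
        delay *= 2     # escalate: 0.5s, 1s, 2s, ...
    # Final attempt: return whatever the server says, even if it is a 503.
    return send(payload)
```

Passing `sleep` in as a parameter keeps the retry logic testable without real waiting.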
  30. 30. Sequence numbers
      • Client
        • Add sequence numbers to all messages sent from client to server
        • If the client doesn’t receive a response in time, simply retry
      • Server
        • Use the # to guarantee in-order processing and discard duplicates
        • Saves the last response sent to the client.
        • If it receives a duplicate request, it doesn’t process it, but returns the saved response again.
      Bonus: Helps prevent replay attacks!
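The server side of this scheme can be sketched as below, under the assumption that sequence numbers start at 1 and increase by exactly one per message. The `SequencedSession` class and its `handler` callback are hypothetical names, and a real server would keep one such object per client session.

```python
from typing import Callable, Optional


class SequencedSession:
    """Per-client server state: last sequence number seen and its response."""

    def __init__(self, handler: Callable[[str], str]) -> None:
        self._handler = handler          # hypothetical request processor
        self._last_seq = 0               # assumption: first message uses seq 1
        self._last_response: Optional[str] = None

    def handle(self, seq: int, request: str) -> str:
        if seq == self._last_seq and self._last_response is not None:
            # Duplicate (client retried): do not process again,
            # just return the saved response.
            return self._last_response
        if seq != self._last_seq + 1:
            # Stale or out-of-order message: refuse to process it.
            raise ValueError(f"unexpected sequence number {seq}")
        response = self._handler(request)
        self._last_seq = seq
        self._last_response = response
        return response
```

Because a retried request gets the saved response rather than a second execution, operations such as spending currency are applied once even when the network drops a reply.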
  31. 31. Message Formatting
      • JSON format is a good place to start (better than XML)
      • Human readable - and aligns with MongoDB
      • Can be compressed for more optimization
      • Go to formal options if you need the extra performance
      • Warning - formal options use more CPU on server
      Formal messaging options:
      • Google Protocol Buffers
      • Apache Thrift
      • Apache Avro
      Good comparison here:
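As a rough illustration of compressed JSON messaging, the sketch below serializes a message to compact JSON and compresses it with zlib from the Python standard library. The function names and the message shape are assumptions; the talk does not say which compression scheme was used.

```python
import json
import zlib


def encode_message(message: dict) -> bytes:
    """Serialize to compact JSON, then compress with zlib."""
    text = json.dumps(message, separators=(",", ":"), sort_keys=True)
    return zlib.compress(text.encode("utf-8"))


def decode_message(data: bytes) -> dict:
    """Reverse of encode_message: decompress, then parse the JSON."""
    return json.loads(zlib.decompress(data).decode("utf-8"))
```

Compression pays off mainly on larger, repetitive payloads such as bundled entity updates; very small messages can end up larger once the zlib header is added.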
  32. 32. Content Distribution Network (CDN)
      • Also called a Content Delivery Network
      • Distributed system of servers deployed throughout the world
      • Content physically closer to the user = FASTER!
      • Use for static files: media, configs, etc.
      • Popular CDNs include: Akamai, Amazon CloudFront, CacheFly, etc.
      Tip - to avoid caching issues, place all files under a <version> directory
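The versioned-directory tip can be sketched as a small URL helper. The `cdn_url` function and the example host are hypothetical; the point is only that every file path is prefixed with the build version, so a new release never collides with cached copies of the old one.

```python
from urllib.parse import quote


def cdn_url(base_url: str, version: str, path: str) -> str:
    """Build a CDN URL of the form <base>/<version>/<path>."""
    clean_base = base_url.rstrip("/")
    clean_path = quote(path.lstrip("/"))  # keeps "/" separators, escapes the rest
    return f"{clean_base}/{quote(version)}/{clean_path}"
```

For example, `cdn_url("https://cdn.example.com/", "1.4.2", "/configs/tuning.json")` yields `https://cdn.example.com/1.4.2/configs/tuning.json`.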
  33. 33. Putting it all together...
  34. 34. Minimal Amazon Example
      • Clients: Facebook (Flash), Mobile
      • Load Balancers: Amazon Elastic Load Balancer (ELB)
      • AppServers: Amazon Elastic Beanstalk
      • Data: NoSQL ReplicaSet 1 and ReplicaSet 2, plus Memcached1 and Memcached2, on Amazon Elastic Compute Cloud (EC2)
      • File Serving: Amazon S3 and/or CloudFront
      Cost $1500-2000 / month, should be able to serve ~50,000 DAU
  35. 35. Key feature examples
  36. 36. Feature: Deep Entity
      • For complex game data - like “ville” games
      • Storing the data in the cloud:
        • Lets players visit each other
        • Safeguards the player’s time investment
        • Lets the player play on multiple devices (not concurrently)
  37. 37. Implementation: Deep Entity
      • Data objects are stored in an entity collection
      • When the game starts, the entire set of entities is retrieved (downloaded)
      • As the player performs actions and operations, the client both:
        • updates the entities locally and...
        • sends updates for individual entities [i.e. creates, updates, deletes] to the server
      • If the connection is terminated, the user would only lose maybe a few updates
      [Sequence diagram between Client, Game Client, and Cloud:
       Start Game -> Retrieve Entities -> All Entities;
       Plant Tree -> Create tree locally -> New Entity (tree) -> Ack;
       Level Up Tree -> Update tree locally -> Update Entity (tree) -> Ack]
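The client-side flow described here (load everything at start, then apply each change locally and send it to the server) might look like the following sketch. The `EntityStore` class, the message shapes, and the `send` hook are hypothetical and are not the brainCloud API.

```python
from typing import Any, Callable, Dict, Iterable


class EntityStore:
    """Client-side cache of game entities that mirrors each change to the server."""

    def __init__(self, send: Callable[[Dict[str, Any]], None]) -> None:
        # `send` is a hypothetical hook that hands a message to the comms layer.
        self._send = send
        self.entities: Dict[str, Dict[str, Any]] = {}

    def load_all(self, entities: Iterable[Dict[str, Any]]) -> None:
        """Replace the local cache with the full set retrieved at game start."""
        self.entities = {entity["id"]: dict(entity) for entity in entities}

    def create(self, entity_id: str, data: Dict[str, Any]) -> None:
        entity = {"id": entity_id, **data}
        self.entities[entity_id] = entity                      # update locally first
        self._send({"op": "create", "entity": dict(entity)})   # then tell the server

    def update(self, entity_id: str, changes: Dict[str, Any]) -> None:
        self.entities[entity_id].update(changes)
        self._send({"op": "update", "id": entity_id, "changes": dict(changes)})

    def delete(self, entity_id: str) -> None:
        del self.entities[entity_id]
        self._send({"op": "delete", "id": entity_id})
```

Sending one small message per changed entity, rather than re-uploading the whole world, is what limits the loss to a few updates if the connection drops.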
  38. 38. Feature: Player Stats
      • Tracking of simple numeric stats on the server
      • Stats act as the underlying data structure of all the meta-systems - xp, achievements, quests
      • Allows concurrent access
  39. 39. Implementation: Player Stats
      • Each “stat” is declared on the server - with a declared data type, initial value, and known min/max
      • Client never *sets* the stat - sends commands increment, decrement, reset, etc.
      • Rules are processed to ensure the stat value maintains consistency
      • Implementation is further enhanced by the concept of StatsEvents (server-based stats macros)
      Energy example:
      • Client tells server to “Increment Hearts +1”
      • Server looks at Hearts value - sees that it’s 15 - and increments it to 16
      • Server then checks the rules - and sees that the Hearts max is supposed to be 15 - so it resets the value back.
      (Note - simplified example - energy actually uses the IncrementToMax operation in our game - we only restore to 5 hearts)
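The Hearts example can be sketched as a server-side stat that only accepts commands and re-applies its min/max rule after each one. The `Stat` class and its method names are illustrative assumptions; the clamp stands in for whatever rule processing the real platform performs.

```python
class Stat:
    """A server-declared numeric stat that clients change only via commands."""

    def __init__(self, name: str, initial: int, minimum: int, maximum: int) -> None:
        self.name = name
        self.initial = initial
        self.minimum = minimum
        self.maximum = maximum
        self.value = initial

    def _apply_rules(self) -> None:
        # Rule check: pull the value back inside the declared min/max.
        self.value = max(self.minimum, min(self.maximum, self.value))

    def increment(self, amount: int = 1) -> int:
        self.value += amount
        self._apply_rules()
        return self.value

    def decrement(self, amount: int = 1) -> int:
        self.value -= amount
        self._apply_rules()
        return self.value

    def reset(self) -> int:
        self.value = self.initial
        return self.value
```

With `hearts = Stat("Hearts", initial=15, minimum=0, maximum=15)`, calling `hearts.increment(1)` briefly produces 16 and the rule check brings it back to 15, matching the walkthrough on the slide.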
  40. 40. Feature: Async Multiplayer
      • Allows for a game match to be played between a list of players
      • The currently active player determines which player goes next (to allow for turnovers)
      • The next player should be notified when it’s their turn
  41. 41. Implementation: Async Multiplayer
      • The initiator of the game will own the match data - the others will have permission to manipulate it
      • The state of the match, including whose turn it is, is stored in the status entity
      • Only the player whose turn it is can manipulate the match (except for termination)
      • The full play-by-play of the match is stored separately in MatchHistory
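These turn rules can be sketched as a small match object. The `AsyncMatch` class, its field names, and the choice that the initiator takes the first turn are assumptions for illustration; only the ownership, status, turn-enforcement, and history ideas come from the slide.

```python
from typing import Any, Dict, List


class AsyncMatch:
    """Match state owned by the initiator; only the active player may act."""

    def __init__(self, owner: str, players: List[str]) -> None:
        if owner not in players:
            raise ValueError("owner must be one of the players")
        self.owner = owner
        self.players = list(players)
        # Status entity: assumption that the initiator takes the first turn.
        self.status: Dict[str, Any] = {"current_player": owner, "terminated": False}
        # Play-by-play is kept apart from the status, like MatchHistory.
        self.history: List[Dict[str, Any]] = []

    def submit_turn(self, player: str, move: Any, next_player: str) -> None:
        if self.status["terminated"]:
            raise RuntimeError("match has been terminated")
        if player != self.status["current_player"]:
            raise PermissionError(f"it is not {player}'s turn")
        if next_player not in self.players:
            raise ValueError("next_player is not in this match")
        self.history.append({"player": player, "move": move})
        # The active player decides who goes next (allows for turnovers).
        self.status["current_player"] = next_player

    def terminate(self, player: str) -> None:
        # Termination is the one action allowed out of turn.
        if player not in self.players:
            raise PermissionError("only participants can end the match")
        self.status["terminated"] = True
```

A notification to `next_player` would be triggered after `submit_turn` succeeds; that part is left out here.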
  42. 42. Before you deploy...
  43. 43. Performance testing is hard.
      • You need to simulate tens or hundreds of thousands of users
      • You need to simulate the data environment as well (especially challenging for social applications)
      • Every system is different - which can make it difficult to determine good vs. bad performance.
      • Focus on changes in relative performance
      www.mayhemandmuse.com
  44. 44. Grinder in the Cloud
      • Grinder is a free Java-based load testing framework that makes it easy to run a distributed test using many load injector machines
      • Scripts are written in a dynamic scripting language - Python, Clojure, Groovy, etc.
      • Grinder in the Cloud allows you to temporarily spin up a cloud-based Grinder set - to massively test your system
      • For more info - grinder.sourceforge.net
      For Simpsons...
      • >500 Grinder servers
      • 408K concurrent users
      • 4M DAU!
  45. 45. Gradual Roll-out
      • These days, it’s more important than ever to do a staged roll-out
      • Deploy to a small market to begin (like Canada) - monitor - and then add additional regions step-by-step, monitoring each as you go along
      • You’re looking for situations where the performance of the system is degrading in an unexpected, non-linear fashion. Your job is to catch these issues before they become a problem!
  46. 46. Publisher Review
      What follows are actual questions from a publisher’s server due diligence that was done for a recent game that we produced. It’s useful as a review.
  47. 47. Questions publishers will ask:
      • Provide a technical architecture diagram describing all server tiers
      • What database are you using, and how is it set up?
      • Is there a plan for DB sharding for scalability?
      • What do you use for caching in front of the DB? What gets cached?
      • What is your DB backup strategy?
      • What is the minimum server setup for 50K DAU?
      • If DAU reaches 200K, what would the server architecture look like?
  48. 48. Thanks for listening!
      • For more info on the tech that we build:
      • Follow me on Twitter: @winterhalder
      Special thanks to Scott Simpson, Rick McMullin, Chris Justus and Preston Jennings for their contributions and feedback!