Akka for realtime multiplayer mobile games

With the recent emergence of top-grossing real-time games in the East (Honour of Kings) and West (Clash Royale, Golf Clash, etc.), the doors have opened for new genres of social games on mobile. However, building a real-time multiplayer game for mobile comes with many interesting technical and design challenges. Join us in this session to see how the team at Space Ape Games is approaching these challenges head-on with the help of Akka and AWS.

Published in: Technology

  1. 1. Akka for real-time Multiplayer mobile games
  2. 2. Yan Cui http://theburningmonk.com @theburningmonk Principal Engineer @
  3. 3. Real-Time games in Top 100 Grossing (2017): 2014 (3), 2015 (6), 2016 (8), 2017 (13), 2018 (???)
  4. 4. Enabling Factors Source: PC Mag
  5. 5. Enabling Factors Source: OpenSignal
  6. 6. Enabling Factors
  7. 7. > $400M Monthly Revenue (Source: Bloomberg), > 80M DAU (Source: Tencent)
  8. 8. 10-20 inputs/s, sensitive to lags (> 300ms)
  9. 9. unpredictable network, limited bandwidth
  10. 10. Decisions, decisions... Build vs Buy? Global deployment vs Centralized? TCP vs UDP? Server Authoritative vs Lock-Step?
  11. 11. Constraints/Trade-offs Latency (RTT) Cost Complexity Scalability Operational overhead
  12. 12. Global Deployment vs Centralised
  13. 13. 10-20 inputs/s, sensitive to lags (> 300ms)
  14. 14. optimize for this
  15. 15. Global Deployment • Players are geo-routed to the closest multiplayer server. • Matched with other players in the same geo-region for best UX. • No need for players to “choose server”, it should just work.
  16. 16. Global Deployment • Should leaderboards be global or regional? • Should guilds/alliances be global or regional? • Should chatrooms be global or regional? • Should liveops events be global or regional? • Should players be allowed to play with others in another region? i.e. play with distant relatives/friends. • Should players be allowed to switch default region? e.g. moved to Europe after Brexit
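A minimal sketch of the geo-routing idea on this slide, assuming the client can measure round-trip time to each region. The region names and RTT values are hypothetical, and the slides do not say how routing is actually implemented.

```python
def closest_region(rtt_ms_by_region):
    """Return the region with the lowest measured round-trip time."""
    if not rtt_ms_by_region:
        raise ValueError("no regions measured")
    return min(rtt_ms_by_region, key=rtt_ms_by_region.get)


# Hypothetical measurements taken by the client at start-up.
measurements = {"us-east": 180.0, "eu-west": 35.0, "ap-southeast": 260.0}
chosen = closest_region(measurements)
```

Matchmaking would then only pair this player with others whose chosen region is the same, so nobody has to pick a server by hand.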
  17. 17. Server Authoritative vs Lock-Step
  18. 18. Server Authoritative • Server decides game logic. • Client sends all inputs to server. • Client receives game state (either full, or delta) from server.
  19. 19. Server Authoritative • Server decides game logic. • Client sends all inputs to server. • Client receives game state (either full, or delta) from server. • Client keeps internal state for game world, which mirrors server state. • Client doesn’t modify world state directly, it only displays it, with some prediction to mask network latency.
  20. 20. Client 1 Client 2 Server C1 control 1 C2 control 1 game state 1
  21. 21. Client 1 Client 2 Server C1 control 1 C2 control 1 C2 control 2 game state 1 game state 2
  22. 22. Client 1 Client 2 Server C1 control 1 C2 control 1 C2 control 2 game state 1 game state 2
  23. 23. Client 1 Client 2 Server C1 control 1 C2 control 1 C2 control 2 game state 1 game state 2 game state 3 C1 control 1 C2 control 1 C2 control 2 game state 3 C1 control 1 C2 control 1 C2 control 2 C2 control 3
  24. 24. Client 1 Client 2 Server C1 control 1 C2 control 1 C2 control 2 C2 control 3 game state 1 game state 2 game state 3 C1 control 1 C2 control 1 C2 control 2 game state 3 C1 control 1 C2 control 1 C2 control 2 game state 4
  25. 25. Client 1 Client 2 Server C1 control 1 C2 control 1 C2 control 2 C2 control 3 game state 1 game state 2 game state 3 C1 control 1 C2 control 1 C2 control 2 game state 3 C1 control 1 C2 control 1 C2 control 2 game state 4
  26. 26. Client 1 Client 2 Server C1 control 1 C2 control 1 C2 control 2 C2 control 3 game state 1 game state 2 game state 3 C1 control 1 C2 control 1 C2 control 2 game state 3 C1 control 1 C2 control 1 C2 control 2 game state 5 C2 control 3 game state 4
  27. 27. Client 1 Client 2 Server C1 control 1 C2 control 1 C2 control 2 C2 control 3 game state 1 game state 2 game state 3 C1 control 1 C2 control 1 C2 control 2 game state 3 C1 control 1 C2 control 1 C2 control 2 game state 5 C2 control 3 game state 4
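The server-authoritative exchange in the diagrams above can be sketched roughly as follows. This is an illustrative toy, not the implementation discussed in the talk: the state shape (player id mapped to an x/y position) and the function names are assumptions.

```python
def apply_input(state, player, move):
    """Server-side game logic: return a new state with the move applied."""
    new_state = dict(state)
    x, y = new_state.get(player, (0, 0))
    dx, dy = move
    new_state[player] = (x + dx, y + dy)
    return new_state


def delta(old, new):
    """Entries whose values changed (or were added) between two states."""
    return {key: value for key, value in new.items() if old.get(key) != value}


server_state = {"c1": (0, 0), "c2": (5, 5)}
client_mirror = dict(server_state)          # client-side copy of the world

next_state = apply_input(server_state, "c2", (1, 0))
update = delta(server_state, next_state)    # only c2 moved
client_mirror.update(update)                # client applies the delta, never its own logic
```

Sending `update` instead of `next_state` is the "delta" option from the slide; it trades bandwidth for extra bookkeeping on both ends.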
  28. 28. Pros • Always in-sync. • Hard to cheat - no memory hacks, etc. • Easy (and quick) to join mid-match. • Server can detect lagged/DC’d client and take over with AI.
  29. 29. Cons • High server load. • High bandwidth usage. • Synchronization on the client is complicated. • Little experience in the company with server-side .NET stack. (bus factor of 1) • .NET Core was/is still a moving target.
  30. 30. high server load and bandwidth needs client has to receive more data
  31. 31. Lock-Step* • Client sends all inputs to server. • Server collects all inputs, and buffers them. • Server sends all buffered inputs to all clients X times a second. * traditional RTS games tend to use peer-to-peer model
  32. 32. Lock-Step* • Client sends all inputs to server. • Server collects all inputs, and buffers them. • Server sends all buffered inputs to all clients X times a second. • Client executes all inputs in the same order. • Because everyone is 'guaranteed' to have executed the same input at the same frame in the same order, we get synchronicity. • Use prediction to mask network latency. * traditional RTS games tend to use peer-to-peer model
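As a rough illustration of the lock-step loop described on this slide, the sketch below buffers inputs on the server, closes one frame per tick, and lets every client replay the same frames. The class and function names are hypothetical, and the "game" is a single integer position per client.

```python
class LockStepMatch:
    """Buffers inputs and releases them as one ordered frame per tick."""

    def __init__(self):
        self.current_frame = []   # inputs buffered since the last broadcast
        self.history = []         # every frame already broadcast

    def receive(self, client_id, command):
        self.current_frame.append((client_id, command))

    def tick(self):
        """Close the current frame and return it for broadcast to all clients."""
        frame = self.current_frame
        self.history.append(frame)
        self.current_frame = []
        return frame


def replay(frames):
    """Deterministic client simulation: apply inputs in frame order."""
    positions = {}
    for frame in frames:
        for client_id, step in frame:
            positions[client_id] = positions.get(client_id, 0) + step
    return positions


match = LockStepMatch()
match.receive("c1", 1)
match.receive("c2", 2)
match.tick()                      # frame 1 is broadcast
match.receive("c2", 3)
match.tick()                      # frame 2 is broadcast

client_1_view = replay(match.history)
client_2_view = replay(match.history)
```

Both clients reach the same result only because `replay` is deterministic, which is why the approach depends on a deterministic game engine. A client joining mid-match would have to replay the whole `history` first.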
  33. 33. Client 1 Client 2 Server C1 control 1 C2 control 1 C2 control 2 C2 control 3 C1 control 1 C2 control 1 C2 control 2 C1 control 1 C2 control 1 C2 control 2 C2 control 3 inputs, instead of game state
  34. 34. Client 1 Client 2 Server C1 control 1 C2 control 1 C2 control 2 C2 control 3 C1 control 1 C2 control 1 C2 control 2 C1 control 1 C2 control 1 C2 control 2 C2 control 3 RTT: time between sending an input and receiving it back from server
  35. 35. Client 1 Client 2 Server C1 control 1 C2 control 1 C2 control 2 C2 control 3 C1 control 1 C2 control 1 C2 control 2 C1 control 1 C2 control 1 C2 control 2 C2 control 3
  36. 36. Client 1 Client 2 Server C1 control 1 C2 control 1 C2 control 2 C2 control 3 C1 control 1 C2 control 1 C2 control 2 C1 control 1 C2 control 1 C2 control 2 C2 control 3 (diagram labels: RTT, frame time, latency)
  37. 37. Client 1 Client 2 Server C1 control 1 C2 control 1 C2 control 2 C2 control 3 C1 control 1 C2 control 1 C2 control 2 C1 control 1 C2 control 1 C2 control 2 C2 control 3 RTT = latency x 2 + X, where Xmin = 0 and Xmax = frame time (diagram labels: RTT, frame time, latency)
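A worked example of the RTT relation on this slide. It assumes that the frame time equals one second divided by the server's broadcast rate, which the slides imply but do not state; the 50 ms latency and 10 broadcasts per second are made-up numbers.

```python
def rtt_bounds(latency_ms, broadcasts_per_second):
    """Best and worst round-trip time under RTT = latency x 2 + X."""
    frame_time_ms = 1000.0 / broadcasts_per_second
    best = 2 * latency_ms                    # X = 0: input arrives just before a broadcast
    worst = 2 * latency_ms + frame_time_ms   # X = frame time: input just missed a broadcast
    return best, worst


best, worst = rtt_bounds(latency_ms=50.0, broadcasts_per_second=10)
```

With these numbers the round trip falls between 100 ms and 200 ms, so the broadcast rate matters as much as the raw network latency when aiming to stay under a 300 ms threshold.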
  38. 38. Pros • Light server load. • Lower bandwidth usage. • Simpler server implementation.
  39. 39. Cons • Needs deterministic game engine. • Unity has a long-standing determinism problem with floating point. • Hackable, requires some form of server-side validation. • All clients must take over lagged/DC’d client with AI. • Slower to join mid-match, need to process all inputs. • Need to ensure all clients in a match are compatible.
  40. 40. fixed-point math, server validation, ...
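To show why fixed-point math helps with determinism, here is a small sketch using a 16.16 format (16 fractional bits). The format and helper names are assumptions chosen for illustration; the slides do not say which representation the game uses.

```python
SCALE_BITS = 16
ONE = 1 << SCALE_BITS            # fixed-point representation of 1.0


def to_fixed(value):
    """Convert a float to 16.16 fixed point (done once, e.g. when loading data)."""
    return int(round(value * ONE))


def fixed_mul(a, b):
    """Multiply two fixed-point numbers using integer arithmetic only."""
    return (a * b) >> SCALE_BITS


def to_float(fixed):
    """Convert back to float, for display only."""
    return fixed / ONE


half = to_fixed(0.5)
three = to_fixed(3.0)
product = fixed_mul(half, three)
```

Because `fixed_mul` only uses integer operations, every device produces the same bits for the same inputs. Python integers never overflow, so a real implementation in a fixed-width language would also need to choose 32-bit or 64-bit storage and handle overflow.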
  41. 41. bandwidth
  42. 42. Build vs Buy
  43. 43. ApeSync
  44. 44. +
  45. 45. MATCH 1 C1 input C2 input current frame history frame 1 frame 2 frame 3 buffering
  46. 46. connection open MATCH 1 C1 input C2 input current frame history frame 1 frame 2 frame 3 buffering
  47. 47. connection open authenticate MATCH 1 C1 input C2 input current frame history frame 1 frame 2 frame 3 C3 joined buffering
  48. 48. connection open authenticate send/receive MATCH 1 C1 input C2 input current frame history frame 1 frame 2 frame 3 C3 joined buffering
  49. 49. MATCH 1 C1 input C2 input current frame history frame 1 frame 2 frame 3 C3 joined C3 input connection open authenticate send/receive buffering
  50. 50. MATCH 1 C1 input C2 input current frame history frame 1 frame 2 frame 3 C3 joined C3 input connection open authenticate send/receive buffering broadcast!
  51. 51. MATCH 1 current frame history frame 1 frame 2 frame 3 C1 input C2 input C3 joined C3 input connection open authenticate send/receive buffering broadcast!
  52. 52. MATCH 1 current frame history frame 1 frame 2 frame 3 frame 4 connection open authenticate send/receive buffering broadcast!
  53. 53. MATCH 1 current frame history frame 1 frame 2 frame 3 frame 4 connection open authenticate send/receive buffering broadcast! C3 input
  54. 54. concurrency
  55. 55. MATCH 1 current frame history frame 1 frame 2 frame 3 ... C1 input C2 input C3 joined C3 input connection open authenticate send/receive C1 input buffering broadcast!
  56. 56. MATCH 1 current frame history frame 1 frame 2 frame 3 ... C1 input C2 input C3 joined C3 input C1 input C2 input buffering broadcast!
  57. 57. MATCH MATCH MATCH MATCH MATCH
  58. 58. MATCH MATCH MATCH MATCH MATCH MATCH MATCH MATCH MATCH MATCH
  59. 59. MATCH MATCH MATCH MATCH MATCH MATCH MATCH MATCH MATCH MATCH MATCH MATCH MATCH MATCH MATCH
  60. 60. MATCH C1 input C2 input current frame history frame 1 frame 2 frame 3 C3 joined connection open authenticate send/receive
  61. 61. MATCH C1 input C2 input current frame history frame 1 frame 2 frame 3 C3 joined connection open authenticate send/receive
  62. 62. MATCH current frame history frame 1 frame 2 frame 3 C1 input C2 input C3 joined Socket actor Match actor
  63. 63. MATCH current frame history frame 1 frame 2 frame 3 C1 input C2 input C3 joined Root Aggregate Socket actor Match actor
  64. 64. MATCH current frame history frame 1 frame 2 frame 3 C1 input C2 input C3 joined Root Aggregate Socket actor Match actor
  65. 65. MATCH current frame history frame 1 frame 2 frame 3 C1 input C2 input C3 joined
  66. 66. MATCH current frame history frame 1 frame 2 frame 3 C1 input C2 input C3 joined C3 joined act locally think globally how actors interact with each other aka, the “protocol”
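The split between socket actors and a match actor can be sketched as below. This is a plain-Python stand-in for the idea, not Akka code: the mailbox is a simple queue, and the message tuples ("join", "input") are a hypothetical protocol invented for the example.

```python
from collections import deque


class MatchActor:
    """Owns all match state; handles one message at a time from its mailbox."""

    def __init__(self):
        self.mailbox = deque()
        self.players = set()
        self.current_frame = []

    def tell(self, message):
        self.mailbox.append(message)

    def drain(self):
        """Process queued messages sequentially, in arrival order."""
        while self.mailbox:
            kind, client_id, payload = self.mailbox.popleft()
            if kind == "join":
                self.players.add(client_id)
            elif kind == "input" and client_id in self.players:
                self.current_frame.append((client_id, payload))


class SocketActor:
    """One per connection; only forwards messages to the match actor."""

    def __init__(self, client_id, match):
        self.client_id = client_id
        self.match = match

    def on_authenticated(self):
        self.match.tell(("join", self.client_id, None))

    def on_input(self, payload):
        self.match.tell(("input", self.client_id, payload))


match = MatchActor()
c3 = SocketActor("c3", match)
c3.on_input("too-early")     # sent before joining, so the match ignores it
c3.on_authenticated()
c3.on_input("move")
match.drain()
```

Since only `MatchActor.drain` touches `players` and `current_frame`, no locks are needed even with many connections, which is the concurrency benefit the slides attribute to the actor model.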
  67. 67. the secret to building high performance systems is simplicity complexity kills performance
  68. 68. Higher CCU per server Fewer servers Lower cost Less operational overhead Performance Matters
  69. 69. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. Performance Matters
  70. 70. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. Performance Matters
  71. 71. Threads are heavy OS constructs. Each thread is allocated 1MB stack space by default. Context Switching is expensive at scale. Actors are cheap. Actor system can optimise use of threads to minimise context switching. Actor Model >
  72. 72. Non-blocking I/O framework for JVM. Very performant. Simplifies implementation of socket servers (TCP/UDP). UDP support is “meh”... Netty
  73. 73. Custom network protocol (bandwidth). Minimise Netty object creations (GC pressure). Minimise the no. of message hops inside Akka (latency). Performance Tuning
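To illustrate why a custom binary protocol saves bandwidth, the sketch below packs one input into a fixed layout and compares it with the same data as JSON. The field layout (frame number, player slot, command code) is hypothetical; the slides do not describe the actual wire format.

```python
import json
import struct

# Hypothetical layout in network byte order:
# frame number (uint32), player slot (uint8), command code (uint8).
INPUT_FORMAT = "!IBB"


def encode_input(frame, slot, command):
    return struct.pack(INPUT_FORMAT, frame, slot, command)


def decode_input(data):
    return struct.unpack(INPUT_FORMAT, data)


packed = encode_input(1200, 2, 7)
as_json = json.dumps({"frame": 1200, "slot": 2, "command": 7}).encode("utf-8")
```

The packed message is 6 bytes, several times smaller than the JSON encoding, and at 10-20 inputs per second per player that difference adds up quickly on a limited mobile connection.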
  74. 74. Buffer pooling (GC pressure). Using direct buffers (GC pressure). Performance Tuning
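Buffer pooling can be sketched as follows. This is a generic illustration in Python rather than Netty's pooled allocator, and the `BufferPool` class is hypothetical; the point is only that released buffers are reused instead of being reallocated.

```python
class BufferPool:
    """Hands out reusable byte buffers so that steady-state traffic allocates nothing."""

    def __init__(self, buffer_size, capacity):
        self.buffer_size = buffer_size
        self._free = [bytearray(buffer_size) for _ in range(capacity)]
        self.allocations = capacity      # total buffers ever created

    def acquire(self):
        if self._free:
            return self._free.pop()
        self.allocations += 1            # pool exhausted: fall back to a new buffer
        return bytearray(self.buffer_size)

    def release(self, buf):
        self._free.append(buf)           # caller must not use buf after this


pool = BufferPool(buffer_size=1024, capacity=2)
first = pool.acquire()
pool.release(first)
second = pool.acquire()                  # the same buffer object comes back
```

A reused buffer still holds its previous contents, so callers should track how many bytes they wrote rather than trust the whole buffer.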
  75. 75. Disable Nagle's algorithm (latency). TCP tuning (including BBR, etc. with differing results). Epoll. ENA-enabled EC2 instances. Performance Tuning
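Disabling Nagle's algorithm means setting the standard `TCP_NODELAY` socket option, so small packets are sent immediately instead of being coalesced. The sketch below uses Python's standard `socket` module to show the option itself; the talk's server runs on Netty, where the equivalent setting would be applied through its own configuration.

```python
import socket


def open_low_latency_socket():
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = open_low_latency_socket()
nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
sock.close()
```

The trade-off is more, smaller packets on the wire, which suits a game sending 10-20 tiny inputs per second where delay matters more than throughput.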
  76. 76. Custom Reliable UDP protocol, optimized for countries with poor networking conditions. Performance Tuning
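The slides do not describe how the custom reliable UDP protocol works. As a generic illustration of the usual building blocks (sequence numbers, acknowledgements, and timed retransmission), here is a sketch of sender-side bookkeeping; all names and the 100 ms timeout are assumptions, and no real networking is performed.

```python
class ReliableSender:
    """Tracks unacknowledged packets and reports which ones need resending."""

    def __init__(self, resend_after_ms):
        self.resend_after_ms = resend_after_ms
        self.next_seq = 0
        self.pending = {}                # seq -> (payload, last_sent_ms)

    def send(self, payload, now_ms):
        seq = self.next_seq
        self.next_seq += 1
        self.pending[seq] = (payload, now_ms)
        return seq

    def on_ack(self, seq):
        self.pending.pop(seq, None)      # duplicate or late acks are harmless

    def due_for_resend(self, now_ms):
        """Sequence numbers that timed out; their send time is reset to now_ms."""
        due = []
        for seq, (payload, last_sent) in sorted(self.pending.items()):
            if now_ms - last_sent >= self.resend_after_ms:
                self.pending[seq] = (payload, now_ms)
                due.append(seq)
        return due


sender = ReliableSender(resend_after_ms=100)
a = sender.send(b"input-a", now_ms=0)
b = sender.send(b"input-b", now_ms=10)
sender.on_ack(a)
early = sender.due_for_resend(now_ms=50)
late = sender.due_for_resend(now_ms=120)
```

On poor networks the retransmission timeout is the main tuning knob: too short wastes bandwidth on duplicates, too long adds visible lag.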
  77. 77. AWS Lambda functions to run bot clients (written with Akka): • Cheaper • Faster to boot up • Easy to update. Each Lambda invocation could simulate up to 100 bots. Automated Load Testing
  78. 78. http://bit.ly/2xgGHXZ
  79. 79. from US-EAST (Lambda) to EU-WEST (game server)
  80. 80. optimize for tail latencies from US-EAST (Lambda) to EU-WEST (game server)
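To make "optimize for tail latencies" concrete, the sketch below computes nearest-rank percentiles over a set of RTT samples. The sample values are invented for illustration and are not results from the talk.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical RTT samples in milliseconds, with two slow outliers.
rtts_ms = [80, 82, 85, 90, 95, 100, 110, 120, 250, 900]
p50 = percentile(rtts_ms, 50)
p95 = percentile(rtts_ms, 95)
mean = sum(rtts_ms) / len(rtts_ms)
```

Here the median is 95 ms while the 95th percentile is 900 ms, so an average or median alone would hide exactly the players who experience unplayable lag.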
  81. 81. Monitoring
  82. 82. Monitoring
  83. 83. Performance in the Wild (in poor networking conditions) 95th percentile max playable RTT outperforms Photon on performance
  84. 84. Performance in the Wild (in poor networking conditions) fewer disconnects
  85. 85. Performance in the Wild • Improved KPIs - D1 retention, session time, etc. • 14% cheaper vs. Photon, based on current cost projection