Charlie Crane presented on the node.js game server framework Pomelo. Pomelo is an open source framework for building fast, scalable, distributed game servers in node.js. It provides abstractions for servers, requests/responses, and broadcasting that make building real-time multiplayer games easier. Performance testing showed Pomelo can support over 500 concurrent users in a single game area on commodity hardware. The presenter invited developers to contribute to expanding Pomelo's capabilities for building complex multiplayer games.
2. Who am I
NASDAQ: NTES
Senior engineer and architect (8 years) at NetEase, Inc.; developed many web and game products
Recently created pomelo, the open source game server framework for node.js
3. Agenda
The state of pomelo
Motivation
Framework
Practice
Performance
4. The state of pomelo
Fast, scalable, distributed game server framework for node.js
Open sourced on 2012.11.20
Newest version: V0.6
https://github.com/NetEase/pomelo
5. Pomelo position
Game server framework
Mobile game
Web game
Social game
MMORPG (medium scale)
Realtime application server framework
Realtime multi-user interaction
6. State of pomelo
Pomelo is not a single project
Almost 40 repos in total
12. Motivation--node.js and game server
Game Server
Fast Scalable
Network Real-time
Node.js is a platform built on Chrome's JavaScript runtime for easily building fast, scalable network applications. Node.js uses an event-driven, non-blocking I/O model that makes it lightweight and efficient, perfect for data-intensive real-time applications that run across distributed devices.
13. Motivation--node.js advantages
Scalability -- event driven I/O
Games are network-intensive, with massive network traffic
Language -- JavaScript
Browser, HTML5, unity3d, cocos2d-x, other platforms -- same language on client and server
Multi-process, single thread
No locks
Simple logic
Lightweight -- development efficiency, really quick
14. Motivation–node.js disadvantage
Some CPU-intensive actions
Path finding
AI
Solution
Optimization
Divide into multiple processes
All can be solved in practice
19. Motivation -- game VS web
Long connection VS Short connection
Partition: area based VS Load balanced cluster
Stateful VS Stateless
Request/broadcast VS Request/response
24. Framework --- design goal
Abstraction of servers (processes)
Auto extend server types
Auto extend servers
Abstraction of request/response and broadcast/push
Zero config request
Simple broadcast api
Server communication --- rpc framework
31. Framework -- request abstraction
Socket.io
Socket/WebSocket
Support socket.io and socket in one project
32. Framework --- push&broadcast
Push messages to a group of users
channelService.pushMessageByUids(msg, uids, callback);
var channel = channelService.getChannel('area1');
channel.pushMessage(msg);
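The channel idea on this slide can be illustrated without Pomelo at all. What follows is a minimal in-memory sketch, not Pomelo's implementation: `SimpleChannel`, `add`, `leave` and the `send` callback are hypothetical names chosen for this example, and the message shape is assumed.

```javascript
// Minimal in-memory channel sketch (hypothetical; not Pomelo's code).
// A channel is a named set of user ids; pushing a message delivers it
// to every member through a caller-supplied send function.
class SimpleChannel {
  constructor(name, send) {
    this.name = name;
    this.send = send;          // send(uid, msg): assumed transport hook
    this.members = new Set();
  }

  add(uid) { this.members.add(uid); }

  leave(uid) { this.members.delete(uid); }

  // Deliver msg to every current member; returns how many users were reached.
  pushMessage(msg) {
    for (const uid of this.members) {
      this.send(uid, msg);
    }
    return this.members.size;
  }
}

// Usage: collect deliveries in an array instead of writing to sockets.
const delivered = [];
const area1 = new SimpleChannel('area1', (uid, msg) => delivered.push({ uid, msg }));
area1.add('alice');
area1.add('bob');
area1.leave('bob');
area1.add('carol');
const reached = area1.pushMessage({ route: 'onMove', x: 3, y: 4 });
```

Keeping membership in one place is what makes the broadcast call a one-liner for game code; the cost of the loop is also where the broadcast performance concerns come from.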
35. Framework – rpc framework
Why is rpc so easy in pomelo?
Thrift
Write a .thrift file
Generate source code from the .thrift file
thrift --gen <language> <Thrift filename>
Copy the generated source into the application
Pomelo: start up, all done
Client and server in one project
Servers folder convention
Auto generate proxy and remote on start up
43. Practice --- handle move
Different situations
Player move, mob move
AI driven or player driven
Smooth effect
Client prediction
Latency compensation
How to notify
AOI(area of interest)
45. Performance --- overview
Performance varies widely between applications
Sources of variation
Application type: game or realtime application
Game type: round-based or realtime fight, room or infinite world
Game design: map size, character density, balance of areas
Test parameters: think time, test action
46. Performance --- reference data
Online users per process
next-gen: support 20,000 concurrent players in one area
World record – not in one area
World of Tanks (BigWorld), 74,536 online users
But in the real world (MMORPG)
Real online data: at most 800 concurrent users per area, 7,000 concurrent users per group of game servers
48. Performance --- stress testing
Use Case 1: fight
Stress on a single area (area 3), increasing step by step, one login every 2 seconds
Game logic: attack monsters or other players every 2~5 seconds
54. Performance – stress test
Use case 2 – move
Stress on a single area (area 3), increasing step by step, one login every 2 seconds
Game logic: move around in the map every 2~5 seconds
56. Performance --- stress testing
Use Case 3: fight & move
Stress on a single area (area 3), increasing step by step, one login every 2 seconds
Game logic: 50% attack monsters or other players every 2~5 seconds, 50% move around
Result: 558 online users in one area
57. What’s more
Plugin, components
Realtime application framework – a better one, more scalable, adaptable
AI – pomelo-bt
Broadcast strategy – pomelo-aoi, schedule
Data sync strategy – pomelo-sync
Admin all the servers – pomelo-admin, pomelo-cli
How to build a full game – lordofpomelo, play online
High availability – master and zookeeper
Node.js is extremely well suited to game server development. Look at the definition of node.js: the key words fast, scalable, network and real-time are exactly the characteristics of a game server. What a perfect match!
There are many advantages to developing games with node.js. First, scalability: node was designed for event-driven I/O, and a game server is a network-heavy application, which makes it a killer application for node. Second, the JavaScript language: HTML5 games can share one language between client and server, and the same holds for other client platforms, not only HTML5. Third, it is lightweight: traditional game server development is heavy, but node.js makes it light and pleasant, which is crucial for efficiency. Fourth, the multi-process, single-thread model: game logic is complicated and multi-threaded locking can make a mess, but node's single-thread model keeps the logic clean and simple. Awesome!
Node.js also has some disadvantages, since it is not good at CPU-intensive work, and some AI and path-finding actions are CPU-intensive. In practice, though, all of these problems can be solved. In our demo, the CPU problems are solved by splitting work across processes and by other optimizations.
We are not the first to use node.js for game servers. Mozilla published an open source game demo called BrowserQuest a few months before we open sourced pomelo. It is a really good demo for HTML5, but on the server side it uses a single-process node server, which cannot hold many online users.
Google also published a game demo called gritsgame, which had an excellent presentation at Google I/O 2012. It has the same problem: the demo focuses on the client side, and the server side is a simple single-process node.js server that cannot hold many online users.
This is our demo. Believe me, the art material sucks. So what is the difference between our game and BrowserQuest? The server side!
We have a strong server side, which can hold thousands of online users.
This is the backend of our server architecture; each rectangle represents a process. At the front there is a group of processes called connectors, whose job is to hold client connections. Connectors do not run the real logic; all logic requests are forwarded to the backend processes. There are many types of backend processes, including area, login, chat, status, team, etc.; in reality there may be 10 types of processes. The main game logic is handled in the area servers.
I believe most of you come from a web background, so I will compare web and game. First, long connection VS short connection: a realtime game needs instant reaction, so we must use long connections to push messages. Second, partitioning: a game is partitioned by area or room, which is decided by the nature of the game. In the previous demo, all the players on the same map are actually in the same process. When you walk to another map, you are most likely in another process. Because most player interactions happen within one map, keeping them in the same process minimizes inter-process communication. Third, a game is stateful: because of the partition strategy, a request from a given player can only be handled by a specific server. This is why a game cannot be as scalable as the web. Fourth, request/broadcast: most player actions need to be broadcast to other players immediately, which is a scalability killer, so we need many optimizations.
Because of all these differences, the architectures of web and game are different. A web application can enjoy the benefit of a load balancer, since all the processes are equal. A game, on the other hand, is a complicated spider web in which different types of servers communicate with each other. It is hard to manage all these servers, and the communication can be complicated, which leads us to distributed programming.
Since the runtime architecture of a game is so complicated, we need a solution to simplify it. And there is a solution: a framework, which leads us to the next chapter.
So, let’s take a look at the pomelo framework.
You will be surprised to find out that the pomelo framework has nothing to do with games. In essence, it is a distributed, scalable realtime application framework.
Last, channel and broadcast: it pushes messages to a group of users. The API above is simple.
But as you know, broadcast is one of the most frequent actions in a game, and it can cause many performance problems.
Now, rpc. It works like magic. Zero config: you do not need to specify the target server's IP, port or anything. With just a function call, the request is automatically routed to the target server and dispatched to the specific file and method. You don't even feel that you are making a remote call.
Of course, you need to define a route function beforehand; it's quite simple.
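To make the idea of a route function concrete, here is a small standalone sketch. It is not Pomelo's registration API: the function name `routeToArea`, the `areaId` field on the session and the shape of the server list are all assumptions made for illustration. It only shows the decision a route function has to make, namely which server of a given type should receive a request.

```javascript
// Hypothetical route function: pick the area server that owns the
// player's current area. Server list and session shape are assumptions.
function routeToArea(session, servers) {
  const owner = servers.find((s) => s.areas.includes(session.areaId));
  if (!owner) {
    throw new Error('no area server owns area ' + session.areaId);
  }
  return owner.id;
}

// Two area processes, each responsible for some maps.
const areaServers = [
  { id: 'area-server-1', areas: [1, 2] },
  { id: 'area-server-2', areas: [3] },
];

const target = routeToArea({ uid: 'alice', areaId: 3 }, areaServers);
```

Because the game is partitioned by area, the route depends on player state rather than on load, which is the stateful behaviour described in the web VS game comparison.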
We know a bunch of rpc frameworks, but none is as simple as pomelo. Why? Take thrift, for example: you need to write a .thrift file, generate source code from it, and copy the source into the application. If the interface changes, damn, do it all again.
But nothing needs to be done in pomelo: you just start up, and boom, you call rpc like a local method, all done. Why? Because we have the server abstraction in our framework. On start-up we scan the servers folder and generate the proxies automatically.
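The proxy generation described here can be sketched in a few lines. This is an illustration under stated assumptions, not Pomelo's code: the `remotes` object stands in for whatever a scan of the servers folder would discover, `dispatch` stands in for the network hop, and `chatRemote.add` is a made-up remote method.

```javascript
// Sketch of proxy generation (hypothetical; not Pomelo's code).
// `remotes` stands in for modules discovered by scanning a servers folder:
// remotes[serverType][service][method] is the real implementation.
const remotes = {
  chat: {
    chatRemote: {
      add: (uid, channel, cb) => cb(null, uid + ' joined ' + channel),
    },
  },
};

// In a real system this would serialize the call and send it to the
// process chosen by a route function; here it calls the local object.
function dispatch(serverType, service, method, args) {
  remotes[serverType][service][method](...args);
}

// Walk the discovered remotes once at start-up and build a mirror object
// whose methods forward to `send`, so callers write plain function calls.
function generateProxies(discovered, send) {
  const rpc = {};
  for (const [serverType, services] of Object.entries(discovered)) {
    rpc[serverType] = {};
    for (const [service, methods] of Object.entries(services)) {
      rpc[serverType][service] = {};
      for (const method of Object.keys(methods)) {
        rpc[serverType][service][method] = (...args) =>
          send(serverType, service, method, args);
      }
    }
  }
  return rpc;
}

const rpc = generateProxies(remotes, dispatch);

// The caller never names a host or port; it just calls a method.
let rpcResult = null;
rpc.chat.chatRemote.add('alice', 'lobby', (err, res) => { rpcResult = res; });
```

Since the proxy is rebuilt from the folder contents on every start-up, changing a remote interface needs no separate code generation step, which is the contrast with the thrift workflow.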
The fourth part, practice.
As time is limited, I’m going to talk about the simplest game logic: player move! As you can see, there are 6 steps in a player move. Basically, the client sends a move request to the server, and the connector routes the request to the specific area server. The area server runs the move logic and broadcasts to the clients who can see the first client's movement. Those clients play the move animation. Done!
The blue part is handled by the framework. You only need to write the code for steps 1, 3 and 6.
Step 1, client move: the client side needs to find the path and play the move animation. Finding the path on the client side saves a lot of CPU on the server and makes the animation smoother. Then the client sends the move request to the server; this is the basic API of the pomelo client, just like an AJAX call.
Step 3: the area server receives the request, which is routed to the method 'handler.move' based on convention over configuration. The method signature is similar to http, except that it takes a message, a session and a next callback. The method verifies the path, handles the move logic, and then broadcasts to tell all the users who can see the first player's move. After that, it sends the response back through the next callback.
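A rough sketch of such a handler follows. Only the overall shape (a message, a session, a next callback; verify, move, broadcast, respond) comes from the notes. Everything else is assumed for illustration: the `players` table, the `areaChannel` stand-in, the path validity rule, the `onMove` route name and the 200/500 response codes.

```javascript
// Sketch of an area-server move handler (hypothetical helper names).
const players = { alice: { x: 0, y: 0 } };
const broadcasts = [];

// Stand-in for a channel; a real server would push to connected clients.
const areaChannel = { pushMessage: (msg) => broadcasts.push(msg) };

// Assumed validity rule: a non-empty list of tiles, each adjacent to the
// previous one, with the first tile adjacent to the current position.
function isValidPath(from, path) {
  if (!Array.isArray(path) || path.length === 0) return false;
  let prev = from;
  for (const step of path) {
    const dist = Math.abs(step.x - prev.x) + Math.abs(step.y - prev.y);
    if (dist !== 1) return false;
    prev = step;
  }
  return true;
}

const handler = {};

handler.move = function (msg, session, next) {
  const player = players[session.uid];
  if (!player || !isValidPath(player, msg.path)) {
    next(null, { code: 500, error: 'invalid path' });
    return;
  }
  const dest = msg.path[msg.path.length - 1];
  player.x = dest.x;
  player.y = dest.y;
  // Tell everyone who can see this player, then answer the requester.
  areaChannel.pushMessage({ route: 'onMove', uid: session.uid, path: msg.path });
  next(null, { code: 200 });
};

// Usage: a two-step path for alice.
let moveResponse = null;
handler.move(
  { path: [{ x: 1, y: 0 }, { x: 1, y: 1 }] },
  { uid: 'alice' },
  (err, res) => { moveResponse = res; }
);
```

Verifying the path on the server keeps the client-side path finding from step 1 honest: the client does the expensive search, and the server only checks the result.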
The last step: every player who receives the broadcast message plays the move animation. Really simple!
So easy, right? But in reality it is not as simple as you think. There are many issues you should consider.
First, different situations. I have only demonstrated player move so far. Mob move is quite different, since it is driven by AI. Second, smooth effect. If you play the animation only after the server sends its message, there will be some delay. To achieve a smooth effect, you must use client prediction, but that brings complexity, because the client and server can disagree. Third, how to notify: broadcasting to all users is wasteful, so you must use some algorithm to minimize the cost of broadcast. This is where AOI fits in.
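One common way to implement an area of interest is a uniform grid, sketched below. This is a generic illustration and not the pomelo-aoi API: the class name `GridAOI`, its methods and the cell size are assumptions. The idea is that a move is only reported to players in the mover's cell and the eight cells around it.

```javascript
// Grid-based area-of-interest sketch (hypothetical; not pomelo-aoi's API).
class GridAOI {
  constructor(cellSize) {
    this.cellSize = cellSize;
    this.cells = new Map();      // "cx,cy" -> Set of uids in that cell
    this.positions = new Map();  // uid -> "cx,cy"
  }

  cellOf(x, y) {
    return [Math.floor(x / this.cellSize), Math.floor(y / this.cellSize)];
  }

  // Insert a player or move it to world position (x, y).
  update(uid, x, y) {
    const key = this.cellOf(x, y).join(',');
    const old = this.positions.get(uid);
    if (old === key) return;
    if (old !== undefined) this.cells.get(old).delete(uid);
    if (!this.cells.has(key)) this.cells.set(key, new Set());
    this.cells.get(key).add(uid);
    this.positions.set(uid, key);
  }

  // Sorted uids (excluding the mover) that should receive a broadcast.
  watchers(uid) {
    const key = this.positions.get(uid);
    if (key === undefined) return [];
    const [cx, cy] = key.split(',').map(Number);
    const result = [];
    for (let dx = -1; dx <= 1; dx++) {
      for (let dy = -1; dy <= 1; dy++) {
        const members = this.cells.get((cx + dx) + ',' + (cy + dy));
        if (!members) continue;
        for (const other of members) {
          if (other !== uid) result.push(other);
        }
      }
    }
    return result.sort();
  }
}

const aoi = new GridAOI(10);
aoi.update('alice', 5, 5);    // cell 0,0
aoi.update('bob', 12, 3);     // cell 1,0, a neighbour of alice's cell
aoi.update('carol', 95, 95);  // cell 9,9, far away
const nearAlice = aoi.watchers('alice');
```

The resulting uid list is the kind of input a call such as `pushMessageByUids` expects, so the broadcast cost scales with local player density rather than with the total population of the area.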
Last part: performance.
This is a screenshot of the game during our stress test. Too many players on the map!
This is another screenshot. We made the map a little bigger, so the number of players on one screen drops, which is good for broadcast.
Time is limited, so we cannot go through all the details. Two weeks from now in Lisbon, I will give a talk focused on performance.