Online games have suffered some high-profile failures recently. This talk from 2013 looks at some of the root causes and at the need for better tools, now that games are effectively high-performance transaction systems.
1997, 2004, 2009, 2013

• PC-based MMORPG
• 150-200K subscribers
• 67K peak concurrent
• Custom servers
• Flat file storage
• Data center
• Minimal analytics
• Subscription

• PC-based MMORPG
• 12M subscribers (2010)
• 1M peak concurrent (7K per server)
• Custom servers
• RDBMS storage
• Data center
• Minimal analytics
• Subscription

• Web-based casual game
• 80M monthly active
• 30M daily active
• LAMP stack servers
• RDBMS storage
• AWS cloud
• Heavy analytics
• Free to play

• Mobile casual game
• 25M downloads
• ??? daily active
• LAMP stack servers
• RDBMS + NoSQL
• Data center
• Heavy analytics
• Free to play

WoW Architecture
[Diagram: Gaming Client, Login Server, 13 Game Servers, Main Database, 3 Read-only Slaves. Labels: "Game processing…", "Replication", "Writing back data to master DB", "Up to 2 hour delay".]
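
The master/read-only-slave layout in the diagram can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the class name `ReplicatedStore`, the round-robin read policy, and the use of in-memory dictionaries in place of real databases. It is not a description of how WoW's servers are implemented; it only shows why reads from a replica can be stale until replication catches up.

```python
import itertools


class ReplicatedStore:
    """Toy model: one writable master, several read-only replicas (hypothetical)."""

    def __init__(self, num_replicas=3):
        self.master = {}
        self.replicas = [{} for _ in range(num_replicas)]
        # Round-robin over replica indexes; the policy is an assumption.
        self._next_replica = itertools.cycle(range(num_replicas))

    def write(self, key, value):
        # All writes go to the master only.
        self.master[key] = value

    def read(self, key):
        # Reads are served by a replica, which may lag behind the master.
        replica = self.replicas[next(self._next_replica)]
        return replica.get(key)

    def replicate(self):
        # Stand-in for asynchronous replication: copy master state to every replica.
        for replica in self.replicas:
            replica.clear()
            replica.update(self.master)


store = ReplicatedStore()
store.write("player:42:gold", 100)
stale = store.read("player:42:gold")   # replica has not seen the write yet
store.replicate()
fresh = store.read("player:42:gold")   # visible after replication
```

In this toy model the gap between `write` and `replicate` plays the role of replication lag: the longer it is, the longer readers see old data.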

Fixes
Before | After
1 database | 20 databases (w/ hash on user ID)
Physical hardware | Virtual servers
Single connection to shared ID system | Multiple VPNs into shared ID system
Support manually fixing corrupt user data | Fixed root cause of corrupt user data
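
The move from 1 database to 20 databases "w/ hash on user ID" can be sketched as follows. The slide does not say which hash function or ID format was used, so SHA-256, string user IDs, and the function name `shard_for_user` are assumptions chosen for illustration.

```python
import hashlib

NUM_SHARDS = 20  # from the slide: 20 databases


def shard_for_user(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a shard index in [0, num_shards).

    A stable hash is used (not Python's built-in hash(), which is
    randomized per process for strings) so that every server maps
    the same user to the same database.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Example: every lookup for this (hypothetical) user lands on the same shard.
shard = shard_for_user("player-12345")
```

A plain modulo scheme like this is simple, but changing the number of shards remaps most users, which is one reason to choose the shard count carefully up front.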

EA Central Services
[Diagram: Central User ID Service; Game Server / Database(s)]

Things they did right
• Testing, testing, and more testing
  – Used AWS to set up 1000s of nodes, millions of users
  – Found problems w/ concurrent connections
  – Fixes: OS level tuning, optimize network code, add more hardware, upgrade load balancer
• Shrank save-game payload (30K->3.5K)
• Architected right from start
  – User data: shards
  – Analytics->Logs->Hadoop->Data Warehouse (cube)
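
To put the save-game payload reduction in perspective, here is a back-of-the-envelope calculation. It assumes "30K" and "3.5K" mean kilobytes per save; the function name `payload_savings` is hypothetical, and the slide does not say how the reduction was achieved.

```python
def payload_savings(before_kb: float, after_kb: float):
    """Return (reduction factor, percent of bytes saved) for a payload change."""
    factor = before_kb / after_kb
    percent_saved = 100.0 * (before_kb - after_kb) / before_kb
    return factor, percent_saved


# Figures from the slide: 30K -> 3.5K per save.
factor, percent_saved = payload_savings(30.0, 3.5)
print(f"{factor:.1f}x smaller, {percent_saved:.1f}% fewer bytes per save")
# prints "8.6x smaller, 88.3% fewer bytes per save"
```

Because every save is a write, a reduction of this size applies directly to write bandwidth and storage growth.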

Trends
• Testing is HARD but CRITICAL
  – Scale = millions of users
  – Use real "user stories" (end-to-end cases)
  – Setting up test (adding 10M records) takes days
• Analytics = more and more transactions
  – More than half of all transactions now…
  – Understand what's going on inside the game
  – Especially w/ shift to free-to-play
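
As an illustration of why analytics adds transactions, each in-game action can produce an analytics record in addition to the game-state write. The sketch below formats one such record as a single JSON line suitable for appending to a log file. The field names, the event name, and the function `format_event` are all hypothetical; the talk does not specify a record format.

```python
import json
import time


def format_event(user_id, event, props=None, ts=None):
    """Serialize one analytics event as a single-line JSON string."""
    record = {
        "ts": time.time() if ts is None else ts,  # event timestamp (seconds)
        "user_id": user_id,
        "event": event,
        "props": props or {},
    }
    # sort_keys gives a stable field order, which makes logs easier to diff.
    return json.dumps(record, sort_keys=True)


# Example: one (hypothetical) purchase event, ready to append to a log.
line = format_event("player-12345", "item_purchased", {"item": "seeds", "qty": 3})
```

One record per line keeps the log append-only and easy to batch-load into a downstream store.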

Summary of Challenges
• Writes > Reads
• Dependencies on external services
• Transactions growing faster than users
• Unsophisticated developers