1. Monster World at Amazon EC2
Hosting a social game with one million daily active users
Jesper Richter-Reichhelm
Head of Engineering
wooga
2. About
About wooga
  Founded January 2009
  Funding: Founders, Balderton Capital, Holtzbrinck Ventures (total of €5m+)
  International team of 60 from 15 countries in Berlin
  Key stats
    4 games on Facebook; 16m active users
    Biggest European social game developer
    Only 5% of users from advertising
    70% of users are female (age 20-60)
About Monster World
  Launched May 2010
  Biggest seller of magic wands in the world
  Total team size is 15 (2x backend, 3x frontend)
  Key stats
    Hosted at Facebook
    Flash client
    Ruby on Rails backend
    MySQL + Redis DB
© wooga
3. Monster World at Amazon EC2
Starting Point
Finding Helpers
Challenges and Solutions
Looking back
4. Starting Point
In October 2009 we set out to build a backend for wooga's first game with a persistent world.
Our goal was to have more than 1,000,000 daily active users.
We had never done anything like this before (who had?)
5. Hosting model must fit the needs
Small team dedicated to a single game
  2 backend folks to do both development and operation
“Extreme” life cycle of a game
  (graphic by Rightscale)
We simply did not know what to expect
  Scale up hosting when you are successful – not before!
6. Monster World at Amazon EC2
Starting Point
Finding Helpers
Challenges and Solutions
Looking back
7. Focus on what you do best… and get help for the rest
Amazon Web Services
  Easy to scale up and down
  No limitations
Scalarium
  Makes operation of a large cluster easy
  Provides default setup
  Provides consultancy if needed
  Based in Berlin => better communication
8. Monster World at Amazon EC2
Starting Point
Finding Helpers
Challenges and Solutions
Looking back
9. Challenge: Growing traffic
[Chart: traffic growth; y-axis 0 to 1,200,000, x-axis in monthly steps from 4/22/14 to 11/22/14]
10. Solution: Scale up and out easily
Scaling up
  Application servers: 2 cores => 8 cores
  DB servers: 7.5 GB => 68 GB
Scaling out
  Application servers: 2 => up to 50
  MySQL servers: 2 => 16 => 8
Easy installation by automation
  Chef recipes managed by Scalarium make that easy
11. Challenge: Idle servers cost money, too
Peak : valley ratio
  20:1 @ VZ
  5:1 @ FB
12. Solution: Run servers only when needed
Scalarium offers time and load based instances
  Start and stop instances based on time
  Start and stop instances based on load
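
The two rules can be combined in a few lines. The sketch below is an illustration only and is not Scalarium's API: the function names, the schedule, the per-instance capacity of 100 requests / sec, and the traffic figures are all assumptions. The bounds of 2 and 50 instances mirror the application server range from the scaling slide.

```python
import math

def instances_for_load(requests_per_sec, capacity_per_instance, minimum=2, maximum=50):
    """Load-based rule: run just enough instances for the current request rate."""
    needed = math.ceil(requests_per_sec / capacity_per_instance)
    return max(minimum, min(maximum, needed))

def instances_for_time(hour, schedule, default=2):
    """Time-based rule: look up a fixed instance count for the hour of day."""
    return schedule.get(hour, default)

def desired_instances(hour, requests_per_sec, capacity_per_instance, schedule):
    """Combine both rules by taking the larger of the two counts."""
    return max(
        instances_for_time(hour, schedule),
        instances_for_load(requests_per_sec, capacity_per_instance),
    )

# Hypothetical evening schedule and made-up load figures.
evening_schedule = {18: 20, 19: 30, 20: 30, 21: 20}
print(desired_instances(20, 4500, 100, evening_schedule))  # 45 (load wins over schedule)
print(desired_instances(4, 300, 100, evening_schedule))    # 3  (quiet night hours)
```

Taking the maximum keeps the scheduled capacity as a floor while still reacting to unexpected load.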
13. Solution: Run servers only when needed
14. Challenge: It's hard to scale out MySQL
Standard recipes would not work
  Almost all HTTP requests were changing something in the DB
We optimized our MySQL configuration
  Percona's XtraDB, innodb_flush_method = O_DIRECT
  Patches to ActiveRecord and data_fabric gem
Still, I/O performance of EBS was a hard limit
  Maximum of 1,000 write transactions / sec / server
  But already 5,000 writes / sec at peak for 8 masters
We sharded our MySQL databases
  But handling 16 DBs is no fun…
  … and at that time we only had 300,000 users
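
A rough projection shows why this was worrying. The sketch below uses only the numbers from this slide and the 1,000,000 user goal, and it makes two assumptions the slides do not confirm: that write load grows linearly with users, and that a "write" and a "write transaction" are the same unit. The modulo routing in `shard_for` is a generic illustration, not necessarily how data_fabric was configured.

```python
import math

MAX_WRITES_PER_MASTER = 1_000   # slide: EBS-bound limit per server
PEAK_WRITES = 5_000             # slide: peak writes / sec across all masters
USERS_AT_PEAK = 300_000         # slide: users at that time
TARGET_USERS = 1_000_000        # goal from the "Starting Point" slide

def masters_needed(writes_per_sec, limit=MAX_WRITES_PER_MASTER):
    """Smallest number of masters that keeps each one at or below the limit."""
    return math.ceil(writes_per_sec / limit)

# Assumption: writes scale linearly with the number of users.
projected_writes = PEAK_WRITES * TARGET_USERS / USERS_AT_PEAK
print(round(projected_writes))            # 16667
print(masters_needed(PEAK_WRITES))        # 5
print(masters_needed(projected_writes))   # 17

def shard_for(user_id, shard_count):
    """Hypothetical routing rule: pick a shard by user id modulo shard count."""
    return user_id % shard_count
```

Under these assumptions the goal would need at least 17 masters running flat out, with no headroom for spikes.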
15. Solution: Pick a DB that's better suited
Redis was our choice
  Master runs in-memory only (45,000 writes / sec / server)
  Slaves back up data to disk every 15 minutes
  Rich data model that goes way beyond simple key/value
We migrated most write-heavy tables to Redis
  Currently Redis handles 2.5x as many transactions / sec as MySQL
  But MySQL still holds more data (256 GB vs. 40 GB)
16. Challenge: Handling data(bases) is hard
MySQL has its problems
  Making a backup of 64 GB takes about 30 minutes…
  But restoring it can take 6 hours or more
Redis is not perfect, either
  Memory consumption of the process grows over time
  If too much memory is used, backup to disk no longer works
  Every two weeks we had to replace servers to “reset” RAM
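
To put the MySQL figures in perspective, they can be turned into effective throughput. This is plain arithmetic on the numbers above, assuming 1 GB = 1024 MB and taking "6 hours or more" as exactly 6 hours, so the restore rate is an upper bound.

```python
def throughput_mb_per_sec(size_gb, minutes):
    """Average throughput needed to move size_gb in the given number of minutes."""
    return size_gb * 1024 / (minutes * 60)

backup = throughput_mb_per_sec(64, 30)        # 64 GB in 30 minutes
restore = throughput_mb_per_sec(64, 6 * 60)   # 64 GB in 6 hours

print(f"backup:  {backup:.1f} MB/s")                    # backup:  36.4 MB/s
print(f"restore: {restore:.1f} MB/s")                   # restore: 3.0 MB/s
print(f"restore is {round(backup / restore)}x slower")  # restore is 12x slower
```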
17. Challenge: Redis memory fragmentation
18. Solution (not really): Having temporary servers helps a bit
Scaling up MySQL DBs
  Start up new master / slave and restore backup
  Make the new master a slave of the existing slave
  Wait until replication is in sync again (some hours)
  Switch to new master and remove old master / slave
  (diagram: servers m01a, s01a, m01b, s01b)
Replacing Redis DBs
  Same procedure as above
  But everything can be done in 30 minutes
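
The four steps can be walked through with a toy model. This is a simulation sketch, not an operations script: it assumes m01a / s01a are the existing pair and m01b / s01b the new one, reduces replication to an integer log position, and uses made-up write and apply rates. A real cutover would also have to deal with writes arriving during the switch.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    name: str
    position: int = 0                 # how much of the write log is applied
    source: Optional["Node"] = None   # node it replicates from; None for a master

    def catch_up(self, batch):
        """Apply up to `batch` log entries from the replication source."""
        if self.source is not None:
            self.position = min(self.source.position, self.position + batch)

def replace_master(backup_position, live_position, writes_per_tick=10, apply_per_tick=100):
    if apply_per_tick <= writes_per_tick:
        raise ValueError("replicas must apply faster than the master writes")
    m01a = Node("m01a", live_position)
    s01a = Node("s01a", live_position, source=m01a)
    # Step 1: start a new master / slave pair and restore the backup on both.
    m01b = Node("m01b", backup_position)
    s01b = Node("s01b", backup_position, source=m01b)
    # Step 2: make the new master a slave of the existing slave.
    m01b.source = s01a
    # Step 3: wait until replication is in sync again.
    ticks = 0
    while True:
        m01a.position += writes_per_tick   # game traffic keeps writing
        for node in (s01a, m01b, s01b):
            node.catch_up(apply_per_tick)
        ticks += 1
        if m01b.position == m01a.position and s01b.position == m01b.position:
            break
    # Step 4: switch to the new master; the old pair can now be removed.
    m01b.source = None
    return m01b, s01b, ticks

new_master, new_slave, ticks = replace_master(backup_position=600, live_position=1000)
print(new_master.name, new_master.position, ticks)  # m01b 1050 5
```

The model makes one point visible: the older the backup relative to live data, the longer step 3 takes.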
19. Solution (hopefully): Have elastic persistent storage
Currently investigating “elastic” alternatives
  Riak, Membase, HBase, Cassandra etc.
  Can they provide latency similar to MySQL?
A cloud helps with setting up test clusters
  Different test clusters (node count, EBS RAID levels etc.)
  Easy manual cluster setup
  Automated test runs
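
An automated test run for the latency question could look roughly like the harness below. It is a generic sketch, not the tooling used for these tests: the function names are invented, and a plain dict stands in for the store under test, where a real run would call a client for the candidate database.

```python
import statistics
import time

def measure_latencies(operation, runs=1000):
    """Call operation(i) `runs` times and return each duration in milliseconds."""
    samples = []
    for i in range(runs):
        start = time.perf_counter()
        operation(i)
        samples.append((time.perf_counter() - start) * 1000)
    return samples

def summarize(samples_ms):
    """Report median, 99th percentile, and worst-case latency."""
    ordered = sorted(samples_ms)
    p99_index = min(len(ordered) - 1, int(len(ordered) * 0.99))
    return {
        "median_ms": statistics.median(ordered),
        "p99_ms": ordered[p99_index],
        "max_ms": ordered[-1],
    }

# Stand-in store: a dict. Replace `write` with a call to the real client.
store = {}

def write(i):
    store[f"user:{i}"] = {"coins": i}

print(summarize(measure_latencies(write)))
```

Looking at the 99th percentile as well as the median matters here, since a game request is only as fast as its slowest storage call.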
20. Monster World at Amazon EC2
Starting Point
Finding Helpers
Challenges and Solutions
Looking back
21. Monster World now has more than 1,100,000 daily active users
Screenshot: Scalarium
22. We still have only 2 backend developers to operate this
23. We still have only 2 backend developers to operate this
24. Know what it means to be in a Cloud
Using a cloud has some disadvantages
  Another game with dedicated HW has 8x better performance
  I/O and network performance of EC2 is quite … err … limited
  You cannot pick the best hardware possible
  All hosts have the same chance of failure
But it offers unique advantages
  Having unlimited servers on demand is just awesome!
  You pay only for what you need when you need it
  You can concentrate on your product
  It's very easy to experiment
25. Play to its strengths and adjust for its weaknesses
Play to its strengths
  Program your infrastructure, automate as much as possible
  Measure closely and react to changes
  Scaling up and out is quite easy
  Sit back and relax…
And adjust for its weaknesses
  Avoid I/O – consider an in-memory database or caching
  Be prepared that every host can fail
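
In code, being prepared for host failure often comes down to never depending on a single host. The sketch below shows one simple pattern, trying hosts in order until one answers. The host names and the `fake_request` function are hypothetical and only exist to make the example runnable.

```python
def call_with_failover(hosts, request):
    """Try each host in order and return the first successful response."""
    failed = []
    for host in hosts:
        try:
            return request(host)
        except ConnectionError:
            failed.append(host)   # this host is down, move on to the next
    raise RuntimeError(f"all hosts failed: {failed}")

# Hypothetical usage: "app-1" is down, "app-2" answers.
def fake_request(host):
    if host == "app-1":
        raise ConnectionError(f"{host} unreachable")
    return f"ok from {host}"

print(call_with_failover(["app-1", "app-2"], fake_request))  # ok from app-2
```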
26. Thank you!
ps. wooga.com/jobs
jesper@wooga.com