Utopia Kingdoms scaling case. From 4 users to 90k+
Jaime Buelta
Soft. Developer at
The Game
Utopia Kingdoms
●   Fantasy strategy game
●   Build your own Kingdom
●   Create armies and attack other Kingdoms
●   Join other Kingdoms in an Alliance
●   Manage resources
●   Available in Facebook and Kongregate
    http://www.facebook.com/UtopiaKigdomsGame
http://www.kongregate.com/games/JoltOnline/utopia-kingdoms
Technology stack
Technology Stack - Backend

       Python
 Cherrypy framework

  Amazon SimpleDB
 Linux in Amazon EC2
Technology Stack - Frontend

            HTML
(generated by Genshi templates)

            jQuery
Some points of interest
                 (will discuss them later)

●   Your resources (population, gold, food, etc.) grow with time
●   Your actions (build something, attack a player) typically take some time
●   Players are ranked against the rest
●   You can add friends and enemies
Do not guess
Measure
Measurement tools
●   OS tools
    ●   Task manager (top)
    ●   IO monitor (iostat)
●   Monitoring tools (Munin, Nagios)
●   Logs
    ●   Need to find a good compromise between detail and relevance
●   Profiling
You've got to love profiling
●   Generate profiles with the cProfile module
    Profile the whole application with
    python -m cProfile -o file.prof my_app.py
    (not very useful in a web app)
●   If you're using a framework, profile only your
    functions to reduce noise
Profile decorator (example)
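import cProfile
import functools
import time

def profile_this(func):
  # Decorator: profile every call to func and dump the stats to a file
  @functools.wraps(func)
  def wrapper(*args, **kwargs):
    prof = cProfile.Profile()
    retval = prof.runcall(func, *args, **kwargs)
    filename = 'profile-{ts}.prof'.format(ts=time.time())
    prof.dump_stats(filename)
    return retval
  return wrapper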
Analyzing profile
●   gprof2dot
    ●   Using dot, convert to graph
        gprof2dot -f pstats file.prof | dot -Tpng -o file.png
    ●   Good for workflows
●   RunSnakeRun
    ●   Good for cumulative times
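Besides the graphical tools above, the standard library's pstats module can print a quick text summary of a .prof file. This is a different tool from the two on the slide. A minimal sketch, where `busy` and `profile_to_file` are hypothetical names standing in for a request handler and a helper:

```python
import cProfile
import os
import pstats
import tempfile


def busy(n):
    # Hypothetical workload standing in for a request handler
    return sum(i * i for i in range(n))


def profile_to_file(func, *args):
    # Run func under the profiler and dump the stats to a temporary file
    prof = cProfile.Profile()
    result = prof.runcall(func, *args)
    fd, path = tempfile.mkstemp(suffix='.prof')
    os.close(fd)
    prof.dump_stats(path)
    return result, path


result, path = profile_to_file(busy, 10000)
stats = pstats.Stats(path)
# Print the 5 entries with the highest cumulative time
stats.sort_stats('cumulative').print_stats(5)
os.remove(path)
```

Sorting by cumulative time gives roughly the same view RunSnakeRun is good for, without leaving the terminal.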
Example of RunSnakeRun
Example of gprof2dot
The power of cache
All static content should be served outside Python
●   Use a good web server to serve all static
    content (static HTML, CSS, JavaScript code)
●   Some options
    ●   Apache
    ●   Nginx
    ●   Cherokee
    ●   Amazon S3
Use memcached
(and share the cache between your servers)
Example
●   Asking the DB for friends/enemies
    ●   Costly request in SimpleDB (using an SQL statement)
    ●   Made on each request
●   Cache the friends in memcached for 1 hour
●   Invalidate the cache when adding/removing
    friends or enemies
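The friends example can be sketched as a cache-aside pattern. To keep the sketch runnable without a memcached server, `FakeCache` is a hypothetical in-memory stand-in exposing get/set/delete; `db`, `load_friends_from_db`, `get_friends` and `add_friend` are illustrative names as well, not the game's real code.

```python
import time


class FakeCache:
    """In-memory stand-in for a memcached client (get/set/delete with expiry)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        value, expires = self._data.get(key, (None, 0))
        if expires < time.time():
            # Missing or expired entry
            self._data.pop(key, None)
            return None
        return value

    def set(self, key, value, ttl):
        self._data[key] = (value, time.time() + ttl)

    def delete(self, key):
        self._data.pop(key, None)


cache = FakeCache()
FRIENDS_TTL = 60 * 60          # 1 hour, as on the slide
db = {'alice': ['bob']}        # hypothetical stand-in for the SimpleDB data
db_reads = []                  # records every (costly) DB read


def load_friends_from_db(player):
    db_reads.append(player)
    return list(db.get(player, []))


def get_friends(player):
    # Try the cache first; fall back to the DB and store the result
    key = 'friends:' + player
    friends = cache.get(key)
    if friends is None:
        friends = load_friends_from_db(player)
        cache.set(key, friends, FRIENDS_TTL)
    return friends


def add_friend(player, friend):
    db.setdefault(player, []).append(friend)
    # Invalidate so the next read sees the new friend
    cache.delete('friends:' + player)
```

With a real memcached client, the cache object would be shared by all request servers, so an invalidation on one server is seen by the others.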
Caching caveats
●   Cache only after knowing there is a problem
●   Do not rely on the cache for storage
●   Keep an eye on the size of cached data
●   Choosing a good cache time can be difficult /
    invalidating the cache can be complex
●   Some data is too dynamic to be cached
Caching is not just memcached
●   More options available:
    ●   Load into memory at startup
    ●   File cache
    ●   Client-side cache
Parse templates just once
●   Template rendering modules have options
    to parse the templates just once
●   Be sure to activate it in production
●   In development, you'll most likely want to
    parse them each time
●   The same applies to regexes, especially complex ones
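For regexes, "parse just once" usually means compiling the pattern at module level and reusing the compiled object. A minimal sketch; the pattern and the `parse_resource` helper are made up for illustration:

```python
import re

# Compiled once at import time and reused by every call
RESOURCE_RE = re.compile(r'^(?P<name>[a-z]+):(?P<amount>\d+)$')


def parse_resource(text):
    # Returns (name, amount) or None when the text does not match
    match = RESOURCE_RE.match(text)
    if match is None:
        return None
    return match.group('name'), int(match.group('amount'))
```

The same idea applies to templates: keep the parsed template object around instead of rebuilding it on each request.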
More problems
Rankings
●   Sorting players in the DB gets slow as
    the number of players grows
●   Solution:
    ●   Independent ranking server (operates just in
        memory)
    ●   Works using binary trees
    ●   Small Django project, communicates using XML-RPC
●   Drawback:
    ●   Data is not persistent; if the rankings server goes
        down, it needs time to rebuild the rankings
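The idea of an in-memory ranking can be sketched as follows. Note the swap: the real server used binary trees, while this sketch keeps a sorted list with the standard bisect module, which is simpler but has slower inserts on large lists. The `Ranking` class and its methods are hypothetical names.

```python
import bisect


class Ranking:
    """In-memory ranking; higher scores rank first."""

    def __init__(self):
        self._scores = {}   # player -> current score
        self._sorted = []   # sorted list of (-score, player)

    def set_score(self, player, score):
        old = self._scores.get(player)
        if old is not None:
            # Remove the previous entry before inserting the new one
            idx = bisect.bisect_left(self._sorted, (-old, player))
            del self._sorted[idx]
        self._scores[player] = score
        bisect.insort(self._sorted, (-score, player))

    def rank(self, player):
        # 1-based position; players with equal scores are ordered by name
        score = self._scores[player]
        return bisect.bisect_left(self._sorted, (-score, player)) + 1

    def top(self, n):
        return [(player, -neg) for neg, player in self._sorted[:n]]
```

Since nothing is persisted, a restart would have to call `set_score` for every player again, which is the rebuild cost mentioned above.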
Database polling - Resources
●   There was a process just taking care of the
    growth of resources.
    ●   It went element by element, increasing the
        values
    ●   It polled the DB constantly, even when the user's
        values were already at their maximum
●   Solution: increment the resources of a user only the next
    time the user is accessed (by the user or by others)
    ●   No DB usage while the user is not in use
    ●   The request already reads the user from the DB
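Updating resources on access can be sketched as a function of the time elapsed since the last update. The field names (`resources`, `rates`, `max`, `last_update`) are assumptions for illustration, not the game's real schema:

```python
def update_resources(player, now):
    """Bring a player's resources up to date at access time."""
    elapsed = now - player['last_update']
    for name, rate in player['rates'].items():
        grown = player['resources'][name] + rate * elapsed
        # Never grow beyond the maximum for this resource
        player['resources'][name] = min(grown, player['max'][name])
    player['last_update'] = now
    return player
```

The request calls this right after reading the player, so an idle player costs no DB work at all.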
Database polling - Actions
●   Lots of actions are delayed: recruiting a unit,
    buildings, raids...
●   A process checked each user to see if an action had to
    be done NOW.
    ●   Tons of reads just to find “not now”
    ●   Long delays in some actions, as they were not
        executed on time
Database polling - Actions
●   Implement a queue to execute the actions at
    the proper time:
    ●   Beanstalk (allows deferred extraction)
    ●   A process listens to this queue and performs the
        action, independently from the request servers.
    ●   The process can be launched on a different
        machine.
    ●   Multiple processes can extract actions faster.
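The delayed-queue idea can be sketched without a Beanstalk server. `DelayedQueue` below is a hypothetical in-memory stand-in built on heapq, not the Beanstalk client API; the job layout and the handler names are also assumptions.

```python
import heapq
import itertools


class DelayedQueue:
    """In-memory stand-in for a queue that releases jobs at a given time."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for equal times

    def put(self, job, ready_at):
        heapq.heappush(self._heap, (ready_at, next(self._counter), job))

    def reserve(self, now):
        # Return the next job that is due, or None if nothing is ready yet
        if self._heap and self._heap[0][0] <= now:
            return heapq.heappop(self._heap)[2]
        return None


def run_worker(queue, now, handlers):
    # Execute every job that is due; no per-user reads are needed
    done = []
    job = queue.reserve(now)
    while job is not None:
        done.append(handlers[job['action']](job))
        job = queue.reserve(now)
    return done
```

The worker only touches jobs that are actually due, which removes the "not now" reads, and several workers could consume the same queue.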
DataBase Issues
Amazon SimpleDB
●   Key-value storage
●   Capable of SQL-like queries
●   Stores a dictionary (schemaless, multiple
    columns)
●   All the values are strings
●   Access through the boto module
●   Pay per use
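One consequence of all values being strings is that comparisons are made on text, so numbers need padding to sort and compare correctly. A minimal sketch of that encoding; the helper names and the width of 10 are assumptions:

```python
def encode_int(value, width=10):
    # Zero-pad so that string order matches numeric order (non-negative only)
    if value < 0:
        raise ValueError('negative values need an offset first')
    return str(value).zfill(width)


def decode_int(text):
    return int(text)


# As plain strings, '9' sorts after '10'; padded values sort numerically
plain = sorted(['9', '10', '100'])
padded = sorted(encode_int(v) for v in [9, 10, 100])
```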
Problems with SimpleDB
●   Lack of control
    ●   Can't use a local copy
        –   In development, you must access Amazon servers (slow
            and costly)
    ●   Can't back up except manually
    ●   Can't analyze or change the DB (e.g. can't define
        indexes)
    ●   Can't monitor the DB
Problems with SimpleDB
●   Bad tool support
●   Slow, with high variability (especially on SQL
    queries)
    ●   Sometimes queries just time out and have to be
        repeated.
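Repeating a timed-out query can be wrapped in a small retry helper. This is only a sketch: it catches Python's built-in TimeoutError, while the exception raised by a real client library would differ, and `with_retries` is a hypothetical name.

```python
import time


def with_retries(query, attempts=3, delay=0.1, sleep=time.sleep):
    """Call query(); on TimeoutError, retry with a growing pause."""
    for attempt in range(1, attempts + 1):
        try:
            return query()
        except TimeoutError:
            if attempt == attempts:
                # Out of attempts: let the caller see the failure
                raise
            sleep(delay * attempt)
```

Retries hide the variability from the player but not the latency, which is part of why the slow queries were a problem in the first place.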
Migrate to MongoDB
MongoDB
●   NoSQL
●   Schemaless
●   Fast
●   Allows complex queries
●   Retain control (backups, measuring queries, etc.)
●   Previous experience using it in ChampMan
Requirements for the migration
●   Low-level approach
●   Objects are basically dictionaries
●   Be able to save only dirty fields (avoid saving
    unchanged values)
●   Log queries to measure performance
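Dirty-field tracking on a dictionary-like object can be sketched as below. This is an illustration of the requirement, not MongoSpell's actual implementation; `DirtyDict` and its methods are hypothetical names.

```python
class DirtyDict(dict):
    """Dictionary that remembers which keys changed since the last save."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.dirty = set()

    def __setitem__(self, key, value):
        # Only item assignment is tracked; dict.update() bypasses this hook
        if key not in self or self[key] != value:
            self.dirty.add(key)
        super().__setitem__(key, value)

    def changes(self):
        # Only these fields would need writing (e.g. with MongoDB's $set)
        return {key: self[key] for key in self.dirty}

    def mark_saved(self):
        self.dirty.clear()
```

Saving would then send only `changes()` to the database and call `mark_saved()` afterwards.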
MongoSpell
●   Thin wrapper over pymongo
●   Objects are just dictionary-like elements
●   Minimal schema
●   Fast!
●   Able to log queries
●   It will probably be released soon as open
    source
Definition of collections
class Spell(Document):
    collection_name = 'spells'
    needed_fields = ['name', 'cost', 'duration']
    optional_fields = ['elemental']
    activate_dirty_fields = True
    indexes = ['name__unique', 'cost']
Querying from DB
Spell.get_from_db(name='fireball')
Spell.filter()
Spell.filter(sort='name')
Spell.filter(name__in=['fireball', 'magic missile'])
Spell.filter(elemental__fire__gt=2)
Spell.filter(duration__gt=2,
             cost=3,
             hint='cost')
Spell.filter(name='fireball', only='cost')
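One plausible way such keyword filters could map to a MongoDB query document is sketched below. MongoSpell's internals are not shown in the slides, so `build_query` and the `OPERATORS` table are assumptions; the sketch covers only field filters, not `sort`, `hint` or `only`.

```python
# Suffixes translated to MongoDB query operators
OPERATORS = {'gt': '$gt', 'gte': '$gte', 'lt': '$lt', 'lte': '$lte', 'in': '$in'}


def build_query(**filters):
    query = {}
    for key, value in filters.items():
        parts = key.split('__')
        if parts[-1] in OPERATORS:
            # Nested fields use MongoDB dot notation: elemental.fire
            field = '.'.join(parts[:-1])
            query.setdefault(field, {})[OPERATORS[parts[-1]]] = value
        else:
            query['.'.join(parts)] = value
    return query
```

For example, `build_query(elemental__fire__gt=2)` yields a document that could be passed to pymongo's `find`.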
Some features
●   Dirty fields
●   No type checks
●   Query logs
●   10x faster than SimpleDB!!!
Query logs



[07:46:06] - 2.6 ms - get_from_db - Reinforcement - Reinforcements.py(31)
[07:46:06] - 4.3 ms - get_from_db - Player        - Player.py(876)
[07:46:10] - 0.1 ms - filter      - Membership    - AllianceMembership.py(110)
[07:46:10] - 1.3 ms - get_from_db - Reinforcement - Reinforcements.py(31)
[07:46:10] - 1.4 ms - get_from_db - Notifications - Notifications.py(56)
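Log lines of this kind can be produced by timing each query call. A minimal sketch using a decorator; `log_query`, `query_log` and the dummy `get_from_db` are hypothetical, and the caller's file and line number from the real logs are left out:

```python
import functools
import time

query_log = []


def log_query(func):
    # Record a timestamp, the elapsed milliseconds and the query name
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            query_log.append('[{}] - {:.1f} ms - {}'.format(
                time.strftime('%H:%M:%S'), elapsed_ms, func.__name__))
    return wrapper


@log_query
def get_from_db(name):
    # Hypothetical query; a real one would hit the database
    return {'name': name}


spell = get_from_db('fireball')
print(query_log[-1])
```

Because the timing sits in a `finally` block, failed queries are logged as well.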
Scalability vs Efficiency
Scalable vs Efficient
●   Scalable: can support more users by adding more
    elements
●   Efficient: can support more users with the same
    elements

    Work on both to achieve your goals
Keep measuring and improving!
(and monitor production to be proactive)
Thank you for your interest!

           Questions?


           jaime.buelta@gmail.com
 http://WrongSideOfMemphis.wordpress.com
           http://www.joltonline.com

Utopia Kingdoms scaling case: From 4 to 50K users
