SSP: Spil Storage Platform
Thijs Terlouw – Senior Backend Engineer
12th July 2012

Spil Storage Platform (Erlang) @ EUG-NL

Presentation about the Spil Storage Platform (SSP) written in Erlang. This talk was first given at the Erlang User Group Netherlands in July 2012 hosted at Spilgames in Hilversum.

  1. SSP: Spil Storage Platform
     Thijs Terlouw – Senior Backend Engineer
     12th July 2012
  2. Schedule
     1. Background
        • Problems
        • Wish list
     2. Solution
     3. Challenges
     4. Performance
     5. Lessons learned
  3. Background
     Mission Spil Games: “unite the world in play”
     • localized social-gaming platforms
     • focus on: teens, girls and family
     • many portals:
       • girlsgogames.com
       • agame.com
  4. (no text on this slide)
  5. Background
     • Over 200 countries, 15+ different languages
     • On average 85 minutes per month per user
     • Over 4000 online games
     • 200 million unique users per month
  6. Background
     • Traditional LAMP stack
     • Tweaked over time to keep up with growth
     • Reaching the limits of the current system
     • One of the largest problems is the database
  7. Problems: the database
     • Not all developers are DB experts
       • security
       • performance
       • caching
     • Changing requirements
     • Difficult to shard the databases
  8. Wish list
     1. Transparent scalability
        • Sharding data
        • Scalable applications on top of sharded data
     2. Multi-database transactions
        • atomic operations across machines
     3. Fast enough (low-ish latency, high throughput)
     4. Highly available (central system)
     5. Can handle large dataset
     6. Offer flexibility (trade consistency for speed for instance)
     7. Use MySQL (experience in-house DB-team)
     8. Don’t expose SQL to devs, offer business-specific model
        • Storage specific security measures (character escaping)
     9. Allow changes to storage layer without affecting business (versioning)
     10. Centralize ownership of caching
  9. Schedule
     1. Background
     2. Solution
     3. Challenges
     4. Performance
     5. Lessons learned
  10. Solution
      • No matching Open Source projects
      • So we want a massively scalable, soft real-time, highly available system
      • Implement it ourselves: Erlang obvious candidate
        Not the first to think of this:
        • Amazon SimpleDB
        • Riak
      • Use Open Source where possible
  11. Solution: mindset
      1. Our system should be always on
      2. No global locks
      3. Inconsistencies are the norm
         • Hardware breaks down (power failures etc)
         • Version mismatches (upgrading system non atomic)
         • State mismatches (adding new machine)
  12. SSP: Spil Storage Platform
      [figure labels: Bucket; Buckets: Erlang]
  13. SSP: Overview
      • Bucket is a list of records of a specific type. Structured data! A bucket can map to one or several MySQL database tables and offers a CRUD-like interface (with filters)
      • All data is identified by a unique GID (64 bit integer)
      • All requests for a particular GID are handled by one Pipeline process (sequentially)
  14. SSP Overview
  15. SSP: Pipeline
      • Why do we need Pipelines?
        • Sequential = bottleneck !?!
        • Don’t you guys know Erlang is about PARALLELIZING work?
  16. SSP: Pipeline
      • Drawbacks:
        • For hotspots (game with a gazillion …) sequential (read) access is bad indeed
        • Optimization: allow dirty read (try local cache first, outside pipeline), other solutions possible.
      • Advantages:
        • Facilitates scalability (no global locks, but per bucket/GID sync)
        • Pipelines make multi-database consistency easier
      Requests to most GIDs (users) are evenly distributed
  17. SSP: Finding the Pipeline
      {bucket, phash2(Gid, Ringsize)}
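The lookup key on slide 17 can be sketched as a small function. This is a minimal illustration, not the SSP code: the module name `pipeline_lookup` and the function `slot/3` are hypothetical, and the assumption is that the resulting `{Bucket, Slot}` tuple is what gets resolved to the Pipeline Factory owning that part of the ring. Only `erlang:phash2/2` is taken from the slide.

```erlang
-module(pipeline_lookup).
-export([slot/3]).

%% Hypothetical helper: map a GID to a ring slot for a given bucket.
%% erlang:phash2(Term, Range) returns an integer in 0..Range-1 and is
%% stable across nodes, so every node computes the same slot for a GID.
slot(Bucket, Gid, RingSize)
  when is_atom(Bucket), is_integer(Gid), is_integer(RingSize), RingSize > 0 ->
    {Bucket, erlang:phash2(Gid, RingSize)}.
```

Because the hash is deterministic, repeated calls for the same GID always land on the same slot, which is what allows all requests for one GID to be serialized in one place.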
  18. SSP: Bucket
      • Each bucket is an OTP application
      • Buckets are largely generated
      • XML -> SQL + PIQI -> Erlang
        – Using XSLT
        – Piqic
  19. Piqi?
      • PIQI is
        • data definition language
        • cross-language data serialization system compatible with Protocol Buffers
        • Piqi-RPC: an RPC-over-HTTP system for Erlang
        • Would be better if transport was pluggable
        • http://piqi.org/
  20. SSP: Example Bucket XML definition
  21. gidlog.piqi
      Mostly templated via XSLT
  22. gidlog_accessors.hrl
      • Parse piqi-generated hrl: epp:parse_file/3
      • mostly template
      • added as dep
  23. SSP: bucket implementation
      • bucketX.erl
        – include_lib("…/bucketX_accessors.hrl")
        – verify_record(R)
        – start/0 and start_link/0
        – init/1
        – get_fun(Version), del_fun(V), insert_fun(V), …
      • bucketX_v1.erl
        – del, insert, … (Gid, Shard, Filters)
        – get mysql pool
        – build some SQL
        – emysql:execute(Poolname, Sql)
  24. SSP: Versions
      1. A bucket is versioned. The interface of a bucket is stable, but implementation can vary
      2. We can go up or down a version, migration is automatic
         • Mirror-mode is introduced so we can write to multiple versions (but read from only one version)
  25. SSP: Shards (storage level)
      1. GIDs (eg users) are sharded automatically.
         • Each version might have multiple shards
      2. Redundancy (of data) is handled by MySQL
      {bucket, GID} -> {Version, Shard} mapping
      • Version default: config
      • Shard default: default rule GID % shards
      • Actual version/shard per GID stored in DB (cached)
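The `{bucket, GID} -> {Version, Shard}` mapping with its defaults could look roughly like the sketch below. Everything named here is an assumption for illustration: the module `shard_lookup`, the function `locate/4`, and the `Overrides` map, which stands in for the per-GID mapping that the slide says is stored in the DB and cached. GIDs are assumed to be non-negative integers.

```erlang
-module(shard_lookup).
-export([locate/4]).

%% Overrides: #{ {Bucket, Gid} => {Version, Shard} }, a stand-in for the
%% stored (and cached) per-GID mapping.
%% Defaults: {DefaultVersion, NumShards}, a stand-in for the config.
locate(Bucket, Gid, Overrides, {DefaultVersion, NumShards})
  when is_integer(Gid), Gid >= 0, is_integer(NumShards), NumShards > 0 ->
    case maps:find({Bucket, Gid}, Overrides) of
        {ok, {Version, Shard}} ->
            %% An explicit mapping for this GID wins.
            {Version, Shard};
        error ->
            %% Default rule from the slide: GID % shards.
            {DefaultVersion, Gid rem NumShards}
    end.
```

For example, with 4 shards and no override, GID 10 resolves to shard 2 of the default version.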
  26. SSP: Cache
      • Each node has a private Memcached instance
      • We store all data for a GID/bucket in this cache
        • Filters applied after retrieving data from cache
      • Don't change data in storage outside of the SSP!
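A rough sketch of "filters applied after retrieving data from cache": the whole record list for a GID/bucket is fetched in one go and narrowed down afterwards. The module `cache_filter`, the function `fetch/3`, and the use of a plain map in place of Memcached are all assumptions made to keep the example self-contained; filters are modelled as predicate funs, which may differ from SSP's actual filter format.

```erlang
-module(cache_filter).
-export([fetch/3]).

%% Cache:   #{ {Bucket, Gid} => [Record] }, standing in for Memcached.
%% Filters: list of fun((Record) -> boolean()); a record is returned
%%          only if every filter accepts it.
fetch(Cache, Key, Filters) ->
    case maps:find(Key, Cache) of
        {ok, Records} ->
            Keep = fun(R) -> lists:all(fun(F) -> F(R) end, Filters) end,
            {hit, [R || R <- Records, Keep(R)]};
        error ->
            %% The caller would fall back to storage and fill the cache.
            miss
    end.
```

Storing the unfiltered list means any filter combination can be answered from a single cache entry, at the cost of moving more data per lookup.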
  27. Schedule
      1. Background
      2. Solution
      3. Challenges
      4. Performance
      5. Lessons learned
  28. Challenge: controlled shutdown node
  29. Challenge: controlled shutdown node
      How do we shut down a node without losing jobs?
      • Shut down bucketX application on a node
        • stop pipeline factories (PF) on this node (for bucketX)
        • hand over work to other PF (on other nodes)
          – couple of mnesia ring reads
          – move ETS table contents to new PF
          – remember which PF took over (so we can forward)
        • If we go to another node, clone Pipeline (gen2 pri)
        • remove this node from the lookup ring
        • all PFs fix their hash range based on ring
        • Because there is a race condition handing over many to one (non-continuous blocks) PF
        • Sleep a while (actually wait for pipeline handovers)
  30. Note: shutdown application
      • if you terminate an application, all processes that were started (even if not linked) are terminated!
      • bit hidden in documentation of application:start/2 and stop/1
      • so we need to explicitly set the group_leader to something that never shuts down:
          init(#state{} = S) ->
              group_leader(whereis(init), self()),
              {ok, S}.
  31. Challenge: shutdown pipeline
      • The Pipeline process that we spawn per GID needs to shut down when done (less memory)
      • When is it actually done?
      • Work might be assigned to the Pipeline just when the Pipeline decides it is done: race conditions!
  32. Challenge: shutdown pipeline (2)
      • All requests for a GID are handled by a single Pipeline Factory
      • The pipeline will issue a ‘work done’ command to the PF with a ‘CommandCounter’
      • PF maintains an ETS table
        • Lookup if the registered CommandCounter for that GID is the same as the reported number
        • If so: tell the Pipeline to die
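The CommandCounter check on slide 32 can be illustrated with a small ETS sketch. The names (`pf_counter`, `assign/2`, `work_done/3`) and the table layout `{Gid, Counter}` are assumptions; the slides only say that the PF keeps an ETS table and compares the registered counter with the one the pipeline reports. The sketch also assumes that all three functions are called from the single Pipeline Factory process, which is what makes the lookup-then-delete sequence safe.

```erlang
-module(pf_counter).
-export([new/0, assign/2, work_done/3]).

%% Create the table that holds {Gid, CommandCounter} rows.
new() ->
    ets:new(pf_counters, [set, public]).

%% Called whenever the factory hands a command to the pipeline of Gid.
%% Bumps the counter (inserting {Gid, 0} first if needed) and returns it.
assign(Tab, Gid) ->
    ets:update_counter(Tab, Gid, {2, 1}, {Gid, 0}).

%% Called when the pipeline reports 'work done' together with the
%% counter of the last command it processed.
work_done(Tab, Gid, Reported) ->
    case ets:lookup(Tab, Gid) of
        [{Gid, Reported}] ->
            %% Nothing was assigned in the meantime: safe to stop.
            true = ets:delete(Tab, Gid),
            stop;
        _ ->
            %% More work was assigned after the pipeline decided it was
            %% done, so it has to keep running.
            continue
    end.
```

If a command is assigned between the pipeline's last job and its 'work done' message, the counters differ and the pipeline is told to continue, which closes the race described on slide 31.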
  33. Challenge: high uptime
      • We want continuous usage of SSP
        – Even while upgrading bucket versions
        – So there can be multiple versions running simultaneously
      • Take care of creating closures
      • Atomic behavior per GID
  34. Challenge: quite complex system
  35. Schedule
      1. Background
      2. Solution
      3. Challenges
      4. Performance
      5. Lessons learned
  36. Performance
      • Currently we run SSP in 'shadow' mode, so no real data yet. Making realistic benchmarks is quite a lot of work.
      • Latency (local machine):
        – 6-26ms to do a GET request on a primary key (cache miss)
        – 0.6ms with a cache hit
        – Cache stores Erlang terms currently (term_to_binary)
      • Always read from cache
        – Does not detect changes in storage done outside SSP
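Storing Erlang terms in the cache via `term_to_binary` amounts to the round trip below. The module and function names are hypothetical; only the two BIFs are standard. Since the slides describe the Memcached instance as private to the node, the plain `binary_to_term/1` is used here; data from an untrusted source would call for `binary_to_term(Bin, [safe])` instead.

```erlang
-module(cache_codec).
-export([encode/1, decode/1]).

%% Serialize any Erlang term into a binary that can be stored as a
%% cache value.
encode(Term) ->
    term_to_binary(Term).

%% Turn a cached binary back into the original term.
decode(Bin) when is_binary(Bin) ->
    binary_to_term(Bin).
```

The external term format is only readable from Erlang, which fits the slide's remark that this is the current approach.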
  37. Performance
      • Requests (local):
        – Getting from cache at about 13.5K req/sec
          • elibs_benchmark:test_fun(gidlog_get, fun() -> gidlog:get(123456) end, 10, 10000).
        – Getting from mysql about 615 req/sec incl cache miss
          • elibs_benchmark:test_fun(gidlog_get, fun() -> {_,_,C} = os:timestamp(), gidlog:get(C) end, 10, 100).
        – ~2 SSP machines can saturate a MySQL machine
        – 8K writes/sec for 2 MySQL + 4 SSP machines (old hardware)
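`elibs_benchmark` is an internal library and the slides do not explain its last two arguments, so the sketch below does not try to reproduce it. It only shows, under that caveat, how a requests-per-second figure can be measured with the standard `timer:tc/1`; the module `bench_sketch` and the function `req_per_sec/2` are hypothetical, and the loop is purely sequential.

```erlang
-module(bench_sketch).
-export([req_per_sec/2]).

%% Run Fun N times in sequence and return the measured rate as a float
%% (calls per second).
req_per_sec(Fun, N) when is_function(Fun, 0), is_integer(N), N > 0 ->
    {Micros, ok} = timer:tc(fun() -> repeat(Fun, N) end),
    %% Guard against a 0 microsecond reading for very cheap functions.
    N * 1000000 / max(Micros, 1).

repeat(_Fun, 0) -> ok;
repeat(Fun, N) -> Fun(), repeat(Fun, N - 1).
```

A call such as `bench_sketch:req_per_sec(fun() -> gidlog:get(123456) end, 10000)` would measure a single caller; throughput figures like those on the slide generally need several concurrent callers.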
  38. Schedule
      1. Background
      2. Solution
      3. Challenges
      4. Performance
      5. Lessons learned
  39. Lessons learned (1)
      • There are many good Open Source libraries
        • Emysql: we have added transaction support
        • Eep0018: fast json encoder/decoder (yajl c++)
        • Estatsd: graphite-capable monitoring
        • Poolboy: Erlang worker pool factory (for memcached)
        • Twig/Lager: logging (syslog)
  40. Lessons learned (2)
      • Mnesia is great to replicate state across machines
        • Faster local lookups
        • Less error prone
      • Encapsulate all Mnesia usage in a module
        • Adding nodes to Mnesia
        • Use ram_copies
        • Transactions are great
      • We deploy an Erlang cluster (with Mnesia replication) only inside a single DataCenter
        • Not across unreliable connections!
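"Encapsulate all Mnesia usage in a module" might look like the following single-node sketch. The module `ring_store`, the `ring_slot` table, and its fields are hypothetical and not taken from SSP; the Mnesia calls themselves (`mnesia:start/0`, `mnesia:create_table/2`, `mnesia:transaction/1`, `mnesia:write/1`, `mnesia:dirty_read/2`) are standard. Adding further nodes is left out.

```erlang
-module(ring_store).
-export([init/0, put_slot/2, get_slot/1]).

%% Hypothetical table: which node owns which ring slot.
-record(ring_slot, {slot, node}).

%% Start Mnesia and create the RAM-only table on the local node.
init() ->
    ok = mnesia:start(),
    case mnesia:create_table(ring_slot,
                             [{ram_copies, [node()]},
                              {attributes, record_info(fields, ring_slot)}]) of
        {atomic, ok} -> ok;
        {aborted, {already_exists, ring_slot}} -> ok
    end.

%% Writes go through a transaction so replicas stay consistent.
put_slot(Slot, Node) ->
    Write = fun() -> mnesia:write(#ring_slot{slot = Slot, node = Node}) end,
    {atomic, ok} = mnesia:transaction(Write),
    ok.

%% Reads are local and cheap; a dirty read skips transaction overhead.
get_slot(Slot) ->
    case mnesia:dirty_read(ring_slot, Slot) of
        [#ring_slot{node = Node}] -> {ok, Node};
        [] -> not_found
    end.
```

Callers only see `put_slot/2` and `get_slot/1`, so the table layout or the storage type can change without touching the rest of the code.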
  41. Lessons learned (3)
      • XML + XSD + XSLT are great to define API
        • They might have a bad name, but work great
        • Can transform in any other format
        • Used to generate documentation
      Todo:
      • generate more code (Buckets)
      • write gen_bucket behaviour
      • don't start with generating code
  42. Lessons learned (4)
      • Rebar is great
        • Compilation is pretty convenient, but the best part are the “dependencies”
        • Also the worst part
        • We have proposed two improvements:
          • Allow different projects to share dependencies (major speedup for compiling)
          • Smarter version conflict resolution (semantic versioning: [ “>= 1.3.1”, “< 2.0.0” ] )
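To make the proposed constraint list `[">= 1.3.1", "< 2.0.0"]` concrete, here is a small checker. It is an illustration only and not Rebar's API: the module `vsn_range` and the function `satisfies/2` are hypothetical, and the sketch assumes plain numeric `Major.Minor.Patch` versions without pre-release or build suffixes.

```erlang
-module(vsn_range).
-export([satisfies/2]).

%% True if Vsn (e.g. "1.4.0") meets every constraint in the list,
%% where each constraint is a string such as ">= 1.3.1".
satisfies(Vsn, Constraints) ->
    V = parse(Vsn),
    Check = fun(Constraint) ->
                    [Op, Bound] = string:lexemes(Constraint, " "),
                    compare(Op, V, parse(Bound))
            end,
    lists:all(Check, Constraints).

%% "1.3.1" -> [1, 3, 1]; lists of integers compare element by element,
%% which gives the usual version ordering for equal-length versions.
parse(Vsn) ->
    [list_to_integer(Part) || Part <- string:lexemes(Vsn, ".")].

compare(">=", A, B) -> A >= B;
compare(">",  A, B) -> A > B;
compare("<=", A, B) -> A =< B;
compare("<",  A, B) -> A < B;
compare("==", A, B) -> A == B.
```

With these constraints any 1.x release from 1.3.1 onward is accepted, while 2.0.0 and later are rejected.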
  43. Lessons learned (5)
      • We use #records{} for all APIs
        – Piqi input/output
        – Stable and well-defined
        – Will move to ProtocolBuffers
      • Use OTP applications everywhere
        – Start/stop stuff
        – See started apps: application:which_applications()
      • Terminate on fatal errors
        – Memcached down: terminate all buckets, don't try to recover (prevent overload DB)
  44. Lessons learned (6)
      • You need to add admin/monitoring interface
  45. Open Source
      We will not open-source SSP, but we do actively contribute to libraries used in SSP (so far Emysql, Rebar, Piqi)
  46. THANKS!
      Questions?
      Thijs.Terlouw@spilgames.com