Retaining globally distributed high availability
Art van Scheppingen
Head of Database Engineering

Example of a solution for retaining globally distributed high availability with MySQL

Published in: Technology

  1. Retaining globally distributed high availability. Art van Scheppingen, Head of Database Engineering
  2. Overview
       1. Who is Spil Games?
       2. Theory
       3. Spil Storage Platform
       4. Questions?
  3. Who are we? Who is Spil Games?
  4. Facts
       • Company founded in 2001
       • 350+ employees worldwide
       • 180M+ unique visitors per month
       • 45 portals in 19 languages
       • Casual games
       • Social games
       • Real-time multiplayer games
       • Mobile games
       • 35+ MySQL clusters
       • 60k queries per second (3.5 billion qpd)
  5. Geographic Reach
       180 Million Monthly Active Users(*)
       Source: (*) Google Analytics, August 2012
       • Over 45 localized portals in 19 languages
       • Multi-platform: web, mobile, tablet
       • Focus on casual and social games
       • 180M MAU per month (30M YoY growth)
       • Over 50M registered users
  6. Brands
       Girls, Teens and Family
       • spielen.com
       • juegos.com
       • gamesgames.com
       • games.co.uk
  7. Foundations: The exciting theory
  8. Retaining globally distributed HA
       • What exactly does it mean?
  9. What is high availability?
       Wikipedia: "High availability is a system design approach and associated service implementation that ensures a prearranged level of operational performance will be met during a contractual measurement period."
       Oracle:
       • Availability of resources in a computer system
  10. How do we reach HA with MySQL?
       • Master with (many) slave(s)
       Diagram: Master; Slave, Slave, Slave
  11. How do we reach HA with MySQL?
       • Master with (many) slave(s)
       • Multi Master
       Diagram: Master, Slave; Master, Slave
  12. How do we reach HA with MySQL?
       • Master with (many) slave(s)
       • Multi Master
       • Clustering
       Diagram: mysqld, mysqld; ndbd (five nodes); mgmt
  13. How do we reach HA with MySQL?
       • Master with (many) slave(s)
       • Multi Master
       • Clustering
       • Geographical redundancy
       Diagram: Master (local DC); Slave (local DC), Slave (Asia), Slave (US)
  14. What if we keep growing?
       • Scale up
           • Vertical
           • Faster CPU/memory/disks
           • Expensive
           • Costs multiply at the same rate as the number of nodes
       • Scale out
           • Horizontal
           • More (small) machines
           • Inexpensive
           • Partitioning/federating (sharding)
  15. Scale out
       • Functional
           • Shard your database functionally
       • Reads
           • Add more slaves (keep them coming!)
       • Writes
           • More disks
           • Horizontal partitioning
           • Federated partitions
  16. Horizontal partitioning
       • Breaking up tables in small parts on the same host
       • Partitioned on a column
       • Infinite growth (as long as you add disk space)
       • Less used data to slower (cheaper) disks
       • No stored procedures, functions, etc.
       • Uneven usage of partitions (hash partition may help)
       • Once written, data remains on the partition
  17. Federated partitions (sharding)
       • Breaking up your table in parts on multiple hosts
       • Partitioned on a column
       • Infinite growth (as long as you add hosts)
       • Less used data on slower hosts
       • Not supported in (standard) MySQL
       • Partitioning on application level (or proxy)
       • Alternatively: NDB
       • Uneven usage of partitions
       • Once written, data (mostly) remains on the partition
       • Parallel queries to retrieve data from all shards
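
The application-level partitioning and parallel retrieval described on this slide can be sketched as follows. This is a minimal illustration, not the approach used at Spil Games: the in-memory `SHARDS` list, the modulo routing in `shard_for`, and the helper names are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in: each dict plays the role of one database host.
SHARDS = [dict() for _ in range(4)]


def shard_for(key: int) -> dict:
    """Route a key to a shard by taking it modulo the shard count (assumed scheme)."""
    return SHARDS[key % len(SHARDS)]


def put(key: int, row: dict) -> None:
    """Write a row to the single shard that owns its key."""
    shard_for(key)[key] = row


def get(key: int):
    """Read a row from the single shard that owns its key."""
    return shard_for(key).get(key)


def scatter_gather(predicate) -> list:
    """Query every shard in parallel and merge the partial results."""
    def scan(shard: dict) -> list:
        return [row for row in shard.values() if predicate(row)]

    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        parts = list(pool.map(scan, SHARDS))
    return [row for part in parts for row in part]


for user_id in range(10):
    put(user_id, {"id": user_id, "gender": "f" if user_id % 2 else "m"})

females = sorted(row["id"] for row in scatter_gather(lambda r: r["gender"] == "f"))
```

Lookups by the partition column touch one shard, while any query on another column has to fan out to all of them, which is why the choice of partition column matters.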
  18. Amdahl's law
       • Parallel execution of sequential jobs
       • Limited by the weakest link
       • As fast as the slowest node
       • Fix: nonsequential (asynchronous) execution
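
The slide summarizes Amdahl's law informally. In its usual form, if a fraction p of a job can be spread over n workers, the overall speedup is 1 / ((1 - p) + p / n). A small sketch of that formula follows; the function name `amdahl_speedup` is chosen for this example only.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Speedup predicted by Amdahl's law: 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# With 90% of the work parallelizable, the speedup can never exceed 10x,
# no matter how many workers are added.
ten_workers = amdahl_speedup(0.9, 10)
many_workers = amdahl_speedup(0.9, 10**9)
```

The serial remainder is the "weakest link" of the slide: it bounds the speedup regardless of how many nodes are added.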
  19. Typical LAMP stack
       Diagram: Client; Loadbalancer; Webserver (PHP), Webserver (PHP); MySQL; Memcache
  20. Atypical LAMP stack
       Diagram: Client; Loadbalancer; Webserver (PHP), Webserver (PHP); MySQL; Memcache; MQ; Jobs
  21. Spil Storage Platform: Abstracting the storage layer
  22. What was our wishlist?
       • Dependent on one storage platform
           • No more platform-specific query language
       • Differentiate writes
           • Optimistic (asynchronous)
           • Pessimistic (synchronous)
       • Shard data better
           • Partition on user and function
           • Cluster information by users, not by function
       • Global expansion
           • Partition on geographic location
       • Solve uneven usage of data storage
           • Move data from shard to shard
       • Anything may/could/will fail eventually
           • Not designed for the "happy" flow
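
The difference between optimistic (asynchronous) and pessimistic (synchronous) writes from the wishlist can be illustrated with a toy store. This is only a sketch under assumptions: the `Store` class, its method names, and the in-memory queue are invented for the example and say nothing about how the Spil Storage Platform implements either mode.

```python
from collections import deque


class Store:
    """Toy key-value store with a synchronous and an asynchronous write path."""

    def __init__(self):
        self.rows = {}          # data that has actually been applied
        self.pending = deque()  # optimistic writes waiting to be applied

    def write_pessimistic(self, key, value) -> bool:
        """Synchronous: return only after the write has been applied."""
        self.rows[key] = value
        return True

    def write_optimistic(self, key, value) -> None:
        """Asynchronous: queue the write and return immediately."""
        self.pending.append((key, value))

    def flush(self) -> int:
        """Apply all queued writes in order; return how many were applied."""
        applied = 0
        while self.pending:
            key, value = self.pending.popleft()
            self.rows[key] = value
            applied += 1
        return applied


store = Store()
store.write_optimistic("score", 10)      # caller does not wait
store.write_pessimistic("email", "a@b")  # caller waits for the result
visible_before_flush = "score" in store.rows
applied = store.flush()
```

The optimistic path trades immediate visibility for lower latency, so it suits data where a short delay is acceptable.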
  23. Old architecture overview
  24. New architecture overview
  25. New architecture overview
       Diagram layers: Server API; Application Model; Storage platform; Client-side API; Presentation layer; Physical storage
  26. Our building blocks
       • Everything written in Erlang
       • Piqi as protocol
           • binary
           • JSON
           • XML
       • SSP utilizes local caching (memcache)
       • Flexible (persistent) storage layer
           • MySQL (various flavors)
           • Membase/Couchbase
           • Could be any other storage product
       • MQs (DWH updates)
  27. Why choose MySQL?
       • Predictable
       • Reliable
       • Decent performance
       • Easy to comprehend
       • Excellent ecosystem
           • Libraries
           • Monitoring tools
           • Knowledge
  28. Why Erlang?
       • Functional language
       • High availability: designed for telecom solutions
       • Excels at concurrency, distribution, fault tolerance
       • Do more with less!
       • Other companies using Erlang:
  29. How do we shard?
       • What is the bucket model?
       • Each record has one unique owner attribute (GID)
       • GID (Global IDentifier) identifying different types
       • Bucket(s) per functionality
       • Bucket is structured data
       • Attributes contain data of records
       • Attributes do not have to correspond to schema
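
One way to picture the bucket model described on this slide is as a small data structure: a bucket per functionality, holding records that each carry one owner GID plus free-form attributes. The `Record` and `Bucket` classes below are hypothetical names for this sketch and are not taken from the platform itself.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One record: a single owner GID plus arbitrary attributes."""
    gid: int
    attributes: dict = field(default_factory=dict)


@dataclass
class Bucket:
    """One bucket per functionality; records are grouped by their owner GID."""
    name: str
    records: dict = field(default_factory=dict)  # gid -> list of Record

    def put(self, record: Record) -> None:
        self.records.setdefault(record.gid, []).append(record)

    def get(self, gid: int) -> list:
        """Return all records owned by the given GID (empty list if none)."""
        return list(self.records.get(gid, []))


profile = Bucket("demobucket")
profile.put(Record(288511851128422401, {"given_name": "g", "gender": "m"}))
owned = profile.get(288511851128422401)
```

Because every record is reachable through its owner GID, the GID is a natural key to shard on.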
  30. Example bucket

        $ curl -X POST -H 'Accept: application/json' \
               -H 'Content-Type: application/json' \
               --data-binary '{"gid": 288511851128422401}' \
               http://127.0.0.1:8777/demobucket/get
        {
          "records": [
            {
              "gid": 288511851128422401,
              "given_name": "g",
              "registered_on": 1,
              "email": "mail1",
              "gender": "m",
              "birthdate": { "year": 1963, "month": 6, "day": 21 }
            }
          ],
          "meta_info": { "total_ct": 1 }
        }
  31. Example bucket MySQL 1

        CREATE TABLE demobucket (
          gid bigint(20) unsigned not null,
          given_name varchar(64) not null,
          registered_on tinyint(3) unsigned default 0,
          email varchar(255) not null,
          gender enum('m', 'f', 'u') not null default 'm',
          birthdate date not null,
          PRIMARY KEY(gid)
        );
  32. Example bucket MySQL 2

        CREATE TABLE demobucket (
          gid bigint(20) unsigned not null,
          user_name varchar(64) not null,
          user_register timestamp on update CURRENT_TIMESTAMP(),
          user_emailaddress varchar(255) not null,
          user_gender char(1) not null default 'm',
          user_dob varchar(10) not null,
          PRIMARY KEY(gid)
        );
  33. Example bucket Cassandra

        CREATE COLUMNFAMILY demobucket (
          gid int PRIMARY KEY,
          given_name varchar,
          registered_on timestamp,
          email varchar,
          gender varchar,
          birth_date varchar
        );
  34. Example Erlang filters

        %% Fetch records owned by GID 12345 where gender = "f",
        %% sorted ascending by registered_on, limited to 10 results.
        demobucket:get(#demobucket_get_input{gid = 12345, filters = [
            #filter{attr = <<"gender">>,        op = <<"=">>,     parms = {string, <<"f">>}},
            #filter{attr = <<"registered_on">>, op = <<"sort">>,  parms = asc},
            #filter{attr = <<"gid">>,           op = <<"limit">>, parms = {int, 10}}
        ]})
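
To make the effect of the three filters above concrete, here is a rough Python analogue that applies equality, sort, and limit filters to a list of records. It is an illustration only: the `apply_filters` function and the dictionary shape of each filter are assumptions for this sketch, not the platform's API.

```python
def apply_filters(records: list, filters: list) -> list:
    """Apply filters in order; each filter is a dict with attr, op and parms."""
    result = list(records)
    for flt in filters:
        attr, op, parms = flt["attr"], flt["op"], flt["parms"]
        if op == "=":
            # Keep only records whose attribute equals the given value.
            result = [r for r in result if r.get(attr) == parms]
        elif op == "sort":
            # parms is "asc" or "desc".
            result = sorted(result, key=lambda r: r[attr], reverse=(parms == "desc"))
        elif op == "limit":
            # parms is the maximum number of records to return.
            result = result[:parms]
        else:
            raise ValueError(f"unsupported filter op: {op}")
    return result


records = [
    {"gid": 12345, "gender": "f", "registered_on": 3},
    {"gid": 12345, "gender": "m", "registered_on": 1},
    {"gid": 12345, "gender": "f", "registered_on": 2},
]
selected = apply_filters(records, [
    {"attr": "gender", "op": "=", "parms": "f"},
    {"attr": "registered_on", "op": "sort", "parms": "asc"},
    {"attr": "gid", "op": "limit", "parms": 1},
])
```

Expressing queries as a list of filters rather than SQL is what lets the same request run against any storage backend.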
  35. Pipeline flow of a bucket
  36. Global distribution
       • Nearest datacenter (DC) to the end user
       • Satellite DC
           • Processing and caching
           • Does not own/store data
       • Storage DC
           • Processing, caching and persistent storage
       • Store all of a user's data in the same DC
       • Partition on user globally
       • Global IDentifier per user
  37. The lookup server
       • Contains GIDs and their master DC
       • A GID's master DC is predefined
       • Migrated GIDs get updated
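
The lookup server's role can be sketched as a mapping from GID to master DC with overrides for migrated GIDs. This is a guess at the shape of the idea, not the real service: the `LookupServer` class, the `default_dc_for` rule, and the DC names are all hypothetical.

```python
class LookupServer:
    """Maps each GID to its master DC; migrated GIDs override the default."""

    def __init__(self, default_dc_for):
        # default_dc_for: function gid -> DC name (the "predefined" master DC).
        self._default_dc_for = default_dc_for
        self._migrated = {}  # gid -> new master DC

    def master_dc(self, gid: int) -> str:
        """Return the migrated DC if one is recorded, else the predefined one."""
        return self._migrated.get(gid, self._default_dc_for(gid))

    def record_migration(self, gid: int, new_dc: str) -> None:
        """Update the entry for a GID whose data moved to another DC."""
        self._migrated[gid] = new_dc


# Assumed rule for the example: even GIDs live in "eu", odd GIDs in "us".
lookup = LookupServer(lambda gid: "eu" if gid % 2 == 0 else "us")
before = lookup.master_dc(1234)
lookup.record_migration(1234, "asia")
after = lookup.master_dc(1234)
```

Any DC can answer "where does this GID live?" from such a table and then forward the request to the owning DC.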
  38. How does this work?
       • Globally sharded on GID
       • (local) GID Lookup
       Diagram: GID lookup; Shard 1, Shard 2; Persistent storage
  39. Master/Satellite DC example
  40. Why do we need data migration?
       • Spread data evenly across shards
       • Migration of buckets between shards
       • GID migration between DCs
       • Creating a new storage DC needs data migration
       • Users will automatically be migrated after visiting another DC many times
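
The last bullet, automatic migration after many visits to another DC, could work along the lines of a simple visit counter. The sketch below is speculative: the threshold value, the `MigrationTracker` class, and the reset behaviour are assumptions, since the slides do not say how "many times" is decided.

```python
from collections import Counter

MIGRATION_THRESHOLD = 3  # hypothetical value; the talk does not give a number


class MigrationTracker:
    """Counts visits to non-master DCs and migrates a GID once a threshold is hit."""

    def __init__(self, master_dc_by_gid: dict):
        self.master_dc = dict(master_dc_by_gid)
        self.remote_visits = Counter()  # (gid, dc) -> visit count

    def visit(self, gid: int, dc: str) -> bool:
        """Record a visit; return True if this visit triggered a migration."""
        if self.master_dc[gid] == dc:
            return False
        self.remote_visits[(gid, dc)] += 1
        if self.remote_visits[(gid, dc)] < MIGRATION_THRESHOLD:
            return False
        # Move the GID to the DC it keeps visiting and reset its counters.
        self.master_dc[gid] = dc
        for key in [k for k in self.remote_visits if k[0] == gid]:
            del self.remote_visits[key]
        return True


tracker = MigrationTracker({1234: "eu"})
outcomes = [tracker.visit(1234, "asia") for _ in range(3)]
```

A threshold avoids moving data for a user who is only briefly away from their usual region.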
  41. Seamless schema upgrades
       • Versioning on bucket definitions
       • GIDs are assigned to a bucket version
       • Data in old bucket versions remains (read only)
       • New data only gets written to the new bucket version
       • Updates migrate data to the new bucket version
       • Migrations can be triggered
  42. Seamless schema upgrades (example)
       Step 0
         Demobucket v1 (GID, name): 1234 Roy; 1235 Moss; 1236 Jen; 1237 Douglas; 1238 Denholm; 1239 Richmond
         Demobucket v2 (GID, name, gender): empty
       Step 1
         Demobucket v2: 1241 Patricia f
       Step 2
         Demobucket v1: 1234 Roy; 1236 Jen; 1237 Douglas; 1238 Denholm; 1239 Richmond
         Demobucket v2: 1241 Patricia f; 1235 Moss m
       Step 3
         Demobucket v1: 1234 Roy; 1237 Douglas; 1238 Denholm; 1239 Richmond
         Demobucket v2: 1241 Patricia f; 1235 Moss m; 1236 Jen f
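
The Demobucket example above can be replayed with a small sketch of lazy, update-driven migration between two bucket versions. The `VersionedBucket` class and its methods are invented for this illustration, and the choice to remove a row from the old version once it has moved follows the example data rather than any documented behaviour.

```python
class VersionedBucket:
    """Two bucket versions: new writes go to v2, updates move rows from v1 to v2."""

    def __init__(self, v1_rows: dict):
        self.old = {gid: dict(row) for gid, row in v1_rows.items()}  # v1, read only for clients
        self.new = {}                                                # v2

    def read(self, gid: int):
        """Prefer the new version; fall back to the old one."""
        if gid in self.new:
            return self.new[gid]
        return self.old.get(gid)

    def insert(self, gid: int, row: dict) -> None:
        """New data is only written to the new bucket version."""
        self.new[gid] = dict(row)

    def update(self, gid: int, **changes) -> None:
        """An update migrates the row to the new version before applying changes."""
        if gid not in self.new:
            self.new[gid] = self.old.pop(gid)  # raises KeyError for an unknown GID
        self.new[gid].update(changes)


bucket = VersionedBucket({1234: {"name": "Roy"}, 1235: {"name": "Moss"}, 1236: {"name": "Jen"}})
bucket.insert(1241, {"name": "Patricia", "gender": "f"})  # step 1
bucket.update(1235, gender="m")                            # step 2
bucket.update(1236, gender="f")                            # step 3
```

Rows that are never updated stay in the old version and remain readable, so no bulk schema change is needed.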
  43. Multi Master writes
       • Every cluster (two masters) will contain two shards
       • Data written interleaved
       • HA for both shards
       • No warmup needed
       • Both masters active and "warmed up"
       • Slaves added (other DC) for HA and backup
       Diagram: SSP; Shard 1, Shard 2
  44. Where do we stand now?
       • SPAPI is in place
       • SSP is (mostly) running in shadow mode
       • GID buckets running in production
       • Activity feed system first to production
       • Satellite DC in early 2013!
  45. (Slide without text)
  46. Questions?
  47. Thank you!
       • Presentation can be found at: http://spil.com/perconalondon2012
       • If you wish to contact me: art@spilgames.com
       • Don't forget to rate my talk!
