Is it Web Scale?
Mikael Berggren
miken@spotify.com
  
Who is this old guy?
•  Web Team Lead @ Spotify
•  Started back in the glory days of 2000
•  Only works for companies beginning with an S (Spray, Stardoll & Spotify)
  
Is /dev/null web scale?
“Mongo DB is Web Scale” – YouTube.com
  
Success in scale fail
•  Choose by buzz / trends
•  Use technology suitable for something completely different
•  Web frameworks
•  Don’t measure or look at graphs
  
“We’re using X and it scales”
•  There is no magic solution that scales out of the box
•  Choose the technology you’re familiar with
•  Middle bird gets the worm
  
What is scaling?
•  It’s about lying :)
•  Take control of your code
•  Push instead of pull
•  Find bottlenecks and fix them
•  Avoid SPoF as much as you can
•  Measure and analyze
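
One possible reading of "push instead of pull" is to regenerate the finished page when the data changes, so that the read path does no rendering at all. The sketch below assumes that reading; the dictionaries stand in for a database and for pre-generated files, and every name in it is hypothetical.

```python
from typing import Dict

profiles: Dict[str, str] = {}  # hypothetical stand-in for the database
pages: Dict[str, str] = {}     # hypothetical stand-in for pre-generated files


def render_profile(user: str) -> str:
    """Build the final HTML for one user (the expensive step)."""
    return f"<h1>{user}</h1><p>{profiles[user]}</p>"


def update_profile(user: str, bio: str) -> None:
    """Write path: store the data, then push a freshly rendered page."""
    profiles[user] = bio
    pages[f"/users/{user}"] = render_profile(user)


def serve(path: str) -> str:
    """Read path: return the pre-generated page without rendering."""
    return pages.get(path, "404 Not Found")
```

Writes become slightly more expensive, but on a site where reads far outnumber writes the rendering cost is paid once per change instead of once per page view.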
  
Cache is king!
•  Cache on multiple instances
•  Cache as close to the final result as possible
•  Memcache is good
•  but flat files are even better
  

	
  
Using memcache
•  Local & Global instances
•  Global: For everything that needs to be distributed and synchronized
•  Local: For everything else
•  Make sure failovers work correctly
•  Use getMulti when possible
•  Have several small instances instead of one large
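
Why batching matters can be shown without a real memcache server. The sketch below uses a hypothetical in-memory `FakeCache` that only counts simulated network round trips; its `get_multi` method mirrors the idea behind getMulti and is not the API of any particular client library.

```python
from typing import Dict, Iterable, List, Optional


class FakeCache:
    """Hypothetical in-memory stand-in for one memcache instance."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.round_trips = 0  # simulated network calls made by reads

    def set(self, key: str, value: str) -> None:
        self.data[key] = value

    def get(self, key: str) -> Optional[str]:
        self.round_trips += 1  # one call per key
        return self.data.get(key)

    def get_multi(self, keys: Iterable[str]) -> Dict[str, str]:
        self.round_trips += 1  # one call, however many keys
        return {k: self.data[k] for k in keys if k in self.data}


def fetch_one_by_one(cache: FakeCache, keys: List[str]) -> Dict[str, str]:
    """Naive version: a separate request for every key."""
    found = {}
    for key in keys:
        value = cache.get(key)
        if value is not None:
            found[key] = value
    return found


def fetch_batched(cache: FakeCache, keys: List[str]) -> Dict[str, str]:
    """Batched version: all keys in a single request."""
    return cache.get_multi(keys)
```

Both functions return the same hits; the batched one does it in a single round trip, which is where the saving comes from when a page needs many keys.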
  
File cache
•  Fast and reliable
•  No 3rd party dependencies
•  Used often => Really fast access
•  Easy to scale
•  Atomic updates
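
A common way to get atomic updates for a file cache is to write the new content to a temporary file and then rename it over the old one. The sketch below assumes that approach; the function names and the `.tmp` suffix are illustrative, and the talk does not say how the updates were actually implemented.

```python
import os
import tempfile


def write_cache_file(path: str, content: str) -> None:
    """Replace the file at path so readers never see a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the target directory so the final rename
    # stays on one filesystem, which is what makes the swap atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(content)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes are on disk
        os.replace(tmp_path, path)  # swap in the new version in one step
    except BaseException:
        # Remove the temp file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def read_cache_file(path: str) -> str:
    """Read the cached content; sees either the old or the new version."""
    with open(path, encoding="utf-8") as f:
        return f.read()
```

A reader that opens the file during an update gets either the complete old version or the complete new one, never a mix of both.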
  
Web frameworks? No thanks!
•  Never built for your specific needs. If they are, you’re damn lucky :)
•  Hard to control data and request handling your way (plugins, modules)
•  A lot of overhead for each request
  
Database scaling
•  Do your homework on indexes and queries
•  Test your knowledge on indexes and queries
•  Use slave nodes for reads
•  Horizontal sharding for segments of users
•  Vertical sharding for user-data
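
As a rough illustration of horizontal sharding combined with read nodes, the sketch below picks a shard from the user ID and then routes writes to that shard's primary and reads to its replicas (the "slave nodes" above) in round-robin order. The host names, the `Shard` class and the modulo scheme are all assumptions made for this example.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import List


@dataclass
class Shard:
    primary: str         # hypothetical host that takes all writes
    replicas: List[str]  # hypothetical hosts that serve reads

    def __post_init__(self) -> None:
        # Fall back to the primary if the shard has no replicas.
        self._next_replica = cycle(self.replicas or [self.primary])

    def host_for(self, is_write: bool) -> str:
        """Writes go to the primary, reads rotate over the replicas."""
        return self.primary if is_write else next(self._next_replica)


def shard_for_user(user_id: int, shards: List[Shard]) -> Shard:
    """Horizontal sharding: each user's rows live on exactly one shard."""
    return shards[user_id % len(shards)]
```

Plain modulo is the simplest possible mapping; adding a shard later moves most users, so a lookup table or consistent hashing is often used instead when the shard count is expected to change.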
  
RDB vs. NoSQL
•  “No” in NoSQL stands for “Not Only”
•  Use the correct storage for your purpose
•  Don’t do as Digg.com…
  
Scaling at Stardoll
•  >3500 dynamic PV/s
•  Horizontal sharding of user-data
•  Pre-generation of content
•  Measuring render-time for each page
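
Measuring render time per page can be as simple as wrapping each page's render function with a timer. The decorator below is a minimal sketch of that idea; `timed_page`, `render_times` and `render_profile` are hypothetical names, and a real setup would ship the numbers to a metrics or graphing system instead of keeping them in memory.

```python
import time
from collections import defaultdict
from functools import wraps
from typing import Callable, DefaultDict, List

# Page name -> list of render durations in seconds.
render_times: DefaultDict[str, List[float]] = defaultdict(list)


def timed_page(name: str) -> Callable:
    """Decorator that records how long each call to a render function takes."""
    def decorator(render: Callable) -> Callable:
        @wraps(render)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return render(*args, **kwargs)
            finally:
                # Record the duration even if rendering raised an error.
                render_times[name].append(time.perf_counter() - start)
        return wrapper
    return decorator


@timed_page("profile")
def render_profile(user: str) -> str:
    return f"<h1>{user}</h1>"
```

With one entry per page type, slow pages stand out as soon as the collected durations are plotted.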
  
Scaling at Spotify
•  We said “Bye Wordpress!” and the servers were happy again
•  95% of all content: flat files
•  Dependencies on shared services
•  Handle huge spikes in traffic
  
[Chart: PV/s]
  
Recommendations
•  Have a code standard everyone must follow!
•  Great error handling and logging
•  Possibility to disable features/functionality
•  Try to do requests asynchronously
•  Avoid race conditions
•  Monitor, measure and analyze !important
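
The ability to disable features can be sketched as a small set of switches that page code checks before doing optional work. Everything below is hypothetical (the class, the `recommendations` switch name and the sidebar function); in practice the switches would live in configuration that can be changed without a deploy.

```python
from typing import Callable, Dict


class FeatureSwitches:
    """Hypothetical on/off switches for optional site features."""

    def __init__(self, defaults: Dict[str, bool]) -> None:
        self._flags = dict(defaults)

    def enable(self, name: str) -> None:
        self._flags[name] = True

    def disable(self, name: str) -> None:
        self._flags[name] = False

    def is_enabled(self, name: str) -> bool:
        # Unknown features count as disabled, which is the safe default.
        return self._flags.get(name, False)


def render_sidebar(switches: FeatureSwitches,
                   load_recommendations: Callable[[], str]) -> str:
    """Skip the expensive call entirely when the feature is switched off."""
    if not switches.is_enabled("recommendations"):
        return "<aside></aside>"
    return f"<aside>{load_recommendations()}</aside>"
```

During a traffic spike or a backend outage, turning off one switch keeps the rest of the page up while the heavy or broken part is skipped.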
  
Questions?
And btw, we’re hiring
spotify.com/jobs
  

Optimera STHLM 2011 - Mikael Berggren, Spotify
