Erlang and the Cloud: A Fractal Approach to Throughput


  1. Erlang as a cloud citizen. Paolo Negri @hungryblank
  2. 1 day at • 10 million players • 2 billion game server requests (HTTP) • 20 devops people
  3. Cloud: “A cloud is made of billows upon billows upon billows that look like clouds. As you come closer to a cloud you don’t get something smooth, but irregularities at a smaller scale.” Benoît B. Mandelbrot http://www.flickr.com/photos/nirak/644336486
  4. AWS Cloud
  5. This talk will answer • Why build a system targeting the cloud? • How many EC2 instances do you need to respond to 0.25 billion uncacheable game reqs/day?
  6. 15 months ago... http://www.flickr.com/photos/wheatfields/515068701
  7. 1st cloud hosted project, lessons learned 1: Pushing live the 60th application server? Not different from adding the 6th! Push a button
  8. 1st cloud hosted project, lessons learned 2: Local network/local disk are low performance general purpose tools (nothing to do with ad hoc data center solutions)
  9. 1st cloud hosted project, lessons learned 3: Complete automation is cool. Ease of adding hosts/automation can lead to bloated infrastructure
  10. 1st cloud hosted project, points of pain • A lot of inefficient app servers (as per tweets) • Much effort to scale up/maintain databases (MySQL & Redis) • Expensive, not crazy expensive, but expensive
  11. Why try again? http://www.flickr.com/photos/kky/704056791/
  12. Uncertainty • Will we reach 100K or 3 million users? • 3 million users in 2 weeks or 12 months? • Cheat tool released 1h ago => single game call up 5000% • Weekly releases, new feature performance impact?
  13. The cloud • standard units (instances) of computing capacity • a network connecting all instances • an API to provision/dismiss instances
  14. The cloud: Sounds like a good framework to compose computing capacity. Why didn’t it work as a framework to compose throughput?
  15. Scaling in the cloud, the recipe: CLOUD: composable units of computing capacity + DEVELOPER: turn a unit of computing capacity into a unit of throughput = composable throughput, a plan for scaling BIG
  16. Turn a unit of computing capacity into a unit of throughput http://www.flickr.com/photos/pasukaru76
  17. Unit of throughput, where? (diagram: App server, Database)
  18. Unit of throughput, joke? (diagram: App server, App server, App server, App server, Cache, Database, Database)
  19. Unit of throughput? (diagram: App server, Database) No unit!
  20. Unit of throughput? (diagram: App server, Database) Tightly coupled throughput?
  21. Unit of throughput? (diagram: App server, Database) Monolithic throughput?
  22. Monolithic throughput
  23. Monolithic throughput • likes monolithic infrastructure! • scales well vertically • wants screaming fast stack (network, disks...) • any performance glitch impacts the whole system
  24. Tightly coupled throughput + loosely coupled hardware (like cloud) = frustration
  25. Who leads the tightly coupled dance? (diagram: App server, Database)
  26. Who leads the tightly coupled dance? (diagram: App server, Database) The stateless application server!
  27. Stateless application servers guarantee one thing... which?
  28. Data is never where you need it
  29. And another one...
  30. If you can feed them data fast enough... they’ll choke on garbage collection
  31. We measure memcache HIT / MISS. Why do app servers need to be 100% MISS?
  32. Where’s the best knowledge about hot/cold data?
  33. Even the reverse makes more sense (diagram: Database, App server): 1. pick your data up 2. go in the stateless app server
  34. What Went Wrong?
  35. He can tell you! • Rich Hickey • Clojure author “...If not in Erlang which I think has a complete story for how they do state” [1] [1] Value Identity State @0.27 http://goo.gl/Zdjv0 http://www.flickr.com/photos/ghoseb/5120173586
  36. Most languages and runtimes don’t have a safe solution for concurrent, long lived state. Erlang stands out as an exception in this panorama
  37. Erlang... “Processes are the primary means to structure an Erlang application.” (Wikipedia)
  38. Erlang + OTP Generic Server Behaviour: “A generic server process (gen_server) implemented using this module...” (OTP documentation)
  39. Erlang + OTP Generic Server Behaviour

        handle_call(_Request, _From, State) ->
            {reply, ignored, State}.

        handle_cast(_Msg, State) ->
            {noreply, State}.

        handle_info(_Info, State) ->
            {noreply, State}.

        terminate(_Reason, _State) ->
            ok.

        code_change(_OldVsn, State, _Extra) ->
            {ok, State}.
  40. gen_server • An Erlang process • With LOCAL state • Responding to requests from clients. A unit of throughput!
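To make slide 40 concrete, here is a minimal sketch of a gen_server that keeps its state local to the process and answers client requests. The module name `player_session`, its API functions and the score-keeping state are hypothetical illustrations, not code from the talk; the map syntax also postdates the talk.

```erlang
-module(player_session).
-behaviour(gen_server).

%% Client API (hypothetical names, for illustration only)
-export([start_link/1, add_points/2, score/1, stop/1]).
%% gen_server callbacks
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

start_link(PlayerId) ->
    gen_server:start_link(?MODULE, PlayerId, []).

add_points(Pid, N) when is_integer(N) ->
    gen_server:call(Pid, {add_points, N}).

score(Pid) ->
    gen_server:call(Pid, score).

stop(Pid) ->
    gen_server:stop(Pid).

%% The state lives inside this one process; nothing else can touch it.
init(PlayerId) ->
    {ok, #{player_id => PlayerId, score => 0}}.

handle_call({add_points, N}, _From, State = #{score := Score}) ->
    NewScore = Score + N,
    {reply, NewScore, State#{score := NewScore}};
handle_call(score, _From, State = #{score := Score}) ->
    {reply, Score, State};
handle_call(_Request, _From, State) ->
    {reply, ignored, State}.

handle_cast(_Msg, State) -> {noreply, State}.
handle_info(_Info, State) -> {noreply, State}.
terminate(_Reason, _State) -> ok.
code_change(_OldVsn, State, _Extra) -> {ok, State}.
```

Each request is served from memory already held by the process, so no round trip to a database is needed on the request path.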
  41. Composing throughput with Erlang: gen_server + EC2 instance
  42. Composing throughput with Erlang (EC2 instance): 1 EC2 instance + 1 Erlang VM = N kilo gen_servers (N kilo units of throughput)
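To give a feel for "N kilo" units on a single VM, the sketch below starts thousands of processes, each holding its own counter. It uses bare processes rather than gen_servers to stay short; the module name `many_units` and the message shapes are assumptions made for this example.

```erlang
-module(many_units).
-export([start/1, total/1, stop/1]).

%% Start N processes, each holding its own integer counter.
start(N) ->
    [spawn(fun() -> loop(0) end) || _ <- lists:seq(1, N)].

%% Every process owns its counter and replies to one request at a time.
loop(Count) ->
    receive
        {hit, From} ->
            From ! {count, self(), Count + 1},
            loop(Count + 1);
        stop ->
            ok
    end.

%% Send one request to every process and sum the replies.
total(Pids) ->
    [Pid ! {hit, self()} || Pid <- Pids],
    lists:sum([receive {count, _Pid, C} -> C end || _ <- Pids]).

stop(Pids) ->
    [Pid ! stop || Pid <- Pids],
    ok.
```

Calling `many_units:start(10000)` stays well inside the default process limit of a stock Erlang VM.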
  47. Scale by adding units (instances)
  48. Scale up by adding units (instances)
  49. Erlang distribution
  50. Throughput complexity: • Loosely coupled peers • Independent throughput VS. • Tightly coupled roles • Dependent throughput
  51. Where does the state come from? (diagram: Start, Database, gen_server, Stop)
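One way to read slide 51 is that the process loads its state from the database when it starts and writes it back when it stops. The sketch below assumes that reading. The module name `stateful_session` and the `Load`/`Save` funs are hypothetical stand-ins for a real storage client, which the talk does not show.

```erlang
-module(stateful_session).
-behaviour(gen_server).

-export([start_link/3, update/2, stop/1]).
-export([init/1, handle_call/3, handle_cast/2, terminate/2]).

%% Load :: fun((Key) -> Data), Save :: fun((Key, Data) -> any()).
%% Both are supplied by the caller; they stand in for database access.
start_link(Key, Load, Save) ->
    gen_server:start_link(?MODULE, {Key, Load, Save}, []).

update(Pid, Fun) ->
    gen_server:call(Pid, {update, Fun}).

stop(Pid) ->
    gen_server:stop(Pid).

init({Key, Load, Save}) ->
    %% Trap exits so terminate/2 also runs on a supervisor shutdown.
    process_flag(trap_exit, true),
    %% Start: the only read from the database.
    {ok, {Key, Load(Key), Save}}.

%% While running, requests only touch the in-process copy.
handle_call({update, Fun}, _From, {Key, Data, Save}) ->
    NewData = Fun(Data),
    {reply, NewData, {Key, NewData, Save}}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% Stop: write the state back.
terminate(_Reason, {Key, Data, Save}) ->
    Save(Key, Data),
    ok.
```

A production system would probably also save periodically, since `terminate/2` does not run if the process is killed or the instance disappears.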
  52. Now with database: AWS S3
  53. Database scalability: DB is (almost) never on the latency critical path. No need for a low latency DB (AWS S3)
  54. Database scalability: Throughput required is low. We can approximate S3 capacity as infinite (AWS S3)
  55. Database scalability: Ubiquitous and uniform from the application servers’ point of view (AWS S3)
  56. Remember? AWS S3
  57. How it actually works: Data from S3 is uniformly available to any EC2 instance
  58. And as you zoom in...
  59. And zoom in...
  60. And zoom in you see... EC2 instance
  61. Always the same kind of structure. A fractal approach to throughput
  62. Fractal: “A cauliflower shows how an object can be made of many parts, each of which is like a whole, but smaller.” Benoît B. Mandelbrot http://www.flickr.com/photos/paulobrabo/3588387063
  63. Homework: The exact same solution might not work for you... but look for that unit of throughput
  64. Answer time: You need XXXX smallish instances to serve 0.25 billion uncacheable reqs/day
  65. Answer time: You need 0XXX smallish instances to serve 0.25 billion uncacheable reqs/day
  66. Answer time: You need 00XX smallish instances to serve 0.25 billion uncacheable reqs/day
  67. Answer time: You need 0012 smallish instances to serve 0.25 billion uncacheable reqs/day
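A rough back-of-the-envelope reading of that answer, assuming the revealed digits `0012` mean 12 instances and that load is spread evenly over the day (real traffic peaks would be higher than this average):

```latex
\[
\frac{0.25 \times 10^{9}\ \text{reqs/day}}{86\,400\ \text{s/day}} \approx 2\,894\ \text{reqs/s},
\qquad
\frac{2\,894\ \text{reqs/s}}{12\ \text{instances}} \approx 241\ \text{reqs/s per instance}
\]
```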
  68. What’s 1200?
  69. Thanks. Paolo Negri @hungryblank http://www.wooga.com/jobs
