SlideShare a Scribd company logo
1 of 69
Download to read offline
Erlang as a cloud citizen
    Paolo Negri @hungryblank
1 day at


• 10 millions players
• 2 billions game server requests (http)
• 20 devops people
Cloud
“A cloud is made of billows upon
billows upon billows that look like
clouds.
As you come closer to a cloud you
don't get something smooth, but
irregularities at a smaller scale.”
                   Benoît B. Mandelbrot

                      http://www.flickr.com/photos/nirak/644336486
AWS Cloud
This talk will answer

• Why building a system targeting the
  cloud?
• How many EC2 instances do you
  need to respond to 0.25 billion
  uncacheable game reqs/day?
15 months ago...




           http://www.flickr.com/photos/wheatfields/515068701
1st cloud hosted project,
     lessons learned



1
      Pushing live the 60th application
      server?
      not different from adding the 6th!
      push a button
1st cloud hosted project,
     lessons learned



2
      local network/local disk are low
      performance general purpose
      tools
      (nothing to do with ad hoc data
      center solutions)
1st cloud hosted project,
     lessons learned



3
      Complete automation is cool
      Ease of adding hosts/automation
      can lead to bloated infrastructure
1st cloud hosted project,
      points of pain
 • A lot of inefficient app servers (as per
   tweets)
 • Much effort to scale up/maintain
   databases (mySQL & Redis)
 • Expensive, not crazy expensive, but
   expensive
Why trying again?




          http://www.flickr.com/photos/kky/704056791/
Uncertainity
• will we reach100K or 3 millions users?
• 3 million users in 2 weeks or 12 months?
• cheat tool released 1h ago => single game
  call up 5000%
• weekly releases, new feature performance
  impact?
the cloud

• standard units (instances) of
  computing capacity
• a network connecting all instances
• an API to provision/dismiss instances
the cloud

Sounds like a good framework to
compose computing capacity


Why didn’t work as a framework
to compose throughput?
Scaling in the cloud
            the recipe
CLOUD: composable units of computing capacity

                      +
DEVELOPER: turn a unit of computing capacity in
            a unit of throughput
                      =
composable throughput, a plan for scaling BIG
turn a unit of
                                         computing capacity
                                            in a unit of
http://www.flickr.com/photos/pasukaru76
                                            throughput
Unit of throughput,
      where?

      App server


       Database
Unit of throughput,
             joke?
App server    App server    App server     App server


                        Cache


             Database           Database
Unit of throughput?


      App server


       Database



     No unit!
Unit of throughput?


          App server


           Database



Tightly coupled throughput?
Unit of throughput?


        App server


        Database



Monolithic throughput?
Monolithic throughput
Monolithic throughput
• likes monolithic infrastructure!
• scales well vertically
• wants screaming fast stack (network,
  disks...)
• any performance glitch impacts the
  whole system
Tightly coupled throughput
               +
loosely coupled hardware
         (like cloud)
               =
         frustration
Who leads the tightly
 coupled dance?

       App server


        Database
Who leads the tightly
 coupled dance?

       App server


        Database

   The stateless
 application server!
Stateless application
 servers guarantee
     one thing...

      which?
Data is never
where you need it
And another one...
If you can feed them
  data fast enough...

  they’ll choke on
 garbage collection
We measure memcache
    HIT / MISS

why app servers need
 to be 100% MISS?
Where’s the best
knowledge about hot/
     cold data?
Even the reverse makes
      more sense

  Database    1. pick your data up
              2. go in the stateless
 App server
                 app server
What
Went
Wrong?
He can tell you!
        • Rich Hickey
        • Clojure author
         “...If not in Erlang which I
         think has a complete story
         for how they do state”[1]
         [1] Value Identity State @0.27

         http://goo.gl/Zdjv0


                     http://www.flickr.com/photos/ghoseb/5120173586
Most languages and runtimes
don’t have a safe solution for
 concurrent, long lived state

  Erlang stands out as an
exception in this panorama
Erlang...

 Processes are the primary
means to structure an Erlang
        application.

                     wikipedia
Erlang + OTP
 Generic Server Behaviour

 A generic server process
(gen_server) implemented
    using this module...

         otp documentation
Erlang + OTP
 Generic Server Behaviour
handle_call(_Request, _From, State) ->
    {reply, ignored, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info(_Info, State) ->
    {noreply, State}.

terminate(_Reason, _State) ->
    ok.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.
gen_server

  • An erlang process
  • With LOCAL state
  • responding to requests from clients

A unit of throughput!
Composing throughput
    with erlang


gen_server
             +   EC2 instance
Composing throughput
      with erlang

                   EC2 instance




         1 EC2 instance + 1 erlang VM
                      =
N kilo gen_servers (N kilo units of throughput)
Composing throughput
      with erlang

                   EC2 instance




         1 EC2 instance + 1 erlang VM
                      =
N kilo gen_servers (N kilo units of throughput)
Composing throughput
      with erlang

                   EC2 instance




         1 EC2 instance + 1 erlang VM
                      =
N kilo gen_servers (N kilo units of throughput)
Composing throughput
      with erlang

                   EC2 instance




         1 EC2 instance + 1 erlang VM
                      =
N kilo gen_servers (N kilo units of throughput)
Composing throughput
      with erlang

                   EC2 instance




         1 EC2 instance + 1 erlang VM
                      =
N kilo gen_servers (N kilo units of throughput)
Scale by adding units
      instances
Scale up adding units
      instances
Erlang distribution
Throughput complexity


                       VS.



• Losely coupled peers       • Tightly coupled roles
• Independent throughput     • Dependent throughput
Where does the state
     come from?
             Start



                     Database
gen_server

             Stop
Now with database


                AWS
                 S3
Database scalability
DB is (almost) never on
latency critical path


                             AWS
                              S3


No need for low latency DB
Database scalability

Throughput required is low


                             AWS
                              S3


 We can approximate S3
  capacity as infinite
Database scalability


Ubiquitous and uniform from
application servers point of
view.                          AWS
                                S3
Remember?


            AWS
             S3
How it actually works
                 Data from
                 the S3 is
                 uniformly
                 available to
                 any ec2
                 instance
And as you zoom in...
And zoom in...
And zoom in you see...


    EC2 instance
Always the same
  kind of structure

A fractal approach to
     throughput
Fractal

“A cauliflower shows how an object
can be made of many parts, each of
which is like a whole, but smaller.”
                  Benoît B. Mandelbrot



                 http://www.flickr.com/photos/paulobrabo/3588387063
Homework

The exact same solution might
     not work for you...

   but look for that unit of
         throughput
Answer time
           You need
            XXXX
       Smallish instances
         to serve 0.25
billions uncacheable reqs/day
Answer time
           You need
             0XXX
       Smallish instances
         to serve 0.25
billions uncacheable reqs/day
Answer time
           You need
             00XX
       Smallish instances
         to serve 0.25
billions uncacheable reqs/day
Answer time
           You need
              0012
       Smallish instances
         to serve 0.25
billions uncacheable reqs/day
What’s
1200 ?
Thanks
Paolo Negri @hungryblank
http://www.wooga.com/jobs

More Related Content

What's hot

What's hot (17)

Analysis big data by use php with storm
Analysis big data by use php with stormAnalysis big data by use php with storm
Analysis big data by use php with storm
 
Learning Stream Processing with Apache Storm
Learning Stream Processing with Apache StormLearning Stream Processing with Apache Storm
Learning Stream Processing with Apache Storm
 
Introduction to Storm
Introduction to StormIntroduction to Storm
Introduction to Storm
 
Storm
StormStorm
Storm
 
Exactly-once Semantics in Apache Kafka
Exactly-once Semantics in Apache KafkaExactly-once Semantics in Apache Kafka
Exactly-once Semantics in Apache Kafka
 
Benchmarking at Parse
Benchmarking at ParseBenchmarking at Parse
Benchmarking at Parse
 
Understanding Akka Streams, Back Pressure, and Asynchronous Architectures
Understanding Akka Streams, Back Pressure, and Asynchronous ArchitecturesUnderstanding Akka Streams, Back Pressure, and Asynchronous Architectures
Understanding Akka Streams, Back Pressure, and Asynchronous Architectures
 
Akka-chan's Survival Guide for the Streaming World
Akka-chan's Survival Guide for the Streaming WorldAkka-chan's Survival Guide for the Streaming World
Akka-chan's Survival Guide for the Streaming World
 
Apache Storm
Apache StormApache Storm
Apache Storm
 
Kubernetes at NU.nl (Kubernetes meetup 2019-09-05)
Kubernetes at NU.nl   (Kubernetes meetup 2019-09-05)Kubernetes at NU.nl   (Kubernetes meetup 2019-09-05)
Kubernetes at NU.nl (Kubernetes meetup 2019-09-05)
 
Introduction to Apache Storm - Concept & Example
Introduction to Apache Storm - Concept & ExampleIntroduction to Apache Storm - Concept & Example
Introduction to Apache Storm - Concept & Example
 
Webinar: Queues with RabbitMQ - Lorna Mitchell
Webinar: Queues with RabbitMQ - Lorna MitchellWebinar: Queues with RabbitMQ - Lorna Mitchell
Webinar: Queues with RabbitMQ - Lorna Mitchell
 
(CMP311) This One Weird API Request Will Save You Thousands
(CMP311) This One Weird API Request Will Save You Thousands(CMP311) This One Weird API Request Will Save You Thousands
(CMP311) This One Weird API Request Will Save You Thousands
 
(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...
(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...
(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...
 
The inherent complexity of stream processing
The inherent complexity of stream processingThe inherent complexity of stream processing
The inherent complexity of stream processing
 
Apache Storm
Apache StormApache Storm
Apache Storm
 
RMG202 Rainmakers: How Netflix Operates Clouds for Maximum Freedom and Agilit...
RMG202 Rainmakers: How Netflix Operates Clouds for Maximum Freedom and Agilit...RMG202 Rainmakers: How Netflix Operates Clouds for Maximum Freedom and Agilit...
RMG202 Rainmakers: How Netflix Operates Clouds for Maximum Freedom and Agilit...
 

Viewers also liked

Hyperdex - A closer look
Hyperdex - A closer lookHyperdex - A closer look
Hyperdex - A closer look
DECK36
 
Blazes: coordination analysis for distributed programs
Blazes: coordination analysis for distributed programsBlazes: coordination analysis for distributed programs
Blazes: coordination analysis for distributed programs
palvaro
 

Viewers also liked (20)

Mongrel2, a short introduction
Mongrel2, a short introductionMongrel2, a short introduction
Mongrel2, a short introduction
 
SimpleDb, an introduction
SimpleDb, an introductionSimpleDb, an introduction
SimpleDb, an introduction
 
Electron - Solving our cross platform dreams?
Electron - Solving our cross platform dreams?Electron - Solving our cross platform dreams?
Electron - Solving our cross platform dreams?
 
Content Management Systems and Refactoring - Drupal, WordPress and eZ Publish
Content Management Systems and Refactoring - Drupal, WordPress and eZ PublishContent Management Systems and Refactoring - Drupal, WordPress and eZ Publish
Content Management Systems and Refactoring - Drupal, WordPress and eZ Publish
 
Scaling Social Games
Scaling Social GamesScaling Social Games
Scaling Social Games
 
Why you should come to DrupalSouth
Why you should come to DrupalSouthWhy you should come to DrupalSouth
Why you should come to DrupalSouth
 
Entrez dans le mouvement Maker à l’aide des technologies Microsoft
Entrez dans le mouvement Maker à l’aide des technologies MicrosoftEntrez dans le mouvement Maker à l’aide des technologies Microsoft
Entrez dans le mouvement Maker à l’aide des technologies Microsoft
 
A Documentation Crash Course, LinuxCon 2016
A Documentation Crash Course, LinuxCon 2016A Documentation Crash Course, LinuxCon 2016
A Documentation Crash Course, LinuxCon 2016
 
Offre développeur Javascript Back-end
Offre développeur Javascript Back-endOffre développeur Javascript Back-end
Offre développeur Javascript Back-end
 
Erlang introduction geek2geek Berlin
Erlang introduction geek2geek BerlinErlang introduction geek2geek Berlin
Erlang introduction geek2geek Berlin
 
Contentful Berlin Offices
Contentful Berlin OfficesContentful Berlin Offices
Contentful Berlin Offices
 
Automate your docs, automate yourself
Automate your docs, automate yourselfAutomate your docs, automate yourself
Automate your docs, automate yourself
 
The Anatomy of Content Management (workshop by J Gollner at Intelligent Conte...
The Anatomy of Content Management (workshop by J Gollner at Intelligent Conte...The Anatomy of Content Management (workshop by J Gollner at Intelligent Conte...
The Anatomy of Content Management (workshop by J Gollner at Intelligent Conte...
 
Hyperdex - A closer look
Hyperdex - A closer lookHyperdex - A closer look
Hyperdex - A closer look
 
Chloe and the Realtime Web
Chloe and the Realtime WebChloe and the Realtime Web
Chloe and the Realtime Web
 
Brunch With Coffee
Brunch With CoffeeBrunch With Coffee
Brunch With Coffee
 
Riak Search - Erlang Factory London 2010
Riak Search - Erlang Factory London 2010Riak Search - Erlang Factory London 2010
Riak Search - Erlang Factory London 2010
 
Blazes: coordination analysis for distributed programs
Blazes: coordination analysis for distributed programsBlazes: coordination analysis for distributed programs
Blazes: coordination analysis for distributed programs
 
LXC, Docker, and the future of software delivery | LinuxCon 2013
LXC, Docker, and the future of software delivery | LinuxCon 2013LXC, Docker, and the future of software delivery | LinuxCon 2013
LXC, Docker, and the future of software delivery | LinuxCon 2013
 
AWS Lambda in infrastructure
AWS Lambda in infrastructureAWS Lambda in infrastructure
AWS Lambda in infrastructure
 

Similar to Erlang as a cloud citizen, a fractal approach to throughput

Using Kubernetes to deliver a “serverless” service
Using Kubernetes to deliver a “serverless” serviceUsing Kubernetes to deliver a “serverless” service
Using Kubernetes to deliver a “serverless” service
DoKC
 

Similar to Erlang as a cloud citizen, a fractal approach to throughput (20)

Using Kubernetes to deliver a “serverless” service
Using Kubernetes to deliver a “serverless” serviceUsing Kubernetes to deliver a “serverless” service
Using Kubernetes to deliver a “serverless” service
 
Why Scale Matters and How the Cloud is Really Different (at scale)
Why Scale Matters and How the Cloud is Really Different (at scale)Why Scale Matters and How the Cloud is Really Different (at scale)
Why Scale Matters and How the Cloud is Really Different (at scale)
 
Using Kubernetes to deliver a “serverless” service
Using Kubernetes to deliver a “serverless” serviceUsing Kubernetes to deliver a “serverless” service
Using Kubernetes to deliver a “serverless” service
 
Scaling a MeteorJS SaaS app on AWS
Scaling a MeteorJS SaaS app on AWSScaling a MeteorJS SaaS app on AWS
Scaling a MeteorJS SaaS app on AWS
 
Scaling on AWS for the First 10 Million Users
Scaling on AWS for the First 10 Million UsersScaling on AWS for the First 10 Million Users
Scaling on AWS for the First 10 Million Users
 
Scaling on AWS to the First 10 Million Users
Scaling on AWS to the First 10 Million Users Scaling on AWS to the First 10 Million Users
Scaling on AWS to the First 10 Million Users
 
Architecting Scalable Applications in the Cloud
Architecting Scalable Applications in the CloudArchitecting Scalable Applications in the Cloud
Architecting Scalable Applications in the Cloud
 
Scaling on AWS for the First 10 Million Users
Scaling on AWS for the First 10 Million UsersScaling on AWS for the First 10 Million Users
Scaling on AWS for the First 10 Million Users
 
The impact of cloud NSBCon NY by Yves Goeleven
The impact of cloud NSBCon NY by Yves GoelevenThe impact of cloud NSBCon NY by Yves Goeleven
The impact of cloud NSBCon NY by Yves Goeleven
 
AWS Summit London 2014 | Scaling on AWS for the First 10 Million Users (200)
AWS Summit London 2014 | Scaling on AWS for the First 10 Million Users (200)AWS Summit London 2014 | Scaling on AWS for the First 10 Million Users (200)
AWS Summit London 2014 | Scaling on AWS for the First 10 Million Users (200)
 
Anton Boyko "The future of serverless computing"
Anton Boyko "The future of serverless computing"Anton Boyko "The future of serverless computing"
Anton Boyko "The future of serverless computing"
 
Deep Dive: Scaling Up to Your First 10 Million Users
Deep Dive: Scaling Up to Your First 10 Million UsersDeep Dive: Scaling Up to Your First 10 Million Users
Deep Dive: Scaling Up to Your First 10 Million Users
 
Aws webcast - Scaling on AWS 13 08-20
Aws webcast - Scaling on AWS 13 08-20Aws webcast - Scaling on AWS 13 08-20
Aws webcast - Scaling on AWS 13 08-20
 
Immutable Infrastructure: the new App Deployment
Immutable Infrastructure: the new App DeploymentImmutable Infrastructure: the new App Deployment
Immutable Infrastructure: the new App Deployment
 
Architecture Best Practices on Windows Azure
Architecture Best Practices on Windows AzureArchitecture Best Practices on Windows Azure
Architecture Best Practices on Windows Azure
 
(ARC301) Scaling Up to Your First 10 Million Users
(ARC301) Scaling Up to Your First 10 Million Users(ARC301) Scaling Up to Your First 10 Million Users
(ARC301) Scaling Up to Your First 10 Million Users
 
Matt Franklin - Apache Software (Geekfest)
Matt Franklin - Apache Software (Geekfest)Matt Franklin - Apache Software (Geekfest)
Matt Franklin - Apache Software (Geekfest)
 
Microservices: moving parts around
Microservices: moving parts aroundMicroservices: moving parts around
Microservices: moving parts around
 
Anton Boyko, "The evolution of microservices platform or marketing gibberish"
Anton Boyko, "The evolution of microservices platform or marketing gibberish"Anton Boyko, "The evolution of microservices platform or marketing gibberish"
Anton Boyko, "The evolution of microservices platform or marketing gibberish"
 
Satrtup Bootcamp - Scale on AWS
Satrtup Bootcamp - Scale on AWSSatrtup Bootcamp - Scale on AWS
Satrtup Bootcamp - Scale on AWS
 

Recently uploaded

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Recently uploaded (20)

Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 

Erlang as a cloud citizen, a fractal approach to throughput

  • 1. Erlang as a cloud citizen Paolo Negri @hungryblank
  • 2. 1 day at • 10 millions players • 2 billions game server requests (http) • 20 devops people
  • 3. Cloud “A cloud is made of billows upon billows upon billows that look like clouds. As you come closer to a cloud you don't get something smooth, but irregularities at a smaller scale.” Benoît B. Mandelbrot http://www.flickr.com/photos/nirak/644336486
  • 5. This talk will answer • Why building a system targeting the cloud? • How many EC2 instances do you need to respond to 0.25 billion uncacheable game reqs/day?
  • 6. 15 months ago... http://www.flickr.com/photos/wheatfields/515068701
  • 7. 1st cloud hosted project, lessons learned 1 Pushing live the 60th application server? not different from adding the 6th! push a button
  • 8. 1st cloud hosted project, lessons learned 2 local network/local disk are low performance general purpose tools (nothing to do with ad hoc data center solutions)
  • 9. 1st cloud hosted project, lessons learned 3 Complete automation is cool Ease of adding hosts/automation can lead to bloated infrastructure
  • 10. 1st cloud hosted project, points of pain • A lot of inefficient app servers (as per tweets) • Much effort to scale up/maintain databases (mySQL & Redis) • Expensive, not crazy expensive, but expensive
  • 11. Why trying again? http://www.flickr.com/photos/kky/704056791/
  • 12. Uncertainity • will we reach100K or 3 millions users? • 3 million users in 2 weeks or 12 months? • cheat tool released 1h ago => single game call up 5000% • weekly releases, new feature performance impact?
  • 13. the cloud • standard units (instances) of computing capacity • a network connecting all instances • an API to provision/dismiss instances
  • 14. the cloud Sounds like a good framework to compose computing capacity Why didn’t work as a framework to compose throughput?
  • 15. Scaling in the cloud the recipe CLOUD: composable units of computing capacity + DEVELOPER: turn a unit of computing capacity in a unit of throughput = composable throughput, a plan for scaling BIG
  • 16. turn a unit of computing capacity in a unit of http://www.flickr.com/photos/pasukaru76 throughput
  • 17. Unit of throughput, where? App server Database
  • 18. Unit of throughput, joke? App server App server App server App server Cache Database Database
  • 19. Unit of throughput? App server Database No unit!
  • 20. Unit of throughput? App server Database Tightly coupled throughput?
  • 21. Unit of throughput? App server Database Monolithic throughput?
  • 23. Monolithic throughput • likes monolithic infrastructure! • scales well vertically • wants screaming fast stack (network, disks...) • any performance glitch impacts the whole system
  • 24. Tightly coupled throughput + loosely coupled hardware (like cloud) = frustration
  • 25. Who leads the tightly coupled dance? App server Database
  • 26. Who leads the tightly coupled dance? App server Database The stateless application server!
  • 27. Stateless application servers guarantee one thing... which?
  • 28. Data is never where you need it
  • 30. If you can feed them data fast enough... they’ll choke on garbage collection
  • 31. We measure memcache HIT / MISS why app servers need to be 100% MISS?
  • 32. Where’s the best knowledge about hot/ cold data?
  • 33. Even the reverse makes more sense Database 1. pick your data up 2. go in the stateless App server app server
  • 35. He can tell you! • Rich Hickey • Clojure author “...If not in Erlang which I think has a complete story for how they do state”[1] [1] Value Identity State @0.27 http://goo.gl/Zdjv0 http://www.flickr.com/photos/ghoseb/5120173586
  • 36. Most languages and runtimes don’t have a safe solution for concurrent, long lived state Erlang stands out as an exception in this panorama
  • 37. Erlang... Processes are the primary means to structure an Erlang application. wikipedia
  • 38. Erlang + OTP Generic Server Behaviour A generic server process (gen_server) implemented using this module... otp documentation
  • 39. Erlang + OTP Generic Server Behaviour handle_call(_Request, _From, State) ->     {reply, ignored, State}. handle_cast(_Msg, State) ->     {noreply, State}. handle_info(_Info, State) ->     {noreply, State}. terminate(_Reason, _State) ->     ok. code_change(_OldVsn, State, _Extra) ->     {ok, State}.
  • 40. gen_server • An erlang process • With LOCAL state • responding to requests from clients A unit of throughput!
  • 41. Composing throughput with erlang gen_server + EC2 instance
  • 42. Composing throughput with erlang EC2 instance 1 EC2 instance + 1 erlang VM = N kilo gen_servers (N kilo units of throughput)
  • 43. Composing throughput with erlang EC2 instance 1 EC2 instance + 1 erlang VM = N kilo gen_servers (N kilo units of throughput)
  • 44. Composing throughput with erlang EC2 instance 1 EC2 instance + 1 erlang VM = N kilo gen_servers (N kilo units of throughput)
  • 45. Composing throughput with erlang EC2 instance 1 EC2 instance + 1 erlang VM = N kilo gen_servers (N kilo units of throughput)
  • 46. Composing throughput with erlang EC2 instance 1 EC2 instance + 1 erlang VM = N kilo gen_servers (N kilo units of throughput)
  • 47. Scale by adding units instances
  • 48. Scale up adding units instances
  • 50. Throughput complexity VS. • Losely coupled peers • Tightly coupled roles • Independent throughput • Dependent throughput
  • 51. Where does the state come from? Start Database gen_server Stop
  • 53. Database scalability DB is (almost) never on latency critical path AWS S3 No need for low latency DB
  • 54. Database scalability Throughput required is low AWS S3 We can approximate S3 capacity as infinite
  • 55. Database scalability Ubiquitous and uniform from application servers point of view. AWS S3
  • 56. Remember? AWS S3
  • 57. How it actually works Data from the S3 is uniformly available to any ec2 instance
  • 58. And as you zoom in...
  • 60. And zoom in you see... EC2 instance
  • 61. Always the same kind of structure A fractal approach to throughput
  • 62. Fractal “A cauliflower shows how an object can be made of many parts, each of which is like a whole, but smaller.” Benoît B. Mandelbrot http://www.flickr.com/photos/paulobrabo/3588387063
  • 63. Homework The exact same solution might not work for you... but look for that unit of throughput
  • 64. Answer time You need XXXX Smallish instances to serve 0.25 billions uncacheable reqs/day
  • 65. Answer time You need 0XXX Smallish instances to serve 0.25 billions uncacheable reqs/day
  • 66. Answer time You need 00XX Smallish instances to serve 0.25 billions uncacheable reqs/day
  • 67. Answer time You need 0012 Smallish instances to serve 0.25 billions uncacheable reqs/day