Load balancing
theory and practice
Welcome
Me:
• Dave Rosenthal
• Co-founder of FoundationDB
• Spent last three years building a distributed
  transactional NoSQL database
• It’s my birthday

Any time you have multiple computers working on a
      job, you have a load balancing problem!
Warning
There is an ugly downside to learning about load
balancing: TSA checkpoints, grocery store lines,
and traffic lights may become even more
frustrating.
What is load balancing?
Wikipedia: “…methodology to distribute
workload across multiple computers … to
achieve optimal resource utilization, maximize
throughput, minimize response time, and avoid
overload”

          All part of the latency curve
The latency curve
[Figure: latency (log scale, 1 to 10000) versus jobs/second, with regions labeled Nominal, Interesting, Saturation, and Overload]
Goal for real-time systems

[Figure: latency (log scale, 1 to 10000) versus jobs/second; annotation: low latency at given load]
Goal for batch systems

[Figure: latency (log scale, 1 to 10000) versus jobs/second; annotation: high jobs/sec at a reasonable latency]
The latency curve
[Figure: latency in ms (log scale, 1 to 1000) versus load (0 to 1)]

Better load balancing strategies can dramatically improve both
latency and throughput
Load balancing tensions
• We want to reduce queue lengths in the
  system to yield better latency
• We want to lengthen queue lengths to keep a
  “buffer” of work to keep busy during irregular
  traffic and yield better throughput
• For distributed systems, equalizing queue
  lengths sounds good
Can we just limit queue sizes?
[Figure: % of dropped jobs (0 to 40) versus queued job limit (0 to 20)]
Simple strategies
Global job queue: for slow tasks
Round robin: for highly uniform situations
Random: probably won’t screw you
Sticky: for cacheable situations
Fastest of N tries: trade off throughput for
latency. I recommend N = 2 or 3.
Use a global queue if possible
[Figure: latency under 80% load (log scale, 0.1 to 10) versus cluster size (1 to 10); series: random assignment, global job queue]
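
The comparison between random assignment and a global job queue can be explored with a small simulation. The sketch below is not the code behind the chart. It is a minimal tick-based model under assumed parameters (job sizes uniform on 1 to 15 ticks, one arrival chance of 0.1 per server per tick, so load is about 80%), and the function name `mean_latency` is hypothetical.

```python
import random
from collections import deque

def mean_latency(servers, global_queue, ticks=20_000, seed=1):
    """Mean job latency (in ticks) at roughly 80% load."""
    rng = random.Random(seed)
    # One shared FIFO for the global queue, else one FIFO per server.
    queues = [deque() for _ in range(1 if global_queue else servers)]
    running = [None] * servers      # per server: [arrival_tick, ticks_left]
    latencies = []
    for now in range(ticks):
        # Each server contributes one arrival chance of 0.1 per tick; job
        # sizes average 8 ticks, so offered load per server is 0.1 * 8 = 0.8.
        for _ in range(servers):
            arrived = rng.random() < 0.1
            size = rng.randint(1, 15)
            target = rng.randrange(servers)  # always drawn: same stream in both modes
            if arrived:
                queues[0 if global_queue else target].append([now, size])
        for s in range(servers):
            queue = queues[0 if global_queue else s]
            if running[s] is None and queue:
                running[s] = queue.popleft()
            if running[s] is not None:
                running[s][1] -= 1           # one tick of work
                if running[s][1] == 0:
                    latencies.append(now + 1 - running[s][0])
                    running[s] = None
    # Jobs still unfinished at the end are ignored.
    return sum(latencies) / len(latencies)

for n in (1, 2, 4, 8):
    print(n, round(mean_latency(n, False), 1), round(mean_latency(n, True), 1))
```

Each row prints the cluster size, then mean latency for random assignment, then for the global queue. With one server the two strategies are the same thing, so the numbers match.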
Options for information transfer
•   None (rare)
•   Latency (most common)
•   Failure detection
•   Explicit
    – Load average
    – Queue length
    – Response times
FoundationDB’s approach
1. Send the request to a random one of three servers
2. Server either answers the query or replies “busy” if its
   queue is longer than the queue limit estimate
3. Queries that got “busy” are sent to a second random
   server with the “must do” flag set.

Queue limit = 25 * 2^(20*P)
• A global queue limit is implicitly shared by estimating
  the fraction of incoming requests (P) that are flagged
  “must do”
• Converges to a P(redirect)/queue-size equilibrium
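
As a rough illustration of the formula above, the sketch below computes the queue limit from P and applies the busy/accept rule from steps 2 and 3. The function names and the way P reaches the server are assumptions made for the example, not FoundationDB's actual code.

```python
def queue_limit(p_must_do: float) -> float:
    """Queue limit = 25 * 2^(20*P), where P is the estimated fraction of
    incoming requests that carry the "must do" flag."""
    return 25 * 2 ** (20 * p_must_do)

def handle(queue_length: int, must_do: bool, p_must_do: float) -> str:
    """Hypothetical per-request decision on one server."""
    # A "must do" request was already turned away once, so it is always taken.
    if must_do or queue_length <= queue_limit(p_must_do):
        return "accept"
    return "busy"

for p in (0.0, 0.05, 0.10, 0.20):
    print(f"P = {p:.2f} -> queue limit {queue_limit(p):.0f}")
```

The limit doubles for every 5 percentage points of redirected traffic, so a server that sees many "must do" requests learns that the whole cluster is busy and tolerates a longer queue.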
FDB latency curve before/after
[Figure: latency (log scale, 0.1 to 100) versus operations per second (0 to 1,200,000)]
Tackling load balancing
•   Queuing theory: One useful insight
•   Simulation: Do this
•   Instrumentation: Do this
•   Control theory: Know how to avoid this
•   Operations research: Read about this for fun
    – Blackett: Shield planes where they are not shot!
The one insight: Little’s law
                       Q = R*W

•   (Q)ueue size = (R)ate * (W)ait-time
•   Q is the average number of jobs in the system
•   R is the average arrival rate (jobs/second)
•   W is the average wait time (seconds)
•   For any (!) steady-state systems
    – Or sub-systems, or joint systems, or…
Little’s law example 1
                     Q = R*W

•   We get 1,000,000 requests per second (R=1E6)
•   We take 100 ms to service each request
•   (Q = 1E6*0.100)
•   Little’s Law: Average queue depth is 100,000!
Little’s law example 2
                   W = Q/R

• We have 100 users in the system making
  continuous requests (Q=100)
• We get 10,000 requests per second
• (W = 100 / 10,000)
• Little’s Law: Average wait time is 10 ms
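
Both examples are one line of arithmetic each. The sketch below just restates them in code; the function names are made up for the illustration.

```python
def queue_size(rate: float, wait: float) -> float:
    """Little's law, Q = R * W: average number of jobs in the system."""
    return rate * wait

def wait_time(queue: float, rate: float) -> float:
    """Little's law rearranged, W = Q / R: average time a job spends in the system."""
    return queue / rate

# Example 1: 1,000,000 requests/second, 100 ms each.
print(queue_size(rate=1e6, wait=0.100))     # average jobs in flight

# Example 2: 100 users in the system, 10,000 requests/second.
print(wait_time(queue=100, rate=10_000))    # average wait in seconds
```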
Little’s law ramifications
                     Q = R*W

• In distributed system:
  – R scales up
  – W remains the same, or gets a bit worse
• To maintain performance, you’re going to
  need a whole lot of jobs in flight
The rest of queuing theory
Erlang
• A language
• A man (Agner Krarup Erlang)
• And a unit! (Q from Little’s law, AKA offered load, is
  measured in dimensionless Erlang units)
• Erlang-B formula (for limited-length queues)
• Erlang-C formula (P(waiting))
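
For reference, both formulas are short to compute. The sketch below uses the standard textbook forms: Erlang-B as the probability that a job is turned away when there are m servers and no waiting room, and Erlang-C as the probability that a job has to wait when the queue is unbounded. The function names are illustrative.

```python
def erlang_b(offered_load: float, servers: int) -> float:
    """Blocking probability for `servers` servers and no waiting room.

    Uses the usual recurrence B(E, 0) = 1,
    B(E, m) = E * B(E, m-1) / (m + E * B(E, m-1)).
    """
    b = 1.0
    for m in range(1, servers + 1):
        b = offered_load * b / (m + offered_load * b)
    return b

def erlang_c(offered_load: float, servers: int) -> float:
    """Probability that an arriving job must wait (unbounded queue)."""
    if offered_load >= servers:
        return 1.0  # overloaded: every job waits
    b = erlang_b(offered_load, servers)
    return servers * b / (servers - offered_load * (1 - b))

print(erlang_b(8.0, 10))  # 8 Erlangs offered to 10 servers, no queue
print(erlang_c(8.0, 10))  # same load, with a queue
```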
Abandon hope

[Figure: “Math for queuing theory”: complexity of math (log scale, 1 to 10000) versus real-world applicability; labeled points: Little’s law and “?”]
Simulation
The best way to explore distributed system
behavior
Quiz
Model: Jobs of random durations. 80% load.
Goal: Minimize average job latency.

What to work a bit more on?
• First task received
• Last task received
• Shortest task
• Longest task
• Random task
• Task with least work remaining
• Task with most work remaining
Simulation code snippets
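
The original snippets did not survive in this transcript. As a stand-in, here is a minimal sketch of how the seven strategies from the quiz could be compared. It assumes a single server that works in one-tick slices, job sizes uniform on 1 to 15 ticks, and an arrival chance of 0.1 per tick (about 80% load). All names are hypothetical, and its output should not be read as the numbers behind the charts.

```python
import random
from dataclasses import dataclass

@dataclass(eq=False)            # jobs compare by identity, not by value
class Job:
    arrived: int                # tick at which the job arrived
    size: int                   # total work, in ticks
    remaining: int              # work left, in ticks

# Each policy picks the job that gets the next tick of work.
POLICIES = {
    "First task received": lambda jobs, rng: min(jobs, key=lambda j: j.arrived),
    "Last task received": lambda jobs, rng: max(jobs, key=lambda j: j.arrived),
    "Shortest task": lambda jobs, rng: min(jobs, key=lambda j: j.size),
    "Longest task": lambda jobs, rng: max(jobs, key=lambda j: j.size),
    "Random task": lambda jobs, rng: rng.choice(jobs),
    "Task with least work remaining": lambda jobs, rng: min(jobs, key=lambda j: j.remaining),
    "Task with most work remaining": lambda jobs, rng: max(jobs, key=lambda j: j.remaining),
}

def simulate(policy, ticks=50_000, arrival_prob=0.1, max_size=15, seed=1):
    """Return the mean latency (in ticks) of jobs completed under one policy."""
    arrivals = random.Random(seed)       # identical arrival stream for every policy
    chooser = random.Random(seed + 1)    # only the random policy draws from this
    jobs, latencies = [], []
    for now in range(ticks):
        if arrivals.random() < arrival_prob:
            size = arrivals.randint(1, max_size)
            jobs.append(Job(now, size, size))
        if jobs:
            job = policy(jobs, chooser)
            job.remaining -= 1           # one tick of work on the chosen job
            if job.remaining == 0:
                jobs.remove(job)
                latencies.append(now + 1 - job.arrived)
    # Jobs still unfinished at the end are ignored.
    return sum(latencies) / len(latencies)

results = {name: simulate(policy) for name, policy in POLICIES.items()}
for name, latency in results.items():
    print(f"{name:32s}{latency:8.1f}")
```

Raising `arrival_prob` toward 0.119 (about 95% load) shows how the gaps between strategies widen as the system approaches saturation.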
Simulation results at 80% load
[Figure: bar chart of latency (0 to 50) for each strategy: first task received, last task received, shortest task, longest task, random task, task with least work remaining, task with most work remaining]
Simulation results at 95% load
[Figure: bar chart of latency (log scale, 1 to 100000) for each strategy: first task received, last task received, shortest task, longest task, random task, task with least work remaining, task with most work remaining]
FoundationDB’s approach
• A strategy validated using simulation is used for a
  single server’s fiber scheduling
• High priority: Work on the next task to finish
• But be careful to enqueue incoming work
  from the network with highest priority—we
  want to know about all our jobs to make good
  decisions
• Low priority: Catch up with housekeeping (e.g.
  non-log writing)
Load spikes
[Figure: two panels, low load system and high load system]

  Bursts of job requests can destroy latency. The
  effect is quadratic: a burst produces a queue of
  size B that lasts for a time proportional to B. On
  highly loaded systems, the effect is multiplied by
  1/(1 - load), leading to huge latency impacts.
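
The quadratic effect can be checked with a back-of-the-envelope fluid model. This model is an assumption of the sketch, not something taken from the slides: a burst of B jobs lands on a server that is already running at the given load, the queue drains at the spare capacity (1 - load) / service_time, and the total waiting added is the area of the resulting triangle.

```python
def burst_cost(burst_size: float, service_time: float, load: float):
    """Fluid approximation of a burst hitting a server at `load` (0 <= load < 1).

    Returns (drain_time, total_wait): the seconds until the extra queue is
    gone, and the job-seconds of waiting the burst adds.
    """
    drain_time = burst_size * service_time / (1.0 - load)
    total_wait = burst_size * drain_time / 2.0   # area of the queue triangle
    return drain_time, total_wait

for load in (0.5, 0.9, 0.99):
    for burst in (100, 200):
        drain, wait = burst_cost(burst, service_time=0.01, load=load)
        print(f"load={load:4.2f} burst={burst:3d} drain={drain:7.1f}s wait={wait:9.1f} job-s")
```

Doubling the burst quadruples the added waiting, and the 1/(1 - load) factor multiplies it further as load rises.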
Burst-avoiding tip
1. Search for any delay/interval in your system
2. If system correctness depends on the
   delay/interval being exact, first fix that
3. Now change that delay/interval to randomly
   wait 0.8-1.2 times the nominal time on each
   execution

YMMV, but this tends to diffuse system events more
  evenly in time and help utilization and latency.
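
Step 3 is a one-line change in most codebases. A minimal sketch, with a hypothetical helper name:

```python
import random

def jittered(nominal_seconds: float, rng=random) -> float:
    """Return a delay drawn uniformly from 0.8x to 1.2x the nominal time."""
    return nominal_seconds * rng.uniform(0.8, 1.2)

# Instead of time.sleep(60) inside a periodic loop, a caller would use
# time.sleep(jittered(60)), drawing a fresh delay on every execution.
print([round(jittered(60), 1) for _ in range(5)])
```

Drawing a new delay each time matters: a fixed per-process offset would still let timers that started together fire together.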
Overload
[Figure: latency (log scale, 1 to 10000) versus jobs/second, with the Overload region labeled]
Overload
What happens when work comes in too fast?
• Somewhere in your system a queue is going to
  get huge. Where?
• Lowered efficiency due to:
  – Sloshing
  – Poor caching
• Unconditional acceptance of new work means
  no information transfer to previous system!
Overload (cont’d): Sloshing
Loading 10 million rows into a popular NoSQL K/V
store shows sloshing

[Figure: time span shown is 12.5 minutes]
Overload (cont’d): No sloshing
Loading 10 million rows into FDB shows smooth
behavior:
System queuing

[Diagram: Node 1, Node 2, and Node 3, each with its own empty queue; work items A, B, C, D, E wait outside the system]
System queuing

[Diagram: A has entered the queue of Node 1; work items B, C, D, E still wait outside the system]
Internal queue buildup

[Diagram: A, B, C, and D are all queued at Node 1 while the queues of Node 2 and Node 3 are empty; E waits outside the system]
Even queues, external buildup

[Diagram: one job in each node's queue (C at Node 1, B at Node 2, A at Node 3); D, E, … wait outside the system]
Our approach
“Ratekeeper”
• Active management of internal queue sizes
  prevents sloshing
• Avoids every subcomponent needing its own
  well-tuned load balancing strategy
• Explicitly sends queue information at 10 Hz back to
  a centrally elected control algorithm
• When queues get large, slow system input
• Pushes latency into an external queue at the
  front of the system using “tickets”
Ratekeeper in action
[Figure: operations per second (0 to 1,400,000) versus seconds (0 to 600)]
Ratekeeper internals
What can go wrong
Well, we are controlling the queue depths of the
system, so, basically, everything in control
theory…



Namely, oscillation:
Recognizing oscillation
• Something moving up and down :)
  – Look for low utilization of parallel resources
  – Zoom in!
• Think about sources of feedback: is there
  some way that a machine getting more jobs
  done feeds either less or more work to
  that machine in the future? (probably yes)
What oscillation looks like
[Figure: utilization % (0 to 70) of Node A and Node B; x-axis from 1 to 5]
What oscillation looks like
[Figure: zoomed-in view of utilization % (-20 to 120) of Node A and Node B; x-axis from 2 to 2.3]
Avoiding oscillation
• This is control theory—avoid if possible!
• The major thing to know: control gets harder
  as frequencies get higher. (e.g. Bose
  headphones)
• Two strategies:
  – Control on a longer time scale
  – Introduce a low-pass filter in the control loop (e.g.
    exponential moving average)
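
An exponential moving average is only a few lines. The sketch below is a generic version, with a hypothetical class name and smoothing factor; it shows how a fast oscillation in the measured signal is damped before it reaches the controller.

```python
class ExponentialMovingAverage:
    """Simple low-pass filter: value += alpha * (sample - value)."""

    def __init__(self, alpha: float):
        self.alpha = alpha      # 0 < alpha <= 1; smaller means smoother and slower
        self.value = None

    def update(self, sample: float) -> float:
        if self.value is None:
            self.value = float(sample)   # seed the filter with the first sample
        else:
            self.value += self.alpha * (sample - self.value)
        return self.value

# A queue-length signal that flips between 0 and 100 on every reading.
ema = ExponentialMovingAverage(alpha=0.1)
smoothed = [ema.update(100 if i % 2 else 0) for i in range(200)]
print(round(min(smoothed[-20:]), 1), round(max(smoothed[-20:]), 1))
```

The price of the smoothing is lag: with a small alpha the controller also reacts more slowly to real changes in load, which is the same trade as controlling on a longer time scale.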
Instrumentation
 If you can’t measure, you can’t make it better

Things that might be nice to measure:
• Latencies
• Queue lengths
• Causes of latency?
Measuring latencies
Our approach:
• We want information about the distribution, not
  just the average
• We use a “Distribution” class
  – addSample(X)
  – Stores 500+ samples
  – Throws away half of them when it hits 1000
    samples, and halves probability of accepting new
    samples
  – Also tracks exact min, max, mean, and stddev
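
A class with this behavior might look like the sketch below. It is a reconstruction from the bullet points only, not FoundationDB's code. The method is named `add_sample` to follow Python conventions, and the choices of which half to discard (a random half), the Welford update for the exact statistics, and the `percentile` helper are assumptions.

```python
import math
import random

class Distribution:
    """Keeps 500 to 999 roughly uniform samples plus exact summary statistics."""

    def __init__(self, max_samples: int = 1000, seed=None):
        self.max_samples = max_samples
        self.accept_prob = 1.0       # chance that a new sample is stored
        self.samples = []
        self.count = 0
        self.min = math.inf
        self.max = -math.inf
        self.mean = 0.0
        self._m2 = 0.0               # running sum of squared deviations
        self._rng = random.Random(seed)

    def add_sample(self, x: float) -> None:
        # Exact statistics over every sample ever seen (Welford's update).
        self.count += 1
        self.min = min(self.min, x)
        self.max = max(self.max, x)
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)
        # Stored samples: accept with the current probability.
        if self._rng.random() < self.accept_prob:
            self.samples.append(x)
            if len(self.samples) >= self.max_samples:
                # Throw away a random half and accept half as often from now on,
                # so old and new samples stay equally represented.
                self.samples = self._rng.sample(self.samples, self.max_samples // 2)
                self.accept_prob /= 2

    @property
    def stddev(self) -> float:
        """Population standard deviation of all samples seen."""
        return math.sqrt(self._m2 / self.count) if self.count else 0.0

    def percentile(self, q: float) -> float:
        """Approximate percentile (0 <= q <= 1) from the stored samples."""
        ordered = sorted(self.samples)
        return ordered[min(len(ordered) - 1, int(q * len(ordered)))]

d = Distribution(seed=0)
for latency_us in range(10_000):
    d.add_sample(latency_us)
print(d.count, len(d.samples), d.min, d.max, d.mean, d.percentile(0.99))
```

Memory stays bounded no matter how many samples arrive, while min, max, mean, and stddev remain exact because they never depend on the stored subset.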
Measuring queue lengths
Our approach:
• Track the % of time that a queue is at zero length
• Measure queue length snapshots at intervals
• Watch out for oscillations
   – Slow ones you can see
   – Fast ones look like noise (which, unfortunately, is also
     what noise looks like)
   – “Zoom in” to exclude the possibility of micro-
     oscillations
Measuring latency from blocking
• Easy to calculate:
   – L = (b0^2 + b1^2 + … + bN^2) / elapsed
   – Total all squared seconds of blocking time over some
     interval, divide by the duration of the interval.
• Measures impact of unavailability on mean
  latency from random traffic
• Example: Is a server’s high latency explained by
  this lock?
• Doesn’t count catch-up time.
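
The calculation is a single sum. The sketch below follows the formula exactly as written on the slide, with a hypothetical function name. Under a strictly uniform-arrival model a request that lands inside a block of length b waits b/2 on average, which would add a factor of 1/2, so treat the result as an estimate of scale rather than an exact mean.

```python
def blocking_latency(block_durations, elapsed: float) -> float:
    """L = (b0^2 + b1^2 + ... + bN^2) / elapsed, all times in seconds."""
    return sum(b * b for b in block_durations) / elapsed

# A lock was held three times (100 ms, 200 ms, 100 ms) during a 10 s window.
print(blocking_latency([0.1, 0.2, 0.1], elapsed=10.0))
```

Because the durations are squared, one long stall counts for far more than many short ones with the same total blocked time.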
Summary
Thanks for listening, and remember:
• Everything has a latency curve
• Little’s law
• Randomize regular intervals
• Validate designs with simulation
• Instrument

     May your queues be small, but not empty
david.rosenthal@foundationdb.com
Prioritization/QOS
• Can help in systems under partial load
• Vital in systems that handle batch and real-
  time loads simultaneously
• Be careful that high priority work doesn’t
  generate other high priority work plus other
  jobs in the queue. This can lead to poor
  utilization analogous to the internal queue
  buildup case.
Congestion pricing
• My favorite topic
• Priority isn’t just a function of the benefit of
  your job
• To be a good citizen, you should subtract the
  costs to others
• For example, jumping into the front of a long
  queue has costs proportional to the queue
  size
Other FIFO alternatives?
• LIFO
  – Avoids the reason to line up early
  – In situations where there is adequate capacity to
    serve everyone, can yield better waiting times for
    everyone involved

More Related Content

What's hot

Design Patterns (Tasarım Kalıpları)
Design Patterns (Tasarım Kalıpları)Design Patterns (Tasarım Kalıpları)
Design Patterns (Tasarım Kalıpları)nedirtv
 
The never-ending REST API design debate
The never-ending REST API design debateThe never-ending REST API design debate
The never-ending REST API design debateRestlet
 
Oracle CodeOne 2019: Decompose Your Monolith: Strategies for Migrating to Mic...
Oracle CodeOne 2019: Decompose Your Monolith: Strategies for Migrating to Mic...Oracle CodeOne 2019: Decompose Your Monolith: Strategies for Migrating to Mic...
Oracle CodeOne 2019: Decompose Your Monolith: Strategies for Migrating to Mic...Chris Richardson
 
Asynchronous processing in big system
Asynchronous processing in big systemAsynchronous processing in big system
Asynchronous processing in big systemNghia Minh
 
High Performance Systems in Go - GopherCon 2014
High Performance Systems in Go - GopherCon 2014High Performance Systems in Go - GopherCon 2014
High Performance Systems in Go - GopherCon 2014Derek Collison
 
Api chaining(tm)
Api chaining(tm)Api chaining(tm)
Api chaining(tm)Owen Rubel
 
CQRS + Event Sourcing
CQRS + Event SourcingCQRS + Event Sourcing
CQRS + Event SourcingMike Bild
 
Intelligent, Automatic Restarts for Unhealthy Kafka Consumers on Kubernetes w...
Intelligent, Automatic Restarts for Unhealthy Kafka Consumers on Kubernetes w...Intelligent, Automatic Restarts for Unhealthy Kafka Consumers on Kubernetes w...
Intelligent, Automatic Restarts for Unhealthy Kafka Consumers on Kubernetes w...HostedbyConfluent
 
Bench, a Framework for Benchmarking Kafka Using K8s and OpenMessaging Benchma...
Bench, a Framework for Benchmarking Kafka Using K8s and OpenMessaging Benchma...Bench, a Framework for Benchmarking Kafka Using K8s and OpenMessaging Benchma...
Bench, a Framework for Benchmarking Kafka Using K8s and OpenMessaging Benchma...HostedbyConfluent
 
Mission Critical Applications Workloads on Amazon Web Services
Mission Critical Applications Workloads on Amazon Web ServicesMission Critical Applications Workloads on Amazon Web Services
Mission Critical Applications Workloads on Amazon Web ServicesAmazon Web Services
 
대용량 분산 아키텍쳐 설계 #5. rest
대용량 분산 아키텍쳐 설계 #5. rest대용량 분산 아키텍쳐 설계 #5. rest
대용량 분산 아키텍쳐 설계 #5. restTerry Cho
 
Chaos Engineering with Kubernetes - Berlin / Hamburg Chaos Engineering Meetup...
Chaos Engineering with Kubernetes - Berlin / Hamburg Chaos Engineering Meetup...Chaos Engineering with Kubernetes - Berlin / Hamburg Chaos Engineering Meetup...
Chaos Engineering with Kubernetes - Berlin / Hamburg Chaos Engineering Meetup...Ana Medina
 
Hash join in MySQL 8
Hash join in MySQL 8Hash join in MySQL 8
Hash join in MySQL 8Erik Frøseth
 
HTTP/3 시대의 웹 성능 최적화 기술 이해하기
HTTP/3 시대의 웹 성능 최적화 기술 이해하기HTTP/3 시대의 웹 성능 최적화 기술 이해하기
HTTP/3 시대의 웹 성능 최적화 기술 이해하기SangJin Kang
 
Microservices, Node, Dapr and more - Part One (Fontys Hogeschool, Spring 2022)
Microservices, Node, Dapr and more - Part One (Fontys Hogeschool, Spring 2022)Microservices, Node, Dapr and more - Part One (Fontys Hogeschool, Spring 2022)
Microservices, Node, Dapr and more - Part One (Fontys Hogeschool, Spring 2022)Lucas Jellema
 

What's hot (20)

Design Patterns (Tasarım Kalıpları)
Design Patterns (Tasarım Kalıpları)Design Patterns (Tasarım Kalıpları)
Design Patterns (Tasarım Kalıpları)
 
The never-ending REST API design debate
The never-ending REST API design debateThe never-ending REST API design debate
The never-ending REST API design debate
 
Oracle CodeOne 2019: Decompose Your Monolith: Strategies for Migrating to Mic...
Oracle CodeOne 2019: Decompose Your Monolith: Strategies for Migrating to Mic...Oracle CodeOne 2019: Decompose Your Monolith: Strategies for Migrating to Mic...
Oracle CodeOne 2019: Decompose Your Monolith: Strategies for Migrating to Mic...
 
Source control
Source controlSource control
Source control
 
Asynchronous processing in big system
Asynchronous processing in big systemAsynchronous processing in big system
Asynchronous processing in big system
 
High Performance Systems in Go - GopherCon 2014
High Performance Systems in Go - GopherCon 2014High Performance Systems in Go - GopherCon 2014
High Performance Systems in Go - GopherCon 2014
 
Java performance tuning
Java performance tuningJava performance tuning
Java performance tuning
 
Api chaining(tm)
Api chaining(tm)Api chaining(tm)
Api chaining(tm)
 
CQRS + Event Sourcing
CQRS + Event SourcingCQRS + Event Sourcing
CQRS + Event Sourcing
 
Apache ZooKeeper
Apache ZooKeeperApache ZooKeeper
Apache ZooKeeper
 
Intelligent, Automatic Restarts for Unhealthy Kafka Consumers on Kubernetes w...
Intelligent, Automatic Restarts for Unhealthy Kafka Consumers on Kubernetes w...Intelligent, Automatic Restarts for Unhealthy Kafka Consumers on Kubernetes w...
Intelligent, Automatic Restarts for Unhealthy Kafka Consumers on Kubernetes w...
 
Practical Object Oriented Models In Sql
Practical Object Oriented Models In SqlPractical Object Oriented Models In Sql
Practical Object Oriented Models In Sql
 
Domain Event - The Hidden Gem of DDD
Domain Event - The Hidden Gem of DDDDomain Event - The Hidden Gem of DDD
Domain Event - The Hidden Gem of DDD
 
Bench, a Framework for Benchmarking Kafka Using K8s and OpenMessaging Benchma...
Bench, a Framework for Benchmarking Kafka Using K8s and OpenMessaging Benchma...Bench, a Framework for Benchmarking Kafka Using K8s and OpenMessaging Benchma...
Bench, a Framework for Benchmarking Kafka Using K8s and OpenMessaging Benchma...
 
Mission Critical Applications Workloads on Amazon Web Services
Mission Critical Applications Workloads on Amazon Web ServicesMission Critical Applications Workloads on Amazon Web Services
Mission Critical Applications Workloads on Amazon Web Services
 
대용량 분산 아키텍쳐 설계 #5. rest
대용량 분산 아키텍쳐 설계 #5. rest대용량 분산 아키텍쳐 설계 #5. rest
대용량 분산 아키텍쳐 설계 #5. rest
 
Chaos Engineering with Kubernetes - Berlin / Hamburg Chaos Engineering Meetup...
Chaos Engineering with Kubernetes - Berlin / Hamburg Chaos Engineering Meetup...Chaos Engineering with Kubernetes - Berlin / Hamburg Chaos Engineering Meetup...
Chaos Engineering with Kubernetes - Berlin / Hamburg Chaos Engineering Meetup...
 
Hash join in MySQL 8
Hash join in MySQL 8Hash join in MySQL 8
Hash join in MySQL 8
 
HTTP/3 시대의 웹 성능 최적화 기술 이해하기
HTTP/3 시대의 웹 성능 최적화 기술 이해하기HTTP/3 시대의 웹 성능 최적화 기술 이해하기
HTTP/3 시대의 웹 성능 최적화 기술 이해하기
 
Microservices, Node, Dapr and more - Part One (Fontys Hogeschool, Spring 2022)
Microservices, Node, Dapr and more - Part One (Fontys Hogeschool, Spring 2022)Microservices, Node, Dapr and more - Part One (Fontys Hogeschool, Spring 2022)
Microservices, Node, Dapr and more - Part One (Fontys Hogeschool, Spring 2022)
 

Viewers also liked

Deterministic simulation testing
Deterministic simulation testingDeterministic simulation testing
Deterministic simulation testingFoundationDB
 
My Trip Through Australia
My Trip Through AustraliaMy Trip Through Australia
My Trip Through Australiadeankathryn
 
рождественский вертеп
рождественский вертепрождественский вертеп
рождественский вертепTatiana Tretiakova
 
Estacion los angeles
Estacion los angelesEstacion los angeles
Estacion los angelesJAIME JIPSION
 
Micro-services Battle Scars
Micro-services Battle ScarsMicro-services Battle Scars
Micro-services Battle ScarsRichard Rodger
 
Design Patterns in Micro-services architectures & Gilmour
Design Patterns in Micro-services architectures & GilmourDesign Patterns in Micro-services architectures & Gilmour
Design Patterns in Micro-services architectures & GilmourPiyush Verma
 
Reactive Web-Applications @ LambdaDays
Reactive Web-Applications @ LambdaDaysReactive Web-Applications @ LambdaDays
Reactive Web-Applications @ LambdaDaysManuel Bernhardt
 
Delivering with Microservices - How to Iterate Towards Sophistication
Delivering with Microservices - How to Iterate Towards SophisticationDelivering with Microservices - How to Iterate Towards Sophistication
Delivering with Microservices - How to Iterate Towards SophisticationThoughtworks
 
Developing applications with a microservice architecture (SVforum, microservi...
Developing applications with a microservice architecture (SVforum, microservi...Developing applications with a microservice architecture (SVforum, microservi...
Developing applications with a microservice architecture (SVforum, microservi...Chris Richardson
 

Viewers also liked (14)

Deterministic simulation testing
Deterministic simulation testingDeterministic simulation testing
Deterministic simulation testing
 
My Trip Through Australia
My Trip Through AustraliaMy Trip Through Australia
My Trip Through Australia
 
рождественский вертеп
рождественский вертепрождественский вертеп
рождественский вертеп
 
Simple past lesson
Simple past lesson Simple past lesson
Simple past lesson
 
Музей Истоки
Музей ИстокиМузей Истоки
Музей Истоки
 
Лицей
ЛицейЛицей
Лицей
 
Estacion los angeles
Estacion los angelesEstacion los angeles
Estacion los angeles
 
Micro-services Battle Scars
Micro-services Battle ScarsMicro-services Battle Scars
Micro-services Battle Scars
 
Design Patterns in Micro-services architectures & Gilmour
Design Patterns in Micro-services architectures & GilmourDesign Patterns in Micro-services architectures & Gilmour
Design Patterns in Micro-services architectures & Gilmour
 
NoSQL and ACID
NoSQL and ACIDNoSQL and ACID
NoSQL and ACID
 
Reactive Web-Applications @ LambdaDays
Reactive Web-Applications @ LambdaDaysReactive Web-Applications @ LambdaDays
Reactive Web-Applications @ LambdaDays
 
Delivering with Microservices - How to Iterate Towards Sophistication
Delivering with Microservices - How to Iterate Towards SophisticationDelivering with Microservices - How to Iterate Towards Sophistication
Delivering with Microservices - How to Iterate Towards Sophistication
 
Process design
Process designProcess design
Process design
 
Developing applications with a microservice architecture (SVforum, microservi...
Developing applications with a microservice architecture (SVforum, microservi...Developing applications with a microservice architecture (SVforum, microservi...
Developing applications with a microservice architecture (SVforum, microservi...
 

Similar to Load balancing theory and practice

Lessons learnt on a 2000-core cluster
Lessons learnt on a 2000-core clusterLessons learnt on a 2000-core cluster
Lessons learnt on a 2000-core clusterEugene Kirpichov
 
It Probably Works
It Probably WorksIt Probably Works
It Probably WorksFastly
 
Complicating Complexity: Performance in a New Machine Age
Complicating Complexity: Performance in a New Machine AgeComplicating Complexity: Performance in a New Machine Age
Complicating Complexity: Performance in a New Machine AgeMaurice Naftalin
 
On the way to low latency
On the way to low latencyOn the way to low latency
On the way to low latencyArtem Orobets
 
Patterns of parallel programming
Patterns of parallel programmingPatterns of parallel programming
Patterns of parallel programmingAlex Tumanoff
 
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...MLconf
 
Resilience at exascale
Resilience at exascaleResilience at exascale
Resilience at exascaleMarc Snir
 
Intro to High Performance Computing in the AWS Cloud
Intro to High Performance Computing in the AWS CloudIntro to High Performance Computing in the AWS Cloud
Intro to High Performance Computing in the AWS CloudAmazon Web Services
 
On the way to low latency (2nd edition)
On the way to low latency (2nd edition)On the way to low latency (2nd edition)
On the way to low latency (2nd edition)Artem Orobets
 
Resilience at Extreme Scale
Resilience at Extreme ScaleResilience at Extreme Scale
Resilience at Extreme ScaleMarc Snir
 
Mininet: Moving Forward
Mininet: Moving ForwardMininet: Moving Forward
Mininet: Moving ForwardON.Lab
 
The Power of Determinism in Database Systems
The Power of Determinism in Database SystemsThe Power of Determinism in Database Systems
The Power of Determinism in Database SystemsDaniel Abadi
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudyJohn Adams
 
Discretized Stream - Fault-Tolerant Streaming Computation at Scale - SOSP
Discretized Stream - Fault-Tolerant Streaming Computation at Scale - SOSPDiscretized Stream - Fault-Tolerant Streaming Computation at Scale - SOSP
Discretized Stream - Fault-Tolerant Streaming Computation at Scale - SOSPTathagata Das
 
Eventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and Hadoop
Eventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and HadoopEventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and Hadoop
Eventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and HadoopAyon Sinha
 
Erlang as a cloud citizen, a fractal approach to throughput
Erlang as a cloud citizen, a fractal approach to throughputErlang as a cloud citizen, a fractal approach to throughput
Erlang as a cloud citizen, a fractal approach to throughputPaolo Negri
 
Erlang and the Cloud: A Fractal Approach to Throughput
Erlang and the Cloud: A Fractal Approach to ThroughputErlang and the Cloud: A Fractal Approach to Throughput
Erlang and the Cloud: A Fractal Approach to ThroughputWooga
 
Erlang as a Cloud Citizen
Erlang as a Cloud CitizenErlang as a Cloud Citizen
Erlang as a Cloud CitizenWooga
 
Netflix Development Patterns for Scale, Performance & Availability (DMG206) |...
Netflix Development Patterns for Scale, Performance & Availability (DMG206) |...Netflix Development Patterns for Scale, Performance & Availability (DMG206) |...
Netflix Development Patterns for Scale, Performance & Availability (DMG206) |...Amazon Web Services
 

Similar to Load balancing theory and practice (20)

Lessons learnt on a 2000-core cluster
Lessons learnt on a 2000-core clusterLessons learnt on a 2000-core cluster
Lessons learnt on a 2000-core cluster
 
It Probably Works
It Probably WorksIt Probably Works
It Probably Works
 
Complicating Complexity: Performance in a New Machine Age
Complicating Complexity: Performance in a New Machine AgeComplicating Complexity: Performance in a New Machine Age
Complicating Complexity: Performance in a New Machine Age
 
On the way to low latency
On the way to low latencyOn the way to low latency
On the way to low latency
 
Patterns of parallel programming
Patterns of parallel programmingPatterns of parallel programming
Patterns of parallel programming
 
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
 
Resilience at exascale
Resilience at exascaleResilience at exascale
Resilience at exascale
 
Intro to High Performance Computing in the AWS Cloud
Intro to High Performance Computing in the AWS CloudIntro to High Performance Computing in the AWS Cloud
Intro to High Performance Computing in the AWS Cloud
 
On the way to low latency (2nd edition)
On the way to low latency (2nd edition)On the way to low latency (2nd edition)
On the way to low latency (2nd edition)
 
Adaptive availability
Adaptive availabilityAdaptive availability
Adaptive availability
 
Resilience at Extreme Scale
Resilience at Extreme ScaleResilience at Extreme Scale
Resilience at Extreme Scale
 
Mininet: Moving Forward
Mininet: Moving ForwardMininet: Moving Forward
Mininet: Moving Forward
 
The Power of Determinism in Database Systems
The Power of Determinism in Database SystemsThe Power of Determinism in Database Systems
The Power of Determinism in Database Systems
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudy
 
Discretized Stream - Fault-Tolerant Streaming Computation at Scale - SOSP
Discretized Stream - Fault-Tolerant Streaming Computation at Scale - SOSPDiscretized Stream - Fault-Tolerant Streaming Computation at Scale - SOSP
Discretized Stream - Fault-Tolerant Streaming Computation at Scale - SOSP
 
Eventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and Hadoop
Eventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and HadoopEventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and Hadoop
Eventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and Hadoop
 
Erlang as a cloud citizen, a fractal approach to throughput
Erlang as a cloud citizen, a fractal approach to throughputErlang as a cloud citizen, a fractal approach to throughput
Erlang as a cloud citizen, a fractal approach to throughput
 
Erlang and the Cloud: A Fractal Approach to Throughput
Erlang and the Cloud: A Fractal Approach to ThroughputErlang and the Cloud: A Fractal Approach to Throughput
Erlang and the Cloud: A Fractal Approach to Throughput
 
Erlang as a Cloud Citizen
Erlang as a Cloud CitizenErlang as a Cloud Citizen
Erlang as a Cloud Citizen
 
Netflix Development Patterns for Scale, Performance & Availability (DMG206) |...
Netflix Development Patterns for Scale, Performance & Availability (DMG206) |...Netflix Development Patterns for Scale, Performance & Availability (DMG206) |...
Netflix Development Patterns for Scale, Performance & Availability (DMG206) |...
 

Load balancing theory and practice

  • 2. Welcome Me: • Dave Rosenthal • Co-founder of FoundationDB • Spent last three years building a distributed transactional NoSQL database • It’s my birthday Any time you have multiple computers working on a job, you have a load balancing problem!
  • 3. Warning There is an ugly downside to learning about load balancing: TSA checkpoints, grocery store lines, and traffic lights may become even more frustrating.
  • 4. What is load balancing? Wikipedia: “…methodology to distribute workload across multiple computers … to achieve optimal resource utilization, maximize throughput, minimize response time, and avoid overload” All part of the latency curve
  • 5. The latency curve [Figure: latency (log scale, 1 to 10000) versus jobs/second, with regions labeled Nominal, Interesting, Saturation, and Overload]
  • 6. Goal for real-time systems [Figure: latency (log scale) versus jobs/second] Low latency at given load
  • 7. Goal for batch systems [Figure: latency (log scale) versus jobs/second] High jobs/sec at a reasonable latency
  • 8. The latency curve [Figure: latency in ms (log scale, 1 to 1000) versus load (0 to 1)] Better load balancing strategies can dramatically improve both latency and throughput
  • 9. Load balancing tensions • We want to reduce queue lengths in the system to yield better latency • We want to lengthen queue lengths to keep a “buffer” of work to keep busy during irregular traffic and yield better throughput • For distributed systems, equalizing queue lengths sounds good
  • 10. Can we just limit queue sizes? [Figure: % of dropped jobs (0 to 40) versus queued job limit (0 to 20)]
  • 11. Simple strategies • Global job queue: for slow tasks • Round robin: for highly uniform situations • Random: probably won’t screw you • Sticky: for cacheable situations • Fastest of N tries: trade off throughput for latency. I recommend N = 2 or 3.
  • 12. Use a global queue if possible [Figure: latency under 80% load (log scale, 0.1 to 10) versus cluster size (1 to 10), comparing random assignment with a global job queue]
  • 13. Options for information transfer • None (rare) • Latency (most common) • Failure detection • Explicit – Load average – Queue length – Response times
  • 14. FoundationDB’s approach 1. Send the request to a random one of three servers 2. The server either answers the query or replies “busy” if its queue is longer than the queue limit estimate 3. Queries that got a “busy” reply are sent to a second random server with the “must do” flag set. Queue limit = 25 * 2^(20*P) • A global queue limit is implicitly shared by estimating the fraction of incoming requests (P) that are flagged “must do” • Converges to a P(redirect)/queue-size equilibrium
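The queue limit formula on slide 14 can be sketched as a small function. This is only an illustration of the arithmetic and the busy/must-do decision described on the slide; the function names and the way P is passed in are assumptions, not FoundationDB's actual code.

```python
def queue_limit(must_do_fraction):
    """Queue limit = 25 * 2^(20*P), where P is the estimated fraction of
    incoming requests flagged "must do" (formula from the slide)."""
    return 25 * 2 ** (20 * must_do_fraction)


def should_reply_busy(queue_length, must_do_fraction, must_do):
    """Hypothetical server-side decision: a "must do" request is always
    accepted; otherwise reply "busy" when the queue exceeds the limit."""
    if must_do:
        return False
    return queue_length > queue_limit(must_do_fraction)


# With no redirects the limit is 25; at P = 5% it doubles to 50.
print(queue_limit(0.0), queue_limit(0.05))
```

Because the limit grows exponentially in P, a rising share of redirected requests quickly loosens the limit on every server, which is one way to read the slide's "equilibrium" remark.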
  • 15. FDB latency curve before/after [Figure: latency (log scale, 0.1 to 100) versus operations per second (0 to 1,200,000)]
  • 16. Tackling load balancing • Queuing theory: One useful insight • Simulation: Do this • Instrumentation: Do this • Control theory: Know how to avoid this • Operations research: Read about this for fun – Blackett: Shield planes where they are not shot!
  • 17. The one insight: Little’s law Q = R*W • (Q)ueue size = (R)ate * (W)ait-time • Q is the average number of jobs in the system • R is the average arrival rate (jobs/second) • W is the average wait time (seconds) • For any (!) steady-state systems – Or sub-systems, or joint systems, or…
  • 18. Little’s law example 1 Q = R*W • We get 1,000,000 requests per second (R=1E6) • We take 100 ms to service each request • (Q = 1E6*0.100) • Little’s Law: Average queue depth is 100,000!
  • 19. Little’s law example 2 W = Q/R • We have 100 users in the system making continuous requests (Q=100) • We get 10,000 requests per second • (W = 100 / 10,000) • Little’s Law: Average wait time is 10 ms
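The two Little's law examples on slides 18 and 19 can be checked with a few lines of arithmetic. The function names below are illustrative only.

```python
def queue_size(rate_per_second, wait_seconds):
    """Little's law: Q = R * W."""
    return rate_per_second * wait_seconds


def wait_time(jobs_in_system, rate_per_second):
    """Little's law rearranged: W = Q / R."""
    return jobs_in_system / rate_per_second


# Example 1: 1,000,000 requests/second at 100 ms each.
print(queue_size(1e6, 0.100))   # average jobs in the system

# Example 2: 100 users in the system, 10,000 requests/second.
print(wait_time(100, 10_000))   # average wait in seconds
```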
  • 20. Little’s law ramifications Q = R*W • In a distributed system: – R scales up – W remains the same, or gets a bit worse • To maintain performance, you’re going to need a whole lot of jobs in flight
  • 21. The rest of queuing theory Erlang • A language • A man (Agner Krarup Erlang) • And a unit! (Q from Little’s law, AKA offered load, is measured in dimensionless Erlang units) • Erlang-B formula (for limited-length queues) • Erlang-C formula (P(waiting))
  • 22. Abandon hope [Figure: “Math for queuing theory”, complexity of math (log scale, 1 to 10000), with labels “Little’s law” and “Real-world applicability ?”]
  • 23. Simulation The best way to explore distributed system behavior
  • 24. Quiz Model: Jobs of random durations. 80% load. Goal: Minimize average job latency. What to work a bit more on? • First task received • Last task received • Shortest task • Longest task • Random task • Task with least work remaining • Task with most work remaining
  • 26. Simulation results at 80% load [Bar chart: latency (0 to 50) for each strategy: First task received, Last task received, Shortest task, Longest task, Random task, Task with least work remaining, Task with most work remaining]
  • 27. Simulation results at 95% load [Bar chart: latency (log scale, 1 to 100000) for each strategy: First task received, Last task received, Shortest task, Longest task, Random task, Task with least work remaining, Task with most work remaining]
  • 28. FoundationDB’s approach • The strategy validated by simulation is used for a single server’s fiber scheduling • High priority: Work on the next task to finish • But be careful to enqueue incoming work from the network with highest priority, because we want to know about all our jobs to make good decisions • Low priority: Catch up with housekeeping (e.g. non-log writing)
  • 29. Load spikes [Figure: a low-load system compared with a high-load system] Bursts of job requests can destroy latency. The effect is quadratic: a burst produces a queue of size B that lasts for a time proportional to B. On highly loaded systems, the effect is multiplied by 1/(1 - load), leading to huge latency impacts.
  • 30. Burst-avoiding tip 1. Search for any delay/interval in your system 2. If system correctness depends on the delay/interval being exact, first fix that 3. Now change that delay/interval to randomly wait 0.8-1.2 times the nominal time on each execution YMMV, but this tends to diffuse system events more evenly in time and help utilization and latency.
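Step 3 of the tip above amounts to multiplying each nominal delay by a random factor between 0.8 and 1.2. A minimal sketch, with a hypothetical helper name:

```python
import random


def jittered_interval(nominal_seconds, rng=random):
    """Return the nominal delay scaled by a random factor in [0.8, 1.2],
    drawn fresh on every call."""
    return nominal_seconds * rng.uniform(0.8, 1.2)


# A nominal 10-second timer now fires somewhere between 8 and 12 seconds.
for _ in range(3):
    print(round(jittered_interval(10.0), 2))
```

Drawing a new factor on each execution matters: a fixed per-process offset would still let periodic events from many machines stay aligned.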
  • 31. Overload [Figure: latency (log scale, 1 to 10000) versus jobs/second, with the Overload region marked]
  • 32. Overload What happens when work comes in too fast? • Somewhere in your system a queue is going to get huge. Where? • Lowered efficiency due to: – Sloshing – Poor caching • Unconditional acceptance of new work means no information transfer to previous system!
  • 33. Overload (cont’d): Sloshing Loading 10 million rows into a popular NoSQL K/V store shows sloshing [Figure annotation: 12.5 minutes]
  • 34. Overload (cont’d): No sloshing Loading 10 million rows into FDB shows smooth behavior:
  • 35. System queuing [Diagram: work items A to E and the queues in front of Node 1, Node 2, and Node 3]
  • 36. System queuing [Diagram: work items A to E and the queues in front of Node 1, Node 2, and Node 3]
  • 37. Internal queue buildup [Diagram: work items A to E and the queues in front of Node 1, Node 2, and Node 3]
  • 38. Even queues, external buildup [Diagram: work items A to E and the queues in front of Node 1, Node 2, and Node 3, with more work waiting outside the system]
  • 39. Our approach “Ratekeeper” • Active management of internal queue sizes prevents sloshing • Avoids every subcomponent needing its own well-tuned load balancing strategy • Explicitly send queue information at 10 Hz back to a centrally-elected control algorithm • When queues get large, slow system input • Pushes latency into an external queue at the front of the system using “tickets”
  • 40. Ratekeeper in action [Figure: operations per second (0 to 1,400,000) versus seconds (0 to 600)]
  • 42. What can go wrong Well, we are controlling the queue depths of the system, so, basically, everything in control theory… Namely, oscillation:
  • 43. Recognizing oscillation • Something moving up and down :) – Look for low utilization of parallel resources – Zoom in! • Think about sources of feedback: is there some way that a machine getting more jobs done feeds either less or more work to that machine in the future? (probably yes)
  • 44. What oscillation looks like [Figure: utilization % (0 to 70) of Node A and Node B over time (1 to 5)]
  • 45. What oscillation looks like [Figure: zoomed-in view, utilization % of Node A and Node B over time (2 to 2.3)]
  • 46. Avoiding oscillation • This is control theory: avoid if possible! • The major thing to know: control gets harder as frequencies get higher. (e.g. Bose headphones) • Two strategies: – Control on a longer time scale – Introduce a low-pass filter in the control loop (e.g. exponential moving average)
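The exponential moving average mentioned on slide 46 is about the simplest low-pass filter there is. A minimal sketch follows; the smoothing factor `alpha` and the function name are assumptions chosen for illustration.

```python
def exponential_moving_average(values, alpha):
    """Smooth a sequence: each output is alpha * new + (1 - alpha) * previous.
    Smaller alpha filters more aggressively (and reacts more slowly)."""
    smoothed = []
    current = None
    for value in values:
        if current is None:
            current = value  # seed the filter with the first reading
        else:
            current = alpha * value + (1 - alpha) * current
        smoothed.append(current)
    return smoothed


# A queue-length signal that flips between 0 and 10 on every reading.
print(exponential_moving_average([0, 10, 0, 10], alpha=0.5))
# -> [0, 5.0, 2.5, 6.25]
```

Feeding the smoothed value rather than the raw reading into a control loop damps the fast swings, at the cost of responding later to real changes.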
  • 47. Instrumentation If you can’t measure, you can’t make it better Things that might be nice to measure: • Latencies • Queue lengths • Causes of latency?
  • 48. Measuring latencies Our approach: • We want information about the distribution, not just the average • We use a “Distribution” class – addSample(X) – Stores 500+ samples – Throws away half of them when it hits 1000 samples, and halves probability of accepting new samples – Also tracks exact min, max, mean, and stddev
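The sampling scheme on slide 48 can be sketched as below. This is an illustrative reconstruction, not FoundationDB's class: the slide does not say which half of the samples is discarded, so dropping a random half is an assumption, as are the method names other than the idea of `addSample`.

```python
import math
import random


class Distribution:
    """Keeps a bounded random sample of values plus exact summary stats."""

    def __init__(self, capacity=1000, rng=None):
        self.capacity = capacity
        self.rng = rng or random.Random()
        self.samples = []
        self.accept_probability = 1.0
        self.count = 0
        self.total = 0.0
        self.total_sq = 0.0
        self.minimum = math.inf
        self.maximum = -math.inf

    def add_sample(self, x):
        # Exact statistics are updated for every value, sampled or not.
        self.count += 1
        self.total += x
        self.total_sq += x * x
        self.minimum = min(self.minimum, x)
        self.maximum = max(self.maximum, x)
        if self.rng.random() < self.accept_probability:
            self.samples.append(x)
            if len(self.samples) >= self.capacity:
                # Assumption: discard a random half, then accept new
                # samples half as often so old and new stay comparable.
                self.rng.shuffle(self.samples)
                del self.samples[self.capacity // 2:]
                self.accept_probability /= 2.0

    def mean(self):
        return self.total / self.count

    def stddev(self):
        m = self.mean()
        return math.sqrt(max(0.0, self.total_sq / self.count - m * m))

    def percentile(self, p):
        """Approximate percentile (0 <= p <= 1) from the retained samples."""
        ordered = sorted(self.samples)
        return ordered[min(len(ordered) - 1, int(p * len(ordered)))]
```

After the first 1000 values, the retained set always holds between 500 and 999 samples, so percentile estimates stay cheap regardless of how many values have been recorded.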
  • 49. Measuring queue lengths Our approach: • Track the % of time that a queue is at zero length • Measure queue length snapshots at intervals • Watch out for oscillations – Slow ones you can see – Fast ones look like noise (which, unfortunately, is also what noise looks like) – “Zoom in” to exclude the possibility of micro-oscillations
  • 50. Measuring latency from blocking • Easy to calculate: – L = (b0^2 + b1^2 + … + bN^2) / elapsed – Total all squared seconds of blocking time over some interval, divide by the duration of the interval. • Measures impact of unavailability on mean latency from random traffic • Example: Is server’s slow latency explained by this lock? • Doesn’t count catch-up time.
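The blocking metric from slide 50 is a one-line sum. The sketch below uses the formula exactly as written on the slide; the function name is hypothetical.

```python
def blocking_latency(blocking_seconds, elapsed_seconds):
    """L = (b0^2 + b1^2 + ... + bN^2) / elapsed, as given on the slide.

    blocking_seconds: durations of each blocking episode in the interval.
    elapsed_seconds: total length of the measurement interval.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return sum(b * b for b in blocking_seconds) / elapsed_seconds


# Two stalls of 1 s and 2 s inside a 10 s window.
print(blocking_latency([1.0, 2.0], 10.0))  # -> 0.5
```

Squaring makes one long stall count for far more than many short ones of the same total duration. Note that under a model of uniformly random arrivals the expected wait per request would be half this value, so the number is best read as a relative indicator.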
  • 51. Summary Thanks for listening, and remember: • Everything has a latency curve • Little’s law • Randomize regular intervals • Validate designs with simulation • Instrument May your queues be small, but not empty david.rosenthal@foundationdb.com
  • 52. Prioritization/QOS • Can help in systems under partial load • Vital in systems that handle batch and real-time loads simultaneously • Be careful that high priority work doesn’t generate other high priority work plus other jobs in the queue. This can lead to poor utilization analogous to the internal queue buildup case.
  • 53. Congestion pricing • My favorite topic • Priority isn’t just a function of the benefit of your job • To be a good citizen, you should subtract the costs to others • For example, jumping into the front of a long queue has costs proportional to the queue size
  • 54. Other FIFO alternatives? • LIFO – Avoids the reason to line up early – In situations where there is adequate capacity to serve everyone, can yield better waiting times for everyone involved