The Beer Game slides


  1. Artificial Agents Play the Beer Game: Eliminate the Bullwhip Effect and Whip the MBAs
     Steven O. Kimbrough, D.-J. Wu, Fang Zhong
     FMEC, Philadelphia, June 2000; file: beergameslides.ppt
  2. The MIT Beer Game
     - Players: Retailer, Wholesaler, Distributor, and Manufacturer.
     - Goal: minimize system-wide (chain) long-run average cost.
     - Information sharing: mail.
     - Demand: deterministic.
     - Costs:
       - Holding cost: $1.00/case/week.
       - Penalty cost: $2.00/case/week.
     - Lead time: 2 weeks physical delay.
  3. Timing (each week)
     1. New shipments are delivered.
     2. Orders arrive.
     3. Fill orders plus backlog.
     4. Decide how much to order.
     5. Calculate inventory costs.
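Slides 2 and 3 together specify one week of play for each role. The following is a minimal Python sketch of that weekly step under those rules; the class and method names are hypothetical, the starting inventory is illustrative, and the ordering policy is left abstract.

```python
HOLDING_COST = 1.0   # $/case/week (slide 2)
PENALTY_COST = 2.0   # $/case/week (slide 2)

class Stage:
    """One supply-chain role (Retailer, Wholesaler, Distributor, or Manufacturer)."""

    def __init__(self, policy):
        self.inventory = 12        # cases on hand (illustrative starting stock)
        self.backlog = 0           # unfilled cases owed downstream
        self.policy = policy       # function: incoming order -> order quantity

    def step_week(self, shipment_in, order_in):
        # 1. New shipment is delivered.
        self.inventory += shipment_in
        # 2. The downstream order arrives; 3. fill it plus any backlog.
        demand = order_in + self.backlog
        shipped = min(self.inventory, demand)
        self.inventory -= shipped
        self.backlog = demand - shipped
        # 4. Decide how much to order upstream.
        order_out = self.policy(order_in)
        # 5. Calculate this week's inventory cost.
        cost = HOLDING_COST * self.inventory + PENALTY_COST * self.backlog
        return shipped, order_out, cost

# Example: the "1-1" pass-through policy discussed on slide 8.
retailer = Stage(policy=lambda order: order)
shipped, order_out, cost = retailer.step_week(shipment_in=4, order_in=4)
```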
  4. Game Board
     - …
  5. The Bullwhip Effect
     - Order variability is amplified upstream in the supply chain.
     - Industry examples (P&G, HP).
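The slide does not fix a metric, but a common way to quantify this amplification is the ratio of order variance to incoming-demand variance at a stage; a ratio well above 1 signals the bullwhip effect. A small sketch under that assumption, with purely illustrative numbers:

```python
from statistics import pvariance

def bullwhip_ratio(orders_placed, demand_received):
    """Variance amplification of orders relative to incoming demand (a common bullwhip measure)."""
    return pvariance(orders_placed) / pvariance(demand_received)

# Illustrative only: a retailer that over-reacts to modest demand swings.
customer_demand = [4, 4, 8, 8, 4, 4, 8, 8]
retailer_orders = [4, 2, 14, 10, 0, 2, 14, 10]
print(bullwhip_ratio(retailer_orders, customer_demand))  # 7.0 here; > 1 means amplification
```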
  6. Observed Bullwhip effect from undergraduates playing the game
  7. Bullwhip Effect Example (P&G)
     - Lee et al., 1997, Sloan Management Review.
  8. Analytic Results: Deterministic Demand
     - Assumptions:
       - Fixed lead time.
       - Players work as a team.
       - Manufacturer has unlimited capacity.
     - The "1-1" policy is optimal: order whatever amount is ordered from your customer.
  9. Analytic Results: Stochastic Demand (Chen, 1999, Management Science)
     - Additional assumptions:
       - Only the Retailer incurs penalty cost.
       - Demand distribution is common knowledge.
       - Fixed information lead time.
       - Decreasing holding costs upstream in the chain.
     - Order-up-to (installation base-stock) policy is optimal.
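For reference, an order-up-to (base-stock) policy raises the inventory position, here taken in the standard sense of on-hand plus on-order minus backlog, back to a fixed target each week. A minimal sketch; the target and state values are illustrative, not from the slides:

```python
def order_up_to(base_stock_level, on_hand, on_order, backlog):
    """Order enough to restore the inventory position to the base-stock level."""
    inventory_position = on_hand + on_order - backlog
    return max(0, base_stock_level - inventory_position)

# Illustrative: target of 20 cases, 6 on hand, 8 already in the pipeline, 2 backlogged.
print(order_up_to(base_stock_level=20, on_hand=6, on_order=8, backlog=2))  # -> 8
```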
  10. Agent-Based Approach
     - Agents work as a team.
     - No agent has knowledge of the demand distribution.
     - No information sharing among agents.
     - Agents learn via genetic algorithms.
     - Fixed or stochastic lead time.
  11. Research Questions
     - Can the agents track the demand?
     - Can the agents eliminate the Bullwhip effect?
     - Can the agents discover the optimal policies if they exist?
     - Can the agents discover reasonably good policies under complex scenarios where analytical solutions are not available?
  12. Flowchart
  13. Agents' Coding Strategy
     - Bit-string representation with fixed length n.
     - The leftmost bit represents the sign, "+" or "-".
     - The remaining bits represent how much to order.
     - Rule "x+1" means "if demand is x, then order x+1".
     - The rule search space is 2^(n-1) - 1.
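The following is a minimal sketch of how such a rule could be decoded and evolved. Only the sign-plus-magnitude encoding follows the slide; the function names, the clipping of orders at zero, the selection/crossover/mutation details, and the fitness function are all hypothetical stand-ins for what the paper actually used.

```python
import random

def decode(bits):
    """Decode a bit string: leftmost bit is the sign, the rest is the order offset y.
    The rule then reads: if demand is x, order x + y (clipped at zero)."""
    sign = -1 if bits[0] == "1" else 1
    return sign * int(bits[1:], 2)

def make_rule(bits):
    offset = decode(bits)
    return lambda demand: max(0, demand + offset)

def evolve(population, fitness, mutation_rate=0.05):
    """One illustrative genetic-algorithm generation:
    keep the fitter half, recombine with one-point crossover, then mutate bits."""
    ranked = sorted(population, key=fitness, reverse=True)
    parents = ranked[: len(ranked) // 2]
    children = []
    while len(children) < len(population):
        a, b = random.sample(parents, 2)
        cut = random.randrange(1, len(a))
        child = a[:cut] + b[cut:]
        child = "".join(bit if random.random() > mutation_rate else random.choice("01")
                        for bit in child)
        children.append(child)
    return children

# Example rule with n = 6 bits (1 sign bit + 5 magnitude bits).
rule = make_rule("000011")   # offset +3: "if demand is x, order x + 3"
print(rule(4))               # -> 7
```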
  14. Experiment 1a: First Cup
     - Environment:
       - Deterministic demand with fixed lead time.
       - Fix the policy of the Wholesaler, Distributor, and Manufacturer to be "1-1".
       - Only the Retailer agent learns.
     - Result: the Retailer agent finds "1-1".
  15. Experiment 1b
     - All four agents learn under the environment of experiment 1a.
     - Über rule for the team.
     - All four agents find "1-1".
  16. Result of Experiment 1b
     - All four agents can find the optimal "1-1" policy.
  17. Artificial Agents Whip the MBAs and Undergraduates in Playing the MIT Beer Game
  18. Stability (Experiment 1b)
     - Fix any three agents to "1-1" and allow the fourth agent to learn.
     - The fourth agent minimizes its own long-run average cost rather than the team cost.
     - No agent has any incentive to deviate once the others are playing "1-1".
     - Therefore "1-1" is apparently a Nash equilibrium.
  19. Experiment 2: Second Cup
     - Environment:
       - Demand uniformly distributed on [0, 15].
       - Fixed lead time.
       - All four agents make their own decisions, as in experiment 1b.
     - Agents eliminate the Bullwhip effect.
     - Agents find better policies than "1-1".
  20. Artificial agents eliminate the Bullwhip effect.
  21. Artificial agents discover a better policy than "1-1" when facing stochastic demand with penalty costs for all players.
  22. Experiment 3: Third Cup
     - Environment:
       - Lead time uniformly distributed on [0, 4].
       - The rest as in experiment 2.
     - Agents find better policies than "1-1".
     - No Bullwhip effect.
     - The policies discovered by the agents are Nash equilibria.
  23. Artificial agents discover better, stable policies than "1-1" when facing stochastic demand and stochastic lead time.
  24. Artificial agents are able to eliminate the Bullwhip effect when facing stochastic demand with stochastic lead time.
  25. Agents learning
  26. The Columbia Beer Game
     - Environment:
       - Information lead times: (2, 2, 2, 0).
       - Physical lead times: (2, 2, 2, 3).
       - Initial conditions set as in Chen (1999).
     - Agents find the optimal policy: order whatever is ordered, with a time shift, i.e.,
       Q_1(t) = D(t-1), Q_i(t) = Q_{i-1}(t - l_{i-1}).
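This discovered policy is the "1-1" pass-through with each stage's order lagged by the downstream information lead time. A small sketch of that time shift; the lead times come from the slide, while the function names and demand stream are purely illustrative:

```python
# Information lead times from slide 26: (Retailer, Wholesaler, Distributor, Manufacturer).
INFO_LEAD = {"retailer": 2, "wholesaler": 2, "distributor": 2}

def shifted(pass_through_history, t, lag):
    """Generic time-shifted pass-through: at week t, order whatever was ordered lag weeks earlier."""
    return pass_through_history[t - lag]

# Illustrative customer demand, weeks 0..6.
demand = [4, 4, 6, 8, 8, 6, 4]
# Q_1(t) = D(t-1): the retailer lags customer demand by one week.
retailer_orders = [0] + demand[:-1]
# Q_2(t) = Q_1(t - l_1) with l_1 = 2: the wholesaler lags the retailer's orders by its information lead time.
wholesaler_orders = [shifted(retailer_orders, t, INFO_LEAD["retailer"]) if t >= 2 else 0
                     for t in range(len(demand))]
print(retailer_orders)    # [0, 4, 4, 6, 8, 8, 6]
print(wholesaler_orders)  # [0, 0, 0, 4, 4, 6, 8]  -- same variability as demand, just shifted
```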
  27. Ongoing Research: More Beer
     - Value of information sharing.
     - Coordination and cooperation.
     - Bargaining and negotiation.
     - Alternative learning mechanisms: classifier systems.
  28. Summary
     - Agents are capable of playing the Beer Game:
       - Track demand.
       - Eliminate the Bullwhip effect.
       - Discover the optimal policies where they exist.
       - Discover good policies under complex scenarios where analytical solutions are not available.
     - Intelligent and agile supply chains.
     - Multi-agent enterprise modeling.
  29. A framework for multi-agent intelligent enterprise modeling
