This document discusses recursion and its application to probabilistic dynamic programming and inventory control. Recursion is a problem solving technique that breaks down complex problems into simpler subproblems. The document provides an example of applying probabilistic dynamic programming to optimize the expected cost of finding parking. It explains how to model the problem as a series of states and use recursion to determine the optimal parking strategy. The document also discusses how probabilistic dynamic programming can be used to determine optimal order levels and order-up-to levels in inventory control problems.
Recursion in Mathematical Blog #2: Parking Spaces Example
Mathematical blog #2: Recursion
Introduction
Throughout history, mankind has continuously faced problems and tried to solve them using
knowledge, reasoning and experience. ‘Recursion’ is one way of approaching a given problem. It is
one of the central ideas of computer science, and therefore of optimization in a supply chain world
that keeps growing in complexity.
In a formal definition, recursion is the philosophy that underlies the analysis of many models,
particularly dynamic ones. The principle, introduced by Bellman under the name ‘dynamic
programming’ and further developed by R. Howard, is to decompose a complicated problem into a
series of smaller, simpler problems.
Within dynamic programming, a distinction is made between deterministic and probabilistic
programming. Probabilistic programming is highly effective when applied to multi-stage decisions,
in which, as the name implies, opportunities to revise previous decisions occur time and again,
eliminating the need for a single once-and-for-all decision. For this reason, probabilistic dynamic
programming has one or two possible applications in the theory of inventory control, for instance in
the determination of the levels s and S in an (s, S) order policy (min-max policy) and in the calculation
of optimum production levels.
The remainder of the blog deals with a real-world example of applying probabilistic dynamic
programming and with its application in inventory control.
Parking Spaces
In this problem (MacQueen and Miller, 1960), you are driving down a one-way road toward your
destination, looking for a parking space. As you drive along, you can observe only one parking space
at a time, the one right next to you, noting whether or not it is vacant. If it is vacant, you may either (1)
stop and park there or (2) drive on to the next space. If it is occupied (not vacant), then you must
drive on to the next space. You cannot return to vacant spaces that you have passed. Assume that
vacant spaces occur independently and that the probability that any given space is vacant is p, where
0 < p < 1. If you have not parked by the time you reach your destination, you must park in the pay
parking lot, at a cost of c. If you park x spaces away from your destination, then you consider that a
parking cost of x has been incurred. The closest space to your destination (before reaching the pay
parking lot) is one space away from it: x=1. For convenience, we interpret space x=0 as
existing and corresponding to having driven by the last available space, so the only option left is the
pay lot. Furthermore, c > 1, so there are spaces that are better than going straight to the pay
lot. Your objective is to minimize your total expected parking cost.
Let the state s = (x, i) be defined by the number x of the space that is being approached, and i, which
is the availability of the space: i=0 indicates that the space is vacant and i=1 indicates that the space is
occupied. For example, if s = (3,0) then you are still looking for a space, have approached the spot
that is three spaces away from your destination, and it is vacant. Let a=0 denote the decision to park
and let a=1 denote the decision to drive to the next space. If you decide to park (having observed
state (3,0)), you incur the cost of 3 and the process ends.
Both options are available if the space is vacant: A(x,0) = {0,1}, and you must drive to the next spot if
the space is occupied: A(x,1)={1}, for all x.
Now let f(s) = f(x, i) denote the minimum expected cost of parking, starting in state s (space x with
availability i). For state s = (1, i), for example, the expected cost can be written as follows:
$$
f(1, i) =
\begin{cases}
\min(1, c) = 1 & \text{if } i = 0 \\
c & \text{if } i = 1
\end{cases}
\tag{1.1}
$$
That is, if i=0, you have a choice of parking now (choosing a=0), resulting in a cost of 1, or driving
past (choosing a=1), which means that you must park in the pay lot resulting in a cost of c. Since, by
assumption, c > 1, you prefer to park if you can.
It is convenient to define another function, which facilitates the general analysis.
For x ≥ 0, let
$$
F(x) := p\,f(x, 0) + q\,f(x, 1), \tag{1.2}
$$
where q ≔ 1-p is the probability that a space is occupied. F(x) is the minimum expected cost of
parking, given that you are approaching space x (still looking to park) but have not yet observed its
availability. If the space is vacant, which occurs with probability p, the optimal cost will be f(x,0) and if
it is occupied (probability q), the cost will be f(x,1). At x=0 the only option left is the pay lot, so
f(0, i) = c for both values of i and therefore F(0) = c.
The expected costs for all states s = (x, i) can therefore be denoted as
$$
f(x, i) =
\begin{cases}
\min\bigl(x, F(x-1)\bigr) & \text{if } i = 0 \\
F(x-1) & \text{if } i = 1
\end{cases}
\tag{1.3}
$$
In words, suppose that you approach space x (still looking to park) and observe its availability. If
the space is vacant, then you can choose to park there and incur cost x, or drive on to the next
space and incur the optimal expected cost of doing so, namely F(x-1). If x ≤ F(x-1), then it is optimal
to park. Otherwise, it is optimal to drive on. If space x is occupied, you must drive to the next space
and face the optimal expected cost, F(x-1), of doing so.
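The recursion in (1.2) and (1.3) can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the names `make_parking_model`, `f` and `F` are hypothetical, and the base case F(0) = c is an assumption taken from the convention that at space x=0 only the pay lot is left.

```python
from functools import lru_cache


def make_parking_model(p: float, c: float):
    """Return f(x, i) and F(x) for vacancy probability p and pay-lot cost c."""
    q = 1.0 - p  # probability that a space is occupied

    @lru_cache(maxsize=None)
    def F(x: int) -> float:
        # Expected cost when approaching space x, availability not yet seen (1.2).
        if x == 0:
            return c  # assumption: at x = 0 only the pay lot is left
        return p * f(x, 0) + q * f(x, 1)

    def f(x: int, i: int) -> float:
        # Minimum expected cost in state (x, i) as in (1.3); i = 0 means vacant.
        drive_on = F(x - 1)
        return min(x, drive_on) if i == 0 else drive_on

    return f, F


f, F = make_parking_model(p=0.2, c=5.0)
print(F(1), F(2), F(3))
```

With p = 0.2 and c = 5 this gives, up to floating-point rounding, F(1) = 4.2, F(2) = 3.76 and F(3) = 3.608. The cache on `F` means each space is evaluated only once, which is the point of decomposing the problem into smaller subproblems.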
Using the reasoning above, we can interpret F(x-1) - x as the expected extra cost of driving on to the
next space, closer to our destination, instead of parking at space x. Let’s denote this as
$$
g(x) := F(x-1) - x, \tag{1.4}
$$
which states that g(x) is negative when it is beneficial to drive on to the next space. If g(x) is
positive (or zero), it is optimal to park at x when it is vacant.
It is also easy to see that we can write g(x+1) as
$$
g(x+1) = -1 + p\min\bigl(0, g(x)\bigr) + q\,g(x). \tag{1.5}
$$
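For readers who want to see the intermediate steps, the recursion for g(x+1) can be derived from (1.2), (1.3) and (1.4) as follows. The derivation only uses the definitions given in the text and the fact that p + q = 1.

```latex
\begin{align*}
F(x)   &= p\,f(x,0) + q\,f(x,1)
        = p\min\bigl(x, F(x-1)\bigr) + q\,F(x-1)
        && \text{by (1.2) and (1.3)} \\
g(x+1) &= F(x) - (x+1)
        && \text{by (1.4)} \\
       &= -1 + p\bigl[\min\bigl(x, F(x-1)\bigr) - x\bigr] + q\bigl[F(x-1) - x\bigr]
        && \text{since } p + q = 1 \\
       &= -1 + p\min\bigl(0, g(x)\bigr) + q\,g(x).
\end{align*}
```

A direct consequence is that g(x+1) - g(x) = -1 - p max(0, g(x)) < 0, so g(x) strictly decreases as x increases and changes sign only once.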
If we work backwards using (1.5) and the interpretation of g(x) in (1.4), it’s easy to
verify that we have to find a so-called cutoff point (let’s call this S) where g(S) is non-negative and
g(S+1) is negative. We can therefore evaluate (1.5) for successive values of x, starting from
g(1) = c - 1 (driving past space 1 leaves only the pay lot), and find our optimal cutoff point.
For example, suppose p = 0.2 and c = 5. Then g(1) = c - 1 = 4. Using (1.5), we compute g(x) for
increasing values of x until g(x) < 0, at which point the optimal policy has been found. Here,
g(2) = -1 + 0.2(0) + 0.8(4) = 2.2, g(3) = -1 + 0.2(0) + 0.8(2.2) = 0.76, and
g(4) = -1 + 0.2(0) + 0.8(0.76) = -0.392. This means that the optimal cutoff point is x=3: you drive
past every space that is more than three spaces away from your destination, and from x=3 onwards
you park in the first space that is vacant.
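The search for the cutoff point can be sketched as a short loop over recursion (1.5). This is a minimal illustration under the assumptions of the text (0 < p < 1, c > 1, and g(1) = c - 1); the function name `optimal_cutoff` is hypothetical.

```python
def optimal_cutoff(p: float, c: float) -> int:
    """Return the largest x with g(x) >= 0, iterating recursion (1.5)."""
    if not (0.0 < p < 1.0) or c <= 1.0:
        raise ValueError("expected 0 < p < 1 and c > 1")
    q = 1.0 - p  # probability that a space is occupied
    x = 1
    g = c - 1.0  # g(1): driving past space 1 leaves only the pay lot
    while True:
        g_next = -1.0 + p * min(0.0, g) + q * g  # g(x + 1) computed from g(x)
        if g_next < 0.0:
            return x  # g(x) >= 0 and g(x + 1) < 0: x is the cutoff point
        x, g = x + 1, g_next


print(optimal_cutoff(p=0.2, c=5.0))  # cutoff for the worked example: 3
```

The loop always terminates, because each step lowers g by at least 1. Raising the vacancy probability p tends to move the cutoff closer to the destination, since waiting for a nearer space becomes less risky.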
Application in inventory control
The parking example of probabilistic dynamic programming is already quite hard to follow without a
specific mathematical background. When we extend this problem to an infinite horizon and
multiple parameters (which is usually the case in inventory control), the solution becomes even more
complex.
Within inventory control, the most significant benefit of probabilistic dynamic programming is that it
finds an optimal order strategy. In theory, the optimal order strategy is an (s, S) policy. This means
that you have an order level s, and when your inventory position drops below this order level, you
order up to the ceiling S. Although the literature provides solid solutions to this problem (e.g.
Bartmann, 1992; Federgruen & Zheng, 1991), these solutions have some drawbacks. They are
hard to understand for inventory controllers without a specific mathematical background, and most
software packages either do not contain these algorithms or techniques or are not transparent about them.
One way to avoid the above drawbacks is to compute the order level s in a formal way (see for
example Axsäter, 2006) and to compute S as s plus the economic order quantity (EOQ). This
yields an approximate solution in which the deviation in total cost is kept to a minimum.
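The approximation S = s + EOQ can be sketched as follows. This is an illustration only: the text does not specify how the EOQ is computed, so the classic square-root formula sqrt(2DK/h) is assumed here, the order level s is taken as a given input, and all function names and numbers are hypothetical.

```python
import math


def eoq(annual_demand: float, order_cost: float, holding_cost: float) -> float:
    """Classic economic order quantity sqrt(2 * D * K / h) (assumed formula)."""
    return math.sqrt(2.0 * annual_demand * order_cost / holding_cost)


def order_up_to_level(s: float, annual_demand: float,
                      order_cost: float, holding_cost: float) -> float:
    """Approximate the order-up-to level S as the order level s plus the EOQ."""
    return s + eoq(annual_demand, order_cost, holding_cost)


def order_quantity(inventory_position: float, s: float, S: float) -> float:
    """(s, S) rule: order up to S once the inventory position drops below s."""
    return S - inventory_position if inventory_position < s else 0.0


# Hypothetical item: demand 1200 units/year, 50 per order, holding cost 3 per unit/year.
S = order_up_to_level(s=80.0, annual_demand=1200.0, order_cost=50.0, holding_cost=3.0)
print(S, order_quantity(60.0, s=80.0, S=S))
```

For these illustrative numbers the EOQ is 200, so S = 280, and an inventory position of 60 triggers an order of 220 units.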
Next blog
The next blog will be about probability theory: distributions and numerical integration. The
applications in inventory control to be discussed in this blog will be mainly determining order levels.
Further questions or discussions on recursion can be addressed to s.pauly@slimstock.com.