Max-Plus Theory and Applications
Jeremy Rolph
August 10, 2015
Abstract
In this thesis we consider the ‘max-plus’ algebra; that is, the set Rmax = R ∪ {−∞}
endowed with the operations a ⊕ b = max{a, b} and a ⊗ b = a + b. It is shown
that Rmax has the structure of a semiring with several additional useful properties.
We introduce the idea of matrices over the max-plus semiring and develop max-plus
variants of several familiar concepts from classical linear algebra; most notably the
theory of eigenvalues and eigenvectors. In Chapter 2 we introduce the theory of
event graphs which are used to model dynamical systems which admit a degree of
synchronisation such as rail networks or automated manufacturing processes. We use
the theory of max-plus algebra developed in Chapter 1 to derive results concerning
the time evolution of such systems and also consider their long-term behaviour.
Finally, in Chapter 3 we consider event graphs in which the timed elements form
sequences of random variables. We look for steady state distributions and conditions
for their existence, and attempt to characterise the asymptotic behaviour of the event
timings concerned. We conclude by exploring how we can represent certain types
of queuing systems by stochastic event graphs and present a key theorem regarding
the stability of their waiting times.
Contents
Abstract
Table of Contents
0 Introduction
1 Max-Plus Algebra
1.1 The Max-Plus Semiring
1.1.1 Basic Definitions and Properties
1.1.2 Other Algebraic Definitions
1.2 Vectors and Matrices over Rmax
1.2.1 Definitions and Structure
1.2.2 Matrix Inversion
1.2.3 Determinants
1.3 Graph-theoretic Interpretations in Max-Plus
1.4 Spectral Theory
1.4.1 Eigenvalues and Eigenvectors
1.4.2 The Eigenspace
1.4.3 A Worked Example
1.5 Recurrence Relations & Periodicity
1.5.1 Solving Max-Plus Recurrence Relations
1.5.2 Limiting Behaviour
2 Petri Nets and Timed Event Graphs
2.1 A Motivating Example
2.2 Preliminaries of Event Graph Theory
2.2.1 Definitions and Set-up
2.2.2 A Simple Example
2.3 The Basic Autonomous Equation
2.3.1 Derivation
2.3.2 Extensions to the Initial Condition
2.3.3 Solving the Basic Autonomous Equation
2.3.4 Behaviour of the Solution
2.4 A Simple Example Revisited
2.4.1 General Solution
2.4.2 An Optimal Timetable
2.5 The Non-autonomous Case
3 Stochastic Event Systems Over Max-plus
3.1 Introduction & Stochastic Background
3.2 Statistical Assumptions
3.3 Asymptotic Firing Rates
3.3.1 The Strongly Connected Case
3.3.2 The General Case
3.4 Queuing Systems and Timed Event Graphs
3.4.1 Introduction
3.4.2 Example: The G/G/1 Queue
3.4.3 Stability Analysis of Waiting Times
Bibliography
Chapter 0
Introduction
Exotic semirings such as (R ∪ {−∞}, max, +) and (R ∪ {+∞}, min, +) have been studied at
length since the 1950s, beginning primarily in the area of operational research. Nowadays the
term ‘tropical mathematics’ is often used to describe their study, though this term originally
referred to one particular discrete version of the max-plus algebra introduced by I. Simon
in 1988 [15]. Their applications span a wide range of fields including optimisation & control,
mathematical physics, algebraic geometry, dynamic programming and mathematical biology [10,
15]. In particular, the study of such algebras in relation to discrete event system theory (both
deterministic and stochastic), graph theory, Markov decision processes, asymptotic analysis and
language theory has led to some significant progress in these areas over the last 30 years [8].
Many of the concepts developed in conventional linear algebra have been ‘translated’ into the
world of max-plus, including solutions to linear and non-linear systems (both analytical and
numerical), linear dependence and independence, determinants, eigenvalues and eigenvectors
[9]. In 1979 Cuninghame-Green authored the first comprehensive unified account of these results
entitled “Minimax Algebra” [7], building on many papers published over the preceding 20 years
from various disciplines within mathematics, economics and computer science. As recently
as 2006, Heidergott, Olsder and Woude published what they consider the first ‘textbook’ in
the area of max-plus algebra [13], and many of the ideas explored below can be found in this
publication.
In the first chapter of this thesis, we aim to give an overview of max-plus linear algebra and
to build the necessary groundwork required for the applications discussed in the chapters that
follow. In particular, we present two celebrated theorems in the area of max-plus theory. The
first, which can be found in [7], concerns spectral theory and says that under mild conditions,
a matrix over the max-plus algebra has a unique eigenvalue with a simple graph-theoretic
interpretation. The second, originally proved by M. Viot in 1983 [2, 6], relates to the asymptotic
behaviour of sequential powers of max-plus matrices, which turns out to be essentially periodic
and has great implications for the material explored in Chapters 2 & 3.
In Chapter 2 we introduce the concept of timed Petri nets & event graphs. For a thorough
discussion on the scope of their application readers are referred to [18]; in this thesis we
focus solely on their use in the modelling of the time behaviour of a class of dynamic systems
known as ‘discrete event dynamic systems’. In simple terms, these are systems in which a finite
number of resources (e.g. processors or machines) are shared by several users (e.g. packets or
manufactured objects) which all contribute to the achievement of some common goal (e.g. a
parallel computation or the assembly of a product) [2]. We will see that under certain conditions
these systems, while highly non-linear in the conventional sense, can be ‘linearised’ by using
the max-plus algebra. This observation, first made in [5], is of vital importance and constitutes
one of the main reasons for the continued study of max-plus algebra today. The main content
of Chapter 2 concerns the ‘basic autonomous equation’ which governs the time evolution of
discrete event systems, and the steps towards its solution. We are then able to apply some ideas
from Chapter 1 to explore the long-term behaviour of such systems.
Chapter 3 concerns stochastic event graphs, which can be thought of as a natural extension
to the concepts introduced in Chapter 2. As the name suggests, we now assume a degree of
randomness in the event timings of the systems we are trying to model. Amongst other things,
stochastic event graphs can be used to model many types of queuing systems [3], the simplest
of which is the G/G/1 queue. We introduce several key ‘first order’ theorems which
establish the nature of stationary regimes in terms of the inverse throughput, and explore the
conditions under which such regimes are reached. We end by presenting a ‘second order’ theorem
concerning the stability of inter-event timings (for example, waiting times) in the context of
queuing systems.
Chapter 1
Max-Plus Algebra
1.1 The Max-Plus Semiring
1.1.1 Basic Definitions and Properties
In this thesis we work exclusively with the max-plus algebra (Rmax, ⊕, ⊗), where Rmax = R ∪
{−∞}, and for a, b ∈ Rmax:
a ⊕ b := max{a, b}
a ⊗ b := a + b
We begin by examining its algebraic structure, and we will then move on to vectors and matrices
over Rmax. We start by defining the term semiring.
Definition 1.1. A semiring is a triple (R, +, ×) where R is a non-empty set and +, × are
binary operations on R (referred to as addition and multiplication respectively) such that
(i) (R, +) is commutative and associative, with zero element εR:
(a) a + b = b + a
(b) (a + b) + c = a + (b + c)
(c) εR + a = a + εR = a
(ii) (R, ×) is associative, with unit element eR:
(a) (a × b) × c = a × (b × c)
(b) eR × a = a × eR = a
(iii) Multiplication distributes over addition:
(a) a × (b + c) = (a × b) + (a × c)
(b) (a + b) × c = (a × c) + (b × c)
(iv) Multiplication by εR annihilates R:
(a) εR × a = a × εR = εR
Note that the final axiom is not required in the definition of a standard ring since it follows
from the others, but it is needed here.
As the title of this section suggests, the max-plus algebra is a semiring with additive identity
ε := −∞ and multiplicative identity e := 0. It is straightforward to verify that all the axioms
of Definition 1.1 hold in the case of (Rmax, ⊕, ⊗). For example, the first distributive law holds
since
a ⊗ (b ⊕ c) = a + max{b, c}
= max{a + b, a + c}
= (a ⊗ b) ⊕ (a ⊗ c)
and the others follow similarly. For the sake of simplicity we will write Rmax for (Rmax, ⊕, ⊗)
when the context is clear.
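To make the operations concrete, the following short Python sketch (our own illustration, not part of the original text) models Rmax with float('-inf') playing the role of ε and checks the distributive law numerically:

```python
# A minimal sketch of the max-plus operations on R_max = R ∪ {-inf},
# using float('-inf') as the additive identity ε and 0.0 as e.
EPS = float('-inf')   # ε
E = 0.0               # e

def oplus(a, b):
    """Max-plus addition: a ⊕ b = max{a, b}."""
    return max(a, b)

def otimes(a, b):
    """Max-plus multiplication: a ⊗ b = a + b (note ε ⊗ a = ε automatically)."""
    return a + b

# The first distributive law, a ⊗ (b ⊕ c) = (a ⊗ b) ⊕ (a ⊗ c):
a, b, c = 3.0, 5.0, -2.0
assert otimes(a, oplus(b, c)) == oplus(otimes(a, b), otimes(a, c))
```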
Below we list three additional algebraic properties of Rmax which do not form part of the
definition of a semiring:
(i) Commutativity of ⊗:
∀a, b ∈ Rmax : a ⊗ b = b ⊗ a
(ii) Existence of multiplicative inverses:
∀a ∈ Rmax \ {ε} ∃ b ∈ Rmax such that a ⊗ b = e
(iii) Idempotency of ⊕:
∀a ∈ Rmax : a ⊕ a = a
The first two properties follow directly from the fact that (R, +) forms an abelian group, and
the third property is easily proved: a ⊕ a = max{a, a} = a. Properties (i) and (ii) mean that
we could refer to (Rmax, ⊕, ⊗) as a semifield (i.e. a field without additive inverses), though
this term can be ambiguous and is seldom used in mathematical literature. Note also that in
general, any semiring in which addition is idempotent is called an idempotent semiring. The term
dioid (originating from the phrase double monoid) was introduced by Baccelli et al. in 1992 to
mean idempotent semiring [2], but we do not use this word here.
The crucial difference between a semiring and a ring in general is that an element of the former
need not have an additive inverse. Note that this does not say that additive inverses can never
exist - there may be a non-empty subset of R containing elements which do have additive
inverses (which could be thought of as the additive analogue to the set of units in a standard
ring). However, the following lemma immediately tells us that no elements of Rmax (apart from
ε) have additive inverses.
Lemma 1.2. Let (R, +, ×) be a semiring. If + is idempotent then additive inverses do not
exist.
Proof. Suppose that a ∈ R with a ≠ εR has an additive inverse b. Then
a + b = εR
Adding a to both sides of the equation yields
a + a + b = a + εR
By idempotency of +, the left-hand side is equal to a + b, whereas the right-hand side is equal
to a. Hence we have
a + b = a
which contradicts a + b = εR. Thus a does not have an additive inverse.
1.1.2 Other Algebraic Definitions
For a ∈ Rmax, n ∈ N, define

a⊗n := a ⊗ a ⊗ · · · ⊗ a   (n times)

Thus exponentiation in max-plus is equivalent to conventional multiplication: a⊗n = n × a. Some
of the laws of exponentiation are therefore different to what we are used to. For a, b ∈ Rmax,
m, n ∈ N:
(i) a⊗m ⊗ a⊗n = ma + na = (m + n)a = a⊗(m⊗n)
(ii) (a⊗m)⊗n = (ma)⊗n = nma = a⊗(m×n)
(iii) a⊗1 = 1a = a
(iv) a⊗m ⊗ b⊗m = ma + mb = m(a + b) = (a ⊗ b)⊗m
and we also adopt the natural conventions a⊗ε := ε and a⊗e := e. For negative exponents we
can take

a⊗(−n) := (a⊗n)⊗(−1)

where the outer exponent on the right-hand side denotes the max-plus multiplicative inverse,
which was shown to exist in the previous section. Finally, we can extend the concept of
exponentiation in Rmax to non-integer exponents using conventional notation in the following
way:

a⊗(n/m) := (n/m) × a

which is well-defined, assuming m ≠ 0.
Next, we can equip the max-plus algebra with a natural order relation as follows:
Definition 1.3. For a, b ∈ Rmax, we say a ≤ b if a ⊕ b = b.
It is easily verified that the max-plus operations ⊕ and ⊗ preserve this order, i.e. ∀a, b, c ∈ Rmax,
a ≤ b ⇒ a ⊕ c ≤ b ⊕ c and a ⊗ c ≤ b ⊗ c.
Finally, infinite sums in max-plus are defined by ⊕i∈I xi := sup{xi : i ∈ I} for any possibly
infinite (even uncountable) family {xi}i∈I of elements of Rmax, when the supremum exists. In
general, we say that an idempotent semiring is complete if any such family has a supremum,
and if the product distributes over infinite sums. The max-plus semiring Rmax is not complete
(a complete idempotent semiring must have a maximal element), but it can be embedded in
the complete semiring (R̄max, ⊕, ⊗), where R̄max := Rmax ∪ {+∞}.
1.2 Vectors and Matrices over Rmax
1.2.1 Definitions and Structure
Let n, m ∈ N. We denote the set of n × m matrices over Rmax by Rn×m max. For i ∈ {1, . . . , n},
j ∈ {1, . . . , m}, the element of a matrix A ∈ Rn×m max in row i and column j is denoted by [A]ij,
or simply aij for notational convenience. Thus A ∈ Rn×m max can be written as

[ a11 a12 · · · a1m ]
[ a21 a22 · · · a2m ]
[  :   :         :  ]
[ an1 an2 · · · anm ]

where a11, . . . , anm ∈ Rmax. In a similar vein, the elements of Rn max := Rn×1 max are called max-plus
vectors, and we write the i-th element of a vector x ∈ Rn max as [x]i, or simply xi.
Typical concepts and operations from conventional algebra are defined for max-plus matrices
in the usual way (replacing + and × with ⊕ and ⊗ respectively), as outlined in the following
definitions.
Definition 1.4. The n × n max-plus identity matrix, denoted En, is defined by

[En]ij = 0 if i = j, and ε if i ≠ j.

We will write E := En whenever the context is clear.
Definitions 1.5. (i) For A, B ∈ Rn×m max, their sum A ⊕ B is defined by

[A ⊕ B]ij = aij ⊕ bij = max{aij, bij}

(ii) For A ∈ Rn×k max and B ∈ Rk×m max, their product A ⊗ B is defined by

[A ⊗ B]il = ⊕j=1..k (aij ⊗ bjl) = max j=1,...,k (aij + bjl)

(iii) The transpose of a matrix A ∈ Rn×m max is denoted by Aᵀ and is defined as usual by

[Aᵀ]ij = [A]ji

(iv) For A ∈ Rn×n max and k ∈ N, the k-th power of A, denoted A⊗k, is defined by

A⊗k = A ⊗ A ⊗ · · · ⊗ A   (k times)

For k = 0, A⊗0 := En.

(v) For A ∈ Rn×m max and α ∈ Rmax, α ⊗ A is defined by

[α ⊗ A]ij = α ⊗ [A]ij
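As an illustration (our own sketch, not from the thesis), the matrix sum, product and powers defined above can be computed directly in Python, reusing EPS from the earlier snippet:

```python
# A minimal sketch of max-plus matrix arithmetic over R_max,
# with matrices stored as lists of lists and ε = float('-inf').
def mp_add(A, B):
    """Matrix sum: [A ⊕ B]ij = max{aij, bij}."""
    return [[max(a, b) for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mp_mul(A, B):
    """Matrix product: [A ⊗ B]il = max_j (aij + bjl)."""
    n, k, m = len(A), len(B), len(B[0])
    return [[max(A[i][j] + B[j][l] for j in range(k)) for l in range(m)]
            for i in range(n)]

def mp_eye(n):
    """Max-plus identity matrix En."""
    return [[0.0 if i == j else EPS for j in range(n)] for i in range(n)]

def mp_pow(A, k):
    """k-th max-plus power of A, with A⊗0 = En."""
    P = mp_eye(len(A))
    for _ in range(k):
        P = mp_mul(P, A)
    return P
```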
We now look at a crucial result concerning the algebraic structure of square matrices over Rmax.
Proposition 1.6. (Rn×n max, ⊕, ⊗) is an idempotent semiring with multiplicative identity En.

Proof. The axioms of Definition 1.1 all follow from the semiring structure of Rmax, and are
readily verified. For example, for A, B, C ∈ Rn×n max we have that

[A ⊗ (B ⊕ C)]il = ⊕j=1..n (aij ⊗ (bjl ⊕ cjl))
               = ⊕j=1..n ((aij ⊗ bjl) ⊕ (aij ⊗ cjl))
               = (⊕j=1..n (aij ⊗ bjl)) ⊕ (⊕j=1..n (aij ⊗ cjl))
               = [(A ⊗ B) ⊕ (A ⊗ C)]il
and so A ⊗ (B ⊕ C) = (A ⊗ B) ⊕ (A ⊗ C). The other axioms follow similarly.
Note that since addition in (Rn×n
max , ⊕, ⊗) is idempotent, we can apply Lemma 1.2 once again to
see that no element of Rn×n
max has an additive inverse. However, unlike in Rmax, multiplication
of matrices over Rmax is not commutative. For example

[ 1  e ]   [ 2 −1 ]   [ 3  e ]     [ 3  2 ]   [ 2 −1 ]   [ 1  e ]
[ ε −2 ] ⊗ [ 3  ε ] = [ 1  ε ]  ≠  [ 4  3 ] = [ 3  ε ] ⊗ [ ε −2 ]
Also unlike Rmax, matrices over Rmax do not necessarily have multiplicative inverses (i.e. they
are not necessarily invertible). We explore this in the next section.
1.2.2 Matrix Inversion
Definition 1.7. Let A, B ∈ Rn×n max. B is a right inverse of A if A ⊗ B = E, and B is a left
inverse of A if B ⊗ A = E.
Definition 1.8. A max-plus permutation matrix is a matrix A ∈ Rn×n max with each row and
each column containing exactly one entry equal to e, with all other entries equal to ε. If σ :
{1, . . . , n} → {1, . . . , n} is a permutation, the max-plus permutation matrix Pσ is defined by

[Pσ]ij := e if i = σ(j), and ε if i ≠ σ(j).
As the name suggests, left multiplication by Pσ permutes the rows of a matrix: the i-th row of
a matrix A ∈ Rn×n max will appear as the σ(i)-th row of Pσ ⊗ A. For example, if n = 2 and σ is
defined by σ(1) = 2, σ(2) = 1:

[ ε  e ]   [ 1  2 ]   [ 3  4 ]
[ e  ε ] ⊗ [ 3  4 ] = [ 1  2 ]
Similarly, it is straightforward to see that right multiplication by Pσ permutes the columns of
a matrix.
Definition 1.9. A matrix A ∈ Rn×n max is diagonal if [A]ij = ε for all i ≠ j. If a1, . . . , an ∈
Rmax \ {ε}, the diagonal matrix D(a1, . . . , an) is defined by

[D(a1, . . . , an)]ij := ai if i = j, and ε if i ≠ j.
Combining these two definitions, if σ is a permutation and a1, . . . , an ∈ Rmax \ {ε}, Pσ ⊗
D(a1, . . . , an) gives a matrix in which each row and each column contains exactly one finite
entry. This class of matrices (sometimes referred to as generalised permutation matrices) in
max-plus turns out to be of some significance, as the theorem below shows.

Theorem 1.10. A matrix A ∈ Rn×n max has a right inverse if and only if A = Pσ ⊗ D(a1, . . . , an)
for some permutation σ and a1, . . . , an ∈ Rmax \ {ε}.

Proof. Suppose A = Pσ ⊗ D(a1, . . . , an) for some permutation σ and a1, . . . , an ∈ Rmax \ {ε}.
Recalling from Section 1.1.1 that multiplicative inverses exist in Rmax, define B ∈ Rn×n max by

[B]ij = [A]⊗−1 ji if [A]ji ≠ ε, and ε otherwise.

Then for i, j = 1, . . . , n we have that

[A ⊗ B]ij = max k=1,...,n (aik ⊗ bkj) = e if j = i, and ε if j ≠ i,

since, if j ≠ i, at least one of aik, bkj is equal to ε for each k = 1, . . . , n (because A only has one
finite element per column and row). Thus A ⊗ B = E, and B is a right inverse of A.
Conversely, suppose A has right inverse B ∈ Rn×n max. For i, j = 1, . . . , n we have

⊕k=1..n ([A]ik ⊗ [B]kj) = [E]ij

and therefore for each i = 1, . . . , n there is a (least) index c(i) (1 ≤ c(i) ≤ n) such that [A]ic(i)
and [B]c(i)i are both finite, since [E]ii = e. Moreover we cannot have [A]hc(i) finite with h ≠ i,
since then

[A ⊗ B]hi ≥ [A]hc(i) ⊗ [B]c(i)i > ε = [E]hi

which contradicts our assumption that B is a right inverse of A. It follows that the mapping i →
c(i) is a bijection, i.e. each column of A is labelled c(i) for some i and contains exactly one finite
element, and each row of A contains exactly one finite element. That is, A = Pσ ⊗ D(a1, . . . , an)
for some permutation σ and a1, . . . , an ∈ Rmax \ {ε}.
Theorem 1.11. For A, B ∈ Rn×n max, A ⊗ B = E if and only if B ⊗ A = E (i.e. right and left
inverses are equivalent), and A uniquely determines B.

Proof. Suppose that A has right inverse BR ∈ Rn×n max. Then by Theorem 1.10, we know that
A = Pσ ⊗ D(a1, . . . , an) for some permutation σ and a1, . . . , an ∈ Rmax \ {ε}. Now, as before,
define BL ∈ Rn×n max by

[BL]ij = [A]⊗−1 ji if [A]ji ≠ ε, and ε otherwise

and using the same reasoning as before we observe that BL is a left inverse of A. Finally, note
that

BR = E ⊗ BR = (BL ⊗ A) ⊗ BR = BL ⊗ (A ⊗ BR) = BL ⊗ E = BL

showing that BR is uniquely determined, and is also a left inverse.
Theorem 1.11 tells us that we do not need to make a distinction between right and left inverses,
as we did in Definition 1.7. Before moving on we show one last result which says that the
product of two invertible matrices is also invertible.

Proposition 1.12. If A, B ∈ Rn×n max are invertible then A ⊗ B is also invertible.

Proof. This proof uses some simple results regarding diagonal and permutation matrices in
conventional algebra, whose analogues are easily proved in max-plus. To start, recall that for a
permutation matrix Pσ, its max-plus inverse is Pσ−1. Thus if D(a1, . . . , an) is a diagonal matrix:

D(a1, . . . , an) ⊗ Pσ = (Pσ ⊗ Pσ−1) ⊗ D(a1, . . . , an) ⊗ Pσ
                     = Pσ ⊗ (Pσ−1 ⊗ D(a1, . . . , an) ⊗ Pσ)
                     = Pσ ⊗ D(aσ(1), . . . , aσ(n))

Now from Theorem 1.10 we can write A = PσA ⊗ D(a1, . . . , an), B = PσB ⊗ D(b1, . . . , bn). Then
using the above

A ⊗ B = PσA ⊗ D(a1, . . . , an) ⊗ PσB ⊗ D(b1, . . . , bn)
      = PσA ⊗ PσB ⊗ D(aσB(1), . . . , aσB(n)) ⊗ D(b1, . . . , bn)
      = PσA ⊗ PσB ⊗ D(aσB(1) ⊗ b1, . . . , aσB(n) ⊗ bn)

and since PσA ⊗ PσB is itself a permutation matrix, A ⊗ B is invertible by Theorem 1.10.
1.2.3 Determinants
Recall that in conventional algebra, the determinant of a matrix A ∈ Rn×n is defined as

det(A) = Σσ∈Sn sgn(σ) ∏i=1..n aiσ(i)

where Sn is the symmetric group on n elements (so an element of Sn is a permutation σ :
{1, . . . , n} → {1, . . . , n}), and the sign of a permutation σ ∈ Sn, denoted sgn(σ), is defined by

sgn(σ) = 1 if σ is even, and −1 if σ is odd.
Unfortunately this definition cannot be immediately translated into max-plus (i.e. by replacing
+ and × with ⊕ and ⊗ respectively) because the use of the sign function requires that we have
additive inverses. Instead, two related concepts are introduced below which offer alternatives
to the notion of the determinant in the case of the max-plus algebra.
Definition 1.13. Let A ∈ Rn×n max. The permanent of A, denoted perm(A), is defined as

perm(A) = ⊕σ∈Sn ⊗i=1..n aiσ(i)
Note that, crudely put, the permanent is the max-plus analogue of the determinant with the
minuses simply removed. We can understand the formula to give the maximal sum of the
diagonal values for all permutations of the columns of A. The permanent has been studied at
length both in the case of conventional algebra (see [17]) and in max-plus & related semirings
(see [19]).
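The brute-force reading of the formula is easy to state as code; the following sketch (our own, and exponential in n, so purely illustrative) evaluates perm(A) directly over all permutations:

```python
# Sketch (ours): the max-plus permanent computed by brute force,
# perm(A) = max over permutations σ of Σ_i a_{i,σ(i)}.
from itertools import permutations

def mp_perm(A):
    n = len(A)
    return max(sum(A[i][s[i]] for i in range(n)) for s in permutations(range(n)))

# Example: mp_perm([[1, 3], [2, 0]]) = max{1 + 0, 3 + 2} = 5
```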
Note that if A ∈ Rn×n max is invertible then by Theorem 1.10, A = Pσ ⊗ D(a1, . . . , an) and so
perm(A) = ⊗i=1..n ai ≠ ε. However, unlike in the case of determinants in conventional matrix
algebra, the converse is not necessarily true.
The second concept in max-plus related to the determinant, known as the dominant, can be
thought of as a refinement of the permanent. It is defined below.
Definition 1.14. Let A ∈ Rn×n max and let the matrix z^A be defined by [z^A]ij = z^(aij). The dominant
of A, denoted dom(A), is defined as

dom(A) = the highest exponent of z appearing in det(z^A) if det(z^A) ≠ 0, and ε otherwise.
The dominant can be used to prove max-plus analogues of major results such as Cramér’s
Theorem and the Cayley-Hamilton Theorem. We do not have the space to include these here;
for a comprehensive discussion readers are again referred to [19].
1.3 Graph-theoretic Interpretations in Max-Plus
As in conventional linear algebra, when working with vectors and matrices it is often natural
to interpret definitions and theorems graphically. It turns out that in the case of max-plus
algebra, it is not only natural to do so but also rather insightful. We will only really be able to
appreciate this when we come to look at the eigenvalue problem in the next section, but firstly
we must define all of the graph-theoretic concepts that we will require.
Definitions 1.15. (i) A directed graph G is a pair (V, E) where V is the set of vertices (or
nodes) and E ⊆ V × V is the set of edges (or arcs).
(ii) A path from vertex i to vertex j is a sequence of edges p = (i1, . . . , is+1) with i1 = i and
is+1 = j, such that (ik, ik+1) ∈ E for all k ∈ {1, . . . , s}.
(iii) The length of a path p = (i1, . . . , is+1), denoted |p|l, is equal to s. The set of paths from
vertex i to vertex j of length k is denoted Pk(i, j).
(iv) The weight of a path p from vertex i to vertex j of length d is given by

|p|w = Σk=1..d a_{ik+1, ik}

where i1 = i and id+1 = j.

(v) The average weight of a path p is given by |p|w / |p|l.
(vi) A circuit of length s is a path of length s which starts and finishes at the same vertex, i.e.
a path c = (i1, . . . , is+1) such that i1 = is+1.
(vii) A circuit c = (i1, . . . , is+1) is elementary if i1, . . . , is are distinct, and s ≥ 1. We denote
the set of elementary circuits in G(A) by C(A).
(viii) For A ∈ Rn×n max, the communication graph (or the precedence graph) of A, denoted G(A),
is the graph with vertex set V(A) = {1, . . . , n} and edge set E(A) = {(i, j) : aji ≠ ε}. The
weight of the edge (i, j) ∈ E(A) is given by the entry aji.

Note that the (i, j)-th entry of the matrix A specifies the weight of the edge in G(A) from vertex
j to vertex i. This is common practice in the area of max-plus and graph theory but may not
appear intuitive to those new to the subject.
We now move on to looking at two particular matrices that play a vital role in relating graph
theory to max-plus linear algebra. For A ∈ Rn×n max, let

A+ := ⊕k=1..∞ A⊗k

The element [A+]ji gives the maximal weight of any path from i to j in G(A). This statement
is non-trivial, but follows directly from the theorem below.
Theorem 1.16. Let A ∈ Rn×n max. Then ∀k ∈ N:

[A⊗k]ji = max{|p|w : p ∈ Pk(i, j)} if Pk(i, j) ≠ ∅, and ε if Pk(i, j) = ∅.
Proof. We use induction on k. Let i, j ∈ {1, . . . , n}. When k = 1, P1(i, j) either contains a
single path of length 1, namely the edge (i, j), or is empty if no such edge exists. In the first case,
the weight of the path is by definition [A]ji, and in the second case max{|p|w : p ∈ Pk(i, j)} = ε,
which is again equal to the value [A]ji (since there is no edge from i to j).

Now suppose the result holds for some k. Firstly, assume that Pk+1(i, j) ≠ ∅. A path p ∈
Pk+1(i, j) can be split up into a subpath of length k running from i to some vertex l, and a
path consisting of a single edge from l to j. More formally:

p = p̂ ◦ (l, j) with p̂ ∈ Pk(i, l)

The maximal weight of any path in Pk+1(i, j) can thus be obtained from

max l=1,...,n ([A]jl + max{|p̂|w : p̂ ∈ Pk(i, l)})
  = max l=1,...,n ([A]jl + [A⊗k]li)      (inductive hypothesis)
  = ⊕l=1..n ([A]jl ⊗ [A⊗k]li)
  = [A ⊗ A⊗k]ji
  = [A⊗(k+1)]ji

which is what we wanted to prove. Finally, consider the case when Pk+1(i, j) = ∅; i.e. when
there exists no path of length k + 1 from i to j. This implies that for any vertex l, either there
is no path of length k from i to l or there is no edge from l to j (or possibly both). Hence
for any l, at least one of the values [A]jl, [A⊗k]li equals ε. Therefore [A⊗(k+1)]ji = ε, and this
completes the proof.
Note that Theorem 1.16 immediately tells us that A+ is not necessarily well-defined. For
example, if there exists a circuit c = (i1, . . . , is+1) in G(A) in which every edge has positive
weight, then [A⊗k]ji diverges (i.e. tends to +∞) as k → ∞ for any i, j ∈ {i1, . . . , is+1} (since
we can loop around the circuit c as many times as we like, creating a path of higher and higher
weight). The next lemma provides us with a sufficient condition for A+ to be well-defined, and
also reduces the complexity of the infinite sum.
Lemma 1.17. Let A ∈ Rn×n max be such that any circuit in G(A) has non-positive average weight
(i.e. less than or equal to e). Then we have

A+ = A⊗1 ⊕ A⊗2 ⊕ A⊗3 ⊕ · · · ⊕ A⊗n ∈ Rn×n max
Proof. Since A is of dimension n, any path p in G(A) from i to j of length greater than n
necessarily contains at least one circuit. We have assumed that all of the circuits in G(A) have
non-positive weights, so removing the circuits in p yields a path from i to j of length at most
n, and of greater average weight. It follows that
[A+]ji ≤ max{[A⊗k]ji : k ∈ {0, . . . , n}}

and the reverse inequality is immediate from the definition of A+. This concludes the proof.
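The finite sum in Lemma 1.17 is straightforward to compute; the sketch below (ours, reusing mp_mul and mp_add from the earlier snippet) does exactly that:

```python
# Sketch (ours) of Lemma 1.17: if every circuit of G(A) has non-positive
# average weight, then A+ = A ⊕ A⊗2 ⊕ ... ⊕ A⊗n.
def mp_plus(A):
    n = len(A)
    power, result = A, A
    for _ in range(n - 1):
        power = mp_mul(power, A)       # successively A⊗2, ..., A⊗n
        result = mp_add(result, power)
    return result
```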
Before moving on, we prove one simple property of A+ that will come in handy later on.

Proposition 1.18. For A ∈ Rn×n max, we have that A+ ⊗ A+ = A+.

Proof. Consider two vertices i, l ∈ {1, . . . , n}. A path of maximal weight from i to l can be split
up as a path of maximal weight from i to j plus a path of maximal weight from j to l, for any
j ∈ {1, . . . , n} for which the sum of the two path weights is maximal. Indeed this relationship
holds if and only if j is in the path of maximal weight from i to l, but for our purposes we can
simply take the maximum over all vertices.

By Theorem 1.16, the weight of such a path is given by [A+]li. Thus in max-plus notation
(recalling that ⊗ is commutative for scalars α ∈ Rmax), we can write

[A+]li = ⊕j=1..n ([A+]ji ⊗ [A+]lj) = ⊕j=1..n ([A+]lj ⊗ [A+]ji) = [A+ ⊗ A+]li

and therefore A+ = A+ ⊗ A+ as required.
We now introduce one more definition which is closely related to the object A+ defined above.
This will prove to be an integral concept throughout the rest of this chapter and beyond, and
as such, this is one of the most important definitions in this thesis.
Definition 1.19. For A ∈ Rn×n max, let

A∗ := ⊕k=0..∞ A⊗k = E ⊕ A+

Clearly, A∗ and A+ only differ on the leading diagonal. By Theorem 1.16, the (j, i)-th entry of A∗
could be interpreted as the maximal weight of any path from i to j in G(A), provided we
recognise the additional concept of an empty circuit of length 0 and weight e from every vertex
to itself.
Using Lemma 1.17, it is immediate from the definition of A∗ that if all the circuits in G(A) have
non-positive average weight, then A∗ = A⊗0 ⊕ A⊗1 ⊕ · · · ⊕ A⊗n. However, as the lemma below
shows, thanks to the addition of the identity matrix (i.e. the A⊗0 term) in A∗, we are able to
refine this result slightly by dropping the final term in the sum.
Lemma 1.20. Let A ∈ Rn×n max be such that any circuit in G(A) has non-positive average weight.
Then we have

A∗ = A⊗0 ⊕ A⊗1 ⊕ A⊗2 ⊕ · · · ⊕ A⊗(n−1) ∈ Rn×n max
Proof. The same argument applies as in the proof of Lemma 1.17. Note that any path p in G(A)
from i to j of length n or greater necessarily contains at least one circuit, and so removing the
circuit(s) yields a path from i to j of length at most n − 1 and with greater average weight. For
the special case when i = j and p is an elementary circuit of length n (so visiting each vertex
in G(A) exactly once), the i-th entry on the diagonal of A⊗0 (which equals e by definition) will
always be greater than the corresponding entry in A⊗n, since e is the maximum possible weight
of any circuit. This is why we can drop the A⊗n term.
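A corresponding sketch for A∗ (ours, reusing the earlier helpers) starts the sum from A⊗0 = E and, per Lemma 1.20, stops at A⊗(n−1):

```python
# Sketch (ours) of Lemma 1.20: under the same circuit condition,
# A* = E ⊕ A ⊕ ... ⊕ A⊗(n−1), i.e. the A⊗n term can be dropped.
def mp_star(A):
    n = len(A)
    power, result = mp_eye(n), mp_eye(n)   # start from A⊗0 = E
    for _ in range(n - 1):
        power = mp_mul(power, A)            # A⊗k for k = 1, ..., n−1
        result = mp_add(result, power)
    return result
```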
Note that we also have a direct analogue of Proposition 1.18 for the matrix A∗, and this will be
useful in the analysis that follows:

Proposition 1.21. For A ∈ Rn×n max, we have that A∗ ⊗ A∗ = A∗.
Proof. From Proposition 1.18 we have that A+ = A+ ⊗ A+. Recalling the definition of A∗ and using
idempotency of matrix addition, we have

A∗ ⊗ A∗ = (A+ ⊕ E) ⊗ (A+ ⊕ E)
        = (A+ ⊗ A+) ⊕ (A+ ⊗ E) ⊕ (E ⊗ A+) ⊕ E
        = A+ ⊕ A+ ⊕ A+ ⊕ E
        = A+ ⊕ E = A∗

as required.
To finish this section, we introduce one more important property of square matrices over max-
plus known as irreducibility. The definition comes in three parts:
Definitions 1.22. (i) In a graph G, a vertex j is reachable from vertex i if there exists a
path from i to j.
(ii) A graph is strongly connected if every vertex is reachable from every other vertex.
(iii) A matrix A ∈ Rn×n
max is irreducible if G(A) is strongly connected.
The class of irreducible matrices over max-plus will turn out to be of real significance in Section
1.4. From a practical point of view it is not obvious how to determine whether a given matrix
A ∈ Rn×n
max is irreducible, but as the proposition below shows, one option is to examine the matrix
A+. Combined with Lemma 1.17 (when A has the appropriate properties), this provides us with
a handy (and computationally quick) way to check for matrix irreducibility over max-plus.
Proposition 1.23. A matrix A ∈ Rn×n
max is irreducible if and only if all the entries of A+ are
different from ε.
Proof. A matrix is irreducible if there is a path between any two vertices i and j in G(A), which
by Theorem 1.16 occurs exactly when the entry [A+]ji is not equal to ε.
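As a small illustration (ours, not from the thesis), the test of Proposition 1.23 can be run directly on the truncated sum computed by mp_plus:

```python
# Sketch (ours) of Proposition 1.23: A is irreducible exactly when every entry
# of A+ is different from ε.  Note that the truncated sum A ⊕ A⊗2 ⊕ ... ⊕ A⊗n
# returned by mp_plus decides reachability even when the full series diverges,
# so this check does not need the non-positive-circuit condition.
def is_irreducible(A):
    return all(entry != EPS for row in mp_plus(A) for entry in row)
```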
1.4 Spectral Theory
1.4.1 Eigenvalues and Eigenvectors
Given a matrix A ∈ Rn×n
max , we consider the problem of existence of eigenvalues and eigenvectors.
The main result in max-plus spectral theory is that, under mild conditions, A has a unique
eigenvalue with a simple graph-theoretic interpretation. As can be seen below, the definition of
max-plus eigenvalues and eigenvectors is a direct translation from conventional linear algebra,
with the × operator replaced with ⊗:
Definition 1.24. Let A ∈ Rn×n
max . If there exists a scalar µ ∈ Rmax and a vector v ∈ Rn
max
(containing at least one finite element) such that
A ⊗ v = µ ⊗ v
then µ is an eigenvalue of A and v is an eigenvector of A associated with the eigenvalue µ.
Note that Definition 1.24 allows an eigenvalue to be µ = ε. However, the proposition below says
that this can only happen when A has a column in which all entries are ε. In graph-theoretic
terms this means that G(A) has a vertex which, once visited, can never be left (sometimes called
a sink). This is uninteresting from an analytical point of view, so it is reasonable to consider
the case µ = ε to be trivial. Before we prove this result, we introduce some simple notation.
Notation. Let A ∈ Rn×n
max . For i ∈ {1, . . . , n}, we denote the i-th row of A by [A]i·. Similarly,
for j ∈ {1, . . . , n}, we denote the j-th column of A by [A]·j.
Proposition 1.25. ε is an eigenvalue of A ∈ Rn×n
max iff A has at least one column in which all
entries are ε.
Proof. Let A ∈ Rn×n max be such that [A]·j = (ε, . . . , ε)ᵀ for some j ∈ {1, . . . , n}. Let v ∈ Rn max be
such that [v]i = ε ∀i ≠ j and [v]j = α ≠ ε. Then it is easy to verify that [A ⊗ v]i = ε for all
i = 1, . . . , n; that is, ε is an eigenvalue of A with an associated eigenvector v.

Conversely, suppose A ∈ Rn×n max has eigenvalue ε with an associated eigenvector v. Let J = {j :
vj ≠ ε}, which is non-empty by definition. Then for each i = 1, . . . , n we have

ε = [A ⊗ v]i = ⊕j=1..n (aij ⊗ vj) = ⊕j∈J (aij ⊗ vj)   =⇒   aij = ε ∀j ∈ J

So every column j of A for which vj ≠ ε has all its entries equal to ε. In particular, A contains
at least one column in which all entries are ε.
Corollary 1.26. If A ∈ Rn×n
max is irreducible then ε is not an eigenvalue of A.
Proof. If A is irreducible then it cannot have a column in which all entries are ε. Thus by
Proposition 1.25, ε is not an eigenvalue of A.
Note that eigenvectors are not unique: any scalar multiple of an eigenvector is also an eigen-
vector, and more generally, if µ is an eigenvalue of A, v1, v2 are associated eigenvectors and
α1, α2 ∈ Rmax \ {ε}, then we have

A ⊗ ((α1 ⊗ v1) ⊕ (α2 ⊗ v2)) = (A ⊗ (α1 ⊗ v1)) ⊕ (A ⊗ (α2 ⊗ v2))
                            = (α1 ⊗ (A ⊗ v1)) ⊕ (α2 ⊗ (A ⊗ v2))
                            = (α1 ⊗ (µ ⊗ v1)) ⊕ (α2 ⊗ (µ ⊗ v2))
                            = µ ⊗ ((α1 ⊗ v1) ⊕ (α2 ⊗ v2))

So (α1 ⊗ v1) ⊕ (α2 ⊗ v2) is also an eigenvector associated with the eigenvalue µ. In fact, the
eigenvectors associated with a given eigenvalue form a vector space in max-plus called the
eigenspace which we shall explore in depth later.
As we mentioned at the beginning of Section 1.3, many of the results in the area of max-plus
spectral theory can be interpreted graphically, and the next key lemma constitutes the first step
in doing just that.
Lemma 1.27. Let A ∈ Rn×n
max have finite eigenvalue µ. Then µ is the average weight of some
elementary circuit in G(A).
Proof. Let v be an associated eigenvector of µ. Then by definition not all the entries of v equal
ε, i.e. there exists a vertex/index j1 ∈ {1, . . . , n} such that vj1 ≠ ε. Now v is an eigenvector
and so we have [A ⊗ v]j1 = µ ⊗ vj1 ≠ ε. But [A ⊗ v]j1 = ⊕k=1..n (aj1k ⊗ vk), and therefore there
exists a vertex j2 such that

aj1j2 ⊗ vj2 = [A ⊗ v]j1 ≠ ε     (1.1)

which implies aj1j2 ≠ ε, i.e. (j2, j1) is an edge in G(A). (1.1) also implies that vj2 ≠ ε, so we
can continue in the same fashion to find a vertex j3 with (j3, j2) an edge in G(A) and vj3 ≠ ε.
Proceeding in this way, eventually some vertex, say, vertex jh, must be encountered for a second
time since the number of vertices is finite. Thus by ignoring the edges prior to encountering jh
for the first time, we have found an elementary circuit

c = ((jh, jh+l−1), (jh+l−1, jh+l−2), . . . , (jh+1, jh))

of length |c|l = l, and with weight

|c|w = Σk=0..l−1 a_{j_{h+k} j_{h+k+1}}     (1.2)

where jh = jh+l. By construction, we have that

⊗k=0..l−1 (a_{j_{h+k} j_{h+k+1}} ⊗ v_{j_{h+k+1}}) = µ⊗l ⊗ (⊗k=0..l−1 v_{j_{h+k}})

or equivalently in conventional algebra (for ease of manipulation):

Σk=0..l−1 (a_{j_{h+k} j_{h+k+1}} + v_{j_{h+k+1}}) = (l × µ) + Σk=0..l−1 v_{j_{h+k}}

Now, because jh = jh+l it follows that Σk=0..l−1 v_{j_{h+k+1}} = Σk=0..l−1 v_{j_{h+k}}, so subtracting Σk=0..l−1 v_{j_{h+k}}
from both sides yields

Σk=0..l−1 a_{j_{h+k} j_{h+k+1}} = l × µ

and translated back into max-plus, we can substitute this into (1.2) to see that |c|w = µ⊗l.
Thus we have that the average weight of the circuit c is equal to

|c|w / |c|l = µ⊗l / l = µ

as required.
Lemma 1.27 tells us that the only candidates for eigenvalues are the average weights of circuits
in G(A). However, it does not tell us which circuits actually define an eigenvalue and which
do not. Fortunately, when A is irreducible the answer to this question is very simple: only
the maximal average circuit weight defines an eigenvalue. This result is established in the two
theorems below, but first we require some additional definitions and notation.
Definitions 1.28. (i) A circuit c ∈ C(A) is critical if its average weight is maximal.
(ii) For A ∈ Rn×n
max , the critical graph of A, denoted Gc(A), is the graph containing the vertices
and edges which belong to the critical circuits in G(A). We write Gc(A) = (Vc(A), Ec(A)),
and refer to the vertices in Vc(A) as critical vertices.
(iii) The critical classes of A ∈ Rn×n max are the maximal strongly connected components of Gc(A).

Notation. Let A ∈ Rn×n max. For β ∈ Rmax \ {ε}, define the matrix Aβ by [Aβ]ij = aij − β.
Note that the ‘−’ operator is to be interpreted in conventional algebra, where we adopt the
convention ε − x = ε ∀x ∈ R. If β is an eigenvalue of A, the matrix Aβ is sometimes called the
normalised matrix.
Note that the communication graphs G(A) and G(Aβ) are identical except for their edge weights,
and if a circuit c in G(A) has average weight w then the same circuit in G(Aβ) has average weight
w − β. In particular, if G(A) has finite maximal average circuit weight λ then the maximal
average circuit weight in G(Aλ) is λ − λ = 0. Furthermore, a circuit in G(A) is critical if and
only if it is critical in G(Aλ), and therefore Gc(A) and Gc(Aλ) are identical (again, except for
their edge weights).
Consider the matrix A+λ, which is to be read (Aλ)+. By Theorem 1.16, the element [A+λ]ij gives
the maximal weight of any path from j to i in G(Aλ). In particular, since all circuits in G(Aλ)
have non-positive average weight, we must have [A+λ]ii ≤ e for all i ∈ {1, . . . , n}. Furthermore,
for the matrix A∗λ (also to be read (Aλ)∗) we obtain [A∗λ]ii = e ⊕ [A+λ]ii = e for all i ∈ {1, . . . , n}.
Theorem 1.29. Let the communication graph G(A) of a matrix A ∈ Rn×n max have finite maximal
average circuit weight λ. Then λ is an eigenvalue of A, with an associated eigenvector [A∗λ]·j
for any vertex j ∈ Vc(A).

Proof. Firstly note that all the circuits in G(Aλ) have non-positive average weight, and therefore
A+λ is well-defined by Lemma 1.17. Now, every vertex in Gc(Aλ) is contained in a non-empty
circuit which has weight e, i.e.

∀j ∈ Vc(A) : [A+λ]jj = e     (1.3)

Next, write

[A∗λ]ij = [E ⊕ A+λ]ij = ε ⊕ [A+λ]ij for i ≠ j, and e ⊕ [A+λ]ij for i = j.

Then from (1.3), for j ∈ Vc(A) it follows that

[A+λ]·j = [A∗λ]·j     (1.4)

Now, note that we have

A+λ = A⊗1λ ⊕ A⊗2λ ⊕ · · · = Aλ ⊗ (A⊗0λ ⊕ A⊗1λ ⊕ · · ·) = Aλ ⊗ A∗λ

So substituting this into (1.4) gives, for j ∈ Vc(A),

[Aλ ⊗ A∗λ]·j = [A∗λ]·j ⇐⇒ Aλ ⊗ [A∗λ]·j = [A∗λ]·j ⇐⇒ A ⊗ [A∗λ]·j = λ ⊗ [A∗λ]·j

Therefore λ is an eigenvalue of A and the j-th column of A∗λ is an associated eigenvector for
any j ∈ Vc(A).
Theorem 1.30. Let A ∈ Rn×n max be irreducible. Then A has a unique eigenvalue, denoted λ(A),
which is finite and equal to the maximal average circuit weight in G(A).

Proof. Let the maximal average circuit weight in G(A) be denoted by λ. Since A is irreducible,
G(A) must contain a circuit and therefore λ is necessarily finite. Thus by Theorem 1.29 we
know that λ is an eigenvalue of A, and it remains to show uniqueness.

Let c = (j1, . . . , jl+1) be an arbitrary circuit in C(A) of length l = |c|l, with jl+1 = j1. Then
a_{j_{k+1} j_k} ≠ ε for all k ∈ {1, . . . , l}. Further, suppose that µ is an eigenvalue of A with an associated
eigenvector v. Note that A is irreducible, so by Corollary 1.26 we have that µ ≠ ε. Now, since
A ⊗ v = µ ⊗ v, it follows that

a_{j_{k+1} j_k} ⊗ v_{j_k} ≤ µ ⊗ v_{j_{k+1}},   k ∈ {1, . . . , l}

and arguing as in Lemma 1.27 (replacing equalities with the appropriate inequalities), we see
that the average weight of the circuit c satisfies

|c|w / |c|l ≤ µ⊗l / l = µ     (1.5)

That is, µ ≥ λ (since (1.5) holds for all c ∈ C(A), and we already have that the maximal
average circuit weight is λ). But by Lemma 1.27, µ is equal to the average weight of some
circuit c ∈ C(A), and so µ ≤ λ also. Hence µ = λ, i.e. λ is the unique eigenvalue of A.
When A is large it is often difficult to identify the maximal average circuit weight in G(A). In
fact, there exist several numerical procedures used to determine the eigenvalue of an irreducible
matrix in max-plus, including Karp’s Algorithm and the Power Algorithm. However, none of
these has a particularly attractive order of complexity - for example, the complexity of Karp's
Algorithm is of order n³, and the complexity of the Power Algorithm is not known precisely
(see [11]). We do not have space here to describe the methods in detail; for more information
readers are referred to chapter five of [13].
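For illustration only (this is our own sketch rather than the thesis's pseudocode), one standard formulation of Karp's algorithm, adapted to maximum cycle means and to the edge convention used here, looks as follows; it assumes A is irreducible and reuses EPS from the earlier snippets:

```python
# Sketch (ours) of Karp's algorithm for λ(A), the maximal average circuit
# weight of G(A) (Theorem 1.30).  Convention: [A]ij is the weight of the edge
# from vertex j to vertex i; ε = float('-inf').  Runs in O(n³).
def karp_eigenvalue(A):
    n = len(A)
    # D[k][i] = maximal weight of a walk of length k from vertex 0 to vertex i
    D = [[EPS] * n for _ in range(n + 1)]
    D[0][0] = 0.0
    for k in range(1, n + 1):
        for i in range(n):
            D[k][i] = max(A[i][j] + D[k - 1][j] for j in range(n))
    # λ = max over i of min over k of (D[n][i] − D[k][i]) / (n − k)
    best = EPS
    for i in range(n):
        if D[n][i] == EPS:
            continue
        best = max(best, min((D[n][i] - D[k][i]) / (n - k)
                             for k in range(n) if D[k][i] > EPS))
    return best
```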
We end this section with a simple proposition that, while interesting in its own right, will come
in handy when we begin to look at the eigenspace.
Proposition 1.31. Let A ∈ Rn×n max be an irreducible matrix with eigenvalue λ and associated
eigenvector v. We have that vi > ε for all i ∈ {1, . . . , n}.

Proof. Call the set of vertices of G(A) corresponding to the finite entries of v the support of
v, denoted Z(v). Suppose that Z(v) does not contain all the elements of V(A). Since A is
irreducible, there must be edges from the vertices in Z(v) to vertices not belonging to Z(v).
Hence there exist vertices j ∈ Z(v), i ∉ Z(v) with aij ≠ ε. Then

[A ⊗ v]i ≥ aij ⊗ vj > ε

That is, Z(A ⊗ v) is strictly bigger than Z(v). But A ⊗ v = λ ⊗ v (and λ is finite by Theorem
1.30), so Z(v) and Z(A ⊗ v) should be equal. This is a contradiction, and so Z(v) must contain
all the elements of V(A).
1.4.2 The Eigenspace
Let A ∈ Rn×n
max have finite eigenvalue λ. In this part of our analysis we let V (A, λ) denote the
set of all eigenvectors of A associated with the eigenvalue λ, which we call the eigenspace of A
w.r.t. λ. If A is irreducible then by Theorem 1.30 we know that it has a unique eigenvalue, so
we can drop the dependence on λ and denote the eigenspace of A simply by V (A).
The main aim of this section is to find an expression that completely characterises the eigenspace
of A. In Theorem 1.29 we established that [A∗λ]·j is an eigenvector of A for any j ∈ Vc(A),
but are these the only eigenvectors (of course, up to taking linear combinations, as discussed
above)? We will eventually see that the answer to this question is yes, but first we require some
intermediate steps.
Lemma 1.32. Let A ∈ Rn×n max. We have that A∗λ = (E ⊕ Aλ)⊗(n−1).

Proof. If n = 1 then the result is trivial. Otherwise, since E and Aλ commute, we can carry
out the iterated multiplication (E ⊕ Aλ) ⊗ · · · ⊗ (E ⊕ Aλ) to obtain

(E ⊕ Aλ)⊗(n−1) = E ⊕ ⊕i=1..n−1 (A⊗iλ ⊕ · · · ⊕ A⊗iλ)   [each A⊗iλ appearing (n−1 choose i) times]     (1.6)

Each power A⊗0λ, . . . , A⊗(n−1)λ occurs at least once, so by idempotency of ⊕, (1.6) becomes

(E ⊕ Aλ)⊗(n−1) = E ⊕ Aλ ⊕ A⊗2λ ⊕ · · · ⊕ A⊗(n−1)λ     (1.7)

However, noting that every circuit in G(Aλ) must have non-positive weight, we can apply Lemma
1.20 to see that the right-hand side of (1.7) is equal to A∗λ. This completes the proof.
Lemma 1.33. Let A ∈ Rn×n max be an irreducible matrix, with eigenvalue λ and an associated
eigenvector v. Then the matrix A∗λ has eigenvalue e, also with an associated eigenvector v.

Proof. Firstly, note that for any j ∈ {1, . . . , n}

[λ ⊗ v]j = [A ⊗ v]j ⇐⇒ vj = [A ⊗ v]j − λ ⇐⇒ e ⊗ vj = [Aλ ⊗ v]j

That is, e ⊗ v = Aλ ⊗ v, and v is also an eigenvector of Aλ (whose unique eigenvalue must be e
by Theorem 1.30). Thus the eigenspaces V (A) and V (Aλ) coincide. Next, note that

(E ⊕ Aλ) ⊗ v = (E ⊗ v) ⊕ (Aλ ⊗ v) = v ⊕ v = v

Therefore, using Lemma 1.32:

A∗λ ⊗ v = (E ⊕ Aλ)⊗(n−1) ⊗ v = v = e ⊗ v

as required.
Definition 1.34. Let A ∈ Rn×n max be a matrix with eigenvalue λ and associated eigenvector v.
The saturation graph of A with respect to λ, denoted Sλ(A, v), is the graph consisting of those
edges (j, i) ∈ E(A) such that aij ⊗ vj = λ ⊗ vi, with vi, vj ≠ ε.

Recall that by definition, if v is an eigenvector of A then there exists at least one i ∈ {1, . . . , n}
such that vi ≠ ε. Then, since A ⊗ v = λ ⊗ v we have that ⊕j=1..n (aij ⊗ vj) = λ ⊗ vi, which implies
that there exists (at least one) j ∈ {1, . . . , n} such that aij ⊗ vj = λ ⊗ vi. This value is finite
(assuming λ ≠ ε), so we must have (j, i) ∈ E(Sλ(A, v)). That is, the saturation graph of A
w.r.t. λ is never empty. Indeed, if A is irreducible, by Proposition 1.31 we know that vi > ε for
all i ∈ {1, . . . , n}, and so by the same argument, Sλ(A, v) contains all the vertices in V(A). In
this case we know that the eigenvalue λ is unique, and therefore we drop the dependence on λ
and simply refer to the saturation graph of A.
Lemma 1.35. Let A ∈ Rn×n max be an irreducible matrix, with eigenvalue λ and associated
eigenvector v. We have:

(i) For each vertex i ∈ V(A), there exists a circuit in S(A, v) from which vertex i can be
reached in a finite number of steps.

(ii) Any circuit in S(A, v) belongs to Gc(A).

Proof. (i) A is irreducible, so by Proposition 1.31 we know that vi > ε for all i ∈ {1, . . . , n}. Let
i ∈ V(A), which by the discussion above we know is a vertex of the saturation graph S(A, v).
Thus there is a vertex j such that λ ⊗ vi = aij ⊗ vj. Repeating this argument, we can identify
a vertex k such that λ ⊗ vj = ajk ⊗ vk. Repeating this argument an arbitrary number of times,
say, m, we get a path in S(A, v) of length m. If m > n, the constructed path must contain a
circuit.

(ii) Let c = (i1, i2, . . . , il+1) be a circuit of length l in S(A, v). By definition, for all k ∈ {1, . . . , l}
we have that

λ ⊗ v_{i_{k+1}} = a_{i_{k+1} i_k} ⊗ v_{i_k}

which implies that

λ⊗l ⊗ v_{i_1} = (⊗k=1..l a_{i_{k+1} i_k}) ⊗ v_{i_1}

Hence, recalling that v_{i_1} is finite:

λ⊗l = ⊗k=1..l a_{i_{k+1} i_k}

But the right-hand side is simply equal to the weight of the circuit c, which thus has average
weight λ. But A is irreducible, so by Theorem 1.30 λ is equal to the maximal average circuit
weight in G(A). Thus c is critical, and belongs to Gc(A).
Lemma 1.36. Let A ∈ Rn×n max be an irreducible matrix, with eigenvalue λ and associated
eigenvector v. Then v can be written as

v = ⊕j∈Vc(A) (αj ⊗ [A∗λ]·j)

for some αj ∈ Rmax, j ∈ Vc(A).

Proof. Consider two vertices i, j in S(Aλ, v) such that there exists a path from i to j, say,
(i1, i2, . . . , il+1), with i1 = i and il+1 = j. Then by definition of the saturation graph, this gives

[Aλ]_{i_{k+1} i_k} ⊗ v_{i_k} = v_{i_{k+1}},   k ∈ {1, . . . , l}

Hence vj = a ⊗ vi, where a is given by

a = ⊗k=1..l [Aλ]_{i_{k+1} i_k} ≤ [A⊗lλ]ji ≤ [A∗λ]ji     (1.8)

Now, using that vj = a ⊗ vi, for an arbitrary vertex ν ∈ {1, . . . , n}:

[A∗λ]νj ⊗ vj = [A∗λ]νj ⊗ a ⊗ vi
            ≤ [A∗λ]νj ⊗ [A∗λ]ji ⊗ vi   (by (1.8))
            ≤ [A∗λ]νi ⊗ vi     (1.9)
where the last inequality follows from Proposition 1.21. By applying Lemma 1.35, for any vertex
j in S(Aλ, v) there exists a vertex i = i(j) ∈ Vc(A). Inequality (1.9) therefore implies

⊕j∈S(Aλ,v) ([A∗λ]νj ⊗ vj) ≤ ⊕i∈Vc(Aλ) ([A∗λ]νi ⊗ vi)     (1.10)

and this holds for any ν ∈ {1, . . . , n}.

Now, by Lemma 1.33, A∗λ has eigenvalue e with an associated eigenvector v, i.e. v = A∗λ ⊗ v.
The value of vν is equal to [A∗λ]νj ⊗ vj for some j, which by definition has to be in the saturation
graph S(Aλ, v). Thus it holds for ν ∈ {1, . . . , n} that

vν = ⊕j∈S(Aλ,v) ([A∗λ]νj ⊗ vj) ≤ ⊕j∈Vc(Aλ) ([A∗λ]νj ⊗ vj)   (by (1.10))

On the other hand, since v is an eigenvector of A∗λ associated with the eigenvalue e,

vν = [A∗λ ⊗ v]ν = ⊕j=1..n ([A∗λ]νj ⊗ vj) ≥ ⊕i∈Vc(Aλ) ([A∗λ]νi ⊗ vi)

which also holds for any ν ∈ {1, . . . , n}. Thus we have shown

vν = ⊕i∈Vc(Aλ) ([A∗λ]νi ⊗ vi)

and since Vc(Aλ) = Vc(A) (see the proof of Theorem 1.29), the proof is complete.
The lemma above shows that for an irreducible matrix A, the vectors [A∗λ]·j, with j ∈ Vc(A),
constitute a generating set for the eigenspace of A. Notice that in the proof we have actually
identified the coefficients αi to which we referred in the statement of the lemma. If some of the
columns of A∗λ are collinear then the αi's are non-unique and some can be chosen to equal ε.
We have now done most of the work in characterising the eigenspace of an irreducible matrix.
We now require a small extension of our notation and one more lemma before we are able to
give a complete expression for the eigenspace, and we will end this section by referring to a
theorem which shows that it is not possible to simplify this expression any further.
Notation. Recall that the critical classes of a matrix A ∈ Rn×n max are the maximal strongly
connected subgraphs of Gc(A). Let Nc(A) denote the number of critical classes of A, so Nc(A) ∈
N. For r ∈ {1, . . . , Nc(A)}, let Gcr(A) = (Vcr(A), Ecr(A)) denote the r-th critical class of A and
let jcr := min{j ∈ Vcr(A)} be the smallest numbered vertex in the r-th critical class. We call
{jc1, . . . , jcNc(A)} a set of representative vertices of the critical classes of A.

Note that in the way defined above, the set of representative vertices is unique. However, this
is not important - in general, a representative vertex jcr of the r-th critical class of A can be any
j ∈ Vcr(A).
Lemma 1.37. Let A ∈ Rn×n max be an irreducible matrix with eigenvalue λ. Then for i, j ∈ Vc(A),
there exists α ∈ Rmax \ {ε} such that

α ⊗ [A∗λ]·i = [A∗λ]·j

iff i and j are members of the same critical class.

Proof. Suppose that i, j ∈ Vc(A) are members of the same critical class of Aλ. Then i and j
communicate with each other in the critical graph, i.e. (i, j, i) is an elementary circuit in Gc(Aλ).
As we have argued before (see Theorem 1.29), any circuit in Gc(Aλ) must have weight e, and
therefore in this case we have [Aλ]ji ⊗ [Aλ]ij = e. Then by definition of A∗λ, we have that

[A∗λ]ji ⊗ [A∗λ]ij ≥ [Aλ]ji ⊗ [Aλ]ij = e     (1.11)

Now by a previous observation we know that [A∗λ]jj = e, and by Proposition 1.21 we have that
A∗λ = A∗λ ⊗ A∗λ. Therefore we also have

[A∗λ]ji ⊗ [A∗λ]ij ≤ ⊕l=1..n ([A∗λ]jl ⊗ [A∗λ]lj) = [A∗λ ⊗ A∗λ]jj = [A∗λ]jj = e     (1.12)

and from (1.11) and (1.12) we conclude that [A∗λ]ji ⊗ [A∗λ]ij = e. Thus for all l ∈ {1, . . . , n}:

[A∗λ]li ⊗ [A∗λ]ij ≤ [A∗λ]lj
                 = [A∗λ]lj ⊗ [A∗λ]ji ⊗ [A∗λ]ij
                 ≤ [A∗λ]li ⊗ [A∗λ]ij

and therefore [A∗λ]lj = [A∗λ]li ⊗ [A∗λ]ij. Hence the statement of the lemma has been proved, with
α = [A∗λ]ij.

Conversely, suppose now that i, j ∈ Vc(A) do not belong to the same critical class, and suppose
for contradiction that we can find α ∈ Rmax \ {ε} such that α ⊗ [A∗λ]·i = [A∗λ]·j. The i-th and
j-th components of this equation read

α ⊗ e = [A∗λ]ij and α ⊗ [A∗λ]ji = e

respectively, from which it follows that

[A∗λ]ij ⊗ [A∗λ]ji = e

Therefore the elementary circuit (i, j, i) has average weight e, and therefore belongs to Gc(Aλ).
Thus vertices i and j are members of the same critical class (since they communicate with each
other), which is a contradiction.
Theorem 1.38. Let A ∈ Rn×n max be an irreducible matrix with (unique) eigenvalue λ. The
eigenspace of A is given by

V (A) = { ⊕r=1..Nc(A) (αr ⊗ [A∗λ]·jcr) : αr ∈ Rmax, at least one αr finite }

for any set of representative vertices {jc1, . . . , jcNc(A)} of the critical classes of A.
Proof. By Lemma 1.36 we know that any eigenvector of A is a linear combination of the columns
[A∗λ]·j, for j ∈ Vc(A). However, by Lemma 1.37 we know that the columns [A∗λ]·j for j in some
critical class Vcr(A) are all collinear. Therefore to build any eigenvector we only need one column
corresponding to each critical class, and so it suffices to take the sum over a set of representative
vertices of the critical classes of A.
Theorem 1.39. No column [A∗λ]·i, for i ∈ Vc(A), can be expressed as a linear combination of
the columns [A∗λ]·jcr, where jcr varies over the representative vertices of critical classes distinct from
that of i.
Proof. The proof of this statement requires substantial groundwork which we do not have the
space to include. For all the details and a full proof, readers are referred to theorem 3.101 in
[2].
Theorem 1.39 above tells us that we cannot simplify any further the expression for V (A) given
in Theorem 1.38. It also tells us that for an irreducible matrix A, the columns [A∗λ]·jcr, where
{jc1, . . . , jcNc(A)} is a set of representative vertices of the critical classes of A, form a basis for
the eigenspace V (A).
1.4.3 A Worked Example
Consider the matrix

A =
[ ε −2  ε  6 ]
[ 1  ε  4  ε ]
[ ε  8  ε  ε ]
[ ε  5  ε  6 ]
Thus G(A) looks like

Figure 1.1: Communication graph of the matrix A given above. Vertices are represented as
circles and numbered 1-4 by convention. Edges are present only if the corresponding entry in
A is finite, in which case this value specifies the edge weight.
We can see that G(A) is strongly connected, so A is irreducible. Thus by Theorem 1.30, A has
a unique eigenvalue λ given by the maximal average circuit weight in G(A). The elementary
circuits and their average weights are
c1 = (1, 2, 1) |c1|w/|c1|l = (1 ⊗ −2)/2 = −0.5
c2 = (1, 2, 4, 1) |c2|w/|c2|l = (1 ⊗ 5 ⊗ 6)/3 = 4
c3 = (2, 3, 2) |c3|w/|c3|l = (8 ⊗ 4)/2 = 6
c4 = (4, 4) |c4|w/|c4|l = (6)/1 = 6
and therefore λ = max{−0.5, 4, 6, 6} = 6. Circuits c3 and c4 are critical, so the critical graph
Gc(A) looks like
Figure 1.2: Critical graph of the matrix A given above. Both the circuits have maximal average
weight of 6. The other circuits present in Figure 1.1 are no longer included because they are
not critical (their average weight is not maximal).
We can see that Vc(A) = {2, 3, 4}, and Gc(A) has two critical classes with vertex sets Vc1(A) =
{2, 3} and Vc2(A) = {4} respectively. Thus {jc1 = 2, jc2 = 4} is a set of representative vertices of
the critical classes of A. Now, using that [Aλ]ij = aij − λ, we have

Aλ =
[ ε  −8  ε  e ]
[ −5  ε −2  ε ]
[ ε   2  ε  ε ]
[ ε  −1  ε  e ]
and either by inspection of G(Aλ), or by using Lemma 1.17 and computing A⊗1λ, A⊗2λ, A⊗3λ and
A⊗4λ, we can see that

A+λ =
[ −6 −1 −3  e ]
[ −5  e −2 −5 ]
[ −3  2  e −3 ]
[ −6 −1 −3  e ]
Similarly, by using Lemma 1.20 (or by simply replacing any non-zero diagonal values in A+λ
above by e), we obtain

A∗λ =
[  e −1 −3  e ]
[ −5  e −2 −5 ]
[ −3  2  e −3 ]
[ −6 −1 −3  e ]
Now by Theorems 1.38 and 1.39, the columns [A∗λ]·2 and [A∗λ]·4 form a basis for the eigenspace of
A, i.e.

V (A) = { α1 ⊗ (−1, e, 2, −1)ᵀ ⊕ α2 ⊗ (e, −5, −3, e)ᵀ : α1, α2 ∈ Rmax, at least one αr finite }
For example, if we take α1 = −2, α2 = 1 we get

v := (−2 ⊗ (−1, e, 2, −1)ᵀ) ⊕ (1 ⊗ (e, −5, −3, e)ᵀ) = (−3, −2, e, −3)ᵀ ⊕ (1, −4, −2, 1)ᵀ = (1, −2, e, 1)ᵀ
and we can easily verify that this is indeed an eigenvector of A, associated with the unique
eigenvalue λ = 6:

A ⊗ v = [ ε −2  ε  6 ]   [  1 ]   [ 7 ]       [  1 ]
        [ 1  ε  4  ε ] ⊗ [ −2 ] = [ 4 ] = 6 ⊗ [ −2 ] = λ ⊗ v
        [ ε  8  ε  ε ]   [  e ]   [ 6 ]       [  e ]
        [ ε  5  ε  6 ]   [  1 ]   [ 7 ]       [  1 ]
Finally, we can observe that

[A∗λ]·3 = (−3, −2, e, −3)′ = −2 ⊗ (−1, e, 2, −1)′ = −2 ⊗ [A∗λ]·2

That is, columns [A∗λ]·2 and [A∗λ]·3 are scalar multiples of each other, which we would expect
(see Lemma 1.37) since vertices 2 and 3 are in the same critical class.
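The verification above can also be repeated numerically. In the sketch below (Python; variable names are ours), the two critical columns are taken directly from the matrix A∗λ displayed above, with e written as 0 and ε as float('-inf').

    NEG_INF = float("-inf")
    A = [[NEG_INF, -2, NEG_INF, 6],
         [1, NEG_INF, 4, NEG_INF],
         [NEG_INF, 8, NEG_INF, NEG_INF],
         [NEG_INF, 5, NEG_INF, 6]]
    lam = 6
    col2 = [-1, 0, 2, -1]             # [A*λ]·2
    col4 = [0, -5, -3, 0]             # [A*λ]·4
    alpha1, alpha2 = -2, 1

    v = [max(alpha1 + c2, alpha2 + c4) for c2, c4 in zip(col2, col4)]
    Av = [max(a + x for a, x in zip(row, v)) for row in A]     # A ⊗ v
    print(v)                           # [1, -2, 0, 1]
    print(Av)                          # [7, 4, 6, 7]
    print(Av == [lam + x for x in v])  # True: A ⊗ v = λ ⊗ v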
1.5 Recurrence Relations & Periodicity
1.5.1 Solving Max-Plus Recurrence Relations
In many of the applications discussed in Chapters 2 and 3 we will need to solve recurrence
relations over the max-plus semiring. A key insight in doing this is to view implicit first-order
recurrence relations of the form x(k + 1) = (A ⊗ x(k + 1)) ⊕ (B ⊗ x(k)) as a system of max-plus
linear equations x = (A ⊗ x) ⊕ b. The result below uses the ∗ operator (see Definition 1.19) to
solve systems of this form.
Theorem 1.40. Let A ∈ Rn×n max and b ∈ Rn max. If the communication graph G(A) has no circuit
with positive average weight, then the equation

x = (A ⊗ x) ⊕ b    (1.13)

has the solution x = A∗ ⊗ b. Furthermore, if all the circuit weights in G(A) are negative, then
this solution is unique.
Proof. By Lemma 1.20 we know that A∗ exists. We therefore have

A∗ ⊗ b = ⊕_{k=0}^{∞} A⊗k ⊗ b
       = ( ⊕_{k=1}^{∞} A⊗k ⊗ b ) ⊕ (E ⊗ b)
       = A ⊗ ( ⊕_{k=0}^{∞} A⊗k ⊗ b ) ⊕ (E ⊗ b)
       = A ⊗ (A∗ ⊗ b) ⊕ b
and therefore A∗ ⊗ b is indeed a solution of (1.13). To show uniqueness, suppose that x is a
solution of x = b⊕(A⊗x); then we can substitute the expression for x back into the right-hand
side of the equation to obtain
x = b ⊕ (A ⊗ b) ⊕ (A⊗2 ⊗ x)

Repeating this procedure yields

x = b ⊕ (A ⊗ b) ⊕ (A⊗2 ⊗ b) ⊕ (A⊗3 ⊗ x)
  = . . .
  = b ⊕ (A ⊗ b) ⊕ · · · ⊕ (A⊗(k−1) ⊗ b) ⊕ (A⊗k ⊗ x)
  = ⊕_{l=0}^{k−1} (A⊗l ⊗ b) ⊕ (A⊗k ⊗ x)    (1.14)
By Theorem 1.16, the entries of A⊗k are the maximal weights of paths of length k. For k large
enough, these paths necessarily contain elementary circuits, which have negative weight by
assumption. Indeed, as k → ∞ the number of elementary circuits in these paths also necessarily
tends to ∞, and so the elements of A⊗k tend to ε. Hence, letting k → ∞ in (1.14) gives that
x = A∗ ⊗ b (where once again we have applied Lemma 1.20), as required.
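Theorem 1.40 translates directly into a small algorithm: accumulate b ⊕ (A ⊗ b) ⊕ (A⊗2 ⊗ b) ⊕ · · ·, which under the no-positive-circuit assumption stabilises after at most n extra terms. The sketch below is a minimal Python illustration on a 2 × 2 example of our own choosing (not one from the text); mp_star_solve and mp_matvec are hypothetical helper names.

    NEG_INF = float("-inf")   # ε

    def mp_matvec(A, x):
        """Max-plus matrix-vector product: (A ⊗ x)_i = max_j (a_ij + x_j)."""
        return [max(a + xj for a, xj in zip(row, x)) for row in A]

    def mp_star_solve(A, b):
        """Solve x = (A ⊗ x) ⊕ b as x = A* ⊗ b, accumulated term by term
        (assumes G(A) has no circuit of positive average weight)."""
        x, term = list(b), list(b)
        for _ in range(len(b)):            # terms beyond A^(⊗n) ⊗ b cannot help
            term = mp_matvec(A, term)
            x = [max(xi, ti) for xi, ti in zip(x, term)]
        return x

    A = [[NEG_INF, -1],
         [-2, NEG_INF]]                    # the only circuit has weight -3 < 0, so the solution is unique
    b = [0, 3]
    x = mp_star_solve(A, b)
    print(x)                               # [2, 3]
    print([max(axi, bi) for axi, bi in zip(mp_matvec(A, x), b)])   # [2, 3] = x, so x = (A ⊗ x) ⊕ b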
As we mentioned above, Theorem 1.40 can be applied to the implicit recurrence relation

x(k + 1) = (A ⊗ x(k + 1)) ⊕ (B ⊗ x(k))

to yield the explicit recurrence relation

x(k + 1) = A∗ ⊗ B ⊗ x(k)
and this technique will be used several times in Chapter 2. However, can we extend this theory?
In many applications we will encounter systems whose dynamics follow a recurrence relation of
order higher than one. Consider the most general (implicit) linear recurrence relation of order
M ≥ 1:

x(k) = ⊕_{m=0}^{M} Am ⊗ x(k − m),    k ≥ 0    (1.15)

Here, A0, . . . , AM ∈ Rn×n max and x(m) ∈ Rn max, −M ≤ m ≤ −1, are given. We show below that
we can transform (1.15) into a first-order recurrence relation of the form x(k + 1) = A ⊗ x(k),
provided that A0 has no circuit of positive weight.
To begin, set

b(k) = ⊕_{m=1}^{M} Am ⊗ x(k − m)

Then (1.15) becomes

x(k) = A0 ⊗ x(k) ⊕ b(k)    (1.16)
Then, since A0 has no circuit of positive weight by assumption, we can apply Theorem 1.40 to
write (1.16) as

x(k) = A∗0 ⊗ b(k)
     = A∗0 ⊗ A1 ⊗ x(k − 1) ⊕ · · · ⊕ A∗0 ⊗ AM ⊗ x(k − M)    (1.17)
Note that we have now changed the implicit M-th order recurrence relation (1.15) into the
explicit M-th order recurrence relation (1.17) (the x(k) term does not feature on the right hand
side). To finish the job, we set

x(k) := (x′(k − 1), x′(k − 2), . . . , x′(k − M))′,

where ′ denotes transposition,
and (with E denoting the max-plus identity matrix and ℰ denoting a matrix with every entry
equal to ε):

A := (  A∗0 ⊗ A1   A∗0 ⊗ A2   · · ·   A∗0 ⊗ AM−1   A∗0 ⊗ AM
        E           ℰ           · · ·   ℰ             ℰ
        ℰ           E           · · ·   ℰ             ℰ
        ⋮           ⋮           ⋱       ⋮             ⋮
        ℰ           ℰ           · · ·   E             ℰ  )
Then (1.15) can be written as
x(k + 1) = A ⊗ x(k), k ≥ 0 (1.18)
which is what we were aiming for.
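As a concrete illustration, the sketch below assembles the block matrix A of (1.18) from given A0, A1, . . . , AM, with max-plus identity blocks on the block sub-diagonal and ε everywhere else, as in the display above. It is only a sketch under the assumption stated in the text (no positive circuit in G(A0)); the function names mp_star and companion are our own.

    NEG_INF = float("-inf")   # ε

    def mp_matmul(A, B):
        return [[max(a + b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

    def mp_star(A):
        """A* = E ⊕ A ⊕ ... ⊕ A^(⊗n); enough terms when G(A) has no positive circuit."""
        n = len(A)
        star = [[0 if i == j else NEG_INF for j in range(n)] for i in range(n)]
        P = [row[:] for row in A]
        for _ in range(n):
            star = [[max(s, p) for s, p in zip(sr, pr)] for sr, pr in zip(star, P)]
            P = mp_matmul(P, A)
        return star

    def companion(A0, higher):
        """Block matrix of (1.18): first block row A0*⊗A1, ..., A0*⊗AM; max-plus
        identity blocks on the block sub-diagonal; ε everywhere else."""
        n, M = len(A0), len(higher)
        A0_star = mp_star(A0)
        big = [[NEG_INF] * (n * M) for _ in range(n * M)]
        for m, Am in enumerate(higher):                 # first block row
            blk = mp_matmul(A0_star, Am)
            for i in range(n):
                for j in range(n):
                    big[i][m * n + j] = blk[i][j]
        for m in range(M - 1):                          # identity blocks
            for i in range(n):
                big[(m + 1) * n + i][m * n + i] = 0
        return big

    # usage: with x(k) = (x'(k-1), ..., x'(k-M))', iterate x(k+1) = companion(A0, [A1, ..., AM]) ⊗ x(k)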
It will come as no surprise that problems of this form are closely related to the concept of
eigenvalues and eigenvectors studied in the previous section. For example, if the recurrence
relation x(k + 1) = A ⊗ x(k) is given the initial condition x(0), where x(0) is an eigenvector of
A with corresponding eigenvalue λ, then the solution x(k) is given by x(k) = λ⊗k ⊗ x(0). It
could then be said that the solution is periodic. The final section of this chapter explores the
limiting behaviour of the solution x(k) when the system is initialised with an arbitrary vector
x(0), and in particular whether we can say anything about its periodicity in general.
1.5.2 Limiting Behaviour
In this section we state and prove a theorem which establishes an important result on the
asymptotic behaviour of the powers of an irreducible matrix A in terms of its unique eigenvalue
λ. In simple terms, this theorem says that sequential powers of A always exhibit periodic
behaviour after a finite number of steps. We will then apply this result to the recurrence
relations we studied in the previous section. It turns out that the periodicity depends on a
quantity known as the cyclicity of A, which we define below in two steps.
Definition 1.41. The cyclicity of a graph G, denoted σG, is defined as follows:
• If G is strongly connected, then its cyclicity equals the greatest common divisor of the
lengths of all the elementary circuits in G. If G consists of just one vertex without a
self-loop, then its cyclicity is taken to be 1.
• If G is not strongly connected, then its cyclicity equals the least common multiple of the
cyclicities of all the maximal strongly connected subgraphs of G.
Definition 1.42. The cyclicity of a matrix A ∈ Rn×n max, denoted σ(A), is equal to σGc(A), the
cyclicity of the critical graph of A.

If A is a square matrix over Rmax then we often talk of the graph cyclicity and the matrix
cyclicity of A, where the graph cyclicity refers to the cyclicity of the communication graph G(A)
and the matrix cyclicity to σ(A) as just defined.
It may seem strange to define the cyclicity of a matrix A via its critical graph and not its
communication graph. However, as we will see below, it turns out that the former quantity
determines the periodic behaviour of the powers of A, so the reason for this choice should be
clear.
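Definition 1.41 can be turned into a small procedure. The sketch below (Python; the function name cyclicity and its data layout are our own choices) takes the gcd of circuit lengths within each maximal strongly connected subgraph and then the least common multiple over the subgraphs; applied to the critical graph of the worked example of Section 1.4.3 (edges 2 → 3, 3 → 2 and the self-loop at 4) it returns 2, so σ(A) = 2 for that example.

    from math import gcd
    from functools import reduce

    def cyclicity(succ):
        """Cyclicity of a directed graph (Definition 1.41); succ maps each
        vertex to the set of its successors."""
        verts = sorted(succ)
        n = len(verts)
        walks = {}                                  # k -> pairs (u, v) joined by a walk of length k
        cur = {(u, v) for u in verts for v in succ[u]}
        for k in range(1, n + 1):                   # elementary circuits have length at most n
            walks[k] = set(cur)
            cur = {(u, w) for (u, v) in cur for w in succ[v]}
        reach = set().union(*walks.values()) if walks else set()
        sccs = {frozenset(v for v in verts
                          if v == u or ((u, v) in reach and (v, u) in reach))
                for u in verts}                     # maximal strongly connected subgraphs
        result = 1
        for scc in sccs:
            lengths = [k for k in range(1, n + 1) for v in scc if (v, v) in walks[k]]
            g = reduce(gcd, lengths, 0) or 1        # a component without a circuit has cyclicity 1
            result = result * g // gcd(result, g)   # lcm of the component cyclicities
        return result

    critical_graph = {2: {3}, 3: {2}, 4: {4}}       # Gc(A) from Section 1.4.3
    print(cyclicity(critical_graph))                # 2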
Before proving the main theorem of this section we require several preliminary results. The first
one is an important lemma from graph theory, which we explore below.
Lemma 1.43. Let A ∈ Rn×n max be an irreducible matrix, and let the cyclicity of its communication
graph be σG. Then, after a suitable relabelling of the vertices of G(A), the matrix A⊗σG
corresponds to a block diagonal matrix with σG blocks on the diagonal. The communication graph
of each diagonal block is strongly connected and has cyclicity one. Moreover, the eigenvalues of
all diagonal blocks have the same value.
Proof. For i, j ∈ V(A), define the relation

i ∼ j ⇐⇒ the length of every path from i to j in G(A) is a multiple of σG.

It is easy to show that this is an equivalence relation on V(A). Therefore if k0 ∈ V(A) is fixed,
we can introduce equivalence classes C0, C1, . . . , CσG−1 as

i ∈ Cl ⇐⇒ every path from k0 to i in G(A) has length (mod σG) equal to l,    (1.19)

for l = 0, 1, . . . , σG − 1. Then for i, j ∈ V(A), we have that i ∼ j ⇐⇒ i, j ∈ Cl for some
l = 0, 1, . . . , σG − 1.
Assume that there is a path from i to j of length σG. By definition of cyclicity, the length of any
circuit starting and finishing at i must be divisible by σG, so there must also be a path from j
to i whose length is a multiple of σG. Therefore every path from i to j must have a length that
is a multiple of σG (since if not, we could use such a path to create a circuit whose length is not
divisible by σG). Hence, every path of length σG must start and end in the same equivalence
class as defined in (1.19). Since A⊗σG can be computed by considering all paths of length σG in
G(A) (see Theorem 1.16), it follows that A⊗σG is block-diagonal, possibly after an appropriate
relabelling of the vertices according to the classes C0, C1, . . . , CσG−1; for instance, by first labelling
all vertices in C0, then all the vertices in C1, and so on.
Now let l ∈ {0, 1, . . . , σG − 1}. From our remark above we know that if i, j ∈ Cl then i ∼ j, i.e.
the length of every path from i to j is a multiple of σG. Since A is irreducible there must be at
least one such path, which can be split up into a number of subpaths, all of length σG and going
from one vertex in Cl to another vertex in Cl. It follows that the block of A⊗σG corresponding
to class Cl is irreducible.
Next, note that every circuit in G(A) must go through all the equivalence classes C0, C1, . . . , CσG−1.
To see this, suppose there is a circuit going through just τ of the classes, where τ < σG. Then
there must be a class Cl and vertices i, j ∈ Cl such that there is a path from i to j of length
less than or equal to τ. This is a contradiction, since the length of a path between vertices in
the same class must be a multiple of σG. Hence the number of circuits in G(A) is the same
as the number of circuits going through any class Cl. Observe that circuits in G(A) of length
κ × σG can be associated with circuits in G(A⊗σG) of length κ. Since the greatest common
divisor of the lengths of all circuits in G(A) is σG, it follows that the communication graph of the block in
A⊗σG corresponding to class Cl has cyclicity one.
Finally, the fact that the eigenvalues of the diagonal blocks are identical follows immediately
from the irreducibility of A.
Corollary 1.44. Under the conditions of Lemma 1.43, let τ be a multiple of σG. Then, after
a relabelling of the vertices of G(A), the matrix A⊗τ corresponds to a block diagonal matrix
with σG blocks on the diagonal. The communication graph of each diagonal block is strongly
connected and has cyclicity one.
Proof. This follows along the same lines as the proof of Lemma 1.43.
Let A ∈ Rn×n max be an irreducible matrix and let Gc(A) be its critical graph. Define the critical
matrix of A, denoted Ac, to be the submatrix of A such that the communication graph of Ac is
equal to the critical graph of A, i.e. G(Ac) = Gc(A). Matrix Ac can be obtained from matrix A
by restricting A to those entries that correspond to edges in Gc(A). Clearly the critical graph of
Ac is the same as its communication graph, i.e. Gc(Ac) = G(Ac), and therefore σGc(Ac) = σG(Ac).
It then follows that the cyclicity of the matrix Ac is equal to the cyclicity of the communication
graph G(Ac) (i.e. σ(Ac) = σG(Ac)); that is, for the critical matrix Ac both types of cyclicity
coincide and are equal to σ(A). We know that G(Ac) = Gc(A) = Gc(Ac), but we can prove
more:
Lemma 1.45. Let A be an irreducible matrix, and let Ac be its corresponding critical matrix.
Then, for all k ≥ 1 we have

G((Ac)⊗k) = Gc(A⊗k) = Gc((Ac)⊗k).
Proof. As we noted above, Ac is a submatrix of A, and therefore (Ac)⊗k is a submatrix of A⊗k.
Furthermore, note that Gc(·) is a subgraph of G(·), which we shall denote Gc(·) ⊆ G(·). It follows
that Gc((Ac)⊗k) ⊆ Gc(A⊗k) and Gc((Ac)⊗k) ⊆ G((Ac)⊗k).
To prove the converse inclusions, note that any edge in G(A⊗k) from vertex i to vertex j
corresponds to a path in G(A) of length k from vertex i to vertex j. Thus if a number of edges
in G(A⊗k) form a circuit of length l, then the corresponding paths in G(A) form a circuit of
length k×l. Conversely, consider a circuit in G(A), choose any vertex on the circuit and traverse
the circuit with steps of length k until the chosen vertex is reached again. If l such steps are
required then there is a corresponding circuit in G(A⊗k) of length l. In the same way, critical
circuits in G(A⊗k) of length l correspond to critical circuits in G(A) of length k × l, and vice
versa.
If c is a critical circuit of length l in G(A⊗k) then there is a corresponding critical circuit c of
length k ×l in G(A). This circuit must be in Gc(A) (because it is critical), which in turn implies
that c is a critical circuit in G((Ac)⊗k). Hence, it follows that Gc((Ac)⊗k) ⊇ Gc(A⊗k). The other
inclusion is proved in the same way.
Lemma 1.46. Let A ∈ Rn×n max be an irreducible matrix with cyclicity σ = σ(A). Then the
cyclicity of the matrix A⊗σ is equal to one.
Proof. Firstly, suppose the critical matrix Ac is irreducible. By the remarks prior to Lemma 1.45
we know that the cyclicity of Ac and that of its communication graph is equal to σ, so by Lemma
1.43, after a suitable relabelling of vertices, (Ac)⊗σ corresponds to a block diagonal matrix with
square diagonal blocks that are irreducible and have graph cyclicity one. However, by Lemma
1.45 with k = σ, we have that Gc((Ac)⊗σ) = G((Ac)⊗σ), and therefore the communication graph
of each of the diagonal blocks of (Ac)⊗σ coincides with its critical graph. Thus for each diagonal
block both cyclicities coincide, and therefore both are one.
If Ac is reducible then the same process can be done for each of the critical classes of Gc(A) with
their individual cyclicities. According to Definition 1.41, the least common multiple of these
cyclicities equals σ, the matrix cyclicity of A. Noting that σ is a multiple of σG(A), it follows
from Corollary 1.44 that each diagonal block of (Ac)⊗σ corresponds to a block diagonal matrix
with square diagonal blocks that are irreducible and have cyclicity one. Note that if Gc(A) does
not cover all the vertices of G(A) then we must augment the overall block diagonal matrix with
a square block with entries equal to ε in order to keep it the same size as the original matrix A.
In both cases it follows that each diagonal block of the block diagonal matrix corresponding to
(Ac)⊗σ is irreducible and has cyclicity one. Taking the least common multiple of all cyclicities,
this means that the cyclicity of the whole matrix (Ac)⊗σ is equal to one, and therefore the graph
cyclicity of Gc((Ac)⊗σ) is also equal to one. But by Lemma 1.45 with k = σ, this graph is the
same as Gc(A⊗σ), which therefore must also have cyclicity one. Thus A⊗σ has matrix cyclicity
one, which completes the proof.
We now state a fundamental theorem, the proof of which can be found in [4].
Theorem 1.47. Let β1, . . . , βq ∈ N be such that gcd{β1, . . . , βq} = 1. Then there exists N ∈ N
such that for all k ≥ N there exist n1, . . . , nq ∈ N0 such that k = (n1 × β1) + · · · + (nq × βq).
We finally state and prove one last prerequisite result which is essentially a special case of the
theorem that follows. It turns out that the generalisation is relatively straightforward, so in
proving this lemma we will have done most of the work in proving the main result.
Lemma 1.48. Let A ∈ Rn×n max be an irreducible matrix with unique eigenvalue e and cyclicity
one. Then there exists N ∈ N such that

A⊗(k+1) = A⊗k

for all k ≥ N.
Proof. The proof comes in three stages. We show that there exists N ∈ N such that for all
k ≥ N:

1. [A⊗k]ii = [A+]ii = e for all i ∈ Vc(A),
2. [A⊗k]ij = [A+]ij for all i ∈ Vc(A) and j ∈ {1, . . . , n},
3. [A⊗k]ij = ⊕_{l∈Vc(A)} [A+]il ⊗ [A+]lj for all i, j ∈ {1, . . . , n}.

The result then follows immediately from statement 3 since the right hand side does not depend
on k.
Statement 1. Consider i ∈ Vc(A). Then there is a critical class of Gc(A), say Gc1(A) =
(Vc1(A), Ec1(A)), such that i ∈ Vc1(A). Since the cyclicity of matrix A is one, it follows that
the cyclicity of graph Gc1(A) is equal to one too. Hence there exist circuits in Gc1(A), say
c1, . . . , cq, whose lengths have a greatest common divisor equal to one. Since Gc1(A) is a critical
class it must be strongly connected, and therefore there exists a circuit α in Gc1(A) that passes
through i and through all circuits c1, . . . , cq (i.e. α ∩ cj ≠ ∅ ∀j = 1, . . . , q).

Now, by Theorem 1.47, there exists N ∈ N such that for each k ≥ N, there exist n1, . . . , nq ∈ N0
such that

k = |α|l + (n1 × |c1|l) + · · · + (nq × |cq|l).

For these n1, . . . , nq, we can construct a circuit passing through i, built from circuit α, n1 copies
of circuit c1, n2 copies of circuit c2 and so on, up to nq copies of circuit cq. Clearly this circuit
is in Gc1(A), so it must be critical with weight e. Since the maximal average circuit weight in
G(A) is e, it follows that [A⊗k]ii = e for all k ≥ N, which, by the definition of A+, also implies
that [A+]ii = e, as required.
Statement 2. By the definition of A+ there exists l ∈ N such that [A⊗l]ij = [A+]ij. In fact, since
the eigenvalue of A is e, it follows from Lemma 1.17 that l ≤ n. From statement 1, for k large
enough, i ∈ Vc(A) and j ∈ {1, . . . , n}, we then have

[A⊗(k+l)]ij ≥ [A⊗k]ii ⊗ [A⊗l]ij = [A⊗l]ij = [A+]ij.
In addition, clearly we also have

[A+]ij = ⊕_{m=1}^{∞} [A⊗m]ij ≥ [A⊗(k+l)]ij ≥ [A+]ij,

so by replacing k + l with k, it therefore follows that [A⊗k]ij = [A+]ij for all i ∈ Vc(A) and
j ∈ {1, . . . , n}, with k large enough. This is what we wanted to prove.
Statement 3. Following the same lines as in the proof of statement 2, we can also show that
[A⊗m]ij = [A+]ij for all i ∈ {1, . . . , n}, j ∈ Vc(A) and with m large enough. Together, take k
and m large enough such that [A⊗k]il = [A+]il and [A⊗m]lj = [A+]lj for all l ∈ Vc(A). Then

[A⊗(k+m)]ij ≥ [A⊗k]il ⊗ [A⊗m]lj = [A+]il ⊗ [A+]lj,
for all l ∈ Vc(A). By replacing k + m with k, it follows that for k large enough

[A⊗k]ij ≥ ⊕_{l∈Vc(A)} [A+]il ⊗ [A+]lj.
Now let the maximal average weight of a non-critical circuit (i.e. a circuit not passing through
any vertex in Vc(A)) be δ. Then the weight of a path from j to i of length k + 1 in G(A) not
passing through any vertex in Vc(A) can be bounded above by [A+]ij + (k × δ) = [A+]ij ⊗ δ⊗k,
since such a path consists of an elementary path from j to i (whose weight is bounded above by
[A+]ij) and at most k non-critical circuits (whose weights are each bounded above by δ). Since
the maximal average circuit weight in G(A) is e we must have δ < e, and so for k large enough
[A+]ij ⊗ δ⊗k ≤ ⊕_{l∈Vc(A)} [A+]il ⊗ [A+]lj.
Indeed, the right-hand side is fixed, while the left-hand side tends to ε as k → ∞. Hence for k
large enough we have that
[A⊗k]ij = ⊕_{l∈V(A)} [A+]il ⊗ [A+]lj = ⊕_{l∈Vc(A)} [A+]il ⊗ [A+]lj,

for all i, j = 1, . . . , n.
We can now state and prove the main theorem of this section.
Theorem 1.49. Let A ∈ Rn×n max be an irreducible matrix with unique eigenvalue λ and cyclicity
σ := σ(A). Then there exists N ∈ N such that

A⊗(k+σ) = λ⊗σ ⊗ A⊗k

for all k ≥ N.
Proof. Consider the matrix B := (Aλ)⊗σ. Recall that σ is the cyclicity of the critical graph of
A, which is a multiple of the cyclicity of the communication graph G(A). By Corollary 1.44,
after a suitable relabelling of the vertices of G(A), matrix B is a block diagonal matrix with
square diagonal blocks whose communication graphs are strongly connected and have cyclicity
one. By Lemma 1.46 we have that the cyclicity of B is one, which implies that the cyclicity of
each of its diagonal blocks is one. Hence by applying Lemma 1.48 to each diagonal block, it
ultimately follows that there exists M ∈ N such that B⊗(l+1) = B⊗l for all l ≥ M. That is,
((Aλ)⊗σ)⊗(l+1) = ((Aλ)⊗σ)⊗l,

which can further be written as (Aλ)⊗(l×σ+σ) = (Aλ)⊗(l×σ), or

A⊗(l×σ+σ) = λ⊗σ ⊗ A⊗(l×σ),

for all l ≥ M. Finally, note that A⊗(l×σ+j+σ) = λ⊗σ ⊗ A⊗(l×σ+j) for any 0 ≤ j ≤ σ − 1, implying
that for all k ≥ N := M × σ it follows that

A⊗(k+σ) = λ⊗σ ⊗ A⊗k,

as required.
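Theorem 1.49 is easy to observe numerically. In the sketch below (Python; variable names are ours) we use the matrix A of Section 1.4.3, for which λ = 6 and σ(A) = 2 (the two critical classes have cyclicities 2 and 1, whose least common multiple is 2); the theorem then predicts A⊗(k+2) = 12 ⊗ A⊗k once k exceeds the transient.

    NEG_INF = float("-inf")

    def mp_matmul(A, B):
        return [[max(a + b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

    A = [[NEG_INF, -2, NEG_INF, 6],
         [1, NEG_INF, 4, NEG_INF],
         [NEG_INF, 8, NEG_INF, NEG_INF],
         [NEG_INF, 5, NEG_INF, 6]]
    lam, sigma = 6, 2

    powers = [A]                                   # powers[k-1] = A^(⊗k)
    for _ in range(30):
        powers.append(mp_matmul(powers[-1], A))

    k = 25                                         # comfortably beyond the transient for this example
    lhs = powers[k + sigma - 1]                    # A^(⊗(k+σ))
    rhs = [[lam * sigma + x for x in row] for row in powers[k - 1]]   # λ^(⊗σ) ⊗ A^(⊗k)
    print(lhs == rhs)                              # True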
Theorem 1.49 can be seen as the max-plus analogue of the Perron-Frobenius theorem in con-
ventional linear algebra. Strictly speaking it is the normalised matrix Aλ that exhibits periodic
behaviour, since the unique eigenvalue of Aλ is e = 0, and then Aλ⊗(k+σ) = Aλ⊗k for k sufficiently
large. However, we use the term ‘periodic’ to describe the more general behaviour seen here.
Note that the cyclicity of A is the smallest possible length of such periodic behaviour (see [2] for
the proof of this). For our purposes, we now move on to applying this result to the recurrence
relations studied in Section 1.5.1.
Recall the form of the basic first-order recurrence relation
x(k + 1) = A ⊗ x(k), k ≥ 0, (1.20)
which has the solution
x(k) = A⊗k ⊗ x(0).
We can apply Theorem 1.49 in this context to give us that for k sufficiently large:

x(k + σ(A)) = A⊗(k+σ(A)) ⊗ x(0) = λ⊗σ(A) ⊗ A⊗k ⊗ x(0) = λ⊗σ(A) ⊗ x(k).
That is, the solution x(k) is periodic with period σ(A). If we interpret k as a time index,
then also by Theorem 1.49, the solution enters this periodic regime after N =: t(A) time steps,
where t(A) is called the transient time of A. In particular, if A has cyclicity equal to 1 then
x(k+1) = A ⊗ x(k) = λ ⊗ x(k) ∀k ≥ t(A), and so for k sufficiently large x(k) effectively becomes
an eigenvector of A. In other words, after t(A) time steps, x(k) behaves like an eigenvector, and
the effect of the initial condition x(0) has died out.
Note that the transient time of a matrix can be large even for systems of small dimension. For
example, the matrix A defined by

A = ( −1  −N
       e   e  )

where N ∈ {2, 3, . . . }, has transient time t(A) = N, while its cyclicity is clearly 1.
Finally, we make some observations regarding the growth rate of the solution x(k). Note that
if we take x(0) = v in (1.20), where v is an eigenvector of A, then we immediately obtain that
for all j = 1, . . . , n:
lim_{k→∞} xj(k)/k = λ,
where λ is the unique eigenvalue of A. By applying Theorem 1.49 it should be clear that this
holds true for any initial value x(0) and not just for eigenvectors; indeed this result is proved
in [13]. We therefore say that the solution has an asymptotic growth rate of λ. Assuming
irreducibility, all recurrence relations over max-plus exhibit this behaviour, regardless of the
choice of the matrix A!
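This asymptotic growth rate is easy to see experimentally. The sketch below (Python; names are ours) iterates x(k + 1) = A ⊗ x(k) for the matrix of Section 1.4.3 from an arbitrary, non-eigenvector initial condition and prints xj(k)/k, which should be close to λ = 6 in every component.

    NEG_INF = float("-inf")

    def mp_matvec(A, x):
        return [max(a + xj for a, xj in zip(row, x)) for row in A]

    A = [[NEG_INF, -2, NEG_INF, 6],
         [1, NEG_INF, 4, NEG_INF],
         [NEG_INF, 8, NEG_INF, NEG_INF],
         [NEG_INF, 5, NEG_INF, 6]]

    x, K = [0, 0, 0, 0], 200
    for _ in range(K):
        x = mp_matvec(A, x)
    print([xj / K for xj in x])    # every ratio is close to 6, differing only by O(1/K)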
Chapter 2
Petri Nets and Timed Event Graphs
2.1 A Motivating Example
The following example is adapted from chapter 1 of [2]. Consider a manufacturing system
consisting of three machines M1, M2 and M3, which produces three kinds of parts P1, P2 and
P3 according to different product mixes. The manufacturing process for each part is depicted
below.
Figure 2.1: Manufacturing Process for each part. Grey boxes represent the three machines;
arrows represent the routes that the different parts must take in their respective manufacture.
Processing times are different for each machine and each part, and are given in the following
table:
P1 P2 P3
M1 - 1 5
M2 3 2 3
M3 4 3 -
Table 2.1: Processing times for each part at each machine (arbitrary time units). Blank entries
correspond to combinations of machine & part that do not form part of the manufacturing
process.
Parts are carried through the manufacturing process on a limited number of pallets. We make
the following assumptions:
1. Only one pallet is available for each part type.
2. Once production of a part is completed, it is removed from its respective pallet and the
pallet returns to the beginning of the production line.
3. There are no set-up times or traveling times between machines.
4. The sequencing of part types on the machines is fixed, and for M1 is (P2, P3), for M2
(P1, P2, P3) and for M3 (P1, P2).
Assumption (3) gives no loss of generality since if set-up times or traveling times did exist,
we could combine them with the processing time at the appropriate machine. Assumption (4)
means that machines have to wait for the appropriate part rather than starting work on any
part that arrives first (see below for an example). This may or may not be realistic; extensions
to the theory presented below in which this assumption is dropped are discussed in chapter 9
of [2].
We can model the time evolution of this system by considering the time that each machine starts
working on the k-th part of type i, for i = 1, 2, 3 and k ∈ N. There are seven combinations of
machines and parts, so we define x(k) = (x1(k), . . . , x7(k)) as follows:
Variable xi(k) Definition
x1(k) time that M1 starts working on the k-th unit of P2
x2(k) time that M1 starts working on the k-th unit of P3
x3(k) time that M2 starts working on the k-th unit of P1
x4(k) time that M2 starts working on the k-th unit of P2
x5(k) time that M2 starts working on the k-th unit of P3
x6(k) time that M3 starts working on the k-th unit of P1
x7(k) time that M3 starts working on the k-th unit of P2
Table 2.2: Definitions of each entry of the state vector x(k), for k ∈ N.
By examining the production process, work by each machine on the (k+1)-st part is constrained
in the following way:
x1(k + 1) ≥ max{x7(k) + 3, x2(k) + 5}
x2(k + 1) ≥ max{x5(k) + 3, x1(k + 1) + 1}
x3(k + 1) ≥ max{x6(k) + 4, x5(k) + 3}
x4(k + 1) ≥ max{x3(k + 1) + 3, x1(k + 1) + 1}
x5(k + 1) ≥ max{x2(k + 1) + 5, x4(k + 1) + 2}
x6(k + 1) ≥ max{x3(k + 1) + 3, x7(k) + 3}
x7(k + 1) ≥ max{x6(k + 1) + 4, x4(k + 1) + 2}
For example, the inequality for x6(k + 1) comes from the fact that M3 cannot start working on
the (k + 1)-st unit of P1 until it has finished working on the k-th unit of P2, and until M2 has
finished working on the (k + 1)-st unit of P1.
If each machine is to start work as early as possible, the inequalities above will actually be
equalities. This is where the theory of max-plus algebra comes to the fore. We can write the
system in max-plus matrix form as
x(k + 1) = A0 ⊗ x(k + 1) ⊕ A1 ⊗ x(k)
where
A0 = (  ε  ε  ε  ε  ε  ε  ε
        1  ε  ε  ε  ε  ε  ε
        ε  ε  ε  ε  ε  ε  ε
        1  ε  3  ε  ε  ε  ε
        ε  5  ε  2  ε  ε  ε
        ε  ε  3  ε  ε  ε  ε
        ε  ε  ε  2  ε  4  ε  )

and

A1 = (  ε  5  ε  ε  ε  ε  3
        ε  ε  ε  ε  3  ε  ε
        ε  ε  ε  ε  3  4  ε
        ε  ε  ε  ε  ε  ε  ε
        ε  ε  ε  ε  ε  ε  ε
        ε  ε  ε  ε  ε  ε  3
        ε  ε  ε  ε  ε  ε  ε  )
This is a first-order recurrence relation like we have seen in Section 1.5. A quick examination of
G(A0) shows that it does not contain any circuits of positive weight (indeed it does not contain
any circuits at all), and therefore we can apply Theorem 1.40 to find the unique solution

x(k + 1) = A∗0 ⊗ A1 ⊗ x(k) = B ⊗ x(k)    (2.1)

where B := A∗0 ⊗ A1, or explicitly:
B = (  ε   5  ε  ε   ε   ε   3
       ε   6  ε  ε   3   ε   4
       ε   ε  ε  ε   3   4   ε
       ε   6  ε  ε   6   7   4
       ε  11  ε  ε   8   9   9
       ε   ε  ε  ε   6   7   3
       ε   8  ε  ε  10  11   7  )
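The matrix B can be reproduced mechanically from A0 and A1. The sketch below (Python; mp_star, mp_matmul and eps are our own names) computes A∗0 as E ⊕ A0 ⊕ · · · ⊕ A0⊗7, which is sufficient here because G(A0) has no circuits at all, and then forms B = A∗0 ⊗ A1.

    eps = float("-inf")   # ε

    def mp_matmul(A, B):
        return [[max(a + b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

    def mp_star(A):
        n = len(A)
        star = [[0 if i == j else eps for j in range(n)] for i in range(n)]
        P = [row[:] for row in A]
        for _ in range(n):
            star = [[max(s, p) for s, p in zip(sr, pr)] for sr, pr in zip(star, P)]
            P = mp_matmul(P, A)
        return star

    A0 = [[eps, eps, eps, eps, eps, eps, eps],
          [1,   eps, eps, eps, eps, eps, eps],
          [eps, eps, eps, eps, eps, eps, eps],
          [1,   eps, 3,   eps, eps, eps, eps],
          [eps, 5,   eps, 2,   eps, eps, eps],
          [eps, eps, 3,   eps, eps, eps, eps],
          [eps, eps, eps, 2,   eps, 4,   eps]]

    A1 = [[eps, 5,   eps, eps, eps, eps, 3],
          [eps, eps, eps, eps, 3,   eps, eps],
          [eps, eps, eps, eps, 3,   4,   eps],
          [eps, eps, eps, eps, eps, eps, eps],
          [eps, eps, eps, eps, eps, eps, eps],
          [eps, eps, eps, eps, eps, eps, 3],
          [eps, eps, eps, eps, eps, eps, eps]]

    B = mp_matmul(mp_star(A0), A1)
    for row in B:
        print(row)        # row by row, this reproduces the matrix B displayed above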
If numerical values of x1(1), . . . , x7(1) are given then these values constitute the initial condition,
and the future evolution of the system is uniquely determined. There are no restrictions on
x(1) from a mathematical point of view, but given the physical interpretation of the system,
limitations do exist. For example, if we assume that all three pallets start at the beginning of
their respective production lines (with M1 working on P2 first), we have x1(1) = x3(1) = 0, but
x2(1) cannot be less than 1 since M1 has to finish working on P2 before it can start working on
P3.
Note that if we had allowed more than one pallet on any of the three production lines then the
system would have been of higher order (for example, if the production line of P1 had three
pallets then work on the (k +1)-st unit could start once the (k −2)-th unit had been produced).
This system would be solvable using the techniques developed at the end of Section 1.5.
Another possible extension would be to incorporate variable processing times rather than the
constant values given in table 2.1. The processing times could vary according to how many
parts the machines have already processed (i.e. vary with k), or they could exhibit stochastic
variability (i.e. following some specified probability distribution). The first type of variability
will be introduced with the basic autonomous equation below; stochastic event graph theory
will be discussed in Chapter 3.
Note that since we can describe the evolution of the system by a recurrence relation of the form
(2.1), we might expect that we can apply Theorem 1.49 to see that the system settles down into
a periodic regime after a finite length of time. However, upon closer inspection we see that the
matrix B has columns consisting entirely of ε (columns 1, 3 and 4), so it is not irreducible and thus Theorem 1.49 does not apply. Later
on in this chapter we will discuss some techniques which ensure that the evolution equation does
involve an irreducible matrix and therefore enables us to draw the relevant conclusions.
To end this introductory example, note that the way we have modeled our system does not
immediately give us the production times of the k-th unit of P1, P2 and P3. In order to find
these we could introduce an output vector y(k) = (y1(k), y2(k), y3(k)) defined by
y(k) = C ⊗ x(k)
where
C = (  ε  ε  ε  ε  ε  4  ε
       ε  ε  ε  ε  ε  ε  3
       ε  ε  ε  ε  3  ε  ε  )
Left multiplication by C adds the appropriate processing time to the starting time at the last
machine in the production line of each part. Thus yi(k) gives us the time of production of the
k-th unit of part Pi.
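As a small, purely illustrative sketch (the state vector below is made up for illustration and is not one singled out in the text), the output map is just one more max-plus matrix-vector product:

    eps = float("-inf")

    C = [[eps, eps, eps, eps, eps, 4,   eps],   # P1 is finished 4 time units after M3 starts it
         [eps, eps, eps, eps, eps, eps, 3],     # P2 is finished 3 time units after M3 starts it
         [eps, eps, eps, eps, 3,   eps, eps]]   # P3 is finished 3 time units after M2 starts it

    def mp_matvec(M, v):
        return [max(m + vj for m, vj in zip(row, v)) for row in M]

    x_k = [0, 1, 0, 3, 6, 3, 7]     # a hypothetical state x(k)
    print(mp_matvec(C, x_k))        # [7, 10, 9]: completion times of the k-th P1, P2, P3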
In the following section we will introduce the concept of timed event graphs, which are the tools
we will use to model discrete event systems such as the production line we have considered here.
2.2 Preliminaries of Event Graph Theory
2.2.1 Definitions and Set-up
As we have seen above, max-plus algebra allows us to describe the evolution of events on
a network subject to synchronisation constraints. In our example, a part moving from one
machine to the next is an event. An appropriate tool to model events on a certain class of
networks is known as a Petri net. We will focus on a certain type of Petri net called an event
graph, which can be modeled by max-plus linear recurrence relations of the form discussed in
Section 1.5. We start by defining the relevant terms and setting out some notation. In order to
fully appreciate all the concepts we introduce, it may be helpful to read this section alongside
the example that follows (Section 2.2.2).
Definition 2.1. Let G = (V, E) be a graph and let i, j ∈ V. We say that i is a predecessor (or
an upstream vertex) of j if (i, j) ∈ E, and that i is a successor (or a downstream vertex) of j
if (j, i) ∈ E.
Definition 2.2. A Petri net is a pair (G, µ) where G = (V, E) is a directed graph and µ is a
vector, satisfying the following properties:
(i) G is bipartite, i.e. V is partitioned into two disjoint sets P and Q (called places and
transitions respectively) such that E only consists of edges of the form (pi, qj) and (qj, pi),
with pi ∈ P and qj ∈ Q.
(ii) µ is a |P|-vector of non-negative integers, known as the initial marking.
Definition 2.3. An event graph is a Petri net in which every place has exactly one upstream
and downstream transition.
Notation. For general i ∈ V, we let π(i) denote the set of all predecessors of i and σ(i) denote
the set of all successors of i. In the case of Petri nets and event graphs, when we want to work
with indices we will sometimes use the following additional notation: if pi ∈ π(qj), we write
i ∈ πq(j), and if qj ∈ π(pi), we write j ∈ πp(i). Similarly, if pi ∈ σ(qj), we write i ∈ σq(j), and
if qj ∈ σ(pi), we write j ∈ σp(i). Note that in the case of an event graph, for any place pi we
have that |πp(i)| = |σp(i)| = 1, so we often allow the abuse of notation πp(i) = j (as opposed
to πp(i) = {j}).
We can think of places as conditions and transitions as events. For example, a machine working
on a part is a place, and a transition occurs when the part moves on to the next machine. Each
place has an associated marking (given initially by the vector µ) which indicates whether or not
the condition has been fulfilled, e.g. whether or not a machine is working on a given part type.
Equivalently we say that each place has an associated number of tokens, which can be thought
of as the number of data items or resources available at each place. In our example each place
can have either 0 or 1 tokens, but in general there can be any amount (e.g. if machines are
capable of working on more than one part at once).
We say that a transition is enabled if each of its upstream places contains at least one token.
When this is the case the transition fires, meaning that one token is removed from each of its
upstream places and one token is added to each of its downstream places. If the initial marking
is µ, the firing of a transition qj gives a new marking µ′ defined by

µ′i = µi − 1   if pi ∈ π(qj)
µ′i = µi + 1   if pi ∈ σ(qj)
µ′i = µi        otherwise

In this case we say that the marking µ′ is reachable from µ. It is easy to see that for a general
Petri net the total number of tokens can change when a transition fires; for example a transition
may have one upstream place but two downstream places, in which case the transition firing
causes the total number of tokens to increase by one. Furthermore, note that the definition of an
event graph allows for input and output transitions (known as sources and sinks respectively),
i.e. transitions that do not have any upstream or downstream places. Source transitions are
enabled by the outside world and deliver tokens into the system; sink transitions remove tokens
from the system completely. The following definition makes an important distinction between
two types of event graph:
Definition 2.4. An event graph is autonomous if it contains no source transitions, and non-
autonomous otherwise.
The important property of event graphs is that they do not allow for the modelling of conflicts; that is,
a token in a given place can be consumed by only one predetermined transition. The ‘opposite’
to an event graph (i.e. a Petri net in which each transition has exactly one upstream place
and one downstream place), known as a state machine, does allow for this competition element
but does not admit synchronisation. It can be shown that state machines are equivalent to
the automata studied in computer science, which shows that Petri nets in general have more
modelling power than automata.
Up until now, the theory we have introduced is only concerned with the ordering of events. If
we wish to investigate network performance, it is necessary to introduce time. There are two
ways in which this could be done: we can either associate durations with transition firings, or
holding times with places. In fact, in many applications it could be that both times are present;
for example the real-life manufacturing system in Section 2.1 would exhibit travel times as well
as processing times. However, as we noted before, by incorporating the firing times into the
holding times at places, in the case of event graphs it may be assumed without loss of generality
that the firing times are equal to 0. We therefore introduce the concept of timed event graph
below.
Definition 2.5. A timed event graph is an event graph endowed with a |P|-vector α of holding
times associated with each place.
Note that the definition of a timed event graph does not uniquely determine all future firing
times. This is because the initial marking does not specify how long each token has spent
in its respective place. We will deal with this more fully when we come to look at the basic
autonomous equation in the next section.
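Before moving on, the enabling and firing rule described earlier in this section can be captured in a few lines of code. The sketch below (Python; the data layout and names are our own, and the two-transition example is invented purely for illustration) encodes each transition by its upstream and downstream places and a marking by a vector of token counts.

    from collections import namedtuple

    Transition = namedtuple("Transition", ["upstream", "downstream"])   # lists of place indices

    def enabled(t, marking):
        """A transition is enabled iff each of its upstream places holds a token."""
        return all(marking[p] >= 1 for p in t.upstream)

    def fire(t, marking):
        """Remove one token from each upstream place, add one to each downstream
        place, and return the new (reachable) marking."""
        new = list(marking)
        for p in t.upstream:
            new[p] -= 1
        for p in t.downstream:
            new[p] += 1
        return new

    # Two transitions and two places arranged in a cycle, so every place has
    # exactly one upstream and one downstream transition (an autonomous event graph).
    q1 = Transition(upstream=[0], downstream=[1])
    q2 = Transition(upstream=[1], downstream=[0])
    mu = [1, 0]                               # initial marking: a single token in place 0
    print(enabled(q1, mu), enabled(q2, mu))   # True False
    print(fire(q1, mu))                       # [0, 1]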
2.2.2 A Simple Example
To consolidate all of this theory, consider this simple example. A train network connects the
main stations of two cities. There are two routes from station S1 to station S2; one visiting
an intermediate station S3 along the way and the other visiting a different intermediate station
S4. Trains link up at S2 and return to S1 via a single fast track with no stops, where they then
split up again and repeat their respective journeys. There are also two inner-city loops at S1
and S2 which visit the suburbs of their respective cities. The travel time from Sj to Sl is given
as the (l, j)-th entry of the matrix A below:
A = (  2  5  ε  ε
       ε  3  5  3
       2  ε  ε  ε
       4  ε  ε  ε  ) .    (2.2)
We can represent this network as a standard graph as follows:
Figure 2.2: Standard graph of the simple train network. Stations (the vertices) are represented
by circles and tracks by weighted edges. The travel times are given by the edge weights.
Similarly to before, we can assume that there are no waiting times at stations by incorporating
them into the travel times. We want the system to be synchronised in the sense that trains
arriving at a station should wait for each other to allow for the changeover of passengers. This
means that departures from a given station will coincide (once the last train has arrived, all
trains can then depart). We can model this system with a timed event graph, where ‘tracks’
are represented by places (the timed elements of the network); trains by tokens and departures
at each station by transitions. Note that each transition has an upstream place so the event
graph will be autonomous.
In order to fully specify the event graph we need to state the positions of the trains in the
network at time 0, which corresponds to the initial marking. We assume that at time 0 there
is one train travelling from S1 to S3, two trains travelling from S1 to S4, one train travelling
back from S2 to S1 and one train on each of the inner-city loops. This gives the following timed
event graph, pictured at time 0:
Figure 2.3: Timed event graph of the train network depicted in Figure 2.2. The transitions q1,
q2, q3 and q4 represent departures from the four respective stations. The edges can be thought
of as the tracks between stations, with the intermediate places (depicted as circles) specifying
the travel times. Tokens inside the places represent trains on the tracks.
Note that transitions are depicted by bars, places by circles and tokens by counters inside the
circles. As we have noted before, we cannot tell which transition will fire first since we do not
know how long each token of the initial marking has spent in its respective place (i.e. how close
to their respective destinations the trains are at time 0). If transitions q3 and q4 both fire once,
the token distribution changes to the following:
Figure 2.4: Timed event graph of the train network after transitions q3 and q4 have fired. One
token has been removed from each of their upstream places and one token has been added to
each of their downstream places.
This corresponds to the train on the track from S1 to S3 having reached S3 and departed for
S2, and also one of the trains on the track from S1 to S4 having reached S4 and departed for
S2. Once these trains both reach S2 they link up to form one train, and assuming the inner-city
train at S2 is ready and waiting, transition q2 will fire and the token distribution of the event
graph will change to:
  • 1. Max-Plus Theory and Applications Jeremy Rolph August 10, 2015
  • 2. Abstract In this thesis we consider the ‘max-plus’ algebra; that is, the set Rmax = R ∪ {−∞} endowed with the operations a ⊕ b = max{a, b} and a ⊗ b = a + b. It is shown that Rmax has the structure of a semiring with several additional useful properties. We introduce the idea of matrices over the max-plus semiring and develop max-plus variants of several familiar concepts from classical linear algebra; most notably the theory of eigenvalues and eigenvectors. In Chapter 2 we introduce the theory of event graphs which are used to model dynamical systems which admit a degree of synchronisation such as rail networks or automated manufacturing processes. We use the theory of max-plus algebra developed in Chapter 1 to derive results concerning the time evolution of such systems and also consider their long-term behaviour. Finally, in Chapter 3 we consider event graphs in which the timed elements form sequences of random variables. We look for steady state distributions and conditions for their existence, and attempt to characterise the asymptotic behaviour of the event timings concerned. We conclude by exploring how we can represent certain types of queuing systems by stochastic event graphs and present a key theorem regarding the stability of their waiting times. i
  • 3. Contents Abstract i Table of Contents ii 0 Introduction 1 1 Max-Plus Algebra 3 1.1 The Max-Plus Semiring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.1.1 Basic Definitions and Properties . . . . . . . . . . . . . . . . . . . . . . . 3 1.1.2 Other Algebraic Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1.2 Vectors and Matrices over Rmax . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.2.1 Definitions and Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.2.2 Matrix Inversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 1.2.3 Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 1.3 Graph-theoretic Interpretations in Max-Plus . . . . . . . . . . . . . . . . . . . . . 11 1.4 Spectral Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 1.4.1 Eigenvalues and Eigenvectors . . . . . . . . . . . . . . . . . . . . . . . . . 15 1.4.2 The Eigenspace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 1.4.3 A Worked Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 1.5 Recurrence Relations & Periodicity . . . . . . . . . . . . . . . . . . . . . . . . . . 28 1.5.1 Solving Max-Plus Recurrence Relations . . . . . . . . . . . . . . . . . . . 28 ii
  • 4. 1.5.2 Limiting Behaviour . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 2 Petri Nets and Timed Event Graphs 38 2.1 A Motivating Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 2.2 Preliminaries of Event Graph Theory . . . . . . . . . . . . . . . . . . . . . . . . . 41 2.2.1 Definitions and Set-up . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 2.2.2 A Simple Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 2.3 The Basic Autonomous Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 2.3.1 Derivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 2.3.2 Extensions to the Initial Condition . . . . . . . . . . . . . . . . . . . . . . 49 2.3.3 Solving the Basic Autonomous Equation . . . . . . . . . . . . . . . . . . . 51 2.3.4 Behaviour of the Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 2.4 A Simple Example Revisited . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 2.4.1 General Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 2.4.2 An Optimal Timetable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 2.5 The Non-autonomous Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 3 Stochastic Event Systems Over Max-plus 64 3.1 Introduction & Stochastic Background . . . . . . . . . . . . . . . . . . . . . . . . 64 3.2 Statistical Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65 3.3 Asymptotic Firing Rates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 3.3.1 The Strongly Connected Case . . . . . . . . . . . . . . . . . . . . . . . . . 69 3.3.2 The General Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 3.4 Queuing Systems and Timed Event Graphs . . . . . . . . . . . . . . . . . . . . . 74 3.4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 3.4.2 Example: The G/G/1 Queue . . . . . . . . . . . . . . . . . . . . . . . . . 76 3.4.3 Stability Analysis of Waiting Times . . . . . . . . . . . . . . . . . . . . . 77 iii
Chapter 0

Introduction

Exotic semirings such as (R ∪ {−∞}, max, +) and (R ∪ {+∞}, min, +) have been studied at length since the 1950s, beginning primarily in the area of operational research. Nowadays the term ‘tropical mathematics’ is often used to describe their study, though this term originally referred to one particular discrete version of the max-plus algebra introduced by I. Simon in 1988 [15]. Their applications span a wide range of fields including optimisation & control, mathematical physics, algebraic geometry, dynamic programming and mathematical biology [10, 15]. In particular, the study of such algebras in relation to discrete event system theory (both deterministic and stochastic), graph theory, Markov decision processes, asymptotic analysis and language theory has led to some significant progress in these areas over the last 30 years [8].

Many of the concepts developed in conventional linear algebra have been ‘translated’ into the world of max-plus, including solutions to linear and non-linear systems (both analytical and numerical), linear dependence and independence, determinants, eigenvalues and eigenvectors [9]. In 1979 Cuninghame-Green authored the first comprehensive unified account of these results entitled “Minimax Algebra” [7], building on many papers published over the preceding 20 years from various disciplines within mathematics, economics and computer science. As recently as 2006, Heidergott, Olsder and Woude published what they consider the first ‘textbook’ in the area of max-plus algebra [13], and many of the ideas explored below can be found in this publication.

In the first chapter of this thesis, we aim to give an overview of max-plus linear algebra and to build the necessary groundwork required for the applications discussed in the chapters that follow. In particular, we present two celebrated theorems in the area of max-plus theory. The first, which can be found in [7], concerns spectral theory and says that under mild conditions, a matrix over the max-plus algebra has a unique eigenvalue with a simple graph-theoretic interpretation. The second, originally proved by M. Viot in 1983 [2, 6], relates to the asymptotic behaviour of sequential powers of max-plus matrices, which turns out to be essentially periodic and has great implications for the material explored in Chapters 2 & 3.

In Chapter 2 we introduce the concept of timed Petri nets & event graphs. For a thorough discussion on the scope of their application readers are referred to [18]; in this thesis we focus solely on their use in the modelling of the time behaviour of a class of dynamic systems known as ‘discrete event dynamic systems’. In simple terms, these are systems in which a finite number of resources (e.g. processors or machines) are shared by several users (e.g. packets or manufactured objects) which all contribute to the achievement of some common goal (e.g. a parallel computation or the assembly of a product) [2]. We will see that under certain conditions these systems, while highly non-linear in the conventional sense, can be ‘linearised’ by using the max-plus algebra. This observation, first made in [5], is of vital importance and constitutes one of the main reasons for the continued study of max-plus algebra today. The main content of Chapter 2 concerns the ‘basic autonomous equation’ which governs the time evolution of discrete event systems, and the steps towards its solution. We are then able to apply some ideas from Chapter 1 to explore the long-term behaviour of such systems.

Chapter 3 concerns stochastic event graphs, which can be thought of as a natural extension to the concepts introduced in Chapter 2. As the name suggests, we now assume a degree of randomness in the event timings of the systems we are trying to model. Amongst other things, stochastic event graphs can be used to model many types of queuing systems [3], the simplest of which is the G/G/1 queue. We introduce several key ‘first order’ theorems which establish the nature of stationary regimes in terms of the inverse throughput, and explore the conditions under which such regimes are reached. We end by presenting a ‘second order’ theorem concerning the stability of inter-event timings (for example, waiting times) in the context of queuing systems.
  • 8. Chapter 1 Max-Plus Algebra 1.1 The Max-Plus Semiring 1.1.1 Basic Definitions and Properties In this thesis we work exclusively with the max-plus algebra (Rmax, ⊕, ⊗), where Rmax = R ∪ {−∞}, and for a, b ∈ Rmax: a ⊕ b := max{a, b} a ⊗ b := a + b We begin by examining its algebraic structure, and we will then move on to vectors and matrices over Rmax. We start by defining the term semiring. Definition 1.1. A semiring is a triple (R, +, ×) where R is a non-empty set and +, × are binary operations on R (referred to as addition and multiplication respectively) such that (i) (R, +) is commutative and associative, with zero element εR: (a) a + b = b + a (b) (a + b) + c = a + (b + c) (c) εR + a = a + εR = a (ii) (R, ×) is associative, with unit element eR: (a) (a × b) × c = a × (b × c) (b) eR × a = a × eR = a (iii) Multiplication distributes over addition: (a) a × (b + c) = (a × b) + (a × c) (b) (a + b) × c = (a × c) + (b × c) (iv) Multiplication by εR annihilates R: 3
  • 9. (a) εR × a = a × εR = εR Note that the final axiom is not required in the definition of a standard ring since it follows from the others, but it is needed here. As the title of this section suggests, the max-plus algebra is a semiring with additive identity ε := −∞ and multiplicative identity e := 0. It is straightforward to verify that all the axioms of Definition 1.1 hold in the case of (Rmax, ⊕, ⊗). For example, the first distributive law holds since a ⊗ (b ⊕ c) = a + max{b, c} = max{a + b, a + c} = (a ⊗ b) ⊕ (a ⊗ c) and the others follow similarly. For the sake of simplicity we will write Rmax for (Rmax, ⊕, ⊗) when the context is clear. Below we list three additional algebraic properties of Rmax which do not form part of the definition of a semiring: (i) Commutativity of ⊗: ∀a, b ∈ Rmax : a ⊗ b = b ⊗ a (ii) Existence of multiplicative inverses: ∀a ∈ Rmax{ε} ∃ b ∈ Rmax such that a ⊗ b = e (iii) Idempotency of ⊕: ∀a ∈ Rmax : a ⊕ a = a The first two properties follow directly from the fact that (R, +) forms an abelian group, and the third property is easily proved: a ⊕ a = max{a, a} = a. Properties (i) and (ii) mean that we could refer to (Rmax, ⊕, ⊗) as a semifield (i.e. a field without additive inverses), though this term can be ambiguous and is seldom used in mathematical literature. Note also that in general, any semiring in which addition is idempotent we call an idempotent semiring. The term dioid (originating from the phrase double monoid) was introduced by Baccelli et al. in 1992 to mean idempotent semiring [2], but we do not use this word here. The crucial difference between a semiring and a ring in general is that an element of the former need not have an additive inverse. Note that this does not say that additive inverses can never exist - there may be a non-empty subset of R containing elements which do have additive inverses (which could be thought of as the additive analogue to the set of units in a standard ring). However, the following lemma immediately tells us that no elements of Rmax (apart from 4
  • 10. ε) have additive inverses. Lemma 1.2. Let (R, +, ×) be a semiring. If + is idempotent then additive inverses do not exist. Proof. Suppose that εR = a ∈ R has an additive inverse b. Then a + b = εR Adding a to both sides of the equation yields a + a + b = a + εR By idempotency of +, the left-hand side is equal to a + b, whereas the right-hand side is equal to a. Hence we have a + b = a which contradicts a + b = εR. Thus a does not have an additive inverse. 1.1.2 Other Algebraic Definitions For a ∈ Rmax, n ∈ N, define a⊗n := a ⊗ a ⊗ · · · ⊗ a n times Thus exponentiation in max-plus is equivalent to conventional multiplication a⊗n = n×a. Some of the laws of exponentiation are therefore different to what we are used to. For a, b ∈ Rmax, m, n ∈ N: (i) a⊗m ⊗ a⊗n = ma + na = (m + n)a = a⊗(m⊗n) (ii) (a⊗m)⊗n = (ma)⊗n = nma = a⊗(m⊗n) (iii) a⊗1 = 1a = a (iv) a⊗m ⊗ b⊗m = ma + mb = m(a + b) = (a ⊗ b)⊗m and we also adopt the natural conventions a⊗ε := ε and a⊗e := e. For negative exponents we can take a⊗−n := (a⊗n )⊗−1 where the outer exponent on the right-hand side denotes the max-plus multiplicative inverse, which was shown to exist in the previous section. Finally, we can extend the concept of ex- ponentiation in Rmax to non-integer exponents using conventional notation in the following 5
  • 11. way: a⊗ n m := n m × a which is well-defined, assuming m = ε. Next, we can equip the max-plus algebra with a natural order relation as follows: Definition 1.3. For a, b ∈ Rmax, we say a ≤ b if a ⊕ b = b. It is easily verified that the max-plus operations ⊕ and ⊗ preserve this order, i.e. ∀a, b, c ∈ Rmax, a ≤ b ⇒ a ⊕ c ≤ b ⊕ c and a ⊗ c ≤ b ⊗ c. Finally, infinite sums in max-plus are defined by i∈I xi := sup{xi : i ∈ I} for any possibly infinite (even uncountable) family {xi}i∈I of elements of Rmax, when the supremum exists. In general, we say that an idempotent semiring is complete if any such family has a supremum, and if the product distributes over infinite sums. The max-plus semiring Rmax is not complete (a complete idempotent semiring must have a maximal element), but it can be embedded in the complete semiring (Rmax, ⊕, ⊗), where Rmax := Rmax ∪ {+∞}. 1.2 Vectors and Matrices over Rmax 1.2.1 Definitions and Structure Let n, m ∈ N. We denote the set of n × m matrices over Rmax by Rn×m max . For i ∈ {1, . . . , n}, j ∈ {1, . . . , m}, the element of a matrix A ∈ Rn×m max in row i and column j is denoted by [A]ij, or simply aij for notational convenience. Thus A ∈ Rn×m max can be written as         a11 a12 · · · a1m a21 a22 · · · a2m ... ... ... ... an1 an2 · · · anm         where a11, . . . , anm ∈ Rmax. In a similar vein, the elements of Rn max := Rn×1 max are called max-plus vectors, and we write the i-th element of a vector x ∈ Rn max as [x]i, or simply xi. Typical concepts and operations from conventional algebra are defined for max-plus matrices in the usual way (replacing + and × with ⊕ and ⊗ respectively), as outlined in the following definitions. Definition 1.4. The n × n max-plus identity matrix, denoted En, is defined by [En]ij =    0 i = j ε i = j We will write E := En whenever the context is clear. 6
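Before turning to the matrix operations proper, the scalar conventions set out in Section 1.1 can be summarised in a few lines of code. The following Python fragment is purely an illustrative sketch of ours (the names oplus, otimes, mp_pow and leq are our own labels, not notation from the thesis): it encodes ε as float('-inf') and e as 0.0, and checks a handful of the identities stated above.

```python
# A minimal sketch of the max-plus scalar structure on R_max = R ∪ {-inf}.
EPS = float('-inf')   # the additive identity  ε = -infinity
E   = 0.0             # the multiplicative identity  e = 0

def oplus(a, b):      # a ⊕ b := max{a, b}
    return max(a, b)

def otimes(a, b):     # a ⊗ b := a + b, with ε absorbing
    return EPS if EPS in (a, b) else a + b

def mp_pow(a, n):     # a^{⊗n} = n × a for finite a and natural n
    return EPS if a == EPS else n * a

def leq(a, b):        # the natural order: a ≤ b iff a ⊕ b = b
    return oplus(a, b) == b

if __name__ == "__main__":
    a, b, c = 3.0, -1.5, 7.0
    assert oplus(a, a) == a                                             # idempotency of ⊕
    assert otimes(a, oplus(b, c)) == oplus(otimes(a, b), otimes(a, c))  # distributivity
    assert otimes(EPS, a) == EPS and oplus(EPS, a) == a                 # ε annihilates / is neutral
    assert mp_pow(a, 4) == 12.0                                         # a^{⊗4} = 4 × a
    assert leq(b, a)                                                    # -1.5 ≤ 3 in the ⊕-order
```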
  • 12. Definitions 1.5. (i) For A, B ∈ Rn×m max , their sum A ⊕ B is defined by [A ⊕ B]ij = aij ⊕ bij = max aij, bij (ii) For A ∈ Rn×k max and B ∈ Rk×m max , their product A ⊗ B is defined by [A ⊗ B]il = k j=1 (aij ⊗ bjl) = max j=1,...,k (aij + bjl) (iii) The transpose of a matrix A ∈ Rn×m max is denoted by A and is defined as usual by [A ]ij = [A]ji (iv) For A ∈ Rn×n max and k ∈ N, the k-th power of A, denoted A⊗k, is defined by A⊗k = A ⊗ A ⊗ · · · ⊗ A k times For k = 0, A⊗0 := En. (v) For A ∈ Rn×m max and α ∈ Rmax, α ⊗ A is defined by [α ⊗ A]ij = α ⊗ [A]ij We now look at a crucial result concerning the algebraic structure of square matrices over Rmax. Proposition 1.6. (Rn×n max , ⊕, ⊗) is an idempotent semiring with multiplicative identity En. Proof. The axioms of Definition 1.1 all follow from the semiring structure of Rmax, and are readily verified. For example, for A, B, C ∈ Rn×n max we have that [A ⊗ (B ⊕ C)]il = n j=1 (aij ⊗ (bjl ⊕ cjl)) = n j=1 (aij ⊗ bjl) ⊕ (aij ⊗ cjl) = n j=1 (aij ⊗ bjl) ⊕ n j=1 (aij ⊗ cjl) = [(A ⊗ B) ⊕ (A ⊗ C)]il and so A ⊗ (B ⊕ C) = (A ⊗ B) ⊕ (A ⊗ C). The other axioms follow similarly. Note that since addition in (Rn×n max , ⊕, ⊗) is idempotent, we can apply Lemma 1.2 once again to see that no element of Rn×n max has an additive inverse. However, unlike in Rmax, multiplication 7
  • 13. of matrices over Rmax is not commutative. For example   1 e ε −2     2 −1 3 ε   =   3 e 1 ε   =   3 2 4 3   =   2 −1 3 ε     1 e ε −2   Also unlike Rmax, matrices over Rmax do not necessarily have multiplicative inverses (i.e. they are not necessarily invertible). We explore this in the next section. 1.2.2 Matrix Inversion Definition 1.7. Let A, B ∈ Rn×n max . B is a right inverse of A if A ⊗ B = E, and B is a left inverse of A if B ⊗ A = E. Definition 1.8. A max-plus permutation matrix is a matrix A ∈ Rn×n max with each row and each column containing exactly one entry equal to e, with all other entries equal to ε. If σ : {1, . . . , n} → {1, . . . , n} is a permutation, the max plus permutation matrix Pσ is defined by [Pσ]ij :=    e i = σ(j) ε i = σ(j) As the name suggests, left multiplication by Pσ permutes the rows of a matrix: the i-th row of a matrix A ∈ Rn×n max will appear as the σ(i)-th row of Pσ ⊗ A. For example, if n = 2 and σ is defined by σ(1) = 2, σ(2) = 1:   ε e e ε     1 2 3 4   =   3 4 1 2   Similarly, it is straightforward to see that right multiplication by Pσ permutes the columns of a matrix. Definition 1.9. A matrix A ∈ Rn×n max is diagonal if [A]ij = ε for all i = j. If a1, . . . , an ∈ Rmax{ε}, the diagonal matrix D(a1, . . . , an) is defined by [D(a1, . . . , an)]ij :=    ai i = j ε i = j Combining these two definitions, if σ is a permutation and a1, . . . , an ∈ Rmax {ε}, Pσ ⊗ D(a1, . . . , an) gives a matrix in which each row and each column contains exactly one finite entry. This class of matrices (sometimes referred to as generalised permutation matrices) in max-plus turns out to be of some significance, as the theorem below shows. Theorem 1.10. A matrix A ∈ Rn×n max has a right inverse if and only if A = Pσ ⊗ D(a1, . . . , an) for some permutation σ and a1, . . . , an ∈ Rmax{ε}. Proof. Suppose A = Pσ ⊗ D(a1, . . . , an) for some permutation σ and a1, . . . , an ∈ Rmax{ε}. 8
  • 14. Recalling from Section 1.1.1 that multiplicative inverses exist in Rmax, define B ∈ Rn×n max by [B]ij =    [A]⊗−1 ji if [A]ji = ε ε otherwise Then for i, j = 1, . . . , n we have that [A ⊗ B]ij = max k=1,...,n aik ⊗ bkj =    e j = i ε j = i Since if j = i, at least one of aik, bkj is equal to ε for each k = 1, . . . , n (since A only has one finite element per column and row). Thus A ⊗ B = E, and B is a right inverse of A. Conversely, suppose A has inverse B ∈ Rn×n max . For i, j = 1, . . . , n we have n k=1 [A]ik ⊗ [B]kj = [E]ij and therefore for each i = 1, . . . , n there is a (least) index c(i) (1 ≤ c(i) ≤ n) such that [A]ic(i) and [B]c(i)i are both finite, since [E]ii = e. Moreover we cannot have [A]hc(i) finite with h = i, since then [A ⊗ B]hi ≥ [A]hc(i) ⊗ [B]c(i)i > ε = [E]hi which contradicts our assumption that B is a right inverse of A. It follows that the mapping i → c(i) is a bijection, i.e. each column of A is labelled c(i) for some i and contains exactly one finite element, and each row of A contains exactly one finite element. That is, A = Pσ ⊗D(a1, . . . , an) for some permutation σ and a1, . . . , an ∈ Rmax{ε}. Theorem 1.11. For A, B ∈ Rn×n max , A ⊗ B = E if and only if B ⊗ A = E (i.e. right and left inverses are equivalent), and A uniquely determines B. Proof. Suppose that A has right inverse BR ∈ Rn×n max . Then by Theorem 1.10, we know that A = Pσ ⊗ D(a1, . . . , an) for some permutation σ and a1, . . . , an ∈ Rmax{ε}. Now, as before, define BL ∈ Rn×n max by [BL]ij =    [A]⊗−1 ji if [A]ji = ε ε otherwise and using the same reasoning as before we observe that BL is a left inverse of A. Finally, note that BR = E ⊗ BR = (BL ∗ A) ⊗ BR = BL ⊗ (A ⊗ BR) = BL ⊗ E = BL showing that BR is uniquely determined, and is also a left inverse. 9
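The matrix operations of Definitions 1.5, together with the inverse construction [B]ij = ([A]ji)^{⊗−1} used in the proofs of Theorems 1.10 and 1.11, can likewise be sketched in a few lines. The fragment below is an illustration of ours rather than code from the thesis; it builds such a B for a small generalised permutation matrix of our choosing and confirms that it is simultaneously a right and a left inverse.

```python
# Illustrative sketch: max-plus matrix operations and the inverse of a
# generalised permutation matrix (cf. Theorems 1.10 and 1.11).
EPS = float('-inf')

def identity(n):                       # E_n: e on the diagonal, ε elsewhere
    return [[0.0 if i == j else EPS for j in range(n)] for i in range(n)]

def mat_add(A, B):                     # [A ⊕ B]_ij = max(a_ij, b_ij)
    return [[max(a, b) for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_mul(A, B):                     # [A ⊗ B]_il = max_j (a_ij + b_jl)
    n, k, m = len(A), len(B), len(B[0])
    return [[max(A[i][j] + B[j][l] for j in range(k)) for l in range(m)]
            for i in range(n)]

def gp_inverse(A):                     # [B]_ij = -a_ji when a_ji is finite, else ε
    n = len(A)
    return [[-A[j][i] if A[j][i] != EPS else EPS for j in range(n)]
            for i in range(n)]

if __name__ == "__main__":
    # A = P_sigma ⊗ D(1, -2, 4) with sigma(1)=2, sigma(2)=3, sigma(3)=1:
    # exactly one finite entry in each row and each column.
    A = [[EPS, EPS, 4.0],
         [1.0, EPS, EPS],
         [EPS, -2.0, EPS]]
    B = gp_inverse(A)
    assert mat_mul(A, B) == identity(3) == mat_mul(B, A)   # right inverse = left inverse
```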
  • 15. Theorem 1.11 tells us that we do not need to make a distinction between right and left inverses, as we did in Definition 1.7. Before moving on we show one last result which says that the product of two invertible matrices is also invertible. Proposition 1.12. If A, B ∈ Rn×n max are invertible then A ⊗ B is also invertible. Proof. This proof uses some simple results regarding diagonal and permutation matrices in conventional algebra, whose analogues are easily proved in max-plus. To start, recall that for a permutation matrix Pσ, we have that P−1 σ = Pσ−1 . Thus if D(a1, . . . , an) is a diagonal matrix: D(a1, . . . , an) ⊗ Pσ = (Pσ ⊗ Pσ−1 ) ⊗ D(a1, . . . , an) ⊗ Pσ = Pσ ⊗ (Pσ−1 ⊗ D(a1, . . . , an) ⊗ Pσ) = Pσ ⊗ D(aσ(1), . . . , aσ(n)) Now from Theorem 1.10 we can write A = PσA ⊗D(a1, . . . , an), B = PσB ⊗D(b1, . . . , bn). Then using the above A ⊗ B = PσA ⊗ D(a1, . . . , an) ⊗ PσB ⊗ D(b1, . . . , bn) = PσA ⊗ PσB ⊗ D(aσA(1), . . . , aσA(n)) ⊗ D(b1, . . . , bn) = PσB ◦ σA ⊗ D(aσA(1) ⊗ b1, . . . , aσA(n) ⊗ bn) and therefore A ⊗ B is invertible by Theorem 1.10. 1.2.3 Determinants Recall that in conventional algebra, the determinant of a matrix A ∈ Rn×n is defined as det(A) = σ∈Sn sgn(σ) n i=1 aiσi where Sn is the symmetric group on n elements (so an element of Sn is a permutation σ : {1, . . . , n} → {1, . . . , n}), and the sign of a permutation σ ∈ Sn, denoted sgn(σ), is defined by sgn(σ) =    1 σ even −1 σ odd Unfortunately this definition cannot be immediately translated into max-plus (i.e. by replacing + and × with ⊕ and ⊗ respectively) because the use of the sign function requires that we have additive inverses. Instead, two related concepts are introduced below which offer alternatives to the notion of the determinant in the case of the max-plus algebra. Definition 1.13. Let A ∈ Rn×n max . The permanent of A, denoted perm(A), is defined as perm(A) = σ∈Sn n i=1 aiσi 10
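For completeness, the permanent of Definition 1.13 can be computed directly from its defining formula as a maximum over all n! permutations. The brute-force sketch below is our own illustration and is only sensible for small matrices.

```python
# Sketch: perm(A) = ⊕_{σ ∈ S_n} ⊗_i a_{iσ(i)}, i.e. the maximum over all
# permutations σ of the sum a_{1σ(1)} + ... + a_{nσ(n)}.
from itertools import permutations

EPS = float('-inf')

def permanent(A):
    n = len(A)
    return max(sum(A[i][s[i]] for i in range(n)) for s in permutations(range(n)))

if __name__ == "__main__":
    A = [[2.0, EPS],
         [0.0, 3.0]]
    # The two permutations give 2 ⊗ 3 = 5 and ε ⊗ 0 = ε, so perm(A) = 5.
    assert permanent(A) == 5.0
```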
  • 16. Note that, crudely put, the permanent is the max-plus analogue of the determinant with the minuses simply removed. We can understand the formula to give the maximal sum of the diagonal values for all permutations of the columns of A. The permanent has been studied at length both in the case of conventional algebra (see [17]) and in max-plus & related semirings (see [19]). Note that if A ∈ Rn×n max is invertible then by Theorem 1.10, A = Pσ ⊗ D(a1, . . . , an) and so perm(A) = n i=1 ai = ε. However, unlike in the case of determinants in conventional matrix algebra, the converse is not necessarily true. The second concept in max-plus related to the determinant, known as the dominant, can be thought of as a refinement of the permanent. It is defined below. Definition 1.14. Let A ∈ Rn×n max and let the matrix zA be defined by [zA]ij = zaij . The dominant of A, denoted dom(A), is defined as dom(A) =    highest exponent in det(zA) if det(zA) = 0 ε otherwise The dominant can be used to prove max-plus analogues of major results such as Cramér’s Theorem and the Cayley-Hamilton Theorem. We do not have the space to include these here; for a comprehensive discussion readers are again referred to [19]. 1.3 Graph-theoretic Interpretations in Max-Plus As in conventional linear algebra, when working with vectors and matrices it is often natural to interpret definitions and theorems graphically. It turns out that in the case of max-plus algebra, it is not only natural to do so but also rather insightful. We will only really be able to appreciate this when we come to look at the eigenvalue problem in the next section, but firstly we must define all of the graph-theoretic concepts that we will require. Definitions 1.15. (i) A directed graph G is a pair (V, E) where V is the set of vertices (or nodes) and E ⊆ V × V is the set of edges (or arcs). (ii) A path from vertex i to vertex j is a sequence of edges p = (i1, . . . , is+1) with i1 = i and is+1 = j, such that (ik, ik+1) ∈ E for all k ∈ {1, . . . , s}. (iii) The length of a path p = (i1, . . . , is+1), denoted |p|l, is equal to s. The set of paths from vertex i to vertex j of length k is denoted Pk(i, j). (iv) The weight of a path p from vertex i to vertex j of length d is given by |p|w = d k=1 aik+1,ik where i1 = i and id+1 = j. 11
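The conventions of Definitions 1.15, and in particular the fact that the entry aji carries the weight of the edge from vertex i to vertex j, are easy to get backwards, so the short sketch below (our own illustration; the helper names edges and path_weight are ours) spells them out in code.

```python
# Sketch: the communication graph G(A) and path weights, following the
# convention that a_ji is the weight of the edge from vertex i to vertex j.
EPS = float('-inf')

def edges(A):
    """Return {(i, j): weight} for the edges of G(A), with 1-indexed vertices."""
    n = len(A)
    return {(i + 1, j + 1): A[j][i]          # edge i -> j has weight a_ji
            for i in range(n) for j in range(n) if A[j][i] != EPS}

def path_weight(A, path):
    """Weight of a path given as a vertex sequence (i_1, ..., i_{s+1})."""
    E = edges(A)
    return sum(E[(u, v)] for u, v in zip(path, path[1:]))

if __name__ == "__main__":
    A = [[EPS, 3.0],
         [5.0, EPS]]
    # a_21 = 5 is the weight of edge 1 -> 2 and a_12 = 3 that of edge 2 -> 1,
    # so the circuit (1, 2, 1) has weight 8, length 2 and average weight 4.
    assert path_weight(A, (1, 2, 1)) == 8.0
```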
  • 17. (v) The average weight of a path p is given by |p|w |p|l . (vi) A circuit of length s is a path of length s which starts and finishes at the same vertex, i.e. a path c = (i1, . . . , is+1) such that i1 = is+1. (vii) A circuit c = (i1, . . . , is+1) is elementary if i1, . . . , is are distinct, and s ≥ 1. We denote the set of elementary circuits in G(A) by C(A). (viii) For A ∈ Rn×n max , the communication graph (or the precedence graph) of A, denoted G(A), is the graph with vertex set V(A) = {1, . . . , n} and edge set E(A) = {(i, j) : aji = ε}. The weight of the edge (i, j) ∈ E(A) is given by the entry aji. Note that the (i, j)-th entry of the matrix A specifies the weight of the edge in G(A) from vertex j to vertex i. This is common practice in the area of max-plus and graph theory but may not appear intuitive to those new to the subject. We now move on to looking at two particular matrices that play a vital role in relating graph theory to max-plus linear algebra. For A ∈ Rn×n max , let A+ := ∞ k=1 A⊗k The element [A+]ji gives the maximal weight of any path from i to j in G(A). This statement is non-trivial, but follows directly from the theorem below. Theorem 1.16. Let A ∈ Rn×n max . Then ∀k ∈ N: [A⊗k ]ji =    max |p|w : p ∈ Pk(i, j) if Pk(i, j) = ∅ ε if Pk(i, j) = ∅ Proof. We use induction on k. Let i, j ∈ {1, . . . , n}. When k = 1, P1(i, j) either contains a single path of length 1, namely the edge (i, j), or is empty if no such edge exists. In the first case, the weight of the path is by definition [A]ji, and in the second case max |p|w : p ∈ Pk(i, j) = ε, which is again equal to the value [A]ji (since there is no edge from i to j). Now suppose the result holds for some k. Firstly, assume that Pk+1(i, j) = ∅. A path p ∈ Pk+1(i, j) can be split up into a subpath of length k running from i to some vertex l, and a path consisting of a single edge from l to j. More formally: p = ˆp ◦ (l, j) with ˆp ∈ Pk(i, l) The maximal weight of any path in Pk+1(i, j) can thus be obtained from max l=1,...,n [A]jl + max{|ˆp|w : ˆp ∈ Pk(i, l)} = max l=1,...,n [A]jl + [A⊗k ]li (Inductive hypothesis) 12
  • 18. = n l=1 [A]jl ⊗ [A⊗k ]li = [A ⊗ A⊗k ]ji = [A⊗(k+1) ]ji which is what we wanted to prove. Finally, consider the case when Pk+1(i, j) = ∅; i.e. when there exists no path of length k + 1 from i to j. This implies that for any vertex l, either there is no path of length k from i to l or there is no edge from l to j (or possibly both). Hence for any l, at least one of the values [A]jl, [A⊗k]li equals ε. Therefore [A⊗(k+1)]ji = ε, and this completes the proof. Note that Theorem 1.16 immediately tells us that A+ is not necessarily well-defined. For example, if there exists a circuit c = (i1, . . . , is+1) in G(A) in which every edge has positive weight, then [A⊗k]ji diverges (i.e. tends to +∞) as k → ∞ for any i, j ∈ {i1, . . . , is+1} (since we can loop around the circuit c as many times a we like, creating a path of higher and higher weight). The next lemma provides us with a sufficient condition for A+ to be well-defined, and also reduces the complexity of the infinite sum. Lemma 1.17. Let A ∈ Rn×n max be such that any circuit in G(A) has non-positive average weight (i.e. less than or equal to e). Then we have A+ = A⊗1 ⊕ A⊗2 ⊕ A⊗3 ⊕ · · · ⊕ A⊗n ∈ Rn×n max Proof. Since A is of dimension n, any path p in G(A) from i to j of length greater than n necessarily contains at least one circuit. We have assumed that all of the circuits in G(A) have non-positive weights, so removing the circuits in p yields a path from i to j of length at most n, and of greater average weight. It follows that [A+ ]ji ≤ max [A⊗k ]ji : k ∈ {0, . . . , n} and the reverse inequality is immediate from the definition of A+. This concludes the proof. Before moving on, we prove one simple property of A+ that will come in handy later on. Proposition 1.18. For A ∈ Rn×n max , we have that A+ ⊗ A+ = A+. Proof. Consider two vertices i, l ∈ {1, . . . , n}. A path of maximal weight from i to l can be split up as a path of maximal weight from i to j plus a path of maximal weight from j to l, for any j ∈ {1, . . . , n} for which the sum of the two path weights is maximal. Indeed this relationship holds if and only if j is in the path of maximal weight from i to l, but for our purposes we can simply take the maximum over all vertices. By Theorem 1.16, the weight of such a path is given by [A+]li. Thus in max-plus notation 13
  • 19. (recalling that ⊗ is commutative for scalars α ∈ Rmax), we can write [A+ ]li = n j=1 [A+ ]ji ⊗ [A+ ]lj = n j=1 [A+ ]lj ⊗ [A+ ]ji = [A+ ⊗ A+ ]li and therefore A+ = A+ ⊗ A+ as required. We now introduce one more definition which is closely related to the object A+ defined above. This will prove to be an integral concept throughout the rest of this chapter and beyond, and as such, this is one of the most important definitions in this thesis. Definition 1.19. For A ∈ Rn×n max , let A∗ := ∞ k=0 A⊗k = E ⊕ A+ Clearly, A∗ and A+ only differ on the leading diagonal. By Theorem 1.16, the (j, i)-th of A∗ could be interpreted as the maximal weight of any path from i to j in G(A), provided we recognise the additional concept of an empty circuit of length 0 and weight e from every vertex to itself. Using Lemma 1.17, it is immediate from the definition of A∗ that if all the circuits in G(A) have non-positive average weight, then A∗ = A⊗0 ⊕ A⊗1 ⊕ · · · ⊕ A⊗n. However, as the lemma below shows, thanks to the addition of the identity matrix (i.e. the A⊗0 term) in A∗, we are able to refine this result slightly by dropping the final term in the sum. Lemma 1.20. Let A ∈ Rn×n max be such that any circuit in G(A) has non-positive average weight. Then we have A∗ = A⊗0 ⊕ A⊗1 ⊕ A⊗2 ⊕ · · · ⊕ A⊗(n−1) ∈ Rn×n max Proof. The same argument applies as in the proof of Lemma 1.17. Note that any path p in G(A) from i to j of length n or greater necessarily contains at least one circuit, and so removing the circuit(s) yields a path from i to j of length at most n − 1 and with greater average weight. For the special case when i = j and p is an elementary circuit of length n (so visiting each vertex in G(A) exactly once), the i-th entry on the diagonal of A⊗0 (which equals e by definition) will always be greater than the corresponding entry in A⊗n, since e is the maximum possible weight of any circuit. This is why we can drop the A⊗n term. Note that we also have a direct analogue of Lemma 1.18 for the matrix A∗, and this will be useful in the analysis that follows: Proposition 1.21. For A ∈ Rn×n max , we have that A∗ ⊗ A∗ = A∗. 14
  • 20. Proof. From Lemma 1.18 we have that A+ = A+ ⊗A+. Recalling the definition of A∗ and using idempotency of matrix addition, we have A∗ ⊗ A∗ = (A+ ⊕ E) ⊗ (A+ ⊕ E) = (A+ ⊗ A+ ) ⊕ (A+ ⊗ E) ⊕ (E ⊗ A+ ) ⊕ E = A+ ⊕ A+ ⊕ A+ ⊕ E = A+ ⊕ E = A∗ as required. To finish this section, we introduce one more important property of square matrices over max- plus known as irreducibility. The definition comes in three parts: Definitions 1.22. (i) In a graph G, a vertex j is reachable from vertex i if there exists a path from i to j. (ii) A graph is strongly connected if every vertex is reachable from every other vertex. (iii) A matrix A ∈ Rn×n max is irreducible if G(A) is strongly connected. The class of irreducible matrices over max-plus will turn out to be of real significance in Section 1.4. From a practical point of view it is not obvious how to determine whether a given matrix A ∈ Rn×n max is irreducible, but as the proposition below shows, one option is to examine the matrix A+. Combined with Lemma 1.17 (when A has the appropriate properties), this provides us with a handy (and computationally quick) way to check for matrix irreducibility over max-plus. Proposition 1.23. A matrix A ∈ Rn×n max is irreducible if and only if all the entries of A+ are different from ε. Proof. A matrix is irreducible if there is a path between any two vertices i and j in G(A), which by Theorem 1.16 occurs exactly when the entry [A+]ji is not equal to ε. 1.4 Spectral Theory 1.4.1 Eigenvalues and Eigenvectors Given a matrix A ∈ Rn×n max , we consider the problem of existence of eigenvalues and eigenvectors. The main result in max-plus spectral theory is that, under mild conditions, A has a unique eigenvalue with a simple graph-theoretic interpretation. As can be seen below, the definition of max-plus eigenvalues and eigenvectors is a direct translation from conventional linear algebra, with the × operator replaced with ⊗: Definition 1.24. Let A ∈ Rn×n max . If there exists a scalar µ ∈ Rmax and a vector v ∈ Rn max (containing at least one finite element) such that A ⊗ v = µ ⊗ v 15
  • 21. then µ is an eigenvalue of A and v is an eigenvector of A associated with the eigenvalue µ. Note that Definition 1.24 allows an eigenvalue to be µ = ε. However, the proposition below says that this can only happen when A has a column in which all entries are ε. In graph-theoretic terms this means that G(A) has a vertex which, once visited, can never be left (sometimes called a sink). This is uninteresting from an analytical point of view, so it is reasonable to consider the case µ = ε to be trivial. Before we prove this result, we introduce some simple notation. Notation. Let A ∈ Rn×n max . For i ∈ {1, . . . , n}, we denote the i-th row of A by [A]i·. Similarly, for j ∈ {1, . . . , n}, we denote the j-th column of A by [A]·j. Proposition 1.25. ε is an eigenvalue of A ∈ Rn×n max iff A has at least one column in which all entries are ε. Proof. Let A ∈ Rn×n max be such that [A]·j = (ε, . . . , ε) for some j ∈ {1, . . . , n}. Let v ∈ Rn max be such that [v]i = ε ∀i = j and [v]j = α = ε. Then it is easy to verify that [A ⊗ v]i = ε for all i = 1, . . . , n; that is, ε is an eigenvalue of A with an associated eigenvector v. Conversely, suppose A ∈ Rn×n max has eigenvalue ε with an associated eigenvector v. let J = {j : vj = ε}, which is non-empty by definition. Then for each i = 1, . . . , n we have ε = [A ⊗ v]i = n j=1 aij ⊗ vj = j∈J aij ⊗ vj =⇒ aij = ε ∀j ∈ J So every column j of A for which vj = ε has all its entries equal to ε. In particular, A contains at least one column in which all entries are ε. Corollary 1.26. If A ∈ Rn×n max is irreducible then ε is not an eigenvalue of A. Proof. If A is irreducible then it cannot have a column in which all entries are ε. Thus by Proposition 1.25, ε is not an eigenvalue of A. Note that eigenvectors are not unique: any scalar multiple of an eigenvector is also an eigen- vector, and more generally, if µ is an eigenvalue of A, v1, v2 are associated eigenvectors and α1, α2 ∈ Rmax{ε}, then we have A ⊗ (α1 ⊗ v1) ⊕ (α2 ⊗ v2) = A ⊗ (α1 ⊗ v1) ⊕ A ⊗ (α2 ⊗ v2) = α1 ⊗ (A ⊗ v1) ⊕ α2 ⊗ (A ⊗ v2) = α1 ⊗ (µ ⊗ v1) ⊕ α2 ⊗ (µ ⊗ v2) = µ ⊗ (α1 ⊗ v1) ⊕ (α2 ⊗ v2) So (α1 ⊗ v1) ⊕ (α2 ⊗ v2) is also an eigenvector associated with the eigenvalue µ. In fact, the eigenvectors associated with a given eigenvalue form a vector space in max-plus called the eigenspace which we shall explore in depth later. 16
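This closure property is easy to confirm numerically. The sketch below is our own illustration (the 2 × 2 matrix and the vectors are simply a convenient example, not taken from the thesis): it checks the defining relation A ⊗ v = µ ⊗ v for two eigenvectors and then for a max-plus linear combination of them.

```python
# Sketch: checking A ⊗ v = µ ⊗ v, and that (α1 ⊗ v1) ⊕ (α2 ⊗ v2) is again an
# eigenvector associated with the same eigenvalue µ.
EPS = float('-inf')

def mat_vec(A, v):                       # [A ⊗ v]_i = max_j (a_ij + v_j)
    return [max(a + x for a, x in zip(row, v)) for row in A]

def scale(alpha, v):                     # α ⊗ v: add α to every component
    return [alpha + x for x in v]

def vec_add(u, v):                       # u ⊕ v: componentwise maximum
    return [max(a, b) for a, b in zip(u, v)]

def is_eigenpair(A, mu, v):
    return mat_vec(A, v) == scale(mu, v)

if __name__ == "__main__":
    A = [[1.0, 0.0],
         [0.0, 1.0]]                     # maximal average circuit weight is 1
    v1, v2, mu = [0.0, -1.0], [-1.0, 0.0], 1.0
    assert is_eigenpair(A, mu, v1) and is_eigenpair(A, mu, v2)
    w = vec_add(scale(3.0, v1), scale(-2.0, v2))   # (α1 ⊗ v1) ⊕ (α2 ⊗ v2)
    assert is_eigenpair(A, mu, w)
```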
  • 22. As we mentioned at the beginning of Section 1.3, many of the results in the area of max-plus spectral theory can be interpreted graphically, and the next key lemma constitutes the first step in doing just that. Lemma 1.27. Let A ∈ Rn×n max have finite eigenvalue µ. Then µ is the average weight of some elementary circuit in G(A). Proof. Let v be an associated eigenvector of µ. Then by definition not all the entries of v equal ε, i.e. there exists a vertex/index j1 ∈ {1, . . . , n} such that vj1 = ε. Now v is an eigenvector and so we have [A ⊗ v]j1 = µ ⊗ vj1 = ε. But [A ⊗ v]j1 = n k=1 aj1k ⊗ vk, and therefore there exists a vertex j2 such that aj1j2 ⊗ vj2 = [A ⊗ v]j1 = ε (1.1) which implies aj1j2 = ε, i.e. (j2, j1) is an edge in G(A). (1.1) also implies that vj2 = ε, so we can continue in the same fashion to find a vertex j3 with (j3, j2) an edge in G(A) and vj3 = ε. Proceeding in this way, eventually some vertex, say, vertex jh, must be encountered for a second time since the number of vertices is finite. Thus by ignoring the edges prior to encountering jh for the first time, we have found an elementary circuit c = ((jh, jh+l−1), (jh+l−1, jh+l−2), . . . , (jh+1, jh)) of length |c|l = l, and with weight |c|w = l−1 k=0 ajh+kjh+k+1 (1.2) where jh = jh+l. By construction, we have that l−1 k=0 (ajh+kjh+k+1 ⊗ vjh+k+1 ) = µ⊗l ⊗ l−1 k=0 vjh+k or equivalently in conventional algebra (for ease of manipulation): l−1 k=0 ajh+kjh+k+1 + vjh+k+1 = (l × µ) + l−1 k=0 vjh+k Now, because jh = jh+l it follows that l−1 k=0 vjh+k+1 = l−1 k=0 vjh+k , so subtracting l−1 k=0 vjh+k from both sides yields l−1 k=0 ajh+k jh+k+1 = l × µ and translated back into max-plus, we can substitute this into (1.2) to see that |c|w = µ⊗l. 17
  • 23. Thus we have that the average weight of the circuit c is equal to |c|w |c|l = µ⊗l l = µ as required. Lemma 1.27 tells us that the only candidates for eigenvalues are the average weights of circuits in G(A). However, it does not tell us which circuits actually define an eigenvalue and which do not. Fortunately, when A is irreducible the answer to this question is very simple: only the maximal average circuit weight defines an eigenvalue. This result is established in the two theorems below, but first we require some additional definitions and notation. Definitions 1.28. (i) A circuit c ∈ C(A) is critical if its average weight is maximal. (ii) For A ∈ Rn×n max , the critical graph of A, denoted Gc(A), is the graph containing the vertices and edges which belong to the critical circuits in G(A). We write Gc(A) = (Vc(A), Ec(A)), and refer to the vertices in Vc(A) as critical vertices. (iii) The critical classes of A ∈ Rn×n max are the maximal strongly connected components of Gc(A). Notation. Let A ∈ Rn×n max . For β ∈ Rmax{ε}, define the matrix Aβ by [Aβ]ij = aij − β. Note that the ‘−’ operator is to be interpreted in conventional algebra, where we adopt the convention ε − x = ε ∀x ∈ R. If β is an eigenvalue of A, the matrix Aβ is sometimes called the normalised matrix. Note that the communication graphs G(A) and G(Aβ) are identical except for their edge weights, and if a circuit c in G(A) has average weight w then the same circuit in G(Aβ) has average weight w − β. In particular, if G(A) has finite maximal average circuit weight λ then the maximal average circuit weight in G(Aλ) is λ − λ = 0. Furthermore, a circuit in G(A) is critical if and only if it is critical in G(Aλ), and therefore Gc(A) and Gc(Aλ) are identical (again, except for their edge weights). Consider the matrix A+ λ , which is to be read (Aλ)+ . By Theorem 1.16, the element [A+ λ ]ij gives the maximal weight of any path from j to i in G(Aλ). In particular, since all circuits in G(Aλ) have non-positive average weight, we must have [A+ λ ]ii ≤ e for all i ∈ {1, . . . , n}. Furthermore, for the matrix A∗ λ (also to be read (Aλ)∗ ) we obtain [A∗ λ]ii = e⊕[A+ λ ]ii = e for all i ∈ {1, . . . , n}. Theorem 1.29. Let the communication graph G(A) of a matrix A ∈ Rn×n max have finite maximal average circuit weight λ. Then λ is an eigenvalue of A, with an associated eigenvector [A∗ λ]·j for any vertex j ∈ Vc(A). Proof. Firstly note that all the circuits in G(Aλ) have non-positive average weight, and therefore A+ λ is well-defined by Lemma 1.17. Now, every vertex in Gc(Aλ) is contained in a non-empty circuit which has weight e, i.e. ∀j ∈ Vc (A) : [A+ λ ]jj = e (1.3) 18
  • 24. Next, write [A∗ λ]ij = [E ⊕ A+ λ ]ij =    ε ⊕ [A+ λ ]ij for i = j e ⊕ [A+ λ ]ij for i = j Then from (1.3), for j ∈ Vc(A) it follows that [A+ λ ]·j = [A∗ λ]·j (1.4) Now, note that we have A+ λ = A⊗1 λ ⊕ A⊗2 λ ⊕ . . . = Aλ ⊗ (A⊗0 λ ⊕ A⊗1 λ ⊕ . . . ) = Aλ ⊗ A∗ λ So substituting this into (1.4) gives for j ∈ Vc(A) [Aλ ⊗ A∗ λ]·j = [A∗ λ]·j ⇐⇒ Aλ ⊗ [A∗ λ]·j = [A∗ λ]·j ⇐⇒ A ⊗ [A∗ λ]·j = λ ⊗ [A∗ λ]·j Therefore λ is an eigenvalue of A and the j-th column of A∗ λ is an associated eigenvector for any j ∈ Vc(A). Theorem 1.30. Let A ∈ Rn×n max be irreducible. Then A has a unique eigenvalue, denoted λ(A), which is finite and equal to the maximal average circuit weight in G(A). Proof. Let the maximal average circuit weight in G(A) be denoted by λ. Since A is irreducible, G(A) must contain a circuit and therefore λ is necessarily finite. Thus by Theorem 1.29 we know that λ is an eigenvalue of A, and it remains to show uniqueness. Let c = (j1, . . . , jl+1) be an arbitrary circuit in C(A) of length l = |c|l, with jl+1 = j1. Then ajk+1jk = ε for all k ∈ {1, . . . , l}. Further, suppose that µ is an eigenvalue of A with an associated eigenvector v. Note that A is irreducible, so by Corollary 1.26 we have that µ = ε. Now, since A ⊗ v = µ ⊗ v, it follows that ajk+1jk ⊗ vjk ≤ µ ⊗ vjk+1 , k ∈ {1, . . . , l}. and arguing as in Lemma 1.27 (replacing equalities with the appropriate inequalities), we see that the average weight of the circuit c satisfies |c|w |c|l ≤ µ⊗l l = µ (1.5) That is, µ ≥ λ (since (1.5) holds for all c ∈ C(A), and we already have that the maximal average circuit weight is λ). But by Lemma 1.27, µ is equal to the average weight of some circuit c ∈ C(A), and so µ ≤ λ also. Hence µ = λ, i.e. λ is the unique eigenvalue of A. 19
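Theorems 1.29 and 1.30 suggest a direct, if naive, way of computing λ(A) and an associated eigenvector for a small irreducible matrix: take the maximum of [A^{⊗k}]ii / k over k = 1, . . . , n, form the normalised matrix Aλ, and read off a column of A∗λ indexed by a critical vertex, i.e. a vertex i with [A+λ]ii = e. The sketch below is our own brute-force illustration under the assumption that A is irreducible and the arithmetic is exact; it is far less efficient than the numerical procedures discussed below.

```python
# Brute-force sketch of Theorems 1.29/1.30 for a small irreducible matrix A:
#   λ(A) = max over k <= n and i of [A^{⊗k}]_ii / k  (maximal average circuit weight)
#   v    = a column of A*_λ indexed by a critical vertex (one with [A⁺_λ]_ii = e).
EPS = float('-inf')

def mat_mul(A, B):
    n = len(A)
    return [[max(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(A, B):
    return [[max(a, b) for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def eigen_data(A):
    n = len(A)
    # maximal average circuit weight via powers of A (Theorem 1.16 / Lemma 1.27)
    lam, P = EPS, A
    for k in range(1, n + 1):
        lam = max(lam, max(P[i][i] / k for i in range(n)))
        P = mat_mul(P, A)
    # normalised matrix A_λ and A⁺_λ = A_λ ⊕ ... ⊕ A_λ^{⊗n} (Lemma 1.17)
    A_lam = [[a - lam if a != EPS else EPS for a in row] for row in A]
    P, A_plus = A_lam, A_lam
    for _ in range(n - 1):
        P = mat_mul(P, A_lam)
        A_plus = mat_add(A_plus, P)
    # A*_λ = E ⊕ A⁺_λ; a critical vertex j has [A⁺_λ]_jj = 0 in exact arithmetic,
    # and column j of A*_λ is then an eigenvector of A (Theorem 1.29)
    A_star = [[max(A_plus[i][j], 0.0 if i == j else EPS) for j in range(n)]
              for i in range(n)]
    j = next(i for i in range(n) if A_plus[i][i] == 0.0)
    return lam, [A_star[i][j] for i in range(n)]

if __name__ == "__main__":
    A = [[2.0, 5.0],
         [0.0, EPS]]
    lam, v = eigen_data(A)
    # circuits: (1,1) with mean 2 and (1,2,1) with mean (5+0)/2 = 2.5, so λ = 2.5
    assert lam == 2.5
    assert [max(a + x for a, x in zip(row, v)) for row in A] == [lam + x for x in v]
```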
  • 25. When A is large it is often difficult to identify the maximal average circuit weight in G(A). In fact, there exist several numerical procedures used to determine the eigenvalue of an irreducible matrix in max-plus, including Karp’s Algorithm and the Power Algorithm. However, none of these has a particularly attractive order of complexity - for example, the complexity of Karp’s Algorithm is of order n3, and the complexity of the Power Algorithm is not known precisely (see [11]). We do not have space here to describe the methods in detail; for more information readers are referred to chapter five of [13]. We end this section with a simple proposition that, while interesting in its own right, will come in handy when we begin to look at the eigenspace. Proposition 1.31. Let A ∈ Rn×n max be an irreducible matrix with eigenvalue λ and associated eigenvector v. We have that vi > ε for all i ∈ {1, . . . , n}. Proof. Call the set of vertices of G(A) corresponding to the finite entries of v the support of v, denoted Z(v). Suppose that Z(v) does not contain all the elements of V(A). Since A is irreducible, there must be edges from the vertices in Z(v) to vertices not belonging to Z(v). Hence there exists vertices j ∈ Z(v), i /∈ Z(v) with aij = ε. Then [A ⊗ v]i ≥ aij ⊗ vj > ε That is, Z(A ⊗ v) is strictly bigger than Z(v). But A ⊗ v = λ ⊗ v (and λ is finite by Theorem 1.30), so Z(v) and Z(A⊗v) should be equal. This is a contradiction, and so Z(v) must contain all the elements of V(A). 1.4.2 The Eigenspace Let A ∈ Rn×n max have finite eigenvalue λ. In this part of our analysis we let V (A, λ) denote the set of all eigenvectors of A associated with the eigenvalue λ, which we call the eigenspace of A w.r.t. λ. If A is irreducible then by Theorem 1.30 we know that it has a unique eigenvalue, so we can drop the dependence on λ and denote the eigenspace of A simply by V (A). The main aim of this section is to find an expression that completely characterises the eigenspace of A. In Theorem 1.29 we established that [A∗ λ]·j is an eigenvector of A for any j ∈ Vc(A), but are these the only eigenvectors (of course, up to taking linear combinations, as discussed above)? We will eventually see that the answer to this question is yes, but first we require some intermediate steps. Lemma 1.32. Let A ∈ Rn×n max . We have that A∗ λ = (E ⊕ Aλ)⊗(n−1). Proof. If n = 1 then the result is trivial. Otherwise, since E and Aλ commute, we can carry out the iterated multiplication (E ⊕ Aλ) ⊗ · · · ⊗ (E ⊕ Aλ) to obtain (E ⊕ Aλ)⊗(n−1) = E ⊕ n−1 i=1 A⊗i λ ⊕ · · · ⊕ A⊗i λ (n−1 i ) times (1.6) 20
  • 26. Each power A⊗0 λ , . . . , A ⊗(n−1) λ occurs at least once, so by idempotency of ⊕, (1.6) becomes (E + Aλ)⊗(n−1) = E ⊕ Aλ ⊕ A⊗2 λ ⊕ . . . A ⊗(n−1) λ (1.7) However, noting that every circuit in G(Aλ) must have non-positive weight, we can apply Lemma 1.20 to see that the right-hand side of (1.7) is equal to A∗ λ. This completes the proof. Lemma 1.33. Let A ∈ Rn×n max be an irreducible matrix, with eigenvalue λ and an associated eigenvector v. Then the matrix A∗ λ has eigenvalue e, also with an associated eigenvector v. Proof. Firstly, note that for any j ∈ {1, . . . , n} [λ ⊗ v]j = [A ⊗ v]j ⇐⇒ vj = [A ⊗ v]j − λ ⇐⇒ e ⊗ vj = [Aλ ⊗ v]j That is, e ⊗ v = Aλ ⊗ v, and v is also an eigenvector of Aλ (whose unique eigenvalue must be e by Theorem 1.30). Thus the eigenspaces V (A) and V (Aλ) coincide. Next, note that (E ⊕ Aλ) ⊗ v = (E ⊗ v) ⊕ (Aλ ⊗ v) = v ⊕ v = v Therefore, using Lemma 1.32: A∗ λ ⊗ v = (E ⊕ Aλ)⊗(n−1) ⊗ v = v = e ⊗ v as required. Definition 1.34. Let A ∈ Rn×n max be a matrix with eigenvalue λ and associated eigenvector v. The saturation graph of A with respect to λ, denoted Sλ(A, v), is the graph consisting of those edges (j, i) ∈ E(A) such that aij ⊗ vj = λ ⊗ vi, with vi, vj = ε. Recall that by definition, if v is an eigenvector of A then there exists at least one i ∈ {1, . . . , n} such that vi = ε. Then, since A ⊗ v = λ ⊗ v we have that n j=1 aij ⊗ vj = λ ⊗ vi, which implies that there exists (at least one) j ∈ {1, . . . , n} such that aij ⊗ vj = λ ⊗ vi. This value is finite (assuming λ = ε), so we must have (j, i) ∈ E Sλ(A, v) . That is, the saturation graph of A w.r.t. λ is never empty. Indeed, if A is irreducible, by Proposition 1.31 we know that vi > ε for all i ∈ {1, . . . , n}, and so by the same argument, Sλ(A, v) contains all the vertices in V(A). In this case we know that the eigenvalue λ is unique, and therefore we drop the dependence on λ and simply refer to the saturation graph of A. Lemma 1.35. Let A ∈ Rn×n max be an irreducible matrix, with eigenvalue λ and associated eigen- vector v. We have: (i) For each vertex i ∈ V(A), there exists a circuit in S(A, v) from which vertex i can be reached in a finite number of steps. (ii) Any circuit in S(A, v) belongs to Gc(A). Proof. (i) A is irreducible, so by Proposition 1.31 we know that vi > ε for all i ∈ {1, . . . , n}. Let 21
  • 27. i ∈ V(A), which by the discussion above we know is a vertex of the saturation graph S(A, v). Thus there is a vertex j such that λ ⊗ vi = aij ⊗ vj. Repeating this argument, we can identify a vertex k such that λ ⊗ vj = ajk ⊗ vk. Repeating this argument an arbitrary number of times, say, m, we get a path in S(A, v) of length m. If m > n, the constructed path must contain a circuit. (ii) Let c = (i1, i2, . . . , il+1) be a circuit of length l in S(A, v). By definition, for all k ∈ {1, . . . , n} we have that λ ⊗ vik+1 = aik+1ik ⊗ vik which implies that λ⊗l ⊗ vi1 = l k=1 aik+1ik ⊗ vi1 Hence, recalling that vi1 is finite: λ⊗l = l k=1 aik+1ik But the right-hand side is simply equal to the weight of the circuit c, which thus has average weight λ. But A is irreducible, so by Theorem 1.30 λ is equal to the maximal average circuit weight in G(A). Thus c is critical, and belongs to Gc(A). Lemma 1.36. Let A ∈ Rn×n max be an irreducible matrix, with eigenvalue λ and associated eigen- vector v. Then v can be written as v = j∈Vc(A) αj ⊗ [A∗ λ]·j for some αj ∈ Rmax, j ∈ Vc(A). Proof. Consider two vertices i, j in S(Aλ, v) such that there exists a path from i to j, say, (i1, i2, . . . , il+1), with i1 = i and il+1 = j. Then by definition of the saturation graph, this gives [Aλ]ik+1ik ⊗ vik = vik+1 , k ∈ {1, . . . , l} Hence vj = a ⊗ vi, where a is given by a = l k=1 [Aλ]ik+1ik ≤ [A⊗l λ ]ji ≤ [A∗ λ]ji (1.8) Now, using that vj = a ⊗ vi, for an arbitrary vertex ν ∈ {1, . . . , n}: [A∗ λ]νj ⊗ vj = [A∗ λ]νj ⊗ a ⊗ vi ≤ [A∗ λ]νj ⊗ [A∗ λ]ji ⊗ vi (by (1.8)) 22
  • 28. ≤ [A∗ λ]νi ⊗ vi (1.9) where the last inequality follows from Proposition 1.21. By applying Lemma 1.35, for any vertex j in S(Aλ, v) there exists a vertex i = i(j) ∈ Vc(A). Inequality (1.9) therefore implies j∈S(Aλ,v) [A∗ λ]νj ⊗ vj ≤ i∈Vc(Aλ) [A∗ λ]νi ⊗ vi (1.10) and this holds for any ν ∈ {1, . . . , n}. Now, by Lemma 1.33, A∗ λ has eigenvalue e with an associated eigenvector v, i.e. v = A∗ λ ⊗ v. The value of vν is equal to [A∗ λ]νj ⊗vj for some j, which by definition has to be in the saturation graph S(Aλ, v). Thus it holds for ν ∈ {1, . . . , n} that vν = j∈S(Aλ,v) [A∗ λ]νj ⊗ vj (1.10) ≤ j∈Vc(Aλ) [A∗ λ]νj ⊗ vj On the other hand, since v is an eigenvector of A∗ λ associated with the eigenvalue e, vν = [A∗ λ ⊗ v]ν = n j=1 [A∗ λ]νj ⊗ vj ≥ i∈Vc(Aλ) [A∗ λ]νi ⊗ vi which also holds for any ν ∈ {1, . . . , n}. Thus we have shown vν = i∈Vc(Aλ) [A∗ λ]νi ⊗ vi and since Vc(Aλ) = Vc(A) (see the proof of Theorem 1.29), the proof is complete. The lemma above shows that for an irreducible matrix A, the vectors [A∗ λ]·j, with j ∈ Vc(A), constitute a generating set for the eigenspace of A. Notice that in the proof we have actually identified the coefficients αi to which we referred in the statement of the lemma. If some of the columns of A∗ λ are colinear then the αi’s are non-unique and some can be chosen to equal ε. We have now done most of the work in characterising the eigenspace of an irreducible matrix. We now require a small extension of our notation and one more lemma before we are able to give a complete expression for the eigenspace, and we will end this section by referring to a theorem which shows that it is not possible to simplify this expression any further. Notation. Recall that the critical classes of a matrix A ∈ Rn×n max are the maximal strongly connected subgraphs of Gc(A). Let Nc(A) denote the number of critical classes of A, so Nc(A) ∈ N. For r ∈ {1, . . . , Nc(A)}, let Gc r(A) = (Vc r (A), Ec r (A)) denote the r-th critical class of A and let jc r := min{j ∈ Vc r (A)} be the smallest numbered vertex in the r-th critical class. We call {jc 1, . . . , jc Nc(A)} a set of representative vertices of the critical classes of A. Note that in the way defined above, the set of representative vertices is unique. However, this is not important - in general, a representative vertex jc r of the rth critical class of A can be any 23
  • 29. j ∈ Vc r (A). Lemma 1.37. Let A ∈ Rn×n max be an irreducible matrix with eigenvalue λ. Then for i, j ∈ Vc(A), there exists α ∈ Rmax{ε} such that α ⊗ [A∗ λ]·i = [A∗ λ]·j iff i and j are members of the same critical class. Proof. Suppose that i, j ∈ Vc(A) are members of the same critical class of Aλ. Then i and j communicate with each other in the critical graph, i.e. (i, j, i) is an elementary circuit in Gc(Aλ). As we have argued before (see Theorem 1.29), any circuit in Gc(Aλ) must have weight e, and therefore in this case we have [Aλ]ji ⊗ [Aλ]ij = e. Then by definition of A∗ λ, we have that [A∗ λ]ji ⊗ [A∗ λ]ij ≥ [Aλ]ji ⊗ [Aλ]ij = e (1.11) Now by a previous observation we know that [A∗ λ]jj = e, and by Proposition 1.21 we have that A∗ λ = A∗ λ ⊗ A∗ λ. Therefore we also have [A∗ λ]ji ⊗ [A∗ λ]ij ≤ n l=1 [A∗ λ]jl ⊗ [A∗ λ]lj = [A∗ λ ⊗ A∗ λ]jj = [A∗ λ]jj = e (1.12) and from (1.11) and (1.12) we conclude that [A∗ λ]ji ⊗ [A∗ λ]ij = e. Thus for all l ∈ {1, . . . , n}: [A∗ λ]li ⊗ [A∗ λ]ij ≤ [A∗ λ]lj = [A∗ λ]lj ⊗ [A∗ λ]ji ⊗ [A∗ λ]ij ≤ [A∗ λ]li ⊗ [A∗ λ]ij and therefore [A∗ λ]lj = [A∗ λ]li ⊗ [A∗ λ]ij. Hence the statement of the lemma has been proved, with α = [A∗ λ]ij. Conversely, suppose now that i, j ∈ Vc(A) do not belong to the same critical class, and suppose for contradiction that we can find α ∈ Rmax{ε} such that α ⊗ [A∗ λ]·i = [A∗ λ]·j. The i-th and j-th components of this equation read α ⊗ e = [A∗ λ]ij and α ⊗ [A∗ λ]ji = e respectively, from which it follows that [A∗ λ]ij ⊗ [A∗ λ]ji = e Therefore the elementary circuit (i, j, i) has average weight e, and therefore belongs to Gc(Aλ). Thus vertices i and j are members of the same critical class (since they communicate with each other), which is a contradiction. Theorem 1.38. Let A ∈ Rn×n max be an irreducible matrix with (unique) eigenvalue λ. The 24
  • 30. eigenspace of A is given by V (A) =    Nc(A) r=1 αr ⊗ [A∗ λ]·jc r : αr ∈ Rmax, at least one αr finite    for any set of representative vertices {jc 1, . . . , jc Nc(A)} of the critical classes of A. Proof. By Lemma 1.36 we know that any eigenvector of A is a linear combination of the columns [A∗ λ]·j, for j ∈ Vc(A). However, by Lemma 1.37 we know that the columns [A∗ λ]·j for j in some critical class Vc r (A) are all colinear. Therefore to build any eigenvector we only need one column corresponding to each critical class, and so it suffices to take the sum over a set of representative vertices of the critical classes of A. Theorem 1.39. No column [A∗ λ]·i, for i ∈ Vc(A), can be expressed as a linear combination of columns [A∗ λ]·jc r , where jc r varies over the representative vertices of critical classes distinct from that of i. Proof. The proof of this statement requires substantial groundwork which we do not have the space to include. For all the details and a full proof, readers are referred to theorem 3.101 in [2]. Theorem 1.39 above tells us that we cannot simplify any further the expression for V (A) given in Theorem 1.38. It also tells us that for an irreducible matrix A, the columns [A∗ λ]·jc r , where {jc 1, . . . , jc Nc(A)} is a set of representative vertices of the critical classes of A, form a basis for the eigenspace V (A). 1.4.3 A Worked Example Consider the matrix A =        ε −2 ε 6 1 ε 4 ε ε 8 ε ε ε 5 ε 6        Thus G(A) looks like 25
[Figure 1.1: Communication graph of the matrix A given above. Vertices are represented as circles and numbered 1-4 by convention. Edges are present only if the corresponding entry in A is finite, in which case this value specifies the edge weight.]

We can see that G(A) is strongly connected, so A is irreducible. Thus by Theorem 1.30, A has a unique eigenvalue λ given by the maximal average circuit weight in G(A). The elementary circuits and their average weights are

c1 = (1, 2, 1)        |c1|w / |c1|l = (1 ⊗ −2)/2 = −0.5
c2 = (1, 2, 4, 1)     |c2|w / |c2|l = (1 ⊗ 5 ⊗ 6)/3 = 4
c3 = (2, 3, 2)        |c3|w / |c3|l = (8 ⊗ 4)/2 = 6
c4 = (4, 4)           |c4|w / |c4|l = (6)/1 = 6

and therefore λ = max{−0.5, 4, 6, 6} = 6. Circuits c3 and c4 are critical, so the critical graph Gc(A) looks like

[Figure 1.2: Critical graph of the matrix A given above. Both circuits have the maximal average weight of 6. The other circuits present in Figure 1.1 are no longer included because they are not critical (their average weight is not maximal).]

We can see that Vc(A) = {2, 3, 4}, and Gc(A) has two critical classes, with vertex sets Vc1(A) = {2, 3} and Vc2(A) = {4} respectively. Thus {jc1 = 2, jc2 = 4} is a set of representative vertices of the critical classes of A. Now, using that [Aλ]ij = aij − λ, we have

Aλ =
\begin{pmatrix}
ε & −8 & ε & e \\
−5 & ε & −2 & ε \\
ε & 2 & ε & ε \\
ε & −1 & ε & e
\end{pmatrix}

and either by inspection of G(Aλ), or by using Lemma 1.17 and computing Aλ^{⊗1}, Aλ^{⊗2}, Aλ^{⊗3} and Aλ^{⊗4}, we can see that

A+λ =
\begin{pmatrix}
−6 & −1 & −3 & e \\
−5 & e & −2 & −5 \\
−3 & 2 & e & −3 \\
−6 & −1 & −3 & e
\end{pmatrix}

Similarly, by using Lemma 1.20 (or by simply replacing any non-zero diagonal values in A+λ above by e), we obtain

A∗λ =
\begin{pmatrix}
e & −1 & −3 & e \\
−5 & e & −2 & −5 \\
−3 & 2 & e & −3 \\
−6 & −1 & −3 & e
\end{pmatrix}

Now by Theorems 1.38 and 1.39, the columns [A∗λ]·2 and [A∗λ]·4 form a basis for the eigenspace of A, i.e.

V(A) = { α1 ⊗ (−1, e, 2, −1)ᵀ ⊕ α2 ⊗ (e, −5, −3, e)ᵀ : α1, α2 ∈ Rmax, at least one αr finite }

For example, if we take α1 = −2, α2 = 1 we get

v := −2 ⊗ (−1, e, 2, −1)ᵀ ⊕ 1 ⊗ (e, −5, −3, e)ᵀ = (−3, −2, e, −3)ᵀ ⊕ (1, −4, −2, 1)ᵀ = (1, −2, e, 1)ᵀ

and we can easily verify that this is indeed an eigenvector of A, associated with the unique eigenvalue λ = 6:

A ⊗ v =
\begin{pmatrix}
ε & −2 & ε & 6 \\
1 & ε & 4 & ε \\
ε & 8 & ε & ε \\
ε & 5 & ε & 6
\end{pmatrix}
⊗
\begin{pmatrix} 1 \\ −2 \\ e \\ 1 \end{pmatrix}
=
\begin{pmatrix} 7 \\ 4 \\ 6 \\ 7 \end{pmatrix}
= 6 ⊗
\begin{pmatrix} 1 \\ −2 \\ e \\ 1 \end{pmatrix}
= λ ⊗ v

Finally, we can observe that

[A∗λ]·3 = (−3, −2, e, −3)ᵀ = −2 ⊗ (−1, e, 2, −1)ᵀ = −2 ⊗ [A∗λ]·2

That is, columns [A∗λ]·2 and [A∗λ]·3 are scalar multiples of each other, which we would expect
  • 33. (see Lemma 1.37) since vertices 2 and 3 are in the same critical class. 1.5 Recurrence Relations & Periodicity 1.5.1 Solving Max-Plus Recurrence Relations In many of the applications discussed in Chapters 2 and 3 we will need to solve recurrence relations over the max-plus semiring. A key insight in doing this is to view implicit first-order recurrence relations of the form x(k + 1) = (A ⊗ x(k + 1)) ⊕ (B ⊗ x(k)) as a system of max-plus linear equations x = (A ⊗ x) ⊕ b. The result below uses the ∗ operator (see Definition 1.19) to solve systems of this form. Theorem 1.40. Let A ∈ Rn×n max and b ∈ Rn max. If the communication graph G(A) has no circuit with positive average weight, then the equation x = (A ⊗ x) ⊕ b (1.13) has the solution x = A∗ ⊗ b. Furthermore, if all the circuit weights in G(A) are negative, then this solution is unique. Proof. By Lemma 1.20 we know that A∗ exists. We therefore have A∗ ⊗ b = ∞ k=0 A⊗k ⊗ b =   ∞ k=1 A⊗k ⊗ b   ⊕ (E ⊗ b) = A ⊗   ∞ k=0 A⊗k ⊗ b   ⊕ (E ⊗ b) = A ⊗ (A∗ ⊗ b) ⊕ b and therefore A∗ ⊗ b is indeed a solution of (1.13). To show uniqueness, suppose that x is a solution of x = b⊕(A⊗x); then we can substitute the expression for x back into the right-hand side of the equation to obtain x = b ⊕ (A ⊗ b) ⊕ (A⊗2 ⊗ x) Repeating this procedure yields x = b ⊕ (A ⊗ b) ⊕ (A⊗2 ⊗ b) ⊕ (A⊗3 ⊗ x) = . . . = b ⊕ (A ⊗ b) ⊕ · · · ⊕ (A⊗(k−1) ⊗ b) ⊕ (A⊗k ⊗ x) = k−1 l=0 (A⊗l ⊗ b) ⊕ (A⊗k ⊗ x) (1.14) 28
  • 34. By Theorem 1.16, the entries A⊗k are the maximal weights of paths of length k. For k large enough, these paths necessarily contain elementary circuits, which have negative weight by assumption. Indeed, as k → ∞ the number of elementary circuits in these paths also necessarily tends to ∞, and so the elements of A⊗k tend to ε. Hence, letting k → ∞ in (1.14) gives that x = A∗ ⊗ b (where once again we have applied Lemma 1.20), as required. As we mentioned above, Theorem 1.40 above can be applied to the implicit recurrence relation x(k + 1) = A ⊗ x(k + 1) ⊕ B ⊗ x(k) to yield the explicit recurrence relation x(k + 1) = A∗ ⊗ B ⊗ x(k) and this technique will be used several times in Chapter 2. However, can we extend this theory? In many applications we will encounter systems whose dynamics follow a recurrence relation of order higher than one. Consider the most general (explicit) linear recurrence relation of order M ≥ 1: x(k) = M m=0 Am ⊗ x(k − m), k ≥ 0 (1.15) Here, A0, . . . , AM ∈ Rn×n max and x(m) ∈ Rn max, −M ≤ m ≤ −1 are given. We show below that we can transform (1.15) into a first-order recurrence relation of the form x(k + 1) = A ⊗ x(k), provided that A0 has no circuit of positive weight. To begin, set b(k) = M m=1 Am ⊗ x(k − m) Then (1.15) becomes x(k) = A0 ⊗ x(k) ⊕ b(k) (1.16) Then, since A0 has no circuit of positive weight by assumption, we can apply Theorem 1.40 to write (1.16) as x(k) = A∗ 0 ⊗ b(k) = A∗ 0 ⊗ A1 ⊗ x(k − 1) ⊕ · · · ⊕ A∗ 0 ⊗ AM ⊗ x(k − M) (1.17) Note that we have now changed the implicit M-th order recurrence relation (1.15) into the explicit M-th order recurrence relation (1.17) (the x(k) term does not feature on the right hand side). To finish the job, we set x(k) := (x (k − 1), x (k − 2), . . . , x (k − M)) 29
  • 35. and (with E denoting a matrix of all εs): A :=            A∗ 0 ⊗ A1 A∗ 0 ⊗ A2 . . . . . . A∗ 0 ⊗ AM E E . . . . . . E E E ... E ... ... ... E E . . . E E            Then (1.15) can be written as x(k + 1) = A ⊗ x(k), k ≥ 0 (1.18) which is what we were aiming for. It will come as no surprise that problems of this form are closely related to the concept of eigenvalues and eigenvectors studied in the previous section. For example, if the recurrence relation x(k + 1) = A ⊗ x(k) is given the initial condition x(0), where x(0) is an eigenvector of A with corresponding eigenvalue λ, then the solution x(k) is given by x(k) = λ⊗k ⊗ x(0). It could then be said that the solution is periodic. The final section of this chapter explores the limiting behaviour of the solution x(k) when the system is initialised with an arbitrary vector x(0), and in particular whether we can say anything about its periodicity in general. 1.5.2 Limiting Behaviour In this section we state and prove a theorem which establishes an important result on the asymptotic behaviour of the powers of an irreducible matrix A in terms of its unique eigenvalue λ. In simple terms, this theorem says that sequential powers of A always exhibit periodic behaviour after a finite number of steps. We will then apply this result to the recurrence relations we studied in the previous section. It turns out that the periodicity depends on a quantity known as the cyclicity of A, which we define below in two steps. Definition 1.41. The cyclicity of a graph G, denoted σG, is defined as follows: • If G is strongly connected, then its cyclicity equals the greatest common divisor of the lengths of all the elementary circuits in G. If G consists of just one vertex without a self-loop, then its cyclicity is taken to be 1. • If G is not strongly connected, then its cyclicity equals the least common multiple of the cyclicities of all the maximal strongly connected subgraphs of G. Definition 1.42. The cyclicity of a matrix A ∈ Rn×n max , denoted σ(A), is equal to σGc(A), the cyclicity of the critical graph of A. If A is a square matrix over Rmax then we often talk of the graph cyclicity and matrix cyclicity of A, where the graph cyclicity refers to the cyclicity of the graph Gc(A). 30
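The periodic behaviour described above can be observed experimentally before it is proved. For the matrix of Section 1.4.3 (whose unique eigenvalue was computed there to be λ = 6) one can compute successive powers A^{⊗k} and search for a period σ and a threshold beyond which A^{⊗(k+σ)} and A^{⊗k} differ by the constant σ × λ in every finite entry. The sketch below is our own exploratory illustration; it performs this search over the computed range of powers rather than presupposing the answer.

```python
# Exploratory sketch: look for periodic behaviour of the powers of the matrix
# A from Section 1.4.3, i.e. a period sigma and a threshold beyond which
#   A^{⊗(k+sigma)} = (sigma × λ) ⊗ A^{⊗k}
# holds for every computed power A^{⊗k}.
EPS = float('-inf')

def mat_mul(A, B):
    n = len(A)
    return [[max(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def shift(A, c):                      # c ⊗ A: add c to every finite entry
    return [[x + c if x != EPS else EPS for x in row] for row in A]

if __name__ == "__main__":
    A = [[EPS, -2.0, EPS, 6.0],
         [1.0,  EPS, 4.0, EPS],
         [EPS,  8.0, EPS, EPS],
         [EPS,  5.0, EPS, 6.0]]
    lam = 6.0                                          # eigenvalue from Section 1.4.3
    powers = [A]                                       # powers[k] = A^{⊗(k+1)}
    for _ in range(40):
        powers.append(mat_mul(powers[-1], A))
    for sigma in (1, 2, 3, 4):
        ok = [powers[k + sigma] == shift(powers[k], sigma * lam)
              for k in range(len(powers) - sigma)]
        if ok[-1]:                                     # relation holds at the end of the range
            k0 = len(ok)
            while k0 > 0 and ok[k0 - 1]:
                k0 -= 1
            print(f"A^(k+{sigma}) = ({sigma} × λ) ⊗ A^k for all computed k >= {k0 + 1}")
```

For this particular matrix the smallest period reported agrees with the cyclicity of its critical graph, which is what the theorem below leads us to expect.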
  • 36. It may seem strange to define the cyclicity of a matrix A via its critical graph and not its communication graph. However, as we will see below, it turns out that the former quantity determines the periodic behaviour of the powers of A, so the reason for this choice should be clear. Before proving the main theorem of this section we require several preliminary results. The first one is an important lemma from graph theory, which we explore below. Lemma 1.43. Let A ∈ Rn×n max be an irreducible matrix, and let the cyclicity of its communica- tion graph be σG. Then, after a suitable relabelling of the vertices of G(A), the matrix A⊗σG corresponds to a block diagonal matrix with σG blocks on the diagonal. The communication graph of each diagonal block is strongly connected and has cyclicity one. Moreover, the eigenvalues of all diagonal blocks have the same value. Proof. For i, j ∈ V(A), define the relation i ∼ j ⇐⇒ the length of every path from i to j in G(A) is a multiple of σG. It is easy to show that this is an equivalence relation on V(A). Therefore if k0 ∈ V(A) is fixed, we can introduce equivalence classes C0, C1, . . . , CσG−1 as i ∈ Cl ⇐⇒ every path from k0 to i in G(A) has length (mod σG) equal to l, (1.19) for l = 0, 1, . . . , σG. Then for i, j ∈ V(A), we have that i ∼ j ⇐⇒ i, j ∈ Cl for some l = 0, 1, . . . , σG − 1. Assume that there is a path from i to j of length σG. By definition of cyclicity, the length of any circuit starting and finishing at i must be divisible by σG, so there must also be a path from j to i whose length is a multiple of σG. Therefore every path from i to j must have a length that is a multiple of σG (since if not, we could use such a path to create a circuit whose length is not divisible by σG). Hence, every path of length σG must start and end in the same equivalence class as defined in (1.19). Since A⊗σG can be computed by considering all paths of length σG in G(A) (see Theorem 1.16), it follows that A⊗σG is block-diagonal, possibly after an appropriate relabelling of the vertices according to the classes C1, . . . , CσG−1; for instance, by first labelling all vertices in C0, then all the vertices in C1, and so on. Now let l ∈ {0, 1, . . . , σG − 1}. From our remark above we know that if i, j ∈ Cl then i ∼ j, i.e. the length of every path from i to j is a multiple of σG. Since A is irreducible there must be at least one such path, which can be split up into a number of subpaths, all of length σG and going from one vertex in Cl to another vertex in Cl. It follows that the block of A⊗σG corresponding to class Cl is irreducible. Next, note that every circuit in G(A) must go through all the equivalence classes C1, . . . , CσG−1. To see this, suppose there is a circuit going through just τ of the classes, where τ < σG. Then there must be a class Cl and vertices i, j ∈ Cl such that there is a path from i to j of length 31
  • 37. less than or equal to τ. This is a contradiction, since the length of a path between vertices in the same class must be a multiple of σG. Hence the number of circuits in G(A) is the same as the number of circuits going through any class Cl. Observe that circuits in G(A) of length κ × σG can be associated with circuits in G(A⊗σG ) of length κ. Since the greatest common divisor of all circuits in G(A) is σG, it follows that the communication graph of the block in A⊗σG corresponding to class Cl has cyclicity one. Finally, the fact that the eigenvalues of the diagonal blocks are identical follows immediately from the irreducibility of A. Corollary 1.44. Under the conditions of Lemma 1.43, let τ be a multiple of σG. Then, after a relabelling of the vertices of G(A), the matrix A⊗τ corresponds to a block diagonal matrix with σG blocks on the diagonal. The communication graph of each diagonal block is strongly connected and has cyclicity one. Proof. This follows along the same lines as the proof of Lemma 1.43. Let A ∈ Rn×n max be an irreducible matrix and let Gc(A) be its critical graph. Define the critical matrix of A, denoted Ac, to be the submatrix of A such that the communication graph of Ac is equal to the critical graph of A, i.e. G(Ac) = Gc(A). Matrix Ac can be obtained from matrix A by restricting A to those entries that correspond to edges in Gc(A). Clearly the critical graph of Ac is the same as its communication graph, i.e. Gc(Ac) = G(Ac), and therefore σGc(Ac) = σG(Ac). It then follows that the cyclicity of the matrix Ac is equal to the cyclicity of the communication graph G(Ac) (i.e. σ(Ac) = σG(Ac)); that is, for the critical matrix Ac both types of cyclicity coincide and are equal to σ(A). We know that G(Ac) = Gc(A) = Gc(Ac), but we can prove more: Lemma 1.45. Let A be an irreducible matrix, and let Ac be its corresponding critical matrix. Then, for all k ≥ 1 we have G((Ac )⊗k ) = Gc (A⊗k ) = Gc ((Ac )⊗k ). Proof. As we noted above, Ac is a submatrix of A, and therefore (Ac)⊗k is a submatrix of A⊗k. Furthermore, note that Gc(·) is a subgraph of G(·), which we shall denote Gc(·) ⊆ G(·). It follows that Gc((Ac)⊗k) ⊆ Gc(A⊗k) and Gc((Ac)⊗k) ⊆ G((Ac)⊗k). To prove the converse inclusions, note that any edge in G(A⊗k) from vertex i to vertex j corresponds to a path in G(A) of length k from vertex i to vertex j. Thus if a number of edges in G(A⊗k) form a circuit of length l, then the corresponding paths in G(A) form a circuit of length k×l. Conversely, consider a circuit in G(A), choose any vertex on the circuit and traverse the circuit with steps of length k until the chosen vertex is reached again. If l such steps are required then there is a corresponding circuit in G(A⊗k) of length l. In the same way, critical circuits in G(A⊗k) of length l correspond to critical circuits in G(A) of length k × l, and vice versa. 32
  • 38. If c is a critical circuit of length l in G(A⊗k) then there is a corresponding critical circuit c of length k ×l in G(A). This circuit must be in Gc(A) (because it is critical), which in turn implies that c is a critical circuit in G((Ac)⊗k). Hence, it follows that Gc((Ac)⊗k) ⊇ Gc(A⊗k). The other inclusion is proved in the same way. Lemma 1.46. Let A ∈ Rn×n max be an irreducible matrix with cyclicity σ = σ(A). Then the cyclicity of the matrix A⊗σ is equal to one. Proof. Firstly, suppose the critical matrix Ac is irreducible. By the remarks prior to Lemma 1.45 we know that the cyclicity of Ac and that of its communication graph is equal to σ, so by Lemma 1.43, after a suitable relabelling of vertices, (Ac)⊗σ corresponds to a block diagonal matrix with square diagonal blocks that are irreducible and have graph cyclicity one. However, by Lemma 1.45 with k = σ, we have that Gc((Ac)⊗σ) = G((Ac)⊗σ), and therefore the communication graph of each of the diagonal blocks of (Ac)⊗σ coincides with its critical graph. Thus for each diagonal block both cyclicities coincide, and therefore both are one. If Ac is reducible then the same process can be done for each of the critical classes of Gc(A) with their individual cyclicities. According to Definition 1.41, the least common multiple of these cyclicities equals σ, the matrix cyclicity of A. Noting that σ is a multiple of σG(A), it follows from Corollary 1.44 that each diagonal block of (Ac)⊗σ corresponds to a block diagonal matrix with square diagonal blocks that are irreducible and have cyclicity one. Note that if Gc(A) does not cover all the vertices of G(A) then we must augment the overall block diagonal matrix with a square block with entries equal to ε in order to keep it the same size as the original matrix A. In both cases it follows that each diagonal block of the block diagonal matrix corresponding to (Ac)⊗σ is irreducible and has cyclicity one. Taking the least common multiple of all cyclicities, this means that the cyclicity of the whole matrix (Ac)⊗σ is equal to one, and therefore the graph cyclicity of Gc((Ac)⊗σ) is also equal to one. But by Lemma 1.45 with k = σ, this graph is the same as Gc(A⊗σ), which therefore must also have cyclicity one. Thus A⊗σ has matrix cyclicity one, which completes the proof. We now state a fundamental theorem, the proof of which can be found in [4]. Theorem 1.47. Let β1, . . . , βq ∈ N be such that gcd{β1, . . . , βq} = 1. Then there exists N ∈ N such that for all k ≥ N there exist n1, . . . , nq ∈ N0 such that k = (n1 × β1) + · · · + (nq × βq). We finally state and prove one last prerequisite result which is essentially a special case of the theorem that follows. It turns out that the generalisation is relatively straightforward, so in proving this lemma we will have done most of the work in proving the main result. Lemma 1.48. Let A ∈ Rn×n max be an irreducible matrix with unique eigenvalue e and cyclicity one. Then there exists N ∈ N such that A⊗(k+1) = A⊗k 33
for all k ≥ N.

Proof. The proof comes in three stages. We show that there exists N ∈ N such that for all k ≥ N:

1. [A⊗k]ii = [A+]ii = e for all i ∈ Vc(A),

2. [A⊗k]ij = [A+]ij for all i ∈ Vc(A) and j ∈ {1, . . . , n},

3. [A⊗k]ij = ⊕_{l∈Vc(A)} [A+]il ⊗ [A+]lj for all i, j ∈ {1, . . . , n}.

The result then follows immediately from statement 3, since the right-hand side does not depend on k.

Statement 1. Consider i ∈ Vc(A). Then there is a critical class of Gc(A), say Gc1(A) = (Vc1(A), Ec1(A)), such that i ∈ Vc1(A). Since the cyclicity of matrix A is one, it follows that the cyclicity of graph Gc1(A) is equal to one too. Hence there exist circuits in Gc1(A), say c1, . . . , cq, whose lengths have a greatest common divisor equal to one. Since Gc1(A) is a critical class it must be strongly connected, and therefore there exists a circuit α in Gc1(A) that passes through i and through all the circuits c1, . . . , cq (i.e. α ∩ cj ≠ ∅ for all j = 1, . . . , q). Now, by Theorem 1.47, there exists N ∈ N such that for each k ≥ N there exist n1, . . . , nq ∈ N0 with

k = |α|_l + (n1 × |c1|_l) + · · · + (nq × |cq|_l).

For these n1, . . . , nq we can construct a circuit of length k passing through i, built from circuit α, n1 copies of circuit c1, n2 copies of circuit c2, and so on, up to nq copies of circuit cq. Clearly this circuit is in Gc1(A), so it must be critical with weight e. Since the maximal average circuit weight in G(A) is e, it follows that [A⊗k]ii = e for all k ≥ N, which, by the definition of A+, also implies that [A+]ii = e, as required.

Statement 2. By the definition of A+ there exists l ∈ N such that [A⊗l]ij = [A+]ij. In fact, since the eigenvalue of A is e, it follows from Lemma 1.17 that l ≤ n. From statement 1, for k large enough, i ∈ Vc(A) and j ∈ {1, . . . , n}, we then have

[A⊗(k+l)]ij ≥ [A⊗k]ii ⊗ [A⊗l]ij = [A⊗l]ij = [A+]ij.

In addition, we clearly also have

[A+]ij = ⊕_{m=1}^{∞} [A⊗m]ij ≥ [A⊗(k+l)]ij ≥ [A+]ij,

so by replacing k + l with k, it follows that [A⊗k]ij = [A+]ij for all i ∈ Vc(A) and j ∈ {1, . . . , n}, with k large enough. This is what we wanted to prove.

Statement 3. Following the same lines as in the proof of statement 2, we can also show that [A⊗m]ij = [A+]ij for all i ∈ {1, . . . , n}, j ∈ Vc(A) and with m large enough. Together, take k
  • 40. and m large enough such that [A⊗k]il = [A+]il and [A⊗m]lj = [A+]lj for all l ∈ Vc(A). Then [A⊗(k+m) ]ij ≥ [A⊗k ]il ⊗ [A⊗m ]lj = [A+ ]il ⊗ [A+ ]lj, for all l ∈ Vc(A). By replacing k + m with k, it follows that for k large enough [A⊗k ]ij ≥ l∈Vc(A) [A+ ]il ⊗ [A+ ]lj. Now let the maximal average weight of a non-critical circuit (i.e. a circuit not passing through any vertex in Vc(A)) be δ. Then the weight of a path from j to i of length k + 1 in G(A) not passing through any vertex in Vc(A) can be bounded above by [A+]ij + (k × δ) = [A+]ij ⊗ δ⊗k, since such a path consists of an elementary path from j to i (whose weight is bounded above by [A+]ij) and at most k non-critical circuits (whose weights are each bounded above by δ). Since the maximal average circuit weight in G(A) is e we must have δ < e, and so for k large enough [A+ ]ij ⊗ δ⊗k ≤ l∈Vc(A) [A+ ]il ⊗ [A+ ]lj. Indeed, the right-hand side is fixed, while the left-hand side tends to ε as k → ∞. Hence for k large enough we have that [A⊗k ]ij = l∈V(A) [A+ ]il ⊗ [A+ ]lj = l∈Vc(A) [A+ ]il ⊗ [A+ ]lj, for all i, j = 1, . . . , n. We can now state and prove the main theorem of this section. Theorem 1.49. Let A ∈ Rn×n max be an irreducible matrix with unique eigenvalue λ and cyclicity σ := σ(A). Then there exists N ∈ N such that A⊗(k+σ) = λ⊗σ ⊗ A⊗k for all k ≥ N. Proof. Consider the matrix B := (Aλ)⊗σ. Recall that σ is the cyclicity of the critical graph of A, which is a multiple of the cyclicity of the communication graph G(A). By Corollary 1.44, after a suitable relabelling of the vertices of G(A), matrix B is a block diagonal matrix with square diagonal blocks whose communication graphs are strongly connected and have cyclicity one. By Lemma 1.46 we have that the cyclicity of B is one, which implies that the cyclicity of each of its diagonal blocks is one. Hence by applying Lemma 1.48 to each diagonal block, it ultimately follows that there exists M ∈ N such that B⊗(l+1) = B⊗l for all l ≥ M. That is, (Aλ)⊗σ ⊗(l+1) = (Aλ)⊗σ ⊗l , 35
which can further be written as

(Aλ)⊗(l×σ+σ) = (Aλ)⊗(l×σ),

or

A⊗(l×σ+σ) = λ⊗σ ⊗ A⊗(l×σ),

for all l ≥ M. Finally, by multiplying both sides of this identity by A⊗j, note that A⊗(l×σ+j+σ) = λ⊗σ ⊗ A⊗(l×σ+j) for any 0 ≤ j ≤ σ − 1, implying that for all k ≥ N := M × σ we have

A⊗(k+σ) = λ⊗σ ⊗ A⊗k,

as required.

Theorem 1.49 can be seen as the max-plus analogue of the Perron-Frobenius theorem in conventional linear algebra. Strictly speaking it is the normalised matrix Aλ that exhibits periodic behaviour, since the unique eigenvalue of Aλ is e = 0, and then (Aλ)⊗(k+σ) = (Aλ)⊗k for k sufficiently large. However, we use the term ‘periodic’ to describe the more general behaviour seen here. Note that the cyclicity of A is the smallest possible length of such periodic behaviour (see [2] for the proof of this).

For our purposes, we now move on to applying this result to the recurrence relations studied in Section 1.5.1. Recall the form of the basic first-order recurrence relation

x(k + 1) = A ⊗ x(k), k ≥ 0, (1.20)

which has the solution x(k) = A⊗k ⊗ x(0). Applying Theorem 1.49 in this context gives, for k sufficiently large,

x(k + σ(A)) = A⊗(k+σ(A)) ⊗ x(0) = λ⊗σ(A) ⊗ A⊗k ⊗ x(0) = λ⊗σ(A) ⊗ x(k).

That is, the solution x(k) is periodic with period σ(A). If we interpret k as a time index then, also by Theorem 1.49, the solution enters this periodic regime after N =: t(A) time steps, where t(A) is called the transient time of A. In particular, if A has cyclicity equal to 1 then x(k + 1) = A ⊗ x(k) = λ ⊗ x(k) for all k ≥ t(A), and so for k sufficiently large x(k) effectively becomes an eigenvector of A. In other words, after t(A) time steps x(k) behaves like an eigenvector, and the effect of the initial condition x(0) has died out.

Note that the transient time of a matrix can be large even for systems of small dimension. For example, the matrix A defined by

A = \begin{pmatrix} -1 & -N \\ e & e \end{pmatrix}
where N ∈ {2, 3, . . . } has transient time t(A) = N, while its cyclicity is clearly 1.

Finally, we make some observations regarding the growth rate of the solution x(k). Note that if we take x(0) = v in (1.20), where v is an eigenvector of A, then we immediately obtain that for all j = 1, . . . , n:

lim_{k→∞} x_j(k)/k = λ,

where λ is the unique eigenvalue of A. By applying Theorem 1.49 it should be clear that this holds true for any initial value x(0) and not just for eigenvectors; indeed, this result is proved in [13]. We therefore say that the solution has an asymptotic growth rate of λ. Every max-plus recurrence relation of this form driven by an irreducible matrix exhibits this behaviour, regardless of the particular entries of A!
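Theorem 1.49 and the growth-rate observation are easy to verify numerically on a small example. The sketch below is illustrative only: the 3 × 3 matrix, the value σ(A) = 2 (worked out by hand for this particular matrix) and all helper names are assumptions made for the example rather than anything taken from the text. It computes max-plus powers, reads the eigenvalue off the diagonals of A, A⊗2, . . . , A⊗n, locates the first power from which A⊗(k+σ) = λ⊗σ ⊗ A⊗k holds, and checks that x_j(k)/k approaches λ for an arbitrary finite initial vector.

```python
import numpy as np

EPS = -np.inf  # max-plus epsilon

def mp_mul(X, Y):
    """Max-plus matrix product: (X ⊗ Y)_ij = max_l (X_il + Y_lj)."""
    return np.array([[np.max(X[i, :] + Y[:, j]) for j in range(Y.shape[1])]
                     for i in range(X.shape[0])])

# An irreducible example matrix; a[i][j] is the weight of the edge j -> i.
A = np.array([[1.0, EPS, 0.0],
              [0.0, EPS, 2.0],
              [EPS, 2.0, EPS]])
n = A.shape[0]

# Powers A, A^2, ..., A^K; the eigenvalue (maximal average circuit weight)
# can be read off the diagonals of the first n powers.
K = 25
powers = [A]
lam = np.max(np.diag(A))                 # the k = 1 term (self-loops)
for k in range(2, K + 1):
    powers.append(mp_mul(powers[-1], A))
    if k <= n:
        lam = max(lam, np.max(np.diag(powers[-1])) / k)
print("lambda =", lam)                   # 2.0 for this matrix

# For this matrix the critical graph is the circuit between vertices 1 and 2,
# so sigma(A) = 2 (determined by hand; not computed here).
sigma = 2
for k in range(1, K - sigma + 1):
    if np.array_equal(powers[k + sigma - 1], lam * sigma + powers[k - 1]):
        print("A^(k+sigma) = lambda^sigma (x) A^k from k =", k)
        break

# Asymptotic growth rate: x_j(k)/k -> lambda for any finite initial vector x(0).
x = np.array([0.0, 7.0, -3.0])
for _ in range(K):
    x = np.max(A + x, axis=1)            # x(k+1) = A (x) x(k)
print("x(K)/K =", x / K)                 # entries approach lambda = 2 as K grows
```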
Chapter 2

Petri Nets and Timed Event Graphs

2.1 A Motivating Example

The following example is adapted from chapter 1 of [2]. Consider a manufacturing system consisting of three machines M1, M2 and M3, which produces three kinds of parts P1, P2 and P3 according to different product mixes. The manufacturing process for each part is depicted below.

Figure 2.1: Manufacturing process for each part. Grey boxes represent the three machines; arrows represent the routes that the different parts must take in their respective manufacture.

Processing times are different for each machine and each part, and are given in the following table:

        P1   P2   P3
  M1     -    1    5
  M2     3    2    3
  M3     4    3    -

Table 2.1: Processing times for each part at each machine (arbitrary time units). Blank entries correspond to combinations of machine and part that do not form part of the manufacturing process.

Parts are carried through the manufacturing process on a limited number of pallets. We make
  • 44. the following assumptions: 1. Only one pallet is available for each part type. 2. Once production of a part is completed, it is removed from its respective pallet and the pallet returns to the beginning of the production line. 3. There are no set-up times or traveling times between machines. 4. The sequencing of part types on the machines is fixed, and for M1 is (P2, P3), for M2 (P1, P2, P3) and for M3 (P1, P2). Assumption (3) gives no loss of generality since if set-up times or traveling times did exist, we could combine them with the processing time at the appropriate machine. Assumption (4) means that machines have to wait for the appropriate part rather than starting work on any part that arrives first (see below for an example). This may or may not be realistic; extensions to the theory presented below in which this assumption is dropped are discussed in chapter 9 of [2]. We can model the time evolution of this system by considering the time that each machine starts working on the k-th part of type i, for i = 1, 2, 3 and k ∈ N. There are seven combinations of machines and parts, so we define x(k) = (x1(k), . . . , x7(k)) as follows: Variable xi(k) Definition x1(k) time that M1 starts working on the k-th unit of P2 x2(k) time that M1 starts working on the k-th unit of P3 x3(k) time that M2 starts working on the k-th unit of P1 x4(k) time that M2 starts working on the k-th unit of P2 x5(k) time that M2 starts working on the k-th unit of P3 x6(k) time that M3 starts working on the k-th unit of P1 x7(k) time that M3 starts working on the k-th unit of P2 Table 2.2: Definitions of each entry of the state vector x(k), for k ∈ N. By examining the production process, work by each machine on the (k+1)-st part is constrained in the following way: x1(k + 1) ≥ max x7(k) + 3, x2(k) + 5 x2(k + 1) ≥ max x5(k) + 3, x1(k + 1) + 1 x3(k + 1) ≥ max x6(k) + 4, x5(k) + 3 x4(k + 1) ≥ max x3(k + 1) + 3, x1(k + 1) + 1 x5(k + 1) ≥ max x2(k + 1) + 5, x4(k + 1) + 2 x6(k + 1) ≥ max x3(k + 1) + 3, x7(k) + 3 x7(k + 1) ≥ max x6(k + 1) + 4, x4(k + 1) + 2 39
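These timing constraints can be simulated directly by taking each inequality with equality, i.e. letting every machine start work as early as possible; the references to entries of x(k + 1) on the right-hand sides cause no difficulty because the seven maxima can be evaluated in the order listed. The sketch below is illustrative only: the function name and the vector of first start times are invented for the example and are not taken from the text.

```python
def step(x):
    """One step of the production recursion, reading the inequalities as equalities.
    x is the 7-vector of start times (x1, ..., x7) of Table 2.2, 0-indexed here."""
    y = [0] * 7
    y[0] = max(x[6] + 3, x[1] + 5)   # x1(k+1): M1 starts the (k+1)-st unit of P2
    y[1] = max(x[4] + 3, y[0] + 1)   # x2(k+1): M1 starts P3
    y[2] = max(x[5] + 4, x[4] + 3)   # x3(k+1): M2 starts P1
    y[3] = max(y[2] + 3, y[0] + 1)   # x4(k+1): M2 starts P2
    y[4] = max(y[1] + 5, y[3] + 2)   # x5(k+1): M2 starts P3
    y[5] = max(y[2] + 3, x[6] + 3)   # x6(k+1): M3 starts P1
    y[6] = max(y[5] + 4, y[3] + 2)   # x7(k+1): M3 starts P2
    return y

x = [0, 1, 0, 3, 6, 3, 7]            # a hypothetical vector of first start times x(1)
for k in range(1, 6):
    print("x(%d) =" % k, x)
    x = step(x)
```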
To see where these inequalities come from, consider x6(k + 1) as an example: M3 cannot start working on the (k + 1)-st unit of P1 until it has finished working on the k-th unit of P2, and until M2 has finished working on the (k + 1)-st unit of P1. If the system is run so that every machine starts work as early as possible, the inequalities above become equalities. This is where the theory of max-plus algebra comes to the fore. We can write the system in max-plus matrix form as

x(k + 1) = A0 ⊗ x(k + 1) ⊕ A1 ⊗ x(k)

where

A0 = \begin{pmatrix}
ε & ε & ε & ε & ε & ε & ε \\
1 & ε & ε & ε & ε & ε & ε \\
ε & ε & ε & ε & ε & ε & ε \\
1 & ε & 3 & ε & ε & ε & ε \\
ε & 5 & ε & 2 & ε & ε & ε \\
ε & ε & 3 & ε & ε & ε & ε \\
ε & ε & ε & 2 & ε & 4 & ε
\end{pmatrix};
\qquad
A1 = \begin{pmatrix}
ε & 5 & ε & ε & ε & ε & 3 \\
ε & ε & ε & ε & 3 & ε & ε \\
ε & ε & ε & ε & 3 & 4 & ε \\
ε & ε & ε & ε & ε & ε & ε \\
ε & ε & ε & ε & ε & ε & ε \\
ε & ε & ε & ε & ε & ε & 3 \\
ε & ε & ε & ε & ε & ε & ε
\end{pmatrix}

This is a first-order recurrence relation like those we have seen in Section 1.5. A quick examination of G(A0) shows that it does not contain any circuits of positive weight (indeed it does not contain any circuits at all), and therefore we can apply Theorem 1.40 to find the unique solution

x(k + 1) = A0* ⊗ A1 ⊗ x(k) = B ⊗ x(k)   (2.1)

where B := A0* ⊗ A1, or explicitly:

B = \begin{pmatrix}
ε & 5 & ε & ε & ε & ε & 3 \\
ε & 6 & ε & ε & 3 & ε & 4 \\
ε & ε & ε & ε & 3 & 4 & ε \\
ε & 6 & ε & ε & 6 & 7 & 4 \\
ε & 11 & ε & ε & 8 & 9 & 9 \\
ε & ε & ε & ε & 6 & 7 & 3 \\
ε & 8 & ε & ε & 10 & 11 & 7
\end{pmatrix}

If numerical values of x1(1), . . . , x7(1) are given then these values constitute the initial condition, and the future evolution of the system is uniquely determined. There are no restrictions on x(1) from a mathematical point of view, but given the physical interpretation of the system, limitations do exist. For example, if we assume that all three pallets start at the beginning of their respective production lines (with M1 working on P2 first), we have x1(1) = x3(1) = 0, but x2(1) cannot be less than 1 since M1 has to finish working on P2 before it can start working on P3.
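The computation of B = A0* ⊗ A1 is easy to reproduce in code. The sketch below is illustrative only (the helper names mp_mul and mp_star are invented here, and the final initial vector is an arbitrary choice rather than a physically meaningful one); the Kleene star is evaluated as the finite sum I ⊕ A0 ⊕ · · · ⊕ A0⊗6, which suffices because G(A0) contains no circuits.

```python
import numpy as np

EPS = -np.inf  # max-plus epsilon

def mp_mul(X, Y):
    """Max-plus matrix product: (X ⊗ Y)_ij = max_l (X_il + Y_lj)."""
    return np.array([[np.max(X[i, :] + Y[:, j]) for j in range(Y.shape[1])]
                     for i in range(X.shape[0])])

def mp_star(X):
    """Kleene star X* = I ⊕ X ⊕ ... ⊕ X^(n-1), valid when G(X) has no circuits."""
    n = X.shape[0]
    I = np.full((n, n), EPS)
    np.fill_diagonal(I, 0.0)          # the max-plus identity matrix
    S, P = I.copy(), I.copy()
    for _ in range(n - 1):
        P = mp_mul(P, X)
        S = np.maximum(S, P)          # S := S ⊕ P, entrywise maximum
    return S

e = EPS
A0 = np.array([[e, e, e, e, e, e, e],
               [1, e, e, e, e, e, e],
               [e, e, e, e, e, e, e],
               [1, e, 3, e, e, e, e],
               [e, 5, e, 2, e, e, e],
               [e, e, 3, e, e, e, e],
               [e, e, e, 2, e, 4, e]], dtype=float)
A1 = np.array([[e, 5, e, e, e, e, 3],
               [e, e, e, e, 3, e, e],
               [e, e, e, e, 3, 4, e],
               [e, e, e, e, e, e, e],
               [e, e, e, e, e, e, e],
               [e, e, e, e, e, e, 3],
               [e, e, e, e, e, e, e]], dtype=float)

B = mp_mul(mp_star(A0), A1)
print(B)   # reproduces the matrix B displayed above (with -inf in place of epsilon)

# One step of the evolution equation (2.1), x(k+1) = B ⊗ x(k):
x = np.zeros(7)                       # an arbitrary starting vector, for illustration only
print(np.max(B + x, axis=1))
```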
  • 46. Note that if we had allowed more than one pallet on any of the three production lines then the system would have been of higher order (for example, if the production line of P1 had three pallets then work on the (k +1)-st unit could start once the (k −2)-th unit had been produced). This system would be solvable using the techniques developed at the end of Section 1.5. Another possible extension would be to incorporate variable processing times rather than the constant values given in table 2.1. The processing times could vary according to how many parts the machines have already processed (i.e. vary with k), or they could exhibit stochastic variability (i.e. following some specified probability distribution). The first type of variability will be introduced with the basic autonomous equation below; stochastic event graph theory will be discussed in Chapter 3. Note that since we can describe the evolution of the system by a recurrence relation of the form (2.1), we might expect that we can apply Theorem 1.49 to see that the system settles down into a periodic regime after a finite length of time. However, upon closer inspection we see that the matrix B has a column of ε, so it is not irreducible and thus Theorem 1.49 does not apply. Later on in this chapter we will discuss some techniques which ensure that the evolution equation does involve an irreducible matrix and therefore enables us to draw the relevant conclusions. To end this introductory example, note that the way we have modeled our system does not immediately give us the production times of the k-th unit of P1, P2 and P3. In order to find these we could introduce an output vector y(k) = (y1(k), y2(k), y3(k)) defined by y(k) = C ⊗ x(k) where C =      ε ε ε ε ε 4 ε ε ε ε ε ε ε 3 ε ε ε ε 3 ε ε      Left multiplication by C adds the appropriate processing time to the starting time at the last machine in the production line of each part. Thus yi(k) gives us the time of production of the k-th unit of part Pi. In the following section we will introduce the concept of timed event graphs, which are the tools we will use to model discrete event systems such as the production line we have considered here. 2.2 Preliminaries of Event Graph Theory 2.2.1 Definitions and Set-up As we have seen above, max-plus algebra allows us to describe the evolution of events on a network subject to synchronisation constraints. In our example, a part moving from one machine to the next is an event. An appropriate tool to model events on a certain class of 41
  • 47. networks is known as a Petri net. We will focus on certain type of Petri net called an event graph, which can be modeled by max-plus linear recurrence relations of the form discussed in Section 1.5. We start by defining the relevant terms and setting out some notation. In order to fully appreciate all the concepts we introduce, it may be helpful to read this section alongside the example that follows (Section 2.2.2). Definition 2.1. Let G = (V, E) be a graph and let i, j ∈ V. We say that i is a predecessor (or an upstream vertex) of j if (i, j) ∈ E, and that i is a successor (or a downstream vertex) of j if (j, i) ∈ E. Definition 2.2. A Petri net is a pair (G, µ) where G = (V, E) is a directed graph and µ is a vector, satisfying the following properties: (i) G is bipartite, i.e. V is partitioned into two disjoint sets P and Q (called places and transitions respectively) such that E only consists of edges of the form (pi, qj) and (qj, pi), with pi ∈ P and qj ∈ Q. (ii) µ is a |P|-vector of non-negative integers, known as the initial marking. Definition 2.3. An event graph is a Petri net in which every place has exactly one upstream and downstream transition. Notation. For general i ∈ V, we let π(i) denote the set of all predecessors of i and σ(i) denote the set of all successors of i. In the case of Petri nets and event graphs, when we want to work with indices we will sometimes use the following additional notation: if pi ∈ π(qj), we write i ∈ πq(j), and if qj ∈ π(pi), we write j ∈ πp(i). Similarly, if pi ∈ σ(qj), we write i ∈ σq(j), and if qj ∈ σ(pi), we write j ∈ σp(i). Note that in the case of an event graph, for any place pi we have that |πp(i)| = |σp(i)| = 1, so we often allow the abuse of notation πp(i) = j (as opposed to πp(i) = {j}). We can think of places as conditions and transitions as events. For example, a machine working on a part is a place, and a transition occurs when the part moves on to the next machine. Each place has an associated marking (given initially by the vector µ) which indicates whether or not the condition has been fulfilled, e.g. whether or not a machine is working on a given part type. Equivalently we say that each place has an associated number of tokens, which can be thought of as the number of data items or resources available at each place. In our example each place can have either 0 or 1 tokens, but in general there can be any amount (e.g. if machines are capable of working on more than one part at once). We say that a transition is enabled if each of its upstream places contains at least one token. When this is the case the transition fires, meaning that one token is removed from each of its upstream places and one token is added to each of its downstream places. If the initial marking 42
is µ, the firing of a transition qj gives a new marking µ′, defined by

\mu'_i = \begin{cases} \mu_i - 1 & \text{if } p_i \in \pi(q_j) \\ \mu_i + 1 & \text{if } p_i \in \sigma(q_j) \\ \mu_i & \text{otherwise.} \end{cases}

In this case we say that the marking µ′ is reachable from µ. It is easy to see that for a general Petri net the total number of tokens can change when a transition fires; for example, a transition may have one upstream place but two downstream places, in which case the firing causes the total number of tokens to increase by one.

Furthermore, note that the definition of an event graph allows for input and output transitions (known as sources and sinks respectively), i.e. transitions that do not have any upstream or downstream places. Source transitions are enabled by the outside world and deliver tokens into the system; sink transitions remove tokens from the system completely. The following definition makes an important distinction between two types of event graph:

Definition 2.4. An event graph is autonomous if it contains no source transitions, and non-autonomous otherwise.

The important property of event graphs is that they do not model conflicts; that is, a token in a given place can be consumed by only one predetermined transition. The ‘opposite’ of an event graph (i.e. a Petri net in which each transition has exactly one upstream place and one downstream place), known as a state machine, does allow for this competition element but does not admit synchronisation. It can be shown that state machines are equivalent to the automata studied in computer science, which shows that Petri nets in general have more modelling power than automata.

Up until now, the theory we have introduced is only concerned with the ordering of events. If we wish to investigate network performance, it is necessary to introduce time. There are two ways in which this could be done: we can either associate durations with transition firings, or holding times with places. In fact, in many applications both kinds of time are present; for example, the real-life manufacturing system in Section 2.1 would exhibit travel times as well as processing times. However, as we noted before, by incorporating the firing times into the holding times at places, in the case of event graphs it may be assumed without loss of generality that the firing times are equal to 0. We therefore introduce the concept of a timed event graph below.

Definition 2.5. A timed event graph is an event graph endowed with a |P|-vector α of holding times, one associated with each place.

Note that the definition of a timed event graph does not uniquely determine all future firing times. This is because the initial marking does not specify how long each token has already spent in its respective place. We will deal with this more fully when we come to look at the basic autonomous equation in the next section.
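The enabled/fire rule described above is easy to capture in code. The following sketch is purely illustrative: the class name, the dictionary layout and the toy two-transition net are all invented for the example and are not taken from the text; it implements exactly the rule that a transition may fire only when every upstream place holds a token, and that firing moves one token across the transition.

```python
class PetriNet:
    """Minimal Petri net: arcs between places and transitions, plus a marking."""

    def __init__(self, upstream, downstream, marking):
        self.upstream = upstream      # upstream[q]   = places p with an arc p -> q
        self.downstream = downstream  # downstream[q] = places p with an arc q -> p
        self.marking = marking        # marking[p]    = number of tokens in place p

    def enabled(self, q):
        """A transition is enabled iff every upstream place holds at least one token."""
        return all(self.marking[p] >= 1 for p in self.upstream[q])

    def fire(self, q):
        """Remove one token from each upstream place, add one to each downstream place."""
        if not self.enabled(q):
            raise ValueError("transition %s is not enabled" % q)
        for p in self.upstream[q]:
            self.marking[p] -= 1
        for p in self.downstream[q]:
            self.marking[p] += 1

# A toy event graph with two transitions and two places arranged in a loop:
# every place has exactly one upstream and one downstream transition.
net = PetriNet(upstream={"q1": ["p1"], "q2": ["p2"]},
               downstream={"q1": ["p2"], "q2": ["p1"]},
               marking={"p1": 1, "p2": 0})
net.fire("q1")
print(net.marking)         # {'p1': 0, 'p2': 1}
print(net.enabled("q1"))   # False: p1 is now empty, so q1 must wait for q2 to fire
```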
2.2.2 A Simple Example

To consolidate all of this theory, consider the following simple example. A train network connects the main stations of two cities. There are two routes from station S1 to station S2: one visiting an intermediate station S3 along the way and the other visiting a different intermediate station S4. Trains link up at S2 and return to S1 via a single fast track with no stops, where they then split up again and repeat their respective journeys. There are also two inner-city loops at S1 and S2 which visit the suburbs of their respective cities. The travel time from Sj to Sl is given as the (l, j)-th entry of the matrix A below:

A = \begin{pmatrix}
2 & 5 & ε & ε \\
ε & 3 & 5 & 3 \\
2 & ε & ε & ε \\
4 & ε & ε & ε
\end{pmatrix}.   (2.2)

We can represent this network as a standard graph as follows:

Figure 2.2: Standard graph of the simple train network. Stations (the vertices) are represented by circles and tracks by weighted edges. The travel times are given by the edge weights.

Similarly to before, we can assume that there are no waiting times at stations by incorporating them into the travel times. We want the system to be synchronised in the sense that trains arriving at a station should wait for each other to allow for the changeover of passengers. This means that departures from a given station will coincide (once the last train has arrived, all trains can then depart).

We can model this system with a timed event graph, where ‘tracks’ are represented by places (the timed elements of the network), trains by tokens and departures at each station by transitions. Note that each transition has an upstream place, so the event graph will be autonomous. In order to fully specify the event graph we need to state the positions of the trains in the network at time 0, which corresponds to the initial marking. We assume that at time 0 there is one train travelling from S1 to S3, two trains travelling from S1 to S4, one train travelling back from S2 to S1 and one train on each of the inner-city loops. This gives the following timed event graph, pictured at time 0:
Figure 2.3: Timed event graph of the train network depicted in Figure 2.2. The transitions q1, q2, q3 and q4 represent departures from the four respective stations. The edges can be thought of as the tracks between stations, with the intermediate places (depicted as circles) specifying the travel times. Tokens inside the places represent trains on the tracks.

Note that transitions are depicted by bars, places by circles and tokens by counters inside the circles. As we have noted before, we cannot tell which transition will fire first, since we do not know how long each token of the initial marking has spent in its respective place (i.e. how close to their respective destinations the trains are at time 0). If transitions q3 and q4 both fire once, the token distribution changes to the following:

Figure 2.4: Timed event graph of the train network after transitions q3 and q4 have fired. One token has been removed from each of their upstream places and one token has been added to each of their downstream places.

This corresponds to the train on the track from S1 to S3 having reached S3 and departed for S2, and also one of the trains on the track from S1 to S4 having reached S4 and departed for S2. Once these trains both reach S2 they link up to form one train, and, assuming the inner-city train at S2 is ready and waiting, transition q2 will fire and the token distribution of the event graph will change to: