HALL’S MATCHING THEOREM
1. Perfect Matching in Bipartite Graphs
A bipartite graph is a graph $G = (V, E)$ whose vertex set $V$ may be partitioned into two
disjoint sets $V_I$, $V_O$ in such a way that every edge $e \in E$ has one endpoint in $V_I$ and one
endpoint in $V_O$. The sets $V_I$ and $V_O$ in this partition will be referred to as the input set
and the output set, respectively. Define a perfect matching in a bipartite graph $G$ to be
an injective mapping $f : V_I \to V_O$ such that for every $x \in V_I$ there is an edge $e \in E$ with
endpoints $x$ and $f(x)$. For any subset $A \subset V_I$, define $\partial A$ to be the set of all vertices $y \in V_O$
that are endpoints of edges with one endpoint in $A$.
Theorem 1. (Hall’s Matching Theorem) Let $G$ be a bipartite graph with input set $V_I$,
output set $V_O$, and edge set $E$. There exists a perfect matching $f : V_I \to V_O$ if and only if
for every subset $A \subset V_I$,
\[
|\partial A| \ge |A|. \tag{1}
\]
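Proof. The condition (1) is necessary, because a perfect matching $f$ maps each subset $A$
injectively into $\partial A$. Sufficiency is proved by induction on the cardinality of $V_I$. If
$|V_I| = 1$ the result is trivially true. Suppose, then, that the result is true if $|V_I| \le n$,
and consider a bipartite graph $G$ satisfying (1) whose input set $V_I$ has cardinality $n + 1$.
There are two possibilities: either (1) for every nonempty proper subset $A \subset V_I$,
the cardinality of $\partial A$ is at least one greater than the cardinality of $A$; or (2) there exists a
nonempty proper subset $A \subset V_I$ such that $|\partial A| = |A|$.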
Case 1: Choose any $x \in V_I$ and any $y \in \partial\{x\}$ (by hypothesis, $\partial\{x\}$ has at least one
element). Let $G^*$ be the bipartite graph with input set $V_I^* = V_I - \{x\}$, output set
$V_O^* = V_O - \{y\}$, and whose edges are the same as those of $G$, but with edges incident to either $x$
or $y$ deleted. The bipartite graph $G^*$ satisfies the hypothesis (1), because every subset of $V_I^*$
is a proper subset of $V_I$, and in Case 1 every nonempty proper subset $A \subset V_I$ has
$|\partial A| \ge |A| + 1$, so deleting the single vertex $y$ from $\partial A$ still leaves
at least $|A|$ vertices. By the induction hypothesis, there is a perfect matching in $G^*$; this
perfect matching extends to a perfect matching in the original graph $G$ by setting $f(x) = y$.
Case 2: Let $A \subset V_I$ be a nonempty proper subset of $V_I$ such that $|\partial A| = |A|$. Construct bipartite
graphs $G^*$ and $G^{**}$ with input sets $V_I^* = A$ and $V_I^{**} = V_I - A$, output sets $V_O^* = \partial A$ and
$V_O^{**} = V_O - \partial A$, and edges inherited from the original graph $G$. We shall use the induction
hypothesis to show that there is a perfect matching in each of the bipartite graphs $G^*$ and
$G^{**}$. If this is so, then a perfect matching in $G$ may be obtained by joining the
perfect matchings in $G^*$ and $G^{**}$ into a single mapping on $V_I$.
Observe that the vertex sets $V_I^*$ and $V_I^{**}$ have cardinalities no greater than $n$, because
$A = V_I^*$ is a nonempty proper subset of $V_I$. Thus, the induction hypothesis will guarantee the existence
of perfect matchings in $G^*$ and $G^{**}$ provided it is shown that the hypothesis (1) is satisfied
for each of these graphs. Consider first the graph $G^*$: For any subset $B \subseteq A = V_I^*$, the
boundary $\partial^* B$ in the graph $G^*$ coincides with $\partial B$ in $G$, since $\partial B \subseteq \partial A = V_O^*$.
Consequently, $G^*$ satisfies (1). Now
consider $G^{**}$: If there were a subset $B \subseteq V_I^{**} = V_I - A$ whose boundary $\partial^{**} B$ in the graph
$G^{**}$ had fewer than $|B|$ elements, then in the graph $G$ the boundary $\partial(B \cup A)$ would have
at most $|B \cup A| - 1$ elements, because $\partial(B \cup A) = \partial^{**} B \cup \partial A$ and $|\partial A| = |A|$.
This is impossible, because
the graph $G$ satisfies (1). Hence, $G^{**}$ also satisfies (1).
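The statement of the theorem can be explored numerically on small graphs. The sketch below is illustrative only: the function names and the adjacency-dictionary encoding are assumptions, and the matching is found with the standard augmenting-path method rather than by the induction used in the proof above. It checks condition (1) by brute force over all subsets and separately searches for a perfect matching.

```python
from itertools import combinations

def boundary(adj, A):
    """Set of output vertices joined by an edge to some input vertex in A."""
    out = set()
    for x in A:
        out |= set(adj[x])
    return out

def hall_condition_holds(adj):
    """Brute-force check of |boundary(A)| >= |A| for every nonempty subset A."""
    inputs = list(adj)
    for r in range(1, len(inputs) + 1):
        for A in combinations(inputs, r):
            if len(boundary(adj, A)) < r:
                return False
    return True

def perfect_matching(adj):
    """Return an injective map input -> output along edges, or None if none exists."""
    owner = {}  # output vertex -> input vertex currently matched to it

    def try_assign(x, seen):
        for y in adj[x]:
            if y in seen:
                continue
            seen.add(y)
            # y is free, or the input vertex holding y can be re-matched elsewhere
            if y not in owner or try_assign(owner[y], seen):
                owner[y] = x
                return True
        return False

    for x in adj:
        if not try_assign(x, set()):
            return None
    return {x: y for y, x in owner.items()}

# Hypothetical examples: keys are input vertices, values list adjacent outputs.
good = {"a": ["p", "q"], "b": ["p"], "c": ["q", "r"]}
bad = {"a": ["p"], "b": ["p"], "c": ["q", "r"]}  # A = {a, b} has |boundary| = 1

f = perfect_matching(good)
print(hall_condition_holds(good), f)
print(hall_condition_holds(bad), perfect_matching(bad))
```

On both examples the two functions agree, as the theorem predicts: a matching is returned exactly when condition (1) holds.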
2. The Birkhoff-von Neumann Theorem
A doubly stochastic matrix is a square matrix with nonnegative entries whose row sums
and column sums are all 1. A magic square is a square matrix with nonnegative integer
entries whose row sums and column sums are all equal; the common value of the row sums
and column sums is called the weight of the square. Observe that if T is a magic square of
weight d ≥ 1, then one obtains a doubly stochastic matrix by dividing all entries of T by d.
Conversely, if P is a doubly stochastic matrix with rational entries, then one may obtain
a magic square by multiplying all entries by their least common denominator. The magic
squares of weight 1 are called permutation matrices: for any $m \times m$ permutation matrix $T$,
there exists a permutation $\sigma$ of the set $[m]$ such that
\[
T_{i,\sigma(i)} = 1 \ \text{ for all } i \in [m], \quad\text{and}\quad T_{i,j} = 0 \ \text{ if } j \ne \sigma(i). \tag{2}
\]
Theorem 2. Every doubly stochastic matrix is a convex combination (weighted average)
of permutation matrices. Every magic square of weight d is the sum of d (not necessarily
distinct) permutation matrices.
Proof. (We shall consider only the assertion about magic squares; the assertion about doubly
stochastic matrices may be proved by similar arguments.) By definition, every magic square
of weight 1 is a permutation matrix. Let $T$ be an $m \times m$ magic square of weight $d > 1$.
Consider the bipartite graph with $V_I = V_O = [m]$ such that, for any pair $(i, j) \in V_I \times V_O$,
there is an edge from $i$ to $j$ if and only if $T_{i,j} > 0$.
Claim: The hypothesis (1) of the Matching Theorem is satisfied.
Proof. Let $B$ be a subset of $V_I$ with $r \le m$ elements. Since $T$ is a magic square of weight
$d$, the sum of all entries $T_{i,j}$ such that $i \in B$ must be $rd$. The positive entries among these
must all lie in the columns indexed by elements of $\partial B$; consequently, the sum of the entries
$T_{i,j}$ such that $j \in \partial B$ must be at least $rd$. But this sum cannot exceed $d|\partial B|$, since the
column sums of $T$ are all $d$. Hence $rd \le d|\partial B|$, that is, $|\partial B| \ge r = |B|$.
The Matching Theorem now implies that there is a perfect matching in the bipartite
graph. Since $V_I = V_O = [m]$, this perfect matching must be a permutation $\sigma$ of the set $[m]$.
By construction, the permutation matrix $T^\sigma$ defined by equations (2) is dominated (entry
by entry) by the magic square $T$, so the difference $T - T^\sigma$ is a magic square of weight $d - 1$.
Thus, the assertion follows by induction on $d$.
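The induction in this proof is effectively an algorithm: find a permutation supported on the positive entries, subtract its matrix, and repeat. The following sketch is an illustration under stated assumptions. The names `find_permutation` and `birkhoff_decompose` are hypothetical, the input is assumed to be a magic square given as a list of integer rows, and the matching step uses augmenting paths.

```python
def find_permutation(T):
    """Return sigma with T[i][sigma[i]] > 0 for every row i (augmenting paths)."""
    m = len(T)
    row_of = [None] * m  # row_of[j] = row currently matched to column j

    def try_assign(i, seen):
        for j in range(m):
            if T[i][j] > 0 and j not in seen:
                seen.add(j)
                if row_of[j] is None or try_assign(row_of[j], seen):
                    row_of[j] = i
                    return True
        return False

    for i in range(m):
        if not try_assign(i, set()):
            raise ValueError("no perfect matching found on the positive entries")
    sigma = [None] * m
    for j, i in enumerate(row_of):
        sigma[i] = j
    return sigma

def birkhoff_decompose(T):
    """Write a magic square of weight d as a list of d permutations."""
    T = [row[:] for row in T]  # work on a copy
    d = sum(T[0])              # the weight, read off the first row
    perms = []
    for _ in range(d):
        sigma = find_permutation(T)
        for i, j in enumerate(sigma):
            T[i][j] -= 1       # subtract the permutation matrix of sigma
        perms.append(sigma)
    return perms

# Hypothetical example: a 3 x 3 magic square of weight 3.
T = [[2, 1, 0],
     [0, 1, 2],
     [1, 1, 1]]
perms = birkhoff_decompose(T)
print(perms)
```

Each returned list `sigma` encodes the permutation matrix with ones in positions `(i, sigma[i])`; summing these matrices recovers `T`. The decomposition is in general not unique, so the particular permutations found depend on the search order.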
3. Strassen’s Monotone Coupling Theorem
A poset is a partially ordered set $(X, \le)$. Recall that a partial order $\le$ must satisfy the
following properties: for all $x, y, z \in X$,
\[
x \le x; \tag{3}
\]
\[
x \le y \ \text{ and } \ y \le x \implies x = y; \tag{4}
\]
\[
x \le y \ \text{ and } \ y \le z \implies x \le z. \tag{5}
\]
Posets occur frequently as state spaces in statistical mechanics and elsewhere. An important
example is the configuration space $\Sigma = \{0, 1\}^V$ of a spin system: here $V$ is a set
of sites, often the vertices of a lattice, and the elements of $\Sigma$ are assignments of zeros and
ones (“spins”) to the sites (“configurations”). The partial order $\le$ is defined as follows:
\[
x \le y \quad\text{iff}\quad x_s \le y_s \ \ \forall\, s \in V.
\]
An ideal¹ of a poset $(X, \le)$ is a subset $J \subset X$ with the property that if $x \in J$ and $x \le y$
then $y \in J$. If $\mu$ and $\nu$ are two probability distributions on $X$, say that $\nu$ stochastically
dominates $\mu$ (and write $\mu \le \nu$) if for every ideal $J$,
\[
\mu(J) \le \nu(J). \tag{6}
\]
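For a small finite poset, the definition can be checked directly by listing all ideals. The sketch below is illustrative only: the function names are assumptions, the poset is the four-point configuration space $\{0,1\}^2$ with the coordinatewise order, and the two distributions are made-up examples.

```python
from fractions import Fraction
from itertools import combinations

def ideals(elements, leq):
    """Yield every subset J such that x in J and x <= y imply y in J."""
    for r in range(len(elements) + 1):
        for subset in combinations(elements, r):
            J = set(subset)
            if all(y in J for x in J for y in elements if leq(x, y)):
                yield J

def dominates(nu, mu, elements, leq):
    """True if nu stochastically dominates mu: mu(J) <= nu(J) for every ideal J."""
    return all(
        sum(mu.get(x, 0) for x in J) <= sum(nu.get(x, 0) for x in J)
        for J in ideals(elements, leq)
    )

def leq(x, y):
    """Coordinatewise partial order on 0/1 configurations."""
    return all(a <= b for a, b in zip(x, y))

elements = [(0, 0), (1, 0), (0, 1), (1, 1)]
q = Fraction(1, 4)
mu = {(0, 0): 2 * q, (1, 0): q, (0, 1): q}   # hypothetical example
nu = {(1, 0): q, (0, 1): q, (1, 1): 2 * q}   # hypothetical example

print(len(list(ideals(elements, leq))))   # number of ideals of {0,1}^2
print(dominates(nu, mu, elements, leq))
print(dominates(mu, nu, elements, leq))
```

The enumeration is exponential in the size of the poset, so this is only practical for very small examples.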
Theorem 3. (Strassen) Let $(X, \le)$ be a finite poset, and let $\mu, \nu$ be probability distributions
on $X$. If $\mu \le \nu$ then on some probability space (in fact, on any probability space supporting
a random variable uniformly distributed on the unit interval) there are defined $X$-valued random
variables $M, N$ with distributions $\mu, \nu$, respectively, such that
\[
M \le N. \tag{7}
\]
Proof. We shall only consider the case where the probability distributions $\mu, \nu$ assign
rational probabilities $k/N$ (with a common denominator $N$) to the elements of the poset $X$. The
general case may be deduced from this by an approximation argument, which the reader
will supply (Exercise!).
Case A: For every $x \in X$, the probabilities $\mu(x)$ and $\nu(x)$ are either 0 or $1/N$.
Consider the bipartite graph with $V_I = \{x \in X : \mu(x) = 1/N\}$ and $V_O = \{x \in X : \nu(x) = 1/N\}$,
where $x \in V_I$ and $y \in V_O$ are connected by an edge if and only if $x \le y$. The
hypothesis that $\mu$ is stochastically dominated by $\nu$ implies that the hypothesis (1) of the
Matching Theorem is satisfied: for $A \subset V_I$, the set $J$ of all $z \in X$ such that $x \le z$ for some
$x \in A$ is an ideal with $\partial A = J \cap V_O$, so $|A|/N \le \mu(J) \le \nu(J) = |\partial A|/N$.
Consequently, there is a perfect matching $f : V_I \to V_O$.
Let $M$ be an $X$-valued random variable with distribution $\mu$ (such a random variable
will exist on any probability space supporting a uniform-[0,1] random variable). Define
$N = f(M)$. Then the pair $(M, N)$ satisfies $M \le N$, and the marginal distributions of $M$
and $N$ are $\mu$ and $\nu$, as the reader will easily check.
Case B: For every $x \in X$, the probabilities $\mu(x)$ and $\nu(x)$ are integer multiples of $1/N$.
For each $x \in X$, if $\mu(x) = k/N$ then construct $k$ “copies” $x_1, x_2, \dots, x_k$ of $x$, and for each
such copy set $\pi_I(x_i) = x$. Define $V_I$ to be the set of all such copies, where $x$ ranges over
$X$. Similarly, for each $y \in X$, if $\nu(y) = m/N$ then construct $m$ copies $y_1, y_2, \dots, y_m$ of $y$,
and for each such copy set $\pi_O(y_i) = y$. Define $V_O$ to be the set of all such copies, where $y$
ranges over $X$. For each pair $x_i \in V_I$ and $y_j \in V_O$, put an edge from $x_i$ to $y_j$ if and only
if $\pi_I(x_i) \le \pi_O(y_j)$. Once again, it is easily verified that hypothesis (1) of the Matching
Theorem is satisfied, since $\mu \le \nu$. Consequently, there is a perfect matching $f : V_I \to V_O$.
Let $U$ be a random variable that is uniformly distributed on $V_I$; such a random variable
exists on any probability space supporting a Uniform-[0,1] random variable. Set $M = \pi_I(U)$
and $N = \pi_O(f(U))$; then the marginal distributions of $M$ and $N$ are $\mu$ and $\nu$, respectively,
and $M \le N$, by construction.

¹ Sometimes called an upper corner.
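The construction of Case B can be sketched in code. This is an illustration under assumptions, not part of the original notes: the function name `monotone_coupling` is hypothetical, the distributions are passed as integer numerators over a common denominator, and the perfect matching is found by augmenting paths. The result is the joint law of the pair $(M, N)$.

```python
from fractions import Fraction

def monotone_coupling(mu, nu, leq):
    """mu, nu: dicts of integer numerators k (probability k/N) with equal totals N.
    Returns {(x, y): probability} for a coupling supported on pairs with x <= y."""
    N = sum(mu.values())
    assert N == sum(nu.values())
    V_in = [x for x, k in mu.items() for _ in range(k)]    # k copies of each x
    V_out = [y for y, k in nu.items() for _ in range(k)]   # k copies of each y
    owner = [None] * len(V_out)  # owner[b] = index of input copy matched to b

    def try_assign(a, seen):
        for b, y in enumerate(V_out):
            if leq(V_in[a], y) and b not in seen:
                seen.add(b)
                if owner[b] is None or try_assign(owner[b], seen):
                    owner[b] = a
                    return True
        return False

    for a in range(len(V_in)):
        if not try_assign(a, set()):
            raise ValueError("no perfect matching: nu does not dominate mu")

    joint = {}
    for b, a in enumerate(owner):  # each matched pair of copies has mass 1/N
        pair = (V_in[a], V_out[b])
        joint[pair] = joint.get(pair, 0) + Fraction(1, N)
    return joint

def leq(x, y):
    """Coordinatewise partial order on 0/1 configurations."""
    return all(a <= b for a, b in zip(x, y))

# Hypothetical example on {0,1}^2 with common denominator N = 4.
mu = {(0, 0): 2, (1, 0): 1, (0, 1): 1}
nu = {(1, 0): 1, (0, 1): 1, (1, 1): 2}
joint = monotone_coupling(mu, nu, leq)
for (x, y), p in sorted(joint.items()):
    print(x, "<=", y, "with probability", p)
```

Sampling $U$ uniformly from the input copies and reading off the matched pair corresponds to drawing $(M, N)$ from `joint`.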
4. Enumeration of 3 × 3 Magic Squares
The enumeration of magic squares is a classical problem in combinatorics, dating to
McMahon in the early 20th century (or earlier). McMahon obtained an explicit formula for
the number h(n) of 3 × 3 magic squares of weight n:
\[
h(n) = 3\binom{n+3}{4} + \binom{n+2}{2}. \tag{8}
\]
Much later Richard Stanley proved that for every $m \ge 2$, the number of $m \times m$ magic
squares of weight $n$ is a polynomial function of $n$, and that the degree of the polynomial is
$(m - 1)^2$. This implies that the function is determined by its values at any $(m - 1)^2 + 1$
distinct points, by the Lagrange
interpolation formula. In this section we shall outline a proof of McMahon’s formula; in the
homework exercises an approach to Stanley’s theorem will be outlined.
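Formula (8) is easy to test against a direct count for small weights. The sketch below is a brute-force check, not McMahon's method. The function names are assumptions, and the enumeration uses the fact that the four entries in the upper-left $2 \times 2$ block determine the rest of a $3 \times 3$ square of given weight.

```python
from math import comb

def count_magic_squares(n):
    """Count 3 x 3 nonnegative integer matrices with all row and column sums n."""
    count = 0
    for a in range(n + 1):                  # entry (1,1)
        for b in range(n + 1 - a):          # entry (1,2); (1,3) = n - a - b >= 0
            for d in range(n + 1 - a):      # entry (2,1); (3,1) = n - a - d >= 0
                for e in range(n + 1 - d):  # entry (2,2); (2,3) = n - d - e >= 0
                    r32 = n - b - e          # entry (3,2), forced by column 2
                    r33 = a + b + d + e - n  # entry (3,3), forced by row 3
                    if r32 >= 0 and r33 >= 0:
                        count += 1
    return count

def h_formula(n):
    """Right-hand side of formula (8)."""
    return 3 * comb(n + 3, 4) + comb(n + 2, 2)

for n in range(6):
    print(n, count_magic_squares(n), h_formula(n))
```

Since the count is a polynomial of degree 4 in $n$, agreement at five distinct weights already pins the polynomial down.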
The Birkhoff-von Neumann Theorem is the key to the enumeration of magic squares.
This theorem asserts that every magic square R of weight d is the sum of d permutation
matrices. The 3 × 3 permutation matrices are
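\[
I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad
S = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}, \quad
S^2 = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix},
\]
\[
T_{ab} = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad
T_{ac} = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \quad
T_{bc} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}.
\]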
These 6 matrices satisfy the relation
\[
I + S + S^2 = T_{ab} + T_{ac} + T_{bc}. \tag{9}
\]
Proposition 4. Every $3 \times 3$ magic square $R$ has a unique representation as
\[
R = m_1 I + m_2 S + m_3 S^2 + m_4 T_{ab} + m_5 T_{ac} + m_6 T_{bc} \tag{10}
\]
where the $m_i$ are nonnegative integers whose sum is the weight of the square and are such that
at least one of $m_1, m_2, m_3$ is 0.
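Proof. That there is such a representation for every $3 \times 3$ magic square follows from the
Birkhoff-von Neumann theorem and the relation (9): whenever $m_1, m_2, m_3$ are all positive,
(9) allows one copy of $I + S + S^2$ to be replaced by $T_{ab} + T_{ac} + T_{bc}$, which lowers each of
$m_1, m_2, m_3$ by one without changing the sum of the coefficients. Suppose that some magic square $R$
had two distinct such representations $m = (m_1, \dots, m_6)$ and $n = (n_1, \dots, n_6)$. By adding a multiple
of $I + S + S^2$ to one of the representations and the same multiple of $T_{ab} + T_{ac} + T_{bc}$ to the
other, one may obtain a relation
\[
p_1 I + p_2 S + p_3 S^2 + p_4 T_{ab} + p_5 T_{ac} + p_6 T_{bc}
= q_1 I + q_2 S + q_3 S^2 + q_4 T_{ab} + q_5 T_{ac} + q_6 T_{bc}
\]
where $p_i - q_i = 0$ for at least one of $i = 1, 2,$ or $3$ but not all of the differences $p_j - q_j$ are
0. (If all the differences were 0, then $m - n$ would be a multiple of $(1, 1, 1, -1, -1, -1)$, which is
impossible for two distinct representations in each of which one of the first three entries is 0.)
Suppose, for instance, that $p_i = q_i$ for $i = 2$. Then
\[
r_1 I + r_3 S^2 = -r_4 T_{ab} - r_5 T_{ac} - r_6 T_{bc}
\]
where $r_i = p_i - q_i$. But the matrix on the left side has the form
\[
\begin{pmatrix} * & 0 & * \\ * & * & 0 \\ 0 & * & * \end{pmatrix},
\]
with zeros in the positions $(1,2)$, $(2,3)$, and $(3,1)$, where $T_{ab}$, $T_{bc}$, and $T_{ac}$, respectively,
have the entry 1.
This implies that $r_4 = r_5 = r_6 = 0$, and hence that $r_1 = r_3 = 0$. This contradicts the
hypothesis that there are two distinct representations.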
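Both the relation (9) and the uniqueness claim can be checked by machine for small weights. The sketch below is an illustrative verification, not part of the proof. The helper names are assumptions, and matrices are stored as tuples of rows. It enumerates all six-tuples allowed in (10) and confirms that distinct tuples give distinct squares.

```python
from itertools import product

I = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
S = ((0, 1, 0), (0, 0, 1), (1, 0, 0))
S2 = ((0, 0, 1), (1, 0, 0), (0, 1, 0))
Tab = ((0, 1, 0), (1, 0, 0), (0, 0, 1))
Tac = ((0, 0, 1), (0, 1, 0), (1, 0, 0))
Tbc = ((1, 0, 0), (0, 0, 1), (0, 1, 0))
PERMS = (I, S, S2, Tab, Tac, Tbc)

def combine(coeffs):
    """Return m1*I + m2*S + m3*S2 + m4*Tab + m5*Tac + m6*Tbc as a tuple of rows."""
    return tuple(
        tuple(sum(c * P[i][j] for c, P in zip(coeffs, PERMS)) for j in range(3))
        for i in range(3)
    )

# Relation (9): both sides equal the all-ones matrix.
relation_holds = combine((1, 1, 1, 0, 0, 0)) == combine((0, 0, 0, 1, 1, 1))

def normalized_tuples(n):
    """Six-tuples of nonnegative integers with sum n and min(m1, m2, m3) = 0."""
    for m in product(range(n + 1), repeat=6):
        if sum(m) == n and min(m[:3]) == 0:
            yield m

def representation_counts(n):
    """Return (number of allowed tuples, number of distinct squares they produce)."""
    reps = list(normalized_tuples(n))
    squares = {combine(m) for m in reps}
    return len(reps), len(squares)

print(relation_holds)
for n in range(5):
    print(n, representation_counts(n))
```

Equality of the two counts for each weight is exactly the uniqueness statement restricted to that weight; the common value is the number of magic squares of that weight.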
Using the uniqueness of the representation (10), we may obtain an explicit formula for
the generating function $H(z) := \sum_{n=0}^{\infty} h(n) z^n$ of the sequence $h(n)$. Observe that the sum
defining $H(z)$ counts $z^n$ for each square of weight $n$. By Proposition 4, each such square
has a unique representation (10), and the weight $n$ is the sum of the six integers $m_i$ in the
representation. Consequently,
\[
H(z) := \sum_{n=0}^{\infty} h(n) z^n = \sum_{m} z^{m_1+m_2+m_3+m_4+m_5+m_6} \tag{11}
\]
where the sum $\sum_m$ is over all six-tuples $m$ of nonnegative integers such that at least one
of the entries $m_1, m_2, m_3$ is 0. Define
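$A_1, A_2, A_3, A_{12}, A_{13}, A_{23}, A_{123}$
to be the sets of all six-tuples $m$ of nonnegative integers such that (a) $m_i = 0$ for $A_i$;
(b) $m_i = m_j = 0$ for $A_{ij}$; and (c) $m_1 = m_2 = m_3 = 0$ for $A_{123}$. Then by Inclusion-Exclusion,
\[
H(z) = \sum_{A_1} + \sum_{A_2} + \sum_{A_3} - \sum_{A_{12}} - \sum_{A_{13}} - \sum_{A_{23}} + \sum_{A_{123}}, \tag{12}
\]
where each $\sum_{A}$ abbreviates $\sum_{m \in A} z^{m_1+m_2+m_3+m_4+m_5+m_6}$.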
Now each of the seven sums in the last expression is a product of geometric series: for
instance,
\[
\sum_{A_{123}} z^{m_1+m_2+m_3+m_4+m_5+m_6}
= \sum_{m_4, m_5, m_6 \ge 0} z^{m_4+m_5+m_6}
= \left( \sum_{m=0}^{\infty} z^m \right)^{3}
= (1 - z)^{-3}. \tag{13}
\]
Thus,
\[
H(z) = 3(1 - z)^{-5} - 3(1 - z)^{-4} + (1 - z)^{-3}. \tag{14}
\]
McMahon’s result now follows by expanding each of these terms in a Taylor series and
gathering terms.
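The expansion can be written out as follows. This is a sketch of the omitted computation, using the standard negative binomial series and Pascal's rule; the intermediate formula for $h(n)$ does not appear in the notes.

```latex
\begin{align*}
(1 - z)^{-k} &= \sum_{n=0}^{\infty} \binom{n+k-1}{k-1} z^n
  && \text{(negative binomial series)} \\
h(n) &= 3\binom{n+4}{4} - 3\binom{n+3}{3} + \binom{n+2}{2}
  && \text{(coefficient of $z^n$ in (14))} \\
     &= 3\binom{n+3}{4} + \binom{n+2}{2}
  && \text{(Pascal: } \tbinom{n+4}{4} - \tbinom{n+3}{3} = \tbinom{n+3}{4}\text{)}
\end{align*}
```

As a check, $n = 1$ gives $h(1) = 3 \cdot 1 + 3 = 6$, the number of $3 \times 3$ permutation matrices.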