The document discusses non-negative matrix factorization and algorithms for solving it. It introduces non-negative matrix factorization as factorizing a non-negative matrix A into non-negative matrices W and H such that A = W×H. It then presents a simple algorithm that decides the exact non-negative matrix factorization problem in time $(nm)^{O(k(n+m))}$ by modeling it as a satisfiability problem over polynomial constraints. It also discusses an approach for simplicial factorization that reduces the number of variables to $2k^2$ by exploiting the rank of the matrix.
Presentation on Non-negative Matrix Factorization (NMF), its applications, and algorithms.
Introduction to a matrix A, with numerical values represented in the form (4x3) for factorization discussion.
Mathematical representation and factorization of matrix A with non-negative ranks and components.
Definition of non-negative matrices; all entries in these matrices are specified as non-negative.
Introduction to Exact Non-negative Matrix Factorization (ENMF) with inputs and conditions for matrices W and H.
Details on a simple algorithm for ENMF including constraints, variables, and the process of finding matrices.
Introduction to simplicial factorization, focusing on rank and matrix dimensionality considerations.
Explanation of the pseudo inverse for matrices, its properties, and relations within factorization.
Designing an algorithm for simplicial factorization involving column and row basis concepts.
Challenges in designing better NMF algorithms, such as initialization, updating rules, and ensuring convergence.
Justification of importance for non-negative matrices, tracing back to its evolution from PCA.
Various applications of NMF in image processing and clustering, highlighting its utility in data compression and financial data mining.
Factor of a Matrix
A (n x m) = W (n x k) x H (k x m), with r ≤ k, where r denotes the rank of A.
Can obtain a generating set of the vector space spanned by columns of A.
14.
Non-Negative Matrix
      1     2     3
1    55   119    11
2  -112   456   154
3   513    33   223
4   324   123   543
4 x 3
Not a non-negative matrix: the entry in row 2, column 1 is -112.
16.
Non-Negative Matrix
      1     2     3
1    55   119    11
2   112   456   154
3   513    33   223
4   324   123   543
4 x 3
All elements are non-negative.
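The definition is easy to check mechanically. The following is a minimal sketch, assuming matrices are stored as plain Python lists of rows; the helper name `is_nonnegative` is hypothetical and not part of the talk. It tests the 4 x 3 matrix above and the earlier variant with -112 in row 2, column 1.

```python
def is_nonnegative(matrix):
    """Return True when every entry of the matrix (a list of rows) is >= 0."""
    return all(entry >= 0 for row in matrix for entry in row)


# The 4 x 3 matrix from the slide.
A = [
    [55, 119, 11],
    [112, 456, 154],
    [513, 33, 223],
    [324, 123, 543],
]

# The same matrix with a single negative entry in row 2, column 1.
B = [row[:] for row in A]
B[1][0] = -112

print(is_nonnegative(A))  # True
print(is_nonnegative(B))  # False
```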
17.
Non-negative (Exact) Factor of a Non-negative Matrix
A (n x m) = W (n x k) x H (k x m), where A, W, and H are all non-negative.
Minimize k. The minimum such k is the non-negative rank of A.
19.
Decision Version of the Problem
Exact Non-negative Matrix Factorization (ENMF)
Input: An n x m non-negative matrix A and an integer k.
Question: Are there non-negative matrices W and H such that A = W x H, W is of order n x k, and H is of order k x m?
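A yes-instance of the decision problem can be certified by exhibiting W and H. The sketch below checks such a certificate; it assumes integer matrices stored as lists of rows (so the comparison is exact) and k ≥ 1, and the names `matmul` and `is_enmf_certificate` are hypothetical. The second pair of factors multiplies to A but contains a negative entry, so it is rejected.

```python
def matmul(W, H):
    """Multiply an n x k matrix W by a k x m matrix H, both given as lists of rows."""
    k, m = len(H), len(H[0])
    return [[sum(row[l] * H[l][j] for l in range(k)) for j in range(m)] for row in W]


def is_enmf_certificate(A, W, H, k):
    """Return True when (W, H) shows that (A, k) is a yes-instance of ENMF."""
    n, m = len(A), len(A[0])
    if k < 1:
        return False  # this sketch only handles k >= 1
    shapes_ok = (
        len(W) == n
        and all(len(row) == k for row in W)
        and len(H) == k
        and all(len(row) == m for row in H)
    )
    if not shapes_ok:
        return False
    nonnegative = all(x >= 0 for M in (W, H) for row in M for x in row)
    return nonnegative and matmul(W, H) == A


A = [[1, 2, 0], [0, 1, 3], [1, 3, 3]]
W = [[1, 0], [0, 1], [1, 1]]
H = [[1, 2, 0], [0, 1, 3]]

# Same product, but H_bad has a negative entry, so it is not a valid certificate.
W_bad = [[1, 1], [0, 1], [1, 2]]
H_bad = [[1, 1, -3], [0, 1, 3]]

print(is_enmf_certificate(A, W, H, 2))          # True
print(is_enmf_certificate(A, W_bad, H_bad, 2))  # False
```

Checking a certificate is easy; the hard part of ENMF is finding W and H, or proving that none exist.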
20.
A Simple Algorithm for ENMF (Cohen and Rothblum)
Input: (A,k), where A is an n x m matrix.
Create variables for the entries of W (n x k) and H (k x m), so that A = W x H:

A (n x m)          W (n x k)          H (k x m)
a11 a12 ... a1m    w11 w12 ... w1k    h11 h12 ... h1m
a21 a22 ... a2m    w21 w22 ... w2k    h21 h22 ... h2m
...                ...                ...
an1 an2 ... anm    wn1 wn2 ... wnk    hk1 hk2 ... hkm

Create polynomial constraints [Const(A,k)]:
1. For all i, j: $w_{ij} \ge 0$ and $h_{ij} \ge 0$.
2. For all i, j: $a_{ij} = \sum_{l=1}^{k} w_{il} h_{lj}$.
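To make the constraint system concrete, here is a minimal sketch that lists the variables and constraints of Const(A,k) as strings for given dimensions. The function name `build_const` and the textual variable naming (`w1_2`, `h2_1`, ...) are assumptions made for illustration; the talk does not prescribe a representation.

```python
def build_const(n, m, k):
    """List the variables and constraints of Const(A, k) as strings.

    Indices are 1-based to match the entries a_ij, w_il, h_lj.
    """
    w_vars = [f"w{i}_{l}" for i in range(1, n + 1) for l in range(1, k + 1)]
    h_vars = [f"h{l}_{j}" for l in range(1, k + 1) for j in range(1, m + 1)]
    variables = w_vars + h_vars

    # Constraint family 1: every variable is non-negative.
    constraints = [f"{v} >= 0" for v in variables]

    # Constraint family 2: a_ij equals row i of W times column j of H.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            terms = " + ".join(f"w{i}_{l}*h{l}_{j}" for l in range(1, k + 1))
            constraints.append(f"a{i}_{j} == {terms}")
    return variables, constraints


variables, constraints = build_const(n=3, m=2, k=2)
print(len(variables))    # 10
print(len(constraints))  # 16
print(constraints[-1])   # a3_2 == w3_1*h1_2 + w3_2*h2_2
```

For n = 3, m = 2, k = 2 this gives nk + km = 10 variables and nm + nk + km = 16 constraints.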
24.
A Simple Algorithm for ENMF
Input: (A,k), where A is an n x m matrix.
Number of variables: nk + km
Number of polynomial constraints: nm + nk + km
(A,k) is a yes-instance of ENMF if and only if Const(A,k) is satisfiable (over reals).
We can find a solution to a set of polynomial inequalities in time $(Dp)^{O(x)}$, where
x: number of variables
p: number of inequalities
D: maximum degree of a polynomial inequality
We can decide if Const(A,k) is satisfiable in time $O((nm)^{O(k(n+m))})$.
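The final bound follows by substituting the counts into the general bound. A short derivation, under two assumptions that the slides do not state: k ≤ min(n, m) (for larger k the instance is trivially a yes-instance, for example W = [A | 0] and H = [I ; 0] when k ≥ m), and nm ≥ 2. Writing each equality as two inequalities changes p only by a constant factor.

```latex
\begin{align*}
x &= nk + km = k(n+m), \\
p &= nm + nk + km \le 3nm, \\
D &= 2 \quad \text{(each product } w_{il} h_{lj} \text{ has degree 2)}, \\
(Dp)^{O(x)} &\le (6nm)^{O(k(n+m))} = (nm)^{O(k(n+m))}.
\end{align*}
```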
28.
An Illustration of Variable Reduction
Simplicial Factorization
Input: An n x m non-negative matrix A of rank k.
Question: Are there non-negative matrices W and H such that A = W x H, W is of order n x k, and H is of order k x m?
29.
Simplicial Factorization
A (n x m, rank k) = W (n x k) x H (k x m)
Goal: To design an algorithm for Simplicial Factorization that runs in time $O((nm)^{O(k^2)})$.
Follow a similar approach to the algorithm for ENMF, but with a reduced number of variables.
35.
Pseudo Inverse
Consider a p x q matrix M of full column rank (rank q) or full row rank (rank p).
The (unique) pseudo inverse M+ of M satisfies the following:
M+ has all real entries;
M+ has order q x p;
M+ x M = Iq,q if M has full column rank, and M x M+ = Ip,p if M has full row rank.
Simplicial Factorization
A (n x m) = W (n x k) x H (k x m)
Since A has rank k, W has full column rank and H has full row rank, so both pseudo inverses exist.
Pseudo inverses W+ and H+:
W+ has order k x n and H+ has order m x k;
W+ x W = Ik,k and H x H+ = Ik,k.
Let A;i and H;i denote the i-th columns of A and H, and let Aj; and Wj; denote the j-th rows of A and W, so that aji = Wj; H;i. Then:
W+ A;i = W+ W H;i = H;i (here W+ W is k x k and H;i is k x 1)
Aj; H+ = Wj; H H+ = Wj;
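These identities can be checked numerically. The following is a small sketch, assuming NumPy and using `numpy.linalg.pinv` for the pseudo inverse; the example matrices W and H are made up for illustration (n = 3, k = 2, m = 3).

```python
import numpy as np

W = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])   # n x k = 3 x 2, full column rank
H = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 3.0]])     # k x m = 2 x 3, full row rank
A = W @ H

W_pinv = np.linalg.pinv(W)   # order k x n
H_pinv = np.linalg.pinv(H)   # order m x k

# W+ W = I_k and H H+ = I_k.
left_identity_ok = np.allclose(W_pinv @ W, np.eye(2))
right_identity_ok = np.allclose(H @ H_pinv, np.eye(2))

# W+ A;i = H;i for every column i at once, and Aj; H+ = Wj; for every row j at once.
recovers_H = np.allclose(W_pinv @ A, H)
recovers_W = np.allclose(A @ H_pinv, W)

print(W_pinv.shape, H_pinv.shape)
print(left_identity_ok, right_identity_ok, recovers_H, recovers_W)
```

Applying W+ to all of A recovers H column by column, and applying H+ on the right recovers W row by row, which is what the lemma on the following slides relies on.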
46.
Simplicial Factorization
C = {U1, U2, …, Uk}: a column basis for A.
R = {V1, V2, …, Vk}: a row basis for A.
AC (k x m): the columns of A expressed in basis C. If column j of A (n x m) equals a1U1 + a2U2 + … + akUk, then column j of AC is (a1, a2, …, ak).
AR (n x k): the rows of A expressed in basis R.
48.
Simplicial Factorization
Lemma: A has a simplicial factorization if and only if for every column basis C and row basis R of A there are k x k matrices TC and TR such that:
TC AC and AR TR are non-negative;
AR TR TC AC = A.
(If) Take W = AR TR (n x k) and H = TC AC (k x m). By the two conditions and the construction of AC and AR, A = W x H is a simplicial factorization.
(Only if) Let A = W x H with W of order n x k and H of order k x m, and let U (n x k) and V (k x m) be the matrices of a column basis and a row basis, respectively, so that A = U x AC = AR x V. Set
TC = W+ x U and TR = V x H+ (both k x k).
Then, using W+ A;i = W+ W H;i = H;i and Aj; H+ = Wj; H H+ = Wj;,
TC x AC = W+ x U x AC = W+ x A = H (non-negative),
AR x TR = AR x V x H+ = A x H+ = W (non-negative).
Hence TC AC and AR TR are non-negative, and AR TR TC AC = W x H = A.
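The "only if" construction can be traced on a small instance. This is a sketch under stated assumptions: NumPy is available, the example W and H are made up, and the bases are taken to be the first k columns and first k rows of A, which happen to be linearly independent for this particular A. The names `A_C`, `A_R`, `T_C`, `T_R` mirror AC, AR, TC, TR.

```python
import numpy as np

W = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
H = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 3.0]])
A = W @ H                      # 3 x 3 non-negative matrix of rank k = 2
k = 2

U = A[:, :k]                   # column basis: first two columns of A (independent here)
V = A[:k, :]                   # row basis: first two rows of A (independent here)

A_C = np.linalg.pinv(U) @ A    # k x m, columns of A in basis C, so U @ A_C = A
A_R = A @ np.linalg.pinv(V)    # n x k, rows of A in basis R, so A_R @ V = A

T_C = np.linalg.pinv(W) @ U    # k x k
T_R = V @ np.linalg.pinv(H)    # k x k

H_rebuilt = T_C @ A_C          # should equal H
W_rebuilt = A_R @ T_R          # should equal W

print(np.allclose(U @ A_C, A), np.allclose(A_R @ V, A))
print(np.allclose(H_rebuilt, H), np.allclose(W_rebuilt, W))
print(np.allclose(A_R @ T_R @ T_C @ A_C, A))
```

The algorithm itself works in the other direction: AC and AR are computed from A, and the 2k² entries of TC and TR are the unknowns.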
60.
Simplicial Factorization
Fix a column basis U (n x k) and a row basis V (k x m) of A, and create variables for the entries of TR and TC:

A (n x m) = AR (n x k) x TR (k x k) x TC (k x k) x AC (k x m)

TR (k x k)         TC (k x k)
w11 w12 ... w1k    h11 h12 ... h1k
w21 w22 ... w2k    h21 h22 ... h2k
...                ...
wk1 wk2 ... wkk    hk1 hk2 ... hkk
62.
Simplicial Factorization
Number of variables: $2k^2$
Number of polynomial constraints: poly(n,m,k)
We can find a solution to a set of polynomial inequalities in time $(Dp)^{O(x)}$, where
x: number of variables
p: number of inequalities
D: maximum degree of a polynomial inequality
We can solve Simplicial Factorization in time $O((nm)^{O(k^2)})$.
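The effect of the reduction shows up in the variable counts, which sit in the exponent of the running time. A tiny sketch comparing the two counts for a few sample sizes; the function names and the sample dimensions are made up for illustration.

```python
def enmf_variable_count(n, m, k):
    """Variables in Const(A, k): one per entry of W (n x k) and H (k x m)."""
    return n * k + k * m


def simplicial_variable_count(k):
    """Variables after the reduction: one per entry of TR and TC (both k x k)."""
    return 2 * k * k


for n, m, k in [(100, 100, 3), (1000, 500, 5)]:
    print(n, m, k, enmf_variable_count(n, m, k), simplicial_variable_count(k))
```

For a fixed rank k, the second count does not grow with n or m at all.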
Other Results on ENMF
[Vavasis] ENMF is known to be NP-hard.
[Arora et al.] Assuming ETH, there is no algorithm for ENMF running in time $O((nm)^{o(k)})$.
[Moitra] ENMF admits an algorithm running in time $O((nm)^{O(k^2)})$.
67.
Exact Non-negative Matrix Factorization
A (n x m) = W (n x k) x H (k x m), with A, W, and H non-negative.
For most applications, a close approximation is good enough.
68.
Non-negative Matrix Factorization
A (n x m) ≈ W (n x k) x H (k x m), with A, W, and H non-negative.
Example: Distance Function
Square of Euclidean distance:
For matrices A and B (of same order)
$\|A - B\|^2 = \sum_{i,j} (A_{ij} - B_{ij})^2$
$\|A - B\|^2 = 0$ if and only if A = B
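As a minimal sketch, the squared Euclidean distance can be computed directly from its definition. Matrices are assumed to be lists of rows, and the function name is hypothetical.

```python
def squared_euclidean_distance(A, B):
    """Sum of squared entrywise differences of two same-order matrices."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("A and B must have the same order")
    return sum((a - b) ** 2 for ra, rb in zip(A, B) for a, b in zip(ra, rb))


A = [[1, 2], [3, 4]]
B = [[1, 0], [5, 4]]
print(squared_euclidean_distance(A, A))  # 0
print(squared_euclidean_distance(A, B))  # 8
```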
71.
Example: Divergence Function
For matrices A and B (of same order)
$D(A \| B) = \sum_{i,j} \left( A_{ij} \log \frac{A_{ij}}{B_{ij}} - A_{ij} + B_{ij} \right)$
$D(A \| B) = 0$ if and only if A = B
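The divergence can be sketched the same way. Two assumptions here are not stated on the slide: B must be positive wherever A is positive, and a term with a zero entry of A uses the usual convention 0 · log 0 = 0. The function name is hypothetical.

```python
import math


def divergence(A, B):
    """Divergence D(A || B) of two same-order non-negative matrices.

    Assumes B_ij > 0 wherever A_ij > 0; a term with A_ij == 0 contributes
    only B_ij (convention 0 * log 0 = 0).
    """
    total = 0.0
    for row_a, row_b in zip(A, B):
        for a, b in zip(row_a, row_b):
            log_term = a * math.log(a / b) if a > 0 else 0.0
            total += log_term - a + b
    return total


A = [[1.0, 2.0], [0.0, 4.0]]
B = [[1.0, 1.0], [1.0, 4.0]]
print(divergence(A, A))  # 0.0
print(divergence(A, B))  # equals 2 * ln 2 for this example
```

Unlike the squared Euclidean distance, this function is not symmetric in A and B.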
72.
General Scheme of Algorithm: Non-negative Matrix Factorization
Input: A, W(0), H(0), and t=1.
Output: W and H.
While true:
1. Fix H(t-1) and find W(t), such that D(A, W(t)H(t-1)) ≤ D(A, W(t-1)H(t-1)).
2. Fix W(t) and find H(t), such that D(A, W(t)H(t)) ≤ D(A, W(t)H(t-1)).
3. If the convergence criterion is satisfied, return W(t) and H(t).
4. t=t+1.
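The scheme leaves the update rules and the stopping test open. One possible instantiation, shown here only as a sketch, uses the multiplicative update rules of Lee and Seung for the squared Euclidean distance; that choice, the tolerance-based stopping rule, the iteration cap that replaces "while true", and the small constant `eps` guarding against division by zero are all assumptions not taken from the slides.

```python
import numpy as np


def squared_distance(A, W, H):
    """Squared Euclidean distance between A and the product W H."""
    return float(np.sum((A - W @ H) ** 2))


def nmf(A, W0, H0, max_iter=500, tol=1e-9, eps=1e-12):
    """Alternating scheme with multiplicative updates (an assumed update rule)."""
    W, H = W0.copy(), H0.copy()
    previous = squared_distance(A, W, H)
    for _ in range(max_iter):
        # Step 1: fix H and update W.
        W = W * (A @ H.T) / (W @ H @ H.T + eps)
        # Step 2: fix W and update H.
        H = H * (W.T @ A) / (W.T @ W @ H + eps)
        # Step 3: stop once the distance no longer decreases noticeably.
        current = squared_distance(A, W, H)
        if previous - current <= tol * max(previous, 1.0):
            break
        previous = current
    return W, H


rng = np.random.default_rng(0)
A = rng.random((6, 2)) @ rng.random((2, 5))   # non-negative, rank at most 2
W0, H0 = rng.random((6, 2)), rng.random((2, 5))
W, H = nmf(A, W0, H0)
print(squared_distance(A, W0, H0), squared_distance(A, W, H))
```

Because the updates only multiply by non-negative ratios, W and H stay non-negative whenever the initial W(0) and H(0) are, which is one reason this family of rules is popular.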
74.
Main Challenges in Designing Better NMF Algorithms
Getting a good seeding for initialisation of W and H.
Devising updating rules for W and H at subsequent iterations.
Selecting distance/divergence norms based on the application.
Proving/giving enough evidence for convergence of the algorithm.
Origin of Non-negative Matrix Factorization
Evolved from Principal Component Analysis, which is used for dimension reduction.
Disadvantage: Both positive and negative elements appear in principal components and in the coefficients of linear combinations.
Hard to interpret results in applications like storing pixel brightness.
Applications
Image Processing.
Data represented as a non-negative matrix of pixels.
NMF can find A ≈ W x H.
W is the basis matrix; its columns can be regarded as parts like nose, ear, eye, etc.
Applications
Financial Data Mining
The stock price fluctuations seem to be dominated by several underlying factors. NMF has been used to obtain underlying trends from the stock market data.