Tensor eigenvectors and
stochastic processes
Austin R. Benson · Cornell
David F. Gleich · Purdue
Act 1. 10:45-10:55am Overview
Act 2. 10:55-11:05am Motivating applications
Act 3. 11:05-11:30am Stochastic processes & Markov chains
Act 4. 11:40-12:10pm Spacey random walk stochastic process
Act 5. 12:10-12:30pm Theory of spacey random walks
Papers, slides, & code ⟶ bit.ly/tesp-web, bit.ly/tesp-code
Stochastic processes offer a new and exciting set
of opportunities and challenges in tensor
algorithms.
After this tutorial, you should know a little bit about the following, and where to look for more info! bit.ly/tesp-web
• Tensor eigenvectors
• Z-eigenvectors
• Irreducible tensors
• Higher-order Markov chains
• Spacey random walks
• Vertex reinforced random walks
• Dynamical systems for trajectories
• Fitting spacey random walks to data
• Multilinear PageRank models
• Clustering tensors
• And lots of open problems in this area!
A quick overview of where we are going to go in
this tutorial and some rough timing.
Act 1. This overview
• Basic notation and operations
• The fundamental problems
Act 2. Motivating applications
• Compression
• Diffusion imaging
• Hardy-Weinberg genetics
Act 3. Review of stochastic processes
Markov chains & higher-order chains
• Limiting and stationary distributions
• Irreducibility
Act 4. Spacey RWs as stochastic processes
• Pause for interpretations and thought
• FAQ
Act 5. Theory of spacey random walks
• Limiting dists are tensor evecs
• Dynamical systems & vertex reinforced RWs
• (Non-)existence, uniqueness, convergence
• Computation
Act 6. Applications of spacey random walks
• Pólya urns, sequence data, tensor clustering
• New algorithm for computing tensor evecs
A note.
We tried to make this a friendly tutorial rather than a comprehensive one!
See the extensive work by HK groups!
Fundamental notations and some helpful pictures
Summary of fundamental notations: the tensor-vector product and the tensor-collapse product.
We assume the tensor is symmetric or permuted so the last operations are all that’s needed.
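In code, for a 3-mode tensor (a minimal Julia sketch; the names apply and collapse are ours, and the formulas are the standard definitions, since the slide's equations are images):

# tensor-vector product: (P x^2)_i = sum over j,k of P[i,j,k] x[j] x[k], a vector
function apply(P, x)
    n = size(P, 1)
    return [sum(P[i,j,k] * x[j] * x[k] for j in 1:n, k in 1:n) for i in 1:n]
end

# tensor-collapse product: P[x]_{i,j} = sum over k of P[i,j,k] x[k], a matrix
function collapse(P, x)
    return sum(P[:,:,k] * x[k] for k in 1:size(P, 3))
end

# sanity check: apply(P, x) == collapse(P, x) * x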
The tensor Z-eigenvector problem has different properties than the matrix eigenvector problem.
There are many generalizations of eigen-problems
to tensors. Their properties are very different.
All eigenvectors have unit 2-norm. ||x||2 = 1.
The H-eigenvalue spectrum is scale invariant.
Z-eigenvectors are not scale invariant. H-eigenvectors are.
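In symbols (standard definitions, reconstructed here since the slide's formulas are images):

A x^{m-1} = \lambda x, \quad \|x\|_2 = 1 \qquad \text{(Z-eigenpair)}
A x^{m-1} = \lambda x^{[m-1]}, \quad (x^{[m-1]})_i = x_i^{m-1} \qquad \text{(H-eigenpair)}

Scaling x \to c x sends a Z-eigenvalue \lambda to c^{m-2}\lambda (hence the unit-norm constraint) but leaves an H-eigenvalue unchanged.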
There are even more types of eigen-probs!
• D-eigenvalues
• E-eigenvalues (complex Z-eigenvalues)
• Generalized versions too…
• Other normalizations! [Lim 05]
For more information about these tensor
eigenvectors and some of their fundamental
properties, we recommend the following resources
• Tensor Analysis: Spectral Theory and Special
Tensors. Qi & Luo, 2017.
• A survey on the spectral theory of nonnegative
tensors. Chang, Qi, & Zhang, 2013.
Stochastic processes offer a new and exciting set
of opportunities and challenges in tensor
algorithms.
Usually, the properties of these objects are explored algebraically or through
polynomial interpretations.
Our tutorial focuses on interpreting the tensor objects stochastically!
Act 2. Motivating applications
The best rank-1 approximation to a symmetric
tensor is given by the principal eigenvector.
[De Lathauwer 97; De Lathauwer-De Moor-Vandewalle 00; Kofidis-Regalia 01, 02]
A is symmetric if the entries are the same under any permutation of the indices.
In data mining and signal processing applications, we are often interested in the “best” rank-1 approximation.
Notes. The first k tensor eigenvectors do not necessarily give the best rank-k approximation. In general, this problem is not even well-posed [de Silva-Lim 08].
Furthermore, the first eigenvector is not necessarily in the best rank-k
“orthogonal approximation” from orthogonal vectors [Kolda 01, 03].
Quantum entanglement. A(i,j,k,…,l) are the normalized amplitudes of an m-partite pure state |ψ⟩; A is a nonnegative symmetric tensor, and the largest Z-eigenvalue gives the geometric measure of entanglement. [Wei-Goldbart 03; Hu-Qi-Zhang 16]
Diffusion imaging. W is a symmetric, fourth-order kurtosis diffusion tensor and D is a symmetric 3 x 3 matrix ⟶ both are measured from MRI data. [Qi-Wang-Wu 08; Paydar et al., Am. J. of Neuroradiology, 2014]
(Image credit: Michael S. Helfenbein, Yale University; https://www.eurekalert.org/pub_releases/2016-05/yu-ddo052616.php)
Hardy-Weinberg equilibria of random mating models are tensor eigenvectors.
The distribution of alleles (forms of a gene) in the population at time t is x. Start with an infinite population.
1. Every individual gets a random mate.
2. Mates of type j and k produce offspring of type i with probability P(i, j, k) and then die.
Under Hardy-Weinberg equilibrium (steady state), x satisfies x = Px².
Related. Markovian binary trees: entry-wise minimal solutions to x = Bx² + a are extinction probabilities.
[Bean-Kontoleon-Taylor 08; Bini-Meini-Poloni 11; Meini-Poloni 11, 17]
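A minimal Julia sketch of finding the equilibrium by iterating the random-mating update (the name hw_equilibrium is ours; P is any transition probability tensor, symmetric in its last two modes):

# iterate x ← Px², i.e., x_i ← sum over j,k of P[i,j,k] x[j] x[k]
function hw_equilibrium(P; iters=100)
    n = size(P, 1)
    x = fill(1/n, n)    # start from the uniform allele distribution
    for _ in 1:iters
        x = [sum(P[i,j,k] * x[j] * x[k] for j in 1:n, k in 1:n) for i in 1:n]
    end
    return x            # a fixed point satisfies x = Px²
end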
Act 3. Review of stochastic processes: Markov chains & higher-order chains
Markov chains, matrices, and eigenvectors have a
long-standing relationship.
[Kemeny-Snell 76] “In the land of Oz they never have two nice
days in a row. If they have a nice day, they are just as likely to
have snow as rain the next day. If they have snow or rain, they
have an even chance of having the same the next day. If there
is a change from snow or rain, only half of the time is this
change to a nice day.”
Column-stochastic in this tutorial
(since we are linear algebra people).
Equations for stationary distribution x.
The vector x is an
eigenvector of P.
Px = x.
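As a concrete sketch, here is the Land of Oz chain transcribed from the quote into a column-stochastic matrix (state order: nice, rain, snow) with a power iteration for x:

P = [0.0  0.25  0.25;   # to nice
     0.5  0.50  0.25;   # to rain
     0.5  0.25  0.50]   # to snow; each column sums to 1
x = fill(1/3, 3)
for _ in 1:100
    global x = P * x    # at the fixed point, Px = x
end
# x ≈ [0.2, 0.4, 0.4]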
Markov chains are a special case of a stochastic
process.
Stochastic processes are (possibly infinite) sequences of random variables (RVs).
Z1, Z2, …, Zt, Zt+1, …
• Zt is a random variable.
• This is a discrete time stochastic process
Stochastic processes are models throughout applied math and life
• The weather
• The stock market
• Natural language
• Random walks on graphs
• Pólya’s urn
• Brownian motion
Stochastic processes are just sets of random
variables (RV). Often they are infinite and coupled.
Brownian Motion.
• My value at the next time goes up or down
by a normal random variable.
• Z0 = 0, Zt+1 = Zt + N(0,1), where N(0,1) is a normal random variable.
• Z = cumsum(randn(100,1)) gives a realization of a Brownian motion.
• Often used to model stock prices.
Stochastic processes are just sets of random
variables (RV). Often they are infinite and coupled.
Pólya Urn.
• Consider an urn with 1 purple and 1
green ball, draw a ball at random,
replace it with one of the same
color.
• Z0 = 1, Zt+1 = Zt + B(1, Zt / (t+2)): the increment is 1 with prob Zt / (t+2) and 0 otherwise.
Draw ball at random
Put ball back with
another of the same
color
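A minimal Julia simulation of the urn (the name polya is ours):

function polya(T)
    purple, total = 1, 2              # start with 1 purple and 1 green ball
    fracs = Float64[]
    for t in 1:T
        if rand() < purple / total    # draw a ball at random...
            purple += 1               # ...and put it back with another of the same color
        end
        total += 1
        push!(fracs, purple / total)
    end
    return fracs                      # each run converges, but to a run-dependent limit
end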
Stochastic processes are just sets of random
variables (RV). Usually they are infinite and coupled
somehow.
Finite Markov chain & random walk.
Z0 = “state”, Pr(Zt+1 = i | Zt = j) = Pij
• States are indexed by 1, …, n
• The random walk on a graph is a special Markov chain where the next state is a random neighbor of the current one.
• Random walks on weighted graphs and finite Markov chains are isomorphic.
PageRank Beyond the Web. David F. Gleich. SIAM Review 57(3), pp. 321–363, © 2015 Society for Industrial and Applied Mathematics.
The PageRank Markov chain and random walk is
another well known instance.
Originally, the random surfer model
• States are web-pages and links between
pages make a directed graph.
• The random surfer is a Markov chain
with prob α follow a random outlink and
with prob (1-α) go to a random page
“PageRank can be used for everything from analyzing the world's most important books to predicting traffic flow to ending sports arguments.”
- Jessica Leber, Fast Information
Higher-order Markov chains & random walks are
useful models for many data problems.
Higher order Markov chains & random walks
A second order chain uses the last two states
Z-1 = “state”, Z0 = “another state”
Pr(Zt+1 = i | Zt = j, Zt-1 = k) = Pi,j,k
Simple to understand and turn out to be better models
than standard (first-order) chains in several application
domains [Ching-Ng-Fung 08]
• Traffic flow in airport networks [Rosvall+ 14]
• Web browsing behavior [Pirolli-Pitkow 99; Chierichetti+ 12]
• DNA sequences [Borodovsky-McIninch 93; Ching-Fung-Ng 04]
• Non-backtracking walks in networks [Krzakala+ 13; Arrigo-Grindrod-Higham-Noferini 18]
Rosvall et al., Nature Comm., 2014.
A tensor!
Higher-order Markov chains are actually first-order
Markov chains in disguise.
Start with a second-order Markov chain
Consider a new stochastic process
on pairs of variables
Higher-order Markov chains are Markov chains on the product space.
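For a second-order chain, the product-space construction looks like this (a sketch; the pair-state encoding (j, k) ↦ (k-1)n + j is our choice):

# first-order chain on pairs: state (j, k) = (current, previous);
# the pair (j, k) moves to (i, j) with probability P[i,j,k]
function product_space_matrix(P)
    n = size(P, 1)
    M = zeros(n^2, n^2)
    for j in 1:n, k in 1:n
        col = (k - 1) * n + j
        for i in 1:n
            M[(j - 1) * n + i, col] = P[i,j,k]
        end
    end
    return M    # column-stochastic whenever P is
end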
Tensors are a natural representation of transition
probabilities of higher-order Markov chains.
Often called transition probability tensors. [Li-Ng-Ye 11; Li-Ng 14; Chu-Wu 14; Culp-Pearson-Zhang 17]
A note. Often we use the “second-order” case as
a stand-in for the “general” higher-order case.
Second order Markov chain
Z-1 = “state”, Z0 = “another state”
Pr(Zt+1 = i | Zt = j, Zt-1 = k) = Pijk
General higher-order Markov chain
Pr(Zt+1 = i | Zt = j, Zt-1 = k, …, Zt-m+1 = l) = P(i, j, k, …, l)
Terminology
• Second-order = 2 states of history ⟶ 3-mode tensor
• mth-order = m states of history ⟶ (m+1)-mode tensor
We love stochastic processes because
they give you an intuition and
“physics” about what is happening
A fundamental quantity for stochastic processes is
the fraction of time spent at each state (limiting
distribution).
Consider a stochastic process that goes on infinitely
where each Zj takes a discrete value from a finite set.
We want to know: how often are we in a particular state in the long run?
x_i = lim_{T→∞} (1/T) #{t ≤ T : Zt = i} (a Cesàro limit)
Other fundamental quantities include
• Return times
• Hitting times
Example limiting distribution with a random walk.
(Figure: the occupancy of each state converging over a long run.)
In the Pólya Urn, the limiting distribution of ball
draws always exists. It can converge to any value.
We have 1000 samples of the trajectories. (Figure annotation: the distribution of limiting values is the uniform distribution.)
For each realization, the sequence of
random variables
Z1, Z2, …, Zt, Zt+1, …
converges.
It does not converge to a unique value, but
rather can converge to any value.
Limiting distributions and stationary distributions
for Markov chains have different properties.
This point is often misunderstood. We want to make sure you get it right!
Limiting distribution ⟶ the Cesàro averages (1/T) Σ_{k=1..T} P^k e_start converge to p*.
Stationary distribution ⟶ P^k e_start converges to p*.
Theorem. A finite Markov chain always has a limiting distribution.
Theorem. The limiting distribution is unique if and only if the chain has only a single recurrent class.
Theorem. A stationary distribution is lim_{t→∞} Prob[Zt = i]. This exists and is unique if and only if the Markov chain has a single aperiodic, recurrent class.
States in a finite Markov chain are either recurrent
or transient.
Proof by picture.
Recurrent:
Prob[another visit] = 1
Transient:
Prob[another visit] < 1.
Markov chains ⟺ directed graphs.
Directed graphs + Tarjan's algorithm give the flow among strongly connected components (block triangular form).
(Figure: block triangular form from Tarjan's algorithm, with the strongly connected components on the diagonal and the recurrent states marked.)
The fundamental theorem of Markov chains is that
any stochastic matrix is Cesàro summable.
Limiting distribution given start node is P*[:, start] because
Pk gives the k-state transition probability.
Result. Only one recurrent class iff P* is rank 1.
Proof sketch. A recurrent class is a fully-stochastic sub-matrix. If there are >1 recurrent classes, then P* would be rank >1 because we could look at the sub-chain on each recurrent class; if P* is rank 1, then the distribution is the same regardless of where you start, and so there is “no choice.”
P* = lim_{T→∞} (1/T) Σ_{k=1..T} P^k (Cesàro summable; this limit always exists!)
Stationary distributions are much stronger than
limiting distribution
A stationary distribution ⟶ P^k converging to P*.
This requires a single aperiodic recurrent class, i.e., an irreducible & aperiodic matrix. (There are some funky cases if your chain is really two disconnected, independent chains.)
We can always turn a limiting distribution into a stationary distribution: turn P into a lazy Markov chain, e.g., (I + P)/2.
This is automatically aperiodic and doesn't change the recurrence.
Act 4. Spacey RWs as stochastic processes
Remember! Tensors are a natural representation
of transition probabilities of higher-order Markov
chains.
But the stationary distribution on pairs of states is
still a matrix eigenvector...
[Li-Ng 14] Making the “rank-1 approximation” Xj,k = xjxk gives a
formulation for tensor eigenvectors.
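In symbols (our reconstruction of the slide's algebra):

X_{ij} = \sum_k P(i,j,k)\, X_{jk} \quad \text{(stationary distribution on pairs)}
\text{set } X_{jk} = x_j x_k:\quad x_i x_j = x_j \sum_k P(i,j,k)\, x_k
\text{sum over } j:\quad x_i = \sum_{j,k} P(i,j,k)\, x_j x_k = (P x^2)_i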
The vector x satisfying Px² = x is nonnegative and sums to 1.
Thus, x often gets called a limiting distribution.
But all we have done is algebra!
What is a natural stochastic process that has this limiting distribution?
Spacey random walks are stochastic processes
whose limiting distribution(s) lead to such tensor
eigenvectors.
1. We are at state Zt = j and want to transition
according to P.
2. However, upon arriving at state Zt = j, we
space out and forget about Zt-1 = k.
3. We still want to do our best, so we choose state Yt = r uniformly from our history Z1, Z2, …, Zt (technically, we initialize having visited each state once).
4. We then follow P pretending that Zt-1 = r.
Stochastic process Z1, Z2, …, Zt, Zt+1, … with states in {1, …, n}.
“Spacey” or “space out”? 走神 (“mind wandering”) or 心不在焉 (“absent-minded”), according to David's students.
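A minimal Julia simulation of the process (the name spacey_walk is ours; assumes every column P[:, j, r] sums to one, so no zero-column fill-in is needed):

function spacey_walk(P, T)
    n = size(P, 1)
    counts = ones(Int, n)    # technically: initialize having visited each state once
    z = rand(1:n)
    for t in 1:T
        # space out: draw Yt from the occupancy of the history
        y = findfirst(cumsum(counts ./ sum(counts)) .>= rand())
        # transition pretending Z_{t-1} = y
        z = findfirst(cumsum(P[:, z, y]) .>= rand())
        counts[z] += 1
    end
    return counts ./ sum(counts)   # empirical occupancy ≈ a limiting distribution
end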
Spacey random walks are stochastic processes
whose limiting distributions are such tensor
eigenvectors.
(Figure: a walk history …, 10, 12, 4, 9, 7, 11, 4 with current state Zt, previous state Zt-1, and the spaced-out guess Yt drawn from the history.)
Key insight [Benson-Gleich-Lim 17]
Limiting distributions of this process are tensor eigenvectors of P.
Prob(Zt+1 = i | Zt = j, Yt = r) = P(i, j, r).
The main point.
Limiting distributions of the spacey random walk
stochastic process are tensor eigenvectors of P
(we’ll prove this later).
We have to be careful with undefined transitions,
which correspond to zero columns in the tensor.
Prob(Zt+1 = i | Zt = j, Yt = r) = ? when P(:, j, r) = 0.
A couple options.
1. Pre-specify a distribution for when P(:, j, r) = 0.
2. Choose a random state from history ⟶ super SRW [Wu-Benson-Gleich 16]
(Figure: 3-node cycle where each node moves to either neighbor with probability 1/2.)
Limiting distribution of
RW is [1/3, 1/3, 1/3].
What about non-backtracking RW?
NBRW disallows going back to where you came from
and re-normalizes the probabilities.
Lim. dist. is still [1/3, 1/3, 1/3], but for very different reasons.
NBRW is a second-order Markov chain!
What happens with the spacey random walk using the NBRW transition probabilities?
Zero-column fill-in
affects the limiting
distribution and tensor
evec.
Follow along with Jupyter notebook!
3-node-cycle-walks.ipynb
FAQ. Please ask your own questions, too!
1. What’s a spacey random walk, again?
A stochastic process defined by a transition probability tensor.
2. Is the spacey random walk a Markov chain?
No, not in general: the transitions depend on the entire history.
3. Is the limiting distribution of a higher-order MC a tensor e-vec?
No, not in general.
4. Why not just compute the stat. dist. of the higher-order MC?
We are motivating tensor eigenvectors from a stochastic processes view.
5. What is an e-vec with e-val 1 of a transition probability tensor?
It could be the limiting distribution of a spacey random walk.
Follow along with Jupyter notebook! 7-node-line-walks.ipynb
(Figure: random walk on a 7-node line graph, nodes 1-7, with transition probability 1/2 along each edge; annotated limiting-distribution values 1/3, 1/3, 1/6, 1/6.)
What will happen with a…
RW?
NBRW?
SRW (uniform fill-in)?
SRW (RW stat. dist. fill-in)?
SRW (NBRW stat. dist. fill-in)?
SSRW?
(Not well defined) conjecture.
When the transition probability tensor entries come from
a non-backtracking random walk, the spacey random
walk “interpolates” between the standard random walk
and the non-backtracking one.
Pólya urns are spacey random walks.
Draw random ball.
Put ball back with another of
the same color
This is a second-order spacey
random walk with two states.
Consequently, we know this one must
converge because it’s a Pólya Urn!
But didn’t Pólya Urns have any limiting
distribution? Does this mean that tensor is
interesting? Yes!
Any stochastic vector is a tensor eigenvector
and these are also limiting distributions.
Act 5. Theory of spacey random walks
Spacey random walks have a number of
interesting properties as well as a number of open
challenges!
Properties
1. Limiting distributions of SRWs are tensor evecs with
eval 1 (proof shortly!)
2. Asymptotically, SRWs are first-order Markov chains.
3. If there are just 2 states, then the SRW converges but
possibly to one of several distributions.
4. If P is sufficiently “regularized”, then the SRW
converges to a unique limiting distribution.
Open problems
• Existence?
• Uniqueness?
• Computation?
Note. Spacey random walks
are defined by a stochastic
transition tensor, so these
are all tensor questions!
An informal and intuitive proof that
spacey random walks converge to tensor
eigenvectors
Idea. Let wT be the fraction of time
spent in each state after T ≫ 1 steps.
Consider an additional L steps, T ≫ L ≫ 1. Then wT ≈ wT+L if we converge.
(Figure: occupancy wT after a long time and wT+L after L more steps.)
Suppose M(x) = P[wT]^{m-2} has a unique stationary distribution, xT.
If the SRW converges, then xT = wT+L; otherwise wT+L would be different.
Thus, xT = P[wT]^{m-2} xT ≈ P[wT+L]^{m-2} xT = P[xT]^{m-2} xT = P xT^{m-1}.
To formalize convergence, we need the theory of
generalized vertex reinforced random walks
(GVRRW).
A stochastic process X1, …, Xt, … is a GVRRW if Pr(XT+1 = i | FT) = [M(wT)]_{i,XT}, where
wT is the fraction of time in each state,
FT is the sigma algebra generated by X1, …, XT, and
M(wT) is a column stochastic matrix that depends on wT.
[Diaconis 88; Pemantle 92, 07; Benaïm 97]
The classic VRRW is the following: given a graph, randomly move to a neighbor with probability proportional to how often we've visited that neighbor!
Spacey random walks are GVRRWs with the map M(wT) = P[wT]^{m-2}.
Theorem [Benaïm 97], heavily paraphrased.
In a discrete GVRRW, the long-term behavior of the occupancy distribution wT follows the long-term behavior of the dynamical system dx/dt = Π(M(x)) - x.
To study convergence properties of the SRW, we just need to study the dynamical system for our map M(x) = P[x]^{m-2}:
dx/dt = Π(P[x]^{m-2}) - x,
where Π maps a column stochastic matrix to its Perron vector.
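A minimal Julia sketch of integrating this system with forward Euler for a 3-mode transition tensor (the names are ours; the Perron vector is computed by a simple power iteration):

collapse(P, x) = sum(P[:,:,k] * x[k] for k in 1:size(P, 3))   # M(x) = P[x]

function perron(M; iters=500)      # Perron vector of a column-stochastic matrix
    v = fill(1 / size(M, 1), size(M, 1))
    for _ in 1:iters
        v = M * v
        v ./= sum(v)
    end
    return v
end

function srw_dynamics(P; h=0.5, steps=200)   # forward Euler on dx/dt = Π(P[x]) - x
    x = fill(1 / size(P, 1), size(P, 1))
    for _ in 1:steps
        x += h * (perron(collapse(P, x)) - x)
    end
    return x    # if the system converges, x is a tensor e-vec with e-val 1
end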
More on how stationary distributions of GVRRWs
correspond to ODEs
Theorem [Benaïm 1997], less paraphrased.
The sequence of empirical observation probabilities ct is an asymptotic pseudo-trajectory for the dynamical system dx/dt = Π(M(x)) - x. Thus, convergence of the ODE to a fixed point is equivalent to stationary distributions of the VRRW.
• M must always have a unique stationary distribution!
• The map to M must be very continuous.
• Asymptotic pseudo-trajectories satisfy lim_{t→∞} sup_{0≤h≤T} ‖c_{t+h} - Φ_h(c_t)‖ = 0 for all T, where Φ is the flow of the dynamical system.
Spacey random walks converge to tensor
eigenvectors (a more formal proof).
Suppose that the SRW converges. Then we converge to a stationary point.
Remember the informal proof. All we’ve
done is just formalize this by using the
dynamical system to map behavior!
Corollary. Asymptotically, GVRRWs (including
spacey random walks) act as first-order Markov
chains.
Suppose that the SRW converges to x. Then the walk asymptotically follows the fixed Markov chain with transition matrix P[x]^{m-2}.
Relationship between spacey random walk
convergence and existence of tensor
eigenvectors.
SRW converges ⇒ existence of tensor e-vec of P with e-val 1.
SRW converges ⇍ existence of tensor e-vec of P with e-val 1.
The map f(x) = Px^{m-1} satisfies the conditions of Brouwer's fixed point theorem, so there always exists an x such that Px^{m-1} = x.
Furthermore, 𝜆 = 1 is the largest eigenvalue. [Li-Ng 14]
There exists a P for which the SRW does not converge [Peterson 18]
General Open Question.
Under what conditions does the spacey random walk converge?
Peterson’s Conjecture.
If P is a 3-mode tensor, then the spacey random walk converges.
Broader conjecture
There is always a (generalized) SRW that converges to a tensor evec.
What we have been able to show so far.
1. If there are just 2 states, then the SRW converges.
2. If P is sufficiently “regularized”, then the SRW converges.
Almost every 2-state spacey random walk
converges.
[Benson-Gleich-Lim 17]
Special case of 2 x 2 x 2 system...
Almost every 2-state spacey random walk
converges.
Theorem [Benson-Gleich-Lim 17]
The dynamics of almost every
2 x 2 x … x 2 spacey random
walk (of any order) converges
to a stable equilibrium point.
(Figure: 1-D dynamics with two stable equilibria and one unstable equilibrium between them.)
Things to note…
1. Multiple stable points in above example; SRW could converge to any.
2. Randomness of SRW is “baked in” to initial condition of system.
A sufficiently regularized spacey random walk
converges.
Consider a modified “spacey random surfer” model. At each step,
1. with probability α, follow SRW model P.
2. with probability 1 - α, teleport to a random node.
Equivalent to a SRW on S = αP + (1 – α)J, where J is normalized ones tensor.
Theorem.
If α < 1 / (m – 1),
1. the SRW on S converges [Benson-Gleich-Lim 17]
2. there is a unique tensor x e-vec satisfying Sxm-1 = x [Gleich-Lim-Yu 15]
[Gleich-Lim-Yu 15; Benson-Gleich-Lim 17]
A sufficiently regularized spacey random walk
converges.
The higher-order power method is an algorithm to compute the
dominant tensor eigenvector.
y_{k+1} = T x_k^{m-1},   x_{k+1} = y_{k+1} / ||y_{k+1}||
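A minimal Julia sketch of this iteration for a 3-mode tensor, together with the regularized tensor S (the names hopm and surfer are ours; the theorem below concerns running the iteration on S):

using LinearAlgebra

function hopm(T; maxit=200, tol=1e-10)
    n = size(T, 1)
    x = normalize(ones(n))
    lam = 0.0
    for _ in 1:maxit
        y = [sum(T[i,j,k] * x[j] * x[k] for j in 1:n, k in 1:n) for i in 1:n]
        lam = norm(y)          # y_{k+1} = T x_k^{m-1}
        xnew = y / lam         # x_{k+1} = y_{k+1} / ||y_{k+1}||
        norm(xnew - x) < tol && return xnew, lam
        x = xnew
    end
    return x, lam
end

# S = αP + (1-α)J, with J the normalized all-ones tensor (entries 1/n)
surfer(P, alpha) = alpha .* P .+ (1 - alpha) / size(P, 1)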
Theorem [Gleich-Lim-Yu 15]
If α < 1 / (m – 1), the power method on S = αP + (1 – α)J converges to
the unique vector satisfying Sxm-1 = x.
Conjecture.
If the higher-order power method on P always converges, then
the spacey random walk on P always converges.
Conjecture.
Determining if a SRW converges is PPAD-complete.
Computing a limiting distribution of SRW is PPAD-complete.
Why?
In general, it is NP-hard to determine if a tensor has an e-vec for a given e-val 𝜆 [Hillar-Lim 13].
Know evec exists for transition probability tensor P, eval 𝜆 = 1 [Li-Ng 14].
However, no obvious way to compute it.
Similar to other PPAD-complete problems (e.g., Nash equilibria).
General Open Question.
What is the best way to compute tensor eigenvectors?
• Higher-order power method
[Kofidis-Regalia 00, 01; De Lathauwer-De Moor-Vandewalle 00]
• Shifted higher-order power method [Kolda-Mayo 11]
• SDP hierarchies [Cui-Dai-Nie 14; Nie-Wang 14; Nie-Zhang 18]
• Perron iteration [Meini-Poloni 11, 17]
For SRWs, the dynamical system offers another way.
Numerically integrate the dynamical system!
[Benson-Gleich-Lim 17; Benson-Gleich 18]
Equivalent to Perron iteration with Forward Euler & unit time-step.
Act 6. Applications of spacey random walks
Applications of spacey random walks.
1. Pólya urns are SRWs.
2. SRWs model taxi sequence data.
3. Asymptotics of SRWs for data clustering.
4. Insight for new algorithms to compute tensor eigenvectors.
Stochastic processes offer a new and exciting set of
opportunities and challenges in tensor algorithms. (Us, Slide 10)
(Review) Pólya urns are spacey random
walks.
Draw random ball.
Put ball back with another of
the same color
This is a second-order spacey
random walk with two states.
We know it converges by our theory
(every two-state process converges).
Generalized Pólya urns are spacey random walks.
Draw m random balls
with replacement.
Put in new green ball with
probability q(b1, b2, …, bm).
This is an (m-1)-order spacey random walk with two states.
We know it converges by our theory
(every two-state process converges).
b1 b2 bm
…
Applications of spacey random walks.
1. Pólya urns are SRWs.
2. SRWs model taxi sequence data.
3. Asymptotics of SRWs for data clustering.
4. Insight for new algorithms to compute tensor eigenvectors.
Stochastic processes offer a new and exciting set of
opportunities and challenges in tensor algorithms. (Us, Slide 10)
Spacey random walks model sequence data.
Maximum likelihood estimation problem: find the most likely P for the SRW model and the observed data. The problem has a convex objective and linear constraints.
nyc.gov
[Benson-Gleich-Lim 17]
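In symbols, our reconstruction of the formulation from [Benson-Gleich-Lim 17] (w_t is the history occupancy vector at step t and z_1, z_2, … the observed sequence):

\max_{P}\ \sum_{t} \log\Big(\sum_{r} (w_t)_r\, P(z_{t+1}, z_t, r)\Big)
\quad \text{s.t.}\quad P \ge 0, \quad \sum_i P(i, j, r) = 1 \ \ \forall j, r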
What is the SRW model saying for this data? Model people by locations.
• A passenger with location k is drawn at random.
• The taxi picks up the passenger at location j.
• The taxi drives the passenger to location i with probability Pi,j,k
Approximate location dist. by history ⟶ spacey random walk.
Spacey random walks model sequence data.
nyc.gov
• One year of 1000 taxi trajectories in NYC.
• States are neighborhoods in Manhattan.
• Compute MLE P for SRW model with 800 taxis.
• Evaluate RMSE on test data of 200 taxis.
RMSE = 1 – Prob[sequence generated by process]
Spacey random walks model sequence data.
Spacey random walks are identifiable via this
procedure.
Two difficult test tensors from [Gleich-Lim-Yu 15]
1. Generate 80 sequences with 200 transitions each from SRW model
Learn P for 2nd-order SRW, R for 2nd-order MC, P for 1st-order MC
2. Generate 20 sequences with 200 transitions each and evaluate RMSE.
Evaluate RMSE = 1 – Prob[sequence generated by process]
Applications of spacey random walks.
1. Pólya urns are SRWs.
2. SRWs model taxi sequence data.
3. Asymptotics of SRWs for data clustering.
4. Insight for new algorithms to compute tensor eigenvectors.
Stochastic processes offer a new and exciting set of
opportunities and challenges in tensor algorithms. (Us, Slide 10)
Co-clustering nonnegative tensor data.
Joint work with
Tao Wu, Purdue
Spacey random walks that converge are
asymptotically Markov chains.
• The occupancy vector wT converges to w ⟶ the dynamics converge to the Markov chain P[w]^{m-2}.
This connects to spectral clustering on graphs.
• Eigenvectors of the normalized Laplacian of a graph are
eigenvectors of the random walk matrix.
• Instead, we compute a stationary distribution w and use eigenvectors of P[w]^{m-2}.
[Wu-Benson-Gleich 16]
We possibly symmetrize and normalize
nonnegative data to get a transition probability
tensor.
(Figure: a symmetric cube indexed by [1, 2, …, n] x [1, 2, …, n] x [1, 2, …, n] vs. a general brick indexed by [i1, …, in1] x [j1, …, jn2] x [k1, …, kn3].)
If the data is a symmetric cube, we can normalize it to get a transition tensor P.
If the data is a brick, we symmetrize before normalization [Ragnarsson-Van Loan 13], a generalization of the matrix symmetrization [0 A; Aᵀ 0].
The clustering methodology.
Input. Nonnegative brick of data.
1. Symmetrize the brick (if necessary).
2. Normalize to a stochastic tensor.
3. Estimate the stationary distribution of the spacey random walk (or super-spacey random walk for sparse data).
4. Form the asymptotic Markov model.
5. Bisect indices using an eigenvector of the asymptotic Markov model.
6. Recurse.
Output. Partition of indices.
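A sketch of step 2 for a symmetric cube T (the name stochasticize is ours; zero columns are left at zero and handled by the super-spacey fill-in):

function stochasticize(T)
    n = size(T, 1)
    P = zeros(n, n, n)
    for j in 1:n, k in 1:n
        s = sum(T[:, j, k])
        if s > 0
            P[:, j, k] = T[:, j, k] ./ s   # each nonzero column now sums to 1
        end
    end
    return P
end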
Ti,j,k = #(flights between airport i and airport j on airline k)
Clustering airline-airport-airport networks.
(Figure: unclustered, no apparent structure; clustered, diagonal structure evident.)
“best” clusters
• pronouns & articles (the, we, he, …)
• prepositions & link verbs (in, of, as, to, …)
fun 3-gram clusters
• {cheese, cream, sour, low-fat, frosting, nonfat, fat-free}
• {bag, plastic, garbage, grocery, trash, freezer}
fun 4-gram cluster
• {german, chancellor, angela, merkel, gerhard, schroeder, helmut, kohl}
Ti,j,k = #(consecutive co-occurrences of words i, j, k in corpus)
Ti,j,k,l = #(consecutive co-occurrences of words i, j, k, l in corpus)
Data from the Corpus of Contemporary American English (COCA), www.ngrams.info.
Clustering n-grams in natural language.
Applications of spacey random walks.
1. Pólya urns are SRWs.
2. SRWs model taxi sequence data.
3. Asymptotics of SRWs for data clustering.
4. Insight for new algorithms to compute tensor eigenvectors.
Stochastic processes offer a new and exciting set of
opportunities and challenges in tensor algorithms. (Us, Slide 10)
New framework for computing tensor evecs.
[Benson-Gleich 18]
Our stochastic viewpoint gives a new approach.
We numerically integrate the dynamical system.
Many tensor eigenvector computation algorithms are algebraic and look like generalizations of the matrix power method, shifted iteration, or Newton iteration.
[De Lathauwer-De Moor-Vandewalle 00; Regalia-Kofidis 00; Li-Ng 14; Chu-Wu 14; Kolda-Mayo 11, 14]
Higher-order power method ⟶ many known convergence issues!
Dynamical system ⟶ numerically integrate instead.
1. The dynamical system is empirically more robust for
principal evec of transition probability tensors.
2. Can generalize for symmetric tensors & any evec.
New framework for computing tensor evecs.
[Benson-Gleich 18]
Let Λ be a prescribed map from a matrix to one of its eigenvectors, e.g.,
Λ(M) = eigenvector of M for kth smallest algebraic eigenvalue,
Λ(M) = eigenvector of M for largest magnitude eigenvalue
Suppose the dynamical system dx/dt = Λ(T[x]^{m-2}) - x converges. Then the limit x satisfies Λ(T[x]^{m-2}) = x and is a tensor eigenvector.
New computational framework.
1. Choose a map Λ.
2. Numerically integrate the dynamical system.
The algorithm is evolving this system!
The algorithm has a simple Julia code

using LinearAlgebra

function mult3(A, x)
    # tensor-collapse product: M = sum_k A[:,:,k] * x[k]
    M = zeros(size(A,1), size(A,2))
    for k in 1:size(A,3)
        M += A[:,:,k] * x[k]
    end
    return M
end

function dynsys_tensor_eigenvector(A; maxit=100, k=1, h=0.5)
    x = normalize!(randn(size(A,1)))
    # the ODE right-hand side: F(x) = Λ(M(x)) - x
    F = function(x)
        M = mult3(A, x)
        d, V = eigen(M)      # we use Julia's eigenvalue ordering (*)
        v = V[:,k]           # pick out the kth eigenvector
        if real(v[1]) >= 0; v *= -1.0; end   # canonicalize the sign
        return real(v) - x
    end
    # evolve the ODE via forward Euler
    for iter in 1:maxit
        x = x + h*F(x)
    end
    return x, x'*mult3(A,x)*x   # eigenvector and its Rayleigh quotient
end
New framework for computing tensor evecs.
Empirically, we can compute all the tensor eigenpairs with this approach (including
unstable ones that higher-order power method cannot compute).
tensor is Example 3.6 from [Kolda-Mayo 11]
Why does this work? (Hand-wavy version)
(Figure: trajectory of the dynamical system for Example 3.6 from Kolda and Mayo [2011]; color is the projection onto the first eigenvector of the Jacobian, which is +1 at stationary points; numerical integration with forward Euler.)
Why does this work?
The eigenvector map shifts
the spectrum around
unstable eigenvectors.
There are tons of open questions with this
approach that we could use help with!
Can the dynamical system cycle?
Yes, but what problems produce this behavior?
Which eigenvector (k) to use?
It really matters!
How to numerically integrate?
Seems like ODE45 does the trick!
SSHOPM -> Dyn Sys?
If SSHOPM converges, can you show the dyn.
sys will converge for some k?
Can you show there are inaccessible vecs?
No clue right now!
New framework for computing tensor evecs.
• SDP methods can compute all eigenpairs but have scalability issues [Cui-Dai-Nie 14; Nie-Wang 14; Nie-Zhang 17]
• Empirically, we can compute the same eigenvectors
while maintaining scalability.
tensor is Example 4.11 from [Cui-Dai-Nie 14]
Stochastic processes offer a new and exciting set
of opportunities and challenges in tensor
algorithms.
Usually, the properties of these objects are explored algebraically or through
polynomial interpretations.
Our tutorial focused on interpreting the tensor objects stochastically!
Stochastic processes offer a new and exciting set
of opportunities and challenges in tensor
algorithms.
Hopefully, you should know a little bit about…
And where to look for more info! www.cs.cornell.edu/~arb/tesp
• Tensor eigenvectors
• Z-eigenvectors
• Irreducible tensors
• Higher-order Markov chains
• Spacey random walks
• Vertex reinforced random walks
• Dynamical systems for trajectories
• Fitting spacey random walks to data
• Multilinear PageRank models
• Clustering tensors
• And lots of open problems in this area!
Open problems abound!
General Open Questions.
1. What is the relationship between RWs, non-backtracking RWs, and SRWs?
2. Under what conditions does the spacey random walk converge?
3. What is the computational complexity surrounding SRWs?
4. How well does the dynamical system work for computing tensor evecs?
5. How can we use stochastic or dynamical systems views for H-eigenpairs?
6. More data mining applications?
Conjectures.
1. If P is a 3-mode tensor, then the spacey random walk converges.
2. If the HOPM on P always converges, the SRW on P always converges.
3. Determining if a SRW converges is PPAD-complete.
4. Computing a limiting distribution of SRW is PPAD-complete.
Tensor Eigenvectors and Stochastic Processes.
Thanks for your attention!
Today’s information & more. www.cs.cornell.edu/~arb/tesp
Austin R. Benson
http://cs.cornell.edu/~arb
@austinbenson
arb@cs.cornell.edu
David F. Gleich
https://www.cs.purdue.edu/homes/dgleich/
@dgleich
dgleich@purdue.edu
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
 
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
 
UGC NET Exam Paper 1- Unit 1:Teaching Aptitude
UGC NET Exam Paper 1- Unit 1:Teaching AptitudeUGC NET Exam Paper 1- Unit 1:Teaching Aptitude
UGC NET Exam Paper 1- Unit 1:Teaching Aptitude
 
BÀI TẬP BỔ TRỢ TIẾNG ANH LỚP 9 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2024-2025 - ...
BÀI TẬP BỔ TRỢ TIẾNG ANH LỚP 9 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2024-2025 - ...BÀI TẬP BỔ TRỢ TIẾNG ANH LỚP 9 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2024-2025 - ...
BÀI TẬP BỔ TRỢ TIẾNG ANH LỚP 9 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2024-2025 - ...
 
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdfবাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
 
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPLAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
 
PCOS corelations and management through Ayurveda.
PCOS corelations and management through Ayurveda.PCOS corelations and management through Ayurveda.
PCOS corelations and management through Ayurveda.
 
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
 
clinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdfclinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdf
 
South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)
 

Tensor Eigenvectors and Stochastic Processes

  • 11. Act 1. This overview • Basic notation and operations • The fundamental problems Act 2. Motivating applications • Compression • Diffusion imaging • Hardy-Weinberg genetics Act 3. Review of Stochastic processes Markov Chains & Higher-order chains • Limiting and stationary distributions • Irreducibility Act 4. Spacey RWs as stochastic processes • Pause for interpretations and thought • FAQ Act 5. Theory of spacey random walks • Limiting dists are tensor evecs • Dynamical systems & vertex reinforced RWs • (Non-) existence, uniqueness, convergence • Computation Act 6. Applications of spacey random walks • Pólya urns, sequence data, tensor clustering • New algorithm for computing tensor eigenvectors
  • 12. The best rank-1 approximation to a symmetric tensor is given by the principal eigenvector. [De Lathauwer 97; De Lathauwer-De Moor-Vandewalle 00; Kofidis-Regalia 01, 02] A is symmetric if its entries are the same under any permutation of the indices. In data mining and signal processing applications, we are often interested in the “best” rank-1 approximation. Notes. The first k tensor eigenvectors do not necessarily give the best rank-k approximation. In general, this problem is not even well-posed [de Silva-Lim 08]. Furthermore, the first eigenvector is not necessarily in the best rank-k “orthogonal approximation” from orthogonal vectors [Kolda 01, 03]. 3 2 1 SIAM ALA'18Benson & Gleich 12
  • 13. Quantum entanglement A(i,j,k,…,l) are the normalized amplitudes of an m-partite pure state |ψ> A is a nonneg sym tensor Diffusion imaging W is a symmetric, fourth-order kurtosis diffusion tensor D is a symmetric, 3 x 3 matrix ⟶ both are measured from MRI data. Michael S. Helfenbein, Yale University, https://www.eurekalert.org/pub_releases/2016-05/yu-ddo052616.php [Wei-Goldbart 03; Hu-Qi-Zhang 16] is the geometric measure of entanglement Paydar et al., Am. J. of Neuroradiology, 2014 [Qi-Wang-Wu 08] SIAM ALA'18Benson & Gleich 13
  • 14. Markovian binary trees. Entry-wise minimal solutions to x = Bx^2 + a are extinction probabilities. Distribution of alleles (forms of a gene) in a population at time t is x. Start with an infinite population. 1. Every individual gets a random mate. 2. Mates of type j and k produce offspring of type i with probability P(i, j, k) and then die. Hardy-Weinberg equilibria of random mating models are tensor eigenvectors. Under Hardy-Weinberg equilibria (steady-state), x satisfies x = Px^2. [Bean-Kontoleon-Taylor 08; Bini-Meini-Poloni 11; Meini-Poloni 11, 17] SIAM ALA'18Benson & Gleich 14
  • 15. Act 1. This overview • Basic notation and operations • The fundamental problems Act 2. Motivating applications • Compression • Diffusion imaging • Hardy-Weinberg genetics Act 3. Review of Stochastic processes Markov Chains & Higher-order chains • Limiting and stationary distributions • Irreducibility Act 4. Spacey RWs as stochastic processes • Pause for interpretations and thought • FAQ Act 5. Theory of spacey random walks • Limiting dists are tensor evecs • Dynamical systems & vertex reinforced RWs • (Non-) existence, uniqueness, convergence • Computation Act 6. Applications of spacey random walks • Pólya urns, sequence data, tensor clustering • New algorithm for computing tensor eigenvectors
  • 16. Markov chains, matrices, and eigenvectors have a long-standing relationship. [Kemeny-Snell 76] “In the land of Oz they never have two nice days in a row. If they have a nice day, they are just as likely to have snow as rain the next day. If they have snow or rain, they have an even chance of having the same the next day. If there is a change from snow or rain, only half of the time is this change to a nice day.” Column-stochastic in this tutorial (since we are linear algebra people). Equations for stationary distribution x. The vector x is an eigenvector of P. Px = x. SIAM ALA'18Benson & Gleich 16
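
For a concrete feel, here is a minimal Julia sketch (our encoding of the Land of Oz chain as a column-stochastic matrix; the stationary values [0.4, 0.2, 0.4] are the classical answer):

    using LinearAlgebra

    # Land of Oz chain, column-stochastic; states 1 = rain, 2 = nice, 3 = snow.
    P = [1/2 1/2 1/4;
         1/4 0   1/4;
         1/4 1/2 1/2]

    # The stationary distribution is the eigenvector of P with eigenvalue 1.
    d, V = eigen(P)
    x = real.(V[:, argmax(real.(d))])
    x ./= sum(x)        # rescale to a probability vector
    # x ≈ [0.4, 0.2, 0.4]: rainy and snowy days each 40%, nice days 20%.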
  • 17. Markov chains are a special case of a stochastic process. Stochastic processes are a (possibly infinite) sequence of random variables (RVs). Z1, Z2, …, Zt, Zt+1, … • Zt is a random variable. • This is a discrete-time stochastic process. Stochastic processes are models throughout applied math and life • The weather • The stock market • Natural language • Random walks on graphs • Pólya’s urn • Brownian motion SIAM ALA'18Benson & Gleich 17
  • 18. Stochastic processes are just sets of random variables (RV). Often they are infinite and coupled. Brownian Motion. • My value at the next time goes up or down by a normal random variable. • Z0 = 0, Zt+1 = Zt + N(0,1) • Z = cumsum(randn(100,1)) Z is a realization of a Brownian motion • Often used to model stock prices normal random variable SIAM ALA'18Benson & Gleich 18
  • 19. Stochastic processes are just sets of random variables (RV). Often they are infinite and coupled. Pólya Urn. • Consider an urn with 1 purple and 1 green ball, draw a ball at random, replace it with one of the same color. • Z0 = 1, Zt+1 = Zt + B(1, Zt / (t+2)) 1 with prob Zt / (t+2) 0 otherwise Draw ball at random Put ball back with another of the same color SIAM ALA'18Benson & Gleich 19
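
A minimal Julia sketch of this urn (our implementation; the function name polya_urn is ours):

    # Start with one purple and one green ball; draw a ball uniformly at
    # random and return it together with another ball of the same color.
    function polya_urn(T::Int)
        purple, green = 1, 1
        frac = zeros(T)                 # fraction of purple balls over time
        for t in 1:T
            if rand() < purple / (purple + green)
                purple += 1             # drew purple: add another purple
            else
                green += 1              # drew green: add another green
            end
            frac[t] = purple / (purple + green)
        end
        return frac
    end

    # Each realization converges, but different runs reach different limits.
    limits = [polya_urn(10_000)[end] for _ in 1:5]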
  • 20. Stochastic processes are just sets of random variables (RV). Usually they are infinite and coupled somehow. Finite Markov chain & random walk. Z0 = “state”, Pr(Zt+1 = i | Zt = j) = Pij • States are indexed by 1, …, n • The random walk on a graph is a special Markov chain where the probability of stepping from node j to node i is proportional to the weight of edge (j, i) • Random walks on weighted graphs and finite Markov chains are isomorphic SIAM ALA'18Benson & Gleich 20
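
In code, that special structure is one line (a sketch under the tutorial’s column-stochastic convention; walk_matrix is our name):

    # Random-walk matrix of a weighted graph with adjacency matrix A;
    # column j is the distribution over the neighbors of node j,
    # with probabilities proportional to the edge weights.
    walk_matrix(A) = A ./ sum(A, dims=1)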
  • 21. SIAM REVIEW © 2015 Society for Industrial and Applied Mathematics Vol. 57, No. 3, pp. 321–363 PageRank Beyond the Web∗ David F. Gleich† The PageRank Markov chain and random walk is another well-known instance: the original random surfer model. • States are web pages, and links between pages make a directed graph. • The random surfer is a Markov chain: with prob α follow a random outlink and with prob (1-α) go to a random page. “PageRank can be used for everything from analyzing the world's most important books to predicting traffic flow to ending sports arguments.” -JESSICA LEBER, Fast Information. David F. Gleich SIAM ALA'18Benson & Gleich 21
  • 22. Higher-order Markov chains & random walks are useful models for many data problems. Higher-order Markov chains & random walks. A second-order chain uses the last two states Z-1 = “state”, Z0 = “another state” Pr(Zt+1 = i | Zt = j, Zt-1 = k) = Pi,j,k They are simple to understand and turn out to be better models than standard (first-order) chains in several application domains [Ching-Ng-Fung 08] • Traffic flow in airport networks [Rosvall+ 14] • Web browsing behavior [Pirolli-Pitkow 99; Chierichetti+ 12] • DNA sequences [Borodovsky-McIninch 93; Ching-Fung-Ng 04] • Non-backtracking walks in networks [Krzakala+ 13; Arrigo-Grindrod-Higham-Noferini 18] Rosvall et al., Nature Comm., 2014. A tensor! SIAM ALA'18Benson & Gleich 22
  • 23. Higher-order Markov chains are actually first-order Markov chains in disguise. Start with a second-order Markov chain. Consider a new stochastic process on pairs of variables, Yt = (Zt, Zt-1); then Yt is a first-order Markov chain. Higher-order Markov chains are Markov chains on the product space. (A sketch of this flattening follows below.) SIAM ALA'18Benson & Gleich 23
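
A sketch of the flattening in Julia (our encoding of pairs; product_space_matrix is a hypothetical helper name):

    # Flatten a second-order chain P (n x n x n; column P[:, j, k] is the
    # distribution of Z_{t+1} given Z_t = j, Z_{t-1} = k) into a first-order
    # chain on pairs. Pair (a, b) is encoded as index (a - 1) * n + b.
    function product_space_matrix(P::Array{Float64,3})
        n = size(P, 1)
        M = zeros(n^2, n^2)
        for j in 1:n, k in 1:n
            col = (j - 1) * n + k        # current pair (Z_t, Z_{t-1}) = (j, k)
            for i in 1:n
                row = (i - 1) * n + j    # next pair (Z_{t+1}, Z_t) = (i, j)
                M[row, col] = P[i, j, k]
            end
        end
        return M    # column-stochastic matrix on the product space
    end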
  • 24. Tensors are a natural representation of transition probabilities of higher-order Markov chains. 1 3 2 P Often called transition probability tensors. [Li-Ng-Ye 11, Li-Ng 14, Chu-Wu 14, Culp-Pearson-Zhang 17] SIAM ALA'18Benson & Gleich 24
  • 25. A note. Often we use the “second-order” case as a stand-in for the “general” higher-order case. Second-order Markov chain Z-1 = “state”, Z0 = “another state” Pr(Zt+1 = i | Zt = j, Zt-1 = k) = Pijk General higher-order Markov chain Pr(Zt+1 = i | Zt = j, Zt-1 = k, …, Zt-m+1 = l) = P(i, j, k, …, l) Terminology • Second-order = 2 states of history ⟶ 3-mode tensor • mth-order = m states of history ⟶ (m+1)-mode tensor An (m+1)-mode tensor. SIAM ALA'18Benson & Gleich 25
  • 26. We love stochastic processes because they give you an intuition and “physics” about what is happening SIAM ALA'18Benson & Gleich 26
  • 27. A fundamental quantity for stochastic processes is the fraction of time spent at each state (limiting distribution). Consider a stochastic process that goes on infinitely where each Zj takes a discrete value from a finite set. We want to know: how often are we in a particular state in the long run? x(i) = lim_{T⟶∞} (1/T) Σ_{t=1}^{T} Ind[Zt = i] (a Cesàro limit). Other fundamental quantities include • Return times • Hitting times SIAM ALA'18Benson & Gleich 27
  • 28. Example limiting distribution with a random walk. [Figure: a sample random-walk trajectory; over a long time, the fraction of time spent in each state settles down.] SIAM ALA'18Benson & Gleich 28
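
A minimal Julia sketch of such an experiment (our helper names; P is any column-stochastic matrix):

    # Inverse-CDF draw from a probability vector.
    function sample_from(p)
        u, c = rand(), 0.0
        for i in eachindex(p)
            c += p[i]
            u <= c && return i
        end
        return lastindex(p)
    end

    # Sample a long trajectory and record the fraction of time in each state;
    # for a chain with one recurrent class this approaches the limiting dist.
    function occupancy(P::Matrix{Float64}, z0::Int, T::Int)
        counts = zeros(size(P, 1))
        z = z0
        for _ in 1:T
            z = sample_from(@view P[:, z])
            counts[z] += 1
        end
        return counts ./ T
    end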
  • 29. In the Pólya urn, the limiting distribution of ball draws always exists. It can converge to any value. [Figure: 1000 samples of the trajectories; the limiting values follow the uniform distribution.] SIAM ALA'18Benson & Gleich 29
  • 30. For each realization, the sequence of random variables Z1, Z2, …, Zt, Zt+1, … converges. It does not converge to a unique value, but rather can converge to any value. SIAM ALA'18Benson & Gleich 30
  • 31. Limiting distributions and stationary distributions for Markov chains have different properties. This point is often misunderstood. We want to make sure you get it right! Limiting distribution ⟶ the Cesàro averages (1/T) Σ_k P^k e_start converge to p*. A stationary distribution ⟶ P^k e_start converges to p*. Theorem. A finite Markov chain always has a limiting distribution. Theorem. The limiting distribution is unique if and only if the chain has only a single recurrent class. Theorem. A stationary distribution is lim_{t⟶∞} Prob[Zt = i]. This is unique if and only if a Markov chain has a single aperiodic, recurrent class. SIAM ALA'18Benson & Gleich 31
  • 32. States in a finite Markov chain are either recurrent or transient. Proof by picture. Recurrent: Prob[another visit] = 1. Transient: Prob[another visit] < 1. Markov chains ⟺ directed graphs. Directed graphs + Tarjan’s algorithm give the flow among strongly connected components. (Block triangular form.) [Figure: block triangular form from Tarjan’s algorithm, with strongly connected components and recurrent states marked.] SIAM ALA'18Benson & Gleich 32
  • 33. The fundamental theorem of Markov chains is that any stochastic matrix is Cesàro summable: P* = lim_{T⟶∞} (1/T) Σ_{k=1}^{T} P^k always exists! The limiting distribution given a start node is P*[:, start] because P^k gives the k-step transition probabilities. Result. Only one recurrent class iff P* is rank 1. Proof sketch. A recurrent class is a fully-stochastic sub-matrix. If there are >1 recurrent classes, then P* would be rank >1 because we could look at the sub-chain on each recurrent class; if P* is rank 1, then the distribution is the same regardless of where you start and so “no choice”. SIAM ALA'18Benson & Gleich 33
  • 34. Stationary distributions are much stronger than limiting distributions. A stationary distribution ⟶ P^k converging to P*. This requires a single aperiodic recurrent class, i.e., an irreducible & aperiodic matrix. (There are some funky cases if your chain is really two disconnected, independent chains.) We can always make a limiting distribution a stationary distribution. Turn P into a lazy Markov chain, (P + I)/2. This is automatically aperiodic and doesn’t change the recurrence. SIAM ALA'18Benson & Gleich 34
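
A tiny Julia check of the lazy trick (our example; the period-2 chain has a limiting but not a stationary distribution):

    using LinearAlgebra

    P = [0.0 1.0; 1.0 0.0]     # period-2 chain: the powers P^k oscillate forever
    L = 0.5 .* (P + I)         # lazy chain: same recurrent class, now aperiodic

    (P^101)[:, 1]              # = [0, 1]; no stationary limit from state 1
    (L^101)[:, 1]              # ≈ [0.5, 0.5]; converges to the stationary dist.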
  • 35. Act 1. This overview • Basic notation and operations • The fundamental problems Act 2. Motivating applications • Compression • Diffusion imaging • Hardy-Weinberg genetics Act 3. Review of Stochastic processes Markov Chains & Higher-order chains • Limiting and stationary distributions • Irreducibility Act 4. Spacey RWs as stochastic processes • Pause for interpretations and thought • FAQ Act 5. Theory of spacey random walks • Limiting dists are tensor evecs • Dynamical systems & vertex reinforced RWs • (Non-) existence, uniqueness, convergence • Computation Act 6. Applications of spacey random walks • Pólya urns, sequence data, tensor clustering • New algorithm for computing tensor eigenvectors
  • 36. Remember! Tensors are a natural representation of transition probabilities of higher-order Markov chains. 1 3 2 P But the stationary distribution on pairs of states is still a matrix eigenvector... [Li-Ng 14] Making the “rank-1 approximation” X(j, k) = x(j) x(k) gives a formulation for tensor eigenvectors. SIAM ALA'18Benson & Gleich 36
  • 37. The vector x satisfying Px^2 = x is nonnegative and sums to 1. Thus, x often gets called a limiting distribution. But all we have done is algebra! What is a natural stochastic process that has this limiting distribution? SIAM ALA'18Benson & Gleich 37
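
The two tensor operations we need are short in Julia (a sketch for 3-mode tensors; collapse and applyx2 are our names):

    # collapse(P, x)[i, j] = sum_k P[i, j, k] x[k]            (the matrix P[x])
    # applyx2(P, x)[i]     = sum_{j,k} P[i, j, k] x[j] x[k]   (the vector Px^2)
    collapse(P, x) = reshape(reshape(P, :, length(x)) * x, size(P, 1), size(P, 2))
    applyx2(P, x)  = collapse(P, x) * x

    # A tensor eigenvector with eigenvalue 1 then satisfies applyx2(P, x) ≈ x.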
  • 38. Spacey random walks are stochastic processes whose limiting distribution(s) lead to such tensor eigenvectors. 1 3 2 P 1. We are at state Zt = j and want to transition according to P. 2. However, upon arriving at state Zt = j, we space out and forget about Zt-1 = k. 3. We still want to do our best, so we choose state Yt = r uniformly from our history Z1, Z2, …, Zt (technically, we initialize having visited each state once). 4. We then follow P pretending that Zt-1 = r. Stochastic process Z1, Z2, …, Zt, Zt+1, … with states in {1, …, n}. Spacey or space out? 走神 (“mind wandering”) or 心不在焉 (“absent-minded”), according to David’s students. SIAM ALA'18Benson & Gleich 38
  • 39. Spacey random walks are stochastic processes whose limiting distributions are such tensor eigenvectors. [Figure: a history sequence 10, 12, 4, 9, 7, 11 with the current state Zt, previous state Zt-1, and the spacey choice Yt marked.] Key insight [Benson-Gleich-Lim 17]. Limiting distributions of this process are tensor eigenvectors of P. 1 3 2 P Prob(Zt+1 = i | Zt = j, Yt = r) = P(i, j, r). SIAM ALA'18Benson & Gleich 39
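
A minimal Julia simulation of the four steps above (our implementation; sample_from is the inverse-CDF sampler from the earlier random-walk sketch, repeated here so the block stands alone):

    function sample_from(p)                 # inverse-CDF draw from a prob. vector
        u, c = rand(), 0.0
        for i in eachindex(p)
            c += p[i]
            u <= c && return i
        end
        return lastindex(p)
    end

    # Spacey random walk for a 3-mode transition tensor P (columns stochastic).
    function spacey_walk(P::Array{Float64,3}, z0::Int, T::Int)
        n = size(P, 1)
        counts = ones(n)                    # technically: one initial visit each
        Z = zeros(Int, T); Z[1] = z0; counts[z0] += 1
        for t in 2:T
            j = Z[t-1]
            r = sample_from(counts ./ sum(counts))   # space out: draw Y_t from history
            Z[t] = sample_from(@view P[:, j, r])     # transition as if Z_{t-1} = r
            counts[Z[t]] += 1
        end
        return Z
    end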
  • 40. The main point. Limiting distributions of the spacey random walk stochastic process are tensor eigenvectors of P (we’ll prove this later). SIAM ALA'18Benson & Gleich 40
  • 41. We have to be careful with undefined transitions, which correspond to zero columns in the tensor. [Figure: a history sequence with Zt-1, Zt, and the spacey choice Yt marked.] 1 3 2 P Prob(Zt+1 = i | Zt = j, Yt = r) = ? when P(:, j, r) = 0. A couple of options. 1. Pre-specify a distribution for when P(:, j, r) = 0. 2. Choose a random state from history ⟶ super SRW [Wu-Benson-Gleich 16] SIAM ALA'18Benson & Gleich 41
  • 42. [Figure: a 3-node cycle with probability 1/2 on each edge direction.] Limiting distribution of the RW is [1/3, 1/3, 1/3]. What about the non-backtracking RW? NBRW disallows going back to where you came from and re-normalizes the probabilities. Lim. dist. is still [1/3, 1/3, 1/3], but for far different reasons. NBRW is a second-order Markov chain! What happens with the spacey random walk using the NBRW transition probabilities? Zero-column fill-in affects the limiting distribution and tensor evec. Follow along with Jupyter notebook! 3-node-cycle-walks.ipynb SIAM ALA'18Benson & Gleich 42
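
The notebook itself is not reproduced here, but a sketch of the NBRW transition tensor it builds on the 3-cycle (our construction from the slide’s description) is:

    # NBRW tensor on the 3-cycle: from node j, having come from k, move
    # uniformly to a neighbor of j other than k. Columns P[:, j, k] where k is
    # not a neighbor of j are undefined and left zero (they need fill-in).
    n = 3
    nbrs = [[2, 3], [3, 1], [1, 2]]
    P = zeros(n, n, n)
    for j in 1:n, k in 1:n
        k in nbrs[j] || continue
        opts = [i for i in nbrs[j] if i != k]
        for i in opts
            P[i, j, k] = 1.0 / length(opts)
        end
    end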
  • 43. FAQ. Please ask your own questions, too! 1. What’s a spacey random walk, again? A stochastic process defined by a transition probability tensor. 2. Is the spacey random walk a Markov chain? No, not in general; the transitions depend on the entire history. 3. Is the limiting distribution of a higher-order MC a tensor e-vec? No, not in general. 4. Why not just compute the stat. dist. of the higher-order MC? We are motivating tensor eigenvectors from a stochastic processes view. 5. What is an e-vec with e-val 1 of a transition probability tensor? It could be the limiting distribution of a spacey random walk. 1 3 2 P SIAM ALA'18Benson & Gleich 43
  • 44. [Figure: a 7-node line graph with transition probabilities 1/2 along each edge and fill-in probabilities 1/3, 1/3, 1/6, 1/6 marked.] Follow along with Jupyter notebook! 7-node-line-walks.ipynb What will happen with a… RW? NBRW? SRW (uniform fill-in)? SRW (RW stat. dist. fill-in)? SRW (NBRW stat. dist. fill-in)? SSRW? SIAM ALA'18Benson & Gleich 44
  • 45. (Not well defined) conjecture. When the transition probability tensor entries come from a non-backtracking random walk, the spacey random walk “interpolates” between the standard random walk and the non-backtracking one. SIAM ALA'18Benson & Gleich 45
  • 46. Pólya urns are spacey random walks. Draw random ball. Put ball back with another of the same color. This is a second-order spacey random walk with two states. Consequently, we know this one must converge because it’s a Pólya urn! SIAM ALA'18Benson & Gleich 46
  • 47. But didn’t Pólya urns have any limiting distribution? Does this mean that the tensor is interesting? Yes! Any stochastic vector is a tensor eigenvector, and these are also limiting distributions. SIAM ALA'18Benson & Gleich 47
  • 48. Act 1. This overview • Basic notation and operations • The fundamental problems Act 2. Motivating applications • Compression • Diffusion imaging • Hardy-Weinberg genetics Act 3. Review of Stochastic processes Markov Chains & Higher-order chains • Limiting and stationary distributions • Irreducibility Act 4. Spacey RWs as stochastic processes • Pause for interpretations and thought • FAQ Act 5. Theory of spacey random walks • Limiting dists are tensor evecs • Dynamical systems & vertex reinforced RWs • (Non-) existence, uniqueness, convergence • Computation Act 6. Applications of spacey random walks • Pólya urns, sequence data, tensor clustering • New algorithm for computing tensor eigenvectors
  • 49. Spacey random walks have a number of interesting properties as well as a number of open challenges! Properties 1. Limiting distributions of SRWs are tensor evecs with eval 1 (proof shortly!) 2. Asymptotically, SRWs are first-order Markov chains. 3. If there are just 2 states, then the SRW converges but possibly to one of several distributions. 4. If P is sufficiently “regularized”, then the SRW converges to a unique limiting distribution. Open problems • Existence? • Uniqueness? • Computation? SIAM ALA'18Benson & Gleich 49 1 3 2 P Note. Spacey random walks are defined by a stochastic transition tensor, so these are all tensor questions!
  • 50. An informal and intuitive proof that spacey random walks converge to tensor eigenvectors. Idea. Let wT be the fraction of time spent in each state after T ≫ 1 steps. Consider an additional L steps, T ≫ L ≫ 1. Then wT ≈ wT+L if we converge. SIAM ALA'18Benson & Gleich 50 1 3 2 P [Figure: occupancy vectors wT and wT+L along a long trajectory.] Suppose M(wT) = P[wT]^{m-2} has a unique stationary distribution, xT. If the SRW converges, then xT = wT+L; otherwise wT+L would be different. Thus, xT = P[wT]^{m-2} xT ≈ P[wT+L]^{m-2} xT = P[xT]^{m-2} xT = P xT^{m-1}.
  • 51. To formalize convergence, we need the theory of generalized vertex-reinforced random walks (GVRRW). A stochastic process X1, …, Xt, … is a GVRRW if wT is the fraction of time in each state, FT is the sigma algebra generated by X1, …, XT, and M(wT) is a column-stochastic matrix that depends on wT: Pr(XT+1 = i | FT) = [M(wT)](i, XT). [Diaconis 88; Pemantle 92, 07; Benaïm 97] SIAM ALA'18Benson & Gleich 51 The classic VRRW is the following • Given a graph, randomly move to a neighbor with probability proportional to how often we’ve visited the neighbor!
  • 52. To formalize convergence, we need the theory of generalized vertex-reinforced random walks (GVRRW). A stochastic process X1, …, Xt, … is a GVRRW if wT is the fraction of time in each state, FT is the sigma algebra generated by X1, …, XT, and M(wT) is a column-stochastic matrix that depends on wT. Spacey random walks are GVRRWs with the map M: M(wT) = P[wT]^{m-2}. [Diaconis 88; Pemantle 92, 07; Benaïm 97] SIAM ALA'18Benson & Gleich 52
  • 53. To formalize convergence, we need the theory of generalized vertex-reinforced random walks (GVRRW). Theorem [Benaïm 97], heavily paraphrased. In a discrete GVRRW, the long-term behavior of the occupancy distribution wT follows the long-term behavior of the dynamical system dx/dt = Π(M(x)) - x, where Π maps a column-stochastic matrix to its Perron vector. To study convergence properties of the SRW, we just need to study the dynamical system for our map M(wT) = P[wT]^{m-2}: dx/dt = Π(P[x]^{m-2}) - x. SIAM ALA'18Benson & Gleich 53
  • 54. More on how stationary distributions of GVRRWs correspond to ODEs. SIAM ALA'18Benson & Gleich 54 THEOREM [Benaïm, 1997], less paraphrased. The sequence of empirical observation probabilities ct is an asymptotic pseudo-trajectory for the dynamical system dx/dt = Π(M(x)) - x. Thus, convergence of the ODE to a fixed point is equivalent to stationary distributions of the VRRW. • M must always have a unique stationary distribution! • The map to M must be very continuous • Asymptotic pseudo-trajectories satisfy lim_{t⟶∞} sup_{0≤s≤L} ||c(t+s) - Φ_s(c(t))|| = 0 for all L > 0, where Φ is the flow of the dynamical system.
  • 55. Spacey random walks converge to tensor eigenvectors (a more formal proof). Suppose that the SRW converges. Then we converge to a stationary point. SIAM ALA'18Benson & Gleich 55 1 3 2 P Remember the informal proof. All we’ve done is formalize this by using the dynamical system to map the behavior!
  • 56. Corollary. Asymptotically, GVRRWs (including spacey random walks) act as first-order Markov chains. Suppose that the SRW converges to x. Then the transition probabilities converge to those of the Markov chain with matrix M(x) = P[x]^{m-2}. SIAM ALA'18Benson & Gleich 56
  • 57. Relationship between spacey random walk convergence and existence of tensor eigenvectors. SRW converges ⇒ existence of tensor e-vec of P with e-val 1. SRW converges ⇍ existence of tensor e-vec of P with e-val 1. The map f(x) = Px^{m-1} satisfies the conditions of Brouwer’s fixed point theorem, so there always exists an x such that Px^{m-1} = x. Furthermore, λ = 1 is the largest eigenvalue. [Li-Ng 14] There exists a P for which the SRW does not converge. [Peterson 18] SIAM ALA'18Benson & Gleich 57
  • 58. General Open Question. Under what conditions does the spacey random walk converge? Peterson’s Conjecture. If P is a 3-mode tensor, then the spacey random walk converges. Broader conjecture. There is always a (generalized) SRW that converges to a tensor evec. What we have been able to show so far. 1. If there are just 2 states, then the SRW converges. 2. If P is sufficiently “regularized”, then the SRW converges. SIAM ALA'18Benson & Gleich 58
  • 59. Almost every 2-state spacey random walk converges. [Benson-Gleich-Lim 17] Special case of 2 x 2 x 2 system... SIAM ALA'18Benson & Gleich 59
  • 60. Almost every 2-state spacey random walk converges. Theorem [Benson-Gleich-Lim 17]. The dynamics of almost every 2 x 2 x … x 2 spacey random walk (of any order) converges to a stable equilibrium point. [Figure: one-dimensional dynamics with two stable equilibria and one unstable equilibrium.] Things to note… 1. Multiple stable points in the above example; the SRW could converge to any. 2. Randomness of the SRW is “baked in” to the initial condition of the system. SIAM ALA'18Benson & Gleich 60
  • 61. A sufficiently regularized spacey random walk converges. Consider a modified “spacey random surfer” model. At each step, 1. with probability α, follow the SRW model P. 2. with probability 1 - α, teleport to a random node. Equivalent to a SRW on S = αP + (1 – α)J, where J is the normalized all-ones tensor. Theorem. If α < 1 / (m – 1), 1. the SRW on S converges [Benson-Gleich-Lim 17] 2. there is a unique tensor e-vec x satisfying Sx^{m-1} = x [Gleich-Lim-Yu 15] [Gleich-Lim-Yu 15; Benson-Gleich-Lim 17] SIAM ALA'18Benson & Gleich 61
  • 62. A sufficiently regularized spacey random walk converges. The higher-order power method is an algorithm to compute the dominant tensor eigenvector: y_{k+1} = Tx_k^{m-1}, x_{k+1} = y_{k+1} / ||y_{k+1}||. Theorem [Gleich-Lim-Yu 15]. If α < 1 / (m – 1), the power method on S = αP + (1 – α)J converges to the unique vector satisfying Sx^{m-1} = x. Conjecture. If the higher-order power method on P always converges, then the spacey random walk on P always converges. (A sketch of the iteration follows below.) SIAM ALA'18Benson & Gleich 62
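
A sketch of this iteration for m = 3 in Julia (our code and helper names; we normalize in the 1-norm, which keeps the iterates stochastic, and build S on the fly):

    using LinearAlgebra

    applyx2(P, x) = reshape(reshape(P, :, length(x)) * x, size(P, 1), size(P, 2)) * x

    function hopm_surfer(P::Array{Float64,3}, alpha::Float64; maxit=1000, tol=1e-12)
        n = size(P, 1)
        S = alpha .* P .+ (1 - alpha) / n   # J has every entry 1/n
        x = fill(1.0 / n, n)                # start at the uniform distribution
        for _ in 1:maxit
            y = applyx2(S, x)               # y = S x^2
            y ./= sum(y)                    # stochastic (1-norm) normalization
            norm(y - x, 1) < tol && return y
            x = y
        end
        return x
    end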
  • 63. Conjecture. Determining if a SRW converges is PPAD-complete. Computing a limiting distribution of a SRW is PPAD-complete. Why? In general, it is NP-hard to determine if a tensor has an eigenvector for a given eigenvalue λ [Hillar-Lim 13]. We know an e-vec exists for a transition probability tensor P with e-val λ = 1 [Li-Ng 14]. However, there is no obvious way to compute it. Similar to other PPAD-complete problems (e.g., Nash equilibria). SIAM ALA'18Benson & Gleich 63
  • 64. General Open Question. What is the best way to compute tensor eigenvectors? • Higher-order power method [Kofidis-Regalia 00, 01; De Lathauwer-De Moor-Vandewalle 00] • Shifted higher-order power method [Kolda-Mayo 11] • SDP hierarchies [Cui-Dai-Nie 14; Nie-Wang 14; Nie-Zhang 18] • Perron iteration [Meini-Poloni 11, 17] For SRWs, the dynamical system offers another way. Numerically integrate the dynamical system! [Benson-Gleich-Lim 17; Benson-Gleich 18] SIAM ALA'18Benson & Gleich 64 Equivalent to Perron iteration with Forward Euler & unit time-step.
  • 65. Act 1. This overview • Basic notation and operations • The fundamental problems Act 2. Motivating applications • Compression • Diffusion imaging • Hardy-Weinberg genetics Act 3. Review of Stochastic processes Markov Chains & Higher-order chains • Limiting and stationary distributions • Irreducibility Act 4. Spacey RWs as stochastic processes • Pause for interpretations and thought • FAQ Act 5. Theory of spacey random walks • Limiting dists are tensor evecs • Dynamical systems & vertex reinforced RWs • (Non-) existence, uniqueness, convergence • Computation Act 6. Applications of spacey random walks • Pólya urns, sequence data, tensor clustering • New algorithm for computing tensor eigenvectors
  • 66. Applications of spacey random walks. 1. Pólya urns are SRWs. 2. SRWs model taxi sequence data. 3. Asymptotics of SRWs for data clustering. 4. Insight for new algorithms to compute tensor eigenvectors. Stochastic processes offer a new and exciting set of opportunities and challenges in tensor algorithms. (Us, Slide 10) 66SIAM ALA'18Benson & Gleich
  • 67. (Review) Pólya urns are spacey random walks. Draw random ball. Put ball back with another of the same color. This is a second-order spacey random walk with two states. We know it converges by our theory (every two-state process converges). 67SIAM ALA'18Benson & Gleich
  • 68. Generalized Pólya urns are spacey random walks. Draw m random balls b1, b2, …, bm with replacement. Put in a new green ball with probability q(b1, b2, …, bm). This is an (m-1)-order spacey random walk with two states. We know it converges by our theory (every two-state process converges). 68SIAM ALA'18Benson & Gleich
  • 69. Applications of spacey random walks. 1. Pólya urns are SRWs. 2. SRWs model taxi sequence data. 3. Asymptotics of SRWs for data clustering. 4. Insight for new algorithms to compute tensor eigenvectors. Stochastic processes offer a new and exciting set of opportunities and challenges in tensor algorithms. (Us, Slide 10) 69SIAM ALA'18Benson & Gleich
  • 70. Spacey random walks model sequence data. Maximum likelihood estimation problem (most likely P for the SRW model and the observed data): a convex objective with linear constraints. nyc.gov [Benson-Gleich-Lim 17] 70SIAM ALA'18Benson & Gleich
  • 71. What is the SRW model saying for this data? Model people by locations. • A passenger with location k is drawn at random. • The taxi picks up the passenger at location j. • The taxi drives the passenger to location i with probability Pi,j,k Approximate location dist. by history ⟶ spacey random walk. Spacey random walks model sequence data. nyc.gov 71SIAM ALA'18Benson & Gleich
  • 72. • One year of 1000 taxi trajectories in NYC. • States are neighborhoods in Manhattan. • Compute MLE P for SRW model with 800 taxis. • Evaluate RMSE on test data of 200 taxis. RMSE = 1 – Prob[sequence generated by process] Spacey random walks model sequence data. 72SIAM ALA'18Benson & Gleich
  • 73. Spacey random walks are identifiable via this procedure. 73 Two difficult test tensors from [Gleich-Lim-Yu 15] 1. Generate 80 sequences with 200 transitions each from SRW model Learn P for 2nd-order SRW, R for 2nd-order MC, P for 1st-order MC 2. Generate 20 sequences with 200 transitions each and evaluate RMSE. Evaluate RMSE = 1 – Prob[sequence generated by process] SIAM ALA'18Benson & Gleich
  • 74. Applications of spacey random walks. 1. Pólya urns are SRWs. 2. SRWs model taxi sequence data. 3. Asymptotics of SRWs for data clustering. 4. Insight for new algorithms to compute tensor eigenvectors. Stochastic processes offer a new and exciting set of opportunities and challenges in tensor algorithms. (Us, Slide 10) 74SIAM ALA'18Benson & Gleich
  • 75. Co-clustering nonnegative tensor data. Joint work with Tao Wu, Purdue. Spacey random walks that converge are asymptotically Markov chains. • occupancy vector wT converges to w ⟶ dynamics converge to P[w]^{m-2}. 1 3 2 P This connects to spectral clustering on graphs. • Eigenvectors of the normalized Laplacian of a graph are eigenvectors of the random walk matrix. • Instead, we compute a stationary distribution w and use eigenvectors of the asymptotic matrix P[w]^{m-2}. [Wu-Benson-Gleich 16] 75SIAM ALA'18Benson & Gleich
  • 76. We possibly symmetrize and normalize nonnegative data to get a transition probability tensor. [1, 2, …, n] x [1, 2, …, n] x [1, 2, …, n] [i1, i2, …, in1] x [j1, j2, …, jn2] x [k1, k2, …, kn3] If the data is a brick, we symmetrize before normalization [Ragnarsson-Van Loan 13], a generalization of the matrix symmetrization (A + Aᵀ)/2. If the data is a symmetric cube, we can normalize it to get a transition tensor P (a sketch follows below). 76SIAM ALA'18Benson & Gleich
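
A sketch of the normalization step in Julia (our helper name; zero columns are left alone for the super-spacey fill-in):

    # Normalize a nonnegative cube T into a transition tensor: divide each
    # nonzero column T[:, j, k] by its sum so the columns are stochastic.
    function normalize_tensor(T::Array{Float64,3})
        P = copy(T)
        n = size(P, 1)
        for j in 1:n, k in 1:n
            s = sum(@view P[:, j, k])
            s > 0 && (P[:, j, k] ./= s)
        end
        return P
    end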
  • 77. 77 Input. Nonnegative brick of data. 1. Symmetrize the brick (if necessary). 2. Normalize to a stochastic tensor. 3. Estimate the stationary distribution of the spacey random walk (or super-spacey random walk for sparse data). 4. Form the asymptotic Markov model. 5. Bisect indices using eigenvector of the asymptotic Markov model. 6. Recurse. Output. Partition of indices. The clustering methodology. 1 3 2 T SIAM ALA'18Benson & Gleich
  • 78. Clustering airline-airport-airport networks. T(i,j,k) = #(flights between airport i and airport j on airline k). [Figure: adjacency structure; unclustered, no apparent structure; clustered, diagonal structure evident.] 78 SIAM ALA'18Benson & Gleich
  • 79. Clustering n-grams in natural language. T(i,j,k) = #(consecutive co-occurrences of words i, j, k in corpus); T(i,j,k,l) = #(consecutive co-occurrences of words i, j, k, l in corpus). Data from the Corpus of Contemporary American English (COCA) www.ngrams.info “best” clusters • pronouns & articles (the, we, he, …) • prepositions & link verbs (in, of, as, to, …) fun 3-gram clusters • {cheese, cream, sour, low-fat, frosting, nonfat, fat-free} • {bag, plastic, garbage, grocery, trash, freezer} fun 4-gram cluster • {german, chancellor, angela, merkel, gerhard, schroeder, helmut, kohl} 79 SIAM ALA'18Benson & Gleich
  • 80. Applications of spacey random walks. 1. Pólya urns are SRWs. 2. SRWs model taxi sequence data. 3. Asymptotics of SRWs for data clustering. 4. Insight for new algorithms to compute tensor eigenvectors. Stochastic processes offer a new and exciting set of opportunities and challenges in tensor algorithms. (Us, Slide 10) 80SIAM ALA'18Benson & Gleich
  • 81. New framework for computing tensor evecs. [Benson-Gleich 18] Our stochastic viewpoint gives a new approach: we numerically integrate the dynamical system. Many tensor eigenvector computation algorithms are algebraic and look like generalizations of the matrix power method, shifted iteration, or Newton iteration, with many known convergence issues! [De Lathauwer-De Moor-Vandewalle 00; Regalia-Kofidis 00; Li-Ng 14; Chu-Wu 14; Kolda-Mayo 11, 14] 1. The dynamical system is empirically more robust for the principal evec of transition probability tensors. 2. It generalizes to symmetric tensors & any evec. 81SIAM ALA'18Benson & Gleich
  • 82. New framework for computing tensor evecs. [Benson-Gleich 18] Let Λ be a prescribed map from a matrix to one of its eigenvectors, e.g., Λ(M) = eigenvector of M for the kth smallest algebraic eigenvalue, Λ(M) = eigenvector of M for the largest magnitude eigenvalue. Suppose the dynamical system dx/dt = Λ(T[x]^{m-2}) - x converges to x. Then x satisfies T[x]^{m-2} x = λx, i.e., Tx^{m-1} = λx: a tensor Z-eigenvector. New computational framework. 1. Choose a map Λ. 2. Numerically integrate the dynamical system. 82SIAM ALA'18Benson & Gleich
  • 83. The algorithm is evolving this system! The algorithm has a simple Julia code.

    using LinearAlgebra

    # Collapse a 3-mode tensor A against x: M[i,j] = sum_k A[i,j,k] x[k]
    function mult3(A, x)
        dims = size(A)
        M = zeros(dims[1], dims[2])
        for i = 1:dims[3]
            M += A[:,:,i] * x[i]
        end
        return M
    end

    function dynsys_tensor_eigenvector(A; maxit=100, k=1, h=0.5)
        x = randn(size(A, 1)); normalize!(x)
        # This is the ODE function dx/dt = Λ(A[x]) - x
        F = function(x)
            M = mult3(A, x)
            d, V = eigen(M)                      # we use Julia's eigenvalue ordering
            v = V[:, k]                          # pick out the kth eigenvector
            if real(v[1]) >= 0; v *= -1.0; end   # canonicalize the sign
            return real(v) - x
        end
        # evolve the ODE via forward Euler
        for iter = 1:maxit
            x = x + h * F(x)
        end
        return x, x' * mult3(A, x) * x           # eigenvector and its Rayleigh quotient
    end

Benson & Gleich SIAM ALA'18 83
  • 84. New framework for computing tensor evecs. Empirically, we can compute all the tensor eigenpairs with this approach (including unstable ones that the higher-order power method cannot compute). The tensor is Example 3.6 from [Kolda-Mayo 11]. 84SIAM ALA'18Benson & Gleich
  • 85. Why does this work? (Hand-wavy version) Trajectory of dynamical system for Example 3.6 from Kolda and Mayo [2011]. Color is projection onto first eigenvector of Jacobian which is +1 at stationary points. Numerical integration with forward Euler. Why does this work? The eigenvector map shifts the spectrum around unstable eigenvectors. Benson & Gleich SIAM ALA'18 85
  • 86. There are tons of open questions with this approach that we could use help with! Can the dynamical system cycle? Yes, but what problems produce this behavior? Which eigenvector (k) to use? It really matters! How to numerically integrate? Seems like ODE45 does the trick! SSHOPM -> Dyn Sys? If SSHOPM converges, can you show the dyn. sys. will converge for some k? Can you show there are inaccessible vecs? No clue right now! Benson & Gleich SIAM ALA'18 86 Trajectory of dynamical system for Example 3.6 from Kolda and Mayo [2011]. Color is projection onto the first eigenvector of the Jacobian, which is +1 at stationary points. Numerical integration with forward Euler.
  • 87. New framework for computing tensor evecs. • SDP methods can compute all eigenpairs but have scalability issues [Cui-Dai-Nie 14, Nie-Wang 14, Nie-Zhang 17] • Empirically, we can compute the same eigenvectors while maintaining scalability. The tensor is Example 4.11 from [Cui-Dai-Nie 14]. 87SIAM ALA'18Benson & Gleich
  • 88. Act 1. This overview • Basic notation and operations • The fundamental problems Act 2. Motivating applications • Compression • Diffusion imaging • Hardy-Weinberg genetics Act 3. Review of Stochastic processes Markov Chains & Higher-order chains • Limiting and stationary distributions • Irreducibility Act 4. Spacey RWs as stochastic processes • Pause for interpretations and thought • FAQ Act 5. Theory of spacey random walks • Limiting dists are tensor evecs • Dynamical systems & vertex reinforced RWs • (Non-) existence, uniqueness, convergence • Computation Act 6. Applications of spacey random walks • Pólya urns, sequence data, tensor clustering • New algorithm for computing tensor eigenvectors
  • 89. Stochastic processes offer a new and exciting set of opportunities and challenges in tensor algorithms. Usually, the properties of these objects are explored algebraically or through polynomial interpretations. Our tutorial focused on interpreting the tensor objects stochastically! SIAM ALA'18Benson & Gleich 89 A 3 2 1
  • 90. Stochastic processes offer a new and exciting set of opportunities and challenges in tensor algorithms. Hopefully, you should know a little bit about… And where to look for more info! www.cs.cornell.edu/~arb/tesp 1 3 2 P • Tensor eigenvectors • Z-eigenvectors • Irreducible tensors • Higher-order Markov chains • Spacey random walks • Vertex reinforced random walks • Dynamical systems for trajectories • Fitting spacey random walks to data • Multilinear PageRank models • Clustering tensors • And lots of open problems in this area! SIAM ALA'18Benson & Gleich 90
  • 91. Open problems abound! SIAM ALA'18Benson & Gleich 91 General Open Questions. 1. What is the relationship between RWs, non-backtracking RWs, and SRWs? 2. Under what conditions does the spacey random walk converge? 3. What is the computational complexity surrounding SRWs? 4. How well does the dynamical system work for computing tensor evecs? 5. How can we use stochastic or dynamical systems views for H-eigenpairs? 6. More data mining applications? Conjectures. 1. If P is a 3-mode tensor, then the spacey random walk converges. 2. If the HOPM on P always converges, the SRW on P always converges. 3. Determining if a SRW converges is PPAD-complete. 4. Computing a limiting distribution of SRW is PPAD-complete.
  • 92. Tensor Eigenvectors and Stochastic Processes. Thanks for your attention! SIAM ALA'18Benson & Gleich 92 Today’s information & more. www.cs.cornell.edu/~arb/tesp Austin R. Benson http://cs.cornell.edu/~arb @austinbenson arb@cs.cornell.edu David F. Gleich https://www.cs.purdue.edu/homes/dgleich/ @dgleich dgleich@purdue.edu