
# TENSOR DECOMPOSITION WITH PYTHON

LEARNING STRUCTURES FROM MULTIDIMENSIONAL DATA

Presentation at Pycon8, Florence, April 9 2017

Published in: Science

### TENSOR DECOMPOSITION WITH PYTHON

1. TENSOR DECOMPOSITION WITH PYTHON: LEARNING STRUCTURES FROM MULTIDIMENSIONAL DATA. ANDRÉ PANISSON, @apanisson, ISI Foundation, Torino & New York City
2. WHAT IS DATA DECOMPOSITION? DECOMPOSITION == FACTORIZATION: representing a dataset as a sum of (interpretable) parts ▸ Represent data as the combination of many components / factors ▸ Dimensionality reduction: each new dimension represents a latent variable: ▸ text corpus => topics ▸ shopping behaviour => segments (user segmentation) ▸ social network => groups, communities ▸ psychology surveys => personality traits ▸ electronic medical records => health conditions ▸ chemical solutions => chemical ingredients
3. [Figure: a data matrix X shown next to two factor matrices, W and H]
4. DATA DECOMPOSITION ▸ Decomposition of data represented in two dimensions: MATRIX FACTORIZATION ▸ text: documents × terms ▸ surveys: subjects × questions ▸ electronic medical records: patients × diagnosis/drugs ▸ Decomposition of data represented in more dimensions: TENSOR FACTORIZATION ▸ social networks: user × user (adjacency matrix) × time ▸ text: authors × terms × time ▸ spectroscopy: solution sample × wavelength (emission) × wavelength (excitation)
5. WHY TENSOR FACTORIZATION + PYTHON? ▸ Matrix Factorization is already used in many fields ▸ Tensor Factorization is becoming very popular for multiway data analysis ▸ TF is very useful to explore time-varying network data ▸ But still, the most used tool is Matlab ▸ There’s room for improvement in the Python libraries for TF
6. MATRIX DECOMPOSITION
7. FACTOR ANALYSIS (Spearman, ~1900). $X \approx WH$, with $X_{\text{tests} \times \text{subjects}} \approx W_{\text{tests} \times \text{intelligences}} \, H_{\text{intelligences} \times \text{subjects}}$. Spearman, 1927: The abilities of man.
8. TOPIC MODELING / LATENT SEMANTIC ANALYSIS. Blei, David M. "Probabilistic topic models." Communications of the ACM 55.4 (2012): 77-84. [Figure: topics as weighted word lists (gene, dna, genetic; life, evolve, organism; brain, neuron, nerve; data, number, computer), documents, and per-document topic proportions and assignments]
9. TOPIC MODELING / LATENT SEMANTIC ANALYSIS. $X \approx WH$, where X is documents × terms (a sparse matrix!), W is documents × topics, and H is topics × terms. Non-negative Matrix Factorization (NMF): $\arg\min_{W,H} \|X - WH\| \ \text{s.t.}\ W, H \ge 0$ (~1970 Lawson, ~1995 Paatero, ~2000 Lee & Seung). 2005 Gaussier et al. "Relation between PLSA and NMF and implications."
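10. NON-NEGATIVE MATRIX FACTORIZATION (NMF). NMF gives a part-based representation (Lee & Seung, Nature 1999). [Figure: an original image compared with its NMF and PCA reconstructions, each drawn as components × weights] NMF is similar to Spectral Clustering (Ding et al., SDM 2005). Objective: $\arg\min_{W,H} \|X - WH\| \ \text{s.t.}\ W, H \ge 0$. Multiplicative updates: $W \leftarrow W \ast \frac{XH^T}{WHH^T}$ and $H \leftarrow H \ast \frac{W^TX}{W^TWH}$, where $\ast$ and the fractions are element-wise. NMF brings interpretation!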
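The multiplicative updates above translate almost line by line into NumPy. The following is a minimal sketch rather than the code used in the talk: the function name `nmf_multiplicative`, the iteration count, the small `eps` guard against division by zero, and the random test matrix are all assumptions made for illustration.

```python
import numpy as np

def nmf_multiplicative(X, n_components, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative X as W @ H with multiplicative updates."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    # Non-negative random start; the updates keep every entry non-negative
    W = rng.random((n, n_components))
    H = rng.random((n_components, m))
    for _ in range(n_iter):
        # W <- W * (X H^T) / (W H H^T), element-wise
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        # H <- H * (W^T X) / (W^T W H), element-wise
        H *= (W.T @ X) / (W.T @ W @ H + eps)
    return W, H

# Hypothetical test data: a 6 x 5 non-negative matrix of rank at most 3
rng = np.random.default_rng(1)
X = rng.random((6, 3)) @ rng.random((3, 5))

W, H = nmf_multiplicative(X, n_components=3)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print("relative reconstruction error:", rel_err)
```

Because each update only multiplies by non-negative ratios, no projection step is needed to keep W and H non-negative.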
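11. NMF on the MNIST digits with scikit-learn:

        import matplotlib.pyplot as plt
        from sklearn import datasets, decomposition, utils

        # fetch_mldata was removed from scikit-learn; fetch_openml is the current loader
        digits = datasets.fetch_openml('mnist_784', version=1, as_frame=False)
        A = utils.shuffle(digits.data)
        nmf = decomposition.NMF(n_components=20)
        W = nmf.fit_transform(A)   # weight of each component in each image
        H = nmf.components_        # one 784-pixel "part" per component

        plt.rc("image", cmap="binary")
        plt.figure(figsize=(8, 4))
        for i in range(20):
            plt.subplot(4, 5, i + 1)   # a 4 x 5 grid holds all 20 components
            plt.imshow(H[i].reshape(28, 28))
            plt.xticks(())
            plt.yticks(())
        plt.tight_layout()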
12. TENSORS AND TENSOR DECOMPOSITION
13. BEYOND MATRICES: HIGH DIMENSIONAL DATASETS (MULTIWAY DATA ANALYSIS). Environmental analysis ▸ Measurement as a function of (Location, Time, Variable) Sensory analysis ▸ Score as a function of (Wine sample, Judge, Attribute) Process analysis ▸ Measurement as a function of (Batch, Variable, Time) Spectroscopy ▸ Intensity as a function of (Wavelength, Retention, Sample, Time, Location, …) Source: Cichocki et al., Nonnegative Matrix and Tensor Factorizations.
14. DIGITAL TRACES FROM SENSORS AND IOT: USER, POSITION, TIME, …
15. TENSORS
16. WHAT IS A TENSOR? A tensor is a multidimensional array. E.g., a three-way tensor has three modes: Mode-1, Mode-2, and Mode-3.
17. FIBERS AND SLICES. Fibers fix all indices but one: Column (Mode-1) Fibers, e.g. A[:, 4, 1]; Row (Mode-2) Fibers, e.g. A[1, :, 4]; Tube (Mode-3) Fibers, e.g. A[1, 3, :]. Slices fix a single index: Horizontal Slices, e.g. A[1, :, :]; Lateral Slices, e.g. A[:, 1, :]; Frontal Slices, e.g. A[:, :, 1]. Source: Cichocki et al., Nonnegative Matrix and Tensor Factorizations.
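As a rough illustration of the fiber and slice notation, NumPy indexing can pull each of them out of a small dense array. The array `A`, its shape, and the chosen indices below are made up for the example.

```python
import numpy as np

# Hypothetical 3 x 4 x 2 tensor; entry (i, j, k) equals 8*i + 2*j + k
A = np.arange(24).reshape(3, 4, 2)

# Fibers: fix every index except one
column_fiber = A[:, 1, 0]   # mode-1 fiber -> array([ 2, 10, 18])
row_fiber = A[0, :, 1]      # mode-2 fiber -> array([1, 3, 5, 7])
tube_fiber = A[0, 1, :]     # mode-3 fiber -> array([2, 3])

# Slices: fix exactly one index
horizontal_slice = A[0, :, :]   # shape (4, 2)
lateral_slice = A[:, 1, :]      # shape (3, 2)
frontal_slice = A[:, :, 0]      # shape (3, 4)

print(column_fiber, row_fiber, tube_fiber)
print(horizontal_slice.shape, lateral_slice.shape, frontal_slice.shape)
```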
18. TENSOR UNFOLDINGS: MATRICIZATION AND VECTORIZATION. Matricization: convert a tensor to a matrix. Vectorization: convert a tensor to a vector.
19. Unfolding a small dense tensor:

        >>> import numpy as np
        >>> T = np.arange(0, 24).reshape((3, 4, 2))
        >>> T
        array([[[ 0,  1], [ 2,  3], [ 4,  5], [ 6,  7]],
               [[ 8,  9], [10, 11], [12, 13], [14, 15]],
               [[16, 17], [18, 19], [20, 21], [22, 23]]])

    OK for dense tensors: use a combination of transpose() and reshape(). Not simple for sparse datasets (e.g.: <authors, terms, time>).

        >>> # mode-1 fibers: the outer loop runs over mode 3, the inner loop over mode 2
        >>> for j in range(T.shape[2]):
        ...     for i in range(T.shape[1]):
        ...         print(T[:, i, j])
        [ 0  8 16]
        [ 2 10 18]
        [ 4 12 20]
        [ 6 14 22]
        [ 1  9 17]
        [ 3 11 19]
        [ 5 13 21]
        [ 7 15 23]

        >>> # supposing the existence of unfold (NumPy arrays do not provide it)
        >>> T.unfold(0)
        array([[ 0,  2,  4,  6,  1,  3,  5,  7],
               [ 8, 10, 12, 14,  9, 11, 13, 15],
               [16, 18, 20, 22, 17, 19, 21, 23]])
        >>> T.unfold(1)
        array([[ 0,  8, 16,  1,  9, 17],
               [ 2, 10, 18,  3, 11, 19],
               [ 4, 12, 20,  5, 13, 21],
               [ 6, 14, 22,  7, 15, 23]])
        >>> T.unfold(2)
        array([[ 0,  8, 16,  2, 10, 18,  4, 12, 20,  6, 14, 22],
               [ 1,  9, 17,  3, 11, 19,  5, 13, 21,  7, 15, 23]])
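The slide assumes an `unfold` method that plain NumPy arrays do not have. One possible way to get the same matrices for dense tensors is sketched below; the helper names `unfold`, `fold`, and `vectorize` are hypothetical, and the column ordering (earlier modes vary fastest) is chosen to reproduce the outputs shown for `T.unfold(0)`, `T.unfold(1)`, and `T.unfold(2)`.

```python
import numpy as np

def unfold(T, mode):
    """Matricize T: rows index `mode`, columns run over the remaining modes."""
    # Bring `mode` to the front, then flatten the rest in column-major order
    return np.reshape(np.moveaxis(T, mode, 0), (T.shape[mode], -1), order="F")

def fold(M, mode, shape):
    """Inverse of unfold: rebuild a tensor of the given shape."""
    front_shape = [shape[mode]] + [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(np.reshape(M, front_shape, order="F"), 0, mode)

def vectorize(T):
    """Stack all entries into one vector, first index varying fastest."""
    return T.reshape(-1, order="F")

T = np.arange(0, 24).reshape((3, 4, 2))
print(unfold(T, 0))
print(unfold(T, 1))
print(unfold(T, 2))
print(vectorize(T)[:4])   # starts with the first mode-1 fiber
```

This only covers dense arrays; a sparse tensor would need its coordinate list remapped instead of a reshape.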
20. RANK-1 TENSOR. The outer product of N vectors results in a rank-1 tensor: $T = a^{(1)} \circ \cdots \circ a^{(N)}$. In the three-way case $T = a \circ b \circ c$, with entries $T_{i,j,k} = a^{(1)}_i a^{(2)}_j a^{(3)}_k$.

        import numpy as np

        a = np.array([1, 2, 3])
        b = np.array([1, 2, 3, 4])
        c = np.array([1, 2])

        # Each entry is the product of one element from each vector
        T = np.zeros((a.shape[0], b.shape[0], c.shape[0]))
        for i in range(a.shape[0]):
            for j in range(b.shape[0]):
                for k in range(c.shape[0]):
                    T[i, j, k] = a[i] * b[j] * c[k]

    Result:

        array([[[  1.,   2.], [  2.,   4.], [  3.,   6.], [  4.,   8.]],
               [[  2.,   4.], [  4.,   8.], [  6.,  12.], [  8.,  16.]],
               [[  3.,   6.], [  6.,  12.], [  9.,  18.], [ 12.,  24.]]])
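The triple loop can also be written without explicit indexing. The sketch below is an alternative, not taken from the talk: it builds the same rank-1 tensor with `np.einsum` and with chained `np.multiply.outer`, then checks that a matricized rank-1 tensor has matrix rank 1.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([1, 2, 3, 4])
c = np.array([1, 2])

# Outer product of three vectors: T[i, j, k] = a[i] * b[j] * c[k]
T = np.einsum('i,j,k->ijk', a, b, c)

# Same tensor from two chained outer products
T_outer = np.multiply.outer(np.multiply.outer(a, b), c)

# Flattening modes 2 and 3 into columns gives a matrix of rank 1
rank_of_unfolding = np.linalg.matrix_rank(T.reshape(a.shape[0], -1))

print(T.shape, np.array_equal(T, T_outer), rank_of_unfolding)
```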
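21. TENSOR RANK ▸ Every tensor can be written as a sum of rank-1 tensors [Figure: a tensor drawn as a sum of J rank-1 terms, $a_1 \circ b_1 \circ c_1 + \cdots + a_J \circ b_J \circ c_J$] ▸ Tensor rank: smallest number of rank-1 tensors that can generate it by summing up. In general, $X \approx \sum_{r=1}^{R} a^{(1)}_r \circ a^{(2)}_r \circ \cdots \circ a^{(N)}_r \equiv [\![A^{(1)}, A^{(2)}, \cdots, A^{(N)}]\!]$; for three modes, $T \approx \sum_{r=1}^{R} a_r \circ b_r \circ c_r \equiv [\![A, B, C]\!]$.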
22. Kruskal tensor: $T = \sum_{r=1}^{R} a_r \circ b_r \circ c_r \equiv [\![A, B, C]\!]$, where column r of each factor matrix holds $a_r$, $b_r$, $c_r$.

        import numpy as np

        A = np.array([[1, 2, 3], [4, 5, 6]]).T
        B = np.array([[1, 2, 3, 4], [5, 6, 7, 8]]).T
        C = np.array([[1, 2], [3, 4]]).T

        # Sum over the R columns of rank-1 terms
        T = np.zeros((A.shape[0], B.shape[0], C.shape[0]))
        for i in range(A.shape[0]):
            for j in range(B.shape[0]):
                for k in range(C.shape[0]):
                    for r in range(A.shape[1]):
                        T[i, j, k] += A[i, r] * B[j, r] * C[k, r]

        # The same tensor in a single call
        T = np.einsum('ir,jr,kr->ijk', A, B, C)

    Result of the loop:

        array([[[  61.,   82.], [  74.,  100.], [  87.,  118.], [ 100.,  136.]],
               [[  77.,  104.], [  94.,  128.], [ 111.,  152.], [ 128.,  176.]],
               [[  93.,  126.], [ 114.,  156.], [ 135.,  186.], [ 156.,  216.]]])
23. TENSOR FACTORIZATION ▸ CANDECOMP/PARAFAC factorization (CP) ▸ extensions of SVD / PCA / NMF to tensors. NON-NEGATIVE TENSOR FACTORIZATION ▸ Decompose a non-negative tensor into a sum of R non-negative rank-1 tensors: $\arg\min_{A,B,C} \|T - [\![A, B, C]\!]\|$ with $[\![A, B, C]\!] \equiv \sum_{r=1}^{R} a_r \circ b_r \circ c_r$ subject to $A \ge 0, B \ge 0, C \ge 0$
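24. TENSOR FACTORIZATION: HOW TO. Alternating Least Squares (ALS): fix all but one factor matrix, to which least squares is applied: $\min_{A \ge 0} \|T_{(1)} - A(C \odot B)^T\|$, $\min_{B \ge 0} \|T_{(2)} - B(C \odot A)^T\|$, $\min_{C \ge 0} \|T_{(3)} - C(B \odot A)^T\|$. Here $T_{(k)}$ is the tensor unfolded on the kth mode and $\odot$ denotes the Khatri-Rao product, which is a column-wise Kronecker product, i.e., $C \odot B = [c_1 \otimes b_1, c_2 \otimes b_2, \ldots, c_R \otimes b_R]$. With the estimated factors: $T_{(1)} = \hat{A}(\hat{C} \odot \hat{B})^T$, $T_{(2)} = \hat{B}(\hat{C} \odot \hat{A})^T$, $T_{(3)} = \hat{C}(\hat{B} \odot \hat{A})^T$.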
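To make the unfolded identities concrete, the sketch below builds the Khatri-Rao product with NumPy and checks $T_{(1)} = A(C \odot B)^T$ and its two siblings on the small factor matrices from slide 22. The helpers `khatri_rao` and `unfold` are hypothetical names, `unfold` uses the same column ordering as the unfolding examples on slide 19, and the last step is an ordinary (unconstrained) least-squares solve, shown only as a simplified stand-in for the non-negative subproblem.

```python
import numpy as np

def khatri_rao(X, Y):
    """Column-wise Kronecker product: column r is kron(X[:, r], Y[:, r])."""
    r = X.shape[1]
    return np.einsum('ir,jr->ijr', X, Y).reshape(-1, r)

def unfold(T, mode):
    """Mode-`mode` matricization with earlier modes varying fastest."""
    return np.reshape(np.moveaxis(T, mode, 0), (T.shape[mode], -1), order="F")

# Factor matrices from the Kruskal tensor example (R = 2)
A = np.array([[1, 2, 3], [4, 5, 6]]).T
B = np.array([[1, 2, 3, 4], [5, 6, 7, 8]]).T
C = np.array([[1, 2], [3, 4]]).T
T = np.einsum('ir,jr,kr->ijk', A, B, C)

# The three unfolded forms of the same factorization
ok_1 = np.allclose(unfold(T, 0), A @ khatri_rao(C, B).T)
ok_2 = np.allclose(unfold(T, 1), B @ khatri_rao(C, A).T)
ok_3 = np.allclose(unfold(T, 2), C @ khatri_rao(B, A).T)
print(ok_1, ok_2, ok_3)

# One unconstrained least-squares step for A with B and C held fixed:
# solve (C kr B) A^T = T_(1)^T in the least-squares sense
A_ls = np.linalg.lstsq(khatri_rao(C, B), unfold(T, 0).T, rcond=None)[0].T
print(np.allclose(A_ls, A))
```

Since T here is exactly of the Kruskal form, the least-squares step recovers A; with noisy data and the non-negativity constraint, a non-negative solver would take the place of `lstsq`.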
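25. One ALS sweep with multiplicative updates, solving $\min_{A \ge 0} \|T_{(1)} - A(C \odot B)^T\|$, $\min_{B \ge 0} \|T_{(2)} - B(C \odot A)^T\|$, $\min_{C \ge 0} \|T_{(3)} - C(B \odot A)^T\|$ in turn:

        import numpy as np

        # n, m, o are the tensor dimensions and r the number of components (assumed defined).
        # Random non-negative start: an all-zero start would give 0/0 in the update below.
        F = [np.random.rand(n, r), np.random.rand(m, r), np.random.rand(o, r)]
        # Gram matrix F[k]^T F[k] of each factor
        FF_init = np.array([f.T.dot(f) for f in F])

        def iter_solver(T, F, FF_init):
            # Update each factor
            for k in range(len(F)):
                # Compute the inner-product matrix
                FF = np.ones((r, r))
                for i in list(range(k)) + list(range(k + 1, len(F))):
                    FF = FF * FF_init[i]
                # unfolded tensor times Khatri-Rao product
                # (T is assumed to be a tensor object that provides uttkrp)
                XF = T.uttkrp(F, k)
                F[k] = F[k] * XF / (F[k].dot(FF))
                # F[k] = nnls(FF, XF.T).T
                FF_init[k] = F[k].T.dot(F[k])
            return F, FF_init

    The update mirrors the NMF case, $\arg\min_{W,H} \|X - WH\| \ \text{s.t.}\ W, H \ge 0$ with $W \leftarrow W \ast \frac{XH^T}{WHH^T}$ (element-wise), where $XH^T$ is replaced by the unfolded tensor times Khatri-Rao product, e.g. $T_{(1)}(C \odot B)$. J. Kim and H. Park. Fast Nonnegative Tensor Factorization with an Active-set-like Method. In High-Performance Scientific Computing: Algorithms and Applications, Springer, 2012, pp. 311-326.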
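The solver above relies on a tensor object with a `uttkrp` method. For a dense three-way NumPy array, the same multiplicative scheme can be sketched without any tensor library, as below. This is an illustrative stand-in under stated assumptions, not the talk's implementation: the names `mttkrp`, `ntf_multiplicative`, and `reconstruct`, the `eps` guard, and the iteration count are all made up for the example, and the test tensor reuses the non-negative factors from slide 22.

```python
import numpy as np

def mttkrp(T, F, k):
    """Unfolded tensor (mode k) times the Khatri-Rao product of the other factors."""
    if k == 0:
        return np.einsum('ijk,jr,kr->ir', T, F[1], F[2])
    if k == 1:
        return np.einsum('ijk,ir,kr->jr', T, F[0], F[2])
    return np.einsum('ijk,ir,jr->kr', T, F[0], F[1])

def ntf_multiplicative(T, rank, n_iter=200, eps=1e-12, seed=0):
    """Non-negative CP factors of a dense 3-way tensor via multiplicative updates."""
    rng = np.random.default_rng(seed)
    F = [rng.random((dim, rank)) for dim in T.shape]
    for _ in range(n_iter):
        for k in range(3):
            # Element-wise product of the Gram matrices of the other factors
            G = np.ones((rank, rank))
            for i in range(3):
                if i != k:
                    G *= F[i].T @ F[i]
            # F[k] <- F[k] * MTTKRP / (F[k] G), element-wise
            F[k] *= mttkrp(T, F, k) / (F[k] @ G + eps)
    return F

def reconstruct(F):
    """Kruskal tensor [[A, B, C]] from the three factor matrices."""
    return np.einsum('ir,jr,kr->ijk', *F)

# Test tensor: exactly rank 2, built from non-negative factors
A = np.array([[1, 2, 3], [4, 5, 6]]).T
B = np.array([[1, 2, 3, 4], [5, 6, 7, 8]]).T
C = np.array([[1, 2], [3, 4]]).T
T = np.einsum('ir,jr,kr->ijk', A, B, C).astype(float)

F = ntf_multiplicative(T, rank=2)
rel_err = np.linalg.norm(T - reconstruct(F)) / np.linalg.norm(T)
print("relative reconstruction error:", rel_err)
```

The einsum-based `mttkrp` avoids forming the unfolded tensor and the Khatri-Rao product explicitly, which is convenient for small dense examples like this one.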
26. HOW TO INTERPRET: USER × TERM × TIME. X is a 3-way tensor (N×M×T) in which $x_{nmt}$ is 1 if the term m was used by user n at interval t, 0 otherwise. $A_{N \times K}$ is the association of each user n to a factor k. $B_{M \times K}$ is the association of each term m to a factor k. $C_{T \times K}$ shows the time activity of each factor. [Figure: X (users × terms × time, N×M×T) decomposed into A (users × factors, N×K), B (terms × factors, M×K), and C (time × factors, T×K)]
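A binary user × term × time tensor of this kind can be assembled from a list of usage records. The sketch below uses a handful of invented records (the users, terms, and intervals are hypothetical) and a dense array, which is only practical for small data; real datasets of this type are typically sparse.

```python
import numpy as np

# Hypothetical (user, term, interval) records; the last one is a repeat
records = [
    ("ann", "tensor", 0),
    ("ann", "python", 0),
    ("bob", "python", 1),
    ("bob", "numpy", 1),
    ("ann", "tensor", 2),
    ("ann", "tensor", 2),
]

users = sorted({u for u, _, _ in records})
terms = sorted({m for _, m, _ in records})
n_intervals = 1 + max(t for _, _, t in records)

user_idx = {u: n for n, u in enumerate(users)}
term_idx = {m: i for i, m in enumerate(terms)}

# x[n, m, t] = 1 if term m was used by user n at interval t, 0 otherwise
X = np.zeros((len(users), len(terms), n_intervals))
for u, m, t in records:
    X[user_idx[u], term_idx[m], t] = 1.0

print(X.shape, int(X.sum()))
```

Given factor matrices A, B, and C from a non-negative factorization of X, the largest entries in column k of B would then indicate the terms most associated with factor k, and column k of C its activity over time.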