High-degree polynomial expansions
20th June 2022
Outline
1 Introduction
2 Higher-degree polynomial expansions
3 Object recognition with polynomial networks
4 Data generation with polynomial networks
5 Future directions
Imagenet
https://paperswithcode.com/sota/image-classification-on-imagenet
Deep-learning architectures
K. He, X. Zhang, S. Ren, J. Sun. 'Deep residual learning for image recognition.' In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
Deep-learning architectures
J Hu, L Shen, G Sun. ’Squeeze-and-excitation networks.’ In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
X. Wang, R. Girshick, A. Gupta, K. He. ’Non-local Neural Networks.’ In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
MLP layer
Squeeze-and-Excitation Nets are 2nd degree polynomials
Non-local neural network is a 3rd degree polynomial
Self-Attention is a 3rd degree polynomial
Learning with polynomials, an old idea
Mapping units [Hinton, 1985], "dynamic mapping" [von der Malsburg, 1981].
Binocular + motion energy models [Adelson, Bergen, 1985], [Ohzawa, DeAngelis, Freeman, 1990], [Fleet et al., 1994].
Sigma-pi neural unit [Mel, Koch, 1990].
Higher-order Boltzmann machines / higher-order neural networks [Sejnowski, 1986].
Subspace SOM [Kohonen, 1996], topographic ICA [Hyvarinen, Hoyer, 2000], [Karklin, Lewicki, 2003].
Bilinear models [Tenenbaum and Freeman, 2008], [Olshausen, 1994], [Grimes, Rao, 2005].
Higher-order restricted Boltzmann machines (RBMs) [Memisevic and Hinton, 2007], [Ranzato et al., 2010].
Gating mechanisms: LSTM [Hochreiter, Schmidhuber, 1997], multiplicative RNN [Sutskever, Martens, Hinton, 2011].
Group Method of Data Handling (GMDH)
One of the first approaches to the systematic design of nonlinear relationships.
Generates partial descriptions of the data (PDs) from two input variables at a time.
Shortcoming: it tends to produce an overly complex network.
A Ivakhnenko. ‘Polynomial theory of complex systems.’ IEEE Transactions on Systems, Man, and Cybernetics, 1971.
Mapping Units / Higher Order Boltzmann Machines
Hinton et al. (1985) and Sutskever et al. (2011) argue that
multiplications (mapping units) allow for better modeling of
conjunctions.
Higher order Boltzmann Machines and Higher order RBMs utilize
multiplication in factorized representations, e.g., bilinear models
factorize style and content.
Pi-Sigma network (PSN)
Single hidden layer learns multiple affine transformations of the data,
multiplies them to obtain the output.
h_{ji} = \sum_{k} w_{kji} x_k + \theta_{ji} ,
y_i = \sigma\Big( \prod_{j} h_{ji} \Big) .
Y Shin, J Ghosh. ‘The pi-sigma network: an efficient higher-order neural network for pattern classification and function approximation.’ International Joint Conference on
Neural Networks, 1991.
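To make the formulation concrete, here is a minimal NumPy sketch of a single pi-sigma unit (variable names and the sigmoid choice are ours, not from the original paper): several affine "sigma" terms are computed and then multiplied before the output nonlinearity.

```python
import numpy as np

def pi_sigma_unit(x, W, theta, sigma=lambda t: 1 / (1 + np.exp(-t))):
    """Pi-sigma unit: K affine 'sigma' terms followed by a product ('pi').

    x:     input vector of shape (d,)
    W:     weights of shape (K, d), one row per summing unit
    theta: biases of shape (K,)
    """
    h = W @ x + theta          # h_j = sum_k w_kj * x_k + theta_j
    return sigma(np.prod(h))   # y = sigma(prod_j h_j)

# Tiny usage example with random weights.
rng = np.random.default_rng(0)
x = rng.standard_normal(5)
y = pi_sigma_unit(x, rng.standard_normal((3, 5)), rng.standard_normal(3))
```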
Sigma-Pi-Sigma Neural Network (SPSNN)
Composed of different orders of pi-sigma networks.
f_{SPSNN} = \sum_{i=1}^{k} f_{PSN_i} = \sum_{i=1}^{k} \prod_{j} h_{ji} .
C Li. ‘A sigma-pi-sigma neural network (SPSNN).’ Neural Processing Letters, 2003.
Factorization Machines
A second-degree polynomial net for combining features under sparse data.
The weight matrix of the pairwise interactions is mapped into a low-rank space using matrix factorization.

\hat{y}(x) := w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle v_i , v_j \rangle x_i x_j ,

where the learnable parameters are w_0 ∈ R, w ∈ R^n and V ∈ R^{n×k} (with k ≪ n).
S Rendle. ‘Factorization Machines.’ International Conference on Data Mining, 2010.
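A minimal NumPy sketch of the prediction above; the O(nk) rewriting of the pairwise term is the standard reformulation from Rendle's paper, and the variable names are ours:

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Factorization machine: y = w0 + <w, x> + sum_{i<j} <v_i, v_j> x_i x_j.

    x: features (n,), w0: scalar bias, w: linear weights (n,), V: factors (n, k).
    """
    linear = w0 + w @ x
    # Pairwise term computed in O(nk) instead of O(n^2 k).
    xv = x @ V                                      # shape (k,)
    pairwise = 0.5 * np.sum(xv ** 2 - (x ** 2) @ (V ** 2))
    return linear + pairwise
```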
Variations of Factorization Machines
Field-aware FM (FFM): different latent vectors are used depending on the field of the feature being combined:

\hat{y}(x) := w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle v_{i,f_j} , v_{j,f_i} \rangle x_i x_j .

Field-weighted FM: adds a weight parameter for every pair of fields:

\hat{y}(x) := w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle v_i , v_j \rangle x_i x_j r_{f_i , f_j} .

Higher-order FM: extends the expansion to third- and higher-order feature combinations.
Y Juan, Y Zhuang, W Chin, C Lin. ‘Field-aware factorization machines for CTR prediction.’ In ACM conference on recommender systems, 2016.
J Pan, et al. ‘Field-weighted factorization machines for click-through rate prediction in display advertising.’ In World Wide Web Conference, 2018.
M Blondel, A Fujino, N Ueda, M Ishihata. ‘Higher-order factorization machines.’ In Advances in neural information processing systems (NeurIPS), 2016.
Multiplicative Recurrent Neural Networks (MRNN)
Character-level language modeling tasks.
Multiplicative (or “gated”) connections.
factor state sequence: f_t = diag(W_{fx} x_t) · W_{fh} h_{t−1}
hidden state sequence: h_t = tanh(W_{hf} f_t + W_{hx} x_t)
output sequence: o_t = W_{oh} h_t + b_o .
I Sutskever, J Martens, G Hinton. ‘Generating text with recurrent neural networks.’ In International Conference on Machine Learning (ICML), 2011.
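A small NumPy sketch of one MRNN step as written above (our own helper; note that diag(W_{fx} x_t) · W_{fh} h_{t−1} is exactly an element-wise gating of W_{fh} h_{t−1} by W_{fx} x_t):

```python
import numpy as np

def mrnn_step(x_t, h_prev, Wfx, Wfh, Whf, Whx, Woh, bo):
    """One step of the multiplicative RNN: the factor state gates the previous hidden state."""
    f_t = (Wfx @ x_t) * (Wfh @ h_prev)   # diag(Wfx x_t) . Wfh h_{t-1}
    h_t = np.tanh(Whf @ f_t + Whx @ x_t)
    o_t = Woh @ h_t + bo
    return h_t, o_t
```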
Sum-Product Networks (SPN)
H Poon, P Domingos. ‘Sum-product networks: A new deep architecture.’ In International Conference on Computer Vision Workshops, 2011.
Multiplicative interactions
Outline
1 Introduction
2 Higher-degree polynomial expansions
Notation
3 Object recognition with polynomial networks
4 Data generation with polynomial networks
5 Future directions
Formalism
In machine learning tasks, we have (at least) one input and one output.
The goal is to learn G(z) : R^d → R^o, with z ∈ R^d the input.
Neural networks use a composition of linear layers and element-wise non-linear units.
We augment this structure to capture higher-order correlations using tensors.
Hadamard product
Let matrices Γ ∈ R^{2×3} and P ∈ R^{2×3}. The Hadamard product Γ ∗ P, denoted by '∗', is defined element-wise:

\underbrace{\begin{bmatrix} γ_{(1,1)} & γ_{(1,2)} & γ_{(1,3)} \\ γ_{(2,1)} & γ_{(2,2)} & γ_{(2,3)} \end{bmatrix}}_{Γ} ∗ \underbrace{\begin{bmatrix} ρ_{(1,1)} & ρ_{(1,2)} & ρ_{(1,3)} \\ ρ_{(2,1)} & ρ_{(2,2)} & ρ_{(2,3)} \end{bmatrix}}_{P} = \underbrace{\begin{bmatrix} γ_{(1,1)}ρ_{(1,1)} & γ_{(1,2)}ρ_{(1,2)} & γ_{(1,3)}ρ_{(1,3)} \\ γ_{(2,1)}ρ_{(2,1)} & γ_{(2,2)}ρ_{(2,2)} & γ_{(2,3)}ρ_{(2,3)} \end{bmatrix}}_{Γ∗P}   (1)

The Hadamard product of Γ ∈ R^{I×N} and P ∈ R^{I×N} results in a matrix of dimensions I × N.
Hadamard, J. ’Leçons sur la Propagation des Ondes et les Équations de l’Hydrodynamique’, 1903.
Halmos, Paul R. ’Finite-dimensional vector spaces’, Annals of Mathematics Studies, Princeton University Press, 1948.
Khatri-Rao product
Let matrices Γ ∈ R^{2×3} and P ∈ R^{3×3}. The Khatri-Rao product Γ ⊙ P, denoted by '⊙', is defined as:

\underbrace{\begin{bmatrix} γ_{(1,1)} & γ_{(1,2)} & γ_{(1,3)} \\ γ_{(2,1)} & γ_{(2,2)} & γ_{(2,3)} \end{bmatrix}}_{Γ} ⊙ \underbrace{\begin{bmatrix} ρ_{(1,1)} & ρ_{(1,2)} & ρ_{(1,3)} \\ ρ_{(2,1)} & ρ_{(2,2)} & ρ_{(2,3)} \\ ρ_{(3,1)} & ρ_{(3,2)} & ρ_{(3,3)} \end{bmatrix}}_{P} = \underbrace{\begin{bmatrix} γ_{(1,1)}ρ_{(1,1)} & γ_{(1,2)}ρ_{(1,2)} & γ_{(1,3)}ρ_{(1,3)} \\ γ_{(1,1)}ρ_{(2,1)} & γ_{(1,2)}ρ_{(2,2)} & γ_{(1,3)}ρ_{(2,3)} \\ γ_{(1,1)}ρ_{(3,1)} & γ_{(1,2)}ρ_{(3,2)} & γ_{(1,3)}ρ_{(3,3)} \\ γ_{(2,1)}ρ_{(1,1)} & γ_{(2,2)}ρ_{(1,2)} & γ_{(2,3)}ρ_{(1,3)} \\ γ_{(2,1)}ρ_{(2,1)} & γ_{(2,2)}ρ_{(2,2)} & γ_{(2,3)}ρ_{(2,3)} \\ γ_{(2,1)}ρ_{(3,1)} & γ_{(2,2)}ρ_{(3,2)} & γ_{(2,3)}ρ_{(3,3)} \end{bmatrix}}_{Γ⊙P}   (2)

The Khatri-Rao product of Γ ∈ R^{I×N} and P ∈ R^{J×N} results in a matrix of dimensions (IJ) × N.
Khatri, C. G., and C. Radhakrishna Rao. ’Solutions to some functional equations and their applications to characterization of probability distributions.’ Sankhyā: the Indian
journal of statistics, series A (1968): 167-180.
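A minimal NumPy sketch of the column-wise (Khatri-Rao) product, matching the definition above (the helper name is ours):

```python
import numpy as np

def khatri_rao(G, P):
    """Column-wise Kronecker product: (I, N) x (J, N) -> (I*J, N)."""
    I, N = G.shape
    J, N2 = P.shape
    assert N == N2, "both factors need the same number of columns"
    # einsum forms all products gamma_(i,n) * rho_(j,n); then the (i, j) axes are merged.
    return np.einsum('in,jn->ijn', G, P).reshape(I * J, N)

G = np.arange(6, dtype=float).reshape(2, 3)
P = np.arange(9, dtype=float).reshape(3, 3)
print(khatri_rao(G, P).shape)  # (6, 3)
```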
Tensors
Tensors → multi-dimensional arrays.
The order is the number of dimensions, e.g. X ∈ R4×4×4 has order 3.
Third-order tensor illustration:
Let W ∈ R^{I_1×···×I_M} and u ∈ R^{I_m} with m ∈ [1, ..., M]. The mode-m vector product W ×_m u is:

(W ×_m u)_{i_1,...,i_{m−1},i_{m+1},...,i_M} = \sum_{i_m=1}^{I_m} w_{i_1,...,i_M} u_{i_m}   (3)
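A small NumPy sketch of the mode-m vector product in (3), written with tensordot (the helper name and 1-indexed mode convention are ours, to match the slides):

```python
import numpy as np

def mode_m_vec_product(W, u, m):
    """Contract mode m (1-indexed, as in the slides) of tensor W with vector u."""
    return np.tensordot(W, u, axes=([m - 1], [0]))

W = np.random.rand(4, 4, 4)   # third-order tensor
u = np.random.rand(4)
print(mode_m_vec_product(W, u, 2).shape)  # (4, 4): mode 2 is contracted away
```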
CP decomposition
Goal: Decompose a tensor W to a sequence of low-rank components.
In matrix form: W_{(1)} \doteq U_{[1]} \Big( \bigodot_{m=M}^{2} U_{[m]} \Big)^T, where \{U_{[m]}\}_{m=1}^{M} are the factor matrices.
A schematic of the CP decomposition of a third-order tensor W is:
Figure: CP decomposition of a third-order tensor.
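As a sketch, a rank-R CP reconstruction of a third-order tensor from its factor matrices can be written with einsum (the factor names U1, U2, U3 are ours):

```python
import numpy as np

def cp_reconstruct(U1, U2, U3):
    """Rank-R CP: W[i, j, k] = sum_r U1[i, r] * U2[j, r] * U3[k, r]."""
    return np.einsum('ir,jr,kr->ijk', U1, U2, U3)

R = 5
U1, U2, U3 = (np.random.rand(4, R) for _ in range(3))
W = cp_reconstruct(U1, U2, U3)   # shape (4, 4, 4), rank at most R
```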
Outline
1 Introduction
2 Higher-degree polynomial expansions
Polynomial expansion with respect to an input vector
3 Object recognition with polynomial networks
4 Data generation with polynomial networks
5 Future directions
Polynomial approximation
Approximate the τ-th element (G(z))_τ with an Nth-degree polynomial:

(G(z))_τ ≈ β_τ + \sum_{i=1}^{d} w^{[1]}_{τ,i} z_i + \sum_{i=1}^{d} \sum_{j=1}^{d} w^{[2]}_{τ,i,j} z_i z_j + \cdots + \underbrace{\sum_{i=1}^{d} \sum_{j=1}^{d} \cdots \sum_{k=1}^{d}}_{N \text{ summations}} w^{[N]}_{τ,i,j,...,k} z_i z_j \cdots z_k   (4)

Both β_τ ∈ R and the set of tensors \{ W^{[n]}_τ ∈ R^{\prod_{m=1}^{n} ×_m d} \}_{n=1}^{N} are learnable parameters.
Polynomial approximation
Equation (4) can be written in tensor format as:

(G(z))_τ ≈ β_τ + (w^{[1]}_τ)^T z + z^T W^{[2]}_τ z + \cdots + W^{[N]}_τ \prod_{n=1}^{N} ×_n z   (5)

By stacking the polynomials for all elements τ ∈ [1, ..., o], we obtain:

G(z) ≈ \sum_{n=1}^{N} \Big( W^{[n]} \prod_{j=2}^{n+1} ×_j z \Big) + β   (6)

By the Stone-Weierstrass theorem, a polynomial can approximate any smooth function (on a compact domain) arbitrarily well.
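As an illustration (our own toy code, not part of the tutorial), a naive dense implementation of (6) for N = 3 makes the parameter count explicit: the un-factorized tensors already need d + d^2 + d^3 parameters per output element.

```python
import numpy as np

d, o = 8, 4
rng = np.random.default_rng(0)
beta = rng.standard_normal(o)
W1 = rng.standard_normal((o, d))          # degree-1 weights
W2 = rng.standard_normal((o, d, d))       # degree-2 tensor
W3 = rng.standard_normal((o, d, d, d))    # degree-3 tensor

def dense_poly3(z):
    """Naive degree-3 expansion of (6): output dimension o, input dimension d."""
    return (beta + W1 @ z
            + np.einsum('oij,i,j->o', W2, z, z)
            + np.einsum('oijk,i,j,k->o', W3, z, z, z))

y = dense_poly3(rng.standard_normal(d))
```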
Polynomial approximation - learnable parameters
The learnable parameters of (6) are Θ(d^N).
A solution to reduce them: demand each factor W^{[n]} to be low-rank.
Outline
1 Introduction
2 Higher-degree polynomial expansions
Tensor decomposition per degree
3 Object recognition with polynomial networks
4 Data generation with polynomial networks
5 Future directions
Tensor decomposition per degree
First solution: demand each factor W^{[n]} to be low-rank.
Apply the CP decomposition to each factor W^{[n]}.
Then, the expansion for N = 3 is:

y = β + C^T_{1,[1]} z + (C^T_{1,[2]} z) ∗ (C^T_{2,[2]} z) + (C^T_{1,[3]} z) ∗ (C^T_{2,[3]} z) ∗ (C^T_{3,[3]} z)   (7)
G Chrysos*, M Georgopoulos*, J Deng, J Kossaifi, Y Panagakis, A Anandkumar, ‘Augmenting Deep Classifiers with Polynomial Neural Networks.’ European Conference on
Computer Vision (ECCV), 2022.
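A minimal NumPy sketch of the third-degree expansion in (7), with one CP-factorized term per degree (random factor matrices and our own function name, purely for illustration):

```python
import numpy as np

def degree3_expansion(z, beta, C1_1, C1_2, C2_2, C1_3, C2_3, C3_3):
    """y = beta + C1_[1]^T z + (C1_[2]^T z)*(C2_[2]^T z) + (C1_[3]^T z)*(C2_[3]^T z)*(C3_[3]^T z)."""
    first = C1_1.T @ z
    second = (C1_2.T @ z) * (C2_2.T @ z)
    third = (C1_3.T @ z) * (C2_3.T @ z) * (C3_3.T @ z)
    return beta + first + second + third

d, o = 8, 4
rng = np.random.default_rng(0)
factors = [rng.standard_normal((d, o)) for _ in range(6)]
y = degree3_expansion(rng.standard_normal(d), np.zeros(o), *factors)
```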
Khatri-Rao to Hadamard product
Lemma (Chrysos’19)
For a set of N matrices \{A_{[ν]} ∈ R^{I_ν×K}\}_{ν=1}^{N} and \{B_{[ν]} ∈ R^{I_ν×L}\}_{ν=1}^{N}, the following equality holds:

\Big( \bigodot_{ν=1}^{N} A_{[ν]} \Big)^T \cdot \Big( \bigodot_{ν=1}^{N} B_{[ν]} \Big) = (A^T_{[1]} \cdot B_{[1]}) ∗ \ldots ∗ (A^T_{[N]} \cdot B_{[N]}),   (8)

where the symbol '∗' denotes the Hadamard product.
G Chrysos, S Moschoglou, Y Panagakis, and S Zafeiriou. ‘Polygan: High-order polynomial generators.’ arXiv preprint arXiv:1908.06571.
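A quick NumPy check of the lemma for N = 2 (self-contained toy code with our own helper; arbitrary shapes):

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product of A (I, K) and B (J, K) -> (I*J, K)."""
    I, K = A.shape
    J, _ = B.shape
    return np.einsum('ik,jk->ijk', A, B).reshape(I * J, K)

rng = np.random.default_rng(0)
A1, A2 = rng.standard_normal((4, 3)), rng.standard_normal((5, 3))
B1, B2 = rng.standard_normal((4, 2)), rng.standard_normal((5, 2))

lhs = khatri_rao(A1, A2).T @ khatri_rao(B1, B2)
rhs = (A1.T @ B1) * (A2.T @ B2)          # Hadamard of the two small products
print(np.allclose(lhs, rhs))              # True
```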
Factorization of Univariate Polynomials Over Finite Fields
Berlekamp's algorithm (1970): practical only over small finite fields.
Cantor-Zassenhaus algorithm (1981): a probabilistic algorithm.
Shoup's algorithm (1990): a deterministic algorithm.
E Berlekamp. ‘Factoring Polynomials Over Large Finite Fields.’ In Mathematics of Computation, 1970.
D Cantor, H Zassenhaus. ‘A New Algorithm for Factoring Polynomials Over Finite Fields.’ In Mathematics of Computation, 1981.
V Shoup. ‘On the deterministic complexity of factoring polynomials over finite fields.’ In Information Processing Letters, 1990.
Decoupling Multivariate Polynomials
Factorizing multivariate polynomials as a linear combination of
univariate polynomials has been studied using tensor decompositions.
Using first-order information and CP decomposition.
Obtain a decomposition of the form:

f_i(u_1, \ldots, u_m) = \sum_{j=1}^{r} w_{ij} \, g_j\Big( \sum_{k=1}^{m} v_{kj} u_k \Big), \quad \forall i = 1, \ldots, n ,

or, in matrix form (the decoupled representation):

f(u) = W g(V^T u) ,
P. Dreesen, M. Ishteva, J. Schoukens. ‘Decoupling Multivariate Polynomials Using First-Order Information and Tensor Decompositions.’ Journal on Matrix Analysis and
Applications, 2015.
Outline
1 Introduction
2 Higher-degree polynomial expansions
Π−nets: Joint decompositions across degrees
3 Object recognition with polynomial networks
4 Data generation with polynomial networks
5 Future directions
Π-nets: Third-degree expansion schematic - Model CCP
Figure: Third-degree expansion.
G Chrysos, S Moschoglou, Y Panagakis, and S Zafeiriou. ‘Polygan: High-order polynomial generators.’ arXiv preprint arXiv:1908.06571.
G Chrysos, S Moschoglou, G Bouritsas, Y Panagakis, J Deng, and S Zafeiriou. ‘Π-nets: Deep Polynomial Neural Networks.’ In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition (CVPR), 2020.
Π−nets - Model CCP
We use a coupled CP decomposition, i.e., factor sharing across the different levels.
To demonstrate the method, we assume a third-degree expansion, i.e., N = 3 in (6).
Then, the expansion is:

G(z) = β + W^{[1]} z + W^{[2]} ×_2 z ×_3 z + W^{[3]} ×_2 z ×_3 z ×_4 z   (9)
Π−nets - Third-degree expansion - Model CCP
We use the following factorizations:
Let W^{[1]} = C U^T_{[1]} be the parameters of the first level of approximation.
Assume W^{[2]} = W^{[2]}_{1:2} + W^{[2]}_{1:3}. We use a coupled CP decomposition, which results in the following matrix form:
W^{[2]}_{(1)} = C (U_{[3]} ⊙ U_{[1]})^T + C (U_{[2]} ⊙ U_{[1]})^T .
Let the third-degree parameters be: W^{[3]}_{(1)} = C (U_{[3]} ⊙ U_{[2]} ⊙ U_{[1]})^T .
Π-nets - Nth-degree expansion
The derivation can be extended to an arbitrary degree with the following recursive formulation:

x_n = (U^T_{[n]} z) ∗ x_{n−1} + x_{n−1} ,   (CCP)

for n = 2, \ldots, N, with x_1 = U^T_{[1]} z and x = C x_N + β. The parameters C ∈ R^{o×k}, U_{[n]} ∈ R^{d×k} for n = 1, \ldots, N are learnable.
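A minimal PyTorch sketch of the (CCP) recursion (our own module; the official code is linked later in the slides):

```python
import torch
import torch.nn as nn

class CCP(nn.Module):
    """Nth-degree polynomial expansion with the coupled CP (CCP) recursion."""

    def __init__(self, d, k, o, degree):
        super().__init__()
        self.U = nn.ModuleList(nn.Linear(d, k, bias=False) for _ in range(degree))
        self.C = nn.Linear(k, o)          # final projection C x_N + beta

    def forward(self, z):
        x = self.U[0](z)                  # x_1 = U_[1]^T z
        for U_n in self.U[1:]:
            x = U_n(z) * x + x            # x_n = (U_[n]^T z) * x_{n-1} + x_{n-1}
        return self.C(x)

model = CCP(d=16, k=32, o=10, degree=3)
out = model(torch.randn(4, 16))           # batch of 4 inputs
```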
Π−nets - Alternative models
Model CCP above assumes a certain factorization, e.g., W^{[2]} = W^{[2]}_{1:2} + W^{[2]}_{1:3}.
New models can be derived by changing the assumptions.
For instance, what if we assume that the tensors admit nested
decompositions?
Π-nets: Model NCP
The model with nested decompositions, called NCP, for N = 3:
Figure: Third-degree NCP expansion (schematic with factor matrices A_{[n]}, S_{[n]}, B_{[n]}, biases b_{[n]}, the output projection C, and bias β).
Π-nets: Model NCP
The derivation can be extended to an arbitrary degree with the following
recursive formulation:
x_n = (A^T_{[n]} z) ∗ (S^T_{[n]} x_{n−1} + B^T_{[n]} b_{[n]}) ,   (NCP)

for n = 2, \ldots, N, with x_1 = (A^T_{[1]} z) ∗ (B^T_{[1]} b_{[1]}) and x = C x_N + β.
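A corresponding PyTorch sketch of the (NCP) recursion (again our own module, with the per-degree bias terms B^T_{[n]} b_{[n]} folded into the Linear layers for brevity):

```python
import torch
import torch.nn as nn

class NCP(nn.Module):
    """Nth-degree expansion with nested decompositions (NCP)."""

    def __init__(self, d, k, o, degree):
        super().__init__()
        self.A = nn.ModuleList(nn.Linear(d, k, bias=False) for _ in range(degree))
        # S_[n]^T x_{n-1} + B_[n]^T b_[n]: the additive term is absorbed into the Linear bias.
        self.S = nn.ModuleList(nn.Linear(k, k, bias=True) for _ in range(degree - 1))
        self.b1 = nn.Parameter(torch.randn(k))   # plays the role of B_[1]^T b_[1]
        self.C = nn.Linear(k, o)

    def forward(self, z):
        x = self.A[0](z) * self.b1                       # x_1
        for A_n, S_n in zip(self.A[1:], self.S):
            x = A_n(z) * S_n(x)                          # x_n
        return self.C(x)                                  # C x_N + beta
```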
Π-nets: Product of polynomials
The previous formulations, e.g. (CCP), require Θ(N) layers for an Nth-degree expansion.
Can we achieve a higher-degree expansion with fewer parameters?
Yes. For instance, by stacking lower-degree polynomials sequentially, as in the figure and the sketch below.
Figure: stacking N polynomials of degree 2 (a chain z → [degree 2] → ... → [degree 2] → G(z)) results in a polynomial expansion of degree 2^N.
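For example, assuming the CCP module sketched earlier, the product-of-polynomials construction is just sequential composition: each composed degree-2 block multiplies the total degree by 2.

```python
import torch.nn as nn

# Hypothetical composition using the CCP sketch defined earlier in these notes:
# three degree-2 blocks in sequence give an overall expansion of degree 2^3 = 8.
stacked = nn.Sequential(
    CCP(d=16, k=32, o=32, degree=2),
    CCP(d=32, k=32, o=32, degree=2),
    CCP(d=32, k=32, o=10, degree=2),
)
```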
Performance of polynomial expansions (with batch normalization) on
CIFAR10, CIFAR100 benchmarks.
Table: Polynomial expansion versus baselines.

Model               CIFAR10          CIFAR100
2-degree products   0.907 ± 0.003    0.667 ± 0.003
ResNet18∗           0.391 ± 0.001    0.168 ± 0.001
ResNet18            0.945 ± 0.000    0.756 ± 0.001
SORT model
The model obtains the following formulation:
x = U^T_{[1]} z + U^T_{[2]} z + (U^T_{[1]} z) ∗ (U^T_{[2]} z) .   (10)
Y Wang, L Xie, C Liu, Y Zhang, W Zhang, A Yuille. ‘SORT: Second-Order Response Transform for Visual Recognition.’ International Conference on Computer Vision
(ICCV), 2017.
Squeeze-and-Excitation network
Squeeze-and-Excitation network (SENet): the output Y^{SE} of the SENet block with respect to the input X ∈ R^{hw×C} (h is the height, w is the width) can be formulated as:

Y^{SE} = (X W_1) ∗ r\big(p(X W_1) W_2\big) = (X W_1) ∗ \vec{1} \Big( \tfrac{1}{hw} \vec{1}^{\,T} X W_1 W_2 \Big)   (11)

where W_1, W_2 are learnable parameters, p denotes global average pooling, r the (optional) sigmoid, and \vec{1} the all-ones vector that broadcasts the pooled vector over the hw spatial positions.
J Hu, L Shen, G Sun. ’Squeeze-and-excitation networks.’ In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
Non-local (NL) neural network
Non-local (NL) neural network: the output Y^{NL} ∈ R^{N×C} of the non-local block with respect to the input X ∈ R^{N×C} can be formulated as:

Y^{NL} = (X W_1 W_2^T X^T)(X W_3),   (12)

where W_1, W_2, W_3 ∈ R^{C×C} are learnable parameters.
Scales quadratically with the dimension N (i.e., O(N^2) complexity).
X Wang, R Girshick, A Gupta, K He. ’Non-local Neural Networks.’ In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
Poly-NL
Poly-NL: a third-degree polynomial net used as a non-local (self-attention) block. The output Y^{Poly-NL} ∈ R^{N×C} is expressed as:

Y^{Poly-NL} = (\Phi(X W_1 ∗ X W_2) ∗ X) W_3,   (13)

where W_1, W_2, W_3 ∈ R^{C×C} are learnable parameters.
Scales linearly with the dimension N (i.e., O(N) complexity).
F Babiloni, et al. ‘Poly-NL: Linear Complexity Non-local Layers with Polynomials.’ In International Conference on Computer Vision (ICCV), 2021.
Linear Complexity Self-Attention with Polynomials
Poly-NL reformulates self-attention (SA) using only global descriptors and element-wise multiplications, achieving linear complexity O(N).
Poly-NL: Space and Time Complexity
Figure: Poly-NL achieves up to a 10× run-time speed-up and 5× lower complexity overhead compared to NL.
Non-local with lower-degree interactions
PDC-NL: Y = (X W_1 W_2^T X^T)(X W_3) + (X W_4) ∗ (X W_5) + X W_6
Includes first- to third-degree terms, extending NL (which contains only the third-degree term).
G Chrysos*, M Georgopoulos*, J Deng, J Kossaifi, Y Panagakis, A Anandkumar, ‘Augmenting Deep Classifiers with Polynomial Neural Networks.’ European Conference on
Computer Vision (ECCV), 2022.
Outline
1 Introduction
2 Higher-degree polynomial expansions
3 Object recognition with polynomial networks
4 Data generation with polynomial networks
Unconditional generation with polynomial networks
5 Future directions
Expressivity - Generation without activation functions
Results from a generator with convolutional layers without activations:
Expressivity of Π−nets
We consider image generation without activation functions between the
layers. Synthesized images:
Expressivity of Π−nets
Linear interpolation in the latent space:
Image generation from a polynomial generator
Π-nets on non-Euclidean representation learning
Beyond image generation, polynomial nets perform well in non-Euclidean representation learning.
Code: https://github.com/grigorisg9gr/polynomial_nets
G Chrysos, S Moschoglou, G Bouritsas, J Deng, Y Panagakis, and S Zafeiriou. ‘Deep Polynomial Neural Networks.’ IEEE Transactions on Pattern Analysis and Machine
Intelligence (T-PAMI), 2021.
Outline
1 Introduction
2 Higher-degree polynomial expansions
3 Object recognition with polynomial networks
4 Data generation with polynomial networks
Synthesizing unseen combinations
5 Future directions
Conditional data generation: Visual examples
Figure: Image-to-image translation examples.
Phillip Isola, et al. 'Image-to-image translation with conditional adversarial networks.' Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
Mehdi Mirza and Simon Osindero. ’Conditional generative adversarial nets’, CoRR 2014.
Attribute-conditional generative models
Attribute-conditional generative models and generalization
Conditional Variational Autoencoder (cVAE)
MLC-VAE - Our framework
We instead model each attribute combination with a different mean.
How to obtain the mean:
M(y_1, y_2) = W^{[1]} y_1 + W^{[2]} y_2 + W^{[12]} ×_2 y_1 ×_3 y_2,   (14)

for attributes y_1, y_2.
M Georgopoulos, G Chrysos, M Pantic, and Y Panagakis. ‘Multilinear Latent Conditioning for Generating Unseen Attribute Combinations.’ In International Conference on
Machine Learning (ICML), 2020.
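A small NumPy sketch of the multilinear mean in (14), with einsum for the trilinear term (shapes and names are illustrative, not from the paper's code):

```python
import numpy as np

def multilinear_mean(y1, y2, W1, W2, W12):
    """M(y1, y2) = W1 y1 + W2 y2 + W12 x_2 y1 x_3 y2."""
    return W1 @ y1 + W2 @ y2 + np.einsum('lab,a,b->l', W12, y1, y2)

latent, a1, a2 = 16, 3, 4          # latent size and two attribute cardinalities
rng = np.random.default_rng(0)
mu = multilinear_mean(
    rng.standard_normal(a1), rng.standard_normal(a2),
    rng.standard_normal((latent, a1)), rng.standard_normal((latent, a2)),
    rng.standard_normal((latent, a1, a2)),
)
```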
MLC-VAE - Results
MLC-VAE - Multiplicative interactions
Can we use additive interactions instead?
Not really. Consider, for instance, synthesizing images with the attribute combination ('smile', 'closed mouth').
Outline
1 Introduction
2 Higher-degree polynomial expansions
3 Object recognition with polynomial networks
4 Data generation with polynomial networks
Conditional image generation with polynomial networks
5 Future directions
Diverse samples in conditional generation
Figure: In addition to the adversarial loss of GANs, regularization losses are
typically used for enabling diverse synthesis.
Q Mao, H Lee, H Tseng, S Ma, M Yang. ‘Mode Seeking Generative Adversarial Networks for Diverse Image Synthesis.’ In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition (CVPR), 2019.
Conditional image generation - Introduction
1 Conditioning the generator still relies on the neural network for the expressivity.
2 Can we use high-degree polynomial expansions instead?
3 Assume z_I, z_II ∈ R^d are the input vectors. The goal is to learn a function G : R^d × R^d → R^o that captures the higher-order correlations between the elements of the two inputs.
CoPE: Nth-degree expansion - Model CCP
The recursive formulation of CoPE is given by:

x_n = x_{n−1} + (U^T_{[n,I]} z_I + U^T_{[n,II]} z_{II}) ∗ x_{n−1},   (15)

for n = 2, \ldots, N, with x_1 = U^T_{[1,I]} z_I + U^T_{[1,II]} z_{II} and x = C x_N + β.
The schematic illustration is the following:
Figure: Nth-degree expansion for conditional generation.
G Chrysos, M Georgopoulos, and Y Panagakis. ‘Conditional Generation Using Polynomial Expansions.’ In Advances in neural information processing systems (NeurIPS),
2021.
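A minimal PyTorch sketch of the CoPE recursion in (15), mirroring the CCP module above but with two conditioning inputs (our own code, not the official implementation, which is linked later in the slides):

```python
import torch
import torch.nn as nn

class CoPE(nn.Module):
    """Conditional polynomial expansion over two inputs z_I (noise) and z_II (condition)."""

    def __init__(self, d, k, o, degree):
        super().__init__()
        self.UI = nn.ModuleList(nn.Linear(d, k, bias=False) for _ in range(degree))
        self.UII = nn.ModuleList(nn.Linear(d, k, bias=False) for _ in range(degree))
        self.C = nn.Linear(k, o)

    def forward(self, zI, zII):
        x = self.UI[0](zI) + self.UII[0](zII)            # x_1
        for UI_n, UII_n in zip(self.UI[1:], self.UII[1:]):
            x = x + (UI_n(zI) + UII_n(zII)) * x          # x_n
        return self.C(x)                                  # C x_N + beta

gen = CoPE(d=64, k=128, o=256, degree=4)
out = gen(torch.randn(2, 64), torch.randn(2, 64))
```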
Synthesized images with CoPE
(a) edges-to-handbags (b) edges-to-shoes
Figure: The first row depicts the conditional input (i.e., the edges); rows 2-6 depict outputs as we vary z_I (i.e., the noise).
Beyond two-variable expansion with CoPE
The recursive formulation can be extended beyond two-variable expansions. For three variables, the formulation is the following:

x_n = x_{n−1} + (U^T_{[n,I]} z_I + U^T_{[n,II]} z_{II} + U^T_{[n,III]} z_{III}) ∗ x_{n−1},   (16)

for n = 2, \ldots, N, with x_1 = U^T_{[1,I]} z_I + U^T_{[1,II]} z_{II} + U^T_{[1,III]} z_{III} and x = C x_N + β.
Code:
https://github.com/grigorisg9gr/polynomial_nets_for_conditional_generation
Beyond two-variable expansion with CoPE
Synthesized images on conditional generation with 2 attributes:
(a) (b)
Figure: (a) Each row/column depicts a different hair/eye color respectively, (b)
synthesized images per unique combination by varying the noise zI.
Outline
1 Introduction
2 Higher-degree polynomial expansions
3 Object recognition with polynomial networks
4 Data generation with polynomial networks
Audio synthesis
5 Future directions
Audio representation
Time domain vs. frequency domain
Figure: Source: https://www.nti-audio.com/en/support/know-how/fast-fourier-transform-fft
How to model the complex-valued frequency representations?
Real-valued neural networks (RVNNs) with 1 output channel for the
magnitude of complex-valued representations:
Discard the phase information.
Require phase reconstruction in a generative task.
RVNNs with 2 output channels for complex-valued representations:
Higher degree of freedom at the synaptic weighting.
Lower generalization ability.
How about directly modelling the complex-valued representations?
A Hirose, S. Yoshida. ’Generalization Characteristics of Complex-Valued Feedforward Neural Networks in Relation to Signal Coherence.’ IEEE Transactions on Neural
Networks and Learning Systems, 2012.
Mergelyan’s Theorem
Suppose K is a compact set in the plane whose complement is connected, f is a continuous complex-valued function defined on K which is holomorphic in the interior of K, and ε > 0. Then there exists a polynomial P such that |f(x) − P(x)| < ε for all x ∈ K.
W Rudin. ’Real and Complex Analysis.’ McGraw-Hill International Series, 1987.
Schematic of the generator
Figure: schematic of the APOLLO generator (Model BN): complex-valued random noise is mapped, degree by degree, to an audio representation in the frequency domain.
Yongtao Wu, G Chrysos, Volkan Cevher. ’Adversarial Audio Synthesis with Complex-valued Polynomial Networks.’ 2022.
Model in the complex field
CFBN (nested CP decomposition with bias):
The recursive form for the Nth-degree expansion is:

\tilde{y}_n = (\tilde{E}^T_{[n]} \tilde{x} + \tilde{ρ}_{[n]}) ∗ (\tilde{F}^T_{[n]} \tilde{y}_{n−1} + \tilde{b}_{[n]}) + \tilde{y}_{n−1},   (17)

for n = 2, \ldots, N, with \tilde{y}_1 = (\tilde{E}^T_{[1]} \tilde{x}) ∗ \tilde{b}_{[1]} and \tilde{y} = \tilde{H} \tilde{y}_N + \tilde{h}, where we denote \tilde{b}_{[n]} = \tilde{B}^T_{[n]} \tilde{β}_{[n]} for n = 1, \ldots, N (tildes denote complex-valued quantities).
Unsupervised audio generation on SC09 dataset
Model                 IS (↑)        FID (↓)   NDB (↓)        JSD (↓)   # par (M)
Real data             8.01 ± 0.24   0.50      0.00 ± 0.00    0.011     -
WaveGAN               4.67 ± 0.01   41.60     16.00 ± 1.09   0.094     36.5
SpecGAN               6.03 ± 0.04   -         -              -         36.5
TiFGAN                5.97          26.70     6.00 ± 0.89    0.051     42.4
StyleGAN-U2           -             27.10     -              -         48.7
Unsupervised BigGAN   6.17 ± 0.20   24.72     -              -         -
Π-Nets                6.59 ± 0.03   13.01     4.40 ± 0.48    0.048     45.9
APOLLO, Small         6.48 ± 0.05   18.90     4.20 ± 1.47    0.038     4.6
APOLLO                7.25 ± 0.05   8.15      3.20 ± 1.16    0.029     64.1
Human evaluation
Human evaluation on unsupervised audio generation on SC09 dataset.
From left to right in the histogram, the Mean Opinion Score (MOS)
for all models and the real data are 1.61, 2.68, 2.73, 3.33, and 4.73,
respectively.
Figure: histogram of MOS ratings for WaveGAN, TiFGAN, Π-Nets, APOLLO, and the real data.
Multimodal generation: Image-to-speech
Figure: taxonomy of architectures viewed as polynomial expansions of increasing degree (1st degree to higher degrees), including the MLP with identity activation, ResNet, RNN, Mahalanobis distance / metric learning, Highway networks, LSTM gating, bilinear forms, squeeze-and-excitation nets, the multiplicative RNN, StyleGAN, non-local networks, self-attention, MLC-VAE, Π-Nets, APOLLO, CoPE, PDC, and higher-order tensor RNNs.
Complementary work on polynomial networks I
1 Polynomial networks can enlarge the hypothesis space [Jayakumar’20,
Fan’21].
2 Privacy-preserving applications require polynomial expansions
[Zhang’19].
3 Sample complexity (and similar theoretical bounds) might be simpler
to compute [Zhu’22].
4 Known (theoretical) results from neural networks might not be
directly applicable (e.g., implicit bias).
S Jayakumar, et al. ‘Multiplicative Interactions and Where to Find Them.’ In International Conference on Learning Representations (ICLR), 2020.
FL Fan, et al. ‘Expressivity and Trainability of Quadratic Networks.’ ArXiv preprint arXiv:2110.06081.
S Zhang, Y Gong, D Yu, ‘Encrypted Speech Recognition using Deep Polynomial Networks.’ In International Conference on Acoustics, Speech and Signal Processing
(ICASSP), 2019.
Z Zhu, et al. ‘Controlling the Complexity and Lipschitz Constant improves Polynomial Nets’ In International Conference on Learning Representations (ICLR), 2022.
Theoretical characterization of polynomial networks
Figure: Double descent curve on polynomial regression.
Source: https://windowsontheory.org/2019/12/05/deep-double-descent/
Optimization and training
1 Multiplications can make the loss surface less well behaved [Schwarz
et al.]. How should we adapt the optimizers for polynomial
architectures?
2 What is the interaction between model degree and implicit
regularization in polynomial networks?
3 How should we initialize polynomial networks?
J Schwarz, S Jayakumar, R Pascanu, P Latham, T W Teh. ’Powerpropagation: A sparsity inducing weight reparameterisation.’ In Advances in neural information
processing systems (NeurIPS), 2021.
Architecture
1 Can we use other popular tensor factorizations, e.g. Tucker
decomposition, to obtain useful architectures?
2 How can we evaluate the differences of those architectures?
3 How can we determine the degree required by the task at hand?
1 Is higher degree always better?
2 Where should we have this higher degree?
3 Is there a total degree that is sufficient for all standard tasks?
Architecture II
4 How can we express a joint tensor decomposition over all sequential
polynomial networks?
5 Can we represent all signals of interest with a sequence of polynomial
expansions?
6 How should we reason about activations often used in conjunction
with a polynomial form?
1 Are activations required?
2 Are they mostly there to make learning possible?
3 How do they modify the polynomial expansion?
Robustness of polynomial networks
1 A polynomial expansion with unconstrained input can obtain
extremely large values.
2 How can we constrain their output range values efficiently?
3 How can we make polynomial nets robust to (adversarial) noise?
Demo code
https://github.com/polynomial-nets/tutorial-2022-intro-polynomial-nets
Thank you for your attention
1 We would like to thank Francesca Babiloni, Leello Dadi, Zhenyu Zhu
and Yongtao Wu for their help in preparing the tutorial.
2 Further information and materials can be found on
https://polynomial-nets.github.io/.
3 Contact us: grigorios.chrysos [at] epfl.ch.
A new four-dimensional hyper-chaotic system for image encryption A new four-dimensional hyper-chaotic system for image encryption
A new four-dimensional hyper-chaotic system for image encryption
 
Graph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype PredictionGraph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype Prediction
 

Recently uploaded

一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
一比一原版卡尔加里大学毕业证(uc毕业证)如何办理一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
oaxefes
 
Cell The Unit of Life for NEET Multiple Choice Questions.docx
Cell The Unit of Life for NEET Multiple Choice Questions.docxCell The Unit of Life for NEET Multiple Choice Questions.docx
Cell The Unit of Life for NEET Multiple Choice Questions.docx
vasanthatpuram
 
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
ywqeos
 
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Kaxil Naik
 
Experts live - Improving user adoption with AI
Experts live - Improving user adoption with AIExperts live - Improving user adoption with AI
Experts live - Improving user adoption with AI
jitskeb
 
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
Timothy Spann
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
ihavuls
 
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
taqyea
 
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
hyfjgavov
 
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
mkkikqvo
 
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
eoxhsaa
 
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
aguty
 
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
bopyb
 
The Ipsos - AI - Monitor 2024 Report.pdf
The  Ipsos - AI - Monitor 2024 Report.pdfThe  Ipsos - AI - Monitor 2024 Report.pdf
The Ipsos - AI - Monitor 2024 Report.pdf
Social Samosa
 
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
nyfuhyz
 
一比一原版格里菲斯大学毕业证(Griffith毕业证书)学历如何办理
一比一原版格里菲斯大学毕业证(Griffith毕业证书)学历如何办理一比一原版格里菲斯大学毕业证(Griffith毕业证书)学历如何办理
一比一原版格里菲斯大学毕业证(Griffith毕业证书)学历如何办理
lzdvtmy8
 
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
xclpvhuk
 
Sample Devops SRE Product Companies .pdf
Sample Devops SRE  Product Companies .pdfSample Devops SRE  Product Companies .pdf
Sample Devops SRE Product Companies .pdf
Vineet
 
Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024
ElizabethGarrettChri
 
一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理
一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理
一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理
asyed10
 

Recently uploaded (20)

一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
一比一原版卡尔加里大学毕业证(uc毕业证)如何办理一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
 
Cell The Unit of Life for NEET Multiple Choice Questions.docx
Cell The Unit of Life for NEET Multiple Choice Questions.docxCell The Unit of Life for NEET Multiple Choice Questions.docx
Cell The Unit of Life for NEET Multiple Choice Questions.docx
 
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
 
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
 
Experts live - Improving user adoption with AI
Experts live - Improving user adoption with AIExperts live - Improving user adoption with AI
Experts live - Improving user adoption with AI
 
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
 
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
 
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
 
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
 
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
 
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
 
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
 
The Ipsos - AI - Monitor 2024 Report.pdf
The  Ipsos - AI - Monitor 2024 Report.pdfThe  Ipsos - AI - Monitor 2024 Report.pdf
The Ipsos - AI - Monitor 2024 Report.pdf
 
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
 
一比一原版格里菲斯大学毕业证(Griffith毕业证书)学历如何办理
一比一原版格里菲斯大学毕业证(Griffith毕业证书)学历如何办理一比一原版格里菲斯大学毕业证(Griffith毕业证书)学历如何办理
一比一原版格里菲斯大学毕业证(Griffith毕业证书)学历如何办理
 
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
 
Sample Devops SRE Product Companies .pdf
Sample Devops SRE  Product Companies .pdfSample Devops SRE  Product Companies .pdf
Sample Devops SRE Product Companies .pdf
 
Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024
 
一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理
一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理
一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理
 

Tutorial on Polynomial Networks at CVPR'22

  • 17. Pi-Sigma network (PSN). A single hidden layer learns multiple affine transformations of the data and multiplies them to obtain the output: $h_{ji} = \sum_{k} w_{kji}\, x_k + \theta_{ji}$, $\; y_i = \sigma\big(\prod_{j} h_{ji}\big)$. [Y Shin, J Ghosh. 'The pi-sigma network: an efficient higher-order neural network for pattern classification and function approximation.' International Joint Conference on Neural Networks, 1991.]
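  A minimal NumPy sketch of a single pi-sigma unit as described above (function name, shapes and the sigmoid choice for σ are ours, for illustration only):

```python
import numpy as np

def pi_sigma_unit(x, W, theta):
    """Pi-sigma unit: several affine 'sigma' units followed by a product 'pi' unit.

    x:     (d,)   input vector
    W:     (J, d) weights of the J affine units
    theta: (J,)   biases
    Returns y = sigmoid(prod_j h_j) with h_j = W_j @ x + theta_j.
    """
    h = W @ x + theta                             # J affine transformations of the input
    return 1.0 / (1.0 + np.exp(-np.prod(h)))      # multiply them, then squash

rng = np.random.default_rng(0)
x = rng.normal(size=5)
W = 0.1 * rng.normal(size=(3, 5))
theta = np.zeros(3)
print(pi_sigma_unit(x, W, theta))
```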
  • 18. Sigma-Pi-Sigma Neural Network (SPSNN). Composed of pi-sigma networks of different orders: $f_{\mathrm{SPSNN}} = \sum_{i=1}^{k} f_{\mathrm{PSN}_i} = \sum_{i=1}^{k} \prod_{j=1}^{i} h_{ji}$. [C Li. 'A sigma-pi-sigma neural network (SPSNN).' Neural Processing Letters, 2003.]
  • 19. Factorization Machines. A second-degree polynomial net that combines features under sparse data; the weight matrix is mapped into a low-rank space using matrix factorization: $\hat{y}(x) := w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n}\sum_{j=i+1}^{n} \langle v_i, v_j \rangle\, x_i x_j$, where the learnable parameters are $w_0 \in \mathbb{R}$, $w \in \mathbb{R}^{n}$ and $V \in \mathbb{R}^{n \times k}$ (with $k \ll n$). [S Rendle. 'Factorization Machines.' International Conference on Data Mining, 2010.]
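  A short NumPy sketch of the FM prediction above; it uses the standard O(nk) rewriting of the pairwise term, and the function name and sizes are ours:

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Second-order Factorization Machine prediction.

    x: (n,) features, w0: scalar bias, w: (n,) linear weights, V: (n, k) factors, k << n.
    Uses sum_{i<j} <v_i, v_j> x_i x_j = 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i v_if^2 x_i^2 ].
    """
    linear = w0 + w @ x
    s = V.T @ x                                          # (k,)
    pairwise = 0.5 * np.sum(s ** 2 - (V ** 2).T @ (x ** 2))
    return linear + pairwise

rng = np.random.default_rng(0)
n, k = 8, 3
x = rng.normal(size=n)
print(fm_predict(x, 0.1, rng.normal(size=n), 0.01 * rng.normal(size=(n, k))))
```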
  • 20. Variations of Factorization Machines. Field-aware FM (FFM): different factor vectors are used depending on the field of the interacting feature: $\hat{y}(x) := w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n}\sum_{j=i+1}^{n} \langle v_{i,f_j}, v_{j,f_i} \rangle\, x_i x_j$. Field-weighted FM (FwFM): adds a weight for every pair of fields: $\hat{y}(x) := w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n}\sum_{j=i+1}^{n} \langle v_i, v_j \rangle\, x_i x_j\, r_{f_i, f_j}$. Higher-order FM: third- or higher-order feature combinations. [Y Juan, Y Zhuang, W Chin, C Lin. 'Field-aware factorization machines for CTR prediction.' ACM Conference on Recommender Systems, 2016. J Pan, et al. 'Field-weighted factorization machines for click-through rate prediction in display advertising.' World Wide Web Conference, 2018. M Blondel, A Fujino, N Ueda, M Ishihata. 'Higher-order factorization machines.' Advances in Neural Information Processing Systems (NeurIPS), 2016.]
  • 21. Multiplicative Recurrent Neural Networks (MRNN). Character-level language modeling with multiplicative (or "gated") connections: factor state sequence $f_t = \mathrm{diag}(W_{fx} x_t)\, W_{fh} h_{t-1}$; hidden state sequence $h_t = \tanh(W_{hf} f_t + W_{hx} x_t)$; output sequence $o_t = W_{oh} h_t + b_o$. [I Sutskever, J Martens, G Hinton. 'Generating text with recurrent neural networks.' International Conference on Machine Learning (ICML), 2011.]
  • 22. Sum-Product Networks (SPN). [H Poon, P Domingos. 'Sum-product networks: A new deep architecture.' International Conference on Computer Vision Workshops, 2011.]
  • 23. Multiplicative interactions.
  • 24. Outline: 1 Introduction, 2 Higher-degree polynomial expansions, 3 Object recognition with polynomial networks, 4 Data generation with polynomial networks, 5 Future directions.
  • 25. Outline: 2 Higher-degree polynomial expansions, Notation.
  • 26. Formalism. In machine learning tasks we have (at least) one input and one output; the goal is to learn $G(z) : \mathbb{R}^d \to \mathbb{R}^o$ with input $z \in \mathbb{R}^d$. Neural networks use a composition of linear layers and elementwise non-linear units. We augment this structure and capture the higher-order correlations using tensors.
  • 27. Hadamard product. For matrices $\Gamma, P \in \mathbb{R}^{I \times N}$, the Hadamard product, denoted '$*$', is the elementwise product: $(\Gamma * P)_{i,n} = \gamma_{(i,n)}\, \rho_{(i,n)}$; e.g., for $\Gamma, P \in \mathbb{R}^{2 \times 3}$ the result is the $2 \times 3$ matrix with entries $\gamma_{(i,j)} \rho_{(i,j)}$. The Hadamard product of $\Gamma \in \mathbb{R}^{I \times N}$ and $P \in \mathbb{R}^{I \times N}$ results in a matrix of dimensions $I \times N$. [Hadamard, J. 'Leçons sur la Propagation des Ondes et les Équations de l'Hydrodynamique', 1903. Halmos, P. R. 'Finite-Dimensional Vector Spaces', Annals of Mathematics Studies, Princeton University Press, 1948.]
  • 28. Khatri-Rao product. For matrices $\Gamma \in \mathbb{R}^{I \times N}$ and $P \in \mathbb{R}^{J \times N}$, the Khatri-Rao product, denoted '$\odot$', is the column-wise Kronecker product: the $n$-th column of $\Gamma \odot P$ is $\gamma_n \otimes \rho_n$, i.e., $(\Gamma \odot P)_{(i-1)J + j,\, n} = \gamma_{(i,n)}\, \rho_{(j,n)}$, and the result has dimensions $(IJ) \times N$ (e.g., a $2 \times 3$ and a $3 \times 3$ matrix yield a $6 \times 3$ matrix). [Khatri, C. G., and C. Radhakrishna Rao. 'Solutions to some functional equations and their applications to characterization of probability distributions.' Sankhyā: The Indian Journal of Statistics, Series A, 1968, 167-180.]
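  A minimal NumPy sketch of the two products above (helper names are ours); the Khatri-Rao product is built column by column from Kronecker products:

```python
import numpy as np

def hadamard(G, P):
    """Elementwise (Hadamard) product of two I x N matrices."""
    return G * P

def khatri_rao(G, P):
    """Column-wise Kronecker (Khatri-Rao) product.

    G: (I, N), P: (J, N)  ->  (I*J, N); the n-th column equals kron(G[:, n], P[:, n]).
    """
    I, N = G.shape
    J, N2 = P.shape
    assert N == N2, "both factors need the same number of columns"
    return np.einsum('in,jn->ijn', G, P).reshape(I * J, N)

G = np.arange(6, dtype=float).reshape(2, 3)
P = np.arange(9, dtype=float).reshape(3, 3)
print(hadamard(G, G).shape)     # (2, 3)
print(khatri_rao(G, P).shape)   # (6, 3)
```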
  • 29-32. Tensors. Tensors are multi-dimensional arrays; the order is the number of dimensions, e.g., $\mathcal{X} \in \mathbb{R}^{4 \times 4 \times 4}$ has order 3. [Figure: illustration of a third-order tensor with modes indexed by $i, j, k$.] Let $\mathcal{W} \in \mathbb{R}^{I_1 \times \cdots \times I_M}$ and $u \in \mathbb{R}^{I_m}$ with $m \in [1, \ldots, M]$. The mode-$m$ vector product $\mathcal{W} \times_m u$ is: $(\mathcal{W} \times_m u)_{i_1, \ldots, i_{m-1}, i_{m+1}, \ldots, i_M} = \sum_{i_m = 1}^{I_m} w_{i_1, \ldots, i_M}\, u_{i_m}$ (3).
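  A small NumPy sketch of the mode-m vector product defined in (3) (the helper is ours; contracting mode m with the vector removes that dimension):

```python
import numpy as np

def mode_m_vector_product(W, u, m):
    """Mode-m vector product of tensor W with vector u (m is 0-indexed here).

    W: tensor of shape (I_1, ..., I_M), u: vector of length I_m.
    Returns a tensor of order M - 1.
    """
    return np.tensordot(W, u, axes=([m], [0]))

W = np.random.default_rng(0).normal(size=(4, 4, 4))   # third-order tensor
u = np.ones(4)
print(mode_m_vector_product(W, u, m=1).shape)          # (4, 4)
```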
  • 33-35. CP decomposition. Goal: decompose a tensor $\mathcal{W}$ into a sum of low-rank (rank-one) components. In matrix form: $W_{(1)} \doteq U_{[1]} \big(\bigodot_{m=M}^{2} U_{[m]}\big)^{T}$, where $\{U_{[m]}\}_{m=1}^{M}$ are the factor matrices and $W_{(1)}$ is the mode-1 unfolding of $\mathcal{W}$. [Figure: CP decomposition of a third-order tensor.]
  • 36. Outline: 2 Higher-degree polynomial expansions, Polynomial expansion with respect to an input vector.
  • 37. Polynomial approximation. Approximate the $\tau$-th element $(G(z))_\tau$ with an $N$th-degree polynomial: $(G(z))_\tau \approx \beta_\tau + \sum_{i=1}^{d} w^{[1]}_{\tau,i} z_i + \sum_{i=1}^{d}\sum_{j=1}^{d} w^{[2]}_{\tau,i,j}\, z_i z_j + \cdots + \underbrace{\sum_{i=1}^{d}\sum_{j=1}^{d} \cdots \sum_{k=1}^{d}}_{N\ \text{summations}} w^{[N]}_{\tau,i,j,\ldots,k}\, z_i z_j \cdots z_k$ (4). Both $\beta_\tau \in \mathbb{R}$ and the set of tensors $\big\{\mathcal{W}^{[n]}_{\tau} \in \mathbb{R}^{\prod_{m=1}^{n} \times_m d}\big\}_{n=1}^{N}$ are learnable parameters.
  • 38. Polynomial approximation. Equation (4) can be written in tensor format as: $(G(z))_\tau \approx \beta_\tau + w^{[1]\,T}_{\tau} z + z^{T} W^{[2]}_{\tau} z + \cdots + \mathcal{W}^{[N]}_{\tau} \prod_{n=1}^{N} \times_n z$ (5). By stacking the polynomials for all elements $\tau \in [1, \ldots, o]$, we obtain: $G(z) \approx \sum_{n=1}^{N} \mathcal{W}^{[n]} \prod_{j=2}^{n+1} \times_j z + \beta$ (6). By the Stone-Weierstrass theorem, a polynomial can approximate any continuous (in particular any smooth) function on a compact domain arbitrarily well.
  • 39-40. Polynomial approximation - learnable parameters. The learnable parameters of (6) scale as $\Theta(d^N)$. A solution to reduce them: demand that each tensor $\mathcal{W}^{[n]}$ be low-rank.
  • 41. Outline: 2 Higher-degree polynomial expansions, Tensor decomposition per degree.
  • 42. Tensor decomposition per degree. First solution: demand each tensor $\mathcal{W}^{[n]}$ to be low-rank and apply a CP decomposition to each $\mathcal{W}^{[n]}$. The expansion for $N = 3$ is then: $y = \beta + C_{1,[1]}^{T} z + \big(C_{1,[2]}^{T} z\big) * \big(C_{2,[2]}^{T} z\big) + \big(C_{1,[3]}^{T} z\big) * \big(C_{2,[3]}^{T} z\big) * \big(C_{3,[3]}^{T} z\big)$ (7). [G Chrysos*, M Georgopoulos*, J Deng, J Kossaifi, Y Panagakis, A Anandkumar. 'Augmenting Deep Classifiers with Polynomial Neural Networks.' European Conference on Computer Vision (ECCV), 2022.]
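  A minimal NumPy sketch of the per-degree expansion (7) for N = 3, written exactly as the equation reads (factor names, input dimension d and output dimension o are our illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
d, o = 6, 3                                                     # input / output dims (ours)

beta = np.zeros(o)
C1_1 = rng.normal(size=(d, o))                                  # degree-1 factor
C1_2, C2_2 = (rng.normal(size=(d, o)) for _ in range(2))        # degree-2 factors
C1_3, C2_3, C3_3 = (rng.normal(size=(d, o)) for _ in range(3))  # degree-3 factors

def degree3_expansion(z):
    """y = beta + C_{1,[1]}^T z
           + (C_{1,[2]}^T z) * (C_{2,[2]}^T z)
           + (C_{1,[3]}^T z) * (C_{2,[3]}^T z) * (C_{3,[3]}^T z)   -- eq. (7)."""
    return (beta
            + C1_1.T @ z
            + (C1_2.T @ z) * (C2_2.T @ z)
            + (C1_3.T @ z) * (C2_3.T @ z) * (C3_3.T @ z))

print(degree3_expansion(rng.normal(size=d)))
```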
  • 43. Khatri-Rao to Hadamard product. Lemma (Chrysos'19): for a set of $N$ matrices $\{A_{[\nu]} \in \mathbb{R}^{I_\nu \times K}\}_{\nu=1}^{N}$ and $\{B_{[\nu]} \in \mathbb{R}^{I_\nu \times L}\}_{\nu=1}^{N}$, the following equality holds: $\big(\bigodot_{\nu=1}^{N} A_{[\nu]}\big)^{T} \cdot \big(\bigodot_{\nu=1}^{N} B_{[\nu]}\big) = \big(A_{[1]}^{T} B_{[1]}\big) * \cdots * \big(A_{[N]}^{T} B_{[N]}\big)$ (8), where the symbol '$*$' denotes the Hadamard product. [G Chrysos, S Moschoglou, Y Panagakis, S Zafeiriou. 'PolyGAN: High-order polynomial generators.' arXiv preprint arXiv:1908.06571.]
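  A quick numerical check of the lemma for N = 2, reusing the Khatri-Rao helper sketched earlier (sizes are arbitrary):

```python
import numpy as np

def khatri_rao(G, P):
    I, N = G.shape
    J, _ = P.shape
    return np.einsum('in,jn->ijn', G, P).reshape(I * J, N)

rng = np.random.default_rng(1)
I1, I2, K, L = 3, 4, 5, 2
A1, A2 = rng.normal(size=(I1, K)), rng.normal(size=(I2, K))
B1, B2 = rng.normal(size=(I1, L)), rng.normal(size=(I2, L))

lhs = khatri_rao(A1, A2).T @ khatri_rao(B1, B2)    # (K, L): Khatri-Rao then matrix product
rhs = (A1.T @ B1) * (A2.T @ B2)                    # (K, L): Hadamard of the per-factor products
print(np.allclose(lhs, rhs))                       # True
```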
  • 44. Factorization of univariate polynomials over finite fields. Berlekamp's algorithm (1970): only practical over small finite fields. Cantor-Zassenhaus algorithm (1981): probabilistic. Shoup's algorithm (1990): deterministic. [E Berlekamp. 'Factoring Polynomials Over Large Finite Fields.' Mathematics of Computation, 1970. D Cantor, H Zassenhaus. 'A New Algorithm for Factoring Polynomials Over Finite Fields.' Mathematics of Computation, 1981. V Shoup. 'On the deterministic complexity of factoring polynomials over finite fields.' Information Processing Letters, 1990.]
  • 45. Decoupling multivariate polynomials. Factorizing multivariate polynomials as a linear combination of univariate polynomials has been studied using tensor decompositions. Using first-order information and a CP decomposition, one obtains a decomposition of the form $f_i(u_1, \ldots, u_m) = \sum_{j=1}^{r} w_{ij}\, g_j\!\big(\sum_{k=1}^{m} v_{kj} u_k\big)$ for all $i = 1, \ldots, n$, or, in matrix form, the decoupled representation $f(u) = W g(V^{\top} u)$. [P Dreesen, M Ishteva, J Schoukens. 'Decoupling Multivariate Polynomials Using First-Order Information and Tensor Decompositions.' Journal on Matrix Analysis and Applications, 2015.]
  • 46. Outline: 2 Higher-degree polynomial expansions, Π-nets: joint decompositions across degrees.
  • 47-50. Π-nets: third-degree expansion schematic - Model CCP. [Figure: third-degree expansion.] [G Chrysos, S Moschoglou, Y Panagakis, S Zafeiriou. 'PolyGAN: High-order polynomial generators.' arXiv preprint arXiv:1908.06571. G Chrysos, S Moschoglou, G Bouritsas, Y Panagakis, J Deng, S Zafeiriou. 'Π-nets: Deep Polynomial Neural Networks.' Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020.]
  • 51. Π-nets - Model CCP. We use a coupled CP decomposition, i.e., factor sharing across different levels. To demonstrate the method, we assume a third-degree expansion, i.e., $N = 3$ in (6). Then, the expansion is: $G(z) = \beta + \mathcal{W}^{[1]} z + \mathcal{W}^{[2]} \times_2 z \times_3 z + \mathcal{W}^{[3]} \times_2 z \times_3 z \times_4 z$ (9).
  • 52. Π-nets - third-degree expansion - Model CCP. We use the following factorizations. Let $W^{[1]} = C U_{[1]}^{T}$ be the parameters of the first level of the approximation. Assume $\mathcal{W}^{[2]} = \mathcal{W}^{[2]}_{1:2} + \mathcal{W}^{[2]}_{1:3}$; the coupled CP decomposition yields the matrix form $W^{[2]}_{(1)} = C\big(U_{[3]} \odot U_{[1]}\big)^{T} + C\big(U_{[2]} \odot U_{[1]}\big)^{T}$. For the third-degree parameters: $W^{[3]}_{(1)} = C\big(U_{[3]} \odot U_{[2]} \odot U_{[1]}\big)^{T}$.
  • 53. Π-nets - Nth-degree expansion. The derivation can be extended to an arbitrary degree with the following recursive formulation: $x_n = \big(U_{[n]}^{T} z\big) * x_{n-1} + x_{n-1}$ (CCP), for $n = 2, \ldots, N$, with $x_1 = U_{[1]}^{T} z$ and $x = C x_N + \beta$. The parameters $C \in \mathbb{R}^{o \times k}$ and $U_{[n]} \in \mathbb{R}^{d \times k}$, $n = 1, \ldots, N$, are learnable.
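  A minimal PyTorch sketch of the CCP recursion above (the module name and sizes are ours; practical Π-nets typically replace the linear maps with convolutions and add normalization):

```python
import torch
import torch.nn as nn

class CCP(nn.Module):
    """CCP Pi-net: x_n = (U_n^T z) * x_{n-1} + x_{n-1},  x = C x_N + beta."""

    def __init__(self, d, k, o, degree=3):
        super().__init__()
        self.U = nn.ModuleList([nn.Linear(d, k, bias=False) for _ in range(degree)])
        self.C = nn.Linear(k, o)          # includes the bias beta
        self.degree = degree

    def forward(self, z):
        x = self.U[0](z)                  # x_1 = U_1^T z
        for n in range(1, self.degree):
            x = self.U[n](z) * x + x      # Hadamard product plus skip connection
        return self.C(x)

model = CCP(d=32, k=64, o=10, degree=4)
print(model(torch.randn(8, 32)).shape)    # torch.Size([8, 10])
```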
  • 54. Π-nets - alternative models. Model CCP above assumes a certain factorization, e.g., $\mathcal{W}^{[2]} = \mathcal{W}^{[2]}_{1:2} + \mathcal{W}^{[2]}_{1:3}$. New models can be derived by changing the assumptions. For instance, what if we assume that the tensors admit nested decompositions?
  • 55. Π-nets: Model NCP. The model with nested decompositions, called NCP, for $N = 3$. [Figure: third-degree NCP expansion with per-layer factors $A_{[n]}, S_{[n]}, B_{[n]}, b_{[n]}$, output matrix $C$ and bias $\beta$.]
  • 56. Π-nets: Model NCP. The derivation can be extended to an arbitrary degree with the following recursive formulation: $x_n = \big(A_{[n]}^{T} z\big) * \big(S_{[n]}^{T} x_{n-1} + B_{[n]}^{T} b_{[n]}\big)$ (NCP), for $n = 2, \ldots, N$, with $x_1 = \big(A_{[1]}^{T} z\big) * \big(B_{[1]}^{T} b_{[1]}\big)$ and $x = C x_N + \beta$.
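  A corresponding PyTorch sketch of the NCP recursion. The grouping of terms follows our reading of the formula above, and we fold each product $B_{[n]}^T b_{[n]}$ into a single learnable vector; dimensions are illustrative:

```python
import torch
import torch.nn as nn

class NCP(nn.Module):
    """NCP Pi-net: x_n = (A_n^T z) * (S_n^T x_{n-1} + b_n),  x = C x_N + beta."""

    def __init__(self, d, k, o, degree=3):
        super().__init__()
        self.A = nn.ModuleList([nn.Linear(d, k, bias=False) for _ in range(degree)])
        self.S = nn.ModuleList([nn.Linear(k, k, bias=False) for _ in range(degree - 1)])
        # b_n stands in for B_n^T b_n; initialized to ones so the first forward pass is non-degenerate.
        self.b = nn.ParameterList([nn.Parameter(torch.ones(k)) for _ in range(degree)])
        self.C = nn.Linear(k, o)
        self.degree = degree

    def forward(self, z):
        x = self.A[0](z) * self.b[0]                        # x_1
        for n in range(1, self.degree):
            x = self.A[n](z) * (self.S[n - 1](x) + self.b[n])
        return self.C(x)

model = NCP(d=32, k=64, o=10)
print(model(torch.randn(4, 32)).shape)   # torch.Size([4, 10])
```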
  • 57. Π-nets: product of polynomials. The previous formulations, e.g. (CCP), require $\Theta(N)$ layers for an $N$th-degree expansion. Can we achieve a higher-degree expansion with fewer parameters? Yes, for instance by stacking lower-degree polynomials sequentially. [Figure: stacking $N$ polynomials of degree 2 results in a polynomial expansion of degree $2^{N}$.]
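  A tiny sketch of the product-of-polynomials idea: composing degree-2 blocks multiplies the degrees, so three blocks give a degree $2^3 = 8$ expansion of the input (the block definition and sizes are ours):

```python
import torch
import torch.nn as nn

class Degree2Block(nn.Module):
    """Minimal degree-2 polynomial block: C[(U2 z) * (U1 z) + U1 z] + beta."""

    def __init__(self, d, k, o):
        super().__init__()
        self.U1 = nn.Linear(d, k, bias=False)
        self.U2 = nn.Linear(d, k, bias=False)
        self.C = nn.Linear(k, o)

    def forward(self, z):
        x1 = self.U1(z)
        x2 = self.U2(z) * x1 + x1         # second-degree term plus skip connection
        return self.C(x2)

# Each block is degree 2 in its own input, so the composition is degree 2^3 = 8 in z.
model = nn.Sequential(
    Degree2Block(32, 64, 32),
    Degree2Block(32, 64, 32),
    Degree2Block(32, 64, 10),
)
print(model(torch.randn(4, 32)).shape)    # torch.Size([4, 10])
```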
  • 58. Outline: 3 Object recognition with polynomial networks.
  • 59. Performance of polynomial expansions (with batch normalization) on the CIFAR10 and CIFAR100 benchmarks. Table: polynomial expansion versus baselines (accuracy).
      Model                 CIFAR10          CIFAR100
      2-degree products     0.907 ± 0.003    0.667 ± 0.003
      ResNet18*             0.391 ± 0.001    0.168 ± 0.001
      ResNet18              0.945 ± 0.000    0.756 ± 0.001
  • 60. SORT model. The model obtains the following formulation: $x = U_{[1]}^{T} z + U_{[2]}^{T} z + \big(U_{[1]}^{T} z\big) * \big(U_{[2]}^{T} z\big)$ (10). [Y Wang, L Xie, C Liu, Y Zhang, W Zhang, A Yuille. 'SORT: Second-Order Response Transform for Visual Recognition.' International Conference on Computer Vision (ICCV), 2017.]
  • 61. Squeeze-and-Excitation network (SENet). The output of the SENet block $Y_{SE}$ with respect to input $X \in \mathbb{R}^{hw \times C}$ ($h$ the height, $w$ the width) can be formulated as: $Y_{SE} = (X W_1) * r\big(p(X W_1)\, W_2\big) = (X W_1) * \Big(\vec{1}\,\big(\tfrac{1}{hw}\vec{1}^{T} X W_1\big) W_2\Big)$ (11), where $W_1, W_2$ are learnable parameters, $p$ denotes global average pooling over the $hw$ positions, $\vec{1} \in \mathbb{R}^{hw}$ is the all-ones vector, and the activation $r$ is omitted in the second (polynomial) form. [J Hu, L Shen, G Sun. 'Squeeze-and-excitation networks.' Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.]
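  A small PyTorch sketch of equation (11) viewed as a second-degree polynomial (activation omitted, as in the right-hand side; the function name and sizes are ours):

```python
import torch

def se_block_poly(X, W1, W2):
    """Y = (X W1) * ( 1 (1/hw) 1^T X W1 W2 ): channel re-weighting from pooled features.

    X: (hw, C), W1: (C, C1), W2: (C1, C1). Returns (hw, C1); each entry is degree 2 in X.
    """
    XW1 = X @ W1                              # (hw, C1)
    pooled = XW1.mean(dim=0, keepdim=True)    # (1, C1): global average pooling p(X W1)
    scale = pooled @ W2                       # (1, C1): excitation weights (no activation here)
    return XW1 * scale                        # broadcast over the hw positions

hw, C, C1 = 16, 8, 8
Y = se_block_poly(torch.randn(hw, C), torch.randn(C, C1), torch.randn(C1, C1))
print(Y.shape)   # torch.Size([16, 8])
```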
  • 62. Non-local (NL) neural network. The output of the non-local block $Y_{NL} \in \mathbb{R}^{N \times C}$ with respect to input $X \in \mathbb{R}^{N \times C}$ can be formulated as: $Y_{NL} = \big(X W_1 W_2^{\top} X^{\top}\big)(X W_3)$ (12), where $W_1, W_2, W_3 \in \mathbb{R}^{C \times C}$ are learnable parameters. The block scales quadratically with the dimension $N$, i.e., $\mathcal{O}(N^2)$ complexity. [X Wang, R Girshick, A Gupta, K He. 'Non-local Neural Networks.' Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.]
  • 63. Poly-NL. The output $Y_{\text{Poly-NL}} \in \mathbb{R}^{N \times C}$ is expressed with a third-degree polynomial net acting as a non-local self-attention block: $Y_{\text{Poly-NL}} = \big(\Phi(X W_1 * X W_2) * X\big) W_3$ (13), with learnable parameters $W_1, W_2, W_3 \in \mathbb{R}^{C \times C}$. The block scales linearly with the dimension $N$, i.e., $\mathcal{O}(N)$ complexity. [F Babiloni, et al. 'Poly-NL: Linear Complexity Non-local Layers with Polynomials.' International Conference on Computer Vision (ICCV), 2021.]
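  A sketch of equation (13). The slide does not define Φ, so we assume here that it averages over the N positions and broadcasts the result back; every step is linear in N:

```python
import torch

def poly_nl_block(X, W1, W2, W3):
    """Y = ( Phi(X W1 * X W2) * X ) W3, with Phi assumed to be a mean over positions.

    X: (N, C); W1, W2, W3: (C, C). Returns (N, C); overall third degree in X.
    """
    second_order = (X @ W1) * (X @ W2)            # (N, C): element-wise second-degree term
    phi = second_order.mean(dim=0, keepdim=True)  # (1, C): assumed pooling Phi (global descriptor)
    return (phi * X) @ W3                         # (N, C): no N x N attention matrix is formed

N, C = 32, 16
Y = poly_nl_block(torch.randn(N, C), torch.randn(C, C), torch.randn(C, C), torch.randn(C, C))
print(Y.shape)   # torch.Size([32, 16])
```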
  • 64. Linear-complexity self-attention with polynomials. Poly-NL reformulates self-attention using only global descriptors and element-wise multiplications, achieving linear complexity $\mathcal{O}(N)$.
  • 65. Poly-NL: space and time complexity. [Figure: Poly-NL achieves up to a 10× speed-up in run-time and a 5× lower complexity overhead compared with NL.]
  • 66. Non-local with lower-degree interactions. PDC-NL: $Y = \big(X W_1 W_2^{\top} X^{\top}\big)(X W_3) + (X W_4) * (X W_5) + X W_6$, which includes first- through third-degree terms, extending NL (which contains only the third-degree term). [G Chrysos*, M Georgopoulos*, J Deng, J Kossaifi, Y Panagakis, A Anandkumar. 'Augmenting Deep Classifiers with Polynomial Neural Networks.' European Conference on Computer Vision (ECCV), 2022.]
  • 67. Outline: 4 Data generation with polynomial networks.
  • 68. Outline: 4 Data generation with polynomial networks, Unconditional generation with polynomial networks.
  • 69. Expressivity - generation without activation functions. Results from a generator with convolutional layers and no activations.
  • 70. Expressivity of Π-nets. We consider image generation without activation functions between the layers. [Figure: synthesized images.]
  • 71. Expressivity of Π-nets. Linear interpolation in the latent space.
  • 72. Image generation from a polynomial generator.
  • 73. Π-nets on non-Euclidean representation learning. Beyond image generation, polynomial nets perform well in non-Euclidean representation learning. Code: https://github.com/grigorisg9gr/polynomial_nets [G Chrysos, S Moschoglou, G Bouritsas, J Deng, Y Panagakis, S Zafeiriou. 'Deep Polynomial Neural Networks.' IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 2021.]
  • 74. Outline: 4 Data generation with polynomial networks, Synthesizing unseen combinations.
  • 75. Conditional data generation: visual examples. [Figure: image-to-image translation examples.] [P Isola, et al. 'Image-to-image translation with conditional adversarial networks.' Conference on Computer Vision and Pattern Recognition (CVPR), 2017. M Mirza, S Osindero. 'Conditional generative adversarial nets.' CoRR, 2014.]
  • 76. Attribute-conditional generative models.
  • 77. Attribute-conditional generative models and generalization.
  • 78. Conditional Variational Autoencoder (cVAE).
  • 79. MLC-VAE - our framework. We instead model each attribute combination with a different mean, obtained as: $M(y_1, y_2) = W^{[1]} y_1 + W^{[2]} y_2 + \mathcal{W}^{[12]} \times_2 y_1 \times_3 y_2$ (14), for attributes $y_1, y_2$. [M Georgopoulos, G Chrysos, M Pantic, Y Panagakis. 'Multilinear Latent Conditioning for Generating Unseen Attribute Combinations.' International Conference on Machine Learning (ICML), 2020.]
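  A short NumPy sketch of the multilinear mean (14), using einsum for the mode products; latent and attribute dimensions are our illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
dz, d1, d2 = 8, 4, 5                        # latent dim and the two attribute dims (ours)

W1 = rng.normal(size=(dz, d1))              # linear term for attribute y1
W2 = rng.normal(size=(dz, d2))              # linear term for attribute y2
W12 = rng.normal(size=(dz, d1, d2))         # third-order tensor for the interaction term

def multilinear_mean(y1, y2):
    """M(y1, y2) = W1 y1 + W2 y2 + W12 x_2 y1 x_3 y2   -- eq. (14)."""
    interaction = np.einsum('zab,a,b->z', W12, y1, y2)   # contract modes 2 and 3
    return W1 @ y1 + W2 @ y2 + interaction

y1 = np.eye(d1)[2]                          # one-hot attribute values
y2 = np.eye(d2)[0]
print(multilinear_mean(y1, y2).shape)       # (8,)
```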
  • 80. MLC-VAE - results.
  • 81. MLC-VAE - multiplicative interactions. Can we use additive interactions instead? Not really: consider, for instance, synthesizing images with the attribute combination 'smile' and 'closed mouth'.
  • 82. Outline: 4 Data generation with polynomial networks, Conditional image generation with polynomial networks.
  • 83. Diverse samples in conditional generation. [Figure: in addition to the adversarial loss of GANs, regularization losses are typically used to enable diverse synthesis.] [Q Mao, H Lee, H Tseng, S Ma, M Yang. 'Mode Seeking Generative Adversarial Networks for Diverse Image Synthesis.' Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.]
  • 84. Conditional image generation - introduction. 1 Conditioning the generator still relies on the neural network for expressivity. 2 Can we use high-degree polynomial expansions instead? 3 Assume $z_I, z_{II} \in \mathbb{R}^d$ are the input vectors; the goal is to learn a function $G : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}^o$ that captures the higher-order correlations between the elements of the two inputs.
  • 85. CoPE: Nth-degree expansion - Model CCP. The recursive formulation of CoPE is given by: $x_n = x_{n-1} + \big(U_{[n,I]}^{T} z_I + U_{[n,II]}^{T} z_{II}\big) * x_{n-1}$ (15), for $n = 2, \ldots, N$, with $x_1 = U_{[1,I]}^{T} z_I + U_{[1,II]}^{T} z_{II}$ and $x = C x_N + \beta$. [Figure: Nth-degree expansion for conditional generation.] [G Chrysos, M Georgopoulos, Y Panagakis. 'Conditional Generation Using Polynomial Expansions.' Advances in Neural Information Processing Systems (NeurIPS), 2021.]
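  A PyTorch sketch of the CoPE recursion (15); the module name and sizes are ours, and practical generators replace the linear maps with (transposed) convolutions:

```python
import torch
import torch.nn as nn

class CoPE(nn.Module):
    """x_n = x_{n-1} + (U_{n,I}^T z_I + U_{n,II}^T z_II) * x_{n-1},  x = C x_N + beta."""

    def __init__(self, d, k, o, degree=3):
        super().__init__()
        self.UI = nn.ModuleList([nn.Linear(d, k, bias=False) for _ in range(degree)])
        self.UII = nn.ModuleList([nn.Linear(d, k, bias=False) for _ in range(degree)])
        self.C = nn.Linear(k, o)
        self.degree = degree

    def forward(self, z_i, z_ii):
        x = self.UI[0](z_i) + self.UII[0](z_ii)              # x_1
        for n in range(1, self.degree):
            x = x + (self.UI[n](z_i) + self.UII[n](z_ii)) * x
        return self.C(x)

gen = CoPE(d=16, k=32, o=64)
noise, condition = torch.randn(4, 16), torch.randn(4, 16)
print(gen(noise, condition).shape)    # torch.Size([4, 64])
```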
  • 87. Synthesized images with CoPE. [Figure: (a) edges-to-handbags, (b) edges-to-shoes. The first row depicts the conditional input (i.e., the edges); rows 2-6 depict outputs obtained by varying the noise $z_I$.]
  • 88. Beyond two-variable expansion with CoPE. The recursive formulation can be extended beyond two-variable expansions; for three variables it becomes: $x_n = x_{n-1} + \big(U_{[n,I]}^{T} z_I + U_{[n,II]}^{T} z_{II} + U_{[n,III]}^{T} z_{III}\big) * x_{n-1}$ (16), for $n = 2, \ldots, N$, with $x_1 = U_{[1,I]}^{T} z_I + U_{[1,II]}^{T} z_{II} + U_{[1,III]}^{T} z_{III}$ and $x = C x_N + \beta$. Code: https://github.com/grigorisg9gr/polynomial_nets_for_conditional_generation
  • 89. Beyond two-variable expansion with CoPE. Synthesized images for conditional generation with two attributes. [Figure: (a) each row/column depicts a different hair/eye color respectively; (b) synthesized images per unique combination obtained by varying the noise $z_I$.]
  • 90. Outline: 4 Data generation with polynomial networks, Audio synthesis.
  • 91. Audio representation: time domain vs. frequency domain. [Figure source: https://www.nti-audio.com/en/support/know-how/fast-fourier-transform-fft]
  • 92. How to model the complex-valued frequency representations? Real-valued neural networks (RVNNs) with one output channel for the magnitude of the complex-valued representation discard the phase information and require phase reconstruction in a generative task. RVNNs with two output channels for the complex-valued representation have a higher degree of freedom at the synaptic weighting but lower generalization ability. How about directly modelling the complex-valued representations? [A Hirose, S Yoshida. 'Generalization Characteristics of Complex-Valued Feedforward Neural Networks in Relation to Signal Coherence.' IEEE Transactions on Neural Networks and Learning Systems, 2012.]
  • 93. Mergelyan's theorem. Suppose $K$ is a compact set in the plane whose complement is connected, $f$ is a continuous complex-valued function defined on $K$ which is holomorphic in the interior of $K$, and $\epsilon > 0$. Then there exists a polynomial $P$ such that $|f(x) - P(x)| < \epsilon$ for all $x \in K$. [W Rudin. 'Real and Complex Analysis.' McGraw-Hill International Series, 1987.]
  • 94. Schematic of the generator. [Figure: the APOLLO generator (Model BN) maps complex-valued random noise to an audio representation in the frequency domain, increasing the polynomial degree from block to block.] [Y Wu, G Chrysos, V Cevher. 'Adversarial Audio Synthesis with Complex-valued Polynomial Networks.' 2022.]
  • 95. Model in the complex field. CFBN (nested CP decomposition with bias): the recursive form for an $N$th-degree expansion is: $\tilde{y}_n = \big(\tilde{E}_{[n]}^{T} \tilde{x} + \rho_{[n]}\big) * \big(\tilde{F}_{[n]}^{T} \tilde{y}_{n-1} + \tilde{b}_{[n]}\big) + \tilde{y}_{n-1}$ (17), for $n = 2, \ldots, N$, with $\tilde{y}_1 = \big(\tilde{E}_{[1]}^{T} \tilde{x}\big) * \tilde{b}_{[1]}$ and $\tilde{y} = \tilde{H} \tilde{y}_N + \tilde{h}$, where we denote $\tilde{b}_{[n]} = \tilde{B}_{[n]}^{T} \tilde{\beta}_{[n]}$ for $n = 1, \ldots, N$.
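  A NumPy sketch of the complex-valued recursion (17) under the grouping shown above (our reading of the formula); every quantity is complex-valued and the sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, o, N = 6, 8, 4, 3                     # input / hidden / output sizes and degree (ours)

def cplx(*shape):
    """Random complex array of the given shape."""
    return rng.normal(size=shape) + 1j * rng.normal(size=shape)

E = [cplx(d, k) for _ in range(N)]                # E_[n]
F = [None] + [cplx(k, k) for _ in range(N - 1)]   # F_[n], n >= 2
rho = [None] + [cplx(k) for _ in range(N - 1)]    # rho_[n], n >= 2
b = [cplx(k) for _ in range(N)]                   # b_[n] = B_[n]^T beta_[n]
H, h = cplx(k, o), cplx(o)

def cfbn(x):
    y = (E[0].T @ x) * b[0]                                     # y_1
    for n in range(1, N):
        y = (E[n].T @ x + rho[n]) * (F[n].T @ y + b[n]) + y     # eq. (17)
    return H.T @ y + h

out = cfbn(cplx(d))
print(out.dtype, out.shape)                 # complex128 (4,)
```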
  • 96. Unsupervised audio generation on the SC09 dataset.
      Model                 IS (↑)         FID (↓)   NDB (↓)        JSD (↓)   # par (M)
      Real data             8.01 ± 0.24    0.50      0.00 ± 0.00    0.011     -
      WaveGAN               4.67 ± 0.01    41.60     16.00 ± 1.09   0.094     36.5
      SpecGAN               6.03 ± 0.04    -         -              -         36.5
      TiFGAN                5.97           26.70     6.00 ± 0.89    0.051     42.4
      StyleGAN-U2           -              27.10     -              -         48.7
      Unsupervised BigGAN   6.17 ± 0.20    24.72     -              -         -
      Π-nets                6.59 ± 0.03    13.01     4.40 ± 0.48    0.048     45.9
      APOLLO, Small         6.48 ± 0.05    18.90     4.20 ± 1.47    0.038     4.6
      APOLLO                7.25 ± 0.05    8.15      3.20 ± 1.16    0.029     64.1
  • 97. Human evaluation. Human evaluation of unsupervised audio generation on the SC09 dataset. From left to right in the histogram, the Mean Opinion Scores (MOS) for all models and the real data are 1.61, 2.68, 2.73, 3.33, and 4.73, respectively. [Figure: MOS ratings for WaveGAN, TiFGAN, Π-nets, APOLLO, and the real data.]
  • 98. Multimodal generation: image-to-speech.
  • 99. [Figure: overview placing well-known architectures on an axis of increasing polynomial degree (1st, 2nd, 3rd, higher): MLP (identity activation), ResNet, RNN, Highway networks, LSTM/gating, Multiplicative RNN, higher-order tensor RNN, bilinear forms, Mahalanobis-distance metric learning, Squeeze-and-Excitation nets, non-local networks, self-attention, MLC-VAE, StyleGAN, Π-nets, APOLLO, CoPE, PDC, all viewed as polynomial nets.]
  • 100. Outline: 5 Future directions.
  • 101-104. Complementary work on polynomial networks. 1 Polynomial networks can enlarge the hypothesis space [Jayakumar'20, Fan'21]. 2 Privacy-preserving applications require polynomial expansions [Zhang'19]. 3 Sample complexity (and similar theoretical bounds) might be simpler to compute [Zhu'22]. 4 Known (theoretical) results from neural networks might not be directly applicable (e.g., implicit bias). [S Jayakumar, et al. 'Multiplicative Interactions and Where to Find Them.' International Conference on Learning Representations (ICLR), 2020. FL Fan, et al. 'Expressivity and Trainability of Quadratic Networks.' arXiv preprint arXiv:2110.06081. S Zhang, Y Gong, D Yu. 'Encrypted Speech Recognition using Deep Polynomial Networks.' International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019. Z Zhu, et al. 'Controlling the Complexity and Lipschitz Constant improves Polynomial Nets.' International Conference on Learning Representations (ICLR), 2022.]
  • 105. Theoretical characterization of polynomial networks. [Figure: double descent curve on polynomial regression, test loss versus polynomial degree.] Source: https://windowsontheory.org/2019/12/05/deep-double-descent/
  • 106-108. Optimization and training. 1 Multiplications can make the loss surface less well behaved [Schwarz et al.]; how should we adapt optimizers for polynomial architectures? 2 What is the interaction between model degree and implicit regularization in polynomial networks? 3 How should we initialize polynomial networks? [J Schwarz, S Jayakumar, R Pascanu, P Latham, Y W Teh. 'Powerpropagation: A sparsity inducing weight reparameterisation.' Advances in Neural Information Processing Systems (NeurIPS), 2021.]
  • 109-114. Architecture. 1 Can we use other popular tensor factorizations, e.g., the Tucker decomposition, to obtain useful architectures? 2 How can we evaluate the differences between those architectures? 3 How can we determine the degree required by the task at hand? In particular: is a higher degree always better? Where should we place this higher degree? Is there a total degree that is sufficient for all standard tasks?
  • 115-120. Architecture II. 4 How can we express a joint tensor decomposition over all sequential polynomial networks? 5 Can we represent all signals of interest with a sequence of polynomial expansions? 6 How should we reason about the activation functions often used in conjunction with a polynomial form? In particular: are activations required? Are they mostly there to make learning possible? How do they modify the polynomial expansion?
  • 121-123. Robustness of polynomial networks. 1 A polynomial expansion with unconstrained input can produce extremely large values. 2 How can we constrain the output range efficiently? 3 How can we make polynomial nets robust to (adversarial) noise?
  • 125. Thank you for your attention. 1 We would like to thank Francesca Babiloni, Leello Dadi, Zhenyu Zhu and Yongtao Wu for their help in preparing the tutorial. 2 Further information and materials can be found at https://polynomial-nets.github.io/. 3 Contact us: grigorios.chrysos [at] epfl.ch.