This is a one-page summary of my master thesis, which I handed in on June 15, 2019 at TUM. The thesis takes the form of a literature review of the existing rigorous analysis of neural networks. It focuses on three key aspects: modern and classical results in approximation theory, robustness of neural networks, and unique identification of neural network weights. The thesis was supervised by Prof. Dr. Massimo Fornasier at the Chair of Applied Numerical Analysis of the Department of Mathematics at TUM.
One-page summary of the master thesis "Mathematical Analysis of Neural Networks"
Master Thesis on Mathematical Analysis of Neural Networks at Technical University of Munich
Alina Leidinger under the supervision of Prof. Massimo Fornasier
Summary:
The aim of the master thesis is to gain a deeper understanding of some theoretical aspects of neural
networks. Firstly, classical approximation-theoretic results pertaining to density and order of
approximation are reviewed. Secondly, stability and robustness properties of neural networks are
examined. Thirdly, approaches to the tractable learning of networks, including the unique identification
of their weights, are discussed.
1. Approximation Theory
This chapter discusses results on the density of neural networks in different function spaces under
different assumptions on the non-linearity. Literature on the order or degree of approximation is
examined, as well as architecture designs that allow the curse of dimensionality to be circumvented.
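To make these statements concrete, the following is a minimal LaTeX sketch of the two classical results in informal form; the precise hypotheses on the activation and the optimality of the rate are as in the references below.

\documentclass{article}
\usepackage{amsmath, amssymb}
\begin{document}
% Density (Cybenko 1989; Hornik et al. 1989; Leshno et al. 1993), informal:
% shallow networks are dense in C(K) for suitable activations.
Shallow networks of the form
\[
  f_N(x) \;=\; \sum_{j=1}^{N} c_j \,\sigma\!\left(w_j^{\top} x + b_j\right),
  \qquad c_j, b_j \in \mathbb{R},\; w_j \in \mathbb{R}^d,
\]
are dense in $C(K)$ for every compact $K \subset \mathbb{R}^d$, provided
$\sigma$ is sigmoidal (Cybenko) or, more generally, continuous and
non-polynomial (Leshno et al.).

% Order of approximation (Mhaskar 1996), simplified: for targets in a Sobolev
% ball, $N$-neuron networks achieve rate $N^{-m/d}$; the dimension $d$ in the
% exponent is precisely the curse of dimensionality.
For $f$ in the unit ball of $W^{m,\infty}([-1,1]^d)$ there holds
\[
  \inf_{f_N} \,\| f - f_N \|_{\infty} \;\le\; C\, N^{-m/d}.
\]
\end{document}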
Key references:
Cybenko, George. "Approximation by superpositions of a sigmoidal function." Mathematics of control,
signals and systems 2.4 (1989): 303-314.
Hornik, Kurt, Maxwell Stinchcombe, and Halbert White. "Multilayer feedforward networks are universal
approximators." Neural networks 2.5 (1989): 359-366.
Leshno, Moshe, et al. "Multilayer feedforward networks with a nonpolynomial activation function can
approximate any function." Neural networks 6.6 (1993): 861-867.
Mhaskar, Hrushikesh N. "Neural networks for optimal approximation of smooth and analytic functions."
Neural computation 8.1 (1996): 164-177.
Poggio, Tomaso, et al. "Why and when can deep-but not shallow-networks avoid the curse of
dimensionality: a review." International Journal of Automation and Computing 14.5 (2017): 503-519.
2. Stability
This chapter first examines the phenomenon of adversarial examples, including exemplary adversarial
attack strategies, the universality of adversarial examples across different experimental set-ups, and
hypotheses about their distribution in input space. The main focus of the chapter is on scattering
networks, which iterate filtering operations with wavelets and non-linearities. A thorough analysis of
these networks from a signal processing point of view, and of their similarities with conventional
CNNs, is used to explain some of the recent success of neural networks. In particular, the necessity of
a non-linearity and of a deep architecture is explained.
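To make the structure of scattering networks concrete, here is a minimal LaTeX sketch in simplified Mallat notation; the admissibility conditions on the wavelets and the full form of the stability bound are as in the cited paper.

\documentclass{article}
\usepackage{amsmath, amssymb}
\begin{document}
% Scattering transform (Mallat): cascade wavelet filtering with the modulus
% non-linearity, then average with a low-pass window $\phi_J$. Along a path
% $p = (\lambda_1, \dots, \lambda_m)$ of wavelet frequencies:
\[
  S_J[p]\,x \;=\; \Bigl|\, \bigl|\, | x \ast \psi_{\lambda_1} | \ast
  \psi_{\lambda_2} \,\bigr| \,\cdots\, \ast \psi_{\lambda_m} \Bigr|
  \ast \phi_J .
\]
% The modulus is essential: without a non-linearity the cascade collapses to a
% single linear convolution, and depth $m$ recovers the information that each
% modulus demodulates to lower frequencies. Averaging by $\phi_J$ yields local
% translation invariance and, schematically (omitting the translation and
% higher-order terms of $\tau$), stability to small deformations
% $x_\tau(u) = x(u - \tau(u))$:
\[
  \| S_J x_\tau - S_J x \| \;\lesssim\; \|\nabla \tau\|_{\infty}\, \|x\| .
\]
\end{document}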
Key references:
Mallat, Stéphane. "Understanding deep convolutional networks." Philosophical Transactions of the Royal
Society A: Mathematical, Physical and Engineering Sciences 374.2065 (2016): 20150203.
Alaifari, Rima, Giovanni S. Alberti, and Tandri Gauksson. "ADef: an iterative algorithm to construct
adversarial deformations." arXiv preprint arXiv:1804.07729 (2018).
Dhillon, Guneet S., et al. "Stochastic activation pruning for robust adversarial defense." arXiv preprint
arXiv:1803.01442 (2018).
3. Learning Neural Networks
This chapter highlights different approaches to making the learning of networks tractable and to
uniquely identifying the network weights. The approach of Anandkumar et al. is based on tensor
decompositions, while Fornasier et al. use Hessian evaluations and optimization over matrix subspaces.
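Schematically, the structures the two approaches exploit can be sketched as follows for a simplified shallow model f(x) = \sum_i b_i \sigma(a_i \cdot x); the constants, the score functions, and the sampling schemes are simplified here and are detailed in the references below.

\documentclass{article}
\usepackage{amsmath, amssymb}
\begin{document}
% Tensor methods (Janzamin, Sedghi, Anandkumar): moments of the output $y$
% against score functions $\mathcal{S}_3$ of the input density form a low-rank
% tensor whose CP decomposition reveals the weight vectors $a_i$ (the constants
% $\lambda_i$ absorb $b_i$ and derivatives of $\sigma$):
\[
  \mathbb{E}\bigl[\, y \,\mathcal{S}_3(x) \,\bigr]
  \;=\; \sum_{i=1}^{m} \lambda_i\; a_i \otimes a_i \otimes a_i .
\]
% Hessian / matrix-subspace methods (Fornasier, Vybíral, Daubechies): Hessians
% of $f$ are linear combinations of the rank-one matrices $a_i a_i^{\top}$, so
% sampled Hessians span a matrix subspace from which the $a_i$ can be recovered
% by optimization over that subspace:
\[
  \nabla^2 f(x) \;=\; \sum_{i=1}^{m} b_i\, \sigma''(a_i^{\top} x)\,
  a_i a_i^{\top} \;\in\; \operatorname{span}\bigl\{ a_i a_i^{\top} :
  i = 1, \dots, m \bigr\}.
\]
\end{document}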
Key references:
Sedghi, Hanie, and Anima Anandkumar. "Provable methods for training neural networks with sparse
connectivity." arXiv preprint arXiv:1412.2693 (2014).
Janzamin, Majid, Hanie Sedghi, and Anima Anandkumar. "Beating the perils of non-convexity: Guaranteed
training of neural networks using tensor methods." arXiv preprint arXiv:1506.08473 (2015).
Fornasier, Massimo, Jan Vybíral, and Ingrid Daubechies. "Identification of Shallow Neural Networks by
Fewest Samples." arXiv preprint arXiv:1804.01592 (2018).