Selected Observations of Some Statistical Research in Neural Networks and Deep Learning
August 12, 2019
SAMSI Deep Learning workshop
Xiaoming Huo, Georgia Tech
This workshop
schedule applications open source formulation math statistics Discussion
August 12, 2019 SAMSI DL workshop 2
This SAMSI Workshop

Monday, August 12, 2019
Time | Description | Speaker
9:00-9:40am | A Survey of Statistical Research in Neural … | Xiaoming Huo, Georgia Institute of …
9:40-10:20am | Admissibility of Solution Estimators for Stochastic … | Amitabh Basu, Johns Hopkins University
10:50-11:30am | Statistical and Computational Guarantees | Harrison Zhou, Yale University
11:30am-12:10pm | Robust Information Bottleneck | Poh-Ling Loh, University of …
1:40-2:20pm | Towards Deep Learning: Understanding Statistical … | Jianqing Fan, Princeton University
2:20-3:00pm | Horseshoe Regularization for Machine Learning in … | Anindya Bhadra, Purdue University
3:30-4:10pm | Posterior Concentration for Sparse Deep Learning | Veronika Rockova, University of Chicago
4:10-4:50pm | Deep Compositional Spatial Models | Andrew Zammit Mangion, University …
4:50-5:30pm | Training DNN with Dynamic SMD | Shih-Kang Chao, University of …

Tuesday, August 13, 2019
Time | Description | Speaker
9:00-9:40am | On Adversarial Learning | Larry Carin, Duke University
9:40-10:20am | Deep Instrumental Variables Estimator | Ruiqi Liu, University of Indiana
10:50-11:30am | Improving Generative Models | Junier Oliva, University of North Carolina at Chapel Hill
11:30am-12:10pm | Learning to Solve Inverse Problems in Imaging | Rebecca Willett, University of Chicago
1:40-2:20pm | An Adaptively Weighted Stochastic Gradient MCMC | Faming Liang, Purdue University
2:20-3:00pm | Information Geometric and Topological Approaches to … | Wyatt Bridgman and Sorin Mitran, University of North …
3:30-4:10pm | Domain Adaptation Challenges in Genomics: a … | Bianca Dumitrascu, Princeton University and SAMSI
4:10-4:50pm | Neural Network Density Estimation | Deborshee Sen, Duke University and SAMSI
4:50-5:30pm | Complexity Bounds for Deep Learning Networks via the … | Jason Klusowski, Rutgers University
Workshop Schedule (2)

Wednesday, August 14, 2019
Time | Description | Speaker
9:00-9:40am | Statistical Inference for Online Decision Making | Rui Song, N.C. State University
9:40-10:20am | Modern Statistical Theory Inspired by Deep … | Guang Cheng, Purdue University
10:50-11:30am | Deep ReLU Networks Viewed as a Statistical … | Johannes Schmidt-Hieber, University of Twente
11:30am-12:10pm | ReLU regression: Complexity and … | Yao Xie, Georgia Institute of Technology
1:40-2:20pm | ProxSARAH Algorithms for Stochastic … | Quoc Tran-Dinh, University of North Carolina at Chapel Hill
2:20-3:00pm | Deep Models for Improved Topic … | Deanna Needell, UCLA
3:30-4:10pm | Group-equivariant Representation by … | Xiuyuan Cheng, Duke University
4:10-4:50pm | Optimization and Learning with … | Guanghui (George) Lan, Georgia Institute of …
Deep Learning
Applications and
Popularity
Open Source Software Overview
ArXiv:1803.04818v2 Zacharias et al.
• Rating (>10) based on GitHub metrics
Name | Website | GitHub URL | License | Language | APIs | Rating
TensorFlow | http://tens | tensorflow/t | Apache-2 | C++, Python | Python, C++, Ja | 100
Keras | http://kera | fchollet/kera | MIT | Python | Python, R | 46.1
Caffe | http://caffe | BVLC/caffe | BSD | C++ | Python, MATLA | 38.1
MXNet | http://mxn | apache/incu | Apache-2 | C++ | Python, Scala, R | 34
Theano | http://deep | Theano/The | BSD | Python | Python | 19.3
CNTK | https://doc | Microsoft/C | MIT | C++ | Python, C++, C# | 18.4
DeepLearning4J | https://dee | deeplearning | Apache-2 | Java, Scala | Java, Scala, Cloj | 17.8
PaddlePaddle | http://www | baidu/paddl | Apache-2 | C++ | C++ | 16.3
PyTorch | http://pyto | pytorch/pyto | BSD | C++, Python | Python | 14.3
Problem Formulation
Perceptron, the basic block
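As a rough illustration of the basic block named above, here is a minimal sketch of a single perceptron: a weighted sum of the inputs plus a bias, passed through a threshold. The step activation and the AND weights are assumptions chosen for the example; the slide itself does not specify them.

```python
def perceptron(x, w, b):
    """Return 1 if the weighted sum w.x + b is positive, else 0."""
    s = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if s > 0 else 0


# Hypothetical weights (not from the slide) that make the unit compute logical AND.
w_and, b_and = [1.0, 1.0], -1.5

# Evaluate on the four binary inputs (0,0), (0,1), (1,0), (1,1).
outputs = [perceptron([a, c], w_and, b_and) for a in (0, 1) for c in (0, 1)]
print(outputs)  # [0, 0, 0, 1]
```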
Multi-layer perceptron
Convolutional Neural Network
(CNN)
AlexNet (2012)
References:[90]
Human error (5.1%) surpassed in 2015
• AlexNet (2012): First CNN (15.4%)
  • 8 layers
  • 61 million parameters
• ZFNet (2013): 15.4% to 11.2%
  • 8 layers
  • More filters. Denser stride.
• VGGNet (2014): 11.2% to 7.3%
  • Beautifully uniform: 3x3 conv, stride 1, pad 1, 2x2 max pool
  • 16 layers
  • 138 million parameters
• GoogLeNet (2014): 11.2% to 6.7%
  • Inception modules
  • 22 layers
  • 5 million parameters (throw away fully connected layers)
• ResNet (2015): 6.7% to 3.57%
  • More layers = better performance
  • 152 layers
• CUImage (2016): 3.57% to 2.99%
  • Ensemble of 6 models
• SENet (2017): 2.99% to 2.251%
  • Squeeze and excitation block: network is allowed to adaptively adjust the weighting of each feature map in the convolutional block.
https://deeplearning.mit.edu 2019
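The "3x3 conv, stride 1, pad 1, 2x2 max pool" pattern listed for VGGNet can be checked with the standard output-size formula, (n + 2*pad - kernel) // stride + 1. The sketch below is only an illustration: the 224-pixel input and the pooling stride of 2 are assumptions, since the slide states neither.

```python
def out_size(n, kernel, stride, pad):
    """Spatial size after a convolution or pooling layer (floor division)."""
    return (n + 2 * pad - kernel) // stride + 1


n = 224  # assumed input resolution, not stated in the slide

# 3x3 convolution, stride 1, pad 1: the spatial size is preserved.
after_conv = out_size(n, kernel=3, stride=1, pad=1)

# 2x2 max pool with an assumed stride of 2: the spatial size is halved.
after_pool = out_size(after_conv, kernel=2, stride=2, pad=0)

print(after_conv, after_pool)  # 224 112
```

Because the convolution preserves resolution, only the pooling layers shrink the feature maps, which is one way to read "beautifully uniform".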
Depth as a function of year
Formulation
$\rho(W_3 \cdot \rho(W_2 \cdot \rho(W_1 \cdot x_0)))$
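The nested expression above can be read as a loop: multiply by a weight matrix, apply the activation, repeat. Here is a minimal sketch, assuming the activation is ReLU (the slide leaves it generic) and using small made-up weight matrices.

```python
def relu(v):
    """Elementwise ReLU; an assumed choice for the activation rho."""
    return [max(0.0, t) for t in v]


def matvec(W, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(w * t for w, t in zip(row, v)) for row in W]


def forward(x0, weights, rho=relu):
    """Compute rho(W_L . rho(... rho(W_1 . x0)))."""
    x = x0
    for W in weights:
        x = rho(matvec(W, x))
    return x


# Hypothetical weights, chosen only to make the arithmetic easy to follow.
W1 = [[1.0, -1.0], [0.5, 0.5]]
W2 = [[1.0, 1.0], [-1.0, 1.0]]
W3 = [[2.0, -1.0]]

y = forward([1.0, 2.0], [W1, W2, W3])
print(y)  # [1.5]
```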
Challenges
• Linear vs Non-linear deep models
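One way to see why the linear versus non-linear distinction matters: without an activation, stacked layers collapse into a single matrix, so depth adds no expressive power. The sketch below checks this on made-up weights and shows that inserting ReLU (an assumed activation) breaks the collapse.

```python
def matvec(W, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(w * t for w, t in zip(row, v)) for row in W]


def matmul(A, B):
    """Matrix-matrix product A @ B for matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def relu(v):
    return [max(0.0, t) for t in v]


# Hypothetical weights and input, for illustration only.
W1 = [[1.0, -1.0], [0.5, 0.5]]
W2 = [[1.0, 1.0], [-1.0, 1.0]]
x = [1.0, 2.0]

linear_deep = matvec(W2, matvec(W1, x))   # two linear layers
collapsed = matvec(matmul(W2, W1), x)     # one equivalent linear layer
nonlinear = matvec(W2, relu(matvec(W1, x)))  # ReLU between the layers

print(linear_deep, collapsed, nonlinear)
```

Here the two linear computations agree, while the ReLU version gives a different output because the negative hidden unit is zeroed out.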
Optimal approximation of …
• https://arxiv.org/abs/1709.05289
Mathematical
Foundation
Lecture Series
by Johannes
Schmidt-Hieber
https://triad.gatech.edu
A Statistical Perspective
Penalty Functions
Difference-of-convex programming…
Transdisciplinary
Research Institute
for Advancing
Data Science
Future work
Difference-of-Convex (DC) Offers a Unified
Framework to Study Concave Regularizations for
High-Dimensional Sparse Estimation
Adapting to DL…
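To make the difference-of-convex idea concrete, here is a small sketch using one familiar concave regularizer, the minimax concave penalty (MCP). The choice of MCP, the parameter values, and the function names are assumptions for illustration; the slide does not name a specific penalty. The penalty is written as g(t) - h(t), with g the L1 penalty and h a Huber-like convex function, and the code checks the identity and the convexity of h numerically on a grid.

```python
def mcp(t, lam, gamma):
    """Minimax concave penalty, a standard concave regularizer."""
    a = abs(t)
    if a <= gamma * lam:
        return lam * a - a * a / (2.0 * gamma)
    return 0.5 * gamma * lam * lam


def g(t, lam):
    """First convex part: the L1 penalty."""
    return lam * abs(t)


def h(t, lam, gamma):
    """Second convex part: quadratic near zero, linear in the tails."""
    a = abs(t)
    if a <= gamma * lam:
        return a * a / (2.0 * gamma)
    return lam * a - 0.5 * gamma * lam * lam


lam, gamma = 1.0, 3.0  # hypothetical tuning parameters
grid = [i / 10.0 for i in range(-60, 61)]

# Largest deviation between the penalty and its DC form g - h on the grid.
max_gap = max(abs(mcp(t, lam, gamma) - (g(t, lam) - h(t, lam, gamma))) for t in grid)

# Midpoint convexity of h at each interior grid point (small float tolerance).
convex_ok = all(
    h(grid[i], lam, gamma)
    <= 0.5 * (h(grid[i - 1], lam, gamma) + h(grid[i + 1], lam, gamma)) + 1e-12
    for i in range(1, len(grid) - 1)
)
print(max_gap < 1e-12, convex_ok)
```

Since both g and h are convex, the non-convex penalized problem becomes a difference of two convex functions, which is the structure DC programming methods work with.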
Discussion on Potential
Topics
1. Generative Adversarial Networks
Discussion on Potential
Topics
3. Other related topics
The user plays a central role in Interactive Machine Learning.
The user interface is a critical component of this interaction.
August 2018 HUST presentation
Concluding thoughts
1. Deep learning is a new frontier for theory; many new avenues (from SA)
2. Many opportunities for statistics-related research
3. This program will expedite new discoveries
Thank you!
huo@gatech.edu

Deep Learning Opening Workshop - A Survey of Statistical Research in Neural Networks and Deep Learning - Xiaoming Huo, August 12, 2019
