Weight Agnostic Neural Networks
2019.08.13
Yongsu Baek
yongsubaek@mli.kaist.ac.kr
Table of Contents
• Overview
• Motivation
• Related Work
• Architecture search
• Bayesian Neural Networks
• Algorithmic Information Theory (AIT)
• Network Pruning
• Neuroscience
• WANN
• Overview
• Topology Search
• Performance and Complexity
• Results
• Continuous control tasks
• Image Classification
• Discussion
Overview
• Not gradient-based <-> gradient-based training
• Architecture search only <-> weight parameter training
• Evolution <-> rearrangement / pruning
Motivation - Biology
• “In biology, precocial species are those whose young already possess certain
abilities from the moment of birth.”
Motivation – Deep learning
• Network Structure - "Strong inductive biases"
• Convolutional networks [2], [3]
• LSTM [4]
Goal
• Weight Agnostic Neural Network
• Architectures with "strong inductive biases"
• can already perform various tasks with random weights.
• Find network structures that can already perform tasks well without any weight training!
• By deemphasizing the importance of weights
1) Single shared weight
2) Evaluation over a wide range of values for the single weight parameter
• Novel neural network building blocks
Related Work
• Architecture search
• Bayesian Neural Networks
• Algorithmic Information Theory (AIT)
• Network Pruning
• Neuroscience
Architecture Search
• Evolutionary computing
• Topology Search Algorithm - NEAT [5]
• NAS
• Basic building blocks with strong domain priors: CNNs, recurrent cells, self-attention
• Weight-training inner loop -> slow
• Once trained, the discovered architectures outperform human-designed ones
• WANN
• Creating network architectures which encode solutions
• No training inner loop
• The solution is innate to the structure
Bayesian Neural Networks
• Weight parameters sampled from learned distribution
• Variance Network [6]
• Weights sampled from a zero-mean distribution with parameterized variance
• conventional BNNs naturally converge to zero-mean posteriors
• Ensemble evaluation
• WANN
• sampling weights from a fixed uniform distribution with zero mean
• evaluating performance on network ensembles
Variance Networks: When Expectation Does Not Meet Your Expectations, K. Neklyudov et al., ICLR 2019
Algorithmic Information Theory (AIT)
• Kolmogorov complexity
• The length of the shortest program that can compute a given object
• Occam’s razor
• Simplifying neural networks by soft weight-sharing [7]
• reducing the amount of information in weights by making them noisy, and simplifying
the search space
• WANN
• finding minimal architectures
• Weight-sharing to the entire network (AIT)
• The weight as a random variable sampled from a fixed distribution (BNN)
Simplifying Neural Networks by Soft Weight-Sharing, S.J. Nowlan, G.E. Hinton, 1992
Network Pruning
• starts with a full, trained network, and takes away connections
• Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask (2019)
• pruned networks w/ randomly initialized weights
• WANN
• complementary to pruning
• does not require prior training
• no upper bound on the network’s complexity
Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask, H. Zhou, J. Lan, R. Liu, J. Yosinski, 2019
Neuroscience
• connectome
• “wiring diagram” of all neural connections
• organisms keep forming new synaptic connections and rewiring
• analyzed using graph theory
• WANN
• aims to learn network graphs that can encode skills and knowledge
• ever-growing networks
• small enough to be analyzed
WANN
• Weight of WANN
• Searching Method
• Topology Search
• Performance and Complexity
Weight of WANN
• Architectures themselves should encode solutions
• The importance of weights must be minimized
• Sampling individual weights: impractical (curse of dimensionality)
• Weight-sharing: efficient and keeps the search tractable
• -> a single shared weight sampled from a fixed distribution
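As a sketch, evaluating a feed-forward WANN in which every connection carries the same shared weight value could look like this (the node/connection representation here is our own assumption, not the paper's code):

```python
import math

def eval_wann(nodes, connections, inputs, w):
    """nodes: list of (node_id, activation) in topological order, input nodes first.
    connections: list of (src, dst) edges; all of them share the single weight w."""
    value = {}
    for (nid, _), x in zip(nodes, inputs):
        value[nid] = x                      # input nodes pass their inputs through
    for nid, act in nodes[len(inputs):]:
        # every incoming edge contributes shared_weight * source_value
        total = sum(w * value[src] for src, dst in connections if dst == nid)
        value[nid] = act(total)
    return value

# tiny example: two inputs feeding one tanh output node
nodes = [(0, None), (1, None), (2, math.tanh)]
conns = [(0, 2), (1, 2)]
out = eval_wann(nodes, conns, [1.0, 1.0], 1.0)   # out[2] == tanh(2.0)
```

Because only the single value `w` changes between rollouts, the same topology can be re-evaluated cheaply across the whole weight series.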
Searching Method
1. An initial population of minimal neural network topologies is created.
2. Each network is evaluated over multiple rollouts, with a different shared weight
value assigned at each rollout.
Searching Method
3. Networks are ranked according to their performance and complexity.
4. A new population is created by varying the highest-ranked network topologies,
chosen probabilistically through tournament selection.
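Steps 1-4 can be sketched as a minimal outer loop. `evaluate`, `mutate`, and the network representation are assumed interfaces, and for brevity the ranking below uses only mean performance (the complexity objective is omitted):

```python
import random

WEIGHT_VALUES = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]

def search(population, evaluate, mutate, generations=50):
    """evaluate(net, w) -> score of one rollout with shared weight w;
    mutate(net) -> varied copy of a topology. Population size must be >= 3."""
    for _ in range(generations):
        # step 2: one rollout per shared weight value, averaged per network
        mean_score = {i: sum(evaluate(net, w) for w in WEIGHT_VALUES) / len(WEIGHT_VALUES)
                      for i, net in enumerate(population)}
        # step 3: rank networks by mean performance (complexity omitted here)
        ranked = sorted(range(len(population)), key=mean_score.get, reverse=True)
        # step 4: tournament selection over the ranking, then variation
        parents = [min(random.sample(ranked, 3), key=ranked.index)
                   for _ in population]
        population = [mutate(population[i]) for i in parents]
    return population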
Topology Search
• NEAT [5]
• one of three ways:
1) Insert Node
2) Add Connection
3) Change Activation
• Feed-forward network
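The three mutation operators above can be sketched as follows. The network representation (`order`, `act`, `conns`, `n_in`) is our own hypothetical encoding, chosen so the graph stays feed-forward by construction:

```python
import copy
import math
import random

ACTIVATIONS = [math.tanh, math.sin, abs, lambda x: x, lambda x: max(0.0, x)]

def mutate(net):
    """net: {'order': node ids in feed-forward order, 'act': id -> activation,
    'conns': set of (src, dst) edges, 'n_in': number of input nodes}."""
    net = copy.deepcopy(net)
    op = random.choice(['insert_node', 'add_connection', 'change_activation'])
    if op == 'insert_node' and net['conns']:
        # 1) split an existing connection with a new hidden node
        src, dst = random.choice(sorted(net['conns']))
        new_id = max(net['act']) + 1
        net['conns'].discard((src, dst))
        net['conns'] |= {(src, new_id), (new_id, dst)}
        net['act'][new_id] = random.choice(ACTIVATIONS)
        net['order'].insert(net['order'].index(dst), new_id)  # stay feed-forward
    elif op == 'add_connection':
        # 2) connect an earlier node to a later one so the graph stays acyclic
        i, j = sorted(random.sample(range(len(net['order'])), 2))
        net['conns'].add((net['order'][i], net['order'][j]))
    else:
        # 3) reassign the activation of a random non-input node
        nid = random.choice(net['order'][net['n_in']:])
        net['act'][nid] = random.choice(ACTIVATIONS)
    return net
```

Note that no mutation touches weight values: the search space is topologies and activations only.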
Evolving Neural Networks Through Augmenting Topologies, K.O. Stanley, R. Miikkulainen, 2002
Performance and Complexity
• evaluated using several shared weight values
• fixed series of weight values [-2, -1, -0.5, +0.5, +1, +2]
• mean performance
• Prefer simpler networks (AIT)
• multi-objective optimization problem:
• mean performance over all weight values
• max performance of the single best weight value
• the number of connections in the network
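One way to compare two candidates on these three objectives is Pareto dominance; a minimal sketch (the paper ranks with a multi-objective scheme, but its exact algorithm is not reproduced here):

```python
def objectives(scores, n_connections):
    # maximize mean and best-case performance, minimize connection count
    return (sum(scores) / len(scores), max(scores), -n_connections)

def dominates(a, b):
    """a, b: (scores over the shared-weight series, number of connections).
    True iff a is at least as good on every objective and better on one."""
    oa, ob = objectives(*a), objectives(*b)
    return all(x >= y for x, y in zip(oa, ob)) and oa != ob

# a small network with better scores dominates a large network with worse ones
small_good = ([1.0, 2.0, 1.5], 5)
big_bad = ([0.5, 1.0, 0.8], 10)
```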
Experimental Results
• Continuous Control
• CartPoleSwingUp
• BipedalWalker-v2
• CarRacing-v0
• Image Classification
• MNIST
Experiment
1. Random weights: individual weights drawn from U(-2, 2)
2. Random shared weight: a single shared weight drawn from U(-2, 2)
3. Tuned shared weight: the highest-performing shared weight value in the range (-2, 2)
4. Tuned weights: individual weights tuned using population-based REINFORCE
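Setups 1-3 can be sketched as small helpers (setup 4, population-based REINFORCE, is omitted; the function names and the grid sweep are our own assumptions, not the paper's):

```python
import random

WEIGHT_RANGE = (-2.0, 2.0)

def random_weights(n_conns):
    # setup 1: every connection gets its own random weight
    return [random.uniform(*WEIGHT_RANGE) for _ in range(n_conns)]

def random_shared_weight(n_conns):
    # setup 2: one random value shared by all connections
    w = random.uniform(*WEIGHT_RANGE)
    return [w] * n_conns

def tuned_shared_weight(n_conns, evaluate, grid=41):
    # setup 3: sweep the range, keep the best-performing single value
    lo, hi = WEIGHT_RANGE
    candidates = [lo + (hi - lo) * i / (grid - 1) for i in range(grid)]
    best = max(candidates, key=lambda ws=None, w=None: 0) if False else \
        max(candidates, key=lambda w: evaluate([w] * n_conns))
    return [best] * n_conns
```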
Continuous Control
• CartPoleSwingUp
• Cannot be solved with a linear controller
Continuous Control
• BipedalWalker-v2
• non-trivial number of possible connections
• 210 connections (SOTA: 2804 connections)
Continuous Control
• CarRacing-v0
• pre-trained VAE to compress the pixel representation
• No pre-trained RNN hidden states are used
Continuous Control Results
• WANNs are not completely independent of the weight values
• A single shared weight makes tuning easy
Classification
• Ensemble evaluation: predictions from multiple shared weight values are combined by voting
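A sketch of this voting scheme: one topology instantiated at several shared weight values acts as an ensemble. `forward(x, w) -> list of class scores` is an assumed interface:

```python
WEIGHT_VALUES = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]

def ensemble_predict(forward, x):
    votes = []
    for w in WEIGHT_VALUES:
        scores = forward(x, w)
        votes.append(scores.index(max(scores)))   # each weight value casts one vote
    return max(set(votes), key=votes.count)       # majority class wins
```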
Discussion and Future Work
• A method to search for simple neural networks
• Fine-tune
• Few-shot learning
• Continual lifelong learning
• Multitask
• Supermask [8]
• reaches a similar range of performance
• performs architecture search in a differentiable manner
Discussion
• Contributions?
• Isolates the influence of the network structure alone
• Shows how much performance simple neural networks can reach
• WANNs themselves do not look practically useful
• The single shared weight induces a bias in the discovered structures
• Worth studying as an optimization method for discovering structures
Thank you!
Any questions?
References
1) Gaier, A. and Ha, D. Weight Agnostic Neural Networks. arXiv preprint arXiv:1906.04358, 2019.
2) He, K., Wang, Y. and Hopcroft, J. A Powerful Generative Model Using Random Weights for the Deep Image Representation. Advances in Neural Information Processing Systems, pp. 631–639, 2016.
3) Ulyanov, D., Vedaldi, A. and Lempitsky, V. Deep Image Prior. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9446–9454, 2018.
4) Schmidhuber, J., Wierstra, D., Gagliolo, M. and Gomez, F. Training Recurrent Networks by Evolino. Neural Computation, Vol. 19(3), pp. 757–779, MIT Press, 2007.
5) Stanley, K.O. and Miikkulainen, R. Evolving Neural Networks Through Augmenting Topologies. Evolutionary Computation, Vol. 10(2), pp. 99–127, MIT Press, 2002.
6) Neklyudov, K., Molchanov, D., Ashukha, A. and Vetrov, D. Variance Networks: When Expectation Does Not Meet Your Expectations. International Conference on Learning Representations (ICLR), 2019.
7) Nowlan, S.J. and Hinton, G.E. Simplifying Neural Networks by Soft Weight-Sharing. Neural Computation, Vol. 4(4), pp. 473–493, MIT Press, 1992.
8) Zhou, H., Lan, J., Liu, R. and Yosinski, J. Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask. arXiv preprint arXiv:1905.01067, 2019.

More Related Content

What's hot

Efficient Neural Network Architecture for Image Classfication
Efficient Neural Network Architecture for Image ClassficationEfficient Neural Network Architecture for Image Classfication
Efficient Neural Network Architecture for Image Classfication
Yogendra Tamang
 
Optimum Relay Node Selection in Clustered MANET
Optimum Relay Node Selection in Clustered MANETOptimum Relay Node Selection in Clustered MANET
Optimum Relay Node Selection in Clustered MANET
IRJET Journal
 
3D 딥러닝 동향
3D 딥러닝 동향3D 딥러닝 동향
3D 딥러닝 동향
NAVER Engineering
 
Basics of Artificial Neural Network
Basics of Artificial Neural Network Basics of Artificial Neural Network
Basics of Artificial Neural Network
Subham Preetam
 
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
Fellowship at Vodafone FutureLab
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)
ijceronline
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
IJERD Editor
 
Visualizaing and understanding convolutional networks
Visualizaing and understanding convolutional networksVisualizaing and understanding convolutional networks
Visualizaing and understanding convolutional networks
SungminYou
 
CONTENT BASED VIDEO CATEGORIZATION USING RELATIONAL CLUSTERING WITH LOCAL SCA...
CONTENT BASED VIDEO CATEGORIZATION USING RELATIONAL CLUSTERING WITH LOCAL SCA...CONTENT BASED VIDEO CATEGORIZATION USING RELATIONAL CLUSTERING WITH LOCAL SCA...
CONTENT BASED VIDEO CATEGORIZATION USING RELATIONAL CLUSTERING WITH LOCAL SCA...
ijcsit
 
Optimized Neural Network for Classification of Multispectral Images
Optimized Neural Network for Classification of Multispectral ImagesOptimized Neural Network for Classification of Multispectral Images
Optimized Neural Network for Classification of Multispectral Images
IDES Editor
 
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
Fellowship at Vodafone FutureLab
 
Poster_Reseau_Neurones_Journees_2013
Poster_Reseau_Neurones_Journees_2013Poster_Reseau_Neurones_Journees_2013
Poster_Reseau_Neurones_Journees_2013
Pedro Lopes
 
Introduction to Neural networks (under graduate course) Lecture 9 of 9
Introduction to Neural networks (under graduate course) Lecture 9 of 9Introduction to Neural networks (under graduate course) Lecture 9 of 9
Introduction to Neural networks (under graduate course) Lecture 9 of 9
Randa Elanwar
 
Ch 1-1 introduction
Ch 1-1 introductionCh 1-1 introduction
Ch 1-1 introduction
Zahra Amini
 

What's hot (14)

Efficient Neural Network Architecture for Image Classfication
Efficient Neural Network Architecture for Image ClassficationEfficient Neural Network Architecture for Image Classfication
Efficient Neural Network Architecture for Image Classfication
 
Optimum Relay Node Selection in Clustered MANET
Optimum Relay Node Selection in Clustered MANETOptimum Relay Node Selection in Clustered MANET
Optimum Relay Node Selection in Clustered MANET
 
3D 딥러닝 동향
3D 딥러닝 동향3D 딥러닝 동향
3D 딥러닝 동향
 
Basics of Artificial Neural Network
Basics of Artificial Neural Network Basics of Artificial Neural Network
Basics of Artificial Neural Network
 
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
 
Visualizaing and understanding convolutional networks
Visualizaing and understanding convolutional networksVisualizaing and understanding convolutional networks
Visualizaing and understanding convolutional networks
 
CONTENT BASED VIDEO CATEGORIZATION USING RELATIONAL CLUSTERING WITH LOCAL SCA...
CONTENT BASED VIDEO CATEGORIZATION USING RELATIONAL CLUSTERING WITH LOCAL SCA...CONTENT BASED VIDEO CATEGORIZATION USING RELATIONAL CLUSTERING WITH LOCAL SCA...
CONTENT BASED VIDEO CATEGORIZATION USING RELATIONAL CLUSTERING WITH LOCAL SCA...
 
Optimized Neural Network for Classification of Multispectral Images
Optimized Neural Network for Classification of Multispectral ImagesOptimized Neural Network for Classification of Multispectral Images
Optimized Neural Network for Classification of Multispectral Images
 
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
 
Poster_Reseau_Neurones_Journees_2013
Poster_Reseau_Neurones_Journees_2013Poster_Reseau_Neurones_Journees_2013
Poster_Reseau_Neurones_Journees_2013
 
Introduction to Neural networks (under graduate course) Lecture 9 of 9
Introduction to Neural networks (under graduate course) Lecture 9 of 9Introduction to Neural networks (under graduate course) Lecture 9 of 9
Introduction to Neural networks (under graduate course) Lecture 9 of 9
 
Ch 1-1 introduction
Ch 1-1 introductionCh 1-1 introduction
Ch 1-1 introduction
 

Similar to Weight Agnostic Neural Networks

Neural networks1
Neural networks1Neural networks1
Neural networks1
Mohan Raj
 
[RSS2023] Local Object Crop Collision Network for Efficient Simulation
[RSS2023] Local Object Crop Collision Network for Efficient Simulation[RSS2023] Local Object Crop Collision Network for Efficient Simulation
[RSS2023] Local Object Crop Collision Network for Efficient Simulation
DongwonSon1
 
Lecture_04_Supervised_Pretraining.pptx
Lecture_04_Supervised_Pretraining.pptxLecture_04_Supervised_Pretraining.pptx
Lecture_04_Supervised_Pretraining.pptx
GauravGautam216125
 
NEURAL NETWORK IN MACHINE LEARNING FOR STUDENTS
NEURAL NETWORK IN MACHINE LEARNING FOR STUDENTSNEURAL NETWORK IN MACHINE LEARNING FOR STUDENTS
NEURAL NETWORK IN MACHINE LEARNING FOR STUDENTS
hemasubbu08
 
モデル高速化百選
モデル高速化百選モデル高速化百選
モデル高速化百選
Yusuke Uchida
 
Cvpr 2018 papers review (efficient computing)
Cvpr 2018 papers review (efficient computing)Cvpr 2018 papers review (efficient computing)
Cvpr 2018 papers review (efficient computing)
DonghyunKang12
 
ANN load forecasting
ANN load forecastingANN load forecasting
ANN load forecasting
Dr Ashok Tiwari
 
Exploring Randomly Wired Neural Networks for Image Recognition
Exploring Randomly Wired Neural Networks for Image RecognitionExploring Randomly Wired Neural Networks for Image Recognition
Exploring Randomly Wired Neural Networks for Image Recognition
Yongsu Baek
 
network mining and representation learning
network mining and representation learningnetwork mining and representation learning
network mining and representation learning
sun peiyuan
 
artificialneuralnetwork-130409001108-phpapp02 (2).pptx
artificialneuralnetwork-130409001108-phpapp02 (2).pptxartificialneuralnetwork-130409001108-phpapp02 (2).pptx
artificialneuralnetwork-130409001108-phpapp02 (2).pptx
REG83NITHYANANTHANN
 
POSTER
POSTERPOSTER
character_ANN.ppt
character_ANN.pptcharacter_ANN.ppt
character_ANN.ppt
Harsh480253
 
Artificial Neural Networks presentations
Artificial Neural Networks presentationsArtificial Neural Networks presentations
Artificial Neural Networks presentations
migob991
 
Deep Neural Networks.pptx
Deep Neural Networks.pptxDeep Neural Networks.pptx
Deep Neural Networks.pptx
YashPaul20
 
DLD_WeightSharing_Slide
DLD_WeightSharing_SlideDLD_WeightSharing_Slide
DLD_WeightSharing_Slide
Kang-Ho Lee
 
20191107 deeplearningapproachesfornetworks
20191107 deeplearningapproachesfornetworks20191107 deeplearningapproachesfornetworks
20191107 deeplearningapproachesfornetworks
tm1966
 
Introduction to Neural Network
Introduction to Neural NetworkIntroduction to Neural Network
Introduction to Neural Network
Yan Xu
 
Computer vision-nit-silchar-hackathon
Computer vision-nit-silchar-hackathonComputer vision-nit-silchar-hackathon
Computer vision-nit-silchar-hackathon
Aditya Bhattacharya
 
Network recasting
Network recastingNetwork recasting
Network recasting
NAVER Engineering
 
FINAL_Team_4.pptx
FINAL_Team_4.pptxFINAL_Team_4.pptx
FINAL_Team_4.pptx
nitin571047
 

Similar to Weight Agnostic Neural Networks (20)

Neural networks1
Neural networks1Neural networks1
Neural networks1
 
[RSS2023] Local Object Crop Collision Network for Efficient Simulation
[RSS2023] Local Object Crop Collision Network for Efficient Simulation[RSS2023] Local Object Crop Collision Network for Efficient Simulation
[RSS2023] Local Object Crop Collision Network for Efficient Simulation
 
Lecture_04_Supervised_Pretraining.pptx
Lecture_04_Supervised_Pretraining.pptxLecture_04_Supervised_Pretraining.pptx
Lecture_04_Supervised_Pretraining.pptx
 
NEURAL NETWORK IN MACHINE LEARNING FOR STUDENTS
NEURAL NETWORK IN MACHINE LEARNING FOR STUDENTSNEURAL NETWORK IN MACHINE LEARNING FOR STUDENTS
NEURAL NETWORK IN MACHINE LEARNING FOR STUDENTS
 
モデル高速化百選
モデル高速化百選モデル高速化百選
モデル高速化百選
 
Cvpr 2018 papers review (efficient computing)
Cvpr 2018 papers review (efficient computing)Cvpr 2018 papers review (efficient computing)
Cvpr 2018 papers review (efficient computing)
 
ANN load forecasting
ANN load forecastingANN load forecasting
ANN load forecasting
 
Exploring Randomly Wired Neural Networks for Image Recognition
Exploring Randomly Wired Neural Networks for Image RecognitionExploring Randomly Wired Neural Networks for Image Recognition
Exploring Randomly Wired Neural Networks for Image Recognition
 
network mining and representation learning
network mining and representation learningnetwork mining and representation learning
network mining and representation learning
 
artificialneuralnetwork-130409001108-phpapp02 (2).pptx
artificialneuralnetwork-130409001108-phpapp02 (2).pptxartificialneuralnetwork-130409001108-phpapp02 (2).pptx
artificialneuralnetwork-130409001108-phpapp02 (2).pptx
 
POSTER
POSTERPOSTER
POSTER
 
character_ANN.ppt
character_ANN.pptcharacter_ANN.ppt
character_ANN.ppt
 
Artificial Neural Networks presentations
Artificial Neural Networks presentationsArtificial Neural Networks presentations
Artificial Neural Networks presentations
 
Deep Neural Networks.pptx
Deep Neural Networks.pptxDeep Neural Networks.pptx
Deep Neural Networks.pptx
 
DLD_WeightSharing_Slide
DLD_WeightSharing_SlideDLD_WeightSharing_Slide
DLD_WeightSharing_Slide
 
20191107 deeplearningapproachesfornetworks
20191107 deeplearningapproachesfornetworks20191107 deeplearningapproachesfornetworks
20191107 deeplearningapproachesfornetworks
 
Introduction to Neural Network
Introduction to Neural NetworkIntroduction to Neural Network
Introduction to Neural Network
 
Computer vision-nit-silchar-hackathon
Computer vision-nit-silchar-hackathonComputer vision-nit-silchar-hackathon
Computer vision-nit-silchar-hackathon
 
Network recasting
Network recastingNetwork recasting
Network recasting
 
FINAL_Team_4.pptx
FINAL_Team_4.pptxFINAL_Team_4.pptx
FINAL_Team_4.pptx
 

Recently uploaded

Nanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdfNanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdf
eddie19851
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
oz8q3jxlp
 
State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023
kuntobimo2016
 
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdfEnhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
GetInData
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
ahzuo
 
Challenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more importantChallenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more important
Sm321
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
u86oixdj
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
nyfuhyz
 
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
sameer shah
 
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
Social Samosa
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
roli9797
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
manishkhaire30
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
TravisMalana
 
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
bopyb
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
g4dpvqap0
 
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
mzpolocfi
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
Walaa Eldin Moustafa
 

Recently uploaded (20)

Nanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdfNanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdf
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
 
State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023
 
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdfEnhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
 
Challenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more importantChallenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more important
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
 
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
 
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
 
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
 
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
 

Weight Agnostic Neural Networks

  • 1. Weight Agnostic Neural Networks 2019.08.13 Yongsu Baek yongsubaek@mli.kaist.ac.kr
  • 2. Table of Contents • Overview • Motivation • Related Work • Architecture search • Bayesian Neural Networks • Algorithmic Information Theory(AIT) • Network Pruning • Neuroscience • WANN • Overview • Topology Search • Performance and Complexity • Results • Continuous control tasks • Image Classification • Discussion
  • 3. Overview • Not gradient-based <-> Gradient based • Only architecture <-> Weight parameter training • Evolution <-> Rearrangement, Pruning
  • 4. Motivation - Biology • “In biology, precocial species are those whose young already possess certain abilities from the moment of birth.”
  • 5. Motivation – Deep learning • Network Structure - "Strong inductive biases" • Convolutional networks [2], [3] • LSTM [4] 5
  • 6. Goal • Weight Agnostic Neural Network • Architectures with "strong inductive biases" • can already perform various tasks with random weights. • Weight 학습 없이도 충분히 task를 수행할 수 있는 Network structure를 찾아보 자! • By deemphasizing the importance of weights 1) Single shared weight 2) Evaluation on a wide range of single weight parameter • Novel neural network building blocks 6
  • 7. Related Work • Architecture search • Bayesian Neural Networks • Algorithmic Information Theory(AIT) • Network Pruning • Neuroscience 7
  • 8. Architecture Search • Evolutionary computing • Topology Search Algorithm - NEAT [5] • NAS • Basic building blocks with strong domain priors – CNNs, recurrent cells, self attention • Weight Training inner loop -> Slow • Architectures, once trained, outperform human-designed one • WANN • Creating network architectures which encode solutions • No training inner loop • The solution is innate to the structure 8
  • 9. Bayesian Neural Networks • Weight parameters sampled from learned distribution • Variance Network [6] • Sampled from Zero-mean, parameterized variance distribution • conventional BNNs naturally converge to zero-mean posteriors • Ensemble evaluation • WANN • sampling weights from a fixed uniform distribution with zero mean • evaluating performance on network ensembles 9 Variance Networks: When Expectation Does Not Meet Your Expectations., K. Neklyudov
  • 10. Algorithmic Information Theory(AIT) • Kolmogorov complexity • The minimum length of the program that can compute it • Occam’s razor • Simplifying neural networks by soft weight-sharing [7] • reducing the amount of information in weights by making them noisy, and simplifying the search space • WANN • finding minimal architectures • Weight-sharing to the entire network (AIT) • The weight as a rv sampled from a fixed distribution (BNN) 10 Simplifying neural networks by soft weight-sharing, S.J. Nowlan, G.E. Hinton., 1992
  • 11. Network Pruning • starts with a full, trained network, and takes away connections • Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask (2019) • pruned networks w/ randomly initialized weights • WANN • complementary to pruning • does not require prior training • no upper bound on the network’s complexity 11 Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask , H. Zhou, J. Lan, R. Liu, J. Yosinski.
  • 12. Neuroscience • connectome • “wiring diagram” of all neural connections • forming new synaptic connections and rewire • analyzed using graph theory • WANN • aims to learn network graphs that can encode skills and knowledge • ever-growing networks • small enough to be analyzed 12
  • 13. WANN • Weight of WANN • Searching Method • Topology Search • Performance and Complexity 13
  • 14. Weight of WANN • Architectures themselves encode solutions • The importance of weights must be minimized • Weight sampling • The curse of dimensionality • Weight-sharing • Efficient and tractable • A single shared weight sampled from a fixed distribution 14
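As a rough sketch (the matrix representation and all names here are illustrative, not the paper's implementation), evaluating a fixed topology with one shared weight just broadcasts that single value to every enabled connection:

```python
import numpy as np

def eval_with_shared_weight(adjacency, activations, x, w):
    """Forward pass through a feed-forward WANN-style graph in which every
    enabled connection uses the same shared weight value w.

    adjacency:   (n, n) 0/1 matrix; entry [i, j] = 1 if node i feeds node j
                 (nodes assumed topologically sorted, inputs first)
    activations: list of n activation functions, one per node
    x:           input values for the first len(x) nodes
    """
    n = adjacency.shape[0]
    values = np.zeros(n)
    values[:len(x)] = x
    for j in range(len(x), n):                     # hidden/output nodes in order
        pre = w * np.dot(adjacency[:, j], values)  # one shared weight everywhere
        values[j] = activations[j](pre)
    return values

# Tiny example: 2 inputs -> 1 tanh hidden node -> 1 linear output
adj = np.array([[0, 0, 1, 0],
                [0, 0, 1, 0],
                [0, 0, 0, 1],
                [0, 0, 0, 0]])
acts = [None, None, np.tanh, lambda z: z]
out = eval_with_shared_weight(adj, acts, x=[1.0, -0.5], w=1.0)
```

Because the whole network depends on a single scalar w, sweeping w is cheap compared with tuning one weight per connection.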
  • 15. Searching Method 1. An initial population of minimal neural network topologies is created. 2. Each network is evaluated over multiple rollouts, with a different shared weight value assigned at each rollout. 15
  • 16. Searching Method 3. Networks are ranked according to their performance and complexity. 4. A new population is created by varying the highest ranked network topologies, chosen probabilistically through tournament selection 16
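Putting the four steps together, a minimal skeleton of the search loop might look like this (population size, elite fraction, and the Net/rollout stand-ins are placeholders, not the paper's hyperparameters):

```python
import random

WEIGHT_VALUES = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]  # one shared weight per rollout

def search(minimal_topology, mutate, rollout, generations=10, pop_size=16):
    # 1) initial population of minimal topologies
    population = [minimal_topology() for _ in range(pop_size)]
    for _ in range(generations):
        scored = []
        for net in population:
            # 2) evaluate over multiple rollouts, a different shared weight each
            rewards = [rollout(net, w) for w in WEIGHT_VALUES]
            mean_perf = sum(rewards) / len(rewards)
            # 3) rank by performance first, then by (lower) complexity
            scored.append((mean_perf, -net.num_connections(), net))
        scored.sort(key=lambda s: (s[0], s[1]), reverse=True)
        elites = [net for *_, net in scored[:max(1, pop_size // 4)]]
        # 4) new population by varying tournament-selected top networks
        population = [mutate(random.choice(elites)) for _ in range(pop_size)]
    return scored[0][2]  # best network of the last evaluated generation

# Smoke test with a toy "network" whose reward does not depend on its weights
class Net:
    def __init__(self, conns=1): self.conns = conns
    def num_connections(self): return self.conns

best = search(lambda: Net(),
              mutate=lambda n: Net(max(1, n.conns + random.choice([-1, 1]))),
              rollout=lambda n, w: -abs(w))
```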
  • 17. Topology Search • NEAT [5] • Networks are modified in one of three ways: 1) Insert Node 2) Add Connection 3) Change Activation • Feed-forward networks only 17 Evolving neural networks through augmenting topologies, K.O. Stanley, R. Miikkulainen.
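A minimal sketch of the three topology operators (the dict representation is hypothetical; real NEAT also tracks innovation numbers and a topological ordering, both omitted here):

```python
import random

ACTIVATIONS = ["linear", "step", "sin", "gaussian", "tanh", "sigmoid",
               "inverse", "abs", "relu", "cos"]  # WANN-style activation pool

def insert_node(net):
    """Split an existing connection (a, b) into (a, new) and (new, b)."""
    a, b = random.choice(list(net["conns"]))
    new = max(net["nodes"]) + 1
    net["nodes"][new] = random.choice(ACTIVATIONS)
    net["conns"].remove((a, b))
    net["conns"].update([(a, new), (new, b)])

def add_connection(net):
    """Connect two previously unconnected nodes."""
    a, b = sorted(random.sample(sorted(net["nodes"]), 2))
    if (a, b) not in net["conns"]:
        net["conns"].add((a, b))

def change_activation(net):
    """Reassign a random node's activation function."""
    node = random.choice(list(net["nodes"]))
    net["nodes"][node] = random.choice(ACTIVATIONS)

# Start from a minimal net: 2 inputs wired directly to 1 output
net = {"nodes": {0: "linear", 1: "linear", 2: "tanh"},
       "conns": {(0, 2), (1, 2)}}
insert_node(net)
add_connection(net)
change_activation(net)
```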
  • 18. Performance and Complexity • Networks are evaluated using several shared weight values • Fixed series of weight values: [-2, -1, -0.5, +0.5, +1, +2] • Mean performance • Prefer simpler networks (AIT) • Multi-objective optimization problem: • mean performance over all weight values • max performance of the single best weight value • the number of connections in the network 18
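The three ranking objectives for one candidate can be sketched as follows (Toy and the rollout lambda are stand-ins for a real network and episode return):

```python
WEIGHT_SERIES = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]  # fixed shared-weight values

def objectives(net, rollout):
    """The three quantities each candidate is ranked on in the
    multi-objective search: mean performance over all weight values,
    best performance of a single weight value, and connection count."""
    scores = [rollout(net, w) for w in WEIGHT_SERIES]
    return (sum(scores) / len(scores),  # mean performance (maximize)
            max(scores),                # best single weight (maximize)
            net.num_connections())      # complexity (minimize)

class Toy:  # stand-in network
    def num_connections(self):
        return 7

mean_p, best_p, conns = objectives(Toy(), rollout=lambda net, w: w * w)
```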
  • 19. Experimental Results • Continuous Control • CartPoleSwingUp • BipedalWalker-v2 • CarRacing-v0 • Image Classification • MNIST 19
  • 20. Experiment 1. Random weights: individual weights drawn from 𝑈(−2,2) 2. Random shared weight: a single shared weight drawn from 𝑈(−2,2) 3. Tuned shared weight: the highest performing shared weight value in range (-2,2) 4. Tuned weights: individual weights tuned using population-based REINFORCE 20
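A sketch of how the first three weight conditions could be instantiated (function names and the grid sweep are illustrative; condition 4 would additionally run population-based REINFORCE per weight, omitted here):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_weights(n_conns):
    """1) Each connection gets its own weight drawn from U(-2, 2)."""
    return rng.uniform(-2.0, 2.0, size=n_conns)

def random_shared_weight(n_conns):
    """2) One value drawn from U(-2, 2), shared by every connection."""
    return np.full(n_conns, rng.uniform(-2.0, 2.0))

def tuned_shared_weight(n_conns, evaluate, grid_size=41):
    """3) Best single shared value found by sweeping the range (-2, 2)."""
    grid = np.linspace(-2.0, 2.0, grid_size)
    best = max(grid, key=lambda w: evaluate(np.full(n_conns, w)))
    return np.full(n_conns, best)

# e.g. if performance happens to peak at a shared weight of 1.0, the sweep finds it
w3 = tuned_shared_weight(5, evaluate=lambda w: -abs(w.mean() - 1.0))
```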
  • 21. Continuous Control 21 • CartPoleSwingUp • Cannot be solved with a linear controller
  • 22. Continuous Control • BipedalWalker-v2 • non-trivial number of possible connections • 210 connections (SOTA: 2804 connections) 22
  • 23. Continuous Control • CarRacing-v0 • pre-trained VAE to compress the pixel representation • No pretrained hidden states of RNN 23
  • 24. Continuous Control Results • WANNs are not completely independent of the weight values • Single shared weight • easy tuning 24
  • 26. Discussion and Future Work • A method to search for simple neural networks • Fine-tuning • Few-shot learning • Continual lifelong learning • Multitask learning • Supermask [8] • Similar range of performance • Architecture search in a differentiable manner 26
  • 27. Discussion • Contribution? • The influence of network structure alone • The performance achievable by simple neural networks • WANN itself does not seem practically useful • There is a structure bias induced by the single shared weight • Further research is needed on optimization methods for discovering structures 27
  • 28. Thank you ! Any Questions ?
  • 29. References
  1) Weight Agnostic Neural Networks. A. Gaier, D. Ha. arXiv preprint arXiv:1906.04358, 2019.
  2) A powerful generative model using random weights for the deep image representation. K. He, Y. Wang, J. Hopcroft. Advances in Neural Information Processing Systems, pp. 631–639, 2016.
  3) Deep image prior. D. Ulyanov, A. Vedaldi, V. Lempitsky. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9446–9454, 2018.
  4) Training recurrent networks by evolino. J. Schmidhuber, D. Wierstra, M. Gagliolo, F. Gomez. Neural Computation, Vol. 19(3), pp. 757–779, MIT Press, 2007.
  5) Evolving neural networks through augmenting topologies. K.O. Stanley, R. Miikkulainen. Evolutionary Computation, Vol. 10(2), pp. 99–127, MIT Press, 2002.
  6) Variance Networks: When Expectation Does Not Meet Your Expectations. K. Neklyudov, D. Molchanov, A. Ashukha, D. Vetrov. International Conference on Learning Representations (ICLR), 2019.
  7) Simplifying neural networks by soft weight-sharing. S.J. Nowlan, G.E. Hinton. Neural Computation, Vol. 4(4), pp. 473–493, MIT Press, 1992.
  8) Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask. H. Zhou, J. Lan, R. Liu, J. Yosinski. arXiv preprint arXiv:1905.01067, 2019.

Editor's Notes

  1. Lizards and snakes are born with the ability to flee from predators. Ducks can swim and feed themselves as soon as they hatch, and turkeys can distinguish predators from birth.
  2. CNN: super-resolution, inpainting, and style transfer. LSTM: time-series prediction.
  3. The original NEAT optimizes weights and network topology simultaneously; here, only the topology is optimized. It has been shown that even random search over basic building blocks performs well.
  4. [6] (Samsung) -> A typical stochastic NN takes the mean of the weights as its prediction, whereas a variance network assumes zero mean and learns only the variance. Most BNNs eventually converge to a zero-mean posterior -> assuming zero mean leads to better learning.
  5. A good model is one that is best at compressing its data, including the cost of describing the model itself. Research in large-network settings is still ongoing.
  6. Performs better than chance at image classification -> the structure itself has power. A pruned network is limited by the full network it starts from.
  7. Connectome research on small, simple animals is ongoing. By setting weight training aside, the network is encouraged to keep growing and improving.
  8. In fact, the effect of changing the weight value is small; the (comparatively) large variation occurs within [-2, 2].
  9. Tests the ability of WANNs to learn abstract associations: it learns associations over information abstracted by the VAE rather than geometric information.
  10. The capabilities of the original WANN are preserved while its function can easily be transferred to other tasks.