SlideShare a Scribd company logo
1 of 18
1. TabNet inputs raw tabular data without any preprocessing and is trained using
gradient descent-based optimization.
2. TabNet uses sequential attention to choose which features to reason from at each
decision step.
-> interpretability and better learning as the learning capacity is used for the most salient features.
3. TabNet outperforms or is on par with other tabular learning models on various
datasets.
4. We show performance improvements by using unsupervised pre-training to
predict masked features.
Contribution
Feature selection broadly refers to judiciously picking a subset of
features based on their usefulness for prediction.
Feature selection & Tree-based learning
Their prominent strength is efficient picking of global features with the
most statistical information gain.
To improve the performance of standard DTs(Decision Trees), one
common approach is ensembling to reduce variance.
Random Forest : random subsets of data with randomly selected features to grow
many trees.
XGBoost & LightGBM : tree-based ensemble. (다음 기회에 ..)
Integration of DNNs into DTs ->
Self-supervied learning -> BERT
Feature selection & Tree-based learning
Conventional DNN blocks
 TabNet is based on such functionality and it outperforms DTs while reaping their benefits by
careful design which
(i) uses sparse instance-wise feature selection learned from data
(ii) constructs a sequential multi-step architecture, where each step contributes to a
portion of the decision based on the selected features
(iii) improves the learning capacity via nonlinear processing of the selected features
(iv) mimics ensembling via higher dimensions and more steps.
TabNet
𝑴 𝒊 ⋅ 𝒇
𝑴 𝒊
𝑴 𝒊
𝒂 𝒊 − 𝟏
𝑷 𝒊 − 𝟏
𝒅 𝒊
𝒂[𝒊]
BN : Ghost Batch Normalization
GLU : Gated Linear Unit
𝜼
Sparsemax
The idea is to set the probabilities of the smallest values of z to zero and keep only probabilities of the highest
values of z, but still keep the function differentiable to ensure successful application of backpropagation
algorithm.
출처 : https://towardsdatascience.com/what-is-sparsemax-f84c136624e4
𝑎 𝑖 − 1 ∶ 𝑀 𝑖 = 𝑠𝑝𝑎𝑟𝑠𝑒𝑚𝑎𝑥 𝑃 𝑖 − 1 ⋅ ℎ𝑖 𝑎 𝑖 − 1 Note, 𝑗=1
𝐷
𝑀 𝑖 𝑏,𝑗 = 1
 We pass the same 𝐷-dimensional features 𝑓 ∈ ℝ𝐵×𝐷
to each decision step, where 𝐵 is the batch size.
 𝑴 𝒊 ∈ ℝ𝐵×𝐷
for soft selection of the salient features. (𝑴 𝒊 ⋅ 𝒇)
 𝑃 𝑖 is the prior scale term, denoting how much a particular feature has been used previously
 𝑃 𝑖 = 𝑗=1
𝑖
(𝛾 − 𝑴 𝒋 )
 𝛾 : relaxation parameter
 𝛾 = 1, feature is enforced to be used only at one decision step and as 𝛾 increase, more flexibility is provided to use a
featue at multiple decision steps.
 𝑃 0 is initialized as all ones, 1𝐵×𝐷
, without any prior on the masked features.
𝐿𝑠𝑝𝑎𝑟𝑠𝑒 = 𝑖=1
𝑁𝑠𝑡𝑒𝑝𝑠
𝑏=1
𝐵
𝑗=1
𝐷 − 𝑴𝒃,𝒋 𝒊 log 𝑴𝒃,𝒋 𝒊 +𝜖
𝑁𝑠𝑡𝑒𝑝𝑠⋅𝐵
Feature Selection
 If 𝑀𝑏,𝑗 𝑖 = 0, then 𝑗𝑡ℎ
feature of the 𝑏𝑡ℎ
sample should have no contribution to the decision.
 If 𝑓𝑖 were a linear function, the coefficient 𝑀𝑏,𝑗 𝑖 would correspond to the feature importance of 𝑓𝑏,𝑗.
 We simply propose 𝜂𝑏 𝑖 = 𝑐=1
𝑁𝑑
𝑅𝑒𝐿𝑈 𝑑𝑏,𝑐 𝑖 to denote the aggregate decision contribution at 𝑖𝑡ℎ
decision step for the 𝑏𝑡ℎ
sample.
 If 𝑑𝑏,𝑐 𝑖 < 0, then all features at 𝑖𝑡ℎ
decision step should have 0 contribution to the overall decision.
 We propose aggregate feature importance mask,
𝑀𝑎𝑔𝑔−𝑏,𝑗 =
𝑖=1
𝑁𝑠𝑡𝑒𝑝𝑠
𝜂𝑏 𝑖 𝑀𝑏,𝑗 𝑖 /
𝑗=1
𝐷
𝑖=1
𝑁𝑠𝑡𝑒𝑝𝑠
𝜂𝑏 𝑖 𝑀𝑏,𝑗 𝑖
Interpretability
Tabular self-supervised learning
Propose the task of prediction of missing
feature columns from the others.
binary mask : 𝑆 ∈ 0,1 𝐵×𝐷
encoder inputs : 1 − 𝑆 ⋅ 𝑓
->
decoder outputs : 𝑆 ⋅ 𝑓
𝑃 0 = (1 − 𝑆)
reconstructure loss:
𝑏=1
𝐵
𝑗=1
𝐷 𝑓𝑏,𝑗 − 𝑓𝑏,𝑗 ⋅ 𝑆𝑏,𝑗
𝑏=1
𝐵
𝑓𝑏,𝑗 − 1/𝐵 𝑏=1
𝐵
𝑓𝑏,𝑗
2
2
Instant-wise feature selection (AUC)
The datasets are constructed in such a way that only a subset of the features determine the output.
Performance on real-world datasets
-S, -M, -L
Feature importance
T-SNE & Training curves

More Related Content

What's hot

Deep Learning in Computer Vision
Deep Learning in Computer VisionDeep Learning in Computer Vision
Deep Learning in Computer VisionSungjoon Choi
 
Densely Connected Convolutional Networks
Densely Connected Convolutional NetworksDensely Connected Convolutional Networks
Densely Connected Convolutional NetworksHosein Mohebbi
 
Mobilenetv1 v2 slide
Mobilenetv1 v2 slideMobilenetv1 v2 slide
Mobilenetv1 v2 slide威智 黃
 
Dimensionality Reduction | Machine Learning | CloudxLab
Dimensionality Reduction | Machine Learning | CloudxLabDimensionality Reduction | Machine Learning | CloudxLab
Dimensionality Reduction | Machine Learning | CloudxLabCloudxLab
 
Generative Adversarial Networks
Generative Adversarial NetworksGenerative Adversarial Networks
Generative Adversarial NetworksMark Chang
 
Basic Generative Adversarial Networks
Basic Generative Adversarial NetworksBasic Generative Adversarial Networks
Basic Generative Adversarial NetworksDong Heon Cho
 
Activation functions and Training Algorithms for Deep Neural network
Activation functions and Training Algorithms for Deep Neural networkActivation functions and Training Algorithms for Deep Neural network
Activation functions and Training Algorithms for Deep Neural networkGayatri Khanvilkar
 
Methods of Optimization in Machine Learning
Methods of Optimization in Machine LearningMethods of Optimization in Machine Learning
Methods of Optimization in Machine LearningKnoldus Inc.
 
Emerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision TransformersEmerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision TransformersSungchul Kim
 
Noise Distribution Adaptive Self-Supervised Image Denoising using Tweedie Dis...
Noise Distribution Adaptive Self-Supervised Image Denoising using Tweedie Dis...Noise Distribution Adaptive Self-Supervised Image Denoising using Tweedie Dis...
Noise Distribution Adaptive Self-Supervised Image Denoising using Tweedie Dis...KwanyoungKim7
 
Hyperparameter Tuning
Hyperparameter TuningHyperparameter Tuning
Hyperparameter TuningJon Lederman
 
Object Detection using Deep Neural Networks
Object Detection using Deep Neural NetworksObject Detection using Deep Neural Networks
Object Detection using Deep Neural NetworksUsman Qayyum
 
Convolutional neural network from VGG to DenseNet
Convolutional neural network from VGG to DenseNetConvolutional neural network from VGG to DenseNet
Convolutional neural network from VGG to DenseNetSungminYou
 
Variational Inference
Variational InferenceVariational Inference
Variational InferenceTushar Tank
 
A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)
 A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs) A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)
A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)Thomas da Silva Paula
 

What's hot (20)

Multi tasking learning
Multi tasking learningMulti tasking learning
Multi tasking learning
 
Deep Learning in Computer Vision
Deep Learning in Computer VisionDeep Learning in Computer Vision
Deep Learning in Computer Vision
 
Densely Connected Convolutional Networks
Densely Connected Convolutional NetworksDensely Connected Convolutional Networks
Densely Connected Convolutional Networks
 
XGBoost & LightGBM
XGBoost & LightGBMXGBoost & LightGBM
XGBoost & LightGBM
 
Yolov5
Yolov5 Yolov5
Yolov5
 
Mobilenetv1 v2 slide
Mobilenetv1 v2 slideMobilenetv1 v2 slide
Mobilenetv1 v2 slide
 
Densenet CNN
Densenet CNNDensenet CNN
Densenet CNN
 
Dimensionality Reduction | Machine Learning | CloudxLab
Dimensionality Reduction | Machine Learning | CloudxLabDimensionality Reduction | Machine Learning | CloudxLab
Dimensionality Reduction | Machine Learning | CloudxLab
 
Generative Adversarial Networks
Generative Adversarial NetworksGenerative Adversarial Networks
Generative Adversarial Networks
 
Basic Generative Adversarial Networks
Basic Generative Adversarial NetworksBasic Generative Adversarial Networks
Basic Generative Adversarial Networks
 
Activation functions and Training Algorithms for Deep Neural network
Activation functions and Training Algorithms for Deep Neural networkActivation functions and Training Algorithms for Deep Neural network
Activation functions and Training Algorithms for Deep Neural network
 
Methods of Optimization in Machine Learning
Methods of Optimization in Machine LearningMethods of Optimization in Machine Learning
Methods of Optimization in Machine Learning
 
Emerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision TransformersEmerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision Transformers
 
Noise Distribution Adaptive Self-Supervised Image Denoising using Tweedie Dis...
Noise Distribution Adaptive Self-Supervised Image Denoising using Tweedie Dis...Noise Distribution Adaptive Self-Supervised Image Denoising using Tweedie Dis...
Noise Distribution Adaptive Self-Supervised Image Denoising using Tweedie Dis...
 
Hyperparameter Tuning
Hyperparameter TuningHyperparameter Tuning
Hyperparameter Tuning
 
Object Detection using Deep Neural Networks
Object Detection using Deep Neural NetworksObject Detection using Deep Neural Networks
Object Detection using Deep Neural Networks
 
14_cnn complete.pptx
14_cnn complete.pptx14_cnn complete.pptx
14_cnn complete.pptx
 
Convolutional neural network from VGG to DenseNet
Convolutional neural network from VGG to DenseNetConvolutional neural network from VGG to DenseNet
Convolutional neural network from VGG to DenseNet
 
Variational Inference
Variational InferenceVariational Inference
Variational Inference
 
A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)
 A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs) A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)
A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)
 

Similar to 2021 06-02-tabnet

Restricting the Flow: Information Bottlenecks for Attribution
Restricting the Flow: Information Bottlenecks for AttributionRestricting the Flow: Information Bottlenecks for Attribution
Restricting the Flow: Information Bottlenecks for Attributiontaeseon ryu
 
Machine Learning Notes for beginners ,Step by step
Machine Learning Notes for beginners ,Step by stepMachine Learning Notes for beginners ,Step by step
Machine Learning Notes for beginners ,Step by stepSanjanaSaxena17
 
Deep learning concepts
Deep learning conceptsDeep learning concepts
Deep learning conceptsJoe li
 
Lesson_8_DeepLearning.pdf
Lesson_8_DeepLearning.pdfLesson_8_DeepLearning.pdf
Lesson_8_DeepLearning.pdfssuser7f0b19
 
Multilayer Perceptron (DLAI D1L2 2017 UPC Deep Learning for Artificial Intell...
Multilayer Perceptron (DLAI D1L2 2017 UPC Deep Learning for Artificial Intell...Multilayer Perceptron (DLAI D1L2 2017 UPC Deep Learning for Artificial Intell...
Multilayer Perceptron (DLAI D1L2 2017 UPC Deep Learning for Artificial Intell...Universitat Politècnica de Catalunya
 
Batch normalization presentation
Batch normalization presentationBatch normalization presentation
Batch normalization presentationOwin Will
 
deepnet-lourentzou.ppt
deepnet-lourentzou.pptdeepnet-lourentzou.ppt
deepnet-lourentzou.pptyang947066
 
Deep Feed Forward Neural Networks and Regularization
Deep Feed Forward Neural Networks and RegularizationDeep Feed Forward Neural Networks and Regularization
Deep Feed Forward Neural Networks and RegularizationYan Xu
 
Training DNN Models - II.pptx
Training DNN Models - II.pptxTraining DNN Models - II.pptx
Training DNN Models - II.pptxPrabhuSelvaraj15
 
Machine Learning using Support Vector Machine
Machine Learning using Support Vector MachineMachine Learning using Support Vector Machine
Machine Learning using Support Vector MachineMohsin Ul Haq
 
Deep learning with TensorFlow
Deep learning with TensorFlowDeep learning with TensorFlow
Deep learning with TensorFlowBarbara Fusinska
 
Deep learning from a novice perspective
Deep learning from a novice perspectiveDeep learning from a novice perspective
Deep learning from a novice perspectiveAnirban Santara
 
Deep Learning for Search
Deep Learning for SearchDeep Learning for Search
Deep Learning for SearchBhaskar Mitra
 
DeepLearningLecture.pptx
DeepLearningLecture.pptxDeepLearningLecture.pptx
DeepLearningLecture.pptxssuserf07225
 
Auto encoders in Deep Learning
Auto encoders in Deep LearningAuto encoders in Deep Learning
Auto encoders in Deep LearningShajun Nisha
 
Deep Learning for Search
Deep Learning for SearchDeep Learning for Search
Deep Learning for SearchBhaskar Mitra
 
Working with the data for Machine Learning
Working with the data for Machine LearningWorking with the data for Machine Learning
Working with the data for Machine LearningMehwish690898
 

Similar to 2021 06-02-tabnet (20)

Restricting the Flow: Information Bottlenecks for Attribution
Restricting the Flow: Information Bottlenecks for AttributionRestricting the Flow: Information Bottlenecks for Attribution
Restricting the Flow: Information Bottlenecks for Attribution
 
Machine Learning Notes for beginners ,Step by step
Machine Learning Notes for beginners ,Step by stepMachine Learning Notes for beginners ,Step by step
Machine Learning Notes for beginners ,Step by step
 
Deep learning concepts
Deep learning conceptsDeep learning concepts
Deep learning concepts
 
Lesson_8_DeepLearning.pdf
Lesson_8_DeepLearning.pdfLesson_8_DeepLearning.pdf
Lesson_8_DeepLearning.pdf
 
Multilayer Perceptron (DLAI D1L2 2017 UPC Deep Learning for Artificial Intell...
Multilayer Perceptron (DLAI D1L2 2017 UPC Deep Learning for Artificial Intell...Multilayer Perceptron (DLAI D1L2 2017 UPC Deep Learning for Artificial Intell...
Multilayer Perceptron (DLAI D1L2 2017 UPC Deep Learning for Artificial Intell...
 
Batch normalization presentation
Batch normalization presentationBatch normalization presentation
Batch normalization presentation
 
deepnet-lourentzou.ppt
deepnet-lourentzou.pptdeepnet-lourentzou.ppt
deepnet-lourentzou.ppt
 
Deep Feed Forward Neural Networks and Regularization
Deep Feed Forward Neural Networks and RegularizationDeep Feed Forward Neural Networks and Regularization
Deep Feed Forward Neural Networks and Regularization
 
Training DNN Models - II.pptx
Training DNN Models - II.pptxTraining DNN Models - II.pptx
Training DNN Models - II.pptx
 
Machine Learning using Support Vector Machine
Machine Learning using Support Vector MachineMachine Learning using Support Vector Machine
Machine Learning using Support Vector Machine
 
04 Multi-layer Feedforward Networks
04 Multi-layer Feedforward Networks04 Multi-layer Feedforward Networks
04 Multi-layer Feedforward Networks
 
Deep learning with TensorFlow
Deep learning with TensorFlowDeep learning with TensorFlow
Deep learning with TensorFlow
 
Reinforcement Learning - DQN
Reinforcement Learning - DQNReinforcement Learning - DQN
Reinforcement Learning - DQN
 
supervised.pptx
supervised.pptxsupervised.pptx
supervised.pptx
 
Deep learning from a novice perspective
Deep learning from a novice perspectiveDeep learning from a novice perspective
Deep learning from a novice perspective
 
Deep Learning for Search
Deep Learning for SearchDeep Learning for Search
Deep Learning for Search
 
DeepLearningLecture.pptx
DeepLearningLecture.pptxDeepLearningLecture.pptx
DeepLearningLecture.pptx
 
Auto encoders in Deep Learning
Auto encoders in Deep LearningAuto encoders in Deep Learning
Auto encoders in Deep Learning
 
Deep Learning for Search
Deep Learning for SearchDeep Learning for Search
Deep Learning for Search
 
Working with the data for Machine Learning
Working with the data for Machine LearningWorking with the data for Machine Learning
Working with the data for Machine Learning
 

More from JAEMINJEONG5

Jaemin_230701_Simple_Copy_paste.pptx
Jaemin_230701_Simple_Copy_paste.pptxJaemin_230701_Simple_Copy_paste.pptx
Jaemin_230701_Simple_Copy_paste.pptxJAEMINJEONG5
 
2022-01-17-Rethinking_Bisenet.pptx
2022-01-17-Rethinking_Bisenet.pptx2022-01-17-Rethinking_Bisenet.pptx
2022-01-17-Rethinking_Bisenet.pptxJAEMINJEONG5
 
2021 04-04-google nmt
2021 04-04-google nmt2021 04-04-google nmt
2021 04-04-google nmtJAEMINJEONG5
 
2021 03-02-distributed representations-of_words_and_phrases
2021 03-02-distributed representations-of_words_and_phrases2021 03-02-distributed representations-of_words_and_phrases
2021 03-02-distributed representations-of_words_and_phrasesJAEMINJEONG5
 
2021 03-02-transformer interpretability
2021 03-02-transformer interpretability2021 03-02-transformer interpretability
2021 03-02-transformer interpretabilityJAEMINJEONG5
 
2021 03-01-on the relationship between self-attention and convolutional layers
2021 03-01-on the relationship between self-attention and convolutional layers2021 03-01-on the relationship between self-attention and convolutional layers
2021 03-01-on the relationship between self-attention and convolutional layersJAEMINJEONG5
 
2021 01-04-learning filter-basis
2021 01-04-learning filter-basis2021 01-04-learning filter-basis
2021 01-04-learning filter-basisJAEMINJEONG5
 
2021 01-02-linformer
2021 01-02-linformer2021 01-02-linformer
2021 01-02-linformerJAEMINJEONG5
 
2020 12-04-shake shake
2020 12-04-shake shake2020 12-04-shake shake
2020 12-04-shake shakeJAEMINJEONG5
 
2020 11 4_bag_of_tricks
2020 11 4_bag_of_tricks2020 11 4_bag_of_tricks
2020 11 4_bag_of_tricksJAEMINJEONG5
 
2020 11 2_automated sleep stage scoring of the sleep heart
2020 11 2_automated sleep stage scoring of the sleep heart2020 11 2_automated sleep stage scoring of the sleep heart
2020 11 2_automated sleep stage scoring of the sleep heartJAEMINJEONG5
 
2020 11 1_sleep_net
2020 11 1_sleep_net2020 11 1_sleep_net
2020 11 1_sleep_netJAEMINJEONG5
 

More from JAEMINJEONG5 (20)

Jaemin_230701_Simple_Copy_paste.pptx
Jaemin_230701_Simple_Copy_paste.pptxJaemin_230701_Simple_Copy_paste.pptx
Jaemin_230701_Simple_Copy_paste.pptx
 
2022-01-17-Rethinking_Bisenet.pptx
2022-01-17-Rethinking_Bisenet.pptx2022-01-17-Rethinking_Bisenet.pptx
2022-01-17-Rethinking_Bisenet.pptx
 
Swin transformer
Swin transformerSwin transformer
Swin transformer
 
2021 05-04-u2-net
2021 05-04-u2-net2021 05-04-u2-net
2021 05-04-u2-net
 
2021 04-04-google nmt
2021 04-04-google nmt2021 04-04-google nmt
2021 04-04-google nmt
 
2021 04-03-sean
2021 04-03-sean2021 04-03-sean
2021 04-03-sean
 
2021 04-01-dalle
2021 04-01-dalle2021 04-01-dalle
2021 04-01-dalle
 
2021 03-02-spade
2021 03-02-spade2021 03-02-spade
2021 03-02-spade
 
2021 03-02-distributed representations-of_words_and_phrases
2021 03-02-distributed representations-of_words_and_phrases2021 03-02-distributed representations-of_words_and_phrases
2021 03-02-distributed representations-of_words_and_phrases
 
2021 03-02-transformer interpretability
2021 03-02-transformer interpretability2021 03-02-transformer interpretability
2021 03-02-transformer interpretability
 
2021 03-01-on the relationship between self-attention and convolutional layers
2021 03-01-on the relationship between self-attention and convolutional layers2021 03-01-on the relationship between self-attention and convolutional layers
2021 03-01-on the relationship between self-attention and convolutional layers
 
2021 01-04-learning filter-basis
2021 01-04-learning filter-basis2021 01-04-learning filter-basis
2021 01-04-learning filter-basis
 
2021 01-02-linformer
2021 01-02-linformer2021 01-02-linformer
2021 01-02-linformer
 
2020 12-04-shake shake
2020 12-04-shake shake2020 12-04-shake shake
2020 12-04-shake shake
 
2020 12-03-vit
2020 12-03-vit2020 12-03-vit
2020 12-03-vit
 
2020 12-2-detr
2020 12-2-detr2020 12-2-detr
2020 12-2-detr
 
2020 11 4_bag_of_tricks
2020 11 4_bag_of_tricks2020 11 4_bag_of_tricks
2020 11 4_bag_of_tricks
 
2020 11 2_automated sleep stage scoring of the sleep heart
2020 11 2_automated sleep stage scoring of the sleep heart2020 11 2_automated sleep stage scoring of the sleep heart
2020 11 2_automated sleep stage scoring of the sleep heart
 
2020 11 1_sleep_net
2020 11 1_sleep_net2020 11 1_sleep_net
2020 11 1_sleep_net
 
2020 12-1-adam w
2020 12-1-adam w2020 12-1-adam w
2020 12-1-adam w
 

Recently uploaded

Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...Call Girls in Nagpur High Profile
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college projectTonystark477637
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSRajkumarAkumalla
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performancesivaprakash250
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingrakeshbaidya232001
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 

Recently uploaded (20)

Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 

2021 06-02-tabnet

  • 1.
  • 2.
  • 3. 1. TabNet inputs raw tabular data without any preprocessing and is trained using gradient descent-based optimization. 2. TabNet uses sequential attention to choose which features to reason from at each decision step. -> interpretability and better learning as the learning capacity is used for the most salient features. 3. TabNet outperforms or is on par with other tabular learning models on various datasets. 4. We show performance improvements by using unsupervised pre-training to predict masked features. Contribution
  • 4.
  • 5. Feature selection broadly refers to judiciously picking a subset of features based on their usefulness for prediction. Feature selection & Tree-based learning Their prominent strength is efficient picking of global features with the most statistical information gain. To improve the performance of standard DTs(Decision Trees), one common approach is ensembling to reduce variance.
  • 6. Random Forest : random subsets of data with randomly selected features to grow many trees. XGBoost & LightGBM : tree-based ensemble. (다음 기회에 ..) Integration of DNNs into DTs -> Self-supervied learning -> BERT Feature selection & Tree-based learning
  • 8.  TabNet is based on such functionality and it outperforms DTs while reaping their benefits by careful design which (i) uses sparse instance-wise feature selection learned from data (ii) constructs a sequential multi-step architecture, where each step contributes to a portion of the decision based on the selected features (iii) improves the learning capacity via nonlinear processing of the selected features (iv) mimics ensembling via higher dimensions and more steps. TabNet
  • 9. 𝑴 𝒊 ⋅ 𝒇 𝑴 𝒊 𝑴 𝒊 𝒂 𝒊 − 𝟏 𝑷 𝒊 − 𝟏 𝒅 𝒊 𝒂[𝒊] BN : Ghost Batch Normalization GLU : Gated Linear Unit 𝜼
  • 10. Sparsemax The idea is to set the probabilities of the smallest values of z to zero and keep only probabilities of the highest values of z, but still keep the function differentiable to ensure successful application of backpropagation algorithm. 출처 : https://towardsdatascience.com/what-is-sparsemax-f84c136624e4 𝑎 𝑖 − 1 ∶ 𝑀 𝑖 = 𝑠𝑝𝑎𝑟𝑠𝑒𝑚𝑎𝑥 𝑃 𝑖 − 1 ⋅ ℎ𝑖 𝑎 𝑖 − 1 Note, 𝑗=1 𝐷 𝑀 𝑖 𝑏,𝑗 = 1
  • 11.  We pass the same 𝐷-dimensional features 𝑓 ∈ ℝ𝐵×𝐷 to each decision step, where 𝐵 is the batch size.  𝑴 𝒊 ∈ ℝ𝐵×𝐷 for soft selection of the salient features. (𝑴 𝒊 ⋅ 𝒇)  𝑃 𝑖 is the prior scale term, denoting how much a particular feature has been used previously  𝑃 𝑖 = 𝑗=1 𝑖 (𝛾 − 𝑴 𝒋 )  𝛾 : relaxation parameter  𝛾 = 1, feature is enforced to be used only at one decision step and as 𝛾 increase, more flexibility is provided to use a featue at multiple decision steps.  𝑃 0 is initialized as all ones, 1𝐵×𝐷 , without any prior on the masked features. 𝐿𝑠𝑝𝑎𝑟𝑠𝑒 = 𝑖=1 𝑁𝑠𝑡𝑒𝑝𝑠 𝑏=1 𝐵 𝑗=1 𝐷 − 𝑴𝒃,𝒋 𝒊 log 𝑴𝒃,𝒋 𝒊 +𝜖 𝑁𝑠𝑡𝑒𝑝𝑠⋅𝐵 Feature Selection
  • 12.  If 𝑀𝑏,𝑗 𝑖 = 0, then 𝑗𝑡ℎ feature of the 𝑏𝑡ℎ sample should have no contribution to the decision.  If 𝑓𝑖 were a linear function, the coefficient 𝑀𝑏,𝑗 𝑖 would correspond to the feature importance of 𝑓𝑏,𝑗.  We simply propose 𝜂𝑏 𝑖 = 𝑐=1 𝑁𝑑 𝑅𝑒𝐿𝑈 𝑑𝑏,𝑐 𝑖 to denote the aggregate decision contribution at 𝑖𝑡ℎ decision step for the 𝑏𝑡ℎ sample.  If 𝑑𝑏,𝑐 𝑖 < 0, then all features at 𝑖𝑡ℎ decision step should have 0 contribution to the overall decision.  We propose aggregate feature importance mask, 𝑀𝑎𝑔𝑔−𝑏,𝑗 = 𝑖=1 𝑁𝑠𝑡𝑒𝑝𝑠 𝜂𝑏 𝑖 𝑀𝑏,𝑗 𝑖 / 𝑗=1 𝐷 𝑖=1 𝑁𝑠𝑡𝑒𝑝𝑠 𝜂𝑏 𝑖 𝑀𝑏,𝑗 𝑖 Interpretability
  • 13. Tabular self-supervised learning Propose the task of prediction of missing feature columns from the others. binary mask : 𝑆 ∈ 0,1 𝐵×𝐷 encoder inputs : 1 − 𝑆 ⋅ 𝑓 -> decoder outputs : 𝑆 ⋅ 𝑓 𝑃 0 = (1 − 𝑆) reconstructure loss: 𝑏=1 𝐵 𝑗=1 𝐷 𝑓𝑏,𝑗 − 𝑓𝑏,𝑗 ⋅ 𝑆𝑏,𝑗 𝑏=1 𝐵 𝑓𝑏,𝑗 − 1/𝐵 𝑏=1 𝐵 𝑓𝑏,𝑗 2 2
  • 14. Instant-wise feature selection (AUC) The datasets are constructed in such a way that only a subset of the features determine the output.