L1-Penalized Logistic Regression is commonly used for classification in high-dimensional data such as microarray data. This slide deck presents a brief overview of the method.
3. Introduction
• In high‐dimensional data (e.g., microarray data) there are more
variables than observations, so classical logistic
regression does not work.
• Other problems: variables are correlated
(multicollinearity), and models overfit.
• Solution: Introduce a penalty for complexity in the
model.
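To illustrate why classical logistic regression breaks down when there are more variables than observations, the following minimal sketch (a NumPy illustration on synthetic data, not the microarray data the slides refer to) shows that the design matrix cannot have full column rank, so the maximum-likelihood coefficients are not uniquely identified:

```python
import numpy as np

# Synthetic "microarray-like" setting: 20 samples, 100 variables (p > n).
rng = np.random.default_rng(0)
n, p = 20, 100
X = rng.standard_normal((n, p))

# With p > n the design matrix cannot have full column rank,
# so the classical logistic MLE is not uniquely identified.
rank = np.linalg.matrix_rank(X)
print(rank, "of", p, "columns are linearly independent")
```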
6. L1‐Penalized Logistic Regression
• Shrinks all regression coefficients (β) toward zero and
sets some of them exactly to zero.
• Performs parameter estimation and variable
selection at the same time.
• The L1‐penalized log‐likelihood is maximized using a full
gradient algorithm (Goeman, 2010).
• The choice of the penalty weight λ is crucial; it is selected via a cross‐
validation procedure.
• The procedure is implemented in an R package called
penalized.
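The slides use the R package penalized; as an illustrative analogue, here is a minimal sketch in Python with scikit-learn (the synthetic data and all parameter values are assumptions, not taken from the slides) showing how an L1 penalty shrinks many coefficients exactly to zero, performing estimation and variable selection at once:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic high-dimensional data (assumption, for illustration):
# 50 samples, 200 variables, only the first 5 truly informative.
rng = np.random.default_rng(1)
n, p = 50, 200
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = 2.0
y = (X @ beta + rng.standard_normal(n) > 0).astype(int)

# scikit-learn parameterizes the penalty as C = 1/lambda.
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
model.fit(X, y)

# Coefficients shrunk exactly to zero drop out of the model.
n_selected = np.count_nonzero(model.coef_)
print(n_selected, "of", p, "variables selected")
```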
7. K‐fold Cross Validation
[Diagram: data split into k folds — one fold held out as test, model trained on the other (k − 1) folds]
• Randomly divide your data into k pieces/folds.
• Treat the 1st fold as the test set; fit the model to the
other folds (training data).
• Apply the fitted model to the test fold, and repeat k times
so each fold serves as the test set once.
• Calculate statistics of model accuracy and fit from the
test data only.
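The steps above can be sketched as follows (a minimal Python/scikit-learn illustration; the synthetic dataset and plain logistic model are assumptions for demonstration):

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression

# Synthetic data (assumption): 60 samples, 10 variables.
rng = np.random.default_rng(2)
X = rng.standard_normal((60, 10))
y = (X[:, 0] + 0.5 * rng.standard_normal(60) > 0).astype(int)

k = 5
accs = []
# Each pass holds out one fold as test data and trains on the rest.
for train_idx, test_idx in KFold(n_splits=k, shuffle=True, random_state=0).split(X):
    model = LogisticRegression().fit(X[train_idx], y[train_idx])
    # Accuracy is computed on the held-out fold only.
    accs.append(model.score(X[test_idx], y[test_idx]))

print("mean CV accuracy:", np.mean(accs))
```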
9. Obtaining the Optimal λ
• Note that in the k‐fold CV method, allocation of the
sample into the k folds is performed randomly.
• Different randomizations may lead to slightly different
results (slightly different optimal λ values).
• To obtain more robust λ value, a hundred different
cross validations were performed.
• Take the average of the hundred selected values λ1, …, λ100 to obtain λ̄.
• Use λ̄ for fitting the final L1‐penalized logistic model.
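A minimal sketch of this repeated-CV averaging idea in Python with scikit-learn (an illustrative analogue of the slides' R procedure: 10 repeats instead of 100 for speed, a synthetic dataset, and an assumed penalty grid; note scikit-learn parameterizes the penalty as C = 1/λ):

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegressionCV, LogisticRegression

# Synthetic data (assumption): 80 samples, 30 variables.
rng = np.random.default_rng(3)
X = rng.standard_normal((80, 30))
y = (X[:, 0] - X[:, 1] + rng.standard_normal(80) > 0).astype(int)

# Repeat cross-validation with different random fold allocations,
# recording the selected lambda = 1/C each time.
Cs = np.logspace(-2, 1, 10)  # assumed penalty grid
lams = []
for r in range(10):  # 10 repeats here; the slides use 100
    cv = KFold(n_splits=5, shuffle=True, random_state=r)
    m = LogisticRegressionCV(Cs=Cs, cv=cv, penalty="l1",
                             solver="liblinear").fit(X, y)
    lams.append(1.0 / m.C_[0])

# Average the selected lambdas, then fit the final L1 model with it.
lam_bar = float(np.mean(lams))
final = LogisticRegression(penalty="l1", solver="liblinear",
                           C=1.0 / lam_bar).fit(X, y)
print("lambda_bar:", lam_bar, "| nonzero coefficients:",
      np.count_nonzero(final.coef_))
```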