Prepared by VOLKAN OBAN
Reference: http://datascienceplus.com/perform-logistic-regression-in-r/
Logistic Regression in R
Example:
#data Titanic -> https://www.kaggle.com/datasets
> training.data.raw <- read.csv('train.csv',header=TRUE,na.strings=c(""))
> sapply(training.data.raw,function(x) sum(is.na(x)))
PassengerId    Survived      Pclass        Name         Sex         Age       SibSp       Parch
          0           0           0           0           0         177           0           0
     Ticket        Fare       Cabin    Embarked
          0           0         687           2
> sapply(training.data.raw, function(x) length(unique(x)))
PassengerId    Survived      Pclass        Name         Sex         Age       SibSp       Parch
        891           2           3         891           2          89           7           7
     Ticket        Fare       Cabin    Embarked
        681         248         148           4
> library(Amelia)
Loading required package: Rcpp
##
## Amelia II: Multiple Imputation
## (Version 1.7.4, built: 2015-12-05)
## Copyright (C) 2005-2016 James Honaker, Gary King and Matthew Blackwell
## Refer to http://gking.harvard.edu/amelia/ for more information
##
> missmap(training.data.raw, main = "Missing values vs observed")
> data <- subset(training.data.raw,select=c(2,3,5,6,7,8,10,12))
> data$Age[is.na(data$Age)] <- mean(data$Age,na.rm=TRUE)
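The line above replaces each missing Age with the mean of the observed ages. A minimal sketch of the same idea on a made-up vector (the values and the names `age`, `age_mean` and `age_imputed` are illustrative assumptions, not taken from the Titanic file):

```r
# Toy vector standing in for data$Age (values are made up for illustration)
age <- c(22, NA, 30, NA, 38)

# Mean of the observed values only; na.rm = TRUE drops the NAs first
age_mean <- mean(age, na.rm = TRUE)

# Overwrite only the missing positions, leaving observed values untouched
age_imputed <- age
age_imputed[is.na(age_imputed)] <- age_mean

print(age_imputed)
```

Mean imputation keeps every row but shrinks the variance of the imputed column, so it is best read here as a convenience step rather than a recommendation.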
> is.factor(data$Sex)
[1] TRUE
> is.factor(data$Embarked)
[1] TRUE
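The two TRUE results above depend on read.csv converting string columns to factors. Since R 4.0.0 the default is stringsAsFactors = FALSE, so on a current installation the same checks would likely return FALSE unless factors are requested explicitly. A small sketch, assuming a made-up three-row CSV read from a string (`csv_text`, `df_chr` and `df_fac` are hypothetical names):

```r
# Made-up CSV held in a string so the sketch needs no file on disk
csv_text <- "Sex,Embarked\nmale,S\nfemale,C\nfemale,Q"

# With the default of R >= 4.0.0, string columns stay character
df_chr <- read.csv(text = csv_text)

# Asking for factors explicitly reproduces the behaviour seen in the transcript
df_fac <- read.csv(text = csv_text, stringsAsFactors = TRUE)

print(is.factor(df_fac$Sex))
print(levels(df_fac$Embarked))
```

glm() also treats character predictors as categorical, so the model call further down should work either way; only the is.factor checks change.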
> data <- data[!is.na(data$Embarked),]
> train <- data[1:800,]
> test <- data[801:889,]
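The split above takes the first 800 rows for training and the remaining 89 for testing, in file order. If the row order carried any pattern, a random split would be a common alternative. A sketch under that assumption, working on row indices only (the seed and the names `train_idx` and `test_idx` are illustrative):

```r
# Arbitrary seed, chosen only so the sketch gives the same split each run
set.seed(42)

# 889 rows remain after the two rows with missing Embarked are dropped
n <- 889

# Draw 800 distinct row numbers for training; the rest form the test set
train_idx <- sample(seq_len(n), size = 800)
test_idx <- setdiff(seq_len(n), train_idx)

print(length(train_idx))
print(length(test_idx))
```

With the transcript's objects this would be used as data[train_idx, ] and data[test_idx, ]; the numbers reported below come from the sequential split, not from this one.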
> model <- glm(Survived ~.,family=binomial(link='logit'),data=train)
> summary(model)
Call:
glm(formula = Survived ~ ., family = binomial(link = "logit"),
    data = train)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.6064  -0.5954  -0.4254   0.6220   2.4165

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  5.137627   0.594998   8.635  < 2e-16 ***
Pclass      -1.087156   0.151168  -7.192 6.40e-13 ***
Sexmale     -2.756819   0.212026 -13.002  < 2e-16 ***
Age         -0.037267   0.008195  -4.547 5.43e-06 ***
SibSp       -0.292920   0.114642  -2.555   0.0106 *
Parch       -0.116576   0.128127  -0.910   0.3629
Fare         0.001528   0.002353   0.649   0.5160
EmbarkedQ   -0.002656   0.400882  -0.007   0.9947
EmbarkedS   -0.318786   0.252960  -1.260   0.2076
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1065.39  on 799  degrees of freedom
Residual deviance:  709.39  on 791  degrees of freedom
AIC: 727.39

Number of Fisher Scoring iterations: 5
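The estimates in the summary are on the log-odds scale. Exponentiating them gives odds ratios, which are often easier to read. A sketch that copies four of the printed estimates into a named vector (the names `coefs` and `odds_ratios` are illustrative; with the fitted object one would call exp(coef(model)) instead):

```r
# Estimates copied from the summary output above (log-odds scale)
coefs <- c(Pclass  = -1.087156,
           Sexmale = -2.756819,
           Age     = -0.037267,
           SibSp   = -0.292920)

# exp() turns a log-odds coefficient into a multiplicative effect on the odds
odds_ratios <- exp(coefs)

print(round(odds_ratios, 3))
```

For example, the Sexmale odds ratio is roughly 0.06: holding the other predictors fixed, the estimated odds of survival for a male passenger are about 6% of those for a female passenger.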
> anova(model, test="Chisq")
Analysis of Deviance Table

Model: binomial, link: logit

Response: Survived

Terms added sequentially (first to last)

         Df Deviance Resid. Df Resid. Dev  Pr(>Chi)
NULL                       799    1065.39
Pclass    1   83.607       798     981.79 < 2.2e-16 ***
Sex       1  240.014       797     741.77 < 2.2e-16 ***
Age       1   17.495       796     724.28 2.881e-05 ***
SibSp     1   10.842       795     713.43  0.000992 ***
Parch     1    0.863       794     712.57  0.352873
Fare      1    0.994       793     711.58  0.318717
Embarked  2    2.187       791     709.39  0.334990
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
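Each Pr(>Chi) value in the table is the upper tail of a chi-squared distribution evaluated at the drop in deviance, with Df degrees of freedom. A sketch that recomputes two of them from the printed (rounded) deviances, so small differences in the last digits are expected; `p_parch` and `p_embarked` are illustrative names:

```r
# Parch: deviance drop 0.863 on 1 degree of freedom
p_parch <- pchisq(0.863, df = 1, lower.tail = FALSE)

# Embarked: deviance drop 2.187 on 2 degrees of freedom (two dummy variables)
p_embarked <- pchisq(2.187, df = 2, lower.tail = FALSE)

print(c(Parch = p_parch, Embarked = p_embarked))
```

Because the terms are added sequentially, each test compares the model with that term against the model containing only the terms listed above it.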
> install.packages("pscl")
> library(pscl)
> pR2(model)
         llh      llhNull           G2     McFadden         r2ML         r2CU
-354.6950111 -532.6961008  356.0021794    0.3341513    0.3591775    0.4880244
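The pR2 figures can be reproduced from the two log-likelihoods and the number of training rows. A sketch using the printed values and n = 800; the variable names are illustrative and the formulas are the usual McFadden, maximum-likelihood and Cragg-Uhler definitions, which agree with the printed output to the digits shown:

```r
# Values copied from the pR2 output above
llh      <- -354.6950111   # log-likelihood of the fitted model
llh_null <- -532.6961008   # log-likelihood of the intercept-only model
n        <- 800            # rows in the training set

# Likelihood-ratio statistic (also null deviance minus residual deviance)
g2 <- -2 * (llh_null - llh)

# McFadden: proportional improvement in log-likelihood
mcfadden <- 1 - llh / llh_null

# Maximum-likelihood pseudo R-squared
r2_ml <- 1 - exp(-g2 / n)

# Cragg-Uhler: r2_ml rescaled by its maximum attainable value
r2_cu <- r2_ml / (1 - exp(2 * llh_null / n))

print(round(c(G2 = g2, McFadden = mcfadden, r2ML = r2_ml, r2CU = r2_cu), 7))
```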
> fitted.results <- predict(model,newdata=subset(test,select=c(2,3,4,5,6,7,8)),type='response')
> fitted.results <- ifelse(fitted.results > 0.5,1,0)
> misClasificError <- mean(fitted.results != test$Survived)
> print(paste('Accuracy',1-misClasificError))
[1] "Accuracy 0.842696629213483"
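The reported accuracy corresponds to 75 correct predictions out of the 89 test rows at a 0.5 cutoff. Accuracy alone does not show which kind of error dominates, so a confusion matrix is a natural companion. A sketch on made-up labels and probabilities (`actual`, `prob`, `predicted` and `confusion` are hypothetical and unrelated to the Titanic results):

```r
# Made-up true labels and predicted probabilities, for illustration only
actual <- c(1, 0, 1, 1, 0, 0, 1, 0)
prob   <- c(0.9, 0.2, 0.4, 0.7, 0.6, 0.1, 0.8, 0.3)

# Same thresholding rule as the transcript: above 0.5 counts as class 1
predicted <- ifelse(prob > 0.5, 1, 0)

# Rows are predictions, columns are true labels
confusion <- table(Predicted = predicted, Actual = actual)

# Share of rows where prediction and label agree
accuracy <- mean(predicted == actual)

print(confusion)
print(accuracy)
```

With the transcript's objects the equivalent call would be table(Predicted = fitted.results, Actual = test$Survived).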
> install.packages("ROCR")
package ‘ROCR’ successfully unpacked and MD5 sums checked
The downloaded binary packages are in
        C:\Users\lenovo\AppData\Local\Temp\Rtmp0mPmE8\downloaded_packages
> library(ROCR)
> p <- predict(model, newdata=subset(test,select=c(2,3,4,5,6,7,8)), type="response")
> pr <- prediction(p, test$Survived)
> prf <- performance(pr, measure = "tpr", x.measure = "fpr")
> plot(prf)
> auc <- performance(pr, measure = "auc")
> auc <- auc@y.values[[1]]
> auc
[1] 0.8647186
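The AUC can be read as the probability that a randomly chosen survivor receives a higher predicted probability than a randomly chosen non-survivor. A base-R sketch of that rank-based (Mann-Whitney) calculation on made-up data, without ROCR; all names and values are illustrative and do not reproduce the 0.8647 above:

```r
# Made-up true labels and predicted probabilities, for illustration only
actual <- c(1, 0, 1, 1, 0, 0, 1, 0)
prob   <- c(0.9, 0.2, 0.4, 0.7, 0.6, 0.1, 0.8, 0.3)

# Rank all predictions together; ties would receive average ranks
r <- rank(prob)

n_pos <- sum(actual == 1)
n_neg <- sum(actual == 0)

# Mann-Whitney U for the positive class, scaled to the [0, 1] range
auc_rank <- (sum(r[actual == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

print(auc_rank)
```

Here 15 of the 16 positive-negative pairs are ordered correctly, which is why the sketch gives 15/16.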

More Related Content

What's hot

SSN-TC workshop talk at ISWC 2015 on Emrooz
SSN-TC workshop talk at ISWC 2015 on EmroozSSN-TC workshop talk at ISWC 2015 on Emrooz
SSN-TC workshop talk at ISWC 2015 on EmroozMarkus Stocker
 
Implement a modified algorithm PF in a FPGA
Implement a modified algorithm PF in a FPGAImplement a modified algorithm PF in a FPGA
Implement a modified algorithm PF in a FPGABruno Martínez Bargiela
 
What we got from the Predicting Red Hat Business Value competition
What we got from the Predicting Red Hat Business Value competitionWhat we got from the Predicting Red Hat Business Value competition
What we got from the Predicting Red Hat Business Value competitionUmaporn Kerdsaeng
 
Coq for ML users
Coq for ML usersCoq for ML users
Coq for ML userstmiya
 
Nasamatic NewHaven.IO 2014 05-21
Nasamatic NewHaven.IO 2014 05-21Nasamatic NewHaven.IO 2014 05-21
Nasamatic NewHaven.IO 2014 05-21Prasanna Gautam
 
INFLUXQL & TICKSCRIPT
INFLUXQL & TICKSCRIPTINFLUXQL & TICKSCRIPT
INFLUXQL & TICKSCRIPTInfluxData
 
Loader and Tester Swarming Drones for Cellular Phone Network Loading and Fiel...
Loader and Tester Swarming Drones for Cellular Phone Network Loading and Fiel...Loader and Tester Swarming Drones for Cellular Phone Network Loading and Fiel...
Loader and Tester Swarming Drones for Cellular Phone Network Loading and Fiel...Amir MirzaeiNia
 
Mcs 041 assignment solution (2020-21)
Mcs 041 assignment solution (2020-21)Mcs 041 assignment solution (2020-21)
Mcs 041 assignment solution (2020-21)smumbahelp
 
Track Finding in LHCb's 2020 Trigger
Track Finding in LHCb's 2020 TriggerTrack Finding in LHCb's 2020 Trigger
Track Finding in LHCb's 2020 TriggerTimothy Head
 
Confidentiality as a service –usable security for the cloud
Confidentiality as a service –usable security for the cloudConfidentiality as a service –usable security for the cloud
Confidentiality as a service –usable security for the cloudMaha Saad
 
The FE-I4 Pixel Readout System-on-Chip for ATLAS Experiment Upgrades
The FE-I4 Pixel Readout System-on-Chip  for ATLAS Experiment UpgradesThe FE-I4 Pixel Readout System-on-Chip  for ATLAS Experiment Upgrades
The FE-I4 Pixel Readout System-on-Chip for ATLAS Experiment Upgradesthemperek
 
Downsampling your data October 2017
Downsampling your data October 2017Downsampling your data October 2017
Downsampling your data October 2017InfluxData
 
H2O World - PySparkling Water - Nidhi Mehta
H2O World - PySparkling Water - Nidhi MehtaH2O World - PySparkling Water - Nidhi Mehta
H2O World - PySparkling Water - Nidhi MehtaSri Ambati
 
Tim lucas-id2ox
Tim lucas-id2oxTim lucas-id2ox
Tim lucas-id2oxTim Lucas
 
Jonathan Lefman presents his work on Superresolution chemical microscopy
Jonathan Lefman presents his work on Superresolution chemical microscopyJonathan Lefman presents his work on Superresolution chemical microscopy
Jonathan Lefman presents his work on Superresolution chemical microscopyJonathan Lefman
 
Jsr310 - Java 8 Date and Time API
Jsr310 - Java 8 Date and Time APIJsr310 - Java 8 Date and Time API
Jsr310 - Java 8 Date and Time APIAdy Liu
 

What's hot (20)

SSN-TC workshop talk at ISWC 2015 on Emrooz
SSN-TC workshop talk at ISWC 2015 on EmroozSSN-TC workshop talk at ISWC 2015 on Emrooz
SSN-TC workshop talk at ISWC 2015 on Emrooz
 
Implement a modified algorithm PF in a FPGA
Implement a modified algorithm PF in a FPGAImplement a modified algorithm PF in a FPGA
Implement a modified algorithm PF in a FPGA
 
What we got from the Predicting Red Hat Business Value competition
What we got from the Predicting Red Hat Business Value competitionWhat we got from the Predicting Red Hat Business Value competition
What we got from the Predicting Red Hat Business Value competition
 
An Overview of HDF-EOS (Part 1)
An Overview of HDF-EOS (Part 1)An Overview of HDF-EOS (Part 1)
An Overview of HDF-EOS (Part 1)
 
Coq for ML users
Coq for ML usersCoq for ML users
Coq for ML users
 
Nasamatic NewHaven.IO 2014 05-21
Nasamatic NewHaven.IO 2014 05-21Nasamatic NewHaven.IO 2014 05-21
Nasamatic NewHaven.IO 2014 05-21
 
INFLUXQL & TICKSCRIPT
INFLUXQL & TICKSCRIPTINFLUXQL & TICKSCRIPT
INFLUXQL & TICKSCRIPT
 
Sorter
SorterSorter
Sorter
 
Loader and Tester Swarming Drones for Cellular Phone Network Loading and Fiel...
Loader and Tester Swarming Drones for Cellular Phone Network Loading and Fiel...Loader and Tester Swarming Drones for Cellular Phone Network Loading and Fiel...
Loader and Tester Swarming Drones for Cellular Phone Network Loading and Fiel...
 
Mcs 041 assignment solution (2020-21)
Mcs 041 assignment solution (2020-21)Mcs 041 assignment solution (2020-21)
Mcs 041 assignment solution (2020-21)
 
Track Finding in LHCb's 2020 Trigger
Track Finding in LHCb's 2020 TriggerTrack Finding in LHCb's 2020 Trigger
Track Finding in LHCb's 2020 Trigger
 
Confidentiality as a service –usable security for the cloud
Confidentiality as a service –usable security for the cloudConfidentiality as a service –usable security for the cloud
Confidentiality as a service –usable security for the cloud
 
The FE-I4 Pixel Readout System-on-Chip for ATLAS Experiment Upgrades
The FE-I4 Pixel Readout System-on-Chip  for ATLAS Experiment UpgradesThe FE-I4 Pixel Readout System-on-Chip  for ATLAS Experiment Upgrades
The FE-I4 Pixel Readout System-on-Chip for ATLAS Experiment Upgrades
 
Android Refactoring
Android RefactoringAndroid Refactoring
Android Refactoring
 
CellCoverage
CellCoverageCellCoverage
CellCoverage
 
Downsampling your data October 2017
Downsampling your data October 2017Downsampling your data October 2017
Downsampling your data October 2017
 
H2O World - PySparkling Water - Nidhi Mehta
H2O World - PySparkling Water - Nidhi MehtaH2O World - PySparkling Water - Nidhi Mehta
H2O World - PySparkling Water - Nidhi Mehta
 
Tim lucas-id2ox
Tim lucas-id2oxTim lucas-id2ox
Tim lucas-id2ox
 
Jonathan Lefman presents his work on Superresolution chemical microscopy
Jonathan Lefman presents his work on Superresolution chemical microscopyJonathan Lefman presents his work on Superresolution chemical microscopy
Jonathan Lefman presents his work on Superresolution chemical microscopy
 
Jsr310 - Java 8 Date and Time API
Jsr310 - Java 8 Date and Time APIJsr310 - Java 8 Date and Time API
Jsr310 - Java 8 Date and Time API
 

Similar to Logistic Regression in R-An Exmple.

Parallel Computing with R
Parallel Computing with RParallel Computing with R
Parallel Computing with RPeter Solymos
 
Boston Predictive Analytics: Linear and Logistic Regression Using R - Interme...
Boston Predictive Analytics: Linear and Logistic Regression Using R - Interme...Boston Predictive Analytics: Linear and Logistic Regression Using R - Interme...
Boston Predictive Analytics: Linear and Logistic Regression Using R - Interme...Enplus Advisors, Inc.
 
Time Series Analysis and Mining with R
Time Series Analysis and Mining with RTime Series Analysis and Mining with R
Time Series Analysis and Mining with RYanchang Zhao
 
Performance tweaks and tools for Linux (Joe Damato)
Performance tweaks and tools for Linux (Joe Damato)Performance tweaks and tools for Linux (Joe Damato)
Performance tweaks and tools for Linux (Joe Damato)Ontico
 
Native interfaces for R
Native interfaces for RNative interfaces for R
Native interfaces for RSeth Falcon
 
Complex models in ecology: challenges and solutions
Complex models in ecology: challenges and solutionsComplex models in ecology: challenges and solutions
Complex models in ecology: challenges and solutionsPeter Solymos
 
Deep Learning, Microsoft Cognitive Toolkit (CNTK) and Azure Machine Learning ...
Deep Learning, Microsoft Cognitive Toolkit (CNTK) and Azure Machine Learning ...Deep Learning, Microsoft Cognitive Toolkit (CNTK) and Azure Machine Learning ...
Deep Learning, Microsoft Cognitive Toolkit (CNTK) and Azure Machine Learning ...Naoki (Neo) SATO
 
Java Performance Puzzlers
Java Performance PuzzlersJava Performance Puzzlers
Java Performance PuzzlersDoug Hawkins
 
Debugging Ruby Systems
Debugging Ruby SystemsDebugging Ruby Systems
Debugging Ruby SystemsEngine Yard
 
NIPS2017 Few-shot Learning and Graph Convolution
NIPS2017 Few-shot Learning and Graph ConvolutionNIPS2017 Few-shot Learning and Graph Convolution
NIPS2017 Few-shot Learning and Graph ConvolutionKazuki Fujikawa
 
03 - Refresher on buffer overflow in the old days
03 - Refresher on buffer overflow in the old days03 - Refresher on buffer overflow in the old days
03 - Refresher on buffer overflow in the old daysAlexandre Moneger
 
Algorithm Selection for Preferred Extensions Enumeration
Algorithm Selection for Preferred Extensions EnumerationAlgorithm Selection for Preferred Extensions Enumeration
Algorithm Selection for Preferred Extensions EnumerationFederico Cerutti
 
Locks? We Don't Need No Stinkin' Locks - Michael Barker
Locks? We Don't Need No Stinkin' Locks - Michael BarkerLocks? We Don't Need No Stinkin' Locks - Michael Barker
Locks? We Don't Need No Stinkin' Locks - Michael BarkerJAX London
 
Lock? We don't need no stinkin' locks!
Lock? We don't need no stinkin' locks!Lock? We don't need no stinkin' locks!
Lock? We don't need no stinkin' locks!Michael Barker
 
Postgres performance for humans
Postgres performance for humansPostgres performance for humans
Postgres performance for humansCraig Kerstiens
 
Nodejs性能分析优化和分布式设计探讨
Nodejs性能分析优化和分布式设计探讨Nodejs性能分析优化和分布式设计探讨
Nodejs性能分析优化和分布式设计探讨flyinweb
 

Similar to Logistic Regression in R-An Exmple. (20)

Parallel Computing with R
Parallel Computing with RParallel Computing with R
Parallel Computing with R
 
Boston Predictive Analytics: Linear and Logistic Regression Using R - Interme...
Boston Predictive Analytics: Linear and Logistic Regression Using R - Interme...Boston Predictive Analytics: Linear and Logistic Regression Using R - Interme...
Boston Predictive Analytics: Linear and Logistic Regression Using R - Interme...
 
Seminar PSU 10.10.2014 mme
Seminar PSU 10.10.2014 mmeSeminar PSU 10.10.2014 mme
Seminar PSU 10.10.2014 mme
 
Time Series Analysis and Mining with R
Time Series Analysis and Mining with RTime Series Analysis and Mining with R
Time Series Analysis and Mining with R
 
Performance tweaks and tools for Linux (Joe Damato)
Performance tweaks and tools for Linux (Joe Damato)Performance tweaks and tools for Linux (Joe Damato)
Performance tweaks and tools for Linux (Joe Damato)
 
Native interfaces for R
Native interfaces for RNative interfaces for R
Native interfaces for R
 
Complex models in ecology: challenges and solutions
Complex models in ecology: challenges and solutionsComplex models in ecology: challenges and solutions
Complex models in ecology: challenges and solutions
 
Deep Learning, Microsoft Cognitive Toolkit (CNTK) and Azure Machine Learning ...
Deep Learning, Microsoft Cognitive Toolkit (CNTK) and Azure Machine Learning ...Deep Learning, Microsoft Cognitive Toolkit (CNTK) and Azure Machine Learning ...
Deep Learning, Microsoft Cognitive Toolkit (CNTK) and Azure Machine Learning ...
 
R programming language
R programming languageR programming language
R programming language
 
Java Performance Puzzlers
Java Performance PuzzlersJava Performance Puzzlers
Java Performance Puzzlers
 
Spark_Documentation_Template1
Spark_Documentation_Template1Spark_Documentation_Template1
Spark_Documentation_Template1
 
Debugging Ruby Systems
Debugging Ruby SystemsDebugging Ruby Systems
Debugging Ruby Systems
 
NIPS2017 Few-shot Learning and Graph Convolution
NIPS2017 Few-shot Learning and Graph ConvolutionNIPS2017 Few-shot Learning and Graph Convolution
NIPS2017 Few-shot Learning and Graph Convolution
 
Efficient Programs
Efficient ProgramsEfficient Programs
Efficient Programs
 
03 - Refresher on buffer overflow in the old days
03 - Refresher on buffer overflow in the old days03 - Refresher on buffer overflow in the old days
03 - Refresher on buffer overflow in the old days
 
Algorithm Selection for Preferred Extensions Enumeration
Algorithm Selection for Preferred Extensions EnumerationAlgorithm Selection for Preferred Extensions Enumeration
Algorithm Selection for Preferred Extensions Enumeration
 
Locks? We Don't Need No Stinkin' Locks - Michael Barker
Locks? We Don't Need No Stinkin' Locks - Michael BarkerLocks? We Don't Need No Stinkin' Locks - Michael Barker
Locks? We Don't Need No Stinkin' Locks - Michael Barker
 
Lock? We don't need no stinkin' locks!
Lock? We don't need no stinkin' locks!Lock? We don't need no stinkin' locks!
Lock? We don't need no stinkin' locks!
 
Postgres performance for humans
Postgres performance for humansPostgres performance for humans
Postgres performance for humans
 
Nodejs性能分析优化和分布式设计探讨
Nodejs性能分析优化和分布式设计探讨Nodejs性能分析优化和分布式设计探讨
Nodejs性能分析优化和分布式设计探讨
 

More from Dr. Volkan OBAN

Conference Paper:IMAGE PROCESSING AND OBJECT DETECTION APPLICATION: INSURANCE...
Conference Paper:IMAGE PROCESSING AND OBJECT DETECTION APPLICATION: INSURANCE...Conference Paper:IMAGE PROCESSING AND OBJECT DETECTION APPLICATION: INSURANCE...
Conference Paper:IMAGE PROCESSING AND OBJECT DETECTION APPLICATION: INSURANCE...Dr. Volkan OBAN
 
Covid19py Python Package - Example
Covid19py  Python Package - ExampleCovid19py  Python Package - Example
Covid19py Python Package - ExampleDr. Volkan OBAN
 
Object detection with Python
Object detection with Python Object detection with Python
Object detection with Python Dr. Volkan OBAN
 
Python - Rastgele Orman(Random Forest) Parametreleri
Python - Rastgele Orman(Random Forest) ParametreleriPython - Rastgele Orman(Random Forest) Parametreleri
Python - Rastgele Orman(Random Forest) ParametreleriDr. Volkan OBAN
 
Linear Programming wi̇th R - Examples
Linear Programming wi̇th R - ExamplesLinear Programming wi̇th R - Examples
Linear Programming wi̇th R - ExamplesDr. Volkan OBAN
 
"optrees" package in R and examples.(optrees:finds optimal trees in weighted ...
"optrees" package in R and examples.(optrees:finds optimal trees in weighted ..."optrees" package in R and examples.(optrees:finds optimal trees in weighted ...
"optrees" package in R and examples.(optrees:finds optimal trees in weighted ...Dr. Volkan OBAN
 
k-means Clustering in Python
k-means Clustering in Pythonk-means Clustering in Python
k-means Clustering in PythonDr. Volkan OBAN
 
Naive Bayes Example using R
Naive Bayes Example using  R Naive Bayes Example using  R
Naive Bayes Example using R Dr. Volkan OBAN
 
k-means Clustering and Custergram with R
k-means Clustering and Custergram with Rk-means Clustering and Custergram with R
k-means Clustering and Custergram with RDr. Volkan OBAN
 
Data Science and its Relationship to Big Data and Data-Driven Decision Making
Data Science and its Relationship to Big Data and Data-Driven Decision MakingData Science and its Relationship to Big Data and Data-Driven Decision Making
Data Science and its Relationship to Big Data and Data-Driven Decision MakingDr. Volkan OBAN
 
Data Visualization with R.ggplot2 and its extensions examples.
Data Visualization with R.ggplot2 and its extensions examples.Data Visualization with R.ggplot2 and its extensions examples.
Data Visualization with R.ggplot2 and its extensions examples.Dr. Volkan OBAN
 
Scikit-learn Cheatsheet-Python
Scikit-learn Cheatsheet-PythonScikit-learn Cheatsheet-Python
Scikit-learn Cheatsheet-PythonDr. Volkan OBAN
 
Python Pandas for Data Science cheatsheet
Python Pandas for Data Science cheatsheet Python Pandas for Data Science cheatsheet
Python Pandas for Data Science cheatsheet Dr. Volkan OBAN
 
Pandas,scipy,numpy cheatsheet
Pandas,scipy,numpy cheatsheetPandas,scipy,numpy cheatsheet
Pandas,scipy,numpy cheatsheetDr. Volkan OBAN
 
ReporteRs package in R. forming powerpoint documents-an example
ReporteRs package in R. forming powerpoint documents-an exampleReporteRs package in R. forming powerpoint documents-an example
ReporteRs package in R. forming powerpoint documents-an exampleDr. Volkan OBAN
 
ReporteRs package in R. forming powerpoint documents-an example
ReporteRs package in R. forming powerpoint documents-an exampleReporteRs package in R. forming powerpoint documents-an example
ReporteRs package in R. forming powerpoint documents-an exampleDr. Volkan OBAN
 
R-ggplot2 package Examples
R-ggplot2 package ExamplesR-ggplot2 package Examples
R-ggplot2 package ExamplesDr. Volkan OBAN
 
R Machine Learning packages( generally used)
R Machine Learning packages( generally used)R Machine Learning packages( generally used)
R Machine Learning packages( generally used)Dr. Volkan OBAN
 
treemap package in R and examples.
treemap package in R and examples.treemap package in R and examples.
treemap package in R and examples.Dr. Volkan OBAN
 

More from Dr. Volkan OBAN (20)

Conference Paper:IMAGE PROCESSING AND OBJECT DETECTION APPLICATION: INSURANCE...
Conference Paper:IMAGE PROCESSING AND OBJECT DETECTION APPLICATION: INSURANCE...Conference Paper:IMAGE PROCESSING AND OBJECT DETECTION APPLICATION: INSURANCE...
Conference Paper:IMAGE PROCESSING AND OBJECT DETECTION APPLICATION: INSURANCE...
 
Covid19py Python Package - Example
Covid19py  Python Package - ExampleCovid19py  Python Package - Example
Covid19py Python Package - Example
 
Object detection with Python
Object detection with Python Object detection with Python
Object detection with Python
 
Python - Rastgele Orman(Random Forest) Parametreleri
Python - Rastgele Orman(Random Forest) ParametreleriPython - Rastgele Orman(Random Forest) Parametreleri
Python - Rastgele Orman(Random Forest) Parametreleri
 
Linear Programming wi̇th R - Examples
Linear Programming wi̇th R - ExamplesLinear Programming wi̇th R - Examples
Linear Programming wi̇th R - Examples
 
"optrees" package in R and examples.(optrees:finds optimal trees in weighted ...
"optrees" package in R and examples.(optrees:finds optimal trees in weighted ..."optrees" package in R and examples.(optrees:finds optimal trees in weighted ...
"optrees" package in R and examples.(optrees:finds optimal trees in weighted ...
 
k-means Clustering in Python
k-means Clustering in Pythonk-means Clustering in Python
k-means Clustering in Python
 
Naive Bayes Example using R
Naive Bayes Example using  R Naive Bayes Example using  R
Naive Bayes Example using R
 
R forecasting Example
R forecasting ExampleR forecasting Example
R forecasting Example
 
k-means Clustering and Custergram with R
k-means Clustering and Custergram with Rk-means Clustering and Custergram with R
k-means Clustering and Custergram with R
 
Data Science and its Relationship to Big Data and Data-Driven Decision Making
Data Science and its Relationship to Big Data and Data-Driven Decision MakingData Science and its Relationship to Big Data and Data-Driven Decision Making
Data Science and its Relationship to Big Data and Data-Driven Decision Making
 
Data Visualization with R.ggplot2 and its extensions examples.
Data Visualization with R.ggplot2 and its extensions examples.Data Visualization with R.ggplot2 and its extensions examples.
Data Visualization with R.ggplot2 and its extensions examples.
 
Scikit-learn Cheatsheet-Python
Scikit-learn Cheatsheet-PythonScikit-learn Cheatsheet-Python
Scikit-learn Cheatsheet-Python
 
Python Pandas for Data Science cheatsheet
Python Pandas for Data Science cheatsheet Python Pandas for Data Science cheatsheet
Python Pandas for Data Science cheatsheet
 
Pandas,scipy,numpy cheatsheet
Pandas,scipy,numpy cheatsheetPandas,scipy,numpy cheatsheet
Pandas,scipy,numpy cheatsheet
 
ReporteRs package in R. forming powerpoint documents-an example
ReporteRs package in R. forming powerpoint documents-an exampleReporteRs package in R. forming powerpoint documents-an example
ReporteRs package in R. forming powerpoint documents-an example
 
ReporteRs package in R. forming powerpoint documents-an example
ReporteRs package in R. forming powerpoint documents-an exampleReporteRs package in R. forming powerpoint documents-an example
ReporteRs package in R. forming powerpoint documents-an example
 
R-ggplot2 package Examples
R-ggplot2 package ExamplesR-ggplot2 package Examples
R-ggplot2 package Examples
 
R Machine Learning packages( generally used)
R Machine Learning packages( generally used)R Machine Learning packages( generally used)
R Machine Learning packages( generally used)
 
treemap package in R and examples.
treemap package in R and examples.treemap package in R and examples.
treemap package in R and examples.
 

Recently uploaded

04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home ServiceSapana Sha
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxBoston Institute of Analytics
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一F La
 

Recently uploaded (20)

04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...

Logistic Regression in R - An Example.

  • 1.
    Prepared by VOLKAN OBAN
    Reference: http://datascienceplus.com/perform-logistic-regression-in-r/
    Logistic Regression in R
    Example:
    #data Titanic -> https://www.kaggle.com/datasets
    > training.data.raw <- read.csv('train.csv',header=T,na.strings=c(""))
    > sapply(training.data.raw,function(x) sum(is.na(x)))
    PassengerId Survived Pclass Name Sex Age SibSp Parch
              0        0      0    0   0 177     0     0
    Ticket Fare Cabin Embarked
         0    0   687        2
    > sapply(training.data.raw, function(x) length(unique(x)))
    PassengerId Survived Pclass Name Sex Age SibSp Parch
            891        2      3  891   2  89     7     7
    Ticket Fare Cabin Embarked
       681  248   148        4
    > library(Amelia)
    Loading required package: Rcpp
    ##
    ## Amelia II: Multiple Imputation
    ## (Version 1.7.4, built: 2015-12-05)
    ## Copyright (C) 2005-2016 James Honaker, Gary King and Matthew Blackwell
    ## Refer to http://gking.harvard.edu/amelia/ for more information
    ##
    > missmap(training.data.raw, main = "Missing values vs observed")
  • 2.
    > data <- subset(training.data.raw,select=c(2,3,5,6,7,8,10,12))
    > data$Age[is.na(data$Age)] <- mean(data$Age,na.rm=T)
    > is.factor(data$Sex)
    [1] TRUE
    > is.factor(data$Embarked)
    [1] TRUE
    > data <- data[!is.na(data$Embarked),]
    > train <- data[1:800,]
    > test <- data[801:889,]
    > model <- glm(Survived ~.,family=binomial(link='logit'),data=train)
    > summary(model)
    Call:
    glm(formula = Survived ~ ., family = binomial(link = "logit"),
        data = train)
    Deviance Residuals:
        Min       1Q   Median       3Q      Max
    -2.6064  -0.5954  -0.4254   0.6220   2.4165
    Coefficients:
  • 3.
                 Estimate Std. Error z value Pr(>|z|)
    (Intercept)  5.137627   0.594998   8.635  < 2e-16 ***
    Pclass      -1.087156   0.151168  -7.192 6.40e-13 ***
    Sexmale     -2.756819   0.212026 -13.002  < 2e-16 ***
    Age         -0.037267   0.008195  -4.547 5.43e-06 ***
    SibSp       -0.292920   0.114642  -2.555   0.0106 *
    Parch       -0.116576   0.128127  -0.910   0.3629
    Fare         0.001528   0.002353   0.649   0.5160
    EmbarkedQ   -0.002656   0.400882  -0.007   0.9947
    EmbarkedS   -0.318786   0.252960  -1.260   0.2076
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    (Dispersion parameter for binomial family taken to be 1)
        Null deviance: 1065.39 on 799 degrees of freedom
    Residual deviance: 709.39 on 791 degrees of freedom
    AIC: 727.39
    Number of Fisher Scoring iterations: 5
    > anova(model, test="Chisq")
    Analysis of Deviance Table
    Model: binomial, link: logit
    Response: Survived
    Terms added sequentially (first to last)
             Df Deviance Resid. Df Resid. Dev  Pr(>Chi)
    NULL                       799    1065.39
    Pclass    1   83.607       798     981.79 < 2.2e-16 ***
    Sex       1  240.014       797     741.77 < 2.2e-16 ***
    Age       1   17.495       796     724.28 2.881e-05 ***
    SibSp     1   10.842       795     713.43  0.000992 ***
    Parch     1    0.863       794     712.57  0.352873
    Fare      1    0.994       793     711.58  0.318717
    Embarked  2    2.187       791     709.39  0.334990
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    > install.packages("pscl")
    > library(pscl)
    > pR2(model)
             llh      llhNull           G2     McFadden         r2ML         r2CU
    -354.6950111 -532.6961008  356.0021794    0.3341513    0.3591775    0.4880244
    > fitted.results <- predict(model,newdata=subset(test,select=c(2,3,4,5,6,7,8)),type='response')
    > fitted.results <- ifelse(fitted.results > 0.5,1,0)
    > misClasificError <- mean(fitted.results != test$Survived)
    > print(paste('Accuracy',1-misClasificError))
    [1] "Accuracy 0.842696629213483"
    > install.packages("ROCR")
    package ‘ROCR’ successfully unpacked and MD5 sums checked
  • 4.
    The downloaded binary packages are in
        C:\Users\lenovo\AppData\Local\Temp\Rtmp0mPmE8\downloaded_packages
    > library(ROCR)
    > p <- predict(model, newdata=subset(test,select=c(2,3,4,5,6,7,8)), type="response")
    > pr <- prediction(p, test$Survived)
    > prf <- performance(pr, measure = "tpr", x.measure = "fpr")
    > plot(prf)
    > auc <- performance(pr, measure = "auc")
    > auc <- auc@y.values[[1]]
    > auc
    [1] 0.8647186
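
The data-cleaning steps at the start of the session (reading blanks as NA, counting missing values per column, mean-imputing Age, dropping rows with a missing Embarked) can be tried without train.csv. The sketch below uses a small made-up data frame, so the values and the name `toy` are assumptions for illustration only. Note also that from R 4.0.0, read.csv() defaults to stringsAsFactors = FALSE, so the is.factor() checks in the session would return FALSE on a current R installation unless that argument is set.

```r
# Hypothetical toy data standing in for train.csv (all values are made up).
csv_text <- "Survived,Age,Embarked
1,22,S
0,,C
1,38,
0,30,S"

# na.strings = c("") reads empty fields as NA, including empty text fields.
toy <- read.csv(text = csv_text, header = TRUE, na.strings = c(""))

# Count missing values per column, as in the session.
na_counts <- sapply(toy, function(x) sum(is.na(x)))
print(na_counts)

# Replace missing ages with the mean of the observed ages.
toy$Age[is.na(toy$Age)] <- mean(toy$Age, na.rm = TRUE)

# Drop the rows where Embarked is missing.
toy <- toy[!is.na(toy$Embarked), ]
print(toy)
```

Mean imputation is the simplest choice and is what the session uses; it keeps every row but shrinks the variance of Age, which is worth keeping in mind when reading the Age coefficient.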
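
As a reading aid for the summary(model) and pR2(model) output, the following sketch recomputes two quantities from the numbers printed in the session. The coefficients and log-likelihoods are copied from the output; the variable names are illustrative. Exponentiating a logit coefficient gives an odds ratio, so the Sexmale estimate corresponds to odds of survival for males of roughly 0.06 times those for females, holding the other predictors fixed.

```r
# Coefficients copied from the summary(model) output in the session.
coefs <- c(Pclass = -1.087156, Sexmale = -2.756819,
           Age = -0.037267, SibSp = -0.292920)

# exp(coefficient) is the multiplicative change in the odds of survival
# for a one-unit increase in the predictor.
odds_ratios <- exp(coefs)
print(round(odds_ratios, 3))

# Log-likelihoods copied from the pR2(model) output.
llh      <- -354.6950111   # fitted model
llh_null <- -532.6961008   # intercept-only model

# McFadden's pseudo R-squared: 1 - llh / llhNull.
mcfadden <- 1 - llh / llh_null
print(mcfadden)

# For a 0/1 response the deviance equals -2 * log-likelihood, which
# links these values to the deviances printed by summary(model).
residual_deviance <- -2 * llh
null_deviance     <- -2 * llh_null
print(c(null_deviance, residual_deviance))
```

The recomputed McFadden value agrees with the 0.3341513 printed by pR2, and the two deviances agree with the 1065.39 and 709.39 reported by summary(model).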
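
To make the two evaluation numbers less of a black box, here is a minimal base-R sketch of accuracy at the 0.5 cutoff and of AUC computed as a rank statistic (the Mann-Whitney formulation). The labels and probabilities are made up, and this is not how ROCR computes its result internally; it is only expected to agree with ROCR's "auc" measure on the same inputs.

```r
# Hypothetical true labels and predicted probabilities (made-up values).
labels <- c(0, 0, 1, 1, 0, 1, 0, 1)
probs  <- c(0.10, 0.40, 0.35, 0.80, 0.20, 0.90, 0.55, 0.70)

# Accuracy at the 0.5 cutoff, mirroring the ifelse() step in the session.
predicted <- ifelse(probs > 0.5, 1, 0)
accuracy  <- mean(predicted == labels)
print(accuracy)

# AUC as the share of (positive, negative) pairs in which the positive
# case receives the higher score, computed from the ranks of the scores.
r     <- rank(probs)
n_pos <- sum(labels == 1)
n_neg <- sum(labels == 0)
auc   <- (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
print(auc)
```

Accuracy depends on the chosen cutoff, while AUC summarises ranking quality across all cutoffs, which is why the session reports both.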