# Logistic regression

Logistic regression made easy

### Logistic regression

1. Logistic regression. Dr. Khaled Mahmoud Abd Elaziz, Lecturer of Public Health and Preventive Medicine, Faculty of Medicine, Ain Shams University
2. Logistic regression is very similar to linear regression. When do we use logistic regression? We use it when we have a binary outcome of interest and a number of explanatory variables. Outcome: e.g. the presence or absence of a symptom, or the presence or absence of a disease.
4. From the equation of the logistic regression model we can: (1) determine which explanatory variables influence the outcome, that is, which variables have the highest odds ratio (OR), or risk, for producing the outcome (1 = has the disease, 0 = doesn't have the disease).
5. From the equation of the logistic regression model we can: (2) use an individual's values of the explanatory variables to estimate the probability that he or she will have a particular outcome.
6. We start the logistic regression model by creating a binary variable to represent the outcome (the dependent variable): 1 = has the disease, 0 = doesn't have the disease. We take the probability p that an individual falls in the highest-coded category (has the disease) as the dependent variable. We use the logit (logistic) transformation in the regression equation.
7. The logit is the natural logarithm of the odds of 'disease': $\operatorname{logit}(p) = \ln\frac{p}{1-p}$. The logistic regression equation is $\operatorname{logit}(p) = a + b_1 X_1 + b_2 X_2 + b_3 X_3 + \dots + b_i X_i$, where a = the constant (intercept); X = explanatory variables; p = estimated value of the true probability that an individual with a particular set of values for X has the disease (p corresponds to the proportion with the disease and has an underlying binomial distribution); b = estimated logistic regression coefficients. The exponential of a particular coefficient, for example $e^{b_1}$, is an estimate of the odds ratio.
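The logit and its inverse can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names `logit` and `inverse_logit` are chosen here for illustration and do not come from the slides.

```python
import math


def logit(p: float) -> float:
    """Natural log of the odds p / (1 - p); defined only for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def inverse_logit(z: float) -> float:
    """Map a value on the logit (log-odds) scale back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Round trip: probability -> log-odds -> probability
for p in (0.1, 0.5, 0.9):
    z = logit(p)
    print(f"p = {p:.1f}  logit = {z:+.4f}  back to p = {inverse_logit(z):.1f}")
```

A probability of 0.5 corresponds to a logit of 0; probabilities above 0.5 give positive logits and probabilities below 0.5 give negative ones.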
8. $e^{b_1}$ estimates the odds ratio for $X_1$: the factor by which the estimated odds of the disease change for a one-unit increase in $X_1$, while adjusting for all other X's in the equation. Because the logistic regression is fitted on a log scale, the effects of the X's are multiplicative on the odds of the disease. This means that their combined effect is the product of their separate effects. This is unlike linear regression, where the effects of the X's on the dependent variable are additive.
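The multiplicative behaviour follows from exponentiating both sides of the regression equation. A short derivation, using the same symbols as the slides:

```latex
\begin{aligned}
\ln\frac{p}{1-p} &= a + b_1 X_1 + b_2 X_2 + \dots + b_i X_i \\[4pt]
\frac{p}{1-p} &= e^{a} \cdot e^{b_1 X_1} \cdot e^{b_2 X_2} \cdots e^{b_i X_i} \\[4pt]
\frac{\text{odds at } X_1 + 1}{\text{odds at } X_1}
  &= \frac{e^{b_1 (X_1 + 1)}}{e^{b_1 X_1}} = e^{b_1}
  \quad \text{(all other X's held fixed)}
\end{aligned}
```

Each explanatory variable contributes its own factor to the odds, so a sum on the log-odds scale becomes a product on the odds scale.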
9. Plain English: (1) Take the variables that were significant in the univariate analysis. (2) Set the P value threshold at which variables are entered into the model, e.g. 0.05 or 0.1. (3) If all variables in the univariate analysis are non-significant, don't bother doing logistic regression; there is no question here about using those variables to predict the disease.
11. Plain English: (4) The idea of doing a logistic regression is that we have too many variables that are significantly associated with the outcome we are looking at, and we want to know which is the stronger predictor of the disease outcome. It is a mathematical model that describes the relationship between an outcome and one or more explanatory variables. (5) We look in the output of the statistical program for the odds ratio and its confidence interval (CI) and the significance of each variable, and adjust the model to select the best combination of explanatory variables.
12. Example: A study examined the relationship between HHV-8 infection and sexual behavior in men. The men were asked about their history of sexually transmitted diseases (gonorrhea, syphilis, HSV-2, and HIV). The explanatory variables were the presence of each of the four infections, coded 0 if the patient had no history of that infection or 1 if he had, and the patient's age in years.
13. Dependent outcome: HHV-8 infection

    | Parameter | Estimate | P | OR | 95% CI |
    |-----------|----------|--------|-------|------------|
    | Intercept | -2.2242 | 0.006 | | |
    | Gonorrhea | 0.5093 | 0.243 | 1.664 | 0.71-3.91 |
    | Syphilis | 1.1924 | 0.093 | 3.295 | 0.82-13.8 |
    | HSV-2 | 0.7910 | 0.0410 | 2.206 | 1.03-4.71 |
    | HIV | 1.6357 | 0.0067 | 5.133 | 1.57-16.73 |
    | Age | 0.0062 | 0.76 | 1.006 | 0.97-1.05 |

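The OR column is the exponential of the estimate column, which can be checked directly. The sketch below copies the estimates and reported odds ratios from the slide and recomputes each OR; the dictionary layout and the helper name `odds_ratio` are illustrative choices, not part of the original analysis.

```python
import math

# (parameter estimate, odds ratio as reported on the slide)
reported = {
    "Gonorrhea": (0.5093, 1.664),
    "Syphilis": (1.1924, 3.295),
    "HSV-2": (0.7910, 2.206),
    "HIV": (1.6357, 5.133),
    "Age": (0.0062, 1.006),
}


def odds_ratio(coefficient: float) -> float:
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(coefficient)


for name, (estimate, reported_or) in reported.items():
    print(f"{name:<10} exp({estimate:.4f}) = {odds_ratio(estimate):.3f}"
          f"  (reported: {reported_or})")
```

The intercept is left out because it has no odds ratio of its own; it sets the baseline log-odds when every explanatory variable is zero.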
14. Example: Chi-square for covariates = 24.5, P = 0.002, indicating that at least one of the covariates is significantly associated with HHV-8 serostatus. HSV-2 is positively associated with HHV-8 infection (P = 0.04). HIV is positively associated with HHV-8 infection (P = 0.007).
15. Those with a history of HSV-2 have 2.21 times the odds of being HHV-8 positive compared with those with a negative history, after adjusting for the other infections. Those with a history of HIV have 5.1 times the odds of being HHV-8 positive compared with those with a negative history, after adjusting for the other infections.
16. The multiplicative effect of the model suggests that a man who is both HSV-2 and HIV seropositive is estimated to have 2.206 × 5.133 = 11.3 times the odds of HHV-8 infection compared with a man negative for both, after adjusting for the other two infections. In this example, gonorrhea had a significant chi-square, but when entered in the model it was not significant (no indication of an independent relationship between a history of gonorrhea and HHV-8 seropositivity).
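The combined odds ratio can be reached two ways: by multiplying the two reported odds ratios, or by adding the two coefficients and exponentiating. A minimal check using the HSV-2 and HIV values from the example (variable names are illustrative):

```python
import math

# Coefficients and reported odds ratios for HSV-2 and HIV from the example
b_hsv2, b_hiv = 0.7910, 1.6357
or_hsv2, or_hiv = 2.206, 5.133

# Route 1: product of the separate odds ratios
combined_from_ors = or_hsv2 * or_hiv

# Route 2: exponential of the summed coefficients
combined_from_coefficients = math.exp(b_hsv2 + b_hiv)

print(f"product of ORs:        {combined_from_ors:.1f}")
print(f"exp(sum of estimates): {combined_from_coefficients:.1f}")
```

The two routes differ only by the rounding already present in the reported odds ratios.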
17. There is no significant relationship between HHV-8 seropositivity and age (P = 0.76); the odds ratio of 1.006 indicates that the estimated odds of HHV-8 seropositivity increase by 0.6% for each additional year of age.
18. What is the probability that a 51-year-old man has HHV-8 infection if he is positive for gonorrhea and HSV-2 but does not have the two other diseases (syphilis and HIV)? Add up the regression coefficients that apply to him, where $b_1$ is the gonorrhea coefficient, $b_2$ the HSV-2 coefficient, and $b_3$ the age coefficient: $z = \text{constant} + b_1 + b_2 + b_3 \times \text{age} = -2.2242 + 0.5093 + 0.7910 + (0.0062 \times 51) = -0.6077$
19. The probability for this person, with $z = -0.6077$, is $P = \frac{e^{z}}{1 + e^{z}} = \frac{e^{-0.6077}}{1 + e^{-0.6077}} = 0.35$
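The worked example can be reproduced with a short Python sketch. The coefficients are copied from the example's table; the function names and the 0/1 argument coding are assumptions made for this illustration.

```python
import math

# Parameter estimates from the example (HHV-8 infection as the outcome)
INTERCEPT = -2.2242
COEFFICIENTS = {
    "gonorrhea": 0.5093,
    "syphilis": 1.1924,
    "hsv2": 0.7910,
    "hiv": 1.6357,
    "age": 0.0062,
}


def linear_predictor(gonorrhea: int, syphilis: int, hsv2: int,
                     hiv: int, age: float) -> float:
    """Sum of the intercept and each coefficient times its variable (0/1 or years)."""
    values = {"gonorrhea": gonorrhea, "syphilis": syphilis,
              "hsv2": hsv2, "hiv": hiv, "age": age}
    return INTERCEPT + sum(COEFFICIENTS[k] * v for k, v in values.items())


def predicted_probability(z: float) -> float:
    """P = e^z / (1 + e^z)."""
    return math.exp(z) / (1.0 + math.exp(z))


# 51-year-old man: gonorrhea and HSV-2 positive, syphilis and HIV negative
z = linear_predictor(gonorrhea=1, syphilis=0, hsv2=1, hiv=0, age=51)
p = predicted_probability(z)
print(f"z = {z:.4f}, P = {p:.2f}")
```

Changing the arguments gives the estimated probability for any other combination of infection history and age under the same fitted coefficients.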
20. Thank you