- A logistic regression analysis was conducted to evaluate relationships between the likelihood of low birth weight and maternal characteristics. A history of hypertension increases the log odds of low birth weight by 1.856, while higher maternal weight decreases the odds. Premature labor also increases the chance of low birth weight. The Hosmer and Lemeshow test indicates that the logistic regression model fits the data well.
Logistic regression analysis of factors affecting low birth weight
1.
2. A STUDY OF LOW BIRTH WEIGHT OF
CHILDREN WITH SPECIAL EMPHASIS ON
LOGISTIC REGRESSION ANALYSIS
PREPARED BY
VINYA.P
3. INTRODUCTION
• Regression methods have become a vital component of any
data analysis concerned with explaining the relationship
between a response variable and one or more explanatory
variables.
Logistic regression measures the relationship between the
dichotomous dependent variable and one or more
independent variables, which are usually (but not necessarily)
continuous, by estimating probabilities.
4. • Logistic regression is a model used for prediction of the
probability of occurrence of an event. It is a generalized linear
model used for binomial regression. It makes use of several
predictor (explanatory) variables that may be either numerical
or categorical. Specifically, logistic regression can be used only
with two types of target (response or dependent) variables:
• A categorical target variable that has exactly two
categories (i.e. a binary or dichotomous variable)
• A continuous target variable that has values in the range 0
to 1, representing probability values or proportions.
5. • In binary logistic regression the outcome is usually coded as “0”
or “1”. Success is coded as “1” and failure is coded as “0”.
• Logistic regression is used to predict the odds of being a
case based on the values of the independent variables.
• Logistic regression is used widely in many fields, including the
medical and social sciences.
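As a minimal sketch of how a fitted logistic model turns predictor values into a probability: the linear predictor is the log odds, and the logistic function maps it into the range 0 to 1. The intercept and coefficient below are hypothetical illustration values, not estimates from this study.

```python
import math


def logistic(z: float) -> float:
    """Map a linear predictor (the log odds) to a probability in [0, 1]."""
    # Two branches avoid overflow in math.exp for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_probability(intercept, coefficients, values):
    """P(outcome = 1) for one subject under a logistic regression model."""
    z = intercept + sum(b * x for b, x in zip(coefficients, values))
    return logistic(z)


# Hypothetical model: log odds = -1.0 + 0.5 * x, evaluated at x = 2.0,
# so the linear predictor is 0 and the predicted probability is 0.5.
p = predict_probability(-1.0, [0.5], [2.0])
print(p)
```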
6. ODDS AND ODDS RATIO
The odds of occurrence of an event is defined as the ratio
of the probability that the event will occur to the probability
that the event will not occur. That is, the odds of the event E
are given by
Odds(E) = P(E) / P(E') = P(E) / (1 − P(E))
Odds(E) = n/m is interpreted to mean that the probability of
occurrence of the event is n/m times the probability of its
not occurring. Equivalently, the odds are "m to n" that the
event will not happen.
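The definition of odds can be sketched in a few lines of code. The function names and the probability 0.75 are illustrative assumptions only.

```python
def odds(p: float) -> float:
    """Odds of an event with probability p: P(E) / (1 - P(E))."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


def probability_from_odds(o: float) -> float:
    """Invert the odds back to a probability."""
    return o / (1.0 + o)


# An event with probability 0.75 has odds 0.75 / 0.25 = 3, i.e. "3 to 1".
print(odds(0.75))
print(probability_from_odds(3.0))
```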
7. The odds ratio OR(A vs B), which compares the odds of events E_A
and E_B (that is, event E occurring in group A and group B,
respectively), is defined as the ratio between the two odds;
that is
OR(A vs B) = Odds(E_A) / Odds(E_B)
           = [P(E_A) / (1 − P(E_A))] / [P(E_B) / (1 − P(E_B))]
In particular, if an odds ratio is equal to one, the odds are the
same for the two groups. Note that, if we define a factor with
levels corresponding to groups A and B, respectively, then an
odds ratio equal to one is equivalent to there being no factor-
effect.
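A short sketch of the odds ratio as the ratio of two odds. The group probabilities used here are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def odds_ratio(p_a: float, p_b: float) -> float:
    """Odds of the event in group A divided by the odds in group B."""
    odds_a = p_a / (1.0 - p_a)
    odds_b = p_b / (1.0 - p_b)
    return odds_a / odds_b


# Hypothetical groups: P(E) = 0.5 in group A (odds 1) and 0.2 in
# group B (odds 0.25), so the odds in A are four times those in B.
print(odds_ratio(0.5, 0.2))

# Equal probabilities give an odds ratio of one (no factor effect).
print(odds_ratio(0.3, 0.3))
```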
8. LOW BIRTH WEIGHT
• Low birth weight is defined as a birth
weight of a live born infant of less than
2,500 g (5 pounds 8 ounces) regardless of
gestational age.
• Low birth weight may increase the risk
of complications such as mental
retardation, vision loss, or learning
problems.
9. OBJECTIVES OF THE STUDY
The main objective of the survey is to study low birth
weight of children, with special emphasis on
logistic regression analysis.
The study focuses on maternal characteristics such as age,
weight of the subject at her last menstrual period, race,
hypertension and the number of physician visits during the
first trimester of pregnancy.
10. ANALYSIS
The statistical tools used for the analysis are
• Logistic regression
• Odds and odds ratio
• Hosmer-Lemeshow test
• Wald statistic
• Likelihood ratio statistic
• Cox and Snell's R2
• Q-Q plot
• Box plot
• ANOVA
• Kruskal-Wallis analysis of variance
• ANCOVA
11.
12. The Q-Q plot shows a departure from normality. If the data
were normally distributed, the data points would lie close to the
diagonal line, which represents the values expected under a
normal distribution. Here the data points follow a non-linear
pattern, so the data are not normally distributed.
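The points of a normal Q-Q plot can be computed directly: each ordered observation is paired with the quantile expected under a normal distribution with the same mean and standard deviation. This is a sketch using one common plotting-position rule, (i − 0.5)/n; the sample values are hypothetical, not the study data.

```python
from statistics import NormalDist, mean, stdev


def normal_qq_points(data):
    """Return (theoretical quantile, observed value) pairs for a normal Q-Q plot."""
    ordered = sorted(data)
    n = len(ordered)
    fitted = NormalDist(mean(ordered), stdev(ordered))
    # Plotting position (i + 0.5) / n for the i-th smallest value (i from 0).
    return [(fitted.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(ordered)]


# Hypothetical right-skewed sample, for illustration only.
sample = [95, 100, 105, 110, 112, 120, 130, 150, 190, 250]
for theoretical, observed in normal_qq_points(sample):
    print(f"{theoretical:8.1f} {observed:8.1f}")
```

Points that stay close to the line observed = theoretical suggest normality; a curved pattern suggests skewness.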
13. BOX PLOT
From the box plot we understand that the distribution is positively
skewed, since the upper whisker is longer than the lower one and the
line corresponding to the median lies in the lower part of the box.
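The two visual cues used above (a longer upper whisker and a median in the lower part of the box) can be checked numerically. This is a sketch assuming the usual 1.5 × IQR whisker rule; the data are hypothetical.

```python
from statistics import quantiles


def box_plot_summary(data):
    """Quartiles and whisker ends, with whiskers limited to 1.5 * IQR."""
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence, upper_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_whisker": min(x for x in data if x >= lower_fence),
        "upper_whisker": max(x for x in data if x <= upper_fence),
    }


def looks_positively_skewed(s):
    """True when the upper whisker is longer and the median sits low in the box."""
    longer_upper = (s["upper_whisker"] - s["q3"]) > (s["q1"] - s["lower_whisker"])
    median_low = (s["median"] - s["q1"]) < (s["q3"] - s["median"])
    return longer_upper and median_low


# Hypothetical right-skewed sample, for illustration only.
summary = box_plot_summary([1, 2, 2, 3, 3, 4, 6, 9, 15])
print(summary, looks_positively_skewed(summary))
```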
14. The Kolmogorov-Smirnov goodness-of-fit test is used to examine
whether a normal distribution suitably describes the low birth
weight data. From the table, the p-value is reported as 0.000
(that is, p < 0.001). Since this is less than 0.05, the normal
distribution does not give a good fit to the data.
Since the p-value of the Shapiro-Wilk test is also less than 0.05,
low birth weight is not normally distributed. That is, the data
deviate significantly from the normal distribution.
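The Kolmogorov-Smirnov statistic is the largest vertical gap between the empirical distribution function and the fitted normal distribution function. The sketch below computes only that statistic, not a p-value: when the mean and standard deviation are estimated from the same data, the ordinary Kolmogorov-Smirnov tables no longer apply and a corrected reference distribution (such as the Lilliefors correction) is needed. Both samples are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def ks_statistic_normal(data):
    """Largest gap between the empirical CDF and a fitted normal CDF."""
    ordered = sorted(data)
    n = len(ordered)
    fitted = NormalDist(mean(ordered), stdev(ordered))
    d = 0.0
    for i, x in enumerate(ordered):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from i/n to (i + 1)/n at x; check both sides.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


# Hypothetical samples: one roughly symmetric, one strongly right-skewed.
d_symmetric = ks_statistic_normal([1, 2, 3, 4, 5, 6, 7, 8, 9])
d_skewed = ks_statistic_normal([1, 1, 1, 1, 2, 2, 3, 20, 50])
print(d_symmetric, d_skewed)
```

A larger statistic indicates a poorer fit of the normal distribution.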
15. Kruskal-Wallis One-Way ANOVA
Test statistics (Kruskal-Wallis test; grouping variable: race)
Chi-square: 13.960
df: 2
Asymp. Sig.: .001
The asymptotic significance estimates the probability of obtaining a
chi-square statistic greater than or equal to the one displayed, if
there truly are no differences between the group ranks. A chi-square
of 13.960 with 2 degrees of freedom should occur only about once
per 1,000 times.
Since the p-value (0.001) is less than 0.05, weight (in pounds) at the
last menstrual period differs across races. Hence we use pairwise
Wilcoxon rank sum tests to examine where this difference lies.
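The "about once per 1,000" figure can be checked by hand, because the upper tail of a chi-square distribution with 2 degrees of freedom has the closed form exp(−x/2). A small sketch (the function name is illustrative):

```python
import math


def chi2_upper_tail_df2(x: float) -> float:
    """P(X >= x) for a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2.0)


# Kruskal-Wallis statistic reported on the slide: 13.960 with 2 df.
p_value = chi2_upper_tail_df2(13.960)
print(round(p_value, 3))  # 0.001
```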
16. Wilcoxon rank sum test
Test statistics (grouping variable: race)
Mann-Whitney U: 1017.500
Wilcoxon W: 5673.500
Z: -1.442
The value -1.442 is the standardized Z statistic, which corresponds
to a two-tailed asymptotic significance of about 0.149. Since this
is greater than 0.05, the test does not show a significant
difference in weight (in pounds) at the last menstrual period
between mothers of the first and second races.
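Treating the -1.442 in the output as the standardized Z statistic (a p-value cannot be negative), the two-tailed significance under the normal approximation can be sketched as follows. The function name is illustrative.

```python
from statistics import NormalDist


def two_tailed_p_from_z(z: float) -> float:
    """Two-tailed p-value for a standard normal test statistic."""
    return 2.0 * NormalDist().cdf(-abs(z))


# Z value taken from the test statistics table: about 0.149, above 0.05.
p_value = two_tailed_p_from_z(-1.442)
print(round(p_value, 3))
```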
17. Before applying ANCOVA we have to establish the relationship
between the mother's weight (in pounds) at the last menstrual period
and the age of the mother. The p-value for age is 0.028, which is
less than 0.05, so age is a significant covariate. The p-values for
ptl and ui are not less than 0.05, so ptl and ui are not significant
covariates. The p-values for race, low, smoke, ht and ftv are less
than 0.05. Hence ANCOVA rejects the null hypothesis: the mother's
weight at the last menstrual period differs significantly across the
levels of these variables. To find where the difference lies, we
carry out a post-hoc analysis. Here the post-hoc analysis is done
for race.
18. MODEL SUMMARY
Under Model Summary we see that the -2 Log Likelihood
statistic is 203.343. The Cox & Snell R2 can be interpreted like R2
in a multiple regression, but it cannot reach a maximum value of
1. The Nagelkerke R2 rescales it so that it can reach a maximum
of 1. The larger the value of R2, the better the model fits the data.
Step   -2 Log likelihood   Cox & Snell R2   Nagelkerke R2
race   203.343             .153             .215
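The two pseudo R2 values are simple functions of the -2 log likelihood of the fitted model, the -2 log likelihood of the intercept-only model, and the sample size. In the sketch below only 203.343 comes from the slide; the sample size of 189 and the intercept-only value of 234.672 are assumptions made for illustration, since neither figure appears in the slides.

```python
import math


def cox_snell_r2(neg2ll_null: float, neg2ll_model: float, n: int) -> float:
    """Cox & Snell R2 = 1 - exp(-(D_null - D_model) / n)."""
    return 1.0 - math.exp(-(neg2ll_null - neg2ll_model) / n)


def nagelkerke_r2(neg2ll_null: float, neg2ll_model: float, n: int) -> float:
    """Cox & Snell R2 divided by its maximum attainable value."""
    max_cox_snell = 1.0 - math.exp(-neg2ll_null / n)
    return cox_snell_r2(neg2ll_null, neg2ll_model, n) / max_cox_snell


N_ASSUMED = 189                 # assumption: sample size, not stated in the slides
NEG2LL_NULL_ASSUMED = 234.672   # assumption: intercept-only -2 log likelihood
NEG2LL_MODEL = 203.343          # from the Model Summary table

cs = cox_snell_r2(NEG2LL_NULL_ASSUMED, NEG2LL_MODEL, N_ASSUMED)
nk = nagelkerke_r2(NEG2LL_NULL_ASSUMED, NEG2LL_MODEL, N_ASSUMED)
print(round(cs, 3), round(nk, 3))
```

Under these assumed inputs the results agree with the .153 and .215 shown in the table; because the divisor is always below 1, the Nagelkerke value is never smaller than the Cox & Snell value.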
19. Hosmer and Lemeshow Test
Since the p-value for the chi-square statistic is greater
than 0.05 (0.546 > 0.05), the Hosmer and Lemeshow test is
not significant. There is no evidence of lack of fit, so the
model fits the data adequately.
Step Chi-square df Sig.
5 6.910 8 .546
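The significance value in the table can be reproduced from the chi-square statistic and its degrees of freedom. For an even number of degrees of freedom the upper tail of the chi-square distribution has a finite closed form, which the sketch below uses; the function name is illustrative and the formula does not cover odd degrees of freedom.

```python
import math


def chi2_upper_tail_even_df(x: float, df: int) -> float:
    """P(X >= x) for a chi-square variable with an even number of df.

    Uses exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k  # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total


# Hosmer and Lemeshow statistic from the table: 6.910 with 8 df.
p_value = chi2_upper_tail_even_df(6.910, 8)
print(round(p_value, 3))  # 0.546
```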
20. The classification table compares the values of the
dependent variable predicted by the regression model
with the values actually observed in the data. In this
case, the fitted model correctly predicts the observed
value of low birth weight 71.4% of the time. Compared
with the table for the model with no predictors, the
overall percentage correct shows a small increase, from
68% to 71.4%, so our classification accuracy improves.
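The overall percentage in a classification table is the share of cases on the diagonal, and the no-predictor baseline is the share of the larger observed category. The counts below are hypothetical, chosen only to show the calculation; the slide does not report the cell counts.

```python
def classification_summary(tn: int, fp: int, fn: int, tp: int):
    """Return (overall % correct, baseline % correct with no predictors)."""
    total = tn + fp + fn + tp
    overall = 100.0 * (tn + tp) / total
    # With no predictors every case is assigned to the larger observed group.
    baseline = 100.0 * max(tn + fp, fn + tp) / total
    return overall, baseline


# Hypothetical counts: 100 observed non-cases (90 predicted correctly)
# and 50 observed cases (30 predicted correctly).
overall, baseline = classification_summary(tn=90, fp=10, fn=20, tp=30)
print(f"{overall:.1f}% correct vs {baseline:.1f}% baseline")
```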
21. FINDINGS AND CONCLUSION
• A logistic regression analysis was conducted to evaluate the
relationships between the likelihood of having a low birth
weight (LBW) baby and certain maternal characteristics.
• A history of hypertension increases the log odds of having a
low birth weight baby by 1.856.
• The weight of the mother is negatively related to the log odds
of having a low birth weight baby.
• The presence of premature labour increases the chance of a low
birth weight baby.
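A coefficient on the log odds scale is easier to read after exponentiating it, which gives the odds ratio. As a sketch for the hypertension coefficient reported above (the function name is illustrative):

```python
import math


def odds_ratio_from_log_odds(coefficient: float) -> float:
    """Convert a logistic regression coefficient to an odds ratio."""
    return math.exp(coefficient)


# Coefficient for history of hypertension from the findings: 1.856.
print(round(odds_ratio_from_log_odds(1.856), 2))  # 6.4
```

That is, holding the other predictors fixed, the odds of a low birth weight baby are multiplied by about 6.4 for mothers with a history of hypertension.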
22. • The Hosmer and Lemeshow statistic indicates a better fit when the
chi-square statistic is not significant.
• From the box plot and Q-Q plot, we identify that the mother's weight
(in pounds) at the last menstrual period is positively skewed.
• The Shapiro-Wilk test reveals that the mother's weight at the last
menstrual period does not follow a normal distribution. Therefore we
use the Kruskal-Wallis test, which is based on ranks rather than
means, to compare the mother's weight at the last menstrual period
across races.
• By ANCOVA, the mother's weight (in pounds) at the last menstrual
period differs significantly across the groups considered.