Consumer Buying Behaviour - Swapnil Mali
An assignment for
EC51001 Applied Business and Marketing Research
Submitted To
Dr. Andrzej Kwiatkowski
University of Dundee
Submitted On
26 March, 2012
By
Swapnil Mali 120004897
1.0 Introduction
1.1 Survey name- Chicken Survey.
1.2 Objectives- The objective of this survey was to characterize consumers of chicken.
1.3 Aim- The aim is to find out which factors discriminate between those who buy chicken at the
butcher's and those who do not.
1.4 Key Findings- This survey helped to explain the buying behaviour of customers. Those
who spend more on chicken in a week, who are older, and who feel that chicken at
the butcher's is safe are more likely to buy at the butcher's shop, while those who have more
trust in supermarkets are less likely to do so.
1.5 Methodology- This survey was carried out by asking customers at supermarkets various
questions about shopping, income, family, etc. The data were then analysed in SPSS and a
statistical model was generated. Predictions are made on the basis of this statistical model.
EC51001 Applied Business and Marketing Research Page 1
2.0 Analysis part
Logistic regression analysis was carried out on the basis of the SPSS output tables and is
elaborated in detail below.
Case Processing Summary
Unweighted Cases (a)                         N     Percent
Selected Cases     Included in Analysis     420     84.0
                   Missing Cases             80     16.0
                   Total                    500    100.0
Unselected Cases                              0       .0
Total                                       500    100.0
a. If weight is in effect, see classification table for the total number of cases.
The table above shows that 80 cases (16%) are missing, but most of the data (84%) are
included in the analysis, which is adequate for modelling. By default, the logistic
regression procedure in SPSS performs listwise deletion of missing data: if a case has a missing
value for any variable in the model, the entire case is excluded from the analysis.
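Listwise deletion can be illustrated with a minimal Python sketch. This is not part of the original SPSS workflow; the four records are invented purely for illustration, and only the variable names (q8d, q5, q51, q21d, q43b) and the counts 420 and 80 come from the report.

```python
# Hypothetical survey records; None marks a missing answer.
# Only the variable names come from the report, the values are made up.
records = [
    {"q8d": 1, "q5": 6.0, "q51": 45, "q21d": 6, "q43b": 3},
    {"q8d": 0, "q5": 4.5, "q51": None, "q21d": 4, "q43b": 5},   # one predictor missing
    {"q8d": 0, "q5": 3.0, "q51": 31, "q21d": 3, "q43b": 6},
    {"q8d": None, "q5": 5.0, "q51": 52, "q21d": 5, "q43b": 2},  # outcome missing
]


def listwise_delete(rows):
    """Keep only the rows in which every model variable has a value."""
    return [row for row in rows if all(value is not None for value in row.values())]


complete = listwise_delete(records)
print(len(complete), "of", len(records), "hypothetical cases kept")

# The same bookkeeping with the counts reported in the Case Processing Summary.
included, missing = 420, 80
percent_included = 100 * included / (included + missing)
print(f"{percent_included:.1f}% of cases included in the analysis")
```

A single missing answer is enough to drop a whole case, which is why 80 of the 500 respondents do not contribute to the model.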
Dependent Variable Encoding
Original Value    Internal Value
no                0
yes               1
This shows the internal value representation for the dependent variable: those who do not buy at
the butcher's are coded 0 and those who do are coded 1.
Block 0: Beginning Block
o Classification Table (a, b)
                                   Predicted Butcher    Percentage
Observed                           no       yes         Correct
Step 0   Butcher      no           277      0           100.0
                      yes          143      0           .0
         Overall Percentage                             66.0
a. Constant is included in the model.
b. The cut value is .500
Step 0: No predictors are included at this stage, only the intercept.
The null model assigns every case to the most frequent category, 'no'. It therefore classifies
100% of the 'no' cases correctly, which looks ideal, but 0% of the 'yes' cases. Overall, 66% of
the cases were correctly classified (277/420 = 0.66). Simply predicting the most frequent
category for all cases always yields this same percentage correct, i.e. 66%.
o Variables in the Equation
B S.E. Wald df Sig. Exp(B)
Step 0 Constant -.661 .103 41.228 1 .000 .516
In the null model, B is the coefficient for the constant. The significance value in this table
(less than 0.05) indicates that the null hypothesis that the constant is zero can be rejected.
Exp(B) is the odds of a 'yes' outcome, which can be calculated as 143/277 = 0.516.
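The Step 0 figures can be reproduced from the two observed counts alone. The following is a minimal Python sketch (the variable names are illustrative, not SPSS output names); it derives the odds, the constant B as the natural log of the odds, and the baseline percentage correct.

```python
import math

# Observed counts from the Block 0 classification table.
yes_count = 143
no_count = 277

# Odds of "yes" against "no"; SPSS reports this as Exp(B) for the constant.
odds_yes = yes_count / no_count

# The constant B in the null model is the natural log of those odds.
constant_b = math.log(odds_yes)

# Predicting the most frequent category ("no") for everyone gives this accuracy.
baseline_accuracy = no_count / (yes_count + no_count)

print(f"odds = {odds_yes:.3f}")
print(f"B    = {constant_b:.3f}")
print(f"baseline accuracy = {baseline_accuracy:.1%}")
```

The results should agree with the reported Exp(B) of .516, B of -.661 and overall percentage of 66.0.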
Block 1: Method = Enter
o Omnibus Tests of Model Coefficients
Chi-square df Sig.
Step 1 Step 71.655 4 .000
Block 71.655 4 .000
Model 71.655 4 .000
This test helps in deciding whether the independent variables, taken together, improve the model.
As the significance values are less than 0.05, we can say that the model with the four predictors
fits significantly better than the null model.
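The omnibus chi-square is the drop in -2 log likelihood between the intercept-only model and the fitted model. As a rough check in Python (a sketch, not SPSS's own computation), the null -2 log likelihood is rebuilt from the observed counts, and the p-value uses the closed-form upper tail of the chi-square distribution that holds for even degrees of freedom. The helper name chi2_upper_tail_even_df is an illustrative choice.

```python
import math

n_yes, n_no = 143, 277
n = n_yes + n_no

# -2 log likelihood of the intercept-only model, from the observed proportions.
null_minus2ll = -2 * (n_yes * math.log(n_yes / n) + n_no * math.log(n_no / n))

# -2 log likelihood of the fitted model, as reported in the Model Summary.
model_minus2ll = 467.079

chi_square = null_minus2ll - model_minus2ll
df = 4  # one degree of freedom per predictor: q5, q51, q21d, q43b


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


p_value = chi2_upper_tail_even_df(chi_square, df)
print(f"chi-square = {chi_square:.3f}, df = {df}, p = {p_value:.2e}")
```

The chi-square should come out close to the reported 71.655, with a p-value far below 0.05.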
o Model Summary
Step   -2 Log likelihood   Cox & Snell R Square   Nagelkerke R Square
1      467.079 (a)         .157                   .217
a. Estimation terminated at iteration number 5 because
parameter estimates changed by less than .001.
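The Cox & Snell R Square suggests that about 15.7% of the variation in the dependent variable is
explained by the independent variables. The Nagelkerke R Square of 0.217 rescales this measure to
a maximum of 1 and suggests that about 21.7% of the variation is explained by the predictors.
Both are pseudo R Square measures, so they give only an approximate indication of fit.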
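Both pseudo R Square values follow from the two -2 log likelihoods and the sample size. The Python sketch below uses the standard Cox & Snell and Nagelkerke definitions; the null -2 log likelihood is not printed in the SPSS output, so it is reconstructed here from the observed counts, which is an assumption of this sketch.

```python
import math

n_yes, n_no = 143, 277
n = n_yes + n_no

# Null -2LL reconstructed from the counts (not printed in the SPSS output).
null_minus2ll = -2 * (n_yes * math.log(n_yes / n) + n_no * math.log(n_no / n))
model_minus2ll = 467.079  # from the Model Summary table

# Cox & Snell: 1 - (L0 / L1)^(2/n), written in terms of -2LL values.
cox_snell = 1 - math.exp(-(null_minus2ll - model_minus2ll) / n)

# Largest value Cox & Snell can reach for this data set.
cox_snell_max = 1 - math.exp(-null_minus2ll / n)

# Nagelkerke rescales Cox & Snell so that its maximum is 1.
nagelkerke = cox_snell / cox_snell_max

print(f"Cox & Snell R Square = {cox_snell:.3f}")
print(f"Nagelkerke R Square  = {nagelkerke:.3f}")
```

The two values should round to the reported .157 and .217.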
o Hosmer and Lemeshow Test
Step Chi-square df Sig.
1 3.030 8 .932
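The difference between observed and expected values should be as small as possible. The H-L
goodness-of-fit test compares the observed and expected frequencies across groups of cases.
A non-significant value (more than 0.05; here 0.932) shows that there is little or no difference
between observed and expected values, so the model fits the data well.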
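The H-L statistic can be recomputed from the contingency table given in the Appendix. The Python sketch below sums (observed - expected)^2 / expected over both outcome columns of the ten groups and uses groups - 2 degrees of freedom; the p-value helper is the closed-form chi-square upper tail for even degrees of freedom, and its name is illustrative.

```python
import math

# (observed no, expected no, observed yes, expected yes) for the ten groups,
# copied from the Hosmer and Lemeshow contingency table in the Appendix.
groups = [
    (39, 39.037, 3, 2.963),
    (36, 36.728, 6, 5.272),
    (35, 34.575, 7, 7.425),
    (32, 32.047, 10, 9.953),
    (32, 29.184, 10, 12.816),
    (29, 27.151, 13, 14.849),
    (21, 24.497, 21, 17.503),
    (20, 21.929, 22, 20.071),
    (20, 19.095, 22, 22.905),
    (13, 12.756, 29, 29.244),
]

# Pearson-type sum over both outcome columns of every group.
hl_statistic = sum(
    (obs_no - exp_no) ** 2 / exp_no + (obs_yes - exp_yes) ** 2 / exp_yes
    for obs_no, exp_no, obs_yes, exp_yes in groups
)


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


df = len(groups) - 2
p_value = chi2_upper_tail_even_df(hl_statistic, df)
print(f"H-L chi-square = {hl_statistic:.3f}, df = {df}, Sig. = {p_value:.3f}")
```

The output should be close to the reported chi-square of 3.030 with 8 degrees of freedom and a significance of .932.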
o Classification Table (a)
                                   Predicted Butcher    Percentage
Observed                           no       yes         Correct
Step 1   Butcher      no           243      34          87.7
                      yes          89       54          37.8
         Overall Percentage                             70.7
a. The cut value is .500
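This table shows how well the model predicts. Overall, 70.7% of cases are correctly classified,
up from 66.0% in the null model. The model is much better at identifying 'no' cases (87.7%
correct) than 'yes' cases (37.8% correct).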
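The percentages in the classification table follow directly from its four cell counts. A minimal Python sketch, with "yes" (buys at the butcher's) treated as the positive class; the variable names are illustrative:

```python
# Cell counts from the Step 1 classification table.
true_no = 243    # observed no, predicted no
false_yes = 34   # observed no, predicted yes
false_no = 89    # observed yes, predicted no
true_yes = 54    # observed yes, predicted yes

total = true_no + false_yes + false_no + true_yes

# Share of all cases classified correctly.
overall_correct = (true_no + true_yes) / total

# Share of observed "no" cases predicted as "no" (specificity).
no_correct = true_no / (true_no + false_yes)

# Share of observed "yes" cases predicted as "yes" (sensitivity).
yes_correct = true_yes / (true_yes + false_no)

print(f"overall: {overall_correct:.1%}")
print(f"no:      {no_correct:.1%}")
print(f"yes:     {yes_correct:.1%}")
```

The gap between the two row percentages shows that the overall figure is driven mainly by the larger 'no' group.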
o Variables in the Equation
                        B        S.E.    Wald      df   Sig.   Exp(B)
Step 1 (a)  q5          .085     .028    8.975     1    .003   1.088
            q51         .022     .007    8.988     1    .003   1.022
            q21d        .441     .077    32.888    1    .000   1.554
            q43b        -.269    .074    13.327    1    .000   .764
            Constant    -3.169   .615    26.539    1    .000   .042
a. Variable(s) entered on step 1: q5, q51, q21d, q43b.
With the B values we can form the logistic regression equation, where p is the probability of
buying chicken at the butcher's:
ln(p / (1 - p)) = -3.169 + 0.085 x q5 + 0.022 x q51 + 0.441 x q21d - 0.269 x q43b
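To show how the equation turns answers into a prediction, here is a minimal Python sketch. The coefficients are those in the table above; the example respondent (spend 5, age 40, safety rating 6, supermarket trust 3) is hypothetical and the units of q5 and q51 are assumed, not taken from the questionnaire.

```python
import math

# Coefficients (B) from the Step 1 "Variables in the Equation" table.
CONSTANT = -3.169
B_Q5 = 0.085     # weekly expenditure on chicken
B_Q51 = 0.022    # age of the respondent
B_Q21D = 0.441   # agreement that butchers sell safe chicken (1 to 7)
B_Q43B = -0.269  # trust towards supermarkets (1 to 7)


def predict_probability(q5, q51, q21d, q43b):
    """Return the predicted probability of buying chicken at the butcher's."""
    logit = CONSTANT + B_Q5 * q5 + B_Q51 * q51 + B_Q21D * q21d + B_Q43B * q43b
    # Inverse of the logit: p = 1 / (1 + e^(-logit)).
    return 1 / (1 + math.exp(-logit))


# Hypothetical respondent, for illustration only.
p_example = predict_probability(q5=5, q51=40, q21d=6, q43b=3)
predicted_group = "yes" if p_example >= 0.5 else "no"  # cut value .50
print(f"p = {p_example:.3f}, predicted group: {predicted_group}")
```

With the cut value of .50 used by SPSS, a respondent is classified as 'yes' only when the predicted probability reaches one half.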
[Figure: bar chart with one bar per independent variable, on a horizontal scale from 0 to 0.6. Values shown: In a typical week how much do you spend on: 0.088; Age: 0.022; From the butcher: 0.554; Supermarkets: -0.236.]
Fig. 1: Exp(B) values against different independent variables
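If expenditure on chicken in a standard week increases by one unit, the odds of buying chicken
at the butcher's increase by 8.8%.
If the age of the respondent increases by one unit, the odds of buying chicken at the butcher's
increase by 2.2%.
If the score for whether a respondent agrees (on a seven-point ranking scale) that butchers
sell safe chicken increases by one unit, the odds of buying chicken at the butcher's
increase by 55.4%.
If the score for trust (on a seven-point ranking scale) towards supermarkets increases by one
unit, the odds of buying chicken at the butcher's decrease by 23.6%.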
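The percentages come from the Exp(B) column: a one-unit increase in a predictor multiplies the odds by Exp(B), so the percentage change in the odds is (Exp(B) - 1) x 100. A minimal Python sketch using the reported Exp(B) values (the dictionary and its labels are illustrative):

```python
# Exp(B) values from the Step 1 "Variables in the Equation" table.
exp_b = {
    "q5 (weekly spend on chicken)": 1.088,
    "q51 (age)": 1.022,
    "q21d (butchers sell safe chicken)": 1.554,
    "q43b (trust towards supermarkets)": 0.764,
}

# Percentage change in the odds for a one-unit increase in each predictor.
percent_change = {name: (value - 1) * 100 for name, value in exp_b.items()}

for name, change in percent_change.items():
    direction = "increase" if change > 0 else "decrease"
    print(f"{name}: odds {direction} by {abs(change):.1f}%")
```

These are changes in the odds, not in the probability itself; the change in probability depends on the starting point.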
Classification plot
Step number: 1
Observed Groups and Predicted Probabilities
[SPSS character plot: frequency (0 to 16) on the vertical axis against predicted probability of membership for yes (0 to 1) on the horizontal axis, with each case plotted as n (no) or y (yes). The full plot is reproduced in the Appendix.]
Predicted Probability is of Membership for yes
The Cut Value is .50
On the X axis, predicted probabilities are scaled from 0 to 1; the Y axis shows the frequency of
occurrence. The plot is widely spread, so the predictions are not sharp, but it clearly indicates
that most cases fall towards the lower probability values.
3.0 Comparison
Logistic regression allows one to predict a discrete outcome such as group membership from a
set of variables that may be continuous, discrete, dichotomous, or a mix. The goal of
discriminant function analysis is likewise to predict group membership from a set of predictors.
Logistic regression is much more relaxed and flexible in its assumptions than discriminant
analysis. Unlike discriminant analysis, logistic regression does not require the independent
variables to be normally distributed, linearly related, or of equal variance
within each group (Tabachnick and Fidell, 1996, p. 575).
Logistic regression and discriminant analysis produce nearly similar results. Both methods
identify statistically significant coefficients in a similar way, although logistic regression
estimated larger coefficients overall. Either can be helpful in predicting who buys chicken at
the butcher's.
In total, 71.2% of original grouped cases were correctly classified in discriminant analysis,
while logistic regression correctly classified 70.7% of all cases. Whether a respondent agrees
(on a seven-point ranking scale) that butchers sell safe chicken is the dominant factor in
logistic regression as well as in discriminant analysis.
Though discriminant analysis classifies a slightly higher share of cases correctly than
logistic regression, both give the same result. In both cases the probability that a customer
buys at the butcher's reduces if there is a higher value of trust towards supermarkets. Both
analyses produce the same results for the other factors as well.
4.0 Conclusions
1) Customers are more likely to buy chicken at the butcher's when the following take a higher value:
i) Expenditure on chicken in a standard week.
ii) Age of the respondent.
iii) Whether a respondent agrees (on a seven-point ranking scale) that butchers sell safe
chicken.
2) Customers are less likely to buy chicken at the butcher's when the following takes a higher value:
i) Trust (on a seven-point ranking scale) towards supermarkets.
3) Logistic regression and discriminant analysis gave similar results in the model analysis. In
order to decide which method should be used, we must consider the assumptions for the application
of each one.
References
Tabachnick, B.G. and Fidell, L.S., 1996, Using Multivariate Statistics. NY: HarperCollins.
Appendix
LOGISTIC REGRESSION VARIABLES q8d
/METHOD=ENTER q5 q51 q21d q43b
/CLASSPLOT
/PRINT=GOODFIT SUMMARY
/CRITERIA=PIN(0.05) POUT(0.10) ITERATE(20) CUT(0.5).
Logistic Regression
[DataSet3] C:UsersSMMaliDownloadsASSIGNMENT_II.sav
Case Processing Summary
Unweighted Casesa N Percent
Selected Cases Included in 420 84.0
Analysis
Missing Cases 80 16.0
Total 500 100.0
Unselected Cases 0 .0
Total 500 100.0
a. If weight is in effect, see classification table for the total
number of cases.
Dependent Variable
Encoding
Original
Value Internal Value
No 0
yes 1
Block 0: Beginning Block
Classification Tablea,b
Predicted
Butcher Percentage
Observed no yes Correct
Step 0 Butcher No 277 0 100.0
Yes 143 0 .0
Overall Percentage 66.0
a. Constant is included in the model.
b. The cut value is .500
Variables in the Equation
                    B       S.E.    Wald      df   Sig.   Exp(B)
Step 0   Constant   -.661   .103    41.228    1    .000   .516
Variables not in the Equation
                                 Score     df   Sig.
Step 0   Variables   q5          9.903     1    .002
                     q51         10.968    1    .001
                     q21d        37.676    1    .000
                     q43b        9.718     1    .002
         Overall Statistics      65.001    4    .000
Block 1: Method = Enter
Omnibus Tests of Model Coefficients
Chi-square df Sig.
Step 1 Step 71.655 4 .000
Block 71.655 4 .000
Model 71.655 4 .000
Model Summary
-2 Log Cox & Snell R Nagelkerke R
Step likelihood Square Square
1 467.079a .157 .217
a. Estimation terminated at iteration number 5 because
parameter estimates changed by less than .001.
Hosmer and Lemeshow Test
Step Chi-square df Sig.
1 3.030 8 .932
Contingency Table for Hosmer and Lemeshow Test
               Butcher = no            Butcher = yes
               Observed   Expected     Observed   Expected    Total
Step 1    1    39         39.037       3          2.963       42
          2    36         36.728       6          5.272       42
          3    35         34.575       7          7.425       42
          4    32         32.047       10         9.953       42
          5    32         29.184       10         12.816      42
          6    29         27.151       13         14.849      42
          7    21         24.497       21         17.503      42
          8    20         21.929       22         20.071      42
          9    20         19.095       22         22.905      42
          10   13         12.756       29         29.244      42
Classification Tablea
Predicted
Butcher Percentage
Observed no yes Correct
Step 1 Butcher no 243 34 87.7
yes 89 54 37.8
Overall Percentage 70.7
a. The cut value is .500
Variables in the Equation
B S.E. Wald df Sig. Exp(B)
a
Step 1 q5 .085 .028 8.975 1 .003 1.088
q51 .022 .007 8.988 1 .003 1.022
q21d .441 .077 32.888 1 .000 1.554
q43b -.269 .074 13.327 1 .000 .764
Constant -3.169 .615 26.539 1 .000 .042
a. Variable(s) entered on step 1: q5, q51, q21d, q43b.
Step number: 1
Observed Groups and Predicted Probabilities
16 + +
I I
I I
F I y y I
R 12 + n y +
E I n y y n y I
Q I y n y y n y y y I
U I y n y yn nyy ny y y y yy I
E 8 + n nyyn yy nn y nyy ny y y n y yy +
N I n nnnny nynnn y ynnn ny y yy y n yy yy I
C I nn nnnnnynnnnnyy n y ynnn nn nyy yyy yyn yy yy I
Y I nnn nnnnnynnnnnnyyn nnnnnnynnynynyyyyyyyn yn yn yy y y y I
4 + nnnynnnnnnnnnnnnnnyn nnnnnnynnynnnnynyynyn yn nn yn n y yy y n +
I nnnnnnnnnnnnnnnnnnnnyynnnnnnnnnynnnnynnynynynnynn yn nyyy yy y y n I
I n nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnynnnnynnnnynnyyn nyyn yyyy y y yn y y I
I nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnynnnnnnnnnnnnyynnnnnn ynny n nynnn yyn y y y y I
Predicted ---------+---------+---------+---------+---------+---------+---------+---------+---------+----------
Prob: 0 .1 .2 .3 .4 .5 .6 .7 .8 .9 1
Group: nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
Predicted Probability is of Membership for yes
The Cut Value is .50
Symbols: n - no
y - yes
Each Symbol Represents 1 Case.