SlideShare a Scribd company logo
1 of 38
Course Project: AJ DAVIS DEPARTMENT STORES
Introduction
AJ DAVIS is a department store chain, which has many credit
customers and wants to find out more information about these
customers. A sample of 50 credit customers is selected with
data collected on the following five variables.
1. Location (rural, urban, suburban)
2. Income (in $1,000's—be careful with this)
3. Size (household size, meaning number of people living in the
household)
4. Years (the number of years that the customer has lived in the
current location)
5. Credit balance (the customers current credit card balance on
the store's credit card, in $).
The data is available in Doc Sharing Course Project Data Set as
an Excel file. You are to copy and paste the data set into a
minitab worksheet.
PROJECT PART A: Exploratory Data Analysis
· Open the file MATH533 Project Consumer.xls from the
Course Project Data Set folder in Doc Sharing.
· For each of the five variables, process, organize, present, and
summarize the data. Analyze each variable by itself using
graphical and numerical techniques of summarization. Use
minitab as much as possible, explaining what the printout tells
you. You may wish to use some of the following graphs: stem-
leaf diagram, frequency or relative frequency table, histogram,
boxplot, dotplot, pie chart, bar graph. Caution: Not all of these
are appropriate for each of these variables, nor are they all
necessary. More is not necessarily better. In addition, be sure to
find the appropriate measures of central tendency and measures
of dispersion for the above data. Where appropriate use the five
number summary (the Min, Q1, Median, Q3, Max). Once again,
use minitab as appropriate, and explain what the results mean.
· Analyze the connections or relationships between the
variables. There are 10 pairings here (location and income,
location and size, location and years, location and credit
balance, income and size, income and years, income and
balance, size and years, size and credit balance, years and
Credit Balance). Use graphical as well as numerical summary
measures. Explain what you see. Be sure to consider all 10
pairings. Some variables show clear relationships, while others
do not.
· Prepare your report in Microsoft Word (or some other word
processing package), integrating your graphs and tables with
text explanations and interpretations.Be sure that you have
graphical and numerical back up for your explanations and
interpretations. Be selective in what you include in the report.
I'm not looking for a 20-page report on every variable and every
possible relationship (that's 15 things to do). Rather, what I
want you do is to highlight what you see for three individual
variables(no more than one graph for each, one or two measures
of central tendency and variability (as appropriate), and two or
three sentences of interpretation). For the 10 pairings, identify
and report only on three of the pairings, again using graphical
and numerical summary (as appropriate), with interpretations.
Please note that at least one of your pairings must include
location and at least one of your pairings must not include
location.
· All DeVry University policies are in effect, including the
plagiarism policy.
· Project Part A report is due by the end of Week 2.
· Project Part A is worth 100 total points. See grading rubric
below.
Submission: The report from Part 4, including all relevant
graphs and numerical analysis along with interpretations
Format for report:
A. Brief introduction
B. Discuss your first individual variable, using graphical,
numerical summary, and interpretation
C. Discuss your second individual variable, using graphical,
numerical summary, and interpretation
D. Discuss your third individual variable, using graphical,
numerical summary, and interpretation
E. Discuss your first pairing of variables, using graphical,
numerical summary, and interpretation
F. Discuss your second pairing of variables, using graphical,
numerical summary, and interpretation
G. Discuss your third pairing of variables, using graphical,
numerical summary, and interpretation
H. Conclusion
Project Part A: Grading Rubric
Category
Points
%
Description
Three Individual Variables
12 points each
36
36
graphical analysis, numerical analysis (when appropriate) and
interpretation
Three Relationships
15 points each
45
45
graphical analysis, numerical analysis (when appropriate), and
interpretation
Communication Skills
19
19
writing, grammar, clarity, logic, cohesiveness, adherence to the
above format
Total
100
100
A quality paper will meet or exceed all of the above
requirements.
Project Part B: Hypothesis Testing and Confidence Intervals
Your manager has speculated the following.
a. The average (mean) annual income was greater than $45,000.
b. The true population proportion of customers who live in a
suburban area is less than 45%.
c. The average (mean) number of years lived in the current
home is greater than 8 years.
d. The average (mean) credit balance for rural customers is less
than $3,200.
1. Using the sample data, perform the hypothesis test for each
of the above situations in order to see if there is evidence to
support your manager’s belief in each case A–D. In each case,
use the Seven Elements of a Test of Hypothesis in Section 6.2
of your text book with α = .05, and explain your conclusion in
simple terms. Also, be sure to compute the p-value and
interpret.
2. Follow this up with computing 95% confidence intervals for
each of the variables described in A–D, and again interpreting
these intervals.
3. Write a report to your manager about the results, distilling
down the results in a way that would be understandable to
someone who does not know statistics. Clear explanations and
interpretations are critical.
4. All DeVry University policies are in effect, including the
plagiarism policy.
5. Project Part B report is due by the end of Week 6.
6. Project Part B is worth 100 total points. See the grading
rubric below.
Submission: The report from Part 3 and all of the relevant work
done in the hypothesis testing (including minitab) in 1 and the
confidence intervals (minitab) in Part 2 as an appendix
Format for report:
A. Summary report (about one paragraph on each of the
speculations, A–D)
B. Appendix with all of the steps in hypothesis testing (the
format of the Seven Elements of a Test of Hypothesis, in
Section 6.2 of your text book) for each speculation A–D, as well
as the confidence intervals, including all minitab output
Project Part B: Grading Rubric
Category
Points
%
Description
Addressing each speculation—20 points each
80
80
hypothesis test, interpretation, confidence interval, and
interpretation
Summary report
20
20
one paragraph on each of the speculations
Total
100
100
A quality paper will meet or exceed all of the above
requirements.
Project Part C: Regression and Correlation Analysis
Using MINITAB, perform the regression and correlation
analysis for the data on income(Y), the dependent variable,
and credit balance (X), the independent variable, by answering
the following.
1. Generate a scatterplot for income ($1,000) versus credit
balance($), including the graph of the best fit line. Interpret.
2. Determine the equation of the best fit line, which
describes the relationship between income and credit balance.
3. Determine the coefficient of correlation. Interpret.
4. Determine the coefficient of determination. Interpret.
5. Test the utility of this regression model (use a two tail
test with α =.05). Interpret your results, including the p-value.
6. Based on your findings in 1–5, what is your opinion
about using credit balance to predict income? Explain.
7. Compute the 95% confidence interval for beta-1 (the
population slope). Interpret this interval.
8. Using an interval, estimate the average income for
customers that have credit balance of $4,000. Interpret this
interval.
9. Using an interval, predict the income for a customer that
has a credit balance of $4,000. Interpret this interval.
10. What can we say about the income for a customer that
has a credit balance of $10,000? Explain your answer.
In an attempt to improve the model, we attempt to do a multiple
regression model predicting income based on credit balance,
years, and size.
11. Using MINITAB, run the multiple regression analysis
using the variables credit balance, years, and size to predict
income. State the equation for this multiple regression model.
12. Perform the global test foruUtility (F-Test). Explain
your conclusion.
13. Perform the t-test on each independent variable.
Explain your conclusions and clearly state how you should
proceed. In particular, state which independent variables should
we keep and which should be discarded.
14. Is this multiple regression model better than the linear
model that we generated in parts 1–10? Explain.
All DeVry University policies are in effect, including the
plagiarism policy.
15. Project Part C report is due by the end of Week 7.
16. Project Part C is worth 100 total points. See the grading
rubric below.
Summarize your results from 1–14 in a report that is 3 pages or
less in length and explains and interprets the results in ways
that are understandable to someone who does not know
statistics.
Submission: The summary report + all of the work done in 1–14
(Minitab Output + interpretations) as an appendix
Format:
A. Summary Report
B. Points 1–14 addressed with appropriate output, graphs, and
interpretations. Be sure to number each point 1–14.
Project Part C: Grading Rubric
Category
Points
%
Description
Questions 1–12 and 14
5 points each
65
65
addressed with appropriate output, graphs, and interpretations
Question 13
15
15
addressed with appropriate output, graphs, and interpretations
Summary
20
20
writing, grammar, clarity, logic, and cohesiveness
Total
100
100
A quality paper will meet or exceed all of the above
requirements.
Sheet1LocationIncome ($1,000)SizeYearsCredit
Balance($)Urban27122,631Rural25422,047Suburban25113,155S
uburban26123,913Rural30552,660Urban29133,531Rural336102,
766Urban30143,769Suburban32244,082Urban34163,806Urban3
5184,049Urban40194,073Rural30692,697Rural336112,914Urba
n422104,073Suburban32244,310Urban432104,199Urban432104,
253Rural337133,104Urban472104,293Suburban35354,456Urban
542114,340Suburban42354,925Rural367133,178Urban573114,3
91Suburban44364,947Rural387153,203Urban54384,354Urban54
3104,366Suburban46465,003Rural407153,250Urban604114,402
Urban584104,397Urban615134,595Urban615134,786Urban6261
44,888Suburban49585,148Urban686145,011Suburban57685,220
Rural458163,257Urban717155,528Suburban57795,283Suburban
64895,332Rural458173,304Urban747195,553Suburban658105,4
84Rural478183,342Rural538183,788Suburban668105,756Subur
ban698105,861
Sheet2
Sheet3
PROJECT PART A: Exploratory Data Analysis
Introduction
This is a report which will help the AJ DAVIS which is a
department of store chain. This department has many credit
customers. A sample of 50 credit customers was selected for
analysis to know more about the credit customers. The analysis
was done on five variables which include location of the
household, income of the household, household size, and the
number of years that the customer has lived in the current
location and the customer’s current credit card balance on the
store's credit card.
The pie chart on location as a variable
The graph above shows that 44% of the credit customers
live in urban, 30% in suburban and 26% in rural locations.
The variable income summary table
Income
Mean
46.02
Standard Error
1.963439
Median
44.5
Mode
30
Standard Deviation
13.88361
Sample Variance
192.7547
Kurtosis
-1.09058
Skewness
0.258555
Range
49
Minimum
25
Maximum
74
Sum
2301
Count
50
The average income of the sample is 46.02 with a standard
deviation of 13.88. The minimum of the income is 25 and the
maximum is 74 which gives a range of 49.
The distribution table
size
frequency
Relative frequency
Cumulative frequency
1
8
0.16
0.16
2
7
0.14
0.3
3
6
0.12
0.42
4
4
0.08
0.5
5
4
0.08
0.58
6
6
0.12
0.7
7
7
0.14
0.84
8
8
0.16
1
total
50
The bar chart of the variable size
The above graph shows the frequency graph of the household
size. This shows that the family size of 1 and 8 the frequency is
8 and of size 4 and5 the frequency is 4
The pairing of location and income
Location
Sum of Income ($1,000)
Rural
488
Suburban
709
Urban
1104
The graph above shows that the urban location has the highest
income among the three locations and rural has the lowest.
The pairing of years and credit balance
The scatter plot shows that there is a positive relationship
between number of years and the credit balance. As the number
of years increase there is an increase in credit balance.
The pairing of location by size
The graph shows that the size of the rural household is larger as
compare with the other two locations
Conclusion
We can conclude that the rural has large number of household
size but small credit balance and also the income.
Project Part B: Hypothesis Testing and Confidence Intervals
a. The average (mean) annual income was greater than $45,000.
1. Stating the hypothesis
2. Level of significance is 5%
Critical value of z is 1.645
Therefore the critical region is
3. We use z-test statistics, because the sample is greater than
30.
4. The z computed value does not lie in the critical region, the
null hypothesis is not rejected
5. It is reasonable to conclude that the average (mean) annual
income was not greater than $45,000.
b. The true population proportion of customers who live in a
suburban area is less than 45%.
1. Stating the hypothesis
2. Level of significance is 5%
Critical value of z is 1.645
Therefore the critical region is
3. We use z-test statistics, because the sample is greater than
30.
4. The z computed value lies in the critical region, the null
hypothesis is rejected
It is reasonable to conclude that the true population proportion
of customers who live in a suburban area is less than 45%.
c. The average (mean) number of years lived in the current
home is greater than 8 years.
1. Stating the hypothesis
2. Level of significance is 5%
Critical value of z is 1.645
Therefore the critical region is
3. We use z-test statistics, because the sample is greater than
30.
4. The z computed value lies in the critical region, the null
hypothesis is rejected
5. It is reasonable to conclude that the average (mean) number
of years lived in the current home is greater than 8 years.
d. The average (mean) credit balance for rural customers is less
than $3,200.
1. Stating the hypothesis
2. Level of significance is 5%
Critical value of z is 1.645
Therefore the critical region is
3. We use z-test statistics, because the sample is greater than
30.
4. The z computed value does not lie in the critical region, the
null hypothesis is not rejected
5. It is reasonable to conclude that the average (mean) credit
balance for rural customers is not less than $3,200.
A. Confidence interval is given by. The 95% confidence the Z
value is 1.96
We are 95% confidence that the mean of the population lies
between 753.66 and 49868.34
B. Confidence interval is given by. The 95% confidence the Z
value is 1.96
We are 95% confidence that the mean of the population lies
between 0.173 and 0.427
C. Confidence interval is given by. The 95% confidence the Z
value is 1.96
We are 95% confidence that the mean of the population lies
between 8.34 and 10.856
D. Confidence interval is given by. The 95% confidence the Z
value is 1.96
We are 95% confidence that the mean of the population lies
between 3895.152 and 4411.768
Project Part C: Regression and Correlation Analysis
1. Generate a scatterplot for income ($1,000) versus credit
balance ($), including the graph of the best fit line. Interpret.
There is a positive relationship between income and credit
balance as an increase in credit balance lead to increase in
income
2. Determine the equation of the best fit line, which describes
the relationship between income and credit balance.
3. Determine the coefficient of correlation. Interpret.
The correlation is the square root of 0.64084 which is 0.8005.
This shows that there is a strong positive correlation between
income and credit balance.
4. Determine the coefficient of determination. Interpret.
The R2=0.64084. This shows that 64.08% of variation in income
(y) is explain by credit balance (x)
5. Test the utility of this regression model (use a two tail test
with α =.05). Interpret your results, including the p-value.
The utility of this model can be tested by a t-test for beta-1.
From the obtained output we can see the test statistic for that
test is 9.25 with corresponding p-value 0.000. Since the p-value
is smaller than the significance level α=0.05 so we can say that
the model is significant
6. Based on your findings in 1–5, what is your opinion about
using credit balance to predict income? Explain.
Base on my finding, I see that Size is a good predictor of Credit
Balance because Credit Balance and Size seems to affect each
other. As Size increase Credit Balance seems to increases also;
they correlated. As the Size of the household grow so does the
Credit Balance of those household also grew and increase.
7. Compute the 95% confidence interval for beta-1 (the
population slope). Interpret this interval.
I am 95% confident that for each additional credit balance that
the average income will go up between 0.009335 and 0.014518
8. Using an interval, estimate the average income for customers
that have credit balance of $4,000. Interpret this interval.
The interval is given by (41.77, 46.61). I am 95% confident that
the average mean income is between $41,770 and $46,610 that
have a balance of $4,000.Using an interval, predict the income
for a customer that has a credit balance of $4,000. Interpret this
interval.
9. Using an interval, predict the income for a customer that has
a credit balance of $4,000. Interpret this interval.
The interval is (27.11, 61.27). I am 95% confident that the
average predicted income is between $27,110 and $61,270 that
have a balance of $4,000.
10. What can we say about the income for a customer that has a
credit balance of $10,000? Explain your answer.
Putting $10,000 into the regression model we have
Income =-3.516+0.011926*10,000=115.7483
So a customer with 10,000 credit balance, it is more than likely
to have an income of $115,748.27 as per the fitted regression
model
In an attempt to improve the model, we attempt to do a multiple
regression model predicting income based on credit balance,
years, and size.
11. Using MINITAB, run the multiple regression analysis using
the variables credit balance, years, and size to predict income.
State the equation for this multiple regression model.
Model Summary
Model
R
R Square
Adjusted R Square
Std. Error of the Estimate
Change Statistics
R Square Change
F Change
df1
df2
Sig. F Change
1
.930a
.865
.856
5.261
.865
98.405
3
46
.000
a. Predictors: (Constant), Credit Balance, Years, Size
Coefficientsa
Model
Unstandardized Coefficients
Standardized Coefficients
t
Sig.
95.0% Confidence Interval for B
B
Std. Error
Beta
Lower Bound
Upper Bound
1
(Constant)
-13.186
3.608
-3.655
.001
-20.448
-5.923
Size
.615
.418
.112
1.472
.148
-.226
1.456
Years
1.210
.232
.395
5.209
.000
.742
1.677
Credit Balance
.011
.001
.724
13.188
.000
.009
.012
a. Dependent Variable: Income
Regression Analysis: Income ($1, 000 versus Credit Balance,
Years, Size)
The regression equation is
ANOVAa
Model
Sum of Squares
df
Mean Square
F
Sig.
1
Regression
8171.685
3
2723.895
98.405
.000b
Residual
1273.295
46
27.680
Total
9444.980
49
a. Dependent Variable: Income
b. Predictors: (Constant), Credit Balance, Years, Size
12. Perform the global test foruUtility (F-Test). Explain your
conclusion.
From the output we can see the F-test statistics in this case is
98.41 with corresponding p-value 0. Thus the null hypothesis of
insignificancy is rejected and we can conclude that the
regression model is significant is predicting the dependent
variable income
13. Perform the t-test on each independent variable. Explain
your conclusions and clearly state how you should proceed. In
particular, state which independent variables should we keep
and which should be discarded.
The given output, t-test statistic for credit balance is 13.19 with
corresponding p-value 0, for the size it is 1.47 with p-value
0.148 and for years it is 5.21 with p-value 0. Since the p-value
for credit balance and years is smaller than 0.05 so they are
significant is predicting income so we should keep them in the
model but the p-value for size is greater than 0.05 implying size
is not so significant in predicting the income thus we should
remove this variable from the given model
14. Is this multiple regression model better than the linear
model that we generated in parts 1–10? Explain.
We need to check at the R-sq which is the coefficient of
determination for both the two models. The coefficient of
determination for multiple regression model (86.5%) is greater
than that of simple linear regression 64.08% so the multiple
linear regression is explaining higher variance which implies
that MLR model is better than simple linear regression model
Scatter plot of Income vs credit balance
2631 2047 3155 3913 2660 3531 2766 3769 4082 3806 4049 4073
2697 2914 4073 4310 4199 4253 3104 4293 4456 4340 4925
3178 4391 4947 3203 4354 4366 5003 3250 4402 4397 4595
4786 4888 5148 5011 5220 3257 5528 5283 5332 3304 5553
5484 3342 3788 5756 5861 27 25 25 26 30 29 33
30 32 34 35 40 30 33 42 32 43 43 33
47 35 54 42 36 57 44 38 54 54 46 40
60 58 61 61 62 49 68 57 45 71 57 64
45 74 65 47 53 66 69
Credit Balance
Income
Count of Location
Total
Rural Suburban Urban 13 15 22
A bar chart of the houshold size
1 2 3 4 5 6 7 8 8 7 6 4 4
6 7 8
Size
Frequency
Sum of Income ($1,000) by Location
Total Rural Suburban Urban 488 709 1104
Scatter plot of years and credit balance
2 2 1 2 5 3 10 4 4 6 8 9 9
11 10 4 10 10 13 10 5 11 5 13 11
6 15 8 10 6 15 11 10 13 13 14 8
14 8 16 15 9 9 17 19 10 18 18 10
10 2631 2047 3155 3913 2660 3531 2766 3769 4082 3806
4049 4073 2697 2914 4073 4310 4199 4253 3104 4293 4456
4340 4925 3178 4391 4947 3203 4354 4366 5003 3250 4402
4397 4595 4786 4888 5148 5011 5220 3257 5528 5283 5332
3304 5553 5484 3342 3788 5756 5861
Years
Credit Balance
Sum of Size by Location
Total Rural Suburban Urban 87 69 69
Location
Size Total
45000
:
45000
:
1
1
0
>
£
x
H
x
H
645
.
1
>
z
5195
.
0
4390
.
1963
1020
50
61
.
13883
45000
46020
50
,
61
.
13883
,
45000
,
46020
=
=
-
=
-
=
=
=
=
=
n
s
x
z
n
s
x
m
m
45
.
0
:
45
.
0
:
1
0
<
£
p
p
H
H
645
.
1
-
<
z
(
)
1320
.
2
07036
.
0
15
.
0
50
55
.
0
*
45
.
0
45
.
0
3
.
0
1
50
,
45
.
0
,
3
.
0
50
15
0
0
0
0
-
=
-
=
-
=
-
-
=
=
=
=
=
n
p
z
n
p
p
p
p
p
8
:
8
:
1
1
0
>
£
x
H
x
H
497
.
2
6408
.
0
6
.
1
50
531
.
4
8
6
.
9
50
,
531
.
4
,
8
,
6
.
9
=
=
-
=
-
=
=
=
=
=
n
s
x
z
n
s
x
m
m
3200
:
3200
:
1
1
0
<
³
x
H
x
H
645
.
1
-
<
z
235
.
7
7900
.
131
46
.
953
50
8958
.
931
3200
46
.
4153
50
,
8958
.
931
,
3200
,
46
.
4153
=
=
-
=
-
=
=
=
=
=
n
s
x
z
n
s
x
m
m
n
x
s
m
z
±
=
[
]
34
.
49868
,
66
.
753
34
.
3848
46020
50
61
.
13883
*
96
.
1
46020
50
,
61
.
13883
,
46020
±
=
±
=
=
=
=
m
n
s
x
n
pq
p
z
±
=
p
[
]
4270
.
0
,
1730
.
0
1270
.
0
3
.
0
50
7
.
0
*
3
.
0
*
96
.
1
3
.
0
50
,
3
.
0
±
=
±
=
=
=
p
n
p
[
]
856
.
10
,
34
.
8
256
.
1
6
.
9
50
531
.
4
*
96
.
1
6
.
9
50
,
531
.
4
,
6
.
9
±
=
±
=
=
=
=
m
n
s
x
[
]
768
.
4411
,
152
.
3895
308
.
258
46
.
4153
50
8958
.
931
*
96
.
1
46
.
4153
50
,
8958
.
931
,
46
.
4153
±
=
±
=
=
=
=
m
n
s
x
x
y
0119
.
0
516
.
3
+
-
=
(
)
Size
Years
nce
Creditbala
Income
615
.
0
21
.
1
0108
.
0
2
.
13
000
,
1
$
+
+
+
-
=

More Related Content

More from vanesaburnand

InstructionsYou are to create YOUR OWN example of each of t.docx
InstructionsYou are to create YOUR OWN example of each of t.docxInstructionsYou are to create YOUR OWN example of each of t.docx
InstructionsYou are to create YOUR OWN example of each of t.docxvanesaburnand
 
InstructionsYou are a research group from BSocialMarketing, LLC.docx
InstructionsYou are a research group from BSocialMarketing, LLC.docxInstructionsYou are a research group from BSocialMarketing, LLC.docx
InstructionsYou are a research group from BSocialMarketing, LLC.docxvanesaburnand
 
InstructionsYou are attending an international journalist event.docx
InstructionsYou are attending an international journalist event.docxInstructionsYou are attending an international journalist event.docx
InstructionsYou are attending an international journalist event.docxvanesaburnand
 
InstructionsWrite the Organizational section of your project pap.docx
InstructionsWrite the Organizational section of your project pap.docxInstructionsWrite the Organizational section of your project pap.docx
InstructionsWrite the Organizational section of your project pap.docxvanesaburnand
 
InstructionsWrite a two-page (double spaced, Times New Roman S.docx
InstructionsWrite a two-page (double spaced, Times New Roman S.docxInstructionsWrite a two-page (double spaced, Times New Roman S.docx
InstructionsWrite a two-page (double spaced, Times New Roman S.docxvanesaburnand
 
InstructionsWrite a thesis statement in response to the topi.docx
InstructionsWrite a thesis statement in response to the topi.docxInstructionsWrite a thesis statement in response to the topi.docx
InstructionsWrite a thesis statement in response to the topi.docxvanesaburnand
 
InstructionsWhat You will choose a current issue of social.docx
InstructionsWhat You will choose a current issue of social.docxInstructionsWhat You will choose a current issue of social.docx
InstructionsWhat You will choose a current issue of social.docxvanesaburnand
 
InstructionsWrite a paper about the International Monetary Syste.docx
InstructionsWrite a paper about the International Monetary Syste.docxInstructionsWrite a paper about the International Monetary Syste.docx
InstructionsWrite a paper about the International Monetary Syste.docxvanesaburnand
 
InstructionsWrite a comprehensive medical report on a disease we.docx
InstructionsWrite a comprehensive medical report on a disease we.docxInstructionsWrite a comprehensive medical report on a disease we.docx
InstructionsWrite a comprehensive medical report on a disease we.docxvanesaburnand
 
InstructionsWhether you believe” in evolution or not, why is it.docx
InstructionsWhether you believe” in evolution or not, why is it.docxInstructionsWhether you believe” in evolution or not, why is it.docx
InstructionsWhether you believe” in evolution or not, why is it.docxvanesaburnand
 
InstructionsWe have been looking at different psychological .docx
InstructionsWe have been looking at different psychological .docxInstructionsWe have been looking at different psychological .docx
InstructionsWe have been looking at different psychological .docxvanesaburnand
 
InstructionsTITLEF14-2Beginning an 8-column work sheet for a merch.docx
InstructionsTITLEF14-2Beginning an 8-column work sheet for a merch.docxInstructionsTITLEF14-2Beginning an 8-column work sheet for a merch.docx
InstructionsTITLEF14-2Beginning an 8-column work sheet for a merch.docxvanesaburnand
 
InstructionsThis written assignment requires the student to inve.docx
InstructionsThis written assignment requires the student to inve.docxInstructionsThis written assignment requires the student to inve.docx
InstructionsThis written assignment requires the student to inve.docxvanesaburnand
 
InstructionsThe Art Form Most Meaningful to MePick the form .docx
InstructionsThe Art Form Most Meaningful to MePick the form .docxInstructionsThe Art Form Most Meaningful to MePick the form .docx
InstructionsThe Art Form Most Meaningful to MePick the form .docxvanesaburnand
 
InstructionsThink of a specific topic and two specific kin.docx
InstructionsThink of a specific topic and two specific kin.docxInstructionsThink of a specific topic and two specific kin.docx
InstructionsThink of a specific topic and two specific kin.docxvanesaburnand
 
InstructionsThere are different approaches to gathering risk da.docx
InstructionsThere are different approaches to gathering risk da.docxInstructionsThere are different approaches to gathering risk da.docx
InstructionsThere are different approaches to gathering risk da.docxvanesaburnand
 
InstructionsThe  Public Archaeology Presentation invites you.docx
InstructionsThe  Public Archaeology Presentation invites you.docxInstructionsThe  Public Archaeology Presentation invites you.docx
InstructionsThe  Public Archaeology Presentation invites you.docxvanesaburnand
 
InstructionsThe tools of formal analysis are the starting point .docx
InstructionsThe tools of formal analysis are the starting point .docxInstructionsThe tools of formal analysis are the starting point .docx
InstructionsThe tools of formal analysis are the starting point .docxvanesaburnand
 
InstructionsThe Homeland Security (DHS) agency is intended t.docx
InstructionsThe Homeland Security (DHS) agency is intended t.docxInstructionsThe Homeland Security (DHS) agency is intended t.docx
InstructionsThe Homeland Security (DHS) agency is intended t.docxvanesaburnand
 
InstructionsThe student should describe how learning abou.docx
InstructionsThe student should describe how learning abou.docxInstructionsThe student should describe how learning abou.docx
InstructionsThe student should describe how learning abou.docxvanesaburnand
 

More from vanesaburnand (20)

InstructionsYou are to create YOUR OWN example of each of t.docx
InstructionsYou are to create YOUR OWN example of each of t.docxInstructionsYou are to create YOUR OWN example of each of t.docx
InstructionsYou are to create YOUR OWN example of each of t.docx
 
InstructionsYou are a research group from BSocialMarketing, LLC.docx
InstructionsYou are a research group from BSocialMarketing, LLC.docxInstructionsYou are a research group from BSocialMarketing, LLC.docx
InstructionsYou are a research group from BSocialMarketing, LLC.docx
 
InstructionsYou are attending an international journalist event.docx
InstructionsYou are attending an international journalist event.docxInstructionsYou are attending an international journalist event.docx
InstructionsYou are attending an international journalist event.docx
 
InstructionsWrite the Organizational section of your project pap.docx
InstructionsWrite the Organizational section of your project pap.docxInstructionsWrite the Organizational section of your project pap.docx
InstructionsWrite the Organizational section of your project pap.docx
 
InstructionsWrite a two-page (double spaced, Times New Roman S.docx
InstructionsWrite a two-page (double spaced, Times New Roman S.docxInstructionsWrite a two-page (double spaced, Times New Roman S.docx
InstructionsWrite a two-page (double spaced, Times New Roman S.docx
 
InstructionsWrite a thesis statement in response to the topi.docx
InstructionsWrite a thesis statement in response to the topi.docxInstructionsWrite a thesis statement in response to the topi.docx
InstructionsWrite a thesis statement in response to the topi.docx
 
InstructionsWhat You will choose a current issue of social.docx
InstructionsWhat You will choose a current issue of social.docxInstructionsWhat You will choose a current issue of social.docx
InstructionsWhat You will choose a current issue of social.docx
 
InstructionsWrite a paper about the International Monetary Syste.docx
InstructionsWrite a paper about the International Monetary Syste.docxInstructionsWrite a paper about the International Monetary Syste.docx
InstructionsWrite a paper about the International Monetary Syste.docx
 
InstructionsWrite a comprehensive medical report on a disease we.docx
InstructionsWrite a comprehensive medical report on a disease we.docxInstructionsWrite a comprehensive medical report on a disease we.docx
InstructionsWrite a comprehensive medical report on a disease we.docx
 
InstructionsWhether you believe” in evolution or not, why is it.docx
InstructionsWhether you believe” in evolution or not, why is it.docxInstructionsWhether you believe” in evolution or not, why is it.docx
InstructionsWhether you believe” in evolution or not, why is it.docx
 
InstructionsWe have been looking at different psychological .docx
InstructionsWe have been looking at different psychological .docxInstructionsWe have been looking at different psychological .docx
InstructionsWe have been looking at different psychological .docx
 
InstructionsTITLEF14-2Beginning an 8-column work sheet for a merch.docx
InstructionsTITLEF14-2Beginning an 8-column work sheet for a merch.docxInstructionsTITLEF14-2Beginning an 8-column work sheet for a merch.docx
InstructionsTITLEF14-2Beginning an 8-column work sheet for a merch.docx
 
InstructionsThis written assignment requires the student to inve.docx
InstructionsThis written assignment requires the student to inve.docxInstructionsThis written assignment requires the student to inve.docx
InstructionsThis written assignment requires the student to inve.docx
 
InstructionsThe Art Form Most Meaningful to MePick the form .docx
InstructionsThe Art Form Most Meaningful to MePick the form .docxInstructionsThe Art Form Most Meaningful to MePick the form .docx
InstructionsThe Art Form Most Meaningful to MePick the form .docx
 
InstructionsThink of a specific topic and two specific kin.docx
InstructionsThink of a specific topic and two specific kin.docxInstructionsThink of a specific topic and two specific kin.docx
InstructionsThink of a specific topic and two specific kin.docx
 
InstructionsThere are different approaches to gathering risk da.docx
InstructionsThere are different approaches to gathering risk da.docxInstructionsThere are different approaches to gathering risk da.docx
InstructionsThere are different approaches to gathering risk da.docx
 
InstructionsThe  Public Archaeology Presentation invites you.docx
InstructionsThe  Public Archaeology Presentation invites you.docxInstructionsThe  Public Archaeology Presentation invites you.docx
InstructionsThe  Public Archaeology Presentation invites you.docx
 
InstructionsThe tools of formal analysis are the starting point .docx
InstructionsThe tools of formal analysis are the starting point .docxInstructionsThe tools of formal analysis are the starting point .docx
InstructionsThe tools of formal analysis are the starting point .docx
 
InstructionsThe Homeland Security (DHS) agency is intended t.docx
InstructionsThe Homeland Security (DHS) agency is intended t.docxInstructionsThe Homeland Security (DHS) agency is intended t.docx
InstructionsThe Homeland Security (DHS) agency is intended t.docx
 
InstructionsThe student should describe how learning abou.docx
InstructionsThe student should describe how learning abou.docxInstructionsThe student should describe how learning abou.docx
InstructionsThe student should describe how learning abou.docx
 

Recently uploaded

Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentInMediaRes1
 
Capitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitolTechU
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxAvyJaneVismanos
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...Marc Dusseiller Dusjagr
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17Celine George
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...jaredbarbolino94
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfSumit Tiwari
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
internship ppt on smartinternz platform as salesforce developer
internship ppt on smartinternz platform as salesforce developerinternship ppt on smartinternz platform as salesforce developer
internship ppt on smartinternz platform as salesforce developerunnathinaik
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentInMediaRes1
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsanshu789521
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Celine George
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 

Recently uploaded (20)

Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media Component
 
Capitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptx
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptx
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
internship ppt on smartinternz platform as salesforce developer
internship ppt on smartinternz platform as salesforce developerinternship ppt on smartinternz platform as salesforce developer
internship ppt on smartinternz platform as salesforce developer
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
9953330565 Low Rate Call Girls In Rohini Delhi NCR
9953330565 Low Rate Call Girls In Rohini  Delhi NCR9953330565 Low Rate Call Girls In Rohini  Delhi NCR
9953330565 Low Rate Call Girls In Rohini Delhi NCR
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 

Course Project AJ DAVIS DEPARTMENT STORESIntroduction.docx

  • 1. Course Project: AJ DAVIS DEPARTMENT STORES Introduction AJ DAVIS is a department store chain, which has many credit customers and wants to find out more information about these customers. A sample of 50 credit customers is selected with data collected on the following five variables. 1. Location (rural, urban, suburban) 2. Income (in $1,000's—be careful with this) 3. Size (household size, meaning number of people living in the household) 4. Years (the number of years that the customer has lived in the current location) 5. Credit balance (the customers current credit card balance on the store's credit card, in $). The data is available in Doc Sharing Course Project Data Set as an Excel file. You are to copy and paste the data set into a minitab worksheet. PROJECT PART A: Exploratory Data Analysis · Open the file MATH533 Project Consumer.xls from the Course Project Data Set folder in Doc Sharing. · For each of the five variables, process, organize, present, and summarize the data. Analyze each variable by itself using graphical and numerical techniques of summarization. Use minitab as much as possible, explaining what the printout tells you. You may wish to use some of the following graphs: stem-
  • 2. leaf diagram, frequency or relative frequency table, histogram, boxplot, dotplot, pie chart, bar graph. Caution: Not all of these are appropriate for each of these variables, nor are they all necessary. More is not necessarily better. In addition, be sure to find the appropriate measures of central tendency and measures of dispersion for the above data. Where appropriate use the five number summary (the Min, Q1, Median, Q3, Max). Once again, use minitab as appropriate, and explain what the results mean. · Analyze the connections or relationships between the variables. There are 10 pairings here (location and income, location and size, location and years, location and credit balance, income and size, income and years, income and balance, size and years, size and credit balance, years and Credit Balance). Use graphical as well as numerical summary measures. Explain what you see. Be sure to consider all 10 pairings. Some variables show clear relationships, while others do not. · Prepare your report in Microsoft Word (or some other word processing package), integrating your graphs and tables with text explanations and interpretations.Be sure that you have graphical and numerical back up for your explanations and interpretations. Be selective in what you include in the report. I'm not looking for a 20-page report on every variable and every possible relationship (that's 15 things to do). Rather, what I want you do is to highlight what you see for three individual variables(no more than one graph for each, one or two measures of central tendency and variability (as appropriate), and two or three sentences of interpretation). For the 10 pairings, identify and report only on three of the pairings, again using graphical and numerical summary (as appropriate), with interpretations. Please note that at least one of your pairings must include location and at least one of your pairings must not include location. · All DeVry University policies are in effect, including the plagiarism policy. · Project Part A report is due by the end of Week 2.
  • 3. · Project Part A is worth 100 total points. See grading rubric below. Submission: The report from Part 4, including all relevant graphs and numerical analysis along with interpretations Format for report: A. Brief introduction B. Discuss your first individual variable, using graphical, numerical summary, and interpretation C. Discuss your second individual variable, using graphical, numerical summary, and interpretation D. Discuss your third individual variable, using graphical, numerical summary, and interpretation E. Discuss your first pairing of variables, using graphical, numerical summary, and interpretation F. Discuss your second pairing of variables, using graphical, numerical summary, and interpretation G. Discuss your third pairing of variables, using graphical, numerical summary, and interpretation H. Conclusion Project Part A: Grading Rubric Category Points % Description Three Individual Variables 12 points each 36 36 graphical analysis, numerical analysis (when appropriate) and interpretation Three Relationships 15 points each
  • 4. 45 45 graphical analysis, numerical analysis (when appropriate), and interpretation Communication Skills 19 19 writing, grammar, clarity, logic, cohesiveness, adherence to the above format Total 100 100 A quality paper will meet or exceed all of the above requirements. Project Part B: Hypothesis Testing and Confidence Intervals Your manager has speculated the following. a. The average (mean) annual income was greater than $45,000. b. The true population proportion of customers who live in a suburban area is less than 45%. c. The average (mean) number of years lived in the current home is greater than 8 years. d. The average (mean) credit balance for rural customers is less than $3,200. 1. Using the sample data, perform the hypothesis test for each of the above situations in order to see if there is evidence to support your manager’s belief in each case A–D. In each case, use the Seven Elements of a Test of Hypothesis in Section 6.2 of your text book with α = .05, and explain your conclusion in simple terms. Also, be sure to compute the p-value and interpret. 2. Follow this up with computing 95% confidence intervals for each of the variables described in A–D, and again interpreting
  • 5. these intervals. 3. Write a report to your manager about the results, distilling down the results in a way that would be understandable to someone who does not know statistics. Clear explanations and interpretations are critical. 4. All DeVry University policies are in effect, including the plagiarism policy. 5. Project Part B report is due by the end of Week 6. 6. Project Part B is worth 100 total points. See the grading rubric below. Submission: The report from Part 3 and all of the relevant work done in the hypothesis testing (including minitab) in 1 and the confidence intervals (minitab) in Part 2 as an appendix Format for report: A. Summary report (about one paragraph on each of the speculations, A–D) B. Appendix with all of the steps in hypothesis testing (the format of the Seven Elements of a Test of Hypothesis, in Section 6.2 of your text book) for each speculation A–D, as well as the confidence intervals, including all minitab output Project Part B: Grading Rubric Category Points % Description Addressing each speculation—20 points each 80 80 hypothesis test, interpretation, confidence interval, and interpretation Summary report 20
  • 6. 20 one paragraph on each of the speculations Total 100 100 A quality paper will meet or exceed all of the above requirements. Project Part C: Regression and Correlation Analysis Using MINITAB, perform the regression and correlation analysis for the data on income(Y), the dependent variable, and credit balance (X), the independent variable, by answering the following. 1. Generate a scatterplot for income ($1,000) versus credit balance($), including the graph of the best fit line. Interpret. 2. Determine the equation of the best fit line, which describes the relationship between income and credit balance. 3. Determine the coefficient of correlation. Interpret. 4. Determine the coefficient of determination. Interpret. 5. Test the utility of this regression model (use a two tail test with α =.05). Interpret your results, including the p-value. 6. Based on your findings in 1–5, what is your opinion about using credit balance to predict income? Explain. 7. Compute the 95% confidence interval for beta-1 (the population slope). Interpret this interval. 8. Using an interval, estimate the average income for customers that have credit balance of $4,000. Interpret this interval. 9. Using an interval, predict the income for a customer that has a credit balance of $4,000. Interpret this interval. 10. What can we say about the income for a customer that has a credit balance of $10,000? Explain your answer. In an attempt to improve the model, we attempt to do a multiple
  • 7. regression model predicting income based on credit balance, years, and size. 11. Using MINITAB, run the multiple regression analysis using the variables credit balance, years, and size to predict income. State the equation for this multiple regression model. 12. Perform the global test foruUtility (F-Test). Explain your conclusion. 13. Perform the t-test on each independent variable. Explain your conclusions and clearly state how you should proceed. In particular, state which independent variables should we keep and which should be discarded. 14. Is this multiple regression model better than the linear model that we generated in parts 1–10? Explain. All DeVry University policies are in effect, including the plagiarism policy. 15. Project Part C report is due by the end of Week 7. 16. Project Part C is worth 100 total points. See the grading rubric below. Summarize your results from 1–14 in a report that is 3 pages or less in length and explains and interprets the results in ways that are understandable to someone who does not know statistics. Submission: The summary report + all of the work done in 1–14 (Minitab Output + interpretations) as an appendix Format: A. Summary Report B. Points 1–14 addressed with appropriate output, graphs, and interpretations. Be sure to number each point 1–14. Project Part C: Grading Rubric Category Points
  • 8. % Description Questions 1–12 and 14 5 points each 65 65 addressed with appropriate output, graphs, and interpretations Question 13 15 15 addressed with appropriate output, graphs, and interpretations Summary 20 20 writing, grammar, clarity, logic, and cohesiveness Total 100 100 A quality paper will meet or exceed all of the above requirements. Sheet1LocationIncome ($1,000)SizeYearsCredit Balance($)Urban27122,631Rural25422,047Suburban25113,155S uburban26123,913Rural30552,660Urban29133,531Rural336102, 766Urban30143,769Suburban32244,082Urban34163,806Urban3 5184,049Urban40194,073Rural30692,697Rural336112,914Urba n422104,073Suburban32244,310Urban432104,199Urban432104, 253Rural337133,104Urban472104,293Suburban35354,456Urban 542114,340Suburban42354,925Rural367133,178Urban573114,3 91Suburban44364,947Rural387153,203Urban54384,354Urban54
  • 9. 3104,366Suburban46465,003Rural407153,250Urban604114,402 Urban584104,397Urban615134,595Urban615134,786Urban6261 44,888Suburban49585,148Urban686145,011Suburban57685,220 Rural458163,257Urban717155,528Suburban57795,283Suburban 64895,332Rural458173,304Urban747195,553Suburban658105,4 84Rural478183,342Rural538183,788Suburban668105,756Subur ban698105,861 Sheet2 Sheet3 PROJECT PART A: Exploratory Data Analysis Introduction This is a report which will help the AJ DAVIS which is a department of store chain. This department has many credit customers. A sample of 50 credit customers was selected for analysis to know more about the credit customers. The analysis was done on five variables which include location of the household, income of the household, household size, and the number of years that the customer has lived in the current location and the customer’s current credit card balance on the store's credit card. The pie chart on location as a variable The graph above shows that 44% of the credit customers live in urban, 30% in suburban and 26% in rural locations. The variable income summary table Income Mean 46.02 Standard Error 1.963439 Median 44.5 Mode
  • 10. 30 Standard Deviation 13.88361 Sample Variance 192.7547 Kurtosis -1.09058 Skewness 0.258555 Range 49 Minimum 25 Maximum 74 Sum 2301 Count 50 The average income of the sample is 46.02 with a standard deviation of 13.88. The minimum of the income is 25 and the maximum is 74 which gives a range of 49. The distribution table size frequency Relative frequency Cumulative frequency 1 8 0.16 0.16 2 7 0.14 0.3
  • 11. 3 6 0.12 0.42 4 4 0.08 0.5 5 4 0.08 0.58 6 6 0.12 0.7 7 7 0.14 0.84 8 8 0.16 1 total 50 The bar chart of the variable size The above graph shows the frequency graph of the household size. This shows that the family size of 1 and 8 the frequency is 8 and of size 4 and5 the frequency is 4
  • 12. The pairing of location and income Location Sum of Income ($1,000) Rural 488 Suburban 709 Urban 1104 The graph above shows that the urban location has the highest income among the three locations and rural has the lowest. The pairing of years and credit balance The scatter plot shows that there is a positive relationship between number of years and the credit balance. As the number of years increase there is an increase in credit balance. The pairing of location by size The graph shows that the size of the rural household is larger as compare with the other two locations Conclusion We can conclude that the rural has large number of household size but small credit balance and also the income. Project Part B: Hypothesis Testing and Confidence Intervals a. The average (mean) annual income was greater than $45,000. 1. Stating the hypothesis 2. Level of significance is 5% Critical value of z is 1.645
  • 13. Therefore the critical region is 3. We use z-test statistics, because the sample is greater than 30. 4. The z computed value does not lie in the critical region, the null hypothesis is not rejected 5. It is reasonable to conclude that the average (mean) annual income was not greater than $45,000. b. The true population proportion of customers who live in a suburban area is less than 45%. 1. Stating the hypothesis 2. Level of significance is 5% Critical value of z is 1.645 Therefore the critical region is 3. We use z-test statistics, because the sample is greater than 30. 4. The z computed value lies in the critical region, the null hypothesis is rejected It is reasonable to conclude that the true population proportion of customers who live in a suburban area is less than 45%. c. The average (mean) number of years lived in the current home is greater than 8 years.
  • 14. 1. Stating the hypothesis 2. Level of significance is 5% Critical value of z is 1.645 Therefore the critical region is 3. We use z-test statistics, because the sample is greater than 30. 4. The z computed value lies in the critical region, the null hypothesis is rejected 5. It is reasonable to conclude that the average (mean) number of years lived in the current home is greater than 8 years. d. The average (mean) credit balance for rural customers is less than $3,200. 1. Stating the hypothesis 2. Level of significance is 5% Critical value of z is 1.645 Therefore the critical region is 3. We use z-test statistics, because the sample is greater than 30.
  • 15. 4. The z computed value does not lie in the critical region, the null hypothesis is not rejected 5. It is reasonable to conclude that the average (mean) credit balance for rural customers is not less than $3,200. A. Confidence interval is given by. The 95% confidence the Z value is 1.96 We are 95% confidence that the mean of the population lies between 753.66 and 49868.34 B. Confidence interval is given by. The 95% confidence the Z value is 1.96 We are 95% confidence that the mean of the population lies between 0.173 and 0.427 C. Confidence interval is given by. The 95% confidence the Z value is 1.96 We are 95% confidence that the mean of the population lies between 8.34 and 10.856
  • 16. D. Confidence interval is given by. The 95% confidence the Z value is 1.96 We are 95% confidence that the mean of the population lies between 3895.152 and 4411.768 Project Part C: Regression and Correlation Analysis 1. Generate a scatterplot for income ($1,000) versus credit balance ($), including the graph of the best fit line. Interpret. There is a positive relationship between income and credit balance as an increase in credit balance lead to increase in income 2. Determine the equation of the best fit line, which describes the relationship between income and credit balance. 3. Determine the coefficient of correlation. Interpret. The correlation is the square root of 0.64084 which is 0.8005. This shows that there is a strong positive correlation between income and credit balance. 4. Determine the coefficient of determination. Interpret. The R2=0.64084. This shows that 64.08% of variation in income (y) is explain by credit balance (x) 5. Test the utility of this regression model (use a two tail test
  • 17. with α =.05). Interpret your results, including the p-value. The utility of this model can be tested by a t-test for beta-1. From the obtained output we can see the test statistic for that test is 9.25 with corresponding p-value 0.000. Since the p-value is smaller than the significance level α=0.05 so we can say that the model is significant 6. Based on your findings in 1–5, what is your opinion about using credit balance to predict income? Explain. Base on my finding, I see that Size is a good predictor of Credit Balance because Credit Balance and Size seems to affect each other. As Size increase Credit Balance seems to increases also; they correlated. As the Size of the household grow so does the Credit Balance of those household also grew and increase. 7. Compute the 95% confidence interval for beta-1 (the population slope). Interpret this interval. I am 95% confident that for each additional credit balance that the average income will go up between 0.009335 and 0.014518 8. Using an interval, estimate the average income for customers that have credit balance of $4,000. Interpret this interval. The interval is given by (41.77, 46.61). I am 95% confident that the average mean income is between $41,770 and $46,610 that have a balance of $4,000.Using an interval, predict the income for a customer that has a credit balance of $4,000. Interpret this interval. 9. Using an interval, predict the income for a customer that has a credit balance of $4,000. Interpret this interval. The interval is (27.11, 61.27). I am 95% confident that the average predicted income is between $27,110 and $61,270 that have a balance of $4,000. 10. What can we say about the income for a customer that has a credit balance of $10,000? Explain your answer. Putting $10,000 into the regression model we have Income =-3.516+0.011926*10,000=115.7483 So a customer with 10,000 credit balance, it is more than likely to have an income of $115,748.27 as per the fitted regression model
  • 18. In an attempt to improve the model, we attempt to do a multiple regression model predicting income based on credit balance, years, and size. 11. Using MINITAB, run the multiple regression analysis using the variables credit balance, years, and size to predict income. State the equation for this multiple regression model. Model Summary Model R R Square Adjusted R Square Std. Error of the Estimate Change Statistics R Square Change F Change df1 df2 Sig. F Change 1 .930a .865 .856 5.261 .865 98.405 3 46 .000 a. Predictors: (Constant), Credit Balance, Years, Size
  • 19. Coefficientsa Model Unstandardized Coefficients Standardized Coefficients t Sig. 95.0% Confidence Interval for B B Std. Error Beta Lower Bound Upper Bound 1 (Constant) -13.186 3.608 -3.655 .001 -20.448 -5.923 Size .615 .418 .112 1.472 .148 -.226 1.456 Years 1.210
  • 20. .232 .395 5.209 .000 .742 1.677 Credit Balance .011 .001 .724 13.188 .000 .009 .012 a. Dependent Variable: Income Regression Analysis: Income ($1, 000 versus Credit Balance, Years, Size) The regression equation is ANOVAa Model Sum of Squares df Mean Square F Sig. 1 Regression 8171.685 3 2723.895 98.405 .000b
  • 21. Residual 1273.295 46 27.680 Total 9444.980 49 a. Dependent Variable: Income b. Predictors: (Constant), Credit Balance, Years, Size 12. Perform the global test foruUtility (F-Test). Explain your conclusion. From the output we can see the F-test statistics in this case is 98.41 with corresponding p-value 0. Thus the null hypothesis of insignificancy is rejected and we can conclude that the regression model is significant is predicting the dependent variable income 13. Perform the t-test on each independent variable. Explain your conclusions and clearly state how you should proceed. In particular, state which independent variables should we keep and which should be discarded. The given output, t-test statistic for credit balance is 13.19 with corresponding p-value 0, for the size it is 1.47 with p-value 0.148 and for years it is 5.21 with p-value 0. Since the p-value for credit balance and years is smaller than 0.05 so they are significant is predicting income so we should keep them in the model but the p-value for size is greater than 0.05 implying size is not so significant in predicting the income thus we should remove this variable from the given model
  • 22. 14. Is this multiple regression model better than the linear model that we generated in parts 1–10? Explain. We need to check at the R-sq which is the coefficient of determination for both the two models. The coefficient of determination for multiple regression model (86.5%) is greater than that of simple linear regression 64.08% so the multiple linear regression is explaining higher variance which implies that MLR model is better than simple linear regression model Scatter plot of Income vs credit balance 2631 2047 3155 3913 2660 3531 2766 3769 4082 3806 4049 4073 2697 2914 4073 4310 4199 4253 3104 4293 4456 4340 4925 3178 4391 4947 3203 4354 4366 5003 3250 4402 4397 4595 4786 4888 5148 5011 5220 3257 5528 5283 5332 3304 5553 5484 3342 3788 5756 5861 27 25 25 26 30 29 33 30 32 34 35 40 30 33 42 32 43 43 33 47 35 54 42 36 57 44 38 54 54 46 40 60 58 61 61 62 49 68 57 45 71 57 64 45 74 65 47 53 66 69 Credit Balance Income Count of Location Total Rural Suburban Urban 13 15 22
  • 23. A bar chart of the houshold size 1 2 3 4 5 6 7 8 8 7 6 4 4 6 7 8 Size Frequency Sum of Income ($1,000) by Location Total Rural Suburban Urban 488 709 1104 Scatter plot of years and credit balance 2 2 1 2 5 3 10 4 4 6 8 9 9 11 10 4 10 10 13 10 5 11 5 13 11 6 15 8 10 6 15 11 10 13 13 14 8 14 8 16 15 9 9 17 19 10 18 18 10 10 2631 2047 3155 3913 2660 3531 2766 3769 4082 3806 4049 4073 2697 2914 4073 4310 4199 4253 3104 4293 4456 4340 4925 3178 4391 4947 3203 4354 4366 5003 3250 4402 4397 4595 4786 4888 5148 5011 5220 3257 5528 5283 5332 3304 5553 5484 3342 3788 5756 5861 Years Credit Balance
  • 24. Sum of Size by Location Total Rural Suburban Urban 87 69 69 Location Size Total 45000 : 45000 : 1 1 0 > £ x H x H 645 . 1 > z 5195 . 0 4390 . 1963 1020