Past year paper, October 2010
CONFIDENTIAL CS/OCT2010/QMT554
UNIVERSITI TEKNOLOGI MARA
FINAL EXAMINATION
COURSE : DATA ANALYSIS
COURSE CODE : QMT554
EXAMINATION : OCTOBER 2010
TIME : 3 HOURS
INSTRUCTIONS TO CANDIDATES
1. This question paper consists of five (5) questions.
2. Answer ALL questions in the Answer Booklet. Start each answer on a new page.
3. Do not bring any material into the examination room unless permission is given by the
invigilator.
4. Please check to make sure that this examination pack consists of:
i) the Question Paper
ii) a four-page Appendix (Formula List)
iii) an Answer Booklet - provided by the Faculty
iv) Statistical tables - provided by the faculty
DO NOT TURN THIS PAGE UNTIL YOU ARE TOLD TO DO SO
This examination paper consists of 10 printed pages
© Hak Cipta Universiti Teknologi MARA CONFIDENTIAL
QUESTION 1
a) The tourism industry in Malaysia is an important foreign exchange earner,
contributing to economic growth, attracting investments and providing employment.
Realizing the importance of the tourism industry, the government aims to
enhance the country's position as a leading foreign tourist destination. Amy, a
researcher from a well-known consulting firm, is given the task of determining the level of
satisfaction among foreign tourists with the services provided at tourist attractions located
throughout Malaysia. Questionnaires are used as the tool for
data collection and a random sample of 50 foreign tourists is selected at various
tourist visit destinations. Each tourist selected was asked to give a score to the
services provided at the tourist visit destinations. In addition, other information such
as gender, age, education level, occupation, income, country of origin, reasons for
traveling, and length of stay was also recorded.
i) State the population for the above study.
(1 mark)
ii) Does the study involve primary or secondary data? Give a reason to support
your answer.
(2 marks)
iii) Name any three variables from the above study. For each variable chosen,
state its type and the most appropriate graphical presentation.
(6 marks)
iv) Amy is required to summarize and analyze the information collected from the
above study. Suggest the appropriate statistical test that can be used to
test each of the following hypotheses.
a) There are differences in the scores obtained between genders.
b) There are differences in the scores obtained among the education
levels.
c) The level of satisfaction is independent of gender.
d) There is a relationship between the scores and the income of the
foreign tourists.
(4 marks)
b) The scores (out of 100) given by the foreign tourists to the services provided at the
tourist visit destinations are summarized as below:
Table 1: Descriptive Statistics
Gender   N    Mean   Median   Std. Deviation   Minimum   Maximum   Skewness
Male     28   84     88       7.2              78        92        -1.6667
Female   22   83     80       6.8              75        85        0.9184
i) How many female foreign tourists are selected in the study?
(1 mark)
ii) State the lowest score given by the male foreign tourists to the services
provided at the tourist visit destinations.
(1 mark)
iii) State the highest score given to the services provided at the tourist visit
destinations.
(1 mark)
iv) State the skewness of the distribution of the male tourists' scores and explain
what it means.
(2 marks)
v) Which gender is more consistent when giving scores to the services provided
at the tourist visit destinations? Give a reason for your answer.
(2 marks)
QUESTION 2
a) The manager of the Royale Star Resort Hotel stated that the mean guest bill during
weekends is RM700 or less. A member of the hotel's accounting staff noticed that
the total charges for guest bills have been increasing in recent months. A sample
of weekend guest bills was taken to test the manager's claim. Analysis using SPSS
gives the following results.
Table 2: One-Sample Statistics
              N    Mean       Std. Deviation   Std. Error Mean
guest bills   20   705.8000   114.56949        25.61852

Table 3: One-Sample Test (Test Value = 700)
                                                            95% Confidence Interval
                                                            of the Difference
              t      df   Sig. (2-tailed)   Mean Difference   Lower      Upper
guest bills   .226   19   .823              5.80000           -47.8202   59.4202
i) Determine the 95% confidence interval for the mean weekend guest bills.
(3 marks)
ii) Specify the null and alternative hypotheses for the above test.
(2 marks)
iii) Show that the test statistic t is 0.226.
(2 marks)
iv) Based on the p-value in the SPSS output, is there sufficient evidence to
support the manager's claim at 5% significance level?
(3 marks)
b) The Royale Star Resort Hotel manager also claims that 50% of the guests will
stay at the hotel on their next visit. A survey was carried out and the results were
analyzed using SPSS. The output is given in Table 4.
Table 4: Binomial Test
                             Category   N     Observed Prop.   Test Prop.   Asymp. Sig. (2-tailed)
guest's response   Group 1   yes        56    .56              .50          .271 a
                   Group 2   no         44    .44
                   Total                100   1.00
a. Based on Z Approximation.
i) Specify the null and the alternative hypotheses for the above test.
(2 marks)
ii) Based on the SPSS output, is there sufficient evidence to support the
manager's claim at α = 0.05?
(3 marks)
iii) Determine the 95% confidence interval for the proportion of guests who will
stay at the Royale Star Resort Hotel on their next visit.
(3 marks)
iv) Interpret the confidence interval obtained in iii).
(2 marks)
QUESTION 3
a) The senior chef wants to investigate the difference in the mean price (in RM)
between two brands of tomato soup in the market. The chef randomly samples eight
stores. Each store sells its own brand (1) and a national brand (2) of tomato soup.
The SPSS results for the prices of a can of tomato soup of each brand from the different
stores are presented in Table 5 and Table 6:
Table 5: Group Statistics
                            Brand Type   Mean     Std. Deviation   Std. Error Mean
Mean price of tomato soup   1            2.2000   .13352           .04721
                            2            2.0200   .10690           .03780
Table 6: Independent Samples Test (Mean price of tomato soup)

Levene's Test for Equality of Variances:
                              F      Sig.
Equal variances assumed       .439   .51855

t-test for Equality of Means:
                                                                                             95% Confidence Interval
                                                                                             of the Difference
                              t         df          Sig. (2-tailed)   Mean Difference   Std. Error Difference   Lower    Upper
Equal variances assumed       2.97647   14          .01001            .18000            .06047                  .05030   .30970
Equal variances not assumed   2.976     1.33607E1   .01044            .18000            .06047                  .04971   .31029
i) State the hypotheses for the above test.
(2 marks)
ii) Based on the results, what is the assumption for the variances of the price
between the two brands of tomato soup? Use α = 0.05.
(2 marks)
iii) Using the p-value in the SPSS output, do the data provide sufficient evidence
to indicate that there is a difference between the mean price of the two
brands? Use α = 0.05.
(3 marks)
iv) State the 95% confidence interval for the difference in mean price between these
brands. Is the confidence interval consistent with your answer in iii)? Explain your
answer.
(4 marks)
b) A food marketing consultant was hired to visit a random sample of five food stores
across the district of Petaling Jaya to investigate whether the mean net sales had
improved. Each store was part of a large franchise of food stores. The consultant
taught the managers of each store better ways to advise and display their foods. The
net sales for 1 month before and 1 month after the consultant's visit were recorded.
The data were analyzed using SPSS and the results are as follows:
Table 7: Paired Samples Statistics
                                        Mean      N   Std. Deviation   Std. Error Mean
Pair 1   Sales of food (before visit)   64.3000   5   21.30000         9.52565
         Sales of food (after visit)    69.2400   5   22.99180         10.28225
Table 8: Paired Samples Test
Pair 1: Sales of food (before visit) - Sales of food (after visit)

         Paired Differences
                                                     95% Confidence Interval
                                                     of the Difference
         Mean    Std. Deviation   Std. Error Mean   Lower     Upper    t       df   Sig. (2-tailed)
Pair 1   -4.94   3.90103          1.7446            -9.7838   -.0962   -2.83   4    .047
i) Show how the value of the test statistic for mean is obtained.
(3 marks)
ii) Using the p-value, is there sufficient evidence to indicate that the mean
net sales have improved? Test at the 5% significance level.
(6 marks)
QUESTION 4
a) The program coordinator from the Faculty of Hotel and Tourism wanted to investigate the
effectiveness of three different teaching methods for the Research Methodology course.
Students registered for the course were assigned at random to three different
classes, each taught using one of the three methods. The students' marks at the end of
the semester were recorded in the table below.
Table 9: Students' marks in three different classes
Method I Method II Method III
60 70 80
65 72 82
55 85 80
50 84 90
58 82 92
62 78 98
68 88 95
70 74 90
52 80 95
62 76 90
The SPSS software was used to conduct the analysis of variance using the recorded
data. Table 10 gives the output of the analysis.
Table 10: ANOVA
                 Sum of Squares   df   Mean Square   F
Between Groups   4322.600         2    X             Z
Within Groups    V                W    Y
Total            5404.700         29
Based on the output, answer the following questions:
i) How many observations are involved in this study?
(1 mark)
ii) Compute the values of V, W, X, Y and Z.
(4 marks)
iii) State the null and alternative hypotheses for this study.
(2 marks)
iv) Test the hypothesis that the three different teaching methods have an effect
on the students' performance at α = 0.025.
(4 marks)
b) A lecturer wanted to know whether the courses offered to the students of the Faculty
of Hotel and Tourism this semester are suitable for their program, based on the
students' opinions. He distributed a questionnaire to gather information regarding the
courses offered and their suitability for the program. The following table shows the
results obtained.
Table 11: Students' opinion towards courses offered
Do you think the course            Courses offered
offered suits the program?   Statistics   Business   Accounting
Yes                          85           60         77
No                           20           13         16
Total                        105          73         93
Below is the two-way contingency table obtained from SPSS output.
Table 12: student's opinion * course_offered Crosstabulation
                                          course_offered
                                 Statistics   Business   Accounting   Total
opinion   yes   Count            E            60         77           222
                Expected Count   86.0         59.8       76.2         222.0
          no    Count            20           13         16           49
                Expected Count   19.0         13.2       F            49.0
Total           Count            105          73         93
                Expected Count   105.0        73.0       93.0
Table 13: Chi-Square Tests
                               Value   df   Asymp. Sig. (2-sided)
Pearson Chi-Square             G       2    .943
Likelihood Ratio               .118    2    .943
Linear-by-Linear Association   .114    1    .736
N of Valid Cases               271
a. 0 cells (.0%) have expected count less than 5. The minimum
expected count is 13.20.
i) Compute the value for E, F and G.
(4 marks)
ii) State the null and alternative hypotheses to test whether there is an
association between the courses offered and the students' opinion on the
suitability of the courses to their program.
(2 marks)
iii) Based on the p-value, state your decision and conclusion for the above test.
Use α = 0.05.
(3 marks)
QUESTION 5
a) A study was carried out to determine the relationship between the age of a
chef and the time (in minutes) needed to prepare a dish. The table below shows the
data recorded for eight randomly selected chefs.
Table 14
Age (years) Time (minutes)
23 63
45 52
34 55
50 54
44 50
29 60
36 57
52 50
Below is the output obtained from SPSS.
Table 15: Model Summary
Model   R        R Square   Adjusted R Square   Std. Error of the Estimate
1       .901 a   .811       .780                2.193
a. Predictors: (Constant), age

Table 16: Coefficients
                     Unstandardized Coefficients   Standardized Coefficients
Model                B        Std. Error           Beta                        t        Sig.
1       (Constant)   71.133   3.246                                            21.914   .000
        age          -.409    .081                 -.901                       -5.078   .002
i) Identify the independent and the dependent variables.
(2 marks)
ii) Prove that the product moment correlation coefficient is -0.901 and explain its
meaning.
(4 marks)
iii) What percentage of the variation in the time taken to prepare a dish is
explained by differences in the age of the chefs?
(1 mark)
iv) Determine the slope and y-intercept of the regression equation. Interpret the
coefficients in the context of the problem.
(5 marks)
v) Write the complete regression equation. Estimate the time needed for a chef
who is 30 years old to prepare a dish.
(4 marks)
b) A manager wishes to estimate the mean time the housekeeping staff take to prepare a
hotel guest room. The time is found to be approximately normally distributed with
a population standard deviation estimated to be 15 minutes. How many
housekeeping staff should be sampled if the manager wants to be 95% confident
that the sample mean differs from the true mean by at most 5 minutes?
(4 marks)
END OF QUESTION PAPER
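APPENDIX 1
KEY FORMULAS
CONFIDENCE INTERVAL
Parameter and description, followed by the $(1 - \alpha)100\%$ confidence interval:

Mean $\mu$, for large samples:
    $\bar{x} \pm z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}$  or  $\bar{x} \pm z_{\alpha/2} \dfrac{s}{\sqrt{n}}$

Mean $\mu$, for small samples:
    $\bar{x} \pm t_{\alpha/2} \dfrac{s}{\sqrt{n}}$,  $df = n - 1$

Proportion $\pi$:
    $p \pm z_{\alpha/2} \sqrt{\dfrac{p(1 - p)}{n}}$

Difference in means $\mu_1 - \mu_2$, for large and independent samples:
    $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$  or  $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$

Difference in means $\mu_1 - \mu_2$, for small and independent samples (equal $\sigma^2$):
    $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\, s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$,  $df = n_1 + n_2 - 2$
    $s_p = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$

Difference in means $\mu_1 - \mu_2 = \mu_d$, for paired samples:
    $\bar{d} \pm t_{\alpha/2} \dfrac{s_d}{\sqrt{n}}$,  $df = n - 1$
    $\bar{d} = \dfrac{\sum d}{n}$,  $s_d = \sqrt{\dfrac{\sum d^2 - \frac{(\sum d)^2}{n}}{n - 1}}$

DETERMINING THE SAMPLE SIZE
Parameter: Mean, $\mu$
    $n = \dfrac{z_{\alpha/2}^2 \sigma^2}{E^2}$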
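As an illustration of how two of the Appendix 1 formulas are applied, the following minimal Python sketch evaluates the sample-size formula and the proportion confidence interval. The function names are hypothetical, the value 1.96 for z at 95% confidence is taken from standard normal tables, and the inputs (standard deviation 15, margin 5, observed proportion .56 with N = 100) are borrowed from Question 5 b) and Table 4 purely as example values.

```python
import math

# Assumption: z_{alpha/2} = 1.96 for 95% confidence (standard normal table value).
Z_95 = 1.96


def sample_size_for_mean(sigma, margin, z=Z_95):
    """n = z^2 * sigma^2 / E^2, rounded up to a whole observation."""
    return math.ceil((z * sigma / margin) ** 2)


def proportion_ci(p_hat, n, z=Z_95):
    """p +/- z * sqrt(p(1 - p) / n), returned as (lower, upper)."""
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Example inputs taken from the question paper.
n_needed = sample_size_for_mean(sigma=15, margin=5)
lower, upper = proportion_ci(p_hat=0.56, n=100)

print("required sample size:", n_needed)
print("95% CI for the proportion:", round(lower, 4), round(upper, 4))
```

Rounding the sample size up rather than to the nearest integer keeps the margin of error at or below the requested value.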
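APPENDIX 2
HYPOTHESIS TESTING
Null hypothesis, followed by the test statistic:

$H_0: \mu = \mu_0$, for large samples:
    $z = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$  or  $z = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$

$H_0: \mu = \mu_0$, for small samples:
    $t = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$,  $df = n - 1$

$H_0: \pi = \pi_0$:
    $z = \dfrac{p - \pi_0}{\sqrt{\dfrac{\pi_0 (1 - \pi_0)}{n}}}$

$H_0: \mu_1 - \mu_2 = 0$, for large and independent samples:
    $z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$  or  $z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$

$H_0: \mu_1 - \mu_2 = 0$, for small and independent samples (equal $\sigma^2$):
    $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$,  $df = n_1 + n_2 - 2$

$H_0: \mu_d = 0$:
    $t = \dfrac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$
    $df = n - 1$, where $n$ = no. of pairs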
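The small-sample and paired-sample statistics above share the same form: an estimate minus its hypothesized value, divided by a standard error. The minimal Python sketch below (the helper name `t_statistic` is hypothetical) evaluates that form using the summary figures printed in Tables 2 and 8, as a check that the formulas agree with the SPSS output to the precision shown there.

```python
import math


def t_statistic(sample_mean, hypothesized_mean, sample_sd, n):
    """t = (estimate - hypothesized value) / (s / sqrt(n)), with df = n - 1."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


# One-sample case: summary values from Table 2, test value 700.
t_bills = t_statistic(sample_mean=705.8, hypothesized_mean=700,
                      sample_sd=114.56949, n=20)

# Paired case: mean and standard deviation of the differences from Table 8,
# hypothesized mean difference 0, n = number of pairs.
t_sales = t_statistic(sample_mean=-4.94, hypothesized_mean=0,
                      sample_sd=3.90103, n=5)

print("one-sample t:", round(t_bills, 3))
print("paired t:", round(t_sales, 2))
```

The p-values are not computed here; they would come from the t distribution with the stated degrees of freedom, for example from the statistical tables supplied with the paper.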
APPENDIX 3
SIMPLE LINEAR REGRESSION
Sum of squares of xy, xx, and yy:
    $SS_{xy} = \sum xy - \dfrac{(\sum x)(\sum y)}{n}$
    $SS_{xx} = \sum x^2 - \dfrac{(\sum x)^2}{n}$  and  $SS_{yy} = \sum y^2 - \dfrac{(\sum y)^2}{n}$
Least squares estimates of A and B:
    $b = \dfrac{SS_{xy}}{SS_{xx}}$  and  $a = \bar{y} - b\bar{x}$
Total sum of squares: $SST = \sum y^2 - \dfrac{(\sum y)^2}{n}$
Regression sum of squares: $SSR = SST - SSE$
Coefficient of determination: $r^2 = \dfrac{b\, SS_{xy}}{SS_{yy}}$
Linear correlation coefficient: $r = \dfrac{SS_{xy}}{\sqrt{SS_{xx}\, SS_{yy}}}$
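To show how the sums of squares lead to the slope, intercept and correlation coefficient, here is a minimal Python sketch that applies the regression formulas to the age and time data of Table 14. The variable names are illustrative assumptions; the results can be compared with the values printed in Tables 15 and 16.

```python
import math

# Data from Table 14: age of chef (years) and preparation time (minutes).
ages = [23, 45, 34, 50, 44, 29, 36, 52]
times = [63, 52, 55, 54, 50, 60, 57, 50]
n = len(ages)

sum_x, sum_y = sum(ages), sum(times)

# Sums of squares: SS_xy, SS_xx and SS_yy.
ss_xy = sum(x * y for x, y in zip(ages, times)) - sum_x * sum_y / n
ss_xx = sum(x * x for x in ages) - sum_x ** 2 / n
ss_yy = sum(y * y for y in times) - sum_y ** 2 / n

# Least squares estimates: b is the slope, a is the y-intercept.
b = ss_xy / ss_xx
a = sum_y / n - b * sum_x / n

# Linear correlation coefficient and coefficient of determination.
r = ss_xy / math.sqrt(ss_xx * ss_yy)
r_squared = b * ss_xy / ss_yy

print("slope b:", round(b, 3))
print("intercept a:", round(a, 3))
print("r:", round(r, 3), "r^2:", round(r_squared, 3))
```

Working from the raw sums rather than from deviations about the means mirrors the computational form of the formulas and is convenient for hand calculation.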
APPENDIX 4
ANALYSIS OF VARIANCE FOR A COMPLETELY RANDOMIZED DESIGN
Let:
$k$ = the number of different samples (or treatments)
$n_i$ = the size of sample $i$
$T_i$ = the sum of the values in sample $i$
$n$ = the number of values in all samples
    $= n_1 + n_2 + n_3 + \ldots$
$\sum x$ = the sum of the values in all samples
    $= T_1 + T_2 + T_3 + \ldots$
$\sum x^2$ = the sum of the squares of values in all samples
Degrees of freedom for the numerator = $k - 1$
Degrees of freedom for the denominator = $n - k$
Total sum of squares: $SST = \sum x^2 - \dfrac{(\sum x)^2}{n}$
Between-samples sum of squares:
    $SSB = \left( \dfrac{T_1^2}{n_1} + \dfrac{T_2^2}{n_2} + \dfrac{T_3^2}{n_3} + \ldots \right) - \dfrac{(\sum x)^2}{n}$
Within-samples sum of squares: $SSW = SST - SSB$
Variance between samples: $MSB = \dfrac{SSB}{k - 1}$
Variance within samples: $MSW = \dfrac{SSW}{n - k}$
Test statistic for a one-way ANOVA test: $F = \dfrac{MSB}{MSW}$
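The one-way ANOVA quantities can be built up step by step from the group totals. The minimal Python sketch below does so for the three classes of Table 9; the dictionary layout and variable names are illustrative assumptions, and the two sums of squares it produces can be compared with the between-groups and total figures printed in Table 10.

```python
# Marks from Table 9, one list per teaching method.
marks = {
    "Method I": [60, 65, 55, 50, 58, 62, 68, 70, 52, 62],
    "Method II": [70, 72, 85, 84, 82, 78, 88, 74, 80, 76],
    "Method III": [80, 82, 80, 90, 92, 98, 95, 90, 95, 90],
}

k = len(marks)                                   # number of samples
n = sum(len(group) for group in marks.values())  # values in all samples

sum_x = sum(sum(group) for group in marks.values())
sum_x_squared = sum(x * x for group in marks.values() for x in group)
correction = sum_x ** 2 / n                      # (sum of x)^2 / n

# Sums of squares: total, between samples, within samples.
sst = sum_x_squared - correction
ssb = sum(sum(group) ** 2 / len(group) for group in marks.values()) - correction
ssw = sst - ssb

# Mean squares and the F statistic.
msb = ssb / (k - 1)
msw = ssw / (n - k)
f_statistic = msb / msw

print("SSB:", round(ssb, 3), "SSW:", round(ssw, 3), "SST:", round(sst, 3))
print("df:", k - 1, n - k)
print("MSB:", round(msb, 3), "MSW:", round(msw, 3), "F:", round(f_statistic, 2))
```

The decision itself still requires comparing F with the critical value of the F distribution at the chosen significance level, which this sketch does not look up.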