This document provides information about various statistical analysis techniques used in biology, including definitions of median, mode, mean, range, and standard deviation. It discusses how to calculate standard deviation using a graphing calculator. It also covers comparing data sets, significant vs. non-significant differences, using t-tests to evaluate differences between populations, and types of correlations between variables.
2. Median, mode, mean, range & standard deviation
•The median is the middle value when the values are placed in order of size.
•The mode is the most commonly occurring value, i.e. the value that shows the greatest frequency.
•The mean is the sum of the values divided by the number of values.
•The range is the difference between the minimum and the maximum value.
•The standard deviation is a measure of the spread of values around the mean.
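As a rough illustration of the first four definitions, the sketch below applies them to the Player B scores that appear later in these slides. It uses Python's standard `statistics` module; the variable names are illustrative and not part of the slides. The standard deviation is left to the GDC walkthrough.

```python
from statistics import mean, median, multimode

# Player B scores, copied from the GDC exercise later in the slides.
scores = [7, 9, 12, 31, 22, 22, 9]

middle = median(scores)                   # middle value once sorted: 7 9 9 12 22 22 31
most_common = multimode(scores)           # every value tied for the greatest frequency
average = mean(scores)                    # sum of the values / number of values
value_range = max(scores) - min(scores)   # maximum minus minimum

print("median:", middle)
print("mode(s):", most_common)
print("mean:", average)
print("range:", value_range)
```

This data set has two modes (9 and 22 each occur twice), which is why `multimode` is used rather than `mode`.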
3. The range or SD of data is shown as error bars on a graphical presentation
4. Calculating stats using the GDC
Player A
x: 12, 16, 10, 20, 22, 17, 15
Finding standard deviation with a GDC:
In the STAT mode enter the values into a list, in this case List1.
(CALC) (SET) Ensure XList: List1 and 1Var Freq: 1
(1Var)
x̄ : 16
∑x : 112
∑x² : 1898
xσn : 3.89
xσn−1 : 4.20
n : 7
Either of these standard deviation values will be accepted.
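The GDC output for Player A can be checked by hand. The following is a minimal sketch, assuming xσn is the population standard deviation (divide by n) and xσn−1 is the sample standard deviation (divide by n − 1); the variable names are illustrative.

```python
from math import sqrt

# Player A scores as entered into List1.
player_a = [12, 16, 10, 20, 22, 17, 15]

n = len(player_a)
sum_x = sum(player_a)                    # the GDC's sum of x
sum_x2 = sum(x * x for x in player_a)    # the GDC's sum of x squared
mean_x = sum_x / n

# Sum of squared deviations from the mean, via the shortcut formula.
ss = sum_x2 - n * mean_x ** 2

sd_n = sqrt(ss / n)          # x sigma n   (population form)
sd_n1 = sqrt(ss / (n - 1))   # x sigma n-1 (sample form)

print(mean_x, sum_x, sum_x2, round(sd_n, 2), round(sd_n1, 2), n)
```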
Player B
x: 7, 9, 12, 31, 22, 22, 9
Use a GDC to find the standard deviation for player B.
xσn : 8.38
6. Two data sets can have the same mean value but different SDs
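The two players' scores from the GDC slides illustrate this point. The sketch below is a quick check using Python's standard `statistics` module, with `pstdev` standing in for the GDC's xσn value (an assumption about which form is wanted).

```python
from statistics import mean, pstdev

# Scores from the Player A and Player B GDC slides.
player_a = [12, 16, 10, 20, 22, 17, 15]
player_b = [7, 9, 12, 31, 22, 22, 9]

mean_a, mean_b = mean(player_a), mean(player_b)
sd_a, sd_b = pstdev(player_a), pstdev(player_b)

print("means:", mean_a, mean_b)              # identical means
print("SDs:", round(sd_a, 2), round(sd_b, 2))  # very different spreads
```

Both players average the same score, but Player B's scores are spread more than twice as widely around that mean.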
7. Comparing two sets of data: when is a difference significant and when is it not?
•A difference is NOT regarded as significant when any differences are due to chance variation.
•In statistics the assumption is initially made that any differences are due to chance. This is called the null hypothesis.
•Where the null hypothesis is rejected, a difference is regarded as significant, i.e. the differences are not just due to chance but to an actual factor causing the difference.
8. A simple rule to evaluate the significance of the difference between two data sets:
A significant difference is unlikely if the standard deviations are greater than the difference between the means (left diagram) BUT likely if the standard deviations are smaller than the difference between the means (right diagram).
9. Example
In a study of heights, two separate human populations were sampled:
•Population A had a mean height of 1.65 m and population B a mean height of 1.72 m.
•The SD of population A was 0.09 m and the SD of population B was 0.10 m.
•Evaluate the data to assess if there is likely to be a significant difference between the heights of the two populations.
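One way to work through this example is to apply the simple rule from the previous slide directly. The sketch below is only an illustration: the function name is hypothetical, and treating the mixed case (one SD larger and one smaller than the difference) as "not likely" is an assumption, since the slides do not cover it.

```python
def difference_likely_significant(mean_a, sd_a, mean_b, sd_b):
    """Rule of thumb: a significant difference is likely only when both
    standard deviations are smaller than the difference between the means.
    (Hypothetical helper; the mixed case is treated as 'not likely'.)"""
    difference = abs(mean_a - mean_b)
    return sd_a < difference and sd_b < difference

# Heights in metres from the example: A = 1.65 (SD 0.09), B = 1.72 (SD 0.10).
likely = difference_likely_significant(1.65, 0.09, 1.72, 0.10)
print(likely)
```

The means differ by about 0.07 m, which is smaller than both standard deviations, so the rule suggests a significant difference is unlikely.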
10. Student’s t-test
•A statistical test to find more reliably if there is a significant difference between two sets of normally distributed data with ten or more values.
•You are not expected to calculate the value of t, but the calculation uses the difference between the means and the size of the standard deviations.
•The test requires the calculation of degrees of freedom, d.f. = n1+n2-2, where n1 and n2 are the numbers of values in the two data sets.
•A table of values is used to find the level of significance using the t value and degrees of freedom (use the one tailed test).
•If the level of significance is 5% or below, reject the null hypothesis; if it is above 5%, the null hypothesis is not rejected, i.e. the differences are attributed to chance and there is no significant difference.
11. Example: a study to compare shell diameters in two different populations of Periwinkle (a marine mollusc). Use the data below to carry out a t-test to determine the level of significance.
Population A: n (number in sample) = 15; mean shell diameter = 1.35 mm; standard deviation = 0.15 mm
Population B: n (number in sample) = 12; mean shell diameter = 1.55 mm; standard deviation = 0.24 mm
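Although the slides do not require calculating t by hand, the example can be sketched from the summary statistics alone. The code below assumes the pooled-variance (equal variance) form of Student's t-test and that the quoted standard deviations are sample (n − 1) values; neither is stated in the slides. The function name is hypothetical, and the critical value 1.708 is the standard t-table entry for 25 degrees of freedom at the 5% level, one tailed.

```python
from math import sqrt

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Student's t from summary statistics, pooled-variance form.
    (Hypothetical helper; assumes sample SDs and equal variances.)"""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return abs(mean1 - mean2) / standard_error, df

# Periwinkle shell diameters (mm): population A, then population B.
t, df = pooled_t(15, 1.35, 0.15, 12, 1.55, 0.24)

# Standard t-table value for d.f. = 25 at the 5% level, one tailed.
T_CRITICAL = 1.708

print("t =", round(t, 2), "d.f. =", df)
print("reject null hypothesis:", t > T_CRITICAL)
```

Under these assumptions t is about 2.65 with 25 degrees of freedom, which exceeds the critical value, so the null hypothesis would be rejected at the 5% level. The same t also exceeds the two tailed 5% table value of 2.060.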
12. Types of correlation (note the out-liers)
[Figure: three scatter plots of variable 1 and variable 2, titled No Correlation, Negative Correlation and Positive Correlation]
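The three patterns can also be told apart numerically. The sketch below uses Pearson's correlation coefficient r, which the slides do not mention; it is brought in here only as one common way to quantify the direction of a linear relationship. All data values are made up for illustration and the function name is hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up illustrative data, not taken from the slides.
variable_1 = [1, 2, 3, 4, 5, 6]
rising = [2, 4, 5, 7, 9, 12]     # tends to increase with variable 1
falling = [12, 9, 7, 5, 4, 2]    # tends to decrease with variable 1
scattered = [5, 9, 4, 4, 9, 5]   # no linear trend

r_positive = pearson_r(variable_1, rising)
r_negative = pearson_r(variable_1, falling)
r_none = pearson_r(variable_1, scattered)

print(round(r_positive, 2), round(r_negative, 2), round(r_none, 2))
```

Values of r near +1 indicate a positive correlation, values near −1 a negative correlation, and values near 0 no linear correlation.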
13. Correlations and causal relationships
•A causal relationship is one in which one factor/variable affects another, e.g. the extension of a spring depends on the force applied; an increase in air humidity causes the transpiration rate to fall.
•A positive or negative correlation may suggest a causal relationship but does not prove it.
•Proof of a causal relationship in science often requires an experiment in which one variable is manipulated/changed (independent variable) and this is shown to affect another measured (dependent) variable.
14. Some examples
•It was found that towns with a greater number of nesting storks had more children per household than towns with fewer storks (positive correlation).
•CAN WE THEREFORE CONCLUDE THAT THE STORKS WERE DELIVERING THE BABIES?
•It has long been known that there is a positive correlation between the number of cigarettes smoked and deaths from lung cancer.
•ONLY RECENTLY HAS IT BEEN SHOWN THAT SMOKING CAUSES LUNG CANCER.
•A strong correlation exists between rise in atmospheric CO2 levels and rise in global temperature.
•SOME PEOPLE STILL DISPUTE THAT INCREASING CO2 LEVELS IS WHAT CAUSES GLOBAL WARMING.