2. Data
• Data is a collection of facts, such as values or
measurements.
OR
• Data is information that has been translated into a
form that is more convenient to move or process.
OR
• Data are any facts, numbers, or text that can be
processed by a computer.
3. Statistics
Statistics is the study of the collection, summarizing,
organization, analysis, and interpretation of data.
5. Biostatistics
Biostatistics is the application of statistical techniques to
scientific research in health-related fields, including
medicine, biology, and public health.
6. Descriptive Statistics
Descriptive statistics are statistics used to describe and
summarize the data at hand, without generalizing beyond the
group that was measured. A good example is the Census, in
which all members of a population are counted.
7. Inferential or Analytical Statistics
Inferential statistics are used to draw conclusions and make
predictions about a population based on the analysis of data
from a sample.
8. Primary & Secondary Data
• Raw or primary data: data as collected, still containing a
lot of unnecessary, irrelevant, and unwanted information
• Treated or secondary data: data from which this unnecessary,
irrelevant, and unwanted information has been removed
• Cooked data: data that were not genuinely collected and are
false or fictitious
9. Ungrouped & Grouped Data
• Ungrouped data: data presented or observed individually. For example, if we observe the
number of children in 6 families:
2, 4, 6, 4, 6, 4
• Grouped data: identical values grouped by frequency. For example, the above data on
children in 6 families can be grouped as:
No. of children    Families
2                  1
4                  3
6                  2
or alternatively we can make classes:
No. of children    Frequency
2-4                4
5-7                2
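The grouping above can be sketched in a few lines of Python. This is only an illustration: the variable names (`children`, `frequency`, `class_frequency`) are assumptions, and the class limits 2-4 and 5-7 are the ones used on the slide.

```python
from collections import Counter

# Number of children observed in 6 families (ungrouped data from the slide).
children = [2, 4, 6, 4, 6, 4]

# Grouped data: count how many families share each value.
frequency = Counter(children)

# Alternative grouping into the classes 2-4 and 5-7 (inclusive limits).
classes = [(2, 4), (5, 7)]
class_frequency = {
    f"{low}-{high}": sum(low <= x <= high for x in children)
    for low, high in classes
}

for value in sorted(frequency):
    print(value, frequency[value])
print(class_frequency)
```

The first loop prints one row per distinct value, and the dictionary holds the class frequencies.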
10. Variable
A variable is a characteristic or value that can vary from
one individual or observation to another. For example
age, height, weight, blood pressure, etc.
11. Types of Variable
• Independent variable: typically the variable representing the
value being manipulated or changed. For example, smoking
• Dependent variable: the observed result of the independent
variable being manipulated. For example, carcinoma (cancer) of the lung
• Confounding variable: a variable associated with both the exposure
and the disease. For example, age is a factor in many events
13. Quantitative or Numerical data
This data describes information that can be counted or
expressed numerically (in numbers), for example
2, 4, 6, 8.5, 10.5
14. Quantitative or Numerical data (cont.)
This data is of two types:
1. Discrete data: takes whole-number values only and has no
fractions. For example
Number of children in a family = 4
Number of patients in hospital = 320
2. Continuous data (infinite number of possible values): measured on a
continuous scale; it can take fractional values. For example
Height of a person = 5 feet 6 inches (5'6")
Temperature = 92.3 °F
15. Qualitative or Categorical data
This is non-numerical data, such as
Male/Female,
Short/Tall
It is of two types:
1. Nominal data: has a series of unordered categories
(one cannot tick more than one at a time). For example
Sex = Male/Female
Blood group = O/A/B/AB
2. Ordinal or ranked data: has distinct ordered/ranked categories.
For example
Measurement of height can be = Short / Medium / Tall
Degree of pain can be = None / Mild / Moderate / Severe
16. Stem and Leaf Plots
• A simple way to order and display a data set.
• Abbreviate the observed data to two significant digits.
Data: 0.6, 2.6, 0.1, 1.1, 0.4, 1.3, 1.5, 2.2, 2.0, 3.2
Stem | Leaf
0    | 6 1 4
1    | 1 3 5
2    | 6 2 0
3    | 2
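As a rough sketch, the stem and leaf plot for the ten values on this slide can be built as follows. The function name `stem_and_leaf` is hypothetical, and the code assumes every value has exactly one decimal place, so that the whole-number part is the stem and the first decimal is the leaf.

```python
from collections import defaultdict

# The ten observations shown on the slide.
values = [0.6, 2.6, 0.1, 1.1, 0.4, 1.3, 1.5, 2.2, 2.0, 3.2]

def stem_and_leaf(data):
    """Return {stem: [leaves]} for values with one decimal place."""
    plot = defaultdict(list)
    for x in data:
        tenths = round(x * 10)            # e.g. 2.6 -> 26, avoids float noise
        stem, leaf = divmod(tenths, 10)   # 26 -> stem 2, leaf 6
        plot[stem].append(leaf)
    return dict(plot)

plot = stem_and_leaf(values)
for stem in sorted(plot):
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))
```

Leaves are kept in the order the values were observed; sorting each leaf list would give an ordered plot.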
18. Measures of Central Tendency
Measures of central tendency are quantitative indices that
describe the center of a distribution of data. These are
• Mean
• Median
• Mode
(the three Ms)
19. Mean
The mean, or arithmetic mean, is also called the AVERAGE and is calculated
only for numerical data. For example
• What is the average age of the children in years?
Children   1  2  3  4  5  6  7
Age        6  4  4  3  2  4  5
Formula: $\bar{X} = \frac{\sum X}{n}$
Mean = (6 + 4 + 4 + 3 + 2 + 4 + 5)/7 = 28/7 = 4 years
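The same calculation can be checked in Python. This is a minimal sketch: the list `ages` is assumed to hold the seven ages that sum to 28 in the worked example, and `statistics.mean` from the standard library is used only as a cross-check.

```python
from statistics import mean

# Ages (in years) of the seven children in the worked example.
ages = [6, 4, 4, 3, 2, 4, 5]

# Mean = sum of all values divided by the number of values.
manual_mean = sum(ages) / len(ages)

# Cross-check against the standard library.
library_mean = mean(ages)

print(manual_mean)
```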
20. Median
• The median is the central-most value. For example, what is the central
value in the data 2, 3, 4, 4, 4, 5, 6?
• If we divide the sorted data into two equal groups, 2, 3, 4 | 4 | 4, 5, 6,
then 4 is the central-most value
• The position of the median in the sorted data is given by:
Median position = $\frac{n + 1}{2}$ (here n is the total number of values)
Median position = (7 + 1)/2 = 8/2 = 4, so the median is the 4th value, which is 4
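A short sketch of the position rule, assuming an odd number of values as in the example (for an even n the median is usually taken as the average of the two middle values, which `statistics.median` handles). The variable names are illustrative.

```python
from statistics import median

# Sort the data first; the median is defined on ordered values.
ages = sorted([6, 4, 4, 3, 2, 4, 5])   # -> [2, 3, 4, 4, 4, 5, 6]

n = len(ages)
position = (n + 1) // 2                # 1-based position of the middle value (odd n)
median_value = ages[position - 1]      # convert to 0-based index

print(position, median_value, median(ages))
```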
21. Mode
• The mode is the most frequently occurring (repeated) value in a set
of observations. Examples:
• No mode
Raw data: 10.3, 4.9, 8.9, 11.7, 6.3, 7.7
• One mode (4)
Raw data: 2, 3, 4, 4, 4, 5, 6
• More than one mode (28 and 43)
Raw data: 21, 28, 28, 41, 43, 43
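The three cases above can be reproduced with a small helper. The function name `modes` is hypothetical, and it follows the slide's convention that a data set in which no value repeats has no mode.

```python
from collections import Counter

def modes(data):
    """Return the sorted list of modes; empty if no value repeats."""
    counts = Counter(data)
    highest = max(counts.values())
    if highest == 1:          # every value occurs once: no mode
        return []
    return sorted(value for value, count in counts.items() if count == highest)

print(modes([10.3, 4.9, 8.9, 11.7, 6.3, 7.7]))
print(modes([2, 3, 4, 4, 4, 5, 6]))
print(modes([21, 28, 28, 41, 43, 43]))
```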
22. Comparison of the Mode, the Median, and the Mean
• In a normal distribution, the mode, the median, and the
mean have the same value.
• The mean is the most widely reported index of central
tendency for variables measured on an interval or ratio
scale.
• The mean takes each and every score into account.
• It is also the most stable index of central tendency and thus
yields the most reliable estimate of the central tendency
of the population.
23. Histogram/Bar Chart
• Histograms and box plots are used for continuous or
scale variables such as temperature and bone density.
• Bar charts and pie charts are used for categorical or
nominal variables such as gender and name.
24. Measures of Dispersion
Measures of dispersion are quantitative indices that describe the spread of a data set.
These are
• Range
• Mean deviation
• Variance
• Standard deviation
• Coefficient of variation
• Percentile
25. Range
The range is the difference between the highest and lowest
values in a data series. For example:
the ages (in years) of 10 children are
2, 6, 8, 10, 11, 14, 1, 6, 9, 15
here the range of age is 15 - 1 = 14 years
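In code, the range is simply the maximum minus the minimum. A minimal sketch using the ten ages above (the variable names are illustrative):

```python
# Ages (in years) of the 10 children from the example.
ages = [2, 6, 8, 10, 11, 14, 1, 6, 9, 15]

# Range = highest value - lowest value.
age_range = max(ages) - min(ages)

print(age_range)
```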
26. Mean Deviation
This is the average deviation of all observations from the mean.
$\text{Mean Deviation} = \frac{\sum |X - \bar{X}|}{n}$
here X = value, $\bar{X}$ = mean,
n = total number of values
27. Mean Deviation Example
A student took 5 exams in a class and had scores of
92, 75, 95, 90, and 98. Find the mean deviation for her test scores.
• First step: find the mean.
$\bar{X} = \frac{\sum X}{n} = \frac{92 + 75 + 95 + 90 + 98}{5} = \frac{450}{5} = 90$
28. Mean Deviation Example (cont.)
• Second step: find the mean deviation.
Value (X)     Mean ($\bar{X}$)   Deviation from mean ($X - \bar{X}$)   Absolute deviation (ignoring ± signs)
92            90                 2                                     2
75            90                 -15                                   15
95            90                 5                                     5
90            90                 0                                     0
98            90                 8                                     8
Total = 450   n = 5              --                                    $\sum |X - \bar{X}|$ = 30
$\text{Mean Deviation} = \frac{\sum |X - \bar{X}|}{n} = \frac{30}{5} = 6$
The average deviation from the mean is 6.
Dr. Riaz A. Bhutto, 9/3/2012
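The two steps of this example can be sketched in Python as follows. The names `scores`, `mean_score`, and `mean_deviation` are illustrative assumptions, not part of the slides.

```python
# Exam scores from the worked example.
scores = [92, 75, 95, 90, 98]

# Step 1: find the mean.
mean_score = sum(scores) / len(scores)

# Step 2: average the absolute deviations from the mean.
absolute_deviations = [abs(x - mean_score) for x in scores]
mean_deviation = sum(absolute_deviations) / len(scores)

print(mean_score, mean_deviation)
```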
29. Variance
• It is a measure of variability which takes into account
the difference between each observation and the mean.
• The sample variance is the sum of the squared deviations
from the mean divided by the number of values in
the series minus 1.
• Sample variance is s² and population variance is σ²
30. Variance (cont.)
• The variance is defined as:
the average of the squared differences from the mean.
• To calculate the variance, follow these steps:
• Work out the mean (the simple average of the numbers)
• Then for each number: subtract the mean and square the
result (the squared difference)
• Then work out the average of those squared differences
(dividing by n gives the population variance; for a sample
variance, divide by n - 1 instead).
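31. Variance Example
Household size of 5 families was recorded as follows:
2, 5, 4, 6, 3
Calculate the variance for the above data.
Step 1: list the values X. Step 2: find the mean, $\bar{X}$ = 20/5 = 4.
Step 3: find each deviation from the mean. Step 4: square each deviation.
X    $\bar{X}$   $X - \bar{X}$   $(X - \bar{X})^2$
2    4           -2              4
5    4           1               1
4    4           0               0
6    4           2               4
3    4           -1              1
Step 5: $\sum (X - \bar{X})^2$ = 10
Step 6: Variance = $\frac{\sum (X - \bar{X})^2}{n}$ = 10/5 = 2 persons²
(Dividing by n - 1 = 4 instead, as in the sample variance definition, gives s² = 10/4 = 2.5 persons².)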
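A short sketch of the same steps in Python, showing both divisors side by side. The variable names are assumptions; `statistics.pvariance` (divides by n) and `statistics.variance` (divides by n - 1) are standard-library functions used here only as a cross-check.

```python
from statistics import pvariance, variance

# Household sizes of the 5 families in the example.
sizes = [2, 5, 4, 6, 3]

n = len(sizes)
mean_size = sum(sizes) / n                          # Step 2: mean
squared = [(x - mean_size) ** 2 for x in sizes]     # Steps 3-4: squared deviations
sum_squared = sum(squared)                          # Step 5: sum of squares

variance_n = sum_squared / n                        # divide by n (as in the example)
variance_n_minus_1 = sum_squared / (n - 1)          # divide by n - 1 (sample variance)

print(variance_n, variance_n_minus_1)
print(pvariance(sizes), variance(sizes))
```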
32. Standard Deviation
• The standard deviation (SD) is a measure of how spread out numbers are.
• Its symbol is σ (the Greek letter sigma) for a population and s for a sample.
• The formula is easy: it is the square root of the variance, i.e.
$s = \sqrt{s^2}$
• SD is the most useful measure of dispersion.
Population (if n > 30): $\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$
Sample (if n < 30): $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
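Both formulas can be sketched in Python. As an assumption for illustration, the household sizes 2, 5, 4, 6, 3 are used as data; `statistics.pstdev` and `statistics.stdev` are the standard-library equivalents of the n and n - 1 formulas.

```python
import math
from statistics import pstdev, stdev

# Illustrative data (assumed): household sizes of 5 families.
sizes = [2, 5, 4, 6, 3]

n = len(sizes)
mean_size = sum(sizes) / n
sum_squared = sum((x - mean_size) ** 2 for x in sizes)

# SD is the square root of the variance.
sd_population = math.sqrt(sum_squared / n)        # divisor n
sd_sample = math.sqrt(sum_squared / (n - 1))      # divisor n - 1

print(round(sd_population, 3), round(sd_sample, 3))
```

With only 5 values the two results differ noticeably; the gap shrinks as n grows.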
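33. Standard Deviation and Standard Error
• SD is a measure of the variability of the individual
observations; the sample SD is the sample estimate of the
population standard deviation.
• SE is a measure of the precision of an estimate of a
population parameter (for example, how precisely the sample
mean estimates the population mean).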
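The slides do not give a formula for SE. As a hedged illustration, the sketch below uses the standard formula for the standard error of the mean, SE = s / √n, with assumed data (the household sizes 2, 5, 4, 6, 3).

```python
import math
from statistics import stdev

# Illustrative data (assumed): household sizes of 5 families.
sizes = [2, 5, 4, 6, 3]

n = len(sizes)
s = stdev(sizes)                  # sample SD (divisor n - 1)

# Standard error of the mean: SD divided by the square root of n.
standard_error = s / math.sqrt(n)

print(round(s, 3), round(standard_error, 3))
```

The SD describes the spread of the observations themselves, while the SE shrinks as the sample size grows.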