Data Display and
Summary
Biostatistics

By Dr Zahid Khan

  1. Data Display and Summary: Biostatistics. By Dr Zahid Khan
  2. Data
     • Data is a collection of facts, such as values or measurements.
     OR
     • Data is information that has been translated into a form that is more convenient to move or process.
     OR
     • Data are any facts, numbers, or text that can be processed by a computer.
  3. Statistics
     Statistics is the study of the collection, organization, summarization, analysis, and interpretation of data.
  4. Vital statistics
     Vital statistics is the collection, organization, summarization, analysis, presentation, and interpretation of data related to the vital events of life, such as births, deaths, marriages, divorces, health, and disease.
  5. Biostatistics
     Biostatistics is the application of statistical techniques to scientific research in health-related fields, including medicine, biology, and public health.
  6. Descriptive Statistics
     The term descriptive statistics refers to statistics that are used to describe and summarize the data that were actually measured, without generalizing beyond them. A good example of descriptive statistics is the census, in which all members of a population are counted.
  7. Inferential or Analytical Statistics
     Inferential statistics are used to draw conclusions and make predictions about a population based on the analysis of data from a sample.
  8. Primary & Secondary Data
     • Raw or primary data: data as collected, still containing a lot of unnecessary, irrelevant, and unwanted information.
     • Treated or secondary data: data from which this unnecessary, irrelevant, and unwanted information has been removed.
     • Cooked data: data that were not genuinely collected and are false or fictitious.
  9. Ungrouped & Grouped Data
     • Ungrouped data: data presented or observed individually. For example, the number of children observed in 6 families: 2, 4, 6, 4, 6, 4.
     • Grouped data: identical values grouped by frequency. For example, the data above on children in 6 families can be grouped as:

       No. of children   Families
       2                 1
       4                 3
       6                 2

       or, alternatively, we can make classes:

       No. of children   Frequency
       2-4               4
       5-7               2
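The grouping on this slide can be reproduced with a short Python sketch. It uses the six family sizes and the class limits 2-4 and 5-7 from the slide; the variable names are illustrative only.

```python
from collections import Counter

children = [2, 4, 6, 4, 6, 4]  # ungrouped data from the slide

# Group identical values by frequency
frequency = Counter(children)
for value in sorted(frequency):
    print(value, frequency[value])

# Group into the classes 2-4 and 5-7 (class limits taken from the slide)
classes = [(2, 4), (5, 7)]
class_counts = {
    f"{low}-{high}": sum(low <= x <= high for x in children)
    for low, high in classes
}
print(class_counts)
```

The class limits are inclusive at both ends, which works here because the number of children is a whole number.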
  10. Variable
     A variable is a characteristic or value that can change from one individual or observation to another, for example age, height, weight, or blood pressure.
  11. Types of Variable
     • Independent variable: typically the variable representing the value being manipulated or changed. For example, smoking.
     • Dependent variable: the observed result of the independent variable being manipulated. For example, cancer of the lung.
     • Confounding variable: a variable associated with both the exposure and the disease. For example, age is a factor in many events.
  12. Categories of Data
  13. Quantitative or Numerical data
     This data describes information that can be counted or expressed numerically, for example 2, 4, 6, 8.5, 10.5.
  14. Quantitative or Numerical data (cont.)
     This data is of two types:
     1. Discrete data: takes whole-number values only and has no fractions. For example:
        Number of children in a family = 4
        Number of patients in hospital = 320
     2. Continuous data (infinitely many possible values): measured on a continuous scale and can take fractional values. For example:
        Height of a person = 5 feet 6 inches (5′ 6″)
        Temperature = 92.3 °F
  15. Qualitative or Categorical data
     This is non-numerical data, such as Male/Female or Short/Tall. It is of two types:
     1. Nominal data: has a series of unordered categories (one cannot tick more than one at a time). For example:
        Sex = Male / Female
        Blood group = O / A / B / AB
     2. Ordinal or ranked data: has distinct ordered (ranked) categories. For example:
        Measurement of height can be = Short / Medium / Tall
        Degree of pain can be = None / Mild / Moderate / Severe
  16. Stem and Leaf Plots
     • Simple way to order and display a data set.
     • Abbreviate the observed data into two significant digits.

     Data: 0.6, 2.6, 0.1, 1.1, 0.4, 1.3, 1.5, 2.2, 2.0, 3.2

       Stem | Leaf
       0    | 6 1 4
       1    | 1 3 5
       2    | 6 2 0
       3    | 2
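A minimal Python sketch of how such a plot can be built from the ten values on this slide. It assumes non-negative values with one decimal place, takes the units digit as the stem and the tenths digit as the leaf, and sorts the leaves within each stem; the function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

data = [0.6, 2.6, 0.1, 1.1, 0.4, 1.3, 1.5, 2.2, 2.0, 3.2]

def stem_and_leaf(values):
    """Map each stem (units digit) to its sorted leaves (tenths digit)."""
    plot = defaultdict(list)
    for v in values:
        tenths = round(v * 10)           # e.g. 2.6 -> 26; rounding avoids float error
        stem, leaf = divmod(tenths, 10)  # 26 -> stem 2, leaf 6
        plot[stem].append(leaf)
    return {stem: sorted(leaves) for stem, leaves in sorted(plot.items())}

plot = stem_and_leaf(data)
for stem, leaves in plot.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Sorting the leaves is a presentation choice; listing them in order of appearance gives the same plot with the leaves unordered.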
  17. Measures of Central Tendency & Variation (Dispersion)
  18. Measures of Central Tendency
     These are quantitative indices that describe the center of a distribution of data. They are:
     • Mean
     • Median
     • Mode
     (the three Ms)
  19. Mean
     The mean, or arithmetic mean, is also called the average and is calculated only for numerical data. For example, what is the average age of the children, in years?

       Child         1  2  3  4  5  6  7
       Age (years)   6  4  4  3  2  4  5

     Formula: $\bar{X} = \frac{\sum X}{n}$
     Mean $= \frac{6 + 4 + 4 + 3 + 2 + 4 + 5}{7} = \frac{28}{7} = 4$ years
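As a quick check of the arithmetic, the mean can be computed in Python from the seven ages summed on this slide (6, 4, 4, 3, 2, 4, 5); the variable names are illustrative.

```python
ages = [6, 4, 4, 3, 2, 4, 5]  # ages in years, as summed on the slide

# Mean = sum of the values divided by the number of values
mean_age = sum(ages) / len(ages)
print(mean_age)
```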
  20. Median
     • It is the central-most value of the ordered data. For example, what is the central value in the data 2, 3, 4, 4, 4, 5, 6?
     • If we divide the data into two equal groups, 2, 3, 4 | 4 | 4, 5, 6, then 4 is the central-most value.
     • The position of the median is $\frac{n + 1}{2}$, where $n$ is the total number of values. Here $\frac{7 + 1}{2} = \frac{8}{2} = 4$, so the median is the 4th value in the ordered data, which is 4.
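The median rule can be sketched in Python as follows. The slide only covers an odd number of values; averaging the two middle values when the count is even is the usual convention and is an assumption added here. The function name `median` is illustrative.

```python
def median(values):
    """Return the middle value of the sorted data."""
    ordered = sorted(values)      # the data must be ordered first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]       # odd n: the ((n + 1) / 2)-th value
    # even n (not covered on the slide): average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([2, 3, 4, 4, 4, 5, 6]))
```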
  21. Mode
     • The mode is the most frequently occurring (repeated) value in a set of observations. Examples:
     • No mode. Raw data: 10.3, 4.9, 8.9, 11.7, 6.3, 7.7
     • One mode. Raw data: 2, 3, 4, 4, 4, 5, 6
     • More than one mode. Raw data: 21, 28, 28, 41, 43, 43
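The three cases on this slide can be reproduced with a small Python sketch. It assumes a non-empty data set and follows the slide's convention that data in which no value repeats has no mode; the function name `modes` is hypothetical.

```python
from collections import Counter

def modes(values):
    """Return all most-frequent values, or [] if no value repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []                 # every value occurs once: no mode
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([10.3, 4.9, 8.9, 11.7, 6.3, 7.7]))
print(modes([2, 3, 4, 4, 4, 5, 6]))
print(modes([21, 28, 28, 41, 43, 43]))
```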
  22. Comparison of the Mode, the Median, and the Mean
     • In a normal distribution, the mode, the median, and the mean have the same value.
     • The mean is the most widely reported index of central tendency for variables measured on an interval or ratio scale.
     • The mean takes each and every score into account.
     • It is also the most stable index of central tendency and thus yields the most reliable estimate of the central tendency of the population.
  23. Histogram/Bar Chart
     • Histograms and box plots are used for continuous or scale variables such as temperature and bone density.
     • Bar charts and pie charts are used for categorical or nominal variables such as gender and name.
  24. Measures of Dispersion
     These are quantitative indices that describe the spread of a data set. They are:
     • Range
     • Mean deviation
     • Variance
     • Standard deviation
     • Coefficient of variation
     • Percentile
  25. Range
     It is the difference between the highest and lowest values in a data series. For example, the ages (in years) of 10 children are 2, 6, 8, 10, 11, 14, 1, 6, 9, 15. Here the range of age is 15 − 1 = 14 years.
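In Python, the range of the ten ages on this slide is a one-line computation; the variable names are illustrative.

```python
ages = [2, 6, 8, 10, 11, 14, 1, 6, 9, 15]  # ages in years, from the slide

# Range = highest value minus lowest value
age_range = max(ages) - min(ages)
print(age_range)
```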
  26. Mean Deviation
     This is the average absolute deviation of all observations from the mean.
     $\text{Mean Deviation} = \frac{\sum |X - \bar{X}|}{n}$
     where $X$ = value, $\bar{X}$ = mean, and $n$ = total number of values.
  27. Mean Deviation Example
     A student took 5 exams in a class and had scores of 92, 75, 95, 90, and 98. Find the mean deviation for her test scores.
     • First step: find the mean.
       $\bar{X} = \frac{\sum X}{n} = \frac{92 + 75 + 95 + 90 + 98}{5} = \frac{450}{5} = 90$
  28. • Second step: find the mean deviation.

       Value (X)   Mean   Deviation from mean   Absolute deviation (ignoring ± signs)
       92          90       2                    2
       75          90     -15                   15
       95          90       5                    5
       90          90       0                    0
       98          90       8                    8
       Total = 450                              $\sum |X - \bar{X}| = 30$

     $n = 5$
     $\text{Mean Deviation} = \frac{\sum |X - \bar{X}|}{n} = \frac{30}{5} = 6$
     The average deviation from the mean is 6.
     Dr. Riaz A. Bhutto, 9/3/2012
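The two steps of this example can be checked with a short Python sketch using the five exam scores; the variable names are illustrative.

```python
scores = [92, 75, 95, 90, 98]  # exam scores from the example

# Step 1: the mean
mean_score = sum(scores) / len(scores)

# Step 2: absolute deviation of each score from the mean, then their average
abs_deviations = [abs(x - mean_score) for x in scores]
mean_deviation = sum(abs_deviations) / len(scores)

print(mean_score, abs_deviations, mean_deviation)
```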
  29. Variance
     • It is a measure of variability that takes into account the difference between each observation and the mean.
     • The sample variance is the sum of the squared deviations from the mean divided by the number of values in the series minus 1.
     • The sample variance is denoted s² and the population variance σ².
  30. Variance (cont.)
     The variance is defined as the average of the squared differences from the mean. To calculate the variance, follow these steps:
     • Work out the mean (the simple average of the numbers).
     • Then, for each number, subtract the mean and square the result (the squared difference).
     • Then work out the average of those squared differences. Dividing by n gives the population variance σ²; dividing by n − 1 gives the sample variance s².
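  31. Example: the household size of 5 families was recorded as follows: 2, 5, 4, 6, 3. Calculate the variance for these data.

       Step 1      Step 2   Step 3                Step 4
       Value (X)   Mean     Deviation from mean   Squared deviation
       2           4        -2                    4
       5           4         1                    1
       4           4         0                    0
       6           4         2                    4
       3           4        -1                    1

     Step 5: $\sum (X - \bar{X})^2 = 10$
     Step 6: $\text{Variance} = \frac{\sum (X - \bar{X})^2}{n} = \frac{10}{5} = 2$ persons²
     This divides by $n$, following the definition of the variance as the average of the squared differences. Dividing by $n - 1$ instead gives the sample variance $s^2 = \frac{10}{4} = 2.5$ persons².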
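The steps of this example can be followed in Python. The sketch below uses the five household sizes and computes the variance with both divisors, n and n − 1, so the two conventions can be compared; the cross-check uses the standard library's `statistics.pvariance` and `statistics.variance`. The variable names are illustrative.

```python
import statistics

sizes = [2, 5, 4, 6, 3]  # household sizes from the example

mean_size = sum(sizes) / len(sizes)                  # steps 1-2
squared_devs = [(x - mean_size) ** 2 for x in sizes] # steps 3-4
total = sum(squared_devs)                            # step 5

population_variance = total / len(sizes)        # divide by n
sample_variance = total / (len(sizes) - 1)      # divide by n - 1

print(population_variance, sample_variance)

# Cross-check against the standard library
print(statistics.pvariance(sizes), statistics.variance(sizes))
```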
  32. Standard Deviation
     • The standard deviation (SD) is a measure of how spread out numbers are.
     • Its symbol is σ (the Greek letter sigma) for a population and s for a sample.
     • The formula is easy: it is the square root of the variance, i.e. $s = \sqrt{s^2}$.
     • SD is the most useful measure of dispersion.

     Population (if n > 30): $\sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{n}}$
     Sample (if n < 30): $s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$
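A small Python sketch of the two formulas, reusing the five household sizes (2, 5, 4, 6, 3) from the variance example as illustrative data. It relies on the standard library functions `statistics.pstdev` (divisor n) and `statistics.stdev` (divisor n − 1).

```python
import math
import statistics

sizes = [2, 5, 4, 6, 3]  # household sizes from the variance example

# Population SD: square root of the variance with divisor n
population_sd = statistics.pstdev(sizes)

# Sample SD: square root of the variance with divisor n - 1
sample_sd = statistics.stdev(sizes)

print(round(population_sd, 3), round(sample_sd, 3))

# Both are simply the square roots of the corresponding variances
assert math.isclose(population_sd, math.sqrt(statistics.pvariance(sizes)))
assert math.isclose(sample_sd, math.sqrt(statistics.variance(sizes)))
```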
  33. Standard Deviation and Standard Error
     • SD is an estimate of the variability of the observations; the sample SD is the sample estimate of the corresponding population parameter.
     • SE is a measure of the precision of an estimate of a population parameter.
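To make the distinction concrete, the sketch below computes both quantities for the five household sizes used in the variance example. The slide gives no formula for SE; the code assumes the standard formula for the standard error of the mean, SE = s / √n, where s is the sample SD.

```python
import math
import statistics

sizes = [2, 5, 4, 6, 3]  # household sizes from the variance example

n = len(sizes)
sample_sd = statistics.stdev(sizes)      # spread of the observations
standard_error = sample_sd / math.sqrt(n)  # precision of the sample mean (assumed formula)

print(round(sample_sd, 3), round(standard_error, 3))
```

The SD describes how much individual observations vary, while the SE shrinks as the sample size grows because the mean is estimated more precisely.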
