Chapter two
DATA ORGANIZATION AND
PRESENTATION
Mengistu Y. (BSC, MPH-HI)
2017
4/17/2023
Learning objectives
At the end of this section students are expected to:
• understand the nature of data
• organize and present data according to the need of
the activity
• present data in table and graphical ways for
information use.
Data organization and presentation
• Statistics is used to organize and interpret research
observations and findings.
• Before interpretation and communication of the findings, the raw data must be organized and presented in a clear and understandable way.
• This requires techniques that organize and summarize a set of data in a concise way:
– Organization of data
– Summarization of data
– Presentation of data
Cont...
• Numbers that have not been summarized and organized are called raw data.
• Descriptive statistics include tables, graphical/chart displays and the calculation of summary measures such as means and proportions.
• The methods of describing variables differ depending on the type of data (numerical or categorical).
Organizing data
Categorical data
• Table of frequency
distributions
– Frequency
– Relative frequency
– Cumulative frequencies
• Graphs
– Bar charts
– Pie charts
Continuous or discrete data
• Frequency distribution
• Summary measures
• Graphs
– Histograms
– Frequency polygons
– Cumulative frequency polygons
– Stem-and-leaf plots
– Box-and-whisker plots
– Scatter plots
Frequency distributions
• A frequency distribution is a presentation of the
number of times (or the frequency) that each value (or
group of values) occurs in the study population.
• Ordered array: A simple arrangement of individual
observations in order of magnitude.
• A simple and effective way of summarizing categorical
data is to construct a frequency distribution table.
• This is done by counting the number of observations
falling into each of the categories, or levels of the
variables.
• Consider, for example, the variable birth weight with levels ‘Very low’, ‘Low’, ‘Normal’ and ‘Big’.
Relative Frequency
• Sometimes it is useful to compute the
proportion, or percentages of observations in
each category.
• The distribution of proportions is called the
relative frequency distribution of the variable.
• Given a total number of observations, the
relative frequency distribution is easily derived
from the frequency distribution.
Cumulative frequency
• Two other distributions, the cumulative frequency and the cumulative relative frequency, are useful for describing ordinal data in particular.
• They are not meaningful for nominal data, whose categories have no order. E.g. you would never say that 70% of observations are below the colour blue.
• The cumulative frequency of a category is the number of observations in that category plus the observations in all categories below it.
• The cumulative relative frequency is the proportion of observations in that category plus all categories below it; it is obtained by dividing the cumulative frequency by the total number of observations.
Table 2. Distribution of birth weight of newborns between 1976-1996 at Tikur Anbessa Hospital (TAH).
BWT         Freq.   Rel. freq. (%)   Cum. freq.   Cum. rel. freq. (%)
Very low       43        0.4              43             0.4
Low           793        8.0             836             8.4
Normal       8870       88.9            9706            97.3
Big           268        2.7            9974           100
Total        9974      100
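The columns of Table 2 can be reproduced from the raw counts alone. The following is a minimal sketch in Python, assuming the four category counts are typed in by hand in their natural order; the variable names are illustrative and not part of the original slides.

```python
from itertools import accumulate

# Counts from Table 2, listed in the natural order of the categories.
counts = {"Very low": 43, "Low": 793, "Normal": 8870, "Big": 268}

total = sum(counts.values())
# Running totals give the cumulative frequency of each category.
cum_counts = list(accumulate(counts.values()))

rows = []
for (category, freq), cum in zip(counts.items(), cum_counts):
    rel = 100 * freq / total      # relative frequency (%)
    cum_rel = 100 * cum / total   # cumulative relative frequency (%)
    rows.append((category, freq, round(rel, 1), cum, round(cum_rel, 1)))

for row in rows:
    print(row)
```

Because the cumulative columns depend on the order of the categories, this only makes sense for ordinal or numerical variables.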
Frequency distribution for numerical data
• Beyond the ordered array, further useful summarization may be achieved by grouping the data.
• To group a set of observations we select a set of contiguous, non-overlapping intervals such that each value in the set of observations can be placed in one, and only one, of the intervals.
• These intervals are usually referred to as class intervals.
• One of the first considerations when data are to be grouped is how many intervals to include.
• The question is how best to organize such data, especially when we have a huge data set that cannot be managed by eye.
Table 3. Frequencies of serum cholesterol levels for 1067 US males of ages 25-34 (1976-1980).
Cholesterol level
(mg/100 ml)    Freq.   Rel. freq. (%)   Cum. freq.   Cum. rel. freq. (%)
80-119           13        1.2              13             1.2
120-159         150       14.1             163            15.3
160-199         442       41.4             605            56.7
200-239         299       28.0             904            84.7
240-279         115       10.8            1019            95.5
280-319          34        3.2            1053            98.7
320-359           9        0.8            1062            99.5
360-399           5        0.5            1067           100
Total          1067      100
For both discrete and continuous data the values
are grouped into non-overlapping intervals,
usually of equal width.
Example of raw data of age….
Example of categorized data of age
How to calculate class interval?
• To determine the number of class intervals and the corresponding width, we use Sturges' rule:
$K = 1 + 3.322 \log_{10} n$
$W = \dfrac{L - S}{K}$
where
K = number of class intervals
n = number of observations
W = width of the class interval
L = the largest value
S = the smallest value
Example
• Construct a grouped frequency distribution of the following data on the amount of time (in hours) that 80 college students devoted to leisure activities during a typical school week:
[Data table: the amount of time (in hours) that 80 college students devoted to leisure activities during a typical school week]
• Using Sturges' rule, $K = 1 + 3.322 \log_{10}(80) = 7.32 \approx 7$ classes.
• Maximum value = 38 and minimum value = 10.
• $W = \text{range}/K = (38 - 10)/7 = 28/7 = 4$.
• Rounding the width up to 5 (a common rule of thumb), we can construct a grouped frequency distribution for the above data with six classes, 10-14 through 35-39.
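The arithmetic of Sturges' rule for this example can be checked with a short script. This is a sketch only, assuming base-10 logarithms as in the worked example; the function names `sturges_classes` and `class_width` are hypothetical helpers, not part of the original slides.

```python
import math

def sturges_classes(n):
    """Number of class intervals suggested by Sturges' rule (before rounding)."""
    return 1 + 3.322 * math.log10(n)

def class_width(largest, smallest, k):
    """Class width W = (L - S) / K."""
    return (largest - smallest) / k

k_exact = sturges_classes(80)   # 80 students in the example
k = round(k_exact)              # round to a whole number of classes
w = class_width(38, 10, k)      # largest value 38, smallest value 10

print(round(k_exact, 2), k, w)
```

The rule gives a starting point only; the slides then widen the interval to 5 for convenience, which reduces the number of classes actually used.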
Mid-point and True-limits
• Mid-point (class mark): the value that lies midway between the lower and the upper limits of a class.
• True limits (class boundaries): the limits that make the intervals of a continuous variable continuous in both directions, leaving no gaps between adjacent classes.
– Used for smoothing the class intervals.
– When values are recorded to the nearest whole unit, subtract 0.5 from the lower limit and add 0.5 to the upper limit.
Contd…
• Note: in constructing a cumulative frequency distribution, if we cumulate from the lowest value of the variable to the highest, the result is called a ‘less than’ cumulative frequency distribution; if we cumulate from the highest value to the lowest, the result is called a ‘more than’ cumulative frequency distribution. The most common is the less than cumulative frequency distribution.
Example
Time (hours)   True limits    Mid-point   Frequency
10-14           9.5-14.5         12            8
15-19          14.5-19.5         17           28
20-24          19.5-24.5         22           27
25-29          24.5-29.5         27           12
30-34          29.5-34.5         32            4
35-39          34.5-39.5         37            1
Total                                         80
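The true limits, mid-points and both kinds of cumulative frequency for this table can be derived mechanically from the class limits and frequencies. Below is a minimal sketch, assuming whole-hour class limits (so the correction is 0.5); the variable names are illustrative.

```python
from itertools import accumulate

# Class limits and frequencies from the example table above.
classes = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34), (35, 39)]
freqs = [8, 28, 27, 12, 4, 1]

# True limits: subtract 0.5 from the lower limit, add 0.5 to the upper limit.
boundaries = [(lo - 0.5, hi + 0.5) for lo, hi in classes]

# Mid-point: halfway between the lower and upper limits.
midpoints = [(lo + hi) / 2 for lo, hi in classes]

# 'Less than' cumulation runs from the lowest class upward,
# 'more than' cumulation runs from the highest class downward.
less_than = list(accumulate(freqs))
more_than = list(accumulate(reversed(freqs)))[::-1]

for row in zip(classes, boundaries, midpoints, freqs, less_than, more_than):
    print(row)
```

The last 'less than' value and the first 'more than' value both equal the total number of observations, which is a quick check on the tallying.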
• Class width: the length of a class, given by the difference between its class boundaries. For the first class the width is 14.5 - 9.5 = 5.
• Note: as the sample size increases and the class width is reduced, the sample distribution more closely resembles the population distribution.
– Class intervals should be contiguous, non-overlapping, mutually exclusive and exhaustive.
– Too few intervals result in loss of information.
– Too many intervals defeat the objective of summarization.
– Class intervals should generally be of the same width (sometimes this is impossible).
– Open-ended class intervals should be avoided.
Exercise
• Construct a grouped frequency distribution and complete the following table for the age of patients (years) in a diabetic clinic in Addis Ababa, 2010.
Table: Age of patients (years) in a diabetic clinic in Addis Ababa, 2010
Columns to complete:
– Age group (years): class limits
– Class boundary
– Class mid-point
– Tally
– Frequency (fi)
– Relative frequency (fraction, %)
– Cumulative frequency (less-than method and more-than method)
– Relative cumulative frequency (less-than method and more-than method)
– Total (last row)
METHOD OF DATA PRESENTATION
Data table
Guidelines for constructing tables
• Keep them simple
• Limit the number of variables
• All tables should be self-explanatory
• Include clear title telling what, where and
when
• Clearly label the rows and columns
Cont…
• State clearly the unit of measurement used
• Explain codes and abbreviations in a footnote
• Show totals
• If the data are not original, indicate the source in a footnote
Graphical presentation of data
• Variety of graph styles can be used to present
data.
• The most commonly used types of graph are pie
charts, bar diagrams, histograms, frequency
polygon and scatter diagrams.
• The purpose of using a graph is to tell others
about a set of data quickly, allowing them to
grasp the important characteristics of the data.
• In other words, graphs are visual aids to rapid
understanding.
Importance of graphs
• Diagrams are more attractive than mere figures.
• They please the eye, add a spark of interest and so catch the attention.
• They help in deriving the required information in less time and without mental strain.
• They are more memorable than mere figures.
• They facilitate comparison.
Bar charts
• Bar chart: displays the frequency distribution for nominal or ordinal data.
• In a bar chart the various categories into which the observations fall are represented along the horizontal axis.
• A vertical bar is drawn above each category such that the height of the bar represents either the frequency or the relative frequency of observations within the class.
• The vertical axis should always start from 0, but the horizontal axis can start from anywhere.
• The bars should be of equal width and should be separated from one another so as not to imply continuity.
Figure 1. Bar charts showing frequency distribution of the variable ‘BWT’.
[Two bar charts; x-axis: BWT (Very low, Low, Normal, Big); y-axis: frequency in the first panel, relative frequency (0 to 100) in the second]
Bar charts for comparison
• Multiple bar chart: In order to compare the
distribution of a variable for two or more
groups, bars are often drawn along side each
other for groups being compared in a single bar
chart.
• Sub division bar chart: If there are different
quantities forming the sub-divisions of the
totals, simple bars may be sub-divided in the
ratio of the various sub-divisions to exhibit the
relationship of the parts to the whole.
Fig 2. Bar chart indicating categories of birth weight of 9975 newborns grouped by antenatal follow-up of the mothers
[Multiple bar chart; x-axis: BWT (Low, Normal, Big); y-axis: percent (0 to 100); legend: Yes, No; bar labels for the two series: 9, 88.9, 2.1 and 7.9, 89, 3.1]
Example: Plasmodium species distribution for confirmed
malaria cases, Zeway, 2003
Pie chart
• Pie chart: displays the frequency distribution for nominal or ordinal data.
• In a pie chart the various categories into which the observations fall are represented as sectors of a circle.
• Each sector represents either the frequency or the relative frequency of observations within the class, and its angle is proportional to that frequency or relative frequency.
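Since each sector's angle is proportional to its frequency, the angles follow directly from the counts. The sketch below uses the birth weight counts from Table 2 and assumes a full circle of 360 degrees; the names are illustrative.

```python
# Birth weight counts from Table 2.
counts = {"Very low": 43, "Low": 793, "Normal": 8870, "Big": 268}

total = sum(counts.values())

# Sector angle = (frequency / total) * 360 degrees.
angles = {category: 360 * freq / total for category, freq in counts.items()}

for category, angle in angles.items():
    print(f"{category}: {angle:.1f} degrees")
```

Using relative frequencies instead of counts gives the same angles, because both are proportional to the share of observations in each category.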
Figure 3. Pie charts showing frequency distribution of the variable ‘BWT’
Fig 3(a) Pie chart indicating frequency of categories of birth weight: Very low 43, Low 793, Normal 8870, Big 268
Fig 3(b) Pie chart indicating relative frequency of categories of birth weight: Very low 0.4, Low 8, Normal 88.9, Big 2.7
Histogram
• A histogram is a frequency distribution with continuous class intervals that has been turned into a graph.
• Given a set of numerical data, we can obtain an impression of the shape of its distribution by constructing a histogram.
• A histogram is constructed by choosing a set of non-overlapping intervals (class intervals) and counting the number of observations that fall in each class.
Histograms cont…
• The number of observations in each class is called the frequency. Hence a histogram is the graph of a frequency distribution.
• It is necessary that the class intervals be non-overlapping so that each observation falls in one and only one interval.
Histograms cont…
• Except for the two end classes, class intervals are usually chosen to be of equal width. If this is not the case, the histogram could give a misleading impression of the shape of the data.
• In drawing the histogram, smoothing of the class intervals is an important point: we subtract 0.5 from the lower limit and add 0.5 to the upper limit of each interval, so that adjacent bars touch.
Example
Distribution of the age of women at the time of
marriage
Age group No. of women
15-19 11
20-24 36
25-29 28
30-34 13
35-39 7
40-44 3
45-49 2
[Histogram: Age of women at the time of marriage; x-axis: age group, with class boundaries 14.5-19.5, 19.5-24.5, 24.5-29.5, 29.5-34.5, 34.5-39.5, 39.5-44.5, 44.5-49.5; y-axis: No. of women (0 to 40)]
Fig 5. A histogram displaying frequency distribution of birth weight of newborns at Tikur Anbessa Hospital
[Histogram; x-axis: birth weight (ticks 800 to 5200); y-axis ticks 0 to 2000; Mean = 3126, Std. Dev = 502.34, N = 9975]
Frequency polygons
• Instead of drawing bars for each class interval,
sometimes a single point is drawn at the mid
point of each class interval and consecutive
points joined by straight line.
• Graphs drawn in this way are called frequency
polygons .
• Frequency polygons are superior to histograms
for comparing two or more sets of data.
Fig 6. Frequency polygon of birth weight of 9975 newborns at Tikur Anbessa Hospital for males and females
[Frequency polygons; x-axis: birth weight (500 to 5000); y-axis: % (0 to 50); legend: sex (Males, Females)]
Box and Whisker Plot
• It is another way to display information when the objective is to illustrate certain locations in the distribution (and hence its skewness).
• It can be used to display a set of discrete or continuous observations using a single vertical axis; only certain summaries of the data are shown.
Box plot cont...
• A box is drawn with the top of the box at the third quartile (75%) and the bottom at the first quartile (25%).
• The location of the mid-point (50%, the median) of the distribution is indicated with a horizontal line in the box.
• Finally, straight lines, or whiskers, are drawn from the centre of the top of the box to the largest observation and from the centre of the bottom of the box to the smallest observation.
Box cont....
When the sample contains outliers, the box plot is completed as follows:
• Draw a vertical bar from the upper quartile to the largest non-outlying value in the sample.
• Draw a vertical bar from the lower quartile to the smallest non-outlying value in the sample.
• Values that lie outside the box (the IQR) but are not outliers are thus covered by the whiskers; outliers are plotted as individual points beyond them.
• IQR = P75 – P25
Box plots are useful for comparing two or
more groups of observations
Drawing a box-and-whisker plot
Raw data:
35, 29, 44, 72, 34, 64, 41, 50, 54, 104, 39, 58
Ordered data:
29 34 35 39 41 44 50 54 58 64 72 104
Median = (44 + 50)/2 = 47 = Q2
Q1 = 37
Q3 = 61
Min = 29, Max = 104
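The quartiles above are consistent with taking the median of the lower half and the median of the upper half of the ordered data. The sketch below implements that convention; the function name `quartiles` is a hypothetical helper, and statistical packages may use other interpolation rules that give slightly different Q1 and Q3.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention.

    Assumes at least two observations. For an odd number of observations
    the middle value is excluded from both halves.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]    # observations below the median position
    upper = data[-half:]   # observations above the median position
    return median(lower), median(data), median(upper)

raw = [35, 29, 44, 72, 34, 64, 41, 50, 54, 104, 39, 58]
q1, q2, q3 = quartiles(raw)
iqr = q3 - q1

print(min(raw), q1, q2, q3, max(raw), iqr)
```

These five values (minimum, Q1, median, Q3, maximum) are all that is needed to draw the box and its whiskers.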
Box plot example
[Box plot on an axis from 0 to 110; Min = 29, Q1 = 37, Q2 = 47, Q3 = 61, Max = 104]
Scatter plot
• Most studies in medicine involve measuring more than one characteristic, and graphs displaying the relationship between two characteristics are common in the literature.
• When both variables are qualitative, we can use a multiple bar graph.
• When one of the characteristics is qualitative and the other is quantitative, the data can be displayed in box-and-whisker plots.
Scatter plot ….
• For two quantitative variables we use bivariate plots (also called scatter plots or scatter diagrams).
• A scatter plot is used to see whether a relationship exists between the two measures.
• A scatter diagram is constructed by drawing X- and Y-axes.
• Each point (dot) represents a pair of values measured for a single study subject.
[Figure: example scatter plot showing a positive relation]
Scatter Plots and Types of Correlation
[Scatter plot; x = hours of training (0 to 20), y = number of accidents (0 to 60)]
Negative correlation: as x increases, y decreases.
Scatter Plots and Types of Correlation
[Scatter plot; x = Math SAT score (300 to 800), y = GPA (1.50 to 4.00)]
Positive correlation: as x increases, y increases.
Scatter Plots and Types of Correlation
[Scatter plot; x = height (60 to 80), y = IQ (80 to 160)]
No linear correlation.
Scatter Diagram…
1. Direction of relationship: positive or negative
2. Form of relationship: linear or curvilinear
3. Degree of relationship: strong or weak
[Each item is illustrated by a pair of X-Y scatter diagrams]
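The direction and degree of a linear relationship can also be summarized numerically. The sketch below computes Pearson's correlation coefficient r, a standard measure that the slides do not define; the data are invented purely for illustration (they are not read from the figures), and `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: more hours of training, fewer accidents.
hours = [2, 4, 6, 8, 10]
accidents = [50, 41, 33, 22, 15]

r = pearson_r(hours, accidents)
print(round(r, 3))
```

The sign of r gives the direction (negative here) and its closeness to -1 or +1 gives the degree; r only captures linear form, so a curvilinear relationship should still be checked on the scatter diagram itself.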
Line graph
• Useful for assessing the trend of a particular situation over time, e.g. monitoring the trend of epidemics.
• The time, in weeks, months or years, is marked along the horizontal axis.
• Values of the quantity being studied are marked on the vertical axis.
• Values for each category are connected by a continuous line.
• Sometimes two or more lines are drawn on the same graph using the same scale so that they are comparable.
No. of microscopically confirmed malaria cases by species and month at Zeway malaria control unit, 2003
[Line graph; x-axis: months (Jan to Dec); y-axis: No. of confirmed malaria cases (0 to 2100); lines: Positive, P. falciparum, P. vivax]
Line graph cont..
• The following graph shows the level of zidovudine (AZT) in the blood of HIV/AIDS patients at several times after administration of the drug, for patients with normal fat absorption and for patients with fat malabsorption.
• A line graph can also be used to depict the relationship between two continuous variables, like a scatter diagram.
Line graph cont…
Response to administration of zidovudine in two groups of AIDS patients in hospital X, 1999
[Line graph; x-axis: time since administration (min), 10 to 360; y-axis: blood zidovudine concentration (0 to 8); lines: Fat malabsorption, Normal fat absorption]
Choosing graphs
Type of data or purpose    Appropriate graphs
Metric/Numerical           - Histogram (one continuous variable)
                           - Frequency polygon (one or more continuous variables)
                           - Cumulative frequency polygon (ogive curve)
                           - Box-and-whisker plot (one continuous and one categorical variable)
                           - Stem-and-leaf plot (one continuous variable)
                           - Scatter plot (two continuous variables)
Categorical                - Bar chart (one or more categorical variables; simple or multiple)
                           - Pie chart (one categorical variable)
Trend                      - Line graph (one continuous and one categorical variable, or two continuous variables)
THANK YOU!
67
4/17/2023

More Related Content

What's hot

Friedman test Stat
Friedman test Stat Friedman test Stat
Friedman test Stat
Kate Malda
 

What's hot (20)

Analytical cosmetics:BIS specification and analytical methods for shampoo, sk...
Analytical cosmetics:BIS specification and analytical methods for shampoo, sk...Analytical cosmetics:BIS specification and analytical methods for shampoo, sk...
Analytical cosmetics:BIS specification and analytical methods for shampoo, sk...
 
SUBUMETER, CORNEOMETER, TEWL,m.pharm analysis, pharmaceutical analysis, food ...
SUBUMETER, CORNEOMETER, TEWL,m.pharm analysis, pharmaceutical analysis, food ...SUBUMETER, CORNEOMETER, TEWL,m.pharm analysis, pharmaceutical analysis, food ...
SUBUMETER, CORNEOMETER, TEWL,m.pharm analysis, pharmaceutical analysis, food ...
 
X-Ray Crystallography.pptx
X-Ray Crystallography.pptxX-Ray Crystallography.pptx
X-Ray Crystallography.pptx
 
Role of herbs in hair care Amla and heena.pptx
Role of herbs in hair care  Amla and  heena.pptxRole of herbs in hair care  Amla and  heena.pptx
Role of herbs in hair care Amla and heena.pptx
 
Friedman test Stat
Friedman test Stat Friedman test Stat
Friedman test Stat
 
Hyphenated techniques(GC-MS/MS, LC-MS/MS, HPTLC-MS)
Hyphenated techniques(GC-MS/MS, LC-MS/MS,  HPTLC-MS)Hyphenated techniques(GC-MS/MS, LC-MS/MS,  HPTLC-MS)
Hyphenated techniques(GC-MS/MS, LC-MS/MS, HPTLC-MS)
 
UNIT IV.pptx Principle of cosmetic evaluation.
UNIT IV.pptx  Principle of cosmetic evaluation.UNIT IV.pptx  Principle of cosmetic evaluation.
UNIT IV.pptx Principle of cosmetic evaluation.
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Application of excel and spss programme in statistical
Application of excel and spss programme in statisticalApplication of excel and spss programme in statistical
Application of excel and spss programme in statistical
 
Calibration - UV VIS Spectrophotometer, HPLC, Gas Chromatograph, IR spectroph...
Calibration - UV VIS Spectrophotometer, HPLC, Gas Chromatograph, IR spectroph...Calibration - UV VIS Spectrophotometer, HPLC, Gas Chromatograph, IR spectroph...
Calibration - UV VIS Spectrophotometer, HPLC, Gas Chromatograph, IR spectroph...
 
Regression ppt
Regression pptRegression ppt
Regression ppt
 
Unit 2 Regression- BSRM.pdf
Unit 2 Regression- BSRM.pdfUnit 2 Regression- BSRM.pdf
Unit 2 Regression- BSRM.pdf
 
Probability ,Binomial distribution, Normal distribution, Poisson’s distributi...
Probability ,Binomial distribution, Normal distribution, Poisson’s distributi...Probability ,Binomial distribution, Normal distribution, Poisson’s distributi...
Probability ,Binomial distribution, Normal distribution, Poisson’s distributi...
 
Factorial design ,full factorial design, fractional factorial design
Factorial design ,full factorial design, fractional factorial designFactorial design ,full factorial design, fractional factorial design
Factorial design ,full factorial design, fractional factorial design
 
Hypothesis testing for parametric data
Hypothesis testing for parametric dataHypothesis testing for parametric data
Hypothesis testing for parametric data
 
sebumeter.pptx
sebumeter.pptxsebumeter.pptx
sebumeter.pptx
 
Experimental design techniques
Experimental design techniquesExperimental design techniques
Experimental design techniques
 
Optimization techniques in formulation Development Response surface methodol...
Optimization techniques in formulation Development  Response surface methodol...Optimization techniques in formulation Development  Response surface methodol...
Optimization techniques in formulation Development Response surface methodol...
 
Basic stat analysis using excel
Basic stat analysis using excelBasic stat analysis using excel
Basic stat analysis using excel
 
Skin relating problems in cosmetics
Skin relating problems in cosmeticsSkin relating problems in cosmetics
Skin relating problems in cosmetics
 

Similar to data organization and presentation.pptx

collectionandrepresentationofdata1-200904192336.pptx
collectionandrepresentationofdata1-200904192336.pptxcollectionandrepresentationofdata1-200904192336.pptx
collectionandrepresentationofdata1-200904192336.pptx
aibakimito
 
Chapter-2-Frequency-Distribution-and-Graphical-Presentation.pptx
Chapter-2-Frequency-Distribution-and-Graphical-Presentation.pptxChapter-2-Frequency-Distribution-and-Graphical-Presentation.pptx
Chapter-2-Frequency-Distribution-and-Graphical-Presentation.pptx
LaurenceBernardBalbi1
 

Similar to data organization and presentation.pptx (20)

Group-4-Report-Frequency-Distribution.ppt
Group-4-Report-Frequency-Distribution.pptGroup-4-Report-Frequency-Distribution.ppt
Group-4-Report-Frequency-Distribution.ppt
 
Descriptive statistics
Descriptive statisticsDescriptive statistics
Descriptive statistics
 
collectionandrepresentationofdata1-200904192336.pptx
collectionandrepresentationofdata1-200904192336.pptxcollectionandrepresentationofdata1-200904192336.pptx
collectionandrepresentationofdata1-200904192336.pptx
 
Tabular and Graphical Representation of Data
Tabular and Graphical Representation of Data Tabular and Graphical Representation of Data
Tabular and Graphical Representation of Data
 
Methods of data prsentation.pptx
Methods of data prsentation.pptxMethods of data prsentation.pptx
Methods of data prsentation.pptx
 
1.0 Descriptive statistics.pdf
1.0 Descriptive statistics.pdf1.0 Descriptive statistics.pdf
1.0 Descriptive statistics.pdf
 
Bba 2001
Bba 2001Bba 2001
Bba 2001
 
lesson-data-presentation-tools-1.pptx
lesson-data-presentation-tools-1.pptxlesson-data-presentation-tools-1.pptx
lesson-data-presentation-tools-1.pptx
 
2 biostatistics presenting data
2  biostatistics presenting data2  biostatistics presenting data
2 biostatistics presenting data
 
FREQUENCY DISTRIBUTION.pptx
FREQUENCY DISTRIBUTION.pptxFREQUENCY DISTRIBUTION.pptx
FREQUENCY DISTRIBUTION.pptx
 
2. Descriptive Statistics.pdf
2. Descriptive Statistics.pdf2. Descriptive Statistics.pdf
2. Descriptive Statistics.pdf
 
Unit 4 editing and coding (2)
Unit 4 editing and coding (2)Unit 4 editing and coding (2)
Unit 4 editing and coding (2)
 
Chp 3
Chp 3Chp 3
Chp 3
 
Chp 3
Chp 3Chp 3
Chp 3
 
Basic Concepts of Statistics - Lecture Notes
Basic Concepts of Statistics - Lecture NotesBasic Concepts of Statistics - Lecture Notes
Basic Concepts of Statistics - Lecture Notes
 
Time Series Analysis.pptx
Time Series Analysis.pptxTime Series Analysis.pptx
Time Series Analysis.pptx
 
Chapter-2-Frequency-Distribution-and-Graphical-Presentation.pptx
Chapter-2-Frequency-Distribution-and-Graphical-Presentation.pptxChapter-2-Frequency-Distribution-and-Graphical-Presentation.pptx
Chapter-2-Frequency-Distribution-and-Graphical-Presentation.pptx
 
1Basic biostatistics.pdf
1Basic biostatistics.pdf1Basic biostatistics.pdf
1Basic biostatistics.pdf
 
Data presenattaion we can read this document..pptx
Data presenattaion  we can read this document..pptxData presenattaion  we can read this document..pptx
Data presenattaion we can read this document..pptx
 
Biostatistics
BiostatisticsBiostatistics
Biostatistics
 

Recently uploaded

原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
pwgnohujw
 
Client Researchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.pptx
Client Researchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.pptxClient Researchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.pptx
Client Researchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.pptx
Stephen266013
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Klinik kandungan
 
sourabh vyas1222222222222222222244444444
sourabh vyas1222222222222222222244444444sourabh vyas1222222222222222222244444444
sourabh vyas1222222222222222222244444444
saurabvyas476
 
bams-3rd-case-presentation-scabies-12-05-2020.pptx
bams-3rd-case-presentation-scabies-12-05-2020.pptxbams-3rd-case-presentation-scabies-12-05-2020.pptx
bams-3rd-case-presentation-scabies-12-05-2020.pptx
JocylDuran
 
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
q6pzkpark
 
Abortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Doha {{ QATAR }} +966572737505) Get CytotecAbortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
acoha1
 

Recently uploaded (20)

Seven tools of quality control.slideshare
Seven tools of quality control.slideshareSeven tools of quality control.slideshare
Seven tools of quality control.slideshare
 
Northern New England Tableau User Group (TUG) May 2024
Northern New England Tableau User Group (TUG) May 2024Northern New England Tableau User Group (TUG) May 2024
Northern New England Tableau User Group (TUG) May 2024
 
Aggregations - The Elasticsearch "GROUP BY"
Aggregations - The Elasticsearch "GROUP BY"Aggregations - The Elasticsearch "GROUP BY"
Aggregations - The Elasticsearch "GROUP BY"
 
Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...
Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...
Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...
 
Predictive Precipitation: Advanced Rain Forecasting Techniques
Predictive Precipitation: Advanced Rain Forecasting TechniquesPredictive Precipitation: Advanced Rain Forecasting Techniques
Predictive Precipitation: Advanced Rain Forecasting Techniques
 
原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
 
Client Researchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.pptx
Client Researchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.pptxClient Researchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.pptx
Client Researchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.pptx
 
DAA Assignment Solution.pdf is the best1
DAA Assignment Solution.pdf is the best1DAA Assignment Solution.pdf is the best1
DAA Assignment Solution.pdf is the best1
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
 
Case Study 4 Where the cry of rebellion happen?
Case Study 4 Where the cry of rebellion happen?Case Study 4 Where the cry of rebellion happen?
Case Study 4 Where the cry of rebellion happen?
 
DBMS UNIT 5 46 CONTAINS NOTES FOR THE STUDENTS
DBMS UNIT 5 46 CONTAINS NOTES FOR THE STUDENTSDBMS UNIT 5 46 CONTAINS NOTES FOR THE STUDENTS
DBMS UNIT 5 46 CONTAINS NOTES FOR THE STUDENTS
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for Research
 
sourabh vyas1222222222222222222244444444
sourabh vyas1222222222222222222244444444sourabh vyas1222222222222222222244444444
sourabh vyas1222222222222222222244444444
 
bams-3rd-case-presentation-scabies-12-05-2020.pptx
bams-3rd-case-presentation-scabies-12-05-2020.pptxbams-3rd-case-presentation-scabies-12-05-2020.pptx
bams-3rd-case-presentation-scabies-12-05-2020.pptx
 
How to Transform Clinical Trial Management with Advanced Data Analytics
How to Transform Clinical Trial Management with Advanced Data AnalyticsHow to Transform Clinical Trial Management with Advanced Data Analytics
How to Transform Clinical Trial Management with Advanced Data Analytics
 
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
 
Bios of leading Astrologers & Researchers
Bios of leading Astrologers & ResearchersBios of leading Astrologers & Researchers
Bios of leading Astrologers & Researchers
 
Abortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Doha {{ QATAR }} +966572737505) Get CytotecAbortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
 
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
 
Statistics Informed Decisions Using Data 5th edition by Michael Sullivan solu...
Statistics Informed Decisions Using Data 5th edition by Michael Sullivan solu...Statistics Informed Decisions Using Data 5th edition by Michael Sullivan solu...
Statistics Informed Decisions Using Data 5th edition by Michael Sullivan solu...
 

data organization and presentation.pptx

  • 1. Chapter two DATA ORGANIZATION AND PRESENTATION Mengistu Y. (BSC, MPH-HI) 2017 1 4/17/2023
  • 2. Learning objectives At the end of this section students are expected to: • understand the nature of data • organize and present data according to the need of the activity • present data in table and graphical ways for information use. 2 4/17/2023
  • 3. Data organization and presentation • Statistics is used to organize and interpret research observations and findings. • Before interpretation & communication of the findings, the raw data must be organized and presented in a clear and understandable way. • Techniques used to organize and summarize a set of data in a concise way: – Organization of data – Summarization of data – Presentation of data
  • 4. Cont... • Numbers that have not been summarized and organized are called raw data. • Descriptive statistics include tables, graphical/chart displays and the calculation of summary measures such as means and proportions. • The methods of describing variables differ depending on the type of data (numerical or categorical).
  • 5. Organizing data. Categorical data: • Table of frequency distributions – Frequency – Relative frequency – Cumulative frequencies • Graphs – Bar charts – Pie charts. Continuous or discrete data: • Frequency distribution • Summary measures • Graphs – Histograms – Frequency polygons – Cumulative frequency polygons • Stem-and-leaf plots • Box-and-whisker plots • Scatter plots
  • 6. Frequency distributions • A frequency distribution is a presentation of the number of times (or the frequency) that each value (or group of values) occurs in the study population. • Ordered array: A simple arrangement of individual observations in order of magnitude. • A simple and effective way of summarizing categorical data is to construct a frequency distribution table. • This is done by counting the number of observations falling into each of the categories, or levels of the variables. • Consider for example, the variable birth weight with levels ‘Very low ’, ‘Low’, ‘Normal’ and ‘Big’. 6 4/17/2023
  • 7. Relative Frequency • Sometimes it is useful to compute the proportion, or percentages of observations in each category. • The distribution of proportions is called the relative frequency distribution of the variable. • Given a total number of observations, the relative frequency distribution is easily derived from the frequency distribution. 7 4/17/2023
  • 8. Cumulative frequency • Two other distributions, the cumulative frequency and the cumulative relative frequency, are useful for describing ordinal data in particular. • They are not meaningful for nominal data: e.g. you would never say that 70% of observations are below the colour blue. • The cumulative frequency is the number of observations in the category plus the observations in all categories smaller than it. • The cumulative relative frequency is the proportion of observations in the category plus the observations in all categories smaller than it, and is obtained by dividing the cumulative frequency by the total number of observations.
  • 9. Table 2. Distribution of birth weight of newborns between 1976-1996 at TAH.

      BWT         Freq.   Rel. freq. (%)   Cum. freq.   Cum. rel. freq. (%)
      Very low       43         0.4              43              0.4
      Low           793         8.0             836              8.4
      Normal       8870        88.9            9706             97.3
      Big           268         2.7            9974            100
      Total        9974       100
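The columns of Table 2 can be reproduced from the four category counts alone. The following is a minimal Python sketch rather than part of the original slides; the function name `frequency_table` and the dictionary layout are illustrative assumptions.

```python
from itertools import accumulate

# Category counts taken from Table 2 (birth weight of newborns at TAH).
bwt_counts = {"Very low": 43, "Low": 793, "Normal": 8870, "Big": 268}


def frequency_table(counts):
    """Return one row per category with relative and cumulative columns."""
    total = sum(counts.values())
    running_totals = accumulate(counts.values())
    rows = []
    for (category, freq), cum in zip(counts.items(), running_totals):
        rows.append({
            "category": category,
            "freq": freq,
            "rel_freq_pct": round(100 * freq / total, 1),
            "cum_freq": cum,
            "cum_rel_freq_pct": round(100 * cum / total, 1),
        })
    return rows


table = frequency_table(bwt_counts)
for row in table:
    print(row)
```

The categories must be supplied in their natural order (Very low to Big), since the cumulative columns only make sense for ordered categories.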
  • 10. Frequency distribution for numerical data • Beyond the ordered array, further useful summarization may be achieved by grouping the data. • To group a set of observations we select a set of continuous, non-overlapping intervals such that each value in the set of observations can be placed in one, and only one, of the intervals. • These intervals are usually referred to as class intervals.
  • 11. • One of the first considerations when data are to be grouped is how many intervals to include • The question is how best can we organize such data. Imagine when we have huge data set which may not be manageable by eye. 4/17/2023 11
  • 12. Table 3. Frequencies of serum cholesterol levels for 1067 US males of ages 25-34 (1976-1980).

      Cholesterol level (mg/100 ml)   Freq.   Rel. freq. (%)   Cum. freq.   Cum. rel. freq. (%)
      80-119                             13        1.2              13              1.2
      120-159                           150       14.1             163             15.3
      160-199                           442       41.4             605             56.7
      200-239                           299       28.0             904             84.7
      240-279                           115       10.8            1019             95.5
      280-319                            34        3.2            1053             98.7
      320-359                             9        0.8            1062             99.5
      360-399                             5        0.5            1067            100
      Total                            1067      100
  • 13. For both discrete and continuous data the values are grouped into non-overlapping intervals, usually of equal width. 13 4/17/2023
  • 14. Example of raw data of age…. 14 4/17/2023
  • 15. Example of categorized data of age 15 4/17/2023
  • 16. How to calculate class interval? • To determine the number of class intervals and the corresponding width, we use Sturges' rule: $K = 1 + 3.322 \log_{10} n$ and $W = \frac{L - S}{K}$, where K = number of class intervals, n = number of observations, W = width of the class interval, L = the largest value, S = the smallest value.
  • 17. Example • Construct a grouped frequency distribution of the following data on the amount of time (in hours) that 80 college students devoted to leisure activities during a typical school week: 4/17/2023 17
  • 19. The amount of time (in hours) that 80 college students devoted to leisure activities during a typical school week • Using the above formula, K = 1 + 3.322 × log(80) = 7.32 ≈ 7 classes • Maximum value = 38 and minimum value = 10 • W = range/K = (38 – 10)/7 = 28/7 = 4 • Using a width of 5 (a common rule of thumb), we can construct a grouped frequency distribution for the above data as:
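The arithmetic of this worked example can be checked with a short Python sketch. It assumes the logarithm in Sturges' rule is base 10 (which is what yields 7.32 for n = 80); the function names are illustrative and not from the slides.

```python
import math


def sturges_classes(n):
    """Number of class intervals suggested by Sturges' rule (base-10 log)."""
    return 1 + 3.322 * math.log10(n)


def class_width(largest, smallest, k):
    """Class width: the range divided by the number of classes."""
    return (largest - smallest) / k


k_raw = sturges_classes(80)     # unrounded value from the rule
k = round(k_raw)                # number of classes actually used
w = class_width(38, 10, k)      # maximum = 38, minimum = 10

print(f"K = {k_raw:.2f}, rounded to {k} classes; W = {w}")
```

The rule only gives a starting point: the slides then move from the computed width of 4 to a more convenient width of 5.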
  • 21. Mid-point and true limits • Mid-point (class mark): the value of the interval which lies midway between the lower and the upper limits of a class. • True limits (class boundaries): the limits that make an interval of a continuous variable continuous in both directions. They are used for smoothing the class intervals: subtract 0.5 from the lower limit and add 0.5 to the upper limit.
  • 22. Contd… • Note. In the construction of cumulative frequency distribution, if we start the cumulation from the lowest size of the variable to the highest size, the resulting frequency distribution is called `Less than cumulative frequency distribution' and if the cumulation is from the highest to the lowest value the resulting frequency distribution is called `more than cumulative frequency distribution.' The most common cumulative frequency is the less than cumulative frequency 4/17/2023 22
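The two directions of cumulation can be illustrated with a few lines of Python. This is a sketch only; the frequencies are those of the leisure-time example in this deck (classes 10-14 through 35-39), and the variable names are illustrative.

```python
from itertools import accumulate

# Class frequencies, ordered from the lowest class (10-14) to the highest (35-39).
freqs = [8, 28, 27, 12, 4, 1]

# 'Less than' cumulation runs from the lowest class upwards.
less_than = list(accumulate(freqs))

# 'More than' cumulation runs from the highest class downwards;
# reversing the result lines it up with the original class order.
more_than = list(accumulate(reversed(freqs)))[::-1]

print("less than:", less_than)
print("more than:", more_than)
```

The last "less than" value and the first "more than" value both equal the total number of observations.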
  • 23. Example

      Time (hours)   True limits    Mid-point   Frequency
      10-14           9.5 – 14.5       12            8
      15-19          14.5 – 19.5       17           28
      20-24          19.5 – 24.5       22           27
      25-29          24.5 – 29.5       27           12
      30-34          29.5 – 34.5       32            4
      35-39          34.5 – 39.5       37            1
      Total                                         80
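The true limits and mid-points in this example follow mechanically from the class limits. A minimal Python sketch, assuming the data are recorded to the nearest whole hour (so half a unit is 0.5); `class_summary` is an illustrative name, not something defined in the slides.

```python
def class_summary(lower, upper, unit=1):
    """True limits and mid-point of a class with the given class limits.

    `unit` is the precision of the recorded data; half a unit is
    subtracted from the lower limit and added to the upper limit.
    """
    half = unit / 2
    return {
        "true_limits": (lower - half, upper + half),
        "mid_point": (lower + upper) / 2,
    }


class_limits = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34), (35, 39)]
summaries = [class_summary(lo, hi) for lo, hi in class_limits]

for (lo, hi), s in zip(class_limits, summaries):
    low_true, high_true = s["true_limits"]
    print(f"{lo}-{hi}: true limits {low_true} to {high_true}, "
          f"mid-point {s['mid_point']}, width {high_true - low_true}")
```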
  • 24. • Class interval (width): the length of the class, given by the difference between its class boundaries; for the first class, 14.5 – 9.5 = 5. • Note: as the sample size increases and the interval width is reduced, the sample distribution resembles the population distribution more closely.
  • 25. – Class intervals should be continuous, non-overlapping, mutually exclusive and exhaustive – Too few intervals result in loss of information – Too many intervals mean that the objective of summarization will not be met – Class intervals should generally be of the same width (sometimes this is impossible) – Open-ended class intervals should be avoided
  • 26. Exercise • Construct a grouped frequency distribution and complete the following table for the Age of patients (years) in a diabetic clinic in Addis Ababa, 2010 4/17/2023 26
  • 27. Age of patients (years) in a diabetic clinic in Addis Ababa, 2010. Table columns to complete: Age group (years); Class limit; Class boundary; Class mid-point; Tally; Frequency (fi); Relative frequency, as fraction and (%); Cumulative frequency (< method and > method); Relative cumulative frequency (< method and > method); Total.
  • 28. METHOD OF DATA PRESENTATION 4/17/2023 28
  • 29. Data table Guidelines for constructing tables • Keep them simple • Limit the number of variables • All tables should be self-explanatory • Include clear title telling what, where and when • Clearly label the rows and columns 29 4/17/2023
  • 30. Cntd… • State clearly the unit of measurement used • Explain codes and abbreviations in the foot- note • Show totals • If data is not original, indicate the source in foot-note 4/17/2023 30
  • 31. Graphical presentation of data • Variety of graph styles can be used to present data. • The most commonly used types of graph are pie charts, bar diagrams, histograms, frequency polygon and scatter diagrams. • The purpose of using a graph is to tell others about a set of data quickly, allowing them to grasp the important characteristics of the data. • In other words, graphs are visual aids to rapid understanding. 31 4/17/2023
  • 32. Importance of graphs • Diagrams have greater attraction than mere figures. • They give delight to the eye, add a spark of interest and as such catch the attention. • They help in deriving the required information in less time and without any mental strain. • They have greater memorizing value than mere figures. • They facilitate comparison.
  • 33. Bar charts • Bar chart: displays the frequency distribution for nominal or ordinal data. • In a bar chart the various categories into which the observations fall are represented along the horizontal axis. • A vertical bar is drawn above each category such that the height of the bar represents either the frequency or the relative frequency of observations within the class. • The vertical axis should always start from 0, but the horizontal axis can start from anywhere. • The bars should be of equal width and should be separated from one another so as not to imply continuity.
  • 34. Figure 1. Bar charts showing frequency distribution of the variable ‘BWT’. (Two panels; x-axis: BWT categories Very low, Low, Normal, Big; y-axes: Freq. and Rel. Freq.)
  • 35. Bar charts for comparison • Multiple bar chart: In order to compare the distribution of a variable for two or more groups, bars are often drawn along side each other for groups being compared in a single bar chart. • Sub division bar chart: If there are different quantities forming the sub-divisions of the totals, simple bars may be sub-divided in the ratio of the various sub-divisions to exhibit the relationship of the parts to the whole. 35 4/17/2023
  • 36. Fig 2. Bar chart indicating categories of birth weight of 9975 newborns grouped by antenatal follow-up of the mothers. (Multiple bar chart; x-axis: BWT categories Low, Normal, Big; y-axis: percent; legend: Yes, No; two series of data labels: 9, 88.9, 2.1 and 7.9, 89, 3.1.)
  • 37. Example: Plasmodium species distribution for confirmed malaria cases, Zeway, 2003 37 4/17/2023
  • 38. Pie chart • Pie chart: displays the frequency distribution for nominal or ordinal data. • In a pie chart the various categories into which the observations fall are represented by sectors of a circle. • Each sector represents either the frequency or the relative frequency of observations within the class, and its angle is proportional to that frequency or relative frequency.
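Because each sector's angle is proportional to its frequency, the angles follow from multiplying each relative frequency by 360 degrees. A minimal Python sketch using the birth weight counts from Table 2; the function name `sector_angles` is an illustrative assumption.

```python
def sector_angles(counts):
    """Angle in degrees of each pie sector, proportional to its frequency."""
    total = sum(counts.values())
    return {category: 360 * freq / total for category, freq in counts.items()}


# Category counts from Table 2 (birth weight of newborns).
bwt_counts = {"Very low": 43, "Low": 793, "Normal": 8870, "Big": 268}
angles = sector_angles(bwt_counts)

for category, angle in angles.items():
    print(f"{category}: {angle:.1f} degrees")
```

The 'Normal' category takes up roughly 320 of the 360 degrees, which is why the smaller categories are hard to read on a pie chart of these data.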
  • 39. Figure 3. Pie charts showing frequency distribution of the variable ‘BWT’. Fig 3(a): pie chart indicating frequency of categories of birth weight (Very low 43, Low 793, Normal 8870, Big 268). Fig 3(b): pie chart indicating relative frequency of categories of birth weight (Very low 0.4, Low 8, Normal 88.9, Big 2.7).
  • 40. Histogram • A histogram is a frequency distribution with continuous class intervals that has been turned into a graph. • Given a set of numerical data, we can obtain an impression of the shape of its distribution by constructing a histogram. • A histogram is constructed by choosing a set of non-overlapping intervals (class intervals) and counting the number of observations that fall in each class.
  • 41. Histograms cont… • The number of observations in each class is called the frequency. Hence histograms are also called frequency distributions • It is necessary that the class intervals be non-overlapping so that each observation falls in one and only one interval. 4/17/2023 41
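The counting step behind a histogram can be sketched in Python as follows. The raw values here are hypothetical, made up purely for illustration; the class boundaries are the true limits 9.5 to 39.5 in steps of 5, and `count_by_class` is an illustrative name.

```python
def count_by_class(values, boundaries):
    """Count how many values fall in each class (lower, upper]."""
    counts = [0] * (len(boundaries) - 1)
    for value in values:
        for i in range(len(counts)):
            if boundaries[i] < value <= boundaries[i + 1]:
                counts[i] += 1
                break
        else:
            # No class contains this value, so the classes are not exhaustive.
            raise ValueError(f"{value} falls outside every class interval")
    return counts


# Hypothetical raw observations (not from the slides).
hours = [12, 17, 15, 23, 38, 10, 19, 26]
boundaries = [9.5, 14.5, 19.5, 24.5, 29.5, 34.5, 39.5]

class_counts = count_by_class(hours, boundaries)
print(class_counts)
```

Because the intervals do not overlap, every observation is counted exactly once, so the class counts add up to the number of observations.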
  • 42. Histograms cont… • Except for the first and last classes, class intervals are usually chosen to be of equal width. If this is not the case, the histogram could give a misleading impression of the shape of the data. • In drawing the histogram, smoothing of the class intervals is an important point: we subtract 0.5 from the lower limit and add 0.5 to the upper limit of each interval.
  • 43. Example: distribution of the age of women at the time of marriage

      Age group   No. of women
      15-19            11
      20-24            36
      25-29            28
      30-34            13
      35-39             7
      40-44             3
      45-49             2
  • 44. Age of women at the time of marriage. (Histogram; x-axis: age group, with class boundaries 14.5-19.5, 19.5-24.5, 24.5-29.5, 29.5-34.5, 34.5-39.5, 39.5-44.5, 44.5-49.5; y-axis: no. of women.)
  • 45. Fig 5. A histogram displaying frequency distribution of birth weight of newborns at Tikur Anbessa Hospital. (x-axis: birth weight; Mean = 3126, Std. Dev = 502.34, N = 9975.)
  • 46. Frequency polygons • Instead of drawing bars for each class interval, sometimes a single point is drawn at the mid point of each class interval and consecutive points joined by straight line. • Graphs drawn in this way are called frequency polygons . • Frequency polygons are superior to histograms for comparing two or more sets of data. 46 4/17/2023
  • 47. Fig. 6. Frequency polygon of birth weight of 9975 newborns at Tikur Anbessa Hospital for males and females. (x-axis: birth weight; y-axis: %; legend, SEX: Males, Females.)
  • 48. Box and whisker plot • A box and whisker plot is another way to display information when the objective is to illustrate certain locations in the distribution and its skewness. • It can be used to display a set of discrete or continuous observations using a single vertical axis; only certain summaries of the data are shown.
  • 49. Box plot cont... • A box is drawn with the top of the box at the third quartile (75%) and the bottom at the first quartile (25%). • The location of the median (50%) of the distribution is indicated with a horizontal line in the box. • Finally, straight lines, or whiskers, are drawn from the centre of the top of the box to the largest observation and from the centre of the bottom of the box to the smallest observation.
  • 50. Box cont.... The box plot is then completed: • Draw a vertical bar from the upper quartile to the largest non-outlying value in the sample. • Draw a vertical bar from the lower quartile to the smallest non-outlying value in the sample. • Values that lie outside the box (the interquartile range, IQR = P75 – P25) but are not outliers are covered by the whiskers on the plot.
  • 51. Box plots are useful for comparing two or more groups of observations 51 4/17/2023
  • 52. Drawing a box-and-whisker plot • Raw data: 35, 29, 44, 72, 34, 64, 41, 50, 54, 104, 39, 58 • Ordered data: 29, 34, 35, 39, 41, 44, 50, 54, 58, 64, 72, 104 • Median = (44 + 50)/2 = 47 = Q2 • Q1 = 37, Q3 = 61, Min = 29, Max = 104
  • 53. Box plot example. (Box plot on a 0 to 110 scale: Min = 29, Q1 = 37, Q2 = 47, Q3 = 61, Max = 104.)
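The five values needed for this box plot can be computed with a short Python sketch. Quartile conventions differ between textbooks and software; the version below assumes the "median of each half" convention, which reproduces the values quoted for this example. The function name is illustrative.

```python
from statistics import median


def five_number_summary(values):
    """Min, Q1, median, Q3 and max, with quartiles as medians of the halves."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # For an odd number of values the middle one belongs to neither half.
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return {
        "min": data[0],
        "q1": median(lower),
        "median": median(data),
        "q3": median(upper),
        "max": data[-1],
    }


raw = [35, 29, 44, 72, 34, 64, 41, 50, 54, 104, 39, 58]
summary = five_number_summary(raw)
print(summary)
```

Other quartile definitions (for example, interpolated percentiles) can give slightly different Q1 and Q3 for the same data.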
  • 54. Scatter plot Most studies in medicine involve measuring more than one characteristic, and graphs displaying the relationship between two characteristics are common in literature. When both the variables are qualitative then we can use a multiple bar graph. When one of the characteristics is qualitative and the other is quantitative, the data can be displayed in box and whisker plots 54 4/17/2023
  • 55. Scatter plot …. • For two quantitative variables we use bivariate plots (also called scatter plots or scatter diagrams). • A scatter plot is used to see whether a relationship exists between the two measures. • A scatter diagram is constructed by drawing X- and Y-axes. • Each point or dot represents a pair of values measured for a single study subject. (Figure label: positive relation.)
  • 56. Scatter plots and types of correlation. (Figure: negative correlation; as x increases, y decreases; x = hours of training, y = number of accidents.)
  • 57. Scatter plots and types of correlation. (Figure: positive correlation; as x increases, y increases; x = SAT score (axis label: Math SAT), y = GPA.)
  • 58. Scatter plots and types of correlation. (Figure: no linear correlation; x = height, y = IQ.)
  • 59. Scatter diagram… 1. Direction of relationship: positive or negative. (Figure: two X-Y panels.)
  • 60. 2. Form of relationship: linear or curvilinear. (Figure: two X-Y panels.)
  • 61. 3. Degree of relationship: strong or weak. (Figure: two X-Y panels.)
  • 62. Line graph • Useful for assessing the trend of a particular situation over time, e.g. monitoring the trend of epidemics. • The time, in weeks, months or years, is marked along the horizontal axis. • Values of the quantity being studied are marked on the vertical axis. • Values for each category are connected by a continuous line. • Sometimes two or more lines are drawn on the same graph using the same scale so that the plotted lines are comparable.
  • 63. No. of microscopically confirmed malaria cases by species and month at Zeway malaria control unit, 2003. (Line graph; x-axis: months, Jan to Dec; y-axis: no. of confirmed malaria cases; series: Positive, P. falciparum, P. vivax.)
  • 64. Line graph cont.. • The following graph shows the level of zidovudine (AZT) in the blood of HIV/AIDS patients at several times after administration of the drug, for patients with normal fat absorption and patients with fat malabsorption. • A line graph can also be used to depict the relationship between two continuous variables, like a scatter diagram.
  • 65. Line graph cont….. Response to administration of zidovudine in two groups of AIDS patients in hospital X, 1999. (Line graph; x-axis: time since administration (min.); y-axis: blood zidovudine concentration; series: fat malabsorption, normal fat absorption.)
  • 66. Choosing graphs

      Type of data / purpose   Appropriate graphs
      Metric/numerical         Histogram (one continuous var)
                               Frequency polygon (one/more cont. var)
                               Cumulative frequency polygon (ogive curve)
                               Box and whisker (one cont. and one cat. var)
                               Stem and leaf (one cont. var)
                               Scatter (two cont. var)
      Categorical              Bar (one/more cat. var; simple/multiple)
                               Pie (one cat. var)
      Trend                    Line (one cont. and one cat. var, or two cont.)

Editor's Notes

  1. This is so because the impression left by the diagram is of a lasting nature.