BIOSTATISTICS
PRESENTED BY,
DR. ANJU MATHEW. K
FIRST YEAR MDS
DEPARTMENT OF PERIODONTICS
•Statistics is a very broad subject, with applications in a vast number of different
fields.
•In general, statistics is the methodology for collecting, analyzing, interpreting and
drawing conclusions from information.
•Statistics is the methodology which scientists and mathematicians have developed
for interpreting and drawing conclusions from collected data
DEFINITION
Statistics consists of a body of methods for collecting and analyzing data. (Agresti &
Finlay, 1997)
Statistics is much more than just the tabulation of numbers and the graphical
presentation of these tabulated numbers.
Statistics is the science of gaining information from numerical and categorical data
Statistical methods can be used to find answers to questions like:
• What kind and how much data need to be collected?
• How should we organize and summarize the data?
• How can we analyse the data and draw conclusions from it?
• How can we assess the strength of the conclusions and evaluate their
uncertainty?
BIOSTATISTICS
•Deals with the statistical methodologies involved in biological
sciences
•As medicine is a branch of biology, medical statistics is a branch of
biostatistics
SAMPLING
•Sampling is the process or technique of selecting a sample of appropriate
characteristics and adequate size
•Sampling is of two types
1.Probability sampling
2.Nonprobability sampling
In PROBABILITY SAMPLING, every member of the population has a known (often equal)
chance of being selected
In NONPROBABILITY SAMPLING, samples are collected in a way that
does not give all the units in the population an equal chance of being selected
TYPES OF SAMPLING TECHNIQUES
Probability sampling | Non probability sampling
1.Simple random | 1.Accidental/convenience
2.Stratified random | 2.Judgement/purposive
3.Systematic random | 3.Network/snowball
4.Area/cluster sampling | 4.Quota sampling
 | 5.Dimensional sampling
 | 6.Mixed sampling
Simple random sampling
Every member of the population has an equal chance of being
included in the sample. This type of sampling is used when the
population is homogeneous
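A minimal Python sketch of simple random sampling, assuming a hypothetical sampling frame of 100 patient IDs (the IDs, sample size and seed are illustrative and not from the slides). The standard library's `random.sample` draws without replacement, so each member has the same chance of inclusion.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Draw sample_size members without replacement; every member is equally likely."""
    rng = random.Random(seed)  # the seed only makes the illustration reproducible
    return rng.sample(list(population), sample_size)

# Hypothetical sampling frame: patient IDs 1..100
patient_ids = range(1, 101)
sample = simple_random_sample(patient_ids, 10, seed=42)
print(sample)
```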
Stratified random sampling
Divides the population into groups called strata. The division is by some
characteristic, not geographically. For example, the population might be
separated into males and females, and a random sample is then drawn from each stratum.
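Stratified sampling could be sketched as follows. The slides do not say how many units to draw per stratum, so this sketch assumes proportional allocation (the same fraction from every stratum); the function name, the fraction and the 60/40 population are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Group units into strata, then draw a simple random sample from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # assumed proportional allocation
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 60 males and 40 females
people = [("M", i) for i in range(60)] + [("F", i) for i in range(40)]
sample = stratified_sample(people, stratum_of=lambda p: p[0], fraction=0.1, seed=1)
males = sum(1 for p in sample if p[0] == "M")
females = sum(1 for p in sample if p[0] == "F")
print(males, females)
```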
Systematic random sampling
Sample members from a larger population are selected
according to a random starting point but with a fixed,
periodic interval. This interval, called the sampling
interval, is calculated by dividing the population size by
the desired sample size.
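The rule above (random start, then every k-th unit, with k = population size / sample size) can be sketched in Python. The population of 100 numbered units and the seed are assumptions for illustration; integer division is used when the population size is not an exact multiple of the sample size.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Pick a random start within the first interval, then every k-th unit."""
    units = list(population)
    interval = len(units) // sample_size              # sampling interval k = N / n
    start = random.Random(seed).randrange(interval)   # random starting point
    return [units[start + i * interval] for i in range(sample_size)]

# Hypothetical population of 100 numbered units, sample of 10, so k = 10
sample = systematic_sample(range(1, 101), 10, seed=7)
print(sample)
```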
Area or cluster sampling
Cluster sampling is accomplished by dividing the
population into groups, usually geographically. These
groups are called clusters or blocks. The clusters are
randomly selected, and each element in the selected
clusters is used. For example, in a dental survey in
schools, each section in a class could be used as a
cluster
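A short sketch of cluster sampling for the school example: whole sections are chosen at random and every student in a chosen section is included. The section names, student names and seed are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and keep every element inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # select clusters, not individuals
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical class sections (clusters) and their students (elements)
sections = {
    "5A": ["Asha", "Binu", "Cyril"],
    "5B": ["Deepa", "Eldho"],
    "5C": ["Fathima", "George", "Hari"],
    "5D": ["Indu", "Jacob"],
}
selected = cluster_sample(sections, 2, seed=3)
for name, students in selected.items():
    print(name, students)
```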
Accidental or convenience sampling
Sampling is very easy to do and often used by health
professionals. You examine the people you
are able to contact or get access to. It is inexpensive and
less time-consuming
Judgement or purposive sampling
Researchers rely on their own judgment when
choosing members of the population to participate in
their study
Network or snowball sampling
A multistage technique. The researcher must first
identify and interview a few subjects who meet the requisite
criteria. These subjects are then asked to identify
others who meet the same criteria; these persons are in turn asked
to identify others, until a satisfactory sample is
obtained
Quota sampling
Researchers create a sample involving individuals
that represent a population. Researchers choose these
individuals according to specific traits or qualities
Dimensional sampling
An extension of quota sampling. The researcher takes into account several characteristics (e.g.
gender, income, residence and education). The researcher must ensure that there is at least one
person in the study representing each of the chosen characteristics
Mixed sampling designs
Combine both probability and nonprobability sampling procedures
USES OF SAMPLING
•May be the only way to obtain information about a population
•The need to reduce labour and hence cost
•Savings in time, manpower and money
ERRORS IN SAMPLING
•Two types of errors arise in sampling
1.Sampling error
2.Nonsampling error
•Sampling error
Errors that creep in due to the sampling process; they could arise because of
faulty sample design or the small size of the sample
•Non sampling errors
a) Coverage error: due to non cooperation of the informant
b) Observational error: due to interviewer's bias, imperfect
experimental technique, or the interaction of both
c) Processing error: due to errors in statistical analysis
DATA
•Data analysis is the cornerstone in reporting research findings
•Data is a set of values of one or more variables recorded on one or
more individuals
TYPES OF DATA
1. Primary data
2. Secondary data
Primary data
Data obtained directly from an individual
ADVANTAGES
1. Precise information
2. Reliable
DISADVANTAGES
1.Time consuming
2.Expensive
Secondary data
It is obtained from outside sources, e.g. hospital records, school registers
VARIABLES
A variable is a state, condition, concept or event whose value is free to vary
within the population
TYPES OF VARIABLES
1.Quantitative
-Discrete
-Continuous
2.Qualitative
-Categorical
-Ordered
METHOD OF COLLECTION OF DATA
1. Questionnaires
2. Surveys
3. Records
4. Interviews
PRESENTATION OF DATA
•Statistical data once collected must be arranged purposively in order
to bring out the important points clearly and strikingly
•The manner in which statistical data is presented is of utmost
importance
METHODS OF PRESENTING DATA
I. Tabulation
Simple tables
Frequency distribution table
II. Charts and diagrams
Bar charts
a. Simple bar chart
b. Multiple bar chart
c. Component bar chart
Histogram
a. Frequency polygon
b. Frequency curve
Pie chart
Pictogram
III. Line diagrams
IV. Statistical maps
TABULATION
•Tables are devices for presenting data
•Tabulation is the first step before the data is used for analysis or interpretation
GENERAL PRINCIPLES BEFORE DESIGNING TABLES
1.The table should be numbered, e.g. Table 1, Table 2, etc.
2.A title must be given to each table. The title must be brief and self-explanatory
3.The headings of columns and rows should be clear and concise
4.The data must be presented according to size or importance, chronologically, alphabetically or
geographically
5.If percentages or averages are to be compared, they should be placed as close together as possible
6.No table should be too large
7.Footnotes may be given where necessary, providing explanatory notes or additional
information
SIMPLE TABLES
FREQUENCY DISTRIBUTION TABLE
The data is first split up into convenient groups (class intervals) and the number of
items (frequency) occurring in each group is shown
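As an illustration, a frequency distribution table can be built by assigning each value to a class interval and counting. The ages below are made-up data, and the class width of 10 is an arbitrary choice; the sketch assumes integer values.

```python
from collections import Counter

def frequency_table(values, width, start=0):
    """Return (lower limit, upper limit, frequency) for each class interval."""
    counts = Counter((v - start) // width for v in values)
    return [
        (start + i * width, start + (i + 1) * width - 1, counts[i])
        for i in range(min(counts), max(counts) + 1)
    ]

# Hypothetical ages of ten patients
ages = [3, 7, 12, 15, 18, 21, 22, 25, 29, 34]
table = frequency_table(ages, width=10)
for lower, upper, freq in table:
    print(f"{lower}-{upper}: {freq}")
```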
CHARTS AND DIAGRAMS
•Useful method of presenting simple statistical data
•They have a powerful impact on the imagination of people, so they are a powerful
medium for expressing statistical data
ADVANTAGES
1.Diagrams are better retained in memory than tables
2.If the diagrams are drawn simply, the impact on the reader is much higher
DISADVANTAGES
1.Details of the original data may be lost in charts and diagrams
BAR CHARTS
A diagram of columns or bars. The height of each bar represents the value of the
particular data item in question
SIMPLE BAR CHART
MULTIPLE BAR CHART
COMPONENT BAR CHART
When there are two sets of similar information they can be contrasted by
displaying both sets on same graph
HISTOGRAMS
A special sort of bar chart. The successive
groups of data are linked in a definite
numerical order
Frequency polygon
A frequency distribution may also be
represented diagrammatically by the
frequency polygon
It is obtained by joining the mid points of the
histogram blocks
Frequency curve
The frequency curve for a distribution can be
obtained by drawing a smooth and free
hand curve through the midpoints
PIE CHARTS
Another way of displaying data.
PICTOGRAMS
Data are represented diagrammatically
by pictorial symbols
LINE GRAPH
Used when the quantity is a continuous variable
STATISTICAL MAPS
When statistical data refer to geographic or
administrative areas, they are presented either as
shaded maps or dot maps
USES OF DATA
•In designing a health care programme
•In evaluating the effectiveness of an ongoing programme
•In determination of needs of a specific population
•In evaluating the scientific accuracy of a journal article
MEASURES OF CENTRAL TENDENCY
•Central tendency: It is the value around which the other values are
distributed
•The main objective of measure of central tendency is to condense the
entire mass of data and to facilitate comparison
•Arithmetic mean
•Median
•Mode
MEAN
•This measure is the arithmetic average or arithmetic mean
•It is obtained by summing up all observations and dividing by the total number of observations
•Eg: No. of days patients stayed in hospital under Dr. A is: 2,4,3,4,6,6,2,5
•Mean ($\bar{x}$) = Sum of all observations/Number of observations = 32/8 = 4
•ADVANTAGES
•Easy to calculate
•Easy to understand
•Utilize entire data
•Amenable to algebraic manipulation
•Affords good comparison
DISADVANTAGES
•Mean is affected by extreme values. In such cases it can lead to a misleading interpretation
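A small check of the worked example using Python's standard `statistics` module. The extra 60-day stay is a hypothetical value, added only to show how a single extreme observation pulls the mean.

```python
from statistics import mean

stay_days = [2, 4, 3, 4, 6, 6, 2, 5]   # data from the example above
print(mean(stay_days))                 # 32 / 8

# Hypothetical extra patient with a 60-day stay, to show the effect of an extreme value
with_outlier = stay_days + [60]
print(round(mean(with_outlier), 2))    # 92 / 9
```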
MEDIAN
The data are arranged in ascending or descending order of magnitude and the value of the middle observation is located. With an even number of observations, the median is the average of the two middle values
Eg 1: No. of days patients stayed in hospital under Dr. A is: 2,4,3,4,6,6,2,5
Ascending order: 2,2,3,4,4,5,6,6
Median = (4+4)/2 = 8/2 = 4
Eg 2: No. of days patients stayed in hospital under Dr. A is: 2,4,3,4,6,6,2
Descending order: 6,6,4,4,3,2,2
Median: 4
ADVANTAGES
•It is more representative than the mean
•It does not depend on every observation
•It is not affected by extreme values
•DISADVANTAGES
•Data has to be arranged before calculation. Hence mean is easier to use as a sample statistic than a population parameter
•It requires more complex statistical procedures than the mean
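The two median examples can be reproduced with `statistics.median`, which sorts the data and averages the two middle values when the number of observations is even. The variable names are illustrative.

```python
from statistics import median

even_case = [2, 4, 3, 4, 6, 6, 2, 5]   # Eg 1: eight observations
odd_case = [2, 4, 3, 4, 6, 6, 2]       # Eg 2: seven observations

print(sorted(even_case), median(even_case))               # middle pair is 4 and 4
print(sorted(odd_case, reverse=True), median(odd_case))   # single middle value
```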
MODE
Value which occurs with the greatest frequency
Eg 1 : No. of days patients stayed in hospital under Dr. A is: 2,4,3,1,6,6,8,5
Mode: 6 i.e. the distribution is unimodal
Eg 2 : No. of days patients stayed in hospital under Dr. A is: 2,4,3,4,6,6,8,5
Mode: 4 & 6 i.e. the distribution is bimodal
ADVANTAGES
•It eliminates extreme variation
•Easily located by mere inspection
•Easy to understand
DISADVANTAGES
•Exact location is uncertain
•It is not exactly defined
•With a small number of cases there may be no mode at all, because no value may be repeated; therefore it is seldom used in
medical or biological statistics
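The mode examples can be checked with `statistics.multimode` (Python 3.8 or later), which returns every value tied for the highest frequency. The third list is a hypothetical data set with no repeated value, where every observation ties and no useful mode exists.

```python
from statistics import multimode

unimodal = [2, 4, 3, 1, 6, 6, 8, 5]     # first example: 6 occurs twice
bimodal = [2, 4, 3, 4, 6, 6, 8, 5]      # second example: 4 and 6 each occur twice
no_repeats = [2, 4, 3, 1, 6, 7, 8, 5]   # hypothetical: no value is repeated

print(multimode(unimodal))
print(multimode(bimodal))
print(len(multimode(no_repeats)))       # all eight values tie
```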
MEASURES OF DISPERSION
•Measures of dispersion help to show how widely the observations are spread on
either side of the average
•Dispersion is the degree of spread or variation of the variable about a central
value
•The range
•The mean deviation
•The standard deviation
PURPOSE OF MEASURES OF DISPERSION
•To study the variability of data
•For accounting the variability in data
THE RANGE
•The difference between the highest and lowest values in a given sample.
•Range = Xmax - Xmin
ADVANTAGES
•Easy to calculate
DISADVANTAGES
•Unstable
•It is affected by one extremely high or low score
•It is of no practical importance because it does not indicate anything about the
dispersion of values between the two extreme values
THE MEAN DEVIATION
•It is the average of the absolute deviations from the arithmetic mean
•It is one way of measuring how closely the individual scores in the data set
cluster around the mean. It is calculated as
• $\text{M.D.} = \sum |x - \bar{x}| / n$
•Where $\sum$ (sigma) is the sum of, $x$ is the value of each observation in the data, $\bar{x}$
is the arithmetic mean and $n$ is the number of observations in the data.
•Eg : No. of days patients stayed in hospital under Dr. A is: 2,4,3,4,6,6,2,5
•$\bar{x} = 32/8 = 4$
• $(x - \bar{x}) = -2, 0, -1, 0, 2, 2, -2, 1$ ; $\sum (x - \bar{x}) = -2 + 0 - 1 + 0 + 2 + 2 - 2 + 1 = 0$
•The signed deviations always sum to zero, which will obviously not reflect the degree of dispersion. To solve this problem
we can take absolute values (M.D. = 10/8 = 1.25) or square each deviation score
THE MEAN DEVIATION
• $(x - \bar{x})^2 = 4, 0, 1, 0, 4, 4, 4, 1$ ; $\sum (x - \bar{x})^2 = 18$
• $\sum (x - \bar{x})^2 / n = 18/8 = 2.25$
• The resulting value is the variance.
• The variance is the average of the squared deviations from the mean of
a set of scores.
• i.e. $\sum (x - \bar{x})^2 / n$
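The arithmetic for Dr. A's data can be traced step by step in Python. This is only a sketch of the calculation; the variable names are illustrative, and the absolute-value form of the mean deviation is shown alongside the squared form that gives the variance.

```python
stay_days = [2, 4, 3, 4, 6, 6, 2, 5]
n = len(stay_days)
mean = sum(stay_days) / n                      # 32 / 8

deviations = [x - mean for x in stay_days]     # signed deviations from the mean
sum_of_deviations = sum(deviations)            # cancels out to zero

mean_abs_deviation = sum(abs(d) for d in deviations) / n   # 10 / 8
variance = sum(d ** 2 for d in deviations) / n             # 18 / 8

print(sum_of_deviations, mean_abs_deviation, variance)
```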
STANDARD DEVIATION
•Most frequently used measure of deviation
•Defined as root mean square deviation
•Denoted by the Greek letter sigma ($\sigma$) or by the initials S.D.
•S.D. is the square root of the variance
•$\text{S.D.} = \sqrt{\sum (x - \bar{x})^2 / n}$
•Therefore for Dr. A, $\text{S.D.} = \sqrt{2.25} = 1.5$
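The same result can be obtained with the standard library. `statistics.pstdev` divides by n, as the slide formula does; `statistics.stdev` divides by n - 1 instead, a convention the slides do not discuss and which is shown here only for contrast.

```python
from math import sqrt
from statistics import pstdev, stdev

stay_days = [2, 4, 3, 4, 6, 6, 2, 5]

population_sd = pstdev(stay_days)   # divides by n, matching the slide formula
sample_sd = stdev(stay_days)        # divides by n - 1 (not covered in the slides)

print(population_sd)
print(round(sample_sd, 3))          # sqrt(18 / 7)
```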
TESTS OF SIGNIFICANCE
•Whenever two sets of observations are to be compared, it becomes
essential to find out whether the difference observed between the two
groups is because of sampling variation or any other factor
•The methods by which this is done are called tests of significance
1. Standard error test for large samples
2. Chi square test
3. Standard error test for small samples
STANDARD ERROR TEST FOR LARGE SAMPLES
•A sample is considered to be large when it has more than 30
observations
•When the difference between any two large samples in terms of means
or proportions needs to be tested, the formulae used are as follows
•(a). Standard error of mean
•The standard error of mean gives the standard deviation of the means of
several samples from the same population. Standard error can be
estimated from a single sample.
•Standard error (S.E.) of mean $= \text{S.D.} / \sqrt{n}$
•(b). Standard error (S.E.) of proportion $= \sqrt{pq/n}$
•Where $p$ is the proportion of the sample in which the event occurs, $q = 1 - p$ is the
proportion in which it does not occur, and $n$ is the sample size.
•(c). Standard error of difference between two means
•It is used to find out whether the difference between the means of two groups
is significant to indicate that the samples represent two different universes.
•Standard error between means $= \sqrt{\text{S.D.}_1^2/n_1 + \text{S.D.}_2^2/n_2}$
•(d). Standard error of difference between proportions
•It is used to find out whether the difference between the proportions of two
groups is significant or has occurred by chance.
•Standard error between proportions $= \sqrt{p_1 q_1/n_1 + p_2 q_2/n_2}$
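The four standard error formulae (a) to (d) translate directly into small Python functions. The function names and all input values below are hypothetical, and q is taken as 1 - p in the usual way.

```python
from math import sqrt

def se_mean(sd, n):
    """(a) Standard error of a mean."""
    return sd / sqrt(n)

def se_proportion(p, n):
    """(b) Standard error of a proportion, with q = 1 - p."""
    return sqrt(p * (1 - p) / n)

def se_diff_means(sd1, n1, sd2, n2):
    """(c) Standard error of the difference between two means."""
    return sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)

def se_diff_proportions(p1, n1, p2, n2):
    """(d) Standard error of the difference between two proportions."""
    return sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Hypothetical inputs, chosen only to keep the arithmetic easy to follow
print(se_mean(1.5, 36))                         # 1.5 / 6
print(se_proportion(0.2, 100))                  # sqrt(0.16 / 100)
print(se_diff_means(3.0, 36, 4.0, 64))          # sqrt(9/36 + 16/64)
print(se_diff_proportions(0.2, 100, 0.5, 100))  # sqrt(0.0016 + 0.0025)
```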
CHI SQUARE TEST
It is an alternative method of testing the significance of the difference between two proportions
Eg: Consider two groups, one of which has received oral hygiene instructions while the other has not received any
instructions, where it is desired to test whether the occurrence of new cavities is associated with the instructions.
STEPS
1. State the null hypothesis
Set up a null hypothesis that “there is no difference between the two” and then proceed to test the hypothesis.
•Here we state the null hypothesis as ‘there is no association between oral hygiene instructions received
and the occurrence of new cavities’
Occurrence of new cavities by group
Group | Present | Absent | Total
Number who received instructions | 10 | 40 | 50
Number who did not receive instructions | 32 | 8 | 40
Total | 42 | 48 | 90
2. Then the $\chi^2$ statistic is calculated as,
$\chi^2 = \sum (O - E)^2 / E$
Where O is the observed frequency and E is the expected frequency
Expected frequency (E) = Row total × Column total / Grand total
Among those who received instructions
Expected number with new cavities = 42 × 50/90 = 23.3
Expected number without new cavities = 48 × 50/90 = 26.7
Among those who did not receive instructions
Expected number with new cavities = 42 × 40/90 = 18.7
Expected number without new cavities = 48 × 40/90 = 21.3
Group | New cavities present | New cavities absent
Number who received instructions | O = 10, E = 23.3, O − E = −13.3 | O = 40, E = 26.7, O − E = 13.3
Number who did not receive instructions | O = 32, E = 18.7, O − E = 13.3 | O = 8, E = 21.3, O − E = −13.3
3. Applying the $\chi^2$ test,
$\chi^2 = \sum (O - E)^2 / E$
$= (-13.3)^2/23.3 + (13.3)^2/26.7 + (13.3)^2/18.7 + (-13.3)^2/21.3$
$= 7.59 + 6.63 + 9.46 + 8.30 = 31.98 \approx 32$
4. Finding the degrees of freedom (d.f.)
It depends on the number of columns and rows in the original table
d.f. = (c − 1) × (r − 1)
Where c = number of columns ; r = number of rows
d.f. = (2 − 1) × (2 − 1) = 1
5. Probability tables
Depending upon the value of “P” the conclusion is drawn.
• In the probability table, with 1 degree of freedom, the $\chi^2$ value for a probability (P) of 0.05 is 3.84. Since the
observed value of about 32 is much higher, the null hypothesis is rejected and it is concluded that caries
occurrence differs between the two groups, with caries being lower in those who received instructions.
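The steps above can be sketched in plain Python for the oral hygiene table. The function name is illustrative. Because the code keeps unrounded expected frequencies, the value it produces may differ somewhat from a hand calculation carried out with rounded intermediate figures.

```python
def chi_square(table):
    """Return the chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total  # E = row x column / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Rows: received / did not receive instructions; columns: new cavities present / absent
observed = [[10, 40], [32, 8]]
chi2, df = chi_square(observed)

CRITICAL_0_05_DF1 = 3.84  # table value for P = 0.05 at 1 degree of freedom, as quoted in the slides
print(round(chi2, 2), df, chi2 > CRITICAL_0_05_DF1)
```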
Z test
It is used to test the significance of difference in means for large samples (>30)
The pre-requisites to apply Z test for means are,
1. The sample must be randomly selected
2. The data must be quantitative
3. The variable is assumed to follow a normal distribution in the population
4. Sample should be larger than 30
Z = (Observation − Mean) / Standard deviation
$Z = (x - \bar{x}) / \text{SD}$
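A minimal illustration of the Z formula, reusing the mean (4) and S.D. (1.5) from Dr. A's example purely for the arithmetic. The observation of 7 days is hypothetical, and the example ignores the large-sample prerequisite listed above.

```python
def z_score(observation, mean, sd):
    """Number of standard deviations by which an observation lies from the mean."""
    return (observation - mean) / sd

# Hypothetical observation of 7 days against mean 4 and S.D. 1.5
z = z_score(7, 4, 1.5)
print(z)  # (7 - 4) / 1.5
```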
STANDARD ERROR TEST FOR SMALL SAMPLES
•A sample is considered to be small if it has less than 30 observations.
•The test applied is called the ‘t’ test
•Designed by W. S. Gosset, whose pen name was ‘Student’. Hence this test is
called Student’s t-test
•When the investigation compares observations carried out on the
same individuals, say before and after a certain experiment, such a comparison is
called a paired comparison
•When the observations are carried out in two independent samples and their values
are compared, it is known as an unpaired comparison
CRITERIA FOR APPLYING ‘t’ TEST
•The sample must be randomly selected
•The data must be quantitative
•The variable is assumed to follow a normal distribution in population
•Sample size should be less than 30
t- TEST FOR PAIRED COMPARISON
1. As per the null hypothesis, assume that there is no real difference between the means of
the two samples
2. The difference between the before and after readings is calculated for
each individual
3. The mean and standard deviation (SD) of these differences are calculated
4. The standard error of this mean difference is calculated by the formula $SE = SD/\sqrt{n}$
5. t is calculated by the formula, t = Mean difference / Standard error of the difference
6. Find the degrees of freedom (d.f.) = (n − 1), where n is the number of pairs of observations
7. From the t-distribution table, the probability of t corresponding to (n − 1) degrees
of freedom is noted down
8. If the probability is more than 0.05, the difference observed has no significance, because it can
be due to chance
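Steps 2 to 6 of the paired comparison can be sketched as below. The before and after scores are made-up data, and the sketch assumes the standard deviation of the differences is computed with an n - 1 divisor (`statistics.stdev`), which the slides do not specify. The resulting t is then looked up in a t table at the returned degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t, degrees of freedom) for paired before/after readings."""
    diffs = [b - a for b, a in zip(before, after)]   # step 2: per-individual differences
    n = len(diffs)
    se = stdev(diffs) / sqrt(n)                      # steps 3-4: SD of differences, then SE
    t = mean(diffs) / se                             # step 5
    return t, n - 1                                  # step 6: d.f. = n - 1

# Hypothetical plaque scores for five patients before and after an intervention
before = [5, 6, 7, 8, 6]
after = [3, 5, 5, 6, 4]
t, df = paired_t(before, after)
print(round(t, 2), df)
```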
The unpaired ‘t’ test
1. As per the null hypothesis, assume that there is no real difference between the means of the two
samples.
2. Find the observed difference between the means of the two samples ($\bar{x}_1 - \bar{x}_2$)
3. Calculate the standard error of the difference between the two means.
$SE = SD \sqrt{1/n_1 + 1/n_2}$, where SD is the pooled standard deviation of the two samples
4. Calculate the ‘t’ value
$t = (\bar{x}_1 - \bar{x}_2) / SE$
5. Determine the pooled degrees of freedom from the formula
d.f. = (n1 − 1) + (n2 − 1) = n1 + n2 − 2
6. Compare the calculated value with the table value (table of ‘t’) at the particular degrees of freedom to find the level of
significance.
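A sketch of the unpaired comparison in Python. It assumes the standard error uses the pooled standard deviation of the two samples (pooled variance weighted by each sample's degrees of freedom), which is the usual form of Student's t-test; the two groups of scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def unpaired_t(sample1, sample2):
    """Return (t, degrees of freedom) for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2                                   # pooled degrees of freedom
    pooled_var = ((n1 - 1) * variance(sample1) +
                  (n2 - 1) * variance(sample2)) / df   # assumed pooled variance
    se = sqrt(pooled_var) * sqrt(1 / n1 + 1 / n2)      # SE of the difference between means
    t = (mean(sample1) - mean(sample2)) / se
    return t, df

# Hypothetical scores in two independent groups of five patients
group1 = [4, 5, 6, 5, 5]
group2 = [2, 3, 4, 3, 3]
t, df = unpaired_t(group1, group2)
print(round(t, 2), df)
```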
CONCLUSION
•Biostatistical techniques can assure that the results found in a
study are not merely because of chance.
•In every aspect of life, statistics plays a major role in obtaining
better and more accurate results.
•A well designed and properly conducted study is a basic prerequisite
to arrive at valid conclusions.


Biostatistics

  • 1. BIOSTATISTICS P R E S E N T E D B Y , D R . A N J U M A T H E W . K F I R S T Y E A R M D S D E P A R T M E N T O F P E R I O D O N T I C S
  • 2. •Statistics is a very broad subject, with applications in a vast number of different fields. • In generally one can say that statistics is the methodology for collecting, analyzing, interpreting and drawing conclusions from information. •Statistics is the methodology which scientists and mathematicians have developed for interpreting and drawing conclusions from collected data
  • 3. DEFINITION Statistics consists of a body of methods for collecting and analyzing data. (Agresti & Finlay, 1997) Statistics is much more than just the tabulation of numbers and the graphical presentation of these tabulated numbers. Statistics is the science of gaining information from numerical and categorical data Statistical methods can be used to find answers to the questions like: • What kind and how much data need to be collected? • How should we organize and summarize the data? • How can we analyse the data and draw conclusions from it? • How can we assess the strength of the conclusions and evaluate their uncertainty?
  • 4. BIOSTATISTICS •Deals with the statistical methodologies involved in biological sciences •As medicine is a branch of biology, medical statistics is a branch of biostatistics
  • 5. SAMPLING •Sampling is the process of technique or selecting a sample of appropriate characteristics and adequate size •Sampling of two types 1.Probability sampling 2.Nonprobability sampling In PROBABILITY SAMPLING -give all the members of a population equal chance of being selected In NONPROBABILITY SAMPLING – samples are collected in a way that does not give all the units in the population equal chances of being selected
  • 6. TYPES OF SAMPLING TECHNIQUES Probability sampling: 1.Simple random 2.Stratified random 3.Systematic random 4.Area/cluster sampling. Non probability sampling: 1.Accidental/convenience 2.Judgement/purposive 3.Network/snowball 4.Quota sampling 5.Dimensional sampling 6.Mixed sampling
  • 7. Simple random sampling Every member of the population has an equal chance of being included in the sample. This type of sampling is used when the population is homogeneous. Stratified random sampling Divides the population into groups called strata. The division is by some characteristic, not geographically. For example, the population might be separated into males and females.
  • 8. Systematic random sampling Sample members from a larger population are selected according to a random starting point but with a fixed, periodic interval. This interval, called the sampling interval, is calculated by dividing the population size by the desired sample size. Area or cluster sampling Cluster sampling is accomplished by dividing the population into groups, usually geographically. These groups are called clusters or blocks. The clusters are randomly selected, and each element in the selected clusters is used. For example, in a dental survey in schools each section in a class could be used as a cluster
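The sampling-interval rule for systematic sampling can be sketched in a few lines of Python. This is only an illustration: the function name `systematic_sample`, the `seed` parameter and the list of patient IDs are assumptions made for the example, and the interval is rounded down when the population size is not an exact multiple of the sample size.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Pick every k-th member after a random start, where k = N // n."""
    n_pop = len(population)
    if not 0 < sample_size <= n_pop:
        raise ValueError("sample_size must be between 1 and the population size")
    interval = n_pop // sample_size      # sampling interval k = N / n (rounded down)
    rng = random.Random(seed)
    start = rng.randrange(interval)      # random starting point within the first interval
    return [population[start + i * interval] for i in range(sample_size)]

patients = list(range(1, 101))           # hypothetical register of 100 patient IDs
sample = systematic_sample(patients, 10, seed=1)
print(sample)                            # 10 IDs spaced exactly 10 apart
```

With 100 patients and a desired sample of 10, the interval is 10, so once the first patient is chosen at random the remaining nine follow automatically.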
  • 9. Accidental or convenience sampling This type of sampling is very easy to do and is often used by health professionals. The researcher examines the people who can be contacted or accessed. It is inexpensive and less time consuming. Judgement or purposive sampling Researchers rely on their own judgment when choosing members of the population to participate in their study
  • 10. Network or snowball sampling A multistage technique. The researcher must first identify and interview a few subjects with the requisite criteria. These subjects are then asked to identify others with the same criteria, and these persons are in turn asked to identify others, until a satisfactory sample is obtained. Quota sampling Researchers create a sample involving individuals that represent a population. Researchers choose these individuals according to specific traits or qualities
  • 11. Dimensional sampling Is an extension to quota sampling. The researcher takes into account several characteristics (e.g. Gender, income, residence and education). The researcher must ensure that there is at least one person in the study representing each of the chosen characteristics Mixed sampling designs Constitute the combination of both probability and nonprobability sampling procedures
  • 12. USES OF SAMPLING •May be the only way to obtain information about a population •The need to reduce labour and hence cost •Savings in time, manpower and money
  • 13. ERRORS IN SAMPLING •Two types of errors arise in sampling 1.Sampling error 2.Nonsampling error •Sampling errors: errors that creep in due to the sampling process; they can arise because of faulty sample design or because the sample is too small •Nonsampling errors a) Coverage error: due to non-cooperation of the informant b) Observational error: due to interviewer's bias, imperfect experimental technique or the interaction of both c) Processing error: due to errors in statistical analysis
  • 14. DATA •Data analysis is the cornerstone in reporting research findings •Data is a set of values of one or more variables recorded on one or more individuals
  • 15. TYPES OF DATA 1. Primary data 2. Secondary data
  • 16. Primary data Data obtained directly from an individual. ADVANTAGES 1.Precise information 2.Reliable DISADVANTAGES 1.Time consuming 2.Expensive. Secondary data Data obtained from outside sources, e.g. hospital records, school registers
  • 17. VARIABLES A variable is a state, condition, concept or event whose value is free to vary within the population. TYPES OF VARIABLES 1.Quantitative -Discrete -Continuous 2.Qualitative -Categorical -Ordered
  • 18.
  • 19. METHOD OF COLLECTION OF DATA 1. Questionnaires 2. Surveys 3. Records 4. Interviews
  • 20. PRESENTATION OF DATA •Statistical data once collected must be arranged purposively in order to bring out the important points clearly and strikingly •The manner in which statistical data is presented is of utmost importance
  • 21. METHODS OF PRESENTING DATA I. Tabulation Simple tables Frequency distribution table II. Charts and diagrams Bar charts a. Simple bar chart b. Multiple bar chart c. Component bar chart Histogram a. Frequency polygon b. Frequency curve Pie chart Pictogram III. Line diagrams IV. Statistical maps
  • 22. TABULATION •Tables are devices for presenting data •Tabulation is the first step before the data is used for analysis or interpretation GENERAL PRINCIPLES FOR DESIGNING TABLES 1.The table should be numbered, e.g. Table 1, Table 2, etc. 2.A title must be given to each table. The title must be brief and self explanatory 3.The headings of columns and rows should be clear and concise 4.The data must be presented according to size or importance, chronologically, alphabetically or geographically 5.If percentages or averages are to be compared they should be placed as close together as possible 6.No table should be too large 7.Foot notes may be given where necessary, providing explanatory notes or additional information
  • 24. FREQUENCY DISTRIBUTION TABLE The data is first split up into convenient groups (class intervals) and the number of items (frequency) occurring in each group is shown
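As a rough illustration of how a frequency distribution table is built, the sketch below groups whole-number values into equal-width class intervals and counts the items in each. The function name `frequency_table`, the interval width of 10 and the list of ages are all hypothetical choices for the example.

```python
from collections import Counter

def frequency_table(values, width, start=0):
    """Group integer values into class intervals of equal width and count each group."""
    counts = Counter((v - start) // width for v in values)
    table = []
    for idx in sorted(counts):
        low = start + idx * width
        table.append((low, low + width - 1, counts[idx]))  # (lower limit, upper limit, frequency)
    return table

ages = [3, 7, 12, 15, 18, 21, 22, 25, 29, 34]   # hypothetical patient ages in years
for low, high, freq in frequency_table(ages, width=10):
    print(f"{low}-{high}: {freq}")
```

Empty class intervals are simply omitted here; a table meant for publication would normally list them with a frequency of zero.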
  • 25. CHARTS AND DIAGRAMS •Useful method of presenting simple statistical data •They have a powerful impact on the imagination of people, so they are a powerful medium for expressing statistical data ADVANTAGES 1.Diagrams are better retained in memory than tables 2.If the diagrams are drawn simply, the impact on the reader is much higher DISADVANTAGES 1.Details of the original data may be lost in charts and diagrams
  • 26. BAR CHARTS A diagram of columns or bars in which the height of each bar represents the value of the particular data in question SIMPLE BAR CHART
  • 28. COMPONENT BAR CHART When there are two sets of similar information they can be contrasted by displaying both sets on the same graph
  • 29. HISTOGRAMS A special sort of bar chart in which the successive groups of data are linked in a definite numerical order. Frequency polygon A frequency distribution may also be represented diagrammatically by the frequency polygon. It is obtained by joining the mid points of the histogram blocks. Frequency curve The frequency curve for a distribution can be obtained by drawing a smooth, free hand curve through the midpoints
  • 30. PIE CHARTS Another way of displaying data, in which a circle is divided into sectors proportional to each category's share of the total. PICTOGRAMS Data represented by pictorial symbols
  • 31. LINE GRAPH Used when the quantity is a continuous variable. STATISTICAL MAPS When statistical data refer to geographic or administrative areas, they are presented either as shaded maps or dot maps
  • 32. USES OF DATA •In designing a health care programme •In evaluating the effectiveness of an ongoing programme •In determining the needs of a specific population •In evaluating the scientific accuracy of a journal article
  • 33. MEASURES OF CENTRAL TENDENCY •Central tendency:It is the value around which the other values are distributed •The main objective of measure of central tendency is to condense the entire mass of data and to facilitate comparison •Arithmetic mean •Median •Mode
  • 34. MEAN •This measure is the arithmetic average or arithmetic mean •It is obtained by summing up all observations and dividing by the total number of observations •Eg: No. of days patients stayed in hospital under Dr. A is: 2,4,3,4,6,6,2,5 •Mean ($\bar{x}$) = Sum of all observations/Number of observations = 32/8 = 4 ADVANTAGES •Easy to calculate •Easy to understand •Utilizes the entire data •Amenable to algebraic manipulation •Affords good comparison DISADVANTAGES •The mean is affected by extreme values. In such cases it leads to bad interpretation
  • 35. MEDIAN The data are arranged in ascending or descending order of magnitude and the value of the middle observation is located. Eg 1: No. of days patients stayed in hospital under Dr. A is: 2,4,3,4,6,6,2,5 Ascending order: 2,2,3,4,4,5,6,6 Median = (4+4)/2 = 8/2 = 4 Eg 2: No. of days patients stayed in hospital under Dr. A is: 2,4,3,4,6,6,2 Descending order: 6,6,4,4,3,2,2 Median: 4 ADVANTAGES •It is more representative than the mean when the data are skewed •It does not depend on every observation •It is not affected by extreme values DISADVANTAGES •Data have to be arranged before calculation. Hence the mean is easier to use as a sample statistic •Requires more complex statistical procedures than the mean
  • 36. MODE The value which occurs with the greatest frequency. Eg 1: No. of days patients stayed in hospital under Dr. A is: 2,4,3,1,6,6,8,5 Mode: 6 i.e. the distribution is unimodal Eg 2: No. of days patients stayed in hospital under Dr. A is: 2,4,3,4,6,6,8,5 Mode: 6 & 4 i.e. the distribution is bimodal ADVANTAGES •It eliminates extreme variation •Easily located by mere inspection •Easy to understand DISADVANTAGES •Exact location is uncertain •It is not exactly defined •With a small number of cases there may be no mode at all because no value may be repeated; it is therefore rarely used in medical or biological statistics
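The three measures of central tendency can be checked against the hospital-stay examples with Python's standard `statistics` module. This is a minimal sketch; the variable names are chosen only for illustration, and `statistics.multimode` is used because it returns every most-frequent value, which makes bimodal data visible.

```python
import statistics

stays = [2, 4, 3, 4, 6, 6, 2, 5]            # days in hospital, mean/median example
mean_stay = statistics.mean(stays)           # sum of observations / number of observations
median_stay = statistics.median(stays)       # average of the two middle values for even n

unimodal = [2, 4, 3, 1, 6, 6, 8, 5]          # mode example 1
bimodal = [2, 4, 3, 4, 6, 6, 8, 5]           # mode example 2
modes_1 = statistics.multimode(unimodal)     # list of the most frequent value(s)
modes_2 = statistics.multimode(bimodal)

print(mean_stay, median_stay, modes_1, modes_2)
```

When no value is repeated, `multimode` returns every value in the data, which mirrors the point that a small data set may have no useful mode.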
  • 37. MEASURES OF DISPERSION •Measures of dispersion help to know how widely the observations are spread on either side of the average •Dispersion is the degree of spread or variation of the variable about a central value •The range •The mean deviation •The standard deviation PURPOSE OF MEASURES OF DISPERSION •To study the variability of data •To account for the variability in data
  • 38. THE RANGE •The difference between the highest and lowest figures in a given sample •Range = Xmax - Xmin ADVANTAGES •Easy to calculate DISADVANTAGES •Unstable •It is affected by one extremely high or low score •It is of little practical importance because it does not indicate anything about the dispersion of values between the two extreme values
  • 39. THE MEAN DEVIATION •It is the average of the deviations from the arithmetic mean •It is one way of measuring how closely the individual scores in the data set cluster around the mean •M.D. = $\sum |x - \bar{x}| / n$ •where Ʃ (sigma) is the sum of, x is the value of each observation in the data, $\bar{x}$ is the arithmetic mean and n is the number of observations in the data •Eg: No. of days patients stayed in hospital under Dr. A is: 2,4,3,4,6,6,2,5 •$\bar{x}$ = 32/8 = 4 •$(x - \bar{x})$ = -2,0,-1,0,2,2,-2,1 ; $\sum (x - \bar{x})$ = -2+0-1+0+2+2-2+1 = 0 •If the signs are kept, the deviations always sum to zero, which obviously does not reflect the degree of dispersion. The mean deviation therefore uses the absolute values (here 10/8 = 1.25). Alternatively, we can square each deviation score
  • 40. THE MEAN DEVIATION •$(x - \bar{x})^2$ = 4,0,1,0,4,4,4,1 ; $\sum (x - \bar{x})^2$ = 18 •$\sum (x - \bar{x})^2 / n$ = 18/8 = 2.25 •The resulting value is the variance •The variance is the average of the squared deviations from the mean of a set of scores, i.e. $\sum (x - \bar{x})^2 / n$
  • 41. STANDARD DEVIATION •Most frequently used measure of deviation •Defined as the root mean square deviation •Denoted by the Greek letter sigma (σ) or by the initials S.D •S.D is the square root of the variance •S.D = $\sqrt{\sum (x - \bar{x})^2 / n}$ •Therefore for Dr. A, S.D = $\sqrt{2.25}$ = 1.5
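The dispersion measures for Dr. A's patients can be reproduced step by step. The sketch below divides by n, as the slides do; a sample estimate would divide by n − 1 instead. The variable names are illustrative only.

```python
import math

stays = [2, 4, 3, 4, 6, 6, 2, 5]                 # days in hospital under Dr. A
n = len(stays)
mean = sum(stays) / n                            # 32 / 8

value_range = max(stays) - min(stays)            # Xmax - Xmin
deviations = [x - mean for x in stays]           # signed deviations, which sum to zero
mean_deviation = sum(abs(d) for d in deviations) / n
variance = sum(d ** 2 for d in deviations) / n   # average squared deviation
std_dev = math.sqrt(variance)                    # square root of the variance

print(value_range, sum(deviations), mean_deviation, variance, std_dev)
```

The signed deviations cancel out exactly, which is why either absolute values or squares are needed before averaging.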
  • 42. TESTS OF SIGNIFICANCE •Whenever two sets of observations are to be compared, it becomes essential to find out whether the difference observed between the two groups is because of sampling variation or some other factor •The methods by which this is done are called tests of significance 1. Standard error test for large samples 2. Chi square test 3. Standard error test for small samples
  • 43. STANDARD ERROR TEST FOR LARGE SAMPLES •A sample is considered to be large when it has more than 30 observations •When the difference between any two large samples in terms of means or proportions needs to be tested, the formulas used are as follows •(a). Standard error of mean •The standard error of the mean gives the standard deviation of the means of several samples from the same population. The standard error can be estimated from a single sample •Standard error (S.E) of mean = $S.D / \sqrt{n}$
  • 44. •(b). Standard error (S.E) of proportion = $\sqrt{pq/n}$ •where p is the proportion of the sample in which the event occurs, q = 1 - p is the proportion in which it does not, and n is the sample size •(c). Standard error of difference between two means •It is used to find out whether the difference between the means of two groups is large enough to indicate that the samples represent two different universes •Standard error between means = $\sqrt{S.D_1^2/n_1 + S.D_2^2/n_2}$ •(d). Standard error of difference between proportions •It is used to find out whether the difference between the proportions of two groups is significant or has occurred by chance •Standard error between proportions = $\sqrt{p_1 q_1/n_1 + p_2 q_2/n_2}$
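The four standard-error formulas translate directly into small helper functions. The function names and the numbers passed to them below are hypothetical, chosen only so the arithmetic is easy to follow; q is taken as 1 − p throughout.

```python
import math

def se_mean(sd, n):
    """Standard error of a mean: SD / sqrt(n)."""
    return sd / math.sqrt(n)

def se_proportion(p, n):
    """Standard error of a proportion: sqrt(p * q / n), with q = 1 - p."""
    return math.sqrt(p * (1 - p) / n)

def se_diff_means(sd1, n1, sd2, n2):
    """Standard error of the difference between two means."""
    return math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)

def se_diff_proportions(p1, n1, p2, n2):
    """Standard error of the difference between two proportions."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Hypothetical large samples (more than 30 observations each)
print(se_mean(1.5, 36))                      # 1.5 / 6
print(se_proportion(0.5, 100))               # sqrt(0.25 / 100)
print(se_diff_means(3.0, 36, 4.0, 64))       # sqrt(9/36 + 16/64)
print(se_diff_proportions(0.5, 100, 0.5, 100))
```

An observed difference is usually judged against about twice the relevant standard error, which corresponds to the conventional 5% level of significance.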
  • 45. CHI SQUARE TEST It is an alternative method of testing the significance of the difference between two proportions. Eg: There are two groups, one of which has received oral hygiene instructions and the other has not, and it is desired to test whether the occurrence of new cavities is associated with the instructions. STEPS 1. State the null hypothesis Set up a null hypothesis that “there is no difference between the two” and then proceed to test the hypothesis. •Here we state the null hypothesis as ‘there is no association between the oral hygiene instructions received and the occurrence of new cavities’
      Group | New cavities present | New cavities absent | Total
      Received instructions | 10 | 40 | 50
      Did not receive instructions | 32 | 8 | 40
      Total | 42 | 48 | 90
  • 46. 2. The χ² statistic is then calculated as $\chi^2 = \sum (O - E)^2 / E$, where O is the observed frequency and E is the expected frequency. Expected frequency (E) = Row total × Column total / Grand total. Among those who received instructions: expected number with new cavities = 42 × 50/90 = 23.3; expected number without new cavities = 48 × 50/90 = 26.7. Among those who did not receive instructions: expected number with new cavities = 42 × 40/90 = 18.7; expected number without new cavities = 48 × 40/90 = 21.3
      Group | New cavities present | New cavities absent
      Received instructions | O = 10, E = 23.3, O - E = -13.3 | O = 40, E = 26.7, O - E = 13.3
      Did not receive instructions | O = 32, E = 18.7, O - E = 13.3 | O = 8, E = 21.3, O - E = -13.3
  • 47. 3. Applying the χ² test, $\chi^2 = \sum (O - E)^2 / E$ = (-13.3)²/23.3 + (13.3)²/26.7 + (13.3)²/18.7 + (-13.3)²/21.3 = 7.62 + 6.67 + 9.52 + 8.33 = 32.1 (terms computed from the unrounded expected frequencies) 4. Finding the degrees of freedom (d.f) This depends on the number of columns and rows in the original table: d.f = (c - 1) × (r - 1), where c = number of columns and r = number of rows. d.f = (2 - 1) × (2 - 1) = 1
  • 48. 5. Probability tables Depending upon the value of “P” the conclusion is drawn. •In the probability table, with 1 degree of freedom, the χ² value for a probability (P) of 0.05 is 3.84. Since the calculated value of 32.1 is much higher, it is concluded that the null hypothesis is false and that there is a difference in caries occurrence between the two groups, with caries being lower in those who received instructions.
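The chi-square steps for the oral hygiene example can be sketched as a short function. It computes the expected frequencies without intermediate rounding, so the result can differ slightly from a hand calculation that rounds each expected value first. The function name `chi_square` is an assumption for the example, and the critical value 3.84 is the tabulated figure quoted in the slides for 1 degree of freedom at P = 0.05.

```python
def chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Rows: received / did not receive instructions; columns: new cavities present / absent
observed = [[10, 40], [32, 8]]
chi2, df = chi_square(observed)
print(round(chi2, 2), df)          # 32.14 1

CRITICAL_005_DF1 = 3.84            # tabulated value for d.f. = 1, P = 0.05
reject_null = chi2 > CRITICAL_005_DF1
```

Because the statistic far exceeds 3.84, the null hypothesis of no association is rejected at the 5% level.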
  • 49. Z test It is used to test the significance of the difference in means for large samples (>30). The pre-requisites to apply the Z test for means are: 1. The sample must be randomly selected 2. The data must be quantitative 3. The variable is assumed to follow a normal distribution in the population 4. The sample should be larger than 30. Z = (Observation - Mean) / Standard deviation = $(x - \bar{x}) / SD$
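A minimal sketch of the Z calculation follows, with the two-sided probability taken from the standard normal distribution in Python's `statistics` module. The helper names and the example values are hypothetical; when a sample mean is being tested, the standard error of the mean would take the place of the standard deviation in the denominator.

```python
from statistics import NormalDist

def z_score(observation, mean, sd):
    """Z = (observation - mean) / standard deviation."""
    return (observation - mean) / sd

def two_sided_p(z):
    """Probability of a value at least this far from the mean under the standard normal curve."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

z = z_score(6, 4, 1.5)       # hypothetical observation of 6 days against mean 4, SD 1.5
p = two_sided_p(z)
print(round(z, 2), round(p, 3))
```

A Z of about 1.96 corresponds to a two-sided probability of 0.05, which is why values beyond roughly 2 are conventionally treated as significant.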
  • 50. STANDARD ERROR TEST FOR SMALL SAMPLES •A sample is considered to be small if it has less than 30 observations •The test applied is called the ‘t’ test •Designed by W. S. Gosset, whose pen name was “Student”. Hence this test is called Student’s t-test •When the investigation compares observations carried out on the same individuals, say before and after a certain experiment, the comparison is called a paired comparison •When the observations are carried out in two independent samples and their values are compared, it is known as an unpaired comparison
  • 51. CRITERIA FOR APPLYING ‘t’ TEST •The sample must be randomly selected •The data must be quantitative •The variable is assumed to follow a normal distribution in the population •The sample size should be less than 30
  • 52. t-TEST FOR PAIRED COMPARISON 1. As per the null hypothesis, assume that there is no real difference between the means of the two sets of readings 2. The difference between the before and after readings is calculated for each individual 3. The mean and standard deviation (SD) of these differences are calculated 4. The standard error of this mean difference is calculated by the formula $SE = SD / \sqrt{n}$ 5. t is calculated by the formula t = Mean difference / Standard error of the difference 6. Find the degrees of freedom (d.f) = (n - 1), where n is the number of pairs of observations 7. From the t-distribution table, the probability of t corresponding to (n - 1) degrees of freedom is noted down 8. If the probability is more than 0.05, the difference observed has no significance, because it can be due to chance
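The paired-comparison steps above can be sketched as follows. The before/after readings are invented for illustration, the function name `paired_t` is hypothetical, and the standard deviation of the differences is assumed to be the sample SD (divisor n − 1), which is the usual choice for a t-test.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t, degrees of freedom) for paired before/after readings."""
    if len(before) != len(after):
        raise ValueError("before and after must contain the same number of readings")
    diffs = [b - a for b, a in zip(before, after)]   # step 2: difference per individual
    n = len(diffs)
    mean_diff = statistics.mean(diffs)               # step 3: mean of the differences
    sd_diff = statistics.stdev(diffs)                # step 3: sample SD (n - 1 divisor)
    se = sd_diff / math.sqrt(n)                      # step 4: SE = SD / sqrt(n)
    t = mean_diff / se                               # step 5
    return t, n - 1                                  # step 6: d.f. = n - 1

before = [10, 12, 9, 11, 13]    # hypothetical scores before treatment
after = [8, 11, 8, 8, 10]       # hypothetical scores after treatment
t, df = paired_t(before, after)
print(round(t, 3), df)
```

The calculated t is then compared with the tabulated value for n − 1 degrees of freedom (2.776 for 4 degrees of freedom at the two-tailed 0.05 level in standard t tables).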
  • 53. The unpaired ‘t’ test 1. As per the null hypothesis, assume that there is no real difference between the means of the two samples 2. Find the observed difference between the means of the two samples ($\bar{x}_1 - \bar{x}_2$) 3. Calculate the standard error of the difference between the two means: $SE = SD \sqrt{1/n_1 + 1/n_2}$, where SD is the pooled standard deviation of the two samples 4. Calculate the ‘t’ value: $t = (\bar{x}_1 - \bar{x}_2) / SE$ 5. Determine the pooled degrees of freedom from the formula d.f = (n1 - 1) + (n2 - 1) = n1 + n2 - 2
  • 54. 6. Compare calculated value with the table value (table of ‘t’) at particular degrees of freedom to find the level of significance.
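The unpaired procedure can be sketched in the same way. This example assumes the standard pooled-variance form of the test, in which the pooled standard deviation multiplies the square-root term of the standard error; the function name `unpaired_t` and the two small groups of readings are hypothetical.

```python
import math
import statistics

def unpaired_t(sample1, sample2):
    """Return (t, degrees of freedom) for two independent samples, using a pooled SD."""
    n1, n2 = len(sample1), len(sample2)
    mean1, mean2 = statistics.mean(sample1), statistics.mean(sample2)
    var1, var2 = statistics.variance(sample1), statistics.variance(sample2)  # n - 1 divisors
    df = n1 + n2 - 2                                          # pooled degrees of freedom
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / df)
    se = pooled_sd * math.sqrt(1 / n1 + 1 / n2)               # SE of the difference in means
    t = (mean1 - mean2) / se
    return t, df

group_a = [4, 5, 6, 5]    # hypothetical readings, first independent sample
group_b = [2, 3, 4, 3]    # hypothetical readings, second independent sample
t, df = unpaired_t(group_a, group_b)
print(round(t, 3), df)
```

As with the paired test, the calculated t is compared with the table value at the pooled degrees of freedom to find the level of significance.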
  • 55. CONCLUSION •Biostatistical techniques can give assurance that the results found in a study are not merely due to chance •In every field, statistics plays a major role in obtaining better and more accurate results •A well designed and properly conducted study is a basic prerequisite to arrive at valid conclusions
  • 56. REFERENCES Soben Peter; Essentials of Public Health Dentistry, 5th edition. K. Park; Park's Textbook of Preventive and Social Medicine, 19th edition. Joseph John; Textbook of Preventive and Community Dentistry, 2nd edition. Richard Levin & David S. Rubin; Statistics for Management, 6th edition