Biostatistics
1. BIOSTATISTICS
PRESENTED BY,
DR. ANJU MATHEW. K
FIRST YEAR MDS
DEPARTMENT OF PERIODONTICS
2. • Statistics is a very broad subject, with applications in a vast number of different fields.
• In general, statistics is the methodology for collecting, analyzing, interpreting and drawing conclusions from information.
• It is the methodology that scientists and mathematicians have developed for interpreting and drawing conclusions from collected data.
3. DEFINITION
Statistics consists of a body of methods for collecting and analyzing data. (Agresti & Finlay, 1997)
• Statistics is much more than just the tabulation of numbers and the graphical presentation of these tabulated numbers.
• Statistics is the science of gaining information from numerical and categorical data.
• Statistical methods can be used to find answers to questions like:
- What kind of data, and how much, need to be collected?
- How should we organize and summarize the data?
- How can we analyse the data and draw conclusions from it?
- How can we assess the strength of the conclusions and evaluate their uncertainty?
4. BIOSTATISTICS
• Deals with the statistical methodologies involved in the biological sciences
• As medicine is a branch of biology, medical statistics is a branch of biostatistics
5. SAMPLING
• Sampling is the process or technique of selecting a sample of appropriate characteristics and adequate size.
• Sampling is of two types:
1. Probability sampling
2. Nonprobability sampling
In PROBABILITY SAMPLING, all members of the population have an equal chance of being selected.
In NONPROBABILITY SAMPLING, samples are collected in a way that does not give all units in the population an equal chance of being selected.
6. TYPES OF SAMPLING TECHNIQUES
Probability sampling          Nonprobability sampling
1. Simple random              1. Accidental/convenience
2. Stratified random          2. Judgement/purposive
3. Systematic random          3. Network/snowball
4. Area/cluster sampling      4. Quota sampling
                              5. Dimensional sampling
                              6. Mixed sampling
7. Simple random sampling
Every member of the population has an equal chance of being included in the sample. This type of sampling is used when the population is homogeneous.
Stratified random sampling
Divides the population into groups called strata. The division is by some characteristic, not geographically; for example, the population might be separated into males and females. A random sample is then drawn from each stratum.
8. Systematic random sampling
Sample members are selected from a larger population according to a random starting point and a fixed, periodic interval. This interval, called the sampling interval, is calculated by dividing the population size by the desired sample size.
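The selection rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `systematic_sample`, the register of 100 patient IDs and the seed are assumptions made for the example, and the sketch assumes the sample size is no larger than the population.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Pick every k-th member after a random start, where k = N // n."""
    interval = len(population) // sample_size   # sampling interval k
    rng = random.Random(seed)
    start = rng.randrange(interval)             # random start in [0, k)
    return [population[start + i * interval] for i in range(sample_size)]

# Hypothetical register of 100 patient IDs (not from the slides).
patients = list(range(1, 101))
sample = systematic_sample(patients, sample_size=10, seed=1)
print(sample)
```

With 100 patients and a sample of 10 the sampling interval is 10, so the selected IDs are always exactly 10 apart; only the starting point is random.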
Area or cluster sampling
Cluster sampling is accomplished by dividing the population into groups, usually geographically. These groups are called clusters or blocks. The clusters are randomly selected, and every element in the selected clusters is used. For example, in a dental survey in schools, each section in a class could be used as a cluster.
9. Accidental or convenience sampling
This type of sampling is very easy to do and is often used by health professionals. The researcher examines the people he or she is able to contact or get access to. It is inexpensive and less time-consuming.
Judgement or purposive sampling
Researchers rely on their own judgement when choosing members of the population to participate in their study.
10. Network or snowball sampling
A multistage technique. The researcher must first identify and interview a few subjects with the requisite criteria. These subjects are then asked to identify others with the same criteria, who are in turn asked to identify others, until a satisfactory sample is obtained.
Quota sampling
Researchers create a sample of individuals that represents a population. They choose these individuals according to specific traits or qualities.
11. Dimensional sampling
An extension of quota sampling. The researcher takes into account several characteristics (e.g. gender, income, residence and education) and must ensure that there is at least one person in the study representing each of the chosen characteristics.
Mixed sampling designs
Combine both probability and nonprobability sampling procedures.
12. USES OF SAMPLING
• May be the only way to obtain information about a population
• The need to reduce labour and hence cost
• Savings in time, manpower and money
13. ERRORS IN SAMPLING
• Two types of errors arise in sampling:
1. Sampling error
2. Nonsampling error
• Sampling errors
These creep in due to the sampling process and can arise because of faulty sample design or the small size of the sample.
• Nonsampling errors
a) Coverage error: due to non-cooperation of the informant
b) Observational error: due to interviewer's bias, imperfect experimental technique, or the interaction of both
c) Processing error: due to errors in statistical analysis
14. DATA
• Data analysis is the cornerstone in reporting research findings
• Data is a set of values of one or more variables recorded on one or more individuals
16. Primary data
Data obtained directly from an individual
ADVANTAGES
1. Precise information
2. Reliable
DISADVANTAGES
1. Time-consuming
2. Expensive
Secondary data
Data obtained from outside sources, e.g. hospital records, school registers
17. VARIABLES
A variable is a state, condition, concept or event whose value is free to vary within the population.
TYPES OF VARIABLES
1. Quantitative
- Discrete
- Continuous
2. Qualitative
- Categorical
- Ordered
19. METHODS OF COLLECTION OF DATA
1. Questionnaires
2. Surveys
3. Records
4. Interviews
20. PRESENTATION OF DATA
• Statistical data, once collected, must be arranged purposefully in order to bring out the important points clearly and strikingly
• The manner in which statistical data is presented is of utmost importance
21. METHODS OF PRESENTING DATA
I. Tabulation
Simple tables
Frequency distribution table
II. Charts and diagrams
Bar charts
a. Simple bar chart
b. Multiple bar chart
c. Component bar chart
Histogram
a. Frequency polygon
b. Frequency curve
Pie chart
Pictogram
III. Line diagrams
IV. Statistical maps
22. TABULATION
• Tables are devices for presenting data
• Tabulation is the first step before the data is used for analysis or interpretation
GENERAL PRINCIPLES FOR DESIGNING TABLES
1. The table should be numbered, e.g. Table 1, Table 2, etc.
2. A title must be given to each table. The title must be brief and self-explanatory
3. The headings of columns and rows should be clear and concise
4. The data must be presented according to size or importance, chronologically, alphabetically or geographically
5. If percentages or averages are to be compared, they should be placed as close together as possible
6. No table should be too large
7. Footnotes may be given where necessary, providing explanatory notes or additional information
24. FREQUENCY DISTRIBUTION TABLE
The data is first split up into convenient groups (class intervals), and the number of items (frequency) occurring in each group is then recorded.
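A minimal Python sketch of building such a table is shown here. It uses the hospital-stay values that appear in the deck's mean example (2, 4, 3, 4, 6, 6, 2, 5); the class width of 2 days, the starting point of 1 and the function name `frequency_table` are assumptions chosen only for illustration, and the sketch assumes integer values that are not below the starting point.

```python
from collections import Counter

def frequency_table(values, width, start):
    """Count how many integer values fall in each class interval of given width."""
    counts = Counter((v - start) // width for v in values)
    table = []
    for k in range(max(counts) + 1):
        lower = start + k * width
        upper = lower + width - 1          # inclusive upper limit of the class
        table.append((lower, upper, counts.get(k, 0)))
    return table

stays = [2, 4, 3, 4, 6, 6, 2, 5]           # days in hospital under Dr. A
for lower, upper, freq in frequency_table(stays, width=2, start=1):
    print(f"{lower}-{upper} days: {freq}")
```

For these eight observations the classes 1-2, 3-4 and 5-6 days hold 2, 3 and 3 patients, which together account for all eight.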
25. CHARTS AND DIAGRAMS
• A useful method of presenting simple statistical data
• They have a powerful impact on the imagination of people, so they are a powerful medium for expressing statistical data
ADVANTAGES
1. Diagrams are better retained in memory than tables
2. If the diagrams are drawn simply, the impact on the reader is much higher
DISADVANTAGES
1. Details of the original data may be lost in charts and diagrams
26. BAR CHARTS
A diagram of columns or bars in which the height of each bar gives the value of the particular data in question
SIMPLE BAR CHART
28. COMPONENT BAR CHART
When there are two sets of similar information, they can be contrasted by displaying both sets on the same graph
29. HISTOGRAMS
A special sort of bar chart in which the successive groups of data are linked in a definite numerical order
Frequency polygon
A frequency distribution may also be represented diagrammatically by the frequency polygon. It is obtained by joining the midpoints of the histogram blocks
Frequency curve
The frequency curve for a distribution can be obtained by drawing a smooth, freehand curve through the midpoints
30. PIE CHARTS
Another way of displaying data.
PICTOGRAMS
Data represented by pictorial symbols
31. LINE GRAPH
Used when the quantity is a continuous variable
STATISTICAL MAPS
When statistical data refer to geographic or administrative areas, they are presented either as shaded maps or as dot maps
32. USES OF DATA
• In designing a health care programme
• In evaluating the effectiveness of an ongoing programme
• In determining the needs of a specific population
• In evaluating the scientific accuracy of a journal article
33. MEASURES OF CENTRAL TENDENCY
• Central tendency: the value around which the other values are distributed
• The main objective of a measure of central tendency is to condense the entire mass of data and to facilitate comparison
• Arithmetic mean
• Median
• Mode
34. MEAN
• This measure is the arithmetic average or arithmetic mean
• It is obtained by summing up all observations and dividing by the total number of observations
• Eg: The numbers of days patients stayed in hospital under Dr. A are: 2, 4, 3, 4, 6, 6, 2, 5
• Mean ($\bar{X}$) = Sum of all observations / Number of observations = 32/8 = 4
ADVANTAGES
• Easy to calculate
• Easy to understand
• Utilizes the entire data
• Amenable to algebraic manipulation
• Affords good comparison
DISADVANTAGES
• The mean is affected by extreme values, which can lead to misleading interpretation
35. MEDIAN
The data are arranged in ascending or descending order of magnitude and the value of the middle observation is located. With an even number of observations, the median is the average of the two middle values.
Eg 1: The numbers of days patients stayed in hospital under Dr. A are: 2, 4, 3, 4, 6, 6, 2, 5
Ascending order: 2, 2, 3, 4, 4, 5, 6, 6
Median = (4+4)/2 = 8/2 = 4
Eg 2: The numbers of days patients stayed in hospital under Dr. A are: 2, 4, 3, 4, 6, 6, 2
Descending order: 6, 6, 4, 4, 3, 2, 2
Median: 4
ADVANTAGES
• It is more representative than the mean
• It does not depend on every observation
• It is not affected by extreme values
DISADVANTAGES
• Data has to be arranged before calculation. Hence mean is easier to use as a sample statistic than a population parameter
• Requires more complex statistical procedures than the mean
36. MODE
The value which occurs with the greatest frequency
Eg 1: The numbers of days patients stayed in hospital under Dr. A are: 2, 4, 3, 1, 6, 6, 8, 5
Mode: 6, i.e. the distribution is unimodal
Eg 2: The numbers of days patients stayed in hospital under Dr. A are: 2, 4, 3, 4, 6, 6, 8, 5
Mode: 6 and 4, i.e. the distribution is bimodal
ADVANTAGES
• It eliminates extreme variation
• Easily located by mere inspection
• Easy to understand
DISADVANTAGES
• Exact location is uncertain
• It is not exactly defined
• With a small number of cases there may be no mode at all, because no value may be repeated; therefore it is not used in medical or biological statistics
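The worked examples for the mean, median and mode can be checked with Python's standard `statistics` module. This is just a verification sketch using the data from the slides; the variable names are illustrative, and `statistics.multimode` requires Python 3.8 or later.

```python
import statistics

# Days in hospital under Dr. A (mean example and median Eg 1)
stays = [2, 4, 3, 4, 6, 6, 2, 5]
mean_stay = statistics.mean(stays)          # 32 / 8
median_even = statistics.median(stays)      # average of the two middle values

# Median Eg 2: odd number of observations
median_odd = statistics.median([2, 4, 3, 4, 6, 6, 2])

# Mode Eg 1 (unimodal) and Eg 2 (bimodal); multimode returns every mode
modes_eg1 = statistics.multimode([2, 4, 3, 1, 6, 6, 8, 5])
modes_eg2 = statistics.multimode([2, 4, 3, 4, 6, 6, 8, 5])

print(mean_stay, median_even, median_odd, modes_eg1, modes_eg2)
```

The results agree with the slides: a mean of 4, a median of 4 in both examples, a single mode of 6 in the first mode example, and modes of 4 and 6 in the second.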
37. MEASURES OF DISPERSION
• Measures of dispersion help to show how widely the observations are spread on either side of the average
• Dispersion is the degree of spread or variation of the variable about a central value
• The range
• The mean deviation
• The standard deviation
PURPOSE OF MEASURES OF DISPERSION
• To study the variability of data
• To account for the variability in data
38. THE RANGE
• The difference between the highest and lowest values in a given sample
• Range = Xmax - Xmin
ADVANTAGES
• Easy to calculate
DISADVANTAGES
• Unstable
• It is affected by one extremely high or low score
• It is of little practical importance because it does not indicate anything about the dispersion of values between the two extreme values
41. STANDARD DEVIATION
• Most frequently used measure of deviation
• Defined as the root mean square deviation
• Denoted by the Greek letter sigma ($\sigma$) or by the initials S.D
• S.D is the square root of the variance
• S.D = $\sqrt{\sum (x - \bar{x})^2 / n}$
• Therefore for Dr. A, S.D = $\sqrt{2.25}$ = 1.5
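The figure of 2.25 for Dr. A can be reproduced step by step from the hospital-stay data used in the mean example (2, 4, 3, 4, 6, 6, 2, 5). The sketch below follows the slide's formula, which divides by n; many textbooks divide by n - 1 when the data are a sample, which would give a slightly larger value. Variable names are illustrative.

```python
import math

stays = [2, 4, 3, 4, 6, 6, 2, 5]            # days in hospital under Dr. A
n = len(stays)
mean = sum(stays) / n                        # 32 / 8 = 4

# Squared deviations from the mean, then their average (the variance)
squared_deviations = [(x - mean) ** 2 for x in stays]
variance = sum(squared_deviations) / n       # divides by n, as on the slide
sd = math.sqrt(variance)

value_range = max(stays) - min(stays)        # Range = Xmax - Xmin

print(mean, variance, sd, value_range)
```

The squared deviations sum to 18, so the variance is 18/8 = 2.25 and the standard deviation is 1.5, while the range of the same data is 6 - 2 = 4.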
42. TESTS OF SIGNIFICANCE
• Whenever two sets of observations are to be compared, it becomes essential to find out whether the difference observed between the two groups is due to sampling variation or to some other factor
• The methods by which this is done are called tests of significance
1. Standard error test for large samples
2. Chi square test
3. Standard error test for small samples
43. STANDARD ERROR TEST FOR LARGE SAMPLES
• A sample is considered to be large when it has more than 30 observations
• When the difference between any two large samples in terms of means or proportions needs to be tested, the formulas used are as follows
• (a). Standard error of mean
• The standard error of the mean gives the standard deviation of the means of several samples from the same population. The standard error can be estimated from a single sample.
• Standard error (S.E) of mean = $\mathrm{S.D} / \sqrt{n}$
44. • (b). Standard error (S.E) of proportion = $\sqrt{pq/n}$
• Where p and q are the proportions of the sample in which the event does and does not occur (q = 1 - p), and n is the sample size.
• (c). Standard error of difference between two means
• It is used to find out whether the difference between the means of two groups is large enough to indicate that the samples represent two different universes.
• Standard error between means = $\sqrt{\mathrm{S.D}_1^2/n_1 + \mathrm{S.D}_2^2/n_2}$
• (d). Standard error of difference between proportions
• It is used to find out whether the difference between the proportions of two groups is significant or has occurred by chance.
• Standard error between proportions = $\sqrt{p_1 q_1/n_1 + p_2 q_2/n_2}$
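The four standard error formulas (a) to (d) translate directly into small Python functions. The function names are illustrative, not from the slides. The example figures are the proportions from the oral hygiene table used in the chi square section of this deck (10 of 50 with new cavities versus 32 of 40). The comparison against roughly twice the standard error is the usual large-sample convention for the 5% level, stated here as a general rule rather than something taken from the slides.

```python
import math

def se_mean(sd, n):
    """(a) Standard error of a mean."""
    return sd / math.sqrt(n)

def se_proportion(p, n):
    """(b) Standard error of a proportion, with q = 1 - p."""
    return math.sqrt(p * (1 - p) / n)

def se_diff_means(sd1, n1, sd2, n2):
    """(c) Standard error of the difference between two means."""
    return math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)

def se_diff_proportions(p1, n1, p2, n2):
    """(d) Standard error of the difference between two proportions."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Proportion with new cavities: instructed group vs. non-instructed group
p1, n1 = 10 / 50, 50
p2, n2 = 32 / 40, 40
se = se_diff_proportions(p1, n1, p2, n2)
ratio = abs(p1 - p2) / se                    # difference in units of its S.E
print(round(se, 4), round(ratio, 2))
```

Here the observed difference in proportions (0.6) is about seven times its standard error, far beyond the conventional threshold of about two.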
45. CHI SQUARE TEST
It is an alternative method of testing the significance of the difference between two proportions.
Eg: Suppose there are two groups, one of which has received oral hygiene instructions and the other has not, and it is desired to test whether the occurrence of new cavities is associated with the instructions.
STEPS
1. Test the null hypothesis
Set up a null hypothesis that "there is no difference between the two" and then proceed to test the hypothesis.
• Here we state the null hypothesis as "there is no association between oral hygiene instructions received in dental hygiene and the occurrence of new cavities".
Group                                      Occurrence of new cavities
                                           Present    Absent    Total
Number who received instructions             10         40        50
Number who did not receive instructions      32          8        40
Total                                        42         48        90
48. 5. Probability tables
Depending upon the value of "P" the conclusion is drawn.
• In the probability table, with 1 degree of freedom, the $\chi^2$ value for a probability (P) of 0.05 is 3.84. Since the observed value of 33 is much higher, it is concluded that the null hypothesis is false and that there is a difference in caries occurrence between the two groups, with caries being lower in those who received instructions.
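The intermediate calculation slides are not part of this transcript, so the following Python sketch reconstructs the usual procedure as an assumption: expected counts are computed from the row and column totals, and the statistic is the sum of (observed - expected)² / expected over all cells, without a continuity correction. The function name `chi_square` is illustrative.

```python
def chi_square(table):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the null hypothesis of no association holds
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Rows: received / did not receive instructions; columns: cavities present / absent
observed = [[10, 40],
            [32, 8]]
chi2 = chi_square(observed)
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
CRITICAL_VALUE = 3.84                        # table value for P = 0.05, 1 d.f.

print(round(chi2, 2), degrees_of_freedom, chi2 > CRITICAL_VALUE)
```

Computed this way the statistic comes to roughly 32.1, close to the value of 33 quoted on the slide, and in either case far above the table value of 3.84.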
49. Z test
It is used to test the significance of the difference in means for large samples (>30)
The prerequisites for applying the Z test for means are:
1. The sample must be randomly selected
2. The data must be quantitative
3. The variable is assumed to follow a normal distribution in the population
4. The sample should be larger than 30
Z = (Observation - Mean) / Standard deviation
= $(x - \bar{x}) / \mathrm{SD}$
50. STANDARD ERROR TEST FOR SMALL SAMPLES
• A sample is considered to be small if it has fewer than 30 observations.
• The test applied is called the "t" test
• Designed by W. S. Gosset, whose pen name was Student. Hence this test is called Student's t-test
• When the investigation compares observations carried out on the same individuals, say before and after a certain experiment, the comparison is called a paired comparison
• When the observations are carried out in two independent samples and their values are compared, it is known as an unpaired comparison
51. CRITERIA FOR APPLYING "t" TEST
• The sample must be randomly selected
• The data must be quantitative
• The variable is assumed to follow a normal distribution in the population
• The sample size should be less than 30
52. t-TEST FOR PAIRED COMPARISON
1. As per the null hypothesis, assume that there is no real difference between the means of the two samples
2. The difference between the before and after readings is calculated for each individual
3. The mean and standard deviation (SD) of these differences are calculated
4. The standard error of this mean difference is calculated by the formula SE = $\mathrm{SD} / \sqrt{n}$
5. t is calculated by the formula t = Mean difference / Standard error of the difference
6. Find the degrees of freedom (df) = (n - 1), where n is the number of pairs of observations
7. From the t-distribution table, note the probability of t corresponding to (n - 1) degrees of freedom
8. If the probability is more than 0.05, the difference observed has no significance, because it can be due to chance
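The eight steps above can be sketched in Python as follows. The before and after readings are hypothetical values invented for the illustration, as is the function name `paired_t`. The sketch assumes the standard deviation of the differences is the sample form (dividing by n - 1), which is what `statistics.stdev` computes; the slide does not say which form is intended. The critical value 2.571 is the standard two-tailed table value for P = 0.05 at 5 degrees of freedom.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t, degrees of freedom) for paired before/after readings."""
    differences = [b - a for b, a in zip(before, after)]     # step 2
    n = len(differences)
    mean_difference = statistics.mean(differences)            # step 3
    sd_difference = statistics.stdev(differences)             # step 3 (n - 1 form)
    standard_error = sd_difference / math.sqrt(n)             # step 4
    t = mean_difference / standard_error                      # step 5
    return t, n - 1                                           # step 6

# Hypothetical scores for six patients before and after an intervention
before = [12, 15, 11, 14, 13, 16]
after = [10, 12, 11, 11, 12, 13]
t_value, df = paired_t(before, after)

CRITICAL_T = 2.571          # table value for P = 0.05 (two-tailed), df = 5
print(round(t_value, 3), df, abs(t_value) > CRITICAL_T)      # steps 7 and 8
```

For these made-up readings t is about 3.87 with 5 degrees of freedom, which exceeds the table value, so the probability is below 0.05 and the difference would be regarded as significant.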
53. The unpaired "t" test
1. As per the null hypothesis, assume that there is no real difference between the means of the two samples.
2. Find the observed difference between the means of the two samples, $\bar{X}_1 - \bar{X}_2$
3. Calculate the standard error of the difference between the two means.
SE = $\mathrm{SD} \sqrt{1/n_1 + 1/n_2}$, where SD is the pooled standard deviation of the two samples
4. Calculate the "t" value
t = $(\bar{X}_1 - \bar{X}_2) / \mathrm{SE}$
5. Determine the pooled degrees of freedom from the formula
d.f = (n1 - 1) + (n2 - 1) = n1 + n2 - 2
54. 6. Compare the calculated value with the table value (table of "t") at the particular degrees of freedom to find the level of significance.
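A minimal Python sketch of these six steps follows. It assumes the standard error uses the pooled standard deviation of the two samples (the usual form of the unpaired test when both groups are taken to have similar variability). The two groups of scores are hypothetical and the function name `unpaired_t` is illustrative. The critical value 2.306 is the standard two-tailed table value for P = 0.05 at 8 degrees of freedom.

```python
import math
import statistics

def unpaired_t(sample1, sample2):
    """Return (t, degrees of freedom) for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    mean1, mean2 = statistics.mean(sample1), statistics.mean(sample2)
    # Sums of squared deviations about each sample's own mean
    ss1 = sum((x - mean1) ** 2 for x in sample1)
    ss2 = sum((x - mean2) ** 2 for x in sample2)
    df = n1 + n2 - 2                                  # step 5
    pooled_sd = math.sqrt((ss1 + ss2) / df)           # assumed pooled S.D
    standard_error = pooled_sd * math.sqrt(1 / n1 + 1 / n2)   # step 3
    t = (mean1 - mean2) / standard_error              # steps 2 and 4
    return t, df

# Hypothetical plaque scores in two independent groups of five patients
group1 = [6, 7, 6, 8, 8]
group2 = [3, 4, 5, 4, 4]
t_value, df = unpaired_t(group1, group2)

CRITICAL_T = 2.306          # table value for P = 0.05 (two-tailed), df = 8
print(round(t_value, 3), df, abs(t_value) > CRITICAL_T)      # step 6
```

With these invented data the group means are 7 and 4, t is about 5.48 with 8 degrees of freedom, and the value exceeds the table value at the 5% level.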
55. CONCLUSION
• Biostatistical techniques can help ensure that the results found in a study are not merely due to chance.
• Statistics plays a major role in obtaining better and more accurate results.
• A well designed and properly conducted study is a basic prerequisite for arriving at valid conclusions.