# Bio statistics1

Published in: Technology, Health & Medicine

2. Contents
• Introduction
• Common Statistical Terms
• Source of data
• Types of data
• Data presentation
• Measures of statistical averages or central tendency
• Types of variability
• Measures of variation or dispersion
• Normal distribution or normal curve
• Sampling
• Determination of sample size
• Probability or p value
www.indiandentalacademy.com

4.
• Any science needs precision for its development.
• For precision, facts, observations or measurements have to be expressed in figures.
• “It has been said when you can measure what you are speaking about and express it in numbers, you know something about it, but when you cannot express it in numbers your knowledge is of meagre and unsatisfactory kind.” - Lord Kelvin

5.
• Similarly in medicine, be it diagnosis, treatment or research, everything depends on measurement.
• E.g. you have to count the number of missing teeth, or measure the vertical dimension, and express it as a number so that it makes sense.

6.
• A statistic or datum is a measured or counted fact or piece of information stated as a figure, such as the height of one person or the birth weight of a baby.
• Statistics or data is the plural of the same.
• Statistics is the science of figures.
• Biostatistics is the term used when the tools of statistics are applied to data derived from biological sciences such as medicine.

7. Applications and uses of biostatistics as a science
• In physiology and anatomy
– To define the limits of normality for variables such as height, weight or blood pressure in a population.
– Variation beyond the natural limits may be pathological, i.e. abnormal due to the play of certain external factors.
– To find the correlation between two variables such as height and weight.

8. Applications and uses of biostatistics as a science
• In pharmacology
– To find the action of drugs
– To compare the action of two drugs or two successive dosages of the same drug
– To find the relative potency of a new drug with respect to a standard drug

9. Applications and uses of biostatistics as a science
• In medicine
– To compare the efficacy of a particular drug, operation or line of treatment
– To find the association between two attributes such as cancer and smoking
– To identify signs and symptoms of disease

10. Applications and uses of biostatistics as a science
• In community medicine and public health
– To test the usefulness of sera or vaccines in the field
– In epidemiologic studies the role of causative factors is statistically tested

11. Applications and uses of biostatistics as a science
• In research
– It helps in compiling data, drawing conclusions and making recommendations.

12. Applications and uses of biostatistics as a science
• For students
– By learning the methods of biostatistics, a student learns to evaluate articles published in medical and dental journals and papers read at medical and dental conferences.
– The student also learns the basic methods of observation for clinical practice and research.

13. Common Statistical Terms

14. Common Statistical Terms
• Constant
– A quantity that does not vary, e.g. in biostatistics the mean and standard deviation are considered constant for a population.
• Variable
– A characteristic that takes different values for different persons, places or things, such as height, weight or blood pressure.
• Population
– The population includes all persons, events and objects under study. It may be finite or infinite.

15. Common Statistical Terms
• Sample
– A part of a population, generally selected so as to be representative of the population whose variables are under study.
• Parameter
– A constant that describes a population, e.g. in a college 40% of the students are girls. This describes the population, hence it is a parameter.

16. Common Statistical Terms
• Statistic
– A constant that describes a sample, e.g. out of a sample of 200 students of the same college, 45% are girls. This 45% is a statistic, as it describes the sample.
• Attribute
– A characteristic by which the population can be divided into categories or classes, e.g. gender, caste, religion.

17. Source of data

18. Source of data
• The main sources for collection of data
– Experiments
– Surveys
– Records
• Experiments
– Experiments are performed by one or more workers to collect data for investigations and research.

19. Source of data
• Surveys
– Carried out in the field by trained teams for epidemiological studies, to find the incidence or prevalence of health or disease in a community.
• Records
– Records are maintained routinely in registers and books over a long period of time and provide ready-made data.

20. Types of data

21. Types of data
• Data is of two types
• Qualitative or discrete data
• Quantitative or continuous data

22. Types of data
• Qualitative or discrete data
– In such data there is no notion of magnitude or size of an attribute, as the attribute cannot be measured.
– The number of persons having the same attribute varies and is counted.
– E.g. out of 100 people, 75 have class I occlusion, 15 have class II occlusion and 10 have class III occlusion.
– Classes I, II and III are attributes, which cannot be measured in figures; only the number of people in each class can be determined.

23. Types of data
• Quantitative or continuous data
– Here the attribute has a magnitude. Both the attribute and the number of persons having the attribute vary.
– E.g. freeway space. It varies for every patient. It is a quantity with a different value for each individual and is measurable. It is continuous, as it can take any value between 2 and 4 mm, such as 2.10, 2.55 or 3.07.

25. Data presentation
• Statistical data once collected should be systematically arranged and presented
– To arouse interest of readers
– For data reduction
– To bring out important points clearly and strikingly
– For easy grasp and meaningful conclusions
– To facilitate further analysis
– To facilitate communication

26. Data presentation
• Two main types of data presentation are
– Tabulation
– Graphic representation with charts and diagrams

27. Data presentation
Tabulation
• It is the most common method
• Data is presented in the form of columns and rows
• It can be of the following types
– Simple tables
– Frequency distribution tables

28. Simple Table
Number of patients at KIDS, Bgm

| Month | Number of patients |
|---|---|
| Jan 06 | 2,800 |
| Feb 06 | 1,900 |
| March 06 | 1,750 |

29. Frequency distribution table
• In a frequency distribution table, the data is first split into convenient groups (class intervals) and the number of items (frequency) occurring in each group is shown in the adjacent column.

30. Frequency distribution table

| Number of Cavities | Number of Patients |
|---|---|
| 0 to 3 | 78 |
| 3 to 6 | 67 |
| 6 to 9 | 32 |
| 9 and above | 16 |

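The grouping step behind a frequency distribution table can be sketched in Python. This is only an illustration: the raw counts below are hypothetical, and because the class limits on the slide overlap at 3, 6 and 9, the sketch assumes that each interval includes its lower limit and excludes its upper limit.

```python
from collections import Counter

# Hypothetical raw data: number of cavities per patient (not from the slides).
cavities = [0, 1, 2, 3, 3, 4, 5, 6, 7, 9, 11, 2, 0, 8]


def class_interval(value, width=3, last_lower=9):
    """Return the label of the class interval that contains value.

    Assumption: intervals are half-open, so 3 falls in '3 to 6', not '0 to 3'.
    """
    if value >= last_lower:
        return f"{last_lower} and above"
    lower = (value // width) * width
    return f"{lower} to {lower + width}"


# Count how many observations fall in each class interval.
frequency = Counter(class_interval(v) for v in cavities)

for label in ["0 to 3", "3 to 6", "6 to 9", "9 and above"]:
    print(f"{label:12} {frequency[label]}")
```

Whichever convention is chosen for boundary values, it should be stated with the table so that an observation of exactly 3 is not counted in two classes.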
31. Data presentation
Charts and diagrams
• Useful method of presenting statistical data
• Powerful impact on the imagination of people

32. Charts and diagrams
• They are
– Bar chart
– Histogram
– Frequency polygon
– Frequency curve
– Line diagram
– Cumulative frequency diagram or ogive
– Scatter diagram
– Pie chart
– Pictogram
– Spot map or map diagram

33. Bar chart
• The length of the bars, drawn vertically or horizontally, is proportional to the frequency of the variable.
• A suitable scale is chosen
• Bars are usually equally spaced

34. Bar chart
• They are of three types
– Simple bar chart
– Multiple bar chart
• two or more variables are grouped together
– Component bar chart
• bars are divided into two or more parts
• each part represents a certain item and is proportional to the magnitude of that item

35. Simple bar chart
[Figure: simple bar chart of the number of CD patients (scale 0 to 300) by quarter, 1st Qtr to 4th Qtr]

36. Multiple bar chart
[Figure: multiple bar chart of CD, RPD and FPD patients (scale 0 to 400) by quarter, 1st Qtr to 4th Qtr]

37. Component bar chart
[Figure: component bar chart of patients to prostho and patients to other departments (scale 0 to 3000) by quarter, 1st Qtr to 4th Qtr]

38. Histogram
• Pictorial presentation of a frequency distribution
• Consists of a series of rectangles
• Class intervals are given on the horizontal axis and frequencies on the vertical axis
• The area of each rectangle is proportional to the frequency

39. Histogram
[Figure: histogram of the number of carious lesions, class intervals 0 to 3 through 24 to 27, frequency scale 0 to 80]

40. Frequency polygon
• Obtained by joining the midpoints of the histogram blocks, at the height of their frequencies, with straight lines, usually forming a polygon

42. Frequency curve
• When the number of observations is very large and the class interval is reduced, the frequency polygon loses its angulations and becomes a smooth curve known as the frequency curve

44. Line diagram
• Line diagrams are used to show the trend of events with the passage of time

45. Line Diagram
[Figure: line diagram of patients with periodontitis (scale 0 to 90) against time points 0 to 5]

46. Cumulative Frequency Diagram
• Graphical representation of cumulative frequency.
• The cumulative frequency of a class is obtained by adding its frequency to the frequencies of all previous classes

47. Cumulative Frequency Diagram
[Figure: cumulative frequency diagram of the prevalence of dental caries (in percent) by age group, 0 to 10 yrs through 60 to 70 yrs]

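As a rough sketch of how cumulative frequencies are built up, the Python lines below reuse the class frequencies from the frequency distribution table on slide 30 (78, 67, 32, 16). The variable names are illustrative only.

```python
from itertools import accumulate

# Class intervals and frequencies from the frequency distribution table (slide 30).
labels = ["0 to 3", "3 to 6", "6 to 9", "9 and above"]
frequencies = [78, 67, 32, 16]

# Each cumulative frequency is the class frequency plus all previous frequencies.
cumulative = list(accumulate(frequencies))

for label, freq, cum in zip(labels, frequencies, cumulative):
    print(f"{label:12} frequency={freq:3} cumulative={cum:3}")
```

The last cumulative frequency always equals the total number of observations, which gives a quick check on the arithmetic.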
48. Scatter or Dot diagram
• Shows the relationship between two variables
• If the dots cluster along a straight line, the relationship is linear in nature

49. Scatter or Dot diagram
[Figure: scatter diagram of sugar exposure (scale 0 to 14) against carious lesions (scale 0 to 15)]

50. Pie chart
• The frequencies of the groups are shown as segments of a circle
• The angle of each segment denotes the frequency
• The angle is calculated by

$$\text{Angle} = \frac{\text{class frequency}}{\text{total observations}} \times 360^\circ$$

51. Pie chart
[Figure: pie chart with segments of 30 (5%), 70 (11%), 200 (31%), 180 (29%) and 150 (24%); legend: PROSTHO, CONSO, PERIO, ORTHO, PEDO]

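The angle formula can be tried out on the five segment values shown on slide 51. This is a minimal sketch; which department each value belongs to cannot be recovered from the transcript, so the values are left unlabelled.

```python
def pie_angles(frequencies):
    """Angle in degrees of each segment: class frequency / total observations * 360."""
    total = sum(frequencies)
    return [f * 360 / total for f in frequencies]


# Segment values from slide 51 (department labels not assigned here).
segment_values = [30, 70, 200, 180, 150]
angles = pie_angles(segment_values)

for value, angle in zip(segment_values, angles):
    print(f"frequency {value:3} -> {angle:6.1f} degrees")
```

The angles should add up to 360 degrees, whatever the frequencies are.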
52. Pictogram
• Popular method of presenting data to the common man

53. Pictogram
[Figure: pictogram with values Delhi 9000, Bombay 11000, Chennai 8000, Kolkatta 5000, Hyderabad 6000, Bangalore 12000, Pune 4000, Lucknow 5000]

54. Spot map or map diagram
• These maps are prepared to show the geographic distribution of the frequencies of a characteristic

55. Spot map or map diagram
[Figure: spot map]

56. Measures of statistical averages or central tendency

57.
• The average value in a distribution is the one central value around which all the other observations are concentrated
• The average value helps
– to find the most characteristic value of a set of measurements
– to find which group is better off, by comparing the average of one group with that of the other

58.
• The most commonly used averages are
– mean
– median
– mode

59. Mean
• Refers to the arithmetic mean
• It is the sum of all the observations divided by the total number of observations (n)
• Denoted by $\bar{x}$ for a sample and µ for a population

$$\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n}$$

• Advantages
– it is easy to calculate
• Disadvantages
– it is influenced by extreme values

60. Median
• When all the observations are arranged in ascending or descending order, the middle observation is known as the median
• For an even number of observations, the average of the two middle values is taken
• The median is a better indicator of the central value when the data contain extreme values, as it is not affected by them

61. Mode
• The most frequently occurring observation in a set of data is called the mode
• Not often used in medical statistics.

62. Example
• Number of decayed teeth in 10 children: 2, 2, 4, 1, 3, 0, 10, 2, 3, 8
• Mean = 35 / 10 = 3.5
• Median: arranged in order (0, 1, 2, 2, 2, 3, 3, 4, 8, 10), the two middle values are 2 and 3, so median = (2 + 3) / 2 = 2.5
• Mode = 2 (occurs 3 times)

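The three averages for this example can be checked with Python's standard `statistics` module. This is only a quick sketch; note that the ten observations add up to 35.

```python
import statistics

# Number of decayed teeth in 10 children (data from the example slide).
teeth = [2, 2, 4, 1, 3, 0, 10, 2, 3, 8]

total = sum(teeth)                    # sum of all observations
mean = statistics.mean(teeth)         # total divided by the number of observations
median = statistics.median(teeth)     # average of the two middle values when n is even
mode = statistics.mode(teeth)         # most frequently occurring observation

print(f"sum={total}, mean={mean}, median={median}, mode={mode}")
```

The two extreme values (8 and 10) pull the mean above the median, which illustrates why the median is preferred when a few observations are unusually large.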
63. Types of variability

64.
• There are three types of variability
– Biological variability
– Real variability
– Experimental variability
• Experimental variability is of three subtypes
– Observer error
– Instrumental error
– Sampling error

65. Biological variability
• It is the natural difference which occurs between individuals due to age, gender and other inherent attributes
• This difference is small, occurs by chance and is within certain accepted biological limits
• E.g. vertical dimension may vary from patient to patient

66. Real Variability
• Such variability exceeds the normal biological limits
• The cause of the difference is not inherent or natural but is due to some external factor
• E.g. the difference in the incidence of cancer between smokers and non-smokers may be due to excessive smoking and not due to chance only

67. Experimental Variability
• It arises from the experimental study itself
• It is of three types
– Observer error
• the investigator may alter some information or record a measurement incorrectly
– Instrumental error
• this is due to defects in the measuring instrument
• observer error and instrumental error are together called non-sampling errors
– Sampling error or error of bias
• this is the error which occurs when the samples are not chosen at random from the population
• the sample then does not truly represent the population

68. Measures of variation or dispersion

69.
• Biological data collected by measurement shows variation
• E.g. the BP of an individual can vary even if taken by a standardized method and measured by the same person.
• Thus one should know what the normal variation is and how to measure it.

70.
• The various measures of variation or dispersion are
– Range
– Mean or average deviation
– Standard deviation
– Coefficient of variation

71. Range
• It is the simplest measure
• Defined as the difference between the highest and the lowest figures in a sample
• Defines the normal limits of a biological characteristic, e.g. freeway space ranges between 2 and 4 mm
• Not satisfactory, as it is based on the two extreme values only

72. Mean deviation
• It is the average of the deviations from the mean in any distribution, ignoring the + or – sign
• Denoted by MD

$$MD = \frac{\sum |x - \bar{x}|}{n}$$

where x = observation, $\bar{x}$ = mean, n = number of observations

73. Standard deviation
• Also called root mean square deviation
• It is an improvement over the mean deviation and is the measure most commonly used in statistical analysis
• Denoted by SD or s for a sample and σ for a population
• Given by the formula

$$SD = \sqrt{\frac{\sum (x - \bar{x})^2}{n \text{ or } n - 1}}$$

74.
• The greater the standard deviation, the greater the dispersion from the mean
• A small standard deviation means a high degree of uniformity of the observations
• Usually measurements beyond the range of mean ± 2 SD are considered rare or unusual in any distribution

75.
• Uses of Standard Deviation
– It summarizes the deviation of a large distribution from its mean.
– It helps in finding a suitable sample size, e.g. a greater deviation indicates the need for a larger sample to draw meaningful conclusions
– It helps in calculating the standard error, which helps us determine whether the difference between two samples is due to chance or real

76. Coefficient of variation
• It is used to compare the variability of attributes having different units of measurement, e.g. height and weight
• Denoted by CV and expressed as a percentage

$$CV = \frac{SD}{\text{Mean}} \times 100$$

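The four measures of dispersion can be sketched in Python on the decayed-teeth data from slide 62. This is an illustration under one stated assumption: the slide allows a divisor of n or n − 1 for the standard deviation, and the sketch uses n − 1, which is what `statistics.stdev` does.

```python
import statistics

# Number of decayed teeth in 10 children (data from slide 62).
teeth = [2, 2, 4, 1, 3, 0, 10, 2, 3, 8]

n = len(teeth)
mean = statistics.mean(teeth)

# Range: highest minus lowest observation.
value_range = max(teeth) - min(teeth)

# Mean deviation: average absolute deviation from the mean.
mean_deviation = sum(abs(x - mean) for x in teeth) / n

# Standard deviation with divisor n - 1 (assumption: sample formula).
sd = statistics.stdev(teeth)

# Coefficient of variation, expressed as a percentage of the mean.
cv = sd / mean * 100

print(f"range={value_range}, MD={mean_deviation:.2f}, SD={sd:.2f}, CV={cv:.1f}%")
```

Replacing `statistics.stdev` with `statistics.pstdev` would give the version of the formula that divides by n.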
77. Normal distribution or normal curve

78.
• Considerable physiologic variation occurs in any observation
• It is therefore necessary to
– Define normal limits
– Determine the chance of an observation being normal
– Determine the proportion of observations that lie within a given range

79.
• The normal distribution or normal curve, used most commonly in statistics, helps us to find these
• A large number of observations with a narrow class interval gives a frequency curve called the normal curve

80.
• It has the following characteristics
• Bell shaped
• Bilaterally symmetrical
• Frequency increases from one side, reaches its highest point and decreases exactly the way it had increased
• The highest point denotes the mean, median and mode, which coincide

82.
• Mean ± 1 SD includes 68.27% of all observations. Such observations are fairly common.
• Mean ± 2 SD includes 95.45% of all observations, i.e. by convention values beyond this range are uncommon or rare. Their chance of being normal is 100 – 95.45%, i.e. only 4.55%.
• Mean ± 3 SD includes 99.73% of all observations. Values beyond this range are very rare. Their chance of being normal is only 0.27%.
• These limits on either side of the mean are called confidence limits

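The three percentages quoted for the normal curve can be reproduced with `statistics.NormalDist` from the Python standard library. The helper name `share_within` is made up for this sketch.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, SD 1.
std_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of observations lying within mean ± k SD."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    inside = share_within(k) * 100
    print(f"mean ± {k} SD: {inside:.2f}% inside, {100 - inside:.2f}% outside")
```

Because the proportions depend only on the number of SDs from the mean, the same figures hold for any normal distribution, whatever its mean and SD.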
85.
• The look of the frequency distribution curve may vary depending on the mean and SD. Thus it becomes necessary to standardize it.
• E.g. one study has an SD of 3 and another has an SD of 2, so it becomes difficult to compare them
• Thus the normal curve is standardized by using the unit of standard deviation to place any measurement with reference to the mean.
• The curve that emerges through this procedure is called the standard normal curve

87. Properties of standard normal curve
• Smooth and bell shaped
• Perfectly symmetrical
• Based on an infinite number of observations, thus the curve does not touch the X axis
• Mean is zero
• SD is always 1
• Total area under the curve is 1
• Mean, median and mode coincide

88.
• The unit of SD here is the relative or standard normal deviate and is denoted by Z

$$Z = \frac{x - \bar{x}}{SD} = \frac{\text{Observation} - \text{Mean}}{SD}$$

89.
• With the help of the Z value we can find the area under the curve from a table
• This area helps to give the P value

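A small sketch of the Z calculation follows, using the pulse-rate figures that appear later in the deck (mean 70 beats per minute, SD 8). The observation of 86 is a hypothetical value chosen for illustration, and `NormalDist` stands in for the printed table of areas.

```python
from statistics import NormalDist


def z_score(observation, mean, sd):
    """Standard normal deviate: (observation - mean) / SD."""
    return (observation - mean) / sd


def area_mean_to_z(z):
    """Area under the standard normal curve between the mean and z."""
    return abs(NormalDist().cdf(z) - 0.5)


# Hypothetical observation of 86 beats/min; mean 70 and SD 8 as in slide 98.
z = z_score(86, mean=70, sd=8)
area = area_mean_to_z(z)

print(f"Z = {z:.2f}, area between mean and Z = {area:.4f}")
```

An observation 16 beats above the mean is 2 SD away, so about 47.7% of the curve lies between the mean and this value.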
92.
• It is not possible to include each and every member of a population, as it would be time consuming, costly and laborious.
• Therefore sampling is done
• Sampling is a process by which some units of a population or universe are selected for study and, by subjecting them to statistical computation, conclusions are drawn about the population from which these units are drawn

93.
• The sample will be representative of the entire population only if
– It is sufficiently large
– It is unbiased
• Such a sample will have its statistics almost equal to the parameters of the entire population
• Two main characteristics of a representative sample are
– Precision
– Unbiased character

94. Precision
• Precision depends on the sample size
• Ordinarily the sample size should not be less than 30

$$\text{Precision} = \frac{\sqrt{n}}{s}$$

where n = sample size, s = standard deviation
• Precision is directly proportional to the square root of the sample size: the greater the sample size, the greater the precision
• Also, the greater the SD, the lower the precision
• Thus in such cases the sample size needs to be increased to obtain precision

95. Unbiased character
• The sample should be unbiased, i.e. every individual should have an equal chance of being selected in the sample.
• Thus a standard random sampling method should be used
• Non-sampling errors can be taken care of by
– Using standardized instruments and criteria
– Single, double or triple blind trials
– Use of a control group

96. Determination of sample size

97. For Quantitative Data
• The investigator needs to decide how large an error due to sampling defect is allowable, i.e. the allowable error L
• The investigator should either start with an assumed SD or do a pilot study to estimate the SD

$$n = \frac{4\,SD^2}{L^2}$$

98. For Quantitative Data
• The mean pulse rate of a population is 70 beats per min with a standard deviation of 8 beats. What will be the sample size if the allowable error is ± 1?

$$n = \frac{4 \times 8 \times 8}{1^2} = 256$$

• If L is smaller, n will be larger, i.e. the larger the sample size, the smaller the error.

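The pulse-rate calculation can be written as a short Python function. This is a sketch of the slide's formula n = 4 SD² / L²; the function name and the rounding up to a whole subject are choices made here, not part of the slides.

```python
import math


def sample_size_quantitative(sd, allowable_error):
    """Sample size for quantitative data: n = 4 * SD^2 / L^2, rounded up."""
    return math.ceil(4 * sd ** 2 / allowable_error ** 2)


# Pulse-rate example: SD = 8 beats/min, allowable error L = 1 beat/min.
n_pulse = sample_size_quantitative(sd=8, allowable_error=1)
print(n_pulse)

# Halving the allowable error quadruples the required sample size.
n_tighter = sample_size_quantitative(sd=8, allowable_error=0.5)
print(n_tighter)
```

Because L is squared in the denominator, small reductions in the allowable error raise the required sample size quickly.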
99. For qualitative data
• In such data we deal with proportions

$$n = \frac{4pq}{L^2}$$

• p = proportion with the positive character
• q = proportion with the negative character
• q = 1 – p (or 100 – p if expressed in percent)
• L = allowable error, usually 10% of p

100. For qualitative data
• E.g. the incidence rate in the last influenza epidemic was found to be 5% of the population exposed
• What should be the size of the sample to find the incidence rate in the current epidemic if the allowable error is 10%?
• p = 5%, q = 95%
• L = 10% of p = 0.5%

$$n = \frac{4 \times 5 \times 95}{(0.5)^2} = 7600$$

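The influenza example can be reproduced the same way. This sketch assumes p and L are both given in percent, as on the slide, and that L defaults to 10% of p when not specified; the function name is illustrative.

```python
import math


def sample_size_qualitative(p_percent, allowable_error_percent=None):
    """Sample size for qualitative data: n = 4 * p * q / L^2, with p, q, L in percent."""
    q_percent = 100 - p_percent
    if allowable_error_percent is None:
        # Assumption from the slide: L is usually taken as 10% of p.
        allowable_error_percent = 0.10 * p_percent
    return math.ceil(4 * p_percent * q_percent / allowable_error_percent ** 2)


# Influenza example: p = 5%, so q = 95% and L = 0.5%.
n_influenza = sample_size_qualitative(5)
print(n_influenza)
```

A rare attribute (small p) with a proportionally small allowable error needs a large sample, which is why the answer here is in the thousands.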
101. Probability or p value

102.
• The concept of probability is very important in statistics
• Probability is the chance of occurrence of any event or permutation combination.
• It is denoted by p for a sample and P for a population
• In various tests of significance we are often interested to know whether the observed difference between 2 samples is real or is due to chance (sampling variation).
• Here the probability or p value is used

103.
• P ranges from 0 to 1
• 0 = there is no chance that the observed difference could be due to sampling variation
• 1 = it is absolutely certain that the observed difference between 2 samples is due to sampling variation
• However, such extreme values are rare.

104.
• P = 0.4 means the chance that the difference is due to sampling variation is 4 in 10
• Obviously the chance that it is not due to sampling variation will be 6 in 10
• The essence of any test of significance is to find the p value and draw an inference

105.
• If the p value is 0.05 or more
– it is customary to accept that the difference is due to chance (sampling variation).
– The observed difference is said to be statistically not significant.
• If the p value is less than 0.05
– the observed difference is taken to be not due to chance but due to the role of some external factor.
– The observed difference is said to be statistically significant.

106. Determination of p value
• From the shape of the normal curve
– We know that about 95% of observations lie within mean ± 2 SD. Thus the probability of a value above or below this range is about 5%
• From probability tables
– The p value is also determined from probability tables in the case of the Student t test or the chi square test

107. Determination of p value
• By area under the normal curve
– Here z, the standard normal deviate, is calculated
– Corresponding to the z value, the area under the curve between the mean and z is determined (A)
– The probability is given by 2(0.5 – A)
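The last rule can be sketched in Python, with `statistics.NormalDist` taking the place of the printed table. This assumes A is the area between the mean and z, so that 0.5 − A is the area in one tail and doubling it covers both tails; the function names are illustrative.

```python
from statistics import NormalDist


def two_sided_p(z):
    """p value from a standard normal deviate: 2 * (0.5 - A)."""
    area = abs(NormalDist().cdf(z) - 0.5)  # A: area between the mean and z
    return 2 * (0.5 - area)


def is_significant(p, level=0.05):
    """Customary rule from the slides: significant when p is less than 0.05."""
    return p < level


for z in (1.0, 1.96, 2.0, 3.0):
    p = two_sided_p(z)
    print(f"z = {z:4.2f}: p = {p:.4f}, significant: {is_significant(p)}")
```

A deviate of 2 gives p of about 0.0455, matching the 4.55% of observations that lie beyond mean ± 2 SD.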