Bio statistics1

Published in: Technology, Health & Medicine

  1. 1. Bio Statistics Part 1 INDIAN DENTAL ACADEMY Leader in continuing dental education www.indiandentalacademy.com
  2. 2. Contents • Introduction • Common Statistical Terms • Source of data • Types of data • Data presentation • Measures of statistical averages or central tendency • Types of variability • Measures of variation or dispersion • Normal distribution or normal curve • Sampling • Determination of sample size • Probability or p value
  3. 3. Introduction
  4. 4. • Any science needs precision for its development. • For precision, facts, observations or measurements have to be expressed in figures. • “It has been said when you can measure what you are speaking about and express it in numbers, you know something about it, but when you cannot express it in numbers your knowledge is of meagre and unsatisfactory kind.” - Lord Kelvin
  5. 5. • Similarly in medicine, be it diagnosis, treatment or research, everything depends on measurement. • E.g. you have to count the number of missing teeth or measure the vertical dimension and express it as a number so that it makes sense.
  6. 6. • A statistic or datum is a measured or counted fact or piece of information stated as a figure, such as the height of one person or the birth weight of a baby. • Statistics or data is the plural of the same. • Statistics is also the science of figures. • Biostatistics is the term used when the tools of statistics are applied to data derived from biological sciences such as medicine.
  7. 7. Applications and uses of bio statistics as a science • In physiology and anatomy – To define the limits of normality for variables such as height, weight or blood pressure in a population. – Variation beyond natural limits may be pathological, i.e. abnormal due to the play of certain external factors. – To find the correlation between two variables such as height and weight.
  8. 8. Applications and uses of bio statistics as a science • In pharmacology – To find the action of drugs – To compare the action of two drugs or two successive dosages of the same drug – To find the relative potency of a new drug with respect to a standard drug
  9. 9. Applications and uses of bio statistics as a science • In medicine – To compare the efficacy of a particular drug, operation or line of treatment – To find an association between two attributes such as cancer and smoking – To identify signs and symptoms of disease
  10. 10. Applications and uses of bio statistics as a science • In community medicine and public health – To test the usefulness of sera or vaccines in the field – In epidemiologic studies, the role of causative factors is statistically tested
  11. 11. Applications and uses of bio statistics as a science • In research – It helps in compilation of data, drawing conclusions and making recommendations.
  12. 12. Applications and uses of bio statistics as a science • For students – By learning the methods of biostatistics, a student learns to evaluate articles published in medical and dental journals or papers read at medical and dental conferences. – The student also understands the basic methods of observation in clinical practice and research.
  13. 13. Common Statistical Terms
  14. 14. Common Statistical Terms • Constant – Quantities that do not vary, e.g. in biostatistics the mean and standard deviation are considered constant for a population • Variable – A characteristic which takes different values for different persons, places or things, such as height, weight or blood pressure • Population – Includes all persons, events and objects under study. It may be finite or infinite.
  15. 15. Common Statistical Terms • Sample – A part of a population, generally selected so as to be representative of the population whose variables are under study • Parameter – A constant that describes a population, e.g. in a college 40% of the students are girls. This describes the population, hence it is a parameter.
  16. 16. Common Statistical Terms • Statistic – A value that describes a sample, e.g. in a sample of 200 students of the same college, 45% are girls. This 45% is a statistic as it describes the sample • Attribute – A characteristic by which the population can be divided into categories or classes, e.g. gender, caste, religion.
  17. 17. Source of data
  18. 18. Source of data • The main sources for collection of data – Experiments – Surveys – Records • Experiments – Experiments are performed by one or more workers to collect data for investigations and research.
  19. 19. Source of data • Surveys – Carried out for epidemiological studies in the field by trained teams to find the incidence or prevalence of health or disease in a community. • Records – Records are maintained as a routine in registers and books over a long period of time and provide ready-made data.
  20. 20. Types of data
  21. 21. Types of data • Data is of two types • Qualitative or discrete data • Quantitative or continuous data
  22. 22. Types of data • Qualitative or discrete data – In such data there is no notion of magnitude or size of an attribute, as the attribute cannot be measured. – The number of persons having the same attribute varies and is counted – e.g. out of 100 people, 75 have class I occlusion, 15 have class II occlusion and 10 have class III occlusion. – Classes I, II and III are attributes which cannot be measured in figures; only the number of people having each can be determined
  23. 23. Types of data • Quantitative or continuous data – Here the attribute has a magnitude. Both the attribute and the number of persons having the attribute vary – E.g. freeway space. It varies for every patient. It is a quantity with a different value for each individual and is measurable. It is continuous as it can take any value between 2 and 4, such as 2.10, 2.55 or 3.07.
  24. 24. Data presentation
  25. 25. Data presentation • Statistical data once collected should be systematically arranged and presented – To arouse interest of readers – For data reduction – To bring out important points clearly and strikingly – For easy grasp and meaningful conclusions – To facilitate further analysis – To facilitate communication
  26. 26. Data presentation • Two main types of data presentation are – Tabulation – Graphic representation with charts and diagrams
  27. 27. Data presentation Tabulation • It is the most common method • Data is presented in the form of columns and rows • It can be of the following types – Simple tables – Frequency distribution tables
  28. 28. Simple Table: Number of patients at KIDS, Bgm
      Month      Patients
      Jan 06     2,800
      Feb 06     1,900
      March 06   1,750
  29. 29. Frequency distribution table • In a frequency distribution table, the data is first split into convenient groups (class intervals) and the number of items (frequency) which occur in each group is shown in the adjacent column.
  30. 30. Frequency distribution table
      Number of Cavities   Number of Patients
      0 to 3               78
      3 to 6               67
      6 to 9               32
      9 and above          16
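
The grouping step behind a frequency distribution table can be sketched in Python. This is only an illustration under stated assumptions: the function name `frequency_table` and the sample values are hypothetical, and the rule that a boundary value such as 3 falls in the upper class ("3 to 6") is an assumption, since the slide does not say which class a boundary value belongs to.

```python
from collections import Counter

def frequency_table(values, width, open_class_from):
    """Count values into class intervals of equal width.

    Assumption: each class includes its lower limit and excludes its
    upper limit; values >= open_class_from share one open-ended class.
    """
    counts = Counter()
    for v in values:
        if v >= open_class_from:
            counts[f"{open_class_from} and above"] += 1
        else:
            low = (v // width) * width
            counts[f"{low} to {low + width}"] += 1
    return dict(counts)

# Hypothetical number of cavities per patient (not data from the slides).
cavities = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 12]
table = frequency_table(cavities, width=3, open_class_from=9)

for class_interval, frequency in table.items():
    print(f"{class_interval:12} {frequency}")
```

Whatever boundary rule is chosen, it should be stated with the table so that every observation falls in exactly one class.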
  31. 31. Data presentation Charts and diagrams • Useful method of presenting statistical data • Powerful impact on the imagination of people
  32. 32. Charts and diagrams • They are – Bar chart – Histogram – Frequency polygon – Frequency curve – Line diagram – Cumulative frequency diagram or ogive – Scatter diagram – Pie chart – Pictogram – Spot map or map diagram
  33. 33. Bar chart • Length of the bars, drawn vertically or horizontally, is proportional to the frequency of the variable • A suitable scale is chosen • Bars are usually equally spaced
  34. 34. Bar chart • They are of three types – Simple bar chart – Multiple bar chart • two or more variables are grouped together – Component bar chart • bars are divided into two or more parts • each part represents a certain item and is proportional to the magnitude of that item
  35. 35. Simple bar chart [Figure: number of CD patients per quarter, 1st Qtr to 4th Qtr]
  36. 36. Multiple bar chart [Figure: CD, RPD and FPD patients per quarter, 1st Qtr to 4th Qtr]
  37. 37. Component bar chart [Figure: patients to prostho and patients to other departments per quarter, 1st Qtr to 4th Qtr]
  38. 38. Histogram • Pictorial presentation of a frequency distribution • Consists of a series of rectangles • Class intervals are given on the horizontal axis and frequencies on the vertical axis • Area of each rectangle is proportional to the frequency
  39. 39. Histogram [Figure: histogram of number of carious lesions, class intervals 0 to 3 through 24 to 27]
  40. 40. Frequency polygon • Obtained by joining the midpoints of the tops of the histogram blocks (at the height of each frequency) with straight lines, usually forming a polygon
  41. 41. Frequency polygon [Figure]
  42. 42. Frequency curve • When the number of observations is very large and the class interval is reduced, the frequency polygon loses its angulations and becomes a smooth curve known as a frequency curve
  43. 43. Frequency curve [Figure]
  44. 44. Line diagram • Line diagrams are used to show the trend of events with the passage of time
  45. 45. Line Diagram [Figure: patients with periodontitis over time]
  46. 46. Cumulative Frequency Diagram • Graphical representation of cumulative frequency • The cumulative frequency of a class is obtained by adding its frequency to the frequencies of all previous classes
  47. 47. Cumulative Frequency Diagram [Figure: prevalence of dental caries (in percent) by age group, 0 to 10 yrs through 60 to 70 yrs]
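
As a small illustration of how cumulative frequencies are built up, the sketch below applies a running total to the class frequencies from the frequency distribution table on slide 30 (78, 67, 32, 16). The variable names are illustrative only.

```python
from itertools import accumulate

# Class intervals and frequencies from the frequency distribution table (slide 30).
classes = ["0 to 3", "3 to 6", "6 to 9", "9 and above"]
frequencies = [78, 67, 32, 16]

# Running total: each class frequency is added to all the previous ones.
cumulative = list(accumulate(frequencies))

for label, cf in zip(classes, cumulative):
    print(f"{label:12} {cf}")
```

The last cumulative frequency always equals the total number of observations, which is a quick check on the arithmetic.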
  48. 48. Scatter or Dot diagram • Shows the relationship between two variables • If the dots cluster along a straight line, the relationship is linear
  49. 49. Scatter or Dot diagram [Figure: sugar exposure plotted against carious lesions]
  50. 50. Pie chart • Frequencies of the groups are shown as segments of a circle • The angle of each segment denotes the frequency • The angle is calculated by $\text{angle} = \dfrac{\text{class frequency}}{\text{total observations}} \times 360^\circ$
  51. 51. Pie chart [Figure: departments PROSTHO, CONSO, PERIO, ORTHO, PEDO; segment labels 30 (5%), 70 (11%), 200 (31%), 180 (29%), 150 (24%)]
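
The angle rule for a pie chart can be checked with a short Python sketch. It uses the five segment values shown on the pie chart slide (30, 70, 200, 180, 150); which department each value belongs to cannot be recovered from the transcript, so the segments are left unlabeled. The function name `pie_angles` is illustrative.

```python
def pie_angles(frequencies):
    """Return the segment angle in degrees for each class frequency."""
    total = sum(frequencies)
    return [f / total * 360 for f in frequencies]

# Segment values from the pie chart slide; department labels are not assigned.
frequencies = [30, 70, 200, 180, 150]
angles = pie_angles(frequencies)

for f, angle in zip(frequencies, angles):
    print(f"frequency {f:3} -> {angle:.1f} degrees")
```

The angles should add up to 360 degrees, whatever the frequencies are.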
  52. 52. Pictogram • Popular method of presenting data to the common man
  53. 53. Pictogram • Delhi 9000 • Bombay 11000 • Chennai 8000 • Kolkatta 5000 • Hyderabad 6000 • Bangalore 12000 • Pune 4000 • Lucknow 5000
  54. 54. Spot map or map diagram • These maps are prepared to show the geographic distribution of frequencies of a characteristic
  55. 55. Spot map or map diagram [Figure]
  56. 56. Measures of statistical averages or central tendency
  57. 57. • The average value in a distribution is the one central value around which all the other observations are concentrated • The average value helps – to find the most characteristic value of a set of measurements – to find which group is better off by comparing the average of one group with that of the other
  58. 58. • The most commonly used averages are – mean – median – mode
  59. 59. Mean • Refers to the arithmetic mean • It is the sum of all the observations divided by the total number of observations (n) • Denoted by $\bar{x}$ for a sample and $\mu$ for a population • $\bar{x} = \dfrac{x_1 + x_2 + x_3 + \dots + x_n}{n}$ • Advantages – it is easy to calculate • Disadvantages – influenced by extreme values
  60. 60. Median • When all the observations are arranged in either ascending or descending order, the middle observation is known as the median • With an even number of observations, the average of the two middle values is taken • The median is a better indicator of the central value when the data contain extreme values, as it is not affected by them
  61. 61. Mode • The most frequently occurring observation in a data set is called the mode • Not often used in medical statistics.
  62. 62. Example • Number of decayed teeth in 10 children: 2, 2, 4, 1, 3, 0, 10, 2, 3, 8 • Mean = 35 / 10 = 3.5 • Median: the ordered values are 0, 1, 2, 2, 2, 3, 3, 4, 8, 10, so median = (2 + 3) / 2 = 2.5 • Mode = 2 (occurs 3 times)
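
The three averages for the decayed-teeth data can be verified with Python's standard `statistics` module. This is just a check of the arithmetic; the ten values add up to 35, so the mean works out to 3.5.

```python
import statistics

# Number of decayed teeth in 10 children (slide 62).
decayed_teeth = [2, 2, 4, 1, 3, 0, 10, 2, 3, 8]

mean = statistics.mean(decayed_teeth)      # sum of observations / n
median = statistics.median(decayed_teeth)  # average of the two middle values
mode = statistics.mode(decayed_teeth)      # most frequently occurring value

print(f"sum = {sum(decayed_teeth)}, n = {len(decayed_teeth)}")
print(f"mean = {mean}, median = {median}, mode = {mode}")
```

Note how the two extreme values (8 and 10) pull the mean above the median, which is the disadvantage of the mean mentioned earlier.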
  63. 63. Types of variability
  64. 64. • There are three types of variability – Biological variability – Real variability – Experimental variability • Experimental variability is of three subtypes – Observer error – Instrumental error – Sampling error
  65. 65. Biological variability • It is the natural difference which occurs between individuals due to age, gender and other inherent attributes • This difference is small, occurs by chance and is within certain accepted biological limits • e.g. vertical dimension may vary from patient to patient
  66. 66. Real Variability • Such variability exceeds the normal biological limits • The cause of the difference is not inherent or natural but is due to some external factor • e.g. the difference in incidence of cancer between smokers and non-smokers may be due to excessive smoking and not to chance alone
  67. 67. Experimental Variability • It arises from the experimental study itself • It is of three types – Observer error • the investigator may alter some information or not record the measurement correctly – Instrumental error • this is due to defects in the measuring instrument • observer and instrumental errors together are called non-sampling errors – Sampling error or error of bias • this occurs when the samples are not chosen at random from the population • thus the sample does not truly represent the population
  68. 68. Measures of variation or dispersion
  69. 69. • Biological data collected by measurement show variation • e.g. the BP of an individual can vary even if taken by a standardized method and measured by the same person • Thus one should know what the normal variation is and how to measure it.
  70. 70. • The various measures of variation or dispersion are – Range – Mean or average deviation – Standard deviation – Coefficient of variation
  71. 71. Range • It is the simplest measure • Defined as the difference between the highest and the lowest figures in a sample • Defines the normal limits of a biological characteristic, e.g. freeway space ranges between 2 and 4 mm • Not satisfactory, as it is based on the two extreme values only
  72. 72. Mean deviation • It is the average of the differences or deviations from the mean in any distribution, ignoring the + or – sign • Denoted by MD • $MD = \dfrac{\sum |x - \bar{x}|}{n}$ • $x$ = observation, $\bar{x}$ = mean, $n$ = number of observations
  73. 73. Standard deviation • Also called root mean square deviation • It is an improvement over mean deviation and is the measure most commonly used in statistical analysis • Denoted by SD or s for a sample and $\sigma$ for a population • Given by the formula $SD = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n}}$, with $n - 1$ used in place of $n$ in the denominator for a sample
  74. 74. • The greater the standard deviation, the greater the dispersion from the mean • A small standard deviation means a high degree of uniformity of the observations • Usually measurements beyond the range of mean ± 2 SD are considered rare or unusual in any distribution
  75. 75. • Uses of Standard Deviation – It summarizes the deviation of a large distribution from its mean. – It helps in finding a suitable sample size, e.g. a greater deviation indicates the need for a larger sample to draw meaningful conclusions – It helps in the calculation of standard error, which helps us determine whether the difference between two samples is by chance or real
  76. 76. Coefficient of variation • It is used to compare the variability of characteristics having different units of measurement, e.g. height and weight • Denoted by CV • $CV = \dfrac{SD}{\text{Mean}} \times 100$ • It is expressed as a percentage
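
The four measures of dispersion can be worked through on one data set. The sketch below reuses the decayed-teeth values from slide 62 purely as an illustration; the slides themselves do not compute dispersion for these values. It assumes the sample form of the SD (divisor n - 1) for the coefficient of variation and also shows the population form (divisor n) for comparison.

```python
import statistics

# Decayed teeth in 10 children (slide 62), reused here as illustrative data.
data = [2, 2, 4, 1, 3, 0, 10, 2, 3, 8]
n = len(data)
mean = statistics.mean(data)

value_range = max(data) - min(data)                      # highest - lowest
mean_deviation = sum(abs(x - mean) for x in data) / n    # signs ignored
sd_sample = statistics.stdev(data)                       # divisor n - 1
sd_population = statistics.pstdev(data)                  # divisor n
cv_percent = sd_sample / mean * 100                      # SD as % of mean

print(f"range = {value_range}")
print(f"mean deviation = {mean_deviation:.2f}")
print(f"SD (sample) = {sd_sample:.2f}, SD (population) = {sd_population:.2f}")
print(f"CV = {cv_percent:.1f}%")
```

Because the CV is a ratio of two quantities in the same unit, it is unit-free, which is why it can compare characteristics such as height and weight.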
  77. 77. Normal distribution or normal curve
  78. 78. • So much physiologic variation occurs in any observation that it is necessary to – Define normal limits – Determine the chances of an observation being normal – Determine the proportion of observations that lie within a given range
  79. 79. • The normal distribution or normal curve, used most commonly in statistics, helps us to find these • A large number of observations with a narrow class interval gives a frequency curve called the normal curve
  80. 80. • It has the following characteristics • Bell shaped • Bilaterally symmetrical • Frequency increases from one side, reaches its highest point and decreases exactly the way it had increased • The highest point denotes the mean, median and mode, which coincide
  82. 82. • Mean ± 1 SD includes 68.27% of all observations. Such observations are fairly common • Mean ± 2 SD includes 95.45% of all observations, i.e. by convention values beyond this range are uncommon or rare. Their chance of being normal is 100 – 95.45%, i.e. only 4.55% • Mean ± 3 SD includes 99.73% of all observations. Values beyond this range are very rare. Their chance of being normal is only 0.27% • These limits on either side of the mean are called confidence limits
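
The three percentages can be reproduced from the normal curve itself rather than taken on trust. The sketch below uses the standard identity that the proportion of a normal distribution within k SDs of the mean is erf(k / √2); the function name `coverage_within` is illustrative.

```python
import math

def coverage_within(k):
    """Proportion of a normal distribution lying within mean ± k SD."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    inside = 100 * coverage_within(k)
    print(f"mean ± {k} SD: {inside:.2f}% inside, {100 - inside:.2f}% outside")
```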
  85. 85. • The look of a frequency distribution curve may vary depending on the mean and SD, so it becomes necessary to standardize it. • E.g. one study has an SD of 3 and another an SD of 2, which makes them difficult to compare • The normal curve is therefore standardized by using the standard deviation as the unit to place any measurement with reference to the mean. • The curve that emerges through this procedure is called the standard normal curve
  87. 87. Properties of standard normal curve • Smooth and bell shaped • Perfectly symmetrical • Based on an infinite number of observations, thus the curve does not touch the X axis • Mean is zero • SD is always 1 • Total area under the curve is 1 • Mean, median and mode coincide
  88. 88. • The unit of SD here is the relative or standard normal deviate and is denoted by Z • $Z = \dfrac{x - \bar{x}}{SD} = \dfrac{\text{Observation} - \text{Mean}}{SD}$
  89. 89. • With the help of the Z value we can find the area under the curve from a table • This area helps to give the P value
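
In place of a printed table, the area for a given Z can be computed with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The example is a sketch under stated assumptions: the mean of 70 and SD of 8 are borrowed from the pulse-rate example later in the slides, while the observation of 86 beats per minute is hypothetical.

```python
from statistics import NormalDist

def z_score(observation, mean, sd):
    """Standard normal deviate: distance from the mean in SD units."""
    return (observation - mean) / sd

def area_mean_to_z(z):
    """Area under the standard normal curve between the mean (0) and z."""
    return abs(NormalDist().cdf(z) - 0.5)

# Hypothetical observation of 86 with mean 70 and SD 8.
z = z_score(86, mean=70, sd=8)
area = area_mean_to_z(z)

print(f"Z = {z}, area between mean and Z = {area:.4f}")
```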
  91. 91. Sampling
  92. 92. • It is not possible to include each and every member of a population, as it would be time consuming, costly and laborious • Therefore sampling is done • Sampling is a process by which some units of a population or universe are selected for study and, by subjecting them to statistical computation, conclusions are drawn about the population from which these units are drawn
  93. 93. • The sample will be representative of the entire population only if • It is sufficiently large • It is unbiased • Such a sample will have its statistics almost equal to the parameters of the entire population • Two main characteristics of a representative sample are – Precision – Unbiased character
  94. 94. Precision • Precision depends on sample size • Ordinarily the sample size should not be less than 30 • $\text{Precision} = \dfrac{\sqrt{n}}{s}$, where $n$ = sample size and $s$ = standard deviation • Precision is directly proportional to the square root of the sample size: the greater the sample size, the greater the precision • Also, the greater the SD, the lower the precision • Thus in such cases the sample size needs to be increased to obtain precision
  95. 95. Unbiased character • The sample should be unbiased, i.e. every individual should have an equal chance of being selected in the sample • Thus a standard random sampling method should be used • Non-sampling errors can be taken care of by – Using standardized instruments and criteria – Single, double or triple blind trials – Use of a control group
  96. 96. Determination of sample size
  97. 97. For Quantitative Data • The investigator needs to decide how large an error due to sampling defect is allowable, i.e. the allowable error L • The investigator should either start with an assumed SD or do a pilot study to estimate the SD • $\text{sample size } n = \dfrac{4\,SD^2}{L^2}$
  98. 98. For Quantitative Data • The mean pulse rate of a population is 70 beats per min with a standard deviation of 8 beats. What will be the sample size if the allowable error is ± 1? • $n = \dfrac{4 \times 8 \times 8}{1^2} = 256$ • If L is smaller, n will be larger, i.e. the larger the sample size, the smaller the error.
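
The pulse-rate example can be reproduced with a few lines of Python. The function name `sample_size_quantitative` is illustrative, and rounding up to a whole subject is an assumption added here; the second call with L = 0.5 is a hypothetical variation to show how halving the allowable error quadruples the sample size.

```python
import math

def sample_size_quantitative(sd, allowable_error):
    """Sample size n = 4 * SD^2 / L^2, rounded up to a whole subject."""
    return math.ceil(4 * sd ** 2 / allowable_error ** 2)

n_slide = sample_size_quantitative(sd=8, allowable_error=1)      # slide example
n_tighter = sample_size_quantitative(sd=8, allowable_error=0.5)  # hypothetical

print(f"L = 1   -> n = {n_slide}")
print(f"L = 0.5 -> n = {n_tighter}")
```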
  99. 99. For qualitative data • In such data we deal with proportions • $\text{Sample size } n = \dfrac{4pq}{L^2}$ • p = proportion of positive character • q = proportion of negative character • q = 1 – p (or 100 – p if expressed in percent) • L = allowable error, usually 10% of p
  100. 100. For qualitative data • E.g. the incidence rate in the last influenza epidemic was found to be 5% of the population exposed • What should be the size of the sample to find the incidence rate in the current epidemic if the allowable error is 10%? • p = 5%, q = 95% • L = 10% of p = 0.5% • $n = \dfrac{4 \times 5 \times 95}{0.5^2} = 7600$
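
The influenza example follows the same pattern for proportions. In this sketch p, q and L are all expressed in percent, as on the slide; the function name `sample_size_qualitative` is illustrative and rounding up to a whole subject is an assumption.

```python
import math

def sample_size_qualitative(p_percent, allowable_error_percent):
    """Sample size n = 4 * p * q / L^2, with p, q and L in percent."""
    q_percent = 100 - p_percent
    return math.ceil(4 * p_percent * q_percent / allowable_error_percent ** 2)

# Slide example: p = 5%, so q = 95%, and L = 10% of p = 0.5%.
n = sample_size_qualitative(p_percent=5, allowable_error_percent=0.5)
print(f"n = {n}")
```

The small allowable error (0.5%) is what drives the sample size up to several thousand.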
  101. 101. Probability or p value
  102. 102. • The concept of probability is very important in statistics • Probability is the chance of occurrence of any event or permutation combination. • It is denoted by p for a sample and P for a population • In various tests of significance we are often interested to know whether the observed difference between 2 samples is real or due to chance (sampling variation). • There the probability or p value is used
  103. 103. • P ranges from 0 to 1 • 0 = there is no chance that the observed difference could be due to sampling variation • 1 = it is absolutely certain that the observed difference between 2 samples is due to sampling variation • However such extreme values are rare.
  104. 104. • P = 0.4 means the chances that the difference is due to sampling variation are 4 in 10 • Obviously the chances that it is not due to sampling variation will be 6 in 10 • The essence of any test of significance is to find the p value and draw an inference
  105. 105. • If the p value is 0.05 or more – it is customary to accept that the difference is due to chance (sampling variation). – The observed difference is said to be statistically not significant. • If the p value is less than 0.05 – the observed difference is taken to be due not to chance but to the role of some external factors. – The observed difference is said to be statistically significant.
  106. 106. Determination of p value • From the shape of the normal curve • We know that about 95% of observations lie within mean ± 2 SD. Thus the probability of a value falling outside this range is about 5% • From probability tables • The p value is also determined from probability tables in the case of Student's t test or the chi-square test
  107. 107. Determination of p value • By area under the normal curve • Here z, the standard normal deviate, is calculated • The area under the curve between the mean and z is determined (A) • Probability is given by 2(0.5 – A)
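
The rule p = 2(0.5 – A) can be sketched in Python, taking A as the area between the mean and z on the standard normal curve. The function name `two_tailed_p` and the example values of z are illustrative; `statistics.NormalDist` stands in for the printed table.

```python
from statistics import NormalDist

def two_tailed_p(z):
    """p = 2 * (0.5 - A), where A is the area between the mean and z."""
    area = NormalDist().cdf(abs(z)) - 0.5
    return 2 * (0.5 - area)

for z in (1.0, 1.96, 2.0, 3.0):
    p = two_tailed_p(z)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"z = {z:4} -> p = {p:.4f} ({verdict} at the 0.05 level)")
```

A value of z close to 1.96 gives p close to 0.05, which is the exact cut-off behind the rounded "mean ± 2 SD" rule.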
  108. 108. Thank you For more details please visit www.indiandentalacademy.com
