Chapter 1: Introduction to Medical Statistics

Published in: Technology

  1. 1. <ul><li>Introduction to Medical Statistics </li></ul><ul><li>Chapter-1 </li></ul>
  2. 4. <ul><li>Statistics : </li></ul><ul><li>The discipline concerned with the treatment of numerical data derived from groups of individuals (P. Armitage). </li></ul><ul><li>  </li></ul><ul><li>The science and art of dealing with variation in data through collection, classification and analysis in such a way as to obtain reliable results ( JM Last). </li></ul>
  3. 5. <ul><li>Medical Statistics: </li></ul><ul><li>The application of mathematical statistics in the field of medicine. </li></ul><ul><li>  </li></ul><ul><li>Why do we need to study statistics? </li></ul><ul><li>Three reasons: </li></ul><ul><li>(1) It is a basic requirement of medical research. </li></ul><ul><li>(2) It helps you keep your medical knowledge up to date. </li></ul><ul><li>(3) It is needed for data management and treatment. </li></ul>
  4. 6. Basic concepts: 1. Homogeneity and Variation <ul><li>Homogeneity: all individuals have similar characteristics and belong to the same category. </li></ul><ul><li>Variation: the differences among individuals, for example in height or weight. </li></ul>
  5. 7. 2. Population and sample <ul><li>Population: the whole collection of individuals that one intends to study. It is homogeneous, but with variation. </li></ul><ul><li>Sample: a representative part of the population. </li></ul><ul><li>Randomization: an important way to make the sample representative. </li></ul>
  6. 8. Random <ul><li>By chance! </li></ul><ul><li>Random event: an event that may or may not occur in a single experiment. </li></ul><ul><li>Before an experiment, nobody can be sure whether the event will occur. </li></ul><ul><li>However, some regularity emerges over a large number of experiments. </li></ul>
  7. 9. 3. Probability <ul><li>A measure of how likely a random event is to occur. </li></ul><ul><li>A : a random event </li></ul><ul><li>P(A) : the probability of the random event A </li></ul><ul><li>P(A) = 1 if the event always occurs. </li></ul><ul><li>P(A) = 0 if the event never occurs. </li></ul>
  8. 10. Estimation of Probability: Frequency <ul><li>n : number of observations (large enough) </li></ul><ul><li>m : number of occurrences of the random event A </li></ul><ul><li>m / n : relative frequency (or simply frequency) of the random event A </li></ul><ul><li>P(A) ≈ m / n </li></ul>
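The idea that the relative frequency m / n settles near P(A) as n grows can be illustrated with a small simulation. This is only a sketch: the value 0.3 for P(A), the function name and the fixed seed are assumptions made for the illustration and do not come from the slides.

```python
import random

def relative_frequency(p, n, seed=0):
    """Simulate n independent trials in which event A occurs with
    probability p, and return the relative frequency m / n."""
    rng = random.Random(seed)  # fixed seed so that runs are reproducible
    m = sum(1 for _ in range(n) if rng.random() < p)
    return m / n

TRUE_P = 0.3  # hypothetical P(A), chosen only for this illustration

# The relative frequency fluctuates for small n and stabilises for large n.
for n in (10, 100, 1_000, 100_000):
    print(n, relative_frequency(TRUE_P, n))
```

With small n the printed value can be far from 0.3; with large n it should be close to it, which is the sense in which frequency estimates probability.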
  9. 11. 4. Parameter and statistic <ul><li>Parameter: a measure of a certain property of the population. It is usually denoted by a Greek letter, such as μ or π, and is usually unknown. </li></ul><ul><li>Statistic: a measure of a certain property of a sample. It is usually denoted by a Latin letter, such as s or p. </li></ul>
  10. 12. The Basic Steps of Statistical Work <ul><li>1. Design of study </li></ul><ul><li>2. Collection of data </li></ul><ul><li>3. Data Sorting </li></ul><ul><li>4. Data Analysis </li></ul>
  11. 13. About This Course: Teaching and Learning <ul><li>Aim : </li></ul><ul><li>Training statistical thinking </li></ul><ul><li>Learning some skills for dealing with medical data </li></ul><ul><li>Focus : </li></ul><ul><li>Essential concepts and statistical thinking: lectures and practice sessions </li></ul><ul><li>Skills with computers and statistical software: practice sessions </li></ul><ul><li>Practice sessions: experiments and discussion </li></ul>
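  12. 14. <ul><li>Textbook </li></ul><ul><li>Fang Ji-Qian. Medical Statistics and Computer Experiments. Singapore: World Scientific, 2005. </li></ul><ul><li>Reference books </li></ul><ul><li>1. Fang Ji-Qian (方积乾). Medical Statistics and Computer Experiments, 3rd ed. (in Chinese). Shanghai: Shanghai Scientific and Technical Publishers, 2006. </li></ul><ul><li>2. Fang Ji-Qian (方积乾). Statistical Methods for Biomedical Research (in Chinese). Beijing: Higher Education Press, 2007. </li></ul>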
  13. 15. <ul><li>SPSS software download address: </li></ul><ul><li>ftp://202.116.102.7/pub/ 统计工具 /spss11 </li></ul><ul><li>Courseware download address: </li></ul><ul><li>ftp://202.116.102.6/ 统计 /2005 全英班 </li></ul><ul><li>The computer startup password is the last four digits of your student ID. </li></ul>
  14. 16. Chapter   1 Descriptive Statistics
  15. 20. Chapter 1 Descriptive Statistics <ul><li>Statistics comprises statistical description and statistical inference. </li></ul><ul><li>Statistical description: </li></ul><ul><li>Describes the features of the sample. </li></ul><ul><li>Main forms: tables, plots and numerical indexes </li></ul>
  16. 21. 1.1 Variables and Data <ul><li>1.1.1 Structure and feature of data </li></ul>
  17. 22. <ul><li>(1) Basic observed unit </li></ul><ul><li>A patient is defined as an observed unit. </li></ul><ul><li>(2) Recording items </li></ul><ul><li>Group: treatment </li></ul><ul><li>Response variables ( 反应变量 ): systolic pressure, diastolic pressure, ECG and effectiveness </li></ul><ul><li>Covariates ( 伴随变量 ): age and gender </li></ul><ul><li>Variables describe the properties of individuals. </li></ul><ul><li>Different types of variables → different statistical methods </li></ul>
  18. 23. 1.1.2 Types of variables <ul><li>1. Quantitative variable ( 定量变量 ) </li></ul><ul><li>Continuous variable ( 连续变量 ) </li></ul><ul><li>Takes values in a continuous interval. The values are obtained through measurement: height, weight, blood pressure, pulse and so on. </li></ul><ul><li>Discrete variable ( 离散变量 ) </li></ul><ul><li>Takes values in a set of integers. </li></ul><ul><li>Example 1.1 Gender can be represented by a binary variable X . </li></ul><ul><li>A binary variable is the simplest special case of a discrete variable. </li></ul>
  19. 24. <ul><li>2. Qualitative Variable ( 定性变量 ) </li></ul><ul><li>Categorical variable ( 分类变量 ) : </li></ul><ul><li>Taking “values” within several possible categories, such as Gender (male, female), Occupation. </li></ul><ul><li>Ordinal variable ( 有序变量 ) : </li></ul><ul><li>There exists order among all possible categories, such as education (primary school, high school, university, postgraduate) </li></ul>
  20. 25. 1.2 Frequency Table and Histogram <ul><li>Useful for describing sample data </li></ul><ul><li>An intuitive basis for the concept of a probability distribution </li></ul><ul><li>1.2.1 Frequency table </li></ul><ul><li>1. Discrete-type frequency table </li></ul>
  21. 26. Table 1.3 The frequency table for occupation of 108 patients
  22. 27. Table 1.4 The frequency table for the results of certain semi-quantitative test among 150 patients
  23. 28. 2. Continuous-type frequency table <ul><li>Example 1.3 120 normal male adults were randomly selected from the residents of a county. Their red cell counts ($10^{12}$/L) were observed and are listed as follows: </li></ul><ul><li>5.12 5.13 4.58 4.31 4.09 4.41 4.33 4.58 4.24 5.45 4.32 4.84 </li></ul><ul><li>4.91 5.14 5.25 4.89 4.79 4.90 5.09 4.04 5.14 5.46 4.66 4.20 </li></ul><ul><li>4.21 3.73 5.17 5.79 5.46 4.49 4.85 5.28 4.78 4.32 4.94 5.21 </li></ul><ul><li>4.68 5.09 4.68 4.91 5.13 5.26 3.84 4.17 4.56 3.52 6.00 4.05 </li></ul><ul><li>4.92 4.87 4.28 4.46 5.03 5.69 5.25 4.56 5.53 4.58 4.86 4.97 </li></ul><ul><li>4.70 4.28 4.37 5.33 4.78 4.75 5.39 5.27 4.89 6.18 4.13 5.22 </li></ul><ul><li>4.44 4.13 4.43 4.02 5.86 5.12 5.36 3.86 4.68 5.48 5.31 4.53 </li></ul><ul><li>4.83 4.11 3.29 4.18 4.13 4.06 3.42 4.68 4.52 5.19 3.70 5.51 </li></ul><ul><li>4.64 4.92 4.93 4.90 3.92 5.04 4.70 4.54 3.95 4.40 4.31 3.77 </li></ul><ul><li>4.16 4.58 5.35 3.71 5.27 4.52 5.21 4.37 4.80 4.75 3.86 5.69 </li></ul><ul><li>Please try to establish a frequency table for this set of data. </li></ul>
  24. 29. <ul><li>(1) Range R </li></ul><ul><li>maximum= 6.18, </li></ul><ul><li>minimum=3.29, </li></ul><ul><li>range R =6.18 - 3.29=2.89. </li></ul><ul><li>(2) Length of sub-intervals i </li></ul><ul><li>Divide the whole range into 8-15 sub-intervals. </li></ul><ul><li>R /10=2.89/10= 0.289≈ 0.30 </li></ul><ul><li>then let i =0.30. </li></ul>
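Once the sub-interval length i = 0.30 is chosen, the tallying can be sketched in Python. The lower bound of the first sub-interval (3.20) is an assumption made here as a convenient round number just below the minimum; the slides do not state it. Values are converted to integer hundredths so that observations on a boundary are not misplaced by floating-point rounding.

```python
# Red cell counts (10^12/L) of the 120 men in Example 1.3
RED_CELL_COUNTS = [
    5.12, 5.13, 4.58, 4.31, 4.09, 4.41, 4.33, 4.58, 4.24, 5.45, 4.32, 4.84,
    4.91, 5.14, 5.25, 4.89, 4.79, 4.90, 5.09, 4.04, 5.14, 5.46, 4.66, 4.20,
    4.21, 3.73, 5.17, 5.79, 5.46, 4.49, 4.85, 5.28, 4.78, 4.32, 4.94, 5.21,
    4.68, 5.09, 4.68, 4.91, 5.13, 5.26, 3.84, 4.17, 4.56, 3.52, 6.00, 4.05,
    4.92, 4.87, 4.28, 4.46, 5.03, 5.69, 5.25, 4.56, 5.53, 4.58, 4.86, 4.97,
    4.70, 4.28, 4.37, 5.33, 4.78, 4.75, 5.39, 5.27, 4.89, 6.18, 4.13, 5.22,
    4.44, 4.13, 4.43, 4.02, 5.86, 5.12, 5.36, 3.86, 4.68, 5.48, 5.31, 4.53,
    4.83, 4.11, 3.29, 4.18, 4.13, 4.06, 3.42, 4.68, 4.52, 5.19, 3.70, 5.51,
    4.64, 4.92, 4.93, 4.90, 3.92, 5.04, 4.70, 4.54, 3.95, 4.40, 4.31, 3.77,
    4.16, 4.58, 5.35, 3.71, 5.27, 4.52, 5.21, 4.37, 4.80, 4.75, 3.86, 5.69,
]

WIDTH = 30   # sub-interval length i = 0.30, expressed in hundredths
START = 320  # ASSUMED lower bound of the first sub-interval (3.20)

def frequency_table(values, start=START, width=WIDTH):
    """Count the values falling in each left-closed sub-interval [a, a + i)."""
    hundredths = [round(v * 100) for v in values]
    n_bins = (max(hundredths) - start) // width + 1
    counts = [0] * n_bins
    for h in hundredths:
        counts[(h - start) // width] += 1
    return counts

counts = frequency_table(RED_CELL_COUNTS)
data_range = max(RED_CELL_COUNTS) - min(RED_CELL_COUNTS)
print(f"range R = {data_range:.2f}")
for k, c in enumerate(counts):
    low = (START + k * WIDTH) / 100
    high = low + WIDTH / 100
    # sub-interval, frequency, relative frequency
    print(f"{low:.2f} to <{high:.2f}: {c:3d}  {c / len(RED_CELL_COUNTS):.3f}")
```

A different starting point would shift the sub-interval boundaries and change the individual counts, but the counts would still sum to 120.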
  25. 30. <ul><li>1.2.2 Frequency plot and histogram </li></ul><ul><li>1. Frequency plot for discrete variable </li></ul><ul><li>– bar chart </li></ul>
  26. 31. 2. Frequency plot for continuous variable – histogram
  27. 32. 1.3 Measurement for average level <ul><li>Numerical characteristics ( 数字特征 ): Average level ( 平均水平 ) and Variation ( 变异 ) </li></ul><ul><li>1.3.1 Arithmetic mean ( 算术均数 ) </li></ul><ul><li>Useful when the histogram looks symmetric. </li></ul><ul><li>Denote the observed values of the individuals by $X_1, X_2, \dots, X_n$; the arithmetic mean is </li></ul><ul><li>$\bar{X} = \dfrac{X_1 + X_2 + \cdots + X_n}{n} = \dfrac{1}{n}\sum_{i=1}^{n} X_i$ (1.1) </li></ul>
  28. 33. <ul><li>1.3.2 Geometric mean ( 几何均数 ) </li></ul><ul><li>It is useful when the histogram of the logarithms of the data is close to symmetric. </li></ul><ul><li>Example The concentrations of a certain antibody are measured in a set of samples, and the corresponding titers are 4, 8, 16, 16, 64, 128. </li></ul><ul><li>Arithmetic mean = 39.3 </li></ul><ul><li>Geometric mean = 20.16 </li></ul>
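The two averages in the antibody titer example can be checked with a few lines of Python. The geometric mean is computed here as the exponential of the mean logarithm, which equals the n-th root of the product of the values; the variable names are illustrative only.

```python
import math

titers = [4, 8, 16, 16, 64, 128]

arithmetic_mean = sum(titers) / len(titers)

# Geometric mean = exp(mean of the logarithms) = n-th root of the product
geometric_mean = math.exp(sum(math.log(t) for t in titers) / len(titers))

print(f"arithmetic mean = {arithmetic_mean:.1f}")  # 39.3
print(f"geometric mean  = {geometric_mean:.2f}")   # 20.16
```

The arithmetic mean is pulled upward by the two large titers, while the geometric mean stays closer to the bulk of the values.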
  29. 34. <ul><li>1.3.3 Median ( 中位数 ) </li></ul><ul><li>When the histogram is skewed, the median can be used to measure the average level. </li></ul><ul><li>Median = the value in the middle of the ordered data </li></ul><ul><li>Example 1 Data set {1,1,2,2, 3 ,4,6,9,10} </li></ul><ul><li>Median = 3 </li></ul><ul><li>Example 2 Data set {1,1,2,2, 3,4 ,6,9,10,13} </li></ul><ul><li>Median = ( 3 + 4 )/2 = 3.5 </li></ul><ul><li>When n is odd, </li></ul><ul><li>Median = the observed value with rank ( n +1)/2 </li></ul><ul><li>When n is even, </li></ul><ul><li>Median = [(value with rank n /2) + (value with rank n /2 + 1)] / 2 </li></ul>
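A minimal sketch of the median rule, applied to the two example data sets: for odd n it takes the value with rank (n + 1)/2, and for even n it averages the values with ranks n/2 and n/2 + 1, as in Example 2. The function name is an illustrative choice.

```python
def median(values):
    """Median of a non-empty list of numbers, using ranks in the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # rank (n + 1) / 2 corresponds to zero-based index n // 2
        return ordered[mid]
    # ranks n / 2 and n / 2 + 1 correspond to indices mid - 1 and mid
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([1, 1, 2, 2, 3, 4, 6, 9, 10]))      # 3
print(median([1, 1, 2, 2, 3, 4, 6, 9, 10, 13]))  # 3.5
```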
  30. 35. Think about <ul><li>How to calculate P percentile ( 百分位数 )? </li></ul><ul><li>P 25 ? </li></ul><ul><li>P 75 ? </li></ul>
  31. 36. 1.4 Measurement for Variation <ul><li>1.4.1 Range ( 极差 ) </li></ul><ul><li>R = maximal value - minimal value </li></ul><ul><li>R is not a robust measure. </li></ul><ul><li>Disadvantage: it is based on only two observations and ignores all the observations between the two extremes. </li></ul><ul><li>The more observations there are, the larger the range tends to be. </li></ul>
  32. 37. <ul><li>1.4.2 Inter-quartile range ( 四分位数差距 ) </li></ul><ul><li>Lower quartile ( 下四分位数 ): the 25th percentile, $P_{25}$ </li></ul><ul><li>Upper quartile ( 上四分位数 ): the 75th percentile, $P_{75}$ </li></ul><ul><li>Difference between the two quartiles = $P_{75} - P_{25}$ = 13.120 - 8.083 = 5.037 </li></ul>
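As a rough illustration of quartiles and the inter-quartile range, the sketch below uses Python's standard `statistics.quantiles` on the nine-value data set from the median example (the data behind 13.120 and 8.083 are not given in the slides). Several percentile conventions exist and give slightly different answers; the `"inclusive"` method used here is an assumption, since the slides do not say which convention the course uses.

```python
import statistics

data = [1, 1, 2, 2, 3, 4, 6, 9, 10]

# quantiles(..., n=4) returns the three cut points P25, P50 and P75.
# method="inclusive" interpolates between the sorted observations and
# treats the data as containing the minimum and maximum of the population.
p25, p50, p75 = statistics.quantiles(data, n=4, method="inclusive")

iqr = p75 - p25
print(f"P25 = {p25}, P75 = {p75}, inter-quartile range = {iqr}")
```

Unlike the range, this measure ignores the extreme values at either end, which makes it less sensitive to outliers.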
  33. 38. 1.4.3 Variance and standard deviation <ul><li>Deviation ( 偏差 ) from the mean: $X_i - \mu$ </li></ul><ul><li>Squared deviation: $(X_i - \mu)^2$ </li></ul><ul><li>Population variance ( 总体方差 ): the average squared deviation throughout the population, $\sigma^2 = \dfrac{1}{N}\sum_{i=1}^{N} (X_i - \mu)^2$, where $N$ is the population size </li></ul><ul><li>Population standard deviation ( 总体标准差 ): $\sigma = \sqrt{\sigma^2}$ </li></ul>
  34. 39. <ul><li>When the population mean ( 总体均数 ) $\mu$ is unknown, it is replaced by the sample mean $\bar{X}$. </li></ul><ul><li>Squared deviation: $(X_i - \bar{X})^2$ </li></ul><ul><li>Sample variance ( 样本方差 ): the average squared deviation throughout the sample, with divisor $n - 1$, $s^2 = \dfrac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$ </li></ul><ul><li>Sample standard deviation ( 样本标准差 ): $s = \sqrt{s^2}$ </li></ul><ul><li>Degrees of freedom ( 自由度 ): $n - 1$ </li></ul>
  35. 40. <ul><li>Example The weights of male infants: </li></ul><ul><li>2.85, 2.90, 2.96, 3.00, 3.05, 3.18 </li></ul>Conventionally, the mean and standard deviation are often expressed together as $\bar{X} \pm s$. For instance, for height, the mean and standard deviation are written as 170 ± 6 (cm).
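For the infant weight example, the sample mean and the sample standard deviation (divisor n - 1) can be computed as sketched below. The manual calculation is shown alongside the standard-library call `statistics.stdev`, which also uses the n - 1 divisor; the variable names are illustrative.

```python
import math
import statistics

weights = [2.85, 2.90, 2.96, 3.00, 3.05, 3.18]
n = len(weights)

mean = statistics.mean(weights)

# Sample variance: sum of squared deviations divided by the
# degrees of freedom, n - 1.
sum_sq_dev = sum((x - mean) ** 2 for x in weights)
sample_variance = sum_sq_dev / (n - 1)
sample_sd = math.sqrt(sample_variance)

# statistics.stdev applies the same n - 1 divisor.
library_sd = statistics.stdev(weights)

print(f"{mean:.2f} ± {sample_sd:.2f}")  # 2.99 ± 0.12
```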
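  36. 41. 1.4.4 Coefficient of variation Example 9-10 For normal young males, comparing height and weight, which one shows more variation? The two measurements have different units, so their standard deviations cannot be compared directly. The coefficient of variation ( 变异系数 ) is defined as $CV = \dfrac{s}{\bar{X}} \times 100\%$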
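A small sketch of the comparison: the coefficient of variation divides the standard deviation by the mean, giving a unit-free percentage. The height figures (mean 170 cm, standard deviation 6 cm) are taken from the earlier height example; the weight figures are hypothetical values invented purely for illustration, because the slides do not give them.

```python
def coefficient_of_variation(mean, sd):
    """Return the coefficient of variation as a percentage."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return sd / mean * 100

# Height: mean 170 cm, SD 6 cm (from the slides)
height_cv = coefficient_of_variation(170, 6)

# Weight: mean 60 kg, SD 7 kg (HYPOTHETICAL values, not from the slides)
weight_cv = coefficient_of_variation(60, 7)

print(f"height CV = {height_cv:.2f}%")
print(f"weight CV = {weight_cv:.2f}%")
```

With these illustrative numbers, weight varies more than height relative to its mean, even though the two standard deviations are similar in size.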
  37. 42. 1.5 Relative Measures and Standardization Approaches <ul><li>1.5.1 Ratio, frequency and intensity </li></ul><ul><li>Relative measures are widely used in vital statistics ( 生命统计 ) and epidemiology ( 流行病学 ). </li></ul><ul><li>Caution: there are three types of relative measures, although all of them are often named “… rate”. </li></ul>
  38. 43. <ul><li>1. Ratio ( 比 ): </li></ul><ul><li>It is simply the ratio of one quantity to another. </li></ul><ul><li>For example, body mass index ( 身体指数 ): BMI = weight (kg) / [height (m)]² </li></ul>
  39. 44. <ul><li>2. Relative frequency ( 频率 ) </li></ul><ul><li>A special type of ratio: </li></ul><ul><li>Both the numerator ( 分子 ) and the denominator ( 分母 ) are counts; </li></ul><ul><li>The numerator is a part of the denominator; </li></ul><ul><li>The value lies within the interval [0,1]. </li></ul><ul><li>For example, </li></ul>
  40. 45. <ul><li>3. Intensity ( 强度 ) </li></ul><ul><li>Another special type of ratio: </li></ul><ul><li>The denominator: the total observed person-years ( 人 - 年 ) during a certain period; </li></ul><ul><li>The numerator: the number of occurrences of a certain event during the period. </li></ul><ul><li>The value is not necessarily within the interval [0,1]. </li></ul><ul><li>For example, </li></ul>
  41. 46. <ul><li>Unit: “person/person-year” </li></ul><ul><li>The mortality rate can be regarded as an adjusted relative frequency per year. </li></ul><ul><li>In general, intensity can be understood as “relative frequency per unit of time”, reflecting the chance of a certain event happening in a unit of time. </li></ul>
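The difference between a relative frequency and an intensity can be made concrete with a toy calculation. All numbers below are hypothetical and chosen only for illustration: a cohort of 1000 people, 30 of whom experience the event, followed for a total of 2500 person-years.

```python
# HYPOTHETICAL cohort, for illustration only
people_observed = 1000       # number of individuals followed
events = 30                  # individuals who experienced the event
person_years = 2500.0        # total follow-up time summed over individuals

# Relative frequency: a count divided by the count that contains it,
# so it always lies in [0, 1].
relative_frequency = events / people_observed

# Intensity: events per unit of observed person-time.
# It has a unit (per person-year) and is not confined to [0, 1].
intensity = events / person_years

print(f"relative frequency = {relative_frequency:.3f}")
print(f"intensity = {intensity:.3f} per person-year")
```

The same 30 events give different figures because the denominators measure different things: people in one case, accumulated observation time in the other.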
  42. 47. 1.5.2 Crude death rate and standardization Table 1.9 Age specific mortality rates ( 年龄别死亡率 ) for two cities <ul><li>Which city has a higher mortality? </li></ul>
  43. 48. 1. Direct standardization ( 直接标准化 ) <ul><li>Select a “standard population” ( 标准人口 ). </li></ul><ul><li>Here, the sum of the populations of the two cities is taken as the “standard population”. </li></ul><ul><li>If each city's age-specific mortality rates were applied to the corresponding age groups of the “standard population”, what would the expected numbers of deaths be? </li></ul>
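The direct method can be sketched as follows. Because Table 1.9 is not reproduced in this transcript, every figure below is hypothetical: two made-up cities with three age groups, identical age-specific mortality rates, and different age structures. The standard population is taken as the sum of the two cities, as on the slide.

```python
# HYPOTHETICAL data (Table 1.9 is not available here): three age groups
city_a_pop = [4000, 3000, 1000]
city_a_deaths = [8, 30, 50]
city_b_pop = [1000, 3000, 4000]
city_b_deaths = [2, 30, 200]

def crude_rate(deaths, population):
    """Total deaths divided by total population."""
    return sum(deaths) / sum(population)

def directly_standardized_rate(deaths, population, standard_population):
    """Apply a city's age-specific rates to the standard population and
    divide the expected deaths by the size of the standard population."""
    rates = [d / p for d, p in zip(deaths, population)]
    expected = [r * s for r, s in zip(rates, standard_population)]
    return sum(expected) / sum(standard_population)

# Standard population = sum of the two cities, age group by age group
standard_pop = [a + b for a, b in zip(city_a_pop, city_b_pop)]

crude_a = crude_rate(city_a_deaths, city_a_pop)
crude_b = crude_rate(city_b_deaths, city_b_pop)
std_a = directly_standardized_rate(city_a_deaths, city_a_pop, standard_pop)
std_b = directly_standardized_rate(city_b_deaths, city_b_pop, standard_pop)

print(f"crude rates (per 1000):        A = {crude_a * 1000:.1f}, B = {crude_b * 1000:.1f}")
print(f"standardized rates (per 1000): A = {std_a * 1000:.1f}, B = {std_b * 1000:.1f}")
```

In this constructed example the crude rates differ only because city B has more old residents; once both cities are given the same age structure, the standardized rates coincide.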
  44. 49. 2. Indirect standardization ( 间接标准化 ) <ul><li>Standardized mortality ratio (SMR) ( 标准化死亡比 ) = observed number of deaths / expected number of deaths </li></ul><ul><li>City A: SMR = 63/58.12 = 1.084 </li></ul><ul><li>Indirect standardized mortality rate = 17.2 × 63/58.12 = 18.64 (‰) </li></ul><ul><li>City B: SMR = 131/142.30 = 0.921 </li></ul><ul><li>Indirect standardized mortality rate = 17.2 × 131/142.30 = 15.83 (‰) </li></ul>
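The arithmetic of the indirect method can be reproduced from the numbers on the slide. In this sketch the figure 17.2 (‰) is treated as the mortality rate of the reference population, which is an assumption about its meaning, since the slide does not label it; the function names are illustrative.

```python
REFERENCE_RATE_PER_1000 = 17.2  # from the slide; ASSUMED to be the reference population's rate

def smr(observed_deaths, expected_deaths):
    """Standardized mortality ratio: observed / expected deaths."""
    return observed_deaths / expected_deaths

def indirect_standardized_rate(observed_deaths, expected_deaths,
                               reference_rate=REFERENCE_RATE_PER_1000):
    """Reference rate scaled by the (unrounded) SMR, in the same unit."""
    return reference_rate * smr(observed_deaths, expected_deaths)

# City A: 63 observed deaths, 58.12 expected
smr_a = smr(63, 58.12)
rate_a = indirect_standardized_rate(63, 58.12)

# City B: 131 observed deaths, 142.30 expected
smr_b = smr(131, 142.30)
rate_b = indirect_standardized_rate(131, 142.30)

print(f"City A: SMR = {smr_a:.3f}, rate = {rate_a:.2f} per 1000")
print(f"City B: SMR = {smr_b:.3f}, rate = {rate_b:.2f} per 1000")
```

An SMR above 1 means more deaths were observed than expected under the reference rates, and an SMR below 1 means fewer. Multiplying by the unrounded SMR avoids compounding rounding error in the final rate.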
  45. 50. Summary <ul><li>1. Statistics: Sample → Population </li></ul><ul><li>2. Frequency table and histogram </li></ul><ul><li>3. Average level </li></ul><ul><li>Arithmetic mean, median, geometric mean </li></ul><ul><li>4. Variation </li></ul><ul><li>Range, inter-quartile range, standard deviation </li></ul><ul><li>Coefficient of variation </li></ul><ul><li>5. Relative measures </li></ul><ul><li>Ratio, frequency and intensity </li></ul><ul><li>6. Crude death rates are not comparable </li></ul><ul><li>Two approaches for standardization </li></ul>
