# Chapter 1: Introduction to Medical Statistics


1. 1. <ul><li>Introduction to Medical Statistics </li></ul><ul><li>Chapter-1 </li></ul>
2. 4. <ul><li>Statistics: </li></ul><ul><li>The discipline concerned with the treatment of numerical data derived from groups of individuals (P. Armitage). </li></ul><ul><li>The science and art of dealing with variation in data through collection, classification and analysis in such a way as to obtain reliable results (J. M. Last). </li></ul>
3. 5. <ul><li>Medical Statistics: </li></ul><ul><li>The application of mathematical statistics in the field of medicine. </li></ul><ul><li>Why do we need to study statistics? </li></ul><ul><li>Three reasons: </li></ul><ul><li>(1) It is a basic requirement of medical research. </li></ul><ul><li>(2) It helps you update your medical knowledge. </li></ul><ul><li>(3) It supports data management and treatment. </li></ul>
4. 6. Basic concepts: 1. Homogeneity and Variation <ul><li>Homogeneity: All individuals have similar characteristics and belong to the same category. </li></ul><ul><li>Variation: the differences between individuals in height, weight, and so on. </li></ul>
5. 7. 2. Population and sample <ul><li>Population: The whole collection of individuals that one intends to study. Its members are homogeneous but show variation. </li></ul><ul><li>Sample: A representative part of the population. </li></ul><ul><li>Randomization: An important way to make the sample representative. </li></ul>
6. 8. Random <ul><li>By chance! </li></ul><ul><li>Random event: an event that may or may not occur in a single experiment. </li></ul><ul><li>Before an experiment, nobody can be sure whether the event will occur. </li></ul><ul><li>However, a regular pattern emerges over a large number of experiments. </li></ul>
7. 9. 3. Probability <ul><li>A measure of the possibility that a random event occurs. </li></ul><ul><li>A: random event </li></ul><ul><li>P(A): probability of the random event A </li></ul><ul><li>P(A) = 1 if the event always occurs. </li></ul><ul><li>P(A) = 0 if the event never occurs. </li></ul>
8. 10. Estimation of Probability: Frequency <ul><li>n: number of observations (large enough) </li></ul><ul><li>m: number of occurrences of random event A </li></ul><ul><li>m/n: relative frequency (or simply frequency) of random event A </li></ul><ul><li>P(A) ≈ frequency (m/n) </li></ul>
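The idea that the relative frequency m/n approaches P(A) as n grows can be illustrated with a small simulation. This is only a sketch: the function name `relative_frequency`, the true probability 0.3, and the fixed seed are assumptions made for the illustration and do not come from the slides.

```python
import random


def relative_frequency(p_true, n, seed=0):
    """Simulate n independent trials in which event A occurs with
    probability p_true, and return the relative frequency m / n."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    m = sum(1 for _ in range(n) if rng.random() < p_true)
    return m / n


# Hypothetical event with P(A) = 0.3: the frequency settles near 0.3
# as the number of observations increases.
for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency(0.3, n))
```

With small n the frequency can be far from P(A); only for large n does it become a usable estimate, which is why the slide asks for n to be "large enough".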
9. 11. 4. Parameter and statistic <ul><li>Parameter: A measure of a certain property of the population. It is usually denoted by Greek letters, such as μ and π, and is usually unknown. </li></ul><ul><li>Statistic: A measure of a certain property of a sample. It is usually denoted by Latin letters, such as s and p. </li></ul>
10. 12. The Basic Steps of Statistical Work <ul><li>1. Design of study </li></ul><ul><li>2. Collection of data </li></ul><ul><li>3. Data Sorting </li></ul><ul><li>4. Data Analysis </li></ul>
11. 13. About This Course: Teaching and Learning <ul><li>Aim: </li></ul><ul><li>Training in statistical thinking </li></ul><ul><li>Learning some skills for dealing with medical data </li></ul><ul><li>Focus: </li></ul><ul><li>Essential concepts and statistical thinking: lectures and practice sessions </li></ul><ul><li>Skills with computers and statistical software: practice sessions </li></ul><ul><li>Practice sessions: experiments and discussion </li></ul>
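12. 14. <ul><li>Textbook </li></ul><ul><li>Fang Ji-Qian. Medical Statistics and Computer Experiments. Singapore: World Scientific, 2005. </li></ul><ul><li>Reference books </li></ul><ul><li>1. Fang Ji-Qian (方积乾). 医学统计学与电脑实验 [Medical Statistics and Computer Experiments], 3rd ed. Shanghai Scientific and Technical Publishers (上海科学技术出版社), 2006. </li></ul><ul><li>2. Fang Ji-Qian (方积乾). 生物医学研究的统计方法 [Statistical Methods for Biomedical Research]. Beijing: Higher Education Press (高等教育出版社), 2007. </li></ul>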
13. 15. <ul><li>SPSS software download address: </li></ul><ul><li>ftp://202.116.102.7/pub/ 统计工具 /spss11 </li></ul><ul><li>Courseware download address: </li></ul><ul><li>ftp://202.116.102.6/ 统计 /2005 全英班 </li></ul><ul><li>The computer startup password is the last four digits of your student ID. </li></ul>
14. 16. Chapter 1 Descriptive Statistics
15. 20. Chapter 1 Descriptive Statistics <ul><li>Statistics comprises statistical description and statistical inference. </li></ul><ul><li>Statistical description: </li></ul><ul><li>Describes the features of the sample. </li></ul><ul><li>Main forms: tables, plots and numerical indexes </li></ul>
16. 21. 1.1 Variables and Data <ul><li>1.1.1 Structure and feature of data </li></ul>
17. 22. <ul><li>(1) Basic observed unit </li></ul><ul><li>A patient is defined as an observed unit. </li></ul><ul><li>(2) Recording items </li></ul><ul><li>Group: treatment </li></ul><ul><li>Response variables (反应变量): systolic pressure, diastolic pressure, ECG and effectiveness </li></ul><ul><li>Covariates (伴随变量): age and gender </li></ul><ul><li>Variables: describe the properties of individuals. </li></ul><ul><li>Different types of variables → different statistical methods </li></ul>
18. 23. 1.1.2 Types of variables <ul><li>1. Quantitative variable (定量变量) </li></ul><ul><li>Continuous variable (连续变量) </li></ul><ul><li>Values obtained through measurement: height, weight, blood pressure, pulse, and so on. </li></ul><ul><li>Takes values in a continuous interval. </li></ul><ul><li>Discrete variable (离散变量) </li></ul><ul><li>Takes values in a set of integers. </li></ul><ul><li>A binary variable is the simplest special case of a discrete variable. </li></ul><ul><li>Example 1.1 Gender can be represented by a binary variable X. </li></ul>
19. 24. <ul><li>2. Qualitative variable (定性变量) </li></ul><ul><li>Categorical variable (分类变量): </li></ul><ul><li>Takes “values” within several possible categories, such as gender (male, female) or occupation. </li></ul><ul><li>Ordinal variable (有序变量): </li></ul><ul><li>The possible categories have a natural order, such as education (primary school, high school, university, postgraduate). </li></ul>
20. 25. 1.2 Frequency Table and Histogram <ul><li>Useful for describing sample data </li></ul><ul><li>An intuitive basis for the probability distribution </li></ul><ul><li>1.2.1 Frequency table </li></ul><ul><li>1. Discrete-type frequency table </li></ul>
21. 26. Table 1.3 The frequency table for occupation of 108 patients
22. 27. Table 1.4 The frequency table for the results of certain semi-quantitative test among 150 patients
23. 28. 2. Continuous-type frequency table <ul><li>Example 1.3 120 normal male adults were randomly selected from the residents of a county. Their red cell counts ($10^{12}$/L) were observed and are listed as follows: </li></ul><ul><li>5.12 5.13 4.58 4.31 4.09 4.41 4.33 4.58 4.24 5.45 4.32 4.84 </li></ul><ul><li>4.91 5.14 5.25 4.89 4.79 4.90 5.09 4.04 5.14 5.46 4.66 4.20 </li></ul><ul><li>4.21 3.73 5.17 5.79 5.46 4.49 4.85 5.28 4.78 4.32 4.94 5.21 </li></ul><ul><li>4.68 5.09 4.68 4.91 5.13 5.26 3.84 4.17 4.56 3.52 6.00 4.05 </li></ul><ul><li>4.92 4.87 4.28 4.46 5.03 5.69 5.25 4.56 5.53 4.58 4.86 4.97 </li></ul><ul><li>4.70 4.28 4.37 5.33 4.78 4.75 5.39 5.27 4.89 6.18 4.13 5.22 </li></ul><ul><li>4.44 4.13 4.43 4.02 5.86 5.12 5.36 3.86 4.68 5.48 5.31 4.53 </li></ul><ul><li>4.83 4.11 3.29 4.18 4.13 4.06 3.42 4.68 4.52 5.19 3.70 5.51 </li></ul><ul><li>4.64 4.92 4.93 4.90 3.92 5.04 4.70 4.54 3.95 4.40 4.31 3.77 </li></ul><ul><li>4.16 4.58 5.35 3.71 5.27 4.52 5.21 4.37 4.80 4.75 3.86 5.69 </li></ul><ul><li>Please try to establish a frequency table for this set of data. </li></ul>
24. 29. <ul><li>(1) Range R </li></ul><ul><li>maximum = 6.18, </li></ul><ul><li>minimum = 3.29, </li></ul><ul><li>range R = 6.18 - 3.29 = 2.89. </li></ul><ul><li>(2) Length of sub-intervals i </li></ul><ul><li>Divide the whole range into 8-15 sub-intervals. </li></ul><ul><li>R/10 = 2.89/10 = 0.289 ≈ 0.30, </li></ul><ul><li>then let i = 0.30. </li></ul>
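The two steps above (range, then sub-interval length) can be carried through to a full frequency table for the 120 red cell counts of Example 1.3. The sketch below is one possible way to do it, not the textbook's own procedure: the lower limit of the first class, 3.20, is an assumption (the transcript does not state it), and the values are converted to hundredths so that the class boundaries are compared as integers rather than as floating-point numbers.

```python
# Red cell counts (10^12/L) of the 120 men in Example 1.3.
data = [
    5.12, 5.13, 4.58, 4.31, 4.09, 4.41, 4.33, 4.58, 4.24, 5.45, 4.32, 4.84,
    4.91, 5.14, 5.25, 4.89, 4.79, 4.90, 5.09, 4.04, 5.14, 5.46, 4.66, 4.20,
    4.21, 3.73, 5.17, 5.79, 5.46, 4.49, 4.85, 5.28, 4.78, 4.32, 4.94, 5.21,
    4.68, 5.09, 4.68, 4.91, 5.13, 5.26, 3.84, 4.17, 4.56, 3.52, 6.00, 4.05,
    4.92, 4.87, 4.28, 4.46, 5.03, 5.69, 5.25, 4.56, 5.53, 4.58, 4.86, 4.97,
    4.70, 4.28, 4.37, 5.33, 4.78, 4.75, 5.39, 5.27, 4.89, 6.18, 4.13, 5.22,
    4.44, 4.13, 4.43, 4.02, 5.86, 5.12, 5.36, 3.86, 4.68, 5.48, 5.31, 4.53,
    4.83, 4.11, 3.29, 4.18, 4.13, 4.06, 3.42, 4.68, 4.52, 5.19, 3.70, 5.51,
    4.64, 4.92, 4.93, 4.90, 3.92, 5.04, 4.70, 4.54, 3.95, 4.40, 4.31, 3.77,
    4.16, 4.58, 5.35, 3.71, 5.27, 4.52, 5.21, 4.37, 4.80, 4.75, 3.86, 5.69,
]

WIDTH = 30   # sub-interval length i = 0.30, expressed in hundredths
START = 320  # ASSUMED lower limit of the first class (3.20); not in the slides

# Work in integer hundredths to avoid floating-point boundary problems.
hundredths = [round(x * 100) for x in data]

# Step (1): range R = maximum - minimum.
data_range = (max(hundredths) - min(hundredths)) / 100

# Step (2): count the observations falling in each class [lower, lower + i).
n_classes = (max(hundredths) - START) // WIDTH + 1
freq = [0] * n_classes
for v in hundredths:
    freq[(v - START) // WIDTH] += 1

print(f"R = {data_range:.2f}, classes = {n_classes}")
for k, f in enumerate(freq):
    lower = (START + k * WIDTH) / 100
    upper = lower + WIDTH / 100
    print(f"{lower:.2f}-{upper:.2f}  {f:3d}  {f / len(data):.3f}")
```

Each class includes its lower limit and excludes its upper limit, so every observation is counted exactly once. A different starting point would shift the class boundaries and change the individual counts, but not their total.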
25. 30. <ul><li>1.2.2 Frequency plot and histogram </li></ul><ul><li>1. Frequency plot for discrete variable </li></ul><ul><li>– bar chart </li></ul>
26. 31. 2. Frequency plot for continuous variable – histogram
27. 32. 1.3 Measurement for average level <ul><li>Numerical characteristics (数字特征): average level (平均水平) and variation (变异) </li></ul><ul><li>1.3.1 Arithmetic mean (算术均数) </li></ul><ul><li>Useful when the histogram looks symmetric. </li></ul><ul><li>Denote the observed values of the individuals by $X_1, X_2, \ldots, X_n$; the arithmetic mean is </li></ul><ul><li>$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n} = \frac{1}{n}\sum_{i=1}^{n} X_i$ (1.1) </li></ul>
28. 33. <ul><li>1.3.2 Geometric mean (几何均数) </li></ul><ul><li>It is useful when the histogram of the logarithms is close to symmetric. </li></ul><ul><li>Example: The concentrations of a certain antibody are measured for a set of samples, and the corresponding titers are 4, 8, 16, 16, 64, 128. </li></ul><ul><li>Arithmetic mean = 39.3 </li></ul><ul><li>Geometric mean = 20.16 </li></ul>
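The two figures in the titer example can be checked with a few lines of code. The sketch below computes the geometric mean as the exponential of the mean logarithm, which is equivalent to taking the n-th root of the product of the values; the variable names are chosen for this illustration only.

```python
import math

titers = [4, 8, 16, 16, 64, 128]

# Arithmetic mean: sum of the values divided by their number.
arithmetic_mean = sum(titers) / len(titers)

# Geometric mean: average the logarithms, then transform back.
mean_log = sum(math.log(x) for x in titers) / len(titers)
geometric_mean = math.exp(mean_log)

print(f"Arithmetic mean = {arithmetic_mean:.1f}")
print(f"Geometric mean  = {geometric_mean:.2f}")
```

The arithmetic mean is pulled upward by the two largest titers, while the geometric mean stays closer to the bulk of the data, which is why it suits values that are roughly symmetric on the log scale.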
29. 34. <ul><li>1.3.3 Median (中位数) </li></ul><ul><li>When the histogram is skewed, the median can be used to measure the average level. </li></ul><ul><li>Median = the value in the middle of the ordered data </li></ul><ul><li>Example 1 Data set {1, 1, 2, 2, 3, 4, 6, 9, 10} </li></ul><ul><li>Median = 3 </li></ul><ul><li>Example 2 Data set {1, 1, 2, 2, 3, 4, 6, 9, 10, 13} </li></ul><ul><li>Median = (3 + 4)/2 = 3.5 </li></ul><ul><li>When n is odd, </li></ul><ul><li>Median = the observed value with rank (n + 1)/2 </li></ul><ul><li>When n is even, </li></ul><ul><li>Median = [(value with rank n/2) + (value with rank n/2 + 1)] / 2 </li></ul>
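The odd and even cases of the median rule can be written as a short function and checked against the two example data sets. This is a minimal sketch; the function name `median` is chosen here for illustration, and note that ranks on the slide count from 1 while Python indexes from 0.

```python
def median(values):
    """Return the median: the middle value of the sorted data for odd n,
    the average of the two middle values for even n."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Rank (n + 1) / 2 corresponds to zero-based index n // 2.
        return ordered[mid]
    # Ranks n / 2 and n / 2 + 1 correspond to indexes mid - 1 and mid.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([1, 1, 2, 2, 3, 4, 6, 9, 10]))      # Example 1
print(median([1, 1, 2, 2, 3, 4, 6, 9, 10, 13]))  # Example 2
```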
30. 35. Think about <ul><li>How to calculate the P-th percentile (百分位数)? </li></ul><ul><li>$P_{25}$? </li></ul><ul><li>$P_{75}$? </li></ul>
31. 36. 1.4 Measurement for Variation <ul><li>1.4.1 Range (极差) </li></ul><ul><li>R = maximal value - minimal value </li></ul><ul><li>R is not robust. </li></ul><ul><li>Disadvantage: Based on only two observations, it ignores the observations between the two extremes. </li></ul><ul><li>The more observations there are, the greater the range tends to be. </li></ul>
32. 37. <ul><li>1.4.2 Interquartile range (四分位数差距) </li></ul><ul><li>Lower quartile (下四分位数): the 25th percentile, $P_{25}$ or $Q_L$ </li></ul><ul><li>Upper quartile (上四分位数): the 75th percentile, $P_{75}$ or $Q_U$ </li></ul><ul><li>Difference between the two quartiles = $P_{75} - P_{25} = Q_U - Q_L$ = 13.120 - 8.083 = 5.037 </li></ul>
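Percentiles and the interquartile range can be computed in several slightly different ways, and the transcript does not say which convention the course uses. The sketch below assumes linear interpolation between order statistics (the function name `percentile` and this convention are assumptions), so its results may differ a little from a textbook or software package that uses another rule. Because the data behind the figures 13.120 and 8.083 are not in the transcript, the first median example data set is used instead.

```python
def percentile(values, p):
    """Return the p-th percentile (0 <= p <= 100) using linear
    interpolation between neighbouring order statistics."""
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of an empty data set is undefined")
    pos = (len(ordered) - 1) * p / 100   # fractional zero-based position
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


data = [1, 1, 2, 2, 3, 4, 6, 9, 10]
p25 = percentile(data, 25)
p75 = percentile(data, 75)
iqr = p75 - p25
print(p25, p75, iqr)
```

Under this convention the 50th percentile coincides with the median, which gives a quick sanity check on any percentile routine.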
33. 38. 1.4.3 Variance and standard deviation <ul><li>Deviation (偏差) from the mean: $X - \mu$ </li></ul><ul><li>Squared deviation: $(X - \mu)^2$ </li></ul><ul><li>Population variance (总体方差): the average squared deviation throughout the population, $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$, where $N$ is the population size </li></ul><ul><li>Population standard deviation (总体标准差): $\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$ </li></ul>
34. 39. <ul><li>When the population mean (总体均数) $\mu$ is unknown, it is replaced by the sample mean $\bar{X}$. </li></ul><ul><li>Squared deviation: $(X - \bar{X})^2$ </li></ul><ul><li>Sample variance (样本方差): the average squared deviation throughout the sample, $s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$ </li></ul><ul><li>Sample standard deviation (样本标准差): $s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$ </li></ul><ul><li>Degrees of freedom (自由度): $n - 1$ </li></ul>
35. 40. <ul><li>Example: The weights of male infants </li></ul><ul><li>2.85, 2.90, 2.96, 3.00, 3.05, 3.18 </li></ul>Conventionally, the mean and standard deviation are often expressed together as $\bar{X} \pm s$. For instance, for height, the mean and standard deviation are 170 ± 6 (cm).
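For the six infant weights, the sample mean and the sample standard deviation (with n - 1 degrees of freedom) can be worked out step by step. The sketch below follows the deviation, squared deviation, variance sequence and then reports the result in the conventional "mean ± SD" form; the variable names are illustrative.

```python
import math

weights = [2.85, 2.90, 2.96, 3.00, 3.05, 3.18]
n = len(weights)

# Sample mean.
mean = sum(weights) / n

# Sum of squared deviations from the sample mean.
squared_deviations = [(x - mean) ** 2 for x in weights]

# Sample variance divides by the degrees of freedom n - 1, not by n.
variance = sum(squared_deviations) / (n - 1)
sd = math.sqrt(variance)

print(f"{mean:.2f} ± {sd:.3f}")
```

Dividing by n instead of n - 1 would give a slightly smaller value; the n - 1 divisor compensates for using the sample mean in place of the unknown population mean.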
36. 41. 1.4.4 Coefficient of variation. Example 9-10 For normal young males, comparing their height and weight, which one has more variation? The coefficient of variation (变异系数) is defined as $CV = \frac{s}{\bar{X}} \times 100\%$.
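Because the coefficient of variation divides the standard deviation by the mean, it has no unit and lets height (cm) be compared with weight (kg). The transcript does not include the data for this example, so the sketch below uses the height figures 170 and 6 cm quoted earlier in the slides and purely hypothetical weight figures (60 kg and 7 kg) to show the mechanics only.

```python
def coefficient_of_variation(mean, sd):
    """Return the coefficient of variation in percent: sd / mean * 100."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return sd / mean * 100


# Height: mean and SD as quoted in the slides (cm).
cv_height = coefficient_of_variation(170, 6)

# Weight: HYPOTHETICAL values (kg), chosen only for illustration.
cv_weight = coefficient_of_variation(60, 7)

print(f"CV height = {cv_height:.2f}%")
print(f"CV weight = {cv_weight:.2f}%")
```

With these illustrative numbers weight would show the larger relative variation, even though its standard deviation looks similar in size to that of height.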
37. 42. 1.5 Relative Measures and Standardization Approaches <ul><li>1.5.1 Ratio, frequency and intensity </li></ul><ul><li>Relative measures are widely used in vital statistics (生命统计) and epidemiology (流行病学). </li></ul><ul><li>Caution: There are three types of relative measures, although all of them are often called “… rate”. </li></ul>
38. 43. <ul><li>1. Ratio (比): </li></ul><ul><li>It is simply the ratio of one quantity to another. </li></ul><ul><li>For example, body mass index (身体指数): BMI = weight (kg) / [height (m)]$^2$ </li></ul>
39. 44. <ul><li>2. Relative frequency (频率) </li></ul><ul><li>A special type of ratio: </li></ul><ul><li>Both the numerator (分子) and the denominator (分母) are counts; </li></ul><ul><li>The numerator is a part of the denominator; </li></ul><ul><li>Its value lies within the interval [0, 1]. </li></ul><ul><li>For example, </li></ul>
40. 45. <ul><li>3. Intensity (强度) </li></ul><ul><li>Another special type of ratio: </li></ul><ul><li>The denominator: total observed person-years (人-年) during a certain period; </li></ul><ul><li>The numerator: number of occurrences of a certain event during the period. </li></ul><ul><li>Its value is not necessarily within the interval [0, 1]. </li></ul><ul><li>For example, </li></ul>
41. 46. <ul><li>Unit: “person/person-year” </li></ul><ul><li>The mortality rate can be regarded as an adjusted relative frequency per year. </li></ul><ul><li>In general, intensity can be understood as “relative frequency per unit of time”, reflecting the chance of a certain event happening in a unit of time. </li></ul>
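The difference between a relative frequency and an intensity is easiest to see in how their denominators are built. The sketch below is illustrative only: the function names and all the counts are hypothetical, since the worked examples on the slides did not survive in the transcript.

```python
def relative_frequency(events, total):
    """Proportion of a group in which the event occurred.
    The numerator is part of the denominator, so the result is in [0, 1]."""
    if not 0 <= events <= total or total == 0:
        raise ValueError("need 0 <= events <= total and total > 0")
    return events / total


def intensity(events, person_years):
    """Number of events per person-year of observation.
    The denominator is observation time, so the result can exceed 1."""
    if person_years <= 0:
        raise ValueError("person-years must be positive")
    return events / person_years


# HYPOTHETICAL figures, for illustration only.
print(relative_frequency(12, 400))  # 12 cases among 400 people examined
print(intensity(30, 2000))          # 30 deaths over 2000 person-years
print(intensity(5, 2.5))            # 5 recurrent episodes in 2.5 person-years
```

The last call returns a value greater than 1, which is impossible for a relative frequency but perfectly legitimate for an intensity.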
42. 47. 1.5.2 Crude death rate and standardization. Table 1.9 Age-specific mortality rates (年龄别死亡率) for two cities <ul><li>Which city has a higher mortality? </li></ul>
43. 48. 1. Direct standardization (直接标准化) <ul><li>Select a “standard population” (标准人口). </li></ul><ul><li>Here, take the sum of the populations of the two cities as the “standard population”. </li></ul><ul><li>If each city's age-specific mortality rates were applied to the corresponding age groups of the “standard population”, what would the expected numbers of deaths be? </li></ul>
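Direct standardization answers the question above by applying a city's age-specific rates to the standard population and dividing the expected deaths by the standard population's total size. Table 1.9 is not included in the transcript, so the sketch below uses a made-up two-group example; the function name and every number in it are hypothetical.

```python
def direct_standardized_rate(age_specific_rates, standard_population):
    """Apply age-specific rates to a standard population and return
    (expected deaths, directly standardized rate)."""
    if len(age_specific_rates) != len(standard_population):
        raise ValueError("one rate is needed for each age group")
    # Expected deaths in each age group = rate * standard population size.
    expected = sum(r * n for r, n in zip(age_specific_rates, standard_population))
    return expected, expected / sum(standard_population)


# HYPOTHETICAL data: two age groups only.
standard_pop = [100, 300]   # combined population of both cities, by age group
rates_city = [0.01, 0.02]   # one city's age-specific mortality rates

expected_deaths, std_rate = direct_standardized_rate(rates_city, standard_pop)
print(expected_deaths, std_rate)
```

Running the same function with the other city's rates and the same standard population gives two rates that can be compared fairly, because both refer to an identical age structure.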
44. 49. 2. Indirect standardization (间接标准化) <ul><li>Standardized mortality ratio (SMR) (标准化死亡比) = observed number of deaths / expected number of deaths </li></ul><ul><li>City A: SMR = 63/58.12 = 1.084 </li></ul><ul><li>Indirect standardized mortality rate = 17.2 × 1.084 = 18.64 (‰) </li></ul><ul><li>City B: SMR = 131/142.30 = 0.921 </li></ul><ul><li>Indirect standardized mortality rate = 17.2 × 131/142.30 = 15.83 (‰) </li></ul>
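The figures for the two cities can be reproduced with a short calculation. In the sketch below the observed and expected deaths are taken from the slide; the value 17.2‰ is assumed to be the mortality rate of the standard population (the transcript uses it without naming it), and the function names are chosen for illustration.

```python
STANDARD_RATE = 17.2  # per 1000; ASSUMED to be the standard population's rate


def smr(observed_deaths, expected_deaths):
    """Standardized mortality ratio: observed deaths / expected deaths."""
    return observed_deaths / expected_deaths


def indirect_standardized_rate(observed_deaths, expected_deaths,
                               standard_rate=STANDARD_RATE):
    """Indirectly standardized rate: standard rate multiplied by the SMR."""
    return standard_rate * smr(observed_deaths, expected_deaths)


smr_a = smr(63, 58.12)
smr_b = smr(131, 142.30)
rate_a = indirect_standardized_rate(63, 58.12)
rate_b = indirect_standardized_rate(131, 142.30)

print(f"City A: SMR = {smr_a:.3f}, rate = {rate_a:.2f} per 1000")
print(f"City B: SMR = {smr_b:.3f}, rate = {rate_b:.2f} per 1000")
```

The rates are computed from the unrounded SMR, so small differences in the last digit can appear if the SMR is rounded to three decimals before multiplying.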
45. 50. Summary <ul><li>1. Statistics: Sample → Population </li></ul><ul><li>2. Frequency table and histogram </li></ul><ul><li>3. Average level </li></ul><ul><li>Arithmetic mean, median, geometric mean </li></ul><ul><li>4. Variation </li></ul><ul><li>Range, interquartile range, standard deviation </li></ul><ul><li>Coefficient of variation </li></ul><ul><li>5. Relative measures </li></ul><ul><li>Ratio, frequency and intensity </li></ul><ul><li>6. Crude death rates are not comparable </li></ul><ul><li>Two approaches to standardization </li></ul>