
Chapter 2 Probability And Distribution

Published in: Health & Medicine, Technology

  1. 1. <ul><li>Chapter-2 </li></ul>
  2. 2. <ul><li>Chapter 2 </li></ul><ul><li>Probability and Distribution </li></ul>
  3. 3. A typical statement in statistics <ul><li>Two parts: </li></ul><ul><li>Conclusion </li></ul><ul><li>Probability that the conclusion is true </li></ul>
  4. 4. 2.1 Explanation of Probability and Related Concepts <ul><li>2.1.1 Probability </li></ul><ul><li>Rolling a die </li></ul><ul><li>Possible outcomes: 1, 2, …, 6 </li></ul><ul><li>Probability of “1” = 1/6 </li></ul><ul><li>Color blindness test </li></ul><ul><li>Possible outcomes: normal, abnormal </li></ul><ul><li>Probability of “abnormal” = ? ---- Unknown! </li></ul><ul><li>Survey: randomly select n students; </li></ul><ul><li>if m of them are color-blind, then </li></ul><ul><li>probability of “abnormal” ≈ m/n </li></ul>
  5. 5. <ul><li>In general, </li></ul><ul><li>Events: the possible outcomes </li></ul><ul><li>Probability of the event E : P ( E ). </li></ul><ul><li>---- between 0 and 1 </li></ul><ul><li>Conditional probability : </li></ul><ul><li>the probability of one event under the condition </li></ul><ul><li>that another event has occurred, written P ( event ∣ condition ) </li></ul><ul><li>For example, </li></ul><ul><li>P ( nasopharyngeal carcinoma ∣ EB virus +) </li></ul>
  6. 6. <ul><li>2.1.2 Odds </li></ul><ul><li>Complementary event : If there are only two </li></ul><ul><li>possible events and they are mutually exclusive, denoted </li></ul><ul><li>by E and Ē, then P(Ē) = 1 - P(E) </li></ul><ul><li>Odds of event E : Odds(E) = P(E) / (1 - P(E)) </li></ul><ul><li>Question : Football game between teams A and B, </li></ul><ul><li>If P(A win)=0.8, then P(B win)=? </li></ul><ul><li>Odds (A win)=? Odds (B win)=? </li></ul>
  7. 7. E.g. If the incidence rates of influenza in classes A, B and C are 60%, 50% and 40%, then the odds are 0.6/0.4 = 1.5, 0.5/0.5 = 1 and 0.4/0.6 ≈ 0.67, respectively. Odds: to measure risk. Odds ratio: to compare risks.
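The influenza example can be checked with a few lines of Python. This is a minimal sketch; the helper names `odds` and `odds_ratio` are illustrative and do not come from the slides.

```python
def odds(p: float) -> float:
    """Odds of an event with probability p, defined as p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


def odds_ratio(p1: float, p2: float) -> float:
    """Ratio of the odds of two events, used to compare two risks."""
    return odds(p1) / odds(p2)


# Incidence rates of influenza in classes A, B and C (from the slide).
rates = {"A": 0.60, "B": 0.50, "C": 0.40}
for name, p in rates.items():
    print(f"class {name}: odds = {odds(p):.3f}")

print(f"odds ratio, A vs B: {odds_ratio(rates['A'], rates['B']):.3f}")
print(f"odds ratio, A vs C: {odds_ratio(rates['A'], rates['C']):.3f}")
```

The same helper answers the football question on the previous slide: `odds(0.8)` for team A and `odds(0.2)` for team B.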
  8. 8. 2.1.3 Bayes’ formula <ul><li>Smoking ( A ) -> Lung cancer ( B ) ? </li></ul><ul><li>Randomly divide the subjects into two groups; </li></ul><ul><li>Ask one group to smoke and forbid the other; </li></ul><ul><li>Follow up year by year to obtain the number of </li></ul><ul><li>subjects “with lung cancer” …… </li></ul><ul><li>Unfortunately, this is ethically infeasible. Then how? </li></ul>
  9. 9. <ul><li>To find P(B ∣ A) </li></ul><ul><li>by Bayes’ formula: P(B ∣ A) = P(A ∣ B) P(B) / P(A) </li></ul><ul><li>Example 2.1 </li></ul>
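A minimal sketch of how Bayes' formula turns observable quantities into the conditional probability of lung cancer given smoking. The numbers below are hypothetical placeholders chosen only for illustration; they are not the figures of Example 2.1, which did not survive in this transcript.

```python
def bayes_posterior(p_a_given_b: float, p_b: float, p_a: float) -> float:
    """Return P(B | A) = P(A | B) * P(B) / P(A)."""
    if not 0.0 < p_a <= 1.0:
        raise ValueError("P(A) must be in (0, 1]")
    return p_a_given_b * p_b / p_a


# Hypothetical inputs (assumptions, not data from the slides):
p_smoker_given_cancer = 0.80  # P(A | B): share of smokers among lung cancer patients
p_cancer = 0.001              # P(B): lung cancer rate in the general population
p_smoker = 0.30               # P(A): share of smokers in the general population

p_cancer_given_smoker = bayes_posterior(p_smoker_given_cancer, p_cancer, p_smoker)
print(f"P(B | A) = {p_cancer_given_smoker:.5f}")
print(f"risk relative to the general population = {p_cancer_given_smoker / p_cancer:.2f}")
```

All three inputs can be estimated from surveys and patient records, which is why the formula avoids the infeasible smoking experiment.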
  10. 10. Conclusion: The risk of lung cancer for smokers is 5 times that for the general population.
  11. 11. 2.3 Binomial Distribution P(white ball) = 0.8, P(yellow ball) = 0.2
  12. 12. In general, if π is the probability of an event appearing in a trial, n is the number of independently repeated trials, and X is the random variable counting the total number of times the event appears, then the probability of X = x is P(X = x) = C(n, x) π^x (1 - π)^(n - x), x = 0, 1, …, n. This variable X is called a binomial variable , or X is said to follow a binomial distribution , denoted as X ~ B(n, π). Why is it called binomial? See the following expansion: [π + (1 - π)]^n = Σ C(n, x) π^x (1 - π)^(n - x) = 1, where the sum runs over x = 0, 1, …, n.
  13. 13. 2.3.2 Plot of Binomial Distribution
  14. 14. 2.3.3 Population mean and population variance: for X ~ B(n, π), μ = nπ and σ² = nπ(1 - π)
  15. 15. <ul><li>Example Five “exactly identical” animals were </li></ul><ul><li>injected with a poison at a dose of LD50 (under </li></ul><ul><li>such a dose, P (death) = 50% ) </li></ul><ul><li>Since </li></ul><ul><li>the possible number of deaths X was 0, 1, …, 5; </li></ul><ul><li>the probability that each animal died from the </li></ul><ul><li>injected poison is π = 0.5; </li></ul><ul><li>independently repeated n = 5 times; </li></ul><ul><li>X followed B(5, 0.5) </li></ul>
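The LD50 example can be tabulated directly from the binomial probability function. This is a small sketch using only the standard library; `binomial_pmf` is an illustrative name.

```python
from math import comb


def binomial_pmf(x: int, n: int, pi: float) -> float:
    """P(X = x) for X ~ B(n, pi): C(n, x) * pi^x * (1 - pi)^(n - x)."""
    return comb(n, x) * pi**x * (1.0 - pi) ** (n - x)


n, pi = 5, 0.5  # five animals, P(death) = 50% at the LD50 dose
probs = [binomial_pmf(x, n, pi) for x in range(n + 1)]
for x, p in enumerate(probs):
    print(f"P(X = {x}) = {p:.5f}")

# Population mean and variance computed from the probabilities;
# they should agree with n*pi and n*pi*(1 - pi).
mean = sum(x * p for x, p in enumerate(probs))
var = sum((x - mean) ** 2 * p for x, p in enumerate(probs))
print(f"mean = {mean}, variance = {var}")
```

Because π = 0.5, the distribution is symmetric: 0 and 5 deaths are equally likely, as are 1 and 4, and 2 and 3.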
  16. 16. 2.4 Poisson Distribution <ul><li>Distribution of rare “articles” </li></ul><ul><li>Special case of the binomial distribution : </li></ul><ul><li>Big n , small π </li></ul><ul><li>Example Pulse count of a radioactive isotope. </li></ul><ul><li>Large n and 0-1 : Divide the period into n sub-intervals; the possible number of pulses in a sub-interval = 0 or 1 </li></ul><ul><li>Rare event : the probability π of a pulse in a sub-interval is small </li></ul><ul><li>Independent </li></ul>
  17. 17. It can be proved that, when n -> ∞ with nπ = λ held fixed, the binomial probability C(n, x) π^x (1 - π)^(n - x) tends to P(X = x) = e^(-λ) λ^x / x!, x = 0, 1, 2, …. In general, if the probability function of a random variable X has the above form, then we say that this variable follows a Poisson distribution with parameter λ, denoted by X ~ Poisson(λ).
  18. 18. Example : Red cell count on a glass slide. Since: divide the glass slide into n small grids ---- big n , each grid holding 0 or 1 cell; P (a red cell) = π ---- small probability; with or without a cell ---- independent; therefore, number of cells ~ Poisson distribution
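The limit can be observed numerically. The sketch below holds the product of n and the per-trial probability fixed at λ and lets n grow; the values λ = 2 and x = 3 are arbitrary choices for illustration.

```python
from math import comb, exp, factorial


def binomial_pmf(x: int, n: int, pi: float) -> float:
    """P(X = x) for a binomial variable with n trials and success probability pi."""
    return comb(n, x) * pi**x * (1.0 - pi) ** (n - x)


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson variable with parameter lam: e^(-lam) * lam^x / x!."""
    return exp(-lam) * lam**x / factorial(x)


lam, x = 2.0, 3  # arbitrary illustrative values
print(f"Poisson limit: {poisson_pmf(x, lam):.6f}")
for n in (10, 100, 1_000, 10_000):
    # Per-trial probability shrinks as n grows so that n * pi stays equal to lam.
    print(f"n = {n:>6}: binomial = {binomial_pmf(x, n, lam / n):.6f}")
```

The binomial values approach the Poisson value as n increases, which is the "big n, small probability" situation described on the previous slide.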
  19. 19. <ul><li>Note: “independent” and “repeated” are important; </li></ul><ul><li>without these two, the distribution will not be a </li></ul><ul><li>Poisson distribution. </li></ul><ul><li>Example: </li></ul><ul><li>For a rare infectious disease, the number of patients does not follow a Poisson distribution, because the cases are not independent. </li></ul><ul><li>When the bacteria are clustered in milk, the total number of bacteria does not follow a Poisson distribution either. </li></ul>
  20. 20. 2.4.2 Plot of probability function: when λ is small, positive skew; when λ is large, approximately symmetric
  21. 21. Property of Poisson Distribution <ul><li>population mean = population variance = λ </li></ul><ul><li>Additive property </li></ul><ul><li>If X1 ~ Poisson(λ1) and X2 ~ Poisson(λ2) are </li></ul><ul><li>independent of each other, then X1 + X2 ~ Poisson(λ1 + λ2) </li></ul><ul><li>If X1, X2, …, Xk are independent with Xi ~ Poisson(λi), </li></ul><ul><li>then X1 + X2 + … + Xk ~ Poisson(λ1 + λ2 + … + λk) </li></ul>
  22. 22. <ul><li>If X ~ Poisson(λ), </li></ul><ul><li>then 2X does not follow Poisson(2λ): its mean is 2λ but its variance is 4λ </li></ul><ul><li>does not follow </li></ul>However
  23. 23. Example : Five samples were taken from a river, and the number of E. coli (colibacillus) in each was counted <ul><li>1st sample, X1 ~ Poisson(λ1) </li></ul><ul><li>2nd sample, X2 ~ Poisson(λ2) </li></ul><ul><li>…………… </li></ul><ul><li>5th sample, X5 ~ Poisson(λ5) </li></ul><ul><li>If these 5 samples are mixed, the total number of </li></ul><ul><li>E. coli also follows a Poisson distribution </li></ul><ul><li>X1 + X2 + … + X5 ~ Poisson(λ1 + λ2 + … + λ5) </li></ul><ul><li>Application of the additive property: </li></ul><ul><li>to enlarge the parameter, and thereby make the </li></ul><ul><li>distribution more symmetric, we may pool small units to enlarge the observed unit. </li></ul>
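The additive property can be verified numerically for two independent Poisson variables by convolving their probability functions. The parameters 1.5 and 2.5 below are arbitrary illustrative values, not data from the river example.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson variable with parameter lam."""
    return exp(-lam) * lam**x / factorial(x)


def pmf_of_sum(k: int, lam1: float, lam2: float) -> float:
    """P(X1 + X2 = k) for independent Poisson variables, by direct convolution."""
    return sum(poisson_pmf(i, lam1) * poisson_pmf(k - i, lam2) for i in range(k + 1))


lam1, lam2 = 1.5, 2.5  # arbitrary illustrative parameters
for k in range(6):
    convolved = pmf_of_sum(k, lam1, lam2)
    pooled = poisson_pmf(k, lam1 + lam2)
    print(f"k = {k}: convolution = {convolved:.6f}, Poisson(lam1 + lam2) = {pooled:.6f}")
```

The two columns agree, which is what allows pooling several small observation units into one larger unit with a larger parameter.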
  24. 24. 2.5 Normal Distribution <ul><li>In practice, the frequency histograms of many </li></ul><ul><li>continuous random variables look like this: </li></ul><ul><li>taller around the center, shorter on the two sides, and symmetric. </li></ul>
  25. 25. [Figure: normal curves centered at μ1, μ2, μ3] Two parameters: population mean μ and population variance σ². Normal distribution denoted by N(μ, σ²)
  26. 26. Standard normal distribution: μ = 0, σ = 1, denoted by N(0, 1). Any normal variable X ~ N(μ, σ²) follows N(0, 1) after the standardization transformation Z = (X - μ) / σ. Z is called the standardized normal deviate , or Z-value , or Z-score
  27. 27. 2.5.2 Area under the normal probability density curve <ul><li>A table for the standard normal distribution is usually </li></ul><ul><li>attached in most textbooks of statistics. (P. 479) </li></ul><ul><li>---- Given z , find Φ(z), the area under the curve to the left of z </li></ul>
  28. 28. <ul><li>The area within ±1.96 </li></ul>Corresponding to 1.96, the area of one tail is 0.025, the area of two tails is 0.025 × 2 = 0.05, so the area within ±1.96 is 0.95
  29. 29. <ul><li>The area within ±2.58 </li></ul>Corresponding to 2.58, the area of one tail is 0.005, the area of two tails is 0.010, so the area within ±2.58 is 0.99; Φ(-2.58) = 0.005
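The tail areas quoted for 1.96 and 2.58 can be reproduced without a printed table. This sketch writes Φ in terms of the error function from Python's standard `math` module; the function name `phi` is illustrative.

```python
from math import erf, sqrt


def phi(z: float) -> float:
    """Standard normal cumulative probability: area under the curve to the left of z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


for z in (1.96, 2.58):
    one_tail = phi(-z)            # area to the left of -z
    within = phi(z) - phi(-z)     # area between -z and +z
    print(f"z = {z}: one tail = {one_tail:.4f}, "
          f"two tails = {2 * one_tail:.4f}, within = {within:.4f}")
```

The printed tail areas round to the 0.025 and 0.005 given on the slides, since 1.96 and 2.58 are themselves rounded critical values.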
  30. 31. Critical values: z(α/2), the two-sided critical value; z(α), the one-sided critical value
  31. 32. When X1 ~ N(μ1, σ1²) and X2 ~ N(μ2, σ2²) are independent, X1 + X2 still follows a normal distribution: X1 + X2 ~ N(μ1 + μ2, σ1² + σ2²)
  32. 33. 2.5.3 Determination of a reference range <ul><li>Reference range or normal range : the range covering most </li></ul><ul><li>“healthy people”. “Most” : 95% or 99% </li></ul><ul><li>“Healthy people”: should be well defined </li></ul><ul><li>Determined from a large sample </li></ul><ul><li>1. If the variable follows a normal distribution, </li></ul><ul><li>then μ ± 1.96σ covers 95% of “healthy people”. </li></ul><ul><li>However, μ and σ are usually unknown! They may be </li></ul><ul><li>replaced by the sample mean X̄ and sample standard deviation S (this is why a large sample is needed) </li></ul><ul><li>Therefore, reference range: X̄ ± 1.96 S </li></ul>
  33. 34. <ul><li>2. If the variable does not follow a normal distribution, then find the 2.5th percentile P2.5 and the 97.5th percentile P97.5 </li></ul><ul><li>Therefore, reference range: (P2.5, P97.5) </li></ul>
  34. 35. Example Based on the hemoglobin data of 120 healthy females, the sample mean X̄ and standard deviation S are computed, and the histogram shows that the data approximately follow a normal distribution. Please estimate the two-sided 95% reference range for females.
  35. 36. Caution <ul><li>The 95% reference range only tells us that the measurements of 95% of healthy individuals fall within this range; </li></ul><ul><li>If someone’s measurement falls within this range, can we claim “normal”? </li></ul><ul><li>If someone’s measurement is outside this range, can we claim “abnormal”? </li></ul><ul><li>---- The reference range can never serve as a criterion for diagnosis. </li></ul>
  36. 37. 2.5.4 Normal approximation of binomial distribution and Poisson distribution <ul><li>When n is large enough (nπ > 5, n(1 - π) > 5), the </li></ul><ul><li>binomial distribution is approximately a </li></ul><ul><li>normal distribution </li></ul><ul><li>When λ is large enough (λ ≥ 20), the Poisson </li></ul><ul><li>distribution is approximately a normal </li></ul><ul><li>distribution </li></ul>
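  37. 40. Example The infection rate of hookworm is 13%. If 150 people are randomly selected, what is the probability that at least 20 of them are infected? Here nπ = 150 × 0.13 = 19.5 and nπ(1 - π) ≈ 16.97, so σ ≈ 4.12. Taking the area of the histogram rectangles at 20 and above, which starts at 19.5, gives P(X ≥ 20) ≈ 1 - Φ((19.5 - 19.5) / 4.12) = 1 - Φ(0) = 0.5. The probability that at least 20 of them are infected is about 50%.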
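The hookworm calculation can be checked against the exact binomial tail. This is a sketch under the assumption that the slide's 50% comes from a normal approximation with a continuity correction (starting the area at 19.5); the function names are illustrative.

```python
from math import comb, erf, sqrt


def phi(z: float) -> float:
    """Standard normal cumulative probability."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def normal_upper_tail(k: int, n: int, pi: float) -> float:
    """Approximate P(X >= k) for X ~ B(n, pi) using N(n*pi, n*pi*(1 - pi)).

    The continuity correction starts the area at k - 0.5.
    """
    mu = n * pi
    sigma = sqrt(n * pi * (1.0 - pi))
    return 1.0 - phi((k - 0.5 - mu) / sigma)


def exact_upper_tail(k: int, n: int, pi: float) -> float:
    """Exact P(X >= k) for X ~ B(n, pi), summing the binomial probabilities."""
    return sum(comb(n, x) * pi**x * (1.0 - pi) ** (n - x) for x in range(k, n + 1))


n, pi, k = 150, 0.13, 20
print(f"normal approximation: {normal_upper_tail(k, n, pi):.4f}")
print(f"exact binomial tail:  {exact_upper_tail(k, n, pi):.4f}")
```

The two results are close but not identical, because the binomial distribution with a success probability of 0.13 is slightly skewed.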
  38. 41. <ul><li>Example The pulse count of a radioactive isotope </li></ul><ul><li>in 0.5 hour follows a Poisson distribution. </li></ul><ul><li>Please estimate the probability that the pulse </li></ul><ul><li>count measured is greater than 400. </li></ul>
  39. 42. Summary <ul><li>Three distributions : </li></ul><ul><li>Discrete variable : Binomial distribution </li></ul><ul><li>Poisson distribution </li></ul><ul><li>Continuous variable : Normal distribution </li></ul><ul><li>1. Binomial distribution </li></ul><ul><li>Possible values in one trial: 0, 1 </li></ul><ul><li>Probability of a positive event in one trial = π, </li></ul><ul><li>Probability of a negative event in one trial = 1 - π, </li></ul><ul><li>Independently repeated n times </li></ul><ul><li>Total number of positive events </li></ul><ul><li>2. Poisson distribution </li></ul><ul><li>When π or (1 - π) is very small and n is very large, </li></ul><ul><li>the binomial distribution is approximately a Poisson distribution. </li></ul>
  40. 43. <ul><li>3. Normal distribution ---- very important </li></ul><ul><li>Many phenomena follow normal distributions; </li></ul><ul><li>Important basis of statistical theory </li></ul><ul><li>Two parameters : </li></ul><ul><li>Mean μ, standard deviation σ </li></ul><ul><li>Z-transformation </li></ul><ul><li>Area under the curve of the normal distribution </li></ul>
  41. 44. <ul><li>4. Normal approximation </li></ul><ul><li>When n is large (both nπ and n(1 - π) > 5), </li></ul><ul><li>B(n, π) is approximately N(nπ, nπ(1 - π)) </li></ul><ul><li>When λ is large (λ ≥ 20), </li></ul><ul><li>Poisson(λ) is approximately N(λ, λ) </li></ul>5. Web resources http://statpages.org/
  42. 45. <ul><li>Thanks </li></ul>
