# Chapter 2 Probability and Distribution

Published in: Health & Medicine, Technology

### Transcript

• 1.
• Chapter-2
• 2.
• Chapter 2
• Probability and Distribution
• 3. Regular statement in statistics
• Two parts :
• Conclusion
• Probability that the conclusion is true
• 4. 2.1 Explanation of Probability and Related Concepts
• 2.1.1 Probability
• Rolling a die
• Possible outcomes: 1, 2, …, 6
• Probability of “1”= 1/6
• Color blindness test
• Possible outcome: normal, abnormal
• Probability of “abnormal” = ? Unknown!
• Survey: randomly select $n$ students;
• if $m$ of them are color-blind, then
• the probability of abnormal is estimated by $P(\text{abnormal}) \approx m/n$
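The survey idea above, estimating an unknown probability by the relative frequency m/n, can be sketched with a small simulation. This is only an illustration, assuming a fair die so that the true value 1/6 is known in advance; the function name `relative_frequency` and the seed are arbitrary choices, not part of the slides.

```python
import random

def relative_frequency(n_trials: int, seed: int = 0) -> float:
    """Estimate P(die shows 1) as m / n from n_trials simulated rolls."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    m = sum(1 for _ in range(n_trials) if rng.randint(1, 6) == 1)
    return m / n_trials

estimate = relative_frequency(60_000)
print(estimate)  # should be close to 1/6 (about 0.1667) for large n
```

The larger n is, the closer m/n tends to be to the true probability, which is why the color blindness survey needs a reasonably large random sample.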
• 5.
• In general,
• Events: the possible outcomes $A$, $B$, …
• Probability of the event $E$: $P(E)$,
• always between 0 and 1
• Conditional probability:
• under the condition that event $A$ appears, the
• probability of the event $B$, written $P(B \mid A)$
• For example,
• $P(\text{nasopharyngeal carcinoma} \mid \text{EB virus}+)$
• 6.
• 2.1.2 Odds
• Complementary event: if there are only two
• possible events and they are mutually exclusive, denoted
• by $E$ and $\bar{E}$, then $P(\bar{E}) = 1 - P(E)$
• Odds of event $E$: $\mathrm{Odds}(E) = \dfrac{P(E)}{1 - P(E)}$
• Question: in a football game between teams A and B,
• if P(A win) = 0.8, then P(B win) = ?
• Odds(A win) = ? Odds(B win) = ?
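The football question can be checked numerically. A minimal sketch, assuming the usual definition odds(E) = P(E) / (1 - P(E)) and assuming a draw is impossible, so that P(B win) = 1 - P(A win); the function name `odds` is illustrative.

```python
def odds(p: float) -> float:
    """Odds of an event with probability p, defined as p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)

p_a = 0.8        # P(A win), given in the question
p_b = 1.0 - p_a  # complementary event, assuming no draw

print(odds(p_a))  # roughly 4: A is four times as likely to win as to lose
print(odds(p_b))  # roughly 0.25
```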
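• 7. E.g., if the incidence rates of influenza in classes A, B and C are 60%, 50% and 40%, then the odds are 0.6/0.4 = 1.5, 0.5/0.5 = 1 and 0.4/0.6 ≈ 0.67 respectively.
• Odds: to measure risk
• Odds ratio: to compare risks, e.g. the odds ratio of class A to class C is 1.5/0.67 = 2.25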
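The influenza example can be worked through in a few lines. This is a sketch assuming odds = p / (1 - p) and odds ratio = odds of one group divided by odds of the other; the helper names are illustrative.

```python
def odds(p: float) -> float:
    """Odds of an event with probability p."""
    return p / (1.0 - p)

def odds_ratio(p1: float, p2: float) -> float:
    """Odds ratio comparing a group with risk p1 to a group with risk p2."""
    return odds(p1) / odds(p2)

# Incidence rates of influenza from the example
rates = {"A": 0.60, "B": 0.50, "C": 0.40}
class_odds = {name: odds(p) for name, p in rates.items()}

for name, value in class_odds.items():
    print(name, round(value, 3))  # A 1.5, B 1.0, C 0.667

print(round(odds_ratio(rates["A"], rates["C"]), 2))  # 2.25
```

An odds ratio above 1 means the first class carries the higher risk; an odds ratio of exactly 1 means the two risks are equal.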
• 8. 2.1.3 Bayes ’ formula
• Smoking ( A ) -> Lung cancer ( B ) ?
• Randomly divide the subjects into two groups;
• Ask one group to smoke and forbid the other from smoking;
• Follow up year by year to obtain the number of
• subjects “with lung cancer” ……
• Unfortunately, such an experiment is ethically infeasible. Then how?
• 9.
• To find $P(B \mid A)$
• by Bayes’ formula: $P(B \mid A) = \dfrac{P(A \mid B)\,P(B)}{P(A)}$
• Example 2.1
• 10. Conclusion: The risk of lung cancer for smokers is 5 times that for the general population.
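Bayes’ formula turns quantities that can be observed without an experiment, P(A | B), P(B) and P(A), into the quantity of interest, P(B | A). The sketch below assumes the standard form P(B | A) = P(A | B) P(B) / P(A). The three input values are hypothetical placeholders chosen only to show the arithmetic; they are not the figures of Example 2.1 and do not reproduce its conclusion.

```python
def bayes_posterior(p_a_given_b: float, p_b: float, p_a: float) -> float:
    """Return P(B | A) = P(A | B) * P(B) / P(A)."""
    if p_a <= 0.0:
        raise ValueError("P(A) must be positive")
    return p_a_given_b * p_b / p_a

# Hypothetical inputs (NOT taken from the slides)
p_smoker_given_cancer = 0.8  # P(A | B): share of smokers among lung cancer patients
p_cancer = 0.001             # P(B): lung cancer risk in the whole population
p_smoker = 0.4               # P(A): share of smokers in the whole population

p_cancer_given_smoker = bayes_posterior(p_smoker_given_cancer, p_cancer, p_smoker)
relative_risk = p_cancer_given_smoker / p_cancer  # equals P(A | B) / P(A)

print(p_cancer_given_smoker)  # about 0.002 with these placeholder inputs
print(relative_risk)          # about 2 with these placeholder inputs
```

Note that the ratio P(B | A) / P(B) depends only on P(A | B) / P(A), so the comparison "smokers versus the general population" does not require knowing P(B) precisely.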
• 11. 2.3 Binomial Distribution
• P(white ball) = 0.8
• P(yellow ball) = 0.2
• 12. In general, if
• $\pi$: the probability of an event appearing in a trial
• $n$: the number of independently repeated trials
• $X$: random variable, the total number of times the event appears,
• then the probability of $X = x$ is $P(X = x) = \binom{n}{x}\pi^{x}(1-\pi)^{n-x}$, $x = 0, 1, \ldots, n$
• This variable $X$ is called a binomial variable, or $X$ is said to follow a binomial distribution, denoted $X \sim B(n, \pi)$
• Why is it called binomial? See the following expansion: $[\pi + (1-\pi)]^{n} = \sum_{x=0}^{n} \binom{n}{x}\pi^{x}(1-\pi)^{n-x}$
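The binomial probability function can be evaluated directly. A minimal sketch, assuming the standard form P(X = x) = C(n, x) π^x (1 - π)^(n - x); it reuses P(white ball) = 0.8 from the ball example, while n = 3 draws with replacement is an assumption made here for illustration.

```python
from math import comb

def binomial_pmf(x: int, n: int, pi: float) -> float:
    """P(X = x) for X ~ B(n, pi)."""
    return comb(n, x) * pi**x * (1.0 - pi) ** (n - x)

n, pi = 3, 0.8  # n is an assumed number of draws; pi = P(white ball)
pmf = [binomial_pmf(x, n, pi) for x in range(n + 1)]

for x, p in enumerate(pmf):
    print(x, round(p, 3))  # 0 0.008, 1 0.096, 2 0.384, 3 0.512

# The terms are exactly those of the binomial expansion of [pi + (1 - pi)]^n,
# so they must add up to 1.
print(round(sum(pmf), 6))  # 1.0
```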
• 13. 2.3.2 Plot of Binomial Distribution
• 14. 2.3.3 Population mean and population variance: for $X \sim B(n, \pi)$, $\mu = n\pi$ and $\sigma^{2} = n\pi(1-\pi)$
• 15.
• Example: Five “exactly the same” animals were
• injected with a poison at the dose LD50 (under
• such a dose, P(death) = 50%).
• Since
• the possible number of deaths is $X = 0, 1, \ldots, 5$;
• the probability that each animal dies from the
• injected poison is $\pi = 0.5$;
• the trial is independently repeated $n = 5$ times;
• $X$ follows $B(5, 0.5)$
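The LD50 example can be tabulated in full. This sketch assumes X ~ B(5, 0.5) and the standard binomial results mean = nπ and variance = nπ(1 - π), which it checks against the values computed from the probability table.

```python
from math import comb

n, pi = 5, 0.5  # five animals, P(death) = 50% at the LD50 dose

# Probability of each possible number of deaths, 0 through 5
pmf = {x: comb(n, x) * pi**x * (1 - pi) ** (n - x) for x in range(n + 1)}
print(pmf)
# {0: 0.03125, 1: 0.15625, 2: 0.3125, 3: 0.3125, 4: 0.15625, 5: 0.03125}

# Mean and variance computed from the table
mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

print(mean, n * pi)                 # 2.5 2.5
print(variance, n * pi * (1 - pi))  # 1.25 1.25
```

Even at the LD50 dose, exactly "half" of five animals cannot die; the most likely outcomes are 2 or 3 deaths, each with probability 0.3125.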
• 16. 2.4 Poisson Distribution
• Distribution of rare “articles”
• Special case of the binomial distribution:
• large $n$, small $\pi$
• Example: pulse count of a radioactive isotope.
• Large $n$ and 0-1: divide the period into $n$ sub-intervals; the possible number of pulses in a sub-interval is 0 or 1
• Rare event: the probability of a pulse in any one sub-interval is small
• Independent: the sub-intervals are independent of each other
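• 17. It can be proved that when $n \to \infty$ with $n\pi = \lambda$ held fixed, the binomial probability $\binom{n}{x}\pi^{x}(1-\pi)^{n-x}$ tends to $P(X = x) = \dfrac{e^{-\lambda}\lambda^{x}}{x!}$, $x = 0, 1, 2, \ldots$
• In general, if the probability function of a random variable $X$ has the above form, we say that this variable follows a Poisson distribution with parameter $\lambda$, denoted by $X \sim \Pi(\lambda)$.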
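The limit can be observed numerically. A minimal sketch, assuming the standard Poisson probability function e^(-λ) λ^x / x! and a binomial with success probability λ/n, so that the mean stays fixed at λ as n grows; the values λ = 2 and x = 3 are arbitrary choices for illustration.

```python
from math import comb, exp, factorial

def binomial_pmf(x: int, n: int, pi: float) -> float:
    """P(X = x) for a binomial distribution with n trials and probability pi."""
    return comb(n, x) * pi**x * (1.0 - pi) ** (n - x)

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson distribution with parameter lam."""
    return exp(-lam) * lam**x / factorial(x)

lam, x = 2.0, 3  # arbitrary illustration values
for n in (10, 100, 1000, 10000):
    # Success probability shrinks as n grows so that n * pi stays equal to lam
    print(n, binomial_pmf(x, n, lam / n), poisson_pmf(x, lam))
```

The binomial column approaches the Poisson value as n increases, which is the sense in which the Poisson distribution is a limiting case of the binomial.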
• 18. Example: Red cell count on a glass slide. Since
• the glass slide can be divided into $n$ small grids: large $n$, each grid holding 0 or 1 cell;
• P(a red cell in a grid) = $\pi$: small probability;
• grids are with or without a cell independently of each other;
• therefore, number of cells ~ Poisson distribution
• 19.
• Note: “independent” and “repeated” are important;
• without these two conditions, the distribution will not be a
• Poisson distribution.
• Example:
• For a rare infectious disease, the number of patients does not follow a Poisson distribution, because the cases are not independent of each other.
• When the bacteria are clustered in milk, the total number of bacteria does not follow a Poisson distribution either.
• 20. 2.4.2 Plot of probability function: when $\lambda$ is small, positive skew; when $\lambda$ is large, approximately symmetric
• 21. Property of Poisson Distribution
• population mean = population variance = λ
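• If $X_1 \sim \Pi(\lambda_1)$ and $X_2 \sim \Pi(\lambda_2)$ are
• independent of each other, then $X_1 + X_2 \sim \Pi(\lambda_1 + \lambda_2)$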
• If
• then
• 22.
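• If $X \sim \Pi(\lambda)$,
• then $2X$ does not follow a Poisson distribution (its mean is $2\lambda$ but its variance is $4\lambda$)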
• does not follow
However
• 23. Example: Five samples were taken from a river and the number of colibacilli (E. coli) in each was counted
• 1st sample: $X_1 \sim \Pi(\lambda_1)$
• 2nd sample: $X_2 \sim \Pi(\lambda_2)$
• ……
• 5th sample: $X_5 \sim \Pi(\lambda_5)$
• If these 5 samples are mixed, the total number of
• colibacilli also follows a Poisson distribution:
• $X_1 + X_2 + \cdots + X_5 \sim \Pi(\lambda_1 + \lambda_2 + \cdots + \lambda_5)$
• To enlarge the parameter, and so make the
• distribution more symmetric, we may pool small units into a larger observed unit.
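The additivity used in this example can be verified for two samples by direct calculation: the distribution of a sum of independent counts is the convolution of their probability functions. The sketch below assumes the standard Poisson probability function; the parameter values 1.5 and 2.5 are hypothetical.

```python
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson distribution with parameter lam."""
    return exp(-lam) * lam**x / factorial(x)

def pmf_of_sum(k: int, lam1: float, lam2: float) -> float:
    """P(X1 + X2 = k) for independent Poisson X1, X2, by convolution."""
    return sum(poisson_pmf(j, lam1) * poisson_pmf(k - j, lam2) for j in range(k + 1))

lam1, lam2 = 1.5, 2.5  # hypothetical parameters of two independent samples
for k in range(6):
    # The two columns agree: the sum behaves like a Poisson with parameter lam1 + lam2
    print(k, round(pmf_of_sum(k, lam1, lam2), 6), round(poisson_pmf(k, lam1 + lam2), 6))
```

Applying the same argument repeatedly extends the result from two samples to five.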
• 24. 2.5 Normal Distribution
• In practice, the frequency histograms of many
• continuous random variables look like this:
• taller around the center, shorter on the two sides, and symmetric.
• 25. Two parameters: population mean $\mu$ and population variance $\sigma^{2}$. Normal distribution denoted by $N(\mu, \sigma^{2})$. (Figure: normal density curves centered at $\mu_1$, $\mu_2$, $\mu_3$.)
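• 26. Standard normal distribution $N(0, 1)$: $\mu = 0$, $\sigma^{2} = 1$. Any normal variable $X \sim N(\mu, \sigma^{2})$ can be standardized by the transformation $Z = \dfrac{X - \mu}{\sigma} \sim N(0, 1)$. $Z$ is called the standardized normal deviate, Z-value, or Z-score.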
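Standardization is a one-line computation. A minimal sketch, assuming the usual definition Z = (X - μ) / σ; the numerical values are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Standardize x from N(mu, sigma^2): number of SDs that x lies from the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical values: mean 100, standard deviation 15
print(z_score(130.0, mu=100.0, sigma=15.0))  # 2.0, i.e. two SDs above the mean
print(z_score(85.0, mu=100.0, sigma=15.0))   # -1.0, i.e. one SD below the mean
```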
• 27. 2.5.2 Area under the normal probability density curve
• A table for the standard normal distribution is
• attached in most statistics textbooks. (P. 479)
• Given $z$, find $\Phi(z)$, the area under the curve to the left of $z$
• 28.
• The area within $\mu \pm 1.96\sigma$ is 95%
• Corresponding to $z = 1.96$, the area of one tail is 0.025 and the area of two tails is $0.025 \times 2 = 0.05$
• 29.
• The area within $\mu \pm 2.58\sigma$ is 99%
• Corresponding to $z = 2.58$, the area of one tail is 0.005 and the area of two tails is 0.010: $\Phi(-2.58) = 0.005$
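The tabulated areas can be reproduced without a printed table. This sketch assumes Φ(z) is the area under the standard normal curve to the left of z and uses the identity Φ(z) = [1 + erf(z / √2)] / 2; the function names are illustrative.

```python
from math import erf, sqrt

def phi(z: float) -> float:
    """Standard normal cumulative probability: area to the left of z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def central_area(z: float) -> float:
    """Area under the standard normal curve between -z and +z."""
    return phi(z) - phi(-z)

print(round(central_area(1.96), 3))  # 0.95
print(round(phi(-1.96), 3))          # 0.025, one tail
print(round(central_area(2.58), 3))  # 0.99
print(round(phi(-2.58), 3))          # 0.005, one tail
```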
• 30.
• 31. Critical value. $z_{\alpha/2}$: two-sided critical value; $z_{\alpha}$: one-sided critical value
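Critical values are the inverse lookup of the table: given a tail area α, find the z that cuts it off. A minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later), assuming the two-sided value leaves α/2 in each tail and the one-sided value leaves α in the upper tail; the function names are illustrative.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def two_sided_critical(alpha: float) -> float:
    """z with area alpha / 2 in each tail."""
    return STANDARD_NORMAL.inv_cdf(1.0 - alpha / 2.0)

def one_sided_critical(alpha: float) -> float:
    """z with area alpha in the upper tail."""
    return STANDARD_NORMAL.inv_cdf(1.0 - alpha)

print(round(two_sided_critical(0.05), 2))  # 1.96
print(round(two_sided_critical(0.01), 2))  # 2.58
print(round(one_sided_critical(0.05), 3))  # 1.645
```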
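• 32. Distribution of $X_1 + X_2$: when $X_1 \sim N(\mu_1, \sigma_1^{2})$ and $X_2 \sim N(\mu_2, \sigma_2^{2})$ are independent, $X_1 + X_2$ still follows a normal distribution, $X_1 + X_2 \sim N(\mu_1 + \mu_2, \sigma_1^{2} + \sigma_2^{2})$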
• 33. 2.5.3 Determination of a reference range
• Reference range or normal range: the range covering most
• “healthy people”. “Most”: 95% or 99%
• “Healthy people”: should be well defined
• Determined from a large sample
• 1. If the variable follows a normal distribution $N(\mu, \sigma^{2})$,
• then $\mu \pm 1.96\sigma$ covers 95% of “healthy people”.
• However, $\mu$ and $\sigma$ are usually unknown! They may be
• replaced by the sample mean $\bar{X}$ and the sample standard deviation $S$ (this is why a large sample is needed)
• Therefore, reference range: $\bar{X} \pm 1.96S$
• 34.
• 2. If the variable does not follow a normal distribution, then find the 2.5th percentile $P_{2.5}$ and the 97.5th percentile $P_{97.5}$
• Therefore, reference range: $(P_{2.5}, P_{97.5})$
• 35. Example: Based on the hemoglobin data of 120 healthy females (sample mean $\bar{X}$, sample standard deviation $S$), the histogram shows that hemoglobin approximately follows a normal distribution. Estimate the two-sided 95% reference range for females.
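Both ways of building a two-sided 95% reference range can be sketched with the Python standard library. The code assumes the normal-based range is mean ± 1.96 SD and the distribution-free range runs from the 2.5th to the 97.5th sample percentile. The data below are hypothetical toy values, not the hemoglobin measurements of the example, and the function names are illustrative.

```python
from statistics import mean, quantiles, stdev

def normal_reference_range(values, z=1.96):
    """Two-sided range mean +/- z * SD; z = 1.96 covers about 95% under normality."""
    m, s = mean(values), stdev(values)  # stdev uses the n - 1 denominator
    return m - z * s, m + z * s

def percentile_reference_range(values):
    """Two-sided 95% range from the 2.5th and 97.5th sample percentiles."""
    # n=40 yields 39 cut points at 2.5%, 5%, ..., 97.5%
    cuts = quantiles(values, n=40, method="inclusive")
    return cuts[0], cuts[-1]

# Hypothetical toy data, for illustration only
print(normal_reference_range([10, 12, 14]))          # about (8.08, 15.92)
print(percentile_reference_range(list(range(401))))  # (10.0, 390.0)
```

With real data, the normal-based range should only be used after checking, for example with a histogram, that the variable is approximately normal; otherwise the percentile range is the safer choice.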
• 36. Caution
• The 95% reference range only tells us that the measurements of 95% of healthy people fall within this range;
• If someone’s measurement falls within this range, can we claim “normal”?
• If someone’s measurement falls outside this range, can we claim “abnormal”?
• ---- The reference range can never be a criterion for diagnosis.
• 37. 2.5.4 Normal approximation of binomial distribution and Poisson distribution
• When $n$ is large enough ($n\pi > 5$ and $n(1-\pi) > 5$), the
• binomial distribution $B(n, \pi)$ is approximated by the
• normal distribution $N(n\pi, n\pi(1-\pi))$
• When $\lambda$ is large enough ($\lambda \ge 20$), the Poisson
• distribution $\Pi(\lambda)$ is approximated by the normal
• distribution $N(\lambda, \lambda)$
• 38.
• 39.
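• 40. Example: The infection rate of hookworm is 13%. If 150 people are randomly selected, what is the probability that at least 20 of them are infected? Here $n\pi = 19.5$ and $n(1-\pi) = 130.5$ are both greater than 5, so $X$ is approximately $N(19.5, 16.965)$. With the continuity correction, $P(X \ge 20) \approx 1 - \Phi\left(\dfrac{19.5 - 19.5}{\sqrt{16.965}}\right) = 1 - \Phi(0) = 0.5$. The probability that at least 20 of them are infected is about 50%. (Figure: the area of the rectangles at and to the right of 20 is approximated by the area under the normal curve.)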
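The hookworm answer can be checked against the exact binomial probability. A sketch, assuming a continuity correction of 0.5 (so that "at least 20" becomes "above 19.5" on the normal scale) and Φ computed from the error function; variable names are illustrative.

```python
from math import comb, erf, sqrt

def phi(z: float) -> float:
    """Standard normal cumulative probability."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

n, pi, k = 150, 0.13, 20
mu = n * pi                      # 19.5
sigma = sqrt(n * pi * (1 - pi))  # square root of 16.965

# Normal approximation with continuity correction: P(X >= 20) ~ P(Y >= 19.5)
approx = 1.0 - phi((k - 0.5 - mu) / sigma)

# Exact binomial tail probability, for comparison
exact = sum(comb(n, x) * pi**x * (1 - pi) ** (n - x) for x in range(k, n + 1))

print(round(approx, 3))  # 0.5
print(exact)             # close to, but not exactly, the approximation
```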
• 41.
• Example: The pulse count of a radioactive isotope
• in 0.5 hour follows a Poisson distribution.
• Estimate the probability that the measured pulse
• count is greater than 400.
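A tail probability of this kind can be estimated with the normal approximation to the Poisson distribution. The sketch below assumes the approximation N(λ, λ) with a continuity correction of 0.5, and compares it with the exact Poisson tail. Because the Poisson parameter of the pulse-count example is not given in the text above, the demonstration uses hypothetical values λ = 25 and a threshold of 30; the same functions apply to the pulse count once its parameter is known.

```python
from math import erf, exp, sqrt

def phi(z: float) -> float:
    """Standard normal cumulative probability."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def poisson_tail_normal(lam: float, k: int) -> float:
    """Approximate P(X > k) for Poisson parameter lam, using N(lam, lam)."""
    return 1.0 - phi((k + 0.5 - lam) / sqrt(lam))

def poisson_tail_exact(lam: float, k: int) -> float:
    """Exact P(X > k) = 1 - sum of P(X = x) for x = 0..k."""
    term = exp(-lam)  # P(X = 0)
    total = term
    for x in range(1, k + 1):
        term *= lam / x  # P(X = x) obtained from P(X = x - 1)
        total += term
    return 1.0 - total

lam, k = 25.0, 30  # hypothetical values, NOT those of the pulse-count example
print(poisson_tail_normal(lam, k))
print(poisson_tail_exact(lam, k))
```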
• 42. Summary
• Three distributions:
• Discrete variable: Binomial distribution
• Poisson distribution
• Continuous variable: Normal distribution
• 1. Binomial distribution
• Possible values in one trial: 0, 1
• Probability of a positive event in one trial = $\pi$,
• Probability of a negative event in one trial = $1 - \pi$,
• Independently repeated $n$ times
• Total number of positive events $X \sim B(n, \pi)$
• 2. Poisson distribution
• When $\pi$ or $(1 - \pi)$ is very small and $n$ is very large,
• the binomial distribution approximates to the Poisson distribution.
• 43.
• 3. Normal distribution ---- very important
• Many phenomena follow normal distributions;
• Important basis of statistical theory
• Two parameters ：
• Mean μ Standard deviation σ
• Z- transformation
• Area under the curve of normal distribution
• 44.
• 4. Normal approximation
• When $n$ is large (both $n\pi$ and $n(1-\pi)$ > 5),
• $B(n, \pi)$ approximates to $N(n\pi, n\pi(1-\pi))$
• When $\lambda$ is large ($\lambda \ge 20$),
• $\Pi(\lambda)$ approximates to $N(\lambda, \lambda)$
• 5. Web resources: http://statpages.org/
• 45.
• Thanks
• 46.