The chi-square distribution is commonly used in hypothesis tests involving categorical data. Some examples:
- Goodness-of-fit test - To test if sample data fits a theoretical distribution (e.g. binomial, Poisson)
- Independence test - To test if two categorical variables are independent
- Homogeneity test - To test if population proportions are equal across groups
The test statistics in these tests (such as Pearson's chi-square statistic and the likelihood-ratio G-statistic) approximately follow a chi-square distribution under the null hypothesis when the sample is large.
Knowing the properties of the chi-square distribution, such as its degrees of freedom, expected value, and critical values, allows us to calculate p-values and decide whether to reject the null hypothesis.
This makes it an important distribution in statistical inference.
Discrete Random Variable (Probability Distribution), by LeslyAlingay
This presentation helps statistics teachers discuss discrete random variables, since it includes examples and solutions.
Content:
- definition of a random variable
- creating a frequency distribution table
- creating a histogram
- solving for the mean, variance, and standard deviation
A detailed description of probability distributions for dummies. It covers random variables and their types (discrete and continuous), their distributions (discrete probability distributions and probability density functions), expected value, and the use of the binomial, Poisson, and normal distributions, with a solved example for each topic.
This presentation covers the distinction between discrete and continuous distributions, with detailed explanations of discrete distributions such as the binomial, Poisson, and hypergeometric distributions.
2. Chebyshev’s Inequality
Chebyshev's inequality guarantees that in any data sample or probability distribution, "nearly all" values are close to the mean.
The precise statement is that no more than 1/k² of the distribution's values can be more than k standard deviations away from the mean.
The inequality has great utility because it can be applied to completely arbitrary distributions (unknown except for mean and variance); for example, it can be used to prove the weak law of large numbers.
[Portrait: Pafnuty Lvovich Chebyshev (1821–1894)]
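The bound can be checked empirically. The following is a minimal sketch, assuming a simulated exponential sample (a choice made here only because it is skewed and clearly non-normal; it is not taken from the slides). It compares the observed fraction of values more than k standard deviations from the mean with the bound 1/k².

```python
import random
import statistics

def fraction_beyond(values, k):
    """Fraction of values lying more than k standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population standard deviation of the sample
    return sum(abs(v - mean) > k * sd for v in values) / len(values)

rng = random.Random(0)  # fixed seed so the run is repeatable
# Assumption: an exponential sample stands in for an "arbitrary" distribution.
sample = [rng.expovariate(1.0) for _ in range(10_000)]

for k in (1.5, 2, 3):
    print(k, fraction_beyond(sample, k), "<=", 1 / k**2)
```

The observed fractions are usually far below the bound, which is the price of a guarantee that holds for every distribution.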
4. Law of Large Numbers
The law of large numbers
(LLN) is a theorem that
describes the result of
performing the same
experiment a large number of
times.
According to the law, the
average of the results obtained
from a large number of trials
should be close to the
expected value, and will tend to
become closer as more trials
are performed.
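A small simulation can illustrate the statement. This is a sketch under the assumption that the experiment is rolling a fair six-sided die (expected value 3.5); the function name is made up for this illustration.

```python
import random

def running_mean_of_die_rolls(n_rolls, seed=0):
    """Return the running average after each of n_rolls fair die rolls."""
    rng = random.Random(seed)
    total = 0
    means = []
    for i in range(1, n_rolls + 1):
        total += rng.randint(1, 6)  # one trial of the experiment
        means.append(total / i)     # average of the first i results
    return means

means = running_mean_of_die_rolls(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, means[n - 1])  # should drift toward the expected value 3.5
```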
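19. • Example
Suppose that we have a box containing 45 numbered balls. We randomly select one ball and let X be its number. Find:
(1) The probability distribution of X
(2) The expectation and variance of X
(3) P(X > 40)
• Solution (taking the balls to be numbered 1 to 45, each equally likely)
(1) P(X = x) = 1/45, x = 1, 2, …, 45
(2) E(X) = (1 + 45)/2 = 23, Var(X) = (45² − 1)/12 = 506/3 ≈ 168.67
(3) P(X > 40) = 5/45 = 1/9 ≈ 0.111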
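The three quantities in this example can be computed directly from the definition of a discrete uniform distribution. This sketch assumes the balls are numbered 1 through 45 and are equally likely to be drawn; exact fractions are used to avoid rounding.

```python
from fractions import Fraction

values = range(1, 46)  # assumption: the balls carry the numbers 1..45
p = Fraction(1, 45)    # each ball is equally likely

mean = sum(x * p for x in values)                  # E(X)
variance = sum((x - mean) ** 2 * p for x in values)  # Var(X)
p_gt_40 = sum(p for x in values if x > 40)         # P(X > 40)

print(mean, variance, p_gt_40)
```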
24. (Discrete) Binomial Distribution
Some Properties of the Binomial Distribution
Symmetric about μ = n/2 (the case p = 1/2): symmetric binomial distribution
Tail to the right (p < 1/2)
Tail to the left (p > 1/2)
27. Example
Two factories, S and L, produce smartphones, each with a failure rate of 5%. If you buy 7 phones from S and 13 phones from L, what is the probability of having at least one failed phone? And what is the probability of having exactly one failed phone? Assume that failures are independent.
Solution. Let X be the number of failed phones from S and Y the number from L:
X ∼ B (7, 0.05) , Y ∼ B (13, 0.05) , X, Y : Independent
X + Y ∼ B (20, 0.05)
Exactly one phone fails: P(X + Y = 1) = 20 × 0.05 × 0.95^19 ≈ 0.3774
At least one phone fails: P(X + Y ≥ 1) = 1 − 0.95^20 ≈ 0.6415
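As a numerical check of this example, the sketch below evaluates the B(20, 0.05) probabilities with a small helper (the helper name is an assumption of this illustration, not something from the slides).

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 20, 0.05  # 7 + 13 phones, each failing independently with probability 0.05

p_exactly_one = binom_pmf(1, n, p)       # exactly one failed phone
p_at_least_one = 1 - binom_pmf(0, n, p)  # complement of "no failed phone"

print(p_exactly_one, p_at_least_one)
```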
28. Criteria for a Binomial Probability Experiment
An experiment is said to be a binomial experiment
provided
1. The experiment is performed a fixed number of
times. Each repetition of the experiment is called a
trial.
2. The trials are independent. This means the
outcome of one trial will not affect the outcome of the
other trials.
3. For each trial, there are two mutually exclusive
outcomes, success or failure.
4. The probability of success is fixed for each trial of
the experiment.
29. Notation Used in the
Binomial Probability Distribution
• There are n independent trials of the experiment
• Let p denote the probability of success so that
1 – p is the probability of failure.
• Let x denote the number of successes in n
independent trials of the experiment. So, 0 ≤ x ≤ n.
30. EXAMPLE Identifying Binomial Experiments
Which of the following are binomial experiments?
(a) A player rolls a pair of fair dice 10 times. The number
X of 7’s rolled is recorded.
(b) The 11 largest airlines had an on-time percentage of
84.7% in November, 2001 according to the Air Travel
Consumer Report. In order to assess reasons for
delays, an official with the FAA randomly selects flights
until she finds 10 that were not on time. The number of
flights X that need to be selected is recorded.
(c) In a class of 30 students, 55% are female. The
instructor randomly selects 4 students. The number X
of females selected is recorded.
31. EXAMPLE Constructing a Binomial Probability
Distribution
According to the Air Travel Consumer Report,
the 11 largest air carriers had an on-time
percentage of 84.7% in November, 2001.
Suppose that 4 flights are randomly selected
from November, 2001 and the number of on-time
flights X is recorded. Construct a probability
distribution for the random variable X using a
tree diagram.
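The tree diagram can be mimicked in code by enumerating every path of four flights. This is a sketch assuming each flight is independently on time with probability 0.847; each of the 16 paths contributes its probability to the count of on-time flights along that path.

```python
from itertools import product

p_on_time = 0.847  # on-time percentage from the example

# dist[x] accumulates P(X = x), where X is the number of on-time flights.
dist = {x: 0.0 for x in range(5)}
for path in product((1, 0), repeat=4):  # 1 = on time, 0 = late; 16 branches
    prob = 1.0
    for outcome in path:
        prob *= p_on_time if outcome else 1 - p_on_time
    dist[sum(path)] += prob

for x, prob in dist.items():
    print(x, round(prob, 4))
```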
33. (Discrete) Geometric Distribution
Repeat Bernoulli trials until the first success. The number of trials is X.
Slot machine: how many times must I play until I hit the jackpot?
36. (Discrete) Negative Binomial Distribution
Repeat Bernoulli trials until the r-th success.
Crane game: how many times must I play to win three dolls?
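Both waiting-time distributions can be sketched with one formula, since the geometric distribution is the negative binomial with r = 1. The win probability below is a hypothetical value chosen for illustration; the slides do not give one.

```python
from math import comb

def neg_binomial_pmf(x, r, p):
    """P(the r-th success occurs on trial x), for x = r, r + 1, ..."""
    if x < r:
        return 0.0
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)

def geometric_pmf(x, p):
    """P(the first success occurs on trial x): the r = 1 special case."""
    return neg_binomial_pmf(x, 1, p)

p_win = 0.2  # hypothetical chance of winning a single play

# Expected number of plays needed to win three dolls (theory: r / p).
expected_plays_for_3 = sum(x * neg_binomial_pmf(x, 3, p_win) for x in range(3, 2000))

print(geometric_pmf(1, p_win), geometric_pmf(5, p_win))
print(expected_plays_for_3)
```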
39. (Discrete) Hypergeometric Distribution
It is similar to the binomial distribution; the difference is the method of sampling.
Binomial experiment: sampling with replacement. Example: normal shooting. Each trial has the same probability.
Hypergeometric experiment: sampling without replacement. Example: Russian roulette. Each trial may have a different probability.
40. (Discrete) Hypergeometric Distribution
A box contains N balls, where r balls are white (r<N)
Suppose that we randomly select n balls from the box. Let X be the number
of white balls selected.
Assumption: Sampling without replacement
[Figure: n items are drawn from a population of N items]
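43. A box contains 50 chips in total, 4 of which are out of order (failed chips). If you select 5 chips, find:
(1) The probability distribution of the number of failed chips among those selected
(2) The probability of having one or two failed chips
(3) The mathematical expectation and variance
Solution. Let the random variable X be the number of failed chips selected, with N = 50, r = 4, n = 5. C(a, b) denotes the binomial coefficient.
(1) P(X = x) = C(4, x) C(46, 5 − x) / C(50, 5), x = 0, 1, 2, 3, 4
(2) P(X = 1) + P(X = 2) = (652740 + 91080) / 2118760 ≈ 0.3511
(3) E(X) = n r / N = 0.4, Var(X) = n (r/N)(1 − r/N)(N − n)/(N − 1) ≈ 0.338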
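The chip example can be worked numerically from the hypergeometric probability function. This sketch uses N = 50, r = 4, n = 5 from the slide; the helper name is an assumption of this illustration.

```python
from math import comb

def hypergeom_pmf(x, N, r, n):
    """P(X = x): x marked items in a sample of n drawn without replacement
    from N items, r of which are marked."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

N, r, n = 50, 4, 5  # 50 chips, 4 failed, 5 selected

pmf = {x: hypergeom_pmf(x, N, r, n) for x in range(0, min(r, n) + 1)}
p_one_or_two = pmf[1] + pmf[2]
mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

print(pmf)
print(p_one_or_two, mean, variance)
```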
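45. A box contains 3 red balls, 2 blue balls, and 5 yellow balls. You select 4 balls. Let X, Y, and Z be the numbers of red, blue, and yellow balls selected. Find:
(1) The joint probability function of X, Y, and Z
(2) The probability of selecting 1 red ball, 1 blue ball, and 2 yellow balls
[Figure: 4 balls drawn, x from the 3 red, y from the 2 blue, and z from the 5 yellow]
(1) Joint probability function: P(X = x, Y = y, Z = z) = C(3, x) C(2, y) C(5, z) / C(10, 4), x + y + z = 4
(2) P(X = 1, Y = 1, Z = 2) = (3 × 2 × 10) / 210 = 2/7 ≈ 0.2857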
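The joint probability function for this example is the multivariate hypergeometric form. The sketch below assumes X, Y, and Z count the red, blue, and yellow balls among the 4 selected, and uses exact fractions.

```python
from fractions import Fraction
from math import comb

def joint_pmf(x, y, z):
    """P(X = x, Y = y, Z = z) when 4 balls are drawn without replacement
    from 3 red, 2 blue, and 5 yellow balls."""
    if x + y + z != 4:
        return Fraction(0)
    return Fraction(comb(3, x) * comb(2, y) * comb(5, z), comb(10, 4))

# Sum over every feasible (x, y, z) as a sanity check: it should be 1.
total = sum(
    joint_pmf(x, y, 4 - x - y)
    for x in range(4)   # 0..3 red
    for y in range(3)   # 0..2 blue
    if x + y <= 4
)

print(joint_pmf(1, 1, 2), total)
```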
46. (Discrete) Poisson Distribution
- Describes events that happen rarely.
- Events in non-overlapping periods are mutually independent.
- The probability of an occurrence is proportional to the length of the period.
- The probability of two or more occurrences in a very short period is essentially zero.
47. (Discrete) Poisson Distribution
It is often used as a model for the number of events (such as the number of
telephone calls at a business, number of customers in waiting lines, number
of defects in a given surface area, airplane arrivals, or the number of
accidents at an intersection) in a specific time period.
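Probability function: P(X = x) = e^(−λ) λ^x / x!, x = 0, 1, 2, …, where λ > 0
This satisfies the conditions for a probability function: every P(X = x) ≥ 0 and the probabilities sum to 1.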
48. (Discrete) Poisson Distribution
Ex. 1. On an average Friday, a waitress gets no tip from 5 customers. Find the probability that she will get no tip from 7 customers this Friday.
The waitress averages 5 customers who leave no tip on Fridays: λ = 5.
Random variable: the number of customers who leave her no tip this Friday.
We are interested in P(X = 7).
Ex. 2. During a typical football game, a coach can expect 3.2 injuries. Find the probability that the team will have at most 1 injury in this game.
A coach can expect 3.2 injuries: λ = 3.2.
Random variable: the number of injuries the team has in this game.
We are interested in P(X ≤ 1) = P(X = 0) + P(X = 1).
49. (Discrete) Poisson Distribution
Ex. 3. A small life insurance company has determined that on average it receives 6 death claims per day. Find the probability that the company receives at least seven death claims on a randomly selected day.
P(X ≥ 7) = 1 − P(X ≤ 6) = 0.393697
Ex. 4. The number of traffic accidents that occur on a particular stretch of road during a month follows a Poisson distribution with a mean of 9.4. Find the probability that fewer than two accidents will occur on this stretch of road during a randomly selected month.
P(X < 2) = P(X = 0) + P(X = 1) = 0.000860
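The four Poisson examples can be evaluated with two short helpers. This is a sketch using only the standard library; the helper names are assumptions of this illustration.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

ex1 = poisson_pmf(7, 5)      # Ex. 1: P(X = 7), lambda = 5
ex2 = poisson_cdf(1, 3.2)    # Ex. 2: P(X <= 1), lambda = 3.2
ex3 = 1 - poisson_cdf(6, 6)  # Ex. 3: P(X >= 7), lambda = 6
ex4 = poisson_cdf(1, 9.4)    # Ex. 4: P(X < 2), lambda = 9.4

print(ex1, ex2, ex3, ex4)
```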
52. Characteristics of Poisson Distribution
E(X) increases with the parameter µ (or λ); in fact E(X) = λ.
The graph becomes broader as the parameter µ (or λ) increases.
56. (Discrete) Poisson Distribution
Comparison of the Poisson
distribution (black dots) and the
binomial distribution with n=10 (red
line), n=20 (blue line), n=1000 (green
line). All distributions have a mean of
5. The horizontal axis shows the
number of events k. Notice that as n
gets larger, the Poisson distribution
becomes an increasingly better
approximation for the binomial
distribution with the same mean
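The comparison in the figure can be reproduced numerically. This sketch measures the largest pointwise gap between the binomial and Poisson probabilities for the same mean of 5; restricting k to 0..30 is a choice made here, since both distributions carry almost no probability beyond that range.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    return exp(-lam) * lam**k / factorial(k)

mean = 5
max_gap = {}
for n in (10, 20, 1000):
    p = mean / n                    # keep the binomial mean n * p equal to 5
    ks = range(0, min(n, 30) + 1)   # values of k compared
    max_gap[n] = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, mean)) for k in ks)

for n, gap in max_gap.items():
    print(n, gap)  # the gap shrinks as n grows
```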
57. Discrete Probability Distributions:
Summary
• Uniform Distribution
• Binomial Distributions
• Multinomial Distributions
• Geometric Distributions
• Negative Binomial Distributions
• Hypergeometric Distributions
• Poisson Distribution
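62. If X ∼ U(0, 1) and Y = a + (b − a) X, find:
(1) The distribution function of Y
(2) The probability function of Y
(3) The expectation and variance of Y
(4) The median of Y
(1)
Since y = a + (b − a) x and 0 ≤ x ≤ 1, we have a ≤ y ≤ b (for a < b), and F(y) = P(Y ≤ y) = P(X ≤ (y − a)/(b − a)) = (y − a)/(b − a).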
65. (Continuous) Exponential Distribution
▶ Analysis of survival rate
▶ Period between first and second earthquakes
▶ Waiting time for events of Poisson distribution
For any positive α
67. From a survey, the time X (in months) between traffic accidents has the probability density
f(x) = 3e^(−3x) (0 ≤ x)
(1) What is the probability that the second accident is observed more than one month after the first accident?
(2) What is the probability that the second accident is observed within 2 months?
(3) Supposing that a month is 30 days, what is the average number of days between accidents?
(1) P(X > 1) = e^(−3) ≈ 0.0498
(2) P(X ≤ 2) = 1 − e^(−6) ≈ 0.9975
(3) μ = 1/3 month, accordingly 10 days.
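The three answers follow from the exponential survival function P(X > x) = e^(−3x). A minimal sketch, assuming X is measured in months as part (3) implies:

```python
from math import exp

rate = 3.0  # from f(x) = 3e^(-3x); assumed to be per month

def survival(x):
    """P(X > x) for the exponential distribution with the rate above."""
    return exp(-rate * x)

def cdf(x):
    """P(X <= x)."""
    return 1 - survival(x)

p_after_one_month = survival(1)  # (1) next accident more than a month later
p_within_two_months = cdf(2)     # (2) next accident within two months
mean_days = (1 / rate) * 30      # (3) mean of 1/3 month, with 30-day months

print(p_after_one_month, p_within_two_months, mean_days)
```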
69. A patient was told that he can survive an average of 100 days. Suppose that the probability density function is given by
f(x) = (1/100) e^(−x/100), x > 0
(1) What is the probability that he dies within 150 days?
(2) What is the probability that he survives 200 days or more?
Since λ = 0.01, the distribution function and survival function are:
F(x) = 1 − e^(−x/100), S(x) = e^(−x/100)
(1) The probability that the patient dies within 150 days:
P(X < 150) = F(150) = 1 − e^(−1.5) = 1 − 0.2231 = 0.7769
(2) The probability that the patient survives 200 days or more:
P(X ≥ 200) = S(200) = e^(−2.0) = 0.1353
70. (Continuous) Exponential Distribution
⊙ Relation with Poisson Process
(1) If events occur according to a Poisson process with rate λ, the waiting
time between neighboring events (T) follows an exponential distribution with
parameter λ.
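The relation can be illustrated in the reverse direction: if waiting times are drawn from an exponential distribution with rate λ, the number of events per unit time should average λ. This is a simulation sketch; the rate and time horizon are hypothetical values chosen for the illustration.

```python
import random

def count_events(rate, horizon, seed=0):
    """Count events up to time `horizon` when successive waiting times
    are independent exponential variables with the given rate."""
    rng = random.Random(seed)
    t, count = 0.0, 0
    while True:
        t += rng.expovariate(rate)  # waiting time until the next event
        if t > horizon:
            return count
        count += 1

rate, horizon = 3.0, 10_000.0  # hypothetical values for the illustration
events = count_events(rate, horizon)
print(events / horizon)        # average events per unit time, close to the rate
```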
74. (Continuous) Gamma Distribution
⊙ Relation with Exponential Distribution
The exponential distribution is a special gamma distribution with α = 1.
If X1, X2, …, Xn are independent exponential random variables with the same
rate 1/β (mean β), their sum S = X1 + X2 + … + Xn follows
a gamma distribution, Γ(n, β).
75. Suppose the time X until a traffic accident is observed in a region has the following
probability density:
f(x) = 3e^(−3x), 0 < x < ∞
Estimate the probability of observing the first two accidents between the first
and second months. Assume that all accidents are independent.
X1 : Time until the first accident
X2 : Time between the first and second accidents
Xi ∼ Exp(1/3) , i = 1, 2
S = X1 + X2 : Time until two accidents have occurred
S ∼ Γ(2, 1/3)
Probability function for S : f(s) = 9 s e^(−3s), s > 0
Answer:
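One way to evaluate this example is sketched below. It assumes the question asks for P(1 < S < 2), that is, that the second accident falls between the end of the first month and the end of the second; that reading is an assumption, since the slide's answer is not preserved. The closed-form distribution function of Γ(2, 1/3) is compared with a simulation that adds two exponential waiting times.

```python
import random
from math import exp

rate = 3.0  # each Xi has density 3e^(-3x)

def gamma2_cdf(s):
    """P(S <= s) for S = X1 + X2, i.e. S ~ Gamma(shape 2, scale 1/3)."""
    return 1 - exp(-rate * s) * (1 + rate * s)

# Assumed reading of the question: P(1 < S < 2).
p_exact = gamma2_cdf(2) - gamma2_cdf(1)

# Monte Carlo check: add two independent exponential waiting times.
rng = random.Random(0)
n = 100_000
hits = sum(1 < rng.expovariate(rate) + rng.expovariate(rate) < 2 for _ in range(n))
p_sim = hits / n

print(p_exact, p_sim)
```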
76. (Continuous) Chi Square Distribution
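A special gamma distribution with α = r/2, β = 2, where r is the degrees of freedom
Probability density: f(x) = x^(r/2 − 1) e^(−x/2) / (Γ(r/2) 2^(r/2)), x > 0
E(X) = r
Var(X) = 2r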
81. (Continuous) Chi Square Distribution
A random variable X follows a chi-square distribution with 5 degrees of
freedom. Calculate the critical value x0 that satisfies P(X < x0) = 0.95.
Since P(X < x0) = 0.95, P(X > x0) = 0.05.
From the table, find the point with d.f. = 5 and α = 0.05, which gives x0 = 11.07.
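The table lookup can be cross-checked without a statistics library. This sketch computes the chi-square distribution function from the standard series for the regularized lower incomplete gamma function and then finds the critical value by bisection; both function names and the search interval are assumptions of this illustration.

```python
import math

def chi2_cdf(x, df):
    """P(X <= x) for a chi-square variable, via the series
    gamma(a, z) = z^a e^(-z) * sum_n z^n / (a (a+1) ... (a+n))."""
    if x <= 0:
        return 0.0
    a, z = df / 2, x / 2
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))

def chi2_quantile(prob, df, lo=0.0, hi=200.0):
    """Smallest x with P(X <= x) >= prob, found by bisection on [lo, hi]."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_cdf(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x0 = chi2_quantile(0.95, 5)
print(x0)  # should agree with the table value for d.f. = 5, alpha = 0.05
```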