Get to know in detail the definitions of the types of probability distributions, from discrete distributions such as the binomial, Poisson, hypergeometric and negative binomial to continuous distributions such as the t-distribution, and much more.
Let me know if anything is required. Ping me at google #bobrupakroy
2. Types of probability distributions.
1. Discrete distribution
Binomial
Poisson
Hypergeometric
Negative binomial
Geometric
2. Continuous distribution
Normal distribution
t-distribution
3. Continuous probability distribution
If a random variable is a continuous variable (i.e. if it can take any value within a specific range), its probability distribution is called a continuous probability distribution.
The equation used to describe a continuous probability distribution is called a probability density function (pdf), or simply a density function.
Rupak Roy
4. Discrete probability distribution
If a random variable is a discrete variable (i.e. if it can take only separate, countable values), its probability distribution is called a discrete probability distribution.
Example:
What is the chance of getting exactly 10 heads out of 20 tosses?
In Excel:
=BINOMDIST(10,20,0.5,FALSE)
The answer is 17.62%
5. Binomial Probability distribution
To understand the binomial distribution and binomial probability, let's first understand what a binomial experiment is.
A binomial experiment is a statistical experiment that has the following properties:
- The experiment consists of n repeated trials.
- Each trial can result in just two possible outcomes, a success or a failure.
- The trials are independent, that is, getting heads on one trial does not affect whether we get heads on other trials.
6. The binomial distribution is therefore the probability distribution of the number of successes x in n repeated trials of a binomial experiment.
A single trial of a binomial experiment is called a Bernoulli trial; a binomial distribution with n = 1 is a Bernoulli distribution.
In Excel:
=BINOM.DIST(number_s, trials, probability_s, cumulative)
Where
number_s = number of successes
trials = total number of trials
probability_s = probability of success on each trial, e.g. 0.5 (50% heads, 50% tails)
cumulative = TRUE (probability of number_s successes or fewer, <=)
FALSE (point probability, exactly number_s successes)
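For reference, the two values returned by BINOM.DIST correspond to the standard binomial formulas below. The symbols n (trials), p (probability_s) and k (number_s) are shorthand introduced here, not Excel names.

```latex
P(X = k) = \binom{n}{k}\, p^{k} (1-p)^{n-k},
\qquad
P(X \le k) = \sum_{i=0}^{k} \binom{n}{i}\, p^{i} (1-p)^{n-i}
```

The first expression is the point probability (cumulative = FALSE) and the second is the cumulative probability (cumulative = TRUE).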
7. Example
- What is the probability of getting exactly 2 heads with 5 coin flips?
number_s = number of successes = 2
trials = total number of trials = 5
probability_s = probability of success = 0.5 (50% heads, 50% tails)
cumulative = FALSE (point probability)
In Excel:
=BINOM.DIST(2,5,0.5,FALSE)
= 0.3125
- What is the probability of getting 2 heads or fewer with 5 coin flips?
=BINOM.DIST(2,5,0.5,TRUE) = 0.5
Note:
cumulative = FALSE (point probability) = exactly 2 heads
cumulative = TRUE (<=) = 2 heads or fewer
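The two coin-flip answers can be cross-checked outside Excel. The following is a minimal Python sketch using only the standard library; the helper names binom_pmf and binom_cdf are made up for this illustration and are not part of any library.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k): plays the role of BINOM.DIST(k, n, p, FALSE)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """P(X <= k): plays the role of BINOM.DIST(k, n, p, TRUE)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

exactly_two = binom_pmf(2, 5, 0.5)    # exactly 2 heads in 5 flips
two_or_fewer = binom_cdf(2, 5, 0.5)   # 2 heads or fewer in 5 flips
print(exactly_two, two_or_fewer)
```

The cumulative value is simply the sum of the point probabilities for 0, 1 and 2 heads.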
8. Example continued
- What is the probability of getting more than 2 heads with 5 coin flips?
=1-BINOM.DIST(2,5,0.5,TRUE)
= 1 - 0.5 = 0.5
i.e. 1 - (probability of getting 2 heads or fewer). This gives the right-hand side of the distribution, that is, the probability of getting more than 2 heads. Remember that a probability is always between 0 and 1.
Alternatively, we can get the same result by adding up each point probability:
= probability(3 heads) + probability(4 heads) + probability(5 heads)
=BINOM.DIST(3,5,0.5,FALSE)+BINOM.DIST(4,5,0.5,FALSE)+BINOM.DIST(5,5,0.5,FALSE)
= 0.5, which is correct but not an efficient way to calculate the probability.
As a simple arithmetic analogy: out of 10 values, taking away the first 2 leaves the 8 values greater than 2.
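To see that the complement and the term-by-term sum agree, here is a small Python sketch (standard library only). The function name binom_pmf is a hypothetical helper written for this example.

```python
from math import comb

def binom_pmf(k, n, p):
    """Point probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 5, 0.5

# Route 1: one minus the probability of 2 heads or fewer.
via_complement = 1 - sum(binom_pmf(k, n, p) for k in range(0, 3))

# Route 2: add the point probabilities for 3, 4 and 5 heads.
via_sum = sum(binom_pmf(k, n, p) for k in range(3, n + 1))

print(via_complement, via_sum)
```

With many trials the complement route needs far fewer terms whenever the threshold is small, which is why it is usually preferred.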
9. Example 2
In a factory unit, the product has a fault rate of 30%. As a quality inspector, you randomly select 15 products.
If 7 of the selected products turn out to be faulty, how likely is that outcome due to randomness?
In Excel:
=BINOM.DIST(number_s, trials, probability_s, cumulative)
Where number_s = 7, trials = 15,
probability_s = 30% (0.3), cumulative = FALSE
=BINOM.DIST(7,15,0.3,FALSE)
= 0.0811 (about 8%)
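The inspection example can be reproduced with a short Python sketch. The point probability matches the Excel call; the tail probability of 7 or more faulty items is an extra quantity added here as an assumption about what "due to randomness" might mean, and it is not computed in the slides. The helper name binom_pmf is hypothetical.

```python
from math import comb

def binom_pmf(k, n, p):
    """Point probability of exactly k faulty items among n inspected."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, fault_rate = 15, 0.3

# Same quantity as BINOM.DIST(7, 15, 0.3, FALSE).
exactly_7 = binom_pmf(7, n, fault_rate)

# Assumption for illustration: probability of 7 or more faulty items.
at_least_7 = sum(binom_pmf(k, n, fault_rate) for k in range(7, n + 1))

print(round(exactly_7, 4), round(at_least_7, 4))
```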
10. Poisson Probability distribution
It is another discrete probability distribution, which gives the probability of a given number of independent events occurring in a fixed interval of time.
Characteristics: the experiment consists of counting the number of events that occur during a specific interval of time, or in a specific distance, area or volume.
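For reference, the standard Poisson formula is given below. The symbol \lambda (the average number of events per interval) and k (the observed count) are notation introduced here for illustration.

```latex
P(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}, \qquad k = 0, 1, 2, \dots
```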
11. Examples
Number of calls in a day.
Number of car accidents in a month.
Number of disease cases spreading over a month.
Number of emergency services needed in a hospital in an hour.
12. A soda vending machine has an average of 80 withdrawals a day and an average transaction amount of $70. The owner needs to know how much he has to stock to maintain an equilibrium profit.
What is the most appropriate amount of soda to stock for 5 days? The owner can tolerate a loss of up to 10%.
In Excel:
=POISSON.DIST(x, mean, cumulative)
Example continued
13. Poisson probability = POISSON.DIST(x, mean, cumulative)
P = POISSON.DIST(x (withdrawals), 80, TRUE)
Probability of more than x withdrawals = 1 - P
We can see the 10% level at 101 withdrawals, which is the appropriate amount to keep in stock.
Hence we can also conclude the amount of sales for those 5 days:
= 101 withdrawals * $70 average * 5 days
= $35,350
Note: here cumulative is TRUE, which gives the probability of x withdrawals or fewer (<=). Therefore 1 - P gives the probability of more than x withdrawals.
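The table lookup described above can be sketched in Python (standard library only). How the 10% tolerance maps to a cutoff is an assumption here: the sketch stops at the first x where the probability of more than x withdrawals drops to 10% or below, so the value it prints is not guaranteed to match the figure of 101 quoted above. The helper names poisson_cdf and smallest_stock are hypothetical.

```python
from math import exp

def poisson_cdf(x, mean):
    """P(X <= x): plays the role of POISSON.DIST(x, mean, TRUE)."""
    term = exp(-mean)          # P(X = 0)
    total = term
    for k in range(1, x + 1):
        term *= mean / k       # P(X = k) built from P(X = k - 1)
        total += term
    return total

def smallest_stock(mean, tolerance):
    """Smallest x whose tail probability P(X > x) is at most tolerance."""
    x = 0
    while 1 - poisson_cdf(x, mean) > tolerance:
        x += 1
    return x

stock = smallest_stock(80, 0.10)
print(stock, 1 - poisson_cdf(stock, 80))
```

Changing the tolerance argument shows how the required stock level grows as the owner accepts less risk of running out.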
14. Hypergeometric Probability Distribution
The hypergeometric distribution is used to calculate probabilities for success/failure trials drawn without replacement.
Assume there are 196 voters in total, of which 95 are male. A random sample of 10 voters is drawn. What is the probability that exactly 7 are male?
In Excel:
=HYPGEOM.DIST(sample_s, number_sample, population_s, number_population, cumulative)
15. Hypergeometric Probability Distribution
In Excel:
=HYPGEOM.DIST(sample_s, number_sample, population_s, number_population, cumulative)
where sample_s = 7, number_sample = 10,
population_s = 95, number_population = 196
Therefore,
=HYPGEOM.DIST(7,10,95,196,FALSE) = 0.100864
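The voter example can be checked by counting combinations directly. This is a minimal Python sketch; the function name hypergeom_pmf is made up for the illustration, and its parameters simply reuse the Excel argument names.

```python
from math import comb

def hypergeom_pmf(sample_s, number_sample, population_s, number_population):
    """P(exactly sample_s successes), sampling without replacement."""
    population_f = number_population - population_s   # failures in population
    sample_f = number_sample - sample_s               # failures in sample
    ways_favourable = comb(population_s, sample_s) * comb(population_f, sample_f)
    ways_total = comb(number_population, number_sample)
    return ways_favourable / ways_total

# 7 males in a sample of 10, drawn from 196 voters of whom 95 are male.
seven_males = hypergeom_pmf(7, 10, 95, 196)
print(round(seven_males, 6))
```

The numerator counts the ways to pick 7 of the 95 males and 3 of the 101 non-males; the denominator counts all possible samples of 10.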
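16. Negative Binomial distribution
A negative binomial distribution describes the number of failures that occur in repeated independent trials before a given number of successes is reached.
In Excel:
=NEGBINOM.DIST(number_f, number_s, probability_s, cumulative)
where number_f = number of failures, number_s = required number of successes, and cumulative = FALSE gives the point probability.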
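For reference, the point probability in its standard form is given below, with f for number_f, s for number_s and p for probability_s (shorthand introduced here, not Excel names).

```latex
P(F = f) = \binom{f + s - 1}{f}\, p^{s} (1-p)^{f}, \qquad f = 0, 1, 2, \dots
```

The binomial coefficient counts the ways to arrange f failures among the first f + s - 1 trials, since the final trial must be the s-th success.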
17. Geometric Probability distribution
It is a special case of the negative binomial distribution that deals with the number of trials required for a single success.
Example: tossing a coin until it lands heads. What is the probability that the first head occurs on the third flip? This is known as a geometric probability.
In Excel:
=NEGBINOM.DIST(number_f, number_s, probability_s, cumulative)
with number_s = 1 and number_f = 2 (two tails before the first head). Use cumulative = FALSE for the first head on exactly the third flip, or TRUE for the first head on or before the third flip.
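The coin example can be worked through in a few lines of Python. This is a sketch under the usual definitions; the helper names negbinom_pmf and geometric_pmf are hypothetical and written only for this illustration.

```python
from math import comb

def negbinom_pmf(failures, successes, p):
    """P(exactly `failures` failures before the `successes`-th success)."""
    return comb(failures + successes - 1, failures) * p**successes * (1 - p)**failures

def geometric_pmf(trial, p):
    """P(first success happens on trial number `trial`, counting from 1)."""
    return (1 - p)**(trial - 1) * p

on_third = geometric_pmf(3, 0.5)              # first head exactly on flip 3
same_via_negbinom = negbinom_pmf(2, 1, 0.5)   # 2 tails before the 1st head
by_third = sum(geometric_pmf(t, 0.5) for t in (1, 2, 3))  # head by flip 3

print(on_third, same_via_negbinom, by_third)
```

The first two values agree because the geometric case is the negative binomial with a single required success.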
18. Normal Probability distribution
A normal probability distribution function gives the probability that a real-valued observation falls between two specified real limits (numbers). For sample means it is generally used when the sample size is more than 30; for smaller samples the t-distribution is used instead.
In probability theory, the normal distribution (or Gaussian distribution) is one of the most common continuous probability distributions. Normal distributions are important in statistics and are often used in the natural and social sciences to represent real-valued random variables whose distributions are not known.
The normal distribution is useful because of the central limit theorem.
Informally, the normal distribution is called the bell curve.
19. Central Limit Theorem (CLT), in brief
The central limit theorem says that, irrespective of the underlying population distribution, when you draw many random samples from the population, each with a sample size of at least 30 (a common rule of thumb), the distribution of the sample averages will be approximately normal, even if the underlying population is not normal.
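A quick simulation can make the theorem concrete. The sketch below is only an illustration: the choice of an exponential population, the number of samples and the random seed are all assumptions made here, not taken from the slides.

```python
import random
import statistics

random.seed(0)  # fixed seed so repeated runs give the same numbers

SAMPLE_SIZE = 30     # observations per sample
NUM_SAMPLES = 2000   # how many samples to draw

# Exponential population with rate 1: strongly skewed, mean 1, std dev 1.
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(SAMPLE_SIZE))
    for _ in range(NUM_SAMPLES)
]

mean_of_means = statistics.fmean(sample_means)
spread_of_means = statistics.stdev(sample_means)
expected_spread = 1 / SAMPLE_SIZE**0.5   # population std dev / sqrt(n)

# Share of sample means within one spread of their centre; a normal
# distribution would put roughly 68% of them there.
within_one = sum(
    abs(m - mean_of_means) <= spread_of_means for m in sample_means
) / NUM_SAMPLES

print(mean_of_means, spread_of_means, expected_spread, within_one)
```

Even though individual exponential values are far from bell-shaped, the sample averages centre on the population mean with a spread close to the population standard deviation divided by the square root of the sample size.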
20. Most of the data values in a normal distribution tend to cluster around the mean. The further a data point is from the mean, the less likely it is to occur.
Normal distributions are symmetric, unimodal and asymptotic, and the mean, median and mode are all equal.
In a normal distribution we never calculate a point probability, only a cumulative one, because the probability of any single exact value is zero. Here we always talk about less than or greater than a value, but never about the probability of an outcome being exactly equal to it.
21. In a normal distribution, 50% of the observations are less than the mean (which is also the median and the mode).
In a normal distribution, about 68% of the observations are within 1 standard deviation of the mean,
and
about 95% of the area of a normal distribution is within 2 standard deviations of the mean.
The mean, median and mode of a normal distribution are equal.
Remember:
The standard deviation is a standard (roughly average) measure of how near or far the observations are from the mean.
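The 50%, 68% and 95% figures can be checked with Python's standard library. This is a small sketch using statistics.NormalDist on the standard normal (mean 0, standard deviation 1); the variable names are chosen for this example.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

below_mean = z.cdf(0)                    # share of area below the mean
within_one_sd = z.cdf(1) - z.cdf(-1)     # area within 1 standard deviation
within_two_sd = z.cdf(2) - z.cdf(-2)     # area within 2 standard deviations

print(below_mean, within_one_sd, within_two_sd)
```

The same proportions hold for any mean and standard deviation, because the distances are measured in standard deviations.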
22. A group of students took a test. The final grades have a mean of 70 and a standard deviation of 10. What fraction of these students
a) scored higher than 80?
b) should pass the test (grades > 60)?
c) should fail the test (grades < 60)?
In Excel:
=NORM.DIST(outcome, mean, standard deviation, cumulative)
A) =1-NORM.DIST(80,70,10,TRUE) = 1 - 0.841 = 0.159
B) =1-NORM.DIST(60,70,10,TRUE) = 1 - 0.159 = 0.841
C) =NORM.DIST(60,70,10,TRUE) = 0.159
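The three grade questions can be reproduced with statistics.NormalDist from the Python standard library. This is a sketch for cross-checking the Excel results; the variable names are chosen for this example.

```python
from statistics import NormalDist

grades = NormalDist(mu=70, sigma=10)

above_80 = 1 - grades.cdf(80)    # a) scored higher than 80
pass_rate = 1 - grades.cdf(60)   # b) grades above 60
fail_rate = grades.cdf(60)       # c) grades below 60

print(round(above_80, 3), round(pass_rate, 3), round(fail_rate, 3))
```

Because 80 and 60 are each one standard deviation from the mean, answers a) and c) are equal by symmetry, and b) and c) add up to 1.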
23. Extra: Probability distributions in R programming
R is open source software for advanced statistical computing.
In R, we will use two kinds of function for each probability distribution:
density and cumulative.
A density function works like the Excel function with FALSE as the cumulative argument.
A cumulative function works like the Excel function with TRUE as the cumulative argument.
For Binomial distribution:
Density function: dbinom(numberOfSuccess, numberOfTrials, probabilityOfSuccess)
Cumulative function: pbinom(numberOfSuccess, numberOfTrials, probabilityOfSuccess)
24. For Negative binomial distribution
Density function:
dnbinom(numberOfFailures, numberOfSuccesses, probability_s)
Cumulative function:
pnbinom(numberOfFailures, numberOfSuccesses, probability_s)
For Hypergeometric distribution
Density function:
dhyper(sample_s, pop_s, pop_f, sample_size)
Cumulative function:
phyper(sample_s, pop_s, pop_f, sample_size)
For Poisson distribution
Density function: dpois(x, mean)
Cumulative function: ppois(x, mean)
For Normal distribution: questions about a normal distribution are always about greater than or less than a value, never about a point probability, so the probability function to use is only the cumulative one
Cumulative: pnorm(observedDataValue, mean, standardDeviation)
25. As we proceed further to advanced analytics, we will become familiar with the concepts of open source R programming.
Next:
Which probability distribution will we use if the sample size is less than 30?