2. OUTLINE OF TOPICS
1. Binomial distribution
2. Poisson distribution
3. Normal distribution
4. Confidence interval
5. Least squares analysis
3. BINOMIAL DISTRIBUTION
A probability distribution is a function or rule that assigns
probabilities of occurrence to each possible outcome of a random event.
Probability distributions give us a visual of all possible outcomes of some
event and the likelihood of obtaining one outcome relative to the other
possible outcomes.
A binomial distribution is a specific probability distribution. It is
used to model the probability of obtaining one of two outcomes a certain
number of times (k) out of a fixed number of trials (n) of a discrete random
event.
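4. Each trial in a binomial distribution has only two outcomes: the expected
outcome is called a success and any other outcome is a failure. The probability
of a successful outcome is p and the probability of a failure is 1 - p.
The probability of exactly k successes in n trials is
$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$
where k = 0, 1, 2, ..., n and
$\binom{n}{k} = \frac{n!}{k!(n - k)!}$
counts the number of outcomes that include exactly k successes and n - k
failures.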
5. Example:
A coin is tossed 6 times. The probability of heads on any toss is 0.3. Let X
denote the number of heads that come up. Calculate:
(i) P(X = 2)
(ii) P(X = 3)
(iii)P(1 < X ≤ 5).
ANS:
(i) If we call heads a success then this X has a binomial distribution with
parameters n = 6 and p = 0.3.
$P(X = 2) = \binom{6}{2} (0.3)^2 (0.7)^4 = 0.324135$
6. (ii) For P(X = 3)
ANS:
$P(X = 3) = \binom{6}{3} (0.3)^3 (0.7)^3 = 0.18522$
(iii) For P(1 < X ≤ 5)
ANS: We need
P(1 < X ≤ 5) = P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5)
= 0.324 + 0.185 + 0.060 + 0.010
= 0.579
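The coin-toss answers can be checked numerically. The sketch below is a minimal illustration using only the Python standard library; the function name `binomial_pmf` and the variable names are chosen here for illustration and are not part of the original slides.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Return P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Parameters from the example: 6 tosses, P(heads) = 0.3
n, p = 6, 0.3

p2 = binomial_pmf(2, n, p)  # part (i)
p3 = binomial_pmf(3, n, p)  # part (ii)
# part (iii): P(1 < X <= 5) is the sum over k = 2, 3, 4, 5
p_2_to_5 = sum(binomial_pmf(k, n, p) for k in range(2, 6))

print(round(p2, 6), round(p3, 5), round(p_2_to_5, 3))
```

Summing the exact terms gives about 0.579 for part (iii); adding terms that were first rounded to three decimals can differ in the last digit.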
7. POISSON DISTRIBUTION
A Poisson distribution is a statistical distribution showing the likely
number of times that an event will occur within a specified period of time. It is
used for independent events which occur at a constant rate within a given
interval of time.
The Poisson distribution is a discrete distribution: the event either
occurs or does not occur, so the variable can only take whole-number values
(0, 1, 2, ...). Fractional occurrences of the event are not a part of
the model.
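8. The probability of exactly k occurrences in the interval is
$P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}$, k = 0, 1, 2, ...
Where,
X - denotes the number of successes in the whole interval.
λ - is the mean number of successes in the interval.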
9. Example
The number of visitors to a webserver per minute follows a Poisson
distribution. If the average number of visitors per minute is 4, what is the
probability that:
(i) There are two or fewer visitors in one minute?
(ii) There are exactly two visitors in 30 seconds?
ANS:
(i) We need the average number of visitors in a minute. In this case the
parameter λ = 4.
We wish to calculate
P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2).
10. $P(X = 0) = \frac{e^{-4} 4^0}{0!} = e^{-4}$
$P(X = 1) = \frac{e^{-4} 4^1}{1!} = 4e^{-4}$
$P(X = 2) = \frac{e^{-4} 4^2}{2!} = 8e^{-4}$
So the probability of two or fewer visitors in a minute is
$e^{-4} + 4e^{-4} + 8e^{-4} = 13e^{-4} \approx 0.238$.
(ii) If the average number of visitors in 1 minute is 4, the average in 30
seconds is 2.
So for this example, our parameter λ = 2.
$P(X = 2) = \frac{e^{-2} 2^2}{2!} = 2e^{-2} \approx 0.271$.
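The web server example can be reproduced with a few lines of Python. This is a minimal sketch using the standard library; `poisson_pmf` and the variable names are illustrative names introduced here, not from the slides.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Return P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

# (i) Two or fewer visitors in one minute, with lambda = 4
p_two_or_fewer = sum(poisson_pmf(k, 4) for k in range(3))

# (ii) Exactly two visitors in 30 seconds: the rate halves, so lambda = 2
p_exactly_two_30s = poisson_pmf(2, 2)

print(round(p_two_or_fewer, 3), round(p_exactly_two_30s, 3))
```

Part (ii) shows the key modelling step: the parameter λ scales with the length of the interval, so it must be recomputed before the formula is applied.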
11. NORMAL DISTRIBUTION
A normal distribution is a bell-shaped frequency distribution curve. Most
of the data values in a normal distribution tend to cluster around the mean.
The further a data point is from the mean, the less likely it is to occur.
Many quantities, such as intelligence test scores, height, and blood pressure,
approximately follow a normal distribution.
For example, if you took the height of one hundred 22-year-old women
and created a histogram by plotting height on the x-axis and the frequency
at which each of the heights occurred on the y-axis, you would get an
approximately normal distribution.
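The curve is also called the Gaussian distribution, after Carl Friedrich
Gauss, who gave a mathematical formula for it:
12. $f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}$
where
μ is the mean or expectation of the distribution (and also
its median and mode)
σ is the standard deviation
σ² is the variance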
13. Suppose scores on an IQ test are normally distributed. If the test has a mean of 100 and a
standard deviation of 10, what is the probability that a person who takes the test will score
between 90 and 110?
Solution: Here, we want to know the probability that the test score falls between 90 and
110. The "trick" to solving this problem is to realize the following:
P( 90 < X < 110 ) = P( X < 110 ) - P( X < 90 )
We use the Normal Distribution Calculator to compute both probabilities on the right side
of the above equation.
To compute P( X < 110 ), we enter the following inputs into the calculator: The value of the
normal random variable is 110, the mean is 100, and the standard deviation is 10. We find
that P( X < 110 ) is 0.84.
To compute P( X < 90 ), we enter the following inputs into the calculator: The value of the
normal random variable is 90, the mean is 100, and the standard deviation is 10. We find
that P( X < 90 ) is 0.16.
We use these findings to compute our final answer as follows:
P( 90 < X < 110 ) = P( X < 110 ) - P( X < 90 )
P( 90 < X < 110 ) = 0.84 - 0.16
P( 90 < X < 110 ) = 0.68
Thus, about 68% of the test scores will fall between 90 and 110.
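The same probabilities can be computed without an online calculator. The sketch below is one possible way to do it, using the error function from Python's `math` module; the helper name `normal_cdf` is an assumption made for this illustration.

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    """Return P(X < x) for X ~ Normal(mu, sigma)."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# IQ example: mean 100, standard deviation 10
p_below_110 = normal_cdf(110, 100, 10)
p_below_90 = normal_cdf(90, 100, 10)
p_between = p_below_110 - p_below_90

print(round(p_below_110, 2), round(p_below_90, 2), round(p_between, 2))
```

The subtraction mirrors the step P( 90 < X < 110 ) = P( X < 110 ) - P( X < 90 ) used in the worked solution.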
15. That about 68% of values fall within one standard deviation of the mean
is a property of the normal distribution. Another property is that
'mean = median = mode.' This is because the shape of the data is
symmetrical with one peak.
And, since the curve is symmetrical, the mean or median or mode
(which are all the same number for this distribution) divides the data in
half. From now on, we will just refer to this value in the middle as the
mean.
16. Characteristics of Normal Distribution
Here, we see the four characteristics of a normal distribution. Normal
distributions are symmetric, unimodal, and asymptotic, and
the mean, median, and mode are all equal.
17. STANDARD NORMAL DISTRIBUTION
The standard normal distribution is a special case of the normal
distribution. It is the distribution that occurs when a normal random
variable has a mean of zero and a standard deviation of one.
The normal random variable of a standard normal distribution is called
a standard score or a z score.
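Any normal value can be converted to a z score by subtracting the mean and dividing by the standard deviation, z = (x - μ)/σ. The sketch below illustrates this with the IQ figures used earlier (mean 100, standard deviation 10); the function name `z_score` is an illustrative choice, and `statistics.NormalDist` from the Python standard library (Python 3.8 or later) stands in for a z table.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Convert a value x from Normal(mu, sigma) to a standard score."""
    return (x - mu) / sigma

standard_normal = NormalDist()  # mean 0, standard deviation 1

z = z_score(110, 100, 10)            # a score of 110 on the IQ scale
p_below = standard_normal.cdf(z)     # P(Z < z) under the standard normal

print(z, round(p_below, 2))
```

Working in z scores means a single table, or a single distribution object, covers every normal distribution.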
18. CONFIDENCE INTERVAL
In statistics, a confidence interval (CI) is a type of interval estimate,
computed from the statistics of the observed data, that might contain the
true value of an unknown population parameter. The interval has an
associated confidence level that, loosely speaking, quantifies the level of
confidence that the parameter lies in the interval.
The confidence level represents the frequency of possible confidence
intervals that contain the true value of the unknown population parameter.
In other words, if confidence intervals are constructed using a given
confidence level from an infinite number of independent sample statistics,
the proportion of those intervals that contain the true value of the parameter
will be equal to the confidence level.
A confidence interval is thus a range of values constructed so that, at
the specified confidence level, it is expected to contain the value of the
parameter. The confidence level is designated prior
to examining the data. Most commonly, the 95% confidence level is
used. However, other confidence levels can be used, for example, 90% and
99%.
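The long-run interpretation of the confidence level can be illustrated by simulation. The sketch below repeatedly draws samples from a normal population with a known mean, builds a 95% interval from each sample, and counts how often the interval contains the true mean. All numbers (population mean 50, standard deviation 5, sample size 10, 5000 trials, seed 0) are arbitrary assumptions chosen for the illustration.

```python
import random
from math import sqrt

def coverage(mu, sigma, n, z, trials, seed=0):
    """Fraction of simulated intervals that contain the true mean mu."""
    rng = random.Random(seed)            # fixed seed for repeatable runs
    half_width = z * sigma / sqrt(n)     # known-sigma interval half-width
    hits = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if sample_mean - half_width <= mu <= sample_mean + half_width:
            hits += 1
    return hits / trials

# z = 1.96 corresponds to a 95% confidence level
rate = coverage(mu=50.0, sigma=5.0, n=10, z=1.96, trials=5000)
print(rate)
```

The printed proportion should be close to 0.95, and it moves closer as the number of trials grows.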
19. Example
Suppose a student measuring the boiling temperature of a certain
liquid observes the readings (in degrees Celsius) 102.5, 101.7, 103.1,
100.9, 100.5, and 102.2 on 6 different samples of the liquid. He
calculates the sample mean to be 101.82. If he knows that the standard
deviation for this procedure is 1.2 degrees, what is the confidence
interval for the population mean at a 95% confidence level?
In other words, the student wishes to estimate the true mean boiling
temperature of the liquid using the results of his measurements. If the
measurements follow a normal distribution, then the sample mean will
have the distribution $N(\mu, \sigma/\sqrt{n})$.
20. Since the sample size is 6, the standard deviation of the sample mean is
1.2/sqrt(6) = 0.49. For a 95% confidence level the critical value is
z = 1.96, so the confidence interval is 101.82 ± 1.96 × 0.49 =
101.82 ± 0.96, or (100.86, 102.78).
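The interval for the boiling-point example can be computed directly from the readings. This is a minimal sketch under the example's assumption that the population standard deviation (1.2 degrees) is known, so a z-based interval applies; the variable names are illustrative, and `NormalDist.inv_cdf` from the Python standard library supplies the critical value.

```python
from math import sqrt
from statistics import NormalDist, mean

readings = [102.5, 101.7, 103.1, 100.9, 100.5, 102.2]  # degrees Celsius
sigma = 1.2    # known standard deviation of the procedure
level = 0.95   # confidence level

x_bar = mean(readings)                         # sample mean
std_err = sigma / sqrt(len(readings))          # std. deviation of the mean
z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # two-sided critical value
low, high = x_bar - z * std_err, x_bar + z * std_err

print(round(x_bar, 2), round(std_err, 2), round(low, 2), round(high, 2))
```

Changing `level` to 0.90 or 0.99 gives the narrower or wider intervals for the other commonly used confidence levels.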
21. LEAST SQUARES ANALYSIS
Least squares is a statistical method used to determine a line of best
fit by minimizing the sum of the squared differences between the data and
a mathematical function.
A "square" is determined by squaring the distance between a data
point and the regression line, or mean value of the data set.
The least squares approach minimizes the distance between a function and
the data points that the function is trying to explain. It is used in
regression analysis, often in nonlinear regression modeling in which a
curve is fit to a set of data.
22. WHAT LEAST SQUARES TELLS YOU
The least squares approach is a popular method for determining regression
equations, and it tells you about the relationship between response
variables and predictor variables.
Instead of trying to solve an equation exactly, mathematicians use the least
squares method to make a close approximation. Modeling methods that are
often used when fitting a function to data include the straight-line method.
Linear or ordinary least squares is the simplest and most commonly used
linear regression estimator for analyzing observational and experimental
data. It finds a straight line of best fit through a set of given data points.
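A straight-line least squares fit can be sketched in a few lines. The code below uses the standard closed-form solution for one predictor (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x); the function name `fit_line` and the data points are made up for this illustration and do not come from the slides.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line through the points."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical data lying roughly on the line y = 2x
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
# Sum of squared residuals: the quantity the fitted line minimizes
sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

print(round(slope, 2), round(intercept, 2), round(sse, 3))
```

Any other choice of slope and intercept would give a larger sum of squared residuals on the same data, which is what "best fit" means here.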
23. KEY POINTS
Least squares results can be used to summarize data and make
predictions about related but unobserved values from the same group
or system.
Linear least squares regression is the simplest and most commonly
used form of least squares regression.
If the points zigzag rather than lying on a single line, the line
should be drawn so that it passes as close as possible to the points overall.
24. EXAMPLE
For example, if a product or a specific brand is going to be launched in a
specific area, a survey is conducted in that area about the
previous brands, covering specific colour, size, etc.
The sales in that area are then plotted on a graph against season.
If the fitted straight line is increasing, there is no
problem with launching the product.