- Univariate normal distribution describes the distribution of a single random variable and is characterized by its bell-shaped curve. The mean, median, and mode are equal and located at the center. Approximately 68% of the data falls within one standard deviation of the mean.
- Multivariate normal distribution describes the joint distribution of multiple random variables. It generalizes the univariate normal distribution to multiple dimensions. The variances of the variables and the linear relationships between them are summarized by a covariance matrix.
- Examples of data that may follow a normal distribution include heights, test scores, measurement errors, and stock price changes over time. Normal distributions are widely used in statistics.
Understanding univariate and multivariate normal distributions
1. Data Science Internship
UNDERSTANDING UNIVARIATE & MULTIVARIATE NORMAL DISTRIBUTIONS
UZMA SULTHANA
LECTURE
DEPT OF CSE,
RAMAIAH INSTITUTE OF TECHNOLOGY, BANGALORE
4. Definition of Distribution
A distribution is a function that shows the possible
values for a variable and how often they occur.
5. Example-Rolling die
A die has six sides, numbered from 1 to 6.
Imagine that we roll the die. What is the probability of getting 1?
It is one-sixth, because each of the six sides of a fair die is equally likely.
6. Example-Rolling die
Try guessing what the probability of rolling a 2 is. Once again - one-sixth. The same holds true for 3, 4, 5 and 6.
7. Example-Rolling die
It is impossible to get a 7 when rolling a single die.
Therefore, the probability is 0.
8. Example-Rolling die
The Values that Make up a Distribution
• Let’s generalize. The distribution of an event consists not only of the input values that can be
observed.
• It is actually made up of all possible values. So, the distribution of the event - rolling a die -
will be given by the following table.
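The table itself did not survive in this transcript. A minimal sketch of how it could be built in R follows, assuming a fair die; the helper name `prob_of` is made up for illustration and is not part of the original slides.

```r
# Every possible outcome of a single six-sided die
outcome <- 1:6

# Assumption: the die is fair, so each side has probability 1/6
probability <- rep(1 / 6, 6)

# The distribution as a table: each possible value and how often it occurs
die_distribution <- data.frame(outcome, probability)
print(die_distribution)

# Hypothetical helper: look up the probability of any value,
# returning 0 for values that cannot occur (such as 7)
prob_of <- function(value) {
  if (value %in% outcome) probability[match(value, outcome)] else 0
}

prob_of(2)  # one-sixth
prob_of(7)  # 0
```

The probabilities in the table sum to 1, which is what makes it a complete distribution.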
10. Normal Distribution
Visual Representation of Normal distribution
The statistical term for it is Gaussian distribution. Though, many people call it the Bell Curve, as it is shaped like a bell.
12. Normal Distribution
It is symmetrical, and its mean, median and mode are equal.
Skewness indicates whether the observations in a data set are concentrated on one side. A normal distribution has no skew.
It is perfectly centred around its mean.
13. Normal Distribution
How it’s Denoted
X ~ N(μ, σ²)
N stands for normal and the tilde sign (~) shows it is a distribution. In brackets, we have the mean (μ) and the variance (σ²) of the distribution.
You can notice that the highest point is located at the mean. This is because it coincides with the mode. The spread of the graph is determined by the standard deviation.
14. Normal Distribution-R
• In a random collection of data from independent sources, it is generally observed that the distribution of the data is approximately normal.
• In the graph of a normal distribution, 50% of the values lie to the left of the mean and the other 50% lie to the right of the mean.
• R has four built-in functions for the normal distribution: dnorm(x, mean, sd), pnorm(x, mean, sd), qnorm(p, mean, sd) and rnorm(n, mean, sd). Their parameters are:
• x = vector of numbers
• p = vector of probabilities
• n = number of observations (sample size)
• mean = mean value of the sample data. Its default value is zero.
• sd = standard deviation. Its default value is one.
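As a quick orientation, the four functions can be tried side by side on the standard normal distribution (mean 0, sd 1). This is only a sketch; the variable names and the seed are arbitrary choices, not part of the original slides.

```r
# dnorm(): height of the density curve at a point.
# At the mean of a standard normal this equals 1 / sqrt(2 * pi).
d <- dnorm(0, mean = 0, sd = 1)

# pnorm(): cumulative probability P(X <= x).
# Half of the values lie below the mean.
p <- pnorm(0, mean = 0, sd = 1)

# qnorm(): the inverse of pnorm(). It returns the value below which
# a given share of the distribution lies.
q <- qnorm(0.5, mean = 0, sd = 1)

# rnorm(): random draws from the distribution (seed fixed for repeatability)
set.seed(1)
r <- rnorm(5, mean = 0, sd = 1)

# qnorm() undoes pnorm(): this round trip recovers the starting value
x <- 1.3
round_trip <- qnorm(pnorm(x))

print(c(density = d, cumulative = p, quantile = q, round_trip = round_trip))
```

A convenient way to remember the prefixes: d for density, p for probability, q for quantile, r for random.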
15. Normal Distribution-R
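dnorm()
dnorm() gives the height of the probability density at each point for a given mean and standard deviation.
For example:
Create a sequence of numbers between -20 and 20 incrementing by 0.1, with mean = 5.0 and sd = 1.0.
x <- seq(-20, 20, by = 0.1)
y <- dnorm(x, mean = 5.0, sd = 1.0)
plot(x, y, main = "Normal Distribution", col = "blue")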
16. Normal Distribution-R
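pnorm()
pnorm() gives the cumulative probability that a normally distributed value is less than or equal to a given number.
For example:
Create a sequence of numbers between -10 and 10 incrementing by 0.2, with mean = 2.5 and sd = 2.
x <- seq(-10, 10, by = 0.2)
y <- pnorm(x, mean = 2.5, sd = 2.0)
plot(x, y, main = "pnorm()", col = "blue")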
17. Normal Distribution-R
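qnorm()
qnorm() takes a probability and returns the value whose cumulative probability matches it.
For example:
Create a sequence of numbers between 0 and 1 incrementing by 0.02, with mean = 2 and sd = 1.
x <- seq(0, 1, by = 0.02)
y <- qnorm(x, mean = 2, sd = 1)
plot(x, y, main = "qnorm()", col = "blue")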
19. Normal Distributions
The normal Distributions:
1.Univariate Normal Distribution
2.Bivariate Normal Distribution
3.Multivariate Normal Distribution
20. 1.Univariate Normal Distribution
• The univariate normal distribution, also known as the
Gaussian distribution, is a continuous probability
distribution that describes the distribution of a single
random variable.
• It is one of the most important and widely used probability
distributions in statistics and data analysis.
21. 1.Univariate Normal Distribution
• The distribution is characterized by its bell-shaped curve
when plotted, which is symmetric around the mean value.
22. Key characteristics of the univariate normal
distribution include:
1. Symmetry: The distribution is symmetric around its mean value, meaning that the data tend to be evenly distributed on both sides of the mean.
2. Mean, Median, and Mode: The mean (average), median (middle value), and mode (most frequent value) are all located at the center of the distribution, and they are equal in a perfectly normal distribution.
3. Standard Deviation: The standard deviation controls the spread of the distribution. A larger standard deviation results in a wider curve, indicating more variability in the data.
4. 68-95-99.7 Rule: Around 68% of the data falls within one standard deviation of the mean, approximately 95% within two standard deviations, and about 99.7% within three standard deviations.
5. Probability Density Function (PDF): The PDF of the normal distribution is a mathematical function that gives the relative likelihood (density) of each value; probabilities are obtained as areas under the curve. It is given by the formula
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
where
μ = Mean
σ = Standard deviation
x = Value of the normal random variable
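Two of these characteristics can be checked numerically in R. The sketch below writes the normal density by hand and compares it with the built-in dnorm(), then computes the 68-95-99.7 percentages with pnorm(). The function names `normal_pdf` and `within_k_sd` are illustrative only.

```r
# Hand-written normal density:
# f(x) = 1 / (sigma * sqrt(2 * pi)) * exp(-(x - mu)^2 / (2 * sigma^2))
normal_pdf <- function(x, mu = 0, sigma = 1) {
  1 / (sigma * sqrt(2 * pi)) * exp(-(x - mu)^2 / (2 * sigma^2))
}

# It should agree with R's built-in dnorm() for any mean and sd
x <- seq(-3, 3, by = 0.5)
pdf_matches <- all.equal(normal_pdf(x, mu = 1, sigma = 2),
                         dnorm(x, mean = 1, sd = 2))
print(pdf_matches)

# Probability of falling within k standard deviations of the mean
within_k_sd <- function(k) pnorm(k) - pnorm(-k)

rule <- round(within_k_sd(1:3), 4)
print(rule)  # 0.6827 0.9545 0.9973
```

The rule does not depend on the particular mean or standard deviation, which is why the standard normal is enough for the check.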
23. Examples of the univariate normal distribution
Example 1: Height of Adults. Suppose you're measuring the heights of a large group of adult individuals. The distribution of heights is often well-modeled by a normal distribution. The mean (μ) might represent the average height in the population, and the standard deviation (σ) would indicate how much the heights tend to vary around the mean.
Example 2: Exam Scores. Consider a classroom where students take an exam. If the exam is well-constructed and not too difficult or too easy, the scores of the students might follow a normal distribution. The mean score would represent the average performance, and the standard deviation would describe the spread of scores around the mean.
Example 3: IQ Scores. IQ scores are often assumed to follow a normal distribution. If the mean IQ is 100 and the standard deviation is 15, then most people would have IQ scores close to 100, with fewer individuals having scores much lower or higher.
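The IQ example can be made concrete with pnorm() and qnorm(). This sketch takes the mean of 100 and standard deviation of 15 from the example and assumes the scores are exactly normal; the cut-offs 85, 115 and 130 are chosen here for illustration.

```r
mu <- 100    # mean IQ, from the example
sigma <- 15  # standard deviation, from the example

# Share of people within one standard deviation of the mean (85 to 115)
p_85_115 <- pnorm(115, mean = mu, sd = sigma) - pnorm(85, mean = mu, sd = sigma)

# Share of people scoring above 130 (two standard deviations above the mean)
p_above_130 <- pnorm(130, mean = mu, sd = sigma, lower.tail = FALSE)

# Score that separates the top 10% from the rest (the 90th percentile)
iq_90th <- qnorm(0.90, mean = mu, sd = sigma)

print(round(c(within_1_sd = p_85_115,
              above_130 = p_above_130,
              percentile_90 = iq_90th), 4))
```

About 68% of scores fall between 85 and 115, while only about 2% exceed 130, which matches the statement that scores far from 100 are rare.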
24. Examples of the univariate normal distribution (continued)
Example 4: Measurement Errors. Imagine you're measuring the length of an object using a ruler with small markings. Due to limitations in precision, there might be small errors in your measurements. These errors can be modeled using a normal distribution, where the mean error is close to zero.
Example 5: Random Walks. In finance and stock market analysis, random walks are often used to model the unpredictable changes in stock prices. The daily changes in stock prices can sometimes be approximated by a normal distribution, which helps in understanding the likelihood of different price changes.
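Both examples can be simulated with rnorm(). In the sketch below every number (true length, error size, starting price, number of days) is a made-up value for illustration, and real stock prices only roughly follow this kind of model.

```r
set.seed(42)  # fixed seed so the simulation is repeatable

# Example 4: measurements = true value + normally distributed error
true_length <- 10                             # hypothetical true length in cm
errors <- rnorm(1000, mean = 0, sd = 0.05)    # errors centred on zero
measurements <- true_length + errors
print(mean(measurements))                     # close to the true length

# Example 5: a random walk built from normal daily changes
daily_changes <- rnorm(250, mean = 0, sd = 1) # hypothetical daily price changes
price_path <- 100 + cumsum(daily_changes)     # hypothetical starting price of 100

plot(price_path, type = "l", col = "blue",
     main = "Simulated random walk", xlab = "Day", ylab = "Price")
```

Because the errors average out to roughly zero, the mean of many measurements lands near the true length, whereas the random walk accumulates its changes and can drift far from the starting price.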
25. Univariate normal distribution in R
# Generate random samples (seed fixed so the run is reproducible)
set.seed(123)
random_samples <- rnorm(1000, mean = 0, sd = 1)
# Calculate the probability that a standard normal value is less than 1
probability_less_than_1 <- pnorm(1, mean = 0, sd = 1)
# Create a histogram of the samples with a density curve on top
library(ggplot2)
ggplot(data.frame(x = random_samples), aes(x)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30, fill = "lightblue", color = "black") +
  geom_density(color = "red") +
  labs(title = "Univariate Normal Distribution", x = "Value", y = "Density")
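One way to connect the random samples with the calculated probability is to compare the share of samples below 1 with the value from pnorm(). This is a sketch using base R only; the seed and the names `empirical` and `theoretical` are arbitrary.

```r
set.seed(123)  # fixed seed so the comparison is repeatable
random_samples <- rnorm(1000, mean = 0, sd = 1)

# Share of the simulated values that are below 1
empirical <- mean(random_samples < 1)

# Exact probability from the normal distribution, about 0.84
theoretical <- pnorm(1, mean = 0, sd = 1)

print(c(empirical = empirical, theoretical = theoretical))
```

With 1000 samples the two numbers should be close but not identical; the gap typically shrinks as the sample size grows.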