2. SIMPLER WAY TO UNDERSTAND
Imagine you have a puzzle where some pieces are missing.
The EM algorithm helps you complete the puzzle by
guessing what those missing pieces look like.
3. STEPS YOU WOULD FOLLOW
GUESS AND IMPROVE (EXPECTATION STEP)
First, you make a guess about what the missing puzzle
pieces might look like. This is like saying, "Hmm, I think the
missing pieces could be this color and shape." Your guess
doesn't have to be perfect; it's just a starting point.
4. STEPS YOU WOULD FOLLOW
MAKE IT BETTER (MAXIMIZATION STEP)
Then, you look at the pieces you have and the ones you
guessed. You figure out how to adjust your guess to make
it match the pieces you have as closely as possible. This
step is like tweaking your guess to fit the puzzle better.
5. STEPS YOU WOULD FOLLOW
REPEAT UNTIL DONE
You keep doing these two steps over and over, making
your guess better and better each time. It's like refining
your guess until the puzzle is complete.
6. The EM algorithm is like a smart helper that makes
educated guesses and keeps improving them until the
puzzle is solved. It's great for figuring out things when you
don't have all the information you need.
7. IN ACTUAL TERMS
The Expectation-Maximization (EM) algorithm is an
iterative statistical technique used for estimating
parameters of probabilistic models when some of the data
is missing or unobserved. EM is particularly useful in
situations where you have incomplete or partially observed
data, and you want to estimate the underlying hidden
variables or parameters of a statistical model.
8. IN ACTUAL TERMS
The Expectation-Maximization (EM) algorithm is an
iterative optimization method for finding maximum
likelihood or maximum a posteriori (MAP) estimates of
parameters in statistical models that involve unobserved
latent variables.
9. IN ACTUAL TERMS
The EM algorithm is commonly used for latent variable
models and can handle missing data. It consists of an
estimation step (E-step) and a maximization step
(M-step), forming an iterative process to improve model fit.
10. IN ACTUAL TERMS
In the E-step, the algorithm computes the expected
log-likelihood of the complete data, taking the expectation
over the latent variables using the current parameter
estimates.
In the M-step, the algorithm finds the parameters that
maximize the expected log-likelihood obtained in the
E-step, and the model parameters are updated accordingly.
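The two steps can be stated compactly. Writing $X$ for the observed data, $Z$ for the latent variables, and $\theta^{(t)}$ for the current parameter estimate, this is the standard formulation:

```latex
\text{E-step:}\quad Q\!\left(\theta \mid \theta^{(t)}\right)
  = \mathbb{E}_{Z \mid X,\, \theta^{(t)}}\!\left[\log p(X, Z \mid \theta)\right]

\text{M-step:}\quad \theta^{(t+1)}
  = \arg\max_{\theta}\; Q\!\left(\theta \mid \theta^{(t)}\right)
```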
11. IN ACTUAL TERMS
By iteratively repeating these steps, the EM algorithm seeks
to maximize the likelihood of the observed data. It is
commonly used for unsupervised learning tasks, such as
clustering, where latent variables are inferred, and has
applications in various fields, including machine learning,
computer vision, and natural language processing.
13. PSEUDOCODE
function ExpectationMaximization(data, initial_parameters, convergence_threshold, max_iterations):
    parameters = initial_parameters
    previous_parameters = initial_parameters
    iteration = 0
    converged = false
    while (iteration < max_iterations and not converged):
        # E-Step: Calculate expected values of hidden data
        expected_values = EStep(data, parameters)
        # M-Step: Update parameter estimates based on expected values
        previous_parameters = parameters  # Save parameters before updating them
        parameters = MStep(data, expected_values)
        # Check for convergence based on parameter change
        converged = CheckConvergence(parameters, previous_parameters, convergence_threshold)
        iteration = iteration + 1
    return parameters  # Final estimated parameters

function EStep(data, parameters):
    # Calculate expected values (responsibilities) of hidden data
    # based on the current parameter estimates and observed data
    # Return the expected values
14. function MStep(data, expected_values):
    # Update parameter estimates to maximize the expected log-likelihood
    # of the complete data (observed and hidden)
    # Return the updated parameter estimates

function CheckConvergence(parameters, previous_parameters, threshold):
    # Calculate a measure of how much the parameters have changed
    # from the previous iteration (e.g., Euclidean distance or change in log-likelihood)
    # Check if the change is smaller than the convergence threshold
    # Return true if converged, false otherwise
# Example Usage
data = ...                   # Your observed data
initial_parameters = ...     # Initial parameter values
convergence_threshold = ...  # Convergence threshold for parameter change
max_iterations = ...         # Maximum number of iterations
estimated_parameters = ExpectationMaximization(data, initial_parameters, convergence_threshold, max_iterations)
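As a concrete illustration, the pseudocode can be instantiated for a simple mixture model. The sketch below (plain Python, no external libraries) fits a two-component one-dimensional Gaussian mixture with a shared, fixed variance; the toy data, the initial means, and the fixed variance are illustrative choices, not part of the slides.

```python
import math
import random

def e_step(data, params):
    # E-step: posterior probability (responsibility) that each point
    # came from component 1, under the current parameter estimates.
    w, mu1, mu2, sigma = params
    resp = []
    for x in data:
        p1 = w * math.exp(-(x - mu1) ** 2 / (2 * sigma ** 2))
        p2 = (1 - w) * math.exp(-(x - mu2) ** 2 / (2 * sigma ** 2))
        resp.append(p1 / (p1 + p2))
    return resp

def m_step(data, resp):
    # M-step: update the mixing weight and the two means to maximize
    # the expected complete-data log-likelihood.
    n1 = sum(resp)
    n2 = len(data) - n1
    w = n1 / len(data)
    mu1 = sum(r * x for r, x in zip(resp, data)) / n1
    mu2 = sum((1 - r) * x for r, x in zip(resp, data)) / n2
    return (w, mu1, mu2, 1.0)  # variance kept fixed at 1 for simplicity

def check_convergence(params, prev_params, threshold):
    # Converged when no parameter moved more than the threshold.
    return max(abs(a - b) for a, b in zip(params, prev_params)) < threshold

def expectation_maximization(data, initial_params, threshold=1e-6, max_iterations=200):
    params = initial_params
    for _ in range(max_iterations):
        prev_params = params
        resp = e_step(data, params)   # E-step
        params = m_step(data, resp)   # M-step
        if check_convergence(params, prev_params, threshold):
            break
    return params

# Toy data: two clusters centered near 0 and 5.
random.seed(0)
data = [random.gauss(0, 1) for _ in range(200)] + [random.gauss(5, 1) for _ in range(200)]
w, mu1, mu2, _ = expectation_maximization(data, (0.5, -1.0, 6.0, 1.0))
# mu1 and mu2 should end up near 0 and 5, and w near 0.5.
```

Starting the means on opposite sides of the two clusters (here -1 and 6) is what lets each component lock onto one cluster; a poor initialization can make EM settle on a worse local optimum, which is the sensitivity discussed later.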
15. PROBLEM
Imagine you have a bag of colorful candies, but you don't
know how many of each color are in the bag. You want to
figure this out by using the EM algorithm.
16. STEP-1 (E STEP)
1. Close your eyes and take out one candy from the bag
without looking.
2. Now, you ask your friend to guess the color of the
candy.
3. Your friend makes a guess based on their knowledge of
candies, but they're not entirely sure because they can't
see the candy either. So, they give you their best guess
along with how confident they are in their guess.
17. STEP-2 (M STEP)
1. You collect all the guesses and confidence levels from
your friend for the candies you've taken out so far.
2. You count how many times each color was guessed
and use the confidence levels to estimate the number
of candies of each color in the bag.
3. You adjust your guess of how many candies of each
color are in the bag based on this new information.
18. STEP-3 (REPEAT)
Keep repeating these two steps. Each time you do it, your
guess about the candies' colors and amounts gets better
and better. After doing this many times, you'll have a very
good idea of how many candies of each color are in the
bag.
19. LET’S MAKE IT MATHEMATICAL
Suppose you have a bag with red (R), green (G), and blue
(B) candies. You take out one candy at a time and record
your friend's guesses. After several candies, you have these
guesses:
For the first candy: 80% chance it's Red, 10% Green, 10%
Blue
For the second candy: 30% Red, 60% Green, 10% Blue
For the third candy: 20% Red, 10% Green, 70% Blue
21. LET’S MAKE IT MATHEMATICAL
Now, in the M-step, you count the total guesses for each
color and update your estimates:
Red: (0.80 + 0.30 + 0.20) / 3 ≈ 0.43
Green: (0.10 + 0.60 + 0.10) / 3 ≈ 0.27
Blue: (0.10 + 0.10 + 0.70) / 3 = 0.30
So, based on these new estimates, you think there are
approximately 43% Red candies, 27% Green candies, and
30% Blue candies in the bag.
You repeat this process many times until your estimates
become very accurate, and you have a good idea of the
candy distribution in the bag. That's how the EM algorithm
works to solve problems like this one!
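The M-step averaging above is easy to verify in code. A minimal Python check of the arithmetic, using the three soft guesses from the slides:

```python
# The soft guesses from the three draws (probabilities for Red, Green, Blue).
guesses = [
    {"R": 0.80, "G": 0.10, "B": 0.10},
    {"R": 0.30, "G": 0.60, "B": 0.10},
    {"R": 0.20, "G": 0.10, "B": 0.70},
]

# M-step: average the responsibilities for each color to update the
# estimated color proportions in the bag.
estimates = {
    color: sum(g[color] for g in guesses) / len(guesses)
    for color in ("R", "G", "B")
}
print(estimates)  # R ≈ 0.43, G ≈ 0.27, B = 0.30
```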
22. ADVANTAGES
1. Handles data with missing values effectively.
2. Useful for unsupervised learning tasks like clustering.
3. Robust to noisy data.
4. Adaptable to various probabilistic models.
5. Can be applied to large datasets.
6. Estimates model parameters in mixture distributions.
7. Guarantees that the likelihood never decreases from one iteration to the next.
8. Well-founded in statistical theory.
9. For many standard models, the E-step and M-step have closed-form solutions.
10. Versatile for various machine learning applications.
23. DISADVANTAGES
1. Sensitive to initial parameter guesses.
2. Slow convergence for high-dimensional data.
3. Limited scalability for very large datasets.
4. Assumes data is generated from a specific model.
5. May converge to a local optimum rather than the global one.
6. Can be computationally intensive for some problems.