Estimating Gaussian Mixture Densities via a Matlab implementation of the Expectation Maximization Algorithm. Decomposing an arbitrary distribution into component Normal distributions to facilitate clustering, state modeling and profiling.


- 1. ESTIMATING GAUSSIAN MIXTURE DENSITIES VIA A MATLAB IMPLEMENTATION OF THE EXPECTATION MAXIMIZATION ALGORITHM DR. ASOKA KORALE, C.ENG. MIET & MIESL
- 2. APPLICATIONS FOR GAUSSIAN MIXTURE DECOMPOSITION MODELING ANALYSIS Slide | 2 Cluster Analysis: data are mapped to a set of Normal densities, each point with a specified degree of membership (model-based clustering). Customer Profiling: characterizing the distributions encountered (Age, ARPU, Net Stay…), leading to a probabilistic description of the data. Sentiment Analysis via Independent Term Matching: each word is drawn from a specified Normal distribution, and the word scores are summed to determine the overall sentiment score. A model-based approach to data analysis. Goal: model arbitrary distributions as sums of Gaussian densities (with parameters estimated via the expectation maximization algorithm), so that each data point is characterized with respect to the distribution from which it is expected to have originated.
- 3. PARAMETER ESTIMATION VIA EXPECTATION MAXIMIZATION ALGORITHM Slide | 3 Ref: Estimating Gaussian Mixture Densities with EM, Carlo Tomasi, Duke University
- 4. PARAMETER ESTIMATION VIA EXPECTATION MAXIMIZATION ALGORITHM Slide | 4 Ref: Estimating Gaussian Mixture Densities with EM, Carlo Tomasi, Duke University
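The two parameter-estimation slides above can be illustrated with a minimal sketch of EM for a one-dimensional Gaussian mixture. This is not the author's Matlab code: it is a pure-Python illustration, and the function names (`normal_pdf`, `em_gmm_1d`), the quantile-based initialisation, the variance floor and the stopping tolerance are all assumptions made for the example. The E-step computes each component's responsibility for each point; the M-step re-estimates the mixing probabilities, means and variances from those responsibilities.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, k, n_iter=200, tol=1e-8):
    """Fit a k-component 1-D Gaussian mixture by EM.

    Returns (weights, means, variances, log_likelihood_history).
    """
    n = len(data)
    # Initialisation (an assumption of this sketch): means at evenly spaced
    # quantiles, the overall variance for every component, equal weights.
    ordered = sorted(data)
    means = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [overall_var] * k
    weights = [1.0 / k] * k
    history = []

    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        log_lik = 0.0
        for x in data:
            joint = [weights[j] * normal_pdf(x, means[j], variances[j])
                     for j in range(k)]
            total = sum(joint)
            log_lik += math.log(total)
            resp.append([p / total for p in joint])
        history.append(log_lik)

        # M-step: re-estimate mixing probabilities, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid a collapsing component

        # Stop once the log-likelihood has stopped improving.
        if len(history) > 1 and abs(history[-1] - history[-2]) < tol:
            break

    return weights, means, variances, history
```

The returned `history` holds the log-likelihood at each iteration, which gives one simple way to check convergence. The sketch does not guard against numerical underflow for points far from every component.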
- 5. RESULTS – ESTIMATING THE COMPONENT GAUSSIAN DENSITIES Slide | 5 I. Histogram of the original data (whose component densities are to be estimated). II. Standardize the data and estimate the empirical probability density function. III. Estimate the Gaussian component densities (fx1/2/3) via the EM algorithm, and their scaled sum (fx). IV. fx: sum of the individual component densities scaled by their mixing probabilities (for comparison with II, the empirical PDF of the data).
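Steps II and IV on the results slide can be sketched as follows, again in pure Python rather than the Matlab used for the slides. The helper names (`standardize`, `empirical_pdf`, `mixture_pdf`), the histogram-based density estimate and the default bin count are assumptions of this illustration; the slides do not say how the empirical PDF was computed.

```python
import math


def standardize(data):
    """Shift and scale the data to zero mean and unit (population) variance."""
    n = len(data)
    mean = sum(data) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
    return [(x - mean) / std for x in data]


def empirical_pdf(data, n_bins=20):
    """Histogram-based density estimate: returns (bin_centers, densities)."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in data:
        # The maximum value falls on the right edge; keep it in the last bin.
        idx = min(int((x - lo) / width), n_bins - 1)
        counts[idx] += 1
    centers = [lo + (i + 0.5) * width for i in range(n_bins)]
    # Divide by n * width so the bar areas sum to one.
    densities = [c / (len(data) * width) for c in counts]
    return centers, densities


def mixture_pdf(x, weights, means, variances):
    """fx: component normal densities scaled by their mixing probabilities."""
    return sum(
        w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
        for w, m, v in zip(weights, means, variances)
    )
```

Evaluating `mixture_pdf` at the bin centers returned by `empirical_pdf` gives two curves on the same grid, which is one way to make the comparison between II and IV described on the slide.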
- 6. CONVERGENCE OF THE EM ALGORITHM FOR THE PARAMETERS Slide | 6
- 7. RESULTS – INTERPRETATION OF CLUSTER MEMBERSHIP Slide | 7 Tested with one-dimensional data, though the EM algorithm can estimate parameters for sums of "D"-dimensional densities. *Applicable to multi-dimensional data; correlated random variables still need to be explored.
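Cluster membership in a fitted mixture is usually read off the posterior probability of each component given a data point. The sketch below shows that calculation for the one-dimensional case in pure Python; the function names (`membership`, `hard_assignment`) are hypothetical, and the slide does not state which membership rule the Matlab implementation uses.

```python
import math


def membership(x, weights, means, variances):
    """Posterior probability of each component given the point x."""
    joint = [
        w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
        for w, m, v in zip(weights, means, variances)
    ]
    total = sum(joint)
    return [p / total for p in joint]


def hard_assignment(x, weights, means, variances):
    """Index of the component with the largest posterior probability."""
    probs = membership(x, weights, means, variances)
    return max(range(len(probs)), key=lambda j: probs[j])
```

The list returned by `membership` is the degree of membership of the point in each component and sums to one; `hard_assignment` collapses it to a single cluster label when one is needed.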