The Information Sieve
Greg Ver Steeg and Aram Galstyan
Soup = data
“Main ingredient” extracted at each layer
Factorial code
• Carry recipe instead of soup
• Missing ingredients?
• Make more soup
• Compression
• Prediction
• Generative model
Recipe
-Ingredient 1
-Ingredient 2
-…
Invertible transform that makes components independent
Finding such a transform is generally an intractable problem.
Instead, we use a sequence of transforms that incrementally removes dependence.
Two Steps
1. Find the most informative function, Yk, of the input data
2. Transform the data to remove the information in Yk, and then repeat
[Figure: sieve diagram; labels: Input, Remainder, Soup, Main ingredient]
The main ingredient: multivariate information
• Multivariate mutual information, or Total Correlation (Watanabe, 1960)
• TC(X|Y) = 0 if and only if Y “explains” all the dependence in X
• So we search for Y that minimizes TC(X|Y)
• Equivalently, we define the total correlation explained by Y as TC(X;Y) = TC(X) − TC(X|Y), and search for Y that maximizes it
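As a rough numerical illustration of the quantities on this slide, the sketch below estimates TC(X), TC(X|Y) and their difference from samples. It assumes the standard definitions TC(X) = Σ_i H(X_i) − H(X) and TC(X|Y) = Σ_i H(X_i|Y) − H(X|Y), which are not spelled out on the slide. The function names and the toy data are hypothetical, and plug-in entropy estimates like these are only reasonable for small discrete problems.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Empirical Shannon entropy (in bits) of a list of hashable outcomes."""
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in Counter(samples).values())


def total_correlation(rows):
    """TC(X) = sum_i H(X_i) - H(X); each row is a tuple (x_1, ..., x_n)."""
    columns = list(zip(*rows))
    return sum(entropy(col) for col in columns) - entropy(rows)


def conditional_total_correlation(rows, ys):
    """TC(X|Y): TC of X within each value of Y, weighted by the empirical p(y)."""
    n = len(rows)
    total = 0.0
    for y_value, count in Counter(ys).items():
        subset = [row for row, label in zip(rows, ys) if label == y_value]
        total += count / n * total_correlation(subset)
    return total


def tc_explained(rows, ys):
    """Total correlation explained by Y: TC(X;Y) = TC(X) - TC(X|Y)."""
    return total_correlation(rows) - conditional_total_correlation(rows, ys)


# Toy data (an assumption for illustration): three binary variables that are
# all exact copies of one hidden fair bit Y.
ys = [0, 1, 0, 1]
rows = [(y, y, y) for y in ys]

tc_x = total_correlation(rows)
tc_x_given_y = conditional_total_correlation(rows, ys)
tc_xy = tc_explained(rows, ys)
print(tc_x, tc_x_given_y, tc_xy)
```

On this toy data the three copies share 2 bits of total correlation, and conditioning on the hidden bit removes all of it, so Y explains all the dependence.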
The main ingredient: Total Correlation Explanation (CorEx)
• Optimize over all probabilistic functions
• Solution has special form that makes it tractable
• Computational complexity is linear in the number of variables
Sift out the main ingredient: remainder info
The remainder is a transformation of the inputs with 2 properties:
• Remainder contains no info about Y
• Transformation is invertible
[Figure: sieve diagram; labels: Input, Remainder, Soup]
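The two properties can be checked on a deliberately tiny example. The sketch below is a toy, not the construction used in the paper: it assumes a single binary input X that is a noisy copy of a known binary Y, and uses XOR as a stand-in remainder transformation. All names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(samples):
    """Empirical Shannon entropy (in bits) of a list of hashable outcomes."""
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in Counter(samples).values())


def mutual_information(a, b):
    """I(A;B) = H(A) + H(B) - H(A,B), from paired samples."""
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))


# Enumerate a small exact distribution instead of sampling: Y is a fair bit,
# and the noise bit equals 1 with probability 1/4, independently of Y.
pairs = list(product([0, 1], [0, 0, 0, 1]))
y = [p[0] for p in pairs]
noise = [p[1] for p in pairs]
x = [yi ^ ni for yi, ni in zip(y, noise)]  # input: a noisy copy of Y

# Toy remainder: XOR the factor out of the input.
remainder = [xi ^ yi for xi, yi in zip(x, y)]
# Inverse transformation: XOR the factor back in.
recovered = [ri ^ yi for ri, yi in zip(remainder, y)]

info_before = mutual_information(x, y)          # X carries information about Y
info_after = mutual_information(remainder, y)   # the remainder should not
print(info_before, info_after, recovered == x)
```

Here the remainder is exactly the noise bit, so it carries no information about Y, and XOR-ing Y back in restores the input. In soup terms: the carrots are fully removed, and throwing them back in returns the original soup.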
Decomposition of information
Iterative sifting as:
Multivariate mutual information in data (Total Correlation) = sum of the contributions from each layer of the sieve (optimized) + Remainder (at layer r)
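The decomposition described by the labels on this slide can be written out in symbols. The notation below is an assumption borrowed from the accompanying paper rather than something recoverable from the slide: X^0 = X is the data, Y_k is the factor learned at layer k, and X^k is the transformed data passed on after layer k (the superscript is a layer index, not a power).

```latex
TC(X) \;=\; \sum_{k=1}^{r} TC\!\left(X^{k-1};\, Y_k\right) \;+\; TC\!\left(X^{r}\right),
\qquad
TC\!\left(X^{k-1}; Y_k\right) \;\equiv\; TC\!\left(X^{k-1}\right) - TC\!\left(X^{k-1} \mid Y_k\right)
```

Under this reading, each optimized contribution is non-negative at its optimum, since a constant Y_k already explains zero total correlation. The leftover dependence TC(X^r) therefore cannot grow as layers are added.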
Extracting dependence
Iterative sifting: dependence at each layer of the sieve decreases until we get to zero, i.e. complete independence
[Equation not preserved; label: Dependence (at layer r)]
Recover spatial clusters from fMRI data
[Figure panels: Ground truth, ICA, Sieve]
Example of recovering spatial clusters in brain data from temporal activation patterns
Lossy compression and in-painting
• Sieve representation with 12 layers/bits/binary latent factors on MNIST digits
We can use the sieve for standard prediction and generative model tasks
Lossless compression (on MNIST)
• Same size codebooks for Random and Sieve-based codes
• (gzip is sequence-based, shown for reference)
Proof of principle for lossless compression, though specialized compression techniques are better on MNIST.

Method            Naive   gzip   Random codebook   Sieve codebook
Bits per digit    784     328    267               243
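To put the table in relative terms, the short calculation below turns the reported bits per digit into savings against the naive code. It uses only the four numbers from the slide. Reading the naive figure of 784 as one bit per pixel of a 28×28 binarized image is an assumption.

```python
# Bits per digit as reported on the slide.
bits_per_digit = {
    "Naive": 784,
    "gzip": 328,
    "Random codebook": 267,
    "Sieve codebook": 243,
}

naive = bits_per_digit["Naive"]
savings = {}
for method, bits in bits_per_digit.items():
    # Fraction of the naive code length that this method saves.
    savings[method] = 1 - bits / naive
    print(f"{method:16s} {bits:4d} bits  {savings[method]:6.1%} smaller than naive")

# Relative gain of the sieve codebook over a random codebook of the same size.
sieve_vs_random = 1 - bits_per_digit["Sieve codebook"] / bits_per_digit["Random codebook"]
print(f"Sieve vs. random codebook: {sieve_vs_random:.1%} fewer bits")
```

The sieve codebook saves 24 bits per digit over the random codebook, roughly a 9% reduction at equal codebook size.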
Conclusion
• Incrementally decomposing multivariate information is useful, practical, and delicious
• Could improve with joint optimization and better transformations for remainder info
• The extension to continuous random variables is nontrivial but more practical and demonstrates connections to “common information”: “Sifting Common Information from Many Variables”, arXiv:1606.02307.
Link to all papers and code: http://bit.ly/corex_info
Contact: gregv@isi.edu, galstyan@isi.edu


ICML 2016: The Information Sieve


Editor's Notes

  1. I have a cartoon version of the talk…[describe]...that’s like 90% of it. I’m going to stick with the soup metaphor: All that remains is to say what we mean by “main ingredient”, and what does it mean to “remove” it. Before that, though, why would you want to do this?
  2. Filtering out all the ingredients in soup is really a way to reverse engineer the recipe. The technical equivalent of this is called a factorial code; decompose data into independent components. There are many advantages… Unfortunately, this isn’t very easy. Our sieves provide us a way to easily do this in an incremental way so that our representation is more independent at each step. Let’s abstract a bit...
  3. At every layer of this sieve, we have discrete random variables with iid samples drawn from an unknown distribution. Step 1 finds the “main ingredient” by solving an optimization over functions of the input. Step 2 filters it out.
  4. Why the need for a qualification? It seems to me that information by itself is somewhat useless for learning. A bit of noise and a bit of signal are not really distinguishable. High-dimensional data is only difficult if there are nontrivial relationships, so that’s what we need to characterize (CAREFUL not to ramble here…)
  5. In soup terms, we have two criteria: (1) The ingredient is completely extracted. If not, we might end up sifting out some carrots at layer 1 and more at layer 3. (2) We can invert the transformation. We just throw the carrots back in and we are right where we started. WHY do we define the remainder in exactly this way? The next two slides will show why that’s a powerful way to go.
  6. Defining the main ingredient as multivariate information and correctly defining the remainder information finally leads to some very nice expressions. So now we have a way to progressively extract the most important ingredients in our soup. We mentioned the benefits at the beginning, and we still get almost all of them when doing it progressively. In fact, in a way we are better off, because our list of ingredients is ranked by importance. PUT IN PLOT?
  7. Synthetic data, so we know the ground truth. The plot shows the weights; note that this is a linear version that is described in a different paper.