Paper Summary of Disentangling by Factorising (Factor-VAE)
1. Paper Summary :
Disentangling by Factorising
Jun-sik Choi
Department of Brain and Cognitive Engineering,
Korea University
November 26, 2019
2. Overview of paper [2]
To enhance disentangled representations, Factor-VAE is
proposed.
Factor-VAE enhances disentanglement by encouraging the
distribution of representations to be factorial (independent
across the dimensions).
Factor-VAE provides a better trade-off between
disentanglement and reconstruction quality than β-VAE [1].
A new disentanglement metric is also proposed.
3. Unsupervised Disentangled Representation
Disentangled Representation
a representation where a change in one dimension corresponds
to a change in one factor of variation, while being relatively
invariant to changes in other factors. [3]
Why do disentangled representations matter? [4]
Data can be represented in a more interpretable and semantic
manner.
Learned disentangled representations are more transferable.
Why learn disentangled representations in an unsupervised manner?
1. Humans are able to learn factors of variation unsupervised.
2. Labels are costly as obtaining them requires a human in the
loop.
3. Labels assigned by humans might be inconsistent or leave out
the factors that are difficult for humans to identify.
4. Factor-VAE
Goal
Obtain a better trade-off between disentanglement and
reconstruction quality; the poor trade-off is one drawback of β-VAE [1].
How?
Factor-VAE augments the VAE objective with a penalty that
encourages the marginal distribution of representations to be
factorial without substantially affecting the quality of
reconstructions.
The penalty is expressed as a KL divergence between the
marginal distribution and the product of its marginals,
optimized by a discriminator network following the divergence
minimisation view of GANs.
5. Trade-off between Disentanglement and Reconstruction in
beta-VAE I
Notations and assumptions
- Observations: $x^{(i)} \in \mathcal{X}$, $i = 1, \dots, N$
- Underlying generative factors: $f = (f_1, \dots, f_K)$
- Latent code that models $f$: $z \in \mathbb{R}^d$
- Prior: $p(z) = \mathcal{N}(0, I)$, decoder: $p_\theta(x|z)$, encoder: $q_\theta(z|x)$
Disentanglement of Representation
- Variational posterior for an observation:
$$q_\theta(z|x) = \prod_{j=1}^{d} \mathcal{N}\left(z_j \mid \mu_j(x), \sigma_j^2(x)\right)$$
can be seen as the distribution of the representation corresponding
to the data point $x$.
6. Trade-off between Disentanglement and Reconstruction in
beta-VAE II
- Marginal posterior and disentanglement
$$q(z) = \mathbb{E}_{p_{data}(x)}[q(z|x)] = \frac{1}{N} \sum_{i=1}^{N} q\left(z|x^{(i)}\right)$$
A disentangled representation would have each $z_j$ correspond to
precisely one underlying factor $f_k$, so we want $q(z)$ to
factorize over its dimensions:
$$q(z) = \prod_{j=1}^{d} q(z_j)$$
7. Trade-off between Disentanglement and Reconstruction in
beta-VAE III
Further Decomposition of β-VAE objective
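- The β-VAE objective:
$$\frac{1}{N} \sum_{i=1}^{N} \left[ \mathbb{E}_{q(z|x^{(i)})}\left[\log p\left(x^{(i)}|z\right)\right] - \beta\, KL\left(q\left(z|x^{(i)}\right) \,\|\, p(z)\right) \right]$$
is a lower bound on $\mathbb{E}_{p_{data}(x)}\left[\log p\left(x^{(i)}\right)\right]$.
Where,
$\mathbb{E}_{q(z|x^{(i)})}\left[\log p\left(x^{(i)}|z\right)\right]$: negative reconstruction error
$KL\left(q\left(z|x^{(i)}\right) \,\|\, p(z)\right)$: complexity penalty.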
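As a rough illustration of the objective above, the sketch below evaluates it for a small batch with NumPy. It assumes the diagonal Gaussian encoder and the N(0, I) prior from the notation slide, for which the KL term has the standard closed form 0.5 Σ_j (µ_j² + σ_j² − 1 − log σ_j²). The function names and the numbers in the batch are made up for illustration and are not taken from the paper or its code.

```python
import numpy as np

def gaussian_kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), one value per row."""
    return 0.5 * np.sum(mu**2 + np.exp(log_var) - 1.0 - log_var, axis=1)

def beta_vae_objective(recon_log_lik, mu, log_var, beta):
    """Batch average of E_q[log p(x|z)] - beta * KL(q(z|x) || p(z))."""
    kl = gaussian_kl_to_standard_normal(mu, log_var)
    return float(np.mean(recon_log_lik - beta * kl))

# Hypothetical encoder outputs for a batch of 2 observations with d = 3.
mu = np.array([[0.0, 0.0, 0.0], [1.0, -1.0, 0.5]])
log_var = np.zeros((2, 3))
# Made-up estimates of E_q[log p(x|z)] for the two observations.
recon_log_lik = np.array([-10.0, -12.0])

objective_beta1 = beta_vae_objective(recon_log_lik, mu, log_var, beta=1.0)
objective_beta4 = beta_vae_objective(recon_log_lik, mu, log_var, beta=4.0)
```

With these toy values the objective drops as β grows, because the same KL term is weighted more heavily.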
8. Trade-off between Disentanglement and Reconstruction in
beta-VAE IV
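- The KL term can be further decomposed as:
$$\mathbb{E}_{p_{data}(x)}[KL(q(z|x) \,\|\, p(z))] = I(x; z) + KL(q(z) \,\|\, p(z))$$
Proof (with $q(x, z) = p_{data}(x)\, q(z|x)$, and $I(x; z)$ the mutual information under $q(x, z)$):
$$\begin{aligned}
\mathbb{E}_{p_{data}(x)}[KL(q(z|x) \,\|\, p(z))]
&= \mathbb{E}_{p_{data}(x)} \mathbb{E}_{q(z|x)} \left[ \log \frac{q(z|x)}{p(z)} \right] \\
&= \mathbb{E}_{p_{data}(x)} \mathbb{E}_{q(z|x)} \left[ \log \left( \frac{q(z|x)}{q(z)} \cdot \frac{q(z)}{p(z)} \right) \right] \\
&= \mathbb{E}_{p_{data}(x)} \mathbb{E}_{q(z|x)} \left[ \log \frac{q(z|x)}{q(z)} + \log \frac{q(z)}{p(z)} \right] \\
&= \mathbb{E}_{p_{data}(x)}[KL(q(z|x) \,\|\, q(z))] + \mathbb{E}_{q(x,z)} \left[ \log \frac{q(z)}{p(z)} \right] \\
&= I(x; z) + \mathbb{E}_{q(z)} \left[ \log \frac{q(z)}{p(z)} \right] \\
&= I(x; z) + KL(q(z) \,\|\, p(z))
\end{aligned}$$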
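The decomposition can be checked numerically on a toy problem. The sketch below assumes a discrete latent variable with three states and four equally weighted data points, so every expectation is a finite sum; the sizes, the random distributions, and the helper name `kl` are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)
N, Z = 4, 3                                      # toy sizes: 4 data points, 3 latent states
q_z_given_x = rng.dirichlet(np.ones(Z), size=N)  # row i is q(z | x_i)
p_z = rng.dirichlet(np.ones(Z))                  # an arbitrary prior p(z)
p_x = np.full(N, 1.0 / N)                        # empirical data distribution

def kl(a, b):
    """KL divergence between two discrete distributions with full support."""
    return float(np.sum(a * np.log(a / b)))

q_z = p_x @ q_z_given_x                          # marginal posterior q(z)

# Left-hand side: E_x[ KL(q(z|x) || p(z)) ]
lhs = sum(p_x[i] * kl(q_z_given_x[i], p_z) for i in range(N))
# Mutual information I(x; z) = E_x[ KL(q(z|x) || q(z)) ]
mutual_info = sum(p_x[i] * kl(q_z_given_x[i], q_z) for i in range(N))
# Right-hand side: I(x; z) + KL(q(z) || p(z))
rhs = mutual_info + kl(q_z, p_z)
```

Both sides agree up to floating-point error, whatever prior and posteriors are drawn.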
9. Trade-off between Disentanglement and Reconstruction in
beta-VAE V
$$\mathbb{E}_{p_{data}(x)}[KL(q(z|x) \,\|\, p(z))] = I(x; z) + KL(q(z) \,\|\, p(z))$$
- When the complexity penalty is increased by setting β > 1,
both $KL(q(z) \,\|\, p(z))$ and $I(x; z)$ are penalized.
- Penalizing $KL(q(z) \,\|\, p(z))$ pushes $q(z)$ towards the factorial prior
$p(z)$, which encourages independence across the dimensions of $z$.
- Penalizing $I(x; z)$ reduces the amount of information about $x$
stored in $z$, which leads to poor reconstruction.
10. Total Correlation Penalty I
Factor-VAE objective
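$$\frac{1}{N} \sum_{i=1}^{N} \left[ \mathbb{E}_{q(z|x^{(i)})}\left[\log p\left(x^{(i)}|z\right)\right] - KL\left(q\left(z|x^{(i)}\right) \,\|\, p(z)\right) \right] - \gamma\, KL(q(z) \,\|\, \bar{q}(z))$$
where $\bar{q}(z) := \prod_{j=1}^{d} q(z_j)$. The objective is a lower bound on the marginal
log likelihood $\mathbb{E}_{p_{data}(x)}[\log p(x)]$ and directly encourages
independence in the code distribution.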
Total correlation [5]: $KL(q(z) \,\|\, \bar{q}(z))$
A popular measure of dependence for multiple random
variables.
As both $q(z)$ and $\bar{q}(z)$ are intractable, an alternative approach
for optimizing total correlation is required.
See slide 29 (Total Correlation) for the definition.
11. Total Correlation Penalty II
Alternative way to optimize total correlation: work with samples
1. Sample from $q(z)$: choose $x^{(i)}$ uniformly at random, then sample
from $q\left(z|x^{(i)}\right)$.
2. Sample from $\bar{q}(z)$: generate $d$ samples from $q(z)$ and ignore all
but one dimension of each sample.
Or, more efficiently, to sample from $\bar{q}(z)$:
1. Sample a batch from $q(z)$.
2. Randomly permute across the batch for each latent
dimension.
As long as the batch is large enough, the distribution of these
samples will closely approximate $\bar{q}(z)$.
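The batch-permutation procedure can be sketched in a few lines of NumPy. The function name `permute_dims` and the toy two-dimensional batch are assumptions made for illustration; the slides do not prescribe an implementation.

```python
import numpy as np

def permute_dims(z, rng):
    """Shuffle each latent dimension independently across the batch.

    Each column keeps its own marginal distribution, but the dependence
    between columns is broken, so rows approximate samples from q_bar(z).
    """
    batch_size, latent_dim = z.shape
    z_perm = np.empty_like(z)
    for j in range(latent_dim):
        z_perm[:, j] = z[rng.permutation(batch_size), j]
    return z_perm

# Toy batch standing in for samples from q(z): two strongly dependent dimensions.
rng = np.random.default_rng(0)
z1 = rng.normal(size=5000)
z = np.stack([z1, z1 + 0.1 * rng.normal(size=5000)], axis=1)

z_perm = permute_dims(z, rng)
corr_before = np.corrcoef(z.T)[0, 1]      # close to 1 for this toy batch
corr_after = np.corrcoef(z_perm.T)[0, 1]  # close to 0 after permuting
```

A separate permutation per dimension matters here: shuffling whole rows with one shared permutation would leave the joint distribution unchanged.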
12. Total Correlation Penalty III
Minimization of KL divergence
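By training a classifier (discriminator) $D$, approximate the density
ratio that arises in the KL term (density-ratio trick [6]).
$$TC(z) = KL(q(z) \,\|\, \bar{q}(z)) = \mathbb{E}_{q(z)} \left[ \log \frac{q(z)}{\bar{q}(z)} \right] \approx \mathbb{E}_{q(z)} \left[ \log \frac{D(z)}{1 - D(z)} \right]$$
Here $D(z)$ estimates the probability that $z$ is a sample from $q(z)$
rather than from $\bar{q}(z)$.
The discriminator and the VAE are trained jointly.
The discriminator is trained to classify between samples from
$q(z)$ and $\bar{q}(z)$.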
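To see why the discriminator output recovers the total correlation, the sketch below uses a case where everything is known in closed form. It assumes q(z) is a bivariate Gaussian with unit variances and correlation ρ, so q̄(z) is a product of two standard normals and the exact total correlation is −0.5 log(1 − ρ²). Instead of a trained network it plugs in the Bayes-optimal discriminator D(z) = q(z) / (q(z) + q̄(z)); in Factor-VAE, D is learned, so this shows only the identity, not the training procedure.

```python
import numpy as np

rho = 0.8
rng = np.random.default_rng(0)
cov = np.array([[1.0, rho], [rho, 1.0]])
z = rng.multivariate_normal(np.zeros(2), cov, size=200_000)  # samples from q(z)

def log_q(z):
    """Log density of the zero-mean bivariate normal with correlation rho."""
    quad = (z[:, 0]**2 - 2 * rho * z[:, 0] * z[:, 1] + z[:, 1]**2) / (1 - rho**2)
    return -np.log(2 * np.pi) - 0.5 * np.log(1 - rho**2) - 0.5 * quad

def log_q_bar(z):
    """Log density of the product of the two N(0, 1) marginals."""
    return -np.log(2 * np.pi) - 0.5 * np.sum(z**2, axis=1)

# Bayes-optimal discriminator: D(z) = sigmoid(log q(z) - log q_bar(z)).
logit = log_q(z) - log_q_bar(z)
D = 1.0 / (1.0 + np.exp(-logit))

# Monte Carlo estimate of E_q[ log D/(1-D) ] versus the exact total correlation.
tc_estimate = float(np.mean(np.log(D) - np.log1p(-D)))
tc_exact = -0.5 * np.log(1 - rho**2)
```

With a learned discriminator the estimate is only as good as the classifier, which is why the slides state the relation as an approximation.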
15. Metric for Disentanglement I
Disentanglement metric proposed in [1]
Weaknesses
1. The metric is sensitive to hyperparameters of the linear
classifier optimization.
2. A factor may be encoded as a linear combination of several
latent dimensions rather than a single dimension, and a linear
classifier can still score well, so using a linear classifier is inappropriate.
3. The metric has a failure mode. When only K − 1 factors out
of K factors are disentangled, the classifier still gives 100%
accuracy.
16. Metric for Disentanglement II
Proposed metric for disentanglement
1. Choose a factor k and generate data with this factor fixed, but
all other factors varying randomly.
2. Obtain their representations.
3. Normalize each dimension by its empirical standard deviation s
over the full data (or a large enough random subset).
4. Take the empirical variance $\mathrm{Var}_l\left(z_d^{(l)} / s_d\right)$ in each dimension $d$ of the
normalized representations.
5. The target index k and index of dimension with the lowest
variance are fed to the majority-vote classifier.
If the representation is perfectly disentangled, the variance of
dimension corresponding to the fixed factor will be 0.
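Steps 2 to 5 for a single vote can be sketched as follows. The sketch assumes a perfectly disentangled toy representation in which latent dimension k simply equals factor k; the function name `lowest_variance_dim`, the scales, and the batch size are illustrative assumptions, and indices are zero-based.

```python
import numpy as np

def lowest_variance_dim(z_batch, s):
    """Index of the latent dimension with the smallest normalized variance."""
    return int(np.argmin(np.var(z_batch / s, axis=0)))

rng = np.random.default_rng(0)
K = 3
scales = np.array([1.0, 5.0, 0.2])

# Hypothetical representations of the full data, used only to get s.
full = rng.normal(size=(10_000, K)) * scales
s = full.std(axis=0)

# One batch generated with factor k fixed and all other factors varying.
k = 1
batch = rng.normal(size=(64, K)) * scales
batch[:, k] = 2.5  # the fixed factor makes its latent dimension constant

vote = (lowest_variance_dim(batch, s), k)  # (lowest-variance dimension, fixed factor)
```

Normalizing by `s` is what stops the small-scale third dimension from winning the argmin just because its raw variance is low.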
17. Metric for Disentanglement III
As representations are normalized, $\arg\min_d \mathrm{Var}_l\left(z_d^{(l)} / s_d\right)$ is
invariant to rescaling of the representations in each dimension.
Majority-vote classification (see footnote 1)
1. Each batch of $L$ samples yields one vote $(a_i, b_i)$, where
$a_i \in \{1, \dots, D\}$ is the index of the lowest-variance dimension and
$b_i \in \{1, \dots, K\}$ is the index of the fixed factor.
2. Given $M$ votes $\{(a_i, b_i)\}_{i=1}^{M}$, the voting matrix is
$V_{jk} = \sum_{i=1}^{M} \mathbb{I}(a_i = j, b_i = k)$.
3. Then, the majority-vote classifier is defined to be
$C(j) = \arg\max_k V_{jk}$.
4. In other words, $C(j)$ is the index of the generative factor $k$ for which
latent dimension $j$ was most often the lowest-variance
dimension.
5. The metric is the accuracy of the classifier:
$$\frac{\sum_{j=1}^{D} V_{jC(j)}}{\sum_{j} \sum_{k} V_{jk}}$$
Note that for majority-vote classifier, there are no optimisation
hyperparameters to tune, and the resulting classifier is a
deterministic function of the training data.
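The voting matrix, the classifier C(j), and the resulting accuracy can be computed directly from a list of votes. The sketch below uses zero-based indices and a handful of made-up votes; the function name `majority_vote_accuracy` is an assumption for illustration and is not taken from the code linked in the footnote.

```python
import numpy as np

def majority_vote_accuracy(votes, D, K):
    """Build V, the classifier C(j) = argmax_k V[j, k], and its accuracy."""
    V = np.zeros((D, K), dtype=int)
    for a, b in votes:            # a: lowest-variance dimension, b: fixed factor
        V[a, b] += 1
    C = V.argmax(axis=1)          # predicted factor for each latent dimension
    correct = V[np.arange(D), C].sum()
    return V, C, correct / V.sum()

# Made-up votes for D = 3 latent dimensions and K = 2 factors.
votes = [(0, 0), (0, 0), (1, 1), (1, 1), (1, 0), (2, 1)]
V, C, accuracy = majority_vote_accuracy(votes, D=3, K=2)
```

For these votes, five of the six agree with the majority factor of their dimension, so the metric is 5/6.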
18. Metric for Disentanglement IV
Comparison between metrics ([1, 2])
1. The new disentanglement metric of [2] is much less sensitive to
hyperparameters than the old metric of [1].
2. The old metric is very sensitive to the number of iterations, and
it keeps improving with more iterations.
Footnote 1: Please refer to the code [Link] for more details.
19. Experiments I
Datasets
Dataset with known generative factors
1. 2D Shapes dataset[7] with n : 737,280, dim : 64 × 64
fk : shape(3), scale(6), orientation(40), x-position(32),
y-position(32)
2. 3D Shapes dataset[8] with n : 480,000, dim : 64 × 64 × 3
fk : shape(4), scale(8), orientation(15), floor color(10), wall
color(10), object color(10)
Dataset with unknown generative factors
1. 3D Faces dataset[9] with n : 239,840, dim : 64 × 64 × 3
2. 3D Chairs dataset[10] with n : 86,366, dim : 64 × 64 × 3
3. CelebA dataset (Cropped)[11] with n : 202,599,
dim : 64 × 64 × 3
27. Conclusion
This work introduces FactorVAE, a novel method for
learning disentangled representations.
A new disentanglement metric is proposed.
Limitations
Low total correlation is necessary but not sufficient for
disentangling independent factors of variation. (If all
but one of the latent dimensions collapse to the prior,
TC = 0 but the representation is not disentangled.)
The proposed metric requires generating samples while holding one
factor fixed, which is not always possible. (For example, when the
training set does not cover all possible factor combinations.)
The metric is also unsuitable for data with non-independent
factors of variation.
28. References
[1] I. Higgins, L. Matthey, A. Pal, C. Burgess, X. Glorot, M. Botvinick, S. Mohamed, and A. Lerchner, “beta-VAE: Learning basic visual concepts with a constrained variational framework,” ICLR, vol. 2, no. 5, p. 6, 2017.
[2] H. Kim and A. Mnih, “Disentangling by factorising,” arXiv preprint arXiv:1802.05983, 2018.
[3] Y. Bengio, A. Courville, and P. Vincent, “Representation learning: A review and new perspectives,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 1798–1828, 2013.
[4] B. M. Lake, T. D. Ullman, J. B. Tenenbaum, and S. J. Gershman, “Building machines that learn and think like people,” Behavioral and Brain Sciences, vol. 40, no. 2017, 2017.
[5] S. Watanabe, “Information theoretical analysis of multivariate correlation,” IBM Journal of Research and Development, vol. 4,
29. Total Correlation
Definition
For $n$ given random variables $\{X_1, X_2, \dots, X_n\}$,
total correlation is defined as the KL divergence from the joint
distribution $p(X_1, \dots, X_n)$ to the independent distribution
$p(X_1) p(X_2) \cdots p(X_n)$.
$$TC(X_1, X_2, \dots, X_n) \equiv D_{KL}\left[ p(X_1, \dots, X_n) \,\|\, p(X_1) p(X_2) \cdots p(X_n) \right]$$
$$TC(X_1, X_2, \dots, X_n) = \sum_{i=1}^{n} H(X_i) - H(X_1, X_2, \dots, X_n)$$
= The amount of information shared
among the variables in the set.
A near-zero TC indicates that the variables in the group are
essentially statistically independent.
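The entropy form of the definition is easy to evaluate for small discrete distributions. The sketch below assumes the joint distribution is given as an n-dimensional probability table and uses natural logarithms; the function names and the two example tables are illustrative only.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in nats, ignoring zero-probability entries."""
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def total_correlation(joint):
    """Sum of marginal entropies minus the joint entropy."""
    n = joint.ndim
    marginals = [
        joint.sum(axis=tuple(a for a in range(n) if a != i)) for i in range(n)
    ]
    return sum(entropy(m) for m in marginals) - entropy(joint.ravel())

# Two independent binary variables: TC should be (numerically) zero.
independent = np.outer([0.3, 0.7], [0.5, 0.5])
# Two binary variables that are always equal: TC should be log 2.
copied = np.array([[0.5, 0.0], [0.0, 0.5]])

tc_independent = total_correlation(independent)
tc_copied = total_correlation(copied)
```

The second table shares one full bit between the two variables, which is why its total correlation equals log 2 in nats.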