[DL Reading Group] The Cramer Distance as a Solution to Biased Wasserstein Gradients (Deep Learning JP)
This document discusses several mathematical concepts related to probability and statistics:
- It defines the Kullback-Leibler divergence KL(P||Q) as a measure of how one probability distribution P differs from a second distribution Q.
- It presents an equation for Qθ(y|x) as a Gaussian distribution with mean fθ(x) and variance 1/2, commonly used in probabilistic modeling.
- It defines the Wasserstein distance Wp(μ,ν) as a way to measure the distance between two probability distributions based on the minimum cost of transporting mass between them (see the sketch after this list).
- It provides equations for the probability density function and cumulative distribution function of a Dirac delta function and a
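Not from the slides themselves, but to make the Wasserstein bullet above concrete: a minimal sketch of the 1-Wasserstein distance between two one-dimensional empirical distributions with equally many samples, where the optimal transport plan simply matches sorted samples. The Gaussian parameters below are illustrative assumptions.

```python
import numpy as np

def wasserstein_1(x, y):
    """1-Wasserstein distance between two 1-D empirical distributions
    with the same number of samples: the optimal coupling matches
    sorted samples, so W1 is the mean absolute difference of the
    order statistics."""
    x, y = np.sort(x), np.sort(y)
    return np.mean(np.abs(x - y))

rng = np.random.default_rng(0)
p = rng.normal(0.0, 1.0, size=1000)   # samples from P
q = rng.normal(0.5, 1.0, size=1000)   # samples from Q
print(wasserstein_1(p, q))            # roughly the mean shift, ~0.5
```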
In this work, we introduce a new Markov operator associated with a digraph, which we refer to as a nonlinear Laplacian. Unlike previous Laplacians for digraphs, the nonlinear Laplacian does not rely on the stationary distribution of the random walk process and is well defined on digraphs that are not strongly connected. We show that the nonlinear Laplacian has nontrivial eigenvalues and give a Cheeger-like inequality, which relates the conductance of a digraph and the smallest non-zero eigenvalue of its nonlinear Laplacian. Finally, we apply the nonlinear Laplacian to the analysis of real-world networks and obtain encouraging results.
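The paper's nonlinear Laplacian for digraphs is not reproduced here; as a hedged illustration of the kind of statement a Cheeger-like inequality makes, the sketch below checks the classical relation between the conductance of an undirected graph and the second eigenvalue of its normalized Laplacian. The toy two-triangle graph and the chosen cut are assumptions for the example.

```python
import numpy as np

# Classical undirected analogue of a Cheeger-like inequality:
# for L = I - D^{-1/2} A D^{-1/2} with second eigenvalue lambda_2,
#     lambda_2 / 2 <= phi(G) <= sqrt(2 * lambda_2),
# where phi(G) is the minimum conductance over all cuts.

def normalized_laplacian(A):
    d = A.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.eye(len(A)) - d_inv_sqrt @ A @ d_inv_sqrt

def conductance(A, S):
    """Conductance of the cut (S, V \\ S): cut weight over the smaller volume."""
    S = np.asarray(S)
    T = np.setdiff1d(np.arange(len(A)), S)
    cut = A[np.ix_(S, T)].sum()
    vol = min(A[S].sum(), A[T].sum())
    return cut / vol

# Two triangles joined by a single edge: a natural bottleneck cut.
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

lam2 = np.sort(np.linalg.eigvalsh(normalized_laplacian(A)))[1]
phi = conductance(A, [0, 1, 2])           # the bottleneck cut
print(lam2, phi)
print(lam2 / 2 <= phi <= np.sqrt(2 * lam2))  # True for this example
```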
The document discusses deep kernel learning, which combines deep learning and Gaussian processes (GPs). It briefly reviews the predictive equations and marginal likelihood for GPs, noting their computational requirements. A GP models the target values at a set of input vectors as jointly Gaussian, determined by a mean function and a covariance kernel, and its predictive distribution at test points is again Gaussian. The goal of deep kernel learning is to leverage recent work on efficiently representing kernel functions to produce scalable deep kernels, allowing the combined model to outperform standalone deep learning and GPs on various datasets.
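As a reference point for the predictive equations the summary mentions, here is a minimal sketch of standard GP regression with an RBF kernel; in deep kernel learning the same equations would be applied to features produced by a neural network rather than to the raw inputs. The kernel choice, noise level, and toy data are illustrative assumptions, and the cubic-cost linear solves are exactly the computational burden the summary alludes to.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential (RBF) covariance kernel."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def gp_predict(X, y, X_star, noise=1e-2):
    """Standard GP predictive equations (zero mean function):
       mean = K_*^T (K + sigma^2 I)^{-1} y
       cov  = K_** - K_*^T (K + sigma^2 I)^{-1} K_*"""
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    K_s = rbf_kernel(X, X_star)
    K_ss = rbf_kernel(X_star, X_star)
    mean = K_s.T @ np.linalg.solve(K, y)
    cov = K_ss - K_s.T @ np.linalg.solve(K, K_s)
    return mean, cov

# Toy 1-D regression example.
X = np.linspace(0, 5, 20)[:, None]
y = np.sin(X).ravel()
X_star = np.array([[2.5], [6.0]])
mean, cov = gp_predict(X, y, X_star)
print(mean, np.diag(cov))
```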
A simple, widely used control method. This presentation provides an introduction to PID controllers, including demonstrations and practice in tuning a controller for a simple system.
From the Un-Distinguished Lecture Series (http://ws.cs.ubc.ca/~udls/). The talk was given Mar. 30, 2007.
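Not part of the talk materials, but a minimal sketch of the discrete-time PID law the talk introduces (proportional, integral, and derivative terms on the error); the gains and the first-order plant are illustrative assumptions.

```python
class PID:
    """Minimal discrete-time PID controller."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        # Control output: proportional + integral + derivative terms.
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Simulate a simple first-order plant x' = -x + u, driven toward setpoint 1.0.
dt = 0.01
pid = PID(kp=2.0, ki=1.0, kd=0.1, dt=dt)
x = 0.0
for _ in range(1000):
    u = pid.update(1.0, x)
    x += (-x + u) * dt
print(x)  # should settle near the setpoint of 1.0
```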