A One-Pass Triclustering Approach: Is There Any Room for Big Data? (Dmitrii Ignatov)
An efficient one-pass online algorithm for triclustering of binary data (triadic formal contexts) is proposed. This algorithm is a modified version of the basic algorithm of the OAC-triclustering approach, but it has linear time and memory complexity with respect to the cardinality of the underlying ternary relation and can be easily parallelized for the analysis of big datasets. The results of computer experiments show the efficiency of the proposed algorithm.
Full paper: https://arxiv.org/pdf/1804.02339.pdf
We propose and analyze a novel adaptive step size variant of Davis-Yin three-operator splitting, a method that can solve optimization problems composed of a sum of a smooth term for which we have access to its gradient and an arbitrary number of potentially non-smooth terms for which we have access to their proximal operator. The proposed method leverages local information about the objective function, allowing for larger step sizes while preserving the convergence properties of the original method. It only requires two extra function evaluations per iteration and does not depend on any step size hyperparameter besides an initial estimate. We provide a convergence rate analysis of this method, showing a sublinear convergence rate for general convex functions and linear convergence under stronger assumptions, matching the best known rates of its non-adaptive variant. Finally, an empirical comparison with related methods on six different problems illustrates the computational advantage of the adaptive step size strategy.
Context-Aware Recommender System Based on Boolean Matrix Factorisation (Dmitrii Ignatov)
In this work we propose and study an approach to collaborative filtering that is based on Boolean matrix factorisation and exploits additional (context) information about users and items. To avoid similarity loss in the Boolean representation, we use an adjusted type of projection of a target user onto the obtained factor space.
We have compared the proposed method with an SVD-based approach on the MovieLens dataset. The experiments demonstrate that the proposed method has better MAE and Precision and comparable Recall and F-measure. We also report an increase in quality when context information is present.
Beginning with a review of Bayes' theorem and the chain rule, we then explain MAP (Maximum A Posteriori) estimation.
Within the framework of MAP estimation, we can describe many well-known models: naive Bayes, regularized ridge regression, logistic regression, log-linear models, and Gaussian processes.
MAP estimation is a powerful framework for understanding the above models from a Bayesian point of view, and it opens the possibility of extending them to semi-supervised settings.
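Among the models listed, ridge regression gives the quickest worked example: MAP estimation with a Gaussian likelihood and a Gaussian prior w ~ N(0, τ²I) reduces to ridge regression with λ = σ²/τ². A minimal sketch with illustrative synthetic data (all sizes, seeds, and noise levels below are assumptions, not from the slides):

```python
import numpy as np

# MAP estimation for linear regression y = Xw + eps, eps ~ N(0, sigma^2 I),
# with Gaussian prior w ~ N(0, tau^2 I). The posterior mode is the ridge
# solution w_MAP = (X'X + lam*I)^-1 X'y with lam = sigma^2 / tau^2.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))            # illustrative synthetic design matrix
w_true = rng.normal(size=5)
y = X @ w_true + 0.1 * rng.normal(size=50)

sigma2, tau2 = 0.01, 1.0                # assumed noise and prior variances
lam = sigma2 / tau2

# Closed-form MAP / ridge estimate
w_map = np.linalg.solve(X.T @ X + lam * np.eye(5), X.T @ y)
print(w_map)
```

The same estimate is what gradient descent on the penalized negative log-posterior would converge to; the closed form just makes the MAP-equals-ridge correspondence explicit.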
Approximation Algorithms for the Directed k-Tour and k-Stroll Problems (Sunny Kr)
In the Asymmetric Traveling Salesman Problem (ATSP), the input is a directed n-vertex graph G = (V, E) with nonnegative edge lengths, and the goal is to find a minimum-length tour visiting each vertex at least once. ATSP, along with its undirected counterpart, the Traveling Salesman Problem, is a classical combinatorial optimization problem.
Learning RBM (Restricted Boltzmann Machine) in Practice (Mad Scientists)
In deep learning, the RBM is a basic building block of the hierarchical layers. In these slides, we cover the basic components of RBMs: the bipartite graph structure, Gibbs sampling, contrastive divergence (CD-1), and the energy function.
First-passage percolation on random planar maps (Timothy Budd)
Recently, two- and three-point functions have been derived for general planar maps with control over both the number of edges and the number of faces. In the limit of a large number of edges, the multi-point functions reduce to those for random cubic planar maps with random exponential edge lengths, and they can be interpreted in terms of either first-passage percolation (FPP) or an Eden model. We observe a surprisingly simple relation between the asymptotic first-passage time, the hop count (the number of edges in a shortest-time path), and the graph distance (the number of edges in a shortest path). Using (heuristic) transfer matrix arguments, we show that this relation remains valid for random p-valent maps for any p > 2.
Slides for a 10-minute talk on the topic of values. It is about the dark side of the force in all of us, which magically attracts us and prevents real success. It is about recognizing this pattern of behavior in oneself, and in myself.
Machine Learning and Logging for Monitoring Microservices (Daniel Berman)
In this talk I cover use cases for machine learning and centralized logging when monitoring a distributed, multi-layered microservices architecture.
Persistence of power-law correlations in nonequilibrium steady states of gapp... (Jarrett Lancaster)
The existence of quasi-long-range order is demonstrated in nonequilibrium steady states of isotropic XY spin chains with two types of additional terms that generate a gap in the energy spectrum. The system is driven out of equilibrium by initializing a domain-wall magnetization profile through application of an external magnetic field and switching off the field at the same time the energy gap is activated. An energy gap is produced either by applying a staggered magnetic field in the transverse direction or by introducing a modulation of the XY coupling. The magnetization, spin current, and spin-spin correlation functions are computed in the thermodynamic limit at long times after the quench. For both types of systems, we find the persistence of power-law correlations even though the ground-state correlation functions exhibit exponential decay. We discuss how these power-law correlations appear to be related to the periodic nature of the perturbation that generates the energy gap.
We propose a regularized method for multivariate linear regression when the number of predictors may exceed the sample size. This method is designed to strengthen the estimation and the selection of the relevant input features with three ingredients: it takes advantage of the dependency pattern between the responses by estimating the residual covariance; it performs selection on direct links between predictors and responses; and selection is driven by prior structural information. To this end, we build on a recent reformulation of the multivariate linear regression model as a conditional Gaussian graphical model and propose a new regularization scheme accompanied by an efficient optimization procedure. On top of showing very competitive performance on artificial and real data sets, our method demonstrates capabilities for fine interpretation of its parameters, as illustrated in applications to genetics, genomics and spectroscopy.
Riemannian stochastic variance reduced gradient on Grassmann manifold (ICCOPT... (Hiroyuki Kasai)
Stochastic variance reduction algorithms have recently become popular for minimizing the average of a large, but finite, number of loss functions. In this paper, we propose a novel Riemannian extension of the Euclidean stochastic variance reduced gradient algorithm (R-SVRG) to a compact manifold search space. To this end, we show the developments on the Grassmann manifold. The key challenges of averaging, addition, and subtraction of multiple gradients are addressed with notions such as the logarithm map and parallel translation of vectors on the Grassmann manifold. We present a global convergence analysis of the proposed algorithm with a decaying step size and a local convergence rate analysis under a fixed step size and some natural assumptions. The proposed algorithm is applied to a number of problems on the Grassmann manifold, such as principal component analysis, low-rank matrix completion, and Karcher mean computation. In all these cases, the proposed algorithm outperforms the standard Riemannian stochastic gradient descent algorithm.
To make reinforcement learning algorithms work in the real world, one has to get around (what Sutton calls) the "deadly triad": the combination of bootstrapping, function approximation and off-policy evaluation. The first step here is to understand the value function vector space/geometry, and then make one's way into Gradient TD algorithms (a big breakthrough in overcoming the "deadly triad").
Polynomial matrices can help to elegantly formulate many broadband multi-sensor / multi-channel processing problems, and represent a direct extension of well-established narrowband techniques which typically involve eigen- (EVD) and singular value decompositions (SVD) for optimisation. Polynomial matrix decompositions extend the utility of the EVD to polynomial parahermitian matrices, and this talk presents a brief overview of such polynomial matrices, characteristics of the polynomial EVD (PEVD) and iterative algorithms for its solution. The presentation concludes with some surprising results when applying the PEVD to subband coding and broadband beamforming.
From Atomistic to Coarse Grain Systems - Procedures & Methods (Frank Roemer)
The physical and mathematical basis as well as the historical background of the most popular coarse-graining methods (Reverse/Inverse Monte Carlo, Iterative Boltzmann Inversion, and the Force Matching method) in the field of fluids and soft matter are presented here. In terms of length and time scales, I refer here to the classical coarse-grain systems, which lie between the atomistic and mesoscale systems. The focus is on the path to deriving the coarse-grain force fields from reference data obtained from atomistic simulations.
Learning visual representation without human labels (Kai-Wen Zhao)
Self-supervised learning (SSL) is one of the fastest-growing research topics of recent years. SSL provides algorithms that learn visual representations directly from the data itself rather than from manual human labels. From a theoretical point of view, SSL explores information theory and the nature of large-scale datasets.
A new paper published by OpenAI discusses generalization in deep learning and provides observations on how model and data complexity influence each other.
Learning to discover Monte Carlo algorithms on spin ice manifolds (Kai-Wen Zhao)
A global-update Monte Carlo sampler can be discovered naturally by a machine trained with the policy gradient method in a topologically constrained environment.
Toward Disentanglement through Understanding the ELBO (Kai-Wen Zhao)
Disentangled representation is the holy grail of representation learning: it factorizes data into human-understandable factors in an unsupervised way, which helps us move toward interpretable machine learning.
Deep Reinforcement Learning: Q-Learning (Kai-Wen Zhao)
These slides review deep reinforcement learning, especially Q-Learning and its variants. We introduce the Bellman operator and approximate it with a deep neural network. Last but not least, we review the classic DeepMind paper in which an Atari-playing agent beats human performance. Some tips for stabilizing DQN are also included.
High-Dimensional Data Visualization using t-SNE (Kai-Wen Zhao)
A review of the t-SNE algorithm, which helps visualize high-dimensional data lying on a manifold by projecting it onto 2D or 3D space while preserving the metric structure.
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2... (pchutichetpong)
M Capital Group (“MCG”) expects demand to grow and supply to evolve, facilitated by institutional investment rotating out of offices and into work from home (“WFH”), while the need for data storage keeps expanding along with global internet usage, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to expect strong annual growth of 13% over the next four years.
While competitive headwinds remain, exemplified by the recent second bankruptcy filing of Sungard, which blames “COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services”, the industry has seen key adjustments, and MCG believes that engineering cost management and technological innovation will be paramount to success.
MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment, will drive market momentum forward. The continuous injection of capital by alternative investment firms, as well as growing infrastructure investment from cloud service providers and social media companies, whose revenues are expected to grow to over 3.6x their current value by 2026, will likely help propel data-center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies.
According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”
Levelwise PageRank with Loop-Based Dead End Handling Strategy: SHORT REPORT ... (Subhajit Sahu)
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components and processes them in topological order, one level at a time. This enables the calculation of ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. The slowdown on the GPU is likely caused by a large number of small workload submissions, and is expected to be a non-issue when the computation is performed on massive graphs.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... (John Andrews)
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
[Internal study session slides: Octo: An Open-Source Generalist Robot Policy]
Paper Review: An exact mapping between the Variational Renormalization Group and Deep Learning
1. An exact mapping between the Variational Renormalization Group and Deep Learning
Kai-Wen Zhao, kv
Physics, National Taiwan University
kelispinor@gmail.com
December 1, 2016
Kai-Wen Zhao, kv (NTU-PHYS) Review December 1, 2016 1 / 18
2. Outline
Overview
Renormalization Group
Physical world with various length scales
Symmetry and Scale Invariance
Restricted Boltzmann Machine
Generative, Energy-based Model, Unsupervised Learning Algorithm
Richard Feynman: What I Cannot Create, I Do Not Understand.
Mapping
Unsupervised Deep Learning Implements the Kadanoff Real Space Variational Renormalization Group:
H^RG_λ[{h_j}] = H^RBM_λ[{h_j}]
3. Overview of Variational RG
Statistical Physics
An ensemble of N spins {v_i}, each taking values ±1, where i is a position index on some lattice. Boltzmann distribution and partition function:
P({v_i}) = e^{-H({v_i})} / Z,  where  Z = Tr_{v_i} e^{-H({v_i})} = Σ_{v_1,v_2,...=±1} e^{-H({v_i})}
Typically, the Hamiltonian depends on a set of couplings {K_s}:
H[{v_i}] = -Σ_i K_i v_i - Σ_{ij} K_{ij} v_i v_j - Σ_{ijk} K_{ijk} v_i v_j v_k + ...
Free energy of the spin system:
F = -log Z = -log(Tr_{v_i} e^{-H({v_i})})
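For a very small system, the partition function and free energy above can be evaluated by brute force. A minimal sketch, assuming an open nearest-neighbor chain whose only nonzero couplings are K_{i,i+1} = K (an illustrative special case of the general Hamiltonian):

```python
import itertools
import math

# Brute-force Z = Tr_{v_i} e^{-H({v_i})} for a tiny open Ising chain,
# an illustrative special case with only nearest-neighbor couplings K.
def hamiltonian(spins, K=1.0):
    # H[{v_i}] = -K * sum_i v_i v_{i+1}
    return -K * sum(spins[i] * spins[i + 1] for i in range(len(spins) - 1))

def partition_function(n, K=1.0):
    # Sum e^{-H} over all 2^n configurations with v_i = +/-1
    return sum(math.exp(-hamiltonian(s, K))
               for s in itertools.product([-1, 1], repeat=n))

Z = partition_function(4, 1.0)
F = -math.log(Z)   # free energy F = -log Z (in units of k_B T)
print(Z, F)
```

For an open chain of n spins the sum factorizes, giving Z = 2·(2·cosh K)^(n-1), which the enumeration reproduces.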
4. Overview of Variational RG
Overview of Variational Renormalization Group
Idea behind RG: to find a new coarse-grained description of the spin system in which short-distance fluctuations have been integrated out.
N physical spins: {v_i}, couplings {K}
M coarse-grained spins: {h_j}, couplings {K̃}, where M < N
The renormalization transformation is often represented as a mapping {K} → {K̃}
Coarse-grained Hamiltonian:
H^RG[{h_j}] = -Σ_i K̃_i h_i - Σ_{ij} K̃_{ij} h_i h_j - Σ_{ijk} K̃_{ijk} h_i h_j h_k + ...
From now on, we do not distinguish v_i from {v_i} when there is no ambiguity.
5. Overview of Variational RG
Overview of Variational Renormalization Group
Variational RG scheme (Kadanoff)
Coarse-graining procedure: T_λ(v_i, h_j) couples auxiliary spins h_j to physical spins v_i.
Naturally, we marginalize over the physical spins:
exp(-H^RG_λ(h_j)) = Tr_{v_i} exp(T_λ(v_i, h_j) - H(v_i))
The free energy of the coarse-grained system:
F^h_λ = -log(Tr_{h_j} e^{-H^RG_λ(h_j)})
Choose the parameters λ to ensure that long-distance observables are invariant, i.e. minimize the free energy difference:
ΔF = F^h_λ - F^v
6. Overview of Variational RG
Overview of Variational Renormalization Group
7. RBMs and Deep Neural Networks
Restricted Boltzmann Machine
Binary data probability distribution P(v_i). Energy function:
E(v_i, h_j) = Σ_{ij} w_{ij} v_i h_j + Σ_i c_i v_i + Σ_j b_j h_j
where we denote the parameters λ = {w, b, c}. Joint probability:
p_λ(v_i, h_j) = e^{-E(v_i, h_j)} / Z
8. RBMs and Deep Neural Networks
Restricted Boltzmann Machine
Variational distributions of the visible and hidden variables:
p_λ(v_i) = Σ_{h_j} p_λ(v_i, h_j) = Tr_{h_j} p_λ(v_i, h_j) := e^{-H^RBM_λ(v_i)} / Z
p_λ(h_j) = Σ_{v_i} p_λ(v_i, h_j) = Tr_{v_i} p_λ(v_i, h_j) := e^{-H^RBM_λ(h_j)} / Z
Kullback-Leibler divergence:
D_KL(P(v_i) || p_λ(v_i)) = Σ_{v_i} P(v_i) log(P(v_i) / p_λ(v_i))
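These traces can be checked numerically for a tiny RBM. A minimal sketch with random illustrative parameters and spins in {-1, +1}, following the slides' sign convention p_λ ∝ e^{-E}:

```python
import itertools
import numpy as np

# Tiny RBM: E(v, h) = sum_ij w_ij v_i h_j + sum_i c_i v_i + sum_j b_j h_j,
# p(v, h) = e^{-E(v, h)} / Z. Parameters are arbitrary illustrative draws.
rng = np.random.default_rng(0)
nv, nh = 3, 2
w = rng.normal(scale=0.5, size=(nv, nh))
c = rng.normal(scale=0.5, size=nv)
b = rng.normal(scale=0.5, size=nh)

def energy(v, h):
    return v @ w @ h + c @ v + b @ h

def confs(n):
    return [np.array(s) for s in itertools.product([-1, 1], repeat=n)]

Z = sum(np.exp(-energy(v, h)) for v in confs(nv) for h in confs(nh))

def p_v(v):
    # Marginal over hidden spins: p(v) = Tr_h e^{-E(v, h)} / Z = e^{-H^RBM(v)} / Z
    return sum(np.exp(-energy(v, h)) for h in confs(nh)) / Z

total = sum(p_v(v) for v in confs(nv))
print(total)   # sums to 1 by construction
```

The same enumeration applied over v instead of h yields p_λ(h_j) and thus the induced hidden-spin Hamiltonian H^RBM_λ(h_j).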
9. Exact Mapping VRG to DL
Mapping Variational RG to RBM
In the RG scheme, the couplings between the visible and hidden spins are encoded by the operator T. The analogous role in the RBM is played by the joint energy function:
T(v_i, h_j) = -E(v_i, h_j) + H(v_i)
To derive the equivalent statement from the coarse-grained Hamiltonian:
e^{-H^RG_λ(h_j)} / Z = Tr_{v_i} e^{T_λ(v_i, h_j) - H(v_i)} / Z = Tr_{v_i} e^{-E(v_i, h_j)} / Z = p_λ(h_j) = e^{-H^RBM_λ(h_j)} / Z
Substituting the right-hand side yields
H^RG_λ[{h_j}] = H^RBM_λ[{h_j}]  (1)
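Equation (1) can be verified numerically: with T(v, h) = -E(v, h) + H(v), tracing out the visible spins gives the same hidden-spin Hamiltonian on both sides. A minimal sketch with arbitrary illustrative parameters (the weights-only RBM energy and the chain Hamiltonian are assumptions chosen for brevity):

```python
import itertools
import math
import numpy as np

# Check H^RG(h) = H^RBM(h), where (up to the common 1/Z)
#   e^{-H^RG(h)}  = Tr_v e^{T(v,h) - H(v)}  with  T(v,h) = -E(v,h) + H(v),
#   e^{-H^RBM(h)} = Tr_v e^{-E(v,h)}.
rng = np.random.default_rng(1)
nv, nh = 3, 2
w = rng.normal(size=(nv, nh))

def E(v, h):                 # RBM coupling energy (weights only, for brevity)
    return v @ w @ h

def H(v):                    # an arbitrary physical Hamiltonian: open chain
    return -sum(v[i] * v[i + 1] for i in range(len(v) - 1))

def T(v, h):                 # variational coupling operator
    return -E(v, h) + H(v)

def confs(n):
    return [np.array(s) for s in itertools.product([-1, 1], repeat=n)]

for h in confs(nh):
    H_RG = -math.log(sum(math.exp(T(v, h) - H(v)) for v in confs(nv)))
    H_RBM = -math.log(sum(math.exp(-E(v, h)) for v in confs(nv)))
    assert abs(H_RG - H_RBM) < 1e-9
print("H^RG(h) = H^RBM(h) for every hidden configuration")
```

The check is definitional once T contains the +H(v) term, which is exactly the point of the mapping: the H(v) contributions cancel inside the trace.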
10. Exact Mapping VRG to DL
Mapping Variational RG to RBM
The operator T_λ can be viewed as a variational approximation for the conditional probability:
e^{T(v_i, h_j)} = e^{-E(v_i, h_j) + H(v_i)} = (p_λ(v_i, h_j) / p_λ(v_i)) e^{H(v_i) - H^RBM_λ(v_i)} = p_λ(h_j | v_i) e^{H(v_i) - H^RBM_λ(v_i)}
11. Examples
Examples: 2D Ising Model
Two-dimensional nearest-neighbor Ising model with ferromagnetic coupling:
H({v_i}) = -J Σ_{<ij>} v_i v_j
The phase transition occurs at J/(k_B T) = 0.4352.
Experiment setup: 20,000 samples on a 40×40 periodic lattice; RBM architecture 1600-400-100-25.
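Samples like those used in this experiment can be generated with a standard Metropolis sampler for the Ising Hamiltonian above. A minimal sketch (the lattice size and sweep count are small illustrative stand-ins for the 40×40, 20,000-sample setup in the slides):

```python
import numpy as np

# Metropolis sampler for the 2D Ising model H = -J sum_<ij> v_i v_j
# on a periodic lattice.
def metropolis_sweep(spins, beta_J, rng):
    L = spins.shape[0]
    for _ in range(L * L):
        i, j = rng.integers(L, size=2)
        # Sum of the four periodic nearest neighbors
        nb = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j] +
              spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
        dE = 2.0 * beta_J * spins[i, j] * nb   # change in H/(k_B T) if flipped
        if dE <= 0 or rng.random() < np.exp(-dE):
            spins[i, j] *= -1

rng = np.random.default_rng(42)
L, beta_J = 8, 0.4352            # coupling at the slides' transition value
spins = rng.choice([-1, 1], size=(L, L))
for _ in range(200):             # equilibration sweeps
    metropolis_sweep(spins, beta_J, rng)
print(abs(spins.mean()))         # |magnetization| per spin
```

Recording the lattice after every few sweeps (after equilibration) yields a dataset of binary configurations suitable as RBM training samples.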
12. Examples
Examples: 2D Ising Model
Figure: Top layer
13. Examples
Examples: 2D Ising Model
Figure: Middle layer
15. Conclusion
Conclusion and Discussion
One-to-one mapping between RBM-based DNNs and the variational RG
This suggests that learning implements an RG-like scheme to extract important features from data
16. Relate to us
Relate to us: Auto-Encoder and Convolutional AE
z is the code extracted by the machine
φ : X → Z,  ψ : Z → X
arg min_{φ,ψ} ||X - (ψ ∘ φ)X||²
Figure: Scheme of Auto-Encoder
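The reconstruction objective above can be sketched with a minimal linear auto-encoder trained by gradient descent (the data, layer sizes, learning rate, and iteration count are all illustrative assumptions):

```python
import numpy as np

# Minimal linear auto-encoder trained on arg min ||X - (psi o phi)X||^2.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))            # 100 samples, 8 features
d = 3                                     # code dimension |Z|
We = rng.normal(scale=0.3, size=(8, d))   # encoder phi (linear)
Wd = rng.normal(scale=0.3, size=(d, 8))   # decoder psi (linear)

lr = 0.05
for _ in range(300):
    Z = X @ We                            # codes z = phi(x)
    R = Z @ Wd                            # reconstructions (psi o phi)(x)
    G = (R - X) / len(X)                  # gradient of 0.5*MSE w.r.t. R
    gWd = Z.T @ G                         # backprop through the decoder
    gWe = X.T @ (G @ Wd.T)                # backprop through the encoder
    Wd -= lr * gWd
    We -= lr * gWe

mse = np.mean((X - X @ We @ Wd) ** 2)
print(mse)
```

With a linear encoder and decoder this objective is equivalent to PCA up to an invertible transform of the code space; nonlinear and convolutional AEs replace We and Wd with deeper maps.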
17. Relate to us
Relate to us: Auto-Encoder and Convolutional AE