Abstract: Motivated by the recovery and prediction of electricity consumption time series, we extend Nonnegative Matrix Factorization (NMF) to take into account external features as side information. We consider general linear measurement settings and propose a framework that models non-linear relationships between external features and the response variable. We extend previous theoretical results to obtain a sufficient condition for the identifiability of NMF with side information. Based on the classical Hierarchical Alternating Least Squares (HALS) algorithm, we propose a new algorithm (HALSX, or Hierarchical Alternating Least Squares with eXogeneous variables) which estimates NMF in this setting. The algorithm is validated on both simulated and real electricity consumption datasets, as well as a recommendation system dataset, to show its performance in matrix recovery and in prediction for new rows and columns.
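For readers unfamiliar with the classical HALS algorithm that the abstract builds on, the following is a minimal NumPy sketch of plain HALS for a fully observed nonnegative matrix. It is not the HALSX algorithm of the talk: it ignores side information and general linear measurements. The function name `hals_nmf`, its parameters, and the small floor `eps` are illustrative assumptions, not taken from the slides.

```python
import numpy as np


def hals_nmf(V, rank, n_iter=200, eps=1e-10, seed=0):
    """Classical HALS sketch: approximate V (n x m, nonnegative) by W @ H.

    Hypothetical helper for illustration only; it does not handle side
    information or partial observations.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank))
    H = rng.random((rank, m))
    for _ in range(n_iter):
        # Update the columns of W one at a time, with H held fixed.
        VHt = V @ H.T   # shape (n, rank)
        HHt = H @ H.T   # shape (rank, rank)
        for k in range(rank):
            # Exact least-squares step for column k, then projection
            # onto the nonnegative orthant (floored at eps).
            step = (VHt[:, k] - W @ HHt[:, k]) / max(HHt[k, k], eps)
            W[:, k] = np.maximum(eps, W[:, k] + step)
        # Update the rows of H one at a time, with W held fixed.
        WtV = W.T @ V   # shape (rank, m)
        WtW = W.T @ W   # shape (rank, rank)
        for k in range(rank):
            step = (WtV[k, :] - WtW[k, :] @ H) / max(WtW[k, k], eps)
            H[k, :] = np.maximum(eps, H[k, :] + step)
    return W, H
```

Each inner step solves a one-column (or one-row) least-squares problem in closed form, which is what makes HALS cheap per iteration. Flooring at `eps` rather than at zero is a common practical choice to keep a factor from collapsing to an all-zero column.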
Nonnegative Matrix Factorization with Side Information for Time Series Recovery and Prediction by Jiali Mei, Researcher @Shift Technology
1. Nonnegative Matrix Factorization with Side Information for
Time Series Recovery and Prediction
Jiali Mei (4), Yohann De Castro (1), Yannig Goude (1,2), Jean-Marc Azaïs (3), Georges Hébrail (2)
(1) LMO, Univ. Paris-Sud, CNRS, Université Paris-Saclay, Orsay
(2) EDF Lab Paris-Saclay, Palaiseau
(3) Institut de Mathématiques, Université Paul Sabatier, Toulouse
(4) Shift Technology, Paris
September 26, 2018
3. Context
Utility companies are interested in electricity consumption data of small
regions (village, block, small city) on a fine temporal scale.
This is useful in several ways:
Useful for utility companies to manage the supply-demand balance locally, in a
world with decentralized electricity generation (wind and solar power) and open
electricity market;
A requirement for the transmission system operators (TSO) by regulators;
Generally useful information to better understand socio-economic activities on a
fine temporal level.
Are there enough data for doing this?
4. Motivating example 1: data from meters
Figure: Electricity meter readings Figure: Daily electricity consumption
Traditional electricity meters need to be read physically, therefore at a lower
frequency than needed for further applications.
The resulting data are asynchronous, since the meter reading dates are not
aligned for all clients. Such data are difficult to further process.
5. Motivating example 2: data from electricity network
Figure: Map of the 7th Arrondissement of Lyon and low-voltage transformers in this district
Load data can be at a high temporal frequency on the electricity network, but at
a different or coarser spatial scale than what is needed.
6. Motivating example 3: electricity consumption and external factors
Figure: Portuguese electricity consumption versus days of the week and the temperature. Left panel: influence of calendar variables, consumption (kW) against time of day (03:00 to 23:00), one curve per day from Monday to Sunday. Right panel: influence of the temperature, consumption (kW) against temperature (0 to 30).
It is well established that electricity consumption is influenced by many external factors.
7. A matrix representation
Figure: A matrix representation of the estimation target of the thesis
Variable of interest: $v_{i,j}$ is the electricity consumption at period $i$ for individual $j$, with $n_1$ periods and $n_2$ individuals in total.
$V \in \mathbb{R}^{n_1 \times n_2}$ denotes the whole matrix, $(v^i)^T$ its $i$-th row and $v_j$ its $j$-th column.
8. Main questions
Figure: A matrix representation of the estimation
target of the thesis
How can we estimate all entries of the
matrix V from temporal aggregates
and/or spatial aggregates?
Can the use of additional information
such as temporal regularity and
additional exogenous variables
improve such estimations?
Is it possible to produce predictions of
electricity consumption for new
periods and new individuals with such
data?
9. Outline
Method: Nonnegative matrix factorization with side information
NMF with linear measurements
Time series recovery and prediction with side information
HALSX algorithm
Experiments
Time series recovery
Time series prediction
Conclusions
12. Nonnegative matrix factorization
We propose to solve the estimation problem by nonnegative matrix factorization
(NMF, Lee and Seung 1999).
Based on the hypothesis that the matrix to be recovered is low-rank.
All entries in the factor matrices are nonnegative.
A dimension-reduction tool, similar to Singular Value Decomposition (SVD),
Principal Component Analysis (PCA), etc.
Remarks on non-negativity
Why? For the electricity application, nonnegative consumption profiles and
weights are much more interpretable.
Price to pay: weaker convergence guarantees.
14. Trace regression model
We wish to recover a matrix $V^*$ from data $a \in \mathbb{R}^D$, which are linear
measurements of the unknown matrix $V^*$:
$$a = \mathcal{A}(V^*),$$
where $\mathcal{A}$ is a known linear operator.
The linear operator $\mathcal{A}$ is identified by $D$ matrices, or masks, $A_1, \dots, A_D$ of the same
dimension as $V^*$.
For every matrix $X \in \mathbb{R}^{n_1 \times n_2}$,
$$\mathcal{A}(X) \equiv \left(\mathrm{Tr}(A_1^T X), \mathrm{Tr}(A_2^T X), \dots, \mathrm{Tr}(A_D^T X)\right)^T.$$
Hence the name trace regression model (Rohde and Tsybakov 2011).
Usual types of measurement operator $\mathcal{A}$:
complete observations
matrix completion (Candès and Recht 2009)
matrix sensing (Recht, Fazel, and Parrilo 2010)
rank-one matrix projections (Zuk and Wagner 2015)
temporal aggregates
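The measurement operator above can be illustrated with a few lines of NumPy. This is only a sketch for the temporal-aggregate case: the helper names `temporal_aggregate_masks` and `measure`, and the `(column, start, stop)` window format, are assumptions made for this example and do not come from the talk.

```python
import numpy as np

def temporal_aggregate_masks(n1, n2, windows):
    """Build one mask A_d per measurement (hypothetical helper).

    `windows` is a list of (column, start, stop) triples: the mask holds ones
    on rows start..stop-1 of the given column and zeros elsewhere.
    """
    masks = []
    for col, start, stop in windows:
        mask = np.zeros((n1, n2))
        mask[start:stop, col] = 1.0
        masks.append(mask)
    return masks

def measure(masks, X):
    """Apply the linear operator: the d-th measurement is Tr(A_d^T X)."""
    return np.array([np.trace(mask.T @ X) for mask in masks])

# 4 periods (rows), 2 individuals (columns), three aggregate readings.
V = np.arange(8, dtype=float).reshape(4, 2)
masks = temporal_aggregate_masks(4, 2, [(0, 0, 2), (0, 2, 4), (1, 0, 4)])
a = measure(masks, V)
```

With 0/1 masks, each trace is simply the sum of the masked entries, so a temporal aggregate is the total consumption of one individual over a window of periods. Matrix completion corresponds to masks with a single nonzero entry.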
18. Classical NMF algorithms with linear measurements
We minimize the quadratic approximation error, with a linear equality constraint:
$$\min_{V \in \mathbb{R}^{n_1 \times n_2},\ F_r \in \mathbb{R}^{n_1 \times k},\ F_c \in \mathbb{R}^{n_2 \times k}} \left\| V - F_r F_c^T \right\|_F^2 \quad \text{s.t. } V \ge 0,\ F_r \ge 0,\ F_c \ge 0,\ \mathcal{A}(V) = a.$$
We solve it by combining
classical iterative NMF algorithms, such as HALS or NeNMF (Cichocki 2009;
Guan et al. 2012),
with a projection step: $V = P_{\mathcal{A}}(F_r F_c^T)$, where $P_{\mathcal{A}}$ is the projection operator
onto the convex set defined by the two constraints $V \ge 0$ and $\mathcal{A}(V) = a$.
Data: $P_{\mathcal{A}}$, $1 \le k \le \min\{n_1, n_2\}$
Result: $V$ with $V \ge 0$ and $\mathcal{A}(V) = a$, $F_r \in \mathbb{R}_+^{n_1 \times k}$, $F_c \in \mathbb{R}_+^{n_2 \times k}$
Initialize $F_r^0, F_c^0 \ge 0$, $V^0 = P_{\mathcal{A}}(F_r^0 (F_c^0)^T)$, $i = 0$;
while stopping criterion is not satisfied do
    $F_r^{i+1} = \mathrm{Update}(F_r^i, (F_c^i)^T, V^i)$;
    $(F_c^{i+1})^T = \mathrm{Update}(F_r^{i+1}, (F_c^i)^T, V^i)$;
    $V^{i+1} = P_{\mathcal{A}}(F_r^{i+1} (F_c^{i+1})^T)$;
    $i = i + 1$;
end
As with most NMF algorithms, the limit points are stationary points.
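The loop above can be sketched in NumPy for the matrix completion case, where the projection simply clips at zero and restores the observed entries. This is a minimal illustration, not the authors' implementation: the function names, the standard HALS column update used for `Update`, the fixed iteration count and the random initialization are all assumptions of this example.

```python
import numpy as np

def hals_update(V, W, H, eps=1e-12):
    """One HALS sweep over the columns of W for the model V ~ W @ H.T."""
    VH = V @ H
    HtH = H.T @ H
    for j in range(W.shape[1]):
        # Exact minimiser for column j with the others fixed, clipped at zero.
        step = (VH[:, j] - W @ HtH[:, j]) / max(HtH[j, j], eps)
        W[:, j] = np.maximum(0.0, W[:, j] + step)
    return W

def project_completion(X, observed, values):
    """Projection for matrix completion: clip at zero, restore observed entries."""
    V = np.maximum(X, 0.0)
    V[observed] = values[observed]
    return V

def nmf_with_projection(values, observed, k, n_iter=200, seed=0):
    """Alternate HALS sweeps on Fr, Fc with the projection step on V."""
    rng = np.random.default_rng(seed)
    n1, n2 = values.shape
    Fr, Fc = rng.random((n1, k)), rng.random((n2, k))
    V = project_completion(Fr @ Fc.T, observed, values)
    for _ in range(n_iter):
        Fr = hals_update(V, Fr, Fc)
        Fc = hals_update(V.T, Fc, Fr)
        V = project_completion(Fr @ Fc.T, observed, values)
    return V, Fr, Fc

# Rank-2 nonnegative matrix with roughly 70% of its entries observed.
rng = np.random.default_rng(1)
truth = rng.random((30, 2)) @ rng.random((2, 20))
observed = rng.random(truth.shape) < 0.7
V_hat, Fr_hat, Fc_hat = nmf_with_projection(truth, observed, k=2)
```

Only the entries flagged in `observed` are read from `truth`; the remaining entries of `V_hat` are filled in by the low-rank product. Other measurement operators would only change `project_completion`.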
20. Regression models on factors
We introduce regression models in the NMF framework to take into account external
factors that influence electricity consumption.
Potential benefits
It may improve recovery quality.
It may help to interpret the estimated profiles.
The regression models may be used in prediction for new periods and new
individuals.
21. Generative low-rank model with exogenous variables
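To take into account exogenous variables as side information, we propose a generative
low-rank nonnegative model:
$V^*$ has an NMF:
$$V^* = F_r F_c^T.$$
The data are still $a = \mathcal{A}(V^*)$.
Feature matrices $X_r \in \mathbb{R}^{n_1 \times d_1}$ and $X_c \in \mathbb{R}^{n_2 \times d_2}$ are connected to $V^*$ through
link functions $f_r : \mathbb{R}^{d_1} \to \mathbb{R}^k$ and $f_c : \mathbb{R}^{d_2} \to \mathbb{R}^k$, so that
$$F_r = (f_r(X_r))_+, \qquad F_c = (f_c(X_c))_+,$$
where the matrices $f_r(X_r)$ and $f_c(X_c)$ are obtained by applying the link functions to each row of the feature matrices and stacking the resulting row vectors.
Given this generative model, the task is to estimate $f_c$, $f_r$, $F_c$, $F_r$, and $V^*$, given $X_r$,
$X_c$, $\mathcal{A}$, and $a$.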
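As an illustration of the generative model, the sketch below simulates a small $V^*$ with linear link functions. The linear form of `f_r` and `f_c`, the coefficient matrices `Br` and `Bc`, and the dimensions are assumptions chosen for this example; the model itself allows any link functions.

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2, d1, d2, k = 6, 5, 3, 2, 2

# Row and column features (synthetic, for illustration only).
Xr = rng.normal(size=(n1, d1))
Xc = rng.normal(size=(n2, d2))

# Hypothetical linear link functions, applied row by row via a matrix product.
Br = rng.normal(size=(d1, k))
Bc = rng.normal(size=(d2, k))

def f_r(X):
    return X @ Br

def f_c(X):
    return X @ Bc

# Positive part of the link outputs gives the nonnegative factors.
Fr = np.maximum(f_r(Xr), 0.0)
Fc = np.maximum(f_c(Xc), 0.0)
V_star = Fr @ Fc.T
```

Because both factors are thresholded at zero, `V_star` is nonnegative and has rank at most `k`, whatever the sign of the link outputs.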
23. Classification of models
The generative model leads to the following optimization problem:
$$\min_{V,\ f_r \in \mathcal{F}_r^k,\ f_c \in \mathcal{F}_c^k} \left\| V - (f_r(X_r))_+ (f_c(X_c))_+^T \right\|_F^2 \quad \text{s.t. } \mathcal{A}(V) = a,\ V \ge 0.$$
The generative model is very general and includes many known methods as special
cases, by specifying:
measurement operator $\mathcal{A}$:
complete observations
matrix completion
matrix sensing
rank-one matrix projections
temporal aggregates
functional spaces of $f_r$, $f_c$:
reduced-rank linear models (Foygel et al. 2012)
non-parametric reduced-rank regression (Mukherjee and Zhu 2011)
features $X_r$, $X_c$:
multiple kernel learning (Gönen and Alpaydın 2011)
collaborative filtering with graph features (Abernethy et al. 2009)
25. Extending HALS
The generative model leads to the following optimization problem:
$$\min_{V,\ f_r \in \mathcal{F}_r^k,\ f_c \in \mathcal{F}_c^k} \left\| V - (f_r(X_r))_+ (f_c(X_c))_+^T \right\|_F^2 \quad \text{s.t. } \mathcal{A}(V) = a,\ V \ge 0.$$
To solve the problem above, we extend the HALS algorithm mentioned before: we
modify the update function for each rank at each iteration to use the exogenous
variables. We call this algorithm HALSX (HALS with eXogenous variables).
Under fairly mild conditions, the limit points of HALSX are also
stationary points.
Sufficient conditions for the uniqueness of such a decomposition can be found in
the case where the link functions are linear.
26. HALSX: Pseudo-code
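Data: $\mathcal{A}$, $a$, $X_r$, $X_c$, $\mathcal{F}_r$, $\mathcal{F}_c$, $1 \le k \le \min\{n_1, n_2\}$.
Result: $V^t$, $F_r^t \in \mathbb{R}_+^{n_1 \times k}$, $f_{r,1}^t, \dots, f_{r,k}^t \in \mathcal{F}_r$, $F_c^t \in \mathbb{R}_+^{n_2 \times k}$, $f_{c,1}^t, \dots, f_{c,k}^t \in \mathcal{F}_c$.
Initialize $F_r^0, F_c^0 \ge 0$, $t = 0$;
while stopping criterion is not satisfied do
    $V^t = \arg\min_{V \,|\, \mathcal{A}(V) = a,\ V \ge 0} \| V - F_r^t (F_c^t)^T \|_F^2$;
    $R^t = V^t - F_r^t (F_c^t)^T$;
    for $i = 1, 2, \dots, k$ do
        $R_i^t = R^t + \mathbf{f}_{r,i}^t (\mathbf{f}_{c,i}^t)^T$;
        Calculate $f_{r,i}^{t+1} = \arg\min_{f \in \mathcal{F}_r} \| R_i^t - f(X_r) (\mathbf{f}_{c,i}^t)^T \|_F^2$;
        $\mathbf{f}_{r,i}^{t+1} = \max(0, f_{r,i}^{t+1}(X_r))$;
        $R^t = R_i^t - \mathbf{f}_{r,i}^{t+1} (\mathbf{f}_{c,i}^t)^T$;
    end
    for $i = 1, 2, \dots, k$ do
        $R_i^t = R^t + \mathbf{f}_{r,i}^{t+1} (\mathbf{f}_{c,i}^t)^T$;
        Calculate $f_{c,i}^{t+1} = \arg\min_{f \in \mathcal{F}_c} \| R_i^t - \mathbf{f}_{r,i}^{t+1} f(X_c)^T \|_F^2$;
        $\mathbf{f}_{c,i}^{t+1} = \max(0, f_{c,i}^{t+1}(X_c))$;
        $R^t = R_i^t - \mathbf{f}_{r,i}^{t+1} (\mathbf{f}_{c,i}^{t+1})^T$;
    end
    $t = t + 1$;
end
Here $f_{r,i}^t$ and $f_{c,i}^t$ denote the fitted regression functions, and $\mathbf{f}_{r,i}^t$ and $\mathbf{f}_{c,i}^t$ the $i$-th columns of $F_r^t$ and $F_c^t$.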
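The pseudo-code can be sketched in NumPy under two simplifying assumptions that are not part of the talk: the functional spaces are restricted to linear least-squares models, and the projection onto the constraint set is passed in as a user-supplied function `project`. All function names are hypothetical. The rank-one regression step uses the fact that fitting $f(X_r)(\mathbf{f}_{c,i})^T$ to $R_i$ in Frobenius norm amounts to regressing the target $R_i \mathbf{f}_{c,i} / \|\mathbf{f}_{c,i}\|^2$ on $X_r$.

```python
import numpy as np

def regression_sweep(R, F_upd, F_fix, X, eps=1e-12):
    """Update each column of F_upd by regressing a rank-one target on X.

    R is the current residual V - F_upd @ F_fix.T and is kept up to date.
    Returns the updated factor, the residual and the coefficients (d x k).
    """
    coefs = np.zeros((X.shape[1], F_upd.shape[1]))
    for i in range(F_upd.shape[1]):
        fi = F_fix[:, i]
        Ri = R + np.outer(F_upd[:, i], fi)            # add component i back
        target = Ri @ fi / max(fi @ fi, eps)          # unconstrained rank-one fit
        coefs[:, i] = np.linalg.lstsq(X, target, rcond=None)[0]
        F_upd[:, i] = np.maximum(0.0, X @ coefs[:, i])  # thresholded prediction
        R = Ri - np.outer(F_upd[:, i], fi)
    return F_upd, R, coefs

def halsx_linear(project, Xr, Xc, k, n_iter=50, seed=0):
    """HALSX-style loop with linear link functions (illustrative only)."""
    rng = np.random.default_rng(seed)
    Fr = rng.random((Xr.shape[0], k))
    Fc = rng.random((Xc.shape[0], k))
    for _ in range(n_iter):
        V = project(Fr @ Fc.T)
        R = V - Fr @ Fc.T
        Fr, R, Br = regression_sweep(R, Fr, Fc, Xr)
        Fc, _, Bc = regression_sweep(R.T, Fc, Fr, Xc)
    return project(Fr @ Fc.T), Fr, Fc, Br, Bc

# Complete observations: the constraint set reduces to the true matrix.
rng = np.random.default_rng(1)
Xr = np.column_stack([np.ones(40), rng.random((40, 2))])
Xc = np.column_stack([np.ones(25), rng.random((25, 2))])
truth = (Xr @ rng.random((3, 2))) @ (Xc @ rng.random((3, 2))).T
V_hat, Fr, Fc, Br, Bc = halsx_linear(lambda X: truth, Xr, Xc, k=2)

# Factors for an unseen row follow from its features alone.
Fr_new = np.maximum(0.0, np.array([[1.0, 0.5, 0.5]]) @ Br)
```

The last line shows why the regression models matter for prediction: once `Br` is fitted, the factor of a new period is obtained from its features without any consumption data. Swapping `np.linalg.lstsq` for another regression model would change only `regression_sweep`.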
28. Local convergence of HALSX
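The following property is known for HALS (Kim, He, and Park 2014):
Property
For all $R \in \mathbb{R}^{n_1 \times n_2}$ and $y \in \mathbb{R}_+^{n_2}$, $y$ not identically zero, any vector $x^*$ that verifies
$$x^* \in \arg\min_{x \in \mathbb{R}^{n_1}} \left\| R - x y^T \right\|_F^2$$
is also a solution to
$$\min_{x \in \mathbb{R}^{n_1}} \left\| R - x_+ y^T \right\|_F^2.$$
In HALSX, we can show the following similar property:
Proposition
Suppose that $R \in \mathbb{R}^{n_1 \times n_2}$ and $f_c \in \mathbb{R}_+^{n_2}$ are not identically equal to zero, and that
$g : \mathbb{R}^d \to \mathbb{R}^{n_1}$, with $d \ge n_1$, is a convex differentiable function. Suppose
$$\theta^* \in \arg\min_{\theta \in \mathbb{R}^d} \left\| R - g(\theta) (f_c)^T \right\|_F^2.$$
If $\nabla g_{\theta^*}$, the Jacobian matrix of $g$ at $\theta^*$, is of rank $n_1$, then $\theta^*$ is also a solution to
$$\min_{\theta \in \mathbb{R}^d} \left\| R - (g(\theta))_+ (f_c)^T \right\|_F^2.$$
Then by an argument of strict quasi-convexity, we obtain the convergence result.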
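The HALS property can be checked by completing the square. The short derivation below is a sketch added for illustration; it uses only the symbols $R$, $x$ and $y$ of the property.

```latex
\begin{aligned}
\left\| R - x y^T \right\|_F^2
  &= \|R\|_F^2 - 2\, x^T R y + \|x\|^2 \|y\|^2 \\
  &= \|y\|^2 \left\| x - \frac{R y}{\|y\|^2} \right\|^2
     + \|R\|_F^2 - \frac{\|R y\|^2}{\|y\|^2}.
\end{aligned}
```

The unconstrained minimiser is therefore $x^* = R y / \|y\|^2$. Replacing $x$ by $x_+$ leaves the objective equal to $\|y\|^2 \|x_+ - x^*\|^2$ plus a constant, which is minimised coordinate by coordinate when $(x_+)_i = \max(0, x^*_i)$, and the choice $x = x^*$ attains this. Thresholding the unconstrained solution is thus enough to solve the nonnegative rank-one subproblem.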
30. Experimental setting
Three datasets are used in the experiments:
Synthetic: a rank-20, 150-by-180 nonnegative matrix simulated following the
generative model (n1 = 150, n2 = 180).
Synthetic row and column variables.
French: daily consumption of 473 medium-voltage feeders near Lyon from 2010 to
2012 (n1 = 1096, n2 = 473).
Row variables: daily temperature, calendar variables.
Column variables: the percentage of each type of clients (residential, professional,
industrial, high-voltage clients).
Portuguese: daily consumption of 370 Portuguese clients from 2010 to 2014
(n1 = 1461, n2 = 369).
Row variables: daily temperature, calendar variables.
We generate measurements by selecting a number of
observation periods,
either uniformly on the whole matrix (random),
or periodically with a randomly chosen offset
for each column (periodic).
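The two sampling schemes can be sketched as follows. The talk does not give the exact procedure, so this is one possible reading: `reading_dates` is a hypothetical helper that marks the meter reading dates of each column, the `rate` argument is the fraction of dates that are read, and the period of the periodic scheme is taken as the rounded inverse of that rate.

```python
import numpy as np

def reading_dates(n1, n2, rate, scheme, seed=0):
    """Boolean n1 x n2 matrix marking the dates at which a reading is taken."""
    rng = np.random.default_rng(seed)
    if scheme == "random":
        # Draw a fixed number of positions uniformly over the whole matrix.
        n_obs = round(rate * n1 * n2)
        flat = np.zeros(n1 * n2, dtype=bool)
        flat[rng.choice(n1 * n2, size=n_obs, replace=False)] = True
        return flat.reshape(n1, n2)
    if scheme == "periodic":
        # Same period for all columns, with a random offset per column.
        period = max(1, round(1 / rate))
        dates = np.zeros((n1, n2), dtype=bool)
        offsets = rng.integers(0, period, size=n2)
        for j, off in enumerate(offsets):
            dates[off::period, j] = True
        return dates
    raise ValueError(f"unknown scheme: {scheme}")

periodic = reading_dates(12, 3, 0.25, "periodic")
random_dates = reading_dates(12, 3, 0.25, "random")
```

Under this reading, a temporal aggregate would be the sum of a column's entries between two consecutive reading dates, which can then be encoded as a 0/1 mask of the trace regression model.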
31. Recovery or prediction
To test the prediction on new individuals and new periods, temporal aggregates
are generated on a number of observation periods over the upper-left submatrix.
An error metric (RRMSE, or $\|\hat{X} - X\|_F / \|X\|_F$) is calculated on each of the four
submatrices.
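The error metric on the four submatrices can be computed as in the sketch below. The block names follow the panel titles used in the prediction results (row, column and RC error); the function names and the way the split point is passed are assumptions of this example.

```python
import numpy as np

def rrmse(estimate, truth):
    """Relative error ||estimate - truth||_F / ||truth||_F."""
    return np.linalg.norm(estimate - truth) / np.linalg.norm(truth)

def block_errors(estimate, truth, n_rows_obs, n_cols_obs):
    """RRMSE on the four blocks defined by the observed upper-left submatrix."""
    r, c = n_rows_obs, n_cols_obs
    blocks = {
        "recovery": (slice(0, r), slice(0, c)),        # observed rows and columns
        "row": (slice(r, None), slice(0, c)),          # new periods, known individuals
        "column": (slice(0, r), slice(c, None)),       # known periods, new individuals
        "row_column": (slice(r, None), slice(c, None)),  # new periods and individuals
    }
    return {name: rrmse(estimate[idx], truth[idx]) for name, idx in blocks.items()}

# Toy check: only the new-period block of the estimate is wrong.
truth = np.ones((4, 4))
estimate = truth.copy()
estimate[2:, :2] = 2.0
errors = block_errors(estimate, truth, n_rows_obs=2, n_cols_obs=2)
```

Reporting the four blocks separately distinguishes pure recovery (upper-left) from prediction for new periods, new individuals, or both.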
32. Profiles obtained with Portuguese dataset
Using external factors, the obtained profiles present visible annual cycles.
34. Results on time series recovery
Figure: Recovery error versus sampling rate (0.1 to 0.5) on the Synthetic, Portuguese and French datasets (columns), under periodic and random sampling (rows). Algorithms compared: empty_model, softImpute, HALS, NeNMF and HALSX_model; HALSX regression models: lm, gam, gaussprRadial, svmLinear.
Using exogenous variables (HALSX_model), the recovery error is in most cases
equal to or lower than that of the NMF methods without side information (NeNMF
and HALS).
With random observation dates on the synthetic dataset, which is arguably the
least realistic case, HALSX_model is slightly worse than NeNMF and HALS.
36. Results on time series prediction
Figure: Prediction error on synthetic data. Panels: row error, column error and RC error (columns), under periodic and random sampling (rows), against the sampling rate (0.25 to 1.00). Algorithms compared: rrr, trmf, individual_gam, factor_gam and HALSX_model; HALSX regression models: lm, gam, gaussprRadial, svmLinear.
Figure: Prediction error on French data. Same panels, axes and algorithms as for the synthetic data.
On synthetic data, the error of prediction is rather low for the
three prediction types (around 10%), which is remarkable
since only very partial data was available in the first place.
On the real-world datasets, the prediction error is higher.
However, HALSX still outperforms other benchmark methods.
HALSX is not sensitive to the sampling rate.
38. Conclusions
In this talk we
formalized the temporal aggregate observations in electricity consumption as a
trace regression model;
proposed a generative low-rank matrix model to introduce side information in
NMF;
deduced HALSX, an algorithm to solve the new NMF problem;
tested it on real and synthetic electricity datasets and obtained results that are
equivalent to or better than those of reference methods.
The proposed method is implemented in an R package used internally at EDF.
39. Perspectives of the thesis
Industrial applications
Instead of estimating the whole time series, NMF can be used to directly or indirectly
estimate important statistics, such as the peak demand.
Methodological perspectives
Estimation with both spatial and temporal aggregates
Usage of social network data as column variables
Causal relationship between the presence of the measures and the data that are
measured
Neural network/deep learning with partial data
Theoretical perspectives
Is it possible to achieve global convergence of first-order NMF algorithms in special
cases?
40. References
Jacob Abernethy et al. “A New Approach to Collaborative Filtering: Operator Estimation with Spectral Regularization”. In: The
Journal of Machine Learning Research 10 (2009), pp. 803–826.
Emmanuel J. Candès and Benjamin Recht. “Exact Matrix Completion via Convex Optimization”. In: Foundations of
Computational Mathematics 9.6 (2009), pp. 717–772. doi: 10.1007/s10208-009-9045-5.
Yunmei Chen and Xiaojing Ye. “Projection Onto A Simplex”. In: arXiv preprint arXiv:1101.6081 (2011).
Andrzej Cichocki, ed. Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-Way Data Analysis and
Blind Source Separation. Chichester, U.K: John Wiley, 2009. 477 pp. isbn: 978-0-470-74666-0.
Rina Foygel et al. “Nonparametric Reduced Rank Regression”. In: Advances in Neural Information Processing Systems. 2012,
pp. 1628–1636.
Mehmet Gönen and Ethem Alpaydın. “Multiple Kernel Learning Algorithms”. In: Journal of Machine Learning Research 12 (Jul
2011), pp. 2211–2268.
Naiyang Guan et al. “NeNMF: An Optimal Gradient Method for Nonnegative Matrix Factorization”. In: IEEE Transactions on
Signal Processing 60.6 (2012), pp. 2882–2898. doi: 10.1109/TSP.2012.2190406.
Jingu Kim, Yunlong He, and Haesun Park. “Algorithms for Nonnegative Matrix and Tensor Factorizations: A Unified View Based
on Block Coordinate Descent Framework”. In: Journal of Global Optimization 58.2 (2014), pp. 285–319.
Daniel D. Lee and H. Sebastian Seung. “Learning the Parts of Objects by Non-Negative Matrix Factorization”. In: Nature
401.6755 (1999), pp. 788–791.
Ashin Mukherjee and Ji Zhu. “Reduced Rank Ridge Regression and Its Kernel Extensions”. In: Statistical Analysis and Data
Mining 4.6 (Dec. 2011), pp. 612–622. doi: 10.1002/sam.10138. pmid: 22993641.
Benjamin Recht, Maryam Fazel, and Pablo A. Parrilo. “Guaranteed Minimum-Rank Solutions of Linear Matrix Equations via
Nuclear Norm Minimization”. In: SIAM review 52.3 (2010), pp. 471–501.
Angelika Rohde and Alexandre B. Tsybakov. “Estimation of High-Dimensional Low-Rank Matrices”. In: The Annals of Statistics
39.2 (2011), pp. 887–930. doi: 10.1214/10-AOS860.
Or Zuk and Avishai Wagner. “Low-Rank Matrix Recovery from Row-and-Column Affine Measurements”. In: Proceedings of The
32nd International Conference on Machine Learning. 2015, pp. 2012–2020.
41. How to calculate PA(X)
For some forms of masks, there are efficient methods.
Matrix completion: replace the observed entries by their observed values.
Temporal aggregates: simplex projection
$$\min_{v_{I_d}} \left\| v_{I_d} - \left( (f_r)_t (F_c^T)_{n_d} \right)_{t = t_0(d)+1}^{t_0(d)+h(d)} \right\|^2 \quad \text{s.t. } v_{I_d} \ge 0,\ v_{I_d}^T \mathbf{1} = a_d.$$
An efficient simplex projection algorithm (Chen and Ye 2011) is used in this case.
General case: iterate between
$V = V + \mathcal{A}^\dagger (a - \mathcal{A}(V))$;
$v_{i,j} = \max(0, v_{i,j})$.
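Both projection strategies can be sketched in NumPy. The simplex projection below is the standard sort-based method, which is not necessarily the exact variant of Chen and Ye; in the general case, the operator is represented by a matrix `M` whose rows are the flattened masks, which is an assumption of this example, as are the function names and the fixed iteration count.

```python
import numpy as np

def project_simplex(x, total):
    """Euclidean projection of x onto {v >= 0, sum(v) = total}, with total > 0."""
    u = np.sort(x)[::-1]                      # entries in decreasing order
    css = np.cumsum(u) - total
    ks = np.arange(1, x.size + 1)
    rho = np.nonzero(u - css / ks > 0)[0][-1]  # last index kept positive
    tau = css[rho] / (rho + 1)                # common shift applied to x
    return np.maximum(x - tau, 0.0)

def project_general(X, M, a, n_iter=50):
    """Alternate between the affine constraint M v = a and the clipping step.

    M has one row per measurement (a flattened mask); pinv(M) plays the role
    of the pseudo-inverse of the measurement operator.
    """
    v = X.ravel().copy()
    M_pinv = np.linalg.pinv(M)
    for _ in range(n_iter):
        v = v + M_pinv @ (a - M @ v)          # move onto the affine set
        v = np.maximum(v, 0.0)                # clip negative entries
    return v.reshape(X.shape)

# Three readings of one aggregate whose total must equal 3.
v = project_simplex(np.array([1.0, 2.0, 3.0]), 3.0)
```

For temporal aggregates, each aggregate involves a disjoint set of entries, so `project_simplex` can be applied independently to each set with `total` equal to the observed aggregate. The alternating scheme in `project_general` returns a point that satisfies the clipping step exactly and the linear constraint approximately.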
42. Which functional spaces to choose
HALSX is rather agnostic to the choice of regression models.
There is a bias-variance trade-off between flexible models with many parameters
and simple models with few parameters.
Overfitting can be addressed by cross-validation at each model update.