In this talk, we present an autoregressive CNN architecture for predicting asynchronous time series. We illustrate its application to predicting traders’ quotes of credit default swaps (a proprietary dataset from Hellebore Capital) and to artificial time series. The paper is available here: http://proceedings.mlr.press/v80/binkowski18a/binkowski18a.pdf
A review of two decades of correlations, hierarchies, networks and clustering... - Gautier Marti
Opinionated review of two decades of correlations, hierarchies,
networks and clustering in financial markets presented at Ton Duc Thang University in Ho Chi Minh City, Vietnam.
Network and risk spillovers: a multivariate GARCH perspective - SYRTO Project
M. Billio, M. Caporin, L. Frattarolo, L. Pelizzon: “Network and risk spillovers: a multivariate GARCH perspective”.
Final SYRTO Conference - Université Paris 1 Panthéon-Sorbonne
February 19, 2016
Entropy and systemic risk measures
M. Billio, R. Casarin, M. Costola, A. Pasqualini
Ca’ Foscari University of Venice
Final SYRTO Conference - Université Paris 1 Panthéon-Sorbonne
February 19, 2016
Clustering CDS: algorithms, distances, stability and convergence rates - Gautier Marti
Talk given at CMStatistics 2016 (http://cmstatistics.org/CMStatistics2016/).
The standard methodology for clustering financial time series is quite brittle to outliers and heavy tails, for several reasons: Single Linkage / MST suffers from the chaining phenomenon, and the Pearson correlation coefficient is relevant for Gaussian distributions, which financial returns (especially for credit derivatives) usually do not follow. At Hellebore Capital Ltd, we strive to improve the methodology and to put it on firmer ground. We think that stability is a paramount property to verify, and it is closely linked to the statistical convergence rates of the methodologies (combinations of clustering algorithms and dependence estimators). This gives us a model selection criterion: the best clustering methodology is the one that reaches a given 'accuracy' with the minimum sample size.
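A minimal sketch of the kind of pipeline this implies, on hypothetical heavy-tailed synthetic data (an illustration, not Hellebore Capital's actual methodology): a rank-based Spearman correlation, converted to the metric sqrt((1 - rho) / 2), then average-linkage hierarchical clustering instead of the chaining-prone single linkage.

```python
# Sketch: robust correlation-based clustering of heavy-tailed return series.
import numpy as np
from scipy.stats import spearmanr
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

rng = np.random.default_rng(0)
# Two planted "clusters" of series sharing a common heavy-tailed factor.
common_a, common_b = rng.standard_t(3, size=(2, 500))
returns = np.vstack([common_a + 0.5 * rng.standard_t(3, 500) for _ in range(5)]
                    + [common_b + 0.5 * rng.standard_t(3, 500) for _ in range(5)])

rho, _ = spearmanr(returns, axis=1)        # rank correlation: robust to outliers
dist = np.sqrt(0.5 * (1.0 - rho))          # a proper metric built on correlation
Z = linkage(squareform(dist, checks=False), method="average")
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)                              # the two planted groups should separate
```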
Clustering in dynamic causal networks as a measure of systemic risk on the eu... - SYRTO Project
Clustering in dynamic causal networks as a measure of systemic risk on the euro zone
M. Billio, H. Gatfaoui, L. Frattarolo, P. de Peretti
IESEG / Université Paris 1 Panthéon-Sorbonne / Ca' Foscari University of Venice
Final SYRTO Conference - Université Paris 1 Panthéon-Sorbonne
February 19, 2016
Exploring Quantum Supremacy in Access Structures of Secret Sharing by Coding ... - Ryutaroh Matsumoto
Quantum supremacy or quantum advantage is the potential ability of quantum computing devices to solve problems that classical computers practically cannot (Wikipedia). The speaker recently found that quantum computation can realize secret sharing schemes that cannot be realized by any classical computation. That finding was enabled by combinatorial studies of quantum error-correcting codes and classical secret sharing. This talk introduces those studies to non-specialists with mathematical backgrounds.
Reference: arXiv:1803.10392
Options on Quantum Money: Quantum Path-Integral With Serial Shocks - AM Publications, India
The author previously developed a numerical multivariate path-integral algorithm, PATHINT, which has been applied to several classical physics systems, including statistical mechanics of neocortical interactions, options in financial markets, and other nonlinear systems including chaotic systems. A new quantum version, qPATHINT, has the ability to take into account nonlinear and time-dependent modifications of an evolving system. qPATHINT is shown to be useful to study some aspects of serial changes to systems. Applications to options on quantum money and blockchains in financial markets are discussed.
You may have already read many times that the job of a Data Scientist is to skim through a huge amount of data searching for correlations between some variables of interest. And also, that one of their worst enemies (besides "correlation doesn't imply causation") is spurious correlation. But what really is correlation? Are there several types of correlations? Some "good", some "bad"? What about their estimation? This talk will be a very visual presentation around the notion of correlation and dependence. I will first illustrate how the standard linear correlation is estimated (Pearson coefficient), then a more robust alternative: the Spearman coefficient. Building on the geometric understanding of their nature, I will present a generalization that can help Data Scientists to explore, interpret, and measure the dependence (not necessarily linear or comonotonic) between the variables of a given dataset. Financial time series (stocks, credit default swaps, fx rates), and features from the UCI datasets are considered as use cases.
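The gap between the two coefficients is easy to demonstrate on synthetic data (an illustration, not from the talk): under a strongly non-linear but monotone relationship, Pearson underestimates the dependence while Spearman, computed on ranks, detects it exactly.

```python
# Pearson vs. Spearman on a monotone but non-linear relationship.
import numpy as np
from scipy.stats import pearsonr, spearmanr

rng = np.random.default_rng(42)
x = rng.normal(size=2000)
y = np.exp(3 * x)   # strictly increasing in x, so the ranks of x and y coincide

print(pearsonr(x, y)[0])   # Pearson misses the (non-linear) monotone link
print(spearmanr(x, y)[0])  # Spearman is exactly 1: ranks are perfectly aligned
```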
Clustering Financial Time Series: How Long is Enough? - Gautier Marti
IJCAI-16, New York, conference presentation of paper http://www.ijcai.org/Proceedings/16/Papers/367.pdf
Researchers have used from 30 days to several years of daily returns as source data for clustering financial time series based on their correlations. This paper sets up a statistical framework to study the validity of such practices. We first show that clustering correlated random variables from their observed values is statistically consistent. Then, we also give a first empirical answer to the much debated question: How long should the time series be? If too short, the clusters found can be spurious; if too long, dynamics can be smoothed out.
cCorrGAN: Conditional Correlation GAN for Learning Empirical Conditional Dist... - Gautier Marti
A Generative Adversarial Networks model to generate realistic correlation matrices. In these slides, we discuss a use case in quantitative finance (comparison of risk-based portfolio allocation methods), and how to improve the seminal model with information geometry (Riemannian neural networks suited for correlation matrices). There are many use cases to explore within, and outside, quantitative finance. The Riemannian geometry of correlation matrices is still under-developed.
We highlight exciting problems at the intersection of Riemannian geometry and deep learning.
ABSTRACT: In a pair of papers from 1995 and 1997, I developed a computational theory of legal argument, but left open a question about the key concept of a "prototype." Contemporary trends in machine learning have now shed new light on the subject. In this talk, I will describe my recent work on "manifold learning," as well as some work in progress on "deep learning." Taken together, this work leads to a logical language grounded in a prototypical perceptual semantics, with implications for legal theory.
The paper talks about the pentagonal Neutrosophic sets and its operational law. The paper presents the cut of single valued pentagonal Neutrosophic numbers and additionally introduced the arithmetic operation of single-valued pentagonal Neutrosophic numbers. Here, we consider a transportation problem with pentagonal Neutrosophic numbers where the supply, demand and transportation cost is uncertain. Taking the benefits of the properties of ranking functions, our model can be changed into a relating deterministic form, which can be illuminated by any method. Our strategy is easy to assess the issue and can rank different sort of pentagonal Neutrosophic numbers. To legitimize the proposed technique, some numerical tests are given to show the adequacy of the new model.
Deep neural networks have boosted the convergence of multimedia data analytics in a unified framework shared by practitioners in natural language, vision and speech. Image captioning, lip reading or video sonorization are some of the first applications of a new and exciting field of research exploiting the generalization properties of deep neural representations. This tutorial will first review the basic neural architectures to encode and decode vision, text and audio, and later review those models that have successfully translated information across modalities. The contents of this tutorial are available at: https://telecombcn-dl.github.io/2019-mmm-tutorial/.
A crucial ingredient of a successful weather prediction system is its ability to combine observational data with the
output of numerical weather prediction models to estimate the state of the atmosphere and the oceans. This problem of estimating the state of a high-dimensional chaotic system such as the atmosphere, given noisy and partial observations of it, is known as data assimilation in the context of the earth sciences. The main object of interest in these problems is
the conditional distribution, called the posterior, of the state conditioned on the observations. Monte Carlo methods are the most commonly used techniques to study this posterior and also to use it efficiently for prediction. I will give a general introduction to the data assimilation problems and also to Monte Carlo techniques, followed by a discussion of some commonly used Monte Carlo algorithms for data assimilation.
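One of the most commonly used Monte Carlo schemes in this setting is the bootstrap particle filter; below is a minimal sketch on a toy one-dimensional linear-Gaussian model (all parameters and dimensions are illustrative; real assimilation problems are vastly higher-dimensional).

```python
# Bootstrap particle filter on a toy 1-D state-space model.
import numpy as np

rng = np.random.default_rng(1)
T, N = 50, 5000                      # time steps, particles
a, q, r = 0.9, 0.5, 0.5              # dynamics coefficient, process / obs. std

# Simulate a hidden state and noisy observations of it.
x = np.zeros(T)
for t in range(1, T):
    x[t] = a * x[t - 1] + q * rng.normal()
y = x + r * rng.normal(size=T)

particles = rng.normal(size=N)       # samples from the prior
est = np.zeros(T)
for t in range(T):
    particles = a * particles + q * rng.normal(size=N)   # propagate
    logw = -0.5 * ((y[t] - particles) / r) ** 2          # weight by likelihood
    w = np.exp(logw - logw.max()); w /= w.sum()
    est[t] = np.dot(w, particles)                        # posterior mean estimate
    particles = rng.choice(particles, size=N, p=w)       # resample

print(np.mean((est - x) ** 2))       # filtering MSE, below the raw obs. variance
```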
Kernel methods and variable selection for exploratory analysis and multi-omic... - tuxette
Nathalie Vialaneix
4th course on Computational Systems Biology of Cancer: Multi-omics and Machine Learning Approaches
International course, Curie training
https://training.institut-curie.org/courses/sysbiocancer2021
(remote)
September 29th, 2021
According to the standard Venn diagram depicting the building blocks of modern data science, algorithmics, mathematics and statistical machine learning combine to represent one of the three pillars of Data Science, along with application domain and computer science as the other two components. In this lecture, I will expose the audience to the foundational statistical machine learning methods for modern data science. The most frequently used methods of supervised learning, featuring both classification (pattern recognition) and regression are presented in greater details, with an emphasis on algorithmic clarity and statistical rigor.
These slides were used in the "Mathematics of Logistics" seminar at Nishinari Laboratory, Faculty of Engineering, the University of Tokyo.
References:
1. Mikio Kubo (2007). The Mathematics of Logistics (ロジスティクスの数理). Kyoritsu Shuppan.
2. Dimitri P. Bertsekas (2005). Dynamic Programming and Optimal Control. Athena Scientific. Vol. 1-2, 4th edition.
Learning Intrusion Prevention Policies Through Optimal Stopping - Kim Hammar
We study automated intrusion prevention using reinforcement learning. In a novel approach, we formulate the problem of intrusion prevention as an optimal multiple stopping problem. This formulation gives us insight into the structure of the optimal policies, which we show to be threshold based. Since the computation of the optimal defender policy using dynamic programming is not feasible for practical cases, we develop a reinforcement learning approach to approximate the optimal policy in a target infrastructure. The approach uses an emulation of the infrastructure to evaluate policies and to instantiate a simulation model which is then used to train policies through reinforcement learning. Our results show that the learned policies are close to optimal and that they indeed can be expressed using thresholds.
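A toy illustration of what a threshold-based stopping policy looks like (everything below, including the belief dynamics, the threshold, and the single stop action, is assumed for illustration; the paper's setting involves multiple stops and policies learned over an emulated infrastructure):

```python
# A defender observes a noisy intrusion "belief" and stops once it crosses
# a fixed threshold.
import numpy as np

def run_episode(threshold, intrusion_start=50, horizon=100, seed=0):
    """Return the step at which the defender stops, or horizon if never."""
    rng = np.random.default_rng(seed)
    belief = 0.0
    for t in range(horizon):
        # The belief drifts upward after the intrusion starts, plus noise.
        drift = 0.05 if t >= intrusion_start else 0.0
        belief = min(1.0, max(0.0, belief + drift + 0.01 * rng.normal()))
        if belief >= threshold:
            return t
    return horizon

stop_time = run_episode(threshold=0.5)
print(stop_time)   # shortly after the intrusion begins at t = 50
```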
MediaEval 2018: Ensembled Convolutional Neural Network Models for Retrieving ... - multimediaeval
Paper: http://ceur-ws.org/Vol-2283/MediaEval_18_paper_27.pdf
Youtube: https://youtu.be/iDwuoVfpDKQ
Yu Feng, Sergiy Shebotnov, Claus Brenner, Monika Sester, Ensembled Convolutional Neural Network Models for Retrieving Flood Relevant Tweets. Proc. of MediaEval 2018, 29-31 October 2018, Sophia Antipolis, France.
Abstract: Social media, which provides instant textual and visual information exchange, plays a more important role in emergency response than ever before. Many researchers nowadays are focusing on disaster monitoring using crowdsourcing. Interpretation and retrieval of such information significantly influences the efficiency of these applications. This paper presents a method proposed by team EVUS-ikg for the MediaEval 2018 challenge on the Multimedia Satellite Task. We only focused on the subtask “flood classification for social multimedia”. A supervised learning method with an ensemble of 10 Convolutional Neural Networks (CNN) was applied to classify the tweets in the benchmark.
Presented by Yu Feng
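One common way to combine such an ensemble's outputs is to average the per-model class probabilities and take the argmax (whether this paper averages probabilities or uses another combination rule is not stated here; the predictions below are random placeholders standing in for the 10 trained CNNs).

```python
# Ensemble by probability averaging, with stand-in model outputs.
import numpy as np

rng = np.random.default_rng(0)
n_models, n_tweets, n_classes = 10, 6, 2        # flood / not-flood
# Random placeholder softmax outputs, one (n_tweets, n_classes) array per model.
probs = rng.dirichlet(np.ones(n_classes), size=(n_models, n_tweets))

ensemble = probs.mean(axis=0)                   # average the 10 models' outputs
labels = ensemble.argmax(axis=1)                # final class per tweet
print(labels)
```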
A FRIENDLY APPROACH TO PARTICLE FILTERS IN COMPUTER VISION - Marcos Nieto
This is a friendly approach to particle filters: some hints, examples, and good practices to be able to successfully apply particle filters to solve your computer vision problems.
Similar to Autoregressive Convolutional Neural Networks for Asynchronous Time Series
Using Large Language Models in 10 Lines of Code - Gautier Marti
Modern NLP models can be daunting: No more bag-of-words but complex neural network architectures, with billions of parameters. Engineers, financial analysts, entrepreneurs, and mere tinkerers, fear not! You can get started with as little as 10 lines of code.
Presentation prepared for the Abu Dhabi Machine Learning Meetup Season 3 Episode 3 hosted at ADGM in Abu Dhabi.
... two decades of correlation, hierarchies, networks and clustering in financial markets
Summary of some of my past research work at Complex Networks 2022.
The study of correlations, hierarchies, networks and communities (or clustering) has more than 20 years of history in econophysics.
However, for the practitioner, it seems that these tools are not fully ready yet:
Many questions around their proper use for trading or risk monitoring are left unanswered.
Deep Learning might help solve some hard problems such as finding more reliably communities (or clusters) and their number.
Running large simulations (based on GANs, VAEs or realistic market simulators) could also help understand when complex networks methods can give wrong insights (e.g. not enough data, or not stationary enough; too low correlations).
Conference: Complex Networks 2022 in Palermo, Sicily, Italy.
A quick demo of Top2Vec With application on 2020 10-K business descriptions - Gautier Marti
A short presentation I did at the Hong Kong Machine Learning Meetup Season 4 Episode 4. Top2Vec is a novel method to find topics in a corpus of documents. It can automatically find a relevant number of topics in the corpus. You also get relevant word and document vectors for further processing.
How deep generative models can help quants reduce the risk of overfitting? - Gautier Marti
How can deep generative models help quants reduce the risk of overfitting? Applications of GANs for Quants.
Presentation at the "QuantUniversity Autumn School 2020".
Generating Realistic Synthetic Data in Finance - Gautier Marti
Talk at IHS Markit Webinar (15 October 2020) on the potential applications of GANs in Finance. These models could be useful for quants and their managers to avoid overfitting, for portfolio and risk managers for proper capital and risk allocation, for cloud computing services willing to work with banks and other sensitive-data-rich organizations, for auditors and regulators to detect fraud, and for data vendors (such as IHS Markit) to bring new products to market and iterate quickly with clients.
This presentation highlights potential use cases of deep generative models, and Generative Adversarial Networks (GANs) in particular, in Finance. Essentially, these models are useful to generate realistic synthetic datasets. Quantitative Strategists, Traders, Asset and Risk Managers can find these novel techniques useful. Auditors and Regulators should also become aware of their existence, as they may be a source of new accounting fraud and misleading financial statements (deepfakes).
My recent attempts at using GANs for simulating realistic stocks returns - Gautier Marti
A presentation for the Hong Kong Machine Learning meetup summarizing my hobby research over the past year. My goal is to be able to simulate realistic multivariate financial time series. If so, I will be able to compare different statistical methods for portfolio construction, studying complex networks, algorithmic trading, being able to do some reinforcement learning, etc. Still far from being achieved...
Takeaways from ICML 2019, Long Beach, California - Gautier Marti
A few slides that highlight some of my personal takeaways from the ICML 2019 conference. I tried to identify niche trends such as Shapley values, topological data analysis, Hawkes processes...
Clustering Financial Time Series using their Correlations and their Distribut... - Gautier Marti
We have designed a distance that takes into account both the correlation between the time series and the distribution of the individual time series. A tutorial with Python code is available: https://www.datagrapple.com/Tech/GNPR-tutorial-How-to-cluster-random-walks.html
This talk was given at the Paris Machine Learning Meetup.
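The linked tutorial gives the exact GNPR construction; a rough sketch of the underlying idea (a rank-correlation term blended with a marginal-distribution term; `theta` below is an assumed mixing weight, not necessarily the paper's parameterization) might look like:

```python
# Sketch of a distance blending dependence and distribution information.
import numpy as np
from scipy.stats import spearmanr, wasserstein_distance

def blended_distance(x, y, theta=0.5):
    """theta=1: pure dependence part; theta=0: pure distribution part."""
    rho = spearmanr(x, y)[0]
    d_dep = 0.5 * (1.0 - rho)            # in [0, 1], small when co-monotone
    d_dist = wasserstein_distance(x, y)  # compares the two marginal laws
    return theta * d_dep + (1.0 - theta) * d_dist

rng = np.random.default_rng(3)
z = rng.normal(size=1000)
x = z + 0.1 * rng.normal(size=1000)      # highly correlated with z
y = 5 + z + 0.1 * rng.normal(size=1000)  # same dependence, shifted marginal

print(blended_distance(x, y, theta=1.0))  # small: dependence is strong
print(blended_distance(x, y, theta=0.0))  # large: marginals differ by a shift of 5
```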
Optimal Transport vs. Fisher-Rao distance between Copulas - Gautier Marti
How can we compare two dependence structures (represented by copulas)? It depends on the task. For clustering variables with similar dependence, prefer Optimal Transport. For detecting change points in a dynamical dependence structure, prefer Fisher-Rao and its associated f-divergences (for example, an approach à la Frédéric Barbaresco in radar signal processing). This study illustrates these properties with bivariate Gaussian copulas.
On Clustering Financial Time Series - Beyond Correlation - Gautier Marti
Financial correlation matrices are noisy. Most of their coefficients are meaningless. Random Matrix Theory (RMT) suggests that the intrinsic dimension is much lower than O(N^2). Clustering can help to reduce the dimension. But, it can also work on other information than mere correlation...
Optimal Transport between Copulas for Clustering Time Series - Gautier Marti
Presentation slides of our ICASSP 2016 conference paper in Shanghai. They describe the motivation and design of the Target Dependence Coefficient, a coefficient which can target or forget specific dependence relationships between the variables. This coefficient can be useful for clustering financial time series. Several such use cases are described on our Tech Blog: https://www.datagrapple.com/Tech/optimal-copula-transport.html
On the stability of clustering financial time series - Gautier Marti
Talk at IEEE ICMLA 2015 Miami
In this presentation, we suggest some data perturbations that can help to validate or reject a clustering methodology besides yielding insights on the time series at hand. We show in this study that Pearson correlation is not that relevant for clustering these time series since it yields unstable clusters; prefer a more robust measure such as Spearman correlation based on rank statistics.
Adjusting OpenMP PageRank : SHORT REPORT / NOTES - Subhajit Sahu
For massive graphs that fit in RAM, but not in GPU memory, it is possible to take advantage of a shared-memory system with multiple CPUs, each with multiple cores, to accelerate PageRank computation. If the NUMA architecture of the system is properly taken into account with good vertex partitioning, the speedup can be significant. To take steps in this direction, experiments are conducted to implement PageRank in OpenMP using two different approaches, uniform and hybrid. The uniform approach runs all primitives required for PageRank in OpenMP mode (with multiple threads); the hybrid approach runs certain primitives (e.g., sumAt, multiply) in sequential mode.
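The computation being parallelized can be sketched in a few lines. The sequential Python version below is illustrative only (the report's implementation uses OpenMP threads, not Python); it shows the per-iteration work that the uniform and hybrid approaches split across threads.

```python
# Sequential PageRank power iteration on an adjacency list.
import numpy as np

def pagerank(adj, damping=0.85, tol=1e-10, max_iter=100):
    n = len(adj)
    out_deg = np.array([max(len(nbrs), 1) for nbrs in adj])
    rank = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new = np.full(n, (1.0 - damping) / n)
        for u, nbrs in enumerate(adj):            # scatter rank along edges
            for v in nbrs:
                new[v] += damping * rank[u] / out_deg[u]
        if np.abs(new - rank).sum() < tol:        # L1 convergence check
            return new
        rank = new
    return rank

# Tiny 4-vertex example: adj[u] lists the vertices u links to.
adj = [[1, 2], [2], [0], [2]]
r = pagerank(adj)
print(r)   # ranks sum to ~1; vertex 2 (most in-links) ranks highest
```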
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... - John Andrews
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Round table discussion of vector databases, unstructured data, AI, big data, real-time systems, robots and Milvus.
A lively discussion with the NJ Gen AI Meetup Lead, Prasad, and Procure.FYI's Co-Founder
Autoregressive Convolutional Neural Networks for Asynchronous Time Series
1. Autoregressive Convolutional Neural Networks for
Asynchronous Time Series
Hong Kong Machine Learning Meetup - Season 1 Episode 1
Mikołaj Bińkowski, Gautier Marti, Philippe Donnat
Imperial College London, École Polytechnique, Hellebore Capital
18 July 2018
HELLEBORE CAPITAL
Mikołaj Bińkowski, Gautier Marti, Philippe Donnat (Imperial College) - CNNs for Asynchronous Time Series - 18 July 2018 - 1 / 10
2.-4. Introduction
Problem: Many real-world time series are asynchronous, i.e.
the durations between consecutive observations are irregular/random, or
the separate dimensions are not observed simultaneously.
At the same time:
time series models usually require both regularity of observations and simultaneous sampling of all dimensions,
continuous-time models often require simultaneous sampling.
Numerous interpolation methods have been developed for preprocessing of asynchronous series. However,...
5.-7. Drawbacks of synchronous sampling
... every interpolation method leads to either increase in the number of data points or loss of data.
[Figure: the original series, resampled at frequency = 10s (information loss) and at frequency = 1s (12x more points)]
But the situation can be much worse...
8.-10. Drawbacks of synchronous sampling
[Figure: evolution of quoted prices throughout one day; bid and ask quotes from sources A, B, C and D]
Objectives:
Propose an alternative representation of asynchronous data,
Find a neural network architecture appropriate for such a representation.
11-15. How to deal with asynchronous data?
[Figure: two series, X and Y, observed at irregular times on a common time axis; value vs. time.]
Each observation is encoded as one column of the following representation: a one-hot indicator of the source series (X or Y), the observed value, and the duration elapsed since the previous observation.

X indicator    1    0    0    1    1    0    1    0
value        4.0  7.5  9.0  2.3  7.7  5.0  4.5  5.1
Y indicator    0    1    1    0    0    1    0    1
duration      .3   .7   .5   .3   .9   .6   .7  1.3
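A minimal sketch of this encoding. The event list and helper below are hypothetical and only illustrate the column layout, not the slide's exact numbers:

```python
import numpy as np

# Hypothetical interleaved observations: (time, source, value).
events = [(0.3, "X", 4.0), (1.0, "Y", 7.5), (1.5, "Y", 9.0), (1.8, "X", 2.3),
          (2.7, "X", 7.7), (3.5, "Y", 5.0), (4.2, "X", 4.5)]

def encode(events, sources=("X", "Y")):
    """One column per observation: source indicators, value, duration."""
    cols, prev_t = [], 0.0
    for t, src, v in events:
        onehot = [1.0 if src == s else 0.0 for s in sources]
        cols.append([*onehot, v, t - prev_t])
        prev_t = t
    # Rows: X indicator, Y indicator, value, duration.
    return np.array(cols).T

rep = encode(events)
print(rep.shape)  # (4, 7): 4 feature rows, one column per observation
```

The representation keeps every observation (no interpolation) while making the irregular timing explicit through the duration row.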
16-17. Unsatisfactory performance of neural nets
Architectures such as Long Short-Term Memory (LSTM) and Convolutional Neural Networks (CNNs) do not perform as well as expected compared to a simple autoregressive (AR) model,
$X_n = \sum_{m=1}^{M} a_m X_{n-m} + \varepsilon_n. \quad (1)$
Idea: equip the AR model with data-dependent weights,
$X_n = \sum_{m=1}^{M} a_m(X_{n-m}) X_{n-m} + \varepsilon_n. \quad (2)$
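For reference, the fixed-weight AR baseline (1) can be fit by ordinary least squares. A toy sketch on simulated data, with made-up coefficients:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate an AR(2) process: X_n = 0.6 X_{n-1} - 0.3 X_{n-2} + eps_n.
a = np.array([0.6, -0.3])
x = np.zeros(500)
for n in range(2, 500):
    x[n] = a[0] * x[n - 1] + a[1] * x[n - 2] + 0.1 * rng.standard_normal()

# Fit fixed AR weights by least squares: X_n ~ sum_m a_m X_{n-m}.
M = 2
X = np.column_stack([x[M - m:len(x) - m] for m in range(1, M + 1)])
y = x[M:]
a_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(a_hat)  # close to the true coefficients [0.6, -0.3]
```

The data-dependent version (2) replaces the constants a_m by functions of the lagged inputs, which is what the architecture below parameterizes with convolutions.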
18-28. Proposed Architecture
The model predicts
$y_n = \mathbb{E}[x^I_n \mid x^{-M}_n]$,
where
$x^{-M}_n = (x_{n-1}, \ldots, x_{n-M})$ are the regressors and
$I = (i_1, i_2, \ldots, i_{d_I})$ are the target dimensions,
with the prediction
$\hat{y}_n = \sum_{m=1}^{M} \underbrace{W_{\cdot,m} \otimes \sigma(S(x^{-M}_n))_{\cdot,m}}_{\text{data-dependent weights}} \otimes \underbrace{\big(\mathrm{off}(x_{n-m}) + x^I_{n-m}\big)}_{\text{adjusted regressors}}.$
[Architecture diagram: the d-dimensional input series $(x_{t-6}, \ldots, x_{t-1})$ feeds two sub-networks.
Significance network: $(N_S - 1)$ layers of k x 1 convolutions with c channels, followed by a k x 1 convolution with $d_I$ channels producing S, normalized by $\sigma$.
Offset network: $(N_{\mathrm{off}} - 1)$ layers of 1 x 1 convolutions with c channels, followed by a 1 x 1 convolution with $d_I$ channels producing off.
Weighting: $H_{n-1} = \sigma(S) \otimes (\mathrm{off} + x^I)$.
Locally connected output layer (fully connected for each of the $d_I$ dimensions): $H_n = W H_{n-1} + b$, yielding $\hat{x}^I_t$.]
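The weighting formula can be sketched in NumPy. Random linear maps stand in for the two convolutional sub-networks, and softmax is assumed as the normalizing $\sigma$; everything here is illustrative, not the trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

d, d_I, M = 4, 1, 8                   # input dims, target dims, number of lags
x = rng.standard_normal((d, M))       # past regressors x_n^{-M}
x_I = x[:d_I]                         # target dimensions among the inputs

# Stand-ins for the two sub-networks: random linear maps here; in the model
# they are stacks of k x 1 (significance) and 1 x 1 (offset) convolutions.
W_sig = rng.standard_normal((d_I, d))
W_off = rng.standard_normal((d_I, d))

S = W_sig @ x                         # significance scores S(x), shape (d_I, M)
off = W_off @ x                       # offsets off(x), shape (d_I, M)

W = rng.standard_normal((d_I, M))     # final lag weights W
y_hat = (W * softmax(S) * (off + x_I)).sum(axis=-1)   # shape (d_I,)
print(y_hat.shape)
```

The significance branch decides how much each lag should count, while the offset branch nudges the raw past values; the prediction stays close to an AR combination of (adjusted) past observations.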
29-30. Experiments
Datasets:
artificially generated (synchronous and asynchronous),
Electricity consumption [UCI repository],
Quotes [16 tasks].
Benchmarks:
(linear) VAR model,
vanilla LSTM, 1D-CNN,
25-layer convolutional ResNet,
Phased LSTM [Neil et al. 2016].
[Figure: MSE on each task (Sync 16, Sync 64, Async 16, Async 64, Electricity, Quotes; y-axis 0.0-1.4) for VAR, CNN, ResNet, LSTM, Phased LSTM and SOCNN (ours).]
31-32. Experiments #2
Ablation study: the Significance network needs more depth than the Offset network.
Past observations are fairly good predictors; we just need to weight them.
Robustness: what happens to the error if we add noise to the input?
[Figure: test error as a function of added noise, in standard deviations.]
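A robustness check of this kind can be sketched as follows. The predictor and data are a toy stand-in, purely illustrative of the evaluation protocol:

```python
import numpy as np

rng = np.random.default_rng(0)

def mse_under_noise(predict, X, y, sigmas):
    """Test-time robustness: MSE as input noise (in standard deviations) grows."""
    scale = X.std(axis=0)
    return [np.mean((predict(X + rng.standard_normal(X.shape) * s * scale) - y) ** 2)
            for s in sigmas]

# Toy linear predictor on synthetic data (illustrative only).
X = rng.standard_normal((200, 3))
w = np.array([0.5, -0.2, 0.1])
y = X @ w
errs = mse_under_noise(lambda Z: Z @ w, X, y, sigmas=[0.0, 0.5, 1.0])
print(errs)  # error grows with the noise level
```

A model whose error curve rises slowly under such perturbations is more robust to noisy inputs, which matters for quote data where individual observations can be unreliable.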