Slides for the seminar on Probabilistic Models in Recommender Systems. Topics covered: tensor factorization, state-space models, and Dynamic Bayesian PMF (via HDP).
Probabilistic Models in Recommender Systems: Time Variant Models
1. 2015-12-10
Eliezer de Souza da Silva (State-space models, Dynamic PMF via HDP)
Tomasz Kuśmierczyk (Tensor factorization)
Session 3: Time variant models
Tensor factorization
State-space models
Dynamic Bayesian PMF (via HDP)
Approximate and Scalable Inference for Complex
Probabilistic Models in Recommender Systems
Part 1: Models and Representations
2. Literature / Sources
● Temporal Collaborative Filtering with Bayesian Probabilistic Tensor Factorization -- Xiong, L., Chen, X., Huang, T. K., Schneider, J. G., & Carbonell, J. G. 2010. SDM Proceedings.
● Dynamic Matrix Factorization: A State-Space Approach -- John Z. Sun, Kush R. Varshney and Karthik Subbian. 2012. ICASSP.
● Dynamic Bayesian Probabilistic Matrix Factorization -- Sotirios P. Chatzis. 2014. AAAI.
6. Tensors generalization (multi-way data)
- P-mode tensor of dimensions M_1 x … x M_P (example: observations x measurements x time x equipment)
- Multiple relationships between multidimensional variables
- Focus on the 3-way case: canonical decomposition / parallel factor analysis (CP)
7. CP Tensor Factorization (current case: 3-way analysis)
[Figure: tensor of normalized ratings with dimensions N users x M items x K contexts, decomposed into latent factors 1 to D]
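The 3-way CP model on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the procedure from the cited paper: the factor matrix names U, V and T, the sizes, and the random values are all assumptions made for the example. Each predicted rating is the sum over the D latent dimensions of the product of a user factor, an item factor and a context factor.

```python
import numpy as np

def cp_reconstruct(U, V, T):
    """Full predicted tensor: R[i, j, k] = sum_d U[i, d] * V[j, d] * T[k, d]."""
    return np.einsum("id,jd,kd->ijk", U, V, T)

def cp_predict(U, V, T, i, j, k):
    """Predicted rating of user i for item j in context k."""
    return float(np.sum(U[i] * V[j] * T[k]))

# Hypothetical sizes: N users, M items, K contexts, D latent dimensions.
N, M, K, D = 4, 5, 3, 2
rng = np.random.default_rng(0)
U = rng.normal(size=(N, D))  # user factors
V = rng.normal(size=(M, D))  # item factors
T = rng.normal(size=(K, D))  # context (e.g. time) factors

R_hat = cp_reconstruct(U, V, T)
print(R_hat.shape)
print(cp_predict(U, V, T, 1, 2, 0))
```

In practice the full tensor is rarely materialized; only the entries for observed (user, item, context) triples are evaluated, as in `cp_predict`.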
10. Temporal ...
● One additional type of context: time
(3D tensor instead of the 2D matrix R)
● In practice:
○ ECCO sales: two context values per season (early/late season)
○ Netflix, MovieLens: one context value per month
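As an illustration of the "one context value per month" choice, the sketch below maps rating timestamps to a 0-based month index that can serve as the third tensor coordinate. The function name `month_context`, the start date and the sample ratings are made up for the example and do not come from any of the datasets mentioned.

```python
from datetime import datetime

def month_context(ts, start):
    """0-based number of calendar months between start and ts."""
    return (ts.year - start.year) * 12 + (ts.month - start.month)

# Hypothetical ratings: (user, item, rating, timestamp).
start = datetime(2000, 1, 1)
ratings = [
    (0, 3, 4.0, datetime(2000, 1, 15)),
    (1, 3, 2.0, datetime(2000, 1, 31)),
    (0, 7, 5.0, datetime(2000, 3, 9)),
    (2, 1, 3.0, datetime(2001, 1, 1)),
]

# Each rating becomes one observed tensor entry (user, item, context).
contexts = [month_context(ts, start) for (_, _, _, ts) in ratings]
print(contexts)  # [0, 0, 2, 12]
```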
25. Linear state-space approach
- User latent factors are time dependent
- Gaussian assumptions for the dynamics allow exact inference
26. Linear state-space approach
- User latent factors are time dependent
- User latent factors are hidden states in a state-space system
[Figure: time-dependent user features]
27. Linear state-space approach
- Item latent factors are stationary
- Ratings are time dependent and observed
[Figure: stationary item factors, time-dependent ratings, time-dependent user features]
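The structure described on these slides (hidden, time-dependent user factors, stationary item factors, observed ratings) can be simulated with a small generative sketch. Everything concrete here is an assumption for illustration: the transition matrix A, the item factor matrix H, and the noise scales sigma_U, sigma_Q and sigma_R are hypothetical values, and for simplicity the user rates every item at every time step.

```python
import numpy as np

def simulate_user(A, H, sigma_U, sigma_Q, sigma_R, steps, rng):
    """Simulate one user's hidden factors and observed ratings over time."""
    D = A.shape[0]
    x = rng.normal(scale=sigma_U, size=D)  # initial user factors
    states, ratings = [], []
    for _ in range(steps):
        # Hidden state: user factors evolve linearly with Gaussian noise.
        x = A @ x + rng.normal(scale=sigma_Q, size=D)
        # Observation: ratings from stationary item factors H plus noise.
        r = H @ x + rng.normal(scale=sigma_R, size=H.shape[0])
        states.append(x)
        ratings.append(r)
    return np.array(states), np.array(ratings)

rng = np.random.default_rng(0)
A = 0.95 * np.eye(2)         # hypothetical transition matrix
H = rng.normal(size=(3, 2))  # hypothetical factors for 3 items
states, ratings = simulate_user(A, H, 1.0, 0.1, 0.5, steps=6, rng=rng)
print(states.shape, ratings.shape)  # (6, 2) (6, 3)
```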
32. PMF meets Kalman
- Parameters are time-independent
- Initial state is i.i.d. zero-mean Gaussian for all users, with a shared scale of preferences σ_U
- Process noise (time evolution of user preferences) and measurement noise (estimation of the rating from user and item latent factors) are i.i.d. zero-mean Gaussians with scales σ_Q and σ_R
- Transitions (A) and measurements (item latent factors H) can be estimated by maximizing the log-likelihood.
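With Gaussian noise and linear dynamics, the user factors can be tracked by a standard Kalman filter. The sketch below shows one predict/update step for a single user. It assumes isotropic noise covariances built from the scales above, and treats H as the factor rows of the items the user rated at this step; the function name and the numeric values are hypothetical.

```python
import numpy as np

def kalman_step(x, P, A, H, y, sigma_Q, sigma_R):
    """One Kalman predict/update step for a single user's latent factors.

    x, P : previous posterior mean and covariance of the user factors
    A    : transition matrix
    H    : latent factors (rows) of the items rated at this step
    y    : the observed ratings for those items
    """
    D = x.shape[0]
    # Predict: propagate the user factors through the linear dynamics.
    x_pred = A @ x
    P_pred = A @ P @ A.T + sigma_Q**2 * np.eye(D)
    # Update: correct the prediction with the observed ratings.
    S = H @ P_pred @ H.T + sigma_R**2 * np.eye(H.shape[0])
    K = np.linalg.solve(S, H @ P_pred).T  # Kalman gain P_pred H^T S^-1
    x_new = x_pred + K @ (y - H @ x_pred)
    P_new = (np.eye(D) - K @ H) @ P_pred
    return x_new, P_new

# Hypothetical example: 2 latent dimensions, 2 rated items.
sigma_U, sigma_Q, sigma_R = 1.0, 0.1, 0.5
x0 = np.zeros(2)               # zero-mean initial state
P0 = sigma_U**2 * np.eye(2)    # initial covariance
A = np.eye(2)
H = np.array([[1.0, 0.0], [0.5, 0.5]])
y = np.array([1.0, 0.8])
x1, P1 = kalman_step(x0, P0, A, H, y, sigma_Q, sigma_R)
print(x1)
```

Running this step over all time stamps gives filtered estimates of each user's factors; a smoothing pass would additionally use later ratings.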
33. PMF meets Kalman: learning the parameters
- EM with expected joint likelihood maximization
- Other approaches: minimizing the residual prediction error, maximizing the prediction likelihood, maximizing the measurement likelihood, optimizing the performance after smoothing.
35. Dynamic Bayesian Probabilistic Matrix Factorization
- User patterns changing over time
- Groups of users share latent structure (clustering of user features)
- Capture the dynamics of the generative process of the group structure
- dHDP: dynamic hierarchical Dirichlet process
38. Dirichlet process
- Distribution over distributions (each draw is a discrete distribution with infinitely many atoms)
- Clustering effect: the rich get richer
- Chinese restaurant process
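The rich-get-richer clustering effect can be seen by sampling from a Chinese restaurant process. In this minimal sketch, a new customer joins an existing table with probability proportional to the number of people already seated there, or opens a new table with probability proportional to a concentration parameter. The function name, the parameter name `alpha` and the chosen values are assumptions for illustration.

```python
import numpy as np

def chinese_restaurant_process(n_customers, alpha, rng):
    """Sample table assignments for n_customers under a CRP with concentration alpha."""
    tables = []       # tables[t] = number of customers seated at table t
    assignments = []  # assignments[n] = table index of customer n
    for n in range(n_customers):
        # Existing table t: prob tables[t] / (n + alpha); new table: alpha / (n + alpha).
        probs = np.array(tables + [alpha], dtype=float) / (n + alpha)
        t = int(rng.choice(len(probs), p=probs))
        if t == len(tables):
            tables.append(1)  # open a new table
        else:
            tables[t] += 1    # join a table; larger tables attract more customers
        assignments.append(t)
    return assignments, tables

rng = np.random.default_rng(0)
assignments, tables = chinese_restaurant_process(100, alpha=1.0, rng=rng)
print(len(tables), sorted(tables, reverse=True))
```

The number of occupied tables is not fixed in advance and tends to grow slowly with the number of customers, which is what lets a Dirichlet process mixture choose the number of user groups from the data.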