This document summarizes the research paper "Session-Based Recommendations with Recurrent Neural Networks" (Balazs Hidasi, Alexandros Karatzoglou et al.). It introduces the factor-model and neighborhood approaches commonly used for recommendation, then discusses the limits of these approaches on session data and why RNNs are well suited to sequential session information. The proposed model uses GRUs to process item sequences, outputs predicted item scores, and is trained on mini-batches with a ranking loss to optimize next-item prediction. Experiments evaluate the model on two datasets.
Artificial Neural Networks have been used very successfully in several machine learning applications and are often the building blocks of deep learning systems. We discuss the hypothesis, training with backpropagation, update methods, and regularization techniques.
Artificial Intelligence, Machine Learning and Deep Learning (Sujit Pal)
Slides for a talk Abhishek Sharma and I gave at the Gennovation tech talks (https://gennovationtalks.com/) at Genesis. The talk was part of outreach for the Deep Learning Enthusiasts meetup group in San Francisco. My part of the talk is covered in slides 19-34.
With the explosive growth of online information, recommender systems have become an effective tool to overcome information overload and promote sales. In recent years, deep learning's revolutionary advances in speech recognition, image analysis and natural language processing have gained significant attention. Recent studies also demonstrate its efficacy in information retrieval and recommendation tasks. Applying deep learning techniques to recommender systems has been gaining momentum due to its state-of-the-art performance. In this talk, I will present recent developments in deep learning based recommender models and highlight some future challenges and open issues in this research field.
This presentation is a part of ML Course and this deals with some of the basic concepts such as different types of learning, definitions of classification and regression, decision surfaces etc. This slide set also outlines the Perceptron Learning algorithm as a starter to other complex models to follow in the rest of the course.
Overview of TensorFlow For Natural Language Processing (ananth)
TensorFlow, recently open-sourced by Google, is one of the key frameworks that support the development of deep learning architectures. In this slide set, part 1, we get started with a few basic primitives of TensorFlow. We will also discuss when and when not to use TensorFlow.
Learning to rank (LTR) for information retrieval (IR) involves the application of machine learning models to rank artifacts, such as items to be recommended, in response to a user's need. LTR models typically employ training data, such as human relevance labels and click data, to discriminatively train towards an IR objective. The focus of this tutorial will be on the fundamentals of neural networks and their applications to learning to rank.
In this presentation we describe the formulation of the HMM as consisting of hidden states that generate the observables. We introduce the 3 basic problems: finding the probability of a sequence of observations given the model; the decoding problem of finding the hidden states given the observations and the model; and the training problem of determining the model parameters that generate the given observations. We discuss the Forward, Backward, Viterbi and Forward-Backward algorithms.
Generative Adversarial Networks: Basic architecture and variants (ananth)
In this presentation we review the fundamentals behind GANs and look at different variants. We quickly review the theory, such as the cost functions, training procedure and challenges, and go on to look at variants such as CycleGAN, SAGAN, etc.
This is the first lecture on Applied Machine Learning. The course focuses on the emerging and modern aspects of this subject, such as Deep Learning, Recurrent and Recursive Neural Networks (RNN), Long Short Term Memory (LSTM), Convolutional Neural Networks (CNN), and Hidden Markov Models (HMM). It deals with several application areas, such as Natural Language Processing and Image Understanding. This presentation provides the landscape.
Artificial Intelligence Course: Linear models (ananth)
In this presentation we present the linear models: regression and classification, illustrated with several examples. Concepts such as underfitting (bias) and overfitting (variance) are presented. Linear models can be used as stand-alone classifiers in simple cases, and they are essential building blocks of larger deep learning networks.
Adversarial Reinforced Learning for Unsupervised Domain Adaptation (taeseon ryu)
Hello, this is the deep learning paper reading group. Today's paper review video covers "Adversarial Reinforced Learning for Unsupervised Domain Adaptation", presented at WACV 2021.
Automating data classification requires a large amount of training data. Domain adaptation, which reuses a model trained on labeled data and applies it to a new domain, has therefore attracted a great deal of attention.
The paper has three main contributions.
First, it proposes a framework that performs domain adaptation in an unsupervised manner using a GAN, where a reinforcement learning model is used to select the optimal feature pairs between the source and target domains.
Second, to find the most suitable features in the unlabeled target domain, it develops a policy that uses the correlation between source and target as the reward.
Finally, the proposed adversarial reinforcement learning model improves performance over the state of the art by searching for feature pairs that minimize the distance between the source and target domains and by learning to align the distributions of the two domains.
Lee Geun-bae of the fundamentals team contributed a great deal to this detailed review of the paper!
Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur... (Alessandro Suglia)
Presentation for "Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neural Networks" at the 7th Italian Information Retrieval Workshop.
See paper: http://ceur-ws.org/Vol-1653/paper_11.pdf
These are the slides from the workshop "Introduction to Machine Learning with R", which I gave at the University of Heidelberg, Germany on June 28th 2018.
The accompanying code to generate all plots in these slides (plus additional code) can be found on my blog: https://shirinsplayground.netlify.com/2018/06/intro_to_ml_workshop_heidelberg/
The workshop covered the basics of machine learning. With an example dataset I went through a standard machine learning workflow in R with the packages caret and h2o:
- reading in data
- exploratory data analysis
- missingness
- feature engineering
- training and test split
- model training with Random Forests, Gradient Boosting, Neural Nets, etc.
- hyperparameter tuning
A fast-paced introduction to Deep Learning concepts, such as activation functions, cost functions and backpropagation, and then a quick dive into CNNs. Basic knowledge of vectors, matrices, and elementary calculus (derivatives) is helpful in order to derive the maximum benefit from this session.
Next we'll see a simple neural network using Keras, followed by an introduction to TensorFlow and TensorBoard. (Bonus points if you know Zorn's Lemma, the Well-Ordering Theorem, and the Axiom of Choice.)
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES (Vikash Kumar)
Image classification using the KNN, Random Forest and SVM algorithms on glaucoma datasets, explaining the accuracy, sensitivity, and specificity of each algorithm.
An introduction to Deep Learning concepts, with a simple yet complete neural network, CNNs, followed by rudimentary concepts of Keras and TensorFlow, and some simple code fragments.
Methodological study of opinion mining and sentiment analysis techniques (ijsc)
Decision making, at both the individual and organizational level, is always accompanied by a search for others' opinions. Opinion-rich resources such as reviews, forum discussions, blogs, micro-blogs and Twitter provide a rich anthology of sentiments. This user-generated content can serve as a boon to the market if its semantic orientations are deliberated. Opinion mining and sentiment analysis are the formalizations for studying and construing opinions and sentiments. The digital ecosystem has itself paved the way for the use of the huge volume of opinionated data recorded. This paper is an attempt to review and evaluate the various techniques used for opinion and sentiment analysis.
Learning to rank (LTR) for information retrieval (IR) involves the application of machine learning models to rank artifacts, such as webpages, in response to a user's need, which may be expressed as a query. LTR models typically employ training data, such as human relevance labels and click data, to discriminatively train towards an IR objective. The focus of this lecture will be on the fundamentals of neural networks and their applications to learning to rank.
This slide deck introduces Deep Learning concepts, such as gradient descent, backpropagation, activation functions, and CNNs. Basic knowledge of vectors, matrices, and Android, as well as elementary calculus (derivatives), is strongly recommended in order to derive the maximum benefit from this session.
A fast-paced introduction to Deep Learning that starts with a simple yet complete neural network (no frameworks), followed by an overview of activation functions, cost functions, backpropagation, and then a quick dive into CNNs. Next we'll create a neural network using Keras, followed by an introduction to TensorFlow and TensorBoard. For best results, familiarity with basic vectors and matrices, inner (aka "dot") products of vectors, and rudimentary Python is definitely helpful.
A fast-paced introduction to Deep Learning concepts, such as activation functions, cost functions, back propagation, and then a quick dive into CNNs. Basic knowledge of vectors, matrices, and derivatives is helpful in order to derive the maximum benefit from this session.
New Approach of Preprocessing For Numeral Recognition (IJERA Editor)
The present paper proposes a new approach to preprocessing for handwritten, printed and isolated numeral characters. The new approach reduces the size of the input image of each numeral by discarding redundant information. This method also reduces the number of features in the attribute vector produced by the feature-extraction method. Numeral recognition is carried out in this work using k-nearest-neighbors and multilayer perceptron techniques. The simulations obtained a good recognition rate in less running time.
This presentation focuses on Deep Learning (DL) concepts, such as neural networks, backprop, activation functions, and Convolutional Neural Networks. You'll also learn how to incorporate Deep Learning in Android applications. Basic knowledge of matrices is helpful for this session, which is targeted primarily to beginners.
An introductory, illustrative, but precise slide deck on the mathematics of neural networks (densely connected layers).
Please download it and view its animations with PowerPoint.
*This slide deck is not finished yet. If you like it, please give me some feedback to motivate me.
I made these slides as an intern at DATANOMIQ GmbH.
URL: https://www.datanomiq.de/
An introduction to Deep Learning (DL) concepts, starting with a simple yet complete neural network (no frameworks), followed by aspects of deep neural networks, such as back propagation, activation functions, CNNs, and the AUT theorem. Next, a quick introduction to TensorFlow and Tensorboard, and then some code samples with Scala and TensorFlow.
Similar to Session-Based Recommendations with Recurrent Neural Networks (Balazs Hidasi, Alexandros Karatzoglou et al.)
2. Table of Contents
Backgrounds
Factor model approach in recommender systems
Neighborhood approach in recommender systems
Recurrent Neural Networks and the GRU (Gated Recurrent Unit)
RNNs in session-based recommender systems
Structure of the proposed model
Session-based mini-batches
Ranking loss
Experiments and discussion
Conclusion
What makes this paper great?
4. Factor model based recommender systems
Represent users and items numerically in a latent space.
Ex)
Represent a user U as the vector u = (0.7, 1.3, -0.5, 0.6)^T
Represent an item I as the vector i = (2.05, 1.2, 2.6, 3.9)^T
Targets (what we want to predict) are computed from the numerical representations of the user, the item, and other content information.
Ex)
Predicted rating of user U on item I:
r_{u,i} = dot(u, i) = u^T i = 0.7·2.05 + 1.3·1.2 - 0.5·2.6 + 0.6·3.9 = 4.035
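The dot-product prediction above can be sketched in a couple of lines (a minimal NumPy version of the slide's worked example; the variable names are mine):

```python
import numpy as np

# The latent vectors from the example above.
u = np.array([0.7, 1.3, -0.5, 0.6])   # user U
i = np.array([2.05, 1.2, 2.6, 3.9])   # item I

r_ui = u @ i  # predicted rating: the dot product u^T i, ≈ 4.035
```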
5. Neighborhood based recommender systems
The rating of user U on item I is predicted from how U's neighbors rated item I.
Determining a user's neighborhood is important; finding users similar to a given user is a big issue.
Targets are computed as a weighted, normalized sum of the neighbors' ratings, where similarity is used as the weight.
Ex)
Predicted rating of user U on item I:
r_{u,i} = sum_{users}(similarity · rating) / sum_{users}(similarity)
        = (0.7·3 + 0.3·4 + 0.05·2) / (0.7 + 0.3 + 0.05) ≈ 3.24

| User | Similarity with user U | Rating on item I |
| A    | 70%                    | 3                |
| B    | 30%                    | 4                |
| C    | 5%                     | 2                |
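The weighted average from the table can be sketched as follows (a plain-Python version of the slide's example; the `neighbors` structure is my own choice):

```python
# Neighbors of user U: (similarity, rating on item I), from the table above.
neighbors = [(0.70, 3), (0.30, 4), (0.05, 2)]

# Similarity-weighted, normalized average of the neighbors' ratings.
r_ui = sum(w * r for w, r in neighbors) / sum(w for w, _ in neighbors)
# r_ui ≈ 3.24
```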
6. Limits of the factor model in session-based recommender systems
The same user in different sessions is treated as a different user.
It is hard to construct user profiles, so user profiles are lacking.
Neighborhood based recommender systems still work:
computing similarities between items is based on co-occurrences of items within sessions (user profiles).
In session-based recommender systems, neighborhood methods are therefore used extensively.
7. Recurrent Neural Networks and the GRU
A Recurrent Neural Network is a kind of network that takes
sequential input
  e.g., text sentences, or the series of actions a user takes on the web
and predicts arbitrary targets (mainly the next element/action in the sequential data)
  e.g., the sentiment of a given sentence, or which page a user will visit next
Gated Recurrent Unit (GRU) (Cho et al., 2014)
  Designed, like the LSTM, to solve the gradient vanishing/exploding problem in RNNs
  Trains faster than the LSTM because it has fewer parameters
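One GRU update can be sketched as follows (a minimal NumPy version of the standard gate equations from Cho et al., 2014; the parameter names and the omission of bias terms are my simplifications):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x, h, p):
    """One GRU step; p holds the weight matrices, biases omitted for brevity."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h)              # update gate
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h)              # reset gate
    h_cand = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h))   # candidate state
    return (1 - z) * h + z * h_cand                     # blend old and new state

rng = np.random.default_rng(0)
p = {k: rng.normal(scale=0.1, size=(3, 3))
     for k in ["Wz", "Uz", "Wr", "Ur", "Wh", "Uh"]}
h = gru_step(rng.normal(size=3), np.zeros(3), p)        # hidden state after one step
```

The gates are what let the GRU carry information across many steps without the repeated-multiplication problem discussed later in the deck.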
8. Abstract view of a Recurrent Neural Network (1)
An RNN layer takes two inputs:
  the given input, and
  the hidden state from the previous step (initially zero).
The input and the hidden state determine the state of the RNN layer.
An RNN layer produces two outputs:
  the output, and
  the hidden state passed to the next step.
An RNN can span sequences of arbitrary length.
We can train using only the last output, the whole output sequence, or some subset of it.
[Diagram: two unrolled steps; each RNN layer takes Input 1 / Input 2 plus the hidden state (h1, initially zero, then h2) and produces Output 1 / Output 2.]
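The unrolling in the diagram can be sketched like this (a toy NumPy RNN step; the shapes, initialization, and use of a plain tanh cell are my own choices):

```python
import numpy as np

def rnn_step(x, h, Wx, Wh):
    # New hidden state from the current input and the previous hidden state.
    return np.tanh(Wx @ x + Wh @ h)

rng = np.random.default_rng(1)
Wx = rng.normal(scale=0.5, size=(3, 4))
Wh = rng.normal(scale=0.5, size=(3, 3))

h = np.zeros(3)                     # hidden state, initially zero
outputs = []
for x in rng.normal(size=(5, 4)):   # a sequence of 5 inputs of dimension 4
    h = rnn_step(x, h, Wx, Wh)      # h is passed on to the next step
    outputs.append(h)               # train on the last output, all outputs, or a subset
```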
12. Gradient vanishing/exploding in deep learning
(We presume familiarity with basic linear algebra.)
Repeated matrix-vector multiplication can be dangerous:
W x_t = x_{t+1}
Suppose x_0 = v_1 + v_2 + ... + v_k, where v_1, v_2, ..., v_k are eigenvectors of W.
(This is true in most cases.) Then
W x_0 = λ_1 v_1 + λ_2 v_2 + ... + λ_k v_k, where λ_j is the eigenvalue of W for v_j, and
x_n = W x_{n-1} = λ_1^n v_1 + λ_2^n v_2 + ... + λ_k^n v_k.
If the largest eigenvalue is > 1 in magnitude, x_n goes to infinity.
If the largest eigenvalue is < 1 in magnitude, x_n goes to zero.
In both cases, training becomes infeasible.
The LSTM, the GRU, and other RNN variants are designed to solve this problem while preserving long-term dependencies.
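The effect of the eigenvalues is easy to see numerically (a small demonstration under my own choice of matrices, using a scaled identity so all eigenvalues equal `scale`):

```python
import numpy as np

norms = {}
for name, scale in [("exploding", 1.1), ("vanishing", 0.9)]:
    W = scale * np.eye(2)      # both eigenvalues equal `scale`
    x = np.ones(2)             # x_0
    for _ in range(100):       # x_n = W x_{n-1}
        x = W @ x
    norms[name] = np.linalg.norm(x)

# norms["exploding"] is huge (1.1**100 ≈ 1.4e4 per component);
# norms["vanishing"] is tiny (0.9**100 ≈ 2.7e-5 per component).
```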
14. Structure of the proposed model
Input: the sequence of items a user has viewed
  i_{1,t1}, i_{1,t2}, ..., i_{1,tk}
Output: (the probability distribution over) the items the user will view next
  p_{1,t2}, p_{1,t3}, ..., p_{1,t(k+1)}
i_{1,t1} is an item id; p_{1,t2} is the probability distribution over the items user 1 will view at time t2.
Ex) Over items 1-5, if p_{1,t2} = (0.2, 0.3, 0.1, 0.1, 0.3), then the probability of viewing item 1 is 0.2, item 2 is 0.3, item 3 is 0.1, ...
Architecture: Input (one-hot encoded vector) → Embedding layer → GRU layer → GRU layer → ... → GRU layer → Feedforward layer → Output (scores on items)
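A forward pass through this pipeline can be sketched as follows. This is a toy NumPy version: all sizes and names are illustrative, and the GRU layer is replaced by a plain tanh recurrence to keep the sketch short; the structure (embedding lookup → recurrent layer → feedforward layer → distribution over items) matches the slide.

```python
import numpy as np

def softmax(s):
    e = np.exp(s - s.max())
    return e / e.sum()

# Toy dimensions (my choices): 5 items, 4-dim embeddings, 3-dim hidden state.
rng = np.random.default_rng(2)
E = rng.normal(scale=0.1, size=(5, 4))       # embedding: one row per item
Wx = rng.normal(scale=0.1, size=(3, 4))
Wh = rng.normal(scale=0.1, size=(3, 3))
W_out = rng.normal(scale=0.1, size=(5, 3))   # feedforward layer to item scores

h = np.zeros(3)
for item_id in [0, 3, 1]:         # the session: items the user viewed, in order
    x = E[item_id]                # embedding lookup (= one-hot vector @ E)
    h = np.tanh(Wx @ x + Wh @ h)  # recurrent layer (stands in for the GRU)
    p_next = softmax(W_out @ h)   # distribution over the next item
```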
15. Structure of the proposed model
One-hot vector
  An input vector whose length equals the number of items, where only the element corresponding to the active item is one.
Embedding
  Assigns a trainable vector to every item.
  (A model with embedding performs worse.)
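The one-hot encoding described above, as a quick sketch (item ids and vector length are illustrative):

```python
import numpy as np

n_items = 5
active_item = 2                 # index of the item the user just clicked
one_hot = np.zeros(n_items)     # vector length = number of items
one_hot[active_item] = 1.0      # only the active item's element is one
# one_hot -> [0., 0., 1., 0., 0.]
```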
22. Datasets
| Dataset    | RecSys 2015 (RSC15) | OTT video (VIDEO) |
| # sessions | 15,324              | ~37k              |
| # items    | 37,483              | ~330k             |
| # clicks   | 71,222              | ~180k             |

Preprocessing
  Remove items from the test set that do not appear in the training set.
  Remove sessions of length 1.
  Do not split a session's sequence between the training set and the test set.
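The preprocessing steps above can be sketched as a small helper (hypothetical function and data layout of my own; sessions are lists of item ids):

```python
def preprocess(train_sessions, test_sessions):
    # Hypothetical helper mirroring the slide's preprocessing steps.
    train_items = {item for s in train_sessions for item in s}
    # Remove test-set items that never appear in the training set.
    test = [[item for item in s if item in train_items] for s in test_sessions]
    # Remove sessions of length 1 (nothing to predict from them).
    train = [s for s in train_sessions if len(s) > 1]
    test = [s for s in test if len(s) > 1]
    return train, test

train, test = preprocess([[1, 2, 3], [4]], [[2, 9, 3], [1]])
# train == [[1, 2, 3]]; test == [[2, 3]]
```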
23. Evaluation measures
Recall@k
  The system sorts its candidates and takes the top k; Recall@k is 1 if the correct answer is among them, 0 otherwise.
  Ex) The desired answer is "cat", but the computer's top 3 candidates are [chicken, dog, horse]: Recall@3 = 0.
  Ex) The desired answer is "pizza", and the computer's top 3 candidates are [chicken, dog, pizza]: Recall@3 = 1.
MRR@k
  The system sorts its candidates and reports the reciprocal of the rank of the correct answer within the top k; if the answer is outside the top k, MRR@k = 0.
  Ex) The computer tries to guess my hair color in 3 chances and says red (1st), black (2nd), yellow (3rd). My hair color is black, so MRR@3 = 1/2 = 0.5.
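Both measures from the examples above can be written directly (plain-Python sketch; function names are mine):

```python
def recall_at_k(ranked, answer, k):
    # 1 if the correct answer appears among the top-k candidates, else 0.
    return int(answer in ranked[:k])

def mrr_at_k(ranked, answer, k):
    # Reciprocal rank of the correct answer within the top-k, else 0.
    top = ranked[:k]
    return 1.0 / (top.index(answer) + 1) if answer in top else 0.0

recall_at_k(["chicken", "dog", "horse"], "cat", 3)   # 0
recall_at_k(["chicken", "dog", "pizza"], "pizza", 3) # 1
mrr_at_k(["red", "black", "yellow"], "black", 3)     # 0.5
```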
24. Recall@20 and MRR@20 using baseline methods
| Baseline | RSC15 Recall@20 | RSC15 MRR@20 | VIDEO Recall@20 | VIDEO MRR@20 |
| POP      | 0.0050          | 0.0012       | 0.0499          | 0.0117       |
| S-POP    | 0.2672          | 0.1775       | 0.1301          | 0.0863       |
| Item-KNN | 0.5065          | 0.2048       | 0.5598          | 0.3381       |
| BPR-MF   | 0.2574          | 0.0618       | 0.0692          | 0.0374       |
25. Recall@20 and MRR@20 for different types of single-layer GRU
(# Units = length of the hidden state vector h_i)
| Loss function | # Units | RSC15 Recall@20 | RSC15 MRR@20 | VIDEO Recall@20 | VIDEO MRR@20 |
| TOP1          | 100     | 0.5853          | 0.2305       | 0.6141          | 0.3511       |
| BPR           | 100     | 0.6069          | 0.2407       | 0.5999          | 0.3260       |
| Cross-Entropy | 100     | 0.6074          | 0.2430       | 0.6372          | 0.3720       |
| TOP1          | 1000    | 0.6206          | 0.2693       | 0.6624          | 0.3891       |
| BPR           | 1000    | 0.6322          | 0.2467      | 0.6311          | 0.3136       |
| Cross-Entropy | 1000    | 0.5777          | 0.2153       | N/A             | N/A          |
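The TOP1 and BPR ranking losses compared above can be sketched as follows (a NumPy rendering of the paper's formulas; `r_pos` is the score of the target item, `r_negs` the scores of sampled negative items, and the function names are mine):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bpr_loss(r_pos, r_negs):
    # BPR: mean of -log sigmoid(target score - negative score).
    return float(np.mean(-np.log(sigmoid(r_pos - r_negs))))

def top1_loss(r_pos, r_negs):
    # TOP1: push negative scores below the target score, plus a term
    # that regularizes the negative scores toward zero.
    return float(np.mean(sigmoid(r_negs - r_pos) + sigmoid(r_negs ** 2)))
```

Both losses decrease as the target item's score rises above the negatives, which is what makes them ranking losses rather than pointwise losses.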
26. Discussion
A larger hidden state (more units) gives better performance: 100 < 1000 < 10^4.
The pointwise loss is unstable. (I am not sure whether this means numerically unstable, i.e., overflow or underflow, or that the results are inconsistent.)
Deeper GRU layers improve performance.
Embedding does not help this model.
27. What makes this paper great?
A new parallel training method for RNNs (in recommender systems).
New ranking losses (I think other models can exploit these loss functions).
Performance improvement: 20-25% over the best baseline, Item-KNN.
A novel model framework that solves the session-based recommendation problem with RNNs.