This is the official slide for the WWW 2021 paper: Session-aware Linear Item-Item Models for Session-based Recommendation
If you have any questions, please contact zxcvxd@skku.edu.
Overview of tree algorithms from decision tree to xgboostTakami Sato
For my understanding, I surveyed popular tree algorithms on Machine Learning and their evolution. This is the first time I wrote a presentation in English. So, I am happy if you give me a feedback.
The document discusses distances between data and similarity measures in data analysis. It introduces the concept of distance between data as a quantitative measure of how different two data points are, with smaller distances indicating greater similarity. Distances are useful for tasks like clustering data, detecting anomalies, data recognition, and measuring approximation errors. The most common distance measure, Euclidean distance, is explained for vectors of any dimension using the concept of norm from geometry. Caution is advised when calculating distances between data with differing scales.
V Jornadas eMadrid sobre "Educación Digital". Cristina Conati, University of ...eMadrid network
V Jornadas eMadrid sobre "Educación Digital". Cristina Conati, University of British Columbia: Beyond problem solving: student adaptive support for open-learning environments
With the explosive growth of online information, recommender system has been an effective tool to overcome information overload and promote sales. In recent years, deep learning's revolutionary advances in speech recognition, image analysis and natural language processing have gained significant attention. Meanwhile, recent studies also demonstrate its efficacy in coping with information retrieval and recommendation tasks. Applying deep learning techniques into recommender system has been gaining momentum due to its state-of-the-art performance. In this talk, I will present recent development of deep learning based recommender models and highlight some future challenges and open issues of this research field.
Overview of tree algorithms from decision tree to xgboostTakami Sato
For my understanding, I surveyed popular tree algorithms on Machine Learning and their evolution. This is the first time I wrote a presentation in English. So, I am happy if you give me a feedback.
The document discusses distances between data and similarity measures in data analysis. It introduces the concept of distance between data as a quantitative measure of how different two data points are, with smaller distances indicating greater similarity. Distances are useful for tasks like clustering data, detecting anomalies, data recognition, and measuring approximation errors. The most common distance measure, Euclidean distance, is explained for vectors of any dimension using the concept of norm from geometry. Caution is advised when calculating distances between data with differing scales.
V Jornadas eMadrid sobre "Educación Digital". Cristina Conati, University of ...eMadrid network
V Jornadas eMadrid sobre "Educación Digital". Cristina Conati, University of British Columbia: Beyond problem solving: student adaptive support for open-learning environments
With the explosive growth of online information, recommender system has been an effective tool to overcome information overload and promote sales. In recent years, deep learning's revolutionary advances in speech recognition, image analysis and natural language processing have gained significant attention. Meanwhile, recent studies also demonstrate its efficacy in coping with information retrieval and recommendation tasks. Applying deep learning techniques into recommender system has been gaining momentum due to its state-of-the-art performance. In this talk, I will present recent development of deep learning based recommender models and highlight some future challenges and open issues of this research field.
The document presents a framework for recommending sequences of activities based on a user's past activity patterns and context. The framework uses machine learning to determine an optimal subsequence length for matching current and past activity timelines. It then ranks candidate timelines using an edit distance similarity measure and recommends the next sequence of activities. The framework has been applied to lifelogging, transportation, and tourism recommendation domains.
Context-aware preference modeling with factorizationBalázs Hidasi
- The document outlines Balázs Hidasi's research on context-aware recommendation models using factorization techniques.
- It introduces context-aware algorithms like iTALS and iTALSx that estimate preferences using ALS learning and scale linearly with data.
- Methods for speeding up ALS through approximate solutions like ALS-CG and ALS-CD are described, providing significant speed gains.
- A General Factorization Framework (GFF) is presented that allows experimenting with novel context-aware preference models beyond traditional approaches.
This document presents a sequence-based approach for recommending modes of transport to users based on their past activity patterns. It extends the authors' previous framework to extract and match subsequences from user timelines. A machine learning approach learns an optimal subsequence length for matching current and past user activity patterns. The framework is evaluated on a real-world GPS trajectory dataset containing transport mode labels for 18 users. Results show the proposed sequence-based recommender outperforms baseline methods that recommend frequent or long-duration transport modes.
International Journal of Computational Engineering Research (IJCER) is dedicated to protecting personal information and will make every reasonable effort to handle collected information appropriately. All information collected, as well as related requests, will be handled as carefully and efficiently as possible in accordance with IJCER standards for integrity and objectivity.
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...Lionel Briand
This document proposes SAMOTA, a surrogate-assisted many-objective optimization approach for online testing of DNN-enabled systems. SAMOTA uses global and local surrogate models to replace expensive function evaluations. It clusters local data points and builds individual surrogate models for each cluster, rather than one model for all data. An evaluation on a DNN-enabled autonomous driving system shows SAMOTA achieves better test effectiveness and efficiency than alternative approaches, and clustering local data points leads to more effective local searches than using a single local model. SAMOTA is an effective method for online testing of complex DNN systems.
This document provides an overview of object-oriented design fundamentals. It discusses the steps of object-oriented design which include developing design class diagrams, using CRC cards to define class responsibilities and collaborations, and explaining fundamental design principles. CRC cards are used as a technique for designing object-oriented software where classes are assigned responsibilities for how they collaborate to accomplish use cases. The document also outlines notation for design classes and provides examples of applying object-oriented design.
Crafting Recommenders: the Shallow and the Deep of it! Sudeep Das, Ph.D.
Sudeep Das presented on recommender systems and advances in deep learning approaches. Matrix factorization is still the foundational method for collaborative filtering, but deep learning models are now augmenting these approaches. Deep neural networks can learn hierarchical representations of users and items from raw data like images, text, and sequences of user actions. Models like wide and deep networks combine the strengths of memorization and generalization. Sequence models like recurrent neural networks have also been applied to sessions for next item recommendation.
The document summarizes the results of benchmarking tests performed on the Blackboard Academic Suite to determine system sizing requirements. Key findings include:
- Tests showed a Unicode conversion taking minutes for small datasets, hours for moderate, and under 3 days for large datasets, meeting objectives.
- Regression performance from version 6.3 to 7.X met the objective of no more than a 5% degradation and potential for a 5% improvement.
- Benchmarking of different hardware platforms like Sun, Dell, and Windows showed performance varied based on configuration.
The document discusses how Blackboard sizes its Academic Suite software based on benchmarking. It provides details on the benchmarking methodology, including modeling user behavior, data growth, and performance objectives. The results showed how the software performed under different workload levels on various hardware configurations. The last part discusses using the benchmark results and sizing guide to determine an institution's adoption profile and appropriate hardware configuration based on factors like sessions per hour and page loads.
Continuous Evaluation of Collaborative Recommender Systems in Data Stream Man...Dr. Cornelius Ludmann
Talk at the Data Streams and Event Processing Workshop at the 16. Fachtagung »Datenbanksysteme für Business, Technologie und Web« (BTW) of the Gesellschaft für Informatik (GI) in Hamburg, Germany. March 3, 2015
Graph and language embeddings were used to analyze user data from Reddit to predict whether authors would post in the SuicideWatch subreddit. Metapath2vec was used to generate graph embeddings from subreddit and author relationships. Doc2vec was used to generate document embeddings based on language similarity between submissions and subreddits. Combining the graph and document embeddings in a logistic regression achieved 90% accuracy in predicting SuicideWatch posters, reducing both false positives and false negatives compared to using the embeddings separately. Next steps proposed using the embeddings to better understand similarities between related subreddits and predict risk factors in posts.
Beyond data and model parallelism for deep neural networksJunKudo2
FlexFlow is a deep learning engine that automatically finds optimal parallelization strategies for arbitrary DNN models and hardware configurations. It defines a comprehensive search space called SOAP that includes sample, operator, attribute, and parameter parallelization. FlexFlow uses a Markov Chain Monte Carlo algorithm to search this space. It simulates strategy executions, either through full simulation or more efficient delta simulation, to identify high performing strategies. Evaluation on real-world DNN models and hardware showed FlexFlow outperforms expert-designed strategies and other automated frameworks by achieving up to 3.3x higher training throughput.
Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.pptAnirbanBhar3
This document summarizes a research paper that proposes a hybrid software development lifecycle model called Proto-Spiral for measuring scalability early in development. The model combines prototyping and the spiral model. It defines a process for developing scalable software by analyzing scalability factors after each prototype using probabilistic measurements. The paper includes a case study analysis of how the Proto-Spiral model could help assure scalability for a large-scale system like eBay.
This document provides a summary of practical machine learning on big data platforms. It begins with an introduction and agenda, then provides a quick brief on the machine learning process. It discusses the current landscape of open source tools, including evolutionary drivers and examples. It covers case studies from Twitter and their experience. Finally, it discusses architectural forces like Moore's Law and Kryder's Law that are shaping the field. The document aims to present a unified approach for machine learning on big data platforms and discuss how industry leaders are implementing these techniques.
Apereo Webinar: Learning What Works When Scaling Analytics Infrastructure (Ja...Unicon, Inc.
With student success as a primary institutional goal, NC State’s DELTA organization has taken initial steps in building a scalable analytics infrastructure. This webinar provides insights into the open source frameworks, analytics technology, and strategy required to deploy analytics infrastructure with efficient IT delivery. We will discuss planning an architecture to accommodate large amounts of data, while still providing predictions in short order. We will also touch on some work we are doing building cohorts of sub-populations to improve scalability and accuracy. In addition, we will discuss ongoing and future work to improve the infrastructure even further.
Neural networks for semantic gaze analysis in xr settingsJaey Jeong
This document presents a method for semantic gaze analysis in XR settings using neural networks. The approach uses a CAD model to generate training data for a Cycle-GAN, which creates synthetic data to train a CNN for identifying volumes of interest (VOIs) from eye-tracking data without manual annotation. An experiment applied this method to data from participants interacting with a virtual and real coffee machine, showing it performed slightly better in VR than reality while eliminating the need for lengthy human annotation of gaze data. Future work could aim to improve prediction accuracy further.
Similar to Session-aware Linear Item-Item Models for Session-based Recommendation (WWW 2021) (20)
_Extraction of Ethylene oxide and 2-Chloroethanol from alternate matrices Li...LucyHearn1
How do you know your food is safe?
Last Friday was world World Food Safety Day, facilitated by the Food and Agriculture Organization of the United Nations (FAO) and the World Health Organization (WHO) in which the slogan rightly says, 'food safety is everyone's business'. Due to this, I thought it would be worth sharing some data that I have worked on in this field!
Working at Markes International has really opened my eyes (and unfortunately my friends and family 🤣) to food safety and quality, especially with my recent application work on ethylene oxide and 2-chloroethanol residues in foodstuffs, as of the biggest global food recalls in history was and is still being implemented by the Rapid alert system for food and feed (RASFF) in 2021, for high levels of these carcinogenic compounds.
The binding of cosmological structures by massless topological defectsSérgio Sacani
Assuming spherical symmetry and weak field, it is shown that if one solves the Poisson equation or the Einstein field
equations sourced by a topological defect, i.e. a singularity of a very specific form, the result is a localized gravitational
field capable of driving flat rotation (i.e. Keplerian circular orbits at a constant speed for all radii) of test masses on a thin
spherical shell without any underlying mass. Moreover, a large-scale structure which exploits this solution by assembling
concentrically a number of such topological defects can establish a flat stellar or galactic rotation curve, and can also deflect
light in the same manner as an equipotential (isothermal) sphere. Thus, the need for dark matter or modified gravity theory is
mitigated, at least in part.
SDSS1335+0728: The awakening of a ∼ 106M⊙ black hole⋆Sérgio Sacani
Context. The early-type galaxy SDSS J133519.91+072807.4 (hereafter SDSS1335+0728), which had exhibited no prior optical variations during the preceding two decades, began showing significant nuclear variability in the Zwicky Transient Facility (ZTF) alert stream from December 2019 (as ZTF19acnskyy). This variability behaviour, coupled with the host-galaxy properties, suggests that SDSS1335+0728 hosts a ∼ 106M⊙ black hole (BH) that is currently in the process of ‘turning on’. Aims. We present a multi-wavelength photometric analysis and spectroscopic follow-up performed with the aim of better understanding the origin of the nuclear variations detected in SDSS1335+0728. Methods. We used archival photometry (from WISE, 2MASS, SDSS, GALEX, eROSITA) and spectroscopic data (from SDSS and LAMOST) to study the state of SDSS1335+0728 prior to December 2019, and new observations from Swift, SOAR/Goodman, VLT/X-shooter, and Keck/LRIS taken after its turn-on to characterise its current state. We analysed the variability of SDSS1335+0728 in the X-ray/UV/optical/mid-infrared range, modelled its spectral energy distribution prior to and after December 2019, and studied the evolution of its UV/optical spectra. Results. From our multi-wavelength photometric analysis, we find that: (a) since 2021, the UV flux (from Swift/UVOT observations) is four times brighter than the flux reported by GALEX in 2004; (b) since June 2022, the mid-infrared flux has risen more than two times, and the W1−W2 WISE colour has become redder; and (c) since February 2024, the source has begun showing X-ray emission. From our spectroscopic follow-up, we see that (i) the narrow emission line ratios are now consistent with a more energetic ionising continuum; (ii) broad emission lines are not detected; and (iii) the [OIII] line increased its flux ∼ 3.6 years after the first ZTF alert, which implies a relatively compact narrow-line-emitting region. Conclusions. We conclude that the variations observed in SDSS1335+0728 could be either explained by a ∼ 106M⊙ AGN that is just turning on or by an exotic tidal disruption event (TDE). If the former is true, SDSS1335+0728 is one of the strongest cases of an AGNobserved in the process of activating. If the latter were found to be the case, it would correspond to the longest and faintest TDE ever observed (or another class of still unknown nuclear transient). Future observations of SDSS1335+0728 are crucial to further understand its behaviour. Key words. galaxies: active– accretion, accretion discs– galaxies: individual: SDSS J133519.91+072807.4
Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...Travis Hills MN
By harnessing the power of High Flux Vacuum Membrane Distillation, Travis Hills from MN envisions a future where clean and safe drinking water is accessible to all, regardless of geographical location or economic status.
Evidence of Jet Activity from the Secondary Black Hole in the OJ 287 Binary S...Sérgio Sacani
Wereport the study of a huge optical intraday flare on 2021 November 12 at 2 a.m. UT in the blazar OJ287. In the binary black hole model, it is associated with an impact of the secondary black hole on the accretion disk of the primary. Our multifrequency observing campaign was set up to search for such a signature of the impact based on a prediction made 8 yr earlier. The first I-band results of the flare have already been reported by Kishore et al. (2024). Here we combine these data with our monitoring in the R-band. There is a big change in the R–I spectral index by 1.0 ±0.1 between the normal background and the flare, suggesting a new component of radiation. The polarization variation during the rise of the flare suggests the same. The limits on the source size place it most reasonably in the jet of the secondary BH. We then ask why we have not seen this phenomenon before. We show that OJ287 was never before observed with sufficient sensitivity on the night when the flare should have happened according to the binary model. We also study the probability that this flare is just an oversized example of intraday variability using the Krakow data set of intense monitoring between 2015 and 2023. We find that the occurrence of a flare of this size and rapidity is unlikely. In machine-readable Tables 1 and 2, we give the full orbit-linked historical light curve of OJ287 as well as the dense monitoring sample of Krakow.
Authoring a personal GPT for your research and practice: How we created the Q...Leonel Morgado
Thematic analysis in qualitative research is a time-consuming and systematic task, typically done using teams. Team members must ground their activities on common understandings of the major concepts underlying the thematic analysis, and define criteria for its development. However, conceptual misunderstandings, equivocations, and lack of adherence to criteria are challenges to the quality and speed of this process. Given the distributed and uncertain nature of this process, we wondered if the tasks in thematic analysis could be supported by readily available artificial intelligence chatbots. Our early efforts point to potential benefits: not just saving time in the coding process but better adherence to criteria and grounding, by increasing triangulation between humans and artificial intelligence. This tutorial will provide a description and demonstration of the process we followed, as two academic researchers, to develop a custom ChatGPT to assist with qualitative coding in the thematic data analysis process of immersive learning accounts in a survey of the academic literature: QUAL-E Immersive Learning Thematic Analysis Helper. In the hands-on time, participants will try out QUAL-E and develop their ideas for their own qualitative coding ChatGPT. Participants that have the paid ChatGPT Plus subscription can create a draft of their assistants. The organizers will provide course materials and slide deck that participants will be able to utilize to continue development of their custom GPT. The paid subscription to ChatGPT Plus is not required to participate in this workshop, just for trying out personal GPTs during it.
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...Leonel Morgado
Current descriptions of immersive learning cases are often difficult or impossible to compare. This is due to a myriad of different options on what details to include, which aspects are relevant, and on the descriptive approaches employed. Also, these aspects often combine very specific details with more general guidelines or indicate intents and rationales without clarifying their implementation. In this paper we provide a method to describe immersive learning cases that is structured to enable comparisons, yet flexible enough to allow researchers and practitioners to decide which aspects to include. This method leverages a taxonomy that classifies educational aspects at three levels (uses, practices, and strategies) and then utilizes two frameworks, the Immersive Learning Brain and the Immersion Cube, to enable a structured description and interpretation of immersive learning cases. The method is then demonstrated on a published immersive learning case on training for wind turbine maintenance using virtual reality. Applying the method results in a structured artifact, the Immersive Learning Case Sheet, that tags the case with its proximal uses, practices, and strategies, and refines the free text case description to ensure that matching details are included. This contribution is thus a case description method in support of future comparative research of immersive learning cases. We then discuss how the resulting description and interpretation can be leveraged to change immersion learning cases, by enriching them (considering low-effort changes or additions) or innovating (exploring more challenging avenues of transformation). The method holds significant promise to support better-grounded research in immersive learning.
JAMES WEBB STUDY THE MASSIVE BLACK HOLE SEEDSSérgio Sacani
The pathway(s) to seeding the massive black holes (MBHs) that exist at the heart of galaxies in the present and distant Universe remains an unsolved problem. Here we categorise, describe and quantitatively discuss the formation pathways of both light and heavy seeds. We emphasise that the most recent computational models suggest that rather than a bimodal-like mass spectrum between light and heavy seeds with light at one end and heavy at the other that instead a continuum exists. Light seeds being more ubiquitous and the heavier seeds becoming less and less abundant due the rarer environmental conditions required for their formation. We therefore examine the different mechanisms that give rise to different seed mass spectrums. We show how and why the mechanisms that produce the heaviest seeds are also among the rarest events in the Universe and are hence extremely unlikely to be the seeds for the vast majority of the MBH population. We quantify, within the limits of the current large uncertainties in the seeding processes, the expected number densities of the seed mass spectrum. We argue that light seeds must be at least 103 to 105 times more numerous than heavy seeds to explain the MBH population as a whole. Based on our current understanding of the seed population this makes heavy seeds (Mseed > 103 M⊙) a significantly more likely pathway given that heavy seeds have an abundance pattern than is close to and likely in excess of 10−4 compared to light seeds. Finally, we examine the current state-of-the-art in numerical calculations and recent observations and plot a path forward for near-future advances in both domains.
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...Scintica Instrumentation
Targeting Hsp90 and its pathogen Orthologs with Tethered Inhibitors as a Diagnostic and Therapeutic Strategy for cancer and infectious diseases with Dr. Timothy Haystead.
Sexuality - Issues, Attitude and Behaviour - Applied Social Psychology - Psyc...PsychoTech Services
A proprietary approach developed by bringing together the best of learning theories from Psychology, design principles from the world of visualization, and pedagogical methods from over a decade of training experience, that enables you to: Learn better, faster!
Anti-Universe And Emergent Gravity and the Dark UniverseSérgio Sacani
Recent theoretical progress indicates that spacetime and gravity emerge together from the entanglement structure of an underlying microscopic theory. These ideas are best understood in Anti-de Sitter space, where they rely on the area law for entanglement entropy. The extension to de Sitter space requires taking into account the entropy and temperature associated with the cosmological horizon. Using insights from string theory, black hole physics and quantum information theory we argue that the positive dark energy leads to a thermal volume law contribution to the entropy that overtakes the area law precisely at the cosmological horizon. Due to the competition between area and volume law entanglement the microscopic de Sitter states do not thermalise at sub-Hubble scales: they exhibit memory effects in the form of an entropy displacement caused by matter. The emergent laws of gravity contain an additional ‘dark’ gravitational force describing the ‘elastic’ response due to the entropy displacement. We derive an estimate of the strength of this extra force in terms of the baryonic mass, Newton’s constant and the Hubble acceleration scale a0 = cH0, and provide evidence for the fact that this additional ‘dark gravity force’ explains the observed phenomena in galaxies and clusters currently attributed to dark matter.
Microbial interaction
Microorganisms interacts with each other and can be physically associated with another organisms in a variety of ways.
One organism can be located on the surface of another organism as an ectobiont or located within another organism as endobiont.
Microbial interaction may be positive such as mutualism, proto-cooperation, commensalism or may be negative such as parasitism, predation or competition
Types of microbial interaction
Positive interaction: mutualism, proto-cooperation, commensalism
Negative interaction: Ammensalism (antagonism), parasitism, predation, competition
I. Mutualism:
It is defined as the relationship in which each organism in interaction gets benefits from association. It is an obligatory relationship in which mutualist and host are metabolically dependent on each other.
Mutualistic relationship is very specific where one member of association cannot be replaced by another species.
Mutualism require close physical contact between interacting organisms.
Relationship of mutualism allows organisms to exist in habitat that could not occupied by either species alone.
Mutualistic relationship between organisms allows them to act as a single organism.
Examples of mutualism:
i. Lichens:
Lichens are excellent example of mutualism.
They are the association of specific fungi and certain genus of algae. In lichen, fungal partner is called mycobiont and algal partner is called
II. Syntrophism:
It is an association in which the growth of one organism either depends on or improved by the substrate provided by another organism.
In syntrophism both organism in association gets benefits.
Compound A
Utilized by population 1
Compound B
Utilized by population 2
Compound C
utilized by both Population 1+2
Products
In this theoretical example of syntrophism, population 1 is able to utilize and metabolize compound A, forming compound B but cannot metabolize beyond compound B without co-operation of population 2. Population 2is unable to utilize compound A but it can metabolize compound B forming compound C. Then both population 1 and 2 are able to carry out metabolic reaction which leads to formation of end product that neither population could produce alone.
Examples of syntrophism:
i. Methanogenic ecosystem in sludge digester
Methane produced by methanogenic bacteria depends upon interspecies hydrogen transfer by other fermentative bacteria.
Anaerobic fermentative bacteria generate CO2 and H2 utilizing carbohydrates which is then utilized by methanogenic bacteria (Methanobacter) to produce methane.
ii. Lactobacillus arobinosus and Enterococcus faecalis:
In the minimal media, Lactobacillus arobinosus and Enterococcus faecalis are able to grow together but not alone.
The synergistic relationship between E. faecalis and L. arobinosus occurs in which E. faecalis require folic acid
MICROBIAL INTERACTION PPT/ MICROBIAL INTERACTION AND THEIR TYPES // PLANT MIC...
Session-aware Linear Item-Item Models for Session-based Recommendation (WWW 2021)
1. Session-aware Linear Item-Item Models for
Session-based Recommendation
Minjin Choi1, Jinhong Kim1, Joonseok Lee2,3 , Hyunjung Shim4, Jongwuk Lee1
Sungkyunkwan University (SKKU)1, Google Research2
Seoul National University3, Yonsei University4
3. Session-based Recommendation (SR)
Predicting the next item(s) based on only the current session
• Usually, session histories are much shorter than user histories.
3
User
6, January
13, January
17, February
Anonymous 1
Anonymous 2
Anonymous 3
4. Limitation of Existing SR Models
DNN models show good performance but suffer scalability issues.
• RNNs and GNNs are mostly used for session-based recommendations.
• When the dataset is too large, the scalability issue arises.
4
Balázs Hidasi et. al., “Session-based Recommendations with Recurrent Neural Networks”, ICLR 2016
GRU Cell GRU Cell GRU Cell GRU Cell
5. Limitation of Existing SR Models
Neighborhood-based models are difficult to capture complex
dependencies between items in a session.
5
Diksha Garg et. al., “Sequence and Time Aware Neighborhood for Session-based Recommendations: STAN”, SIGIR 2019
𝒔𝒊𝒎 = 𝟎. 𝟔
𝑠𝑖𝑚 = 0.4 Top-1 Neighbor session
Target session
7. Our Key Contributions
We utilize linear models for session-based recommendations.
• Linear models show great performance in traditional recommendations.
• However, simply applying them to session-based recommendations does not
show an effective performance.
7
Item-item model 𝑩
𝑩 ∈ ℝ𝒏×𝒏
User-item matrix 𝑿
𝑿 ∈ ℝ𝒎×𝒏
Dataset: YC-1/64
Models HR@20 MRR@20
GRU4Rec+ 0.6528 0.2752
SKNN 0.6423 0.2522
SEASER 0.5443 0.1963
Harald Steck, “Embarrassingly Shallow Autoencoders for Sparse Data”, WWW 2019
Worse than existing models!
𝐚𝐫𝐠𝐦𝐢𝐧
𝑩
𝑿 − 𝑿𝑩 𝑭
𝟐
+ 𝝀 𝑩 𝑭
𝟐
s. t. 𝒅𝒊𝒂𝒈 𝑩 = 𝟎
𝒎: # of users
𝒏: # of items
8. Our Key Contributions
We design linear models to utilize some unique characteristics of
sessions.
• Some items tend to be consumed in a specific order. (dependency)
• Items in a session are often highly coherent. (consistency)
8
6, January
13, January
17, February
Session consistency
(tops)
Sequential dependency
(pants -> shoes)
9. Our Key Contributions
We design linear models to utilize some unique characteristics of
sessions.
• Sessions often reflect a recent trend. (timeliness)
• The user might consume the same items in a session. (repeated item)
9
6, January
13, January
17, February
Repeated item consumption
Timeliness of sessions
11. Overview of the Proposed Model
We devise two linear models to learn item similarity/transition and
unify them to fully capture the complex relationships of items.
11
Full session
representation
Partial session
representation
Item
similarity
model
Item
transition
model
Item
similarity/
transition
model
12. Two Session Representations
We deal with two session representations for our linear models.
• (1) A single vector to capture the correlations between items while ignoring
the order.
• (2) Past and future vectors to represent sequential dependency of items.
12
Full session
representation
Partial session
representation
past future
13. Session-aware Item Similarity Model (SLIS)
We learn the linear model using full session representation, to
represent the similarity between items.
• The entire session is represented as a single vector.
• 𝐵𝑖𝑗 indicates the learned item similarity between 𝑖 and 𝑗.
• Note: the diagonal constraint is relaxed.
13
Full session representation 𝑿
𝑿 ∈ ℝ|𝑺|×𝑵
Item similarity model 𝑩
𝑩 ∈ ℝ𝑵×𝑵
Dark cells mean high similarities.
𝐚𝐫𝐠𝐦𝐢𝐧
𝑩
𝑿 − 𝑿𝑩 𝑭
𝟐
+ 𝝀 𝑩 𝑭
𝟐
s. t. 𝒅𝒊𝒂𝒈 𝑩 ≤ 𝝃
|𝑺|: # of sessions
𝒏: # of items
14. Session-aware Item Transition Model (SLIT)
We learn the linear model using partial session representation that
captures the sequential dependency across items.
• A session is divided into past vectors and future vectors.
• 𝐵𝑖𝑗 indicates the item transitions from 𝑖 to 𝑗.
14
Item Transition Model 𝑩
𝑩 ∈ ℝ𝑵×𝑵
Partial session representation 𝑺𝑻
𝑺, 𝑻 ∈ ℝ|𝑺′|×𝑵
past future
𝐚𝐫𝐠𝐦𝐢𝐧
𝑩
𝑻 − 𝑺𝑩 𝑭
𝟐
+ 𝝀 𝑩 𝑭
𝟐
Dark cells mean the high transition
probability.
|𝑺′|: # of partial sessions
𝒏: # of items
15. Item Similarity/Transition Model: SLIST
We unify two linear models by jointly optimizing both models.
15
Item similarity/transition model 𝑩
𝑩 ∈ ℝ𝑵×𝑵
Full session representation
Partial session representation
Item similarity model
Item Transition Model
16. Weight Decay of Items in a Session
We assign higher weights for more recent sessions (𝒘𝒕𝒊𝒎𝒆) and
current items (𝒘𝒑𝒐𝒔, 𝒘𝒊𝒏𝒇) in the sessions.
16
Full and partial session representation
𝒘𝒕𝒊𝒎𝒆
𝒐𝒓𝒅𝒆𝒓
𝒘𝒑𝒐𝒔
past future
𝒕𝒊𝒎𝒆
17. Detail: Training and Inference
The objective function of SLIST
Model inference
17
⋅
𝒕 = 𝒔 ⋅ 𝑩
Partial
representation vector
Learned
item-item model
𝒂𝒓𝒈𝒎𝒊𝒏
𝑩
𝜶 𝑾𝒇𝒖𝒍𝒍 ⊙ 𝑿 − 𝑿𝑩
𝑭
𝟐
+ 𝟏 − 𝜶 𝑾𝒑𝒂𝒓 ⊙ 𝑻 − 𝑺𝑩
𝑭
𝟐
+ 𝝀 𝑩 𝑭
𝟐
𝑩 = 𝑰 − 𝝀𝑷 − 𝟏 − 𝜶 𝑷𝑺⊤𝒅𝒊𝒂𝒈𝑴𝒂𝒕(𝒘𝒑𝒂𝒓
𝟐 )(𝑺 − 𝑻)
Closed form
solution
Item similarity model (SLIS) Item transition model (SLIT)
𝒘𝒊𝒏𝒇
𝒐𝒓𝒅𝒆𝒓
19. Experimental Setup: Dataset
We evaluate the accuracy and scalability of SLIST over public
datasets.
• We use both 5-split and 1-split datasets.
19
Split Dataset # of actions # of sessions # of items
1-split
YooChoose 1/64
(YC-1/64)
494,330 119,287 17,319
YooChoose 1/4
(YC-1/4)
7,909,307 1,939,891 30,638
DIGINETICA
(DIGI1)
916,370 188,807 43,105
5-split
YooChoose
(YC5)
5,426,961 1,375,128 28,582
DIGINETICA
(DIGI5)
203,488 41,755 32,137
20. Evaluation Protocol and Metrics
Evaluation protocol: iterative revealing scheme
• We iteratively expose the item of a session to the model.
Evaluation metrics
• HR@20 and MRR@20
• To predict only the next item in a session
• R@20 and MAP@20
• To consider all subsequent items for a session
20
Time
Session history:
(1) Input & target:
(2) Input & target:
(4) Input & target:
…
21. Competitive Models
Two neighborhood-based models
• SKNN: a session-based KNN algorithm, a variant of user-based KNN.
• STAN: an improved version of SKNN by considering sequential dependency
and timeliness of sessions.
Four DNN-based models
• GRU4Rec+: an improved version of GRU4REC using the top-k gain.
• NARM: an improved version of GRU4REC+ using an attention mechanism.
• STAMP: an attention-based model for capturing user’s interests.
• SR-GNN: a GNN-based model to capture complex dependency.
21
Dietmar Jannach et. al., “When Recurrent Neural Networks meet the Neighborhood for Session-Based Recommendation”, RecSys 2017
Diksha Garg et. al., “Sequence and Time Aware Neighborhood for Session-based Recommendations: STAN”, SIGIR 2019
Balázs Hidasi et. al., “Session-based Recommendations with Recurrent Neural Networks”, ICLR 2016.
Jing Li et. al., “Neural Attentive Session-based Recommendation” CIKM 2017.
Qiao Liu et. al., “STAMP: ShortTerm Attention/Memory Priority Model for Session-based Recommendation” KDD 2018.
Shu Wu et. al., “Session-Based Recommendation with Graph Neural Networks”, AAAI 2019.
23. Scalability: Ours vs. Competing Models
Training SLIST is much faster than training DNN-based models.
• Computational complexity is mainly proportional to the number of items.
• It is highly desirable in practice, especially on popular online services, where
millions of users create billions of session data every day.
23
Models
The ratio of training sessions on YC-1/4
5% 10% 20% 50% 100%
GRU4Rec+ 177.4 317.8 614.0 1600.2 3206.9
NARM 2137.6 3431.2 7454.2 27804.5 72076.8
STAMP 434.1 647.5 985.3 2081.2 4083.3
SR-GNN 6780.5 18014.7 31444.3 68581.9 185862.5
SLIST 202.7 199.0 199.9 227.0 241.9
Gains
(SLIST vs. SR-GNN)
33.5x 90.6x 157.3x 302.1x 768.2x
24. Ablation Study: Weight components
SLIST with all components is always better than others.
• For three datasets, considering recent items at the inference phase (𝑤𝑖𝑛𝑓) is
the most influential factor.
24
𝒘𝒊𝒏𝒇 𝒘𝒑𝒐𝒔 𝒘𝒕𝒊𝒎𝒆
YC-1/64 YC-1/4 DIGI1
HR MRR HR MRR HR MRR
O 0.6764 0.2697 0.6784 0.2723 0.5111 0.1817
O 0.6898 0.2980 0.6947 0.3005 0.5173 0.1852
O 0.6570 0.2651 0.6675 0.2693 0.5055 0.1789
O O 0.7078 0.3077 0.7096 0.3115 0.5253 0.1880
O O 0.6761 0.2697 0.6841 0.2753 0.5145 0.1820
O O 0.6880 0.2973 0.7004 0.3031 0.5202 0.1855
0.6592 0.2652 0.6626 0.2671 0.5025 0.1786
O O O 0.7088 0.3083 0.7175 0.3161 0.5291 0.1886
All Weights
Two Weights
One Weights
No Weights
26. Conclusion
We propose session-aware linear item models to fully capture
various characteristics of sessions.
• SLIST: session-aware linear similarity/transition model
Despite its simplicity, SLIST achieves competitive or state-of-the-art
performance on various datasets.
SLIST is highly scalable thanks to closed-form solutions of the linear
models.
• The computational complexity of SLIST is proportional to the number of
items, which is desirable in commercial systems.
26