I gave this talk at the 1st Budapest RecSys and Personalization Meetup about using deep learning to solve long-standing problems of recommender systems. I also presented our approach to using RNNs for session-based recommendations in detail.
Parallel Recurrent Neural Network Architectures for Feature-rich Session-base...Balázs Hidasi
Slides for my RecSys 2016 talk on integrating image and textual information into session-based recommendations using novel parallel RNN architectures.
Link to the paper: http://www.hidasi.eu/en/publications.html#p_rnn_recsys16
GRU4Rec v2 - Recurrent Neural Networks with Top-k Gains for Session-based Rec...Balázs Hidasi
Slides of my presentation at CIKM2018 about version 2 of the GRU4Rec algorithm, a recurrent neural network based algorithm for the session-based recommendation task.
We discuss sampling strategies and introduce additional sampling into the algorithm. We also redesign the loss function to cope with the additional samples. The resulting BPR-max loss function can efficiently handle many negative samples without running into the vanishing gradient problem. We also introduce constrained embeddings, which speed up the convergence of item representations and reduce memory usage by a factor of 4. Together, these improvements increase offline measures by up to 52%.
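As a rough illustration of the BPR-max idea, here is a minimal NumPy sketch of the loss described above (my own sketch with my own variable names, not the actual GRU4Rec code; the GitHub repository linked below is the authoritative implementation):

```python
import numpy as np

def bpr_max_loss(target_score, negative_scores, lam=1.0):
    """BPR-max loss for one target item against sampled negative items.

    Softmax weights over the negative scores focus the loss on the most
    relevant negatives, which is what avoids vanishing gradients when the
    number of negative samples is large.
    """
    s = np.exp(negative_scores - negative_scores.max())
    w = s / s.sum()                               # softmax weights over negatives
    sig = 1.0 / (1.0 + np.exp(-(target_score - negative_scores)))
    loss = -np.log(np.sum(w * sig) + 1e-12)       # weighted pairwise term
    reg = lam * np.sum(w * negative_scores ** 2)  # score regularization
    return loss + reg
```

Scoring the target item above the negatives drives the loss towards zero, while a low target score keeps it large.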
In the talk we also discuss online A/B tests and the implications of long-term observations. Most of these observations are exclusive to this talk and are not in the paper.
You can access the preprint version of the paper on arXiv: https://arxiv.org/abs/1706.03847
The code is available on GitHub: https://github.com/hidasib/GRU4Rec
Utilizing additional information in factorization methods (research overview,...Balázs Hidasi
This presentation contains the main points of my recommender systems research. It describes the arc of my research, starting from improving matrix factorization, through the development of my context-aware algorithms and addressing scalability issues, to developing a general factorization framework and dealing with context dimension modeling. The slides were presented at the Delft University of Technology, where I was invited to give this introductory talk as part of the collaboration between participants of the CrowdRec project. The presentation was given on 11 April 2014.
Context-aware preference modeling with factorizationBalázs Hidasi
This talk was presented at the Doctoral Symposium of RecSys'15. It is a summary of the core part of my PhD research over the last few years. The research revolves around solving the implicit-feedback-based context-aware recommendation problem with factorization.
Associated paper: http://dl.acm.org/citation.cfm?id=2796543
Details of presented algorithms/methods (public versions available on http://hidasi.eu):
iTALS: http://link.springer.com/chapter/10.1007/978-3-642-33486-3_5
iTALSx: http://www.infocommunications.hu/documents/169298/1025723/InfocomJ_2014_4_5_Hidasi.pdf
ALS-CG/CD: http://link.springer.com/article/10.1007/s10115-015-0863-2
GFF: http://link.springer.com/article/10.1007/s10618-015-0417-y
Deep Learning in Recommender Systems - RecSys Summer School 2017Balázs Hidasi
This is the presentation accompanying my tutorial on deep learning methods in the recommender systems domain. The tutorial consists of a brief general overview of deep learning and an introduction to the four most prominent research directions of DL in recsys as of 2017. Presented during the RecSys Summer School 2017 in Bolzano, Italy.
Deep learning: the future of recommendationsBalázs Hidasi
An informative talk about deep learning and its potential uses in recommender systems. Presented at the Budapest Startup Safary, 21 April, 2016.
The breakthroughs of the last decade in neural network research and the rapid increase in computational power resulted in the revival of deep neural networks and of the field focusing on their training: deep learning. Deep learning methods have succeeded in complex tasks where other machine learning methods have failed, such as computer vision and natural language processing. Recently, deep learning has begun to gain ground in recommender systems as well. This talk introduces deep learning and its applications, with emphasis on how deep learning methods can solve long-standing recommendation problems.
Lessons learnt at building recommendation services at industry scaleDomonkos Tikk
Industry day keynote presentation held at ECIR 2016, Padova. The talk presents the algorithmic, technical and business challenges Gravity R&D encountered while growing from a top Netflix Prize contender into a recommender system vendor company.
With the explosive growth of online information, recommender systems have become an effective tool to overcome information overload and promote sales. In recent years, deep learning's revolutionary advances in speech recognition, image analysis and natural language processing have gained significant attention. Meanwhile, recent studies also demonstrate its efficacy in coping with information retrieval and recommendation tasks. Applying deep learning techniques to recommender systems has been gaining momentum due to its state-of-the-art performance. In this talk, I will present recent developments in deep learning based recommender models and highlight some future challenges and open issues of this research field.
Artificial Intelligence Course: Linear models ananth
In this presentation we present the linear models: regression and classification, illustrated with several examples. Concepts such as underfitting (bias) and overfitting (variance) are presented. Linear models can be used as stand-alone classifiers for simple cases, and they are essential building blocks within larger deep learning networks.
Artificial Neural Networks have been used very successfully in several machine learning applications. They are often the building blocks of deep learning systems. We discuss the hypothesis, training with backpropagation, update methods, and regularization techniques.
Artificial Intelligence, Machine Learning and Deep LearningSujit Pal
Slides for talk Abhishek Sharma and I gave at the Gennovation tech talks (https://gennovationtalks.com/) at Genesis. The talk was part of outreach for the Deep Learning Enthusiasts meetup group at San Francisco. My part of the talk is covered from slides 19-34.
This is the first lecture on Applied Machine Learning. The course focuses on the emerging and modern aspects of this subject, such as Deep Learning, Recurrent and Recursive Neural Networks (RNN), Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN), and Hidden Markov Models (HMM). It deals with several application areas such as Natural Language Processing and Image Understanding. This presentation provides the landscape.
Deep Learning Models for Question AnsweringSujit Pal
Talk about a hobby project to apply Deep Learning models to predict answers to 8th grade science multiple choice questions for the Allen AI challenge on Kaggle.
An introduction to Machine Learning (and a little bit of Deep Learning)Thomas da Silva Paula
25-min talk about Machine Learning and a little bit of Deep Learning. Starts with some basic definitions (Supervised and Unsupervised Learning). Then, neural networks basic functionality is explained, ending up in Deep Learning and Convolutional Neural Networks.
Machine Learning Meetup that happened in Porto Alegre, Brazil.
Deep Learning For Practitioners, lecture 2: Selecting the right applications...ananth
In this presentation we articulate when deep learning techniques yield the best results from a practitioner's viewpoint. Do we apply deep learning techniques to every machine learning problem? What characteristics of an application make it suitable for deep learning? Does more data automatically imply better results regardless of the algorithm or model? Does "automated feature learning" obviate the need for data preprocessing and feature design?
Ranking and Diversity in Recommendations - RecSys Stammtisch at SoundCloud, B...Alexandros Karatzoglou
Slides from my talk at the RecSys Stammtisch at SoundCloud in Berlin. The presentation is split into two parts: one focusing on ranking and relevance, and one on diversity and how to achieve it using genres. We introduce a novel diversity metric called Binomial Diversity.
Embed, Encode, Attend, Predict – applying the 4 step NLP recipe for text clas...Sujit Pal
Slides for my talk at PyData Seattle 2017 about Matthew Honnibal's 4-step recipe for deep learning NLP pipelines. It describes the stages of the pipeline as well as 3 examples: document classification, document similarity and sentence similarity. The examples include custom Keras layers for different types of attention.
In this presentation we discuss the hypothesis of MaxEnt models, describe the role of feature functions and their applications to Natural Language Processing (NLP). The training of the classifier is discussed in a later presentation.
Deep Learning: Chapter 11 Practical MethodologyJason Tsai
Lecture for Deep Learning 101 study group to be held on June 9th, 2017.
Reference book: https://www.deeplearningbook.org/
Past video archives: https://goo.gl/hxermB
Initiated by Taiwan AI Group (https://www.facebook.com/groups/Taiwan.AI.Group/)
Deep Learning with Python: Getting started and getting from ideas to insights in minutes.
PyData Seattle 2015
Alex Korbonits (@korbonits)
This presentation was given July 25, 2015 at the PyData Seattle conference hosted by PyData and NumFocus.
Deep Learning Enabled Question Answering System to Automate Corporate HelpdeskSaurabh Saxena
Studied the feasibility of applying state-of-the-art deep learning models, like end-to-end memory networks and neural attention-based models, to the problem of machine comprehension and subsequent question answering in corporate settings with huge amounts of unstructured textual data. Used pre-trained embeddings like word2vec and GloVe to avoid huge training costs.
In this talk, presented at the UX Forum in Munich, I had a look at the Artificial Intelligence landscape, and the differences between Machine Learning and Deep Learning. The session walks through a big range of use cases in fields such as object and face recognition and showcases how the design industry is charging up with new powers.
Deep learning has renewed interest in computational creativity. Can machines be creative? In which sense? And why would this be useful? We argue that current creative AI systems are stuck: they explore combination, analogy or randomness, but the value of the objects is provided by the system designer.
The only way to creative AI is to develop agents building their own value.
We also argue that the generative potential of deep learning is understudied. The current focus is on likelihood, whereas creativity is unlikely.
We present an implementation of these ideas on the MNIST handwritten digits dataset - to create symbols that could have been digits (e.g. in an imaginary culture) but that are not.
In this session, I explain why user experience design for artificial intelligence matters. How you make machine learning transparent to users is one of the great design challenges of our time, and a necessary one.
+ Updated version for NEXT Conference Hamburg +
https://nextconf.eu/event/how-deep-learning-is-changing-the-design-process/
Deep learning is a new and exciting subfield of machine learning which attempts to sidestep the whole feature design process. This session explains how it derives from AI, why it quietly became a part of user experience and how it also changes the actual design workflow. The talk highlights a range of use cases and doesn’t forget to illustrate why user experience design for artificial intelligence matters the other way around.
Until 2014, gutefrage.net was drifting towards chaos, and chaos had many faces:
* Bad answers were shown to users, hiding the good and helpful ones
* Spam was overlooked in the huge amount of generated content
* Tags did not always represent the intended topic of the question
* Long page load times led users to abort before the site was fully rendered
These problems had bad consequences for the user experience, the manageability of our content, the image of our platform, and also for our revenue.
To tackle these big problems, we decided to leverage the full power of our data. We put great effort into automatically rating answers and hiding the really bad ones from every user, we alert our Community Management to spammer problems in real time, we improved the page load time tremendously, and we are currently testing prototypes for (semi-)automatically inferring the topic of a question.
I will show with some examples how we discovered the problems by making the data visible to everyone in the company, fixed them either with advanced machine learning techniques or by relying on the „collective brain“ of our community, and improved the user experience step by step until the chaos was finally defeated.
This is a slide deck from a presentation that my colleague Shirin Glander (https://www.slideshare.net/ShirinGlander/) and I did together. As we created our respective parts of the presentation on our own, it is quite easy to figure out who did which part, as the two slide decks look quite different ... :)
For the sake of simplicity and completeness, I just copied the two slide decks together. As I did the "surrounding" part, I added Shirin's part at the place where she took over and then added my concluding slides at the end. Well, I'm sure you will figure it out easily ... ;)
The presentation was intended to be an introduction to deep learning (DL) for people who are new to the topic. It starts with some DL success stories as motivation. Then a quick classification and a bit of history follows before the "how" part starts.
The first part of the "how" is some theory of DL, to demystify the topic and to explain and connect some of the most important terms on the one hand, but also to give an idea of the breadth of the topic on the other hand.
After that, the second part dives deeper into the question of how to actually implement DL networks. This part starts with coding everything on your own and then moves step by step towards less coding, depending on where you want to start.
The presentation ends with some pitfalls and challenges that you should have in mind if you want to dive deeper into DL - plus the invitation to become part of it.
As always the voice track of the presentation is missing. I hope that the slides are of some use for you, though.
This is a slide deck from a presentation that my colleague Uwe Friedrichsen (https://www.slideshare.net/ufried/) and I did together. As we created our respective parts of the presentation on our own, it is quite easy to figure out who did which part, as the two slide decks look quite different ... :)
For the sake of simplicity and completeness, Uwe copied the two slide decks together. As he did the "surrounding" part, he added my part at the place where I took over and then added concluding slides at the end. Well, I'm sure you will figure it out easily ... ;)
The presentation was intended to be an introduction to deep learning (DL) for people who are new to the topic. It starts with some DL success stories as motivation. Then a quick classification and a bit of history follows before the "how" part starts.
The first part of the "how" is some theory of DL, to demystify the topic and to explain and connect some of the most important terms on the one hand, but also to give an idea of the breadth of the topic on the other hand.
After that, the second part dives deeper into the question of how to actually implement DL networks. This part starts with coding everything on your own and then moves step by step towards less coding, depending on where you want to start.
The presentation ends with some pitfalls and challenges that you should have in mind if you want to dive deeper into DL - plus the invitation to become part of it.
As always the voice track of the presentation is missing. I hope that the slides are of some use for you, though.
Yuandong Tian at AI Frontiers : Planning in Reinforcement LearningAI Frontiers
Deep Reinforcement Learning (DRL) has made strong progress in many tasks, such as board games, robotics, navigation, neural architecture search, etc. I will present our recently open-sourced DRL frameworks that facilitate game research and development. Our framework is scalable, so we can reproduce AlphaGoZero and AlphaZero using 2000 GPUs, achieving a super-human Go AI that beats 4 top-30 professional players. We also show the usability of our platform by training agents in real-time strategy games, and show interesting behaviors obtained with a small amount of resources.
ODSC 2019: Sessionisation via stochastic periods for root event identificationKuldeep Jiwani
In today's world, the majority of information is generated by self-sustaining systems like various kinds of bots, crawlers, servers, online services, etc. This information flows along the axis of time and is generated by these actors under some complex logic. For example, a stream of buy/sell order requests from an order gateway in the financial world, a stream of web requests from a monitoring/crawling service in the web world, or maybe a hacker's bot sitting on the internet and attacking various computers. Although we may not be able to know the motive or intention behind these data sources, via some unsupervised techniques we can try to infer the pattern or correlate the events based on their multiple occurrences along the axis of time. Associating a chain of events in time order helps in doing root event analysis. In certain cases a time-ordered correlation and root event identification is good enough to automatically identify signatures of various malicious actors and take appropriate corrective actions to stop cyber attacks, malicious social campaigns, etc.
Sessionisation is one such unsupervised technique that tries to find the signal in a stream of timestamped events. In an ideal world it would reduce to finding the periods of a mixture of sinusoidal waves. In the real world this is a much more complex activity, as even the systematic events generated by machines over the internet behave quite erratically. So the notion of a period for a signal also changes in the real world: we can no longer associate it with a single number; it has to be treated as a random variable, with an expected value and an associated variance. Hence we need to model "stochastic periods" and learn their probability distributions in an unsupervised manner.
The main focus of this talk is to showcase applied data science techniques for discovering stochastic periods. There are many ways to obtain periods in data, so the journey begins with a walk-through of existing techniques like the FFT (Fast Fourier Transform) and then discusses Gaussian Mixture Models. After highlighting the shortcomings of these techniques, we will succinctly explain one of the most general non-parametric Bayesian approaches to the problem. Without going too deep into the complex math, we will get back to applied data science and discuss a much simpler technique that can solve the same problem if certain assumptions are satisfied.
In this talk we will demonstrate some time-based patterns we discovered while working on a security analytics use case that uses sessionisation. We will demonstrate such patterns on an open-source malware attack dataset that is publicly available.
Key concepts explained in the talk: sessionisation, Bayesian machine learning techniques, Gaussian Mixture Models, kernel density estimation, FFT, stochastic periods, probabilistic modelling, Bayesian non-parametric methods
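As a starting point for the techniques listed above, here is a small sketch (my own illustration, assuming the classical deterministic setting) of recovering a dominant period from timestamped events with an FFT; the stochastic-period treatment generalizes this single number to a distribution with mean and variance.

```python
import numpy as np

def dominant_period(timestamps, bin_size=1.0):
    """Estimate the dominant period of an event stream via the FFT.

    Bins the timestamps into a counting signal, takes its spectrum, and
    returns the period of the strongest non-DC frequency component.
    """
    t = np.asarray(timestamps, dtype=float)
    n_bins = int(np.ceil((t.max() - t.min()) / bin_size)) + 1
    counts, _ = np.histogram(t, bins=n_bins)
    spectrum = np.abs(np.fft.rfft(counts - counts.mean()))
    freqs = np.fft.rfftfreq(n_bins, d=bin_size)
    peak = spectrum[1:].argmax() + 1  # skip the DC component at index 0
    return 1.0 / freqs[peak]

# Events roughly every 5 seconds with jitter: the estimate lands near 5.
rng = np.random.default_rng(1)
events = 5.0 * np.arange(200) + 0.2 * rng.standard_normal(200)
period = dominant_period(events)
```

Real event streams with drifting, erratic periods smear this spectral peak, which is exactly why the talk moves beyond a single FFT peak to probabilistic models.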
Using Deep Learning to do Real-Time Scoring in Practical ApplicationsGreg Makowski
http://www.meetup.com/SF-Bay-ACM/events/227480571/
(see also YouTube for a recording of the presentation)
The talk will cover a brief review of neural network basics and the following types of deep learning networks:
* autocorrelational - unsupervised learning for extracting features. He will describe how additional layers build complexity in the feature extraction.
* convolutional - how to detect shift-invariant patterns in various data sources. Horizontal shift-invariant detection applies to signals like speech recognition or IoT data; horizontal and vertical shift invariance applies to images or videos, for faces or self-driving cars.
* details of applying deep net systems for continuous or real-time scoring
* reinforcement learning or Q-learning - such as learning how to play Atari video games
* continuous-space word models - such as word2vec, skip-gram training, NLP understanding and translation
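As a flavor of the skip-gram training mentioned in the last bullet, here is a minimal sketch (my own illustration) of how (center, context) training pairs are generated; word2vec learns its embeddings by training a classifier on exactly such pairs.

```python
def skipgram_pairs(tokens, window=2):
    """Generate (center, context) training pairs for a skip-gram model.

    Each word is paired with every word within `window` positions of it;
    the center word's embedding is then trained to predict its contexts.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs
```

For example, `skipgram_pairs(["the", "cat", "sat"], window=1)` pairs "cat" with both of its neighbors.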
These slides were designed for a talk at the IT meetup League of Geeks in Passau. They contain an introduction to the concept of TF and its major improvements in version 2.0. Furthermore, basics of machine and deep learning are explained. Finally, I explain how to do computer vision in TensorFlow 2.
The full talk can be found on YouTube: https://www.youtube.com/channel/UCycbEYf8CJSaAVCYgfMOAPQ
Code is on Github: https://github.com/sastemmler/leagueofgeeks
Impatience is a Virtue: Revisiting Disorder in High-Performance Log AnalyticsBadrish Chandramouli
There is a growing interest in processing real-time queries over out-of-order streams in this big data era. This paper presents a comprehensive solution to meet this requirement. Our solution is based on Impatience sort, an online sorting technique that is based on an old technique called Patience sort. Impatience sort is tailored for incrementally sorting streaming datasets that present themselves as almost sorted, usually due to network delays and machine failures. With several optimizations, our solution can adapt to both input streams and query logic. Further, we develop a new Impatience framework that leverages Impatience sort to reduce the latency and memory usage of query execution, and supports a range of user latency requirements, without compromising on query completeness and throughput, while leveraging existing efficient in-order streaming engines and operators. We evaluate our proposed solution in Trill, a high-performance streaming engine, and demonstrate that our techniques significantly improve sorting performance and reduce memory usage – in some cases, by over an order of magnitude.
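As background for readers unfamiliar with the underlying technique, here is a minimal run-based patience-sort sketch in Python (my own illustrative code, not the paper's Impatience sort): almost-sorted input forms very few ascending runs, so the final k-way merge stays cheap.

```python
import bisect
from heapq import merge

def impatience_style_sort(seq):
    """Run-based patience sort: each pile is an ascending run.
    Almost-sorted input yields very few piles, so the final k-way
    merge is cheap. (A simplified sketch of the idea the paper's
    Impatience sort builds on, not the paper's algorithm.)"""
    piles, neg_tops = [], []   # neg_tops[i] = -top(piles[i]), kept ascending
    for x in seq:
        # leftmost pile whose top <= x can accept x and stay sorted
        i = bisect.bisect_left(neg_tops, -x)
        if i == len(piles):
            piles.append([x])
            neg_tops.append(-x)
        else:
            piles[i].append(x)
            neg_tops[i] = -x
    return list(merge(*piles))
```

On a fully sorted stream this builds a single pile, so sorting degenerates to a single pass, which is the property that makes the approach attractive for streams that are only mildly out of order.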
A presentation I did for China GDC 2011.
I cover the basics of visibility optimization and present some practical examples of visibility systems used in modern video games.
Approximate "Now" is Better Than Accurate "Later"NUS-ISS
How does Twitter track the top trending topics?
How does Amazon keep track of the top-selling items for the day?
How many cabs have been booked this month using your App?
Is the password that a new user is choosing a common/compromised password?
Modern web-scale systems process billions of transactions and generate terabytes of data every single day. To find answers to questions against this data, one would initiate a multi-minute query against a NoSQL datastore or kick off a batch job written in a distributed processing framework such as Spark or Flink. However, these jobs are throughput-heavy and not suited for real-time low-latency queries, even though you and your customers would like to have all this information "right now".
At the end of this talk, you'll realize that you can power these low-latency queries with an incredibly low memory footprint IF you are willing to accept answers that are, say, 96-99% accurate. This talk introduces some of the go-to probabilistic data structures used by organisations with large amounts of data - specifically the Bloom filter, Count-Min Sketch, and HyperLogLog.
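As a taste of the topic, here is a toy Bloom filter sketch in Python (illustrative only; the class name, sizes, and the derive-k-hashes-from-SHA-256 scheme are my own choices, not from the talk). "Not present" answers are exact; "present" answers are correct with high probability.

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter: k hash positions over an m-bit array.
    Membership answers are exact for "no", probabilistic for "yes"."""
    def __init__(self, m_bits=8192, k=4):
        self.m, self.k = m_bits, k
        self.bits = bytearray(m_bits // 8)

    def _positions(self, item):
        # derive k independent positions by salting a cryptographic hash
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))

# e.g. the talk's fourth motivating question: is a chosen password
# among known compromised passwords?
compromised = BloomFilter()
for pw in ["123456", "password", "qwerty"]:
    compromised.add(pw)
print("password" in compromised)   # membership test in O(k), tiny memory
```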
Egyedi termék kreatívok tömeges gyártása generatív AI segítségévelBalázs Hidasi
UPDATE: Typo on the 8th slide, last line should be (slides can't be modified on slideshare):
grad(log(p_gamma(x|y))) = (1-gamma)*grad(log(p(x))) + gamma*grad(log(p(x|y)))
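The corrected line can be checked by writing the guided distribution as a geometric mixture of the unconditional and conditional models (a standard classifier-free-guidance-style derivation; the notation p_gamma follows the slide):

```latex
p_\gamma(x|y) \propto p(x)^{1-\gamma}\, p(x|y)^{\gamma}
\;\Rightarrow\;
\log p_\gamma(x|y) = (1-\gamma)\log p(x) + \gamma\log p(x|y) + \mathrm{const}
\;\Rightarrow\;
\nabla_x \log p_\gamma(x|y) = (1-\gamma)\,\nabla_x \log p(x) + \gamma\,\nabla_x \log p(x|y)
```

The additive constant is the log normalizer, which does not depend on x and so vanishes under the gradient.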
My presentation on using generative AI for creative generation for e-commerce. Presented on 14 November 2023 at the TECH meetup series organized by Gravity R&D, a Taboola company. Slides are in Hungarian.
*****
Title/abstract in English:
Mass production of unique product creatives with generative AI
-----
The probability of a user clicking on an online advertisement is greatly influenced by the creative's look. Traditional brand-level campaigns require only a few creatives, which can be produced by humans. However, product-level recommendations require creatives for every single product. Producing these with human work is infeasible at scale, thus products are often shown in front of simple (e.g. white) backgrounds. This presentation showcases a solution based on generative AI that allows placing products in different environments, which makes the creatives more appealing. I'll talk about the challenges of this approach along with potential solutions, as well as the initial results of our live test.
*****
Original abstract (translated from Hungarian):
The appearance of an online advertisement greatly influences the probability of clicking on it. For traditional brand-level targeted campaigns, producing the one or two required creatives/banners is feasible even with human effort. For product-level recommendations, however, a separate creative is needed for every single product, possibly in several resolutions. Producing a large number of creatives with human labor is slow and expensive, so a common approach is to display the product in front of a simple, e.g. single-color, background. In this talk we present a solution based on generative AI technology that makes it possible to display products in various environments and thus make the creatives more interesting and appealing. We discuss the difficulties of this approach, possible solutions, and the preliminary results of our measurements of the method's effectiveness.
The Effect of Third Party Implementations on ReproducibilityBalázs Hidasi
Presentation of "The Effect of Third Party Implementations on Reproducibility" paper from RecSys 2023.
Abstract:
Reproducibility of recommender systems research has come under scrutiny during recent years. Along with works focusing on repeating experiments with certain algorithms, the research community has also started discussing various aspects of evaluation and how these affect reproducibility. We add a novel angle to this discussion by examining how unofficial third-party implementations could benefit or hinder reproducibility. Besides giving a general overview, we thoroughly examine six third-party implementations of a popular recommender algorithm and compare them to the official version on five public datasets. In the light of our alarming findings we aim to draw the attention of the research community to this neglected aspect of reproducibility.
Context aware factorization methods for implicit feedback based recommendatio...Balázs Hidasi
Slides I prepared for defending my PhD dissertation on context-aware factorization methods for implicit-feedback based recommendations. Dissertation (in English) can be accessed here: http://hidasi.eu/content/phd.pdf Slides are in Hungarian.
Az implicit ajánlási probléma és néhány megoldása (BME TMIT szeminárium előad...Balázs Hidasi
This slide deck was made for a popular-science talk.
The topic of the talk is implicit feedback based recommendation (where user preferences cannot be read directly from the data) and some possible solutions to the problem. After introducing the problem, the presentation covers some of my research results, such as the initialization of matrix factorization and implicit tensor factorization.
The talk took place in the summer of 2012 at a seminar organized by the Department of Telecommunications and Media Informatics (TMIT) of BME.
Context-aware similarities within the factorization framework (CaRR 2013 pres...Balázs Hidasi
This presentation is about an interesting side project of my main research in recommender systems. It is about the preliminary examination of context-aware similarities in the factorization framework.
This work is at the intersection of the following areas: (1) implicit feedback based recommendations; (2) context / context-awareness; (3) item-to-item recommendations; (4) matrix / tensor factorization. The aim of this work is to examine whether context can be used to compute more accurate item similarities based on item feature vectors. Two levels of context-aware similarity are introduced: (1) context is only used during training, but not for computing the similarity; (2) context is used both during training and for the similarity computations.
This presentation was given at the 3rd workshop on Context-awareness in Retrieval and Recommendations (CaRR 2013) in Rome.
iTALS: implicit tensor factorization for context-aware recommendations (ECML/...Balázs Hidasi
This presentation is about the context-aware recommender algorithm iTALS.
iTALS is a context-aware recommender algorithm for implicit feedback data. The user-item-context(s) setup is modelled as a binary tensor. Weights are assigned to the cells based on the certainty of their information. An ALS-based algorithm is proposed that can efficiently factorize this tensor. Additionally, a novel context type is introduced: sequentiality. This context allows us to incorporate association-rule-like information into the factorization framework and to differentiate between items with different repetitiveness patterns, and thus to make recommendations more accurate.
This presentation was originally given at ECML/PKDD 2012 in Bristol.
Initialization of matrix factorization (CaRR 2012 presentation)Balázs Hidasi
This presentation is about why initialization of matrix factorization methods is important and proposes an interesting initialization method (coined SimFactor). The method revolves around a similarity preserving dimensionality reduction technique. Context-based initialization is introduced as well.
Like most of my recommender systems research, this presentation focuses on implicit feedback (the case where user preferences are not coded explicitly in the data).
Originally presented at the 2nd workshop on Context-awareness in Retrieval and Recommendations (CaRR 2012) in Lisbon.
ShiftTree: model alapú idősor-osztályozó (VK 2009 előadás)Balázs Hidasi
The topic of this presentation is ShiftTree, a unique, model-based time series classifier.
ShiftTree is a unique, model-based approach to the time series classification problem. The basic idea is that we assign an eye (cursor) to each time series, pointing to a given position on the time axis. We create dynamic attributes by answering two questions: (1) Where to look on the time axis? (2) What to look at at that point? The answer to the first question tells us how to move the eye along the time axis. The second answer defines how to compute the value of the dynamic attribute at that point. These dynamic attributes are then used in a binary decision tree.
This slide deck presents an early (2009) version of ShiftTree.
The presentation was given at the 2009 Graduating Students' Conference (Végzős Konferencia).
Note: for some reason SlideShare does not support animations, so the animated slides were split into multiple slides.
ShiftTree: model alapú idősor-osztályozó (ML@BP előadás, 2012)Balázs Hidasi
The topic of this presentation is ShiftTree, a unique, model-based time series classifier.
ShiftTree is a unique, model-based approach to the time series classification problem. The basic idea is that we assign an eye (cursor) to each time series, pointing to a given position on the time axis. We create dynamic attributes by answering two questions: (1) Where to look on the time axis? (2) What to look at at that point? The answer to the first question tells us how to move the eye along the time axis. The second answer defines how to compute the value of the dynamic attribute at that point. These dynamic attributes are then used in a binary decision tree.
This is the most complete of the presentations about ShiftTree. It contains several extensions and describes some solutions that came up during the research but eventually proved to be dead ends.
The presentation belongs to a February 2012 talk given as part of the ML@BP event series.
Note: for some reason SlideShare does not support animations, so the animated slides were split into multiple slides.
ShiftTree: model based time series classifier (ECML/PKDD 2011 presentation)Balázs Hidasi
This slideshow is about the time series classifier algorithm, ShiftTree.
ShiftTree is a unique, model-based approach to time series classification. The basic idea is that we assign a cursor (or eye) to each series and move it to certain positions on the time axis. We generate dynamic attributes by answering two questions: (1) Where to look? (2) What to look at? The answer to the first question tells us where to move the cursor (e.g. forward 100 steps, to the previous local maximum, etc.), while the second answer defines the calculation of the dynamic attribute (e.g. the value at that point, the weighted average of the values around the position, the difference between the current and previous cursor positions, etc.). These dynamic attributes are then used in a binary decision tree.
This slideshow was originally presented at ECML/PKDD 2011 in Athens.
Note that for whatever reasons SlideShare doesn't support animations. Therefore the animated slides were split into multiple slides.
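To make the cursor idea concrete, here is a toy Python sketch (my own illustration; the operator names, the fixed threshold, and the single-attribute "stump" are simplifications, since a real ShiftTree learns its operators, attributes, and splits from data):

```python
def next_local_max(series, pos):
    """Cursor operator ("where to look?"): move forward to the next
    local maximum, or to the end of the series if there is none."""
    for t in range(max(pos, 1), len(series) - 1):
        if series[t - 1] < series[t] >= series[t + 1]:
            return t
    return len(series) - 1

def value_at(series, pos):
    """Attribute operator ("what to look at?"): the value under the cursor."""
    return series[pos]

def dynamic_attribute(series):
    """One (cursor operator, attribute operator) pair = one dynamic attribute."""
    return value_at(series, next_local_max(series, 0))

def classify(series, threshold=5.0):
    """A stump-like split on the dynamic attribute; a full ShiftTree
    chains such splits into a binary decision tree."""
    return "class A" if dynamic_attribute(series) >= threshold else "class B"

print(classify([0, 9, 0, 1, 0]))   # first local max is high  -> "class A"
print(classify([0, 1, 0, 9, 0]))   # first local max is low   -> "class B"
```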
What are greenhouse gases and how many gases affect the Earth?moosaasad1975
What are greenhouse gases, how do they affect the Earth and its environment, what is the future of the environment and the Earth, and how do the weather and the climate change as a result?
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.Sérgio Sacani
The return of a sample of near-surface atmosphere from Mars would facilitate answers to several first-order science questions surrounding the formation and evolution of the planet. One of the important aspects of terrestrial planet formation in general is the role that primary atmospheres played in influencing the chemistry and structure of the planets and their antecedents. Studies of the martian atmosphere can be used to investigate the role of a primary atmosphere in its history. Atmosphere samples would also inform our understanding of the near-surface chemistry of the planet, and ultimately the prospects for life. High-precision isotopic analyses of constituent gases are needed to address these questions, requiring that the analyses are made on returned samples rather than in situ.
Multi-source connectivity as the driver of solar wind variability in the heli...Sérgio Sacani
The ambient solar wind that fills the heliosphere originates from multiple sources in the solar corona and is highly structured. It is often described as high-speed, relatively homogeneous plasma streams from coronal holes and slow-speed, highly variable streams whose source regions are under debate. A key goal of ESA/NASA's Solar Orbiter mission is to identify solar wind sources and understand what drives the complexity seen in the heliosphere. By combining magnetic field modelling and spectroscopic techniques with high-resolution observations and measurements, we show that the solar wind variability detected in situ by Solar Orbiter in March 2022 is driven by spatio-temporal changes in the magnetic connectivity to multiple sources in the solar atmosphere. The magnetic field footpoints connected to the spacecraft moved from the boundaries of a coronal hole to one active region (12961) and then across to another region (12957). This is reflected in the in situ measurements, which show the transition from fast to highly Alfvénic then to slow solar wind that is disrupted by the arrival of a coronal mass ejection. Our results describe solar wind variability at 0.5 au but are applicable to near-Earth observatories.
Richard's entangled adventures in wonderlandRichard Gill
Since the loophole-free Bell experiments of 2020 and the Nobel prizes in physics of 2022, critics of Bell's work have retreated to the fortress of super-determinism. Now, super-determinism is a derogatory word - it just means "determinism". Palmer, Hance and Hossenfelder argue that quantum mechanics and determinism are not incompatible, using a sophisticated mathematical construction based on a subtle thinning of allowed states and measurements in quantum mechanics, such that what is left appears to make Bell's argument fail, without altering the empirical predictions of quantum mechanics. I think however that it is a smoke screen, and the slogan "lost in math" comes to my mind. I will discuss some other recent disproofs of Bell's theorem using the language of causality based on causal graphs. Causal thinking is also central to law and justice. I will mention surprising connections to my work on serial killer nurse cases, in particular the Dutch case of Lucia de Berk and the current UK case of Lucy Letby.
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...Scintica Instrumentation
Intravital microscopy (IVM) is a powerful tool used to study cellular behavior over time and space in vivo. Much of our understanding of cell biology has been achieved using various in vitro and ex vivo methods; however, these studies do not necessarily reflect the natural dynamics of biological processes. Unlike traditional cell culture or fixed tissue imaging, IVM allows for ultra-fast, high-resolution imaging of cellular processes over time and space in their natural environment. Real-time visualization of biological processes in the context of an intact organism helps maintain physiological relevance and provides insights into the progression of disease, responses to treatment, and developmental processes.
In this webinar we give an overview of advanced applications of the IVM system in preclinical research. IVIM Technology is a provider of all-in-one intravital microscopy systems and solutions optimized for in vivo imaging of live animal models at sub-micron resolution. The system's unique features and user-friendly software enable researchers to probe fast, dynamic biological processes such as immune cell tracking, cell-cell interaction, vascularization, and tumor metastasis in exceptional detail. This webinar also gives an overview of IVM in drug development, offering a view into the intricate interactions between drugs/nanoparticles and tissues in vivo and allowing the evaluation of therapeutic interventions in a variety of tissues and organs. This interdisciplinary collaboration continues to drive the advancement of novel therapeutic strategies.
Cancer cell metabolism: special Reference to Lactate PathwayAADYARAJPANDEY1
Normal Cell Metabolism:
Cellular respiration describes the series of steps that cells use to break down sugar and other chemicals to get the energy they need to function.
Energy is stored in the bonds of glucose and when glucose is broken down, much of that energy is released.
Cells utilize energy in the form of ATP.
The first step of respiration is called glycolysis. In a series of steps, glycolysis breaks glucose into two smaller molecules - a chemical called pyruvate. A small amount of ATP is formed during this process.
Most healthy cells continue the breakdown in a second process, called the Krebs cycle. The Krebs cycle allows cells to "burn" the pyruvate made in glycolysis to get more ATP.
The last step in the breakdown of glucose is called oxidative phosphorylation (Ox-Phos).
It takes place in specialized cell structures called mitochondria. This process produces a large amount of ATP. Importantly, cells need oxygen to complete oxidative phosphorylation.
If a cell completes only glycolysis, only 2 molecules of ATP are made per glucose. However, if the cell completes the entire respiration process (glycolysis - Krebs cycle - oxidative phosphorylation), about 36 molecules of ATP are created, giving it much more energy to use.
IN CANCER CELL:
Unlike healthy cells that "burn" the entire molecule of sugar to capture a large amount of energy as ATP, cancer cells are wasteful.
Cancer cells only partially break down sugar molecules. They overuse the first step of respiration, glycolysis. They frequently do not complete the second step, oxidative phosphorylation.
This results in only 2 molecules of ATP per each glucose molecule instead of the 36 or so ATPs healthy cells gain. As a result, cancer cells need to use a lot more sugar molecules to get enough energy to survive.
Introduction to the Warburg phenomenon:
Warburg effect: usually, cancer cells are highly glycolytic (glucose addiction) and take up more glucose from outside than normal cells do.
Otto Heinrich Warburg (8 October 1883 - 1 August 1970) was awarded the Nobel Prize in Physiology or Medicine in 1931 for his "discovery of the nature and mode of action of the respiratory enzyme".
Warburg effect: the tendency of cancer cells under aerobic (well-oxygenated) conditions to metabolize glucose to lactate (aerobic glycolysis) is known as the Warburg effect. Warburg made the observation that tumor slices consume glucose and secrete lactate at a higher rate than normal tissues.
This PDF is about schizophrenia.
For more details, visit the YouTube channel @SELF-EXPLANATORY:
https://www.youtube.com/channel/UCAiarMZDNhe1A3Rnpr_WkzA/videos
Nutraceutical market, scope and growth: Herbal drug technologyLokesh Patil
As consumer awareness of health and wellness rises, the nutraceutical market (which includes goods like functional foods, drinks, and dietary supplements that provide health benefits beyond basic nutrition) is growing significantly. As healthcare expenses rise, the population ages, and people increasingly want natural and preventative health solutions, this industry is expanding quickly. Product formulation innovations and the use of cutting-edge technology for customized nutrition further drive market expansion. With its worldwide reach, the nutraceutical industry is expected to keep growing and to provide significant opportunities for research and investment in a number of categories, including vitamins, minerals, probiotics, and herbal supplements.
Deep learning to the rescue - solving long standing problems of recommender systems
1. Deep learning to the rescue
solving long standing problems of recommender systems
Balázs Hidasi
@balazshidasi
Budapest RecSys & Personalization meetup
12 May, 2016
2. What is deep learning?
• A class of machine learning algorithms
that use a cascade of multiple non-linear
processing layers
and complex model structures
to learn different representations of the
data in each layer
where higher level features are derived
from lower level features
to form a hierarchical representation.
• Key component of recent technologies
Speech recognition
Personal assistants (e.g. Siri, Cortana)
Computer vision, object recognition
Machine translation
Chatbot technology
Face recognition
Self driving cars
• An efficient tool for certain complex
problems
Pattern recognition
Computer vision
Natural language processing
Speech recognition
• Deep learning is NOT
the true AI
o it may be a component of it when and
if AI is created
how the human brain works
the best solution to every machine learning task
4. Why is deep learning happening now?
• Actually it is not: the first papers were published in the 1970s
• Third resurgence of neural networks
Research breakthroughs
Increase in computational power
GP GPUs
• Problems and their solutions
Vanishing gradients
o Problem: sigmoid-type activation functions easily saturate; gradients are small, so in deeper layers updates become almost zero
o Solution: earlier, layer-by-layer pretraining; recently, non-saturating activation functions
Gradient descent
o Problem: first-order methods (e.g. SGD) easily get stuck; second-order methods are infeasible on larger data
o Solution: adaptive training (Adagrad, Adam, Adadelta, RMSProp); Nesterov momentum
Regularization
o Problem: networks easily overfit (even with L2 regularization)
o Solution: dropout
Etc.
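The vanishing-gradients problem above can be illustrated numerically (a toy sketch with made-up numbers, not from the slides):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# A saturated sigmoid unit (large pre-activation) has a near-zero derivative:
x = 5.0
sig_deriv = sigmoid(x) * (1 - sigmoid(x))    # sigma'(x), roughly 0.0066 here

# Backpropagating through 10 such saturated layers multiplies these small
# factors, so the gradient reaching the early layers almost vanishes:
deep_sigmoid_grad = sig_deriv ** 10

# A non-saturating activation such as ReLU has derivative exactly 1 on its
# active side, so the same product does not decay:
deep_relu_grad = 1.0 ** 10                   # d/dx max(0, x) = 1 for x > 0

print(sig_deriv, deep_sigmoid_grad, deep_relu_grad)
```

This is exactly why the slide contrasts layer-by-layer pretraining (the older workaround) with non-saturating activation functions (the recent fix).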
5. Challenges in RecSys
• Recommender systems ≠ Netflix challenge
Rating prediction → Top-N recommendation (ranking)
Explicit feedback → Implicit feedback
Long user histories → Sessions
Slowly changing taste → Goal-oriented browsing
Item-to-user only → Other scenarios
• Success of CF is domain dependent
Human brain is a powerful feature extractor
Cold-start
o CF can't be used
o Decisions are rarely made on metadata
o But rather on what the user sees: e.g. product image, content itself
6. Session-based recommendations
• Permanent cold start
User identification
o Possible but often not reliable
Intent/theme
o What does the user need?
o Theme of the session
Never/rarely returning users
• Workaround in practice
Item-to-item recommendations
o Similar items
o Co-occurring items
Non-personalized
Not adaptive
7. Recurrent Neural Networks
• Hidden state
Next hidden state depends on the input and the current hidden state (recurrence)
h_t = tanh(W x_t + U h_{t-1})
• "Infinite depth"
• Backpropagation Through Time
• Exploding gradients
Due to recurrence
If the spectral radius of U > 1 (necessary condition)
• Lack of long-term memory (vanishing gradients)
Gradients of earlier states vanish
If the spectral radius of U < 1 (sufficient condition)
[Figure: RNN unrolled through time: each hidden state h_t is computed from the input x_t and the previous hidden state h_{t-1}]
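The recurrence on this slide can be sketched in a few lines of NumPy (toy sizes and random weights are my own choices):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h = 3, 4
W = rng.normal(scale=0.5, size=(d_h, d_in))   # input weights
U = rng.normal(scale=0.5, size=(d_h, d_h))    # recurrent weights

def rnn_step(x_t, h_prev):
    """The slide's recurrence: h_t = tanh(W x_t + U h_{t-1})."""
    return np.tanh(W @ x_t + U @ h_prev)

# Unrolling over a sequence is the "infinite depth" view trained with BPTT:
h = np.zeros(d_h)
for x_t in rng.normal(size=(5, d_in)):
    h = rnn_step(x_t, h)

# The spectral radius of U governs exploding (>1) / vanishing (<1) gradients:
rho = max(abs(np.linalg.eigvals(U)))
print(h, rho)
```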
8. Advanced RNN units
• Long Short-Term Memory (LSTM)
Memory cell (c_t) is the mix of
o its previous value (governed by the forget gate f_t)
o the cell value candidate c̃_t (governed by the input gate i_t)
Cell value candidate depends on the input and the previous hidden state
Hidden state is the memory cell regulated by the output gate (o_t)
No vanishing/exploding gradients
o f_t = σ(W_f x_t + U_f h_{t-1} + V_f c_{t-1})
o i_t = σ(W_i x_t + U_i h_{t-1} + V_i c_{t-1})
o o_t = σ(W_o x_t + U_o h_{t-1} + V_o c_{t-1})
o c̃_t = tanh(W x_t + U h_{t-1})
o c_t = f_t ∘ c_{t-1} + i_t ∘ c̃_t
o h_t = o_t ∘ c_t
• Gated Recurrent Unit (GRU)
Hidden state is the mix of
o the previous hidden state
o the hidden state candidate h̃_t
o governed by the update gate z_t (merged input and forget gate)
Hidden state candidate depends on the input and the previous hidden state through a reset gate (r_t)
Similar performance to LSTM, with fewer calculations
o z_t = σ(W_z x_t + U_z h_{t-1})
o r_t = σ(W_r x_t + U_r h_{t-1})
o h̃_t = tanh(W x_t + U(r_t ∘ h_{t-1}))
o h_t = (1 - z_t) ∘ h_{t-1} + z_t ∘ h̃_t
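The GRU equations above can be turned into a small NumPy sketch (toy dimensions and weight scales are my own choices):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, P):
    """One GRU step, following the slide's equations."""
    z = sigmoid(P["Wz"] @ x_t + P["Uz"] @ h_prev)           # update gate
    r = sigmoid(P["Wr"] @ x_t + P["Ur"] @ h_prev)           # reset gate
    h_cand = np.tanh(P["W"] @ x_t + P["U"] @ (r * h_prev))  # candidate state
    return (1 - z) * h_prev + z * h_cand                    # mix old and new

rng = np.random.default_rng(1)
d_in, d_h = 3, 4
P = {name: rng.normal(scale=0.1,
                      size=(d_h, d_in if name.startswith("W") else d_h))
     for name in ["Wz", "Wr", "W", "Uz", "Ur", "U"]}

h = np.zeros(d_h)
for x_t in rng.normal(size=(5, d_in)):
    h = gru_step(x_t, h, P)
print(h)
```

Because h_t is a convex combination of h_{t-1} and the candidate, the update gate can keep old state almost unchanged, which is what mitigates the vanishing-gradient problem of the plain RNN.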
10. Session modeling with RNNs
• Input: actual item of the session
• Output: scores on items for being the next in the event stream
• GRU-based RNN
Plain RNN is worse
LSTM is slower (same accuracy)
• Optional embedding and feedforward layers
Better results without them
• Number of layers
1 gave the best performance
Sessions span short timeframes, so there is no need for modeling on multiple scales
• Requires some adaptation
[Architecture: Input: actual item, 1-of-N coding → Embedding layer (optional) → GRU layer(s) → Feedforward layers (optional) → Output: scores on all items]
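A minimal NumPy sketch of this architecture, assuming a toy 6-item catalog, a single GRU layer, no embedding or feedforward layers (the slide reports better results without them), and random untrained weights of my own choosing:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(2)
n_items, d_h = 6, 4                       # toy catalog and hidden size
Wz, Wr, W = (rng.normal(scale=0.1, size=(d_h, n_items)) for _ in range(3))
Uz, Ur, U = (rng.normal(scale=0.1, size=(d_h, d_h)) for _ in range(3))
Wy = rng.normal(scale=0.1, size=(n_items, d_h))   # output layer weights

def step(item_id, h):
    """Feed the session's current item (1-of-N coded) through the GRU
    layer and score every item for being the next event."""
    x = np.zeros(n_items)
    x[item_id] = 1.0
    z = sigmoid(Wz @ x + Uz @ h)
    r = sigmoid(Wr @ x + Ur @ h)
    h_cand = np.tanh(W @ x + U @ (r * h))
    h_new = (1 - z) * h + z * h_cand
    return h_new, Wy @ h_new              # hidden state, scores on all items

h = np.zeros(d_h)
for item in [2, 0, 5]:                    # a session: clicks on items 2, 0, 5
    h, scores = step(item, h)
ranking = scores.argsort()[::-1]          # ranked next-item recommendations
print(ranking)
```

The hidden state carries the session so far, so each click refines the next-item scores without any user profile.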
11. Adaptation: session parallel mini-batches
• Motivation
High variance in the length of the sessions (from 2 to 100s of events)
The goal is to capture how sessions evolve
• Mini-batch
Input: current events
Output: next events
[Figure: five sessions laid out in parallel, e.g. Session1 = i(1,1) i(1,2) i(1,3) i(1,4), Session2 = i(2,1) i(2,2) i(2,3), etc.; at each step the input is the item of the actual event of each active session and the desired output is the next item in its event stream]
• Active sessions
The first X sessions form the batch
Finished sessions are replaced by the next available session
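The batching scheme above can be sketched in plain Python (my own simplified implementation; resetting the hidden state of refilled slots is omitted):

```python
def session_parallel_batches(sessions, batch_size):
    """Yield (input, target) mini-batches over sessions laid out in
    parallel: each slot holds one active session; the input is its
    current event and the target the next one. When a session runs out
    of events, its slot is refilled with the next available session."""
    sess = [s for s in sessions if len(s) >= 2]   # need at least one transition
    next_new = min(batch_size, len(sess))
    slots = list(range(next_new))                 # session index held by each slot
    pos = [0] * len(slots)                        # current position per slot
    while slots:
        yield ([sess[s][p] for s, p in zip(slots, pos)],
               [sess[s][p + 1] for s, p in zip(slots, pos)])
        new_slots, new_pos = [], []
        for s, p in zip(slots, pos):
            if p + 2 < len(sess[s]):              # session has another transition
                new_slots.append(s)
                new_pos.append(p + 1)
            elif next_new < len(sess):            # replace the finished session
                new_slots.append(next_new)
                new_pos.append(0)
                next_new += 1
        slots, pos = new_slots, new_pos

batches = list(session_parallel_batches([[1, 2, 3], [4, 5], [6, 7, 8, 9]], 2))
print(batches)  # current-event / next-event pairs, sessions advancing in parallel
```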
12. Adaptation: pairwise loss function
• Motivation
Goal of a recommender: ranking
Pairwise and pointwise ranking (listwise is costly)
Pairwise is often better
• Pairwise loss functions
Positive items compared to negatives
BPR
o Bayesian Personalized Ranking
o L = -(1/N_S) * Σ_{j=1..N_S} log σ(r_{s,i} - r_{s,j})
TOP1
o Regularized approximation of the relative rank of the positive item
o L = (1/N_S) * Σ_{j=1..N_S} [σ(r_{s,j} - r_{s,i}) + σ(r_{s,j}²)]
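Both losses are easy to state in NumPy, given the score of the positive item r_{s,i} and the scores of the sampled negatives r_{s,j} (the toy numbers are my own):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bpr_loss(r_pos, r_neg):
    """BPR: L = -(1/N_S) * sum_j log sigma(r_{s,i} - r_{s,j})."""
    return -np.mean(np.log(sigmoid(r_pos - r_neg)))

def top1_loss(r_pos, r_neg):
    """TOP1: L = (1/N_S) * sum_j [sigma(r_{s,j} - r_{s,i}) + sigma(r_{s,j}^2)],
    a regularized approximation of the positive item's relative rank."""
    return np.mean(sigmoid(r_neg - r_pos) + sigmoid(r_neg ** 2))

r_pos = 2.0                            # score of the positive (next) item
r_neg = np.array([-1.0, 0.5, 1.0])     # scores of sampled negative items
print(bpr_loss(r_pos, r_neg), top1_loss(r_pos, r_neg))
```

Both losses shrink as the positive item's score rises above the negatives'; the extra sigma(r_{s,j}^2) term in TOP1 additionally pushes negative scores toward zero.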
13. Adaptation: sampling the output
• Motivation
The number of items is high → bottleneck
The model needs to be trained frequently (should be quick)
• Sampling negative items
Popularity-based sampling
o A missing event on a popular item is more likely a sign of negative feedback
o Popular items often get large scores → faster learning
Negative items for an example: the desired items of the other examples in the mini-batch
o Technical benefits
o Follows the data distribution (popularity sampling)
[Figure: for a mini-batch with desired items i1, i5, i8, scores are computed only on these items; the desired output score is 1 for each example's own positive item and 0 for the other, sampled negative, items; all other outputs are inactive (not computed)]
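A NumPy sketch of this in-batch negative sampling scheme, with toy sizes and random untrained weights of my own choosing:

```python
import numpy as np

rng = np.random.default_rng(3)
n_items, d_h, B = 10, 4, 3
H = rng.normal(size=(B, d_h))          # hidden state of each session in the batch
Wy = rng.normal(size=(n_items, d_h))   # output layer weights
batch = np.array([1, 5, 8])            # desired (positive) item of each example

# Scores are computed only on the mini-batch's desired items: the diagonal
# holds each example's positive item, the off-diagonal entries serve as its
# negative samples; all other outputs stay inactive (never computed).
S = H @ Wy[batch].T                    # (B, B) score matrix

r_pos = np.diag(S)                                     # r_{s,i}
r_neg = S[~np.eye(B, dtype=bool)].reshape(B, B - 1)    # r_{s,j}

# Desired output scores: 1 for the positive, 0 for the sampled negatives.
target = np.eye(B)

print(S.shape, r_pos.shape, r_neg.shape)
```

Since popular items occur in many sessions, they are drawn as negatives more often, so the scheme automatically follows the data distribution (popularity sampling), exactly as the slide argues.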
16. The next step in recsys technology
• is deep learning
• Besides session modelling
Incorporating content into the model directly
Modeling complex context-states based on sensory data
(IoT)
Optimizing recommendations through deep reinforcement
learning
• Would you like to try something in this area?
Submit to DLRS 2016
dlrs-workshop.org
17. Thank you!
Detailed description of the RNN approach:
• B. Hidasi, A. Karatzoglou, L. Baltrunas, D. Tikk: Session-based recommendations with recurrent neural networks. ICLR 2016.
• http://arxiv.org/abs/1511.06939
• Public code: https://github.com/hidasib/GRU4Rec