In recent years, Location-Based Social Network (LBSN) data have strongly fostered a data-driven approach to the recommendation of Points of Interest (POIs) in the tourism domain. However, an important aspect that is often not taken into account by current approaches is the temporal correlation among POI categories in tourist paths. In this work, we collect data from Foursquare, extract timed paths of POI categories from sequences of temporally neighboring check-ins, and use a Recurrent Neural Network (RNN) to learn to generate new paths by training it to predict observed paths. As a further step, we cluster the data according to users' demographics and learn a separate model for each category of users. The evaluation shows the effectiveness of the proposed approach in predicting paths, in terms of model perplexity on the test set.
1. Predicting your Next Stop-Over from Location-Based Social Network Data with Recurrent Neural Networks
RecTour workshop 2017, RecSys 2017, Como, Italy
Enrico Palumbo, ISMB, Turin, Italy
Giuseppe Rizzo, ISMB, Turin, Italy
Raphaël Troncy, EURECOM, Sophia Antipolis, France
Elena Baralis, Politecnico di Torino, Turin, Italy
2. Location-Based Social Network Data
● Allow users to check in at a venue and share information about their whereabouts
● Venues are categorized according to consistent taxonomies
● User profiles allow grouping users according to demographic information
3. Next-POI prediction problem
● Predict where a user will go next, depending on past check-ins and profile
● Modeling sequential information: where is a user likely to go after an Art Museum and a Sushi Restaurant?
● Modeling user profile information (e.g. Vegetarian Restaurant and Steakhouse do not go together)
4. Next-POI category prediction
Frédéric, a French man in his 40's, is going to visit the Tate Modern in London this afternoon. Then, he may be interested in: taking a beer in an Irish Pub first, then having dinner in an Italian Restaurant, and later attending an event in a Jazz Club. This is a path, i.e. a sequence of categories of Points of Interest that Frédéric may be interested in visiting after having seen the Tate Modern gallery.
5. Data Collection and Enrichment
● 1.5M check-ins from 235.6K Swarmapp users worldwide in a week, via the Twitter API
● Venue category, user gender and language collected with the Foursquare API
6. Data cleansing and path extraction
● At least 10 check-ins: pre-filtering to select active users who are likely to generate paths
● Bot removal: an account with two check-ins within one minute, more than twice, is considered a bot (p ≈ 1 on a sample of 50 examples)
● Segmentation: a user's stream of check-ins (c₁, τ₁), …, (cₙ, τₙ) is split whenever the temporal interval exceeds 8 hours: τᵢ₊₁ − τᵢ > 8. 'Isolated' check-ins, with τᵢ₊₁ − τᵢ > 8 and τᵢ − τᵢ₋₁ > 8, are removed from the data.
● Example: (Art_Museum, 8), (Café, 10), (French_Restaurant, 12) | (Rock_Club, 22), (Cocktail_Bar, 23): the stream is split at the τᵢ₊₁ − τᵢ = 10h gap between French_Restaurant and Rock_Club
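The segmentation and isolated-check-in rules above can be sketched as follows; `extract_paths` and the toy stream are illustrative names, not the authors' code:

```python
def extract_paths(checkins, gap_hours=8):
    """Split a time-ordered stream of (category, hour) check-ins into paths
    whenever the gap between consecutive check-ins exceeds gap_hours, then
    drop 'isolated' check-ins (paths of length 1)."""
    paths, current = [], []
    for i, (cat, t) in enumerate(checkins):
        if current and t - checkins[i - 1][1] > gap_hours:
            paths.append(current)
            current = []
        current.append(cat)
    if current:
        paths.append(current)
    return [p for p in paths if len(p) > 1]

stream = [("Art_Museum", 8), ("Café", 10), ("French_Restaurant", 12),
          ("Rock_Club", 22), ("Cocktail_Bar", 23)]
# the 10-hour gap after French_Restaurant splits the stream into two paths
print(extract_paths(stream))
```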
7. Stats

Stage                   Users     Check-ins
Collection              235.6 K   1.5 M
At least 10 check-ins   19.5 K    400 K
Without bots            12.4 K    184 K
Segmentation            12 K      123 K

29.5K paths, with a max length of 50 and an avg length of 4.2
Example path: Cafeteria, Vegetarian Restaurant, Café, Shopping Mall, Smoothie Shop, Spa, Wine Bar, Opera House, EOP
8. Recurrent Neural Networks
● Model sequences
● Output layer: o_t = f_o(x_t, h_t)
● Hidden layer: h_t = f_h(x_t, h_{t-1})
● No Markov assumption of a fixed memory window
● Very popular in NLP, language modeling, speech processing, semantic image analysis
[Diagram: RNN unrolled over three time steps, with inputs x_t, hidden states h_t and outputs o_t]
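The recurrence above can be made concrete with a toy implementation; the weight matrices below are illustrative values, not trained parameters:

```python
import math

def matvec(W, v):
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in W]

def rnn_step(x, h_prev, Wxh, Whh, Who):
    # hidden layer: h_t = tanh(Wxh x_t + Whh h_{t-1}), a common choice of f_h
    pre = [a + b for a, b in zip(matvec(Wxh, x), matvec(Whh, h_prev))]
    h = [math.tanh(v) for v in pre]
    # output layer: o_t, here a linear readout of h_t
    o = matvec(Who, h)
    return o, h

# unroll over a 3-step sequence of 2-dimensional inputs
Wxh = [[0.5, -0.3], [0.1, 0.8]]
Whh = [[0.2, 0.0], [0.0, 0.2]]
Who = [[1.0, -1.0], [0.5, 0.5]]
h = [0.0, 0.0]
for x in ([1, 0], [0, 1], [1, 1]):
    o, h = rnn_step(x, h, Wxh, Whh, Who)
```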
9. RNN application: next character prediction
● Train the network to predict the next character in a sentence
● Feed one character at a time into the network
● 'hello': 'h' → 'e', 'e' → 'l', 'l' → 'l', 'l' → 'o'
● Use the trained model to generate new text
[Diagram: unrolled RNN fed 'h', 'e', 'l', 'l' and predicting 'e', 'l', 'l', 'o']
10. Next POI category prediction
As in the character-prediction example, each path is both input and target in a next-step prediction setup, terminated by an end-of-path (EOP) symbol:
● Café, Art Museum, Monument, Italian Restaurant, Irish Pub, Night Club, EOP
● Chinese Restaurant, Pub, Wine Bar, Karaoke Bar, Rock Club, EOP
● Beach, Gym, Cocktail Bar, EOP
● Cafeteria, Vegetarian Restaurant, Café, Shopping Mall, Smoothie Shop, Spa, Wine Bar, Opera House, EOP
12. One-hot encoding
● Encoding: a strategy to turn categories c into vectors x
● One-hot encoding:
○ assign to each category c a unique index α = α(c)
○ x_i(c) = 1 if i == α else 0
○ |x(c)| = |C| where c ∈ C
● Simple and intuitive
● Example: c = Art Museum, α(Art Museum) = 4 → x = (0 0 0 1 0 0)
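A minimal sketch of this encoding, using an illustrative six-category vocabulary:

```python
# illustrative vocabulary; the real |C| is the full Foursquare category set
categories = ["Irish Pub", "Café", "Jazz Club", "Art Museum", "Spa", "Beach"]
alpha = {c: i for i, c in enumerate(categories)}  # unique index α(c)

def one_hot(c):
    x = [0] * len(categories)   # |x(c)| = |C|
    x[alpha[c]] = 1             # x_i = 1 iff i == α(c)
    return x

print(one_hot("Art Museum"))  # -> [0, 0, 0, 1, 0, 0]
```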
13. Shortcomings
● Independence: all categories are considered orthogonal and equidistant
● Size: the size of the input vector depends on the size of the 'vocabulary', i.e. the set of categories
● Example: Art Museum = (0 0 0 1 0 0), Café = (0 0 1 0 0 0), History Museum = (0 1 0 0 0 0)
14. node2vec
● Feature learning from networks
● Adaptation of word2vec to graph structures using random walks
● Maps nodes in a graph into a Euclidean space, preserving the graph structure
15. node2vec workflow
Graph → Random Walk → text file with 'sentences' of node IDs (e.g. "node1 node3 node2 node1 node5 …", "node2 node3 node1 node2 node5 node4 …") → word2vec → node embeddings
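The first stage of the workflow, turning a graph into 'sentences' via random walks, can be sketched as below; for simplicity the walks are uniform, whereas node2vec proper biases them with its return and in-out parameters p and q:

```python
import random

# illustrative toy graph as an adjacency list
graph = {1: [2, 3, 5], 2: [1, 3, 5], 3: [1, 2, 4], 4: [3, 5], 5: [1, 2, 4]}

def random_walk(start, length, rng):
    walk = [start]
    for _ in range(length - 1):
        walk.append(rng.choice(graph[walk[-1]]))
    return walk

rng = random.Random(42)
sentences = [random_walk(n, 5, rng) for n in graph]
# feed `sentences` (as strings of node IDs) to a word2vec implementation
# to obtain the node embeddings
```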
16. node2vec on the Foursquare taxonomy
● The Foursquare taxonomy is a graph that models dependencies among venue categories
● Using node2vec, we obtain category vectors that model these dependencies
● Interactive projection: http://projector.tensorflow.org/?config=https://gist.githubusercontent.com/enricopal/9a2683de61f5fb16c4f59ae295e3fef7/raw/159df72f47e881d0f314096fcc8ea561fb7132b9/projector_config.json
[Taxonomy excerpt: Arts & Entertainment → Museum → {Art Museum, History Museum}; Arts & Entertainment → Music Venue → {Jazz Club, Piano Bar, Rock Club}]
17. GRU units
● Hidden layer with Gated Recurrent Units (GRU)
● h_{t+1} = GRU(x_t, h_t)
● Able to effectively model long-term dependencies
● They have recently proven to be easier to train than Long Short-Term Memory networks (LSTM)
img source: http://www.wildml.com/2015/10/recurrent-neural-network-tutorial-part-4-implementing-a-grulstm-rnn-with-python-and-theano/
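A scalar sketch of the GRU update, with illustrative (untrained) weights, showing how the gates interpolate between the old hidden state and a candidate state:

```python
import math

def sigmoid(v):
    return 1 / (1 + math.exp(-v))

def gru_step(x, h, Wz=0.5, Uz=0.5, Wr=0.5, Ur=0.5, Wh=1.0, Uh=1.0):
    z = sigmoid(Wz * x + Uz * h)               # update gate: how much to renew
    r = sigmoid(Wr * x + Ur * h)               # reset gate: how much past to use
    h_tilde = math.tanh(Wh * x + Uh * r * h)   # candidate hidden state
    return (1 - z) * h + z * h_tilde           # interpolate old and candidate

h = 0.0
for x in (1.0, 0.5, -1.0):
    h = gru_step(x, h)
```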
18. Dropout
● Regularization by randomly switching off neurons at training time
● Prevents neurons from co-adapting and overfitting
● Can be seen as an ensemble of slightly different networks
● Dropout rate k = 0.2
[Example mask: 0 0 1 0 0]
19. Softmax
● Output layer has length equal to |C|
● Turns activation values into a probability distribution over the categories
● Example:
Output:  0.3  0.2  0.3  0.8  0.5  0.1
Softmax: 0.15 0.14 0.15 0.25 0.19 0.12
(α marks the index of the target category)
20. Loss function
● Cross entropy between the model probability distribution and the observed distribution
● One-hot encoding for target vectors
● Target → α
● L = −log(softmax(o_α))
● Example: target = Art Museum, softmax(o_α) = 0.15 → L = −log(0.15)
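The softmax and cross-entropy steps of the two slides above can be checked numerically; the target index α below is chosen to reproduce the L = −log(0.15) example:

```python
import math

def softmax(o):
    # turn activation values into a probability distribution
    exps = [math.exp(v) for v in o]
    s = sum(exps)
    return [e / s for e in exps]

outputs = [0.3, 0.2, 0.3, 0.8, 0.5, 0.1]   # activation values from the slide
probs = softmax(outputs)
alpha = 0                                   # index of the target category
loss = -math.log(probs[alpha])              # L = -log(softmax(o_α))
print(round(probs[alpha], 2), round(loss, 2))
```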
21. Evaluation
● Train, validation and test set split
● Accuracy weights every error in the same way (e.g. predicting 'Art Museum' when the true category is 'History Museum' counts the same as predicting 'Karaoke Bar')
● Perplexity (ppl): borrowed from neural language modeling, measures how much the model is 'surprised' in observing the test set; the lower, the better: ppl = 2^L = 2^{H(p,q)}, where p is the model distribution and q is the category distribution in the test set
● A random uniform model has ppl = |C|
● ppl can be interpreted as the number of categories among which a random uniform model would have to choose
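Perplexity as 2^{H(p,q)} can be sketched as follows; the function below assumes the model's predicted probability of the true category is available at each test step:

```python
import math

def perplexity(pred_probs_of_truth):
    """pred_probs_of_truth: the probability the model assigned to the true
    category at each step of the test set."""
    # H(p,q): average cross entropy, in bits, over the test steps
    H = -sum(math.log2(p) for p in pred_probs_of_truth) / len(pred_probs_of_truth)
    return 2 ** H

# a uniform model over |C| = 741 categories assigns p = 1/741 everywhere,
# so its perplexity equals |C|
uniform = [1 / 741] * 10
print(round(perplexity(uniform)))  # 741
```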
23. Results
RNN with node2vec encoding and one-hot encoding vs bigram and random models.

System         ppl
RNN-node2vec   75.271
RNN-one-hot    76.472
bigram         125.361
random         741

● RNN ppl ≈ 75: as if we were to choose randomly among ~75 categories
● Bigram model: first-order transition probabilities estimated with Laplace add-one smoothing
● Random model: ppl = 741, which is exactly the number of categories |C| in the training data
24. User clustering results
● Paths depend on user preferences
● Not enough data to learn a personalized model
● Groups of users according to user profile (no cold start)
● Gender and language extracted from Twitter and Foursquare

Gender  Language  Paths   ppl
M       All       18718   75.478
F       All       8741    73.024
All     En        8955    71.343
All     Ar        439     100.772
F       En        2636    74.999
M       En        5771    69.481
F       Th        234     83.601
M       Th        524     94.132
All     All       29465   75.271
25. Correlation with training set size
● Perplexity vs number of paths in the user cluster training set
● Spearman correlation = −0.48
● p-value = 0.02
26. Conclusions
● Novel approach to next-POI prediction passing through POI categories
● Recurrent Neural Network to model sequences of POI categories for next-POI prediction
● Category encoding with node2vec vs one-hot encoding
● Perplexity as an evaluation metric
● User clustering based on demographics
● Will be integrated with entity2rec and user context to recommend the next POI instance from the POI category