Context-aware Recommender Systems for Opportunistic Environments

Tutors: Dr. Franca Delmastro, Dr. Enrico Gregori
Mattia Giovanni Campana
Doctoral Thesis Defense
May 15th, 2019

OPPORTUNISTIC Environment
CHARACTERISTICS
Figure 1.1 (Chapter 1, Introduction): The Desktop and Mobile worldwide market share (%) trends in the last years (2015-2018) (a), and the expansion of the Internet at its edge, beyond the core Internet (b).
๏ Personal mobile devices can exploit their wireless capabilities to establish direct
connections with one another and with physical objects (IoT) through self-organizing networks
• Device-to-device wireless communications (D2D)
• Human mobility
• Store-carry-forward paradigm
๏ They can opportunistically share both computational resources and content
๏ Users have several connectivity opportunities, through both the core Internet
and direct communications with other users and devices in proximity.
Devices must be able to autonomously:
• Collect the available content
• Process and filter it
• Keep only the content most interesting to the user

Figure 1.4: General representation of the recommendation process (user-item interactions and additional information are the inputs of the RS, which filters the items).
DATA Filtering
TRADITIONAL APPROACHES VS OUR PROPOSAL
Traditional approaches to data dissemination in self-organizing networks:
• Manual configuration of the mobile device (i.e., a list of topics of interest)
• Mainly based on a publish/subscribe mechanism
๏ Users’ interests are not static: they change over time and often depend
on the current situation.
๏ Most of the content available at the edge of the Internet is highly
contextualized. It may be relevant only:
• in specific situations
• for a particular group of users
Our proposal: automatic content discovery in opportunistic
environments, based on Context-aware
Recommender Systems (CARS), in order to:
• Provide proactive services to the local user
• Assist context-aware forwarding algorithms

DATA Filtering
A MIDDLEWARE SOLUTION
Figure: middleware architecture on top of the Operating System. Components: Physical & Virtual Sensors Monitors; Context Manager; Context-Aware Recommender Systems; Network Manager (Self-forming D2D, Routing / Data dissemination); Application Manager (apps); Security & Privacy.
• Network Manager: establishes D2D communications and discovers new content in the network
• Context Manager: recognizes the user’s context
• CARS: models the user’s preferences and provides personalized recommendations to the local user and applications
๏ In our reference scenario, the entire computation must be performed on the local device.
๏ CARS for opportunistic environments need to be supported by additional components.
๏ Opportunistic contacts may last only a few seconds because of the users’ mobility.

THESIS Contributions
We present novel contributions in multiple fields.
Theoretical
• CARS: a novel CARS solution especially designed for opportunistic environments.
• Network: a context-aware networking protocol to implement self-organizing networks with commercial mobile devices.
• Context: a lightweight approach to model and recognize the user context by using the sensing capabilities of the mobile device.
Experimental
• Sensors: a sensing framework to monitor context data from real mobile devices.
• Apps: 2 mobile applications to perform sensing experiments.
• Data: 2 context datasets collected from real devices, which can be used to define and evaluate both context-modelling approaches and new CARS algorithms.

Data filtering in Opportunistic Environments
(p-)PLIERS

CONTEXT-AWARE Recommender Systems
Figure 2.9 (Chapter 2, Context-Aware Recommender Systems): Classification of CARS according to the type of context information considered and recommendation target.
• Classes: Social-aware, Tag-based, Location-based
• Context information: friendship relations; follower/followee relations; trust relations; (user-defined) tags; location (POIs and trajectories); time; locations’ meta-information (e.g., tags); social & trust relations
• Recommendation targets: People, Items, Tags, Locations
Centralized
๏ Several approaches and methods
๏ Focus on specific context information for different target domains
๏ Mobile devices are “simple” clients
Distributed
๏ Few solutions proposed for this scenario
๏ Goal: reduce the complexity of methods proposed for
centralised scenarios
• User-based Collaborative Filtering: the k users most similar to the target user
• Tag-Expansion: the K tags with the highest value of
co-occurrence with those of the target user. Use of tag matching.

TAG-BASED CARS
๏ Perfectly fit our reference scenario
• Tags can be used to characterize both the users’ context and their items
• We can build one single multi-domain Recommender System (instead of separate systems RS1, RS2, RS3, RS4)
๏ Folksonomy = set of user-defined tags
• PROS: easy to use, adapts to changes in the users’ vocabulary
• CONS: no relationships between different tags (≠ ontology)
๏ Diffusion-based approach to rank items/tags previously “unseen” by the target user
Figure: user-tag bipartite graph with users U1-U4 and tags T1-T5.

TAG-BASED CARS
๏ Diffusion-based approach: rank items/tags previously “unseen” by the target user (in the example graph, the tags are ranked from 1st to 5th)
๏ Current solutions:
• ProbS: biased by extremely popular items
• HeatS: biased by non-popular items
• Hybrid: ProbS + HeatS (increases the complexity)
• PD and BHC: use parameters that can vary greatly among different datasets
Popularity = number of connected users
PLIERS: PopuLarity-based Item Recommender System
๏ Solves the dilemma between the choice of popular or unpopular items in a more
natural way than other solutions
๏ Does not require any parameter to tune
๏ Does not increase the computational complexity
๏ Assumption: a very popular tag is related to a more generic topic than a less
popular one, which describes a more specific topic (e.g., “Football” vs “Milan” or “Millwall”)
Figure 3.3: Structure of the synthetic user-tag bipartite graph (users, items, tags). The zoomed area highlights the interests of the users 1 and 3.
\[
f_j^{pl} = \sum_{l=1}^{n} \sum_{s=1}^{m} \frac{a_{l,j} \cdot a_{l,s} \cdot a_{t,s}}{k(u_l) \cdot k(i_s)} \cdot \frac{|U_s \cap U_j|}{k(i_j)}, \qquad j = 1, \ldots, m \tag{3.1}
\]
Normalize the resources assigned to the tags according to their popularity and
their overlap (users) with tags directly connected to the target
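The scoring rule of Eq. (3.1) can be sketched in a few lines of Python. This is only an illustrative reading of the formula, not the thesis implementation: the function names `pliers_scores` and `recommend`, the dictionary-of-sets graph representation, and the toy folksonomy are all assumptions made for the example.

```python
def pliers_scores(user_items, target):
    """Return {item: f_j} for the target user, following Eq. (3.1)."""
    # U_s: the set of users connected to each item (or tag) s.
    item_users = {}
    for user, items in user_items.items():
        for item in items:
            item_users.setdefault(item, set()).add(user)

    scores = {item: 0.0 for item in item_users}
    for s in user_items[target]:                 # items with a_{t,s} = 1
        k_is = len(item_users[s])                # popularity k(i_s)
        for l in item_users[s]:                  # users with a_{l,s} = 1
            k_ul = len(user_items[l])            # degree k(u_l)
            for j in user_items[l]:              # items with a_{l,j} = 1
                overlap = len(item_users[s] & item_users[j])  # |U_s ∩ U_j|
                scores[j] += overlap / (k_ul * k_is * len(item_users[j]))
    return scores


def recommend(user_items, target, top_n=3):
    """Rank the items the target has not seen yet by decreasing score."""
    scores = pliers_scores(user_items, target)
    unseen = [(j, f) for j, f in scores.items()
              if j not in user_items[target] and f > 0]
    return sorted(unseen, key=lambda pair: (-pair[1], pair[0]))[:top_n]


# Hypothetical toy folksonomy (not taken from the thesis datasets).
toy = {
    "u1": {"football", "milan"},
    "u2": {"football", "milan"},
    "u3": {"football", "millwall"},
    "u4": {"football"},
    "target": {"milan"},
}
print(recommend(toy, "target"))
```

In this toy graph the generic tag "football" receives a score of 1/6, reduced by its popularity, while "millwall" receives no resources because it shares no users with the target's tag.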

PLIERS Evaluation
CENTRALISED ENVIRONMENT
PLIERS vs other diffusion-based RS (ProbS, HeatS, Hybrid, PD, BHC) on the MovieLens, Delicious, and Twitter datasets
๏ Link Prediction task
• Remove random links of the graph
• Evaluate the ability of the RS to reconstruct the
original graph (Precision and Recall)
For the link prediction task, three standard metrics are used: Recall (R) and Precision (P), both computed from the links recovered within the first L recommendations, and Novelty (N), which measures the capacity of a Recommender System to generate novel and unexpected results and is quantified by the average popularity of the first L recommended items. The best algorithm should have high P and R.
Figure (Chapter 3, Exploiting tags as context information): (a) Results in terms of Precision and Recall for PLIERS, ProbS, HeatS, Hybrid, PD, and BHC.
๏ Validate the PLIERS assumption
The purpose of PLIERS is to suggest the contents closest to the interests of the target user. To compare our proposal with the baseline algorithms, we analyse how similar (in terms of popularity) and how overlapped the recommended tags are with respect to the interests of the target user. To this aim, we define the metric V (Variance), which calculates the average difference in terms of popularity between the recommended tags and those already owned by the users (to be minimized):
\[
V = \frac{1}{n} \sum_{l=1}^{n} \frac{1}{r_l} \sum_{q=1}^{r_l} \sqrt{\left(k(t_q) - p(T_{u_l})\right)^2} \tag{3.3}
\]
where n is the number of users in the network, r_l is the number of recommended tags for the user u_l, and p(T_{u_l}) = \frac{1}{z} \sum_{j=1}^{z} k(t_j) is the mean popularity of the tags originally owned by the user u_l, with z the number of those tags. Moreover, we define the metric O (Overlap), which measures the percentage of users connected to both the recommended tags and the tags of the target user, averaged over all the tags of the user and over all the users (to be maximized). It gives an idea of the potential interest of the users in the recommended tags:
\[
O = \frac{1}{n} \sum_{l=1}^{n} \frac{1}{r_l} \sum_{q=1}^{r_l} \frac{1}{z} \sum_{k=1}^{z} J(U_{i_q}, U_{i_k}) \tag{3.4}
\]
where U_{i_q} is the set of users connected to the item i_q and J(S_1, S_2) is the Jaccard index, which measures the percentage of overlap between two generic sets S_1 and S_2. Therefore, a good Recommender System should provide both a low V and a high O.
Plots: Variance and Overlap of PLIERS, ProbS, HeatS, Hybrid, PD, and BHC.
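The two validation metrics V (Variance, Eq. 3.3) and O (Overlap, Eq. 3.4) can be illustrated with a small Python sketch. It is written under stated assumptions: the graph is a dictionary of sets, the average runs over the users that actually received recommendations, and the names `variance_and_overlap`, `user_tags`, and `recs` are hypothetical.

```python
from math import sqrt


def jaccard(a, b):
    """Jaccard index J(S1, S2) of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def variance_and_overlap(user_tags, recommendations):
    """Return (V, O) averaged over the users in `recommendations`."""
    tag_users = {}
    for user, tags in user_tags.items():
        for tag in tags:
            tag_users.setdefault(tag, set()).add(user)

    def k(tag):
        # Popularity of a tag = number of users connected to it.
        return len(tag_users[tag])

    v_sum = o_sum = 0.0
    for user, recs in recommendations.items():
        owned = user_tags[user]
        mean_pop = sum(k(t) for t in owned) / len(owned)      # p(T_ul)
        v_sum += sum(sqrt((k(q) - mean_pop) ** 2) for q in recs) / len(recs)
        o_sum += sum(
            sum(jaccard(tag_users[q], tag_users[t]) for t in owned) / len(owned)
            for q in recs
        ) / len(recs)
    n = len(recommendations)
    return v_sum / n, o_sum / n


# Hypothetical toy data: u3 owns tag "a" and is recommended tag "b".
user_tags = {"u1": {"a", "b"}, "u2": {"a", "c"}, "u3": {"a"}}
recs = {"u3": ["b"]}
V, O = variance_and_overlap(user_tags, recs)
print(V, O)
```

Here tag "a" has popularity 3 and tag "b" has popularity 1, so V is 2, and the only user of "b" is one of the three users of "a", so O is 1/3.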

PLIERS Evaluation
CENTRALISED ENVIRONMENT
PLIERS vs solutions for distributed scenarios
Figure (Section 4.4, PLIERS Experimental Evaluation in a Static Scenario): (a) Precision gain and (b) Recall gain of PLIERS vs CF and of PLIERS vs Tag-Exp, for k from 10 to 100.

p-PLIERS: Pervasive PLIERS
CARS SOLUTION FOR OPPORTUNISTIC ENVIRONMENTS
• Local Knowledge Graph: each node builds its own local representation of the knowledge about users and items in the network
• Knowledge exchange: nodes share their knowledge graphs during opportunistic contacts
• Content Sharing: nodes evaluate the discovered items by locally running the CARS and exchange them
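A minimal sketch of the knowledge exchange step, assuming that a Local Knowledge Graph (LKG) is stored as a set of (user, tag) edges and that a contact simply leaves both nodes with the union of their graphs. The merge rule, the function names, and the toy edges are assumptions for illustration; the Jaccard index against the Global Knowledge Graph (GKG) is the similarity used later in the evaluation.

```python
def jaccard(a, b):
    """Jaccard index of two edge sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def contact(lkgs, a, b):
    """Opportunistic contact: both nodes end up with the union of their LKGs."""
    merged = lkgs[a] | lkgs[b]
    lkgs[a] = set(merged)
    lkgs[b] = set(merged)


# Hypothetical starting knowledge: each node only knows its own edge.
lkgs = {
    "n1": {("n1", "#expo")},
    "n2": {("n2", "#food")},
    "n3": {("n3", "#milan")},
}
gkg = set().union(*lkgs.values())   # what an omniscient observer would know

contact(lkgs, "n1", "n2")
contact(lkgs, "n2", "n3")
similarity = {node: jaccard(graph, gkg) for node, graph in lkgs.items()}
print(similarity)
```

After the two contacts, n2 and n3 hold the whole graph, while n1 still misses the edge it could only have learned from n3.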

Chapter 4. Pervasive PLIERS: A framework for Distributed Recommender Systems
Figure 4.4: Map of Expo 2015 area with the position of five of the simulated communities. Note that the
grid in the figure is only an example to show how we divided the area for the simulations, but it does
not represent the actual grid.
Moreover, for each simulation step, we calculated the following:
3. Number of contents generated by the nodes over time.
4. Average number of contacts between nodes over time.
These metrics are used to characterise the contact traces and the contents used in
the different scenarios. We anticipate that both the synthetic and real traces we used
during the simulations show similar properties (e.g., the contact traces used for the
WFD@Expo2015 scenario show values compatible with those used in the conference
scenario), thus supporting the significance of the synthetic trace.
We also calculated all the aforementioned metrics by considering that the interests
of nodes may be limited in time. To do so, we calculated the metrics using only the
most recent contents generated in the network and considering only the information
about these contents in the folksonomy graphs. Moreover, during the simulations, we
also considered that nodes could have a limited memory capacity; thus they discarded
contents older than a fixed time threshold (i.e., contents older than 1, 2, and 3 hours).
4.5.3 Scenario 1 - Big Event: World Food Day @Expo2015
As a first dynamic scenario for the evaluation of p-PLIERS, we considered a big event
attended by a large number of people in a relatively large area. In this scenario,
accessing the Internet from mobile devices may be problematic and thus obtaining useful
p-PLIERS Evaluation
SIMULATED SCENARIOS
Expo 2015
• # nodes: 200, 500, 900
• Content: tweets generated during the event and collected by using the Twitter Streaming APIs
• Time: 13h (10am - 11pm)
• Mobility: HCMM (with communities)
Helsinki
• # nodes: 800
• Content: tweets generated in the city center of Helsinki, collected by using the Twitter Streaming APIs
• Time: 24h
• Mobility: Working Day Mobility Model
Conference (KDD 2015)
• # nodes: 789
• Content: tweets generated during the conference, collected by using the Twitter REST APIs
• Time: 9h (7:30am - 4:30pm)
• Mobility: real contact traces from an American school
Mapping to the recommender model: Users = Twitter users, Items = tweets, Tags = hashtags
p-PLIERS Evaluation
RESULTS
Local Knowledge Graphs (LKGs) vs Global Knowledge Graph (GKG)
• J(LKGs, GKG) = Jaccard Index between LKGs and GKG
• J(fLKG, fGKG) = Jaccard Index between the recommendations provided by PLIERS by using the LKGs and the GKG
Plots: J(LKGs, GKG) and J(fLKG, fGKG) over time for the Expo 2015 (250, 500, and 900 agents; 10am - 11pm), Helsinki (1, 2, and 3 days), and KDD 2015 (7:30am - 4:30pm) scenarios.
Figure 4.7: Results for the WFD@Expo2015 scenario. (a) shows the average Jaccard similarity between the LKGs of the agents and the GKG, for different number of agents. (b) shows the average Spearman index and (c) shows the average Jaccard similarity between the recommendation list provided by PLIERS by using the LKGs of the agents and the list obtained exploiting the GKG, for different number of agents. (d) shows the average Jaccard similarity between the LKGs and the GKG by limiting the knowledge to different time windows in the past: only information generated not more than 1, 2, and 3 hours (of simulated time) before the calculations is respectively considered, and the panel is related to the simulation with 900 agents.
Figure 4.10: Results for the scenario of the KDD conference, with the same panels (a)-(d) as Figure 4.7.
p-PLIERS Evaluation
RESULTS
๏ Nodes have a limited memory and they delete old information from their LKGs
๏ Similarity between LKGs and GKG when the information lifetime is limited to different numbers of hours
Plots: J(LKGs, GKG) over time with information lifetimes of 1h, 2h, and 3h (Expo 2015 and KDD 2015) and of 1h, 2h, 3h, 5h, and 10h (Helsinki).
Figure 4.14: Results for the scenario of the city centre of Helsinki, with the same panels (a)-(d) as Figure 4.7.
self-forming D2D connections
WFD-GM

WI-FI Direct (WFD)
GENERAL OVERVIEW
๏ In WFD, nodes can communicate with each other only if they belong to the same WFD
Group (star topology, with one Group Owner and its Clients)
๏ The Group Owner (GO) is the “leader” of the group. It implements the functionalities of an IEEE
802.11 Access Point (AP)
๏ Clients can be both WFD-enabled and “legacy” devices (the latter
see a GO as a traditional AP)

WFD Limitations
Group formation between two devices (D1, D2): PEER DISCOVERY → GO negotiation (request, response, confirm) → WPS (Wi-Fi Simple Configuration) → DHCP
During the GO negotiation, nodes send a GO Intent (GI) value, which
represents their willingness to become GO
๏ The GO Intent is not related to the suitability of a node to act as GO (it is a random value or is set
by applications).
๏ Peer discovery + GO negotiation may require several seconds
๏ WPS requires the user’s manual authorization (PIN or Accept button), e.g., “Accept connection from Device_XYZ?”
๏ Two WFD Groups in proximity cannot communicate with each other

WFD Group Manager (WFD-GM)
PROPOSED SOLUTION
๏ We propose Wi-Fi Direct - Group Manager (WFD-GM), a novel middleware-layer protocol to enable
opportunistic networks with real commercial devices.
๏ Uses a context-aware function to find the best configuration of WFD groups
๏ Enables the content/information diffusion among different WFD groups
๏ Does not require any modification of the OS or of the WFD standard
๏ Avoids the user’s manual authorization (security policies can be implemented at a higher layer)

WFD-GM
INITIALIZATION
๏ GOAL: Speed up the group formation and the credential exchange
๏ Combines two mechanisms of the WFD standard to identify the best group configuration:
• Autonomous Group Formation: each node creates a WFD group, electing itself as GO
• Service Discovery: each node shares the group credentials with the nodes in proximity

WFD-GM
CONTEXT INFORMATION
In addition to the group credentials, each node shares its suitability index S(ln), i.e., its suitability to become GO of a larger group. The index combines features that provide a measure of the ability of the node to create a long-lasting WFD group (i.e., a group that will not be rapidly destroyed due to the local node’s mobility). More formally:
\[
S(ln) = \omega_1 \cdot r_{ln} + \omega_2 \cdot pp_{ln} + \omega_3 \cdot c_{ln} + \omega_4 \cdot st_{ln} \tag{5.1}
\]
where the weights \omega_{1,\ldots,4} govern the relative importance of each feature in the overall computation of S(ln):
• r_{ln}: available resources (e.g., battery level, free CPU/memory)
• pp_{ln}: number of current peers in proximity (LN)
• c_{ln}: number of incoming connections that the device can still accept
• st_{ln}: stability index, i.e., how fast LN changes over time
• Bad GO: LN changes quickly (its group will be rapidly destroyed)
• Good GO: LN changes slowly (it is able to create a long-lasting group)
The stability index st_{ln} evaluates both the mobility of the local node and how much its surrounding environment changes over time. Currently, we consider it as a function of the nodes in proximity (LN), but more complex approaches can be taken into account (e.g., a function of the geographical locations visited by the node in the past). The UpdateStabilityIndex procedure is in charge of updating st_{ln} every T_{st} seconds as follows. Every time LN changes, it computes the Jaccard index between the current list of neighbours and the one of the previous time window. Then, it updates a running average \bar{J} of the Jaccard indices calculated since the last update of st_{ln}. Finally, the stability index is updated with the following formula:
\[
st_{ln} = st'_{ln} \cdot \omega_{st}^{1} + \bar{J} \cdot \omega_{st}^{2} \tag{5.2}
\]
where st'_{ln} is the stability index calculated in the previous time window of T_{st} seconds.
Final steps of the merge procedure (pseudocode lines 5-10): the GO waits for the VISIBILITY_RESP messages from its clients (VR), counts the positive responses t = |{r_i ∈ VR : r_i = true}|, compares t with the group size |G| and, if enough clients agree, sends MERGE_WARNING(g_best) to the clients, then calls DisbandGroup() and Connect(g_best).
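The two formulas of this slide, the suitability index of Eq. (5.1) and the stability update of Eq. (5.2), can be sketched as follows. The weights, the initial stability value, the normalisation of the features to [0, 1], and the choice of treating a window without neighbourhood changes as fully stable are assumptions made for this example; they are not stated in the slides.

```python
def jaccard(a, b):
    """Jaccard index of two neighbour sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


class StabilityIndex:
    """Tracks st_ln as in Eq. (5.2); weights and initial value are assumptions."""

    def __init__(self, neighbours, w_prev=0.5, w_new=0.5):
        self.neighbours = set(neighbours)   # last observed LN
        self.w_prev, self.w_new = w_prev, w_new
        self.value = 0.0                    # current st_ln
        self._jaccards = []                 # indices since the last update

    def neighbours_changed(self, neighbours):
        """Call whenever LN changes: record how similar the new list is."""
        neighbours = set(neighbours)
        self._jaccards.append(jaccard(self.neighbours, neighbours))
        self.neighbours = neighbours

    def update(self):
        """Call every T_st seconds: blend the old index with the mean Jaccard."""
        # Assumption: no change during the window counts as fully stable (1.0).
        mean_j = (sum(self._jaccards) / len(self._jaccards)
                  if self._jaccards else 1.0)
        self.value = self.value * self.w_prev + mean_j * self.w_new
        self._jaccards = []
        return self.value


def suitability(r, pp, c, st, weights=(0.25, 0.25, 0.25, 0.25)):
    """Eq. (5.1); all features are assumed to be normalised to [0, 1]."""
    w1, w2, w3, w4 = weights
    return w1 * r + w2 * pp + w3 * c + w4 * st


tracker = StabilityIndex({"a", "b"})
tracker.neighbours_changed({"a", "b", "c"})   # Jaccard 2/3
tracker.neighbours_changed({"b", "c"})        # Jaccard 2/3
st = tracker.update()
print(st, suitability(r=0.8, pp=0.5, c=0.5, st=st))
```

A node whose neighbourhood barely changes accumulates Jaccard indices close to 1, so its stability index, and therefore its suitability, grows over successive windows.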

WFD-GM
NODE STATUS
Every T_D seconds (decision time), each node checks its status, which can be one of the following:
๏ GO1: the node has no clients but LN is not empty (nearby nodes)
GO Election Procedure: is my s_i the maximum s_i?
• Yes: the node remains GO and waits for incoming connections
• No: the node connects as a legacy client to the GO with the maximum s_i
๏ GO2: the node has some clients but
the amount of resources consumed
to manage the current group is
beyond a predefined threshold res_th
• The node destroys the group and comes back to the initial status
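The GO1 and GO2 decisions can be sketched as two small functions. This is a hedged illustration, not the protocol code: the function names, the returned action labels, and the tie-break on the node identifier are assumptions introduced for the example.

```python
def go1_decision(my_id, my_s, nearby):
    """GO1: no clients, but other GOs nearby. `nearby` maps node id -> s_i."""
    if not nearby:
        return ("remain_go", None)
    best_id = max(nearby, key=lambda node: (nearby[node], node))
    # Hypothetical tie-break: on equal suitability the larger id wins.
    if (my_s, my_id) >= (nearby[best_id], best_id):
        return ("remain_go", None)
    return ("connect_as_legacy_client", best_id)


def go2_decision(used_resources, res_th):
    """GO2: destroy the group when managing it costs more than res_th."""
    return "destroy_group" if used_resources > res_th else "keep_group"


print(go1_decision("n1", 0.9, {"n2": 0.4, "n3": 0.7}))
print(go1_decision("n1", 0.3, {"n2": 0.4, "n3": 0.7}))
print(go2_decision(used_resources=0.9, res_th=0.8))
```

Because every node runs the same comparison on the suitability indices it has received, all nodes in range agree on which GO should keep its group.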

WFD-GM
NODE STATUS
๏ GO3 (merge procedure): the node evaluates whether to merge its group with another one in proximity.
• The GO has discovered another GO in proximity
• Based on their suitability indices, it is not the best GO
• The GO asks its clients whether they “see” the other GO
• If the majority agree, the GO disbands its group and connects to the new one (the best GO)
๏ C1 (traveler procedure): a client has discovered another GO in proximity.
• With probability p_T it becomes a traveler
• The node blacklists the old GO for a fixed amount of time
• The node chooses which group to connect to among those in proximity
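A short sketch of the merge vote (GO3) and of the traveler step (C1). Several details are assumptions made only for illustration: a strict majority of positive answers is required, a traveler picks the non-blacklisted GO with the highest suitability, and the names `should_merge` and `traveler_step` are hypothetical.

```python
import random


def should_merge(my_s, other_s, visibility_resp):
    """GO3: merge only if the other GO is better and most clients can see it."""
    if other_s <= my_s or not visibility_resp:
        return False
    positive = sum(1 for seen in visibility_resp if seen)
    return positive > len(visibility_resp) / 2   # assumed: strict majority


def traveler_step(old_go, nearby, p_t, blacklist, now, blacklist_secs, rng=random):
    """C1: with probability p_t leave old_go and join another group.

    `nearby` maps GO id -> suitability; `blacklist` maps GO id -> expiry time.
    Returns the GO the client is attached to afterwards (None if nobody is left).
    """
    if rng.random() >= p_t:
        return old_go                              # stays in the current group
    blacklist[old_go] = now + blacklist_secs       # avoid bouncing straight back
    candidates = {go: s for go, s in nearby.items()
                  if blacklist.get(go, 0) <= now}
    # Assumed selection rule: the nearby GO with the highest suitability.
    return max(candidates, key=candidates.get) if candidates else None


blacklist = {}
print(should_merge(0.4, 0.9, [True, True, False]))
print(traveler_step("goA", {"goA": 0.2, "goB": 0.9}, 1.0, blacklist, 100, 60))
```

The blacklist keeps a traveler from immediately rejoining the group it has just left.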

WFD-GM Evaluation
SETUP
๏ We compared WFD-GM with a Baseline protocol:
• GO election: node with the highest MAC address
• The GO maintains its role until it runs out of resources or goes out of range
๏ We implemented both WFD-GM and Baseline in the ONE opportunistic simulator
๏ Parameters estimation with real devices:
• Limited number of clients, e.g., LG Nexus 5 (4 clients), HTC Nexus 5X (10+ clients). In simulations: rand(4,15) max clients for each node
• Battery depletion:
- GO w/o clients + Service Discovery: 20% every 5h
- Groups of [1,4] clients that continuously send messages to
each other. We then used a linear regression model to
estimate the power consumption in larger groups.
Figure: predicted battery depletion (battery level vs hour, 0-30 h) for GOs and Clients, for group sizes from 2 to 20.

WFD-GM Evaluation
SIMULATED SCENARIOS
We simulated 3 application scenarios with different numbers of nodes (users) and different mobility patterns:
Concert
• # nodes: 1000
• Mobility: fixed positions (Main Stage)
• Time: 3 h
ComiCon
• # nodes: 2000
• Mobility: [0,1.5] m/s - ShortestPath
• 575 POIs (e.g., stands, eateries); each node waits from 10 min to 1 h at each POI (e.g., queues)
• Time: 4 h
Helsinki
• # nodes: 4000
• Mobility: Working Day Mobility Model
• Time: 24 h

WFD-GM Evaluation
MESSAGE DIFFUSION
๏ When a simulation starts, each node generates a message
๏ We assume that nodes implement an epidemic forwarding algorithm
๏ When a node joins a WFD group, it sends all the messages contained in its own cache to all the members of the group
๏ Every 30 minutes (sim. time), we measured the percentage of messages contained in the nodes’ caches
Plots: mean number of messages in nodes’ caches (%) vs simulated hours for the Concert (0-3 h), Comicon (0-4 h), and Helsinki (0-25 h) scenarios, comparing Baseline and WFD-GM in the configurations labelled 5, 30, and 60.
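The diffusion rule and the measured quantity can be sketched as follows, assuming that caches are plain sets of message identifiers and that only the joining node pushes its cache to the current members, as the slide states. The function names and the three-node example are hypothetical.

```python
def join_group(caches, group, newcomer):
    """The newcomer sends every message in its cache to all group members."""
    for member in group:
        caches[member] |= caches[newcomer]
    group.add(newcomer)


def mean_cache_fraction(caches, total_messages):
    """Mean fraction of all generated messages held in the nodes' caches."""
    return sum(len(cache) for cache in caches.values()) / (len(caches) * total_messages)


# Each node starts with the single message it generated.
caches = {"n1": {"m1"}, "n2": {"m2"}, "n3": {"m3"}}
group = {"n1"}                      # n1 is the GO of an initially empty group
join_group(caches, group, "n2")
join_group(caches, group, "n3")
coverage = mean_cache_fraction(caches, total_messages=3)
print(coverage)
```

With this one-way rule the GO n1 ends up with all three messages, while the last node to join has only its own, which gives a mean coverage of 6/9.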

WFD-GM Evaluation
CONNECTIVITY GRAPH
๏ Both Baseline and WFD-GM create a network of multi-hop paths among the nodes, called Connectivity Graph (CG)
๏ In the CG, two nodes are connected if they have participated in the same WFD group
Figure: a WFD group (nodes n1-n5) and the corresponding CG (figure label: total connection time).
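A minimal sketch of how a Connectivity Graph can be built from the list of WFD groups observed during a run, together with the two quantities reported in the connectivity results (number of connected components and share of nodes in the largest one). The adjacency-dictionary representation and the sample groups are assumptions for the example.

```python
from itertools import combinations


def build_cg(groups):
    """Connect every pair of nodes that took part in the same WFD group."""
    cg = {}
    for group in groups:
        for node in group:
            cg.setdefault(node, set())
        for a, b in combinations(sorted(group), 2):
            cg[a].add(b)
            cg[b].add(a)
    return cg


def connected_components(cg):
    """Return the connected components of the graph as a list of node sets."""
    seen, components = set(), []
    for start in cg:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:                      # iterative depth-first search
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(cg[node] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical groups formed during a simulation.
groups = [{"n1", "n2", "n3"}, {"n3", "n4"}, {"n5", "n6"}]
cg = build_cg(groups)
components = connected_components(cg)
largest_share = max(len(c) for c in components) / len(cg)
print(len(components), largest_share)
```

Node n3 belonged to two groups, so it bridges them into a single component of four nodes, while n5 and n6 remain isolated from the rest.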

WFD-GM Evaluation
NETWORK CONNECTIVITY
Figures: connectivity graphs obtained with Baseline and with WFD-GM in the Concert, Comicon, and Helsinki scenarios.

WFD-GM Evaluation
NETWORK CONNECTIVITY & RESOURCES
Plots (WFD-GM vs Baseline, configurations 5, 30, and 60): number of connected components of the CG, percentage of nodes in the largest connected component, and final battery level (%), with annotations marking the times at which nodes exhaust their batteries (i.e., 71% of the sim. time).

model and recognize the user’s situation
CONTEXT

CONTEXT Definition
We need a context definition that characterizes both the user and the mobile environment.
Figure 6.2: Characterisation of the user’s context and interests using Context Kit.
• Context dimensions: Interests, Social Context, Physical Context
• Data sources: Online Social Networks, Audio, Battery, Display, Weather, Cellular Info, BT Connections, Activity Recognition, Environmental Sensors, Motion Sensors, Running Applications, Calendar, BT Scans, WFD Scans, Installed Applications, Phone Calls, Messaging

CONTEXT Kit
A sensing framework especially designed to perform large-scale sensing experiments and to simplify the data
collection from real mobile devices.
• Ready to use: released as a simple library to include in mobile applications.
• Sensors Monitoring: supports the monitoring of both physical (e.g., accelerometer) and virtual (i.e., user’s interactions) sensors.
• Proximity: discovers other devices and people in proximity using both Bluetooth 4.0 and Wi-Fi Direct.
• Easy to extend: modular development to support other sensors and functionalities.
Open source project available on https://contextkit.github.io

CONTEXT Kit
ARCHITECTURE
• Activate sensors through the configuration file.
• Runs in background
• A log file for each sensor
• Compress and send logs to a remote server.

PHYSICAL Context
TRADITIONAL CONTEXT INFERENCE PROCESS VS OUR PROPOSAL (perform the computation on the local device)
Pipeline: Sensors Data → Features Extraction → Context Modeling → Context Reasoning
Traditional process:
• Sensors Data: a few selected sensors might be enough for “simple” activities (e.g., user gait) but not for more abstract info (e.g., the user’s situation)
• Features Extraction: manual feature extraction/creation from raw sensor data
• Context Modeling: identify the sensor info that is most descriptive of the user context; use of software engineering formalisms (e.g., ontologies or mark-up schemes) to model it
• Context Reasoning: mainly supervised learning approaches (i.e., classification), often performed on remote servers
Our proposal:
• Sensors Data: large set of heterogeneous sensors available on commercial mobile devices; we consider both physical and virtual sensors
• Context Modeling: we propose to model the context information using Dimensionality Reduction (DR) algorithms to infer new and meaningful features in a data-driven way (latent features)
• Context Reasoning: DR algorithms make it possible to reduce the complexity of learning algorithms and to speed up the reasoning phase; we can perform the entire context inference process on the mobile device

PHYSICAL Context
DATA COLLECTION
We have developed Context Labeler, an Android app that
includes CK as a library and makes it possible to collect real, labeled context
data from mobile devices.
๏ Heterogeneous devices (users’ personal smartphones)
๏ Daily life activities, e.g., “Working”, “Break”, “Lunch”
๏ Volunteers associate labels with their daily life activities
๏ We did not define any constraints on the user’s behaviour
or on the interaction with the mobile device (e.g., the device’s
position on the body)

DATASET CHARACTERISTICS
• 36K data samples
• 1331 features
• Labels distribution (bar chart, 0 to 14,000 samples): Home, Sleep, Working, Free time, Lunch Break, Break, Restaurant, Shopping
• Feature groups (pie chart with shares of 9%, 8%, 4%, 10%, and 69%): Location, Others (e.g., Audio, Battery, …), Bluetooth, Running Apps, Physical Sensors
Available on https://github.com/contextkit/ContextLabeler-Dataset

EVALUATION: ACCURACY
Is it possible to recognize the user situation? What level of accuracy can we obtain by using the
raw features and the latent features?
We compare the accuracy of 3 commonly used classifiers (i.e., k-NN, SVM, and CART) using both raw and latent features
inferred by 6 different DR algorithms (different approaches, labelled on the slide as content-driven, topology-driven, and hierarchical):
• Autoencoder (AE)
• Principal Component Analysis (PCA)
• Non-Negative Matrix Factorization (NMF)
• Random Projection (Sparse, SRP, and Gaussian, GRP)
• Feature Agglomeration (FA)
Plots: accuracy of SVM, k-NN, and CART vs number of latent features (0-100) for PCA, NMF, GRP, SRP, FA, and AE, compared with the raw features (RAW).
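To make the "latent features, then classifier" pipeline concrete, the sketch below uses one of the listed DR techniques, Gaussian Random Projection, followed by a k-NN classifier, in plain Python. It is an illustration under assumptions: the thesis does not state which implementation was used, and the function names, the seed, and the tiny hypothetical samples are introduced here only for the example.

```python
import math
import random


def gaussian_projection_matrix(n_features, n_latent, seed=0):
    """Random matrix with N(0, 1/n_latent) entries, as in Gaussian RP."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_latent)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n_latent)]
            for _ in range(n_features)]


def project(x, matrix):
    """Map a raw feature vector to its latent representation (x · R)."""
    n_latent = len(matrix[0])
    return [sum(x[i] * matrix[i][j] for i in range(len(x)))
            for j in range(n_latent)]


def knn_predict(train_z, train_y, z, k=1):
    """Majority label among the k training points closest to z."""
    nearest = sorted(range(len(train_z)),
                     key=lambda i: math.dist(train_z[i], z))[:k]
    votes = {}
    for i in nearest:
        votes[train_y[i]] = votes.get(train_y[i], 0) + 1
    return max(votes, key=votes.get)


# Hypothetical raw samples with 6 features each (placeholder values).
train_x = [[0, 0, 1, 0, 0, 1], [9, 8, 9, 9, 8, 9], [1, 0, 0, 1, 0, 0]]
train_y = ["Home", "Working", "Home"]

R = gaussian_projection_matrix(n_features=6, n_latent=2)
train_z = [project(x, R) for x in train_x]
print(knn_predict(train_z, train_y, project([0, 1, 0, 0, 1, 0], R)))
```

The classifier only ever sees the 2-dimensional latent vectors, which is what makes the reasoning phase cheaper than working on the raw feature space.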

EVALUATION: TIME
Can we perform the entire context inference task on the local device?
• Modeling Time: DR execution time to infer different numbers of latent
features from the entire dataset (plot: time in seconds, log scale, vs number of latent features for PCA, NMF, GRP, SRP, FA, and AE).
• Reasoning Time: execution times for different combinations of DR techniques and classifiers (plot: time in seconds, log scale, of k-NN, SVM, and CART, series labelled (tr) and (t), with RAW, AE, NMF, FA, PCA, SRP, and GRP features).
All the classifiers obtain their worst performance with the raw data, since
they have to consider more features.
Plot annotations: milliseconds, seconds, minutes; best overall improvement.

EVALUATION: SUBJECT-INDEPENDENT
Is it possible to use a model learned using data coming from other people?
๏ Cross-validation: the training set is made of data samples generated by users not included in the test set.
๏ All the classifiers still perform ~3x better than a Random Guesser (random predictions - acc: 12.5%).
๏ All the classifiers perform better using the latent features instead of the raw data. Latent features are more
representative than raw features.
Plot: accuracy (0-0.5) of k-NN, SVM, and CART with RAW, AE, NMF, FA, PCA, SRP, and GRP features, compared with the Random Guesser.
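The subject-independent protocol can be sketched as a leave-one-subject-out split, in which no user contributes samples to both the training and the test set. The exact fold layout used in the thesis is not given on the slide, so holding out one subject per fold is an assumption, as are the function names and the sample tuples. The Random Guesser baseline of 12.5% corresponds to one chance in eight, which matches the eight labels of the dataset.

```python
def leave_one_subject_out(samples):
    """Yield (held_out, train, test) folds; samples are (subject, features, label)."""
    subjects = sorted({subject for subject, _, _ in samples})
    for held_out in subjects:
        train = [s for s in samples if s[0] != held_out]
        test = [s for s in samples if s[0] == held_out]
        yield held_out, train, test


def random_guess_accuracy(n_classes):
    """Expected accuracy of uniform random predictions over n_classes labels."""
    return 1.0 / n_classes


# Hypothetical samples from three volunteers (placeholder feature vectors).
samples = [
    ("alice", [0.1, 0.2], "Working"),
    ("alice", [0.3, 0.1], "Break"),
    ("bob", [0.9, 0.8], "Home"),
    ("carol", [0.5, 0.5], "Sleep"),
]
folds = list(leave_one_subject_out(samples))
print([(held_out, len(train), len(test)) for held_out, train, test in folds])
print(random_guess_accuracy(8))
```

Each fold can then be fed to any classifier, and the per-fold accuracies averaged to obtain the subject-independent score.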

SOCIAL Context
DATA COLLECTION
We used ContextKit to build a second Android application, called
MyDigitalFootprint.
๏ Does not require any interaction with the user
๏ Collects physical context data from the local phone
๏ Also collects information from Online Social Networks, such
as shared content, lists of friends/followers, comments, likes, etc.
๏ Used by 31 high-school students for 1 month

SOCIAL Context
DATA COLLECTION
The Physical Social Graph and the Virtual Social Graph (VSG) are merged into a Combined Social Graph (CSG).
Plot: node degree distribution (frequency vs degree) of the VSG and the CSG in the MyDigitalFootprint dataset.

CONCLUSIONS & FUTURE WORK
A novel and complete approach to automatically discover interesting contents in opportunistic environments
PLIERS: a new graph-based and tag-based CARS that is able
to evaluate the similarity among different tags, based on their
popularity and the graph topology
• Evaluated against real-world datasets in a centralised scenario
• Provides personalised recommendations
p-PLIERS: a novel framework for distributed CARS
• Evaluated by simulating realistic scenarios (synthetic + real data)
• Effective recommendations, comparable to the centralised scenario
WFD-GM: a novel middleware-layer protocol to implement
self-organizing networks
• Simulation of realistic scenarios (synthetic + real data)
• Improves network connectivity and content dissemination
ContextKit: sensing library for mobile devices
• 2 real-world context datasets
• Modelling physical & social context on the local device by
using its sensing capabilities

CONCLUSIONS & FUTURE WORK
Our research raised several open challenges that need to be investigated in the future
๏ Extension of PLIERS
๏ Cast the recommendation problem into a link prediction task (diagram labels: social weight, physical context vector, features vector, prediction model)

PUBLICATIONS
40
International journals
• Arnaboldi Valerio, Campana Mattia Giovanni, Delmastro Franca, Pagani Elena. (2017). A personalized recommender system for pervasive social networks.
Pervasive and Mobile Computing. (Vol.36, pp. 3-24). Elsevier.
• Campana Mattia Giovanni, Delmastro Franca. (2017). Recommender Systems for Online and Mobile Social Networks: A survey. Online Social Networks and
Media. (Vol. 3, pp. 75-97). Elsevier.
International Conferences/Workshops with Peer Review
• Campana Mattia Giovanni, Chatzopoulos Dimitris, Delmastro Franca, Hui Pan. (2018, October). Lightweight Modeling of User Context Combining Physical and
Virtual Sensor Data. In Proceedings of UbiComp/ISWC’18 Adjunct. ACM
• Arnaboldi Valerio, Campana Mattia Giovanni, Delmastro Franca. (2017, October). Context-Aware Configuration and Management of WiFi Direct Groups for Real
Opportunistic Networks. In Proceedings of the 14th IEEE International Conference on Mobile Ad Hoc and Sensor Systems (MASS) (pp. 266-274). IEEE.
• Campana Mattia Giovanni, Delmastro Franca, Bruno Raffaele. (2016, November). A machine-learned ranking algorithm for dynamic and personalised car
pooling services. In Proceedings of the 19th International IEEE Conference on Intelligent Transportation Systems (ITSC) (pp. 1856-1862). IEEE.
• Arnaboldi Valerio, Campana Mattia Giovanni, Delmastro Franca, Pagani Elena. (2016, April). PLIERS: a popularity-based recommender system for content
dissemination in online social networks. In Proceedings of the 31st Annual ACM Symposium on Applied Computing (pp. 671-673). ACM.
