Information Fusion Methods for Location Data Analysis

Candidate: Alket Cecaj
Supervisor: Prof. Marco Mamei
Doctorate School in Industrial Innovation Engineering

Thesis outline
• Introduction
• Data Fusion for Event Detection and Event Description Using Aggregated CDR
• Re-identification of Anonymized CDR Records Using Information Fusion
• Privacy issues
• Conclusions

Data Fusion and Location data
• Data Fusion
• Location Data types:
- CDR (Call Detail Records), aggregated or individual.
- Geo-tagged social network data or location-based services (LBS) such as Foursquare.
- Location data published as open data, for example census data.

Data fusion for event detection using aggregated
CDR and geo-tagged social network data
• Detecting and describing events happening in urban areas by
analysing spatio-temporal data
• Previous work: Laura Ferrari, Marco Mamei, Massimo Colonna (2012): “People get together on special
events: Discovering happenings in the city via cell network analysis”. Pervasive Computing and Communications
Workshops (PERCOM Workshops), 2012 IEEE International Conference on.
• Publication: Alket Cecaj, Marco Mamei (2016): “Data Fusion for City Life Event Detection”. In: Journal of
Ambient Intelligence and Humanized Computing, pp. 1–15.

The dataset: spatio-temporal aggregation
Spatial aggregation
Temporal aggregation

Outlier detection method
IQR method:
[LB, UB] = [Q25 - k*IQR, Q75 + k*IQR]
M method:
[LB, UB] = [Q50 - k*Q50, Q50 + k*Q50]
Q75 method:
[LB, UB] = [Q25 - k*Q25, Q25 + k*Q75]
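
The IQR and M bounds above can be sketched in a few lines of Python. This is a minimal illustration, not the thesis implementation: the function names, the default values of k, the quartile interpolation rule, and the sample counts are all assumptions, and values are flagged when they fall outside [LB, UB].

```python
from statistics import median, quantiles


def iqr_bounds(values, k=1.5):
    """[LB, UB] = [Q25 - k*IQR, Q75 + k*IQR]; k=1.5 is an assumed default."""
    # "inclusive" interpolation is one common quartile convention (an assumption).
    q25, _, q75 = quantiles(values, n=4, method="inclusive")
    iqr = q75 - q25
    return q25 - k * iqr, q75 + k * iqr


def median_bounds(values, k=1.0):
    """[LB, UB] = [Q50 - k*Q50, Q50 + k*Q50]; k=1.0 is an assumed default."""
    q50 = median(values)
    return q50 - k * q50, q50 + k * q50


def outlier_indices(values, bounds):
    """Return the positions of values that fall outside [LB, UB]."""
    lb, ub = bounds
    return [i for i, v in enumerate(values) if v < lb or v > ub]


# Hypothetical activity counts for one cell over eight time slots.
counts = [100, 110, 95, 105, 98, 102, 400, 101]
iqr_outliers = outlier_indices(counts, iqr_bounds(counts))
median_outliers = outlier_indices(counts, median_bounds(counts))
```

Here the slot with 400 units of activity is the only one flagged by either rule; smaller values of k tighten the interval and flag more slots.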

Ground-truth dataset
Events happening in the period of time the data covers:
• Football matches
• Fairs
• Protests
• Other events, large crowds

Measuring precision and recall of the system
True positives (tp)
False positives (fp)
False negatives (fn)
Precision = tp / (tp + fp)
Recall = tp / (tp + fn)
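
As a small worked example of these two formulas, the sketch below compares a set of detected events against a ground-truth set. Representing an event as an (area, day) pair, and the sample events themselves, are assumptions made for illustration only.

```python
def precision_recall(detected, ground_truth):
    """Compute (precision, recall) from two collections of event identifiers."""
    detected, ground_truth = set(detected), set(ground_truth)
    tp = len(detected & ground_truth)   # detected and really happened
    fp = len(detected - ground_truth)   # detected but not in the ground truth
    fn = len(ground_truth - detected)   # happened but was not detected
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical events, each identified by an (area, day) pair.
detected = {("stadium", "day-10"), ("fair", "day-12"), ("square", "day-15")}
truth = {("stadium", "day-10"), ("fair", "day-12"), ("station", "day-20")}
precision, recall = precision_recall(detected, truth)
```

With two true positives, one false positive, and one false negative, both precision and recall come out to 2/3.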

Precision-recall of the event detection system: CDR

Improving event detection results by data fusion
By combining the results from the two datasets:
• The precision-recall performance of the method improves.
• The improvement is limited in the long run by the main dataset.
• The same improvement can also be observed by joining the results
of the other datasets.

Data fusion for event description
With CDR data alone, events can be detected but not described:
• By joining the results, the datasets can complement and enrich
each other.
• In this case the social dataset can be used to describe the
events semantically.

Re-identification of CDR data using geo-tagged social
network data
Information fusion for de-anonymizing anonymized CDR data.
de Montjoye, Y.-A. et al. (2013). “Unique in the Crowd: The Privacy Bounds of
Human Mobility”. In: Scientific Reports 3, 1376.
Cecaj, Alket, Marco Mamei, and Franco Zambonelli (2015). “Re-identification and Information
Fusion Between Anonymized CDR and Social Network Data”. Journal of Ambient Intelligence
and Humanized Computing, pp. 1–14.

CDR and social data: event distribution and radius of gyration (R.G.)

Mobility measures and uniqueness of users' mobility (unique in the crowd)
Knowledge extraction: uniqueness of traces

Knowledge extraction: uniqueness of mobility traces

Re-identification: probabilistic approach
• Given that CDR user Ci has Ni events (points) in common with FTi, how likely is it that the two
users are the same?
• The question is both novel (no other works address it in this domain) and fundamental.
• Conditional probability
• Even if the percentage is low, in a data set of millions of users a considerable
number of them can still be identified.
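
One generic way to frame the conditional-probability question is Bayes' rule. The sketch below is not the estimator used in the thesis; the function name, the two likelihoods, and the prior are hypothetical inputs that would have to be estimated from the data.

```python
def posterior_same_user(p_n_given_same, p_n_given_diff, prior_same):
    """P(same user | N common events) via Bayes' rule.

    p_n_given_same: probability of observing N common events if the two
                    records belong to the same person (assumed input).
    p_n_given_diff: probability of observing N common events by chance
                    between two different people (assumed input).
    prior_same:     prior probability that a random pair is the same person.
    """
    numerator = p_n_given_same * prior_same
    denominator = numerator + p_n_given_diff * (1.0 - prior_same)
    return numerator / denominator if denominator else 0.0


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
posterior = posterior_same_user(0.8, 0.2, 0.5)
```

With these illustrative inputs the posterior is 0.4 / (0.4 + 0.1) = 0.8. In a realistic setting the prior is tiny (one matching pair among millions of users), so a high posterior needs the chance of N coincidental matches between different people to be far smaller still.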

Conclusions
• Information fusion as an enabling process for novel applications
- Future work oriented towards the “structured data fusion” idea
• Privacy
- anonymity vs. re-identification and the remaining utility of the data
- variations of existing privacy-preserving techniques (e.g. differential privacy)

Publications
• Nicola Bicocchi, Alket Cecaj, Damiano Fontana, Marco Mamei, Andrea Sassi, Franco Zambonelli: “Collective Awareness
for Human ICT Collaboration in Smart Cities”. IEEE WETICE International Conference on State-of-the-Art Research in
Enabling Technologies for Collaboration, 17-20 2013.
• Alket Cecaj, Marco Mamei, Nicola Bicocchi: “Re-identification of Anonymized CDR Datasets Using Social Network
Data”. IEEE PerCom International Conference on Pervasive Computing and Communications. Budapest, Hungary, 24-28, 2014.
• Alket Cecaj, Marco Mamei (2016): “Data Fusion for City Life Event Detection”. In: Journal of Ambient Intelligence and
Humanized Computing, pp. 1–15.
• Nicola Bicocchi, Alket Cecaj, Damiano Fontana, Marco Mamei, Andrea Sassi, Franco Zambonelli (2014): “Social
Collective Awareness in Socio-Technical Urban Superorganisms”. Social Collective Intelligence: Combining the Powers
of Humans and Machines to Build a Smarter Society, Part III, Applications and Case Studies, p. 227.
• Alket Cecaj, Marco Mamei, Franco Zambonelli (2015): “Re-identification and Information Fusion Between
Anonymized CDR and Social Network Data”. In: Journal of Ambient Intelligence and Humanized Computing, pp. 1–14.
