Semantic Labeling of Places based on Phone Usage Features using Supervised Learning
A. Rivero-Rodriguez, H. Leppäkoski, R. Piché
19.11.2014 
Tampere University of Technology 
Tampere, Finland 
www.tut.fi/posgroup 
November 21, 2014 
Corpus Christi, Texas, USA 
UPIN-LBS 
Context inference and awareness
This talk describes the design of algorithms that let a smartphone learn your significant places.
Outline: Training data, Features, Classifiers
MDC dataset 
Idiap and NRC-Lausanne 
Lausanne Data Collection Campaign (2009-2011) 
Records of 200 users over 18 months 
Captures all types of information 
Users provide extra information (labels!) 
Anonymisation 
46 GB of data! 
Training Data
Active Phone Usage 
calls, messages 
calendar, contacts 
application usage 
Passive Phone Usage
network information
system information
location & movement 
Features Available 
Training Data
Training Data
The places were identified by clustering, then labeled by the user.
[Figure: map with a 200 m scale bar showing clustered places labeled Friend's Home, Restaurant, Work and Home]
Features
We selected 14 features that could be used by a place-labelling application.
Call logs: callsTimeRatio, callsPerHour
Accelerometer: idleStillRatio, walkRatio, vehicleRatio, sportRatio
System: duration, startHour, endHour, nightStay, batteryAvg, chargingTimeRatio, sysActiveRatio, sysActStartsPerHour
Features
We considered two different data representations: one feature vector per visit and one feature vector per place.
Features
We preprocessed the data to obtain the features for both approaches.
[Flowchart: preprocessing pipeline]
1. Write the definitions for the DB queries and make the queries on the system, call log and accelerometer activity data, together with start times, end times, user ids and place labels.
2. Visits approach: compute times, counts and averages for each visit, then compute ratios. Output: feature vectors for visits (visits_20min.csv).
3. Places approach: accumulate times and counts and weight the averages for each user and place, then compute ratios. Output: feature vectors for places (places.csv).
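The difference between the two representations can be illustrated with a minimal Python sketch. It is not the authors' preprocessing code: the record fields (`user`, `place`, `start`, `end`, `n_calls`, `call_seconds`, `battery_avg`), the exact feature formulas and the toy values are all assumptions made for illustration. It computes ratios per visit, and for places it first accumulates times and counts per user and place, weighting averages by visit duration, before computing the same ratios.

```python
from collections import defaultdict

def visit_features(visit):
    """Per-visit features from raw times and counts (hypothetical field names).

    Assumes the visit has a positive duration in seconds.
    """
    duration_s = visit["end"] - visit["start"]
    return {
        "duration": duration_s,
        "callsPerHour": visit["n_calls"] / (duration_s / 3600.0),
        "callsTimeRatio": visit["call_seconds"] / duration_s,
        "batteryAvg": visit["battery_avg"],
    }

def place_features(visits):
    """Per-place features: accumulate per (user, place), then compute ratios."""
    totals = defaultdict(lambda: {"duration": 0.0, "n_calls": 0,
                                  "call_seconds": 0.0, "battery_weighted": 0.0})
    for v in visits:
        duration_s = v["end"] - v["start"]
        t = totals[(v["user"], v["place"])]
        t["duration"] += duration_s
        t["n_calls"] += v["n_calls"]
        t["call_seconds"] += v["call_seconds"]
        # Weight each visit's average by its duration.
        t["battery_weighted"] += v["battery_avg"] * duration_s
    return {
        key: {
            "duration": t["duration"],
            "callsPerHour": t["n_calls"] / (t["duration"] / 3600.0),
            "callsTimeRatio": t["call_seconds"] / t["duration"],
            "batteryAvg": t["battery_weighted"] / t["duration"],
        }
        for key, t in totals.items()
    }

# Made-up toy data: two visits by user 1 to the same place "A".
visits = [
    {"user": 1, "place": "A", "start": 0, "end": 3600,
     "n_calls": 2, "call_seconds": 360, "battery_avg": 80.0},
    {"user": 1, "place": "A", "start": 10000, "end": 17200,
     "n_calls": 1, "call_seconds": 180, "battery_avg": 50.0},
]
per_visit = [visit_features(v) for v in visits]   # two feature vectors
per_place = place_features(visits)                # one feature vector
```

The visits representation yields many training samples per place, while the places representation yields one sample per user and place with less noisy ratios.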
Classifiers
We applied five popular classification methods to the data:
Naïve Bayes (NB)
$$P(X \mid A, B) = \frac{P(A \mid X)\, P(B \mid X)\, P(X)}{P(A)\, P(B)}$$
Decision Tree (DT)
K-nearest neighbors (K-NN)
Bagged Tree (BT)
Neural Networks (NN)
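As an illustration of the Naïve Bayes rule, the following minimal Python sketch estimates P(X), P(A | X) and P(B | X) from counts, where X is the place label and A, B are two discrete features. The feature choice (a night-stay flag and a coarse start time) and the training rows are made up, and the sketch normalises the scores over the classes instead of dividing by P(A) P(B); it is not the implementation used in the talk.

```python
from collections import Counter, defaultdict

def train_nb(rows):
    """rows: list of (feature_tuple, label). Returns the count tables."""
    class_counts = Counter(label for _, label in rows)
    feat_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    for feats, label in rows:
        for i, value in enumerate(feats):
            feat_counts[(i, label)][value] += 1
    return class_counts, feat_counts

def posterior(feats, class_counts, feat_counts):
    """P(label | features), assuming features are independent given the label."""
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total                                # P(X)
        for i, value in enumerate(feats):
            score *= feat_counts[(i, label)][value] / n  # P(A | X), P(B | X)
        scores[label] = score
    norm = sum(scores.values())
    if norm == 0:
        # Feature combination never seen for any class: fall back to the priors.
        return {label: n / total for label, n in class_counts.items()}
    return {label: s / norm for label, s in scores.items()}

# Made-up toy data: (nightStay, coarse start time) -> H, W or O.
rows = [
    ((True, "evening"), "H"),
    ((True, "evening"), "H"),
    ((False, "morning"), "W"),
    ((False, "morning"), "W"),
    ((False, "evening"), "O"),
    ((False, "morning"), "O"),
]
class_counts, feat_counts = train_nb(rows)
home_like = posterior((True, "evening"), class_counts, feat_counts)
work_like = posterior((False, "morning"), class_counts, feat_counts)
```

With this toy data a night stay starting in the evening is assigned to H, while a daytime stay starting in the morning is split between W and O in favour of W.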
Classifiers
Results - Visits approach
[Figure: bar chart of the number of cases (visits), axis from 0 to 10000, split into well classified and misclassified for each classifier and class. H: Home, W: Work, O: Others]
Well classified per classifier: NB 53%, DT 75%, BT 77%, NN 61%, KNN 58%
Per-class percentages on the bars, in extraction order: 70%, 28%, 65%; 82%, 80%, 12%; 80%, 89%, 7%; 96%, 29%, 2%; 93%, 25%, 7%
Classifiers
Results - Places approach
[Figure: bar chart of the number of cases, axis from 0 to 40, split into well classified and misclassified for each classifier and class. H: Home, W: Work, O: Others]
Well classified per classifier: NN 71%, DT 81%, NB 84%, BT 85%, KNN 71%
Per-class percentages on the bars, in extraction order: 97%, 88%, 69%; 86%, 91%, 67%; 97%, 91%, 69%; 93%, 85%, 69%; 86%, 79%, 53%
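The result charts report a well-classified share per class and per classifier. Assuming these are the usual per-class and overall accuracies, a minimal Python sketch of how such figures are obtained from true and predicted labels is given below; the label lists are made up and do not reproduce the numbers in the charts.

```python
from collections import Counter

def per_class_and_overall(true_labels, predicted_labels):
    """Share of well-classified cases for each true class, and overall."""
    totals = Counter(true_labels)
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    per_class = {label: correct[label] / totals[label] for label in totals}
    overall = sum(correct.values()) / len(true_labels)
    return per_class, overall

# Made-up toy labels (H: Home, W: Work, O: Others).
true_labels      = ["H", "H", "H", "H", "W", "W", "O", "O", "O", "O"]
predicted_labels = ["H", "H", "H", "O", "W", "O", "O", "O", "H", "W"]
per_class, overall = per_class_and_overall(true_labels, predicted_labels)
```

Because the overall figure weights each class by its number of cases, a classifier can score well on a frequent class and still have a low overall share if the remaining classes are mostly misclassified.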
Classifiers
Results & Future Work
Results:
Naive Bayes and Bagged Decision Tree with the Places data representation perform best.
NN and K-NN underperform and are computationally demanding.
The most relevant features are night stay, stay duration, start time, battery status and idle time.
Future work:
Try other classifiers (logistic regression, support vector machine).
Combine the Places and Visits data representations.
Alejandro Rivero 
alejandro.rivero@tut.fi
