Presentation about a project that uses machine learning to provide targeted advertisements while preserving users' privacy. The features used are available from mobile devices, such as Android phones. This was done as a course project during my Master's at Waterloo, for CS 889 with Prof. Michael Terry. I believe this approach can make ads useful rather than annoying while putting users' privacy first: a double win for users, and certainly a big win for companies that take advantage of it.
1. PALISTIN
Privacy Aware Location Independent SiTuation INference
by
Younos Aboulnaga
* Image captured from Sony XPERIA S Ad.: http://www.youtube.com/watch?v=FRinpj7th3Q
3. PRIVACY IN THE AGE OF UBI. COMP.
• “We collect information to provide better services to all of our users ... We may also use various technologies to determine location, such as sensor data from your device...”, Google’s privacy policy effective March 1st, 2012. [1]
• The first to announce data collection, and also the first to have users’ consent.
[1] Google’s Privacy Policy http://www.google.ca/intl/en/policies/privacy/
4. PRIVACY IN THE AGE OF UBI. COMP.
Ubiquitous computing has become a reality, but is it possible to embrace it while preserving users’ privacy?
5. AGENDA
• Motivation: Privacy and location sharing
• Mobile ad. targeting with location and beyond
• Situation Inference: Method and related work
• Evaluation
• Conclusion and key takeaways
9. LOCATION BASED SERVICES
• A Location Based Service (LBS) is an information or entertainment service, accessible with mobile devices through the mobile network, that makes use of the geographical position of the mobile device.
• Location Based Advertising (LBA) is a service provided to the user under this definition.
• Without loss of generality, we will focus on the ads served along with search engine results pages (SERPs).
14. THREATS IN LOCATION SHARING
• The minimum LBS query tuple is (User Id, Location, Keywords).
• We assume fine-grained location, i.e. latitude and longitude.
• Other location-indicating attributes, such as IP address, are ignored.
• We also assume that LBS providers keep logs of all the queries they receive.
20. THREATS IN LOCATION SHARING
• A Location Based Service (LBS) query must contain (User Id, Location, Keywords).
• We assume that LBS providers keep logs of all the queries they receive.
• A threat is any use of such log data to derive information other than what the data was originally collected for.
• Proof of concept: pleaserobme.com
25. THREATS IN LOCATION SHARING
• Logging (User Id, Location) over time:
• Makes it easy to extract patterns and predict a user’s location.
• Threatens the user’s property, and possibly safety.
• Proof of concept: pleaserobme.com
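To make the pattern-extraction threat concrete, here is a minimal sketch of how a logged (User Id, Location) history could be turned into a location predictor. The log format, the hour-of-day bucketing, and the function names are assumptions made for illustration; they are not taken from the talk or from any real provider.

```python
from collections import Counter, defaultdict

def build_hourly_profile(log):
    """Count how often each user was seen at each place, per hour of day.

    `log` is an iterable of (user_id, hour_of_day, place) tuples, a
    hypothetical simplification of a provider's query log.
    """
    profile = defaultdict(Counter)
    for user_id, hour, place in log:
        profile[(user_id, hour)][place] += 1
    return profile

def predict_place(profile, user_id, hour):
    """Return the most frequently logged place for this user and hour, or None."""
    counts = profile.get((user_id, hour))
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Toy log: the user is usually at the office at 9 and at home at 22.
log = [
    ("u1", 9, "office"), ("u1", 9, "office"), ("u1", 9, "cafe"),
    ("u1", 22, "home"), ("u1", 22, "home"),
]
profile = build_hourly_profile(log)
```

Even this naive frequency count is enough to guess when a home is likely to be empty, which is the point the pleaserobme.com proof of concept makes.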
29. THREATS IN LOCATION SHARING
• Logging (User Id, Keywords) over time:
• Keywords in LBS queries could be about Points of Interest.
• Reveals the behaviour of the user.
• Could be a privacy concern for some users.
34. THREATS IN LOCATION SHARING
• Logging (Location, Keywords) over time:
• Location could be Home.
• Directly identifies a person, or at least reduces the anonymity set to the size of the household.
• Example follows!
35. THREATS IN LOCATION SHARING
Image showing query density in Seattle, from a paper.
36. THREATS IN LOCATION SHARING
Google StreetView image of the house indicated by the arrow.
43. LOCATION INFORMATION
• Focusing on Physical Location, not Location of Interest
• Variety of sources at different levels of granularity
• Coarse grained
• Examples: IP Address and Communication Tower
• Fine grained
• Examples: GPS fix and WiFi Access Points
47. PRIVACY PRESERVING LBS
• Preventing the exploitation of user location data is a difficult problem, and an active area of research.
• A good solution is proposed in [1].
• All solutions require a location granularity covering at least k other users or Points Of Interest.
• Range targeting of Google AdWords can target a circle of radius as small as 1 km.
[1] Olumofin, Femi, Tysowski, Piotr K., Goldberg, Ian, and Hengartner, Urs. Achieving efficient query privacy for location based services. In Proceedings of the 10th international conference on Privacy enhancing technologies, PETS’10.
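The requirement that a reported location cover at least k other users can be illustrated with a simple cloaking sketch. The doubling grid, the planar coordinates, and the `cloak` function are assumptions chosen for illustration; this is not the scheme proposed in [1].

```python
import math

def cloak(user_xy, others_xy, k, start_size=1.0, max_size=1024.0):
    """Return the smallest grid cell (x0, y0, size) around `user_xy`
    that also contains at least `k` other users, doubling the cell size
    each step. Returns None if no cell up to `max_size` is large enough.

    Coordinates are in an arbitrary planar unit (e.g. km); the doubling
    grid is an illustrative choice only.
    """
    size = start_size
    while size <= max_size:
        # Snap the user's position to the grid cell of the current size.
        x0 = math.floor(user_xy[0] / size) * size
        y0 = math.floor(user_xy[1] / size) * size
        inside = sum(
            1 for (x, y) in others_xy
            if x0 <= x < x0 + size and y0 <= y < y0 + size
        )
        if inside >= k:
            return (x0, y0, size)
        size *= 2
    return None

# Hypothetical positions of other users near the querying user.
others = [(0.5, 0.5), (1.5, 0.2), (3.0, 3.0), (6.0, 1.0)]
cell = cloak((0.2, 0.3), others, k=2)
```

The sparser the area, the larger the reported cell becomes, which is why such schemes trade targeting precision for privacy.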
59. MOBILE AD. TARGETING
Geographic: Fine Grained
• Specific region
• Can be as small as a circle of radius 1 km in AdWords
• Requires tracking the user’s fine-grained location
• Raises privacy concerns
• Possible to circumvent privacy issues using complicated techniques, such as that described in [1]
[1] Olumofin, Femi, Tysowski, Piotr K., Goldberg, Ian, and Hengartner, Urs. Achieving efficient query privacy for location based services. In Proceedings of the 10th international conference on Privacy enhancing technologies, PETS’10.
63. MOBILE AD. TARGETING
Geographic: Exact
• Exact location can be used to infer activity [2]
• Relies on the presence of geographic information
• It is common for many POIs to share the same location while having different associated activities
• Protecting the user’s privacy while sharing exact location is still an open problem; location granularity should not be decreased
[2] Miluzzo, Emiliano et al. Sensing meets mobile social networks: the design, implementation and evaluation of the CenceMe application. In Proceedings of the 6th ACM conference on Embedded network sensor systems, SenSys ’08, pp. 337–350, New York, NY, USA, 2008. ACM.
68. BEYOND GEOGRAPHY: SITUATION
• Situation is basically what the user is doing.
• Could be called activity, but this usually means physical activities such as running, walking, etc.
• Could be called context awareness, but this can mean anything since there is no clear definition of context.
• Gives more information than fine-grained location.
• Augments privacy-preserving coarse-grained location.
72. PLACE SEMANTIC LABELS
• Good approximation for situation
• Natural answers to “Where are you?”
• 1: Home
• 2: At a friend’s place
• 3: At work/school
• 4: On the way
• 5: At my daughter’s school / Picking up my girlfriend from work
• 6: Walking, hiking, skiing, etc.
• 7: At the gym (indoor sports)
• 8: At a restaurant or bar
• 9: Shopping
• 10: On vacation
73. PALISTIN
Privacy Aware Location Independent SiTuation INference
• True positive rate of 0.965 on average (95% CI: 0.004)
79. SITUATION USE IN PERVASIVE ADVERTISING
• Gives insight about lifestyle
• Vertical targeting based on lifestyle
• Timing ads based on previous patterns
• Trigger for a process calmly running in the background of a mobile phone to become engaging
• Many other possibilities remain to be explored!
82. RELATED BODIES OF WORK
• Answering “What is the user doing?”:
• Activity Inference
• Behavioural Modelling
• Contextual Usage
• User Modelling
Context Inference or Acquisition
83. COMMON METHODS
With references to most related papers
• Latent Dirichlet Allocation [3,4]
• Hidden Markov Models and Bayesian Networks [5]
• Eigen Decomposition [6]
• Ontology based [7]
• Rule based [8]
[3] Farrahi, Katayoun and Gatica-Perez, Daniel. Discovering routines from large-scale human locations using probabilistic topic models. ACM Trans. Intell. Syst. Technol., 2:3:1–3:27, January 2011.
[4] Trinh-Minh-Tri Do and Daniel Gatica-Perez. By their apps you shall understand them: mining large-scale patterns of mobile phone usage. In Proceedings of the 9th International Conference on Mobile and Ubiquitous Multimedia (MUM '10). ACM, New York, NY, USA. 2010.
[5] Salamin, Hugues and Vinciarelli, Alessandro. Introduction to sequence analysis for human behavior understanding. In Computer analysis of human behavior, pp. 21–40. Springer London, 2011.
[6] Eagle, Nathan and Pentland, Alex. Reality mining: sensing complex social systems. Personal Ubiquitous Comput., 10:255–268, March 2006. ISSN 1617-4909.
[7] Gerber, Simon et al. PersonisJ: mobile, client-side user modelling. In Proceedings of the 18th international conference on User Modeling, Adaptation, and Personalization (UMAP'10). Springer-Verlag, Berlin, Heidelberg,
[8] Siewiorek, Daniel et al. SenSay: A Context-Aware Mobile Phone. In Proceedings of the 7th IEEE International Symposium on Wearable Computers (ISWC '03). IEEE Computer Society, Washington, DC, USA. 2003.
84. MAIN CONTRIBUTIONS
Other works
• Focus on Work and Home
• Many depend on daily or weekly patterns
• Data sets might be susceptible to biased sampling
• Some use geographic and/or other specific information, such as text of calendar entries
Our work
• Ten different labels
• Infers the situation of any user given only 10 minutes' worth of data
• Data set collected from a wide variety of participants
• No specific information; only privacy-preserving features
90. PROPOSED METHOD
• Supervised learning using C4.5 Decision Trees
• Data from Nokia’s Lausanne Data Collection Campaign [9]
• Mobile usage data of 80 participants for 17 months
• Users self-report the meaning of each location in which they stayed for more than 10 minutes, if meaningful
• Most labels are for Home, Work, and Home of a Friend. Prevalence of other labels is relatively low
[9] Kiukkonen, Niko, Blom, Jan, Dousse, Olivier, Gatica-Perez, Daniel, and Laurila, Juha. Towards rich mobile phone datasets: Lausanne data collection campaign. Technical report, IDIAP Research Institute, Switzerland, 2010.
95. PROPOSED METHOD
• Hierarchical ensemble of Support Vector Machines (SVM)
• Overcomes the bias of the dataset by first determining whether the test instance is among the prevalent classes or not.
• If the class is predicted to be one of the prevalent classes, it is determined using Pairwise SVMs.
• For other classes with very few labelled examples, the class is determined using One-Against-All SVMs.
• Dividing the dataset also results in high performance.
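The routing logic of such a hierarchical ensemble can be sketched as follows. This shows only the control flow: the classifiers are hypothetical callables standing in for trained SVMs, and the feature names and thresholds in the toy example are invented for illustration, not taken from the project.

```python
from collections import Counter
from itertools import combinations

def predict_hierarchical(x, gate, pairwise, one_vs_all, prevalent, rare):
    """Route instance `x` through a two-level ensemble.

    gate(x) -> True if x looks like one of the prevalent classes.
    pairwise[(a, b)](x) -> a or b, the winner of that pairwise classifier.
    one_vs_all[c](x) -> a score; higher means "more likely class c".
    All classifiers are hypothetical stand-ins for trained SVMs.
    """
    if gate(x):
        # Pairwise voting: each pair of prevalent classes casts one vote.
        votes = Counter(pairwise[pair](x) for pair in combinations(prevalent, 2))
        return votes.most_common(1)[0][0]
    # One-against-all: pick the rare class with the highest score.
    return max(rare, key=lambda c: one_vs_all[c](x))

# Toy stand-ins with made-up decision rules.
prevalent = ["home", "work", "friend"]
rare = ["gym", "restaurant"]

def gate(x):
    return x["wifi_aps"] < 20

pairwise = {
    ("home", "work"): lambda x: "work" if x["weekday_daytime"] else "home",
    ("home", "friend"): lambda x: "friend" if x["bluetooth"] > 3 else "home",
    ("work", "friend"): lambda x: "work" if x["weekday_daytime"] else "friend",
}
one_vs_all = {
    "gym": lambda x: x["movement"],
    "restaurant": lambda x: 1.0 - x["movement"],
}
```

Splitting the problem this way keeps the few examples of rare labels from being swamped by Home and Work during training, which is the dataset bias the slide refers to.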
97. FEATURE CONSTRUCTION
Extracted Features
• Foreground application Universal Identifier
• Number of running applications
• Media play events
• Communication event types (SMS or Voice Call)
• Voice call duration
• Communication direction
• Communication party known or unknown
• Audible ring or not
• Periods of inactivity
98. FEATURE CONSTRUCTION
Extracted Features
• Movement (Accelerometer)
• Movement (WiFi)
• Number of WiFi APs/SSIDs
• Number of Bluetooth devices
• Charger connected
• Battery level
• Time (day of week and time of day)
• Visit length
• Weather (temperature and sky condition)
• Type and recurrence of coinciding calendar event
• Label of previous visit
105. FEATURE CONSTRUCTION
Feature Selection/Dimensionality Reduction
• Four different feature selection algorithms were attempted.
• Results varied widely.
• Indicates high correlation between features.
• Principal Component Analysis reduces the correlation matrix to its principal eigenvectors.
• Performed better than Singular Value Decomposition on the dataset.
• Using only the eigenvectors produced by PCA, the average accuracy of Decision Trees increased from 0.59 to the current 0.96.
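As a rough illustration of what reducing a correlation matrix to its principal eigenvectors means, the sketch below computes the correlation matrix of a tiny feature table and extracts its leading eigenvector by power iteration. This is a teaching sketch in plain Python, not the implementation used in the experiments. It assumes every column is non-constant, and the sample rows are invented.

```python
import math

def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of `rows` (list of lists).

    Assumes every column has non-zero variance.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) for j in range(d)]
    # Standardize each column, then average the products of z-scores.
    z = [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / n for b in range(d)] for a in range(d)]

def top_eigenvector(matrix, iterations=200):
    """Power iteration: returns (eigenvalue, unit eigenvector) of largest magnitude."""
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the converged unit vector.
    mv = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
    eigenvalue = sum(a * b for a, b in zip(v, mv))
    return eigenvalue, v

# Invented data: the first two columns are nearly proportional.
rows = [[1.0, 2.0, 5.0], [2.0, 4.1, 3.0], [3.0, 5.9, 4.0], [4.0, 8.0, 1.0]]
corr = correlation_matrix(rows)
eigenvalue, component = top_eigenvector(corr)
```

When features are highly correlated, one eigenvalue captures most of the total variance, so projecting onto a few leading eigenvectors removes redundancy before the classifier sees the data.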
111. FEATURE CONSTRUCTION
Data Segmentation
• A visit is a stay of a user in one location for 10+ minutes.
• It might be a visit to a significant location or not.
• Visits in which a lot of movement is detected (from WiFi) are further divided into micro locations, each 10 square meters in area.
• Readings from all inputs within the time period of a visit are bagged as (feature, value) pairs into micro location “documents”.
• The term frequencies of the (feature, value) pairs are used as the features fed to the machine learning algorithms.
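The segmentation and bagging steps can be sketched as follows. The reading format (a tuple of timestamp in seconds, place identifier, feature, value), the function names, and the sample data are hypothetical, and the split of a visit into micro locations is left out for brevity.

```python
from collections import Counter
from itertools import groupby

MIN_VISIT_SECONDS = 10 * 60  # the "10+ minutes" threshold from the slide


def segment_visits(readings):
    """Group time-ordered readings into visits.

    Each reading is a hypothetical (timestamp_s, place_id, feature, value) tuple.
    A visit is a maximal run of consecutive readings at the same place that
    spans at least MIN_VISIT_SECONDS; shorter runs are discarded.
    """
    visits = []
    for _, run in groupby(readings, key=lambda r: r[1]):
        run = list(run)
        if run[-1][0] - run[0][0] >= MIN_VISIT_SECONDS:
            visits.append(run)
    return visits


def to_document(visit):
    """Bag a visit: count how often each (feature, value) pair occurs in it."""
    return Counter((feature, value) for _, _, feature, value in visit)


readings = [
    (0, "A", "charger", "connected"),
    (300, "A", "app", "calculator"),
    (660, "A", "charger", "connected"),
    (700, "B", "app", "podcasting"),      # only 100 s at B: too short for a visit
    (800, "B", "charger", "disconnected"),
    (900, "A", "app", "calculator"),
    (1600, "A", "app", "calculator"),
]

documents = [to_document(v) for v in segment_visits(readings)]
```

In this sketch the place identifier is used only to cut the stream into visits. The resulting documents contain nothing but counts of (feature, value) pairs.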
117. EXPERIMENT SETUP
• The true positive rate of the following algorithms is reported:
• Decision Trees, Naive Bayes, AdaBoost, and Bayes Networks
• Each algorithm is run 80 times to perform Leave-One-Out Cross-Validation on the 80 users in the dataset.
• This is done twice: once with all features and once with selected features.
• Weka 3.6*, an open source Java machine learning library, is used to perform the experiment.
* Weka 3 (http://www.cs.waikato.ac.nz/ml/weka/)
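The evaluation loop described above, where each run holds out one user, might look roughly like the sketch below. The majority-label learner is only a stand-in for the Weka classifiers, and the data layout and names are assumptions.

```python
from collections import Counter
from statistics import mean, pvariance


def train_majority(examples):
    """Stand-in learner: always predicts the most common training label."""
    label = Counter(lbl for _, lbl in examples).most_common(1)[0][0]
    return lambda features: label


def leave_one_user_out(data_by_user, train=train_majority):
    """Return one score per user, each obtained with that user's data held out."""
    scores = []
    for held_out, test_examples in data_by_user.items():
        training = [
            example
            for user, examples in data_by_user.items()
            if user != held_out
            for example in examples
        ]
        model = train(training)
        correct = sum(model(features) == label for features, label in test_examples)
        scores.append(correct / len(test_examples))
    return scores


# Hypothetical data: user id -> list of (features, place label) examples.
data_by_user = {
    "u1": [({}, "home"), ({}, "home"), ({}, "work")],
    "u2": [({}, "home"), ({}, "work")],
    "u3": [({}, "home"), ({}, "home")],
}
accuracies = leave_one_user_out(data_by_user)
average, variance = mean(accuracies), pvariance(accuracies)
```

Holding out a whole user rather than a single sample means the model is always tested on a person it has never seen. With 80 users this gives 80 runs per algorithm, and the per-run scores can then be summarised by their average and variance.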
118. EXPERIMENT RESULTS
Algorithm              Average Accuracy   Accuracy Variance
C4.5 Decision Trees    59.69%             2.44%
~ with PCA             96.52%             0.04%
AdaBoost               56.85%             2.68%
~ with PCA             68.66%             1.86%
Naive Bayes            34.93%             4.34%
~ with Gain Ratio      44.74%             2.63%
Bayes Network          34.93%             4.34%
~ with CFS             46.34%             2.72%
122. FEATURE CONSTRUCTION
Feature Ranking
• Features were selected using 4 different feature selection algorithms: Information Gain Attribute Ranking, ReliefF, Correlation-based Feature Selection, and Consistency-based Subset Evaluation.
• Each algorithm was run 80 times to perform LOO-CV.
• Results were merged using Consensus Ranking [10].
• The whole process was carried out twice, for 2 different base classifiers: Naive Bayes and Decision Tree. Results were identical.
[10] Davenport, Andrew J. and Kalagnanam, Jayant. A Computational Study of the Kemeny Rule for Preference Aggregation. AAAI 2004, Proceedings of the Nineteenth National Conference on Artificial Intelligence, July 2004.
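As an illustration of merging several feature rankings into one, the sketch below applies the Kemeny rule by brute force: it picks the ordering with the smallest total number of pairwise disagreements with the input rankings. The slides do not say whether the exact rule or one of the heuristics studied in [10] was used, so treat this as an assumption. Trying every permutation is only feasible for a handful of features, and the four sample rankings are made up.

```python
from itertools import combinations, permutations


def kendall_tau_distance(ranking_a, ranking_b):
    """Number of item pairs that the two rankings order differently."""
    pos_a = {item: i for i, item in enumerate(ranking_a)}
    pos_b = {item: i for i, item in enumerate(ranking_b)}
    return sum(
        (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
        for x, y in combinations(ranking_a, 2)
    )


def kemeny_consensus(rankings):
    """Ordering that minimises the summed Kendall tau distance to all rankings."""
    items = rankings[0]
    return min(
        permutations(items),
        key=lambda candidate: sum(kendall_tau_distance(candidate, r) for r in rankings),
    )


# Hypothetical output of four feature selection algorithms (best feature first).
rankings = [
    ["battery", "charger", "movement", "time"],
    ["charger", "battery", "movement", "time"],
    ["battery", "charger", "time", "movement"],
    ["battery", "movement", "charger", "time"],
]
consensus = kemeny_consensus(rankings)
```

Here every pair of features is ordered the same way by at least three of the four rankings, so the consensus simply follows the majority on each pair.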
123. FEATURE CONSTRUCTION
Feature Ranking: Top 10 features
• Battery Level
• Charger Connected
• Communication Party Known or Unknown
• Foreground Application Universal Identifier
• Movement (Accelerometer)
• Audible ring or not
• Visit Length
• Time (Day of the Week and Time of the Day)
• Type and Recurrence of Coinciding Calendar Event
• Change in Number of Bluetooth Devices
124. FEATURE CONSTRUCTION
Feature Ranking: Top 10+ values
• Charger Connected: Both
• Battery Level: All levels
• Communication Party Known: Both values
• Movement (Accel.): All levels
• App.: Calculator
• Number of WiFi APs: 2-3
• App.: EasyVoIP
• App.: Podcasting
• App.: JoikuSpot Light (turns phone into AP)
• App.: Image Print
129. CONCLUSION
Possible System Architecture
• The ad server would accept the place's semantic label as an extra input and use it for better targeting.
• An inference server would take the bag of (input, value) pairs for a 10-minute stay in one location and use its model to infer the place label.
• The ad client on the mobile phone would be responsible for collecting the inputs and bagging them within visits and micro locations, using WiFi to detect movement [11].
[11] LaMarca, Anthony, Hightower, Jeff, Smith, Ian, and Consolvo, Sunny. Self-mapping in 802.11 location systems, 2005.
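The division of work between the three components can be sketched as three small functions. Everything here is hypothetical: the function names, the evidence-counting "model" (a stand-in for the trained classifier), and the sample ads are illustration only, not part of the proposed system.

```python
from collections import Counter


def client_build_document(readings):
    """On the phone: bag the (input, value) readings of one visit into counts."""
    return Counter(readings)


def inference_server_label(document, model):
    """Inference server: map a bag of counts to a place label.

    `model` is a hypothetical stand-in for the trained classifier: a dict
    from place label to the (input, value) pairs that count as evidence for it.
    """
    scores = {
        label: sum(document[pair] for pair in evidence)
        for label, evidence in model.items()
    }
    return max(scores, key=scores.get)


def ad_server_select(ads_by_label, place_label, default_ad="generic ad"):
    """Ad server: use the place label as one extra targeting input."""
    return ads_by_label.get(place_label, default_ad)


model = {
    "work": [("app", "calculator"), ("charger", "connected")],
    "commute": [("app", "podcasting")],
}
document = client_build_document(
    [("app", "calculator"), ("charger", "connected"), ("app", "calculator")]
)
label = inference_server_label(document, model)
ad = ad_server_select({"work": "lunch deal nearby"}, label)
```

In this arrangement the inference server receives only counts of (input, value) pairs, and the ad server receives only the inferred label.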
134. KEY TAKEAWAYS
• Geographic targeting has reached a point where any further refinement of location would raise privacy issues.
• Very fine-grained location is not always useful.
• PALISTIN enables situation-based mobile ad targeting while requiring only counts of events happening on the mobile device and the names of the applications used.
• The predicted place label is correct 96.5% of the time.