National University of Singapore
Yuhui Wang
Advisor: Prof. Mohan Kankanhalli
NUS Graduate School for Integrative Sciences & Engineering
National University of Singapore
16 November 2016
Fusing Physical and Social
Sensors for Situation Awareness
Ph.D. Thesis Defense
1
National University of Singapore
24th International World Wide Web Conference
Big Sensor Data
• Physical Sensors
2
Big Sensor Data
• Social Sensors
3
Big Sensor Data
• Physical Sensors
o Camera
o By 2015, 65% of global big data comes from surveillance video
-- T. Huang, “Surveillance Video: The Biggest Big Data,” Computing Now, vol. 7, no. 2, Feb. 2014, IEEE Computer Society
• Social Sensors
o Twitter
o 82% access Twitter via mobile devices
-- https://www.statista.com/statistics/282087/number-of-monthly-active-twitter-users/
4
Live WebCams
5
Live WebCams
6
Motivation
Physical
Sensors
Social
Sensors
• Physical and social sensors observe the same situation from different perspectives
7
Literature Review
TwitterStand [2009]
Sakaki et al. [2010]
Weng & Lee [2011]
Walther & Kaisser [2013]
Twevent [2012]
Yang et al. [2016]
...
Kulkarni et al. [2005]
Atrey et al. [2007]
Jacobs et al. [2009]
Babari et al. [2012]
Jing et al. [2016]
Lin et al. [2016]
…
Mediamill101 [2006]
VIREO-374 [2010]
Snoek et al. [2006]
Karpathy et al. [2014]
Vinyals et al. [2014]
Markatopoulou et al. [2016]
Eventshop [2012]
Vivek et al. [2010]
Pan et al. [2013]
Wu et al. [2015]
Semantic Understanding &
Image/Video Concept
Detection
Situation Understanding
using Social Sensors
Event Detection
using Physical Sensors
This work
8
Problem & Challenges
• Work independently
o Incomplete information
• Different Modalities
o Numeric (physical) vs Symbolic (social)
• Different Spatio-Temporal Density
o Spatial: physical < social
o Temporal: physical > social
• Uncertain Data (Noise)
o No restrictions on content
o Failure of sensors
o Maintenance of devices
9
Work I:
Tweeting Cameras for Event Detection
10
Motivation
[Figure: Traditional Camera System vs. Tweeting Camera System. Labels in the figure: streams of raw bits (“Big Visual Data”), “?”, “!? …”]
11
Multi-Layer Tweeting Cameras Framework
PST = < Probability, Space, Time, Label>
[Diagram: three processing layers between the sensors and the user]
Inputs: Physical Sensors (Sensor Data Collector), Social Sensors (Social Information Crawler)
Low-level Concept Detection: Feature Extraction; concepts Crowd, Action, Parade, Face, Car, Traffic, …; Database (Building Blocks)
Mid-level Concept Filtering: Filtering & Analytic Operators; Event Signal Detection
High-level Social Sensor Fusion: Social Media; Social Information Extraction; Cross Media Analysis
Outputs to the User: Cmage; Event Detection
Other labels in the diagram: Isensor, “Parade is going”
12
Multi-Layer Tweeting Cameras Framework
PST = < Probability, Space, Time, Label>
[Diagram, first layer only: Sensor Data Collector; Low-level Concept Detection with Feature Extraction, concepts (Crowd, Action, Parade, Face, Car, Traffic, …), and Database (Building Blocks)]
13
Multi-Layer Tweeting Cameras Framework
PST = < Probability, Space, Time, Label>
[Diagram, first two layers: Sensor Data Collector; Low-level Concept Detection with Feature Extraction, concepts (Crowd, Action, Parade, Face, Car, Traffic, …), and Database (Building Blocks); Mid-level Concept Filtering with Filtering & Analytic Operators and Event Signal Detection. Other label: Isensor]
14
Multi-Layer Tweeting Cameras Framework
PST = < Probability, Space, Time, Label>
[Diagram, all three layers: Sensor Data Collector and Social Information Crawler; Low-level Concept Detection with Feature Extraction, concepts (Crowd, Action, Parade, Face, Car, Traffic, …), and Database (Building Blocks); Mid-level Concept Filtering with Filtering & Analytic Operators and Event Signal Detection; High-level Social Sensor Fusion with Social Media, Social Information Extraction, and Cross Media Analysis. Other label: Isensor]
15
Multi-Layer Tweeting Cameras Framework
Low-level Concept Detection
• Concept Detectors
Columbia 374
VIREO-374
Mediamill (101)
VIREO-WEB81
CU-VIREO 374
…
16
Low-level Concept Detection
• Concept Detectors
Columbia 374
VIREO-374
Mediamill (101)
VIREO-WEB81
CU-VIREO 374
…
Example frame: 6 Avenue @ 23 Street, 15:10, 13th Dec, 2014
Concept Label | Confidence | Location | Time
Crowd | 0.9 | 6 Ave @ 23 St | 15:10, Dec 13th
Parade | 0.8 | 6 Ave @ 23 St | 15:10, Dec 13th
Car | 0.1 | 6 Ave @ 23 St | 15:10, Dec 13th
Outdoor | 0.5 | 6 Ave @ 23 St | 15:10, Dec 13th
… | … | … | …
17
Low-level Concept Detection
I’ve seen crowd here
now, 90% sure
Low Level Camera Tweet
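A low-level camera tweet can be pictured as a small record holding the four PST fields. The sketch below is only an illustration: the class name `CameraTweet`, its field names, and the `text` method are hypothetical and do not come from the thesis; the example values are taken from the slide.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CameraTweet:
    """One PST record: <Probability, Space, Time, Label> (hypothetical layout)."""
    prob: float      # detector confidence in [0, 1]
    location: str    # where the camera is (space)
    time: datetime   # when the frame was captured
    label: str       # detected concept

    def text(self) -> str:
        # Render the record in the style of the slide's example tweet.
        return f"I've seen {self.label.lower()} here now, {self.prob:.0%} sure"


tweet = CameraTweet(0.9, "6 Avenue @ 23 Street", datetime(2014, 12, 13, 15, 10), "Crowd")
print(tweet.text())
```

Keeping the record immutable makes it safe to store and query later, which is what the mid-level operators rely on.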
18
Low-level Concept Detection
[Figure: several cameras each send the tweet “I’ve seen crowd here now, 90% sure” to server storage]
19
Mid-level Concept Filtering
• Filtering & Analytic Operators
o Query Operators:
E.g. show the March 17th data for the concept “parade” at 5th Avenue with a confidence of at least 0.8:
Query: $\theta_{P\_PROB \wedge P\_LABEL \wedge P\_LOC \wedge P\_TIME}(S)$
where $P\_PROB = P\_prob(0.8 \le p)$, $P\_LABEL = P\_label(label = parade)$,
$P\_LOC = P\_loc(CAM_i = \text{5th Avenue})$, $P\_TIME = P\_time(t = \text{March 17th})$
o Statistical Functions
o mean, max, min, sum
o Processing Operators
o Extremes
o Smooth
o Trend
o Outlier (Anomaly)
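The query operator above amounts to a selection with a conjunction of predicates over stored camera tweets. A minimal sketch, assuming tweets are stored as plain tuples; the records, the year 2015, and the helper `select` are made up for illustration:

```python
from datetime import date

# Hypothetical camera tweets as (probability, location, day, label) tuples.
S = [
    (0.85, "5th Avenue", date(2015, 3, 17), "parade"),
    (0.60, "5th Avenue", date(2015, 3, 17), "parade"),
    (0.90, "5th Avenue", date(2015, 3, 17), "crowd"),
    (0.95, "Broadway", date(2015, 3, 17), "parade"),
    (0.88, "5th Avenue", date(2015, 3, 16), "parade"),
]


def select(records, *predicates):
    """Keep the records that satisfy every predicate (logical AND)."""
    return [r for r in records if all(p(r) for p in predicates)]


def p_prob(r):
    return r[0] >= 0.8


def p_loc(r):
    return r[1] == "5th Avenue"


def p_time(r):
    return r[2] == date(2015, 3, 17)


def p_label(r):
    return r[3] == "parade"


result = select(S, p_prob, p_label, p_loc, p_time)
mean_prob = sum(r[0] for r in result) / len(result)  # a statistical function
print(result, mean_prob)
```

Statistical functions such as mean or max can then be applied to the selected records, as the last line does.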
20
Operators
[Figure: probability of the “parade” concept by hour during the St Patrick’s Day Parade event]
$PROJECTION(SELECTION(EXTREME(SMOOTH(MAP(\{(t_1, loc_1, raw\_image_1, 0.2), (t_2, loc_2, raw\_image_2, 0.3), \ldots, (t_n, loc_n, raw\_image_n, 0.7)\}))))) = (t_x, loc_1, parade, 0.7)$
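The operator chain can be read as ordinary function composition. The slides do not define the operators, so everything below is an assumption made for illustration: MAP applies a concept detector per frame, SMOOTH is a centered moving average, EXTREME keeps the maximum, SELECTION thresholds the probability, and PROJECTION picks fields. The "detector" is a stub that returns a precomputed score.

```python
def map_concept(frames, detector, label):
    # MAP: run a concept detector on every raw frame.
    return [(t, loc, label, detector(img)) for t, loc, img in frames]


def smooth(records, window=3):
    # SMOOTH: centered moving average over the probabilities.
    probs = [r[3] for r in records]
    half = window // 2
    out = []
    for i, r in enumerate(records):
        lo, hi = max(0, i - half), min(len(probs), i + half + 1)
        out.append(r[:3] + (sum(probs[lo:hi]) / (hi - lo),))
    return out


def extreme(records):
    # EXTREME: keep the record(s) with the highest probability.
    best = max(r[3] for r in records)
    return [r for r in records if r[3] == best]


def selection(records, min_prob):
    # SELECTION: keep records at or above a probability threshold.
    return [r for r in records if r[3] >= min_prob]


def projection(records, fields):
    # PROJECTION: keep only the requested tuple positions.
    return [tuple(r[i] for i in fields) for r in records]


# Hypothetical hourly frames: (hour, location, raw image). The "image" here is
# just a precomputed score so that the stub detector can return it unchanged.
frames = [(10, "loc1", 0.2), (11, "loc1", 0.3), (12, "loc1", 0.6),
          (13, "loc1", 0.9), (14, "loc1", 0.6), (15, "loc1", 0.3)]

result = projection(
    selection(extreme(smooth(map_concept(frames, lambda img: img, "parade"))), 0.5),
    (0, 1, 2, 3),
)
print(result)
```

With these made-up scores the chain returns a single record at hour 13 with a smoothed probability of about 0.7.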
21
What about Social Tweets ?
Preprocessing: Tokenizer, Normalizer
Dictionary: English Words, Slang Words
Raw tweet:
2015-01-26 12:05:32
I'm ready for you snowpocalypse! #madisonsqpark !!!! @madisonsqpark zzzzzzz #snowpocalypse @ Madison Square ParkQ&amp http://t.co/KoRJ4FYOkZ
40.7421 -73.988283
Preprocessed tweet:
2015-01-26 12:05:32
I'm ready for you snowpocalypse #madisonsqpark #snowpocalypse Madison Square Park
40.7421 -73.988283
22
What about Social Tweets ?
Representative Term Mining
[Figure: timeline comparing the event day, $Day_{event}$, with previous days $Day_{prv1}$, $Day_{prv2}$, …, each over the same time window $(t_s - t_e)$]
23
Representative Term Mining
What about Social Tweets ?
[Figure: tweets grouped by location, Loc 1, Loc 2, Loc 3, …, Loc N]
$T_C$: tweets posted during events
$T_H$: tweets posted before events
$tf$: term frequency
$idf$: inverse document frequency
$w_{term} = tf(term, T_C) \times idf(term, T_H)$
s.t.
$tf(term, T_C) = \frac{f(term, T_C)}{\max\{f(\omega, T_C) : \omega \in T_C\}}$
$idf(term, T_H) = \log \frac{|T_H|}{|\{t_h \in T_H : term \in t_h\}|}$
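The term weighting can be sketched in a few lines. One deviation is an assumption of this sketch rather than part of the slide: the document count in the idf denominator gets +1, because a term that never appears in the historic tweets would otherwise cause a division by zero. The function name and the toy tweets are hypothetical.

```python
import math
from collections import Counter


def term_weights(t_c, t_h):
    """t_c: tokenized tweets during the event; t_h: tokenized historic tweets."""
    freq = Counter(term for tweet in t_c for term in tweet)
    max_freq = max(freq.values())
    weights = {}
    for term, count in freq.items():
        tf = count / max_freq  # frequency normalized by the most frequent term
        containing = sum(1 for tweet in t_h if term in tweet)
        # Assumption: +1 smoothing so unseen terms do not divide by zero.
        idf = math.log(len(t_h) / (1 + containing))
        weights[term] = tf * idf
    return weights


# Toy data: "parade" is frequent during the event and absent from history.
t_c = [["parade", "5th"], ["parade", "crowd"], ["coffee"]]
t_h = [["coffee"], ["coffee", "work"], ["work"], ["lunch"]]
w = term_weights(t_c, t_h)
print(sorted(w, key=w.get, reverse=True))
```

Terms that are common during the event but rare beforehand get the highest weight, which is what makes them representative of the event.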
24
Data Analysis & Real World Events
26
DataSet
• NYC Traffic CCTV Camera
149 cameras all over Manhattan
Sampling rate: ~10s/f
Period: 2014 ~ May 2016
• Twitter Data (Geo-tagged)
o Region: Manhattan
o Oct 4th 2014 ~ Sep 2016
o Attributes: text, time, geo-coordinates, etc.
o Size: ~40,000/day
Event | Date | Time | Location
CBGB Music Festival | 12 Oct | 10am-7pm | Broadway 51st
Columbus Day Parade | 13 Oct | 11am-5pm | 5th Avenue
Hispanic Parade | 12 Oct | 12pm-5pm | 5th Avenue
Million March NYC Protest | 13 Dec | 2pm-5pm | Washington Square Park, 5th Avenue, Foley Square
27
Real-world Events
28
Real-world Events
Hispanic Parade, CBGB Music Festival, Columbus Day Parade
[Figure: given the event time and location, retrieve recent tweets and retrieve historic tweets]
29
“Million March NYC Protest” Event
30
Twitter Images
“people marching” : 0.5
“parade” : 0.4, “crowd” : 0.9
31
Demo : Tweeting Camera
A New Paradigm of Event-based Smart
Sensing Device
32
From 1925:
35mm Leica A
From 1942: CCTV Camera Network
NOW: New Tweeting Cameras Paradigm
“Looks like a fire ball here?”
Fire Event, Parade Event, Meeting Event, Jogging Event
Motivation
33
Tweeting Camera (Group Meeting Event)
34
https://www.youtube.com/watch?v=eXn89Z_MZwI
Summary
• Aggregation of physical sensors and social sensors
• Multi-layer tweeting camera framework
• Probabilistic Spatio-temporal Data (Camera Tweet)
• Analytic functions & operators
• Concept Based Image (Cmage)
• Feasibility via Real-world Events Data
35
Work II:
Cmage Based Hybrid Fusion of Multimodal
Event Signals
36
Motivation
• Geo-tagged Multisensor Data
• Noise and Sparsity
PST = <Loc, Time, Label, Prob>
[Diagram: the multi-layer tweeting cameras framework (Low-level Concept Detection, Mid-level Concept Filtering, High-level Social Sensor Fusion) with physical and social sensors as inputs and Cmage and Event Detection as outputs to the user]
Goal:
• Event Locating
• Better Visualization (what happens where)
37
Event Signals to Event Cmage
[Figure: event Cmage for “crowdedness”, with a color scale from 0 to 1]
38
Cmage-based Fusion Pipeline
Sensor Concepts (Crowd, People marching, Car, Traffic, Building) -> Extract Sensor Signals -> Gaussian Process Prediction -> Sensor Cmage
Social Terms (“MillionMarchNYC”, “HappyNewYear”, “SGHaze”) -> Extract Social Signals -> Social Cmage
Sensor Cmage + Social Cmage -> Bayesian Decision Fusion -> Fused Cmage -> Spatial Fusion -> Event Cmage
39
Sensor Cmage Pixel Value Estimation
Gaussian Process based Prediction
- using noisy and sparse observations
[Figure: observed pixel values and predicted pixel values]
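To make the idea concrete, the sketch below computes the posterior mean of a Gaussian process from a few noisy pixel observations. It is a simplified illustration under stated assumptions: an RBF kernel, a constant mean equal to the average observation, fixed hyperparameters (`length`, `noise`), and made-up observations. The thesis may use a different kernel and learned hyperparameters.

```python
import math


def rbf(a, b, length=1.0):
    # Squared-exponential kernel between two pixel coordinates.
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / (2 * length ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def gp_predict(obs_xy, obs_val, query_xy, noise=0.01, length=1.0):
    """Posterior mean at query_xy given noisy observations (constant-mean GP)."""
    mean = sum(obs_val) / len(obs_val)
    K = [[rbf(a, b, length) + (noise if i == j else 0.0)
          for j, b in enumerate(obs_xy)] for i, a in enumerate(obs_xy)]
    alpha = solve(K, [v - mean for v in obs_val])
    return [mean + sum(rbf(q, a, length) * w for a, w in zip(obs_xy, alpha))
            for q in query_xy]


# Hypothetical sparse observations: three cameras with "crowd" probabilities.
cameras = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
values = [0.9, 0.3, 0.5]
at_camera, between = gp_predict(cameras, values, [(0.0, 0.0), (1.0, 0.0)])
print(round(at_camera, 3), round(between, 3))
```

The prediction stays close to the observation at a camera location and interpolates smoothly for pixels between cameras, which is how the empty pixels of a sensor Cmage can be filled in.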
40
Hybrid Event Image Fusion
• Event decision fusion
• Spatial fusion
[Figure: pixel values (0 to 1) of the Sensor Image, including GP Predicted Pixels, and of the Social Image are combined by Decision Fusion; Spatial Fusion then computes each Fused Pixel of the Fused Image from the Reference Window around its Reference Pixel]
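The slide does not give the fusion formulas, so the following is only a plausible sketch under explicit assumptions: decision fusion is modeled as a Bayesian combination of two independent probabilities with a uniform prior, and spatial fusion as the mean over a square reference window. Both function names and the small example images are hypothetical.

```python
def decision_fuse(p_sensor, p_social):
    """Combine two event probabilities (assumes independence, uniform prior)."""
    num = p_sensor * p_social
    den = num + (1 - p_sensor) * (1 - p_social)
    return num / den if den else 0.5


def spatial_fuse(image, radius=1):
    """Replace each pixel by the mean of its reference window (assumption)."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [image[i][j]
                      for i in range(max(0, r - radius), min(rows, r + radius + 1))
                      for j in range(max(0, c - radius), min(cols, c + radius + 1))]
            out[r][c] = sum(window) / len(window)
    return out


# Hypothetical 2x2 sensor and social images with per-pixel event probabilities.
sensor = [[0.3, 0.6], [0.7, 0.7]]
social = [[0.6, 0.6], [0.7, 0.4]]
fused = [[decision_fuse(a, b) for a, b in zip(row_s, row_t)]
         for row_s, row_t in zip(sensor, social)]
event = spatial_fuse(fused)
print(fused, event)
```

Under this rule two agreeing sources reinforce each other (0.7 and 0.7 fuse to about 0.84), while a neutral 0.5 leaves the other source's value unchanged.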
41
Evaluation Metrics
• Saliency Metric S
Low S => more salient & concentrated region
• Mean Square Error compared with Ground Truth
Experiments
42
Experiments
Evaluation Metrics
• Saliency Metric S
• Noise Removal & Saliency Enhancement
[Figure: three Cmages. Panel labels: “MillionMarchNYC”, “Marching”, Fused. Values shown: S=122.86, S=53.49, S=21.18]
43
Experiments
• Saliency Enhancement
[Figure: bar chart of the Saliency Metric S (axis 0 to 1.8) for the sensor image, social image, and fused image]
44
Experiments
• More Events
Events | Sensor Cmage | Social Cmage | Fused | Sensor Concept | Social Term
Columbus Day Parade | 124.6 | 0.43 | 0.34 | Crowd | ColumbusDay
MillionMarchNYC protest | 124.5 | 0.47 | 0.40 | People_marching | MillionMarchNYC
StPatricks Day Parade | 1.49 | 0.61 | 0.53 | Crowd | StPatriks
45
Experiments
• MSE compared with Ground Truth
(MillionMarchNYC Events)
[Figure: bar chart of MSE against ground truth (axis 0 to 120) for the Sensor Cmage, Social Cmage, and Fused Cmage]
46
Experiments
• Effectiveness of Gaussian process
[Figure: bar chart of the saliency metric S (axis 0 to 0.6), Fused with GP vs. Fused without GP]
47
Summary
• Leveraged multimodal information for better
situation understanding
• Proposed an image-based hybrid fusion method
featuring sensor decision and spatial information
• Reduced noise in sensor data for better event
detection
• Limitation
Concepts to be fused are predefined
48
Work III:
A Matrix Factorization Based Framework for
Fusion of Physical and Social Sensors
49
Motivation
Physical sensors generate sparse
or inaccurate readings
Social Information implicitly
explains readings
• Helps us discover different dimensions or
aspects of events
• Useful for inferring & predicting ongoing
situations
• More than a single reading; tells why
Same events have similar social
topics and physical sensor readings
Goal: utilize physical & social
correlation to predict events
[Figure: crowdedness probability by hour]
50
Spatio-Temporal-Semantic Representation
[Figure: “People_marching” situation matrix; rows are hourly time stamps from 12-1pm to 5-6pm, columns are locations 1 to 150]
51
Approach: Matrix Factorization Based
Fusion Framework
≈ ×
52
Formalization & Notations
Sensor Signal S:
N locations: $j = 1, \ldots, N$
M time stamps: $i = 1, \ldots, M$
Temporal window: $\delta$
Situation matrix: $S^{\delta} \subseteq \mathbb{R}^{M \times N}$
Physical readings: $S_{ij}^{\delta} \in S^{\delta}$
Example situation matrix (rows: times, columns: locations; ? marks a missing reading):
26 ? 20 56 ?
? 102 80 ? 90
89 35 ? 21 35
14 ? 16 ? 109
Social Signal 𝕊:
Location document: $LD_j^{\delta,r} = \{p_1, \ldots, p_k\}$
Social post: $p_k = (\omega_1, \ldots, \omega_R)$
Word: $\omega_q \in D$
Social information: $\mathbb{S}^{\delta} = \{LD_1^{\delta,r}, \ldots, LD_N^{\delta,r}\}$
[Figure: each location carries a document of words (Word 1, Word 2, Word 3, Word 4, …) for hourly time stamps from 12-1pm to 5-6pm across locations 1 to 150]
Fusion Function 𝔽:
$performance(\mathbb{F}(S, \mathbb{S}^{\delta})) > performance(S)$
Matrix Factorization (MF) Model:
• MF on Physical Signals: Basic Model
• MF incorporating Social Signals: Latent Topics
53
MF on Physical Signals: Basic Model
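Illustration: situation matrix (times × locations) ≈ Temporal Latent Factors × Spatial Latent Factors
26 ? 20 56 ?
? 102 80 ? 90
89 35 ? 21 35
14 ? 16 ? 109
≈
1 12 2
3.6 2 2
1 -3 1
2 2 -3
×
-0.9 -0.4 5 3 1
10 1 7 -0.3 9
3 2 -9 6 12
$j = 1, \ldots, N$: N locations
$i = 1, \ldots, M$: M time stamps
$S \subseteq \mathbb{R}^{M \times N}$: situation matrix
$S_{ij} \in S$: physical readings
$t_i$: temporal latent factors, $l_j$: spatial latent factors
$s_{ij} = t_i' l_j + \mu + \beta_i + \beta_j$
$\Omega_{i,j}(\Gamma) = \lambda_{reg} (\|t_i\|^2 + \|l_j\|^2 + \beta_i^2 + \beta_j^2)$
$S_{phy} = \sum_{(i,j) \in \kappa} [(S_{ij} - s_{ij})^2 + \Omega_{i,j}(\Gamma)]$
$\Gamma = \{\mu, \beta_i, \beta_j, t_i, l_j\}$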
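A compact sketch of the basic model, trained by stochastic gradient descent on the known cells of the slide's example matrix. Several choices are assumptions of this sketch, not settings from the thesis: the rank, learning rate, regularization weight, and number of epochs are arbitrary, and $\mu$ is fixed to the mean of the known readings instead of being learned.

```python
import random

# Situation matrix from the slide; None marks a missing reading ("?").
S = [
    [26, None, 20, 56, None],
    [None, 102, 80, None, 90],
    [89, 35, None, 21, 35],
    [14, None, 16, None, 109],
]


def train_mf(S, rank=2, lam=0.05, lr=0.002, epochs=1000, seed=0):
    rng = random.Random(seed)
    M, N = len(S), len(S[0])
    known = [(i, j) for i in range(M) for j in range(N) if S[i][j] is not None]
    mu = sum(S[i][j] for i, j in known) / len(known)  # fixed, not learned
    t = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(M)]  # temporal
    l = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(N)]  # spatial
    b_t, b_l = [0.0] * M, [0.0] * N                                   # biases

    def predict(i, j):
        # s_ij = t_i' l_j + mu + beta_i + beta_j
        return mu + b_t[i] + b_l[j] + sum(a * b for a, b in zip(t[i], l[j]))

    for _ in range(epochs):
        rng.shuffle(known)
        for i, j in known:
            err = S[i][j] - predict(i, j)
            # Gradient steps on squared error plus L2 regularization.
            b_t[i] += lr * (err - lam * b_t[i])
            b_l[j] += lr * (err - lam * b_l[j])
            for f in range(rank):
                t_if, l_jf = t[i][f], l[j][f]
                t[i][f] += lr * (err * l_jf - lam * t_if)
                l[j][f] += lr * (err * t_if - lam * l_jf)
    return predict, known, mu


predict, known, mu = train_mf(S)
train_mse = sum((S[i][j] - predict(i, j)) ** 2 for i, j in known) / len(known)
baseline_mse = sum((S[i][j] - mu) ** 2 for i, j in known) / len(known)
estimate = predict(0, 1)  # estimate for a cell with a missing reading
print(train_mse, baseline_mse, estimate)
```

Once trained, `predict(i, j)` also returns a value for cells that had no reading, which is what makes the factorization useful for filling gaps.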
54
MF Incorporating Social: Latent Topics
Objective Function
$\min f(S \mid \Gamma) = S_{phy} = \sum_{(i,j) \in \kappa} [(S_{ij} - s_{ij})^2 + \Omega_{i,j}(\Gamma)]$
55
MF Incorporating Social: Latent Topics
Objective Function
$\min f(S \mid \Gamma) = S_{phy} = \sum_{(i,j) \in \kappa} [(S_{ij} - s_{ij})^2 + \Omega_{i,j}(\Gamma)]$
$\min f(S, \mathbb{S} \mid \Gamma, Parameter(\mathbb{S})) = S_{phy} + S_{soc}$
57
MF Incorporating Social: Latent Topics
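Objective Function
$\min f(S \mid \Gamma) = S_{phy} = \sum_{(i,j) \in \kappa} [(S_{ij} - s_{ij})^2 + \Omega_{i,j}(\Gamma)]$
$\min f(S, \mathbb{S} \mid \Gamma, Parameter(\mathbb{S})) = S_{phy} + S_{soc}$
Social Signal 𝕊:
Location document: $LD_j = \{p_1, \ldots, p_k\}$
Social post: $p_k = (\omega_1, \ldots, \omega_R)$
Word: $\omega_q \in D$
Social information: $\mathbb{S} = \{LD_1, \ldots, LD_N\}$
[Figure: location documents over hourly time stamps from 12-1pm to 5-6pm and locations 1 to 150]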
58
MF Incorporating Social: Latent Topics
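LDA Model
$S_{soc} = -p(\mathbb{S} \mid \theta, \phi, z) = -\prod_{j}^{N} \prod_{q=1}^{N_{LD_j}} \theta_{z_{LD_j,q}} \phi_{z_{LD_j,q}, \omega_{LD_j,q}}$
$\min f(S, \mathbb{S} \mid \Gamma, Parameter(\mathbb{S})) = S_{phy} + S_{soc}$
[Figure: graphical model with variables t, l, s, θ, w, φ and plates N, M, |LD|]
$\min f(S, \mathbb{S} \mid \Gamma, \Theta, k, z) = S_{phy} - \lambda_{social} \, p(\mathbb{S} \mid \theta, \phi, k, z)$
$= \sum_{(i,j) \in \kappa} [(S_{ij} - s_{ij})^2 + \lambda_{reg} (\|t_i\|^2 + \|l_j\|^2 + \beta_i^2 + \beta_j^2)] - \lambda_{social} \prod_{j}^{N} \prod_{q=1}^{N_{LD_j}} \theta_{z_{LD_j,q}} \phi_{z_{LD_j,q}, \omega_{LD_j,q}}$
Spatial latent -> social topic:
$\theta_{j,f} = \frac{e^{k l_{j,f}}}{\sum_{f'} e^{k l_{j,f'}}}$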
Parameters Learning: Gradient Descent
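The link between the spatial latent factors and the topic distribution is a softmax scaled by $k$. The sketch below shows only that transformation; the function name and the example factor values are hypothetical, and subtracting the maximum is a standard numerical safeguard rather than something stated on the slide.

```python
import math


def topic_distribution(l_j, k=1.0):
    """Map the spatial latent factors of location j to topic proportions theta_j."""
    scaled = [k * value for value in l_j]
    top = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical spatial latent factors for one location, with three topics.
l_j = [1.0, 2.0, 3.0]
theta = topic_distribution(l_j)
theta_sharp = topic_distribution(l_j, k=5.0)
print(theta, theta_sharp)
```

Because each latent dimension maps to one topic, the number of latent factors equals the number of topics, and a larger `k` concentrates the distribution on the dominant factor.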
59
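Situation Prediction for Missing Readings
“Cold-start” problem
$S_{phy} = \sum_{(i,j) \in \kappa} [(S_{ij} - s_{ij} - \beta_i)^2 + \lambda_{reg} (\|t_i\|^2 + \|l_j\|^2 + \beta_i^2)]$
$\min(f) = \sum_{(i,j) \in \kappa} [(S_{ij} - s_{ij} - \beta_i)^2 + \lambda_{reg} (\|t_i\|^2 + \|l_j\|^2 + \beta_i^2)] - p(\mathbb{S} \mid \theta, \phi, k, z)$
Example situation matrix (rows: times, columns: locations) in which the third location has no readings at all:
26 ? ? 56 ?
? 102 ? ? 90
89 35 ? 21 35
14 ? ? ? 109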
60
Matrix Factorization Based Fusion
Goal: minimize the error of predicted values & maximize likelihood of social observations
Physical Readings (times × locations) = Temporal Latent Factors × Social Embedded Latent Factors
26 ? 20 56 ?
? 102 80 ? 90
89 35 ? 21 35
14 ? 16 ? 109
=
1 12 2
3.6 2 2
1 -3 1
2 2 -3
×
-0.9 -0.4 5 3 1
10 1 7 -0.3 9
3 2 -9 6 12
[Figure: each location carries social content from sources such as Tweets, FB, Flickr, Youtube, Instagram, WeChat, and News Media; labels tp1, tp2, tp3]
$\min_{latent,\ social\ params} \sum_{t,l} (Prediction_{latent\ params}(t, l) - PhysicalReadings_{t,l})^2 - \lambda \cdot prob(Social \mid social\ params)$
First term: Physical Readings Error. Second term: Social Observations Likelihood.
61
Experiments:
Situation Awareness – Singapore Haze
Noise Filtering – NYC Large Scale Events
62
DataSet
SG Haze Data
• Historical PSI Readings
5 stations
3 weeks: 1st-7th, 12-19th August,
22nd-29th September
• Geo-tagged Tweets
Attributes: text, time, geo-coordinates, etc.
[Figure: markers numbered 1 to 7]
NYC Traffic Data
• 149 cameras all over Manhattan
Sampling rate: ~10s/f
Period: 2014 ~ Now
Size: > 2 TB
• Geo-tagged Tweets
Attributes: text, time, geo-coordinates, etc.
Period: 2014 ~ Now
Size: ~60,000/day
Dataset | #tweets | #words | #words per location
SGHaze | 19073 | 178825 | 8515
NYCTraffic | 10005 | 90381 | 669
63
Situation Awareness – Singapore Haze
Physical & Social Sensors Correlation
[Figure: the PSI Situation Matrix (rows 1 to 56, columns 1 to 9) is factorized without tweets and with tweets; markers numbered 1 to 7]
Spearman’s rank correlation $\rho_{ij}$
Location pairs: 1-2, 1-3, 2-1, …, 8-9
Ground truth: original PSI readings $\rho_{ij}^{psi}$
With tweets $\rho_{ij}^{l+tw}$ vs without tweets $\rho_{ij}^{l}$
Evaluate: $dist(\rho_{ij}^{l+tw}, \rho_{ij}^{psi})$, $dist(\rho_{ij}^{l}, \rho_{ij}^{psi})$
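The evaluation compares rank correlations between pairs of locations. A self-contained sketch follows, with two assumptions: `dist` is taken to be the absolute difference (the slide does not define it), and all series are invented numbers standing in for PSI readings and for location vectors learned with and without tweets.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + end) / 2 + 1
        for k in range(start, end + 1):
            result[order[k]] = average
        start = end + 1
    return result


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    # Spearman's rho is the Pearson correlation of the ranks.
    return pearson(ranks(x), ranks(y))


# Hypothetical series for one location pair (i, j).
psi_i, psi_j = [50, 60, 80, 120, 90], [55, 58, 85, 110, 95]      # ground truth
plain_i, plain_j = [52, 70, 65, 100, 95], [60, 55, 80, 105, 90]  # without tweets
tw_i, tw_j = [51, 62, 79, 115, 92], [56, 57, 83, 108, 97]        # with tweets

rho_psi = spearman(psi_i, psi_j)
rho_plain = spearman(plain_i, plain_j)
rho_tw = spearman(tw_i, tw_j)
dist_plain = abs(rho_plain - rho_psi)
dist_tw = abs(rho_tw - rho_psi)
print(rho_psi, rho_plain, rho_tw, dist_plain, dist_tw)
```

A smaller distance to the ground-truth correlation means the learned representation preserves the relationship between the two locations better.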
64
Situation Awareness – Singapore Haze
Physical & Social Sensors Correlation
[Figure: bar chart of Dist(ρ(l), ρ(PSI)) and Dist(ρ(l+tw), ρ(PSI)) for the location pairs 1-2 through 8-9 (axis 0 to 0.4), next to the factorization of the PSI Situation Matrix]
Measure | Both Observe Event | Neither Observe Event | Only One Observes Event
#Location Pairs | 3 | 15 | 18
#Better Correlation | 3 | 7 | 12
Percentage | 100% | 47% | 67%
65
Situation Awareness – Singapore Haze
Spatio-Temporal Situation Prediction
[Figure: PSI Situation Matrix (rows 1 to 56) over locations 1 to 7 for Week 1, Week 2, and Week 3]
66
Situation Awareness – Singapore Haze
Spatio-Temporal Situation Prediction
[Figure: PSI Situation Matrix (rows 1 to 56) over locations 1 to 7 for Week 1, Week 2, and Week 3, shown for Physical Only and for Physical + Social]
LDA Topics:
1. hazy, hari, haze, gardens, uffc
2. Internationalcosplayday(icds), icds, sghaze, psi
3. iphone, airport, changi, terminal
67
Situation Awareness – Singapore Haze
Spatio-Temporal Situation Prediction
[Figure: PSI Situation Matrix (rows 1 to 56) over locations 1 to 7 for Week 1, Week 2, and Week 3, shown for Physical Only and for Physical + Social]
Cross Validation Measured by MSE
Setting | LOC1 | LOC2 | LOC3
No Tweets | 51.38 | 49.45 | 57.73
With Tweets | 42.48 | 26.80 | 22.86
68
Event Classification
Events:
1. MillionMarchNYC Protest (M)
2. St Patrick’s Day Parade (S)
3. Columbus Day Parade (C)
Evaluation:
Root Mean Squared Error (RMSE)
69
Event Classification Performance
Performance: Precision, Recall, 𝐹1-score
[Figure: three plots, Precision, Recall, and 𝐹1 Score, each with an x-axis from 0 to 1]
70
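For reference, per-class precision, recall, and F1 can be computed as below. The label sequences are invented and use the event codes M, S, and C from the event list; they are not results from the thesis.

```python
def prf(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical ground-truth and predicted event labels.
y_true = ["M", "M", "S", "C", "S", "M"]
y_pred = ["M", "S", "S", "C", "S", "M"]
scores = {label: prf(y_true, y_pred, label) for label in "MSC"}
print(scores)
```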
Summary
• Social signals explain the why behind physical sensor readings
• Matrix factorization is used to fuse physical and
social information with spatial and temporal
aspects
• MF solves the “cold-start” problem of predicting
missing readings
• Correlations exist between the physical and social
signals that reflect events
• Fusing the two sources performs better at real-world
situation understanding than using only one
71
Thesis Contributions
• The proposed multilayer tweeting camera framework
bridges the gap between physical and social
sensors, makes data analysis efficient, and gives access to the
semantic details of occurring events
• The Cmage based fusion method removes noise
effectively, locates events accurately, and results in a
better visualization of situations
• MF based fusion yields higher performance in event
classification and situation prediction with the help
of the correlation between physical and social sensors
72
Future work
• Create an interactive framework extending the
multilayer tweeting camera framework
• Investigate semantic relatedness among concepts
(utilizing ontologies from various lexical databases,
e.g. WordNet) and build up an event ontology
• Predict how events will evolve over time
• Apply the fusion methods in different scenarios
(semantic, sentiment, stock analysis)
73
Publication List
Journal paper
Yuhui Wang, Francesco Gelli, Christian von der Weth and Mohan Kankanhalli, “A Matrix Factorization Based Framework for Fusion of Physical and Social Sensors”, IEEE Transactions on Multimedia, 2016. (In revision, in peer review)
Conference papers
1) Yuhui Wang and Mohan Kankanhalli, “Tweeting Cameras for Event Detection”, 24th International Conference on World Wide Web (WWW’15), pp. 1231-1241, Florence, Italy, May 2015.
2) Yuhui Wang, “Socializing Multimodal Sensors for Information Fusion”, 23rd ACM International Conference on Multimedia (MM’15), Doctoral Symposium, pp. 653-656, Brisbane, Australia, October 2015. (Best Paper Award)
3) Yuhui Wang, Christian von der Weth, Thomas Winkler and Mohan Kankanhalli, “Demo: Tweeting Camera - A New Paradigm of Event-based Smart Sensing Device”, 10th International Conference on Distributed Smart Cameras (ICDSC’16), pp. 210-211, Paris, France, September 2016.
4) Yuhui Wang, Yehong Zhang, Christian von der Weth, Kian Hsiang Low, Vivek Singh and Mohan Kankanhalli, “Concept Based Fusion of Multimodal Event Signals”, IEEE International Symposium on Multimedia (ISM’16), San Jose, USA, December 2016.
74
Acknowledgement
Prof Mohan Kankanhalli
Prof Roger Zimmermann
Prof Qi Zhao
Prof Terence Sim
Prof Ramesh Jain
Prof Vivek Singh
Dr. Christian von der Weth
Dr. Prabhu Natarajan
Dr. Tian Gan
Dr. Yongkang Wong
Dr. Thomas Winkler
Lab mates in SeSaMe
75
Thank You
76

Multimedia Information Retrieval: Bytes and pixels meet the challenges of hum...
 
Bangkok | Mar-17 | The OpenAQ Story: Combining Open Data + Community for Impact
Bangkok | Mar-17 | The OpenAQ Story: Combining Open Data + Community for ImpactBangkok | Mar-17 | The OpenAQ Story: Combining Open Data + Community for Impact
Bangkok | Mar-17 | The OpenAQ Story: Combining Open Data + Community for Impact
 
Tracking research and research systems
Tracking research and research systemsTracking research and research systems
Tracking research and research systems
 
Data Science: History repeated? – The heritage of the Free and Open Source GI...
Data Science: History repeated? – The heritage of the Free and Open Source GI...Data Science: History repeated? – The heritage of the Free and Open Source GI...
Data Science: History repeated? – The heritage of the Free and Open Source GI...
 
GeoForAll: a successful OSGeo Initiative
GeoForAll: a successful OSGeo InitiativeGeoForAll: a successful OSGeo Initiative
GeoForAll: a successful OSGeo Initiative
 
OU Rise library analytics viz
OU Rise library analytics vizOU Rise library analytics viz
OU Rise library analytics viz
 

Thesis-Defense-YuhuiWang-small

  • 1. Fusing Physical and Social Sensors for Situation Awareness. Ph.D. Thesis Defense, 16 November 2016. Yuhui Wang. Advisor: Prof. Mohan Kankanhalli. NUS Graduate School for Integrative Sciences & Engineering, National University of Singapore.
  • 2. Big Sensor Data: Physical Sensors
  • 3. Big Sensor Data: Social Sensors
  • 4. Big Sensor Data. Physical sensors (camera): 65% of global big data to come from surveillance video by 2015 (T. Huang, “Surveillance Video: The Biggest Big Data,” Computing Now, vol. 7, no. 2, Feb. 2014, IEEE Computer Society). Social sensors (Twitter): 82% of users access Twitter via mobile devices (https://www.statista.com/statistics/282087/number-of-monthly-active-twitter-users/).
  • 5. Live WebCams
  • 6. Live WebCams
  • 7. Motivation. Physical sensors and social sensors observe the same situation from different perspectives.
  • 8. Literature Review. Cited work, by group: TwitterStand [2009], Sakaki et al. [2010], Weng & Lee [2011], Walther & Kaisser [2013], Twevent [2012], Yang et al. [2016], …; Kulkarni et al. [2005], Atrey et al. [2007], Jacobs et al. [2009], Babari et al. [2012], Jing et al. [2016], Lin et al. [2016], …; Mediamill101 [2006], VIREO-374 [2010], Snoek et al. [2006], Karpathy et al. [2014], Vinyals et al. [2014], Markatopoulou et al. [2016]; Eventshop [2012], Vivek et al. [2010], Pan et al. [2013], Wu et al. [2015]. Diagram regions: Semantic Understanding & Image/Video Concept Detection; Situation Understanding using Social Sensors; Event Detection using Physical Sensors; This work.
  • 9. Problem & Challenges. Sensors work independently: incomplete information. Different modalities: numeric (physical) vs. symbolic (social). Different spatio-temporal density: spatial, physical < social; temporal, physical > social. Uncertain data (noise): no restrictions on content, failure of sensors, maintenance of devices.
  • 10. Work I: Tweeting Cameras for Event Detection
  • 11. Motivation. Big visual data: traditional camera system vs. tweeting camera system.
  • 12. Multi-Layer Tweeting Cameras Framework. PST = <Probability, Space, Time, Label>. Physical sensors: Sensor Data Collector, Feature Extraction, Low-level Concept Detection (Crowd, Action, Parade, Face, Car, Traffic, …), Database (building blocks). Mid-level Concept Filtering: Filtering & Analytic Operators, Event Signal Detection. Social sensors: Social Information Crawler, Social Media, Social Information Extraction. High-level Social Sensor Fusion: Cross Media Analysis, Cmage, Event Detection. Other diagram labels: Isensor, User. Example output: “Parade is going”.
  • 13. Multi-Layer Tweeting Cameras Framework (low-level layer). PST = <Probability, Space, Time, Label>. Sensor Data Collector, Feature Extraction, Low-level Concept Detection (Crowd, Action, Parade, Face, Car, Traffic, …), Database (building blocks).
  • 14. Multi-Layer Tweeting Cameras Framework (low-level and mid-level layers). PST = <Probability, Space, Time, Label>. Adds Mid-level Concept Filtering: Filtering & Analytic Operators, Event Signal Detection.
  • 15. Multi-Layer Tweeting Cameras Framework (all three layers). PST = <Probability, Space, Time, Label>. Adds High-level Social Sensor Fusion: Social Information Crawler, Social Media, Social Information Extraction, Cross Media Analysis.
  • 16. Low-level Concept Detection. Concept detectors: Columbia 374, VIREO-374, Mediamill (101), VIREO-WEB81, CU-VIREO 374, …
  • 17. Low-level Concept Detection. The concept detectors (Columbia 374, VIREO-374, Mediamill (101), VIREO-WEB81, CU-VIREO 374, …) output a confidence per concept label, each stamped with the same location and time:
      Concept label | Confidence | Location | Time
      Crowd | 0.9 | 6 Ave @ 23 St | 15:10, 13th Dec 2014
      Parade | 0.8 | 6 Ave @ 23 St | 15:10, 13th Dec 2014
      Car | 0.1 | 6 Ave @ 23 St | 15:10, 13th Dec 2014
      Outdoor | 0.5 | 6 Ave @ 23 St | 15:10, 13th Dec 2014
      … | … | … | …
  • 18. Low-level Concept Detection. Low-level camera tweet: “I’ve seen crowd here now, 90% sure”.
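The PST tuple on slides 12 to 18 can be pictured as a small record that renders itself as a "camera tweet". The sketch below is illustrative only: the class name `CameraTweet`, its field names, and the exact message wording are assumptions, not the thesis implementation.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CameraTweet:
    """One PST tuple: <Probability, Space, Time, Label> (hypothetical field names)."""
    probability: float   # detector confidence in [0, 1]
    location: str        # camera location, e.g. a street intersection
    time: datetime       # capture time of the frame
    label: str           # detected concept, e.g. "crowd"

    def to_text(self) -> str:
        # Render the tuple as a short human-readable message.
        percent = round(self.probability * 100)
        when = self.time.strftime("%Y-%m-%d %H:%M")
        return f"I've seen {self.label} at {self.location} ({when}), {percent}% sure"


tweet = CameraTweet(0.9, "6 Avenue @ 23 Street", datetime(2014, 12, 13, 15, 10), "crowd")
print(tweet.to_text())
```

Keeping the record immutable (`frozen=True`) is a design choice of this sketch: a camera tweet is an observation, so downstream operators can filter and aggregate tuples without mutating them.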
  • 19. Low-level Concept Detection. Each camera sends its tweets (“I’ve seen crowd here now, 90% sure”) to server storage.
  • 20. Mid-level Concept Filtering: Filtering & Analytic Operators. Query operators, e.g. show the March 17th data for the concept “parade” at 5th Avenue with a confidence higher than 0.8: $Query: \theta_{P\_PROB \wedge P\_LABEL \wedge P\_LOC \wedge P\_TIME}(S)$, where $P\_PROB = P\_prob(0.8 \le p)$, $P\_LABEL = P\_label(label = parade)$, $P\_LOC = P\_loc(CAM_i = \text{5th Avenue})$, $P\_TIME = P\_time(t = \text{March 17th})$. Statistical functions: mean, max, min, sum. Processing operators: extremes, smooth, trend, outlier (anomaly).
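Slide 20's query is a conjunction of four predicates over a stream of PST tuples, followed by a statistical function. A minimal sketch, assuming tuples are stored as a `NamedTuple` called `PST` and assuming the year 2015 for the March 17th example (the slide gives no year); the sample rows are made up:

```python
from datetime import date, datetime
from statistics import mean
from typing import NamedTuple


class PST(NamedTuple):
    prob: float
    loc: str
    time: datetime
    label: str


def select(stream, min_prob, label, loc, day):
    """Conjunctive selection: keep tuples that satisfy all four predicates."""
    return [
        s for s in stream
        if s.prob >= min_prob          # P_PROB
        and s.label == label           # P_LABEL
        and s.loc == loc               # P_LOC
        and s.time.date() == day       # P_TIME
    ]


# Made-up camera tweets; only the first and third match every predicate.
stream = [
    PST(0.85, "5th Avenue", datetime(2015, 3, 17, 11, 0), "parade"),
    PST(0.60, "5th Avenue", datetime(2015, 3, 17, 12, 0), "parade"),
    PST(0.95, "5th Avenue", datetime(2015, 3, 17, 13, 0), "parade"),
    PST(0.90, "5th Avenue", datetime(2015, 3, 17, 13, 0), "car"),
    PST(0.90, "Broadway", datetime(2015, 3, 17, 13, 0), "parade"),
    PST(0.90, "5th Avenue", datetime(2015, 3, 16, 13, 0), "parade"),
]
hits = select(stream, 0.8, "parade", "5th Avenue", date(2015, 3, 17))
probs = [h.prob for h in hits]
print(len(hits), max(probs), mean(probs))
```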
  • 21. Operators. Example: St Patrick’s Day Parade event (“parade” concept); plot of probability against hour. $PROJECTION(SELECTION(EXTREME(SMOOTH(MAP(\{(t_1, loc_1, raw\_image_1), 0.2;\ (t_2, loc_2, raw\_image_2), 0.3;\ \ldots;\ (t_n, loc_n, raw\_image_n), 0.7\}))))) = (t_x, loc_1, parade, 0.7)$
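The operator chain on slide 21 (smooth the probability series, then take its extreme) can be sketched as below. The moving-average smoother, the window size, and the hourly values are assumptions chosen for illustration; the slides do not specify which smoothing operator is used.

```python
def smooth(series, window=3):
    """Centered moving average; the window shrinks at the series boundaries."""
    half = window // 2
    out = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        out.append(sum(series[lo:hi]) / (hi - lo))
    return out


def extreme(series):
    """Return (index, value) of the maximum of the series."""
    idx = max(range(len(series)), key=series.__getitem__)
    return idx, series[idx]


# Hourly "parade" probabilities for one camera (made-up values).
hours = [9, 10, 11, 12, 13, 14, 15]
probs = [0.1, 0.9, 0.1, 0.6, 0.7, 0.8, 0.2]

peak_idx, peak_val = extreme(smooth(probs))
result = (hours[peak_idx], "loc1", "parade", round(peak_val, 2))
print(result)
```

In this toy series the isolated 0.9 reading at hour 10 is the raw maximum, but after smoothing the sustained plateau around hour 13 wins, which is the reason for smoothing before extracting extremes.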
  • 22. What about Social Tweets? Preprocessing: tokenizer and normalizer, backed by a dictionary of English words and slang words. Raw tweet: 2015-01-26 12:05:32 I'm ready for you snowpocalypse! #madisonsqpark !!!! @madisonsqpark zzzzzzz #snowpocalypse @ Madison Square ParkQ&amp http://t.co/KoRJ4FYOkZ 40.7421 -73.988283. After preprocessing: 2015-01-26 12:05:32 I'm ready for you snowpocalypse #madisonsqpark #snowpocalypse Madison Square Park 40.7421 -73.988283.
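The cleaning step on slide 22 can be approximated with a few regular expressions. This is a rough sketch, not the thesis pipeline: it uses no dictionary, the rules (drop URLs, mentions, HTML entities, punctuation-only tokens, and tokens with a character repeated three or more times) are assumptions, and the input text is a slightly simplified version of the slide's example.

```python
import re

URL = re.compile(r"https?://\S+")
MENTION = re.compile(r"@\w*")
ENTITY = re.compile(r"&\w+;?")
ELONGATED = re.compile(r"(.)\1{2,}")   # same character three or more times in a row


def normalize(text):
    """Drop URLs, mentions, HTML entities, and noisy tokens from a tweet."""
    text = URL.sub(" ", text)
    text = ENTITY.sub(" ", text)
    text = MENTION.sub(" ", text)
    tokens = []
    for tok in text.split():
        tok = tok.strip("!?.,;:")          # trim leading/trailing punctuation
        if not tok or ELONGATED.search(tok):
            continue                        # skip empty and elongated tokens
        tokens.append(tok)
    return " ".join(tokens)


raw = ("I'm ready for you snowpocalypse! #madisonsqpark !!!! @madisonsqpark "
       "zzzzzzz #snowpocalypse @ Madison Square Park &amp; http://t.co/KoRJ4FYOkZ")
print(normalize(raw))
```

Hashtags are kept on purpose, since the later term-mining step relies on event hashtags such as #snowpocalypse.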
  • 23. What about Social Tweets? Representative Term Mining. Tweets from the event window $Day_{event}(t_s - t_e)$ are set against the same window on previous days: $Day_{prv1}(t_s - t_e)$, $Day_{prv2}(t_s - t_e)$, …
  • 24. What about Social Tweets? Representative Term Mining, per location (Loc 1, Loc 2, Loc 3, …, Loc N). $T_C$: tweets posted during events; $T_H$: tweets posted before events; $tf$: term frequency; $idf$: inverse document frequency. $w_{term} = tf(term, T_C) \times idf(term, T_H)$, s.t. $tf(term, T_C) = \frac{f(term, T_C)}{\max\{f(\omega, T_C) : \omega \in T_C\}}$ and $idf(term, T_H) = \log \frac{|T_H|}{|\{t_h \in T_H : term \in t_h\}|}$.
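The term weighting on the Representative Term Mining slides (term frequency over tweets during the event, inverse document frequency over earlier tweets) can be sketched as follows. The add-one smoothing inside the logarithm is an assumption added here so that terms absent from the historic tweets do not cause a division by zero; the slides show the unsmoothed ratio. The token lists are made up.

```python
import math
from collections import Counter


def term_weights(event_tweets, history_tweets):
    """w = tf(term, T_C) * idf(term, T_H), with tweets given as token lists."""
    counts = Counter(tok for tweet in event_tweets for tok in tweet)
    max_count = max(counts.values())
    weights = {}
    for term, count in counts.items():
        tf = count / max_count                         # normalized by the most frequent term
        df = sum(1 for tweet in history_tweets if term in tweet)
        # Assumption: add-one smoothing so unseen terms do not divide by zero.
        idf = math.log((len(history_tweets) + 1) / (df + 1))
        weights[term] = tf * idf
    return weights


# Made-up tokenized tweets: T_C (during the event) and T_H (before the event).
t_c = [["parade", "5th", "avenue"], ["parade", "crowd"], ["coffee", "parade"]]
t_h = [["coffee", "5th", "avenue"], ["coffee"], ["traffic", "avenue"]]

weights = term_weights(t_c, t_h)
top_term = max(weights, key=weights.get)
print(top_term)
```

Terms that are frequent during the event but rare beforehand ("parade") rise to the top, while everyday terms ("coffee") are pushed down by the historic tweets.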
  • 26. Data Analysis & Real World Events
  • 27. DataSet. NYC traffic CCTV cameras: 149 cameras all over Manhattan; sampling rate ~10 s/frame; period 2014 ~ May 2016. Twitter data (geo-tagged): region Manhattan; Oct 4th 2014 ~ Sep 2016; attributes: text, time, geo-coordinates, etc.; size ~40,000/day. Events:
      Event | Date | Time | Location
      CBGB Music Festival | 12 Oct | 10am-7pm | Broadway 51st
      Columbus Day Parade | 13 Oct | 11am-5pm | 5th Avenue
      Hispanic Parade | 12 Oct | 12pm-5pm | 5th Avenue
      Million March NYC Protest | 13 Dec | 2pm-5pm | Washington Square Park, 5th Avenue, Foley Square
  • 28. Real-world Events
  • 29. Real-world Events: Hispanic Parade, CBGB Music Festival, Columbus Day Parade. Given the event time and location, retrieve recent tweets and retrieve historic tweets.
  • 30. “Million March NYC Protest” Event
  • 31. Twitter Images. Detected concepts: “people marching”: 0.5, “parade”: 0.4, “crowd”: 0.9.
  • 32. Demo: Tweeting Camera, A New Paradigm of Event-based Smart Sensing Device
  • 33. Motivation. From 1925: 35mm Leica A. From 1942: CCTV camera network. Now: new tweeting cameras paradigm (fire event, parade event, meeting event, jogging event), e.g. “Looks like a fire ball here?”
  • 34. Tweeting Camera (Group Meeting Event): https://www.youtube.com/watch?v=eXn89Z_MZwI
  • 35. Summary. Aggregation of physical sensors and social sensors. Multi-layer tweeting camera framework. Probabilistic spatio-temporal data (camera tweet). Analytic functions & operators. Concept-based image (Cmage). Feasibility shown via real-world events data.
  • 36. Work II: Cmage Based Hybrid Fusion of Multimodal Event Signals
  • 37. Motivation. Geo-tagged multisensor data; noise and sparsity. PST = <Loc, Time, Label, Prob>. Goal: event locating and better visualization (where happens what). The slide repeats the multi-layer framework diagram: Sensor Data Collector, Feature Extraction, Low-level Concept Detection (Crowd, Action, Parade, Face, Car, Traffic, …), Database (building blocks), Mid-level Concept Filtering (Filtering & Analytic Operators, Event Signal Detection), High-level Social Sensor Fusion (Social Information Crawler, Social Media, Social Information Extraction, Cross Media Analysis), Cmage, Event Detection.
  • 38. Event Signals to Event Cmage. Color scale from 0 to 1 for “crowdedness”.
  • 39. Cmage-based Fusion Pipeline. Stages: extract sensor signals (sensor concepts: Crowd, People marching, Car, Traffic, Building, …); extract social signals (social terms: “MillionMarchNYC”, “HappyNewYear”, “SGHaze”, …); Gaussian process prediction; Bayesian decision fusion; spatial fusion. Images: Sensor Cmage, Social Cmage, Fused Cmage, Event Cmage.
  • 40. Sensor Cmage Pixel Value Estimation. Gaussian process based prediction using noisy and sparse observations: observed pixel values are used to predict the remaining pixel values.
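Slide 40's idea, predicting unobserved Cmage pixels from sparse and noisy sensor readings with a Gaussian process, can be sketched with the standard posterior-mean formula $k_*^\top (K + \sigma^2 I)^{-1} y$. Everything specific here is an assumption: the squared-exponential kernel, the zero prior mean, the noise level, and the four hypothetical camera positions. The thesis may use a different kernel and learned hyperparameters.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two 2-D locations."""
    d2 = (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    return math.exp(-d2 / (2.0 * length_scale ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def gp_predict(locs, values, query, noise=0.01, length_scale=1.0):
    """Zero-mean GP posterior mean at `query` from noisy observations."""
    n = len(locs)
    K = [[rbf(locs[i], locs[j], length_scale) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, values)                      # (K + noise * I)^-1 y
    k_star = [rbf(query, p, length_scale) for p in locs]
    return sum(k * a for k, a in zip(k_star, alpha))


locs = [(0, 0), (0, 2), (2, 0), (2, 2)]        # hypothetical camera positions
values = [0.8, 0.6, 0.7, 0.9]                  # noisy "crowd" probabilities
center = gp_predict(locs, values, (1, 1))      # estimate for an unobserved pixel
print(round(center, 3))
```

With a zero prior mean the estimate decays toward 0 far from every camera; subtracting the mean reading before fitting and adding it back afterwards is a common variation when that behavior is unwanted.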
  • 41. Hybrid Event Image Fusion: event decision fusion and spatial fusion. Diagram: a reference pixel and its reference window in the sensor image (including GP-predicted pixels) and the social image are combined by decision fusion and spatial fusion into a fused pixel of the fused image.
  • 42. Experiments. Evaluation metrics: saliency metric S (a low S means a more salient and concentrated region); mean square error compared with ground truth.
  • 43. Experiments. Evaluation metric: saliency metric S. Noise removal & saliency enhancement. Panels: “MillionMarchNYC”, “Marching”, Fused. Reported values: S=122.86, S=53.49, S=21.18.
  • 44. Experiments. Saliency enhancement: bar chart of the saliency metric S for the sensor image, social image, and fused image.
  • 45. Experiments. More events (saliency metric S):
      Event | Sensor Cmage | Social Cmage | Fused | Sensor concept | Social term
      Columbus Day Parade | 124.6 | 0.43 | 0.34 | Crowd | ColumbusDay
      MillionMarchNYC protest | 124.5 | 0.47 | 0.40 | People_marching | MillionMarchNYC
      StPatricks Day Parade | 1.49 | 0.61 | 0.53 | Crowd | StPatriks
  • 46. Experiments. MSE compared with ground truth (MillionMarchNYC events): bar chart of MSE(Sensor Cmage), MSE(Social Cmage), and MSE(Fused Cmage).
  • 47. Experiments. Effectiveness of the Gaussian process: saliency metric S for fused with GP vs. fused without GP.
  • 48. Summary. Leveraged multimodal information for better situation understanding. Proposed an image-based hybrid fusion method featuring sensor decision and spatial information. Reduced noise in sensor data for better event detection. Limitation: the concepts to be fused are predefined.
  • 49. Work III: A Matrix Factorization Based Framework for Fusion of Physical and Social Sensors
  • 50. Motivation. Physical sensors generate sparse or inaccurate readings. Social information implicitly explains readings: it helps us discover different dimensions or aspects of events, is useful for inferring and predicting ongoing situations, and is more than a single reading because it tells why. The same events have similar social topics and physical sensor readings. Goal: utilize the physical and social correlation to predict events. Plot: crowdedness probability against hour.
  • 51. Spatio-Temporal-Semantic Representation. “People_marching” situation: time stamps (12-1pm, 1-2pm, 2-3pm, 3-4pm, 4-5pm, 5-6pm) against locations (1 to 150).
  • 52. Approach: Matrix Factorization Based Fusion Framework. The situation matrix is approximated by the product of two factor matrices.
  • 53. Formalization & Notations. Sensor signal $S$: $N$ locations, $j = 1, \ldots, N$; $M$ time stamps, $i = 1, \ldots, M$; temporal window $\delta$; situation matrix $S^{\delta} \in \mathbb{R}^{M \times N}$ (time stamps × locations, with missing entries shown as “?”); physical readings $S^{\delta}_{ij}$. Social signal $\mathbb{S}$: location document $LD_j^{\delta,r} = \{p_1, \ldots, p_k\}$; social post $p_k = (\omega_1, \ldots, \omega_R)$; word $\omega_q \in D$; social information $\mathbb{S}^{\delta} = \{LD_1^{\delta,r}, \ldots, LD_N^{\delta,r}\}$. Fusion function $\mathbb{F}$: $performance(\mathbb{F}(S, \mathbb{S}^{\delta})) > performance(S)$. Matrix factorization (MF) model: MF on physical signals (basic model); MF incorporating social signals (latent topics).
  • 54. MF on Physical Signals: Basic Model. The partially observed times × locations matrix is approximated by temporal latent factors $t_i$ times spatial latent factors $l_j$. $N$ locations, $j = 1, \ldots, N$; $M$ time stamps, $i = 1, \ldots, M$; situation matrix $S \in \mathbb{R}^{M \times N}$; physical readings $S_{ij}$. Prediction: $s_{ij} = t_i' l_j + \mu + \beta_i + \beta_j$. Regularizer: $\Omega_{i,j}(\Gamma) = \lambda_{reg}(\|t_i\|^2 + \|l_j\|^2 + \beta_i^2 + \beta_j^2)$. Loss: $S_{phy} = \sum_{(i,j) \in \kappa} \left[ (S_{ij} - s_{ij})^2 + \Omega_{i,j}(\Gamma) \right]$, with parameters $\Gamma = \{\mu, \beta_i, \beta_j, t_i, l_j\}$.
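The basic model on slide 54 (latent factors plus a global offset and per-time, per-location biases, fitted on observed entries only) can be sketched with plain stochastic gradient descent. The rank, learning rate, regularization weight, and the toy 4 × 3 matrix are all assumptions, and $\mu$ is fixed to the mean of the observed readings here rather than learned.

```python
import random


def factorize(S, rank=2, lam=0.02, lr=0.01, epochs=2000, seed=0):
    """Fit s_ij = t_i . l_j + mu + b_i + b_j on observed entries (None = missing)."""
    rng = random.Random(seed)
    M, N = len(S), len(S[0])
    observed = [(i, j) for i in range(M) for j in range(N) if S[i][j] is not None]
    mu = sum(S[i][j] for i, j in observed) / len(observed)   # assumption: mu is not learned
    T = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(M)]
    L = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(N)]
    bt, bl = [0.0] * M, [0.0] * N

    def predict(i, j):
        return sum(a * b for a, b in zip(T[i], L[j])) + mu + bt[i] + bl[j]

    for _ in range(epochs):
        rng.shuffle(observed)
        for i, j in observed:
            err = S[i][j] - predict(i, j)
            # Gradient steps on squared error plus L2 regularization.
            bt[i] += lr * (err - lam * bt[i])
            bl[j] += lr * (err - lam * bl[j])
            for f in range(rank):
                t_if, l_jf = T[i][f], L[j][f]
                T[i][f] += lr * (err * l_jf - lam * t_if)
                L[j][f] += lr * (err * t_if - lam * l_jf)
    return predict


# Toy situation matrix (time stamps x locations); None marks a missing reading.
S = [
    [0.1, 0.3, 0.5],
    [0.3, 0.5, None],   # held-out value: 0.7
    [0.5, 0.7, 0.9],
    [None, 0.4, 0.6],   # held-out value: 0.2
]
predict = factorize(S)
print(round(predict(1, 2), 2), round(predict(3, 0), 2))
```

The toy matrix is additive in its row and column effects, so the biases alone can recover the two held-out cells; real sensor data would lean more on the latent factors.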
  • 55. MF Incorporating Social: Latent Topics. Objective function: $\min f(S \mid \Gamma) = S_{phy} = \sum_{(i,j) \in \kappa} \left[ (S_{ij} - s_{ij})^2 + \Omega_{i,j}(\Gamma) \right]$
  • 57. MF Incorporating Social: Latent Topics. Objective function: $\min f(S \mid \Gamma) = S_{phy} = \sum_{(i,j) \in \kappa} \left[ (S_{ij} - s_{ij})^2 + \Omega_{i,j}(\Gamma) \right]$, extended to $\min f(S, \mathbb{S} \mid \Gamma, Parameter(\mathbb{S})) = S_{phy} + S_{soc}$
  • 58. MF Incorporating Social: Latent Topics. Objective function: $\min f(S \mid \Gamma) = S_{phy} = \sum_{(i,j) \in \kappa} \left[ (S_{ij} - s_{ij})^2 + \Omega_{i,j}(\Gamma) \right]$, extended to $\min f(S, \mathbb{S} \mid \Gamma, Parameter(\mathbb{S})) = S_{phy} + S_{soc}$. Social signal $\mathbb{S}$: location document $LD_j = \{p_1, \ldots, p_k\}$; social post $p_k = (\omega_1, \ldots, \omega_R)$; word $\omega_q \in D$; social information $\mathbb{S} = \{LD_1, \ldots, LD_N\}$. Each location (1 to 150) of the time stamps × locations matrix has its own location document.
  • 59. MF Incorporating Social: Latent Topics. LDA model (plate diagram with nodes $t$, $l$, $s$, $\theta$, $w$, $\phi$ and plates $N$, $M$, $|LD|$): $S_{soc} = -p(\mathbb{S} \mid \theta, \phi, z) = -\prod_{j}^{N} \prod_{q=1}^{N_{LD_j}} \theta_{z_{LD_j,q}} \phi_{z_{LD_j,q},\, \omega_{LD_j,q}}$. Combined objective: $\min f(S, \mathbb{S} \mid \Gamma, \Theta, k, z) = S_{phy} - \lambda_{social}\, p(\mathbb{S} \mid \theta, \phi, k, z) = \sum_{(i,j) \in \kappa} \left[ (S_{ij} - s_{ij})^2 + \lambda_{reg}(\|t_i\|^2 + \|l_j\|^2 + \beta_i^2 + \beta_j^2) \right] - \lambda_{social} \prod_{j}^{N} \prod_{q=1}^{N_{LD_j}} \theta_{z_{LD_j,q}} \phi_{z_{LD_j,q},\, \omega_{LD_j,q}}$. Spatial latent factors are linked to social topics by $\theta_{j,f} = \frac{e^{k l_{j,f}}}{\sum_{f'} e^{k l_{j,f'}}}$. Parameter learning: gradient descent.
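The link on slide 59 between a location's spatial latent factors and its topic distribution is a softmax with sharpness parameter $k$. A minimal sketch of that one transform follows; the function name and the example factor values are illustrative assumptions.

```python
import math


def topic_distribution(l_j, k=1.0):
    """theta_{j,f} = exp(k * l_{j,f}) / sum_f' exp(k * l_{j,f'})."""
    peak = max(k * v for v in l_j)                 # subtract the max for numerical stability
    exps = [math.exp(k * v - peak) for v in l_j]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical spatial latent factors for one location, three topics.
theta = topic_distribution([2.0, 0.5, -1.0], k=1.0)
sharper = topic_distribution([2.0, 0.5, -1.0], k=5.0)
print([round(p, 3) for p in theta])
```

Because the transform is monotone in each factor, the dimension with the largest latent value becomes the dominant topic, and a larger $k$ concentrates more probability mass on it.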
  • 60. Situation Prediction for Missing Readings: the “cold-start” problem, where a location has no readings at all in the times × locations matrix. $S_{phy} = \sum_{(i,j) \in \kappa} \left[ (S_{ij} - s_{ij} - \beta_i)^2 + \lambda_{reg}(\|t_i\|^2 + \|l_j\|^2 + \beta_i^2) \right]$ and $\min(f) = \sum_{(i,j) \in \kappa} \left[ (S_{ij} - s_{ij} - \beta_i)^2 + \lambda_{reg}(\|t_i\|^2 + \|l_j\|^2 + \beta_i^2) \right] - p(\mathbb{S} \mid \theta, \phi, k, z)$
  • 61. Matrix Factorization Based Fusion. The times × locations matrix of physical readings is factorized into temporal latent factors and social-embedded latent factors, where each location carries social content (tweets, FB, Flickr, YouTube, Instagram, WeChat, news media, …). Goal: minimize the error of predicted values and maximize the likelihood of social observations: $\min_{latent,\ social\ params} \sum_{t,l} \left( Prediction(latent\ params_{t,l}) - PhysicalReadings_{t,l} \right)^2 - \lambda \cdot prob(Social \mid social\ params)$, i.e. physical readings error minus social observations likelihood.
  • 62. Experiments. Situation awareness: Singapore haze. Noise filtering: NYC large-scale events.
  • 63. DataSet. SG haze data: historical PSI readings from 5 stations over 3 weeks (1st-7th and 12th-19th August, 22nd-29th September); geo-tagged tweets (attributes: text, time, geo-coordinates, etc.). NYC traffic data: 149 cameras all over Manhattan, sampling rate ~10 s/frame, period 2014 ~ now, size > 2 TB; geo-tagged tweets (attributes: text, time, geo-coordinates, etc.), period 2014 ~ now, size ~60,000/day.
      Dataset | #tweets | #words | #words per location
      SGHaze | 19073 | 178825 | 8515
      NYCTraffic | 10005 | 90381 | 669
  • 64. Situation Awareness, Singapore Haze: Physical & Social Sensors Correlation. The PSI situation matrix is factorized without tweets and with tweets. Spearman’s rank correlation $\rho_{ij}$ is computed for location pairs 1-2, 1-3, 2-1, …, 8-9. Ground truth: original PSI readings, $\rho_{ij}^{psi}$. With tweets, $\rho_{ij}^{l+tw}$, vs. without tweets, $\rho_{ij}^{l}$. Evaluate: $dist(\rho_{ij}^{l+tw}, \rho_{ij}^{psi})$ and $dist(\rho_{ij}^{l}, \rho_{ij}^{psi})$.
  • 65. Situation Awareness, Singapore Haze: Physical & Social Sensors Correlation. Bar chart of $Dist(\rho(l), \rho(PSI))$ and $Dist(\rho(l+tw), \rho(PSI))$ for location pairs 1-2 through 8-9, using Spearman’s rank correlation $\rho_{ij}$ with the original PSI readings $\rho_{ij}^{psi}$ as ground truth.
      | Both observe event | Neither observes event | Only one observes event
      #Location pairs | 3 | 15 | 18
      #Better correlation | 3 | 7 | 12
      Percentage | 100% | 47% | 67%
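The evaluation on slides 64 and 65 rests on Spearman's rank correlation between the series of two locations. A self-contained sketch is given below, computed as the Pearson correlation of average ranks. The PSI values are made up, and taking `dist` to be an absolute difference is an assumption, since the slides do not define it.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed series."""
    return pearson(ranks(x), ranks(y))


# Made-up PSI readings at two locations over five time stamps.
loc_a = [60, 75, 90, 120, 110]
loc_b = [55, 70, 100, 130, 95]
rho_psi = spearman(loc_a, loc_b)

# Assumption: dist is the absolute difference between two correlation values.
rho_model = 0.8                      # hypothetical correlation from learned factors
dist = abs(rho_model - rho_psi)
print(round(rho_psi, 3), round(dist, 3))
```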
  • 66. Situation Awareness, Singapore Haze: Spatio-Temporal Situation Prediction. PSI situation matrix over stations 1 to 7 for week 1, week 2, and week 3.
  • 67. Situation Awareness, Singapore Haze: Spatio-Temporal Situation Prediction. PSI situation matrix for weeks 1 to 3, physical only vs. physical + social. LDA topics: 1. hazy, hari, haze, gardens, uffc; 2. Internationalcosplayday(icds), icds, sghaze, psi; 3. iphone, airport, changi, terminal.
  • 68. Situation Awareness, Singapore Haze: Spatio-Temporal Situation Prediction. PSI situation matrix for weeks 1 to 3, physical only vs. physical + social. Cross validation measured by MSE:
      | LOC1 | LOC2 | LOC3
      No tweets | 51.38 | 49.45 | 57.73
      With tweets | 42.48 | 26.80 | 22.86
  • 69. Event Classification. Events: 1. MillionMarchNYC Protest (M); 2. St Patrick’s Day Parade (S); 3. Columbus Day Parade (C). Evaluation: root mean squared error (RMSE).
  • 70. Event Classification Performance. Three plots: precision, recall, and $F_1$ score.
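The three measures plotted on slide 70 can be computed from a list of scores and ground-truth labels at a given decision threshold. The sketch below assumes a binary setting and a threshold sweep, which the slides do not spell out; the scores and labels are made up.

```python
def precision_recall_f1(scores, labels, threshold):
    """Binary metrics when an event is predicted for score >= threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(p and y for p, y in zip(predicted, labels))
    fp = sum(p and not y for p, y in zip(predicted, labels))
    fn = sum((not p) and y for p, y in zip(predicted, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up event scores and ground-truth labels.
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [True, True, False, True, False, False]

for threshold in (0.5, 0.7):
    p, r, f1 = precision_recall_f1(scores, labels, threshold)
    print(threshold, round(p, 3), round(r, 3), round(f1, 3))
```

Raising the threshold from 0.5 to 0.7 removes the one false positive in this toy data, so precision rises while recall stays the same.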
  • 71. Summary. Social signals tell why physical sensor readings occur. Matrix factorization is used to fuse physical and social information with spatial and temporal aspects. MF solves the “cold-start” problem of predicting missing readings. Correlation exists between physical and social signals that reflect events. Fusing the two sources performs better at understanding real-world situations than using only one.
  • 72. Thesis Contributions. The proposed multilayer tweeting camera framework bridges the gap between physical and social sensors, makes data analysis efficient, and exposes the semantic details of occurring events. The Cmage-based fusion method removes noise effectively, locates events accurately, and gives a better visualization of situations. MF-based fusion achieves higher performance in event classification and situation prediction by exploiting the correlation between physical and social sensors.
  • 73. Future Work. Create an interactive framework extending the multilayer tweeting camera framework. Investigate semantic relatedness among concepts (utilizing ontologies from various lexical databases, e.g. WordNet) and build up an event ontology. Predict how an event will evolve over time. Apply the fusion methods in different scenarios (semantic, sentiment, stock analysis).
  • 74. Publication List.
      Journal paper: Yuhui Wang, Francesco Gelli, Christian von der Weth and Mohan Kankanhalli, “A Matrix Factorization Based Framework for Fusion of Physical and Social Sensors”, in revision, IEEE Transactions on Multimedia, 2016 (in peer review).
      Conference papers:
      1) Yuhui Wang and Mohan Kankanhalli, “Tweeting Cameras for Event Detection”, 24th International Conference on World Wide Web (WWW’15), pp. 1231-1241, Florence, Italy, May 2015.
      2) Yuhui Wang, “Socializing Multimodal Sensors for Information Fusion”, 23rd ACM International Conference on Multimedia (MM’15), Doctoral Symposium, pp. 653-656, Brisbane, Australia, October 2015 (Best Paper Award).
      3) Yuhui Wang, Christian von der Weth, Thomas Winkler and Mohan Kankanhalli, “Demo: Tweeting Camera - A New Paradigm of Event-based Smart Sensing Device”, 10th International Conference on Distributed Smart Cameras (ICDSC’16), pp. 210-211, Paris, France, September 2016.
      4) Yuhui Wang, Yehong Zhang, Christian von der Weth, Kian Hsiang Low, Vivek Singh and Mohan Kankanhalli, “Concept Based Fusion of Multimodal Event Signals”, IEEE International Symposium on Multimedia (ISM’16), San Jose, USA, December 2016.
  • 75. Acknowledgement. Prof Mohan Kankanhalli, Prof Roger Zimmermann, Prof Qi Zhao, Prof Terence Sim, Prof Ramesh Jain, Prof Vivek Singh, Dr. Christian von der Weth, Dr. Prabhu Natarajan, Dr. Tian Gan, Dr. Yongkang Wong, Dr. Thomas Winkler, lab mates in SeSaMe.
  • 76. Thank You

Editor's Notes

  1. Vintage cameras date from the early 20th century and CCTV cameras from the mid-20th century: a gradual evolution toward distributed digital eyes that observe passively. In a traditional CCTV camera system, someone sits in front of a wall of screens and looks through the footage to see what is happening: the camera captures the event and a human interprets it. Why not let the camera actively send information? Humans are good at tweeting on social media and are smart about what is worth tweeting. In the IoT and smart-nation era, understanding a situation should rely not only on human tweets but also on camera tweets. The new tweeting camera paradigm is socially connected and supports configurable applications, for example inferring that a meeting is going on by checking the lighting, or knowing who is present by detecting faces.
  2. We assume the sensor readings over an area to be realized from a Gaussian process, which incorporates a noise model and allows the spatial correlation of sensor readings (sensor pixels) to be formally characterized in terms of their locations.
  3. Why matrix factorization is a good fit.
  4. We can imagine that, on the one hand, data keeps coming in continuously and, on the other hand, social streams arrive as well. The aim is to fuse them together.
  5. This can be cast as matrix completion on a partially observed matrix of users’ ratings.
  6. This can be cast as matrix completion on a partially observed matrix of users’ ratings.
  7. Maximize the likelihood of observing the words given these parameters. Optimize theta so that l_j,f is optimized.
  8. The columns where the haze happens are more similar if social information is incorporated.
  9. A nonparametric measure of statistical dependence between two variables.
  10. False alarm: LDA just checks occurrence.
  11. False alarm: LDA just checks occurrence.
  12. With tweets: precision goes up to 1 faster and recall goes down to 0 more slowly.