Hemant Purohit PhD Defense: Mining Citizen Sensor Communities for Cooperation with Organizations

Mining Citizen Sensor Communities to Improve
Cooperation with Organizational Actors
June 23 2015
PhD Defense
Hemant Purohit (Advisor: Prof. Amit Sheth)

Kno.e.sis, Dept. of CSE, Wright State University, USA

@hemant_pt
Outline
—  Citizen Sensor Communities & Organizations
—  Cooperative System Design Challenges
—  Contributions
—  Problem 1. Conversation Classification using Offline Theories
—  Problem 2. Intent Classification
—  Problem 3. Engagement Modeling
—  Applications
—  Limitations & Future Work
2

@hemant_pt
Citizen Sensors: Access to Human
Observations & Interactions
Uni-directional communication
(TO people)
Unstructured, Unconstrained Language Data
•  Ambiguity
•  Sparsity
•  Diversity
•  Scalability
Bi-directional
(BY people, TO people)
Web 2.0
media
3

@hemant_pt
Goal: Data to Decision Making
Organizational Decision Making
Noisy Citizen Sensor data
4
SOCIAL SCIENCE
•  Experts on Organizations
•  Small-scale Data
COMPUTER SCIENCE
•  Experts on Mining
•  Large-scale data
Scope of My
Research

@hemant_pt
1.  No Structured Roles
2.  No Defined Tasks
ü  But “GENERATE”
Massive Data
1.  Structured Roles
2.  Defined Tasks
ü  COLLECT Data
ü  Process, & Make Decisions
ORGANIZATIONS

Sure!
How to help?
CITIZEN
SENSOR
COMMUNITIES

5
COOPERATIVE
SYSTEM
Can you
help us?

@hemant_pt
Computer-Supported Cooperative
Work (CSCW) Matrix
6
[Johansen
1988,
Baecker
1995]
TIME
PLACE

@hemant_pt
Articulation
Challenges
(Malone & Crowston 1990;
Schmidt & Bannon 1992)
ENGAGEMENT MODELING INTENT MINING
COOPERATIVE
SYSTEM
DATA
PROBLEM
DESIGN
PROBLEM
7
ORGANIZATIONS
CITIZEN
SENSOR
COMMUNITIES

Awareness
Q1. Who to
engage
first?
Org. Actor
Q2. What are
resource needs &
availabilities?
Org. Actor

@hemant_pt
Research Questions
—  Can general theories of offline conversation be
applied in the online context?
—  Can we model intentions to inform organizational
tasks using knowledge-guided features?
—  Can we find reliable groups to engage by modeling
collective group divergence using content-based
measure?
8

@hemant_pt
Thesis: Statement
Prior knowledge, and
interplay of features of users, their content, and network
efficiently model
Intent & Engagement
for cooperation of citizen sensor communities.
Scope of Concepts
•  Intent: aim of action, e.g., offering help
•  Engagement: involvement in activity, e.g., participating in discussion
9

@hemant_pt
Contributions
1.  Operationalized computing in cooperative system design
—  by accommodating articulation in Intent Mining, and
—  enriching awareness by Engagement Modeling
2.  Improved computation of online social data
—  by incorporating features from offline social theoretical knowledge
3.  Improved performance of intent classification
—  by fusing top-down & bottom-up data representations
4.  Improved explanation of group engagement
—  by modeling content divergence to complement existing structural measures
10

@hemant_pt
Data: Scope
—  Social Platform: Twitter
—  Important bridge between citizens & organizations
—  Characteristics
—  Users: follow/subscribe
—  Content: status updates (140 chars max)
—  Network: directed
—  Platform conversation functions
—  Reply
—  Retweet
—  Mention
11

@hemant_pt
Outline
—  Awareness: tackle via Engagement Modeling
—  Articulation: tackle via Intent Mining
12

@hemant_pt
User1. Analyzing #Conversations on Twitter. Using platform provided
functions #REPLY, #RT, and #Mention.
..
…
……..
User2. I kinda feel one might need more than just the platform fn -- @User1 u
can think #Psycholinguistics, dude!
Problem 1. Conversation Classification
—  Function of Reply, Retweet, Mention reflect conversation
13
R1. Can general theories of conversation be applied in the online context?

@hemant_pt
Problem 1. Conversation Classification
—  Function of Reply, Retweet, Mention reflect conversation
—  Task: Given a set S of messages mi, Classify a sample {mi}
for {RP, None}, {RT, None}, {MN, None} , where
—  Ground-truth corpuses
—  RP = { mi | has_Reply_function (mi) = True }
—  RT = { mi | has_Retweet_function (mi) = True }
—  MN = { mi | has_Mention_function (mi) = True }
—  None = S – {RP, RT, MN}
—  Sample {mi} size = 3, based on average Reply conversation size
14

@hemant_pt
Conversation Classification: Offline
Theories
—  Psycholinguistics Indicators [Clark & Gibbs, 1986, Chafe 1987, etc.]
—  Determiners (‘the’ vs. ‘a/an’)
—  Dialogue Management (e.g., ‘thanks’, ’anyway’), etc.
—  Drawback
—  Offline analysis focused on positive conversation instances
—  Hypotheses
—  Offline theoretic features are discriminative
—  Such features correlate with information density
15

@hemant_pt
Conversation Classification: Feature
Examples
16
CATEGORY Hj Hj SET
H1 - Determiners (the)
H3 - Subject pronouns (she, he, we, they)
H9 - Dialogue management indicators (thanks, yes, ok, sorry, hi, hello, bye,
anyway, how about, so, what do you
mean, please, {could, would, should,
can, will} followed by pronoun)
H11 - Hedge words (kinda, sorta)
•  Feature_Hj (mi) = term-frequency ( Hj-set, mi )
•  Normalized
•  Total 14 feature categories

@hemant_pt
Conversation Classification: Results
—  Dataset
—  Tweets from 3 Disasters, and 3 Non-Disaster events
—  Varying set size (3.8K – 609K), time periods
—  Classifier:
—  Decision Tree
—  Evaluation: 10-fold Cross Validation
—  Accuracy: 62% - 78% [Lowest for {Mention,None} ]
—  AUC range: 0.63 - 0.84
17
Purohit,
Hampton,
Shalin,
Sheth
&
Flach.
In
Journal
of
Computers
in
Human
Behavior,
2013

@hemant_pt
Conversation Classification:
Discriminative Features
—  Consistent top features across classifiers
—  Pronouns (e.g., you, he)
—  Dialogue management (e.g., thanks)
—  Determiners (e.g., the)
—  Word counts
—  Positively correlated with RP, RT, MN
—  Correlation Coefficient up to 0.69
18

@hemant_pt
Psycholinguistic Analysis
—  LIWC: Tool for deeper content analysis [Pennebaker, 2001]
—  Gives a measure per psychological category
—  Categories of interest
—  Social Interaction
—  Sensed Experience
—  Communication
—  Analyzed output sets in confusion matrices
Ø  Higher values for positive classified conversation
Ø suggests higher information for cooperative intent
19
Purohit,
Hampton,
Shalin,
Sheth
&
Flach.
In
Journal
of
Computers
in
Human
Behavior,
2013
True
Positive
False
Negative
False
Positive
True
Negative

@hemant_pt
Lessons
1.  Offline theoretic features of conversations exist in the
online environment
Ø  Can be applied for computing social data
2.  Such features correlate with information density in content
- Reflection of conversation for an intent
20

@hemant_pt
Outline
21

@hemant_pt
Thesis: Statement
efficiently model
Intent & Engagement
22

@hemant_pt
Short-text Document Intent
—  Intent: Aim of action
DOCUMENT
INTENT
Text
REDCROSS
to
90999
to
donate
10$
to
help
the
victims
of

hurricane
sandy
SEEKING HELP
Anyone know where the nearest #RedCross is? I wanna
give blood today to help the victims of hurricane Sandy
OFFERING HELP

Would like to urge all citizens to make the proper
preparations for Hurricane #Sandy - prep is key - http://
t.co/LyCSprbk has valuable info!
ADVISING

23

@hemant_pt
Short-text Document Intent
—  Intent: Aim of action
DOCUMENT
INTENT
Text
REDCROSS
to
90999
to
donate
10$
to
help
the
victims
of

hurricane
sandy
SEEKING HELP
Anyone know where the nearest #RedCross is? I wanna
give blood today to help the victims of hurricane Sandy
OFFERING HELP

Would like to urge all citizens to make the proper
preparations for Hurricane #Sandy - prep is key - http://
t.co/LyCSprbk has valuable info!
ADVISING

24
How to identify relevant intent from ambiguous, unconstrained
natural language text?
Relevant intent è Articulation of organizational tasks
(e.g., Seeking vs. Offering resources)

@hemant_pt
Intent Classification: Problem
Formulation
—  Given a set of user-generated text documents, identify
existing intents
—  Variety of interpretations
—  Problem statement: a multi-class classification task
approximate f: S ! C , where
C = {c1, c2 … cK}
is a set of predefined K intent classes, and
S = {m1, m2 … mN}
is a set of N short text documents
Focus - Cooperation-assistive intent classes, C= {Seeking, Offering, None}
25

@hemant_pt
Intent Classification: Related Work
TEXT CLASSIFICATION
TYPE
FOCUS EXAMPLE
Topic predominant
subject matter
sports or entertainment
Sentiment/Emotion/
Opinion
focus on present state
of emotional affairs
negative or positive;
happy emotion
Intent Focus on action, hence,
future state of affairs
offer to help after floods
e.g., I am going to watch the awesome Fast and Furious movie!! #Excited
26

@hemant_pt
Intent Classification: Related Work
DATA TYPE APPROACH FOCUS LIMITED APPLICABILITY
27
Formal text on
Webpages/blogs
(Kröll and Strohmaier 2009, -15;
Raslan et al. 2013, -14)
Knowledge
Acquisition:
via Rules, Clustering
•  Lack of large corpora with
proper grammatical structure
•  Poor quality text hard to parse
for dependencies
Commercial Reviews,
marketplace
(Hollerit et al. 2013, Wu et al. 2011,
Ramanand et al. 2010, Carlos &
Yalamanchi 2012, Nagarajan et al.
2009)
Classification:
via Rules, Lexical
template based,
Pattern
•  More generalized intents
(e.g., ‘help’ broader than ‘sell’)
•  Patterns implicit to capture than
for buying/selling
Search Queries
(Broder 2002, Downey et al. 2008,,
Case 2012, Wu et al. 2010,
Strohmaier & Kröll 2012)
User Profiling:
Query Classification
•  Lack of large query logs, click
graphs
•  Existence of social conversation

@hemant_pt
Intent Classification: Challenges
—  Unconstrained Natural Language in small space
—  Ambiguity in interpretation
—  Sparsity of low ‘signal-to-noise’: Imbalanced classes
—  1% signals (Seeking/Offering) in 4.9 million tweets #Sandy
—  Hard-to-predict problem:
—  commercial intent, F-1 score 65% on Twitter [Hollerit et al. 2013]
@Zuora wants to help @Network4Good with Hurricane Relief. Text SANDY to
80888 & donate $10 to @redcross @AmeriCares & @SalvationArmyUS #help
*Blue: offering intent, *Red: seeking intent
28

@hemant_pt
Intent Classification: Types & Features
29
Intent
Binary
Crisis Domain:
- [Varga et al. 2013] Problem vs. Aid (Japanese)
- Features: Syntactic, Noun-Verb templates, etc.
Commercial Domain:
- [Hollerit et al. 2013] Buy vs. Sell intent
- Features: N-grams, Part-of-Speech
Multiclass
Commercial Domain:
-  Not on Twitter

@hemant_pt
TOP-DOWN
Pattern Rules:
Declarative Knowledge
(patterns defined for intent association)
BOTTOM-UP
Bag of N-grams Tokens:
Independent Tokens
(patterns derived from the data)
Our
Hybrid
Approach
Learning
Improves
Expressivity
Increases
30

@hemant_pt
Intent Classification Top-Down:
Binary Classifier - Prior Knowledge
—  Conceptual Dependency Theory [Schank, 1972]
—  Make meaning independent from the actual words in input
—  e.g., Class in an Ontology abstracts similar instances
—  Verb Lexicon [Hollerit et al. 2013]
—  Relevant Levin’s Verb categories [Levin, 1993]
—  e.g., give, send, etc.
—  Syntactic Pattern
—  Auxiliary & modals: e.g., ‘be’, ‘do’, ‘could’, etc. [Ramanand et al. 2010]
—  Word order: Verb-Subject positions, etc.
Purohit,
Hampton,
Bhatt,
Shalin,
Sheth
&
Flach.
In
Journal
of
CSCW,
2014

31

@hemant_pt
Binary Classifier – Psycholinguistic Rules
—  Transform knowledge into rules
—  Examples:
(Pronouns except 'you' = yes) ^ (need/want = yes) ^ (Adjective = yes/no) ^ (Things=yes) → Seeking
(Pronoun except 'you' | Proper Noun = yes) ^ (can/could/would/should = yes) ^ (Levin Verb = yes)
^ (Determiner = yes/no) ^ (Adjective = yes/no) ^ (Things = yes) -> Offering
Domain
ontology
32
Purohit,
Hampton,
Bhatt,
Shalin,
Sheth
&
Flach.
In
Journal
of
CSCW,
2014

@hemant_pt
Binary Classifier - Lessons
—  Preliminary Study
—  2000 conversation and then rule-based classified tweets:
labeled by two native speakers
—  Labels: Seeking, Offering, None
—  Results
—  Avg. F-1 score: 78% (Baseline F-1 score: 57% [Varga et al. 2013] )
—  Lessons
—  Role of prior knowledge: Domain Independent & Dependent
—  Limitation: Exhaustive rule-set, low Recall, Ambiguity
addressed, but sparsity

Purohit,
Hampton,
Bhatt,
Shalin,
Sheth
&
Flach.
In
Journal
of
CSCW,
2014

33

@hemant_pt
TOP-DOWN
Pattern Rules:
Declarative Knowledge
BOTTOM-UP
Independent Tokens
Hybrid
Approach
34

@hemant_pt
Intent Classification Hybrid:
Binary Classifier - Design
—  AMBIGUITY: addressed via rich feature space
1. Top-Down: Declarative Knowledge Patterns [Ramanand et al. 2010]
DK(mi, P) ! {0,1}
e.g., P= b(like|want) b.*b(to)b.*b(bring|give|help|raise|donate)b

(acquired via Red Cross expert searches)
2. Abstraction: due to importance in info sharing [Nagarajan et al. 2010]
-  Numeric (e.g., $10) à _NUM_
-  Interactions (e.g., RT & @user) à _RT_ , _MENTION_
-  Links (e.g., http://bit.ly) ! _URL_
3. Bottom-Up: N-grams after stemming and abstraction [Hollerit et al. 2013]
TOKENIZER ( mi ) à { bi-, tri-gram }
35

@hemant_pt
Binary Classifier - Design
—  SPARSITY: addressed via algorithmic choices
1.  Feature Selection
2.  Ensemble Learning
3.  Classifier Chain
36
DATASET
Knowledge-driven
features
XT
, y
m_1
m_2
P(c2)
P(c1)
X1
T, y1
X2
T, y2
1 - P(c1)

@hemant_pt
Binary Classifier - Experiments
—  Binary classifiers:
—  Seeking vs. not Seeking
—  Offering vs. not Offering
—  Dataset:
—  Candidate set: 4000 donation classified tweets
—  Labels: min. 3 judges
—  Annotations: Seeking , Offering , None
37Purohit,
Castillo,
Diaz,
Sheth,
&
Meier.
First
Monday
journal,
2014

@hemant_pt
Binary Classifier - Results
Experiments Supervised
Learning
Training
Samples
Precision
(*Baseline)
F-1
score
Class-
labels
Seeking vs. (None’ +
Offering)
RF
(CR=50:1)
3836 98%
(*79%)
46%
(56%)
56%
requests
Offering vs. (None’) RF
(CR=9:2)
1763 90%
(*65%)
44%
(*58%)
13%
offers
RF = Random Forest ensemble
CR = Asymmetric false–alarm Cost Ratios for True:False
Evaluation : 10-fold CV
Notes:
-  Domain requires high precision than recall
-  Scope for improving low recall
38Purohit,
Castillo,
Diaz,
Sheth,
&
Meier.
First
Monday
journal,
2014

@hemant_pt
Multiclass Classifier - Generalization
—  Lessons from binary classification
—  Improvement by fusing top-down & bottom-up
—  Sparsity
—  Ambiguity (Seeking & Offering complementary)
—  addressed via improved data representation
Hypothesis: Knowledge-guided approach improves
multiclass classification accuracy
39

@hemant_pt
TOP-DOWN
Knowledge Patterns
(DK) Declarative
(SK) Social Behavior
(CTK, CSK) Contrast Patterns
BOTTOM-UP
(T) Independent Tokens
Hybrid
Approach
40

@hemant_pt
Multiclass Classifier – Feature Creation
1. (T) Bag of Tokens -
2. (DK) Declarative Knowledge Patterns
—  Domain expert guidance
—  Psycholinguistics syntactic & semantic rules
—  Expand by WordNet and Levin Verbs
e.g.,
3. (SK) Social Knowledge Indicators
—  Offline conversation indicators studied in Problem 1
e.g., Hj = Dialogue Management, Hj-set = {Thanks, anyway,..}
41
(how = yes) ^ (Modal-Set 'can' = yes) ^ (Pronouns except 'you' = yes) ^ (Levin Verb-Set 'give' = yes)
Feature_Hj (mi) = term-frequency ( Hj-set, mi )
Pj = Feature_Pj (mi) = 1 if Pj exists in mi , else 0
TOKENIZER(mi , min, max)

@hemant_pt
Multiclass Classifier - Feature Creation
4. (CTK) Contrast Knowledge Patterns
INPUT: corpus {mi} cleaned and abstracted, min. support, X
For each class Cj
—  Find contrasting pattern using sequential pattern mining
OUTPUT: contrast patterns set {P} for each class Cj
5. (CPK) Contrast Patterns: on Part-of-Speech tags of {mi}
42
e.g., unique sequential patterns:
SEEKING: help .* victim .* _url_ .*
OFFERING: anyon .* know .* cloth .*

@hemant_pt
Multiclass Classifier - Feature Creation
Finding CTK: Contrast Knowledge Patterns
For each class Cj
1.  Tokenize the cleaned, abstracted text of {mi }
2.  Mine Sequential Patterns: SPADE Algorithm
—  - Output: sequences of token sets, {P’}
3.  Reduce to minimal sequences {P}
4.  Compute growth rate & contrast strength for P with all other Ck
5.  Top-K ranked {P} by contrast strength
OUTPUT: contrast patterns set {P} for each class Cj
43
gr(P,Cj,Ck) = support (P,Cj) / support (P,Ck) .. (1)
Contrast-Growth (P,Cj,Ck) = 1/(|Cj| -1) ΣCk, k=/=j gr(P,Cj,Ck)/ (1 + gr(P,Cj,Ck)) ..(2)
Contrast-Strength(P,Cj) = support(P,Cj)*Contrast-Growth(P,Cj,Ck) .. (3)

@hemant_pt
CORPUS
Set of
short text
documents,
S
FEATURES
Knowledge-driven
features
XT
, y
M_1
M_2
M_K
.
.
.
Subset Xj
T ⊂ S such that, Xj
T includes
all the labeled instances of class Cj for
model M_j
Binarization Frameworks for
Multiclass Classifier: 1 vs. All
P(c2)
P(c1)
X1
T, y1
X2
T, y2
XK
T, yK
P(cK)
44(In 1 vs. 1 framework: K*(K-1)/2 classifiers, for each Cj,Ck pair)

@hemant_pt
Multiclass Classifier - Experiments
—  Datasets
—  Dataset-1: Hurricane Sandy, Oct 27 – Nov 7, 2012
—  Dataset-2: Philippines Typhoon, Nov 7 – Nov 17, 2013
—  Parameters
—  Base Learner M_j: Random Forest, 10 trees with 100 features
—  bi-, tri-gram for (T)
—  K=100% & min. support 10% for CTK, 50% for CPK
45

@hemant_pt
Intent Classification:
Multiclass Classifier – Results
46
56% 58% 60% 62% 64% 66% 68% 70%
T (Baseline)
T,DK
T,SK
T,CTK,CSK
T,DK,SK,CTK,CSK
1-vs-1
1-vs-All
Avg. F-1 Score
(10-fold CV)
Frameworks:
Gain 7%, p < 0.05
Dataset-1 (Hurricane Sandy, 2012)
(Declarative)
(Social)
(Contrast)

@hemant_pt
74% 76% 78% 80% 82% 84% 86%
T (Baseline)
T,DK
T,SK
T,CTK,CSK
T,DK,SK,CTK,CSK
1-vs-1
1-vs-All
Intent Classification:
Multiclass Classifier - Results
47
Frameworks:
Gain 6%, p < 0.05
Dataset-2 (Philippines Typhoon, 2013)
(Declarative)
(Social)
(Contrast)
Avg. F-1 Score
(10-fold CV)

@hemant_pt
Lessons
1.  Top-down & Bottom-up hybrid approach improves data
representation for learning (complementary) intent classes
—  Top 1% discriminative features contained 50% knowledge driven
2.  Offline theoretic social conversation (SK) features (the, thanks,
etc.), often removed for text classification are valuable for
intent.
3.  There is a varying effect of knowledge types (SK vs. DK vs.
CTK/CPK) in different types of real world event datasets
Ø Culturally-sensitive psycholinguistics knowledge in future
48

@hemant_pt
Outline
49

@hemant_pt
Thesis: Statement
efficiently model
Intent & Engagement
50

@hemant_pt
—  Engagement: degree of involvement in discussion
—  Reliable groups: stay focused and collectively behave to diverge on
topics
Problem 3. Group Engagement Model
51Purohit, Ruan, Fuhry, Parthasarathy, & Sheth. ICWSM 2014
How can organizations find reliable groups to engage for action?

@hemant_pt
—  Engagement: degree of involvement in discussion
—  Reliable groups: stay focused and collectively behave to diverge on topics
—  Why & How do groups collectively evolve over time?
1.  Define a group from interaction network, g
2.  Define Divergence of g: content based in contrast to structure
3.  Predict change in the divergence between time slices
—  Features of g based on theories of social identity, & cohesion
Problem 3. Group Engagement Model
52Purohit, Ruan, Fuhry, Parthasarathy, & Sheth. ICWSM 2014

@hemant_pt
Group Engagement Model:
Integrated Approach Unlike Prior Work
People (User): Participant
of the discussion
Content (Text): Topic of
Interest
Network (Community):
Group around topic
AND
AND
Sources: tupper-lake.com/.../uploads/Community.jpg
http://www.iconarchive.com/show/people-icons-by-aha-soft/user-icon.html
KEY POINT: capture
User Node Diversity
53

@hemant_pt
—  Candidate Group: Detect in interaction network
—  Group Discussion Divergence: Jenson-Shannon Divergence of topic
distribution on group members’ tweets
Group Engagement Model: Discussion
Divergence
where, H(*) = Shannon Entropy
Bt = Latent topic distribution of each tweet t in all members’ tweets |Tg| ,
Bg = mean topic distribution of group g, such that:
54

@hemant_pt
Lessons
1.  Content Divergence based measure helps explanation of
why groups collectively diverge
—  Less diverging group write more social & future action related
content
2.  Emerging events such as disasters have higher correlation
with social identity-driven features
Ø Role of social context
55

@hemant_pt
Outline
56

@hemant_pt
DISASTER Event
Application-1: Filter Content for
Disaster Response
CITIZEN
Sensors
RESPONSE
Organizations
Me
and
@CeceVancePR
are
coordinating
a
clothing/
food
drive
for
families
affected
by
Hurricane
Sandy.

If
you
would
like
to
donate,
DM
us

Does
anyone
know
how
to
donate
clothes
to

hurricane
#Sandy
victims?

[SEEKING

[OFFERING

Intent-Classifiers
as a Service
57

@hemant_pt
Broader Impact: Classifier Model
integrated by Crisis Mapping Pioneer
58

@hemant_pt
DISASTER Event
Application-2: “We TRUST people!”
User engagement tool
CITIZEN
Sensors
RESPONSE
Organizations
Tool to mine
Important
users
59

@hemant_pt
Broader Impact: Winner of Int’l Challenge: UN
ITU Young Innovators 2014
60

@hemant_pt
Articulation
ENGAGEMENT MODELING INTENT MINING
COOPERATIVE
SYSTEM
61
ORGANIZATIONS
CITIZEN
SENSOR
COMMUNITIES

Awareness
Q1. Who to
engage
first?
Org. Actor
Q2. What are
Resource needs &
availabilities?
Org. Actor

@hemant_pt
Limitations & Future Work
—  Cooperative System
—  CSCW Application specific to domain of crisis
Ø  How to create a full What-Where-When-Who knowledge base
—  Intent Mining
—  Non-cooperation assistive intent classes not considered, as well as
the temporal drift of intent not considered
Ø  How to mine actor-level intent beyond document level
—  Group Engagement
—  Reliable prioritized groups based on Correlation, not Causality
—  Interplay of Offline and Online interactions beyond the scope
Ø  How to incorporate intent in the group divergence
—  Bipartite Intent Graph Matching
—  Reducing time complexity of Seeking vs. Offering matching
62

@hemant_pt
Conclusion
efficiently model
Intent & Engagement
for cooperation between citizen sensors and organizations in
the online social communities.
63

@hemant_pt
Thanks to the Committee Members
64
[Left to Right] Prof. Amit Sheth, (advisor, WSU), Prof. Guozhu Dong (WSU), Prof. Srinivasan
Parthasarathy (OSU), Prof. TK Prasad (WSU), Dr. Patrick Meier (QCRI), Prof. Valerie Shalin (WSU)
Computer Science Social Science

@hemant_pt
Acknowledgement,
Thanks and Questions J
—  NSF SoCS grant IIS-1111182 to support this work
—  Interdisciplinary Mentors especially Prof. John Flach (WSU), Drs. Carlos
Castillo (QCRI), Fernando Diaz (Microsoft), Meena Nagarajan (IBM)
—  Kno.e.sis team especially Andrew Hampton from Psychology dept. and
Shreyansh and Tanvi from CSE at Wright State, as well as Yiye Ruan (now
Google) & David Fuhry at the Data Mining Lab, Ohio State University
—  Colleagues: Digital Volunteers from the CrisisMappers network, StandBy Task
Force, InCrisisRelief.org, info4Disasters, Humanity Road, Ushahidi, etc. and
the subject matter experts at UN FPA
65

@hemant_pt
Ambiguity
Sparsity
Diversity
Scalability
•  Mutual Influence in Sparse
Friendship Network
[AAAI ICWSM’12]
•  User Summarization with
Sparse Profile Metadata
[ASE SocialInfo’12]
•  Matching intent as task of
Information Retrieval [FM’14]
•  Knowledge-aware Bi-partite
Matching [In preparation]
•  Short-Text Document Intent
Mining [FM’14, JCSCW’14]
•  Actor-Intent Mining
Complexity [In preparation]
•  Modeling Group Using
Diverse Social Identity &
Cohesion [AAAI ICWSM’14]
•  Modeling Diverse User-
Engagement [SOME WWW’11,
ACM WebSci’12]
(Interpretation)
(users)
(behaviors)
66
Other
works

Hemant Purohit PhD Defense: Mining Citizen Sensor Communities for Cooperation with Organizations

More Related Content

What's hot

Viewers also liked

Similar to Hemant Purohit PhD Defense: Mining Citizen Sensor Communities for Cooperation with Organizations

Recently uploaded

Hemant Purohit PhD Defense: Mining Citizen Sensor Communities for Cooperation with Organizations