This document discusses automatically extracting and categorizing occasions associated with consumer products from text data. It defines occasions as times or events in which a product is used or with which it is associated. Six high-level occasion types are identified: celebratory, special, seasonal, weather-related, temporal, and other. A dataset of product reviews, product descriptions, and forum posts is annotated for occasions through an iterative process of machine annotation followed by manual correction. The goal is to provide insights into consumers and their behaviors based on the occasions extracted from text.
1. oculus360.us Oculus360: The Future of Insights @oculus360 | oculus360.us
Long Nights, Rainy Days, and Misspent Youth: Automatically Extracting and Categorizing Occasions Associated with Consumer Products
David B. Bracewell, Ph.D.
Oculus360
3. oculus360.us
“I bought this because all the reviews…”
“I bought this for a camping trip and …”
• Influences other consumer decisions
• Provides marketers insights into the who, what, when, where, why, and how
4. oculus360.us
Traditionally marketers rely upon surveys and
ethnographic studies in order to gain insights about
consumers and their behaviors.
• Outcomes of these surveys and
studies include:
• Consumer segments
• When & where & why a
product is purchased or used
• Solitary or group consumption
• Beliefs about the product
• Preferences
• Likelihood to purchase again
5. oculus360.us
Drawbacks of surveys and studies:
• Costly ($$$)
• Small samples (N = 100s or 1,000s)
• Represent a single snapshot in time
6. oculus360.us
For the most part commercial computational
approaches have focused mainly on the trend of
positive and negative comments, tweets, etc.
7. oculus360.us
Decision makers are left not knowing why the trends
are going up/down, because traditional approaches
do not capture the implicatures of the online
discussion and commentary.
8. oculus360.us
One way in which we can start to gain deeper insights is
through the automated extraction of occasions in which
consumers use products or with which consumers associate
products.
“I bought this for a camping trip and …”
• Facilitates insight into:
• Personality
• Culture
• Social status
• Social circle
• Behavior
9. oculus360.us
One way in which we can learn more about the consumer
and start to answer the six W’s is by examining the occasions
in which consumers use products or with which consumers
associate products.
“I bought these for a party and everyone I talked to loved them.”
“With ###### shopping for prom!”
“I bought the non ballistic for rock climbing and mountain biking.”
10. oculus360.us
「阿波おどりのために、高円寺まで出たので、ついでにヴィレバンに寄って買ってきましたよ~」
“For Awa Odori I went out to Koenji and along the way stopped at the Village Vanguard and bought this!”
“我刚接到被哈佛录取的通知,就顺手买下了机票 。”
“I just got accepted to Harvard, and bought a plane ticket.”
12. oculus360.us
“We ( my son and I ) purchased this gift set for
my wife on Valentines day.”
Highlighted in the sentence: “gift set”, “Valentines day”
Questions answered: Why? What? When?
Purpose: Gift
When: Valentines day
Giving Situation: Family
13. oculus360.us
“We ( my son and I ) purchased this gift set for
my wife on Valentines day.”
Highlighted in the sentence: “my son and I”, “my wife”
Question answered: Who?
Gender: Male
Married: Yes
Children: 1 son
Age: Infer > 25
Income: Infer from price
14. oculus360.us
Outline
o Related Work
o Modelling occasions for consumer insights
• Define
• Differentiate
• Categorize
o Data collection and annotation
o Computational methodology and results
• Extraction
• Categorization
o Conclusion and Future Work
15. oculus360.us
Related Work
o Consumer Psychology: study of how thoughts,
feelings, and perceptions influence the way
individuals buy, use, and relate to products,
services, and brands.
• Categorical representation of the cognitive
system of consumers (Loken, Barsalou, and Joiner
2008).
• The categories go beyond just product and
brand to encompass others:
– Goal-directed
– Cultural
– Service employee
16. oculus360.us
Related Work
o Affective Computing: study on the understanding
and interpretation of human affect.
• Aspect-based sentiment analysis where the goal
is to determine the sentiment toward aspects of
a target entity.
• Sentic computing synthesizes common-sense
computing, linguistics, and psychology to infer
both affective and semantic information about
concepts (Cambria and Hussain 2012).
17. oculus360.us
Related Work
o Dialogue Processing: study of the underlying
structure, patterns, representations, and processes
of written and verbal communication.
• Online bullying (Dinakar et al., 2011)
• Suicide prevention (Jashinsky et al., 2014)
• Social implicatures from social acts (Bracewell et
al. 2011, Tomlinson et al. 2012)
o Real-Time Event Detection: uses social sensors to
detect events as they happen.
• Disasters (Sakaki et al., 2010; Vieweg et al., 2010)
• Local Events (Boettcher and Lee, 2012)
18. oculus360.us
Related Work
o Event Extraction: deals with the extraction and
categorization of events, entities participating in the
events, and other attributes of the event.
• Time (Moschitti et al., 2013)
• Location (Speriosu et al., 2010)
• Modality (Bracewell et al., 2014)
o The definition of an event varies and is ill-defined.
• ACE defines events using a limited set of types
(ACE, 2005).
• TimeML defines events as “situations that happen
or occur” focusing on duration properties of the
event (Pustejovsky et al., 2003).
19. oculus360.us
Modelling Occasions for Consumer Insights
o Occasions are particular times or events.
o They can range from the everyday (e.g. waking up
and going to bed) to the special (e.g. birthdays
and weddings).
o We restrict the definition of an occasion to:
Times or happenings in which a product is
used or with which a product is associated.
20. oculus360.us
Modelling Occasions for Consumer Insights
o Examples of occasions meeting this definition include:
“They are GREAT to take along to a party if you’re
serving crackers and cheese.”
“I bought these for my vacation and they did not
disappoint.”
“Boy, do these take me back to those misspent
days of my foolish youth.”
21. oculus360.us
Modelling Occasions for Consumer Insights
o While occasions have similarities to events, not all fit
nicely within the ACE and TimeML definitions.
“These boots really kept me warm during the
winter.”
“Every time I smell a freshly baked apple pie it
brings me back to my childhood. ”
22. oculus360.us
Modelling Occasions for Consumer Insights
o We define six high-level types of occasions, which are based on common categories marketers use to segment consumers.
Occasion Type     Definition
Celebratory       Occasions meant to celebrate an event, person, or group of people (e.g. parties and award ceremonies)
Special           Occasions which have significant importance to an individual or group of individuals (e.g. holidays and life events)
Seasonal          Occasions related to the seasons of the year (e.g. winter)
Weather-Related   Occasions strongly associated with the weather and/or temperature (e.g. hot days and rainy nights)
Temporal          Occasions tied to a specific time (e.g. 9 to 5, late night, and last year)
Other             Occasions which do not fit in the other categories (e.g. a shopping spree, at the beach)
23. oculus360.us
Modelling Occasions for Consumer Insights
o Celebratory: Includes parties and festivals. They are
social occasions and often indicate the group to
which the consumer belongs.
o Special: Occasions that have significant meaning
to the consumer, such as holidays and religious
observances.
“I wore it a couple weeks ago to a party and felt festive
yet as comfy as if I was wearing loungewear.”
“I recommend these for your engagement party or
rehearsal dinner.”
24. oculus360.us
Modelling Occasions for Consumer Insights
o Seasonal: Relate to the season of the year in which
a product is used or with which it is associated.
o Temporal: Relate to the time at which a product is
used or with which it is associated.
“A quintessential style to take you between seasons.”
“Just the right size for your day-to-day life, but elegant
enough for evening.”
25. oculus360.us
Modelling Occasions for Consumer Insights
o Weather: Relate to the weather, e.g. rain and snow,
or temperature, e.g. hot and 98 degrees.
o Other: Occasions that do not neatly fit in one of the
previous five categories.
“The tea is great hot for chilly nights and iced for hot
days.”
“Taking a look at the latest summer fashion makes me
want to lie on the beach.”
26. oculus360.us
Data Collection and Annotation
o Collect 26,208 sentences from 1,000 product
reviews, 500 product descriptions, and 800 forum posts
covering fashion and food.
o Iterative annotation process:
• Machine Annotation: alternates between two machine learning algorithms
• Manual Correction: remove incorrect annotations, add missing annotations, fix boundaries
27. oculus360.us
Data Collection and Annotation
o The initial iteration of the annotation process is
performed on 7,000 randomly selected sentences.
o A semi-automatically constructed gazetteer is used
during the initial iteration.
• Built from the full WordNet hyponym tree and all
derivationally related forms of social event, time
period, and the first noun sense of activity.
“Darling cocktail party or date night dress .”
“We only stayed at the party an hour because
my shoes were killing my feet.”
Blue = missed Red = error
28. oculus360.us
Data Collection and Annotation
o After manual correction of the initial 7,000
sentences:
• 4,500 held out as test data.
• 500 held out as a development set.
• 2,000 used as training data for the 2nd iteration
model.
o Remaining iterations use batches of 500 sentences.
• A machine learning model is trained on the corrected
output of the previous iteration and applied to the
new batch of sentences.
• The process switches between two models, discussed
later, so that the annotations are not biased toward
either model.
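The split and batch schedule described on this slide can be sketched as follows. This is a minimal illustration, not the authors' tooling: the function name `plan_annotation`, the random seed, the positional assignment of sentences to test/dev/train, and the choice of which model annotates the first batch are all assumptions. The model names assume the two alternating models are the MEMM and CRF presented later.

```python
import random


def plan_annotation(n_sentences, seed=0, first=7000, batch=500,
                    models=("MEMM", "CRF")):
    """Return the initial split and the later batches with the model used for each."""
    ids = list(range(n_sentences))
    random.Random(seed).shuffle(ids)      # random selection of the initial 7,000
    initial, rest = ids[:first], ids[first:]
    split = {
        "test": initial[:4500],           # held out as test data
        "dev": initial[4500:5000],        # held out as the development set
        "train": initial[5000:],          # training data for the 2nd iteration model
    }
    batches = []
    for k, start in enumerate(range(0, len(rest), batch)):
        # Alternate the annotating model (assumed order) to avoid favoring one.
        batches.append((models[k % len(models)], rest[start:start + batch]))
    return split, batches


split, batches = plan_annotation(26208)
print({name: len(part) for name, part in split.items()})
print(len(batches), "batches; first two models:", batches[0][0], batches[1][0])
```

With 26,208 sentences, the sentences left after the initial 7,000 do not divide evenly by 500, so the last batch in this sketch is smaller than the others.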
29. oculus360.us
Data Collection and Annotation
o The final extraction corpus consists of:
• 2,393 occasions
• ~1 occasion every 11 sentences
• ~1 occasion per product review
• ~1 occasion per forum post
• ~1 occasion every 3 product descriptions
o Assign an occasion type to each of the 2,393
occasions by:
1. Automatic assignment using a WordNet
mapping
2. Manual correction
30. oculus360.us
Data Collection and Annotation
o Automatic assignment is done from a set of 12
seeds that are expanded using the full hyponym
tree and derivationally related forms.
• WordNet lemmas found in a given occasion
annotation are examined in right-to-left order.
• All senses for a lemma are considered in order of
sense number.
• Assignment is performed greedily with the type of
the first sense found in the mapping being assigned
to the occasion.
• The Other type is assigned if no mapping is found.
31. oculus360.us
Data Collection and Annotation
WordNet Sense Occasion Type
party#N#4 Celebratory
celebration#N#1 Celebratory
season#N#2 Seasonal
temperature#N#1 Weather-Related
day#N#1 Temporal
day#N#2 Special
valentine#N#1 Special
gift#N#1 Special
anniversary#N#1 Special
birthday#N#1 Special
special#A#3 Special
New Year#N#1 Special
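The greedy assignment over this mapping can be sketched as below. The seed table is copied from the slide; everything else is a stand-in. `TOY_SENSES` and `TOY_PARENT` are hypothetical miniature replacements for WordNet (they are not real WordNet data), and derivationally related forms are not modelled.

```python
# Seed senses and their occasion types, copied from the mapping table above.
SEED_TYPES = {
    "party#N#4": "Celebratory", "celebration#N#1": "Celebratory",
    "season#N#2": "Seasonal", "temperature#N#1": "Weather-Related",
    "day#N#1": "Temporal", "day#N#2": "Special",
    "valentine#N#1": "Special", "gift#N#1": "Special",
    "anniversary#N#1": "Special", "birthday#N#1": "Special",
    "special#A#3": "Special", "New Year#N#1": "Special",
}

# Hypothetical stand-in for WordNet: senses per lemma in sense-number order,
# and one hypernym (parent) per sense. Illustrative only.
TOY_SENSES = {
    "party": ["party#N#1", "party#N#2", "party#N#3", "party#N#4"],
    "birthday": ["birthday#N#1"],
    "winter": ["winter#N#1"],
    "day": ["day#N#1", "day#N#2"],
}
TOY_PARENT = {"winter#N#1": "season#N#2"}


def sense_type(sense):
    """Walk up the toy hypernym chain until a seed sense is reached."""
    while sense is not None:
        if sense in SEED_TYPES:
            return SEED_TYPES[sense]
        sense = TOY_PARENT.get(sense)
    return None


def assign_type(lemmas):
    """Greedy assignment: lemmas right to left, senses in sense-number order."""
    for lemma in reversed(lemmas):
        for sense in TOY_SENSES.get(lemma, []):
            found = sense_type(sense)
            if found is not None:
                return found
    return "Other"   # no sense of any lemma is covered by the mapping


print(assign_type(["birthday", "party"]))
print(assign_type(["cold", "winter"]))
print(assign_type(["shopping", "spree"]))
```

Because the rightmost lemma is examined first, "birthday party" is typed through "party" in this sketch, which is the kind of case the manual correction step exists to revisit.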
32. oculus360.us
Data Collection and Annotation
o Manual correction fixes mistakes in the automatic assignment.
o Most types are easily determined by an annotator.
• The celebratory and special types do have an
overlap, e.g. birthday party.
• Special is assigned instead of celebratory when
the celebration is associated with a life event
(e.g. birthday and engagement parties) or
holiday (e.g. Halloween).
33. oculus360.us
Data Collection and Annotation
Type Count
Celebratory 107
Seasonal 525
Special 336
Temporal 263
Weather-Related 48
Other 1,114
Final count of annotations per type.
34. oculus360.us
Computational Methodology and Results
Automatically Extracting Occasions
o Model the extraction using the standard BIO
encoding where words in a sentence are labeled
as:
• B-Occasion (Word begins an occasion mention)
• I-Occasion (Word is inside an occasion mention)
• Other (Word is not part of an occasion mention)
Darling   cocktail     party        or      date         night        dress   .
Other     B-Occasion   I-Occasion   Other   B-Occasion   I-Occasion   Other   Other
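The BIO labelling of the example sentence can be reproduced with a few lines of code. This is a minimal sketch: the function name `bio_encode` and the representation of occasions as (start, end) token offsets are assumptions made for illustration.

```python
def bio_encode(tokens, spans):
    """Label tokens with B-Occasion / I-Occasion / Other.

    spans holds (start, end) token offsets, end exclusive, one per occasion.
    """
    labels = ["Other"] * len(tokens)
    for start, end in spans:
        labels[start] = "B-Occasion"          # first word of the mention
        for i in range(start + 1, end):
            labels[i] = "I-Occasion"          # remaining words of the mention
    return labels


tokens = "Darling cocktail party or date night dress .".split()
labels = bio_encode(tokens, [(1, 3), (4, 6)])  # "cocktail party", "date night"
for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```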
35. oculus360.us
Automatically Extracting Occasions
o Experiment using:
• A Maximum Entropy Markov Model (MEMM)
– In-house implementation which uses the
LibLinear library (Fan et al., 2008)
• A Linear Chain Conditional Random Field (CRF)
– CRFSuite (Okazaki, 2007)
• Parameters for both models are optimized using
grid search over the 500 sentence development
set:
– MEMM: C = 3.0
– CRF: C1 = 0.0, C2 = 2.0
36. oculus360.us
Automatically Extracting Occasions
Feature                  Template
Current word             w_i & t_i
Current word & POS       w_i, p_i & t_i
Previous word & POS      w_{i-1}, p_{i-1} & t_i
Word two back & POS      w_{i-2}, p_{i-2} & t_i
Next word & POS          w_{i+1}, p_{i+1} & t_i
Word two ahead & POS     w_{i+2}, p_{i+2} & t_i
Bigram word              w_{i-2}, w_{i-1} & t_i
                         w_{i-1}, w_i & t_i
                         w_i, w_{i+1} & t_i
                         w_{i+1}, w_{i+2} & t_i
Bigram word & POS        w_{i-2}, p_{i-2}, w_{i-1}, p_{i-1} & t_i
                         w_{i-1}, p_{i-1}, w_i, p_i & t_i
                         w_i, p_i, w_{i+1}, p_{i+1} & t_i
                         w_{i+1}, p_{i+1}, w_{i+2}, p_{i+2} & t_i
Trigram word             w_{i-2}, w_{i-1}, w_i & t_i
                         w_i, w_{i+1}, w_{i+2} & t_i
Current POS              p_i & t_i
Previous POS             p_{i-1} & t_i
POS two back             p_{i-2} & t_i
Next POS                 p_{i+1} & t_i
POS two ahead            p_{i+2} & t_i
Bigram POS               p_{i-2}, p_{i-1} & t_i
                         p_{i-1}, p_i & t_i
                         p_i, p_{i+1} & t_i
                         p_{i+1}, p_{i+2} & t_i
Current word is punct.   isPunctuation(w_i) & t_i
Current word is digit    isDigit(w_i) & t_i
Current word is letter   isLetter(w_i) & t_i
Current word is upper    isUppercase(w_i) & t_i
Current word is lower    isLowercase(w_i) & t_i
WordNet super sense      ss_{ij} ∀ sense(w_i) & t_i
Figure 3: Feature templates used for extracting occasions. w_1, ..., w_n are the words in the sentence and w_i the current word. p_1, ..., p_n is the part-of-speech sequence for the sentence and p_i is the part-of-speech for the current word.
• Features consist of:
– Surface level information in the form of unigrams, bigrams, and trigrams.
– Part-of-speech information.
– Semantic information in the form of WordNet super senses.
• We eliminate all features that occur only once in our training set.
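A few of these templates, together with the removal of features that occur only once, can be sketched as below. This is an illustrative subset rather than the system's feature extractor: the feature-string format, the `<PAD>` boundary symbol, the function names, and the hand-assigned part-of-speech tags in the example are assumptions. The conjunction with the label t_i is assumed to be handled by the learner and is not shown.

```python
from collections import Counter


def token_features(words, tags, i):
    """Instantiate a subset of the templates for position i."""
    def w(k):
        return words[k] if 0 <= k < len(words) else "<PAD>"

    def p(k):
        return tags[k] if 0 <= k < len(tags) else "<PAD>"

    return [
        f"w0={w(i)}",                              # current word
        f"w0|p0={w(i)}|{p(i)}",                    # current word & POS
        f"w-1|p-1={w(i - 1)}|{p(i - 1)}",          # previous word & POS
        f"w+1|p+1={w(i + 1)}|{p(i + 1)}",          # next word & POS
        f"w-1,w0={w(i - 1)},{w(i)}",               # word bigram
        f"w-2,w-1,w0={w(i - 2)},{w(i - 1)},{w(i)}",  # word trigram
        f"p-1,p0={p(i - 1)},{p(i)}",               # POS bigram
        f"isDigit={w(i).isdigit()}",               # current word is digit
    ]


def prune_singletons(sentences):
    """Drop every feature that occurs only once in the training set."""
    feats = [[token_features(ws, ps, i) for i in range(len(ws))]
             for ws, ps in sentences]
    counts = Counter(f for sent in feats for tok in sent for f in tok)
    return [[[f for f in tok if counts[f] > 1] for tok in sent] for sent in feats]


# Two toy training sentences with hypothetical POS tags.
train = [(["cocktail", "party", "dress"], ["NN", "NN", "NN"]),
         (["a", "party", "dress"], ["DT", "NN", "NN"])]
pruned = prune_singletons(train)
print(pruned[0][1])   # surviving features for "party" in the first sentence
```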
37. oculus360.us
Automatically Extracting Occasions
• Performance is measured using the CoNLL precision, recall, and F1-measure.
• An occasion is correct if and only if it exactly matches a gold standard annotation.
            MEMM    CRF
Precision   79.2%   83.7%
Recall      43.2%   63.8%
F1          55.9%   72.4%
Darling   cocktail     party ...
Other     B-Occasion   I-Occasion   ✔ Correct
Darling   cocktail     party ...
Other     Other        B-Occasion   ✖ Incorrect
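The exact-match criterion can be sketched as a span comparison over BIO label sequences. This is a simplified illustration and not the official CoNLL evaluation script: the function names are assumptions, and an I-Occasion label with no preceding B-Occasion is simply ignored here.

```python
def spans(labels):
    """Return the set of (start, end) occasion spans in a BIO label sequence."""
    found, start = set(), None
    for i, label in enumerate(labels + ["Other"]):   # sentinel closes a final span
        if start is not None and label != "I-Occasion":
            found.add((start, i))
            start = None
        if label == "B-Occasion":
            start = i
    return found


def exact_match_prf(gold_seqs, pred_seqs):
    """Precision, recall, and F1 where only exactly matching spans count."""
    correct = n_gold = n_pred = 0
    for gold, pred in zip(gold_seqs, pred_seqs):
        g, p = spans(gold), spans(pred)
        correct += len(g & p)
        n_gold += len(g)
        n_pred += len(p)
    precision = correct / n_pred if n_pred else 0.0
    recall = correct / n_gold if n_gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, f1


gold = [["Other", "B-Occasion", "I-Occasion"]]       # Darling [cocktail party]
wrong = [["Other", "Other", "B-Occasion"]]           # Darling cocktail [party]
print(exact_match_prf(gold, gold))
print(exact_match_prf(gold, wrong))
```

A prediction that covers only "party" earns no partial credit: it counts as one spurious span and one missed span, lowering both precision and recall.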
38. oculus360.us
Computational Methodology and Results
o MEMMs are subject to the label bias problem, in which
contextual information is lost around low-entropy
transitions due to the use of a per-state (vs. single)
exponential model (Lafferty et al., 2001).
Model   Length   P       R       F1
MEMM    1        79.8%   55.6%   65.6%
MEMM    2        84.6%   41.8%   55.9%
MEMM    3        75.0%   25.5%   38.1%
MEMM    4        76.9%   35.7%   48.8%
MEMM    5+       40.0%   66.7%   11.4%
CRF     1        83.9%   52.8%   64.8%
CRF     2        80.8%   74.6%   77.6%
CRF     3        89.2%   70.2%   78.6%
CRF     4        85.7%   85.7%   85.7%
CRF     5+       80.8%   70.0%   75.0%
Table 3: CoNLL Precision, Recall and F1-measure by length of occasion in words.
39. oculus360.us
Computational Methodology and Results
o Examples where both the CRF and the MEMM are correct:
“Just what you need for a hot summer day!”
“We ( my son and I ) purchased this gift set for
my wife on Valentines day.”
“It’s the perfect size to take me from a day at
work to a night out for drinks with friends.”
40. oculus360.us
Computational Methodology and Results
Automatically Categorizing Occasions
o Once an occasion is extracted it is categorized as
one of the previously defined six types.
o We examine the effectiveness of categorizing
occasions given only the occasion and no context.
o We use a multi-class support vector machine, which
has been shown to work well for this type of
problem.
• Use the LibLinear library with default values for C
and 𝜖.
41. oculus360.us
Automatically Categorizing Occasions
o Three features are used for determining the type:
• Bag of words (case normalized)
• WordNet super senses of all possible senses found
in the occasion.
– Adjectives and adverbs are linked to nouns
via the derivationally related form and
pertainym relations.
• SUMO concepts (Benzmuller and Pease, 2012)
associated with all WordNet senses in the
occasion.
42. oculus360.us
Automatically Categorizing Occasions
Type              Precision   Recall   F1-measure
Celebratory       92.6%       84.7%    88.5%
Seasonal          95.9%       97.5%    96.7%
Special           96.5%       93.8%    95.1%
Temporal          80.6%       88.6%    84.4%
Weather-Related   78.0%       66.7%    71.9%
Other             94.5%       93.8%    94.2%
Macro-avg.        89.7%       87.5%    88.5%
Micro-avg.        93.1%       93.1%    93.1%
Results for 10-fold cross-validation.
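The macro-averaged row can be recomputed from the per-type rows as an unweighted mean of each column. The sketch below copies the per-type figures from the table; the names `RESULTS` and `macro_average` are illustrative.

```python
# Per-type precision, recall, and F1 (percent), copied from the table above.
RESULTS = {
    "Celebratory":     (92.6, 84.7, 88.5),
    "Seasonal":        (95.9, 97.5, 96.7),
    "Special":         (96.5, 93.8, 95.1),
    "Temporal":        (80.6, 88.6, 84.4),
    "Weather-Related": (78.0, 66.7, 71.9),
    "Other":           (94.5, 93.8, 94.2),
}


def macro_average(results):
    """Unweighted mean of each metric over the occasion types."""
    n = len(results)
    return tuple(round(sum(row[k] for row in results.values()) / n, 1)
                 for k in range(3))


print(macro_average(RESULTS))
```

The macro average gives each type the same weight regardless of how many occasions it has, so the small Weather-Related class pulls it down. The micro average pools all decisions; when every occasion receives exactly one type, each error is both a false positive and a false negative, which is why micro precision, recall, and F1 coincide.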
43. oculus360.us
Automatically Categorizing Occasions
Occasion                               Gold       System
“new spring semester”                  Temporal   Seasonal
“spend time with the one you love”     Other      Temporal
“shooting your engagement photos”      Special    Other
“upcoming year”                        Temporal   Special
“Halloween party”                      Special    Celebratory
Examples of errors in type assignment.
44. oculus360.us
Conclusion and Future Work
o Methodology for extracting and categorizing
occasions in which a product is used or with which
a product is associated.
o Focus on product descriptions, product reviews,
and forum posts which are comments or reviews
about a product.
o Occasions are categorized as one of six types:
Celebratory, Special, Seasonal, Temporal, Weather-
Related, and Other.
o Extraction reaches 72.4% F1-measure and categorization
88.5% macro-averaged F1-measure.
45. oculus360.us
Conclusion and Future Work
Celebratory
• Party
• Family Time
• General
Temporal
• On the go
• At the office
• Early morning
Seasonal
• Spring
• Summer
• Between seasons
Weather-Related
• Rainy
• Hot
• Windy
Special
• Birthday
• Wedding
• Christmas
Other
• Sports
• Emotional times
• Date night
46. oculus360.us
Conclusion and Future Work
my son and I
my wife
Gender Male
Married Yes
Children 1 son
Age Infer > 25
Income Infer from price
Purpose Gift
When Valentines day
Giving Situation Family
49. oculus360.us
“I know my children needs (sic) to know
computers to be successful, but I just can't afford
one.”
“my children needs to know computers”: Desire to meet need
“can’t afford”: Ability
Consumer is unable to realize need: Missed Opportunity