The 2010 JDPA Sentiment Corpus for the Automotive Domain: Presentation Transcript

  • The JDPA Sentiment Corpus for the Automotive Domain. Miriam Eckert, Lyndsie Clark, Nicolas Nicolov (J.D. Power and Associates); Jason S. Kessler (Indiana University)
  • Overview
    • 335 blog posts containing opinions about cars
      • 223K tokens of blog data
    • Goal of annotation project:
      • Examples of how words interact to evaluate entities
      • Annotations encode these interactions
    • Entities are physical objects invoked in the text, and their properties
      • Not just cars and car parts
      • Also people, locations, organizations, times
  • Excerpt from the corpus
    • “last night was nice. sean bought me caribou and we went to my house to watch the baseball game …”
    • “… yesturday i helped me mom with brians house and then we went and looked at a kia spectra . it looked nice, but when we got up to it, i wasn't impressed ...”
  • Outline
    • Motivating example
    • Overview of annotation types
      • Some statistics
    • Potential uses of corpus
    • Comparison to other resources
  • [Motivating example, built up over five animated slides around the sentence: “John recently purchased a Honda Civic. It had a great engine, a disappointing stereo, and was very grippy. He also considered a BMW which, while highly priced, had a better stereo.” Successive overlays show: entity mentions (PERSON, CAR, CAR-PART, CAR-FEATURE) with REFERS-TO coreference links; sentiment expressions and their TARGETs; PART-OF and FEATURE-OF relations; a comparison (DIMENSION, MORE, LESS); and the resulting entity-level sentiment labels (positive for one car, mixed for the other).]
  • Outline
    • Motivating example
    • Overview of annotation types
      • Some statistics
    • Potential uses of corpus
    • Comparison to other resources
  • Entity annotations
    • Example: “John recently purchased a Civic. It had a great engine and was priced well.”
      • John: PERSON; Civic: CAR; It: CAR, REFERS-TO Civic; engine: CAR-PART; priced: CAR-FEATURE (a sketch of this representation follows below)
    • >20 semantic types from
      • ACE Entity Mention Detection Task
      • Generic automotive types
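To make the scheme concrete, here is a minimal sketch of how the mention and REFERS-TO annotations could be represented. The class names and character offsets are illustrative assumptions, not the corpus's actual file format:

```python
from dataclasses import dataclass, field

@dataclass
class Mention:
    """A text span labeled with one of the >20 semantic types
    (ACE types plus automotive types such as CAR, CAR-PART)."""
    start: int          # character offset where the span begins
    end: int            # character offset where the span ends (exclusive)
    text: str
    semantic_type: str  # e.g. "PERSON", "CAR", "CAR-PART", "CAR-FEATURE"

@dataclass
class Entity:
    """A set of mentions linked by REFERS-TO (coreference)."""
    mentions: list = field(default_factory=list)

# "John recently purchased a Civic. It had a great engine and was priced well."
john  = Mention(0, 4, "John", "PERSON")
civic = Mention(26, 31, "Civic", "CAR")
it    = Mention(33, 35, "It", "CAR")          # REFERS-TO links "It" to "Civic"
civic_entity = Entity(mentions=[civic, it])   # one entity, two mentions
```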
  • Entity-relation annotations
    • Relations between entities
    • Entity-level sentiment annotations
    • Sentiment flow between entities through relations (sketched below)
      • My car has a great engine.
      • Honda, known for its high standards, made my car.
    • Diagram: engine (CAR-PART) PART-OF Civic (CAR); priced (CAR-FEATURE) FEATURE-OF Civic; entity-level sentiment: positive
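The transcript does not give an algorithm for this sentiment flow; the following is a minimal sketch, assuming the simple rule that sentiment on a part or feature contributes to the entity it belongs to. The aggregation scheme is an assumption, not the authors' method:

```python
# Hypothetical sketch: propagate sentiment from parts/features to the
# whole entity along PART-OF / FEATURE-OF relations.

def entity_sentiment(entity, part_of, sentiments):
    """entity: entity id; part_of: dict mapping child -> parent entity;
    sentiments: dict mapping entity id -> list of polarity scores (+1/-1)."""
    scores = list(sentiments.get(entity, []))
    for child, parent in part_of.items():
        if parent == entity:
            scores.extend(sentiments.get(child, []))  # inherit from parts
    if not scores:
        return "neutral"
    total = sum(scores)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "mixed"

# "My car has a great engine." -> positive sentiment on engine,
# engine PART-OF car, so the car receives positive sentiment.
print(entity_sentiment("car", {"engine": "car"}, {"engine": [+1]}))  # positive
```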
  • Entity annotation type: statistics
    • Inter-annotator agreement
      • Among mentions: 83%
      • Refers-to: 68%
    • 61K mentions in corpus and 43K entities
    • 103 documents annotated by around 3 annotators each
    • Mentions match only on identical spans: A1 “Kia Rio” vs. A2 “Kia Rio” is a match; spans with differing boundaries are not a match (see the sketch below)
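The slides do not show the agreement formula itself; as an assumed illustration, exact-span mention agreement between two annotators could be computed like this (the F1-style normalization is an assumption):

```python
def span_agreement(spans_a, spans_b):
    """Exact-span agreement between two annotators' mention sets.
    Spans are (start, end) tuples; only identical spans count as a match.
    Returns matches over the average number of annotated spans."""
    matches = len(set(spans_a) & set(spans_b))
    return 2 * matches / (len(spans_a) + len(spans_b))

# The same "Kia Rio" span annotated by both annotators is a match;
# a span whose boundaries differ by even one character is not.
a = [(10, 17), (30, 35)]
b = [(10, 17), (29, 35)]     # second span has a different start offset
print(span_agreement(a, b))  # 0.5
```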
  • Sentiment expressions
    • Evaluations
    • Target mentions
    • Prior polarity:
      • Semantic orientation given target
      • positive, negative, neutral, mixed
    • Examples: “great engine” (prior polarity: positive); “highly priced” (prior polarity: negative); “a highly spec’ed …” (prior polarity: positive); represented in the sketch below
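A minimal sketch, assuming a simple record per annotation (not the corpus's actual format), of how a sentiment expression, its target, and its target-dependent prior polarity could be represented:

```python
from dataclasses import dataclass

@dataclass
class SentimentExpression:
    """An evaluative span with its target mention and prior polarity.
    Prior polarity is the semantic orientation *given the target*:
    "highly" is negative when targeting "priced" but positive when
    targeting "spec'ed"."""
    span: str
    target: str
    prior_polarity: str  # "positive" | "negative" | "neutral" | "mixed"

examples = [
    SentimentExpression("great", "engine", "positive"),
    SentimentExpression("highly", "priced", "negative"),
    SentimentExpression("highly", "spec'ed", "positive"),
]
```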
  • Sentiment expressions
    • Occurrences in corpus: 10K
    • 13% are multi-word
      • like no other, get up and go
    • 49% are headed by adjectives
    • 22% nouns (damage, good amount)
    • 20% verbs (likes, upset)
    • 5% adverbs (highly)
  • Sentiment expressions
    • 75% of sentiment expressions also have non-evaluative uses in the corpus
    • “light”
      • … the car seemed too light to be safe…
      • … vehicles in the light truck category…
    • 77% of sentiment expression occurrences are positive
    • Inter-annotator agreement:
      • 75% spans, 66% targets, 95% prior polarity
  • Modifiers -> contextual polarity (a sketch of how these compose follows below)
    • NEGATORS: not a good car; not a very good car
    • INTENSIFIERS: a very good car (UPWARD); a kind of good car (DOWNWARD)
    • NEUTRALIZERS: if the car is good; I hope the car is good
    • COMMITTERS: I am sure the car is good (UPWARD); I suspect the car is good (DOWNWARD)
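The slides name the four modifier classes but no composition procedure; the following sketch assumes a numeric prior-polarity score that each modifier adjusts, with illustrative multipliers that are not the authors':

```python
# Hypothetical composition of contextual polarity from a prior polarity
# plus a chain of modifiers (assumed scoring scheme).

def contextual_polarity(prior, modifiers):
    """prior: score in [-1, 1]; modifiers: list of (kind, direction)."""
    score = prior
    for kind, direction in modifiers:
        if kind == "negator":
            score = -score                          # "not a good car"
        elif kind == "intensifier":
            score *= 1.5 if direction == "up" else 0.5
        elif kind == "neutralizer":
            score = 0.0                             # "if the car is good"
        elif kind == "committer":
            score *= 1.2 if direction == "up" else 0.8
    return score

# "not a very good car": intensify "good" upward, then negate.
print(contextual_polarity(1.0, [("intensifier", "up"), ("negator", None)]))
# -1.5 (strongly negative)
```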
  • Other annotations
    • Speech events (opinions attributed to a source other than the author)
      • John thinks the car is good.
    • Comparisons:
      • Car X has a better engine than car Y.
      • Handles a variety of cases (a sketch of the representation follows below)
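As an assumed representation of a comparison annotation, reusing the MORE/LESS/DIMENSION labels from the motivating example (the class itself is hypothetical):

```python
from dataclasses import dataclass

@dataclass
class Comparison:
    """A comparison annotation: the entity on the MORE side, the entity
    on the LESS side, and the DIMENSION along which they are compared."""
    more: str       # entity ranked higher, e.g. "car X"
    less: str       # entity ranked lower, e.g. "car Y"
    dimension: str  # what is compared, e.g. "engine"

# "Car X has a better engine than car Y."
c = Comparison(more="car X", less="car Y", dimension="engine")
```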
  • Outline
    • Motivating example
    • Overview of annotation types
      • Some statistics
    • Potential uses of corpus
    • Comparison to other resources
  • Possible tasks
    • Detecting mentions, sentiment expressions, and modifiers
    • Identifying targets of sentiment expressions and modifiers
    • Coreference resolution
    • Finding part-of, feature-of, etc. relations
    • Identifying errors/inconsistencies in data
  • Possible tasks
    • Exploring how elements interact:
      • Some idiot thinks this is a good car.
    • Evaluating unsupervised sentiment systems or those trained on other domains
    • How do relations between entities transfer sentiment?
      • The car’s paint job is flawless but the safety record is poor.
    • Solution to one task may be useful in solving another.
  • But wait, there’s more!
    • 180 digital camera blog posts were annotated
    • Total of 223,001 + 108,593 = 331,594 tokens
  • Outline
    • Motivating example
      • Elements combine to render entity-level sentiment
    • Overview of annotation types
      • Some statistics
    • Potential uses of corpus
    • Comparison to other resources
  • Other resources
    • MPQA Version 2.0
      • Wiebe, Wilson and Cardie (2005)
      • Largely professionally written news articles
      • Subjective expressions: “beliefs, emotions, sentiments, speculations, etc.”
      • Attitude and contextual sentiment annotated on subjective expressions
      • Target, source annotations
      • 226K tokens (JDPA: 332K)
  • Other resources
    • Data sets provided by Bing Liu (2004, 2008)
      • Customer-written consumer electronics product reviews
      • Contextual sentiment toward mention of product
      • Comparison annotations
      • 130K tokens (JDPA: 332K)
  • Thank you!
    • Obtaining the corpus:
      • Research and educational purposes
      • [email_address]
      • June 2010
      • Annotation guidelines:
        • http://www.cs.indiana.edu/~jaskessl
    • Thanks to: Prof. Michael Gasser, Prof. James Martin, Prof. Martha Palmer, Prof. Michael Mozer, William Headden
  • Top 20 annotations by type
  • Inter-annotator agreement