Arcomem training simple-text-mining_beginner


This presentation on Text Mining is part of the ARCOMEM training curriculum. Feel free to roam around or contact us on Twitter via @arcomem to learn more about ARCOMEM training on archiving Social Media.

Published in: Technology, Education

  1. University of Sheffield, NLP
     Entity, Event and Opinion Recognition in ARCOMEM
     University of Sheffield, UK
     © The University of Sheffield, 1995-2010. This work is licenced under the Creative Commons Attribution-NonCommercial-ShareAlike Licence.
  2. What is Entity Recognition?
     Entity Recognition is about recognising and classifying key Named Entities and terms in the text.
     ● A Named Entity is a Person, Location, Organisation, Date etc.
     ● A term is a key concept or phrase that is representative of the text.
     ● Entities and terms may be described in different ways but refer to the same thing. We call this co-reference.
     Example: "Mitt Romney, the favorite to win the Republican nomination for president in 2012" and "The GOP tweeted that they had knocked on 75,000 doors in Ohio the day prior."
     Annotations shown on the slide: Person (Mitt Romney), Date (2012), Term, Organisation (the GOP), Location (Ohio), co-reference.
  3. What is Event Recognition?
     An event is an action or situation relevant to the domain, expressed by some relation between entities or terms. It is always grounded in time, e.g. the performance of a band, an election, the death of a person.
     Example: "Mitt Romney, the favorite to win the Republican nomination for president in 2012"
     Annotations shown on the slide: Person (Mitt Romney), Date (2012), Event, and two Relations.
  4. Why are Entities and Events Useful?
     They can help answer the "Big 5" journalism questions (who, what, when, where, why).
     They can be used to categorise the texts in different ways
     ● look at all texts about Obama
     They can be used as targets for opinion mining
     ● find out what people think about President Obama
     When linked to an ontology and/or combined with other information, they can be used for reasoning about things not explicit in the text
     ● seeing how opinions about different American presidents have changed over the years
  5. Opinions
     Opinion mining is about finding out what people think.
     We analyse the texts and classify opinionated statements with:
     ● a polarity (positive or negative)
     ● a score (strength of opinion)
     ● a target (which entity or event the opinion is about)
     Examples:
     ● "Romney was the perfect candidate, and he was the President this country needs." (a positive opinion about Romney)
     ● "Such apathy among the Republican volunteers is disgusting." (a negative opinion about the Republican volunteers)
  6. Finding Opinions is not trivial
     We can use sentiment dictionaries to look up words like "disgusting" and "perfect" and match them to a sentiment. But this isn't enough on its own.
     ● We have to make sure to match the sentiment to the correct target (entity).
     ● We have to deal with negative words and their scope: "happy" and "not happy" have opposite sentiment, but "not great" does not imply negative sentiment.
     ● We have to deal with things like sarcasm, especially in tweets: "Aahh how sweet it is to wake up to ignorance and stupidity :-)"
  7. Why do we want to find opinions?
     Opinion mining allows us to answer questions such as:
     • What are the opinions on crucial social events and the key people involved?
     • How are these opinions distributed in relation to demographic user data?
     • How have these opinions evolved?
     • Who are the opinion leaders?
     • What is their impact and influence?
  8. Try out some opinion mining
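
To make the dictionary-based approach from the slides concrete, here is a minimal Python sketch. It is not the ARCOMEM pipeline: the lexicon, its scores, the negator list, and the names LEXICON, NEGATORS, Opinion, sentence_score and find_opinion are all assumptions made up for illustration. It looks words up in a toy sentiment dictionary, flips the sign of a word that directly follows a negator, and attaches the result to the first known target mentioned in the sentence.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Toy sentiment dictionary. The words come from the slide examples,
# but the scores are invented for this sketch.
LEXICON = {
    "perfect": 1.0, "great": 0.8, "happy": 0.6, "sweet": 0.5,
    "disgusting": -1.0, "stupidity": -0.8, "ignorance": -0.6, "apathy": -0.5,
}
# Assumed negators; the negation scope here is only the next word.
NEGATORS = {"not", "never", "no"}


@dataclass
class Opinion:
    target: str    # which entity or event the opinion is about
    polarity: str  # "positive" or "negative"
    score: float   # strength of the opinion


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentence_score(text: str) -> float:
    """Sum the lexicon scores, flipping a word that directly follows a negator."""
    tokens = tokenize(text)
    total = 0.0
    for i, token in enumerate(tokens):
        if token in LEXICON:
            value = LEXICON[token]
            if i > 0 and tokens[i - 1] in NEGATORS:
                value = -value
            total += value
    return total


def find_opinion(text: str, targets: List[str]) -> Optional[Opinion]:
    """Return an Opinion about the first target found in the text, if any."""
    score = sentence_score(text)
    if score == 0:
        return None  # no sentiment words found, or they cancelled out
    lowered = text.lower()
    target = next((t for t in targets if t.lower() in lowered), None)
    if target is None:
        return None  # sentiment found, but nothing to attach it to
    polarity = "positive" if score > 0 else "negative"
    return Opinion(target=target, polarity=polarity, score=abs(score))


if __name__ == "__main__":
    known_targets = ["Romney", "Republican volunteers"]
    for sentence in [
        "Romney was the perfect candidate, and he was the President this country needs.",
        "Such apathy among the Republican volunteers is disgusting.",
    ]:
        print(find_opinion(sentence, known_targets))
```

The sketch also shows why the slides call the task non-trivial. The one-word negation rule marks "not great" as negative, which the slides say does not follow; picking the first matching target can attach sentiment to the wrong entity when a sentence mentions several; and a sarcastic tweet is scored purely on its surface words.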