Technology Frontiers: Text, Sentiment, and Sense


Presentation by Seth Grimes at the Insight Innovation Exchange (IIEX) conference, June 17, 2013, in Philadelphia.

Published in: Technology, Education

  1. Technology Frontiers: Text, Sentiment, and Sense. Seth Grimes, @sethgrimes
  2. A Sensemaking Story. New York Times, September 30, 2012
  3. New York Times, September 8, 1957. Valium: A Chain of Connections
  4. Natural Language Processing. By H.P. Luhn, in IBM Journal, April 1958
  5. Modelling Text. “Statistical information derived from word frequency and distribution is used by the machine to compute a relative measure of significance, first for individual words and then for sentences. Sentences scoring highest in significance are extracted and printed out to become the auto-abstract.” -- H.P. Luhn, The Automatic Creation of Literature Abstracts, IBM Journal, 1958. Luhn’s analysis of Messengers of the Nervous System, a Scientific American article, applied to the NY Times article
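The scoring idea in Luhn's quote can be sketched in a few lines of Python. This is a simplified illustration, not Luhn's actual procedure: it scores each sentence by the average corpus frequency of its non-stop words, whereas the 1958 paper uses a more elaborate significance measure. The stop-word list, the function names, and the sample text are assumptions made for this example.

```python
import re
from collections import Counter

# Assumed, illustrative stop-word list (not taken from Luhn's paper).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are",
              "by", "for", "it", "that", "this", "with"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lower-case words with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOP_WORDS]


def auto_abstract(text, n_sentences=1):
    """Return the n highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    # Word significance: raw frequency across the whole text.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        # Sentence significance: mean frequency of its significant words.
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:n_sentences])]


sample = ("Nerve cells send chemical messengers. The weather was pleasant. "
          "Chemical messengers carry signals between nerve cells.")
print(auto_abstract(sample, n_sentences=1))
```

On the sample text, the off-topic weather sentence scores lowest because none of its words recur elsewhere, which is the behavior the quote describes.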
  6. New York Times, September 8, 1957. Luhn’s Example
  7. Close Reading
  8. Can Software Make the Connection? Mark Lombardi, George W. Bush, Harken Energy and Jackson Stephens, c. 1979-90, Detail
  9. Insight from Connections … via graphs, clusters, categories, and counts. … by mining the full set of available data.
  10. & Social Change Everything
  11. (Accessible) Data Everywhere
  12. Text Analytics. Lexical, syntactic, and semantic analysis discern features, including relationships, in source materials. Features = entities, measure-value pairs, concepts, topics, events, sentiment, and more. Text analytics may draw on:
      • Lexicons & taxonomies.
      • Statistics.
      • Patterns.
      • Linguistics.
      • Machine learning.
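Two of the listed techniques, lexicons and patterns, are easy to illustrate. The sketch below looks up entities in a small lexicon and pulls out measure-value pairs with a regular expression. The lexicon entries, the unit list, and the function name are hypothetical and chosen only for this example; a production system would use far larger resources and add the statistical and linguistic methods the slide mentions.

```python
import re

# Hypothetical mini-lexicon mapping surface forms to entity types.
LEXICON = {
    "ibm": "ORGANIZATION",
    "new york times": "ORGANIZATION",
    "valium": "DRUG",
}

# Pattern for measure-value pairs such as "10 mg" or "12.5 percent".
# The unit list is an assumption for this sketch.
MEASURE_PATTERN = re.compile(r"\b(\d+(?:\.\d+)?)\s*(mg|kg|percent)\b",
                             re.IGNORECASE)


def extract_features(text):
    """Return lexicon-matched entities and pattern-matched measures."""
    lowered = text.lower()
    entities = sorted(
        (term, label) for term, label in LEXICON.items()
        if re.search(r"\b" + re.escape(term) + r"\b", lowered)
    )
    measures = [(float(value), unit.lower())
                for value, unit in MEASURE_PATTERN.findall(text)]
    return {"entities": entities, "measures": measures}


print(extract_features(
    "The New York Times reported that a 10 mg dose of Valium was common."))
```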
  13. How?
  14. From POS to Relationships. Understand parts of speech (POS), e.g., <subject> <verb> <object>, to discern facts and relationships. Semantic networks such as WordNet are a disambiguation asset.
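The subject-verb-object idea can be shown with a toy example. The sketch below tags words from a tiny hand-written POS lexicon and reads off a triple from the first verb and its nearest nouns. The lexicon and function names are hypothetical; a real pipeline would use a trained tagger and parser, and the WordNet disambiguation step mentioned on the slide is not covered here.

```python
# Hypothetical toy POS lexicon; a real system would use a trained tagger.
POS_LEXICON = {
    "luhn": "NOUN", "ibm": "NOUN", "software": "NOUN",
    "abstracts": "NOUN", "text": "NOUN", "connections": "NOUN",
    "wrote": "VERB", "analyzes": "VERB", "finds": "VERB",
    "the": "DET", "a": "DET",
}


def tag(sentence):
    """Tag whitespace-separated tokens; unknown words get 'OTHER'."""
    tokens = sentence.rstrip(".").split()
    return [(token, POS_LEXICON.get(token.lower(), "OTHER"))
            for token in tokens]


def extract_svo(sentence):
    """Return (subject, verb, object) for a simple sentence, else None."""
    tagged = tag(sentence)
    verb_index = next(
        (i for i, (_, pos) in enumerate(tagged) if pos == "VERB"), None)
    if verb_index is None:
        return None
    # Subject: last noun before the verb; object: first noun after it.
    subjects = [t for t, pos in tagged[:verb_index] if pos == "NOUN"]
    objects = [t for t, pos in tagged[verb_index + 1:] if pos == "NOUN"]
    if not subjects or not objects:
        return None
    return (subjects[-1], tagged[verb_index][0], objects[0])


print(extract_svo("The software finds connections."))
```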
  15. Clustered Clarity. Carrot2 (open source)
  16. The Back End. Platforms and ecosystems. APIs and services. Text and content analytics: discerns and extracts features, including relationships, from source materials. Features = entities, key-value pairs, concepts, topics, events, sentiment, etc. Provide (for) BI on content-sourced data. Data integration, record linkage, data fusion.
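Record linkage, one of the back-end tasks named above, means deciding that two differently written records refer to the same thing. A minimal sketch using the standard library's `difflib.SequenceMatcher` follows. The normalization rule, the 0.8 threshold, and the sample names are assumptions for illustration; real linkage systems compare several fields and tune thresholds on labeled data.

```python
import re
from difflib import SequenceMatcher


def normalize(name):
    """Lower-case, drop punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    return " ".join(cleaned.split())


def similarity(a, b):
    """Character-level similarity in [0, 1] between normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def link_records(left, right, threshold=0.8):
    """Pair each left name with its best right match above the threshold."""
    links = []
    for candidate in left:
        best = max(right, key=lambda r: similarity(candidate, r),
                   default=None)
        if best is not None and similarity(candidate, best) >= threshold:
            links.append((candidate, best))
    return links


text_entities = ["I.B.M. Corp", "Harken Energy", "Valium"]
database_names = ["IBM Corp.", "Harken Energy Corp", "Acme"]
print(link_records(text_entities, database_names))
```

Names with no sufficiently close counterpart, such as "Valium" here, are left unlinked rather than forced onto a poor match.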
  17. Content, Composites, Connections
  18. Content, Composites, Connections, 2
  19. Social Sources
  20. Sentiment Analysis. “Sentiment analysis is the task of identifying positive and negative opinions, emotions, and evaluations.” -- Wilson, Wiebe & Hoffmann, 2005, “Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis”. “Sentiment analysis or opinion mining is the computational study of opinions, sentiments and emotions expressed in text… An opinion on a feature f is a positive or negative view, attitude, emotion or appraisal on f from an opinion holder.” -- Bing Liu, 2010, “Sentiment Analysis and Subjectivity,” in Handbook of Natural Language Processing
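Liu's definition suggests a simple record: an opinion has a holder, a feature f, and a polarity. The sketch below pairs that record with a deliberately naive lexicon-based polarity scorer. The word lists, the `Opinion` class, and the negation rule (a negator flips only the word immediately after it) are assumptions for illustration, not the method of either cited paper.

```python
import re
from dataclasses import dataclass

# Assumed, illustrative word lists.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "worse"}
NEGATORS = {"not", "never", "no"}


@dataclass
class Opinion:
    """An opinion on a feature from a holder, following Liu's wording."""
    holder: str
    feature: str
    polarity: str


def polarity(text):
    """Classify text as 'positive', 'negative', or 'neutral'."""
    score = 0
    negate = False
    for word in re.findall(r"[a-z']+", text.lower()):
        if word in NEGATORS:
            negate = True
            continue
        value = (word in POSITIVE) - (word in NEGATIVE)  # +1, -1, or 0
        if value:
            score += -value if negate else value
        negate = False  # a negator affects only the next word
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


review = "The battery is not good"
print(Opinion(holder="reviewer", feature="battery",
              polarity=polarity(review)))
```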
  21. Detection, Classification
  22. Beyond Polarity
  23. Intent Analysis
  24. Complications. Sentiment may be of interest at multiple levels:
      • Corpus / data space, i.e., across multiple sources.
      • Document.
      • Statement / sentence.
      • Entity / topic / concept.
      Human language is noisy and chaotic! Jargon, slang, irony, ambiguity, anaphora, polysemy, synonymy, etc. Context is key. Discourse analysis comes into play. Must distinguish the sentiment holder from the object: “Geithner said the recession may worsen.”
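In the Geithner example, the negative outlook belongs to the person quoted, not to the author of the sentence, and it is about the recession, not about Geithner. A rough way to separate the holder from the reported statement is a pattern over reporting verbs, sketched below. The verb list, the capitalized-name heuristic, and the function name are assumptions; they handle only simple sentences of this shape, and real systems rely on parsing and discourse analysis.

```python
import re

# Assumed pattern: a capitalized name, a reporting verb, then the statement.
REPORTING = re.compile(
    r"^(?P<holder>[A-Z]\w*(?:\s+[A-Z]\w*)*)\s+"
    r"(?:said|says|warned|believes|thinks)\s+"
    r"(?:that\s+)?(?P<statement>.+?)\.?$"
)


def split_holder(sentence):
    """Return (holder, statement) for 'X said ...' sentences, else None."""
    match = REPORTING.match(sentence.strip())
    if match is None:
        return None
    return match.group("holder"), match.group("statement")


print(split_holder("Geithner said the recession may worsen."))
```

Once the statement is isolated, its sentiment can be scored separately and attributed to the holder rather than to the document as a whole.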
  25. Audio including speech. Images. Video. Text
  26. Sensemaking. “It is convenient to divide the entire information access process into two main components: information retrieval through searching and browsing, and analysis and synthesis of results. This broader process is often referred to in the literature as sensemaking. Sensemaking refers to an iterative process of formulating a conceptual representation from a large volume of information. Search plays only one part in this process.” -- Marti Hearst, 2009
  27. Suggestions:
      • Apply new tech to old needs, e.g., automated coding.
      • Select from and use all available data.
      • Marry social to profiles and surveys.
      • Factor in behaviors.
      • Interpret according to context and needs.
      • Understand intent to create situational predictive models.
      • Explore; experiment.
  28. Racing On
  29. Technology Frontiers: Text, Sentiment, and Sense. Seth Grimes, @sethgrimes