Text Analytics Applied
Seth Grimes
Alta Plana Corporation
@sethgrimes
2nd LIDER roadmapping
workshop – Madrid
May 8, 2014

Text Analytics Applied (LIDER roadmapping presentation)

Presentation to the May 8 2014 LIDER roadmapping workshop in Madrid

Transcript of "Text Analytics Applied (LIDER roadmapping presentation)"

  1. Text Analytics Applied. Seth Grimes, Alta Plana Corporation, @sethgrimes. 2nd LIDER roadmapping workshop – Madrid, May 8, 2014
  2. “Organizations embracing text analytics all report having an epiphany moment when they suddenly knew more than before.” -- Philip Russom, the Data Warehousing Institute, 2007
     http://tdwi.org/articles/2007/05/09-what-works/bi-search-and-text-analytics.aspx
  3. [No slide text]
  4. Document input and processing. Knowledge handling is key.
     Desk Set (1957): Computer engineer Richard Sumner (Spencer Tracy), television network librarian Bunny Watson (Katharine Hepburn), and the "electronic brain" EMERAC.
     Hans Peter Luhn, “A Business Intelligence System,” IBM Journal, October 1958
  5. Statistics and semantics
     Text analytics involves statistical characterization and semantic understanding of text-derived features –
       • Named entities: people, companies, places, etc.
       • Pattern-based entities: e-mail addresses, phone numbers, etc.
       • Concepts: abstractions of entities.
       • Facts and relationships.
       • Events.
       • Concrete and abstract attributes (e.g., “expensive” & “comfortable”) including measure-value pairs.
       • Subjectivity in the forms of opinions, sentiments, and emotions: attitudinal data.
     – applied to business ends.
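As an illustration of the pattern-based entities mentioned on this slide, the following minimal Python sketch pulls e-mail addresses and phone numbers out of text with regular expressions. The two patterns, the function name `extract_pattern_entities`, and the sample sentence (including its address and phone number) are all made up for this example; the patterns are deliberately simplistic and would miss many real-world formats.

```python
import re

# Deliberately simple, illustrative patterns (assumptions of this sketch);
# real e-mail and phone formats are far more varied than these cover.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def extract_pattern_entities(text):
    """Return the e-mail addresses and phone numbers found in text."""
    return {
        "email": EMAIL_RE.findall(text),
        "phone": [match.strip() for match in PHONE_RE.findall(text)],
    }


# Hypothetical sample text; the address and number are invented.
sample = "Contact jane.doe@example.com or call +1 202 555 0143 before May 8."
entities = extract_pattern_entities(sample)
print(entities)
```

Named entities, concepts, and sentiment generally cannot be captured this way; they usually call for lexicons, linguistic rules, or statistical models.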
  6. Sources
     It’s a truism that 80% of enterprise-relevant information originates in “unstructured” form:
       • E-mail and messages.
       • Web pages, online news & blogs, forum postings, and other social media.
       • Contact-center notes and transcripts.
       • Surveys, feedback forms, warranty claims.
       • Scientific literature, books, legal documents.
       • ...
     Non-text “unstructured” content?
       • Images
       • Audio including speech
       • Video
     Value derives from patterns.
  7. Value
     What do we do with information online, on-social, and in the enterprise?
       1. Post/Publish, Manage, and Archive.
       2. Index and Search.
       3. Categorize and Classify according to metadata & contents.
       4. Extract and Analyze.
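The "Index and Search" step can be sketched with a toy inverted index: each token maps to the set of documents that contain it, and a query returns the documents containing all of its tokens. This is a minimal illustration, not how any particular search engine works; the function names and the three sample documents are invented for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query token (boolean AND)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical documents, invented for illustration.
docs = {
    "d1": "Warranty claim: the battery failed after two weeks.",
    "d2": "Survey feedback praised the battery life.",
    "d3": "Contact-center note about a delayed warranty refund.",
}
index = build_index(docs)
print(sorted(search(index, "battery")))
print(sorted(search(index, "warranty battery")))
```

Steps 3 and 4 on the slide go beyond this kind of matching: they assign categories to documents and pull structured features out of the text.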
  8. Semantics, analytics, and IR
     Text analytics generates semantics to bridge search, BI, and applications, enabling next-generation information systems.
     Diagram labels:
       • Search
       • BI/Big Data
       • Applications
       • Search based applications (search + text + apps)
       • Information access (search + analytics)
       • Synthesis (text + BI)/(big data)
       • Text analytics (inner circle)
       • Semantic search (search + text)
       • NextGen CRM, EFM, MR, marketing, apps…
  9. New York Times, September 8, 1957
  10. Structure matters
      http://open.blogs.nytimes.com/2012/02/16/rnews-is-here-and-this-is-what-it-means/

      <div itemscope itemtype="http://schema.org/Organization">
        <span itemprop="name">Google.org (GOOG)</span>
        Contact Details:
        <div itemprop="address" itemscope itemtype="http://schema.org/PostalAddress">
          Main address:
          <span itemprop="streetAddress">38 avenue de l'Opera</span>
          <span itemprop="postalCode">F-75002</span>
          <span itemprop="addressLocality">Paris, France</span>
          ,
        </div>
        Tel:<span itemprop="telephone">( 33 1) 42 68 53 00 </span>,
        Fax:<span itemprop="faxNumber">( 33 1) 42 68 53 01 </span>,
        E-mail: <span itemprop="email">secretariat(at)google.org</span>
      </div>

      http://schema.org/Organization
      http://img.freebase.com/api/trans/raw/m/02dtnzv
      http://www.cambridgesemantics.com/semantic-university/semantic-search-and-the-semantic-web
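To show why such markup matters to software, here is a minimal Python sketch that reads the `itemprop` values out of an abbreviated copy of the slide's schema.org snippet, using only the standard library's `html.parser`. The class name `ItempropCollector` is hypothetical, and the approach is a simplification: it flattens nested items and assumes each property element contains only text. It is not a full microdata parser.

```python
from html.parser import HTMLParser


class ItempropCollector(HTMLParser):
    """Collect the text of elements that carry an itemprop attribute.

    Simplifications (assumptions of this sketch): nested items are
    flattened, and each property element is assumed to hold only text.
    """

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # name of the itemprop being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # An element that opens a nested item has no text value of its own.
        if "itemprop" in attrs and "itemscope" not in attrs:
            self._current = attrs["itemprop"]
            self.properties[self._current] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        if self._current is not None:
            value = self.properties[self._current]
            self.properties[self._current] = value.strip()
            self._current = None


# Abbreviated version of the markup shown on the slide.
markup = """
<div itemscope itemtype="http://schema.org/Organization">
  <span itemprop="name">Google.org (GOOG)</span>
  <div itemprop="address" itemscope itemtype="http://schema.org/PostalAddress">
    <span itemprop="streetAddress">38 avenue de l'Opera</span>
    <span itemprop="postalCode">F-75002</span>
    <span itemprop="addressLocality">Paris, France</span>
  </div>
  E-mail: <span itemprop="email">secretariat(at)google.org</span>
</div>
"""

parser = ItempropCollector()
parser.feed(markup)
parser.close()
properties = parser.properties
print(properties)
```

Because the properties are labeled in the markup itself, no entity extraction is needed to tell the organization name from the postal code.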
  11. Exploratory analysis, synthesis
      Decisive Analytics, http://www.dac.us/
  12. A big data analytics architecture (example)
      http://www.geeklawblog.com/2011/12/lexis-advance-platform-launch-two.html
  13. Applications
      Synthesis is cool, but let’s take a step back…
      Text analytics has applications in:
        • Intelligence & law enforcement.
        • Life sciences & clinical medicine.
        • Media & publishing including social-media analysis and contextual advertising.
        • Competitive intelligence.
        • Voice of the Customer: CRM, product management & marketing.
        • Public administration & policy.
        • Legal, tax & regulatory (LTR) including compliance.
        • Recruiting.
  14. Sentiment analysis
      A specialization, of relevance to:
        • Brand/reputation management.
        • Customer experience management (CEM).
        • Competitive intelligence.
        • Survey analysis (EFM).
        • Market research.
        • Product design/quality.
        • Trend spotting.
  15. http://altaplana.com/TA2014
  16. What are your primary applications where text comes into play?
        • Military/national security/intelligence: 5%
        • Law enforcement: 6%
        • Intellectual property/patent analysis: 8%
        • Financial services/capital markets: 9%
        • Product/service design, quality assurance, or warranty claims: 10%
        • Other: 11%
        • Insurance, risk management, or fraud: 13%
        • E-discovery: 14%
        • Life sciences or clinical medicine: 15%
        • Online commerce including shopping, price intelligence, reviews: 16%
        • Content management or publishing: 25%
        • Customer/CRM: 27%
        • Search, information access, or Question Answering: 29%
        • Competitive intelligence: 33%
        • Brand/product/reputation management: 38%
        • Research (not listed): 38%
        • Voice of the Customer / Customer Experience Management: 39%
  17. Voice of the Customer
      Text analytics is applied to improve customer service and boost satisfaction and loyalty.
      Analyze customer interactions and opinions –
        • E-mail, contact-center notes, survey responses.
        • Forum & blog postings and other social media.
      – to –
        • Address customer product & service issues.
        • Improve quality.
        • Manage brand & reputation.
      Assessment of qualitative information from text helps users –
        • Gain feedback on interactions.
        • Assess customer value.
        • Understand root causes.
        • Mine data for measures such as churn likelihood.
  18. Online commerce
      Text analytics is applied for marketing, search optimization, and competitive intelligence.
      Analyze social media and enterprise feedback to understand the Voice of the Market:
        • Opportunities
        • Threats
        • Trends
      Categorize product and service offerings for on-site search and faceted navigation and to enrich content delivery.
      Annotate pages to enhance Web-search findability and ranking.
      Scrape competitor sites for offers and pricing.
      Analyze social and news media for competitive information.
  19. E-Discovery and compliance
      Text analytics is applied for compliance, fraud and risk, and e-discovery.
      Regulatory mandates and corporate practices dictate –
        • Monitoring corporate communications.
        • Managing electronically stored information for production in the event of litigation.
      Sources include e-mail (!!), news, social media.
      Risk avoidance and fraud detection are key to effective decision making.
        • Text analytics mines critical data from unstructured sources.
        • Integrated text-transactional analytics provides rich insights.
  20. What textual information are you analyzing or do you plan to analyze?
        • insurance claims or underwriting notes: 5%
        • point-of-service notes or transcripts: 5%
        • video or animated images: 5%
        • warranty claims/documentation: 5%
        • photographs or other graphical images: 7%
        • crime, legal, or judicial reports or evidentiary materials: 9%
        • field/intelligence reports: 11%
        • speech or other audio: 11%
        • patent/IP filings: 12%
        • other: 12%
        • text messages/instant messages/SMS: 12%
        • medical records: 13%
        • Web-site feedback: 16%
        • social media not listed above: 19%
        • chat: 20%
        • employee surveys: 20%
        • contact-center notes or transcripts: 22%
        • e-mail and correspondence: 26%
        • online reviews: 31%
        • scientific or technical literature: 31%
        • Facebook postings: 32%
        • on-line forums: 36%
        • customer/market surveys: 37%
        • comments on blogs and articles: 38%
        • news articles: 42%
        • blogs (long form) including Tumblr: 43%
        • Twitter, Sina Weibo, or other microblogs: 46%
  21. What textual information are you analyzing or do you plan to analyze?
      Legend: 2014, 2011, 2009. Transcribed values:
        • Web-site feedback: 16%
        • social media not listed above: 19%
        • chat: 20%
        • employee surveys: 20%
        • contact-center notes or transcripts: 22%
        • e-mail and correspondence: 26%
        • online reviews: 31%
        • scientific or technical literature: 31%
        • Facebook postings: 32%
        • on-line forums: 36%
        • customer/market surveys: 37%
        • comments on blogs and articles: 38%
        • news articles: 42%
        • blogs (long form) including Tumblr: 43%
        • Twitter, Sina Weibo, or other microblogs: 46%
  22. Do you currently need (or expect to need) to extract or analyze...
        • Events: Current 33%, Expect 21%
        • Semantic annotations: Current 31%, Expect 24%
        • Other entities – phone numbers, part/product numbers, e-mail & street addresses, etc.: Current 34%, Expect 23%
        • Metadata such as document author, publication date, title, headers, etc.: Current 47%, Expect 23%
        • Concepts, that is, abstract groups of entities: Current 51%, Expect 28%
        • Named entities – people, companies, geographic locations, brands, ticker symbols, etc.: Current 56%, Expect 25%
        • Relationships and/or facts: Current 47%, Expect 33%
        • Sentiment, opinions, attitudes, emotions, perceptions, intent: Current 54%, Expect 28%
        • Topics and themes: Current 66%, Expect 22%
  23. What is important in a solution?
        • export to Semantic Web formats…: 16%
        • frontline voice of the customer (VOC) system integration: 18%
        • media monitoring/analysis interface: 22%
        • hosted or Web service (on-demand "API") option: 25%
        • supports data fusion / unified analytics: 28%
        • sector adaptation (e.g., hospitality, insurance, retail, health…: 30%
        • BI (business intelligence) integration: 32%
        • ability to create custom workflows or to create or change…: 33%
        • big data capabilities, e.g., via Hadoop/MapReduce: 33%
        • predictive-analytics integration: 36%
        • open source: 37%
        • support for multiple languages: 40%
        • sentiment scoring: 41%
        • "real time" capabilities: 43%
        • low cost: 44%
        • deep sentiment/emotion/opinion/intent extraction: 45%
        • document classification: 53%
        • broad information extraction capability: 53%
        • ability to use specialized…: 54%
        • ability to generate categories or taxonomies: 64%
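  24. Non-English language support?
        • Arabic: Current 10%, Within 2 years 17%
        • Bahasa Indonesia or Malay: Current 1%, Within 2 years 3%
        • Chinese: Current 16%, Within 2 years 28%
        • Dutch: Current 9%, Within 2 years 7%
        • French: Current 36%, Within 2 years 17%
        • German: Current 34%, Within 2 years 24%
        • Greek: Current 2%, Within 2 years 2%
        • Hindi, Urdu, Bengali, Punjabi, or other…: Current 2%, Within 2 years 10%
        • Italian: Current 18%, Within 2 years 11%
        • Japanese: Current 7%, Within 2 years 15%
        • Korean: Current 4%, Within 2 years 8%
        • Polish: Current 3%, Within 2 years 4%
        • Portuguese: Current 13%, Within 2 years 17%
        • Russian: Current 8%, Within 2 years 21%
        • Scandinavian or Baltic: Current 7%, Within 2 years 3%
        • Spanish: Current 38%, Within 2 years 20%
        • Turkish or Turkic: Current 3%, Within 2 years 4%
        • Other African: Current 2%, Within 2 years 0%
        • Other Arabic script (including…: Current 3%, Within 2 years 1%
        • Other East Asian: Current 2%, Within 2 years 1%
        • Other European or Slavic/Cyrillic: Current 5%, Within 2 years 2%
        • Other: Current 9%, Within 2 years 0%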
  25. Software & platform options
      Text-analytics options may be grouped in general classes.
        • Installed text-analysis application, whether desktop or server or deployed in-database.
        • Data mining workbench.
        • Hosted.
        • Programming tool.
        • As-a-service, via an application programming interface (API).
        • Code library or component of a business/vertical application, for instance for CRM, e-discovery, search.
      Text analytics is frequently embedded in search or other end-user applications.
      The slides that follow next will present leading options in each category except Hosted…
  26. User decision criteria
      Primary considerations include –
        • Adaptation or specialization: To a business or cultural domain, language, information type (e.g., text, speech, images) & source (e.g., Twitter, e-mail, online news).
        • By-user customization possibilities: For instance, via custom taxonomies, rules, lexicons.
        • Sentiment resolution: Aggregate, message, or feature level. (What features? Topics, coreferenced entities?) What sentiment? Valence & what else? Emotion? Intent?
        • Outputs: E.g., annotated text, models, indicators, dashboards, exploratory data interfaces.
        • Usage mode: As-a-service (API), installed, or hosted/cloud.
        • Capacity: Volume, performance, throughput, latency.
        • Cost.
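The difference between message-level and aggregate-level sentiment resolution can be sketched with a toy lexicon-based scorer in Python. The lexicon, the function names, and the sample messages are invented for illustration. Real sentiment engines handle negation, intensity, and context, and feature-level resolution would additionally tie each score to a topic or entity.

```python
import re

# Toy lexicon, invented for illustration only; real systems use far larger
# lexicons or trained models and account for negation and context.
LEXICON = {
    "great": 1, "comfortable": 1, "love": 1,
    "expensive": -1, "slow": -1, "broken": -1,
}


def message_sentiment(message):
    """Message-level resolution: sum the valence of known words."""
    tokens = re.findall(r"[a-z']+", message.lower())
    return sum(LEXICON.get(token, 0) for token in tokens)


def aggregate_sentiment(messages):
    """Aggregate-level resolution: mean of the message-level scores."""
    scores = [message_sentiment(m) for m in messages]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical customer messages.
messages = [
    "Great seats, very comfortable.",
    "Too expensive and the service was slow.",
    "I love the new app.",
]
scores = [message_sentiment(m) for m in messages]
print(scores)
print(aggregate_sentiment(messages))
```

The aggregate figure hides that one of the three messages is clearly negative, which is one reason the resolution level matters when comparing tools.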
  27. Linked Data Links?
  28. Text Analytics Applied. Seth Grimes, Alta Plana Corporation, @sethgrimes. 2nd LIDER roadmapping workshop – Madrid, May 8, 2014