Stacy Faught
IT Business Solutions
11/18/2016
Introduction
GOAL OF THIS PRESENTATION:
Define text analytics capabilities to
build the business case for
creating the service.
Ontology of Text Analytics BUZZ Words
[Concept map. Terms: Text Analytics, Text Mining, Big Data, Linguistics, Machine Learning, NLP, Faceted Search, Taxonomy, Thesaurus, Ontology, Semantic Network, Syntax, Semantics, Morphology, Disambiguation, Entity Extraction, Sentiment. Relationship labels: associated with, relies upon, sub-discipline of, works together with, technology tools for, enables, synonymous with, more complex, used to build.]
Beginnings
Text analytics leverages
and learns from massive
quantities of textual
data to reveal customer
intentions and
sentiment…
Text Analytics World
Gaining Interest
Text Analytics is the
process of deriving
information from text
sources.
Gartner
Text Analytics is a
method for making
unstructured content
useful and accessible.
Expert System
It’s All About the CONCEPT
Current State - Keyword Searching
All these documents contain the
keywords “big cat oil discoveries”.
Read ALL the documents to find the
ones relevant to you.
Keyword Search Not Always Enough
Advantages
• Speculative searching (e.g., “Where are the best tacos in Houston?”)
• Finding general information (e.g., the address of Access Sciences’ website)
Disadvantages
• Large result sets mean not enough time to read all documents
• Noisy and irrelevant hits have to be filtered out
• Narrowing the question may mean missing a key result
• Have to type in all variants of a term (e.g., “significant oil discovery”, “large oil find”, “>200M barrels”)
What Text Analytics Does
TYPICAL SEARCH
INDEX
Text Analytics occurs here
Analyzes content & extracts meaningful metadata
Entities
Themes
Sentiment
SMARTER Searching
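The idea of running text analytics at index time can be pictured with a minimal sketch: each document is analyzed once as it is indexed, and the extracted metadata is stored so that searches can use it later. The `KNOWN_ORGS` and `THEMES` lookups and the `analyze` and `build_index` functions are hypothetical stand-ins for illustration, not any vendor's API, and sentiment is left out to keep the sketch short.

```python
from collections import defaultdict

# Hypothetical lookup tables; a real system would use linguistic models.
KNOWN_ORGS = {"Total", "Shell"}
THEMES = {"oil discovery": ["oil find", "discovered oil"]}


def analyze(text):
    """Extract simple metadata (entities and themes) from one document."""
    entities = sorted(org for org in KNOWN_ORGS if org in text)
    lowered = text.lower()
    themes = sorted(
        theme for theme, cues in THEMES.items()
        if any(cue in lowered for cue in cues)
    )
    return {"entities": entities, "themes": themes}


def build_index(docs):
    """Map each metadata value to the ids of the documents that carry it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        meta = analyze(text)
        for value in meta["entities"] + meta["themes"]:
            index[value].add(doc_id)
    return index


docs = {
    1: "Shell has discovered oil on three big cat prospects offshore Nigeria.",
    2: "Firm makes major Gatwick oil find.",
    3: "Big cat sanctuary opens near Houston.",
}
index = build_index(docs)
print(sorted(index["oil discovery"]))
```

A search for the theme "oil discovery" can then return documents 1 and 2 even though neither contains that exact phrase, while the sanctuary story is left out.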
How it works
Concept: big cat
Definition: Carnivorous mammal
Child: Jaguar
Relationship: Wild
Synonym: Feline
Constraint: Not domestic
Parent: Mammal

Concept: big cat
Definition: Caterpillar machinery
Child: Drilling equipment
Relationship: Mining
Synonym: Heavy equipment
Constraint: Not Hitachi
Parent: Equipment

Concept: big cat discovery
Definition: Large oil discovery
Child: Shale big cat
Relationship: Elephant
Synonym: Significant
Constraint: >200Mboe
Parent: Oil discovery
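The concept definitions on this slide could be held in a small data structure and used to pick the intended sense of a term from its context. The following is only a sketch under stated assumptions: the `Concept` class, the `context_cues` field, and the `disambiguate` function are invented for illustration, and the word-overlap scoring is a deliberately simple stand-in for what a commercial tool would do.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One sense of a term, mirroring the fields shown on the slide."""
    term: str
    definition: str
    parent: str
    constraint: str
    children: list = field(default_factory=list)
    synonyms: list = field(default_factory=list)
    # Assumption: words whose presence hints at this sense.
    context_cues: set = field(default_factory=set)


SENSES = [
    Concept("big cat", "Carnivorous mammal", "Mammal", "Not domestic",
            ["Jaguar"], ["Feline"], {"wild", "jaguar", "feline", "mammal"}),
    Concept("big cat", "Caterpillar machinery", "Equipment", "Not Hitachi",
            ["Drilling equipment"], ["Heavy equipment"],
            {"mining", "equipment", "caterpillar", "machinery"}),
    Concept("big cat discovery", "Large oil discovery", "Oil discovery", ">200Mboe",
            ["Shale big cat"], ["Significant"],
            {"oil", "barrels", "discovery", "prospects", "shale"}),
]


def disambiguate(text, senses=SENSES):
    """Pick the sense whose cue words overlap most with the text."""
    words = set(text.lower().replace(",", " ").replace(".", " ").split())
    return max(senses, key=lambda sense: len(sense.context_cues & words))


sentence = "Shell has discovered oil on three big cat prospects offshore Nigeria."
print(disambiguate(sentence).definition)
```

The parent, child, and synonym fields are what would let a search on the oil sense also return related terms, while the constraint field records what the sense excludes.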
Interpreting the meaning of text
• Groups words into meaningful units
• Searches for different forms of words (morphology)
• Searches for words with semantic relationships
Sentences
• Noun groups: match entities
• Verb groups: match actions
• Morphology: match different forms
• Semantics: match related meanings
Total has confirmed just one “big cat” -- with more than 200 million barrels -- in Bolivia in May 2011 that extends a 2004 discovery.
Shell has discovered oil on three big cat prospects offshore Nigeria, plus a large gas-condensate field in the Norwegian Sea.
Firm makes major Gatwick oil find.
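Matching different word forms and related meanings can be illustrated with a toy normalizer. This is a rough sketch, not how a production linguistic engine works: the `RELATED` table is a hypothetical stand-in for a thesaurus, and the suffix stripping in `stem` is a crude assumption that only handles a few English endings.

```python
import re

# Hypothetical synonym table standing in for a thesaurus or semantic network.
RELATED = {"find": "discovery", "strike": "discovery"}

SUFFIXES = ("ies", "ed", "es", "y", "s")


def stem(word):
    """Very crude suffix stripping; real tools use morphological analysis."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word


def normalize(word):
    """Reduce a word to a comparable form."""
    word = word.lower()
    word = RELATED.get(word, word)  # semantics: map related words to one term
    return stem(word)               # morphology: collapse different forms


def matches(query_word, sentence):
    """True if any word in the sentence normalizes to the same form as the query."""
    target = normalize(query_word)
    words = re.findall(r"[A-Za-z]+", sentence)
    return any(normalize(word) == target for word in words)


print(matches("discovery", "Firm makes major Gatwick oil find."))
```

With this normalization a query for "discovery" reaches "discovered", "discoveries", and "find", which are the variants a plain keyword search would require the user to type individually.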
Automatically Extract Known Entities
• Entity extraction
People
• Erle P. Halliburton
• Charles Holiday (Shell Chairman) vs. 4th of July Holiday
Organizations
• Total S.A. (French oil & gas co.) vs. total (adj. meaning entire)
• Saudi Aramco
• Royal Dutch Shell
• Exxon Mobil
Places
• Europe
• France
• Paris
• Monaco
• Tyrrhenian Sea
• Oceania
• Basins
• Plays
• Fields
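The "Total S.A. vs. total" distinction can be sketched with a small lookup list (a gazetteer) plus a context check for ambiguous names. Everything here is an assumption made for illustration: the `GAZETTEER` entries, the `ORG_CUES` words, and the `extract_entities` function are hypothetical, and real products rely on much richer linguistic context.

```python
import re

# Hypothetical gazetteer: entity name -> (type, is the name ambiguous?)
GAZETTEER = {
    "Total S.A.": ("Organization", False),
    "Total": ("Organization", True),  # also an ordinary adjective
    "Saudi Aramco": ("Organization", False),
    "Royal Dutch Shell": ("Organization", False),
    "France": ("Place", False),
    "Paris": ("Place", False),
}
# Assumption: these context words suggest the oil-company reading.
ORG_CUES = {"oil", "gas", "barrels", "discovery", "confirmed"}


def extract_entities(sentence):
    """Return (name, type) pairs found in the sentence, longest names first."""
    found = []
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    remaining = sentence
    for name in sorted(GAZETTEER, key=len, reverse=True):
        kind, ambiguous = GAZETTEER[name]
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        if not re.search(pattern, remaining):
            continue
        if ambiguous and not (words & ORG_CUES):
            continue  # e.g. "Total cost ..." stays an adjective
        found.append((name, kind))
        # Blank out the match so shorter names do not match inside it.
        remaining = re.sub(pattern, " ", remaining)
    return found


print(extract_entities("Total S.A. is based in France."))
```

Matching is case-sensitive, so a lowercase "total" is never treated as the company, and a capitalized "Total" at the start of a sentence is accepted only when an oil-related cue word appears nearby.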
Put the Puzzle Pieces Together…
• Faceted search
Concepts:
• Big Cat
• Discoveries
• Prospects
• Play
  – Shale
  – Conventional
Entities:
• Organizations
  – Shell
  – Total
• Places
  – Africa
  – S. America
Find the Missing Piece
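Once concepts and entities are attached to documents, faceted search amounts to counting and filtering on those values. The sketch below assumes documents already carry extracted metadata; the `DOCS` records are invented sample data (the facet values come from the slide, but their assignment to documents is made up), and `facet_counts` and `filter_docs` are hypothetical helper names.

```python
from collections import Counter

# Invented sample documents, already tagged with extracted metadata.
DOCS = [
    {"id": 1, "organization": "Total", "place": "S. America", "play": "Conventional"},
    {"id": 2, "organization": "Shell", "place": "Africa", "play": "Conventional"},
    {"id": 3, "organization": "Shell", "place": "S. America", "play": "Shale"},
]


def facet_counts(docs, facet):
    """Count how many documents fall under each value of one facet."""
    return Counter(doc[facet] for doc in docs)


def filter_docs(docs, **selected):
    """Keep only documents that match every selected facet value."""
    return [
        doc for doc in docs
        if all(doc.get(facet) == value for facet, value in selected.items())
    ]


print(facet_counts(DOCS, "organization"))
print([doc["id"] for doc in filter_docs(DOCS, organization="Shell", place="Africa")])
```

The counts are what a search interface would show beside each facet value, and each selection narrows the result set without the user rewording the query.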
Automatically Extract Relevant Facts
WHO             WHEN        WHERE
Total           May 2011    Bolivia
Shell           July 2005   Offshore Nigeria
UK Oil and Gas  April 2015  Sussex
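A row of the WHO / WHEN / WHERE table could be pulled from a sentence roughly as follows. This is a minimal sketch under stated assumptions: the `ORGANIZATIONS` and `PLACES` lists and the `extract_fact` function are hypothetical, and a real system would feed this step from entity extraction instead of fixed lists.

```python
import re

# Hypothetical lookups; a real system would use entity extraction instead.
ORGANIZATIONS = ["Total", "Shell"]
PLACES = ["Bolivia", "Nigeria", "Norwegian Sea"]
DATE = re.compile(
    r"\b(January|February|March|April|May|June|July|August|"
    r"September|October|November|December) \d{4}\b"
)


def extract_fact(sentence):
    """Return the first organization, month-year date, and place in a sentence."""
    who = next(
        (org for org in ORGANIZATIONS
         if re.search(rf"\b{re.escape(org)}\b", sentence)),
        None,
    )
    where = next((place for place in PLACES if place in sentence), None)
    date = DATE.search(sentence)
    return {"who": who, "when": date.group(0) if date else None, "where": where}


sentence = (
    'Total has confirmed just one "big cat" -- with more than 200 million '
    "barrels -- in Bolivia in May 2011 that extends a 2004 discovery."
)
print(extract_fact(sentence))
```

A sentence with no month and year simply leaves the "when" field empty, which is one reason extracted facts usually need review before they are relied on.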
Use Cases
• Auto Classification
• Competitive Intel
Auto Classification
• In its simplest form
Internal Data Sources
• In Place
  – Share Drives
  – Legacy Datasets
• Migrations
  – Consolidation/Expansion
  – Mergers and Acquisitions
Competitive Intel
External Data Sources
• Public Domain
• Industry Publications
• Regulatory Reporting
Industries and Drivers
• Oil & Gas
• Pharmaceuticals
• Government Agencies
• Legal
Competitive Intel
Research
Knowledge Areas/Roles
• Text Mining
• Library Sciences
• Taxonomy
• Linguistics
• Foreign Language
• Technology
• Domain Expert
Technology Tools
• Expert System Cogito
• conceptSearching
• Smartlogic Semaphore
• HP Autonomy
• Linguamatics I2E
• SAS Text Analytics Suite
• IBM LanguageWare / Content Analytics
• Lexalytics Text Analytics
• Provalis Research QDA Miner / WordStat
• PingarAPI
• AlchemyAPI
• Content Analyst
• Angoss KnowledgeREADER
• NetOwl
• Language Computer Corp.
• Basis Technology
• MeaningCloud
• Forest Rim’s Textual ETL
Q & A

Text Analytics Presentation LinkedIn
