PART 1




  Deep Parsing

          Craig Trim / craigtrim@gmail.com / CCA 3.0
SBARQ = Direct Question (introduced by a wh-word), WHADVP = Wh-Adverb Phrase, SQ = Inverted Yes/No Question, WRB = Wh-Adverb, VBP = Present Tense Verb
NP = Noun Phrase, VP = Verb Phrase, PP = Prepositional Phrase, VB = Verb, PRP = Personal Pronoun
IN = Preposition, NNS = Plural Noun
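
The parse tree on these slides is an image and did not survive in the transcript, so the following is only a rough sketch of how the labels above fit together. The sentence and its bracketing are hypothetical (chosen to use the same tags), and `read_tree` and `tagged_tokens` are illustrative helper names, not part of any library.

```python
# Glossary of the Penn Treebank part-of-speech tags listed on the slides.
POS_TAGS = {
    "WRB": "wh-adverb",
    "VBP": "verb, present tense (non-3rd person singular)",
    "PRP": "personal pronoun",
    "VB": "verb, base form",
    "IN": "preposition",
    "NNS": "noun, plural",
}

# Hypothetical bracketed parse; NOT the sentence used in the original deck.
PARSE = (
    "(SBARQ (WHADVP (WRB How)) "
    "(SQ (VBP do) (NP (PRP I)) "
    "(VP (VB install) (NP (NNS drivers)) "
    "(PP (IN on) (NP (NNS servers))))))"
)


def read_tree(text):
    """Parse a bracketed string into (label, children) tuples; leaves are str."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                      # skip ")"
        return (label, children)

    return node()


def tagged_tokens(tree):
    """Yield (word, pos_tag) pairs from the leaves, left to right."""
    label, children = tree
    for child in children:
        if isinstance(child, str):
            yield (child, label)
        else:
            yield from tagged_tokens(child)


tree = read_tree(PARSE)
for word, tag in tagged_tokens(tree):
    print(f"{word:10} {tag:4} {POS_TAGS[tag]}")
```

The structural labels (SBARQ, SQ, NP, VP, PP) are the inner nodes of the tuple tree, while the part-of-speech tags sit directly above the tokens.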
Structural components highlighted.
Part-of-Speech tags highlighted.
Tokens highlighted.
User input (sentence) highlighted.
Focus on noun phrases (NP).
Find the connecting prepositional phrases (PP).
Highlight segment of sentence to extract.
Perform extraction.
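
As a minimal sketch of the three preceding steps (focus on noun phrases, find the connecting prepositional phrase, extract the segment), the code below walks a small tree and pulls out every NP that is directly followed by a PP sibling. The tree fragment is hypothetical, the tuple layout is an assumption, and `np_pp_segments` is an illustrative name rather than the method used in the deck.

```python
# Hypothetical parse fragment as (label, children) tuples; leaves are strings.
VP = ("VP", [
    ("VB", ["install"]),
    ("NP", [("NNS", ["drivers"])]),
    ("PP", [("IN", ["on"]), ("NP", [("NNS", ["servers"])])]),
])


def leaves(node):
    """Return the tokens under a node, left to right."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1]:
        words.extend(leaves(child))
    return words


def subtrees(node, label):
    """Yield every subtree carrying the given label, depth-first."""
    if isinstance(node, str):
        return
    if node[0] == label:
        yield node
    for child in node[1]:
        yield from subtrees(child, label)


def np_pp_segments(node):
    """Yield (noun phrase, preposition, object) where a PP directly follows an NP."""
    if isinstance(node, str):
        return
    children = node[1]
    for left, right in zip(children, children[1:]):
        if isinstance(left, str) or isinstance(right, str):
            continue
        if left[0] == "NP" and right[0] == "PP":
            preps = [c for c in right[1] if not isinstance(c, str) and c[0] == "IN"]
            objs = [c for c in right[1] if not isinstance(c, str) and c[0] == "NP"]
            if preps and objs:
                yield (" ".join(leaves(left)),
                       " ".join(leaves(preps[0])),
                       " ".join(leaves(objs[0])))
    for child in children:
        yield from np_pp_segments(child)


noun_phrases = [" ".join(leaves(np)) for np in subtrees(VP, "NP")]
segments = list(np_pp_segments(VP))
print(noun_phrases)
print(segments)
```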
Create a semantic chain (collection of ≥ 2 triples).
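
One possible way to represent a semantic chain in code is shown below. The slide only states that a chain is a collection of two or more triples; the rule that each triple's object must be the next triple's subject is an assumption about what "connected" means, and the predicate names and example values are hypothetical.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A subject-predicate-object statement."""
    subject: str
    predicate: str
    obj: str


def is_semantic_chain(triples):
    """True if there are at least 2 triples and each one links to the next.

    Assumption: "linked" means the object of one triple is the subject
    of the following triple.
    """
    if len(triples) < 2:
        return False
    return all(a.obj == b.subject for a, b in zip(triples, triples[1:]))


# Hypothetical chain; the predicates are made up for illustration.
chain = [
    Triple("install", "hasObject", "drivers"),
    Triple("drivers", "on", "servers"),
]

print(is_semantic_chain(chain))
print(is_semantic_chain(chain[:1]))
```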
Compare semantic chain to parse tree structure.
Normalize semantic chain.
Add additional semantic context.
PART 2




   The Parsing Process




Deep Parsing (2012)


Editor's Notes

  1. The first step is tokenizing the
  2. What you have here are 2 triples connected together; a semantic chain.
  3. Don’t look at this diagram with the misconception that an ontology is a taxonomy or directed tree. It’s not. It’s a cyclic network. We do seem to have Software as a root node, with most relationships flowing up to the parent. However, in real life, the extracted semantic chain would be one small connection in the midst of innumerable nodes, some in clusters, some in sequences, some apparently random, but all connected, sometimes with multiple connections between 2 nodes, and so on.
  4. … . Now, you’ve been a good audience. Thank you. Let’s look at some real code and a real process. < CLICK > (END PRESENTATION AND GO TO PART 2)
  5. < CLICK > The first step is to pre-process the input. Pre-processing means we might add or remove tokens, most often punctuation, but we could make other additions. Some degree of normalization might occur here; for example, an acronym spelled “I.B.M.” might be normalized to “IBM”, or “U.S.A” to “USA”. Pattern reduction is a type of normalization: it makes user input more uniform and makes the job of parsing and downstream processing easier. There are simply fewer variations to account for. However, we generally want to keep pre-processing short and sweet, depending on the needs of our application. By pre-processing we do tend to lose the “user-speak”; that is, how a user might choose to refer to an entity or employ nuanced constructions. Also, too much normalization can lead to inaccurate results in the parser. We don’t lose anything by changing “I.B.M.” to “IBM”, but if we changed the inflected verb “installed” to its infinitive form “install” (also called the canonical form, normal form, or lemma), we would lose the fact that the installation occurred in the past. < CLICK > Performing lemmatization at this stage may be appropriate for some applications, but in the main, nuanced speech leads to more accurate parsing results, which in turn leads to higher precision in extracting information of interest. Lemmatization is typically performed in the stage that follows parsing, the post-processing stage. < CLICK > Post-processing is really an abstraction of many services: services that perform not only lemmatization (which is conceptually trivial) but also semantic interpolation, the adding of additional meaning to the parse tree, as we saw on previous slides. < CLICK >
  6. However, at a high level, this is what happens. The input is pre-processed, parsed, and post-processed. < CLICK >
  7. Let’s add a little more context. The user provides input, the input is received, goes through the process we just talked about, and the insight (hopefully there is some) is provided back to the user. The important thing on this diagram is the “Intermediate Form”. How is the user input represented as it flows through this process? At its simplest, a data transfer object must exist that represents the initial input as a String, converts the String into an array of tokens, parses the tokens and stores the structured parse results, has a mechanism for allowing the structured output to be enhanced (or simplified) through a number of services, and finally allows additional context to be applied and brought to bear upon these results. The design of the intermediate representation lies at the heart of every parsing strategy. There are multiple strategies available today. These may vary by architecture, design principle, or the needs of the application. A parsing strategy that only leverages part-of-speech tagging is not likely to require a mechanism for storing deep parse results, with the additional complexity this incurs. On the other hand, an architecture that allows a parsing process the simplicity of a few steps, or the complexity of several hundred steps, and can be customized without compromising the original design principles is of the most value. Of the many architectures that exist, there are not yet many that are this well designed. Ultimately, the strategy you choose will be based on a variety of factors. I do identify this choice as being one of the most important considerations in the parsing process.
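
The pre-process, parse, post-process flow described in notes 5 to 7 can be sketched roughly as follows. This is only an illustration under stated assumptions: `IntermediateForm` and its field names are hypothetical, the `parse` step is a crude stand-in rather than a real parser, and `LEMMAS` is a made-up one-entry lookup. The point it shows is that lemmatizing after parsing keeps the tense information that early normalization would discard.

```python
import re
from dataclasses import dataclass, field


@dataclass
class IntermediateForm:
    """Hypothetical data transfer object carried through every stage."""
    raw: str                                          # initial input as a string
    tokens: list = field(default_factory=list)        # filled by pre_process
    parse: list = field(default_factory=list)         # filled by parse
    annotations: dict = field(default_factory=dict)   # filled by post-processing


# Dotted acronyms such as "I.B.M." or "U.S.A".
ACRONYM = re.compile(r"\b(?:[A-Z]\.)+[A-Z]\.?")


def pre_process(form):
    """Normalize dotted acronyms, then split into word and punctuation tokens."""
    text = ACRONYM.sub(lambda m: m.group(0).replace(".", ""), form.raw)
    form.tokens = re.findall(r"\w+|[^\w\s]", text)
    return form


def parse(form):
    """Stand-in for a real parser: marks '-ed' words as past tense (VBD)."""
    form.parse = [(t, "VBD" if t.endswith("ed") else "?") for t in form.tokens]
    return form


LEMMAS = {"installed": "install"}  # hypothetical lookup table


def post_process(form):
    """Add lemmas as an annotation; the tagged surface forms stay untouched."""
    form.annotations["lemmas"] = [LEMMAS.get(t.lower(), t) for t, _ in form.parse]
    return form


def run(text, steps=(pre_process, parse, post_process)):
    """Push one intermediate form through an ordered list of steps."""
    form = IntermediateForm(raw=text)
    for step in steps:
        form = step(form)
    return form


result = run("I installed I.B.M. software in the U.S.A")
print(result.tokens)
print(result.parse[1])
print(result.annotations["lemmas"])
```

Because every stage takes and returns the same object, the `steps` tuple can hold a handful of functions or several hundred without changing the design, which is the property the note above argues for.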