Relationship between the Semantic Web and NLP

  1. Relationship between the Semantic Web and NLP
     Rajendra Akerkar
     Technomathematics Research Foundation, Kolhapur, India
     March 17, 2009. Akerkar: Sogndal Lecture
  2. Structure of this talk
     • Relationship between NLP and SW
     • Inspiration: QA system and Haystack
     • RDF Schema & NL Annotations
     • Information Access Schemata
     • Information Planning Schemata
     • Integration
     • Conclusion
  3. The sense of the relationship
     • Could the Semantic Web enhance the technical level of NLP technologies?
     • Could NLP technologies help in delivering and using a better Semantic Web?
  4. Purpose of the Semantic Web
     To help users
     • locate,
     • organize, and
     • process information.
     Belief:
     • It should be grounded in the information access method humans are comfortable with: natural language.
  5. Why natural language?
     • It is intuitive,
     • easy to use and rapidly deployable, and
     • requires no specialized training.
  6. Vision
     The Semantic Web equally accessible by computers using specialized languages and interchange formats, and by humans using natural language.
     • Ask a computer: “When was the king of Norway born?”
     • “What’s the cheapest flight to Mumbai this month?”
     Retrieve “exact information”.
  7. What synergistic opportunities exist between natural language technology and the Semantic Web?
     State-of-the-art NL systems are capable of providing users intuitive access to a wealth of textual data using ordinary language. However, such systems are often hampered by
     • the knowledge engineering bottleneck (writing wrappers, integrating new data sources),
     • knowledge integration (from multiple sources), and
     • time-consuming effort.
     This is where the Semantic Web comes in …
  8. Semantic Web research
     Constructing, integrating, packaging, and exporting segments of knowledge to be usable by the entire world.
     NL technology can tap into this knowledge framework.
     • In return, it provides natural language information access for the Semantic Web.
  9. SW: What is missing?
     • Where in the loop is the human?
     • How will we communicate with our software agents?
     • How will we access information on the Semantic Web?
     Obviously, we cannot expect ordinary Semantic Web users to manually manipulate ontologies, query with formal logic expressions, etc.
     We would like to communicate with software agents in natural language…
     • What is the role of natural language in the Semantic Web?
  10. Mechanisms for integrating NL into RDF
      • Augmenting RDF property definitions
      • Creating Information Access Schemata
          • To bridge the gap between NL and RDF
      • An extension that mirrors human question answering behaviour in the form of NL query plans.
  11. Inspiration
      • Question-Answering (QA) System
      • Haystack System
          • End-user Semantic Web platform
          • Aggregates all of a user’s information into a unified repository.
  12. QA system
      The use of metadata is a common technique for rendering information fragments more amenable to processing by computer systems.
      Our approach
      • uses natural language itself as metadata,
      • offers numerous advantages and opportunities,
      • preserves human readability, and
      • encourages non-expert users to engage in metadata creation.
  13. QA system
      Natural language annotations
      • are machine-parsable sentences and phrases that describe the content of various information segments;
      • serve as metadata;
      • describe the kinds of questions a particular piece of knowledge is capable of answering.
      The QA system contains natural language annotation technology.
  14. QA system
      Information segment:
      “For pioneering contributions to the theory and practice of optimizing compiler techniques that laid the foundation for modern optimizing compilers and automatic parallel execution.” Frances E Allen was selected for the Turing award 2006.
      Annotation:
      • Frances E Allen is selected for Turing award in 2006.
      • 2006 Turing award
  15. QA system
      The annotations allow the system to answer:
      • What award did Allen receive in 2006?
      • Who was selected for the Turing award in 2006?
      • To whom was the Turing award given in 2006?
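How an annotation can answer several differently phrased questions can be sketched in a few lines of Python. This is only a toy illustration, not the QA system's actual parser: the slot names (`person`, `award`, `year`), the `ANNOTATION` dictionary, and the hand-written regular expressions are all assumptions standing in for real linguistic analysis.

```python
import re

# Hypothetical parsed form of the annotation
# "Frances E Allen is selected for Turing award in 2006."
ANNOTATION = {"person": "Frances E Allen", "award": "Turing award", "year": "2006"}

# Toy question patterns (assumed); each is paired with the slot being asked for.
QUESTION_PATTERNS = [
    (re.compile(r"what award did (?P<person>.+) receive in (?P<year>\d{4})\?", re.I), "award"),
    (re.compile(r"who was selected for the (?P<award>.+) in (?P<year>\d{4})\?", re.I), "person"),
    (re.compile(r"to whom was the (?P<award>.+) given in (?P<year>\d{4})\?", re.I), "person"),
]


def answer(question, annotation=ANNOTATION):
    """Return the annotation slot a question asks for, or None if nothing matches."""
    for pattern, wanted in QUESTION_PATTERNS:
        match = pattern.fullmatch(question.strip())
        if match is None:
            continue
        # Every slot mentioned in the question must agree with the annotation.
        if all(value.lower() in annotation[slot].lower()
               for slot, value in match.groupdict().items()):
            return annotation[wanted]
    return None
```

All three questions on the slide resolve against the same single annotation; a question about a different year returns `None` because the `year` slot does not agree.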
  16. QA system
      Feature of natural language annotations
      • any information segment can be annotated:
          • not only text, but also images, multimedia …
      To provide uniform access to semi-structured resources on the Web
      • a virtual database system
      • integrates Web sources under a single query interface.
  17. Haystack
      Aggregates a user’s information into a unified repository.
      • e-mail, documents, calendar, and web pages.
      It is represented using RDF
      • which makes it easy for agents to access, filter, and process this information in an automated fashion.
  18. Haystack
      “Present Tim the letter from the secretary I met with last Tuesday from TMRF.”
      • Current IT can store all the information needed to answer the query
      • but it is scattered amongst multiple systems.
      • An agent needs to communicate with
          • Email client
          • Calendar
          • File system
          • Directory server
  19. Haystack
      Reduce the protocol barriers to information by standardizing on RDF as a common model for information:
      • agents are free to mine the semantics of a user’s various data sources.
      End-user application for managing information
      • serves as a powerful platform for experimenting with various information retrieval and user interface research problems.
  20. QA System & Haystack
      By incorporating natural language search capabilities into Haystack, we
      • demonstrate the usefulness of natural language search, and
      • show its applicability to the Semantic Web.
  21. To endow Haystack with the ability to answer
      • What is the state bird of India?
      • Tell me what the vision statement of TMRF is.
      • Do you know Sogndal’s population?
      Easy on the Web. But for this data to be usable by any Semantic Web system, it must be restructured in terms of the RDF model.
  22. Adenine
      Haystack’s programming language, designed to facilitate frequent manipulation of RDF data.
      • Combines features of Lisp, Python, and Notation3.
      • The basic data unit is the RDF triple.
  23. Adenine: the :State class and the :bird property

          @prefix dc: <http://purl.org/dc/elements/1.1/>
          @prefix : <www.tourindia.com/data#>

          add { :State
              rdf:type rdfs:Class ;
              rdfs:label "State"
          }

          add { :bird
              rdf:type rdf:Property ;
              rdfs:label "State bird" ;
              rdfs:domain :State
          }

          # ... more property declarations

          add { :india
              rdf:type :State ;
              dc:title "India" ;
              :bird "Peacock" ;
              :flower "Lotus" ;
              :population "1,147,995,904"
              # ... more information about India and its states
          }

      • Triples are enclosed in curly braces { } and expressed in subject-predicate-object order.
      • A semicolon denotes that the following predicate-object pair assumes the last used subject.
      • RDF literals are written as strings in double quotes.
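The way an `add { ... }` block with semicolons expands into individual subject-predicate-object triples can be sketched in Python. This is a minimal illustration under stated assumptions: the `expand` helper is hypothetical and is not part of Adenine or Haystack; it only mimics the "reuse the last subject" rule on plain tuples.

```python
def expand(subject, pairs):
    """Expand one 'add { subject p1 o1 ; p2 o2 }' block into full triples.

    Each (predicate, object) pair reuses the same subject, which is what
    the semicolon expresses in the slide's notation.
    """
    return [(subject, predicate, obj) for predicate, obj in pairs]


triples = []
triples += expand(":State", [("rdf:type", "rdfs:Class"),
                             ("rdfs:label", "State")])
triples += expand(":bird", [("rdf:type", "rdf:Property"),
                            ("rdfs:label", "State bird"),
                            ("rdfs:domain", ":State")])
triples += expand(":india", [("rdf:type", ":State"),
                             ("dc:title", "India"),
                             (":bird", "Peacock"),
                             (":flower", "Lotus"),
                             (":population", "1,147,995,904")])

for triple in triples:
    print(triple)
```

The three blocks on the slide yield ten triples in total, each a plain `(subject, predicate, object)` tuple.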
  24. Adenine: unique feature
      • Every Adenine instruction is encoded as a node in the RDF graph, and a sequence of instructions is expressed by adenine:next arcs between these instruction nodes.
      • As a result, data and procedures can be embedded within the same RDF graph and can be distributed together.
  25. The connection between the RDF schema and the NL annotations in a natural language schema

          @prefix nl: <http://www.tmrfindia.org/sw/projects/enlight#>

          add { :stateAttribute
              rdf:type nl:NaturalLanguageSchema ;
              # This annotation handles cases like "[state bird] of [India]"
              # and "[population] of [India]".
              nl:annotation @( :attribute "of" :state ) ;
              # Code to run to resolve state attribute
              nl:code :stateAttributeCode
          }

          add { :attribute
              rdf:type nl:Parameter ;
              nl:domain rdf:Property ;
              nl:descriptionProperty rdfs:label
          }

          add { :state
              rdf:type nl:Parameter ;
              nl:domain :State ;
              nl:descriptionProperty dc:title
          }

          # The identifier [state] will be bound to the value of the named
          # parameter :state. The identifier [attribute] will be bound to the
          # value of the named parameter :attribute.
          method :stateAttributeCode :state = state :attribute = attribute
              # Ask the system what the [attribute] property of [state] is
              return (ask %{ attribute state ?x })

      • The definition of :attribute restricts the resource representing the attribute to be queried to have type rdf:Property.
      • The rdfs:label property is used to resolve the actual literal, e.g., “State bird” or “population”.
      • :state restricts the resource to have type :State and to have the resolver dc:title.
  26. Question Answering
      What is the state bird of India?
      • The system parses the question and determines that :stateAttribute is the relevant natural language schema to invoke.
      • The system extracts the natural language bindings of :attribute and :state, which are “state bird” and “India”, respectively. These are further resolved into the RDF resources :bird and :india.
      • As a response to the question, the method :stateAttributeCode is invoked with named parameter :attribute bound to :bird and named parameter :state bound to :india.
      • The invoked method performs a query into Haystack’s RDF store, which returns “Peacock”, the state bird of India.
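The four steps above can be traced in a small Python sketch. It is a rough approximation under explicit assumptions: the regular expression stands in for the real natural language parser, and `resolve`, `state_attribute_code`, and the in-memory `TRIPLES` list are hypothetical helpers rather than Haystack APIs.

```python
import re

TRIPLES = [
    (":bird", "rdf:type", "rdf:Property"),
    (":bird", "rdfs:label", "State bird"),
    (":india", "rdf:type", ":State"),
    (":india", "dc:title", "India"),
    (":india", ":bird", "Peacock"),
]

# Assumed stand-in for parsing against the annotation @( :attribute "of" :state ).
ANNOTATION = re.compile(
    r"(?:what is|tell me)\s+(?:the\s+)?(?P<attribute>.+?)\s+of\s+(?P<state>.+?)\??$",
    re.I,
)


def resolve(label, description_property, rdf_type):
    """Map a natural language label to a resource of the required type."""
    for s, p, o in TRIPLES:
        if (p == description_property and o.lower() == label.lower()
                and (s, "rdf:type", rdf_type) in TRIPLES):
            return s
    return None


def state_attribute_code(state, attribute):
    """Counterpart of the slide's method: ask for the attribute of the state."""
    return [o for s, p, o in TRIPLES if s == state and p == attribute]


def answer(question):
    match = ANNOTATION.search(question.strip())           # step 1: pick the schema
    if match is None:
        return None
    attribute = resolve(match["attribute"], "rdfs:label", "rdf:Property")  # step 2
    state = resolve(match["state"], "dc:title", ":State")
    if attribute is None or state is None:
        return None
    return state_attribute_code(state=state, attribute=attribute)  # steps 3 and 4
```

The same function handles the interrogative form ("What is the state bird of India?") and the imperative form ("Tell me the state bird of India"), since both reduce to the same bindings.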
  27. User query is parsed by QA System
      So, a single natural language annotation is capable of answering a question.
      The QA system is capable of normalizing different methods for requesting the same information:
      • imperative (“Tell me...”),
      • interrogative (“What is...”).
  28. Natural language schema

          add { :stateAttribute
              rdf:type nl:NaturalLanguageSchema ;
              nl:annotation @( :state " has the largest " :comparisonAttribute ) ;
              nl:code :maxComparisonAttributeCode
          }

          method :maxComparisonAttributeCode :comparisonAttribute = attribute
              return (ask %{
                  rdf:type ?x :State ,
                  adenine:argMax ?x ?y 1 xsd:int %{
                      :attribute ?x ?y
                  }
              } @(?x))

      • The method invoked by the NLS queries the RDF store for the resource of type :State that contains the maximal integer value for the property given by :comparisonAttribute.
      • This allows our system to answer the following questions:
          • Which state has the lowest population?
          • Do you know what state has the largest area?
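The superlative query performed by :maxComparisonAttributeCode can be approximated in Python as an arg-max over typed resources. This is a sketch under stated assumptions: `max_comparison_attribute` is a hypothetical helper, and the resources and numbers in `TRIPLES` are made-up placeholders, not real statistics.

```python
# Made-up toy facts; the values are placeholders, not real figures.
TRIPLES = [
    (":stateA", "rdf:type", ":State"), (":stateA", ":area", 10),
    (":stateB", "rdf:type", ":State"), (":stateB", ":area", 30),
    (":cityC", "rdf:type", ":City"), (":cityC", ":area", 99),
]


def max_comparison_attribute(triples, attribute):
    """Return the :State resource with the largest value of `attribute`."""
    states = {s for s, p, o in triples if p == "rdf:type" and o == ":State"}
    candidates = [(o, s) for s, p, o in triples
                  if s in states and p == attribute]
    if not candidates:
        return None
    # max() compares the numeric value first, so the winning state is element 1.
    return max(candidates)[1]
```

Note that `:cityC` has the largest area overall but is ignored because it is not of type `:State`, which mirrors the `rdf:type ?x :State` constraint in the query.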
  29. QA System
      • We built a prototype implementing the natural language schemata.
      • It is limited in the types of questions it can answer and in its domain.
      • However, it is a proof of concept that demonstrates a method of marrying natural language with the Semantic Web.
  30. Further integrating natural language technology with the Semantic Web
      • RDF triples ≈ the system’s ternary expression representation of NL.
      • Clipping natural language annotations directly into rdf:Property definitions.
      Consider a piece of an ontology modeling an address book entry in Haystack:
  31. A natural language-aware software agent could answer questions…

          add { :Person
              rdf:type rdfs:Class
          }

          add { :homeAddress
              rdf:type rdf:Property ;
              rdfs:domain :Person ;
              rdfs:range xsd:string ;
              nl:annotation @( nl:subject " lives at " nl:object ) ;
              nl:annotation @( nl:subject "’s home address is " nl:object ) ;
              nl:annotation @( nl:subject "’s bungalow" ) ;
              nl:generation @( nl:subject "’s home address is " nl:object )
          }

      • :homeAddress is a property specifying a user’s home address.
      • The annotation expresses this connection concretely in natural language, via the nl:annotation property.
      • The phrase “nl:subject lives at nl:object” is linked to every RDF statement involving the :homeAddress property, where nl:subject is shorthand for the subject (domain) of the relation, and nl:object is shorthand for the object (range) of the relation.
  32. ‘Make sense’ with minimal cost!
      The nl:generation property specifies a natural language version of the knowledge.
      • It allows software agents to present meaningful, natural responses to users.
      • Question: Where does Ram live?
      • Reply: Ram’s home address is Tellefsens gate 5.
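How an agent might turn a stored :homeAddress statement into a natural reply can be sketched as template filling. This is an illustrative assumption, not the system's implementation: the `GENERATION` string imitates the role of nl:generation, the `QUESTION` regex is a hand-written stand-in for matching against the nl:annotation phrases, and the `rdfs:label` triple for Ram is invented for the example.

```python
import re

PROPERTY = ":homeAddress"
# Plays the role of nl:generation: subject + "'s home address is " + object.
GENERATION = "{subject}'s home address is {object}."
# Assumed stand-in for parsing a question against the property's annotations.
QUESTION = re.compile(r"where does (?P<subject>.+) live\?", re.I)

TRIPLES = [
    (":ram", "rdfs:label", "Ram"),                 # hypothetical naming triple
    (":ram", ":homeAddress", "Tellefsens gate 5"),
]


def reply(question):
    """Answer a 'Where does X live?' question from a single RDF statement."""
    match = QUESTION.fullmatch(question.strip())
    if match is None:
        return None
    name = match["subject"]
    for person, predicate, label in TRIPLES:
        if predicate == "rdfs:label" and label.lower() == name.lower():
            for s, p, address in TRIPLES:
                if s == person and p == PROPERTY:
                    return GENERATION.format(subject=label, object=address)
    return None


print(reply("Where does Ram live?"))
```

The point of the design is that one annotated property definition covers every statement using that property, so no per-person engineering is needed.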
  33. Information Access Schemata
      Despite the simplicity of adding NL annotations to RDF properties:
      • Significant restriction: only one RDF statement can be queried at once.
      • Solution: create schemata that capture similar patterns of information access.
  34. An information access schema is a quadruple
      • Annotations: NL sentences (either declarative or interrogative) or phrases that describe the types of user questions the schema can answer.
      • Pattern: a declarative pattern of RDF triples that references a pre-existing ontology.
      • Action: a set of operators to further process variables bound during the pattern matching process.
      • Mapping: a mechanism for handling disjunction between lexical and ontological terms.
  35. Example: “family” of questions
      • What is the country in Asia with the largest area?
      • Tell me what Asian country has the highest population density.
      • What country in Europe has the lowest infant mortality rate?
      • What is the most populated American country?
  36. Capture the “pattern” of information requests in an information access schema

          <nl:InformationAccessSchema>
            <nl:ann>what country in $region has the largest $attribute</nl:ann>
            <nl:pattern>?x a :Country</nl:pattern>
            <nl:pattern>?x map($attribute) ?val</nl:pattern>
            <nl:pattern>?x :location $region</nl:pattern>
            <nl:action>display(boundto(?x, max(?val)))</nl:action>
            <nl:mapping>
              <nl:hash variable="$attribute">
                <nl:map value="population">:population</nl:map>
                <nl:map value="area">:area</nl:map>
                ...
              </nl:hash>
            </nl:mapping>
          </nl:InformationAccessSchema>

      • Natural language annotations are employed to describe a pattern of RDF statements.
      • Because annotations would be processed by linguistically sophisticated systems, different adjectives such as “highest” and “largest” could be uniformly mapped onto the maximum operation.
      • The schema answers questions that involve region-specific superlative comparison of countries.
  37. • The pattern binds to the value of the particular attribute for countries within the queried geographic region, and the action specifies an aggregate operation (maximum) over the values bound within the pattern.
      • The country corresponding to that maximum value is returned as the answer.
      • The mapping provides a translation from language attributes to RDF properties.
      • Information access schemata are written with respect to a particular pre-existing ontology.
      • In this example, we assume that an appropriate ontology has been established (i.e., :Country is defined as a class, and :location is defined as a property).
      In this vision of the Semantic Web, information access schemata grounded in natural language would co-exist alongside RDF metadata.
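The interplay of pattern, action, and mapping described above can be sketched in Python. This is a simplified illustration under stated assumptions: `answer` is a hypothetical function that hard-codes the three nl:pattern lines and the max action, and the countries, regions, and values in `TRIPLES` are made-up placeholders rather than real data.

```python
# nl:mapping: lexical attribute names to RDF properties.
MAPPING = {"population": ":population", "area": ":area"}

# Made-up toy facts; resource names and numbers are placeholders.
TRIPLES = [
    (":c1", "rdf:type", ":Country"), (":c1", ":location", ":regionA"), (":c1", ":area", 5),
    (":c2", "rdf:type", ":Country"), (":c2", ":location", ":regionA"), (":c2", ":area", 8),
    (":c3", "rdf:type", ":Country"), (":c3", ":location", ":regionB"), (":c3", ":area", 20),
]


def answer(region, attribute, triples=TRIPLES):
    """Evaluate 'what country in $region has the largest $attribute'."""
    prop = MAPPING[attribute]                      # map($attribute)
    bindings = []
    for x, p, o in triples:
        # ?x a :Country  and  ?x :location $region
        if p == "rdf:type" and o == ":Country" and (x, ":location", region) in triples:
            # ?x map($attribute) ?val
            for s, p2, val in triples:
                if s == x and p2 == prop:
                    bindings.append((val, x))
    # Action: boundto(?x, max(?val)), i.e. the country holding the maximum value.
    return max(bindings)[1] if bindings else None
```

Here `:c3` has the largest area overall, but a query for `:regionA` returns `:c2`, since the `:location` pattern restricts the candidates before the maximum is taken.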
  38. Further extension: Query Plan
      • Question: What is the distance from India to Norway?
      • Solution plan: compute the distance between their respective capitals.
      Could humans “teach” such plans to a computer directly?
  39. Information Planning Schemata
      An extension of Information Access Schemata. Simplifies the task of knowledge engineering.
      Example:
      • Instead of writing RDF patterns,
          • which would require knowledge of domain-specific ontologies,
      • use natural language itself to describe the process of answering a question.
      • The answer plan (nl:plan) reflects the user’s thought process expressed in natural language: first find the capitals of the countries, and then find the distance between those cities.
  40. An information planning schema

          <nl:InformationPlanningSchema>
            <nl:ann>distance between $country1 and $country2</nl:ann>
            <nl:plan>
              <rdf:Seq>
                <rdf:li>what is the capital of $country1 := ?capital1</rdf:li>
                <rdf:li>what is the capital of $country2 := ?capital2</rdf:li>
                <rdf:li>what is the distance between ?capital1 and ?capital2 := ?distance</rdf:li>
              </rdf:Seq>
            </nl:plan>
            <nl:action>display(?distance)</nl:action>
          </nl:InformationPlanningSchema>
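Executing such a plan amounts to asking each sub-question in order and feeding earlier answers into later questions. The following Python sketch shows that control flow only; `run_plan` and `toy_ask` are hypothetical, and `toy_ask` is a canned stub that returns a symbolic placeholder for the distance instead of a real figure.

```python
import re

PLAN = [
    "what is the capital of $country1 := ?capital1",
    "what is the capital of $country2 := ?capital2",
    "what is the distance between ?capital1 and ?capital2 := ?distance",
]

# Canned answers for the stub question answerer (assumed, for illustration).
CANNED = {
    "what is the capital of India": "New Delhi",
    "what is the capital of Norway": "Oslo",
}


def toy_ask(question):
    """Stand-in for a real QA system; distances come back as symbolic strings."""
    if question in CANNED:
        return CANNED[question]
    match = re.fullmatch(r"what is the distance between (.+) and (.+)", question)
    if match:
        return f"distance({match.group(1)}, {match.group(2)})"
    raise KeyError(question)


def run_plan(plan, params, ask):
    """Run each 'question := ?variable' step, binding results as we go."""
    env = dict(params)
    for step in plan:
        question, target = [part.strip() for part in step.split(":=")]
        # Substitute $parameters and previously bound ?variables.
        question = re.sub(r"[$?]\w+", lambda m: str(env[m.group(0)]), question)
        env[target] = ask(question)
    return env


result = run_plan(PLAN, {"$country1": "India", "$country2": "Norway"}, toy_ask)
print(result["?distance"])
```

Each step is itself an ordinary natural language question, so the plan author needs no knowledge of the underlying ontology, only of how to phrase the intermediate questions.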
  41. Integrating the methods
      The three proposed methods for integrating natural language and RDF can be used together to afford greater flexibility.
      • Annotating RDF properties is a low-cost (from a knowledge engineering perspective) way of providing natural language access to RDF statements.
      • Information access schemata, while being more complex and requiring knowledge of domain-specific ontologies, give experienced knowledge engineers fine-grained tools for manipulating RDF and controlling the output.
      • Information planning schemata allow users to describe, in natural language itself, how they would go about answering a particular class of questions.
      These three methods can combine to provide the foundation for question answering on the Semantic Web.
  42. Thank You!
