Relationship between the Semantic Web and NLP
Published in: Education, Technology

  • 1. Relationship between the Semantic Web and NLP
      Rajendra Akerkar, Technomathematics Research Foundation, Kolhapur, India
      March 17, 2009. Akerkar: Sogndal Lecture
  • 2. Structure of this talk
      • Relationship between NLP and SW
      • Inspiration: QA system and Haystack
      • RDF Schema & NL Annotations
      • Information Access Schemata
      • Information Planning Schemata
      • Integration
      • Conclusion
  • 3. The sense of the relationship
      • Could the Semantic Web enhance the technical level of NLP technologies?
      • Could NLP technologies help in delivering and using a better Semantic Web?
  • 4. Purpose of the Semantic Web
      To help users:
      • locate,
      • organize, and
      • process information.
      Belief: it should be grounded in the information access method humans are comfortable with, namely natural language.
  • 5. Why natural language?
      It is intuitive, easy to use, rapidly deployable, and requires no specialized training.
  • 6. Vision
      The Semantic Web equally accessible by computers using specialized languages and interchange formats, and by humans using natural language.
      • Ask a computer: “When was the king of Norway born?”
      • “What’s the cheapest flight to Mumbai this month?”
      Retrieve “exact information”.
  • 7. What synergistic opportunities exist between natural language technology and the Semantic Web?
      State-of-the-art NL systems are capable of providing users intuitive access to a wealth of textual data using ordinary language. However, such systems are often hampered by:
      • the knowledge engineering bottleneck (writing wrappers, integrating new data sources),
      • knowledge integration (from multiple sources), and
      • the time all of this consumes.
      Here the Semantic Web comes in …
  • 8. Semantic Web research
      Constructing, integrating, packaging, and exporting segments of knowledge to be usable by the entire world.
      NL technology can tap into this knowledge framework.
      • In return, it provides natural language information access for the Semantic Web.
  • 9. SW: What is missing?
      • Where in the loop is the human?
      • How will we communicate with our software agents?
      • How will we access information on the Semantic Web?
      Obviously, we cannot expect ordinary Semantic Web users to manually manipulate ontologies, query with formal logic expressions, etc. We would like to communicate with software agents in natural language…
      • What is the role of natural language in the Semantic Web?
  • 10. Mechanism for integrating NL into RDF
      • Augmenting RDF property definitions
      • Creating Information Access Schemata, to bridge the gap between NL and RDF
      • Extension to mirror human question answering behaviour in the form of NL query plans
  • 11. Inspiration
      Question-Answering (QA) System
      Haystack System
      • End-user Semantic Web platform
      • Aggregates all of a user’s information into a unified repository
  • 12. QA system
      The use of metadata is a common technique for rendering information fragments more amenable to processing by computer systems.
      Our approach:
      • uses natural language itself as metadata,
      • offers numerous advantages and opportunities,
      • preserves human readability, and
      • encourages non-expert users to engage in metadata creation.
  • 13. QA system
      Natural language annotations:
      • are machine-parsable sentences and phrases that describe the content of various information segments,
      • serve as metadata, and
      • describe the kinds of questions a particular piece of knowledge is capable of answering.
      The QA system contains natural language annotation technology.
  • 14. QA system
      Information segment: “For pioneering contributions to the theory and practice of optimizing compiler techniques that laid the foundation for modern optimizing compilers and automatic parallel execution.” Frances E. Allen was selected for the Turing Award 2006.
      Annotations:
      • Frances E. Allen is selected for Turing award in 2006.
      • 2006 Turing award
  • 15. QA system
      The annotations allow the system to answer:
      • What award did Allen receive in 2006?
      • Who was selected for the Turing award in 2006?
      • To whom was the Turing award given in 2006?
  • 16. QA system
      Feature of natural language annotations:
      • any information segment can be annotated: not only text, but also images, multimedia …
      To provide uniform access to semi-structured resources on the Web:
      • a virtual database system
      • integrates Web sources under a single query interface.
  • 17. Haystack
      Aggregates a user’s information into a unified repository:
      • e-mail, documents, calendar, and web pages.
      The information is represented using RDF, which:
      • makes it easy for agents to access, filter, and process this information in an automated fashion.
  • 18. Haystack
      “Present Tim the letter from the secretary I met with last Tuesday from TMRF.”
      • Current IT can store all the information needed to answer the query,
      • but it is scattered amongst multiple systems.
      • An agent needs to communicate with:
          • Email client
          • Calendar
          • File system
          • Directory server
  • 19. Haystack
      Reduce the protocol barriers to information. By standardizing on RDF as a common model for information:
      • agents are free to mine the semantics of a user’s various data sources.
      End-user application for managing information:
      • serves as a powerful platform for experimenting with various information retrieval and user interface research problems.
  • 20. QA System & Haystack
      By incorporating natural language search capabilities into Haystack, demonstrate:
      • the usefulness of natural language search, and
      • its applicability to the Semantic Web.
  • 21. To endow Haystack with the ability to answer
      • What is the state bird of India?
      • Tell me what the vision statement of TMRF is.
      • Do you know Sogndal’s population?
      Finding this is easy on the Web. But for this data to be usable by any Semantic Web system, it must be restructured in terms of the RDF model.
  • 22. Adenine
      Haystack’s programming language, designed to facilitate frequent manipulation of RDF data.
      • Features of Lisp, Python, and Notation3.
      • Basic data unit is the RDF triple.
  • 23. Adenine: the :State class and the :bird property
        @prefix dc: <>
        @prefix : <>
        add { :State
            rdf:type rdfs:Class ;
            rdfs:label "State"
        }
        add { :bird
            rdf:type rdf:Property ;
            rdfs:label "State bird" ;
            rdfs:domain :State
        }
        # ... more property declarations
        add { :india
            rdf:type :State ;
            dc:title "India" ;
            :bird "Peacock" ;
            :flower "Lotus" ;
            :population "1,147,995,904"
            # ... more information about India and its states
        }
      • Triples are enclosed in curly braces { } and expressed in subject-predicate-object order.
      • A semicolon denotes that the following predicate-object pair assumes the last used subject.
      • RDF literals are written as strings in double quotes.
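The declarations on this slide can be mirrored in a few lines of Python to make the triple structure concrete. This is only a sketch: plain tuples in a set stand in for Haystack's RDF store, and the `add` and `objects` helpers are hypothetical names invented here, not part of Adenine or Haystack.

```python
# Illustrative only: a set of (subject, predicate, object) tuples stands in
# for Haystack's RDF store.
triples = set()


def add(subject, pairs):
    """Add one triple per (predicate, object) pair, all sharing `subject`.

    This mimics the semicolon shorthand, where each predicate-object pair
    reuses the last subject.
    """
    for predicate, obj in pairs:
        triples.add((subject, predicate, obj))


def objects(subject, predicate):
    """Return the sorted objects of all triples matching subject and predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


add(":State", [("rdf:type", "rdfs:Class"), ("rdfs:label", "State")])
add(":bird", [
    ("rdf:type", "rdf:Property"),
    ("rdfs:label", "State bird"),
    ("rdfs:domain", ":State"),
])
add(":india", [
    ("rdf:type", ":State"),
    ("dc:title", "India"),
    (":bird", "Peacock"),
    (":flower", "Lotus"),
    (":population", "1,147,995,904"),
])

print(objects(":india", ":bird"))
```

Each `add` call corresponds to one `add { ... }` block on the slide; the literals are kept as strings, as they are in the Adenine source.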
  • 24. Adenine unique feature
      Every Adenine instruction is encoded as a node in the RDF graph, and a sequence of instructions is expressed by adenine:next arcs between these instruction nodes.
      As a result, data and procedures can be embedded within the same RDF graph and can be distributed together.
  • 25. The connection between the RDF schema and the NL annotations in a natural language schema
        @prefix nl: <>
        add { :stateAttribute
            rdf:type nl:NaturalLanguageSchema ;
            # This annotation handles cases like "[state bird] of [India]"
            # and "[population] of [India]".
            nl:annotation @( :attribute "of" :state ) ;
            # Code to run to resolve state attribute
            nl:code :stateAttributeCode
        }
        add { :attribute
            rdf:type nl:Parameter ;
            nl:domain rdf:Property ;
            nl:descriptionProperty rdfs:label
        }
        add { :state
            rdf:type nl:Parameter ;
            nl:domain :State ;
            nl:descriptionProperty dc:title
        }
        # The identifier [state] will be bound to the value of the named
        # parameter :state. The identifier [attribute] will be bound to the
        # value of the named parameter :attribute.
        method :stateAttributeCode :state = state :attribute = attribute
            # Ask the system what the [attribute] property of [state] is
            return (ask %{ attribute state ?x })
      • The definition of :attribute restricts the resource representing the attribute to be queried to have type rdf:Property. The rdfs:label property is used to resolve the actual literal, e.g., “State bird” or “population”.
      • The definition of :state restricts the resource to have type :State and uses dc:title as the resolver.
  • 26. Question Answering
      What is the state bird of India?
      • The system parses the question and determines that :stateAttribute is the relevant natural language schema to invoke.
      • The system extracts the natural language bindings of :attribute and :state, which are “state bird” and “India”, respectively. These are further resolved into the RDF resources :bird and :india.
      • As a response to the question, the method :stateAttributeCode is invoked with named parameter :attribute bound to :bird and named parameter :state bound to :india.
      • The invoked method performs a query into Haystack’s RDF store, which returns “Peacock”, the state bird of India.
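The four steps above can be traced in a small Python sketch. It is a simplification under stated assumptions: a single regular expression stands in for the QA system's parser, an in-memory set of tuples stands in for Haystack's RDF store, and the function names `resolve`, `state_attribute_code`, and `answer` are hypothetical. A `:population` property declaration is assumed here, standing in for the slide's "more property declarations".

```python
import re

# Toy store with the triples from the Adenine slide, plus an assumed
# :population property declaration.
TRIPLES = {
    (":bird", "rdf:type", "rdf:Property"),
    (":bird", "rdfs:label", "State bird"),
    (":population", "rdf:type", "rdf:Property"),
    (":population", "rdfs:label", "population"),
    (":india", "rdf:type", ":State"),
    (":india", "dc:title", "India"),
    (":india", ":bird", "Peacock"),
    (":india", ":population", "1,147,995,904"),
}

# Stand-in for the annotation "[attribute] of [state]".
QUESTION = re.compile(
    r"^what is (?:the )?(?P<attribute>.+?) of (?P<state>.+?)\?*$",
    re.IGNORECASE,
)


def resolve(text, rdf_type, description_property):
    """Find the resource of `rdf_type` whose description literal equals `text`."""
    for s, p, o in TRIPLES:
        if (p == description_property and o.lower() == text.lower()
                and (s, "rdf:type", rdf_type) in TRIPLES):
            return s
    return None


def state_attribute_code(state, attribute):
    """Query the store for the value of `attribute` on `state`."""
    return [o for s, p, o in TRIPLES if s == state and p == attribute]


def answer(question):
    match = QUESTION.match(question.strip())
    if match is None:
        return None
    # Step 2: resolve the natural language bindings to RDF resources.
    attribute = resolve(match.group("attribute"), "rdf:Property", "rdfs:label")
    state = resolve(match.group("state"), ":State", "dc:title")
    if attribute is None or state is None:
        return None
    # Steps 3 and 4: invoke the schema's code with the bound parameters.
    results = state_attribute_code(state, attribute)
    return results[0] if results else None


print(answer("What is the state bird of India?"))
```

The `resolve` helper plays the role of `nl:domain` plus `nl:descriptionProperty`: it restricts candidates by type and matches the phrase against the chosen label property.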
  • 27. User query is parsed by the QA System
      So a single natural language annotation is capable of answering a question phrased in different ways.
      The QA system is capable of normalizing different methods for requesting the same information:
      • imperative (“Tell me...”),
      • interrogative (“What is...”).
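As a rough illustration of this normalization, the sketch below maps imperative, interrogative, and possessive phrasings onto one (attribute, entity) pair. The carrier-phrase patterns are assumptions invented for this example; the QA system described in the talk normalizes through linguistic parsing, not through hand-written regular expressions.

```python
import re

# Hypothetical carrier phrases; a real parser would do this syntactically.
PATTERNS = [
    re.compile(r"^tell me what (?:the )?(?P<attribute>.+?) of (?P<entity>.+?) is$"),
    re.compile(r"^what is (?:the )?(?P<attribute>.+?) of (?P<entity>.+)$"),
    re.compile(r"^do you know (?P<entity>.+?)'s (?P<attribute>.+)$"),
]


def normalize(question):
    """Reduce different phrasings to one (attribute, entity) pair, or None."""
    text = question.strip().rstrip("?.!").lower().replace("\u2019", "'")
    for pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return (match.group("attribute"), match.group("entity"))
    return None


for q in ("What is the state bird of India?",
          "Tell me what the state bird of India is.",
          "Do you know India's population?"):
    print(q, "->", normalize(q))
```

Once every phrasing reduces to the same pair, a single annotation such as "[attribute] of [state]" suffices to serve all of them.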
  • 28. Natural language schema
        add { :stateAttribute
            rdf:type nl:NaturalLanguageSchema ;
            nl:annotation @( :state " has the largest " :comparisonAttribute ) ;
            nl:code :maxComparisonAttributeCode
        }
        method :maxComparisonAttributeCode :comparisonAttribute = attribute
            return (ask %{
                rdf:type ?x :State ,
                adenine:argMax ?x ?y 1 xsd:int %{
                    :attribute ?x ?y
                }
            } @(?x))
      • The method invoked by the NLS queries the RDF store for the resource of type :State that contains the maximal integer value for the property given by :comparisonAttribute.
      • This allows our system to answer the following questions:
          • Which state has the largest population?
          • Do you know what state has the largest area?
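The arg-max query can be sketched in Python as follows. The store contents are invented placeholders (`:stateA`, `:stateB`, and their numbers are not real data), and `arg_max` is a hypothetical helper that only approximates what `adenine:argMax` is described as doing on the slide.

```python
# Invented placeholder data: the resources and numbers are not real.
TRIPLES = [
    (":stateA", "rdf:type", ":State"),
    (":stateA", ":population", "1,200"),
    (":stateA", ":area", "50"),
    (":stateB", "rdf:type", ":State"),
    (":stateB", ":population", "3,400"),
    (":stateB", ":area", "10"),
    (":cityC", "rdf:type", ":City"),
    (":cityC", ":population", "9,999"),
]


def arg_max(triples, rdf_type, attribute):
    """Return the resource of `rdf_type` with the largest integer `attribute`."""
    candidates = {s for s, p, o in triples if p == "rdf:type" and o == rdf_type}
    best, best_value = None, None
    for s, p, o in triples:
        if s in candidates and p == attribute:
            # Literals are strings such as "1,147,995,904"; strip the commas.
            number = int(str(o).replace(",", ""))
            if best_value is None or number > best_value:
                best, best_value = s, number
    return best


print(arg_max(TRIPLES, ":State", ":population"))
print(arg_max(TRIPLES, ":State", ":area"))
```

Note that `:cityC` is ignored despite its larger population, because the type restriction to `:State` is applied before the comparison.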
  • 29. QA System
      Built a prototype implementing the natural language schemata.
      It is limited in the types of questions it can answer and in its domain.
      However, it is a proof of concept that demonstrates a method of marrying natural language with the Semantic Web.
  • 30. Further integrating natural language technology with the Semantic Web
      RDF triples ≈ the system’s ternary expression representation of NL.
      Clipping natural language annotations directly into rdf:Property definitions.
      Consider a piece of an ontology modeling an address book entry in Haystack:
  • 31. A natural language-aware software agent could answer questions…
        add { :Person
            rdf:type rdfs:Class
        }
        add { :homeAddress
            rdf:type rdf:Property ;
            rdfs:domain :Person ;
            rdfs:range xsd:string ;
            nl:annotation @( nl:subject " lives at " nl:object ) ;
            nl:annotation @( nl:subject "’s home address is " nl:object ) ;
            nl:annotation @( nl:subject "’s bungalow" ) ;
            nl:generation @( nl:subject "’s home address is " nl:object )
        }
      • :homeAddress is a property specifying a user’s home address.
      • The annotation expresses this connection concretely in natural language, via the nl:annotation property.
      The phrase “nl:subject lives at nl:object” is linked to every RDF statement involving the :homeAddress property, where nl:subject is shorthand for the subject (domain) of the relation, and nl:object is shorthand for the object (range) of the relation.
  • 32. ‘Make sense’ with minimal cost!
      The nl:generation property specifies a natural language version of the knowledge.
      • It allows software agents to present meaningful, natural responses to users.
      • Question: Where does Ram live?
      • Reply: Ram’s home address is Tellefsens gate 5.
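A minimal sketch of how property-level annotations and a generation template might be used, assuming string templates with `{subject}` and `{object}` placeholders in place of the `nl:subject` and `nl:object` shorthands. The dictionaries and the functions `parse_statement` and `generate` are hypothetical; the real system would match annotations through linguistic parsing, which is also what would be needed to handle the question form "Where does Ram live?".

```python
import re

# Assumed string-template rendering of the nl:annotation / nl:generation
# values; straight apostrophes are used instead of the slide's curly ones.
ANNOTATIONS = {
    ":homeAddress": [
        "{subject} lives at {object}",
        "{subject}'s home address is {object}",
    ],
}
GENERATION = {":homeAddress": "{subject}'s home address is {object}"}


def template_to_regex(template):
    """Turn a template into a regex with named groups for the placeholders."""
    pattern = re.escape(template)
    pattern = pattern.replace(re.escape("{subject}"), r"(?P<subject>.+?)")
    pattern = pattern.replace(re.escape("{object}"), r"(?P<object>.+)")
    return re.compile("^" + pattern + "$")


def parse_statement(sentence):
    """Map a declarative sentence to a (subject, property, object) triple."""
    for prop, templates in ANNOTATIONS.items():
        for template in templates:
            match = template_to_regex(template).match(sentence)
            if match:
                return (match.group("subject"), prop, match.group("object"))
    return None


def generate(triple):
    """Render a triple as natural language using the generation template."""
    subject, prop, obj = triple
    return GENERATION[prop].format(subject=subject, object=obj)


fact = parse_statement("Ram lives at Tellefsens gate 5")
print(fact)
print(generate(fact))
```

Both annotation phrasings map to the same triple, so knowledge entered one way can be read back in the single form chosen by the generation template.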
  • 33. Information Access Schemata
      Despite the simplicity of adding NL annotations to RDF properties:
      • Significant restriction: only one RDF statement can be queried at once.
      • Solution: create schemata that capture similar patterns of information access.
  • 34. An information access schema is a quadruple
      • Annotations: NL sentences (either declarative or interrogative) or phrases that describe the types of user questions the schema can answer.
      • Pattern: a declarative pattern of RDF triples that references a pre-existing ontology.
      • Action: a set of operators to further process variables bound during the pattern matching process.
      • Mapping: a mechanism for handling disjunction between lexical and ontological terms.
  • 35. Example: “family” of questions
      • What is the country in Asia with the largest area?
      • Tell me what Asian country has the highest population density.
      • What country in Europe has the lowest infant mortality rate?
      • What is the most populated American country?
  • 36. Capture the “pattern” of information requests in an information access schema
        <nl:InformationAccessSchema>
          <nl:ann>what country in $region has the largest $attribute</nl:ann>
          <nl:pattern>?x a :Country</nl:pattern>
          <nl:pattern>?x map($attribute) ?val</nl:pattern>
          <nl:pattern>?x :location $region</nl:pattern>
          <nl:action>display(boundto(?x, max(?val)))</nl:action>
          <nl:mapping>
            <nl:hash variable="$attribute">
              <nl:map value="population">:population</nl:map>
              <nl:map value="area">:area</nl:map>
              ...
            </nl:hash>
          </nl:mapping>
        </nl:InformationAccessSchema>
      • Natural language annotations are employed to describe a pattern of RDF statements.
      • Because annotations would be processed by linguistically sophisticated systems, different adjectives such as “highest” and “largest” could be uniformly mapped onto the maximum operation.
      • The schema answers questions that involve region-specific superlative comparison of countries.
  • 37. How the schema works
      • The pattern binds to the value of the particular attribute for countries within the queried geographic region, and the action specifies an aggregate operation (maximum) over the values bound within the pattern.
      • The country corresponding to that maximum value is returned as the answer.
      • The mapping provides a translation from language attributes to RDF properties.
      • Information access schemata are written with respect to a particular pre-existing ontology.
      • In this example, we assume that an appropriate ontology has been established (i.e., :Country is defined as a class, and :location is defined as a property).
      In this vision of the Semantic Web, information access schemata grounded in natural language would co-exist alongside RDF metadata.
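The four parts of the schema (annotation, pattern, action, mapping) can be traced in a short Python sketch. Everything below is a simplification under stated assumptions: the country data is invented (`:countryA` and friends are placeholders with made-up numbers), the annotation is matched with a generated regular expression instead of a linguistic parser, and the three patterns plus the max action are hard-coded in `answer`.

```python
import re

# Invented placeholder data; resources and numbers are not real.
TRIPLES = [
    (":countryA", "a", ":Country"), (":countryA", ":location", "Asia"),
    (":countryA", ":population", 100), (":countryA", ":area", 900),
    (":countryB", "a", ":Country"), (":countryB", ":location", "Asia"),
    (":countryB", ":population", 300), (":countryB", ":area", 200),
    (":countryC", "a", ":Country"), (":countryC", ":location", "Europe"),
    (":countryC", ":population", 500), (":countryC", ":area", 50),
]

ANNOTATION = "what country in $region has the largest $attribute"
MAPPING = {"population": ":population", "area": ":area"}


def annotation_regex(annotation):
    """Turn $variables in an annotation into named regex groups."""
    parts = re.split(r"(\$\w+)", annotation)
    out = []
    for part in parts:
        if part.startswith("$"):
            out.append("(?P<%s>.+?)" % part[1:])
        else:
            out.append(re.escape(part))
    return re.compile("^" + "".join(out) + "$", re.IGNORECASE)


def answer(question):
    match = annotation_regex(ANNOTATION).match(question.strip().rstrip("?"))
    if match is None:
        return None
    region = match.group("region")
    # Mapping: lexical term -> ontological property.
    prop = MAPPING.get(match.group("attribute").lower())
    if prop is None:
        return None
    # Pattern: ?x a :Country ; ?x map($attribute) ?val ; ?x :location $region
    bindings = []
    for x, p, o in TRIPLES:
        if p == "a" and o == ":Country" and (x, ":location", region) in TRIPLES:
            for s2, p2, val in TRIPLES:
                if s2 == x and p2 == prop:
                    bindings.append((val, x))
    # Action: return ?x bound to the maximum ?val.
    return max(bindings)[1] if bindings else None


print(answer("What country in Asia has the largest population?"))
```

Adding a new attribute to the family of questions only requires a new entry in `MAPPING`, which mirrors the role of the `nl:hash` block in the schema.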
  • 38. Further extension: Query Plan
      Question: What is the distance from India to Norway?
      Solution plan: compute the distance between their respective capitals.
      Could humans “teach” such plans to a computer directly?
  • 39. Information Planning Schemata
      An extension of Information Access Schemata that simplifies the task of knowledge engineering.
      Example:
      • Instead of writing RDF patterns, which would require knowledge of domain-specific ontologies,
      • use natural language itself to describe the process of answering a question.
      • The answer plan (nl:plan) reflects the user’s thought process expressed in natural language: first find the capitals of the countries, and then find the distance between those cities.
  • 40. An information planning schema
        <nl:InformationPlanningSchema>
          <nl:ann>distance between $country1 and $country2</nl:ann>
          <nl:plan>
            <rdf:Seq>
              <rdf:li>what is the capital of $country1 := ?capital1</rdf:li>
              <rdf:li>what is the capital of $country2 := ?capital2</rdf:li>
              <rdf:li>what is the distance between ?capital1 and ?capital2 := ?distance</rdf:li>
            </rdf:Seq>
          </nl:plan>
          <nl:action>display(?distance)</nl:action>
        </nl:InformationPlanningSchema>
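One way to picture how such a plan might execute is the sketch below: each step is a natural language sub-question whose answer is stored in a variable and substituted into later steps. This is an assumption-laden illustration. The `FACTS` dictionary is a hypothetical stand-in for the underlying QA system, the distance answer is a symbolic placeholder (no real figure is given in the talk), and the input must be phrased like the annotation because no linguistic normalization is attempted here.

```python
import re

ANNOTATION = "distance between $country1 and $country2"
PLAN = [
    "what is the capital of $country1 := ?capital1",
    "what is the capital of $country2 := ?capital2",
    "what is the distance between ?capital1 and ?capital2 := ?distance",
]

# Hypothetical stand-in for the QA system answering each sub-question.
# The distance value is a symbolic placeholder, not a real measurement.
FACTS = {
    "what is the capital of india": "New Delhi",
    "what is the capital of norway": "Oslo",
    "what is the distance between new delhi and oslo": "distance(New Delhi, Oslo)",
}


def ask(question):
    return FACTS[question.lower()]


def bind_annotation(annotation, request):
    """Match the request against the annotation and bind its $variables."""
    parts = re.split(r"(\$\w+)", annotation)
    pattern = "".join(
        "(?P<%s>.+?)" % p[1:] if p.startswith("$") else re.escape(p)
        for p in parts
    )
    match = re.match("^" + pattern + "$", request.strip(), re.IGNORECASE)
    if match is None:
        return None
    return {"$" + name: text for name, text in match.groupdict().items()}


def run_plan(request):
    env = bind_annotation(ANNOTATION, request)
    if env is None:
        return None
    for step in PLAN:
        question, variable = step.split(" := ")
        # Substitute every $var or ?var with its current binding.
        question = re.sub(r"[$?]\w+", lambda m: env[m.group(0)], question)
        env[variable.strip()] = ask(question)
    # Action: display(?distance)
    return env["?distance"]


print(run_plan("distance between India and Norway"))
```

The plan author never touches RDF patterns: each step is resolved by whatever schemata or annotations the QA system already has for that sub-question.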
  • 41. Integrating the methods
      The three proposed methods for integrating natural language and RDF can be used together to afford greater flexibility.
      • Annotating RDF properties is a low-cost (from a knowledge engineering perspective) way of providing natural language access to RDF statements.
      • Information access schemata, while being more complex and requiring knowledge of domain-specific ontologies, give experienced knowledge engineers fine-grained tools for manipulating RDF and controlling the output.
      • Information planning schemata allow users to describe, in natural language itself, how they would go about answering a particular class of questions.
      These three methods can combine to provide the foundation for question answering on the Semantic Web.
  • 42. Thank You!