Knowledge representation and reasoning (KR) is an area of artificial intelligence whose fundamental goal is to represent knowledge in a manner that facilitates inference (i.e. drawing conclusions) from that knowledge. It analyzes how to think formally: how to use a symbol system to represent a domain of discourse (that which can be talked about), along with functions that allow inference (formalized reasoning) about the objects in that domain. Knowledge representation is crucial for the systematic capture and fast access and retrieval of knowledge in knowledge management tasks.

When we design a knowledge representation (and a knowledge representation system to interpret sentences in the logic in order to derive inferences from them) we have to make choices across a number of design spaces. The single most important decision is the expressivity of the KR. The more expressive the KR, the easier and more compact it is to "say something". However, more expressive languages are harder to derive inferences from automatically. An example of a less expressive KR would be propositional logic. An example of a more expressive KR would be autoepistemic temporal modal logic. Less expressive KRs may be both complete and consistent (formally less expressive than set theory). More expressive KRs may be neither complete nor consistent.

Recent developments in KR have been driven by the Semantic Web, and have included development of XML-based knowledge representation languages and standards, including Resource Description Framework (RDF), RDF Schema, Topic Maps, DARPA Agent Markup Language (DAML), Ontology Inference Layer (OIL), and Web Ontology Language (OWL).
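To make the expressivity trade-off concrete, here is a minimal sketch of inference in propositional logic, the less expressive KR named above. It checks entailment by enumerating every truth assignment, which always terminates but costs 2^n checks for n symbols. The symbols `rain` and `wet` and the function name `entails` are illustrative assumptions, not part of the talk.

```python
from itertools import product


def entails(premises, conclusion, symbols):
    """Return True if every truth assignment that satisfies all premises
    also satisfies the conclusion (truth-table entailment)."""
    for values in product([False, True], repeat=len(symbols)):
        world = dict(zip(symbols, values))
        if all(p(world) for p in premises) and not conclusion(world):
            return False  # counterexample: premises hold, conclusion fails
    return True


# Hypothetical knowledge base: "rain implies wet" and "rain".
symbols = ["rain", "wet"]
premises = [
    lambda w: (not w["rain"]) or w["wet"],  # rain -> wet
    lambda w: w["rain"],                    # rain
]
wet = lambda w: w["wet"]

print(entails(premises, wet, symbols))      # both premises: wet follows
print(entails(premises[:1], wet, symbols))  # rule alone: wet does not follow
```

The exhaustive table is only feasible because the language is so limited; more expressive logics generally have no such brute-force procedure.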
So how do you do general KR, that is, KR that by design is regular enough that KRs for various specific purposes can be combined? How do you make a KR system with such broad applicability that all human knowledge can be expressed in it? Such questions have led to the Semantic Web and other efforts.
In computer science, particularly artificial intelligence, a number of representations have been devised to structure information. KR is most commonly used to refer to representations intended for processing by modern computers, and in particular to representations consisting of explicit objects (the class of all elephants, or Clyde, a certain individual) and of assertions or claims about them ('Clyde is an elephant', or 'all elephants are grey'). Representing knowledge in such explicit form enables computers to draw conclusions from knowledge already stored ('Clyde is grey').

Computational linguistics added much knowledge about language itself. One of the better known KR programming languages is Prolog. It was actually developed in 1972 but was not popular until roughly 1985. Do you remember the Fifth Generation Computing hype of that time, or have you heard of it? We thought Japan was going to build such powerful, even general, AI that the US had to put major energy into catching up. Prolog represents propositions and basic logic, and can derive conclusions from known premises. KL-ONE (1980s) is more specifically aimed at knowledge representation itself. In 1995, the Dublin Core metadata standard was conceived. Markup languages progressed from SGML to HTML to XML. These facilitated information retrieval and data mining efforts, which have in recent years begun to relate to knowledge representation.
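The Clyde example from these notes can be sketched as a tiny forward-chaining loop: explicit facts plus "all X are Y" rules, applied until nothing new can be derived. This is only an illustration in the spirit of what Prolog does with premises and conclusions, not Prolog's actual resolution strategy; the tuple encoding and the name `forward_chain` are assumptions.

```python
def forward_chain(facts, rules):
    """Apply rules repeatedly until no new facts appear.

    facts: set of (predicate, subject) tuples, e.g. ("elephant", "clyde")
    rules: list of (if_predicate, then_predicate) pairs meaning
           "everything that is if_predicate is also then_predicate"
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for if_pred, then_pred in rules:
            for pred, subject in list(derived):
                new_fact = (then_pred, subject)
                if pred == if_pred and new_fact not in derived:
                    derived.add(new_fact)
                    changed = True
    return derived


facts = {("elephant", "clyde")}   # 'Clyde is an elephant'
rules = [("elephant", "grey")]    # 'all elephants are grey'
kb = forward_chain(facts, rules)
print(("grey", "clyde") in kb)    # was 'Clyde is grey' derived?
```

The conclusion is never stored by hand; it falls out of the explicit objects and assertions, which is the point of representing knowledge this way.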
Development of the Semantic Web has included development of XML-based knowledge representation languages and standards, including RDF, RDF Schema, Topic Maps, DARPA Agent Markup Language (DAML), Ontology Inference Layer (OIL), and Web Ontology Language (OWL).

The Semantic Web is a "web of data" that enables machines to understand the semantics, or meaning, of information on the World Wide Web. Humans can do a variety of tasks using the web that machines cannot, because humans understand the semantics of those materials. They were designed to convey enough semantics to enable such human use. Machines can't use the same cues and contexts and are missing our "common sense". Machine readability allows deep automated processing of the web: for instance, cross-linking all content of particular types that discusses specific aspects of some subject, topic or situation, or finding all content that supports or undermines a particular hypothesis.

"I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A ‘Semantic Web’, which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The ‘intelligent agents’ people have touted for ages will finally materialize." – Tim Berners-Lee, 1999

Researchers could directly self-publish their experiment data in "semantic" format on the web. Semantic search engines could then make these data widely available. One example is the Open Cures project mentioned two weeks ago in the Longevity talk.

An ontology is a formal representation of knowledge as a set of concepts within a domain, and the relationships between those concepts. It can be used to reason about the entities within that domain, and may be used to describe the domain. Put another way, an ontology is a "formal, explicit specification of a shared conceptualisation".
The advantage of RDF is that it allows an unlimited amount of information about any subject to be recorded in a schema-independent way. There are common shortcuts in practice and many tools for more efficient editing and viewing. But for structured data it is nowhere near as concise as specifying a schema once and referring to it by data collection type. Note that RDF is pretty much limited to facts about instances. RDF Schema (RDFS) adds the ability to define types and a limited set of properties of types. OWL, on the other hand, is a language for describing ontologies, that is, conceptual mappings of a particular domain. OWL is compatible with RDFS but much more expressive, especially for reasoning about interrelated types.
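A minimal sketch of what RDFS adds on top of plain instance facts: given `rdf:type` and `rdfs:subClassOf` statements, further types can be inferred. The two property names are real RDF/RDFS vocabulary terms; the `ex:` resources and the function name `rdfs_subclass_closure` are hypothetical, and the code uses plain Python tuples rather than any RDF library.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def rdfs_subclass_closure(triples):
    """Add the triples implied by rdfs:subClassOf until a fixed point:
    (x type C) and (C subClassOf D) give (x type D);
    (A subClassOf B) and (B subClassOf C) give (A subClassOf C)."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            if p not in (RDF_TYPE, SUBCLASS_OF):
                continue
            for s2, p2, o2 in inferred:
                if p2 == SUBCLASS_OF and s2 == o:
                    new.add((s, p, o2))
        if new <= inferred:
            return inferred
        inferred |= new


# Hypothetical data: one instance fact and two schema facts.
data = {
    ("ex:Clyde", RDF_TYPE, "ex:Elephant"),
    ("ex:Elephant", SUBCLASS_OF, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS_OF, "ex:Animal"),
}
closure = rdfs_subclass_closure(data)
print(("ex:Clyde", RDF_TYPE, "ex:Animal") in closure)
```

OWL reasoning goes well beyond this kind of subclass propagation, which is why it needs dedicated reasoners.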
A class is a collection of objects. It corresponds to a description logic (DL) concept. A class may contain individuals, the instances of the class. A class may have any number of instances. An instance may belong to none, one or more classes.

A class may be a subclass of another, inheriting characteristics from its parent superclass. This corresponds to logical subsumption and DL concept inclusion, notated ⊑. All classes are subclasses of owl:Thing (DL top, notated ⊤), the root class. All classes are subclassed by owl:Nothing (DL bottom, notated ⊥), the empty class. No instances are members of owl:Nothing. Modelers use owl:Thing and owl:Nothing to assert facts about all or no instances.

An instance is an object. It corresponds to a description logic individual.

A property is a directed binary relation that specifies class characteristics. It corresponds to a description logic role. Properties are attributes of instances and sometimes act as data values or link to other instances. Properties may have logical characteristics such as being transitive, symmetric, inverse and functional. Properties may also have domains and ranges.

Datatype properties are relations between instances of classes and RDF literals or XML Schema datatypes. For example, modelName (String datatype) is a property of the Manufacturer class. They are declared using owl:DatatypeProperty.

Object properties are relations between instances of two classes. For example, ownedBy may be an object property of the Vehicle class and may have a range which is the class Person. They are declared using owl:ObjectProperty.

Languages in the OWL family support various operations on classes such as union, intersection and complement. They also allow class enumeration, cardinality, and disjointness.
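As a rough intuition for these notions, classes can be pictured as sets of individuals and properties as sets of pairs. The sketch below is a toy, closed-world model under that assumption; real OWL reasoners work under open-world semantics and over axioms rather than enumerated members. All individual names and the `part_of` property are hypothetical.

```python
# owl:Thing modelled as the set of all known individuals, owl:Nothing as the empty set.
thing = frozenset({"car1", "truck1", "alice", "bob"})
nothing = frozenset()

vehicle = frozenset({"car1", "truck1"})
car = frozenset({"car1"})
person = frozenset({"alice", "bob"})


def is_subclass(sub, sup):
    """Subsumption in this toy model: every member of sub is a member of sup."""
    return sub <= sup


def complement(cls):
    """Class complement, taken relative to owl:Thing."""
    return thing - cls


def transitive_closure(pairs):
    """Smallest transitive relation containing pairs (for a transitive property)."""
    closure = set(pairs)
    while True:
        new = {(a, d) for a, b in closure for c, d in closure if b == c}
        if new <= closure:
            return closure
        closure |= new


def symmetric_closure(pairs):
    """Add the reversed pair for every pair (for a symmetric property)."""
    return set(pairs) | {(b, a) for a, b in pairs}


# Object property ownedBy as (vehicle, person) pairs; part_of is a made-up transitive property.
owned_by = {("car1", "alice"), ("truck1", "bob")}
part_of = transitive_closure({("wheel", "car1"), ("car1", "fleet")})

print(is_subclass(car, vehicle), is_subclass(nothing, car), is_subclass(car, thing))
print(vehicle | person == thing, vehicle.isdisjoint(person), complement(vehicle) == person)
print(("wheel", "fleet") in part_of)
```

Union, intersection and disjointness map directly onto Python's set operators here, which is why sets make a convenient mental model even though they are not how OWL is implemented.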
A topic map consists of topics, representing any concept, from people, countries, and organizations to software modules, individual files, and events; associations, representing hypergraph relationships between topics; and occurrences, representing information resources relevant to a particular topic. Topics, associations and occurrences can all be typed. The collection of definitions of allowed types forms the ontology of the topic map.
http://www.topicmaps.org/
http://www.xml.com/pub/a/2002/09/11/topicmaps.html
http://www.isotopicmaps.org/
Knowledge Representation and Management Technologies for extended minds
Knowledge Representation Aspects
• How do we represent what we know?
  – Expressiveness can conflict with computability
• What aspects of what we know and their relationships are important?
  – Every KR is an explicit answer to this question
  – Every KR is a fragment of full reasoning
    • The subset useful to the problem at hand within tractable limits
  – The choice of KR limits
    • What can be captured/expressed
    • What sorts of questions may be tractably answered
    • Usefulness for human exploration and learning
    • Usefulness for computational exploration and learning
KR Desired Properties
• Coverage
  – Sufficient breadth and depth
• Understandable by humans
  – If for human use anyway. Useful for debugging in any case
• Consistency
• Efficiency
• Ease of modification
• Supports the applications / functions the KR was designed for
Historical Attempts
• 70s and early 80s
    • Heuristic question-answering, neural networks, theorem proving, expert systems (Mycin)
    • Cyc, starting in the mid 80s
      – Naïve physics, time notions, causality, motivation, common objects and classes of objects
• 90s to now
    • Computational linguistics
    • KR programming languages
    • SGML -> HTML -> XML
    • Semantic Web
Semantic Web
• KR of web content
  – Machine readable web content or description of content
  – Integration across different content, applications, systems
    • Enterprise Information Systems
  – Semantic publishing
    • Documents with semantic markup
  – RDF is most used currently
  – Two approaches
    • Information as data objects using semantic language (RDF, OWL)
    • Embed formal metadata within documents with new markup
      – RDFa, Microformats
Some ontologies and vocabularies
• Dublin Core
  – Resources, materials, media, text, web pages
• SKOS
  – Thesauri, taxonomies, classification schemes
• FOAF
  – Friend of a friend. Social network ontology
• SIOC
  – Interconnection of discussions, blogs, forums, mailing lists
• RSS
  – Syndication. Updates of blogs, news headlines, audio, video
• DOAP
  – Description of a project. 43000 OS projects in Freshmeat
• SPE
  – Scientific publishing experiment
Open Source Tools and Services
• Ambra Project
  – Publish open access journal with RDF
• Semantic MediaWiki
  – MediaWiki extension for semantic annotation and RDF publishing
• Swoogle
  – Search engine for ontologies and instance data
• Ufeed
  – Publishes RDF resources and feeds
• D2R Server
  – Publishes relational database on the web as Linked Data and SPARQL endpoints
• BigBlogZoo
  – Crawls and reaggregates 60000 XML sources under semantic URLs
• Utopia
  – Interactive documents
Resource Description Framework
• RDF basics
  – Subject predicate object
    • Typically all three are URIs to keep identity clear
    • Graphed as subject node, object node, predicate as labeled directed edge
  – Basically a lightweight binary relationship
  – Note similarity to Prolog entries
  – Structured information is broken into sets of RDF triples
  – Nodes, at least objects, can be containers of URIs
    • Containers are open (unbounded) bags
    • Collections are closed / complete
• RDF Schema (RDFS)
  – Defines types and classes of URIs and expected associations or information about types
    • IS-A and HAS-A relationships
    • Meaning details for types
    • Properties of classes
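The RDF basics on this slide can be sketched with plain tuples: a structured record is broken into (subject, predicate, object) triples, and the graph is queried by pattern matching with wildcards. The `foaf:name` and `foaf:knows` predicates are real FOAF terms; the `http://example.org/` URIs and the helper names are illustrative assumptions, and no RDF library is used.

```python
def record_to_triples(subject, record):
    """Break one structured record into a list of (subject, predicate, object) triples."""
    return [(subject, predicate, value) for predicate, value in record.items()]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


ALICE = "http://example.org/alice"  # hypothetical URIs
BOB = "http://example.org/bob"

graph = []
graph += record_to_triples(ALICE, {"foaf:name": "Alice", "foaf:knows": BOB})
graph += record_to_triples(BOB, {"foaf:name": "Bob"})

# Everything said about Alice (her outgoing edges in the graph).
print(match(graph, s=ALICE))
# Who knows Bob? (incoming foaf:knows edges)
print([s for s, _, _ in match(graph, p="foaf:knows", o=BOB)])
```

Because every statement has the same three-part shape, new facts about any subject can be appended without changing a schema, at the cost of verbosity compared with a fixed record layout.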
Web Ontology Language (OWL)
• Components
    • Classes
    • Instances
    • Properties
    • Datatype properties
    • Object properties
    • Operators
Topic Maps
• Components
  – Topics
  – Associations
  – Occurrences
• Similar to concept maps and mind maps
• Higher level of semantic abstraction than OWL and RDFS
• Fully supports merging of topic maps
• APIs
  – TMAPI
• Query
  – TMQL
• Constraint specification (unfinished)
  – TMCL
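The three components and the idea of merging can be sketched with small data classes. This is a simplified illustration, assuming that two topics are the same subject exactly when they share an identifier; the merging rules in the actual Topic Maps standard are richer, and this is not the TMAPI interface. All class, field and `ex:` names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    identifier: str                                 # what the topic stands for
    names: set = field(default_factory=set)
    occurrences: set = field(default_factory=set)   # URLs of relevant resources


@dataclass
class TopicMap:
    topics: dict = field(default_factory=dict)        # identifier -> Topic
    # Each association: (type, frozenset of (role, topic identifier)), so it can be n-ary.
    associations: set = field(default_factory=set)


def merge(a, b):
    """Merge two topic maps: topics with the same identifier become one topic."""
    merged = TopicMap()
    for topic_map in (a, b):
        for ident, topic in topic_map.topics.items():
            target = merged.topics.setdefault(ident, Topic(ident))
            target.names |= topic.names
            target.occurrences |= topic.occurrences
        merged.associations |= topic_map.associations
    return merged


map_a = TopicMap(
    topics={
        "ex:puccini": Topic("ex:puccini", {"Puccini"}, {"http://example.org/bio"}),
        "ex:tosca": Topic("ex:tosca", {"Tosca"}),
    },
    associations={
        ("composed-by", frozenset({("work", "ex:tosca"), ("composer", "ex:puccini")})),
    },
)
map_b = TopicMap(
    topics={"ex:puccini": Topic("ex:puccini", {"Giacomo Puccini"}, {"http://example.org/works"})},
)

merged = merge(map_a, map_b)
print(sorted(merged.topics["ex:puccini"].names))
```

After the merge, the single Puccini topic carries the names and occurrences contributed by both maps, while the association from the first map is preserved.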