Collaborative Ontology Building Project - a multiagent-based ontology editing and discovery environment
Jie Bao
Artificial Intelligence Research Laboratory, Dept of Computer Science, Iowa State University, Ames IA 50010
[email_address]
http://www.cs.iastate.edu/~baojie
Project homepage: http://boole.cs.iastate.edu:9090/COB/
A research proposal, Dec 02, 2003
COB
  Without SHOE how can you be a RACER?
  Without Sesame how can you make OIL?
  Semantic Web is a plan of good
  But with no ontology it’s only a nil.
  Everyone makes a small piece of brick
  Not in one day can we make Rome real.
  Let’s build ontology together and hard
  Just like ants build their hill.
Outline
  Objectives
  Key difficulties
  Background review
  A tentative framework
What is the problem
  The Semantic Web needs a general and open ontology library, but ontology building is a time-consuming, knowledge-intensive process.
    Domain experts are needed, and nobody has full knowledge.
    Intellectual property and copyright issues also hinder the wide use of commercial ontologies (e.g. Cyc).
    Automatic ontology discovery and mapping are still impossible in general.
  Existing ontology editing and discovery tools are standalone and too complex.
    They are not suitable for team ontology building.
    The jargon is daunting for ordinary people who know little about ontologies.
  Data sources are distributed, heterogeneous, and dynamic.
    New concepts appear every day: Election2004
Related problems
  Distributed learning
    Learning from distributed, heterogeneous, dynamic, multiple datasets
  Software engineering
    Concurrent version control and management
    Open-source issues (copyright vs. copyleft)
  Knowledge management
    Knowledge sharing within a group or project
    Automatic knowledge aggregation
Design Philosophy (1): about people
  Teamwork is needed
    Nobody can know everything
  But everyone is an expert in something
    Everybody knows something: your dog, your department, your favorite TV show
  You can build big things from small pieces
    One expert can write several articles for an encyclopedia
    And hundreds of experts can work together
  However, people always have different viewpoints
    Conflict: does the 21st century begin in 2000 or 2001?
    Redundancy: IraqWar, WarInIraq, GulfWarII
Design Philosophy (2): about agents and software
  Small pieces of ontology are generated by agents
    Those agents are domain experts or trained software agents
    A light-weight, browser-based ontology editor that requires minimal user effort
    Automatic and controllable information collection by software robots
  The ontology repository is maintained by machine learning algorithms
    Ontology mapping on controlled topics
    Detect and reduce redundancy and conflicts by inference
A Desirable Case: Pop Music Ontology (1)
  Suppose we want to build an ontology and knowledge base about pop music, called PopOnt.
  John is a teenage student who knows nothing about ontologies, but he knows a lot about pop music. He would like to share his knowledge with PopOnt.
  (Callouts: “Even kids know”; “I’m willing to spend 5 minutes for you”)
  There are millions of pop music fans like John, and their knowledge is complementary.
  Some of them may go to the PopOnt website and write one or two simple sentences, like [M. Jackson] [isn’t] a [country music artist].
  They may also correct others’ mistakes.
A Desirable Case: Pop Music Ontology (2)
  (Callouts: “You don’t even need to go to the website”; “An agent can be an expert, too”)
  There are also mailing lists, newsgroups, weblogs, P2P applications, and websites about pop music, which can be used for validation or mining.
  For example, if [M. Jackson] hardly ever co-occurs with [country music], it is more likely that [M. Jackson] [isn’t] a [country music artist] is true.
  It is more desirable if those articles have a subject, an abstract, or even keywords, which can be used as labeled instances for machine learning.
  New concepts can be mined and cross-validated by people, too.
  Finally, PopOnt is built in a couple of months and is free for everyone to use.
Key Difficulties 1: Logic breakdown
  How to make ontology editing as easy as writing a diary?
  [Figure: an ontology of classes and slots (Class, SubClass, SubSubClass) and instances, broken down into a list of [subject][predicate][object] sentences]
  Can a complex ontology be broken down into a group of single sentences?
  In other words, how can a complex description logic statement be decomposed into very simple FOPL sentences? The inverse composition is also needed.
  Each single sentence is as simple as “A is B” or “A has B”.
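The breakdown and its inverse can be illustrated with a minimal Python sketch. It covers only a plain class hierarchy, not general description logic statements, and the class names and the `decompose`/`compose` helpers are hypothetical, introduced here for illustration.

```python
from collections import defaultdict

# Hypothetical nested hierarchy: each class maps to its subclasses.
HIERARCHY = {
    "Class": {
        "SubClassA": {"SubSubClassA1": {}, "SubSubClassA2": {}},
        "SubClassB": {"SubSubClassB1": {}},
    }
}

def decompose(tree, parent=None):
    """Flatten a nested class tree into (subject, predicate, object) triples."""
    triples = []
    for name, children in tree.items():
        if parent is not None:
            triples.append((name, "rdfs:subClassOf", parent))
        triples.extend(decompose(children, name))
    return triples

def compose(triples):
    """Rebuild the nested tree from subClassOf triples (the inverse step)."""
    children = defaultdict(list)
    has_parent = set()
    nodes = []  # keeps first-seen order of class names
    for s, p, o in triples:
        if p != "rdfs:subClassOf":
            continue
        children[o].append(s)
        has_parent.add(s)
        for n in (o, s):
            if n not in nodes:
                nodes.append(n)

    def build(name):
        return {child: build(child) for child in children[name]}

    # Roots are classes that never appear as the subject of subClassOf.
    return {root: build(root) for root in nodes if root not in has_parent}

triples = decompose(HIERARCHY)
rebuilt = compose(triples)
```

Each triple can be edited on its own, which is the point of the breakdown. Note that a class with neither parent nor children produces no triple in this sketch, so it would not survive the round trip.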
Key Difficulties 2: Ontology Evolution
  How to refine an ontology through the cooperation of experts and software agents?
  People and agents are all error-prone. Interactive and iterative cross-validation is central.
  People are “lazy” and “natural”. A piece of ontology may first be written in short natural language and later refined by other people or agents into a more formal and more complex piece.
  Inference is needed to rule out conflicting information and to detect malicious or wrong information.
Key Difficulties 3: Ontology Mining
  Where to collect source information? Google search? No.
    Pull: agents search and learn where the “good” sources are. A source can be verified by whether it is well cited (referenced) or not.
    Push: information is automatically pushed to agents via credible channels.
  Automatic extraction is still impossible.
    It depends on NLP.
    Article summaries and keywords are helpful, especially when the summary overlaps with the existing ontology. Such summarized text can be used as labeled instances.
  Simplified tasks are feasible.
    Is the keyword a consistent concept?
    Are some keywords related?
  Comparison: in content-based retrieval from video databases, automatic discovery of semantics based on image processing and pattern recognition has proven not very successful. Semantics from expert knowledge are needed in MPEG-7 streams.
Key Difficulties 4: Ontology Mapping
  People always give the same thing different names, or divide concepts into groups in multiple ways.
  Automatic general ontology mapping is still hard.
  Simplified mapping is more feasible while still useful.
    Check whether a pair of concepts (with instances) is the same or not.
    Detect redundancy and suggest merges.
Beyond INDUS
  INDUS is a distributed learning system, while COB is a MAS learning system.
    Agents in different channels have different learning focuses.
    They work together toward the same goal.
  INDUS has a heavy-weight database mechanism, while COB aims at a light-weight implementation.
    The ontology/KB is stored as atomic sentences.
    The interface is for dummies, not for gurus.
    Data sources are usually small but change quickly, and their number is huge.
  Queries use the inference power of the ontology language.
Semantic Web meets MAS
  COB is an application of MAS learning from data on the web.
    Learn new concepts from instances.
    Validate the concepts of other agents and humans.
    The learner can take any form: Bayes net, neural net, decision tree, KNN.
  Everything is about semantics.
    Agents share an ontology but also have dialect issues.
    Small pieces of semantics are carried by agents and aggregated at the “home”.
    Guess semantics from labeled instances.
  COB is an application that shows how to implement proof and trust on the Semantic Web.
Ready Techniques
  Dynamic knowledge sharing
    RSS (RDF Site Summary): answers questions like "Who wrote this?", "When was this published?", and "What is/are the topic(s) of discussion?"
    RSS is widely used for news aggregation and automatic news discovery.
  Grid/social computation
    Grid: distribute the computation task across the Internet and compose the results.
    Blog and wiki: easy-to-use site-building tools, instead of an HTML editor. Topics are refined by the effort of a community.
  Peer-to-peer communication
    A local repository can be shared with other peers.
    The other peer can be an agent in COB!
  However, they all somehow lack semantics. The unfiltered information may flood the user.
Collaborative Ontology Building Example: FOAF
  http://xml.mfd-consult.dk/foaf/explorer/
  FOAF is an acronym for Friend of a Friend, an experimental project and vocabulary for the Semantic Web.
  It is based on the idea of a machine-readable version of the current World Wide Web, with homepages, mailing lists, travel itineraries, calendars, address books, and the like.
  Everyone can join and add their own information.
  It is RDF-based.
Collaborative Ontology Building Example: Wikipedia
  170,000 concepts in English alone, more in other languages
  An open encyclopedia
  Everyone can edit any page
    Based on the assumption that most people are nice
    And it has proven true!
  Limitation: the relations between items are not formal, and the content is only human-readable (at least for now)
Collaborative Ontology Building Example: Open Directory Project
  http://www.dmoz.org/
  60,000 editors
  460,000 concepts
  Collaborative taxonomy building
  Open to everyone
  Limitation: taxonomy only
System design
  [Architecture diagram with four parts labeled A to D]
  A: OntoWiki with OWL-like syntax, edited by human experts through a browser
  B: Semantic RSS-aware channels over email lists, newsgroups, forums, blogs, wikis, and P2P nodes
  C: Agents: ontology mining, ontology alignment
  D: Ontology repository: version control, redundancy check, conflict check, cross validation
Part A (1): OntoWiki
  Everyone can edit any concept
  Version control is enabled
  Ontology-guided editing
  Should have an ontology visualizer
Part A (2): OWL-like syntax
  A subset of OWL is used.
  Each statement is an RDF-like triple: [subject] [predicate] [object]
  Namespaces are used: cob:instanceOf, owl:Class, rdfs:subClassOf
  The core COB language is defined in its own namespace (see the term list).

  // COB terms
  cob:equals cob:documentation
  // OWL terms
  owl:AllDifferent owl:allValuesFrom owl:backwardCompatibleWith owl:cardinality owl:Class
  owl:complementOf owl:DatatypeProperty owl:DeprecatedClass owl:DeprecatedProperty
  owl:differentFrom owl:disjointWith owl:distinctMembers owl:equivalentClass
  owl:equivalentProperty owl:FunctionalProperty owl:hasValue owl:imports owl:incompatibleWith
  owl:intersectionOf owl:InverseFunctionalProperty owl:inverseOf owl:maxCardinality
  owl:minCardinality owl:Nothing owl:ObjectProperty owl:oneOf owl:onProperty owl:Ontology
  owl:priorVersion owl:Restriction owl:sameAs owl:someValuesFrom owl:SymmetricProperty
  owl:Thing owl:TransitiveProperty owl:unionOf owl:versionInfo
  rdf:List rdf:nil rdf:type
  rdfs:comment rdfs:Datatype rdfs:domain rdfs:label rdfs:Literal rdfs:range
  rdfs:subClassOf rdfs:subPropertyOf
Part A (3): Instance Example
  Source (wiki page BaoJie):
    # [cob:Instance]
    # [cob:instanceOf] [Student]
    # [cob:instanceOf] [Chinese]
    # [cob:equals] [鲍捷]
    # [hasSurname] Bao
    # [hasFirstname] Jie
    # [worksOn] [semanticWeb]
    # [worksOn] [MAS]
    # [worksOn] [complexSystem]
    # [advisedBy] [Honavar]
    # [memberOf] [aiLab]
    # [hasEmail] baojie@cs.iastate.edu
    # [hasHomepage] http://www.cs.iastate.edu/~baojie
    # [cob:documentation] Hi, I love cats
  Screen shows:
    BaoJie
    cob:Instance
    cob:instanceOf Student ?
    cob:instanceOf Chinese ?
    cob:equals 鲍捷
    hasSurname Bao
    hasFirstname Jie
    worksOn semanticWeb ?
    worksOn MAS ?
    worksOn complexSystem ?
    advisedBy Honavar ?
    memberOf aiLab ?
    hasEmail baojie@cs.iastate.edu
    hasHomepage http://www.cs.iastate.edu/~baojie
    cob:documentation Hi, I love cats
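As a rough illustration of how such a page source could be turned into triples, here is a minimal Python sketch. The `parse_page` helper, the regular expression, and the "concept"/"literal" tags are assumptions made for this example; the slides do not specify a parser. The page name is taken as the subject of every statement.

```python
import re

# One wiki line: "# [predicate] [object]" or "# [predicate] literal text".
LINE = re.compile(r"^#\s*\[([^\]]+)\]\s*(.*)$")

SAMPLE = """\
# [cob:Instance]
# [cob:instanceOf] [Student]
# [cob:equals][ 鲍捷 ]
# [hasSurname] Bao
# [cob:documentation] Hi, I love cats
"""

def parse_page(subject, source):
    """Return (subject, predicate, object, kind) tuples for one wiki page."""
    triples = []
    for raw in source.splitlines():
        match = LINE.match(raw.strip())
        if not match:
            continue
        predicate = match.group(1).strip()
        rest = match.group(2).strip()
        if not rest:
            # A lone "[cob:Instance]" carries no object; this sketch skips it
            # (an assumption, the slides do not say how it is interpreted).
            continue
        if rest.startswith("[") and rest.endswith("]"):
            triples.append((subject, predicate, rest[1:-1].strip(), "concept"))
        else:
            triples.append((subject, predicate, rest, "literal"))
    return triples

parsed = parse_page("BaoJie", SAMPLE)
```

Bracketed objects are treated as references to other concept pages, and unbracketed text as literal values.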
Part A (4): Name Space
  Java-like package naming, which shows the relatedness of concepts even when they don’t inherit from the same concept.
  Packages form a DAG.
  Internationalization is enabled.

  //cob:Thing.Country.US.Iowa.Ames.ISU
  //cob:Thing.Education.University.Iowa.ISU
  [cob:instanceOf] [PublicUniversity]
  [cob:instanceOf] [dmoz:University]
  [cob:equals] [Iowa State University]

  //cobZH:事物.美国大学.艾奥瓦州立大学
  [cob:language] zh // Chinese
  [cob:equals] [cob:Thing.Country.US.Iowa.Ames.ISU]

  //cob:Thing.Education.University.Idaho.ISU
  [cob:instanceOf] [PublicUniversity]
  [cob:instanceOf] [dmoz:University]
  [cob:equals] [Idaho State University]
Part B: Semantic RSS
  RSS has no semantics.
  We can use Dublin Core to enhance RSS.
  Keywords are concepts or concept candidates in the ontology.
  Agents listen to S-RSS channels and discover new concepts.

  <channel rdf:about="http://boole.cs.iastate.edu:9090/COB/">
    <title>COB Project</title>
    <link>http://boole.cs.iastate.edu:9090/COB/</link>
    <description>AI Ontology</description>
    <language>en-us</language>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://boole.cs.iastate.edu:9090/COB/Wiki.jsp?page=Main" />
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://boole.cs.iastate.edu:9090/COB/Wiki.jsp?page=Main">
    <title>Main</title>
    <link>http://boole.cs.iastate.edu:9090/COB/Wiki.jsp?page=Main</link>
    <description>129.186.93.7 changed this page on Wed Dec 03 19:18:23 CST 2003:&lt;br />&lt;hr />&lt;br /></description>
    <wiki:version>27</wiki:version>
    <wiki:diff>http://boole.cs.iastate.edu:9090/COB/Diff.jsp?page=Main&amp;r1=-1</wiki:diff>
    <dc:date>2003-12-04T01:18:23Z</dc:date>
    <dc:contributor>
      <rdf:Description>
        <rdf:value>129.186.93.7</rdf:value>
      </rdf:Description>
    </dc:contributor>
    <wiki:history>http://boole.cs.iastate.edu:9090/COB/PageInfo.jsp?page=Main</wiki:history>
  </item>
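A listening agent might read such a channel roughly as in the Python sketch below. Several things here are assumptions rather than part of the proposal: the feed is wrapped in an rdf:RDF root with the usual RSS 1.0, RDF, and Dublin Core namespace declarations (the snippet above shows none), keywords are carried in dc:subject elements (the snippet has none), and `candidate_concepts` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Illustrative feed: the dc:subject keywords are invented for this sketch.
FEED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <item rdf:about="http://boole.cs.iastate.edu:9090/COB/Wiki.jsp?page=Main">
    <title>Main</title>
    <link>http://boole.cs.iastate.edu:9090/COB/Wiki.jsp?page=Main</link>
    <dc:date>2003-12-04T01:18:23Z</dc:date>
    <dc:subject>semanticWeb</dc:subject>
    <dc:subject>MAS</dc:subject>
  </item>
</rdf:RDF>"""

def candidate_concepts(feed_xml, known_concepts):
    """Label each item keyword as a known concept or a new candidate."""
    root = ET.fromstring(feed_xml)
    found = []
    for item in root.findall("rss:item", NS):
        title = item.findtext("rss:title", default="", namespaces=NS)
        for subject in item.findall("dc:subject", NS):
            keyword = (subject.text or "").strip()
            if keyword:
                status = "known" if keyword in known_concepts else "candidate"
                found.append((title, keyword, status))
    return found

result = candidate_concepts(FEED, known_concepts={"MAS"})
```

Keywords already in the ontology confirm an existing concept, while unknown ones become candidates for later cross-validation.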
Part C (1): Agent
  Each agent
    Traces back the information source and checks its credibility
    Does filtering and text normalization
    Extracts new concepts from instances
    Extracts possible general relationships (like [cob:alsoSee]) between concepts
  Agents may differ
    They do not necessarily use the same learning algorithm
    Learning from email headers is different from learning from free-text content
  Dialect
    Agent 1: I listen to the Idaho S.U. mailing list and know ISU = Idaho State University
    Agent 2: I watch a blog in Iowa and know ISU = Iowa State University
  Communication helps
    Agent 1: P([M. Jackson]^[CountryMusic])=0.1
    Agent 2: P([M. Jackson]^[CountryMusic])=0.03
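One simple way communication could help is to pool the agents' estimates. The Python sketch below uses a credibility-weighted average, which is an assumption for illustration only; the proposal does not say how estimates are combined, and the `pool_estimates` helper and the equal weights are hypothetical.

```python
def pool_estimates(estimates):
    """Combine (probability, credibility_weight) pairs by weighted average."""
    total_weight = sum(weight for _, weight in estimates)
    if total_weight <= 0:
        raise ValueError("at least one estimate needs a positive weight")
    return sum(p * weight for p, weight in estimates) / total_weight

# Agent 1 reports 0.1 and Agent 2 reports 0.03; equal credibility is assumed.
pooled = pool_estimates([(0.1, 1.0), (0.03, 1.0)])
```

With equal weights this is the plain mean of the two estimates. An agent with a higher credibility score would pull the pooled value toward its own estimate.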
Part C (2): Ontology Alignment
  Do mapping on restricted cases
    When an agent or expert doubts whether some concepts are the same, it asks OntologyAlignmenter, supplying the instance sets
    Merge detected duplicate concepts like IraqWar and WarInIraq
    Be careful: UniversityOfWashington and WashingtonUniversity are different. This can be learned from instances.
  Manual alignment is enabled, too
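Instance-based comparison can be sketched in Python as follows. Jaccard overlap of instance sets is one possible measure, chosen here as an assumption; the instance data, the 0.5 threshold, and the `suggest_merge` helper are illustrative and not part of the proposal.

```python
def jaccard(a, b):
    """Share of instances two concepts have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def suggest_merge(concepts, threshold=0.5):
    """Return (concept, concept, score) for pairs whose instances overlap enough."""
    names = sorted(concepts)
    suggestions = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            score = jaccard(concepts[first], concepts[second])
            if score >= threshold:
                suggestions.append((first, second, score))
    return suggestions

# Illustrative instance sets attached to each concept.
EXAMPLE = {
    "IraqWar": {"Baghdad", "2003", "Coalition"},
    "WarInIraq": {"Baghdad", "2003", "Basra"},
    "UniversityOfWashington": {"Seattle", "Huskies"},
    "WashingtonUniversity": {"StLouis", "Bears"},
}

merges = suggest_merge(EXAMPLE)
```

The similarly named universities share no instances, so they are not suggested for merging, while the two war concepts are. A suggestion would still go to an expert or to cross-validation before any merge.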
Part D: Ontology Repository
  Version control
    Keep versions for each concept, lock mature concepts, detect malicious changes
  Redundancy check
    [I.S.U] [cob:instanceOf] [University]
    [I.S.U] [cob:alsoSee] [Cyclone]
    [Iowa State University] [cob:instanceOf] [PublicUniversity]
    [Iowa State University] [cob:alsoSee] [Cyclone]
    [PublicUniversity] [cob:subClassOf] [University]
  Conflict check
    [ISU] [locatedIn] [Ames]
    [ISU] [locatedIn] [Des Moines]
  Cross validation
    Score each agent and expert for credibility
    Check the soundness of an input against its peers’ inputs
  Refactoring (rename, remove, merge)
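The redundancy and conflict examples above can be checked mechanically. The Python sketch below is one possible reading, under stated assumptions: locatedIn is treated as single-valued, two concepts count as possibly redundant when they share a type (after following cob:subClassOf) and a cob:alsoSee target, and all function names are hypothetical.

```python
from collections import defaultdict

TRIPLES = [
    ("I.S.U", "cob:instanceOf", "University"),
    ("I.S.U", "cob:alsoSee", "Cyclone"),
    ("Iowa State University", "cob:instanceOf", "PublicUniversity"),
    ("Iowa State University", "cob:alsoSee", "Cyclone"),
    ("PublicUniversity", "cob:subClassOf", "University"),
    ("ISU", "locatedIn", "Ames"),
    ("ISU", "locatedIn", "Des Moines"),
]

def superclasses(cls, triples):
    """Return cls plus every class reachable through cob:subClassOf."""
    seen, stack = {cls}, [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == "cob:subClassOf" and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen

def find_conflicts(triples, functional=("locatedIn",)):
    """Report single-valued predicates that were given several objects."""
    values = defaultdict(set)
    for s, p, o in triples:
        if p in functional:
            values[(s, p)].add(o)
    return {key: sorted(objs) for key, objs in values.items() if len(objs) > 1}

def find_redundant(triples):
    """Pair up concepts that share an inferred type and an alsoSee target."""
    types, see = defaultdict(set), defaultdict(set)
    for s, p, o in triples:
        if p == "cob:instanceOf":
            types[s] |= superclasses(o, triples)
        elif p == "cob:alsoSee":
            see[s].add(o)
    names = sorted(types)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if types[a] & types[b] and see[a] & see[b]
    ]

conflicts = find_conflicts(TRIPLES)
redundant = find_redundant(TRIPLES)
```

Both checks only flag candidates. Whether to merge or which location to keep is left to cross validation.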
Summary
  What’s new
    Light-weight ontology editor for the community
    Collaborative, distributed ontology learning based on logic decomposition
    Semantic extension to RSS
    Multiagent ontology mining from trusted channels
    Ontology management based on proof and trust
  COB doesn’t want to
    Solve ontology mapping in general
    Solve ontology extraction from free text in general
