Collaborative Ontology Building Project - a multiagent-based ontology editing and discovery environment
Jie Bao
Artificial Intelligence Research Laboratory, Dept of Computer Science, Iowa State University, Ames IA 50010
[email_address]
http://www.cs.iastate.edu/~baojie
Project homepage: http://boole.cs.iastate.edu:9090/COB/
A research proposal, Dec 02, 2003
Suppose we want to build an ontology and knowledge base about pop music, called PopOnt.
Even kids know
John is a teenage student who knows nothing about ontologies, but he knows a lot about pop music. He would like to share his knowledge with PopOnt.
I’m willing to spend 5 minutes for you
There are millions of pop music fans like John, and their knowledge complements each other's. Some of them may go to the PopOnt website and write one or two simple sentences, like [M. Jackson] [isn’t] a [country music artist]. They may also correct others’ mistakes.
You don’t even need to go to the website
There are also mailing lists, newsgroups, weblogs, p2p applications and websites about pop music, which can be used for validation or mining. For example, if [M. Jackson] rarely co-occurs with [country music], it is more likely that [M. Jackson] [isn’t] a [country music artist] is true.
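The co-occurrence check above can be sketched in a few lines. This is only an illustration, not the project's actual method: the function names, the substring matching, and the 0.1 threshold are all assumptions, and the sample posts are made up.

```python
def cooccurrence_rate(docs, entity, concept):
    """Fraction of documents mentioning `entity` that also mention `concept`.

    Returns None when no document mentions the entity (no evidence either way).
    Matching is naive case-insensitive substring search (an assumption).
    """
    entity, concept = entity.lower(), concept.lower()
    with_entity = [d.lower() for d in docs if entity in d.lower()]
    if not with_entity:
        return None
    both = sum(1 for d in with_entity if concept in d)
    return both / len(with_entity)


def supports_negation(docs, entity, concept, threshold=0.1):
    """Treat a low co-occurrence rate as weak evidence for '[entity] isn't a [concept]'."""
    rate = cooccurrence_rate(docs, entity, concept)
    return rate is not None and rate < threshold


# Hypothetical posts collected from mailing lists or weblogs.
posts = [
    "M. Jackson released Thriller in 1982",
    "M. Jackson is the king of pop",
    "Best country music albums of the year",
    "M. Jackson moonwalk tutorial",
]
print(cooccurrence_rate(posts, "M. Jackson", "country music"))
print(supports_negation(posts, "M. Jackson", "country music"))
```

Such a score is only a hint. A sentence it supports would still need confirmation from human contributors.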
Agents can be experts, too.
It is more desirable if those articles have a subject, an abstract, or even keywords, which can be used as labeled instances for machine learning. New concepts can be mined and then cross-validated by people, too.
Finally, PopOnt is built in a couple of months and is free for everyone to use.
Classes and Slots; Instances
Can a complex ontology be broken down into a group of single sentences? In other words, how can a complex description logic statement be decomposed into very simple FOPL sentences? The inverse composition is also needed. Each single sentence is as simple as "A is B" or "A has B".
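As a rough illustration of decomposition and its inverse, the sketch below flattens a frame-like structure (subjects with "is" and "has" slots) into single sentences and rebuilds it. The `Sentence` type and the `decompose` and `compose` functions are hypothetical names. The sketch covers only flat slot/value frames, not full description logic (no negation, quantifiers, or nested expressions).

```python
from collections import defaultdict
from typing import NamedTuple


class Sentence(NamedTuple):
    subject: str
    relation: str  # e.g. "is" or "has"
    obj: str


def decompose(frames):
    """Flatten {subject: {relation: [objects]}} into simple sentences."""
    sentences = []
    for subject, slots in frames.items():
        for relation, values in slots.items():
            for value in values:
                sentences.append(Sentence(subject, relation, value))
    return sentences


def compose(sentences):
    """Inverse step: group simple sentences back into frames."""
    frames = defaultdict(lambda: defaultdict(list))
    for s in sentences:
        frames[s.subject][s.relation].append(s.obj)
    return {subject: dict(slots) for subject, slots in frames.items()}


# Hypothetical fragment of PopOnt.
frames = {
    "M. Jackson": {"is": ["pop artist"], "has": ["album Thriller"]},
    "pop artist": {"is": ["artist"]},
}
sentences = decompose(frames)
for s in sentences:
    print(f"[{s.subject}] [{s.relation}] [{s.obj}]")
```

Because every sentence stands alone, each one can be contributed, corrected, or validated independently before being composed back into the ontology.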
Pull: agents search for and learn where the “good” sources are. A source's quality can be verified by whether it is well cited (referenced) or not.
Push: information is automatically pushed to agents via credible channels.
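One simple way to approximate "well cited" in the pull model is to count inbound references per source. The sketch below assumes citation data is available as (citing, cited) pairs. The `rank_sources` name, the source names, and the choice to ignore self-citations are illustrative assumptions, not part of the proposal.

```python
from collections import Counter


def rank_sources(citations):
    """Rank sources by number of distinct inbound citations.

    `citations` is an iterable of (citing_source, cited_source) pairs.
    Duplicates and self-citations are ignored; ties are broken by name.
    """
    inbound = Counter()
    for citing, cited in set(citations):
        if citing != cited:
            inbound[cited] += 1
    return sorted(inbound.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical reference links between pop music sources.
links = [
    ("blogA", "fansite"),
    ("blogB", "fansite"),
    ("fansite", "blogA"),
    ("blogA", "blogA"),  # self-citation, ignored
]
print(rank_sources(links))
```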
Automatic extraction is still impossible
Depends on NLP
Article summaries and keywords are helpful, especially when the summary overlaps with the existing ontology.
Such summarized text can be used as labeled instances.
Simplified tasks are feasible
Is the keyword a consistent concept?
Are some keywords related?
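The two simplified tasks above can be approximated without full NLP. In this sketch a keyword counts as a known concept if it matches an existing ontology term, and two keywords count as related if they appear together on enough articles. The function names, the exact-match rule, and the `min_count` threshold are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations


def known_keywords(keywords, ontology_terms):
    """Return the keywords that match an existing ontology term (case-insensitive)."""
    terms = {t.lower() for t in ontology_terms}
    return [k for k in keywords if k.lower() in terms]


def related_pairs(keyword_lists, min_count=2):
    """Return keyword pairs that co-occur on at least `min_count` articles."""
    counts = Counter()
    for keywords in keyword_lists:
        unique = sorted({k.lower() for k in keywords})
        counts.update(combinations(unique, 2))
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Hypothetical keyword lists attached to three articles.
articles = [
    ["pop", "M. Jackson", "Thriller"],
    ["pop", "M. Jackson"],
    ["country", "Nashville"],
]
print(known_keywords(["Pop", "Jazz"], ["pop", "country"]))
print(related_pairs(articles))
```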
Comparison: In content-based retrieval from video databases, automatic discovery of semantics based on image processing / pattern recognition has not proven very successful. Semantics from expert knowledge are needed in an MPEG-7 stream.