Learning Knowledge Rich User Models from the Semantic Web
  Gunnar Aastrand Grimnes
  User Modeling 2003 – Doctoral Consortium
  24th June, 2003
Presentation Overview
  Motivation
  Part I - Preliminary Experiments
  Part II - Agentcities & GraniteNights
  The Future
Motivation
  The Semantic Web should:
    Facilitate learning from the Web.
    Facilitate reuse of learning outcomes.
  Hypothesis: Learning from data annotated with semantic mark-up should outperform learning from the traditional (HTML) Web.
  Goals:
    The learned user model should be expressed in a Semantic Web language.
    Such a learned model should be re-usable across domains and applications.
Part I - Preliminary Experiments
  Compare performance of learning from plain text and from semantic meta-data.
  Using traditional ML algorithms as the baseline approach:
    Naïve Bayes
    K-Nearest Neighbour
  Explore application of more knowledge-intensive approaches, such as ILP (Progol).
  "An Empirical Investigation of Learning From the Semantic Web", Pete Edwards, Gunnar AA. Grimnes and Alun Preece. Presented at the Semantic Web Mining Workshop at ECML/PKDD, Helsinki, 2002.
Issues
  Datasets in a Semantic Web language were very hard to come by.
  We used two datasets:
    ITTalks (seminars described using HTML vs. DAML+OIL).
    Citeseer (full text of academic papers vs. BibTeX converted to RDF).
  How does RDF map to an instance representation suitable for learning?
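One simple answer to the mapping question can be sketched as follows: group triples by subject and treat every predicate/object pair as a binary feature. This is only an illustrative assumption, not necessarily the mapping used in the experiments; the sample triples and the function names are hypothetical.

```python
from collections import defaultdict


def triples_to_instances(triples):
    """Group (subject, predicate, object) triples into one feature set per subject."""
    instances = defaultdict(set)
    for subject, predicate, obj in triples:
        instances[subject].add(f"{predicate}={obj}")
    return dict(instances)


def to_vectors(instances):
    """Turn feature sets into binary vectors over a shared, sorted vocabulary."""
    vocabulary = sorted({f for feats in instances.values() for f in feats})
    vectors = {
        subject: [1 if f in feats else 0 for f in vocabulary]
        for subject, feats in instances.items()
    }
    return vocabulary, vectors


# Hypothetical triples standing in for BibTeX-derived RDF.
triples = [
    ("paper1", "publisher", "Morgan Kaufmann"),
    ("paper1", "booktitleword", "learning"),
    ("paper2", "publisher", "Springer"),
]
instances = triples_to_instances(triples)
vocabulary, vectors = to_vectors(instances)
```

Vectors of this kind could be fed to Naïve Bayes or K-Nearest Neighbour, but the flattening discards the graph structure, which is one reason a relational learner such as Progol is of interest.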
Results
  Largely negative.
  K-Nearest Neighbour on plain text had the best accuracy ...
  ... but: 10 lines of RDF vs. 6000 words of full-text paper.
  Reasons for failure:
    Shallow and artificial RDF.
    Statistical methods used.
  Progol results were the most interesting, although it generated lots of non-compressing rules.
    % Classifying Machine Learning papers:
    inClass(A,'ML') :-
        publisher(A,'Morgan Kaufmann'),
        booktitleword(A,learning).
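The Progol rule above says that a paper belongs to the ML class when its publisher is Morgan Kaufmann and its book title contains the word "learning". A minimal Python reading of that rule, assuming each paper is a dictionary with hypothetical `publisher` and `booktitlewords` keys, might look like this:

```python
def in_class_ml(paper):
    """Return True when both body literals of the rule hold for this paper."""
    return (
        paper.get("publisher") == "Morgan Kaufmann"
        and "learning" in paper.get("booktitlewords", set())
    )


# Hypothetical papers for illustration only.
papers = {
    "paper1": {"publisher": "Morgan Kaufmann", "booktitlewords": {"machine", "learning"}},
    "paper2": {"publisher": "Morgan Kaufmann", "booktitlewords": {"databases"}},
}
labels = {name: in_class_ml(paper) for name, paper in papers.items()}
```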
Part II - Agentcities @ Aberdeen & GraniteNights
GraniteNights
  Raison d'être:
    Agentcities Agent Technology Competition.
    Need a Semantic Web framework for learning user models.
    Bring together different people/research areas in the department: agents, learning, scheduling, constraints, etc.
    Proof that RDF is usable!
  "GraniteNights - A Multi-Agent Visit Scheduler Utilising Semantic Web Technology", Gunnar AA. Grimnes, Stuart Chalmers, Pete Edwards and Alun Preece. Accepted for CIA2003.
GraniteNights - Example
GraniteNights - Architecture
Query By Example
  RDQL is too complicated to write by hand.
  Query by example is very intuitive.
  Internal conversion to RDQL.
  Could be "smarter" than RDQL.
  Example query:
    <q:Query>
      <q:template>
        <akt:Academic>
          <akt:family-name>Brown</akt:family-name>
        </akt:Academic>
      </q:template>
    </q:Query>
  Equivalent RDQL:
    SELECT ?x
    WHERE (?x, ?y, ?z),
          (?x, <rdf#type>, <akt#Academic>),
          (?x, <akt#family-name>, "Brown")
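The internal conversion from a template to RDQL could be sketched roughly as below. This is not the GraniteNights implementation: the function `template_to_rdql` and its arguments are hypothetical, the abbreviated URIs are copied from the slide as-is, and the unconstrained `(?x, ?y, ?z)` pattern shown on the slide is left out.

```python
def template_to_rdql(class_uri, properties, type_uri="<rdf#type>"):
    """Build an RDQL query selecting resources of class_uri with the given literal values."""
    # One triple pattern for the class, then one per property in the template.
    patterns = [f"(?x, {type_uri}, {class_uri})"]
    for prop_uri, literal in properties.items():
        patterns.append(f'(?x, {prop_uri}, "{literal}")')
    return "SELECT ?x WHERE " + ", ".join(patterns)


query = template_to_rdql("<akt#Academic>", {"<akt#family-name>": "Brown"})
print(query)
```

Each property element in the template becomes one triple pattern on the same variable, which is why a nested example is easier to write by hand than the flat query.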
QbEx with constraints
    <q:Query>
      <q:template>
        <r:Restaurant>
          <r:type rdf:resource="r#Tandoori"/>
          <r:open-time>
            <cif:Variable rdf:ID="x">
              <cif:varname>x</cif:varname>
            </cif:Variable>
          </r:open-time>
        </r:Restaurant>
      </q:template>
      <q:constraints>
        <cif:Comparison>
          <cif:comparisonOperator>&gt;</cif:comparisonOperator>
          <cif:comparisonOp1>
            <cif:Variable rdf:about="#x"/>
          </cif:comparisonOp1>
          <cif:comparisonOp2>
            <cif:Integerconst>
              <cif:constantValue>1900</cif:constantValue>
              ...
  Give me a restaurant that is open after X, where X > 1900, i.e. give me a restaurant open after 7 pm.
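How a template with a variable and a comparison constraint selects candidates can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `Var` class, the `matches` function, the dictionary encoding of restaurants, and the sample data are hypothetical, and times are plain integers in HHMM form as on the slide.

```python
import operator
from dataclasses import dataclass

# Comparison operators a constraint may name.
OPERATORS = {">": operator.gt, "<": operator.lt, "=": operator.eq}


@dataclass(frozen=True)
class Var:
    """A named variable that may appear as a property value in a template."""
    name: str


def matches(candidate, template, constraints):
    """Check fixed template values, bind variables, then test every constraint."""
    bindings = {}
    for prop, expected in template.items():
        if prop not in candidate:
            return False
        if isinstance(expected, Var):
            bindings[expected.name] = candidate[prop]
        elif candidate[prop] != expected:
            return False
    return all(OPERATORS[op](bindings[var], const) for var, op, const in constraints)


# Hypothetical restaurants.
restaurants = [
    {"name": "A", "type": "Tandoori", "open-time": 2200},
    {"name": "B", "type": "Tandoori", "open-time": 1800},
    {"name": "C", "type": "Italian", "open-time": 2300},
]
template = {"type": "Tandoori", "open-time": Var("x")}
constraints = [("x", ">", 1900)]
results = [r["name"] for r in restaurants if matches(r, template, constraints)]
```

Only restaurant A satisfies both the fixed `type` value and the constraint on the bound variable.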
GraniteNights Profiling
    <ep:User rdf:about="profileagent#gunnar" ep:name="gunnar" ep:pword="****">
      <ep:preference>
        <q:Query>
          <q:template>
            <pub:EnglishPub>
              <pub:servesBeer rdf:resource="#flowers"/>
            </pub:EnglishPub>
            ...
      <ep:interactions>
        <rdf:Seq><rdf:li>
          <ep:Interaction ep:timestamp="20030508T135013">
            <ep:pref>
              <q:Query>
                <q:template>
                  <pub:EnglishPub>
                    <pub:servesBeer rdf:resource="#flowers"/>
                  </pub:EnglishPub>
                  ...
                  <pub:EnglishPub>
                    <pub:servesBeer rdf:resource="#hobgoblin"/>
                  ...
                  <pub:EnglishPub>
                    <pub:servesBeer rdf:resource="#flowers"/>
                  ...
GraniteNights Profiling II
  Current implementation:
    Most frequently specified constraint.
  Possible improvements:
    Super/sub-class inference in the ontology, e.g. Flowers and Hobgoblin are both sub-classes of Real Ale.
    Combination of constraints is important, e.g. Pete likes Lager when eating Curry, but Ale for his occasional pub visit.
    Requires more sophisticated techniques than counting.
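The difference between plain counting and counting after super-class generalisation can be shown with a short sketch. The `SUPERCLASS` table, the interaction history, and the lager name `tennents` are hypothetical; only Flowers and Hobgoblin as kinds of Real Ale come from the slide.

```python
from collections import Counter


def most_frequent(constraints):
    """Return the constraint value that was specified most often."""
    return Counter(constraints).most_common(1)[0][0]


def most_frequent_generalised(constraints, superclass):
    """Replace each value by its super-class (if known) before counting."""
    return most_frequent([superclass.get(c, c) for c in constraints])


# Hypothetical ontology fragment: beer -> super-class.
SUPERCLASS = {"flowers": "RealAle", "hobgoblin": "RealAle", "tennents": "Lager"}

# Hypothetical history of beer constraints taken from past queries.
history = ["tennents", "flowers", "hobgoblin", "tennents", "hobgoblin", "tennents", "flowers"]

by_counting = most_frequent(history)
by_inference = most_frequent_generalised(history, SUPERCLASS)
```

Plain counting picks the single most requested beer, while counting over super-classes shows that the user asks for real ales more often overall. Conditional preferences such as lager with curry would still need something beyond either count.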
The Future – RDF Issues
  Must strike a balance:
    RDF purely for representation:
      Not very interoperable.
      Not very ground-breaking.
    Using the RDF data model:
      RDF expressive power is limited.
      Extend using Rule-ML? OWL?
      Lots of unsolved RDF problems for free:
        Anonymous nodes / graph isomorphism.
        Software support is immature.
        Inference.
The Future – Problems
  Future work on GraniteNights:
    Real users for real user modelling:
      Substantial software engineering effort is required to make a system that attracts users.
      Simulating user interactions is an option.
    Adding features:
      Reviews / recommendations ("This pub has good Guinness").
The Future – Plans
  Near-future plans:
    Further empirical investigations:
      Inductive logic programming, continued.
      Case-based reasoning.
    User modelling in a broader scope:
      User roles, commitments, etc.
Questions?
Rule-ML Example
    <rule:Imp>
      <rule:head>
        <rule:Atom>
          <rule:rel rdf:resource="~ggrimnes/dev/exp#inClass"/>
          <rule:atomArg>
            <rule:Var rdf:about="#A">
              <rule:varName>A</rule:varName>
            </rule:Var>
          </rule:atomArg>
          <rule:atomArg>ML</rule:atomArg>
        </rule:Atom>
      </rule:head>
      <rule:body>
        <rule:And>
          <rule:arg>
            <rule:Atom>
              <rule:rel rdf:resource="~ggrimnes/dev/exp#booktitleword"/>
              <rule:atomArg>
                <rule:Var rdf:about="#A">
                  <rule:varName>A</rule:varName>
                </rule:Var>
              </rule:atomArg>
              <rule:atomArg>learning</rule:atomArg>
            </rule:Atom>
          </rule:arg>
          <rule:arg>
            <rule:Atom>
              <rule:rel rdf:resource="~ggrimnes/dev/exp#publisher"/>
              <rule:atomArg>
                <rule:Var rdf:about="#A">
                  <rule:varName>A</rule:varName>
                </rule:Var>
              </rule:atomArg>
              <rule:atomArg>Morgan Kaufmann</rule:atomArg>
            </rule:Atom>
          </rule:arg>
        </rule:And>
      </rule:body>
    </rule:Imp>
Agentcities & the Evening Scenario
  EU funded (5th Framework Programme).
  In Aberdeen since January '02.
  WeatherAgent online since February '02.
  Evening Scenario
  City Nodes
  Tourist Information
  Recommendations
