WP2 2nd Review

  • Say how it differs from the TAGora dataset => we provide gold-standard preprocessing and disambiguation, with agreement between at least two annotators
  • The first platform for building gold standards for the evaluation of concept-based search algorithms, vocabulary convergence algorithms, etc. in folksonomies
  • The first gold standard dataset produced and published
  • The first evaluation of a keyword-based search algorithm w.r.t. the gold-standard semantic search in a folksonomy
  • Tag preprocessing algorithm, WSD algorithm, concept-based search algorithm

    1. Combining Human and Computational Intelligence
       Ilya Zaihrayeu, Pierre Andrews, Juan Pane
    2. Semantic annotation lifecycle
       Semantic annotation = structure and/or meaning. What if the users could use semantic annotations instead of free text annotations to leverage semantic technology services (reasoning, semantic search, …)?
       Overall problem: how to combine human and computational intelligence to support the generation and consumption of semantic contents?
       • Problem 1: help the user find and understand the meaning of semantic annotations
       • Problem 2: extract (semantic) annotations from the context of the user resource at publishing
       • Problem 3: QoS of semantics-enabled services
       • Problem 4: semi-automatic semantification of existing annotations
       4/14/2011
    3. Index: meaning summarization
       Problem 1: help the user find and understand the meaning of semantic annotations
    4. Meaning summarization: why?
       The right meaning of the words used in an annotation is in the mind of the people using them. E.g., "Java":
       • an island in Indonesia south of Borneo; one of the world's most densely populated regions => island
       • a beverage consisting of an infusion of ground coffee beans; "he ordered a cup of coffee" => beverage
       • a simple platform-independent object-oriented programming language used for writing applets that are downloaded from the World Wide Web by a client and run on the client's machine => programming language
       Descriptions are too long for the user to grasp the meaning immediately; the barrier to start generating semantic annotations is too high.
    5. Meaning summarization: an example
       One-word summaries are generated from the relations in the knowledge base, sense definitions, synonyms and hypernym terms.
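The idea on this slide can be sketched as follows. This is a minimal, hypothetical illustration: `SENSES` is a toy stand-in for the knowledge base, and `summarize` is an invented name; the sketch only shows the principle of preferring a short discriminating synonym and falling back to a hypernym term.

```python
# Hypothetical sketch of one-word meaning summarization.
# Each sense of an ambiguous term lists its synonyms and hypernym
# terms, as the slide describes; the inventory below is illustrative.

SENSES = {
    "java#1": {"synonyms": ["java"], "hypernyms": ["island"]},
    "java#2": {"synonyms": ["java", "coffee"], "hypernyms": ["beverage"]},
    "java#3": {"synonyms": ["java"], "hypernyms": ["programming language"]},
}

def summarize(sense_id, term):
    """Return a one-word summary for a sense: prefer a single-word
    synonym other than the annotated term itself, otherwise fall
    back to the first hypernym term."""
    sense = SENSES[sense_id]
    for syn in sense["synonyms"]:
        if syn != term and " " not in syn:
            return syn
    return sense["hypernyms"][0]

print(summarize("java#1", "java"))  # island
print(summarize("java#2", "java"))  # coffee
```

The summary is one word the user can grasp immediately, which is exactly what the long glosses on slide 4 lack.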
    6. Meaning summarization: evaluation results
       Best precision: 63%. Discriminating power: 76.4% (if we talk about "java", does the summary word "coffee" mean the same as "island"?)
    7. Index: gold standard dataset
       In order to evaluate the performance of the algorithms, a gold standard dataset is needed (Problem 3: QoS of semantics-enabled services; Problem 4: semi-automatic semantification of existing annotations).
    8. Proposed approach
       Create a gold standard folksonomy with senses: Tag → Tokens → Senses.
       • Preprocessing (e.g. the tag "javaisland" yields the candidate splits "Java island", "Java is land", …): 80% accuracy
       • Disambiguation (e.g. Java: an island in Indonesia to the south of Borneo; island: a land mass that is surrounded by water): 59% accuracy
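The preprocessing step above can be sketched as a dictionary-based segmentation of compound tags. This is a minimal illustration, not the deck's actual algorithm: `candidate_splits` and the toy vocabulary are assumptions made for the example.

```python
# Hypothetical sketch of tag preprocessing: split a compound tag such
# as "javaisland" into every candidate sequence of known vocabulary
# words. The vocabulary here is a toy stand-in for a real lexicon.

def candidate_splits(tag, vocab):
    """Return all segmentations of `tag` into words from `vocab`."""
    if not tag:
        return [[]]
    results = []
    for i in range(1, len(tag) + 1):
        head = tag[:i]
        if head in vocab:
            # Recursively segment the remainder of the tag.
            for rest in candidate_splits(tag[i:], vocab):
                results.append([head] + rest)
    return results

vocab = {"java", "island", "is", "land"}
print(candidate_splits("javaisland", vocab))
# [['java', 'is', 'land'], ['java', 'island']]
```

Each candidate split would then be passed to the disambiguation step, which picks senses for the tokens.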
    9. A platform for gold standards of semantic annotation systems
       • Manual validation
       • RDF export
       • Evaluation of: preprocessing, WSD, BoW search, convergence
       Open source (7 modules, 25K lines of code, 26% comments): http://sourceforge.net/projects/tags2con/
    10. Delicious RDF Dataset @ LOD cloud
        Dereferenceable at: http://disi.unitn.it/~knowdive/dataset/delicious/
    11. Index: QoS for semantic search
        Problem 3: QoS of semantics-enabled services
    12. Semantic search: why?
        With free text search, the following problems may reduce precision and recall:
        • synonymy: searching for "images" should return resources annotated with "picture"
        • polysemy: searching for "java" (island) should not return resources annotated with "java" (coffee beverage)
        • specificity gap: searching for "animals" should also return resources annotated with "dogs"
        Semantic, meaning-based search can address the problems listed above.
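How sense-based matching addresses polysemy and the specificity gap can be sketched as follows. This is an illustrative toy, not the deck's implementation: the sense identifiers, the `IS_A` table and the `matches` function are all assumptions made for the example.

```python
# Hypothetical sketch of concept-based matching, assuming queries and
# annotations are already disambiguated into sense IDs and a toy
# is-a hierarchy stands in for the real knowledge base.

IS_A = {  # child sense -> parent sense
    "dog#1": "animal#1",
    "car#1": "vehicle#1",
    "taxi#1": "car#1",
}

def matches(query_sense, annotation_sense):
    """True if the annotation denotes the query concept or any more
    specific concept below it in the is-a hierarchy."""
    sense = annotation_sense
    while sense is not None:
        if sense == query_sense:
            return True
        sense = IS_A.get(sense)
    return False

print(matches("animal#1", "dog#1"))           # True: gap bridged
print(matches("java#island", "java#coffee"))  # False: senses differ
```

Because matching is done on sense IDs rather than strings, different senses of the same word never match (polysemy), while more specific concepts still satisfy a general query (specificity gap); synonymy disappears because synonyms map to the same sense ID.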
    13. Semantics vs. folksonomy
        • Raw tags (e.g. "javaisland") are used to build "raw" queries.
        • Preprocessed tokens (e.g. "java island") are used to build BoW queries.
        • Disambiguated senses (e.g. Java (island), island (land)) are used to build semantic queries, which are correct and complete: the baseline.
        Specificity gap (SG): e.g. vehicle → car (SG=1) → taxi (SG=2). Recall goes down as the specificity gap increases.
    14. Index: semantic convergence
        Problem 4: semi-automatic semantification of existing annotations
    15. Semantic convergence: why?
        Example tags: Ajax, Mac, Apple, CSS, …
        Random sample: programming and web domain; "general" domains: cooking, travel, education
    16. Semantic convergence: proposed solution
        Find new senses of terms:
        • find different senses of the same term (word senses)
        • find synonyms of a term (synonym sets: synsets)
        • place the new synset in the vocabulary's is-a hierarchy
        What we improve:
        • better use of machine learning techniques
        • the polysemy issue is not considered in the state of the art
        • missing or "subjective" evaluations in the state of the art
        • evaluation using the Delicious dataset
    17. Convergence evaluation: finding senses
        • Tag collocation: precision 56%, recall 73%
        • User collocation: precision 57%, recall 68%
        • Random baseline: precision 42%, recall 29%
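The tag-collocation idea behind this evaluation can be sketched as a simple clustering of an ambiguous tag's bookmarks by their co-occurring tags. This is a hypothetical illustration: `induce_senses`, the Jaccard threshold and the sample data are assumptions, not the evaluated system.

```python
# Hypothetical sketch of sense induction by tag collocation:
# bookmarks carrying the same tag (say "java") are grouped into
# candidate senses by the overlap of the other tags on each bookmark.

def jaccard(a, b):
    """Jaccard similarity between two tag sets."""
    return len(a & b) / len(a | b)

def induce_senses(bookmarks, threshold=0.25):
    """Greedy clustering: a bookmark joins the first cluster whose
    accumulated tag set overlaps enough, else it starts a new sense."""
    clusters = []  # list of (accumulated tag set, member indices)
    for i, tags in enumerate(bookmarks):
        for tagset, members in clusters:
            if jaccard(tags, tagset) >= threshold:
                tagset |= tags
                members.append(i)
                break
        else:
            clusters.append((set(tags), [i]))
    return [members for _, members in clusters]

# Bookmarks for the tag "java": two about coffee, two about programming.
bookmarks = [
    {"coffee", "espresso"},
    {"coffee", "drink"},
    {"programming", "jvm"},
    {"jvm", "code"},
]
print(induce_senses(bookmarks))  # [[0, 1], [2, 3]]
```

User collocation would work the same way, replacing the co-occurring tag sets with the sets of users who bookmarked each resource.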
    18. Semantic annotation lifecycle: conclusions
        Combining human and computational intelligence:
        • Problem 1: help the user understand the meaning of semantic annotations
        • Problem 2: extract (semantic) annotations from the context of the user resource at publishing
        • Problem 3: QoS of semantics-enabled services
        • Problem 4: semi-automatic semantification of existing annotations
    19. Conclusions
        • We developed and evaluated a meaning summarization algorithm.
        • We developed a "semantic folksonomy" evaluation platform.
        • We built and used a gold standard dataset for evaluating: word sense disambiguation, tag preprocessing, semantic search, semantic convergence.
        • We developed and evaluated a knowledge base enrichment algorithm.
        • We studied the effect of semantics on social tagging systems: how much can semantics help? how much does the user need to be involved? how can human and computer intelligence be combined in the generation and consumption of semantic annotations?
    20. Integration with the use cases
    21. Publications
        • Semantic Disambiguation in Folksonomy: a Case Study. Pierre Andrews, Juan Pane, and Ilya Zaihrayeu; Advanced Language Technologies for Digital Libraries, Springer LNCS.
        • Semantic Annotation of Images on Flickr. Pierre Andrews, Sergey Kanshin, Juan Pane, and Ilya Zaihrayeu; ESWC 2011.
        • A Classification of Semantic Annotation Systems. Pierre Andrews, Sergey Kanshin, Juan Pane, and Ilya Zaihrayeu; Semantic Web Journal, second review phase.
        • Sense Induction in Folksonomies. Pierre Andrews, Juan Pane, and Ilya Zaihrayeu; IJCAI-LHD 2011, under review.
        • Evaluating the Quality of Service in Semantic Annotation Systems. Ilya Zaihrayeu, Pierre Andrews, and Juan Pane; in preparation.
