WP2 2nd Review

Speaker notes
  • Say how it differs from the Tagora dataset: we have gold-standard preprocessing and disambiguation, with agreement between at least two annotators.
  • The first platform for building gold standards for the evaluation of concept-based search algorithms, vocabulary convergence algorithms, etc. in folksonomies. The first gold standard dataset produced and published. The first evaluation of a keyword-based search algorithm w.r.t. gold standard semantic search in a folksonomy. Deliverables: tag preprocessing algorithm, WSD algorithm, concept-based search algorithm.
Transcript

Slide 1: Combining Human and Computational Intelligence
Ilya Zaihrayeu, Pierre Andrews, Juan Pane
Slide 2: Semantic annotation lifecycle
Overall problem: how to combine human and computational intelligence to support the generation and consumption of semantic contents?
Users produce free-text annotations; what if they could use semantic annotations (= structure and/or meaning) instead, to leverage semantic technology services such as reasoning and semantic search?
Problem 1: help the user find and understand the meaning of semantic annotations.
Problem 2: extract (semantic) annotations from the context of a user resource at publishing time.
Problem 3: QoS of semantics-enabled services.
Problem 4: semi-automatic semantification of existing annotations.
Slide 3: Index: meaning summarization
Problem 1: help the user find and understand the meaning of semantic annotations.
Slide 4: Meaning summarization: why?
The right meaning of the words used in an annotation is in the mind of the people using them. E.g., "java" can denote:
  • an island in Indonesia south of Borneo; one of the world's most densely populated regions → summary: island
  • a beverage consisting of an infusion of ground coffee beans; "he ordered a cup of coffee" → summary: beverage
  • a simple platform-independent object-oriented programming language used for writing applets that are downloaded from the World Wide Web by a client and run on the client's machine → summary: programming language
Such descriptions are too long for the user to grasp the meaning immediately: the barrier to start generating semantic annotations is too high.
Slide 5: Meaning summarization: an example
One-word summaries are generated from the relations in the knowledge base, sense definitions, synonyms and hypernym terms.
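A minimal sketch of this idea, assuming WordNet (via NLTK) as the knowledge base; the slides do not name the knowledge base or the exact selection rule, so the "shortest lemma of the closest hypernym" heuristic below is an illustrative stand-in, not the evaluated algorithm.

    from nltk.corpus import wordnet as wn  # requires nltk.download('wordnet')

    def one_word_summary(synset):
        # Summarize a sense with a single hypernym term, e.g. Java -> "island".
        hypernyms = synset.hypernyms() or synset.instance_hypernyms()
        if not hypernyms:
            return synset.lemma_names()[0]  # fall back to a synonym
        # The shortest lemma of the closest hypernym tends to read best.
        return min(hypernyms[0].lemma_names(), key=len).replace('_', ' ')

    for s in wn.synsets('java'):
        print(s.name(), '->', one_word_summary(s), '|', s.definition()[:60])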
Slide 6: Meaning summarization: evaluation results
Best precision: 63%. Discriminating power: 76.4% (measured by asking, e.g.: if we talk about Java, does the word "coffee" mean the same as "island"?).
Slide 7: Index: gold standard dataset
In order to evaluate the performance of the algorithms (Problem 3: QoS of semantics-enabled services; Problem 4: semi-automatic semantification of existing annotations), a gold standard dataset is needed.
Slide 8: Proposed approach
Create a gold standard folksonomy with senses, in two steps:
  • preprocessing (80% accuracy): split a tag into tokens, e.g. "javaisland" → "Java island", "Java is land", ...;
  • disambiguation (59% accuracy): map tokens to senses, e.g. Java → "an island in Indonesia to the south of Borneo", island → "a land mass that is surrounded by water".
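A minimal sketch of the preprocessing step, assuming a simple dictionary-driven splitter; the slides do not specify the actual tokenization algorithm, and the toy vocabulary below is hypothetical.

    VOCAB = {"java", "island", "is", "land"}  # toy vocabulary, for illustration

    def splits(tag, vocab=VOCAB):
        # Return every segmentation of a concatenated tag into known tokens.
        if not tag:
            return [[]]
        out = []
        for i in range(1, len(tag) + 1):
            if tag[:i] in vocab:
                out += [[tag[:i]] + rest for rest in splits(tag[i:], vocab)]
        return out

    print(splits("javaisland"))  # [['java', 'is', 'land'], ['java', 'island']]

The disambiguation step would then score each candidate segmentation against a sense inventory and keep the most plausible reading ("Java island").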
Slide 9: A platform for gold standards of semantic annotation systems
Features: manual validation; RDF export; evaluation of preprocessing, WSD, bag-of-words (BoW) search and convergence.
Open source (7 modules, 25K lines of code, 26% comments): http://sourceforge.net/projects/tags2con/
Slide 10: Delicious RDF dataset @ LOD cloud
Dereferenceable at: http://disi.unitn.it/~knowdive/dataset/delicious/
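A minimal sketch of consuming the dataset with rdflib, assuming the URL above serves dereferenceable RDF; the serialization format and vocabulary are assumptions, only the base URL comes from the slide.

    from rdflib import Graph

    g = Graph()
    # Dereference the dataset entry point; rdflib infers the serialization.
    g.parse("http://disi.unitn.it/~knowdive/dataset/delicious/")
    for s, p, o in list(g)[:10]:  # inspect a sample of triples
        print(s, p, o)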
Slide 11: Index: QoS for semantic search
Problem 3: QoS of semantics-enabled services.
Slide 12: Semantic search: why?
With free-text search, the following problems may reduce precision and recall:
  • synonymy: searching for "images" should also return resources annotated with "picture";
  • polysemy: searching for "java" (the island) should not return resources annotated with "java" (the coffee beverage);
  • specificity gap: searching for "animals" should also return resources annotated with "dogs".
Semantic, meaning-based search can address all of the problems listed above.
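A minimal sketch of why comparing senses instead of strings addresses these problems, using WordNet synsets as the concept vocabulary; this is illustrative only, not the evaluated search algorithm.

    from nltk.corpus import wordnet as wn

    def matches(query_sense, resource_sense):
        # An exact concept match covers synonymy: synonymous words map to the
        # same synset, while the different senses of "java" do not.
        if resource_sense == query_sense:
            return True
        # A more specific resource sense also matches, bridging the
        # specificity gap: dog.n.01 lies below animal.n.01 in the is-a tree.
        return query_sense in resource_sense.closure(lambda s: s.hypernyms())

    animal = wn.synset('animal.n.01')
    dog = wn.synset('dog.n.01')
    print(matches(animal, dog))  # True: an "animals" query finds "dogs"
    print(matches(dog, animal))  # False: the gap only closes downwards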
Slide 13: Semantics vs. folksonomy
In a folksonomy, a user links a resource to an annotation (a tag). From this link, three kinds of queries can be built:
  • "raw" queries from the raw tag (e.g. "javaisland");
  • bag-of-words (BoW) queries from the preprocessed tokens ("java island");
  • semantic queries from the disambiguated senses (Java (island), island (land)), which are correct and complete.
Semantic search returns complete and correct results and serves as the baseline. The specificity gap (SG) is the is-a distance between the query concept and the annotation concept (e.g. vehicle → car: SG=1; vehicle → taxi: SG=2); recall goes down as the specificity gap increases.
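A minimal sketch of computing the specificity gap as the number of is-a steps between two concepts, again using WordNet for illustration; note that WordNet's hierarchy is deeper than the slide's vehicle/car/taxi toy example, so the printed gaps will be larger than 1 and 2.

    from nltk.corpus import wordnet as wn

    def specificity_gap(query_sense, annotation_sense):
        # Number of is-a steps from the annotation up to the query concept,
        # or None if the annotation is not a descendant of the query.
        frontier, gap = {annotation_sense}, 0
        while frontier:
            if query_sense in frontier:
                return gap
            frontier = {h for s in frontier for h in s.hypernyms()}
            gap += 1
        return None

    vehicle = wn.synset('vehicle.n.01')
    print(specificity_gap(vehicle, wn.synset('car.n.01')))
    print(specificity_gap(vehicle, wn.synset('taxi.n.01')))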
Slide 14: Index: semantic convergence
Problem 4: semi-automatic semantification of existing annotations.
Slide 15: Semantic convergence: why?
(Chart comparing vocabulary convergence across datasets: random tags from the programming and web domain, e.g. Ajax, Mac, Apple, CSS, ..., vs. "general" domains such as cooking, travel and education.)
Slide 16: Semantic convergence: proposed solution
Find new senses of terms:
  • find different senses of the same term (word senses);
  • find synonyms of a term (synonym sets, i.e. synsets);
  • place the new synset in the vocabulary's is-a hierarchy.
What we improve over the state of the art, in which the polysemy issue is not considered and evaluations are missing or "subjective":
  • better use of machine learning techniques;
  • evaluation using the Delicious dataset.
Slide 17: Convergence evaluation: finding senses
Two collocation models are compared for inducing senses (diagram: tags t1..t5 connected through bookmarks B1..B4 and users U1, U2):
  • tag collocation: precision 56%, recall 73%;
  • user collocation: precision 57%, recall 68%;
  • random baseline: precision 42%, recall 29%.
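A minimal sketch of sense induction by tag collocation, assuming that the senses of a tag correspond to connected clusters among its co-occurring tags; this is a standard word-sense-induction heuristic, not necessarily the algorithm evaluated above, and the bookmark data is a toy example.

    from itertools import combinations
    from collections import defaultdict

    bookmarks = [  # toy Delicious-style bookmarks (tag sets)
        {"java", "coffee", "beverage"},
        {"java", "programming", "code"},
        {"coffee", "beverage"},
        {"programming", "code"},
    ]

    cooc = defaultdict(set)  # tag -> tags it co-occurs with
    for tags in bookmarks:
        for a, b in combinations(tags, 2):
            cooc[a].add(b); cooc[b].add(a)

    def induce_senses(tag):
        # Each connected component of the co-occurrence graph restricted to
        # the neighbours of `tag` is one induced sense of that tag.
        neighbours, senses = set(cooc[tag]), []
        while neighbours:
            seed = neighbours.pop()
            cluster, frontier = {seed}, {seed}
            while frontier:
                nxt = {n for f in frontier for n in cooc[f] & neighbours} - cluster
                cluster |= nxt; neighbours -= nxt; frontier = nxt
            senses.append(cluster)
        return senses

    print(induce_senses("java"))  # two clusters: coffee sense, programming sense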
Slide 18: Semantic annotation lifecycle (conclusions)
A recap of the lifecycle from slide 2: combining human and computational intelligence across free-text annotations, semantic annotations (= structure and/or meaning), and semantics-enabled services such as reasoning and semantic search, covering Problems 1-4.
Slide 19: Conclusions
How can human and computer intelligence be combined in the generation and consumption of semantic annotations?
  • We developed and evaluated a meaning summarization algorithm.
  • We developed a "semantic folksonomy" evaluation platform.
  • We built and used a gold standard dataset for evaluating word sense disambiguation, tag preprocessing, semantic search and semantic convergence.
  • We developed and evaluated a knowledge base enrichment algorithm.
  • We studied the effect of semantics on social tagging systems: how much can semantics help, and how much does the user need to be involved?
Slide 20: Integration with the use cases
Slide 21: Publications
  • Semantic Disambiguation in Folksonomy: a Case Study. Pierre Andrews, Juan Pane, and Ilya Zaihrayeu. Advanced Language Technologies for Digital Libraries, Springer LNCS.
  • Semantic Annotation of Images on Flickr. Pierre Andrews, Sergey Kanshin, Juan Pane, and Ilya Zaihrayeu. ESWC 2011.
  • A Classification of Semantic Annotation Systems. Pierre Andrews, Sergey Kanshin, Juan Pane, and Ilya Zaihrayeu. Semantic Web Journal (second review phase).
  • Sense Induction in Folksonomies. Pierre Andrews, Juan Pane, and Ilya Zaihrayeu. IJCAI-LHD 2011 (under review).
  • Evaluating the Quality of Service in Semantic Annotation Systems. Ilya Zaihrayeu, Pierre Andrews, and Juan Pane. In preparation.
