UAB 2011- Combining human and computational intelligence
 

  • Say how it differs from the TAGora dataset: we have gold-standard preprocessing and disambiguation, with agreement between at least two annotators
  • The first platform for building gold standards for the evaluation of concept-based search algorithms, vocabulary convergence algorithms, etc. in folksonomies
  • The first gold standard dataset produced and published
  • The first evaluation of a keyword-based search algorithm w.r.t. the gold standard semantic search in a folksonomy
  • Tag preprocessing algorithm, WSD algorithm, concept-based search algorithm

Presentation Transcript

  • Combining Human and Computational Intelligence
    Ilya Zaihrayeu, Pierre Andrews, Juan Pane
  • Semantic annotation lifecycle
    • What if the users could use semantic annotations instead to leverage semantic technology services?
    • Semantic annotation = structure and/or meaning
    • Problem 1: help the user find and understand the meaning of semantic annotations
    • Problem 2: extract (semantic) annotations from contexts of user resource at publishing
    • Problem 3: QoS of semantics-enabled services
    • Problem 4: semi-automatic semantification of existing annotations
    • [Diagram labels: User, Context, free text annotations, Semantic search, …, Reasoning]
    4/14/2011
  • Index: meaning summarization
    • Problem 1: help the user find and understand the meaning of semantic annotations
    • [Diagram labels: User, Semantic search, …, Reasoning]
  • Meaning summarization: why?
    • The right meaning of the words used for the annotation is in the minds of the people using them
    • E.g.: Java:
      – an island in Indonesia south of Borneo; one of the world's most densely populated regions [summary: island]
      – a beverage consisting of an infusion of ground coffee beans; "he ordered a cup of coffee" [summary: beverage]
      – a simple platform-independent object-oriented programming language used for writing applets that are downloaded from the World Wide Web by a client and run on the client's machine [summary: programming language]
    • Descriptions are too long for the user to grasp the meaning immediately
      – too high a barrier to start generating semantic annotations
  • Meaning summarization: an example
    • One-word summaries are generated from the relations in the knowledge base, sense definitions, synonyms and hypernym terms
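The idea on this slide can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a toy knowledge base in which each sense carries a gloss, synonyms and hypernyms, and it simply prefers the first hypernym, then the first synonym, as the short label. The `Sense` class, the `summarize` function and the sample entries are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Sense:
    """One sense of a word in a toy knowledge base (hypothetical structure)."""
    lemma: str
    gloss: str
    synonyms: list = field(default_factory=list)
    hypernyms: list = field(default_factory=list)


def summarize(sense):
    """Pick a short label: first hypernym, else first synonym, else the lemma."""
    for candidate in sense.hypernyms + sense.synonyms:
        if candidate.lower() != sense.lemma.lower():
            return candidate
    return sense.lemma


# Illustrative entries modelled on the "Java" example from the slides.
JAVA_SENSES = [
    Sense("java", "an island in Indonesia south of Borneo",
          hypernyms=["island"]),
    Sense("java", "a beverage consisting of an infusion of ground coffee beans",
          synonyms=["coffee"], hypernyms=["beverage"]),
    Sense("java", "a simple platform-independent object-oriented programming language",
          hypernyms=["programming language"]),
]


def summaries(senses):
    """Render each sense as 'lemma (summary)' so a user can tell senses apart."""
    return [f"{s.lemma} ({summarize(s)})" for s in senses]


if __name__ == "__main__":
    for line in summaries(JAVA_SENSES):
        print(line)
```

A real summarizer would have to choose among several candidate terms per sense; the fixed preference order here is only a placeholder for that choice.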
  • Meaning summarization: evaluation results
    • Best precision: 63%
    • Discriminating power: 76.4%
    • If we talk about java, does the word coffee mean the same as island?
  • Index: gold standard dataset
    • In order to evaluate the performance of the algorithms, a gold standard dataset is needed
    • Problem 3: QoS of semantics-enabled services?
    • Problem 4: semi-automatic semantification of existing annotations
    • [Diagram labels: User, Semantic search, …, Reasoning]
  • Proposed Approach
    • Create a gold standard of folksonomy with sense
    • Pipeline: Tag → (Preprocessing) → Tokens → (Disambiguation) → Senses
      – Preprocessing accuracy: 81%
      – Disambiguation accuracy: 59%
    • Dataset:
      – # of annotations: 4 296
      – Unique tags: 857
      – Unique URLs: 644
      – Unique users: 1 194
      – Annotator agreement: 80%
    • Example: the tag "javaisland" can be split as "java island" or "java is land"; the senses chosen are
      – Java – an island in Indonesia to the south of Borneo
      – Island – a land mass that is surrounded by water
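The "javaisland" example can be reproduced with a small sketch of the preprocessing step. This is an assumption-laden illustration rather than the tool's actual tokenizer: it enumerates every way to split a concatenated tag into words from a given vocabulary and leaves the choice among candidates to a later step (here, simply fewest tokens first). `split_tag` and `VOCAB` are hypothetical names.

```python
def split_tag(tag, vocabulary):
    """Return every way to split `tag` into a sequence of vocabulary words."""
    tag = tag.lower()
    if not tag:
        return [[]]  # one way to split the empty string: no tokens
    splits = []
    for end in range(1, len(tag) + 1):
        head = tag[:end]
        if head in vocabulary:
            # Keep this prefix and recursively split the remainder.
            for rest in split_tag(tag[end:], vocabulary):
                splits.append([head] + rest)
    return splits


# Tiny illustrative vocabulary; a real system would use a full lexicon.
VOCAB = {"java", "island", "is", "land"}

if __name__ == "__main__":
    # Prefer splits with fewer tokens, a common heuristic (an assumption here).
    for tokens in sorted(split_tag("javaisland", VOCAB), key=len):
        print(" ".join(tokens))
```

Both "java island" and "java is land" come out as candidates, which is exactly the ambiguity that the manual validation step in the gold standard has to resolve.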
  • A Platform for Gold Standards of Semantic Annotation Systems
    • Manual validation
    • RDF export
    • Evaluation of
      – Preprocessing
      – WSD
      – BoW Search
      – Convergence
    • Open source: 7 modules, 25K lines of code, 26% of comments
    • http://sourceforge.net/projects/tags2con/
  • Delicious RDF Dataset @ LOD cloud
    • # triples: 85 908
    • Outlinks to LOD cloud (WN synsets): 651
    • Dereferenceable at: http://disi.unitn.it/~knowdive/dataset/delicious/
  • Index: QoS for semantic search
    • Problem 3: QoS of semantics-enabled services?
    • [Diagram labels: User, Semantic search, …, Reasoning]
  • Semantic search: why?
    • With free-text search, the following problems may reduce precision and recall:
      – synonymy problem: searching for "images" should return resources annotated with "picture"
      – polysemy problem: searching for "java" (island) should not return resources annotated with "java" (coffee beverage)
      – specificity gap problem: searching for "animals" should also return resources annotated with "dogs"
    • Semantic, meaning-based search can address the problems listed above
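The three problems can be made concrete with a minimal sketch that contrasts keyword matching with concept matching. Everything here is assumed for illustration: the concept identifiers, the tiny is-a hierarchy, and the four annotated resources are invented, and the matching rule (same concept, or an ancestor of the annotation's concept) is one simple way to realise meaning-based search, not necessarily the one evaluated in the talk.

```python
# child concept -> parent concept (hypothetical is-a hierarchy)
IS_A = {"dog#1": "animal#1"}

# Each resource keeps the raw word the user typed and its disambiguated concept.
RESOURCES = {
    "r1": {"word": "picture", "concept": "picture#1"},
    "r2": {"word": "java", "concept": "java#coffee"},
    "r3": {"word": "java", "concept": "java#island"},
    "r4": {"word": "dogs", "concept": "dog#1"},
}


def keyword_search(word):
    """Free-text baseline: match the annotation string exactly."""
    return {rid for rid, a in RESOURCES.items() if a["word"] == word}


def ancestors(concept):
    """All concepts reachable by following is-a links upwards."""
    found = set()
    while concept in IS_A:
        concept = IS_A[concept]
        found.add(concept)
    return found


def semantic_search(concept):
    """Match annotations whose concept equals the query concept or is more specific."""
    return {
        rid
        for rid, a in RESOURCES.items()
        if a["concept"] == concept or concept in ancestors(a["concept"])
    }


if __name__ == "__main__":
    # Synonymy: "images" and "picture" share the concept picture#1.
    print(keyword_search("images"), semantic_search("picture#1"))
    # Polysemy: the keyword "java" mixes the island and the beverage.
    print(keyword_search("java"), semantic_search("java#island"))
    # Specificity gap: "dogs" is more specific than "animals".
    print(keyword_search("animals"), semantic_search("animal#1"))
```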
  • Semantics vs Folksonomy
    • Three query types are compared:
      – "javaisland": used to build "raw" queries
      – "java island": used to build BoW queries (the baseline)
      – "Java(island) island(land)": used to build semantic queries
    • Semantic search: complete and correct results
    • Specificity Gap (SG): the user submits the query "vehicle"; a resource annotated with "car" is at SG=1, and one annotated with "taxi" is at SG=2
    • Recall goes down as the specificity gap increases
    • [Diagram labels: User, submit, query, link, result, resource, annotation]
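A rough sketch of the specificity gap, under stated assumptions: the gap is taken to be the number of is-a links between the query concept and the annotation concept, and a search is modelled as being able to bridge at most `max_gap` links. The hierarchy mirrors the vehicle/car/taxi example on the slide; the resources and function names are hypothetical.

```python
# child -> parent, following the slide's example hierarchy
IS_A = {"taxi": "car", "car": "vehicle"}

# Hypothetical resources, each annotated with one concept.
ANNOTATIONS = {"r1": "vehicle", "r2": "car", "r3": "taxi"}


def specificity_gap(query, annotation):
    """Number of is-a links from `annotation` up to `query`, or None if unrelated."""
    gap, node = 0, annotation
    while node != query:
        if node not in IS_A:
            return None
        node = IS_A[node]
        gap += 1
    return gap


def recall(query, max_gap):
    """Recall of a search that only bridges gaps of at most `max_gap` links."""
    gaps = {rid: specificity_gap(query, c) for rid, c in ANNOTATIONS.items()}
    relevant = {rid for rid, g in gaps.items() if g is not None}
    if not relevant:
        return 0.0
    retrieved = {rid for rid in relevant if gaps[rid] <= max_gap}
    return len(retrieved) / len(relevant)


if __name__ == "__main__":
    for k in range(3):
        print(f"max_gap={k}: recall={recall('vehicle', k):.2f}")
```

With `max_gap=0` the search behaves like exact matching and misses the "car" and "taxi" resources; each extra link it can bridge recovers one more of them.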
  • Index: semantic convergence
    • Problem 4: semi-automatic semantification of existing annotations
    • [Diagram labels: User, Semantic search, …, Reasoning]
  • Semantic convergence: Why?
    • [Figure: two pie charts of tag categories (with a WN sense, missing sense, abbreviation, cannot decide, other, I don't know), one for a random sample from the programming and web domain and one for "general" domains (cooking, travel, education). The largest slices appear to be "with a WN sense" (49% and 71%) and "missing sense" (36% and 15%). Examples of tags with a missing sense: Ajax, Mac, Apple, CSS, …]
  • Semantic convergence: proposed solution
    • Find new senses of terms
      – Find different senses of the same term (word sense)
      – Find synonyms of a term (synonym sets, or synsets)
    • Place the new synset in the vocabulary's is-a hierarchy
    • What we improve
      – Better use of Machine Learning techniques
      – The polysemy issue is not considered in the state of the art
      – Missing or "subjective" evaluations in the state of the art
    • Evaluation using the Delicious dataset
  • Convergence Evaluation: Finding Senses
    • Approaches compared: Tag Collocation, User Collocation, Random Baseline
    • Reported precision values, in slide order: 56%, 42%, 57%
    • Reported recall values, in slide order: 73%, 29%, 68%
    • [Figure: two graphs linking tags t1 to t5 with bookmarks B1 to B4 (tag collocation) and with users U1, U2 (user collocation)]
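To give a feel for what "tag collocation" could mean, here is a deliberately simple sketch. It is not the evaluated algorithm: it assumes that bookmarks using an ambiguous tag belong to the same sense when they share at least one other tag, and it groups them accordingly. The bookmark data and the `induce_senses` function are hypothetical.

```python
def induce_senses(bookmarks, tag):
    """Group bookmarks that use `tag` into clusters linked by shared co-occurring tags."""
    # Context of a bookmark = its other tags.
    contexts = {b: tags - {tag} for b, tags in bookmarks.items() if tag in tags}
    clusters = []  # list of (member bookmarks, union of their context tags)
    for bookmark, ctx in contexts.items():
        members, merged_ctx = {bookmark}, set(ctx)
        untouched = []
        for c_members, c_ctx in clusters:
            if c_ctx & merged_ctx:
                # Shares a collocated tag: merge into the growing cluster.
                members |= c_members
                merged_ctx |= c_ctx
            else:
                untouched.append((c_members, c_ctx))
        clusters = untouched + [(members, merged_ctx)]
    return sorted(sorted(m) for m, _ in clusters)


# Illustrative data: "java" is used for both coffee and programming.
BOOKMARKS = {
    "B1": {"java", "coffee", "espresso"},
    "B2": {"java", "programming", "code"},
    "B3": {"java", "coffee", "cafe"},
    "B4": {"java", "code", "jvm"},
    "B5": {"python", "code"},
}

if __name__ == "__main__":
    print(induce_senses(BOOKMARKS, "java"))
```

Each resulting cluster is a candidate sense of the tag. A user-collocation variant would build the contexts from the users who applied the tag instead of from co-occurring tags.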
  • Semantic annotation lifecycle (Conclusions)
    • Combining human and computational intelligence
    • What if the users could use semantic annotations instead to leverage semantic technology services?
    • Semantic annotation = structure and/or meaning
    • Problem 1: help the user understand the meaning of semantic annotations?
    • Problem 2: extract (semantic) annotations from contexts of user resource at publishing?
    • Problem 3: QoS of semantics-enabled services?
    • Problem 4: semi-automatic semantification of existing annotations
    • [Diagram labels: User, Context, free text annotations, Semantic search, …, Reasoning]
  • Conclusions
    • We developed and evaluated a meaning summarization algorithm
    • We developed a "semantic folksonomy" evaluation platform
    • We studied the effect of semantics on social tagging systems:
      – how much can semantics help?
      – how much does the user need to be involved?
      – how can human and computer intelligence be combined in the generation and consumption of semantic annotations?
    • We developed and evaluated a knowledge base enrichment algorithm
    • We built and used a gold standard dataset for evaluating:
      – Word Sense Disambiguation
      – Tag Preprocessing
      – Semantic Search
      – Semantic Convergence
  • Integration with the use cases
  • Publications
    • Semantic Disambiguation in Folksonomy: a Case Study. Pierre Andrews, Juan Pane, and Ilya Zaihrayeu; Advanced Language Technologies for Digital Libraries, Springer's LNCS.
    • Semantic Annotation of Images on Flickr. Pierre Andrews, Sergey Kanshin, Juan Pane, and Ilya Zaihrayeu; ESWC 2011.
    • A Classification of Semantic Annotation Systems. Pierre Andrews, Sergey Kanshin, Juan Pane, and Ilya Zaihrayeu; Semantic Web Journal (second review phase).
    • Sense Induction in Folksonomies. Pierre Andrews, Juan Pane, and Ilya Zaihrayeu; IJCAI-LHD 2011 (under review).
    • Evaluating the Quality of Service in Semantic Annotation Systems. Ilya Zaihrayeu, Pierre Andrews, and Juan Pane; in preparation.
  • WP 2 TIMELINE AND DELIVERABLES
    • Timeline axis: months 0, 6, 12, 18, 24, 30, 36
    • Task 2.1: Designing models (UIBK)
      – D2.1.1: State of the Art and requirements from the use case partners
      – D2.1.2: Specification of the model
    • Task 2.2: Designing methods (UNITN)
      – D2.2.1: Report on bootstrapping semantic annotations and on reaching consensus in the use of semantics
      – D2.2.2+D2.2.3: Report on linking semantic annotations to external sources and on keeping them up-to-date when the underlying semantic model changes
    • Task 2.3: Research on Information Retrieval (IR) methods for semantic content (ONTO)
      – D2.3.1: Requirements for semantics-aware IR methods
      – D2.3.2: Specification for semantics-aware IR methods
    • Task 2.4: Models and methods for automatic visual annotation (UTC)
      – D2.5: Report on the state of the art, proposed suitable models and methods for automatic visual annotation
    • D2.4: Report on the refinement of the proposed models, methods and semantic search