Multipedia: Enriching DBpedia with Multimedia information
Presentation given by Andrés García at K-CAP 2011 on the selection of images for DBpedia terms

Published in: Technology, Education

Transcript

  • 1. Multipedia: Enriching DBpedia with Images
    Andrés García-Silva†, Asunción Gómez-Pérez†,
    Max Jakob*, Pablo Mendes* and Chris Bizer*
    † {hgarcia, ocorcho, asun}@fi.upm.es
    Facultad de Informática
    Universidad Politécnica de Madrid
    Campus de Montegancedo s/n
    28660 Boadilla del Monte, Madrid, Spain
    * first.last@fu-berlin.de
    Web-based Systems Group
    Freie Universität Berlin, Germany
  • 2. Introduction
    Enriching ontologies with multimedia
    Images and videos complement the information about concepts and entities in existing knowledge bases.
    Multimodal ontologies can help in QA systems, user interfaces, and search and recommendation processes.
    [Figure: ontology fragment relating the concepts Pathology and Bone through isA, occurs and depicts links, used to answer the query «Show me X-ray images with fractures of the Femur»]
    Radhouani, S., Hwee Lim, J., Chevallet, J.-P., Falquet, G.: Combining textual and visual ontologies to solve medical multimodal queries. In: IEEE International Conference on Multimedia and Expo, pp. 1853-1856 (2006).
    Garcia-Silva et al.
  • 3. Introduction
    Goal:
    Populate a general purpose ontology with images from the Web.
    - Find relevant images for ontology instances with ambiguous names
    DBpedia knowledge base
    Collects facts from Wikipedia about 3.5 million entities.
    Entities are classified into a consistent cross-domain ontology: 272 classes and 1.6 million instances.
    Has evolved into a hub in the linked data cloud.
    Images in DBpedia
    Wikipedia images are represented in DBpedia through foaf:depiction
    About 70% of Wikipedia articles have no images
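A resource's existing images can be looked up through the foaf:depiction property mentioned above. The snippet below is only a sketch: it builds the text of a SPARQL query and sends nothing over the network. The helper name `depiction_query` is hypothetical, and the snippet assumes the usual DBpedia resource URI prefix.

```python
# Namespace prefix used by DBpedia resource URIs (assumed here).
DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"


def depiction_query(resource_name: str) -> str:
    """Build a SPARQL query listing the foaf:depiction values of a resource.

    Hypothetical helper for illustration; it only returns the query text.
    """
    subject = DBPEDIA_RESOURCE + resource_name
    return (
        "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
        f"SELECT ?image WHERE {{ <{subject}> foaf:depiction ?image }}"
    )


if __name__ == "__main__":
    print(depiction_query("Hornet"))
```

An empty result for such a query would mark the resource as a candidate for enrichment with images from the Web.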
  • 4. Introduction
    Challenges
    Ambiguity of instance labels
    Querying the web for images related to the resource dbpedia:hornet
  • 5. Related Work
  • 6. Enriching DBpedia with Multimedia
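    [Figure: Multipedia pipeline, illustrated for dbpr:Hornet]
    1) Get Context: the Wikipedia-based Context Index provides terms related to the resource
    2) Retrieve Images: one query per context term & dbpr name is sent to image search engines, producing rankings of images (one per query)
    3) Aggregate: the per-query rankings are combined into a ranking of images
    4) Generate tag-based ranking: the list of images, annotated with tags, produces a second ranking of images
    5) Aggregate: both rankings are combined into the final ranking of images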
  • 7. Enriching DBpedia with Multimedia
    Get Context
    dbpr:Hornet (Wikipedia article) → Wikipedia-based Context Index → family, wasps, insect
  • 8. Enriching DBpedia with Multimedia
    Retrieve Images
    dbpr:Hornet, context terms: family, wasps, insect
    Queries sent to the image search engines:
    Q0 = Hornet
    Q1 = Hornet and family
    Q2 = Hornet and wasps
    Q3 = Hornet and insect
    Image rankings (one per query):
    R0 = img0,1; img0,2; ... img0,k
    R1 = img1,1; img1,2; ... img1,l
    R2 = img2,1; img2,2; ... img2,m
    R3 = img3,1; img3,2; ... img3,n
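The query construction on this slide (Q0 is the resource name alone, every other Qi joins the name with one context term) can be sketched as below. The function name `build_queries` is hypothetical, and the literal "and" joiner simply mirrors the slide; a real image search engine may expect a different syntax.

```python
def build_queries(name, context_terms):
    """Return [Q0, Q1, ...]: the bare name, then one query per context term."""
    queries = [name]  # Q0: the resource name alone
    queries.extend(f"{name} and {term}" for term in context_terms)
    return queries


if __name__ == "__main__":
    for i, query in enumerate(build_queries("Hornet", ["family", "wasps", "insect"])):
        print(f"Q{i} = {query}")
```

Each query would then be sent to the image search engines, giving one ranking Ri per query Qi.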
  • 9. Enriching DBpedia with Multimedia
    Aggregate
    R0 = img0,1; img0,2; ... img0,k
    R1 = img1,1; img1,2; ... img1,l
    R2 = img2,1; img2,2; ... img2,m
    R3 = img3,1; img3,2; ... img3,n
    Borda count
    - Positional method, very easy to compute
    - Each query result Ri is a voter and the images imgj are candidates:
    For each candidate imgj in Ri:
    Si(imgj) = number of candidates ranked below imgj in Ri
    S(imgj) = sum of Si(imgj) over all Ri
    Output: imgj ordered by S(imgj) value
    Rcontext-based = img1; img2; ... imgp
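The Borda count aggregation described on this slide can be sketched in Python as follows. The sketch makes several assumptions not stated on the slide: the total score S is the sum of the per-ranking scores Si, an image absent from a ranking gets no points from that voter, and ties are broken by image identifier. The function name `borda_aggregate` is hypothetical.

```python
from collections import defaultdict


def borda_aggregate(rankings):
    """Merge several rankings (best image first) into one using Borda count."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, image in enumerate(ranking):
            # Si(img) = number of candidates ranked below img in this ranking
            scores[image] += n - 1 - position
    # Highest total score first; ties broken by identifier (an assumption)
    return sorted(scores, key=lambda img: (-scores[img], img))


if __name__ == "__main__":
    r0 = ["a", "b", "c"]
    r1 = ["b", "a"]
    r2 = ["b", "c", "d"]
    print(borda_aggregate([r0, r1, r2]))  # ['b', 'a', 'c', 'd']
```

In the usage example the totals are b = 4, a = 2, c = 1 and d = 0, which gives the printed order.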
  • 11. Enriching DBpedia with Multimedia
    Generate tag-based ranking
    List of images: L = R0 ∪ R1 ∪ R2 ∪ R3
    1) Measure the relatedness between a DBpedia resource and an image:
    - Overlap between the context terms of the former and the tags of the latter.
    2) Represent the DBpedia resource and the images in a Vector Space Model:
    - TF as weighting scheme,
    - cosine function to measure similarity
    3) Rank the images according to their similarity value:
    Rtag-based = img1; img2; ... imgq
    Aggregate
    Rcontext-based = img1; img2; ... imgp
    Rtag-based = img1; img2; ... imgq
    Rfinal = img1; img2; ... imgl
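The tag-based ranking can be illustrated with a small Python sketch: the resource context and each image's tags become raw term-frequency vectors, and images are ordered by cosine similarity to the context. The function names, the sample tags and the tie-breaking by identifier are assumptions made for illustration; the slides do not specify tokenization or normalization.

```python
import math
from collections import Counter


def cosine_tf(terms_a, terms_b):
    """Cosine similarity between two bags of terms using raw TF weights."""
    tf_a, tf_b = Counter(terms_a), Counter(terms_b)
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(v * v for v in tf_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as unrelated
    return dot / (norm_a * norm_b)


def tag_based_ranking(context_terms, image_tags):
    """Order image ids by similarity between their tags and the context."""
    scored = {img: cosine_tf(context_terms, tags) for img, tags in image_tags.items()}
    return sorted(scored, key=lambda img: (-scored[img], img))


if __name__ == "__main__":
    context = ["family", "wasps", "insect"]
    tags = {
        "img_a": ["insect", "wasps", "nest"],
        "img_b": ["car", "hudson"],
        "img_c": ["insect"],
    }
    print(tag_based_ranking(context, tags))  # ['img_a', 'img_c', 'img_b']
```

Here img_a shares two of three terms with the context (similarity 2/3), img_c shares one (about 0.58), and img_b shares none, so it is ranked last.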
  • 12. Experiments
    How many context words produce the best results?
    Apple context: «juice, fruit, apples, capital, michigan, orange»
  • 13. Experiments
    Ambiguity
    Search engines work well for:
    - unambiguous names
    - ambiguous names referring to a dominant sense, e.g., dbpedia:Stonehenge
    However, they fail for ambiguous names:
    - lacking a dominant sense, e.g., dbpedia:Apple
    - not referring to the dominant sense, e.g., dbpedia:Blackberry
  • 14. Experiments
    Dominance:
    Dataset:
    10 classes, with 15 dbpr randomly selected per class
    Each dbpr must 1) be popular and 2) have a dominance under 0.7
    We found such dbpr for Mammals, Birds and Insects
    Increasing the dominance limit to 0.9, we found dbpr for the rest of the classes.
  • 15. Experiments
    15 people evaluated the results of three approaches
    Each image was rated by 3 evaluators
  • 16. Experiments
  • 17. Conclusions
    Multipedia: an approach to automatically populate an ontology with images related to existing instances.
    We focused on the particularly challenging problem of ambiguity in instance names.
    Human-driven evaluation of the approach involving 15 users and a total of 2250 image ratings, covering DBpedia resources from several classes.
    A variation of Multipedia improves average precision by 9.4% over a baseline of keyword queries to commercial image search engines.
    We have validated that, in contrast to the baseline, our approach achieves the highest precision with ambiguous names lacking a dominant sense.
