SlideShare a Scribd company logo
1 of 33
Erwin Verbruggen
                       Jacco van Ossenbruggen
      Lora Aroyo                                  Johan Oomen



                   COLLECTING AND MANAGING
                           USER-GENERATED METADATA
                                                                          Maarten Brinkerink

Guus Schreiber




                   Riste Gligorov                               Lotte Baltussen

                                        Michiel Hildebrand
VIDEO CONTENT ANNOTATION
VIDEO CONTENT ANNOTATION
time consuming
5 times the duration of the video


coarse grained
professional vocabulary
VIDEO SEARCH BEHAVIOR




Today's and Tomorrow's Retrieval Practice in the Audiovisual Archive
Bouke Huurnink et al. ACM International Conference on Image and Video Retrieval 2010
VIDEO SEARCH BEHAVIOR
  people buy fragments
  broadcast (33%), stories (17%), fragments(49%)


  user vocabulary
  35% of clicked results not found by title or term




Today's and Tomorrow's Retrieval Practice in the Audiovisual Archive
Bouke Huurnink et al. ACM International Conference on Image and Video Retrieval 2010
improve support to ļ¬nd fragments
time-based annotations in a user vocabulary
crowdsourcing?
Winner EuroITV Competition
                                                Best Archives on the Web Award




Emerging Practices in the Cultural Heritage Domain - Social Tagging of Audiovisual Heritage
Johan Oomen, Lotte Belice Baltussen, et al. Web science conference 2010
Labeling images with a computer game
Luis von Ahn and Laura Dabbish. SIGCHI conference on Human factors in computing systems 2004
Pilot I     Pilot II

months             8            2

videos          612        1.544

players      2.000           438

tags      420.000       172.000
westminster abbey
    abbey
    priester
    geestelijken
    Hyde
    hye park
    beefeater
    bernhard
    hek
    paarden
    tocht
    aankomst
    kerk
    intocht
    engeland
    koets
    kroning
    mensenmassa
    parade
    juliana
    koning
    kroon
    stoet
    regen




On the Role of User-generated Metadata in Audio Visual Collections
Riste Gligorov, Michiel Hildebrand et al. KCAP International Conference on Knowledge Capture 2011
westminster abbey
    abbey
    priester
    geestelijken
    Hyde
    hye park
    beefeater
    bernhard
    bernhard                                                         time-based
    hek
    paarden
    tocht
    aankomst
    kerk
    intocht
    engeland
    koets
    kroning
    mensenmassa
    parade
    juliana
    koning
    kroon
    stoet
    regen




On the Role of User-generated Metadata in Audio Visual Collections
Riste Gligorov, Michiel Hildebrand et al. KCAP International Conference on Knowledge Capture 2011
westminster abbey
    abbey
    priester
    geestelijken
    Hyde
    hye park
                                                          user vocabulary
    beefeater                                              8% in professional vocabulary
    bernhard                                              23% in Dutch lexicon
    hek
    paarden
                                                          89% found on google
    tocht
    aankomst
    kerk
    intocht
    engeland
    koets
    kroning
    mensenmassa
    parade
    juliana
    koning
    kroon
    stoet
    regen




On the Role of User-generated Metadata in Audio Visual Collections
Riste Gligorov, Michiel Hildebrand et al. KCAP International Conference on Knowledge Capture 2011
westminster abbey
    westminster abbey
    abbey
    abbey
    priester
    priester
    geestelijken
    geestelijken
    Hyde
    hye park
                                                          user vocabulary
    beefeater
    beefeater                                              8% in professional vocabulary
    bernhard                                              23% in Dutch lexicon
    hek
    hek
    paarden
    paarden
                                                          89% found on google
    tocht
    tocht
    aankomst
    aankomst
    kerk
    kerk
    intocht
    intocht
                                                         objects (57%)
    engeland
    koets
    koets
    kroning
    kroning
    mensenmassa
    mensenmassa
    parade
    parade
    juliana
    koning
    koning
    kroon
    kroon
    stoet
    stoet
    regen
    regen




On the Role of User-generated Metadata in Audio Visual Collections
Riste Gligorov, Michiel Hildebrand et al. KCAP International Conference on Knowledge Capture 2011
westminster abbey
    abbey
    priester
    geestelijken
    Hyde
    hye park
                                                          user vocabulary
    beefeater                                              8% in professional vocabulary
    bernhard
    bernhard                                              23% in Dutch lexicon
    hek
    paarden
                                                          89% found on google
    tocht
    aankomst
    kerk
    intocht
                                                         objects (57%)
    engeland
    koets                                                persons (31%)
    kroning
    mensenmassa
    parade
    juliana
    juliana
    koning
    kroon
    stoet
    regen




On the Role of User-generated Metadata in Audio Visual Collections
Riste Gligorov, Michiel Hildebrand et al. KCAP International Conference on Knowledge Capture 2011
westminster abbey
    abbey
    priester
    geestelijken
    Hyde
    hye park
                                                          user vocabulary
    beefeater                                              8% in professional vocabulary
    bernhard                                              23% in Dutch lexicon
    hek
    paarden
                                                          89% found on google
    tocht
    aankomst
    kerk
    intocht
                                                         objects (57%)
    engeland
    engeland
    koets                                                persons (31%)
    kroning
    mensenmassa
    parade                                               locations (7%)
    juliana
    koning
    kroon
    stoet
    regen




On the Role of User-generated Metadata in Audio Visual Collections
Riste Gligorov, Michiel Hildebrand et al. KCAP International Conference on Knowledge Capture 2011
westminster abbey
    westminster abbey
     abbey
    abbey
     priester
    priester
     geestelijken
    geestelijken
     Hyde
    Hyde
     hye park
    hye park
                                                          user vocabulary
     beefeater
    beefeater                                              8% in professional vocabulary
     bernhard
    bernhard                                              23% in Dutch lexicon
     hek
    hek
     paarden
                                                          89% found on google
    paarden
     tocht
    tocht
     aankomst
    aankomst
     kerk
    kerk
     intocht
                                                         objects (57%)                        no events
    intocht
     engeland
    engeland
     koets
    koets                                                persons (31%)                        no scenes
     kroning
    kroning
     mensenmassa
    mensenmassa
     parade
    parade                                               locations (7%)
     juliana
    juliana
     koning
    koning
     kroon
    kroon
     stoet
    stoet
     regen
    regen




On the Role of User-generated Metadata in Audio Visual Collections
Riste Gligorov, Michiel Hildebrand et al. KCAP International Conference on Knowledge Capture 2011
westminster abbey
    westminster abbey
     abbey
    abbey
     priester
    priester
     geestelijken
    geestelijken
     Hyde
    Hyde
     hye park
    hye park
     beefeater
    beefeater
     bernhard
    bernhard
                                                          ā€œjustā€ tags
     hek
    hek
     paarden
    paarden
     tocht
    tocht
     aankomst
    aankomst
     kerk
    kerk
     intocht
    intocht
     engeland
    engeland
     koets
    koets
     kroning
    kroning
     mensenmassa
    mensenmassa
     parade
    parade
     juliana
    juliana
     koning
    koning
     kroon
    kroon
     stoet
    stoet
     regen
    regen




On the Role of User-generated Metadata in Audio Visual Collections
Riste Gligorov, Michiel Hildebrand et al. KCAP International Conference on Knowledge Capture 2011
westminster abbey
    abbey
    priester
    geestelijken
    Hyde
    hye park
    hye park
    beefeater
    beefeater
    bernhard
                                                          ā€œjustā€ tags
    bernhard
    hek
    hek
    paarden
    paarden
    tocht
    tocht
    aankomst
    aankomst
                                                         typos
    kerk
    kerk
    intocht
    intocht
    engeland
    engeland
    koets
    koets
    kroning
    kroning
    mensenmassa
    mensenmassa
    parade
    parade
    juliana
    juliana
    koning
    koning
    kroon
    kroon
    stoet
    stoet
    regen
    regen




On the Role of User-generated Metadata in Audio Visual Collections
Riste Gligorov, Michiel Hildebrand et al. KCAP International Conference on Knowledge Capture 2011
westminster abbey
    abbey
    priester
    geestelijken
    Hyde
    hye park
    beefeater
    bernhard
                                                          ā€œjustā€ tags
    bernhard
    hek
    hek
    paarden
    paarden
    tocht
    tocht
    aankomst
    aankomst
                                                         typos
    kerk
    kerk
    intocht
    intocht
    engeland
    engeland
                                                         no unique reference
    koets
    koets
    kroning
    kroning
    mensenmassa
    mensenmassa
    parade
    parade
    juliana
    juliana
    koning
    koning
    kroon
    kroon
    stoet
    stoet
    regen
    regen




On the Role of User-generated Metadata in Audio Visual Collections
Riste Gligorov, Michiel Hildebrand et al. KCAP International Conference on Knowledge Capture 2011
westminster abbey
    abbey
    priester
    geestelijken
    Hyde
    hye park
    beefeater
    beefeater
    bernhard
                                                          ā€œjustā€ tags
    bernhard
    hek
    hek
    paarden
    paarden
    tocht
    tocht
    aankomst
    aankomst
                                                         typos
    kerk
    kerk
    intocht
    intocht
    engeland
    engeland
                                                         no unique reference
    koets
    koets
    kroning
    kroning
    mensenmassa
                                                         no synonym
    mensenmassa
    parade
    parade
    juliana
    juliana
    koning
    koning
    kroon
    kroon
    stoet
    stoet
    regen
    regen




On the Role of User-generated Metadata in Audio Visual Collections
Riste Gligorov, Michiel Hildebrand et al. KCAP International Conference on Knowledge Capture 2011
Linking user-generated video annotations to the web of data
Michiel Hildebrand and Jacco van Ossenbruggen, International conference on multimedia modelling 2012
Linking user-generated video annotations to the web of data
Michiel Hildebrand and Jacco van Ossenbruggen, International conference on multimedia modelling 2012
Linking user-generated video annotations to the web of data
Michiel Hildebrand and Jacco van Ossenbruggen, International conference on multimedia modelling 2012
grouped by type (faceted)
concepts uniquely identiļ¬ed
background
information
Would you include user-generated metadata
              in your collection?


            no      maybe     yes


why not?                            waisda.nl


           what (quality) criteria are
           important to you?

More Related Content

More from Michiel Hildebrand

NDE betekenisvolle verbinding
NDE betekenisvolle verbindingNDE betekenisvolle verbinding
NDE betekenisvolle verbindingMichiel Hildebrand
Ā 
Crowdsourcing video annotation
Crowdsourcing video annotationCrowdsourcing video annotation
Crowdsourcing video annotationMichiel Hildebrand
Ā 
Workshop tag gardening AVA Net
Workshop tag gardening AVA NetWorkshop tag gardening AVA Net
Workshop tag gardening AVA NetMichiel Hildebrand
Ā 
SIKS 2011 - Semantic Web course
SIKS 2011 - Semantic Web courseSIKS 2011 - Semantic Web course
SIKS 2011 - Semantic Web courseMichiel Hildebrand
Ā 
PhD Defense Michiel Hildebrand: End-user support for access to heterogeneous ...
PhD Defense Michiel Hildebrand: End-user support for access to heterogeneous ...PhD Defense Michiel Hildebrand: End-user support for access to heterogeneous ...
PhD Defense Michiel Hildebrand: End-user support for access to heterogeneous ...Michiel Hildebrand
Ā 

More from Michiel Hildebrand (6)

NDE betekenisvolle verbinding
NDE betekenisvolle verbindingNDE betekenisvolle verbinding
NDE betekenisvolle verbinding
Ā 
Enrich, Link, Search
Enrich, Link, SearchEnrich, Link, Search
Enrich, Link, Search
Ā 
Crowdsourcing video annotation
Crowdsourcing video annotationCrowdsourcing video annotation
Crowdsourcing video annotation
Ā 
Workshop tag gardening AVA Net
Workshop tag gardening AVA NetWorkshop tag gardening AVA Net
Workshop tag gardening AVA Net
Ā 
SIKS 2011 - Semantic Web course
SIKS 2011 - Semantic Web courseSIKS 2011 - Semantic Web course
SIKS 2011 - Semantic Web course
Ā 
PhD Defense Michiel Hildebrand: End-user support for access to heterogeneous ...
PhD Defense Michiel Hildebrand: End-user support for access to heterogeneous ...PhD Defense Michiel Hildebrand: End-user support for access to heterogeneous ...
PhD Defense Michiel Hildebrand: End-user support for access to heterogeneous ...
Ā 

Recently uploaded

Youth Involvement in an Innovative Coconut Value Chain by Mwalimu Menza
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu MenzaYouth Involvement in an Innovative Coconut Value Chain by Mwalimu Menza
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu Menzaictsugar
Ā 
Investment in The Coconut Industry by Nancy Cheruiyot
Investment in The Coconut Industry by Nancy CheruiyotInvestment in The Coconut Industry by Nancy Cheruiyot
Investment in The Coconut Industry by Nancy Cheruiyotictsugar
Ā 
IoT Insurance Observatory: summary 2024
IoT Insurance Observatory:  summary 2024IoT Insurance Observatory:  summary 2024
IoT Insurance Observatory: summary 2024Matteo Carbone
Ā 
Market Sizes Sample Report - 2024 Edition
Market Sizes Sample Report - 2024 EditionMarket Sizes Sample Report - 2024 Edition
Market Sizes Sample Report - 2024 EditionMintel Group
Ā 
PSCC - Capability Statement Presentation
PSCC - Capability Statement PresentationPSCC - Capability Statement Presentation
PSCC - Capability Statement PresentationAnamaria Contreras
Ā 
Innovation Conference 5th March 2024.pdf
Innovation Conference 5th March 2024.pdfInnovation Conference 5th March 2024.pdf
Innovation Conference 5th March 2024.pdfrichard876048
Ā 
8447779800, Low rate Call girls in Tughlakabad Delhi NCR
8447779800, Low rate Call girls in Tughlakabad Delhi NCR8447779800, Low rate Call girls in Tughlakabad Delhi NCR
8447779800, Low rate Call girls in Tughlakabad Delhi NCRashishs7044
Ā 
Ten Organizational Design Models to align structure and operations to busines...
Ten Organizational Design Models to align structure and operations to busines...Ten Organizational Design Models to align structure and operations to busines...
Ten Organizational Design Models to align structure and operations to busines...Seta Wicaksana
Ā 
Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03DallasHaselhorst
Ā 
Flow Your Strategy at Flight Levels Day 2024
Flow Your Strategy at Flight Levels Day 2024Flow Your Strategy at Flight Levels Day 2024
Flow Your Strategy at Flight Levels Day 2024Kirill Klimov
Ā 
Call Girls in DELHI Cantt, ( Call Me )-8377877756-Female Escort- In Delhi / Ncr
Call Girls in DELHI Cantt, ( Call Me )-8377877756-Female Escort- In Delhi / NcrCall Girls in DELHI Cantt, ( Call Me )-8377877756-Female Escort- In Delhi / Ncr
Call Girls in DELHI Cantt, ( Call Me )-8377877756-Female Escort- In Delhi / Ncrdollysharma2066
Ā 
MAHA Global and IPR: Do Actions Speak Louder Than Words?
MAHA Global and IPR: Do Actions Speak Louder Than Words?MAHA Global and IPR: Do Actions Speak Louder Than Words?
MAHA Global and IPR: Do Actions Speak Louder Than Words?Olivia Kresic
Ā 
Digital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdfDigital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdfJos Voskuil
Ā 
Ms Motilal Padampat Sugar Mills vs. State of Uttar Pradesh & Ors. - A Milesto...
Ms Motilal Padampat Sugar Mills vs. State of Uttar Pradesh & Ors. - A Milesto...Ms Motilal Padampat Sugar Mills vs. State of Uttar Pradesh & Ors. - A Milesto...
Ms Motilal Padampat Sugar Mills vs. State of Uttar Pradesh & Ors. - A Milesto...ShrutiBose4
Ā 
Independent Call Girls Andheri Nightlaila 9967584737
Independent Call Girls Andheri Nightlaila 9967584737Independent Call Girls Andheri Nightlaila 9967584737
Independent Call Girls Andheri Nightlaila 9967584737Riya Pathan
Ā 
Intro to BCG's Carbon Emissions Benchmark_vF.pdf
Intro to BCG's Carbon Emissions Benchmark_vF.pdfIntro to BCG's Carbon Emissions Benchmark_vF.pdf
Intro to BCG's Carbon Emissions Benchmark_vF.pdfpollardmorgan
Ā 
Case study on tata clothing brand zudio in detail
Case study on tata clothing brand zudio in detailCase study on tata clothing brand zudio in detail
Case study on tata clothing brand zudio in detailAriel592675
Ā 
Call Us šŸ“²8800102216šŸ“ž Call Girls In DLF City Gurgaon
Call Us šŸ“²8800102216šŸ“ž Call Girls In DLF City GurgaonCall Us šŸ“²8800102216šŸ“ž Call Girls In DLF City Gurgaon
Call Us šŸ“²8800102216šŸ“ž Call Girls In DLF City Gurgaoncallgirls2057
Ā 

Recently uploaded (20)

Youth Involvement in an Innovative Coconut Value Chain by Mwalimu Menza
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu MenzaYouth Involvement in an Innovative Coconut Value Chain by Mwalimu Menza
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu Menza
Ā 
Investment in The Coconut Industry by Nancy Cheruiyot
Investment in The Coconut Industry by Nancy CheruiyotInvestment in The Coconut Industry by Nancy Cheruiyot
Investment in The Coconut Industry by Nancy Cheruiyot
Ā 
IoT Insurance Observatory: summary 2024
IoT Insurance Observatory:  summary 2024IoT Insurance Observatory:  summary 2024
IoT Insurance Observatory: summary 2024
Ā 
Market Sizes Sample Report - 2024 Edition
Market Sizes Sample Report - 2024 EditionMarket Sizes Sample Report - 2024 Edition
Market Sizes Sample Report - 2024 Edition
Ā 
PSCC - Capability Statement Presentation
PSCC - Capability Statement PresentationPSCC - Capability Statement Presentation
PSCC - Capability Statement Presentation
Ā 
Innovation Conference 5th March 2024.pdf
Innovation Conference 5th March 2024.pdfInnovation Conference 5th March 2024.pdf
Innovation Conference 5th March 2024.pdf
Ā 
8447779800, Low rate Call girls in Tughlakabad Delhi NCR
8447779800, Low rate Call girls in Tughlakabad Delhi NCR8447779800, Low rate Call girls in Tughlakabad Delhi NCR
8447779800, Low rate Call girls in Tughlakabad Delhi NCR
Ā 
Ten Organizational Design Models to align structure and operations to busines...
Ten Organizational Design Models to align structure and operations to busines...Ten Organizational Design Models to align structure and operations to busines...
Ten Organizational Design Models to align structure and operations to busines...
Ā 
Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03
Ā 
Japan IT Week 2024 Brochure by 47Billion (English)
Japan IT Week 2024 Brochure by 47Billion (English)Japan IT Week 2024 Brochure by 47Billion (English)
Japan IT Week 2024 Brochure by 47Billion (English)
Ā 
Flow Your Strategy at Flight Levels Day 2024
Flow Your Strategy at Flight Levels Day 2024Flow Your Strategy at Flight Levels Day 2024
Flow Your Strategy at Flight Levels Day 2024
Ā 
Call Girls in DELHI Cantt, ( Call Me )-8377877756-Female Escort- In Delhi / Ncr
Call Girls in DELHI Cantt, ( Call Me )-8377877756-Female Escort- In Delhi / NcrCall Girls in DELHI Cantt, ( Call Me )-8377877756-Female Escort- In Delhi / Ncr
Call Girls in DELHI Cantt, ( Call Me )-8377877756-Female Escort- In Delhi / Ncr
Ā 
MAHA Global and IPR: Do Actions Speak Louder Than Words?
MAHA Global and IPR: Do Actions Speak Louder Than Words?MAHA Global and IPR: Do Actions Speak Louder Than Words?
MAHA Global and IPR: Do Actions Speak Louder Than Words?
Ā 
Corporate Profile 47Billion Information Technology
Corporate Profile 47Billion Information TechnologyCorporate Profile 47Billion Information Technology
Corporate Profile 47Billion Information Technology
Ā 
Digital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdfDigital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdf
Ā 
Ms Motilal Padampat Sugar Mills vs. State of Uttar Pradesh & Ors. - A Milesto...
Ms Motilal Padampat Sugar Mills vs. State of Uttar Pradesh & Ors. - A Milesto...Ms Motilal Padampat Sugar Mills vs. State of Uttar Pradesh & Ors. - A Milesto...
Ms Motilal Padampat Sugar Mills vs. State of Uttar Pradesh & Ors. - A Milesto...
Ā 
Independent Call Girls Andheri Nightlaila 9967584737
Independent Call Girls Andheri Nightlaila 9967584737Independent Call Girls Andheri Nightlaila 9967584737
Independent Call Girls Andheri Nightlaila 9967584737
Ā 
Intro to BCG's Carbon Emissions Benchmark_vF.pdf
Intro to BCG's Carbon Emissions Benchmark_vF.pdfIntro to BCG's Carbon Emissions Benchmark_vF.pdf
Intro to BCG's Carbon Emissions Benchmark_vF.pdf
Ā 
Case study on tata clothing brand zudio in detail
Case study on tata clothing brand zudio in detailCase study on tata clothing brand zudio in detail
Case study on tata clothing brand zudio in detail
Ā 
Call Us šŸ“²8800102216šŸ“ž Call Girls In DLF City Gurgaon
Call Us šŸ“²8800102216šŸ“ž Call Girls In DLF City GurgaonCall Us šŸ“²8800102216šŸ“ž Call Girls In DLF City Gurgaon
Call Us šŸ“²8800102216šŸ“ž Call Girls In DLF City Gurgaon
Ā 

Collecting and Analyzing User-Generated Metadata for Video Collections

  • 1. Erwin Verbruggen Jacco van Ossenbruggen Lora Aroyo Johan Oomen COLLECTING AND MANAGING USER-GENERATED METADATA Maarten Brinkerink Guus Schreiber Riste Gligorov Lotte Baltussen Michiel Hildebrand
  • 3. VIDEO CONTENT ANNOTATION time consuming 5 times the duration of the video coarse grained professional vocabulary
  • 4. VIDEO SEARCH BEHAVIOR Today's and Tomorrow's Retrieval Practice in the Audiovisual Archive Bouke Huurnink et al. ACM International Conference on Image and Video Retrieval 2010
  • 5. VIDEO SEARCH BEHAVIOR people buy fragments broadcast (33%), stories (17%), fragments(49%) user vocabulary 35% of clicked results not found by title or term Today's and Tomorrow's Retrieval Practice in the Audiovisual Archive Bouke Huurnink et al. ACM International Conference on Image and Video Retrieval 2010
  • 6. improve support to ļ¬nd fragments
  • 7. time-based annotations in a user vocabulary
  • 9. Winner EuroITV Competition Best Archives on the Web Award Emerging Practices in the Cultural Heritage Domain - Social Tagging of Audiovisual Heritage Johan Oomen, Lotte Belice Baltussen, et al. Web science conference 2010
  • 10.
  • 11.
  • 12. Labeling images with a computer game Luis von Ahn and Laura Dabbish. SIGCHI conference on Human factors in computing systems 2004
  • 13. Pilot I Pilot II months 8 2 videos 612 1.544 players 2.000 438 tags 420.000 172.000
  • 14. westminster abbey abbey priester geestelijken Hyde hye park beefeater bernhard hek paarden tocht aankomst kerk intocht engeland koets kroning mensenmassa parade juliana koning kroon stoet regen On the Role of User-generated Metadata in Audio Visual Collections Riste Gligorov, Michiel Hildebrand et al. KCAP International Conference on Knowledge Capture 2011
  • 15. westminster abbey abbey priester geestelijken Hyde hye park beefeater bernhard bernhard time-based hek paarden tocht aankomst kerk intocht engeland koets kroning mensenmassa parade juliana koning kroon stoet regen On the Role of User-generated Metadata in Audio Visual Collections Riste Gligorov, Michiel Hildebrand et al. KCAP International Conference on Knowledge Capture 2011
  • 16. westminster abbey abbey priester geestelijken Hyde hye park user vocabulary beefeater 8% in professional vocabulary bernhard 23% in Dutch lexicon hek paarden 89% found on google tocht aankomst kerk intocht engeland koets kroning mensenmassa parade juliana koning kroon stoet regen On the Role of User-generated Metadata in Audio Visual Collections Riste Gligorov, Michiel Hildebrand et al. KCAP International Conference on Knowledge Capture 2011
  • 17. westminster abbey westminster abbey abbey abbey priester priester geestelijken geestelijken Hyde hye park user vocabulary beefeater beefeater 8% in professional vocabulary bernhard 23% in Dutch lexicon hek hek paarden paarden 89% found on google tocht tocht aankomst aankomst kerk kerk intocht intocht objects (57%) engeland koets koets kroning kroning mensenmassa mensenmassa parade parade juliana koning koning kroon kroon stoet stoet regen regen On the Role of User-generated Metadata in Audio Visual Collections Riste Gligorov, Michiel Hildebrand et al. KCAP International Conference on Knowledge Capture 2011
  • 18. westminster abbey abbey priester geestelijken Hyde hye park user vocabulary beefeater 8% in professional vocabulary bernhard bernhard 23% in Dutch lexicon hek paarden 89% found on google tocht aankomst kerk intocht objects (57%) engeland koets persons (31%) kroning mensenmassa parade juliana juliana koning kroon stoet regen On the Role of User-generated Metadata in Audio Visual Collections Riste Gligorov, Michiel Hildebrand et al. KCAP International Conference on Knowledge Capture 2011
  • 19. westminster abbey abbey priester geestelijken Hyde hye park user vocabulary beefeater 8% in professional vocabulary bernhard 23% in Dutch lexicon hek paarden 89% found on google tocht aankomst kerk intocht objects (57%) engeland engeland koets persons (31%) kroning mensenmassa parade locations (7%) juliana koning kroon stoet regen On the Role of User-generated Metadata in Audio Visual Collections Riste Gligorov, Michiel Hildebrand et al. KCAP International Conference on Knowledge Capture 2011
  • 20. westminster abbey westminster abbey abbey abbey priester priester geestelijken geestelijken Hyde Hyde hye park hye park user vocabulary beefeater beefeater 8% in professional vocabulary bernhard bernhard 23% in Dutch lexicon hek hek paarden 89% found on google paarden tocht tocht aankomst aankomst kerk kerk intocht objects (57%) no events intocht engeland engeland koets koets persons (31%) no scenes kroning kroning mensenmassa mensenmassa parade parade locations (7%) juliana juliana koning koning kroon kroon stoet stoet regen regen On the Role of User-generated Metadata in Audio Visual Collections Riste Gligorov, Michiel Hildebrand et al. KCAP International Conference on Knowledge Capture 2011
  • 21. westminster abbey westminster abbey abbey abbey priester priester geestelijken geestelijken Hyde Hyde hye park hye park beefeater beefeater bernhard bernhard ā€œjustā€ tags hek hek paarden paarden tocht tocht aankomst aankomst kerk kerk intocht intocht engeland engeland koets koets kroning kroning mensenmassa mensenmassa parade parade juliana juliana koning koning kroon kroon stoet stoet regen regen On the Role of User-generated Metadata in Audio Visual Collections Riste Gligorov, Michiel Hildebrand et al. KCAP International Conference on Knowledge Capture 2011
  • 22. westminster abbey abbey priester geestelijken Hyde hye park hye park beefeater beefeater bernhard ā€œjustā€ tags bernhard hek hek paarden paarden tocht tocht aankomst aankomst typos kerk kerk intocht intocht engeland engeland koets koets kroning kroning mensenmassa mensenmassa parade parade juliana juliana koning koning kroon kroon stoet stoet regen regen On the Role of User-generated Metadata in Audio Visual Collections Riste Gligorov, Michiel Hildebrand et al. KCAP International Conference on Knowledge Capture 2011
  • 23. westminster abbey abbey priester geestelijken Hyde hye park beefeater bernhard ā€œjustā€ tags bernhard hek hek paarden paarden tocht tocht aankomst aankomst typos kerk kerk intocht intocht engeland engeland no unique reference koets koets kroning kroning mensenmassa mensenmassa parade parade juliana juliana koning koning kroon kroon stoet stoet regen regen On the Role of User-generated Metadata in Audio Visual Collections Riste Gligorov, Michiel Hildebrand et al. KCAP International Conference on Knowledge Capture 2011
  • 24. westminster abbey abbey priester geestelijken Hyde hye park beefeater beefeater bernhard ā€œjustā€ tags bernhard hek hek paarden paarden tocht tocht aankomst aankomst typos kerk kerk intocht intocht engeland engeland no unique reference koets koets kroning kroning mensenmassa no synonym mensenmassa parade parade juliana juliana koning koning kroon kroon stoet stoet regen regen On the Role of User-generated Metadata in Audio Visual Collections Riste Gligorov, Michiel Hildebrand et al. KCAP International Conference on Knowledge Capture 2011
  • 25.
  • 26. Linking user-generated video annotations to the web of data Michiel Hildebrand and Jacco van Ossenbruggen, International conference on multimedia modelling 2012
  • 27. Linking user-generated video annotations to the web of data Michiel Hildebrand and Jacco van Ossenbruggen, International conference on multimedia modelling 2012
  • 28. Linking user-generated video annotations to the web of data Michiel Hildebrand and Jacco van Ossenbruggen, International conference on multimedia modelling 2012
  • 29.
  • 30. grouped by type (faceted)
  • 33. Would you include user-generated metadata in your collection? no maybe yes why not? waisda.nl what (quality) criteria are important to you?

Editor's Notes

  1. From the different research topics I am involved in today I address some of the results from projects in which we use explicit semantics and user interaction to make crowd knowledge effectively processable for machines\n\n\n*****\nFrom Crowd Knowledge to Machine Knowledge: Use cases with semantics and user interaction in Dutch cultural heritage collections\n\nIn this talk I will discuss several projects, for example with the Dutch national archive for sound and vision or with the Rijksmuseum Amsterdam, where we have experimented with semantics-based technologies and user interaction paradigms to provide systems with additional support for users to turn their lay knowledge into machine-processable knowledge. Turning this crowd knowledge into machine knowledge makes the system more intelligent and thus makes them capitalize on the knowledge assets in the crowds of users.\n\nThe problem our system addresses is concerned with making the massive set of interesting multimedia content available in these cultural heritage institutions accessible for a large community of users.  On the one hand, this content is difficult to find for the average user as they have been indexed by experts (curators, art historians, etc) who use a very specific vocabulary that is unknown to the general audience.  On the other hand, these professionals can no longer  cope with the demand for annotation on the ever growing multimedia content in these collections.\n\nOur solution to both these problem is to exploit crowdsourcing on the web, where there is a lot of specific domain knowledge and processing power available that such archives and museums would gladly incorporate. However, turning this knowledge from the mass of lay users into additional intelligence in content management systems is not a simple challenge for several reasons: (1) the lay users use different vocabularies and are also interested in different aspects of the collection items; (2) the collected user metadata needs to be validated for quality and correctness; (3) crowd-sourcing users need to be continuously supported and motivated in order to supply more and better knowledge in such systems.  I will present our data analysis of the crowdsourced content, our techniques for quality control and incorporating domain semantics, and the results of our attempts to employ different interaction strategies and user interfaces (for example games and simplified interfaces for interactive annotation) to engage and stimulate the users.\n\nDemos:\n-----------\nVideo-tagging game: http://woordentikkertje.manbijthond.nl/\nArt recommender and personalized museum tours: http://www.chip-project.org/demo/ (3rd prize winner of ISWC Semantic Web challenge)\nHistorical events in museum collections: http://agora.cs.vu.nl/demo/\n\nRelevant Papers:\n------------------------\n- On the role of user-generated metadata in audio visual collections, at K-CAP2011\nhttp://portal.acm.org/citation.cfm?id=1999702\n\n- Digital Hermeneutics:  Agora and the Online Understanding of Cultural Heritage, at WebSci2011, (Best paper nominee)\nhttp://www.websci11.org/fileadmin/websci/Papers/116_paper.pdf\n\n- The effects of transparency on trust in and acceptance of a content-based art recommender, UMUAI journal, (Best journal paper for 2008)\nhttp://www.springerlink.com/content/81q34u73mpp58u75/\n\n- Recommendations based on semantically enriched museum collections, Journal of Web Semantics\nhttp://www.sciencedirect.com/science/article/pii/S1570826808000681\n\n- Enhancing Content-Based Recommendation with the Task Model of Classification, at EKAW2010\nhttp://www.springerlink.com/content/p78hl5r283x79r13/\n\nShort bio:\n-----------------\nLora Aroyo is an associate professor at the Web and Media group, at the Department of Computer Science, Free University Amsterdam, The Netherlands. Her research interests are in using semantic web technologies for modeling user interests and context, recommendation systems and personalized access in Web-based applications. Typical example domains are cultural heritage collections, multimedia archives and interactive TV. She has coordinated the research work in the CHIP project on Cultural Heritage Information Personalization (http://chip-project.org). Currently she is a scientific coordinator of the EU Integrated Project NoTube dealing with the integration of Web and TV data with the help of semantics (http://notube.tv), a project leader of VU INTERTAIN Experimental Research Lab initiative (http://www.cs.vu.nl/intertain), and involved in the research on motivational user interaction for video-tagging games in the PrestoPrime project (http://www.prestoprime.org/) and modeling\nhistoric events in the Agora project (http://agora.cs.vu.nl/). She has organized numerous workshops in the areas of personalized access to cultural heritage, e-learning, interactive television, as well as on visual interfaces to the social and semantic web (PATCH, FutureTV, PersWeb, VISSW and DeRIVE). Lora has been actively involved in both the Semantic Web (PC co-chair and conference chair for ESWC2009 and ESWC2010 and PC co-chair for ISWC2011) and the Personalization and User modeling communities (on the editorial board for the UMUAI journal and on the steering committee of UMAP conference).\n\nMore information can be found at:\n-----------------------------------------------\nWebpage: http://www.cs.vu.nl/~laroyo\nSlideshare: http://www.slideshare.net/laroyo\nTwitter: @laroyo\n\n
  2. Nowadays av collections are undergoing process of transformation from archives of analog material to large digital (online) data stores, as videos are very much wanted by different types of end users. \n\nFor example, the Netherlands Institute of Sound and Vision archives all radio and TV material broadcasted in the Netherlands (has appr. 700,000 hours radio and television programs available online. \n\nFacilitating a successfully access to av collection items demands quality metadata associated with them.\n\nTraditionally, in AV achives it is the task of professional catalogers to manually describe the videos. Usually, in the process they follow well-defined , well-established guidelines and rules. They also may make use of auxiliary materials like controlled vocabularies, thesauri, and such.\nHowever, as we all know video is medium that is extremely rich in meaning. Directors and screenwriters create entire universes with complex interplay between characters, objects and events. Sometimes they may employ rich and complex abstract symbolic language. This makes that task of describing the meaning of a video as complicated as describing the real worlds. Which is no trivial matter.\nAs a result the process of annotation is tedious, time-consuming and inevitably incomplete. According to some research, it takes approximately 5 times of the duration of the material to annotate it completely. So for example, if we are talking about a documentary that lasts one hour, it will take approximately 5 hours for a cataloger to fully describe it. Furthermore,\n\nConsequently, professional annotations are coarse-grained in a sense that they are referring to the entire video describing prevalent topics. It may happen that catalogers provide more fine-grained, shot-level descriptions for a video. But this is exception of the rule and it is reserved to the most important pieces of the AV collection.\n
  3. Nowadays av collections are undergoing process of transformation from archives of analog material to large digital (online) data stores, as videos are very much wanted by different types of end users. \n\nFor example, the Netherlands Institute of Sound and Vision archives all radio and TV material broadcasted in the Netherlands (has appr. 700,000 hours radio and television programs available online. \n\nFacilitating a successfully access to av collection items demands quality metadata associated with them.\n\nTraditionally, in AV achives it is the task of professional catalogers to manually describe the videos. Usually, in the process they follow well-defined , well-established guidelines and rules. They also may make use of auxiliary materials like controlled vocabularies, thesauri, and such.\nHowever, as we all know video is medium that is extremely rich in meaning. Directors and screenwriters create entire universes with complex interplay between characters, objects and events. Sometimes they may employ rich and complex abstract symbolic language. This makes that task of describing the meaning of a video as complicated as describing the real worlds. Which is no trivial matter.\nAs a result the process of annotation is tedious, time-consuming and inevitably incomplete. According to some research, it takes approximately 5 times of the duration of the material to annotate it completely. So for example, if we are talking about a documentary that lasts one hour, it will take approximately 5 hours for a cataloger to fully describe it. Furthermore,\n\nConsequently, professional annotations are coarse-grained in a sense that they are referring to the entire video describing prevalent topics. It may happen that catalogers provide more fine-grained, shot-level descriptions for a video. But this is exception of the rule and it is reserved to the most important pieces of the AV collection.\n
  4. Our solution to both these problem is to exploit crowdsourcing on the web, where there is a lot of specific domain knowledge and processing power available that such archives and museums would gladly incorporate. However, turning this knowledge from the mass of lay users into additional intelligence in content management systems is not a simple challenge for several reasons: (1) the lay users use different vocabularies and are also interested in different aspects of the collection items; (2) the collected user metadata needs to be validated for quality and correctness; (3) crowd-sourcing users need to be continuously supported and motivated in order to supply more and better knowledge in such systems.  I will present our data analysis of the crowdsourced content, our techniques for quality control and incorporating domain semantics, and the results of our attempts to employ different interaction strategies and user interfaces (for example games and simplified interfaces for interactive annotation) to engage and stimulate the users.\n\nunderstanding the user-generated data\ncontextualize the user-generated metadata \n\n\n
  5. Our solution to both these problem is to exploit crowdsourcing on the web, where there is a lot of specific domain knowledge and processing power available that such archives and museums would gladly incorporate. However, turning this knowledge from the mass of lay users into additional intelligence in content management systems is not a simple challenge for several reasons: (1) the lay users use different vocabularies and are also interested in different aspects of the collection items; (2) the collected user metadata needs to be validated for quality and correctness; (3) crowd-sourcing users need to be continuously supported and motivated in order to supply more and better knowledge in such systems.  I will present our data analysis of the crowdsourced content, our techniques for quality control and incorporating domain semantics, and the results of our attempts to employ different interaction strategies and user interfaces (for example games and simplified interfaces for interactive annotation) to engage and stimulate the users.\n\nunderstanding the user-generated data\ncontextualize the user-generated metadata \n\n\n
  6. Our solution to both these problem is to exploit crowdsourcing on the web, where there is a lot of specific domain knowledge and processing power available that such archives and museums would gladly incorporate. However, turning this knowledge from the mass of lay users into additional intelligence in content management systems is not a simple challenge for several reasons: (1) the lay users use different vocabularies and are also interested in different aspects of the collection items; (2) the collected user metadata needs to be validated for quality and correctness; (3) crowd-sourcing users need to be continuously supported and motivated in order to supply more and better knowledge in such systems.  I will present our data analysis of the crowdsourced content, our techniques for quality control and incorporating domain semantics, and the results of our attempts to employ different interaction strategies and user interfaces (for example games and simplified interfaces for interactive annotation) to engage and stimulate the users.\n\nunderstanding the user-generated data\ncontextualize the user-generated metadata \n\n\n
  7. Our solution to both these problem is to exploit crowdsourcing on the web, where there is a lot of specific domain knowledge and processing power available that such archives and museums would gladly incorporate. However, turning this knowledge from the mass of lay users into additional intelligence in content management systems is not a simple challenge for several reasons: (1) the lay users use different vocabularies and are also interested in different aspects of the collection items; (2) the collected user metadata needs to be validated for quality and correctness; (3) crowd-sourcing users need to be continuously supported and motivated in order to supply more and better knowledge in such systems.  I will present our data analysis of the crowdsourced content, our techniques for quality control and incorporating domain semantics, and the results of our attempts to employ different interaction strategies and user interfaces (for example games and simplified interfaces for interactive annotation) to engage and stimulate the users.\n\nunderstanding the user-generated data\ncontextualize the user-generated metadata \n\n\n
  8. \n
  9. \n
  10. \n
  11. \n
  12. \n
  13. \n
  14. \n
  15. \n
  16. \n
  17. \n
  18. \n
  19. \n
  20. \n
  21. \n
  22. \n
  23. \n
  24. \n
  25. \n
  26. \n
  27. \n
  28. \n
  29. \n
  30. \n
  31. \n
  32. \n
  33. \n
  34. \n
  35. \n
  36. \n
  37. \n
  38. \n
  39. \n
  40. \n
  41. \n
  42. \n
  43. \n
  44. \n
  45. \n
  46. \n
  47. \n
  48. \n
  49. \n
  50. \n
  51. \n
  52. \n
  53. \n
  54. \n
  55. \n
  56. \n
  57. - group by type (faceted)\n- concept is uniquely identified (see full name for Juliana and Bernhard)\n- background information\n
  58. - group by type (faceted)\n- concept is uniquely identified (see full name for Juliana and Bernhard)\n- background information\n
  59. - group by type (faceted)\n- concept is uniquely identified (see full name for Juliana and Bernhard)\n- background information\n
  60. - group by type (faceted)\n- concept is uniquely identified (see full name for Juliana and Bernhard)\n- background information\n
  61. - group by type (faceted)\n- concept is uniquely identified (see full name for Juliana and Bernhard)\n- background information\n
  62. - group by type (faceted)\n- concept is uniquely identified (see full name for Juliana and Bernhard)\n- background information\n
  63. - group by type (faceted)\n- concept is uniquely identified (see full name for Juliana and Bernhard)\n- background information\n
  64. - group by type (faceted)\n- concept is uniquely identified (see full name for Juliana and Bernhard)\n- background information\n
  65. - group by type (faceted)\n- concept is uniquely identified (see full name for Juliana and Bernhard)\n- background information\n
  66. - group by type (faceted)\n- concept is uniquely identified (see full name for Juliana and Bernhard)\n- background information\n
  67. \n