Audiovisual Content Exploitation at FIA 15042010 NISV


Published in: Technology
  • The concept of technology-enhanced audiovisual indexing may have been a holy grail in earlier research, but recently it has become an urgent prerequisite in the context of our present-day information society. The effective exploitation of our Digital Libraries, however, is currently impeded despite a wealth of technological progress. It is becoming increasingly apparent that there may be an underlying problem, rooted in the disparity between technology and user needs. To respond to the demands of the information society, advanced techniques and new workflow practices need to be explored by focusing on the needs of contemporary users, both professionals and non-professionals. At the same time, we need to foster opportunities for drawing user communities into the Digital Libraries and to involve them in enhancing content exploitability, e.g., by community tagging or by capturing user generated content from the internet and aligning it with archived items. In this talk, the area of tension between the current state of technology and user needs is discussed in more detail in the context of practical use cases within the Netherlands Institute for Sound and Vision.
  • Let me first introduce Sound and Vision. Sound and Vision maintains and provides access to 70 per cent of the Dutch audiovisual heritage, comprising approximately 700,000 hours of television, radio, music and film, which makes SV one of the largest audiovisual archives in Europe. The archive is growing. SV is the business archive of the national broadcasting corporations, and digitally born television and radio made by the Dutch public broadcasting companies flows in right after it is broadcast. Content from Dutch cultural heritage institutes and regional broadcast institutes also flows in via the SV archive service PROARCHIVE. Finally, we are currently manually selecting Dutch user generated content from the internet. More on this later.
  • € 173 million over 7 years (project period 2007-2014), though with an obligation to earn the investment back. During the project period: € 19 million.
  • To make these kinds of use cases possible, describing material manually is not the solution, because it is too costly and by definition limited. Content managers often lack the resources to produce even minimal descriptions such as title and date.
  • To describe material nonetheless, and in more detail, a number of strategies are conceivable. One is to call on the public to help describe material, so-called crowdsourcing. An important alternative, which we have researched in Twente, is the use of available textual sources such as subtitles, minutes or notes as descriptions of content. With the help of speech technology, these descriptions can sometimes also be synchronised with the material (think of minutes), or some form of synchronisation is already present (in the case of subtitles). But such sources are often not available. In that case, technology that automatically generates descriptions based on audiovisual features can offer a solution.
  • All of this is potentially interesting content for different types of users: producers looking for material to reuse, journalists, researchers, and of course the general public. Just a sample of the possible use cases you can think of:

    1. 1. Audiovisual content exploitation in the networked information society<br />Roeland Ordelman<br />Research & Development<br />Netherlands Institute for Sound and Vision<br /><br />
    2. 2. contents<br /><ul><li>NISV context: Images for the Future
    3. 3. “Access”, an important keyword in the business models ...
    4. 4. ... but what about access in practice
    5. 5. Technology and user interaction: from a ‘laboratory view’ on users to drawing them into the development chain</li></li></ul><li>NISV context<br /><ul><li>+700.000 hours of radio, television, documentaries, films and music, over 2 million photographs, 20.000 objects like cameras, televisions, radios, costumes and pieces of scenery.
    6. 6. still growing:
    7. 7. digitally born television and radio programs made by the Dutch public broadcasting companies (video: 15K/hours/year)
    8. 8. PROARCHIVE: archiving service
    9. 9. selection of (Dutch) user generated content</li></li></ul><li>
    10. 10. Images for the Future<br /><ul><li>Selection, restoration, digitization, encoding and storage of 137,000 hours of video, 20,000 hours of film, 124,000 hours of audio and more than three million photographs.
    11. 11. One of the largest digitisation efforts in Europe
    12. 12. Three goals:
    13. 13. Safeguarding heritage for future generations
    14. 14. Creating socio-economic value (“unlock the social and economic potential of the collections”)
    15. 15. Innovation: new infrastructure for strengthening knowledge economy
    16. 16. To achieve these objectives, the cultural heritage sector is challenged to re-evaluate its business models</li></li></ul><li>Business model<br /><ul><li>The total investment of this initiative sums up to 173 million Euro
    17. 17. A strong business model is necessary to support this kind of investment and prove that such an investment will result in long-term socio-economic returns
    18. 18. The outcome of a Cost-Benefit analysis was positive: “The total balance of costs and returns of restoring, preserving and digitising audio-visual material (excluding costs of tax payments) will be between: 20+ and 60+ million.”
    19. 19. Economic benefits:
    20. 20. Direct effects of the investment are revenues from sales, access for specific user groups, the repartition of copyright for the use of the material and so on.
    21. 21. The indirect effects concern the product markets and labour market.
    22. 22. Social benefits:
    23. 23. conservation of culture, reinforcement of cultural awareness, reinforcement of democracy through the accessibility of information, increase in multimedia literacy and contribution to the Lisbon goals set by the EU</li></ul><br />
    24. 24. Content exploitation: from content is king ...<br />
    25. 25. ... to metadata rules<br />
    26. 26. Manual annotation<br />costly & limited<br />
    27. 27. Research on automatic annotation<br /><ul><li>automatic information extraction based on:
    28. 28. visual features
    29. 29. information from audio
    30. 30. crowdsourcing
    31. 31. deploying collateral data sources:
    32. 32. subtitles, production scripts, meeting minutes, slides</li></li></ul><li>Various (laboratory) showcases<br />Commercial systems (e.g., blinkx, google)<br />Progress? Yes!<br />
    33. 33. work in progress<br /><ul><li>institutional: reorganisation of traditional archival workflows
    34. 34. national: development of common services
    35. 35. OAI, Persistent Identifiers, ASR service, Vocabulary Repositories
    36. 36. commercial: uptake by MNCs (Google and Microsoft) and SMEs
    37. 37. individual: bring about a shift in the defensive attitude of content owners towards opening up their funded and protected archives (trust/reliability) </li></li></ul><li>Automatic annotation<br /><ul><li>Participation in international research projects
    38. 38. VideoActive, MultiMATCH, VIDI-video, LiWA, P2P-Fusion, Sterna, EUScreen, PrestoPrime
    39. 39. Collaboration agreement with Dutch research institutes
    40. 40. Researchers stationed at Sound and Vision
    41. 41. Provide data (TRECVID, VideoCLEF)
    42. 42. Research environment: exact copy of iMMix production environment for testing new technology
    43. 43. speech recognition
    44. 44. video analysis
    45. 45. fingerprinting
    46. 46. linking of context data (web, program guide, production data)</li></li></ul><li>Annotation strategies<br /><ul><li>crowdsourcing: video labeling game
    47. 47. deploying collateral data sources: incorporation of subtitles
    48. 48. automatic information extraction: speech recognition for radio, pilots with visual
    49. 49. technology aided manual annotation: documentalist support
    50. 50. linking to other information sources </li></li></ul><li>media professionals<br />journalists<br />researchers<br />educators<br />general public <br />disparity between technology and user needs<br />
    51. 51. Users' perspective<br /><ul><li>Rapidly evolving networked information society
    52. 52. Opening up
    53. 53. Focus on community specific requirements
    54. 54. search needs
    55. 55. presentation/interaction needs
    56. 56. Draw communities into libraries</li></li></ul><li>Digital Archive<br />Digitising Legacy Material<br />Images for the Future<br />>250.000 hrs of audio and video<br />Digital Born<br />15.000 hours of video<br />40.000 hours of radio<br />content<br />content<br />(encoding)<br />(import)<br />Asset management<br />metadata<br />metadata<br />(conversions)<br />(import)<br />Education<br />Exhibitions<br />User generated content and metadata<br />Broadcast<br />Professional<br />Public Web Access<br />
    57. 57. "if it doesn't spread, it is dead" (Jenkins, 2009)<br />
    58. 58. Open Images<br /><ul><li>Open media platform for online access to audiovisual archive material, available for free (creative) reuse
    59. 59. Built by Sound and Vision & Knowledgeland
    60. 60. Contributors include:</li></li></ul><li>Open, open, open<br /><ul><li>Open source media platform (MMBase)
    61. 61. Use of an open video codec (Ogg Theora)
    62. 62. Use of the HTML5 <video> tag
    63. 63. Use of an open API (OAI-PMH, Atom feeds) </li></li></ul><li>Licence<br /><ul><li>CC-BY-SA as preferred licence
    64. 64. 3,000 items from our ‘own’ collection
    65. 65. ‘Internet quality’</li></li></ul><li>Open Images<br />Rights owned by Sound and Vision<br />Digitised items<br />Sound and Vision collection<br />
    66. 66.
    67. 67.
    68. 68.
    69. 69.
    70. 70. community specific requirements<br />From document level search to fragment level search <br />
    71. 71. Broadcast professionals<br />In: Huurnink, Hollink, van Den Heuvel 2009 (submitted)<br />
    72. 72. User survey (broadcast professionals)<br />
    73. 73. Sound and Vision: Education<br /><ul><li>Government and ‘Images for the Future’
    74. 74. Earlier Initiatives
    75. 75. ED*IT latest development completed with tools for </li></ul> teacher and student<br /><ul><li>ED*IT has been tested and developed in cooperation with many schools</li></li></ul><li>ED*IT: Proposition <br /><ul><li>One environment provides access to different </li></ul> controlled content databases (video, audio, photographs, articles, etc.)<br /><ul><li>Editorial Staff contextualizes and enriches </li></ul> content for educational use<br /><ul><li>Enriched with tools for student and teacher to edit </li></ul> content in an easy way<br /><ul><li>For primary, secondary and vocational education</li></li></ul><li>ED*IT: Functionalities<br />Cut Videoclips<br />Digital Paper Maker<br />Presentation Maker<br />Edit Photographs<br />Video & Content Editor<br />Upload Files<br />Dossier Maker<br />Teacher Forum<br /> E-Lesson Maker<br />Timeline Maker<br />
    76. 76. ED*IT: Facts & Figures<br /><ul><li>Test Accounts: 2500
    77. 77. Licence: 50 schools
    78. 78. Licence: 50 educational departments
    79. 79. Objective: the same market share as Teleblik, which is 78%</li></li></ul><li>Researchers<br /><ul><li>Verteld Verleden aims at establishing a shared information space on distributed Dutch Oral History collections:
    80. 80. distributed collections (harvested via OAI)
    81. 81. search & interlink collections via centralized search
    82. 82. project goals: </li></ul>provide demonstrator portal to show how technology could help researchers<br />acquire information on specific user requirements <br />search<br />collaboration<br />linking<br />privacy<br />dedicated work space<br /><br />
    83. 83. example VPRO radio interviews<br />QUOTE<br />“VISUAL RADIO”<br />
    84. 84. interaction requirements<br />people expect easy interaction as <br />in the 'every-day tools' they use on the web ... <br /><ul><li>The Sound and Vision Experience: a crossover between a museum and an amusement park with various archive material, to make audiovisual heritage more accessible to the general public</li></li></ul><li>interaction<br /><ul><li>people expect easy interaction as in the 'every-day tools' they use on the web
    85. 85. next generation interaction:
    86. 86. hyperlinked video
    87. 87. interactive and collaborative interaction modalities
    88. 88. a truly Internet connected society</li></li></ul><li>draw communities into libraries<br />
    89. 89. goals<br /><ul><li>exploiting community tagging (tagging games, etc)
    90. 90. exploring the wisdom of crowds by hooking up with user communities (e.g., everyone-as-commentator, unexpected experts)
    91. 91. capturing relevant information from the internet and aligning this with archived items.
    92. 92. finding new ways for communities to interact with the data.</li></li></ul><li>Technology perspective<br />Technology:<br /><ul><li>provide anchor points for linking up with the ‘cloud’ (entity detection, segmentation, cross-collection SID, etc.): people, places, events, topics, quotes, etc.
    93. 93. keywords: reliability, speed
    94. 94. synchronization of UGC with AV documents
    95. 95. users in the loop: UGC for adapting/training analysis tools
    96. 96. early fusion of multiple modalities (vision, speech)
    97. 97. technology aided annotation: Documentalist Support System</li></li></ul><li>
    98. 98. Crowdsourcing<br />14 minutes left for annotation<br />you score when somebody else uses the same term<br />fill in words that describe what you see or hear<br />Play!<br />
    99. 99. Tagging game example<br /><br />
    100. 100. Play against ASR<br />
    101. 101.<br /><br />
    102. 102. Hollands Glorie op Pinkpop<br />
    103. 103. Wrap up<br /><ul><li>value of archive is strongly related to access opportunities
    104. 104. access is to a large extent technology driven
    105. 105. but next to technology development we need to make a shift:
    106. 106. from a ‘laboratory view’ on users to drawing users and communities into the loop
    107. 107. NISV is aiming towards this two-way strategy:
    108. 108. incorporate advanced access technology
    109. 109. discuss access requirements with the stakeholders</li>
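
The crowdsourcing slide describes the video labeling game with a single rule: "you score when somebody else uses the same term". The sketch below illustrates that agreement rule in Python. It is not the implementation used by Sound and Vision: the function names, the one-point-per-matching-tag scoring, the tag normalisation and the example players are all assumptions made for illustration.

```python
from __future__ import annotations

from collections import defaultdict


def normalise(tag: str) -> str:
    """Lowercase and collapse whitespace so 'Guitar ' and 'guitar' match (assumed rule)."""
    return " ".join(tag.lower().split())


def score_tags(entries: dict[str, list[str]]) -> dict[str, int]:
    """Give each player one point per distinct tag that another player also entered.

    `entries` maps a (hypothetical) player name to the tags that player typed
    for the same video fragment.
    """
    # Map each normalised tag to the set of players who entered it.
    players_by_tag: dict[str, set[str]] = defaultdict(set)
    for player, tags in entries.items():
        for tag in tags:
            key = normalise(tag)
            if key:  # ignore empty input
                players_by_tag[key].add(player)

    scores = {player: 0 for player in entries}
    for players in players_by_tag.values():
        if len(players) >= 2:  # at least two different players agree on the tag
            for player in players:
                scores[player] += 1
    return scores


# Hypothetical tags entered by three players for one concert fragment.
EXAMPLE_ENTRIES = {
    "anna": ["Guitar", "stage", "crowd"],
    "bram": ["guitar ", "festival"],
    "carla": ["crowd", "guitar", "rain"],
}

if __name__ == "__main__":
    print(score_tags(EXAMPLE_ENTRIES))
```

Under these assumptions, tags that only one player enters earn nothing, so players are nudged towards terms that others would also choose, and the agreed tags are the ones an archive might keep as candidate metadata.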