Enrichment of News Show Videos with Multimodal Semi-Automatic Analysis

The talk was delivered by Vasileios Mezaris during the NEM 2012 Summit in October 2012 in Istanbul, Turkey. More info: http://bit.ly/XUP0jg

Transcript

  • 1. Television Linked To The Web
    Enrichment of News Show Videos with Multimodal Semi-Automatic Analysis
    Daniel Stein (1), Evlampios Apostolidis (2), Vasileios Mezaris (2), Nicolas de Abreu Pereira (3), Jennifer Müller (3), Mathilde Sahuguet (4), Benoit Huet (4), Ivo Lašek (5)
    (1) Fraunhofer Institute IAIS, Schloss Birlinghoven, Germany
    (2) Information Technologies Institute, CERTH, Greece
    (3) rbb - Rundfunk Berlin-Brandenburg, 14482 Potsdam, Germany
    (4) Eurecom, Sophia Antipolis, France
    (5) Czech Technical University and University of Economics, Prague, Czech Republic
    NEM Summit, Istanbul, October 2012
    www.linkedtv.eu
  • 2. Synopsis
      • Introduction: LinkedTV Project
      • Use Cases
      • Intelligent Video Analysis
      • Results
      • Conclusions & future plans
    Information Technologies Institute, Centre for Research and Technology Hellas
  • 3. LinkedTV: Television Linked To the Web
    Vision:
      • hypervideo
      • ubiquitously online cloud of Networked Audio-Visual Content
      • decoupled from place, device or source
    12 Excellent Partners: Fraunhofer, Eurecom, STI GMBH, Condat, CERTH, BEELD EN GELUID, UEP, Noterik, UMONS, U. ST GALLEN, CWI, RBB
    Aim:
      • provide interactive multimedia service for non-professional end-users
      • focus on television broadcast content as seed videos
    Web: http://www.linkedtv.eu
  • 4. LinkedTV Workflow
      • Overall Architecture
      • Use Case Scenarios
      • Intelligent Video Analysis
      • Linking Hypervideo to Web Content
      • Contextualization and Personalization
      • Interface and Presentation Engine
  • 5. LinkedTV Workflow
      • Overall Architecture
      • Use Case Scenarios
      • Intelligent Video Analysis
      • Linking Hypervideo to Web Content
      • Contextualization and Personalization
      • Interface and Presentation Engine
  • 6. Two Use Case Scenarios in LinkedTV
    Scenario 1 (this talk): Interactive News Show
      • Professional news content produced by RBB
      • Due to legal constraints: whitelist
      • Detailed scenario archetype description
      • Seed content: local news show "rbb Aktuell"
      • News topic, people, locations, objects etc.
    Scenario 2 (not covered here): Hyperlinked Documentary
      • Cultural content from S&V (1700 hours of cultural heritage AV-content under CCL)
      • Seed content: "Antique Roadshow"
  • 7. Intelligent Video Analysis
  • 8. Segmentation
    Shot segmentation technique [Tsamoura et al., 2008]
      • News show video performance: "remarkably well"
      • Out of 269 shots detected:
          • 2 had wrong starting points
          • 4 contained multiple shots
          • 11 were too short to evaluate
    Spatio-temporal segmentation [Mezaris et al., 2004]
      • News show performance: good
      • False positives due to:
          • camera movement or zoom in/out (~55%)
          • gradual transition between frames (~10%)
          • erroneous motion vectors (~35%)
      • Unwanted effect: false recognition of properly moving banners which do not yield additional information
    References:
      • V. Mezaris, I. Kompatsiaris, N. V. Boulgouris, and M. G. Strintzis, "Real-time compressed-domain spatiotemporal segmentation and ontologies for video indexing and retrieval", IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, no. 5, pp. 606-621, May 2004.
      • E. Tsamoura, V. Mezaris, I. Kompatsiaris, "Gradual transition detection using color coherence and other criteria in a video shot meta-segmentation framework", IEEE International Conference on Image Processing, Workshop on Multimedia Information Retrieval (ICIP-MIR 2008), San Diego, CA, USA, October 2008, pp. 45-48.
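To make the idea of a shot boundary concrete, here is a minimal sketch of a generic histogram-difference cut detector. It is not the method of [Tsamoura et al., 2008] or [Mezaris et al., 2004]; it is a textbook baseline swapped in for illustration. The function names, the bin count, the threshold, and the representation of a frame as a flat list of 8-bit grayscale values are all assumptions.

```python
def histogram(frame, bins=8):
    """Normalized intensity histogram of one frame.

    `frame` is assumed to be a non-empty flat list of grayscale values in 0..255.
    """
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // 256, bins - 1)] += 1
    total = len(frame)
    return [count / total for count in counts]


def detect_cuts(frames, threshold=0.5, bins=8):
    """Return the indices of frames that start a new shot.

    A cut is reported when half the L1 distance between the histograms of
    consecutive frames (a value between 0 and 1) exceeds `threshold`.
    """
    if not frames:
        return []
    cuts = []
    previous = histogram(frames[0], bins)
    for index in range(1, len(frames)):
        current = histogram(frames[index], bins)
        distance = sum(abs(a - b) for a, b in zip(previous, current)) / 2
        if distance > threshold:
            cuts.append(index)
        previous = current
    return cuts


# Hypothetical toy input: three dark frames followed by two bright frames.
toy_frames = [[10] * 16] * 3 + [[200] * 16] * 2
cut_indices = detect_cuts(toy_frames)
```

A hard threshold on consecutive frames reacts only to abrupt changes, so a baseline like this one would need extra logic to handle gradual transitions.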
  • 9. Concept Detection
    Method described in [Moumtzidou et al., 2011]; 346 concepts from the TRECVID 2011 SIN task.
    Overall performance:
      • Correctly detected concepts: > 64%
      • About 25% of them are characterized as particularly useful, mostly related to detecting persons (e.g., person, face, adult)
      • Erroneous concepts vary between 22% and 42% and in many cases achieve high scores (e.g., outdoor, amateur video)
    Visit: http://mklab.iti.gr/eventdetection-linkedtv/
    Reference:
      • A. Moumtzidou, P. Sidiropoulos, S. Vrochidis, N. Gkalelis, S. Nikolopoulos, V. Mezaris, I. Kompatsiaris, I. Patras, "ITI-CERTH participation to TRECVID 2011", Proc. TRECVID 2011 Workshop, December 2011, Gaithersburg, MD, USA.
  • 10. Automatic Speech Recognition
    Automatic speech recognition for German (using [Schneider08]):

    Segment of one news show   WER (%)   Notes
    new airport                36.2      outdoor, spontaneous
    soccer riot                44.2      tavern, dialect, background noise
    various news I              9.5
    murder case                24.0
    boxing                     50.6      dialect, very spontaneous
    various news II            20.9
    rbb game                   39.1
    weather report             46.7      spontaneous, casual

    Main obstacles: local dialect, spontaneous speech, background noise
    Reference:
      • Schneider, D., Schon, J., and Eickeler, S. (2008). Towards Large Scale Vocabulary Independent Spoken Term Detection: Advances in the Fraunhofer IAIS Audiomining System. In Proc. SIGIR, Singapore.
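The WER figures on slide 10 follow the usual definition of word error rate: the minimum number of word substitutions, deletions and insertions needed to turn the recognizer output into the reference transcript, divided by the number of reference words. The sketch below computes this with a standard edit-distance recurrence. It is an illustration only, not the scoring tool used for the talk; the function name and the example sentences are made up.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word error rate in percent.

    WER = (substitutions + deletions + insertions) / number of reference words,
    computed as the word-level Levenshtein distance.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # previous[j] = edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i] + [0] * len(hyp_words)
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1      # reference word missing from output
            insertion = current[j - 1] + 1  # extra word in output
            current[j] = min(substitution, deletion, insertion)
        previous = current
    return 100.0 * previous[-1] / len(ref_words)


# Hypothetical example: one substitution and one deletion over six reference words.
example_wer = word_error_rate(
    "der neue flughafen wird heute eroeffnet",
    "der neue hafen wird eroeffnet",
)
```

Because insertions count as errors, WER can exceed 100% when the recognizer outputs many spurious words.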
  • 11. Person Detection
    Face clustering using the face.com API
      • Result: generally very good, some erroneous clusters due to side-view
    Speaker identification using a GMM-HMM model, with 253 German parliament speakers
      • Result: 8.0% Equal Error Rate
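The Equal Error Rate quoted on slide 11 is the operating point at which the false acceptance rate (impostor trials accepted) equals the false rejection rate (genuine trials rejected). The sketch below approximates it by sweeping a decision threshold over the observed scores. It assumes that higher scores mean a better match and that a trial is accepted when its score is at or above the threshold; the function name and the toy scores are hypothetical and unrelated to the system in the talk.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the Equal Error Rate in percent.

    Sweeps every observed score as a threshold and returns the mean of the
    false acceptance and false rejection rates at the threshold where the
    two rates are closest.
    """
    if not genuine_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")

    best_gap = None
    best_rate = None
    for threshold in sorted(set(genuine_scores) | set(impostor_scores)):
        false_accept = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        false_reject = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
        gap = abs(false_accept - false_reject)
        if best_gap is None or gap < best_gap:
            best_gap = gap
            best_rate = (false_accept + false_reject) / 2
    return 100.0 * best_rate


# Hypothetical toy scores: one genuine trial scores low, one impostor scores high.
toy_eer = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.1, 0.2, 0.3, 0.6])
```

With few trials the two error rates may never coincide exactly, which is why the sketch reports the midpoint at the closest threshold.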
  • 12. Conclusions
    We have established:
      • all the different video analysis techniques
      • their exact functionality
      • the connections among them
    Preliminary results serve as solid ground for further improvements.
    Many challenges have been addressed, but several aspects of the analysis techniques show much room for improvement, e.g.,
      • over-sensitivity of the spatiotemporal segmentation algorithm to gradual transitions and camera movement
      • adaptation of several TRECVID concepts to the needs of each specific multimedia content (news show, documentary, art show)
      • over-sensitivity of the speech recognizer to localized dialects and background noise
  • 13. Future Plans
    Incorporate new methods:
      • Near-duplicate Content Detection
          • Goal: find parts that are already watched
      • Optical Character Recognition
          • Goal: exploit banner information to obtain a database for face and speaker recognition
      • Topic Segmentation
          • Goal: improve scene segmentation
    Find synergies between methods:
      • ASR + Speaker Recognition + Face Detection → Person Detection
      • ASR + Topic Classification + Shot Segmentation → Story Segmentation
      • Concept Detection + Keywords Extraction + Topic Segmentation → Video Similarity/Clustering
  • 14. Questions?
    More information:
      • http://www.iti.gr/~bmezaris
      • bmezaris@iti.gr
      • http://www.linkedtv.eu