These are the slides from my keynote talk about Video Browsing on June 18, 2014, at the International Workshop on Content-Based Multimedia Indexing (CBMI) 2014.
Interactive Video Search - Tutorial at ACM Multimedia 2015 (klschoef)
This is the presentation given by Klaus Schoeffmann and Frank Hopfgartner at the ACM Multimedia 2015 Tutorial in Brisbane, Australia (October 26, 2015). #acmmm15
Find paper here:
http://dl.acm.org/citation.cfm?id=2807417
This document discusses technologies for video fragment creation and annotation for the purpose of video hyperlinking. It describes video temporal segmentation into shots and scenes to break videos into fragments. It also discusses visual concept detection and event detection for annotating fragments so that meaningful hyperlinks between fragments can be identified. An example approach is described that uses visual features to detect both abrupt and gradual shot transitions with high accuracy, while running 7-8 times faster than real time.
This document discusses structured interactive scores, which provide a formalism for interactive multimedia. It presents examples of multimedia interaction in domains like contemporary dance and interactive installations. It describes problems with most existing multimedia tools, such as a lack of formal semantics and unrelated time models. The document proposes interactive scores as a solution and discusses their history and development at LaBRI. It outlines existing tools for interactive scores, related formalisms, and the Virage software implementation.
TVSum: Summarizing Web Videos Using Titles (NEERAJ BAGHEL)
Title-based video summarization is a relatively unexplored domain; there is no publicly available dataset suitable for our purpose.
The authors therefore collected a new dataset, TVSum50, that contains 50 videos and their shot-level importance scores obtained via crowdsourcing.
SpokenMedia: Automatic Lecture Transcription and Rich Media Notebooks (Brandon Muramatsu)
Need to find a specific segment in an hour-long web video, webcast or podcast of a lecture? Want to read a transcript of that lecture? Want to bookmark, annotate, or discuss video or audio clips from an entire lecture? The SpokenMedia project at MIT is developing a web-based service to enable automatic lecture transcription. The project is also developing a suite of tools and services to improve interaction with webcasts and podcasts, enabling students and faculty to create rich media notebooks to support their learning and teaching. Presented by Brandon Muramatsu, Andrew McKinney and Peter Wilkins at NERCOMP 2010, Providence, Rhode Island, March 9, 2010.
Hyper Video Browser: Search and Hyperlinking in Broadcast Media (Benoit HUET)
Massive amounts of digital media are being produced and consumed daily on the Internet. Efficient access to relevant information is of key importance in contemporary society. The Hyper Video Browser provides multiple navigation means within the content of a media repository. Our system utilizes state-of-the-art multimodal content analysis and indexing techniques, at multiple temporal granularities, to satisfy user needs by suggesting relevant material.
We integrate two intuitive interfaces: one for searching and browsing the video archive, and one for hyperlinking to related content while watching a video. The novelty of this work includes a multi-faceted search and browsing interface for navigating video collections, and the dynamic suggestion of hyperlinks related to the media fragment being viewed rather than to the entire video.
The approach was evaluated on the MediaEval Search and Hyperlinking task, demonstrating its effectiveness at accurately locating relevant content in a large media archive.
SpokenMedia: Content, Content Everywhere...What video? Where? at OpenEd 2009 (Brandon Muramatsu)
This document discusses challenges in discovering and accessing open educational resource (OER) videos and audio lectures online. It proposes a lecture transcription service to improve searchability and discoverability of educational videos by generating timed transcripts. The service would automate transcription of lecture-style content and integrate with video hosting and production workflows. Transcripts could then enable richer search, playback, social features and reuse to enhance the user experience of OER videos.
Improving the OER Experience: Enabling Rich Media Notebooks of OER Video and ... (Brandon Muramatsu)
The SpokenMedia project at MIT is developing a web-based service to enable automatic lecture transcription. It is also developing a suite of tools and services to improve interaction with OER webcasts and podcasts, enabling students and faculty to create rich media notebooks to support their learning and teaching. Presented by Brandon Muramatsu at OER 10, Cambridge, UK, March 23, 2010.
Interactive Video Search: Where is the User in the Age of Deep Learning? (klschoef)
Interactive video retrieval tools are commonly evaluated using user studies, log file analysis, and indirect task-based evaluations like competitions. User studies directly observe users performing tasks with a tool and provide qualitative feedback. Log file analysis examines quantitative interaction patterns. Competitions like TRECVID and Video Browser Showdown pose search tasks to quantitatively compare tools. A combination of methods is often used to fully understand a tool's effectiveness from different perspectives.
Libraries as Motion Video: Setting up an in-house studio, getting visual & ex... (Bernadette Daly Swanson)
Libraries as Motion Video: Setting up an in-house studio, getting visual & extending skill-sets into new environments.
Created for the 3.5-hour Engage Workshop during the pre-conference for CARL (California Academic & Research Libraries Conference), April 8-10, 2010, Sacramento, CA.
PDF of the paper from CARL proceedings:
http://carl-acrl.org/Archives/ConferencesArchive/Conference10/2010proceedings/BernadetteDalySwanson.pdf
Accompanying video used during workshop:
http://www.youtube.com/watch?v=hktUGfpLhTw&hd=1
Library Video Channel:
http://www.youtube.com/user/libraryvideochannel
Presenters: Bernadette Daly Swanson & Meredith Saba, UC Davis
Photo credits: many images purchased from http://www.istockphoto.com - istockphoto, Bernadette Daly Swanson, Wikipedia, with screen captures from Second Life® and YouTube, assorted Library websites.
This document describes research using a video repository called the Video Mosaic Collaborative to build multimedia artifacts called VMC Analytics. Students and researchers create narratives with video clips to illustrate concepts from mathematics education and learning sciences. The artifacts are analyzed to identify high-quality examples and emerging themes. Word clouds are generated from coded artifacts to visualize dominant themes. Analysis of contrasting cases shows how artifacts can illustrate connections between teacher questioning and student engagement or conceptual understanding.
Automated Lecture Transcription at OCW Consortium Global Meeting 2009 (Brandon Muramatsu)
Introduction and background to the automated lecture transcription/lecture transcription service project by MIT's Office of Educational Innovation and Technology (OEIT). Presented by Brandon Muramatsu at the OCW Consortium Global Meeting in Monterrey, Mexico, April 22, 2009.
In-Time On-Place Learning — Creation, Annotation and Sharing of Location-Base... (Teemu Leinonen)
Presentation at the 10th International Conference on Mobile Learning 2014, 28 February – 2 March, Madrid, Spain. The aim of the research is to look at how mobile video recording devices could support learning related to physical practices, places, and situations at work. The paper discusses a particular kind of workplace learning, namely learning using short video clips that are related to the physical environment and to tasks performed in situ. It presents the challenges of supporting learning as part of work practices in the workplace, because learning has different attributes during work than in formal educational contexts: it is informal, just-in-time, and social. The theoretical framework of the design is the tradition of pragmatism. We start with the concepts of experience, change of practices/habits, and reflection, claiming that living through experiences suggests changes to practices, and these trigger reflective processing of the situations. We present an Android application, 'Ach So!', for creating and annotating short videos as a potential solution for informal learning of physical work practices. The paper ends by proposing future steps in the development of the application. The co-design process for the application is lean and iterative: the design receives feedback from the project partners, skilled workers, apprentices, and managers of the SMEs targeted to be the main users of the application.
In this talk I will address issues of "rigour" and "quality" in qualitative research, and the way that the two are closely aligned with how the researcher may explore various points of focus within the research process itself. Rigour and quality are inseparable from the generative nature of much qualitative inquiry, and the need to "show your workings" in the field within which the research is carried out. I will discuss this using examples of particular aspects of qualitative research that I have been involved with recently, both in design and execution. I will also discuss the opportunities and challenges of making a case for qualitative insights to augment and add value to other forms of research.
Content Modelling for Human Action Detection via Multidimensional Approach (CSCJournals)
Video content analysis is an active research domain due to the availability and growth of audiovisual data in digital format. There is a need to automatically extract video content for efficient access, understanding, browsing, and retrieval of videos. To obtain the information that is of interest and to provide better entertainment, tools are needed to help users extract relevant content and effectively navigate through the large amount of available video information. Existing methods do not attempt to model and estimate the semantic content of the video. Detecting and interpreting human presence, actions, and activities is one of the most valuable functions in the proposed framework. The general objective of this research is to analyze and process audio-video streams into a robust audiovisual action recognition system by integrating, structuring, and accessing multimodal information via a multidimensional retrieval and extraction model. The proposed technique characterizes action scenes by integrating cues obtained from both the audio and video tracks. Information is combined based on visual features (motion, edge, and visual characteristics of objects) and audio features for recognizing action. The model uses HMMs and GMMs to provide a framework for fusing these features and to represent the multidimensional structure of the framework. The action-related visual cues are obtained by computing the spatiotemporal dynamic activity from the video shots and by abstracting specific visual events. Simultaneously, the audio features are analyzed by locating and computing several sound effects of action events embedded in the video. Finally, these audio and visual cues are combined to identify the action scenes. Compared with using a single source of either the visual or the audio track alone, such combined audiovisual information provides more reliable performance and allows us to understand the story content of movies in more detail. To evaluate the usefulness of the proposed framework, several experiments were conducted; results were obtained using visual features only (77.89% precision; 72.10% recall), audio features only (62.52% precision; 48.93% recall), and combined audiovisual features (90.35% precision; 90.65% recall).
This document discusses motion media and its applications in education. Motion media refers to visual content that appears to be in motion, such as videos, films, and animations. It can communicate information through sight and sound to large audiences simultaneously. When used for education, motion media has several advantages, such as demonstrating processes and skills. It can also help teach problem solving and cultural understanding. However, it also has limitations, like a fixed pace and potential for misinterpretation. When incorporated into instruction, video-based materials can promote student-centered learning if they allow students to interpret content and apply it to new problems. Teachers can still play an important role by facilitating content and ensuring deeper understanding.
This document discusses a proposed system for semantically annotating and retrieving documentary media objects. It presents the system's architecture, which includes a manual annotation tool, authoring tool, and search engine for documentary experts. The system is based on an evolving semantic network that provides flexible organization of documentary content descriptions and related media data. The proposed approach provides semantic structures that can change and grow over time to allow ongoing interpretation of source material.
The document discusses a proposed system for semantically annotating and retrieving documentary media objects. It presents the system's architecture, which includes a manual annotation tool, authoring tool, and search engine. The key aspect of the system is using an evolving semantic network as the basis for audiovisual content description, allowing flexible organization and ongoing interpretation of source material. The proposed approach provides a way to semantically connect information nodes representing technical details of media objects to enable intelligent search and retrieval of documentary content.
This document presents an approach for automated indexing and content-based retrieval of lecture videos. It extracts textual metadata from slides using optical character recognition and spoken text from audio using automatic speech recognition. Keywords are extracted from the OCR and ASR results and used to create search indices. Video segments are identified by detecting transitions between unique lecture slides. The approach aims to enable efficient search and retrieval of specific video clips from large lecture video archives.
SpokenMedia Project: Media-Linked Transcripts and Rich Media Notebooks for Le... (Brandon Muramatsu)
The SpokenMedia project’s goal is to increase the effectiveness of web-based lecture media by improving the search and discoverability of specific, relevant media segments. SpokenMedia creates media-linked transcripts that will enable users to find contextually relevant video segments to improve their teaching and learning. The SpokenMedia project envisions a number of tools and services layered on top of, and supporting, these media-linked transcripts to enable users to interact with the media in more educationally relevant ways. Presented by Brandon Muramatsu at the Technology For Education 2009 Workshop, Bangalore, India, August 4, 2009.
These are the slides of the keynote talk I gave at CBMI 2019 (on September 4, 2019, in Dublin, Ireland) about the Video Browser Showdown (VBS) competition.
This document discusses video-based data collection methods for social network analysis research in sports, specifically football. It provides three examples of studies - two amateur football matches analyzed experimentally and one professional World Cup match analyzed using archived video data. The document also discusses the advantages and disadvantages of experimental and observational video-based data collection approaches for social network analysis.
Action event retrieval from cricket video using audio energy feature for even... (IAEME Publication)
This document summarizes a research paper that proposes an audio-based approach for retrieving action events from cricket videos. The approach detects action events by measuring abnormal increases in audio energy levels during batsman strokes and crowd cheers. The researchers extract audio features like MFCC coefficients and calculate audio energy values from cricket video soundtracks. Peaks in energy levels are detected to find action clips corresponding to strokes and cheers. The experiments analyze cricket videos and show the method can efficiently retrieve action events using only audio analysis of crowd noise and bat impacts.
Action event retrieval from cricket video using audio energy feature for event (IAEME Publication)
This document summarizes a research paper that proposes an audio-based approach for retrieving action events from cricket videos. The approach detects action events by measuring abnormal increases in audio energy levels during batsman strokes and crowd cheers. The researchers extract audio features like MFCC coefficients and calculate audio energy values from cricket video soundtracks. Peaks in energy levels are identified using adaptive thresholding to detect strokes and cheers. Video frames around detected audio peaks are retrieved as highlights of the action event. The method was tested on a dataset of cricket videos and results showed it can efficiently retrieve events like strokes and crowd reactions.
Relevant Content Detection in Cataract Surgery Videos (Invited Talk 1 at IPTA... (klschoef)
This document summarizes research on detecting relevant content in cataract surgery videos using computer vision techniques. It discusses segmenting videos into phases like incision and phacoemulsification. Instrument segmentation using Mask R-CNN is described, achieving over 90% accuracy. Relevance detection can enable compressed storage by encoding relevant segments at high quality and irrelevant segments at low quality. The goal is enabling efficient search, retrieval and analysis of cataract surgery videos for teaching, training and research.
These are the slides of the tutorial I gave at the International Conference on Image Processing Theory, Tools & Applications (IPTA 2022) on April 19, 2022.
More Related Content
Similar to Video Browsing - The Need for Interactive Video Search (Talk at CBMI 2014)
Medical Multimedia Systems and Applications (klschoef)
This document provides an overview of medical multimedia systems and applications. It discusses the use of multimedia data in medicine, including medical images, videos, and sensor data. A focus is placed on endoscopic video, including its characteristics, domains, and challenges. Applications of medical video are explored, such as post-procedural usage for documentation, training, and quality assessment. Techniques for pre-processing, analyzing, and summarizing medical video are also presented.
These are the results of the 7th Video Browser Showdown (VBS 2018), which was performed as a 3 hours competition on February 5th, 2018, at the MMM 2018 conference in Bangkok, Thailand.
These slides were presented as an introduction to the 7th Video Browser Showdown (VBS 2018) on February 5th, 2018, at the MMM 2018 in Bangkok, Thailand.
Medical Multimedia Information Systems (ACMMM17 Tutorial) (klschoef)
This document provides an overview of medical multimedia information systems and discusses various topics related to endoscopic video data. It begins with an introduction to different types of multimedia data in medicine, including medical text, sensor signals, images, and video. It then focuses on the characteristics of endoscopic video data and different research fields and communities. Several applications are discussed, including using surgery videos for post-procedural purposes and diagnostic decision support. The document also covers topics like domain-specific storage, video content analysis, and visualization and annotation of videos. It concludes with a discussion of knowledge transfer and future outlook for medical multimedia information systems.
2. Video Content Search Scenarios
• Private collection of recorded videos
Many long sequences…
You know there are a few interesting (e.g., funny) clips,
but don’t know where
Want to find them for editing/sharing
• Downloaded a suggested lecture video
In a hurry for an exam…
2 hours duration
Want to quickly check for important information
• Recordings from several surveillance cameras
Quickly look for suspicious activities (e.g., forensics expert)
Disasters (e.g., Boston Marathon bombings 2013)
3. Use Video Retrieval Tool?
(Figure: a content‐based video retrieval tool; query by example image or by text, using content‐based features, returning a ranked list of shots with temporal context.)
[ Heesch, D., Howarth, P., Magalhaes, J., May, A., Pickering, M., Yavlinsky, A., & Rüger, S. (2004, November).
Video retrieval using search and browsing. In TREC Video Retrieval Evaluation Online Proceedings. ]
4. Video Search Scenarios
• Private collection of recorded videos
Many long sequences…
You know there are a few interesting (e.g., funny) clips,
but don’t know where
Want to find them for editing/sharing
• Downloaded a suggested lecture video
In a hurry for an exam…
2 hours duration
Want to quickly check for important information
• Recordings from several surveillance cameras
Quickly look for suspicious activities (e.g., forensics expert)
Disasters (e.g., Boston Marathon bombings 2013)
(Highlighted terms: interesting, important information, suspicious activities)
6. Video Retrieval
Well-known issues
Query by example
Typically no perfect example available.
Query by text
How to describe a desired image by text?
Usability Gap
A picture tells a 1000 words.
by marfis75
How to describe a video clip by text???
8. TRECVID Known-item Search
TRECVID KIS (2010‐2012)
models the situation in which
“someone knows of a video, has seen it before, believes it is
contained in a collection, but doesn’t know where to look”
Automatic Search
Text‐description about the video
Return ranked list of 100 videos (out of 9000)
Interactive Search
Pre‐processing based on text query
Searcher browses through result list (e.g., keyframes of shots)
• Interactively find target video as fast as possible
• Within 5 minutes
9. TRECVID Known-item Search
The Performance of State-of-the-Art Video Retrieval Tools
Known items not found by any team:
Year   Interactive       Automatic          Out of
2010   5 / 24   (21%)    69 / 300   (22%)   15 teams
2011   6 / 25   (24%)    142 / 391  (36%)    9 teams
2012   2 / 24   (17%)    108 / 361  (29%)    9 teams
From: [Alan Smeaton, Paul Over, “Known‐Item Search @ TRECVID 2012”, NIST, 2012]
12. How do Users Browse Today?
In practice most users employ a…
VCRs in the 1970s provided similar functionality!
13. Novice vs. Expert
Novice:
• Mostly interactive search
• Simple‐to‐use
• Inflexible and tedious for archives
• Low performance
Expert:
• Mostly automatic search
• Complicated to use
• Flexible and easier (?) for archives
• Still limited performance
14. Modern Video Browsing
• Combines automatic and interactive search
• Integrates the user in search process
Instead of “query‐and‐browse‐results”
User controls search process
Inspects and interacts
Most meaningful feature for current need
• content navigation, abstract visualization,
ad‐hoc querying or content summarization, …
Klaus Schoeffmann, Frank Hopfgartner, Oge Marques, Laszlo Boeszoermenyi, and Joemon M. Jose, “Video browsing interfaces
and applications: a review“, in SPIE Reviews Journal , Vol. 1, No. 1, pp. 1‐35 (018004), SPIE, Online, March 2010
Exploratory Search
“Will know it when I see it!”
(instead of “telling the system what you want”)
15. Modern Video Browsing
• Interactive inspection/exploration of visual content in
order to satisfy an information need
• Focuses on search and exploration in
(i) single videos as well as (ii) video collections
Directed Search
Find a specific shot or segment in a video
Find a specific video in an archive
Undirected Search
Searching to discover information
E.g., browse through a video in order to
• Learn what the content looks like
• See if it is interesting
(Directed search is supported by video retrieval; undirected search is not.)
18. Improving Content Visualization
aka “Video Surrogates”
VideoTree
[Jansen et al., CBMI 2008]
A similar concept was proposed later [Girgensohn et al., ICMR 2011].
However, it was outperformed by a simple “grid of keyframes” in terms of search time.
23. Visual Seeker Bar with 2 Levels
Allows a user to quickly identify
similar/repeating scenes
[ Schoeffmann, K., & Boeszoermenyi, L. (2009, June). Video browsing using interactive navigation summaries. In
Content‐Based Multimedia Indexing, 2009. CBMI'09. Seventh International Workshop on (pp. 243‐248). IEEE. ]
24. Example: Motion Direction + Intensity
Motion Vector (µ) classification into
K=12 equidistant motion directions
Mapping to Hue channel
[ Schoeffmann, K., Lux, M., Taschwer, M., & Boeszoermenyi, L. (2009, June). Visualization of video motion in context
of video browsing. In Multimedia and Expo, 2009. ICME 2009. IEEE International Conference on (pp. 658‐661). IEEE. ]
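The mapping itself is straightforward to sketch. Below is a minimal illustration (function and parameter names are mine, not from the cited paper): the motion-vector angle is quantized into one of K=12 equidistant direction bins, the bin index is mapped to a hue angle, and the vector magnitude can drive saturation or value to encode intensity.

```python
import numpy as np

def motion_to_hue(mv_x, mv_y, k=12):
    """Quantize a motion vector into one of k equidistant direction
    bins and map the bin to a hue angle; the magnitude encodes the
    motion intensity."""
    angle = np.arctan2(mv_y, mv_x) % (2 * np.pi)  # direction in [0, 2*pi)
    direction_bin = int(angle // (2 * np.pi / k)) % k
    hue = direction_bin * (360.0 / k)             # hue in degrees
    intensity = float(np.hypot(mv_x, mv_y))       # motion strength
    return hue, intensity

# Example: an up-right motion vector falls into the second of 12 bins.
print(motion_to_hue(1.0, 1.0))  # (30.0, 1.4142...)
```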
25. Ad-Hoc Query by Motion Pattern
[ Schoeffmann, K., Lux, M., Taschwer, M., & Boeszoermenyi, L. (2009, June). Visualization of video motion in context
of video browsing. In Multimedia and Expo, 2009. ICME 2009. IEEE International Conference on (pp. 658‐661). IEEE. ]
26. Ad-Hoc Query by Color Layout
Region‐of‐Interest (ROI) Search
User selects spatial region‐of‐interest
On search
Compute Euclidean distance of frame F
to every other frame f (acc. to selected region)
Based on color layout descriptor
(Figure: the user‐selected region (I) of frame F is compared against the same region in every other frame, e.g., d(F,1)=350, d(F,k)=8, d(F,n)=400.)
[ Schoeffmann, K., Taschwer, M., & Boeszoermenyi, L. (2010, February). The video explorer: a tool for navigation and searching within a single
video based on fast content analysis. In Proceedings of the first annual ACM SIGMM conference on Multimedia systems (pp. 247‐258). ACM. ]
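As a rough sketch of this ROI search (the tool uses an MPEG-7-style color layout descriptor; the grid of mean colors below merely approximates it, and all names are illustrative):

```python
import numpy as np

def roi_distances(frames, roi, ref_idx, grid=(4, 4)):
    """Rank all frames by the Euclidean distance between coarse
    color-layout descriptors computed over the user-selected ROI.

    frames  -- ndarray of shape (n, H, W, 3), e.g. one frame per shot
    roi     -- (top, left, height, width) of the selected region
    ref_idx -- index of the reference frame F
    """
    top, left, h, w = roi
    gh, gw = grid

    def descriptor(frame):
        # Mean color per grid cell inside the ROI; a stand-in for a
        # proper color layout descriptor.
        patch = frame[top:top + h, left:left + w].astype(np.float64)
        cells = [patch[i * h // gh:(i + 1) * h // gh,
                       j * w // gw:(j + 1) * w // gw].mean(axis=(0, 1))
                 for i in range(gh) for j in range(gw)]
        return np.concatenate(cells)

    ref = descriptor(frames[ref_idx])
    d = np.array([np.linalg.norm(descriptor(f) - ref) for f in frames])
    return np.argsort(d), d  # ranking and raw distances
```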
27. Ad-Hoc Query by Color Layout
[ Schoeffmann, K., Taschwer, M., & Boeszoermenyi, L. (2010, February). The video explorer: a tool for navigation and searching within a single
video based on fast content analysis. In Proceedings of the first annual ACM SIGMM conference on Multimedia systems (pp. 247‐258). ACM. ]
29. Video Browser for the Digital Native
[ Adams, B., Greenhill, S., & Venkatesh, S. (2012, July). Towards a video browser for the digital native. In
Multimedia and Expo Workshops (ICMEW), 2012 IEEE International Conference on (pp. 127‐132). IEEE. ]
Temporal Semantic Compression
• Compress the content of, e.g., a 1‐hour video to 5 minutes.
• Based on tempo and popularity (see next slide)
Compression based on interestingness
User defines a compression factor (f) that determines the duration of the compressed video
Based on an interest function, k shots are ranked in order of interestingness, satisfying a duration constraint (sketched below)
Shots are presented in their temporal order
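The constraint itself is not reproduced on the slide. Presumably the k top-ranked shots are chosen so that their total duration fits the budget set by the compression factor, along the lines of

\sum_{i=1}^{k} d(s_i) \le f \cdot D

where d(s_i) is the duration of shot s_i and D is the duration of the original video; the exact formulation is given in the cited paper.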
30. Video Browser for the Digital Native
Interestingness
[ Adams, B., Greenhill, S., & Venkatesh, S. (2012, July). Towards a video browser for the digital native. In
Multimedia and Expo Workshops (ICMEW), 2012 IEEE International Conference on (pp. 127‐132). IEEE. ]
Tempo function derived from
motion and audio features
(originally from Greenhill et al.)
Per‐frame and per‐shot popularity based
on information like
YouTube Insights and manual annotations
31. Video Browser for the Digital Native
User study with 8 participants
Configuration elements tested via two tasks
1. Browse a familiar movie to find scenes you remember
2. Browse an unfamiliar movie to get a feel for its story or structure
Questionnaire with Likert‐scale ratings
[ Adams, B., Greenhill, S., & Venkatesh, S. (2012, July). Towards a video browser for the digital native. In
Multimedia and Expo Workshops (ICMEW), 2012 IEEE International Conference on (pp. 127‐132). IEEE. ]
33. Signature-based Video Browser
• Color sketches mapped to feature signatures
• Matched to those of keyframes
[ Kruliš, M., Lokoč, J. and Skopal, T. (2013). Efficient Extraction of Feature Signatures
Using Multi‐GPU Architecture. Springer Berlin Heidelberg, LNCS 7733, pp.446‐456. ]
1. Sampling keypoints
2. Description through location (x,y),
CIE Lab, contrast and entropy of
surrounding pixels
3. K‐means clustering
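A CPU-side sketch of this extraction pipeline (the cited work runs it on multi-GPU hardware; the window size, sample count, and names here are illustrative assumptions):

```python
import numpy as np
from sklearn.cluster import KMeans

def feature_signature(image_lab, n_samples=2000, k=32, seed=0):
    """Sample pixels, describe each by (x, y, L, a, b, contrast,
    entropy) of its neighborhood, then cluster with k-means; the
    weighted centroids form the image's feature signature."""
    rng = np.random.default_rng(seed)
    h, w, _ = image_lab.shape
    ys = rng.integers(0, h, n_samples)
    xs = rng.integers(0, w, n_samples)

    feats = []
    for y, x in zip(ys, xs):
        l_val, a_val, b_val = image_lab[y, x]
        win = image_lab[max(0, y - 3):y + 4, max(0, x - 3):x + 4, 0]
        contrast = win.std()                  # local L-channel spread
        counts, _ = np.histogram(win, bins=16)
        p = counts / counts.sum()
        p = p[p > 0]
        entropy = -(p * np.log2(p)).sum()     # local L-channel entropy
        feats.append([x / w, y / h, l_val, a_val, b_val, contrast, entropy])

    km = KMeans(n_clusters=k, n_init=4, random_state=seed).fit(np.array(feats))
    weights = np.bincount(km.labels_, minlength=k) / n_samples
    return km.cluster_centers_, weights  # centroids + cluster weights
```

Two such signatures (one for the color sketch, one for a keyframe) can then be compared with an adaptive distance such as the Signature Quadratic Form Distance.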
37. Evaluation of Browsing Tools
• User Studies
Reflect real benefit (+)
Unexpected behaviors (+)
Very tedious to do (‐)
Individual data sets (‐)
• User Simulations
Quick procedure (+)
Approximation only (‐)
• Campaigns/Competitions
TRECVID Known‐Item‐Search
Video Browser Showdown
Combine advantages from above
38. Video Browser Showdown (VBS)
• Annual performance evaluation competition
Live evaluation of search performance
Special session at Int. Conference on MultiMedia Modeling (MMM)
• Focus
Known‐item Search tasks
Target clips are presented on site
Teams search in shared data set
Highly interactive search
e.g., text‐queries are not allowed
Should push research on interfaces
and interaction/navigation
Experts and Novices
Easy‐to‐use tools and methods
40. Video Browser Showdown (VBS)
• Scoring through VBS Server
• Score (s) [0‐100] for task i and team k is based on
Solve time (t)
Penalty (p) based on number of submissions (m)
Maximum solve time (Tmax), typically 3 minutes
[ Schoeffmann, K., Ahlström, D., Bailer, W., Cobârzan, C., Hopfgartner, F., McGuinness, K., ... & Weiss, W. (2013). The Video Browser
Showdown: a live evaluation of interactive video search tools. International Journal of Multimedia Information Retrieval, 1‐15. ]
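The slide lists the ingredients but not the formula. As a hedged illustration of how they could combine (the authoritative definition is in the cited IJMIR article, not this sketch):

```python
def vbs_score(t, m, t_max=180.0, solved=True):
    """Illustrative VBS-style score in [0, 100] for one task.

    t      -- solve time in seconds
    m      -- number of submissions made for the task
    t_max  -- maximum solve time, typically 3 minutes
    solved -- whether the correct segment was found in time
    """
    if not solved or t > t_max:
        return 0.0
    time_score = 100.0 - 50.0 * (t / t_max)  # faster solves score higher
    penalty = 1.0 + 0.2 * (m - 1)            # extra submissions cost
    return time_score / penalty
```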
41. VBS 2013 Evaluation
Baseline Study with Novices and a Video Player
• Additional user study (16 participants) for comparison with VBS tools
• Known Item Search Tasks as used for VBS 2013
[ Schoeffmann and Cobarzan, “An Evaluation of Interactive Search with Modern Video Players”, in
Proc. of the 2013 IEEE International Symposium on Multimedia (ISM), Anaheim, CA, USA, 2013 ]
42. VBS 2013: Baseline vs. Experts
Score
[ Schoeffmann, K., Ahlström, D., Bailer, W., Cobârzan, C., Hopfgartner, F., McGuinness, K., ... & Weiss, W. (2013). The Video Browser
Showdown: a live evaluation of interactive video search tools. International Journal of Multimedia Information Retrieval, 1‐15. ]
Avg (Baseline) = 74.8, Avg (VBS) = 71.7
43. VBS 2013: Baseline vs. Experts
Submission Time
Avg (Baseline) = 57.9 s, Avg (VBS) = 40.5 s
45. Conclusions
• Need for interactive/exploratory search
• Video browsing tools
Effective alternative to automatic search tools, support undirected search
Provide reasonable performance, can help to bridge usability gap
Many proposals for single browsing techniques
• But still improvable…
How to even better integrate user into search process?
User knowledge could help to circumvent shortcomings of content analysis
How to better support search behavior of users?
Stronger combination of automatic and interactive search techniques needed!
More research on interface concepts, interaction models, demos, and user studies!
46. Where is the User in Multimedia Retrieval?
IEEE Multimedia Magazine, Oct.‐Dec. 2012, vol. 19, no. 4, pp. 6‐10
Marcel Worring, Paul Sajda, Simone Santini, David Shamma, Alan Smeaton, Qiang Yang
• “In the multimedia retrieval community, the emphasis has moved toward quantitative results to such an extent that the user has moved into the background.”
• “It might be time to rethink what we are doing in the field.”
• “…users often don’t even know what they want from an automatic system….”
• “…user needs and characteristics are dynamic.”
• “It is so much easier to publish papers about improving a standard task than it is to describe a new insight about user intention or a new interface for browsing results.”
47. What About Novice Users?
[ Heesch, D., Howarth, P., Magalhaes, J., May, A., Pickering, M., Yavlinsky, A., & Rüger, S. (2004, November).
Video retrieval using search and browsing. In TREC Video Retrieval Evaluation Online Proceedings. ]
48. Video Browser Showdown 2012
Two examples (of the 11 tools)
Xiangyu Chen, Jin Yuan, Liqiang Nie, Zheng‐Jun Zha, Shuicheng Yan, and Tat‐Seng Chua, "TRECVID 2010
Known‐item Search by NUS", in Proceedings of TRECVID 2010 workshop, NIST, Gaithersburgh, USA, 2011
Jin Yuan, Huanbo Luan, Dejun Hou, Han Zhang, Yan‐Tao Zheng, Zheng‐Jun Zha, and Tat‐Seng Chua, "Video
Browser Showdown by NUS", in Proceedings of the 18th International Conference on Multimedia Modeling
(MMM) 2012, Klagenfurt, Austria, pp. 642‐645
• Keyframe extraction (shots)
• ASR and OCR
• HLF (Concepts)
• RF with Related Samples
• Uniform sampled keyframes (with flexible distance)
• Parallel playback + navigation
Manfred Del Fabro and Laszlo Böszörmenyi, "AAU Video Browser: Non‐
Sequential Hierarchical Video Browsing without Content Analysis", in
Proceedings of the 18th International Conference on Multimedia Modeling
(MMM) 2012, Klagenfurt, Austria, pp. 639‐641
Winner of VBS 2012
50. Mobile Video Browsing
FilmStrip – Improve Visibility [ Hudelist, M. A., Schoeffmann, K., & Boeszoermenyi, L. (2013, April). Mobile
video browsing with a 3D filmstrip. In Proceedings of the 3rd ACM conference on
International Conference on Multimedia Retrieval (pp. 299‐300). ACM. ]