Video Indexing and Retrieval SLIS 5206
Text indexing
  • Began in earnest after the printing press was invented in the 1400s.
  • Scholarly journals began to be published rapidly, and indexing methods were desperately needed.
  • Pre-coordinate indexing
  • Post-coordinate indexing
  • Computerized indexing: KWIC (keyword in context), string searches
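To make KWIC concrete, here is a minimal sketch of a keyword-in-context index over document titles. The function names, the stopword list, and the sample titles are assumptions for illustration, not part of the original slides.

```python
# A minimal KWIC (keyword in context) sketch.
# STOPWORDS and the sample titles are illustrative assumptions.
STOPWORDS = frozenset({"a", "an", "and", "the", "of", "in", "for", "to"})


def kwic(titles, stopwords=STOPWORDS):
    """Return (keyword, left context, right context) entries sorted by keyword."""
    entries = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            key = word.lower().strip(".,;:!?")
            if not key or key in stopwords:
                continue  # skip non-significant words
            left = " ".join(words[:i])   # text before the keyword
            right = " ".join(words[i:])  # keyword plus the text after it
            entries.append((key, left, right))
    entries.sort(key=lambda e: (e[0], e[2].lower()))
    return entries


def format_kwic(entries, width=30):
    """Align every keyword in the same column, as a printed KWIC index does."""
    return [f"{left[-width:]:>{width}}  {right[:width]}" for _, left, right in entries]


sample_titles = ["Indexing of Video", "Video Retrieval"]
for line in format_kwic(kwic(sample_titles)):
    print(line)
```

Each significant word becomes an index entry, and the surrounding words stay visible, so a reader scanning the keyword column can judge relevance from context.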
Still image indexing
  • James Turner, 1997: “Indexing Images, Some Considerations”
  • Images are subject to more than one interpretation; text is not
  • Text can stand alone; images rarely do
Video indexing
  • Video indexing applications: news video, film archives, surveillance, user-generated content, distance learning, video conferencing, medical applications, sports.
  • Video indexing involves “segmentation, analysis and abstraction” of video content (Zhong).
Problems/Goals
  • Growing amounts of video data
  • Video data is difficult to index: it is dynamic, not static. Ex: TV video has 25-30 frames per second.
  • Bibliographic schemes: different manifestations of videos (languages, content added or edited, etc.)
  • Copyright issues: not many videos are in the public domain.
  • Goal = automated semantic indexing; not quite there yet.
Indexing breakdown
  • Sequence -> scene -> shot -> frame -> object
  • Frame = still image
  • Key frame = representative still image
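The breakdown above can be sketched as a nested data structure. The class and field names below (for example `key_frame_number` and `find_frames_with`) are hypothetical, chosen only to mirror the sequence, scene, shot, frame, and object levels.

```python
# A sketch of the sequence -> scene -> shot -> frame -> object hierarchy.
# All class and field names are illustrative assumptions.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Frame:
    number: int                                       # position of the still image in the video
    objects: List[str] = field(default_factory=list)  # labels of objects seen in the frame


@dataclass
class Shot:
    frames: List[Frame]
    key_frame_number: int  # the representative still image for this shot


@dataclass
class Scene:
    shots: List[Shot]


@dataclass
class Sequence:
    scenes: List[Scene]


def find_frames_with(sequence: Sequence, label: str) -> List[int]:
    """Walk the hierarchy and return the numbers of frames containing `label`."""
    return [
        frame.number
        for scene in sequence.scenes
        for shot in scene.shots
        for frame in shot.frames
        if label in frame.objects
    ]


shot_a = Shot(frames=[Frame(0, ["boat"]), Frame(1, ["boat", "pier"])], key_frame_number=0)
shot_b = Shot(frames=[Frame(2, ["pier"])], key_frame_number=2)
video = Sequence(scenes=[Scene(shots=[shot_a, shot_b])])
print(find_frames_with(video, "boat"))
```

Keeping the levels explicit lets an index answer queries at whichever granularity a user needs, from a whole sequence down to a single object in a frame.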
Types of indexing: low-level
  • Based on color histograms, motion, and object detection and tracking.
  • Focuses on appearance
  • Sequence -> scene -> shot
  • Sequence = group of scenes, similar to pages or chapters in a text document.
  • Scene = group of shots, similar to a paragraph in a text document.
  • Shot = “single series of actions with one camera.” The basic unit of indexing, similar to a word in a text document.
Types of indexing: high-level
  • High-level indexing focuses on the content of the video, rather than its appearance.
  • Semantic gap: the difference between the description of content and how the user perceives the content. Ex: content described by indexer as “boat”; user perceives it as “cruise”.
  • Entities mentioned but not seen must still be indexed. Ex: newsreel of Bob Hope making a joke about Marilyn Monroe (she may not actually appear in the footage).
Scene cut detection algorithms (Zhong)
  1. Divide video streams into units (such as shots)
  2. Select representative or KEY frame
  3. Describe colors and shapes for indexing
  Note: Temporal information is not included (it must be separately annotated along with metadata)
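The three steps above can be illustrated with a small sketch. It uses histogram differencing between consecutive frames, a common cut-detection technique that is swapped in here for illustration and is not claimed to be Zhong's method. Frames are assumed to be non-empty lists of grayscale pixel values (0-255). The threshold, the bin count, and the choice of the middle frame as key frame are all assumptions.

```python
# Sketch: shot segmentation, key-frame selection, and a color descriptor.
# Frames are non-empty lists of grayscale pixel values (0-255); the threshold,
# bin count, and "middle frame" key-frame rule are illustrative assumptions.

def histogram(frame, bins=4, max_value=256):
    """Normalized intensity histogram of one frame."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // max_value] += 1
    return [c / len(frame) for c in counts]


def hist_diff(h1, h2):
    """Half the L1 distance between two histograms: 0 = identical, 1 = disjoint."""
    return sum(abs(a - b) for a, b in zip(h1, h2)) / 2


def detect_shots(frames, threshold=0.5):
    """Step 1: split frames into shots wherever the histogram changes sharply."""
    if not frames:
        return []
    shots, start = [], 0
    previous = histogram(frames[0])
    for i in range(1, len(frames)):
        current = histogram(frames[i])
        if hist_diff(previous, current) > threshold:
            shots.append((start, i - 1))  # close the shot at the cut
            start = i
        previous = current
    shots.append((start, len(frames) - 1))
    return shots


def key_frame_index(shot):
    """Step 2: pick the middle frame of the shot as its key frame."""
    first, last = shot
    return (first + last) // 2


dark, bright = [10] * 16, [250] * 16
frames = [dark, dark, dark, bright, bright]
for shot in detect_shots(frames):
    k = key_frame_index(shot)
    # Step 3: the key frame's histogram serves as the shot's color descriptor.
    print(shot, k, histogram(frames[k]))
```

As the note on the slide says, this descriptor carries no temporal information: it records what a key frame looks like, not what happens over the course of the shot.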
Metadata
  • Information that describes a resource; aids in classification
  • Metadata standards used for video indexing: FIAF, EAD, SMIL, MPEG-7, Dublin Core, XML, RDF, MPEG-21
Metadata standards, cont’d
  • FIAF: International Federation of Film Archives cataloging rules
  • RDF: Resource Description Framework; uses subject-predicate-object descriptions
  • XML: eXtensible Markup Language; used for sharing data across different information systems
  • SMIL: Synchronized Multimedia Integration Language; an XML markup language for describing multimedia presentations
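As an illustration of RDF's subject-predicate-object model, the sketch below stores triples as plain Python tuples rather than using an RDF library. The `http://example.org/` video URI and all property values are hypothetical; the `dc` prefix is the Dublin Core element namespace.

```python
# Sketch: RDF-style subject-predicate-object statements as plain tuples.
# The example.org URI and the literal values are illustrative assumptions.
DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace
VIDEO = "http://example.org/video/42"    # hypothetical resource identifier

triples = [
    (VIDEO, DC + "title", "Evening News"),
    (VIDEO, DC + "language", "en"),
    (VIDEO, DC + "subject", "politics"),
    (VIDEO, DC + "subject", "weather"),
]


def objects(triples, subject, predicate):
    """Return every object asserted for a given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


print(objects(triples, VIDEO, DC + "subject"))
```

Because each statement is independent, new descriptions of the same video can be added without changing a fixed record layout.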
Metadata standards, cont’d
  • MPEG-7: Multimedia Content Description Interface (Wiley 2002)
Metadata standards, cont’d
  • Dublin Core (fifteen elements, many qualifiers): can be applied to video indexing
  • Encoded Archival Description (EAD): used for film archives; can be mapped to Dublin Core
  • MPEG-7: multimedia content description standard
  • MPEG-21: multimedia framework that includes a Rights Expression Language; designed to discourage illegal file-sharing
Ex: Dublin Core and Video Indexing
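A Dublin Core description of a video might look like the following sketch, built with Python's standard `xml.etree.ElementTree`. The `record` wrapper element and every field value are hypothetical. Only the element names (title, creator, date, and so on) and the namespace come from the Dublin Core element set.

```python
# Sketch: a Dublin Core record for a video, serialized as XML.
# The "record" wrapper and all field values are illustrative assumptions.
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)  # serialize elements with the "dc:" prefix


def dublin_core_record(fields):
    """Build a record whose children are Dublin Core elements."""
    record = ET.Element("record")
    for name, value in fields.items():
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


record = dublin_core_record({
    "title": "Sample newsreel",
    "creator": "Example Studio",
    "date": "1952",
    "type": "MovingImage",
    "format": "video/mpeg",
    "language": "en",
    "rights": "Public domain",
})
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

The same fifteen elements describe a book or a film, which is why Dublin Core can be applied to video without a video-specific schema, at the cost of little shot-level detail.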
Retrieval
  • Granularity: how do users want to retrieve materials? E.g., segments, scenes, entire video?
  • Purpose: news, entertainment, business, security?
  • User expertise: technical users vs. general consumers
  • Database type
Ex: Content-based video retrieval and indexing model (Zhong, 2001)
Real-life example: News archives
  • Special issues: news organizations use large amounts of video daily and must archive it immediately.
  • “On the fly” bibliographic schemes and indexing methods. Ex: CNN uses its own cataloging scheme.
Online film archives
  • Internet Archive: http://www.archive.org/details/movies
  • British Film Institute: http://www.bfi.org.uk/see.html
  • Google Video (now includes film from National Archives): http://www.bfi.org.uk/see.html and http://video.google.com/nara.html
  • YouTube: http://www.youtube.com
  • International News Archives: http://www.ibiblio.org/slanews/internet/intarchives.htm
  • Blinkx (over 7 million hours of video): http://www.blinkx.com
Current and Future Trends: User-generated content
  • The general public can upload and access videos through sites such as YouTube.
  • Retrieval is imprecise: often based on keywords, date and relevance.
  • Other resources: vlogs, or video blogs.
  • User-generated content and demand for quick and precise access to content will continue to grow.
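To show why keyword-based retrieval is imprecise, here is a minimal sketch that ranks videos by the number of query words matching their title or tags, breaking ties by upload date. The catalog entries and the scoring rule are assumptions for illustration, not a description of how any real site ranks results.

```python
# Sketch: keyword retrieval ranked by match count, then by upload date.
# The catalog and the scoring rule are illustrative assumptions.
from datetime import date

catalog = [
    {"title": "Cat plays piano", "tags": ["cat", "music"], "uploaded": date(2007, 1, 5)},
    {"title": "Piano lesson", "tags": ["music"], "uploaded": date(2007, 3, 1)},
    {"title": "Boat trip", "tags": ["cruise"], "uploaded": date(2007, 2, 10)},
]


def search(videos, query):
    """Return titles ordered by number of matching keywords, newest first on ties."""
    terms = set(query.lower().split())
    results = []
    for video in videos:
        words = set(video["title"].lower().split()) | {t.lower() for t in video["tags"]}
        score = len(terms & words)
        if score:
            results.append((score, video["uploaded"], video["title"]))
    results.sort(reverse=True)
    return [title for _, _, title in results]


print(search(catalog, "cat piano"))
print(search(catalog, "sailing"))  # finds nothing: no uploader used that word
```

The second query illustrates the semantic gap: the boat video is relevant to a user thinking of sailing, but a pure keyword match cannot find it unless someone supplied that exact word.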
