Metadata for Motion Pictures: Media Streams
Slide Notes
  • Researchers and developers in multimedia and interactive television are in a state of denial: without content representation there is no scaling and no control at multiple levels of granularity
  • Clip-based representation: fixes a segmentation of the video stream; separates the clip from its context of origin; encodes only one particular segmentation of the original data
  • Stream-based representation: the stream of frames is left intact; the stream has many possible segmentations via multi-layered annotations with precise time indexes (and the intersections, unions, etc. of these annotations)
  • Editing: resequencing of shots. Tell the Kuleshov example (face / bowl of soup, face / coffin, face / field of flowers). Tell the greeting/agreeing meeting example.
  • (SHOW VIDEO of MTL.) Content VCR; visualize video structure (Videogram); browse at different time scales; combine automatic and human annotation.
  • Generalization Hierarchy of Iconic Descriptors
  • Show categories. Show VIDEO of IP. Explain glommed icons (icon sentences).
  • (SHOW VIDEO of DW -- stop before Icon Title Editor. Discuss structure: horizontal, vertical; and actual vs. inferable time/space. STOP after first icon is made.)
  • Transcript

    • 1. Prof. Ray Larson & Prof. Marc Davis, UC Berkeley SIMS. Tuesday and Thursday, 10:30 am - 12:00 pm, Fall 2002. http://www.sims.berkeley.edu/academics/courses/is202/f02/ Lecture 08: Media Streams. SIMS 202: Information Organization and Retrieval
    • 2. Lecture 08: Media Streams
      • Problem Setting
      • Current Approaches
      • Representing Media
      • New Solutions
      • Methodological Considerations
      • Future Work
    • 3. Lecture 08: Media Streams
      • Problem Setting
      • Current Approaches
      • Representing Media
      • New Solutions
      • Methodological Considerations
      • Future Work
    • 4. What is the Problem?
      • Today people cannot easily create, find, edit, share, and reuse media
      • Computers don’t understand media content
        • Media is opaque and data rich
        • We lack structured representations
      • Without content representation (metadata), manipulating digital media will remain like word-processing with bitmaps
    • 5. Lecture 08: Media Streams
      • Problem Setting
      • Current Approaches
      • Representing Media
      • New Solutions
      • Methodological Considerations
      • Future Work
    • 6. The Search for Solutions
      • Current approaches to creating metadata don’t work
        • Signal-based analysis
        • Keywords
        • Natural language
      • Need standardized metadata framework
        • Designed for video and rich media data
        • Human and machine readable and writable
        • Standardized and scalable
        • Integrated into media capture, archiving, editing, distribution, and reuse
    • 7. Signal-Based Parsing
      • Practical problem
        • Parsing unstructured, unknown video is very, very hard
      • Theoretical problem
        • Mismatch between percepts and concepts
    • 8. Why Keywords Don’t Work
      • Are not a semantic representation
      • Do not describe relations between descriptors
      • Do not describe temporal structure
      • Do not converge
      • Do not scale
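The "not a semantic representation" and "no relations between descriptors" points can be made concrete with a small sketch (the scenes and descriptor names here are invented for illustration): two scenes that differ only in who does what to whom collapse into the same keyword set, while a relational annotation keeps them distinct.

```python
# Hypothetical sketch: flat keywords lose relational structure.
# "dog bites man" and "man bites dog" yield identical keyword bags,
# but distinct (agent, action, patient) triples.

# Keyword annotation: an unordered bag of terms
scene_a_keywords = {"dog", "man", "bites"}
scene_b_keywords = {"man", "dog", "bites"}  # same bag for the opposite event

# Relational annotation: (agent, action, patient) triples
scene_a_relations = [("dog", "bites", "man")]
scene_b_relations = [("man", "bites", "dog")]

print(scene_a_keywords == scene_b_keywords)    # True: keywords cannot tell the scenes apart
print(scene_a_relations == scene_b_relations)  # False: relations preserve who-does-what-to-whom
```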
    • 9. Natural Language vs. Visual Language Jack, an adult male police officer, while walking to the left, starts waving with his left arm, and then has a puzzled look on his face as he turns his head to the right; he then drops his facial expression and stops turning his head, immediately looks up, and then stops looking up after he stops waving but before he stops walking.
    • 10. Natural Language vs. Visual Language Jack, an adult male police officer, while walking to the left, starts waving with his left arm, and then has a puzzled look on his face as he turns his head to the right; he then drops his facial expression and stops turning his head, immediately looks up, and then stops looking up after he stops waving but before he stops walking.
    • 11. Notation for Time-Based Media: Music
    • 12. Visual Language Advantages
      • A language designed as an accurate and readable representation of time-based media
        • For video, especially important for actions, expressions, and spatial relations
      • Enables Gestalt view and quick recognition of descriptors due to designed visual similarities
      • Supports global use of annotations
    • 13. Lecture 08: Media Streams
      • Problem Setting
      • Current Approaches
      • Representing Media
      • New Solutions
      • Methodological Considerations
      • Future Work
    • 14. Representing Video
      • Streams vs. Clips
      • Video syntax and semantics
      • Ontological issues in video representation
    • 15. Video is Temporal
    • 16. Streams vs. Clips
    • 17. Stream-Based Representation
      • Makes annotation pay off
        • The richer the annotation, the more numerous the possible segmentations of the video stream
      • Clips
        • Change from being fixed segmentations of the video stream, to being the results of retrieval queries based on annotations of the video stream
      • Annotations
        • Create representations which make clips, not representations of clips
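A minimal sketch of the idea above (the class and function names are illustrative, not the actual Media Streams API): annotations are time-indexed layers over an intact frame stream, and a "clip" is computed as the result of a query, or as the intersection of annotation layers, rather than stored as a fixed segment.

```python
# Stream-based representation sketch: annotations make clips;
# clips are not stored, they are retrieved.

from dataclasses import dataclass

@dataclass(frozen=True)
class Annotation:
    layer: str        # e.g. "character", "action"
    descriptor: str   # e.g. "Jack", "waving"
    start: float      # seconds into the stream
    end: float

def intersect(a, b):
    """Overlap of two annotated intervals, or None if they are disjoint."""
    start, end = max(a.start, b.start), min(a.end, b.end)
    return (start, end) if start < end else None

def query(annotations, layer, descriptor):
    """A clip as the result of a retrieval query over annotations."""
    return [(a.start, a.end) for a in annotations
            if a.layer == layer and a.descriptor == descriptor]

stream = [
    Annotation("character", "Jack", 0.0, 12.0),
    Annotation("action", "waving", 4.0, 9.0),
    Annotation("action", "walking", 2.0, 11.0),
]

# "Jack waving" as the intersection of two annotation layers
print(intersect(stream[0], stream[1]))      # (4.0, 9.0)
print(query(stream, "action", "walking"))   # [(2.0, 11.0)]
```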
    • 18. Video Syntax and Semantics
      • The Kuleshov Effect
      • Video has a dual semantics
        • Sequence-independent invariant semantics of shots
        • Sequence-dependent variable semantics of shots
    • 19. Ontological Issues for Video
      • Video plays with rules for identity and continuity
        • Space
        • Time
        • Character
        • Action
    • 20. Space and Time: Actual vs. Inferable
      • Actual Recorded Space and Time
        • GPS
        • Studio space and time
      • Inferable Space and Time
        • Establishing shots
        • Cues and clues
    • 21. Time: Temporal Durations
      • Story (Fabula) Duration
        • Example: Brushing teeth in story world (5 minutes)
      • Plot (Syuzhet) Duration
        • Example: Brushing teeth in plot world (1 minute: 6 steps of 10 seconds each)
      • Screen Duration
        • Example: Brushing teeth (10 seconds: 2 shots of 5 seconds each)
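The three durations above reduce to simple arithmetic; this sketch just restates the slide's tooth-brushing numbers in code form.

```python
# Story (fabula), plot (syuzhet), and screen durations of the
# tooth-brushing example, in seconds.

story_duration = 5 * 60          # 5 minutes in the story world
plot_steps = [10] * 6            # the plot presents 6 steps of 10 seconds each
screen_shots = [5, 5]            # the screen shows 2 shots of 5 seconds each

plot_duration = sum(plot_steps)      # 60
screen_duration = sum(screen_shots)  # 10

print(story_duration, plot_duration, screen_duration)  # 300 60 10
```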
    • 22. Character and Continuity
      • Identity of character is constructed through
        • Continuity of actor
        • Continuity of role
      • Alternative continuities
        • Continuity of actor only
        • Continuity of role only
    • 23. Representing Action
      • Physically-based description for sequence-independent action semantics
        • Abstract vs. conventionalized descriptions
        • Temporally and spatially decomposable actions and subactions
      • Issues in describing sequence-dependent action semantics
        • Mental states (emotions vs. expressions)
        • Cultural differences (e.g., bowing vs. greeting)
    • 24. “Cinematic” Actions
      • Cinematic actions support the basic narrative structure of cinema
        • Reactions/Proactions
          • Nodding, screaming, laughing, etc.
        • Focus of Attention
          • Gazing, headturning, pointing, etc.
        • Locomotion
          • Walking, running, etc.
      • Cinematic actions can occur
          • Within the frame/shot boundary
          • Across the frame boundary
          • Across shot boundaries
    • 25. Lecture 08: Media Streams
      • Problem Setting
      • Current Approaches
      • Representing Media
      • New Solutions
      • Methodological Considerations
      • Future Work
    • 26. New Solutions for Creating Metadata: After Capture; During Capture
    • 27. After Capture: Media Streams
    • 28. Media Streams Features
      • Key features
        • Stream-based representation (better segmentation)
        • Semantic indexing (what things are similar to)
        • Relational indexing (who is doing what to whom)
        • Temporal indexing (when things happen)
        • Iconic interface (designed visual language)
        • Universal annotation (standardized markup schema)
      • Key benefits
        • More accurate annotation and retrieval
        • Global usability and standardization
        • Reuse of rich media according to content and structure
    • 29. Media Streams GUI Components
      • Media Time Line
      • Icon Space
        • Icon Workshop
        • Icon Palette
    • 30. Media Time Line
      • Visualize video at multiple time scales
      • Write and read multi-layered iconic annotations
      • One interface for annotation, query, and composition
    • 31. Media Time Line
    • 32. Icon Space
      • Icon Workshop
        • Utilize categories of video representation
        • Create iconic descriptors by compounding iconic primitives
        • Extend set of iconic descriptors
      • Icon Palette
        • Dynamically group related sets of iconic descriptors
        • Reuse descriptive effort of others
        • View and use query results
    • 33. Icon Space
    • 34. Icon Space: Icon Workshop
      • General to specific (horizontal)
        • Cascading hierarchy of icons with increasing specificity on subordinate levels
      • Combinatorial (vertical)
        • Compounding of hierarchically organized icons across multiple axes of description
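The two axes of the Icon Workshop can be sketched as follows (the hierarchy entries and the `compound` helper are invented for illustration, not the actual icon set): the horizontal axis walks a cascading general-to-specific hierarchy, and the vertical axis compounds primitives from several axes of description into one compound descriptor.

```python
# Icon Workshop sketch: hierarchical refinement plus combinatorial compounding.

# General to specific (horizontal): each descriptor refines its parent
HIERARCHY = {
    "person": None,
    "adult": "person",
    "adult-male": "adult",
    "police-officer": "adult-male",
}

def ancestors(descriptor):
    """Walk up the cascading hierarchy toward more general descriptors."""
    chain = []
    while descriptor is not None:
        chain.append(descriptor)
        descriptor = HIERARCHY.get(descriptor)
    return chain

# Combinatorial (vertical): compound primitives across axes of description
def compound(**axes):
    return tuple(sorted(axes.items()))

icon = compound(character="police-officer", action="waving", direction="left")
print(ancestors("police-officer"))  # ['police-officer', 'adult-male', 'adult', 'person']
print(icon)
```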
    • 35. Icon Space: Icon Workshop Detail
    • 36. Icon Space: Icon Palette
      • Dynamically group related sets of iconic descriptors
      • Collect icon sentences
      • Reuse descriptive effort of others
    • 37. Icon Space: Icon Palette Detail
    • 38. Video Retrieval In Media Streams
      • Same interface for annotation and retrieval
      • Assembles responses to queries as well as finds them
      • Query responses use semantics to degrade gracefully
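"Degrade gracefully" can be illustrated with a small sketch (the hierarchy, shot database, and `retrieve` function are invented, not the actual Media Streams retrieval code): when no shot matches the exact descriptor, the query falls back to its more general ancestors instead of returning nothing.

```python
# Graceful degradation sketch: exact match first, then semantic generalization.

HIERARCHY = {"police-officer": "adult-male", "adult-male": "adult",
             "adult": "person", "person": None}

SHOTS = {"shot-17": {"adult-male"}, "shot-42": {"person"}}

def retrieve(descriptor):
    """Climb the hierarchy until some shot matches; report the level that did."""
    while descriptor is not None:
        hits = [s for s, tags in sorted(SHOTS.items()) if descriptor in tags]
        if hits:
            return descriptor, hits
        descriptor = HIERARCHY[descriptor]
    return None, []

# No shot is annotated "police-officer", so the query degrades to "adult-male"
print(retrieve("police-officer"))  # ('adult-male', ['shot-17'])
```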
    • 39. Media Streams Technologies
      • Minimal video representation distinguishing syntax and semantics
      • Iconic visual language for annotating and retrieving video content
      • Retrieval-by-composition methods for repurposing video
    • 40. New Solutions for Creating Metadata: After Capture; During Capture
    • 41. Creating Metadata During Capture
      • Current capture paradigm: multiple captures to get 1 good capture
      • New capture paradigm: 1 good capture drives multiple uses
    • 42. Active Capture
      • Active engagement and communication among the capture device, agent(s), and the environment
      • Re-envision capture as a control system with feedback
      • Use multiple data sources and communication to simplify the capture scenario
      • Use HCI to support “human-in-the-loop” algorithms for computer vision and audition
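The control-system framing can be sketched as a toy loop (the sensors, directions, and `active_capture` function are invented for illustration): the device observes the scene, compares it with the capture goal, and issues feedback directions to the human agent until the shot succeeds or the take budget runs out.

```python
# Active Capture sketch: capture as a control system with a
# "human-in-the-loop" feedback path.

def active_capture(observe, direct, goal, max_takes=5):
    """Loop: observe the scene, compare with the goal, direct the agent."""
    for take in range(1, max_takes + 1):
        observation = observe()
        if observation == goal:
            return take              # one good capture
        direct(goal, observation)    # feedback to the human in the loop
    return None                      # error handling: give up after max_takes

# Simulated agent who only faces the camera on the third take
states = iter(["looking-away", "looking-down", "facing-camera"])
takes = active_capture(
    observe=lambda: next(states),
    direct=lambda goal, seen: print(f"Please change from {seen} to {goal}"),
    goal="facing-camera",
)
print(takes)  # 3
```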
    • 43. Active Capture [diagram: Active Capture at the intersection of Computer Vision (processing), HCI (interaction), and Direction/Cinematography (capture)]
    • 44. Automated Capture: Good Capture
    • 45. Automated Capture: Error Handling
    • 46. Evolution of Media Production
      • Customized production
        • Skilled creation of one media product
      • Mass production
        • Automatic replication of one media product
      • Mass customization
        • Skilled creation of adaptive media templates
        • Automatic production of customized media
    • 47. Central Idea: Movies as Programs
      • Movies change from being static data to programs
      • Shots are inputs to a program that computes new media based on content representation and functional dependency (US Patents 6,243,087 & 5,969,716)
      • [diagram: Media feeds a Parser producing a Content Representation; a Producer combines representations into new Media]
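"Movies as programs" can be sketched as a template function (the shot annotations and `ad_template` helper are invented for illustration; the cited patents describe the real mechanism): shots are inputs, and the program computes a new sequence from their content representation rather than from fixed edit points.

```python
# Movies-as-programs sketch: the edit is computed from metadata,
# so the same template yields a different cut for different inputs.

SHOTS = [
    {"id": "a", "content": {"person": "Jim", "action": "greeting"}},
    {"id": "b", "content": {"person": "Jim", "action": "talking"}},
    {"id": "c", "content": {"person": "Ana", "action": "greeting"}},
]

def ad_template(shots, person):
    """A movie as a program: this person's greeting shot, then their talking shot."""
    def find(action):
        return next(s["id"] for s in shots
                    if s["content"] == {"person": person, "action": action})
    return [find("greeting"), find("talking")]

print(ad_template(SHOTS, "Jim"))  # ['a', 'b']
```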
    • 48. Jim Lanahan in an MCI Ad
    • 49. Jim Lanahan in an @Home Banner
    • 50. Automated Media Production Process
      • 1. Automated Capture
      • 2. Annotation and Retrieval: annotation of media assets; reusable online asset database; asset retrieval and reuse
      • 3. Automatic Editing: adaptive media engine
      • 4. Personalized Delivery: web integration and streaming media services (Flash Generator, WAP, HTML, email, print/physical media)
    • 51. Proposed Technology Architecture
      • Media processing: DB, Analysis Engine, Interaction Engine, Adaptive Media Engine, Annotation and Retrieval Engine (MPEG-7), Delivery Engine
      • OS: media capture, file, AV out, network, device control
    • 52. Lecture 08: Media Streams
      • Problem Setting
      • Current Approaches
      • Representing Media
      • New Solutions
      • Methodological Considerations
      • Future Work
    • 53. Non-Technical Challenges
      • Standardization of media metadata (MPEG-7)
      • Broadband infrastructure and deployment
      • Intellectual property and economic models for sharing and reuse of media assets
    • 54. Technical Research Challenges
      • Develop end-to-end metadata system for automated media capture, processing, management, and reuse
      • Creating metadata
        • Represent action sequences and higher level narrative structures
        • Integrate legacy metadata (keywords, natural language)
        • Gather more and better metadata at the point of capture (develop metadata cameras)
        • Develop “human-in-the-loop” indexing algorithms and interfaces
      • Using metadata
        • Develop media components (MediaLego)
        • Integrate linguistic and other query interfaces
    • 55. For More Info
      • Marc Davis Web Site
        • www.sims.berkeley.edu/~marc
      • Spring 2003 course on “Multimedia Information” at SIMS
      • URAP and GSR positions
      • TidalWave II “New Media” program
    • 56. Next Time
      • Metadata for Motion Pictures: MPEG-7 (MED)
      • Readings for next time (in Protected)
        • “MPEG-7: The Generic Multimedia Content Description Interface, Part 1” (J. M. Martinez, R. Koenen, F. Pereira)
        • “MPEG-7: Overview of MPEG-7 Description Tools, Part 2” (J. Martinez)
    • 57. Homework (!)
      • Assignment 4: Revision of Photo Metadata Design and Project Presentation
        • Due by Monday, September 23
          • Completed (Revised) Photo Classifications and Annotated Photos
            • [groupname]_classification.xls file
            • [groupname]_photos.xls file
        • Due by Thursday, September 26
          • Group Presentation
            • 2 minutes: Presentation of application idea
            • 6 minutes: Presentation of classification and photo browser
            • 2 minutes: residual time for completing explanations and Q&A
          • Photo Browser Page (will be sent to you)
