Advances in Semantic Analysis
       of Multimedia




Dr. Gerald Friedland
International Computer Science Institute
Berke...
The Internet Today




Internet Use Today




Raphaël Troncy: Linked Media: Weaving non-textual content into the Semantic Web, MozCamp, 03/2009.
...
Types of Videos




Addressable Market for
                          Enterprise Video Applications




          Security               Asset ...
Multimedia Capabilities:
       1985


• Record
• Store
• Play
• Random Seek
• Annotate Manually


                       ...
Multimedia Capabilities:
       2009

• Record
• Store
• Stream
• Play
• Random Seek
• Annotate Manually

                ...
Multimedia Capabilities:
      Wanted
       • Semantic Navigation
       • Search
       • Content Compare
       • Objec...
Problems


• Multimedia data is very dense; manual
  annotation is not feasible
• Multimedia content analysis is
  difficult and rare...
My Research...
         Network                     Knowledge

     Semantic Web



         Context                    Un...
My Research...


Hypotheses:
• Multimedia content analysis works
  better when every cue is taken into
  account (e.g. vide...
Context
Sources of Context:
• Inclusion of prior knowledge
• Combination of algorithms
• Multimodality:
  – audio+video+.....
Context as Key:
 Example 1



      →   Cut          Horse    →
          Paste   ^V   Meadow




Visual Object Extraction...
Simple Interactive
        Object Extraction (SIOX)


           →                   →




Image          User Input      ...
SIOX: Algorithm Idea
                   Color Signatures from image retrieval:




Y. Rubner, C. Tomasi, and L. J. Guibas:...
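
The idea of using color signatures to search inside an image can be sketched roughly as follows. This is a minimal illustration, not the SIOX implementation: it assumes plain RGB tuples and a tiny k-means as a stand-in for the signature construction, and the names `color_signature` and `extract_foreground` are hypothetical. Each pixel is labeled foreground if it lies closer to a foreground signature color than to any background signature color.

```python
from math import dist


def color_signature(pixels, k=2, iters=10):
    """Summarize a set of color tuples by at most k representative colors.

    Assumption: plain k-means with a deterministic start stands in for
    the signature construction described in the cited paper.
    """
    ordered = sorted(set(pixels))
    k = min(k, len(ordered))
    # Deterministic start: evenly spaced picks from the sorted unique colors.
    centers = [ordered[i * (len(ordered) - 1) // max(k - 1, 1)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in pixels:
            nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            groups[nearest].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [
            tuple(sum(channel) / len(g) for channel in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers


def extract_foreground(image, fg_samples, bg_samples, k=2):
    """Return a boolean mask: True where a pixel is closer to the
    foreground signature than to the background signature."""
    fg_sig = color_signature(fg_samples, k)
    bg_sig = color_signature(bg_samples, k)
    return [
        [min(dist(p, c) for c in fg_sig) < min(dist(p, c) for c in bg_sig) for p in row]
        for row in image
    ]


# Toy 2x2 image: reddish "object" pixels on a greenish background.
image = [[(200, 30, 30), (20, 180, 20)],
         [(30, 170, 40), (210, 40, 35)]]
fg_samples = [(205, 35, 30), (195, 25, 40)]   # user-marked foreground colors
bg_samples = [(25, 175, 30), (35, 185, 25)]   # user-marked background colors
mask = extract_foreground(image, fg_samples, bg_samples)
# mask == [[True, False], [False, True]]
```

The user-marked samples play the role of the context delivered by human interaction; a practical tool would also need a perceptual color space and cleanup of the resulting mask, both omitted here.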
SIOX in GIMP
             SIOX
            Button




G. Friedland, K. Jantz, T. Lenz, F. Wiesel, R. Rojas: “Object Cut an...
SIOX in Inkscape




SIOX in Blender




Extensions
Extracting multiple similar
objects at once:




          →




Sub-Pixel Refinement
      Problem: Spill colors and foreground
      disappearance



           →



Original          SI...
Sub-Pixel Refinement
Detail Refinement Brush:
Coarse Interaction



                    →




                    →




    ...
VideoSIOX

1st Frame:




Subsequent
Frames:


More Information



 http://www.siox.org




Shoesurfer




Shoesurfer




Shoesurfer




Shoesurfer




Shoesurfer




Context as Key:
Example 2




Speaker Diarization: Who
            Spoke When?
            Audiotrack:


             Segmentation:




             Clu...
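
The two stages named on this slide, segmentation and clustering, can be illustrated with a toy sketch. It is not the system from the cited work: it assumes a single number per audio frame (for example a pitch-like value) and fixed thresholds, and the names `split_at_changes` and `cluster_segments` are hypothetical. Segmentation cuts the track where the feature jumps; clustering then merges segments with similar averages so that each cluster stands for one speaker.

```python
def split_at_changes(frames, jump=1.0):
    """Segmentation: cut wherever consecutive frame values differ by more than `jump`."""
    if not frames:
        return []
    bounds = [0]
    for i in range(1, len(frames)):
        if abs(frames[i] - frames[i - 1]) > jump:
            bounds.append(i)
    bounds.append(len(frames))
    return list(zip(bounds, bounds[1:]))  # half-open (start, end) frame ranges


def cluster_segments(frames, segments, merge_below=1.0):
    """Clustering: greedily merge the two closest clusters (by mean feature
    value) until the closest pair is at least `merge_below` apart."""
    clusters = [[seg] for seg in segments]

    def mean(cluster):
        values = [frames[i] for start, end in cluster for i in range(start, end)]
        return sum(values) / len(values)

    while len(clusters) > 1:
        gap, i, j = min(
            (abs(mean(a) - mean(b)), i, j)
            for i, a in enumerate(clusters)
            for j, b in enumerate(clusters)
            if i < j
        )
        if gap >= merge_below:
            break
        clusters[i] += clusters.pop(j)  # j > i, so index i stays valid

    speaker_of = {seg: spk for spk, cluster in enumerate(clusters) for seg in cluster}
    return [(start, end, speaker_of[(start, end)]) for start, end in segments]


# Toy track: two "speakers" with feature values near 1 and near 5.
frames = [1.0, 1.1, 0.9, 5.0, 5.2, 1.0, 1.2, 5.1]
segments = split_at_changes(frames)
who_spoke_when = cluster_segments(frames, segments)
# who_spoke_when == [(0, 3, 0), (3, 5, 1), (5, 7, 0), (7, 8, 1)]
```

The output answers "who spoke when" as (start frame, end frame, speaker index) triples. The number of speakers is not given in advance; it falls out of the merge threshold, which is an assumption of this sketch.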
Analyzing Meetings




Dominance Estimation
I Know You...



http://www.icsi.berkeley.edu/~fractor/ioda_demo.avi




Narrative Theme Navigation




G. Friedland, L. Gottlieb, A. Janin: “Joke-o-mat: Browsing Sitcoms Punchline by
Punchline”,...
Joke-O-Mat: Demo




http://www.youtube.com/watch?v=1qfa84Ulm5s




Connecting Multimedia
and Semantic Technologies
   GStreamer

     Appscio
                   User
       Device   Compone...
Semantic Media
Framework
   Pipeline Framework
                                    Integrated
      C/C++/Java            ...
Semantic Analysis of
Multimedia Data
• enables automatic logical
  inference on perceptually
  encoded data
• enables more...
A note...




            James A. Hendler


MySTT



 Open-Source, open-model,
 state-of-the-art speech
 recognizer for multiparty
 conversations.

 Release Date: Feb...
4th IEEE International
  Conference on Semantic
  Computing 2010




Paper Deadline: May 3rd, 2010
                       ...
Upcoming...




Thank You!
Questions?
Contact:
Dr. Gerald Friedland
International Computer Science Institute
Berkeley, CA
http://www.geral...
Semantics And Multimedia

This is Gerald Friedland's presentation for SVST's Multi-Media and the Semantic Web.

Published in: Technology, Education

  1. Advances in Semantic Analysis of Multimedia Dr. Gerald Friedland International Computer Science Institute Berkeley, CA friedland@icsi.berkeley.edu
  2. The Internet Today
  3. Internet Use Today Raphaël Troncy: Linked Media: Weaving non-textual content into the Semantic Web, MozCamp, 03/2009.
  4. Types of Videos
  5. Addressable Market for Enterprise Video Applications Security Asset Tracking QA/Operational Efficiency Intelligent $1.2 Billion $480m by 2010 $700m Marketing (Total Market $7.8B, 2005) $4.0 Billion Commercially (RFID in 2006 2.4B) (source: Envysion, (Source: JP Freeman) (Total Asset protection $14.7B) Arrowsight, corporate $200m (source: T3CI corporate ($7B in 06. Source Lehman)(Source: Lehman report 2006) analysis) analysis) BI Training Government Compliance $400m $450m $600m (Reporting and Analysis 4B) (source: JP Freeman) (source: Forrester (Intelligence, Defense, (Total BI market $13.3B) Enterprise Software Homeland Security) 5 (source: IDC BI tools 03-08) report 2005)
  6. Multimedia Capabilities: 1985 • Record • Store • Play • Random Seek • Annotate Manually
  7. Multimedia Capabilities: 2009 • Record • Store • Stream • Play • Random Seek • Annotate Manually
  8. Multimedia Capabilities: Wanted • Semantic Navigation • Search • Content Compare • Object Cut & Paste • Annotate Automatically • Infer over Content => Make multimedia “understandable” for computers.
  9. Problems • Multimedia data is very dense; manual annotation is not feasible • Multimedia content analysis is difficult and rarely good enough to create reliable products.
  10. My Research... Network Knowledge Semantic Web Context Understanding Semantic Computing Machine Learning Recognition Artificial Intelligence Filtering Features Signal/Text Processing Images Audio Video Text
  11. My Research... Hypotheses: • Multimedia content analysis works better when every cue is taken into account (e.g. video AND audio). • Semantics is enabled through context. Converts AI research into products.
  12. Context Sources of Context: • Inclusion of prior knowledge • Combination of algorithms • Multimodality: – audio+video+... – extra hardware • Human interaction • ...
  13. Context as Key: Example 1 → Cut Horse → Paste ^V Meadow Visual Object Extraction
  14. Simple Interactive Object Extraction (SIOX) → → Image User Input Output Context delivered by human interaction
  15. SIOX: Algorithm Idea Color Signatures from image retrieval: Y. Rubner, C. Tomasi, and L. J. Guibas: The Earth Mover’s Distance as a Metric for Image Retrieval. Int. Journal of Computer Vision, 40(2):99–121, 2000. Idea: Instead of searching an image database, use Color Signatures to search inside an image.
  16. SIOX in GIMP SIOX Button G. Friedland, K. Jantz, T. Lenz, F. Wiesel, R. Rojas: “Object Cut and Paste in Images and Videos”, International Journal of Semantic Computing Vol 1, No 2, pp. 221-247, World Scientific, USA, June 2007.
  17. SIOX in Inkscape
  18. SIOX in Blender
  19. Extensions Extracting multiple similar objects at once: →
  20. Sub-Pixel Refinement Problem: Spill colors and foreground disappearance → Original SIOX GraphCut →
  21. Sub-Pixel Refinement Detail Refinement Brush: Coarse Interaction → →
  22. VideoSIOX 1st Frame: Subsequent Frames:
  23. More Information http://www.siox.org
  24. Shoesurfer
  25. Shoesurfer
  26. Shoesurfer
  27. Shoesurfer
  28. Shoesurfer
  29. Context as Key: Example 2
  30. Speaker Diarization: Who Spoke When? Audiotrack: Segmentation: Clustering: G. Friedland, O. Vinyals, Y. Huang, C. Müller: “Prosodic and other Long-Term Features for Speaker Diarization”, IEEE Transactions on Audio, Speech, and Language Processing, Vol 17, No 5, pp. 985-993, July 2009.
  31. Analyzing Meetings
  32. Dominance Estimation
  33. I Know You... http://www.icsi.berkeley.edu/~fractor/ioda_demo.avi
  34. Narrative Theme Navigation G. Friedland, L. Gottlieb, A. Janin: “Joke-o-mat: Browsing Sitcoms Punchline by Punchline”, Proceedings of ACM Multimedia, Beijing, China, October 2009.
  35. Joke-O-Mat: Demo http://www.youtube.com/watch?v=1qfa84Ulm5s
  36. Connecting Multimedia and Semantic Technologies GStreamer Appscio User Device Component 1 Driver User Component 2 Source Recorder . . . File User Component n
  37. Semantic Media Framework Pipeline Framework Integrated C/C++/Java Development Interface Environment Events Code Custom Event Source 1 Video Application Server Web Technology Custom Event Interface Source 2 . Scripting & Logic Engine . . Custom Event Services Connector Source n http://www.appscio.com
  38. Semantic Analysis of Multimedia Data • enables automatic logical inference on perceptually encoded data • enables more “natural” interaction with the computer: “do what the user means” • Interfaces nicely with Semantic Web technologies
  39. A note... James A. Hendler
  40. MySTT Open-Source, open-model, state-of-the-art speech recognizer for multiparty conversations. Release Date: February 2010
  41. 4th IEEE International Conference on Semantic Computing 2010 Paper Deadline: May 3rd, 2010
  42. Upcoming...
  43. Thank You! Questions? Contact: Dr. Gerald Friedland International Computer Science Institute Berkeley, CA http://www.gerald-friedland.org friedland@icsi.berkeley.edu
