Semantics And Multimedia

This is Gerald Friedland's presentation for SVST's Multi-Media and the Semantic Web.

Published in: Technology, Education

  1. Advances in Semantic Analysis of Multimedia Dr. Gerald Friedland International Computer Science Institute Berkeley, CA friedland@icsi.berkeley.edu
  2. The Internet Today
  3. Internet Use Today (Raphaël Troncy: Linked Media: Weaving non-textual content into the Semantic Web, MozCamp, 03/2009)
  4. Types of Videos
  5. Addressable Market for Enterprise Video Applications. Segments: Security, Asset Tracking, QA/Operational Efficiency, Intelligent Marketing, BI, Training, Government (Intelligence, Defense, Homeland Security), Compliance. Market figures and sources, in slide order: $1.2 Billion; $480m by 2010; $700m; (Total Market $7.8B, 2005); $4.0 Billion Commercially; (RFID in 2006 2.4B); (Source: JP Freeman); (Total Asset protection $14.7B); $200m; (source: Envysion, Arrowsight, corporate analysis); (source: T3CI corporate analysis); ($7B in 06. Source Lehman); (Source: Lehman report 2006); $400m; $450m; $600m; (Reporting and Analysis 4B); (source: JP Freeman); (source: Forrester Enterprise Software report 2005); (Total BI market $13.3B); (source: IDC BI tools 03-08)
  6. Multimedia Capabilities: 1985 • Record • Store • Play • Random Seek • Annotate Manually
  7. Multimedia Capabilities: 2009 • Record • Store • Stream • Play • Random Seek • Annotate Manually
  8. Multimedia Capabilities: Wanted • Semantic Navigation • Search • Content Compare • Object Cut & Paste • Annotate Automatically • Infer over Content => Make multimedia “understandable” for computers.
  9. Problems • Multimedia data is very dense, so manual annotation is not feasible. • Multimedia content analysis is difficult and rarely good enough to create reliable products.
  10. My Research... Network Knowledge Semantic Web Context Understanding Semantic Computing Machine Learning Recognition Artificial Intelligence Filtering Features Signal/Text Processing Images Audio Video Text
  11. My Research... Hypotheses: • Multimedia content analysis works better when every cue is taken into account (e.g. video AND audio). • Semantics is enabled through context. Converts AI research into products.
  12. Context. Sources of Context: • Inclusion of prior knowledge • Combination of algorithms • Multimodality: – audio+video+... – extra hardware • Human interaction • ...
  13. Context as Key: Example 1: Visual Object Extraction. Horse → Cut → Paste (^V) → Meadow
  14. Simple Interactive Object Extraction (SIOX): Image → User Input → Output. Context delivered by human interaction.
  15. SIOX: Algorithm Idea. Color Signatures from image retrieval: Y. Rubner, C. Tomasi, and L. J. Guibas: The Earth Mover’s Distance as a Metric for Image Retrieval. Int. Journal of Computer Vision, 40(2):99–121, 2000. Idea: Instead of searching an image database, use Color Signatures to search inside an image.
  16. SIOX in GIMP (SIOX button). G. Friedland, K. Jantz, T. Lenz, F. Wiesel, R. Rojas: “Object Cut and Paste in Images and Videos”, International Journal of Semantic Computing Vol 1, No 2, pp. 221-247, World Scientific, USA, June 2007.
  17. SIOX in Inkscape
  18. SIOX in Blender
  19. Extensions: Extracting multiple similar objects at once.
  20. Sub-Pixel Refinement. Problem: Spill colors and foreground disappearance (images: Original, SIOX, GraphCut).
  21. Sub-Pixel Refinement: Detail Refinement Brush. Coarse Interaction.
  22. VideoSIOX: 1st Frame, Subsequent Frames
  23. More Information: http://www.siox.org
  24. Shoesurfer
  25. Shoesurfer
  26. Shoesurfer
  27. Shoesurfer
  28. Shoesurfer
  29. Context as Key: Example 2
  30. Speaker Diarization: Who Spoke When? Audio track → Segmentation → Clustering. G. Friedland, O. Vinyals, Y. Huang, C. Müller: “Prosodic and other Long-Term Features for Speaker Diarization”, IEEE Transactions on Audio, Speech, and Language Processing, Vol 17, No 5, pp. 985-993, July 2009.
  31. Analyzing Meetings
  32. Dominance Estimation
  33. I Know You... http://www.icsi.berkeley.edu/~fractor/ioda_demo.avi
  34. Narrative Theme Navigation. G. Friedland, L. Gottlieb, A. Janin: “Joke-o-mat: Browsing Sitcoms Punchline by Punchline”, Proceedings of ACM Multimedia, Beijing, China, October 2009.
  35. Joke-O-Mat: Demo http://www.youtube.com/watch?v=1qfa84Ulm5s
  36. Connecting Multimedia and Semantic Technologies (diagram labels: GStreamer; Appscio; Device Driver; Source Recorder; File; User Component 1, User Component 2, ..., User Component n)
  37. Semantic Media Framework (diagram labels: Pipeline Framework; Integrated Development Environment; C/C++/Java Interface; Events; Code; Video; Custom Event Source 1, Custom Event Source 2, ..., Custom Event Source n; Application Server; Web Technology Interface; Scripting & Logic Engine; Services Connector). http://www.appscio.com
  38. Semantic Analysis of Multimedia Data • enables automatic logical inference on perceptually encoded data • enables more “natural” interaction with the computer: “do what the user means” • interfaces nicely with Semantic Web technologies
  39. A note... James A. Hendler
  40. MySTT: Open-source, open-model, state-of-the-art speech recognizer for multiparty conversations. Release Date: February 2010
  41. 4th IEEE International Conference on Semantic Computing 2010. Paper Deadline: May 3rd, 2010
  42. Upcoming...
  43. Thank You! Questions? Contact: Dr. Gerald Friedland, International Computer Science Institute, Berkeley, CA. http://www.gerald-friedland.org friedland@icsi.berkeley.edu
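
The SIOX slide (slide 15) describes using color signatures to search inside an image rather than across an image database. The following is only a toy sketch of that idea, not the SIOX implementation: it assumes the user has marked some pixels as known foreground and known background, builds a crude "signature" for each set by averaging colors inside coarse RGB cells (a stand-in chosen for brevity), and labels any other pixel by its nearest signature color. The function names, the cell size and the sample pixel values are all hypothetical.

```python
from collections import defaultdict


def color_signature(pixels, cell=64):
    """Summarize pixels as the mean color of each occupied RGB cell.

    Assumption: coarse RGB quantization stands in for a real
    color-signature clustering step.
    """
    buckets = defaultdict(list)
    for p in pixels:
        key = tuple(c // cell for c in p)
        buckets[key].append(p)
    return [
        tuple(sum(channel) / len(group) for channel in zip(*group))
        for group in buckets.values()
    ]


def sq_dist(a, b):
    """Squared Euclidean distance between two colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(pixel, fg_sig, bg_sig):
    """Label a pixel by whichever signature holds the nearest color."""
    d_fg = min(sq_dist(pixel, s) for s in fg_sig)
    d_bg = min(sq_dist(pixel, s) for s in bg_sig)
    return "foreground" if d_fg < d_bg else "background"


# Hypothetical user input: brownish pixels marked as object,
# greenish pixels marked as background.
known_fg = [(120, 70, 30), (130, 80, 40), (110, 60, 25)]
known_bg = [(40, 160, 50), (50, 170, 60), (30, 150, 45)]

fg_sig = color_signature(known_fg)
bg_sig = color_signature(known_bg)

unknown = [(125, 75, 35), (45, 165, 55)]
labels = [classify(p, fg_sig, bg_sig) for p in unknown]
print(labels)
```

The context here comes from the user's marks, which matches the "context delivered by human interaction" point on slide 14.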

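
The speaker diarization slide (slide 30) names two stages, segmentation and clustering. The sketch below illustrates that two-stage shape only, under strong simplifying assumptions that are not taken from the slides: each audio frame is reduced to a single made-up number (think of it as a pitch-like value), segmentation is a fixed-length split, and clustering is a greedy bottom-up merge of the two closest clusters until the closest pair is further apart than a threshold. The names `segment`, `cluster` and `threshold` are hypothetical.

```python
def segment(frames, seg_len):
    """Split the frame sequence into fixed-length segments."""
    return [frames[i:i + seg_len] for i in range(0, len(frames), seg_len)]


def mean(values):
    return sum(values) / len(values)


def cluster(segments, threshold):
    """Greedy bottom-up clustering of segments by their mean feature value.

    Returns one speaker label per segment, in segment order.
    """
    clusters = [
        {"members": [i], "values": list(seg)}
        for i, seg in enumerate(segments)
    ]
    while len(clusters) > 1:
        # Find the closest pair of clusters.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = abs(mean(clusters[a]["values"]) - mean(clusters[b]["values"]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d > threshold:
            break  # remaining clusters are treated as distinct speakers
        clusters[a]["members"] += clusters[b]["members"]
        clusters[a]["values"] += clusters[b]["values"]
        del clusters[b]

    labels = [None] * len(segments)
    for speaker, c in enumerate(clusters):
        for i in c["members"]:
            labels[i] = speaker
    return labels


# Made-up per-frame feature values: low, high, then low again.
frames = [110, 112, 108, 111, 210, 205, 208, 212, 109, 113, 111, 110]
labels = cluster(segment(frames, 4), threshold=20.0)
print(labels)
```

The output answers "who spoke when" at segment granularity: segments that share a label are attributed to the same speaker.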