Semantics And Multimedia

This is Gerald Friedland's presentation for SVST's Multi-Media and the Semantic Web.

Usage Rights

© All Rights Reserved

Semantics And Multimedia Presentation Transcript

  • 1. Advances in Semantic Analysis of Multimedia Dr. Gerald Friedland International Computer Science Institute Berkeley, CA friedland@icsi.berkeley.edu
  • 2. The Internet Today
  • 3. Internet Use Today Raphaël Troncy: Linked Media: Weaving non-textual content into the Semantic Web, MozCamp, 03/2009.
  • 4. Types of Videos
  • 5. Addressable Market for Enterprise Video Applications Security Asset Tracking QA/Operational Efficiency Intelligent $1.2 Billion $480m by 2010 $700m Marketing (Total Market $7.8B, 2005) $4.0 Billion Commercially (RFID in 2006 2.4B) (source: Envysion, (Source: JP Freeman) (Total Asset protection $14.7B) Arrowsight, corporate $200m (source: T3CI corporate ($7B in 06. Source Lehman)(Source: Lehman report 2006) analysis) analysis) BI Training Government Compliance $400m $450m $600m (Reporting and Analysis 4B) (source: JP Freeman) (source: Forrester (Intelligence, Defense, (Total BI market $13.3B) Enterprise Software Homeland Security) (source: IDC BI tools 03-08) report 2005)
  • 6. Multimedia Capabilities: 1985 • Record • Store • Play • Random Seek • Annotate Manually
  • 7. Multimedia Capabilities: 2009 • Record • Store • Stream • Play • Random Seek • Annotate Manually
  • 8. Multimedia Capabilities: Wanted • Semantic Navigation • Search • Content Compare • Object Cut & Paste • Annotate Automatically • Infer over Content => Make multimedia “understandable” for computers.
  • 9. Problems • Multimedia data is very dense, so manual annotation is not feasible • Multimedia content analysis is difficult and rarely good enough to create reliable products.
  • 10. My Research... Network Knowledge Semantic Web Context Understanding Semantic Computing Machine Learning Recognition Artificial Intelligence Filtering Features Signal/Text Processing Images Audio Video Text
  • 11. My Research... Hypotheses: • Multimedia content analysis works better when every cue is taken into account (e.g. video AND audio). • Semantics is enabled through context. Converts AI research into products.
  • 12. Context Sources of Context: • Inclusion of prior knowledge • Combination of algorithms • Multimodality: – audio+video+... – extra hardware • Human interaction • ...
  • 13. Context as Key: Example 1 → Cut Horse → Paste ^V Meadow Visual Object Extraction
  • 14. Simple Interactive Object Extraction (SIOX) Image → User Input → Output Context delivered by human interaction
  • 15. SIOX: Algorithm Idea Color Signatures from image retrieval: Y. Rubner, C. Tomasi, and L. J. Guibas: The Earth Mover’s Distance as a Metric for Image Retrieval. Int. Journal of Computer Vision, 40(2):99–121, 2000. Idea: Instead of searching an image database, use Color Signatures to search inside an image.
  • 16. SIOX in GIMP SIOX Button G. Friedland, K. Jantz, T. Lenz, F. Wiesel, R. Rojas: “Object Cut and Paste in Images and Videos”, International Journal of Semantic Computing Vol 1, No 2, pp. 221-247, World Scientific, USA, June 2007.
  • 17. SIOX in Inkscape
  • 18. SIOX in Blender
  • 19. Extensions Extracting multiple similar objects at once
  • 20. Sub-Pixel Refinement Problem: Spill colors and foreground disappearance → Original SIOX GraphCut
  • 21. Sub-Pixel Refinement Detail Refinement Brush: Coarse Interaction
  • 22. VideoSIOX 1st Frame: Subsequent Frames:
  • 23. More Information http://www.siox.org
  • 24. Shoesurfer
  • 25. Shoesurfer
  • 26. Shoesurfer
  • 27. Shoesurfer
  • 28. Shoesurfer
  • 29. Context as Key: Example 2
  • 30. Speaker Diarization: Who Spoke When? Audiotrack: Segmentation: Clustering: G. Friedland, O. Vinyals, Y. Huang, C. Müller: “Prosodic and other Long-Term Features for Speaker Diarization”, IEEE Transactions on Audio, Speech, and Language Processing, Vol 17, No 5, pp. 985–993, July 2009.
  • 31. Analyzing Meetings
  • 32. Dominance Estimation
  • 33. I Know You... http://www.icsi.berkeley.edu/~fractor/ioda_demo.avi
  • 34. Narrative Theme Navigation G. Friedland, L. Gottlieb, A. Janin: “Joke-o-mat: Browsing Sitcoms Punchline by Punchline”, Proceedings of ACM Multimedia, Beijing, China, October 2009.
  • 35. Joke-O-Mat: Demo http://www.youtube.com/watch?v=1qfa84Ulm5s
  • 36. Connecting Multimedia and Semantic Technologies GStreamer Appscio User Device Component 1 Driver User Component 2 Source Recorder . . . File User Component n
  • 37. Semantic Media Framework Pipeline Framework Integrated C/C++/Java Development Interface Environment Events Code Custom Event Source 1 Video Application Server Web Technology Custom Event Interface Source 2 . Scripting & Logic Engine . . Custom Event Services Connector Source n http://www.appscio.com
  • 38. Semantic Analysis of Multimedia Data • enables automatic logical inference on perceptually encoded data • enables more “natural” interaction with the computer: “do what the user means” • Interfaces nicely with Semantic Web technologies
  • 39. A note... James A. Hendler
  • 40. MySTT Open-Source, open-model, state-of-the-art speech recognizer for multiparty conversations. Release Date: February 2010
  • 41. 4th IEEE International Conference on Semantic Computing 2010 Paper Deadline: May 3rd, 2010
  • 42. Upcoming...
  • 43. Thank You! Questions? Contact: Dr. Gerald Friedland International Computer Science Institute Berkeley, CA http://www.gerald-friedland.org friedland@icsi.berkeley.edu
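
The idea on slide 15 (use Color Signatures to search inside an image rather than across an image database) can be illustrated with a minimal sketch. This is not the SIOX implementation. It assumes plain RGB tuples, uses coarse colour bucketing as a stand-in for the clustering that builds a signature, and compares colours by squared Euclidean distance. The names `color_signature`, `classify` and the `bucket` parameter are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def color_signature(pixels, bucket=64):
    """Summarise pixels as a few representative colours.

    Assumption: coarse RGB bucketing replaces a real clustering step.
    Each non-empty bucket contributes the mean colour of its members.
    """
    groups = defaultdict(list)
    for r, g, b in pixels:
        groups[(r // bucket, g // bucket, b // bucket)].append((r, g, b))
    signature = []
    for members in groups.values():
        n = len(members)
        signature.append(tuple(sum(c[i] for c in members) / n for i in range(3)))
    return signature


def sq_dist(a, b):
    """Squared Euclidean distance between two colours."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(color, signature):
    """Distance from a colour to the closest entry of a signature."""
    return min(sq_dist(color, s) for s in signature)


def classify(image, fg_samples, bg_samples):
    """Mark a pixel True when it is closer to the foreground signature."""
    fg_sig = color_signature(fg_samples)
    bg_sig = color_signature(bg_samples)
    return [[nearest(p, fg_sig) < nearest(p, bg_sig) for p in row]
            for row in image]


# Toy 2x2 image: reddish "object" pixels on the left, greenish background on the right.
image = [[(200, 30, 30), (20, 120, 20)],
         [(190, 40, 35), (25, 110, 30)]]
fg_samples = [(210, 25, 25), (195, 35, 30)]  # pixels the user marked as object
bg_samples = [(15, 115, 25), (30, 125, 20)]  # pixels the user marked as background
mask = classify(image, fg_samples, bg_samples)
print(mask)
```

The user-marked samples play the role of the human-supplied context from slide 14: the two signatures are built from them, and every other pixel is assigned to whichever signature it matches more closely.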