Unlocking Data Trapped in Audio & Video Files

As more and more apps record audio and video files, we need to start thinking about what to do with those files. Playing them back isn't enough. Media files are full of data that developers can start exploiting thanks to an emerging category of signal and natural language processing technologies. Some of these are accessible via developer-friendly APIs.

Media files are extremely rich from a data perspective. Signal processing technologies can be used to extract and interpret words, emotions, identity, objects, events, and more.

Unfortunately, the work coming out of research labs is complicated, and usually difficult to understand. Thankfully, people are working to make it accessible.

Published in: Technology

  1. 1. Unlocking Data Trapped in Audio & Video Files Paul Murphy paul@clarify.io @prmurphy
  2. 2. Córdoba, Argentina
  3. 3. Milan, Italy Paris, France
  4. 4. London, UK
  5. 5. So what’s wrong?
  6. 6. Context • My accent • The way I look • Bio (maybe)
  7. 7. My Background (why am I here?) Banking » Telephony » Analytics • When things are important, they’re oral • Oral means no trace • Except that that’s no longer true
  8. 8. Why am I excited? • It’s a hard problem • It has to be solved (trends!) • I founded a company to do that
  9. 9. Trends • End of the Gutenberg Pause • Humans communicate through sound & images • Text is an optimization • Text is dead • Massive amounts of data are now being stored in A/V files
  10. 10. Where are we today? • Lots of tools for manipulating text • Almost no tools for manipulating audio & video
  11. 11. Where are we today? • Lots of tools for manipulating text • Almost no tools for manipulating audio & video
  12. 12. How can we compute on A/V? • Transcription • Annotation
  13. 13. Turks are great! • Human transcription (APIs) • Manual annotation
  14. 14. Can those be automated? • Transcription » ASR • Annotation » Artificial vision Of course! (or I wouldn’t be here)
  15. 15. ASRs • AT&T • IBM Watson • Vocapia • Speechmatics • …
  16. 16. Vision • Clarifai • Orbeus • Image Vision Labs • Face++ • …
  17. 17. State of the Art • Last 4 slides
  18. 18. Context (input) • Telephony vs. wideband • Conversation vs. voicemail • Music vs. speech • English vs. Spanish Better & better data
  19. 19. Context (output) • Gender? • Identity? • Relationship? • Emotion? • Location? Need audio. Transcripts & annotations aren’t enough.
  20. 20. Recap…why does context matter? • Context helps us extract more & better data • Context is data We compute on data
  21. 21. Recap…where are we today • Trends • Technology • Context
  22. 22. Recap…where are we today
  23. 23. Recap…where are we today
  24. 24. Recap…where are we today
  25. 25. Recap…where are we today
  26. 26. Paul Murphy paul@clarify.io @prmurphy Thank you!
  27. 27. Any Questions? PS. We’re hiring! Paul Murphy paul@clarify.io @prmurphy
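
Slide 18 lists input context such as telephony vs. wideband audio. As a minimal sketch of how one such piece of context might be read from a file before it is sent to an ASR, the Python below inspects a WAV header with the standard library. The function names, the returned dictionary keys, and the 8000 Hz cutoff are illustrative assumptions, not anything taken from the talk or from a particular API.

```python
import io
import wave


def describe_wav(data: bytes) -> dict:
    """Read basic input context from the header of an in-memory WAV file."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
        channels = wav.getnchannels()
    return {
        "sample_rate_hz": rate,
        "channels": channels,
        "duration_s": frames / rate,
        # Assumption: treat 8 kHz or lower as telephony-grade audio and
        # anything above it as wideband. Real systems may use other cutoffs.
        "band": "telephony" if rate <= 8000 else "wideband",
    }


def make_silent_wav(rate: int, seconds: float) -> bytes:
    """Build a silent mono 16-bit WAV in memory, for demonstration only."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(rate * seconds))
    return buf.getvalue()


if __name__ == "__main__":
    print(describe_wav(make_silent_wav(8000, 1.0)))
    print(describe_wav(make_silent_wav(16000, 0.5)))
```

The sample rate is only one signal. Other context from slide 18 (conversation vs. voicemail, music vs. speech, language) cannot be read from a header and would need analysis of the audio itself.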