Unlocking Data Trapped in Audio & Video Files
Paul Murphy
paul@clarify.io
@prmurphy
Córdoba, Argentina
Milan, Italy
Paris, France
London, UK
So what’s
wrong?
Context
• My accent
• The way I look
• Bio (maybe)
My Background (why am I here?)
Banking » Telephony » Analytics
When things are
important, they’re oral
Oral means no trace
Except that that’s no
longer true
Why am I excited?
• It’s a hard problem
• It has to be solved (trends!)
• I founded a company to do that
Trends
• End of the Gutenberg Pause
• Humans communicate through
sound & images
• Text is an optimization
• Text is dead
• Massive amounts of data are
now being stored in A/V files
Where are we today?
• Lots of tools for manipulating text
• Almost no tools for manipulating
audio & video
How can we compute on A/V?
• Transcription
• Annotation
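A minimal sketch of why transcription makes audio computable: once each spoken word carries a timestamp, ordinary text tools can search the recording and jump to the matching moment. The `Word` type, the `find_term` helper, and the sample transcript are hypothetical illustrations, not the output format of any particular ASR service.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str     # the transcribed word
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording


def find_term(words, term):
    """Return the start times (in seconds) at which `term` is spoken."""
    term = term.lower()
    return [w.start for w in words if w.text.lower().strip(".,!?") == term]


# Hypothetical time-aligned transcript of a short voicemail (assumed data).
transcript = [
    Word("Please", 0.0, 0.4),
    Word("call", 0.4, 0.7),
    Word("the", 0.7, 0.8),
    Word("bank", 0.8, 1.2),
    Word("today.", 1.2, 1.7),
]

print(find_term(transcript, "bank"))  # [0.8]
```

The same structure extends to annotation: any label with a start and end time (a speaker, a face, an emotion) can be queried the same way.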
Turks are great!
• Human transcription (APIs)
• Manual annotation
Can those be automated?
• Transcription » ASR
• Annotation » Artificial vision
Of course!
(or I wouldn’t be here)
ASRs
• AT&T
• IBM Watson
• Vocapia
• Speechmatics
• …
Vision
• Clarifai
• Orbeus
• Image Vision Labs
• Face++
• …
State of the Art
• Last 4 slides
Context (input)
• Telephony vs. wideband
• Conversation vs. voicemail
• Music vs. speech
• English vs. Spanish
Better & better data
Context (output)
• Gender?
• Identity?
• Relationship?
• Emotion?
• Location?
Need audio.
Transcripts & annotations
aren’t enough.
Recap…why does context matter?
• Context helps us extract
more & better data
• Context is data
We compute on data
Recap…where are we today
• Trends
• Technology
• Context
Paul Murphy
paul@clarify.io
@prmurphy
Thank you!
Any Questions?
PS. We’re hiring!