FIAT/IFTA Media Management Seminar
“Game Changers? From Automation to Curation: Futureproofing AV Content”
IBM AI Overview with several Examples of
Projects in the Media and Lessons Learned
Jakob Rosinski | Lead Architect Video Solutions & Broadcast Industry Europe
Stockholm | 13.06.2018
Abstract
This talk gives an overview of client projects around media archives worldwide to which IBM has
contributed with its own AI, named Watson, and with its knowledge and integration capabilities. The major topics are
scope definition and use case identification, and the use of cognitive services of different kinds and from different vendors,
covering both successes and open problems. In such a multi-modal approach, training the services is also key, and the talk
shows how this can be managed from both a human and a machine perspective.
Jakob is the Lead Architect for Video Solutions & Broadcast Industry for IBM Services
in Europe. He is also the product owner of IBM AREMA, a workflow and essence
management solution that is widely used at broadcasters for essence
archives and workflow automation.
Over the last decade Jakob has been responsible for various projects in the media industry
at HBO, France24, ORF, SRF, RTL Mediengruppe and Deutsche Bundesliga/Sportcast.
He is an expert in multi-site and multi-tier essence management and workflow
automation for ingest, archive, production and distribution.
He is also known and valued as a subject matter expert on these topics in the
worldwide IBM M&E community. He is skilled at translating business needs into systems
solutions.
Video Enrichment uses industry leading AI capabilities to analyze textual, audio, and visual data
within multi-media content, and to build easily searchable metadata packages for every asset.
By understanding content in new ways, media companies can improve content discovery,
increase operational efficiency, deliver higher ad revenues, drive viewer engagement and offer
entirely new ways to meet the demands of their businesses.
Enriched content is inherently more searchable. Improved content discovery in your consumer
service leads to increased usage.
Cognitive base services used for content enrichment
• Visual Recognition: enhanced and automated understanding of personalities and objects present in the frame
• Audiomining & Speech to Text: activate decade-old material by running it through the STT API and then performing deeper analytics
• NLU & Translation: deeper understanding of concepts, recognized entities, keywords, and relationships
• Videodetection / Speed / Movement
• Pattern Detection & Similarity Search: search image and video data for objects or contexts that were not trained
Target: deeply enriched content, second by second
A lot of vendors are providing base cognitive services...
• Visual Recognition
• Audiomining & Speech to Text
• NLU & Translation
• Videodetection / Speed / Movement
• Pattern Detection & Similarity Search
https://www.foxsports.com/soccer/fifa-world-cup/highlights
©2018 IBM Corporation 27 June 2019 IBM Services
Watson Video Enrichment Workflow
Assets are acquired, ingested, processed and enriched
using the Watson Media platform.
SEMANTIC SCENE CHAPTERING
Divides the media into meaningful chunks or chapters that can be more
easily managed by the people responsible for editing or producing.
SPEECH TO TEXT
Converts audio into text by leveraging machine intelligence to combine
information about grammar and language structure with knowledge of
the composition of the audio signal. Trainable.
NATURAL LANGUAGE UNDERSTANDING
Using the textual output of speech to text (S2T) or a closed caption file, NLU derives:
Concepts, Document-Level Emotions, Sentiment, Entities, Keywords,
Language, & Taxonomy. Trainable.
VISUAL RECOGNITION
Detects the contents of an image or video frame, answering the
question: “What is in this image?” Returns class, class description, face
detection, and text recognition. Trainable.
Customer MAM or DAM
Enriched metadata is delivered as an open JSON bundle to be
stored and used for search, compliance, recommendation and
other vital use cases.
TONE ANALYZER & PERSONALITY INSIGHTS
Provide additional features that document the Emotional Tone, Writing
Tone, and Social Tone of dialogue, as well as the overall personalities of
characters based on their words.
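The delivery of enriched metadata as one open JSON bundle can be pictured with a minimal sketch. Everything named here is an assumption for illustration: the `Annotation` fields, the `build_bundle` helper, the confidence cut-off and the asset id are hypothetical and do not reflect the actual Watson Media bundle schema.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Annotation:
    service: str       # which service produced it, e.g. "speech_to_text" (assumed name)
    start: float       # start offset in seconds
    end: float         # end offset in seconds
    label: str         # transcript text, class name, entity, ...
    confidence: float  # 0.0 to 1.0, as reported by the service


def build_bundle(asset_id, annotations, min_confidence=0.5):
    """Merge annotations from all services into one time-ordered bundle."""
    kept = [a for a in annotations if a.confidence >= min_confidence]
    kept.sort(key=lambda a: (a.start, a.service))
    return {"asset_id": asset_id, "annotations": [asdict(a) for a in kept]}


annotations = [
    Annotation("visual_recognition", 12.0, 15.0, "stadium", 0.91),
    Annotation("speech_to_text", 3.2, 7.8, "welcome to the show", 0.84),
    Annotation("visual_recognition", 20.0, 21.0, "dog", 0.31),  # dropped: low confidence
]
bundle = build_bundle("asset-0001", annotations)
bundle_json = json.dumps(bundle, indent=2)
print(bundle_json)
```

A single time-ordered list keeps the bundle easy to index in a MAM or DAM, whatever service produced each entry.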
Analysis of successful trailers to automatically create a new one
Scene Detection
Deep Video-Analysis
• People, object and context detection
• Classification of actors based on 24 emotions
• Classification of scenes based on 22,000 categories
Deep Audio-Analysis
• Background
• Actor sentiment and tone
Analysis of scene composition
• Classification of light and color
https://www.youtube.com/watch?v=gJEzuYynaiw
Content enrichment for Bundesliga archive
Concept and proof of an automatic content enrichment system for 40+ years of soccer history
• Annotation using a portfolio of cognitive solutions
  - Audio: Speech-to-text / Transcript
  - Audio: Speaker detection
  - Audio: Atmosphere (cheers, whistles, ...)
  - Video: Angle/Camera & context detection
  - Video: Face & object detection
• Domain-trained services, including a training portal
• Sharpening of results through domain knowledge, creation of timelines and identification of concepts
Link with game and player data
• Optimize content analysis and search based on game and player statistics
• Guided search
Persona-based User Experience
• Personalized Discovery, Suggestions, Design & Projects
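Linking annotations with game data can be sketched as a simple join on time. This is only an illustration under stated assumptions: `GameEvent`, `Annotation`, `link_events`, the kickoff offset and the 30-second window are hypothetical, and the minute-to-video mapping ignores half-time and stoppage time.

```python
from dataclasses import dataclass


@dataclass
class GameEvent:
    minute: int   # match minute from the game data feed
    kind: str     # e.g. "goal", "yellow_card"
    player: str


@dataclass
class Annotation:
    second: float  # offset into the video file, in seconds
    label: str     # e.g. "cheers", "whistle", a recognized face


def link_events(events, annotations, kickoff_offset_s, window_s=30.0):
    """Attach to each game event the annotations found close to it in the video."""
    linked = []
    for event in events:
        # Simplification: a real mapping must handle half-time and stoppage time.
        event_second = kickoff_offset_s + event.minute * 60
        nearby = [a.label for a in annotations
                  if abs(a.second - event_second) <= window_s]
        linked.append({"event": event, "labels": nearby})
    return linked


events = [GameEvent(23, "goal", "Player A")]
annotations = [
    Annotation(1500.0, "cheers"),
    Annotation(1510.0, "whistle"),
    Annotation(200.0, "cheers"),  # too far from the goal to be linked
]
linked = link_events(events, annotations, kickoff_offset_s=120.0)
print(linked[0]["labels"])
```

A guided search could then rank a clip higher when the statistics say "goal" and the audio analysis found cheers in the same window.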
Content enrichment for Brazil's most famous TV show
Target: Automatic content enrichment of 30+ years of show content
Annotation using a portfolio of cognitive solutions (IBM, OpenCV)
• Audio: Speech-to-text / Transcript / Phrase detection
• Video: Angle/Camera & context detection
• Video: Face & object detection
Domain-trained services, including a training portal
Sharpening of results through domain knowledge, creation of timelines and identification of concepts
Architecture of “Captain Caption” Demo
AREMA
• Speech to Text
• Deep Learning – Sound Recognition
• Natural Language Understanding
• Conform results into one closed caption file
• Translation into target language
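The conform step, where speech and sound recognition results end up in one caption file, might look like the sketch below. It is not the demo's implementation: the `Cue` type and the `conform_to_srt` helper are hypothetical, SubRip (SRT) is chosen here only as a familiar caption format, and sound events are shown in square brackets by assumption.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds
    end: float    # seconds
    text: str


def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = int(round(seconds * 1000))
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def conform_to_srt(speech, sounds):
    """Merge speech cues and bracketed sound cues into one SRT document."""
    cues = list(speech) + [Cue(s.start, s.end, f"[{s.text}]") for s in sounds]
    cues.sort(key=lambda c: (c.start, c.end))
    blocks = []
    for index, cue in enumerate(cues, start=1):
        timing = f"{srt_timestamp(cue.start)} --> {srt_timestamp(cue.end)}"
        blocks.append(f"{index}\n{timing}\n{cue.text}\n")
    return "\n".join(blocks)


speech = [Cue(0.0, 2.5, "Welcome back."), Cue(4.0, 6.0, "What a goal!")]
sounds = [Cue(2.5, 4.0, "applause")]
srt = conform_to_srt(speech, sounds)
print(srt)
```

Translation into the target language would then operate on the text of each cue while leaving the timings untouched.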
Use Case: “Implement a fully integrated, trained
cognitive service to exchange ident in and out
scenes”
Context / Solution
Frame-accurate detection of trained lead-in and lead-out frames marks those
scenes in the content. The scenes are then exchanged automatically in the master format without
transcoding (unwrap, cut, wrap) and with appropriate audio track handling, which
enables a fast channel switch of content.
• Uses an in-house detection component built on OpenCV and Watson VR for
frame-accurate detection of scenes
• Uses AREMA's Dalet Galaxy integration to pull content from and push content to
the MAM system directly, with no need to extend Galaxy for this purpose
• Scales automatically by using the AREMA autoscaler in combination with
Kubernetes & Docker
• Uses the AREMA MXF Package for
• metadata extraction from the source file
• rewrapping / preparation of the audio track schema of the new scene
• partial cut of the source file
• conforming of all parts to the target file
=> very fast, no transcoding or change of audio and video streams
Result:
• Fully automated exchange of scenes, deeply integrated with the existing environment
• Almost unlimited scalability, as all components can run in a Kubernetes/Docker environment
• Significantly reduced time and staff effort and faster change of content between programs => from 3 months (2 full-time persons) to days
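The partial-cut and conform steps can be illustrated by planning which frame ranges are copied from the master and where the new ident scenes are inserted. This is a sketch only: `Part` and `plan_exchange` are hypothetical names, the frame numbers are invented, and it says nothing about how the AREMA MXF Package works internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    source: str       # "master" or "replacement"
    first_frame: int
    last_frame: int   # inclusive


def plan_exchange(total_frames, ident_ranges, replacement_lengths):
    """Plan the parts of the target file.

    ident_ranges: inclusive (first, last) frame ranges detected in the master.
    replacement_lengths: frame count of the new scene for each detected range.
    """
    if len(ident_ranges) != len(replacement_lengths):
        raise ValueError("one replacement is needed per detected range")
    parts = []
    cursor = 0  # next master frame that has not been handled yet
    for (first, last), new_length in sorted(zip(ident_ranges, replacement_lengths)):
        if first < cursor or last < first or last >= total_frames:
            raise ValueError("ranges must be valid and must not overlap")
        if first > cursor:
            parts.append(Part("master", cursor, first - 1))  # untouched partial cut
        parts.append(Part("replacement", 0, new_length - 1))
        cursor = last + 1
    if cursor < total_frames:
        parts.append(Part("master", cursor, total_frames - 1))
    return parts


parts = plan_exchange(1000, [(0, 124), (900, 999)], [100, 150])
target_frames = sum(p.last_frame - p.first_frame + 1 for p in parts)
for part in parts:
    print(part)
```

Because the plan only references frame ranges, the untouched master parts can be copied as they are, which is consistent with the goal of avoiding transcoding.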
There is no One Size Fits All
Each use case of multimodal analysis has different requirements, so the workflows and the
combination of AI services have to be adapted to these requirements.
• This is where the following model provides the flexibility to adapt to each unique use case of
multimodal analytics
• Vendor-independent usage of cognitive services
• The whole is greater than the sum of its parts (Aristotle), but sometimes particular
“tiny” use cases are also worth evaluating
• Flexible MULTIMODALITY is a must
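Vendor-independent use of cognitive services is commonly achieved with a thin adapter interface, which a minimal sketch can show. The `CognitiveService` protocol, the two fake adapters and `run_workflow` are hypothetical names for illustration; real adapters would wrap the vendor SDKs.

```python
from typing import Protocol


class CognitiveService(Protocol):
    name: str
    modality: str  # e.g. "audio" or "video"

    def analyze(self, asset_uri: str) -> list[dict]: ...


class FakeSpeechToText:
    """Stand-in for any vendor's speech-to-text adapter."""
    name = "vendor_a_stt"
    modality = "audio"

    def analyze(self, asset_uri):
        return [{"start": 0.0, "end": 2.0, "label": "hello"}]


class FakeVisualRecognition:
    """Stand-in for any vendor's visual recognition adapter."""
    name = "vendor_b_vr"
    modality = "video"

    def analyze(self, asset_uri):
        return [{"start": 1.0, "end": 1.0, "label": "stadium"}]


def run_workflow(asset_uri, services, modalities):
    """Run only the services whose modality the use case needs."""
    results = []
    for service in services:
        if service.modality in modalities:
            for item in service.analyze(asset_uri):
                results.append({"service": service.name, **item})
    return results


services = [FakeSpeechToText(), FakeVisualRecognition()]
audio_only = run_workflow("file:///demo.mxf", services, {"audio"})
multimodal = run_workflow("file:///demo.mxf", services, {"audio", "video"})
print(len(audio_only), len(multimodal))
```

Swapping a vendor then means replacing one adapter class, while the workflow that combines modalities stays the same.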
Elemental parts of a content enrichment platform
• Multi-Modality & Training & Vendor independence
• Data-Consolidation & Monitoring
• Integration & Workflow
Why is training necessary?
- How do we tell Will Ferrell (famous actor) apart from
Chad Smith (famous rock musician)?
- Challenges include:
• Out-of-Plane Rotation: frontal, 45 degree, profile,
upside down
• Presence of beard, mustache, glasses.
• Facial Expressions
• Occlusions by long hair, hand
• In-Plane Rotation
• Image conditions: size, lighting condition, distortion,
noise, compression
Trust me, these are two different, unrelated people!
https://medium.com/@ageitgey/machine-learning-is-fun-part-4-modern-face-recognition-with-deep-learning-c3cffc121d78
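One common way to tell two look-alike faces apart is to compare numeric face embeddings by distance, as sketched below. The three-dimensional vectors, the `identify` helper and the 0.6 threshold are assumptions for illustration; real embeddings have many more dimensions and come from a trained model.

```python
import math


def euclidean_distance(a, b):
    """Distance between two embeddings of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(probe, gallery, max_distance=0.6):
    """Return the closest known person, or None if nobody is close enough."""
    best_name, best_distance = None, math.inf
    for name, reference in gallery.items():
        distance = euclidean_distance(probe, reference)
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name if best_distance <= max_distance else None


# Toy embeddings: look-alikes sit close together, so training data must
# separate them well enough for the nearest reference to be the right one.
gallery = {
    "Will Ferrell": [0.10, 0.20, 0.30],
    "Chad Smith": [0.15, 0.25, 0.20],
}
match = identify([0.14, 0.26, 0.21], gallery)
unknown = identify([0.90, 0.90, 0.90], gallery)
print(match, unknown)
```

The closer two people's reference embeddings are, the more the rotation, occlusion and image-quality challenges listed above can flip the result.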
A lot of vendors are providing base cognitive services...but without
individual training they do not provide sufficient benefit
[Figure: accuracy (0% to 90%) plotted against training data size, with learning curves for the base model, the domain-specific model and the customer-adapted model; the models are layered as Base AI Model, Industry/Domain AI Model and Customized user AI model.]
As the domain specializes, learning accelerates
• Public models
• Pre-trained
• Limited accuracy for typical real-life use cases
• Trained with proprietary data
• Data ownership critical for differentiation
Automated TRAINING is a must
Source: Andrej Karp
Cognitive Process with Trainer, Analysis Workflow and Aggregator
[Diagram: Media Ingestion, Cognitive Analysis Workflow, Cognitive Trainer, Cognitive Aggregator, Image Classifier Inbox, Image Classifier Repository, Taxonomy Database and Metadata Repository (MAM), connected by the numbered steps 1 to 6 listed below.]
1. Configure taxonomy (add classifiers, categories, etc.)
2. Show and organize classifier images
3. Move good classifiers to the repository to optimize training
4. Use the classifier repository to train services and perform custom analysis
5. Move the actual frame to the inbox when its confidence is OK
6. Use the taxonomy for rule creation
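Steps 3 and 5 of the list above can be sketched as a small routing loop between an inbox and a repository. The names `TrainerState`, `route_frame` and `promote`, and the 0.8 threshold, are hypothetical; the sketch only illustrates the idea of collecting confident frames for review and keeping the approved ones for training.

```python
from dataclasses import dataclass, field


@dataclass
class TrainerState:
    inbox: dict = field(default_factory=dict)       # class name -> frames awaiting review
    repository: dict = field(default_factory=dict)  # class name -> approved training frames


def route_frame(state, frame_id, label, confidence, taxonomy, threshold=0.8):
    """Step 5: put a frame into the inbox when its classification is confident enough."""
    if label in taxonomy and confidence >= threshold:
        state.inbox.setdefault(label, []).append(frame_id)
        return True
    return False


def promote(state, label, approved):
    """Step 3: move frames a human approved from the inbox to the repository."""
    pending = state.inbox.get(label, [])
    for frame_id in approved:
        if frame_id in pending:
            pending.remove(frame_id)
            state.repository.setdefault(label, []).append(frame_id)


state = TrainerState()
taxonomy = {"goal", "referee"}
routed = [
    route_frame(state, "f1", "goal", 0.93, taxonomy),
    route_frame(state, "f2", "goal", 0.41, taxonomy),   # confidence too low
    route_frame(state, "f3", "crowd", 0.99, taxonomy),  # not in the taxonomy
]
promote(state, "goal", ["f1"])
print(state.repository)
```

Keeping a human approval step between inbox and repository is what prevents a service from being retrained on its own mistakes.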
Parts of a successful content enrichment
1. By combining trained cognitive services, new valuable metadata can be retrieved from content
2. Automatic creation and use of this metadata must be included in existing processes
3. The quality of cognitive services and processes must be supervised
Training
- Rule-based configuration
- Batch learning
- Manual labeling
Cognitive Workflow Orchestration
- Cognitive workflow builder
- E2E Broadcast Integration (MAM, etc.)
Cognitive Workflow Operations
- Full integration into AREMA Operations Dashboards
- …
Cognitive Content Media Services
- General Domain Content Tagging
- Domain-specific Content Tagging (3rd party)
- Domain-specific Content Tagging (proprietary)
- Domain-specific Content Tagging (shared)
Elementary AI Services
- IBM Watson APIs: Speech-to-Text, NLC/NLU*, Visual Recogn., …
- 3rd Party APIs: Speech, Language, Visual, …
- Watson Media
- Knowledge Studio
Information Corpora
- Essence Files, Meta Data, Public Data, Other Data sources, …
Summary
• Comparing single cognitive services with each other is not adequate; what matters is a reasonable combination of
services
• The solution approach must start with the given use case, for which the solution is then defined and
customized
• AI will not take over all human work, but it will support the areas where automation is meaningful
• The process will be a mix of human and AI-based tasks and steps
• Sufficient solutions are created by trying out and optimizing, not by waiting for the perfect
technology.
While AI can’t fully match the human touch creatively, it can optimize workflows and
media processes to gain more value from content.

manishkhaire30
 
Intelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicineIntelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicine
AndrzejJarynowski
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
Roger Valdez
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
apvysm8
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
rwarrenll
 
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
nuttdpt
 
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
74nqk8xf
 
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
nuttdpt
 
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
v7oacc3l
 
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
Timothy Spann
 
State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023
kuntobimo2016
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
Lars Albertsson
 
Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...
Bill641377
 

Recently uploaded (20)

Global Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headedGlobal Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headed
 
DSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelinesDSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelines
 
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
 
The Ipsos - AI - Monitor 2024 Report.pdf
The  Ipsos - AI - Monitor 2024 Report.pdfThe  Ipsos - AI - Monitor 2024 Report.pdf
The Ipsos - AI - Monitor 2024 Report.pdf
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
 
A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
 
Intelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicineIntelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicine
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
 
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
 
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
 
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
 
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
 
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
 
State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
 
Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...
 

Rosinski: IBM AI Overview with Several Examples of Projects in the Media and Lessons Learned

  • 1. FIAT/IFTA Media Management Seminar “Game Changers? From Automation to Curation: Futureproofing AV Content” IBM AI Overview with several Examples of Projects in the Media and Lessons Learned Jakob Rosinski | Lead Architect Video Solutions & Broadcast Industry Europe Stockholm | 13.06.2018
  • 2. Abstract: This talk gives an overview of client projects in the space of media archives worldwide to which IBM has contributed with its own AI, named Watson, but also with its knowledge and integration capabilities. Major topics are scope definition and use case identification, and the use of cognitive services of different kinds and from different vendors, with successes and open problems. In such a multi-modal approach, training of services is also key, and the talk shows how this can be managed from both a human and a machine perspective. Speaker: Jakob is the Lead Architect for Video Solutions & Broadcast Industry for IBM Services in Europe. He is also the product owner of IBM AREMA, a workflow and essence management solution which is widely used at different broadcasters for essence archives and workflow automation. Over the last decade Jakob was responsible for various projects in the media industry at HBO, France24, ORF, SRF, RTL Mediengruppe and Deutsche Bundesliga/Sportcast. He is an expert in multi-site & multi-tier essence management and workflow automation for ingest, archive, production & distribution. He is also known and valued as a subject matter expert for the topics above in the worldwide IBM M&E community. He is skilled at translating business needs into systems solutions.
  • 3. Video Enrichment uses industry leading AI capabilities to analyze textual, audio, and visual data within multi-media content, and to build easily searchable metadata packages for every asset. By understanding content in new ways, media companies can improve content discovery, increase operational efficiency, deliver higher ad revenues, drive viewer engagement and offer entirely new ways to meet the demands of their businesses. Enriched content is inherently more searchable. Improved content discovery in your consumer service leads to increased usage.
  • 4. Cognitive base services used for content enrichment
      - Services: Visual Recognition; Audiomining & Speech to Text; NLU & Translation; Videodetection / Speed / Movement; Pattern Detection & Similarity Search
      - Enhanced and automated understanding of personalities present in the frame, and of objects
      - Activate decade-old material by running it through the STT API and then performing deeper analytics
      - Deeper understanding of concepts, recognized entities, keywords, and relationships
      - Search image and video data for untrained objects or contexts
      - Target: deeply enriched content, second by second
  • 5. A lot of vendors are providing base cognitive services... Visual Recognition; Audiomining & Speech to Text; NLU & Translation; Videodetection / Speed / Movement; Pattern Detection & Similarity Search
  • 10. Watson Video Enrichment Workflow (©2018 IBM Corporation, 27 June 2019, IBM Services). Assets are acquired, ingested, processed and enriched using the Watson Media platform:
      - SEMANTIC SCENE CHAPTERING: Divides the media into meaningful chunks or chapters that can be more easily managed by people responsible for editing or producing.
      - SPEECH TO TEXT: Converts audio into text by leveraging machine intelligence to combine information about grammar and language structure with knowledge of the composition of the audio signal. Trainable.
      - NATURAL LANGUAGE UNDERSTANDING: Using the textual output of speech to text (S2T) or a closed caption file, NLU derives concepts, document-level emotions, sentiment, entities, keywords, language, and taxonomy. Trainable.
      - VISUAL RECOGNITION: Detects the contents of an image or video frame, answering the question: “What is in this image?” Returns class, class description, face detection, and text recognition. Trainable.
      Enriched metadata is delivered to the customer MAM or DAM as an open JSON bundle to be stored and used for search, compliance, recommendation and other vital use cases.
  • 11. Watson Video Enrichment Workflow. Assets are acquired, ingested, processed and enriched using the Watson Media platform:
      - SEMANTIC SCENE CHAPTERING: Divides the media into meaningful chunks or chapters that can be more easily managed by people responsible for editing or producing.
      - SPEECH TO TEXT: Converts audio into text by leveraging machine intelligence to combine information about grammar and language structure with knowledge of the composition of the audio signal. Trainable.
      - NATURAL LANGUAGE UNDERSTANDING: Using the textual output of speech to text (S2T) or a closed caption file, NLU derives concepts, document-level emotions, sentiment, entities, keywords, language, and taxonomy. Trainable.
      - VISUAL RECOGNITION: Detects the contents of an image or video frame, answering the question: “What is in this image?” Returns class, class description, face detection, and text recognition. Trainable.
      - TONE ANALYZER & PERSONALITY INSIGHTS: Provide additional features that document the emotional tone, writing tone and social tone of dialogue, as well as the overall personalities of characters based on their words.
      Enriched metadata is delivered to the customer MAM or DAM as an open JSON bundle to be stored and used for search, compliance, recommendation and other vital use cases.
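The “open JSON bundle” named on this slide could look roughly like the following. This is a minimal sketch only: the `Annotation` class, the `build_bundle` function and all field names (`asset_id`, `enrichment`, `start`, `end`, `label`, `confidence`) are hypothetical illustrations and not the actual Watson Media schema.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Annotation:
    """One time-coded result from one enrichment service (hypothetical shape)."""
    service: str       # e.g. "speech_to_text" or "visual_recognition"
    start: float       # seconds from the start of the asset
    end: float
    label: str         # transcript text, detected class, keyword, ...
    confidence: float  # 0.0 .. 1.0


def build_bundle(asset_id, annotations):
    """Group annotations by service and serialize them as one JSON document."""
    by_service = {}
    for a in sorted(annotations, key=lambda a: (a.service, a.start)):
        entry = asdict(a)
        del entry["service"]  # the service name becomes the grouping key
        by_service.setdefault(a.service, []).append(entry)
    return json.dumps({"asset_id": asset_id, "enrichment": by_service}, indent=2)


if __name__ == "__main__":
    print(build_bundle("asset-001", [
        Annotation("visual_recognition", 0.0, 4.0, "stadium", 0.93),
        Annotation("speech_to_text", 1.2, 3.8, "welcome to the match", 0.88),
    ]))
```

A MAM or DAM could store such a document next to the asset and index the `label` values for search.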
  • 15. Analysis of successful trailers to automatically create a new one (https://www.youtube.com/watch?v=gJEzuYynaiw)
      - Scene detection
      - Deep video analysis
          - People, object and context detection
          - Classification of actors based on 24 emotions
          - Classification of scenes based on 22,000 categories
      - Deep audio analysis
          - Background
          - Actor sentiment and tone
      - Analysis of scene composition
          - Classification of light and color
  • 16. Content enrichment for Bundesliga archive
      - Concept and proof of an automatic content enrichment system for 40+ years of soccer history
          - Annotation using a portfolio of cognitive solutions
          - Audio: speech-to-text / transcript
          - Audio: speaker detection
          - Audio: atmosphere (cheers, whistles, ...)
          - Video: angle/camera & context detection
          - Video: face & object detection
          - Domain-trained services, including a training portal
          - Sharpening of results through domain knowledge, creation of timelines and identification of concepts
      - Link with game and player data
          - Optimize content analysis and search based on game and player statistics
          - Guided search, persona-based user experience
          - Personalized discovery, suggestions, design & projects
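The “creation of timelines” from several audio and video annotators can be pictured as bucketing time-coded labels into shared windows. The sketch below is an illustration under assumptions, not the project's implementation: the function name `build_timeline`, the tuple layout and the 10-second window are all hypothetical.

```python
from collections import defaultdict


def build_timeline(annotations, window=10.0):
    """Merge (time_sec, modality, label) tuples into fixed-size time windows.

    Returns a list of (window_start, {modality: [labels]}) sorted by time,
    so that audio and video findings for the same moment sit side by side.
    """
    buckets = defaultdict(lambda: defaultdict(set))
    for time_sec, modality, label in annotations:
        start = int(time_sec // window) * window
        buckets[start][modality].add(label)
    return [
        (start, {m: sorted(labels) for m, labels in mods.items()})
        for start, mods in sorted(buckets.items())
    ]


if __name__ == "__main__":
    sample = [
        (12.0, "audio", "cheer"),
        (14.5, "video", "goal celebration"),
        (47.0, "audio", "whistle"),
    ]
    for start, mods in build_timeline(sample):
        print(start, mods)
```

Windows in which several modalities agree (for example a cheer in the audio and a celebration in the video) are natural candidates to link with game and player data.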
  • 17. Content enrichment for Brazil's most famous TV show
      - Target: automatic content enrichment of 30+ years of show content
      - Annotation using a portfolio of cognitive solutions (IBM, OpenCV)
          - Audio: speech-to-text / transcript / phrase detection
          - Video: angle/camera & context detection
          - Video: face & object detection
      - Domain-trained services, including a training portal
      - Sharpening of results through domain knowledge, creation of timelines and identification of concepts
  • 18. Architecture of “Captain Caption” demo (AREMA)
      - Speech to Text
      - Deep Learning: Sound Recognition
      - Natural Language Understanding
      - Conform results into one closed caption file
      - Translation into target language
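The “conform results into one closed caption file” step can be sketched as merging speech segments and recognized sound events into a single SubRip (SRT) file. This is a minimal sketch under assumptions: the slide does not name the caption format, the function names and input tuples are hypothetical, and the NLU and translation steps are left out.

```python
def fmt(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def conform_to_srt(speech, sounds):
    """Merge speech segments and sound events into one SRT document.

    speech: list of (start_sec, end_sec, text) from a speech-to-text service
    sounds: list of (start_sec, end_sec, label) from a sound recognizer;
            these are rendered in brackets, e.g. "[applause]"
    """
    cues = list(speech)
    cues += [(start, end, f"[{label}]") for start, end, label in sounds]
    cues.sort(key=lambda c: (c[0], c[1]))
    blocks = [
        f"{i}\n{fmt(start)} --> {fmt(end)}\n{text}\n"
        for i, (start, end, text) in enumerate(cues, 1)
    ]
    return "\n".join(blocks)


if __name__ == "__main__":
    print(conform_to_srt(
        speech=[(0.0, 2.0, "Good evening."), (5.0, 7.5, "Here is the news.")],
        sounds=[(2.2, 4.0, "applause")],
    ))
```

Translation into the target language would then operate on the cue texts while leaving the timestamps unchanged.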
  • 19. Use case: “Implement a fully integrated, trained cognitive service to exchange ident in and out scenes”
      Context / Solution: Frame-accurate detection of trained frames of lead-in and lead-out scenes, to mark those scenes in the content and exchange them automatically in the master format without transcoding (unwrap, cut, wrap) and with appropriate audio track handling, enabling a fast channel switch of content.
      - Use of a self-developed detection component based on OpenCV and Watson VR for frame-accurate detection of scenes.
      - Use of AREMA's Dalet Galaxy integration to pull and push content directly from and to the MAM system; no need to extend Galaxy for this purpose.
      - Automatically scalable using the AREMA autoscaler in combination with Kubernetes & Docker.
      - Use of the AREMA MXF Package for:
          - metadata extraction from the source file
          - rewrapping / preparation of the audio track schema of the new scene
          - partial cut of the source file
          - conforming of all parts to the target file
        => very fast, no transcoding or change of audio and video streams
      Result:
      - Fully automated exchange of scenes, deeply integrated with the existing environment.
      - Nearly endlessly scalable, as all components can run in a Kubernetes/Docker environment. This leads to a significant reduction in time and staff effort and a faster change of content between programs => from 3 months (2 full-time persons) to days.
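Once the ident scenes have been detected frame-accurately, the exchange reduces to a cut list: keep the untouched source ranges and substitute a replacement clip for each detected range. The sketch below illustrates only that planning step. It is not the AREMA MXF Package API; `plan_exchange`, the half-open frame ranges and the part tuples are hypothetical.

```python
def plan_exchange(total_frames, detected, replacement_id):
    """Build an ordered part list for conforming the target file.

    detected: non-overlapping (first_frame, end_frame_exclusive) ranges of
              the ident scenes found in the source.
    Returns parts of the form ("source", start, end) for material that is
    copied unchanged and ("replacement", replacement_id) for each exchanged
    scene. No frame is re-encoded; the plan only describes where to cut.
    """
    parts = []
    cursor = 0
    for start, end in sorted(detected):
        if start < cursor or start >= end or end > total_frames:
            raise ValueError(f"invalid or overlapping range: {(start, end)}")
        if start > cursor:
            parts.append(("source", cursor, start))
        parts.append(("replacement", replacement_id))
        cursor = end
    if cursor < total_frames:
        parts.append(("source", cursor, total_frames))
    return parts


if __name__ == "__main__":
    # Ident at the head (frames 0-49) and at the tail (frames 900-999).
    for part in plan_exchange(1000, [(0, 50), (900, 1000)], "new_ident"):
        print(part)
```

Because the plan only references frame ranges, the untouched parts can be copied stream-for-stream, which is what makes an exchange without transcoding fast.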
  • 20. There is no One Size Fits All
      - Each use case of multimodal analysis has different requirements, so the workflows and the combination of AI services have to be adapted to these requirements.
      - This is where the following model provides the flexibility to adapt to each unique use case of multimodal analytics.
      - Vendor-independent usage of cognitive services.
      - The whole is greater than the sum of its parts (Aristotle), but sometimes particular “tiny” use cases are also worth evaluating.
      - Flexible MULTIMODALITY is a must.
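Vendor-independent usage of cognitive services is commonly achieved by hiding each vendor behind one small interface, so a workflow can combine or swap services per use case. The sketch below shows that idea with stand-in classes. Everything in it is hypothetical (the `CognitiveService` protocol, the fake services and their canned results); no real vendor API is called.

```python
from typing import Protocol


class CognitiveService(Protocol):
    """Minimal interface every wrapped vendor service must offer."""
    name: str

    def analyze(self, media_uri: str) -> list:
        """Return a list of result dicts for the given media."""
        ...


class FakeVisualService:
    """Stand-in for a visual recognition vendor (returns canned data)."""
    name = "vendor_a_visual"

    def analyze(self, media_uri):
        return [{"label": "stadium", "confidence": 0.91}]


class FakeSpeechService:
    """Stand-in for a speech-to-text vendor (returns canned data)."""
    name = "vendor_b_speech"

    def analyze(self, media_uri):
        return [{"text": "welcome to the match", "confidence": 0.87}]


def run_workflow(media_uri, services):
    """Run every configured service and collect results keyed by service name."""
    return {svc.name: svc.analyze(media_uri) for svc in services}


if __name__ == "__main__":
    # The service list is configuration: a different use case passes a different list.
    print(run_workflow("asset.mxf", [FakeVisualService(), FakeSpeechService()]))
```

Swapping a vendor then means adding one adapter class, while the workflow code stays unchanged.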
  • 21. Elemental parts of a content enrichment platform
      - Multi-Modality & Training & Vendor Independence
      - Data Consolidation & Monitoring
      - Integration & Workflow
  • 22. ... Why is training necessary?
  • 23. Why is training necessary?
      - How do we tell Will Ferrell (famous actor) apart from Chad Smith (famous rock musician)? Trust me, these are two different, unrelated people!
      - Challenges include:
          - Out-of-plane rotation: frontal, 45 degrees, profile, upside down
          - Presence of beard, mustache, glasses
          - Facial expressions
          - Occlusions by long hair or hands
          - In-plane rotation
          - Image conditions: size, lighting condition, distortion, noise, compression
      Source: https://medium.com/@ageitgey/machine-learning-is-fun-part-4-modern-face-recognition-with-deep-learning-c3cffc121d78
  • 24. A lot of vendors are providing base cognitive services... but without individual training they do not provide sufficient benefit. Automated TRAINING is a must.
      [Figure: accuracy versus training data size, with learning curves for the base model, the domain-specific model and the customer-adapted model. As the domain specializes, learning accelerates.]
      - Model tiers: base AI model, industry/domain AI model, customized user AI model
      - Base AI model: public models, pre-trained, limited accuracy for typical real-life use cases
      - Models trained with proprietary data: data ownership is critical for differentiation
  • 26. Cognitive Process with Trainer, Analysis Workflow and Aggregator
      Components: Cognitive Trainer, Cognitive Analysis Workflow, Cognitive Aggregator, Taxonomy Database, Image Classifier Inbox, Image Classifier Repository, Media Ingestion, Metadata Repository (MAM)
      1. Configure taxonomy (add classifiers, categories, etc.)
      2. Show and organize classifier images
      3. Move good classifiers to the repository to optimize training
      4. Use the classifier repository to train services and perform custom analysis
      5. Move the actual frame to the inbox when confidence is OK
      6. Use the taxonomy for rule creation
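Step 5, moving a frame to the inbox when the confidence is acceptable, can be pictured as a simple threshold on the classifier result; a human then reviews the inbox and promotes good examples to the repository (step 3). This is a minimal sketch under assumptions: the function name `route_frames`, the tuple layout and the 0.8 threshold are hypothetical, and the slide does not state how “confidence OK” is decided.

```python
def route_frames(detections, inbox_threshold=0.8):
    """Split detections into inbox candidates and discarded frames.

    detections: iterable of (frame_id, label, confidence) from the analysis
                workflow. Frames at or above the threshold go to the image
                classifier inbox for human review; the rest are not used
                as training candidates.
    """
    inbox, discarded = [], []
    for frame_id, label, confidence in detections:
        target = inbox if confidence >= inbox_threshold else discarded
        target.append((frame_id, label))
    return inbox, discarded


if __name__ == "__main__":
    inbox, discarded = route_frames([
        ("frame_0001", "presenter", 0.95),
        ("frame_0002", "presenter", 0.42),
        ("frame_0003", "studio_logo", 0.80),
    ])
    print("inbox:", inbox)
    print("discarded:", discarded)
```

Keeping a human between the inbox and the repository is what prevents confident but wrong detections from being fed back into training.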
  • 27. Parts of a successful content enrichment
      1. By combining trained cognitive services, new valuable metadata can be retrieved from content.
      2. Automatic creation and use of this metadata must be included in existing processes.
      3. The quality of cognitive services and processes must be supervised.
      [Architecture diagram; recoverable labels:]
      - Training: information corpora, rule-based configuration, batch learning, manual labeling
      - Cognitive Workflow Orchestration: cognitive workflow builder, E2E broadcast integration (MAM, etc.)
      - Cognitive Workflow Operations: full integration into AREMA operations dashboards, …
      - Elementary AI Services: IBM Watson APIs, 3rd party APIs; Speech-to-Text, NLC/NLU*, Visual Recognition, …
      - Cognitive Content Media Services: general domain content tagging; domain-specific content tagging (3rd party, proprietary, shared); Speech, Language, Visual, …
      - Watson Media Knowledge Studio
      - Data: essence files, metadata, public data, other data sources, …
  • 28. Summary
      - Comparing single cognitive services is not adequate; what matters is a reasonable combination of services.
      - The solution approach must start with the given use case, for which the solution is then defined and customized.
      - AI will not take over all human work, but will support it in areas where automation is meaningful.
      - The process will be a mix of human and AI-based tasks and steps.
      - Sufficient solutions will be created by trial and optimization, not by waiting for the perfect technology.
      While AI can’t fully match the human touch creatively, it can optimize workflows and media processes to gain more value from content.
  • 31. Notes and Sources
      - McCaskill, Steve. “Wimbledon 2018: AI Marries Tennis Tradition With Digital Innovation.” Forbes. July 2018. https://www.forbes.com/sites/stevemccaskill/2018/07/06/wimbledon-marries-innovation-with-tradition-in-use-of-ai/#7686e2d92198
      - Moore, Mike. “Wimbledon 2018: How IBM Watson is serving up the best viewer experience.” Tech Radar. July 2018. https://www.techradar.com/news/wimbledon-2018-how-ibm-watson-is-serving-up-the-best-viewer-experience
      - McCarthy, John. “IBM and Fox Sports lean on AI so fans can generate World Cup highlights packages.” The Drum. June 2018. https://www.thedrum.com/news/2018/06/06/ibm-and-fox-sports-lean-ai-so-fans-can-generate-world-cup-highlights-packages
      - Alvarez, Edgar. “Fox Sports’ World Cup Highlight Machine is powered by IBM’s Watson.” Engadget. June 2018. https://www.engadget.com/2018/06/04/fox-sports-world-cup-highlight-machine-ibm-watson
      - Chang, Lulu. “IBM’s Watson will make headlines at the Masters tournament.” Digital Trends. April 2018. https://www.digitaltrends.com/outdoors/ibm-watson-masters
      - Alexander, Julia. “Watch the first ever movie trailer made by artificial intelligence.” Polygon. September 2016. https://www.polygon.com/2016/9/1/12753298/morgan-trailer-artificial-intelligence
      - Smith, John R. “IBM Research takes Watson to Hollywood with the first ‘cognitive movie trailer.’” IBM. August 2016. https://www.ibm.com/blogs/think/2016/08/cognitive-movie-trailer
      - “Uncovering Dark Video Data with AI: How Watson Video Enrichment can provide better decision-making data and unlock new business possibilities in the media industry.” IBM. August 2017. https://public.dhe.ibm.com/common/ssi/ecm/me/en/mew03018usen/uncovering-dark-data_MEW03018USEN.pdf