The Personalization Toolkit
Metadata Essentials
Web Session, September 2015
Pancrazio Auteri, CTO
Kauser Kanji, Editor
Well-managed metadata is critical
Metadata: a lot of manual work
Metadata are often cumbersome and full of gaps
Rich and well-managed metadata make the difference and become a strategic part of your success!
Metadata we are talking about
Descriptive: Title, synopsis, duration, type, topics, cast, crew, genres, saga, collections, celebrities, age rating, year, country, languages, images…
Technical: Schedule times, channels, duration, resolutions, encoding profiles, channel lineups, backend systems…
Commercial: Business model, price, margin, constraints on availability, bundles, packaging, allowed discounts…
Timeline: Scene-level tags, start of closing credits, actor/character in scene, product placement…
Metadata is not a silo
Areas Impacted by Metadata
User Interface
Personalization
UI Autopilot
Analytics Audience Profiling
Catalog Consolidation
Viewership Forecasting
Advertising
User Engagement
Content Acquisition
Examples: Personalization
Similar content
Related content
Explanations
Developing stories
Next-to-play
Celebrity videos (teens love this)
Character-based (for kids)
Surfacing sports content
Collections
Recommendations
News topics
UX ENHANCEMENT ➜ IMPACT ON SERVICE KPIs
Next-to-play ➜ More playbacks, more ads
Welcome back ➜ Shorter time-to-content
Similar or related content ➜ Better catalog perception
Because you liked ➜ More trust
Targeted promotion ➜ Better conversion
Personalized e-mail ➜ Higher email open ratio
Personalized notification ➜ Higher return ratio
and many more!
Examples: User Interface
I S S U E S ➜ C O U N T E R M E A S U R E S
Data gaps ➜ Graceful degradation to fallback values
Delayed enrichment ➜ Fast lane and incremental re-publishing
Lack of links ➜ Use a knowledge graph
Modification rights ➜ Lock data fields by source
Publishing rights ➜ Select data source by UI context
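The first countermeasure, graceful degradation, can be sketched in a few lines. This is a minimal illustration only: the `FALLBACKS` table, the field names and the `display_value` helper are hypothetical and do not come from the session.

```python
from typing import Optional

# Hypothetical fallback values per field (illustrative names only).
FALLBACKS = {
    "synopsis": "No description available.",
    "image": "placeholder.png",
}

def display_value(item: dict, field: str) -> Optional[str]:
    """Return the item's own value, or a fallback when the field is missing or empty."""
    value = item.get(field)
    if value:
        return value
    # Data gap: degrade to the fallback (None if no fallback is defined).
    return FALLBACKS.get(field)

item = {"title": "Some Movie", "synopsis": ""}
for field in ("title", "synopsis", "image"):
    print(field, "->", display_value(item, field))
```

The UI always has something to render, and the gap can be filled later when enrichment arrives.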
Examples: Search
You can deliver a rich and solid core of metadata based on standardized language and reference dictionaries
But Search must work with the user’s language!
➜ Link UI words to core metadata values (natural language processing, dictionaries or other techniques)
Simplest example: MD.genre = “action” ⟷ CERCA (search): azione (Italian for “action”)
LANGUAGE: en, it, fr
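The "simplest example" above can be sketched with a plain dictionary lookup. This is an assumption-laden illustration: the `UI_WORDS` table, the tiny `CATALOG` and both function names are hypothetical, and a production system would more likely rely on natural language processing or reference dictionaries, as the slide notes.

```python
from typing import List, Optional

# Hypothetical per-language dictionary linking UI words to core genre values.
UI_WORDS = {
    "en": {"action": "action", "comedy": "comedy"},
    "it": {"azione": "action", "commedia": "comedy"},
    "fr": {"action": "action", "comédie": "comedy"},
}

# Hypothetical catalog whose genres use the standardized core vocabulary.
CATALOG = [
    {"title": "Movie A", "genre": "action"},
    {"title": "Movie B", "genre": "comedy"},
]

def to_core_genre(word: str, language: str) -> Optional[str]:
    """Translate a word typed in the user's language into the core genre value."""
    return UI_WORDS.get(language, {}).get(word.strip().lower())

def search_by_genre(word: str, language: str) -> List[str]:
    """Return the titles whose core genre matches the localized search word."""
    core = to_core_genre(word, language)
    if core is None:
        return []
    return [item["title"] for item in CATALOG if item["genre"] == core]

print(search_by_genre("azione", "it"))
```

The core metadata stays in one standardized language; only the lookup table grows when a new UI language is added.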
Examples: Search
Word localization, synonyms, misspelled words, nicknames…
A semantic approach is mandatory for natural language & voice search
Include a self-tuning set of relevancy boosters
‣ a show that is hot today may be irrelevant in a few weeks or months
‣ an immediate boost may be needed for news clips related to a developing story
‣ sports content may need external excitement factors or trends
Learn from audience behavior
‣ whether search is used depends on the user’s habits and the content type
‣ search is used in 3-10% of TV sessions; 18-25% on PC/tablet/phone (entertainment apps)
‣ voice search helps, but most users name specific entities (titles, person names, character names, channels, genres, topics)
‣ use a search-focused analytics dashboard
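One way to picture a relevancy booster that fades over time is an exponential decay on top of the base search score. The formula, the `half_life_days` parameter and the function name are assumptions made for illustration; the session does not specify how boosters are computed, and a self-tuning system would presumably learn the half-life per content type from audience behavior rather than hard-code it.

```python
def boosted_score(base_score: float, age_days: float,
                  half_life_days: float, peak_boost: float = 1.0) -> float:
    """Multiply the base score by a boost that halves every `half_life_days`."""
    decay = 0.5 ** (age_days / half_life_days)
    return base_score * (1.0 + peak_boost * decay)

# Hypothetical half-lives: a news clip cools down in a day, a show in a month.
print(boosted_score(1.0, age_days=2, half_life_days=1))   # news clip, two days old
print(boosted_score(1.0, age_days=2, half_life_days=30))  # show, two days old
```

With a short half-life a news clip gets an immediate boost that disappears within days, while a hot show keeps most of its boost for weeks.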
Examples: Cross-domain Recommendations
[Diagram: LIVE EVENT, VIDEO CLIP and APP items from EPG, Apps and Sports Highlights, connected through relationships]
RELATIONSHIPS: Channel (e.g. ESPN), Organization (e.g. NBA), Sport (e.g. Tennis), Tournament (e.g. World Cup), Sponsor (e.g. Nike)
RELATIONSHIPS: Match, Athlete, Sponsor
Examples: Dynamic Streams Personalization
Content in a flat list: no visual help to process so many things on screen.
Collections and micro-genres: easily scannable
Collection metadata + User Profile ➜ MATCH ➜ POSITION
Collections: Uplifting movies about teamwork; American movies starring Tom Hanks; French comedies set in Paris
POSITION: 3, 5, -
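The "collection metadata + user profile ➜ match ➜ position" idea can be sketched as a simple scoring step. Everything numeric here is hypothetical: the tags attached to each collection, the profile weights and the `min_score` cut-off are invented for illustration, and the matching used in a real product is not described in the session.

```python
from typing import Dict, List, Set

def match_score(collection_tags: Set[str], profile: Dict[str, float]) -> float:
    """Sum the user's affinity for every tag carried by the collection."""
    return sum(profile.get(tag, 0.0) for tag in collection_tags)

def rank_collections(collections: Dict[str, Set[str]],
                     profile: Dict[str, float],
                     min_score: float = 0.5) -> List[str]:
    """Order collections by match score; weak matches are not shown at all."""
    scored = [(match_score(tags, profile), name) for name, tags in collections.items()]
    kept = [(score, name) for score, name in scored if score >= min_score]
    kept.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in kept]

# Hypothetical collection metadata and user profile.
COLLECTIONS = {
    "Uplifting movies about teamwork": {"uplifting", "teamwork"},
    "American movies starring Tom Hanks": {"american", "tom-hanks"},
    "French comedies set in Paris": {"french", "comedy", "paris"},
}
PROFILE = {"uplifting": 0.6, "teamwork": 0.3, "tom-hanks": 0.5,
           "american": 0.1, "comedy": 0.2}

print(rank_collections(COLLECTIONS, PROFILE))
```

A collection that scores below the cut-off gets no position in the stream, much like the "-" entry on the slide.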
Handling Metadata
1. Blending data sources
2. Making data richer
3. Automating workflows
4. Validating automatic operations
5. Publishing changes faster
Multiple Data Sources: Reconciliation
[Diagram: one content item reconciled from several sources]
Sources: MD.Netflix, OFR.Netflix, MD.Commonsense, MD.Rovi, OFR.Catchup, MD.RottenTomatoes, MD.Gracenote, OFR.HBO-Go, OFR.EPG
Source kinds: Data + Offer (playable); “Data Only” Sources
Enrichment (Source Blending)
Sources: 1. Content feed; 2. Licensed source; 3. Critics Review; 4. Parental Ratings; 5. Other Language
Fields contributed across sources: Title (en, fr), Synopsis (en, fr), Genre, Year, Duration, Topics, Mood, Parental rating, Parental advisory, Critics score, Review, Images
[Diagram: each source fills part of the record; the blended record combines them]
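Source blending can be sketched as a field-by-field merge in which the first source in a priority list that has a non-empty value wins. The priority order, the source keys and the field names below are assumptions for illustration; the session does not state how conflicts between sources are resolved.

```python
from typing import Dict, Tuple

# Hypothetical priority order, highest first (keys mirror the five sources above).
SOURCE_PRIORITY = ["content_feed", "licensed_source", "critics_review",
                   "parental_ratings", "other_language"]

def blend(records: Dict[str, dict]) -> Tuple[dict, dict]:
    """Merge per-source records; also report which source supplied each field."""
    blended: dict = {}
    provenance: dict = {}
    for source in SOURCE_PRIORITY:
        for field, value in records.get(source, {}).items():
            # Skip gaps and fields already filled by a higher-priority source.
            if field not in blended and value not in (None, ""):
                blended[field] = value
                provenance[field] = source
    return blended, provenance

records = {
    "content_feed": {"title": "Movie A", "synopsis_en": "", "duration": 95},
    "licensed_source": {"title": "Movie A (2015)", "synopsis_en": "A story.", "year": 2015},
    "critics_review": {"critics_score": 87},
}
blended, provenance = blend(records)
print(blended)
print(provenance)
```

Keeping the provenance next to the blended record is also what makes it possible to lock a field by source or to select a source by UI context.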
Handling Metadata: 2. Making data richer
From Fragments to a Graph of Knowledge
[Diagram: people and content connected in a graph]
Node types: Movie, Episode, Season, Series, Special, Clip, Gossip Video, Talk Show, Channel, Brand, Schedule, franchise (James Bond)
Relationships: appearsIn, actorOf, interviewedIn, spouse (with validity periods: 2000..2005, 2015..), spinOff, sequelOf, franchise
Semantic Reasoning: Properties
appearsIn
actorOf
interviewedIn
hosts
directorOf
writerOf
isContributor
producerOf
More specific ➜ More generic
Works with “inconsistent” tagging!
[Diagram: the same knowledge graph as on the "From Fragments to a Graph of Knowledge" slide, with a producerOf relationship added]
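Reasoning over properties can be sketched as a walk up a property hierarchy: a query for a generic property also matches facts tagged with a more specific one. The `PARENT` mapping below is an assumed hierarchy built from the property names on the slide (the actual arrangement in the original diagram is not recoverable), and the people and titles are placeholders.

```python
from typing import Dict, List, Tuple

# ASSUMED hierarchy: each property points to its more generic parent.
PARENT: Dict[str, str] = {
    "actorOf": "appearsIn",
    "interviewedIn": "appearsIn",
    "hosts": "appearsIn",
    "appearsIn": "isContributor",
    "directorOf": "isContributor",
    "writerOf": "isContributor",
    "producerOf": "isContributor",
}

def generalizations(prop: str) -> List[str]:
    """Return the property followed by all of its more generic ancestors."""
    chain = [prop]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def query(triples: List[Tuple[str, str, str]], subject: str, prop: str) -> List[str]:
    """Find objects linked to `subject` by `prop` or by any more specific property."""
    return [obj for subj, p, obj in triples
            if subj == subject and prop in generalizations(p)]

# Placeholder facts, tagged at different levels of detail by different sources.
TRIPLES = [
    ("Person A", "actorOf", "Movie X"),
    ("Person A", "interviewedIn", "Talk Show Y"),
    ("Person A", "producerOf", "Movie Z"),
    ("Person B", "appearsIn", "Clip W"),
]

print(query(TRIPLES, "Person A", "appearsIn"))
print(query(TRIPLES, "Person A", "isContributor"))
```

Sources that tag at different levels of detail still answer the same question, which is why no re-tagging is needed.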
Semantic Reasoning: Types
Types: Agent Entity, Person, Actor, Athlete, Quarterback, Musician, Rapper, Organization, Music Label, Charity, Band
Apparently inconsistent tagging can still generate global value.
No re-tagging needed
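The same idea applies to entity types. In the sketch below, one source tags an entity as "Quarterback" and another as plain "Athlete", yet both are found when asking for athletes or for persons. The `SUPERTYPE` nesting is an assumption inferred from the type names on the slide, and the tagged entities are placeholders.

```python
from typing import Dict, List, Optional

# ASSUMED type hierarchy: each type points to its more generic supertype.
SUPERTYPE: Dict[str, str] = {
    "Quarterback": "Athlete", "Rapper": "Musician",
    "Actor": "Person", "Athlete": "Person", "Musician": "Person",
    "Music Label": "Organization", "Charity": "Organization", "Band": "Organization",
    "Person": "Agent Entity", "Organization": "Agent Entity",
}

def is_a(entity_type: Optional[str], wanted: str) -> bool:
    """True when `entity_type` equals `wanted` or is one of its subtypes."""
    while entity_type is not None:
        if entity_type == wanted:
            return True
        entity_type = SUPERTYPE.get(entity_type)
    return False

def entities_of_type(tagged: Dict[str, str], wanted: str) -> List[str]:
    """List entities whose tag is `wanted` or any more specific type."""
    return sorted(name for name, tag in tagged.items() if is_a(tag, wanted))

# Placeholder entities tagged with different granularity.
TAGGED = {"Entity 1": "Quarterback", "Entity 2": "Athlete",
          "Entity 3": "Rapper", "Entity 4": "Band"}

print(entities_of_type(TAGGED, "Athlete"))
print(entities_of_type(TAGGED, "Person"))
```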
Sports data structure
[Diagram: sports entities linked to playable content]
Entities: Sport, Association (NBA…), Tournament (World Cup), Match, Team, Athlete, Venue, University, Sponsor
Content: EPG Event, Clip, VOD Asset
Excitement Factor (external signal)
Handling Metadata: 3. Automating workflows
Knowledge Factory: Typical Workflow
Data Sources ➜ Knowledge Factory ➜ Metadata Core
Ingest ➜ Reconcile ➜ Deduplicate ➜ Enrich ➜ Validate ➜ Consolidate ➜ Export
Editorial and Operational Tools
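The workflow stages can be pictured as a chain of small functions, each taking the output of the previous one. This is a toy sketch, not a description of Knowledge Factory: the stage names follow the slide, but every function body, the matching key (normalized title plus year) and the sample records are assumptions.

```python
from functools import reduce

def ingest(sources):
    """Flatten the records of every data source into one list."""
    return [dict(rec, source=name) for name, recs in sources.items() for rec in recs]

def reconcile(records):
    """Assign a shared key so records describing the same item can be matched."""
    for rec in records:
        rec["key"] = (rec.get("title", "").strip().lower(), rec.get("year"))
    return records

def deduplicate(records):
    """Merge records with the same key; the first non-empty value of a field wins."""
    merged = {}
    for rec in records:
        target = merged.setdefault(rec["key"], {})
        for field, value in rec.items():
            if value not in (None, "") and field not in target:
                target[field] = value
    return list(merged.values())

def enrich(records):
    """Add a derived field (the decade) as a stand-in for real enrichment."""
    for rec in records:
        if rec.get("year"):
            rec["decade"] = rec["year"] // 10 * 10
    return records

def validate(records):
    """Keep only records that carry the mandatory fields."""
    return [rec for rec in records if rec.get("title") and rec.get("year")]

def consolidate(records):
    """Drop working fields and order the records deterministically."""
    cleaned = [{k: v for k, v in rec.items() if k not in ("key", "source")}
               for rec in records]
    return sorted(cleaned, key=lambda rec: rec["title"])

def export(records):
    """Return the payload handed to the metadata core."""
    return {"items": records}

STAGES = [ingest, reconcile, deduplicate, enrich, validate, consolidate, export]

def run(sources):
    """Pass the data through every stage in order."""
    return reduce(lambda data, stage: stage(data), STAGES, sources)

print(run({
    "feed": [{"title": "Movie A", "year": 1999}],
    "critics": [{"title": "movie a ", "year": 1999, "critics_score": 87}],
}))
```

Because each stage is a plain function, a stage can be replaced or an editorial check inserted without touching the others.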
Handling Metadata: 4. Validating automatic operations
Validation
Operations can be validated automatically using “confidence”
Otherwise, human editors validate the operation or correct it
The metadata management system should learn the correction and apply it to similar situations in the future.
Automatic operation ➜ CONFIDENCE
HI ➜ Publish
LO ➜ Ask human ➜ Publish
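The confidence gate and the learning step can be sketched as follows. The threshold value, the shape of an operation and the idea of remembering corrections in a dictionary keyed by field and proposed value are all assumptions for illustration; a real system would generalize corrections to "similar situations" far more cleverly.

```python
from typing import Dict, Tuple

CONFIDENCE_THRESHOLD = 0.8  # hypothetical cut-off between HI and LO

Learned = Dict[Tuple[str, str], str]

def route(operation: dict, learned: Learned) -> Tuple[str, str]:
    """Decide whether an automatic operation is published or sent to an editor."""
    key = (operation["field"], operation["proposed"])
    if key in learned:
        # An editor already handled this case: reuse the decision.
        return "publish", learned[key]
    if operation["confidence"] >= CONFIDENCE_THRESHOLD:
        return "publish", operation["proposed"]
    return "ask_human", operation["proposed"]

def record_review(operation: dict, final_value: str, learned: Learned) -> None:
    """Store the editor's validated or corrected value for future operations."""
    learned[(operation["field"], operation["proposed"])] = final_value

learned: Learned = {}
op = {"field": "genre", "proposed": "sci fi", "confidence": 0.4}
print(route(op, learned))                      # low confidence: ask a human
record_review(op, "science fiction", learned)  # the editor corrects the value
print(route(op, learned))                      # next time it is published directly
```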
Handling Metadata: 5. Publishing changes faster
Fast lane and continuous data re-publishing
TIME ➜ Basic data arrives ➜ Quickly published to clients ➜ Data gets enriched ➜ Enriched data re-published to clients (via API)
Example: The Colbert Report, 8:00 PM - Comedy Central
Events over time: Data ingestion; Video clips generated; Catch-up version created
Additional metadata: Hi-def show image; Guest celebrity; Frame from the show
Timeline tags: PRODUCT PLACEMENT, CELEBRITY, CLOSING CREDITS, ADS (markers at 1:25, 4:20, 6:00, 9:30)
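The fast lane can be sketched as a publisher that pushes whatever is known immediately and then re-publishes only the fields that changed. The `Publisher` class, its in-memory `log` (standing in for pushes to clients via an API) and the version counter are hypothetical constructs for illustration.

```python
from typing import Dict, List, Optional

class Publisher:
    """Toy publisher: publishes basic data at once, then sends incremental deltas."""

    def __init__(self) -> None:
        self.published: Dict[str, dict] = {}
        self.version: Dict[str, int] = {}
        self.log: List[dict] = []  # stands in for API pushes to clients

    def publish(self, item_id: str, fields: dict) -> Optional[dict]:
        """Publish only new or changed fields; return the delta, or None if nothing changed."""
        current = self.published.setdefault(item_id, {})
        changed = {k: v for k, v in fields.items() if current.get(k) != v}
        if not changed:
            return None
        current.update(changed)
        self.version[item_id] = self.version.get(item_id, 0) + 1
        delta = {"id": item_id, "version": self.version[item_id], "changed": changed}
        self.log.append(delta)
        return delta

publisher = Publisher()
# Fast lane: basic schedule data goes out as soon as it arrives.
print(publisher.publish("show-1", {"title": "The Colbert Report",
                                   "start": "8:00 PM", "channel": "Comedy Central"}))
# Later: enrichment adds fields, and only the new ones are re-published.
print(publisher.publish("show-1", {"title": "The Colbert Report",
                                   "guest": "Guest celebrity"}))
```

Clients see the item right away and receive small updates as images, guests or timeline tags become available.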
Thank you!
www.contentwise.tv
For more information on Knowledge Factory, UI Autopilot and Personalization, please visit our website or contact us
Digital TV. Personalized.
info@contentwise.tv

Metadata for Online TV & Video - by ContentWise
