Taxonomy: Deep Dive
Presented by: Chinmay Panda
Date: 25-03-2019
Product Innovation Session 3
Problem statement
Improve IAB 2.0 categorisation accuracy
Taxonomy Structures
Sprig:
“A complete list of content categories returned for the ‘classifyText’ method are shown below.
The Natural Language API filters the categories returned by the ‘classifyText’ method to include only the most relevant
categories for a request.”
IAB:
“The working group was conscious that the taxonomy should be leveraged by publishers and advertisers to describe
the topical “aboutness” of content with the primary purpose of facilitating relevant, brand safe, and effective advertising.
The goal for the group was to create an enhanced and more powerful taxonomy, enabling content creators to
more accurately and consistently describe content, facilitating more relevant advertising and providing a higher quality
and more granular foundation for data analysis.”
So Let’s Understand Content
How to classify News?
IPTC MediaTopics
“The IPTC is the global standards body of the news media. We provide the technical
foundation for the news ecosystem.”
IPTC Standards
Comparison
The answer lies in what they miss
Summary
The crux of the matter is
❏ Taxonomy structures are designed to classify content, not news/information
❏ “Publisher” in advertising means content publisher, not news publisher
❏ IAB and Sprig are both designed to cater to advertising, not news webpages
❏ This also explains why contextual keyword types have always struggled in news domains
Solution
[Diagram: URL; IPTC MediaTopics; IPTC - IAB Mapping (Custom); IAB; Publisher; Advertiser; Contextual product]
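One way the custom IPTC - IAB mapping could work is a simple lookup table with a fallback to parent topics. The sketch below is only an illustration under assumptions: the table `IPTC_TO_IAB`, the function `map_iptc_to_iab`, the slash-separated topic paths, and every label in the table are hypothetical and are not taken from the official IPTC or IAB lists or from the mapping used in this project.

```python
from typing import Dict, Optional

# Hypothetical, hand-written mapping. The labels are illustrative only and
# are NOT copied from the official IPTC MediaTopics or IAB category lists.
IPTC_TO_IAB: Dict[str, str] = {
    "politics": "News and Politics",
    "politics/election": "News and Politics",
    "sport": "Sports",
    "sport/soccer": "Sports/Soccer",
    "economy, business and finance": "Business and Finance",
}


def map_iptc_to_iab(
    iptc_path: str, mapping: Dict[str, str] = IPTC_TO_IAB
) -> Optional[str]:
    """Return the IAB label for an IPTC topic path.

    If the exact path is not in the table, walk up to the parent topic
    (drop the last path segment) until a match is found. Returns None
    when no ancestor is mapped.
    """
    parts = iptc_path.split("/")
    while parts:
        key = "/".join(parts)
        if key in mapping:
            return mapping[key]
        parts.pop()  # fall back to the parent topic
    return None


if __name__ == "__main__":
    for topic in ("sport/soccer", "sport/tennis", "weather"):
        print(topic, "->", map_iptc_to_iab(topic))
```

The parent fallback is a design choice for this sketch: a news topic that has no direct IAB counterpart still lands in the closest broader category instead of being left unclassified.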
Some Expert Advice
https://medium.com/kontikilabs/comparing-machine-learning-ml-services-from-various-cloud-ml-service-providers-63c8a2626cb6
https://www.hasaltaiar.com.au/microsoft-cognitive-service-for-text-analysis-vs-google-natural-language-service/
However...
Everyone knows the limitations of IAB categories,
so let’s set that aside and try to achieve
the maximum accuracy level with the current structure.
The more important question is
Solution
Deep Learning
Solution
(To be presented in P.I.S-4)
Let’s Discuss Data
https://docs.google.com/spreadsheets/d/1yh8TJpqfdJMnbuYwJmsKqpCMHH9VXGEWttAyEMMbZHU/edit?usp=sharing
