Taxonomy at AOL: Classifying the parts of a whole
Noel Agnew (@noelagnewny), Ashley Marty (@ashleykmarty)
June 09, 2011
The problem: Aol did not have a common vocabulary
56+ Media brands, including: DAM New York 2011 Page 3
Multiple ad systems and content platforms
  • Content platforms: Blogsmith, Huffington Post (Movable Type), 5min, Truveo, StudioNow
  • Some ad systems: AdTech, Advertising.com, Feedpoint/Dynamic Banners
All speaking different languages…
  • Tag.aol.com “beyonce”
  • Tag… “beyonceknowles”
  • AOL Music “beyonce”
  • AOL music “beyonceknowles”
  • Moviefone “beyonceknowles”
  • Huffington Post “beyonce”
  • H… Post “beyonceknowles”
What we were asked to do
Effectively and granularly classify content:
  • For improved ad sales
  • To relate content within and between the brands
  • In some cases, to assist editors with external-facing tags
  • All sorts of other bits of magic (which will be touched on later)
The solution: Classify all AOL content in the same way
Faceted Ontology
“…structural frameworks for organizing information on the semantic Web and within semantic enterprises. They provide unique benefits in discovery, flexible access, and information integration due to their inherent connectedness; that is, their ability to represent conceptual relationships.”
- M.K. Bergman, “An Executive Intro to Ontologies” http://www.mkbergman.com/900/an-executive-intro-to-ontologies/
Subjects
  • We have approx. 6800 subjects
  • Generally hierarchical, but some associative relationships
  • Iterative process with editors (subject specialists)
  • 12 top levels (or classes): Arts and Humanities, Education, Entertainment, Health and Medicine, Lifestyle, Money and Finance, News and Politics, Science and Tech, Social Sciences, Sports, Transportation, Travel and Tourism
Entities
  • Proper nouns (specific persons, places, things)
  • Not hierarchical, but rather associative relationships
  • 7 entity vocabularies: Named Things (includes persons), Locations, Works, Events, Groups, Brands, Products
Taxonomy/ontology mashup
Diagram labels: Sprint, HTC Evo 4G, OSX, iPhone, Verizon, Apple, AT&T
Making it work
HELLO TEL AVIV!
When we were tasked with this, we had very little direct communication with the team in Tel Aviv that runs the classification engine… We were also under the impression that auto-classification was their issue and they’d just have to classify with whatever we gave them. This was WRONG!
Train in vain?
‘Women's Shoes’
We had to find training data for each subject in the taxonomy… and are continually doing so to improve classification.
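Why every subject needs its own training data can be illustrated with a toy example. The sketch below is not the classification engine described in the talk; it is a simple term-counting stand-in, and the training snippets, the stopword list, and the function names are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical training snippets, one list per subject.
TRAINING = {
    "Women's Shoes": [
        "new heels and flats for spring",
        "best sandals and heels this season",
    ],
    "Baseball": [
        "the world series goes to game seven",
        "pitcher throws a no-hitter in the series",
    ],
}

# Tiny illustrative stopword list so common words do not count as evidence.
STOPWORDS = {"the", "and", "a", "in", "to", "for", "this"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOPWORDS]


def train(training):
    """Count how often each term appears in the examples for each subject."""
    return {
        subject: Counter(tok for doc in docs for tok in tokenize(doc))
        for subject, docs in training.items()
    }


def classify(text, model):
    """Return the subject whose training terms overlap most, or None."""
    tokens = tokenize(text)
    scores = {
        subject: sum(counts[tok] for tok in tokens)
        for subject, counts in model.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


model = train(TRAINING)
print(classify("These flats pair well with heels", model))
```

A subject with no training examples can never win in this sketch, which is the practical reason each subject in the taxonomy needs data of its own.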
Getting to Know You
More Contact with the Classification Team
  • Providing feedback on tagging results
  • Collaborating on priorities
  • What data is most valuable to the tagger?
Turning large amounts of data into an ontology
  • More data sources means multiple records for the same entity
  • More sources = more effort required in merging records
  • After merge, one record remains with metadata and relationships from all sources
  • More sources = more valuable records
Example record:
  Name: Beyoncé
  Node types: MusicPerson, MoviePerson
  Alias (synonym): Beyonce Knowles
  Alias (synonym): Beyonce
  Source: Wikipedia
  Source: AolMusicDB
  Source: AolMovieDB
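The merge step can be sketched as follows. This is a minimal illustration under stated assumptions, not the team's actual pipeline: the `EntityRecord` class, the `merge_records` function, the "first source wins" rule for conflicting metadata, and the split of names and node types across the three sources are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EntityRecord:
    """One source's view of an entity (field names are assumptions)."""
    name: str
    node_types: set = field(default_factory=set)
    aliases: set = field(default_factory=set)
    sources: set = field(default_factory=set)
    metadata: dict = field(default_factory=dict)


def merge_records(records):
    """Fold several source records for the same entity into one record.

    The first record supplies the preferred name; names used by the
    other records become aliases.
    """
    if not records:
        raise ValueError("need at least one record to merge")
    merged = EntityRecord(name=records[0].name)
    for rec in records:
        if rec.name != merged.name:
            merged.aliases.add(rec.name)
        merged.node_types |= rec.node_types
        merged.aliases |= rec.aliases
        merged.sources |= rec.sources
        for key, value in rec.metadata.items():
            # Assumed conflict rule: keep the value from the earliest source.
            merged.metadata.setdefault(key, value)
    merged.aliases.discard(merged.name)
    return merged


# Hypothetical per-source records for the entity on the slide.
records = [
    EntityRecord("Beyoncé", aliases={"Beyonce Knowles"}, sources={"Wikipedia"}),
    EntityRecord("Beyonce", node_types={"MusicPerson"}, sources={"AolMusicDB"}),
    EntityRecord("Beyonce Knowles", node_types={"MoviePerson"}, sources={"AolMovieDB"}),
]
beyonce = merge_records(records)
print(beyonce)
```

Deciding which records refer to the same entity in the first place is the harder part and is outside this sketch; here the caller is assumed to have grouped them already.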
Where we are now
Integrating with Advertising systems
Our subjects can be mapped to Advertising categories to serve ads for related products
Current Department Store campaign:
Recommending Tags for Editorial
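One way to picture the subject-to-advertising mapping is a lookup that falls back to broader subjects. The sketch below is an assumption for illustration only: the hierarchy slice, the ad category names, and the walk-up-to-parent rule are hypothetical and are not taken from the talk.

```python
# Hypothetical slice of the subject hierarchy: child -> parent.
PARENT = {
    "Women's Shoes": "Shoes",
    "Shoes": "Fashion",
    "Fashion": "Lifestyle",
}

# Hypothetical mapping from subjects to advertising categories.
AD_CATEGORY = {
    "Shoes": "Footwear",
    "Lifestyle": "General Lifestyle",
}


def ad_category_for(subject):
    """Return the ad category of the subject or of its nearest mapped ancestor."""
    seen = set()
    while subject is not None and subject not in seen:
        if subject in AD_CATEGORY:
            return AD_CATEGORY[subject]
        seen.add(subject)  # guards against an accidental cycle in PARENT
        subject = PARENT.get(subject)
    return None


print(ad_category_for("Women's Shoes"))
```

With a fallback like this, only some subjects need an explicit advertising mapping; the rest inherit one from a broader subject.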
Where we’re going
On the Roadmap…
  • More projects with Advertising teams
  • More data in our ontology to make classification better
  • Refining the ontology, because it’s a living thing
Lessons learned
Life lessons…
  • Keep your eye on the prize
  • Expect people to think this is a much smaller task than it is
  • Don’t reinvent the wheel
  • Never underestimate the power of the ability to manipulate data


Editor's Notes

  1. How many of you knew that all of these are owned by Aol? How many of these were purchased since we started the taxonomy process?
  2. Photo platform (mention it). At a minimum, 3 ad systems that we’ve had to deal with.
  3. url to link out here
  4. (1) Ad sales: so products with some relation to the article can be served. (2) Relating content: within a brand, e.g. someone on Aol Music can see all Beyonce articles; between brands, e.g. see Beyonce articles on Moviefone, Stylelist, Popeater, which keeps people on Aol sites instead of linking out. (3) Assist editors: standardize tags so content is not lost for lack of relationships; it can’t be found if it is not tagged properly.
  5. Difference between taxonomy and ontology.
  6. Be flexible and remember your purpose (for us it’s Aol content). Subjects may be called topics/categories in other places. Subjects describe the ‘aboutness’ of an article, e.g. a report on the World Series is about ‘Baseball’; an article about the best airlines is about ‘Air Travel’.
  7. We have around 3.8 million and counting. Together, subjects and entities make up the taxonomy.
  8. More contact with the classification team: providing feedback on tagging results, collaborating on priorities, focusing on what is most valuable to the tagger.
  9. Mix of NLP and machine learning. Picks up important related terms that imply content is about a subject (heels, flats, etc.), brands, etc. Mention that extracted entities can now actually improve subject tagging. DMOZ: voluntary human-edited directory of the web; lists of websites by subject.
  10. One record will have multiple node types; aliases and metadata will be brought together: albums, date of birth, married to, spokesperson for brand. Very rich records result: an opportunity to create multiple relationships.
  11. Subjects and entities. We met with teams; one thing they liked was that they could tag a ‘master version’ with a taxonomy ID. This brings all articles mentioning ‘Charlie Sheen’ together, just like the Beyonce example, instead of different versions like charliesheen, charlie sheen, charlie+sheen.
  12. Need title
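The ‘master version’ idea in note 11 can be sketched as a lookup from tag variants to a single taxonomy ID. This is an illustrative sketch only: the normalization rule, the `ALIAS_TO_ID` table, and the ID strings are hypothetical and do not describe Aol's actual scheme.

```python
import re
import unicodedata


def normalize_tag(tag):
    """Reduce a free-form tag to a comparison key.

    Strips accents, lowercases, and drops everything except letters and
    digits, so 'charlie sheen', 'charlie+sheen' and 'charliesheen' collide.
    """
    decomposed = unicodedata.normalize("NFKD", tag)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return re.sub(r"[^a-z0-9]", "", without_accents.lower())


# Hypothetical alias table: normalized alias -> made-up taxonomy ID.
ALIAS_TO_ID = {
    normalize_tag("Charlie Sheen"): "entity:0001",
    normalize_tag("Beyoncé"): "entity:0002",
    normalize_tag("Beyonce Knowles"): "entity:0002",
}


def taxonomy_id(tag):
    """Return the taxonomy ID for a tag variant, or None if it is unknown."""
    return ALIAS_TO_ID.get(normalize_tag(tag))


for variant in ("charliesheen", "charlie sheen", "charlie+sheen"):
    print(variant, "->", taxonomy_id(variant))
```

String normalization alone would not connect “beyonce” with “beyonceknowles”; that link comes from the alias entries on the merged entity record, which is why both aliases appear in the table.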