NLP applications in search
Dr. Eoin Hurrell
@eoinhurrell
eoinhurrell@gmail.com
http://cohort.is
Overview
- What is Cohort?
- Problem definition: What kind of search and understanding?
- Our solution: a mix of old and new
- Evaluation
- Current usage
What is Cohort?
“Cohort helps you find the people you need through the people you know”
- (Qualified) Second degree social network search
- We search your network for relevant people whom a close friend could introduce you to

A Data Product in Cohort
Understanding "asks" - social feed posts that are also searches
Asks to concepts
- We have a number of ways of classifying people as having interests
- By classifying queries as asking for interests we have 'dimensionality reduction as
query expansion'
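As a rough illustration of 'dimensionality reduction as query expansion', the sketch below compares a literal keyword match with a match on interest tags. The people, bios, and the `keyword_search` and `interest_search` helpers are all hypothetical, made up for this example; they are not Cohort's data or code.

```python
# Hypothetical people with free-text bios and interest tags (illustrative only).
PEOPLE = {
    "alice": {"bio": "Django contributor and backend engineer",
              "interests": {"Python (language)", "Code"}},
    "bob": {"bio": "python hacker, open to contracts",
            "interests": {"Python (language)"}},
    "carol": {"bio": "Pastry chef",
              "interests": {"Cooking"}},
}


def keyword_search(term):
    """Return people whose bio literally contains the search term."""
    return sorted(name for name, person in PEOPLE.items()
                  if term.lower() in person["bio"].lower())


def interest_search(query_interests):
    """Return people who share at least one interest with the query."""
    wanted = set(query_interests)
    return sorted(name for name, person in PEOPLE.items()
                  if person["interests"] & wanted)


print(keyword_search("python hacker"))         # literal match only
print(interest_search({"Python (language)"}))  # match on the shared concept
```

Collapsing both the query and the people onto a small set of interests lets the search reach alice, whose bio never mentions the query words.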
More Generally
How can we expand the scope of a search to return
more (and more relevant) results?
The Old
Only some elements of the text signify the information need. Part-of-speech
tagging and named entity recognition are used to filter out some noise
"I'm looking for a python hacker for some remote work based anywhere"
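A minimal sketch of the filtering idea, assuming that only nouns and adjectives carry the information need. The `TOY_POS` lexicon is a hand-written stand-in for a real part-of-speech tagger, and `content_phrases` is a hypothetical helper; neither comes from the talk.

```python
# Toy part-of-speech lexicon: an assumption standing in for a real tagger.
TOY_POS = {
    "i'm": "PRON", "looking": "VERB", "for": "ADP", "a": "DET",
    "python": "NOUN", "hacker": "NOUN", "some": "DET",
    "remote": "ADJ", "work": "NOUN", "based": "VERB", "anywhere": "ADV",
}
KEEP_TAGS = {"NOUN", "ADJ"}  # assumed to signal the information need


def content_phrases(text):
    """Group consecutive NOUN/ADJ tokens into candidate search phrases."""
    phrases, current = [], []
    for token in text.lower().split():
        word = token.strip('".,!?')
        if TOY_POS.get(word) in KEEP_TAGS:
            current.append(word)
        elif current:
            # A filtered-out word ends the current phrase.
            phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases


ask = "I'm looking for a python hacker for some remote work based anywhere"
print(content_phrases(ask))  # ['python hacker', 'remote work']
```

In practice the lexicon would be replaced by a trained tagger, with named entity recognition used alongside it as the slide describes.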
The New (sorta)
- (Compound) word vectors as concepts: Python (language), Code, Remote Working
"I'm looking for a python hacker for some remote work based anywhere"
Query = Free-text: python hacker, remote work
Interests: Python (language), Code, Remote Working
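One way to picture the mapping from free-text terms to interests is a nearest-concept lookup by cosine similarity. The sketch below uses made-up 3-dimensional vectors and an assumed threshold of 0.8; real word vectors have hundreds of dimensions, and the talk does not say how terms were matched to concepts.

```python
import math

# Hypothetical toy vectors, chosen by hand for illustration.
TERM_VECTORS = {
    "python hacker": [0.9, 0.8, 0.1],
    "remote work": [0.1, 0.2, 0.9],
}
CONCEPT_VECTORS = {
    "Python (language)": [1.0, 0.5, 0.0],
    "Code": [0.6, 1.0, 0.1],
    "Remote Working": [0.0, 0.1, 1.0],
    "Cooking": [-0.5, 0.0, 0.1],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def concepts_for(term, threshold=0.8):
    """Return the concepts whose vector is close enough to the term's vector."""
    vec = TERM_VECTORS[term]
    return [name for name, cvec in CONCEPT_VECTORS.items()
            if cosine(vec, cvec) >= threshold]


def build_query(terms):
    """Combine the free-text terms with the interests they map to."""
    interests = []
    for term in terms:
        for concept in concepts_for(term):
            if concept not in interests:
                interests.append(concept)
    return {"free_text": list(terms), "interests": interests}


print(build_query(["python hacker", "remote work"]))
```

With these toy vectors the resulting query has the same shape as the one on the slide: the free-text terms plus the interests Python (language), Code, and Remote Working.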
Word vector training data
The model needs to understand technical roles, web technologies and similar text.
Start with a model pre-trained on Google News vectors, then continue training on Hacker News data.
Evaluation
Tested against human-judged concepts for 542 real asks
F1-score: 87.06%
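For reference, an F1-score over predicted and human-judged concept sets could be computed as below. This sketch assumes micro-averaging across asks, which the slides do not specify, and the example asks are invented purely to exercise the function.

```python
def micro_f1(predicted, gold):
    """Micro-averaged F1 over per-ask sets of predicted and gold concepts."""
    tp = fp = fn = 0
    for pred, true in zip(predicted, gold):
        pred, true = set(pred), set(true)
        tp += len(pred & true)   # concepts correctly predicted
        fp += len(pred - true)   # predicted but not judged relevant
        fn += len(true - pred)   # judged relevant but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented example data: two asks, not taken from the evaluation set.
predicted = [["Python (language)", "Code", "Remote Working"], ["Design"]]
gold = [["Python (language)", "Remote Working"], ["Design", "Branding"]]
print(micro_f1(predicted, gold))
```

Here three concepts are correct, one is spurious, and one is missed, so precision and recall are both 0.75.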
Current status
The app today makes use of a more explicit exploratory search interface and uses
word vectors to enrich information searched over.
Code
Move from understanding concepts in asks to understanding concepts in tweets, and
using that to improve our interest tagging.
Summary
- An example of a data product
- Understanding text to improve search
- Mixing traditional NLP techniques with deep learning
- Evaluation
- Current usage within Cohort
Questions?
