NLP applications in search
Dr. Eoin Hurrell
@eoinhurrell
eoinhurrell@gmail.com
http://cohort.is
Overview
- What is Cohort?
- Problem definition: What kind of search and understanding?
- Our solution: a mix of old and new
- Evaluation
- Current usage
What is Cohort?
“Cohort helps you find the people you need through the people you know”
- (Qualified) Second degree social network search
- We search your network for relevant people whom a close friend could introduce you to

A Data Product in Cohort
Understanding "asks" - social feed posts that are also searches
Asks to concepts
- We have a number of ways of classifying people as having interests
- By classifying queries as asking for interests we have 'dimensionality reduction as query expansion'
More Generally
How can we expand the scope of a search to return more (and more relevant) results?
The Old
Only some elements of the text signify the information need. Part-of-speech
tagging and named entity recognition are used to filter out some of the noise.
"I'm looking for a python hacker for some remote work based anywhere"
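The filtering step can be sketched roughly as follows. This is a minimal illustration, not Cohort's implementation: `TOY_POS` is a hand-written lookup table standing in for a trained part-of-speech tagger, and `KEEP_TAGS` and `content_phrases` are hypothetical names chosen for this example. Named entity recognition is left out.

```python
# Toy stand-in for a trained part-of-speech tagger (assumption for illustration).
TOY_POS = {
    "i'm": "PRON", "looking": "VERB", "for": "ADP", "a": "DET",
    "python": "NOUN", "hacker": "NOUN", "some": "DET", "remote": "ADJ",
    "work": "NOUN", "based": "VERB", "anywhere": "ADV",
}

# Tags assumed to carry the information need; everything else is treated as noise.
KEEP_TAGS = {"NOUN", "PROPN", "ADJ"}


def content_phrases(text):
    """Group consecutive content-bearing tokens into candidate phrases."""
    phrases, current = [], []
    for token in text.lower().split():
        token = token.strip('.,!?"')
        if TOY_POS.get(token) in KEEP_TAGS:
            current.append(token)
        elif current:
            # A noise token closes the phrase collected so far.
            phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases


ask = "I'm looking for a python hacker for some remote work based anywhere"
phrases = content_phrases(ask)
print(phrases)  # ['python hacker', 'remote work']
```

With a real tagger the lookup table would be replaced by the tagger's output, but the grouping of adjacent content tokens would stay the same.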
The New (sorta)
- (Compound) word vectors as concepts, e.g. Python (language), Code, Remote Working
"I'm looking for a python hacker for some remote work based anywhere"
Query = Free-text: python hacker, remote work
        Interests: Python (language), Code, Remote Working
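One way to picture the mapping from free-text phrases to interest concepts is a nearest-concept lookup by cosine similarity. The sketch below is an assumption about how such a step could work, not the method used in Cohort: the 3-dimensional vectors are made up, a compound vector is taken to be the average of its word vectors, and `concepts_for`, the `threshold` value and the "Cooking" concept are hypothetical.

```python
import math

# Made-up 3-dimensional vectors (illustration only); real word vectors are
# learned from text and have hundreds of dimensions.
WORD_VECS = {
    "python": [0.9, 0.1, 0.0],
    "hacker": [0.7, 0.3, 0.0],
    "remote": [0.0, 0.1, 0.9],
    "work": [0.1, 0.2, 0.8],
}
CONCEPT_VECS = {
    "Python (language)": [0.95, 0.05, 0.0],
    "Code": [0.7, 0.3, 0.05],
    "Remote Working": [0.05, 0.15, 0.85],
    "Cooking": [0.0, 1.0, 0.0],
}


def compound_vector(phrase):
    """Average the vectors of the known words in a phrase."""
    vecs = [WORD_VECS[w] for w in phrase.split() if w in WORD_VECS]
    if not vecs:
        return None
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def concepts_for(phrase, threshold=0.95):
    """Return every concept whose vector is close enough to the phrase vector."""
    vec = compound_vector(phrase)
    if vec is None:
        return []
    return [name for name, cvec in CONCEPT_VECS.items()
            if cosine(vec, cvec) >= threshold]


free_text = ["python hacker", "remote work"]
interests = []
for phrase in free_text:
    for concept in concepts_for(phrase):
        if concept not in interests:
            interests.append(concept)

query = {"free_text": free_text, "interests": interests}
print(query["interests"])  # ['Python (language)', 'Code', 'Remote Working']
```

The resulting query carries both the original phrases and the matched interests, so a search can match people tagged with an interest even when their profile never uses the words in the ask.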
Word vector training data
Technical roles, web technologies and similar text to understand
Start with a model initialised with Google News vectors, then continue training on HackerNews data
Evaluation
Tested against human-judged concepts for 542 real asks
F1-score: 87.06%
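For reference, an F1-score over sets of concepts can be computed as in the sketch below. The slides do not say how the score was averaged across asks, so micro-averaging is an assumption here, and the two example asks are invented data rather than part of the 542-ask test set.

```python
def micro_f1(predicted, gold):
    """Micro-averaged F1 over per-ask sets of predicted and human-judged concepts."""
    tp = fp = fn = 0
    for pred, truth in zip(predicted, gold):
        tp += len(pred & truth)   # concepts found and judged correct
        fp += len(pred - truth)   # concepts found but not in the human judgement
        fn += len(truth - pred)   # judged concepts that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented example: two asks, one spurious concept in the second.
predicted = [{"Python (language)", "Code"}, {"Remote Working", "Design"}]
gold = [{"Python (language)", "Code"}, {"Remote Working"}]

score = micro_f1(predicted, gold)
print(f"{score:.3f}")  # 0.857
```

Here precision is 3/4 and recall is 3/3, giving an F1 of 6/7.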
Current status
The app today makes use of a more explicit exploratory search interface and uses
word vectors to enrich information searched over.
Code
Move from understanding concepts in asks to understanding concepts in tweets, and
using that to improve our interest tagging.
Summary
- An example of a data product
- Understanding text to improve search
- Mixing traditional NLP techniques with deep learning
- Evaluation
- Current usage within Cohort
Questions?
Dr. Eoin Hurrell
@eoinhurrell
eoinhurrell@gmail.com
http://cohort.is
