2. Overview
- What is Cohort?
- Problem definition: What kind of search and understanding?
- Our solution: a mix of old and new
- Evaluation
- Current usage
3. What is Cohort?
“Cohort helps you find the people you need through the people you know”
- (Qualified) Second degree social network search
- We search your network for relevant people whom a close friend could introduce
you to
4. A Data Product in Cohort
Understanding "asks" - social feed posts that are also searches
5. Asks to concepts
- We have a number of ways of classifying people as having interests
- By classifying queries by the interests they ask for, we get 'dimensionality
reduction as query expansion'
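A minimal Python sketch of why matching on interests widens a search, assuming people already carry interest tags. The names, bios and tags in `PEOPLE` are hypothetical and only illustrate the idea; they are not Cohort data or Cohort's implementation.

```python
# Hypothetical people, each with free text and previously assigned interest tags.
PEOPLE = {
    "Ada": {"bio": "Django contributor and backend engineer",
            "interests": {"Python (language)", "Code"}},
    "Grace": {"bio": "Python hacker available for contracts",
              "interests": {"Python (language)"}},
    "Linus": {"bio": "Landscape photographer",
              "interests": {"Photography"}},
}


def literal_matches(terms):
    """People whose bio contains at least one query term verbatim."""
    return sorted(name for name, person in PEOPLE.items()
                  if any(t.lower() in person["bio"].lower() for t in terms))


def interest_matches(query_interests):
    """People who share at least one interest with the classified query."""
    wanted = set(query_interests)
    return sorted(name for name, person in PEOPLE.items()
                  if person["interests"] & wanted)


print(literal_matches(["python hacker"]))       # only the bio that contains the phrase
print(interest_matches(["Python (language)"]))  # everyone tagged with the interest
```

In this toy data the literal search finds one person, while the interest match also finds the person whose bio never mentions Python. Both the query and the people are reduced to a small set of interests, which is what lets the search reach more of them.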
6. More Generally
How can we expand the scope of a search to return
more (and more relevant) results?
7. The Old
Only some elements of the text signify the information need. Part-of-speech
tagging and named entity recognition are used to filter out some of the noise.
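A minimal Python sketch of the part-of-speech filtering step, assuming a tiny hand-written tag lexicon (`TOY_POS`, hypothetical) in place of a trained tagger. Named entity recognition is not shown, and the choice to keep only nouns and adjectives is an assumption made for illustration.

```python
# Hypothetical toy lexicon standing in for a real part-of-speech tagger.
TOY_POS = {
    "i'm": "PRON", "looking": "VERB", "for": "ADP", "a": "DET",
    "python": "NOUN", "hacker": "NOUN", "some": "DET", "remote": "ADJ",
    "work": "NOUN", "based": "VERB", "anywhere": "ADV",
}
KEEP_TAGS = {"NOUN", "ADJ"}  # assumption: these tags carry the information need


def content_phrases(text):
    """Return runs of adjacent NOUN/ADJ tokens as candidate search phrases."""
    phrases, current = [], []
    for token in text.lower().split():
        token = token.strip('.,!?"')
        if TOY_POS.get(token) in KEEP_TAGS:
            current.append(token)
        elif current:
            # A filtered-out token ends the current phrase.
            phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases


ask = "I'm looking for a python hacker for some remote work based anywhere"
print(content_phrases(ask))  # ['python hacker', 'remote work']
```

Words the lexicon does not know are dropped, which is a limitation of the toy lookup rather than of tagging in general.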
"I'm looking for a python hacker for some remote work based anywhere"
8. The New (sorta)
- (Compound) word vectors as concepts, e.g. Python (language), Code, Remote Working
"I'm looking for a python hacker for some remote work based anywhere"
Query =
  Free-text: python hacker, remote work
  Interests: Python (language), Code, Remote Working
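A minimal Python sketch of how filtered phrases might be mapped to concepts with word vectors. It assumes each concept has its own vector and that a phrase is represented by the mean of its word vectors. The 3-dimensional vectors, the `0.9` threshold and the function names are all hypothetical, chosen so the toy example reproduces the query above.

```python
import math

# Hypothetical 3-d vectors; real word vectors have hundreds of dimensions.
WORD_VECS = {
    "python": (0.9, 0.1, 0.0),
    "hacker": (0.8, 0.2, 0.1),
    "remote": (0.0, 0.9, 0.2),
    "work": (0.1, 0.8, 0.3),
}
CONCEPT_VECS = {
    "Python (language)": (1.0, 0.0, 0.0),
    "Code": (0.7, 0.3, 0.1),
    "Remote Working": (0.0, 1.0, 0.2),
}


def mean_vector(phrase):
    """Average the vectors of the known words in a phrase, or None if none are known."""
    vecs = [WORD_VECS[w] for w in phrase.split() if w in WORD_VECS]
    if not vecs:
        return None
    return tuple(sum(dim) / len(vecs) for dim in zip(*vecs))


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def build_query(phrases, threshold=0.9):
    """Keep the free-text phrases and add every concept close to any phrase."""
    interests = []
    for phrase in phrases:
        vec = mean_vector(phrase)
        if vec is None:
            continue
        scored = sorted(((cosine(vec, cv), name)
                         for name, cv in CONCEPT_VECS.items()), reverse=True)
        for score, name in scored:
            if score >= threshold and name not in interests:
                interests.append(name)
    return {"free_text": list(phrases), "interests": interests}


print(build_query(["python hacker", "remote work"]))
```

With these toy vectors, "python hacker" lands close to both Python (language) and Code, while "remote work" lands close to Remote Working only.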
9. Word vector training data
The text we need to understand covers technical roles, web technologies and similar topics
10. Word vector training data
Start with a model initialised from the Google News vectors, then continue training on Hacker News data
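A minimal Python sketch of the warm-start step only, assuming pretrained vectors are available as a plain dictionary. Words the pretrained model already knows keep their vector, and words that appear only in the domain corpus get a small random vector. The function name, the toy vectors and the initialisation range are assumptions. The continued training itself is not shown and would normally be left to a word-vector library.

```python
import random


def warm_start(domain_vocab, pretrained, dim, seed=0):
    """Build an initial embedding table for continued training on domain text."""
    rng = random.Random(seed)
    table = {}
    for word in domain_vocab:
        if word in pretrained:
            # Reuse the general-purpose vector as the starting point.
            table[word] = list(pretrained[word])
        else:
            # Domain-only word: start from small random values (assumed range).
            table[word] = [rng.uniform(-0.5 / dim, 0.5 / dim) for _ in range(dim)]
    return table


# Hypothetical 2-d stand-ins for general-purpose news vectors.
pretrained = {"python": [0.2, -0.1], "remote": [0.4, 0.3]}
table = warm_start(["python", "remote", "nodejs"], pretrained, dim=2)
print(sorted(table))  # ['nodejs', 'python', 'remote']
```

Starting from general-purpose vectors means common words begin with sensible neighbours, and the domain corpus then only has to shift them and learn the vocabulary that news text lacks.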
12. Current status
Today the app uses a more explicit exploratory search interface, and uses
word vectors to enrich the information being searched.
Move from understanding concepts in asks to understanding concepts in tweets, and
use that to improve our interest tagging.
13. Summary
- An example of a data product
- Understanding text to improve search
- Mixing traditional NLP techniques with deep learning
- Evaluation
- Current usage within Cohort