1
Towards a Learning to Rank Ecosystem @ Snag
We've got LTR to “work”, now what?
Xun Wang (xun.wang@snag.co)
2
Iterating LTR beyond v1
Agenda today
● Snag Overview
● Snag & Learning to Rank
● Troubleshooting Learning to Rank
● Elements of the LTR Ecosystem (LTR 2.0 initiatives)
3
Snag Overview
About Snag
4
Snag is an Hourly Job Marketplace
snag
● Marketplace health
● Member growth
● Revenue growth
Job-seekers
● Preference
● Qualifications
● Schedule
● Responsiveness
Employers & Hiring Agencies
● Candidate volume
● Candidate quality
● Cost per hire/lead
Fabio Rosati
90MM+ registered workers
150K+ hires per month
400K+ active locations
5
Hourly Jobs are Transactional
● Fragmented
Organized around “shifts”. A worker can be assigned 1 to 30+ hours per week. Many hold multiple jobs.
● High turnover
Workers stay at each job for 6 months on average.
● Lightly skilled
Many hourly jobs require just a high school diploma.
https://www.snag.co/employers/wp-content/uploads/2016/07/2016_SOTHW_Report-3.pdf
6
Hourly Job Search is Open-ended
Schedule and location are more important than the actual job duties
Queries are not explicit (40% have no keywords)
7
Matching Hourly Jobs
[Diagram labels: Recommendation; Search/IR; Explicit Requests; Implicit Feedback; Activities; Job Inventory; Worker Profile; Match; Query Keywords; Query Locations; Employer Locations; Employer Profiles; Preference; Yield; Positions]
Trey Grainger
requires a hybrid approach
8
Snag & Learning to Rank
what we’ve built as v1
9
The Old System
● System too complex to accurately tune the boosts: relevancy whack-a-mole
● Inventory content frequently changes
● Lacks data-driven input: assumption-driven, without proper statistical analysis
“If only there was a way to do this differently…”
Jason Kowalewsky
(This slide is a shout-out to Jason Kowalewsky, who jump-started Learning to Rank at Snag. He was a terrific boss but routinely wrote sloppy slides like this.)
10
Learning to Rank Model
Doug Turnbull
Relevancy Labels
● Abandonment: 0
● Click: 1
● Apply Intent: 2
Features
● BM25 scores on job title, employer name, job type, ...
● distance <position, seeker>
● match scores on query location (e.g. zip-code, city)
● BM25 scores on job description
● query string attributes (e.g. length, query type)
● posting attributes (e.g. position, requirements, industry, semantics representation)
● ...
Model
● LambdaMART
Machine learning is everywhere
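The label scheme above can be sketched in a few lines. This is a minimal illustration, not Snag's pipeline: the label values come from the slide, while the event names, the "strongest event wins" rule, and the use of the LibSVM-style row format read by tools such as RankLib are all assumptions.

```python
# Label values follow the slide; event names and record layout are hypothetical.
LABELS = {"abandonment": 0, "click": 1, "apply_intent": 2}

def grade(events):
    """Return the strongest label observed for one (query, posting) pair."""
    return max((LABELS[e] for e in events), default=0)

def to_training_row(label, qid, features):
    """Format one row as '<label> qid:<id> 1:<f1> 2:<f2> ...' (LibSVM-style)."""
    feats = " ".join(f"{i}:{v}" for i, v in enumerate(features, start=1))
    return f"{label} qid:{qid} {feats}"

# Hypothetical feature vector: [BM25 on job title, distance score].
row = to_training_row(grade(["click", "apply_intent"]), qid=7, features=[12.3, 0.8])
print(row)  # 2 qid:7 1:12.3 2:0.8
```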
11
Training Pipeline
Rishi Kumar, Elizabeth Haubert, Peter Dixon-Moses
[Diagram labels: user events; posting collection; event sampler; posting sampler; training data parser; posting ingestion; model generator; feature backfilling; relevancy label parser; relevancy scores; query info; features; training data; ranking model; posting docs; user events; training index; search engine (dev); search engine (prod)]
“click model” + HyperOpt
Scott Stults
12
Last time we checked, LTR “worked”
Aash Srikar
...with varying degrees of success across query types
[Bar chart: Old vs. New by query type. Values shown: 11%, 27%, 0%, -3%, 0%, 5%, 15%. % of searches by query type: 24%, 13%, 16%, 30%, 13%. “Near me” (50% native app traffic).]
13
However, with great power...
Everyone who complained
“Why is my customer losing so many applications?”
(Because your customer has been gaming our site for years and the new system closed the loophole?)
“Why does this keyword search still perform poorly?”
(OK, this one’s on us. We actually made the conversion rate better than before, but it’s still far from satisfactory.)
“I heard Google released a job search service, why don’t we just use that? Nobody beats Google in search!”
(Somebody actually set up a meeting with Google Cloud Talent Solution while I was on vacation…)
14
Troubleshooting Learning to Rank
Issues we realized, fixed or stumbled upon while maintaining v1.0
15
Sample Complexity
Simon Hughes
Factorial state space, low capacity model, biased training data
● Many LTR algorithms approximate ranking as a scoring problem because the state space of possible rankings, Perm(n, r), is intractable
● An under-expressive model formulation leads to high bias and underfitting
● Search logs typically contain bias introduced by previous ranking models
https://en.wikipedia.org/wiki/Sample_complexity
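To give a sense of why the ranking state space is called intractable, the count Perm(n, r) = n! / (n - r)! can be evaluated directly. A small sketch; the candidate-set sizes below are illustrative and not taken from Snag's system.

```python
import math

def num_rankings(n, r):
    """Number of ordered top-r lists drawn from n candidates: n! / (n - r)!."""
    return math.perm(n, r)

# Illustrative sizes only: a SERP of r results chosen from n matching postings.
for n, r in [(10, 3), (100, 10), (1000, 10)]:
    print(f"Perm({n}, {r}) = {num_rankings(n, r):.3e}")
```

Even a 10-result page drawn from 100 candidates has more than 10^19 possible orderings, which is why scoring each posting independently is the usual approximation.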
16
BM25 scores can make spurious LTR features
Low precision on long texts, low recall on short texts
17
Presentation Bias
Jason Kowalewsky, Stephen Ahearn
● Users’ propensity to click on a search entry can be influenced by factors besides relevancy (e.g. position, yield, UX)
● Search logs often cannot tell active skipping from passive neglect, which introduces many false negatives. We had to throw away a lot of data
Not all clicks are created equal
Unbiased learning to rank: https://arxiv.org/abs/1608.04468
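One common way to correct for position bias, in the spirit of the unbiased learning to rank paper linked above, is inverse propensity weighting: a click at a rarely examined rank counts for more. The sketch below is an assumption-laden illustration, not the method Snag used; the propensity values, the clipping constant, and the log layout are all hypothetical.

```python
# Hypothetical examination propensities by rank (rank 1 is examined most often).
PROPENSITY = {1: 1.0, 2: 0.6, 3: 0.4, 4: 0.3, 5: 0.25}

def ipw_weight(rank, clip=10.0):
    """Weight for a clicked result: 1 / P(examined at rank), clipped for stability."""
    return min(1.0 / PROPENSITY[rank], clip)

def weighted_clicks(log):
    """Sum propensity-weighted clicks per posting from (posting_id, rank, clicked) rows."""
    totals = {}
    for posting_id, rank, clicked in log:
        if clicked:  # unclicked rows are ambiguous (skipped vs. never seen), so ignore them
            totals[posting_id] = totals.get(posting_id, 0.0) + ipw_weight(rank)
    return totals

log = [("a", 1, True), ("b", 4, True), ("c", 2, False), ("b", 5, True)]
totals = weighted_clicks(log)
print(totals)
```

Here posting "b" was clicked twice at low ranks and ends up weighted well above posting "a", which was clicked once at rank 1.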
18
Search Metrics
Used in training, offline and online testing but often don’t align with business objectives
Example SERPs (relevancy labels by rank position):
● Labels 1, 0, 0, 0, 0, 0: this SERP has an NDCG of 1 but 0 applies
● Labels 0, 2, 1, 0, 0, 0: this SERP has a lower NDCG but one apply (ERR?)
● Employers KFC, KFC, Macy’s, KFC, KFC, Uber: ...until you realize KFC showed up 4 times for no good reason
● Labels 0, 0, 2, 2, 0, 0: this SERP has the lowest NDCG but the best yield (MAP?)
http://olivier.chapelle.cc/pub/err.pdf
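The three label lists on this slide can be checked with a short NDCG calculation. A minimal sketch, assuming the common 2^label - 1 gain and log2 position discount; the deck does not say which NDCG variant was used.

```python
import math

def dcg(labels):
    """Discounted cumulative gain with gain 2^label - 1 and a log2 position discount."""
    return sum((2 ** rel - 1) / math.log2(pos + 1)
               for pos, rel in enumerate(labels, start=1))

def ndcg(labels):
    """DCG divided by the DCG of the same labels sorted best-first."""
    ideal = dcg(sorted(labels, reverse=True))
    return dcg(labels) / ideal if ideal > 0 else 0.0

serps = {
    "one click at rank 1": [1, 0, 0, 0, 0, 0],
    "one apply at rank 2": [0, 2, 1, 0, 0, 0],
    "two applies at ranks 3-4": [0, 0, 2, 2, 0, 0],
}
for name, labels in serps.items():
    print(f"{name}: NDCG = {ndcg(labels):.3f}")
```

Under these assumptions the second and third lists score about 0.66 and 0.57. NDCG normalizes each SERP against its own best ordering, so the list with a single click and no applies still gets a perfect score.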
19
Bot detection
● Bot traffic makes up more than 60% of Snag’s web and mobile web traffic
● Bots behave very differently from human users (e.g. viewing 50+ pages, clicking every posting)
● Thus, even a 5-10% false negative rate can significantly contaminate LTR training data
Ali Bartos, Carl Gieringer
Garbage in, garbage out
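A rule-based filter over session statistics is one simple way to act on the behaviors listed above before building training data. This is only a sketch: the session fields and both thresholds are hypothetical, and a production bot detector would likely use many more signals.

```python
from dataclasses import dataclass

@dataclass
class Session:
    session_id: str
    pages_viewed: int
    postings_shown: int
    postings_clicked: int

# Hypothetical thresholds, loosely based on the behaviors named on the slide.
MAX_PAGES = 50
MAX_CLICK_RATIO = 0.95

def looks_like_bot(s: Session) -> bool:
    """Flag sessions that page very deep or click nearly every posting shown."""
    click_ratio = s.postings_clicked / s.postings_shown if s.postings_shown else 0.0
    return s.pages_viewed >= MAX_PAGES or click_ratio >= MAX_CLICK_RATIO

sessions = [
    Session("human", pages_viewed=3, postings_shown=45, postings_clicked=4),
    Session("crawler", pages_viewed=120, postings_shown=1800, postings_clicked=0),
    Session("clicker", pages_viewed=5, postings_shown=75, postings_clicked=75),
]
clean = [s.session_id for s in sessions if not looks_like_bot(s)]
print(clean)  # ['human']
```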
20
SEO - External Query Pattern Shift
Problem:
Solution:
Outcome:
When Google doesn’t care about small businesses (not that it ever did)...
21
Elements of the LTR Ecosystem
Work in progress and future initiatives towards LTR 2.0
22
Search Engine needs Metadata
Type | Availability Req. | Example | Integration Strategy
User/Query Metadata | real-time | query string | search engine plugin / external API
User/Query Metadata | near-static | user profile | external API
User/Query Metadata | near real-time | search history | streaming -> external API
Posting Metadata | static | industry, vector embeddings | external API -> search index
Posting Metadata | near real-time | yield, remaining budget | streaming -> external API -> search index
Relevancy | real-time | relevancy score | search engine plugin / external API
for both offline training and real time querying
(current focus)
(long term goal)
23
Signals Platform
Signals is a Kafka-based data streaming platform that streams and transforms real-time event data for various internal consumers.
● Kafka backend to process real-time, comprehensive user behavior & product activity data
● “Hermes” REST API layer to enable signal publishing via HTTP calls
● Avro schema registry to enforce typed event definitions
Corey Fritz
Clean, granular data to train and serve machine learning models
24
Position Profile via Clustering
“CDL Training School ! We train, We Hire, Guaranteed!”
“Truck Driver”
use position ontology to align with query intent and boost recall
25
Posting Summarization via Topic Modeling
Extracted from a real job description:
“ If you are an actor, actress, admin, agency, artist, assistant, barista, bartender, broker, bus driver, cab driver, cashier, chauffeur, cleaner, college student, customer service agent, chef, contract worker, cook, courier, designer, dishwasher, dog walker, driver, entrepreneurs, fitness trainer, food prep, food services, freelancer, handyman, hostess, insurance broker, instructor, intern, janitor, maid, maintenance, messenger, manager, management, musician, maid, office assistant, office administrator, photographer, private hire, professional driver, realtor, retail associate, sales associate, sales person, security, server, students, teacher, tutor, valet, veteran, waiter, waitress who is looking for a flexible part-time, full-time or summer gig, apply to <> to supplement your income this summer! ”
● Many postings contain ‘stuffed’ keywords to boost their own recall at the expense of others’
● Topic models “summarize” each posting by the strength of its key concepts to both reduce spurious recall and promote relevant recall
https://en.wikipedia.org/wiki/Latent_Dirichlet_allocation
(proof of concept) Goodbye keyword spamming
26
Posting Deduplication via LSH
Robert Mealey
● Large employers often supply generic job descriptions that receive similar relevancy scores for neighboring store locations, which hurts result diversity
● Locality-Sensitive Hashing (LSH) is used to tag duplicates and near-duplicates so that only one of them is shown in search results
https://en.wikipedia.org/wiki/Locality-sensitive_hashing
no, not 4 KFC jobs on the same SERP
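The near-duplicate tagging described above can be illustrated with MinHash, one well-known LSH family for Jaccard similarity. This is a sketch under stated assumptions, not Snag's implementation: the shingle size, signature length, threshold, and posting texts are hypothetical, and signatures are compared pairwise rather than through banded hash buckets to keep the example short.

```python
import hashlib

NUM_HASHES = 64  # signature length; an arbitrary choice for this sketch

def shingles(text, k=3):
    """Set of k-word shingles from lowercased text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}

def minhash(text):
    """MinHash signature: for each seeded hash, keep the minimum over all shingles."""
    sh = shingles(text)
    return [
        min(int(hashlib.md5(f"{seed}:{s}".encode()).hexdigest(), 16) for s in sh)
        for seed in range(NUM_HASHES)
    ]

def similarity(sig_a, sig_b):
    """Fraction of matching positions, an estimate of Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

def dedupe(postings, threshold=0.5):
    """Keep the first posting of each near-duplicate group (pairwise, no banding)."""
    kept = []
    for posting_id, text in postings:
        sig = minhash(text)
        if all(similarity(sig, other) < threshold for _, other in kept):
            kept.append((posting_id, sig))
    return [posting_id for posting_id, _ in kept]

# Hypothetical postings: two stores of one employer sharing a generic description.
base = ("Team member wanted to prepare food serve customers and keep the "
        "restaurant clean flexible shifts available apply today at store")
postings = [
    ("store-101", base + " 101"),
    ("store-102", base + " 102"),
    ("retail-1", "Seasonal retail sales associate needed to assist shoppers "
                 "and restock shelves during the holidays"),
]
print(dedupe(postings))
```

At scale, the pairwise comparison would be replaced by banding the signatures into hash buckets so that only postings sharing a bucket are compared.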
27
Yield Management
● An interesting problem for the LTR framework because users behave independently of yield information
● Requires careful user modeling to “de-bias” the relevancy signal, and streaming infrastructure to update yield and budget information
Quadrant III: Low Engagement, Low Yield
Quadrant II: High Engagement, High Yield
(proof of concept) make some money, change the world
Anuradha
Uduwage
28
Additional Initiatives
● Language model for job postings
● Posting quality score
● User profile features/embeddings
● Enhanced A/B testing and metrics monitoring capabilities
● Real-time user-activity-based features and related infrastructure
● Search result diversity
● Query expansion
● Named entity detection
● Knowledge graph and graph-based search
● Vector-based relevancy
● Neural ranking models
Hopefully some of these will make it to Haystack 2020
29
Lessons Learned
30
Lessons Learned
● LTR isn’t just about the ML model or the search engine
Ranking models are only as expressive and accurate as the features and labels we feed them. Investment in data infrastructure and data assets is absolutely necessary and arguably more critical.
● Expectations from stakeholders need to be carefully managed
Workers, employers, internal teams, Google bots, etc. all have their own areas of emphasis and may sometimes demand slightly different search experiences. Navigating those multi-party tradeoffs is crucial for the success of the search system.
31
We are Hiring!
Join us and solve some interesting data engineering and search relevance engineering problems!
Richmond, VA, too
32
Thank you.
Any questions?

 
Haystack 2019 - Search Logs + Machine Learning = Auto-Tagging Inventory - Joh...
Haystack 2019 - Search Logs + Machine Learning = Auto-Tagging Inventory - Joh...Haystack 2019 - Search Logs + Machine Learning = Auto-Tagging Inventory - Joh...
Haystack 2019 - Search Logs + Machine Learning = Auto-Tagging Inventory - Joh...
 
Haystack 2019 - Improving Search Relevance with Numeric Features in Elasticse...
Haystack 2019 - Improving Search Relevance with Numeric Features in Elasticse...Haystack 2019 - Improving Search Relevance with Numeric Features in Elasticse...
Haystack 2019 - Improving Search Relevance with Numeric Features in Elasticse...
 
Haystack 2019 - Custom Solr Query Parser Design Option, and Pros & Cons - Ber...
Haystack 2019 - Custom Solr Query Parser Design Option, and Pros & Cons - Ber...Haystack 2019 - Custom Solr Query Parser Design Option, and Pros & Cons - Ber...
Haystack 2019 - Custom Solr Query Parser Design Option, and Pros & Cons - Ber...
 
Haystack 2019 - Establishing a relevance focused culture in a large organizat...
Haystack 2019 - Establishing a relevance focused culture in a large organizat...Haystack 2019 - Establishing a relevance focused culture in a large organizat...
Haystack 2019 - Establishing a relevance focused culture in a large organizat...
 
Haystack 2019 - Solving for Satisfaction: Introduction to Click Models - Eliz...
Haystack 2019 - Solving for Satisfaction: Introduction to Click Models - Eliz...Haystack 2019 - Solving for Satisfaction: Introduction to Click Models - Eliz...
Haystack 2019 - Solving for Satisfaction: Introduction to Click Models - Eliz...
 
2019 Haystack - How The New York Times Tackles Relevance - Jeremiah Via
2019 Haystack - How The New York Times Tackles Relevance - Jeremiah Via2019 Haystack - How The New York Times Tackles Relevance - Jeremiah Via
2019 Haystack - How The New York Times Tackles Relevance - Jeremiah Via
 
Haystack 2019 - Addressing variance in AB tests: Interleaved evaluation of ra...
Haystack 2019 - Addressing variance in AB tests: Interleaved evaluation of ra...Haystack 2019 - Addressing variance in AB tests: Interleaved evaluation of ra...
Haystack 2019 - Addressing variance in AB tests: Interleaved evaluation of ra...
 
Haystack 2019 - Beyond The Search Engine: Improving Relevancy through Query E...
Haystack 2019 - Beyond The Search Engine: Improving Relevancy through Query E...Haystack 2019 - Beyond The Search Engine: Improving Relevancy through Query E...
Haystack 2019 - Beyond The Search Engine: Improving Relevancy through Query E...
 

Recently uploaded

Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxbenishzehra469
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhArpitMalhotra16
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .NABLAS株式会社
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单ewymefz
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单ewymefz
 
Uber Ride Supply Demand Gap Analysis Report
Uber Ride Supply Demand Gap Analysis ReportUber Ride Supply Demand Gap Analysis Report
Uber Ride Supply Demand Gap Analysis ReportSatyamNeelmani2
 
Professional Data Engineer Certification Exam Guide  _  Learn  _  Google Clou...
Professional Data Engineer Certification Exam Guide  _  Learn  _  Google Clou...Professional Data Engineer Certification Exam Guide  _  Learn  _  Google Clou...
Professional Data Engineer Certification Exam Guide  _  Learn  _  Google Clou...Domenico Conte
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单ewymefz
 
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...correoyaya
 
Jpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization SampleJpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization SampleJames Polillo
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单ewymefz
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单enxupq
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单enxupq
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单ukgaet
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIAlejandraGmez176757
 
2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...
2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...
2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...elinavihriala
 
Computer Presentation.pptx ecommerce advantage s
Computer Presentation.pptx ecommerce advantage sComputer Presentation.pptx ecommerce advantage s
Computer Presentation.pptx ecommerce advantage sMAQIB18
 
tapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive datatapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive datatheahmadsaood
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP
 

Recently uploaded (20)

Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
 
Uber Ride Supply Demand Gap Analysis Report
Uber Ride Supply Demand Gap Analysis ReportUber Ride Supply Demand Gap Analysis Report
Uber Ride Supply Demand Gap Analysis Report
 
Professional Data Engineer Certification Exam Guide  _  Learn  _  Google Clou...
Professional Data Engineer Certification Exam Guide  _  Learn  _  Google Clou...Professional Data Engineer Certification Exam Guide  _  Learn  _  Google Clou...
Professional Data Engineer Certification Exam Guide  _  Learn  _  Google Clou...
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
 
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
 
Jpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization SampleJpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization Sample
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMI
 
2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...
2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...
2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...
 
Computer Presentation.pptx ecommerce advantage s
Computer Presentation.pptx ecommerce advantage sComputer Presentation.pptx ecommerce advantage s
Computer Presentation.pptx ecommerce advantage s
 
tapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive datatapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive data
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 

Haystack 2019 - Towards a Learning To Rank Ecosystem @ Snag - We've got LTR to work! Now what? - Xun Wang

  • 1. 1 Towards a Learning To Rank Ecosystem @ snag ---- We've got LTR to “work”, now what? Xun Wang (xun.wang@snag.co)
  • 2. 2 Iterating LTR beyond v1 Agenda today ● Snag Overview ● Snag & Learning to Rank ● Troubleshooting Learning to Rank ● Elements of the LTR Ecosystem (LTR 2.0 initiatives)
  • 4. 4 Snag is an Hourly Job Marketplace snag ● Marketplace health ● Member growth ● Revenue growth Job-seekers ● Preference ● Qualifications ● Schedule ● Responsiveness Employers & Hiring Agencies ● Candidate volume ● Candidate quality ● Cost per hire/lead Fabio Rosati 90MM+ registered workers 150K+ hires per month 400K+ active locations
  • 5. 5 Hourly Jobs are Transactional ● Fragmented Organized around “Shifts”. A worker can be assigned 1 to 30+ hours per week. Many hold multiple jobs ● High turnover Workers stay at each job for 6 months on average ● Lightly Skilled Many hourly jobs require just a high school diploma https://www.snag.co/employers/wp-content/uploads/2016/07/2016_SOTHW_Report-3.pdf
  • 6. 6 Hourly Job Search is Open-ended Schedule and location more important than actual job duty Queries not explicit (40% without keywords)
  • 7. 7 Matching Hourly Jobs Recommendation Search/IR Explicit Requests Implicit Feedbacks Activities Job Inventory Worker Profile Match Query Keywords Query Locations Employer Locations Employer Profiles Preference Yield Positions Trey Grainger requires a hybrid approach
  • 8. 8 Snag & Learning to Rank what we’ve built as v1
  • 9. 9 The Old System ● System too complex to accurately tune the boosts: Relevancy whack-a-mole ● Inventory content frequently changes ● Lacks data driven input -- assumption driven without proper statistical analysis “If only there was a way to do this differently…” Jason Kowalewsky (This slide is a shout-out to Jason Kowalewsky, who jump-started Learning to Rank at Snag. He was a terrific boss but routinely wrote sloppy slides like this. )
  • 10. 10 Learning to Rank Model Doug Turnbull Relevancy Labels: Abandonment: 0, Click: 1, Apply Intent: 2. Features: BM25 scores on job title, employer name, job type, ...; distance <position, seeker>; match scores on query location (e.g. zip-code, city); BM25 scores on job description; query string attributes (e.g. length, query type); posting attributes (e.g. position, requirements, industry, semantics representation) . . . Model: LambdaMART. Machine learning is everywhere
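The label scheme on this slide (abandonment 0, click 1, apply intent 2) can be sketched as a small grading function. This is only an illustration, not Snag's pipeline: the event names `"click"` and `"apply_intent"` and the helper `label_for` are hypothetical, and the rule "keep the strongest signal seen for a (query, posting) pair" is an assumption.

```python
from enum import IntEnum


class Label(IntEnum):
    # Grades taken from the slide.
    ABANDONMENT = 0
    CLICK = 1
    APPLY_INTENT = 2


# Hypothetical event names; a real search log will use its own vocabulary.
EVENT_TO_LABEL = {
    "click": Label.CLICK,
    "apply_intent": Label.APPLY_INTENT,
}


def label_for(events):
    """Return the strongest label observed for one (query, posting) pair.

    An impression with no recognised interaction is graded as abandonment.
    """
    grade = Label.ABANDONMENT
    for event in events:
        grade = max(grade, EVENT_TO_LABEL.get(event, Label.ABANDONMENT))
    return grade
```

For example, `label_for(["click", "apply_intent"])` grades the pair as `Label.APPLY_INTENT`, while an impression with no events stays at `Label.ABANDONMENT`.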
  • 11. 11 Training Pipeline Rishi Kumar Elizabeth Haubert Peter Dixon-Moses User events posting collection event sampler posting sampler training data parser posting ingestion model generator feature backfilling relevancy label parser relevancy scores query info features training data ranking model posting docs user events training index search engine (dev) search engine (prod) “click model” + HyperOpt Scott Stults
  • 12. 12 Last time we checked, LTR “worked” Aash Srikar ...with varying degrees of success across query types 11% 27% 0% -3% 0% Old New 5% % of searches 24% 13% 16% 30% 13% 15% “Near me” (50% native app traffic)
  • 13. 13 However, with great power... Everyone who complained “Why is my customer losing so many applications?” (Because your customer has been gaming our site for years and the new system closed the loophole?) “Why does this keyword search still perform poorly?” (OK this one’s on us. We actually made the conversion rate better than before but it’s still far from satisfactory) “I heard Google released a job search service, why don’t we just use that? Nobody beats Google in search!” (Somebody actually set up a meeting with Google Cloud Talent Solution while I was on vacation…)
  • 14. 14 Troubleshooting Learning to Rank Issues we realized, fixed or stumbled upon while maintaining v1.0
  • 15. 15 Sample Complexity Simon Hughes Factorial state space, low capacity model, biased training data ● Many LTR algorithms approximate ranking as a scoring problem due to intractable state space (Perm(n, r)). ● Under-expressive model formulation leads to high bias and underfitting ● Search log typically contains bias introduced by previous ranking models https://en.wikipedia.org/wiki/Sample_complexity
  • 16. 16 BM25 scores can make spurious LTR features Low precision on long texts, low recall on short texts
  • 17. 17 Presentation Bias Jason Kowalewsky Stephen Ahearn ● Users’ propensity to click on a search entry can be influenced by factors besides relevancy (e.g. position, yield, UX) ● Search logs often cannot tell active skipping from passive neglect, introducing lots of false negatives - had to throw away lots of data Not all clicks are created equal Unbiased learning to rank: https://arxiv.org/abs/1608.04468
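One common way to correct for position bias, in the spirit of the unbiased learning-to-rank reference linked on this slide, is to weight each click by the inverse of the probability that its position was examined. The sketch below assumes a simple position-based propensity of (1/rank)^eta; that functional form, the default `eta`, and both function names are illustrative assumptions, not something the slide specifies.

```python
def position_propensity(rank, eta=1.0):
    """Assumed probability that a user examines the result at `rank` (1-based)."""
    if rank < 1:
        raise ValueError("rank is 1-based")
    return (1.0 / rank) ** eta


def ipw_weight(rank, clicked, eta=1.0):
    """Inverse-propensity weight for one logged impression.

    Clicks low on the page count for more, because they were less likely
    to be seen at all. Non-clicks get weight 0 here: as the slide notes,
    the log cannot tell an active skip from a result that was never examined.
    """
    if not clicked:
        return 0.0
    return 1.0 / position_propensity(rank, eta)
```

In practice the propensities would be estimated from data (for example through randomized interventions) rather than assumed.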
  • 18. 18 Search Metrics Used in training, offline and online testing but often don’t align with business objectives 1 0 0 0 0 0 This SERP has NDCG of 1 but 0 apply 0 2 1 0 0 0 This SERP has lower NDCG but one apply (ERR?) KFC KFC Macy’s KFC KFC Uber ...until you realize KFC showed up 4 times for no good reason 0 0 2 2 0 0 This SERP has the lowest NDCG but the best yield (MAP?) http://olivier.chapelle.cc/pub/err.pdf
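The three SERPs on this slide can be checked with a few lines of code. This sketch assumes the common exponential-gain form of NDCG (gain 2^label - 1, log2 position discount) and reads the slide's digits as the 0/1/2 relevancy labels from the model slide; the slide does not say which NDCG variant Snag uses.

```python
import math


def dcg(labels):
    """Discounted cumulative gain with exponential gain and log2 discount."""
    return sum((2 ** rel - 1) / math.log2(i + 2) for i, rel in enumerate(labels))


def ndcg(labels):
    """DCG normalised by the DCG of the same labels in ideal order."""
    ideal = dcg(sorted(labels, reverse=True))
    return dcg(labels) / ideal if ideal > 0 else 0.0


serp_click_only = [1, 0, 0, 0, 0, 0]   # one click at the top, no apply
serp_one_apply = [0, 2, 1, 0, 0, 0]    # one apply at position 2
serp_two_applies = [0, 0, 2, 2, 0, 0]  # two applies at positions 3 and 4
```

Under these assumptions the first SERP scores exactly 1.0, the second roughly 0.66 and the third roughly 0.57, which matches the slide's point: the ordering by NDCG is the reverse of the ordering by number of applies.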
  • 19. 19 Bot detection ● Bot traffic accounts for > 60% of Snag’s web and mobile web traffic ● Bots behave very differently from human users (e.g. view 50+ pages, click every posting, etc.) ● Thus, even a 5~10% false negative rate can significantly contaminate LTR training data Ali Bartos Carl Gieringer Garbage in, garbage out
  • 20. 20 SEO - External Query Pattern Shift Problem: Solution: Outcome: When Google doesn’t care about small businesses (not that it ever did)...
  • 21. 21 Elements of the LTR Ecosystem Work in progress and future initiatives towards LTR 2.0
  • 22. 22 Search Engine needs Metadata, for both offline training and real time querying. Columns: Metadata | Availability Req. | Example | Integration Strategy. User/Query Metadata | real-time | query string | search engine plugin / external API; User/Query Metadata | near-static | user profile | external API; User/Query Metadata | near real-time | search history | streaming -> external API; Posting Metadata | static | industry, vector embeddings | external API -> search index; Posting Metadata | near real-time | yield, remaining budget | streaming -> external API -> search index; Relevancy | real-time | relevancy score | search engine plugin / external API. (current focus) (long term goal)
  • 23. 23 Signals Platform Signals is a Kafka-based data streaming platform to stream & transform real-time event data to various internal consumers. ● Kafka backend to process real-time comprehensive user behavior & product activity data ● “Hermes” REST API layer to enable signal publishing via HTTP calls ● Avro schema registry to enforce typed event definition Corey Fritz Clean, granular data to train and serve machine learning models
  • 24. 24 Position Profile via Clustering “CDL Training School ! We train, We Hire, Guaranteed!” “Truck Driver” use position ontology to align with query intent and boost recall
  • 25. 25 Posting Summarization via Topic Modeling “ If you are an actor, actress, admin, agency, artist, assistant, barista, bartender, broker, bus driver, cab driver, cashier, chauffeur, cleaner, college student, customer service agent, chef, contract worker, cook, courier, designer, dishwasher, dog walker, driver, entrepreneurs, fitness trainer, food prep, food services, freelancer, handyman, hostess, insurance broker, instructor, intern, janitor, maid, maintenance, messenger, manager, management, musician, maid, office assistant, office administrator, photographer, private hire, professional driver, realtor, retail associate, sales associate, sales person, security, server, students, teacher, tutor, valet, veteran, waiter, waitress who is looking for a flexible part-time, full-time or summer gig, apply to <> to supplement your income this summer! ” extracted from a real job description: ● Many postings contain ‘stuffed’ keywords to boost their own recall at the expense of others’ ● Topic models “summarize” each posting by the strength of its key concepts to both reduce spurious recall and promote relevant recall https://en.wikipedia.org/wiki/Latent_Dirichlet_allocation (proof of concept) Goodbye keyword spamming
  • 26. 26 Posting Deduplication via LSH Robert Mealey ● Large employers often supply generic job descriptions that receive similar relevancy scores for neighboring store locations, affecting result diversity ● Locality-Sensitive Hashing (LSH) is used to tag duplicates/near duplicates so that only one of them is shown in search results https://en.wikipedia.org/wiki/Locality-sensitive_hashing no, not 4 KFC jobs on the same SERP
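A minimal sketch of near-duplicate tagging with MinHash signatures and LSH banding is shown below. The slide only says that LSH is used, so the choice of MinHash over word shingles, the parameter values and all function names here are assumptions for illustration, not Snag's implementation.

```python
import hashlib
from collections import defaultdict


def shingles(text, k=3):
    """Set of k-word shingles for a posting description."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}


def minhash(shingle_set, num_hashes=32):
    """MinHash signature: the minimum of each seeded hash over the shingles."""
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int.from_bytes(hashlib.sha1(f"{seed}:{s}".encode()).digest()[:8], "big")
            for s in shingle_set
        ))
    return tuple(signature)


def candidate_groups(docs, num_hashes=32, bands=8):
    """Group posting ids whose signatures agree on at least one band."""
    rows = num_hashes // bands
    buckets = defaultdict(set)
    for doc_id, text in docs.items():
        sig = minhash(shingles(text), num_hashes)
        for b in range(bands):
            buckets[(b, sig[b * rows:(b + 1) * rows])].add(doc_id)
    return {frozenset(ids) for ids in buckets.values() if len(ids) > 1}


postings = {
    "kfc-1": "team member wanted for our busy restaurant flexible shifts",
    "kfc-2": "team member wanted for our busy restaurant flexible shifts",
    "macys": "seasonal retail sales associate needed in the shoe department",
}
groups = candidate_groups(postings)
```

Here the two identical descriptions land in one group and the unrelated posting is left alone. Groups are only candidates: a production system would typically confirm each pair with an exact similarity check before collapsing postings, and would tune the number of bands and rows to the similarity threshold it cares about.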
  • 27. 27 Yield Management ● An interesting problem to the LTR framework because users behave agnostically of yield information ● Requires careful user modeling to “de-bias” relevancy signal and streaming infrastructure to update yield and budget information Quadrant III Low Engagement, Low yield Quadrant II High Engagement, High Yield (proof of concept) make some money, change the world Anuradha Uduwage
  • 28. 28 Additional Initiatives ● Language model for job postings ● Posting quality score ● User profile features/embeddings ● Enhanced AB testing and metrics monitoring capabilities ● Real-time user-activity-based features and related infrastructure ● Search result diversity ● Query expansion ● Named Entity detection ● Knowledge graph and graph-based search ● Vector-based relevancy ● Neural ranking models hopefully some of those will make their way to Haystack 2020
  • 30. 30 Lessons Learned ● LTR isn’t just about the ML model or the search engine Ranking models are only as expressive and/or accurate as the features and labels we feed them. Investment in data infrastructure and data assets is absolutely necessary and arguably more critical. ● Expectations from stakeholders need to be carefully managed Workers, employers, internal teams, Google bots, etc. all have their own areas of emphasis and sometimes may demand slightly different search experiences. Navigating those multi-party tradeoffs is crucial for the success of the search system.
  • 31. 31 We are Hiring! Join us and solve some interesting data engineering and search relevance engineering problems ! Richmond, VA, too