Haystack- Learning to rank in an hourly job market

Xun Wang, Data Scientist
Learning to Rank in an Hourly Job Marketplace
Xun Wang (xun.wang@snag.co)
Jason Kowalewski (jason.kowalewski@snag.co)
Snag: the marketing slide
● 85 MM registered workers
● 325,000 employer locations
● 4.5 MM applications submitted monthly
● 1 MM active job postings
Our Matching Problem
Part Time
Location? Pay? Industry?
Our Matching Problem
cashier
Sandwich Artist? Host/Hostess? Barista?
Our Matching Problem
Determine intent / context.
Legacy Search System
Query: “mcdonalds in 92801”
Query: “mcdonalds in 90024”
Query: “mcdonalds in 11231”
title^n
Boosts on channel
The (old) system
● Too complex to tune the boosts accurately: relevancy whack-a-mole
● Inventory content changes frequently
● Lacks data-driven input: assumption-driven, without proper statistical analysis
“If only there was a way to do this differently…”
The Hourly Job Marketplace
Job search is both an IR and a match problem
Search / IR (e.g. YouTube): { User } { Resource }
● Many to Many
● Asymmetric
● Unlimited supply
Match (e.g. online chess): { Player } { Player }
● One to One
● Symmetric
● No extra supply
Job Search: { Job Seekers } { Job Positions }
● One to Many
● Asymmetric and Bi-directional
● Limited supply, unlimited “attempts”
Hourly Jobs are not ‘Sticky’
● Fragmented
Organized around “Shifts”. A worker can be assigned 1 to 30+ hours per week. Many hold multiple jobs.
● Transactional
Workers stay at each job for 6 months on average.
● Lightly Skilled
Many hourly jobs require just a high school diploma.
https://www.snag.co/employers/wp-content/uploads/2016/07/2016_SOTHW_Report-3.pdf
Hourly job search is often a recommendation
● Schedule and location can be more important than the actual duties of the job
● Queries are not explicit (40% don’t have keywords)
Our relevancy signals are collected from multiple levels of interactions
0. Search
1. Clicks
2. Apply Intents
3. Completed Applications
4. Interviews & Assessments
5. Hires
[Figure: the interaction stages are split between the Jobseeker Platform and the Employer Platform]
We balance a variety of market participants
Snag
● Revenue growth
● User base growth
● Marketplace health & efficiency
Job-seekers
● Job preference
● Job requirements
● Employer responsiveness
Employers
● Seeker volume
● Conversion rate (e.g. CTR)
● Cost per lead
● Seeker volume & velocity
● Candidate quality
● Cost per hire
Advertisers/Partners
LTR System Design
Learning to Rank Model
Development Environment
Relevancy Labels
Abandonment: 0
Click: 1
Apply Intent: 2
Features
● match scores on job title, employer name, job type, ...
● distance <position, seeker>
● match scores on query location (e.g. zip-code, city)
● match scores on job description
● query string attributes (e.g. length, entity type)
● posting attributes (e.g. position, requirements, industry, semantics representation)
● ...
Model: LambdaMART
Composability!
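The label scheme above can be sketched as a small grading function. This is a minimal illustration, not Snag's pipeline: the event names and the tuple layout are hypothetical, and the grade of 3 for completed applications is taken from the later iteration slides. It assumes the strongest interaction observed for a (query, posting) pair determines its label.

```python
from collections import defaultdict

# Grades follow the slides; the event names themselves are hypothetical.
LABELS = {
    "impression": 0,             # shown but abandoned
    "click": 1,
    "apply_intent": 2,
    "completed_application": 3,  # used from Iteration 1 onward
}

def grade_events(events):
    """Return {(query_id, posting_id): label}, keeping the strongest signal seen."""
    grades = defaultdict(int)
    for query_id, posting_id, event_type in events:
        key = (query_id, posting_id)
        grades[key] = max(grades[key], LABELS[event_type])
    return dict(grades)

events = [
    ("q1", "p1", "impression"),
    ("q1", "p1", "click"),
    ("q1", "p1", "apply_intent"),
    ("q1", "p2", "impression"),
]
print(grade_events(events))
```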
Training Pipeline - esltr plugin 0.x
Development Environment
[Pipeline diagram. Components: data warehouse, posting collection, event sampler, posting sampler, posting ingestion, feature backfilling, relevancy label parser, training data generator, model generator, Ranklib, training index, search engine (dev), search engine (prod). Data passed between them: user events, posting docs, query info, relevancy scores, features, training data, ranking model.]
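The training data handed to Ranklib is a text file with one line per (query, posting) pair: a relevancy label, a query id, and numbered feature values. A minimal sketch of writing one such line follows; the function name, feature values, and comment field are hypothetical, and only the `label qid:N 1:v 2:v ... # comment` layout is the standard LETOR-style format RankLib reads.

```python
def to_ranklib_line(label, qid, features, comment=""):
    """Format one (query, posting) pair as a LETOR-style training row.

    `features` is a list of floats; feature ids are their 1-based positions.
    """
    feats = " ".join(f"{i}:{v:g}" for i, v in enumerate(features, start=1))
    line = f"{label} qid:{qid} {feats}"
    return f"{line} # {comment}" if comment else line

# Hypothetical row: label 2 (apply intent), query 17, three feature values.
row = to_ranklib_line(2, 17, [12.5, 0.0, 3.25], comment="posting_42")
print(row)  # 2 qid:17 1:12.5 2:0 3:3.25 # posting_42
```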
Training Pipeline - esltr plugin 1.0
Development Environment
[Pipeline diagram. Components: data warehouse, event sampler, feature parser, relevancy label generator, training data generator, model generator, HyperOpt, search engine (prod). Data passed between them: user events, live feature logs, relevancy scores, features, training data, ranking model.]
Offline Validation Pre-Deployment
Development Environment
● Re-ranking historical queries
Gives good directional guidance, but is not very accurate in absolute numbers due to 1) the inability to account for new items and 2) contamination from sponsored postings with artificially high rankings.
● Manual examination of common query patterns
Great for sanity checks. Reveals details beyond relevancy labels. More indicative of future performance.
● Best of both worlds?
Aljadda, Khalifeh & Korayem, Mohammed & Grainger, Trey. (2018). Fully Automated QA System for Large Scale Search and Recommendation Engines Leveraging Implicit User Feedback.
Deployment via A/B testing
Production Environment
Don’t modify the existing system.
a) Build a parallel system
b) Iterate
c) Test
d) Evaluate
Posting Ingestion
Production Environment
Step 1. Make it work. (We are still here!)
Step 2. Streaming magic (Not here.)
Search API
Production Environment
Tuning the LTR system
Iteration 1 (Q2 2017)
● LTR Features
1. job_title match score
2. job_description match score
3. employer_name match score
4. city-state match score
5. zipcode match score
6. distance <query location, posting>
● Relevancy Labels
Click: 1
Apply Intent: 2
Completed Applications: 3
● Success Criteria
- NDCG@10
● Use Cases
Site: desktop, mobile web
User: registered
Search Type:
- zip-code location only
- zip-code location + keyword
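For reference, NDCG@10 over the graded labels can be computed as in the sketch below. The exponential gain `2^rel - 1` and the `log2` discount are the common convention and an assumption here; the slides do not say which variant was used.

```python
import math

def dcg_at_k(labels, k):
    """Discounted cumulative gain with the common 2^rel - 1 gain (an assumption)."""
    return sum((2 ** rel - 1) / math.log2(rank + 1)
               for rank, rel in enumerate(labels[:k], start=1))

def ndcg_at_k(labels, k=10):
    """labels: graded relevancy of the results, in the order they were shown."""
    ideal = dcg_at_k(sorted(labels, reverse=True), k)
    return dcg_at_k(labels, k) / ideal if ideal > 0 else 0.0

# A result list whose best posting (label 3) was shown in third place.
print(ndcg_at_k([0, 1, 3, 0, 2]))
```

A list already sorted by label scores 1.0, and a list with no positive labels is defined as 0.0 to avoid dividing by zero.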
Relevancy Performance
Iteration 1
● Pros
Immediate boost of NDCG for zipcode-only searches (~5%)
● Cons
Keyword and location-only searches shared the same feature space, leading to a polarized user experience.
● Todo
Add query-string-related attributes to the list of features
● When things don’t work:
Query:
- keyword: Starbucks
- location: Arlington, VA, 22201
Results:
Rank  Employer    Location
1     Starbucks   Arlington, VA, 22201
2     Starbucks   Arlington, VA, 22203
3     WholeFoods  Arlington, VA, 22201
4     Starbucks   Washington, DC, 20007
Iteration 2 (Q3 2017)
● Success Criteria
- NDCG@10
- Application Rate
(# of applications/ # of search sessions)
● Use Cases
Site: desktop, mobile web
User: registered, unregistered
Search Type:
- zip-code location only
- zip-code location + keyword
- text location only
- text location + keyword
● LTR Features
1. job_title match scores
2. job_description match scores
3. employer_name match scores
4. location “match” level
5. distance <seeker, posting>
6. query location level
7. query length
8. platform (e.g. desktop, mobile)
9. job seeker registration status
● Relevancy Labels
Click : 1
Apply Intent: 2
Completed Applications: 3
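The distance <seeker, posting> feature can be illustrated with a great-circle distance between two coordinates. This is a generic haversine sketch; the slides do not say how the distance is actually computed (a search engine's built-in geo-distance functions are another option), and the coordinates below are approximate and illustrative.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

# Hypothetical seeker and posting coordinates (roughly Arlington, VA and Washington, DC).
print(round(haversine_miles(38.88, -77.10, 38.91, -77.04), 2))
```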
Relevancy Performance
Iteration 2
● Pros
More stable performance across the board.
● Cons
Low geo-location resolution rate (~95%) hurt queries with text locations
Default text analyzers supplied noisy signals to LTR.
● Todo
Enhance geocoding logic
Define customized analyzers (e.g. stopwords, synonym filters, keyword markers) for every field used by the ranking model
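A customized analyzer of the kind listed in the Todo might look like the Elasticsearch index settings below, written as a Python dict. This is a sketch under assumptions: the filter names, word lists, and analyzer name are hypothetical and not Snag's configuration. Only the token filter types (`stop`, `synonym`, `keyword_marker`, `porter_stem`) are standard Elasticsearch building blocks.

```python
import json

# Hypothetical analysis settings for a job title field.
job_title_analysis = {
    "settings": {
        "analysis": {
            "filter": {
                # Drop words that carry no ranking signal in a job title.
                "title_stop": {"type": "stop", "stopwords": ["job", "jobs", "hiring"]},
                # Treat common abbreviations as equivalent to the long form.
                "title_synonyms": {
                    "type": "synonym",
                    "synonyms": ["pt, part time", "ft, full time"],
                },
                # Protect selected terms from the stemmer further down the chain.
                "title_protected": {"type": "keyword_marker", "keywords": ["hostess"]},
            },
            "analyzer": {
                "job_title_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": [
                        "lowercase",
                        "title_protected",
                        "title_synonyms",
                        "title_stop",
                        "porter_stem",
                    ],
                }
            },
        }
    }
}

print(json.dumps(job_title_analysis, indent=2))
```

Each field used by the ranking model would get its own analyzer along these lines, so that the match scores fed to the model are computed on cleaned tokens.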
● When things don’t work:
Query:
- keyword: Part time restaurant
Results:
Rank  Title              Employer
1     Part time server   Chipotle
2     Full time cook     KFC
3     Part time Cashier  Restaurant Depot
4     Cook               District Taco
Iteration 3 (Q4 2017)
● Success Criteria
- NDCG@10
- Application rate
(# of applications / # of search sessions)
- Applicant conversion rate
(# of applicants / # of users)
- Applications per user
(# of applications / # of users)
● Use Cases
Site: desktop, mobile web
User: registered, unregistered
Search Type:
- zip-code location only
- zip-code location + keyword
- text location only
- text location + keyword
- keyword only
● LTR Features
1. job_title match scores
2. job_description match scores
3. employer_name match scores
4. location “match” level
5. distance <seeker, postings>
6. query location level
7. query length
8. platform (e.g. desktop, mobile)
9. job seeker registration status
10. is_faceted flag
● Relevancy Labels
Click : 1
Apply Intent: 2
Completed Applications: 3
Relevancy Performance
Iteration 3
● Pros
Location-only searches are 10%+ better than baseline. Keyword searches broke even.
● Cons
Large numbers of tied LTR scores artificially limited user options via presentation bias
Lack of features about job description context meant “click-baits” received too much exposure
● Todo
Randomize the ranking of postings with tied LTR scores on a per-user/session basis
Add query-independent posting-level features
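One way to randomize ties per session while keeping results stable within that session is to break ties with a hash of the session and posting ids. The sketch below is an assumption about how this could be done, not a description of Snag's implementation; the ids and scores are made up.

```python
import hashlib

def tie_break_key(session_id, posting_id):
    """Deterministic pseudo-random number in [0, 1) for a (session, posting) pair."""
    digest = hashlib.sha256(f"{session_id}:{posting_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2 ** 64

def rank_with_random_ties(scored_postings, session_id):
    """Sort (posting_id, ltr_score) pairs by score, descending.

    Postings with equal scores are ordered by the per-session hash, so the
    order differs across sessions but is repeatable within one session.
    """
    return sorted(
        scored_postings,
        key=lambda p: (-p[1], tie_break_key(session_id, p[0])),
    )

results = [("cashier", 2.0), ("uber_22209", 1.5), ("uber_22202", 1.5), ("uber_22203", 1.5)]
for session in ("session-a", "session-b"):
    print(session, [pid for pid, _ in rank_with_random_ties(results, session)])
```

Because the key is a pure function of the ids, paging through results or repeating a search in the same session returns a consistent order without storing any state.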
● When things don’t work:
Query:
- keyword: PT (part time)
- location: Arlington, VA
Results:
Rank  Title              Location
1     Part time Cashier  Arlington, VA, 22201
2     Drive Uber PT!     Arlington, VA, 22209
3     Drive Uber PT!     Arlington, VA, 22202
4     Drive Uber PT!     Arlington, VA, 22203
Current Iteration (Q1 2018)
● Success Criteria
- Application rate
(# of applications/ # of search sessions)
- Applicant conversion rate
(# of applicants/ # of users)
- Applications per user
(# of applications / # of users)
- Application diversity
(# of distinct applied postings/ # of applications)
● Use cases
Site: mobile apps, desktop, mobile web
User: registered, unregistered
Search Type:
- zip-code location only
- zip-code location + keyword
- text location only
- text location + keyword
- keyword only
- user coordinates only (a.k.a Jobs near me)
- user coordinates + keyword
● LTR features
1. job_title match scores
2. job_description match scores
3. employer_name match scores
4. location “match” level
5. distance <seeker, postings>
6. query location level
7. query length
8. platform (e.g. desktop, mobile)
9. job seeker registration status
10. is_faceted flag
11. location confidence level of postings
(proxy for posting quality)
● Relevancy Labels
Click : 1
Apply Intent: 2
Completed Applications: 3
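The four success criteria above are simple ratios. A minimal sketch of computing them from an application log follows; the input layout (one `(user_id, posting_id)` pair per completed application, plus session and user counts) is a hypothetical simplification.

```python
def marketplace_metrics(applications, n_search_sessions, n_users):
    """applications: list of (user_id, posting_id), one per completed application."""
    n_apps = len(applications)
    applicants = {user for user, _ in applications}
    distinct_postings = {posting for _, posting in applications}
    return {
        # number of applications / number of search sessions
        "application_rate": n_apps / n_search_sessions,
        # number of applicants / number of users
        "applicant_conversion_rate": len(applicants) / n_users,
        # number of applications / number of users
        "applications_per_user": n_apps / n_users,
        # number of distinct applied postings / number of applications
        "application_diversity": len(distinct_postings) / n_apps if n_apps else 0.0,
    }

apps = [("u1", "p1"), ("u1", "p2"), ("u2", "p1"), ("u3", "p3")]
metrics = marketplace_metrics(apps, n_search_sessions=20, n_users=10)
print(metrics)
```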
Android App Live Performance (April, 2018)
Qualitative assessments
● Signal Regularisation: No particular field has outsized impact on relevancy anymore
● Signal Coordination: e.g. the interaction between text and location relevancy is more balanced
● Randomized ties => Better Match: Randomization enables well-distributed matchings and better marketplace health, and partially corrects positional bias
Metrics
Metric                     Control (80% of users)  Test (20% of users)  Average % Lift
Application Rate           0.1273 (0.0005)         0.1409 (0.0011)      10.72%
Applicant Conversion Rate  33.86% (0.20%)          36.64% (0.43%)       8.22%
Apply Intent Diversity     0.676 (0.002)           0.759 (0.004)        12.40%
Click Diversity            0.663 (0.002)           0.807 (0.004)        21.62%
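As a sanity check on the table, relative lift can be recomputed from the reported control and test values. Lift computed this way from the rounded aggregates lands within about 0.15 percentage points of the reported "Average % Lift" figures but does not match them exactly; one possible reason, which is an assumption, is that the reported figure averages lift over shorter periods.

```python
def relative_lift(control, test):
    """Relative improvement of test over control, as a fraction."""
    return (test - control) / control

# (control, test) pairs copied from the table above.
reported = {
    "Application Rate": (0.1273, 0.1409),
    "Applicant Conversion Rate": (0.3386, 0.3664),
    "Apply Intent Diversity": (0.676, 0.759),
    "Click Diversity": (0.663, 0.807),
}

for name, (control, test) in reported.items():
    print(f"{name}: {relative_lift(control, test):.2%}")
```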
Engineering Challenges
● Latency
● API: window size from 3000 to 1000 to 500
● Igniter (posting ingestion) execution time
● Signal Quality
● Randomization for result consistency
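The window size mentioned above is the number of top results per shard that the LTR model rescores, so shrinking it trades ranking coverage for latency. A sketch of such a request body for the Elasticsearch Learning to Rank plugin follows; the `rescore` / `sltr` structure matches the plugin's documented query, while the field name, parameter name, and model name are hypothetical.

```python
import json

def ltr_rescore_body(keywords, model_name, window_size=500):
    """Build a search body that rescores the top `window_size` hits with an LTR model."""
    return {
        # Cheap first-pass retrieval (field name is illustrative).
        "query": {"match": {"job_title": keywords}},
        "rescore": {
            # Only this many top hits per shard are passed to the model.
            "window_size": window_size,
            "query": {
                "rescore_query": {
                    "sltr": {
                        "params": {"keywords": keywords},
                        "model": model_name,
                    }
                }
            },
        },
    }

body = ltr_rescore_body("barista", "hypothetical_ltr_model")
print(json.dumps(body, indent=2))
```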
Lessons Learned
Model Development
● Relevancy tuning can create feedback loops. Look ahead.
Changes in the ranking function sometimes trigger changes in user behavior, which in turn invalidate said ranking function. Treat relevancy tuning as interactive experiments, not a curve-fitting exercise.
● Apply strong model assumptions to correct deficiencies in old ranking functions
Use sound behavioral hypotheses, grounded in data analysis and qualitative user research, to regulate model behavior. Historical data can be noisy. Let A/B tests be the final judge.
● Engineer the relevancy labels as well as the features
Implicit feedback is not an absolute measure of relevancy and should be modeled to account for biases and behavioral assumptions.
● Ranking functions are only as expressive as the features you feed them
Any relevancy insight that can’t be encoded as a meaningful difference in the feature space will not be reflected in the search results.
Lessons Learned
Engineering & Infrastructure
● Prioritize velocity of iteration (avoid analysis paralysis)
● Worked backwards from conclusions about system latency
Future Work
Posting and Query Semantics Features
● Contextual information in posting descriptions contributes many relevancy signals
● Back-tests on both manually crafted bag-of-words features and machine-learned representations (e.g. via SVD, word2vec) already showed a significant lift in reranked NDCG
● Some concerns about query-time performance and over-fitting with long NLP feature vectors
High context:
“... hiring individuals to work as part-time Package Handlers... involves continual lifting, lowering and sliding packages that typically weigh 25 - 35 lbs… typically do not work on holidays.... working approximately 17.5 - 20 hours per week… outstanding education assistance of up to $2,625 per semester...”
Low context:
“We have a part time opening for a delivery driver position. Must be authorized to work in the US”
Click / Relevancy Label Modeling
Model Improvements
● Build multi-stage click models to account for factors that cannot be formulated as query-time LTR features (e.g. rank position, between-session correlations).
● Creates a positive feedback loop that boosts potentially relevant postings with low exposure (and penalizes the reverse)
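To illustrate the idea of correcting for rank position, here is a single-stage, position-based simplification, not the multi-stage model proposed above. It divides clicks by the expected number of examinations instead of raw impressions. The examination probabilities are hypothetical placeholders; in practice they would be estimated from logs, for example with a click-model library such as PyClick.

```python
from collections import defaultdict

# Assumed probability that a user examines each rank (hypothetical values).
EXAMINATION = {1: 1.0, 2: 0.6, 3: 0.4, 4: 0.3}
DEFAULT_EXAMINATION = 0.2  # used for ranks not listed above

def debiased_click_score(impressions):
    """impressions: iterable of (posting_id, rank, clicked).

    Returns clicks divided by expected examinations per posting. Unlike a
    raw click-through rate this can exceed 1 on small samples.
    """
    clicks = defaultdict(float)
    examined = defaultdict(float)
    for posting_id, rank, clicked in impressions:
        examined[posting_id] += EXAMINATION.get(rank, DEFAULT_EXAMINATION)
        clicks[posting_id] += 1.0 if clicked else 0.0
    return {p: clicks[p] / examined[p] for p in examined}

log = [
    ("top", 1, True), ("top", 1, False),
    ("buried", 4, True), ("buried", 4, False), ("buried", 4, False), ("buried", 4, False),
]
scores = debiased_click_score(log)
print(scores)
```

In this toy log the posting shown at rank 4 has the lower raw click-through rate (1 in 4 versus 1 in 2) but the higher corrected score, which is the kind of low-exposure posting the slide wants to boost.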
Personalized Matching
Model Improvements
● Incorporate LTR features about matching signals between job seeker preferences/qualifications and job requirements
● (Potentially) an online learning module that dynamically adjusts the rankings shown to each user based on onsite behavior
I want a part time job near my home!
(...That pays >$15 per hour. No night shifts! ...is in the retail industry, where I have 5 years of experience. Bonus points if it’s Harris Teeter…)
Engineering Improvements
Engineering & Infrastructure
● Push-button training pipeline
● Automated push button deployment for re-indexing
● Latency and scale improvements
References
● Elasticsearch: https://www.elastic.co/guide/en/elasticsearch/reference/current/index.htm
● ES Learning to Rank Plugin: http://elasticsearch-learning-to-rank.readthedocs.io/en/latest/
● Relevancy tuning: Turnbull, Doug, and John Berryman. Relevant Search with Applications for Solr and Elasticsearch. Manning, 2016.
● lambdaMart: C. Burges. From RankNet to LambdaRank to LambdaMART: An overview. Technical Report MSR-TR-2010-82,
Microsoft Research, 2010.
● ranklib: https://sourceforge.net/p/lemur/wiki/RankLib
● xgboost: Tianqi Chen and Carlos Guestrin. 2016. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM
SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '16). ACM, New York, NY, USA, 785-794
K.V. Rashmi and Ran Gilad-Bachrach, Dart: Dropouts meet multiple additive regression trees, April 2015
● hyperopt: J. Bergstra, R. Bardenet, Y. Bengio and B. Kégl. Algorithms for Hyper-parameter Optimization. Proc. Neural Information
Processing Systems 24 (NIPS2011), 2546–2554, 2011
● Interleaving: O. Chapelle, T. Joachims, F. Radlinski, Yisong Yue, Large-Scale Validation and Analysis of Interleaved Search Evaluation,
ACM Transactions on Information Systems (TOIS), 30(1):6.1-6.41, 2012.
T. Joachims, Evaluating Retrieval Performance Using Clickthrough Data, Proceedings of the SIGIR Workshop on Mathematical/Formal
Methods in Information Retrieval, 2002.
● document & query embeddings: Mitra, Bhaskar & Craswell, Nick. (2017). Neural Models for Information Retrieval.
Hamed Zamani and W. Bruce Croft. 2017. Relevance-based Word Embedding. In Proceedings of the 40th International ACM SIGIR
Conference on Research and Development in Information Retrieval (SIGIR '17). ACM, New York, NY, USA, 505-514
● Click model: Chuklin, A., Markov, I., & de Rijke, M. (2015). Click models for web search. Synthesis Lectures on Information Concepts
Retrieval and Services, 7(3), 1–115. With Pyclick: https://github.com/markovi/PyClick
Y. Hu, Y. Koren and C. Volinsky, "Collaborative Filtering for Implicit Feedback Datasets," 2008 Eighth IEEE International Conference
on Data Mining, Pisa, 2008, pp. 263-272.
Our posting index is constantly changing
Thank you.
Any questions?
1 of 49

Recommended

Interleaving, Evaluation to Self-learning Search @904Labs by
Interleaving, Evaluation to Self-learning Search @904LabsInterleaving, Evaluation to Self-learning Search @904Labs
Interleaving, Evaluation to Self-learning Search @904LabsJohn T. Kane
812 views27 slides
Search Product Manager: Software PM vs. Enterprise PM or What does that * PM do? by
Search Product Manager: Software PM vs. Enterprise PM or What does that * PM do?Search Product Manager: Software PM vs. Enterprise PM or What does that * PM do?
Search Product Manager: Software PM vs. Enterprise PM or What does that * PM do?John T. Kane
1.3K views31 slides
Haystack 2018 - Algorithmic Extraction of Keywords Concepts and Vocabularies by
Haystack 2018 - Algorithmic Extraction of Keywords Concepts and VocabulariesHaystack 2018 - Algorithmic Extraction of Keywords Concepts and Vocabularies
Haystack 2018 - Algorithmic Extraction of Keywords Concepts and VocabulariesMax Irwin
3.1K views29 slides
Crowdsourced query augmentation through the semantic discovery of domain spec... by
Crowdsourced query augmentation through the semantic discovery of domain spec...Crowdsourced query augmentation through the semantic discovery of domain spec...
Crowdsourced query augmentation through the semantic discovery of domain spec...Trey Grainger
8K views18 slides
Vespa, A Tour by
Vespa, A TourVespa, A Tour
Vespa, A TourMatthewOverstreet2
492 views25 slides
Haystacks slides by
Haystacks slidesHaystacks slides
Haystacks slidesTed Sullivan
544 views24 slides

More Related Content

What's hot

AI, Search, and the Disruption of Knowledge Management by
AI, Search, and the Disruption of Knowledge ManagementAI, Search, and the Disruption of Knowledge Management
AI, Search, and the Disruption of Knowledge ManagementTrey Grainger
522 views134 slides
Reflected intelligence evolving self-learning data systems by
Reflected intelligence  evolving self-learning data systemsReflected intelligence  evolving self-learning data systems
Reflected intelligence evolving self-learning data systemsTrey Grainger
6.5K views90 slides
Enhancing relevancy through personalization & semantic search by
Enhancing relevancy through personalization & semantic searchEnhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchTrey Grainger
3.2K views62 slides
The Intent Algorithms of Search & Recommendation Engines by
The Intent Algorithms of Search & Recommendation EnginesThe Intent Algorithms of Search & Recommendation Engines
The Intent Algorithms of Search & Recommendation EnginesTrey Grainger
2.4K views108 slides
Leveraging Lucene/Solr as a Knowledge Graph and Intent Engine by
Leveraging Lucene/Solr as a Knowledge Graph and Intent EngineLeveraging Lucene/Solr as a Knowledge Graph and Intent Engine
Leveraging Lucene/Solr as a Knowledge Graph and Intent EngineTrey Grainger
8K views37 slides
Self-learned Relevancy with Apache Solr by
Self-learned Relevancy with Apache SolrSelf-learned Relevancy with Apache Solr
Self-learned Relevancy with Apache SolrTrey Grainger
2.5K views103 slides

What's hot(20)

AI, Search, and the Disruption of Knowledge Management by Trey Grainger
AI, Search, and the Disruption of Knowledge ManagementAI, Search, and the Disruption of Knowledge Management
AI, Search, and the Disruption of Knowledge Management
Trey Grainger522 views
Reflected intelligence evolving self-learning data systems by Trey Grainger
Reflected intelligence  evolving self-learning data systemsReflected intelligence  evolving self-learning data systems
Reflected intelligence evolving self-learning data systems
Trey Grainger6.5K views
Enhancing relevancy through personalization & semantic search by Trey Grainger
Enhancing relevancy through personalization & semantic searchEnhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic search
Trey Grainger3.2K views
The Intent Algorithms of Search & Recommendation Engines by Trey Grainger
The Intent Algorithms of Search & Recommendation EnginesThe Intent Algorithms of Search & Recommendation Engines
The Intent Algorithms of Search & Recommendation Engines
Trey Grainger2.4K views
Leveraging Lucene/Solr as a Knowledge Graph and Intent Engine by Trey Grainger
Leveraging Lucene/Solr as a Knowledge Graph and Intent EngineLeveraging Lucene/Solr as a Knowledge Graph and Intent Engine
Leveraging Lucene/Solr as a Knowledge Graph and Intent Engine
Trey Grainger8K views
Self-learned Relevancy with Apache Solr by Trey Grainger
Self-learned Relevancy with Apache SolrSelf-learned Relevancy with Apache Solr
Self-learned Relevancy with Apache Solr
Trey Grainger2.5K views
Leveraging Lucene/Solr as a Knowledge Graph and Intent Engine: Presented by T... by Lucidworks
Leveraging Lucene/Solr as a Knowledge Graph and Intent Engine: Presented by T...Leveraging Lucene/Solr as a Knowledge Graph and Intent Engine: Presented by T...
Leveraging Lucene/Solr as a Knowledge Graph and Intent Engine: Presented by T...
Lucidworks1.6K views
The Apache Solr Semantic Knowledge Graph by Trey Grainger
The Apache Solr Semantic Knowledge GraphThe Apache Solr Semantic Knowledge Graph
The Apache Solr Semantic Knowledge Graph
Trey Grainger7.2K views
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S... by Lucidworks
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...
Lucidworks18.3K views
Using a keyword extraction pipeline to understand concepts in future work sec... by Kai Li
Using a keyword extraction pipeline to understand concepts in future work sec...Using a keyword extraction pipeline to understand concepts in future work sec...
Using a keyword extraction pipeline to understand concepts in future work sec...
Kai Li415 views
Dice.com Bay Area Search - Beyond Learning to Rank Talk by Simon Hughes
Dice.com Bay Area Search - Beyond Learning to Rank TalkDice.com Bay Area Search - Beyond Learning to Rank Talk
Dice.com Bay Area Search - Beyond Learning to Rank Talk
Simon Hughes945 views
Searching for Meaning by Trey Grainger
Searching for MeaningSearching for Meaning
Searching for Meaning
Trey Grainger1.8K views
The Relevance of the Apache Solr Semantic Knowledge Graph by Trey Grainger
The Relevance of the Apache Solr Semantic Knowledge GraphThe Relevance of the Apache Solr Semantic Knowledge Graph
The Relevance of the Apache Solr Semantic Knowledge Graph
Trey Grainger2.3K views
How to Build a Semantic Search System by Trey Grainger
How to Build a Semantic Search SystemHow to Build a Semantic Search System
How to Build a Semantic Search System
Trey Grainger5.3K views
The Apache Solr Smart Data Ecosystem by Trey Grainger
The Apache Solr Smart Data EcosystemThe Apache Solr Smart Data Ecosystem
The Apache Solr Smart Data Ecosystem
Trey Grainger6.1K views
Building Search & Recommendation Engines by Trey Grainger
Building Search & Recommendation EnginesBuilding Search & Recommendation Engines
Building Search & Recommendation Engines
Trey Grainger6K views
Searching and Querying Knowledge Graphs with Solr/SIREn - A Reference Archite... by Lucidworks
Searching and Querying Knowledge Graphs with Solr/SIREn - A Reference Archite...Searching and Querying Knowledge Graphs with Solr/SIREn - A Reference Archite...
Searching and Querying Knowledge Graphs with Solr/SIREn - A Reference Archite...
Lucidworks9K views
Reflected Intelligence - Lucene/Solr as a self-learning data system: Presente... by Lucidworks
Reflected Intelligence - Lucene/Solr as a self-learning data system: Presente...Reflected Intelligence - Lucene/Solr as a self-learning data system: Presente...
Reflected Intelligence - Lucene/Solr as a self-learning data system: Presente...
Lucidworks1.1K views
Search Accuracy Metrics and Predictive Analytics - A Big Data Use Case: Prese... by Lucidworks
Search Accuracy Metrics and Predictive Analytics - A Big Data Use Case: Prese...Search Accuracy Metrics and Predictive Analytics - A Big Data Use Case: Prese...
Search Accuracy Metrics and Predictive Analytics - A Big Data Use Case: Prese...
Lucidworks1.1K views

Similar to Haystack- Learning to rank in an hourly job market

Haystack 2019 - Towards a Learning To Rank Ecosystem @ Snag - We've got LTR t... by
Haystack 2019 - Towards a Learning To Rank Ecosystem @ Snag - We've got LTR t...Haystack 2019 - Towards a Learning To Rank Ecosystem @ Snag - We've got LTR t...
Haystack 2019 - Towards a Learning To Rank Ecosystem @ Snag - We've got LTR t...OpenSource Connections
379 views32 slides
Creating Consistency in ​ Compensation with Global Job Leveling by
Creating Consistency in ​ Compensation with Global Job LevelingCreating Consistency in ​ Compensation with Global Job Leveling
Creating Consistency in ​ Compensation with Global Job LevelingPayScale, Inc.
205 views22 slides
Leveraging Machine Learning for Competitive Advantage by Dylan Hogg - Search ... by
Leveraging Machine Learning for Competitive Advantage by Dylan Hogg - Search ...Leveraging Machine Learning for Competitive Advantage by Dylan Hogg - Search ...
Leveraging Machine Learning for Competitive Advantage by Dylan Hogg - Search ...Search Party
281 views30 slides
Leveraging Machine Learning for Competitive Advantage at Search Party by
Leveraging Machine Learning for Competitive Advantage at Search PartyLeveraging Machine Learning for Competitive Advantage at Search Party
Leveraging Machine Learning for Competitive Advantage at Search PartyDylan Hogg
1.5K views30 slides
QA Test Engineer by
QA Test EngineerQA Test Engineer
QA Test EngineerManoj Pal
465 views3 slides
Yelp Ad Targeting at Scale with Apache Spark with Inaz Alaei-Novin and Joe Ma... by
Yelp Ad Targeting at Scale with Apache Spark with Inaz Alaei-Novin and Joe Ma...Yelp Ad Targeting at Scale with Apache Spark with Inaz Alaei-Novin and Joe Ma...
Yelp Ad Targeting at Scale with Apache Spark with Inaz Alaei-Novin and Joe Ma...Databricks
816 views46 slides

Similar to Haystack- Learning to rank in an hourly job market (20)

Haystack 2019 - Towards a Learning To Rank Ecosystem @ Snag - We've got LTR t... by OpenSource Connections
Haystack 2019 - Towards a Learning To Rank Ecosystem @ Snag - We've got LTR t...Haystack 2019 - Towards a Learning To Rank Ecosystem @ Snag - We've got LTR t...
Haystack 2019 - Towards a Learning To Rank Ecosystem @ Snag - We've got LTR t...
Creating Consistency in ​ Compensation with Global Job Leveling by PayScale, Inc.
Creating Consistency in ​ Compensation with Global Job LevelingCreating Consistency in ​ Compensation with Global Job Leveling
Creating Consistency in ​ Compensation with Global Job Leveling
PayScale, Inc.205 views
Leveraging Machine Learning for Competitive Advantage by Dylan Hogg - Search ... by Search Party
Leveraging Machine Learning for Competitive Advantage by Dylan Hogg - Search ...Leveraging Machine Learning for Competitive Advantage by Dylan Hogg - Search ...
Leveraging Machine Learning for Competitive Advantage by Dylan Hogg - Search ...
Search Party281 views
Leveraging Machine Learning for Competitive Advantage at Search Party by Dylan Hogg
Leveraging Machine Learning for Competitive Advantage at Search PartyLeveraging Machine Learning for Competitive Advantage at Search Party
Leveraging Machine Learning for Competitive Advantage at Search Party
Dylan Hogg1.5K views
QA Test Engineer by Manoj Pal
QA Test EngineerQA Test Engineer
QA Test Engineer
Manoj Pal465 views
Yelp Ad Targeting at Scale with Apache Spark with Inaz Alaei-Novin and Joe Ma... by Databricks
Yelp Ad Targeting at Scale with Apache Spark with Inaz Alaei-Novin and Joe Ma...Yelp Ad Targeting at Scale with Apache Spark with Inaz Alaei-Novin and Joe Ma...
Yelp Ad Targeting at Scale with Apache Spark with Inaz Alaei-Novin and Joe Ma...
Databricks816 views
Personalized Job Recommendation System at LinkedIn: Practical Challenges and ... by Benjamin Le
Personalized Job Recommendation System at LinkedIn: Practical Challenges and ...Personalized Job Recommendation System at LinkedIn: Practical Challenges and ...
Personalized Job Recommendation System at LinkedIn: Practical Challenges and ...
Benjamin Le4.5K views
Dataiku at SF DataMining Meetup - Kaggle Yandex Challenge by Dataiku
Dataiku at SF DataMining Meetup - Kaggle Yandex ChallengeDataiku at SF DataMining Meetup - Kaggle Yandex Challenge
Dataiku at SF DataMining Meetup - Kaggle Yandex Challenge
Dataiku3.8K views
Florian Douetteau @ Dataiku by PAPIs.io
Florian Douetteau @ DataikuFlorian Douetteau @ Dataiku
Florian Douetteau @ Dataiku
PAPIs.io1.3K views
Webinar - Q2 2023: What’s New in MarketPay by PayScale, Inc.
Webinar - Q2 2023: What’s New in MarketPayWebinar - Q2 2023: What’s New in MarketPay
Webinar - Q2 2023: What’s New in MarketPay
PayScale, Inc.344 views
Preparing for Peak in Ecommerce | eTail Asia 2020 by Lucidworks
Preparing for Peak in Ecommerce | eTail Asia 2020Preparing for Peak in Ecommerce | eTail Asia 2020
Preparing for Peak in Ecommerce | eTail Asia 2020
Lucidworks236 views
Application Test Engineer by Manoj Pal
Application Test EngineerApplication Test Engineer
Application Test Engineer
Manoj Pal197 views
How to Prepare for Product Based Companies? by Joel Kingsley
How to Prepare for Product Based Companies?How to Prepare for Product Based Companies?
How to Prepare for Product Based Companies?
Joel Kingsley238 views

Recently uploaded

GDG Cloud Southlake 28 Brad Taylor and Shawn Augenstein Old Problems in the N... by
GDG Cloud Southlake 28 Brad Taylor and Shawn Augenstein Old Problems in the N...GDG Cloud Southlake 28 Brad Taylor and Shawn Augenstein Old Problems in the N...
GDG Cloud Southlake 28 Brad Taylor and Shawn Augenstein Old Problems in the N...James Anderson
33 views32 slides
Automating a World-Class Technology Conference; Behind the Scenes of CiscoLive by
Automating a World-Class Technology Conference; Behind the Scenes of CiscoLiveAutomating a World-Class Technology Conference; Behind the Scenes of CiscoLive
Automating a World-Class Technology Conference; Behind the Scenes of CiscoLiveNetwork Automation Forum
21 views35 slides
ChatGPT and AI for Web Developers by
ChatGPT and AI for Web DevelopersChatGPT and AI for Web Developers
ChatGPT and AI for Web DevelopersMaximiliano Firtman
181 views82 slides
DALI Basics Course 2023 by
DALI Basics Course  2023DALI Basics Course  2023
DALI Basics Course 2023Ivory Egg
14 views12 slides
SAP Automation Using Bar Code and FIORI.pdf by
Haystack- Learning to rank in an hourly job market

  • 1. Learning to Rank in an Hourly Job Marketplace Xun Wang (xun.wang@snag.co) Jason Kowalewski (jason.kowalewski@snag.co)
  • 2. Snag. The marketing slide
       ● 85 MM registered workers
       ● 325,000 employer locations
       ● 4.5 MM applications submitted monthly
       ● 1 MM active job postings
  • 3. Our Matching Problem: Location? Pay? Industry? Part Time
  • 4. Our Matching Problem: Sandwich Artist? Host/Hostess? Barista? cashier
  • 5. Our Matching Problem: Determine Intent / Context. ?????????
  • 8. Query: “mcdonalds in 92801” title^n
  • 10. Query: “mcdonalds in 11231” Boosts on channel
  • 12. The (old) system
       ● System that is too complex to accurately tune the boosts: relevancy whack-a-mole
       ● Inventory content frequently changes
       ● Lacks data-driven input: assumption-driven, without proper statistical analysis
       “If only there was a way to do this differently…”
  • 13. The Hourly Job Marketplace
  • 14. Job search is both an IR and a match problem
       Search / IR (e.g. YouTube): { User } { Resource }
       ● Many to Many
       ● Asymmetric
       ● Unlimited supply
       Match (e.g. online chess): { Player } { Player }
       ● One to One
       ● Symmetric
       ● No extra supply
       Job Search: { Job Seekers } { Job Positions }
       ● One to Many
       ● Asymmetric and Bi-directional
       ● Limited supply, unlimited “attempts”
  • 15. Hourly Jobs are not ‘Sticky’
       ● Fragmented: Organized around “shifts”. A worker can be assigned 1 to 30+ hours per week. Many hold multiple jobs.
       ● Transactional: Workers stay at each job for 6 months on average.
       ● Lightly Skilled: Many hourly jobs require just a high school diploma.
       Source: https://www.snag.co/employers/wp-content/uploads/2016/07/2016_SOTHW_Report-3.pdf
  • 16. Hourly job search is often a recommendation
       ● Schedule and location can be more important than the actual duties of the job
       ● Queries are not explicit (40% don’t have keywords)
  • 17. Our relevancy signals are collected from multiple levels of interactions
       [Figure: interaction funnel across the Jobseeker Platform and the Employer Platform: 0. Search, 1. Clicks, 2. Apply Intents, 3. Completed Applications, 4. Interviews & Assessments, 5. Hires]
  • 18. We balance a variety of market participants
       Snag
       ● Revenue growth
       ● User base growth
       ● Marketplace health & efficiency
       Job-seekers
       ● Job preference
       ● Job requirements
       ● Employer responsiveness
       Employers
       ● Seeker volume & velocity
       ● Candidate quality
       ● Cost per hire
       Advertisers/Partners
       ● Seeker volume
       ● Conversion rate (e.g. CTR)
       ● Cost per lead
  • 20. Learning to Rank Model (Development Environment)
       Relevancy Labels
       ● Abandonment: 0
       ● Click: 1
       ● Apply Intent: 2
       Features
       ● match scores on job title, employer name, job type, ...
       ● distance <position, seeker>
       ● match scores on query location (e.g. zip-code, city)
       ● match scores on job description
       ● query string attributes (e.g. length, entity type)
       ● posting attributes (e.g. position, requirements, industry, semantic representation)
       ● ...
       Model: LambdaMART. Composability!
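The label scheme on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: the event names, the `grade` and `to_ranklib_line` helpers, and the feature values are hypothetical. The output line layout (`<label> qid:<id> <feature>:<value> ... # comment`) is the LETOR-style text format that RankLib reads.

```python
from collections import defaultdict

# Graded labels from the slide; the event names themselves are hypothetical.
LABELS = {"abandonment": 0, "click": 1, "apply_intent": 2}

def grade(events):
    """Keep the strongest label seen for each (query id, posting id) pair."""
    best = defaultdict(int)
    for qid, posting_id, event in events:
        best[(qid, posting_id)] = max(best[(qid, posting_id)], LABELS[event])
    return dict(best)

def to_ranklib_line(label, qid, features, posting_id):
    """Format one training row as '<label> qid:<id> 1:<v1> 2:<v2> ... # <comment>'."""
    feats = " ".join(f"{i}:{v:g}" for i, v in enumerate(features, start=1))
    return f"{label} qid:{qid} {feats} # {posting_id}"

# Toy data: posting p1 was clicked and then drew an apply intent; p2 was ignored.
events = [
    (1, "p1", "click"),
    (1, "p1", "apply_intent"),
    (1, "p2", "abandonment"),
]
# Made-up feature vectors (e.g. title match score, employer match score, distance).
features = {"p1": [12.5, 0.0, 3.2], "p2": [4.1, 1.0, 18.0]}

labels = grade(events)
lines = [to_ranklib_line(labels[(q, p)], q, features[p], p) for (q, p) in sorted(labels)]
```

Taking the maximum label per pair is one simple choice; a weighted or debiased label would be just as easy to plug into `grade`.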
  • 21. Training Pipeline - esltr plugin 0.x (Development Environment)
       [Figure: pipeline diagram. Components: data warehouse, posting collection, event sampler, posting sampler, posting ingestion, training index, search engine (dev), feature backfilling, relevancy label parser, training data generator, model generator (RankLib), search engine (prod). Data passed between them: user events, posting docs, query info, features, relevancy scores, training data, ranking model.]
  • 22. Training Pipeline - esltr plugin 1.0 (Development Environment)
       [Figure: pipeline diagram. Components: data warehouse, event sampler, feature parser, relevancy label generator, training data generator, model generator (+ HyperOpt), search engine (prod). Data passed between them: user events, live feature logs, features, relevancy scores, training data, ranking model.]
  • 23. Offline Validation Pre-Deployment (Development Environment)
       ● Re-ranking historical queries: Gives good directional guidance, but not very accurate in absolute numbers due to 1) inability to account for new items and 2) contamination from sponsored postings with artificially high rankings.
       ● Manual examination of common query patterns: Great for sanity checks. Reveals details beyond relevancy labels. More indicative of future performance.
       ● Best of both worlds? Aljadda, Khalifeh & Korayem, Mohammed & Grainger, Trey. (2018). Fully Automated QA System for Large Scale Search and Recommendation Engines Leveraging Implicit User Feedback.
  • 24. Deployment via A-B testing (Production Environment): Don’t modify the existing system.
  • 25. Deployment via A-B testing (Production Environment): a) Build a parallel system b) Iterate c) Test d) Evaluate
  • 26. Posting Ingestion (Production Environment)
       Step 1. Make it work. (We are still here!)
       Step 2. Streaming magic. (Not here.)
  • 28. Tuning the LTR system
  • 29. Iteration 1 (Q2 2017)
       ● LTR Features
         1. job_title match score
         2. job_description match score
         3. employer_name match score
         4. city-state match score
         5. zipcode match score
         6. distance <query location, posting>
       ● Relevancy Labels
         Click: 1
         Apply Intent: 2
         Completed Applications: 3
       ● Success Criteria
         - NDCG@10
       ● Use Cases
         Site: desktop, mobile web
         User: registered
         Search Type:
         - zip-code location only
         - zip-code location + keyword
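For readers unfamiliar with the success criterion, here is a minimal sketch of NDCG@10 over graded labels such as the 0 to 3 scale above. It assumes the common exponential gain `2^label - 1` with a `log2` position discount; the slides do not say which gain variant was used, and the function names are illustrative.

```python
import math

def dcg_at_k(labels, k):
    """Discounted cumulative gain of the first k labels, in ranked order."""
    return sum((2 ** rel - 1) / math.log2(rank + 2) for rank, rel in enumerate(labels[:k]))

def ndcg_at_k(labels, k=10):
    """DCG of the shown order divided by the DCG of the ideal (sorted) order."""
    ideal = dcg_at_k(sorted(labels, reverse=True), k)
    return dcg_at_k(labels, k) / ideal if ideal > 0 else 0.0

# Labels of the postings in the order they were shown for one query.
perfect = ndcg_at_k([3, 2, 1, 0])   # already in ideal order
swapped = ndcg_at_k([1, 3])         # best posting shown second
no_signal = ndcg_at_k([0, 0, 0])    # nothing relevant, defined here as 0.0
```

In practice the per-query scores would be averaged over a sample of queries before comparing two ranking models.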
  • 30. Relevancy Performance (Iteration 1)
       ● Pros: Immediate boost of NDCG for zipcode-only searches (~5%)
       ● Cons: Keyword and location-only searches shared the same feature space, leading to a polarized user experience.
       ● Todo: Add query-string-related attributes to the list of features
       ● When things don’t work:
         Query:
         - keyword: Starbucks
         - location: Arlington, VA, 22201
         Results:
         Rank  Employer    Location
         1     Starbucks   Arlington, VA, 22201
         2     Starbucks   Arlington, VA, 22203
         3     WholeFoods  Arlington, VA, 22201
         4     Starbucks   Washington, DC, 20007
  • 31. Iteration 2 (Q3 2017)
       ● Success Criteria
         - NDCG@10
         - Application rate (# of applications / # of search sessions)
       ● Use Cases
         Site: desktop, mobile web
         User: registered, unregistered
         Search Type:
         - zip-code location only
         - zip-code location + keyword
         - text location only
         - text location + keyword
       ● LTR Features
         1. job_title match scores
         2. job_description match scores
         3. employer_name match scores
         4. location “match” level
         5. distance <seeker, posting>
         6. query location level
         7. query length
         8. platform (e.g. desktop, mobile)
         9. job seeker registration status
       ● Relevancy Labels
         Click: 1
         Apply Intent: 2
         Completed Applications: 3
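The `distance <seeker, posting>` feature needs some distance function between two coordinates. The slides do not say how it was computed; the sketch below assumes great-circle (haversine) distance in kilometres with a mean Earth radius of 6371 km, and the function name and sample coordinates are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption, not from the slides

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Rough, illustrative coordinates for a seeker and a posting a few kilometres apart.
seeker = (38.887, -77.095)
posting = (38.905, -77.065)
feature_value = haversine_km(*seeker, *posting)
```

One degree of longitude at the equator comes out at roughly 111 km, which is a quick sanity check for the implementation.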
  • 32. Relevancy Performance (Iteration 2)
       ● Pros: More stable performance across the board.
       ● Cons: Low geo-location resolution rate (~95%) hurt queries with text locations. Default text analyzers supplied noisy signals to LTR.
       ● Todo: Enhance geo-coding logic. Define customized analyzers (e.g. stopwords, synonym filters, keyword markers) for every field used by the ranking model.
       ● When things don’t work:
         Query:
         - keyword: Part time restaurant
         Results:
         Rank  Title              Employer
         1     Part time server   Chipotle
         2     Full time cook     KFC
         3     Part time Cashier  Restaurant Depot
         4     Cook               District Taco
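As an illustration of the "customized analyzers" todo, the sketch below builds Elasticsearch index settings as a Python dict and prints them as JSON. The filter types (`stop`, `synonym`, `keyword_marker`, `stemmer`) are standard Elasticsearch token filters, but every name and word list here is a made-up assumption, not the configuration used in the talk.

```python
import json

# Hypothetical analysis settings; names and word lists are illustrative only.
settings = {
    "analysis": {
        "filter": {
            "job_stopwords": {"type": "stop", "stopwords": ["job", "jobs", "hiring"]},
            "job_synonyms": {"type": "synonym", "synonyms": ["pt => part time", "ft => full time"]},
            # keyword_marker protects listed terms from the stemmer that follows it.
            "protected_terms": {"type": "keyword_marker", "keywords": ["hostess"]},
            "english_stemmer": {"type": "stemmer", "language": "english"},
        },
        "analyzer": {
            "job_title_analyzer": {
                "type": "custom",
                "tokenizer": "standard",
                "filter": [
                    "lowercase",
                    "job_synonyms",
                    "job_stopwords",
                    "protected_terms",
                    "english_stemmer",
                ],
            }
        },
    }
}

body = json.dumps({"settings": settings}, indent=2)
print(body)
```

The printed body is what would be sent when creating an index; each field used by the ranking model could then reference its own analyzer in the mapping.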
  • 33. Iteration 3 (Q4 2017)
       ● Success Criteria
         - NDCG@10
         - Application rate (# of applications / # of search sessions)
         - Applicant conversion rate (# of applicants / # of users)
         - Applications per user (# of applications / # of users)
       ● Use Cases
         Site: desktop, mobile web
         User: registered, unregistered
         Search Type:
         - zip-code location only
         - zip-code location + keyword
         - text location only
         - text location + keyword
         - keyword only
       ● LTR Features
         1. job_title match scores
         2. job_description match scores
         3. employer_name match scores
         4. location “match” level
         5. distance <seeker, postings>
         6. query location level
         7. query length
         8. platform (e.g. desktop, mobile)
         9. job seeker registration status
         10. is_faceted flag
       ● Relevancy Labels
         Click: 1
         Apply Intent: 2
         Completed Applications: 3
  • 34. Relevancy Performance (Iteration 3)
       ● Pros: Location-only searches are 10%+ better than baseline. Keyword searches broke even.
       ● Cons: Large numbers of tied LTR scores artificially limited user options via presentation bias. Lack of features about job description contexts meant “click-baits” received too much exposure.
       ● Todo: Randomize the ranking of postings with tied LTR scores on a per-user/session basis. Add query-independent posting-level features.
       ● When things don’t work:
         Query:
         - keyword: PT (part time)
         - location: Arlington, VA
         Results:
         Rank  Title              Location
         1     Part time Cashier  Arlington, VA, 22201
         2     Drive Uber PT!     Arlington, VA, 22209
         3     Drive Uber PT!     Arlington, VA, 22202
         4     Drive Uber PT!     Arlington, VA, 22203
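One way to randomize tied LTR scores per session while keeping results stable within a session is to break ties with a hash of the session and posting ids. This is a sketch of that idea under assumed names (`tie_break_key`, `rank_with_random_ties`), not necessarily how the production system does it.

```python
import hashlib

def tie_break_key(session_id, posting_id):
    """Deterministic pseudo-random number for a (session, posting) pair."""
    digest = hashlib.sha256(f"{session_id}:{posting_id}".encode("utf-8")).hexdigest()
    return int(digest[:16], 16)

def rank_with_random_ties(scored, session_id):
    """Sort (posting_id, ltr_score) pairs by score descending, ties broken per session."""
    return sorted(scored, key=lambda item: (-item[1], tie_break_key(session_id, item[0])))

# Three postings share the same LTR score; only their relative order is shuffled.
scored = [("cashier", 2.5), ("uber-22209", 1.0), ("uber-22202", 1.0), ("uber-22203", 1.0)]
first_page = rank_with_random_ties(scored, session_id="session-a")
repeat_page = rank_with_random_ties(scored, session_id="session-a")
```

Because the key depends only on the ids, paging back and forth in one session returns the same order, while different sessions spread exposure across the tied postings.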
  • 35. Current Iteration (Q1 2018)
       ● Success Criteria
         - Application rate (# of applications / # of search sessions)
         - Applicant conversion rate (# of applicants / # of users)
         - Applications per user (# of applications / # of users)
         - Application diversity (# of distinct applied postings / # of applications)
       ● Use Cases
         Site: mobile apps, desktop, mobile web
         User: registered, unregistered
         Search Type:
         - zip-code location only
         - zip-code location + keyword
         - text location only
         - text location + keyword
         - keyword only
         - user coordinates only (a.k.a. Jobs near me)
         - user coordinates + keyword
       ● LTR Features
         1. job_title match scores
         2. job_description match scores
         3. employer_name match scores
         4. location “match” level
         5. distance <seeker, postings>
         6. query location level
         7. query length
         8. platform (e.g. desktop, mobile)
         9. job seeker registration status
         10. is_faceted flag
         11. Location confidence level of postings (proxy for posting quality)
       ● Relevancy Labels
         Click: 1
         Apply Intent: 2
         Completed Applications: 3
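The four success criteria are simple ratios, so they can be computed from two event lists. The sketch below assumes a hypothetical record layout (search sessions as `(session_id, user_id)` pairs, applications as `(user_id, posting_id)` pairs) and assumes every applicant also appears in the search sessions.

```python
def success_metrics(search_sessions, applications):
    """Compute the slide's ratios from (session_id, user_id) and (user_id, posting_id) pairs."""
    users = {user for _, user in search_sessions}
    applicants = {user for user, _ in applications}
    n_apps = len(applications)
    distinct_postings = {posting for _, posting in applications}
    return {
        "application_rate": n_apps / len(search_sessions),
        "applicant_conversion_rate": len(applicants) / len(users),
        "applications_per_user": n_apps / len(users),
        "application_diversity": len(distinct_postings) / n_apps if n_apps else 0.0,
    }

# Toy data: three users, four search sessions, three applications to two postings.
sessions = [("s1", "u1"), ("s2", "u1"), ("s3", "u2"), ("s4", "u3")]
applications = [("u1", "p1"), ("u1", "p2"), ("u2", "p1")]
metrics = success_metrics(sessions, applications)
```

A diversity value near 1.0 means applications are spread over many postings; a low value means a few postings absorb most of them.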
  • 36. Android App Live Performance (April, 2018)
       Metrics
       Metric                     Control (80% of users)  Test (20% of users)  Average % Lift
       Application Rate           0.1273 (0.0005)         0.1409 (0.0011)      10.72%
       Applicant Conversion Rate  33.86% (0.20%)          36.64% (0.43%)       8.22%
       Apply Intent Diversity     0.676 (0.002)           0.759 (0.004)        12.40%
       Click Diversity            0.663 (0.002)           0.807 (0.004)        21.62%
       Qualitative assessments
       ● Signal Regularisation: No particular field has outsized impact on relevancy anymore
       ● Signal Coordination: e.g. the interaction between text and location relevancy is more balanced
       ● Randomized ties => Better Match: Randomization enables well-distributed matchings and better marketplace health, and partially corrects positional bias
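A quick way to read the Application Rate row is to compute the relative lift and a rough z-score from the reported values. The sketch assumes the numbers in parentheses are standard errors and that the two groups are independent; the slides do not state either. The lift of the pooled means comes out slightly different from the slide's "Average % Lift", which is presumably averaged in some other way (for example per day).

```python
import math

def relative_lift(control, test):
    """Relative change of the test mean over the control mean."""
    return (test - control) / control

def z_score(control, control_se, test, test_se):
    """Difference of means divided by the combined standard error (independent groups)."""
    return (test - control) / math.sqrt(control_se ** 2 + test_se ** 2)

# Application Rate row: control 0.1273 (0.0005), test 0.1409 (0.0011).
lift = relative_lift(0.1273, 0.1409)
z = z_score(0.1273, 0.0005, 0.1409, 0.0011)
print(f"lift = {lift:.2%}, z = {z:.1f}")
```

Under these assumptions the difference is many standard errors away from zero, which is consistent with the slide treating the lift as real.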
  • 37. Engineering Challenges
       ● Latency
       ● API: window size from 3000 to 1000 to 500
       ● Igniter (posting ingestion) execution time
       ● Signal Quality
       ● Randomization for result consistency
  • 39. Lessons Learned (Model Development)
       ● Relevancy tuning can create feedback loops. Look ahead.
         Changes in the ranking function sometimes trigger changes in user behavior, which in turn invalidate said ranking function. Treat relevancy tuning as interactive experiments, not a curve-fitting exercise.
       ● Apply strong model assumptions to correct deficiencies in old ranking functions.
         Use sound behavioral hypotheses, drawn from data analysis and qualitative user research, to regulate model behavior. Historical data can be noisy. Let A-B tests be the final judge.
       ● Engineer the relevancy labels as well as the features.
         Implicit feedback signals are not absolute measures of relevancy and should be modeled to account for biases and behavioral assumptions.
       ● Ranking functions are only as expressive as the features you feed them.
         Any relevancy insight that can’t be encoded as a meaningful difference in the feature space will not be reflected in the search results.
  • 40. Lessons Learned (Engineering & Infrastructure)
       ● Prioritize velocity of iteration (avoid analysis paralysis)
       ● Work backwards from conclusions about system latency
  • 42. Posting and Query Semantics Features
       ● Contextual information in posting descriptions contributes many relevancy signals
       ● Back-tests of both manually crafted bag-of-words features and machine-learned representations (e.g. via SVD, word2vec) already showed a significant lift in re-ranked NDCG
       ● Some concerns about query-time performance and over-fitting with long NLP feature vectors
       High context: “... hiring individuals to work as part-time Package Handlers... involves continual lifting, lowering and sliding packages that typically weigh 25 - 35 lbs… typically do not work on holidays.... working approximately 17.5 - 20 hours per week… outstanding education assistance of up to $2,625 per semester...”
       Low context: “We have a part time opening for a delivery driver position. Must be authorized to work in the US”
  • 43. Click / Relevancy Label Modeling (Model Improvements)
       ● Build multi-stage click models to account for factors that cannot be formulated as query-time LTR features (e.g. rank position, between-session correlations).
       ● This creates a positive feedback loop that boosts potentially relevant postings with low exposure (and penalizes the reverse).
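As a much simpler stand-in for the multi-stage click models mentioned here, the sketch below uses "clicks over expected clicks" (COEC) to correct for rank position: each posting's clicks are divided by the clicks an average posting would have collected at the same ranks. The impression layout `(posting_id, rank, clicked)` and function names are hypothetical, and this is not the model described in the talk.

```python
from collections import defaultdict

def rank_ctr(impressions):
    """Baseline click-through rate per rank from (posting_id, rank, clicked) records."""
    shown, clicked = defaultdict(int), defaultdict(int)
    for _, rank, was_clicked in impressions:
        shown[rank] += 1
        clicked[rank] += int(was_clicked)
    return {rank: clicked[rank] / shown[rank] for rank in shown}

def coec(impressions):
    """Clicks over expected clicks per posting; values above 1 beat the rank baseline."""
    baseline = rank_ctr(impressions)
    clicks, expected = defaultdict(int), defaultdict(float)
    for posting_id, rank, was_clicked in impressions:
        clicks[posting_id] += int(was_clicked)
        expected[posting_id] += baseline[rank]
    return {p: clicks[p] / expected[p] for p in expected if expected[p] > 0}

# Toy log: postings a and b were shown at rank 1, c and d at rank 5.
impressions = [
    ("a", 1, True), ("a", 1, False), ("b", 1, True), ("b", 1, True),
    ("c", 5, True), ("c", 5, False), ("d", 5, False), ("d", 5, False),
]
scores = coec(impressions)
```

In the toy log, posting `c` gets the highest score even though it has fewer raw clicks than `b`, because it earned its click from a low-exposure position.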
  • 44. Personalized Matching (Model Improvements)
       ● Incorporate LTR features about matching signals between job seeker preferences/qualifications and job requirements
       ● (Potentially) an online learning module that dynamically adjusts the rankings shown to each user based on onsite behavior
       Example: “I want a part time job near my home!” (...That pays >$15 per hour. No night shifts! ...is in the retail industry, where I have 5 years of experience. Bonus points if it’s Harris Teeter…)
  • 45. Engineering Improvements (Engineering & Infrastructure)
       ● Push-button training pipeline
       ● Automated push-button deployment for re-indexing
       ● Latency and scale improvements
  • 47. References
       ● Elasticsearch: https://www.elastic.co/guide/en/elasticsearch/reference/current/index.htm
       ● ES Learning to Rank Plugin: http://elasticsearch-learning-to-rank.readthedocs.io/en/latest/
       ● Relevancy tuning: Turnbull, Doug, and John Berryman. Relevant Search with Applications for Solr and Elasticsearch. Manning, 2016.
       ● lambdaMart: C. Burges. From RankNet to LambdaRank to LambdaMART: An overview. Technical Report MSR-TR-2010-82, Microsoft Research, 2010.
       ● ranklib: https://sourceforge.net/p/lemur/wiki/RankLib
       ● xgboost: Tianqi Chen and Carlos Guestrin. 2016. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '16). ACM, New York, NY, USA, 785-794.
         K.V. Rashmi and Ran Gilad-Bachrach. Dart: Dropouts meet multiple additive regression trees, April 2015.
       ● hyperopt: J. Bergstra, R. Bardenet, Y. Bengio and B. Kégl. Algorithms for Hyper-parameter Optimization. Proc. Neural Information Processing Systems 24 (NIPS2011), 2546–2554, 2011.
       ● Interleaving: O. Chapelle, T. Joachims, F. Radlinski, Yisong Yue. Large-Scale Validation and Analysis of Interleaved Search Evaluation. ACM Transactions on Information Systems (TOIS), 30(1):6.1-6.41, 2012.
         T. Joachims. Evaluating Retrieval Performance Using Clickthrough Data. Proceedings of the SIGIR Workshop on Mathematical/Formal Methods in Information Retrieval, 2002.
       ● document & query embeddings: Mitra, Bhaskar & Craswell, Nick. (2017). Neural Models for Information Retrieval.
         Hamed Zamani and W. Bruce Croft. 2017. Relevance-based Word Embedding. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '17). ACM, New York, NY, USA, 505-514.
       ● Click model: Chuklin, A., Markov, I., & de Rijke, M. (2015). Click models for web search. Synthesis Lectures on Information Concepts Retrieval and Services, 7(3), 1–115. With PyClick: https://github.com/markovi/PyClick
         Y. Hu, Y. Koren and C. Volinsky, "Collaborative Filtering for Implicit Feedback Datasets," 2008 Eighth IEEE International Conference on Data Mining, Pisa, 2008, pp. 263-272.
  • 48. Our posting index is constantly changing