welcome
to our data science meetup
hosted by randstad group the netherlands IT and BigData Republic
a story about
data driven
randstad group.
randstad group meets
data scientists.
falco vermeer
randstad group the netherlands IT
© randstad group the netherlands IT | 3
our purpose is to
support people and
organizations in
realizing their true
potential.
randstad group
human forward.
randstad in numbers
#1 HR services provider worldwide.
key figures 2018
€ 23.8 billion in revenue
262,500 permanent placements
38,820 avg. corporate employees
4,826 offices in 39 countries
48% women in leadership positions
670,900 people we help to work every day
back in the good old days...
information has always been an important asset for Randstad.
labour market: predicting trends and scarcity on the labour market
candidates: tailored insights on job & career at every moment
customers: predict the most suitable candidate at the right time
data driven randstad
purpose - what & how
“With data & insights we help our customers & talents move forward in the world of work, and by building solutions for our own consultants we help them organise their work & make decisions more effectively.”
For decisions at the strategic & tactical level we develop dashboards, a self-service BI environment and ad-hoc analytics to help make the right decision at the right time.
For decisions at the operational level we automate the decision-making process and help the user make smarter choices. To that end we build models and integrate them in our processes.
our data driven journey.
redesigning the core of our company
2014
2018
data
technology
skills & people
organisation
meet our data & analytics team!
• Data Scientist
• (Advanced) Analyst
• Machine Learning Engineer
• Analytics Translator
• Data Engineer
• BI Engineer
our core process for 60 years.
a lot of potential for predictive data use cases
vacancy recommender
the use case
hugo valk
randstad group the Netherlands IT
the business process of randstad.
team SmartMatch.
SmartMatch has the ambition to improve the end user experience by providing actionable insights into the needs and wishes of talents as well as clients.
SmartMatch complements Randstad’s expertise using Artificial Intelligence and Machine Learning.
SmartMatch develops algorithms and exposes these via web services.
main challenges.
data quality
database utilization
personalized advice
data quality.
With our taxonomy we can, for example:
• Provide clients with suggestions on the skills, certificates and education that might be needed for a job
• Present candidates with skills they might have based on their work experience
database utilization.
We provide:
• a talent search engine based on Elasticsearch that replaced the old engine
• the talent recommender, an algorithm that finds the most promising candidates in our database for a vacancy
personalized advice for candidates.
The vacancy recommender instantly offers the best matching vacancies for any talent.
The vacancy recommender will increase the quality of applications and make the utilization of talents more efficient.
primary impact of smartmatch solutions.
“working software is the
primary measure of progress.”
— Agile Manifesto
vacancy recommender - recruiter perspective.
vacancy recommender - talent perspective.
When you are known to us, we provide you with personalized vacancy recommendations.
The same component is shown on the job board, where you search through our vacancies.
results vacancy recommender.
Results on the Tempo-Team website are looking great!
27% improvement in application rate compared to self-found vacancies
Still waiting for results from the Randstad website and the effect for our recruiters.
“All we have to decide is what
to do with the time that is
given us.”
— J.R.R. Tolkien
project lifecycle vacancy recommender.
[figure: typical adoption curve, adoption plotted against time]
project lifecycle vacancy recommender.
inception PoC MVP Scale-up Mature Phase-out
• ideas & research
• business concept
• plan
• develop model
• test model with historical data
• test model with selected users
• deploy to production
• API
• minimal engineering effort but not losing quality
• larger user groups
• more channels
• model improvements
• more use cases
• iterate on product
• maintenance
project lifecycle vacancy recommender.
inception PoC MVP Scale-up Mature Phase-out
• implementation frontend recruiter
• implementation Randstad jobboard
• idea
• implementation tempo-team jobboard
• evaluation
• 1st version algorithm
• test on google analytics data
• implementation tempo-team jobboard improved version
• evaluation
project learnings - what went well.
In the past we had trouble scaling:
+ engineering was involved too late, leading to code quality issues
+ many features in the model from the beginning; the high complexity made maintenance more problematic
This was the first time we went through this lifecycle, which helped to address these issues.
Takeaways:
→ minimum viable product must be as simple as possible but no simpler!
→ minimum viable product must be of good quality from the start!
project learnings - what could be better.
Frontends are developed and maintained in another team.
- difficult to get good measurements
- difficult to develop in sync because of different priorities
Takeaway:
→ Agree on the priorities before you start your MVP!
thank you!
any questions?
vacancy recommender
the science.
robbert van der gugten
randstad group the netherlands IT
matching from multiple perspectives.
match
recruiter: provide all open vacancies
talent: wants job of the highest possible aspiration
client: wants best possible talent available
matching by showing relevant vacancies to talent.
[diagram labels: online journey (search, find, apply), talent, match, kpi, model, touchpoint, rank, data, feedback]
vacancy recommender.
more effective and efficient process.
no time to read
no need to read
the reciprocal recommendation problem.
[diagram labels: talent, vacancy, “capability match”, “preferred vacancy predictor”]
talent vacancy
preferred vacancy predictor.
we can use interaction data.
talent-vacancy interaction matrix (labels: talent, vacancy)
1 ? ? 5 1
? 1 1 ? ?
? ? 5 1 ?
? 5 ? ? 1
1 ? 1 ? ?
matrix factorization.
preferences (talent × vacancies) ≈ talent factors × vacancy factors = predicted preferences
one entry of the product is the predicted preference of talent t for vacancy v
import pandas as pd
from scipy.sparse import csr_matrix
from sklearn.decomposition import NMF

# df is assumed to hold one row per interaction with the columns
# talent_id, vacancy_id and action
talent_ids = df['talent_id'].unique()
vacancies = df['vacancy_id'].unique()
data = df['action'].tolist()
# map ids to row/column positions; current pandas no longer accepts
# astype('category', categories=...), so pd.Categorical is used instead
row = pd.Categorical(df['talent_id'], categories=talent_ids).codes
col = pd.Categorical(df['vacancy_id'], categories=vacancies).codes
# the sparse talent-vacancy matrix has to exist before the model is fitted
sparse_matrix = csr_matrix((data, (row, col)),
                           shape=(len(talent_ids), len(vacancies)))
factorizer = NMF(n_components=20, random_state=42)
model = factorizer.fit(sparse_matrix)
talent_factors = model.transform(sparse_matrix)
vacancy_factors = model.components_
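As a rough sketch of how the two factor matrices might be used: the predicted preference of talent t for vacancy v is the dot product of row t of the talent factors with column v of the vacancy factors. The toy arrays below are invented and only mimic the shapes that NMF returns; `predicted_preferences` and `rank_vacancies` are hypothetical helper names that do not appear on the slides.

```python
import numpy as np

# Invented toy factors with the same layout NMF produces:
# talent_factors is (n_talents, n_components),
# vacancy_factors is (n_components, n_vacancies).
talent_factors = np.array([[1.0, 0.0],
                           [0.0, 2.0]])
vacancy_factors = np.array([[0.9, 0.1, 0.5],
                            [0.2, 0.8, 0.4]])


def predicted_preferences(talent_factors, vacancy_factors, talent_index):
    """Predicted preference of one talent for every vacancy (hypothetical helper)."""
    return talent_factors[talent_index] @ vacancy_factors


def rank_vacancies(talent_factors, vacancy_factors, talent_index):
    """Vacancy column indices ordered from highest to lowest predicted preference."""
    scores = predicted_preferences(talent_factors, vacancy_factors, talent_index)
    return np.argsort(-scores).tolist()


print(rank_vacancies(talent_factors, vacancy_factors, 0))
print(rank_vacancies(talent_factors, vacancy_factors, 1))
```

The returned indices refer to columns of the interaction matrix, so in practice they would still have to be mapped back to vacancy ids.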
problem: vacancies are timely.
[timeline: the set of open vacancies changes over time: {v1, v2, v3}, then {v1, v2, v3, v4}, then {v2, v4}]
available data.
profile: preferred job title, location, preferred salary, hours per week, night shift, ...
online interactions: job detail page, job applications, soon: applied filters, soon: search queries
modelling.
matrix factorization “item” additional features
scoring.
[diagram: vectors x = [x1, x2, x3], v = [v1, v2, v3] and t = [t1, t2, t3], the terms x4 and pred(t, job title), and the operators “=” and “-”]
offline evaluation set up.
calculate the average rank of historical matches
1. use the model to score vacancies for different talents
2. rank all vacancy scores for each talent
3. calculate average rank for the historical matches
example ranks of historical matches: 1, 0.993, 0.945
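The evaluation steps above could be sketched as follows. The slides do not define how the rank is normalized; this sketch assumes a convention in which 1.0 means the historical match is ranked first and 0.0 means it is ranked last, so that random scores average about 0.5. The function names and the toy scores are invented for illustration.

```python
import numpy as np


def normalized_rank(scores, match_index):
    """Share of the other vacancies that score strictly lower than the historical match.

    1.0 means the match is ranked first, 0.0 means it is ranked last (assumed convention).
    """
    scores = np.asarray(scores, dtype=float)
    lower = np.sum(scores < scores[match_index])
    return float(lower) / (len(scores) - 1)


def average_rank(score_rows, match_indices):
    """Mean normalized rank of the historical matches over all talents."""
    ranks = [normalized_rank(row, idx) for row, idx in zip(score_rows, match_indices)]
    return float(np.mean(ranks))


# Invented scores: one row per talent, one column per vacancy.
score_rows = [[0.9, 0.1, 0.5],
              [0.2, 0.8, 0.4]]
match_indices = [0, 2]  # column of the vacancy each talent was historically matched to
print(average_rank(score_rows, match_indices))
```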
offline evaluation results.
average rank: 0.83 | 0.50
top 10 percentage: 30.2 | 1.6
matching by showing relevant vacancies to talent.
[diagram labels: metric, talent, match, kpi, search, find, apply, touchpoint, rank, data, model, feedback]
live results: ideal experiment setup.
compare application rate between three groups:
• recommendations
• no recommendations
• random recommendations
live results: used setup.
compare application rate for logged in users:
• recommendations
• no recommendations
live results.
application rate improvement: 9%
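A minimal sketch of how such an improvement figure might be computed from two groups. The slides do not say how the application rate is defined or whether the improvement is relative or absolute; this example assumes applications per vacancy view and a relative improvement, and all counts are invented.

```python
def application_rate(applications, views):
    """Applications per vacancy view (hypothetical definition of the rate)."""
    return applications / views


def relative_improvement(treatment_rate, control_rate):
    """Relative change of the treatment group's rate over the control group's rate."""
    return (treatment_rate - control_rate) / control_rate


# Invented counts for logged-in users with and without recommendations.
with_recs = application_rate(applications=218, views=4000)
without_recs = application_rate(applications=200, views=4000)
print(f"{relative_improvement(with_recs, without_recs):.0%}")
```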
talent vacancy
capability match.
same goal, different perspective, different data.
profile: education, skills, driving licences, languages, ...
work experience: job title, sector, company, ...
beyond matrix factorization.
?1 ? ? 1 1
? 1 1 ? ?
? ? 1 1 ?
? 1 ? ? 1
1 ? 1 ? ?
StarSpace model.
# assuming the standalone keras package; tensorflow.keras exposes the same names
from keras.layers import Concatenate, Dot
from keras.models import Model

# talent_inputs / talent_embedding and vacancy_inputs / vacancy_embedding are the
# input and output tensors of the two encoder networks, which the slides do not show
talent_model = Model(inputs=talent_inputs,
                     outputs=talent_embedding)
vacancy_model = Model(inputs=vacancy_inputs,
                      outputs=vacancy_embedding)

# embed the talent, a positive vacancy and a randomly sampled vacancy
talent_embedding = talent_model(talent_input_features)
pos_vacancy_embedding = vacancy_model(vacancy_pos_input_features)
random_vacancy_embedding = vacancy_model(vacancy_random_input_features)

# normalize=True turns the dot product into a cosine similarity
pos_similarity = Dot(axes=1, normalize=True)(
    [talent_embedding, pos_vacancy_embedding])
random_similarity = Dot(axes=1, normalize=True)(
    [talent_embedding, random_vacancy_embedding])
similarities = Concatenate()([pos_similarity, random_similarity])

# the combined model outputs both similarities side by side
model = Model(
    inputs=[talent_input_features, vacancy_pos_input_features,
            vacancy_random_input_features],
    outputs=similarities)
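The slides do not show which loss is applied to the concatenated similarities. The StarSpace method is usually described with a margin ranking (hinge) loss over a positive and a sampled negative example, so the sketch below assumes something similar. It is written in plain NumPy for readability and is not the team's training code; the margin value is arbitrary.

```python
import numpy as np


def margin_ranking_loss(similarities, margin=0.2):
    """Hinge loss on [positive, random] similarity pairs, averaged over the batch.

    similarities has shape (batch, 2): column 0 is the similarity to the positive
    vacancy, column 1 the similarity to the randomly sampled vacancy.
    The margin value is an arbitrary illustration, not taken from the slides.
    """
    similarities = np.asarray(similarities, dtype=float)
    pos, rand = similarities[:, 0], similarities[:, 1]
    return float(np.mean(np.maximum(0.0, margin - pos + rand)))


batch = [[0.9, 0.1],   # positive clearly ahead of the random vacancy: no loss
         [0.3, 0.4]]   # random vacancy ahead of the positive: penalised
print(margin_ranking_loss(batch))
```

Minimizing a loss of this shape pushes the positive vacancy at least `margin` above the random one in similarity.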
scoring.
[diagram labels: talent embedding, realtime, similarity, cache, all vacancy embeddings]
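One way to read this scoring slide is that the vacancy embeddings are kept in a cache while the talent embedding is computed at request time and compared against all of them. The sketch below follows that reading, which is an assumption; the ids, the embeddings and the helper names are invented.

```python
import numpy as np


def normalize_rows(matrix):
    """Scale every row to unit length so a dot product equals cosine similarity."""
    matrix = np.asarray(matrix, dtype=float)
    return matrix / np.linalg.norm(matrix, axis=1, keepdims=True)


# Hypothetical cache: embeddings of all vacancies, normalized once up front.
vacancy_ids = ["v1", "v2", "v3"]
vacancy_cache = normalize_rows([[1.0, 0.0],
                                [0.0, 1.0],
                                [1.0, 1.0]])


def top_vacancies(talent_embedding, k=2):
    """Rank cached vacancies by cosine similarity to a freshly computed talent embedding."""
    talent = normalize_rows([talent_embedding])[0]
    similarities = vacancy_cache @ talent
    best = np.argsort(-similarities)[:k]
    return [vacancy_ids[i] for i in best]


print(top_vacancies([2.0, 1.0]))
```

With the vacancy side precomputed, each request costs one embedding pass for the talent plus a single matrix-vector product.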
offline evaluation results.
average rank: 0.78 | 0.50
top 10 percentage: 33.7 | 1.6
bringing it together.
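The slides do not say how the two model scores are combined. Purely as an illustration, one common choice in reciprocal recommendation is the harmonic mean, which is only high when both sides of the match score well. The function name, the score scaling and the numbers below are assumptions.

```python
def reciprocal_score(preference_score, capability_score):
    """Harmonic mean of the two scores; low if either side of the match is weak.

    Both inputs are assumed to be scaled to the range [0, 1].
    """
    if preference_score <= 0 or capability_score <= 0:
        return 0.0
    return 2 * preference_score * capability_score / (preference_score + capability_score)


# Invented scores for three vacancies and one talent: (preference, capability).
candidates = {"v1": (0.9, 0.2), "v2": (0.6, 0.6), "v3": (0.1, 0.9)}
ranking = sorted(candidates, key=lambda v: reciprocal_score(*candidates[v]), reverse=True)
print(ranking)
```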
live results.
application rate improvement
preferred vacancy predictor: 9%
preferred vacancy predictor + capability match: 27%
key takeaways.
1. matrix factorization provides a good baseline method to recommend vacancies to talents
2. use the StarSpace algorithm to generate comparable embeddings of different entities
3. the reciprocal recommendation problem can be solved by combining approaches
one StarSpace to rule them all.
thank you!
questions?
break
until 20:00 hours
vacancy recommender
engineering the solution
hugo valk
randstad group the netherlands IT
“This is a story of how a
[team] had an adventure, and
found [themselves] doing and
saying things altogether
unexpected.”
— J.R.R. Tolkien
so, how do we bring this to the user?
the steps to put the model in production.
• providing computing power for ML jobs
• scheduling jobs
• ingesting data and training
• serving model scores via an API
• integration & final API
• moving to ‘near real time’
setting the stage
the context in 2016.
november 2016.
AWS was new territory for Randstad. Most of the IT was hosted on-premise.
Just a data lake. No standards and no out-of-the-box ML services. SageMaker came one year later.
Some team members had experience with earlier AWS efforts.
[diagram labels: Data center, AWS, Data Lake (Redshift), backoffice, frontoffice, service, websites, Sales application, ...]
the solution
what we put in production.
“Little by little,
one travels far.”
— J.R.R. Tolkien
computing power with AWS ECS.
We use Docker on AWS Elastic Container Service (ECS).
• ECS cluster with 1 big EC2 instance assigned to it
• We run ECS services with a task definition
• An ECS task runs a number of Docker containers
[diagram: ECS Cluster > EC2 Instance > ECS Services > ECS Tasks]
we will summarize this.
Since we have several Docker containers running, we will omit the Service/Task structure and the actual EC2 instances for clarity.
[diagram: the ECS Cluster with its EC2 Instance, ECS Services and ECS Tasks is drawn as a single ECS Cluster box]
scheduling with Apache Airflow.
Apache Airflow is commonly used for scheduling tasks.
• Distributed workers
• Execute a directed acyclic graph (DAG) of tasks
• Web interface to view and control the DAGs
Deployed as Docker containers.
[diagram components: SmartMatch Compute Cluster with Airflow core, Airflow webserver, Airflow workers and DAG syncer; ELB; PostgreSQL on RDS; Redis on ElastiCache; DAGs in S3]
we summarize again.
The training DAGs are run on the Airflow worker. We omit the compute cluster, the scheduler, the UI, the Airflow storage components, etc.
In the next slides we will show a DAG running in one of the Airflow workers like this:
[diagram: the SmartMatch Compute Cluster with its Airflow components is drawn as a single “Training job” box]
ingesting data and training.
[diagram components: Data Lake (Redshift), Google Analytics, Shared features ingestion, Shared feature store, Preferred Vacancy Predictor trainer, Capability Match trainer, Preferred Vacancy Predictor model in S3, Capability Match model in S3]
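To illustrate what running these jobs as a DAG means, the sketch below orders the ingestion and training tasks with the Python standard library. It is not Airflow code and does not use the Airflow API. The task names follow the diagram, while the dependencies (both trainers wait for the shared features ingestion) are an assumed reading of it.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each task maps to the set of tasks that must finish before it can start.
# Task names follow the diagram; the dependencies are an assumed reading of it.
training_dag = {
    "shared_features_ingestion": set(),
    "preferred_vacancy_predictor_trainer": {"shared_features_ingestion"},
    "capability_match_trainer": {"shared_features_ingestion"},
}


def execution_order(dag):
    """Return one valid order in which a scheduler could run the tasks."""
    return list(TopologicalSorter(dag).static_order())


order = execution_order(training_dag)
print(order)
```

A scheduler such as Airflow adds retries, scheduling intervals and distributed workers on top of this ordering; the two trainers have no dependency on each other, so they could run in parallel.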
serving model predictions with Flask.
• A separate ECS cluster for serving production APIs
• Internal model APIs with Flask
• These serve the scores for the individual model components that make up the total prediction
[diagram components: SmartMatch Production Cluster with the Preferred Vacancy Predictor scoring engine and the Capability Match scoring engine; two ELBs]
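A minimal sketch of what an internal model API with Flask could look like. The route, the query parameter, the response layout and the hard-coded scores are all invented for illustration; a real scoring engine would load a trained model rather than a dictionary.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Invented stand-in for a trained model: precomputed scores per talent.
FAKE_SCORES = {
    "t1": {"v1": 0.42, "v2": 0.87},
}


@app.route("/score")
def score():
    """Return vacancy scores for one talent, best first (hypothetical endpoint)."""
    talent_id = request.args.get("talent_id", "")
    scores = FAKE_SCORES.get(talent_id)
    if scores is None:
        return jsonify({"error": "unknown talent"}), 404
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return jsonify({
        "talent_id": talent_id,
        "scores": [{"vacancy_id": v, "score": s} for v, s in ranked],
    })
```

The endpoint can be exercised without starting a server through `app.test_client()`, for example `app.test_client().get("/score?talent_id=t1")`.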
yet another summary.
We will omit the ELB and the fact that the scoring engines run as Docker containers.
[diagram: the two scoring engines in the SmartMatch Production Cluster are drawn as a “Preferred Vacancy Predictor scorer” box and a “Capability Match scorer” box]
serving the model scores.
[diagram components: Data Lake (Redshift), Google Analytics, Shared features ingestion, Shared feature store, Preferred Vacancy Predictor trainer, Capability Match trainer, Preferred Vacancy Predictor model & data in S3, Capability Match model & data in S3, Preferred Vacancy Predictor current data ingest, Capability Match current data ingest, Preferred Vacancy Predictor scorer, Capability Match scorer, Daily restart lambda]
integration and final API.
[diagram components: Preferred Vacancy Predictor scorer, Capability Match scorer, Vacancy Recommender API, Frontend system, Firehose, register API definition, API manager]
we only refresh data in batch.
[diagram components: Data Lake (Redshift), Google Analytics, Shared features ingestion, Shared feature store, Preferred Vacancy Predictor trainer, Capability Match trainer, Preferred Vacancy Predictor model & data in S3, Capability Match model & data in S3, Preferred Vacancy Predictor current data ingest, Capability Match current data ingest, Preferred Vacancy Predictor scorer, Capability Match scorer]
‘near real time’ updates on vacancies.
Lambda reads vacancy and performs preprocessing
[diagram components: Vacancy Management System, Business Event Service, SQS, API manager, PostgreSQL persistent Vacancy cache, Preferred Vacancy Predictor scorer, Capability Match scorer, Preferred Vacancy Predictor model & data in S3, Capability Match model & data in S3]
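A rough sketch of a Lambda function that reads vacancy events and preprocesses them, assuming the function is triggered by the SQS queue shown in the diagram. The `Records` and `body` keys follow the standard SQS event layout for Lambda; the message fields (`vacancy_id`, `job_title`) and the preprocessing itself are hypothetical.

```python
import json


def preprocess_vacancy(vacancy):
    """Illustrative preprocessing: trim and lowercase the title, keep the id."""
    return {
        "vacancy_id": vacancy["vacancy_id"],
        "job_title": vacancy.get("job_title", "").strip().lower(),
    }


def handler(event, context):
    """Lambda entry point for an SQS trigger: one vacancy event per record body."""
    vacancies = []
    for record in event.get("Records", []):
        vacancy = json.loads(record["body"])
        vacancies.append(preprocess_vacancy(vacancy))
    # Writing the result to the persistent vacancy cache is left out of this sketch.
    return vacancies


sample_event = {"Records": [{"body": json.dumps(
    {"vacancy_id": "v42", "job_title": "  Forklift Driver "})}]}
print(handler(sample_event, None))
```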
time to recap
WHAT have
we just done?
[diagram components: Data Lake (Redshift), Google Analytics, Shared features ingestion, Shared feature store, Preferred Vacancy Predictor trainer, Capability Match trainer, Preferred Vacancy Predictor model & data in S3, Capability Match model & data in S3, Preferred Vacancy Predictor current data ingest, Capability Match current data ingest, Preferred Vacancy Predictor scorer, Capability Match scorer, Frontend system, Vacancy Management System, Business Event Service]
having done all this
we learned a lot.
we now know many AWS components.
we have great experience with fully scripted environments.
we have seen the different components of an ML solution.
• data ingestion
• training models
• expose scoring results with API
• handling live events
[diagram components: Data Lake (Redshift), Google Analytics, Shared features ingestion, Shared feature store, Preferred Vacancy Predictor trainer, Capability Match trainer, Preferred Vacancy Predictor model & data in S3, Capability Match model & data in S3, Preferred Vacancy Predictor current data ingest, Capability Match current data ingest, Frontend system, Vacancy Management System, Business Event Service]
we have been through the journey of a data scientist.
• access to the data
• exploration of the data
• modeling
• versioning & source control
• scheduling model training
• scoring in production
“real artists ship!”
rinse & repeat all this for,
let’s say,
another 50 algorithms...
“Don’t adventures ever have
an end? I suppose not.
Someone else always has to
carry on the story.”
— J.R.R. Tolkien
moving forward
the DataHub.
problems the data scientist encounters.
steps: access to the data, exploration of the data, modeling, versioning, build, deploy, scheduling model training, scoring in production
problems:
• laptop or shared RStudio Server
• no standards
• lack of knowledge
• no shared self-service scheduling infrastructure
• lack of skills building production APIs
self-service DataHub.
[diagram components: data lake, data subscriptions, project space, SageMaker notebooks, DAG, ECS task, SageMaker training job, model endpoint]
• Airflow is used to schedule everything
• DataHub provides roles, policies, naming, tagging, etc.
• SageMaker notebook instances contain JupyterLab for research
• Standardizing data unloads/subscriptions will later allow for notifications and governance
project template with cookiecutter.
• provides default project layout
• folder for notebooks - store your research
• DAG - write the DAG to run your scheduled jobs
• operator for data ingest
• operators for AWS SageMaker
• operator for ECS tasks
• Docker files - for SageMaker models or custom tasks
• standard Airflow operators to perform tasks on the DataHub
• Jenkins webhook provides standard deploy scripts
data subscriptions.
Get a periodic data extract from the DataHub in your project space.
• no involvement of data engineering needed
• central management of all data subscriptions
self-service JupyterLab SageMaker instances.
Get a JupyterLab instance with access to the data lake and your own project space.
• no involvement of data engineering needed
• comes with Anaconda installed
• comes with working GPU support
revisiting the data scientist journey.
steps: access to the data, exploration of the data, modeling, versioning, build, deploy, scheduling model training, scoring in production
• personal computing power with SageMaker notebook instances
• standard build & deploy pipeline maintained by data engineering
• Airflow cluster with fixed user management
• project template with all the necessary components
standard way of working supported by data engineering
“A single dream
is more powerful
than a thousand realities.”
— J.R.R. Tolkien
thank you!
any questions?
randstad group the netherlands IT

The right model for the job: vacancy recommendations at Randstad

Kenneth Claeyssens - Randstad
The right model for the job: vacancy recommendations at Randstad

  • 1. welcome to our data science meetup, hosted by randstad group the netherlands IT and BigData Republic
  • 2. a story about data driven randstad group. randstad group meets data scientists. falco vermeer randstad group the netherlands IT
  • 3. © randstad group the netherlands IT | 3 our purpose is to support people and organizations in realizing their true potential. randstad group human forward.
  • 4. randstad in numbers. #1 HR services provider worldwide. key figures 2018: € 23.8 billion in revenue; 262,500 permanent placements; 38,820 avg. corporate employees; 4,826 offices in 39 countries; 48% women in leadership positions; 670,900 people we help to work every day.
  • 5. || 5 back in the good old days... information has always been an important asset for Randstad.
  • 6. | 6 labour market predicting trends and scarcity on the labour market candidates tailored insights on job & career at every moment customers predict the most suitable candidate at the right time
  • 7. data driven randstad: purpose - what & how. "With data & insights we help our customers & talents move forward in the world of work, and by building solutions for our own consultants we help them organise their work & make decisions in a more effective way." For decisions at strategic & tactical level we develop dashboards, a self-service BI environment and ad-hoc analytics to help make the right decision at the right time. For decisions at operational level we automate the decision-making process and help the user make smarter choices. To that end we build models and integrate them in our processes.
  • 8. our data driven journey. redesigning the core of our company, 2014 to 2018: data, technology, skills & people, organisation
  • 9. meet our data & analytics team! Data Scientist, (Advanced) Analyst, Machine Learning Engineer, Analytics Translator, Data Engineer, BI Engineer
  • 10. our core process for 60 years. a lot of potential for predictive data use cases
  • 11. vacancy recommender the use case hugo valk randstad group the Netherlands IT
  • 12. the business process of randstad. © randstad group the netherlands IT | 12
  • 13. team SmartMatch. SmartMatch has the ambition to improve the end user experience by providing actionable insights in the needs and wishes of talents as well as clients. SmartMatch complements Randstad’s expertise using Artificial Intelligence and Machine Learning. SmartMatch develops algorithms and exposes these via web services. © randstad group the netherlands IT | 13
  • 14. main challenges. data quality, database utilization, personalized advice
  • 15. data quality. With our taxonomy we can, for example: provide clients with suggestions on the skills, certificates and education needed for a job; present candidates with skills they might have based on their work experience.
  • 16. database utilization. We provide: a talent search engine based on ElasticSearch that replaced the old engine. the talent recommender which is an algorithm to find the most promising candidates for a vacancy in our database. © randstad group the netherlands IT | 16
  • 17. personalized advice for candidates. The vacancy recommender instantly offers the best matching vacancies for any talent. The vacancy recommender will increase the quality of applications and make the utilization of talents more efficient.
  • 18. primary impact of smartmatch solutions. © randstad group the netherlands IT | 18
  • 19. “working software is the primary measure of progress.” — Agile Manifesto
  • 20. vacancy recommender - recruiter perspective. © randstad group the netherlands IT | 20
  • 21. vacancy recommender - recruiter perspective. © randstad group the netherlands IT | 21
  • 22. vacancy recommender - talent perspective. When you are known to us, we provide you with personalized vacancy recommendations. The same component is shown on the job board, where you search through our vacancies.
  • 23. results vacancy recommender. Results on the Tempo-Team website are looking great! 27% improvement in application rate compared to self-found vacancies. Still waiting for results from the Randstad website and the effect for our recruiters.
  • 24. “All we have to decide is what to do with the time that is given us.” — J.R.R. Tolkien
  • 25. project lifecycle vacancy recommender. typical adoption curve [plot: adoption over time]
  • 26. project lifecycle vacancy recommender. inception: ideas & research, business concept, plan. PoC: develop model, test model with historical data. MVP: test model with selected users, deploy to production API, minimal engineering effort but not losing quality. Scale-up: larger user groups, more channels, model improvements. Mature: more use cases, iterate on product. Phase-out: maintenance.
  • 27. implementation frontend recruiter project lifecycle vacancy recommender. inception PoC implementation Randstad jobboard MVP Scale-up Mature Phase-out idea implementation tempo-team jobboard evaluation 1st version algorithm test on google analytics data implementation tempo-team jobboard improved version evaluation © randstad group the netherlands IT | 27
  • 28. project learnings - what went well. In the past we had problems scaling: engineering got involved too late, leading to code quality issues; the model had many features from the beginning, and the high complexity made maintenance more problematic. This was the first time we went through this lifecycle, which helped to address these issues. Takeaways: → a minimum viable product must be as simple as possible, but no simpler! → a minimum viable product must be of good quality from the start!
  • 29. project learnings - what could be better. frontends are developed and maintained in another team. - difficult to get good measurements - difficulties to develop in sync because of different priorities Takeaway: → Agree on the priorities before you start your MVP! © randstad group the netherlands IT | 29
  • 30. thank you! any questions? © randstad group the netherlands IT | 30
  • 31. | vacancy recommender the science. robbert van der gugten randstad group the netherlands IT
  • 32. matching from multiple perspectives. recruiter: provide all open vacancies. talent: wants job of the highest possible aspiration. client: wants best possible talent available.
  • 33. | online journey matching by showing relevant vacancies to talent. 33 talent match kpi search find apply model touchpoint rank © randstad group the netherlands IT data feedback
  • 34. | 34 vacancy recommender. more effective and efficient process. no time to read no need to read vacancy recommender © randstad group the netherlands IT
  • 35. | the reciprocal recommendation problem. 35 “capability match” “preferred vacancy predictor” talent vacancy © randstad group the netherlands IT
  • 36. | talent vacancy preferred vacancy predictor. © randstad group the netherlands IT
  • 37. we can use interaction data. talent-vacancy interaction matrix (rows: talent, columns: vacancy; ? = no interaction observed):
      1 ? ? 5 1
      ? 1 1 ? ?
      ? ? 5 1 ?
      ? 5 ? ? 1
      1 ? 1 ? ?
  • 38. matrix factorization. preferences (talent × vacancies) ≈ talent factors × vacancy factors = predicted preferences, where each entry is the predicted preference of talent t for vacancy v.
  • 39.
      import pandas as pd
      from scipy.sparse import csr_matrix
      from sklearn.decomposition import NMF
      talent_ids = df['talent_id'].unique()
      vacancies = df['vacancy_id'].unique()
      data = df['action'].tolist()
      # map each id to its row / column index in the interaction matrix
      row = pd.Categorical(df['talent_id'], categories=talent_ids).codes
      col = pd.Categorical(df['vacancy_id'], categories=vacancies).codes
      # the sparse matrix has to exist before the factorizer is fitted on it
      sparse_matrix = csr_matrix((data, (row, col)), shape=(len(talent_ids), len(vacancies)))
      factorizer = NMF(n_components=20, random_state=42)
      model = factorizer.fit(sparse_matrix)
      talent_factors = model.transform(sparse_matrix)
      vacancy_factors = model.components_
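A minimal sketch of how the two factor matrices from this slide could be used to rank vacancies for one talent, assuming (as on the matrix factorization slide) that the predicted preference is the product of talent factors and vacancy factors. The toy factor values and the helper `top_vacancies` are illustrative assumptions, not the team's production code.

```python
import numpy as np

# Hypothetical factors: 3 talents x 2 latent factors, 2 latent factors x 4 vacancies.
talent_factors = np.array([[1.0, 0.0],
                           [0.0, 1.0],
                           [0.5, 0.5]])
vacancy_factors = np.array([[0.9, 0.1, 0.0, 0.4],
                            [0.0, 0.8, 0.7, 0.4]])


def top_vacancies(talent_idx, k=2):
    """Return the column indices of the k vacancies with the highest predicted preference."""
    scores = talent_factors[talent_idx] @ vacancy_factors
    # argsort is ascending, so reverse it to get best-first order
    return np.argsort(scores)[::-1][:k].tolist()


print(top_vacancies(0))
print(top_vacancies(1))
```

In practice the returned indices would be mapped back to vacancy ids through the same id-to-column mapping that built the interaction matrix.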
  • 40. problem: vacancies are timely. The set of open vacancies changes over time: {v1, v2, v3} → {v1, v2, v3, v4} → {v2, v4}
  • 41. available data. profile: preferred job title, location, preferred salary, hours per week, night shift, ... online interactions: job detail page, job applications, soon: applied filters, soon: search queries
  • 42. | modelling. 42 matrix factorization “item” additional features © randstad group the netherlands IT
  • 43. scoring. [diagram labels: talent factors t = [t1, t2, t3]; vacancy factors v = [v1, v2, v3]; pred(t, job title); features x = [x1, x2, x3], x4]
  • 44. offline evaluation set up. calculate the average rank of historical matches: (1) use the model to score vacancies for different talents; (2) rank all vacancy scores for each talent; (3) calculate average rank for the historical matches (example ranks shown: 1, 0.993, 0.945).
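The three evaluation steps on this slide can be sketched as follows. This is a sketch under assumptions: the rank is scaled so that 1.0 means the historical match was ranked first and 0.0 means it was ranked last, which is only inferred from the example values on the slide; the function names are hypothetical.

```python
import numpy as np


def percentile_rank(scores, matched_idx):
    """Scaled rank of the matched vacancy: 1.0 = ranked first, 0.0 = ranked last."""
    scores = np.asarray(scores, dtype=float)
    # count how many vacancies the model scored strictly higher than the match
    n_better = int(np.sum(scores > scores[matched_idx]))
    return 1.0 - n_better / (len(scores) - 1)


def average_rank(score_rows, matched):
    """Average the scaled rank over all (talent, historical match) pairs."""
    return float(np.mean([percentile_rank(s, m) for s, m in zip(score_rows, matched)]))


# Two talents, three vacancies each; the second argument holds the index of
# the vacancy each talent was historically matched to (toy values).
print(average_rank([[0.1, 0.9, 0.5], [0.1, 0.9, 0.5]], [1, 2]))
```

With this scaling a model that scores at random would average around 0.5, which is a convenient reference point when reading the result.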
  • 45. | offline evaluation results. 45 top 10 percentage average rank 0.83 0.50 30.2 1.6 average rank top 10% © randstad group the netherlands IT
  • 46. | metric matching by showing relevant vacancies to talent. 46 talent match kpi search find apply touchpoint rank datamodel © randstad group the netherlands IT feedback
  • 47. live results: ideal experiment setup. compare application rate between three groups: recommendations, no recommendations, random recommendations
  • 48. live results: used setup. compare application rate for logged in users between two groups: recommendations, no recommendations
  • 49. | live results. application rate improvement: 9% 49© randstad group the netherlands IT
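A small sketch of the comparison behind a figure like this, assuming the improvement is reported as the relative difference in application rate between the group with recommendations and the group without (the slides do not state the exact definition). All counts below are invented for illustration and are not Randstad data.

```python
def application_rate(applications, visitors):
    """Share of visitors who applied."""
    return applications / visitors


def relative_improvement(treatment_rate, control_rate):
    """Relative uplift of the treatment group over the control group."""
    return (treatment_rate - control_rate) / control_rate


# Hypothetical counts: 1000 visitors per group.
control = application_rate(50, 1000)      # no recommendations
treatment = application_rate(60, 1000)    # recommendations
print(f"{relative_improvement(treatment, control):.0%}")
```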
  • 50. | talent vacancy capability match. © randstad group the netherlands IT 50
  • 51. same goal, different perspective, different data. profile: work experience, education, skills, driving licences, languages, ... vacancy data: job title, sector, company, ...
  • 52. beyond matrix factorization. [the interaction matrix in binary form, overlaid with a question mark]
      1 ? ? 1 1
      ? 1 1 ? ?
      ? ? 1 1 ?
      ? 1 ? ? 1
      1 ? 1 ? ?
  • 53. | StarSpace model. 53© randstad group the netherlands IT
  • 54. | 54 + ? ... ... ... © randstad group the netherlands IT      StarSpace model.
  • 55.
      talent_model = Model(inputs=talent_inputs, outputs=talent_embedding)
      vacancy_model = Model(inputs=vacancy_inputs, outputs=vacancy_embedding)
      talent_embedding = talent_model(talent_input_features)
      pos_vacancy_embedding = vacancy_model(vacancy_pos_input_features)
      random_vacancy_embedding = vacancy_model(vacancy_random_input_features)
      pos_similarity = Dot(axes=1, normalize=True)([talent_embedding, pos_vacancy_embedding])
      random_similarity = Dot(axes=1, normalize=True)([talent_embedding, random_vacancy_embedding])
      similarities = Concatenate()([pos_similarity, random_similarity])
  • 56.
      similarities = Concatenate()([pos_similarity, random_similarity])
      model = Model(
          inputs=[talent_input_features, vacancy_pos_input_features, vacancy_random_input_features],
          outputs=similarities)
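The slides stop at the concatenated similarities and do not show the training loss. A minimal NumPy sketch of one plausible choice, a margin ranking (hinge) loss that pushes the similarity to the positive vacancy above the similarity to the random vacancy, is given below. The loss choice, the margin value of 0.2 and the function names are assumptions, not taken from the talk.

```python
import numpy as np


def cosine_similarity(a, b):
    """Row-wise cosine similarity, matching Dot(axes=1, normalize=True)."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return np.sum(a * b, axis=1)


def margin_ranking_loss(talent_emb, pos_vac_emb, rand_vac_emb, margin=0.2):
    """Zero when the positive vacancy beats the random one by at least `margin`."""
    pos = cosine_similarity(talent_emb, pos_vac_emb)
    neg = cosine_similarity(talent_emb, rand_vac_emb)
    return float(np.mean(np.maximum(0.0, margin - pos + neg)))


talent = np.array([[1.0, 0.0]])
good = np.array([[1.0, 0.0]])   # same direction as the talent embedding
bad = np.array([[0.0, 1.0]])    # orthogonal to the talent embedding
print(margin_ranking_loss(talent, good, bad))
print(margin_ranking_loss(talent, bad, good))
```

In a Keras setup like the one on the slides, the same quantity could be computed from the two columns of `similarities` inside a custom loss function.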
  • 57. scoring. talent embedding; cache of all vacancy embeddings; realtime similarity
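A sketch of this scoring step, assuming the vacancy embeddings are precomputed and cached while the talent embedding is compared against them at request time. The cache contents, the ids and the `recommend` helper are hypothetical.

```python
import numpy as np


def normalize(m):
    """Scale vectors to unit length so a dot product equals cosine similarity."""
    return m / np.linalg.norm(m, axis=-1, keepdims=True)


# Hypothetical cache: vacancy ids with unit-length embeddings computed offline.
vacancy_ids = ["v1", "v2", "v3"]
vacancy_cache = normalize(np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]))


def recommend(talent_embedding, k=2):
    """Return the k most similar cached vacancies as (id, similarity) pairs."""
    sims = vacancy_cache @ normalize(np.asarray(talent_embedding, dtype=float))
    order = np.argsort(sims)[::-1][:k]
    return [(vacancy_ids[i], float(sims[i])) for i in order]


print(recommend([2.0, 0.1]))
```

Because the cached matrix is already normalised, each request costs one matrix-vector product plus a sort.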
  • 58. | offline evaluation results. 58 top 10 percentage average rank 0.78 0.50 33.7 1.6 average rank top 10% © randstad group the netherlands IT
  • 59. | bringing it together. 59© randstad group the netherlands IT
  • 60. live results. application rate improvement: preferred vacancy predictor: 9%; preferred vacancy predictor + capability match: 27%
  • 61. key takeaways. (1) matrix factorization provides a good baseline method to recommend vacancies to talents; (2) use the StarSpace algorithm to generate comparable embeddings of different entities; (3) the reciprocal recommendation problem can be solved by combining approaches
  • 62. | one StarSpace to rule them all. 62© randstad group the netherlands IT + ? ... ... ...
  • 63. | thank you! questions? 63© randstad group the netherlands IT
  • 65. vacancy recommender engineering the solution hugo valk randstad group the netherlands IT
  • 66. “This is a story of how a [team] had an adventure, and found [themselves] doing and saying things altogether unexpected..” — J.R.R. Tolkien
  • 67. so, how do we bring this to the user? © randstad group the netherlands IT | 67
  • 68. the steps to put the model in production. • providing computing power for ML jobs • scheduling jobs • ingesting data and training • serving model scores via an API • integration & final API • moving to ‘near real time’
  • 69. setting the stage: the context in 2016.
  • 70. november 2016. [diagram labels: Data center, AWS, Data Lake (RedShift), backoffice, service, websites, frontoffice, Sales application, ...] AWS was new territory for Randstad. Most of the IT was hosted on-premise. Just a data lake. No standards and no out-of-the-box ML services; SageMaker came one year later. Some team members had experience with earlier AWS efforts.
  • 71. the solution what we put in production. © randstad group the netherlands IT | 71
  • 72. “Little by little, one travels far.” — J.R.R. Tolkien
  • 73. the steps to put the model in production. • providing computing power for ML jobs • scheduling jobs • ingesting data and training • serving model scores via an API • integration & final API • moving to ‘near real time’
  • 74. computing power with AWS ECS. We use Docker on AWS Elastic Container Service (ECS): • ECS cluster with 1 big EC2 instance assigned to it • We run ECS services with a task definition • An ECS task runs a number of Docker containers. [diagram: ECS Cluster containing an EC2 Instance, ECS Services and ECS Tasks]
  • 75. we will summarize this. Since we have several Docker containers running, we will omit the Service/Task structure and the actual EC2 instances for clarity. [diagram: the ECS Cluster with EC2 Instance, ECS Services and ECS Tasks collapsed into a single ECS Cluster box]
  • 76. the steps to put the model in production. • providing computing power for ML jobs • scheduling jobs • ingesting data and training • serving model scores via an API • integration & final API • moving to ‘near real time’
  • 77. scheduling with Apache Airflow. Apache Airflow is commonly used for scheduling tasks. • Distributed workers • Execute a directed acyclic graph (DAG) of tasks • Web interface to view and control the DAGs. Deployed as Docker containers. [diagram: SmartMatch Compute Cluster with Airflow core, Airflow webserver, Airflow workers and DAG syncer; ELB; PostgreSQL RDS; Redis ElastiCache; DAGs in S3]
  • 78. scheduling with Apache Airflow. © randstad group the netherlands IT | 78
  • 79. we summarize again. The training DAGs are run on the Airflow workers. We omit the compute cluster, the scheduler, the UI, the Airflow storage components, etc. In the next slides we will show a DAG running in one of the Airflow workers as a single box labelled "Training job".
  • 80. the steps to put the model in production. • providing computing power for ML jobs • scheduling jobs • ingesting data and training • serving model scores via an API • integration & final API • moving to ‘near real time’
  • 81. ingesting data and training. Shared feature store Preferred Vacancy Predictor trainer Shared features ingestion Capability Match trainer Data Lake (RedShift) Google Analytics Capability Match model in S3 Preferred Vacancy Predictor model in S3 © randstad group the netherlands IT | 81
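To make the ordering of these jobs concrete, here is a minimal sketch using only the Python standard library (not Airflow itself). The task names come from the slide; the dependency edges, with both trainers waiting for the shared features ingestion, are an assumption read from the diagram labels.

```python
from graphlib import TopologicalSorter

# task -> set of tasks it depends on (edges are assumed, see lead-in)
dag = {
    "shared_features_ingestion": set(),
    "preferred_vacancy_predictor_trainer": {"shared_features_ingestion"},
    "capability_match_trainer": {"shared_features_ingestion"},
}


def execution_order(graph):
    """Return one valid order in which a scheduler could run the tasks."""
    return list(TopologicalSorter(graph).static_order())


print(execution_order(dag))
```

A scheduler such as Airflow additionally lets independent tasks, here the two trainers, run in parallel on different workers once their shared upstream task has finished.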
  • 82. the steps to put the model in production. • providing computing power for ML jobs • scheduling jobs • ingesting data and training • serving model scores via an API • integration & final API • moving to ‘near real time’
  • 83. serving model predictions with Flask. • A separate ECS cluster for serving production APIs • Internal model APIs with Flask • These serve the scores for the individual model components that make up the total prediction. [diagram: SmartMatch Production Cluster with the Preferred Vacancy Predictor scoring engine and the Capability Match scoring engine, each behind an ELB]
  • 84. yet another summary. We will omit the ELB and the fact that the scoring engines run as Docker containers. [diagram: the two scoring engines collapsed into "Preferred Vacancy Predictor scorer" and "Capability Match scorer"]
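A minimal sketch of what one internal Flask model API could look like. The `/score` route, the JSON payload shape and the placeholder `score` function are all hypothetical; the slides only state that the model scores are served with Flask.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def score(talent_id, vacancy_ids):
    """Placeholder scorer: a real engine would use the trained model loaded at startup."""
    return {vacancy_id: 0.0 for vacancy_id in vacancy_ids}


@app.route("/score", methods=["POST"])
def score_endpoint():
    # Expected body (assumed): {"talent_id": "...", "vacancy_ids": ["...", ...]}
    payload = request.get_json(force=True)
    scores = score(payload["talent_id"], payload["vacancy_ids"])
    return jsonify({"talent_id": payload["talent_id"], "scores": scores})
```

A combining API could call one such endpoint per model component and merge the returned scores into the final recommendation.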
  • 85. serving the model scores. Shared feature store Preferred Vacancy Predictor trainer Shared features ingestion Capability Match trainer Data Lake (RedShift) Google Analytics Capability Match model & data in S3 Preferred Vacancy Predictor model & data in S3 Capability Match current data ingest Preferred Vacancy Predictor current data ingest Preferred Vacancy Predictor scorer Capability Match scorer Daily restart lambda © randstad group the netherlands IT | 85
  • 86. the steps to put the model in production. • providing computing power for ML jobs • scheduling jobs • ingesting data and training • serving model scores via an API • integration & final API • moving to ‘near real time’
  • 87. integration and final API. Preferred Vacancy Predictor scorer Capability Match scorer Vacancy Recommender API Frontend system © randstad group the netherlands IT | 87 Firehose register API definition API manager
  • 88. the steps to put the model in production. • providing computing power for ML jobs • scheduling jobs • ingesting data and training • serving model scores via an API • integration & final API • moving to ‘near real time’
  • 89. we only refresh data in batch. Shared feature store Preferred Vacancy Predictor trainer Shared features ingestion Capability Match trainer Data Lake (RedShift) Google Analytics Capability Match model & data in S3 Preferred Vacancy Predictor model & data in S3 Capability Match current data ingest Preferred Vacancy Predictor current data ingest Preferred Vacancy Predictor scorer Capability Match scorer © randstad group the netherlands IT | 89
  • 90. ‘near real time’ updates on vacancies. [diagram labels: Vacancy Management System; Business Event Service; API manager; SQS; Lambda reads vacancy and performs preprocessing; PostgreSQL persistent Vacancy cache; Preferred Vacancy Predictor scorer; Capability Match scorer; Capability Match model & data in S3; Preferred Vacancy Predictor model & data in S3]
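A sketch of the Lambda step in this flow, assuming it is triggered by SQS, where each record carries the message in its `body` field. The vacancy message fields, the `preprocess` logic and the in-memory dictionary standing in for the PostgreSQL vacancy cache are hypothetical.

```python
import json

vacancy_cache = {}  # stand-in for the persistent PostgreSQL vacancy cache


def preprocess(vacancy):
    """Illustrative preprocessing: keep the id and normalise the job title."""
    return {
        "vacancy_id": vacancy["vacancy_id"],
        "job_title": vacancy.get("job_title", "").strip().lower(),
    }


def handler(event, context=None):
    """Lambda entry point: read each SQS record and upsert the vacancy."""
    processed = 0
    for record in event.get("Records", []):
        vacancy = json.loads(record["body"])
        row = preprocess(vacancy)
        vacancy_cache[row["vacancy_id"]] = row  # insert or overwrite
        processed += 1
    return {"processed": processed}


event = {"Records": [{"body": json.dumps({"vacancy_id": "v1", "job_title": " Data Engineer "})}]}
print(handler(event))
```

Keying the cache on the vacancy id makes repeated events for the same vacancy idempotent, which matters because SQS may deliver a message more than once.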
  • 91. time to recap WHAT have we just done? © randstad group the netherlands IT | 91
  • 92. Diagram components: Shared feature store; Preferred Vacancy Predictor trainer; Shared features ingestion; Capability Match trainer; Data Lake (Redshift); Google Analytics; Capability Match model & data in S3; Preferred Vacancy Predictor model & data in S3; Capability Match current data ingest; Preferred Vacancy Predictor current data ingest; Preferred Vacancy Predictor scorer; Capability Match scorer; Frontend system; Vacancy Management System; Business Event Service
  • 93. having done all this we learned a lot.
  • 94. we now know many AWS components.
  • 95. we have great experience with fully scripted environments.
  • 96. we have seen the different components of an ML solution: data ingestion; handling live events; training models; exposing scoring results with an API. Diagram components: Shared feature store; Preferred Vacancy Predictor trainer; Shared features ingestion; Capability Match trainer; Data Lake (Redshift); Google Analytics; Capability Match model & data in S3; Preferred Vacancy Predictor model & data in S3; Capability Match current data ingest; Preferred Vacancy Predictor current data ingest; Preferred Vacancy Predictor trainer; Capability Match trainer; Frontend system; Vacancy Management System; Business Event Service
  • 97. we have been through the journey of a data scientist: access to the data; exploration of the data; modeling; versioning & source control; scheduling model training; scoring in production
  • 98. “real artists ship!” rinse & repeat all this for, let’s say, another 50 algorithms...
  • 99. “Don’t adventures ever have an end? I suppose not. Someone else always has to carry on the story.” — J.R.R. Tolkien
  • 100. moving forward: the DataHub.
  • 101. problems the data scientist encounters. Journey steps: access to the data; exploration of the data; modeling; versioning; build; deploy; scheduling model training; scoring in production. Problems: laptop or shared RStudio Server; no standards; lack of knowledge; no shared self-service scheduling infrastructure; lack of skills building production APIs
  • 102. self-service DataHub. Diagram components: data lake; project space; ECS task; SageMaker training job; model endpoint; SageMaker notebooks; DAG; data subscriptions. Notes: Airflow is used to schedule everything. DataHub provides roles, policies, naming, tagging, etc. SageMaker notebook instances contain JupyterLab for research. Standardizing data unloads/subscriptions will later allow for notifications and governance.
  • 103. project template with cookiecutter. • provides default project layout • folder for notebooks - store your research • DAG - write the DAG to run your scheduled jobs • operator for data ingest • operators for AWS SageMaker • operator for ECS tasks • Docker files - for SageMaker models or custom tasks • standard Airflow operators to perform tasks on the DataHub • Jenkins webhook provides standard deploy scripts
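The DAG in the project template chains data ingest, SageMaker and ECS operators. The sketch below shows only the dependency ordering such a DAG encodes, using the standard-library `graphlib` module (Python 3.9+) instead of Airflow so that it runs without a cluster. The task names are hypothetical and do not come from the DataHub template.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical task names; each task maps to the set of tasks it depends on.
dependencies: Dict[str, Set[str]] = {
    "ingest_data_subscription": set(),
    "sagemaker_training_job": {"ingest_data_subscription"},
    "deploy_model_endpoint": {"sagemaker_training_job"},
    "ecs_batch_scoring_task": {"sagemaker_training_job"},
}


def execution_order(deps: Dict[str, Set[str]]) -> List[str]:
    """Return one valid run order that respects every dependency.

    Raises graphlib.CycleError (a ValueError) if the graph is not acyclic.
    """
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
```

In an Airflow DAG the same edges would be declared between operator instances, and the scheduler would additionally handle retries, schedules and parallel execution of independent tasks.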
  • 104. data subscriptions. Get a periodic data extract from the DataHub in your project space. • no involvement of data engineering • central management of all data subscriptions
  • 105. self-service JupyterLab SageMaker instances. Get a JupyterLab instance with access to the data lake and your own project space. • no involvement of data engineering • comes with Anaconda installed • comes with working GPU support
  • 107. revisiting the data scientist journey. Journey steps: access to the data; exploration of the data; modeling; versioning; build; deploy; scheduling model training; scoring in production. Solutions: personal computing power with SageMaker notebook instances; standard build & deploy pipeline maintained by data engineering; Airflow cluster with fixed user management; project template with all the necessary components; standard way of working supported by data engineering
  • 108. “A single dream is more powerful than a thousand realities.” — J.R.R. Tolkien
  • 109. thank you! any questions?
  • 110. randstad group the netherlands IT