Job recommendations are significantly different
Rapid inventory growth: millions of new jobs are discovered every day
~1.5 million new users visit Indeed every day
The average lifespan of a job is ~30 days
One job posting is usually meant to hire one individual
Compute similarity
for u_i in Users:
    for u_j in Users:
        SIM[i, j] = compute_similarity(u_i, u_j)
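$J(U_i, U_j) = \frac{|\mathrm{Items}[U_i] \cap \mathrm{Items}[U_j]|}{|\mathrm{Items}[U_i] \cup \mathrm{Items}[U_j]|}$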
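The brute-force approach can be sketched as follows. This is a minimal illustration, assuming `compute_similarity` is the Jaccard similarity of two users' item sets; the names `jaccard`, `pairwise_similarity`, and `items_by_user` are hypothetical and do not come from the talk.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """|a ∩ b| / |a ∪ b|, defined here as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_similarity(items_by_user: dict) -> dict:
    """Similarity for every unordered pair of users: O(n^2) comparisons."""
    return {
        (u, v): jaccard(items_by_user[u], items_by_user[v])
        for u, v in combinations(sorted(items_by_user), 2)
    }


# Toy data (hypothetical): two users with one job in common.
items_by_user = {
    "user1": {"job1", "job2"},
    "user2": {"job2", "job3", "job5"},
}
sims = pairwise_similarity(items_by_user)
print(sims)
```

The number of pairs grows quadratically with the number of users, which is what makes this loop impractical at the scale described earlier.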
$\mathrm{Items}[U_i] = \{x_1, x_2, \ldots, x_n\}$
Hash function $H$
$\mathrm{minhash}_H(U_i) = \min\{\, H(x) \mid x \in \mathrm{Items}[U_i] \,\}$
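A minimal sketch of the minhash of one user's item set. The talk does not say which hash function is used, so the salted MD5 below is purely an assumption for illustration, as are the names `H` and `minhash`.

```python
import hashlib


def H(x: str, seed: int = 0) -> int:
    """Deterministic 64-bit hash of x, salted by seed (assumed hash, not from the talk)."""
    digest = hashlib.md5(f"{seed}:{x}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash(items: set, seed: int = 0) -> int:
    """Smallest hash value over all items in the set."""
    return min(H(x, seed) for x in items)


print(minhash({"job1", "job2", "job3"}))
```

Because the result is a minimum over the set, it does not depend on the order in which items are visited.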
Similarity(U1, U2) = 1, if minhash(U1) == minhash(U2)
Similarity(U1, U2) = 0, otherwise
This is an unbiased estimator of the Jaccard similarity J(U1, U2)
$\Pr[\mathrm{minhash}_H(U_i) = \mathrm{minhash}_H(U_j)] = J(U_i, U_j)$
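The identity above can be checked empirically: averaging the 0/1 match indicator over many independent hash functions should approach the exact Jaccard similarity. The sketch below assumes salted MD5 as a stand-in for a family of random hash functions; the sets, the function names, and the choice of 1000 hashes are all illustrative.

```python
import hashlib


def H(x, seed: int) -> int:
    """Seeded 64-bit hash; each seed acts as a different hash function (assumption)."""
    digest = hashlib.md5(f"{seed}:{x}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash(items: set, seed: int) -> int:
    return min(H(x, seed) for x in items)


def estimate_jaccard(a: set, b: set, num_hashes: int = 1000) -> float:
    """Fraction of hash functions under which the two minhashes agree."""
    matches = sum(minhash(a, s) == minhash(b, s) for s in range(num_hashes))
    return matches / num_hashes


a = set(range(0, 100))
b = set(range(50, 150))
exact = len(a & b) / len(a | b)  # 50 shared items out of 150 distinct items
approx = estimate_jaccard(a, b)
print(exact, approx)
```

The intuition is that, under a random hash, every element of the union is equally likely to have the smallest hash value, and the two minhashes agree exactly when that element lies in the intersection.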
user → {job1, job2, job3,..}
H = {H_1, H_2, ..., H_20}
for user in Users:
    for hash in H:
        minhash[user][hash] = min{ hash(x) | x ∈ Items[user] }
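The loop above can be sketched as a signature of 20 minhash values per user. The seeded MD5 family and the names `signature` and `items_by_user` are assumptions for illustration; only the count of 20 hash functions comes from the slide.

```python
import hashlib

NUM_HASHES = 20  # size of the hash family H on the slide


def H(x: str, k: int) -> int:
    """k-th hash function of an assumed family, built by salting MD5 with k."""
    digest = hashlib.md5(f"{k}:{x}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def signature(items: set) -> tuple:
    """One minhash value per hash function in the family."""
    return tuple(min(H(x, k) for x in items) for k in range(NUM_HASHES))


# Toy data (hypothetical).
items_by_user = {
    "user1": {"job1", "job2"},
    "user2": {"job2", "job3", "job5"},
    "user3": {"job1", "job2"},
}
signatures = {user: signature(items) for user, items in items_by_user.items()}
print(signatures["user1"][:3])
```

Each user is processed independently, so this step is linear in the number of users rather than quadratic.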
user1 → (111, 123, 134, 148, ..129)
user2 → (101, 123, 139, 148, ..135)
user3 → (191, 103, 126, 108, ..119)
user4 → (191, 103, 126, 108, ..129)
...
user → {cluster}
cluster → {users}
123 → (user1, user2)
148 → (user1, user2)
129 → (user1, user4)
191 → (user3, user4)
...
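Turning per-user signatures into clusters amounts to building an inverted index from minhash value to users. The sketch below uses only the signature values visible on the slide (the elided middle values are left out) and, like the slide, keys clusters by the value alone; `build_clusters` is a hypothetical name.

```python
from collections import defaultdict

# Visible signature values from the slide; elided positions are omitted.
signatures = {
    "user1": (111, 123, 134, 148, 129),
    "user2": (101, 123, 139, 148, 135),
    "user3": (191, 103, 126, 108, 119),
    "user4": (191, 103, 126, 108, 129),
}


def build_clusters(signatures: dict) -> dict:
    """Map each minhash value (cluster id) to the set of users that produced it."""
    clusters = defaultdict(set)
    for user, sig in signatures.items():
        for value in sig:
            clusters[value].add(user)
    return dict(clusters)


clusters = build_clusters(signatures)
print(sorted(clusters[123]))
```

Users that share a minhash value land in the same cluster without any pairwise comparison.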
user1 → {job1, job2}
user2 → {job2, job3, job5}
123 → {user1, user2}
123 → {job1, job2, job3, job5}
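The step from cluster → {users} to cluster → {jobs} is a union of the members' job sets. A minimal sketch using the slide's example values; the function name `jobs_by_cluster` is hypothetical.

```python
def jobs_by_cluster(cluster_users: dict, jobs_by_user: dict) -> dict:
    """For each cluster, take the union of the jobs of all users in that cluster."""
    return {
        cluster: set().union(*(jobs_by_user[u] for u in users))
        for cluster, users in cluster_users.items()
    }


jobs_by_user = {
    "user1": {"job1", "job2"},
    "user2": {"job2", "job3", "job5"},
}
cluster_users = {123: {"user1", "user2"}}

cluster_jobs = jobs_by_cluster(cluster_users, jobs_by_user)
print(sorted(cluster_jobs[123]))
```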
user → {cluster}
user1 → {111, 123, ..}
111 → {job5, job2, job9}
123 → {job1, job2, job3, job5}
{job2, job5, job9, job1, job3}
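Serving recommendations then reduces to two lookups: user → clusters, then cluster → jobs. The sketch below merges the job sets and, as an assumption not stated on the slide, ranks jobs by how many of the user's clusters contain them. Only the two cluster ids visible on the slide are used.

```python
from collections import Counter

clusters_by_user = {"user1": [111, 123]}
jobs_by_cluster = {
    111: {"job5", "job2", "job9"},
    123: {"job1", "job2", "job3", "job5"},
}


def recommend(user: str) -> list:
    """Merge jobs from the user's clusters, most frequently seen first (assumed ranking)."""
    counts = Counter()
    for cluster in clusters_by_user.get(user, []):
        counts.update(jobs_by_cluster.get(cluster, ()))
    # Sort by descending count, then by job id to make ties deterministic.
    return [job for job, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]


recs = recommend("user1")
print(recs)
```

Jobs that appear in both clusters (job2 and job5) come first under this ranking; an unknown user simply gets an empty list.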
→ {101, 121}
{“Software Engineer”, “Java Developer”, “Python Developer”}
minhash({“Software Engineer”, “Java Developer”, “Python Developer”}) → {99, 135}
→ {99, 121}
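The numbers above are consistent with an incremental update: the minhash of a union of two sets equals the element-wise minimum of the two signatures, so an existing signature can be refreshed without rehashing old items. The sketch below assumes that reading of the slide; `merge_signature` is a hypothetical name.

```python
def merge_signature(current: tuple, new_items_signature: tuple) -> tuple:
    """Element-wise minimum of two signatures built with the same hash functions."""
    if len(current) != len(new_items_signature):
        raise ValueError("signatures must have the same length")
    return tuple(min(a, b) for a, b in zip(current, new_items_signature))


# Existing signature {101, 121}, signature of the new items {99, 135}.
updated = merge_signature((101, 121), (99, 135))
print(updated)
```

With the slide's values this yields (99, 121), matching the updated result shown.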
minhash({“Software Engineer”, “Java Developer”, “Python Developer”}) → {99, 121}
99 → add {“Software Engineer”, “Java Developer”, “Python Developer”}
121 → add {“Software Engineer”, “Java Developer”, “Python Developer”}
{“Software Engineer”, “Java Developer”, “Python Developer”} {99, 121}
99 → {“Software Engineer”, “Java Developer”, “Python Developer”}
→ {99, 131}
{“Software Engineer”, “Java Developer”, “Python Developer”}
1. http://go.indeed.com/docservice
Engineering blog & talks http://indeed.tech
Open Source http://opensource.indeedeng.io
Careers http://indeed.jobs
Twitter @IndeedEng
Data Day Texas - Recommendations