Creating the Best Teams Ever
with Collaborative Filtering
Maurits @vanderGoes
Tuesday 26 April 2016 - GraphConnect Europe 2016
5 reasons for virtual teams
Flat organisations
Interorganizational cooperation
Workers’ expectations
Service economy
Globalization
(Townsend et al., 1998)
Architecture
API (Scala)
Platform (Meteor)
Recom. engine (GraphAware)
Recom. DB (Neo4j)
Importer (Java)
Platform DB (MongoDB)
Hybrid DB
Logical Data Model in Neo4j
User
Network
Team
City
Country
Strength
ACTIVE_IN
LIVES_IN
LOCATED_IN
LOCATED_IN
MEMBER_OF
PART_OF
LOCATED_IN
HOLDS
GraphAware setup
Recommendation engine
Post processor
Config
RewardSameLanguage
RewardSameCity
RewardSameCountry
PenalizePartners
RewardSameTags
Filter
FilterOutPrivate
FilterOutEnded
FilterOutNetwork
FilterOutDeactivated
MaxRecommendations
Blacklist
ExistingRelationships
NearbyTeamsViaNetworks
NearbyTeams
Logger
StatisticsLogger
RecommendationLogger
MATCH (u:User)-[r:ACTIVE_IN]->(t:Team)
WHERE id(u)={id} AND r.role>1.0
RETURN t as blacklist
Blacklist
Filter
MATCH (u:User),
(n:Network)<-[:PART_OF]-(t:Team)
WHERE id(u)={id} AND n.privacy_type=3
AND NOT (u)-[:MEMBER_OF]->(n)
RETURN t as blacklist
MATCH (u:User),(t:Team)
WHERE id(u)={id} AND t.privacy_type=2
AND NOT (u)-[:ACTIVE_IN {role:1.0}]->(t)
RETURN t as blacklist
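The blacklist and filter queries above can be read as three set-building rules whose results are merged. Below is a minimal Python sketch of that reading on in-memory toy data. The dictionaries, the user and team names, and the function `blacklisted_teams` are hypothetical; the role and privacy codes only mirror the constants in the queries, whose meaning the slides do not state.

```python
# Toy data; every name and value here is an illustrative assumption.
active_in = {("alice", "t1"): 1.0, ("alice", "t2"): 1.5}  # (user, team) -> role
member_of = {("alice", "n1")}                              # (user, network)
part_of = {"t3": "n2", "t4": "n1"}                         # team -> network
network_privacy = {"n1": 3, "n2": 3}
team_privacy = {"t1": 2, "t2": 1, "t3": 1, "t4": 1, "t5": 2}


def blacklisted_teams(user):
    # Rule 1: teams where the user's ACTIVE_IN role is above 1.0.
    existing = {t for (u, t), role in active_in.items()
                if u == user and role > 1.0}
    # Rule 2: teams in a network with privacy_type 3 that the user
    # is not a member of.
    closed_network = {t for t, n in part_of.items()
                      if network_privacy.get(n) == 3
                      and (user, n) not in member_of}
    # Rule 3: teams with privacy_type 2, unless the user is ACTIVE_IN
    # that team with role 1.0.
    private = {t for t, p in team_privacy.items()
               if p == 2 and active_in.get((user, t)) != 1.0}
    return existing | closed_network | private


print(sorted(blacklisted_teams("alice")))
```

Teams in the returned set would be removed from the candidate recommendations before scoring.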
5 Challenges for Filtering
Data sparsity (cold start)
Gray sheep
Scalability
Shilling attacks
Synonymy
(Goldberg et al., 1992)
Domain independent solution
User-based collaborative filtering algorithm
#1
#4
#2
#3
2x
4x
5x
1x
MATCH (u1:User)-[r1:ACTIVE_IN]->(t:Team)
<-[r2:ACTIVE_IN]-(u2:User)
-[r3:ACTIVE_IN]->(reco:Team)
WHERE id(u1)={id}
RETURN reco,
COLLECT(t._id) as team_id,
u2._id as partner_id,
(r1.role + r2.role + r3.role)/3 AS score
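To make the scoring in this query concrete, here is a small Python sketch that walks the same path (user, shared team, partner, partner's other team) over toy data and averages the three role values. The data and the function name `nearby_teams` are hypothetical, and unlike the query the sketch returns one row per path rather than collecting team ids.

```python
# (user, team) -> role value on the ACTIVE_IN relationship (toy values).
active_in = {
    ("u1", "a"): 2.0,
    ("u2", "a"): 1.0,
    ("u2", "b"): 1.5,
    ("u3", "a"): 2.0,
    ("u3", "b"): 2.0,
    ("u3", "c"): 1.0,
}


def nearby_teams(user):
    rows = []
    for (u1, team), r1 in active_in.items():
        if u1 != user:
            continue
        # Partners: other users active in the same team.
        for (u2, shared), r2 in active_in.items():
            if shared != team or u2 == user:
                continue
            # Candidate teams: the partner's other teams.
            for (u3, reco), r3 in active_in.items():
                if u3 != u2 or reco == team:
                    continue
                rows.append((reco, team, u2, (r1 + r2 + r3) / 3))
    return rows


for row in nearby_teams("u1"):
    print(row)
```

A team reached through several partners appears in several rows, which is what lets a later step rank teams by how often and how strongly they are reached.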
Nearby teams
Nearby teams via networks
MATCH (u1:User)-[r1:MEMBER_OF]->(n:Network)
<-[r2:MEMBER_OF]-(u2:User)
-[r3:ACTIVE_IN]->(reco:Team)
WHERE id(u1)={id} AND r3.role>=1.5
RETURN reco,
n._id as network_id,
u2._id as member_id,
count(*)*0.5 AS score
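The network variant follows the same shape, but it scores by counting rows instead of averaging roles. A hedged Python sketch on toy data is shown here; the names and the function `nearby_teams_via_networks` are hypothetical. Grouping by team, network and member mirrors the non-aggregated columns in the RETURN clause.

```python
from collections import Counter

# Toy data; names and values are illustrative assumptions.
member_of = {("u1", "n1"), ("u2", "n1"), ("u3", "n1"), ("u3", "n2")}
active_in = {("u2", "x"): 1.5, ("u3", "x"): 2.0, ("u3", "y"): 1.0}


def nearby_teams_via_networks(user):
    counts = Counter()
    for (u1, network) in member_of:
        if u1 != user:
            continue
        # Other members of the same network.
        for (u2, n2) in member_of:
            if n2 != network or u2 == user:
                continue
            # Teams where that member has a role of at least 1.5.
            for (u3, reco), role in active_in.items():
                if u3 == u2 and role >= 1.5:
                    counts[(reco, network, u2)] += 1
    # Each counted row contributes 0.5 to the score.
    return {key: n * 0.5 for key, n in counts.items()}


print(nearby_teams_via_networks("u1"))
```

With a single relationship per user and team, every group scores 0.5, so the ordering of teams comes from how many members and networks lead to them.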
Item-based collaborative filtering algorithm
Participation scores: 4.5, 2.3, 3.7, 2.6, 3.9, 4.1
Similarity coefficients: 0.87321, 0.64962
Predicted participation score: 3.6
Some relations are omitted for readability
Similarity
MATCH (t1:Team), (t2:Team)
WHERE t1<>t2
MATCH (t1)<-[r:ACTIVE_IN]-(u:User)
WITH toFloat(AVG(r.participation)) AS t1Mean, t1, t2
MATCH (t2)<-[r:ACTIVE_IN]-(u:User)
WITH toFloat(AVG(r.participation)) AS t2Mean, t1Mean, t1, t2
MATCH (t1)<-[r1:ACTIVE_IN]-(u:User)-[r2:ACTIVE_IN]->(t2)
WITH SUM((r1.participation-t1Mean)*(r2.participation-t2Mean))
AS numerator,
SQRT(SUM((r1.participation-t1Mean)^2) *
SUM((r2.participation-t2Mean)^2)) AS denominator,
t1, t2, COUNT(r1) AS r1Count
WHERE denominator<>0 AND r1Count>2
MERGE (t1)-[q:SIMILARITY]-(t2)
SET q.coefficient=(numerator/denominator)
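The similarity query computes a Pearson-style correlation between two teams. The Python sketch below follows the same steps on toy data, as an illustration rather than the production code. As in the query, each team mean is taken over all of its users, the sums run only over users active in both teams, and a pair is skipped when the denominator is zero or when two or fewer users are shared. The data and the function name `similarity` are hypothetical.

```python
from math import sqrt

# team -> {user: participation score} (toy values).
participation = {
    "t1": {"a": 4.0, "b": 2.0, "c": 3.0, "d": 5.0},
    "t2": {"a": 5.0, "b": 1.0, "c": 3.0},
    "t3": {"a": 1.0, "b": 2.0},
}


def similarity(t1, t2):
    p1, p2 = participation[t1], participation[t2]
    # Means over all users of each team, as in the query.
    mean1 = sum(p1.values()) / len(p1)
    mean2 = sum(p2.values()) / len(p2)
    # Only users active in both teams enter the sums.
    shared = p1.keys() & p2.keys()
    numerator = sum((p1[u] - mean1) * (p2[u] - mean2) for u in shared)
    denominator = sqrt(
        sum((p1[u] - mean1) ** 2 for u in shared)
        * sum((p2[u] - mean2) ** 2 for u in shared)
    )
    # Mirrors: WHERE denominator<>0 AND r1Count>2
    if denominator == 0 or len(shared) <= 2:
        return None
    return numerator / denominator


print(similarity("t1", "t2"))
print(similarity("t1", "t3"))
```

The coefficient is symmetric in the two teams, which matches the undirected MERGE of the SIMILARITY relationship.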
Item-based collaborative filtering
MATCH (u:User)-[a:ACTIVE_IN]->(t:Team)
-[s:SIMILARITY]-(reco:Team)
WHERE id(u)={id}
WITH AVG(a.participation*s.coefficient)
AS score, reco
RETURN score, reco
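The scoring step of this query can be sketched in a few lines of Python: for each candidate team, average the product of the user's participation and the similarity coefficient over the user's teams that are linked to it. The toy values and the function name `item_based_scores` are hypothetical, and the undirected SIMILARITY relationship is stored one way only (user's team first) for brevity.

```python
from collections import defaultdict

# The user's participation per team they are active in (toy values).
user_participation = {"a": 4.0, "b": 2.0}
# (user's team, candidate team) -> similarity coefficient (toy values).
similarity = {("a", "x"): 0.5, ("b", "x"): 1.0, ("a", "y"): 0.25}


def item_based_scores():
    products = defaultdict(list)
    for (team, reco), coefficient in similarity.items():
        if team in user_participation:
            products[reco].append(user_participation[team] * coefficient)
    # Mirrors: AVG(a.participation * s.coefficient) grouped by reco.
    return {reco: sum(v) / len(v) for reco, v in products.items()}


print(item_based_scores())
```

Note that a plain average of products, as in the query, differs from a similarity-weighted average, which would divide by the sum of the coefficients instead of the number of linked teams.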
User feedback
Better participation score
Show context
Recognised their profile
github.com/part-up
Round-up
Filtering needed
Easily customizable
Scalable
Domain independent
Hybrid
Visit Part-up.com
Slides: vdgo.es/graph16
Icon credits (the Noun Project): Aha-Soft, Akshay Kore, Alfredo Hernandez, Aneeque Ahmed, Creative Stall, Gregor Črešnar, Luis Prado, icon 54,
Iconathon, Kevin Augustine LO, Klara Zalokar, Magicon, Matt Hawdon, Muneer A.Safiah, Nono Martínez Alonso, Simple Icons, Wilson Joseph
Questions: @vanderGoes & m@vdgo.es

GraphConnect Europe 2016 - Creating the Best Teams Ever with Collaborative Filtering - Maurits van der Goes