Graph Modeling
Tips and Tricks
Mark Needham @markhneedham
Introducing our data set...
meetup.com’s recommendations
Recommendation queries
‣ Several different types
• groups to join
• topics to follow
• events to attend
‣ As a user of meetup.com trying to find
groups to join and events to attend
How will this talk be structured?
Find similar groups to Neo4j
As a member of the Neo4j London group
I want to find other similar meetup groups
So that I can join those groups
What makes groups similar?
The same user story is then revisited for each part of the model:
‣ Nodes
‣ Relationships
‣ Labels
‣ Properties
Find similar groups to Neo4j
MATCH (group:Group {name: "Neo4j - London User Group"})
-[:HAS_TOPIC]->(topic)<-[:HAS_TOPIC]-(otherGroup)
RETURN otherGroup.name,
COUNT(topic) AS topicsInCommon,
COLLECT(topic.name) AS topics
ORDER BY topicsInCommon DESC, otherGroup.name
LIMIT 10
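To try the query above without the full meetup.com data set, a tiny hand-made graph is enough. The sketch below is illustrative only: the second group, the topic names and the :Topic label are made up for this example and are not taken from the talk's data.

```cypher
// Hypothetical sample data: two groups that share one topic
CREATE (neo:Group {name: "Neo4j - London User Group"}),
       (other:Group {name: "Example Graph Group"}),
       (graphs:Topic {name: "Graph Databases"}),
       (nosql:Topic {name: "NoSQL"}),
       (neo)-[:HAS_TOPIC]->(graphs),
       (neo)-[:HAS_TOPIC]->(nosql),
       (other)-[:HAS_TOPIC]->(graphs)
```

Against this data the similarity query should return the made-up group with one topic in common.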
Tip: Model incrementally
‣ Build the model for the question we need
to answer now then move onto the next
question
I’m already a member of these!
What other data can we get?
Exclude groups I’m a member of
As a member of the Neo4j London group
I want to find other similar meetup groups
that I’m not already a member of
So that I can join those groups
MATCH (group:Group {name: "Neo4j - London User Group"})
-[:HAS_TOPIC]->(topic)<-[:HAS_TOPIC]-(otherGroup:Group)
RETURN otherGroup.name,
COUNT(topic) AS topicsInCommon,
EXISTS((:Member {name: "Mark Needham"})
-[:MEMBER_OF]->(otherGroup)) AS alreadyMember,
COLLECT(topic.name) AS topics
ORDER BY topicsInCommon DESC
LIMIT 10
Exclude groups I’m a member of
MATCH (group:Group {name: "Neo4j - London User Group"})
-[:HAS_TOPIC]->(topic)<-[:HAS_TOPIC]-(otherGroup:Group)
WHERE NOT( (:Member {name: "Mark Needham"})
-[:MEMBER_OF]->(otherGroup) )
RETURN otherGroup.name,
COUNT(topic) AS topicsInCommon,
COLLECT(topic.name) AS topics
ORDER BY topicsInCommon DESC
LIMIT 10
Find my similar groups
As a member of several meetup groups
I want to find other similar meetup groups
that I’m not already a member of
So that I can join those groups
MATCH (member:Member {name: "Mark Needham"})
-[:INTERESTED_IN]->(topic),
(member)-[:MEMBER_OF]->(group)-[:HAS_TOPIC]->(topic)
WITH member, topic, COUNT(*) AS score
MATCH (topic)<-[:HAS_TOPIC]-(otherGroup)
WHERE NOT (member)-[:MEMBER_OF]->(otherGroup)
RETURN otherGroup.name,
COLLECT(topic.name),
SUM(score) as score
ORDER BY score DESC
Find Jonny’s similar groups
Oops...Jonny has no interests!
What is Jonny interested in?
As a member of several meetup groups
I want to find other similar meetup groups
that I’m not already a member of
So that I can join those groups
INTERESTED_IN?
What is Jonny interested in?
There’s an implicit INTERESTED_IN relationship
between me and the topics of groups I belong to,
even if I haven’t expressed an interest in them.
Let’s make it explicit.
Before: (P)-[:MEMBER_OF]->(G)-[:HAS_TOPIC]->(T)
After: (P)-[:MEMBER_OF]->(G)-[:HAS_TOPIC]->(T), (P)-[:INTERESTED_IN]->(T)
What is Jonny interested in?
MATCH (m:Member)-[:MEMBER_OF]->(group)-[:HAS_TOPIC]->(topic)
WITH m, topic, COUNT(*) AS times
WHERE times > 3
MERGE (m)-[:INTERESTED_IN]->(topic)
Tip: Make the implicit explicit
‣ Fill in the missing links in the graph
‣ You could run this type of query once a day
during a quiet period
‣ On bigger graphs we’d run it in
batches to avoid loading the
whole database into memory
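One way the batching could look is sketched below. It is an assumption, not the talk's code: the marker label :InterestsProcessed and the {batchSize} parameter are hypothetical names, and the parameter uses the older {param} style seen elsewhere in these slides (newer Neo4j versions write $batchSize).

```cypher
// Pick a batch of members that have not been handled yet
MATCH (m:Member)
WHERE NOT m:InterestsProcessed
WITH m LIMIT {batchSize}
// Mark them first so they are skipped on the next run,
// even if they end up with no inferred interests
SET m:InterestsProcessed
WITH m
MATCH (m)-[:MEMBER_OF]->(group)-[:HAS_TOPIC]->(topic)
WITH m, topic, COUNT(*) AS times
WHERE times > 3
MERGE (m)-[:INTERESTED_IN]->(topic)
```

The statement would be run repeatedly until no unmarked members remain; the marker label can then be removed before the next daily run.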
Find next group people join
As a member of a meetup group
I want to find out which meetup groups
other people join after this one
So that I can join those groups
Find next group people join
MATCH (group:Group {name: "Neo4j - London User Group"})
<-[membership:MEMBER_OF]-(member),
(member)-[otherMembership:MEMBER_OF]->(otherGroup)
WHERE membership.joined < otherMembership.joined
WITH member, otherGroup
ORDER BY otherMembership.joined
WITH member, COLLECT(otherGroup)[0] AS nextGroup
RETURN nextGroup.name, COUNT(*) AS times
ORDER BY times DESC
It feels a bit clunky...
‣ We have to scan through all the MEMBER_OF
relationships to find the one we want
‣ It might make our lives easier if we made
membership a first class citizen of the domain
Facts can become nodes
Refactor to facts
MATCH (member:Member)-[rel:MEMBER_OF]->(group)
MERGE (membership:Membership {id: member.id + "_" + group.id})
SET membership.joined = rel.joined
MERGE (member)-[:HAS_MEMBERSHIP]->(membership)
MERGE (membership)-[:OF_GROUP]->(group)
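Because the refactoring looks up each Membership node by its id with MERGE, a uniqueness constraint on that property is likely to help, since it also gives the lookup an index to use. This is a suggestion rather than something shown in the slides, and it uses the legacy constraint syntax that matches the CYPHER 2.3 era of the deck; recent Neo4j versions use a different CREATE CONSTRAINT ... FOR ... REQUIRE form.

```cypher
// Legacy (Neo4j 2.x/3.x) syntax; run once before the refactoring query
CREATE CONSTRAINT ON (membership:Membership)
ASSERT membership.id IS UNIQUE
```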
Refactor to facts
MATCH (member:Member)-[:HAS_MEMBERSHIP]->(membership)
WITH member, membership ORDER BY member.id, membership.joined
WITH member, COLLECT(membership) AS memberships
UNWIND RANGE(0,SIZE(memberships) - 2) as idx
WITH memberships[idx] AS m1, memberships[idx+1] AS m2
MERGE (m1)-[:NEXT]->(m2)
Find next group people join
MATCH (group:Group {name: "Neo4j - London User Group"})
<-[:OF_GROUP]-(membership)-[:NEXT]->(nextMembership),
(membership)<-[:HAS_MEMBERSHIP]-(member:Member)
-[:HAS_MEMBERSHIP]->(nextMembership),
(nextMembership)-[:OF_GROUP]->(nextGroup)
RETURN nextGroup.name, COUNT(*) AS times
ORDER BY times DESC
Comparing the approaches
MATCH (group:Group {name: "Neo4j - London User Group"})
<-[membership:MEMBER_OF]-(member),
(member)-[otherMembership:MEMBER_OF]->(otherGroup)
WITH member, membership, otherMembership, otherGroup
ORDER BY member.id, otherMembership.joined
WHERE membership.joined < otherMembership.joined
WITH member, membership, COLLECT(otherGroup)[0] AS nextGroup
RETURN nextGroup.name, COUNT(*) AS times
ORDER BY times DESC
vs
MATCH (group:Group {name: "Neo4j - London User Group"})
<-[:OF_GROUP]-(membership)-[:NEXT]->(nextMembership),
(membership)<-[:HAS_MEMBERSHIP]-(member:Member)
-[:HAS_MEMBERSHIP]->(nextMembership),
(nextMembership)-[:OF_GROUP]->(nextGroup)
RETURN nextGroup.name, COUNT(*) AS times
ORDER BY times DESC
How do I profile a query?
‣ EXPLAIN
• shows the execution plan without actually
executing it or returning any results.
‣ PROFILE
• executes the statement and returns the results
along with profiling information.
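Both keywords are simply placed in front of a query. As a minimal sketch, here they are applied to a shortened form of the similar-groups query from earlier in the talk:

```cypher
// EXPLAIN: show the execution plan only; the query is not run
EXPLAIN
MATCH (group:Group {name: "Neo4j - London User Group"})
      -[:HAS_TOPIC]->(topic)<-[:HAS_TOPIC]-(otherGroup)
RETURN otherGroup.name, COUNT(topic) AS topicsInCommon
ORDER BY topicsInCommon DESC
LIMIT 10;

// PROFILE: run the query and report db hits alongside the results
PROFILE
MATCH (group:Group {name: "Neo4j - London User Group"})
      -[:HAS_TOPIC]->(topic)<-[:HAS_TOPIC]-(otherGroup)
RETURN otherGroup.name, COUNT(topic) AS topicsInCommon
ORDER BY topicsInCommon DESC
LIMIT 10;
```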
Neo4j’s longest plan (so far…)
What is our goal?
At a high level, the goal is
simple: get the number of
db hits down.
What is a database hit?
“an abstract unit of storage engine work.”
Comparing the approaches
Cypher version: CYPHER 2.3,
planner: COST.
111656 total db hits in 330 ms.
vs
Cypher version: CYPHER 2.3,
planner: COST.
23650 total db hits in 60 ms.
Tip: Profile your queries
‣ Spike the different models and see which
one performs the best
Should we keep both models?
We could, but whenever we add, edit or remove a membership
we’d have to keep both graph structures in sync.
Adding a group membership
WITH "Mark Needham" AS memberName,
"Neo4j - London User Group" AS groupName,
timestamp() AS now
MATCH (group:Group {name: groupName})
MATCH (member:Member {name: memberName})
MERGE (member)-[memberOfRel:MEMBER_OF]->(group)
ON CREATE SET memberOfRel.joined = now
MERGE (membership:Membership {id: member.id + "_" + group.id})
ON CREATE SET membership.joined = now
MERGE (member)-[:HAS_MEMBERSHIP]->(membership)
MERGE (membership)-[:OF_GROUP]->(group)
Removing a group membership
WITH "Mark Needham" AS memberName,
"Neo4j - London User Group" AS groupName
MATCH (group:Group {name: groupName})
MATCH (member:Member {name: memberName})
MATCH (member)-[memberOfRel:MEMBER_OF]->(group)
MATCH (membership:Membership {id: member.id + "_" + group.id})
MATCH (member)-[hasMembershipRel:HAS_MEMBERSHIP]->(membership)
MATCH (membership)-[ofGroupRel:OF_GROUP]->(group)
// Note: this DELETE fails if membership still has NEXT relationships; remove or relink those first
DELETE memberOfRel, hasMembershipRel, ofGroupRel, membership
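Once memberships are chained together with NEXT relationships, removing one from the middle of a chain also means joining its neighbours back up. The following is one possible sketch of that, not the talk's own code; the variable names prevMembership and nextMembership are chosen for illustration, and it relies on DETACH DELETE, which is available from Neo4j 2.3.

```cypher
MATCH (member:Member {name: "Mark Needham"})
      -[:HAS_MEMBERSHIP]->(membership:Membership)
      -[:OF_GROUP]->(group:Group {name: "Neo4j - London User Group"})
MATCH (member)-[memberOfRel:MEMBER_OF]->(group)
OPTIONAL MATCH (prevMembership:Membership)-[:NEXT]->(membership)
OPTIONAL MATCH (membership)-[:NEXT]->(nextMembership:Membership)
// FOREACH over a 0- or 1-element list acts as a conditional write:
// bridge the gap only when the membership has both neighbours
FOREACH (ignored IN CASE
           WHEN prevMembership IS NOT NULL AND nextMembership IS NOT NULL
           THEN [1] ELSE [] END |
  MERGE (prevMembership)-[:NEXT]->(nextMembership))
DELETE memberOfRel
// DETACH DELETE removes the node together with its remaining relationships
DETACH DELETE membership
```

This keeps both models in sync on removal, which illustrates the extra write-side work that maintaining two structures brings.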
Let’s delete MEMBER_OF then...
...not so fast!
As a member of several meetup groups
I want to find other similar meetup groups
that I’m not already a member of
So that I can join those groups
Why not delete MEMBER_OF?
MATCH (member:Member {name: "Mark Needham"})
-[:MEMBER_OF]->(group)-[:HAS_TOPIC]->(topic)
WITH member, topic, COUNT(*) AS score
MATCH (topic)<-[:HAS_TOPIC]-(otherGroup)
WHERE NOT (member)-[:MEMBER_OF]->(otherGroup)
RETURN otherGroup.name, COLLECT(topic.name), SUM(score) as score
ORDER BY score DESC
MATCH (member:Member {name: "Mark Needham"})
-[:HAS_MEMBERSHIP]->()-[:OF_GROUP]->(group:Group)-[:HAS_TOPIC]->(topic)
WITH member, topic, COUNT(*) AS score
MATCH (topic)<-[:HAS_TOPIC]-(otherGroup)
WHERE NOT (member)-[:HAS_MEMBERSHIP]->(:Membership)-[:OF_GROUP]->(otherGroup:Group)
RETURN otherGroup.name, COLLECT(topic.name), SUM(score) as score
ORDER BY score DESC
433318 total db hits in 485 ms.
83268 total db hits in 117 ms.
Tip: Maintaining multiple models
‣ A model that performs better for some
queries may perform worse for others
‣ Optimising for reads may mean we pay a
write and maintenance penalty
What about events?
Events in my groups
As a member of several meetup groups who
has previously attended events
I want to find other events hosted by those
groups
So that I can attend those events
WITH 24.0*60*60*1000 AS oneDay
MATCH (member:Member {name: "Mark Needham"}),
(member)-[:MEMBER_OF]->(group),
(group)-[:HOSTED_EVENT]->(futureEvent)
WHERE futureEvent.time >= timestamp()
AND NOT (member)-[:RSVPD]->(futureEvent)
RETURN group.name, futureEvent.name,
round((futureEvent.time - timestamp()) / oneDay) AS days
ORDER BY days
LIMIT 10
+ previous events attended
WITH 24.0*60*60*1000 AS oneDay
MATCH (member:Member {name: "Mark Needham"})
MATCH (futureEvent:Event)
WHERE futureEvent.time >= timestamp() AND NOT (member)-[:RSVPD]->(futureEvent)
MATCH (futureEvent)<-[:HOSTED_EVENT]-(group)
WITH oneDay, group, futureEvent, member, EXISTS((group)<-[:MEMBER_OF]-(member)) AS isMember
OPTIONAL MATCH (member)-[rsvp:RSVPD {response: "yes"}]->(pastEvent)<-[:HOSTED_EVENT]-(group)
WHERE pastEvent.time < timestamp()
RETURN group.name,
futureEvent.name,
isMember,
COUNT(rsvp) AS previousEvents,
round((futureEvent.time - timestamp()) / oneDay) AS days
ORDER BY days, previousEvents DESC
RSVPD_YES vs RSVPD
I was curious whether refactoring
RSVPD {response: "yes"} to RSVPD_YES would have
any impact, as Neo4j is optimised for querying
by specific relationship types.
Refactor to specific relationships
MATCH (m:Member)-[rsvp:RSVPD {response:"yes"}]->(event)
MERGE (m)-[rsvpYes:RSVPD_YES {id: rsvp.id}]->(event)
ON CREATE SET rsvpYes.created = rsvp.created,
rsvpYes.lastModified = rsvp.lastModified;
MATCH (m:Member)-[rsvp:RSVPD {response:"no"}]->(event)
MERGE (m)-[rsvpNo:RSVPD_NO {id: rsvp.id}]->(event)
ON CREATE SET rsvpNo.created = rsvp.created,
rsvpNo.lastModified = rsvp.lastModified;
RSVPD_YES vs RSVPD
RSVPD {response: "yes"}
Cypher version: CYPHER 2.3,
planner: COST.
688635 total db hits in 232 ms.
vs
RSVPD_YES
Cypher version: CYPHER 2.3,
planner: COST.
559866 total db hits in 207 ms.
Why would we keep RSVPD?
MATCH (m:Member)-[r:RSVPD]->(event)<-[:HOSTED_EVENT]-(group)
WHERE m.name = "Mark Needham"
RETURN event, group, r
MATCH (m:Member)-[r:RSVPD_YES|:RSVPD_NO|:RSVPD_WAITLIST]->(event),
(event)<-[:HOSTED_EVENT]-(group)
WHERE m.name = "Mark Needham"
RETURN event, group, r
Tip: Specific relationships
‣ Neo4j is optimised for querying by specific
relationship types…
‣ ...but sometimes we pay a query
maintenance cost to achieve this
+ events my friends are attending
There’s an implicit FRIENDS relationship
between people who attended the same events.
Let’s make it explicit.
Before: (M)-[:RSVPD]->(E)<-[:RSVPD]-(M)
After: (M)-[:RSVPD]->(E)<-[:RSVPD]-(M), (M)-[:FRIENDS]-(M)
+ events my friends are attending
MATCH (m1:Member)
WHERE NOT m1:Processed
WITH m1 LIMIT {limit}
MATCH (m1)-[:RSVPD_YES]->(event:Event)<-[:RSVPD_YES]-(m2:Member)
WITH m1, m2, COLLECT(event) AS events, COUNT(*) AS times
WHERE times >= 5
WITH m1, m2, times, [event IN events | SIZE((event)<-[:RSVPD_YES]-())] AS attendances
WITH m1, m2, REDUCE(score = 0.0, a IN attendances | score + (1.0 / a)) AS score
MERGE (m1)-[friendsRel:FRIENDS]-(m2)
SET friendsRel.score = score
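The score adds 1 / attendance for every event the two members both attended, so a shared event with few attendees counts for more than a shared event with many. The arithmetic can be checked in isolation; the attendance figures below are invented for illustration, and only three events are used to keep the sum short even though the query itself requires at least five shared events.

```cypher
// Hypothetical attendances for three shared events: 10, 20 and 50 people
WITH [10, 20, 50] AS attendances
RETURN REDUCE(score = 0.0, a IN attendances | score + (1.0 / a)) AS score
// 1/10 + 1/20 + 1/50, i.e. roughly 0.17
```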
Bidirectional relationships
‣ You may have noticed that we didn’t specify a
direction when creating the relationship
MERGE (m1)-[:FRIENDS]-(m2)
‣ FRIENDS is a bidirectional relationship. We only
need to create it once between two people.
‣ We ignore the direction when querying
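Ignoring the direction when querying means leaving the arrowhead off the pattern, as in this minimal sketch that lists a member's friends by score:

```cypher
// No arrowhead on FRIENDS: matches the relationship whichever way it was stored
MATCH (me:Member {name: "Mark Needham"})-[friendsRel:FRIENDS]-(friend:Member)
RETURN friend.name, friendsRel.score
ORDER BY friendsRel.score DESC
LIMIT 10
```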
+ events my friends are attending
WITH 24.0*60*60*1000 AS oneDay
MATCH (member:Member {name: "Mark Needham"})
MATCH (futureEvent:Event)
WHERE futureEvent.time >= timestamp() AND NOT (member)-[:RSVPD]->(futureEvent)
MATCH (futureEvent)<-[:HOSTED_EVENT]-(group)
WITH oneDay, group, futureEvent, member, EXISTS((group)<-[:MEMBER_OF]-(member)) AS isMember
OPTIONAL MATCH (member)-[rsvp:RSVPD {response: "yes"}]->(pastEvent)<-[:HOSTED_EVENT]-(group)
WHERE pastEvent.time < timestamp()
WITH oneDay, group, futureEvent, member, isMember, COUNT(rsvp) AS previousEvents
OPTIONAL MATCH (futureEvent)<-[:HOSTED_EVENT]-()-[:HAS_TOPIC]->(topic)<-[:INTERESTED_IN]-(member)
WITH oneDay, group, futureEvent, member, isMember, previousEvents, COUNT(topic) AS topics
OPTIONAL MATCH (member)-[:FRIENDS]-(:Member)-[rsvpYes:RSVPD_YES]->(futureEvent)
RETURN group.name, futureEvent.name, isMember, round((futureEvent.time - timestamp()) / oneDay) AS days,
previousEvents, topics, COUNT(rsvpYes) AS friendsGoing
ORDER BY days, friendsGoing DESC, previousEvents DESC
LIMIT 15
Tip: Bidirectional relationships
‣ Some relationships are bidirectional in nature
‣ Neo4j always stores relationships with a
direction but we can choose to ignore that
when we query
tl;dr
‣ Model incrementally
‣ Always profile your queries
‣ Consider making the implicit explicit…
• ...but beware the maintenance cost
‣ Be specific with relationship types
‣ Ignore direction for bidirectional relationships
That’s all for today!
Questions? :-)
Mark Needham @markhneedham
https://github.com/neo4j-meetups/modeling-worked-example

Tips and Tricks for Graph Data Modeling

  • 1. Graph Modeling Tips and Tricks Mark Needham @markhneedham
  • 4. Recommendation queries
      ‣ Several different types
        • groups to join
        • topics to follow
        • events to attend
      ‣ As a user of meetup.com trying to find groups to join and events to attend
  • 5. How will this talk be structured?
  • 6. Find similar groups to Neo4j As a member of the Neo4j London group I want to find other similar meetup groups So that I can join those groups
  • 7. What makes groups similar?
  • 8. As a member of the Neo4j London group I want to find other similar meetup groups So that I can join those groups Find similar groups to Neo4j
  • 9. As a member of the Neo4j London group I want to find other similar meetup groups So that I can join those groups Nodes
  • 10. As a member of the Neo4j London group I want to find other similar meetup groups So that I can join those groups Relationships
  • 11. As a member of the Neo4j London group I want to find other similar meetup groups So that I can join those groups Labels
  • 12. As a member of the Neo4j London group I want to find other similar meetup groups So that I can join those groups Properties
  • 13. Find similar groups to Neo4j
      MATCH (group:Group {name: "Neo4j - London User Group"})
            -[:HAS_TOPIC]->(topic)<-[:HAS_TOPIC]-(otherGroup)
      RETURN otherGroup.name,
             COUNT(topic) AS topicsInCommon,
             COLLECT(topic.name) AS topics
      ORDER BY topicsInCommon DESC, otherGroup.name
      LIMIT 10
  • 15. Tip: Model incrementally ‣ Build the model for the question we need to answer now, then move on to the next question
  • 16. I’m already a member of these!
  • 17. What other data can we get?
  • 18. Exclude groups I’m a member of As a member of the Neo4j London group I want to find other similar meetup groups that I’m not already a member of So that I can join those groups
  • 19. Exclude groups I’m a member of As a member of the Neo4j London group I want to find other similar meetup groups that I’m not already a member of So that I can join those groups
  • 20. Exclude groups I’m a member of
      MATCH (group:Group {name: "Neo4j - London User Group"})
            -[:HAS_TOPIC]->(topic)<-[:HAS_TOPIC]-(otherGroup:Group)
      RETURN otherGroup.name,
             COUNT(topic) AS topicsInCommon,
             EXISTS((:Member {name: "Mark Needham"})
                    -[:MEMBER_OF]->(otherGroup)) AS alreadyMember,
             COLLECT(topic.name) AS topics
      ORDER BY topicsInCommon DESC
      LIMIT 10
  • 21. Exclude groups I’m a member of
  • 22. Exclude groups I’m a member of
      MATCH (group:Group {name: "Neo4j - London User Group"})
            -[:HAS_TOPIC]->(topic)<-[:HAS_TOPIC]-(otherGroup:Group)
      WHERE NOT( (:Member {name: "Mark Needham"})
                 -[:MEMBER_OF]->(otherGroup) )
      RETURN otherGroup.name,
             COUNT(topic) AS topicsInCommon,
             COLLECT(topic.name) AS topics
      ORDER BY topicsInCommon DESC
      LIMIT 10
  • 23. Exclude groups I’m a member of
  • 24. Find my similar groups As a member of several meetup groups I want to find other similar meetup groups that I’m not already a member of So that I can join those groups
  • 25. Find my similar groups As a member of several meetup groups I want to find other similar meetup groups that I’m not already a member of So that I can join those groups
  • 26. Find my similar groups
      MATCH (member:Member {name: "Mark Needham"})
            -[:INTERESTED_IN]->(topic),
            (member)-[:MEMBER_OF]->(group)-[:HAS_TOPIC]->(topic)
      WITH member, topic, COUNT(*) AS score
      MATCH (topic)<-[:HAS_TOPIC]-(otherGroup)
      WHERE NOT (member)-[:MEMBER_OF]->(otherGroup)
      RETURN otherGroup.name, COLLECT(topic.name), SUM(score) AS score
      ORDER BY score DESC
  • 27. Find my similar groups
  • 29. Oops...Jonny has no interests!
  • 30. What is Jonny interested in? As a member of several meetup groups I want to find other similar meetup groups that I’m not already a member of So that I can join those groups INTERESTED_IN?
  • 31. What is Jonny interested in? There’s an implicit INTERESTED_IN relationship between a member and the topics of the groups they belong to, even when they haven’t expressed an interest in those topics. Let’s make it explicit.
      Before: (P)-[:MEMBER_OF]->(G)-[:HAS_TOPIC]->(T)
      After: the same path, plus (P)-[:INTERESTED_IN]->(T)
  • 32. What is Jonny interested in?
      MATCH (m:Member)-[:MEMBER_OF]->(group)-[:HAS_TOPIC]->(topic)
      WITH m, topic, COUNT(*) AS times
      WHERE times > 3
      MERGE (m)-[:INTERESTED_IN]->(topic)
  • 33. What is Jonny interested in?
  • 34. Tip: Make the implicit explicit ‣ Fill in the missing links in the graph ‣ You could run this type of query once a day during a quiet period ‣ On bigger graphs we’d run it in batches to avoid loading the whole database into memory
  • 35. Find next group people join As a member of a meetup group I want to find out which meetup groups other people join after this one So that I can join those groups
  • 36. Find next group people join
      MATCH (group:Group {name: "Neo4j - London User Group"})
            <-[membership:MEMBER_OF]-(member),
            (member)-[otherMembership:MEMBER_OF]->(otherGroup)
      WHERE membership.joined < otherMembership.joined
      WITH member, otherGroup
      ORDER BY otherMembership.joined
      WITH member, COLLECT(otherGroup)[0] AS nextGroup
      RETURN nextGroup.name, COUNT(*) AS times
      ORDER BY times DESC
  • 37. Find next group people join
  • 38. It feels a bit clunky...
      MATCH (group:Group {name: "Neo4j - London User Group"})
            <-[membership:MEMBER_OF]-(member),
            (member)-[otherMembership:MEMBER_OF]->(otherGroup)
      WHERE membership.joined < otherMembership.joined
      WITH member, otherGroup
      ORDER BY otherMembership.joined
      WITH member, COLLECT(otherGroup)[0] AS nextGroup
      RETURN nextGroup.name, COUNT(*) AS times
      ORDER BY times DESC
      ‣ We have to scan through all the MEMBER_OF relationships to find the one we want
      ‣ It might make our lives easier if we made membership a first class citizen of the domain
  • 41. Refactor to facts
      MATCH (member:Member)-[rel:MEMBER_OF]->(group)
      MERGE (membership:Membership {id: member.id + "_" + group.id})
      SET membership.joined = rel.joined
      MERGE (member)-[:HAS_MEMBERSHIP]->(membership)
      MERGE (membership)-[:OF_GROUP]->(group)
  • 42. Refactor to facts
      MATCH (member:Member)-[:HAS_MEMBERSHIP]->(membership)
      WITH member, membership
      ORDER BY member.id, membership.joined
      WITH member, COLLECT(membership) AS memberships
      UNWIND RANGE(0, SIZE(memberships) - 2) AS idx
      WITH memberships[idx] AS m1, memberships[idx+1] AS m2
      MERGE (m1)-[:NEXT]->(m2)
  • 43. Find next group people join
      MATCH (group:Group {name: "Neo4j - London User Group"})
            <-[:OF_GROUP]-(membership)-[:NEXT]->(nextMembership),
            (membership)<-[:HAS_MEMBERSHIP]-(member:Member)
            -[:HAS_MEMBERSHIP]->(nextMembership),
            (nextMembership)-[:OF_GROUP]->(nextGroup)
      RETURN nextGroup.name, COUNT(*) AS times
      ORDER BY times DESC
  • 44. Comparing the approaches
      MATCH (group:Group {name: "Neo4j - London User Group"})
            <-[membership:MEMBER_OF]-(member),
            (member)-[otherMembership:MEMBER_OF]->(otherGroup)
      WITH member, membership, otherMembership, otherGroup
      ORDER BY member.id, otherMembership.joined
      WHERE membership.joined < otherMembership.joined
      WITH member, membership, COLLECT(otherGroup)[0] AS nextGroup
      RETURN nextGroup.name, COUNT(*) AS times
      ORDER BY times DESC
      vs
      MATCH (group:Group {name: "Neo4j - London User Group"})
            <-[:OF_GROUP]-(membership)-[:NEXT]->(nextMembership),
            (membership)<-[:HAS_MEMBERSHIP]-(member:Member)
            -[:HAS_MEMBERSHIP]->(nextMembership),
            (nextMembership)-[:OF_GROUP]->(nextGroup)
      RETURN nextGroup.name, COUNT(*) AS times
      ORDER BY times DESC
  • 45. How do I profile a query?
      ‣ EXPLAIN
        • shows the execution plan without actually executing it or returning any results.
      ‣ PROFILE
        • executes the statement and returns the results along with profiling information.
  • 46. Neo4j’s longest plan (so far…)
  • 47. Neo4j’s longest plan (so far…)
  • 48. Neo4j’s longest plan (so far…)
  • 49. What is our goal? At a high level, the goal is simple: get the number of db hits down.
  • 50. What is a database hit? “an abstract unit of storage engine work.”
  • 51. Comparing the approaches
      Cypher version: CYPHER 2.3, planner: COST. 111656 total db hits in 330 ms.
      vs
      Cypher version: CYPHER 2.3, planner: COST. 23650 total db hits in 60 ms.
  • 52. Tip: Profile your queries ‣ Spike the different models and see which one performs the best
  • 53. Should we keep both models? We could, but whenever we add, edit or remove a membership we’d have to keep both graph structures in sync.
  • 54. Adding a group membership
      WITH "Mark Needham" AS memberName,
           "Neo4j - London User Group" AS groupName,
           timestamp() AS now
      MATCH (group:Group {name: groupName})
      MATCH (member:Member {name: memberName})
      MERGE (member)-[memberOfRel:MEMBER_OF]->(group)
      ON CREATE SET memberOfRel.joined = now
      MERGE (membership:Membership {id: member.id + "_" + group.id})
      ON CREATE SET membership.joined = now
      MERGE (member)-[:HAS_MEMBERSHIP]->(membership)
      MERGE (membership)-[:OF_GROUP]->(group)
  • 55. Removing a group membership
      WITH "Mark Needham" AS memberName,
           "Neo4j - London User Group" AS groupName,
           timestamp() AS now
      MATCH (group:Group {name: groupName})
      MATCH (member:Member {name: memberName})
      MATCH (member)-[memberOfRel:MEMBER_OF]->(group)
      MATCH (membership:Membership {id: member.id + "_" + group.id})
      MATCH (member)-[hasMembershipRel:HAS_MEMBERSHIP]->(membership)
      MATCH (membership)-[ofGroupRel:OF_GROUP]->(group)
      DELETE memberOfRel, hasMembershipRel, ofGroupRel, membership
  • 57. ...not so fast! As a member of several meetup groups I want to find other similar meetup groups that I’m not already a member of So that I can join those groups
  • 58. Why not delete MEMBER_OF?
      MATCH (member:Member {name: "Mark Needham"})
            -[:MEMBER_OF]->(group)-[:HAS_TOPIC]->(topic)
      WITH member, topic, COUNT(*) AS score
      MATCH (topic)<-[:HAS_TOPIC]-(otherGroup)
      WHERE NOT (member)-[:MEMBER_OF]->(otherGroup)
      RETURN otherGroup.name, COLLECT(topic.name), SUM(score) AS score
      ORDER BY score DESC

      MATCH (member:Member {name: "Mark Needham"})
            -[:HAS_MEMBERSHIP]->()-[:OF_GROUP]->(group:Group)-[:HAS_TOPIC]->(topic)
      WITH member, topic, COUNT(*) AS score
      MATCH (topic)<-[:HAS_TOPIC]-(otherGroup)
      WHERE NOT (member)-[:HAS_MEMBERSHIP]->(:Membership)-[:OF_GROUP]->(otherGroup:Group)
      RETURN otherGroup.name, COLLECT(topic.name), SUM(score) AS score
      ORDER BY score DESC
  • 59. Why not delete MEMBER_OF?
      MATCH (member:Member {name: "Mark Needham"})
            -[:MEMBER_OF]->(group)-[:HAS_TOPIC]->(topic)
      WITH member, topic, COUNT(*) AS score
      MATCH (topic)<-[:HAS_TOPIC]-(otherGroup)
      WHERE NOT (member)-[:MEMBER_OF]->(otherGroup)
      RETURN otherGroup.name, COLLECT(topic.name), SUM(score) AS score
      ORDER BY score DESC

      MATCH (member:Member {name: "Mark Needham"})
            -[:HAS_MEMBERSHIP]->()-[:OF_GROUP]->(group:Group)-[:HAS_TOPIC]->(topic)
      WITH member, topic, COUNT(*) AS score
      MATCH (topic)<-[:HAS_TOPIC]-(otherGroup)
      WHERE NOT (member)-[:HAS_MEMBERSHIP]->(:Membership)-[:OF_GROUP]->(otherGroup:Group)
      RETURN otherGroup.name, COLLECT(topic.name), SUM(score) AS score
      ORDER BY score DESC

      433318 total db hits in 485 ms.
      83268 total db hits in 117 ms.
  • 60. Tip: Maintaining multiple models ‣ A model that performs better for some queries may perform worse for others ‣ Optimising for reads may mean we pay a write and maintenance penalty
  • 62. Events in my groups As a member of several meetup groups who has previously attended events I want to find other events hosted by those groups So that I can attend those events
  • 63. Events in my groups As a member of several meetup groups who has previously attended events I want to find other events hosted by those groups So that I can attend those events
  • 64. Events in my groups
      WITH 24.0*60*60*1000 AS oneDay
      MATCH (member:Member {name: "Mark Needham"}),
            (member)-[:MEMBER_OF]->(group),
            (group)-[:HOSTED_EVENT]->(futureEvent)
      WHERE futureEvent.time >= timestamp()
        AND NOT (member)-[:RSVPD]->(futureEvent)
      RETURN group.name, futureEvent.name,
             round((futureEvent.time - timestamp()) / oneDay) AS days
      ORDER BY days
      LIMIT 10
  • 65. Events in my groups
  • 66. + previous events attended
      WITH 24.0*60*60*1000 AS oneDay
      MATCH (member:Member {name: "Mark Needham"})
      MATCH (futureEvent:Event)
      WHERE futureEvent.time >= timestamp()
        AND NOT (member)-[:RSVPD]->(futureEvent)
      MATCH (futureEvent)<-[:HOSTED_EVENT]-(group)
      WITH oneDay, group, futureEvent, member,
           EXISTS((group)<-[:MEMBER_OF]-(member)) AS isMember
      OPTIONAL MATCH (member)-[rsvp:RSVPD {response: "yes"}]->(pastEvent)<-[:HOSTED_EVENT]-(group)
      WHERE pastEvent.time < timestamp()
      RETURN group.name, futureEvent.name, isMember,
             COUNT(rsvp) AS previousEvents,
             round((futureEvent.time - timestamp()) / oneDay) AS days
      ORDER BY days, previousEvents DESC
  • 67. + previous events attended
  • 68. RSVPD_YES vs RSVPD I was curious whether refactoring RSVPD {response: "yes"} to RSVPD_YES would have any impact as Neo4j is optimised for querying by unique relationship types.
  • 69. Refactor to specific relationships
      MATCH (m:Member)-[rsvp:RSVPD {response:"yes"}]->(event)
      MERGE (m)-[rsvpYes:RSVPD_YES {id: rsvp.id}]->(event)
      ON CREATE SET rsvpYes.created = rsvp.created,
                    rsvpYes.lastModified = rsvp.lastModified;

      MATCH (m:Member)-[rsvp:RSVPD {response:"no"}]->(event)
      MERGE (m)-[rsvpNo:RSVPD_NO {id: rsvp.id}]->(event)
      ON CREATE SET rsvpNo.created = rsvp.created,
                    rsvpNo.lastModified = rsvp.lastModified;
  • 70. RSVPD_YES vs RSVPD
      RSVPD {response: "yes"} vs RSVPD_YES
      Cypher version: CYPHER 2.3, planner: COST. 688635 total db hits in 232 ms.
      Cypher version: CYPHER 2.3, planner: COST. 559866 total db hits in 207 ms.
  • 71. Why would we keep RSVPD?
      MATCH (m:Member)-[r:RSVPD]->(event)<-[:HOSTED_EVENT]-(group)
      WHERE m.name = "Mark Needham"
      RETURN event, group, r

      MATCH (m:Member)-[r:RSVPD_YES|:RSVPD_NO|:RSVPD_WAITLIST]->(event),
            (event)<-[:HOSTED_EVENT]-(group)
      WHERE m.name = "Mark Needham"
      RETURN event, group, r
  • 72. Tip: Specific relationships ‣ Neo4j is optimised for querying by unique relationship types… ‣ ...but sometimes we pay a query maintenance cost to achieve this
  • 73. + events my friends are attending There’s an implicit FRIENDS relationship between people who attended the same events. Let’s make it explicit.
      Before: (M)-[:RSVPD]->(E)<-[:RSVPD]-(M)
      After: the same pattern, plus (M)-[:FRIENDS]-(M)
  • 74. + events my friends are attending
      MATCH (m1:Member)
      WHERE NOT m1:Processed
      WITH m1 LIMIT {limit}
      MATCH (m1)-[:RSVPD_YES]->(event:Event)<-[:RSVPD_YES]-(m2:Member)
      WITH m1, m2, COLLECT(event) AS events, COUNT(*) AS times
      WHERE times >= 5
      WITH m1, m2, times,
           [event IN events | SIZE((event)<-[:RSVPD_YES]-())] AS attendances
      WITH m1, m2,
           REDUCE(score = 0.0, a IN attendances | score + (1.0 / a)) AS score
      MERGE (m1)-[friendsRel:FRIENDS]-(m2)
      SET friendsRel.score = score
  • 75. Bidirectional relationships
      ‣ You may have noticed that we didn’t specify a direction when creating the relationship
          MERGE (m1)-[:FRIENDS]-(m2)
      ‣ FRIENDS is a bidirectional relationship. We only need to create it once between two people.
      ‣ We ignore the direction when querying
  • 76. + events my friends are attending
      WITH 24.0*60*60*1000 AS oneDay
      MATCH (member:Member {name: "Mark Needham"})
      MATCH (futureEvent:Event)
      WHERE futureEvent.time >= timestamp()
        AND NOT (member)-[:RSVPD]->(futureEvent)
      MATCH (futureEvent)<-[:HOSTED_EVENT]-(group)
      WITH oneDay, group, futureEvent, member,
           EXISTS((group)<-[:MEMBER_OF]-(member)) AS isMember
      OPTIONAL MATCH (member)-[rsvp:RSVPD {response: "yes"}]->(pastEvent)<-[:HOSTED_EVENT]-(group)
      WHERE pastEvent.time < timestamp()
      WITH oneDay, group, futureEvent, member, isMember,
           COUNT(rsvp) AS previousEvents
      OPTIONAL MATCH (futureEvent)<-[:HOSTED_EVENT]-()-[:HAS_TOPIC]->(topic)<-[:INTERESTED_IN]-(member)
      WITH oneDay, group, futureEvent, member, isMember, previousEvents,
           COUNT(topic) AS topics
      OPTIONAL MATCH (member)-[:FRIENDS]-(:Member)-[rsvpYes:RSVPD_YES]->(futureEvent)
      RETURN group.name, futureEvent.name, isMember,
             round((futureEvent.time - timestamp()) / oneDay) AS days,
             previousEvents, topics,
             COUNT(rsvpYes) AS friendsGoing
      ORDER BY days, friendsGoing DESC, previousEvents DESC
      LIMIT 15
  • 77. + events my friends are attending
  • 78. Tip: Bidirectional relationships ‣ Some relationships are bidirectional in nature ‣ Neo4j always stores relationships with a direction but we can choose to ignore that when we query
  • 79. tl;dr ‣ Model incrementally ‣ Always profile your queries ‣ Consider making the implicit explicit… • ...but beware the maintenance cost ‣ Be specific with relationship types ‣ Ignore direction for bidirectional relationships
  • 80. That’s all for today! Questions? :-) Mark Needham @markhneedham https://github.com/neo4j-meetups/modeling-worked-example
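
As a rough illustration of the profiling advice on slides 45 to 52, the Membership-based query from slide 43 can be prefixed with PROFILE to see where the db hits come from. Nothing here is new apart from the keyword; the labels and relationship types are the ones used in the talk.

```cypher
// PROFILE executes the query and returns the results along with
// profiling information. Swapping the keyword for EXPLAIN shows
// the execution plan without running the query.
PROFILE
MATCH (group:Group {name: "Neo4j - London User Group"})
      <-[:OF_GROUP]-(membership)-[:NEXT]->(nextMembership),
      (membership)<-[:HAS_MEMBERSHIP]-(member:Member)
      -[:HAS_MEMBERSHIP]->(nextMembership),
      (nextMembership)-[:OF_GROUP]->(nextGroup)
RETURN nextGroup.name, COUNT(*) AS times
ORDER BY times DESC
```

Running the MEMBER_OF version from slide 36 the same way gives a second total to compare against.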
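
Slide 34 suggests running the INTERESTED_IN refactoring in batches on bigger graphs. One possible sketch, not necessarily how the talk's own scripts do it, borrows the :Processed label and {limit} parameter seen on slide 74. Marking each member as processed is an assumption added here so that repeated runs move on to new members.

```cypher
// Assumptions: :Processed marks members already handled, and {limit}
// is a parameter supplied by the caller (Cypher 2.3 syntax, as used
// in the talk; later Neo4j versions write it as $limit).
MATCH (m:Member)
WHERE NOT m:Processed
WITH m LIMIT {limit}
SET m:Processed
WITH m
MATCH (m)-[:MEMBER_OF]->(group)-[:HAS_TOPIC]->(topic)
WITH m, topic, COUNT(*) AS times
WHERE times > 3
MERGE (m)-[:INTERESTED_IN]->(topic)
```

The statement would be run repeatedly until no unprocessed members remain, and the :Processed label would need clearing before the next scheduled run.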
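
To make the bidirectional tip on slides 75 and 78 concrete, here is a minimal sketch of a query that reads the FRIENDS relationships without caring which way they were stored. It assumes the score property set on slide 74; the variable names are illustrative.

```cypher
// No arrowhead on FRIENDS: the pattern matches whichever way the
// relationship happened to be stored.
MATCH (me:Member {name: "Mark Needham"})-[friendship:FRIENDS]-(friend:Member)
RETURN friend.name, friendship.score AS score
ORDER BY score DESC
LIMIT 10
```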