This document contains a presentation on using graph databases for recommendations. It begins with an introduction to graphs and graph theory, then discusses what graph databases are and how they are different from relational databases. It explains how graphs are well-suited for complex querying and representing connected data. The presentation describes how recommendation systems work and how graph algorithms and storing recommendation data in a graph structure provide benefits like real-time recommendations, navigating relationships between items, and efficient operations. It concludes with a demonstration, examples, and discussing future events.
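The graph-style recommendation flow summarized above can be sketched without a database at all. This is a minimal illustration with made-up users and items, not the presentation's actual implementation:

```python
from collections import Counter

# A tiny bipartite "likes" graph: user -> set of liked items.
# Users and items are invented illustration data.
likes = {
    "alice": {"Matrix", "Inception", "Memento"},
    "bob": {"Matrix", "Inception", "Tenet"},
    "carol": {"Matrix", "Tenet"},
}

def recommend(user, likes):
    """Recommend items liked by users who share items with `user`,
    excluding what `user` already likes (a two-hop graph traversal)."""
    my_items = likes[user]
    scores = Counter()
    for other, items in likes.items():
        if other == user or not (items & my_items):
            continue  # no shared taste, skip this user
        for item in items - my_items:
            scores[item] += 1  # one vote per co-liking neighbour
    return [item for item, _ in scores.most_common()]

print(recommend("alice", likes))  # Tenet is liked by both of alice's neighbours
```

In a graph database the same two-hop walk would be a single query over the relationship structure, which is why such lookups stay fast at recommendation time.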
This document discusses using Neo4J to model configuration management databases and hobby projects. It describes converting relational models to graph databases by making components nodes and relations relationships. This allows visualizing the data and running queries to understand dependencies. The document provides an example of modeling an Oracle database configuration and a running competition as a graph. It recommends using Python and libraries like py2neo and Flask to quickly build a web interface for visualizing and interacting with the graph models.
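As a rough illustration of the dependency queries such a graph model enables (component names are invented, and a real setup would use Neo4j rather than a dict):

```python
# Hypothetical CMDB modelled as a graph: component -> components it depends on.
deps = {
    "webshop": ["app_server"],
    "app_server": ["oracle_db"],
    "oracle_db": ["storage", "linux_host"],
    "storage": [],
    "linux_host": [],
}

def transitive_deps(component, deps):
    """Depth-first walk answering 'what does this component ultimately depend on?'"""
    seen = set()
    stack = [component]
    while stack:
        for d in deps[stack.pop()]:
            if d not in seen:
                seen.add(d)
                stack.append(d)
    return seen

print(sorted(transitive_deps("webshop", deps)))
```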
Graph databases store data in graph structures with nodes, edges, and properties. Neo4j is a popular open-source graph database that uses a property graph model. It has a core API for programmatic access, indexes for fast lookups, and Cypher for graph querying. Neo4j provides high availability through master-slave replication and scales horizontally by sharding graphs across instances through techniques like cache sharding and domain-specific sharding.
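A property graph can be mimicked in a few lines to make the model concrete. This toy structure is only an illustration, with none of Neo4j's storage engine, indexing, or Cypher layer:

```python
class PropertyGraph:
    """Minimal in-memory property graph: nodes and relationships both carry properties."""

    def __init__(self):
        self.nodes = {}   # node_id -> properties dict
        self.edges = []   # (source, rel_type, target, properties)

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props

    def add_edge(self, source, rel_type, target, **props):
        self.edges.append((source, rel_type, target, props))

    def neighbours(self, node_id, rel_type=None):
        """Follow outgoing relationships, optionally filtered by type."""
        return [t for s, r, t, _ in self.edges
                if s == node_id and (rel_type is None or r == rel_type)]

g = PropertyGraph()
g.add_node("alice", label="Person", age=34)
g.add_node("neo4j", label="Database")
g.add_edge("alice", "USES", "neo4j", since=2019)
print(g.neighbours("alice", "USES"))  # ['neo4j']
```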
AI from your data lake: Using Solr for analytics - DataWorks Summit
Introductory technical session on Apache Solr's (HDP Search) artificial intelligence and machine learning features to discover relationships and insights across big data in the enterprise. Discussions will include how Solr performs graph traversal, anomaly detection, NLP and time-series analysis, and how you can display this data to users with easy-to-create dashboards.
This technical session will review Apache Solr’s streaming expressions, which were introduced in Solr 6.5. With over 100 expressions and evaluators, plus conditional logic, variables, and data structures, these functions bring many features from the relational world into search. Together they form a powerful functional programming language that enables many parallel computing use cases, such as anomaly detection, streaming NLP, graph traversal, and time-series analysis.
In order to discover and analyze big data, third party tools such as Jupyter, Tableau, and Lucidworks Insights will be reviewed.
Speaker
Cassandra Targett, Lucidworks, Director of Engineering
Marcelline Saunders, Lucidworks, Director, Global Partner Enablement
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ... - Lucidworks
This document summarizes a presentation about query-time nonparametric regression and time routed aliases in Solr. It discusses how nonparametric multiplicative regression was used to continuously predict user interests for an online career coaching system based on click-through data. It also describes how time routed aliases in Solr provide a built-in way to implement time-partitioned indexing of timestamped data across multiple collections while automatically adding and removing collections over time.
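The talk uses nonparametric multiplicative regression; as a hedged stand-in, here is the simpler Nadaraya-Watson kernel estimator, which shows the core idea of nonparametric regression (all data points are invented):

```python
import math

def kernel_regress(x_query, xs, ys, bandwidth=1.0):
    """Nadaraya-Watson estimator: a weighted average of observed ys,
    weighted by a Gaussian kernel on distance to the query point."""
    weights = [math.exp(-((x_query - x) ** 2) / (2 * bandwidth ** 2)) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

# Made-up click-through observations: position on some axis -> interest score.
xs = [1, 2, 3, 4, 5]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]
print(round(kernel_regress(3, xs, ys), 3))  # pulled toward the local mean, 6.0
```

Because no parametric form is assumed, the prediction at a query point depends only on nearby observations, which is what makes query-time evaluation feasible.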
Graph databases are used to represent graph structures with nodes, edges and properties. Neo4j, an open-source graph database, is reliable and fast for managing and querying highly connected data. This session explores how to install and configure Neo4j, create nodes and relationships, query with the Cypher Query Language, import data, and use Neo4j in concert with SQL Server, providing answers and insight, with visual diagrams, about the connected data you have in your SQL Server databases!
Introduction to Machine Learning for Oracle Database Professionals - Alex Gorbachev
This document summarizes a presentation on practical machine learning for database administrators. It discusses using machine learning to classify PL/SQL code as good or bad, classify database schemas, cluster SQL statements, and detect anomalies in database workloads. The presentation covers what machine learning is, why it can be useful for databases, and provides examples of applying machine learning to common DBA problems like code classification. It describes building a naive Bayes classification model in Oracle to classify PL/SQL code, including extracting text features, training and testing the model, and assessing performance.
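A naive Bayes text classifier of the kind described can be sketched in plain Python (the code snippets and labels are invented; the talk builds its model inside Oracle):

```python
import math
from collections import Counter

# Toy training set: PL/SQL-ish snippets labelled good/bad (illustration only).
train = [
    ("open cursor fetch close", "good"),
    ("bulk collect limit forall", "good"),
    ("loop commit inside loop", "bad"),
    ("select star from table in loop", "bad"),
]

def fit(train):
    """Count word frequencies per label: the 'text features' of the model."""
    word_counts = {"good": Counter(), "bad": Counter()}
    label_counts = Counter()
    for text, label in train:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    return word_counts, label_counts

def predict(text, word_counts, label_counts):
    """Pick the label maximizing log P(label) + sum log P(word | label)."""
    vocab = {w for c in word_counts.values() for w in c}
    best, best_lp = None, -math.inf
    for label, lc in label_counts.items():
        total = sum(word_counts[label].values())
        lp = math.log(lc / sum(label_counts.values()))
        for w in text.split():
            # Laplace smoothing so unseen words don't zero out the probability.
            lp += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        if lp > best_lp:
            best, best_lp = label, lp
    return best

wc, lc = fit(train)
print(predict("commit inside loop", wc, lc))
```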
This document provides an overview of effective big data visualization. It discusses information visualization and data visualization, including common chart types like histograms, scatter plots, and dashboards. It covers visualization goals, considerations, processes, basics, and guidelines. Examples of good visualization are provided. Tools for creating infographics are listed, as are resources for learning more about data visualization and references. Overall, the document serves as a comprehensive introduction to big data visualization.
Data Science for Dummies - Data Engineering with Titanic dataset + Databricks... - Rodney Joyce
Number 2 in the Data Science for Dummies series - we'll predict Titanic survival with Databricks, Python, and Spark ML.
These are the slides only (excuse the PowerPoint animation issues) - check out the actual tech talk on YouTube: https://rodneyjoyce.home.blog/2019/05/03/data-science-for-dummies-machine-learning-with-databricks-python-sparkml-tech-talk-1-of-7/
If you have not used Databricks before check out the first talk - Databricks for Dummies.
Here's the rest of the series: https://rodneyjoyce.home.blog/tag/data-science-for-dummies/
1) Data Science overview with Databricks
2) Titanic survival prediction with Azure Machine Learning Studio + Kaggle
3) Data Engineering with Titanic dataset + Databricks + Python
4) Titanic with Databricks + Spark ML
5) Titanic with Databricks + Azure Machine Learning Service
6) Titanic with Databricks + MLS + AutoML
7) Titanic with Databricks + MLFlow
8) Titanic with .NET Core + ML.NET
9) Deployment, DevOps/MLOps and Productionisation
Sparking Science up with Research Recommendations by Maya Hristakeva - Spark Summit
Mendeley Suggest is a personalized article recommender system that recommends relevant research articles to researchers. It uses various recommender algorithms like collaborative filtering and content-based filtering. Spark has proven to be a good alternative to Mahout for the computation layer, though some tuning is required. User-based collaborative filtering has been shown to outperform item-based collaborative filtering and matrix factorization methods for Mendeley Suggest. Offline evaluation is important before deploying recommendations online to test performance and quality.
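User-based collaborative filtering, which the summary says performed best, can be illustrated in a few lines (ratings are made up; a production system like Mendeley Suggest runs at far larger scale on Spark):

```python
import math

# Invented reader -> {article: rating} data.
ratings = {
    "u1": {"a": 5, "b": 3, "c": 4},
    "u2": {"a": 4, "b": 3, "d": 5},
    "u3": {"c": 2, "d": 4, "e": 5},
}

def cosine(r1, r2):
    """Cosine similarity between two sparse rating vectors."""
    shared = set(r1) & set(r2)
    if not shared:
        return 0.0
    dot = sum(r1[i] * r2[i] for i in shared)
    n1 = math.sqrt(sum(v * v for v in r1.values()))
    n2 = math.sqrt(sum(v * v for v in r2.values()))
    return dot / (n1 * n2)

def recommend(user, ratings):
    """Score unseen items by similarity-weighted ratings from other users."""
    scores = {}
    for other, r in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], r)
        for item, val in r.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * val
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("u1", ratings))
```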
Using Apache Arrow, Calcite, and Parquet to Build a Relational Cache - Dremio Corporation
From DataEngConf 2017 - Everybody wants to get to data faster. As we move from general solutions to specific optimization techniques, the performance impact grows. This talk discusses how layering in-memory caching, columnar storage and relational caching can combine to provide a substantial improvement in overall data science and analytical workloads. It includes a detailed overview of how you can use Apache Arrow, Calcite and Parquet to achieve orders-of-magnitude improvements in performance over what is currently possible.
Tired of seeing the loading spinner of doom while trying to analyze your big data on Tableau? Learn how Jethro accelerates your database so you can interactively analyze your big data on Tableau and gain the crucial insights that you need without losing your train of thought. Jethro enables you to be completely flexible with no need for partitions in order to speed up the data. This presentation will explain how indexing is a superior architecture for the BI use case when dealing with big data while compared to MPP architecture.
Alex Mang - Patterns for Scalability in Microsoft Azure Applications - Codecamp Romania
The document discusses patterns for scalability in Microsoft Azure applications. It covers queue-based load leveling, competing consumers, and priority queue patterns for handling application load and message processing. It also discusses materialized view and sharding patterns for scaling databases, where materialized views optimize queries and sharding partitions data horizontally across multiple servers. The talk includes demos of priority queue and sharding patterns to illustrate their implementations.
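The sharding pattern mentioned above boils down to routing each record by a stable hash of its key. A minimal sketch with invented shard names:

```python
import hashlib

SHARDS = ["shard0", "shard1", "shard2"]

def shard_for(key, shards=SHARDS):
    """Route a record to a shard by hashing its key (horizontal partitioning).
    md5 gives a hash that is stable across processes, unlike Python's
    built-in hash(), so every node routes the same key the same way."""
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return shards[digest % len(shards)]

# Every lookup for the same customer id lands on the same shard.
print(shard_for("customer:42") == shard_for("customer:42"))  # True
```

Note that modulo-based routing reshuffles most keys when the shard count changes; consistent hashing is the usual refinement when shards are added and removed frequently.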
This document provides an overview of Neo4j, a graph database management system. It discusses how Neo4j stores data as nodes and relationships, allowing for fast querying of connected data. Traditional relational databases struggle with complex relationships, while NoSQL databases don't support relationships at all. Neo4j addresses these issues through its native graph storage and processing capabilities. The document highlights key Neo4j features like scalability, high performance, and its Cypher query language.
ModelDB: A System to Manage Machine Learning Models: Spark Summit East talk b... - Spark Summit
Building a machine learning model is an iterative process. A data scientist will build many tens to hundreds of models before arriving at one that meets some acceptance criteria. However, the current style of model building is ad-hoc and there is no practical way for a data scientist to manage models that are built over time. In addition, there are no means to run complex queries on models and related data.
In this talk, we present ModelDB, a novel end-to-end system for managing machine learning (ML) models. Using client libraries, ModelDB automatically tracks and versions ML models in their native environments (e.g. spark.ml, scikit-learn). A common set of abstractions enable ModelDB to capture models and pipelines built across different languages and environments. The structured representation of models and metadata then provides a platform for users to issue complex queries across various modeling artifacts. Our rich web frontend provides a way to query ModelDB at varying levels of granularity.
ModelDB has been open-sourced at https://github.com/mitdbg/modeldb.
Tech-Talk at Bay Area Spark Meetup
Apache Spark(tm) has rapidly become a key tool for data scientists to explore, understand and transform massive datasets and to build and train advanced machine learning models. The question then becomes: how do I deploy these models to a production environment? How do I embed what I have learned into customer-facing data applications? Like all things in engineering, it depends.
In this meetup, we will discuss best practices from Databricks on how our customers productionize machine learning models and do a deep dive with actual customer case studies and live demos of a few example architectures and code in Python and Scala. We will also briefly touch on what is coming in Apache Spark 2.X with model serialization and scoring options.
Scoring at Scale: Generating Follow Recommendations for Over 690 Million Link... - Databricks
The Communities AI team at LinkedIn generates follow recommendations from a large (tens of millions) set of entities for each of our 690+ million members.
knowIT is a collaborative semantic wiki used by Johnson & Johnson to map their IT systems, applications, servers and stakeholders. It aims to capture knowledge about these informatics systems, their relationships and components to answer questions, facilitate knowledge sharing and enable self-service. The wiki uses Semantic MediaWiki and has grown to include systems portfolio management, configuration management and other features to increase IT systems knowledge across the organization.
Options for Data Prep - A Survey of the Current Market - Dremio Corporation
Data comes in many shapes and sizes, and every company struggles to find ways to transform, validate, and enrich data for multiple purposes. The problem has been around as long as data, and the market has an overwhelming number of options. In this presentation we look at the problem and key options from vendors in the market today. Dremio is a new approach that eliminates the need for stand alone data prep tools.
How to Survive as a Data Architect in a Polyglot Database World - Karen Lopez
Karen Lopez talks to data architects and data modelers about how they can best deliver value on modern data-driven projects beyond relational database technologies. She covers NoSQL databases and datastores, which scenarios they best fit and which they don't. She ends with 10 tips for adding more value to polyglot database solutions.
Practical Machine Learning for Smarter Search with Solr and Spark - Jake Mannix
This document discusses using Apache Spark and Apache Solr together for practical machine learning and data engineering tasks. It provides an overview of Spark and Solr, why they are useful together, and then gives an example of exploring and analyzing mailing list archives by indexing the data into Solr with Spark and performing both unsupervised and supervised machine learning techniques.
Programming data access is probably one of the most common tasks in building enterprise solutions. One way or another, we have to store state and data, and relational databases are probably the most common form of storage. The drawback is that object-oriented programming does not map particularly well onto relational tables.
This talk covers the problems that arise when designing the data layer, and how best to bridge the gap between classes in code and tables in the database.
How Graphs Revolutionize Access Management (2014-10-15) - Rik Van Bruggen
This document discusses how graph databases can revolutionize access and identity management. It begins with an introduction to graphs and graph databases, explaining how they are well-suited for complex querying of connected data. The document then argues that graph databases allow for a more accurate representation of real-world identity relationships, which are often multi-dimensional, and enable real-time queries that eliminate the need for integration between different systems. A demonstration of a graph database is provided, followed by examples, licensing information and a question and answer section.
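The core claim, that access checks are path queries over an identity graph, can be shown with a toy two-hop model (users, groups and resources are invented):

```python
# Hypothetical identity graph: user -> groups, group -> resources.
member_of = {"dave": ["admins", "devs"], "erin": ["devs"]}
grants = {"admins": ["prod_db"], "devs": ["ci_server"]}

def can_access(user, resource):
    """Access check as a two-hop traversal:
    user -MEMBER_OF-> group -GRANTS-> resource."""
    return any(resource in grants.get(g, []) for g in member_of.get(user, []))

print(can_access("dave", "prod_db"), can_access("erin", "prod_db"))
```

Real identity relationships add more dimensions (time bounds, delegation, role hierarchies), which is exactly where a graph model keeps the query a path pattern instead of a pile of joins.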
Graphgen aims to help people prototype a graph database by providing a visual tool that eases the generation of nodes and relationships with a Cypher DSL.
Many people struggle not only with creating a good graph model of their domain but also with creating sensible example data to test hypotheses or use cases.
Graphgen targets people with little time but a good enough understanding of their domain model, offering a visual DSL for data model generation that borrows heavily from the Neo4j Cypher graph query language.
The ASCII-art syntax allows even non-technical users to write and read model descriptions that are as concise as plain English yet formal enough to be parseable. The underlying generator takes the DSL inputs (structure, cardinalities and amount ranges) and combines them with a comprehensive fake-data generation library to create realistic datasets of arbitrary size and complexity.
Users can create their own models from the basic building blocks of the DSL and share their data descriptions with others via a simple link.
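A generator in the spirit described, combining structural rules and cardinalities with a pool of fake values, might look like this sketch (names, labels, and probabilities are all invented; Graphgen's own DSL and output are richer):

```python
import random

random.seed(7)  # deterministic output for the example

FIRST_NAMES = ["Ada", "Linus", "Grace", "Alan"]
COMPANIES = ["Acme", "Globex"]

def generate(n_people, works_at_prob=0.8):
    """Generate (Person)-[:WORKS_AT]->(Company) style data: structure and
    cardinality come from the model description, values from a fake-data pool."""
    nodes, rels = [], []
    for i in range(n_people):
        person = {"id": f"p{i}", "label": "Person",
                  "name": random.choice(FIRST_NAMES)}
        nodes.append(person)
        if random.random() < works_at_prob:
            rels.append((person["id"], "WORKS_AT", random.choice(COMPANIES)))
    return nodes, rels

nodes, rels = generate(5)
print(len(nodes), len(rels))
```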
GraphGen: Conducting Graph Analytics over Relational Databases - PyData
This document discusses GraphGen, a tool for conducting graph analytics over relational databases. It begins by introducing graph analytics and its applications. It then discusses the current state of graph analytics, which is fragmented with no single solution. Most organizations store data relationally and have "hidden" graphs that can be extracted. GraphGen provides a declarative language to define nodes and edges to extract these graphs without ETL. It supports various interfaces like Java, Python, and a web application to enable graph analytics over relational data in an intuitive way.
This document discusses document classification using graphs and Neo4j. It introduces hierarchical pattern recognition (HPR) for graph-based document classification. HPR learns deep feature representations in a hierarchy using finite state machines. The features are mapped to a vector space model for classification. The document demonstrates HPR by classifying US presidential speeches by political affiliation, achieving over 70% similarity for predicted vs actual labels. It encourages attendees to get involved in the Neo4j community.
Wanderu – Lessons from Building a Travel Site with Neo4j - Eddy Wong @ GraphC... - Neo4j
Wanderu is a consumer-focused search engine for buses and trains. Eddy will recount the architectural, modeling and other technical “lessons learned” and “lessons unlearned” in implementing our geospatial and search features using Neo4j in the context of a NoSQL polyglot solution.
This document provides an introduction to data modeling with Neo4j. It discusses modeling complex data as a graph using nodes, relationships, properties and labels. It introduces Neo4j as a graph database and its data model of labeled property graphs. It also provides an overview of the Cypher query language and includes an example of modeling a domain to find people with similar skills within a company.
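The "people with similar skills" example can be prototyped outside the database with set overlap. A sketch with invented people and skills:

```python
skills = {
    "ann": {"python", "neo4j", "cypher"},
    "ben": {"python", "cypher", "sql"},
    "cat": {"java", "spring"},
}

def most_similar(person, skills):
    """Rank colleagues by Jaccard overlap of skill sets: the kind of question
    a (:Person)-[:HAS_SKILL]->(:Skill) graph answers with one Cypher pattern."""
    def jaccard(a, b):
        return len(a & b) / len(a | b)
    others = [p for p in skills if p != person]
    return max(others, key=lambda p: jaccard(skills[person], skills[p]))

print(most_similar("ann", skills))  # ben shares python and cypher
```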
Designing and Building a Graph Database Application – Architectural Choices, ...Neo4j
Ian closely looks at design and implementation strategies you can employ when building a Neo4j-based graph database solution, including architectural choices, data modelling, and testing.
Designing and Building a Graph Database Application - Ian Robinson (Neo Techn...jaxLondonConference
Presented at JAX London
In this session we'll look at some of the design and implementation strategies you can employ when building a Neo4j-based graph database solution, including architectural choices, data modelling, and testing.
Introducing the MySQL Workbench CASE toolAndrás Bögöly
Introducing the ER Model, and the The MySQL Workbench CASE tool with its Database modeling, database SQL development and some aspects of the change management capabilities.
Efficient Rails Test Driven Development (class 3) by Wolfram ArnoldMarakana Inc.
Learn how to apply the test-first approach to all of your Rails projects. In this six class series, experienced Rails engineer and consultant, Wolfram Arnold applies his real-world perspective to teaching you effective patterns for testing.
In this third of six classes, Wolf covers:
- Controller testing
- View, Helper, Routes Testing
- How much is enough? How much is too much?
** You can get the video and source code from this presentation at: http://marakana.com/f/201 **
All six classes will be available online, so stay tuned! And be sure to check out marakana.com/techtv for more videos on open source training.
Presented by: Wolfram Arnold, in collaboration with Sarah Allen, BlazingCloud.net
Produced by: Marakana
The document discusses how property graph databases like Neo4j can model and query relationship data more effectively than relational or other NoSQL databases. It provides examples of modeling user, movie, and product data as graphs and executing queries in Cypher. It also discusses using the Java Core API and Traversal API to navigate graph data and developing recommendation systems and applications for fraud detection by analyzing patterns in user behaviors and connections.
This document discusses database normalization and Anchor Modeling. It provides reasons for normalizing databases like reducing redundancy and improving integrity. It also lists common objections to normalization. The document then introduces Anchor Modeling as an agile technique for maintaining a highly normalized data model that is easy to evolve over time without downtime. Key aspects of Anchor Modeling like anchors, attributes, ties and knots are explained. Finally, examples are provided of how to model requirements and handle changes using this approach.
The document provides information on entity relationship diagrams (ERDs), including their objectives, components, and how to construct them. An ERD is a graphical representation of entities, attributes, and relationships within a database. It serves as a design tool, documentation, and means to communicate the logical structure. Key aspects covered include identifying entities and attributes, defining relationships and cardinalities, and using standard symbols and notations to draw the ERD.
The document provides an overview of conceptual database design using entity-relationship (ER) modeling. It defines key concepts in ER diagrams like entities, attributes, relationships and their cardinalities. It explains how to model different relationship types like one-to-one, one-to-many and many-to-many. It also covers advanced topics such as weak entities, generalization, specialization and aggregation. The overall purpose is to illustrate how ER diagrams can be used to design databases by visually representing the entities, attributes, and relationships in a domain.
This document discusses entity-relationship (ER) modeling and ER diagrams. It defines key concepts such as entities, attributes, relationships, and cardinalities. It explains how ER diagrams visually represent these concepts using symbols like rectangles, diamonds, and lines. The document also covers ER diagram notation for different types of attributes, keys, roles, and relationship cardinalities. The goal of ER modeling and diagrams is to conceptualize a database without technical details.
The document provides an overview of entity-relationship (ER) modeling concepts used in database design. It defines key terms like entities, attributes, relationships, and cardinalities. It explains how ER diagrams visually represent these concepts using symbols like rectangles, diamonds, and lines. The document also discusses entity types, relationship degrees, key attributes, weak entities, and how to model one-to-one, one-to-many, many-to-one, and many-to-many relationships. Overall, the document serves as a guide to basic ER modeling principles for conceptual database design.
1) The document describes an entity-relationship (ER) diagram for a university database. It identifies the main entities as Department, Course, Module, Lecturer, and Student.
2) The key relationships are that a Department offers multiple Courses, a Course includes multiple Modules, a Lecturer teaches multiple Modules, and a Student enrolls in a Course and takes the Modules required to complete it.
3) The document explains the different components of an ER diagram, including entities, relationships, attributes, keys, and relationship types (one-to-one, one-to-many, many-to-many). It provides examples of how to map an ER diagram to database tables.
The student will understand the basics of the Relational Database Model.
The student will learn Database Administration functions as appropriate for software developers.
The student will learn SQL.
The student will become familiar with the entire implementation cycle of a client server application.
And, you will build one.
The document discusses the entity-relationship (ER) model for conceptual database design. It describes the basic constructs of the ER model including entities, attributes, relationships, keys, and various modeling choices. The ER model is useful for capturing the semantics of an application domain and producing a conceptual schema before logical and physical design.
Exploring NoSQL and implementing through CassandraDileep Kalidindi
This document provides an overview of NoSQL databases and Apache Cassandra. It discusses how data and data modeling have evolved with big data. It introduces key concepts like CAP theorem and ACID vs BASE. It describes various NoSQL implementations like key-value, document, and column-oriented databases. It provides details on Cassandra's architecture, data model, and operational aspects. The document demonstrates Cassandra configuration, CQL usage, and monitoring tools.
This document provides an introduction to database management systems (DBMS). It defines key terminology related to databases and discusses problems with manual databases. It describes the functions and advantages of DBMS, including data representation, transaction management, data sharing, and increased security. Examples of popular DBMS are provided, such as Oracle, Microsoft SQL Server, and MySQL. Database system architecture, data models, and the relational model are overviewed. Finally, entity relationship (ER) modeling is explained as a way to conceptualize data needs and design the database logically before implementation.
HBase and Drill: How Loosely Typed SQL is Ideal for NoSQLMapR Technologies
From the Hadoop Summit 2015 Session with Ted Dunning:
The Apache HBase approach to data has a huge potential for expressing NoSQL-y, non-relational programs. Apache Drill supports SQL for non-relational data. Paradoxically, combining this NoSQL with this SQL tool results in something even better. I will show and explain how to combine HBase and Drill to access time series data and to support high performance secondary indexing.
HBase and Drill: How loosley typed SQL is ideal for NoSQLDataWorks Summit
The document discusses how complex data structures can be modeled in a database using an extended relational model. It begins with an agenda that includes discussing loose typing, examples of what can be done, and looking at a real database with 10-20x fewer tables. It then contrasts the traditional relational model with HBase and discusses how structuring allows complex objects in fields and references between objects. Examples are given of modeling time-series data and music metadata in fewer tables using these techniques. Apache Drill is presented as a way to perform SQL queries over these complex data structures.
Presented at JavaOne 2013, Tuesday September 24.
"Data Modeling Patterns" co-created with Ian Robinson.
"Pitfalls and Anti-Patterns" created by Ian Robinson.
Similar to 20141216 graph database prototyping ams meetup (20)
1 rik van bruggen - intro and state of the graphRik Van Bruggen
This document provides an agenda for a Graphdb-Brussels Meetup event. The agenda includes:
- A presentation on "The State of the Graph" industry by Rik Van Bruggen of Neo4j (19h).
- Three case study presentations on using graphs for CMDB modeling, mapping IT landscapes, and protein association networks (19h15-20h40).
- Pizza will be served around 19h50.
- The event closes at 21h15. The speakers will provide insights into applications of graph databases.
This document summarizes a method for constructing protein networks from public proteomics data. It involves pairing proteins that co-occur in experiments and mapping these pairs to existing knowledge bases to identify biologically related pairs. Over 2300 protein pairs were found with a Jaccard similarity score of at least 0.4, and 71% of these were known associations according to literature. The associated protein pairs are stored in an online database called Tabloid Proteome that allows visualization of the network and detection of indirect protein relations through graph algorithms.
The document discusses the rise of platforms and artificial intelligence. It describes how platforms allow users to create and consume value, and how developers can extend platform functionality through APIs. The rise of platforms also fueled advances in artificial intelligence, as platforms accumulated large amounts of user data that could be used to train machine learning models. Specific examples of platforms discussed include Facebook, Amazon, and Google.
Reinventing Identity and Access Management with Graph DatabasesRik Van Bruggen
This document discusses reinventing identity and access management (IAM) with graph databases. It notes that traditional IAM systems have static views of identity that cannot handle today's complex, dynamic identities for users, things, and services. Graph databases allow for modeling the flexible, multidimensional relationships between entities that are needed for modern IAM. Neo4j is highlighted as a graph database that can represent complex identity relationships and hierarchies and enable real-time traversal of these relationships for advanced access management and compliance checks. Case studies demonstrate how Neo4j has been used successfully for flexible, high performance IAM applications.
This document provides an overview of how to embed graph visualization in an application using Neo4j and D3.js. It discusses:
- Accessing Neo4j via REST API or embedded mode
- Using D3.js for graph visualization by converting Neo4j data to nodes and links format
- Typical application architectures with Neo4j server and client-side components
- An example workflow of querying Neo4j, converting the response for D3.js force layout, and building the force layout
- Requirements including D3.js force layout, Neo4j REST API, and converting between Neo4j and D3.js data formats
20150619 GOTO Amsterdam Conference - What Business can learn from DatingRik Van Bruggen
Talk about how graph databases are used in the Dating industry, and how that use-case pattern can actually be used by lots of other industries - new and old.
This document provides an introduction and overview of graph databases. It begins with an introduction to graphs and their history, then discusses what graph databases are and how they complement relational databases. It introduces Neo4j as an example graph database and describes its key aspects like the labeled property graph data model and Cypher query language. The document then discusses when graph databases are applicable and provides examples. It demonstrates graph querying and concludes with case studies and next steps.
Rik Van Bruggen from Neo Technology presented on graph databases and data innovation. A survey showed that 34.3% of respondents worked with relational databases, while 38.4% worked with graph databases. Rik encouraged connecting with the graph database community through meetup groups, conferences like GraphConnect, or by following him on social media.
This document discusses how graphs can be used to solve big data problems by enabling insights through modeling and analyzing connections in data. It provides examples of how graphs have transformed industries like consumer web, telecommunications, and databases. Specifically, it outlines how graphs can be used to model reality, store data in high fidelity, look for connections in content, customers, users and products, and build these graph-based insights into applications for recommendations, impact analysis, dependency management, and risk analysis.
201411203 goto night on graphs for fraud detectionRik Van Bruggen
This document discusses how graph databases can be useful for fraud detection. It begins with an introduction to graphs and graph theory, then discusses how graph databases work and their advantages over relational databases for complex querying and modeling connected data. The document notes that fraud detection relies on real-time analysis, complex patterns, and graph algorithms to navigate relationships. It provides a short demonstration and discusses case studies where graph databases have been successfully used for fraud detection due to their ability to efficiently handle large, interconnected datasets.
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
4. Topics
• Graph model building blocks
• Quick intro to Cypher
• Example modeling process
• Modeling tips
• Recipes for common modeling scenarios
• Refactoring
• Test-driven data modeling
9. Nodes
• Used to represent entities and complex value types in your domain
• Can contain properties
  – Used to represent entity attributes and/or metadata (e.g. timestamps, version)
  – Key-value pairs
    • Java primitives
    • Arrays
    • null is not a valid value
  – Every node can have different properties
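As a minimal Cypher sketch of such a node (the property names and values here are invented for illustration; the :Person label is introduced on a later slide):

```cypher
// A node mixing entity attributes (name, languages) with metadata (created).
// null is not a valid property value -- omit the key instead of storing null.
CREATE (p:Person {name: 'Alice', languages: ['en', 'nl'], created: 1418688000})
RETURN p
```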
10. EnEEes
and
Value
Types
• EnEEes
– Have
unique
conceptual
idenEty
– Change
aWribute
values,
but
idenEty
remains
the
same
• Value
types
– No
conceptual
idenEty
– Can
subsEtute
for
each
other
if
they
have
the
same
value
• Simple:
single
value
(e.g.
colour,
category)
• Complex:
mulEple
aWributes
(e.g.
address)
12. Relationships
• Every relationship has a name and a direction
  – Add structure to the graph
  – Provide semantic context for nodes
• Can contain properties
  – Used to represent quality or weight of relationship, or metadata
• Every relationship must have a start node and end node
  – No dangling relationships
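A short Cypher sketch of a named, directed relationship carrying properties (the relationship type and property names are invented for illustration):

```cypher
// KNOWS is the relationship name; the arrow gives it a direction;
// `strength` and `since` are properties qualifying the relationship.
// Both end nodes exist, so the relationship is not dangling.
CREATE (a {name: 'Alice'})-[:KNOWS {strength: 0.8, since: 2012}]->(b {name: 'Bob'})
```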
13. Relationships (continued)
• Nodes can have more than one relationship
• Nodes can be connected by more than one relationship
• Self relationships are allowed
14. Variable Structure
• Relationships are defined with regard to node instances, not classes of nodes
  – Two nodes representing the same kind of "thing" can be connected in very different ways
• Allows for structural variation in the domain
  – Contrast with relational schemas, where foreign key relationships apply to all rows in a table
• No need to use null to represent the absence of a connection
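To sketch this structural variation (data invented for illustration): two nodes of the same kind, one with a WORKS_FOR relationship and one without, and no null placeholder anywhere:

```cypher
// Alice is employed; Bob is not. Bob's node simply lacks the relationship --
// there is no null foreign key standing in for the missing connection.
CREATE (alice:Person {name: 'Alice'})-[:WORKS_FOR]->(acme:Company {name: 'Acme'}),
       (bob:Person {name: 'Bob'})
```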
16. Labels
• Every node can have zero or more labels
• Used to represent roles (e.g. user, product, company)
  – Group nodes
  – Allow us to associate indexes and constraints with groups of nodes
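For example, indexes and constraints attach to a label rather than to individual nodes. A sketch in the Neo4j 2.x-era syntax that matches the `{name}` parameter style used elsewhere in this deck:

```cypher
// Index all nodes labelled :Person by their name property
CREATE INDEX ON :Person(name);

// Require company names to be unique among nodes labelled :Company
CREATE CONSTRAINT ON (c:Company) ASSERT c.name IS UNIQUE;
```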
17. Four Building Blocks
• Nodes – Entities
• Relationships – Connect entities and structure domain
• Properties – Entity attributes, relationship qualities, and metadata
• Labels – Group nodes by role
21. Method
1. Identify application/end-user goals
2. Figure out what questions to ask of the domain
3. Identify entities in each question
4. Identify relationships between entities in each question
5. Convert entities and relationships to paths
  – These become the basis of the data model
6. Express questions as graph patterns
  – These become the basis for queries
22. Application/End-User Goals
As an employee
I want to know who in the company has similar skills to me
So that we can exchange knowledge
23. Questions To Ask of the Domain
As an employee
I want to know who in the company has similar skills to me
So that we can exchange knowledge
Which people, who work for the same company as me, have similar skills to me?
24. Identify Entities
Which people, who work for the same company as me, have similar skills to me?
Person, Company, Skill
25. Identify Relationships Between Entities
Which people, who work for the same company as me, have similar skills to me?
Person WORKS_FOR Company
Person HAS_SKILL Skill
26. Convert to Cypher Paths
Person WORKS_FOR Company
Person HAS_SKILL Skill
(:Person)-[:WORKS_FOR]->(:Company),
(:Person)-[:HAS_SKILL]->(:Skill)
(":Person" is a label; "[:WORKS_FOR]" is a relationship)
30. Express Question as Graph Pattern
Which people, who work for the same company as me, have similar skills to me?
31. Cypher Query
Which people, who work for the same company as me, have similar skills to me?
MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill),
      (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)
WHERE me.name = {name}
RETURN colleague.name AS name,
       count(skill) AS score,
       collect(skill.name) AS skills
ORDER BY score DESC
32. Graph Pattern
Which people, who work for the same company as me, have similar skills to me?
MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill),
      (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)
WHERE me.name = {name}
RETURN colleague.name AS name,
       count(skill) AS score,
       collect(skill.name) AS skills
ORDER BY score DESC
33. Anchor
PaWern
in
Graph
Which
people,
who
work
for
the
same
company
as
me,
have
similar
skills
to
me?
MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill),
(company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)
WHERE me.name = {name}
RETURN colleague.name AS name,
count(skill) AS score,
collect(skill.name) AS skills
ORDER BY score DESC
If an index for Person.name exists, Cypher will use it
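As a sketch, such an index could be created with the Neo4j 2.x-era syntax that matches the `{name}` parameter form used in the queries in this deck:

CREATE INDEX ON :Person(name)

With the index in place, the `WHERE me.name = {name}` predicate anchors the pattern at a single Person node instead of scanning all people.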
34. Create Projection of Results
Which people, who work for the same company as me, have similar skills to me?
MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill),
(company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)
WHERE me.name = {name}
RETURN colleague.name AS name,
count(skill) AS score,
collect(skill.name) AS skills
ORDER BY score DESC
39. From User Story to Model and Query
MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill),
(company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)
WHERE me.name = {name}
RETURN colleague.name AS name,
count(skill) AS score,
collect(skill.name) AS skills
ORDER BY score DESC
As an employee
I want to know who in the company has similar skills to me
So that we can exchange knowledge
Person WORKS_FOR Company
Person HAS_SKILL Skill
(:Company)<-[:WORKS_FOR]-(:Person)-[:HAS_SKILL]->(:Skill)
Which people, who work for the same company as me, have similar skills to me?
42. Use Relationships When…
• You need to specify the weight, strength, or some other quality of the relationship
• AND/OR the attribute value comprises a complex value type (e.g. address)
• Examples:
– Find all my colleagues who are expert (relationship quality) at a skill (attribute value) we have in common
– Find all recent orders delivered to the same delivery address (complex value type)
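The first example might be sketched by qualifying HAS_SKILL with a property on the relationship itself; the `level` property and the value 'expert' are illustrative, not from the deck:

MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill),
      (company)<-[:WORKS_FOR]-(colleague)-[r:HAS_SKILL]->(skill)
WHERE me.name = {name}
  AND r.level = 'expert'
RETURN colleague.name AS name,
       collect(skill.name) AS expertSkills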
43. Use Properties When…
• There’s no need to qualify the relationship
• AND the attribute value comprises a simple value type (e.g. colour)
• Examples:
– Find those projects written by contributors to my projects that use the same language (attribute value) as my projects
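A possible sketch of the projects example, with `language` as a simple property on the project node; the `CONTRIBUTED_TO` relationship name is a hypothetical stand-in:

MATCH (me:Person)-[:CONTRIBUTED_TO]->(myProject),
      (myProject)<-[:CONTRIBUTED_TO]-(contributor),
      (contributor)-[:CONTRIBUTED_TO]->(project)
WHERE me.name = {name}
  AND project.language = myProject.language
RETURN DISTINCT project.name AS project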
44. If Performance is Critical…
• Small property lookup on a node will be quicker than traversing a relationship
– But traversing a relationship is still faster than a SQL join…
• However, many small properties on a node, or a lookup on a large string or large array property, will impact performance
– Always performance test against a representative dataset
46. Align With Use Cases
• Relationships are the “royal road” into the graph
• When querying, well-named relationships help discover only what is absolutely necessary
– And eliminate unnecessary portions of the graph from consideration
51. Events and Actions
• Often involve multiple parties
• Can include other circumstantial detail, which may be common to multiple events
• Examples
– Patrick worked for Acme from 2001 to 2005 as a Software Developer
– Sarah sent an email to Lucy, copying in David and Claire
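The Patrick example involves more than two parties plus circumstantial detail, so it lends itself to an intermediate event node rather than a single relationship. A sketch, with the Employment label and the relationship names chosen for illustration:

CREATE (patrick:Person {name: 'Patrick'}),
       (acme:Company {name: 'Acme'}),
       (role:Role {title: 'Software Developer'}),
       (job:Employment {from: 2001, to: 2005}),
       (patrick)-[:HAD_JOB]->(job),
       (job)-[:AT_COMPANY]->(acme),
       (job)-[:AS_ROLE]->(role)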
52. Timeline Trees
• Discrete events
– No natural relationships to other events
• You need to find events at differing levels of granularity
– Between two days
– Between two months
– Between two minutes
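A timeline tree attaches events to a hierarchy of time nodes so ranges can be queried at year, month, or day granularity. A minimal sketch of the structure, with illustrative labels and relationship names:

CREATE (y:Year {value: 2014}),
       (m:Month {value: 6}),
       (d:Day {value: 23}),
       (y)-[:CONTAINS]->(m),
       (m)-[:CONTAINS]->(d),
       (e:Event {name: 'Product launch'}),
       (d)-[:HAS_EVENT]->(e)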
55. Modeling Entities as Relationships
• Limits data model evolution
– A relationship connects two things
– Modeling an entity as a relationship prevents it from being related to more than two things
• Smells:
– Lots of attribute-like properties
– Heavy use of relationship indexes
• Entities hidden in verbs:
– E.g. emailed, reviewed
56. Example: Movie Reviews
• Initial requirements:
– People review films
– Application aggregates reviews from multiple sites
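“Reviewed” is an entity hidden in a verb: promoting it to a Review node lets one review connect a person, a film, and the site it was aggregated from, which a single REVIEWED relationship could not. A sketch with illustrative labels, property names, and values:

CREATE (alice:Person {name: 'Alice'}),
       (film:Film {title: 'The Matrix'}),
       (site:Site {name: 'SomeReviewSite'}),
       (review:Review {rating: 4}),
       (alice)-[:WROTE]->(review),
       (review)-[:OF]->(film),
       (review)-[:PUBLISHED_ON]->(site)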