BASEL BERN BRUGG DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. GENF
HAMBURG KOPENHAGEN LAUSANNE MÜNCHEN STUTTGART WIEN ZÜRICH
Introduction to DataStax Enterprise (DSE) Graph
Guido Schmutz
@gschmutz guidoschmutz.wordpress.com
Agenda
1. Why a Graph Database?
2. DataStax Enterprise Edition – DSE Graph
3. DSE Graph in Action
• Working with Graph Schemas
• Inserting Graph Data
• Traversing Graph Data
• Indexing Graph Data
4. Summary
Introduction to DataStax Enterprise (DSE) Graph2 9/9/2016
Why a Graph Database?
Know your domain
[Figure: data store types placed by connectedness of data (low to high). Labels: Document Data Store, Key-Value Stores, Wide-Column Store, Graph Databases, Relational Databases]
RDBMS vs. Graph Database
RDBMS
• Querying related data elements with joins is inefficient on large data sets or with many relationships
• Expressing JOIN-intensive queries in SQL is time-consuming and error-prone
Graph Database
• Better performance for relationship queries due to specialized index structures
• Intuitive query language enabling faster application development
RDBMS vs. Graph Database – Data Modelling
[Figure: relational model vs. graph model. The graph side shows the edge labels partner, worksFor, develops, uses, reportsTo and manages. Source: DataStax]
Graph Use Cases
Master Data Management
Customer 360
Recommendation
Personalization
Security
Fraud Detection
Internet of Things (IoT)
Introduction to the Graph Model – Property Graph
Node / Vertex
• Represent Entities
• Can contain properties
(key-value pairs)
Relationship / Edge
• Lines between nodes
• May be directed or undirected
Properties
• Values about a node or relationship
• Allow adding semantics to relationships
[Figure: example property graph]
Vertices:
• User (userId: 16134540, name: Peter, location: Palo Alto)
• User (userId: 18898576, name: Guido, location: Berne)
• Movie (movieId: m1000, title: Sully, year: 2016, duration: 95, country: United States, production: [FilmNation, Flashlight Films])
Edges:
• knows (between the two users)
• rated (from each user to the movie, with rating: 8 and rating: 6)
Properties are key: value pairs.
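The property graph model above can be sketched in a few lines of plain Python. This is a toy in-memory structure and not the DSE Graph API; the class names `Vertex` and `Edge`, and the assignment of the two ratings to the two users, are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    label: str
    out_v: Vertex  # vertex the edge leaves
    in_v: Vertex   # vertex the edge points to
    properties: dict = field(default_factory=dict)


peter = Vertex("user", {"userId": 16134540, "name": "Peter", "location": "Palo Alto"})
guido = Vertex("user", {"userId": 18898576, "name": "Guido", "location": "Berne"})
sully = Vertex("movie", {"movieId": "m1000", "title": "Sully", "year": 2016})

edges = [
    Edge("knows", peter, guido),
    Edge("rated", peter, sully, {"rating": 8}),  # assumed: which user gave which rating
    Edge("rated", guido, sully, {"rating": 6}),
]

# Edge properties add meaning to a relationship: here, the rating that was given.
ratings = [e.properties["rating"] for e in edges if e.label == "rated"]
print(ratings)
```

Keeping properties on the edge itself is what lets a query ask about the relationship (the rating) without a separate join table.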
DataStax Enterprise Edition – DSE Graph
• A scale-out graph database purpose-built for cloud applications that need to act on complex and highly connected relationships.
• Supports a property graph model natively inside the DataStax product, engineered specifically for DataStax Enterprise (Cassandra, Search, Analytics).
• Stores and finds relationships in large graphs quickly and easily.
• Part of DSE’s multi-model platform.
Image:	DataStax
DataStax Enterprise Graph
Real-time graph database management system
• Adopts Apache TinkerPop standards
for data and traversal
• Uses Apache Cassandra for scalable
storage and retrieval
• Leverages Apache Solr for full-text
search and indexing
• Integrates Apache Spark for fast
analytic traversal
• Supports comprehensive data security
for the enterprise
Image:	DataStax
Why Cassandra?
• All nodes participate in a cluster
• Shared nothing
• Add or remove as needed
• More capacity? Add more servers
• Each node owns a range of partitions
• Consistent Hashing
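The last two points can be illustrated with a minimal consistent-hashing ring in plain Python. This is a sketch only: the `Ring` class, the node names and the use of MD5 are assumptions for illustration, and Cassandra's actual partitioner and virtual nodes are not modelled.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a key to a position on the ring (MD5 is used only for illustration)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, nodes):
        # Each node owns the range of tokens that ends at its own token.
        self._entries = sorted((token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._entries]

    def owner(self, partition_key: str) -> str:
        # First node token at or after the key's token, wrapping around the ring.
        i = bisect.bisect_left(self._tokens, token(partition_key))
        return self._entries[i % len(self._entries)][1]


keys = [f"m{i}" for i in range(1000)]
before = Ring(["node1", "node2", "node3"])
after = Ring(["node1", "node2", "node3", "node4"])

# Adding a node only moves the keys that fall into the new node's range.
moved = [k for k in keys if before.owner(k) != after.owner(k)]
print(len(moved), all(after.owner(k) == "node4" for k in moved))
```

The design point is that adding capacity relocates only one range of partitions instead of reshuffling every key, which is what makes "add more servers" cheap.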
DSE Graph - Architecture
[Figure: DSE Graph architecture. Image: DataStax]
Today’s Needs and the solution in DSE Graph
Need | DSE Graph Feature
Store and Access Data Quickly | Graph Data Model + Gremlin
Flexible, Fast Application Builds | DSE Graph + DSE Studio & DSE Drivers
Analyze Information | Graph Analytics with DSE Analytics
Search and Find Quickly | Graph Search with DSE Search
Ingest and Export Data | DSE Graph Loader
Secure Information | DSE Security
Manage and Monitor | OpsCenter
Availability, Scale, Operational Ease, Performance… | DataStax Enterprise
Developer support with DataStax Studio
• Web-based developer tool that helps developers visually explore, query, and troubleshoot DSE Graph in one intuitive UI.
• Auto-completion, result set visualization, execution management, and much more.
Data Loading Support with DSE Graph Loader
• Simplifies loading large amounts of enterprise data from various sources into DSE Graph efficiently and robustly.
• Inspects incoming data for schema compliance.
• Uses declarative data mappings and custom transformations to handle diverse types of data.
[Figure: sources (RDBMS, JSON) feed the Graph Loader (data mappings; batch loading, stream ingestion), which writes to DSE Graph]
Operational support with DataStax OpsCenter
• Web-based operations solution that can launch, manage, monitor and troubleshoot DSE clusters and deployments.
• Launch wizard, failure alerts, monitoring dashboards, and much more.
DSE Graph in Action
Sample Graph: Movies Database
[Figure: schema of the sample movies graph]
Vertex labels:
• Movie (movieId : text, title : text, year : int, duration : int, country : text, production : text*)
• Person (personId : text, name : text)
• User (userId : text, age : int, gender : text)
• Genre (genreId : int, name : text)
Edge labels:
• rated (User to Movie, rating : int)
• knows (User to User)
• belongsTo (Movie to Genre)
• actor, director, composer, screenwriter, cinematographer (Movie to Person)
DSE Graph in Action – Working with Graph Schemas
Graph Schema API
Schema management basics
• Accessible via schema object
• Defined for each graph in the DSE Graph system
• Provides methods propertyKey, vertexLabel, edgeLabel, etc.
• Supports fluent interface
Four important elements of a graph schema
• Property keys
• Vertex labels
• Edge labels
• Indexes
Graph Schema API – Property Keys
Define types of properties to be used with vertices and edges
• Creating single-cardinality property keys (single() is assumed by default)
schema.propertyKey("title").Text().single().create();
schema.propertyKey("year").Int().create();
• Creating multi-property keys (can only be associated with vertices)
schema.propertyKey("production").Text().multiple().create();
• Creating meta-property keys (can only be associated with vertices)
schema.propertyKey("source").Text().create();
schema.propertyKey("date").Timestamp().create();
schema.propertyKey("budget").Text().multiple().
    properties("source","date").create();
[Figure: Movie vertex with movieId: Text, title: Text, year: Int, duration: Int, country: Text, production: Text*]
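The difference between single and multiple cardinality can be sketched with a toy model in plain Python. The `SCHEMA` dictionary and `set_property` helper are hypothetical names used for illustration; they are not part of the DSE Graph schema API.

```python
# Cardinality per property key, mirroring two of the keys defined above.
SCHEMA = {"title": "single", "production": "multiple"}


def set_property(vertex: dict, key: str, value) -> None:
    """Apply a value according to the cardinality of its property key."""
    if SCHEMA[key] == "single":
        vertex[key] = value                       # a new value replaces the old one
    else:
        vertex.setdefault(key, []).append(value)  # values accumulate


movie = {}
set_property(movie, "title", "Sully (working title)")
set_property(movie, "title", "Sully")
set_property(movie, "production", "FilmNation")
set_property(movie, "production", "Flashlight Films")
print(movie)
# {'title': 'Sully', 'production': ['FilmNation', 'Flashlight Films']}
```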
Graph Schema API – Vertex Labels
Define types of vertices to be used in a graph
• Creating vertex labels
• Different labels can use the same property key
schema.vertexLabel("movie").
    properties("movieId","title","year","duration","country","production").
    create();
schema.vertexLabel("genre").properties("genreId","name").create();
schema.vertexLabel("person").properties("personId","name").create();
[Figure: Movie (movieId: Text, title: Text, year: Int, duration: Int, country: Text, production: Text*), Genre (genreId: Text, name: Text), Person (personId: Text, name: Text)]
Graph Schema API – Edge Labels
Define types of edges to be used in a graph
• Creating single-cardinality edge label definition
schema.edgeLabel("rated").
    single().
    properties("rating").
    connection("user","movie").
    create();
• Creating multi-cardinality edge label definition (multiple() is assumed by default)
schema.edgeLabel("actor").
    multiple().
    properties("rating").
    connection("movie","person").
    create();
[Figure: User -[rated (0..1), rating: int]-> Movie; Movie -[actor (0..*)]-> Person]
Graph Schema API – Management of Graph Schemas
• Listing all schema definitions
schema.describe()
• Listing individual schema elements
schema.vertexLabel("user").describe()
schema.edgeLabel("rated").describe()
• Dropping all schema definitions (this also deletes all graph data!)
schema.clear()
DSE Graph in Action – Inserting Graph Data
Creating Vertices and Vertex Properties
Adding a new User vertex
Vertex u = graph.addVertex("user");
u.property("userId","u2016");
u.property("age",36);
u.property("gender","M");
Adding a new Movie vertex
Vertex m = graph.addVertex(label, "movie",
    "movieId", "m1000",
    "title", "Sully",
    "year", 2016,
    "duration", 95,
    "country", "United States");
m.property("production", "FilmNation");
m.property("production", "Flashlight Films");
[Figure: the resulting User and Movie vertices with their properties]
Creating Edges and Edge Properties
Adding an edge between the user vertex and the movie vertex
Vertex u = graph.addVertex("user", ...);
Vertex m = graph.addVertex("movie", ...);
Edge e = u.addEdge("rated", m);
e.property("rating", 8);
[Figure: User -[rated, rating: 8]-> Movie]
Gremlin I/O
Exporting and importing existing graphs and subgraphs
• Migration between Apache TinkerPop-enabled graph databases
• Serialization and interchange of graphs between databases and tools
• Creating backup copies of graph
• Different file formats for graph serialization
Format | Description
GraphML | Common XML vocabulary for graph representation. Supported by many graph-related tools, applications and libraries. "Lossy" format, as it lacks support for complex data types and graph variables.
GraphSON | Apache TinkerPop-originated, JSON-based, lossless format for graph representation.
Gryo | Fast, space-efficient, lossless, binary graph serialization format for use by JVM languages. Gryo is enabled by Kryo, a fast and efficient object graph serialization framework.
Gremlin I/O
Exporting to and importing from a GraphML file
graph.io(IoCore.graphml()).writeGraph("KillrVideo.xml")
graph.io(IoCore.graphml()).readGraph("KillrVideo.xml")
Exporting to and importing from a GraphSON file
graph.io(IoCore.graphson()).writeGraph("KillrVideo.json")
graph.io(IoCore.graphson()).readGraph("KillrVideo.json")
Exporting to and importing from a Gryo file
graph.io(IoCore.gryo()).writeGraph("KillrVideo.kryo")
graph.io(IoCore.gryo()).readGraph("KillrVideo.kryo")
DSE Graph in Action – Traversing Graph Data
Gremlin Traversals in DSE Graph
• Standard Gremlin Language
• Gremlin is defined by Apache TinkerPop
• Expressive fluent language to define traversals (Groovy)
g.V().has("movie","movieId","m366").values("title","year")
• Production vs. Development modes
• Production (default) requires an explicitly defined graph schema and proper graph indexes to avoid expensive scans
schema.config().option("graph.schema_mode").get()
schema.config().option("graph.schema_mode").set("Development")
• Enabling graph scans in production mode (acceptable when scanning small portions of data; caution is advised)
schema.config().option("graph.allow_scan").set(true)
Gremlin Traversals in DSE Graph
OLTP vs. OLAP Traversals
OLTP traversals | OLAP traversals
Generate instantaneous responses | Take longer to execute
Resemble targeted database queries | Involve broader-scope data analysis
Heavily rely on vertex ids or indexes | Require expensive graph scans
Access small subgraphs | Access large subgraphs
Traverse short paths with few branches | Traverse long paths with many branches
Gremlin Traversal Ingredients
• Traversal source
• Traversal steps
• Traverser
g = graph.traversal()
g.V().has("title","Alice in Wonderland").has("year",2010).
    out("director").values("name")
Defining Gremlin Traversals
Linear motif: Traversal is a sequence of steps
g.
V().
has("title","Alice in Wonderland").
has("year",2010).
out("director").
values("name")
// Sample Output:
// Tim Burton
[Figure: Movie (movieId: m267, title: Alice in Wonderland, year: 2010, …) with edges screenwriter to Person (personId: p5206, name: Linda Woolverton), director to Person (personId: p8153, name: Tim Burton) and actor to Person (personId: p4361, name: Johnny Depp)]
Defining Gremlin Traversals
Nested motif: Traversal is a tree of steps
g.
  V().
  has("title","Alice in Wonderland").
  has("year",2010).
  union(__.out("director"),
        __.out("screenwriter")).
  values("name")
// Sample Output:
// Tim Burton
// Linda Woolverton
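The two motifs can be mimicked with plain Python functions over a tiny in-memory graph. This is a toy model and not how Gremlin is implemented; the helper names `V`, `has`, `out` and `values` only echo the Gremlin steps, and the data is the three-person example from the figure.

```python
vertices = {
    "m267": {"label": "movie", "title": "Alice in Wonderland", "year": 2010},
    "p8153": {"label": "person", "name": "Tim Burton"},
    "p5206": {"label": "person", "name": "Linda Woolverton"},
    "p4361": {"label": "person", "name": "Johnny Depp"},
}
edges = [  # (outgoing vertex id, edge label, incoming vertex id)
    ("m267", "director", "p8153"),
    ("m267", "screenwriter", "p5206"),
    ("m267", "actor", "p4361"),
]


def V():
    return list(vertices)


def has(ids, key, value):
    return [v for v in ids if vertices[v].get(key) == value]


def out(ids, label):
    return [dst for v in ids for (src, lbl, dst) in edges if src == v and lbl == label]


def values(ids, key):
    return [vertices[v][key] for v in ids]


# Linear motif: each step consumes the output of the previous step.
step = has(has(V(), "title", "Alice in Wonderland"), "year", 2010)
linear = values(out(step, "director"), "name")
print(linear)  # ['Tim Burton']

# Nested motif: union() runs two branches from the same position and merges them.
nested = values(out(step, "director") + out(step, "screenwriter"), "name")
print(nested)  # ['Tim Burton', 'Linda Woolverton']
```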
Simple Traversal
Navigating from a Vertex:
Step | Description
out([label], …) | Move to the outgoing vertices
in([label], …) | Move to the incoming vertices
both([label], …) | Move to both the incoming and outgoing vertices
outE([label], …) | Move to the outgoing edges
inE([label], …) | Move to the incoming edges
bothE([label], …) | Move to both the incoming and outgoing edges
[Figure: a Movie vertex with its rated, belongsTo and actor edges, illustrating in(), inE(), out(), outE(), both() and bothE()]
Simple Traversal
Navigating from an Edge:
Step | Description
outV() | Move to the outgoing vertices
inV() | Move to the incoming vertices
bothV() | Move to both the incoming and outgoing vertices
otherV() | Move to the vertex that was not the vertex that was moved from
[Figure: User -[rated]-> Movie; outV() yields the User, inV() the Movie, bothV() both]
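The navigation steps in the two tables can be sketched as small Python functions over an edge list. This is an illustrative model only; the vertex and edge ids are made up, and `in_` carries a trailing underscore because `in` is a reserved word in Python.

```python
edges = [  # (edge id, outgoing vertex, label, incoming vertex); ids are hypothetical
    ("e1", "u1", "rated", "m267"),
    ("e2", "m267", "belongsTo", "g1"),
    ("e3", "m267", "actor", "p4361"),
]


def outE(v, label=None):
    return [e for e in edges if e[1] == v and label in (None, e[2])]


def inE(v, label=None):
    return [e for e in edges if e[3] == v and label in (None, e[2])]


def outV(e):
    return e[1]  # the vertex the edge leaves


def inV(e):
    return e[3]  # the vertex the edge points to


def out(v, label=None):
    return [inV(e) for e in outE(v, label)]


def in_(v, label=None):
    return [outV(e) for e in inE(v, label)]


def both(v, label=None):
    return in_(v, label) + out(v, label)


def otherV(e, came_from):
    return inV(e) if outV(e) == came_from else outV(e)


print(out("m267"))                # ['g1', 'p4361']
print(in_("m267"))                # ['u1']
print(both("m267"))               # ['u1', 'g1', 'p4361']
print(otherV(edges[0], "m267"))   # 'u1'
```

Note that `out()` is simply `outE()` followed by `inV()`: the vertex steps skip over the edge, while the edge steps stop on it so that edge properties can be inspected.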
Simple Traversal with in
Find Johnny Depp’s movies released in 2010 or later:
g.V().
hasLabel("person").
has("name","Johnny Depp").
in("actor").
has("year",gte(2010)).
values("title")
// Sample Output:
// "Into the Woods",
// "Pirates of the Caribbean: On Stranger Tides",
// "Alice in Wonderland"
Simple Traversal with in and inE
Find user ratings for Johnny Depp’s movies released in 2010 or later:
g.V().
hasLabel("person").
has("name","Johnny Depp").
in("actor").
has("year",gte(2010)).
inE("rated").
values("rating")
// Sample Output:
// 3, 7, 7, 7, 5, 5, 8, ...
Simple Traversal with in, inE and outV
Find the ages of users who left a 7- or 8-star rating for Johnny Depp’s movies released in 2010 or later:
g.V().
hasLabel("person").
has("name","Johnny Depp").
in("actor").
has("year",gte(2010)).
inE("rated").
has("rating", within(7,8)).
outV().
values("age")
// Sample Output:
// 60, 37, 63, 62, 57, ...
Path Traversal with path
Find a path from a user to a friend of a friend
g.V().
    has("user","userId","u1").
    out("knows").out("knows").
    path().by("userId").limit(1)
g.V().
    has("user","userId","u1").
    out("knows").out("knows").
    path().by("age").limit(1)
Path Traversal with path
Find a shortest, simple path between two actors
g.V().has("person","name","Johnny Depp").
repeat(both("actor").simplePath().timeLimit(1000)).
until(has("person","name","Leonardo DiCaprio")).
path().by(choose(hasLabel("person"),
values("name"),values("title"))).limit(1)
// Sample Output
// "Johnny Depp", "Dead Man", "Gabriel Byrne",
// "The Man in the Iron Mask", "Leonardo DiCaprio"
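The idea behind this traversal, expanding outward one hop at a time and never revisiting a vertex, is a breadth-first search. The sketch below shows it in plain Python on a hand-built adjacency map; the map contains only the vertices from the sample output plus one extra movie, and the function name `shortest_simple_path` is an assumption for illustration.

```python
from collections import deque

# Undirected actor/movie links, as followed by both("actor") in the traversal above.
links = {
    "Johnny Depp": ["Dead Man", "Alice in Wonderland"],
    "Dead Man": ["Johnny Depp", "Gabriel Byrne"],
    "Alice in Wonderland": ["Johnny Depp"],
    "Gabriel Byrne": ["Dead Man", "The Man in the Iron Mask"],
    "The Man in the Iron Mask": ["Gabriel Byrne", "Leonardo DiCaprio"],
    "Leonardo DiCaprio": ["The Man in the Iron Mask"],
}


def shortest_simple_path(start, goal):
    """Breadth-first search: the first path that reaches goal is a shortest one."""
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in links[path[-1]]:
            if nxt not in path:  # same idea as simplePath(): never revisit a vertex
                queue.append(path + [nxt])
    return None


found = shortest_simple_path("Johnny Depp", "Leonardo DiCaprio")
print(found)
```

On a real graph the frontier grows quickly with each hop, which is why the Gremlin version above guards the search with timeLimit(1000) and limit(1).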
Projecting Traversal with select
Find Johnny Depp’s movie titles
g.V().has("person","name","Johnny Depp").
as("actor").
in("actor").as("movie").
select("actor","movie").
by("name").
by("title").
sample(1)
// Sample Output
// "actor": "Johnny Depp", "movie": "Dead Man"
Projecting Traversal with select
Find Johnny Depp’s movie titles, years, and genres
g.V().has("person","name","Johnny Depp").
as("a").
in("actor").as("t","y").
out("belongsTo").as("g").
select("a","t","y","g").
by("name").by("title").by("year").by("name").
sample(3)
// Sample output:
// [a:Johnny Depp, t:Alice in Wonderland, y:2010, g:Fantasy]
// [a:Johnny Depp, t:Alice in Wonderland, y:2010, g:Animation]
// [a:Johnny Depp, t:Alice in Wonderland, y:2010, g:Adventure]
Statistical Traversal with count
Find the number of vertices and edges in a graph
g.V().count()
// 10797
g.E().count()
// 69054
g.V().hasLabel("user").count()
// 1100
Statistical Traversal with mean, min and max
Find the average, smallest and largest user ages
g.V().hasLabel("user").values("age").mean()
// 32.01282051282051
g.V().hasLabel("user").values("age").min()
// 12
g.V().hasLabel("user").values("age").max()
Statistical Traversal with group().by().by() pattern
Find the average ages of female and male users in the graph
g.V().hasLabel("user").
group().
by("gender").
by(values("age").mean())
// F : 30.561371841155236
// M : 32.01282051282051
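The group().by().by() pattern reads as "group by a key, then reduce each group". A plain-Python equivalent is sketched below; the three users are hypothetical sample data, not values from the graph in the deck.

```python
from collections import defaultdict
from statistics import mean

users = [  # hypothetical sample data
    {"gender": "F", "age": 25},
    {"gender": "F", "age": 35},
    {"gender": "M", "age": 40},
]

groups = defaultdict(list)
for u in users:
    groups[u["gender"]].append(u["age"])  # first by(): the grouping key

# second by(): the reducer applied to the values of each group
result = {gender: mean(ages) for gender, ages in groups.items()}
print(result)
```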
Statistical Traversal with groupCount
Find the vertex and edge distribution by label in a graph
g.V().groupCount().by(label)
// [movie:920,
// person:8759,
// genre:18,
// user:1100]
Declarative Traversal with match
Find directors who appeared in their own movies
g.V().match(
__.as("m").out("actor").as("p"),
__.as("m").out("director").as("p")).
select("p").by("name").dedup().
sample(3)
// Sample Output.
// "Gene Kelly",
// "Sylvester Stallone",
// "Terry Jones"
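The two match() patterns must bind the same movie m and the same person p, which amounts to intersecting, per movie, the people reached over actor edges with those reached over director edges. A minimal Python sketch, using a two-movie sample chosen for illustration:

```python
actor = {  # movie -> people reached by out("actor"); small illustrative sample
    "Rocky II": {"Sylvester Stallone", "Talia Shire"},
    "Alice in Wonderland": {"Johnny Depp"},
}
director = {  # movie -> people reached by out("director")
    "Rocky II": {"Sylvester Stallone"},
    "Alice in Wonderland": {"Tim Burton"},
}

# A person matches if, for some movie m, both patterns bind that person to p.
matches = sorted({p for m in actor for p in actor[m] & director.get(m, set())})
print(matches)  # ['Sylvester Stallone']
```

The declarative form leaves the order in which the patterns are evaluated to the engine; the sketch fixes one order only for simplicity.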
DSE Graph in Action – Indexing Graph Data
Graph Schema API – Vertex Indexes (I)
Efficiently retrieve all vertices with a known label and a given property value
• Creating and using a materialized view index on a high-cardinality property
schema.vertexLabel("movie").index("moviesById").materialized().by("movieId")
    .add()
g.V().hasLabel("movie").has("movieId","m267")
• Creating and using a secondary index on a low-cardinality property
schema.vertexLabel("movie").index("moviesByYear").secondary().by("year")
    .add()
g.V().has("movie","year",2010)
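Why an index avoids a scan can be shown with ordinary dictionaries. The sketch below is a conceptual model only: it does not reflect how Cassandra implements materialized views or secondary indexes, and the third movie entry is a hypothetical placeholder.

```python
from collections import defaultdict

movies = [
    {"movieId": "m267", "title": "Alice in Wonderland", "year": 2010},
    {"movieId": "m1000", "title": "Sully", "year": 2016},
    {"movieId": "m0", "title": "(hypothetical)", "year": 2010},
]

# High-cardinality key (unique per movie): one entry per value.
movies_by_id = {m["movieId"]: m for m in movies}

# Low-cardinality key (shared by many movies): each entry is a list.
movies_by_year = defaultdict(list)
for m in movies:
    movies_by_year[m["year"]].append(m)


def scan(key, value):
    """Lookup without an index: every vertex has to be inspected."""
    return [m for m in movies if m[key] == value]


# Indexed lookups return the same results as the scan, without touching every vertex.
print(movies_by_id["m267"]["title"])
print([m["movieId"] for m in movies_by_year[2010]])
```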
Graph Schema API – Vertex Indexes (II)
Efficiently retrieve all vertices with a known label and a given property value
• Creating and using a full-text search index on a text property
schema.vertexLabel("movie").index("search").search().by("title").asText()
    .add()
g.V().has("movie","title",Search.tokenRegex("Wonder.*"))
• Creating and using a string search index on a text property
schema.vertexLabel("movie").index("search").search().by("country").asString()
    .add()
g.V().has("movie","country",Search.prefix("U"))
Graph Schema API – Property Indexes
Efficiently retrieve properties of a known vertex that have associated meta-properties whose values are known or fall into a known range
• Creating and using a property index
schema.vertexLabel("movie").index("movieBudgetBySource")
    .property("budget").by("source")
    .add()
schema.vertexLabel("movie").index("movieBudgetByDate")
    .property("budget").by("date")
    .add()
g.V().has("movie","movieId","m267").properties("budget").
    has("source","Los Angeles Times").value()
[Figure: Movie vertex (movieId: Text, title: Text, year: Int, duration: Int, country: Text, production: Text*, budget: Int* [source: Text, date: Timestamp])]
Graph Schema API – Edge Indexes (I)
Efficiently traverse edges that are incident to a known vertex, have a known label, and have properties whose values are known or fall into a known range
• Creating and using an edge index
schema.vertexLabel("movie").index("toUsersByRating")
    .inE("rated").by("rating")
    .add()
• Find how many users rated a particular movie with an 8-star rating
g.V().has("movie", "movieId", "m267").
    inE("rated").has("rating",8).count()
[Figure: User -[rated (0..1), rating: int]-> Movie (movieId: Text, …)]
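Conceptually, an edge index keeps the edges of each vertex organized by a property value, so a filter on that property does not have to visit every incident edge. A toy Python model follows; the rating tuples are hypothetical and the nested dictionary is only an analogy for the index, not its actual storage layout.

```python
from collections import defaultdict

# (user, movie, rating) triples standing in for "rated" edges; hypothetical data.
rated = [("u1", "m267", 8), ("u2", "m267", 8), ("u3", "m267", 5), ("u1", "m1000", 7)]

# Analogy for index("toUsersByRating").inE("rated").by("rating"):
# movie -> rating -> users who gave that rating
to_users_by_rating = defaultdict(lambda: defaultdict(list))
for user, movie, rating in rated:
    to_users_by_rating[movie][rating].append(user)

# "How many users rated m267 with 8 stars?" becomes a direct lookup.
eight_star_count = len(to_users_by_rating["m267"][8])
print(eight_star_count)  # 2
```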
Graph Schema API – Edge Indexes (II)
Efficiently traverse edges that are incident to a known vertex, have a known label, and have properties whose values are known or fall into a known range
• Creating and using an edge index
schema.vertexLabel("user").index("toMoviesByRating")
    .outE("rated").by("rating")
    .add()
• Find movies rated with a greater-than-7 rating by a particular user
g.V().has("user", "userId", "u1").
    outE("rated").has("rating",gt(7)).inV()
[Figure: User (userId: Text, …) -[rated (0..1), rating: int]-> Movie (movieId: Text, …)]
Summary
Summary - A Complete Integrated Solution for Graph
• Server
• Visual Management/Monitoring
• Visual Development
• Integrated Drivers (CQL, Gremlin, etc.): Java, Python, C++, more…
Images: DataStax
Trivadis Enterprise Knowledge Graph
[Figure: schema of the Trivadis Enterprise Knowledge Graph]
Vertex types: Presentation, Project, Employee, Event, Course, Term, Certification, Department, Location, Customer, Event Type, Project Type, Industry, Publication, Social Media Profile, Instant Messaging
Edge labels: skill (level, since), teaches, located in (since), related to, child of, author, participates in (since, role), part of (since), owns (since), created for, managed by, for, in, isOf, responsible unit, owned by, uses, responsible (for), previous year, main teacher
Armasuisse W&T Social Network Graph
[Figure: schema of the Armasuisse W&T Social Network Graph]
Vertex types: Twitter User, Tweet, Term, Place, YouTube Channel, YouTube Video, Url, Person, Webpage, Geo Point, Entity
Edge labels: publishes (timestamp), mentions (timestamp), uses, uses (time), followedBy, retweets, replies, owns, linksTo, located
Properties shown on the vertices include: id, name, type, language, timestamp, targetIds, street, country, url, shortUrl, location, altitude, hasFavorites, hasLikes, hasDislikes, hasComments, lastUpdateTime
More Information
• DS330: DataStax Enterprise Graph, a self-paced course offered for free by DataStax
Guido Schmutz
Technology Manager
guido.schmutz@trivadis.com
@gschmutz guidoschmutz.wordpress.com

Trivadis TechEvent 2016 Introduction to DataStax Enterprise (DSE) Graph by Guido Schmutz

  • 1.
    BASEL BERN BRUGGDÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. GENF HAMBURG KOPENHAGEN LAUSANNE MÜNCHEN STUTTGART WIEN ZÜRICH Introduction to DataStax Enterprise (DSE) Graph Guido Schmutz @gschmutz guidoschmutz.wordpress.com
  • 2.
    Agenda 1. Why aGraph Database? 2. DataStax Enterprise Edition – DSE Graph 3. DSE Graph in Action • Working with Graph Schemas • Inserting Graph Data • Traversing Graph Data • Indexing Graph Data 4. Summary Introduction to DataStax Enterprise (DSE) Graph2 9/9/2016
  • 3.
    Why a GraphDatabase? Introduction to DataStax Enterprise (DSE) Graph3 9/9/2016
  • 4.
    Know your domain Connectedness of Datalowhigh Document Data Store Key-Value Stores Wide- Column Store Graph Databases Relational Databases Introduction to DataStax Enterprise (DSE) Graph4 9/9/2016
  • 5.
    RDBMS vs. GraphDatabase Introduction to DataStax Enterprise (DSE) Graph5 9/9/2016 RDBMS Graph Database Process to query data elements (joins) is inefficient on large data set or many relationships Better performance for relationship queries due to specialized index structures Expressing JOIN-intensive queries in SQL is time-consuming and error prone Intuitive query language enabling faster application development
  • 6.
    RDBMS vs. GraphDatabase – Data Modelling Introduction to DataStax Enterprise (DSE) Graph6 9/9/2016 VS. !" !" !" #" #" $" $" $" partner! worksFor
 ! develops! develops! uses! uses! reportsTo! reportsTo! manages! worksFor
 ! worksFor
 ! Source: DataStax
  • 7.
    Graph Use Cases Introductionto DataStax Enterprise (DSE) Graph9 9/9/2016 Master Data Management Customer 360 Recommendation Personalization Security Fraud Detection Internet of Things (IoT)
  • 8.
    Introduction to theGraph Model – Property Graph Node / Vertex • Represent Entities • Can contain properties (key-value pairs) Relationship / Edge • Lines between nodes • may be directed or undirected Properties • Values about node or relationship • Allow to add semantic to relationships User 1 Movie rated knows rated User 2 userId: 16134540 name: Peter location: Palo Alto userId: 18898576 name: Guido location: Berne rating: 8 key: value Introduction to DataStax Enterprise (DSE) Graph12 9/9/2016 rating: 6 movieId: m1000 title: Sully year: 2016 duration: 95 country: United States production: [FilmNation, Flashlight Films]
  • 9.
    DataStax Enterprise Edition– DSE Graph Introduction to DataStax Enterprise (DSE) Graph13 9/9/2016
  • 10.
    DataStax Enterprise Edition– DSE Graph Introduction to DataStax Enterprise (DSE) Graph14 • A scale-out graph database purposely built for cloud applications that need to act on complex and highly connected relationships. • Supports a property graph model natively inside the DataStax product, engineered specifically for DataStax Enterprise (Cassandra, Search, Analytics). • Store & find relationships in data fast and easy in large graphs. • Part of DSE’s multi-model platform. 9/9/2016 Image: DataStax
  • 11.
    DataStax Enterprise Graph Introductionto DataStax Enterprise (DSE) Graph15 9/9/2016 Real-time graph database management system • Adopts Apache TinkerPop standards for data and traversal • Uses Apache Cassandra for scalable storage and retrieval • Leverages Apache Solr for full-text search and indexing • Integrates Apache Spark for fast analytic traversal • Supports comprehensive data security for the enterprise Image: DataStax
  • 12.
    Why Cassandra? • Allnodes participate in a cluster • Shared nothing • Add or remove as needed • More capacity? Add more servers • Each node owns a range of partitions • Consistent Hashing
  • 13.
    DSE Graph -Architecture Introduction to DataStax Enterprise (DSE) Graph17 9/9/2016 Image: DataStax
  • 14.
    Today’s Needs andthe solution in DSE Graph Need DSE Graph Feature Store and Access Data Quickly Graph Data Model + Gremlin Flexible, Fast Application Builds DSE Graph + DSE Studio & DSE Drivers Analyze Information Graph Analytics with DSE Analytics Search and Find Quickly Graph Search with DSE Search Ingest and Export Data DSE Graph Loader Secure Information DSE Security Manage and Monitor Opscenter Availability, Scale, Operational Ease, Performance… DataStax Enterprise Introduction to DataStax Enterprise (DSE) Graph18 9/9/2016
  • 15.
    • Web-based developersolution which helps developers visually explore, query, and trouble-shoot DSE Graph in one intuitive UI. • Auto-completion, result set visualization, execution management, and much more. Developer support with DataStax Studio Introduction to DataStax Enterprise (DSE) Graph19 9/9/2016
  • 16.
    Data Loading Supportwith DSE Graph Loader • Simplifies loading large amounts of enterprise data from various sources into DSE Graph efficiently and robustly. • Inspects incoming data for schema compliance. • Uses declarative data mappings and custom transformations to handle diverse types of data. Graph Loader   Data Mappings Batch Loading Stream Ingestion RDBMSJSON DSE Graph Introduction to DataStax Enterprise (DSE) Graph20 9/9/2016
  • 17.
    Operational support with DataStax OpsCenter
    • Web-based operations solution which can launch, manage, monitor and troubleshoot DSE clusters and deployments.
    • Launch wizard, failure alerts, monitoring dashboards, and much more.
  • 18.
    DSE Graph in Action
  • 19.
    Sample Graph: Movies Database
    Vertex labels and their properties:
    • Movie: movieId : text, title : text, year : int, duration : int, country : text, production : text*
    • Person: personId : text, name : text
    • User: userId : text, age : int, gender : text
    • Genre: genreId : int, name : text
    Edge labels:
    • Movie to Person: actor, director, composer, screenwriter, cinematographer
    • Movie to Genre: belongsTo
    • User to Movie: rated (rating : int)
    • User to User: knows
  • 20.
    DSE Graph in Action – Working with Graph Schemas
  • 21.
    Graph Schema API
    Schema management basics
    • Accessible via the schema object
    • Defined for each graph in the DSE Graph system
    • Provides methods propertyKey, vertexLabel, edgeLabel, etc.
    • Supports a fluent interface
    Four important elements of a graph schema
    • Property keys
    • Vertex labels
    • Edge labels
    • Indexes
  • 22.
    Graph Schema API – Property Keys
    Define types of properties to be used with vertices and edges
    • Creating single-cardinality property keys (single() is assumed by default)
        schema.propertyKey("title").Text().single().create();
        schema.propertyKey("year").Int().create();
    • Creating multi-property keys (can only be associated with vertices)
        schema.propertyKey("production").Text().multiple().create();
    • Creating meta-property keys (can only be associated with vertices)
        schema.propertyKey("source").Text().create();
        schema.propertyKey("date").Timestamp().create();
        schema.propertyKey("budget").Text().multiple().
            properties("source","date").create();
    Diagram: Movie vertex with movieId: Text, title: Text, year: Int, duration: Int, country: Text, production: Text*
  • 23.
    Graph Schema API – Vertex Labels
    Define types of vertices to be used in a graph
    • Creating vertex labels
    • Different labels can use the same property key
        schema.vertexLabel("movie").
            properties("movieId","title","year","duration","country","production").
            create();
        schema.vertexLabel("genre").properties("genreId","name").create();
        schema.vertexLabel("person").properties("personId","name").create();
    Diagram: Movie (movieId: Text, title: Text, year: Int, duration: Int, country: Text, production: Text*), Genre (genreId: Text, name: Text), Person (personId: Text, name: Text)
  • 24.
    Graph Schema API – Edge Labels
    Define types of edges to be used in a graph
    • Creating a single-cardinality edge label definition
        schema.edgeLabel("rated").
            single().
            properties("rating").
            connection("user","movie").
            create();
    • Creating a multi-cardinality edge label definition (multiple() is assumed by default)
        schema.edgeLabel("actor").
            multiple().
            properties("rating").
            connection("movie","person").
            create();
    Diagram: User to Movie via rated (0..1, rating: int); Movie to Person via actor (0..*)
  • 25.
    Graph Schema API – Management of Graph Schemas
    • Listing all schema definitions
        schema.describe()
    • Listing individual schema elements
        schema.vertexLabel("user").describe()
        schema.edgeLabel("rated").describe()
    • Dropping all schema definitions (this also deletes all graph data!)
        schema.clear()
  • 26.
    DSE Graph in Action – Inserting Graph Data
  • 27.
    Creating Vertices and Vertex Properties
    Adding a new User vertex
        Vertex u = graph.addVertex("user");
        u.property("userId","u2016");
        u.property("age",26);
        u.property("gender","M");
    Result: User (userId: u2016, age: 26, gender: M)
    Adding a new Movie vertex
        Vertex m = graph.addVertex(label, "movie",
                                   "movieId", "m1000",
                                   "title", "Sully",
                                   "year", 2016,
                                   "duration", 95,
                                   "country", "United States");
        m.property("production", "FilmNation");
        m.property("production", "Flashlight Films");
    Result: Movie (movieId: m1000, title: Sully, year: 2016, duration: 95, country: United States, production: [FilmNation, Flashlight Films])
  • 28.
    Creating Edges and Edge Properties
    Adding an edge between the user vertex and the movie vertex
        Vertex u = graph.addVertex("user", ...);
        Vertex m = graph.addVertex("movie", ...);
        Edge e = u.addEdge("rated", m);
        e.property("rating", 8);
    Result: User (userId: u2016, age: 26, gender: M) connected by rated (rating: 8) to Movie (movieId: m1000, title: Sully, year: 2016, duration: 95, country: United States, production: [FilmNation, Flashlight Films])
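To make the data model behind these two slides concrete, here is a small in-memory sketch of a property graph in plain Python. It is not the DSE Graph or TinkerPop API: the `Vertex` and `Edge` classes and the `set_property` method are hypothetical names, and the `multiple` flag only imitates what a multi-cardinality property key such as production does.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Vertex:
    label: str
    # Every property is stored as a list so multi-valued keys need no special case.
    properties: Dict[str, List[Any]] = field(default_factory=dict)

    def set_property(self, key: str, value: Any, multiple: bool = False) -> None:
        if multiple:
            # Multi-cardinality: keep earlier values and append the new one.
            self.properties.setdefault(key, []).append(value)
        else:
            # Single-cardinality: the new value replaces any earlier one.
            self.properties[key] = [value]


@dataclass
class Edge:
    label: str
    out_v: Vertex  # vertex the edge leaves
    in_v: Vertex   # vertex the edge enters
    properties: Dict[str, Any] = field(default_factory=dict)


u = Vertex("user")
u.set_property("userId", "u2016")

m = Vertex("movie")
m.set_property("title", "Sully")
m.set_property("production", "FilmNation", multiple=True)
m.set_property("production", "Flashlight Films", multiple=True)

e = Edge("rated", out_v=u, in_v=m, properties={"rating": 8})
print(m.properties["production"], e.properties["rating"])
```

Note that the edge carries its own properties and a direction, which is what later lets a traversal filter on rating before it ever reaches the movie vertex.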
  • 29.
    Gremlin I/O
    Exporting and importing existing graphs and subgraphs
    • Migration between Apache TinkerPop-enabled graph databases
    • Serialization and interchange of graphs between databases and tools
    • Creating backup copies of a graph
    • Different file formats for graph serialization
    Format      Description
    GraphML     Common XML vocabulary for graph representation. Supported by many graph-related tools, applications and libraries. "Lossy" format, as it lacks support for complex data types and graph variables.
    GraphSON    Apache TinkerPop-originated, JSON-based, lossless format for graph representation.
    Gryo        Fast, space-efficient, lossless, binary graph serialization format for use by JVM languages. Gryo is enabled by Kryo, a fast and efficient object graph serialization framework.
  • 30.
    Gremlin I/O
    Exporting to and importing from a GraphML file
        graph.io(IoCore.graphml()).writeGraph("KillrVideo.xml")
        graph.io(IoCore.graphml()).readGraph("KillrVideo.xml")
    Exporting to and importing from a GraphSON file
        graph.io(IoCore.graphson()).writeGraph("KillrVideo.json")
        graph.io(IoCore.graphson()).readGraph("KillrVideo.json")
    Exporting to and importing from a Gryo file
        graph.io(IoCore.gryo()).writeGraph("KillrVideo.kryo")
        graph.io(IoCore.gryo()).readGraph("KillrVideo.kryo")
  • 31.
    DSE Graph in Action – Traversing Graph Data
  • 32.
    Gremlin Traversals in DSE Graph
    • Standard Gremlin language
    • Gremlin is defined by Apache TinkerPop
    • Expressive fluent language to define traversals (Groovy)
        g.V().has("movie","movieId","m366").values("title","year")
    • Production vs. Development modes
    • Production (default) requires an explicitly defined graph schema and proper graph indexes to avoid expensive scans
        schema.config().option("graph.schema_mode").get()
        schema.config().option("graph.schema_mode").set("Development")
    • Enabling graph scans in production mode (acceptable when scanning small portions of data; caution is advised!)
        schema.config().option("graph.allow_scan").set(true)
  • 33.
    Gremlin Traversals in DSE Graph
    OLTP vs. OLAP Traversals
    OLTP traversals
    • Generate instantaneous responses
    • Resemble targeted database queries
    • Heavily rely on vertex ids or indexes
    • Access small subgraphs
    • Traverse short paths with few branches
    OLAP traversals
    • Take longer to execute
    • Involve broader-scope data analysis
    • Require expensive graph scans
  • 34.
    Gremlin Traversal Ingredients
    • Traversal source
        g = graph.traversal()
    • Traversal steps
        g.V().has("title","Alice in Wonderland").has("year",2010).
            out("director").values("name")
    • Traverser
  • 35.
    Defining Gremlin Traversals
    Linear motif: the traversal is a sequence of steps
        g.
          V().
          has("title","Alice in Wonderland").
          has("year",2010).
          out("director").
          values("name")
        // Sample Output:
        // Tim Burton
    Diagram: Movie m267 (title: Alice in Wonderland, year: 2010, …) with edges screenwriter to Person p5206 (Linda Woolverton), director to Person p8153 (Tim Burton) and actor to Person p4361 (Johnny Depp)
  • 36.
    Defining Gremlin Traversals
    Nested motif: the traversal is a tree of steps
        g.
          V().
          has("title","Alice in Wonderland").
          has("year",2010).
          union(__.out("director"),
                __.out("screenwriter")).
          values("name")
        // Sample Output:
        // Tim Burton
        // Linda Woolverton
    Diagram: Movie m267 (title: Alice in Wonderland, year: 2010, …) with edges screenwriter to Person p5206 (Linda Woolverton), director to Person p8153 (Tim Burton) and actor to Person p4361 (Johnny Depp)
  • 37.
    Simple Traversal
    Navigating from a vertex:
    Step                 Description
    out([label], …)      Move to the outgoing vertices
    in([label], …)       Move to the incoming vertices
    both([label], …)     Move to both the incoming and outgoing vertices
    outE([label], …)     Move to the outgoing edges
    inE([label], …)      Move to the incoming edges
    bothE([label], …)    Move to both the incoming and outgoing edges
    Diagram: User, Movie and Genre vertices joined by rated, belongsTo and actor edges, annotated with out(), outE(), in(), inE(), both() and bothE()
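The six vertex steps in the table can be modelled over a plain edge list. The following is a hedged Python sketch of their meaning only, not Gremlin and not the DSE Graph API; the function names (`out`, `in_`, `both`, `out_e`, `in_e`) and the mini-graph are hypothetical.

```python
from typing import List, Optional, Tuple

Edge = Tuple[str, str, str]  # (outgoing vertex, edge label, incoming vertex)

# Hypothetical mini-graph in the shape of the sample movie schema.
EDGES: List[Edge] = [
    ("u1", "rated", "m267"),
    ("u2", "rated", "m267"),
    ("u1", "knows", "u2"),
    ("m267", "belongsTo", "g1"),
    ("m267", "actor", "p4361"),
]


def out_e(v: str, label: Optional[str] = None) -> List[Edge]:
    """Edges leaving v, optionally restricted to one label (like outE)."""
    return [e for e in EDGES if e[0] == v and label in (None, e[1])]


def in_e(v: str, label: Optional[str] = None) -> List[Edge]:
    """Edges entering v, optionally restricted to one label (like inE)."""
    return [e for e in EDGES if e[2] == v and label in (None, e[1])]


def out(v: str, label: Optional[str] = None) -> List[str]:
    """Vertices at the far end of v's outgoing edges (like out)."""
    return [e[2] for e in out_e(v, label)]


def in_(v: str, label: Optional[str] = None) -> List[str]:
    """Vertices at the far end of v's incoming edges (like in)."""
    return [e[0] for e in in_e(v, label)]


def both(v: str, label: Optional[str] = None) -> List[str]:
    """Neighbours over incoming and outgoing edges (like both)."""
    return in_(v, label) + out(v, label)


print(out("m267"))           # ['g1', 'p4361']
print(in_("m267", "rated"))  # ['u1', 'u2']
print(both("u1"))            # ['m267', 'u2']
```

The vertex-returning steps are just the edge-returning steps followed by a hop to the other end, which is why `out` is written in terms of `out_e`.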
  • 38.
    Simple Traversal
    Navigating from an edge:
    Step        Description
    outV()      Move to the outgoing vertex
    inV()       Move to the incoming vertex
    bothV()     Move to both the incoming and outgoing vertices
    otherV()    Move to the vertex that was not the vertex that was moved from
    Diagram: User to Movie via rated, annotated with outV(), inV() and bothV()
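The edge steps can be sketched the same way. Again this is an illustrative Python model rather than Gremlin: the `Edge` tuple and the helper names are assumptions, and `other_v` takes the vertex the traverser came from as an explicit argument, whereas Gremlin tracks that for you.

```python
from typing import NamedTuple, Tuple


class Edge(NamedTuple):
    out_v: str  # vertex the edge leaves
    label: str
    in_v: str   # vertex the edge enters


def out_v(e: Edge) -> str:
    return e.out_v


def in_v(e: Edge) -> str:
    return e.in_v


def both_v(e: Edge) -> Tuple[str, str]:
    return (e.out_v, e.in_v)


def other_v(e: Edge, came_from: str) -> str:
    """Return the end of the edge that the traverser did not arrive from."""
    return e.in_v if came_from == e.out_v else e.out_v


rated = Edge("u2016", "rated", "m1000")
print(out_v(rated), in_v(rated), other_v(rated, "m1000"))
```

For a rated edge, the outgoing vertex is the user and the incoming vertex is the movie, so starting from a movie and following inE("rated") then outV() lands on the users who rated it.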
  • 39.
    Simple Traversal with in
    Find Johnny Depp’s movies released in 2010 or later:
        g.V().
          hasLabel("person").
          has("name","Johnny Depp").
          in("actor").
          has("year",gte(2010)).
          values("title")
        // Sample Output:
        // "Into the Woods",
        // "Pirates of the Caribbean: On Stranger Tides",
        // "Alice in Wonderland"
  • 40.
    Simple Traversal with in and inE
    Find user ratings for Johnny Depp’s movies released in 2010 or later:
        g.V().
          hasLabel("person").
          has("name","Johnny Depp").
          in("actor").
          has("year",gte(2010)).
          inE("rated").
          values("rating")
        // Sample Output:
        // 3, 7, 7, 7, 5, 5, 8, ...
  • 41.
    Simple Traversal with in, inE and outV
    Find ages of users who left a 7 or 8 star rating for Johnny Depp’s movies released in 2010 or later:
        g.V().
          hasLabel("person").
          has("name","Johnny Depp").
          in("actor").
          has("year",gte(2010)).
          inE("rated").
          has("rating", within(7,8)).
          outV().
          values("age")
        // Sample Output:
        // 60, 37, 63, 62, 57, ...
  • 42.
    Path Traversal with path
    Find a path from a user to a friend of a friend (two knows hops), shown by userId and then by age:
        g.V().
          has("user","userId","u1").
          out("knows").out("knows").
          path().by("userId").limit(1)

        g.V().
          has("user","userId","u1").
          out("knows").out("knows").
          path().by("age").limit(1)
  • 43.
    Path Traversal with path
    Find a shortest, simple path between two actors
        g.V().has("person","name","Johnny Depp").
          repeat(both("actor").simplePath().timeLimit(1000)).
          until(has("person","name","Leonardo DiCaprio")).
          path().by(choose(hasLabel("person"),
                           values("name"),values("title"))).limit(1)
        // Sample Output:
        // "Johnny Depp", "Dead Man", "Gabriel Byrne",
        // "The Man in the Iron Mask", "Leonardo DiCaprio"
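What "shortest, simple path" means here can be shown with a breadth-first search in plain Python. This is a sketch of the general technique, not a description of how DSE Graph executes the traversal. The first four links are taken from the sample output on this slide; "Movie X" and "Actor Y" are hypothetical additions that create a dead-end branch.

```python
from collections import deque
from typing import List, Optional

# Undirected movie-person links, in the spirit of both("actor").
ACTOR_LINKS = [
    ("Dead Man", "Johnny Depp"),
    ("Dead Man", "Gabriel Byrne"),
    ("The Man in the Iron Mask", "Gabriel Byrne"),
    ("The Man in the Iron Mask", "Leonardo DiCaprio"),
    ("Movie X", "Johnny Depp"),  # hypothetical
    ("Movie X", "Actor Y"),      # hypothetical
]


def neighbours(v: str) -> List[str]:
    return [b if a == v else a for a, b in ACTOR_LINKS if v in (a, b)]


def shortest_simple_path(start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search: the first path that reaches the goal has the fewest hops."""
    queue = deque([[start]])
    seen = {start}  # never revisit a vertex, so every path stays simple
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for n in neighbours(path[-1]):
            if n not in seen:
                seen.add(n)
                queue.append(path + [n])
    return None


path = shortest_simple_path("Johnny Depp", "Leonardo DiCaprio")
print(path)
```

The path alternates between people and movies, which is why the Gremlin version uses choose() to print a name for person vertices and a title for movie vertices.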
  • 44.
    Projecting Traversal with select
    Find Johnny Depp’s movie titles
        g.V().has("person","name","Johnny Depp").
          as("actor").
          in("actor").as("movie").
          select("actor","movie").
            by("name").
            by("title").
          sample(1)
        // Sample Output:
        // "actor": "Johnny Depp", "movie": "Dead Man"
  • 45.
    Projecting Traversal with select
    Find Johnny Depp’s movie titles, years, and genres
        g.V().has("person","name","Johnny Depp").
          as("a").
          in("actor").as("t","y").
          out("belongsTo").as("g").
          select("a","t","y","g").
            by("name").by("title").by("year").by("name").
          sample(3)
        // Sample Output:
        // [a:Johnny Depp, t:Alice in Wonderland, y:2010, g:Fantasy]
        // [a:Johnny Depp, t:Alice in Wonderland, y:2010, g:Animation]
        // [a:Johnny Depp, t:Alice in Wonderland, y:2010, g:Adventure]
  • 46.
    Statistical Traversal with count
    Find the number of vertices and edges in a graph
        g.V().count()                  // 10797
        g.E().count()                  // 69054
        g.V().hasLabel("user").count() // 1100
  • 47.
    Statistical Traversal with mean, min and max
    Find the average, smallest and largest user ages
        g.V().hasLabel("user").values("age").mean() // 32.01282051282051
        g.V().hasLabel("user").values("age").min()  // 12
        g.V().hasLabel("user").values("age").max()
  • 48.
    Statistical Traversal with the group().by().by() pattern
    Find the average ages of female and male users in the graph
        g.V().hasLabel("user").
          group().
            by("gender").
            by(values("age").mean())
        // F : 30.561371841155236
        // M : 32.01282051282051
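In the group().by().by() pattern, the first by() chooses the grouping key and the second by() says how to reduce each group. A rough plain-Python equivalent, with made-up user records and a hypothetical `group_mean` helper, might look like this:

```python
from collections import defaultdict
from statistics import mean
from typing import Any, Dict, Iterable, Mapping

# Hypothetical user vertices reduced to plain dictionaries.
users = [
    {"userId": "u1", "gender": "F", "age": 30},
    {"userId": "u2", "gender": "M", "age": 40},
    {"userId": "u3", "gender": "F", "age": 20},
    {"userId": "u4", "gender": "M", "age": 36},
]


def group_mean(rows: Iterable[Mapping[str, Any]], key: str, value: str) -> Dict[Any, float]:
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])          # first by(): the grouping key
    return {k: mean(v) for k, v in groups.items()}  # second by(): reduce each group


average_age = group_mean(users, "gender", "age")
print(average_age)
```

Swapping `mean` for `len`, `min` or `max` gives the other statistical variants without changing the grouping logic.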
  • 49.
    Statistical Traversal with groupCount
    Find the vertex and edge distribution by label in a graph
        g.V().groupCount().by(label)
        // [movie:920,
        //  person:8759,
        //  genre:18,
        //  user:1100]
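groupCount().by(label) is a histogram over labels. As a quick sanity check on the slide, the four counts add up to 10797, the total reported earlier by g.V().count(). A plain-Python sketch of the same idea, using a short hypothetical list of vertex labels:

```python
from collections import Counter

# Hypothetical labels, one entry per vertex.
vertex_labels = ["movie", "person", "person", "user", "genre", "person", "user"]

distribution = Counter(vertex_labels)
print(dict(distribution))

# The per-label counts always add up to the total number of vertices.
total = sum(distribution.values())
print(total == len(vertex_labels))

# The same check applied to the numbers on the slide.
slide_counts = {"movie": 920, "person": 8759, "genre": 18, "user": 1100}
print(sum(slide_counts.values()))
```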
  • 50.
    Declarative Traversal with match
    Find directors who appeared in their own movies
        g.V().match(
            __.as("m").out("actor").as("p"),
            __.as("m").out("director").as("p")).
          select("p").by("name").dedup().
          sample(3)
        // Sample Output:
        // "Gene Kelly",
        // "Sylvester Stallone",
        // "Terry Jones"
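The two match() patterns share the labels m and p, so a result must satisfy both for the same movie and the same person. In plain Python this amounts to a per-movie set intersection. The sketch below uses hypothetical ids and a hypothetical `self_directed` helper; it illustrates the semantics, not the Gremlin execution.

```python
from typing import Dict, Set

# Hypothetical movies with their actor and director person ids.
movies: Dict[str, Dict[str, Set[str]]] = {
    "m1": {"actor": {"p1", "p2"}, "director": {"p1"}},
    "m2": {"actor": {"p3"}, "director": {"p4"}},
    "m3": {"actor": {"p4", "p5"}, "director": {"p5"}},
}


def self_directed(catalog: Dict[str, Dict[str, Set[str]]]) -> Set[str]:
    """People who are both actor and director of the same movie."""
    result: Set[str] = set()
    for roles in catalog.values():
        # Both patterns must bind the same m and the same p.
        result |= roles["actor"] & roles["director"]
    return result  # a set, so duplicates are already removed (like dedup())


found = self_directed(movies)
print(sorted(found))
```

Person p4 directed m2 and acted in m3, but never both in one movie, so p4 is correctly left out.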
  • 51.
    DSE Graph in Action – Indexing Graph Data
  • 52.
    Graph Schema API – Vertex Indexes (I)
    Efficiently retrieve all vertices with a known label and a given property value
    • Creating and using a materialized view index on a high-cardinality property
        schema.vertexLabel("movie").index("moviesById").materialized().by("movieId")
          .add()
        g.V().hasLabel("movie").has("movieId","m267")
    • Creating and using a secondary index on a low-cardinality property
        schema.vertexLabel("movie").index("moviesByYear").secondary().by("year")
          .add()
        g.V().has("movie","year",2010)
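Why an index avoids the expensive scans mentioned earlier can be shown with a lookup table in plain Python. This sketch models only the general idea of an index; it says nothing about how DSE Graph implements materialized view or secondary indexes. The `scan` and `build_index` helpers and the third movie record are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List

movies = [
    {"movieId": "m267", "title": "Alice in Wonderland", "year": 2010},
    {"movieId": "m1000", "title": "Sully", "year": 2016},
    {"movieId": "m1", "title": "Hypothetical Movie", "year": 2010},
]


def scan(rows: List[dict], key: str, value: Any) -> List[dict]:
    """Without an index: inspect every vertex."""
    return [r for r in rows if r[key] == value]


def build_index(rows: List[dict], key: str) -> Dict[Any, List[dict]]:
    """With an index: pay once up front, then answer lookups directly."""
    index: Dict[Any, List[dict]] = defaultdict(list)
    for r in rows:
        index[r[key]].append(r)
    return index


by_id = build_index(movies, "movieId")  # high cardinality: one movie per key
by_year = build_index(movies, "year")   # low cardinality: many movies per key

print(by_id["m267"][0]["title"])
print([m["title"] for m in by_year[2010]])
```

The two indexes have the same shape here; what differs is how many vertices sit behind each key, which is the cardinality distinction the slide draws between movieId and year.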
  • 53.
    Graph Schema API – Vertex Indexes (II)
    Efficiently retrieve all vertices with a known label and a given property value
    • Creating and using a full-text search index on a text property
        schema.vertexLabel("movie").index("search").search().by("title").asText()
          .add()
        g.V().has("movie","title",Search.tokenRegex("Wonder.*"))
    • Creating and using a string search index on a text property
        schema.vertexLabel("movie").index("search").search().by("country").asString()
          .add()
        g.V().has("movie","country",Search.prefix("U"))
  • 54.
    Graph Schema API – Property Indexes
    Efficiently retrieve properties of a known vertex that have associated meta-properties whose values are known or fall into a known range
    • Creating and using a property index
        schema.vertexLabel("movie").index("movieBudgetBySource")
          .property("budget").by("source")
          .add()
        schema.vertexLabel("movie").index("movieBudgetByDate")
          .property("budget").by("date")
          .add()
        g.V().has("movie","movieId","m267").properties("budget").
          has("source","Los Angeles Times").value()
    Diagram: Movie vertex with movieId: Text, title: Text, year: Int, duration: Int, country: Text, production: Text*, budget: Int* [source: Text, date: Timestamp]
  • 55.
    Graph Schema API – Edge Indexes (I)
    Efficiently traverse edges that are incident to a known vertex, have a known label, and have properties whose values are known or fall into a known range
    • Creating and using an edge index
    • Find how many users rated a particular movie with an 8-star rating
        schema.vertexLabel("movie").index("toUsersByRating")
          .inE("rated").by("rating")
          .add()
        g.V().has("movie", "movieId", "m267").
          inE("rated").has("rating",8).count()
    Diagram: User to Movie via rated (0..1, rating: int)
  • 56.
    Graph Schema API – Edge Indexes (II)
    Efficiently traverse edges that are incident to a known vertex, have a known label, and have properties whose values are known or fall into a known range
    • Creating and using an edge index
    • Find movies rated with a greater-than-7 rating by a particular user
        schema.vertexLabel("user").index("toMoviesByRating")
          .outE("rated").by("rating")
          .add()
        g.V().has("user", "userId", "u1").
          outE("rated").has("rating",gt(7)).inV()
    Diagram: User to Movie via rated (0..1, rating: int)
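The benefit of an edge index for a range predicate such as gt(7) can be sketched by keeping one vertex's edges sorted on the indexed property and using binary search. This is a plain-Python illustration of the idea under that assumption, not DSE Graph's storage layout; the ratings and movie ids are hypothetical.

```python
import bisect
from typing import List, Tuple

# Outgoing rated edges of one user as (rating, movieId) pairs; values are hypothetical.
rated_by_u1: List[Tuple[int, str]] = [
    (8, "m267"), (3, "m10"), (9, "m42"), (7, "m5"), (8, "m99"),
]

# The "index": this user's edges kept sorted by rating.
edges_by_rating = sorted(rated_by_u1)
ratings = [rating for rating, _ in edges_by_rating]


def movies_rated_above(threshold: int) -> List[str]:
    """Binary-search to the first edge with rating > threshold, then read the rest."""
    start = bisect.bisect_right(ratings, threshold)
    return [movie for _, movie in edges_by_rating[start:]]


def movies_rated_exactly(value: int) -> List[str]:
    lo = bisect.bisect_left(ratings, value)
    hi = bisect.bisect_right(ratings, value)
    return [movie for _, movie in edges_by_rating[lo:hi]]


print(movies_rated_above(7))
print(movies_rated_exactly(8))
```

Only the edges that satisfy the predicate are touched, which matters for vertices with very many incident edges, such as a popular movie with many rated edges.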
  • 57.
    Summary
  • 58.
    Summary - A Complete Integrated Solution for Graph
    • Server
    • Visual Management/Monitoring
    • Visual Development
    • Integrated Drivers (CQL, Gremlin, etc.): Java, Python, C++, more…
    Images: DataStax
  • 59.
    Trivadis Enterprise Knowledge Graph
    Diagram: graph model with the vertex types Presentation, Project, Employee, Event, Course, Term, Certification, Department, Location, Customer, Event Type, Project Type, Industry, Publication, Social Media Profile and Instant Messaging. Edge types include skill (level, since), teaches, main teacher, located in (since), related to, child of, author, participates in (since, role), part of (since), owns (since), owned by, created for, managed by, responsible unit, responsible (for), isOf, for, in, uses and previous year.
  • 60.
    Armasuisse W&T Social Network Graph
    Diagram: graph model with the vertex types Twitter User, Tweet, Term, Place, Youtube Channel, Youtube Video, Url, Person, Webpage, Geo Point and Entity. Edge types include publishes (timestamp), mentions (timestamp), uses, uses (time), followedBy, retweets, replies, owns, linksTo and located. Vertex properties shown include id, name, type, language, timestamp, url, shortUrl, street, country, location, altitude, targetIds, hasFavorites, hasLikes, hasDislikes, hasComments and lastUpdateTime.
  • 61.
    More Information
    • DS330: DataStax Enterprise Graph – self-paced course offered for free by DataStax
  • 62.
    Guido Schmutz
    Technology Manager
    guido.schmutz@trivadis.com
    @gschmutz
    guidoschmutz.wordpress.com