How To Model and Construct Graphs with Oracle Database (AskTOM Office Hours part 2)

Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |
Oracle Ask TOM Office Hours:
How To Model and Construct
Graphs with Oracle Database
Learn the first steps to use property graphs in the database – modeling and
constructing a graph. 2018.04.24
Albert Godfrind, Solutions Architect
Zhe Wu, Architect
Jean Ihm, Product Manager
Confidential – Oracle Internal/Restricted/Highly Restricted

AskTOM sessions on property graphs
• Today’s is the second session on property graphs
• February’s session covered “Introduction to Property Graphs”
– In case you missed it, recording is available at URL above
• Today’s topic: How To Model and Construct Graphs with Oracle Database
• Visit the Spatial and Graph landing page to view recordings; submit
feedback, questions, topic requests; view upcoming session dates and
topics; sign up
4
https://devgym.oracle.com/pls/apex/dg/office_hours/3084

Oracle Big Data Spatial and Graph
• Available for Big Data platform/BDCS
– Hadoop, HBase, Oracle NoSQL
• Supported both on BDA and commodity
hardware
– CDH and Hortonworks
• Database connectivity through Big Data
Connectors or Big Data SQL
• Included in Big Data Cloud Service
Oracle Spatial and Graph
• Available with Oracle 18c/12.2/DBCS
• Using tables for graph persistence
• Graph views on relational data
• In-database graph analytics
– Sparsification, shortest path, page rank, triangle
counting, WCC, sub graphs
• SQL queries possible
• Included in Database Cloud Service
5
Graph Product Options

Property Graph Database Features:
• Scalability and Performance
• Graph analytics
• Graph Visualization
• Graph Query Language (PGQL)
• Standard interfaces
• Graph views on relational and RDF data
• Integration with Machine Learning tools
6
Courtesy Tom Sawyer Perspectives
Courtesy Linkurious

Graph Data Access Layer (DAL)
Architecture Property Graph Support
Graph Analytics
Blueprints & Lucene/SolrCloud RDF (RDF/XML, N-
Triples, N-Quads,
TriG,N3,JSON)
REST/WebService/Notebooks
Java,Groovy,Python,…
Java APIs
Java APIs/JDBC/SQL/PLSQL
Property Graph
formats
GraphML
GML
Graph-SON
Flat Files
7
Scalable and Persistent Storage Management
Parallel In-Memory Graph
Analytics/Graph Query (PGX)
Oracle NoSQL DatabaseOracle RDBMS Apache HBase
Apache
Spark
Apache
Spark

8
Designing Graphs

• What is a graph?
– Data model representing entities as
vertices and relationships as edges
– Optionally including attributes
• What are typical graphs?
– Social Networks
• LinkedIn, Facebook, Google+, Twitter, ...
– Financial networks
• Fund transfers, ownerships, …
– Dependencies
• Complex system interactions
9
Graph Data Model
E
A D
C B
F

• Why are graphs popular?
– Easy data modeling
• „whiteboard friendly“
– Flexible data model
• No predefined schema, easily extensible
• Particularly useful for sparse data
– Insight from graphical representation
• Intuitive visualization
– Enabling new kinds of analytics
• Overcoming limitations in relational
technology
10
Graph Data Model
E
A D
C B
F

• A set of vertices (or nodes)
– each vertex has a unique identifier.
– each vertex has a set of in/out edges.
– each vertex has a collection of properties.
– vertices are of different kinds
• A set of edges (or links)
– each edge has a unique identifier.
– each edge has a head/tail vertex.
– each edge has a label denoting type of
relationship between two vertices.
– each edge has a collection of properties.
11
A More Detailed Look at the Graph Model
https://github.com/tinkerpop/blueprints/wiki/Property-Graph-Model

Copyright © 2017, Oracle and/or its affiliates. All rights reserved. | 12
Designing a Relational vs a Graph Model
Relational Graph
Relationships are implicit (primary / foreign keys) Relationships are explicit (edges)
Need technical expertise (normal form, …) Little or no technical expertise needed
Tools available (SQL Developer Data Modeler) No tools – just use a whiteboard !
Schema kept in dictionary No schema: add properties, vertex types, edge types at
will
Strong dependency between data model and
applications: changing the model impacts many
applications
Weak dependency: changing schema only impacts
applications that use the new properties, vertices or
edges

• Given a source Data Model
(Relational) ...
From Relational to Graph ...
• Obtain a graph …
Courtesy: Tom Sawyer 2017 13

Step 1: Understand the Source Data Models
• What are the entities?
– People, Departments
– Attributes of entities (name, ID, …)
• What are the relationships/links?
– Membership
– People-people relationship
– Department hierarchy
– Attributes of links(strength, startDate, location, …)
14Courtesy: Tom Sawyer 2017

Step 2: Think About What You Want To Achieve
• Navigation?
• Pattern matching?
• Analytics?
• Integration?
• Visualization?
• or something else?
15
Choice of edges and attributes e.g.
Tom  memberOf  Sales
Tom (memberOf:‘Sales,Marketing‘ )
Courtesy: Tom Sawyer 2017

Step 3: Construct the Graph Model
16
• Rows in tables usually
become vertices
• Columns become
properties on vertices
• Relationships become
edges
• Join tables in n:m relations
become relationships,
Courtesy: Tom Sawyer 2017

Step 4: Try Your Work Load and Repeat 1,2,3 if Necessary
It is OK and quite typical to do trial-and-error
17Courtesy: Tom Sawyer 2017

Example 1:
Human
Resources
JOBS
COUNTRIES
REGIONS
LOCATIONS
JOB_HISTORY
DEPARTMENTS EMPLOYEES

JOBS
COUNTRIES
REGIONS
LOCATIONS
EMPLOYEESDEPARTMENTS

Depart
ment
Employee
Job
Location
Works In
Located At
Occupation
Manages
Heads
Human
Resources
Graph
Model
Country and
region as
properties
History
integrated as
start/end dates
on edges

Depart
ment
Employee
Job
Location
Works In
Located At
Occupation
Manager
ManagesHeads
Manages
Human
Resources
Graph
Model
Manager as a
separate vertex
type

JOBS
LOCATIONS
EMPLOYEESDEPARTMENTS
PROJECTS
EMP_PROJ
COUNTRIES
REGIONS
Include
projects
Human
Resources
Schema
Extended

Depart
ment
Employee
Job
Location
Works In
Located At
Occupation
Manager
ManagesHeads
Manages
Human
Resources
Graph
Model
Project
Works
on
Include
projects

Manager
Employee
Job
Depart
ment Location
Occupation
Located
At
Manages
Works In
Heads
Manages
Project Works
on
Another
Graphic
Representation

PRODUCTS
ORDERS
CUSTOMERS
ORDER_LINES
Customer ProductPurchased
Example 2: Order Entry
Order
properties:
quantity, date
purchased ..

Example 3: Dependencies • Dependencies between
schema objects in a database
• Very useful for impact
analysis
• Map dependencies as a graph
desc dba_dependencies
Name Null? Type
----------------------- -------- --------------
OWNER NOT NULL VARCHAR2(128)
NAME NOT NULL VARCHAR2(128)
TYPE VARCHAR2(19)
REFERENCED_OWNER VARCHAR2(128)
REFERENCED_NAME VARCHAR2(128)
REFERENCED_TYPE VARCHAR2(19)
REFERENCED_LINK_NAME VARCHAR2(128)
DEPENDENCY_TYPE VARCHAR2(4)
Object
Depends
On

Type
View
Index Table
Column
Indexes
View On
Has Column
Uses
View On • More detailed graph model
• Can also include stored
functions, packages, triggers
• Extend with programs
References

28
Constructing Graphs

• Oracle Data Integrator (ODI)
• Data Access Layer (DAL) Java Programming APIs
• DAL Utility
– RDBMS Data Source to Graph
– CSV to Graph
• SQL
29
Several Simple Ways to Construct a Property Graph

Use Oracle Data Integrator (ODI) to Generate Graph
30

Oracle Data Integrator
Bulk Data Performance
Non Invasive Footprint
Future Proof IT Skills
Oracle Data Integrator provides high performance bulk
data movement, massively parallel data transformation
using database or big data technologies, and block-level
data loading that leverages native data utilities
Bulk Data
Transformation
Most Apps,
Databases
& Cloud Bulk Data Movement
Cloud
DBs
Big
Data
1000’s of
customers –
more than other
ETL tools
1000’s of
customers –
more than other
ETL tools
Flexible ELT
workloads run
anywhere: DBs,
Big Data, Cloud
Flexible ELT
workloads run
anywhere: DBs,
Big Data, Cloud
Up to 2x faster
batch processes
and 3x more
efficient tooling
Up to 2x faster
batch processes
and 3x more
efficient tooling

Key Architecture Benefits
Reverse Engineer
Metadata
Journalize
(CDC)
Load from
Source to
Staging
Check
Constraints
Integrate,
Transform
Data Service
32
Simpler Physical Design and Shorter Implementation Time
Knowledge Modules
Oracle Spark Hive HBase Pig Sqoop
BICS
Oracle
Datapump
Oracle
DBLink
JMS
External
Tables
Teradata
SAP Siebel
eBusiness
Suite
IBM DB2 Netezza SCD
Pluggable Knowledge Modules Architecture
Sample out-of-the-box Knowledge Modules
• Faster development and simpler
maintenance using templates
• Easy to extend and add new best practices
• Enforces predictability and reduces cost of
ownership

Property Graph Construction Work Flow with ODI
Define Source
Model
(payment data in CSV)
Define Target
Model
(data shape required
by Graph)
Generate Mapping Execute Conversion
Graph Model

Property Graph Construction Work Flow with ODI

Demo
35

Use DAL APIs to programmatically create a graph
36

opgtest = OraclePropertyGraph.getInstance(cfg);
v=opgtest.addVertex(100l);
v.setProperty("name", name);
v.setProperty("hiredate", d);
v2=opgtest.addVertex(102l);
e1 = opgtest.addEdge(1l, v1, v2, "likes");
e1.setProperty("strength", 0.8d);
…
OraclePropertyGraphUtils.exportFlatFiles(opgtest,
"/tmp/graph.opv",
"/tmp/graph.ope",
2,
false);
37
Using Data Access Layer (DAL) Java APIs
Building a Property Graph From Scratch
Incremental vertices update & set of properties
Incremental edges update & set of properties

Read from a data source &
Use DAL APIs to programmatically create a graph
38

Building a Property Graph from a Data Source
oracle=new Oracle("jdbc:oracle:thin:@(…)", "scott", "password");
cfg =…;
opgtest = OraclePropertyGraph.getInstance(cfg);
rs = oracle.executeQuery("select empno, ename, mgr, hiredate
from scott.emp");
while(rs.next()) {
eid=rs.getLong(1); name=rs.getString(2);
mgr=rs.getLong(3); d=rs.getDate(4);
v=opgtest.addVertex((Long)eid);
v.setProperty("name", name);
v.setProperty("hiredate", d); …
}
39
Querying a Relational Data Source to produce a Property Graph Data Model

Building a Property Graph from a Data Source
opg-hbase> opgtest.getVertices();
==>Vertex ID 7902 {name:str:FORD, hiredate:dat:1981-12-03}
==>Vertex ID 7654 {name:str:MARTIN, hiredate:dat:1981-09-28}
==>Vertex ID 7499 {name:str:ALLEN, hiredate:dat:1981-02-20}
==>Vertex ID 7369 {name:str:SMITH, hiredate:dat:1980-12-17}
==>Vertex ID 7844 {name:str:TURNER, hiredate:dat:1981-09-08}
==>Vertex ID 7521 {name:str:WARD, hiredate:dat:1981-02-22}
==>Vertex ID 7698 {name:str:BLAKE, hiredate:dat:1981-05-01}
==>Vertex ID 7782 {name:str:CLARK, hiredate:dat:1981-06-09}
==>Vertex ID 7788 {name:str:SCOTT, hiredate:dat:1987-04-19}
==>Vertex ID 7566 {name:str:JONES, hiredate:dat:1981-04-02}
==>Vertex ID 7900 {name:str:JAMES, hiredate:dat:1981-12-03}
==>Vertex ID 7839 {name:str:KING, hiredate:dat:1981-11-17}
==>Vertex ID 7934 {name:str:MILLER, hiredate:dat:1982-01-23}
==>Vertex ID 7876 {name:str:ADAMS, hiredate:dat:1987-05-23}
7521,hiredate,5,,,1981-02-22T00:00:00.000-05:00
7521,name,1,WARD,,
7788,hiredate,5,,,1987-04-19T00:00:00.000-04:00
7788,name,1,SCOTT,,
40
Generated output
Vertices converted from
Scott.EMP
Serialized Vertices in Flat File format

Generate a Property Graph (.opv/.ope) directly from
relational tables (views)
41

Graph Construction Work Flow From Relational Tables
Select Tables Define Mappings
(ColumnToAttrMappings)
Execute Conversion Load Graph
1101,name,1,Jean,,
1101,age,2,,20,
1101,salary,4,,120.0,
1102,name,1,Mary,,
…
100,1101,1002,works,weight,3,,1.0,
101,1110,1015,works,weight,3,,1.0,
102,1110,1102,works,weight,3,,1.0,
…
Vertex Flat-file
Edge Flat-file
name String
join_date Date
salary Double

Convert Relational Tables/Views to Flat Files: Vertices
43
1
1101,name,1,Mara%20Kessler,,
1101,join_date,5,,,2016-11-21T00:00:00.000-00:00
1101,location,1,East%20Pinkchester,,
1102,name,1,Brad%20Eichmann,,
1102,join_date,5,,,2018-05-02T00:00:00.000-00:00
1102,location,1,Kiannaside,,
…
name String
join_date Date
location String
employees Table
EMP_ID name join_date location …
1 Mara Kessler 2016-11-21
East
Pinkchester
…
2 Brad Eichmann 2018-05-02 Kiannaside …
… … … … …
8 Clotilde Romaguera 2018-06-16 Rodgerborough …

Convert from Relational to Flat Files: Vertices
String opv = "./employees-graph.opv";
OutputStream opvOS = new FileOutputStream(opv);
ColumnToAttrMapping[] ctams = new ColumnToAttrMapping[3];
ctams[0] = ColumnToAttrMapping.getInstance("name", "name", String.class);
ctams[1] = ColumnToAttrMapping.getInstance("join_date", "join date", Date.class);
ctams[2] = ColumnToAttrMapping.getInstance("loc", "location", String.class);
OraclePropertyGraphUtils.convertRDBMSTable2OPV(conn,
"employees",
"emp_id",
1100,
ctams,
8,
opvOS,
(DataConverterListener) null);
44
Specify the table to convert data from and the column
used to identify the vertices ID
Mappings between table columns &
vertex properties (including data types)
Specify the offset for vertex ID

Generate a Property Graph (.opv/.ope) directly from CSV
Files
45

Graph Construction Work Flow From CSV Files
Select CSV Files Define Mappings
(ColumnToAttrMappings)
Execute Conversion Load Graph
1101,name,1,Jean,,
1101,age,2,,20,
1101,salary,4,,120.0,
1102,name,1,Mary,,
…
100,1101,1002,works,weight,3,,1.0,
101,1110,1015,works,weight,3,,1.0,
102,1110,1102,works,weight,3,,1.0,
…
Vertex Flat-file
Edge Flat-file
name String
join_date Date
salary Double

Convert From CSV File to Flat File: Vertices
47
1
1101,name,1,John,,
1101,score,4,,4.2,
1101,age,2,,30,
1102,name,1,Mary,,
1102,score,4,,4.3,
1102,age,2,,32,
…
name String
score Float
age Integer
1,John,4.2,30
2,Mary,4.3,32
3,"Skywalker, Anakin",5.0,46
4,"Darth Vader",5.0,46
5,"Skywalker, Luke",5.0,53
…

Convert From CSV File to Flat File: Vertices
String inputCSV = "/path/mygraph-vertices.csv";
String outputOPV = "/path/mygraph.opv";
ColumnToAttrMapping[] ctams = new ColumnToAttrMapping[4];
ctams[0] = ColumnToAttrMapping.getInstance("VID", Long.class);
ctams[1] = ColumnToAttrMapping.getInstance("name", String.class);
ctams[2] = ColumnToAttrMapping.getInstance("score", Double.class);
ctams[3] = ColumnToAttrMapping.getInstance("age", Integer.class);
InputStream isCSV = new FileInputStream(inputCSV);
OutputStream osOPV = new FileOutputStream(new File(outputOPV));
OraclePropertyGraphUtilsBase.convertCSV2OPV(isCSV,
"VID",
0,
ctams,
1,
0, osOPV, null);
48
Specify the CSV File to convert data from and the
column used to identify the vertices ID
Mappings between table columns &
vertex properties (including data types)
Specify the offset for vertex ID
Specify the offset of lines to skip from file

VID:int,name,score:float,age:integer
1,John,4.2,30
2,Mary,4.3,32
3,"Skywalker, Anakin",5.0,46
4,"Darth Vader",5.0,46
5,"Skywalker, Luke",5.0,53
Convert From CSV File with Headers to Flat File: Vertices
49
1
name String
score Float
age Integer
1101,name,1,John,,
1101,score,4,,4.2,
1101,age,2,,30,
1102,name,1,Mary,,
1102,score,4,,4.3,
1102,age,2,,32,
…

Convert From CSV File with Headers to Flat File: Vertices
String inputCSV = "/path/mygraph-vertices.csv";
String outputOPV = "/path/mygraph.opv";
InputStream isCSV = new FileInputStream(inputCSV);
OutputStream osOPV = new FileOutputStream(new File(outputOPV));
OraclePropertyGraphUtilsBase.convertCSV2OPV(isCSV,
"VID",
0 /* Vertex ID offset */,
null,
1 /* Degree of Parallelism */,
0 /* Lines to skip from file */,
osOPV, (DataConverterListener)null);
5050
Mappings between CSV columns & vertex properties will
be read from the header of the CSV file

Convert CSV Files with Headers to Flat Files: Edges
51
1,30,0,friends,since,5,,,2010-04-14T00:00:00.000-00:00
1,30,0,friends,weight,3,,1.0,
3,48,36,colleagues,since,5,,,2013-11-07T00:00:00.000-00:00
3,48,36,colleagues,weight,3,,1.0,
…
10,21,16,manager,since,5,,,2014-03-05T00:00:00.000-00:00
10,21,16,manager,weight,3,,1.0,
0
30
‘friends’since Date
weight Float
EID,SINCE,SRC,STRENGTH,DEST,RELATION
1,14-04-2010,1,30,0,friends
3,07-11,2013,1,48,36,friends
...
10,05-03-2014,0,21,16,manager

Convert From CSV File with Headers to Flat File: Edges
String inputCSE = "/path/mygraph-edges.csv";
String outputOPE = "/path/mygraph.ope";
InputStream isCSE = new FileInputStream(inputCSE);
OutputStream osOPE = new FileOutputStream(new File(outputOPE));
OraclePropertyGraphUtilsBase.convertCSV2OPE(isCSE,
"EID",
0 /* Edge ID offset */,
"SRC",
"DEST",
true, "RELATION",
null,
1 /* Degree of Parallelism */,
0 /* Lines to skip from file */,
osOPE,
(DataConverterListener)null);
5252
Mappings between CSV columns & vertex
properties will be read from the header of the CSV
file
Specify the columns where source and
destination vertex IDs will be taken from

Customization of CSV Conversion to Flat Files
53
OraclePropertyGraphUtilsBase.convertCSV2OPE(isCSV,
"eid",
100 /* edge ID offset */,
"person" /* source vertex column */,
"product" /* destination vertex column */,
10 /* offset source vertex IDs */,
50 /* offset destination vertex IDs */,
false /* has edge column? */,
"bought",
null /* ctams mappings */,
1 /* degree of parallelism */,
‘|’,
‘"’,
1,
new SimpleDateFormat("dd/MM/yyyy"),
true,
osOPE,
null) ;
Specify default edge label
Delimiter character
Quotation character
Date format
Skip CSV records
Allow support for multiline in CSV records

Construct a property graph with SQL
54

Graph Construction with SQL
Confidential – Oracle Internal/Restricted/Highly Restricted 55
--
-- A Helper SQL View to simplify comparison of multiple vertex properties
--
CREATE VIEW BankXpivot AS
SELECT * FROM ( SELECT vid, k, v FROM BankXVT$)
PIVOT (MIN(v) FOR k IN ('Address' AS Address,
'Address1' AS Address1,
'Address2' AS Address2,
'AppId' AS AppId,
'CIO' AS CIO , 'In-Address' AS In_Address,
'In-Port' AS In_Port, 'LOB' AS LOB,
'Label' AS Label, 'Name' AS Name,
'Out-Address' AS Out_Address,
'Out-Port' AS Out_Port,
'Type' AS Type, 'VertexType' AS VertexType,
'Web' AS Web )
)
• Use native SQL to construct edges (modeling complex business relationships) on the fly

Graph Construction with SQL
Confidential – Oracle Internal/Restricted/Highly Restricted 56
--
-- App to LOB where Appname.LOB = LOB.LOB
-- Edge keys: LOB
--
INSERT INTO BankX_pivotGE$ (eid, svid, dvid, el, k, t, v)
SELECT ROWNUM+100000 AS eid,
apps.vid AS svid,
lobs.vid AS dvid,
'Owner' AS el,
'LOB' AS k,
1 AS t,
apps.lob AS v
FROM ( SELECT * FROM BankXpivot WHERE VertexType = 'App') apps
JOIN ( SELECT * FROM BankXpivot WHERE VertexType = 'LOB') lobs
ON apps.lob = lobs.label
ORDER BY svid, dvid
• Use native SQL

58
Resources

Resources
• Oracle Spatial and Graph & Big Data Spatial and Graph on OTN
oracle.com/technetwork/database/options/spatialandgraph
oracle.com/technetwork/database/database-technologies/bigdata-spatialandgraph
– White papers, software downloads, documentation and videos
• Blogs – examples, tips & tricks
blogs.oracle.com/oraclespatial | blogs.oracle.com/bigdataspatialgraph
• Property Graphs 101: How to Get Started with Property Graphs on the Oracle Database –
Arthur Dayton, Vlamis Software video https://youtu.be/QSj0zOjOAWI |
blog post http://www.vlamis.com/blog/2018/2/5/creating-a-property-graph-on-oracle-database
• YouTube channel https://www.youtube.com/channel/UCZqBavfLlCuS0il6zNY696w
• Oracle Big Data Lite Virtual Machine - a free sandbox to get started
www.oracle.com/technetwork/database/bigdata-appliance/oracle-bigdatalite-2104726.html
– Hands On Lab included in /opt/oracle/oracle-spatial-graph/ or http://github.com/oracle/BigDataLite/
59

Resources – social media and online communities
• Follow the product team: @JeanIhm, @agodfrin, @alanzwu, @SpatialHannes
• Oracle Spatial and Graph SIG user groups (search “Oracle Spatial and
Graph Community”)
60

AskTOM sessions on property graphs
• Next Spatial and Graph session in late May
– Topic: Performing Graph Analytics
• View recordings, submit feedback, questions,
topic requests, view upcoming session dates
and topics, sign up to get regular updates
61

62
Thanks for attending! See you next month.

How To Model and Construct Graphs with Oracle Database (AskTOM Office Hours part 2)

How To Model and Construct Graphs with Oracle Database (AskTOM Office Hours part 2)

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to How To Model and Construct Graphs with Oracle Database (AskTOM Office Hours part 2)

Similar to How To Model and Construct Graphs with Oracle Database (AskTOM Office Hours part 2) (20)

Recently uploaded

Recently uploaded (20)

How To Model and Construct Graphs with Oracle Database (AskTOM Office Hours part 2)