Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
Analyzing Blockchain and Bitcoin Transaction Data as Graph
Oracle Code | 2018-06-12 | Funkhaus Berlin
Karin Patenge |  karin.patenge@oracle.com
Business Development Manager Technology
Oracle Deutschland B.V. & Co. KG
Hans Viehmann |  hans.viehmann@oracle.com
Product Manager Spatial and Graph Technologies
Oracle Corporation
Acknowledgement
• This presentation is based on the work of:
• Zhe (Alan) Wu
• Architect for Graph and Semantic Technologies @ Oracle Corporation
• Email: alan.wu@oracle.com
@kpatenge @alanzwu @SpatialHannes
Agenda
• Modeling of Bitcoin Transactions
• Questions of Interest
• Data Processing Workflow
• Summary
• Q&A
Setting the Scene: Analyze Bitcoin Transaction Data
Setting the Scene: Interesting Patterns in Bitcoin Transaction Data
Source: http://blockchain.info
What does a Bitcoin Transaction look like?
• A transaction has input(s) and output(s)
– An input comes from an output of a(nother) transaction
TX hash: 6f7cf9580f1c2dfb3c4d5d043cdbb128c640e3f20161245aa7372e9666168516
TX outputSum : 10000000000
-- TX Input from: ff3dc8b461305acc5900d31602f2dafebfc406e5b050b14a352294f0965e0bf6:0
-- TX Input from: 2db69558056d0132d9848851fd20329be9cd590fa5ae2b3c55f58931f42e27f7:0
-- TX Output value: 10000000000
-- TX Output scriPubAddr: 12higDjoCCNXSA95xZMWUdPvXNmkAduhWv
Note: Amounts are given in satoshi; 1,000,000 satoshi is 0.01 BTC, so the output value 10000000000 is 100 BTC
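The unit conversion and the "hash:index" input references above can be sketched as follows. This is a minimal illustration, not part of the original deck; the helper names satoshi_to_btc and split_outpoint and the dictionary layout are assumptions chosen for readability.

```python
from decimal import Decimal

SATOSHI_PER_BTC = 100_000_000  # 1 BTC = 10^8 satoshi

def satoshi_to_btc(satoshi):
    """Convert an integer satoshi amount to BTC without floating-point error."""
    return Decimal(satoshi) / Decimal(SATOSHI_PER_BTC)

def split_outpoint(ref):
    """Split 'txhash:index' into the referenced transaction hash and output index."""
    tx_hash, index = ref.rsplit(":", 1)
    return tx_hash, int(index)

# The transaction shown on the slide (dictionary layout is hypothetical).
tx = {
    "hash": "6f7cf9580f1c2dfb3c4d5d043cdbb128c640e3f20161245aa7372e9666168516",
    "inputs": [
        "ff3dc8b461305acc5900d31602f2dafebfc406e5b050b14a352294f0965e0bf6:0",
        "2db69558056d0132d9848851fd20329be9cd590fa5ae2b3c55f58931f42e27f7:0",
    ],
    "outputs": [
        {"value": 10_000_000_000, "address": "12higDjoCCNXSA95xZMWUdPvXNmkAduhWv"},
    ],
}

output_sum = sum(output["value"] for output in tx["outputs"])
print(satoshi_to_btc(output_sum), "BTC")
print([split_outpoint(ref) for ref in tx["inputs"]])
```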
What does a Bitcoin Transaction look like?
• A transaction has input(s) and output(s)
– An input comes from an output of a(nother) transaction
[Figure: transactions TX1, TX3, TX8 and TX9 connected to addresses Addr X, Addr K, Addr L, Addr Y and Addr Z; $ marks each transferred amount]
What does a Graph look like?
• A graph has vertices (entities), edges (relationships), and properties
– Also known as linked data
[Figure: the same example drawn as a graph, with TX1, TX3, TX8, TX9 and Addr X, Addr K, Addr L, Addr Y, Addr Z as vertices and $ amounts on the edges]
Modeling Bitcoin Transactions as a Graph
• Model 1
– Vertices: Transaction, Address
– Edges: Transaction references (TX -> TX, TX -> Addr)
• Model 2
– Vertices: Transaction, Address
– Edges: Transaction's indirect reference to Address (Addr -> TX -> Addr)
• Model 3
– Vertices: Address
– Edges: Address to Address payment (Addr -> Addr)
[Figure: three panels, one per model, each drawn over the same example of transactions TX1, TX3, TX8, TX9 and addresses Addr X, Addr K, Addr L, Addr Y, Addr Z]
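To make the difference between the models concrete, the sketch below derives Model 2 and Model 3 edges from a small set of transactions. The toy data and the function names model2_edges and model3_edges are hypothetical and only illustrate the idea; they are not taken from the demo.

```python
from itertools import product

# Hypothetical toy data: for each transaction, the addresses funding its
# inputs and the addresses receiving its outputs.
transactions = {
    "TX1": {"inputs": ["Addr X", "Addr Y"], "outputs": ["Addr K"]},
    "TX3": {"inputs": ["Addr K"], "outputs": ["Addr L", "Addr Z"]},
}

def model2_edges(txs):
    """Model 2: Addr -> TX edges for inputs, TX -> Addr edges for outputs."""
    edges = []
    for tx_id, tx in txs.items():
        edges += [(addr, tx_id) for addr in tx["inputs"]]
        edges += [(tx_id, addr) for addr in tx["outputs"]]
    return edges

def model3_edges(txs):
    """Model 3: Addr -> Addr edges; the transaction becomes an edge property."""
    edges = []
    for tx_id, tx in txs.items():
        for src, dst in product(tx["inputs"], tx["outputs"]):
            edges.append((src, dst, tx_id))
    return edges

for edge in model3_edges(transactions):
    print(edge)
```

Model 3 drops the transaction vertices entirely, which keeps the graph smaller and lets address-level algorithms run directly on it.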
Bitcoin Transactions as a Graph: Money Flow
• Graph Model 3
– What is Addr X's contribution to Addr K?
– Given an input address i, output address o
-> Contribution of i to o is: [formula shown as an image on the slide; not preserved in this text]
[Figure: the example graph with input addresses marked i, the output address marked o, and an Amount label on each contributing edge]
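The slide's formula is not available in this text. One plausible way to compute such a contribution, shown here purely as an assumption, is proportional attribution: each input address is credited with its share of the total input, applied to the amount received by the output address. The function name and the sample amounts are hypothetical.

```python
from fractions import Fraction

def contribution(input_amounts, output_amounts, i, o):
    """Amount of output o attributed to input i, ASSUMING proportional attribution.

    input_amounts / output_amounts map address -> satoshi for one transaction.
    This is an illustrative definition, not necessarily the one on the slide.
    """
    total_in = sum(input_amounts.values())
    return Fraction(input_amounts[i], total_in) * output_amounts[o]

# Hypothetical transaction: X and Y pay in, K and L are paid out.
inputs = {"Addr X": 30, "Addr Y": 10}
outputs = {"Addr K": 20, "Addr L": 20}

print(contribution(inputs, outputs, "Addr X", "Addr K"))
```

Under this assumption the contributions of all inputs to one output add up to that output's amount, so the resulting Addr -> Addr edges of Model 3 conserve the money flow.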
Bitcoin Transactions as a Graph: Workflow
Functions of a Graph Database
• Retrieving & Parsing Data
• Data Preparation
• Graph Generation & Loading
• Graph Querying & Analysis
• Graph Visualization
Modeling Data as Graphs
The more connected the data is, the better a Graph fits
Oracle NoSQL DB with Big Data Spatial and Graph
Graphic source: http://www.ateam-oracle.com/intro-to-graphs-at-oracle/
What is a Property Graph?
• A set of nodes (aka vertices)
– each vertex has a unique identifier
– each vertex has a set of in/out edges
– each vertex has a collection of key-value properties
• A set of edges
– each edge has a unique identifier
– each edge has a head/tail vertex
– each edge has a label denoting the type of relationship between two vertices
– each edge has a collection of key-value properties
• Blueprints Java APIs
• Implementations
– Oracle (Spatial and Graph, Big Data Spatial and Graph), Neo4j, DataStax (Titan), InfiniteGraph, Dex, Sail, MongoDB, …
https://github.com/tinkerpop/blueprints/wiki/Property-Graph-Model
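The property graph definition above maps naturally onto two small record types. The following is a minimal sketch of that data model, not the Blueprints API; the class and function names are assumptions, and the address strings are placeholders.

```python
from dataclasses import dataclass, field

@dataclass
class Vertex:
    id: int                                         # unique identifier
    properties: dict = field(default_factory=dict)  # key-value properties
    # Edge lists are excluded from repr/compare to avoid cyclic recursion.
    out_edges: list = field(default_factory=list, repr=False, compare=False)
    in_edges: list = field(default_factory=list, repr=False, compare=False)

@dataclass
class Edge:
    id: int                                         # unique identifier
    tail: Vertex                                    # source vertex (edge goes out of it)
    head: Vertex                                    # destination vertex (edge comes into it)
    label: str                                      # type of relationship
    properties: dict = field(default_factory=dict)  # key-value properties

def add_edge(edge_id, tail, head, label, **properties):
    """Create an edge and register it on both of its end vertices."""
    edge = Edge(edge_id, tail, head, label, dict(properties))
    tail.out_edges.append(edge)
    head.in_edges.append(edge)
    return edge

# Placeholder address values; only the structure matters here.
x = Vertex(1, {"bt_addr": "address-of-x"})
k = Vertex(2, {"bt_addr": "address-of-k"})
e = add_edge(1, x, k, "contrib", amount=5.0e9)
print(e.label, e.properties["amount"], len(k.in_edges))
```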
Property Graph Support
[Figure: architecture diagram with the following components]
• REST/Web Service/Notebooks; Java, Groovy, Python, …; Java APIs; Java APIs/JDBC/SQL/PLSQL
• Graph Analytics: Parallel In-Memory Graph Analytics (PGX) / Graph Querying (PGQL); Apache Spark
• Graph Data Access Layer (DAL): Blueprints & Lucene/SolrCloud; RDF (RDF/XML, N-Triples, N-Quads, TriG, N3, JSON)
• Property Graph formats: GraphML, GML, GraphSON, Flat Files
• Scalable and Persistent Storage Management: Oracle NoSQL Database, Oracle RDBMS, Apache HBase
Demo Environment
• Available for free: Oracle Big Data Lite VM 4.11 running in Oracle VirtualBox
  http://www.oracle.com/technetwork/database/bigdata-appliance/oracle-bigdatalite-2104726.html
  – Oracle NoSQL Database (kvlite: unclustered -> 1 node, no replication)
  – Big Data Spatial and Graph (BDSG) 2.4
    • Property Graph Analytics Engine (PGX), Property Graph Query Language (PGQL)
    • Gremlin, Apache Groovy (Shell)
    • Zeppelin Notebook with PGX Interpreter
  – Property Graph Format
    • Oracle Flat Files
  – Cytoscape 3.6.0
    • Big Data Spatial and Graph 2.4 support installed
Oracle Flat File Format: Vertices
Bitcoin transaction data sample
[oracle@bigdatalite data]$ head -n 5 btc.opv
1,bt_addr,1,1111111111111111111114oLvT2,,
2,bt_addr,1,11126yHiXjavR3oNVwV2GRNso2ah4MnZtm,,
3,bt_addr,1,11128BtJwtyW4q9eRe3zts6BB4jg4uKLv8,,
4,bt_addr,1,111HnjYiCubyhPjtmZ7jEQjYcYBpKZHvJ,,
5,bt_addr,1,111KHWctzJ8tsTbittCDVzmTHVjxQR2g4,,
[oracle@bigdatalite data]$
Definition
Field #  Name        Description
1        vertex_ID   An integer that uniquely identifies the vertex
2        key_name    The name of the key in the key-value pair
3        value_type  1=String, 2=Integer, 3=Float, ...
4        value       The encoded, non-null value of key_name when it is neither numeric nor a timestamp (date)
5        value       The encoded, non-null value of key_name when it is numeric
6        value       The encoded, non-null value of key_name when it is a timestamp (date)
Source: http://blockchain.info
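A line of the vertex file can be decoded with a few lines of code. The sketch below is illustrative only: the function name parse_opv_line is an assumption, it handles just the three value types listed in the table, and it assumes that encoded values contain no literal commas.

```python
def parse_opv_line(line):
    """Parse one .opv record into (vertex_id, key_name, value).

    Assumes the six comma-separated fields described in the table and
    handles only value_type 1 (String), 2 (Integer) and 3 (Float).
    """
    vertex_id, key_name, value_type, text, number, timestamp = line.strip().split(",")
    if value_type == "1":
        value = text
    elif value_type == "2":
        value = int(number)
    elif value_type == "3":
        value = float(number)
    else:
        # Other types (e.g. timestamps) are returned undecoded in this sketch.
        value = text or number or timestamp
    return int(vertex_id), key_name, value

sample = [
    "1,bt_addr,1,1111111111111111111114oLvT2,,",
    "2,bt_addr,1,11126yHiXjavR3oNVwV2GRNso2ah4MnZtm,,",
]
for record in sample:
    print(parse_opv_line(record))
```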
Oracle Flat File Format: Edges
Bitcoin transaction data sample
[oracle@bigdatalite data]$ head -n 10 btc.ope
1,317335,91594,contrib,trans_hash,1,4391b11d991e7c9ad4f9a1a5a7ea9ed7f234643b0c883f49511e1394a5ab8ff5,,
1,317335,91594,contrib,amount,3,,5.0E9,
2,357443,91594,contrib,trans_hash,1,4391b11d991e7c9ad4f9a1a5a7ea9ed7f234643b0c883f49511e1394a5ab8ff5,,
2,357443,91594,contrib,amount,3,,5.0E9,
3,352850,91594,contrib,trans_hash,1,4391b11d991e7c9ad4f9a1a5a7ea9ed7f234643b0c883f49511e1394a5ab8ff5,,
3,352850,91594,contrib,amount,3,,5.0E9,
4,308829,91594,contrib,trans_hash,1,4391b11d991e7c9ad4f9a1a5a7ea9ed7f234643b0c883f49511e1394a5ab8ff5,,
4,308829,91594,contrib,amount,3,,5.0E9,
5,314511,11714,contrib,trans_hash,1,2e8250e9f3f8043cdad60f747982275fee2a1836ebb48b2f620d03371be8e3f6,,
5,314511,11714,contrib,amount,3,,5.0E9,
[oracle@bigdatalite data]$
Definition
Field #  Name              Description
1        edge_ID           An integer that uniquely identifies the edge
2        source_vertex_ID  The vertex_ID of the outgoing tail of the edge
3        dest_vertex_ID    The vertex_ID of the incoming head of the edge
4        edge_label        The encoded label of the edge, which describes the relationship between the two vertices
5        key_name          The encoded name of the key in a key-value pair
6        value_type        1=String, 2=Integer, 3=Float, ...
7        value             The encoded, non-null value of key_name when it is neither numeric nor a timestamp (date)
8        value             The encoded, non-null value of key_name when it is numeric
9        value             The encoded, non-null value of key_name when it is a timestamp (date)
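Each edge property occupies its own line in the edge file, so the lines belonging to one edge_ID have to be merged when reading. The sketch below shows one way to do that; the function names and the dictionary layout are assumptions, only value types 1 to 3 are decoded, and encoded values are assumed to contain no literal commas.

```python
def parse_ope_line(line):
    """Parse one .ope record into (edge_id, source, dest, label, key_name, value)."""
    (edge_id, source, dest, label, key_name,
     value_type, text, number, timestamp) = line.strip().split(",")
    if value_type == "1":
        value = text
    elif value_type == "2":
        value = int(number)
    elif value_type == "3":
        value = float(number)
    else:
        value = text or number or timestamp  # left undecoded in this sketch
    return int(edge_id), int(source), int(dest), label, key_name, value

def load_edges(lines):
    """Merge all property lines that share an edge_ID into one edge record."""
    edges = {}
    for line in lines:
        edge_id, source, dest, label, key_name, value = parse_ope_line(line)
        edge = edges.setdefault(
            edge_id,
            {"source": source, "dest": dest, "label": label, "properties": {}},
        )
        edge["properties"][key_name] = value
    return edges

sample = [
    "1,317335,91594,contrib,trans_hash,1,4391b11d991e7c9ad4f9a1a5a7ea9ed7f234643b0c883f49511e1394a5ab8ff5,,",
    "1,317335,91594,contrib,amount,3,,5.0E9,",
    "2,357443,91594,contrib,amount,3,,5.0E9,",
]
edges = load_edges(sample)
print(edges[1])
```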
Graph Generation and Loading using Vertices & Edges files
// Start Groovy Shell connecting to Oracle NoSQL DB (run these two lines in a terminal)
cd /opt/oracle/oracle-spatial-graph/property_graph/dal/groovy
./gremlin-opg-nosql.sh
// Host and port of the Oracle NoSQL DB node
server = new ArrayList();
server.add("bigdatalite.localdomain:5000");
// Create a graph config with graph name "btc"
// Name of key-value store is "kvstore"
// Make sure to add all vertex/edge properties needed
cfg = GraphConfigBuilder.forPropertyGraphNosql() \
.setName("btc") \
.setStoreName("kvstore") \
.setHosts(server) \
.addVertexProperty("bt_addr", PropertyType.STRING, "NA") \
.addEdgeProperty("amount", PropertyType.FLOAT, 1.0f) \
.hasEdgeLabel(true) \
.setLoadEdgeLabel(true) \
.setMaxNumConnections(2) \
.build();
// Create an instance of the graph
opg = OraclePropertyGraph.getInstance(cfg);
opg.getKVStoreConfig();
// Prepare for data load: remove any existing data for this graph
opg.setClearTableDOP(2);
opg.clearRepository();
// Create an instance for the graph data loader
opgdl=OraclePropertyGraphDataLoader.getInstance();
// Flat files with vertices & edges of Bitcoin txs
vfile="/home/oracle/Documents/BTC/data/btc.opv";
efile="/home/oracle/Documents/BTC/data/btc.ope";
// Load data into the graph (degree of parallelism: 2)
opgdl.loadData(opg, vfile, efile, 2);
// Do some checks
// Count vertices and edges
opg.countVertices();
opg.countEdges();
// Get vertices and edges
opg.getVertices();
opg.getEdges();
...
// Shut down instance and close shell
opg.shutdown();
:q
Graph Querying and Analysis
PGX – Graph Analytics Engine
• Toolkit for In-Memory, Parallel Graph Analysis containing
– PGX shell
– Analyst API with a large collection of built-in Graph algorithms
– and more
• Developed by Oracle Labs
– http://www.oracle.com/technetwork/oracle-labs/parallel-graph-analytix/overview/index.html
– https://event.cwi.nl/grades/2018/07-VanRest.pdf
– https://docs.oracle.com/cd/E56133_01/latest/tutorials/index.html
PGQL – Property Graph Query Language
• SQL-like Graph Pattern Matching
– WHERE clause: a set of comma-separated constraints
• Developed by Oracle Labs
– http://pgql-lang.org/
• Proposed for standardization
Analyze Bitcoin Transaction Data using PGX
• Start PGX server
/opt/oracle/oracle-spatial-graph/property_graph/pgx/bin/start-server
• Start / Return to Groovy Shell
// Create in-memory analyst session
session=Pgx.createSession("session_ID_1");
analyst=session.createAnalyst();
// Read the graph from Oracle NoSQL DB into memory
pgxGraph = session.readGraphWithProperties(opg.getConfig());
// Working with In-Memory Analyst
// Execute PageRank (max. error 0.0001, damping factor 0.85, max. 100 iterations)
rank=analyst.pagerank(pgxGraph, 0.0001, 0.85, 100);
// Get top 10 vertices
rank.getTopKValues(10);
// Betweenness Centrality
bc=analyst.vertexBetweennessCentrality(pgxGraph);
// Get top 10 vertices
bc.getTopKValues(10);
...
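To show what the three numeric PageRank arguments control, here is a small pure-Python PageRank. It is a sketch of the textbook algorithm, not the PGX implementation; the function name, the convergence test (sum of absolute rank changes) and the toy edges are assumptions.

```python
def pagerank(edges, tolerance=0.0001, damping=0.85, max_iterations=100):
    """Iterative PageRank over (source, destination) pairs.

    tolerance      -- stop once the total rank change falls below this value
    damping        -- probability of following an edge instead of jumping
    max_iterations -- hard cap on the number of iterations
    """
    vertices = sorted({v for edge in edges for v in edge})
    out_degree = {v: 0 for v in vertices}
    incoming = {v: [] for v in vertices}
    for src, dst in edges:
        out_degree[src] += 1
        incoming[dst].append(src)

    n = len(vertices)
    rank = {v: 1.0 / n for v in vertices}
    for _ in range(max_iterations):
        # Rank held by vertices without outgoing edges is spread evenly.
        dangling = sum(rank[v] for v in vertices if out_degree[v] == 0)
        new_rank = {}
        for v in vertices:
            received = sum(rank[u] / out_degree[u] for u in incoming[v])
            new_rank[v] = (1 - damping) / n + damping * (received + dangling / n)
        change = sum(abs(new_rank[v] - rank[v]) for v in vertices)
        rank = new_rank
        if change < tolerance:
            break
    return rank

# Hypothetical Addr -> Addr payments (Model 3).
toy_edges = [("X", "K"), ("Y", "K"), ("K", "L")]
ranks = pagerank(toy_edges)
for vertex in sorted(ranks, key=ranks.get, reverse=True):
    print(vertex, round(ranks[vertex], 3))
```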
Analyze Bitcoin Transaction Data using PGX
Using Zeppelin Notebook with PGX Interpreter
Querying Property Graph Data using PGQL
• Topology constraints
▪ (n)-[e]->(m)
▪ (n)-[e1]->(m1), (n)-[e2]->(m2)
▪ (n1)-[e1]->(n2)-[e2]->(n3)-[e3]->(n4)
▪ (n1)-[e1]->(n2)<-[e2]-(n3)
• Label matching
▪ (x:Person) -[e:likes]-> (y:Person)
▪ (:Person) -[:likes]-> (:Person)
▪ (x:Student|Professor) -[e:likes|knows]-> (y:Student|Professor)
• Value constraints
▪ (x) -> (y), x.name = 'John', y.age > 25
• In-Line constraints
▪ (n WITH name = 'John' OR name = 'James', type = 'Person') -[e WITH type = 'workAt', workHours < 40]-> ()
• …
Syntax form                                          Example
Basic form                                           (n)-[e]->(m)
Omit variable name of the source vertex              ()-[e]->(m)
Omit variable name of the destination vertex         (n)-[e]->()
Omit variable names in both vertices                 ()-[e]->()
Omit variable name in edge                           (n)-->(m)
Omit variable name in edge (alternative, one dash)   (n)->(m)
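What a single-edge pattern with label and value constraints does can be imitated in a few lines. This is only a conceptual sketch of pattern matching, not how PGQL or PGX evaluate queries; the match function and the toy Person data are hypothetical.

```python
# Hypothetical toy graph: vertices keyed by id, edges as dictionaries.
vertices = {
    1: {"label": "Person", "name": "John", "age": 31},
    2: {"label": "Person", "name": "Mary", "age": 27},
    3: {"label": "Person", "name": "James", "age": 19},
}
edges = [
    {"src": 1, "dst": 2, "label": "likes"},
    {"src": 1, "dst": 3, "label": "likes"},
    {"src": 2, "dst": 1, "label": "knows"},
]

def match(vertices, edges, src_label=None, edge_label=None, dst_label=None,
          where=lambda x, e, y: True):
    """Yield (x, e, y) for the pattern (x:src_label) -[e:edge_label]-> (y:dst_label).

    A label of None matches anything; `where` plays the role of the
    comma-separated value constraints.
    """
    for e in edges:
        x, y = vertices[e["src"]], vertices[e["dst"]]
        if src_label is not None and x["label"] != src_label:
            continue
        if edge_label is not None and e["label"] != edge_label:
            continue
        if dst_label is not None and y["label"] != dst_label:
            continue
        if where(x, e, y):
            yield x, e, y

# (x:Person) -[e:likes]-> (y:Person), x.name = 'John', y.age > 25
rows = [
    (x["name"], y["name"])
    for x, e, y in match(
        vertices, edges, "Person", "likes", "Person",
        where=lambda x, e, y: x["name"] == "John" and y["age"] > 25,
    )
]
print(rows)
```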
Query Bitcoin Transaction Data using PGQL
// Some PGQL queries
// Explore relationships in the graph: number of edges per edge label
pgxResultSet = pgxGraph.queryPgql("SELECT e.label(), count(*) WHERE (n) -[e]-> (m) GROUP BY e.label() ORDER BY count(*) DESC");
pgxResultSet.print();
// Find the most collaborative Bitcoin addresses (most outgoing contrib edges)
pgxResultSet = pgxGraph.queryPgql("SELECT n, count(*) WHERE (n) -[e:contrib]-> (m) GROUP BY n ORDER BY count(*) DESC LIMIT 10");
pgxResultSet.print(3);
// Find the least collaborative Bitcoin addresses (fewest outgoing contrib edges)
pgxResultSet = pgxGraph.queryPgql("SELECT n, count(*) WHERE (n) -[e:contrib]-> (m) GROUP BY n ORDER BY count(*) ASC");
pgxResultSet.print(3);
// InDegree count
pgxResultSet = pgxGraph.queryPgql("SELECT y.id(), y.bt_addr, x.inDegree() WHERE (x) -> (y), x.inDegree() > 1000 ORDER BY x.inDegree() DESC");
pgxResultSet.print(3);
...
https://blogs.oracle.com/bigdataspatialgraph/how-many-ways-to-run-property-graph-query-language-pgql-in-bdsg-i
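The GROUP BY queries above amount to counting edges per vertex. The same aggregation can be sketched outside the database as follows; the first five edges reuse the vertex IDs from the edge file sample, the last one is hypothetical, and the variable names are assumptions.

```python
from collections import Counter

# (source_vertex_ID, dest_vertex_ID, edge_label)
edges = [
    (317335, 91594, "contrib"),
    (357443, 91594, "contrib"),
    (352850, 91594, "contrib"),
    (308829, 91594, "contrib"),
    (314511, 11714, "contrib"),
    (317335, 11714, "contrib"),  # hypothetical extra edge for illustration
]

# SELECT e.label(), count(*) ... GROUP BY e.label()
per_label = Counter(label for _, _, label in edges)

# SELECT n, count(*) WHERE (n) -[e:contrib]-> (m) GROUP BY n ORDER BY count(*) DESC
out_counts = Counter(src for src, _, label in edges if label == "contrib")

# In-degree per vertex, comparable to inDegree()
in_degree = Counter(dst for _, dst, _ in edges)

print(per_label)
print(out_counts.most_common(3))
print(in_degree.most_common(3))
```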
Query Bitcoin Transaction Data using PGQL
Using Zeppelin Notebook with PGX Interpreter
Visualize Bitcoin Transaction Data using Cytoscape
Pattern Analysis 01
Pattern Analysis 02: Addresses with incoming TXs only
Pattern Analysis 03: Degree of Centrality
Summary
Graph capabilities in Oracle Big Data Spatial and Graph
• Graph databases are powerful tools, complementing relational databases
– Especially strong for analysis of graph topology and connectedness
• Graph analytics offer new insight
– Especially into relationships, dependencies and behavioural patterns
• Oracle Property Graph technology offers
– Comprehensive analytics through various APIs, integration with relational database
– Scalable, parallel in-memory processing
– Secure and scalable graph storage using Oracle NoSQL, HBase or Oracle Database
• Available both on premises and in the Cloud
Property Graph running in the Oracle Cloud
Additional Information: PGX - Built-in Package
Rich set of built-in parallel graph algorithms
… and parallel graph mutation operations
Resources on Oracle's Property Graph Support
• Getting Started – Creating a Property Graph on Oracle Database by Arthur Dayton (Vlamis Software Solutions): https://blogs.oracle.com/oraclespatial/getting-started-creating-a-property-graph-on-oracle-database
• Improve your Meetup Experience using Graph Analytics by Karin Patenge (Oracle): https://de.slideshare.net/kpatenge
• Big Data Spatial and Graph In-Memory Analyst Java API: https://docs.oracle.com/bigdata/bda411/PGXJV/toc.htm
• Oracle Big Data Spatial and Graph on Oracle.com: www.oracle.com/database/big-data-spatial-and-graph
• OTN product page (white papers, software downloads, documentation, tutorials): www.oracle.com/technetwork/database/database-technologies/bigdata-spatialandgraph
• Oracle Big Data Lite Virtual Machine - a free sandbox to get started: www.oracle.com/technetwork/database/bigdata-appliance/oracle-bigdatalite-2104726.html
• Hands On Lab for Big Data Spatial: tinyurl.com/BDSG-HOL
• Blog – Examples, Tips & Tricks: blogs.oracle.com/bigdataspatialgraph