GraphGen: Conducting
Graph Analytics over
Relational Databases
Konstantinos Xirogiannopoulos
Amol Deshpande
[Figure: example graph. Nodes: Konstantinos, Amol, University of Maryland, PyData DC (Year: 2016). Edges: collaborated, works_at, works_at, gave_talk.]
Graph Analytics (Network Science):
Using graph algorithms on the connections between
entities in a network to gain insight into those
entities, the network, or both.
1) Why graph analytics?
2) How are graph analytics done currently?
3) What are most people dealing with?
4) Bolt-on graph analytics with GraphGen
5) The GraphGen Language
Graphs Across Domains
Protein-protein
interaction networks
Financial transaction
networks
Stock Trading Networks
Social Networks
Federal Funds Networks
Knowledge Graph
World Wide Web
Communication
Networks
Citation Networks
…...
http://go.umd.edu/graphs
Example Use cases
● Financial crimes
(e.g. money
laundering)
● Fraudulent
transactions
● Cybercrime
● Counterterrorism
● Key players in a
network
● Ranking entities (web
pages, PageRank)
● Providing connection
recommendations to
users
● Optimizing
transportation
routes
● Identifying
weaknesses in
power grids, water
grids etc.
● Computer networks
● Medical Research
● Disease pathology
● DNA Sequencing
2) How are graph analytics done currently?
Types of Graph Analytics
● Graph “queries”: Subgraph pattern matching, shortest
paths, temporal queries
● Real Time Analytics: Anomaly/Event detection, online
prediction
● Batch Analytics (Network Science): Centrality analysis,
community detection, network evolution
● Machine Learning: Matrix factorization, logistic
regression modeled as message passing in specially
structured graphs.
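To make two of these categories concrete, here is a minimal sketch of one graph "query" (shortest path via breadth-first search) and one batch analytic (degree centrality). The toy graph and the function names are illustrative assumptions, not part of GraphGen or of the talk.

```python
from collections import deque

# Hypothetical toy graph, stored as an undirected adjacency mapping.
GRAPH = {
    "a": {"b", "c"},
    "b": {"a", "d"},
    "c": {"a", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}


def shortest_path_length(graph, source, target):
    """Breadth-first search: hops from source to target, or None if unreachable."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def degree_centrality(graph):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(graph)
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


hops = shortest_path_length(GRAPH, "a", "e")
centrality = degree_centrality(GRAPH)
```

Libraries such as NetworkX provide tuned versions of both; the point here is only that "queries" touch a small part of the graph while batch analytics visit every node.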
State of the art
● Graph Analytics tasks are too widely varied
● There is no one-size-fits-all solution
○ RDBMS/Hadoop/Spark have their tradeoffs
● Fragmented area with little consensus
❖ Specialized graph databases (Neo4j, Titan, Blazegraph, Cayley, Dgraph)
❖ RDF stores (AllegroGraph, Jena)
❖ Bolt-on solutions (Teradata SQL-Graph, SAP Graph Engine,
Oracle)
❖ Distributed batch processing systems (Giraph, GraphX,
GraphLab). Lots of ETL required!
❖ Many more research prototypes...
Different Analytics Flows
[Figure: analytics flows compared for Graph Databases, Bolt-On Solutions, and Other Systems]
What should I use then??
● What fraction of the overall workload is
graph-oriented?
● How often do graph analytics
need to run?
● Do you need to do graph updates?
● What types of analytics are required?
● How large would the graphs be?
● Are you starting from scratch or do you have an
already deployed DBMS?
3) What are most people dealing with?
Where’s the Data?
● Most business analytics (querying, reporting,
OLAP) happen in SQL
● Organizations typically model their data
according to their needs
● Graph databases make sense if you have strictly
graph-centric workloads
● Most likely organized in some type of database schema
● A collection of tables related to each other through
common attributes or primary-/foreign-key constraints
We need to extract connections between entities
Most Likely...
Lots of “hidden” graphs
● Let’s take TPC-H.
Part(part_key, supplier_key, ...)
Customer(customer_key, customer_name, ...)
Orders(order_key, part_key, customer_key, ...)
Supplier(supplier_key, supplier_name, ...)
● We could create edges
between two customers if
they’ve:
○ Bought the same item
○ Bought the same item on
the same day
○ Bought from the same
supplier
○ Etc.
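The first rule above ("bought the same item") amounts to a self-join of Orders on the part key. The sketch below illustrates that with Python's built-in sqlite3 module and made-up rows; the column names follow the schema on this slide, and this is not a description of how GraphGen performs the extraction internally.

```python
import sqlite3

# In-memory database with a hypothetical, trimmed-down Orders table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (order_key INTEGER, part_key INTEGER, customer_key INTEGER);
INSERT INTO Orders VALUES
    (1, 100, 1),
    (2, 100, 2),
    (3, 200, 2),
    (4, 200, 3),
    (5, 300, 3);
""")

# Self-join Orders on part_key. The "<" comparison keeps one undirected
# edge per customer pair and drops self-loops.
edges = conn.execute("""
    SELECT DISTINCT o1.customer_key, o2.customer_key
    FROM Orders AS o1
    JOIN Orders AS o2 ON o1.part_key = o2.part_key
    WHERE o1.customer_key < o2.customer_key
    ORDER BY 1, 2
""").fetchall()

print(edges)
```

Adding an order date to the join condition would give the "same item on the same day" variant; joining through Part to Supplier would give the "same supplier" variant.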
4) Bolt-on graph analytics with GraphGen
GraphGen
Extract and analyze
many different kinds
of graphs
Simple, Intuitive,
Declarative Language,
No ETL required
Full Graph API & Vertex
Centric Framework
GraphGen Interfaces
● Native Java Library
● Python wrapper Library
● GraphGen Explorer: UI Web Application
GraphGen Explorer Web App
● Exploration of database schema to detect
different types of hidden graphs.
● Allows users to visually explore potential
graphs.
● Simple statistics and on-the-fly analysis
Not all graphs will be useful!
GraphGen Explorer Web App
GraphgenPy in Python
from graphgenpy import GraphGenerator
import networkx as nx

# Define GraphGen Query
datalogQuery = """
Nodes(ID, Name) :- Author(ID, Name).
Edges(ID1, ID2) :- AuthorPublication(ID1, PubID), AuthorPublication(ID2, PubID).
"""

# Database Credentials: host, port, database name, username, password
gg = GraphGenerator("localhost", "5432", "testgraphgen", "kostasx", "password")

# Generate and Serialize Graph (written out in GML format)
fname = gg.generateGraph(datalogQuery, "extracted_graph", GraphGenerator.GML)

# Load Graph into NetworkX, using the GML "id" attribute as the node label
G = nx.read_gml(fname, 'id')
print("Graph Loaded into NetworkX! Running PageRank...")

# Run Any Algorithm on the graph using NetworkX
print(nx.pagerank(G))
print("Done!")
Native GraphGen in Java
// Database Credentials: establish connection to database
GraphGenerator ggen = new GraphGenerator("host", "port", "dbName",
        "username", "password");

// Define GraphGen Query: define and evaluate a single graph extraction query
String datalog_query = "...";
// Extract and Load Graph
Graph g = ggen.generateGraph(datalog_query).get(0);

// Initialize vertex-centric object
VertexCentric p = new VertexCentric(g);

// Define Vertex Centric Program: the vertex-centric compute function
Executor program = new Executor("result_value_name") {
    @Override
    public void compute(Vertex v, VertexCentric p) {
        // implementation of compute function
    }
};

// Run Program: begin execution
p.run(program);
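For readers unfamiliar with the vertex-centric model, the following is a rough Python sketch of the idea: every vertex repeatedly runs the same compute function over its neighbors' values until nothing changes. The names run_vertex_centric and min_label are hypothetical and do not mirror GraphGen's Executor or VertexCentric classes; the example labels each connected component with its smallest vertex ID.

```python
def run_vertex_centric(adjacency, init, compute, max_supersteps=50):
    """Apply compute(vertex, own_value, neighbor_values) to every vertex in
    synchronous supersteps until no value changes or the step limit is hit."""
    values = {v: init(v) for v in adjacency}
    for _ in range(max_supersteps):
        # Each vertex only sees the values from the previous superstep.
        new_values = {
            v: compute(v, values[v], [values[n] for n in adjacency[v]])
            for v in adjacency
        }
        if new_values == values:
            break
        values = new_values
    return values


def min_label(vertex, own_value, neighbor_values):
    """Compute function: adopt the smallest label seen so far."""
    return min([own_value] + neighbor_values)


# Hypothetical undirected graph with two components: {1, 2, 3} and {4, 5}.
ADJ = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4]}
components = run_vertex_centric(ADJ, init=lambda v: v, compute=min_label)
print(components)
```

A real engine would run the per-vertex calls in parallel; the sequential loop here only shows the programming model.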
// Database Credentials: establish connection to database
GraphGenerator ggen = new GraphGenerator("host", "port", "dbName",
        "username", "password");

// Define GraphGen Query: define and evaluate a single graph extraction query
String datalog_query = "...";
// Extract and Load Graph
Graph g = ggen.generateGraph(datalog_query).get(0);

// Use Full API to access the Graph
for (Vertex v : g.getVertices()) {
    // For each neighbor
    for (Vertex neighbor : v.getVertices(Direction.OUT)) {
        // Do something
    }
}
GraphGen Back-End Architecture
5) The GraphGen Language
GraphGen DSL
● Intuitive Domain Specific Language based on Datalog
● User needs to specify:
○ How the nodes are defined
○ How the edges are defined
● The query is executed, and the user gets a Graph object
to operate upon.
● Very expressive: Allows for homogeneous and
heterogeneous graphs with various types of nodes and
edges.
TPC-H Database
Part(partKey, supplierKey, ...)
Customer(customerKey, customerName, ...)
Orders(orderKey, partKey, customerKey, ...)
Supplier(supplierKey, supplierName, ...)
● We want to explore a graph of customers!
● Using the GraphGen Language:
○ Which tables do we need to combine to extract
the nodes and edges?
GraphGen DSL Example
Nodes(ID, Name) :- Customer(ID, Name).
● Creates a node out of each row in the Customer table
■ Customer ID and Name as properties
Edges(ID1, ID2) :-
Orders(_,partKey, ID1), Orders(_,partKey, ID2).
● Connect ID1 -> ID2 if they have both ordered the same part
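One way to read the two rules is as plain set operations over rows. The sketch below evaluates them in Python on made-up Customer and Orders rows; the data, the extract_graph helper, and the choice to drop self-loops (which the rule as written would also produce) are assumptions for illustration, not GraphGen's actual evaluation strategy.

```python
# Hypothetical rows mirroring Customer(ID, Name) and
# Orders(orderKey, partKey, customerKey).
customers = [(1, "Ann"), (2, "Bob"), (3, "Cy")]
orders = [(10, 100, 1), (11, 100, 2), (12, 200, 2), (13, 200, 3)]


def extract_graph(customers, orders):
    # Nodes(ID, Name) :- Customer(ID, Name).
    nodes = {cid: {"Name": name} for cid, name in customers}

    # Edges(ID1, ID2) :- Orders(_, partKey, ID1), Orders(_, partKey, ID2).
    # Group customers by the shared variable partKey, then pair them up.
    buyers_by_part = {}
    for _, part_key, cid in orders:
        buyers_by_part.setdefault(part_key, set()).add(cid)

    edges = set()
    for buyers in buyers_by_part.values():
        for id1 in buyers:
            for id2 in buyers:
                if id1 != id2:  # assumption: self-loops are not wanted
                    edges.add((id1, id2))
    return nodes, edges


nodes, edges = extract_graph(customers, orders)
print(sorted(edges))
```

The repeated variable partKey in the rule body is what expresses the join: both Orders atoms must agree on it for an edge to be emitted.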
GraphGen
● Enable extraction of
different types of hidden
graphs
● Independent of where the
data is stored (given SQL)
● Enable complex analytics
over the extracted graphs
● Efficient extraction
through various
in-memory
representations
● Efficient analysis
through a parallel
execution engine
● Effortless through a
Declarative Language
● Eliminates the need
for complex ETL
● Intuitive and swift
analysis of any graph
that exists in your
data!
Download GraphGen at:
konstantinosx.github.io/graphgen-project/
DDL Blog Post at:
blog.districtdatalabs.com/graph-analytics-over-relational-datasets
Email: kostasx@cs.umd.edu
Twitter: @kxirog
Thank you!

More Related Content

What's hot

How Graph Databases started the Multi Model revolution
How Graph Databases started the Multi Model revolutionHow Graph Databases started the Multi Model revolution
How Graph Databases started the Multi Model revolutionLuca Garulli
 
Neo4j-Databridge: Enterprise-scale ETL for Neo4j
Neo4j-Databridge: Enterprise-scale ETL for Neo4jNeo4j-Databridge: Enterprise-scale ETL for Neo4j
Neo4j-Databridge: Enterprise-scale ETL for Neo4jGraphAware
 
Graph Gurus 15: Introducing TigerGraph 2.4
Graph Gurus 15: Introducing TigerGraph 2.4 Graph Gurus 15: Introducing TigerGraph 2.4
Graph Gurus 15: Introducing TigerGraph 2.4 TigerGraph
 
Graphs are everywhere! Distributed graph computing with Spark GraphX
Graphs are everywhere! Distributed graph computing with Spark GraphXGraphs are everywhere! Distributed graph computing with Spark GraphX
Graphs are everywhere! Distributed graph computing with Spark GraphXAndrea Iacono
 
GraphDB Cloud: Enterprise Ready RDF Database on Demand
GraphDB Cloud: Enterprise Ready RDF Database on DemandGraphDB Cloud: Enterprise Ready RDF Database on Demand
GraphDB Cloud: Enterprise Ready RDF Database on DemandOntotext
 
Continuous delivery for machine learning
Continuous delivery for machine learningContinuous delivery for machine learning
Continuous delivery for machine learningRajesh Muppalla
 
Extending Spark Graph for the Enterprise with Morpheus and Neo4j
Extending Spark Graph for the Enterprise with Morpheus and Neo4jExtending Spark Graph for the Enterprise with Morpheus and Neo4j
Extending Spark Graph for the Enterprise with Morpheus and Neo4jDatabricks
 
Automatic Detection of Web Trackers by Vasia Kalavri
Automatic Detection of Web Trackers by Vasia KalavriAutomatic Detection of Web Trackers by Vasia Kalavri
Automatic Detection of Web Trackers by Vasia KalavriFlink Forward
 
Graph Gurus Episode 12: Tiger Graph v2.3 Overview
Graph Gurus Episode 12: Tiger Graph v2.3 OverviewGraph Gurus Episode 12: Tiger Graph v2.3 Overview
Graph Gurus Episode 12: Tiger Graph v2.3 OverviewTigerGraph
 
Federated Queries Across Both Different Storage Mediums and Different Data En...
Federated Queries Across Both Different Storage Mediums and Different Data En...Federated Queries Across Both Different Storage Mediums and Different Data En...
Federated Queries Across Both Different Storage Mediums and Different Data En...VMware Tanzu
 
Speed layer : Real time views in LAMBDA architecture
Speed layer : Real time views in LAMBDA architecture Speed layer : Real time views in LAMBDA architecture
Speed layer : Real time views in LAMBDA architecture Tin Ho
 
Graph Gurus Episode 1: Enterprise Graph
Graph Gurus Episode 1: Enterprise GraphGraph Gurus Episode 1: Enterprise Graph
Graph Gurus Episode 1: Enterprise GraphTigerGraph
 
When Apache Spark Meets TiDB with Xiaoyu Ma
When Apache Spark Meets TiDB with Xiaoyu MaWhen Apache Spark Meets TiDB with Xiaoyu Ma
When Apache Spark Meets TiDB with Xiaoyu MaDatabricks
 
Connected datalondon metadata-driven apps
Connected datalondon metadata-driven appsConnected datalondon metadata-driven apps
Connected datalondon metadata-driven appsConnected Data World
 
GraphFrames: DataFrame-based graphs for Apache® Spark™
GraphFrames: DataFrame-based graphs for Apache® Spark™GraphFrames: DataFrame-based graphs for Apache® Spark™
GraphFrames: DataFrame-based graphs for Apache® Spark™Databricks
 
OracleCode_Berlin_Jun2018_AnalyzeBitcoinTransactionDataUsingAsGraph
OracleCode_Berlin_Jun2018_AnalyzeBitcoinTransactionDataUsingAsGraphOracleCode_Berlin_Jun2018_AnalyzeBitcoinTransactionDataUsingAsGraph
OracleCode_Berlin_Jun2018_AnalyzeBitcoinTransactionDataUsingAsGraphKarin Patenge
 
SHACL-based data life cycle management
SHACL-based data life cycle managementSHACL-based data life cycle management
SHACL-based data life cycle managementConnected Data World
 
MongoDB Atlas Workshop - Singapore
MongoDB Atlas Workshop - SingaporeMongoDB Atlas Workshop - Singapore
MongoDB Atlas Workshop - SingaporeAshnikbiz
 
Credit Fraud Prevention with Spark and Graph Analysis
Credit Fraud Prevention with Spark and Graph AnalysisCredit Fraud Prevention with Spark and Graph Analysis
Credit Fraud Prevention with Spark and Graph AnalysisJen Aman
 
MongoDB & Hadoop - Understanding Your Big Data
MongoDB & Hadoop - Understanding Your Big DataMongoDB & Hadoop - Understanding Your Big Data
MongoDB & Hadoop - Understanding Your Big DataMongoDB
 

What's hot (20)

How Graph Databases started the Multi Model revolution
How Graph Databases started the Multi Model revolutionHow Graph Databases started the Multi Model revolution
How Graph Databases started the Multi Model revolution
 
Neo4j-Databridge: Enterprise-scale ETL for Neo4j
Neo4j-Databridge: Enterprise-scale ETL for Neo4jNeo4j-Databridge: Enterprise-scale ETL for Neo4j
Neo4j-Databridge: Enterprise-scale ETL for Neo4j
 
Graph Gurus 15: Introducing TigerGraph 2.4
Graph Gurus 15: Introducing TigerGraph 2.4 Graph Gurus 15: Introducing TigerGraph 2.4
Graph Gurus 15: Introducing TigerGraph 2.4
 
Graphs are everywhere! Distributed graph computing with Spark GraphX
Graphs are everywhere! Distributed graph computing with Spark GraphXGraphs are everywhere! Distributed graph computing with Spark GraphX
Graphs are everywhere! Distributed graph computing with Spark GraphX
 
GraphDB Cloud: Enterprise Ready RDF Database on Demand
GraphDB Cloud: Enterprise Ready RDF Database on DemandGraphDB Cloud: Enterprise Ready RDF Database on Demand
GraphDB Cloud: Enterprise Ready RDF Database on Demand
 
Continuous delivery for machine learning
Continuous delivery for machine learningContinuous delivery for machine learning
Continuous delivery for machine learning
 
Extending Spark Graph for the Enterprise with Morpheus and Neo4j
Extending Spark Graph for the Enterprise with Morpheus and Neo4jExtending Spark Graph for the Enterprise with Morpheus and Neo4j
Extending Spark Graph for the Enterprise with Morpheus and Neo4j
 
Automatic Detection of Web Trackers by Vasia Kalavri
Automatic Detection of Web Trackers by Vasia KalavriAutomatic Detection of Web Trackers by Vasia Kalavri
Automatic Detection of Web Trackers by Vasia Kalavri
 
Graph Gurus Episode 12: Tiger Graph v2.3 Overview
Graph Gurus Episode 12: Tiger Graph v2.3 OverviewGraph Gurus Episode 12: Tiger Graph v2.3 Overview
Graph Gurus Episode 12: Tiger Graph v2.3 Overview
 
Federated Queries Across Both Different Storage Mediums and Different Data En...
Federated Queries Across Both Different Storage Mediums and Different Data En...Federated Queries Across Both Different Storage Mediums and Different Data En...
Federated Queries Across Both Different Storage Mediums and Different Data En...
 
Speed layer : Real time views in LAMBDA architecture
Speed layer : Real time views in LAMBDA architecture Speed layer : Real time views in LAMBDA architecture
Speed layer : Real time views in LAMBDA architecture
 
Graph Gurus Episode 1: Enterprise Graph
Graph Gurus Episode 1: Enterprise GraphGraph Gurus Episode 1: Enterprise Graph
Graph Gurus Episode 1: Enterprise Graph
 
When Apache Spark Meets TiDB with Xiaoyu Ma
When Apache Spark Meets TiDB with Xiaoyu MaWhen Apache Spark Meets TiDB with Xiaoyu Ma
When Apache Spark Meets TiDB with Xiaoyu Ma
 
Connected datalondon metadata-driven apps
Connected datalondon metadata-driven appsConnected datalondon metadata-driven apps
Connected datalondon metadata-driven apps
 
GraphFrames: DataFrame-based graphs for Apache® Spark™
GraphFrames: DataFrame-based graphs for Apache® Spark™GraphFrames: DataFrame-based graphs for Apache® Spark™
GraphFrames: DataFrame-based graphs for Apache® Spark™
 
OracleCode_Berlin_Jun2018_AnalyzeBitcoinTransactionDataUsingAsGraph
OracleCode_Berlin_Jun2018_AnalyzeBitcoinTransactionDataUsingAsGraphOracleCode_Berlin_Jun2018_AnalyzeBitcoinTransactionDataUsingAsGraph
OracleCode_Berlin_Jun2018_AnalyzeBitcoinTransactionDataUsingAsGraph
 
SHACL-based data life cycle management
SHACL-based data life cycle managementSHACL-based data life cycle management
SHACL-based data life cycle management
 
MongoDB Atlas Workshop - Singapore
MongoDB Atlas Workshop - SingaporeMongoDB Atlas Workshop - Singapore
MongoDB Atlas Workshop - Singapore
 
Credit Fraud Prevention with Spark and Graph Analysis
Credit Fraud Prevention with Spark and Graph AnalysisCredit Fraud Prevention with Spark and Graph Analysis
Credit Fraud Prevention with Spark and Graph Analysis
 
MongoDB & Hadoop - Understanding Your Big Data
MongoDB & Hadoop - Understanding Your Big DataMongoDB & Hadoop - Understanding Your Big Data
MongoDB & Hadoop - Understanding Your Big Data
 

Similar to GraphGen: Conducting Graph Analytics over Relational Databases

Lambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big dataLambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big dataTrieu Nguyen
 
20181123 dn2018 graph_analytics_k_patenge
20181123 dn2018 graph_analytics_k_patenge20181123 dn2018 graph_analytics_k_patenge
20181123 dn2018 graph_analytics_k_patengeKarin Patenge
 
Multiplatform Spark solution for Graph datasources by Javier Dominguez
Multiplatform Spark solution for Graph datasources by Javier DominguezMultiplatform Spark solution for Graph datasources by Javier Dominguez
Multiplatform Spark solution for Graph datasources by Javier DominguezBig Data Spain
 
SDSC18 and DSATL Meetup March 2018
SDSC18 and DSATL Meetup March 2018 SDSC18 and DSATL Meetup March 2018
SDSC18 and DSATL Meetup March 2018 CareerBuilder.com
 
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...TigerGraph
 
The Apache Solr Semantic Knowledge Graph
The Apache Solr Semantic Knowledge GraphThe Apache Solr Semantic Knowledge Graph
The Apache Solr Semantic Knowledge GraphTrey Grainger
 
Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022
Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022
Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022ArangoDB Database
 
ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....
ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....
ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....ArangoDB Database
 
How Graph Databases used in Police Department?
How Graph Databases used in Police Department?How Graph Databases used in Police Department?
How Graph Databases used in Police Department?Samet KILICTAS
 
Graph protocol for accessing information about blockchains and d apps
Graph protocol for accessing information about blockchains and d appsGraph protocol for accessing information about blockchains and d apps
Graph protocol for accessing information about blockchains and d appsGene Leybzon
 
201411203 goto night on graphs for fraud detection
201411203 goto night on graphs for fraud detection201411203 goto night on graphs for fraud detection
201411203 goto night on graphs for fraud detectionRik Van Bruggen
 
aRangodb, un package per l'utilizzo di ArangoDB con R
aRangodb, un package per l'utilizzo di ArangoDB con RaRangodb, un package per l'utilizzo di ArangoDB con R
aRangodb, un package per l'utilizzo di ArangoDB con RGraphRM
 
Monitoring Big Data Systems - "The Simple Way"
Monitoring Big Data Systems - "The Simple Way"Monitoring Big Data Systems - "The Simple Way"
Monitoring Big Data Systems - "The Simple Way"Demi Ben-Ari
 
Discovering Emerging Tech through Graph Analysis - Henry Hwangbo @ GraphConne...
Discovering Emerging Tech through Graph Analysis - Henry Hwangbo @ GraphConne...Discovering Emerging Tech through Graph Analysis - Henry Hwangbo @ GraphConne...
Discovering Emerging Tech through Graph Analysis - Henry Hwangbo @ GraphConne...Neo4j
 
Intro To Graph Databases - Oxana Goriuc
Intro To Graph Databases - Oxana GoriucIntro To Graph Databases - Oxana Goriuc
Intro To Graph Databases - Oxana GoriucFraugster
 
BIG GRAPH: TOOLS, TECHNIQUES, ISSUES, CHALLENGES AND FUTURE DIRECTIONS
BIG GRAPH: TOOLS, TECHNIQUES, ISSUES, CHALLENGES AND FUTURE DIRECTIONSBIG GRAPH: TOOLS, TECHNIQUES, ISSUES, CHALLENGES AND FUTURE DIRECTIONS
BIG GRAPH: TOOLS, TECHNIQUES, ISSUES, CHALLENGES AND FUTURE DIRECTIONScscpconf
 
Big Graph : Tools, Techniques, Issues, Challenges and Future Directions
Big Graph : Tools, Techniques, Issues, Challenges and Future Directions Big Graph : Tools, Techniques, Issues, Challenges and Future Directions
Big Graph : Tools, Techniques, Issues, Challenges and Future Directions csandit
 
Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022
Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022
Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022ArangoDB Database
 
Machine Learning + Graph Databases for Better Recommendations
Machine Learning + Graph Databases for Better RecommendationsMachine Learning + Graph Databases for Better Recommendations
Machine Learning + Graph Databases for Better RecommendationsChristopherWoodward16
 

Similar to GraphGen: Conducting Graph Analytics over Relational Databases (20)

Lambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big dataLambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big data
 
20181123 dn2018 graph_analytics_k_patenge
20181123 dn2018 graph_analytics_k_patenge20181123 dn2018 graph_analytics_k_patenge
20181123 dn2018 graph_analytics_k_patenge
 
Handout: 'Open Source Tools & Resources'
Handout: 'Open Source Tools & Resources'Handout: 'Open Source Tools & Resources'
Handout: 'Open Source Tools & Resources'
 
Multiplatform Spark solution for Graph datasources by Javier Dominguez
Multiplatform Spark solution for Graph datasources by Javier DominguezMultiplatform Spark solution for Graph datasources by Javier Dominguez
Multiplatform Spark solution for Graph datasources by Javier Dominguez
 
SDSC18 and DSATL Meetup March 2018
SDSC18 and DSATL Meetup March 2018 SDSC18 and DSATL Meetup March 2018
SDSC18 and DSATL Meetup March 2018
 
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
 
The Apache Solr Semantic Knowledge Graph
The Apache Solr Semantic Knowledge GraphThe Apache Solr Semantic Knowledge Graph
The Apache Solr Semantic Knowledge Graph
 
Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022
Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022
Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022
 
ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....
ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....
ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....
 
How Graph Databases used in Police Department?
How Graph Databases used in Police Department?How Graph Databases used in Police Department?
How Graph Databases used in Police Department?
 
Graph protocol for accessing information about blockchains and d apps
Graph protocol for accessing information about blockchains and d appsGraph protocol for accessing information about blockchains and d apps
Graph protocol for accessing information about blockchains and d apps
 
201411203 goto night on graphs for fraud detection
201411203 goto night on graphs for fraud detection201411203 goto night on graphs for fraud detection
201411203 goto night on graphs for fraud detection
 
aRangodb, un package per l'utilizzo di ArangoDB con R
aRangodb, un package per l'utilizzo di ArangoDB con RaRangodb, un package per l'utilizzo di ArangoDB con R
aRangodb, un package per l'utilizzo di ArangoDB con R
 
Monitoring Big Data Systems - "The Simple Way"
Monitoring Big Data Systems - "The Simple Way"Monitoring Big Data Systems - "The Simple Way"
Monitoring Big Data Systems - "The Simple Way"
 
Discovering Emerging Tech through Graph Analysis - Henry Hwangbo @ GraphConne...
Discovering Emerging Tech through Graph Analysis - Henry Hwangbo @ GraphConne...Discovering Emerging Tech through Graph Analysis - Henry Hwangbo @ GraphConne...
Discovering Emerging Tech through Graph Analysis - Henry Hwangbo @ GraphConne...
 
Intro To Graph Databases - Oxana Goriuc
Intro To Graph Databases - Oxana GoriucIntro To Graph Databases - Oxana Goriuc
Intro To Graph Databases - Oxana Goriuc
 
BIG GRAPH: TOOLS, TECHNIQUES, ISSUES, CHALLENGES AND FUTURE DIRECTIONS
BIG GRAPH: TOOLS, TECHNIQUES, ISSUES, CHALLENGES AND FUTURE DIRECTIONSBIG GRAPH: TOOLS, TECHNIQUES, ISSUES, CHALLENGES AND FUTURE DIRECTIONS
BIG GRAPH: TOOLS, TECHNIQUES, ISSUES, CHALLENGES AND FUTURE DIRECTIONS
 
Big Graph : Tools, Techniques, Issues, Challenges and Future Directions
Big Graph : Tools, Techniques, Issues, Challenges and Future Directions Big Graph : Tools, Techniques, Issues, Challenges and Future Directions
Big Graph : Tools, Techniques, Issues, Challenges and Future Directions
 
Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022
Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022
Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022
 
Machine Learning + Graph Databases for Better Recommendations
Machine Learning + Graph Databases for Better RecommendationsMachine Learning + Graph Databases for Better Recommendations
Machine Learning + Graph Databases for Better Recommendations
 

Recently uploaded

Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...nirzagarg
 
Case Study 4 Where the cry of rebellion happen?
Case Study 4 Where the cry of rebellion happen?Case Study 4 Where the cry of rebellion happen?
Case Study 4 Where the cry of rebellion happen?RemarkSemacio
 
💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...
💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...
💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...vershagrag
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteedamy56318795
 
Introduction to Statistics Presentation.pptx
Introduction to Statistics Presentation.pptxIntroduction to Statistics Presentation.pptx
Introduction to Statistics Presentation.pptxAniqa Zai
 
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...HyderabadDolls
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRajesh Mondal
 
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...SOFTTECHHUB
 
GraphGen: Conducting Graph Analytics over Relational Databases

  • 1. GraphGen: Conducting Graph Analytics over Relational Databases Konstantinos Xirogiannopoulos Amol Deshpande
  • 2. [Figure: example property graph. Nodes: Name: Konstantinos; Name: Amol; Name: University of Maryland; Name: PyData DC, Year: 2016. Edge labels: collaborated, gave_talk, works_at (×2).]
  • 3. Graph Analytics: (Network Science) Leveraging of connections between entities in a network towards gaining insight about said entities and/or the network via the use of graph algorithms.
  • 4. 1) Why graph analytics? 2) How are graph analytics done currently? 3) What are most people dealing with? 4) Bolt-on graph analytics with GraphGen 5) The GraphGen Language
  • 5. Graphs Across Domains Protein-protein interaction networks Financial transaction networks Stock Trading Networks Social Networks Federal Funds Networks Knowledge Graph World Wide Web Communication Networks Citation Networks …... http://go.umd.edu/graphs
  • 6. Example Use cases ● Financial crimes (e.g. money laundering) ● Fraudulent transactions ● Cybercrime ● Counterterrorism ● Key players in a network ● Ranking entities (web pages, PageRank) ● Providing connection recommendations to users ● Optimizing transportation routes ● Identifying weaknesses in power grids, water grids etc. ● Computer networks ● Medical Research ● Disease pathology ● DNA Sequencing
  • 7. 1) Why graph analytics? 2) How are graph analytics done currently? 3) What are most people dealing with? 4) Bolt-on graph analytics with GraphGen 5) The GraphGen Language
  • 8. Types of Graph Analytics ● Graph “queries”: Subgraph pattern matching, shortest paths, temporal queries ● Real Time Analytics: Anomaly/Event detection, online prediction ● Batch Analytics (Network Science): Centrality analysis, community detection, network evolution ● Machine Learning: Matrix factorization, logistic regression modeled as message passing in specially structured graphs. http://go.umd.edu/graphs
  • 9. State of the art ● Graph analytics tasks are too widely varied ● There is no one-size-fits-all solution ○ RDBMS/Hadoop/Spark have their tradeoffs ● Fragmented area with little consensus ❖ Specialized graph databases (Neo4j, Titan, Blazegraph, Cayley, Dgraph) ❖ RDF stores (AllegroGraph, Jena) ❖ Bolt-on solutions (Teradata SQL-Graph, SAP Graph Engine, Oracle) ❖ Distributed batch processing systems (Giraph, GraphX, GraphLab) Lots of ETL required! ❖ Many more research prototypes... http://go.umd.edu/graphs
  • 10. Different Analytics Flows: Graph Databases; Bolt-On Solutions; Other Systems
  • 11. What should I use then? ● What fraction of the overall workload is graph-oriented? ● How often do graph analytics need to run? ● Do you need to do graph updates? ● What types of analytics are required? ● How large would the graphs be? ● Are you starting from scratch or do you have an already deployed DBMS?
  • 12. 1) Why graph analytics? 2) How are graph analytics done currently? 3) What are most people dealing with? 4) Bolt-on graph analytics with GraphGen 5) The GraphGen Language
  • 13. Where’s the Data? ● Most business analytics (querying, reporting, OLAP) happen in SQL ● Organizations typically model their data according to their needs ● Use graph databases if your workloads are strictly graph-centric
  • 14. Where’s the Data? ● Most likely organized in some type of database schema ● A collection of tables related to each other through common attributes, or through primary-key/foreign-key constraints. We need to extract connections between entities
  • 16. Lots of “hidden” graphs ● Let’s take TPC-H: Part(part_key, supplier_key, ...); Customer(customer_key, customer_name, ...); Orders(order_key, part_key, customer_key, ...); Supplier(supplier_key, supplier_name, ...) ● We could create edges between two customers if they’ve: ○ Bought the same item ○ Bought the same item on the same day ○ Bought from the same supplier ○ Etc.
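The "bought the same item" edge rule above amounts to a self-join of the orders table on the part key. The following is a minimal sketch of that idea using Python's built-in `sqlite3` module. The table layout is a heavily simplified stand-in for TPC-H, the sample rows are invented for illustration, and the helper name `co_purchase_edges` is hypothetical; this is not how GraphGen itself performs extraction.

```python
import sqlite3

# Hypothetical, simplified TPC-H-style tables; the rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (customer_key INTEGER PRIMARY KEY, customer_name TEXT);
    CREATE TABLE Orders (order_key INTEGER PRIMARY KEY, part_key INTEGER, customer_key INTEGER);
    INSERT INTO Customer VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO Orders VALUES (10, 100, 1), (11, 100, 2), (12, 200, 2), (13, 200, 3), (14, 300, 1);
""")


def co_purchase_edges(connection):
    """Return undirected customer pairs that ordered at least one common part."""
    return connection.execute("""
        SELECT DISTINCT o1.customer_key, o2.customer_key
        FROM Orders AS o1
        JOIN Orders AS o2 ON o1.part_key = o2.part_key
        WHERE o1.customer_key < o2.customer_key  -- drop self-pairs and mirrored duplicates
        ORDER BY 1, 2
    """).fetchall()


edges = co_purchase_edges(conn)
print(edges)  # prints [(1, 2), (2, 3)]
```

The other rules on the slide would add conditions to the same join, for example an equality on an order-date column for "same item on the same day".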
  • 19. 1) Why graph analytics? 2) How are graph analytics done currently? 3) What are most people dealing with? 4) Bolt-on graph analytics with GraphGen 5) The GraphGen Language
  • 20. GraphGen Extract and analyze many different kinds of graphs Simple, Intuitive, Declarative Language, No ETL required Full Graph API & Vertex Centric Framework
  • 21. GraphGen Interfaces: Native Java Library; Python wrapper Library; GraphGen Explorer: UI Web Application
  • 23. GraphGen Explorer Web App ● Exploration of the database schema to detect different types of hidden graphs. ● Allows users to visually explore potential graphs. ● Simple statistics and on-the-fly analysis. Not all graphs will be useful!
  • 26. Annotated steps: Define GraphGen Query; Database Credentials; Generate and Serialize Graph; Load Graph into NetworkX; Run Any Algorithm

      from graphgenpy import GraphGenerator
      import networkx as nx

      # Define the GraphGen query
      datalogQuery = """
      Nodes(ID, Name) :- Author(ID, Name).
      Edges(ID1, ID2) :- AuthorPublication(ID1, PubID), AuthorPublication(ID2, PubID).
      """

      # Credentials for connecting to the database
      gg = GraphGenerator("localhost", "5432", "testgraphgen", "kostasx", "password")

      # Generate the graph and serialize it to a GML file
      fname = gg.generateGraph(datalogQuery, "extracted_graph", GraphGenerator.GML)

      # Load the graph into NetworkX
      G = nx.read_gml(fname, 'id')
      print("Graph Loaded into NetworkX! Running PageRank...")

      # Run any algorithm on the graph using NetworkX
      print(nx.pagerank(G))
      print("Done!")
  • 28. Annotated steps: Define GraphGen Query; Database Credentials; Extract and Load Graph; Define Vertex Centric Program; Run Program

      // Establish connection to database
      GraphGenerator ggen = new GraphGenerator("host", "port", "dbName", "username", "password");

      // Define and evaluate a single graph extraction query
      String datalog_query = "...";
      Graph g = ggen.generateGraph(datalog_query).get(0);

      // Initialize vertex-centric object
      VertexCentric p = new VertexCentric(g);

      // Define vertex-centric compute function
      Executor program = new Executor("result_value_name") {
          @Override
          public void compute(Vertex v, VertexCentric p) {
              // implementation of compute function
          }
      };

      // Begin execution
      p.run(program);
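To make the vertex-centric model in the slide above more concrete, here is a minimal sketch in plain Python of a synchronous "compute per vertex, repeat until nothing changes" loop. It does not use GraphGen's `VertexCentric` or `Executor` classes; the function names `run_vertex_centric` and `min_label`, the adjacency-dict representation, and the sample graph are all assumptions made for illustration. The compute function shown labels connected components by propagating the smallest vertex ID.

```python
def run_vertex_centric(adjacency, compute, initial, max_supersteps=50):
    """Apply compute() to every vertex in synchronous supersteps until values stop changing."""
    values = dict(initial)
    for _ in range(max_supersteps):
        # Each vertex only sees its neighbors' values from the previous superstep.
        new_values = {
            v: compute(v, values[v], [values[n] for n in adjacency[v]])
            for v in adjacency
        }
        if new_values == values:
            break
        values = new_values
    return values


def min_label(vertex, value, neighbor_values):
    """Compute function: adopt the smallest label seen among self and neighbors."""
    return min([value] + neighbor_values)


# Hypothetical undirected graph with two components: {1, 2, 3} and {4, 5}.
adjacency = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4]}
components = run_vertex_centric(adjacency, min_label, {v: v for v in adjacency})
print(components)  # prints {1: 1, 2: 1, 3: 1, 4: 4, 5: 4}
```

Because every vertex is updated from a snapshot of the previous superstep, the per-vertex calls are independent of each other, which is what makes this style of program straightforward to parallelize.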
  • 29. Annotated steps: Define GraphGen Query; Database Credentials; Extract and Load Graph; Use Full API to access the Graph

      // Establish connection to database
      GraphGenerator ggen = new GraphGenerator("host", "port", "dbName", "username", "password");

      // Define and evaluate a single graph extraction query
      String datalog_query = "...";
      Graph g = ggen.generateGraph(datalog_query).get(0);

      for (Vertex v : g.getVertices()) {
          // For each neighbor
          for (Vertex neighbor : v.getVertices(Direction.OUT)) {
              // Do something
          }
      }
  • 31. 1) Why graph analytics? 2) How are graph analytics done currently? 3) What are most people dealing with? 4) Bolt-on graph analytics with GraphGen 5) The GraphGen Language
  • 32. GraphGen DSL ● Intuitive Domain Specific Language based on Datalog ● User needs to specify: ○ How the nodes are defined ○ How the edges are defined ● The query is executed, and the user gets a Graph object to operate upon. ● Very expressive: Allows for homogeneous and heterogeneous graphs with various types of nodes and edges.
  • 33. TPC-H Database: Part(partKey, supplierKey, ...); Customer(customerKey, customerName, ...); Orders(orderKey, partKey, customerKey, ...); Supplier(supplierKey, supplierName, ...) ● We want to explore a graph of customers! ● Using the GraphGen Language: ○ Which tables do we need to combine to extract the nodes and edges?
  • 34. GraphGen DSL Example

      Nodes(ID, Name) :- Customer(ID, Name).

    ● Creates a node out of each row in the Customer table ■ Customer ID and Name as properties

      Edges(ID1, ID2) :- Orders(_, partKey, ID1), Orders(_, partKey, ID2).

    ● Connects ID1 -> ID2 if they have both ordered the same part
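One way to read the two rules above is as a row-per-node mapping plus a "group by partKey, then pair up the customers" step. The sketch below evaluates them over in-memory tuples in plain Python. It illustrates the semantics only and says nothing about how GraphGen evaluates queries internally; the sample rows and the function name `extract_graph` are hypothetical. The sketch also drops self-pairs (ID1 equal to ID2), which is a choice made here; the slides do not say whether GraphGen keeps them.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical sample rows standing in for the Customer and Orders tables.
customers = [(1, "Alice"), (2, "Bob"), (3, "Carol")]  # Customer(ID, Name)
orders = [(10, 100, 1), (11, 100, 2), (12, 200, 2), (13, 200, 3)]  # Orders(orderKey, partKey, customerKey)


def extract_graph(customer_rows, order_rows):
    """Evaluate the Nodes and Edges rules over in-memory rows."""
    # Nodes(ID, Name) :- Customer(ID, Name).
    nodes = {cid: {"Name": name} for cid, name in customer_rows}

    # Edges(ID1, ID2) :- Orders(_, partKey, ID1), Orders(_, partKey, ID2).
    buyers_by_part = defaultdict(set)
    for _order_key, part_key, customer_id in order_rows:
        buyers_by_part[part_key].add(customer_id)

    edges = set()
    for buyers in buyers_by_part.values():
        # Every ordered pair of distinct customers who ordered this part.
        edges.update(permutations(sorted(buyers), 2))
    return nodes, edges


nodes, edges = extract_graph(customers, orders)
print(sorted(edges))  # prints [(1, 2), (2, 1), (2, 3), (3, 2)]
```

Grouping by the join key first avoids comparing every order against every other order, although a part ordered by many customers still produces a number of edges quadratic in the size of that group.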
  • 35. GraphGen ● Enable extraction of different types of hidden graphs ● Independent of where the data is stored (given SQL) ● Enable complex analytics over the extracted graphs ● Efficient extraction through various in-memory representations ● Efficient analysis through a parallel execution engine ● Effortless through a Declarative Language ● Eliminates the need for complex ETL ● Intuitive and swift analysis of any graph that exists in your data!
  • 36. Download GraphGen at: konstantinosx.github.io/graphgen-project/ DDL Blog Post at: blog.districtdatalabs.com/graph-analytics-over-relational-datasets
  • 37. Email: kostasx@cs.umd.edu Twitter: @kxirog Download GraphGen at: konstantinosx.github.io/graphgen-project/ Thank you!