AgensGraph: a Multi-Model Graph Database
Based on PostgreSQL
Kisung Kim (kskim@bitnine.net)
Bitnine R&D Center
2017-1-14
Presentation material of GraphDay 2017 about implementing a graph database, AgensGraph.

  1. AgensGraph: a Multi-Model Graph Database Based on PostgreSQL
     Kisung Kim (kskim@bitnine.net)
     Bitnine R&D Center
     2017-1-14
  2. Who am I
     • Kisung Kim, Ph.D., Chief Technology Officer of Bitnine Global Inc.
     • Researched query optimization for graph-structured data during doctorate degree
     • Developed a distributed relational database engine at TmaxSoft
     • Leads the development of a new graph database, AgensGraph, at Bitnine Global
  3. What is a Graph Database?
     [Images from http://www.slideshare.net/debanjanmahata/an-introduction-to-nosql-graph-databases-and-neo4j]
  4. What is a Graph Database?
     • Relationship is the first-class citizen in the graph database
     • Make your data connected in the graph database

                    Relational Database   Graph Database
     Entity         Row                   Node (Vertex)
     Relationship   Row                   Relationship (Edge)
  5. What is a Graph Database?
     • Handles data from a different viewpoint
     • Data model similar to the entity-relationship model
     • Gartner says it represents a radical change in how data is organized and processed
  6. Cypher Query Language
     • Declarative query language for the property graph model
     • Inspired by SQL and SPARQL
       – Designed to be a human-readable query language
     • Developed by Neo Technology Inc. since 2011
     • Current version is 3.0
     • OpenCypher.org (http://opencypher.org)
       – Participate in developing the query language
  7. Cypher Query Example
     Make two nodes:
       CREATE (:person {id: 1, name: "Kisung Kim", birthday: "1980-01-05"});
       CREATE (:company {id: 1, name: "Bitnine Global"});
     Make a relationship between the two nodes:
       MATCH (p:person {id: 1}), (c:company {id: 1})
       CREATE (p)-[:workFor {title: "CTO", since: 2014}]->(c);
     [Figure: (Kisung Kim)-[workFor]->(Bitnine Global)]
  8. Cypher Query Example
     Querying:
       MATCH (p:person {name: "Kisung Kim"})-[:workFor]->(c:company)
       RETURN (p), (c)
     No table definitions and no joins.
     Query with variable-length relationships:
       MATCH (p:person {name: "Kisung Kim"})-[:knows*..3]->(f:person)
       RETURN (f)
     [Figures: (Kisung Kim)-[workFor]->(?) and (Kisung Kim)-[knows]->(?)-[knows]->(?)-[knows]->(?)]
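The variable-length pattern `[:knows*..3]` in the query above matches paths of one to three `knows` edges. A minimal Python sketch of such a bounded traversal follows, assuming a toy in-memory adjacency map. The names `KNOWS` and `friends_within` and the sample people are hypothetical, and the sketch returns each reachable person once, whereas Cypher returns one row per matching path.

```python
from collections import deque

# Hypothetical sample data: person -> people they know (directed edges).
KNOWS = {
    "Kisung Kim": ["Alice", "Bob"],
    "Alice": ["Carol"],
    "Bob": ["Carol"],
    "Carol": ["Dave"],
    "Dave": ["Erin"],
}

def friends_within(start, max_hops=3):
    """Return people reachable from start over 1..max_hops knows edges."""
    seen = {start}
    found = []
    queue = deque([(start, 0)])
    while queue:
        person, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for friend in KNOWS.get(person, []):
            if friend not in seen:
                seen.add(friend)
                found.append(friend)
                queue.append((friend, hops + 1))
    return found

print(friends_within("Kisung Kim"))
```

With the sample data, Erin is four hops away and is therefore not returned for a limit of three.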
  9. GraphDB to PostgreSQL Case
     • From Hipolabs: http://engineering.hipolabs.com/graphdb-to-postgresql/
  10. Graph Database and Hybrid Database
      [Figure: Magic Quadrant for Operational Database Management Systems, Gartner, 2016]
  11. So, What We Want to Make is
      • Hybrid database engine with graph and relational model
      • Cypher query processing on PostgreSQL
      • Online transactional graph database
      • Disk-based persistent graph storage
      [Figure: ( )-[:processes]->(Cypher)]
  12. Why We Chose PostgreSQL
      • Fully featured, enterprise-ready open-source database
      • Graph processing actually uses relational algebra
        – Graph is serialized as tables on disk
        – Every graph traversal step is in principle a join (from LDBC documentation)
      • It is important to optimize joins and speed up join processing
        – PostgreSQL has an excellent query optimizer
      • And... the abundant PostgreSQL ecosystem
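The point that every graph traversal step is in principle a join can be illustrated with a small sketch. It uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL, and the `vertex` and `edge` tables, their columns, and the `companies_of` helper are assumptions made for illustration, not AgensGraph's actual schema.

```python
import sqlite3

# In-memory database with a hypothetical vertex/edge layout.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vertex (id INTEGER PRIMARY KEY, label TEXT, name TEXT);
    CREATE TABLE edge (id INTEGER PRIMARY KEY, label TEXT,
                       start_id INTEGER, end_id INTEGER);
    INSERT INTO vertex VALUES (1, 'person', 'Kisung Kim'),
                              (2, 'company', 'Bitnine Global');
    INSERT INTO edge VALUES (10, 'workFor', 1, 2);
""")

def companies_of(person_name):
    """One traversal step (person)-[:workFor]->(company) as two joins."""
    rows = conn.execute("""
        SELECT c.name
        FROM vertex AS p
        JOIN edge AS e ON e.start_id = p.id AND e.label = 'workFor'
        JOIN vertex AS c ON c.id = e.end_id AND c.label = 'company'
        WHERE p.label = 'person' AND p.name = ?
    """, (person_name,)).fetchall()
    return [name for (name,) in rows]

print(companies_of("Kisung Kim"))
```

Each additional hop in a pattern would add another pair of joins, which is why join optimization matters for graph queries.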
  13. Challenges
      • How to store graph data
        – Efficient structure for graph pattern matching
        – At the same time, efficient for transaction processing
      • How to process graph queries
        – Processing complex graph pattern matching: variable-length path, shortest path
        – Mismatches between graph data model and relational data model
        – Graph query optimization
  14. Graph Storage
      • Graph data is stored on disk, decomposed into vertexes and edges
      • When processing graph pattern matching, it is essential to find adjacent vertexes or edges efficiently
        – Given a start vertex, find end vertexes
        – Given an end vertex, find start vertexes
      [Figure: vertex v1]
  15. Two Graph Databases

      Solution   Company          Latest Version   Features
      Neo4j      Neo Technology   3.1              Most famous graph database, Cypher; O(1) access using fixed-size array
      Titan      Datastax         -                Distributed graph system based on Cassandra
  16. Graph Storage - Neo4j
      • Fixed-size array for nodes and relationships
      • Relationships for a node are organized as a doubly-linked list
      • Index-free adjacency
      • O(1) access for adjacent edges: follow the pointer
      [Figure from Graph Databases, 2nd ed., O'Reilly, 2015]
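The idea of fixed-size records with pointer-following can be sketched roughly as below. This is a simplified illustration, not Neo4j's actual record format: the `NodeRecord` and `RelRecord` classes and all field names are hypothetical, Python lists stand in for fixed-size record arrays, and only the "next" pointers of each chain are kept (the "previous" pointers of the doubly-linked list are omitted for brevity).

```python
from dataclasses import dataclass

NIL = -1  # marks "no record"

@dataclass
class NodeRecord:
    first_rel: int = NIL  # ID of the first relationship in this node's chain

@dataclass
class RelRecord:
    start: int
    end: int
    start_next: int = NIL  # next relationship in the start node's chain
    end_next: int = NIL    # next relationship in the end node's chain

nodes = []  # record ID == list position, so lookup by ID is O(1)
rels = []

def add_node():
    nodes.append(NodeRecord())
    return len(nodes) - 1

def add_rel(start, end):
    """Prepend a new relationship to the chains of both endpoint nodes."""
    rel_id = len(rels)
    rels.append(RelRecord(start, end,
                          start_next=nodes[start].first_rel,
                          end_next=nodes[end].first_rel))
    nodes[start].first_rel = rel_id
    nodes[end].first_rel = rel_id
    return rel_id

def rels_of(node_id):
    """Walk a node's relationship chain by following record pointers."""
    result = []
    rel_id = nodes[node_id].first_rel
    while rel_id != NIL:
        rel = rels[rel_id]  # direct access by ID, no index lookup
        result.append(rel_id)
        rel_id = rel.start_next if rel.start == node_id else rel.end_next
    return result

a, b, c = add_node(), add_node(), add_node()
r0 = add_rel(a, b)
r1 = add_rel(a, c)
r2 = add_rel(b, c)
print(rels_of(a), rels_of(b), rels_of(c))
```

Finding a node's adjacent edges never consults an index: each step is a single array access followed by a pointer hop.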
  17. Graph Storage - Titan (DSE Graph)
      • Titan stores graphs in adjacency list format
      • Each edge is stored twice
      • Vertex and edge lists are stored in backend storage like HBase, Cassandra, or BerkeleyDB
      [Figure from http://s3.thinkaurelius.com/docs/titan/1.0.0/data-model.html]
  18. Graph Storage - AgensGraph
      • Fixed-size array is hard to implement in PostgreSQL
        – Tuples are moved when updated
      • Titan's big row approach is also inadequate
      • We chose B-tree index for graph traversal
      [Figure: a graph is stored as vertex and edge tables]
        Vertex: Vertex ID, Properties; B-tree on Vertex ID
        Edge: Edge ID, Start Vertex ID, End Vertex ID, Properties; B-trees on (Start, End) and (End, Start)
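How the two composite B-trees serve both traversal directions can be sketched with sorted lists and the standard `bisect` module. This is only an illustration of prefix scans over composite keys, not AgensGraph code; `out_index`, `in_index`, and the helper names are hypothetical.

```python
from bisect import bisect_left, insort

out_index = []  # sorted (start_id, end_id) pairs, stands in for B-tree (Start, End)
in_index = []   # sorted (end_id, start_id) pairs, stands in for B-tree (End, Start)

def add_edge(start_id, end_id):
    """Insert the edge into both orderings, as two B-tree insertions would."""
    insort(out_index, (start_id, end_id))
    insort(in_index, (end_id, start_id))

def prefix_scan(index, key):
    """Return the second component of every entry whose first component is key."""
    pos = bisect_left(index, (key,))  # (key,) sorts before any (key, x)
    matches = []
    while pos < len(index) and index[pos][0] == key:
        matches.append(index[pos][1])
        pos += 1
    return matches

def end_vertexes(start_id):
    """Given a start vertex, find end vertexes."""
    return prefix_scan(out_index, start_id)

def start_vertexes(end_id):
    """Given an end vertex, find start vertexes."""
    return prefix_scan(in_index, end_id)

for s, e in [(1, 2), (1, 3), (2, 3), (4, 1)]:
    add_edge(s, e)

print(end_vertexes(1))
print(start_vertexes(3))
```

Because both components are in the key, the adjacent vertex IDs come straight from the index entries, which is the property an index-only scan relies on.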
  19. Index Problems
      • Current B-tree has several disadvantages for our workload
        – Composite index is preferable, but the size increases
        – There are many duplicate keys (vertex IDs) on start_ID or end_ID
        – Property updates incur insertions into B-trees
      • We are developing a new index with a bucket structure (like the GIN index), indirect indexing, and support for index-only scans for graph traversals
  20. Graph Storage - AgensGraph
      • Vertexes and edges are grouped into labels
      • Labels are organized as a label hierarchy
      • We use PostgreSQL's table hierarchy feature
      [Figure: label tables ag_vertex, Person, Message, Comment, and Post, each with Vertex ID and Properties columns]
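The effect of a label hierarchy on queries can be sketched as follows: scanning a label also returns rows of the labels beneath it, much as querying a parent table does under PostgreSQL table inheritance. The parent/child arrangement below (Person and Message under ag_vertex, Comment and Post under Message) is an assumption, since the slide text only lists the label names, and `PARENT`, `ROWS`, `descendants`, and `scan` are hypothetical names with made-up sample rows.

```python
# Assumed hierarchy: child label -> parent label.
PARENT = {
    "person": "ag_vertex",
    "message": "ag_vertex",
    "comment": "message",
    "post": "message",
}

# Hypothetical rows stored directly under each label.
ROWS = {
    "ag_vertex": [],
    "person": [{"id": 1, "name": "Kisung Kim"}],
    "message": [],
    "comment": [{"id": 2, "text": "Nice talk"}],
    "post": [{"id": 3, "text": "Hello graph"}],
}

def descendants(label):
    """Return label plus every label below it in the hierarchy."""
    result = [label]
    for child, parent in PARENT.items():
        if parent == label:
            result.extend(descendants(child))
    return result

def scan(label):
    """Collect rows of a label and of all its descendant labels."""
    return [row for name in descendants(label) for row in ROWS[name]]

print([row["id"] for row in scan("message")])
print([row["id"] for row in scan("ag_vertex")])
```

Under this assumption, a match on the Message label would see both comments and posts, while a match on the root label sees every vertex.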
  21. Current Status
      • AgensGraph v0.9 (https://github.com/bitnine-oss/agens-graph or http://bitnine.net/downloads/)
        – Graph data model and DDL on PostgreSQL 9.6
        – Cypher query processing (70% of OpenCypher spec.)
        – Integrated query processing (Cypher + SQL)
        – Client libraries (JDBC, ODBC, Python)
        – Monitoring and development using Tadpole DB Hub
  22. Tadpole for AgensGraph
      • Tadpole DB Hub is an open-source project for managing unified infrastructure (https://github.com/hangum/TadpoleForDBTools)
      • Supports various databases, including PostgreSQL and AgensGraph
      • Features of Tadpole for AgensGraph
        – Monitoring the AgensGraph server
        – Cypher query browser and graph visualization
  23. Tadpole for AgensGraph
  24. Future Roadmap
      • Distributed graph database
        – Plan to exploit Postgres-XL
      • Specialized storage and index for graph traversals
      • Dictionary compression for JSONB (ZSON)
      • Graph query optimization using graph statistics
      • Integration with big data systems
        – HDFS storage
        – Graph analysis using GraphX
  25. Join Us
      • AgensGraph is an open-source project: https://github.com/bitnine-oss/agens-graph
      • We also wish to contribute to the PostgreSQL community
      • Graph database meetup in Silicon Valley
        – http://www.meetup.com/Graph-Database-in-Silicon-Valley/
  26. Thank You
      kskim@bitnine.net