
HBaseCon2017 Community-Driven Graphs with JanusGraph

Graphs are well suited to expressing and processing complex relationships among entities in enterprise and social contexts. Fueled by growing interest in graphs, a variety of graph databases and processing systems now dot the graph landscape. JanusGraph is a community-driven project that continues the legacy of Titan, a pioneer of open source graph databases. JanusGraph is a scalable graph database optimized for large-scale transactional and analytical graph processing. In this session, we will introduce JanusGraph, which features full integration with the Apache TinkerPop graph stack. We will discuss JanusGraph's optimized storage model, which relies on HBase for fast graph traversal and processing.

by Jason Plurad and Jing Chen He of IBM


  1. Community-Driven Graphs with JanusGraph
     HBaseCon West 2017 • June 12, 2017
     • Jing Chen He • jinghe@us.ibm.com • Apache HBase PMC • JanusGraph TSC
     • Jason Plurad • pluradj@us.ibm.com • Apache TinkerPop PMC • JanusGraph TSC
  2. Agenda
     • Property Graphs
     • Graph Community
     • Introduction to JanusGraph
     • JanusGraph with HBase
     #HBaseCon
  3. Graph
     • Born for relationship!
     • Intuitive modeling
     • Expressive querying
     • Native analysis
     https://tinkerpop.apache.org/docs/3.2.4/reference/#intro
  4. Graph Data Use Cases
     • Social network analysis
     • Configuration management database
     • Master data management
     • Recommendation engines
     • Knowledge graphs
     • Internet of things
     • Cybersecurity attack analysis
  5. Apache TinkerPop
     • Open source, vendor-agnostic, graph computing framework
     • Gremlin graph traversal language
     Apache TinkerPop™
     • Maintainer: Apache Software Foundation
     • License: Apache
     • Latest Release: 3.2.4, February 2017
     https://tinkerpop.apache.org
  6. Gremlin Graph Traversal Language
     https://tinkerpop.apache.org/gremlin.html
  7. TinkerPop Stack
     https://tinkerpop.apache.org/docs/3.2.4/reference/#_graph_system_integration
  8. Graph Landscape
     https://tinkerpop.apache.org/gremlin.html#oltp-and-olap-traversals
  9. JanusGraph™
     • Scalable graph database distributed on multi-machine clusters with pluggable storage and indexing
     • Fully-compliant with Apache TinkerPop graph computing framework
     • Vendor-neutral, open community with open governance
       • Founding members: Expero, Google, GRAKN.AI, Hortonworks, IBM
       • Latest members: Amazon, Netflix, Orchestral Developments, Uber
     • Maintainer: Linux Foundation
     • License: Apache
     • Latest Release: 0.1.0, April 2017
     https://janusgraph.org
  10. Architecture
     (Diagram label: Google Cloud Bigtable)
     http://docs.janusgraph.org/latest/arch-overview.html
  11. Storage Model
     http://docs.janusgraph.org/latest/data-model.html#_janusgraph_data_layout
  12. Storage Model
     http://docs.janusgraph.org/latest/data-model.html#_individual_edge_layout
  13. JanusGraph with HBase
     HBase – Perfect Storage Backend for JanusGraph
     • Big enough for your biggest graph!
     • The storage model
     • Read and write speed
     • Scalability and partitioning
     • Strong consistency
     • Tight integration with Hadoop Ecosystem
     • Great open community!
     http://docs.janusgraph.org/latest/hbase.html
  14. JanusGraph with HBase
     HBase – Perfect Storage Backend for JanusGraph
     • Simple configuration! conf/janusgraph-hbase-solr.properties
       storage.backend=hbase
       storage.hostname=zookeeper-host1,zookeeper-host2,zookeeper-host3
       storage.hbase.table=janusgraph
       storage.hbase.ext.zookeeper.znode.parent=/hbase
       storage.hbase.ext.hbase.zookeeper.property.clientPort=2181
       (Two of these settings carry an "Optional" callout on the slide.)
     • Just open your graph!
       graph=JanusGraphFactory.open('conf/janusgraph-hbase-solr.properties')
  15. JanusGraph with HBase
     HBase – Perfect Storage Backend for JanusGraph
     • Throw in an Index Backend for better performance: conf/janusgraph-hbase-solr.properties
       index.search.backend=solr
       index.search.solr.mode=cloud
       index.search.solr.zookeeper-url=zookeeper-host1:2181/solr,zookeeper-host2:2181/solr,zookeeper-host3:2181/solr
       index.search.solr.configset=janusgraph
  16. JanusGraph with HBase
     HBase – Perfect Storage Backend for JanusGraph
     Look into more details
     • Stores to Column Families
       • Edge store → e
       • Index store → g
       • ID store → i
       • Transaction log store → l
       • System property store → s
     • CF attributes can be set, e.g. compression, TTL.
  17. JanusGraph with HBase
     HBase – Perfect Storage Backend for JanusGraph
     Look into more details
       g.V().has("name", "Alice").out("knows").out("knows").values("name")
     • Query Plan to Backend Store and Index
     (Diagram labels: Edge Store, Index Store, Index provider)
  18. JanusGraph with HBase
     HBase – Perfect Storage Backend for JanusGraph
     Look into more details
     • A store (column family) is always specified.
     • Get or Multi Get
     • Batch to mutate
     • Key range scan
     • ColumnRangeFilter
     • ColumnPaginationFilter
     • HBase tuning
     (Diagram labels: Edge Store, Index Store)
  19. JanusGraph with Google Cloud Bigtable
     • Bigtable implements the HBase 1.0 client API
     • Need the latest version of the bigtable-hbase-1.0 artifact.
       storage.backend=hbase
       storage.hbase.ext.hbase.client.connection.impl=com.google.cloud.bigtable.hbase1_0.BigtableConnection
       storage.hbase.ext.google.bigtable.project.id=<Google Cloud Platform project id>
       storage.hbase.ext.google.bigtable.instance.id=<Bigtable instance id>
  20. Thank you!
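
Slides 14 and 15 list the storage and index settings that go into conf/janusgraph-hbase-solr.properties. The following Python sketch is not part of the talk; it only shows one way those key/value pairs could be assembled into the text of such a file. The helper `render_properties` and the variable names are hypothetical, the host names are the placeholders from the slides, and every key and value is copied from the slides as shown there.

```python
def render_properties(settings):
    """Render a mapping as Java-style key=value lines (hypothetical helper)."""
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


# Placeholder ZooKeeper hosts taken from the slides, not real machines.
zookeeper_hosts = ["zookeeper-host1", "zookeeper-host2", "zookeeper-host3"]

settings = {
    # Storage backend settings (slide 14).
    "storage.backend": "hbase",
    "storage.hostname": ",".join(zookeeper_hosts),
    "storage.hbase.table": "janusgraph",
    "storage.hbase.ext.zookeeper.znode.parent": "/hbase",
    "storage.hbase.ext.hbase.zookeeper.property.clientPort": "2181",
    # Index backend settings (slide 15), in the form shown on the slide.
    "index.search.backend": "solr",
    "index.search.solr.mode": "cloud",
    "index.search.solr.zookeeper-url": ",".join(
        f"{host}:2181/solr" for host in zookeeper_hosts
    ),
    "index.search.solr.configset": "janusgraph",
}

properties_text = render_properties(settings)
print(properties_text)
```

Saving the printed text as conf/janusgraph-hbase-solr.properties would give a file that can be passed to `JanusGraphFactory.open(...)` as on slide 14, assuming the hosts are replaced with a real ZooKeeper quorum.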

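Slide 17 describes how the traversal `g.V().has("name", "Alice").out("knows").out("knows").values("name")` is planned against the index store and the edge store, and slide 18 mentions column-range reads. The toy Python model below is an illustration only, under stated assumptions: the data, the dictionaries `name_index`, `edge_store`, and `vertex_names`, and the function `out` are all invented for this sketch, and the sorted `(label, neighbor)` tuples merely stand in for JanusGraph's real binary column encoding, which is different.

```python
from bisect import bisect_left

# Toy "index store": property value -> vertex ids.
# Stands in for the lookup behind has("name", "Alice").
name_index = {"Alice": [1], "Bob": [2], "Carol": [3], "Dave": [4]}

# Toy "edge store": one row per vertex, columns sorted by (label, neighbor id).
edge_store = {
    1: [("knows", 2), ("likes", 4)],
    2: [("created", 4), ("knows", 3), ("knows", 4)],
    3: [],
    4: [],
}

# Simplification: vertex properties are kept in a separate dict here.
vertex_names = {1: "Alice", 2: "Bob", 3: "Carol", 4: "Dave"}


def out(vertex_ids, label):
    """Follow outgoing edges with the given label.

    For each row, jump to the first column with this label and read until
    the label changes, loosely analogous to a column-range read.
    """
    neighbors = []
    for vertex_id in vertex_ids:
        columns = edge_store[vertex_id]
        start = bisect_left(columns, (label,))
        for column_label, neighbor in columns[start:]:
            if column_label != label:
                break
            neighbors.append(neighbor)
    return neighbors


# has("name", "Alice"): resolved through the index.
start_vertices = name_index.get("Alice", [])
# .out("knows").out("knows"): two rounds of edge-store reads.
friends_of_friends = out(out(start_vertices, "knows"), "knows")
# .values("name"): read the name property of each resulting vertex.
result = [vertex_names[v] for v in friends_of_friends]
print(result)
```

The point of the sketch is the division of labor: one index lookup finds the starting vertex, and each `out` step then touches only the rows of the current vertices and only the columns for the requested label.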