Note: This is an updated version of the original presentation at HBaseCon West 2017.
Jing Chen He • jinghe@us.ibm.com • Apache HBase PMC • JanusGraph TSC
Jason Plurad • pluradj@us.ibm.com • Apache TinkerPop PMC • JanusGraph TSC
HBaseCon West 2017 • June 12, 2017
Community-Driven Graphs with
JanusGraph
Agenda
Property Graphs
Graph Framework and Graph DB
Graph Community
Introduction to JanusGraph
JanusGraph with HBase
2 #HBaseCon
Graph
• Born for relationships!
• Intuitive modeling
• Expressive querying
• Native analysis
• Property Graph
https://tinkerpop.apache.org/docs/3.2.4/reference/#intro
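To make the property graph idea concrete, here is a minimal in-memory sketch in Python. It is not TinkerPop or JanusGraph code; the `Vertex` and `Edge` classes, the `out_neighbors` helper, and all sample data are hypothetical and exist only to show that both vertices and edges carry a label plus key/value properties.

```python
# Toy in-memory property graph. All class names and data are made up for
# illustration; this is not the TinkerPop or JanusGraph API.
from dataclasses import dataclass, field


@dataclass
class Vertex:
    id: int
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    label: str
    out_v: int  # id of the vertex the edge leaves
    in_v: int   # id of the vertex the edge enters
    properties: dict = field(default_factory=dict)


vertices = {
    1: Vertex(1, "person", {"name": "Alice", "age": 29}),
    2: Vertex(2, "person", {"name": "Bob", "age": 27}),
    3: Vertex(3, "software", {"name": "JanusGraph", "lang": "java"}),
}
edges = [
    Edge("knows", 1, 2, {"since": 2015}),
    Edge("created", 2, 3, {"weight": 0.4}),
]


def out_neighbors(vertex_id, edge_label):
    """Return the vertices reached by outgoing edges with the given label."""
    return [vertices[e.in_v] for e in edges
            if e.out_v == vertex_id and e.label == edge_label]


alice_knows = [v.properties["name"] for v in out_neighbors(1, "knows")]
print(alice_knows)
```

Relationships are first-class records here, so a question like "whom does Alice know?" is a direct lookup over edges rather than a join.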
Graph Use Cases
• Social network analysis
• Recommendation engines
• Knowledge graphs
• Internet of things
• Fraud detection systems
• Metadata and master data management
Graph Computing and Graph Database
• Graph computing and processing framework
  • OLTP: local traversal, real time.
  • OLAP: entire data set accessed; long-running, parallel, batch processing.
  • Apache TinkerPop: OLTP and OLAP (TinkerPop SparkGraphComputer)
  • Apache Spark GraphX: OLAP
• Graph database
  • Graph computing and processing + data storage
  • Example: Neo4j, integrated graph processing and storage in one.
  • Example: JanusGraph, a graph processing layer on top of external storage.
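The OLTP/OLAP distinction above can be illustrated with a small Python sketch. The edge list and both function names are hypothetical; the point is only that the OLTP-style query reads the adjacency of a few vertices, while the OLAP-style job scans every edge.

```python
from collections import Counter, defaultdict

# Hypothetical edge list: (source, label, target).
EDGES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "knows", "alice"),
    ("dave", "knows", "alice"),
]

# Adjacency index so a local traversal reads only the vertices it visits.
ADJ = defaultdict(list)
for source, _, target in EDGES:
    ADJ[source].append(target)


def oltp_friends_of_friends(start):
    """OLTP-style: begin at one vertex and read only its 2-hop neighborhood."""
    result = set()
    for friend in ADJ.get(start, []):
        result.update(ADJ.get(friend, []))
    result.discard(start)
    return sorted(result)


def olap_in_degree():
    """OLAP-style: scan every edge to compute a statistic for the whole graph."""
    return Counter(target for _, _, target in EDGES)


print(oltp_friends_of_friends("alice"))
print(olap_in_degree())
```

In a real deployment the OLAP side would typically run as a parallel batch job, for example through the SparkGraphComputer mentioned above, rather than as a single-process loop.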
Apache TinkerPop
• Open source, vendor-agnostic, graph computing framework
• Gremlin graph traversal language
Apache TinkerPop™
Maintainer: Apache Software Foundation
License: Apache
Latest Release: 3.2.4, February 2017
https://tinkerpop.apache.org
Gremlin Graph Traversal Language
https://tinkerpop.apache.org/gremlin.html
TinkerPop Stack
https://tinkerpop.apache.org/docs/3.2.4/reference/#_graph_system_integration
Graph Landscape
https://tinkerpop.apache.org/gremlin.html#oltp-and-olap-traversals
JanusGraph™
• Scalable graph database distributed on multi-machine clusters with pluggable storage and indexing
• Fully compliant with the Apache TinkerPop graph computing framework
• Vendor-neutral, open community with open governance
  – Founding members: Expero, Google, GRAKN.AI, Hortonworks, IBM
  – Latest members: Amazon, Netflix, Orchestral Developments, Uber
Maintainer: Linux Foundation
License: Apache
Latest Release: 0.2.0, October 2017
https://janusgraph.org
Architecture
Google Cloud Bigtable
http://docs.janusgraph.org/latest/arch-overview.html
Storage Model
http://docs.janusgraph.org/latest/data-model.html#_janusgraph_data_layout
Storage Model
http://docs.janusgraph.org/latest/data-model.html#_individual_edge_layout
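The linked pages describe an adjacency-list layout: one row per vertex, with the vertex's properties and incident edges stored as sorted cells in that row. The Python sketch below is a loose model of that idea only. The `build_rows` function and the tuple-shaped column keys are assumptions for illustration; the real layout uses compact binary encodings that are not reproduced here.

```python
def build_rows(vertex_props, edges):
    """Model an adjacency-list layout: one row per vertex, sorted cells.

    vertex_props: {vertex_id: {property_key: value}}
    edges: iterable of (label, out_vertex_id, in_vertex_id)
    Column keys are plain tuples here, not JanusGraph's byte encoding.
    """
    rows = {vid: [] for vid in vertex_props}
    for vid, props in vertex_props.items():
        for key, value in props.items():
            rows[vid].append((("prop", key), value))
    for label, out_v, in_v in edges:
        # Each edge appears in the row of both of its end vertices.
        rows[out_v].append((("edge", label, "OUT", in_v), None))
        rows[in_v].append((("edge", label, "IN", out_v), None))
    for cells in rows.values():
        cells.sort(key=lambda cell: cell[0])  # keep cells ordered by column
    return rows


rows = build_rows(
    {1: {"name": "Alice"}, 2: {"name": "Bob"}},
    [("knows", 1, 2)],
)
for vertex_id, cells in rows.items():
    print(vertex_id, cells)
```

Because the cells within a row are sorted, all edges with a given label sit next to each other, which is what makes a column-range read over one row an efficient way to fetch them.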
JanusGraph with HBase
• HBase – Perfect Storage Backend for JanusGraph
  • Big enough for your biggest graph!
  • The storage model
  • Read and write speed
  • Scalability and partitioning
  • Strong consistency
  • Tight integration with Hadoop Ecosystem
  • Great open community!
http://docs.janusgraph.org/latest/hbase.html
JanusGraph with HBase
• HBase – Perfect Storage Backend for JanusGraph
  • Simple configuration!
    conf/janusgraph-hbase-solr.properties
      storage.backend=hbase
      storage.hostname=zookeeper-host1,zookeeper-host2,zookeeper-host3
      storage.hbase.table=janusgraph
      storage.hbase.ext.zookeeper.znode.parent=/hbase
      storage.hbase.ext.hbase.zookeeper.property.clientPort=2181
  • Just open your graph!
      graph=JanusGraphFactory.open('conf/janusgraph-hbase-solr.properties')
Optional
Optional
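As a rough illustration of how such a properties file could be generated and checked from a script, here is a Python sketch. The helpers `render_properties` and `parse_properties` are hypothetical and handle only plain `key=value` lines; the full Java properties format (escapes, line continuations, `:` separators) is deliberately not covered. The keys and values are the ones shown on the slide.

```python
def render_properties(settings):
    """Render a dict as simple key=value lines (no escaping handled)."""
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    parsed = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        parsed[key.strip()] = value.strip()
    return parsed


zookeeper_hosts = ["zookeeper-host1", "zookeeper-host2", "zookeeper-host3"]
hbase_settings = {
    "storage.backend": "hbase",
    "storage.hostname": ",".join(zookeeper_hosts),
    "storage.hbase.table": "janusgraph",
}

text = render_properties(hbase_settings)
print(text)
```

The rendered text could then be written to a file such as `conf/janusgraph-hbase-solr.properties` and passed to `JanusGraphFactory.open` as on the slide.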
JanusGraph with HBase
• HBase – Perfect Storage Backend for JanusGraph
  • Throw in an Index Backend for better performance
    conf/janusgraph-hbase-solr.properties
      index.search.backend=solr
      index.search.solr.mode=cloud
      index.search.solr.zookeeper-url=zookeeper-host1:2181/solr,zookeeper-host2:2181/solr,zookeeper-host3:2181/solr
      index.search.solr.configset=janusgraph
JanusGraph with HBase
• HBase – Perfect Storage Backend for JanusGraph
  • Look into more details
    • Stores to Column Families
      • Edge store → e
      • Index store → g
      • ID store → i
      • System Transaction log store → l
      • System Management log store → m
      • System property store → s
      • …
    • CF attributes can be set. E.g. compression, TTL.
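When inspecting the HBase table directly, the single-letter column families are easier to read with the mapping at hand. The Python sketch below simply encodes the six pairs listed above; the dictionary keys are the descriptive names from the slide, not JanusGraph's internal store identifiers, and both helper functions are hypothetical.

```python
# Mapping taken from the slide. Keys are the slide's descriptive names,
# not JanusGraph's internal store identifiers.
STORE_TO_COLUMN_FAMILY = {
    "Edge store": "e",
    "Index store": "g",
    "ID store": "i",
    "System Transaction log store": "l",
    "System Management log store": "m",
    "System property store": "s",
}

# Reverse lookup, useful when reading a table description that lists
# column families by their short names.
COLUMN_FAMILY_TO_STORE = {cf: store for store, cf in STORE_TO_COLUMN_FAMILY.items()}


def column_family_for(store):
    """Return the short column family name for a store listed on the slide."""
    return STORE_TO_COLUMN_FAMILY[store]


def store_for(column_family):
    """Return the store name for a short column family, or None if unlisted."""
    return COLUMN_FAMILY_TO_STORE.get(column_family)


print(column_family_for("Edge store"), store_for("g"))
```

The slide's ellipsis indicates further stores exist; they are intentionally left out here, so `store_for` returns `None` for any column family that is not listed.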
JanusGraph with HBase
• HBase – Perfect Storage Backend for JanusGraph
  • Look into more details
g.V().has("name", "Alice").out("knows").out("knows").values("name")
[Diagram labels: Gremlin TraversalStrategy Optimization; JanusGraph Optimization; Execution Plan to Backend Store and Index; Edge Store; Index Store; Index provider (ES or Solr)]
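One way to picture how the traversal above touches the stores is the toy Python model below: the `has` step is answered from an index store, and each `out` step reads adjacency from an edge store. The dictionaries, function names, and sample data are all hypothetical, and no query optimization is modeled.

```python
# Hypothetical stand-ins for the two stores; real storage is binary rows
# in HBase column families, not Python dicts.
INDEX_STORE = {("name", "Alice"): [1]}
EDGE_STORE = {
    1: {"props": {"name": "Alice"}, "out": {"knows": [2, 3]}},
    2: {"props": {"name": "Bob"}, "out": {"knows": [4]}},
    3: {"props": {"name": "Carol"}, "out": {"knows": [4, 5]}},
    4: {"props": {"name": "Dave"}, "out": {}},
    5: {"props": {"name": "Erin"}, "out": {}},
}


def has(key, value):
    """Resolve the starting vertices with one index store lookup."""
    return list(INDEX_STORE.get((key, value), []))


def out(vertex_ids, label):
    """Follow outgoing edges by reading each vertex's row in the edge store."""
    result = []
    for vid in vertex_ids:
        result.extend(EDGE_STORE[vid]["out"].get(label, []))
    return result


def values(vertex_ids, key):
    """Read one property per vertex from the edge store rows."""
    return [EDGE_STORE[vid]["props"][key] for vid in vertex_ids]


names = values(out(out(has("name", "Alice"), "knows"), "knows"), "name")
print(names)
```

As with the Gremlin traversal, which has no `dedup()` step, a vertex reachable along two paths appears twice in the result.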
JanusGraph with HBase
• HBase – Perfect Storage Backend for JanusGraph
  • Look into more details
    • A store (column family) is always specified.
    • Get or Multi Get
    • Batch to mutate
    • Key range scan
    • ColumnRangeFilter
    • ColumnPaginationFilter
    • HBase tuning
[Diagram labels: Edge Store, Index Store]
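The column range and pagination ideas listed above can be sketched without an HBase cluster. The Python model below treats a row as a sorted list of column names; `column_range` and `paginate` are hypothetical helpers that mimic the effect of a range filter and a limit/offset filter, and the string column names are invented for readability.

```python
import bisect


def column_range(sorted_columns, start, stop):
    """Return the columns c with start <= c < stop from a sorted list."""
    lo = bisect.bisect_left(sorted_columns, start)
    hi = bisect.bisect_left(sorted_columns, stop)
    return sorted_columns[lo:hi]


def paginate(columns, limit, offset=0):
    """Return at most `limit` columns, skipping the first `offset`."""
    return columns[offset:offset + limit]


# One row of a hypothetical store; column names are made up for the sketch.
row = sorted(["created:3", "knows:2", "knows:7", "knows:9", "name"])

# ";" is the character right after ":", so this range covers every column
# that starts with "knows:".
knows_columns = column_range(row, "knows:", "knows;")
first_page = paginate(knows_columns, limit=2)
second_page = paginate(knows_columns, limit=2, offset=2)
print(knows_columns, first_page, second_page)
```

Since the columns in a row are kept sorted, selecting one edge label becomes a contiguous slice, and pagination bounds how many cells come back per request.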
JanusGraph with Google Cloud Bigtable
• Bigtable implements the HBase 1.0 client API
  • Requires the latest version of the bigtable-hbase-1.x artifact.
    storage.backend=hbase
    storage.hbase.ext.hbase.client.connection.impl=com.google.cloud.bigtable.hbase1_x.BigtableConnection
    storage.hbase.ext.google.bigtable.project.id=<Google Cloud Platform project id>
    storage.hbase.ext.google.bigtable.instance.id=<Bigtable instance id>
Thank you!

Editor's Notes

  1. Abstract: Graphs are well-suited for many use cases to express and process complex relationships among entities in enterprise and social contexts. Fueled by the growing interest in graphs, there are various graph databases and processing systems that dot the graph landscape. JanusGraph is a community-driven project that continues the legacy of Titan, a pioneer of open source graph databases. JanusGraph is a scalable graph database optimized for large-scale transactional and analytical graph processing. In this session, we will introduce JanusGraph, which features full integration with the Apache TinkerPop graph stack. We will discuss JanusGraph's optimized storage model that relies on HBase for fast graph traversal and processing.
  2. Brief history with TinkerPop. Long history as an open source project.
  3. Brief history with TinkerPop. Long history as an open source project.
  4. Brief history with TinkerPop. Long history as an open source project.
  5. Brief history with TinkerPop. Long history as an open source project.
  6. Lots of interesting parts to graph system integration. Lots of ways to extend and contribute.
  7. Lots of interesting parts to graph system integration. Lots of ways to extend and contribute.
  8. Lots of interesting parts to graph system integration. Lots of ways to extend and contribute.
  9. Lots of interesting parts to graph system integration. Lots of ways to extend and contribute.
  10. Server-side tuning. Pre-split. Client-side pass-through properties: storage.hbase.ext.<hbase-client-property>
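The pass-through convention in the last note amounts to prefixing each HBase client property with `storage.hbase.ext.`. A small Python sketch of that transformation follows; the function name `to_janusgraph_settings` and the sample values are hypothetical, and `hbase.rpc.timeout` is used only as an example of an HBase client property.

```python
PREFIX = "storage.hbase.ext."


def to_janusgraph_settings(hbase_client_props):
    """Prefix HBase client properties so they can go in a JanusGraph config."""
    return {PREFIX + key: str(value) for key, value in hbase_client_props.items()}


# Sample values are illustrative, not tuning recommendations.
client_props = {
    "zookeeper.znode.parent": "/hbase",
    "hbase.zookeeper.property.clientPort": 2181,
    "hbase.rpc.timeout": 60000,
}

for key, value in to_janusgraph_settings(client_props).items():
    print(f"{key}={value}")
```

The first two generated lines match the `storage.hbase.ext.*` entries shown on the configuration slide.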