SlideShare a Scribd company logo
Large Scale Graph Analytics with
DataWorks Summit San Jose
June 13, 2017
P. Taylor Goetz, Hortonworks
@ptgoetz
About Me
• Tech Staff @ Hortonworks
• TSC Member, JanusGraph
• PMC Chair, Apache Storm
• ASF Member
• PMC: Apache Incubator, Apache Arrow, Apache
Kylin, Apache Apex, Apache Eagle, Apache Metron
What is a Graph Database?
–Wikipedia
“In computing, a graph database is a database that uses graph
structures for semantic queries with nodes, edges and
properties to represent and store data. A key concept of the
system is the graph (or edge or relationship), which directly relates
data items in the store. The relationships allow data in the store
to be linked together directly, and in many cases retrieved
with one operation.”
Graph Structures - Vertices
• Vertices are the nodes or points in
a graph structure
Graph Structures - Vertices
• Vertices are the nodes or points in a
graph structure
• Vertices can be associated with a
set of properties (key-value pairs)
Graph Structures - Edges
• Edges are the connections
between the vertices in a graph
Graph Structures - Edges
• Edges are the connections
between the vertices in a graph
• Edges can be non-directional,
directional, or bi-directional
Graph Structures - Edges
• Edges are the connections
between the vertices in a graph
• Edges can be non-directional,
directional, or bi-directional
• Edges can be named and like
vertices can have properties
Graph Structures - Graph
• The graph is the collection of
vertices, edges, and associated
properties
G = (V, E)
What is a Graph Database?
• A graph database is a datastore
optimized for storing and querying
graph structures
• Distinct from relational databases
• Focus in terms of storage and
queries is on relationships
Common Use Cases
Anywhere relationship modeling and analysis can provide insight or value.
Social Media
Master Data Management
Common Use Cases
• Social Networks
• Master Data Management
• Fraud Detection
• Cybersecurity
• Identity and Access Management
• Recommendation Engines
Common Use Cases
• Social Networks
• Master Data Management
• Fraud Detection
• Cybersecurity
• Identity and Access Management
• Recommendation Engines
Many of these can overlap
and be combined to provide
new insights.
The Power of Relationships
The Power of Relationships
• Harness the value of interconnectedness
• “Paths to Insight”
• Traversal vs. Traditional Query: Join Reduction
• “If you can whiteboard it, you can graph it.”
A little history and the
importance of OSS licensing.
Titan DB
• Large scale graph db developed by Aurelius
• Licensed under ALv2 (this is important)
• Aurelius acquired by DataStax Feb. 2015
• 1.0 released Sept. 19, 2015
GitHub Contributions to Titan
DataStax Aurelius
Acquisition Feb. 2015
GitHub Contributions to Titan
DataStax Aurelius
Acquisition Feb. 2015
0.9.0-M2
Jun. 9, 2015
GitHub Contributions to Titan
DataStax Aurelius
Acquisition Feb. 2015
0.9.0-M2
Jun. 9, 2015
1.0
Sept. 19, 2015
GitHub Contributions to Titan
DataStax Aurelius
Acquisition Feb. 2015
0.9.0-M2
Jun. 9, 2015
1.0
Sept. 19, 2015
Where does that leave
community, users?
ALv2 to the
Rescue!
Empowering Communities
ALv2 to the
Rescue!
Empowering Communities
“We can do this. What’s the next step?”
“Apache Olympian?”
What is a “hostile fork?”
A "hostile fork" is a fork of a project that goes against the wishes of the
copyright holders and/or community.
–DataStax counsel on Apache Incubator mailing list
“DataStax does not approve of and objects to the proposed forking
of Titan into Olympian or any other ASF project.”
“Apache Olympian?”
Next stop…
Introducing…
• Spearheaded by Google, IBM,
Hortonworks, Expero, GRAKN.AI
• Contributors from Netflix, Amazon,
Uber, Orchestral Developments
• Sponsored by the Linux Foundation
Introducing…
• ALv2 License
• Apache style governance model
• Source code, issues hosted on
GitHub
• Mailing lists on Google Groups
• Chat on Gitter
Technical Dive
• Optimized for storing/querying billions of vertices and edges
• Supports thousands of concurrent users
• Can execute local queries (OLTP) or cross-cluster distributed
queries (OLAP)
Apache Tinkerpop
• THE framework and API for graph manipulation and
traversal
• Open source, vendor agnostic
• Supported by a number of Graph DBs
• Promotes portability
Gremlin Query Language
• DSL for graph traversal and manipulation
• Fluent style API
• Multi-language support (Java, Scala,
Groovy, Python, Ruby, etc.)
OLAP Integration
• Apache Hadoop
• Apache Spark
• Apache Giraph
• ACID compliant (depending on backend)
• Supports very many concurrent transactions
• Embedded, Single Node, or Scale out
JanusGraph Architectural Overview
Storage Backends
• Well defined storage API allows for easily
pluggable implementations
• Choose the backend best for your use case and
architecture
• Options include: Apache HBase, Apache
Cassandra, Google Cloud Bigtable, Berkeley DB
• More on the way…
Choose Your Own [CAP] Adventure
Consistency
Availability
Partition
Tolerance
Apache
HBase
Berkeley DB
Apache
Cassandra
Scylla DB
Google Cloud
Bigtable
JanusGraph External Indices
• Secondary to primary graph storage
• Provide a means to speed up graph traversal
and information retrieval
• Two types:
• Graph Index
• Vertex-centric Index
Graph Indices
• Global index structures across entire graph
• Efficient retrieval of vertices and edges based on
associated properties
• Eliminates need to do a full graph scan
• When querying, JanusGraph will typically warn
when a full scan is necessary
• New indexes take effect immediately, but
reindexing may be required
Vertex-Centric Indexes
• Local index structures built per-vertex
• Eliminates the need to load all vertices from the
graph for filtering
Pluggable Index Backends
• Elastic Search
• Apache Solr
• Apache Lucene
Schema and Data Modeling
• Consist of edge labels, property keys, vertex labels
• Explicit or Implicit
• Can evolve over time w/out database downtime
• Edge label multiplicity, Property keys, Key cardinality, Vertex labels
Schema - Edge Label Multiplicity
• MULTI: Multiple edges of the same label between vertices
• SIMPLE: One edge with that label (unique per label)
• MANY2ONE: One outgoing edge with that label (mother/children)
• ONE2MANY: One incoming edge with that label
• ONE2ONE: One incoming, one outgoing edge with that label
Schema - Property Key Data Types
Schema - Property Key Cardinality
• SINGLE: At most one value per element.
• LIST: Arbitrary number of values per element. Allows duplicates.
• SET: Multiple values, but no duplicates.
• Gremlin console:
• Groovy-based REPL for exploring the graph
• Pre-defined convenience variables, expandable by plugins. E.g.:
• “g” — represents the entire graph
• “hdfs” — access to hdfs provided by the TinkerPop Hadoop
plugin
• Local or remote
Graph Traversal with Gremlin
,,,/
(o o)
-----oOOo-(3)-oOOo-----
09:12:24 INFO org.apache.tinkerpop.gremlin.hadoop.structure.H
plugin activated: tinkerpop.hadoop
plugin activated: janusgraph.imports
gremlin>
Graph Traversal with Gremlin
What path will
we be taking
today?
“Graph of the Gods”
Who is Hercules’
grandfather?
gremlin>
g
gremlin>
Global variable representing
the entire graph
g.V()
Select all vertices in the graph
gremlin>
g.V().has('name', ‘hercules')
Find the vertex that has a ‘name’
Property with the value of ‘hercules’
gremlin>
g.V().has('name', ‘hercules')
.out(‘father')
Follow outbound edge named ‘father’
to the connected vertex
gremlin>
g.V().has('name', ‘hercules')
.out(‘father')
.out(‘father')
Follow outbound edge named ‘father’
to the connected vertex
gremlin>
g.V().has('name', ‘hercules')
.out(‘father')
.out(‘father')
.values('name')
Select the vertex property ‘name’
gremlin>
g.V().has('name', ‘hercules')
.out(‘father')
.out(‘father')
.values('name')
Select the vertex property ‘name’
gremlin>
g.V().has('name', ‘hercules')
.out(‘father')
.out(‘father')
.values('name')
gremlin>
==> saturn
What’s in a version number?
1.1
Unreleased
0.1.1
May 16, 2017
Contributions Welcome!
• Website: http://janusgraph.org
• GitHub Organization: https://github.com/JanusGraph
• User Mailing List: janusgraph-user@googlegroups.com
• Developer Mailing List: janusgraph-dev@googlegroups.com
Thank you!
Questions?
P. Taylor Goetz, Hortonworks
@ptgoetz

More Related Content

What's hot

Dataflow with Apache NiFi
Dataflow with Apache NiFiDataflow with Apache NiFi
Dataflow with Apache NiFi
DataWorks Summit/Hadoop Summit
 
Large Scale Graph Analytics with JanusGraph
Large Scale Graph Analytics with JanusGraphLarge Scale Graph Analytics with JanusGraph
Large Scale Graph Analytics with JanusGraph
P. Taylor Goetz
 
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB)
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB)Apache Kafka vs. Integration Middleware (MQ, ETL, ESB)
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB)
Kai Wähner
 
Tag based policies using Apache Atlas and Ranger
Tag based policies using Apache Atlas and RangerTag based policies using Apache Atlas and Ranger
Tag based policies using Apache Atlas and Ranger
Vimal Sharma
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explained
confluent
 
Ingesting streaming data into Graph Database
Ingesting streaming data into Graph DatabaseIngesting streaming data into Graph Database
Ingesting streaming data into Graph Database
Guido Schmutz
 
facebook architecture for 600M users
facebook architecture for 600M usersfacebook architecture for 600M users
facebook architecture for 600M users
Jongyoon Choi
 
Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta Lake
Databricks
 
Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...
Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...
Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...
Flink Forward
 
Apigee Products Overview
Apigee Products OverviewApigee Products Overview
Apigee Products Overview
Apigee | Google Cloud
 
Kafka Retry and DLQ
Kafka Retry and DLQKafka Retry and DLQ
Kafka Retry and DLQ
George Teo
 
How to govern and secure a Data Mesh?
How to govern and secure a Data Mesh?How to govern and secure a Data Mesh?
How to govern and secure a Data Mesh?
confluent
 
Best practices for highly available and large scale SolrCloud
Best practices for highly available and large scale SolrCloudBest practices for highly available and large scale SolrCloud
Best practices for highly available and large scale SolrCloud
Anshum Gupta
 
Building a Data Lake on AWS
Building a Data Lake on AWSBuilding a Data Lake on AWS
Building a Data Lake on AWS
Gary Stafford
 
Using Amazon Neptune to power identity resolution at scale - ADB303 - Atlanta...
Using Amazon Neptune to power identity resolution at scale - ADB303 - Atlanta...Using Amazon Neptune to power identity resolution at scale - ADB303 - Atlanta...
Using Amazon Neptune to power identity resolution at scale - ADB303 - Atlanta...
Amazon Web Services
 
Hyperspace for Delta Lake
Hyperspace for Delta LakeHyperspace for Delta Lake
Hyperspace for Delta Lake
Databricks
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh Architecture
Databricks
 
Apache Kafka® and the Data Mesh
Apache Kafka® and the Data MeshApache Kafka® and the Data Mesh
Apache Kafka® and the Data Mesh
ConfluentInc1
 
On-boarding with JanusGraph Performance
On-boarding with JanusGraph PerformanceOn-boarding with JanusGraph Performance
On-boarding with JanusGraph Performance
Chin Huang
 
Data Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to MeshData Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to Mesh
Jeffrey T. Pollock
 

What's hot (20)

Dataflow with Apache NiFi
Dataflow with Apache NiFiDataflow with Apache NiFi
Dataflow with Apache NiFi
 
Large Scale Graph Analytics with JanusGraph
Large Scale Graph Analytics with JanusGraphLarge Scale Graph Analytics with JanusGraph
Large Scale Graph Analytics with JanusGraph
 
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB)
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB)Apache Kafka vs. Integration Middleware (MQ, ETL, ESB)
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB)
 
Tag based policies using Apache Atlas and Ranger
Tag based policies using Apache Atlas and RangerTag based policies using Apache Atlas and Ranger
Tag based policies using Apache Atlas and Ranger
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explained
 
Ingesting streaming data into Graph Database
Ingesting streaming data into Graph DatabaseIngesting streaming data into Graph Database
Ingesting streaming data into Graph Database
 
facebook architecture for 600M users
facebook architecture for 600M usersfacebook architecture for 600M users
facebook architecture for 600M users
 
Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta Lake
 
Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...
Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...
Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...
 
Apigee Products Overview
Apigee Products OverviewApigee Products Overview
Apigee Products Overview
 
Kafka Retry and DLQ
Kafka Retry and DLQKafka Retry and DLQ
Kafka Retry and DLQ
 
How to govern and secure a Data Mesh?
How to govern and secure a Data Mesh?How to govern and secure a Data Mesh?
How to govern and secure a Data Mesh?
 
Best practices for highly available and large scale SolrCloud
Best practices for highly available and large scale SolrCloudBest practices for highly available and large scale SolrCloud
Best practices for highly available and large scale SolrCloud
 
Building a Data Lake on AWS
Building a Data Lake on AWSBuilding a Data Lake on AWS
Building a Data Lake on AWS
 
Using Amazon Neptune to power identity resolution at scale - ADB303 - Atlanta...
Using Amazon Neptune to power identity resolution at scale - ADB303 - Atlanta...Using Amazon Neptune to power identity resolution at scale - ADB303 - Atlanta...
Using Amazon Neptune to power identity resolution at scale - ADB303 - Atlanta...
 
Hyperspace for Delta Lake
Hyperspace for Delta LakeHyperspace for Delta Lake
Hyperspace for Delta Lake
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh Architecture
 
Apache Kafka® and the Data Mesh
Apache Kafka® and the Data MeshApache Kafka® and the Data Mesh
Apache Kafka® and the Data Mesh
 
On-boarding with JanusGraph Performance
On-boarding with JanusGraph PerformanceOn-boarding with JanusGraph Performance
On-boarding with JanusGraph Performance
 
Data Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to MeshData Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to Mesh
 

Similar to Large Scale Graph Analytics with JanusGraph

Sharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsSharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsGeorge Stathis
 
Evolution of the Graph Schema
Evolution of the Graph SchemaEvolution of the Graph Schema
Evolution of the Graph Schema
Joshua Shinavier
 
Elasticsearch Introduction at BigData meetup
Elasticsearch Introduction at BigData meetupElasticsearch Introduction at BigData meetup
Elasticsearch Introduction at BigData meetup
Eric Rodriguez (Hiring in Lex)
 
UNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxUNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptx
Rahul Borate
 
UNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxUNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptx
Rahul Borate
 
Graph Database and Neo4j
Graph Database and Neo4jGraph Database and Neo4j
Graph Database and Neo4jSina Khorami
 
DataFrames: The Extended Cut
DataFrames: The Extended CutDataFrames: The Extended Cut
DataFrames: The Extended Cut
Wes McKinney
 
Graph Databases
Graph DatabasesGraph Databases
Graph Databases
thai
 
An architecture for federated data discovery and lineage over on-prem datasou...
An architecture for federated data discovery and lineage over on-prem datasou...An architecture for federated data discovery and lineage over on-prem datasou...
An architecture for federated data discovery and lineage over on-prem datasou...
DataWorks Summit
 
Ciel, mes données ne sont plus relationnelles
Ciel, mes données ne sont plus relationnellesCiel, mes données ne sont plus relationnelles
Ciel, mes données ne sont plus relationnelles
Xavier Gorse
 
Big Data App servor by Lance Riedel, CTO, The Hive for The Hive India event
Big Data App servor by Lance Riedel, CTO, The Hive for The Hive India eventBig Data App servor by Lance Riedel, CTO, The Hive for The Hive India event
Big Data App servor by Lance Riedel, CTO, The Hive for The Hive India eventThe Hive
 
Architecting Your First Big Data Implementation
Architecting Your First Big Data ImplementationArchitecting Your First Big Data Implementation
Architecting Your First Big Data Implementation
Adaryl "Bob" Wakefield, MBA
 
Big Graph Data with Titan DB
Big Graph Data with Titan DBBig Graph Data with Titan DB
(DAT203) Building Graph Databases on AWS
(DAT203) Building Graph Databases on AWS(DAT203) Building Graph Databases on AWS
(DAT203) Building Graph Databases on AWS
Amazon Web Services
 
Neo4j Training Introduction
Neo4j Training IntroductionNeo4j Training Introduction
Neo4j Training Introduction
Max De Marzi
 
State of Florida Neo4j Graph Briefing - Cyber IAM
State of Florida Neo4j Graph Briefing - Cyber IAMState of Florida Neo4j Graph Briefing - Cyber IAM
State of Florida Neo4j Graph Briefing - Cyber IAM
Neo4j
 
NoSql - mayank singh
NoSql - mayank singhNoSql - mayank singh
NoSql - mayank singh
Mayank Singh
 
Choosing the Right Big Data Tools for the Job - A Polyglot Approach
Choosing the Right Big Data Tools for the Job - A Polyglot ApproachChoosing the Right Big Data Tools for the Job - A Polyglot Approach
Choosing the Right Big Data Tools for the Job - A Polyglot ApproachDATAVERSITY
 
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Perficient, Inc.
 

Similar to Large Scale Graph Analytics with JanusGraph (20)

Sharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsSharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data Lessons
 
Evolution of the Graph Schema
Evolution of the Graph SchemaEvolution of the Graph Schema
Evolution of the Graph Schema
 
Elasticsearch Introduction at BigData meetup
Elasticsearch Introduction at BigData meetupElasticsearch Introduction at BigData meetup
Elasticsearch Introduction at BigData meetup
 
UNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxUNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptx
 
NoSQL-Overview
NoSQL-OverviewNoSQL-Overview
NoSQL-Overview
 
UNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxUNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptx
 
Graph Database and Neo4j
Graph Database and Neo4jGraph Database and Neo4j
Graph Database and Neo4j
 
DataFrames: The Extended Cut
DataFrames: The Extended CutDataFrames: The Extended Cut
DataFrames: The Extended Cut
 
Graph Databases
Graph DatabasesGraph Databases
Graph Databases
 
An architecture for federated data discovery and lineage over on-prem datasou...
An architecture for federated data discovery and lineage over on-prem datasou...An architecture for federated data discovery and lineage over on-prem datasou...
An architecture for federated data discovery and lineage over on-prem datasou...
 
Ciel, mes données ne sont plus relationnelles
Ciel, mes données ne sont plus relationnellesCiel, mes données ne sont plus relationnelles
Ciel, mes données ne sont plus relationnelles
 
Big Data App servor by Lance Riedel, CTO, The Hive for The Hive India event
Big Data App servor by Lance Riedel, CTO, The Hive for The Hive India eventBig Data App servor by Lance Riedel, CTO, The Hive for The Hive India event
Big Data App servor by Lance Riedel, CTO, The Hive for The Hive India event
 
Architecting Your First Big Data Implementation
Architecting Your First Big Data ImplementationArchitecting Your First Big Data Implementation
Architecting Your First Big Data Implementation
 
Big Graph Data with Titan DB
Big Graph Data with Titan DBBig Graph Data with Titan DB
Big Graph Data with Titan DB
 
(DAT203) Building Graph Databases on AWS
(DAT203) Building Graph Databases on AWS(DAT203) Building Graph Databases on AWS
(DAT203) Building Graph Databases on AWS
 
Neo4j Training Introduction
Neo4j Training IntroductionNeo4j Training Introduction
Neo4j Training Introduction
 
State of Florida Neo4j Graph Briefing - Cyber IAM
State of Florida Neo4j Graph Briefing - Cyber IAMState of Florida Neo4j Graph Briefing - Cyber IAM
State of Florida Neo4j Graph Briefing - Cyber IAM
 
NoSql - mayank singh
NoSql - mayank singhNoSql - mayank singh
NoSql - mayank singh
 
Choosing the Right Big Data Tools for the Job - A Polyglot Approach
Choosing the Right Big Data Tools for the Job - A Polyglot ApproachChoosing the Right Big Data Tools for the Job - A Polyglot Approach
Choosing the Right Big Data Tools for the Job - A Polyglot Approach
 
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
 

More from DataWorks Summit

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
DataWorks Summit
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
DataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
DataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
DataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
DataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
DataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
DataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
DataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
DataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
DataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
DataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
DataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
DataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
DataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
DataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
DataWorks Summit
 

More from DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Recently uploaded

GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
RinaMondal9
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
nkrafacyberclub
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
Quantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsQuantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIs
Vlad Stirbu
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 

Recently uploaded (20)

GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Quantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsQuantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIs
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 

Large Scale Graph Analytics with JanusGraph