SlideShare a Scribd company logo
Comparing three data ingestion
approaches where Apache Kafka
integrates with a distributed
graph database in real time
Benyue (Emma) Liu
Product Manager, TigerGraph
April, 2021
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION |
Today's Speakers
Benyue (Emma) Liu
Senior Product Manager, TigerGraph
● BS in Engineering from Harvey Mudd College, MS in
Engineering Systems from MIT
● Prior work experience at Oracle and MarkLogic
● Focus - Cloud, Containers, Enterprise Infra, Monitoring,
Management, Connectors, Developer Tools,
Applications
2
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION |
2
TigerGraph Data Ingestion System
Architecture: Internal Kakfa Component
TigerGraph Cloud and Kafka Data
Pipeline Use Cases
Built-in Kafka Loader in TigerGraph
Today’s Outline
4
3
3
1 Graph Analytics Overview
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION |
Graph Analytics
Overview
4
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION |
5
By 2025, graph technologies will be used
in 80% of data and analytics innovations,
up from 10% in 2021, facilitating rapid
decision making across the enterprise.1
“To Graph or Not to Graph? That Is Not
the Question — You Will Graph.”2
Mark Beyer, Distinguished VP Analyst
1Gartner, Top Trends in Data and Analytics for 2021, 16 February 2021
2Gartner, Graph Steps Onto the Main Stage of Data and Analytics: A Gartner Trend Insight Report, 14, December 2020
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM |
Why Graph Analytics?
Source: Gartner - Top 10 data and analytics Trends for 2019
Graph deployments are going deeper, wider and operational:
Need to make it accessible to non-technical users
6
● Definition - Graph analytics is a set of analytic techniques that
allows for the exploration of relationships between entities of
interest such as organizations, people and transactions.
● Forecasted growth - 100% annually through 2022
● What’s driving the growth
○ Need to ask complex questions across complex data, which is
not always practical or even possible at scale using SQL
queries. (RDBMS requires time-consuming & expensive table
joins!)
● What’s needed for broad adoption of graph data stores
○ Graph data stores can efficiently model, explore and query data
with complex interrelationships across data silos, but the need for
specialized skills has limited their adoption to date.
Customer
Supplier
Location 2
Product
Payment
PURCHASED
Location 1
Order
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM
Who is TigerGraph?
Corporate Overview Video
We provide advanced analytics on connected data
○ The only scalable graph database for the enterprise
○ HTAP graph database, foundational for AI and ML solutions
○ SQL-like query language (GSQL) accelerates time to solution
○ Cloud Neutral: Google GCP, Microsoft Azure,
Our customers include:
○ The largest companies in financial, healthcare, telecoms, media,
utilities and innovative startups in cybersecurity, ecommerce and
finserv
Founded in 2012, HQ in Redwood City, California
7
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM
How Our Customers Use TigerGraph?
8
Find most influential
users/customers
Find similar users/customers Who are the patients that are going through a particular type of
journey that results in an adverse health outcome?
Is the
Uncover hidden connections Is the new credit card applicant or transaction connected to
known fraudsters?
Recommend next best action Can I run a real-time credit score algorithm and recommend an
offer based on the customer’s credit profile & need?
Which users are driving higher usage or adoption of my product or
service?
Detect connected users
(communities)
What is average spend over time across a community of connected
users (fin. services, airlines, healthcare, retail..)?
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION |
TigerGraph
System Architecture
Overview
(Kafka as a Key Component)
9
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION |
The TigerGraph Difference
Feature Design Difference Benefit
Real-Time Deep-Link Querying ● Native Graph design
● C++ engine, for high performance
● Storage Architecture
● Uncovers hard-to-find patterns
● Operational, real-time
● HTAP: Transactions+Analytics
Handling Massive Scale ● Distributed DB architecture
● Massively parallel processing
● Compressed storage reduces
footprint and messaging
● Integrates all your data
● Automatic partitioning
● Elastic scaling of resource usage
In-Database Analytics ● GSQL: High-level yet Turing-
complete language
● User-extensible graph algorithm
library, runs in-DB
● ACID (OLTP) and Accumulators
(OLAP)
● Avoids transferring data
● Richer graph context
● In-DB machine learning
5 to 10+ hops deep
10
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION |
TigerGraph Architecture - Kafka as a Key Component
11
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION |
Data Ingestion Steps Inside of TigerGraph
12
Step 3
Each GPE consumes the
partial data updates,
processes it and puts it on
disk.
Loading Jobs and POST use
UPSERT semantics:
● If vertex/edge doesn't
yet exist, create it.
● If vertex/edge already
exists, update it.
Step 1
Data integration through the
following ways to ingest in
user source data.
● Bulk load of data files or
a Kafka stream in CSV or
JSON format
● HTTP POSTs via REST
services (JSON)
● GSQL Insert commands
Step 2
Dispatcher takes in the data
ingestion requests in the form of
updates to the database.
1. Query IDS to get internal
IDs
2. Convert data to internal
format
3. Send data to one or more
corresponding GPEs
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION |
Data Ingestion (Internal)
13
Incremental
Data
Nginx Restpp
GPE GPE GPE
Disk Disk Disk
CSV/JSON Insert/Update/Delete
Vertices and Edges
Listen to
corresponding
topic for new
messages
Acknowledge
Response
Incoming
Outgoing
Synchronize
data to disk
GSE(IDS)
ID Translation
Kafka Kafka Kafka
Server 1 Server 2 Server 3
Kafka Cluster
In-memory
copy of data
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION |
Kafka and TigerGraph
-Native Kafka Loader
14
External
Kafka
Cluster
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION |
Kafka and TigerGraph Data Pipeline
Static
Data
Sources
Streaming
Data
Sources
Kafka
Loader
15
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION |
Kafka Loader - Speed to Value from Real-time
Streaming Data
• Reduce Data Availability Gap and Accelerate Time to Value
• Native Integration with Real-time Streaming Data and Batch
Data
• Enables Real-time Graph Feature Updates with Streaming
Data in Machine Learning Use Cases
• Decrease Learning Curve With Familiar Syntax
• GSQL Support with Consistent Data Loading Syntax
• Maintain Separation of Control for Data Loading
• Designed with Built-in MultiGraph Support
16
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION |
Kafka Loader : Three Steps
Consistent with GSQL Data Loading Steps
Step 1: Define the Data Source
Step 2: Create a Loading Job
Step 3: Run the Loading Job
17
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION |
Kafka Loader High Level Architecture
● Connect to External Kafka Cluster
● User Commands Through GSQL Server
● Configuration Settings:
○ Config 1: Kakfa Cluster Configuration
○ Config 2: Topic/Partition/Offset Info
18
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION |
DEMO
19
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION |
Kafka and tgcloud.io
Data Loading Use Cases
20
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION |
What is TigerGraph Cloud (tgcloud.io)?
● TigerGraph Cloud is a distributed graph Database-as-a-Service
● Cloud Infrastructure Included
● Out-of-box graph starter kits - industry use case library
● Complete operational support by TigerGraph
○ Upgrades
○ Patches
○ Maintenance
○ Status Monitoring
● Flexible and hourly billing rates by credit card and through prepaid bulk
“cloud credits”
● Multiple versions and multiple clusters supported
● Easy provisioning for distributed clusters
● Built in Encryption and Security
21
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION |
TigerGraph Cloud Architecture
TigerGraph Cloud Portal
GraphStudio Admin Portal
GSQL
Web Shell
TigerGraph
Database
+
Graph Starter Kit
GraphStudio Admin Portal
GSQL
Web Shell
TigerGraph
Database
+
Graph Starter Kit
…...
…...
22
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION |
Azure Blob + Kafka + tgcloud.io
23
Kafka
Loader
Kafka Connect
Azure Blob Storage
Source Connector
Azure Blob Storage
https://docs.confluent.io/kafka-connect-azure-blob-storage-source/current/index.html
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION |
Google Cloud Storage + Kafka + tgcloud.io
24
Kafka
Loader
Kafka Connect
Google Cloud
Storage Source
Connector
https://docs.confluent.io/kafka-connect-gcs-source/current/overview.html
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION |
Amazon Kinesis + Kafka + tgcloud.io
25
Kafka
Loader
Kafka Connect
Amazon Kinesis
Source Connector
https://docs.confluent.io/kafka-connect-kinesis/current/index.html
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION |
DEMO
26
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION |
Summary
27
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION |
Summary
● Graph Analytics Overview
● TigerGraph Data Ingestion System Architecture: Kafka as
an Internal Component
● Built-in Kafka Loader in TigerGraph
● TigerGraph Cloud (tgcloud.io) and Kafka Data Pipeline Use
Cases
28
Get Started for Free
● Try TigerGraph Cloud ( tgcloud.io )
● Download TigerGraph’s Developer Edition
● Take a Test Drive - Online Demo
● Get TigerGraph Certified
● Join the Community
@TigerGraphDB /tigergraph /TigerGraphDB /company/TigerGraph
29

More Related Content

What's hot

The Knowledge Graph Explosion
The Knowledge Graph ExplosionThe Knowledge Graph Explosion
The Knowledge Graph Explosion
Neo4j
 
Get Started with the Most Advanced Edition Yet of Neo4j Graph Data Science
Get Started with the Most Advanced Edition Yet of Neo4j Graph Data ScienceGet Started with the Most Advanced Edition Yet of Neo4j Graph Data Science
Get Started with the Most Advanced Edition Yet of Neo4j Graph Data Science
Neo4j
 
Using Graph Algorithms for Advanced Analytics - Part 5 Classification
Using Graph Algorithms for Advanced Analytics - Part 5 ClassificationUsing Graph Algorithms for Advanced Analytics - Part 5 Classification
Using Graph Algorithms for Advanced Analytics - Part 5 Classification
TigerGraph
 
Incremental View Maintenance with Coral, DBT, and Iceberg
Incremental View Maintenance with Coral, DBT, and IcebergIncremental View Maintenance with Coral, DBT, and Iceberg
Incremental View Maintenance with Coral, DBT, and Iceberg
Walaa Eldin Moustafa
 
Batch and Stream Graph Processing with Apache Flink
Batch and Stream Graph Processing with Apache FlinkBatch and Stream Graph Processing with Apache Flink
Batch and Stream Graph Processing with Apache Flink
Vasia Kalavri
 
Iceberg: a fast table format for S3
Iceberg: a fast table format for S3Iceberg: a fast table format for S3
Iceberg: a fast table format for S3
DataWorks Summit
 
Web-Scale Graph Analytics with Apache® Spark™
Web-Scale Graph Analytics with Apache® Spark™Web-Scale Graph Analytics with Apache® Spark™
Web-Scale Graph Analytics with Apache® Spark™
Databricks
 
Real-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache PinotReal-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache Pinot
Xiang Fu
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache Flink
DataWorks Summit
 
Bootstrapping state in Apache Flink
Bootstrapping state in Apache FlinkBootstrapping state in Apache Flink
Bootstrapping state in Apache Flink
DataWorks Summit
 
The Year of the Graph
The Year of the GraphThe Year of the Graph
The Year of the Graph
Cambridge Semantics
 
Apache Iceberg Presentation for the St. Louis Big Data IDEA
Apache Iceberg Presentation for the St. Louis Big Data IDEAApache Iceberg Presentation for the St. Louis Big Data IDEA
Apache Iceberg Presentation for the St. Louis Big Data IDEA
Adam Doyle
 
Benchmark MinHash+LSH algorithm on Spark
Benchmark MinHash+LSH algorithm on SparkBenchmark MinHash+LSH algorithm on Spark
Benchmark MinHash+LSH algorithm on Spark
Xiaoqian Liu
 
Apache Flink and what it is used for
Apache Flink and what it is used forApache Flink and what it is used for
Apache Flink and what it is used for
Aljoscha Krettek
 
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark Summit
 
Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...
Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...
Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...
Neo4j
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization Opportunities
Databricks
 
PySpark Best Practices
PySpark Best PracticesPySpark Best Practices
PySpark Best Practices
Cloudera, Inc.
 
Using Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Using Apache Arrow, Calcite, and Parquet to Build a Relational CacheUsing Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Using Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Dremio Corporation
 
Data Pipline Observability meetup
Data Pipline Observability meetup Data Pipline Observability meetup
Data Pipline Observability meetup
Omid Vahdaty
 

What's hot (20)

The Knowledge Graph Explosion
The Knowledge Graph ExplosionThe Knowledge Graph Explosion
The Knowledge Graph Explosion
 
Get Started with the Most Advanced Edition Yet of Neo4j Graph Data Science
Get Started with the Most Advanced Edition Yet of Neo4j Graph Data ScienceGet Started with the Most Advanced Edition Yet of Neo4j Graph Data Science
Get Started with the Most Advanced Edition Yet of Neo4j Graph Data Science
 
Using Graph Algorithms for Advanced Analytics - Part 5 Classification
Using Graph Algorithms for Advanced Analytics - Part 5 ClassificationUsing Graph Algorithms for Advanced Analytics - Part 5 Classification
Using Graph Algorithms for Advanced Analytics - Part 5 Classification
 
Incremental View Maintenance with Coral, DBT, and Iceberg
Incremental View Maintenance with Coral, DBT, and IcebergIncremental View Maintenance with Coral, DBT, and Iceberg
Incremental View Maintenance with Coral, DBT, and Iceberg
 
Batch and Stream Graph Processing with Apache Flink
Batch and Stream Graph Processing with Apache FlinkBatch and Stream Graph Processing with Apache Flink
Batch and Stream Graph Processing with Apache Flink
 
Iceberg: a fast table format for S3
Iceberg: a fast table format for S3Iceberg: a fast table format for S3
Iceberg: a fast table format for S3
 
Web-Scale Graph Analytics with Apache® Spark™
Web-Scale Graph Analytics with Apache® Spark™Web-Scale Graph Analytics with Apache® Spark™
Web-Scale Graph Analytics with Apache® Spark™
 
Real-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache PinotReal-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache Pinot
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache Flink
 
Bootstrapping state in Apache Flink
Bootstrapping state in Apache FlinkBootstrapping state in Apache Flink
Bootstrapping state in Apache Flink
 
The Year of the Graph
The Year of the GraphThe Year of the Graph
The Year of the Graph
 
Apache Iceberg Presentation for the St. Louis Big Data IDEA
Apache Iceberg Presentation for the St. Louis Big Data IDEAApache Iceberg Presentation for the St. Louis Big Data IDEA
Apache Iceberg Presentation for the St. Louis Big Data IDEA
 
Benchmark MinHash+LSH algorithm on Spark
Benchmark MinHash+LSH algorithm on SparkBenchmark MinHash+LSH algorithm on Spark
Benchmark MinHash+LSH algorithm on Spark
 
Apache Flink and what it is used for
Apache Flink and what it is used forApache Flink and what it is used for
Apache Flink and what it is used for
 
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
 
Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...
Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...
Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization Opportunities
 
PySpark Best Practices
PySpark Best PracticesPySpark Best Practices
PySpark Best Practices
 
Using Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Using Apache Arrow, Calcite, and Parquet to Build a Relational CacheUsing Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Using Apache Arrow, Calcite, and Parquet to Build a Relational Cache
 
Data Pipline Observability meetup
Data Pipline Observability meetup Data Pipline Observability meetup
Data Pipline Observability meetup
 

Similar to Comparing three data ingestion approaches where Apache Kafka integrates with a distributed graph database in real time | Benyue (Emma) Liu, TigerGraph

Oracle GoldenGate Cloud Service Overview
Oracle GoldenGate Cloud Service OverviewOracle GoldenGate Cloud Service Overview
Oracle GoldenGate Cloud Service Overview
Jinyu Wang
 
Peek into Neo4j Product Strategy and Roadmap
Peek into Neo4j Product Strategy and RoadmapPeek into Neo4j Product Strategy and Roadmap
Peek into Neo4j Product Strategy and Roadmap
Neo4j
 
Portworx 201 Customer Deck.pptx
Portworx 201 Customer Deck.pptxPortworx 201 Customer Deck.pptx
Portworx 201 Customer Deck.pptx
ssuser1490e8
 
Amsterdam - The Neo4j Graph Data Platform Today & Tomorrow
Amsterdam - The Neo4j Graph Data Platform Today & TomorrowAmsterdam - The Neo4j Graph Data Platform Today & Tomorrow
Amsterdam - The Neo4j Graph Data Platform Today & Tomorrow
Neo4j
 
GxP Data Integrity for Cloud Apps – Part 1
GxP Data Integrity for Cloud Apps – Part 1GxP Data Integrity for Cloud Apps – Part 1
GxP Data Integrity for Cloud Apps – Part 1
Agaram Technologies
 
Better Together: How Graph database enables easy data integration with Spark ...
Better Together: How Graph database enables easy data integration with Spark ...Better Together: How Graph database enables easy data integration with Spark ...
Better Together: How Graph database enables easy data integration with Spark ...
TigerGraph
 
Meetup Streaming Data Pipeline Development
Meetup Streaming Data Pipeline DevelopmentMeetup Streaming Data Pipeline Development
Meetup Streaming Data Pipeline Development
Timothy Spann
 
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023
ssuser73434e
 
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 DatasetGraph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
TigerGraph
 
Data platform modernization with Databricks.pptx
Data platform modernization with Databricks.pptxData platform modernization with Databricks.pptx
Data platform modernization with Databricks.pptx
CalvinSim10
 
Replatform your Teradata to a Next-Gen Cloud Data Platform in Weeks, Not Years
Replatform your Teradata to a Next-Gen Cloud Data Platform in Weeks, Not YearsReplatform your Teradata to a Next-Gen Cloud Data Platform in Weeks, Not Years
Replatform your Teradata to a Next-Gen Cloud Data Platform in Weeks, Not Years
VMware Tanzu
 
Nordics Edition - The Neo4j Graph Data Platform Today & Tomorrow
Nordics Edition - The Neo4j Graph Data Platform Today & TomorrowNordics Edition - The Neo4j Graph Data Platform Today & Tomorrow
Nordics Edition - The Neo4j Graph Data Platform Today & Tomorrow
Neo4j
 
Cloud Computing for National Security Applications
Cloud Computing for National Security ApplicationsCloud Computing for National Security Applications
Cloud Computing for National Security Applicationswhite paper
 
RAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needsRAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needs
Connected Data World
 
Containers and Kubernetes
Containers and KubernetesContainers and Kubernetes
Containers and Kubernetes
Altoros
 
Neo4j: The path to success with Graph Database and Graph Data Science
Neo4j: The path to success with Graph Database and Graph Data ScienceNeo4j: The path to success with Graph Database and Graph Data Science
Neo4j: The path to success with Graph Database and Graph Data Science
Neo4j
 
Dimension Data Saugatuk Webinar
Dimension Data Saugatuk WebinarDimension Data Saugatuk Webinar
Dimension Data Saugatuk WebinarKeao Caindec
 
Eric Andersen Keynote
Eric Andersen KeynoteEric Andersen Keynote
Eric Andersen Keynote
Data Con LA
 
Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google
Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, GoogleHybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google
Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google
HostedbyConfluent
 
Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google
Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, GoogleHybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google
Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google
HostedbyConfluent
 

Similar to Comparing three data ingestion approaches where Apache Kafka integrates with a distributed graph database in real time | Benyue (Emma) Liu, TigerGraph (20)

Oracle GoldenGate Cloud Service Overview
Oracle GoldenGate Cloud Service OverviewOracle GoldenGate Cloud Service Overview
Oracle GoldenGate Cloud Service Overview
 
Peek into Neo4j Product Strategy and Roadmap
Peek into Neo4j Product Strategy and RoadmapPeek into Neo4j Product Strategy and Roadmap
Peek into Neo4j Product Strategy and Roadmap
 
Portworx 201 Customer Deck.pptx
Portworx 201 Customer Deck.pptxPortworx 201 Customer Deck.pptx
Portworx 201 Customer Deck.pptx
 
Amsterdam - The Neo4j Graph Data Platform Today & Tomorrow
Amsterdam - The Neo4j Graph Data Platform Today & TomorrowAmsterdam - The Neo4j Graph Data Platform Today & Tomorrow
Amsterdam - The Neo4j Graph Data Platform Today & Tomorrow
 
GxP Data Integrity for Cloud Apps – Part 1
GxP Data Integrity for Cloud Apps – Part 1GxP Data Integrity for Cloud Apps – Part 1
GxP Data Integrity for Cloud Apps – Part 1
 
Better Together: How Graph database enables easy data integration with Spark ...
Better Together: How Graph database enables easy data integration with Spark ...Better Together: How Graph database enables easy data integration with Spark ...
Better Together: How Graph database enables easy data integration with Spark ...
 
Meetup Streaming Data Pipeline Development
Meetup Streaming Data Pipeline DevelopmentMeetup Streaming Data Pipeline Development
Meetup Streaming Data Pipeline Development
 
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023
 
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 DatasetGraph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
 
Data platform modernization with Databricks.pptx
Data platform modernization with Databricks.pptxData platform modernization with Databricks.pptx
Data platform modernization with Databricks.pptx
 
Replatform your Teradata to a Next-Gen Cloud Data Platform in Weeks, Not Years
Replatform your Teradata to a Next-Gen Cloud Data Platform in Weeks, Not YearsReplatform your Teradata to a Next-Gen Cloud Data Platform in Weeks, Not Years
Replatform your Teradata to a Next-Gen Cloud Data Platform in Weeks, Not Years
 
Nordics Edition - The Neo4j Graph Data Platform Today & Tomorrow
Nordics Edition - The Neo4j Graph Data Platform Today & TomorrowNordics Edition - The Neo4j Graph Data Platform Today & Tomorrow
Nordics Edition - The Neo4j Graph Data Platform Today & Tomorrow
 
Cloud Computing for National Security Applications
Cloud Computing for National Security ApplicationsCloud Computing for National Security Applications
Cloud Computing for National Security Applications
 
RAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needsRAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needs
 
Containers and Kubernetes
Containers and KubernetesContainers and Kubernetes
Containers and Kubernetes
 
Neo4j: The path to success with Graph Database and Graph Data Science
Neo4j: The path to success with Graph Database and Graph Data ScienceNeo4j: The path to success with Graph Database and Graph Data Science
Neo4j: The path to success with Graph Database and Graph Data Science
 
Dimension Data Saugatuk Webinar
Dimension Data Saugatuk WebinarDimension Data Saugatuk Webinar
Dimension Data Saugatuk Webinar
 
Eric Andersen Keynote
Eric Andersen KeynoteEric Andersen Keynote
Eric Andersen Keynote
 
Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google
Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, GoogleHybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google
Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google
 
Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google
Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, GoogleHybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google
Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google
 

More from HostedbyConfluent

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
HostedbyConfluent
 
Renaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonRenaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit London
HostedbyConfluent
 
Evolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolEvolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at Trendyol
HostedbyConfluent
 
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesEnsuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
HostedbyConfluent
 
Exactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaExactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and Kafka
HostedbyConfluent
 
Fish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonFish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit London
HostedbyConfluent
 
Tiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonTiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit London
HostedbyConfluent
 
Building a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyBuilding a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And Why
HostedbyConfluent
 
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
HostedbyConfluent
 
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
HostedbyConfluent
 
Navigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersNavigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka Clusters
HostedbyConfluent
 
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformApache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
HostedbyConfluent
 
Explaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubExplaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy Pub
HostedbyConfluent
 
TL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonTL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit London
HostedbyConfluent
 
A Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLA Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSL
HostedbyConfluent
 
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceMastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
HostedbyConfluent
 
Data Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondData Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and Beyond
HostedbyConfluent
 
Code-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsCode-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink Apps
HostedbyConfluent
 
Debezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemDebezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC Ecosystem
HostedbyConfluent
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local Disks
HostedbyConfluent
 

More from HostedbyConfluent (20)

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Renaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonRenaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit London
 
Evolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolEvolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at Trendyol
 
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesEnsuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
 
Exactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaExactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and Kafka
 
Fish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonFish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit London
 
Tiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonTiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit London
 
Building a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyBuilding a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And Why
 
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
 
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
 
Navigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersNavigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka Clusters
 
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformApache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
 
Explaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubExplaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy Pub
 
TL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonTL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit London
 
A Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLA Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSL
 
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceMastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
 
Data Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondData Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and Beyond
 
Code-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsCode-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink Apps
 
Debezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemDebezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC Ecosystem
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local Disks
 

Recently uploaded

RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
KAMESHS29
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
James Anderson
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
DianaGray10
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Vladimir Iglovikov, Ph.D.
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 
20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website
Pixlogix Infotech
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
Kumud Singh
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
Uni Systems S.M.S.A.
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
Neo4j
 
Large Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial ApplicationsLarge Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial Applications
Rohit Gautam
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
Zilliz
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 

Recently uploaded (20)

RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
Large Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial ApplicationsLarge Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial Applications
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 

Comparing three data ingestion approaches where Apache Kafka integrates with a distributed graph database in real time | Benyue (Emma) Liu, TigerGraph

  • 1. Comparing three data ingestion approaches where Apache Kafka integrates with a distributed graph database in real time Benyue (Emma) Liu Product Manager, TigerGraph April, 2021
  • 2. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION | Today's Speakers Benyue (Emma) Liu Senior Product Manager, TigerGraph ● BS in Engineering from Harvey Mudd College, MS in Engineering Systems from MIT ● Prior work experience at Oracle and MarkLogic ● Focus - Cloud, Containers, Enterprise Infra, Monitoring, Management, Connectors, Developer Tools, Applications 2
  • 3. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION | 2 TigerGraph Data Ingestion System Architecture: Internal Kakfa Component TigerGraph Cloud and Kafka Data Pipeline Use Cases Built-in Kafka Loader in TigerGraph Today’s Outline 4 3 3 1 Graph Analytics Overview
  • 4. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION | Graph Analytics Overview 4
  • 5. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION | 5 By 2025, graph technologies will be used in 80% of data and analytics innovations, up from 10% in 2021, facilitating rapid decision making across the enterprise.1 “To Graph or Not to Graph? That Is Not the Question — You Will Graph.”2 Mark Beyer, Distinguished VP Analyst 1Gartner, Top Trends in Data and Analytics for 2021, 16 February 2021 2Gartner, Graph Steps Onto the Main Stage of Data and Analytics: A Gartner Trend Insight Report, 14, December 2020
  • 6. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | Why Graph Analytics? Source: Gartner - Top 10 data and analytics Trends for 2019 Graph deployments are going deeper, wider and operational: Need to make it accessible to non-technical users 6 ● Definition - Graph analytics is a set of analytic techniques that allows for the exploration of relationships between entities of interest such as organizations, people and transactions. ● Forecasted growth - 100% annually through 2022 ● What’s driving the growth ○ Need to ask complex questions across complex data, which is not always practical or even possible at scale using SQL queries. (RDBMS requires time-consuming & expensive table joins!) ● What’s needed for broad adoption of graph data stores ○ Graph data stores can efficiently model, explore and query data with complex interrelationships across data silos, but the need for specialized skills has limited their adoption to date. Customer Supplier Location 2 Product Payment PURCHASED Location 1 Order
  • 7. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM Who is TigerGraph? Corporate Overview Video We provide advanced analytics on connected data ○ The only scalable graph database for the enterprise ○ HTAP graph database, foundational for AI and ML solutions ○ SQL-like query language (GSQL) accelerates time to solution ○ Cloud Neutral: Google GCP, Microsoft Azure, Our customers include: ○ The largest companies in financial, healthcare, telecoms, media, utilities and innovative startups in cybersecurity, ecommerce and finserv Founded in 2012, HQ in Redwood City, California 7
  • 8. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM How Our Customers Use TigerGraph? 8 Find most influential users/customers Find similar users/customers Who are the patients that are going through a particular type of journey that results in an adverse health outcome? Is the Uncover hidden connections Is the new credit card applicant or transaction connected to known fraudsters? Recommend next best action Can I run a real-time credit score algorithm and recommend an offer based on the customer’s credit profile & need? Which users are driving higher usage or adoption of my product or service? Detect connected users (communities) What is average spend over time across a community of connected users (fin. services, airlines, healthcare, retail..)?
  • 9. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION | TigerGraph System Architecture Overview (Kafka as a Key Component) 9
  • 10. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION | The TigerGraph Difference Feature Design Difference Benefit Real-Time Deep-Link Querying ● Native Graph design ● C++ engine, for high performance ● Storage Architecture ● Uncovers hard-to-find patterns ● Operational, real-time ● HTAP: Transactions+Analytics Handling Massive Scale ● Distributed DB architecture ● Massively parallel processing ● Compressed storage reduces footprint and messaging ● Integrates all your data ● Automatic partitioning ● Elastic scaling of resource usage In-Database Analytics ● GSQL: High-level yet Turing- complete language ● User-extensible graph algorithm library, runs in-DB ● ACID (OLTP) and Accumulators (OLAP) ● Avoids transferring data ● Richer graph context ● In-DB machine learning 5 to 10+ hops deep 10
  • 11. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION | TigerGraph Architecture - Kafka as a Key Component 11
  • 12. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION | Data Ingestion Steps Inside of TigerGraph 12 Step 3 Each GPE consumes the partial data updates, processes it and puts it on disk. Loading Jobs and POST use UPSERT semantics: ● If vertex/edge doesn't yet exist, create it. ● If vertex/edge already exists, update it. Step 1 Data integration through the following ways to ingest in user source data. ● Bulk load of data files or a Kafka stream in CSV or JSON format ● HTTP POSTs via REST services (JSON) ● GSQL Insert commands Step 2 Dispatcher takes in the data ingestion requests in the form of updates to the database. 1. Query IDS to get internal IDs 2. Convert data to internal format 3. Send data to one or more corresponding GPEs
  • 13. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION | Data Ingestion (Internal) 13 Incremental Data Nginx Restpp GPE GPE GPE Disk Disk Disk CSV/JSON Insert/Update/Delete Vertices and Edges Listen to corresponding topic for new messages Acknowledge Response Incoming Outgoing Synchronize data to disk GSE(IDS) ID Translation Kafka Kafka Kafka Server 1 Server 2 Server 3 Kafka Cluster In-memory copy of data
  • 14. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION | Kafka and TigerGraph -Native Kafka Loader 14 External Kafka Cluster
  • 15. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION | Kafka and TigerGraph Data Pipeline Static Data Sources Streaming Data Sources Kafka Loader 15
  • 16. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION | Kafka Loader - Speed to Value from Real-time Streaming Data • Reduce Data Availability Gap and Accelerate Time to Value • Native Integration with Real-time Streaming Data and Batch Data • Enables Real-time Graph Feature Updates with Streaming Data in Machine Learning Use Cases • Decrease Learning Curve With Familiar Syntax • GSQL Support with Consistent Data Loading Syntax • Maintain Separation of Control for Data Loading • Designed with Built-in MultiGraph Support 16
  • 17. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION | Kafka Loader : Three Steps Consistent with GSQL Data Loading Steps Step 1: Define the Data Source Step 2: Create a Loading Job Step 3: Run the Loading Job 17
  • 18. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION | Kafka Loader High Level Architecture ● Connect to External Kafka Cluster ● User Commands Through GSQL Server ● Configuration Settings: ○ Config 1: Kakfa Cluster Configuration ○ Config 2: Topic/Partition/Offset Info 18
  • 19. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION | DEMO 19
  • 20. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION | Kafka and tgcloud.io Data Loading Use Cases 20
  • 21. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION | What is TigerGraph Cloud (tgcloud.io)? ● TigerGraph Cloud is a distributed graph Database-as-a-Service ● Cloud Infrastructure Included ● Out-of-box graph starter kits - industry use case library ● Complete operational support by TigerGraph ○ Upgrades ○ Patches ○ Maintenance ○ Status Monitoring ● Flexible and hourly billing rates by credit card and through prepaid bulk “cloud credits” ● Multiple versions and multiple clusters supported ● Easy provisioning for distributed clusters ● Built in Encryption and Security 21
  • 22. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION | TigerGraph Cloud Architecture TigerGraph Cloud Portal GraphStudio Admin Portal GSQL Web Shell TigerGraph Database + Graph Starter Kit GraphStudio Admin Portal GSQL Web Shell TigerGraph Database + Graph Starter Kit …... …... 22
  • 23. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION | Azure Blob + Kafka + tgcloud.io 23 Kafka Loader Kafka Connect Azure Blob Storage Source Connector Azure Blob Storage https://docs.confluent.io/kafka-connect-azure-blob-storage-source/current/index.html
  • 24. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION | Google Cloud Storage + Kafka + tgcloud.io 24 Kafka Loader Kafka Connect Google Cloud Storage Source Connector https://docs.confluent.io/kafka-connect-gcs-source/current/overview.html
  • 25. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION | Amazon Kinesis + Kafka + tgcloud.io 25 Kafka Loader Kafka Connect Amazon Kinesis Source Connector https://docs.confluent.io/kafka-connect-kinesis/current/index.html
  • 26. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION | DEMO 26
  • 27. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION | Summary 27
  • 28. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION | Summary ● Graph Analytics Overview ● TigerGraph Data Ingestion System Architecture: Kafka as an Internal Component ● Built-in Kafka Loader in TigerGraph ● TigerGraph Cloud (tgcloud.io) and Kafka Data Pipeline Use Cases 28
  • 29. Get Started for Free ● Try TigerGraph Cloud ( tgcloud.io ) ● Download TigerGraph’s Developer Edition ● Take a Test Drive - Online Demo ● Get TigerGraph Certified ● Join the Community @TigerGraphDB /tigergraph /TigerGraphDB /company/TigerGraph 29