SlideShare a Scribd company logo
Integrating a Low-Latency
Distributed Database with Event
Streaming
Developer Advocate
at ScyllaDB
Raouf Chebri
Outline
2
1. ScyllaDB
2. Change Data Capture
3. Connectors
4. Pulsar Sink Connector
ScyllaDB
1
Cluster
4
Multi-datacenter replication
5
ALTER KEYSPACE ks WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'DC1':3, 'DC2':3};
Shard-awareness
6
7
Availability vs Consistency
8
Insert Select
Success Data
Availability vs Consistency
9
Insert Select
Success Data
Change Data
Capture
2
Change Data Capture
11
Update
Table t
pk ck v
0 0 1
Table t
pk ck v
0 0 3
change at 2020-01-29 14:37:32: UPDATE ks.t SET v = 3 WHERE pk = 0 AND ck = 0;
SOURCES
Kafka Connectors
12
SINKS
Connectors
3
Evolution of System Architecture
16
frontend backend database
Evolution of System Architecture
17
Evolution of System Architecture
18
Connectors
19
“The code is a
liability,
not an asset.”
Matt Coulter
Built-in connectors
20
SOURCES SINKS
Pulsar Sink
Connector
4
Latency Matters
22
SOURCES
Future Connectors
23
SINKS
Recap
24
1. ScyllaDB
2. Change Data Capture
3. Connectors
4. Pulsar + ScyllaDB
Keep in touch!
Raouf Chebri
Developer Advocate
ScyllaDB
developers@scylladb.com
@raoufscylladb

More Related Content

Similar to Integrating a Low-Latency Distributed Database with Event Streaming

SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
HostedbyConfluent
 
Changing landscapes in data integration - Kafka Connect for near real-time da...
Changing landscapes in data integration - Kafka Connect for near real-time da...Changing landscapes in data integration - Kafka Connect for near real-time da...
Changing landscapes in data integration - Kafka Connect for near real-time da...
HostedbyConfluent
 
A Better Kafka Connect With Kubernetes, Stefan Sprenger & Hakan Lofcali | Cur...
A Better Kafka Connect With Kubernetes, Stefan Sprenger & Hakan Lofcali | Cur...A Better Kafka Connect With Kubernetes, Stefan Sprenger & Hakan Lofcali | Cur...
A Better Kafka Connect With Kubernetes, Stefan Sprenger & Hakan Lofcali | Cur...
HostedbyConfluent
 
Learnings From Shipping 1000+ Streaming Data Pipelines To Production with Hak...
Learnings From Shipping 1000+ Streaming Data Pipelines To Production with Hak...Learnings From Shipping 1000+ Streaming Data Pipelines To Production with Hak...
Learnings From Shipping 1000+ Streaming Data Pipelines To Production with Hak...
HostedbyConfluent
 
Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent)
Introduction to ksqlDB and stream processing (Vish Srinivasan  - Confluent)Introduction to ksqlDB and stream processing (Vish Srinivasan  - Confluent)
Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent)
KafkaZone
 
PartnerSkillUp_Enable a Streaming CDC Solution
PartnerSkillUp_Enable a Streaming CDC SolutionPartnerSkillUp_Enable a Streaming CDC Solution
PartnerSkillUp_Enable a Streaming CDC Solution
Timothy Spann
 
Spark Summit EU talk by Michael Nitschinger
Spark Summit EU talk by Michael NitschingerSpark Summit EU talk by Michael Nitschinger
Spark Summit EU talk by Michael Nitschinger
Spark Summit
 
First review presentation
First review presentationFirst review presentation
First review presentation
Arvind Krishnaa
 
Top 10 present and future innovations in the NoSQL Cassandra ecosystem (2022)
Top 10 present and future innovations in the NoSQL Cassandra ecosystem (2022)Top 10 present and future innovations in the NoSQL Cassandra ecosystem (2022)
Top 10 present and future innovations in the NoSQL Cassandra ecosystem (2022)
Cédrick Lunven
 
All Streams Ahead! ksqlDB Workshop ANZ
All Streams Ahead! ksqlDB Workshop ANZAll Streams Ahead! ksqlDB Workshop ANZ
All Streams Ahead! ksqlDB Workshop ANZ
confluent
 
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena EdelsonStreaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
Spark Summit
 
SF big Analytics : Stream all things by Gwen Shapira @ Lyft 2018
SF big Analytics : Stream all things by Gwen Shapira @ Lyft 2018SF big Analytics : Stream all things by Gwen Shapira @ Lyft 2018
SF big Analytics : Stream all things by Gwen Shapira @ Lyft 2018
Chester Chen
 
Concepts and Patterns for Streaming Services with Kafka
Concepts and Patterns for Streaming Services with KafkaConcepts and Patterns for Streaming Services with Kafka
Concepts and Patterns for Streaming Services with Kafka
QAware GmbH
 
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
HostedbyConfluent
 
Data relay introduction to big data clusters
Data relay introduction to big data clustersData relay introduction to big data clusters
Data relay introduction to big data clusters
Chris Adkin
 
BigData Developers MeetUp
BigData Developers MeetUpBigData Developers MeetUp
BigData Developers MeetUp
Christian Johannsen
 
Сloud Webinar #1 “Architecture of Highly Loaded Geo-Distributed Applications”
Сloud Webinar #1 “Architecture of Highly Loaded Geo-Distributed Applications”Сloud Webinar #1 “Architecture of Highly Loaded Geo-Distributed Applications”
Сloud Webinar #1 “Architecture of Highly Loaded Geo-Distributed Applications”
GlobalLogic Ukraine
 
INTERFACE, by apidays - Apache Cassandra now speaks developer with Stargate ...
INTERFACE, by apidays  - Apache Cassandra now speaks developer with Stargate ...INTERFACE, by apidays  - Apache Cassandra now speaks developer with Stargate ...
INTERFACE, by apidays - Apache Cassandra now speaks developer with Stargate ...
apidays
 
Distributing the SMACK stack - Kubernetes VS DCOS - Sahil Sawhney (Knoldus Inc.)
Distributing the SMACK stack - Kubernetes VS DCOS - Sahil Sawhney (Knoldus Inc.)Distributing the SMACK stack - Kubernetes VS DCOS - Sahil Sawhney (Knoldus Inc.)
Distributing the SMACK stack - Kubernetes VS DCOS - Sahil Sawhney (Knoldus Inc.)
Tech Triveni
 
Distributing the SMACK stack - Kubernetes VS DCOS - Sahil Sawhney (Knoldus Inc.)
Distributing the SMACK stack - Kubernetes VS DCOS - Sahil Sawhney (Knoldus Inc.)Distributing the SMACK stack - Kubernetes VS DCOS - Sahil Sawhney (Knoldus Inc.)
Distributing the SMACK stack - Kubernetes VS DCOS - Sahil Sawhney (Knoldus Inc.)
Knoldus Inc.
 

Similar to Integrating a Low-Latency Distributed Database with Event Streaming (20)

SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
 
Changing landscapes in data integration - Kafka Connect for near real-time da...
Changing landscapes in data integration - Kafka Connect for near real-time da...Changing landscapes in data integration - Kafka Connect for near real-time da...
Changing landscapes in data integration - Kafka Connect for near real-time da...
 
A Better Kafka Connect With Kubernetes, Stefan Sprenger & Hakan Lofcali | Cur...
A Better Kafka Connect With Kubernetes, Stefan Sprenger & Hakan Lofcali | Cur...A Better Kafka Connect With Kubernetes, Stefan Sprenger & Hakan Lofcali | Cur...
A Better Kafka Connect With Kubernetes, Stefan Sprenger & Hakan Lofcali | Cur...
 
Learnings From Shipping 1000+ Streaming Data Pipelines To Production with Hak...
Learnings From Shipping 1000+ Streaming Data Pipelines To Production with Hak...Learnings From Shipping 1000+ Streaming Data Pipelines To Production with Hak...
Learnings From Shipping 1000+ Streaming Data Pipelines To Production with Hak...
 
Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent)
Introduction to ksqlDB and stream processing (Vish Srinivasan  - Confluent)Introduction to ksqlDB and stream processing (Vish Srinivasan  - Confluent)
Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent)
 
PartnerSkillUp_Enable a Streaming CDC Solution
PartnerSkillUp_Enable a Streaming CDC SolutionPartnerSkillUp_Enable a Streaming CDC Solution
PartnerSkillUp_Enable a Streaming CDC Solution
 
Spark Summit EU talk by Michael Nitschinger
Spark Summit EU talk by Michael NitschingerSpark Summit EU talk by Michael Nitschinger
Spark Summit EU talk by Michael Nitschinger
 
First review presentation
First review presentationFirst review presentation
First review presentation
 
Top 10 present and future innovations in the NoSQL Cassandra ecosystem (2022)
Top 10 present and future innovations in the NoSQL Cassandra ecosystem (2022)Top 10 present and future innovations in the NoSQL Cassandra ecosystem (2022)
Top 10 present and future innovations in the NoSQL Cassandra ecosystem (2022)
 
All Streams Ahead! ksqlDB Workshop ANZ
All Streams Ahead! ksqlDB Workshop ANZAll Streams Ahead! ksqlDB Workshop ANZ
All Streams Ahead! ksqlDB Workshop ANZ
 
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena EdelsonStreaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
 
SF big Analytics : Stream all things by Gwen Shapira @ Lyft 2018
SF big Analytics : Stream all things by Gwen Shapira @ Lyft 2018SF big Analytics : Stream all things by Gwen Shapira @ Lyft 2018
SF big Analytics : Stream all things by Gwen Shapira @ Lyft 2018
 
Concepts and Patterns for Streaming Services with Kafka
Concepts and Patterns for Streaming Services with KafkaConcepts and Patterns for Streaming Services with Kafka
Concepts and Patterns for Streaming Services with Kafka
 
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
 
Data relay introduction to big data clusters
Data relay introduction to big data clustersData relay introduction to big data clusters
Data relay introduction to big data clusters
 
BigData Developers MeetUp
BigData Developers MeetUpBigData Developers MeetUp
BigData Developers MeetUp
 
Сloud Webinar #1 “Architecture of Highly Loaded Geo-Distributed Applications”
Сloud Webinar #1 “Architecture of Highly Loaded Geo-Distributed Applications”Сloud Webinar #1 “Architecture of Highly Loaded Geo-Distributed Applications”
Сloud Webinar #1 “Architecture of Highly Loaded Geo-Distributed Applications”
 
INTERFACE, by apidays - Apache Cassandra now speaks developer with Stargate ...
INTERFACE, by apidays  - Apache Cassandra now speaks developer with Stargate ...INTERFACE, by apidays  - Apache Cassandra now speaks developer with Stargate ...
INTERFACE, by apidays - Apache Cassandra now speaks developer with Stargate ...
 
Distributing the SMACK stack - Kubernetes VS DCOS - Sahil Sawhney (Knoldus Inc.)
Distributing the SMACK stack - Kubernetes VS DCOS - Sahil Sawhney (Knoldus Inc.)Distributing the SMACK stack - Kubernetes VS DCOS - Sahil Sawhney (Knoldus Inc.)
Distributing the SMACK stack - Kubernetes VS DCOS - Sahil Sawhney (Knoldus Inc.)
 
Distributing the SMACK stack - Kubernetes VS DCOS - Sahil Sawhney (Knoldus Inc.)
Distributing the SMACK stack - Kubernetes VS DCOS - Sahil Sawhney (Knoldus Inc.)Distributing the SMACK stack - Kubernetes VS DCOS - Sahil Sawhney (Knoldus Inc.)
Distributing the SMACK stack - Kubernetes VS DCOS - Sahil Sawhney (Knoldus Inc.)
 

More from ScyllaDB

Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-EfficiencyFreshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
ScyllaDB
 
Optimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityOptimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through Observability
ScyllaDB
 
Event-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream ProcessingEvent-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream Processing
ScyllaDB
 
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
ScyllaDB
 
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
ScyllaDB
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
ScyllaDB
 
What Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLWhat Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQL
ScyllaDB
 
Low Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsLow Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & Pitfalls
ScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
ScyllaDB
 
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBBeyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
ScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
ScyllaDB
 
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
ScyllaDB
 
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
ScyllaDB
 
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaDatabase Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
ScyllaDB
 
Replacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBReplacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDB
ScyllaDB
 
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityPowering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
ScyllaDB
 
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
ScyllaDB
 
Getting the most out of ScyllaDB
Getting the most out of ScyllaDBGetting the most out of ScyllaDB
Getting the most out of ScyllaDB
ScyllaDB
 
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationNoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
ScyllaDB
 
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsNoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
ScyllaDB
 

More from ScyllaDB (20)

Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-EfficiencyFreshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
 
Optimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityOptimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through Observability
 
Event-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream ProcessingEvent-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream Processing
 
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
 
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
What Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLWhat Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQL
 
Low Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsLow Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & Pitfalls
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBBeyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
 
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
 
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaDatabase Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
 
Replacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBReplacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDB
 
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityPowering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
 
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
 
Getting the most out of ScyllaDB
Getting the most out of ScyllaDBGetting the most out of ScyllaDB
Getting the most out of ScyllaDB
 
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationNoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
 
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsNoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
 

Recently uploaded

GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
kumardaparthi1024
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
panagenda
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
Edge AI and Vision Alliance
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
Daiki Mogmet Ito
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
Matthew Sinclair
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Alpen-Adria-Universität
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
panagenda
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
Hiroshi SHIBATA
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
Jason Packer
 
Webinar: Designing a schema for a Data Warehouse
Webinar: Designing a schema for a Data WarehouseWebinar: Designing a schema for a Data Warehouse
Webinar: Designing a schema for a Data Warehouse
Federico Razzoli
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
Zilliz
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
Chart Kalyan
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 
Project Management Semester Long Project - Acuity
Project Management Semester Long Project - AcuityProject Management Semester Long Project - Acuity
Project Management Semester Long Project - Acuity
jpupo2018
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Safe Software
 

Recently uploaded (20)

GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
 
Webinar: Designing a schema for a Data Warehouse
Webinar: Designing a schema for a Data WarehouseWebinar: Designing a schema for a Data Warehouse
Webinar: Designing a schema for a Data Warehouse
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 
Project Management Semester Long Project - Acuity
Project Management Semester Long Project - AcuityProject Management Semester Long Project - Acuity
Project Management Semester Long Project - Acuity
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 

Integrating a Low-Latency Distributed Database with Event Streaming

Editor's Notes

  1. Intro self THANKS for attending the Distributed Data Systems Masterclass!
  2. So maybe, this is a good time, before we dive into the demo, to take a moment and explain what is ScyllaDB and why we’re using it.
  3. Here is an illustration of how data is replicated across a cluster. The query is parsed and processed by a node then replicated in other nodes for high availability. We’ve seen this representation of a cluster in a previous slide and you might also have seen this architecture before if you’re familiar with distributed systems and other databases such as DynamoDB, CosmosDB and Apache Cassandra. In fact, ScyllaDB is modeled after Apache Cassandra and is API compatible both Cassandra and DynamoDB. But what makes ScyllaDB a lot faster than the competition and standout is two things: It’s implemented in C++ and not Java And, its shard-per-code architecture.
  4. Let’s talk briefly about what shard-per-core means. Shard-per-core means that every CPU of your machine is responsible of a portion of the data. So imagine the following: when you send a query to the database, the partition key determines which node in the cluster will process the query. With shard-awareness, it also determines which core will process the query. NEXT
  5. Let’s talk first a little bit about Connectors.
  6. built-in connectors helps you avoid all those bug-prone and lengthy procedures. So let’s say you’re using a streaming platform such as Kafka or Pulsar. Connectors help you import and export data from some of the most commonly used data systems with just a simple configuration. Connectors to import data into the steaming platform are called Sources. Connectors to export data from the streaming platform. We call those Sinks.
  7. We don’t have time to cover this today, but If you’re interested in knowing more about ScyllaDB, you should check-out Scylla University or actually attend Scylla University LIVE in July. Alright! We said ScyllaDB and Pulsar are both very fast! So let’s now get to the demo!
  8. Cheers to you for keeping afloat with the current pace of technology and trying to always improve
  9. Let’s talk first a little bit about Connectors.
  10. Life used to be simple Your application became popular
  11. Scale up Virtual machines Worried about the data: multiple instances, Geo-replication Worry about latency because of geographical locations: Caching and CDN
  12. You heard of containers and its advantages You heard of microservices Added an event-bus
  13. Code is a liability, not an asset I actually stole this quote from Matt Coulter, who is AWS Serverless Hero. I tried to trace the quote back to its original author but I couldn’t find them. So please ping me on twitter or send me an email if you know more about that. But basically, code is a liability. The more code you write, the more bugs you write and the more you fix and maintain.
  14. Summary: Streaming Systems serve as data pipelines and help take it from a place to another
  15. In this demo, we’ll spin-up a ScyllaDB instance Configure Pulsar to use ScyllaDB as a Sink And walk through the code to set up a producer I already have pulsar running using docker. I need an instance of Scylla as well. Here is the command to run a cluster locally. I’m using docker to run a ScyllaDB container on my machine, but sometimes I don’t like to run many things on my machine, and also to avoid the “it worked on my machine” kinda issues I like to use Scylla Cloud, that you can try for free. In the dashboard, I can create a cluster and deploy it on AWS or GCP in the geographical location that makes the most sense for your application. The closer the better your latency. I’m going to skip this part as I already have a cluster ready for the demo. On my cluster, I get information about my nodes and if they are running. I see that everything is okay. In the connect tab, I have instructions to connect to the cluster. I can use any of the languages I like and also connect using the command line and CQLSH. CQL is the Cassandra Query Language, which if you are familiar with SQL, should be very intuitive. So I’m going to connect to the cluster and see what’s in there. Here are the keyspaces that I have. Keyspace is the logical location of where your data is stored where you also define things like Replication Strategy and Replication factor. Let me use pulsar_test_keyspace and have a look at the tables. And you can see I created a pulsar_test_table that has a key and a col. So let’s configure the sink. In pulsar, I created a config file sink-config.json where I specify the keyspace, table and topic. Now I ready to create the sink using the following command. You can see I have tenant, namespace, the sink-type and the config file as input. Now that I successfully create the sink, let’s test it out. I have a simple python script
  16. The reason you’d consider using ScyllaDB is because latency matters at scale. This is a chart from the Kafka vs Pulsar benchmark, that you can find on StreamNative’s website and that I highly encourage you to read. At p99 and any anything beyond that is where it gets interesting. With 1KB size messages, we clearly see Pulsar (here in blue) consistently operate at a few milliseconds or even microseconds. The database you choose can easily become a bottleneck to your system. That’s why you need a system that can digest that huge in-flux of data very quickly. ScyllaDB is designed to do just that. It’s designed to handle millions of operations per second with sub-milliseconds p99. NEXT
  17. built-in connectors helps you avoid all those bug-prone and lengthy procedures. So let’s say you’re using a streaming platform such as Kafka or Pulsar. Connectors help you import and export data from some of the most commonly used data systems with just a simple configuration. Connectors to import data into the steaming platform are called Sources. Connectors to export data from the streaming platform. We call those Sinks.