Version 1.0
Moving Data from Cassandra to
DataStax Astra using DSBulk
An Anant Corporation Story.
Cassandra
● Apache Cassandra is an open-source, distributed NoSQL
database designed to handle large volumes of data
across many servers
● Cassandra clusters can be scaled by either
improving hardware on current nodes (vertical
scalability) or adding more nodes (horizontal
scalability)
○ Horizontal scalability is part of why Cassandra is
so powerful: inexpensive machines can be added to a
cluster to significantly improve its performance
● Note: Demo will use Open Source Cassandra
○ Works nearly identically with DSE Cassandra
DataStax Astra
● Astra website:
https://www.datastax.com/products/datastax-astra
● DataStax Astra is a fully managed, serverless database
built on Apache Cassandra, and is provided by
DataStax
● Some additional features:
○ Stargate APIs: Let developers work with data in a
Cassandra-based database like Astra without deep
knowledge of CQL
○ Zero Lock-In: Deploy on AWS, GCP and Azure and still
maintain compatibility with open-source Cassandra
○ Global Scale: Data replication across multiple data
centers, availability zones, and multiple regions.
■ Additionally, allows a user to scale an Astra
database up to multiple petabytes of data without
impacting speed or performance
○ 80 GB of storage and 20 million read/write operations for
free every month
DSBulk
● DSBulk: DataStax Bulk Loader for Apache Cassandra is open-source software used to
load and unload CSV or JSON data into and out of supported databases
● Supported databases:
○ DataStax Astra cloud database
○ DataStax Enterprise (DSE) 4.7 and later
○ Open source Apache Cassandra 2.1 and later
● An introduction to DSBulk and further documentation can be found here:
https://docs.datastax.com/en/dsbulk/doc/dsbulk/dsbulkAbout.html
● GitHub repository for the DataStax DSBulk project: https://github.com/datastax/dsbulk
DSBulk cont...
● Commands used in today’s presentation/demo:
○ dsbulk load
■ Loads data into a Cassandra/Astra database without a configuration file. The necessary
parameters (listed below) must be passed on the command line.
○ dsbulk unload
■ Unloads data from a Cassandra/Astra database into a CSV or JSON file without a configuration file.
The necessary parameters must be passed in here as well.
○ dsbulk count
■ Counts the rows in a table of a Cassandra/Astra database, which is useful for checking loaded data.
● Parameters/flags needed when running these commands without a configuration file:
○ -k: keyspace
○ -t: table
○ -b: path to the secure connect bundle (only necessary when connecting to Astra)
○ -u: username, -p: password (for the database)
■ Since an Astra update earlier this year, Astra requires a ClientID/ClientSecret instead of a
username/password.
■ Can be omitted for a local Cassandra database that still uses its default authentication settings (default user/password: cassandra/cassandra)
○ -url: URL from which to pull a .csv or .json file, or a local directory into which to unload data
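To make the flag list concrete, here is a minimal Python sketch that assembles a dsbulk command line from the flags above. It only builds and prints the command; it does not run dsbulk. The helper name `dsbulk_command`, the keyspace and table names, and the example URL are hypothetical placeholders, not values from the presentation.

```python
import shlex


def dsbulk_command(operation, keyspace, table, url=None, bundle=None,
                   username=None, password=None):
    """Build a dsbulk argument list using only the flags listed on this slide."""
    if operation not in ("load", "unload", "count"):
        raise ValueError(f"unsupported dsbulk operation: {operation}")
    args = ["dsbulk", operation, "-k", keyspace, "-t", table]
    if url is not None:
        args += ["-url", url]      # CSV/JSON source, or target directory for unload
    if bundle is not None:
        args += ["-b", bundle]     # secure connect bundle, Astra only
    if username is not None:
        args += ["-u", username]   # for Astra: the ClientID
    if password is not None:
        args += ["-p", password]   # for Astra: the ClientSecret
    return args


# Hypothetical example values; replace them with your own keyspace, table, and file.
cmd = dsbulk_command("load", "demo_ks", "demo_table",
                     url="https://example.com/data.csv")
print(shlex.join(cmd))
```

Returning an argument list rather than a single string keeps the command safe to pass to `subprocess.run` without shell quoting issues, should you choose to execute it.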
Demo Project Slide
● Link to GitHub repo: https://github.com/DataStax-Examples/dsbulk-to-astra/
○ The demo is based on sample data from this GitHub repository
● The demo goes through four main processes using dsbulk:
○ Loading a .csv hosted online into local Cassandra
○ Loading a .csv hosted online into Astra
○ Unloading from local Cassandra to a .csv file
○ Loading from a .csv file into Astra
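The four processes above could look roughly like the commands printed by the following Python sketch. Every value in it is a hypothetical placeholder (CSV URL, unload directory, bundle path, keyspace, table, credentials), and it assumes the same keyspace and table exist both locally and in Astra. The actual demo repository may use different names and extra options.

```python
import shlex

# Hypothetical placeholders, not values taken from the demo repository.
CSV_URL = "https://example.com/sample.csv"
UNLOAD_DIR = "/tmp/dsbulk_unload"
KEYSPACE, TABLE = "demo_ks", "demo_table"
# Astra needs the secure connect bundle plus ClientID/ClientSecret.
ASTRA_AUTH = ["-b", "/path/to/secure-connect-bundle.zip",
              "-u", "CLIENT_ID", "-p", "CLIENT_SECRET"]


def dsbulk(operation, url, extra=()):
    """Return the argument list for one dsbulk invocation."""
    return ["dsbulk", operation, "-k", KEYSPACE, "-t", TABLE, "-url", url, *extra]


steps = [
    ("Load an online .csv into local Cassandra", dsbulk("load", CSV_URL)),
    ("Load an online .csv into Astra", dsbulk("load", CSV_URL, ASTRA_AUTH)),
    ("Unload local Cassandra to .csv", dsbulk("unload", UNLOAD_DIR)),
    ("Load the unloaded .csv into Astra", dsbulk("load", UNLOAD_DIR, ASTRA_AUTH)),
]

for label, args in steps:
    print(f"# {label}")
    print(shlex.join(args))
```

The local Cassandra steps pass no `-b`, `-u`, or `-p`, which assumes a local cluster running with its default authentication settings.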
Resources
● https://www.datastax.com/products/datastax-astra
● https://github.com/datastax/dsbulk
● https://docs.datastax.com/en/dsbulk/doc/dsbulk/dsbulkAbout.html
● https://docs.datastax.com/en/dsbulk/doc/dsbulk/install/dsbulkInstall.html
● https://github.com/DataStax-Examples/dsbulk-to-astra/
● https://github.com/Anant/cassandra.api/
● https://docs.datastax.com/en/dsbulk/doc/dsbulk/getStartedDsbulk.html
● https://www.datastax.com/products/datastax-astra/features
● https://docs.datastax.com/en/dsbulk/doc/dsbulk/dsbulkSimpleLoad.html
● https://docs.datastax.com/en/dsbulk/doc/dsbulk/dsbulkSimpleUnload.html
Strategy: Scalable Fast Data
Architecture: Cassandra, Spark, Kafka
Engineering: Node, Python, JVM, CLR
Operations: Cloud, Container
Rescue: Downtime!! I need help.
www.anant.us | solutions@anant.us | (855) 262-6826
3 Washington Circle, NW | Suite 301 | Washington, DC 20037

Apache Cassandra Lunch #67: Moving Data from Cassandra to Datastax Astra
