An Introduction to Cassandra
About Pythian
• 18 years of data infrastructure management consulting
• 200+ top brands
• 6000+ databases under management
• Over 400 DBAs, in 35 countries
• Top 5% of DBA work force, 9 Oracle ACEs, 2 Microsoft MVPs, 1 Cassandra MVP
• Oracle, Microsoft, MySQL,
• DataStax partners, Netezza,
• Hadoop and MongoDB plus
• UNIX Sysadmin and Oracle apps
© 2016. All Rights Reserved.
About me
• Cassandra Consultant
• First contact was 0.8
• Cassandra MVP & DataStax Certified Architect
• Lisbon Cassandra Meetup
• Passion for distributed systems
• Loves a good challenge
• Water polo is my sport
• @cjrolo
Cassandra 101
• Cassandra is a highly scalable, distributed, masterless NoSQL database
• Column Store Architecture
• Log Structured Data
• CAP Theorem
• Eventually Consistent
Introduction
• Cassandra is a highly scalable, distributed, masterless NoSQL database
• All nodes are the same, highly resilient
CAP Theorem
• The CAP theorem states that you have to pick two of Consistency,
Availability, and Partition tolerance: you can't have all three at the same
time and get an acceptable latency…
• … at any given moment.
• Cassandra values Availability and Partition tolerance (AP).
Tradeoffs between consistency and latency are tunable in Cassandra
(per request!).
• Requests offer a tunable level of consistency, all the way from "writes
never fail" to "block for all replicas to be readable".
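As a rough illustration of per-request consistency, the sketch below computes how many replicas must answer at a few of Cassandra's consistency levels (ONE, QUORUM, ALL) and checks the usual overlap rule: a read is guaranteed to see the latest write when read replicas + write replicas > replication factor. The function names are hypothetical and only model the arithmetic, not a driver API.

```python
def quorum(replication_factor: int) -> int:
    """Replicas needed for a QUORUM request: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def replicas_required(level: str, replication_factor: int) -> int:
    """Map a consistency level name to the number of replica responses required."""
    table = {
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return table[level]


def read_sees_latest_write(write_level: str, read_level: str, replication_factor: int) -> bool:
    """True when the read and write replica sets must overlap (R + W > RF)."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return r + w > replication_factor


if __name__ == "__main__":
    for write_level, read_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ALL")]:
        print(write_level, read_level, read_sees_latest_write(write_level, read_level, 3))
```

With a replication factor of 3, QUORUM reads and writes overlap on at least one replica, while ONE/ONE trades that guarantee for lower latency.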
Consistent Hashing
• A hash consists of one or more arithmetic operations on a piece of
data – e.g. MD5, Murmur3
• We hash keys in an attempt to spread key hashes in a uniform
manner for any given set of keys.
• In consistent hashing, the hash range is divided up into sub-ranges;
the set of sub-ranges is called a map.
• Once the map is defined, a given key will always end up in the same
map range.
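A minimal sketch of the idea, assuming each node owns the hash range that ends at its own token. The `TokenRing` class is hypothetical, and node tokens are derived here by hashing the node name with MD5 purely for illustration; a real cluster assigns tokens differently.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Hash a key to an integer token (MD5 here; Murmur3 would work the same way)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


class TokenRing:
    """Hypothetical ring: each node owns the range that ends at its token."""

    def __init__(self, nodes):
        self._ring = sorted((token(name), name) for name in nodes)
        self._tokens = [t for t, _ in self._ring]

    def owner(self, key: str) -> str:
        # First node token >= key token; wrap around to the first node past the end.
        idx = bisect.bisect_left(self._tokens, token(key)) % len(self._ring)
        return self._ring[idx][1]


if __name__ == "__main__":
    ring = TokenRing(["node-a", "node-b", "node-c"])
    for key in ["user:1", "user:2", "user:3"]:
        print(key, "->", ring.owner(key))
```

Because the map is fixed, the same key always lands on the same node, and adding a node only takes over keys from one existing range instead of reshuffling everything.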
Log Structured Data
• Instead of rewriting records in place or storing records near each other
based on key (clustering), simply append new records, updates to records
or deletes at the end of the file that holds the table.
• Add an index so you can read the table randomly without loading the whole
thing into memory
• UPSERT people (1,"Jonathan","Ellis");
• UPSERT people (2,"Billy","Bosworth");
• UPSERT people (2,"William","Bosworth");
• DELETE people (1);
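The four operations above can be replayed against a toy append-only table to see what ends up in the log. This is a sketch under simplifying assumptions: the `AppendOnlyTable` class is hypothetical, the "file" is an in-memory list, and a delete is recorded as a `None` marker rather than removing anything.

```python
class AppendOnlyTable:
    """Toy log-structured table: records are only ever appended."""

    def __init__(self):
        self.log = []    # append-only list of (key, value); value None marks a delete
        self.index = {}  # key -> position of the newest record for that key

    def upsert(self, key, value):
        self.index[key] = len(self.log)
        self.log.append((key, value))

    def delete(self, key):
        # A delete is just another appended record (a marker), not a removal.
        self.upsert(key, None)

    def get(self, key):
        pos = self.index.get(key)
        if pos is None:
            return None
        return self.log[pos][1]


if __name__ == "__main__":
    people = AppendOnlyTable()
    people.upsert(1, ("Jonathan", "Ellis"))
    people.upsert(2, ("Billy", "Bosworth"))
    people.upsert(2, ("William", "Bosworth"))
    people.delete(1)
    print(len(people.log), people.get(1), people.get(2))
```

All four records stay in the log; the index is what lets a read jump straight to the newest version of a key.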
CRUD, ACID and Cassandra
• C* doesn’t really have CRUD. Update is a special case of Create, and
Delete is not a real Delete.
• C* is not ACID. C* doesn’t support transactions.
• C* is BASE: Basically Available Soft-state Eventual consistency.
• Different versions may live in the cluster at the same time. Eventually all the
nodes will see the newest data.
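When several versions of the same value exist on different nodes, the newest one wins. A minimal sketch of that reconciliation, assuming each version carries a write timestamp; the `(timestamp, value)` tuple layout and the `reconcile` name are illustrative, and tie-breaking between equal timestamps is not modelled.

```python
def reconcile(versions):
    """Return the version with the highest write timestamp (last write wins)."""
    return max(versions, key=lambda version: version[0])


if __name__ == "__main__":
    # Versions of the same value as seen on three different replicas.
    seen = [(100, "Billy"), (200, "William"), (100, "Billy")]
    print(reconcile(seen))
```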
Write Path
Read Path
Compaction and Repair
• Compaction is a process that Cassandra uses to keep local data in
check.
• Tables are append only, so obsolete data will live with current data
• Compaction “cleans the house”
• Repair is a process that Cassandra uses to keep cluster data in check:
• Nodes can get out-of-sync (hardware failure, network issues, etc.)
• Repair makes sure every node has the latest data
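A rough sketch of what "cleaning the house" means for append-only data: merge several segments, keep only the newest record per key, and drop keys whose newest record is a delete marker. The `compact` function and the list-of-tuples segment layout are assumptions for illustration; real compaction keeps delete markers for a grace period (`gc_grace_seconds`) before purging them, which this sketch ignores.

```python
def compact(segments):
    """Merge segments (oldest first) into one, keeping the newest record per key.

    Each segment is a list of (key, value) records; value None marks a delete.
    """
    latest = {}
    for segment in segments:
        for key, value in segment:
            latest[key] = value  # later records overwrite earlier ones
    # Drop deleted keys and return a single sorted segment.
    return [(key, value) for key, value in sorted(latest.items()) if value is not None]


if __name__ == "__main__":
    old = [(1, "Jonathan Ellis"), (2, "Billy Bosworth")]
    new = [(2, "William Bosworth"), (1, None)]
    print(compact([old, new]))
```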
CQL - Cassandra Query Language
• CQL is not SQL
• Very similar:
• cqlsh> CREATE KEYSPACE sandbox WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'DC1' : 1 };
• cqlsh> USE sandbox;
• cqlsh:sandbox> CREATE TABLE data (id uuid, data text, PRIMARY KEY (id));
• cqlsh:sandbox> INSERT INTO data (id, data) VALUES (c37d661d-7e61-49ea-96a5-68c34e83db3a, 'testing');
• cqlsh:sandbox> SELECT * FROM data;
• Abstracts the user from the internal structure (can be dangerous!)
• Provides several benefits over the older model (Thrift)
Cassandra vs Oracle
• Scale up vs Scale out
• High availability vs Continuous availability
• Highly structured vs Semi-structured
• Replication is easy
Data Distribution
Cassandra is not RDBMS
• You need to know your reads before you write
• Replication Factor affects your availability, choose wisely!
• Per operation consistency
– All the way from "writes never fail" to "block for all replicas to be
readable".
• Update is a special case of Create, and Delete is not a real
Delete.
The good, the bad and the ugly
• The Good:
– Distributed
– No single point of failure
– It's easy to use
– You get to sleep at night!
• The Bad:
– SQL != CQL, can't just drop in
• The Ugly
– Not enough users!
Opportunities
• New applications
– loose data model avoids app re-writing, data migration, etc...
• Augmentation
• Scale out better
• Have a data model that allows Cassandra to absorb high
velocity data
• Absorb traffic from several locations
Pitfalls!
• Hardware is important
– Get SSDs...
• Cassandra "just works"
– Can lead to overlooking how the system is performing
• Cassandra is new and is changing fast!
Q&A
• Thanks for listening!

An Introduction to Cassandra - Oracle User Group

  • 1. An Introduction to Cassandra
  • 2. About Pythian
    • 18 years of data infrastructure management consulting
    • 200+ top brands
    • 6000+ databases under management
    • Over 400 DBAs in 35 countries
    • Top 5% of the DBA workforce: 9 Oracle ACEs, 2 Microsoft MVPs, 1 Cassandra MVP
    • Oracle, Microsoft, MySQL, Datastax partners, Netezza, Hadoop and MongoDB, plus UNIX sysadmin and Oracle apps
    © 2016. All Rights Reserved.
  • 3. About me
    • Cassandra Consultant
    • First contact was 0.8
    • Cassandra MVP & Datastax Certified Architect
    • Lisbon Cassandra Meetup
    • Passion for distributed systems
    • Loves a good challenge
    • Waterpolo is my sport
    • @cjrolo
  • 4. Cassandra 101
    • Cassandra is a highly scalable, distributed, masterless NoSQL database
    • Column Store Architecture
    • Log Structured Data
    • CAP Theorem
    • Eventually Consistent
  • 5. Introduction
    • Cassandra is a highly scalable, distributed, masterless NoSQL database
    • All nodes are the same, which makes the cluster highly resilient
  • 6. CAP Theorem
    • The CAP theorem states that you have to pick two of Consistency, Availability and Partition tolerance: you can't have all three at any given moment and still get acceptable latency.
    • Cassandra values Availability and Partition tolerance (AP). Tradeoffs between consistency and latency are tunable in Cassandra (per request!).
    • Requests offer a tunable level of consistency, all the way from "writes never fail" to "block for all replicas to be readable".
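The per-request tradeoff on slide 6 can be made concrete with a little arithmetic. The sketch below is only an illustration, not driver code: the function names are hypothetical, it assumes a single datacenter, and it covers only the ONE, QUORUM and ALL consistency levels. It relies on the usual rule that a read is guaranteed to see the latest acknowledged write when the read and write replica counts together exceed the replication factor (R + W > RF).

```python
# Illustrative helpers only; these names are not part of any Cassandra driver.

def replicas_required(level: str, replication_factor: int) -> int:
    """Number of replica acknowledgements a request waits for (single datacenter)."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1  # strict majority of the replicas
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported level: {level}")


def read_sees_latest_write(write_level: str, read_level: str,
                           replication_factor: int) -> bool:
    """True when the read set and the write set must share a replica (R + W > RF)."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return r + w > replication_factor


if __name__ == "__main__":
    for write_level, read_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ALL")]:
        overlap = read_sees_latest_write(write_level, read_level, 3)
        print(f"write={write_level:<6} read={read_level:<6} overlap guaranteed: {overlap}")
```

Lower levels answer faster and survive more node failures, while higher levels trade latency and availability for stronger guarantees.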
  • 7. Consistent Hashing
    • A hash consists of one or more arithmetic operations on a piece of data (e.g. MD5, Murmur3).
    • We hash keys in an attempt to spread the key hashes uniformly for any given set of keys.
    • A consistent hash is one where the hash range is divided up into sub-ranges; this division is called a map.
    • Once the map is defined, a given key will always end up in the same map range.
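A minimal sketch of the map described on slide 7, assuming one token per node and MD5 as the hash. The `Ring` class and the node names are hypothetical, and the sketch ignores replication and virtual nodes, so it should not be read as Cassandra's actual partitioner code.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Hash a string to an integer position on the ring (MD5, for illustration)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


class Ring:
    """Hypothetical consistent-hash map: each node owns the range ending at its token."""

    def __init__(self, nodes):
        self._entries = sorted((token(node), node) for node in nodes)
        self._tokens = [t for t, _ in self._entries]

    def owner(self, key: str) -> str:
        # Find the first node token at or after the key's token;
        # past the last token, wrap around to the first node.
        i = bisect.bisect_left(self._tokens, token(key))
        return self._entries[i % len(self._entries)][1]


if __name__ == "__main__":
    ring = Ring(["node-a", "node-b", "node-c"])
    for key in ["alice", "bob", "carol"]:
        print(key, "->", ring.owner(key))
```

The same key always maps to the same range, and adding a node only takes over keys from the range it splits; keys in every other range stay where they were.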
  • 8. Log Structured Data
    • Instead of rewriting records in place or storing records near each other based on key (clustering), simply append new records, updates to records and deletes to the end of the file that holds the table.
    • Add an index so you can read the table randomly without loading the whole thing into memory.
    • UPSERT people (1, 'Jonathan', 'Ellis');
    • UPSERT people (2, 'Billy', 'Bosworth');
    • UPSERT people (2, 'William', 'Bosworth');
    • DELETE people (1);
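The four operations on slide 8 can be replayed against a toy append-only file. This is a sketch under stated assumptions: `AppendOnlyTable` is a hypothetical class, records are stored as JSON lines, and a delete is written as a record with a null value. Cassandra's real commit log and SSTable formats are different.

```python
import json
import os
import tempfile


class AppendOnlyTable:
    """Toy log-structured table: every change is appended, nothing is rewritten."""

    def __init__(self, path):
        self.path = path
        self.index = {}  # key -> byte offset of the newest record for that key
        open(path, "ab").close()  # create the file if it does not exist

    def _append(self, record):
        offset = os.path.getsize(self.path)  # the new record starts at the current end
        with open(self.path, "ab") as f:
            f.write((json.dumps(record) + "\n").encode("utf-8"))
        self.index[record["key"]] = offset

    def upsert(self, key, first, last):
        self._append({"key": key, "value": [first, last]})

    def delete(self, key):
        # A delete is just another appended record (a tombstone).
        self._append({"key": key, "value": None})

    def get(self, key):
        offset = self.index.get(key)
        if offset is None:
            return None
        with open(self.path, "rb") as f:
            f.seek(offset)  # random read via the index, no full scan
            return json.loads(f.readline())["value"]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        people = AppendOnlyTable(os.path.join(tmp, "people.log"))
        people.upsert(1, "Jonathan", "Ellis")
        people.upsert(2, "Billy", "Bosworth")
        people.upsert(2, "William", "Bosworth")
        people.delete(1)
        with open(people.path, "rb") as f:
            records_on_disk = sum(1 for _ in f)
        return people.get(1), people.get(2), records_on_disk


if __name__ == "__main__":
    print(demo())
```

All four records remain on disk after the run, even though only one row is still live; reclaiming that space is the job of compaction.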
  • 9. CRUD, ACID and Cassandra
    • C* doesn’t really have CRUD. Update is a special case of Create, and Delete is not a real Delete.
    • C* is not ACID. C* doesn’t support transactions.
    • C* is BASE: Basically Available, Soft state, Eventual consistency.
    • Different versions may live in the cluster at the same time. Eventually all the nodes will see the newest data.
  • 12. Compaction and Repair
    • Compaction is a process that Cassandra uses to keep local data in check.
      • Tables are append-only, so obsolete data lives alongside current data
      • Compaction “cleans the house”
    • Repair is a process that Cassandra uses to keep cluster data in check:
      • Nodes can get out of sync (hardware failure, network issues, etc.)
      • Repair makes sure every node has the latest data
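The "cleans the house" step on slide 12 can be sketched as a merge that keeps only the newest version of each key. This is a simplified illustration: `compact` is a hypothetical function, segments are plain lists of `(key, timestamp, value)` tuples, and tombstones (value `None`) are dropped immediately, whereas Cassandra keeps them for a configurable grace period before purging.

```python
def compact(segments):
    """Merge append-only segments into one, keeping the newest version of each key.

    Each segment is a list of (key, timestamp, value) tuples; value None marks a delete.
    """
    newest = {}
    for segment in segments:
        for key, timestamp, value in segment:
            current = newest.get(key)
            # The record with the higher timestamp wins (last write wins).
            if current is None or timestamp >= current[0]:
                newest[key] = (timestamp, value)
    # Drop deleted keys and emit the survivors in key order.
    return [(key, ts, value)
            for key, (ts, value) in sorted(newest.items())
            if value is not None]


if __name__ == "__main__":
    older = [(1, 10, "Jonathan Ellis"), (2, 11, "Billy Bosworth")]
    newer = [(2, 12, "William Bosworth"), (1, 13, None)]
    print(compact([older, newer]))
```

Repair applies the same newest-timestamp comparison across nodes rather than across files on a single node.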
  • 13. CQL - Cassandra Query Language
    • CQL is not SQL
    • Very similar:
      • cqlsh> CREATE KEYSPACE sandbox WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'DC1' : 1 };
      • cqlsh> USE sandbox;
      • cqlsh:sandbox> CREATE TABLE data (id uuid, data text, PRIMARY KEY (id));
      • cqlsh:sandbox> INSERT INTO data (id, data) VALUES (c37d661d-7e61-49ea-96a5-68c34e83db3a, 'testing');
      • cqlsh:sandbox> SELECT * FROM data;
    • Abstracts the internal structure away from the user (can be dangerous!)
    • Provides several benefits over the older model (Thrift)
  • 14. Cassandra vs Oracle
    • Scale up vs Scale out
    • High availability vs Continuous availability
    • Highly structured vs Semi-structured
    • Replication is easy
  • 16. Cassandra is not RDBMS
    • You need to know your reads before you write
    • Replication Factor affects your availability, choose wisely!
    • Per-operation consistency: all the way from "writes never fail" to "block for all replicas to be readable".
    • Update is a special case of Create, and Delete is not a real Delete.
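To see why the replication factor on slide 16 matters for availability, it helps to count how many replicas can be down while a request still succeeds. The helper names below are hypothetical and the sketch assumes a single datacenter; it is arithmetic only, not a statement about any particular cluster.

```python
def quorum(replication_factor: int) -> int:
    """Strict majority of the replicas."""
    return replication_factor // 2 + 1


def tolerated_replica_failures(replication_factor: int, acks_required: int) -> int:
    """Replicas that may be unreachable while a request can still be satisfied."""
    if not 1 <= acks_required <= replication_factor:
        raise ValueError("acks_required must be between 1 and the replication factor")
    return replication_factor - acks_required


if __name__ == "__main__":
    print("RF  ONE  QUORUM  ALL")
    for rf in range(1, 6):
        one = tolerated_replica_failures(rf, 1)
        majority = tolerated_replica_failures(rf, quorum(rf))
        everything = tolerated_replica_failures(rf, rf)
        print(f"{rf:<3} {one:<4} {majority:<7} {everything}")
```

With a replication factor of 1 or 2, a quorum request cannot survive the loss of a single replica, which is one reason a replication factor of 3 is a common starting point.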
  • 17. The good, the bad and the ugly
    • The Good:
      • Distributed
      • No single point of failure
      • It's easy to use
      • You get to sleep at night!
    • The Bad:
      • SQL != CQL, can't just drop in
    • The Ugly:
      • Not enough users!
  • 18. Opportunities
    • New applications: a loose data model avoids app rewriting, data migration, etc.
    • Augmentation
      • Scale out better
      • Have a data model that allows Cassandra to absorb high-velocity data
      • Absorb traffic from several locations
  • 19. Pitfalls!
    • Hardware is important: get SSDs
    • Cassandra "just works", which can lead to overlooking how the system is performing
    • Cassandra is new and is changing fast!
  • 20. Q&A
    • Thanks for listening!
