SlideShare a Scribd company logo
Cassandra   Structured Storage System over a P2P Network Avinash Lakshman, Prashant Malik, Karthik Ranganathan
Why Cassandra? ,[object Object],[object Object],[object Object],[object Object]
Design Goals ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Cassandra Architecture Messaging Layer Cluster Membership Failure Detector Storage Layer Partitioner Replicator Cassandra API Tools
Data Model KEY ColumnFamily1  Name  : MailList   Type  : Simple   Sort  : Name   Name : tid1 Value : <Binary> TimeStamp : t1 Name : tid2 Value : <Binary> TimeStamp : t2 Name : tid3 Value : <Binary> TimeStamp : t3 Name : tid4 Value : <Binary> TimeStamp : t4 ColumnFamily2  Name  : WordList   Type  : Super   Sort  : Time   Name : aloha ColumnFamily3  Name  : System   Type  : Super   Sort  : Name   Name : hint1 <Column List> Name : hint2 <Column List> Name : hint3 <Column List> Name : hint4 <Column List> Name : dude C2  V2 T2 C6 V6 T6 Column Families are declared upfront Columns are added and modified dynamically SuperColumns are added and modified dynamically Columns are added and modified dynamically C1  V1 T1 C2 V2 T2 C3 V3 T3 C4 V4 T4
Write Operations ,[object Object],[object Object],[object Object],[object Object]
Write cont’d Key (CF1 , CF2 , CF3) Commit Log Binary serialized  Key ( CF1 , CF2 , CF3 ) Memtable ( CF1) Memtable ( CF2) Memtable ( CF2) FLUSH ,[object Object],[object Object],[object Object],Dedicated Disk <Key name><Size of key Data><Index of columns/supercolumns>< Serialized column family>  --- --- --- --- <Key name><Size of key Data><Index of columns/supercolumns>< Serialized column family> BLOCK Index  <Key Name> Offset, <Key Name> Offset K 128   Offset K 256   Offset K 384   Offset Bloom Filter (Index in memory) Data file on disk
Compactions K1 < Serialized data > K2 < Serialized data > K3 < Serialized data > -- -- -- Sorted K2 < Serialized data > K10 < Serialized data > K30 < Serialized data > -- -- -- Sorted K4 < Serialized data > K5 < Serialized data > K10 < Serialized data > -- -- -- Sorted MERGE  SORT K1 < Serialized data > K2 < Serialized data > K3 < Serialized data > K4 < Serialized data > K5 < Serialized data > K10 < Serialized data > K30 < Serialized data > Sorted K1  Offset K5  Offset K30  Offset Bloom Filter Loaded in memory Index File Data File D E L E T E D
Write Properties ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Read Query Closest replica Cassandra Cluster Replica A Result Replica B Replica C Digest Query Digest Response Digest Response Result Client Read repair if digests differ
Partitioning N=3 h(key2) And Replication 0 1 1/2 F E D C B A h(key1)
Cluster Membership and Failure Detection ,[object Object],[object Object],[object Object],[object Object],[object Object]
Accrual Failure Detector ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Properties of the Failure Detector ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Implementation  ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Information Flow in the Implementation
Performance Benchmark ,[object Object],[object Object],Search Interactions Term Search Min 7.69 ms 7.78 ms Median 15.69 ms 18.27 ms Average 26.13 ms 44.41 ms
Lessons Learnt ,[object Object],[object Object],[object Object],[object Object]
Future work ,[object Object],[object Object],[object Object],[object Object]
Questions?

More Related Content

What's hot

Kafka internals
Kafka internalsKafka internals
Kafka internals
David Groozman
 
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang WangApache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Databricks
 
Using Delta Lake to Transform a Legacy Apache Spark to Support Complex Update...
Using Delta Lake to Transform a Legacy Apache Spark to Support Complex Update...Using Delta Lake to Transform a Legacy Apache Spark to Support Complex Update...
Using Delta Lake to Transform a Legacy Apache Spark to Support Complex Update...
Databricks
 
Capacity Planning Your Kafka Cluster | Jason Bell, Digitalis
Capacity Planning Your Kafka Cluster | Jason Bell, DigitalisCapacity Planning Your Kafka Cluster | Jason Bell, Digitalis
Capacity Planning Your Kafka Cluster | Jason Bell, Digitalis
HostedbyConfluent
 
Thoughts on kafka capacity planning
Thoughts on kafka capacity planningThoughts on kafka capacity planning
Thoughts on kafka capacity planning
JamieAlquiza
 
Dynamic Partition Pruning in Apache Spark
Dynamic Partition Pruning in Apache SparkDynamic Partition Pruning in Apache Spark
Dynamic Partition Pruning in Apache Spark
Databricks
 
Apache Cassandra at Macys
Apache Cassandra at MacysApache Cassandra at Macys
Apache Cassandra at Macys
DataStax Academy
 
A Deep Dive into Kafka Controller
A Deep Dive into Kafka ControllerA Deep Dive into Kafka Controller
A Deep Dive into Kafka Controller
confluent
 
Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise Control
Jiangjie Qin
 
Building Event Streaming Architectures on Scylla and Kafka
Building Event Streaming Architectures on Scylla and KafkaBuilding Event Streaming Architectures on Scylla and Kafka
Building Event Streaming Architectures on Scylla and Kafka
ScyllaDB
 
Handle Large Messages In Apache Kafka
Handle Large Messages In Apache KafkaHandle Large Messages In Apache Kafka
Handle Large Messages In Apache Kafka
Jiangjie Qin
 
Apache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper OptimizationApache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper Optimization
Databricks
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explained
confluent
 
Designing and Implementing a Real-time Data Lake with Dynamically Changing Sc...
Designing and Implementing a Real-time Data Lake with Dynamically Changing Sc...Designing and Implementing a Real-time Data Lake with Dynamically Changing Sc...
Designing and Implementing a Real-time Data Lake with Dynamically Changing Sc...
Databricks
 
Apache Flink and what it is used for
Apache Flink and what it is used forApache Flink and what it is used for
Apache Flink and what it is used for
Aljoscha Krettek
 
Introduction to Kafka
Introduction to KafkaIntroduction to Kafka
Introduction to Kafka
Akash Vacher
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...
Flink Forward
 
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 2
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 2Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 2
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 2
Tanel Poder
 
Simplifying Change Data Capture using Databricks Delta
Simplifying Change Data Capture using Databricks DeltaSimplifying Change Data Capture using Databricks Delta
Simplifying Change Data Capture using Databricks Delta
Databricks
 
Everything You Always Wanted to Know About Kafka’s Rebalance Protocol but Wer...
Everything You Always Wanted to Know About Kafka’s Rebalance Protocol but Wer...Everything You Always Wanted to Know About Kafka’s Rebalance Protocol but Wer...
Everything You Always Wanted to Know About Kafka’s Rebalance Protocol but Wer...
confluent
 

What's hot (20)

Kafka internals
Kafka internalsKafka internals
Kafka internals
 
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang WangApache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
 
Using Delta Lake to Transform a Legacy Apache Spark to Support Complex Update...
Using Delta Lake to Transform a Legacy Apache Spark to Support Complex Update...Using Delta Lake to Transform a Legacy Apache Spark to Support Complex Update...
Using Delta Lake to Transform a Legacy Apache Spark to Support Complex Update...
 
Capacity Planning Your Kafka Cluster | Jason Bell, Digitalis
Capacity Planning Your Kafka Cluster | Jason Bell, DigitalisCapacity Planning Your Kafka Cluster | Jason Bell, Digitalis
Capacity Planning Your Kafka Cluster | Jason Bell, Digitalis
 
Thoughts on kafka capacity planning
Thoughts on kafka capacity planningThoughts on kafka capacity planning
Thoughts on kafka capacity planning
 
Dynamic Partition Pruning in Apache Spark
Dynamic Partition Pruning in Apache SparkDynamic Partition Pruning in Apache Spark
Dynamic Partition Pruning in Apache Spark
 
Apache Cassandra at Macys
Apache Cassandra at MacysApache Cassandra at Macys
Apache Cassandra at Macys
 
A Deep Dive into Kafka Controller
A Deep Dive into Kafka ControllerA Deep Dive into Kafka Controller
A Deep Dive into Kafka Controller
 
Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise Control
 
Building Event Streaming Architectures on Scylla and Kafka
Building Event Streaming Architectures on Scylla and KafkaBuilding Event Streaming Architectures on Scylla and Kafka
Building Event Streaming Architectures on Scylla and Kafka
 
Handle Large Messages In Apache Kafka
Handle Large Messages In Apache KafkaHandle Large Messages In Apache Kafka
Handle Large Messages In Apache Kafka
 
Apache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper OptimizationApache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper Optimization
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explained
 
Designing and Implementing a Real-time Data Lake with Dynamically Changing Sc...
Designing and Implementing a Real-time Data Lake with Dynamically Changing Sc...Designing and Implementing a Real-time Data Lake with Dynamically Changing Sc...
Designing and Implementing a Real-time Data Lake with Dynamically Changing Sc...
 
Apache Flink and what it is used for
Apache Flink and what it is used forApache Flink and what it is used for
Apache Flink and what it is used for
 
Introduction to Kafka
Introduction to KafkaIntroduction to Kafka
Introduction to Kafka
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...
 
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 2
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 2Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 2
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 2
 
Simplifying Change Data Capture using Databricks Delta
Simplifying Change Data Capture using Databricks DeltaSimplifying Change Data Capture using Databricks Delta
Simplifying Change Data Capture using Databricks Delta
 
Everything You Always Wanted to Know About Kafka’s Rebalance Protocol but Wer...
Everything You Always Wanted to Know About Kafka’s Rebalance Protocol but Wer...Everything You Always Wanted to Know About Kafka’s Rebalance Protocol but Wer...
Everything You Always Wanted to Know About Kafka’s Rebalance Protocol but Wer...
 

Viewers also liked

3D_Reservoir_Characterization_Term_Project
3D_Reservoir_Characterization_Term_Project3D_Reservoir_Characterization_Term_Project
3D_Reservoir_Characterization_Term_ProjectTyler Howe
 
Cassandra - Research Paper Overview
Cassandra - Research Paper OverviewCassandra - Research Paper Overview
Cassandra - Research Paper Overview
sameiralk
 
Yahoo! JAPANにおけるApache Cassandraへの取り組み
Yahoo! JAPANにおけるApache Cassandraへの取り組みYahoo! JAPANにおけるApache Cassandraへの取り組み
Yahoo! JAPANにおけるApache Cassandraへの取り組み
Yahoo!デベロッパーネットワーク
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
DataStax
 
Apache cassandra an introduction
Apache cassandra  an introductionApache cassandra  an introduction
Apache cassandra an introduction
Shehaaz Saif
 
NoSQL Cassandra Talk for Seattle Tech Startups 3-10-10
NoSQL Cassandra Talk for Seattle Tech Startups 3-10-10NoSQL Cassandra Talk for Seattle Tech Startups 3-10-10
NoSQL Cassandra Talk for Seattle Tech Startups 3-10-10
egpeters
 
Cassandra Prophecy
Cassandra ProphecyCassandra Prophecy
Cassandra Prophecy
Igor Khotin
 
NoSQL Cassandra Talk for Seattle Tech Startups 3-10-10
NoSQL Cassandra Talk for Seattle Tech Startups 3-10-10NoSQL Cassandra Talk for Seattle Tech Startups 3-10-10
NoSQL Cassandra Talk for Seattle Tech Startups 3-10-10
egpeters
 
Cassandra - A Decentralized Structured Storage System
Cassandra - A Decentralized Structured Storage SystemCassandra - A Decentralized Structured Storage System
Cassandra - A Decentralized Structured Storage System
Varad Meru
 
Storm@Twitter, SIGMOD 2014
Storm@Twitter, SIGMOD 2014Storm@Twitter, SIGMOD 2014
Storm@Twitter, SIGMOD 2014
Karthik Ramasamy
 
Cassandra
CassandraCassandra
Cassandra
Upaang Saxena
 
Cassandra - A decentralized storage system
Cassandra - A decentralized storage systemCassandra - A decentralized storage system
Cassandra - A decentralized storage system
Arunit Gupta
 
Guide to Cassandra for Production Deployments
Guide to Cassandra for Production DeploymentsGuide to Cassandra for Production Deployments
Guide to Cassandra for Production Deployments
smdkk
 
Oconee county crime data 2010
Oconee county crime data 2010Oconee county crime data 2010
Oconee county crime data 2010Trey Downs
 
How to install Civicrm in Drupal 7
How to install Civicrm in Drupal 7How to install Civicrm in Drupal 7
How to install Civicrm in Drupal 7
Zabisco Digital
 

Viewers also liked (16)

3D_Reservoir_Characterization_Term_Project
3D_Reservoir_Characterization_Term_Project3D_Reservoir_Characterization_Term_Project
3D_Reservoir_Characterization_Term_Project
 
Cassandra - Research Paper Overview
Cassandra - Research Paper OverviewCassandra - Research Paper Overview
Cassandra - Research Paper Overview
 
Yahoo! JAPANにおけるApache Cassandraへの取り組み
Yahoo! JAPANにおけるApache Cassandraへの取り組みYahoo! JAPANにおけるApache Cassandraへの取り組み
Yahoo! JAPANにおけるApache Cassandraへの取り組み
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
 
Apache cassandra an introduction
Apache cassandra  an introductionApache cassandra  an introduction
Apache cassandra an introduction
 
NoSQL Cassandra Talk for Seattle Tech Startups 3-10-10
NoSQL Cassandra Talk for Seattle Tech Startups 3-10-10NoSQL Cassandra Talk for Seattle Tech Startups 3-10-10
NoSQL Cassandra Talk for Seattle Tech Startups 3-10-10
 
Cassandra Prophecy
Cassandra ProphecyCassandra Prophecy
Cassandra Prophecy
 
NoSQL Cassandra Talk for Seattle Tech Startups 3-10-10
NoSQL Cassandra Talk for Seattle Tech Startups 3-10-10NoSQL Cassandra Talk for Seattle Tech Startups 3-10-10
NoSQL Cassandra Talk for Seattle Tech Startups 3-10-10
 
Cassandra - A Decentralized Structured Storage System
Cassandra - A Decentralized Structured Storage SystemCassandra - A Decentralized Structured Storage System
Cassandra - A Decentralized Structured Storage System
 
Storm@Twitter, SIGMOD 2014
Storm@Twitter, SIGMOD 2014Storm@Twitter, SIGMOD 2014
Storm@Twitter, SIGMOD 2014
 
Cassandra
CassandraCassandra
Cassandra
 
Cassandra - A decentralized storage system
Cassandra - A decentralized storage systemCassandra - A decentralized storage system
Cassandra - A decentralized storage system
 
Guide to Cassandra for Production Deployments
Guide to Cassandra for Production DeploymentsGuide to Cassandra for Production Deployments
Guide to Cassandra for Production Deployments
 
Oconee county crime data 2010
Oconee county crime data 2010Oconee county crime data 2010
Oconee county crime data 2010
 
Book Of Prayers
Book Of PrayersBook Of Prayers
Book Of Prayers
 
How to install Civicrm in Drupal 7
How to install Civicrm in Drupal 7How to install Civicrm in Drupal 7
How to install Civicrm in Drupal 7
 

Similar to Data Presentations Cassandra Sigmod

Svccg nosql 2011_sri-cassandra
Svccg nosql 2011_sri-cassandraSvccg nosql 2011_sri-cassandra
Svccg nosql 2011_sri-cassandra
srisatish ambati
 
Confluent real time_acquisition_analysis_and_evaluation_of_data_streams_20190...
Confluent real time_acquisition_analysis_and_evaluation_of_data_streams_20190...Confluent real time_acquisition_analysis_and_evaluation_of_data_streams_20190...
Confluent real time_acquisition_analysis_and_evaluation_of_data_streams_20190...
confluent
 
Wireshark Basics
Wireshark BasicsWireshark Basics
Wireshark Basics
Yoram Orzach
 
The post release technologies of Crysis 3 (Slides Only) - Stewart Needham
The post release technologies of Crysis 3 (Slides Only) - Stewart NeedhamThe post release technologies of Crysis 3 (Slides Only) - Stewart Needham
The post release technologies of Crysis 3 (Slides Only) - Stewart Needham
Stewart Needham
 
Outsmarting the smart meter (Jfokus 2017)
Outsmarting the smart meter (Jfokus 2017)Outsmarting the smart meter (Jfokus 2017)
Outsmarting the smart meter (Jfokus 2017)
Maarten Mulders
 
Dpdk applications
Dpdk applicationsDpdk applications
Dpdk applications
Vipin Varghese
 
GC free coding in @Java presented @Geecon
GC free coding in @Java presented @GeeconGC free coding in @Java presented @Geecon
GC free coding in @Java presented @Geecon
Peter Lawrey
 
Automating the Hunt for Non-Obvious Sources of Latency Spreads
Automating the Hunt for Non-Obvious Sources of Latency SpreadsAutomating the Hunt for Non-Obvious Sources of Latency Spreads
Automating the Hunt for Non-Obvious Sources of Latency Spreads
ScyllaDB
 
Provenance for Data Munging Environments
Provenance for Data Munging EnvironmentsProvenance for Data Munging Environments
Provenance for Data Munging Environments
Paul Groth
 
[Globant summer take over] Empowering Big Data with Cassandra
[Globant summer take over] Empowering Big Data with Cassandra[Globant summer take over] Empowering Big Data with Cassandra
[Globant summer take over] Empowering Big Data with Cassandra
Globant
 
Stream Processing Overview
Stream Processing OverviewStream Processing Overview
Stream Processing Overview
Maycon Viana Bordin
 
Synapse 2018 Guarding against failure in a hundred step pipeline
Synapse 2018 Guarding against failure in a hundred step pipelineSynapse 2018 Guarding against failure in a hundred step pipeline
Synapse 2018 Guarding against failure in a hundred step pipeline
Calvin French-Owen
 
Influx data basic
Influx data basicInflux data basic
Influx data basic
Сергій Саварин
 
Understanding DPDK
Understanding DPDKUnderstanding DPDK
Understanding DPDK
Denys Haryachyy
 
Apache Cassandra, part 2 – data model example, machinery
Apache Cassandra, part 2 – data model example, machineryApache Cassandra, part 2 – data model example, machinery
Apache Cassandra, part 2 – data model example, machinery
Andrey Lomakin
 
Apache Spark 2.0: A Deep Dive Into Structured Streaming - by Tathagata Das
Apache Spark 2.0: A Deep Dive Into Structured Streaming - by Tathagata Das Apache Spark 2.0: A Deep Dive Into Structured Streaming - by Tathagata Das
Apache Spark 2.0: A Deep Dive Into Structured Streaming - by Tathagata Das
Databricks
 
JConf.dev 2022 - Apache Pulsar Development 101 with Java
JConf.dev 2022 - Apache Pulsar Development 101 with JavaJConf.dev 2022 - Apache Pulsar Development 101 with Java
JConf.dev 2022 - Apache Pulsar Development 101 with Java
Timothy Spann
 

Similar to Data Presentations Cassandra Sigmod (20)

Svccg nosql 2011_sri-cassandra
Svccg nosql 2011_sri-cassandraSvccg nosql 2011_sri-cassandra
Svccg nosql 2011_sri-cassandra
 
Confluent real time_acquisition_analysis_and_evaluation_of_data_streams_20190...
Confluent real time_acquisition_analysis_and_evaluation_of_data_streams_20190...Confluent real time_acquisition_analysis_and_evaluation_of_data_streams_20190...
Confluent real time_acquisition_analysis_and_evaluation_of_data_streams_20190...
 
Wireshark Basics
Wireshark BasicsWireshark Basics
Wireshark Basics
 
Odp
OdpOdp
Odp
 
The post release technologies of Crysis 3 (Slides Only) - Stewart Needham
The post release technologies of Crysis 3 (Slides Only) - Stewart NeedhamThe post release technologies of Crysis 3 (Slides Only) - Stewart Needham
The post release technologies of Crysis 3 (Slides Only) - Stewart Needham
 
Outsmarting the smart meter (Jfokus 2017)
Outsmarting the smart meter (Jfokus 2017)Outsmarting the smart meter (Jfokus 2017)
Outsmarting the smart meter (Jfokus 2017)
 
Dpdk applications
Dpdk applicationsDpdk applications
Dpdk applications
 
GC free coding in @Java presented @Geecon
GC free coding in @Java presented @GeeconGC free coding in @Java presented @Geecon
GC free coding in @Java presented @Geecon
 
Automating the Hunt for Non-Obvious Sources of Latency Spreads
Automating the Hunt for Non-Obvious Sources of Latency SpreadsAutomating the Hunt for Non-Obvious Sources of Latency Spreads
Automating the Hunt for Non-Obvious Sources of Latency Spreads
 
Provenance for Data Munging Environments
Provenance for Data Munging EnvironmentsProvenance for Data Munging Environments
Provenance for Data Munging Environments
 
No[1][1]
No[1][1]No[1][1]
No[1][1]
 
[Globant summer take over] Empowering Big Data with Cassandra
[Globant summer take over] Empowering Big Data with Cassandra[Globant summer take over] Empowering Big Data with Cassandra
[Globant summer take over] Empowering Big Data with Cassandra
 
Cassandra
CassandraCassandra
Cassandra
 
Stream Processing Overview
Stream Processing OverviewStream Processing Overview
Stream Processing Overview
 
Synapse 2018 Guarding against failure in a hundred step pipeline
Synapse 2018 Guarding against failure in a hundred step pipelineSynapse 2018 Guarding against failure in a hundred step pipeline
Synapse 2018 Guarding against failure in a hundred step pipeline
 
Influx data basic
Influx data basicInflux data basic
Influx data basic
 
Understanding DPDK
Understanding DPDKUnderstanding DPDK
Understanding DPDK
 
Apache Cassandra, part 2 – data model example, machinery
Apache Cassandra, part 2 – data model example, machineryApache Cassandra, part 2 – data model example, machinery
Apache Cassandra, part 2 – data model example, machinery
 
Apache Spark 2.0: A Deep Dive Into Structured Streaming - by Tathagata Das
Apache Spark 2.0: A Deep Dive Into Structured Streaming - by Tathagata Das Apache Spark 2.0: A Deep Dive Into Structured Streaming - by Tathagata Das
Apache Spark 2.0: A Deep Dive Into Structured Streaming - by Tathagata Das
 
JConf.dev 2022 - Apache Pulsar Development 101 with Java
JConf.dev 2022 - Apache Pulsar Development 101 with JavaJConf.dev 2022 - Apache Pulsar Development 101 with Java
JConf.dev 2022 - Apache Pulsar Development 101 with Java
 

More from Jeff Hammerbacher

20091027genentech
20091027genentech20091027genentech
20091027genentech
Jeff Hammerbacher
 
Mårten Mickos's presentation "Open Source: Why Freedom Makes a Better Busines...
Mårten Mickos's presentation "Open Source: Why Freedom Makes a Better Busines...Mårten Mickos's presentation "Open Source: Why Freedom Makes a Better Busines...
Mårten Mickos's presentation "Open Source: Why Freedom Makes a Better Busines...Jeff Hammerbacher
 
20081030linkedin
20081030linkedin20081030linkedin
20081030linkedin
Jeff Hammerbacher
 
20081022cca
20081022cca20081022cca
20081022cca
Jeff Hammerbacher
 

More from Jeff Hammerbacher (20)

20120223keystone
20120223keystone20120223keystone
20120223keystone
 
20100714accel
20100714accel20100714accel
20100714accel
 
20100608sigmod
20100608sigmod20100608sigmod
20100608sigmod
 
20100513brown
20100513brown20100513brown
20100513brown
 
20100423sage
20100423sage20100423sage
20100423sage
 
20100418sos
20100418sos20100418sos
20100418sos
 
20100301icde
20100301icde20100301icde
20100301icde
 
20100201hplabs
20100201hplabs20100201hplabs
20100201hplabs
 
20100128ebay
20100128ebay20100128ebay
20100128ebay
 
20091203gemini
20091203gemini20091203gemini
20091203gemini
 
20091203gemini
20091203gemini20091203gemini
20091203gemini
 
20091110startup2startup
20091110startup2startup20091110startup2startup
20091110startup2startup
 
20091030nasajpl
20091030nasajpl20091030nasajpl
20091030nasajpl
 
20091027genentech
20091027genentech20091027genentech
20091027genentech
 
Mårten Mickos's presentation "Open Source: Why Freedom Makes a Better Busines...
Mårten Mickos's presentation "Open Source: Why Freedom Makes a Better Busines...Mårten Mickos's presentation "Open Source: Why Freedom Makes a Better Busines...
Mårten Mickos's presentation "Open Source: Why Freedom Makes a Better Busines...
 
20090622 Velocity
20090622 Velocity20090622 Velocity
20090622 Velocity
 
20090422 Www
20090422 Www20090422 Www
20090422 Www
 
20090309berkeley
20090309berkeley20090309berkeley
20090309berkeley
 
20081030linkedin
20081030linkedin20081030linkedin
20081030linkedin
 
20081022cca
20081022cca20081022cca
20081022cca
 

Recently uploaded

Global Interconnection Group Joint Venture[960] (1).pdf
Global Interconnection Group Joint Venture[960] (1).pdfGlobal Interconnection Group Joint Venture[960] (1).pdf
Global Interconnection Group Joint Venture[960] (1).pdf
Henry Tapper
 
Accpac to QuickBooks Conversion Navigating the Transition with Online Account...
Accpac to QuickBooks Conversion Navigating the Transition with Online Account...Accpac to QuickBooks Conversion Navigating the Transition with Online Account...
Accpac to QuickBooks Conversion Navigating the Transition with Online Account...
PaulBryant58
 
The Parable of the Pipeline a book every new businessman or business student ...
The Parable of the Pipeline a book every new businessman or business student ...The Parable of the Pipeline a book every new businessman or business student ...
The Parable of the Pipeline a book every new businessman or business student ...
awaisafdar
 
Premium MEAN Stack Development Solutions for Modern Businesses
Premium MEAN Stack Development Solutions for Modern BusinessesPremium MEAN Stack Development Solutions for Modern Businesses
Premium MEAN Stack Development Solutions for Modern Businesses
SynapseIndia
 
Cree_Rey_BrandIdentityKit.PDF_PersonalBd
Cree_Rey_BrandIdentityKit.PDF_PersonalBdCree_Rey_BrandIdentityKit.PDF_PersonalBd
Cree_Rey_BrandIdentityKit.PDF_PersonalBd
creerey
 
Role of Remote Sensing and Monitoring in Mining
Role of Remote Sensing and Monitoring in MiningRole of Remote Sensing and Monitoring in Mining
Role of Remote Sensing and Monitoring in Mining
Naaraayani Minerals Pvt.Ltd
 
RMD24 | Retail media: hoe zet je dit in als je geen AH of Unilever bent? Heid...
RMD24 | Retail media: hoe zet je dit in als je geen AH of Unilever bent? Heid...RMD24 | Retail media: hoe zet je dit in als je geen AH of Unilever bent? Heid...
RMD24 | Retail media: hoe zet je dit in als je geen AH of Unilever bent? Heid...
BBPMedia1
 
The-McKinsey-7S-Framework. strategic management
The-McKinsey-7S-Framework. strategic managementThe-McKinsey-7S-Framework. strategic management
The-McKinsey-7S-Framework. strategic management
Bojamma2
 
Brand Analysis for an artist named Struan
Brand Analysis for an artist named StruanBrand Analysis for an artist named Struan
Brand Analysis for an artist named Struan
sarahvanessa51503
 
FINAL PRESENTATION.pptx12143241324134134
FINAL PRESENTATION.pptx12143241324134134FINAL PRESENTATION.pptx12143241324134134
FINAL PRESENTATION.pptx12143241324134134
LR1709MUSIC
 
Business Valuation Principles for Entrepreneurs
Business Valuation Principles for EntrepreneursBusiness Valuation Principles for Entrepreneurs
Business Valuation Principles for Entrepreneurs
Ben Wann
 
Search Disrupted Google’s Leaked Documents Rock the SEO World.pdf
Search Disrupted Google’s Leaked Documents Rock the SEO World.pdfSearch Disrupted Google’s Leaked Documents Rock the SEO World.pdf
Search Disrupted Google’s Leaked Documents Rock the SEO World.pdf
Arihant Webtech Pvt. Ltd
 
Affordable Stationery Printing Services in Jaipur | Navpack n Print
Affordable Stationery Printing Services in Jaipur | Navpack n PrintAffordable Stationery Printing Services in Jaipur | Navpack n Print
Affordable Stationery Printing Services in Jaipur | Navpack n Print
Navpack & Print
 
Buy Verified PayPal Account | Buy Google 5 Star Reviews
Buy Verified PayPal Account | Buy Google 5 Star ReviewsBuy Verified PayPal Account | Buy Google 5 Star Reviews
Buy Verified PayPal Account | Buy Google 5 Star Reviews
usawebmarket
 
Taurus Zodiac Sign_ Personality Traits and Sign Dates.pptx
Taurus Zodiac Sign_ Personality Traits and Sign Dates.pptxTaurus Zodiac Sign_ Personality Traits and Sign Dates.pptx
Taurus Zodiac Sign_ Personality Traits and Sign Dates.pptx
my Pandit
 
Set off and carry forward of losses and assessment of individuals.pptx
Set off and carry forward of losses and assessment of individuals.pptxSet off and carry forward of losses and assessment of individuals.pptx
Set off and carry forward of losses and assessment of individuals.pptx
HARSHITHV26
 
Exploring Patterns of Connection with Social Dreaming
Exploring Patterns of Connection with Social DreamingExploring Patterns of Connection with Social Dreaming
Exploring Patterns of Connection with Social Dreaming
Nicola Wreford-Howard
 
Skye Residences | Extended Stay Residences Near Toronto Airport
Skye Residences | Extended Stay Residences Near Toronto AirportSkye Residences | Extended Stay Residences Near Toronto Airport
Skye Residences | Extended Stay Residences Near Toronto Airport
marketingjdass
 
Lookback Analysis
Lookback AnalysisLookback Analysis
Lookback Analysis
Safe PaaS
 
Discover the innovative and creative projects that highlight my journey throu...
Discover the innovative and creative projects that highlight my journey throu...Discover the innovative and creative projects that highlight my journey throu...
Discover the innovative and creative projects that highlight my journey throu...
dylandmeas
 

Recently uploaded (20)

Global Interconnection Group Joint Venture[960] (1).pdf
Global Interconnection Group Joint Venture[960] (1).pdfGlobal Interconnection Group Joint Venture[960] (1).pdf
Global Interconnection Group Joint Venture[960] (1).pdf
 
Accpac to QuickBooks Conversion Navigating the Transition with Online Account...
Accpac to QuickBooks Conversion Navigating the Transition with Online Account...Accpac to QuickBooks Conversion Navigating the Transition with Online Account...
Accpac to QuickBooks Conversion Navigating the Transition with Online Account...
 
The Parable of the Pipeline a book every new businessman or business student ...
The Parable of the Pipeline a book every new businessman or business student ...The Parable of the Pipeline a book every new businessman or business student ...
The Parable of the Pipeline a book every new businessman or business student ...
 
Premium MEAN Stack Development Solutions for Modern Businesses
Premium MEAN Stack Development Solutions for Modern BusinessesPremium MEAN Stack Development Solutions for Modern Businesses
Premium MEAN Stack Development Solutions for Modern Businesses
 
Cree_Rey_BrandIdentityKit.PDF_PersonalBd
Cree_Rey_BrandIdentityKit.PDF_PersonalBdCree_Rey_BrandIdentityKit.PDF_PersonalBd
Cree_Rey_BrandIdentityKit.PDF_PersonalBd
 
Role of Remote Sensing and Monitoring in Mining
Role of Remote Sensing and Monitoring in MiningRole of Remote Sensing and Monitoring in Mining
Role of Remote Sensing and Monitoring in Mining
 
RMD24 | Retail media: hoe zet je dit in als je geen AH of Unilever bent? Heid...
RMD24 | Retail media: hoe zet je dit in als je geen AH of Unilever bent? Heid...RMD24 | Retail media: hoe zet je dit in als je geen AH of Unilever bent? Heid...
RMD24 | Retail media: hoe zet je dit in als je geen AH of Unilever bent? Heid...
 
The-McKinsey-7S-Framework. strategic management
The-McKinsey-7S-Framework. strategic managementThe-McKinsey-7S-Framework. strategic management
The-McKinsey-7S-Framework. strategic management
 
Brand Analysis for an artist named Struan
Brand Analysis for an artist named StruanBrand Analysis for an artist named Struan
Brand Analysis for an artist named Struan
 
FINAL PRESENTATION.pptx12143241324134134
FINAL PRESENTATION.pptx12143241324134134FINAL PRESENTATION.pptx12143241324134134
FINAL PRESENTATION.pptx12143241324134134
 
Business Valuation Principles for Entrepreneurs
Business Valuation Principles for EntrepreneursBusiness Valuation Principles for Entrepreneurs
Business Valuation Principles for Entrepreneurs
 
Search Disrupted Google’s Leaked Documents Rock the SEO World.pdf
Search Disrupted Google’s Leaked Documents Rock the SEO World.pdfSearch Disrupted Google’s Leaked Documents Rock the SEO World.pdf
Search Disrupted Google’s Leaked Documents Rock the SEO World.pdf
 
Affordable Stationery Printing Services in Jaipur | Navpack n Print
Affordable Stationery Printing Services in Jaipur | Navpack n PrintAffordable Stationery Printing Services in Jaipur | Navpack n Print
Affordable Stationery Printing Services in Jaipur | Navpack n Print
 
Buy Verified PayPal Account | Buy Google 5 Star Reviews
Buy Verified PayPal Account | Buy Google 5 Star ReviewsBuy Verified PayPal Account | Buy Google 5 Star Reviews
Buy Verified PayPal Account | Buy Google 5 Star Reviews
 
Taurus Zodiac Sign_ Personality Traits and Sign Dates.pptx
Taurus Zodiac Sign_ Personality Traits and Sign Dates.pptxTaurus Zodiac Sign_ Personality Traits and Sign Dates.pptx
Taurus Zodiac Sign_ Personality Traits and Sign Dates.pptx
 
Set off and carry forward of losses and assessment of individuals.pptx
Set off and carry forward of losses and assessment of individuals.pptxSet off and carry forward of losses and assessment of individuals.pptx
Set off and carry forward of losses and assessment of individuals.pptx
 
Exploring Patterns of Connection with Social Dreaming
Exploring Patterns of Connection with Social DreamingExploring Patterns of Connection with Social Dreaming
Exploring Patterns of Connection with Social Dreaming
 
Skye Residences | Extended Stay Residences Near Toronto Airport
Skye Residences | Extended Stay Residences Near Toronto AirportSkye Residences | Extended Stay Residences Near Toronto Airport
Skye Residences | Extended Stay Residences Near Toronto Airport
 
Lookback Analysis
Lookback AnalysisLookback Analysis
Lookback Analysis
 
Discover the innovative and creative projects that highlight my journey throu...
Discover the innovative and creative projects that highlight my journey throu...Discover the innovative and creative projects that highlight my journey throu...
Discover the innovative and creative projects that highlight my journey throu...
 

Data Presentations Cassandra Sigmod

  • 1. Cassandra Structured Storage System over a P2P Network Avinash Lakshman, Prashant Malik, Karthik Ranganathan
  • 2.
  • 3.
  • 4. Cassandra Architecture Messaging Layer Cluster Membership Failure Detector Storage Layer Partitioner Replicator Cassandra API Tools
  • 5. Data Model KEY ColumnFamily1 Name : MailList Type : Simple Sort : Name Name : tid1 Value : <Binary> TimeStamp : t1 Name : tid2 Value : <Binary> TimeStamp : t2 Name : tid3 Value : <Binary> TimeStamp : t3 Name : tid4 Value : <Binary> TimeStamp : t4 ColumnFamily2 Name : WordList Type : Super Sort : Time Name : aloha ColumnFamily3 Name : System Type : Super Sort : Name Name : hint1 <Column List> Name : hint2 <Column List> Name : hint3 <Column List> Name : hint4 <Column List> Name : dude C2 V2 T2 C6 V6 T6 Column Families are declared upfront Columns are added and modified dynamically SuperColumns are added and modified dynamically Columns are added and modified dynamically C1 V1 T1 C2 V2 T2 C3 V3 T3 C4 V4 T4
  • 6.
  • 7.
  • 8. Compactions K1 < Serialized data > K2 < Serialized data > K3 < Serialized data > -- -- -- Sorted K2 < Serialized data > K10 < Serialized data > K30 < Serialized data > -- -- -- Sorted K4 < Serialized data > K5 < Serialized data > K10 < Serialized data > -- -- -- Sorted MERGE SORT K1 < Serialized data > K2 < Serialized data > K3 < Serialized data > K4 < Serialized data > K5 < Serialized data > K10 < Serialized data > K30 < Serialized data > Sorted K1 Offset K5 Offset K30 Offset Bloom Filter Loaded in memory Index File Data File D E L E T E D
  • 9.
  • 10. Read Query Closest replica Cassandra Cluster Replica A Result Replica B Replica C Digest Query Digest Response Digest Response Result Client Read repair if digests differ
  • 11. Partitioning N=3 h(key2) And Replication 0 1 1/2 F E D C B A h(key1)
  • 12.
  • 13.
  • 14.
  • 15.
  • 16. Information Flow in the Implementation
  • 17.
  • 18.
  • 19.