Master Thesis, 21 July 2014, University of Crete
A Distributed Key-Value Store based on Replicated LSM-Trees
Panagiotis Garefalakis
Computer Science Department – University of Crete
Motivation
• This is the age of big data
• Distributed key-value stores are key to analyzing it
Motivation
• Companies such as Amazon and Google and open-source communities such as Apache have proposed several key-value stores
– Availability and fault tolerance through data replication
LSM-Trees
Data partitioning over LSM-Trees
Replication
[Figure: primary-backup replication within a Replication Group (RG): one leader (L) and two followers (F) running ZAB (Zookeeper).]
Replicated LSM-Trees
[Figure: ACaZoo combines two layers. Replication layer: a Replication Group (RG) with one leader (L) and two followers (F) performing primary-backup replication over ZAB (Zookeeper). Storage layer: the Apache Cassandra LSM-Tree on each node, where a write (key, value) goes to the commit log (WAL, synced in batch or periodic mode) on disk and to the memtable in memory; the memtable is flushed to SSTables 1..N on disk, which are merged by compaction.]
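The write path named in the diagram above (commit log, memtable, flush to SSTables, compaction) can be sketched as a toy, single-node, in-memory model. This is an illustration only, not Cassandra's or ACaZoo's implementation; the class name TinyLSM, the memtable_limit threshold, and the choice to clear the log on flush are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class TinyLSM:
    """Toy LSM-Tree: commit log + memtable + SSTables (all hypothetical names)."""
    memtable_limit: int = 2                          # assumed flush threshold
    commit_log: list = field(default_factory=list)   # stands in for the on-disk WAL
    memtable: dict = field(default_factory=dict)     # in-memory write buffer
    sstables: list = field(default_factory=list)     # sorted runs, oldest first

    def write(self, key, value):
        self.commit_log.append((key, value))   # 1. append to the log for durability
        self.memtable[key] = value              # 2. then update the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Persist the memtable as one sorted, immutable SSTable.
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}
        self.commit_log = []                    # flushed entries no longer need the log

    def compact(self):
        # Merge every SSTable into one; values from newer tables win.
        merged = {}
        for table in self.sstables:
            merged.update(table)
        self.sstables = [sorted(merged.items())]

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):   # search newest SSTable first
            for k, v in table:
                if k == key:
                    return v
        return None
```

Compaction here rewrites all SSTables in one pass, which hints at why a heavy compaction competes with foreground writes for I/O on the node that runs it.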
Thesis Contributions
• A high performance data replication primitive:
– Combines the ZAB protocol with an implementation of LSM-Trees
– Key point: Replication of LSM-Tree WAL
• A novel technique that reduces the impact of LSM-Tree
compactions on write performance
– Changing the leader prior to heavy compactions results in up to 60% higher throughput
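The key point above, replicating the LSM-Tree WAL across a replication group, can be illustrated with a heavily simplified majority-acknowledgement sketch. It is not the ZAB protocol (no epochs, ordering guarantees, or recovery), and the names Replica, ReplicationGroup, and the up flag are hypothetical.

```python
class Replica:
    """One member of a replication group (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.wal = []        # replicated write-ahead log
        self.memtable = {}   # state built by applying committed WAL records
        self.up = True       # assumed liveness flag for the example

    def append(self, record):
        # Acknowledge the record only if this replica is reachable.
        if not self.up:
            return False
        self.wal.append(record)
        return True

    def apply(self, record):
        key, value = record
        self.memtable[key] = value


class ReplicationGroup:
    def __init__(self, leader, followers):
        self.leader = leader
        self.followers = followers

    def write(self, key, value):
        record = (key, value)
        members = [self.leader] + self.followers
        acks = sum(1 for m in members if m.append(record))
        if acks <= len(members) // 2:
            # No majority: the write is not committed. A real protocol would
            # also clean up or recover the partially appended record.
            return False
        for m in members:
            if m.up:
                m.apply(record)   # committed: apply to each live replica
        return True
```

With three members, a write commits as long as two of them acknowledge the WAL record, so one failed follower does not block writes.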
Data model
[Figure: a table with row keys (row-1, row-2, row-5, row-6, row-7, row-10, row-18) and two column families, Column Family 1 and Column Family 2. Column qualifiers carry the column family as a prefix (cf1:col-A, cf1:col-B, cf2:col-Foo, cf2:col2-XYZ, cf2:foobar), and cell values are versioned (for example A18-v1, B18-v3, Foo18-v1, XYZ18-v2, foobar18-v1, Peter-v2, Bob-v1, Mary-v1).]
• Coordinates for a cell: row key, column family name, column qualifier, version
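The cell coordinates listed on this slide (row key, column family name, column qualifier, version) map naturally onto a nested lookup. The sketch below is only an illustration of that addressing scheme; the Table class, its method names, and the assignment of sample values to row-18 are assumptions, not the system's API.

```python
from collections import defaultdict


class Table:
    """Toy versioned cell store addressed by the four cell coordinates."""

    def __init__(self):
        # (row_key, column_family, qualifier) -> {version: value}
        self.cells = defaultdict(dict)

    def put(self, row, cf, qualifier, version, value):
        self.cells[(row, cf, qualifier)][version] = value

    def get(self, row, cf, qualifier, version=None):
        versions = self.cells.get((row, cf, qualifier))
        if not versions:
            return None
        if version is None:
            version = max(versions)   # default to the latest version
        return versions.get(version)


# Illustrative usage with values taken from the slide's figure.
t = Table()
t.put("row-18", "cf1", "col-A", 1, "A18")
t.put("row-18", "cf1", "col-B", 3, "B18")
t.put("row-18", "cf2", "col-Foo", 1, "Foo18")
```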
Consistent Hashing
[Figure: the data model table from the previous slide, with an md5 hash applied for consistent hashing.]
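A minimal sketch of consistent hashing with md5, assuming that row keys are hashed onto a ring and each replication group owns the range ending at its own token. The one-token-per-group layout and the names md5_token, Ring, and rg-1..rg-3 are assumptions for the example, not details taken from the slides.

```python
import bisect
import hashlib


def md5_token(key: str) -> int:
    """Map a string to a position on the ring using md5."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, groups):
        # One token per group, sorted by ring position.
        self.tokens = sorted((md5_token(g), g) for g in groups)

    def owner(self, row_key: str) -> str:
        token = md5_token(row_key)
        # First group whose token is >= the key's token, wrapping around.
        idx = bisect.bisect_left(self.tokens, (token, ""))
        return self.tokens[idx % len(self.tokens)][1]


ring = Ring(["rg-1", "rg-2", "rg-3"])
print(ring.owner("row-18"))
```

The useful property is that adding a group only takes keys away from its successor on the ring; keys owned by the other groups stay where they are.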
System Architecture
System Architecture Replication
RG leader switch policies
[Figure: an ACaZoo Replication Group (RG) with one leader (L) and two followers (F) running ZAB. Each node has its own SSTables (1..N, 1..N’, 1..N’’) and compaction activity, marked High or Low.]
• #1: When to switch
• #2: Whom to elect
– Weighted Votes
– Round Robin and Random policies
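The two questions on this slide, when to switch and whom to elect, can be sketched as small policy functions. This is a hypothetical illustration: the "High"/"Low" load labels come from the figure, but the trigger rule and all function names are assumptions, and the weighted-votes policy is not sketched because the slides give no details about it.

```python
import random


def should_switch(leader_load: str) -> bool:
    """#1 When to switch: assumed rule, step down when the leader's load is High."""
    return leader_load == "High"


def next_round_robin(members, leader):
    """#2 Whom to elect (round robin): the member after the current leader."""
    i = members.index(leader)
    return members[(i + 1) % len(members)]


def next_random(members, leader, rng=random):
    """#2 Whom to elect (random): any member other than the current leader."""
    candidates = [m for m in members if m != leader]
    return rng.choice(candidates)


members = ["n1", "n2", "n3"]
leader = "n1"
if should_switch("High"):
    leader = next_round_robin(members, leader)
print(leader)  # n2
```

Neither policy sketched here looks at the followers' own compaction load.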
Evaluation
• OpenStack private Cloud
• VMs with 2 CPUs, 2 GB RAM, and a 20 GB remotely mounted disk
• Software:
– Apache Cassandra version 2.0.1
– Apache Zookeeper version 3.4.5
– Oracle NoSQL version 2.1.54
• Benchmarks:
– YCSB version 0.1.4: 1 KB accesses (10 columns of 100-byte cells), three different operation mixes (100/0, 50/50, 0/100 reads/writes), number of concurrent threads
– Postal version 0.72: configurable message size, number of concurrent threads
Systems compared
• ACaZoo with/without RG leader changes
– Batch and Periodic
• Cassandra Quorum (2 out of 3 replicas)
– Batch and Periodic
• Cassandra Serial (extension of Paxos algorithm)
– Batch and Periodic
• Oracle NoSQL
– Absolute consistency
Impact of compaction
• YCSB 100% write workload, 64 threads
[Figure: two panels, "ACaZoo without RG changes" and "ACaZoo with RG changes", each plotting write throughput (ops/100ms) and its smoothed average against time (sec), with memtable flush, compaction, and leader election events marked.]
A deeper look into background activity
Activity                 Count (#)  Longest (sec)  Average (sec)  Total (sec)
Compaction (RA)                 11          78.44          17.96       197.64
Memtable flush (RA)             53              -              -            -
Garbage Collection (RA)        197           0.91          0.148        29.33
Compaction (RR)                 12          72.65          15.94       191.39
Memtable flush (RR)             52              -              -            -
Garbage Collection (RR)        192           0.85          0.147        27.84
• YCSB 20 min 100% write workload, 256 threads
• RA: random RG leader change policy
• RR: round-robin RG leader change policy
Time correlation of compactions across replicas
[Figure: annotated values 23%, 13%, and 12%.]
Evaluation – 3 Node RG
[Figure: annotated values 25% and 40%.]
Evaluation – 5 Node RG
[Figure: annotated value 60%.]
Application Performance: CassMail
[Figure: diagram with three components labeled ACaZoo.]
CassMail on a 3-node RG
[Figure: two panels, 50KB-500KB attachment and 200KB-2MB attachment; annotated values 30% and 31%.]
CassMail on a 5-node RG
[Figure: two panels, 50KB-500KB attachment and 200KB-2MB attachment; annotated values 35% and 42%.]
Thesis Contributions
• A high performance data replication primitive:
– Combines the ZAB protocol with an implementation of LSM-Trees
– Key point: Replication of LSM-Tree WAL
• A novel technique that reduces the impact of LSM-Tree
compactions on write performance
– Changing the leader prior to heavy compactions results in up to 60% higher throughput
Future Work
• Elasticity: stream a number of key ranges to a newly
joining RG.
• Further investigate the load balancing methodology
for Zookeeper watch notifications.
Thesis Publications
1. Panagiotis Garefalakis, Panagiotis Papadopoulos, and Kostas
Magoutis, “ACaZoo: A distributed key-value store based on
replicated LSM-trees.” in 33rd IEEE International Symposium
on Reliable Distributed Systems (SRDS), IEEE 2014.
2. Panagiotis Garefalakis, Panagiotis Papadopoulos, Ioannis
Manousakis, and Kostas Magoutis, “Strengthening consistency
in the Cassandra distributed key-value store.” in Distributed
Applications and Interoperable Systems (DAIS), Springer 2013.
Other Publications
1. Baryannis G., Garefalakis P., Kritikos K., Magoutis K.,
Papaioannou A., Plexousakis D., & Zeginis C.
“Lifecycle management of service-based applications on multi-
clouds: a research roadmap.” In Proceedings of the 2013
international workshop on Multi-cloud applications and federated
clouds. ACM, 2013.
2. Zeginis C., Kritikos K., Garefalakis P., Konsolaki K., Magoutis K.,
& Plexousakis D.
“Towards cross-layer monitoring of multi-cloud service-based
applications.” In Service-Oriented and Cloud Computing. Springer,
2013.
3. Garefalakis P., & Magoutis K.
“Improving Datacenter Operations Management using Wireless
Sensor Networks.” In Green Computing and Communications
(GreenCom), 2012 IEEE International Conference on. IEEE, 2012.
Email: pgaref@ics.forth.gr
RG Leader Failover
[Figure: two panels, ACaZoo and Oracle NoSQL, each plotting throughput (ops/100ms) against time (sec).]
• YCSB read-only workload, 64 threads
• 1.19 sec for the client to notice:
– 220 ms for the RG to elect a new leader
– 970 ms to propagate to the client through the CM
• 2 sec to establish a connection
Backup: Cassandra’s Architecture
2/3 Responses: {X,Y}
Need for reconciliation!
Backup-Paxos1
Backup-Paxos2
Benefit of client coordinated I/O
• Yahoo Cloud Serving Benchmark (YCSB)
– 4 threads reading 1 GB of data
Configuration           Throughput (ops/sec)  Read latency (average, ms)  Read latency (99th percentile, ms)
Original Cassandra                       317                         3.1                                   4
Client Coordinated I/O                   412                         2.3                                   3
CM load balancer
[Figure: average latency (ms) against number of threads (1 to 10000) for three configurations: 1 node, 3 nodes, and 3 nodes balanced.]

Master presentation-21-7-2014

  • 1. Master Thesis, 21 July 2014, University of Crete A Distributed Key-Value Store based on Replicated LSM-Trees Panagiotis Garefalakis Computer Science Department – University of Crete
  • 2. Motivation • This is the age of big data • Distributed key-value stores are key to analyzing it
  • 3. Motivation • Companies such as Amazon and Google, and open-source communities such as Apache, have proposed several key-value stores – Availability and fault tolerance through data replication
  • 4. LSM-Trees
  • 5. Data partitioning over LSM-Trees
  • 6. Replication [Diagram: a Replication Group (RG) with one leader (L) and two followers (F) doing primary-backup replication over ZAB, labeled Zookeeper.]
  • 7. Replicated LSM-Trees [Diagram: a Replication Group (RG) with one leader (L) and two followers (F) doing primary-backup replication over ZAB. Each node runs an LSM-Tree: a write goes to the in-memory memtable and to the on-disk commit log (WAL, in batch or periodic mode); memtables are flushed to SSTables 1..N on disk, which compaction merges.]
  • 8. Replicated LSM-Trees [Same diagram with labels added: Zookeeper on the replication side, Apache Cassandra on the LSM-Tree side, and ACaZoo for the combined system.]
  • 9. Thesis Contributions • A high-performance data replication primitive: – Combines the ZAB protocol with an implementation of LSM-Trees – Key point: replication of the LSM-Tree WAL • A novel technique that reduces the impact of LSM-Tree compactions on write performance – Changing the leader prior to heavy compactions yields up to 60% higher throughput
  • 10. Data model [Diagram: rows (row-1, row-2, row-5, row-6, row-7, row-10, row-18) holding cells in Column Family 1 (cf1:col-A, cf1:col-B) and Column Family 2 (cf2:col-Foo, cf2:col2-XYZ, cf2:foobar). Each value carries a version, e.g. A18-v1, B18-v3, Foo18-v1, XYZ18-v2, foobar18-v1; other example values: Peter-v2, Bob-v1, Mary-v1. Coordinates for a cell: row key, column family name, column qualifier, version. Label: CF Prefix.]
  • 11. Consistent Hashing [Same data-model diagram, with an md5 label added.]
  • 12. System Architecture
  • 13. System Architecture: Replication
  • 14. RG leader switch policies [Diagram: an ACaZoo Replication Group (RG) with one leader (L) and two followers (F) over ZAB; each node has its own SSTables (1..N, 1..N', 1..N'') under compaction and a High/Low gauge.] #1: When to switch
  • 15. RG leader switch policies [Same diagram, with a Weighted Votes label added.] #1: When to switch #2: Whom to elect: Round Robin and Random policies
  • 16. Evaluation • OpenStack private cloud • VMs with 2 CPUs, 2 GB RAM, and a 20 GB remotely mounted disk • Software: – Apache Cassandra version 2.0.1 – Apache Zookeeper version 3.4.5 – Oracle NoSQL version 2.1.54 • Benchmarks: – YCSB version 0.1.4: 1 KB accesses (10 columns of 100-byte cells); three different operation mixes (100/0, 50/50, 0/100 reads/writes); number of concurrent threads – Postal version 0.72: configurable message size; number of concurrent threads
  • 17. Systems compared • ACaZoo with/without RG leader changes – Batch and Periodic • Cassandra Quorum (2 out of 3 replicas) – Batch and Periodic • Cassandra Serial (extension of the Paxos algorithm) – Batch and Periodic • Oracle NoSQL – Absolute consistency
  • 18. Impact of compaction • YCSB 100% write workload, 64 threads [Two plots, "ACaZoo without RG changes" and "ACaZoo with RG changes": write throughput (ops/100 ms) versus time (sec), shown as smoothed average throughput; marked events: memtable flush, compaction, leader election.]
  • 19. A deeper look into background activity
                                  Count (#)  Longest (sec)  Average (sec)  Total (sec)
      Compaction (RA)                    11          78.44          17.96       197.64
      Memtable flush (RA)                53              -              -            -
      Garbage Collection (RA)           197           0.91          0.148        29.33
      Compaction (RR)                    12          72.65          15.94       191.39
      Memtable flush (RR)                52              -              -            -
      Garbage Collection (RR)           192           0.85          0.147        27.84
    • YCSB 20 min, 100% write workload, 256 threads • RA: RG change random policy • RR: RG round robin policy
  • 20. Time correlation of compactions across replicas [Chart; labeled values: 23%, 13%, 12%.]
  • 21. Evaluation – 3 Node RG [Charts; labeled values: 25%, 40%.]
  • 22. Evaluation – 5 Node RG [Chart; labeled value: 60%.]
  • 23. Application Performance: CassMail [Diagram with three ACaZoo labels.]
  • 24. CassMail on a 3-node RG [Two charts: 50KB-500KB attachment and 200KB-2MB attachment; labeled values: 30%, 31%.]
  • 25. CassMail on a 5-node RG [Two charts: 50KB-500KB attachment and 200KB-2MB attachment; labeled values: 35%, 42%.]
  • 26. Thesis Contributions • A high-performance data replication primitive: – Combines the ZAB protocol with an implementation of LSM-Trees – Key point: replication of the LSM-Tree WAL • A novel technique that reduces the impact of LSM-Tree compactions on write performance – Changing the leader prior to heavy compactions yields up to 60% higher throughput
  • 27. Future Work • Elasticity: stream a number of key ranges to a newly joining RG. • Further investigate the load balancing methodology for Zookeeper watch notifications.
  • 28. Thesis Publications 1. Panagiotis Garefalakis, Panagiotis Papadopoulos, and Kostas Magoutis, “ACaZoo: A distributed key-value store based on replicated LSM-trees.” In 33rd IEEE International Symposium on Reliable Distributed Systems (SRDS), IEEE, 2014. 2. Panagiotis Garefalakis, Panagiotis Papadopoulos, Ioannis Manousakis, and Kostas Magoutis, “Strengthening consistency in the Cassandra distributed key-value store.” In Distributed Applications and Interoperable Systems (DAIS), Springer, 2013.
  • 29. Other Publications 1. Baryannis G., Garefalakis P., Kritikos K., Magoutis K., Papaioannou A., Plexousakis D., & Zeginis C. “Lifecycle management of service-based applications on multi-clouds: a research roadmap.” In Proceedings of the 2013 international workshop on Multi-cloud applications and federated clouds. ACM, 2013. 2. Zeginis C., Kritikos K., Garefalakis P., Konsolaki K., Magoutis K., & Plexousakis D. “Towards cross-layer monitoring of multi-cloud service-based applications.” In Service-Oriented and Cloud Computing. Springer, 2013. 3. Garefalakis Panagiotis, and Kostas Magoutis. “Improving Datacenter Operations Management using Wireless Sensor Networks.” Green Computing and Communications (GreenCom), 2012 IEEE International Conference on. IEEE, 2012.
  • 30. Email: pgaref@ics.forth.gr
  • 31. RG Leader Failover [Two plots, ACaZoo and Oracle NoSQL: throughput (ops/100 ms) versus time (sec).] • YCSB read-only, 64 threads • 1.19 sec for the client to notice: – 220 ms for the RG to elect a new leader – 970 ms to propagate to the client through the CM • 2 sec to establish connection
  • 32. Backup: Cassandra’s Architecture
  • 33. Cassandra’s Architecture
  • 34. Cassandra’s Architecture
  • 35. Cassandra’s Architecture • 2/3 responses: {X,Y} • Need for reconciliation!
  • 36. Backup: Paxos 1
  • 38. Benefit of client coordinated I/O • Yahoo Cloud Serving Benchmark (YCSB): 4 threads, reading 1 GB of data
                                Throughput (ops/sec)  Read latency (average, ms)  Read latency (99th percentile, ms)
      Original Cassandra                         317                         3.1                                   4
      Client Coordinated I/O                     412                         2.3                                   3
  • 39. CM load balancer [Plot: average latency (ms) versus number of threads; series: 1 node, 3 nodes, 3 nodes balanced.]
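
Slide 11 pairs the data model with consistent hashing and md5 but gives no further detail. The following is a minimal sketch only, assuming that row keys are hashed with MD5 onto a ring and that each replication group owns the arc ending at its own token. The `Ring` class and the group names `rg-1` to `rg-4` are hypothetical and are not taken from ACaZoo or Cassandra code.

```python
import bisect
import hashlib


def md5_token(key: str) -> int:
    """Map a key to a ring position: the MD5 digest read as a 128-bit integer."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


class Ring:
    """Hypothetical consistent-hash ring with one token per replication group."""

    def __init__(self, groups):
        # Sort groups by their token so that lookups can use binary search.
        self._entries = sorted((md5_token(g), g) for g in groups)
        self._tokens = [token for token, _ in self._entries]

    def owner(self, row_key: str) -> str:
        # The owner is the first group whose token is >= the key's token;
        # past the last token the lookup wraps around to the first group.
        i = bisect.bisect_left(self._tokens, md5_token(row_key))
        return self._entries[i % len(self._entries)][1]


ring = Ring(["rg-1", "rg-2", "rg-3"])
print(ring.owner("row-18"))
```

Under these assumptions, adding a group to the ring only moves keys onto the new group; every other key keeps its owner. That is the property that makes streaming a few key ranges to a newly joining RG (slide 27) sufficient.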

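Slides 14 and 15 split the RG leader switch into two decisions: when to switch (#1) and whom to elect (#2, Round Robin and Random policies). The sketch below illustrates both under stated assumptions. The slides only show a High/Low gauge, so the numeric threshold on pending compaction work is an assumption, and all function and parameter names are hypothetical. The weighted-votes mechanism from slide 15 is not modeled.

```python
import random


def should_switch(pending_compaction_bytes: int, threshold_bytes: int) -> bool:
    """Decision #1 (assumed form): switch when the leader's compaction load is 'High'."""
    return pending_compaction_bytes >= threshold_bytes


def next_leader_round_robin(members, leader):
    """Decision #2, round robin: pick the member that follows the current leader."""
    i = members.index(leader)
    return members[(i + 1) % len(members)]


def next_leader_random(members, leader, rng=random):
    """Decision #2, random: pick any member other than the current leader."""
    candidates = [m for m in members if m != leader]
    return rng.choice(candidates)


members = ["node-a", "node-b", "node-c"]
if should_switch(pending_compaction_bytes=900, threshold_bytes=500):
    print(next_leader_round_robin(members, "node-c"))  # wraps around to node-a
```

Passing a seeded `random.Random` instance as `rng` makes the random policy reproducible in experiments.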
Editor's Notes

  1. Motivating this work
  2. In recent years the volume of data has increased dramatically.
  3. Image of key-value stores. Several companies: eBay supports a number of critical applications that need both real-time and analytics capabilities with the features of Cassandra. Netflix increased the availability of member information and the quality of data for its global streaming video service thanks to Cassandra. Adobe relies on Cassandra to provide a highly scalable, low-latency database to support its distributed cache architecture.
  4. I showed you what the implementation of a single LSM-tree looks like; but when I have many nodes, with an LSM implementation on each node..
  5. ----- Meeting Notes (7/18/14 18:41) ----- Compaction is a problem
  6. Cassandra no longer handles replication.
  7. ----- Meeting Notes (7/18/14 18:58) ----- If we focus on the leader, all the
  8. ----- Meeting Notes (7/18/14 18:58) ----- 3 different policies: RR, RR, and inversely proportional to the Compacti
  9. Majority: 23, 13, 12. Minority: 21, 32, 44. ----- Meeting Notes (7/18/14 18:58) ----- Since replication takes place, one would expect synchronization, but that is not the case.
  10. RW: 25 -20 b-p W: 40- 33 b-p
  11. ----- Meeting Notes (7/18/14 18:41) ----- Compaction is a problem
  12. Focus on alternatives that exploit replication mechanisms.
  13. This concludes my talk and I would be happy to take any questions
  14. (a) 1.19 sec between the time the leader crashes until the client notices; (b) 2 sec until the client establishes a connection with the new leader and restores service. Interval (a) further breaks down into: (1) 220 ms for the RG to reconfigure (elect a new leader); (2) 970 ms to propagate the new-leader information (e.g., its IP address) to the client through the CM.
  15-18. Cassandra works well with applications that share its relaxed semantics (such as customer carts in online stores). Cassandra is not a good fit for more traditional applications requiring strong consistency. All nodes in Cassandra are peers. There are no ordering guarantees, the synchronization mechanism is ad hoc, and membership state reaches clients via gossip. If a replica misses a write, the row will be made consistent later via one of Cassandra’s built-in repair mechanisms: hinted handoff, read repair, or anti-entropy node repair. The result is eventual consistency.
  19-20. 26% response time, 30% throughput
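
The Cassandra notes above mention read repair, and slide 35 shows a quorum read in which 2 of 3 replicas answer with different values ({X, Y}) and reconciliation is needed. The sketch below illustrates that step with timestamp-based, last-write-wins reconciliation. The `Versioned` type, the function names, and the replica names are hypothetical and are not Cassandra's API.

```python
from collections import namedtuple

# A replica's answer: the stored value plus the write timestamp attached to it.
Versioned = namedtuple("Versioned", ["value", "timestamp"])


def reconcile(responses, quorum):
    """Return the newest response, provided at least `quorum` replicas answered."""
    if len(responses) < quorum:
        raise RuntimeError("not enough replica responses for a quorum read")
    # Last write wins: the response with the highest timestamp is returned.
    return max(responses.values(), key=lambda r: r.timestamp)


def stale_replicas(responses, winner):
    """Replicas that answered with an older version and would need repair."""
    return sorted(name for name, r in responses.items()
                  if r.timestamp < winner.timestamp)


responses = {
    "replica-1": Versioned("X", timestamp=1),
    "replica-2": Versioned("Y", timestamp=2),
}
winner = reconcile(responses, quorum=2)
print(winner.value, stale_replicas(responses, winner))
```

In this sketch the client receives `Y`, and `replica-1` is flagged as stale. A leader-based replication group avoids this step for reads served by the leader, which is the contrast the talk draws.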