Scalar DL: Scalable and Practical Byzantine Fault Detection
for Transactional Database Systems
Hiroyuki Yamada, Jun Nemoto
Scalar, Inc.
Towards a reliable database system
● We live in a data-driven / data-centric world.
○ Data needs to be reliable and trustworthy.
○ Database systems need to be reliable and trustworthy.
● Dealing with Byzantine faults in a database system is one of the key factors.
○ Byzantine faults: software errors, data tampering, (internal) malicious attacks.
Our Goal: A database system that deals with Byzantine faults in a practical and
scalable way.
Dealing with Byzantine faults
● Basic principle: find discrepancies between replicas.
● Byzantine fault tolerance (BFT).
○ N > 3f, N: # of replicas, f: # of faulty replicas.
○ SMR: PBFT [OSDI’99], BFT-SMaRt [DSN’14], HotStuff [PODC’19] …
○ Database: HRDB [SOSP’07], Byzantium [EuroSys’11], Hyperledger Fabric [EuroSys’18], Basil [SOSP’21]
● Byzantine fault detection (BFD).
○ N > f, N: # of replicas, f: # of faulty replicas.
○ SMR: PeerReview [SOSP’07]
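The two replica bounds above can be turned into minimum replica counts with a little arithmetic. This is a minimal sketch; the function names are illustrative and do not come from any of the cited systems.

```python
def min_replicas_bft(f: int) -> int:
    """Smallest N that satisfies N > 3f (Byzantine fault tolerance)."""
    return 3 * f + 1


def min_replicas_bfd(f: int) -> int:
    """Smallest N that satisfies N > f (Byzantine fault detection)."""
    return f + 1


for f in (1, 2, 3):
    print(f"f={f}: BFT needs {min_replicas_bft(f)} replicas, "
          f"BFD needs {min_replicas_bfd(f)}")
```

For a single fault (f = 1), tolerance needs 4 replicas while detection needs only 2.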
Are existing solutions practical and scalable enough for a database system?
BFT is ideal, but may not be practical for database systems
● At least 4 administrative domains (ADs) are required for correctness.
○ Malicious attacks are likely to be dependent in an AD.
● BFT might not fit well with enterprise database systems.
○ Many enterprise database systems are managed by a single AD or a few ADs.
An AD is a collection of nodes and networks operated by a single organization or administrative authority.
At least 4 ADs are required to mask 1 fault.
BFD is a promising approach for database systems
● Require only 2 ADs for correctness.
○ 2 is the lower bound for the number of replicas in dealing with Byzantine faults.
● Many use cases require only BFD or tamper evidence.
○ Regulations on data protection and privacy (e.g., GDPR and CCPA), prior user rights for IP, and vehicle regulations on OTA software updates in WP.29.
● Existing solutions are not designed for transactional database systems.
○ They cannot run transactions in parallel (i.e., they are not scalable).
1 faulty AD can be detected as long as there are 2 ADs (AD-1 and AD-2).
Challenge:
Scalable BFD for a database system deployed to a 2-AD environment
● SMR (run transactions sequentially)
○ BFT SMR: PBFT, BFT-SMaRt, HotStuff, Tendermint
○ BFD SMR: PeerReview
● DB (run transactions concurrently)
○ BFT DB: HRDB, Byzantium, Basil, Hyperledger Fabric
○ BFD DB: No existing work
● BFT SMR and BFT DB: not practical from an administrative perspective.
● BFD SMR: not designed for database transactions.
BFT DB => BFD DB
● Can we realize BFD by splitting up replicas into 2 ADs?
○ No.
● A single Byzantine-faulty replica will exceed the predefined threshold for correctness because Byzantine faults are dependent within an AD: the other replicas in the same AD must be assumed faulty too.
○ The system would have to accept the fault, i.e., tampered data.
BFT DB cannot trivially be extended to realize BFD DB: with N = 4 replicas split across AD-1 and AD-2, one faulty AD gives f = 2, so N > 3f no longer holds (4 < 6).
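The threshold argument can be checked numerically. The sketch below assumes the worst case described in the slides, namely that all replicas inside one faulty AD fail together; the function names and layouts are illustrative.

```python
def bft_threshold_holds(n: int, f: int) -> bool:
    """BFT correctness condition from the slides: N > 3f."""
    return n > 3 * f


def worst_case_faults(replicas_per_ad: list[int]) -> int:
    """Assume one AD is faulty and every replica inside it fails together."""
    return max(replicas_per_ad)


# Each layout lists the number of replicas hosted by each AD.
for layout in ([2, 2], [1, 1, 1, 1]):
    n, f = sum(layout), worst_case_faults(layout)
    print(f"layout={layout}: N={n}, f={f}, "
          f"N > 3f holds: {bft_threshold_holds(n, f)}")
```

Splitting 4 replicas over 2 ADs breaks the condition, whereas spreading them over 4 ADs keeps it.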
BFD SMR => BFD DB
● Can we make BFD SMR (PeerReview) run transactions concurrently?
○ Yes, but only partially.
○ We could apply concurrency control to the primary-side processing.
● The witness side must execute the hash-chained log sequentially for correctness (i.e., strict serializability), which limits overall scalability.
○ Running transactions in parallel could cause time-travel anomalies.
[Figure: a Primary in AD-1 and a Witness (Auditor) in AD-2 connected by a hash-chained log containing transactions T1 and T2.]
Witness-side execution has to be sequential for correctness.
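A toy hash chain illustrates why the witness has to replay the log in order: each hash depends on every entry before it. This is a minimal sketch, not PeerReview's actual log format; the entry encoding, the `GENESIS` value, and the function names are assumptions made for illustration.

```python
import hashlib

GENESIS = "0" * 64  # hypothetical starting hash for an empty log


def chain_hash(prev_hash: str, entry: str) -> str:
    """Hash of a log entry, bound to the hash of the entry before it."""
    return hashlib.sha256((prev_hash + entry).encode("utf-8")).hexdigest()


def build_log(entries):
    """Primary side: append entries and record the running chain hash."""
    log, prev = [], GENESIS
    for entry in entries:
        prev = chain_hash(prev, entry)
        log.append((entry, prev))
    return log


def replay_matches(log, replayed_entries) -> bool:
    """Witness side: replay entries one by one and compare chain hashes."""
    if len(log) != len(replayed_entries):
        return False
    prev = GENESIS
    for (_, expected_hash), entry in zip(log, replayed_entries):
        prev = chain_hash(prev, entry)
        if prev != expected_hash:
            return False
    return True


primary_log = build_log(["T1", "T2"])
print(replay_matches(primary_log, ["T1", "T2"]))  # True
print(replay_matches(primary_log, ["T2", "T1"]))  # False
```

Replaying the same transactions in a different order yields a different chain, so the witness cannot process log entries independently of one another.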
Challenge:
Scalable BFD for a database system deployed to a 2-AD environment
● SMR (run transactions sequentially)
○ BFT SMR: PBFT, BFT-SMaRt, HotStuff, Tendermint
○ BFD SMR: PeerReview
● DB (run transactions concurrently)
○ BFT DB: HRDB, Byzantium, Basil, Hyperledger Fabric
○ BFD DB: NONE
● BFT DB => BFD DB: not possible (as it is).
● BFD SMR => BFD DB: possible but not scalable.
Scalar DL: A scalable and practical BFD approach
● Scalable and practical BFD middleware for transactional database systems.
○ Manage two types of servers and databases in separate ADs internally.
○ Database-agnostic by depending only on common database operations.
● Execute non-conflicting transactions in parallel while guaranteeing correctness.
[Figure: applications use Scalar DL clients to access a database system made of Scalar DL primary servers with a primary database in AD1 and Scalar DL secondary servers with a secondary database in AD2.]
Correctness:
● Provide safety (strict serializability) and liveness if no fault.
● Provide safety (correct clients can detect a Byzantine fault) if one AD is faulty.
The BFD protocol - Overview
● Key idea: Make an agreement on the partial ordering of transactions in a
decentralized and concurrent way
○ Neither the primary nor the secondary can selfishly order or commit transactions.
● 3-phase protocol: Ordering -> Commit -> Validation.
○ The protocol assumes a one-shot request model.
[Figure: message flow among Client, Secondary, and Primary across the Ordering, Commit, and Validation phases.]
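Agreeing on a partial order only matters for transactions that conflict; non-conflicting ones can run in parallel. A minimal sketch of the usual read/write-set conflict test follows. The `Tx` type and the example transactions are hypothetical and are not taken from the Scalar DL code base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    tx_id: str
    reads: frozenset
    writes: frozenset


def conflicts(a: Tx, b: Tx) -> bool:
    """Two transactions conflict if one writes a key the other reads or writes."""
    return bool(a.writes & (b.reads | b.writes)) or bool(b.writes & a.reads)


t1 = Tx("T1", reads=frozenset({"x"}), writes=frozenset({"x"}))
t2 = Tx("T2", reads=frozenset({"y"}), writes=frozenset({"y"}))
t3 = Tx("T3", reads=frozenset({"x"}), writes=frozenset({"z"}))

print(conflicts(t1, t2))  # False: disjoint keys, no ordering needed
print(conflicts(t1, t3))  # True: T1 writes x and T3 reads x
```

Only pairs such as T1 and T3 need an agreed relative order; T1 and T2 can be processed concurrently.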
The BFD protocol - Ordering phase
● Order transactions in a strictly serializable manner with a variant of 2PL.
○ Simulate a transaction and identify its read/write sets.
○ Acquire R/W locks using the underlying database’s linearizable operations.
○ Go to the commit phase once all the required locks are acquired.
● Why not use multi-version concurrency control (MVCC)?
○ A primary and a secondary could derive different serialization orders without sharing explicit order dependencies (e.g., a conflict graph).
Lock entry: Primary key, Version, Lock count, Lock mode, Lock holders (TxIDs), Input dependencies.
○ Input dependencies: a set of <primary-key, version> pairs that indicates the partial order of transactions.
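The lock entry fields can be sketched as a small in-memory lock table. This is only an illustration under stated assumptions: the real system acquires locks through the underlying database's linearizable operations rather than a Python dict, and both the all-or-nothing acquisition in sorted key order and the rule used to fill `input_dependencies` (the versions of the records the transaction read) are guesses made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class LockEntry:
    primary_key: str
    version: int = 0
    lock_count: int = 0
    lock_mode: str = ""  # "R", "W", or "" when unlocked
    lock_holders: set = field(default_factory=set)        # TxIDs
    input_dependencies: set = field(default_factory=set)  # {(primary_key, version)}


class LockTable:
    def __init__(self):
        self.entries = {}

    def try_lock(self, tx_id: str, key: str, mode: str) -> bool:
        entry = self.entries.setdefault(key, LockEntry(key))
        if entry.lock_count == 0:
            entry.lock_mode = mode
        elif not (mode == "R" and entry.lock_mode == "R"):
            return False  # only shared read locks are compatible
        entry.lock_count += 1
        entry.lock_holders.add(tx_id)
        return True

    def unlock(self, tx_id: str, key: str) -> None:
        entry = self.entries[key]
        entry.lock_holders.discard(tx_id)
        entry.lock_count -= 1
        if entry.lock_count == 0:
            entry.lock_mode = ""

    def acquire_all(self, tx_id: str, read_set: set, write_set: set) -> bool:
        wanted = {key: "R" for key in read_set}
        wanted.update({key: "W" for key in write_set})  # a write lock wins
        acquired = []
        for key in sorted(wanted):
            if not self.try_lock(tx_id, key, wanted[key]):
                for held in acquired:  # release everything on failure
                    self.unlock(tx_id, held)
                return False
            acquired.append(key)
        deps = {(key, self.entries[key].version) for key in read_set}
        for key in write_set:
            self.entries[key].input_dependencies = set(deps)
        return True


table = LockTable()
ok1 = table.acquire_all("T1", {"x"}, {"x"})
ok2 = table.acquire_all("T2", {"y"}, {"y"})
ok3 = table.acquire_all("T3", {"x"}, {"z"})
print(ok1, ok2, ok3)  # True True False
```

T1 and T2 touch disjoint keys and both obtain their locks, while T3 has to wait because T1 holds a write lock on `x`.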
The BFD protocol - Commit phase
● Execute transactions in an ACID way in an arbitrary order.
○ Also write a transaction status with a transaction ID as a key for recovery.
○ This is where a transaction is regarded as committed or aborted.
● Create proofs that indicate what records are read and written.
● The input dependencies indicate the partial order of transactions.
Proof entry: Primary key, Version, TxID, Input dependencies, MAC.
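A proof entry with a MAC can be sketched with Python's standard `hmac` module. The JSON serialization, the key handling, and the function names below are assumptions for illustration; the slides do not specify how Scalar DL encodes proofs or manages keys.

```python
import hashlib
import hmac
import json


def _payload(body: dict) -> bytes:
    """Canonical byte encoding of the proof fields (assumed format)."""
    return json.dumps(body, sort_keys=True).encode("utf-8")


def make_proof(secret: bytes, primary_key: str, version: int,
               tx_id: str, input_dependencies) -> dict:
    body = {
        "primary_key": primary_key,
        "version": version,
        "tx_id": tx_id,
        # Sorted [primary_key, version] pairs keep the encoding deterministic.
        "input_dependencies": sorted([k, v] for k, v in input_dependencies),
    }
    mac = hmac.new(secret, _payload(body), hashlib.sha256).hexdigest()
    return {**body, "mac": mac}


def verify_proof(secret: bytes, proof: dict) -> bool:
    body = {k: v for k, v in proof.items() if k != "mac"}
    expected = hmac.new(secret, _payload(body), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, proof["mac"])


secret = b"demo-key-for-illustration-only"
proof = make_proof(secret, "x", 1, "T1", {("x", 0)})
print(verify_proof(secret, proof))     # True

tampered = {**proof, "version": 2}
print(verify_proof(secret, tampered))  # False
```

Any change to the key, version, TxID, or dependencies after the proof is created invalidates the MAC.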
The BFD protocol - Validation phase
● Validate that the commit order is the same as the one the secondary expects.
○ Compare the lock entries and proofs.
● Execute transactions in the secondary once validated and create proofs.
● A client compares the results and proofs from the primary and the secondary
to find discrepancies (i.e., Byzantine faults).
[Figure: in pre-validation, the secondary compares the primary's proofs with its lock table. The client then compares the result and proofs from the commit phase (primary) with the result and proofs from the validation phase (secondary).]
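The client-side comparison can be sketched as a plain equality check over results and proofs. The response layout (`result`, `proofs`) and the function names are hypothetical, and MAC verification is left out to keep the example short.

```python
def proof_key(proof: dict) -> tuple:
    """Fields of a proof that both sides must agree on."""
    return (proof["primary_key"], proof["version"], proof["tx_id"])


def find_discrepancies(primary: dict, secondary: dict) -> list:
    """Return human-readable mismatches; an empty list means no fault was seen."""
    problems = []
    if primary["result"] != secondary["result"]:
        problems.append("result mismatch")
    primary_proofs = {proof_key(p) for p in primary["proofs"]}
    secondary_proofs = {proof_key(p) for p in secondary["proofs"]}
    for key in sorted(primary_proofs ^ secondary_proofs):
        problems.append(f"proof mismatch for {key}")
    return problems


honest = {
    "result": {"balance": 90},
    "proofs": [{"primary_key": "x", "version": 1, "tx_id": "T1"}],
}
faulty = {
    "result": {"balance": 900},
    "proofs": [{"primary_key": "x", "version": 2, "tx_id": "T1"}],
}

print(find_discrepancies(honest, honest))  # []
print(find_discrepancies(honest, faulty))
```

A non-empty list is the signal that one of the two ADs returned something the other did not agree to, which is the detection event the protocol aims for.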
Evaluation - Benchmarked systems and workloads
● Benchmarked Systems:
○ PeerReviewTx: an extended version of PeerReview that runs transactions in parallel on the primary side.
○ Scalar DL: uses Scalar DB to execute transactions on non-transactional databases.
○ Both PeerReviewTx and Scalar DL servers are placed on the database instances.
○ PostgreSQL and Cassandra as backend database systems.
● Workloads
○ YCSB: F and C. 100M records with 100-byte payloads and uniform distribution.
○ TPC-C: 50/50 ratio of NewOrder and Payment. 100 to 1000 warehouses.
Evaluation - Experimental setup
● Environment
○ AWS. c5d.4xlarge for each database instance (8 cores, 32GB DRAM, NVMe SSD).
c5.9xlarge for a client.
○ 2 ADs in different VPCs.
[Figure: two deployments, each with clients and 2 ADs. PostgreSQL setup: one PostgreSQL instance with Scalar DL per AD. Cassandra setup: multiple Cassandra (C*) nodes, each with Scalar DL (DL), per AD.]
Throughput on PostgreSQL
[Figure panels: YCSB-F and TPC-C (NP).]
Scalar DL scaled as the number of client threads increased, whereas PeerReviewTx
didn’t scale as much. The benefit of Scalar DL comes from its concurrency control.
Throughput on Cassandra (3 nodes per AD, RF=3)
[Figure panels: YCSB-F and TPC-C (NP).]
The results were similar to those on PostgreSQL.
The database-agnostic property was also verified.
Scalability (with TPC-C)
Scalar DL scaled near-linearly as the number of nodes in each AD increased.
Summary
● Scalar DL is scalable and practical BFD middleware for transactional database systems.
● Key contribution: a Byzantine fault detection protocol that executes non-conflicting transactions in parallel while guaranteeing correctness.
● Achieves up to a 10x speedup compared to the state-of-the-art BFD approach and near-linear (91%) node scalability.
● Scalar DL is a real product, not a research prototype.
○ See https://github.com/scalar-labs/scalardl

Scalar DL: Scalable and Practical Byzantine Fault Detection for Transactional Database Systems (VLDB'22)

 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
WSO2Con2024 - Enabling Transactional System's Exponential Growth With Simplicity
WSO2Con2024 - Enabling Transactional System's Exponential Growth With SimplicityWSO2Con2024 - Enabling Transactional System's Exponential Growth With Simplicity
WSO2Con2024 - Enabling Transactional System's Exponential Growth With Simplicity
 
tonesoftg
tonesoftgtonesoftg
tonesoftg
 
%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare
 

Scalar DL: Scalable and Practical Byzantine Fault Detection for Transactional Database Systems (VLDB'22)

  • 1. Scalar DL: Scalable and Practical Byzantine Fault Detection for Transactional Database Systems Hiroyuki Yamada, Jun Nemoto Scalar, Inc.
  • 2. Towards a reliable database system ● We live in a data-driven / data-centric world. ○ Data needs to be reliable and trustworthy. ○ Database systems need to be reliable and trustworthy. ● Dealing with Byzantine faults in a database system is one of the key factors. ○ Byzantine faults: software errors, data tampering, (internal) malicious attacks. Our Goal: A database system that deals with Byzantine faults in a practical and scalable way.
  • 3. Dealing with Byzantine faults ● Basic principle: find discrepancies between replicas. ● Byzantine fault tolerance (BFT). ○ N > 3f, N: # of replicas, f: # of faulty replicas. ○ SMR: PBFT [OSDI’99], BFT-SMaRt [DSN’14], HotStuff [PODC’19] … ○ Database: HRDB [SOSP’07], Byzantium [EuroSys’11], Hyperledger Fabric [EuroSys’18], Basil [SOSP’21] ● Byzantine fault detection (BFD). ○ N > f, N: # of replicas, f: # of faulty replicas. ○ SMR: PeerReview [SOSP’07] Are existing solutions practical and scalable enough for a database system?
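The two replica bounds on this slide can be checked with a few lines of arithmetic. The following is a minimal sketch; the function names are hypothetical and only encode the inequalities N > 3f (BFT) and N > f (BFD) stated above.

```python
def min_replicas_bft(f: int) -> int:
    """Smallest N that satisfies N > 3f (Byzantine fault tolerance)."""
    return 3 * f + 1


def min_replicas_bfd(f: int) -> int:
    """Smallest N that satisfies N > f (Byzantine fault detection)."""
    return f + 1


def bft_bound_holds(n: int, f: int) -> bool:
    # True when n replicas are enough to tolerate f faulty ones.
    return n > 3 * f


def bfd_bound_holds(n: int, f: int) -> bool:
    # True when n replicas are enough to detect f faulty ones.
    return n > f
```

For a single fault (f = 1), the sketch gives 4 replicas for BFT and 2 for BFD, which matches the 4-AD and 2-AD figures used in the following slides.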
  • 4. BFT is ideal, but may not be practical for database systems ● At least 4 administrative domains (ADs) are required for correctness. ○ Malicious attacks are likely to be dependent in an AD. ● BFT might not fit well with enterprise database systems. ○ Many enterprise database systems are managed by a single AD or a few ADs. An AD is a collection of nodes and networks operated by a single organization or administrative authority.
  • 6. BFT is ideal, but may not be practical for database systems ● At least 4 administrative domains (ADs) are required for correctness. ○ Malicious attacks are likely to be dependent in an AD. ● BFT might not fit well with enterprise database systems. ○ Many enterprise database systems are managed by a single AD or a few ADs. AD-1 AD-2 AD-3 AD-4 An AD is a collection of nodes and networks operated by a single organization or administrative authority.
  • 7. BFT is ideal, but may not be practical for database systems ● At least 4 administrative domains (ADs) are required for correctness. ○ Malicious attacks are likely to be dependent in an AD. ● BFT might not fit well with enterprise database systems. ○ Many enterprise database systems are managed by a single AD or a few ADs. [Diagram: AD-1, AD-2, AD-3, AD-4.] At least 4 ADs are required to mask 1 fault. An AD is a collection of nodes and networks operated by a single organization or administrative authority.
  • 8. BFD is a promising approach for database systems ● Requires only 2 ADs for correctness. ○ 2 is the lower bound for the number of replicas in dealing with Byzantine faults. ● Many use cases require only BFD or tamper evidence. ○ Regulations on data protection and privacy (e.g., GDPR and CCPA), prior user rights for IP, and vehicle regulations around OTA software updates in WP.29. ● Existing solutions are not designed for transactional database systems. ○ They cannot run transactions in parallel (i.e., they are not scalable). [Diagram: AD-1, AD-2.] 1 faulty AD can be detected as long as there are 2 ADs.
  • 9. Challenge: Scalable BFD for a database system deployed to a 2-AD environment. Design space (BFT vs. BFD, SMR vs. DB): ○ BFT SMR (run transactions sequentially): PBFT, BFT-SMaRt, HotStuff, Tendermint. ○ BFD SMR (run transactions sequentially): PeerReview. ○ BFT DB (run transactions concurrently): HRDB, Byzantium, Basil, Hyperledger Fabric. ○ BFD DB (run transactions concurrently): No existing work.
  • 10. Challenge: Scalable BFD for a database system deployed to a 2-AD environment. Design space (BFT vs. BFD, SMR vs. DB): ○ BFT SMR (run transactions sequentially): PBFT, BFT-SMaRt, HotStuff, Tendermint. ○ BFD SMR (run transactions sequentially): PeerReview. ○ BFT DB (run transactions concurrently): HRDB, Byzantium, Basil, Hyperledger Fabric. ○ BFD DB (run transactions concurrently): No existing work. Annotation (BFT): Not practical from an administrative perspective.
  • 11. Challenge: Scalable BFD for a database system deployed to a 2-AD environment. Design space (BFT vs. BFD, SMR vs. DB): ○ BFT SMR (run transactions sequentially): PBFT, BFT-SMaRt, HotStuff, Tendermint. ○ BFD SMR (run transactions sequentially): PeerReview. ○ BFT DB (run transactions concurrently): HRDB, Byzantium, Basil, Hyperledger Fabric. ○ BFD DB (run transactions concurrently): No existing work. Annotation (BFT): Not practical from an administrative perspective. Annotation (BFD SMR): Not designed for database transactions.
  • 14. BFT DB => BFD DB ● Can we realize BFD by splitting up replicas into 2 ADs? ○ No. ● A single Byzantine fault will push the number of faulty replicas past the predefined threshold for correctness, because Byzantine faults are dependent within an AD. ○ The fault would have to be accepted, i.e., data could be tampered with.
  • 16. BFT DB => BFD DB ● Can we realize BFD by splitting up replicas into 2 ADs? ○ No. ● A single Byzantine fault will push the number of faulty replicas past the predefined threshold for correctness, because Byzantine faults are dependent within an AD. ○ The fault would have to be accepted, i.e., data could be tampered with. [Diagram: AD-1, AD-2.]
  • 19. BFT DB => BFD DB ● Can we realize BFD by splitting up replicas into 2 ADs? ○ No. ● A single Byzantine fault will push the number of faulty replicas past the predefined threshold for correctness, because Byzantine faults are dependent within an AD. ○ The fault would have to be accepted, i.e., data could be tampered with. [Diagram: AD-1, AD-2.] With N=4 and f=2, N > 3f does not hold. BFT DB cannot trivially be extended to realize BFD DB.
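The argument on this slide can be made concrete with a small worked example. This is a sketch under one stated assumption taken from the slide: faults are dependent within an AD, so a faulty AD makes every replica it hosts faulty. The function names are hypothetical.

```python
def worst_case_faulty(replicas_per_ad: list[int]) -> int:
    """Number of faulty replicas if the AD hosting the most replicas is faulty.

    Assumes faults are dependent within an AD: one compromised AD
    compromises all replicas it operates.
    """
    return max(replicas_per_ad)


def bft_survives_one_faulty_ad(replicas_per_ad: list[int]) -> bool:
    # BFT requires N > 3f, where f is the worst-case number of faulty replicas.
    n = sum(replicas_per_ad)
    f = worst_case_faulty(replicas_per_ad)
    return n > 3 * f
```

Splitting 4 replicas evenly over 2 ADs gives N = 4 and f = 2, so the bound fails, whereas 4 replicas in 4 separate ADs give f = 1 and the bound holds.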
  • 20. BFD SMR => BFD DB ● Can we make BFD SMR (PeerReview) run transactions concurrently? ○ Yes, but only partially. ○ Concurrency control could be applied to primary-side processing. ● The witness side must execute the hash-chained log sequentially for correctness (i.e., strict serializability), which limits overall scalability. ○ Running transactions in parallel could cause time-travel anomalies. [Diagram: primary in AD-1 and witness (auditor) in AD-2, connected by a hash-chained log of transactions T1 and T2.] Witness-side execution has to be sequential for correctness.
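To illustrate why a hash-chained log ties the witness to one sequential order, here is a minimal sketch. It is not PeerReview's actual log format: the entry layout, the all-zero genesis value, and the function names are assumptions made for illustration. Each entry's digest covers the previous digest, so replaying the same transactions in a different order produces a different chain.

```python
import hashlib

GENESIS = "0" * 64  # hypothetical starting value for the chain


def append(log: list[tuple[str, str]], tx: str) -> None:
    """Append a transaction; its digest covers the previous entry's digest."""
    prev = log[-1][1] if log else GENESIS
    digest = hashlib.sha256((prev + tx).encode()).hexdigest()
    log.append((tx, digest))


def build_log(txs: list[str]) -> list[tuple[str, str]]:
    log: list[tuple[str, str]] = []
    for tx in txs:
        append(log, tx)
    return log


def witness_replay_matches(primary_log, txs_in_witness_order) -> bool:
    # The witness re-derives the chain in its own execution order
    # and compares it with the chain received from the primary.
    return build_log(txs_in_witness_order) == primary_log
```

In this sketch a witness that replays T2 before T1 cannot reproduce the primary's chain for T1 then T2, which is one way to see why the witness side has to follow the logged order.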
  • 21. Challenge: Scalable BFD for a database system deployed to a 2-AD environment. Design space (BFT vs. BFD, SMR vs. DB): ○ BFT SMR (run transactions sequentially): PBFT, BFT-SMaRt, HotStuff, Tendermint. ○ BFD SMR (run transactions sequentially): PeerReview. ○ BFT DB (run transactions concurrently): HRDB, Byzantium, Basil, Hyperledger Fabric. ○ BFD DB (run transactions concurrently): NONE. BFT DB => BFD DB: Not possible (as it is). BFD SMR => BFD DB: Possible but not scalable.
  • 22. Scalar DL: A scalable and practical BFD approach ● Scalable and practical BFD middleware for transactional database systems. ○ Manages two types of servers and databases in separate ADs internally. ○ Database-agnostic by depending only on common database operations. ● Executes non-conflicting transactions in parallel while guaranteeing correctness. [Diagram: applications use Scalar DL clients to access the database system, which consists of Scalar DL primary servers and a primary database in AD1, and Scalar DL secondary servers and a secondary database in AD2.] Correctness: • Provides safety (strict serializability) and liveness if there is no fault. • Provides safety (correct clients can detect a Byzantine fault) if one AD is faulty.
  • 23. The BFD protocol - Overview ● Key idea: Reach agreement on the partial ordering of transactions in a decentralized and concurrent way. ○ Neither the primary nor the secondary can unilaterally order/commit transactions. ● 3-phase protocol: Ordering -> Commit -> Validation. ○ The protocol assumes a one-shot request model. [Diagram: message flow among client, secondary, and primary across the Ordering, Commit, and Validation phases.]
  • 24. The BFD protocol - Ordering phase ● Order transactions in a strict serializable manner with a variant of 2PL. ○ Simulate a transaction and identify the read/write sets of the transaction. ○ Acquire R/W locks using the underlying database’s linearizable operations. ○ Go to the commit phase once all the required locks are acquired. ● Why not use multi-version concurrency control (MVCC)? ○ A primary and a secondary could derive different serialization orders without sharing explicit order dependencies (e.g., conflict graph). Lock entry: Primary key, Version, Lock count, Lock mode, Lock holders (TxIDs), Input dependencies. Input dependencies: a set of <primary-key, version>, which indicates the partial order of transactions. [Diagram: message flow among client, secondary, and primary across the Ordering, Commit, and Validation phases.]
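The lock-acquisition step of the ordering phase can be sketched as follows. This is an illustrative approximation and not Scalar DL's implementation: an in-memory dict stands in for the underlying database's linearizable operations, the all-or-nothing acquisition policy is an assumption, and all names (`LockEntry`, `try_acquire`, `release`) are hypothetical. The `LockEntry` fields mirror the lock entry listed on the slide.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LockEntry:
    primary_key: str
    version: int = 0
    lock_count: int = 0
    lock_mode: Optional[str] = None            # "R", "W", or None when free
    lock_holders: set = field(default_factory=set)        # TxIDs
    input_dependencies: set = field(default_factory=set)  # <primary-key, version> pairs


def compatible(entry: LockEntry, mode: str) -> bool:
    # A free record accepts any lock; a held record only accepts shared reads.
    if entry.lock_count == 0:
        return True
    return entry.lock_mode == "R" and mode == "R"


def try_acquire(table: dict, tx_id: str, read_set: set, write_set: set) -> bool:
    """Acquire every required lock or none (assumed policy for this sketch)."""
    wanted = {k: "R" for k in read_set}
    wanted.update({k: "W" for k in write_set})  # a write lock overrides a read lock
    entries = {k: table.setdefault(k, LockEntry(k)) for k in wanted}
    if not all(compatible(entries[k], m) for k, m in wanted.items()):
        return False  # conflict: the caller decides whether to retry or abort
    for k, m in wanted.items():
        entry = entries[k]
        entry.lock_mode = m
        entry.lock_count += 1
        entry.lock_holders.add(tx_id)
    return True


def release(table: dict, tx_id: str) -> None:
    for entry in table.values():
        if tx_id in entry.lock_holders:
            entry.lock_holders.discard(tx_id)
            entry.lock_count -= 1
            if entry.lock_count == 0:
                entry.lock_mode = None
```

In this sketch two transactions that only read the same record proceed together, while a writer on that record is refused until the readers release their locks, so non-conflicting transactions are never serialized against each other.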
  • 25. The BFD protocol - Commit phase ● Execute transactions in an ACID way in an arbitrary order. ○ Also write a transaction status with a transaction ID as a key for recovery. ○ This is where a transaction is regarded as committed or aborted. ● Create proofs that indicate what records are read and written. ● The input dependencies indicate the partial order of transactions. Proof entry: Primary key, Version, TxID, Input dependencies, MAC. [Diagram: message flow among client, secondary, and primary across the Ordering, Commit, and Validation phases.]
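A proof entry with the fields listed on this slide could be built and checked roughly as shown below. This is a sketch only: the slide does not say which MAC algorithm, key, or serialization Scalar DL uses, so HMAC-SHA256 over a JSON encoding, the `key` parameter, and the function names are all assumptions.

```python
import hashlib
import hmac
import json


def _payload(body: dict) -> bytes:
    # Canonical encoding so that both sides compute the MAC over the same bytes.
    return json.dumps(body, sort_keys=True).encode()


def make_proof(key: bytes, primary_key: str, version: int, tx_id: str,
               input_dependencies: set) -> dict:
    """Build a proof entry; input_dependencies is a set of (primary_key, version)."""
    body = {
        "primary_key": primary_key,
        "version": version,
        "tx_id": tx_id,
        "input_dependencies": sorted(list(dep) for dep in input_dependencies),
    }
    mac = hmac.new(key, _payload(body), hashlib.sha256).hexdigest()
    return {**body, "mac": mac}


def verify_proof(key: bytes, proof: dict) -> bool:
    body = {k: v for k, v in proof.items() if k != "mac"}
    expected = hmac.new(key, _payload(body), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, proof["mac"])
```

With this layout, changing any field of a proof after it was issued (for example the version) makes `verify_proof` return `False`, which is the tamper-evidence property the MAC field is there to provide.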
  • 26. The BFD protocol - Validation phase ● Validate that the commit order is the same as the one the secondary expects. ○ Compare the lock entries and proofs. ● Execute transactions in the secondary once validated and create proofs. ● A client compares the results and proofs from the primary and the secondary to find discrepancies (i.e., Byzantine faults). [Diagram: after the commit phase, the secondary pre-validates by comparing lock tables with the primary; in the validation phase, the client compares the result and proofs from the primary with the result and proofs from the secondary.]
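The client-side comparison in the validation phase can be sketched as a plain equality check over results and proofs. The data layout below is hypothetical (dict-based proofs with the fields from the proof entry, and MACs left out of the comparison on the assumption that each side may sign with its own key); it is meant to show the shape of the check, not Scalar DL's client code.

```python
def normalize(proofs: list[dict]) -> list[tuple]:
    """Order-independent view of a proof list, ignoring the MAC field."""
    return sorted(
        (
            p["primary_key"],
            p["version"],
            p["tx_id"],
            tuple(sorted(p["input_dependencies"])),
        )
        for p in proofs
    )


def find_discrepancy(primary: tuple, secondary: tuple) -> bool:
    """Each argument is a (result, proofs) pair; True means a fault is detected."""
    p_result, p_proofs = primary
    s_result, s_proofs = secondary
    return p_result != s_result or normalize(p_proofs) != normalize(s_proofs)
```

If either the result or any proof field differs between the two ADs, the client in this sketch reports a discrepancy, which corresponds to detecting a Byzantine fault in one of them.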
  • 27. Evaluation - Benchmarked systems and workloads ● Benchmarked Systems: ○ PeerReviewTx: an extended version of PeerReview, which runs transactions in parallel on the primary side. ○ Scalar DL: uses Scalar DB to execute transactions on non-transactional databases. ○ Both PeerReviewTx and Scalar DL servers are placed on the database instances. ○ PostgreSQL and Cassandra as backend database systems. ● Workloads ○ YCSB: F and C. 100M records with 100-byte payloads and uniform distribution. ○ TPC-C: 50/50 ratio of NewOrder and Payment. 100 - 1000 warehouses.
  • 28. Evaluation - Experimental setup ● Environment ○ AWS. c5d.4xlarge for each database instance (8 cores, 32GB DRAM, NVMe SSD). c5.9xlarge for a client. ○ 2 ADs in different VPCs. [Diagram: clients connected to two ADs, shown for both backends: PostgreSQL with Scalar DL in each AD, and multiple Cassandra (C*) nodes with Scalar DL in each AD.]
  • 29. Throughput on PostgreSQL [Plots: YCSB-F, TPC-C (NP).] Scalar DL scaled as the number of client threads increased, whereas PeerReviewTx didn’t scale as much. The benefit of Scalar DL comes from its concurrency control.
  • 30. Throughput on Cassandra (3 nodes per AD, RF=3) [Plots: YCSB-F, TPC-C (NP).] The results were similar to those on PostgreSQL. The database-agnostic property was also verified.
  • 31. Scalability (with TPC-C) Scalar DL scaled near-linearly as the number of nodes increased in each AD.
  • 32. Summary ● Scalar DL is a scalable and practical BFD middleware for transactional database systems. ● Key contribution: a Byzantine fault detection protocol that executes non-conflicting transactions in parallel while guaranteeing correctness. ● Achieves up to 10 times speedup compared to the state-of-the-art BFD approach and near-linear (91%) node scalability. ● Scalar DL is a real product, not a research prototype. ○ See https://github.com/scalar-labs/scalardl