SlideShare a Scribd company logo
1 of 21
IN-MEMORY NOSQL, Now OPEN 
SOURCE! 
ACID & CAP: 
CLEARING CAP CONFUSION AND 
WHY C IN CAP ≠ C IN ACID 
SRINI V. SRINIVASAN, PH.D 
SUNIL SAYYAPARAJU 
Aerospike aer . o . spike [air-oh- spahyk] 
noun, 1. tip of a rocket that enhances speed and stability 
© 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 1
REQUIREMENTS FOR INTERNET ENTERPRISES 
© 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 2
Introduction to Advertising: Real-time Bidding 
© 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 3
North American RTB speeds & feeds 
■ 1 to 6 billion cookies tracked 
■ Some companies track 200M, some track 20B 
■ Each bidder has their own data pool 
■ Data is your weapon 
■ Recent searches, behavior, IP addresses 
■ Audience clusters (K-cluster, K-means) from offline Hadoop 
■ “Remnant” from Google, Yahoo is about 0.6 million / sec 
■ Facebook exchange: about 0.6 million / sec 
■ “other” is 0.5 million / sec 
Currently about 3.0M / sec in North American 
© 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 4
Advertising requirements 
■ 100 millisecond or 150 millisecond ad delivery 
■ De-facto standard set in 2004 by Washington Post and others 
■ North America is 70 to 90 milliseconds wide 
■ Two or three data centers 
■ Auction is limited to 30 milliseconds 
■ Typically closes in 5 milliseconds 
■ Winners have more data, better models – in 5 milliseconds 
© 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 5
© 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 6 
MILLIONS OF CONSUMERS 
BILLIONS OF DEVICES 
APP SERVERS 
DATA 
INSIGHTS WAREHOUSE 
Advertising Technology Stack 
WRITE CONTEXT 
OPERATIONAL DB 
WRITE REAL-TIME CONTEXT 
READ RECENT CONTENT 
PROFILE STORE 
Cookies, email, deviceID, IP address, location, 
segments, clicks, likes, tweets, search terms... 
REAL-TIME ANALYTICS 
Best sellers, top scores, trending tweets 
BATCH ANALYTICS 
Discover patterns, 
segment data: 
location patterns, 
audience affinity
Financial Services – Intraday Positions 
ACCOUNT 
POSITIONS 
Read/Write 
Query 
Start of Day 
Data Loading 
End of Day 
Reconciliation 
LEGACY DATABASE 
(MAINFRAME) 
REAL-TIME 
DATA FEED 
© 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 7 
XDR 
10M+ user records 
Primary key access 
1M+ TPS planned 
Finance App 
Records App 
RT Reporting App
© 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 8 
Social Media 
MYSQL or POSTGRES 
(ROTATIONAL DISK) 
Java application tier 
Data abstraction 
and sharding 
Recent user 
generated content 
MODIFIED REDIS 
(SSD ENABLED) 
Content and 
Historical data
Modern Scale Out Architecture 
LOAD BALANCER 
Load balancer 
Simple stateless 
APP SERVERS 
CONTENT 
DELIVERY NETWORK 
Fast stateless 
IN-MEMORY NoSQL 
RESEARCH 
WAREHOUSE 
Long term cold 
storage 
HDFS BASED 
© 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 9
© 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 10 
ACID 
■ A : Atomicity 
■ All the changes will happen or none of them will happen 
■ Aborted transactions are rolled back 
■ C : Consistency 
■ Database will adhere to all the consistency rules before and after every 
transaction 
■ I.E, Data integrity is preserved before and after transaction 
■ Consistency rules specified by constraints for check, foreign keys, etc. 
■ I : Isolation 
■ Defines what data will be shown to the transactions 
■ Level-0/1/2/3 : Different types of locking semantics are used 
■ D : Durability 
■ Committed changes will never be lost 
■ Usually achieved by writing both log & data
© 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 11 
CAP 
■ C : Consistency 
■ All the copies of the data are same in a distributed system with 
replication 
■ A : Availability 
■ The system is 100% responsive for reads and writes with strict SLA 
■ It could return failure temporarily for a finite amount of time 
■ P : Partition Tolerance 
■ System continues to work (take reads/writes) even if some nodes 
cannot talk to each other 
■ Brewer’s CAP THEOREM 
■ Only two of the three (C, A, P) can be satisfied in any distributed 
system 
■ COROLLARY 
■ A system has to choose one of C or A in the event of partitioning
C for Controversy 
■ C in ACID != C in CAP 
■ So, ACID is possible in distributed systems 
© 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 12
ACID in Aerospike 
© 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 13 
■ Atomicity 
■ Currently, single-record atomicity with replication and secondary indexes 
■ Entire object including all the bins are changed together. “Copy on write” 
■ If any portion of the update fails, the entire operation is aborted 
■ Consistency 
■ No RDBMS style constraints can be defined 
■ Implied constraints are enforced, for example: 
■ Secondary index queries need to be able to find objects after the write transaction completes. 
■ Isolation 
■ Supports read-committed isolation for long transactions like backup/restore, scans, etc. (level-1) 
■ Provides Check-And-Set (CAS) operations 
■ Durability 
■ Achieved by writing to multiple replicas synchronously 
■ E.g., if one node fails, other copies can be used 
■ Effectively the level of durability is the same as using disk + log in traditional systems 
■ Enhanced durability 
■ Rackaware replication 
■ Backup + Restore 
■ XDR : Cross Datacenter Replication
CAP in Aerospike 
© 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 14 
■ Consistency 
■ Immediate consistency : All replicas are updated synchronously 
■ Availability 
■ New master/replicas will be assigned immediately on cluster state change 
■ New master will start taking writes 
■ Old replicas will server the reads 
■ Partition-tolerance 
■ Tries to avoid partitioning (secondary heartbeats) 
■ Chooses Availability over consistency 
■ Achieves eventual consistency when network restores 
■ For internet applications (e.g., Real-time Bidding in Display advertising) 
■ (AP + Eventual consistency) could be better than (CP - Availability) 
■ For enterprise applications (e.g., Consumer access to Retail Banking 
Accounts) 
■ C is paramount + A is very important, so partitions need to be avoided like the 
plague
Partitions are Rare 
Brewer’s CAP Revisited – 2012 
“First, because partitions are rare, there is little 
reason to forfeit C or A when the system is not 
partitioned.” 
© 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 15 
■ Fast heartbeats 
■ Nodes close to each other in same data center and same 
switch/rack 
■ Dual channel replicated heartbeats keeps system robust during 
network switch failures 
■ Ensures fast cluster formation and reorganization using Paxos 
algorithm 
■ Handling consistency during node failures 
■ Generation count based conflict detection and resolution 
■ Duplicate resolution for reads during cluster reorganization 
■ Atomically moving data partitions from one cluster node to 
another
SHARED-NOTHING SYSTEM:100% DATA 
AVAILABILITY 
■ Every node in a cluster is identical, 
handles both transactions and long 
running tasks 
■ Data is replicated synchronously 
with immediate consistency within 
the cluster 
■ Data is replicated asynchronously 
© 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 16 
across data centers 
OHIO Data Center
Consistency and Availability Tradeoffs 
Brewer’s CAP Revisited – 2012 
"Second, the choice between C and A can occur 
many times within the same system at very fine 
granularity; not only can subsystems make 
different choices, but the choice can change 
according to the operation or even the specific 
data or user involved." 
■ Data path tradeoff during partition migration 
■ Providing repeatable read results in higher read latency when 
multiple copies of data partitions are being merged 
■ Disabling repeatable read could deliver slightly stale data during 
partition migrations 
■ Cluster state tradeoff during cluster formation event 
■ Individual cluster nodes can reject requests for brief periods (10 
milliseconds) to ensure that a new cluster forms in a timely manner 
■ Clients barely notice this and cluster reorganization events are rare 
© 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 17
WRITING RELIABILY WITH HIGH PERFORMANCE 
© 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 18 
master replica 
1. Write sent to row master 
2. Latch against simultaneous writes 
3. Apply write to master memory and replica 
memory synchronously 
4. Queue operations to disk 
5. Signal completed transaction (optional 
storage commit wait) 
6. Master applies conflict resolution policy 
(rollback/ rollforward) 
1. Cluster discovers new node via gossip 
protocol 
2. Paxos vote determines new data 
organization 
3. Partition migrations scheduled 
4. When a partition migration starts, write 
journal starts on destination 
5. Partition moves atomically 
6. Journal is applied and source data 
deleted 
transactions 
continue 
Writing with Immediate Consistency Adding a Node
Tuning and High Performance 
Brewer’s CAP Revisited – 2012 
"Finally, all three properties are more continuous than 
binary. Availability is obviously continuous from 0 to 
100 percent, but there are also many levels of 
consistency, and even partitions have nuances, 
including disagreement within the system about 
whether a partition exists." 
© 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 19 
■ Tunable system 
■ Repeatable read allows higher consistency while lowering availability 
■ Heartbeat tuning helps system to continue to work in a robust manner 
■ Increasing replication factor to more than 2 helps keep small highly used data 
consistent 
■ Write all copies (sync and default) versus respond on master-complete 
(async) 
■ High Performance 
■ Vertical scale at 1M TPS / 10 TB node results in smaller clusters 
■ Smaller clusters leads to more robust system enabling 100% uptime 
■ Fast restart of servers (in seconds) minimizes the time when nodes go out of 
sync
Partition Avoidance versus Partition Detection 
Brewer’s CAP Revisited – 2012 
"Because partitions are rare, CAP should allow perfect C and A 
most of the time, but when partitions are present or perceived, a 
strategy that detects partitions and explicitly accounts for them is in 
order. This strategy should have three steps: detect partitions, enter 
an explicit partition mode that can limit some operations, and 
initiate a recovery process to restore consistency and compensate 
for mistakes made during a partition." 
■ High Consistency in AP Mode 
■ Avoid, as much as possible, the need to sacrifice consistency by minimizing formation of 
© 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 20 
network partitions 
■ High consistency using robust heartbeats, block for a few milliseconds during cluster 
formation, etc. 
■ Tunable consistency using repeatable read setting to maintain or relax consistency as 
necessary 
■ Smaller high capacity clusters hugely improves system behavior 
■ High Availability in CP Mode 
■ Static cluster to pre-define cluster size 
■ Detect partition occurrence accurately and enforce appropriate policies to protect the data 
■ Suspend partition migrations when the cluster is not whole 
■ Some amount of availability needs to be sacrificed to maintain consistency 
■ Block writes to partitions all of whose copies are not available in the partitioned cluster 
■ Serve reads if the replica is alive 
■ Not all reads/writes will fail, Only, writes meant for the nodes which are down will fail
Brewer’s CAP Revisited – 2012 
http://www.infoq.com/articles/cap-twelve- 
years-later-how-the-rules-have-changed 
© 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 21 
Conclusion 
■ Aerospike has been in development for about 6 years 
■ Does not sacrifice consistency at the altar of availability and high 
performance 
■ Has independently discovered and exploited some of the flexibility 
available to distributed systems as expressed in Brewer’s 2012 
article 
■ Attempts to provide the highest consistency, highest availability and 
highest performance possible in a distributed system 
■ Aerospike is now Open Source 
■ https://github.com/aerospike/aerospike-server 
■ Download and check it out!

More Related Content

What's hot

Distributing Data The Aerospike Way
Distributing Data The Aerospike WayDistributing Data The Aerospike Way
Distributing Data The Aerospike WayAerospike, Inc.
 
Leveraging Big Data with Hadoop, NoSQL and RDBMS
Leveraging Big Data with Hadoop, NoSQL and RDBMSLeveraging Big Data with Hadoop, NoSQL and RDBMS
Leveraging Big Data with Hadoop, NoSQL and RDBMSAerospike, Inc.
 
There are 250 Database products, are you running the right one?
There are 250 Database products, are you running the right one?There are 250 Database products, are you running the right one?
There are 250 Database products, are you running the right one?Aerospike, Inc.
 
Aerospike AdTech Gets Hacked in Lower Manhattan
Aerospike AdTech Gets Hacked in Lower ManhattanAerospike AdTech Gets Hacked in Lower Manhattan
Aerospike AdTech Gets Hacked in Lower ManhattanAerospike
 
Running a High Performance NoSQL Database on Amazon EC2 for Just $1.68/Hour
Running a High Performance NoSQL Database on Amazon EC2 for Just $1.68/HourRunning a High Performance NoSQL Database on Amazon EC2 for Just $1.68/Hour
Running a High Performance NoSQL Database on Amazon EC2 for Just $1.68/HourAerospike, Inc.
 
How to Get a Game Changing Performance Advantage with Intel SSDs and Aerospike
How to Get a Game Changing Performance Advantage with Intel SSDs and AerospikeHow to Get a Game Changing Performance Advantage with Intel SSDs and Aerospike
How to Get a Game Changing Performance Advantage with Intel SSDs and AerospikeAerospike, Inc.
 
2017 DB Trends for Powering Real-Time Systems of Engagement
2017 DB Trends for Powering Real-Time Systems of Engagement2017 DB Trends for Powering Real-Time Systems of Engagement
2017 DB Trends for Powering Real-Time Systems of EngagementAerospike, Inc.
 
10 reasons why to choose Pure Storage
10 reasons why to choose Pure Storage10 reasons why to choose Pure Storage
10 reasons why to choose Pure StorageMarketingArrowECS_CZ
 
Predictable Big Data Performance in Real-time
Predictable Big Data Performance in Real-timePredictable Big Data Performance in Real-time
Predictable Big Data Performance in Real-timeAerospike, Inc.
 
Backup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
Backup management with Ceph Storage - Camilo Echevarne, Félix BarbeiraBackup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
Backup management with Ceph Storage - Camilo Echevarne, Félix BarbeiraCeph Community
 
Why Software-Defined Storage Matters
Why Software-Defined Storage MattersWhy Software-Defined Storage Matters
Why Software-Defined Storage MattersColleen Corrice
 
Ceph optimized Storage / Global HW solutions for SDS, David Alvarez
Ceph optimized Storage / Global HW solutions for SDS, David AlvarezCeph optimized Storage / Global HW solutions for SDS, David Alvarez
Ceph optimized Storage / Global HW solutions for SDS, David AlvarezCeph Community
 
Red Hat Storage Day Boston - Supermicro Super Storage
Red Hat Storage Day Boston - Supermicro Super StorageRed Hat Storage Day Boston - Supermicro Super Storage
Red Hat Storage Day Boston - Supermicro Super StorageRed_Hat_Storage
 
Red Hat Storage Day Atlanta - Why Software Defined Storage Matters
Red Hat Storage Day Atlanta - Why Software Defined Storage MattersRed Hat Storage Day Atlanta - Why Software Defined Storage Matters
Red Hat Storage Day Atlanta - Why Software Defined Storage MattersRed_Hat_Storage
 
Red Hat Storage Day Atlanta - Persistent Storage for Linux Containers
Red Hat Storage Day Atlanta - Persistent Storage for Linux Containers Red Hat Storage Day Atlanta - Persistent Storage for Linux Containers
Red Hat Storage Day Atlanta - Persistent Storage for Linux Containers Red_Hat_Storage
 
Red Hat Storage Day Seattle: Stabilizing Petabyte Ceph Cluster in OpenStack C...
Red Hat Storage Day Seattle: Stabilizing Petabyte Ceph Cluster in OpenStack C...Red Hat Storage Day Seattle: Stabilizing Petabyte Ceph Cluster in OpenStack C...
Red Hat Storage Day Seattle: Stabilizing Petabyte Ceph Cluster in OpenStack C...Red_Hat_Storage
 
What's new with enterprise Redis - Leena Joshi, Redis Labs
What's new with enterprise Redis - Leena Joshi, Redis LabsWhat's new with enterprise Redis - Leena Joshi, Redis Labs
What's new with enterprise Redis - Leena Joshi, Redis LabsRedis Labs
 
Red Hat Storage Day Boston - Red Hat Gluster Storage vs. Traditional Storage ...
Red Hat Storage Day Boston - Red Hat Gluster Storage vs. Traditional Storage ...Red Hat Storage Day Boston - Red Hat Gluster Storage vs. Traditional Storage ...
Red Hat Storage Day Boston - Red Hat Gluster Storage vs. Traditional Storage ...Red_Hat_Storage
 

What's hot (20)

Distributing Data The Aerospike Way
Distributing Data The Aerospike WayDistributing Data The Aerospike Way
Distributing Data The Aerospike Way
 
Leveraging Big Data with Hadoop, NoSQL and RDBMS
Leveraging Big Data with Hadoop, NoSQL and RDBMSLeveraging Big Data with Hadoop, NoSQL and RDBMS
Leveraging Big Data with Hadoop, NoSQL and RDBMS
 
There are 250 Database products, are you running the right one?
There are 250 Database products, are you running the right one?There are 250 Database products, are you running the right one?
There are 250 Database products, are you running the right one?
 
Aerospike AdTech Gets Hacked in Lower Manhattan
Aerospike AdTech Gets Hacked in Lower ManhattanAerospike AdTech Gets Hacked in Lower Manhattan
Aerospike AdTech Gets Hacked in Lower Manhattan
 
Running a High Performance NoSQL Database on Amazon EC2 for Just $1.68/Hour
Running a High Performance NoSQL Database on Amazon EC2 for Just $1.68/HourRunning a High Performance NoSQL Database on Amazon EC2 for Just $1.68/Hour
Running a High Performance NoSQL Database on Amazon EC2 for Just $1.68/Hour
 
How to Get a Game Changing Performance Advantage with Intel SSDs and Aerospike
How to Get a Game Changing Performance Advantage with Intel SSDs and AerospikeHow to Get a Game Changing Performance Advantage with Intel SSDs and Aerospike
How to Get a Game Changing Performance Advantage with Intel SSDs and Aerospike
 
2017 DB Trends for Powering Real-Time Systems of Engagement
2017 DB Trends for Powering Real-Time Systems of Engagement2017 DB Trends for Powering Real-Time Systems of Engagement
2017 DB Trends for Powering Real-Time Systems of Engagement
 
10 reasons why to choose Pure Storage
10 reasons why to choose Pure Storage10 reasons why to choose Pure Storage
10 reasons why to choose Pure Storage
 
Predictable Big Data Performance in Real-time
Predictable Big Data Performance in Real-timePredictable Big Data Performance in Real-time
Predictable Big Data Performance in Real-time
 
Scaling HDFS at Xiaomi
Scaling HDFS at XiaomiScaling HDFS at Xiaomi
Scaling HDFS at Xiaomi
 
Backup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
Backup management with Ceph Storage - Camilo Echevarne, Félix BarbeiraBackup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
Backup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
 
Why Software-Defined Storage Matters
Why Software-Defined Storage MattersWhy Software-Defined Storage Matters
Why Software-Defined Storage Matters
 
Ceph optimized Storage / Global HW solutions for SDS, David Alvarez
Ceph optimized Storage / Global HW solutions for SDS, David AlvarezCeph optimized Storage / Global HW solutions for SDS, David Alvarez
Ceph optimized Storage / Global HW solutions for SDS, David Alvarez
 
AltaVault
AltaVaultAltaVault
AltaVault
 
Red Hat Storage Day Boston - Supermicro Super Storage
Red Hat Storage Day Boston - Supermicro Super StorageRed Hat Storage Day Boston - Supermicro Super Storage
Red Hat Storage Day Boston - Supermicro Super Storage
 
Red Hat Storage Day Atlanta - Why Software Defined Storage Matters
Red Hat Storage Day Atlanta - Why Software Defined Storage MattersRed Hat Storage Day Atlanta - Why Software Defined Storage Matters
Red Hat Storage Day Atlanta - Why Software Defined Storage Matters
 
Red Hat Storage Day Atlanta - Persistent Storage for Linux Containers
Red Hat Storage Day Atlanta - Persistent Storage for Linux Containers Red Hat Storage Day Atlanta - Persistent Storage for Linux Containers
Red Hat Storage Day Atlanta - Persistent Storage for Linux Containers
 
Red Hat Storage Day Seattle: Stabilizing Petabyte Ceph Cluster in OpenStack C...
Red Hat Storage Day Seattle: Stabilizing Petabyte Ceph Cluster in OpenStack C...Red Hat Storage Day Seattle: Stabilizing Petabyte Ceph Cluster in OpenStack C...
Red Hat Storage Day Seattle: Stabilizing Petabyte Ceph Cluster in OpenStack C...
 
What's new with enterprise Redis - Leena Joshi, Redis Labs
What's new with enterprise Redis - Leena Joshi, Redis LabsWhat's new with enterprise Redis - Leena Joshi, Redis Labs
What's new with enterprise Redis - Leena Joshi, Redis Labs
 
Red Hat Storage Day Boston - Red Hat Gluster Storage vs. Traditional Storage ...
Red Hat Storage Day Boston - Red Hat Gluster Storage vs. Traditional Storage ...Red Hat Storage Day Boston - Red Hat Gluster Storage vs. Traditional Storage ...
Red Hat Storage Day Boston - Red Hat Gluster Storage vs. Traditional Storage ...
 

Similar to ACID & CAP: Clearing CAP Confusion and Why C In CAP ≠ C in ACID

What enterprises can learn from Real Time Bidding
What enterprises can learn from Real Time BiddingWhat enterprises can learn from Real Time Bidding
What enterprises can learn from Real Time BiddingAerospike
 
What enterprises can learn from Real Time Bidding (RTB)
What enterprises can learn from Real Time Bidding (RTB)What enterprises can learn from Real Time Bidding (RTB)
What enterprises can learn from Real Time Bidding (RTB)bigdatagurus_meetup
 
Aerospike Architecture
Aerospike ArchitectureAerospike Architecture
Aerospike ArchitecturePeter Milne
 
You Snooze You Lose or How to Win in Ad Tech?
You Snooze You Lose or How to Win in Ad Tech?You Snooze You Lose or How to Win in Ad Tech?
You Snooze You Lose or How to Win in Ad Tech?Aerospike, Inc.
 
Aerospike Architecture
Aerospike ArchitectureAerospike Architecture
Aerospike ArchitecturePeter Milne
 
Brian Bulkowski. Aerospike
Brian Bulkowski. AerospikeBrian Bulkowski. Aerospike
Brian Bulkowski. AerospikeVolha Banadyseva
 
Top10 list planningpostgresdeployment.2014
Top10 list planningpostgresdeployment.2014Top10 list planningpostgresdeployment.2014
Top10 list planningpostgresdeployment.2014EDB
 
NoSQL in Real-time Architectures
NoSQL in Real-time ArchitecturesNoSQL in Real-time Architectures
NoSQL in Real-time ArchitecturesRonen Botzer
 
HPC Storage and IO Trends and Workflows
HPC Storage and IO Trends and WorkflowsHPC Storage and IO Trends and Workflows
HPC Storage and IO Trends and Workflowsinside-BigData.com
 
OpenStack and NetApp - Chen Reuven - OpenStack Day Israel 2017
OpenStack and NetApp - Chen Reuven - OpenStack Day Israel 2017OpenStack and NetApp - Chen Reuven - OpenStack Day Israel 2017
OpenStack and NetApp - Chen Reuven - OpenStack Day Israel 2017Cloud Native Day Tel Aviv
 
Introduction to Distributed Computing & Distributed Databases
Introduction to Distributed Computing & Distributed DatabasesIntroduction to Distributed Computing & Distributed Databases
Introduction to Distributed Computing & Distributed DatabasesShankar Iyer
 
Webinar: Cut Disaster Recovery Expenses – Improve Recovery Times
Webinar: Cut Disaster Recovery Expenses – Improve Recovery TimesWebinar: Cut Disaster Recovery Expenses – Improve Recovery Times
Webinar: Cut Disaster Recovery Expenses – Improve Recovery TimesStorage Switzerland
 
Apache Cassandra For Java Developers - Why, What and How. LJC @ UCL October 2014
Apache Cassandra For Java Developers - Why, What and How. LJC @ UCL October 2014Apache Cassandra For Java Developers - Why, What and How. LJC @ UCL October 2014
Apache Cassandra For Java Developers - Why, What and How. LJC @ UCL October 2014Johnny Miller
 
IMC Summit 2016 Breakout - Brian Bulkowski - NVMe, Storage Class Memory and O...
IMC Summit 2016 Breakout - Brian Bulkowski - NVMe, Storage Class Memory and O...IMC Summit 2016 Breakout - Brian Bulkowski - NVMe, Storage Class Memory and O...
IMC Summit 2016 Breakout - Brian Bulkowski - NVMe, Storage Class Memory and O...In-Memory Computing Summit
 
Lookout on Scaling Security to 100 Million Devices
Lookout on Scaling Security to 100 Million DevicesLookout on Scaling Security to 100 Million Devices
Lookout on Scaling Security to 100 Million DevicesScyllaDB
 
Why MySQL High Availability Matters
Why MySQL High Availability MattersWhy MySQL High Availability Matters
Why MySQL High Availability MattersMatt Lord
 
Flash changes everything?! Flash changes nothing...
Flash changes everything?! Flash changes nothing...Flash changes everything?! Flash changes nothing...
Flash changes everything?! Flash changes nothing...Proact Netherlands B.V.
 
It Sizing for Aras on Azure, Hybrid or On-site Deployments
It Sizing for Aras on Azure, Hybrid or On-site DeploymentsIt Sizing for Aras on Azure, Hybrid or On-site Deployments
It Sizing for Aras on Azure, Hybrid or On-site DeploymentsAras
 

Similar to ACID & CAP: Clearing CAP Confusion and Why C In CAP ≠ C in ACID (20)

What enterprises can learn from Real Time Bidding
What enterprises can learn from Real Time BiddingWhat enterprises can learn from Real Time Bidding
What enterprises can learn from Real Time Bidding
 
What enterprises can learn from Real Time Bidding (RTB)
What enterprises can learn from Real Time Bidding (RTB)What enterprises can learn from Real Time Bidding (RTB)
What enterprises can learn from Real Time Bidding (RTB)
 
Aerospike Architecture
Aerospike ArchitectureAerospike Architecture
Aerospike Architecture
 
You Snooze You Lose or How to Win in Ad Tech?
You Snooze You Lose or How to Win in Ad Tech?You Snooze You Lose or How to Win in Ad Tech?
You Snooze You Lose or How to Win in Ad Tech?
 
Aerospike Architecture
Aerospike ArchitectureAerospike Architecture
Aerospike Architecture
 
Brian Bulkowski. Aerospike
Brian Bulkowski. AerospikeBrian Bulkowski. Aerospike
Brian Bulkowski. Aerospike
 
Top10 list planningpostgresdeployment.2014
Top10 list planningpostgresdeployment.2014Top10 list planningpostgresdeployment.2014
Top10 list planningpostgresdeployment.2014
 
NoSQL in Real-time Architectures
NoSQL in Real-time ArchitecturesNoSQL in Real-time Architectures
NoSQL in Real-time Architectures
 
Oracle Storage a ochrana dat
Oracle Storage a ochrana datOracle Storage a ochrana dat
Oracle Storage a ochrana dat
 
HPC Storage and IO Trends and Workflows
HPC Storage and IO Trends and WorkflowsHPC Storage and IO Trends and Workflows
HPC Storage and IO Trends and Workflows
 
OpenStack and NetApp - Chen Reuven - OpenStack Day Israel 2017
OpenStack and NetApp - Chen Reuven - OpenStack Day Israel 2017OpenStack and NetApp - Chen Reuven - OpenStack Day Israel 2017
OpenStack and NetApp - Chen Reuven - OpenStack Day Israel 2017
 
Introduction to Distributed Computing & Distributed Databases
Introduction to Distributed Computing & Distributed DatabasesIntroduction to Distributed Computing & Distributed Databases
Introduction to Distributed Computing & Distributed Databases
 
SD Times - Docker v2
SD Times - Docker v2SD Times - Docker v2
SD Times - Docker v2
 
Webinar: Cut Disaster Recovery Expenses – Improve Recovery Times
Webinar: Cut Disaster Recovery Expenses – Improve Recovery TimesWebinar: Cut Disaster Recovery Expenses – Improve Recovery Times
Webinar: Cut Disaster Recovery Expenses – Improve Recovery Times
 
Apache Cassandra For Java Developers - Why, What and How. LJC @ UCL October 2014
Apache Cassandra For Java Developers - Why, What and How. LJC @ UCL October 2014Apache Cassandra For Java Developers - Why, What and How. LJC @ UCL October 2014
Apache Cassandra For Java Developers - Why, What and How. LJC @ UCL October 2014
 
IMC Summit 2016 Breakout - Brian Bulkowski - NVMe, Storage Class Memory and O...
IMC Summit 2016 Breakout - Brian Bulkowski - NVMe, Storage Class Memory and O...IMC Summit 2016 Breakout - Brian Bulkowski - NVMe, Storage Class Memory and O...
IMC Summit 2016 Breakout - Brian Bulkowski - NVMe, Storage Class Memory and O...
 
Lookout on Scaling Security to 100 Million Devices
Lookout on Scaling Security to 100 Million DevicesLookout on Scaling Security to 100 Million Devices
Lookout on Scaling Security to 100 Million Devices
 
Why MySQL High Availability Matters
Why MySQL High Availability MattersWhy MySQL High Availability Matters
Why MySQL High Availability Matters
 
Flash changes everything?! Flash changes nothing...
Flash changes everything?! Flash changes nothing...Flash changes everything?! Flash changes nothing...
Flash changes everything?! Flash changes nothing...
 
It Sizing for Aras on Azure, Hybrid or On-site Deployments
It Sizing for Aras on Azure, Hybrid or On-site DeploymentsIt Sizing for Aras on Azure, Hybrid or On-site Deployments
It Sizing for Aras on Azure, Hybrid or On-site Deployments
 

More from Aerospike, Inc.

Aerospike Hybrid Memory Architecture
Aerospike Hybrid Memory ArchitectureAerospike Hybrid Memory Architecture
Aerospike Hybrid Memory ArchitectureAerospike, Inc.
 
What the Spark!? Intro and Use Cases
What the Spark!? Intro and Use CasesWhat the Spark!? Intro and Use Cases
What the Spark!? Intro and Use CasesAerospike, Inc.
 
Get Started with Data Science by Analyzing Traffic Data from California Highways
Get Started with Data Science by Analyzing Traffic Data from California HighwaysGet Started with Data Science by Analyzing Traffic Data from California Highways
Get Started with Data Science by Analyzing Traffic Data from California HighwaysAerospike, Inc.
 
Storm Persistence and Real-Time Analytics
Storm Persistence and Real-Time AnalyticsStorm Persistence and Real-Time Analytics
Storm Persistence and Real-Time AnalyticsAerospike, Inc.
 
Aerospike: Key Value Data Access
Aerospike: Key Value Data AccessAerospike: Key Value Data Access
Aerospike: Key Value Data AccessAerospike, Inc.
 
Aerospike: Maximizing Performance
Aerospike: Maximizing PerformanceAerospike: Maximizing Performance
Aerospike: Maximizing PerformanceAerospike, Inc.
 
Getting The Most Out Of Your Flash/SSDs
Getting The Most Out Of Your Flash/SSDsGetting The Most Out Of Your Flash/SSDs
Getting The Most Out Of Your Flash/SSDsAerospike, Inc.
 
Configuring Aerospike - Part 2
Configuring Aerospike - Part 2 Configuring Aerospike - Part 2
Configuring Aerospike - Part 2 Aerospike, Inc.
 
Configuring Aerospike - Part 1
Configuring Aerospike - Part 1Configuring Aerospike - Part 1
Configuring Aerospike - Part 1Aerospike, Inc.
 
Big Data Learnings from a Vendor's Perspective
Big Data Learnings from a Vendor's PerspectiveBig Data Learnings from a Vendor's Perspective
Big Data Learnings from a Vendor's PerspectiveAerospike, Inc.
 

More from Aerospike, Inc. (10)

Aerospike Hybrid Memory Architecture
Aerospike Hybrid Memory ArchitectureAerospike Hybrid Memory Architecture
Aerospike Hybrid Memory Architecture
 
What the Spark!? Intro and Use Cases
What the Spark!? Intro and Use CasesWhat the Spark!? Intro and Use Cases
What the Spark!? Intro and Use Cases
 
Get Started with Data Science by Analyzing Traffic Data from California Highways
Get Started with Data Science by Analyzing Traffic Data from California HighwaysGet Started with Data Science by Analyzing Traffic Data from California Highways
Get Started with Data Science by Analyzing Traffic Data from California Highways
 
Storm Persistence and Real-Time Analytics
Storm Persistence and Real-Time AnalyticsStorm Persistence and Real-Time Analytics
Storm Persistence and Real-Time Analytics
 
Aerospike: Key Value Data Access
Aerospike: Key Value Data AccessAerospike: Key Value Data Access
Aerospike: Key Value Data Access
 
Aerospike: Maximizing Performance
Aerospike: Maximizing PerformanceAerospike: Maximizing Performance
Aerospike: Maximizing Performance
 
Getting The Most Out Of Your Flash/SSDs
Getting The Most Out Of Your Flash/SSDsGetting The Most Out Of Your Flash/SSDs
Getting The Most Out Of Your Flash/SSDs
 
Configuring Aerospike - Part 2
Configuring Aerospike - Part 2 Configuring Aerospike - Part 2
Configuring Aerospike - Part 2
 
Configuring Aerospike - Part 1
Configuring Aerospike - Part 1Configuring Aerospike - Part 1
Configuring Aerospike - Part 1
 
Big Data Learnings from a Vendor's Perspective
Big Data Learnings from a Vendor's PerspectiveBig Data Learnings from a Vendor's Perspective
Big Data Learnings from a Vendor's Perspective
 

Recently uploaded

Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 

Recently uploaded (20)

Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 

ACID & CAP: Clearing CAP Confusion and Why C In CAP ≠ C in ACID

  • 1. IN-MEMORY NOSQL, Now OPEN SOURCE! ACID & CAP: CLEARING CAP CONFUSION AND WHY C IN CAP ≠ C IN ACID SRINI V. SRINIVASAN, PH.D SUNIL SAYYAPARAJU Aerospike aer . o . spike [air-oh- spahyk] noun, 1. tip of a rocket that enhances speed and stability © 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 1
  • 2. REQUIREMENTS FOR INTERNET ENTERPRISES © 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 2
  • 3. Introduction to Advertising: Real-time Bidding © 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 3
  • 4. North American RTB speeds & feeds ■ 1 to 6 billion cookies tracked ■ Some companies track 200M, some track 20B ■ Each bidder has their own data pool ■ Data is your weapon ■ Recent searches, behavior, IP addresses ■ Audience clusters (K-cluster, K-means) from offline Hadoop ■ “Remnant” from Google, Yahoo is about 0.6 million / sec ■ Facebook exchange: about 0.6 million / sec ■ “other” is 0.5 million / sec Currently about 3.0M / sec in North American © 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 4
  • 5. Advertising requirements ■ 100 millisecond or 150 millisecond ad delivery ■ De-facto standard set in 2004 by Washington Post and others ■ North America is 70 to 90 milliseconds wide ■ Two or three data centers ■ Auction is limited to 30 milliseconds ■ Typically closes in 5 milliseconds ■ Winners have more data, better models – in 5 milliseconds © 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 5
  • 6. © 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 6 MILLIONS OF CONSUMERS BILLIONS OF DEVICES APP SERVERS DATA INSIGHTS WAREHOUSE Advertising Technology Stack WRITE CONTEXT OPERATIONAL DB WRITE REAL-TIME CONTEXT READ RECENT CONTENT PROFILE STORE Cookies, email, deviceID, IP address, location, segments, clicks, likes, tweets, search terms... REAL-TIME ANALYTICS Best sellers, top scores, trending tweets BATCH ANALYTICS Discover patterns, segment data: location patterns, audience affinity
  • 7. Financial Services – Intraday Positions ACCOUNT POSITIONS Read/Write Query Start of Day Data Loading End of Day Reconciliation LEGACY DATABASE (MAINFRAME) REAL-TIME DATA FEED © 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 7 XDR 10M+ user records Primary key access 1M+ TPS planned Finance App Records App RT Reporting App
  • 8. © 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 8 Social Media MYSQL or POSTGRES (ROTATIONAL DISK) Java application tier Data abstraction and sharding Recent user generated content MODIFIED REDIS (SSD ENABLED) Content and Historical data
  • 9. Modern Scale Out Architecture LOAD BALANCER Load balancer Simple stateless APP SERVERS CONTENT DELIVERY NETWORK Fast stateless IN-MEMORY NoSQL RESEARCH WAREHOUSE Long term cold storage HDFS BASED © 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 9
  • 10. © 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 10 ACID ■ A : Atomicity ■ All the changes will happen or none of them will happen ■ Aborted transactions are rolled back ■ C : Consistency ■ Database will adhere to all the consistency rules before and after every transaction ■ I.E, Data integrity is preserved before and after transaction ■ Consistency rules specified by constraints for check, foreign keys, etc. ■ I : Isolation ■ Defines what data will be shown to the transactions ■ Level-0/1/2/3 : Different types of locking semantics are used ■ D : Durability ■ Committed changes will never be lost ■ Usually achieved by writing both log & data
  • 11. © 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 11 CAP ■ C : Consistency ■ All the copies of the data are same in a distributed system with replication ■ A : Availability ■ The system is 100% responsive for reads and writes with strict SLA ■ It could return failure temporarily for a finite amount of time ■ P : Partition Tolerance ■ System continues to work (take reads/writes) even if some nodes cannot talk to each other ■ Brewer’s CAP THEOREM ■ Only two of the three (C, A, P) can be satisfied in any distributed system ■ COROLLARY ■ A system has to choose one of C or A in the event of partitioning
  • 12. C for Controversy ■ C in ACID != C in CAP ■ So, ACID is possible in distributed systems © 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 12
  • 13. ACID in Aerospike © 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 13 ■ Atomicity ■ Currently, single-record atomicity with replication and secondary indexes ■ Entire object including all the bins are changed together. “Copy on write” ■ If any portion of the update fails, the entire operation is aborted ■ Consistency ■ No RDBMS style constraints can be defined ■ Implied constraints are enforced, for example: ■ Secondary index queries need to be able to find objects after the write transaction completes. ■ Isolation ■ Supports read-committed isolation for long transactions like backup/restore, scans, etc. (level-1) ■ Provides Check-And-Set (CAS) operations ■ Durability ■ Achieved by writing to multiple replicas synchronously ■ E.g., if one node fails, other copies can be used ■ Effectively the level of durability is the same as using disk + log in traditional systems ■ Enhanced durability ■ Rackaware replication ■ Backup + Restore ■ XDR : Cross Datacenter Replication
  • 14. CAP in Aerospike © 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 14 ■ Consistency ■ Immediate consistency : All replicas are updated synchronously ■ Availability ■ New master/replicas will be assigned immediately on cluster state change ■ New master will start taking writes ■ Old replicas will server the reads ■ Partition-tolerance ■ Tries to avoid partitioning (secondary heartbeats) ■ Chooses Availability over consistency ■ Achieves eventual consistency when network restores ■ For internet applications (e.g., Real-time Bidding in Display advertising) ■ (AP + Eventual consistency) could be better than (CP - Availability) ■ For enterprise applications (e.g., Consumer access to Retail Banking Accounts) ■ C is paramount + A is very important, so partitions need to be avoided like the plague
  • 15. Partitions are Rare Brewer’s CAP Revisited – 2012 “First, because partitions are rare, there is little reason to forfeit C or A when the system is not partitioned.” © 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 15 ■ Fast heartbeats ■ Nodes close to each other in same data center and same switch/rack ■ Dual channel replicated heartbeats keeps system robust during network switch failures ■ Ensures fast cluster formation and reorganization using Paxos algorithm ■ Handling consistency during node failures ■ Generation count based conflict detection and resolution ■ Duplicate resolution for reads during cluster reorganization ■ Atomically moving data partitions from one cluster node to another
  • 16. SHARED-NOTHING SYSTEM:100% DATA AVAILABILITY ■ Every node in a cluster is identical, handles both transactions and long running tasks ■ Data is replicated synchronously with immediate consistency within the cluster ■ Data is replicated asynchronously © 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 16 across data centers OHIO Data Center
  • 17. Consistency and Availability Tradeoffs Brewer’s CAP Revisited – 2012 "Second, the choice between C and A can occur many times within the same system at very fine granularity; not only can subsystems make different choices, but the choice can change according to the operation or even the specific data or user involved." ■ Data path tradeoff during partition migration ■ Providing repeatable read results in higher read latency when multiple copies of data partitions are being merged ■ Disabling repeatable read could deliver slightly stale data during partition migrations ■ Cluster state tradeoff during cluster formation event ■ Individual cluster nodes can reject requests for brief periods (10 milliseconds) to ensure that a new cluster forms in a timely manner ■ Clients barely notice this and cluster reorganization events are rare © 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 17
  • 18. WRITING RELIABILY WITH HIGH PERFORMANCE © 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 18 master replica 1. Write sent to row master 2. Latch against simultaneous writes 3. Apply write to master memory and replica memory synchronously 4. Queue operations to disk 5. Signal completed transaction (optional storage commit wait) 6. Master applies conflict resolution policy (rollback/ rollforward) 1. Cluster discovers new node via gossip protocol 2. Paxos vote determines new data organization 3. Partition migrations scheduled 4. When a partition migration starts, write journal starts on destination 5. Partition moves atomically 6. Journal is applied and source data deleted transactions continue Writing with Immediate Consistency Adding a Node
  • 19. Tuning and High Performance Brewer’s CAP Revisited – 2012 "Finally, all three properties are more continuous than binary. Availability is obviously continuous from 0 to 100 percent, but there are also many levels of consistency, and even partitions have nuances, including disagreement within the system about whether a partition exists." © 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 19 ■ Tunable system ■ Repeatable read allows higher consistency while lowering availability ■ Heartbeat tuning helps system to continue to work in a robust manner ■ Increasing replication factor to more than 2 helps keep small highly used data consistent ■ Write all copies (sync and default) versus respond on master-complete (async) ■ High Performance ■ Vertical scale at 1M TPS / 10 TB node results in smaller clusters ■ Smaller clusters leads to more robust system enabling 100% uptime ■ Fast restart of servers (in seconds) minimizes the time when nodes go out of sync
  • 20. Partition Avoidance versus Partition Detection Brewer’s CAP Revisited – 2012 "Because partitions are rare, CAP should allow perfect C and A most of the time, but when partitions are present or perceived, a strategy that detects partitions and explicitly accounts for them is in order. This strategy should have three steps: detect partitions, enter an explicit partition mode that can limit some operations, and initiate a recovery process to restore consistency and compensate for mistakes made during a partition." ■ High Consistency in AP Mode ■ Avoid, as much as possible, the need to sacrifice consistency by minimizing formation of © 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 20 network partitions ■ High consistency using robust heartbeats, block for a few milliseconds during cluster formation, etc. ■ Tunable consistency using repeatable read setting to maintain or relax consistency as necessary ■ Smaller high capacity clusters hugely improves system behavior ■ High Availability in CP Mode ■ Static cluster to pre-define cluster size ■ Detect partition occurrence accurately and enforce appropriate policies to protect the data ■ Suspend partition migrations when the cluster is not whole ■ Some amount of availability needs to be sacrificed to maintain consistency ■ Block writes to partitions all of whose copies are not available in the partitioned cluster ■ Serve reads if the replica is alive ■ Not all reads/writes will fail, Only, writes meant for the nodes which are down will fail
  • 21. Brewer’s CAP Revisited – 2012 http://www.infoq.com/articles/cap-twelve- years-later-how-the-rules-have-changed © 2014 Aerospike, Inc. All rights reserved. Confidential. | ACID & CAP Webinar – July 1, 2014 | 21 Conclusion ■ Aerospike has been in development for about 6 years ■ Does not sacrifice consistency at the altar of availability and high performance ■ Has independently discovered and exploited some of the flexibility available to distributed systems as expressed in Brewer’s 2012 article ■ Attempts to provide the highest consistency, highest availability and highest performance possible in a distributed system ■ Aerospike is now Open Source ■ https://github.com/aerospike/aerospike-server ■ Download and check it out!

Editor's Notes

  1. Know who the Interaction is with - > Monitor 200+ Million US Consumers, 5+ Billion mobile devices and sensors Determine intent based on current context -> Page views, search terms, game state, last purchase, friends list, ads served, location Respond now, use big data for more accurate decisions -> Display the most relevant Ad -> Recommend the best product - > Deliver the richest gaming experience -> Eliminate fraud… Service can NEVER go down!
  2. Let’s start with what happened in internet advertising that kicked off a scale revolution. In 2000, Google launched AdWords. This was the ability to buy advertising on a search keyword, instead of a web page. Display advertising was still static. Prior to 2005, internet advertising was traded statically. A person bought a certain number of impressions on a website – like buying 1M impressions on the Yahoo home page. These were “rotated” using a variety of technologies, but the model fit the existing model of media buying. Advertising companies would say “you want an article in Car and Driver? And on Car and Driver’s website?” In April 2007, Yahoo bought RightMedia. March 2008, Google acquired DoubleClick. Both of these systems matching display advertisements with consumers, based on “cost per click”, and revolutionized the industry. Google’s position in the center of advertising, as a black box, was challenged by open bidding exchanges. Founders of both RightMedia and DoubleClick created several companies, and an open “auction system” to democratize the flow of impressions (from publishers) and ads (from advertisers). These companies realized that real time pricing – individual auctions – were the only fair system for determining price. The RTB system has been used to monitize “long tail” (remenant) advertising, which catches users wherever they might go. “Premium” advertising is still in high demand, and may or may not enter the real-time bidding system. At the time of Facebook’s public launch, they used the same closed system that Google Search uses. They eventually found they didn’t have enough advertising content, income was down, thus they opened facebook exchange. At the time, many technologists said about advertising: the algorithms are simple, it’s only scaling that’s hard. Exchanges that slowed down publisher websites were quickly avoided. The 150 millisecond rule was established, with advertising platforms needing to deliver ads in 150ms to an end user. Platform companies realized the critical nature of keeping that contract – if they failed, there would be fewer ads per page, and less revenue to be had. Although some might argue that there is too much display advertising today, this exchange capacity has become necessary for satisfying mobile.
  3. This is the technology stack that major advertising technology companies built To sustain the crushing load of aggregating the clicks and views from so many websites Individual retailers are now using this same tech stack, for the same reason They wish to present an experience, and include Analytics-based results
  4. Advertising was one of the pioneers, now other enterprises are understanding the need For this architecture. The traditional software & hardware providers scoff at these requirements and speed, but Cutting edge companies now understand that the technology is available, and are finding uses In their interprises. We are working with several financial services companies on providing an Intraday Positions Database. Instead of a cache / relational architecture, the requirement is around 1M TPS Velocity driven by: Faster trading Mobile customers Recommendations
  5. In China, Weibo, Alibaba, and TenCent are masters of agility, jumping on new trends in application design, at scale. A recent discussion with Pinterest showed a similar design. These companies see the benefits of the flexibility of in-memory NoSQL on the front application tier, but also abstract the application logic from database choice and scale using a separate layer. They also use this layer to separate users into “high traffic” and “low traffic”, and to allow different optimization patterns. An engineer at Weibo told me this was the most important optimization. Instead of allowing developers to directly access a database – making assumptions about that database’s performance and indexes – it is better to create an APPLICATION SPECIFIC – with DOMAIN KNOWLEDGE – layer.
  6. Here are the technologies and technology providers to watch in each area. Go through the App Layer in particular Research warehouse --- includes new systems like Spark --- easy to have multiple analytics systems, common in large deployments --- HDFS based systems