Full Stack High Availability
In an Eventually Consistent World
Ben Coverston
@bcoverston
DataStax Inc
OpenWest 2015
Who Am I?
• Ben Coverston
• Contributor: Apache Cassandra
• DSE Architect, DataStax Inc
Availability
“An analytics system based on Cassandra, for example,
where there are no single points of failure might be
considered fault tolerant, but if application-level data
migrations, software upgrades, or configuration changes
take an hour or more of downtime to complete, then the
system is not highly available.”
Failure Before ~2010
• The website can fail
• We have a farm!
What About the Middleware?
• Make it stateless
• Spin up a bunch of app servers
• Who cares if one fails, we can recover.
How to Play Kick The Can!
Kick the Can is an old game that has
been played through the generations. Not
to mention lots of fun!
How about the Database?
• Build a massive database server
• Scale up
How about the Database?
• We can back up to tape!
• MTTR Hours, possibly days.
• We can mirror!
• Possible loss of data
• Some loss of availability during recovery
• What if we have multiple Availability Zones?
• Geographical distribution of master-slave systems is not practical
No Good Option
• RDBMS Recovery is a Special Case
• RDBMS Was Not Built for Failure
• Once you shard it, you lose the benefits of an RDBMS
anyway
The Problem
• Traditional databases don’t scale
• Because information is context sensitive
• When relationships matter, so does time
• When you guarantee consistency, something else has
to give.
Eventual Consistency
“In an ideal world there would only be one consistency
model . . . Many systems during this time took the
approach that it was better to fail the complete system
than to break this transparency”[1]
— Werner Vogels
CTO amazon.com
The CAP Theorem
• Dr. Eric Brewer
• Consistency
• Availability
• Partition Tolerance
Tradeoffs!
• With Distribution we have to accept Partitioning
• If you want strong consistency, any failure of a master
will result in a partial outage.
Banks Use BASE, not ACID
• Basically Available, Soft State, Eventually Consistent
(BASE)
• Real time transactions are preferred, but ATMs fall back
to partitioned mode when disconnected.
Why Does This Work?
• Templars and Hospitallers were some of the first modern
bankers [4]
• ATM Networks try to be fully consistent
• Banks lose money when ATMs are not working
• Partitioned State Fallback
• Operations are commutative
• Risk is an actuarial problem
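The "operations are commutative" point can be made concrete with a small sketch. The class and method names below are illustrative assumptions, not taken from any banking system: deposits and withdrawals are recorded as signed deltas, so replicas that receive the same operations in different orders still converge on the same balance.

```java
import java.util.List;

// Hypothetical sketch: an account balance built from signed deltas.
// Addition is commutative and associative, so the order in which a
// replica applies the operations does not change the final balance.
final class CommutativeLedger {

    // Applies every delta (in cents) to the starting balance.
    static long apply(long startingBalance, List<Long> deltas) {
        long balance = startingBalance;
        for (long delta : deltas) {
            balance += delta;
        }
        return balance;
    }

    public static void main(String[] args) {
        // The same three operations, seen in two different orders.
        List<Long> seenByAtm = List.of(-2_000L, 5_000L, -500L);
        List<Long> seenByBranch = List.of(5_000L, -500L, -2_000L);

        System.out.println(apply(10_000L, seenByAtm));
        System.out.println(apply(10_000L, seenByBranch));
    }
}
```

The sketch deliberately has no overdraft check. A check like that depends on the order of operations, which is roughly where the actuarial risk comes in.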
Building On Eventual
Consistency
• Eventual Consistency means . . .
• Two queries at the same time could get different results
• If that’s bad for your application:
• Change your application logic
• Change your business model
• OR
• Don’t use eventual consistency
Seat Inventory
• Airlines are eventually consistent too!
• Aircraft are routinely oversold (because booking flights
is a distributed systems problem)
• People fail to show up, the airline makes money
• Too many people show up, the airline compensates a
few, the airline makes money
But What If?
• I Need Global Distribution
• Strong Consistency
• At Scale
Remember, Tradeoffs
Other Problems
• Real Time Analytics is a Challenge
• MapReduce Helps
• But it’s too slow for many things
Spark
• Not limited to MapReduce
• Directed Acyclic Graph (DAG)
• Easy-To-Understand Paradigm
• Compared to Hadoop
What is Spark?
• Open source since 2010 (Apache project since 2013)
• Fast
• 10-100x faster than Hadoop MapReduce
• Easy
• Scala, Java, Python APIs
• A lot less code (for you to write)
• Interactive Shell
Hadoop MapReduce
WordCount
package org.myorg;

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCount {

    // Emits (word, 1) for every token in each input line.
    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                context.write(word, one);
            }
        }
    }

    // Sums the counts emitted for each word.
    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Job.getInstance replaces the deprecated Job constructor.
        Job job = Job.getInstance(conf, "wordcount");
        job.setJarByClass(WordCount.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // Submit the job and wait for it to finish.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
Spark MapReduce for
WordCount
scala> sc.cassandraTable("newyork", "presidentlocations")
         .map(_.get[String]("location"))
         .flatMap(_.split(" "))
         .map((_, 1))
         .reduceByKey(_ + _)
         .collect()
res17: Array[(String, Int)] = Array((1,3), (House,4), (NYC,3), (Force,3), (White,4), (Air,3))
[Figure: a sample row ("1 white house") traced through the stages cassandraTable, get[String], _.split(), (_,1), and _ + _, ending with pairs such as (white, 1) and (house, 2)]
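For readers without a Spark cluster at hand, the same flatMap / map / reduce-by-key shape can be tried locally. The following is a rough equivalent using only java.util.stream, with made-up sample rows standing in for the Cassandra table. It is not Spark code and does not distribute any work.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Local, single-process analogue of the Spark pipeline (illustration only).
final class LocalWordCount {

    // Splits each location on spaces and counts how often each word occurs.
    static Map<String, Integer> count(List<String> locations) {
        return locations.stream()
                .flatMap(location -> Arrays.stream(location.split(" "))) // like flatMap(_.split(" "))
                .collect(Collectors.toMap(
                        word -> word,      // key: the word itself
                        word -> 1,         // like map((_, 1))
                        Integer::sum,      // like reduceByKey(_ + _)
                        TreeMap::new));    // sorted keys for stable printing
    }

    public static void main(String[] args) {
        // Sample rows are assumptions, not data from the talk.
        List<String> locations = List.of("White House", "White House", "NYC");
        System.out.println(count(locations));
    }
}
```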
Just In Memory?
• Map Reduce Style Analytics
• Not Just In-Memory (though it is really good for
‘iteration’)
Cassandra + Spark
• Cassandra-Spark Driver
• Open source
• https://github.com/datastax/cassandra-driver-spark
That’s Great, But
• I still need real-time
Distributed Aggregation (A
case study)
• Real time distributed counting is hard.
• At high volume, with geographic distribution.
• For most aggregations you only need sums and counts
Goals
• Provide near-real-time counts for increments
• Updates are non-monotonic
• Historical windowing
• Real time
What about Streaming?
• Storm or Spark Streaming can help for some use cases
• But to do it right, you have to get acks from an external
system (Spark), or block until the items get processed
by something you might have integrated (Storm).
• Blocking for stream processing could cause back
pressure, and loss of availability.
• Not “Real Time”
Distributed Counting
• Aggregate over time
• Per shard
• Save deltas
• Timestamps
• Because timestamps can come out of order
• Arrival time is important
Compromises
• Create a C* plugin to do aggregation (daemon,
singleton)
• Do it locally, on each node (on the coordinator).
• Create a separate API to query for aggregation
• Create real-time aggregates on the fly
• Store snapshotted data in C* for windowed
aggregation (1s, 1h, 1d, 1w, 1m).
Deltas
• Deltas are stored
• Aggregates have to be composed of commutative
operations (because we cannot recalculate everything,
every time)
• Cumulative Average is a good example of a compatible
streaming operation.
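A minimal sketch of why a cumulative average fits this model, assuming it is stored as a (sum, count) pair rather than as the average itself. The class name is hypothetical. Because merging only adds sums and counts, partial results from different nodes can be combined in any order without recalculating from the raw data.

```java
// Hypothetical sketch: a running average kept as (sum, count) so that
// partial results can be merged in any order without replaying history.
final class RunningAverage {
    final double sum;
    final long count;

    RunningAverage(double sum, long count) {
        this.sum = sum;
        this.count = count;
    }

    // Folds one new observation into the aggregate.
    RunningAverage add(double value) {
        return new RunningAverage(sum + value, count + 1);
    }

    // Combines two partial aggregates; a.merge(b) equals b.merge(a).
    RunningAverage merge(RunningAverage other) {
        return new RunningAverage(sum + other.sum, count + other.count);
    }

    // Returns 0.0 for an empty aggregate to avoid dividing by zero.
    double average() {
        return count == 0 ? 0.0 : sum / count;
    }

    public static void main(String[] args) {
        RunningAverage nodeA = new RunningAverage(0, 0).add(2.0).add(4.0);
        RunningAverage nodeB = new RunningAverage(0, 0).add(9.0);

        System.out.println(nodeA.merge(nodeB).average());
        System.out.println(nodeB.merge(nodeA).average());
    }
}
```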
Cassandra Counters
• Distributed Counting is Hard
• But a tractable problem
• In fact Cassandra already solved this problem
ARRIVED WINDOW DELTA COUNT
1 0 50 40
2 0 40 30
1 1 10 20
2 2 30 40
ARRIVED WINDOW DELTA COUNT
1 0 50 40
1 0 40 30
1 1 10 20
2 1 30 40
ARRIVED WINDOW DELTA COUNT
1 0 90 70
1 1 10 20
2 1 30 40
ARRIVED TIMESTAMP DELTA COUNT
1 0 50 40
1 0 40 30
1 1 10 20
2 1 30 40
ARRIVED TIME DELTA COUNT
1 0 90 70
1 1 10 20
2 1 30 40
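The step between the tables above, where two rows with the same ARRIVED and WINDOW values collapse into one, might look like the following sketch. The class, field, and method names are assumptions made for illustration. The only rule taken from the tables is that DELTA and COUNT are summed for rows sharing a key.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the compaction step shown in the tables:
// rows that share (arrived, window) collapse into one row whose
// delta and count are the sums of the originals.
final class DeltaCompaction {

    // One stored row; field names mirror the table headers.
    static final class Row {
        final int arrived;
        final int window;
        final long delta;
        final long count;

        Row(int arrived, int window, long delta, long count) {
            this.arrived = arrived;
            this.window = window;
            this.delta = delta;
            this.count = count;
        }

        @Override
        public String toString() {
            return arrived + " " + window + " " + delta + " " + count;
        }
    }

    static List<Row> compact(List<Row> rows) {
        // LinkedHashMap keeps first-seen key order, so output order is stable.
        Map<List<Integer>, Row> merged = new LinkedHashMap<>();
        for (Row row : rows) {
            List<Integer> key = List.of(row.arrived, row.window);
            merged.merge(key, row, (a, b) ->
                    new Row(a.arrived, a.window, a.delta + b.delta, a.count + b.count));
        }
        return new ArrayList<>(merged.values());
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(
                new Row(1, 0, 50, 40),
                new Row(1, 0, 40, 30),
                new Row(1, 1, 10, 20),
                new Row(2, 1, 30, 40));
        compact(rows).forEach(System.out::println);
    }
}
```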
Aggregation Service
Average T(1:1) -> 0.923
But Nodes Can Fail
• Snapshotted data is stored with RF > 1 (similar to
RDDs)
• Aggregation is done by a ‘fat client’ running on each
node.
• If a network partition happens, the real-time counts
may be inconsistent.
• In case of a node failure, the counts may need to be
repaired
Network Partition
[Figure: four C* nodes split by a network partition; the two sides report different values, Average T(1:1) -> 0.923 and Average T(1:1) -> 2.345]
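[Figure: Shard1 holds C1:3 and Shard2 holds C1:4; with both reachable, 3+4 gives C1:7]
The Partition Problem
[Figure: with the shards partitioned, one side reads 4 (C1:4) and the other reads 3 (C1:3)]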
Lambda Architecture
• Real Time, should be Real Time
• Analytics is Batch
• Real time layer depends on processing incoming
streams, or pre-aggregated data
• In a non-trivial system, CAP still affects the design
The Partition Problem
[Figure: Shard1 and Shard2 each store both per-shard values (S1:C1:3 and S2:C1:4), so either side of a partition can compute 3+4 = C1:7]
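One way to read the diagram: every replica keeps the latest known contribution from each shard, and the counter value is the sum of those contributions. The sketch below is an assumption about how that could be modelled, not the plugin's actual code. In particular, the per-shard version number used to pick the newer copy is a hypothetical device.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: every replica keeps the latest known value for
// each shard of counter C1, so either side of a partition can still
// answer with the last total it saw.
final class ShardedCounter {

    // A shard's contribution plus a version used to pick the newer copy.
    static final class Entry {
        final long version;
        final long value;

        Entry(long version, long value) {
            this.version = version;
            this.value = value;
        }
    }

    private final Map<String, Entry> shards = new HashMap<>();

    // Records a new value for one shard, bumping its version.
    // Only the owner of a shard is expected to call this for that shard.
    void set(String shardId, long value) {
        Entry old = shards.get(shardId);
        long nextVersion = (old == null) ? 1 : old.version + 1;
        shards.put(shardId, new Entry(nextVersion, value));
    }

    // Pulls in another replica's view, keeping the newer entry per shard.
    void mergeFrom(ShardedCounter other) {
        other.shards.forEach((id, theirs) ->
                shards.merge(id, theirs, (mine, incoming) ->
                        incoming.version > mine.version ? incoming : mine));
    }

    // The counter's value is the sum of all shard contributions.
    long total() {
        return shards.values().stream().mapToLong(e -> e.value).sum();
    }

    public static void main(String[] args) {
        ShardedCounter shard1 = new ShardedCounter();
        ShardedCounter shard2 = new ShardedCounter();
        shard1.set("S1", 3);
        shard2.set("S2", 4);

        // Replicate in both directions before any partition occurs.
        shard1.mergeFrom(shard2);
        shard2.mergeFrom(shard1);

        System.out.println(shard1.total());
        System.out.println(shard2.total());
    }
}
```

During a partition each side keeps serving the last total it saw. Updates made in the meantime only become visible on the other side once mergeFrom runs again, so the answer is available but possibly stale.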
Design Compromises
• Counter Increment could fail
• Data Insert could fail
• Either could result in an over/under count
Counting Inconsistency
• Similar to Sharded Counters
• Background Task
• Watch for failed mutations
• Recalculate windows when failed mutations happen
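The repair idea above can be sketched as follows, under the assumption that the stored deltas are the source of truth and that a failed mutation simply marks its window as dirty. All names here are hypothetical. Only the dirty windows are recomputed and every other total is left untouched.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the background repair task: when a mutation
// fails, its window is marked dirty and later recomputed from the
// deltas that were actually stored.
final class WindowRepair {

    // A stored delta and the window it belongs to.
    static final class Delta {
        final int window;
        final long amount;

        Delta(int window, long amount) {
            this.window = window;
            this.amount = amount;
        }
    }

    // Rebuilds the totals for the dirty windows only; other totals are left alone.
    static Map<Integer, Long> repair(Map<Integer, Long> totals,
                                     List<Delta> storedDeltas,
                                     Set<Integer> dirtyWindows) {
        Map<Integer, Long> repaired = new HashMap<>(totals);
        for (int window : dirtyWindows) {
            repaired.put(window, 0L);
        }
        for (Delta delta : storedDeltas) {
            if (dirtyWindows.contains(delta.window)) {
                repaired.merge(delta.window, delta.amount, Long::sum);
            }
        }
        return repaired;
    }

    public static void main(String[] args) {
        // Window 1 was over-counted (say, a retried increment); window 0 is fine.
        Map<Integer, Long> totals = Map.of(0, 90L, 1, 55L);
        List<Delta> stored = List.of(new Delta(0, 50), new Delta(0, 40),
                                     new Delta(1, 10), new Delta(1, 30));
        System.out.println(repair(totals, stored, Set.of(1)));
    }
}
```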
The Partition Decision [3]
• Cancel the operation, and decrease availability
• Proceed with the operation, and risk inconsistency
If You Accept Eventual
Consistency
• Real time aggregates may be inaccurate
• Due to a network partition (may persist for hours)
• Due to latency (speed of light, network latency, small
number of ms)
• In this system Historical aggregates are more reliable,
because the deltas get written (and replicated) every
second.
Designing for Eventual
Consistency
• Partitions happen in the real world (not a myth)
• If you are building a distributed system, you have to
account for possible failure.
• Define system behavior under failure conditions
• Make adjustments, set expectations
Call To Action
• When building distributed systems
• Reason about concurrency
• Avoid Locking (if at all possible)
• Learn about Commutative Replicated Data Types (CRDTs)
• Learn about MultiVersion Concurrency Control (MVCC)
• Learn Functional Programming
• Scala, Clojure (lisp), whatever
• Functional programming makes distributed programming better
Things to Look At
• Cassandra (fully distributed database)
• Actor Pattern
• akka-cluster (fully distributed compute platform)
References
[1] http://www.allthingsdistributed.com/2007/12/eventually_consistent.html
[2] http://en.wikipedia.org/wiki/CAP_theorem
[3] http://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed
[4] http://en.wikipedia.org/wiki/History_of_banking
Thank You!
ben@datastax.com
@bcoverston
We Are Hiring!

 
The impact of cloud NSBCon NY by Yves Goeleven
The impact of cloud NSBCon NY by Yves GoelevenThe impact of cloud NSBCon NY by Yves Goeleven
The impact of cloud NSBCon NY by Yves Goeleven
 
Chirp 2010: Scaling Twitter
Chirp 2010: Scaling TwitterChirp 2010: Scaling Twitter
Chirp 2010: Scaling Twitter
 
Building FoundationDB
Building FoundationDBBuilding FoundationDB
Building FoundationDB
 
Latest (storage IO) patterns for cloud-native applications
Latest (storage IO) patterns for cloud-native applications Latest (storage IO) patterns for cloud-native applications
Latest (storage IO) patterns for cloud-native applications
 
Interactions complicate debugging
Interactions complicate debuggingInteractions complicate debugging
Interactions complicate debugging
 
Solving k8s persistent workloads using k8s DevOps style
Solving k8s persistent workloads using k8s DevOps styleSolving k8s persistent workloads using k8s DevOps style
Solving k8s persistent workloads using k8s DevOps style
 

Recently uploaded

May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
Adele Miller
 
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
Alina Yurenko
 
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissancesAtelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Neo4j
 
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket ManagementUtilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate
 
Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
NYGGS Automation Suite
 
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdfAutomated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
timtebeek1
 
Transform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR SolutionsTransform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR Solutions
TheSMSPoint
 
OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024
OpenMetadata
 
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteAI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
Google
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
Paco van Beckhoven
 
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxTop Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
rickgrimesss22
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOMLORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
lorraineandreiamcidl
 
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Mind IT Systems
 
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdfVitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke
 
APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)
Boni García
 
SWEBOK and Education at FUSE Okinawa 2024
SWEBOK and Education at FUSE Okinawa 2024SWEBOK and Education at FUSE Okinawa 2024
SWEBOK and Education at FUSE Okinawa 2024
Hironori Washizaki
 
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...
kalichargn70th171
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
Fermin Galan
 
GraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph TechnologyGraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph Technology
Neo4j
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 

Recently uploaded (20)

May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
 
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
 
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissancesAtelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissances
 
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket ManagementUtilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
 
Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
 
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdfAutomated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
 
Transform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR SolutionsTransform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR Solutions
 
OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024
 
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteAI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
 
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxTop Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOMLORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
 
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
 
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdfVitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdf
 
APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)
 
SWEBOK and Education at FUSE Okinawa 2024
SWEBOK and Education at FUSE Okinawa 2024SWEBOK and Education at FUSE Okinawa 2024
SWEBOK and Education at FUSE Okinawa 2024
 
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
 
GraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph TechnologyGraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph Technology
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 

Open west 2015 talk ben coverston

  • 1. Full Stack High Availability In an Eventually Consistent World Ben Coverston @bcoverston DataStax Inc OpenWest 2015
  • 2. Who Am I? • Ben Coverston • Contributor: Apache Cassandra • DSE Architect, DataStax Inc
  • 5. “An analytics system based on Cassandra, for example, where there are no single points of failure might be considered fault tolerant, but if application-level data migrations, software upgrades, or configuration changes take an hour or more of downtime to complete, then the system is not highly available.”
  • 6. Failure Before ~2010 • The website can fail • We have a farm!
  • 8. What About the Middleware? • Make it stateless • Spin up a bunch of app servers • Who cares if one fails, we can recover.
  • 10. How to Play Kick The Can! Kick the Can is an old game that has been played through the generations. Not to mention lots of fun!
  • 11. How about the Database? • Build a massive database server • Scale up
  • 12. How about the Database? • We can back up to tape! • MTTR: hours, possibly days. • We can mirror! • Possible loss of data • Some loss of availability during recovery • What if we have multiple Availability Zones? • Geographical distribution of master-slave systems is not practical
  • 14. No Good Option • RDBMS Recovery is a Special Case • RDBMS Was Not Built for Failure • Once you shard it, you lose the benefits of an RDBMS anyway
  • 15. The Problem • Traditional databases don’t scale • Because information is context sensitive • When relationships matter, so does time • When you guarantee consistency, something else has to give.
  • 16. Eventual Consistency “In an ideal world there would only be one consistency model . . . Many systems during this time took the approach that it was better to fail the complete system than to break this transparency”[1] — Werner Vogels CTO amazon.com
  • 17. The CAP Theorem • Dr. Eric Brewer • Consistency • Availability • Partition Tolerance
  • 18. Tradeoffs! • With Distribution we have to accept Partitioning • If you want strong consistency, any failure of a master will result in a partial outage.
  • 19. Banks Use BASE, not ACID • Basically Available, Soft State, Eventually Consistent (BASE) • Real time transactions are preferred, but ATMs fall back to partitioned mode when disconnected.
  • 20. Why Does This Work? • Templars and Hospitallers were some of the first modern bankers [4] • ATM networks try to be fully consistent • Banks lose money when ATMs are not working • Partitioned State Fallback • Operations are commutative • Risk is an actuarial problem
  • 21. Building On Eventual Consistency • Eventual Consistency means . . . • Two queries at the same time could get different results • If that’s bad for your application: • Change your application logic • Change your business model • OR • Don’t use eventual consistency
  • 22. Seat Inventory • Airlines are eventually consistent too! • Aircraft are routinely oversold (because booking flights is a distributed systems problem) • People fail to show up, the airline makes money • Too many people show up, the airline compensates a few, the airline makes money
  • 23. But What If? • I Need Global Distribution • Strong Consistency • At Scale
  • 26. Other Problems • Real Time Analytics is a Challenge • MapReduce Helps • But it’s too slow for many things
  • 27. Spark • Not limited to MapReduce • Directed Acyclic Graph • Easy-To-Understand Paradigm • Compared to Hadoop
  • 28. What is Spark? • Open source since 2010, an Apache project since 2013 • Fast • 10-100x faster than Hadoop MapReduce • Easy • Scala, Java, Python APIs • A lot less code (for you to write) • Interactive Shell
  • 29. Hadoop MapReduce WordCount

        package org.myorg;

        import java.io.IOException;
        import java.util.*;

        import org.apache.hadoop.fs.Path;
        import org.apache.hadoop.conf.*;
        import org.apache.hadoop.io.*;
        import org.apache.hadoop.mapreduce.*;
        import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
        import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
        import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
        import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

        public class WordCount {

            // Emits (word, 1) for every token in each input line.
            public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
                private final static IntWritable one = new IntWritable(1);
                private Text word = new Text();

                public void map(LongWritable key, Text value, Context context)
                        throws IOException, InterruptedException {
                    String line = value.toString();
                    StringTokenizer tokenizer = new StringTokenizer(line);
                    while (tokenizer.hasMoreTokens()) {
                        word.set(tokenizer.nextToken());
                        context.write(word, one);
                    }
                }
            }

            // Sums the counts emitted for each word.
            public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
                public void reduce(Text key, Iterable<IntWritable> values, Context context)
                        throws IOException, InterruptedException {
                    int sum = 0;
                    for (IntWritable val : values) {
                        sum += val.get();
                    }
                    context.write(key, new IntWritable(sum));
                }
            }

            public static void main(String[] args) throws Exception {
                Configuration conf = new Configuration();
                Job job = new Job(conf, "wordcount");

                job.setOutputKeyClass(Text.class);
                job.setOutputValueClass(IntWritable.class);

                job.setMapperClass(Map.class);
                job.setReducerClass(Reduce.class);

                job.setInputFormatClass(TextInputFormat.class);
                job.setOutputFormatClass(TextOutputFormat.class);

                FileInputFormat.addInputPath(job, new Path(args[0]));
                FileOutputFormat.setOutputPath(job, new Path(args[1]));

                // The slide text is cut off after the line above. Submitting the job
                // and closing the class are added here so the listing compiles.
                job.waitForCompletion(true);
            }
        }
  • 30. Spark MapReduce for WordCount

        scala> sc.cassandraTable("newyork", "presidentlocations")
                 .map(_.get[String]("location"))
                 .flatMap(_.split(" "))
                 .map((_, 1))
                 .reduceByKey(_ + _)
                 .toArray
        res17: Array[(String, Int)] = Array((1,3), (House,4), (NYC,3), (Force,3), (White,4), (Air,3))

        [Diagram: a sample row "white house" passes through the stages cassandraTable, get[String], _.split(), (_,1) and _ + _, producing pairs such as (white, 1) and (house, 1) that reduce to counts such as (house, 2).]
  • 31. Just In Memory? • Map Reduce Style Analytics • Not Just In-Memory (though it is really good for ‘iteration’)
  • 32. Cassandra + Spark • Cassandra-Spark Driver • Open source • https://github.com/datastax/cassandra-driver-spark
  • 33. That’s Great, But • I still need real-time
  • 34. Distributed Aggregation (A case study) • Real time distributed counting is hard. • At high volume, with geographic distribution. • For most aggregations you only need sums and counts
  • 35. Goals • Provide near-real time counts for increments • Updates are non-monotonic • Historical windowing • Real time
  • 36. What about Streaming? • Storm or Spark Streaming can help for some use cases • But to do it right, you have to get acks from an external system (Spark), or block until the items are processed by something you have integrated (Storm). • Blocking for stream processing could cause back pressure and loss of availability. • Not “Real Time”
  • 37. Distributed Counting • Aggregate over time • Per shard • Save deltas • Timestamps • Because timestamps can come out of order • Arrival time is important
  • 38. Compromises • Create a C* plugin to do aggregation (daemon, singleton) • Do it locally, on each node (on the coordinator). • Create a separate API to query for aggregation • Create real-time aggregates on the fly • Store snapshotted data in C* for windowed aggregation (1s, 1h, 1d, 1w, 1m).
  • 39. Deltas • Deltas are stored • Aggregates have to be composed of commutative operations (because we cannot recalculate everything, every time) • Cumulative Average is a good example of a compatible streaming operation.
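
The requirement that aggregates be built from commutative operations can be sketched in a few lines. This is only an illustration, not the plugin described in the talk: the `DeltaAverage` and `Partial` names are hypothetical, and it assumes each stored delta carries a sum and a count, so that an average is derived at read time rather than stored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class DeltaAverage {
    /** Hypothetical partial aggregate: a sum of values and how many values it covers. */
    static final class Partial {
        final long sum;
        final long count;

        Partial(long sum, long count) {
            this.sum = sum;
            this.count = count;
        }

        /** Addition is commutative and associative, so merge order does not matter. */
        Partial merge(Partial other) {
            return new Partial(sum + other.sum, count + other.count);
        }

        /** The average is computed from the merged sum and count, never merged directly. */
        double average() {
            return count == 0 ? 0.0 : (double) sum / count;
        }
    }

    /** Folds any number of deltas into one aggregate, starting from the empty aggregate. */
    static Partial mergeAll(List<Partial> deltas) {
        Partial acc = new Partial(0, 0);
        for (Partial d : deltas) {
            acc = acc.merge(d);
        }
        return acc;
    }

    public static void main(String[] args) {
        List<Partial> deltas = Arrays.asList(
            new Partial(50, 40), new Partial(40, 30), new Partial(10, 20));
        List<Partial> reversed = new ArrayList<>(deltas);
        Collections.reverse(reversed); // simulate deltas arriving in a different order

        // Same aggregate regardless of arrival order.
        System.out.println(mergeAll(deltas).average() == mergeAll(reversed).average()); // prints "true"
    }
}
```

Keeping the sum and count separate is what makes the average composable: two averages cannot be merged without knowing how many samples each one covers.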
  • 40. Cassandra Counters • Distributed Counting is Hard • But a tractable problem • In fact Cassandra already solved this problem
  • 41.
        ARRIVED  WINDOW  DELTA  COUNT
        1        0       50     40
        2        0       40     30
        1        1       10     20
        2        2       30     40
  • 42.
        Before:
        ARRIVED  WINDOW  DELTA  COUNT
        1        0       50     40
        1        0       40     30
        1        1       10     20
        2        1       30     40

        After merging rows that share ARRIVED and WINDOW:
        ARRIVED  WINDOW  DELTA  COUNT
        1        0       90     70
        1        1       10     20
        2        1       30     40
  • 43.
        Before:
        ARRIVED  TIMESTAMP  DELTA  COUNT
        1        0          50     40
        1        0          40     30
        1        1          10     20
        2        1          30     40

        After merging:
        ARRIVED  TIME  DELTA  COUNT
        1        0     90     70
        1        1     10     20
        2        1     30     40

        Aggregation Service: Average T(1:1) -> 0.923
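
The merge shown in the tables above, where rows that share an arrival bucket and a window are summed into one row, can be sketched as follows. The `WindowRollup` and `Row` names and the string key are hypothetical choices for illustration; the talk does not show how its aggregation service is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class WindowRollup {
    /** Hypothetical stored row: arrival bucket, window id, delta, and count. */
    static final class Row {
        final int arrived;
        final int window;
        final long delta;
        final long count;

        Row(int arrived, int window, long delta, long count) {
            this.arrived = arrived;
            this.window = window;
            this.delta = delta;
            this.count = count;
        }
    }

    /** Sums DELTA and COUNT for rows with the same (arrived, window), keeping first-seen order. */
    static List<Row> rollUp(List<Row> rows) {
        Map<String, Row> merged = new LinkedHashMap<>();
        for (Row r : rows) {
            String key = r.arrived + ":" + r.window;
            merged.merge(key, r,
                (a, b) -> new Row(a.arrived, a.window, a.delta + b.delta, a.count + b.count));
        }
        return new ArrayList<>(merged.values());
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row(1, 0, 50, 40),
            new Row(1, 0, 40, 30),
            new Row(1, 1, 10, 20),
            new Row(2, 1, 30, 40));

        for (Row r : rollUp(rows)) {
            System.out.println(r.arrived + " " + r.window + " " + r.delta + " " + r.count);
        }
        // prints:
        // 1 0 90 70
        // 1 1 10 20
        // 2 1 30 40
    }
}
```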
  • 44. But Nodes Can Fail • Snapshotted data is stored with RF > 1 (similar to RDDs) • Aggregation is done by a ‘fat client’ running on each node. • If a network partition happens, the real-time counts may be inconsistent. • In case of a node failure, the counts may need to be repaired
  • 45. Network Partition [Diagram: four C* nodes split by a network partition. One side reports Average T(1:1) -> 0.923, the other reports Average T(1:1) -> 2.345.]
  • 49. Lambda Architecture • Real Time, should be Real Time • Analytics is Batch • Real time layer depends on processing incoming streams, or pre-aggregated data • In a non-trivial system, CAP still affects the design
  • 51. Design Compromises • Counter Increment could fail • Data Insert could fail • Either could result in an over/under count
  • 52. Counting Inconsistency • Similar to Sharded Counters • Background Task • Watch for failed mutations • Recalculate windows when failed mutations happen
  • 53. The Partition Decision [3] • Cancel the operation, and decrease availability • Proceed with the operation, and risk inconsistency
  • 55. If You Accept Eventual Consistency • Real time aggregates may be inaccurate • Due to a network partition (may persist for hours) • Due to latency (speed of light, network latency, a small number of ms) • In this system, historical aggregates are more reliable, because the deltas get written (and replicated) every second.
  • 56. Designing for Eventual Consistency • Partitions happen in the real world (not a myth) • If you are building a distributed system, you have to account for possible failure. • Define system behavior under failure conditions • Make adjustments, set expectations
  • 57. Call To Action • When building distributed systems • Reason about concurrency • Avoid Locking (if at all possible) • Learn about Commutative Replicated Data Types (CRDTs) • Learn about MultiVersion Concurrency Control (MVCC) • Learn Functional Programming • Scala, Clojure (a Lisp), whatever • Functional programming makes distributed programming better
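
As a starting point for the CRDT item above, here is a minimal sketch of a grow-only counter (commonly called a G-Counter), one of the simplest CRDTs. It is a generic textbook construction, not code from the talk or from Cassandra, and the class and method names are chosen only for illustration. Each replica increments its own slot, and merging takes the per-replica maximum, so replicas converge no matter how often or in what order they exchange state.

```java
import java.util.HashMap;
import java.util.Map;

final class GCounter {
    private final String nodeId;
    // One monotonically growing count per replica id.
    private final Map<String, Long> counts = new HashMap<>();

    GCounter(String nodeId) {
        this.nodeId = nodeId;
    }

    /** A replica only ever increases its own slot. */
    void increment(long n) {
        if (n < 0) {
            throw new IllegalArgumentException("a grow-only counter cannot decrease");
        }
        counts.merge(nodeId, n, Long::sum);
    }

    /** The counter value is the sum over all replica slots. */
    long value() {
        long total = 0;
        for (long c : counts.values()) {
            total += c;
        }
        return total;
    }

    /** Merge keeps the larger count per replica: commutative, associative, and idempotent. */
    void merge(GCounter other) {
        for (Map.Entry<String, Long> e : other.counts.entrySet()) {
            counts.merge(e.getKey(), e.getValue(), Long::max);
        }
    }

    public static void main(String[] args) {
        GCounter a = new GCounter("a");
        GCounter b = new GCounter("b");
        a.increment(3); // applied on replica a during a partition
        b.increment(2); // applied on replica b during the same partition

        a.merge(b);
        b.merge(a);
        a.merge(b); // merging again changes nothing

        System.out.println(a.value() + " " + b.value()); // prints "5 5"
    }
}
```

Because merge is idempotent as well as commutative, a replica can safely resend its whole state after a partition heals without risking a double count.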
  • 58. Things to Look At • Cassandra (fully distributed database) • Actor Pattern • akka-cluster (fully distributed compute platform)
  • 59. References
        [1] http://www.allthingsdistributed.com/2007/12/eventually_consistent.html
        [2] http://en.wikipedia.org/wiki/CAP_theorem
        [3] http://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed
        [4] http://en.wikipedia.org/wiki/History_of_banking