Your Database Wants
to Kill You
Kevin Lawver - 11/1/2013

1
Hi, I’m Kevin.

2
3

- I work at Rails Machine
- We do ops
- Lots of ops on lots of different kinds of databases
- Enough introductions, let’s get on with the murder!
Databases have been
around since before
most of us were born.

4

- So they’re well understood
- and well despised
- and crusty
There’s been a
revolution the past few
years.

5
Getting away from fully
relational databases, to
something... odd.

6
But, don’t get
comfortable.

7
Because your database
wants...

8
TO KILL YOU!

9
really, it does.

10
The Old School

11
Relational Databases

12
MySQL, PostgreSQL,
Oracle, Sybase, etc

13
This is what you’re
used to.

14
Tables, relationships,
foreign keys, SQL, etc.

15

- And lots of rules
ACID

16

The set of rules relational databases follow to assure the data gets where it needs to go and
is consistent.
They’re fine for a certain kind of workload.
Atomicity

17

Transactions are all or nothing. If any part of the transaction fails, the WHOLE thing has to
fail and roll back.
That means a lot of locking, which can become a performance problem.
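A minimal sketch of all-or-nothing behaviour, using Python's built-in sqlite3 module rather than any of the databases named in this talk. The accounts table, the names in it and the transfer helper are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move money between two rows; either both updates land or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src))
            if fail_midway:
                raise RuntimeError("simulated crash between the two updates")
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst))
        return True
    except RuntimeError:
        return False


def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


# A transfer that dies halfway leaves both balances untouched.
print(transfer(conn, "alice", "bob", 30, fail_midway=True), balances(conn))
```

The first UPDATE really does run before the simulated crash; the rollback is what undoes it.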
Consistency

18

Any transaction brings the database from one valid “state” to another - which means you can
have a bunch of rules inside the database to judge the validity of data, and any transaction
that doesn’t pass fails and rolls back.
Again, not great for performance.
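One way to picture "rules inside the database" is a CHECK constraint. This is a sketch with Python's sqlite3 module; the accounts table and the withdraw helper are hypothetical, and other databases express the same idea with their own constraint and trigger syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # the validity rule lives in the database
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100)")
conn.commit()


def withdraw(conn, name, amount):
    """Return True if the withdrawal was accepted, False if the database rejected it."""
    try:
        with conn:  # a rejected update is rolled back automatically
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, name))
        return True
    except sqlite3.IntegrityError:
        return False


def balance_of(conn, name):
    return conn.execute("SELECT balance FROM accounts WHERE name = ?", (name,)).fetchall()[0][0]
```

Calling withdraw(conn, "alice", 500) is refused by the database itself, so the application cannot leave the balance negative even if it forgets to check.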
Isolation

19

Transactions executed concurrently have to result in the same state of the database as if they
had been executed serially.
Requires partially applied transactions to NOT be visible to other transactions.
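A small sketch of "partially applied work is invisible", again with Python's sqlite3 module and two connections to the same file. The events table is made up, and what other databases show here depends on the isolation level they are configured with.

```python
import os
import sqlite3
import tempfile

# A file-backed database so two separate connections can share it.
db_path = os.path.join(tempfile.mkdtemp(), "isolation_demo.db")

writer = sqlite3.connect(db_path)
writer.execute("CREATE TABLE events (msg TEXT)")
writer.commit()

reader = sqlite3.connect(db_path)


def count_events(conn):
    """Number of rows this connection can currently see."""
    rows = conn.execute("SELECT COUNT(*) FROM events").fetchall()
    return rows[0][0]


# The writer inserts a row but has NOT committed yet.
writer.execute("INSERT INTO events VALUES ('half-done work')")
seen_before_commit = count_events(reader)  # the reader still sees the old state

writer.commit()
seen_after_commit = count_events(reader)   # now the row is visible

print(seen_before_commit, seen_after_commit)
```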
Durability

20

Once a transaction is committed, it’s IN THERE.
That’s a lot of rules, and
it makes for inflexible
systems.

21
And that’s where the
killing comes in:

22
Replication

23

It’s evil, and almost all RDBMSes do it wrong.
It’s so fragile that you spend more time redoing it than actually getting any benefit from it.
MySQL can do master/master. PostgreSQL ships write-ahead log (WAL) files via scp.
It’s all horrible and gives me grey hairs.
Because it was an afterthought and not designed from the beginning.
Add-on replication is almost always horrible.
Failover

24

This is even worse than replication. Because it was even more of an afterthought.
Most of the time it fails over by accident and breaks replication.
And then someone gets woken up to clean up a steaming pile of bad data.
And that person isn’t very happy about it.
All those solutions are
hacked on and horrible.

25
There has to be a
better way.

26
Enter the CAP
Theorem

27
It came from Eric Brewer,
and changed everything.

28

It adds some reality to the database world. It basically says that no database can do
everything.
CAP stands for...

29
Consistency

30

All nodes have the same data at the same time.
Availability

31

Every request is guaranteed to receive a response as to its success or failure
Partition Tolerance

32

The system will continue to operate despite arbitrary message loss or a failure of part of the
system.
A partition is what gives you “split brain” - which happens to me if I don’t get enough coffee.
But, you can never have
all three. It’s impossible.

33

Finally, some reality! Stop trying to be everything to everyone and solve all types of problems
with the same hammer.
So when you’re looking at a data store, see which two it can do and which you need for your
data!
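To make the tradeoff concrete, here is a toy two-replica store, not a model of any real product. The class names and the "CP"/"AP" mode flag are invented for this sketch. When the two replicas cannot talk, the store either refuses writes (keeps consistency, gives up availability) or accepts them (stays available, lets the copies drift apart).

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.value = None


class TwoNodeStore:
    """Toy two-replica register; mode is "CP" or "AP" (labels invented for this sketch)."""

    def __init__(self, mode):
        self.mode = mode
        self.a, self.b = Replica("a"), Replica("b")
        self.partitioned = False  # True when a and b cannot talk to each other

    def write(self, replica, value):
        other = self.b if replica is self.a else self.a
        if not self.partitioned:
            replica.value = other.value = value  # normal case: both copies updated
            return True
        if self.mode == "CP":
            return False  # refuse the write: consistent, but not available
        replica.value = value  # AP: accept locally, replicas now disagree
        return True

    def consistent(self):
        return self.a.value == self.b.value
```

With partitioned set to True, a "CP" store answers writes with False while an "AP" store answers True and then fails the consistent() check.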
Enter all the NoSQL!

34

Stands for either “NO SQL” or “Not Only SQL” - but it’s really a bunch of different data stores
that aren’t relational and solve different kinds of problems.
And provide some solutions for old school reliability problems.
Document Stores

35

- MongoDB, Riak, CouchDB, etc.
- Not relational (though you can convince MongoDB to do it, you shouldn’t)
- Usually have really good replication stories
- Let’s look at MongoDB vs traditional MySQL
MySQL Replication

36

That’s typical master/master.
Each can take writes (but you shouldn’t)
They ship bin logs back and forth
Fragile
Easy to break replication by having conflicting writes committed near the same time on both
sides - so split-brain is always a possibility.
MongoDB Replica Set

37

- There’s an election, and one node is picked as the primary.
- It takes all writes, distributes to the secondaries
- If the primary goes down, there’s an election and a new primary is chosen (usually less than
1 second).
- New nodes join the replica set and get all the data, then can be elected primary
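The election idea can be sketched in a few lines. This is a deliberately simplified toy, not MongoDB's actual election protocol. The Node fields and the "freshest node wins" rule are assumptions made for illustration. The one property it keeps is that a primary is only chosen when a majority of the set is reachable.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    optime: int      # how far this node has replicated (higher = fresher)
    up: bool = True


def elect_primary(nodes):
    """Pick a primary among reachable nodes, or None if there is no majority."""
    reachable = [n for n in nodes if n.up]
    if len(reachable) * 2 <= len(nodes):
        return None  # a minority must not elect a primary (avoids split brain)
    # Simplification: the freshest reachable node wins; ties are broken by name.
    return max(reachable, key=lambda n: (n.optime, n.name))


replica_set = [Node("db1", 12), Node("db2", 11), Node("db3", 10)]
print(elect_primary(replica_set).name)
```

Marking db1 as down and calling elect_primary again hands the role to db2; taking db2 down as well leaves one node out of three, so no primary is elected at all.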
Benefits of Replica Sets

38

- Replication and failover designed into the system as core functionality!
- Much better failover
- Much better reliability
- I get to sleep more
- Easy to add capacity as the replica set grows (either shard by adding new replica sets or
add more nodes to scale reads).
Riak & the “Ring”

39



- Riak is crazy town
- Document store with very light querying (though the new search stuff is badass)
- Super scalable via the “Ring”
- Data is automagically replicated around the ring based on configuration
  - Number of copies
The Ring

40

- All nodes “gossip” to confirm they’re up.
- Any node can take a query and will gather the results from the other nodes.
- Nodes dropping out are “noticed” by the ring and data gets shuffled around.
- New nodes automatically join the ring and get their “share” of the data.
- Theoretically infinitely scalable (though the gossip gets REALLY noisy)
- Useful as a file store (see Riak CS)
- I think that drawing can be used to summon Beetlejuice.
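The general idea behind a ring like this is consistent hashing. The sketch below is a generic version and makes no claim to match Riak's real partitioning or claim algorithm. The node names, the copies and vnodes parameters and the choice of SHA-256 are all assumptions for illustration.

```python
import bisect
import hashlib


def _hash(key):
    # Stable hash so the placement is the same on every run.
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, nodes, copies=3, vnodes=64):
        self.copies = copies
        # Each physical node claims several points ("virtual nodes") on the ring.
        self._points = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._points]
        self._node_count = len(set(nodes))

    def owners(self, key):
        """Walk clockwise from the key's position, collecting distinct nodes."""
        wanted = min(self.copies, self._node_count)
        start = bisect.bisect(self._hashes, _hash(key))
        found = []
        for step in range(len(self._points)):
            node = self._points[(start + step) % len(self._points)][1]
            if node not in found:
                found.append(node)
                if len(found) == wanted:
                    break
        return found


ring = Ring(["riak1", "riak2", "riak3", "riak4"], copies=3)
print(ring.owners("user:42"))
```

Because placement depends only on the hash, any node can work out who holds a key, and adding a node only moves the keys that fall next to its new points on the ring.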
What’s Old is New

41

- MariaDB + Galera Cluster = MySQL replica sets! (kind of)
- Row-based replication is much more reliable
- Automatic failover and syncing of new nodes
- Can be load balanced for reads and writes!
- Still the same SQL everyone’s used to
- Theoretically any node can take writes - but I don’t trust it
My MariaDB

42

- Yes, this is the mongodb diagram
- I use haproxy to send all the writes to a single primary, with the others as backups in case
it goes down.
- I have a separate haproxy frontend that load balances across all three for reads.
- so far, i love it to pieces
Here’s HAProxy

43

- rmcom_backend - app servers
- mariadb_read_backend - the leastconn balanced pool of readers
- mariadb_write_backend - db1 is the primary unless it goes down, then db2 is “promoted”
- rails, mariadb_read and mariadb_write are the frontends
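The routing described in these notes can be modelled in a few lines of Python. This is a sketch of the decision logic only, not HAProxy configuration and not how HAProxy is implemented. The Server class, the function names and the db3 name are assumptions; the notes only name db1 and db2.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    healthy: bool = True
    active_connections: int = 0


def pick_writer(servers):
    """First healthy server in list order wins; later ones act as backups."""
    for server in servers:
        if server.healthy:
            return server
    return None


def pick_reader(servers):
    """Least-connections balancing across every healthy server."""
    healthy = [s for s in servers if s.healthy]
    return min(healthy, key=lambda s: s.active_connections) if healthy else None


pool = [Server("db1", active_connections=5),
        Server("db2", active_connections=2),
        Server("db3", active_connections=7)]
print(pick_writer(pool).name, pick_reader(pool).name)
```

Writes always land on one node at a time, which sidesteps conflicting writes, while reads spread across whichever nodes are up.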
Now, some rules...

44
If you query it, index it.

45

- As your data grows, you’ll see query speed decrease.
- Add indexes for your common queries!
- Don’t forget compound indexes.
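Here is a quick way to see an index change how a query runs, using Python's sqlite3 module and its EXPLAIN QUERY PLAN statement. The orders table, its columns and the index name are made up, and the exact plan wording differs between SQLite versions and between databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO orders (customer_id, status) VALUES (?, ?)",
    [(i % 100, "open" if i % 3 else "shipped") for i in range(1000)],
)
conn.commit()

QUERY = "SELECT id FROM orders WHERE customer_id = ? AND status = ?"


def plan(conn):
    """Return SQLite's description of how it would run QUERY."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (42, "open")).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the readable detail


plan_without_index = plan(conn)  # a full table scan

# Compound index covering both columns the query filters on.
conn.execute("CREATE INDEX idx_orders_customer_status ON orders (customer_id, status)")
plan_with_index = plan(conn)     # now a search that names the index

print(plan_without_index)
print(plan_with_index)
```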
As data increases,
flexibility decreases.

46

- You’ll need to limit the types of queries you allow people to perform because they’ll lock
things up and stop everyone from accessing it.
- You’ll need to find other ways to “protect” the database, like caching.
Cache it!

47

- Use memcached or other caching technologies to keep common queries away from the
database.
- If it can be read, it can be cached.
- Saves you a ton of money in vertically scaling your database.
- You may also need to add other ways to access your data, like say, elasticsearch or solr.
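The usual shape of this is the cache-aside pattern. The sketch below uses an in-process dictionary as a stand-in for memcached so it runs anywhere. TTLCache, cached_read and load_from_db are hypothetical names, and a real setup would swap in a memcached client.

```python
import time


class TTLCache:
    """In-process stand-in for memcached: key -> (expiry time, value)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None or entry[0] <= self.clock():
            self._store.pop(key, None)  # missing or expired
            return None
        return entry[1]

    def set(self, key, value):
        self._store[key] = (self.clock() + self.ttl, value)


def cached_read(cache, key, load_from_db):
    """Cache-aside: try the cache first, fall back to the database on a miss."""
    value = cache.get(key)
    if value is None:  # note: a stored None is treated as a miss in this sketch
        value = load_from_db(key)
        cache.set(key, value)
    return value
```

The time-to-live is the knob: a longer one keeps more reads off the database, a shorter one keeps cached data fresher.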
Scale vertically

48

- Throw hardware at it until it’s too expensive, then shard it.
- Because sharding is almost always horrible.
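One reason sharding hurts: with the simplest placement scheme (hash of the key modulo the number of shards), changing the shard count relocates most of the data. This is a sketch of that naive scheme only. The function names and key format are made up, and real systems often use smarter placement to reduce the movement.

```python
import hashlib


def shard_for(key, shard_count):
    """Naive placement: stable hash of the key, modulo the number of shards."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


def fraction_moved(keys, old_count, new_count):
    """Share of keys that land on a different shard after resizing."""
    moved = sum(1 for k in keys if shard_for(k, old_count) != shard_for(k, new_count))
    return moved / len(keys)


keys = [f"user:{i}" for i in range(10_000)]
moved = fraction_moved(keys, old_count=4, new_count=5)
print(f"{moved:.0%} of keys change shard when going from 4 to 5 shards")
```

For a uniform hash, going from 4 to 5 shards should move roughly four keys in five, and every one of those moves is data that has to be copied while the system stays up.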
What does it all mean?

49

- Don’t default to RDBMS!
- Use RDBMS if you need transactions and your data truly is relational.
- If it’s a document, use a document store
- Understand the tradeoffs
- Understand how your data will be queried
- Don’t forget you can combine technologies to build whatever you need
If We Have Time...
• Key/Value Stores
• Elasticsearch
• Why you shouldn’t use Redis... ever.
• Questions!

50
RailsBridge!
http://rubysavannah.com - 11/16/2013

51

- We need front-end volunteers and students!
- Next one is in January so check back in November for the signup!
Thank you!
• kevin@railsmachine.com
• @kplawver
• http://railsmachine.com
• http://lawver.net
52

Presentation on how to chat with PDF using ChatGPT code interpreter
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 

Your Database is Trying to Kill You

  • 1. Your Database Wants to Kill You. Kevin Lawver - 11/1/2013
  • 3. I work at Rails Machine. We do ops: lots of ops on lots of different kinds of databases. Enough introductions, let’s get on with the murder!
  • 4. Databases have been around since before most of us were born. So they’re well understood, well despised, and crusty.
  • 5. There’s been a revolution the past few years.
  • 6. Getting away from fully relational databases, to something... odd.
  • 14. This is what you’re used to.
  • 15. Tables, relationships, foreign keys, SQL, etc. And lots of rules.
  • 16. ACID: The set of rules relational databases follow to ensure the data gets where it needs to go and is consistent. They’re fine for a certain kind of workload.
  • 17. Atomicity: Transactions are all or nothing. If any part of the transaction fails, the WHOLE thing has to fail and roll back. That means a lot of locking, which can become a performance problem.
  • 18. Consistency: Any transaction brings the database from one valid “state” to another, which means you can have a bunch of rules inside the database to judge the validity of data, and any transaction that doesn’t pass fails and rolls back. Again, not great for performance.
  • 19. Isolation: Transactions executed concurrently have to result in the same state of the database as if they had been executed serially. That requires partially applied transactions to NOT be visible to other transactions.
  • 20. Durability: Once a transaction is committed, it’s IN THERE.
  • 21. That’s a lot of rules, and it makes for inflexible systems.
  • 22. And that’s where the killing comes in:
  • 23. Replication: It’s evil, and almost all RDBMSs do it wrong. It’s so fragile that you spend more time redoing it than actually getting any benefit from it. MySQL can do master/master. PostgreSQL ships write-ahead log (WAL) files via scp. It’s all horrible and gives me grey hairs, because it was an afterthought and not designed in from the beginning. Add-on replication is almost always horrible.
  • 24. Failover: This is even worse than replication, because it was even more of an afterthought. Most of the time it fails over by accident and breaks replication. And then someone gets woken up to clean up a steaming pile of bad data. And that person isn’t very happy about it.
  • 25. All those solutions are hacked on and horrible.
  • 26. There has to be a better way.
  • 28. It came from Amazon, and changed everything. It adds some reality to the database world. It basically says that no database can do everything.
  • 30. Consistency: All nodes have the same data at the same time.
  • 31. Availability: Every request is guaranteed to receive a response as to its success or failure.
  • 32. Partition Tolerance: The system will continue to operate despite arbitrary message loss or a failure of part of the system. Also known as “split brain”, which happens to me if I don’t get enough coffee.
  • 33. But, you can never have all three. It’s impossible. Finally, some reality! Stop trying to be everything to everyone and solve all types of problems with the same hammer. So when you’re looking at a data store, see which two it can do and which you need for your data!
  • 34. Enter all the NoSQL! Stands for either “NO SQL” or “Not Only SQL”, but it’s really a bunch of different data stores that aren’t relational and solve different kinds of problems, and provide some solutions for old school reliability problems.
  • 35. Document Stores: MongoDB, Riak, CouchDB, etc. Not relational (though you can convince MongoDB to do it, you shouldn’t). Usually have really good replication stories. Let’s look at MongoDB vs. traditional MySQL.
  • 36. MySQL Replication: That’s typical master/master. Each can take writes (but you shouldn’t). They ship bin logs back and forth. Fragile. Easy to break replication by having conflicting writes committed near the same time on both sides, so split-brain is always a possibility.
  • 37. MongoDB Replica Set: There’s an election, and one node is picked as the primary. It takes all writes and distributes them to the secondaries. If the primary goes down, there’s an election and a new primary is chosen (usually less than 1 second). New nodes join the replica set and get all the data, then can be elected primary.
  • 38. Benefits of Replica Sets: Replication and failover designed into the system as core functionality! Much better failover. Much better reliability. I get to sleep more. Easy to add capacity as the replica set grows (either shard by adding new replica sets or add more nodes to scale reads).
  • 39. Riak & the “Ring”: Riak is crazy town. Document store with very light querying (though the new search stuff is badass). Super scalable via the “Ring”. Data is automagically replicated around the ring based on configuration (number of copies).
  • 40. The Ring: All nodes “gossip” to confirm they’re up. Any node can take a query and will gather the results from the other nodes. Nodes dropping out are “noticed” by the ring and data gets shuffled around. New nodes automatically join the ring and get their “share” of the data. Theoretically infinitely scalable (though the gossip gets REALLY noisy). Useful as a file store (see Riak CS). I think that drawing can be used to summon Beetlejuice.
  • 41. What’s Old is New: MariaDB + Galera Cluster = MySQL replica sets! (kind of). Row-based replication is much more reliable. Automatic failover and syncing of new nodes. Can be load balanced for reads and writes! Still the same SQL everyone’s used to. Theoretically any node can take writes, but I don’t trust it.
  • 42. My MariaDB: Yes, this is the MongoDB diagram. I use HAProxy to send all the writes to a single primary, with the others as backups in case it goes down. I have a separate HAProxy frontend that load balances across all three for reads. So far, I love it to pieces.
  • 43. Here’s HAProxy: rmcom_backend is the app servers; mariadb_read_backend is the leastconn balanced pool of readers; mariadb_write_backend has db1 as the primary unless it goes down, then db2 is “promoted”. rails, mariadb_read and mariadb_write are the frontends.
  • 45. If you query it, index it. As your data grows, you’ll see query speed decrease. Add indexes for your common queries! Don’t forget compound indexes.
  • 46. As data increases, flexibility decreases. You’ll need to limit the types of queries you allow people to perform because they’ll lock things up and stop everyone from accessing it. You’ll need to find other ways to “protect” the database, like...
  • 47. Cache it! Use memcached or other caching technologies to keep common queries away from the database. If it can be read, it can be cached. Saves you a ton of money in vertically scaling your database. You may also need to add other ways to access your data, like, say, Elasticsearch or Solr.
  • 48. Scale vertically: Throw hardware at it until it’s too expensive, then shard it. Because sharding is almost always horrible.
  • 49. What does it all mean? Don’t default to RDBMS! Use RDBMS if you need transactions and your data truly is relational. If it’s a document, use a document store. Understand the tradeoffs. Understand how your data will be queried. Don’t forget you can combine technologies to build whatever you need.
  • 50. If We Have Time...: Key/Value Stores; Elasticsearch; Why you shouldn’t use Redis... ever; Questions!
  • 51. RailsBridge! http://rubysavannah.com - 11/16/2013. We need front-end volunteers and students! Next one is in January, so check back in November for the signup!
  • 52. Thank you! kevin@railsmachine.com, @kplawver, http://railsmachine.com, http://lawver.net
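
Slides 39 and 40 describe data being spread around a ring with a configured number of copies, and being reshuffled when nodes join or drop out. The sketch below illustrates that idea with generic consistent hashing. It is not Riak's implementation: the `Ring` class, the `points_per_node` parameter, and the node names are all assumptions made for illustration.

```python
import bisect
import hashlib


def ring_position(value: str) -> int:
    """Map a string to an integer position on the ring using SHA-1."""
    return int(hashlib.sha1(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Hypothetical consistent-hash ring; each node owns several points."""

    def __init__(self, nodes, copies=3, points_per_node=64):
        self.copies = copies
        self.points_per_node = points_per_node
        self._points = []  # sorted list of (position, node_name)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # A joining node claims its own points; existing points do not move.
        for i in range(self.points_per_node):
            bisect.insort(self._points, (ring_position(f"{node}#{i}"), node))

    def remove_node(self, node):
        # A departing node's points vanish; its keys fall to the next nodes.
        self._points = [p for p in self._points if p[1] != node]

    def owners(self, key):
        """Walk clockwise from the key's position, collecting distinct nodes."""
        distinct = {node for _, node in self._points}
        wanted = min(self.copies, len(distinct))
        i = bisect.bisect(self._points, (ring_position(key), ""))
        found = []
        while len(found) < wanted:
            node = self._points[i % len(self._points)][1]
            if node not in found:
                found.append(node)
            i += 1
        return found
```

With four nodes and `copies=3`, `Ring(["a", "b", "c", "d"]).owners("user:42")` returns three distinct node names. Removing the one node that does not hold the key leaves that list unchanged, which is the property that keeps reshuffling small when the ring changes.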
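
Slides 42 and 43 describe two routing rules: writes go to db1 unless it is down, in which case db2 takes over, and reads go to a leastconn balanced pool. The sketch below restates those two rules in Python to make the behaviour concrete. It is not HAProxy configuration and not the author's setup; the `Server` class and both function names are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical view of one database server as a proxy might track it."""
    name: str
    healthy: bool = True
    active_connections: int = 0


def pick_writer(servers):
    """Return the first healthy server in priority order, or None.

    Listing db1 first means it takes every write until it fails a check.
    """
    for server in servers:
        if server.healthy:
            return server
    return None


def pick_reader(servers):
    """Return the healthy server with the fewest active connections, or None."""
    candidates = [s for s in servers if s.healthy]
    if not candidates:
        return None
    return min(candidates, key=lambda s: s.active_connections)


db1 = Server("db1", active_connections=12)
db2 = Server("db2", active_connections=3)
db3 = Server("db3", active_connections=7)
pool = [db1, db2, db3]
```

Keeping a single writer avoids the conflicting-write problem from slide 36, while the read pool still uses all three servers.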
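
Slide 45 recommends indexing common queries and remembering compound indexes. The sketch below shows how a query plan changes once a compound index exists. It uses Python's built-in `sqlite3` module as a stand-in for MySQL or MariaDB, and the `orders` table, its columns, and the index name are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, status, total) VALUES (?, ?, ?)",
    [(i % 100, "open" if i % 3 else "paid", i * 1.5) for i in range(1000)],
)

# A "common query" that filters on two columns at once.
QUERY = "SELECT id, total FROM orders WHERE customer_id = ? AND status = ?"


def plan(connection, sql, params):
    """Return SQLite's query plan description for a statement."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


# Without an index, SQLite has to scan the whole table.
plan_before = plan(conn, QUERY, (42, "open"))

# A compound index covers both columns in the WHERE clause.
conn.execute(
    "CREATE INDEX idx_orders_customer_status ON orders (customer_id, status)"
)
plan_after = plan(conn, QUERY, (42, "open"))
```

Printing `plan_before` and `plan_after` shows the plan moving from a table scan to a search that names the new index. Other databases expose the same information through their own `EXPLAIN` output.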
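
Slide 47 suggests putting a cache in front of the database so common reads never reach it. A minimal read-through sketch follows, with a plain dictionary standing in for memcached. The `ReadThroughCache` class, its `ttl_seconds` parameter, and the `slow_query` function are assumptions made for illustration, not a real client API.

```python
import time


class ReadThroughCache:
    """Hypothetical in-process cache; a dict stands in for memcached."""

    def __init__(self, load, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load          # function that actually hits the database
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # fresh hit: the database is not touched
        value = self._load(key)    # miss or expired: go to the database
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop a key after a write so the next read reloads it."""
        self._entries.pop(key, None)


database_hits = []


def slow_query(key):
    """Stand-in for an expensive database read; records each call."""
    database_hits.append(key)
    return f"row for {key}"


cache = ReadThroughCache(slow_query, ttl_seconds=30.0)
first = cache.get("user:1")
second = cache.get("user:1")
```

After the two `get` calls, `database_hits` holds a single entry, because the second read was served from the cache. The tradeoff is staleness: a write has to call `invalidate`, or readers see the old value until the TTL expires.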