Abdelmonaim Remani | Just.me Inc.


The Rise of NoSQL and
 Polyglot Persistence
About Me
• Software Architect at Just.me Inc.
• Interested in technology evangelism and enterprise software
  development and architecture
• Frequent speaker (JavaOne, JAX, OSCON, ORDEV, etc…)
• Open-source advocate
• President and founder of a number of user groups
   – NorCal Java User Group
   – The Silicon Valley Spring User Group
   – The Silicon Valley Dart Meetup
• Bio:         http://about.me/PolymathicCoder
• Twitter:     @PolymathicCoder
• Email:       abdelmonaim.remani@gmail.com
License




• Creative Commons Attribution Non-Commercial 3.0 Unported
   – http://creativecommons.org/licenses/by-nc/3.0


• Disclaimer: The graphics and the logo in the presentation
  belong to their rightful owners
The Golden Age of Relational
        Databases
Relational Data Stores
• Relational Data Stores have been the
  predominant choice in storing data
  – The existence of mature solutions
    • Oracle, MySQL, MS SQL Server, etc…
  – Wide adoption and familiarity
    • Developers and even advanced business users
  – An abundance of tools
  – Etc…
• It became the de facto standard
The Relational Model
• Data
  – Stored in
     • 2 dimensional tables (Relations)
     • Rows (tuples) and columns (attributes)
• Has a well-defined, enforced schema
  – Relations themselves
  – Integrity constraints
• Normalization
  – Smaller tables with well-defined relationships
    between them
  – Why?
      • Minimized redundancy
      • No modification anomalies
          – Modification Propagation or cascading
The Relational Model
• Supported by SQL (Structured Query
  Language)
  – A somewhat standardized query language
  – Very flexible
  – Many Operations
    • Across multiple relations such as JOIN
    • Aggregations such as GROUP BY
    • Etc…
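A minimal sketch of the two operations named above, using Python's built-in sqlite3 module. The customers and orders tables and their rows are made up for illustration; they are not from the talk.

```python
import sqlite3

# In-memory database; table names, column names, and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO orders VALUES (1, 1, 20.0), (2, 1, 5.0), (3, 2, 12.5);
""")

# JOIN combines the two relations; GROUP BY aggregates the amounts per customer.
totals = conn.execute("""
    SELECT c.name, SUM(o.amount)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

print(totals)
```

The query is written without knowing in advance how the data will be read, which is the flexibility the relational model is built for.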
The Relational Model
• Transactional
  • ACID
    – Atomicity
        » All or nothing
    – Consistency
        » From one valid state to another
    – Isolation
        » Concurrent transactions still result in a valid state
    – Durability
        » Once committed, it’s forever
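Atomicity ("all or nothing") can be sketched with Python's built-in sqlite3 module. The accounts table, the CHECK constraint, and the transfer helper are assumptions made for this example, not something from the talk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: a balance may never go negative.
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("me", 50), ("friend", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money in one transaction: both updates apply, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


# The credit succeeds but the debit would overdraw "me", so both are rolled back.
failed = transfer(conn, "me", "friend", 100)
after_failed = balances(conn)

# A transfer that respects the constraint is applied as a whole.
succeeded = transfer(conn, "me", "friend", 30)
after_succeeded = balances(conn)
```

The credit is deliberately issued before the debit so that the rollback is visible: without the transaction, the friend would keep money that was never withdrawn.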
The Relational Model
• Designed with the assumptions that
 – The end-user will directly interact with the database

   » It makes sense that the RDBMS should manage concurrency
     and integrity

   » Access Patterns are unknown

     » A flexible query language that is close to English

     » Data structure with no bias towards a particular pattern of
       querying

 – The database runs on a single machine

   » The only way to promise true ACID
Road Bumps
• We started building more complex applications on top
  of relational databases
 – Business logic moved out of the RDBMS

   » Fewer triggers and stored procedures, replaced by
     equivalent application-layer code

 – The applications themselves evolved beyond the procedural
   paradigm to a more OOP approach

   » The Object-Relational impedance mismatch

     » ORM frameworks to the rescue
Scalability
We became data hoarders!
• As our datasets grew out of control
• Performance decreases exponentially
  – We buy beefier machines
     • Larry Ellison’s most expensive RAC and make
       him even richer
• This put off the problem for a little while
Optimization
• We hire a guy
  – Indexes half of the databases
     • Made those queries a little faster
  – Creates materialized views for complex joins
     • Nightmare to maintain, get stale, etc…
  – He de-normalizes
     • Anything but a smooth transition!
     • Redundancy
  – He introduces Caching
     • Data too stale
     • More redundancy
Clustering
• We hire another guy
   – Tells us that we hit the limit of the one machine
   – You need to scale out (Horizontally)
      • Master/Slave
          – Assuming you read more than you write
          – Write to the Master and Read from the Slaves
          – Master needs to replicate data across the slaves
              » Risk incorrect reads
          – How’s that consistent?!!
      • Sharding
          –   Improves reads as much as writes
          –   Can’t join across partitions
          –   No referential integrity
          –   Requires modification of client applications
          –   Introduces a single point of failure
          –   How’s that consistent?!!
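A toy sketch of hash-based sharding, assuming each Python dict stands in for one database server. The ShardedStore class and the sample users are hypothetical. It shows why a query that spans partitions has to be done by the client.

```python
import hashlib


class ShardedStore:
    """Toy hash-sharded store; each dict stands in for one server (hypothetical)."""

    def __init__(self, shard_count):
        self.shards = [{} for _ in range(shard_count)]

    def _shard_for(self, key):
        # Stable hash so the same key always routes to the same shard.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return self.shards[int.from_bytes(digest[:8], "big") % len(self.shards)]

    def put(self, key, value):
        self._shard_for(key)[key] = value

    def get(self, key):
        return self._shard_for(key).get(key)

    def scan(self, predicate):
        # No cross-shard query engine: the client visits every shard and merges.
        return sorted(
            key
            for shard in self.shards
            for key, value in shard.items()
            if predicate(value)
        )


store = ShardedStore(shard_count=3)
for user, country in [("amina", "MA"), ("carlos", "ES"), ("sara", "US"), ("kenji", "JP")]:
    store.put(user, country)

lookup = store.get("sara")
north_of_sahara = store.scan(lambda country: country in {"MA", "ES"})
```

Lookups by key touch one shard, while anything resembling a join or a filter on a non-key attribute turns into application code, which is the "requires modification of client applications" point above.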
What’s the Point?
• We horizontally scale our relational
  database
  – We’re no longer consistent
  – No ACIDity?
  – We lose query flexibility
• Are we doing something wrong?
The CAP Theorem
The CAP Theorem
• Eric Brewer on distributed systems
  – Pick two out of
    • Consistency
    • Availability
    • Partition Tolerance
• There is no Fast Cheap Good service
  – Cheap Good service won’t be Fast
  – Fast Good service won’t be Cheap
  – Fast Cheap service won’t be Good
Relational Model & CAP
• Relational Data Stores happen to favor
  – Consistency and Availability
  – For historical reasons
     • They are key to certain types of applications
     • The bank example
        – I deposit $100 in my friend’s bank account
        – Blah blah blah…
• According to CAP, a CA system cannot also be
  partition tolerant, meaning that horizontal
  scaling is impossible
Scheiße!
• We’re in a pickle
  – Too much data in a CA model
  – Vertical Scaling
     • Too expensive
     • Not sustainable
• Forced to explore other alternatives in
  light of CAP
What AP Looks Like
• Partition Tolerance
  – Since we reached the limit of the one machine
    we have no choice but to scale horizontally
  – Which means to be partition tolerant
• Availability
  – Nobody is willing to give it up most of the time
  – This becomes even better with distribution
  – In a cluster of servers
     • The individual node might be unreliable by itself
     • But the cluster as a whole is inherently reliable
What AP Looks Like
• According to CAP we simply cannot have C
• Consistency
  – I make an update and all subsequent reads return the most
    up-to-date value
  – Unfortunately this is impossible as it takes time for
    the change to be replicated across each node of
    the cluster
• What a bummer?!
• Let’s look at an AP system
  – DNS (Domain Name System)
     • Not all the nodes have the most updated records (You
       register that domain name and wait for a few days to
       guarantee that every DNS knows about it)
Eventual Consistency
• This is not so bad
   – It means that we just settled for a lesser degree of
     consistency
• So what if
   – Mohammad in Morocco updated his relationship status
     to single on some edge node
   – His cousin who lives in Spain saw it immediately because
     they happen to be on the same edge node
   – His secret admirer Sara who lives in the United States
     could not see it until an hour later
   – His brother in Japan got the update the next day
   – They all got it eventually!
• Eventual Consistency as Opposed to Immediate
  Consistency
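The relationship-status story can be sketched as a toy simulation of asynchronous replication. The Cluster class, the node names, and the explicit replicate step are assumptions for illustration. Real systems propagate updates in the background and must also resolve conflicting writes, which this sketch ignores.

```python
class Cluster:
    """Toy asynchronous replication: a write lands on one node, others catch up later."""

    def __init__(self, node_names):
        self.nodes = {name: {} for name in node_names}
        self.pending = []  # (target node, key, value) updates not yet delivered

    def write(self, node, key, value):
        self.nodes[node][key] = value
        for other in self.nodes:
            if other != node:
                self.pending.append((other, key, value))

    def read(self, node, key):
        return self.nodes[node].get(key)

    def replicate(self, steps=None):
        """Deliver up to `steps` pending updates (all of them when steps is None)."""
        count = len(self.pending) if steps is None else min(steps, len(self.pending))
        for _ in range(count):
            target, key, value = self.pending.pop(0)
            self.nodes[target][key] = value


def snapshot(cluster, key):
    return [cluster.read(node, key) for node in ("morocco", "usa", "japan")]


cluster = Cluster(["morocco", "usa", "japan"])
cluster.write("morocco", "status", "single")

before = snapshot(cluster, "status")   # only the local node has the update
cluster.replicate(steps=1)
midway = snapshot(cluster, "status")   # one remote node has caught up
cluster.replicate()
after = snapshot(cluster, "status")    # every node agrees, eventually
```

Every read is answered immediately (availability), but what a reader sees depends on which node answers and when.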
The Compromise
• We settle for a weaker consistency model
  – BASE
    • Basically Available
    • Soft state
    • Eventual Consistency
• ACID on the individual node, BASE on
  the cluster
The Slippery Slope of the
        Faithless
You might as well Question…
• Schema
 – Logical
   • Well-defined and rigid in relational databases
   • Why not a flexible one or even no schema
 – Physical
   • B Trees in most relational databases
   • Why not use some other underlying data
     structure
You might as well Question…
• Integrity Constraints
  – Who cares?
• A Query Language
  – Anything would do…
• Security
  – None
• Name it…
NoSQL: Going Rogue…
NoSQL
• A wide range of specialized data stores
  with the goal of addressing the challenges
  of the relational model
• Eric Evans
  – The whole point of seeking alternatives is that
    you need to solve a problem that relational
    databases are a bad fit for
• Let me make it easier
  – It is not anti-SQL or anti-relational
  – Any data store that is non-relational
• “Not Only SQL” instead of “NO SQL”
SQL                     vs.     NoSQL
A single machine                A cluster
CA                              AP/CA/CP
Scale vertically                Scale horizontally
SQL                             Custom APIs
ACID                            BASE
Full indexes                    Indexes mostly on keys


            There are outliers of course
SQL                     vs.     NoSQL
Rigid schema                    Schema-less
Flexible queries                Pre-defined queries

• SQL (Relational)
  – Concerned about what the data consists of
• NoSQL (Non-Relational)
  – Concerned with how the data is queried

                There are outliers of course
The Zoo
Key-Value Data Stores
• Basically a big hash map (associative array)
   – Very Simple
   – Very fast read and write
   – No secondary indexes
• Use When
   – Your data is not highly related
   – All you need is basic CRUD
• Challenges
   – Complex queries
• Check out the Amazon Dynamo Paper
       • http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf
• Featured Projects
   – DynamoDB http://aws.amazon.com/dynamodb/
   – Riak http://wiki.basho.com/
   – Redis http://redis.io/
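A minimal in-memory sketch of the key-value model, assuming a plain Python dict as the storage engine. The KeyValueStore class and the session keys are hypothetical. It shows why basic CRUD is cheap and why anything beyond lookup by key is a challenge.

```python
class KeyValueStore:
    """Toy key-value store backed by a dict (hypothetical, single process)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):   # create or update
        self._data[key] = value

    def get(self, key, default=None):   # read
        return self._data.get(key, default)

    def delete(self, key):
        # Returns True when the key existed, False otherwise.
        return self._data.pop(key, None) is not None

    def find_keys(self, predicate):
        # No secondary index: a query on the value must scan every entry.
        return sorted(key for key, value in self._data.items() if predicate(value))


sessions = KeyValueStore()
sessions.put("session:1", {"user": "sara", "cart_items": 3})
sessions.put("session:2", {"user": "kenji", "cart_items": 0})

by_key = sessions.get("session:1")                                # direct lookup
non_empty = sessions.find_keys(lambda s: s["cart_items"] > 0)     # full scan
removed = sessions.delete("session:2")
missing = sessions.get("session:2")
```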
Columnar Stores
•   In a table, data of the same column is stored together
     – Storage is not wasted on null values as in row-based stores (RDBMS)
     – Great for sparse tables
     – Very fast column operations, including aggregation
•   Use When
     – Big Data (Excellent leverage of Map Reduce)
     – Need compression or versioning
•   Challenges
     – You better know your access patterns beforehand
     – Key design is not trivial
•   Check out Google’s BigTable Paper
     – http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/archive/bigtable-osdi06.pdf
•   Featured Projects
     – HBase http://hbase.apache.org/
     – Cassandra http://cassandra.apache.org/
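A rough sketch of the column-oriented idea in plain Python, with made-up sample rows. It only illustrates the storage layout: each column is kept together, missing values take no space, and an aggregation reads a single column. Real stores such as HBase and Cassandra add column families, versioning, and distribution on top.

```python
# Hypothetical sparse rows: not every row has every attribute.
rows = [
    {"id": 1, "city": "Rabat", "price": 10.0},
    {"id": 2, "price": 4.5},
    {"id": 3, "city": "Madrid", "price": 7.5, "coupon": "SPRING"},
]


def to_columns(rows):
    """Pivot row dicts into one {row id: value} map per column.

    A value that is absent from a row is simply not stored, so sparse
    columns cost nothing for the rows that lack them.
    """
    columns = {}
    for row in rows:
        for name, value in row.items():
            if name != "id":
                columns.setdefault(name, {})[row["id"]] = value
    return columns


columns = to_columns(rows)

# The aggregation touches only the "price" column, never the other attributes.
total_price = sum(columns["price"].values())
coupon_column = columns["coupon"]
```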
Document Data Stores
•   Nested structures of hashes and their values
     – A document can be
          •   Simply a hash and its value
          •   Hash and another document as its value
          •   No limit in depth
     –   Very Flexible schema
     –   Well-Indexed data
     –   Works well with OOP (No impedance mismatch)
     –   De-normalize as a best practice
•   Use when
     – You don’t know much about the schema
     – The schema is very likely to change
•   Challenges
     – Complex Join-like queries
     – Self-referencing documents and circular dependencies
•   Projects
     – MongoDB http://www.mongodb.org/
     – CouchDB http://couchdb.apache.org/
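A small sketch of a document as nested hashes, using Python dicts and the standard json module. The order document and the get_path helper are hypothetical. The dotted-path lookup only imitates the kind of field access document stores offer and is not the API of any particular product.

```python
import json

# A de-normalized order: customer and line items are embedded, not joined.
order = {
    "_id": "order-1001",
    "customer": {"name": "Sara", "address": {"city": "San Jose", "country": "US"}},
    "items": [
        {"sku": "A-1", "qty": 2, "price": 9.5},
        {"sku": "B-7", "qty": 1, "price": 20.0},
    ],
}


def get_path(document, dotted_path, default=None):
    """Follow a dotted path such as 'customer.address.city' through nested dicts."""
    current = document
    for part in dotted_path.split("."):
        if isinstance(current, dict) and part in current:
            current = current[part]
        else:
            return default
    return current


city = get_path(order, "customer.address.city")
phone = get_path(order, "customer.phone")          # field absent: no schema error
order_total = sum(item["qty"] * item["price"] for item in order["items"])

# The same structure maps directly to JSON and back to application objects.
round_trip = json.loads(json.dumps(order))
```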
Graph Data Stores
• A graph
   –   Perfect for highly interconnected data
   –   Allows for explicit relationships
   –   Fine-grained graph traversal
   –   Very Flexible
   –   Works well with OOP (No impedance mismatch)
• Use when
   – Your data looks like a graph and requires graph queries
   – You are smart enough not to try this on another data store
• Challenges
   – Doesn’t scale well horizontally
• Featured Projects
   – Neo4j http://neo4j.org/
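A sketch of a typical graph question ("who is within two hops of this person?") answered with a breadth-first traversal over an adjacency map. The friends graph and the within_hops function are made up for illustration. A graph database such as Neo4j would express this in its own query language rather than in application code.

```python
from collections import deque

# Hypothetical undirected friendship graph stored as adjacency sets.
friends = {
    "mohammad": {"cousin", "brother", "sara"},
    "cousin": {"mohammad", "brother"},
    "brother": {"mohammad", "cousin", "kenji"},
    "sara": {"mohammad"},
    "kenji": {"brother"},
}


def within_hops(graph, start, max_hops):
    """Breadth-first traversal returning {node: hop count} for nodes within max_hops."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    del distances[start]
    return distances


two_hops = within_hops(friends, "sara", 2)
three_hops = within_hops(friends, "sara", 3)
```

In a relational schema each extra hop is typically another self-join on a friendship table, which is why this kind of question fits a graph store better.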
Relational Data Stores
• Use when
   – Your data is highly relational
   – There is a need to break data into small pieces and
     assemble it in different ways
   – Consistency is king
   – Access patterns are unknown
   – Reporting
• Challenges
   – Doesn’t scale well horizontally
• Featured Projects
   –   Oracle http://www.oracle.com/index.html
   –   Postgres http://www.postgresql.org/
   –   MS SQL Server https://www.microsoft.com/sql-server
   –   MySQL http://www.mysql.com/
How do you choose?
If It Doesn’t Fit, You Must Acquit!
• Data
  –   Does it have a natural structure?
  –   How is it connected?
  –   How is it distributed?
  –   How much?
• Access Patterns
  – Reads/Writes ratio?
  – Uniform or random?
• CAP
Other Considerations
•   Maturity
•   Stability
•   Maintainability
•   Durability
•   Cost
•   Tools
•   Familiarity
For Fairness’ Sake!
For Fairness’ Sake!
• Relational data stores did not fail us
  – They actually perform very well
• We failed ourselves
  – By using them as solutions for problems
    they weren’t designed to solve to begin
    with
• Take any data store and you’ll get as
  much trouble
For Fairness’ Sake!
• You can’t expect
  – A flathead screwdriver to work on a Phillips
    screw as well as one with the matching Phillips
    blade
  – A crosshead screwdriver to work on a
    flathead screw
Polyglot Persistence
Polyglot Persistence
• Enterprise applications are complex and
  combine complex problems
  – The assumption that we should use one data store is
    absurd
  – You can’t try to fit everything in one model and expect no
    problems
• Polyglot Persistence
  – To leverage multiple data stores, based on the
    way data is used by the application
     • Associated with a learning curve
     • Long term investment (More productive in the long-run)
  – Leverage the strength of multiple data stores
Polyglot Persistence
• Example
  –   MongoDB for the product catalog
  –   Redis for shopping cart
  –   DynamoDB for social profile info
  –   Neo4j for the social graph
  –   HBase for inbox and public feed messages
  –   MySQL for payment and account info
  –   Cassandra for audit and activity log
• Disclaimer: I’m not making any
  recommendation here.
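One way the application side of such a setup could look, as a sketch only: each concern is routed to its own store behind a small common interface. InMemoryStore and PolyglotRepository are hypothetical stand-ins. No real client library is used, and the store names are just labels taken from the example above.

```python
class InMemoryStore:
    """Stand-in for a real data store client (hypothetical); keeps records in a dict."""

    def __init__(self, kind):
        self.kind = kind
        self.records = {}

    def save(self, key, value):
        self.records[key] = value

    def load(self, key):
        return self.records.get(key)


class PolyglotRepository:
    """Routes each concern (catalog, cart, ...) to the store chosen for it."""

    def __init__(self, routes):
        self._routes = routes

    def store_for(self, concern):
        return self._routes[concern]

    def save(self, concern, key, value):
        self.store_for(concern).save(key, value)

    def load(self, concern, key):
        return self.store_for(concern).load(key)


repository = PolyglotRepository({
    "catalog": InMemoryStore("document"),      # MongoDB in the example above
    "cart": InMemoryStore("key-value"),        # Redis in the example above
    "payments": InMemoryStore("relational"),   # MySQL in the example above
})

repository.save("catalog", "sku-1", {"title": "Espresso machine", "tags": ["kitchen"]})
repository.save("cart", "session:1", ["sku-1"])

cart = repository.load("cart", "session:1")
catalog_kind = repository.store_for("catalog").kind
not_in_payments = repository.load("payments", "sku-1")
```

Keeping the routing in one place is a design choice that makes it possible to swap the store behind a concern without touching the callers.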
NoSQL in the Cloud
NoSQL in the Cloud
• NoSQL as a commodity
  – Fully managed data stores (No
    maintenance)
  – Elastic scaling
  – Cheap storage
• Featured:
  – Amazon AWS
  – Heroku Add-ons
  – CloudFoundry
As Promised!
The A’s to the Q’s in the Abstract
• What does the rise of all these NoSQL data stores mean
  to my enterprise?
   – I’m guessing a lot
• What is NoSQL to begin with?
   – Any non-relational data store
• Does it mean “NO SQL”?
   – No
• Could this be just another fad?
   – I don’t think so
The A’s to the Q’s in the Abstract
• Is it a good idea to bet the future of my
  enterprise on these new exotic
  technologies and simply abandon
  proven mature RDBMS?
  – It’s up to you. I will say “No guts, no glory!”
• How scalable is scalable?
  – However much you need it to be
The A’s to the Q’s in the Abstract
• Assuming that I am sold, how do I
  choose the one that fits my needs the
  best?
  – I’ll tell you if you hire me
• Is there a middle ground somewhere?
  – Polyglot Persistence
• What is this Polyglot Persistence I hear
  about?
  – It’s the middle ground
Any Other Questions?
Thank You All!

@PolymathicCoder

What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 

Recently uploaded (20)

Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 

The Rise of NoSQL and Polyglot Persistence

  • 1. Abdelmonaim Remani | Just.me Inc. The Rise of NoSQL and Polyglot Persistence
  • 2. About Me • Software Architect at Just.me Inc. • Interested in technology evangelism and enterprise software development and architecture • Frequent speaker (JavaOne, JAX, OSCON, ORDEV, etc…) • Open-source advocate • President and founder of a number of user groups – NorCal Java User Group – The Silicon Valley Spring User Group – The Silicon Valley Dart Meetup • Bio: http://about.me/PolymathicCoder • Twitter: @PolymathicCoder • Email: abdelmonaim.remani@gmail.com
  • 3. License • Creative Commons Attribution Non-Commercial 3.0 Unported – http://creativecommons.org/licenses/by-nc/3.0 • Disclaimer: The graphics and the logo in the presentation belong to their rightful owners
  • 4. The Golden Age of Relational Databases
  • 5. Relational Data Stores • Relational Data Stores have been the predominant choice for storing data – The existence of mature solutions • Oracle, MySQL, MS SQL Server, etc… – Wide adoption and familiarity • Developers and even advanced business users – An abundance of tools – Etc… • They became the de facto standard
  • 6. The Relational Model • Data – Stored in • 2-dimensional tables (Relations) • Rows (tuples) and columns (attributes) • Has a well-defined, enforced schema – Relations themselves – Integrity constraints • Normalization – Smaller tables with well-defined relationships between them – Why? • Minimized redundancy • No modification anomalies – Modification propagation or cascading
  • 7. The Relational Model • Supported by SQL (Structured Query Language) – A somewhat standardized query language – Very flexible – Many Operations • Across multiple relations such as JOIN • Aggregations such as GROUP BY • Etc…
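
To make the JOIN and GROUP BY bullets concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `customer` and `purchase` tables and their rows are invented for illustration and are not from the talk.

```python
import sqlite3


def orders_per_customer():
    """Join two normalized tables and aggregate per customer."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: customers and their purchases live in separate
    # relations, linked by a foreign key.
    conn.executescript(
        """
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE purchase (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer(id),
            amount INTEGER NOT NULL
        );
        INSERT INTO customer VALUES (1, 'Ada'), (2, 'Linus');
        INSERT INTO purchase VALUES (1, 1, 30), (2, 1, 20), (3, 2, 5);
        """
    )
    rows = conn.execute(
        """
        SELECT c.name, COUNT(p.id), SUM(p.amount)
        FROM customer AS c
        JOIN purchase AS p ON p.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()
    conn.close()
    return rows
```

The same two tables could be joined and grouped in other ways without changing how the data is stored, which is the query flexibility the slide refers to.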
  • 8. The Relational Model • Transactional • ACID – Atomicity » All or nothing – Consistency » From one valid state to another – Isolation » Concurrent transactions still result in a valid state – Durability » Once committed, it’s forever
  • 9. The Relational Model • Designed with the assumptions that – The end-user will directly interact with the database » It makes sense that the RDBMS should manage concurrency and integrity » Access patterns are unknown » A flexible query language that is close to English » Data structure with no bias towards a particular pattern of querying – The database runs on a single machine » The only way to promise true ACID
  • 10. Road Bumps • We started building more complex applications on top of relational databases – Business logic moved out of the RDBMS » Fewer triggers and stored procedures, replaced by equivalent application-layer code – The applications themselves evolved beyond the procedural paradigm to a more OOP approach » The Object-Relational impedance mismatch » ORM frameworks to the rescue
  • 12. We became data hoarders! • As our datasets grew out of control • Performance degrades sharply – We buy beefier machines • Larry Ellison’s most expensive RAC, making him even richer • This puts off the problem for a little while
  • 13. Optimization • We hire a guy – Indexes half of the databases • Made those queries a little faster – Creates materialized views for complex joins • Nightmare to maintain, get stale, etc… – He de-normalizes • Anything but a smooth transition! • Redundancy – He introduces caching • Data too stale • More redundancy
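
The "all or nothing" bullet can be sketched with `sqlite3`, loosely following the bank example used later in the talk. The `account` table, the `CHECK` constraint, and the `transfer` helper are assumptions made for this illustration.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Credit dst, then debit src; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Violates the CHECK constraint if src would go negative.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.execute("INSERT INTO account VALUES ('me', 100), ('friend', 0)")
    conn.commit()
    first = transfer(conn, "me", "friend", 100)   # succeeds
    second = transfer(conn, "me", "friend", 50)   # overdraws, so it is undone
    balances = dict(conn.execute("SELECT name, balance FROM account"))
    conn.close()
    return first, second, balances
```

In the failed transfer the credit to `friend` is applied first and then rolled back when the debit is rejected, so no partial update survives.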
  • 14. Clustering • We hire another guy – Tells us that we hit the limit of the one machine – We need to scale out (horizontally) • Master/Slave – Assuming you read more than you write – Write to the Master and read from the Slaves – Master needs to replicate data across the Slaves » Risk of incorrect reads – How’s that consistent?!! • Sharding – Improves reads as much as writes – Can’t join across partitions – No referential integrity – Requires modification of client applications – Introduces a single point of failure – How’s that consistent?!!
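
As a rough sketch of the sharding idea above, the snippet below routes each key to one of several partitions by hashing it. `ShardedStore` and its in-memory dictionaries are hypothetical stand-ins for separate database servers, and the modulo scheme is just one simple routing choice.

```python
import hashlib


class ShardedStore:
    """Toy hash-based sharding: every key lives on exactly one shard."""

    def __init__(self, shard_count):
        # Each dict stands in for a separate database server.
        self.shards = [{} for _ in range(shard_count)]

    def _shard_for(self, key):
        # A stable hash keeps the key-to-shard mapping the same across runs.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, key, value):
        self.shards[self._shard_for(key)][key] = value

    def get(self, key):
        return self.shards[self._shard_for(key)].get(key)
```

Because a lookup only ever touches one shard, anything that needs rows from several shards at once, such as a join or a referential-integrity check, has to be handled by the client application.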
  • 15. What’s the Point? • We horizontally scale our relational database – We’re no longer consistent – No ACIDity? – We lose query flexibility • Are we doing something wrong?
  • 17. The CAP Theorem • Eric Brewer on distributed systems – Pick two out of • Consistency • Availability • Partition Tolerance • It is like Fast, Cheap, Good service – Cheap Good service won’t be Fast – Fast Good service won’t be Cheap – Fast Cheap service won’t be Good
  • 18. Relational Model & CAP • Relational Data Stores happen to favor – Consistency and Availability – For historical reasons • They are key to certain types of applications • The bank example – I deposit $100 in my friend’s bank account – Blah blah blah… • According to CAP, a CA system gives up Partition Tolerance, which means it cannot scale horizontally
  • 19. Scheiße! • We’re in a pickle – Too much data in a CA model – Vertical scaling • Too expensive • Not sustainable • Forced to explore other alternatives in light of CAP
  • 20. What AP Looks Like • Partition Tolerance – Since we reached the limit of the one machine we have no choice but to scale horizontally – Which means being partition tolerant • Availability – Nobody is willing to give it up most of the time – This becomes even better with distribution – In a cluster of servers • The individual node might be unreliable by itself • But the cluster as a whole is inherently reliable
  • 21. What AP Looks Like • According to CAP we simply cannot have C • Consistency – I make an update and all subsequent reads return the most recent value – Unfortunately this is impossible as it takes time for the change to be replicated across each node of the cluster • What a bummer! • Let’s look at an AP system – DNS (Domain Name System) • Not all the nodes have the most up-to-date records (You register a domain name and wait for a few days to guarantee that every DNS server knows about it)
  • 22. Eventual Consistency • This is not so bad – It means that we just settled for a lesser degree of Consistency • So what if – Mohammad in Morocco updated his relationship status to single on some edge node – His cousin who lives in Spain saw it immediately because they happen to be on the same edge node – His secret admirer Sara who lives in the United States could not see it until an hour later – His brother in Japan got the update the next day – They all got it eventually! • Eventual Consistency as opposed to Immediate Consistency
  • 23. The Compromise • We settle for a weaker consistency model – BASE • Basically Available • Soft state • Eventual Consistency • ACID on the individual node, BASE on the cluster
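
The relationship-status story can be simulated in a few lines. This is only a sketch under simplifying assumptions: replication is a queue drained one step at a time, there is a single writer, and there is no conflict resolution. The class and node names are invented.

```python
from collections import deque


class EventuallyConsistentCluster:
    """Toy cluster: a write lands on one node and reaches the others later."""

    def __init__(self, node_names):
        self.replicas = {name: {} for name in node_names}
        self.pending = deque()  # queued (target_node, key, value) updates

    def write(self, node, key, value):
        # The node that takes the write sees the new value immediately.
        self.replicas[node][key] = value
        for other in self.replicas:
            if other != node:
                self.pending.append((other, key, value))

    def read(self, node, key):
        return self.replicas[node].get(key)

    def replicate_one(self):
        """Deliver one queued update; returns False when nothing is left."""
        if not self.pending:
            return False
        target, key, value = self.pending.popleft()
        self.replicas[target][key] = value
        return True
```

Until the queue drains, a read can return a stale value depending on which node serves it; once it is empty, every node agrees.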
  • 24. The Slippery Slope of the Faithless
  • 25. You might as well Question… • Schema – Logical • Well-defined and rigid in relational databases • Why not a flexible one or even no schema? – Physical • B-trees in most relational databases • Why not use some other underlying data structure?
  • 26. You might as well Question… • Integrity Constraints – Who cares? • A Query Language – Anything would do… • Security – None • Name it…
  • 28. NoSQL • A wide range of specialized data stores with the goal of addressing the challenges of the relational model • Eric Evans – The whole point of seeking alternatives is that you need to solve a problem that relational databases are a bad fit for • Let me make it easier – It is not anti-SQL or anti-Relational – Any data store that is non-relational • “Not Only SQL” instead of “NO SQL”
  • 29. SQL vs. NoSQL • A single machine vs. a cluster • CA vs. AP/CA/CP • Scale vertically vs. scale horizontally • SQL vs. custom APIs • ACID vs. BASE • Full indexes vs. indexes mostly on keys • There are outliers of course
  • 30. SQL vs. NoSQL • Rigid schema vs. schema-less • Flexible queries vs. pre-defined queries • SQL (Relational) – Concerned with what the data consists of • NoSQL (Non-Relational) – Concerned with how the data is queried • There are outliers of course
  • 31.
  • 33. Key-Value Data Stores • Basically a big hash map (associative array) – Very simple – Very fast reads and writes – No secondary indexes • Use When – Your data is not highly related – All you need is basic CRUD • Challenges – Complex queries • Check out the Amazon Dynamo Paper – http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf • Featured Projects – DynamoDB http://aws.amazon.com/dynamodb/ – Riak http://wiki.basho.com/ – Redis http://redis.io/
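
A key-value store can be pictured as the hash map the slide describes. The sketch below is an in-memory illustration only; `KeyValueStore` and its method names are made up and do not mirror the API of DynamoDB, Riak, or Redis.

```python
class KeyValueStore:
    """Toy key-value store: basic CRUD by key, nothing else is indexed."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Create and update are the same operation.
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        """Remove a key; returns True if it existed."""
        return self._data.pop(key, None) is not None

    def scan(self, predicate):
        # With no secondary indexes, a query on values must visit every entry.
        return [key for key, value in self._data.items() if predicate(value)]
```

Lookups by key are a single hash probe, while `scan` shows why complex queries are listed as a challenge: without secondary indexes they degrade into a full pass over the data.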
  • 34. Columnar Stores • In a table, data of the same column is stored together – Storage is not wasted on null values as in row-based stores (RDBMS) – Great for sparse tables – Very fast column operations including aggregation • Use When – Big Data (Excellent leverage of Map Reduce) – Need compression or versioning • Challenges – You better know your access patterns beforehand – Key design is not trivial • Check out Google’s BigTable Paper – http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/archive/bigtable-osdi06.pdf • Featured Projects – HBase http://hbase.apache.org/ – Cassandra http://cassandra.apache.org/
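
The difference between row-oriented and column-oriented layout can be sketched with plain dictionaries. The sample rows and the helper names `to_columns` and `column_sum` are assumptions made for this illustration, not how HBase or Cassandra store data internally.

```python
# Hypothetical row-oriented data; the second row has no city.
SAMPLE_ROWS = [
    {"id": 1, "city": "Rabat", "amount": 10},
    {"id": 2, "city": None, "amount": 5},
    {"id": 3, "city": "Madrid", "amount": 7},
]


def to_columns(rows):
    """Regroup rows so that all values of one column are stored together."""
    columns = {}
    for index, row in enumerate(rows):
        for name, value in row.items():
            if value is not None:  # nulls take no space in the column
                columns.setdefault(name, {})[index] = value
    return columns


def column_sum(columns, name):
    """Aggregate one column without touching any other column."""
    return sum(columns.get(name, {}).values())
```

Summing `amount` reads a single column, and the missing `city` of the second row is simply absent rather than stored as a null.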
  • 35. Document Data Stores • Nested structures of hashes and their values – A document can be • Simply a hash and its value • Hash and another document as its value • No limit in depth – Very flexible schema – Well-indexed data – Works well with OOP (No impedance mismatch) – De-normalize as a best practice • Use when – You don’t know much about the schema – The schema is very likely to change • Challenges – Complex join-like queries – Self-referencing documents and circular dependencies • Projects – MongoDB http://www.mongodb.org/ – CouchDB http://couchdb.apache.org/
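
A document in this sense is just a nested structure of keys and values. The sketch below uses a made-up product document and a hypothetical `get_path` helper to show how nested fields can be reached without a fixed schema; it does not use MongoDB's or CouchDB's query syntax.

```python
# Hypothetical product document; the nesting depth is arbitrary.
PRODUCT = {
    "sku": "A-1",
    "name": "Kettle",
    "specs": {"power": {"watts": 1500}},
    "tags": ["kitchen"],
}


def get_path(document, path, default=None):
    """Follow a dotted path such as 'specs.power.watts' through nested dicts."""
    current = document
    for part in path.split("."):
        if isinstance(current, dict) and part in current:
            current = current[part]
        else:
            return default  # missing fields are normal when the schema is flexible
    return current
```

Two documents in the same collection may carry different fields, so readers have to tolerate missing paths, which is what the `default` argument is for.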
  • 36. Graph Data Stores • A graph – Perfect for highly interconnected data – Allows for explicit relationships – Fine-grained graph traversal – Very flexible – Works well with OOP (No impedance mismatch) • Use when – Your data looks like a graph and requires graph queries – You are smart enough not to try this on another data store • Challenges – Doesn’t scale well horizontally • Featured Projects – Neo4j http://neo4j.org/
  • 37. Relational Data Stores • Use when – Your data is highly relational – There is a need to break data into small pieces and assemble it in different ways – Consistency is king – Access patterns are unknown – Reporting • Challenges – Doesn’t scale well horizontally • Featured Projects – Oracle http://www.oracle.com/index.html – Postgres http://www.postgresql.org/ – MS SQL Server http://www.microsoft.com/sql-server – MySQL http://www.mysql.com/
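
A typical "graph question" is who can be reached from a person within a few hops. The following sketch answers it with a breadth-first traversal over an adjacency list; the `within_hops` function and the graph shape are invented for illustration and say nothing about how Neo4j executes traversals.

```python
from collections import deque


def within_hops(graph, start, max_hops):
    """Return {node: distance} for every node reachable in at most max_hops."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    del distance[start]
    return distance
```

Each hop follows an explicit relationship directly. In a relational store the same question usually means one self-join per hop, which is the pain the "don't try this on another data store" bullet hints at.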
  • 38. How do you choose?
  • 39. If It Doesn’t Fit, You Must Acquit! • Data – Does it have a natural structure? – How is it connected? – How is it distributed? – How much is there? • Access Patterns – Reads/Writes ratio? – Uniform or random? • CAP
  • 40. Other Considerations • Maturity • Stability • Maintainability • Durability • Cost • Tools • Familiarity
  • 42. For Fairness’ Sake! • Relational data stores did not fail us – They actually perform very well • We failed ourselves – By using them as solutions for problems they weren’t designed to solve to begin with • Misuse any data store and you’ll get just as much trouble
  • 43. For Fairness’ Sake! • You can’t expect – A flathead screwdriver to work on a Phillips screw as well as one with the matching Phillips blade – A crosshead screwdriver to work on a flathead screw
  • 45. Polyglot Persistence • Enterprise applications are complex and combine complex problems – The assumption that we should use one data store is absurd – You can’t try to fit everything into one model and expect no problems • Polyglot Persistence – To leverage multiple data stores, based on the way data is used by the application • Associated with a learning curve • Long-term investment (More productive in the long run) – Leverage the strengths of multiple data stores
  • 46. Polyglot Persistence • Example – MongoDB for the product catalog – Redis for shopping cart – DynamoDB for social profile info – Neo4j for the social graph – HBase for inbox and public feed messages – MySQL for payment and account info – Cassandra for audit and activity log • Disclaimer: I’m not making any recommendation here.
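
One way to picture the example above in application code is a thin repository that routes each kind of data to its own store. Everything here is a hypothetical sketch: `InMemoryStore` stands in for real clients of MongoDB, Redis, MySQL, and so on, and the routing table is not a recommendation.

```python
class InMemoryStore:
    """Stand-in for a real data store client; the label names the product."""

    def __init__(self, label):
        self.label = label
        self.items = {}

    def save(self, key, value):
        self.items[key] = value

    def load(self, key):
        return self.items.get(key)


class PolyglotRepository:
    """Routes each kind of data to the store chosen for its access pattern."""

    def __init__(self, routes):
        self.routes = routes  # maps a data kind to its store

    def store_for(self, kind):
        try:
            return self.routes[kind]
        except KeyError:
            raise ValueError("no data store configured for %r" % kind) from None

    def save(self, kind, key, value):
        self.store_for(kind).save(key, value)

    def load(self, kind, key):
        return self.store_for(kind).load(key)
```

The application talks to one interface while each data kind keeps its own storage model; the cost is that every additional store brings its own client, operations, and learning curve.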
  • 47. NoSQL in the Cloud
  • 48. NoSQL in the Cloud • NoSQL as a commodity – Fully managed data stores (No maintenance) – Elastic scaling – Cheap storage • Featured: – Amazon AWS – Heroku Add-ons – CloudFoundry
  • 50. The A’s to the Q’s in the Abstract • What does the rise of all these NoSQL data stores mean to my enterprise? – I’m guessing a lot • What is NoSQL to begin with? – Any non-relational data store • Does it mean “NO SQL”? – No • Could this be just another fad? – I don’t think so
  • 51. The A’s to the Q’s in the Abstract • Is it a good idea to bet the future of my enterprise on these new exotic technologies and simply abandon proven mature RDBMS? – It’s up to you. I will say “No guts, no glory!” • How scalable is scalable? – However much you need it to be
  • 52. The A’s to the Q’s in the Abstract • Assuming that I am sold, how do I choose the one that fits my needs the best? – I’ll tell you if you hire me • Is there a middle ground somewhere? – Polyglot Persistence • What is this Polyglot Persistence I hear about? – It’s the middle ground