Characteristics of NoSQL databases
A NoSQL database in action!
A database is an organized collection of data. The data are
typically organized to model relevant aspects of reality in a way
that supports processes requiring this information.
Database management systems (DBMSs) are specially designed applications
that interact with the user, other applications, and the database
itself to capture and analyze data.
Formally, the term database refers to the data itself and
supporting data structures. Databases are created to operate
on large quantities of information by inputting, storing, retrieving,
and managing that information.
SQL is an ANSI and ISO standard computer language for creating and
SQL allows the user to create, update, delete, and retrieve data from a database.
SQL is very simple and easy to learn.
High Speed: SQL queries can be used to retrieve large numbers of
records from a database quickly and efficiently.
Well-Defined Standards Exist: SQL databases use a long-established standard,
which has been adopted by ANSI and ISO. Non-SQL databases do not adhere to
any clear standard.
No Coding Required: Using standard SQL, it is easier to manage database
systems without having to write a substantial amount of code.
Transactions – ACID Properties (Atomic, Consistent, Isolated, Durable)
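The points above (create, update, delete, retrieve, plus ACID transactions) can be sketched with Python's built-in sqlite3 module. This is only an illustration: the accounts table, the names in it, and the transfer() helper are made up for this example, not taken from any system in this talk.

```python
import sqlite3

# In-memory database; the "accounts" table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)

# Create rows.
with conn:
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)",
        [("alice", 100), ("bob", 50), ("carol", 0)],
    )

# Delete a row.
with conn:
    conn.execute("DELETE FROM accounts WHERE name = ?", ("carol",))


def balances(conn):
    """Retrieve all rows as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


def transfer(conn, src, dst, amount):
    """Atomic update: both balance changes apply, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Violates the CHECK constraint if src would go negative.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first_ok = transfer(conn, "alice", "bob", 30)    # enough funds
second_ok = transfer(conn, "alice", "bob", 500)  # rejected and rolled back
print(first_ok, second_ok, balances(conn))
```

The second transfer credits bob before the debit fails, so the unchanged balances afterwards show the rollback undoing a partial update.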
What has happened?
Relational databases were introduced in the 1970s to allow applications to
store data through a standard data modeling and query language (SQL). Since
the rise of the web, the volume of data stored about users, objects,
products and events has exploded. Data is also accessed more frequently,
and is processed more intensively – for example, social networks create
hundreds of millions of customized, real-time activity feeds for users based
on their connections' activities.
In response to this demand, computing infrastructure and deployment
strategies have also changed dramatically. Low-cost, commodity cloud
hardware has emerged to replace vertical scaling on highly complex and
expensive single-server deployments. And engineers now use agile
development methods, which aim for continuous deployment and short
development cycles, to allow for quick response to user demand for
But... What’s NoSQL?
A NoSQL database provides a
mechanism for storage and retrieval
of data that employs less constrained
consistency models than traditional
NoSQL systems are also referred to as
"Not only SQL" to emphasize that
they do in fact allow SQL-like query
languages to be used.
Large data volumes (such as Google’s ‘big data’)
Scalable replication and distribution
Potentially thousands of machines
Potentially distributed around the world
Queries need to return answers quickly
Mostly query, few updates
Asynchronous Inserts & Updates
ACID transaction properties are not needed: BASE (Basically Available, Soft state, Eventually consistent) is used instead.
Open source development
According to the CAP theorem, a distributed
system cannot satisfy all three of these
guarantees (consistency, availability, and
partition tolerance) at the same time.
Eventual consistency guarantees that if no
new updates are made to a given data item,
eventually all accesses to that item will
return the last updated value.
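A toy simulation of eventual consistency, as a sketch only. The Replica class, the integer version numbers (last write wins), and the anti_entropy() sync round are assumptions chosen for illustration; real stores use more elaborate mechanisms.

```python
class Replica:
    """One copy of the data; each key maps to (version, value)."""

    def __init__(self, name):
        self.name = name
        self.store = {}

    def write(self, key, value, version):
        # Last write wins: keep the entry with the highest version.
        current = self.store.get(key)
        if current is None or version > current[0]:
            self.store[key] = (version, value)

    def read(self, key):
        entry = self.store.get(key)
        return entry[1] if entry else None

    def sync_from(self, other):
        # Merge everything the other replica knows.
        for key, (version, value) in other.store.items():
            self.write(key, value, version)


def anti_entropy(replicas):
    """One full round: every replica merges state from every other one."""
    for target in replicas:
        for source in replicas:
            if source is not target:
                target.sync_from(source)


replicas = [Replica("a"), Replica("b"), Replica("c")]
replicas[0].write("profile:42", "v1", version=1)
anti_entropy(replicas)

# A new update reaches only replica "b" at first.
replicas[1].write("profile:42", "v2", version=2)
stale_reads = [r.read("profile:42") for r in replicas]

# No further updates: after syncing, every replica returns the last value.
anti_entropy(replicas)
converged_reads = [r.read("profile:42") for r in replicas]
print(stale_reads, converged_reads)
```

Between the write and the next sync round, two replicas still return the old value; that window is what the weaker consistency model allows.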
The basic classification that most would
agree on is based on data model. A few
of these and their prototypes are:
Column: HBase, Accumulo
Document: MongoDB, Couchbase
Key-value: Dynamo, Riak, Redis, Cache,
Graph: Neo4j, Allegro, Virtuoso
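To make the four data models concrete, here is the same small piece of data laid out in each shape using plain Python structures. The user record, keys, and field names are invented for illustration and do not reflect the actual storage format of any product listed above.

```python
import json

# Key-value: an opaque value addressed by a single key.
key_value = {"user:42": json.dumps({"name": "Ada", "city": "London"})}

# Document: the value is a structured document whose fields can be queried.
document = {
    "_id": 42,
    "name": "Ada",
    "address": {"city": "London"},
    "follows": [7],
}

# Column (wide-column): row key -> column family -> column -> value.
wide_column = {
    "42": {
        "info": {"name": "Ada", "city": "London"},
        "follows": {"7": ""},
    }
}

# Graph: nodes plus typed edges between them.
nodes = {42: {"name": "Ada"}, 7: {"name": "Grace"}}
edges = [(42, "FOLLOWS", 7)]

# The same question ("which city?") asked of each model.
city_from_kv = json.loads(key_value["user:42"])["city"]  # must decode the whole value
city_from_doc = document["address"]["city"]
city_from_col = wide_column["42"]["info"]["city"]

# A relationship question fits the graph model most directly.
followed_names = [nodes[dst]["name"] for src, rel, dst in edges
                  if src == 42 and rel == "FOLLOWS"]
print(city_from_kv, city_from_doc, city_from_col, followed_names)
```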
A MapReduce program is composed of a Map() procedure that performs
filtering and sorting (such as sorting students by first name into queues, one
queue for each name) and a Reduce() procedure that performs a summary
operation (such as counting the number of students in each queue, yielding
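The student example above can be sketched in a single Python process. This is a simplified illustration: a real MapReduce framework distributes the map, shuffle, and reduce steps across many machines, and the function names and sample student list here are made up.

```python
from collections import defaultdict


def map_phase(students):
    """Map(): emit one (first_name, 1) pair per student."""
    for full_name in students:
        first_name = full_name.split()[0]
        yield first_name, 1


def shuffle(pairs):
    """Group values by key: one 'queue' per first name."""
    queues = defaultdict(list)
    for key, value in pairs:
        queues[key].append(value)
    return queues


def reduce_phase(queues):
    """Reduce(): summarize each queue, here by counting its entries."""
    return {name: sum(values) for name, values in sorted(queues.items())}


students = ["Ana Lopez", "Luis Diaz", "Ana Ruiz", "Eva Mora", "Luis Vega"]
name_counts = reduce_phase(shuffle(map_phase(students)))
print(name_counts)
```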
NoSQL is not a magic solution
Inconsistent APIs between NoSQL providers.
Denormalized data requires you to maintain your own data relationships
Not a lot of real operational power for DevOps / IT.
Lack of complicated queries requires joins / aggregations / filters to be
done in code (except for MapReduce).
Need whole value from the key to read or write any partial information.
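Two of the drawbacks above, joins done in application code and whole-value reads and writes, can be sketched as follows. A plain dict stands in for a key-value store, and the get/put helpers and key layout are assumptions for illustration, not any vendor's API.

```python
import json

# A plain dict standing in for a key-value store with opaque string values.
store = {
    "user:1": json.dumps({"name": "Ada", "order_ids": [10, 11]}),
    "order:10": json.dumps({"total": 25}),
    "order:11": json.dumps({"total": 40}),
}


def get(key):
    """Fetch and decode the whole value for a key."""
    return json.loads(store[key])


def put(key, value):
    """Encode and overwrite the whole value for a key."""
    store[key] = json.dumps(value)


def order_total_for(user_key):
    """Join + aggregation in code: follow the user's order ids by hand."""
    user = get(user_key)
    return sum(get(f"order:{order_id}")["total"] for order_id in user["order_ids"])


def rename_user(user_key, new_name):
    """Changing one field still means reading and rewriting the full value."""
    user = get(user_key)
    user["name"] = new_name
    put(user_key, user)


total = order_total_for("user:1")
rename_user("user:1", "Grace")
renamed = get("user:1")
print(total, renamed)
```

In SQL the first function would be a single JOIN with SUM(); here the application owns the relationship between users and orders.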
NoSQL Use Cases:
SAP uses MongoDB as a core component of SAP’s platform-as-a-service
Foursquare uses MongoDB to store venues and user ‘check-ins’ into
venues, sharding the data over more than 25 machines on Amazon EC2.
MongoDB is used for back-end storage on the SourceForge front pages,
project pages, and download pages for all projects.
Codecademy is the easiest way to learn to code online.
Guardian.co.uk is a leading UK-based news website.
EA Sports: MongoDB is being used for the game feeds component.
AOL: “We selected Couchbase after evaluating several open source products
to power our next-generation backend ad serving platform”.
Zynga’s FarmVille, Café World, Mafia Wars and other games have over 235
million active users per month. We rely on technology from Couchbase to
make that possible.
In the PayPal Media Network Advertising Pipeline, Couchbase is used to build
a scalable cross channel audience profiling, segmentation, identity mapping
& frequency capping.
LinkedIn built a durable and scalable index for its metrics visualization
engine using Couchbase.
Skyscanner scaled one of its flight search APIs from 100,000 searches a day
to over 3 million after introducing Couchbase into its tech stack.
Other use cases
Netflix is using Amazon SimpleDB.
Twitter uses Cassandra, Hadoop, and HBase, among others.
Facebook and Instagram are both using Cassandra.
Google uses BigTable (equivalent to Hadoop HBase).
LinkedIn uses Voldemort.
This is just the tip of the iceberg.
From here on, the rest is up to you!
SQL works great, but can't scale for
NoSQL works great, but can't fit for
Use SQL + NoSQL
designed for human-readable data interchange. Derived from the
data structures and associative arrays, called objects. Despite its
available for many languages.