NoSQL: it's not "No SQL", it's "Not Only SQL"
What is NOSQL?
•In the early days, relational databases allowed applications to store data through a standard data model and query language, SQL.
•Storage was expensive, and data schemas were fairly simple and straightforward. Since the rise of the web, the volume of data stored about users, objects, products and events has exploded.
•Data is also accessed more frequently and processed more intensively.
•Low-cost commodity cloud hardware has emerged to replace vertical scaling on highly complex and expensive single-server deployments.
•Engineers now use agile development methods, which aim for continuous deployment and short development cycles so that teams can respond quickly to user demand for features.
What was the need?
What drove the introduction of NoSQL?
•Trend 1: BigUsers
•Trend 2: Size (BigData)
•Trend 3: Connectedness (Interconnected Data)
•Trend 4: Semi-structure (Complex Data Structure)
•Trend 5: Architecture
Trend 1: BigUsers
Trend 2: Size (BigData)
Exabytes of data stored per year:
2006: 161
2007: 253
2008: 397
2009: 623
2010: 988
Source: neotechnology
Trend 3: Connectedness
• To handle hierarchical, nested data structures in SQL, you need multiple relational tables with all kinds of keys.
• Performance is tied to data complexity: it can degrade in a traditional RDBMS as we store the massive amounts of data required by social networking applications and the semantic web.
• Individualization of content
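The first point above can be sketched in a few lines. This is only an illustration: the `order` document, its field names, and the `to_relational_rows` helper are hypothetical and are not taken from any particular database.

```python
# A nested record as a document store might hold it (hypothetical data).
order = {
    "order_id": 1,
    "customer": {"name": "Ada", "city": "Pune"},
    "items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
}


def to_relational_rows(doc):
    """Split one nested document into rows for three flat tables linked by keys."""
    customers = [{"name": doc["customer"]["name"], "city": doc["customer"]["city"]}]
    orders = [{"order_id": doc["order_id"], "customer_name": doc["customer"]["name"]}]
    order_items = [
        {"order_id": doc["order_id"], "sku": item["sku"], "qty": item["qty"]}
        for item in doc["items"]
    ]
    return {"customers": customers, "orders": orders, "order_items": order_items}


tables = to_relational_rows(order)
```

One document becomes three tables joined through `order_id` and `customer_name`, which is the "all kinds of keys" overhead the slide refers to.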
Trend 4: Semi-structure (Complex Data Structure)
Source: couchbase.com
Trend 5: Architecture
Why is NoSQL in the spotlight?
1. Scaling.
Vertical Scaling (Relational):
•Adding more capacity means taking the existing actors in a system and increasing their individual power.
•A single server has to host the entire database to ensure reliability and continuous availability of data. This gets expensive quickly and places limits on scale.
Example: let’s assume you have 3 trucks that can each carry 25 felled trees per load, and it takes 1 hour to move each load down the road. Our maximum capacity will be:
3 trucks * 25 trees/load * 1 load/hour = 75 trees processed per hour
Assuming we’ve chosen a vertical scaling capacity model, what if we wanted to process 150 felled trees per hour?
We’d need to do one of two things:
1. either double the carrying capacity of each truck (50 trees per load),
2. or halve the time it takes each truck to process a load (30 minutes).
3 trucks * 50 trees/load * 1 load/hour = 150 trees processed per hour
OR
3 trucks * 25 trees/load * 2 loads/hour = 150 trees processed per hour
We haven’t increased the number of actors in the system, but we have increased the productivity of each actor to achieve the desired jump in capacity.
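The arithmetic in this example can be checked with a small sketch. The function name `trees_per_hour` and its parameters are made up for illustration.

```python
def trees_per_hour(trucks: int, trees_per_load: int, minutes_per_load: float) -> float:
    """Throughput = trucks * trees per load * loads per hour."""
    loads_per_hour = 60 / minutes_per_load
    return trucks * trees_per_load * loads_per_hour


baseline = trees_per_hour(trucks=3, trees_per_load=25, minutes_per_load=60)
# Vertical scaling option 1: same trucks, double the load size.
bigger_trucks = trees_per_hour(trucks=3, trees_per_load=50, minutes_per_load=60)
# Vertical scaling option 2: same trucks, half the time per load.
faster_trucks = trees_per_hour(trucks=3, trees_per_load=25, minutes_per_load=30)
```

In both options the truck count stays at 3; only the per-truck productivity changes.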
Horizontal Scaling (NoSQL):
•Instead of increasing the capacity of each individual actor in the system, we simply add more actors to the system.
•Capacity grows by adding servers instead of concentrating more capacity in a single server.
In our lumber harvesting example, this means adding more trucks to move the lumber. So when we need to increase our capacity from 75 trees per hour to 150 trees per hour, we simply add 3 more trucks:
6 trucks * 25 trees/load * 1 load/hour = 150 trees processed per hour
The productivity of each actor in the system remains the same, but we’ve added more trucks to the system.
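As a rough sketch of the horizontal model, the number of trucks follows from the target throughput while per-truck productivity stays fixed. The helper `trucks_needed` and its default values are assumptions taken from the lumber example, not from any real system.

```python
import math


def trucks_needed(target_trees_per_hour: float,
                  trees_per_load: int = 25,
                  loads_per_hour: float = 1) -> int:
    """Number of identical trucks required to reach the target throughput."""
    per_truck = trees_per_load * loads_per_hour
    return math.ceil(target_trees_per_hour / per_truck)


current_trucks = 3
required_trucks = trucks_needed(150)
extra_trucks = required_trucks - current_trucks
```

The same function answers "how many servers do we add?" once trucks are read as servers and trees as requests.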
2. Dynamic Schemas.
Dynamic Schemas:
How relational databases do it:
•Relational databases require that schemas be defined before you can add data.
•This fits poorly with agile development approaches, because each time you complete new features, the schema of your database often needs to change.
•If the database is large, this is a very slow process that involves significant downtime.
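The schema-first requirement can be seen with Python's built-in `sqlite3` module. This is only a sketch: the `users` table and its columns are hypothetical, and SQLite adds a column quickly, so the example shows the ordering constraint (change the schema, then insert), not the migration cost on a large production database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("Ada",))

# A new feature wants to store an email, but the schema has no such column yet.
try:
    conn.execute(
        "INSERT INTO users (name, email) VALUES (?, ?)", ("Bob", "bob@example.com")
    )
    insert_failed = False
except sqlite3.OperationalError:
    insert_failed = True

# The schema has to change first; only then does the insert succeed.
conn.execute("ALTER TABLE users ADD COLUMN email TEXT")
conn.execute(
    "INSERT INTO users (name, email) VALUES (?, ?)", ("Bob", "bob@example.com")
)
rows = conn.execute("SELECT name, email FROM users ORDER BY id").fetchall()
```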
Dynamic Schemas:
How NoSQL does it:
•NoSQL databases are built to allow the insertion
of data without a predefined schema.
•That makes it easy to make significant
application changes in real-time, without
worrying about service interruptions – which
means development is faster, code integration is
more reliable, and less database administrator
time is needed.
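A minimal in-memory sketch of schemaless insertion follows. `DocumentCollection` is a toy class written for this illustration and does not reproduce the API of any real NoSQL product.

```python
class DocumentCollection:
    """Toy document collection: every record is a free-form dict."""

    def __init__(self):
        self._docs = {}
        self._next_id = 1

    def insert(self, doc: dict) -> int:
        """Store a copy of the document and return its generated id."""
        doc_id = self._next_id
        self._next_id += 1
        self._docs[doc_id] = dict(doc)
        return doc_id

    def find(self, **criteria):
        """Return documents whose fields equal all the given criteria."""
        return [
            doc for doc in self._docs.values()
            if all(doc.get(key) == value for key, value in criteria.items())
        ]


users = DocumentCollection()
users.insert({"name": "Ada"})
# New fields appear without any schema change or migration step.
users.insert({"name": "Bob", "email": "bob@example.com"})
users.insert({"name": "Cy", "tags": ["admin"], "address": {"city": "Pune"}})
```

The trade-off is that the application, not the database, now has to cope with documents that lack a given field.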
3. Sharding.
Sharding:
How relational databases do it:
•Sharding is the process of storing data records across multiple machines.
•Because SQL databases scale vertically, sharding them requires complex arrangements to make several machines act as a single server.
Sharding:
How NoSQL does it:
•NoSQL databases natively and automatically spread data across an arbitrary number of servers, without requiring the application to even be aware of the composition of the server pool.
•Data and query load are automatically balanced across servers, and when a server goes down, it can be quickly and transparently replaced with no application disruption (replication).
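One common way to spread records across servers is to hash each key and pick a shard from the hash. The sketch below assumes that approach; `ShardedStore` and the shard names are hypothetical, and real systems typically use more elaborate schemes (for example consistent hashing or range partitioning) so that adding a server does not remap most keys.

```python
import hashlib


class ShardedStore:
    """Toy key-value store that routes each key to one of several shards."""

    def __init__(self, shard_names):
        self._names = list(shard_names)
        self.shards = {name: {} for name in self._names}

    def _shard_for(self, key: str) -> str:
        # A stable hash keeps the key-to-shard mapping identical across runs.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._names)
        return self._names[index]

    def put(self, key: str, value) -> None:
        self.shards[self._shard_for(key)][key] = value

    def get(self, key: str):
        return self.shards[self._shard_for(key)].get(key)


store = ShardedStore(["shard-0", "shard-1", "shard-2"])
for i in range(100):
    store.put(f"user:{i}", {"id": i})
```

The caller only ever uses `put` and `get`; which shard holds a key stays hidden, which mirrors the "application is not aware of the server pool" point above.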
4. Replication.
Replication:
•NoSQL databases also support data replication, storing multiple copies of data across the cluster, and even across data centers, to ensure high availability and support disaster recovery.
•A properly managed NoSQL database system should never need to be taken offline for any reason, supporting continuous, year-round operation of applications.
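The idea of keeping several copies so that a failed server does not interrupt reads can be sketched as below. `ReplicatedStore`, its node names, and the simple placement rule are assumptions made for illustration; the sketch skips re-synchronising a node that comes back online.

```python
class ReplicatedStore:
    """Toy store that writes each key to several nodes and reads from any live copy."""

    def __init__(self, node_names, copies=2):
        self._names = list(node_names)
        self.nodes = {name: {} for name in self._names}
        self.online = set(self._names)
        self.copies = copies

    def _replicas_for(self, key: str):
        # Simple deterministic placement: a start node plus its successors.
        start = sum(key.encode("utf-8")) % len(self._names)
        return [self._names[(start + i) % len(self._names)] for i in range(self.copies)]

    def put(self, key: str, value) -> None:
        for name in self._replicas_for(key):
            if name in self.online:
                self.nodes[name][key] = value

    def get(self, key: str):
        for name in self._replicas_for(key):
            if name in self.online and key in self.nodes[name]:
                return self.nodes[name][key]
        raise KeyError(key)

    def fail(self, name: str) -> None:
        """Simulate a node going down."""
        self.online.discard(name)


store = ReplicatedStore(["a", "b", "c"], copies=2)
store.put("order:1", "paid")
first_replica = store._replicas_for("order:1")[0]
store.fail(first_replica)
value_after_failure = store.get("order:1")
```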
5. Integrated Caching.
Integrated Caching:
How relational databases do it:
•In relational technology, the caching tier is usually a separate infrastructure tier that must be developed against, deployed on separate servers, and explicitly managed by the operations team.
Integrated Caching:
How NoSQL does it:
•To reduce latency and increase sustained data throughput, NoSQL databases transparently cache data in system memory.
•This behavior is transparent to the application developer and the operations team.
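A built-in cache of this kind can be sketched as a store that keeps recently read values in memory in front of slower storage. `CachedStore` is a hypothetical class, and the plain dict standing in for disk is an assumption made to keep the example self-contained.

```python
class CachedStore:
    """Toy store with a built-in read-through, write-through memory cache."""

    def __init__(self, backing: dict):
        self._backing = backing   # stands in for slow, persistent storage
        self._cache = {}          # in-memory copies of recently used values
        self.disk_reads = 0

    def get(self, key):
        if key in self._cache:
            return self._cache[key]
        # Cache miss: read from the slow tier once, then remember the value.
        self.disk_reads += 1
        value = self._backing[key]
        self._cache[key] = value
        return value

    def put(self, key, value) -> None:
        # Write-through: update storage and cache together so reads stay consistent.
        self._backing[key] = value
        self._cache[key] = value


store = CachedStore({"profile:1": {"name": "Ada"}})
first = store.get("profile:1")
second = store.get("profile:1")
```

The caller only sees `get` and `put`; there is no separate cache tier to deploy or invalidate by hand, which is the contrast the slides draw with the relational setup.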

Nosql- Introduction for Beginners
