NoSQL
SIMPLE INTRODUCTION
Why we need databases
Availability: data is made available to a wide variety of users.
Integrity: the data held in the database is reliable.
Security: only authorized users can access the data.
Independence: users work with an abstract view of the data and do not depend on how it is
stored in the database.
The Relational Model
The traditional databases in use today (SQL Server, Oracle, MySQL, etc.).
The idea of the relational model dates from 1970.
It is a very stable architecture.
It is still the most popular model for storing data in web and business applications.
It has a fixed structure for the database schema.
Relational databases depend on transactions and their ACID rules.
ACID Rules
Atomic: A transaction is a logical unit of work. Either all of its data modifications are
performed, or none of them is.
Consistent: At the end of the transaction, all data must be left in a consistent state.
Isolated: Modifications of data performed by a transaction must be independent of other
concurrent transactions. Without isolation, the outcome of a transaction may be erroneous.
Durable: Once the transaction is completed, the effects of its modifications must be
permanent in the system.
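The atomicity rule can be illustrated with a small sketch using Python's built-in sqlite3 module. The accounts table, the names, and the transfer helper are assumptions made up for this example. A transfer that would leave a negative balance is rolled back as a whole, so the earlier debit does not survive.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts as one atomic transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Abort: the debit above is undone by the rollback.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False

def balances(conn):
    """Return all balances as a dict, for inspection."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))

# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()
```

Calling transfer(conn, "alice", "bob", 500) returns False and leaves both balances untouched, which is the "all or nothing" behaviour described above.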
Problems with the Relational Model
Some data types are not well suited to traditional databases (graph, unstructured).
Scalability problems, and the cost of solving them.
Agility challenges facing modern applications, which favour schema-free storage.
Distributed Systems
A distributed system consists of multiple computers and software components that
communicate through a computer network.
A distributed system can consist of any number of possible configurations, such as mainframes,
workstations, personal computers, and so on.
The computers interact with each other and share the resources of the system to achieve a
common goal.
CAP Theorem
The CAP theorem names three basic requirements for applications designed for a distributed
architecture, of which only two can be fully guaranteed at the same time.
Consistency - The data in the database remains consistent after an operation executes.
For example, after an update operation all clients see the same data.
Availability - The system is always on (guaranteed service availability), with no
downtime.
Partition Tolerance - The system continues to function even when communication
among the servers is unreliable (nodes are up, but cannot communicate).
Example: if the network stops delivering messages between two sets of servers, does the system
continue to work correctly?
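The trade-off can be made concrete with a toy model. This is only a sketch under stated assumptions: the TinyCluster class, its two replicas, and the "CP" / "AP" mode labels are invented for illustration and do not describe any real database.

```python
class Replica:
    """One copy of the data."""
    def __init__(self):
        self.data = {}

class TinyCluster:
    """Two replicas of one key-value store; mode is 'CP' or 'AP' (hypothetical labels)."""
    def __init__(self, mode):
        self.mode = mode
        self.a, self.b = Replica(), Replica()
        self.partitioned = False  # True when the replicas cannot talk to each other

    def write(self, replica, key, value):
        if not self.partitioned:
            # Healthy network: update both copies, so every client sees the same data.
            self.a.data[key] = value
            self.b.data[key] = value
            return True
        if self.mode == "CP":
            # Stay consistent by refusing the write: availability is given up.
            return False
        # AP: stay available by accepting the write locally; the copies may now disagree.
        replica.data[key] = value
        return True

    def read(self, replica, key):
        return replica.data.get(key)
```

During a partition the "CP" cluster rejects writes and both replicas keep returning the old value, while the "AP" cluster accepts the write and the two replicas return different values until they can communicate again.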
CAP Theorem
In theory, it is impossible to fulfill all three requirements
at once.
Relational Databases achieve
◦ Consistency
All clients always have the same view of data
◦ Availability
Single site cluster, therefore all nodes are always in
contact
◦ No Partition Tolerance
What is NoSQL
Non-relational database management systems
Designed for distributed data stores with very large-scale data storage needs
Scales horizontally
Schema free
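A rough sketch of what "scales horizontally" and "schema free" can look like in practice is shown below. It assumes simple hash-based partitioning of keys across nodes; the ShardedStore class and shard_for function are hypothetical names, and real systems use more elaborate schemes.

```python
import hashlib

def shard_for(key, num_nodes):
    """Pick a node index for a key using a stable hash (same key, same node)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes

class ShardedStore:
    """Spreads documents over several in-memory 'nodes' (plain dicts here)."""
    def __init__(self, num_nodes):
        self.nodes = [dict() for _ in range(num_nodes)]

    def put(self, key, doc):
        self.nodes[shard_for(key, len(self.nodes))][key] = doc

    def get(self, key):
        return self.nodes[shard_for(key, len(self.nodes))].get(key)

store = ShardedStore(num_nodes=3)
# Schema free: the two documents do not share the same fields.
store.put("user:1", {"name": "Sara", "email": "sara@example.com"})
store.put("user:2", {"name": "Omar", "phones": ["555-0100", "555-0101"]})
```

Adding capacity means adding nodes rather than buying a bigger machine. Note that with this naive modulo scheme, changing the number of nodes moves most keys, which is one reason production systems prefer techniques such as consistent hashing.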
Why NoSQL
The huge volume of content generated all the time by humans,
devices, machines, etc.
Certain types of data (graph, unstructured) that do not suit relational databases
Agility challenges facing modern applications
History of NoSQL
The term NoSQL was coined by Carlo Strozzi in 1998. He used it to name his
database, which did not have an SQL interface.
In early 2009, at an event on open-source distributed databases, the term was reused to refer
to databases that are non-relational, distributed, and do not conform to ACID.
In the same year, NoSQL was discussed at the "no:sql(east)" conference held in Atlanta, USA.
After that, discussion and practice of NoSQL gained momentum, and NoSQL saw
unprecedented growth.
NoSQL Consistency
According to the CAP theorem, it is impossible to fulfill all three requirements at once.
NoSQL databases achieve
◦ No strict consistency
Some clients may see different views of the data
◦ Availability
The system stays up and keeps answering requests
◦ Partition Tolerance
The system continues to function even when communication among the servers is unreliable
NoSQL Consistency
In 2007, Amazon discovered that every 100 ms of latency on the Amazon website cost them 1% in
sales. At the time their annual sales were around $14.7 billion, and 1% of $14.7 billion is about
$147 million, a lot of sales to lose.
They outlined an approach for a new kind of database, one that guaranteed Availability and
Partition Tolerance at the expense of Consistency.
Such databases rely on Eventual Consistency, where the data becomes consistent in the end (after
some time).
For a bank, where transactions have to be consistent, that just wouldn’t work. For companies
like Google, it’s acceptable.
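Eventual consistency can be sketched as follows. This toy model assumes a "last write wins" rule with caller-supplied logical timestamps and a manual sync step; the Node class and sync function are illustrative names, not the design of any particular product.

```python
class Node:
    """One replica; each key maps to a (timestamp, value) pair."""
    def __init__(self):
        self.store = {}

    def write(self, key, value, timestamp):
        current = self.store.get(key)
        # Last write wins: keep the entry with the newest timestamp.
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def read(self, key):
        entry = self.store.get(key)
        return None if entry is None else entry[1]

def sync(x, y):
    """Exchange entries in both directions; the newest timestamp wins per key."""
    for src, dst in ((x, y), (y, x)):
        for key, (timestamp, value) in list(src.store.items()):
            dst.write(key, value, timestamp)

n1, n2 = Node(), Node()
n1.write("cart", ["book"], timestamp=1)          # accepted by one replica
n2.write("cart", ["book", "pen"], timestamp=2)   # a later write lands on the other
```

Before sync(n1, n2) runs, the two nodes answer the same read differently. After it runs, both return the newer value, which is the "consistent in the end" behaviour described above.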
Types of NoSQL Databases
Key-Value Store
A big hash table of keys and values. {Riak, Amazon S3 (Dynamo)}
Document-based Store
Stores documents made up of tagged elements. {CouchDB, MongoDB}
Column-based Store
Each storage block contains data from only one column. {HBase, Cassandra}
Graph-based Store
A network database that uses nodes and edges to represent and store data. {Neo4j}
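The four models can be compared side by side using plain Python structures. This is only an illustration of the shape of the data in each model as described above; the sample records and the neighbours helper are invented, and none of this is the API of the products listed.

```python
# Key-value: the store sees an opaque value and looks it up by key only.
key_value = {"user:42": '{"name": "Sara"}'}

# Document: a self-describing record with nested, tagged fields.
document = {
    "_id": 42,
    "name": "Sara",
    "tags": ["admin"],
    "address": {"city": "Cairo"},
}

# Column-based: the values of one column are stored together,
# so scanning a single column does not touch the others.
columns = {
    "name": ["Sara", "Omar"],
    "age": [31, 27],
}
average_age = sum(columns["age"]) / len(columns["age"])

# Graph: nodes plus labelled edges between them.
nodes = {"sara", "omar", "lina"}
edges = [("sara", "follows", "omar"), ("omar", "follows", "lina")]

def neighbours(node, relation):
    """Return the nodes reachable from `node` over one edge of type `relation`."""
    return [dst for src, rel, dst in edges if src == node and rel == relation]
```

The same person could be stored in any of the four forms. The choice depends on whether the workload is dominated by key lookups, whole-record reads, per-column scans, or relationship traversal.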
Advantages and Disadvantages
Advantages                                   Disadvantages
High scalability                             No standardization
Distributed computing                        Limited query capabilities (so far)
Lower cost                                   Eventual consistency
Schema flexibility, semi-structured data     Less support and fewer tools than relational databases
No complicated relationships
Very easy software development
Example of NoSQL (MongoDB)
Document-oriented database
Queries are written in JSON format
Aggregation can be done with the Aggregation
Pipeline or MapReduce
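The idea of a query written as a JSON-style document can be sketched with a toy matcher in plain Python. This is not the MongoDB driver or its implementation: matches, find, and group_sum are hypothetical helpers, the orders data is made up, and only equality and the $gt (greater than) operator are handled.

```python
def matches(doc, query):
    """Return True if `doc` satisfies every field condition in `query`."""
    for field, condition in query.items():
        value = doc.get(field)
        if isinstance(condition, dict):
            for op, operand in condition.items():
                if op != "$gt":  # only $gt is supported in this sketch
                    raise ValueError(f"unsupported operator: {op}")
                if value is None or not value > operand:
                    return False
        elif value != condition:
            return False
    return True

def find(collection, query):
    """Filter stage: keep the documents that match the query document."""
    return [doc for doc in collection if matches(doc, query)]

def group_sum(collection, key_field, sum_field):
    """Grouping stage: total `sum_field` per distinct value of `key_field`."""
    totals = {}
    for doc in collection:
        totals[doc[key_field]] = totals.get(doc[key_field], 0) + doc[sum_field]
    return totals

orders = [
    {"customer": "sara", "status": "paid", "total": 40},
    {"customer": "omar", "status": "paid", "total": 15},
    {"customer": "sara", "status": "open", "total": 25},
]
```

Chaining the two helpers, for example group_sum(find(orders, {"status": "paid"}), "customer", "total"), mirrors the filter-then-group shape of an aggregation pipeline, where the output of one stage feeds the next.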
NewSQL
A class of modern relational database management systems that seek to provide the same
scalable performance as NoSQL systems for online transaction processing (OLTP) read-write
workloads while still maintaining the ACID guarantees of a traditional database system.
References
https://www.red-gate.com/simple-talk/opinion/opinion-pieces/does-nosql--nodba/?utm_source=simpletalk&utm_medium=weblink&utm_content=sombrero&utm_campaign=magazine
https://www.w3resource.com/mongodb/nosql.php
https://www.mongodb.com/nosql-explained?jmp=footer
https://en.wikipedia.org/wiki/NewSQL
Demo

Introduction to NoSQL (in Arabic)
