2. NoSQL
NoSQL originally referred to "non-SQL" or "non-relational", and is now
often read as "not only SQL". NoSQL databases are typically open source,
horizontally scalable, and schema-free.
A NoSQL database provides a mechanism for storage and retrieval of
data that is modeled by means other than the tabular relations
used in relational databases.
The data structures used by NoSQL databases (e.g. key-value, graph,
document, wide-column) are different from those used by default in
relational databases, making some operations faster in NoSQL.
3. SQL vs NoSQL
SQL databases are primarily called relational databases (RDBMS), whereas
NoSQL databases are primarily called non-relational or distributed databases.
SQL databases are table-based, whereas NoSQL databases are
document-based, key-value, graph, or wide-column stores.
SQL databases have a predefined schema, whereas NoSQL databases have
a dynamic schema for unstructured data.
SQL databases are typically vertically scalable, whereas NoSQL databases are
horizontally scalable.
SQL databases use SQL (Structured Query Language). NoSQL databases have no
single standard query language: UnQL (Unstructured Query Language) was proposed
but never widely adopted, so query syntax varies from database to database.
SQL database examples: MySQL, Oracle, SQLite, PostgreSQL, and Microsoft SQL Server.
NoSQL database examples: MongoDB, Bigtable, Redis, Cassandra, HBase, Neo4j,
and CouchDB.
4. SQL vs NoSQL
Complex queries
SQL: A good fit for complex, query-intensive environments.
NoSQL: Not a good fit for complex queries; NoSQL databases have no standard
interface for performing them.
Data to be stored
SQL: Not the best fit for hierarchical data storage.
NoSQL: A better fit for hierarchical data storage, because data is stored as
key-value pairs, similar to JSON. NoSQL databases are highly preferred for
large data sets.
Scalability
SQL: Vertically scalable. You manage increasing load by increasing the CPU,
RAM, SSD, etc. on a single server.
NoSQL: Horizontally scalable. You can add a few more servers to your NoSQL
database infrastructure to handle heavy traffic.
High transactional load
SQL: The best fit for heavy-duty transactional applications, as it is more
stable and promises the atomicity as well as the integrity of the data.
NoSQL: Can be used for transactions, but is still not comparable or stable
enough under high load and for complex transactional applications.
Properties
SQL: Emphasizes ACID properties (Atomicity, Consistency, Isolation, and
Durability).
NoSQL: Follows Brewer's CAP theorem (Consistency, Availability, and
Partition tolerance).
5. Database Types
RDBMS: Relational database management system.
OLAP: Online analytical processing.
It's an approach to answering multi-dimensional analytical (MDA)
queries swiftly in computing. OLAP consists of three basic
analytical operations: consolidation (roll-up), drill-down, and
slicing and dicing.
Example: Oracle Express database.
NoSQL: Not Only SQL.
6. Relational Databases
Flat file systems were created first.
Problem:-
No standard implementation.
The relational database was the answer (1970s).
• E. F. Codd.
10. The Great Divide
MongoDB tries to achieve the performance of traditional key-value
stores while maintaining the functionality of a traditional RDBMS.
11. What is MongoDB?
MongoDB is a scalable, open-source, high-performance, schema-free,
document-oriented database (DOD), developed and supported by
MongoDB Inc.
• DOD falls under the product category called NoSQL.
12. Why do we use MongoDB?
• Related information is stored together for fast query access through
the MongoDB query language.
• MongoDB uses dynamic schemas, meaning that you can create records
without first defining the structure.
• Documents in a collection need not have an identical set of fields
• Built for Speed
• Rich document-based queries for easy readability.
• Full Index Support for High Performance.
• Replication and Fail-over for High Availability.
• Auto Sharding for Easy Scalability.
• Map / Reduce for Aggregation.
• MongoDB stores documents (or) objects.
• Documents are stored in BSON (binary JSON).
• BSON is a binary serialization of JSON-like objects
• Embedded documents and arrays reduce the need for joins. Early versions had
no joins and no multi-document transactions; later releases added $lookup
(3.2) and multi-document transactions (4.0).
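A minimal sketch of the dynamic-schema and embedded-document points above, in plain JavaScript. The array `posts` only stands in for a collection, and `commentAuthors` is a hypothetical helper, not a MongoDB API.

```javascript
// A plain array stands in for a MongoDB collection (assumption for illustration).
const posts = [
  // Embedded comments live inside the post, so no join is needed to read them.
  { _id: 1, title: "Intro to MongoDB", tags: ["nosql", "db"],
    comments: [{ author: "ann", text: "Nice" }, { author: "bob", text: "Thanks" }] },
  // A second document in the same collection with a different set of fields.
  { _id: 2, title: "Draft", status: "unpublished" },
];

// Read a post together with its comments in a single lookup.
function commentAuthors(collection, id) {
  const post = collection.find((doc) => doc._id === id);
  return post && post.comments ? post.comments.map((c) => c.author) : [];
}

console.log(commentAuthors(posts, 1)); // [ 'ann', 'bob' ]
console.log(commentAuthors(posts, 2)); // []
```

Both documents sit in the same collection even though only one of them has tags and comments.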
13. What is MongoDB great for?
• Semi-structured Content Management.
• Real-time Analytics & High-Speed Logging.
• Caching and High Scalability
14. Not great for?
• Highly Transactional Applications.
• Problems requiring SQL.
17. Collection
When I say Collection, think Table.
• Schema-less, and contains Documents.
• Indexable by one or more keys.
• Created on-the-fly when referenced for the first time.
• Capped Collections: Fixed size, older records get dropped
after reaching the limit.
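The capped-collection behavior can be pictured with a small in-memory sketch. `createCapped` is a hypothetical helper, and the limit here is a document count chosen for simplicity; real capped collections are bounded by size in bytes, with an optional document count.

```javascript
// Hypothetical helper: a count-bounded buffer imitating capped-collection behavior.
function createCapped(maxDocs) {
  const docs = [];
  return {
    insert(doc) {
      docs.push(doc);
      // Once the limit is exceeded, drop the oldest document first.
      if (docs.length > maxDocs) docs.shift();
    },
    find() {
      return docs.slice(); // documents in insertion order
    },
  };
}

const log = createCapped(3);
for (let i = 1; i <= 5; i++) log.insert({ seq: i });
console.log(log.find().map((d) => d.seq)); // [ 3, 4, 5 ]
```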
18. Document
When I say Document, think Record/Row.
• Stored in a Collection.
• Has an _id key – works like a primary key in MySQL.
• MongoDB imposes a 16 MB size limit on a single document (4 MB in early versions).
• Supported Relationships – Embedded (or) References.
• Document storage in BSON (Binary form of JSON).
19. Terminology and Concepts
MySQL (RDBMS)    MongoDB
Table            Collection
Row              Document
Column           Field
Joins            Embedded documents, linking
Many concepts in an RDBMS have close analogs in MongoDB. The table above outlines some of the
common concepts in each system.
23. Connect to the Database & Db's Manipulations: Create & drop
Connect (run from the operating system shell):-
> mongo
Show Databases:-
> show databases; or > show dbs;
Show current Database:-
> db
Show Collections:-
> show collections; or > show tables;
Change Database:-
> use <databaseName>;
Create Database:-
Just switch to it and insert a document; the database is created on first write.
Delete Database:-
> use mydb;
> db.dropDatabase();
25. Select: No SQL, Just ORM...
Select All:-
>db.mycoll.find();
Where:-
>db.mycoll.find({x:5});
Pattern Matching:-
>db.mycoll.find({title:/mongo/i})
Sort:-
>db.mycoll.find().sort({email:1, date:-1});
Limit:-
>db.mycoll.find().limit(1);
Specific Fields (called Projection):-
>db.users.find(
    {},                             // where: match every document
    {user_id:1, status:1, _id:0}    // select: only user_id and status
);
26. Where, Join, Group By
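!= "A"            {$ne: "A"}
> 25              {$gt: 25}
> 25 AND <= 50    {$gt: 25, $lte: 50}
LIKE 'bc%'        /^bc/
< 25 OR >= 50     {$or: [{x: {$lt: 25}}, {x: {$gte: 50}}]}
($or takes complete conditions, so the field name, here x, is repeated in each branch.)
Join:-
Wrong place...
Or use Map-Reduce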
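To make the operator mapping concrete, here is a rough in-memory matcher in plain JavaScript. `matches`, `ops`, and `pick` are hypothetical names, and only a handful of operators are covered; it is a sketch of the semantics, not how MongoDB evaluates queries.

```javascript
// Hypothetical matcher covering a few operators: $ne, $gt, $gte, $lt, $lte, $or, and regex.
const ops = {
  $ne: (v, arg) => v !== arg,
  $gt: (v, arg) => v > arg,
  $gte: (v, arg) => v >= arg,
  $lt: (v, arg) => v < arg,
  $lte: (v, arg) => v <= arg,
};

function matches(doc, query) {
  return Object.entries(query).every(([key, cond]) => {
    if (key === "$or") return cond.some((sub) => matches(doc, sub));
    const value = doc[key];
    if (cond instanceof RegExp) return typeof value === "string" && cond.test(value);
    if (cond !== null && typeof cond === "object") {
      // All operators listed for one field must hold (implicit AND).
      return Object.entries(cond).every(([op, arg]) => ops[op](value, arg));
    }
    return value === cond; // plain equality, e.g. {x: 5}
  });
}

const docs = [{ x: 10 }, { x: 30 }, { x: 50 }, { x: 70 }];
const pick = (q) => docs.filter((d) => matches(d, q)).map((d) => d.x);

console.log(pick({ x: { $gt: 25, $lte: 50 } }));                       // [ 30, 50 ]
console.log(pick({ $or: [{ x: { $lt: 25 } }, { x: { $gte: 50 } }] })); // [ 10, 50, 70 ]
```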
27. Update & Delete
Update:-
db.mycoll.update({_id:1},{amount:78});
// Replaces the whole document with {amount:78}; only the _id is kept
db.mycoll.update({_id:1},{$set:{amount:78}});
// Sets only the amount field; other fields are left unchanged
db.mycoll.update({},{$set:{amount:78}},{multi:true});
// Updates all matching documents; multi only works with $ operators
Delete:-
db.mycoll.remove({status:'D'})
Operator examples:
$gt, $lt, $gte, $lte, $ne, $all, $in, $nin, $push, $pull, $pop, $addToSet,
$inc, $set, $unset, and many more. There is no $decr; use $inc with a negative value.
Cursor and collection methods: count, limit, skip, group.
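The difference between a replacement update and an operator update is easy to miss, so here is a small in-memory sketch. `applyUpdate` is a hypothetical helper that imitates only $set, $inc, and $unset on top-level fields.

```javascript
// Hypothetical helper imitating how an update document is applied to one record.
function applyUpdate(doc, update) {
  const usesOperators = Object.keys(update).some((k) => k.startsWith("$"));
  if (!usesOperators) {
    // No $ operators: the whole document is replaced, only _id survives.
    return { _id: doc._id, ...update };
  }
  const result = { ...doc };
  for (const [field, value] of Object.entries(update.$set || {})) result[field] = value;
  for (const [field, step] of Object.entries(update.$inc || {})) {
    result[field] = (result[field] || 0) + step;
  }
  for (const field of Object.keys(update.$unset || {})) delete result[field];
  return result;
}

const order = { _id: 1, item: "pen", amount: 50 };
console.log(applyUpdate(order, { amount: 78 }));           // { _id: 1, amount: 78 }
console.log(applyUpdate(order, { $set: { amount: 78 } })); // { _id: 1, item: 'pen', amount: 78 }
console.log(applyUpdate(order, { $inc: { amount: -5 } })); // { _id: 1, item: 'pen', amount: 45 }
```

Note how the first call loses the item field, while the $set call keeps it.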
28. Create Index on any field in the document
// 1 means ascending, -1 means descending
// (createIndex replaces the older ensureIndex, deprecated since MongoDB 3.0)
> db.posts.createIndex({'author': 1});
// Index nested documents
> db.posts.createIndex({'comments.author': 1});
// Index on tags
> db.posts.createIndex({'tags': 1});
// Geo-spatial index
> db.posts.createIndex({'author.location': '2d'});
Secondary Indexes
29. Aggregation
MongoDB’s aggregation framework is modeled on the concept of data processing
pipelines.
Aggregation operations process data records and return computed results.
db.collection.aggregate(pipeline, options)
Pipeline: a sequence of data aggregation operations or stages.
30. Aggregation Transformations Types
Examples of transformation types (stages) that can be used in an aggregation pipeline:
$match: filters the input documents by the given expressions
$group: groups documents by one or more fields and performs aggregations on other fields
$geoNear: outputs documents ordered from nearest to farthest from a specified point
$limit: passes on only the first n documents (useful for percentile calculations, etc.)
$skip: skips the first n documents of the input
$sort: sorts all input documents by the given sort specification
$project: creates a result set with a subset of the input fields or computed fields
MongoDB provides three ways to perform aggregation:-
Aggregation pipeline
Map-reduce function
Single-purpose aggregation methods
31. Aggregation pipeline
The aggregation pipeline is a framework for data aggregation modeled on the concept
of data processing pipelines. Documents enter a multi-stage pipeline that transforms
the documents into aggregated results.
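The multi-stage idea can be sketched in plain JavaScript, with each stage as a function from documents to documents. The stage helpers below ($match, $group, $sort, $limit as JavaScript functions) and the `orders` data are hypothetical and far simpler than the real stages.

```javascript
// Each hypothetical stage is a function from an array of documents to a new array.
const $match = (pred) => (docs) => docs.filter(pred);
const $sort = (field, dir) => (docs) =>
  [...docs].sort((a, b) => (a[field] - b[field]) * dir);
const $limit = (n) => (docs) => docs.slice(0, n);
const $group = (keyField, sumField) => (docs) => {
  const totals = new Map();
  for (const d of docs) {
    totals.set(d[keyField], (totals.get(d[keyField]) || 0) + d[sumField]);
  }
  return [...totals].map(([_id, total]) => ({ _id, total }));
};

// Run the stages in order, feeding each stage's output into the next.
const aggregate = (docs, pipeline) => pipeline.reduce((acc, stage) => stage(acc), docs);

const orders = [
  { cust: "A", status: "paid", amount: 50 },
  { cust: "B", status: "paid", amount: 20 },
  { cust: "A", status: "open", amount: 99 },
  { cust: "A", status: "paid", amount: 25 },
];

const result = aggregate(orders, [
  $match((d) => d.status === "paid"),
  $group("cust", "amount"),
  $sort("total", -1),
  $limit(1),
]);
console.log(result); // [ { _id: 'A', total: 75 } ]
```

Placing the filter first means the later stages only see the paid orders.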
32. Map-reduce
Map-reduce uses custom JavaScript functions to perform the map and reduce operations,
as well as the optional finalize operation. Map-reduce operations have two phases:
Map:- A map phase that processes each document and emits one or more objects for each input
document.
Reduce:- A reduce phase that combines the output of the map operation.
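The two phases can be illustrated with a tiny in-memory version. `mapReduce` here is a hypothetical helper that passes `emit` as an argument; in MongoDB the map function instead reads the document from `this` and calls a global `emit`.

```javascript
// Hypothetical in-memory map-reduce: map emits (key, value) pairs, reduce folds values per key.
function mapReduce(docs, map, reduce) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of docs) map(doc, emit); // map phase
  const out = {};
  for (const [key, values] of groups) out[key] = reduce(key, values); // reduce phase
  return out;
}

const posts = [
  { tags: ["mongodb", "nosql"] },
  { tags: ["nosql"] },
  { tags: ["mongodb", "index", "nosql"] },
];

// Count how many posts carry each tag.
const tagCounts = mapReduce(
  posts,
  (doc, emit) => doc.tags.forEach((t) => emit(t, 1)),
  (key, values) => values.reduce((a, b) => a + b, 0)
);
console.log(tagCounts); // { mongodb: 2, nosql: 3, index: 1 }
```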
33. Single purpose aggregation method
MongoDB also provides db.collection.count() and db.collection.distinct().
34. Optimizing pipelines
Starting a pipeline with a $match stage to restrict processing to
relevant documents.
Ensuring the initial $match / $sort stages are supported by
an efficient index.
Filtering data early using $match, $limit, and $skip.
Minimizing unnecessary stages and document manipulation.
Taking advantage of newer aggregation operators if you have
upgraded your MongoDB server. For example, MongoDB 3.4
added many new aggregation stages (transformation types) and
expressions, including support for working with arrays, strings, and
facets.
35. Some Cool features
• Geo-spatial Indexes for Geo-spatial queries.
$near, $geoWithin (formerly $within), bounded queries (circle, box)
• GridFS
Stores Large Binary Files.
• Map/Reduce
GROUP BY in SQL, map/reduce in MongoDB.
37. Sharding, Replication, Replica-set
Sharding:- the process of storing or distributing data across multiple machines, or
shards. MongoDB uses sharding to support deployments with very large data sets and
high-throughput operations.
Each shard is an independent database; together the shards make up a single logical database.
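A rough sketch of the idea, assuming simple hash-based placement: every key maps to exactly one shard, so a lookup by key touches one machine. `hashKey` and `createCluster` are hypothetical; MongoDB itself splits data into chunks by ranged or hashed shard keys and rebalances them, which this sketch does not attempt.

```javascript
// Hypothetical hash-based routing: a key always maps to the same shard.
function hashKey(key) {
  let h = 0;
  for (const ch of String(key)) h = (h * 31 + ch.codePointAt(0)) >>> 0;
  return h;
}

function createCluster(shardCount) {
  // One Map per shard stands in for an independent database.
  const shards = Array.from({ length: shardCount }, () => new Map());
  const shardFor = (key) => hashKey(key) % shardCount;
  return {
    shardFor,
    insert(key, doc) { shards[shardFor(key)].set(key, doc); },
    // A lookup by shard key touches only one shard.
    find(key) { return shards[shardFor(key)].get(key); },
    sizes() { return shards.map((s) => s.size); },
  };
}

const cluster = createCluster(3);
for (let i = 0; i < 9; i++) cluster.insert(`user${i}`, { n: i });
console.log(cluster.find("user4")); // { n: 4 }
console.log(cluster.sizes());       // number of documents held by each shard
```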
Replica-set:- A replica set in MongoDB is a group of mongod processes that maintain the
same data set.
A replica set consists of one primary (also called "master") and one or more secondaries
(also called "slaves"). By default reads go to the primary, but with a suitable read
preference they can be served by any secondary, so you can increase read performance by
adding more secondaries to the replica set. Write operations always take place on the
primary and are then propagated to the secondaries, so writes won't get faster when you
add more secondaries.
Replica sets also offer fault tolerance. When one of the members of the replica set goes down,
the others take over. When the primary goes down, the secondaries elect a new primary.
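The write path and failover can be pictured with a toy model. `createReplicaSet` is hypothetical, and the "election" simply promotes the first healthy member; real MongoDB replicates asynchronously and elects a primary by majority vote.

```javascript
// Hypothetical in-memory replica set; replication and elections are heavily simplified.
function createReplicaSet(names) {
  const members = names.map((name) => ({ name, up: true, data: [] }));
  let primary = members[0];
  const byName = (name) => members.find((m) => m.name === name);
  return {
    primaryName: () => primary.name,
    write(doc) {
      // Writes land on the primary first...
      primary.data.push(doc);
      // ...and are then copied to every healthy secondary.
      for (const m of members) if (m !== primary && m.up) m.data.push(doc);
    },
    read: (name) => byName(name).data.slice(),
    fail(name) {
      const m = byName(name);
      m.up = false;
      // Simplified "election": the first healthy member becomes primary.
      if (m === primary) primary = members.find((x) => x.up);
    },
  };
}

const rs = createReplicaSet(["a", "b", "c"]);
rs.write({ n: 1 });
rs.fail("a");       // the primary goes down
rs.write({ n: 2 }); // accepted by the new primary
console.log(rs.primaryName()); // b
console.log(rs.read("c"));     // [ { n: 1 }, { n: 2 } ]
```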
39. Sharded Cluster
Sharded cluster consists of the following components:
Shard: Stores the data. Each shard contains a subset of the sharded data and can be deployed
as a replica set.
Mongos or Query Routers: The mongos acts as a query router, providing an interface between client
applications and the sharded cluster.
Config servers: Config servers store metadata and configuration settings for the cluster. As of
MongoDB 3.4, config servers must be deployed as a replica set (CSRS).
40. How do we use MongoDB in NodeJs
Mongoose: a MongoDB object modeling tool designed to work in an asynchronous
environment.
mongodb: the official MongoDB driver for Node.js. It provides a high-level API on top of
mongodb-core that is meant for end users.