MongoDB - NoSQL tutorial

Introduction to MongoDB
Mohan Rathour

NoSQL

NoSQL originally referred to "non-SQL" or "non-relational", and is now
often read as "not only SQL". NoSQL databases are typically open source,
horizontally scalable and schema-free.

A NoSQL database provides a mechanism for storing and retrieving
data that is modeled by means other than the tabular relations
used in relational databases.

The data structures used by NoSQL databases (e.g. key-value, graph,
document, wide column) differ from those used by default in
relational databases, which makes some operations faster in NoSQL.

SQL vs NoSQL

SQL databases are generally called relational databases (RDBMS), whereas
NoSQL databases are generally called non-relational or distributed databases.

SQL databases are table based, whereas NoSQL databases are document based,
key-value pairs, wide-column stores or graph databases.

SQL databases have predefined schema whereas NoSQL databases have
dynamic schema for unstructured data.

SQL databases are vertically scalable whereas the NoSQL databases are
horizontally scalable.

SQL databases use SQL (Structured Query Language). NoSQL databases have no
single standard query language (the term UnQL, Unstructured Query Language,
is sometimes used); the query syntax varies from database to database.

SQL database examples: MySQL, Oracle, SQLite, PostgreSQL and MS SQL Server.
NoSQL database examples: MongoDB, BigTable, Redis, Cassandra, HBase, Neo4j
and CouchDB.

SQL vs NoSQL
Complex queries
  SQL: A good fit for complex, query-intensive environments.
  NoSQL: Not a good fit for complex queries; NoSQL databases don't have
  standard interfaces for performing complex queries.
Data to be stored
  SQL: Not the best fit for hierarchical data storage.
  NoSQL: A better fit for hierarchical data storage, as it stores data as
  key-value pairs, similar to JSON. NoSQL databases are highly preferred
  for large data sets.
Scalability
  SQL: Vertically scalable. You manage increasing load by adding CPU, RAM,
  SSD, etc. to a single server.
  NoSQL: Horizontally scalable. You can add a few more servers to your
  NoSQL database infrastructure to handle heavy traffic.
High transactional load
  SQL: The best fit for heavy-duty transactional applications, as it is
  more stable and promises atomicity as well as integrity of the data.
  NoSQL: Can be used for transactions, but is still not comparable or
  stable enough under high load and for complex transactional applications.
Properties
  SQL: Emphasizes ACID properties (Atomicity, Consistency, Isolation and
  Durability).
  NoSQL: Follows Brewer's CAP theorem (Consistency, Availability and
  Partition tolerance).

Database Types
RDBMS: Relational database management system.
OLAP: Online analytical processing. It's an approach to answering
multi-dimensional analytical (MDA) queries swiftly in computing. OLAP
consists of three basic analytical operations: consolidation (roll-up),
drill-down, and slicing and dicing.
Example: Oracle Express Database.
NoSQL: Not Only SQL.

Relational Databases
Flat file systems were created first.
Problem:- No standard implementation.
Relational databases were the answer (1970s, E. F. Codd).

NoSQL Databases
Relational databases were created.
Problem:- Could not handle big data.
NoSQL databases were the answer (recently).

Horizontal Scalability
[Figure: a master node (Job Tracker, Name Node) and several slave nodes, each with a Task Tracker and a Data Node. New nodes could be added.]

NoSQL (Not Only SQL) Database Types
Tabular. Examples: BigTable, HBase.
Key-value store. Examples: Redis, Memcached.
Document-oriented. Examples: MongoDB, CouchDB.

The Great Divide
MongoDB tries to achieve the performance of traditional key-value stores
while maintaining the functionality of a traditional RDBMS.

What is MongoDB?
MongoDB is a scalable, open-source, high-performance, schema-free,
document-oriented database (DOD), developed and supported by MongoDB Inc.
• DOD falls under the product category called NoSQL.

Why do we use MongoDB?
• Related information is stored together for fast query access through
the MongoDB query language.
• MongoDB uses dynamic schemas, meaning that you can create records
without first defining the structure.
• Documents in a collection need not have an identical set of fields
• Built for Speed
• Rich Document based queries for Easy readability.
• Full Index Support for High Performance.
• Replication and Fail-over for High Availability.
• Auto Sharding for Easy Scalability.
• Map / Reduce for Aggregation.
• MongoDB stores documents (or objects).
• Documents are stored in BSON (binary JSON).
• BSON is a binary serialization of JSON-like objects.
• Embedded documents and arrays reduce the need for joins. There are no
SQL-style joins, and MongoDB versions before 4.0 have no multi-document
transactions.
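
As a rough illustration of the dynamic-schema points above, the plain JavaScript sketch below keeps two differently shaped documents side by side in an in-memory array. The array `people` and all field names are made up for this example, and no MongoDB server is involved.

```javascript
// In-memory stand-in for a collection; names and fields are hypothetical.
const people = [];

// Two documents with different sets of fields, stored side by side.
people.push({ name: "Mohan", tags: ["MongoDB", "NoSQL"] });
people.push({ name: "Asha", address: { city: "Pune" }, age: 30 });

// Collect the distinct field names that appear across all documents.
const fieldNames = [...new Set(people.flatMap((doc) => Object.keys(doc)))];
console.log(fieldNames);
```

Nothing had to be declared before the second document introduced the embedded `address` object, which is the behavior the bullets describe.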

What is MongoDB great for?
• Semi-structured Content Management.
• Real-time Analytics & High-Speed Logging.
• Caching and High Scalability

Not great for?
• Highly Transactional Applications.
• Problems requiring SQL.

Database
When I say Database, think Database.
• Made up of Multiple Collections.
• Created on-the-fly when referenced for the first time.

Collection
When I say Collection, think Table.
• Schema-less, and contains Documents.
• Indexable by one/more keys.
• Created on-the-fly when referenced for the first time.
• Capped Collections: Fixed size, older records get dropped
after reaching the limit.
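
The capped-collection behavior can be sketched in plain JavaScript as a fixed-size list that drops its oldest entry when full. This is only an illustration: the class `CappedList` is hypothetical, and it caps by document count, whereas real capped collections are limited by size in bytes (optionally also by document count).

```javascript
// Hypothetical fixed-size collection: keeps at most `maxDocs` documents.
class CappedList {
  constructor(maxDocs) {
    this.maxDocs = maxDocs;
    this.docs = [];
  }

  insert(doc) {
    this.docs.push(doc);
    // Once the cap is exceeded, drop the oldest document first.
    if (this.docs.length > this.maxDocs) {
      this.docs.shift();
    }
  }
}

const recentEvents = new CappedList(3);
for (let i = 1; i <= 5; i++) {
  recentEvents.insert({ seq: i });
}
console.log(recentEvents.docs.map((d) => d.seq)); // [ 3, 4, 5 ]
```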

Document
When I say Document, think Record/Row.
• Stored in a Collection.
• Has an _id key, which works like a primary key in MySQL.
• MongoDB imposes a 16 MB size limit on a single document (4 MB in very old releases).
• Supported Relationships: Embedded (or) References.
• Document storage in BSON (Binary form of JSON).

Terminology and Concepts
MySQL (RDBMS)    MongoDB
Table            Collection
Row              Document
Column           Field
Joins            Embedded documents, linking
Many concepts in RDBMS have close analogs in MongoDB. The table above outlines some of the
common concepts in each system.

A Blog Case Study in MySQL
[Figure: blog schema in MySQL]

A blog as a software engineer would like it to be, in MongoDB:
var p = {
  "_id" : ObjectId("5915523a101c77bf2ffbc642"),
  "author" : DBRef("author", "5126bc054aed4daf9e2ab772"),
  "title" : "Introduction to MongoDB",
  "body" : "MongoDB ",
  "timestamp" : "01-04-12",
  "tags" : [ "MongoDB", "NoSQL" ],
  // comments are embedded in the post as an array of sub-documents
  "comments" : [{
    "author" : DBRef("author", "5126bc054aed4daf9e2ab772"),
    "date" : "02-04-12",
    "text" : "Did you see.. ",
    "upvotes" : 7
  }]
};
> db.posts.save(p);
Understanding the Document Model.

Connect to the Database & DB Manipulations: Create & Drop
Connect:-
$ mongo
Show Databases:-
> show databases;   or   > show dbs;
Show current Database:-
> db
Show Collections:-
> show collections;   or   > show tables;
Change Database:-
> use <databaseName>;
Create Database:-
Just switch to it and create an object ...
Delete Database:-
> use mydb;
> db.dropDatabase();

Collections Manipulation
Create Collection:-
> db.createCollection('collectionName');
Or just insert into it
Delete Collection:-
> db.collectionName.drop();
Insert:-
> j = {name: "Mohan"}
> k = {x: 5}
> db.myColl.insert(j);
> db.myColl.insert(k);

Select: No SQL, Just ORM...
Select All:-
> db.myColl.find();
Where:-
> db.myColl.find({x: 5});
Pattern Matching:-
> db.myColl.find({title: /mongo/i})
Sort:-
> db.myColl.find().sort({email: 1, date: -1});
Limit:-
> db.myColl.find().limit(1);
Specific Fields (called Projection):-
> db.users.find(
    {},                                // Where: {} matches all documents
    {user_id: 1, status: 1, _id: 0}    // Select: user_id and status only
  );
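
To make the projection document concrete, here is a small plain-JavaScript sketch of what an inclusion projection does to one document. The helper `project` and the sample `user` are hypothetical, and the sketch ignores details of the real feature, such as `_id` being returned unless it is explicitly excluded.

```javascript
// Hypothetical helper: keep only the fields marked 1 in `spec`.
// Unlike MongoDB, this sketch never adds _id implicitly.
function project(doc, spec) {
  const out = {};
  for (const [field, include] of Object.entries(spec)) {
    if (include === 1 && field in doc) {
      out[field] = doc[field];
    }
  }
  return out;
}

const user = { _id: 7, user_id: "abc123", status: "A", age: 41 };
const projected = project(user, { user_id: 1, status: 1, _id: 0 });
console.log(projected); // { user_id: 'abc123', status: 'A' }
```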

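Where, Join, Group By
SQL condition (x is a placeholder field)    MongoDB query
x != "A"                                    {x: {$ne: "A"}}
x > 25                                      {x: {$gt: 25}}
x > 25 AND x <= 50                          {x: {$gt: 25, $lte: 50}}
x LIKE 'bc%'                                {x: /^bc/}
x < 25 OR x >= 50                           {$or: [{x: {$lt: 25}}, {x: {$gte: 50}}]}
Join:-
Wrong place...
Or Map Reduce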
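
How operator objects such as $gt, $lte, $ne and $or select documents can be sketched with a tiny in-memory matcher. This is an illustration in plain JavaScript, not MongoDB's implementation: the function `matches`, the field `x` and the sample rows are assumptions, and only flat documents and a handful of operators are handled.

```javascript
// Hypothetical mini-matcher supporting a few operators on flat documents.
const comparators = {
  $ne: (value, arg) => value !== arg,
  $gt: (value, arg) => value > arg,
  $gte: (value, arg) => value >= arg,
  $lt: (value, arg) => value < arg,
  $lte: (value, arg) => value <= arg,
};

function matches(doc, query) {
  return Object.entries(query).every(([key, cond]) => {
    if (key === "$or") {
      // $or takes an array of sub-queries; at least one must match.
      return cond.some((sub) => matches(doc, sub));
    }
    if (cond instanceof RegExp) {
      return cond.test(doc[key]);
    }
    if (cond !== null && typeof cond === "object") {
      // Operator object: every listed operator must hold for the field.
      return Object.entries(cond).every(([op, arg]) => comparators[op](doc[key], arg));
    }
    return doc[key] === cond; // plain equality
  });
}

const rows = [{ x: 10 }, { x: 30 }, { x: 60 }];
const between = rows.filter((d) => matches(d, { x: { $gt: 25, $lte: 50 } }));
const outside = rows.filter((d) =>
  matches(d, { $or: [{ x: { $lt: 25 } }, { x: { $gte: 50 } }] })
);
console.log(between, outside); // [ { x: 30 } ] [ { x: 10 }, { x: 60 } ]
```

Note that each operator is attached to a field name, and that $or wraps complete sub-queries rather than bare operators.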

Update & Delete
Update:-
db.myColl.update({_id:1}, {amount:78});
// Replaces the whole document with {amount: 78}; only _id is kept
db.myColl.update({_id:1}, {$set:{amount:78}});
// Only sets the amount field; the other fields are left untouched
db.myColl.update({}, {$set:{amount:78}}, {multi:true});
// Updates all matching documents; multi only works with $ operators
Delete:-
db.myColl.remove({status:'D'})
Operator examples:-
$gt, $lt, $gte, $lte, $ne, $all, $in, $nin, $push, $pull, $pop, $addToSet,
$inc (use a negative value to decrement), $set, $unset and many more.
count, limit, skip and group are methods rather than $ operators.
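
The difference between replacing a document and using $set is easy to miss, so here is a minimal in-memory sketch of the two behaviors. The helpers `replaceDoc` and `setFields` are hypothetical stand-ins written in plain JavaScript; they are not MongoDB functions.

```javascript
// Hypothetical in-memory versions of the two update styles.
function replaceDoc(doc, replacement) {
  // Whole-document replacement: only _id survives.
  return { _id: doc._id, ...replacement };
}

function setFields(doc, fields) {
  // $set-style update: listed fields change, the rest are kept.
  return { ...doc, ...fields };
}

const original = { _id: 1, name: "Mohan", amount: 10 };
const replaced = replaceDoc(original, { amount: 78 });
const patched = setFields(original, { amount: 78 });
console.log(replaced); // { _id: 1, amount: 78 }
console.log(patched);  // { _id: 1, name: 'Mohan', amount: 78 }
```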

Create Index on any field in the document
// 1 means ascending, -1 means descending
> db.posts.createIndex({'author': 1});
// Index nested documents
> db.posts.createIndex({'comments.author': 1});
// Index on tags
> db.posts.createIndex({'tags': 1});
// Geo-spatial index
> db.posts.createIndex({'author.location': '2d'});
Secondary Indexes
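
Why an index on an array field such as tags speeds up lookups can be sketched with a simple in-memory map from tag to post ids. This is a conceptual illustration only: the sample `posts` are made up, and MongoDB's real indexes are B-tree structures rather than a JavaScript `Map`.

```javascript
// Hypothetical posts; an index on `tags` gets one entry per array element.
const posts = [
  { _id: 1, tags: ["MongoDB", "NoSQL"] },
  { _id: 2, tags: ["SQL"] },
  { _id: 3, tags: ["NoSQL"] },
];

// Build the index once: tag -> list of matching post ids.
const tagIndex = new Map();
for (const post of posts) {
  for (const tag of post.tags) {
    if (!tagIndex.has(tag)) tagIndex.set(tag, []);
    tagIndex.get(tag).push(post._id);
  }
}

// A lookup by tag reads the index instead of scanning every post.
console.log(tagIndex.get("NoSQL")); // [ 1, 3 ]
```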

Aggregation
MongoDB’s aggregation framework is modeled on the concept of data processing
pipelines.
Aggregation operations process data records and return computed results.
db.collection.aggregate(pipeline, options)
pipeline - A sequence of data aggregation operations or stages.

Aggregation Transformations Types
Example of some transformations types that can be used in an aggregation pipeline:
$match: filters input record set by any given expressions
$group: groups by one or more fields and performs aggregations on other fields
$geoNear: outputs documents in order of nearest to farthest from a specified point
$limit: picks first n documents from input sets (useful for percentile calculations, etc.)
$skip: ignores first n documents from input set
$sort: sorts all input documents as per the object given
$project: creates a resultset with a subset of input fields or computed fields
MongoDB provides three ways to perform aggregation:-
• Aggregation pipeline
• Map-reduce function
• Single purpose aggregation methods

Aggregation pipeline
The aggregation pipeline is a framework for data aggregation modeled on the concept
of data processing pipelines. Documents enter a multi-stage pipeline that transforms
the documents into aggregated results.
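
The multi-stage idea can be sketched in plain JavaScript by treating each stage as a function from an array of documents to a new array. The sample `orders` and the stage functions are assumptions made for this illustration; together they mimic roughly what the pipeline [{$match: {status: "paid"}}, {$group: {_id: "$customer", total: {$sum: "$amount"}}}, {$sort: {total: -1}}] would express.

```javascript
// Hypothetical order documents.
const orders = [
  { customer: "a", status: "paid", amount: 20 },
  { customer: "b", status: "paid", amount: 5 },
  { customer: "a", status: "open", amount: 99 },
  { customer: "a", status: "paid", amount: 10 },
];

// Each stage takes an array of documents and returns a new array.
const matchPaid = (docs) => docs.filter((d) => d.status === "paid");
const groupByCustomer = (docs) => {
  const totals = new Map();
  for (const d of docs) {
    totals.set(d.customer, (totals.get(d.customer) || 0) + d.amount);
  }
  return [...totals].map(([customer, total]) => ({ _id: customer, total }));
};
const sortByTotalDesc = (docs) => [...docs].sort((x, y) => y.total - x.total);

// Run the stages in order, feeding each stage's output into the next.
const stages = [matchPaid, groupByCustomer, sortByTotalDesc];
const report = stages.reduce((docs, stage) => stage(docs), orders);
console.log(report); // [ { _id: 'a', total: 30 }, { _id: 'b', total: 5 } ]
```

Putting the filter first means the grouping stage only sees the documents it needs.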

Map-reduce
Map-reduce uses custom JavaScript functions to perform the map and reduce operations,
as well as the optional finalize operation. Map-reduce operations have two phases:
Map:- A map phase that processes each document and emits one or more objects for each
input document.
Reduce:- A reduce phase that combines the output of the map operation.
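
The two phases can be sketched in plain JavaScript with a tag count. The sample `articles` and the variable names are hypothetical, and the code runs in memory rather than through MongoDB's map-reduce command.

```javascript
// Hypothetical input documents.
const articles = [
  { tags: ["MongoDB", "NoSQL"] },
  { tags: ["NoSQL"] },
];

// Map phase: emit one (key, value) pair per tag of each document.
const emitted = [];
for (const article of articles) {
  for (const tag of article.tags) {
    emitted.push([tag, 1]);
  }
}

// Reduce phase: combine all values emitted for the same key.
const tagCounts = {};
for (const [tag, count] of emitted) {
  tagCounts[tag] = (tagCounts[tag] || 0) + count;
}
console.log(tagCounts); // { MongoDB: 1, NoSQL: 2 }
```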

Single purpose aggregation method
MongoDB also provides db.collection.count() and db.collection.distinct().

Optimizing pipelines
• Start a pipeline with a $match stage to restrict processing to
relevant documents.
• Ensure the initial $match / $sort stages are supported by
an efficient index.
• Filter data early using $match, $limit, and $skip.
• Minimize unnecessary stages and document manipulation.
• Take advantage of newer aggregation operators if you have
upgraded your MongoDB server. For example, MongoDB 3.4
added many new aggregation stages (transformation types) and
expressions, including support for working with arrays, strings, and
facets.

Some Cool features
• Geo-spatial Indexes for Geo-spatial queries.
$near, $geoWithin, bounded queries (circle, box)
• GridFS
Stores Large Binary Files.
• Map/Reduce
GROUP BY in SQL, map/reduce in MongoDB.

Sharding, Replication, Replica-set

Sharding:- The process of storing or distributing data across multiple machines, or
shards. MongoDB uses sharding to support deployments with very large data sets and
high-throughput operations.
Each shard is an independent database; together the shards make up a single logical database.
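
How a shard key value can decide where a document lives is sketched below in plain JavaScript. It is a deliberate simplification: the hash function, the helper `shardFor` and the user ids are assumptions for this example, while MongoDB itself splits data into chunks using ranged or hashed shard keys.

```javascript
// Hypothetical hash: a simple polynomial rolling hash of the key's characters.
function simpleHash(key) {
  let hash = 0;
  for (const ch of String(key)) {
    hash = (hash * 31 + ch.codePointAt(0)) % 1000003;
  }
  return hash;
}

// Pick a shard for a document from its shard key value.
function shardFor(key, shardCount) {
  return simpleHash(key) % shardCount;
}

// Three in-memory "shards"; each receives a subset of the keys.
const shards = [[], [], []];
for (const userId of ["u1", "u2", "u3", "u4", "u5", "u6"]) {
  shards[shardFor(userId, shards.length)].push(userId);
}
console.log(shards);
```

The same key always maps to the same shard, which is what lets a router send a query for one key to a single machine.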

Replica-set:- A replica set in MongoDB is a group of mongod processes that maintain the
same data set.
A replica set consists of one primary (formerly called "master") and one or more secondaries
(formerly "slaves"). By default reads go to the primary, but clients can set a read preference
that lets secondaries serve reads, so adding secondaries can increase read performance.
Write operations always take place on the primary and are then propagated to the secondaries,
so writes won't get faster when you add more secondaries.
Replica sets also offer fault tolerance. When one of the members of the replica set goes down,
the others take over. When the primary goes down, the remaining members elect a new primary.
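
Write propagation and failover can be sketched with a toy in-memory model in plain JavaScript. Everything here is hypothetical: the member names, the `write` and `failPrimary` helpers, and especially the "election", which simply picks the first member that is still up and does not reflect MongoDB's real election protocol.

```javascript
// Hypothetical replica set: member 0 starts as the primary.
const members = [
  { name: "m0", up: true, data: [] },
  { name: "m1", up: true, data: [] },
  { name: "m2", up: true, data: [] },
];
let primaryIndex = 0;

function write(doc) {
  // Writes go to the primary, then are copied to every reachable secondary.
  members[primaryIndex].data.push(doc);
  members.forEach((m, i) => {
    if (i !== primaryIndex && m.up) m.data.push(doc);
  });
}

function failPrimary() {
  members[primaryIndex].up = false;
  // Stand-in for an election: pick the first member that is still up.
  primaryIndex = members.findIndex((m) => m.up);
}

write({ n: 1 });
failPrimary();
write({ n: 2 });
console.log(members[primaryIndex].name, members.map((m) => m.data.length)); // m1 [ 1, 2, 2 ]
```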

Sharded Cluster
A sharded cluster consists of the following components:
Shards: Store the data. Each shard contains a subset of the sharded data. Each shard can be
deployed as a replica set.
Mongos or Query Routers: The mongos acts as a query router, providing an interface between client
applications and the sharded cluster.
Config servers: Config servers store metadata and configuration settings for the cluster. As of
MongoDB 3.4, config servers must be deployed as a replica set (CSRS).

How do we use MongoDB in NodeJs
Two Node.js packages are commonly used: mongodb and mongoose.
mongoose: Mongoose is a MongoDB object modeling tool designed to work in an asynchronous
environment.
mongodb: The official MongoDB driver for Node.js. It provides a high-level API on top of
mongodb-core that is meant for end users.

Questions?
Next Steps: http://mongodb.org,
Twitter: @mongodb
Thank You 

What about Queries? So Simple
// find posts which have the 'MongoDB' tag.
> db.posts.find({tags: 'MongoDB'});
// count posts with comments by a given author.
> db.posts.find({'comments.author': DBRef('User', 2)}).count();
// find posts written on or after 31st March (assumes timestamp is stored as a Date).
> db.posts.find({'timestamp': {$gte: new Date('2012-03-31')}});
// find posts written by authors around [22, 42]
> db.posts.find({'author.location': {$near: [22, 42]}});
$gt, $lt, $gte, $lte, $ne, $all, $in, $nin, count, limit, skip, group, etc…

Why NoSQL
• Big data
• Agile: agile sprints, quick iterations and frequent code pushes
• Object-oriented programming: objects are easy to store
• High availability
• Scale-out architecture
