Indexes
Indexes support the efficient execution of queries in MongoDB. Without indexes,
MongoDB must perform a collection scan, i.e. scan every document in a collection, to
select those documents that match the query statement. If an appropriate index exists
for a query, MongoDB can use the index to limit the number of documents it must
inspect.
Indexes are special data structures [1] that store a small portion of the collection's data
set in an easy to traverse form. The index stores the value of a specific field or set of
fields, ordered by the value of the field. The ordering of the index entries supports
efficient equality matches and range-based query operations. In addition, MongoDB can
return sorted results by using the ordering in the index.
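The role of the ordering can be sketched in plain JavaScript (illustrative only, not MongoDB internals; the `entries` data and `rangeScan` helper are made up for this sketch): because index entries are kept sorted, a range query can binary-search to the first matching key and stop at the last, instead of scanning everything.

```javascript
// An ordered index is conceptually a sorted list of (key, document id)
// entries, so a range query can seek to the first match and stop early.
const entries = [
  { key: 3, id: "a" }, { key: 7, id: "b" }, { key: 12, id: "c" },
  { key: 20, id: "d" }, { key: 31, id: "e" },
].sort((x, y) => x.key - y.key);

// Return ids with lo <= key <= hi without scanning every entry.
function rangeScan(index, lo, hi) {
  // Binary search for the first entry with key >= lo.
  let left = 0, right = index.length;
  while (left < right) {
    const mid = (left + right) >> 1;
    if (index[mid].key < lo) left = mid + 1; else right = mid;
  }
  const ids = [];
  // Walk forward only while keys stay inside the range.
  for (let i = left; i < index.length && index[i].key <= hi; i++) {
    ids.push(index[i].id);
  }
  return ids;
}

console.log(rangeScan(entries, 5, 20)); // [ 'b', 'c', 'd' ]
```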
Fundamentally, indexes in MongoDB are similar to indexes in other database systems.
MongoDB defines indexes at the collection level and supports indexes on any field or
sub-field of the documents in a MongoDB collection.
Default _id Index
MongoDB creates a unique index on the _id field during the creation of a collection.
The _id index prevents clients from inserting two documents with the same value for
the _id field. You cannot drop this index on the _id field.
Create an Index
To create an index in the Mongo Shell, use db.collection.createIndex().
db.collection.createIndex( <key and index type specification>, <options> )
The following example creates a single key descending index on the name field:
db.collection.createIndex( { name: -1 } )
The db.collection.createIndex() method only creates an index if an index of
the same specification does not already exist.
[1] MongoDB indexes use a B-tree data structure.
Index Names
The default name for an index is the concatenation of the indexed keys and each key's
direction in the index ( i.e. 1 or -1) using underscores as a separator. For example, an
index created on { item : 1, quantity: -1 } has the
name item_1_quantity_-1.
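The naming rule above is mechanical enough to express as a one-liner (a sketch in plain JavaScript; `defaultIndexName` is a helper invented here, not a MongoDB API):

```javascript
// Compute the default index name MongoDB would generate for a key
// specification: each field joined with its direction by underscores.
function defaultIndexName(keySpec) {
  return Object.entries(keySpec)
    .map(([field, dir]) => `${field}_${dir}`)
    .join("_");
}

console.log(defaultIndexName({ item: 1, quantity: -1 })); // "item_1_quantity_-1"
```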
You can create indexes with a custom name, such as one that is more human-readable
than the default. For example, consider an application that frequently queries
the products collection to populate data on existing inventory. The
following createIndex() method creates an index
on item and quantity named query for inventory:
db.products.createIndex(
{ item: 1, quantity: -1 } ,
{ name: "query for inventory" }
)
You can view index names using the db.collection.getIndexes() method. You
cannot rename an index once created. Instead, you must drop and re-create the index
with a new name.
Index Types
MongoDB provides a number of different index types to support specific types of data
and queries.
Single Field
In addition to the MongoDB-defined _id index, MongoDB supports the creation of user-
defined ascending/descending indexes on a single field of a document.
For a single-field index and sort operations, the sort order (i.e. ascending or
descending) of the index key does not matter because MongoDB can traverse the index
in either direction.
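Why direction does not matter for a single field can be seen in a toy sketch (plain JavaScript, not MongoDB internals; `traverse` is a made-up helper): an index stored in ascending order satisfies a descending sort simply by being read in reverse.

```javascript
// Index entries stored in ascending key order.
const indexKeys = [2, 5, 9, 14];

// A sort in either direction is just a forward or reverse traversal.
function traverse(keys, direction) {
  return direction === 1 ? [...keys] : [...keys].reverse();
}

console.log(traverse(indexKeys, -1)); // [ 14, 9, 5, 2 ]
```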
Compound Index
MongoDB also supports user-defined indexes on multiple fields, i.e. compound indexes.
The order of fields listed in a compound index has significance. For instance, if a
compound index consists of { userid: 1, score: -1 }, the index sorts first
by userid and then, within each userid value, sorts by score.
For compound indexes and sort operations, the sort order (i.e. ascending or
descending) of the index keys can determine whether the index can support a sort
operation.
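The ordering a { userid: 1, score: -1 } index imposes can be mimicked with a comparator (an illustrative plain-JavaScript sketch; the sample documents are invented): ascending by userid first, then descending by score within each userid.

```javascript
// Comparator mimicking the sort order of a { userid: 1, score: -1 } index.
function compareUseridScore(a, b) {
  if (a.userid !== b.userid) return a.userid < b.userid ? -1 : 1; // ascending userid
  return b.score - a.score; // descending score within the same userid
}

const docs = [
  { userid: "bb", score: 50 },
  { userid: "aa", score: 10 },
  { userid: "bb", score: 90 },
  { userid: "aa", score: 70 },
];
console.log(docs.sort(compareUseridScore));
// aa/70, aa/10, bb/90, bb/50
```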
Multikey Index
MongoDB uses multikey indexes to index the content stored in arrays. If you index a
field that holds an array value, MongoDB creates separate index entries
for every element of the array. These multikey indexes allow queries to select
documents that contain arrays by matching on element or elements of the arrays.
MongoDB automatically determines whether to create a multikey index if the indexed
field contains an array value; you do not need to explicitly specify the multikey type.
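A rough sketch of the idea in plain JavaScript (not how MongoDB actually stores entries; `multikeyEntries` and the sample documents are invented for illustration): indexing an array-valued field produces one index entry per element, so a query can match a document by any single element.

```javascript
// Produce one (key, document id) index entry per array element; scalar
// values get a single entry, mirroring the multikey behavior described above.
function multikeyEntries(docs, field) {
  const entries = [];
  for (const doc of docs) {
    const value = doc[field];
    const keys = Array.isArray(value) ? value : [value];
    for (const key of keys) entries.push({ key, id: doc._id });
  }
  return entries;
}

const docs = [
  { _id: 1, tags: ["red", "blank"] },
  { _id: 2, tags: ["plain"] },
];
console.log(multikeyEntries(docs, "tags"));
// [ { key: 'red', id: 1 }, { key: 'blank', id: 1 }, { key: 'plain', id: 2 } ]
```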
The createIndex() Method
To create an index, use the createIndex() method of MongoDB.
Syntax
The basic syntax of the createIndex() method is as follows −
>db.COLLECTION_NAME.createIndex({KEY:1})
Here, key is the name of the field on which you want to create the index, and 1 specifies
ascending order. To create an index in descending order, use -1.
Example
>db.mycol.createIndex({"title":1})
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
>
You can pass multiple fields to the createIndex() method to create an index on multiple fields.
>db.mycol.createIndex({"title":1,"description":-1})
>
This method also accepts a list of optional parameters. Following is the list −
background (Boolean) − Builds the index in the background so that building an index
does not block other database activities. Specify true to build in the background.
The default value is false.
unique (Boolean) − Creates a unique index so that the collection will not accept
insertion of documents where the index key or keys match an existing value in the
index. Specify true to create a unique index. The default value is false.
name (string) − The name of the index. If unspecified, MongoDB generates an index
name by concatenating the names of the indexed fields and the sort order.
The dropIndex() method
You can drop a particular index using the dropIndex() method of MongoDB.
Syntax
The basic syntax of the dropIndex() method is as follows −
>db.COLLECTION_NAME.dropIndex({KEY:1})
Here key is the name of the field whose index you want to drop, and 1 or -1 must
match the sort order with which the index was created.
Example
> db.mycol.dropIndex({"title":1})
{
   "ok" : 0,
   "errmsg" : "can't find index with key: { title: 1.0 }",
   "code" : 27,
   "codeName" : "IndexNotFound"
}
Here the call failed because the collection had no index with the key { title: 1 }. On
success, dropIndex() instead returns a document such as { "nIndexesWas" : 2, "ok" : 1 }.
The dropIndexes() method
This method deletes multiple (specified) indexes on a collection.
Syntax
The basic syntax of the dropIndexes() method is as follows −
>db.COLLECTION_NAME.dropIndexes()
Example
Assume we have created the following compound index in the collection named mycol −
> db.mycol.createIndex({"title":1,"description":-1})
The following example removes the above created index from mycol −
>db.mycol.dropIndexes({"title":1,"description":-1})
{ "nIndexesWas" : 2, "ok" : 1 }
>
The getIndexes() method
This method returns the description of all the indexes in the collection.
Syntax
Following is the basic syntax of the getIndexes() method −
db.COLLECTION_NAME.getIndexes()
Example
Assume we have created the following compound index in the collection named mycol −
> db.mycol.createIndex({"title":1,"description":-1})
The following example retrieves all the indexes in the collection mycol −
> db.mycol.getIndexes()
[
{
"v" : 2,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "test.mycol"
},
{
"v" : 2,
"key" : {
"title" : 1,
"description" : -1
},
"name" : "title_1_description_-1",
"ns" : "test.mycol"
}
]
>
Aggregation
Aggregation operations process data records and return computed results. Aggregation
operations group values from multiple documents together, and can perform a variety of
operations on the grouped data to return a single result. MongoDB provides three ways
to perform aggregation: the aggregation pipeline, the map-reduce function, and single
purpose aggregation methods.
Aggregation Pipeline
MongoDB's aggregation framework is modeled on the concept of data processing pipelines.
Documents enter a multi-stage pipeline that transforms the documents into an aggregated result.
For example:
db.orders.aggregate([
{ $match: { status: "A" } },
{ $group: { _id: "$cust_id", total: { $sum: "$amount" } } }
])
First Stage: The $match stage filters the documents by the status field and passes to the next
stage those documents that have status equal to "A".
Second Stage: The $group stage groups the documents by the cust_id field to calculate the
sum of the amount for each unique cust_id.
The most basic pipeline stages provide filters that operate like queries and document
transformations that modify the form of the output document.
Other pipeline operations provide tools for grouping and sorting documents by specific field or
fields as well as tools for aggregating the contents of arrays, including arrays of documents. In
addition, pipeline stages can use operators for tasks such as calculating the average or
concatenating a string.
The pipeline provides efficient data aggregation using native operations within MongoDB, and is
the preferred method for data aggregation in MongoDB.
The aggregation pipeline can operate on a sharded collection.
The aggregation pipeline can use indexes to improve its performance during some of its stages.
In addition, the aggregation pipeline has an internal optimization phase.
Pipeline Operators
 $project − Selects specific fields from a collection.
 $match − A filtering operation; it reduces the number of documents that are
given as input to the next stage.
 $group − Performs the actual aggregation, as discussed above.
 $sort − Sorts the documents.
 $skip − Skips forward in the list of documents by a given number of
documents.
 $limit − Limits the number of documents to look at, by the given number,
starting from the current position.
 $unwind − Unwinds documents that use arrays. With an array, the data is in
a sense pre-joined, and this operation undoes that so that there are individual
documents again. Thus this stage increases the number of documents for the
next stage.
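The effect of the $match and $group stages from the orders example can be simulated on a plain array (an illustrative JavaScript sketch of what the stages compute, not how MongoDB executes them; the sample data is invented):

```javascript
const orders = [
  { cust_id: "A123", amount: 500, status: "A" },
  { cust_id: "A123", amount: 250, status: "A" },
  { cust_id: "B212", amount: 200, status: "A" },
  { cust_id: "A123", amount: 300, status: "D" },
];

// Stage 1 ($match): keep only documents with status "A".
const matched = orders.filter((o) => o.status === "A");

// Stage 2 ($group): sum amount per cust_id.
const totals = {};
for (const o of matched) {
  totals[o.cust_id] = (totals[o.cust_id] || 0) + o.amount;
}
console.log(totals); // { A123: 750, B212: 200 }
```

Note how the document with status "D" never reaches the grouping step, which is why placing $match early reduces the work done by later stages.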
_____________________________________________________________________
Single Purpose Aggregation Operations
MongoDB also provides db.collection.count() and db.collection.distinct().
All of these operations aggregate documents from a single collection. While these operations
provide simple access to common aggregation processes, they lack the flexibility and capabilities
of an aggregation pipeline.
https://docs.mongodb.com/manual/aggregation/
Map-Reduce
An aggregation pipeline provides better performance and usability than a map-
reduce operation.
Map-reduce operations can be rewritten using aggregation pipeline operators, such
as $group, $merge, and others.
For map-reduce operations that require custom functionality, MongoDB provides
the $accumulator and $function aggregation operators starting in version 4.4. Use
these operators to define custom aggregation expressions in JavaScript.
Map-Reduce Examples
In the mongo shell, the db.collection.mapReduce() method is a wrapper around
the mapReduce command. The following examples use
the db.collection.mapReduce() method.
The examples in this section include aggregation pipeline alternatives without custom
aggregation expressions. For alternatives that use custom expressions, see Map-Reduce to
Aggregation Pipeline Translation Examples.
Create a sample collection orders with these documents:
db.orders.insertMany([
   { _id: 1, cust_id: "Ant O. Knee", ord_date: new Date("2020-03-01"), price: 25, items: [ { sku: "oranges", qty: 5, price: 2.5 }, { sku: "apples", qty: 5, price: 2.5 } ], status: "A" },
   { _id: 2, cust_id: "Ant O. Knee", ord_date: new Date("2020-03-08"), price: 70, items: [ { sku: "oranges", qty: 8, price: 2.5 }, { sku: "chocolates", qty: 5, price: 10 } ], status: "A" },
   { _id: 3, cust_id: "Busby Bee", ord_date: new Date("2020-03-08"), price: 50, items: [ { sku: "oranges", qty: 10, price: 2.5 }, { sku: "pears", qty: 10, price: 2.5 } ], status: "A" },
   { _id: 4, cust_id: "Busby Bee", ord_date: new Date("2020-03-18"), price: 25, items: [ { sku: "oranges", qty: 10, price: 2.5 } ], status: "A" },
   { _id: 5, cust_id: "Busby Bee", ord_date: new Date("2020-03-19"), price: 50, items: [ { sku: "chocolates", qty: 5, price: 10 } ], status: "A" },
   { _id: 6, cust_id: "Cam Elot", ord_date: new Date("2020-03-19"), price: 35, items: [ { sku: "carrots", qty: 10, price: 1.0 }, { sku: "apples", qty: 10, price: 2.5 } ], status: "A" },
   { _id: 7, cust_id: "Cam Elot", ord_date: new Date("2020-03-20"), price: 25, items: [ { sku: "oranges", qty: 10, price: 2.5 } ], status: "A" },
   { _id: 8, cust_id: "Don Quis", ord_date: new Date("2020-03-20"), price: 75, items: [ { sku: "chocolates", qty: 5, price: 10 }, { sku: "apples", qty: 10, price: 2.5 } ], status: "A" },
   { _id: 9, cust_id: "Don Quis", ord_date: new Date("2020-03-20"), price: 55, items: [ { sku: "carrots", qty: 5, price: 1.0 }, { sku: "apples", qty: 10, price: 2.5 }, { sku: "oranges", qty: 10, price: 2.5 } ], status: "A" },
   { _id: 10, cust_id: "Don Quis", ord_date: new Date("2020-03-23"), price: 25, items: [ { sku: "oranges", qty: 10, price: 2.5 } ], status: "A" }
])
Return the Total Price Per Customer
Perform the map-reduce operation on the orders collection to group by the cust_id, and
calculate the sum of the price for each cust_id:
1. Define the map function to process each input document:
o In the function, this refers to the document that the map-reduce operation is
processing.
o The function maps the price to the cust_id for each document and emits
the cust_id and price.
var mapFunction1 = function() {
   emit(this.cust_id, this.price);
};
2. Define the corresponding reduce function with two
arguments keyCustId and valuesPrices:
o The valuesPrices is an array whose elements are the price values emitted by
the map function and grouped by keyCustId.
o The function reduces the valuesPrices array to the sum of its elements.
var reduceFunction1 = function(keyCustId, valuesPrices) {
   return Array.sum(valuesPrices);
};
3. Perform map-reduce on all documents in the orders collection using
the mapFunction1 map function and the reduceFunction1 reduce function:
db.orders.mapReduce(
mapFunction1,
reduceFunction1,
{ out: "map_reduce_example" }
)
4. This operation outputs the results to a collection named map_reduce_example. If
the map_reduce_example collection already exists, the operation will replace the
contents with the results of this map-reduce operation.
5. Query the map_reduce_example collection to verify the results:
db.map_reduce_example.find().sort( { _id: 1 } )
6. The operation returns these documents:
{ "_id" : "Ant O. Knee", "value" : 95 }
{ "_id" : "Busby Bee", "value" : 125 }
{ "_id" : "Cam Elot", "value" : 60 }
{ "_id" : "Don Quis", "value" : 155 }
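The two phases above can be traced in plain JavaScript (a sketch of what the map and reduce functions compute, using only the cust_id and price fields of the sample orders; this is not how the server runs them):

```javascript
// The (cust_id, price) pairs the map function would emit, one per order.
const orders = [
  { cust_id: "Ant O. Knee", price: 25 }, { cust_id: "Ant O. Knee", price: 70 },
  { cust_id: "Busby Bee", price: 50 }, { cust_id: "Busby Bee", price: 25 },
  { cust_id: "Busby Bee", price: 50 },
  { cust_id: "Cam Elot", price: 35 }, { cust_id: "Cam Elot", price: 25 },
  { cust_id: "Don Quis", price: 75 }, { cust_id: "Don Quis", price: 55 },
  { cust_id: "Don Quis", price: 25 },
];

// Map phase: emit(cust_id, price); the framework groups values per key.
const emitted = new Map(); // key -> array of emitted values
for (const doc of orders) {
  const values = emitted.get(doc.cust_id) || [];
  values.push(doc.price);
  emitted.set(doc.cust_id, values);
}

// Reduce phase: sum the grouped values for each key.
const results = {};
for (const [key, values] of emitted) {
  results[key] = values.reduce((a, b) => a + b, 0);
}
console.log(results);
// { 'Ant O. Knee': 95, 'Busby Bee': 125, 'Cam Elot': 60, 'Don Quis': 155 }
```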
Aggregation Alternative
Using the available aggregation pipeline operators, you can rewrite the map-reduce operation
without defining custom functions:
db.orders.aggregate([
{ $group: { _id: "$cust_id", value: { $sum: "$price" } } },
{ $out: "agg_alternative_1" }
])
1. The $group stage groups by the cust_id and calculates the value field (see also $sum).
The value field contains the total price for each cust_id.
The stage outputs the following documents to the next stage:
{ "_id" : "Don Quis", "value" : 155 }
{ "_id" : "Ant O. Knee", "value" : 95 }
{ "_id" : "Cam Elot", "value" : 60 }
{ "_id" : "Busby Bee", "value" : 125 }
2. Then, the $out stage writes the output to the collection agg_alternative_1. Alternatively,
you could use $merge instead of $out.
3. Query the agg_alternative_1 collection to verify the results:
db.agg_alternative_1.find().sort( { _id: 1 } )
4. The operation returns the following documents:
{ "_id" : "Ant O. Knee", "value" : 95 }
{ "_id" : "Busby Bee", "value" : 125 }
{ "_id" : "Cam Elot", "value" : 60 }
{ "_id" : "Don Quis", "value" : 155 }
MongoDB - Replication
Replication is the process of synchronizing data across multiple servers. Replication
provides redundancy and increases data availability with multiple copies of data on
different database servers. Replication protects a database from the loss of a single
server. Replication also allows you to recover from hardware failure and service
interruptions. With additional copies of the data, you can dedicate one to disaster
recovery, reporting, or backup.
Why Replication?
 To keep your data safe
 High (24*7) availability of data
 Disaster recovery
 No downtime for maintenance (like backups, index rebuilds, compaction)
 Read scaling (extra copies to read from)
 Replica set is transparent to the application
How Replication Works in MongoDB
MongoDB achieves replication by the use of replica sets. A replica set is a group
of mongod instances that host the same data set. In a replica set, one node is the
primary node, which receives all write operations. All other instances, the secondaries,
apply operations from the primary so that they have the same data set. A replica set
can have only one primary node.
 A replica set is a group of two or more nodes (generally a minimum of 3 nodes is
required).
 In a replica set, one node is the primary node and the remaining nodes are secondaries.
 All data replicates from the primary to the secondary nodes.
 At the time of automatic failover or maintenance, an election is held and a new
primary node is elected.
 After the recovery of a failed node, it rejoins the replica set and works as a
secondary node.
In a typical MongoDB replication setup, the client application always interacts with the
primary node, and the primary node then replicates the data to the secondary nodes.
Replica Set Features
 A cluster of N nodes
 Any one node can be primary
 All write operations go to primary
 Automatic failover
 Automatic recovery
 Consensus election of primary
Set Up a Replica Set
Here we will convert a standalone MongoDB instance to a replica set. To convert to a
replica set, following are the steps −
 Shut down the already running MongoDB server.
 Start the MongoDB server by specifying the --replSet option. Following is the basic
syntax of --replSet −
mongod --port "PORT" --dbpath "YOUR_DB_DATA_PATH" --replSet
"REPLICA_SET_INSTANCE_NAME"
Example
mongod --port 27017 --dbpath "D:\set up\mongodb\data" --replSet rs0
 It will start a mongod instance with the replica set name rs0, on port 27017.
 Now start the command prompt and connect to this mongod instance.
 In the Mongo client, issue the command rs.initiate() to initiate a new replica set.
 To check the replica set configuration, issue the command rs.conf(). To check
the status of the replica set, issue the command rs.status().
unit 4,Indexes in database.docx

More Related Content

Similar to unit 4,Indexes in database.docx

Indexing and Query Optimizer (Mongo Austin)
Indexing and Query Optimizer (Mongo Austin)Indexing and Query Optimizer (Mongo Austin)
Indexing and Query Optimizer (Mongo Austin)MongoDB
 
Indexing and Query Optimizer (Richard Kreuter)
Indexing and Query Optimizer (Richard Kreuter)Indexing and Query Optimizer (Richard Kreuter)
Indexing and Query Optimizer (Richard Kreuter)MongoDB
 
Mongoseattle indexing-2010-07-27
Mongoseattle indexing-2010-07-27Mongoseattle indexing-2010-07-27
Mongoseattle indexing-2010-07-27MongoDB
 
Storage dei dati con MongoDB
Storage dei dati con MongoDBStorage dei dati con MongoDB
Storage dei dati con MongoDBAndrea Balducci
 
Fast querying indexing for performance (4)
Fast querying   indexing for performance (4)Fast querying   indexing for performance (4)
Fast querying indexing for performance (4)MongoDB
 
Mongo Nosql CRUD Operations
Mongo Nosql CRUD OperationsMongo Nosql CRUD Operations
Mongo Nosql CRUD Operationsanujaggarwal49
 
MongoDB and Indexes - MUG Denver - 20160329
MongoDB and Indexes - MUG Denver - 20160329MongoDB and Indexes - MUG Denver - 20160329
MongoDB and Indexes - MUG Denver - 20160329Douglas Duncan
 
Indexing and Query Performance in MongoDB.pdf
Indexing and Query Performance in MongoDB.pdfIndexing and Query Performance in MongoDB.pdf
Indexing and Query Performance in MongoDB.pdfMalak Abu Hammad
 
MongoDB - An Introduction
MongoDB - An IntroductionMongoDB - An Introduction
MongoDB - An Introductiondinkar thakur
 
Introduction To MongoDB
Introduction To MongoDBIntroduction To MongoDB
Introduction To MongoDBElieHannouch
 
MongoDB NoSQL database a deep dive -MyWhitePaper
MongoDB  NoSQL database a deep dive -MyWhitePaperMongoDB  NoSQL database a deep dive -MyWhitePaper
MongoDB NoSQL database a deep dive -MyWhitePaperRajesh Kumar
 
Preview of Custom Search Admin Tools
Preview of Custom Search Admin ToolsPreview of Custom Search Admin Tools
Preview of Custom Search Admin ToolsAxiell ALM
 
Indexing and Query Optimisation
Indexing and Query OptimisationIndexing and Query Optimisation
Indexing and Query OptimisationMongoDB
 

Similar to unit 4,Indexes in database.docx (20)

Indexing and Query Optimizer (Mongo Austin)
Indexing and Query Optimizer (Mongo Austin)Indexing and Query Optimizer (Mongo Austin)
Indexing and Query Optimizer (Mongo Austin)
 
Mongodb Introduction
Mongodb IntroductionMongodb Introduction
Mongodb Introduction
 
Lucene basics
Lucene basicsLucene basics
Lucene basics
 
Indexing and Query Optimizer (Richard Kreuter)
Indexing and Query Optimizer (Richard Kreuter)Indexing and Query Optimizer (Richard Kreuter)
Indexing and Query Optimizer (Richard Kreuter)
 
Mongoseattle indexing-2010-07-27
Mongoseattle indexing-2010-07-27Mongoseattle indexing-2010-07-27
Mongoseattle indexing-2010-07-27
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Storage dei dati con MongoDB
Storage dei dati con MongoDBStorage dei dati con MongoDB
Storage dei dati con MongoDB
 
Fast querying indexing for performance (4)
Fast querying   indexing for performance (4)Fast querying   indexing for performance (4)
Fast querying indexing for performance (4)
 
Mongo Nosql CRUD Operations
Mongo Nosql CRUD OperationsMongo Nosql CRUD Operations
Mongo Nosql CRUD Operations
 
MongoDB and Indexes - MUG Denver - 20160329
MongoDB and Indexes - MUG Denver - 20160329MongoDB and Indexes - MUG Denver - 20160329
MongoDB and Indexes - MUG Denver - 20160329
 
Indexing and Query Performance in MongoDB.pdf
Indexing and Query Performance in MongoDB.pdfIndexing and Query Performance in MongoDB.pdf
Indexing and Query Performance in MongoDB.pdf
 
MongoDB - An Introduction
MongoDB - An IntroductionMongoDB - An Introduction
MongoDB - An Introduction
 
Mongo db
Mongo dbMongo db
Mongo db
 
Mongo db basics
Mongo db basicsMongo db basics
Mongo db basics
 
Introduction To MongoDB
Introduction To MongoDBIntroduction To MongoDB
Introduction To MongoDB
 
Getting Started with MongoDB
Getting Started with MongoDBGetting Started with MongoDB
Getting Started with MongoDB
 
MongoDB NoSQL database a deep dive -MyWhitePaper
MongoDB  NoSQL database a deep dive -MyWhitePaperMongoDB  NoSQL database a deep dive -MyWhitePaper
MongoDB NoSQL database a deep dive -MyWhitePaper
 
MongoDB
MongoDBMongoDB
MongoDB
 
Preview of Custom Search Admin Tools
Preview of Custom Search Admin ToolsPreview of Custom Search Admin Tools
Preview of Custom Search Admin Tools
 
Indexing and Query Optimisation
Indexing and Query OptimisationIndexing and Query Optimisation
Indexing and Query Optimisation
 

More from RaviRajput416403 (17)

CCNA by Todd Lammle ( PDFDrive ).pdf
CCNA by Todd Lammle ( PDFDrive ).pdfCCNA by Todd Lammle ( PDFDrive ).pdf
CCNA by Todd Lammle ( PDFDrive ).pdf
 
NP-lab-manual (1).pdf
NP-lab-manual (1).pdfNP-lab-manual (1).pdf
NP-lab-manual (1).pdf
 
NP-lab-manual.pdf
NP-lab-manual.pdfNP-lab-manual.pdf
NP-lab-manual.pdf
 
NP-lab-manual.docx
NP-lab-manual.docxNP-lab-manual.docx
NP-lab-manual.docx
 
babita.docx
babita.docxbabita.docx
babita.docx
 
Animal survey by ASHISH.docx
Animal survey by ASHISH.docxAnimal survey by ASHISH.docx
Animal survey by ASHISH.docx
 
unit-5 sql notes.docx
unit-5 sql notes.docxunit-5 sql notes.docx
unit-5 sql notes.docx
 
SONIYA GAUTAM.docx
SONIYA GAUTAM.docxSONIYA GAUTAM.docx
SONIYA GAUTAM.docx
 
pankaj gupta.docx
pankaj gupta.docxpankaj gupta.docx
pankaj gupta.docx
 
S.docx
S.docxS.docx
S.docx
 
KASHISH CHOPRA MAIL MERGE.docx
KASHISH CHOPRA MAIL MERGE.docxKASHISH CHOPRA MAIL MERGE.docx
KASHISH CHOPRA MAIL MERGE.docx
 
SONIYA GAUTAM.docx
SONIYA GAUTAM.docxSONIYA GAUTAM.docx
SONIYA GAUTAM.docx
 
SONIYA GAUTAM.docx
SONIYA GAUTAM.docxSONIYA GAUTAM.docx
SONIYA GAUTAM.docx
 
babita.docx
babita.docxbabita.docx
babita.docx
 
KASHISH CHOPRA MAIL MERGE.docx
KASHISH CHOPRA MAIL MERGE.docxKASHISH CHOPRA MAIL MERGE.docx
KASHISH CHOPRA MAIL MERGE.docx
 
soniya.docx
soniya.docxsoniya.docx
soniya.docx
 
Kunal.docx
Kunal.docxKunal.docx
Kunal.docx
 

Recently uploaded

WHO KILLED ALASKA? #18: Mirror Memoria - "TATTOO" TRANSCRIPT.pdf
WHO KILLED ALASKA? #18: Mirror Memoria - "TATTOO" TRANSCRIPT.pdfWHO KILLED ALASKA? #18: Mirror Memoria - "TATTOO" TRANSCRIPT.pdf
WHO KILLED ALASKA? #18: Mirror Memoria - "TATTOO" TRANSCRIPT.pdfOptimistic18
 
C&C Artists' Websites .
C&C Artists' Websites                       .C&C Artists' Websites                       .
C&C Artists' Websites .LukeNash7
 
VIP/Call/Girls Nandi Hills 6378878445 Hours Service Available Day and Night
VIP/Call/Girls Nandi Hills 6378878445 Hours Service Available Day and NightVIP/Call/Girls Nandi Hills 6378878445 Hours Service Available Day and Night
VIP/Call/Girls Nandi Hills 6378878445 Hours Service Available Day and Nightdarmandersingh4580
 
一比一原版(UofM毕业证书)明尼苏达大学毕业证如何办理
一比一原版(UofM毕业证书)明尼苏达大学毕业证如何办理一比一原版(UofM毕业证书)明尼苏达大学毕业证如何办理
一比一原版(UofM毕业证书)明尼苏达大学毕业证如何办理nuovo1
 
NO1 Top Amil Baba In Sahiwal, Okara, Hafizabad, Mandi Bahauddin, Jhelum, Jar...
NO1 Top Amil Baba In Sahiwal, Okara, Hafizabad,  Mandi Bahauddin, Jhelum, Jar...NO1 Top Amil Baba In Sahiwal, Okara, Hafizabad,  Mandi Bahauddin, Jhelum, Jar...
NO1 Top Amil Baba In Sahiwal, Okara, Hafizabad, Mandi Bahauddin, Jhelum, Jar...Amil baba
 
一比一原版(UC Berkeley毕业证书)加利福尼亚大学伯克利分校毕业证如何办理
一比一原版(UC Berkeley毕业证书)加利福尼亚大学伯克利分校毕业证如何办理一比一原版(UC Berkeley毕业证书)加利福尼亚大学伯克利分校毕业证如何办理
一比一原版(UC Berkeley毕业证书)加利福尼亚大学伯克利分校毕业证如何办理nuovo1
 
Codes and conventions of film magazines.pptx
Codes and conventions of film magazines.pptxCodes and conventions of film magazines.pptx
Codes and conventions of film magazines.pptxCharlotte512934
 
在线办理(concordia学位证书)康考迪亚大学毕业证学历学位证书学费发票原版一模一样
在线办理(concordia学位证书)康考迪亚大学毕业证学历学位证书学费发票原版一模一样在线办理(concordia学位证书)康考迪亚大学毕业证学历学位证书学费发票原版一模一样
在线办理(concordia学位证书)康考迪亚大学毕业证学历学位证书学费发票原版一模一样ahafux
 

Recently uploaded (8)

WHO KILLED ALASKA? #18: Mirror Memoria - "TATTOO" TRANSCRIPT.pdf
WHO KILLED ALASKA? #18: Mirror Memoria - "TATTOO" TRANSCRIPT.pdfWHO KILLED ALASKA? #18: Mirror Memoria - "TATTOO" TRANSCRIPT.pdf
WHO KILLED ALASKA? #18: Mirror Memoria - "TATTOO" TRANSCRIPT.pdf
 
C&C Artists' Websites .
C&C Artists' Websites                       .C&C Artists' Websites                       .
C&C Artists' Websites .
 
VIP/Call/Girls Nandi Hills 6378878445 Hours Service Available Day and Night
VIP/Call/Girls Nandi Hills 6378878445 Hours Service Available Day and NightVIP/Call/Girls Nandi Hills 6378878445 Hours Service Available Day and Night
VIP/Call/Girls Nandi Hills 6378878445 Hours Service Available Day and Night
 
一比一原版(UofM毕业证书)明尼苏达大学毕业证如何办理
一比一原版(UofM毕业证书)明尼苏达大学毕业证如何办理一比一原版(UofM毕业证书)明尼苏达大学毕业证如何办理
一比一原版(UofM毕业证书)明尼苏达大学毕业证如何办理
 
NO1 Top Amil Baba In Sahiwal, Okara, Hafizabad, Mandi Bahauddin, Jhelum, Jar...
NO1 Top Amil Baba In Sahiwal, Okara, Hafizabad,  Mandi Bahauddin, Jhelum, Jar...NO1 Top Amil Baba In Sahiwal, Okara, Hafizabad,  Mandi Bahauddin, Jhelum, Jar...
NO1 Top Amil Baba In Sahiwal, Okara, Hafizabad, Mandi Bahauddin, Jhelum, Jar...
 
一比一原版(UC Berkeley毕业证书)加利福尼亚大学伯克利分校毕业证如何办理
一比一原版(UC Berkeley毕业证书)加利福尼亚大学伯克利分校毕业证如何办理一比一原版(UC Berkeley毕业证书)加利福尼亚大学伯克利分校毕业证如何办理
一比一原版(UC Berkeley毕业证书)加利福尼亚大学伯克利分校毕业证如何办理
 
Codes and conventions of film magazines.pptx
Codes and conventions of film magazines.pptxCodes and conventions of film magazines.pptx
Codes and conventions of film magazines.pptx
 
在线办理(concordia学位证书)康考迪亚大学毕业证学历学位证书学费发票原版一模一样
在线办理(concordia学位证书)康考迪亚大学毕业证学历学位证书学费发票原版一模一样在线办理(concordia学位证书)康考迪亚大学毕业证学历学位证书学费发票原版一模一样
在线办理(concordia学位证书)康考迪亚大学毕业证学历学位证书学费发票原版一模一样
 

unit 4,Indexes in database.docx

  • 1. Indexes Indexes support the efficient execution of queries in MongoDB. Without indexes, MongoDB must perform a collection scan, i.e. scan every document in a collection, to select those documents that match the query statement. If an appropriate index exists for a query, MongoDB can use the index to limit the number of documents it must inspect. Indexes are special data structures [1] that store a small portion of the collection's data set in an easy to traverse form. The index stores the value of a specific field or set of fields, ordered by the value of the field. The ordering of the index entries supports efficient equality matches and range-based query operations. In addition, MongoDB can return sorted results by using the ordering in the index. Fundamentally, indexes in MongoDB are similar to indexes in other database systems. MongoDB defines indexes at the collection level and supports indexes on any field or sub-field of the documents in a MongoDB collection. Default _id Index MongoDB creates a unique index on the _id field during the creation of a collection. The _id index prevents clients from inserting two documents with the same value for the _id field. You cannot drop this index on the _id field. Create an Index To create an index in the Mongo Shell, use db.collection.createIndex(). db.collection.createIndex( <key and index type specification>, <options> ) The following example creates a single key descending index on the name field: db.collection.createIndex( { name: -1 } ) The db.collection.createIndex() method only creates an index if an index of the same specification does not already exist.
  • 2. [1] MongoDB indexes use a B-tree data structure. Index Names The default name for an index is the concatenation of the indexed keys and each key's direction in the index ( i.e. 1 or -1) using underscores as a separator. For example, an index created on { item : 1, quantity: -1 } has the name item_1_quantity_-1. You can create indexes with a custom name, such as one that is more human-readable than the default. For example, consider an application that frequently queries the products collection to populate data on existing inventory. The following createIndex() method creates an index on item and quantity named query for inventory: db.products.createIndex( { item: 1, quantity: -1 } , { name: "query for inventory" } ) You can view index names using the db.collection.getIndexes() method. You cannot rename an index once created. Instead, you must drop and re-create the index with a new name. Index Types MongoDB provides a number of different index types to support specific types of data and queries. Single Field In addition to the MongoDB-defined _id index, MongoDB supports the creation of user- defined ascending/descending indexes on a single field of a document.
  • 3. For a single-field index and sort operations, the sort order (i.e. ascending or descending) of the index key does not matter because MongoDB can traverse the index in either direction. Compound Index MongoDB also supports user-defined indexes on multiple fields, i.e. compound indexes. The order of fields listed in a compound index has significance. For instance, if a compound index consists of { userid: 1, score: -1 }, the index sorts first by userid and then, within each userid value, sorts by score. For compound indexes and sort operations, the sort order (i.e. ascending or descending) of the index keys can determine whether the index can support a sort operation. Multikey Index MongoDB uses multikey indexes to index the content stored in arrays. If you index a field that holds an array value, MongoDB creates separate index entries for every element of the array. These multikey indexes allow queries to select documents that contain arrays by matching on element or elements of the arrays. MongoDB automatically determines whether to create a multikey index if the indexed field contains an array value; you do not need to explicitly specify the multikey type. Indexes support the efficient resolution of queries. Without indexes, MongoDB must scan every document of a collection to select those documents that match the query statement. This scan is highly inefficient and require MongoDB to process a large volume of data. Indexes are special data structures, that store a small portion of the data set in an easy-to- traverse form. The index stores the value of a specific field or set of fields, ordered by the value of the field as specified in the index. The createIndex() Method To create an index, you need to use createIndex() method of MongoDB.
Syntax
The basic syntax of the createIndex() method is as follows:

>db.COLLECTION_NAME.createIndex({KEY:1})

Here KEY is the name of the field on which you want to create the index, and 1 specifies ascending order. To create an index in descending order, use -1.

Example
>db.mycol.createIndex({"title":1})
{
   "createdCollectionAutomatically" : false,
   "numIndexesBefore" : 1,
   "numIndexesAfter" : 2,
   "ok" : 1
}
>

You can pass multiple fields to the createIndex() method to create an index on multiple fields.

>db.mycol.createIndex({"title":1,"description":-1})
>

This method also accepts a list of options (which are optional). Following is the list −

Parameter  | Type    | Description
background | Boolean | Builds the index in the background so that building an index does not block other database activities. Specify true to build in the background. The default value is false.
unique     | Boolean | Creates a unique index so that the collection will not accept insertion of documents where the index key or keys match an existing value in the index. Specify true to create a unique index. The default value is false.
name       | string  | The name of the index. If unspecified, MongoDB generates an index name by concatenating the names of the indexed fields and the sort order.
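The effect of the unique option can be illustrated with a small in-memory sketch. This is a toy model in plain JavaScript, not MongoDB's storage engine; the UniqueIndex class is invented for illustration, though the E11000 prefix mirrors the message MongoDB reports for duplicate keys:

```javascript
// Toy model of a unique index (illustrative only, not MongoDB internals):
// inserting a key that already exists in the index is rejected.
class UniqueIndex {
  constructor() {
    this.keys = new Set();
  }
  insert(key) {
    if (this.keys.has(key)) {
      // MongoDB reports duplicate-key violations with error code E11000.
      throw new Error(`E11000 duplicate key error: ${JSON.stringify(key)}`);
    }
    this.keys.add(key);
  }
}

const idx = new UniqueIndex();
idx.insert("alice"); // accepted: key not yet present
try {
  idx.insert("alice"); // rejected: key already indexed
} catch (e) {
  console.log(e.message);
}
```

A non-unique index, by contrast, would simply store a second entry for the same key value.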
The dropIndex() method
You can drop a particular index using the dropIndex() method of MongoDB.

Syntax
The basic syntax of the dropIndex() method is as follows:

>db.COLLECTION_NAME.dropIndex({KEY:1})

Here KEY is the name of the field on which the index was created, with 1 for ascending order (use -1 if the index was created in descending order).

Example
If no index with the given key exists, MongoDB returns an IndexNotFound error:

> db.mycol.dropIndex({"title":1})
{
   "ok" : 0,
   "errmsg" : "can't find index with key: { title: 1.0 }",
   "code" : 27,
   "codeName" : "IndexNotFound"
}

The dropIndexes() method
This method deletes multiple (specified) indexes on a collection.

Syntax
The basic syntax of the dropIndexes() method is as follows −

>db.COLLECTION_NAME.dropIndexes()

Example
Assume we have created 2 indexes in the mycol collection as shown below −

> db.mycol.createIndex({"title":1,"description":-1})

The following example removes the above created indexes of mycol −

>db.mycol.dropIndexes({"title":1,"description":-1})
{ "nIndexesWas" : 2, "ok" : 1 }
>

The getIndexes() method
This method returns the description of all the indexes in the collection.

Syntax
Following is the basic syntax of the getIndexes() method −

db.COLLECTION_NAME.getIndexes()

Example
Assume we have created 2 indexes in the mycol collection as shown below −

> db.mycol.createIndex({"title":1,"description":-1})

The following example retrieves all the indexes in the collection mycol −

> db.mycol.getIndexes()
[
   {
      "v" : 2,
      "key" : { "_id" : 1 },
      "name" : "_id_",
      "ns" : "test.mycol"
   },
   {
      "v" : 2,
      "key" : { "title" : 1, "description" : -1 },
      "name" : "title_1_description_-1",
      "ns" : "test.mycol"
   }
]
>
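The name title_1_description_-1 in the getIndexes() output above follows the default-name rule described earlier: each indexed key is concatenated with its direction, joined by underscores. That rule can be sketched in a few lines of plain JavaScript (illustrative only; this is not MongoDB source code):

```javascript
// Sketch of MongoDB's default index naming: join each field with its
// direction (1 or -1), separated by underscores.
function defaultIndexName(keySpec) {
  return Object.entries(keySpec)
    .map(([field, direction]) => `${field}_${direction}`)
    .join("_");
}

console.log(defaultIndexName({ title: 1, description: -1 }));
// "title_1_description_-1"
console.log(defaultIndexName({ item: 1, quantity: -1 }));
// "item_1_quantity_-1"
```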
Aggregation
Aggregation operations process data records and return computed results. Aggregation operations group values from multiple documents together, and can perform a variety of operations on the grouped data to return a single result. MongoDB provides three ways to perform aggregation: the aggregation pipeline, the map-reduce function, and single purpose aggregation methods.

Aggregation Pipeline
MongoDB's aggregation framework is modeled on the concept of data processing pipelines. Documents enter a multi-stage pipeline that transforms the documents into an aggregated result. For example:

db.orders.aggregate([
   { $match: { status: "A" } },
   { $group: { _id: "$cust_id", total: { $sum: "$amount" } } }
])

First Stage: The $match stage filters the documents by the status field and passes to the next stage those documents that have status equal to "A".
Second Stage: The $group stage groups the documents by the cust_id field to calculate the sum of the amount for each unique cust_id.

The most basic pipeline stages provide filters that operate like queries and document transformations that modify the form of the output document. Other pipeline operations provide tools for grouping and sorting documents by specific field or fields as well as tools for aggregating the contents of arrays, including arrays of documents. In addition, pipeline stages can use operators for tasks such as calculating the average or concatenating a string.
The pipeline provides efficient data aggregation using native operations within MongoDB, and is the preferred method for data aggregation in MongoDB. The aggregation pipeline can operate on a sharded collection.
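What the two-stage pipeline above computes can be modeled in a few lines of plain JavaScript. This is an in-memory sketch over made-up sample orders, not the MongoDB query engine:

```javascript
// Made-up sample orders (cust_id values are illustrative).
const orders = [
  { cust_id: "A123", amount: 500, status: "A" },
  { cust_id: "A123", amount: 250, status: "A" },
  { cust_id: "B212", amount: 200, status: "A" },
  { cust_id: "A123", amount: 300, status: "D" },
];

// Stage 1 ($match): keep only documents whose status equals "A".
const matched = orders.filter((doc) => doc.status === "A");

// Stage 2 ($group): sum the amount per unique cust_id.
const totals = {};
for (const doc of matched) {
  totals[doc.cust_id] = (totals[doc.cust_id] || 0) + doc.amount;
}

console.log(totals); // { A123: 750, B212: 200 }
```

Note that the order with status "D" never reaches the grouping stage; filtering early is exactly why placing $match first reduces the work done by later stages.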
The aggregation pipeline can use indexes to improve its performance during some of its stages. In addition, the aggregation pipeline has an internal optimization phase.

Pipeline Operators
 $project − Used to select some specific fields from a collection.
 $match − This is a filtering operation and thus it can reduce the number of documents that are given as input to the next stage.
 $group − This does the actual aggregation as discussed above.
 $sort − Sorts the documents.
 $skip − With this, it is possible to skip forward in the list of documents by a given number of documents.
 $limit − This limits the number of documents to look at, by the given number starting from the current position.
 $unwind − This is used to unwind documents that use arrays. When using an array, the data is in a sense pre-joined, and this operation undoes that to produce individual documents again. Thus, this stage increases the number of documents for the next stage.
_____________________________________________________________________

Single Purpose Aggregation Operations
MongoDB also provides db.collection.count() and db.collection.distinct(). All of these operations aggregate documents from a single collection. While these operations provide simple access to common aggregation processes, they lack the flexibility and capabilities of an aggregation pipeline.

https://docs.mongodb.com/manual/aggregation/
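Of the stages listed above, $unwind is the least intuitive. Its effect can be sketched in plain JavaScript (an illustrative model, not MongoDB internals): each document holding an array is replaced by one document per array element.

```javascript
// Toy model of $unwind: emit one copy of the document per array element,
// with the array field replaced by that single element.
function unwind(docs, field) {
  const out = [];
  for (const doc of docs) {
    for (const element of doc[field] || []) {
      out.push({ ...doc, [field]: element });
    }
  }
  return out;
}

const inventory = [
  { _id: 1, sizes: ["S", "M"] },
  { _id: 2, sizes: ["L"] },
];

console.log(unwind(inventory, "sizes"));
// [ { _id: 1, sizes: 'S' }, { _id: 1, sizes: 'M' }, { _id: 2, sizes: 'L' } ]
```

Two input documents become three output documents, which is why the text says this stage increases the number of documents handed to the next stage.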
Map-Reduce
An aggregation pipeline provides better performance and usability than a map-reduce operation. Map-reduce operations can be rewritten using aggregation pipeline operators, such as $group, $merge, and others. For map-reduce operations that require custom functionality, MongoDB provides the $accumulator and $function aggregation operators starting in version 4.4. Use these operators to define custom aggregation expressions in JavaScript.

Map-Reduce Examples
In the mongo shell, the db.collection.mapReduce() method is a wrapper around the mapReduce command. The following examples use the db.collection.mapReduce() method.
The examples in this section include aggregation pipeline alternatives without custom aggregation expressions. For alternatives that use custom expressions, see Map-Reduce to Aggregation Pipeline Translation Examples.

Create a sample collection orders with these documents:
db.orders.insertMany([
   { _id: 1, cust_id: "Ant O. Knee", ord_date: new Date("2020-03-01"), price: 25, items: [ { sku: "oranges", qty: 5, price: 2.5 }, { sku: "apples"
   { _id: 2, cust_id: "Ant O. Knee", ord_date: new Date("2020-03-08"), price: 70, items: [ { sku: "oranges", qty: 8, price: 2.5 }, { sku: "chocola
   { _id: 3, cust_id: "Busby Bee", ord_date: new Date("2020-03-08"), price: 50, items: [ { sku: "oranges", qty: 10, price: 2.5 }, { sku: "pears",
   { _id: 4, cust_id: "Busby Bee", ord_date: new Date("2020-03-18"), price: 25, items: [ { sku: "oranges", qty: 10, price: 2.5 } ], status: "A" },
   { _id: 5, cust_id: "Busby Bee", ord_date: new Date("2020-03-19"), price: 50, items: [ { sku: "chocolates", qty: 5, price: 10 } ], status: "A" },
   { _id: 6, cust_id: "Cam Elot", ord_date: new Date("2020-03-19"), price: 35, items: [ { sku: "carrots", qty: 10, price: 1.0 }, { sku: "apples", q
   { _id: 7, cust_id: "Cam Elot", ord_date: new Date("2020-03-20"), price: 25, items: [ { sku: "oranges", qty: 10, price: 2.5 } ], status: "A" },
   { _id: 8, cust_id: "Don Quis", ord_date: new Date("2020-03-20"), price: 75, items: [ { sku: "chocolates", qty: 5, price: 10 }, { sku: "apples",
   { _id: 9, cust_id: "Don Quis", ord_date: new Date("2020-03-20"), price: 55, items: [ { sku: "carrots", qty: 5, price: 1.0 }, { sku: "apples", qt
   { _id: 10, cust_id: "Don Quis", ord_date: new Date("2020-03-23"), price: 25, items: [ { sku: "oranges", qty: 10, price: 2.5 } ], status: "A" }
])

Return the Total Price Per Customer
Perform the map-reduce operation on the orders collection to group by the cust_id, and calculate the sum of the price for each cust_id:

1. Define the map function to process each input document:
   o In the function, this refers to the document that the map-reduce operation is processing.
   o The function maps the price to the cust_id for each document and emits the cust_id and price.

   var mapFunction1 = function() {
      emit(this.cust_id, this.price);
   };

2. Define the corresponding reduce function with two arguments keyCustId and valuesPrices:
   o The valuesPrices is an array whose elements are the price values emitted by the map function and grouped by keyCustId.
   o The function reduces the valuesPrices array to the sum of its elements.

   var reduceFunction1 = function(keyCustId, valuesPrices) {
      return Array.sum(valuesPrices);
   };

3. Perform map-reduce on all documents in the orders collection using the mapFunction1 map function and the reduceFunction1 reduce function:

   db.orders.mapReduce(
      mapFunction1,
      reduceFunction1,
      { out: "map_reduce_example" }
   )

4. This operation outputs the results to a collection named map_reduce_example. If the map_reduce_example collection already exists, the operation will replace the contents with the results of this map-reduce operation.

5. Query the map_reduce_example collection to verify the results:

   db.map_reduce_example.find().sort( { _id: 1 } )

6. The operation returns these documents:

   { "_id" : "Ant O. Knee", "value" : 95 }
   { "_id" : "Busby Bee", "value" : 125 }
   { "_id" : "Cam Elot", "value" : 60 }
   { "_id" : "Don Quis", "value" : 155 }

Aggregation Alternative
Using the available aggregation pipeline operators, you can rewrite the map-reduce operation without defining custom functions:

db.orders.aggregate([
   { $group: { _id: "$cust_id", value: { $sum: "$price" } } },
   { $out: "agg_alternative_1" }
])

1. The $group stage groups by the cust_id and calculates the value field (see also $sum). The value field contains the total price for each cust_id. The stage outputs the following documents to the next stage:

   { "_id" : "Don Quis", "value" : 155 }
   { "_id" : "Ant O. Knee", "value" : 95 }
   { "_id" : "Cam Elot", "value" : 60 }
   { "_id" : "Busby Bee", "value" : 125 }

2. Then, the $out stage writes the output to the collection agg_alternative_1. Alternatively, you could use $merge instead of $out.

3. Query the agg_alternative_1 collection to verify the results:

   db.agg_alternative_1.find().sort( { _id: 1 } )

4. The operation returns the following documents:

   { "_id" : "Ant O. Knee", "value" : 95 }
   { "_id" : "Busby Bee", "value" : 125 }
   { "_id" : "Cam Elot", "value" : 60 }
   { "_id" : "Don Quis", "value" : 155 }

MongoDB - Replication
Replication is the process of synchronizing data across multiple servers. Replication provides redundancy and increases data availability with multiple copies of data on different database servers. Replication protects a database from the loss of a single server. Replication also allows you to recover from hardware failure and service interruptions. With additional copies of the data, you can dedicate one to disaster recovery, reporting, or backup.

Why Replication?
 To keep your data safe
 High (24*7) availability of data
 Disaster recovery
 No downtime for maintenance (like backups, index rebuilds, compaction)
 Read scaling (extra copies to read from)
 A replica set is transparent to the application

How Replication Works in MongoDB
MongoDB achieves replication through replica sets. A replica set is a group of mongod instances that host the same data set. In a replica set, one node is the primary node, which receives all write operations. All other instances, the secondaries, apply operations from the primary so that they have the same data set. A replica set can have only one primary node.

 A replica set is a group of two or more nodes (generally, a minimum of 3 nodes is required).
 In a replica set, one node is the primary node and the remaining nodes are secondaries.
 All data replicates from the primary to the secondary nodes.
 At the time of automatic failover or maintenance, an election is held and a new primary node is elected.
 After the recovery of a failed node, it rejoins the replica set and works as a secondary node.

A typical diagram of MongoDB replication shows that the client application always interacts with the primary node, and the primary node then replicates the data to the secondary nodes.
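The write-to-primary, apply-on-secondaries flow described above can be sketched as a toy model in plain JavaScript. The Node and ReplicaSet classes and the simple operation log here are illustrative assumptions; MongoDB's actual oplog format, acknowledgement, and election protocol are far more involved:

```javascript
// Toy model of replication: all writes go to the primary, which records
// them in an operation log; secondaries apply the log in order and
// converge on the same data set.
class Node {
  constructor() {
    this.data = new Map();
  }
  apply(op) {
    this.data.set(op.key, op.value);
  }
}

class ReplicaSet {
  constructor(secondaryCount) {
    this.primary = new Node();
    this.secondaries = Array.from({ length: secondaryCount }, () => new Node());
    this.oplog = [];
  }
  write(key, value) {
    const op = { key, value };
    this.primary.apply(op); // writes are accepted only by the primary
    this.oplog.push(op);    // and recorded for the secondaries
  }
  replicate() {
    // Each secondary replays the logged operations in order.
    for (const s of this.secondaries) {
      this.oplog.forEach((op) => s.apply(op));
    }
  }
}

const rs = new ReplicaSet(2);
rs.write("x", 1);
rs.write("x", 2);
rs.replicate();
console.log(rs.secondaries.every((s) => s.data.get("x") === 2)); // true
```

Because every node ends up applying the same ordered log, any secondary holds the data needed to take over as primary after a failover election.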
Replica Set Features
 A cluster of N nodes
 Any one node can be primary
 All write operations go to the primary
 Automatic failover
 Automatic recovery
 Consensus election of the primary

Set Up a Replica Set
Here we will convert a standalone MongoDB instance to a replica set. To convert to a replica set, follow these steps:

 Shut down the already running MongoDB server.
 Start the MongoDB server by specifying the --replSet option. Following is the basic syntax of --replSet −
mongod --port "PORT" --dbpath "YOUR_DB_DATA_PATH" --replSet "REPLICA_SET_INSTANCE_NAME"

Example
mongod --port 27017 --dbpath "D:\set up\mongodb\data" --replSet rs0

 It will start a mongod instance with the replica set name rs0, on port 27017.
 Now start the command prompt and connect to this mongod instance.
 In the Mongo client, issue the command rs.initiate() to initiate a new replica set.
 To check the replica set configuration, issue the command rs.conf(). To check the status of the replica set, issue the command rs.status().