Coratti Stefano
coratti.1624508@studenti.uniroma1.it
github.com/CorattiS86
it.linkedin.com/in/Stefano-coratti-83005a85
www.slideshare.net/StefanoCoratti
Project for the course of
“Big Data Management 2016”
DIAG - SAPIENZA
>> Analysis of a Database for IOT
Big Data Management 2016
SAPIENZA - DIAG
Overview
• IOT world
• Database requirements for IOT
• Relational Database
• NOSQL Databases
• Document-Based & MongoDB
• MongoDB - Overview
• MongoDB - Logical Data Model
• MongoDB - Architecture
• MongoDB - Storage Data Structure
• MongoDB - Query Language
• MongoDB - Other Features
• MongoDB - In action
01
IOT world
• Sensor data is only useful if you can do something with it.
• A world where all your physical assets and devices are connected to each other and share information,
making life easier and more convenient.
• Areas of application today:
Financial Services
Remotely monitor vehicle performance and driver behavior using telematics sensor data, to tie those metrics to insurance premiums.
Government
Use biometric sensor data from patients to alert doctors early, so that they can prevent medical emergencies.
High Tech
Quantify people’s lifestyles with wearable tech, analyzing diet, sleep, exercise and the rest of their activity.
Retail
Present enticing offers to shoppers, using in-store beacons and purchase history data, as they walk through the store.
02
Database requirements for IOT
• Scalability
Continuous machine-scale ingestion, indexing and storage. Even a modest data source may generate millions of complex
records per second on a continuous basis.
• Operational “real-time” queries and analytics
Extracting value from IOT data is all about minimizing the latency from data ingestion to online queries
and actionable analytics.
• Spatio-temporal data
Many real-world data objects have attributes related to both space and time.
IOT data is all about spatio-temporal relationships and join operations. It requires at least a true time-series database for
very simple uses and a true spatial database for the more general case.
• Schema flexibility
An IOT database must be as flexible as the application requires, because the schema changes over time.
• High variety of data
Data in IOT, as in Big Data in general, may be structured or unstructured, dense or sparse, connected or disconnected.
03
Why Relational DBs fail in IOT ?
• Scalability
RDBMSs are not designed to scale out, so reaching scalability is a huge challenge.
They are designed to run on a single server in order to maintain the integrity of the table mappings and avoid
the problems of distributed computing.
• Operational “real-time” queries and analytics
Query execution time increases with the size of the tables. Operations like JOIN are compute- and memory-intensive, and
their cost rises steeply as queries grow.
• Spatio-temporal data
Managing spatio-temporal data in an RDBMS is complex and inefficient, because in general it does not efficiently support
objects that are multi-dimensional in nature.
• Schema flexibility
Relational DBs have a STATIC schema: it must be defined at the beginning and respected afterwards in order to meet the ACID properties.
• High variety of data
In a relational DB data must be structured, and not every kind of data can be handled efficiently this way.
04
NOSQL Databases
Graph DBs
• Databases that use graph structures for semantic queries,
with nodes, edges and properties to represent and store data.
• A specific query language, designed for graph DBs, allows
traversing the graph.
• Relationships allow data in the store to be linked together directly,
and in many cases retrieved with one operation.
Key–Value DBs
• A data storage paradigm designed for storing, retrieving and managing associative arrays.
• No pre-defined structure for the data: the store is a dictionary that contains a collection of objects,
or records, which in turn have many different fields, each containing data.
• With little or no need to maintain indexes, they are designed to be horizontally scalable.
• They support only simple queries very efficiently, and are particularly suited to problems where the data
are not highly related.
• You have to write a lot more application code to reassemble collections of key-value pairs into objects.
05
NOSQL Databases
Data Warehouse
• Data warehouses are large, so they can accommodate a lot of sensor data,
and scalable architecture solutions exist.
• “Near real-time” solution: the bottleneck is the ETL process,
which makes it difficult to support updates in “real time”.
• Like a relational database, a data warehouse also requires maintaining a schema.
• Its nature is to be multi-dimensional and to integrate heterogeneous information.
Column family DBs
• They consist of rows and columns, with data stored in key-value pairs,
where the key is mapped to a value that is a set of columns.
• Unlike relational DBs, where each record is stored as a row,
here data is organized by column. Storing data in this fashion allows queries
that aggregate over very large sets of data to run very efficiently.
• Data can easily be partitioned across separate database servers (sharding).
06
NOSQL Databases
Document-based DBs
• All information for a given object is stored in a single instance in the database,
and every stored object can differ from every other (fields, format, etc.),
while remaining well structured through an encoding such as XML or JSON.
• SCHEMALESS: documents are not required to adhere to a standard schema. The model is flexible and
adapts to the application, since each application uses different objects.
• The database offers an API or query language that allows the user to retrieve documents based
on their content, for example all documents with a certain field.
• They drop ACID properties to allow high scalability.
Most popular Document-based DBs
07
NOSQL DBs Challenge
• Maturity
RDBMS systems have been around for a long time; they are stable and richly functional.
In comparison, most NOSQL alternatives are in pre-production versions with many key features yet to be implemented.
• Support
Enterprises want reassurance that if a key system fails, they will be able to get timely and competent support.
RDBMS vendors provide high levels of support. In contrast, most NOSQL DBs are open source, and the companies behind them are
small start-ups without the support resources or credibility of an Oracle, Microsoft or IBM.
• Analytics and business intelligence
NOSQL databases are oriented towards the demands of Web 2.0 applications. However, data in an application has value
to the business that goes beyond the Insert - Read - Update - Delete cycle of a typical Web application.
Businesses mine information in corporate databases to improve their efficiency and competitiveness,
and business intelligence is a key IT issue for all medium to large companies.
• Expertise
There are millions of developers familiar with RDBMS, so it is easier to find an RDBMS expert than a NOSQL one.
This situation will resolve itself naturally over time.
• Administration
The design goal for NOSQL may be to provide a zero-admin solution, but the current reality falls well short of that goal.
NOSQL today requires a lot of skill to install and a lot of effort to maintain.
08
MongoDB Overview
IOT is HARD, MongoDB Makes it EASY
• Each new generation of “thing” comes with
new sensors. New sensors create new data and
new functionality requirements.
A database must succeed in the hard task of
incorporating new data and iterating on the data model.
• The document model makes it possible to store and process
data of any structure: events, text, binary data,
geospatial coordinates and anything else.
The structure of a document’s schema can change as rapidly
as the data generated by IOT.
• Scaling problem. Billions of sensors generate huge volumes
of data. That’s a lot more than a single server
can handle.
• Scale Big. MongoDB is built to scale out on commodity
hardware, in your own data center or in the cloud, serving
a lot of users and sensor data without extra software.
• The need to analyze rapidly changing,
multi-structured data in real time.
There is no luxury of lengthy
ETL processes to cleanse data for downstream
reporting.
• Signal vs Noise. MongoDB can analyze data of any structure.
It can do so directly within the database, giving results in
real time, and without expensive data warehouse loads.
09
MongoDB Logical Data Model
Data as Documents
• A document is a set of key-value pairs.
• Documents that share a similar structure are typically organized into a collection.
RDBMS comparison
• Table:
A collection can be thought of
as analogous to a table.
• Rows:
Documents are similar to rows.
• Columns:
Fields are similar to columns.
• Fields can vary from document to document; there is no need to declare
the structure of documents to the system.
• If a new field needs to be added to a document, it can be created without
affecting any other document and without updating a central system catalog.
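As a small illustration of fields varying from document to document, the sketch below models a collection as a plain JavaScript array. The sensor names and fields are hypothetical, and no MongoDB server is involved.

```javascript
// Hypothetical readings: documents in one collection need not share fields.
const readings = [
  { sensor: "thermo-1", temperature: 21.5 },
  { sensor: "gps-7", latitude: 41.9, longitude: 12.5, speed: 38 },
];

// A newer sensor adds a field; no other document and no catalog is touched.
readings.push({ sensor: "thermo-2", temperature: 19.0, humidity: 0.41 });

// Collect the set of field names actually in use across the collection.
const fieldsInUse = new Set(readings.flatMap((doc) => Object.keys(doc)));
console.log([...fieldsInUse].sort().join(", "));
```

The set of fields is discovered from the data itself, which is the opposite of a relational table, where the columns are declared first.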
10
MongoDB Logical Data Model
Data as Documents
• MongoDB stores data in documents in a binary representation called BSON (Binary JSON).
Unlike JSON, BSON encodes the data type of every field.
Field types shown in the example document: string, number, date-time, Object_ID, struct.
Object_ID
• 4-byte value representing the seconds
since the Unix epoch
• 3-byte machine identifier
• 2-byte process id
• 3-byte counter,
starting with a random value
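The 12-byte Object_ID layout can be made concrete by splitting an id into its four fields. This is a minimal sketch that follows the layout given on the slide; `decodeObjectId` is a hypothetical helper, not a MongoDB API. Newer MongoDB releases replace the machine identifier and process id with a single 5-byte random value.

```javascript
// Split a 12-byte ObjectId (24 hex characters) into the fields from the slide.
// decodeObjectId is an illustrative helper, not part of any MongoDB driver.
function decodeObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("an ObjectId is 12 bytes, i.e. 24 hex characters");
  }
  const seconds = parseInt(hex.slice(0, 8), 16); // bytes 0-3: seconds since epoch
  return {
    seconds,
    createdAt: new Date(seconds * 1000),
    machineId: hex.slice(8, 14),                  // bytes 4-6
    processId: parseInt(hex.slice(14, 18), 16),   // bytes 7-8
    counter: parseInt(hex.slice(18, 24), 16),     // bytes 9-11
  };
}

const parts = decodeObjectId("507f1f77bcf86cd799439011");
console.log(parts.createdAt.toISOString(), parts.machineId, parts.counter);
```

Because the first four bytes are a timestamp, ids generated later sort after ids generated earlier, at one-second granularity.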
11
MongoDB Logical Data Model
Data as Documents
• MongoDB documents tend to have all data for a given record in a single document,
whereas in a relational database information for a given record is usually spread across many tables.
EXAMPLE
Suppose we design a database for a blog/website with the following requirements:
• Every post has a unique title, description and url
• Every post can have one or more tags
• Every post has the name of its publisher and a total number of likes
• Every post has comments by users, each with the user’s name, message, date-time and likes
• Each post can have zero or more comments
An RDBMS requires a minimum of 3 tables VS a single MongoDB document
• A single document dramatically reduces the need to JOIN separate tables
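One possible shape for such a post is sketched below as a plain JavaScript object. All field names and values are hypothetical, chosen only to cover the requirements listed above.

```javascript
// Hypothetical blog post: tags and comments are embedded in the same document,
// where a relational design would use separate tables for them.
const post = {
  title: "MongoDB Overview",
  description: "MongoDB is a document database",
  url: "http://example.com/mongodb-overview",
  publisher: "publisher name",
  likes: 100,
  tags: ["mongodb", "database", "NoSQL"],
  comments: [
    {
      user: "user1",
      message: "My first comment",
      dateCreated: new Date("2016-05-01T10:00:00Z"),
      likes: 0,
    },
  ],
};

// Everything needed to render the post comes from one document, with no JOIN.
const summary = `${post.title}: ${post.tags.length} tags, ${post.comments.length} comment(s)`;
console.log(summary);
```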
12
MongoDB Storage Data Structure
• Every MongoDB instance consists of a namespace file, journal files and data files.
Data files
• Data files store BSON documents, indexes and generated metadata in structures called extents.
Extents
• Extents are containers within data files used to store documents and indexes.
• Data and indexes are each contained in their own sets of extents.
• No extent will ever contain content for more than one collection.
• Data and indexes are never contained within the same extent.
• Data and indexes for a collection will usually span multiple extents.
• When a new extent is needed, MongoDB attempts to use available
space within the current data files; if none can be found, a new data file is created.
13
MongoDB Storage metrics
dataSize
• The dataSize metric is the sum of the sizes (in bytes) of all documents and padding stored in the database.
• dataSize decreases when documents are deleted, but not when documents shrink,
because the space used by the original document has already been allocated and cannot be used by other documents.
• If a document is updated with more data, dataSize remains the same as long as the
new document fits within its originally padded, pre-allocated space.
record = document + padding
In MongoDB every document is stored
in a record, which contains the document
itself and extra space called padding.
padding
A document is allocated with additional
space; this is a strategy to make
the update process more efficient.
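The relation record = document + padding, and its effect on dataSize, can be sketched with a toy model. The padding factor of 1.5 and the helper names are assumptions made for illustration; the real amount of padding is decided by the storage engine.

```javascript
// Hypothetical padding factor; the real value is chosen by the storage engine.
const PADDING_FACTOR = 1.5;

// A record reserves more bytes than the document currently needs.
function allocateRecord(documentBytes) {
  return { documentBytes, recordBytes: Math.ceil(documentBytes * PADDING_FACTOR) };
}

// Returns true when the grown document still fits in its record (in-place update).
function updateInPlace(record, newDocumentBytes) {
  if (newDocumentBytes > record.recordBytes) return false; // would have to be moved
  record.documentBytes = newDocumentBytes;
  return true;
}

const records = [allocateRecord(100), allocateRecord(200)];
// dataSize counts documents plus padding, i.e. whole records.
const dataSize = () => records.reduce((sum, r) => sum + r.recordBytes, 0);

const before = dataSize();
const fits = updateInPlace(records[0], 140); // grows inside its padding
const after = dataSize();
console.log(before, fits, after);
```

In this model a document that grows within its padding leaves dataSize unchanged, while one that outgrows its record would need a new, larger record.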
14
MongoDB Storage metrics
storageSize
• The storageSize metric is the sum of the sizes (in bytes) of all data extents in the database.
• storageSize is larger than dataSize because it includes yet-unused space (in data extents)
and space vacated by deleted or moved documents within extents.
• storageSize does not decrease as you remove or shrink documents.
15
MongoDB Storage metrics
fileSize
• The fileSize metric is the sum of the sizes (in bytes) of all data extents, index extents
and yet-unused space (in data files) in the database.
• This metric represents the storage footprint of the database on disk.
• It does not decrease when collections, documents or indexes are removed;
it decreases when a database is deleted.
16
MongoDB Distribution
• MongoDB provides horizontal scale-out for databases on low-cost, commodity hardware
using a technique called “Sharding”.
SHARDING technique
• Sharding distributes data across multiple physical partitions called “shards”.
Data are automatically balanced across multiple servers as data grows,
addressing the hardware limitations of a single server.
• Transparent to applications: whether there is one shard or one hundred,
the application code for querying MongoDB is the same.
• MongoDB can optionally provide redundancy,
with multiple copies of the data on different database servers:
Recovery from hardware failure
and service interruptions.
Protection from
the loss of a single server.
No need for a separate backup,
one of the copies can act as a backup.
17
MongoDB Distribution
• A MongoDB sharded cluster consists of the following components:
shard: each shard contains a subset of the sharded data,
and each shard can be deployed as a replica set.
mongos: the mongos acts as a query router, providing
an interface between client applications and
the sharded cluster.
config servers: they store metadata and
configuration settings for the cluster.
The query router uses this metadata to target
operations to specific shards.
• Shard Keys: to distribute the documents in a collection, MongoDB partitions the collection
using the shard key, which consists of an immutable field or fields that exist in
every document in the target collection.
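How a query router might use the cluster metadata to pick a shard can be sketched as a lookup over key ranges. This is a simplified, range-based model written for illustration: the `chunks` table, the shard names and `targetShard` are all hypothetical and are not MongoDB internals.

```javascript
// Hypothetical metadata, as a config server might hold it: each chunk covers
// shard-key values in [min, max) and lives on exactly one shard.
const chunks = [
  { min: -Infinity, max: 1000, shard: "shardA" },
  { min: 1000, max: 5000, shard: "shardB" },
  { min: 5000, max: Infinity, shard: "shardC" },
];

// Conceptually what the query router does: find the shard owning a key value.
function targetShard(shardKeyValue) {
  const chunk = chunks.find((c) => shardKeyValue >= c.min && shardKeyValue < c.max);
  return chunk.shard;
}

// A document is routed by the value of its shard key field (here: sensorId).
const reading = { sensorId: 4200, temperature: 21.5 };
console.log(targetShard(reading.sensorId));
```

The application only supplies the document or the query; which server holds the data is resolved from the metadata, which is why the client code stays the same as shards are added.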
18
MongoDB Query Language
• To query data from a collection, MongoDB provides the method find()
>> db.collection_name.find({ key : value })
• The conditions for the query must be specified as arguments to the find method.
• The conditions are expressed as key : value pairs.
19
MongoDB Query Language
• Depending on the condition, it is possible to obtain the equivalent of an SQL query.

OPERATION: Selection from one collection
MONGODB SYNTAX: db.collection.find()
SQL EQUIVALENT: SELECT * FROM collection

OPERATION: Selection for equality
MONGODB SYNTAX: db.collection.find( { name : "John" } )
SQL EQUIVALENT: SELECT * FROM collection WHERE name = 'John'

OPERATION: Selection for a value less than
MONGODB SYNTAX: db.collection.find( { num : {$lt : 50} } )
SQL EQUIVALENT: SELECT * FROM collection WHERE num < 50

OPERATION: Selection for a value not equal to
MONGODB SYNTAX: db.collection.find( { val : {$ne: 0} } )
SQL EQUIVALENT: SELECT * FROM collection WHERE val != 0

OPERATION: Projection
MONGODB SYNTAX: db.collection.find( {} , { name: 1, job: 1} )
SQL EQUIVALENT: SELECT name, job FROM collection

OPERATION: Projection and Selection
MONGODB SYNTAX: db.collection.find( { age : {$gte : 18} } , { name: 1} )
SQL EQUIVALENT: SELECT name FROM collection WHERE age >= 18
19
MongoDB Query Language
• Depending on the condition, it is possible to obtain the equivalent of an SQL query.

OPERATION: AND
MONGODB SYNTAX:
db.collection.find(
  {
    $and: [
      { job: "employee" } , { age: {$gte : 65} }
    ]
  }
)
SQL EQUIVALENT: SELECT * FROM collection WHERE job = 'employee' AND age >= 65

OPERATION: OR
MONGODB SYNTAX:
db.collection.find(
  {
    $or: [
      { job: "employee" } , { job: "freelancer" }
    ]
  }
)
SQL EQUIVALENT: SELECT * FROM collection WHERE job = 'employee' OR job = 'freelancer'
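To show how such conditions are evaluated against a document, here is a toy in-memory matcher for the operators used on these slides ($lt, $gte, $ne, $and, $or and plain equality). It is an illustrative sketch only, not MongoDB's implementation, and the sample people are hypothetical.

```javascript
// Toy matcher for a few query operators; NOT MongoDB's implementation.
const comparators = {
  $lt: (a, b) => a < b,
  $gte: (a, b) => a >= b,
  $ne: (a, b) => a !== b,
};

function matches(doc, query) {
  return Object.entries(query).every(([key, cond]) => {
    if (key === "$and") return cond.every((q) => matches(doc, q));
    if (key === "$or") return cond.some((q) => matches(doc, q));
    if (cond !== null && typeof cond === "object") {
      // Operator form, e.g. { num: { $lt: 50 } }
      return Object.entries(cond).every(([op, arg]) => {
        const compare = comparators[op];
        if (!compare) throw new Error(`unsupported operator ${op}`);
        return compare(doc[key], arg);
      });
    }
    return doc[key] === cond; // plain equality, e.g. { name: "John" }
  });
}

const people = [
  { name: "John", job: "employee", age: 67 },
  { name: "Mary", job: "freelancer", age: 30 },
  { name: "Luke", job: "student", age: 17 },
];

// Same conditions as the AND and OR queries on the slide.
const retiring = people.filter((p) =>
  matches(p, { $and: [{ job: "employee" }, { age: { $gte: 65 } }] })
);
const workers = people.filter((p) =>
  matches(p, { $or: [{ job: "employee" }, { job: "freelancer" }] })
);
console.log(retiring.map((p) => p.name), workers.map((p) => p.name));
```

An empty query object matches every document, which mirrors find() with no conditions.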
20
MongoDB CRUD Concepts
• C
Create or insert operation:
it adds new documents to a collection.
It can be seen as the equivalent of INSERT in SQL.
• R
Read operation: a way to query a collection for documents.
As seen before, there are ways to do a selection, a projection, etc.
• U
Update operation: it modifies existing documents in a collection.
A filter can be set to identify the documents to update.
• D
Delete operation: it removes documents from a collection.
A filter can be set to delete only the specified documents;
otherwise all of them will be deleted.
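The four operations can be mimicked on a plain array to make their behaviour concrete. `ToyCollection` is a hypothetical stand-in that supports equality filters only and updates every matching document; it is not a MongoDB driver class.

```javascript
// Minimal in-memory stand-in for a collection (hypothetical, equality filters only).
class ToyCollection {
  constructor() {
    this.docs = [];
  }
  matches(doc, filter) {
    return Object.keys(filter).every((k) => doc[k] === filter[k]);
  }
  insert(doc) {                 // C: add a new document
    this.docs.push({ ...doc });
  }
  find(filter = {}) {           // R: query by filter
    return this.docs.filter((d) => this.matches(d, filter));
  }
  update(filter, fields) {      // U: modify every document matching the filter
    this.find(filter).forEach((d) => Object.assign(d, fields));
  }
  remove(filter = {}) {         // D: an empty filter deletes everything
    this.docs = this.docs.filter((d) => !this.matches(d, filter));
  }
}

const sensors = new ToyCollection();
sensors.insert({ id: 1, type: "temp", value: 20 });
sensors.insert({ id: 2, type: "humidity", value: 55 });
sensors.update({ id: 1 }, { value: 22 });
sensors.remove({ type: "humidity" });
console.log(sensors.find());
```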
21
How ACID is MongoDB ?
Atomicity & Isolation
• In MongoDB a write operation is atomic at the level of a single document. When a single write
operation modifies multiple documents, the modification of each document is atomic.
• The operation as a whole is not atomic, so other operations can interleave.
However, it is possible to isolate a single operation with the $isolated operator.
• The $isolated operator prevents other processes from interleaving once the write operation modifies the first document.
This ensures that no client sees the changes until the write operation completes or errors out.
• However, an isolated write operation does not provide “all-or-nothing” atomicity.
That is, an error during the write operation does not roll back the changes that preceded the error.
• The $isolated operator causes write operations to acquire an exclusive lock on the collection.
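The difference between per-document atomicity and “all-or-nothing” atomicity can be seen in a toy multi-document update. `updateMany` below is a hypothetical helper operating on an array, written only to illustrate the behaviour described above; the account data is invented.

```javascript
// Toy multi-document update: each document changes as a unit, but an error
// part-way through leaves the earlier changes in place (no rollback).
function updateMany(docs, modify) {
  let modified = 0;
  try {
    for (const doc of docs) {
      const updated = modify({ ...doc }); // work on a copy: all-or-nothing per document
      Object.assign(doc, updated);
      modified += 1;
    }
  } catch (err) {
    return { modified, error: err.message };
  }
  return { modified, error: null };
}

const accounts = [
  { id: 1, balance: 10 },
  { id: 2, balance: -5 },
  { id: 3, balance: 7 },
];

const result = updateMany(accounts, (doc) => {
  if (doc.balance < 0) throw new Error("negative balance");
  doc.balance += 100;
  return doc;
});
console.log(result, accounts);
```

The first document keeps its new balance even though the operation failed on the second one, and the third is never reached.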
22
How ACID is MongoDB ?
Consistency
• Even in replica set configurations, all writes are directed to the primary MongoDB server, so
single-server consistency is easy to guarantee.
• The secondary nodes may be out of date with respect to the primary: eventual consistency only
guarantees that, after a long enough period with no writes, they will catch up with the primary.
• By default, however, the secondary servers do not answer reads. Read traffic can be distributed to them at the price
of possible inconsistency; this is a configuration choice.
Durability
• Durability of writes is the biggest issue with MongoDB.
• SQL DBs commit after every write operation. MongoDB does not, by a choice of its developers:
they argue that in many scenarios the data is not actually on disk even after syncing (hardware buffering), and
that the time spent waiting for recovery would impact availability.
• So if the server crashes, writes accepted after the last commit will be lost.
23
MongoDB summing up
Is MongoDB a good choice for IOT ?
• Scalability
Horizontal scaling makes it easy to scale: servers can be added when needed,
and the sharding technique balances data across the servers.
• Operational “real-time” queries and analytics
Relaxing the ACID properties makes operations faster and, because much of the related data
is inside the same document, expensive JOIN operations are not required.
• Spatio-temporal data
MongoDB offers a number of indexes and query mechanisms to handle geospatial information.
Location data are stored as GeoJSON objects.
• Schema flexibility
The schema is free: it can change during write operations, and changes can affect one or more documents.
• High variety of data
This is the strong point for IOT: sensor data can be represented as a field and its value,
so they are stored in their natural form.
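A sensor reading that combines time and location might look like the sketch below. The field names and the sensor are hypothetical; the `location` value follows the GeoJSON Point convention, where coordinates are ordered longitude first, then latitude. `isValidPoint` is an illustrative check that accepts two-dimensional points only.

```javascript
// Hypothetical sensor reading; field names are illustrative.
const reading = {
  sensorId: "thermo-42",
  takenAt: new Date("2016-06-01T12:00:00Z"),
  temperature: 21.5,
  // GeoJSON Point: coordinates are [longitude, latitude]
  location: { type: "Point", coordinates: [12.4964, 41.9028] },
};

// Basic sanity check for a 2-D GeoJSON Point (illustrative helper).
function isValidPoint(geo) {
  if (!geo || geo.type !== "Point" || !Array.isArray(geo.coordinates)) return false;
  const [lon, lat] = geo.coordinates;
  return geo.coordinates.length === 2 && Math.abs(lon) <= 180 && Math.abs(lat) <= 90;
}

console.log(isValidPoint(reading.location));
```

Swapping latitude and longitude is a common mistake with GeoJSON, which is why the check on the coordinate ranges is useful before storing location data.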
24
MongoDB simulation
>> use myDB
>> show dbs
>> show collections
>> db.collection.find()
>> db.collection.insert()
>> db.collection.find().explain("executionStats")
>> db.collection.find( { field: "value" } )
>> db.collection.save()
>> db.collection.update({}, { $set:{} })
>> db.collection.mapReduce(map, reduce, options)
END
Any questions ???
Big Data Management 2016
SAPIENZA - DIAG

More Related Content

What's hot

Big Data vs Data Warehousing
Big Data vs Data WarehousingBig Data vs Data Warehousing
Big Data vs Data WarehousingThomas Kejser
 
Fast Data Strategy Houston Roadshow Presentation
Fast Data Strategy Houston Roadshow PresentationFast Data Strategy Houston Roadshow Presentation
Fast Data Strategy Houston Roadshow PresentationDenodo
 
Cloud Modernization and Data as a Service Option
Cloud Modernization and Data as a Service OptionCloud Modernization and Data as a Service Option
Cloud Modernization and Data as a Service OptionDenodo
 
Unlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data VirtualizationUnlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data VirtualizationDenodo
 
Denodo Data Virtualization - IT Days in Luxembourg with Oktopus
Denodo Data Virtualization - IT Days in Luxembourg with OktopusDenodo Data Virtualization - IT Days in Luxembourg with Oktopus
Denodo Data Virtualization - IT Days in Luxembourg with OktopusDenodo
 
Big Data Analytics for Real Time Systems
Big Data Analytics for Real Time SystemsBig Data Analytics for Real Time Systems
Big Data Analytics for Real Time SystemsKamalika Dutta
 
MongoDB

  • 1. Analysis of a Database for IOT
      Project for the course of “Big Data Management 2016”, DIAG - SAPIENZA
      Coratti Stefano
      coratti.1624508@studenti.uniroma1.it
      github.com/CorattiS86
      it.linkedin.com/in/Stefano-coratti-83005a85
      www.slideshare.net/StefanoCoratti
  • 2. Big Data Management 2016, SAPIENZA - DIAG
      Overview
      - IOT world
      - Databases requirements for IOT
      - Relational Database
      - NOSQL Databases
      - Document-Based & MongoDB
      - MongoDB - Overview
      - MongoDB - Logical Data Model
      - MongoDB - Architecture
      - MongoDB - Storage Data Structure
      - MongoDB - Query Language
      - MongoDB - Other Features
      - MongoDB - In action
  • 3. IOT world
      - Sensor data is only useful if you can do something with it.
      - A world where all your physical assets and devices are connected to each other and share information, making life easier and more convenient.
      - Areas of application today:
          - Financial Services: Remotely monitor vehicle performance and driver behavior, using telematics sensor data to tie those metrics to insurance premiums.
          - Government: Use biometric sensor data from patients to alert doctors early so that they can prevent medical emergencies.
          - High Tech: Quantify people's lifestyles with wearable tech, analyzing diet, sleep, exercise and the rest of their activity.
          - Retail: Present enticing offers to shoppers as they walk through the store, using in-store beacons and purchase history data.
  • 4. Databases requirements for IOT
      - Scalability: Continuous machine-scale ingestion, indexing and storage. Even a modest data source may generate millions of complex records per second on a continuous basis.
      - Operational “real-time” queries and analytics: Extracting value from IOT data is all about minimizing the latency from data ingestion to online queries and actionable analytics.
      - Spatio-temporal data: Many data objects in the real world have attributes related to both space and time. IOT data is all about spatio-temporal relationships and join operations. It requires at least a true time-series database for very simple uses and a true spatial database for the more general case.
      - Schema flexibility: An IOT database must be as flexible as the application requires, because the schema changes over time.
      - High variety of data: Data in IOT, as in Big Data in general, may be structured or unstructured, dense or sparse, connected or disconnected.
  • 5. Why do relational DBs fail in IOT?
      - Scalability: RDBMSs are not designed to scale out, and reaching scalability with them is a huge challenge. They are designed to run on a single server in order to maintain the integrity of the table mappings and avoid the problems of distributed computing.
      - Operational “real-time” queries and analytics: Query execution time increases with the size of the tables. Operations like JOIN are compute- and memory-intensive, and their cost grows steeply as queries involve more tables.
      - Spatio-temporal data: Managing spatio-temporal data in an RDBMS is complex and inefficient, because in general it does not efficiently support objects that are multi-dimensional in nature.
      - Schema flexibility: Relational DBs have a STATIC schema, which must be defined at the beginning and respected from then on in order to meet the ACID properties.
      - High variety of data: In a relational DB data must be structured, and not every kind of data can be represented efficiently in this form.
  • 6. NOSQL Databases
      Graph DBs
      - Databases that use graph structures for semantic queries, with nodes, edges and properties to represent and store data.
      - They offer specific query languages, designed for graph DBs, which allow a graph to be traversed.
      - Relationships allow data in the store to be linked together directly and, in many cases, retrieved with one operation.
      Key-Value DBs
      - A data storage paradigm designed for storing, retrieving and managing associative arrays.
      - There is no predefined structure for the data. The store is a dictionary containing a collection of objects, or records, which in turn have many different fields, each containing data.
      - With little or no need to maintain indexes, they are designed to be horizontally scalable.
      - They support only simple queries very efficiently, so they are particularly suited to problems where the data are not highly related.
      - You have to write a lot more application code to reassemble collections of key-value pairs into objects.
  • 7. NOSQL Databases
      Data Warehouse
      - Data warehouses are large, so they can accommodate a lot of sensor data, and scalable architecture solutions exist.
      - They are a “near real-time” solution: the bottleneck is the ETL process, which makes it difficult to support updates in “real-time”.
      - Like a relational database, a data warehouse also requires a schema to be maintained.
      - Its nature is to be multi-dimensional and to integrate heterogeneous information.
      Column family DBs
      - They consist of rows and columns, with data stored in key-value pairs where the key is mapped to a value that is a set of columns.
      - Unlike relational DBs, where each record is represented as a row, here data is organized by column. Storing data in this fashion allows queries that perform aggregation over very large sets of data to run very efficiently.
      - Data can be easily partitioned across separate database servers (sharding).
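The difference between organizing data by row and by column can be sketched in plain Python. This is only an illustration under stated assumptions: the sensor readings, field names and the `to_columns` helper are invented here, and real column family stores are far more elaborate.

```python
# Hypothetical sensor readings; field names and values are illustrative only.
rows = [
    {"sensor_id": "s1", "ts": 1, "temp": 20.5},
    {"sensor_id": "s2", "ts": 1, "temp": 22.0},
    {"sensor_id": "s1", "ts": 2, "temp": 21.5},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: every record has to be visited to read a single field.
avg_from_rows = sum(r["temp"] for r in rows) / len(rows)

# Column layout: only the "temp" column is scanned.
avg_from_columns = sum(columns["temp"]) / len(columns["temp"])
```

Both averages are the same. The point is that the column layout touches only the values of the field being aggregated, which is where the efficiency on very large data sets comes from.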
  • 8. NOSQL Databases
      Document-based DBs
      - All information for a given object is stored in a single instance in the database, and every stored object can differ from every other (fields, format, etc.). Objects are well structured, using an encoding such as XML or JSON.
      - SCHEMALESS: documents are not required to adhere to a standard schema. The model is flexible and adapts to the applications, each of which uses different objects.
      - The database offers an API or query language that allows the user to retrieve documents based on their content, for example all documents with a certain field.
      - They drop the ACID properties to allow high scalability.
      Most popular document-based DBs (figure: product logos)
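Content-based retrieval over heterogeneous documents can be sketched with the standard library alone. The documents and the `find` and `has_field` helpers below are hypothetical and do not correspond to the API of any particular document database.

```python
import json

# Hypothetical JSON-encoded documents: each object carries its own fields.
raw_documents = [
    '{"_id": 1, "type": "thermometer", "temp": 21.5}',
    '{"_id": 2, "type": "beacon", "store": "Rome-01", "battery": 0.8}',
    '{"_id": 3, "type": "thermometer", "temp": 19.0, "unit": "C"}',
]
documents = [json.loads(raw) for raw in raw_documents]


def find(docs, **criteria):
    """Return the documents whose fields equal every given criterion."""
    return [
        doc for doc in docs
        if all(key in doc and doc[key] == value for key, value in criteria.items())
    ]


def has_field(docs, field):
    """Return the documents that contain the given field, whatever its value."""
    return [doc for doc in docs if field in doc]


thermometers = find(documents, type="thermometer")
with_unit = has_field(documents, "unit")
```

No schema is declared anywhere: a document simply fails to match when it lacks the requested field.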
  • 9. NOSQL DBs Challenges
      - Maturity: RDBMSs have been around for a long time; they are stable and richly functional. In comparison, most NOSQL alternatives are in pre-production versions with many key features yet to be implemented.
      - Support: Enterprises want reassurance that if a key system fails, they will be able to get timely and competent support. RDBMS vendors provide high levels of support. In contrast, most NOSQL DBs are open source and are backed by small start-ups without the support resources or credibility of an Oracle, Microsoft or IBM.
      - Analytics and business intelligence: NOSQL databases are oriented towards the demands of Web 2.0 applications. However, data in an application has value to the business that goes beyond the insert-read-update-delete cycle of a typical Web application. Businesses mine information in corporate databases to improve their efficiency and competitiveness, and business intelligence is a key IT issue for all medium to large companies.
      - Expertise: There are millions of developers familiar with RDBMSs, so it is easier to find an RDBMS expert than a NOSQL expert. This situation will resolve itself naturally over time.
      - Administration: The design goal for NOSQL may be to provide a zero-admin solution, but the current reality falls well short of that goal. NOSQL today requires a lot of skill to install and a lot of effort to maintain.
  • 11. MongoDB Overview
      IOT is HARD, MongoDB makes it EASY
      - Hard: Each new generation of “thing” comes with new sensors. New sensors create new data and new functionality requirements. A database must succeed in the hard task of incorporating new data and iterating on a data model.
        Easy: The document model makes it possible to store and process data of any structure: events, text, binary data, geospatial coordinates and anything else. The structure of a document's schema can change as rapidly as the data generated by IOT.
      - Hard: Scaling problem. Billions of sensors generate huge volumes of data. That is a lot more than a single server can handle.
        Easy: Scale big. MongoDB is built to scale out on commodity hardware, in your own data center or in the cloud, serving a lot of users and sensor data without extra software.
      - Hard: The need to analyze rapidly changing, multi-structured data in real time. There is no luxury of lengthy ETL processes to cleanse data for downstream reporting.
        Easy: Signal vs noise. MongoDB can analyze data of any structure. It can do so directly within the database, giving results in real time and without expensive data warehouse loads.
  • 13. MongoDB Logical Data Model: Data as Documents
      - A document is a set of key-value pairs.
      - Documents that share a similar structure are typically organized as a collection.
      - Fields can vary from document to document, and there is no need to declare the structure of documents to the system.
      - If a new field needs to be added to a document, it can be created without affecting any other document and without updating a central system catalog.
      RDBMS comparison
      - Table: collections can be thought of as analogous to tables.
      - Rows: documents are similar to rows.
      - Columns: fields are similar to columns.
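The idea that a field can be added to one document without touching the others can be sketched as follows. Here a collection is modelled as a plain Python list of dicts, which is an assumption made for illustration and not how MongoDB stores data internally.

```python
# A collection modelled as a plain list of dicts (hypothetical data).
readings = [
    {"_id": 1, "sensor": "s1", "temp": 20.5},
    {"_id": 2, "sensor": "s2", "temp": 22.0},
]

# Add a new field to one document only; no structure is declared anywhere.
readings[1]["humidity"] = 0.43

# The fields actually present, document by document.
fields_per_document = [sorted(doc) for doc in readings]

# The union of all fields: roughly what the "columns" of the collection would be.
all_fields = sorted(set().union(*readings))
```

In a relational table the same change would mean altering the table definition, so that every row gains the new column.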
  • 15. MongoDB Logical Data Model: Data as Documents
      - MongoDB stores data in documents in a binary representation called BSON (Binary JSON), which, unlike JSON, allows data types to be represented.
      - Data types labelled in the example document: string, number, date-time, Object_ID.
      - Object_ID structure:
          - 4-byte value representing the seconds since the Unix epoch
          - 3-byte machine identifier
          - 2-byte process id
          - 3-byte counter, starting with a random value
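The 12-byte Object_ID layout listed above can be sketched with Python's `struct` module. The helper names and sample values are hypothetical, and the big-endian byte order used for every field is an assumption of this sketch. Newer MongoDB releases document a 5-byte random value in place of the machine and process fields, so this follows the layout on the slide and not necessarily the current driver behavior.

```python
import struct
from datetime import datetime, timezone


def build_object_id(timestamp, machine, pid, counter):
    """Pack the four fields from the slide into 12 bytes (big-endian assumed)."""
    if len(machine) != 3:
        raise ValueError("machine identifier must be exactly 3 bytes")
    return (
        struct.pack(">I", timestamp)    # 4 bytes: seconds since the Unix epoch
        + machine                       # 3 bytes: machine identifier
        + struct.pack(">H", pid)        # 2 bytes: process id
        + counter.to_bytes(3, "big")    # 3 bytes: counter
    )


def parse_object_id(raw):
    """Split a 12-byte identifier back into its four fields."""
    if len(raw) != 12:
        raise ValueError("an Object_ID is exactly 12 bytes long")
    timestamp = struct.unpack(">I", raw[0:4])[0]
    return {
        "timestamp": timestamp,
        "created": datetime.fromtimestamp(timestamp, tz=timezone.utc),
        "machine": raw[4:7],
        "pid": struct.unpack(">H", raw[7:9])[0],
        "counter": int.from_bytes(raw[9:12], "big"),
    }


# Illustrative values only.
oid = build_object_id(1_460_000_000, b"\xaa\xbb\xcc", 4242, 0x00ABCD)
parsed = parse_object_id(oid)
hex_form = oid.hex()  # the familiar 24-character hexadecimal form
```

Because the timestamp occupies the leading bytes, identifiers created later compare greater than earlier ones at one-second resolution, so sorting by identifier roughly follows creation order.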
  • 17. MongoDB Logical Data Model: Data as Documents
      • MongoDB documents tend to hold all data for a given record in a single document, whereas in a relational database the information for a given record is usually spread across many tables.
      • Example: design a database for a blog/website with the following requirements:
          • Every post has a unique title, description and URL
          • Every post can have one or more tags
          • Every post has the name of its publisher and the total number of likes
          • Every post has comments by users, each with the user's name, message, date-time and likes
          • Each post can have zero or more comments
      • An RDBMS requires a minimum of 3 tables, versus a single MongoDB document.
      • This dramatically reduces the need to JOIN separate tables.
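One possible shape for the single blog-post document is sketched below. All field names and values are hypothetical; the slides only state the requirements, not the schema.

```javascript
// One document holds the post together with its tags and comments.
// In a relational design these would typically be separate tables
// (for example posts, tags and comments) joined at query time.
const post = {
  title: "Databases for IOT",
  description: "Notes on storing sensor data",
  url: "http://example.com/posts/databases-for-iot", // placeholder URL
  tags: ["iot", "nosql"],              // one or more tags
  publisher: "Some Author",
  likes: 12,
  comments: [                          // zero or more embedded comments
    {
      user: "reader1",
      message: "Nice overview",
      dateTime: new Date("2016-05-01T10:00:00Z"),
      likes: 2,
    },
  ],
};

// Reading the post and all of its comments touches one document only.
const commentAuthors = post.comments.map((c) => c.user);
console.log(post.title, post.tags, commentAuthors);
```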
  • 20. MongoDB Storage Data Structure
      • Every MongoDB instance consists of a namespace file, journal files and data files.
      • Data files store BSON documents, indexes and generated metadata in structures called extents.
      • Extents are containers within data files used to store documents and indexes:
          • data and indexes are each contained in their own sets of extents
          • no extent ever contains content for more than one collection
          • data and indexes are never contained within the same extent
          • data and indexes for a collection usually span multiple extents
          • when a new extent is needed, MongoDB attempts to use available space within the current data file; if none can be found, a new data file is created
  • 21. MongoDB Storage metrics: dataSize
      • The dataSize metric is the sum of the sizes (in bytes) of all documents and padding stored in the database.
      • record = document + padding: every document is stored in a record, which contains the document itself plus extra space called padding. Allocating this additional space is a strategy to make updates more efficient.
      • dataSize decreases when documents are deleted, but not when documents shrink, because the space used by the original document has already been allocated and cannot be used by other documents.
      • If a document is updated with more data, dataSize remains the same as long as the new document fits within its originally allocated, padded space.
      • [Figure: dataSize shown inside the data files My-db.1 and My-db.2]
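The behaviour of dataSize under updates can be illustrated with a toy model. This is a sketch under stated assumptions: the padding factor of 1.5 and the function names are invented for illustration and do not reflect MongoDB's actual allocation policy.

```javascript
// Toy model of "record = document + padding". Sizes are in bytes.
// The padding factor is a made-up number chosen for the example.
function allocateRecord(docSize, paddingFactor = 1.5) {
  return { docSize, recordSize: Math.ceil(docSize * paddingFactor) };
}

// An update stays in place when the new document fits in the record;
// otherwise a new, larger record is allocated and the document moves.
function updateRecord(record, newDocSize) {
  if (newDocSize <= record.recordSize) {
    return { record: { ...record, docSize: newDocSize }, moved: false };
  }
  return { record: allocateRecord(newDocSize), moved: true };
}

// dataSize counts documents plus padding, i.e. whole records.
function dataSize(records) {
  return records.reduce((sum, r) => sum + r.recordSize, 0);
}

const original = allocateRecord(100);
const grown = updateRecord(original, 140);  // fits in the padding
const shrunk = updateRecord(original, 50);  // smaller document, same record
const tooBig = updateRecord(original, 200); // does not fit, record moves

console.log(dataSize([original]), dataSize([grown.record]), dataSize([shrunk.record]), tooBig.moved);
```

In this model both the in-place growth and the shrink leave dataSize unchanged, which mirrors the two bullet points above.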
  • 22. MongoDB Storage metrics: storageSize
      • The storageSize metric is the sum of the sizes (in bytes) of all data extents in the database.
      • storageSize is larger than dataSize because it also includes yet-unused space in data extents and space vacated by deleted or moved documents within extents.
      • storageSize does not decrease as you remove or shrink documents.
      • [Figure: storageSize enclosing dataSize inside the data files My-db.1 and My-db.2]
  • 23. MongoDB Storage metrics: fileSize
      • The fileSize metric is the sum of the sizes (in bytes) of all data extents, index extents and yet-unused space in the data files of the database.
      • It represents the storage footprint of the database on disk.
      • It does not decrease when collections, documents or indexes are removed; it decreases when a database is deleted.
      • [Figure: fileSize enclosing storageSize and dataSize inside the data files My-db.1 and My-db.2]
  • 25. MongoDB Distribution
      • MongoDB provides horizontal scale-out for databases on low-cost commodity hardware using a technique called "sharding".
      • Sharding distributes data across multiple physical partitions called "shards". Data is automatically balanced across multiple servers as it grows, addressing the hardware limitations of a single server.
      • Sharding is transparent to applications: whether there is one shard or one hundred, the application code for querying MongoDB is the same.
      • MongoDB can also provide redundancy by keeping multiple copies of the data on different database servers. This gives:
          • recovery from hardware failure and service interruptions
          • protection from the loss of a single server
          • no need for a separate backup, since one copy can act as a backup
  • 27. MongoDB Distribution
      • A MongoDB sharded cluster consists of the following components:
          • shard: each shard contains a subset of the sharded data and can be deployed as a replica set.
          • mongos: acts as a query router, providing an interface between client applications and the sharded cluster.
          • config servers: store metadata and configuration settings for the cluster. The query router uses this metadata to target operations to specific shards.
      • Shard keys: to distribute the documents in a collection, MongoDB partitions the collection using the shard key, which consists of an immutable field or fields that exist in every document in the target collection.
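The routing role of the shard key can be sketched with a small range-based lookup. This is a simplified illustration, not MongoDB's implementation: the chunk table, the shard names and the `sensorId` key are hypothetical, and hashed sharding is not covered.

```javascript
// Hypothetical metadata of the kind a config server might hold: each chunk
// covers shard-key values in the range [min, max) and lives on one shard.
const chunks = [
  { min: -Infinity, max: 1000, shard: "shardA" },
  { min: 1000, max: 5000, shard: "shardB" },
  { min: 5000, max: Infinity, shard: "shardC" },
];

// Toy query router: pick the shard whose chunk range contains the
// document's shard-key value. The key must exist in every document.
function routeByShardKey(doc, keyField = "sensorId") {
  const key = doc[keyField];
  if (key === undefined) {
    throw new Error(`document has no shard key field "${keyField}"`);
  }
  const chunk = chunks.find((c) => key >= c.min && key < c.max);
  return chunk.shard;
}

console.log(routeByShardKey({ sensorId: 42, temperature: 21.5 }));
console.log(routeByShardKey({ sensorId: 1000, temperature: 19.0 }));
console.log(routeByShardKey({ sensorId: 9999, temperature: 25.3 }));
```

The application only supplies the document or query; which shard ends up serving it is decided from the metadata, which is why the application code stays the same regardless of the number of shards.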
  • 30. MongoDB Query Language
      • To query data from a collection, MongoDB provides the find() method:
          >> db.collection_name.find()
      • The conditions for the query are passed as arguments to find():
          >> db.collection_name.find( conditions )
      • Conditions are expressed as key : value pairs:
          >> db.collection_name.find({ key : value })
  • 34. MongoDB Query Language
      • Depending on the condition, it is possible to obtain the equivalent of an SQL query.
      • Selection from one collection
          MongoDB: db.collection.find()
          SQL:     SELECT * FROM collection
      • Selection for equality
          MongoDB: db.collection.find( { name : "John" } )
          SQL:     SELECT * FROM collection WHERE name = 'John'
      • Selection for a value less than
          MongoDB: db.collection.find( { num : { $lt : 50 } } )
          SQL:     SELECT * FROM collection WHERE num < 50
      • Selection for a value not equal to
          MongoDB: db.collection.find( { val : { $ne : 0 } } )
          SQL:     SELECT * FROM collection WHERE val != 0
      • Projection
          MongoDB: db.collection.find( {}, { name : 1, job : 1 } )
          SQL:     SELECT name, job FROM collection
      • Projection and selection
          MongoDB: db.collection.find( { age : { $gte : 18 } }, { name : 1 } )
          SQL:     SELECT name FROM collection WHERE age >= 18
  • 35. MongoDB Query Language
      • Depending on the condition, it is possible to obtain the equivalent of an SQL query.
      • AND
          MongoDB: db.collection.find( { $and : [ { job : "employee" }, { age : { $gte : 65 } } ] } )
          SQL:     SELECT * FROM collection WHERE job = 'employee' AND age >= 65
      • OR
          MongoDB: db.collection.find( { $or : [ { job : "employee" }, { job : "freelancer" } ] } )
          SQL:     SELECT * FROM collection WHERE job = 'employee' OR job = 'freelancer'
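To make the semantics of these query documents concrete, here is a small in-memory imitation of find() in plain JavaScript. It is a teaching sketch, not MongoDB code: it supports only equality, $lt, $gte, $ne, $and and $or, and unlike the real find() it does not add the _id field to projected results. The sample data is hypothetical.

```javascript
// Return true when `doc` satisfies the query document `query`.
function matches(doc, query) {
  return Object.entries(query).every(([key, cond]) => {
    if (key === "$and") return cond.every((q) => matches(doc, q));
    if (key === "$or") return cond.some((q) => matches(doc, q));
    const value = doc[key];
    if (cond !== null && typeof cond === "object") {
      // Operator form, e.g. { $gte: 18 }
      return Object.entries(cond).every(([op, arg]) => {
        switch (op) {
          case "$lt": return value < arg;
          case "$gte": return value >= arg;
          case "$ne": return value !== arg;
          default: throw new Error("unsupported operator: " + op);
        }
      });
    }
    return value === cond; // plain key : value equality
  });
}

// Selection (query) followed by an optional projection ({ field: 1 }).
function find(collection, query = {}, projection = null) {
  const hits = collection.filter((doc) => matches(doc, query));
  if (!projection) return hits;
  const fields = Object.keys(projection).filter((f) => projection[f] === 1);
  return hits.map((doc) =>
    Object.fromEntries(fields.filter((f) => f in doc).map((f) => [f, doc[f]]))
  );
}

const people = [
  { name: "John", job: "employee", age: 67 },
  { name: "Ann", job: "freelancer", age: 30 },
  { name: "Bob", job: "student", age: 17 },
];

console.log(find(people, { age: { $gte: 18 } }, { name: 1 }));
console.log(find(people, { $and: [{ job: "employee" }, { age: { $gte: 65 } }] }));
console.log(find(people, { $or: [{ job: "employee" }, { job: "freelancer" }] }));
```

The first argument plays the role of the SQL WHERE clause and the second the role of the SELECT column list.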
  • 36. MongoDB CRUD Concepts
      • C: Create (insert) operations add new documents to a collection. They can be seen as the equivalent of INSERT in SQL.
      • R: Read operations query a collection for documents. As seen before, there are ways to do selection, projection, etc.
      • U: Update operations modify existing documents in a collection. A filter identifies the documents to update.
      • D: Delete operations remove documents from a collection. A filter restricts deletion to the specified documents; otherwise all documents are deleted.
  • 37. How ACID is MongoDB? Atomicity & Isolation
      • In MongoDB, a write operation is atomic at the level of a single document. When a single write operation modifies multiple documents, the modification of each document is atomic.
      • The operation as a whole is not atomic, so other operations can interleave. However, a single operation can be isolated with the $isolated operator.
      • The $isolated operator prevents other processes from interleaving once the write operation modifies the first document. This ensures that no one sees the changes until the write operation completes or errors out.
      • The $isolated operator causes write operations to acquire an exclusive lock on the collection.
      • However, an isolated write operation does not provide "all-or-nothing" atomicity: an error during the write operation does not roll back the changes that preceded the error.
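The difference between per-document atomicity and "all-or-nothing" atomicity can be shown with a toy multi-document update. This is a plain JavaScript simulation with invented names (`updateMany`, `sensors`, `calibrated`); it does not call MongoDB and only mimics the behaviour described above.

```javascript
// Toy multi-document update. Each document is changed in one step (the
// modification is computed on a copy and then applied), but there is no
// rollback: if a later document fails, earlier changes remain in place.
function updateMany(collection, filter, modify) {
  let modified = 0;
  for (const doc of collection) {
    if (!filter(doc)) continue;
    const updated = modify({ ...doc }); // work on a copy; may throw
    Object.assign(doc, updated);        // apply the change to this document
    modified += 1;
  }
  return modified;
}

const sensors = [
  { id: 1, calibrated: false },
  { id: 2, calibrated: false },
  { id: 3, calibrated: false },
];

let failure = null;
try {
  updateMany(sensors, () => true, (doc) => {
    if (doc.id === 3) throw new Error("simulated failure on document 3");
    doc.calibrated = true;
    return doc;
  });
} catch (err) {
  failure = err;
}

// Documents 1 and 2 keep their new value even though the operation failed.
console.log(failure.message, sensors.map((s) => s.calibrated));
```

An application that needs all three changes to succeed or fail together would have to handle the partial result itself, for example by retrying or compensating.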
  • 38. How ACID is MongoDB? Consistency and Durability
      • Consistency
          • Even in replica set configurations, all writes are directed to the primary server, so single-server consistency is easy to guarantee.
          • Secondary nodes may be out of date with respect to the primary: eventual consistency only guarantees that, after a long enough period with no writes, they will catch up with the primary.
          • By default, secondary servers do not answer reads. Read traffic can be distributed to them at the price of possible inconsistency; this is a configuration choice.
      • Durability
          • Durability of writes is the biggest issue with MongoDB.
          • SQL databases commit after every write operation. MongoDB does not, by a choice of its developers: they argue that in many scenarios the OS does not actually write the file to disk even after syncing (hardware buffering), and that time spent waiting for recovery would impact availability.
          • So if the server crashes, writes accepted after the last commit are lost.
  • 44. MongoDB summing up: is MongoDB a good choice for IOT?
      • Scalability: horizontal scaling makes it easy to scale. Servers can be added when needed, and sharding balances data across them.
      • Operational "real-time" queries and analytics: giving up some ACID properties allows faster operations, and because much of the related data is inside the same document, expensive JOIN operations are not required.
      • Spatio-temporal data: MongoDB offers a number of indexes and query mechanisms to handle geospatial information. Location data is stored as GeoJSON objects.
      • Schema flexibility: the schema is free. It can change during write operations, and changes can affect one or more documents.
      • High variety of data: this is the strong point for IOT. Each sensor reading can be represented as a field with its value, so data is stored in its natural form.
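A spatio-temporal sensor reading stored as a document might look like the sketch below. The field names and values are hypothetical; the only fixed convention used is that a GeoJSON Point lists longitude before latitude. The validity check is a simplified helper written for this example and only accepts two-element positions.

```javascript
// A reading that carries both time (ts) and space (location as GeoJSON).
const reading = {
  sensorId: "gps-12",
  ts: new Date("2016-05-01T10:00:00Z"),
  location: {
    type: "Point",
    coordinates: [12.4964, 41.9028], // [longitude, latitude]
  },
  speedKmh: 42,
};

// Simplified check that a value is a two-dimensional GeoJSON Point
// with coordinates inside the valid longitude and latitude ranges.
function isGeoJsonPoint(value) {
  if (!value || value.type !== "Point" || !Array.isArray(value.coordinates)) {
    return false;
  }
  if (value.coordinates.length !== 2) return false;
  const [lng, lat] = value.coordinates;
  return (
    typeof lng === "number" && typeof lat === "number" &&
    lng >= -180 && lng <= 180 && lat >= -90 && lat <= 90
  );
}

console.log(isGeoJsonPoint(reading.location));
```

In MongoDB, a field shaped like `location` is the kind of value a geospatial index such as `2dsphere` is built on, which is what enables the geospatial queries mentioned above.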
  • 45. MongoDB simulation
      >> use myDB
      >> show dbs
      >> show collections
      >> db.collection.find()
      >> db.collection.insert()
      >> db.collection.find().explain("executionStats")
      >> db.collection.find( { field: "value" } )
      >> db.collection.save()
      >> db.collection.update({}, { $set: {} })
      >> db.collection.mapReduce()
  • 46. END. Any questions?