Introduction to Microprocesso programming and interfacing.pptx
MongoDB - Features and Operations
1.
2. What is MongoDB?
“Humongous” DB
Lots of similarities with SQL RDBMs, but with more flexibility
No joins because data is denormalized & pre-joined
Open source
3. RDBMS MongoDB
Database ➜ Database
Table / View ➜ Collection
Row / Tuple ➜ Document
Index ➜ Index
Join ➜ Embedded Document
PK: Any
Attribute(s)
➜ PK: _id Field
Foreign Key ➜ Reference
4. Storage model
Emphasize data locality:
Each document stored as one unit, even with embedding
Each collection stored as one unit.
Uses BSON, a binary-encoded variant of JSON key/value pairs
5. JSON
“JavaScript Object Notation”
Easy for humans to write/read, easy for computers to parse/generate
Objects can be nested
Built on
name/value pairs
Ordered list of values
http://json.org/
6. BSON
• “Binary JSON”
• Binary-encoded serialization of JSON-like docs
• Also allows “referencing”
• Embedded structure reduces need for joins
• Goals
– Lightweight
– Traversable
– Efficient (decoding and encoding)
http://bsonspec.org/
8. The _id Field
• By default, each document contains an _id field. This field has a
number of special characteristics:
– Value serves as primary key for collection.
– Value is unique, immutable, and may be any non-array type.
– Default data type is ObjectId, which is “small, likely unique, fast to
generate, and ordered.”
– Sorting on an ObjectId value is roughly equivalent to sorting on creation
time.
9. Collections of documents
{
_id: “S1234”,
first_name: “John”,
last_name: “Smith”,
major: “ECON”
}
{
_id: “S5678”,
first_name: “Mary”,
last_name: “Jones”,
major: “COMP”,
minor: “STAT”
}
Collection = table
Document = record
Every document
has unique _id,
even if not user-
defined.
Documents need
not have the same
fields.
11. Create Operations
inserting details of a single student in the form of document in the student
collection using db.collection.insertOne() method
12. Create Operations
inserting details of the multiple
students in the form of
documents in the student
collection
using db.collection.insertMany() metho
d.
13. Relating data via embedding
{
_id: “S1234”,
first_name: “John”,
last_name: “Smith”,
major: “ECON”,
address: {
street: “1514 Nowhere St.”,
city: “Houston”,
state: “TX”
}
}
Data all stored
together for locality.
Denormalized.
Possible
redundancy.
15. Relating data via references
/////author
{
_id: 34237
first_name: “John”,
last_name: “Smith”
}
/////books
{
title: “A book about books”,
author: 34237
}
{
title: “A little bookish book”,
author: 34237
}
{
title: “A new book”,
author: 34237
}
_id field = primary key
Reference = foreign key
16. Relating data via references
{
first_name: “John”,
last_name: “Smith”,
books: [103, 279, 541]
}
{
_id: 103,
title: “A book about books”
}
{
_id: 279,
title: “A little bookish book”
}
{
_id: 541,
title: “A new book”
}
_id field = primary key
Reference = foreign key
17. Embedding & relating replaces joins
No need for joins, because data is already explicitly linked.
This gives up SQL’s flexibility of joining on conditions other than PK=FK.
Focuses on the common case.
18. Read Operation - Filtering
db.students.find({major: “COMP”})
db.students.find({major: “COMP”, minor: “STAT”})
db.students.find({$or: [{major: “COMP”}, {major: “ELEC”}]})
db.students.find({major: {$in: [“COMP”, “ELEC”]}})
db.students.find({gpa: {$gt: 3.8}})
db.students.find({address.street: {$exists: true}})
Collection
name
Syntax structured as
key/value pairs with
some special $keywords.
Embedded
field name
26. Map-Reduce
Has two phases:
A map stage that processes each document and emits one or more objects for
each input document
A reduce phase that combines the output of the map operation.
An optional finalize stage for final modifications to the result
Uses Custom JavaScript functions
Provides greater flexibility but is less efficient and more complex
than the aggregation pipeline
Can have output sets that exceed the 16 megabyte output limitation of
the aggregation pipeline.