NOSQL INTRO & MONGODB

REQUISITE SLIDE – WHO AM I?
Brian Enochson
- SW engineer who has worked as a designer / developer on NoSQL (Mongo, Cassandra, Hadoop)
- Consultant – HBO, ACS, CIBER
- Specializes in SW development, architecture and training

Contact Me:
I am available for training, consulting & development.
Brian Enochson
brian.enochson@gmail.com
Twitter @benochso
Google Plus https://plus.google.com/+BrianEnochson

AGENDA
Hour 1
• Installation of required software (a list will be sent out beforehand; make sure everyone in the class has what is needed)
• Introduction to Big Data
• Introduction to NoSQL
• Relational database vs. NoSQL technology: compare & contrast
• NoSQL landscape
• Exercise – install and use required software

AGENDA
Hour 2
• Introduction to MongoDB
• MongoDB components, capabilities and common use cases
• JSON & BSON
• Documents, collections, references and Mongo ID
• Querying
• Other CRUD operations
• Indexes
• Exercise – design and populate MongoDB

AGENDA
Hour 3
• Data Modeling / Schema Design
• Replication & Sharding
• Exercise: application development using MongoDB and Java
• Wrap-up and final Q & A

SOFTWARE
Later we will need:
• MongoDB
  • http://www.mongodb.org/downloads
• Java JDK
  • 1.6
• NetBeans, Eclipse or IntelliJ (with Maven support)
  • or Maven and any editor
• Our project
  • http://bit.ly/IVnTEb (or https://www.dropbox.com/sh/mwu6lltaljqq59z/PMWiw7ZPk3)
• Robomongo or MongoExplorer

BIG DATA
Why are databases like Mongo needed?
• To understand, we need to look at
  • The history of databases
  • How systems were built in the past
• Modern application architectures
  • Web scale
  • Data acquisition
• Other factors like the cost of H/W

HISTORY OF THE DATABASE
• 1960’s – Hierarchical and Network type (IMS and CODASYL)
• 1970’s – Beginnings of the theory behind the relational model. Codd
• 1980’s – Rise of the relational model. SQL. E/R Model (Chen)
• 1990’s – Access/Excel and MySQL. ODMS began to appear
• 2000’s – Two forces: large enterprise and open source. Google and Amazon. CAP Theorem (more on that to come…)
• 2010’s – Emergence of NoSQL as an industry player and viable alternative

WHY WERE ALTERNATIVES NEEDED
• Developers today are faced with Internet scale
  • 100,000’s of users
  • Low cost of storage
  • Increased processing power
  • Ability (and need) to capture millions of events. Caching solves this to an extent but brings other complexities
  • Real-time
  • Need to scale out, not up (add any number of low-cost machines vs. replacing with a more powerful machine)
• Cost
  • Let’s not forget that for enterprise DB’s, Internet scale can become expensive
  • Open source DB’s may solve license cost, but don’t ignore operational costs

A LOT OF DATA
Some facts from http://www.storagenewsletter.com/rubriques/marketreportsresearch/ibm-cmo-study/
Approximately 90 percent of all the real-time information being created today is unstructured data
Every day we create 2.5 quintillion bytes of data (a quintillion is 10 to the 18th, that is 18 zeroes!!)
90 percent of the world's data today has been created in the last two years alone

RELATIONAL VS. NOSQL
• Relational
  • Divide data into tables, relate them via foreign keys, DB constraints, normalized data, the interface is SQL
• NoSQL
  • Store in a schemaless format, redundancy encouraged, application access (your queries) determines the storage format. The interface varies and is optimized for the implementation, no forced DB constraints. The tradeoff is that you often get eventual consistency.

TRADEOFFS?

Luckily, due to the large number of compromises made when
attempting to scale their existing relational databases,
these tradeoffs were not so foreign or
distasteful as they might have been.

Greg Burd - https://www.usenix.org/legacy/publications/login/201110/openpdfs/Burd.pdf

3 V’S – DESCRIBING THE BIG DATA PROBLEM
The driving force in requiring new technology is often referred to as the “3 V Model”.
• High Volume – amount of data
• High Variety – range of data types and sources
• High Velocity – speed of data in and out
OK, maybe 4 V’s
• Veracity – is all the data applicable to the problem being analyzed?

NOSQL IS NOT BIG DATA

NoSQL != Big Data
NoSQL products were created to help solve the big data problem.
Big data is a much larger problem than just storage. Analysis tools like
Hadoop, messaging systems like Kafka, real time processing engines like
Storm and machine learning (Mahout) all help solve the big data problem.

NOSQL TYPES
Document DB
• MongoDB, CouchDB
Wide Column – Column Family
• Cassandra, HBase, Amazon SimpleDB
Key Value
• Riak, Redis, DynamoDB, Voldemort, MemcacheDB
Graph
• Neo4j, OrientDB
Search (also alternatives, normally used with *)
• Lucene, Solr, ElasticSearch
Many, many, many more! (http://nosql-database.org/)

CHOOSING THE RIGHT ONE…
Choosing the right NoSQL type and eventual product depends on…
Type of Data
• One key and a lot of data?
• High volume of data?
• Storing media, blobs?
• Document oriented?
• Tracking relationships?
• Combination?
• Multi-Datacenter
Type of Access
Volumes of Data (there is big data and there is BIG DATA)
Need Support/Services/Training

SOME BASICS
• ACID
• CAP Theorem
• BASE

ACID
YOU PROBABLY ALL HAVE HEARD OF ACID
• Atomicity – all or none
• Consistency – what is written is valid
• Isolation – concurrent operations behave as if run one at a time
• Durability – once committed to the DB, it stays

This is the world we have lived in for a long time…

CAP THEOREM (BREWER’S)
Many may have heard of this one
CAP stands for Consistency, Availability and Partition Tolerance
• Consistency – all nodes see the same data, so every read returns the most recent write (not the same as the C in ACID)
• Availability – the service is available: every request gets a response
• Partition Tolerance – no failure other than complete network failure causes the system not to respond
(Remember the visual guide to selecting a NoSQL database)
So... what does this mean?
** http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf

YOU CAN ONLY HAVE 2 OF THEM

Or, better said in C* (Cassandra) terms, you can have Availability and Partition Tolerance AND Eventual Consistency.
This means that, once updates stop, all accesses will eventually return the last updated value.
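To make "eventually" concrete, here is a minimal in-memory sketch. It is not how Cassandra or MongoDB replicate data; the `Replica` class, the version counter and the `anti_entropy` function are hypothetical names chosen only for illustration, and last-writer-wins is an assumed conflict rule.

```python
class Replica:
    """One node holding a copy of a single value plus the version that wrote it."""

    def __init__(self):
        self.value = None
        self.version = 0

    def apply(self, value, version):
        # Last-writer-wins (assumed rule): accept only updates newer than ours.
        if version > self.version:
            self.value, self.version = value, version


def write(replicas, index, value):
    """Accept a write on a single replica; the others are stale for now."""
    target = replicas[index]
    target.apply(value, target.version + 1)


def anti_entropy(replicas):
    """Copy the newest version to every replica (the 'eventually' part)."""
    newest = max(replicas, key=lambda r: r.version)
    for replica in replicas:
        replica.apply(newest.value, newest.version)


replicas = [Replica(), Replica(), Replica()]
write(replicas, 0, "PG13")
stale_reads = [r.value for r in replicas]      # replicas 1 and 2 lag behind
anti_entropy(replicas)
converged_reads = [r.value for r in replicas]  # all replicas now agree
```

Between the write and the background sync, a read may hit a stale replica; that window is the price paid for staying available during a partition.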

VISUAL GUIDE – USING THE CAP THEOREM
http://blog.nahurst.com/visual-guide-to-nosql-systems

BIG DATA WRAP UP
• So we are talking about large amounts of data
• High velocity of acquisition
• A lot of variety that we need to store. We will worry later about how to handle it (or not)
• Need to scale and not break the bank
• Want the database to support agile, not hinder it

STILL WRAPPING
• Maybe consider going relational for
  • Highly transactional systems (FoundationDB?)
  • Business Intelligence systems (Hadoop may make this not true)
• Don’t be fooled by fear of losing ACID…
  http://highscalability.com/blog/2013/5/1/myth-eric-brewer-on-why-banks-are-base-not-acid-availability.html

And now,
let’s look at MongoDB

MONGO OVERVIEW
A few high-level points
• Document oriented
• Storage format is JSON (actually BSON)
• Replication built in
• Master / slave architecture
• Strong querying support
• Name comes from "humongous"

MEET MONGO
• Open Source
• Schemaless
• Scalable
• Document-Level Atomicity
• Easy Installation
• Relative Ease of Use
• Great (!!!!) Documentation

AND…
• No cross-document transactions
• No joins
• Replication – master / slave
• Sharding

MONGO ADVANTAGE
-

* Credit – Dwight Merriman, Founder and CEO – MongoDB (was 10Ge

DOCUMENT
In its simplest form, Mongo is a document-oriented database
• MongoDB stores all data in documents, which are JSON-style data structures composed of field-and-value pairs.
• MongoDB stores documents on disk in the BSON serialization format. BSON is a binary representation of JSON documents. BSON contains more data types than JSON does.

** For in-depth BSON information, see bsonspec.org.

WHAT DOES A DOCUMENT LOOK LIKE
{
  "_id" : "52a602280f2e642811ce8478",
  "ratingCode" : "PG13",
  "country" : "USA",
  "entityType" : "Rating"
}

MONGO DOCUMENTS

RULES FOR A DOCUMENT
Documents have the following rules:
The maximum BSON document size is 16 megabytes.
The field name _id is reserved for use as a primary key; its value must be
unique in the collection.
The field names cannot start with the $ character.
The field names cannot contain the . character.
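The rules above can be sketched as a small pre-insert check. This is only an illustration in plain Python, not part of any MongoDB driver: `check_document` and `check_unique_ids` are hypothetical helpers, only top-level field names are inspected, and the JSON-encoded length is used as a rough stand-in for the real BSON size (an assumption; the two differ).

```python
import json

MAX_DOCUMENT_BYTES = 16 * 1024 * 1024  # 16 megabytes, per the rules above


def check_document(doc):
    """Return a list of rule violations for a dict-shaped document."""
    problems = []
    for name in doc:
        if name.startswith("$"):
            problems.append(f"field name {name!r} starts with '$'")
        if "." in name:
            problems.append(f"field name {name!r} contains '.'")
    # Assumption: UTF-8 JSON length approximates the BSON size.
    approx_size = len(json.dumps(doc).encode("utf-8"))
    if approx_size > MAX_DOCUMENT_BYTES:
        problems.append(f"about {approx_size} bytes, over the 16 MB limit")
    return problems


def check_unique_ids(docs):
    """Return the _id values that appear more than once in a collection."""
    seen, duplicates = set(), []
    for doc in docs:
        key = doc.get("_id")
        if key in seen:
            duplicates.append(key)
        seen.add(key)
    return duplicates


good = {"_id": "52a602280f2e642811ce8478", "ratingCode": "PG13", "country": "USA"}
bad = {"$price": 10, "address.city": "NYC"}
good_problems = check_document(good)   # no violations expected
bad_problems = check_document(bad)     # one violation per offending field
```

In practice the server enforces these rules itself; a check like this is mainly useful for failing early in application code.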

MONGO INSTALL
Windows
http://docs.mongodb.org/manual/tutorial/install-mongodb-on-windows/
Mac
http://docs.mongodb.org/manual/tutorial/install-mongodb-on-os-x/
Create the data directory. Defaults:
• C:\data\db
• /data/db/ (make sure you have permissions)
Or set it using --dbpath:
C:\mongodb\bin\mongod.exe --dbpath d:\test\mongodb\data

START IT!
Database
mongod

Shell
mongo
show dbs

show collections
db.stats()

BASIC OPERATIONS
1_simpleinsert.txt
• Insert
• Find
  • Find all
  • Find one
  • Find with criteria
• Indexes
• explain()
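For readers without a running mongod, the insert and find operations listed above can be mimicked with a toy in-memory collection. This is a sketch in plain Python, not the mongo shell or a driver API: the `Collection` class and its method names are hypothetical, criteria are matched by simple equality only, and indexes and explain() are not modeled.

```python
import itertools


class Collection:
    """Toy in-memory collection; documents are plain dicts."""

    _ids = itertools.count(1)  # stand-in for generated ids, not real ObjectIds

    def __init__(self):
        self.docs = []

    def insert(self, doc):
        # Copy the document and assign an _id when the caller gave none.
        doc = dict(doc)
        doc.setdefault("_id", next(self._ids))
        self.docs.append(doc)
        return doc["_id"]

    def find(self, criteria=None):
        """Return every document whose fields equal all criteria values."""
        criteria = criteria or {}
        return [d for d in self.docs
                if all(d.get(k) == v for k, v in criteria.items())]

    def find_one(self, criteria=None):
        """Return the first match, or None when nothing matches."""
        matches = self.find(criteria)
        return matches[0] if matches else None


ratings = Collection()
ratings.insert({"ratingCode": "PG13", "country": "USA"})
ratings.insert({"ratingCode": "R", "country": "USA"})
ratings.insert({"ratingCode": "15", "country": "UK"})

everything = ratings.find()                     # find all
usa_only = ratings.find({"country": "USA"})     # find with criteria
first_uk = ratings.find_one({"country": "UK"})  # find one
```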

MORE MONGO SHELL
2_arrays_sort.txt
• Embedded documents

• Limit, Sort
• Using regex in query
• Removing documents
• Drop collection
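The ideas behind these shell features (reaching into embedded documents, sort and limit, regex matching, removing documents) can be illustrated on a plain Python list. This is a sketch only; the `posts` data and the `get_path` helper are made up for the example and are not the contents of 2_arrays_sort.txt.

```python
import re

posts = [
    {"title": "Intro to NoSQL", "views": 120, "author": {"name": "Ann"}},
    {"title": "MongoDB indexes", "views": 75, "author": {"name": "Bob"}},
    {"title": "Mongo sharding", "views": 300, "author": {"name": "Ann"}},
]


def get_path(doc, dotted):
    """Follow a dotted path such as 'author.name' into embedded documents."""
    for part in dotted.split("."):
        if not isinstance(doc, dict) or part not in doc:
            return None
        doc = doc[part]
    return doc


# Embedded document lookup: posts written by Ann.
by_ann = [p["title"] for p in posts if get_path(p, "author.name") == "Ann"]

# Sort by views, descending, then limit to the top two.
top_two = [p["title"]
           for p in sorted(posts, key=lambda p: p["views"], reverse=True)[:2]]

# Regex in a query: titles that start with "Mongo".
mongo_titles = [p["title"] for p in posts if re.search(r"^Mongo", p["title"])]

# Removing documents: keep only those that do not match the removal criteria.
remaining = [p for p in posts if p["views"] >= 100]
```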

IMPORT / EXPORT
3_imp_exp.txt
Mongo provides tools (mongoexport, mongoimport) for getting data in and out of the database
• Data can be exported to JSON files
• JSON files can then be imported
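The export and import round trip can be sketched with the standard library alone. The file layout below, one JSON document per line, is assumed to resemble what mongoexport writes by default; verify that against your tool version. `export_docs`, `import_docs` and the file name are hypothetical.

```python
import json
import os
import tempfile

docs = [
    {"_id": 1, "ratingCode": "PG13", "country": "USA"},
    {"_id": 2, "ratingCode": "R", "country": "USA"},
]


def export_docs(documents, path):
    """Write one JSON document per line."""
    with open(path, "w", encoding="utf-8") as fh:
        for doc in documents:
            fh.write(json.dumps(doc) + "\n")


def import_docs(path):
    """Read the file back, skipping blank lines."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ratings.json")
    export_docs(docs, path)
    round_tripped = import_docs(path)
```

Plain JSON cannot carry every BSON type (dates and ObjectIds, for instance), which is one reason the real tools use an extended JSON notation.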

CONDITIONAL OPERATORS
4_cond_ops.txt
• $lt
• $gt
• $gte
• $lte
• $or
• Also $not, $exists, $type, $in
(for $type refer to http://docs.mongodb.org/manual/reference/operator/query/type/#_S_type)
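To show what these operators mean, here is a small in-memory matcher covering $lt, $lte, $gt, $gte, $in, $exists and $or. It is a simplified sketch, not MongoDB's query engine: `matches` is a hypothetical function, $not and $type are left out, and array-valued fields are not handled the way the server handles them.

```python
def matches(doc, query):
    """Return True when doc satisfies a small subset of the query operators."""
    for field, condition in query.items():
        if field == "$or":
            # $or takes a list of sub-queries; at least one must match.
            if not any(matches(doc, sub) for sub in condition):
                return False
            continue
        if not isinstance(condition, dict):
            # A bare value means plain equality.
            if doc.get(field) != condition:
                return False
            continue
        present = field in doc
        value = doc.get(field)
        for op, operand in condition.items():
            if op == "$exists":
                ok = present == operand
            elif not present:
                ok = False  # comparisons never match a missing field here
            elif op == "$lt":
                ok = value < operand
            elif op == "$lte":
                ok = value <= operand
            elif op == "$gt":
                ok = value > operand
            elif op == "$gte":
                ok = value >= operand
            elif op == "$in":
                ok = value in operand
            else:
                raise ValueError(f"unsupported operator: {op}")
            if not ok:
                return False
    return True


people = [{"name": "Ann", "age": 31}, {"name": "Bob", "age": 17}, {"name": "Cy"}]

adults = [p["name"] for p in people if matches(p, {"age": {"$gte": 18}})]
minors_or_ann = [p["name"] for p in people
                 if matches(p, {"$or": [{"age": {"$lt": 18}}, {"name": "Ann"}]})]
no_age = [p["name"] for p in people if matches(p, {"age": {"$exists": False}})]
```

Several operators on one field combine as a range, for example `{"age": {"$gte": 18, "$lt": 65}}`.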

ADMIN COMMANDS
5_admin.txt
• show dbs
• show collections
• db.stats()
• db.posts.stats()
• db.posts.drop()
• db.system.indexes.find()

DATA MODELING
• Remember, with NoSQL redundancy is not evil
• Applications ensure consistency, not the DB
• Applications join data; joins are not defined in the DB
• Data model is schema-less
• Data model is usually built to support the queries

QUESTIONS TO ASK
• What are your basic units of data (what would be a document)?
• How are these units grouped / related?
• How does Mongo let you query this data, and what are the options?
• Finally, maybe most importantly, what are your application's access patterns?
  • Reads vs. writes
  • Queries
  • Updates
  • Deletions
  • How structured is the data?

DATA MODEL - NORMALIZED
Normalized
• Similar to relational model.

• One collection per entity type
• Little or no redundancy

• Allows clean updates, familiar to many SQL users, easier to understand

NORMALIZED DOCUMENTS

REFERENCES
• From parent to child
{
  name: "O'Reilly Media",
  books: [123456789, 234567890, ...]
}

• From child to parent
{
  _id: 123456789,
  title: "MongoDB: The Definitive Guide",
  publisher_id: "oreilly"
}
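Because the database does not join, the application resolves these references itself. The sketch below follows both directions using plain dicts in place of collections; `titles_for`, `publisher_of` and the second book are hypothetical additions for illustration.

```python
# Stand-ins for two collections, keyed by _id.
publishers = {
    "oreilly": {"_id": "oreilly", "name": "O'Reilly Media",
                "books": [123456789, 234567890]},
}
books = {
    123456789: {"_id": 123456789,
                "title": "MongoDB: The Definitive Guide",
                "publisher_id": "oreilly"},
    234567890: {"_id": 234567890,
                "title": "Another MongoDB Book",  # hypothetical title
                "publisher_id": "oreilly"},
}


def titles_for(publisher_id):
    """Parent to child: follow each id in the publisher's books array."""
    publisher = publishers[publisher_id]
    return [books[book_id]["title"] for book_id in publisher["books"]]


def publisher_of(book_id):
    """Child to parent: follow the book's publisher_id."""
    return publishers[books[book_id]["publisher_id"]]


oreilly_titles = titles_for("oreilly")
parent_name = publisher_of(123456789)["name"]
```

Each hop here would be a separate query against the database, which is the cost to weigh against embedding.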

DATA MODEL - EMBEDDED
A frequently used pattern in Mongo is to embed information as subdocuments.
• Used when there is a "contains" relationship
• Easier querying (when related data is often used together)
• Need to keep the 16 MB document size limit in mind
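A "contains" relationship might look like the order below, where the customer and the line items live inside the order document. The field names and values are invented for illustration, and the JSON length is only a rough stand-in for the real BSON size (an assumption).

```python
import json

order = {
    "_id": 1001,
    "customer": {"name": "Ann", "city": "NYC"},  # embedded subdocument
    "items": [                                    # embedded array of subdocuments
        {"sku": "A1", "qty": 2, "price": 9.5},
        {"sku": "B7", "qty": 1, "price": 20.0},
    ],
}

# One read returns the order together with everything it contains.
total = sum(item["qty"] * item["price"] for item in order["items"])

# Keep an eye on growth: unbounded arrays are what push documents toward 16 MB.
approx_bytes = len(json.dumps(order).encode("utf-8"))
fits = approx_bytes < 16 * 1024 * 1024
```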

EMBEDDED

OTHER CONSIDERATIONS FOR DATA MODELING
Many or few collections
• Many collections
  • As seen in normalized
  • Clean and little redundancy
  • May not provide best performance
  • May require frequent updates to the application if new types are added
• Multiple collections
  • Middle ground, partially normalized
• Not many collections
  • One large generic collection
  • Contains many types
  • Use a type field
CONSIDERATION CONTINUED
• Document growth – a document is relocated on disk if it outgrows its allocated space
• Atomicity
  • Atomic at the document level
  • Keep this in mind for inserts, removes and multi-document updates
• Sharding – collections are distributed across mongod instances using a shard key
• Indexes – index the fields that are queried often; indexes slow writes slightly
• Consider using TTL to automatically expire documents
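The role of a shard key can be illustrated with simple hash routing. This is deliberately simplified and is not MongoDB's actual mechanism, which assigns ranges of key values to chunks and balances them across shards; `shard_for` and the shard names are hypothetical.

```python
import hashlib

SHARDS = ["shard0", "shard1", "shard2"]  # hypothetical shard names


def shard_for(shard_key_value, shards=SHARDS):
    """Map a shard key value to a shard using a stable hash."""
    digest = hashlib.md5(str(shard_key_value).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


docs = [{"_id": i, "country": c}
        for i, c in enumerate(["USA", "UK", "USA", "DE"])]

# Route each document by its shard key (here: the country field).
placement = {}
for doc in docs:
    placement.setdefault(shard_for(doc["country"]), []).append(doc["_id"])
```

Documents with the same key value always land together, so a low-cardinality key such as country can concentrate load on a few shards.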

COMMON USES FOR MONGO
Log Collection
https://code.google.com/p/log4mongo/
Caching
Queues / Messaging
Capped Collections - fixed-size collections that support high-throughput
operations that insert, retrieve, and delete documents based on insertion order.
Analytics
Prototyping
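The capped-collection behavior mentioned above, where the oldest documents age out in insertion order, resembles a bounded queue. The sketch below uses `collections.deque` and caps by document count; real capped collections are sized in bytes, and `CappedLog` is a hypothetical name.

```python
from collections import deque


class CappedLog:
    """Fixed-size, insertion-ordered store: the oldest entries fall off first."""

    def __init__(self, max_docs):
        self._docs = deque(maxlen=max_docs)

    def insert(self, doc):
        # Appending to a full deque silently discards the oldest entry.
        self._docs.append(doc)

    def find(self):
        """Return documents in insertion order."""
        return list(self._docs)


log = CappedLog(max_docs=3)
for n in range(5):
    log.insert({"seq": n, "msg": f"event {n}"})

kept = [d["seq"] for d in log.find()]
```

This is why capped collections suit logs and simple queues: writes stay cheap and storage never grows past the cap.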

MONGODB DEVELOPMENT WITH JAVA
Java driver supplied by MongoDB itself
Easy to set up
Available from the Maven repository

EXAMPLE JAVA APP
Load Health Data

Query Data

Administrative Functions

OTHER JAVA OPTIONS
Morphia
Spring MongoDB

SOME OTHER COOL STUFF
Get MEAN
Mongo, Express, Angular and Node
http://bitnami.com/stack/mean
Can be installed in a VM or even in the cloud

THE CLOUD
Database in the cloud
https://mongolab.com/

Can access using shell, GUI Mongo explorer, mongoimport, mongoexport
and use in application

Amazon, Rackspace, Joyent or Azure

BOOKS
MongoDB: The Definitive Guide, 2nd Edition
By: Kristina Chodorow
Publisher: O'Reilly Media, Inc.
Pub. Date: May 23, 2013
Print ISBN-13: 978-1-4493-4468-9
Pages in Print Edition: 432

MongoDB in Action
By: Kyle Banker
Publisher: Manning Publications
Pub. Date: December 16, 2011
Print ISBN-10: 1-935182-87-0
Print ISBN-13: 978-1-935182-87-0
Pages in Print Edition: 312

The Definitive Guide to MongoDB: The NoSQL Database for Cloud and Desktop Computing
By: Eelco Plugge; Peter Membrey; Tim Hawkins
Publisher: Apress
Pub. Date: September 2010
ISBN: 9781430230519
Pages in Print Edition: 327

BOOKS CONT.
MongoDB Applied Design Patterns
By: Rick Copeland
Publisher: O'Reilly Media, Inc.
Pub. Date: March 18, 2013
Print ISBN-13: 978-1-4493-4004-9
Pages in Print Edition: 176

MongoDB for Web Development (rough cut!)
By: Mitch Pirtle
Publisher: Addison-Wesley Professional
Last Updated: 14-JUN-2013
Pub. Date: March 11, 2015 (Estimated)
Print ISBN-10: 0-321-70533-5
Print ISBN-13: 978-0-321-70533-4
Pages in Print Edition: 360

Instant MongoDB
By: Amol Nayak
Publisher: Packt Publishing
Pub. Date: July 26, 2013
Print ISBN-13: 978-1-78216-970-3
Pages in Print Edition: 72

IMPORTANT SITES
• http://www.mongodb.org/
• https://mongolab.com/welcome/
• https://education.mongodb.com/
• http://blog.mongodb.org/
• http://stackoverflow.com/questions/tagged/mongodb
• http://bitnami.com/stack/mean

THAT’S ALL FOLKS
Questions?
Comments?
What other topics are of interest?

Thank You!!!!!!


NoSQL and MongoDB Introduction
