NoSQL and MongoDB Introduction

Slides from workshop held on 12/14 in Asbury Park, NJ

NoSQL and MongoDB Introduction: Presentation Transcript

  • REQUISITE SLIDE – WHO AM I?
    • Brian Enochson
    • SW Engineer who has worked as designer / developer on NoSQL (Mongo, Cassandra, Hadoop)
    • Consultant – HBO, ACS, CIBER
    • Specialize in SW development, architecture and training
    • Twitter @benochso, Google Plus
    • Contact me: I am available for training, consulting & development.
    NOSQL INTRO & MONGODB 2
  • AGENDA Hour 1
    • Installation of required software (will send out list before, but make sure all of class has what is needed)
    • Introduction to Big Data
    • Introduction to NoSQL
    • Relational database to NoSQL technology: contrast & compare
    • NoSQL landscape
    • Exercise – install and use required software
  • AGENDA Hour 2
    • Introduction to MongoDB
    • MongoDB components, capabilities and common use cases
    • JSON & BSON
    • Documents, collections, references and Mongo ID
    • Querying
    • Other CRUD operations
    • Indexes
    • Exercise – design and populate MongoDB
  • AGENDA Hour 3
    • Data modeling / schema design
    • Replication & sharding
    • Exercise – application development using MongoDB and Java
    • Wrap-up and final Q & A
  • SOFTWARE
    Later we will need:
    • MongoDB
    • Java JDK 1.6
    • NetBeans, Eclipse or IntelliJ (with Maven support), or Maven and any editor
    • Our project
    • Robomongo or MongoExplorer
  • BIG DATA
    Why are databases like Mongo needed? To understand, we need to look at:
    • The history of databases
    • How systems were built in the past
    • Modern application architectures
    • Web scale
    • Data acquisition
    • Other factors like cost of H/W
  • HISTORY OF THE DATABASE
    • 1960’s – Hierarchical and network type (IMS and CODASYL)
    • 1970’s – Beginnings of theory behind the relational model (Codd)
    • 1980’s – Rise of the relational model. SQL. E/R Model (Chen)
    • 1990’s – Access/Excel and MySQL. ODMS began to appear
    • 2000’s – Two forces: large enterprise and open source. Google and Amazon. CAP Theorem (more on that to come…)
    • 2010’s – Emergence of NoSQL as an industry player and viable alternative
  • WHY WERE ALTERNATIVES NEEDED
    • Developers today are faced with Internet scale
      • 100,000’s of users
      • Low cost of storage
      • Increased processing power
      • Ability (and need) to capture millions of events. Caching solves it to an extent but brings other complexities
      • Real-time
      • Need to scale out and not up (add any number of low-cost machines vs. replace with a more powerful machine)
    • Cost
      • Let’s not forget that for enterprise DBs, Internet scale can become expensive
      • Open source DBs may solve license cost, but don’t ignore operational costs
  • A LOT OF DATA
    Some facts:
    • Approximately 90 percent of all the real-time information being created today is unstructured data
    • Every day we create 2.5 quintillion (2.5 × 10 to the 18th) bytes of data (that is 18 zeroes!!)
    • 90 percent of the world's data today has been created in the last two years alone
  • RELATIONAL VS. NOSQL
    • Relational
      • Divide into tables, relate with foreign keys, DB constraints, normalized data; the interface is SQL
    • NoSQL
      • Store in schemaless format, redundancy encouraged, application access (your queries) determines the storage format. Interface varies and is optimized for the implementation, no forced DB constraints. The tradeoff is that you often get eventual consistency.
  • TRADEOFFS?
    "Luckily, due to the large number of compromises made when attempting to scale their existing relational databases, these tradeoffs were not so foreign or distasteful as they might have been." (Greg Burd)
  • 3 V’S – DESCRIBING THE BIG DATA PROBLEM
    The driving force in requiring new technology is often referred to as the “3 V Model”.
    • High Volume – amount of data
    • High Variety – range of data types and sources
    • High Velocity – speed of data in and out
    OK, maybe 4 V’s:
    • Veracity – is all the data applicable to the problem being analyzed?
  • NOSQL IS NOT BIG DATA
    NoSQL != Big Data
    NoSQL products were created to help solve the big data problem. Big data is a much larger problem than just storage. Analysis tools like Hadoop, messaging systems like Kafka, real-time processing engines like Storm and machine learning (Mahout) all help solve the big data problem.
  • NOSQL TYPES
    • Document DB: MongoDB, CouchDB
    • Wide Column / Column Family: Cassandra, HBase, Amazon SimpleDB
    • Key Value: Riak, Redis, DynamoDB, Voldemort, MemcacheDB
    • Graph: Neo4j, OrientDB
    • Search (also alternatives, normally used with *): Lucene, Solr, ElasticSearch
    • Many, many, many more!
  • CHOOSING THE RIGHT ONE…
    Choosing the right NoSQL type and eventual product depends on…
    • Type of data
      • One key and a lot of data?
      • High volume of data?
      • Storing media, blobs?
      • Document oriented?
      • Tracking relationships?
      • Combination?
      • Multi-datacenter?
    • Type of access
    • Volumes of data (there is big data and there is BIG DATA)
    • Need for support / services / training
  • ACID
    You probably all have heard of ACID:
    • Atomic – All or none
    • Consistency – What is written is valid
    • Isolation – One operation at a time
    • Durability – Once committed to the DB, it stays
    This is the world we have lived in for a long time…
  • CAP THEOREM (BREWER'S)
    Many may have heard this one. CAP stands for Consistency, Availability and Partition Tolerance.
    • Consistency – all nodes see the same data at the same time (note this is not the same as the C in ACID)
    • Availability – service is available
    • Partition Tolerance – no failure other than complete network failure causes the system not to respond
    (Remember the visual guide to selecting a NoSQL database.)
    So… what does this mean?
  • YOU CAN ONLY HAVE 2 OF THEM
    Or, better said in C* (Cassandra) terms, you can have Availability and Partition Tolerance AND eventual consistency. That means eventually all accesses will return the last updated value.
  • BIG DATA WRAP UP
    • So we are talking about large amounts of data
    • High velocity of acquisition
    • A lot of variety that we need to store. We will worry later about how to handle it (or not)
    • Need to scale and not break the bank
    • Want the database to support agile development, not hinder it
  • STILL WRAPPING
    • Maybe consider going relational if
      • High transaction (FoundationDB?)
      • Business Intelligence systems (Hadoop may make this not true)
    • Don’t be fooled by fear of losing ACID…
  • And now, let’s look at MongoDB
  • MONGO OVERVIEW
    A few high-level points:
    • Document oriented
    • Storage format is JSON (actually BSON)
    • Replication built in
    • Master / slave architecture
    • Strong querying support
    • Name comes from "humongous"
  • MEET MONGO
    • Open source
    • Schemaless
    • Scalable
    • Document-level atomicity
    • Easy installation
    • Relative ease of use
    • Great (!!!!) documentation
  • AND…
    • No cross-document transactions
    • No joins
    • Replication – master / slave
    • Sharding
  • MONGO ADVANTAGE - * Credit – Dwight Merriman, Founder and CEO – MongoDB (was 10Ge
  • DOCUMENT
    At its simplest form, Mongo is a document-oriented database.
    • MongoDB stores all data in documents, which are JSON-style data structures composed of field-and-value pairs.
    • MongoDB stores documents on disk in the BSON serialization format. BSON is a binary representation of JSON documents. BSON contains more data types than does JSON.
  • WHAT DOES A DOCUMENT LOOK LIKE
    {
      "_id" : "52a602280f2e642811ce8478",
      "ratingCode" : "PG13",
      "country" : "USA",
      "entityType" : "Rating"
    }
  • RULES FOR A DOCUMENT
    Documents have the following rules:
    • The maximum BSON document size is 16 megabytes.
    • The field name _id is reserved for use as a primary key; its value must be unique in the collection.
    • Field names cannot start with the $ character.
    • Field names cannot contain the . character.
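
The field-name and size rules above can be checked in application code before a write. The following is only a sketch under stated assumptions: `validateDocument` and `MAX_DOC_BYTES` are hypothetical names, not part of MongoDB, and the size check uses the JSON text length as a rough stand-in for the real BSON size. Uniqueness of _id is not checked here because it depends on the rest of the collection.

```javascript
// Illustrative only: validateDocument is a hypothetical helper, not a MongoDB API.
const MAX_DOC_BYTES = 16 * 1024 * 1024; // 16 megabytes

function validateDocument(doc) {
  const problems = [];

  // Walk the document, including embedded documents and arrays.
  const visit = (value, trail) => {
    if (value === null || typeof value !== "object") return;
    for (const [name, child] of Object.entries(value)) {
      const here = [...trail, name];
      if (!Array.isArray(value)) {
        // Array indexes are not field names, so only object keys are checked.
        if (name.startsWith("$")) problems.push(`field "${here.join(" > ")}" starts with $`);
        if (name.includes(".")) problems.push(`field "${here.join(" > ")}" contains .`);
      }
      visit(child, here);
    }
  };
  visit(doc, []);

  // Assumption: JSON byte length approximates the BSON size; the two differ in practice.
  const approxBytes = new TextEncoder().encode(JSON.stringify(doc)).length;
  if (approxBytes > MAX_DOC_BYTES) problems.push("document exceeds 16 megabytes");

  return problems;
}

const rating = {
  _id: "52a602280f2e642811ce8478",
  ratingCode: "PG13",
  country: "USA",
  entityType: "Rating",
};
console.log(validateDocument(rating));
console.log(validateDocument({ "$bad": 1, "a.b": 2 }));
```

The first call reports no problems for the rating document from the earlier slide; the second reports one problem per offending field name.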
  • MONGO INSTALL
    • Create the data directory. Defaults:
      • Windows: C:\data\db
      • Mac: /data/db/ (make sure you have permissions)
    • Or set it using --dbpath:
      C:\mongodb\bin\mongod.exe --dbpath d:\test\mongodb\data
  • START IT!
    • Database: mongod
    • Shell: mongo
      • show dbs
      • show collections
      • db.stats()
  • BASIC OPERATIONS (1_simpleinsert.txt)
    • Insert
    • Find
      • Find all
      • Find one
      • Find with criteria
    • Indexes
    • Explain()
  • MORE MONGO SHELL (2_arrays_sort.txt)
    • Embedded documents
    • Limit, Sort
    • Using regex in query
    • Removing documents
    • Drop collection
  • IMPORT / EXPORT (3_imp_exp.txt)
    Mongo provides tools for getting data in and out of the database.
    • Data can be exported to JSON files
    • JSON files can then be imported
  • CONDITIONAL OPERATORS (4_cond_ops.txt)
    • $lt
    • $gt
    • $gte
    • $lte
    • $or
    • Also $not, $exists, $type, $in
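
To make the semantics of these operators concrete, here is a minimal in-memory sketch of how a query object selects documents. It is not MongoDB's query engine: `matches`, `operators` and the sample `posts` array are hypothetical, and $not and $type are left out to keep it short.

```javascript
// Illustrative only: a tiny in-memory matcher, not MongoDB's implementation.
const operators = new Map([
  ["$lt", (value, arg) => value < arg],
  ["$lte", (value, arg) => value <= arg],
  ["$gt", (value, arg) => value > arg],
  ["$gte", (value, arg) => value >= arg],
  ["$in", (value, arg) => arg.includes(value)],
  ["$exists", (value, arg) => (value !== undefined) === arg],
]);

function matches(doc, query) {
  // Every top-level entry of the query must hold (implicit AND).
  return Object.entries(query).every(([field, condition]) => {
    if (field === "$or") {
      // $or takes a list of sub-queries; at least one must match.
      return condition.some((alternative) => matches(doc, alternative));
    }
    const value = doc[field];
    const isOperatorObject =
      condition !== null && typeof condition === "object" && !Array.isArray(condition);
    if (!isOperatorObject) return value === condition; // plain equality
    return Object.entries(condition).every(([name, arg]) => {
      const test = operators.get(name);
      if (!test) throw new Error(`unsupported operator: ${name}`);
      return test(value, arg);
    });
  });
}

// Hypothetical sample data.
const posts = [
  { title: "first", likes: 5 },
  { title: "second", likes: 20, tags: ["mongo"] },
  { title: "third", likes: 50 },
];
const titles = (query) => posts.filter((p) => matches(p, query)).map((p) => p.title);

console.log(titles({ likes: { $gt: 10, $lte: 50 } }));
console.log(titles({ $or: [{ likes: { $lt: 10 } }, { tags: { $exists: true } }] }));
```

The query objects have the same shape as the ones passed to find() in the shell, which is why the sketch takes them as plain objects.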
  • ADMIN COMMANDS (5_admin.txt)
    • show dbs
    • show collections
    • db.stats()
    • db.posts.stats()
    • db.posts.drop()
    • db.system.indexes.find()
  • DATA MODELING
    • Remember, with NoSQL redundancy is not evil
    • Applications ensure consistency, not the DB
    • Applications join data; joins are not defined in the DB
    • Data model is schema-less
    • Data model is usually built to support queries
  • QUESTIONS TO ASK
    • What are your basic units of data (what would be a document)?
    • How are these units grouped / related?
    • How does Mongo let you query this data, what are the options?
    • Finally, maybe most importantly, what are your application's access patterns?
      • Reads vs. writes
      • Queries
      • Updates
      • Deletions
      • How structured is it
  • DATA MODEL - NORMALIZED
    • Similar to relational model
    • One collection per entity type
    • Little or no redundancy
    • Allows clean updates, familiar to many SQL users, easier to understand
  • REFERENCES
    • From parent to child
      { name: "O'Reilly Media", books: [123456789, 234567890, ...] }
    • From child to parent
      { _id: 123456789, title: "MongoDB: The Definitive Guide", publisher_id: "oreilly" }
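
Because the database does not join, the application follows these references itself. Below is a minimal sketch of both directions over in-memory arrays; `booksForPublisher`, `publisherForBook`, the publisher's _id value and the second book's title are assumptions made for illustration, and a real application would fetch the documents with queries instead.

```javascript
// Illustrative only: in-memory stand-ins for two collections.
const publishers = [
  { _id: "oreilly", name: "O'Reilly Media", books: [123456789, 234567890] },
];
const books = [
  { _id: 123456789, title: "MongoDB: The Definitive Guide", publisher_id: "oreilly" },
  { _id: 234567890, title: "Placeholder Title", publisher_id: "oreilly" }, // hypothetical entry
];

// Parent to child: resolve the array of book ids stored on the publisher.
function booksForPublisher(publisher, allBooks) {
  const byId = new Map(allBooks.map((book) => [book._id, book]));
  return publisher.books.map((id) => byId.get(id)).filter(Boolean);
}

// Child to parent: follow publisher_id stored on the book.
function publisherForBook(book, allPublishers) {
  return allPublishers.find((p) => p._id === book.publisher_id) || null;
}

console.log(booksForPublisher(publishers[0], books).map((b) => b.title));
console.log(publisherForBook(books[0], publishers).name);
```

Which direction to store depends on the access pattern: an array on the parent suits "list this publisher's books", while a parent id on the child avoids an ever-growing array.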
  • DATA MODEL - EMBEDDED
    An oft-used pattern in Mongo is to embed information as subdocuments.
    • Used when there is a "contains" relationship
    • Easier querying (when related data is often used together)
    • Need to keep the 16 MB document size in mind
  • OTHER CONSIDERATIONS FOR DATA MODELING
    Many or few collections?
    • Many collections
      • As seen in normalized
      • Clean and little redundancy
      • May not provide best performance
      • May require frequent updates to application if new types are added
    • Multiple collections
      • Middle ground, partially normalized
    • Not many collections
      • One large generic collection
      • Contains many types
      • Use a type field
  • CONSIDERATION CONTINUED
    • Document growth – a document will relocate if it exceeds its allocated size
    • Atomicity
      • Atomic at document level
      • Consideration for insertions, removes and multi-document updates
    • Sharding – collections distributed across mongod instances, uses a shard key
    • Indexes – index fields that are often queried; indexes affect write performance slightly
    • Consider using TTL to automatically expire documents
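
As a rough picture of what TTL expiry means, the sketch below drops documents whose date field is older than a given number of seconds. This is an in-memory illustration under assumptions, not how the server does it: `removeExpired`, the `createdAt` field and the sample log entries are hypothetical. In MongoDB itself, expiry is configured on an index over a date field and applied by the server in the background.

```javascript
// Illustrative only: removeExpired is a hypothetical helper, not a MongoDB API.
function removeExpired(docs, dateField, expireAfterSeconds, now = new Date()) {
  const cutoff = now.getTime() - expireAfterSeconds * 1000;
  return docs.filter((doc) => {
    const stamp = doc[dateField];
    // Documents without a date in the field are kept; they never expire.
    if (!(stamp instanceof Date)) return true;
    return stamp.getTime() > cutoff;
  });
}

// Hypothetical log entries.
const now = new Date("2013-12-14T12:00:00Z");
const logs = [
  { msg: "recent", createdAt: new Date("2013-12-14T11:59:30Z") },
  { msg: "old", createdAt: new Date("2013-12-14T11:00:00Z") },
  { msg: "no timestamp" },
];

console.log(removeExpired(logs, "createdAt", 60, now).map((d) => d.msg));
```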
  • COMMON USES FOR MONGO
    • Log collection
    • Caching
    • Queues / messaging
    • Capped collections – fixed-size collections that support high-throughput operations that insert, retrieve, and delete documents based on insertion order
    • Analytics
    • Prototyping
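
The capped-collection behavior described above (fixed size, insertion order, oldest entries making room for new ones) can be pictured with a small in-memory sketch. `CappedList` is a hypothetical class, not a MongoDB API, and it caps by document count as a simplification; a real capped collection is created with a size in bytes.

```javascript
// Illustrative only: an in-memory analogue of a capped collection.
class CappedList {
  constructor(maxDocs) {
    if (!Number.isInteger(maxDocs) || maxDocs < 1) {
      throw new RangeError("maxDocs must be a positive integer");
    }
    this.maxDocs = maxDocs;
    this.docs = [];
  }

  // Insert at the end; once full, the oldest document is discarded.
  insert(doc) {
    this.docs.push(doc);
    if (this.docs.length > this.maxDocs) this.docs.shift();
  }

  // Documents come back in insertion order.
  find() {
    return [...this.docs];
  }
}

const recentEvents = new CappedList(3);
for (let n = 1; n <= 5; n++) recentEvents.insert({ n });
console.log(recentEvents.find().map((d) => d.n));
```

This is why capped collections suit logs and queues: writers never have to clean up, and readers always see the most recent entries in order.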
  • MONGODB DEVELOPMENT WITH JAVA
    • Supplied by MongoDB itself
    • Easy to set up
    • Housed on Maven repo
  • EXAMPLE JAVA APP
    • Load health data
    • Query data
    • Administrative functions
  • SOME OTHER COOL STUFF
    • Get MEAN: Mongo, Express, Angular and Node
    • Can install, in a VM or even in the cloud
  • THE CLOUD
    • Database in the cloud
    • Can access using shell, GUI Mongo explorer, mongoimport, mongoexport and use in application
    • Amazon, Rackspace, Joyent or Azure
  • BOOKS
    • MongoDB: The Definitive Guide, 2nd Edition. By: Kristina Chodorow. Publisher: O'Reilly Media, Inc. Pub. Date: May 23, 2013. Print ISBN-13: 978-1-4493-4468-9. Pages in Print Edition: 432.
    • MongoDB in Action. By: Kyle Banker. Publisher: Manning Publications. Pub. Date: December 16, 2011. Print ISBN-10: 1-935182-87-0. Print ISBN-13: 978-1-935182-87-0. Pages in Print Edition: 312.
    • The Definitive Guide to MongoDB: The NoSQL Database for Cloud and Desktop Computing. By: Eelco Plugge; Peter Membrey; Tim Hawkins. Publisher: Apress. Pub. Date: September 2010. ISBN: 9781430230519. Pages: 327.
  • BOOKS CONT.
    • MongoDB Applied Design Patterns. By: Rick Copeland. Publisher: O'Reilly Media, Inc. Pub. Date: March 18, 2013. Print ISBN-13: 978-1-4493-4004-9. Pages in Print Edition: 176.
    • MongoDB for Web Development (rough cut!). By: Mitch Pirtle. Publisher: Addison-Wesley Professional. Last Updated: 14-JUN-2013. Pub. Date: March 11, 2015 (Estimated). Print ISBN-10: 0-321-70533-5. Print ISBN-13: 978-0-321-70533-4. Pages in Print Edition: 360.
    • Instant MongoDB. By: Amol Nayak. Publisher: Packt Publishing. Pub. Date: July 26, 2013. Print ISBN-13: 978-1-78216-970-3. Pages in Print Edition: 72.
  • THAT’S ALL FOLKS
    Questions? Comments? What other topics are of interest?
    Thank You!!!!!!