NoSQL Introduction
A demonstration in parallel with SQL data models.
Eric Ross
Date: 06/03/2012
What is JSON?
From: http://json.org
{
  "glossary": {
    "title": "example glossary",
    "GlossDiv": {
      "title": "S",
      "GlossList": {
        "GlossEntry": {
          "ID": "SGML",
          "SortAs": "SGML",
          "GlossTerm": "Standard Generalized Markup Language",
          "Acronym": "SGML",
          "Abbrev": "ISO 8879:1986",
          "GlossDef": {
            "para": "A meta-markup language, used to create markup languages such as DocBook.",
            "GlossSeeAlso": ["GML", "XML"]
          },
          "GlossSee": "markup"
        }
      }
    }
  }
}
From: http://json.org/example.html
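To make the nesting concrete, here is a minimal sketch of reading that glossary in Python with the standard `json` module. The text is an abbreviated copy of the json.org example, and the helper name `load_entry` is made up for this illustration.

```python
import json

# Abbreviated copy of the json.org glossary example.
GLOSSARY_JSON = """
{
  "glossary": {
    "title": "example glossary",
    "GlossDiv": {
      "title": "S",
      "GlossList": {
        "GlossEntry": {
          "ID": "SGML",
          "GlossTerm": "Standard Generalized Markup Language",
          "GlossDef": {
            "GlossSeeAlso": ["GML", "XML"]
          }
        }
      }
    }
  }
}
"""


def load_entry(text):
    """Parse the JSON text and walk down the nested objects to the entry."""
    doc = json.loads(text)  # JSON objects become dicts, JSON arrays become lists
    return doc["glossary"]["GlossDiv"]["GlossList"]["GlossEntry"]


entry = load_entry(GLOSSARY_JSON)
print(entry["GlossTerm"])
print(entry["GlossDef"]["GlossSeeAlso"])
```

Objects map to dictionaries and arrays to lists, so a nested document is navigated with ordinary indexing.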
Example Test Result Record
Slide callouts: "_id" is the unique record key; "customdata" holds custom metadata; "recs" holds 1 or more records whose format is custom; the remaining top-level fields (such as "timestamp") are standard fields.
{
  "_id": "e3bc8454-6910-4c2a-b69b-dbf52046d3a0",
  "customdata": {
    "deviceIP": "localhost?usesocketlog=pudStart.log",
    "stopTime": "May 05,2013 13:34:45",
    "logName": "durationLog-2013-05-23-13-34-45-localhost?usesocketlog=pudStart.log.txt",
    "startTime": "May 05,2013 13:34:45",
    "firmwareRelease": "BWP1CN1314AR",
    "runTime": 0.004220008850097656,
    "serialNumber": "CN2AM9J08V",
    "MechMode": true,
    "printerModel": "XN1254"
  },
  "recs": [
    {
      "filesProcessed": 1,
      "fileName": "circle.ps",
      "timePerFile": 7.2447731494903564,
      "pagesPrinted": 1,
      "totalPages": 1,
      "pagesPerMinute": 8.281832814077937,
      "result_id": "e3bc8454-6910-4c2a-b69b-dbf52046d3a0"
    }
  ],
  "passfail": "PASS",
  "name": "performance",
  "timestamp": "1368052433",
  "app": "Duration",
  "type": "result",
  "appversion": "1.0"
}
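A record with this shape could be assembled in Python roughly as follows. This is only a sketch: the helper names `make_result_record` and `missing_standard_fields` are hypothetical, and treating "passfail", "name", "timestamp", "app", "type" and "appversion" as the full set of standard fields is an assumption read off the example, not a documented schema.

```python
import json
import uuid

# Assumption: these top-level names are the "standard fields" of a result record.
STANDARD_FIELDS = ("passfail", "name", "timestamp", "app", "type", "appversion")


def make_result_record(customdata, recs, **standard):
    """Build a result document: unique key, custom metadata, records, standard fields."""
    record_id = str(uuid.uuid4())  # unique record key, stored as "_id"
    doc = {
        "_id": record_id,
        "customdata": dict(customdata),
        # Each record carries the parent's id in "result_id", as in the example.
        "recs": [dict(rec, result_id=record_id) for rec in recs],
    }
    doc.update(standard)
    return doc


def missing_standard_fields(doc):
    """Return the standard field names that the document does not contain."""
    return [name for name in STANDARD_FIELDS if name not in doc]


record = make_result_record(
    {"serialNumber": "CN2AM9J08V", "MechMode": True},
    [{"fileName": "circle.ps", "pagesPrinted": 1}],
    passfail="PASS", name="performance", timestamp="1368052433",
    app="Duration", type="result", appversion="1.0",
)
print(json.dumps(record, indent=2))
print(missing_standard_fields(record))
```

Because the custom parts are plain dictionaries, a different test application can store different fields without any change to the database.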
Relational (SQL) model.
NoSQL Model
What? Is that all?
NoSQL/CouchDB characteristics
• A database is just a collection of JSON documents.
• Views are defined using JavaScript (Python is possible, but not fully qualified).
• A database can contain all types of documents.
• How you decide to “type” your documents is totally up to you.
• No data reformatting or schema changes required.
• REST API, available to any tool that can make HTTP requests (including your browser, using JavaScript/AJAX).
• Libraries are available for all major scripting languages.
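The idea of a view over self-typed documents can be sketched in plain Python. This is an emulation for illustration only, not CouchDB's view engine (real views are normally written in JavaScript); the function names, the "device" document type and the "Boot" application are made up for the example.

```python
def map_results_by_app(doc):
    """Emit (key, value) pairs for result documents, loosely like a view's map step."""
    # The "type" field is our own convention; the database does not enforce it.
    if doc.get("type") == "result" and "app" in doc:
        yield doc["app"], doc.get("passfail")


def run_view(docs, map_fn):
    """Apply the map function to every document and return the rows sorted by key."""
    rows = [row for doc in docs for row in map_fn(doc)]
    return sorted(rows, key=lambda row: row[0])


# One database, several kinds of documents side by side.
docs = [
    {"_id": "1", "type": "result", "app": "Duration", "passfail": "PASS"},
    {"_id": "2", "type": "device", "serialNumber": "CN2AM9J08V"},
    {"_id": "3", "type": "result", "app": "Boot", "passfail": "FAIL"},
]

rows = run_view(docs, map_results_by_app)
print(rows)
```

Documents that do not match the map function's test are simply skipped, which is why mixed document types can share one database.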
Why CouchDB?
• Replication allows deployment where needed, and it is simple and easy to deploy.
• REST/HTTP is accessible through all HP internal firewalls and can be exposed to external entities on a case-by-case basis.
• REST/HTTP is accessible from all programming languages, including command-line utilities like cURL.
• Fully open source (Apache License).
• Simpler features mean simpler admin tasks.
• Binaries are available for Mac and Windows for development purposes and local data caching.
• MAJOR FEATURE: You can attach ANY file to a CouchDB JSON document. This allows the attachment of things such as serial logs and core dump files.
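As a rough sketch of what the REST access and the attachment feature look like from a script, the Python below builds (but does not send) two HTTP requests: one that stores a document and one that attaches a log file to it. The server address, the database name "testresults", the revision string and the helper names are assumptions for illustration; the URL shapes follow CouchDB's document and attachment endpoints.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "http://localhost:5984"  # assumption: a local CouchDB on its default port
DB_NAME = "testresults"             # hypothetical database name


def put_document_request(doc):
    """Build a PUT /{db}/{docid} request that stores the document as JSON."""
    url = BASE_URL + "/" + DB_NAME + "/" + quote(doc["_id"], safe="")
    body = json.dumps(doc).encode("utf-8")
    return Request(url, data=body, method="PUT",
                   headers={"Content-Type": "application/json"})


def put_attachment_request(doc_id, rev, name, payload, content_type="text/plain"):
    """Build a PUT /{db}/{docid}/{attachment}?rev=... request for raw file bytes."""
    query = urlencode({"rev": rev})  # the current document revision is required
    url = (BASE_URL + "/" + DB_NAME + "/" + quote(doc_id, safe="")
           + "/" + quote(name, safe="") + "?" + query)
    return Request(url, data=payload, method="PUT",
                   headers={"Content-Type": content_type})


doc_req = put_document_request({"_id": "doc1", "type": "result", "passfail": "PASS"})
# "1-abc" is a placeholder; the real revision comes back in the server's reply
# to the document PUT.
att_req = put_attachment_request("doc1", "1-abc", "serial.log", b"boot ok\n")

print(doc_req.get_method(), doc_req.full_url)
print(att_req.get_method(), att_req.full_url)
```

Passing either request to `urllib.request.urlopen` would send it to a running server; any HTTP-capable tool can issue the same two requests.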
Popular NoSQL document stores
• MongoDB: Very popular commercial database. Proprietary clients and API. Support for most programming languages. “Big Data” features.
• CouchDB: Open source (Apache). REST API. Client libraries available. Supports replication but no advanced features such as sharding.
• Couchbase: Both an open-source community edition and a commercial edition. Uses a custom API with supplied client libraries.
• Redis: Not really a document store. Key/value store for “Big Data”.
http://nosql-database.org/
World Wide Deployment (idealized)
[Diagram labels: regional acquisition clusters (x3), remote analysis (x3), results aggregation]
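One way such a deployment could be wired together is with CouchDB's `_replicate` endpoint, sketched below in Python. The requests are only built, not sent. The host names, the database name "testresults" and the helper name are hypothetical, and the direction (each regional cluster pushing to the aggregation server) is an assumption based on the diagram labels.

```python
import json
from urllib.request import Request

# Hypothetical hosts standing in for the regional acquisition clusters.
REGIONAL_SERVERS = [
    "http://region-a.example.com:5984",
    "http://region-b.example.com:5984",
    "http://region-c.example.com:5984",
]
# Hypothetical results aggregation database.
AGGREGATION_DB = "http://aggregate.example.com:5984/testresults"


def replication_request(server_url, source, target, continuous=True):
    """Build a POST /_replicate request asking server_url to copy source to target."""
    body = json.dumps(
        {"source": source, "target": target, "continuous": continuous}
    ).encode("utf-8")
    return Request(server_url + "/_replicate", data=body, method="POST",
                   headers={"Content-Type": "application/json"})


# Each regional server pushes its local "testresults" database to the aggregate.
requests_to_send = [
    replication_request(server, "testresults", AGGREGATION_DB)
    for server in REGIONAL_SERVERS
]

for req in requests_to_send:
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Remote analysis sites could pull from the aggregation database in the same way by swapping the source and target.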

A comparison of storing data in a NoSQL database (CouchDB) and in a relational database.

