Developing node-mdb: a Node.js-based clone of SimpleDB
Talk given at the London Ajax Users Group, June 14 2011

Presentation Transcript

  • Developing node-mdb: SimpleDB emulation using Node.js and GT.M
    • Rob Tweed, M/Gateway Developments Ltd
    • http://www.mgateway.com
    • Twitter: @rtweed
  • Could you translate that title?
    • SimpleDB:
      • Amazon’s NoSQL cloud database
    • Node.js:
      • evented server-side JavaScript (using V8)
    • GT.M:
      • Open source global-storage based NoSQL database
    • node-mdb
      • Open source emulation of SimpleDB
  • SimpleDB
    • Amazon’s cloud database
      • Pay as you go
    • Secure HTTP interface
    • Schema-free NoSQL database
    • Spreadsheet-like database model
      • Domains (= tables)
        • Items (= rows)
          • Attributes (= cells)
            • Values (1+ per attribute allowed)
    • SQL-like query API
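The spreadsheet-like model above can be pictured as nested JavaScript objects. This is only an illustrative sketch: the `books` object, the item and attribute names and the `putAttribute` helper are made up for this example and are not part of SimpleDB's API or of node-mdb.

```javascript
// Illustrative only: a SimpleDB-style domain as plain nested objects.
// domain -> item name -> attribute name -> array of values (1+ per attribute).
var books = {};

// Hypothetical helper: add a value to an attribute, creating the item and
// the attribute on first use (nothing is declared up front).
function putAttribute(domain, itemName, attrName, value) {
  var item = domain[itemName] || (domain[itemName] = {});
  var values = item[attrName] || (item[attrName] = []);
  values.push(String(value));
  return domain;
}

putAttribute(books, 'item1', 'title', 'Example Book');
putAttribute(books, 'item1', 'keyword', 'nodejs');
putAttribute(books, 'item1', 'keyword', 'database'); // second value, same attribute

console.log(JSON.stringify(books, null, 2));
```

An item is just a named bag of attributes, and an attribute is a list of string values, which is why no table definition is needed before the first write.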
  • Why emulate SimpleDB?
    • Because I could!
    • Kind of cool project
  • Why emulate SimpleDB?
    • To provide a free, locally-available database that behaves identically to SimpleDB
      • Lots of off-the-shelf clients available
        • Standalone
          • Bolso
          • Mindscape’s SimpleDB Management Tools
        • Language-specific clients
          • boto (Python)
          • Official AWS clients for Java, .NET
          • Node.js
          • etc…
  • Why emulate SimpleDB?
    • To perform local tests prior to committing to production on SimpleDB
    • To provide a live, local backup database
    • A SimpleDB database for private clouds
    • To provide an immediately-consistent SimpleDB database
      • SimpleDB is “eventually consistent”
  • Why the GT.M database?
    • I’m familiar with it
    • Free Open Source NoSQL database
    • Schema-free
    • “Globals”:
      • Sparse persistent multi-dimensional arrays
        • Hierarchical database
        • Completely dynamic storage
          • No pre-declaration or specification needed
    • Result: trivial to model SimpleDB in globals
    • node-mdb : Good way to demonstrate the capabilities of the otherwise little-known GT.M
    • More info – Google:
      • “GT.M database”
      • “universalnosql”
  • Why write it using Node.js?
    • M/DB originally written in late 2008
      • Implemented using GT.M’s native scripting language (M)
      • Apache + m_apache gateway to GT.M for HTTP interface
    • I’ve been working with Node.js for about a year now
      • Rewriting M/DB in JavaScript would make it more widely interesting and comprehensible
    • Some performance issues reported with M/DB when being pushed hard
  • Why Node.js?
    • Conclusion:
      • Re-implementing M/DB using Node.js should provide better performance and scalability
      • Fewer moving parts:
        • Apache + m_apache + GT.M / multi-threaded
        • Node.js + GT.M as child processes / single-thread
      • Cool Node.js project to attempt
      • Great example of non-trivial use of Node.js + database
  • How does SimpleDB work?
    • [Diagram: an incoming SDB HTTP request reaches the HTTP server, which authenticates the request (HMacSHA, using the security key ID and secret key), executes the API action against SimpleDB database copies 1 to n, and generates the outgoing SDB HTTP response (error, or success and/or data/results)]
  • Node.js can emulate all this
    • [Diagram: the same request pipeline as on the previous slide]
  • GT.M can emulate this
    • [Diagram: the same request pipeline, with a single SimpleDB database copy]
  • Node.js characteristics
    • Single threaded process
    • Event loop
    • Non-blocking I/O
      • Asynchronous calls to functions that handle I/O
      • Event-driven call-back functions when function completes
        • Data fetched
        • Data saved
  • Result: deeply nested call-backs
    • [Diagram: HTTP server, authenticate request (security key ID, secret key), execute API action, generate HTTP response (error, or success and/or data/results), each step nested inside the previous one]
  • Flattening the call-back nesting
    • [Diagram: http server, processSDBRequest(), executeAPI() and sendResponse() as separate named steps]
        http.createServer(function(req,res) {..}
        var processSDBRequest = function() {…};
        var executeAPI = function() {…};
  • Node.js HTTP Server
        http.createServer(function(request, response) {
          request.content = '';
          request.on("data", function(chunk) {
            request.content += chunk;
          });
          request.on("end", function() {
            var SDB = {
              startTime: new Date().getTime(),
              request: request,
              response: response
            };
            var urlObj = url.parse(request.url, true);
            if (request.method === 'POST') {
              SDB.nvps = parseContent(request.content);
            } else {
              SDB.nvps = urlObj.query;
            }
            var uri = urlObj.pathname;
            if ((uri.indexOf(sdbURLPattern) !== -1) || (uri.indexOf(mdbURLPattern) !== -1)) {
              processSDBRequest(SDB);
            } else {
              var uriString = 'http://' + request.headers.host + request.url;
              var error = {code: 'InvalidURI', message: 'The URI ' + uriString + ' is not valid', status: 400};
              returnError(SDB, error);
            }
          });
        }).listen(httpPort);
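The server code above calls `parseContent()` without showing it. A minimal sketch of what such a helper might look like, assuming the POST body is a URL-encoded form, using Node's built-in `querystring` module. This is a stand-in for illustration; the actual node-mdb helper may differ.

```javascript
var querystring = require('querystring');

// Hypothetical stand-in for parseContent(): turns a URL-encoded POST body
// into an object of name/value pairs, decoding percent-escapes on the way.
function parseContent(content) {
  return querystring.parse(content);
}

var nvps = parseContent(
  'Action=ListDomains&MaxNumberOfDomains=100' +
  '&Timestamp=2011-06-06T22%3A39%3A30%2B00%3A00'
);

console.log(nvps.Action, nvps.MaxNumberOfDomains, nvps.Timestamp);
```

All values come back as strings, so a numeric parameter such as `MaxNumberOfDomains` would still need converting before use.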
  • processSDBRequest()
        var processSDBRequest = function(SDB) {
          var accessKeyId = SDB.nvps.AWSAccessKeyId;
          if (!accessKeyId) {
            var error = {code: 'AuthMissingFailure', message: 'AWS was not able to authenticate the request: access credentials are missing', status: 403};
            returnError(SDB, error);
          } else {
            MDB.getGlobal('MDBUAF', ['keys', accessKeyId], function (error, results) {
              if (!error) {
                if (results.value !== '') {
                  accessKey[accessKeyId] = results.value;
                  validateSDBRequest(SDB, results.value);
                } else {
                  var error = {code: 'AuthMissingFailure', message: 'AWS was not able to authenticate the request: access credentials are missing', status: 403};
                  returnError(SDB, error);
                }
              }
            });
          }
        };
  • validateSDBRequest()
        var validateSDBRequest = function(SDB, secretKey) {
          // Node.js crypto expects a digest name such as 'sha256' for SimpleDB's HmacSHA256
          var type = 'sha256';
          var stringToSign = createStringToSign(SDB, true);
          var hash = digest(stringToSign, secretKey, type);
          if (hash === SDB.nvps.Signature) {
            processSDBAction(SDB);
          } else {
            errorResponse('SignatureDoesNotMatch', SDB);
          }
        };
  • stringToSign()
        POST{lf}
        192.168.1.134:8081{lf}
        /{lf}
        AWSAccessKeyId=rob&Action=ListDomains&MaxNumberOfDomains=100&SignatureMethod=HmacSHA1&SignatureVersion=2&Timestamp=2011-06-06T22%3A39%3A30%2B00%3A00&Version=2009-04-15
    • i.e. reconstruct the same string that the SDB client used to sign the request, then use rob’s secret key to sign it:
  • digest()
        var crypto = require("crypto");

        // type is a digest name understood by Node.js crypto (e.g. 'sha256')
        var digest = function(string, secretKey, type) {
          var hmac = crypto.createHmac(type, secretKey);
          hmac.update(string);
          return hmac.digest('base64');
        };
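The slides do not show `createStringToSign()`. The sketch below rebuilds the example string from the stringToSign() slide, assuming AWS Signature Version 2 rules: parameters sorted by name, percent-encoded per RFC 3986, with `Signature` left out. The function names, the argument list and the secret key are assumptions made for this example, not node-mdb's actual code.

```javascript
var crypto = require('crypto');

// Percent-encode per RFC 3986. encodeURIComponent leaves ! ' ( ) * alone,
// so those are escaped by hand.
function rfc3986Encode(value) {
  return encodeURIComponent(value).replace(/[!'()*]/g, function (c) {
    return '%' + c.charCodeAt(0).toString(16).toUpperCase();
  });
}

// Hypothetical createStringToSign(): verb, lowercased host, path and the
// sorted, encoded parameters (minus Signature), joined by line feeds.
// The default sort() matches byte order for plain ASCII parameter names.
function createStringToSign(method, host, path, nvps) {
  var query = Object.keys(nvps)
    .filter(function (name) { return name !== 'Signature'; })
    .sort()
    .map(function (name) {
      return rfc3986Encode(name) + '=' + rfc3986Encode(nvps[name]);
    })
    .join('&');
  return [method, host.toLowerCase(), path, query].join('\n');
}

// Map SimpleDB SignatureMethod names to Node.js digest names.
var digestNames = { HmacSHA1: 'sha1', HmacSHA256: 'sha256' };

function sign(stringToSign, secretKey, signatureMethod) {
  return crypto.createHmac(digestNames[signatureMethod], secretKey)
    .update(stringToSign)
    .digest('base64');
}

// Parameters from the stringToSign() slide, already URL-decoded.
var nvps = {
  Version: '2009-04-15',
  Action: 'ListDomains',
  AWSAccessKeyId: 'rob',
  MaxNumberOfDomains: '100',
  SignatureMethod: 'HmacSHA1',
  SignatureVersion: '2',
  Timestamp: '2011-06-06T22:39:30+00:00'
};

var stringToSign = createStringToSign('POST', '192.168.1.134:8081', '/', nvps);
var signature = sign(stringToSign, 'not-a-real-secret', nvps.SignatureMethod);
console.log(stringToSign);
```

The server only accepts the request if the signature it computes this way equals the `Signature` parameter the client sent.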
  • Ready to execute an API!
    • [Diagram: the SimpleDB request pipeline (HTTP server, authenticate request, execute API action, generate HTTP response), now at the execute API action step]
  • SimpleDB APIs (Actions)
    • CreateDomain
    • ListDomains
    • DeleteDomain
    • PutAttributes (BatchPutAttributes)
    • GetAttributes
    • DeleteAttributes (BatchDeleteAttributes)
    • Select
    • DomainMetadata
  • Accessing the GT.M Database
    • Accessed via node-mwire
      • TCP-based wire protocol
      • Extension of Redis protocol
      • Adapted redis-node module
    • APIs allow you to set/get/delete/edit Globals
  • GT.M Globals
    • Globals = unit of persistent storage
      • Schema-free
      • Hierarchically structured
      • Sparse
      • Dynamic
      • “persistent associative array”
  • GT.M Globals
    • A Global has:
      • A name
      • 0, 1 or more subscripts
      • String value
      • globalName[subscript1, subscript2, ... subscriptN] = value
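To make the `globalName[subscripts] = value` notation concrete, here is a toy in-memory model of a global. It is not GT.M and not node-mwire: `ToyGlobal` and its methods are invented for this sketch, which only illustrates the sparse, hierarchical, schema-free shape of the data.

```javascript
// Toy in-memory model of a global: NOT GT.M or node-mwire, just an illustration.
// Each node may hold a string value and any number of child subscripts.
function ToyGlobal(name) {
  this.name = name;
  this.root = { children: Object.create(null) };
}

// Walk to the node addressed by an array of subscripts, optionally
// creating missing nodes on the way (no pre-declaration needed).
ToyGlobal.prototype._node = function (subscripts, create) {
  var node = this.root;
  for (var i = 0; i < subscripts.length; i++) {
    var key = String(subscripts[i]);
    if (!(key in node.children)) {
      if (!create) return undefined;
      node.children[key] = { children: Object.create(null) };
    }
    node = node.children[key];
  }
  return node;
};

ToyGlobal.prototype.set = function (subscripts, value) {
  this._node(subscripts, true).value = String(value);
};

// Returns '' for nodes that do not exist or that hold no value.
ToyGlobal.prototype.get = function (subscripts) {
  var node = this._node(subscripts, false);
  return node && node.value !== undefined ? node.value : '';
};

var mdb = new ToyGlobal('MDB');
mdb.set(['rob', 'domains', 1, 'name'], 'books');
mdb.set(['rob', 'domainIndex', 'books', 1], '');
console.log(mdb.get(['rob', 'domains', 1, 'name']));
```

Returning an empty string for a missing node is a choice made here so that it lines up with the `results.value !== ''` check in processSDBRequest().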
  • SDB Domain in Globals
    • CreateDomain
      • AWSAccessKeyId = 'rob'
      • DomainName = 'books'
  • Multiple Domains in Globals
        MDB['rob','domains',1,'name'] = 'books'
        MDB['rob','domains',1,'created'] = 1304956337618
        MDB['rob','domains',1,'modified'] = 1304956337618
        MDB['rob','domainIndex','books',1] = ''
        MDB['rob','domains',2,'name'] = 'accounts'
        MDB['rob','domains',2,'created'] = 1304956337423
        MDB['rob','domains',2,'modified'] = 1304956337423
        MDB['rob','domainIndex','accounts',2] = ''
  • Creating a new domain (1)
    • [Diagram: the 'books' nodes only; increment() on MDB['rob','domains'] returns 2, the id for the new domain]
  • Creating a new domain (2)
    • [Diagram: setGlobal() adds the 'accounts' nodes under id 2, giving the structure shown on the Multiple Domains in Globals slide]
  • Key Node.js async patterns for db I/O
    • Dependent pattern:
      • Can’t set the global nodes until the value of the increment() is returned
    • Parallel pattern:
      • Global nodes can be created in parallel
      • No interdependence
      • BUT:
        • Need to know when they’re all completed
  • Dependent pattern
    • [Diagram: MDB['rob','domains'] holding domain 1 ('books', created and modified 1304956337618); increment() (IncrBy) yields 2 for the next domain]
        MDB.increment([accessKeyId, 'domains'], 1, function (error, results) {
          var id = results.value;
          //….now create the other global nodes inside callback
        });
  • Parallel Pattern (semaphore)
        var count = 0;
        MDB.setGlobal([accessKeyId, 'domains', id, 'name'], domainName, function (error, results) {
          count++;
          if (count === 4) sendCreateDomainResponse(count, SDB);
        });
        MDB.setGlobal([accessKeyId, 'domains', id, 'created'], now, function (error, results) {
          count++;
          if (count === 4) sendCreateDomainResponse(count, SDB);
        });
        MDB.setGlobal([accessKeyId, 'domains', id, 'modified'], now, function (error, results) {
          count++;
          if (count === 4) sendCreateDomainResponse(count, SDB);
        });
        MDB.setGlobal([accessKeyId, 'domainIndex', nameIndex, id], '', function (error, results) {
          count++;
          if (count === 4) sendCreateDomainResponse(count, SDB);
        });
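The four identical call-backs in the semaphore slide could be factored into one small helper. A minimal sketch, assuming Node-style `(error, results)` call-backs: `whenAllDone` and `fakeSetGlobal` are hypothetical names, with `fakeSetGlobal` standing in for an asynchronous `MDB.setGlobal()` so that the example runs without a database.

```javascript
// Returns a call-back to hand to each parallel operation. onComplete fires
// exactly once: after `expected` successful call-backs, or on the first error.
function whenAllDone(expected, onComplete) {
  var count = 0;
  var finished = false;
  return function (error) {
    if (finished) return;
    if (error) {
      finished = true;
      return onComplete(error);
    }
    count++;
    if (count === expected) {
      finished = true;
      onComplete(null, count);
    }
  };
}

// Stand-in for an asynchronous setGlobal(): calls back on a later tick.
function fakeSetGlobal(subscripts, value, callback) {
  setImmediate(function () { callback(null, { ok: true }); });
}

var accessKeyId = 'rob', id = 2, now = Date.now();
var done = whenAllDone(4, function (error, count) {
  console.log(error ? 'failed: ' + error.message : count + ' nodes created');
});

// The four writes do not depend on each other, so they are all started at once.
fakeSetGlobal([accessKeyId, 'domains', id, 'name'], 'accounts', done);
fakeSetGlobal([accessKeyId, 'domains', id, 'created'], now, done);
fakeSetGlobal([accessKeyId, 'domains', id, 'modified'], now, done);
fakeSetGlobal([accessKeyId, 'domainIndex', 'accounts', id], '', done);
```

Unlike the bare counter, this version also reports the first error instead of waiting for a count that will never be reached.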
  • New domain nodes created
    • [Diagram: the global now holds both domains, 'books' (id 1) and 'accounts' (id 2), with the same nodes as on the Multiple Domains in Globals slide]
  • Send CreateDomain Response
    • [Diagram: the SimpleDB request pipeline (HTTP server, authenticate request, execute API action, generate HTTP response), now at the generate HTTP response step]
  • CreateDomain Response
        <?xml version="1.0"?>
        <CreateDomainResponse xmlns="http://sdb.amazonaws.com/doc/2009-04-15/">
          <ResponseMetadata>
            <RequestID>e4e9fa45-f9dc-4e5b-8f0a-777acce6505e</RequestID>
            <BoxUsage>0.0020000000</BoxUsage>
          </ResponseMetadata>
        </CreateDomainResponse>

        var okResponse = function(SDB) {
          var nvps = SDB.nvps;
          var xml = responseStart({action: nvps.Action, version: nvps.Version});
          xml = xml + responseEnd(nvps.Action, SDB.startTime, false);
          responseHeader(200, SDB.response);
          SDB.response.write(xml);
          SDB.response.end();
        };
  • Node.js HTTP Server Response
        http.createServer(function(request, response) {
          //…numerous call-backs deep:
          response.writeHead(status, {
            "Server": "Amazon SimpleDB",
            "Content-Type": "text/xml",
            "Date": dateNow.toUTCString()
          });
          response.write('<?xml version="1.0"?>\n');
          response.write(xml);
          response.end();
        });
    • Entire request/response SDB round-trip completed
  • Demo using Bolso
    • List Domains
    • Create Domain
    • Add an item (row) and some attributes (columns + cells)
  • Node.js Gotchas
    • Async programming is not immediately intuitive!
    • Loops
      • Calling functions that use call-backs inside a for..in loop will go horribly wrong!
    • Understanding closures
      • How externally-defined variables can be used inside call-back functions
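A small sketch of the loop gotcha. Asynchronous I/O is simulated here by queueing call-backs and running them only after the loop has finished, which is when real I/O call-backs would fire. `fakeAsync`, `flush` and the attribute names are invented for the example.

```javascript
// Simulate async I/O: call-backs are queued and only run after the loop ends.
var pending = [];
function fakeAsync(callback) { pending.push(callback); }
function flush() { pending.splice(0).forEach(function (cb) { cb(); }); }

var attributes = { title: 'a', author: 'b', year: 'c' };

// Broken: every call-back closes over the single `key` variable, which
// already holds the last key by the time any call-back runs.
var broken = [];
for (var key in attributes) {
  fakeAsync(function () { broken.push(key); });
}
flush();

// Fixed: a function call per iteration gives each call-back its own copy.
var fixed = [];
Object.keys(attributes).forEach(function (attrName) {
  fakeAsync(function () { fixed.push(attrName); });
});
flush();

console.log(broken); // [ 'year', 'year', 'year' ]
console.log(fixed);  // [ 'title', 'author', 'year' ]
```

The closure captures the variable itself, not its value at the time the call-back was created, which is exactly why the first loop goes wrong.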
  • Example
    • BatchPutAttributes
      • Intuitively a for..in loop around PutAttributes
      • Had to be serialised
        • Completion of one PutAttributes calls the next
      • Copy state of SDB object and use for..in?
        • var SDBx = SDB;
        • SDBx is a pointer to SDB, not a clone of it!
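One way to serialise such a batch is to let the completion of each call start the next one, and to give each call its own shallow copy of the request state. This is a sketch of the idea rather than node-mdb's code: `batchPut`, the `put` parameter and the `itemName` field are assumptions for the example.

```javascript
// Serialised batch: the completion call-back of one put starts the next.
// `put(request, callback)` is any async PutAttributes-like function.
function batchPut(SDB, items, put, done) {
  function next(index) {
    if (index === items.length) return done(null);
    // Shallow copy, so each call gets its own object. `var SDBx = SDB;`
    // would only create a second reference to the same object.
    var SDBx = Object.assign({}, SDB, { itemName: items[index] });
    put(SDBx, function (error) {
      if (error) return done(error);
      next(index + 1);
    });
  }
  next(0);
}

// Stand-in for PutAttributes: logs the item, then calls back on a later tick.
function fakePutAttributes(request, callback) {
  console.log('put', request.domain, request.itemName);
  setImmediate(callback, null);
}

var SDB = { domain: 'books' };
batchPut(SDB, ['item1', 'item2', 'item3'], fakePutAttributes, function (error) {
  console.log(error ? 'batch failed' : 'batch complete');
});
```

`Object.assign` copies only the top level, so nested objects inside `SDB` would still be shared between the copies.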
  • Conclusions
    • node-mdb is now nearly complete
    • Only BatchDeleteAttributes not implemented
    • Other APIs emulate SimpleDB 100%
    • Free Open Source
      • https://github.com/robtweed/node-mdb
      • Give it a try!
      • Use mdb.js for examples to build your own Node.js database applications
    • Check out GT.M!
    • Follow me on Twitter at @rtweed
    • Slides: http://www.mgateway.com/node-mdb-pres.html