Riak at The NYC Cloud Computing Meetup Group
 


An in-depth look at the NoSQL product, Riak.


Usage Rights

CC Attribution-NonCommercial-ShareAlike License


    Riak at The NYC Cloud Computing Meetup Group: Presentation Transcript

    • A Walk Down NOSQL Lane in the Cloud, Part 2: Riak
      - NYC Cloud Computing Group, March 2011
      - Alexander Sicular, @siculars
    • Who is this blowhard?
      - Columbia University pays my mortgage
      - For the better part of a decade in Medical Informatics
      - Am not shilling for any of these companies
      - Am not a computer scientist
      - Am a computer science enthusiast, particularly in the area of Informatics
    • Riak, eh?
      - Dynamo inspired
      - Homogeneous
      - Single key-space
      - Distributed
      - Replicated
      - Predictable scalability
    • Origins
      - Show me your friends...
      - Amazon’s Dynamo: http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
      - Akamai: http://www.basho.com/bios.html
      - Paramount Home Video
    • CAP Theorem
      - http://en.wikipedia.org/wiki/CAP_theorem
      - Consistency
      - Availability
      - Partition tolerance
      - Pick two? http://guide.couchdb.org/draft/consistency.html
      - Riak says: pick two at a time.
    • Homogeneous
      - Every node is the same
      - Any node can service any request
      - Nodes gossip on their own port
    • One Ring to Rule Them
      - Single 160 bit key space
      - Huh?
      - No Sharding!
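
The single 160 bit key space can be pictured as the output range of SHA-1, cut into equal partitions. The sketch below is illustrative only: the `ringPosition` and `partitionIndex` names are made up here, and hashing the string "bucket/key" is an assumption, since Riak hashes its own internal encoding of the bucket/key pair.

```javascript
const crypto = require("crypto");

// The ring covers the integers 0 .. 2^160 - 1, the output range of SHA-1.
const RING_SIZE = 2n ** 160n;

// Hash a bucket/key pair to a position on the ring.
// Assumption: the "bucket/key" string encoding is for illustration only.
function ringPosition(bucket, key) {
  const digest = crypto.createHash("sha1").update(`${bucket}/${key}`).digest("hex");
  return BigInt("0x" + digest);
}

// Map a ring position to one of `partitionCount` equal-sized partitions.
function partitionIndex(position, partitionCount) {
  const partitionSize = RING_SIZE / BigInt(partitionCount);
  return Number(position / partitionSize);
}

const position = ringPosition("myBucket", "myKey");
console.log(partitionIndex(position, 64));
```

Every client computes the same position for the same bucket and key, so there is no lookup table to shard or keep in sync.
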
    • Distributed (!= replicated)
      - riak is not sharded
      - vnodes = units of distribution
      - vnodes != physical nodes (pnodes)
      - vnodes map to pnodes
      - data is distributed at the vnode level
      - ★ Considerations:
          - must plan maximum ring size
          - think about number of vnodes per pnode
          - generally no less than 10 vnodes per pnode
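
A rough way to reason about the vnode-to-pnode planning above. Round-robin assignment is a simplifying assumption, not Riak's actual claim algorithm, and the function names are hypothetical; the 10 vnodes per pnode floor is the rule of thumb from the slide.

```javascript
// Assign vnodes to physical nodes round-robin so that neighbouring
// vnodes land on different machines (a simplification, see lead-in).
function assignVnodes(ringSize, pnodes) {
  const owners = [];
  for (let vnode = 0; vnode < ringSize; vnode++) {
    owners.push(pnodes[vnode % pnodes.length]);
  }
  return owners;
}

// Count how many vnodes each physical node ended up owning.
function vnodesPerPnode(owners) {
  const counts = {};
  for (const owner of owners) counts[owner] = (counts[owner] || 0) + 1;
  return counts;
}

// Largest cluster a fixed ring supports while keeping at least
// `minVnodes` vnodes on every physical node.
function maxPnodes(ringSize, minVnodes = 10) {
  return Math.floor(ringSize / minVnodes);
}

const owners = assignVnodes(64, ["node1", "node2", "node3", "node4"]);
console.log(vnodesPerPnode(owners));
console.log(maxPnodes(64));
```

Because the ring size is fixed when the cluster is created, `maxPnodes` is the number to check before settling on it.
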
    • Conflict Resolution
      - Vector Clocks
      - ancestry / divergency maintained
      - automatic or manual resolution
      - ★ Considerations:
          - X-Riak-ClientId, X-Riak-Vclock
          - allow_mult
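
How a vector clock tells ancestry from divergence can be sketched in a few lines. This is a generic textbook version, assuming a clock is a plain map from client id to counter; Riak's own encoding (the opaque X-Riak-Vclock value) is different, and the function names are illustrative.

```javascript
// A vector clock here is a plain object mapping client id -> counter.
function increment(clock, clientId) {
  return { ...clock, [clientId]: (clock[clientId] || 0) + 1 };
}

// True when `a` has seen every event that `b` has seen.
function descends(a, b) {
  return Object.keys(b).every((id) => (a[id] || 0) >= b[id]);
}

// Classify the relationship of `a` to `b`.
function compare(a, b) {
  const aDescendsB = descends(a, b);
  const bDescendsA = descends(b, a);
  if (aDescendsB && bDescendsA) return "equal";
  if (aDescendsB) return "descendant";
  if (bDescendsA) return "ancestor";
  return "concurrent"; // divergent siblings: needs resolution
}

const base = increment({}, "alice");
const fromAlice = increment(base, "alice");
const fromBob = increment(base, "bob");
console.log(compare(fromAlice, base));
console.log(compare(fromAlice, fromBob));
```

A "descendant" can safely replace its ancestor. A "concurrent" pair is the case where either the store picks a winner automatically or, with allow_mult enabled, both siblings are kept for the application to resolve.
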
    • Replicated (!= distributed)
      - configurable replication values (“N”)
      - configurable consistency and availability values at read and write time
          - read
          - write
          - durable write
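
The trade-off between the replication value N and the per-request read and write values can be put as simple arithmetic. This is a sketch under a strict-quorum assumption (Dynamo-style systems may relax this during failures); the parameter names `r` and `w` and all function names are chosen here for illustration.

```javascript
// Smallest majority of n replicas.
function quorum(n) {
  return Math.floor(n / 2) + 1;
}

// With strict quorums, a read overlaps the latest write when r + w > n.
function readsOverlapWrites(n, r, w) {
  return r + w > n;
}

// How many replicas may be unreachable before reads or writes fail.
function toleratedFailures(n, r, w) {
  return { reads: n - r, writes: n - w };
}

console.log(quorum(3));
console.log(readsOverlapWrites(3, 2, 2));
console.log(readsOverlapWrites(3, 1, 1));
console.log(toleratedFailures(3, 1, 3));
```

Lower read and write values favour availability and latency; higher values favour consistency. Choosing them per request is what the "pick two at a time" remark on the CAP slide refers to.
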
    • Predictable Scalability
      - How much performance per node?
      - Scale in both directions
          > bin/riak-admin
          Usage: riak-admin { join | leave | backup | restore | test | status | reip | js_reload | wait-for-service | ringready | transfers }
    • Data Agnostic
      - schemaless
      - data objects may be of any type: binary, text (json, xml)
      - use content types
          > curl -v -d 'this is a test' -H "Content-Type: text/plain" http://127.0.0.1:8098/riak/testBucket/testKey
    • Extra Goodies
      - Erlang: http://www.pragprog.com/titles/jaerlang/programming-erlang
      - Code Architecture
      - basho_bench
      - Multiple backends: bitcask, innodb, mem
    • Code architecture
      - Highly modularized
          - riak_core
          - riak_kv
          - bitcask
          - erlang_js
      - http://bitbucket.org/basho
    • basho_bench
      - Performance profiling
      - highly customizable
      - pretty pictures
      - key/value store generalized
      - https://wiki.basho.com/display/RIAK/Benchmarking+with+Basho+Bench
      - http://pics.livejournal.com/demmonoid/pic/00001sa7
    • Bitcask
      - Riak’s default disk backend
      - Write-only (append-only) log
      - Heavy updates will grow your footprint: look into compaction/merging settings
      - Keys are cached in memory with disk offsets
      - https://spreadsheets.google.com/ccc?key=0Ak4OBkABJPsxdEowYXc2akxnYU9xNkJmbmZscnhaTFE&hl=en&authkey=CMHw8tYO
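
Why heavy updates grow the footprint, and what merging buys back, shows up in a toy model of the design. This is not Bitcask's file format: the `TinyLogStore` class is hypothetical, a Buffer stands in for the data file, and entries carry no headers or checksums.

```javascript
class TinyLogStore {
  constructor() {
    this.log = Buffer.alloc(0); // stands in for the append-only data file
    this.keydir = new Map();    // key -> { offset, size } of the latest value
  }

  // Writes only ever append; the in-memory index is repointed.
  put(key, value) {
    const bytes = Buffer.from(value, "utf8");
    this.keydir.set(key, { offset: this.log.length, size: bytes.length });
    this.log = Buffer.concat([this.log, bytes]);
  }

  // Reads are one index lookup plus one slice at a known offset.
  get(key) {
    const entry = this.keydir.get(key);
    if (!entry) return undefined;
    return this.log.subarray(entry.offset, entry.offset + entry.size).toString("utf8");
  }

  // Rewrite the log with live values only, reclaiming stale space.
  merge() {
    const live = [...this.keydir.keys()].map((key) => [key, this.get(key)]);
    this.log = Buffer.alloc(0);
    this.keydir.clear();
    for (const [key, value] of live) this.put(key, value);
  }
}

const store = new TinyLogStore();
store.put("a", "one");
store.put("a", "two"); // the old value stays in the log until merge
console.log(store.get("a"), store.log.length);
store.merge();
console.log(store.get("a"), store.log.length);
```

The index holds every key in memory, which is the cost behind "keys are cached in memory with disk offsets": memory use scales with the number of keys, not with the size of the values.
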
    • Speak my language?
      - HTTP: http://wiki.basho.com/display/RIAK/REST+API
      - Protocol Buffers: http://wiki.basho.com/display/RIAK/PBC+API
      - Native Erlang: http://wiki.basho.com/display/RIAK/Erlang+Client+PBC
      - http://www.zazzle.com/speak_to_me_in_tagalog_tshirt-235376204895796392
    • Ok sounds good. How do I get it?
          > git|hg clone http://bitbucket.org/basho/riak
          > cd riak
          > make all && make rel
      - OR if you’re on a mac:
          > brew install riak
    • Ok sounds good. How do I get it?
          > git|hg clone http://bitbucket.org/basho/riak_search
          > cd riak_search
          > make all && make rel
      - OR if you’re on a mac:
          > brew install riak-search
    • What does that get me?
      - Fully functional
      - Self contained (<3)
      - Default configuration: 64 vnodes, “riak” cookie, N = 3
    • Work... like so.
      - Config files: http://wiki.basho.com/display/RIAK/Configuration+Files
      - app.config
          - ring_creation_size
      - vm.args
          - -name, -settings
    • Fire it up
          > bin/riak
          Usage: riak {start|stop|restart|reboot|ping|console|attach}
          > bin/riak start
    • Do Stuff!
      - GET:
          > curl -v http://127.0.0.1:8098/ping
          > curl -v http://127.0.0.1:8098/stats
          > curl -v http://127.0.0.1:8098/riak/myBucket
          > curl -v http://127.0.0.1:8098/riak/myBucket/myKey
      - PUT:
          > curl -v -X PUT -H "Content-Type: application/json" -d '{"backend": "ets"}' http://127.0.0.1:8098/riak/myBucket
          > curl -v -X PUT -d 'test key' http://127.0.0.1:8098/riak/myBucket/myKey
      - POST:
          > curl -v -X POST -d 'autogen key' http://127.0.0.1:8098/riak/myBucket
    • Links
      - Lightweight Graphing
      - Practical limitations re. number of links per object
      - Unidirectional object linking
      - relationship modeling (one to one, one to many)
      - Returns “Content-Type: multipart/mixed;”
          - Library needs to be multipart aware
          - nodejs, formidable
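
Links travel over HTTP in a Link header on the object. The sketch below pulls them out, assuming each entry has the shape `</riak/bucket/key>; riaktag="tag"` as in the sample string; the `parseLinks` name and the sample header are illustrative, and entries without a riaktag (such as the parent bucket link) are skipped.

```javascript
// Extract { bucket, key, tag } triples from a Link header value.
// Assumption: entries look like </riak/bucket/key>; riaktag="tag".
function parseLinks(header) {
  const links = [];
  const pattern = /<\/riak\/([^/>]+)\/([^>]+)>;\s*riaktag="([^"]*)"/g;
  let match;
  while ((match = pattern.exec(header)) !== null) {
    links.push({
      bucket: decodeURIComponent(match[1]),
      key: decodeURIComponent(match[2]),
      tag: match[3],
    });
  }
  return links;
}

const sample =
  '</riak/people/dave>; riaktag="friend", ' +
  '</riak/people/erin>; riaktag="colleague", ' +
  '</riak/people>; rel="up"';
console.log(parseLinks(sample));
```

Since links are unidirectional, a two-way relationship needs a link stored on each of the two objects.
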
    • Link Walking
      - First level depth:
          > curl http://localhost:8098/riak/myBucket/myKey/_,_,_
      - Via Map/Reduce:
          > curl -X POST -H "content-type:application/json" http://localhost:8098/mapred --data @-
          {"inputs":[["myBucket","myKey"]],"query":[{"link":{}},{"map":{"language":"javascript","source":"function(v){ return [v]; }"}}]}
          ^D
      - N level depth:
          > curl http://localhost:8098/riak/myBucket/myKey/_,_,_/_,_,_
      - More info:
          - http://blog.basho.com/2010/02/24/link-walking-by-example/
          - http://wiki.basho.com/display/RIAK/Links
          - http://wiki.basho.com/display/RIAK/REST+API#RESTAPI-Linkwalking
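
Each `_,_,_` segment in a link-walk URL is one step, read here as bucket, tag, keep, with `_` as the wildcard. A small builder makes that structure explicit. The helper names are hypothetical and nothing below contacts a server; the map/reduce body mirrors the JSON shown on the slide.

```javascript
// Build a link-walk URL. Each step is { bucket, tag, keep };
// any field left out becomes the "_" wildcard.
function linkWalkUrl(base, bucket, key, steps) {
  const phases = steps.map(({ bucket: b = "_", tag = "_", keep = "_" }) =>
    [b, tag, keep].map((part) => encodeURIComponent(part)).join(",")
  );
  return [base, "riak", encodeURIComponent(bucket), encodeURIComponent(key), ...phases].join("/");
}

// Build the equivalent map/reduce request body: a link phase
// followed by an identity map phase.
function linkWalkQuery(bucket, key) {
  return {
    inputs: [[bucket, key]],
    query: [
      { link: {} },
      { map: { language: "javascript", source: "function(v){ return [v]; }" } },
    ],
  };
}

console.log(linkWalkUrl("http://localhost:8098", "myBucket", "myKey", [{}, {}]));
console.log(linkWalkUrl("http://localhost:8098", "myBucket", "myKey", [{ tag: "friend" }]));
console.log(JSON.stringify(linkWalkQuery("myBucket", "myKey")));
```
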
    • Map/Reduce
      - Functions written in either Erlang or JavaScript
      - Map is distributed to where the data lives
      - Reduce is run on the node coordinating the M/R
      - Erlang > JavaScript
      - Tweak JavaScript settings in app.config
    • M/R in Riak
      - An input to start from
          - bucket
          - list of keys / keyfilter
          - ★ keys > bucket
      - possible link phase
      - one or more map phases
      - (many) possible reduce phase(s)
      - Example map function:
          function(v, keydata, args) {
            if (v.values) {
              var ret = [], o = {};
              o = Riak.mapValuesJson(v)[0];
              o.lastModifiedParsed = Date.parse(v["values"][0]["metadata"]["X-Riak-Last-Modified"]);
              o.key = v["key"];
              ret.push(o);
              return ret;
            } else {
              return [];
            }
          };
      - Map = SQL Select/Where clause
      - Reduce = SQL Aggregates (SUM, COUNT, GROUP BY)
    • Pre/Post Commit Hooks
      - Pre Commit (JavaScript or Erlang)
          - Validation
          - Modify data
          - Kill writes
      - Post Commit (Erlang)
          - Indexing
          - Messaging
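
A validating pre-commit hook might look roughly like the sketch below. It rests on assumptions that should be checked against the Riak documentation for the version in use: that a JavaScript hook receives the object with its stored bytes at `object.values[0].data`, and that it signals rejection by returning `{ fail: reason }` instead of the object. The `requireTitle` name and the `title` field are made up for the example.

```javascript
// Reject ("kill") writes whose value is not JSON with a non-empty title.
// Assumption: returning the object accepts the write and returning
// { fail: reason } rejects it.
function requireTitle(object) {
  try {
    const doc = JSON.parse(object.values[0].data);
    if (typeof doc.title !== "string" || doc.title.length === 0) {
      return { fail: "title is required" };
    }
    return object; // accept the write unchanged
  } catch (e) {
    return { fail: "value is not valid JSON" };
  }
}

const good = { values: [{ data: '{"title":"hello"}' }] };
const bad = { values: [{ data: "not json" }] };
console.log(requireTitle(good) === good);
console.log(requireTitle(bad));
```
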
    • Chief complaints
      - No index
      - No native sort
      - No increment
      - No native data structures
    • Riak Search
      - Betalicious
      - Superset of Riak
      - Full text search
      - http://wiki.basho.com/display/RIAK/Riak+Search
      - http://www.slideshare.net/rklophaus/riak-search-erlang-factory-london-2010
      - http://www.seowebworx.co.uk/
    • Riak Search... more
      - uses a modified bitcask backend called merge_index
      - enabled on a per bucket basis
      - access via http and command line
    • Riak-JS
      - NodeJS Riak module
      - Written in CoffeeScript
      - HTTP and Protobuf
      - Customizable via “meta” options
      - http://riakjs.org
    • Code demo
      - nodejs
      - riak-js
      - redis
      - simple post site
      - tags
      - json data passing
    • JavaScript Map
          var map = function(v, keydata, args) {
            if (v.values) {
              var ret = [], o = {};
              o = Riak.mapValuesJson(v)[0];
              // put the key in the returned data object
              o.key = v["key"];
              // parse the HTTP date into a number so it can be sorted numerically
              o.lastModified = Date.parse(v["values"][0]["metadata"]["X-Riak-Last-Modified"]);
              ret.push(o);
              return ret;
            } else {
              return [];
            }
          };
    • JavaScript Reduce
          var sortInt = function (data, args) {
            // args.field: numeric property to sort by; args.order: "desc" for descending
            var sortBy = (typeof args === "undefined" || args === null) ? undefined : args.field;
            var desc = ((typeof args === "undefined" || args === null) ? undefined : args.order) === "desc";
            data.sort(function (a, b) {
              if (desc) {
                // swap the operands to reverse the sort order
                var _ref = [b, a];
                a = _ref[0];
                b = _ref[1];
              }
              return a[sortBy] - b[sortBy];
            });
            return data;
          };
    • Putting it all together
          riak
            .add("bucket")
            // map function
            .map(map)
            // reduce function
            .reduce(sortInt, { field: "lastModified", order: "desc" })
            .run(function (err, response) {
              // send out an error if there is one, and stop
              if (err) return res.simpleJSON(400, { errortxt: "mapreduce gone bad :(" });
              // otherwise send the data back...
              res.simpleJSON(200, { response });
            });
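
The map and reduce phases can be tried without a running cluster by feeding them hand-built objects. This is a local approximation only: the `Riak` object below is a stub of the helper Riak provides to JavaScript phases, `fakeObject` is a hypothetical fixture builder, and the functions are condensed versions of the ones in the slides, with the timestamp parsed to a number so that it sorts numerically.

```javascript
// Stub of Riak.mapValuesJson: JSON-parse each stored value.
const Riak = {
  mapValuesJson: (v) => v.values.map((value) => JSON.parse(value.data)),
};

// Map phase: one output object per stored object, with key and timestamp.
function map(v) {
  if (!v.values) return [];
  const o = Riak.mapValuesJson(v)[0];
  o.key = v.key;
  o.lastModified = Date.parse(v.values[0].metadata["X-Riak-Last-Modified"]);
  return [o];
}

// Reduce phase: sort by a numeric field, ascending unless order is "desc".
function sortInt(data, args) {
  const sortBy = args && args.field;
  const desc = Boolean(args) && args.order === "desc";
  return data.sort((a, b) => (desc ? b[sortBy] - a[sortBy] : a[sortBy] - b[sortBy]));
}

// Hypothetical helper that builds an object shaped like a map-phase input.
function fakeObject(key, doc, lastModified) {
  return {
    bucket: "bucket",
    key,
    values: [{ metadata: { "X-Riak-Last-Modified": lastModified }, data: JSON.stringify(doc) }],
  };
}

const inputs = [
  fakeObject("older", { title: "first post" }, "Tue, 01 Mar 2011 10:00:00 GMT"),
  fakeObject("newer", { title: "second post" }, "Wed, 02 Mar 2011 10:00:00 GMT"),
];

// Run map over every input, then hand the combined list to reduce.
const mapped = inputs.flatMap((v) => map(v));
const sorted = sortInt(mapped, { field: "lastModified", order: "desc" });
console.log(sorted.map((o) => o.key));
```

In a real cluster the map calls run on the nodes that hold the data and only the reduce runs on the coordinating node, so the reduce input arrives in batches rather than as one list.
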
    • Hybrid architectures are the future!
      - Use tools like Redis to augment shortcomings!
    • 1,456,023 Or “A Lot”
      - At scale, precision does not matter in practice.
      - Google
      - Twitter
      - http://photography.nationalgeographic.com/photography/enlarge/okavango-cape-buffalo_pod_image.html
    • Google
      - Look Ma! No exact counts!
    • Twitter
      - No Totals! No Pagination!
    • Questions?
      - NYC Cloud Computing Group, March 2011
      - Alexander Sicular, @siculars