Riak at The NYC Cloud Computing Meetup Group

An in-depth look at the NoSQL product Riak.

Published in: Technology

  1. A Walk Down NOSQL Lane in the Cloud, Part 2: Riak
     NYC Cloud Computing Group, March 2011
     Alexander Sicular, @siculars
  2. Who is this blowhard?
     - Columbia University pays my mortgage
     - For the better part of a decade in Medical Informatics
     - Am not shilling for any of these companies
     - Am not a computer scientist
     - Am a computer science enthusiast, particularly in the area of Informatics
  3. Riak, eh?
     - Dynamo inspired
     - Homogeneous
     - Single key-space
     - Distributed
     - Replicated
     - Predictable scalability
  4. Origins
     Show me your friends...
     - Amazon's Dynamo: http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
     - Akamai: http://www.basho.com/bios.html
     - Paramount Home Video
  5. CAP Theorem
     http://en.wikipedia.org/wiki/CAP_theorem
     - Consistency
     - Availability
     - Partition tolerance
     Pick two?
     http://guide.couchdb.org/draft/consistency.html
     Riak says: pick two at a time.
  6. Homogeneous
     - Every node is the same
     - Any node can service any request
     - Nodes gossip on their own port
  7. One Ring to Rule Them
     - Single 160 bit key space
     - Huh?
     - No Sharding!
  8. Distributed (!= replicated)
     - riak is not sharded
     - vnodes = units of distribution
     - vnodes != physical nodes (pnodes)
     - vnodes map to pnodes
     - data is distributed at the vnode level
     ★ Considerations:
     - must plan maximum ring size
     - think about number of vnodes per pnode
     - generally no less than 10 vnodes per pnode
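
The ring and vnode ideas on slides 7 and 8 can be sketched in a few lines of JavaScript. This is a simplified illustration, not Riak's implementation: hashing the string `bucket/key` stands in for Riak's real key encoding, the round-robin `ownerOf` assignment is hypothetical, and the function names are invented for this example. It assumes the ring size is a power of two, as with the default of 64.

```javascript
const crypto = require("crypto");

const RING_TOP = 2n ** 160n; // size of the SHA-1 key space

// Hash a bucket/key pair to a position on the ring (simplified encoding).
function ringPosition(bucket, key) {
  const hex = crypto.createHash("sha1").update(`${bucket}/${key}`).digest("hex");
  return BigInt("0x" + hex);
}

// Index of the equal-sized partition (vnode) that covers a ring position.
// Assumes ringSize is a power of two so the partitions tile the ring exactly.
function partitionFor(position, ringSize) {
  const partitionWidth = RING_TOP / BigInt(ringSize);
  return Number(position / partitionWidth);
}

// The n consecutive partitions, wrapping around the ring, that hold replicas.
function preferenceList(bucket, key, ringSize, n) {
  const first = partitionFor(ringPosition(bucket, key), ringSize);
  return Array.from({ length: n }, (_, i) => (first + i) % ringSize);
}

// Hypothetical round-robin assignment of vnodes to physical nodes.
function ownerOf(partition, pnodes) {
  return pnodes[partition % pnodes.length];
}

const replicas = preferenceList("myBucket", "myKey", 64, 3);
console.log(replicas, replicas.map((p) => ownerOf(p, ["a", "b", "c", "d"])));
```

Because keys are placed by hash rather than by range, adding a physical node only changes which pnode owns some vnodes. The key-to-vnode mapping stays fixed, which is why the ring size has to be planned up front.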
  9. Conflict Resolution
     - Vector Clocks
     - ancestry / divergency maintained
     - automatic or manual resolution
     ★ Considerations:
     - X-Riak-ClientId, X-Riak-Vclock
     - allow_mult
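
A minimal sketch of how vector clocks track ancestry and divergence, assuming a clock is a plain object mapping client ids to counters. Riak's actual vclock is an opaque value passed in the X-Riak-Vclock header, so the function names and representation here are illustrative only.

```javascript
// A vector clock is modelled here as a plain object: { clientId: counter }.

// Record one more update made by the given client.
function increment(clock, clientId) {
  return { ...clock, [clientId]: (clock[clientId] || 0) + 1 };
}

// True when `a` has seen every update that `b` has seen.
function descends(a, b) {
  return Object.keys(b).every((id) => (a[id] || 0) >= b[id]);
}

// Classify the relationship between two versions of an object.
function compare(a, b) {
  const aDescendsB = descends(a, b);
  const bDescendsA = descends(b, a);
  if (aDescendsB && bDescendsA) return "equal";
  if (aDescendsB) return "a is newer";
  if (bDescendsA) return "b is newer";
  return "siblings"; // divergent history: needs resolution
}

// Merge two clocks by taking the highest counter seen for each client.
function merge(a, b) {
  const merged = { ...a };
  for (const [id, count] of Object.entries(b)) {
    merged[id] = Math.max(merged[id] || 0, count);
  }
  return merged;
}

const base = increment({}, "alice");
const left = increment(base, "alice");
const right = increment(base, "bob");
console.log(compare(left, base));  // "a is newer"
console.log(compare(left, right)); // "siblings"
```

In this model, "automatic resolution" corresponds to picking a winner when one clock descends from the other, and "manual resolution" corresponds to the siblings case, where the application has to reconcile the values itself.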
  10. Replicated (!= distributed)
      - configurable replication values ("N")
      - configurable consistency and availability values at read and write time
        - read
        - write
        - durable write
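
The trade-off between the replication value N and the per-request read and write values can be made concrete with the usual Dynamo-style quorum arithmetic. The sketch below assumes the textbook model, in which a read is guaranteed to overlap the latest successful write when R + W > N. The function and field names are made up for illustration, and it ignores durable writes and failure handoff.

```javascript
// n: replicas stored, r: replies needed for a read, w: acks needed for a write.
function describeQuorum({ n, r, w }) {
  if (r > n || w > n) throw new RangeError("r and w cannot exceed n");
  return {
    // Read and write sets must overlap for a read to see the latest write.
    readsSeeLatestWrite: r + w > n,
    // How many replicas can be unreachable while requests still succeed.
    readFailuresTolerated: n - r,
    writeFailuresTolerated: n - w,
  };
}

console.log(describeQuorum({ n: 3, r: 2, w: 2 })); // overlapping quorums
console.log(describeQuorum({ n: 3, r: 1, w: 1 })); // favors availability
```

Lower values trade consistency for availability and latency, which is one way to read "pick two at a time": the choice is made per request rather than once for the whole cluster.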
  11. Predictable Scalability
      - How much performance per node?
      - Scale in both directions
          > bin/riak-admin
          > Usage: riak-admin { join | leave | backup | restore | test | status | reip | js_reload | wait-for-service | ringready | transfers }
  12. Data Agnostic
      - schemaless
      - data objects may be of any type: binary, text (json, xml)
      - use content types
          > curl -v -d 'this is a test' -H "Content-Type: text/plain" http://127.0.0.1:8098/riak/testBucket/testKey
  13. Extra Goodies
      - Erlang: http://www.pragprog.com/titles/jaerlang/programming-erlang
      - Code Architecture
      - basho_bench
      - Multiple backends: bitcask, innodb, mem
  14. Code architecture
      Highly modularized:
      - riak_core
      - riak_kv
      - bitcask
      - erlang_js
      http://bitbucket.org/basho
  15. basho_bench
      - Performance profiling
      - highly customizable
      - pretty pictures
      - key/value store generalized
      https://wiki.basho.com/display/RIAK/Benchmarking+with+Basho+Bench
      http://pics.livejournal.com/demmonoid/pic/00001sa7
  16. Bitcask
      - Riak's default disk backend
      - Write Only Log
      - Heavy updates will grow your footprint
        - Look into compaction/merging settings
      - Keys are cached in memory with disk offsets
      https://spreadsheets.google.com/ccc?key=0Ak4OBkABJPsxdEowYXc2akxnYU9xNkJmbmZscnhaTFE&hl=en&authkey=CMHw8tYO
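
The Bitcask points above (an append-only log, an in-memory key directory of disk offsets, and a footprint that grows until a merge) can be illustrated with a toy model. This is a sketch of the general technique, not Bitcask's file format: the "file" is an in-memory list of buffers, and the class and method names are invented for this example.

```javascript
class AppendOnlyStore {
  constructor() {
    this.log = [];            // stands in for the data file on disk
    this.size = 0;            // total bytes appended so far
    this.keydir = new Map();  // key -> { offset, length } of the latest value
  }

  put(key, value) {
    const bytes = Buffer.from(value, "utf8");
    this.log.push(bytes);     // never overwrite, always append
    this.keydir.set(key, { offset: this.size, length: bytes.length });
    this.size += bytes.length;
  }

  get(key) {
    const entry = this.keydir.get(key);
    if (!entry) return undefined;
    const file = Buffer.concat(this.log);
    return file.subarray(entry.offset, entry.offset + entry.length).toString("utf8");
  }

  // Rewrite the log keeping only the live value for each key.
  merge() {
    const live = [...this.keydir.keys()].map((k) => [k, this.get(k)]);
    this.log = [];
    this.size = 0;
    this.keydir.clear();
    for (const [k, v] of live) this.put(k, v);
  }
}

const store = new AppendOnlyStore();
store.put("k", "v1");
store.put("k", "version two");
console.log(store.get("k"), store.size); // "version two" 13 (old value still in the log)
store.merge();
console.log(store.get("k"), store.size); // "version two" 11
```

Every key lives in the key directory, so in this model memory use grows with the number of keys while disk use grows with the number of writes until a merge runs.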
  17. Speak my language?
      - HTTP: http://wiki.basho.com/display/RIAK/REST+API
      - Protocol Buffers: http://wiki.basho.com/display/RIAK/PBC+API
      - Native Erlang: http://wiki.basho.com/display/RIAK/Erlang+Client+PBC
      http://www.zazzle.com/speak_to_me_in_tagalog_tshirt-235376204895796392
  18. Ok sounds good. How do I get it?
          > git|hg clone http://bitbucket.org/basho/riak
          > cd riak
          > make all && make rel
      OR if you're on a mac:
          > brew install riak
  19. Ok sounds good. How do I get it?
          > git|hg clone http://bitbucket.org/basho/riak_search
          > cd riak_search
          > make all && make rel
      OR if you're on a mac:
          > brew install riak-search
  20. What does that get me?
      - Fully functional
      - Self contained (<3)
      - Default configuration
        - 64 vnodes, "riak" cookie, N = 3
  21. Work... like so.
      Config files: http://wiki.basho.com/display/RIAK/Configuration+Files
      - app.config: ring_creation_size
      - vm.args: -name, -settings
  22. Fire it up
          > bin/riak
          > Usage: riak {start|stop|restart|reboot|ping|console|attach}
          > bin/riak start
  23. Do Stuff!
      GET:
          > curl -v http://127.0.0.1:8098/ping
          > curl -v http://127.0.0.1:8098/stats
          > curl -v http://127.0.0.1:8098/riak/myBucket
          > curl -v http://127.0.0.1:8098/riak/myBucket/myKey
      PUT:
          > curl -v -X PUT -H "Content-Type: application/json" -d '{"backend": "ets"}' http://127.0.0.1:8098/riak/myBucket
          > curl -v -X PUT -d 'test key' http://127.0.0.1:8098/riak/myBucket/myKey
          > curl -v -X POST -d 'autogen key' http://127.0.0.1:8098/riak/myBucket
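
The same store requests can be issued from JavaScript. The sketch below separates describing a request from sending it, so the URL and method logic can be checked without a running node. `storeRequest` and `send` are hypothetical helpers, not part of any Riak client, and `send` assumes a runtime with a global `fetch` (for example Node 18 or later).

```javascript
const BASE = "http://127.0.0.1:8098/riak"; // same host and port as the curl examples

// Describe a store request; with no key, POST lets Riak generate one.
function storeRequest(bucket, key, body, contentType = "text/plain") {
  const path = [bucket, key]
    .filter((part) => part !== undefined)
    .map(encodeURIComponent)
    .join("/");
  return {
    url: `${BASE}/${path}`,
    method: key === undefined ? "POST" : "PUT",
    headers: { "Content-Type": contentType },
    body,
  };
}

// Send a described request and return the parts of the response we care about.
async function send({ url, ...options }) {
  const res = await fetch(url, options);
  return {
    status: res.status,
    location: res.headers.get("location"), // where a POSTed object ended up, if reported
    body: await res.text(),
  };
}

console.log(storeRequest("myBucket", "myKey", "test key"));
console.log(storeRequest("myBucket", undefined, "autogen key"));
```

With a node running locally, `send(storeRequest("myBucket", "myKey", "test key"))` would correspond to the second curl command under PUT.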
  24. Links
      - Lightweight Graphing
      - Practical limitations re. number of links per object
      - Unidirectional object linking
      - relationship modeling (one to one, one to many)
      - Returns "Content-Type: multipart/mixed;"
        - Library needs to be multipart aware
        - nodejs, formidable
  25. Link Walking
      First level depth
          > curl http://localhost:8098/riak/myBucket/myKey/_,_,_
      Via Map/Reduce
          > curl -X POST -H "content-type:application/json" http://localhost:8098/mapred --data @-
          {"inputs":[["myBucket","myKey"]],"query":[{"link":{}},{"map":{"language":"javascript","source":"function(v){ return [v]; }"}}]}
          ^D
      N level depth
          > curl http://localhost:8098/riak/myBucket/myKey/_,_,_/_,_,_
      More Info:
      http://blog.basho.com/2010/02/24/link-walking-by-example/
      http://wiki.basho.com/display/RIAK/Links
      http://wiki.basho.com/display/RIAK/REST+API#RESTAPI-Linkwalking
  26. Map/Reduce
      - Functions written in either Erlang or JavaScript
      - Map is distributed to where the data lives
      - Reduce is run on the node coordinating the M/R
      - Erlang > JavaScript
      - Tweak JavaScript settings in app.config
  27. M/R in Riak
      - An input to start from
        - bucket
        - list of keys / keyfilter
        - ★ keys > bucket
      - possible link phase
      - one or more map phases
      - (many) possible reduce phase(s)
      Example map function:
          function(v, keydata, args) {
            if (v.values) {
              var ret = [], o = {};
              o = Riak.mapValuesJson(v)[0];
              o.lastModifiedParsed = Date.parse(v["values"][0]["metadata"]["X-Riak-Last-Modified"]);
              o.key = v["key"];
              ret.push(o);
              return ret;
            } else {
              return [];
            }
          };
      Map = SQL Select/Where clause
      Reduce = SQL Aggregates (SUM, COUNT, GROUPBY)
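
The phase structure on this slide (inputs, an optional link phase, map phases, then reduce phases) maps directly onto the JSON body posted to /mapred on slide 25. A minimal sketch of a builder for that body follows. `buildJob` is a hypothetical helper, and the per-phase `arg` field is included on the assumption that the job format accepts a static argument for a phase.

```javascript
// Build the JSON body for POST /mapred from JavaScript phase functions.
function buildJob({ inputs, followLinks = false, maps = [], reduces = [] }) {
  // Turn one phase description into the { map: {...} } / { reduce: {...} } shape.
  const phase = (type, { fn, arg }) => {
    const spec = { language: "javascript", source: fn.toString() };
    if (arg !== undefined) spec.arg = arg; // assumed optional static argument
    return { [type]: spec };
  };
  return {
    inputs, // a bucket name, or a list of [bucket, key] pairs
    query: [
      ...(followLinks ? [{ link: {} }] : []),
      ...maps.map((m) => phase("map", m)),
      ...reduces.map((r) => phase("reduce", r)),
    ],
  };
}

const job = buildJob({
  inputs: [["myBucket", "myKey"]],
  followLinks: true,
  maps: [{ fn: function (v) { return [v]; } }],
});
console.log(JSON.stringify(job, null, 2));
```

Phase functions are sent as source text, so they cannot close over local variables. Anything they need has to arrive through the function arguments, and they should be written as plain `function` expressions that the server-side JavaScript engine can parse.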
  28. Pre/Post Commit Hooks
      Pre Commit
      - JavaScript or Erlang
      - Validation
      - Modify data
      - Kill writes
      Post Commit
      - Erlang
      - Indexing
      - Messaging
  29. Chief complaints
      - No index
      - No native sort
      - No increment
      - No native data structures
  30. Riak Search
      - Betalicious
      - Superset of Riak
      - Full text search
      http://wiki.basho.com/display/RIAK/Riak+Search
      http://www.slideshare.net/rklophaus/riak-search-erlang-factory-london-2010
      http://www.seowebworx.co.uk/
  31. Riak Search... more
      - uses a modified bitcask backend called merge_index
      - enabled on a per bucket basis
      - access via http and command line
  32. Riak-JS
      - NodeJS Riak module
      - Written in Coffeescript
      - HTTP and Protobuf
      - Customizable via "meta" options
      http://riakjs.org
  33. Code demo
      - nodejs
      - riak-js
      - redis
      - simple post site
      - tags
      - json data passing
  34. Javascript Map
          var map = function(v, keydata, args) {
            if (v.values) {
              var ret = [], o = {};
              o = Riak.mapValuesJson(v)[0];
              o.key = v["key"]; // put the key in the returned data object
              o.lastModified = v["values"][0]["metadata"]["X-Riak-Last-Modified"];
              ret.push(o);
              return ret;
            } else {
              return [];
            }
          };
  35. Javascript Reduce
          var sortInt = function(data, args) {
            // args.field names the property to sort on; args.order may be "desc"
            var sortBy = (typeof args === "undefined" || args === null) ? undefined : args.field;
            var desc = ((typeof args === "undefined" || args === null) ? undefined : args.order) === "desc";
            data.sort(function(a, b) {
              if (desc) {
                // swap the operands to reverse the sort order
                var _ref = [b, a];
                a = _ref[0];
                b = _ref[1];
              }
              // numeric comparison: the sorted field must hold numbers
              return a[sortBy] - b[sortBy];
            });
            return data;
          };
  36. Putting it all together
          riak
            .add("bucket")
            // map function
            .map(map)
            // reduce function
            .reduce(sortInt, { field: "lastModified", order: "desc" })
            .run(function(err, response) {
              if (err) {
                // send out an error if there is one
                res.simpleJSON(400, { errortxt: "mapreduce gone bad :(" });
              } else {
                // otherwise send the data back...
                res.simpleJSON(200, { response: response });
              }
            });
  37. Hybrid architectures are the future!
      Use tools like Redis to augment shortcomings!
  38. 1,456,023 Or "A Lot"
      At scale, precision does not matter in practice.
      - Google
      - Twitter
      http://photography.nationalgeographic.com/photography/enlarge/okavango-cape-buffalo_pod_image.html
  39. Google
      Look Ma! No exact counts!
  40. Twitter
      No Totals! No Pagination!
  41. Questions?
      NYC Cloud Computing Group, March 2011
      Alexander Sicular, @siculars
