Riak at The NYC Cloud Computing Meetup Group
An in-depth look at the NoSQL product Riak.

Published in: Technology

Transcript

  • 1. A Walk Down NOSQL Lane in the Cloud, Part 2: Riak. NYC Cloud Computing Group, March 2011. Alexander Sicular, @siculars
  • 2. Who is this blowhard?
      - Columbia University pays my mortgage
      - For the better part of a decade in Medical Informatics
      - Am not shilling for any of these companies
      - Am not a computer scientist
      - Am a computer science enthusiast, particularly in the area of Informatics
  • 3. Riak, eh?
      - Dynamo inspired
      - Homogeneous
      - Single key-space
      - Distributed
      - Replicated
      - Predictable scalability
  • 4. Origins
      - Show me your friends...
      - Amazon’s Dynamo: http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
      - Akamai: http://www.basho.com/bios.html
      - Paramount Home Video
  • 5. CAP Theorem
      - http://en.wikipedia.org/wiki/CAP_theorem
      - Consistency, Availability, Partition tolerance
      - Pick two? http://guide.couchdb.org/draft/consistency.html
      - Riak says: pick two at a time.
  • 6. Homogeneous
      - Every node is the same
      - Any node can service any request
      - Nodes gossip on their own port
  • 7. One Ring to Rule Them
      - Single 160 bit key space
      - Huh?
      - No Sharding!
  • 8. Distributed (!= replicated)
      - riak is not sharded
      - vnodes = units of distribution
      - vnodes != physical nodes (pnodes)
      - vnodes map to pnodes
      - data is distributed at the vnode level
      ★ Considerations:
      - must plan maximum ring size
      - think about number of vnodes per pnode
      - generally no less than 10 vnodes per pnode
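A minimal sketch of how a single 160-bit key space might be cut into equal partitions (vnodes) that are then mapped onto physical nodes. It rests on several assumptions: SHA-1 over a plain "bucket/key" string (Riak hashes an Erlang-encoded bucket/key pair, so the indices here will not match a real cluster), a ring of 64 partitions, round-robin assignment instead of Riak's actual claim logic, and hypothetical node names. `ringPosition`, `partitionFor`, and `claimRoundRobin` are illustrative names, not Riak APIs.

```javascript
const crypto = require("crypto");

const RING_BITS = 160n; // SHA-1 output size in bits
const RING_SIZE = 64;   // number of vnodes; assumed, and must be a power of two

// Hash a bucket/key pair onto the 160-bit ring (simplified input encoding).
function ringPosition(bucket, key) {
  const hex = crypto.createHash("sha1").update(`${bucket}/${key}`).digest("hex");
  return BigInt("0x" + hex);
}

// Each partition owns an equal, contiguous slice of the ring.
function partitionFor(position, ringSize = RING_SIZE) {
  const sliceWidth = (1n << RING_BITS) / BigInt(ringSize);
  return Number(position / sliceWidth);
}

// Assign vnodes to physical nodes round-robin (a stand-in for the real claim algorithm).
function claimRoundRobin(pnodes, ringSize = RING_SIZE) {
  return Array.from({ length: ringSize }, (_, i) => pnodes[i % pnodes.length]);
}

const owners = claimRoundRobin(["node1", "node2", "node3"]); // hypothetical pnodes
const partition = partitionFor(ringPosition("myBucket", "myKey"));
console.log(`myBucket/myKey -> vnode ${partition} on ${owners[partition]}`);
```

Because keys map to vnodes and only vnodes map to pnodes, adding a physical node changes the `owners` table but not the partition a key hashes to, which is why the ring size has to be planned up front.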
  • 9. Conflict Resolution
      - Vector Clocks
      - ancestry / divergency maintained
      - automatic or manual resolution
      ★ Considerations:
      - X-Riak-ClientId, X-Riak-Vclock
      - allow_mult
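The ancestry-versus-divergence idea behind vector clocks can be sketched as follows. This is a conceptual model only: it assumes a clock is a plain map of client id to update counter, whereas the value Riak returns in X-Riak-Vclock is an opaque encoded token. The function names and client ids are hypothetical.

```javascript
// Conceptual model: a clock is a map of client id -> number of updates seen.
function increment(clock, clientId) {
  return { ...clock, [clientId]: (clock[clientId] || 0) + 1 };
}

// a descends from b when a has seen every update that b has seen.
function descends(a, b) {
  return Object.keys(b).every((id) => (a[id] || 0) >= b[id]);
}

function compare(a, b) {
  const aOverB = descends(a, b);
  const bOverA = descends(b, a);
  if (aOverB && bOverA) return "equal";
  if (aOverB) return "a is newer";
  if (bOverA) return "b is newer";
  return "siblings"; // divergent histories: neither write can be discarded safely
}

const base = increment({}, "clientA");
const fromA = increment(base, "clientA"); // clientA updates again
const fromB = increment(base, "clientB"); // clientB updates the same base concurrently

console.log(compare(fromA, base));  // a is newer
console.log(compare(fromA, fromB)); // siblings
```

The "siblings" case is where the automatic-or-manual choice on the slide comes in: with allow_mult enabled, both versions are kept for the application to reconcile.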
  • 10. Replicated (!= distributed)
      - configurable replication values (“N”)
      - configurable consistency and availability values at read and write time
          - read
          - write
          - durable write
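The trade-off between the replication value and the read/write values can be sketched with a little arithmetic. The names `n`, `r`, `w`, and `describeQuorum` are assumptions for illustration, and the overlap test is a rule of thumb for strict quorums, not a statement of what a Riak cluster guarantees in every failure scenario.

```javascript
// n: replicas per object; r / w: replicas that must answer a read / acknowledge a write.
function describeQuorum({ n, r, w }) {
  if (r < 1 || w < 1 || r > n || w > n) {
    throw new RangeError("r and w must be between 1 and n");
  }
  return {
    // If r + w > n, any read set shares at least one replica with any write set.
    quorumsOverlap: r + w > n,
    // Replicas that may be unreachable while the request can still be satisfied.
    readFailuresTolerated: n - r,
    writeFailuresTolerated: n - w,
  };
}

console.log(describeQuorum({ n: 3, r: 2, w: 2 })); // leans toward consistency
console.log(describeQuorum({ n: 3, r: 1, w: 1 })); // leans toward availability
```

Lower read and write values tolerate more unreachable replicas but give up the overlap, which is the per-request knob the slide describes.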
  • 11. Predictable Scalability
      - How much performance per node?
      - Scale in both directions
      > bin/riak-admin
      > Usage: riak-admin { join | leave | backup | restore | test | status | reip | js_reload | wait-for-service | ringready | transfers }
  • 12. Data Agnostic
      - schemaless
      - data objects may be of any type
      - binary, text (json, xml)
      - use content types
      > curl -v -d 'this is a test' -H "Content-Type: text/plain" http://127.0.0.1:8098/riak/testBucket/testKey
  • 13. Extra Goodies
      - Erlang: http://www.pragprog.com/titles/jaerlang/programming-erlang
      - Code Architecture
      - basho_bench
      - Multiple backends: bitcask, innodb, mem
  • 14. Code architecture
      - Highly modularized: riak_core, riak_kv, bitcask, erlang_js
      - http://bitbucket.org/basho
  • 15. basho_bench
      - Performance profiling
      - highly customizable
      - pretty pictures
      - key/value store generalized
      - https://wiki.basho.com/display/RIAK/Benchmarking+with+Basho+Bench
      - http://pics.livejournal.com/demmonoid/pic/00001sa7
  • 16. Bitcask
      - Riak’s default disk backend
      - Write Only Log
      - Heavy updates will grow your footprint
          - Look into compaction/merging settings
      - Keys are cached in memory with disk offsets
      - https://spreadsheets.google.com/ccc?key=0Ak4OBkABJPsxdEowYXc2akxnYU9xNkJmbmZscnhaTFE&hl=en&authkey=CMHw8tYO
  • 17. Speak my language?
      - HTTP: http://wiki.basho.com/display/RIAK/REST+API
      - Protocol Buffers: http://wiki.basho.com/display/RIAK/PBC+API
      - Native Erlang: http://wiki.basho.com/display/RIAK/Erlang+Client+PBC
      - http://www.zazzle.com/speak_to_me_in_tagalog_tshirt-235376204895796392
  • 18. Ok sounds good. How do I get it?
      > git|hg clone http://bitbucket.org/basho/riak
      > cd riak
      > make all && make rel
      OR if you’re on a mac:
      > brew install riak
  • 19. Ok sounds good. How do I get it? (Riak Search)
      > git|hg clone http://bitbucket.org/basho/riak_search
      > cd riak_search
      > make all && make rel
      OR if you’re on a mac:
      > brew install riak-search
  • 20. What does that get me?
      - Fully functional
      - Self contained (<3)
      - Default configuration
          - 64 vnodes, “riak” cookie, N = 3
  • 21. Work... like so.
      - Config files: http://wiki.basho.com/display/RIAK/Configuration+Files
      - app.config
          - ring_creation_size
      - vm.args
          - -name, -settings
  • 22. Fire it up
      > bin/riak
      > Usage: riak {start|stop|restart|reboot|ping|console|attach}
      > bin/riak start
  • 23. Do Stuff!
      GET:
      > curl -v http://127.0.0.1:8098/ping
      > curl -v http://127.0.0.1:8098/stats
      > curl -v http://127.0.0.1:8098/riak/myBucket
      > curl -v http://127.0.0.1:8098/riak/myBucket/myKey
      PUT / POST:
      > curl -v -X PUT -H "Content-Type: application/json" -d '{"backend": "ets"}' http://127.0.0.1:8098/riak/myBucket
      > curl -v -X PUT -d 'test key' http://127.0.0.1:8098/riak/myBucket/myKey
      > curl -v -X POST -d 'autogen key' http://127.0.0.1:8098/riak/myBucket
  • 24. Links
      - Lightweight Graphing
      - Practical limitations re. number of links per object
      - Unidirectional object linking
      - relationship modeling (one to one, one to many)
      - Returns “Content-Type: multipart/mixed;”
          - Library needs to be multipart aware
          - nodejs, formidable
  • 25. Link Walking
      First level depth:
      > curl http://localhost:8098/riak/myBucket/myKey/_,_,_
      Via Map/Reduce:
      > curl -X POST -H "content-type:application/json" http://localhost:8098/mapred --data @-
      {"inputs":[["myBucket","myKey"]],"query":[{"link":{}},{"map":{"language":"javascript","source":"function(v){ return [v]; }"}}]}
      ^D
      N level depth:
      > curl http://localhost:8098/riak/myBucket/myKey/_,_,_/_,_,_
      More Info:
      - http://blog.basho.com/2010/02/24/link-walking-by-example/
      - http://wiki.basho.com/display/RIAK/Links
      - http://wiki.basho.com/display/RIAK/REST+API#RESTAPI-Linkwalking
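A small sketch of how the link-walking URLs above are composed, one `_,_,_` segment per level of depth. It assumes the three comma-separated fields are bucket, link tag, and a keep flag, with `_` as the wildcard, as described in the link-walking documentation linked above. `linkWalkPath` and the "friend" tag are hypothetical; the function only builds a path and sends nothing.

```javascript
// Build a link-walk path; each step becomes one "bucket,tag,keep" segment.
// Omitted fields fall back to the "_" wildcard.
function linkWalkPath(bucket, key, steps) {
  const specs = steps.map(({ bucket: b = "_", tag = "_", keep = "_" }) =>
    [b, tag, keep].map((part) => encodeURIComponent(part)).join(",")
  );
  return ["", "riak", encodeURIComponent(bucket), encodeURIComponent(key), ...specs].join("/");
}

// First level depth, everything wildcarded:
console.log(linkWalkPath("myBucket", "myKey", [{}]));
// -> /riak/myBucket/myKey/_,_,_

// Two levels, following a hypothetical "friend" tag first and keeping that phase's results:
console.log(linkWalkPath("myBucket", "myKey", [{ tag: "friend", keep: 1 }, {}]));
// -> /riak/myBucket/myKey/_,friend,1/_,_,_
```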
  • 26. Map/Reduce
      - Functions written in either Erlang or JavaScript
      - Map is distributed to where the data lives
      - Reduce is run on the node coordinating the M/R
      - Erlang > JavaScript
      - Tweak JavaScript settings in app.config
  • 27. M/R in Riak
      - An input to start from
          - bucket
          - list of keys / keyfilter
          ★ keys > bucket
      - possible link phase
      - one or more map phases
      - (many) possible reduce phase(s)
      function(v, keydata, args) {
        if (v.values) {
          var ret = [], o = {};
          o = Riak.mapValuesJson(v)[0];
          o.lastModifiedParsed = Date.parse(v["values"][0]["metadata"]["X-Riak-Last-Modified"]);
          o.key = v["key"];
          ret.push(o);
          return ret;
        } else {
          return [];
        }
      };
      Map = SQL Select/Where clause
      Reduce = SQL Aggregates (SUM, COUNT, GROUP BY)
  • 28. Pre/Post Commit Hooks
      - Pre Commit: JavaScript or Erlang
          - Validation
          - Modify data
          - Kill writes
      - Post Commit: Erlang
          - Indexing
          - Messaging
  • 29. Chief complaints
      - No index
      - No native sort
      - No increment
      - No native data structures
  • 30. Riak Search
      - Betalicious
      - Superset of Riak
      - Full text search
      - http://wiki.basho.com/display/RIAK/Riak+Search
      - http://www.slideshare.net/rklophaus/riak-search-erlang-factory-london-2010
      - http://www.seowebworx.co.uk/
  • 31. Riak Search... more
      - uses a modified bitcask backend called merge_index
      - enabled on a per bucket basis
      - access via http and command line
  • 32. Riak-JS
      - NodeJS Riak module
      - Written in CoffeeScript
      - HTTP and Protobuf
      - Customizable via “meta” options
      - http://riakjs.org
  • 33. Code demo
      - nodejs
      - riak-js
      - redis
      - simple post site
      - tags
      - json data passing
  • 34. Javascript Map
      var map = function(v, keydata, args) {
        if (v.values) {
          var ret = [], o = {};
          o = Riak.mapValuesJson(v)[0];
          o.key = v["key"]; // put the key in the returned data object
          o.lastModified = v["values"][0]["metadata"]["X-Riak-Last-Modified"];
          ret.push(o);
          return ret;
        } else {
          return [];
        }
      };
  • 35. Javascript Reduce
      var sortInt = function(data, args) {
        var sortBy = (typeof args === "undefined" || args === null) ? undefined : args.field;
        var desc = ((typeof args === "undefined" || args === null) ? undefined : args.order) === "desc";
        data.sort(function(a, b) {
          if (desc) {
            // swap the operands to sort in descending order
            var _ref = [b, a];
            a = _ref[0];
            b = _ref[1];
          }
          return a[sortBy] - b[sortBy];
        });
        return data;
      };
  • 36. Putting it all together
      riak
        .add("bucket")
        // map function
        .map(map)
        // reduce function
        .reduce(sortInt, { field: "lastModified", order: "desc" })
        .run(function(err, response) {
          if (err) {
            // send out an error if there is one
            res.simpleJSON(400, { errortxt: "mapreduce gone bad :(" });
          } else {
            // otherwise send the data back...
            res.simpleJSON(200, { response: response });
          }
        });
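The map-then-sort pipeline from the last three slides can be tried without a running cluster. The sketch below is a local stand-in, not the riak-js API: `Riak.mapValuesJson` is stubbed on the assumption that it JSON-parses each stored value, the stored objects are made-up sample data, and the map step parses the last-modified header into a number (as the map function on slide 27 does) so that the numeric sort has something to subtract.

```javascript
// Stand-in for Riak's built-in helper (assumption: it JSON-parses each stored value).
const Riak = {
  mapValuesJson: (v) => v.values.map((val) => JSON.parse(val.data)),
};

// Map phase: emit the stored JSON plus its key and a numeric last-modified time.
function map(v) {
  if (!v.values) return [];
  const o = Riak.mapValuesJson(v)[0];
  o.key = v.key;
  o.lastModified = Date.parse(v.values[0].metadata["X-Riak-Last-Modified"]);
  return [o];
}

// Reduce phase: numeric sort on args.field, descending when args.order is "desc".
function sortInt(data, args) {
  const field = args && args.field;
  const desc = Boolean(args) && args.order === "desc";
  return data.sort((a, b) => (desc ? b[field] - a[field] : a[field] - b[field]));
}

// Hypothetical stored objects, shaped like the input the map functions above read.
const stored = [
  { key: "post1", values: [{ metadata: { "X-Riak-Last-Modified": "Tue, 01 Mar 2011 10:00:00 GMT" }, data: '{"title":"first"}' }] },
  { key: "post2", values: [{ metadata: { "X-Riak-Last-Modified": "Wed, 02 Mar 2011 10:00:00 GMT" }, data: '{"title":"second"}' }] },
];

const mapped = stored.flatMap((obj) => map(obj));
const result = sortInt(mapped, { field: "lastModified", order: "desc" });
console.log(result.map((o) => o.key)); // [ 'post2', 'post1' ]
```

In a real cluster the map step runs next to the data on each vnode and only the reduce step sees the combined list, so the sort is the part that runs on the coordinating node.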
  • 37. Hybrid architectures are the future! Use tools like Redis to augment shortcomings!
  • 38. 1,456,023 Or “A Lot”
      - At scale, precision does not matter in practice.
      - Google
      - Twitter
      - http://photography.nationalgeographic.com/photography/enlarge/okavango-cape-buffalo_pod_image.html
  • 39. Google: Look Ma! No exact counts!
  • 40. Twitter: No Totals! No Pagination!
  • 41. Questions? NYC Cloud Computing Group, March 2011. Alexander Sicular, @siculars
