Adding Riak to your NoSQL Bag of Tricks

NoSQL-NYC, October 2010

Alexander Sicular
@siculars

Riak, eh?
• Dynamo inspired

• Homogeneous

• Single key-space

• Distributed

• Replicated

• Predictable scalability

• Data agnostic

Origins
Show me your friends...

• Amazon’s Dynamo
http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html

• Akamai
http://www.basho.com/bios.html

Paramount Home Video

CAP Theorem
http://en.wikipedia.org/wiki/CAP_theorem

• Consistency

• Availability

• Partition tolerance

Pick two?
Riak says: pick two at a time.
http://guide.couchdb.org/draft/consistency.html

Homogeneous

• Every node is the same

• Any node can service any request

• Nodes gossip on their own port

One Ring to Rule Them All
Single 160-bit key space

Huh?

No Sharding!

Distributed (!= replicated)
• Riak is not sharded
• vnodes = units of distribution
• vnodes != physical nodes (pnodes)
• vnodes map to pnodes
• data is distributed at the vnode level
★Considerations:
- must plan maximum ring size
- think about number of vnodes per pnode
- generally no less than 10 vnodes per pnode
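
A minimal sketch of how a key might land on a vnode and then on a pnode. It assumes SHA-1 over the bucket and key (which yields the 160-bit key space), a ring of 64 equal partitions, and a simple round-robin vnode-to-pnode assignment. The function names, node names, and the exact hash input are hypothetical; Riak's real claim algorithm differs in detail.

```python
import hashlib

RING_BITS = 160   # size of the key space
RING_SIZE = 64    # number of vnodes (assumed ring size for this sketch)

def key_to_vnode(bucket, key, ring_size=RING_SIZE):
    """Hash bucket/key onto the 160-bit ring and return the owning partition index."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    position = int.from_bytes(digest, "big")        # 0 <= position < 2**160
    partition_width = 2 ** RING_BITS // ring_size   # every vnode owns an equal slice
    return position // partition_width

def vnode_to_pnode(vnode, pnodes):
    """Round-robin assignment of vnodes to physical nodes (an illustrative choice)."""
    return pnodes[vnode % len(pnodes)]

pnodes = ["node_a", "node_b", "node_c", "node_d"]   # hypothetical physical nodes
vnode = key_to_vnode("myBucket", "myKey")
owner = vnode_to_pnode(vnode, pnodes)
print(vnode, owner)
```

With 64 vnodes and 4 pnodes, each pnode owns 16 vnodes. Because the ring size is fixed up front, adding pnodes only lowers the vnodes-per-pnode ratio, which is why the maximum ring size has to be planned.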

Conflict Resolution

• Vector Clocks
• ancestry / divergence maintained
• automatic or manual resolution
★Considerations:
• X-Riak-ClientId
• X-Riak-Vclock
• allow_mult
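
A rough sketch of how vector clocks tell ancestry from divergence. The clock is modeled here as a plain dict of client id to counter, which is an assumption made for illustration; the value Riak returns in X-Riak-Vclock is an opaque encoded string, and the helper names below are hypothetical.

```python
def increment(clock, client_id):
    """Return a copy of the clock with client_id's counter advanced by one."""
    updated = dict(clock)
    updated[client_id] = updated.get(client_id, 0) + 1
    return updated

def descends(a, b):
    """True if clock a has seen every update that clock b has seen."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())

def compare(a, b):
    """Classify two clocks as equal, ordered, or divergent."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a is newer"
    if descends(b, a):
        return "b is newer"
    return "siblings"   # concurrent writes: neither is an ancestor of the other

base = increment({}, "client-1")       # first write
left = increment(base, "client-1")     # the same client updates again
right = increment(base, "client-2")    # another client updates the same ancestor

print(compare(left, base))
print(compare(left, right))
```

The "siblings" case is the one where resolution has to happen: either one value is picked automatically, or (with allow_mult enabled) both values go back to the client to merge.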

Replicated (!= distributed)
• configurable replication values (“N”)
• configurable consistency and availability values at read and write time
- read (R)
- write (W)
- durable write (DW)
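
A small sketch of the trade-off behind the read and write values, using the usual Dynamo-style shorthand N, R and W. The helper name is hypothetical and this is not a Riak API; it only does the arithmetic.

```python
def check_quorum(n, r, w):
    """Summarize what a given N/R/W choice implies (illustrative helper)."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must be between 1 and N")
    return {
        # Any R replicas and any W replicas share at least one member.
        "quorums_overlap": r + w > n,
        # Replicas that may be unreachable while the request can still succeed.
        "read_failures_tolerated": n - r,
        "write_failures_tolerated": n - w,
    }

print(check_quorum(3, 2, 2))   # leans toward consistency
print(check_quorum(3, 1, 1))   # leans toward availability
```

Because the values can be supplied per request, the same cluster can serve a strict read for one object and a fast, loose read for another.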

Predictable Scalability
• How much performance per node?
• Scale in both directions
>bin/riak-admin
>Usage: riak-admin { join | leave | backup | restore | test | status | reip | js_reload | wait-for-service | ringready | transfers }

Data Agnostic
• schemaless
• data objects may be of any type
• binary, text (json, xml)
• use content types
>curl -v -d 'this is a test' -H "Content-Type: text/plain" http://127.0.0.1:8098/riak/testBucket/testKey

Extra Goodies
• Erlang
http://www.pragprog.com/titles/jaerlang/programming-erlang

• Code Architecture

• basho_bench

• Multiple backends

• bitcask

• innodb

Code architecture
• Highly modularized
• riak_core
• riak_kv
• bitcask
• erlang_js
http://bitbucket.org/basho

basho_bench
• Performance profiling
• highly customizable
• pretty pictures
• key/value store generalized
https://wiki.basho.com/display/RIAK/Benchmarking+with+Basho+Bench

http://pics.livejournal.com/demmonoid/pic/00001sa7

Bitcask
• Riak’s default disk backend
• Append-only (write-once) log
• Heavy updates will grow your footprint
- Look into compaction/merging settings
• Keys are cached in memory with disk offsets
https://spreadsheets.google.com/ccc?key=0Ak4OBkABJPsxdEowYXc2akxnYU9xNkJmbmZscnhaTFE&hl=en&authkey=CMHw8tYO
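
A toy model of the log-plus-key-directory idea, to show why heavy updates grow the footprint. It is a sketch under simplifying assumptions: an in-memory buffer stands in for the data file, records carry no headers or checksums, and the class and method names are hypothetical rather than Bitcask's API.

```python
import io

class TinyLog:
    """Toy append-only store: values go to a log, keys stay in memory with offsets."""

    def __init__(self):
        self.log = io.BytesIO()   # stands in for the on-disk data file
        self.keydir = {}          # key -> (offset, length) of the latest value

    def put(self, key, value):
        offset = self.log.seek(0, io.SEEK_END)   # always append, never overwrite
        self.log.write(value)
        self.keydir[key] = (offset, len(value))

    def get(self, key):
        offset, length = self.keydir[key]        # one lookup, then one positioned read
        self.log.seek(offset)
        return self.log.read(length)

    def log_bytes(self):
        """Total bytes written to the log, including superseded values."""
        return len(self.log.getvalue())

    def live_bytes(self):
        """Bytes still reachable through the key directory."""
        return sum(length for _, length in self.keydir.values())

store = TinyLog()
store.put("myKey", b"v1")
store.put("myKey", b"version-2")   # the old value stays in the log as garbage
print(store.get("myKey"), store.log_bytes(), store.live_bytes())
```

The gap between log_bytes and live_bytes is what a merge or compaction pass reclaims, and the key directory is why every key has to fit in memory.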

Speak my language?
• HTTP
http://wiki.basho.com/display/RIAK/REST+API

• Protocol Buffers
http://wiki.basho.com/display/RIAK/PBC+API

• Native Erlang
http://wiki.basho.com/display/RIAK/Erlang+Client+PBC

http://www.zazzle.com/speak_to_me_in_tagalog_tshirt-235376204895796392

OK, sounds good.
How do I get it?
>hg clone http://bitbucket.org/basho/riak
>cd riak
>make all && make rel
Or, if you’re on a Mac:
>brew install riak

What does that get me?
• Fully functional
• Self contained (<3)
• Default configuration
- 64 vnodes, “riak” cookie, N = 3

Work... like so.

• Config files
http://wiki.basho.com/display/RIAK/Configuration+Files

• app.config
-ring_creation_size

• vm.args
-name, -settings

Fire it up

> bin/riak
> Usage: riak {start|stop|restart|reboot|ping|console|attach}

> bin/riak start

Do Stuff!
GET:

> curl -v http://127.0.0.1:8098/ping

> curl -v http://127.0.0.1:8098/stats

> curl -v http://127.0.0.1:8098/riak/myBucket

> curl -v http://127.0.0.1:8098/riak/myBucket/myKey

PUT:

> curl -v -X PUT -H "Content-Type: application/json" -d '{"backend": "ets"}' http://127.0.0.1:8098/riak/myBucket

> curl -v -X PUT -d 'test key' http://127.0.0.1:8098/riak/myBucket/myKey

> curl -v -X POST -d 'autogen key' http://127.0.0.1:8098/riak/myBucket

Links
• Lightweight Graphing
• Practical limitations re. number of links per object
• Unidirectional object linking
• relationship modeling (one to one, one to many)
• Returns “Content-Type: multipart/mixed;”
- Library needs to be multipart aware
- nodejs, formidable

Link Walking
First level depth
>curl http://localhost:8098/riak/myBucket/myKey/_,_,_

Via Map/Reduce
>curl -X POST -H "content-type:application/json" http://localhost:8098/mapred --data @-
{"inputs":[["myBucket","myKey"]],"query":[{"link":{}},{"map":{"language":"javascript","source":"function(v) { return [v]; }"}}]}
^D

N level depth
>curl http://localhost:8098/riak/myBucket/myKey/_,_,_/_,_,_

More Info:
http://blog.basho.com/2010/02/24/link-walking-by-example/
http://wiki.basho.com/display/RIAK/Links
http://wiki.basho.com/display/RIAK/REST+API#RESTAPI-Linkwalking
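
A sketch of what each walk step does, using an in-memory table instead of a Riak cluster. Every step is a bucket and tag filter where `_` matches anything; the third field of the URL spec (keep) is left out for brevity. The link table, bucket names, and `walk` function are hypothetical.

```python
# Hypothetical link table: (bucket, key) -> list of (target_bucket, target_key, tag).
links = {
    ("people", "alice"): [("people", "bob", "friend"), ("posts", "p1", "author")],
    ("people", "bob"): [("people", "carol", "friend")],
    ("people", "carol"): [],
    ("posts", "p1"): [],
}

def walk(start, steps):
    """Follow one (bucket, tag) filter per step; '_' is a wildcard."""
    current = [start]
    for bucket, tag in steps:
        following = []
        for obj in current:
            for target_bucket, target_key, link_tag in links.get(obj, []):
                if bucket in ("_", target_bucket) and tag in ("_", link_tag):
                    following.append((target_bucket, target_key))
        current = following   # links are unidirectional, so the walk only moves forward
    return current

# First level depth, comparable to .../people/alice/_,_,_
print(walk(("people", "alice"), [("_", "_")]))
# Two levels, following only "friend" links on the first hop
print(walk(("people", "alice"), [("people", "friend"), ("_", "_")]))
```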

Map/Reduce
• Functions written in either Erlang or JavaScript
• Map is distributed to where the data lives
• Reduce is run on the node coordinating the M/R
• Erlang > JavaScript
• Tweak JavaScript settings in app.config

M/R in Riak
• An input to start from
- bucket
- list of keys
★ keys > bucket
• possible link phase
• one or more map phases
• (many) possible reduce phase(s)

function(v, keydata, args) {
  if (v.values) {
    var ret = [], o = {};
    o = Riak.mapValuesJson(v)[0];
    o.lastModifiedParsed = Date.parse(v["values"][0]["metadata"]["X-Riak-Last-Modified"]);
    o.key = v["key"];
    ret.push(o);
    return ret;
  } else {
    return [];
  }
}

Map = SQL Where clause
Reduce = SQL Aggregates (SUM, COUNT, GROUP BY)
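
To make the SQL analogy concrete, here is a local simulation of one map phase followed by one reduce phase. It runs in plain Python rather than inside Riak; the sample objects and function names are hypothetical.

```python
# Hypothetical stored objects: key -> decoded JSON value.
objects = {
    "post1": {"author": "alice", "tags": ["riak", "nosql"]},
    "post2": {"author": "bob", "tags": ["redis"]},
    "post3": {"author": "alice", "tags": ["riak"]},
    "post4": {"author": "bob", "tags": ["riak", "erlang"]},
}

def map_post(key, value):
    """Runs once per object; filtering here plays the role of a WHERE clause."""
    if "riak" not in value["tags"]:
        return []
    return [(value["author"], 1)]

def reduce_count(pairs):
    """Aggregates map output, like COUNT with GROUP BY author."""
    totals = {}
    for author, count in pairs:
        totals[author] = totals.get(author, 0) + count
    return sorted(totals.items())

mapped = [pair for key, value in objects.items() for pair in map_post(key, value)]
result = reduce_count(mapped)
print(result)
```

The reduce function takes and returns the same shape (a list of author and count pairs), so it can safely be applied again to its own partial output.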

Pre/Post Commit Hooks
• Pre Commit
- JavaScript or Erlang
- Validation
- Modify data
- Kill writes
• Post Commit
- Erlang
- Indexing
- Messaging
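
A conceptual sketch of where the two kinds of hooks sit around a write. Real Riak hooks are JavaScript or Erlang functions registered on a bucket; the Python below only models the control flow, and every name in it is hypothetical.

```python
class WriteRejected(Exception):
    """Raised by a pre-commit hook to kill the write."""

def require_title(obj):
    """Pre-commit validation: refuse objects without a title."""
    if not obj["value"].get("title"):
        raise WriteRejected("title is required")
    return obj

def normalize_tags(obj):
    """Pre-commit modification: return a changed copy of the object."""
    value = dict(obj["value"])
    value["tags"] = sorted({tag.lower() for tag in value.get("tags", [])})
    return {**obj, "value": value}

def index_by_tag(obj, tag_index):
    """Post-commit indexing: runs only after the write has been stored."""
    for tag in obj["value"]["tags"]:
        tag_index.setdefault(tag, []).append(obj["key"])

def commit(obj, store, tag_index):
    for hook in (require_title, normalize_tags):   # pre-commit: may modify or reject
        obj = hook(obj)
    store[obj["key"]] = obj["value"]               # the write itself
    index_by_tag(obj, tag_index)                   # post-commit: cannot undo the write
    return obj

store, tag_index = {}, {}
commit({"key": "p1", "value": {"title": "Hello", "tags": ["Riak", "NoSQL"]}}, store, tag_index)
try:
    commit({"key": "p2", "value": {"tags": ["riak"]}}, store, tag_index)
except WriteRejected as err:
    print("rejected:", err)
```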

Code demo
• nodejs
• riak-js
• redis
• simple post site
• tags
• json data passing

Chief complaints

• No index

• No native sort

• No increment

• No full text search *

* Yet ;) Incoming: Riak Search!
http://www.slideshare.net/rklophaus/riak-search-erlang-factory-london-2010

Hybrid architectures are the future!
Use tools like Redis to augment shortcomings!

1,456,023 Or “A Lot”

• At scale, precision does not matter in practice.

• Google

• Twitter

http://photography.nationalgeographic.com/photography/enlarge/okavango-cape-buffalo_pod_image.html

Google
Look Ma!

No exact counts!

Twitter

No Totals!

No Pagination!

Questions?

Alexander Sicular
@siculars
