Riak intro to..

Transcript

  • 1. Wednesday, March 13, 13
  • 2. #WHOIS Adron Hall | @adron | Coder, Messenger, Recon
  • 3. 芭蕉
  • 4. Distributed, masterless, highly available key/value store
  • 5. DESIGN GOALS: Fault-Tolerance, Ops Friendliness, Horizontal Scalability, Predictability, Low-Latency, High-Availability
  • 6. When and what to use Riak for: Users/Profiles, Logging Systems, Metadata, Sensor Data, Session Storage, Notification Systems, Object Storage, Record Systems
  • 7. IN PRODUCTION AT [company logos] ...and 1000s more
  • 8. DATA MODEL
  • 9. {“Key”:“Value”}
      • Values are stored against keys
      • Key/Value + Metadata = Object
      • Fundamental unit of replication
      • Any datatype will work
      • Recorded to disk in binary format
  • 10. <<BUCKET>>/<<KEY>>
      • Virtual namespace
      • Bucket + Key = Object address
      • Buckets have properties
      • Objects in a bucket inherit its properties
      • No relationships between buckets
  • 11. DATA ACCESS
  • 12. INTERFACES
      • HTTP API - via a little piece of magic called Webmachine; largely-faithful REST implementation
      • Protocol Buffers API - thanks, Google! Compact, binary protocol
  • 13. CLIENT LIBS: Python, Java, Clojure, Erlang, PHP, Haskell, Ruby, C/C++, OCaml, Perl, .NET, Scala, Dart, Go, Node.js ...and more via Basho or our community.
  • 14. RIAK GIVES YOU [FOUR] WAYS TO STORE, RETRIEVE, AND QUERY DATA
  • 15. CRUD
      // GET
      GET /buckets/bucket/keys/key
      // PUT
      POST /buckets/bucket/keys          // Riak-defined key
      PUT /buckets/bucket/keys/key       // User-defined key
      // DELETE
      DELETE /buckets/bucket/keys/key
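The CRUD routes on slide 15 can be sketched as a small Python helper that only builds the HTTP requests, without sending them. This is a minimal sketch, not an official client: the host, the port (8098 is assumed here as the node's HTTP port), the bucket and key names, and the helper function names are all illustrative assumptions.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: a local node listening for HTTP on port 8098; adjust for your cluster.
BASE_URL = "http://127.0.0.1:8098"


def object_url(bucket, key=None):
    """Address of an object, or of the bucket's key collection when key is None."""
    url = f"{BASE_URL}/buckets/{quote(bucket, safe='')}/keys"
    return url if key is None else f"{url}/{quote(key, safe='')}"


def fetch(bucket, key):
    return Request(object_url(bucket, key), method="GET")


def store(bucket, value, key=None, content_type="application/json"):
    # POST without a key asks Riak to pick one; PUT writes to a user-defined key.
    method = "POST" if key is None else "PUT"
    return Request(object_url(bucket, key), data=value, method=method,
                   headers={"Content-Type": content_type})


def delete(bucket, key):
    return Request(object_url(bucket, key), method="DELETE")


if __name__ == "__main__":
    body = b'{"name": "Adron"}'
    for req in (fetch("users", "adron"), store("users", body),
                store("users", body, key="adron"), delete("users", "adron")):
        print(req.get_method(), req.full_url)
```

Passing one of these `Request` objects to `urllib.request.urlopen` would perform the call against a running node.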
  • 16. MapReduce
      • Distributed processing system using Riak Pipe
      • Efficient for targeted queries over a known key range
      • Write jobs in Erlang or JS (Erlang is more performant)
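A targeted MapReduce job over a known set of keys might be described with a JSON document like the one built below. This is a hedged sketch: the job layout (`inputs` as bucket/key pairs, `query` as a list of phases) and the built-in JavaScript function names `Riak.mapValuesJson` and `Riak.reduceSum` reflect the HTTP MapReduce interface as I understand it and should be checked against the documentation for your Riak version. The bucket and key names are made up.

```python
import json


def build_job(bucket, keys):
    """Build a MapReduce job spec that sums the JSON values stored at the given keys."""
    return {
        # Explicit bucket/key pairs keep the job targeted instead of scanning a bucket.
        "inputs": [[bucket, key] for key in keys],
        "query": [
            {"map": {"language": "javascript", "name": "Riak.mapValuesJson"}},
            {"reduce": {"language": "javascript", "name": "Riak.reduceSum"}},
        ],
    }


if __name__ == "__main__":
    job = build_job("sensor_readings", ["r1", "r2", "r3"])
    print(json.dumps(job, indent=2))
```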
  • 17. Secondary Indexing (2i)
      • riak_object X-Riak-Index-email_bin “mark@basho.com”
      • riak_object X-Riak-Index-value_int “42”
      • Tag objects with custom metadata on PUT...
      • Exact match and range queries...
      • No multi-index queries yet...
      • Pagination is on its way...
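The tagging and querying described on slide 17 can be sketched as follows. The `X-Riak-Index-*` header names come from the slide; the query paths and the helper names are assumptions based on the HTTP 2i interface as I understand it, so treat this as an illustration rather than a reference.

```python
def index_headers(**indexes):
    """Turn email_bin='x', value_int=42 into X-Riak-Index-* headers to send on PUT."""
    headers = {}
    for field, value in indexes.items():
        # Binary (string) indexes end in _bin, integer indexes end in _int.
        if not field.endswith(("_bin", "_int")):
            raise ValueError(f"index field must end in _bin or _int: {field}")
        headers[f"X-Riak-Index-{field}"] = str(value)
    return headers


def exact_match_path(bucket, field, value):
    # Assumed query route for an exact match on one index.
    return f"/buckets/{bucket}/index/{field}/{value}"


def range_path(bucket, field, start, end):
    # Assumed query route for a range query on one index.
    return f"/buckets/{bucket}/index/{field}/{start}/{end}"


if __name__ == "__main__":
    print(index_headers(email_bin="mark@basho.com", value_int=42))
    print(exact_match_path("users", "email_bin", "mark@basho.com"))
    print(range_path("users", "value_int", 1, 100))
```

Each query targets a single index, which matches the slide's note that multi-index queries are not available yet.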
  • 18. Riak Search
      • Store and index documents (JSON, text, XML, etc.)
      • Current Riak Search supports a subset of the Solr API
      • Next iteration (Yokozuna; in beta) will implement distributed Solr on Riak. It will be sexy.
      • Looking for beta testers to help harden Yokozuna
  • 19. ARCHITECTURE
      The scalability and ease of operation goals inform architectural decisions. These come with tradeoffs.
      • Consistent Hashing
      • Virtual Nodes
      • Handoff/Rebalancing
      • Vector Clocks
      • Append-only storage
      • Active Anti-Entropy*
  • 20. Consistent Hashing
      • Location of data in the Riak ring is determined by a hash of bucket + key.
      • Provides even distribution of storage and query load
      • Trades off advantages gained from locality, e.g. range queries and aggregates
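The placement rule on slide 20 can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the ring is the 160-bit SHA-1 space cut into equal partitions, the partition count (64) and replica count (3) are illustrative values, and the bucket and key are simply joined with "/" before hashing, which is not how Riak itself encodes them.

```python
import hashlib

RING_TOP = 2 ** 160   # size of the SHA-1 hash space
RING_SIZE = 64        # number of partitions (assumed value)
N_VAL = 3             # replicas per object (assumed value)


def ring_position(bucket, key):
    """Hash bucket + key to an integer position on the ring."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def partition_for(bucket, key, ring_size=RING_SIZE):
    """Index (0..ring_size-1) of the equal-sized partition containing the position."""
    return ring_position(bucket, key) * ring_size // RING_TOP


def preference_list(bucket, key, n_val=N_VAL, ring_size=RING_SIZE):
    """The partition holding the key plus the next n_val - 1 partitions around the ring."""
    first = partition_for(bucket, key, ring_size)
    return [(first + i) % ring_size for i in range(n_val)]


if __name__ == "__main__":
    print(partition_for("users", "adron"), preference_list("users", "adron"))
```

Because neighbouring keys hash to unrelated positions, load spreads evenly, and for the same reason a range of keys is scattered across the ring, which is the locality tradeoff the slide mentions.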
  • 21. Consistent Hashing
  • 22. Virtual Nodes
      • Unit of addressing and concurrency in Riak
      • Each physical host manages many vnodes
      • Partition count / physical machines = vnodes/machine*
      • Decouples physical assets from data distribution. This provides:
        - simplicity in cluster sizing
        - failure isolation
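The sizing rule on slide 22 (partition count / physical machines = vnodes per machine) can be checked with a short worked example. The round-robin assignment below is a deliberately simplified stand-in for Riak's real claim algorithm, and the node names and ring size are made up.

```python
from collections import Counter


def claim_partitions(ring_size, nodes):
    """Assign each partition (vnode) to a physical node in round-robin order."""
    return {partition: nodes[partition % len(nodes)] for partition in range(ring_size)}


def vnodes_per_node(ring_size, nodes):
    """Count how many vnodes each physical node ends up managing."""
    return Counter(claim_partitions(ring_size, nodes).values())


if __name__ == "__main__":
    five_nodes = [f"node{i}" for i in range(1, 6)]
    # 64 partitions over 5 machines averages 12.8 vnodes per machine.
    print(vnodes_per_node(64, five_nodes))
```

Since 64 does not divide evenly by 5, some machines carry one more vnode than others; the per-machine figure is an average rather than an exact count.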
  • 23. Handoff/Rebalancing
      • Mechanisms for data rebalancing
      • When nodes join/leave the cluster, handoff and rebalancing manage the data shuffling dynamically
      • Trades off speed of convergence vs. effects on cluster performance - causes disk & network load
  • 24. Vector Clocks
      • VCs are used to rectify object consistency at READ time.
      • Lots of knobs to turn; well-documented
      • Trades off space, speed, and complexity for safety
        - will store all sibling objects until resolved
        - can lead to object size issues
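How a vector clock distinguishes a newer version from a conflicting one can be sketched as below. This is a generic textbook vector clock, not Riak's internal representation: clocks are plain dicts mapping an actor name to a counter, and the actor names are hypothetical.

```python
def increment(clock, actor):
    """Return a copy of the clock with the actor's counter advanced by one."""
    updated = dict(clock)
    updated[actor] = updated.get(actor, 0) + 1
    return updated


def descends(a, b):
    """True if clock a has seen every update that clock b has seen."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def resolve(a, b):
    """Decide at read time which version wins, or whether both must be kept."""
    if descends(a, b):
        return "a"          # a supersedes (or equals) b
    if descends(b, a):
        return "b"          # b supersedes a
    return "siblings"       # concurrent writes: keep both until resolved


if __name__ == "__main__":
    base = increment({}, "client_x")
    left = increment(base, "client_x")    # client_x updates again
    right = increment(base, "client_y")   # client_y updates concurrently
    print(resolve(left, base), resolve(left, right))
```

The "siblings" outcome is where the space cost on the slide comes from: both values are stored until a client or resolver picks or merges one.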
  • 25. Append-Only Storage
      • Riak provides a pluggable backend interface. (Write your own; we’ll probably hire you...)
      • Bitcask and LevelDB are the most heavily used. Both are append-only.
      • Provides crash safety and speed.
      • Tradeoff: periodic compaction/merge ops
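The append-only idea and its compaction tradeoff can be illustrated with a toy in-memory store. This is a conceptual sketch loosely inspired by a log plus an in-memory key index; it is not the Bitcask or LevelDB implementation, and the class and attribute names are invented for the example.

```python
class AppendOnlyStore:
    """Toy append-only store: writes never modify earlier records."""

    def __init__(self):
        self.log = []      # stands in for the on-disk data file
        self.keydir = {}   # key -> position of the latest record in the log

    def put(self, key, value):
        self.log.append((key, value))
        self.keydir[key] = len(self.log) - 1

    def get(self, key):
        return self.log[self.keydir[key]][1]

    def delete(self, key):
        self.log.append((key, None))   # tombstone record
        self.keydir.pop(key, None)

    def compact(self):
        """Rewrite the log keeping only the latest live record for each key."""
        live = sorted(self.keydir.items(), key=lambda item: item[1])
        self.log = [(key, self.log[pos][1]) for key, pos in live]
        self.keydir = {key: i for i, (key, _) in enumerate(self.log)}


if __name__ == "__main__":
    store = AppendOnlyStore()
    store.put("a", 1)
    store.put("a", 2)
    store.put("b", 3)
    store.delete("b")
    print(len(store.log), store.get("a"))
    store.compact()
    print(len(store.log), store.get("a"))
```

Overwrites and deletes only ever grow the log, which is why a periodic compaction or merge pass is needed to reclaim space.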
  • 26. RIAK 1.3 (AKA “new hotness”)
      • Active Anti-Entropy
      • Riaknostic included by default
      • MapReduce improvements
      • Riak Control improvements
      • IPv6 support
      • Much more
      Full release notes: https://github.com/basho/riak/blob/1.3/RELEASE-NOTES.md
  • 27. FUTURE WORK* (1.4 and beyond)
      • Dynamic Ring Size
      • Consistency
      • Yokozuna
      • 2i Improvements
      • CRDTs/Data Types
      • Riak Pipe work
      • Riak Object
      • Much more
      (* all code subject to ship early, late, or not at all)
  • 28. Multi-tenant cloud storage software for public and private clouds. Designed to provide simple, available, distributed cloud storage at any scale. S3-API compatible and supports per-tenant reporting for billing and metering use cases. Additional APIs on the way. Stores files of arbitrary size. Under the hood, it stores 1MB chunks alongside a manifest. A stateless proxy (CS) does the chunking. Riak does distribution, storage, etc.
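The chunk-plus-manifest scheme on slide 28 can be sketched as follows. Only the 1MB chunk size comes from the slide; the manifest fields, the chunk key naming, and the function names are hypothetical and chosen purely for illustration.

```python
CHUNK_SIZE = 1024 * 1024  # 1MB chunks, as stated on the slide


def chunk_object(name, data, chunk_size=CHUNK_SIZE):
    """Split a file into fixed-size chunks and describe them in a manifest."""
    chunks = {}
    chunk_keys = []
    for offset in range(0, len(data), chunk_size):
        key = f"{name}:{offset // chunk_size}"   # hypothetical chunk key format
        chunks[key] = data[offset:offset + chunk_size]
        chunk_keys.append(key)
    manifest = {"name": name, "size": len(data), "chunks": chunk_keys}
    return manifest, chunks


def reassemble(manifest, chunks):
    """Rebuild the original bytes by reading chunks in manifest order."""
    return b"".join(chunks[key] for key in manifest["chunks"])


if __name__ == "__main__":
    payload = bytes(range(256)) * 10240   # 2.5MB of sample data
    manifest, chunks = chunk_object("backup.tar", payload)
    print(len(manifest["chunks"]), reassemble(manifest, chunks) == payload)
```

In this model each chunk and the manifest would be stored as ordinary key/value objects, so the proxy doing the chunking can stay stateless.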
  • 29. Extends Riak's capabilities with:
      - multi-datacenter replication
      - SNMP configuration
      - JMX monitoring
      - 24x7 support from Basho engineers
      One cluster acts as a "source cluster". The source cluster replicates its data to one or more "sink clusters" using either real-time or full sync. Data transfer is unidirectional (source -> sink). Bidirectional synchronization can be achieved by configuring a pair of connections between clusters.
  • 30. RIAK COMMUNITY
      • Mailing List - 1300 developers
      • IRC - 200+ people every day yelling about software
      • GitHub - 1000s of watchers; 200+ contributors to all projects
      • Meetups - 10 countries, 23 cities, 3700+ members & growing fast!
      • Deployments - 1000s in production.
  • 31. ricon.io/east.html
      May 13-14th in New York City
      Talks, hacking, parties
      Dedicated to the future of Riak and distributed systems in production
      REGISTER NOW! https://ricon-east-2013.eventbrite.com/?discount=lovevnodes
  • 32. GETTING STARTED
      • Downloads - http://docs.basho.com/riak/latest/downloads/
      • Docs - http://docs.basho.com
      • Riak source code - github.com/basho/riak
      • All Basho source code - github.com/basho/
      • Riak Mailing List - http://bit.ly/riak-list
      • Email or tweet me: @adron or adron@basho.com
  • 33. Let’s Talk UI & CLI - Demo Things
  • 34. #WHOIS Adron Hall | @adron | Coder, Messenger, Recon