Riak Search - Berlin Buzzwords 2010

Riak Search is a distributed data indexing and search platform built on top of Riak. The talk will introduce Riak Search, covering overall goals, architecture, and core functionality.


Published in: Technology, Business

Transcript

  • 1. Riak Search: A Full-Text Search and Indexing Engine based on Riak. Berlin Buzzwords, June 2010. Basho Technologies. Rusty Klophaus - @rklophaus
  • 2. Why did we build it? What are the major goals? How does it work?
  • 3. Part One: Why did we build Riak Search?
  • 4. Riak is a scalable, highly-available, networked, open-source key/value store.
  • 5. Writing to a Key/Value Store [diagram: CLIENT, RIAK; label: Key/Value]
  • 6. Writing to a Key/Value Store [diagram: CLIENT, RIAK; label: Object]
  • 7. Querying a Key/Value Store [diagram: CLIENT, RIAK; labels: Key, Object]
  • 8. Querying Riak via LinkWalking [diagram: CLIENT, RIAK; labels: Key + Instructions, Walk to Related Keys, Object(s)]
  • 9. Querying Riak via Map/Reduce [diagram: CLIENT, RIAK; labels: Key(s) + JS Functions, Map, Map, Reduce, Computed Value(s)]
  • 10. Key/Value Stores like Key-Based Queries
  • 11. Query by Secondary Index [diagram: CLIENT asks: where Category == "Shoes"; RIAK replies: "WTF!? I'm a KV store!"]
  • 12. Full-Text Query [diagram: CLIENT asks: "Converse AND Shoes"; RIAK replies: "This is getting old."]
  • 13. These kinds of queries need an Index. *Market Opportunity!*
  • 14. Part Two: What are the major goals of Riak Search?
  • 15. An application built on Riak. [diagram: Your Application, Riak]
  • 16. Hrm... I need an index. [diagram: Your Application, Riak, Index Object]
  • 17. Hrm... I need an index with more features. [diagram: Your Application, ???, Riak]
  • 18. Lucene should do the trick... [diagram: Your Application, Lucene, Riak]
  • 19. ...shard to add more storage capacity... [diagram: Your Application, Lucene x3, Riak]
  • 20. ...replicate to add more throughput. [diagram: Your Application, Lucene x9, Riak]
  • 21. ...replicate to add more throughput. [diagram: Your Application, Lucene x9, Riak] Operations nightmare!
  • 22. What do we really want? [diagram: Your Application, Riak-ified Lucene, Riak]
  • 23. What do we really want? [diagram: Your Application, Riak Search, Riak]
  • 24. Functionality? Be like Lucene (and more).
      • Lucene Syntax
      • Leverages Java Lucene Analyzers
      • Solr Endpoints
      • Integration via Riak Post-Commit Hook (Index)
      • Integration via Riak Map/Reduce (Query)
      • Near-Realtime
      • Schema-less
  • 25. Operations? Be like Riak.
      • No special nodes
      • Add nodes, get more compute and storage
      • Automatically load balance
      • Replicas for durability and performance
      • Index and query in parallel
      • Swappable storage backends
  • 26. Part Three: How do we do it?
  • 27. A Gentle Introduction to Document Indexing
  • 28. The Inverted Index
      • Document: #1 Every dog has his day.
      • Inverted Index: day, 1; dog, 1; every, 1; has, 1; his, 1
  • 29. The Inverted Index
      • Documents: #1 Every dog has his day. #2 The dog's bark is worse than his bite. #3 Let the cat out of the bag. #4 It's raining cats and dogs.
      • Combined Inverted Index: and, 4; bag, 3; bark, 2; bite, 2; cat, 3; cat, 4; day, 1; dog, 1; dog, 2; dog, 4; every, 1; has, 1; ...
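The inverted index on slides 28 and 29 can be sketched in a few lines of Python. This is only an illustration, not Riak Search's implementation: the `analyze` function is a toy stand-in (an assumption) for a real analyzer, written so that "dog's" and "dogs" both index as "dog", matching the postings shown on the slide.

```python
import re
from collections import defaultdict

# The four example documents from slide 29, keyed by document id.
DOCS = {
    1: "Every dog has his day.",
    2: "The dog's bark is worse than his bite.",
    3: "Let the cat out of the bag.",
    4: "It's raining cats and dogs.",
}

def analyze(text):
    """Toy analyzer (an assumption, not a Lucene analyzer): lowercase,
    drop a possessive "'s", split on non-letters, and strip a trailing
    "s" from words longer than three letters."""
    text = text.lower().replace("'s", "")
    terms = re.findall(r"[a-z]+", text)
    return [t[:-1] if len(t) > 3 and t.endswith("s") else t for t in terms]

def build_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}

index = build_index(DOCS)

# Print (term, doc id) pairs in the same sorted layout as the slide.
for term in sorted(index):
    for doc_id in index[term]:
        print(f"{term}, {doc_id}")
```

Unlike the slide, this toy analyzer does not drop any stop words, so terms such as "the" and "of" also appear in the output.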
  • 30. At Query Time... "dog AND cat" is parsed into an AND node over the terms dog and cat.
  • 31. At Query Time... AND over posting lists: dog: 1, 2, 4; cat: 3, 4
  • 32. At Query Time... AND (Merge Intersection) of 1, 2, 4 and 3, 4. Result: 4
  • 33. At Query Time... OR (Merge Union) of 1, 2, 4 and 3, 4. Result: 1, 2, 3, 4
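The merge steps on slides 32 and 33 can be sketched as a two-pointer walk over sorted posting lists. The function names are illustrative, and the slides do not show Riak Search's actual merge code; this is the textbook form of the technique.

```python
def merge_intersection(a, b):
    """AND: advance through two sorted posting lists, keeping ids in both."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out

def merge_union(a, b):
    """OR: advance through two sorted posting lists, keeping ids in either."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    # One list is exhausted; whatever remains of the other is already sorted.
    out.extend(a[i:])
    out.extend(b[j:])
    return out

dog = [1, 2, 4]
cat = [3, 4]
print(merge_intersection(dog, cat))  # [4]
print(merge_union(dog, cat))         # [1, 2, 3, 4]
```

Both merges read each list once and never need the full lists in memory at the same time, which is why sorted postings suit streaming between nodes.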
  • 34. Complex Behavior from Simple Structures
  • 35. Storage Approaches...
  • 36. Riak Search uses Consistent Hashing to store data on Partitions
  • 37. Introduction to Consistent Hashing and Partitions
      • Partitions = 10
      • Number of Nodes = 5
      • Partitions per Node = 2
      • Replicas (NVal) = 2
  • 38. Introduction to Consistent Hashing and Partitions [diagram: Object]
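The numbers on slide 37 can be turned into a small sketch of a partitioned ring. Several details here are assumptions made for illustration and are not stated in the slides: the node names, round-robin ownership of partitions, SHA-1 as the hash function, and placing replicas on the partitions that follow the first one around the ring.

```python
import hashlib

NUM_PARTITIONS = 10                      # from slide 37
NODES = ["node1", "node2", "node3", "node4", "node5"]  # hypothetical names
N_VAL = 2                                # replicas, from slide 37
RING_SIZE = 2 ** 160                     # size of the SHA-1 hash space

def partition_owner(partition):
    """Assumed round-robin claim: 10 partitions over 5 nodes gives 2 each."""
    return NODES[partition % len(NODES)]

def key_to_partition(key):
    """Hash the key onto the ring and find the partition covering that point."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    point = int.from_bytes(digest, "big")
    return point * NUM_PARTITIONS // RING_SIZE

def preference_list(key):
    """The first partition plus the next N_VAL - 1 partitions on the ring."""
    first = key_to_partition(key)
    return [(first + k) % NUM_PARTITIONS for k in range(N_VAL)]

for key in ["shoes/converse", "shoes/vans"]:
    partitions = preference_list(key)
    owners = [partition_owner(p) for p in partitions]
    print(key, partitions, owners)
```

Because ownership depends only on the partition number, adding a node means reassigning some partitions rather than rehashing every key.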
  • 39. Document Partitioning vs. Term Partitioning
  • 40. ...and the Resulting Tradeoffs
  • 41. Document Partitioning @ Index Time [diagram: document #1, "Every dog has his day."]
  • 42. Document Partitioning @ Query Time [diagram: query "dog OR cat"]
  • 43. Term Partitioning @ Index Time [diagram: document #1, "Every dog has his day."; postings: day, 1; dog, 1; every, 1; has, 1; his, 1]
  • 44. Term Partitioning @ Index Time [diagram: postings dog, 1; day, 1; has, 1; his, 1; every, 1]
  • 45. Term Partitioning @ Query Time [diagram: query "dog OR cat"]
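The two layouts on slides 41 to 45 can be compared with a small sketch. It makes some assumptions for illustration: four partitions, CRC32 as a stand-in hash, and pre-tokenized documents. With document partitioning, a document's postings all land on one partition, so a query has to ask every partition. With term partitioning, each posting goes to the partition that owns its term, so a query only contacts the partitions for its terms.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical; the slides do not fix a number here

# Pre-tokenized versions of the example documents (tokenization assumed).
DOCS = {
    1: ["every", "dog", "has", "his", "day"],
    2: ["the", "dog", "bark", "is", "worse", "than", "his", "bite"],
    3: ["let", "the", "cat", "out", "of", "the", "bag"],
    4: ["it", "raining", "cat", "and", "dog"],
}

def bucket(value):
    """Deterministic stand-in hash: map a doc id or term to a partition."""
    return zlib.crc32(str(value).encode("utf-8")) % NUM_PARTITIONS

def index_by_document(docs):
    """Each document's postings are stored together on one partition."""
    parts = defaultdict(lambda: defaultdict(set))
    for doc_id, terms in docs.items():
        for term in terms:
            parts[bucket(doc_id)][term].add(doc_id)
    return parts

def index_by_term(docs):
    """Each posting is stored on the partition that owns its term."""
    parts = defaultdict(lambda: defaultdict(set))
    for doc_id, terms in docs.items():
        for term in terms:
            parts[bucket(term)][term].add(doc_id)
    return parts

def query_or_by_document(parts, terms):
    """OR query: every partition may hold matches, so ask all of them."""
    hits = set()
    for p in range(NUM_PARTITIONS):
        for term in terms:
            hits |= parts.get(p, {}).get(term, set())
    return sorted(hits), NUM_PARTITIONS

def query_or_by_term(parts, terms):
    """OR query: only the partitions that own the query terms are asked."""
    contacted = {bucket(term) for term in terms}
    hits = set()
    for term in terms:
        hits |= parts.get(bucket(term), {}).get(term, set())
    return sorted(hits), len(contacted)

doc_hits, doc_contacted = query_or_by_document(index_by_document(DOCS), ["dog", "cat"])
term_hits, term_contacted = query_or_by_term(index_by_term(DOCS), ["dog", "cat"])
print(doc_hits, "partitions contacted:", doc_contacted)
print(term_hits, "partitions contacted:", term_contacted)
```

Both layouts return the same documents; what differs is how many partitions each query touches and how the index-time writes are spread.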
  • 46. Tradeoffs...
      • Document Partitioning: + Lower Latency Queries; - Lower Throughput; - Lots of Disk Seeks
      • Term Partitioning: - Higher Latency Queries; + Higher Throughput; - Hotspots in Ring (the "Obama" problem)
  • 47. Riak Search: Term Partitioning. Term-partitioning is the most viable approach for our beta clients’ needs: high throughput on Really Big Datasets. Optimizations:
      • Term splitting to reduce hot spots
      • Bloom filters & caching to save query-time bandwidth
      • Batching to save query-time & index-time bandwidth
    Support for either approach eventually.
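Slide 47 mentions Bloom filters as a way to save query-time bandwidth but does not say how they are applied. One plausible use, sketched here purely as an assumption, is for an AND query: the partition holding one term sends a compact filter of its document ids, and the partition holding the other term returns only the ids that might match. The filter sizes and the SHA-1 based hashing are also illustrative choices.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive num_hashes bit positions by salting the item with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha1(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))

# Hypothetical flow for "dog AND cat" under term partitioning.
dog_postings = [1, 2, 4]   # held by the partition that owns "dog"
cat_postings = [3, 4]      # held by the partition that owns "cat"

# The "dog" partition ships a small filter instead of its full posting list.
dog_filter = BloomFilter()
for doc_id in dog_postings:
    dog_filter.add(doc_id)

# The "cat" partition forwards only ids that might also contain "dog".
candidates = [doc_id for doc_id in cat_postings if doc_id in dog_filter]

# False positives are possible, so the final result is checked exactly.
result = [doc_id for doc_id in candidates if doc_id in set(dog_postings)]
print(result)
```

The exact check at the end is what keeps the result correct; the filter only reduces how many ids have to cross the network.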
  • 48. Part Four: Review
  • 49. Riak Search turns this... [diagram: CLIENT asks: "Converse AND Shoes"; RIAK replies: "WTF!? I'm a KV store!"]
  • 50. ...into this... [diagram: CLIENT asks: "Converse AND Shoes"; RIAK replies: "Gladly!"]
  • 51. ...into this... [diagram: CLIENT asks: "Converse AND Shoes"; RIAK returns: Keys or Objects]
  • 52. ...while keeping operations easy. [diagram: Your Application, Riak Search, Riak]
  • 53. Thanks! Questions? Search Team: John Muellerleile - @jrecursive, Rusty Klophaus - @rklophaus, Kevin Smith - @kevsmith. Currently working with a small set of Beta users. Open-source release planned for Q3. www.basho.com
