Schema Design for Riak (Take 2)

A discussion of strategies for designing application schemas that use the Riak distributed key-value store.

Video available here: http://vimeo.com/17604126

Published in: Technology

  • Transcript

    • 1. Schema Design for Riak. Sean Cribbs, Developer Advocate, Basho
    • 2. Thinking Non-Relationally
    • 3. “There is no spoon schema.”
    • 4. There is an application.
    • 10. What you lose • Tables • Foreign keys and constraints • ACID • Sophisticated query planners • Declarative query language (SQL)
    • 15. What you gain • More flexible, fluid designs • More natural data representations • Scaling without pain • Reduced operational complexity
    • 20. Walking without Relational crutches • Sparse data (optional/multi-value fields) • Richer data structures • Meaningful identifiers • Innovative access patterns
    • 21. Know your data
    • 26. Make your Top 10 list • Frequently requested pages/screens • Slow queries • Secondary indexes • Complicated joins or aggregations
    • 31. Analyze • Interdependencies, coupling • Cardinalities of relationships • Access pattern • Shoehorned structures
    • 32. Example
    • 33. Radiant CMS (radiantcms.org)
    • 36. The “Styled Blog” Template • Date-organized pages • Static sidebar content
    • 40. Administration UI • hierarchical site structure • add new pages by parent • special “virtual” pages
    • 44. slug, breadcrumb • content blocks • layout
    • 47. Content-Type • template tags
    • 54. Relational Schema (diagram) • Tables: pages, page_parts, layouts, snippets, users • Relationship labels: parent, created/modified by
    • 55. Converting to Riak
    • 60. Layouts and Snippets • Layouts and Snippets are accessed by name • SQL: WHERE name=? • Simple access by key • Simple value structure (content + metadata) • Example keys: layouts/Main, snippets/top-nav
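
To make "simple access by key" concrete, here is a minimal Python sketch. It assumes a plain dict keyed by (bucket, key) as a stand-in for the store; the `put` and `get` helpers and the stored strings are hypothetical illustrations, not a Riak client API.

```python
# A dict keyed by (bucket, key) stands in for the key-value store.
# This is an illustrative assumption, not a Riak client API.
store = {}

def put(bucket, key, content, **metadata):
    """Store a simple value: the content plus a small metadata dict."""
    store[(bucket, key)] = {"content": content, "metadata": metadata}

def get(bucket, key):
    """Fetch by bucket and key; returns None when the key is absent."""
    return store.get((bucket, key))

# The name that SQL would match with WHERE name=? becomes the key itself.
put("layouts", "Main", "<html>...</html>", content_type="text/html")
put("snippets", "top-nav", "<ul>...</ul>")

layout = get("layouts", "Main")
snippet = get("snippets", "top-nav")
```

Because the name is the key, no secondary index is involved: one fetch per layout or snippet.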
    • 70. Users. Normally accessed by login or email: WHERE login=? OR email=? • #1: Login or Email key: easy lookup on one, manual index for the other • #2: Arbitrary key: independent of email/login changes; manually index both • #3: Users-list object: fits “small teams” philosophy; quick lookup (one fetch); bad for large list, still need unique IDs
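
Option #2 can be sketched as follows. This is a minimal illustration under stated assumptions: three dicts stand in for a users bucket and two hand-maintained index buckets, and the function names and sample addresses are hypothetical.

```python
import uuid

users = {}        # arbitrary key -> user document
login_index = {}  # login -> user key (maintained by hand)
email_index = {}  # email -> user key (maintained by hand)

def create_user(login, email):
    """Write the user under an arbitrary key and index both lookup fields."""
    key = uuid.uuid4().hex
    users[key] = {"login": login, "email": email}
    login_index[login] = key
    email_index[email] = key
    return key

def find_user(login_or_email):
    """Equivalent of WHERE login=? OR email=?: index fetch, then user fetch."""
    key = login_index.get(login_or_email) or email_index.get(login_or_email)
    return users.get(key) if key else None

def change_email(key, new_email):
    """The user key is unaffected; only the email index entry moves."""
    user = users[key]
    del email_index[user["email"]]
    user["email"] = new_email
    email_index[new_email] = key

user_key = create_user("sean", "sean@example.com")
change_email(user_key, "sean@example.org")
```

The cost of the arbitrary key is visible here: every lookup takes two fetches, and every write that touches login or email must also update an index entry.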
    • 71. The Bigger Problems
    • 75. Page Rendering • Load and render layout in page context • Render parts, snippets via template tags • Iterate over sections of page hierarchy
    • 80. Rendering Page Parts • Parts are dependent on the page • Compose / Denormalize (part of page) • Reduces lookups, conceptual unity, no index needed • Larger objects, no lazy fetching
    • 82. Denormalized Page (dependent objects inline): { title:"Home Page", parts:[ {name:"body", content:"..."}, {name:"sidebar", content:"..."}, ], // ... }
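
The denormalized page can be sketched in Python as a single JSON document. The field names come from the slide; the `part_content` helper is a hypothetical illustration of how a renderer might read an inlined part.

```python
import json

# The page and its dependent parts live in one object (names from the slide).
page = {
    "title": "Home Page",
    "parts": [
        {"name": "body", "content": "..."},
        {"name": "sidebar", "content": "..."},
    ],
}

def part_content(page, name):
    """Return the content of the named part, or None if the page lacks it."""
    for part in page["parts"]:
        if part["name"] == name:
            return part["content"]
    return None

serialized = json.dumps(page)      # one value to store under the page's key
restored = json.loads(serialized)  # one fetch brings back every part
```

The trade-off from the previous slide shows up directly: there is no per-part lookup or index, but every fetch transfers all parts whether or not they are rendered.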
    • 87. Site Hierarchy ...or, how to find a page (currently) • Start at the root: WHERE parent_id IS NULL • Check if the current page matches URL • Check children for matching path segment, recurse • Return “not found” page or 404
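
The relational lookup described above can be sketched like this. It is a simplified model, not Radiant's code: a list of dicts stands in for the pages table, `children_of` stands in for one SQL query, and the slugs are made-up sample data.

```python
# Rows of a hypothetical pages table; slugs are sample data.
pages = [
    {"id": 1, "parent_id": None, "slug": "/"},
    {"id": 2, "parent_id": 1, "slug": "articles"},
    {"id": 3, "parent_id": 1, "slug": "about"},
    {"id": 4, "parent_id": 2, "slug": "first-post"},
]
stats = {"queries": 0}

def children_of(parent_id):
    """Stand-in for one query: SELECT * FROM pages WHERE parent_id = ?
    (or WHERE parent_id IS NULL when parent_id is None)."""
    stats["queries"] += 1
    return [p for p in pages if p["parent_id"] == parent_id]

def find_by_url(url):
    """Walk from the root, matching one path segment per level."""
    segments = [s for s in url.split("/") if s]
    current = children_of(None)[0]  # the root page
    for segment in segments:
        matches = [c for c in children_of(current["id"]) if c["slug"] == segment]
        if not matches:
            return None  # caller renders the "not found" page or a 404
        current = matches[0]
    return current
```

Each level of the hierarchy costs one query, so the number of queries grows with the depth of the requested page.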
    • 92. Site Hierarchy ...or, how to find a page (analysis) • O(log N) queries to find requested page • Index on parent_id speeds retrieval • Traversing/iterating for content generation • Generating URL paths
    • 97. Site Hierarchy ...or, how to find a page (in Riak) • Blog post: http://ow.ly/3jDAO • #1 Parent/child links • #2 Material path key • #3 Tree object
    • 101. #1: Parent/Child Links: A Natural Tree • Easy to understand & implement • Doubly-linked for easy traversal in either direction • Easy to move entire subtrees • Links: </riak/pages/C>; riaktag="child" and </riak/pages/A>; riaktag="parent" • Diagram: tree of pages A, B, C, D, E, F
    • 105. #1: Parent/Child Links: A Natural Tree • Child order is arbitrary • Still O(log N) traversal • Two writes to add or update • Diagram: tree of pages A, B, C, D, E, F
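
A minimal sketch of the doubly-linked approach, assuming a dict as the "pages" bucket and (target key, tag) tuples as stand-ins for Riak links. The helper names are hypothetical, and the tree shape (A over B and C, B over D, C over E and F) is taken from the tree-object example elsewhere in the deck.

```python
pages = {}              # stand-in for the "pages" bucket
writes = {"count": 0}   # counts stores, to show the cost of adding a page

def put(key, obj):
    pages[key] = obj
    writes["count"] += 1

def add_child(parent_key, child_key):
    """Doubly-linked insert: one write for the child, one for the parent."""
    put(child_key, {"links": [(parent_key, "parent")]})
    parent = pages[parent_key]
    parent["links"].append((child_key, "child"))
    put(parent_key, parent)

def linked(key, tag):
    """Follow links with the given tag. A list keeps insertion order here,
    but child order should be treated as arbitrary."""
    return [target for target, t in pages[key]["links"] if t == tag]

put("A", {"links": []})
for parent_key, child_key in [("A", "B"), ("A", "C"), ("B", "D"), ("C", "E"), ("C", "F")]:
    add_child(parent_key, child_key)
```

Traversal works in both directions, for example `linked("C", "child")` and `linked("D", "parent")`, at the price of two writes per added page.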
    • 110. #2: Material Path Key: “Best Guess” discovery • Use path-to-page as key • Best case: one lookup to find requested page • Ancestor pages listed inside object • Keys in the “pages” bucket: _root_ (A), B, C, B/D, C/E • Example object: { title:"D", ancestors:[ "B", "_root_" ] }
    • 115. #2: Material Path Key: “Best Guess” discovery • Large update cost for internal nodes • Dynamic URLs hard • Key-filters (0.14) needed for efficient child lists • Need fallback for “misses” • Keys in the “pages” bucket: _root_ (A), B, C, B/D, C/E
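
The "best guess" lookup can be sketched as below. The keys and the ancestors field follow the slide; the fallback strategy (retrying with the nearest ancestor path on a miss) is an assumption made for illustration, since the slides do not say which fallback is intended.

```python
# Stand-in for the "pages" bucket, keyed by path (keys from the slide).
pages = {
    "_root_": {"title": "A", "ancestors": []},
    "B": {"title": "B", "ancestors": ["_root_"]},
    "C": {"title": "C", "ancestors": ["_root_"]},
    "B/D": {"title": "D", "ancestors": ["B", "_root_"]},
    "C/E": {"title": "E", "ancestors": ["C", "_root_"]},
}

def key_for(url):
    """Map a request URL to a path key; the site root uses the key _root_."""
    return url.strip("/") or "_root_"

def find_page(url):
    """Try the full path as the key; on a miss, fall back one segment at a
    time (assumed strategy). Returns (key found or None, number of fetches)."""
    key = key_for(url)
    fetches = 0
    while True:
        fetches += 1
        if key in pages:
            return key, fetches
        if key == "_root_":
            return None, fetches
        key = key.rpartition("/")[0] or "_root_"
```

In the best case the requested page is found in one fetch; each miss adds one fetch per path segment that had to be dropped.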
    • 120. #3: Tree Object: “Branches and Leaves” • One request to get site structure • Separates content from organization • Intrinsic ordering • Massive structure changes are quick • Example: { key:"A", children:[ {key:"B", children:["D"]}, {key:"C", children:["E","F"]} ] }
    • 124. #3: Tree Object: “Branches and Leaves” • Large site = large tree object (expensive to transfer) • Some metadata (slug?) will need to be in-tree for efficient traversal • Multiple writers problematic • Example: { key:"A", children:[ {key:"B", children:["D"]}, {key:"C", children:["E","F"]} ] }
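
A minimal sketch of resolving a path against the tree object. The tree literal mirrors the slide, with leaves stored as bare strings; the helper names are hypothetical, and matching path segments directly against keys is a simplification (the slides suggest a slug would be kept in-tree for this).

```python
# The whole site structure in one object, as on the slide.
tree = {
    "key": "A",
    "children": [
        {"key": "B", "children": ["D"]},
        {"key": "C", "children": ["E", "F"]},
    ],
}

def node_key(node):
    """Leaves are bare strings; branches are dicts with a key."""
    return node if isinstance(node, str) else node["key"]

def node_children(node):
    return [] if isinstance(node, str) else node["children"]

def resolve(tree, segments):
    """Walk the in-memory tree; no further fetches are needed after the
    tree object itself has been loaded. Returns a page key or None."""
    node = tree
    for segment in segments:
        for child in node_children(node):
            if node_key(child) == segment:
                node = child
                break
        else:
            return None  # no child matched this segment
    return node_key(node)
```

For example, `resolve(tree, ["C", "F"])` walks two levels in memory and yields the key of the page to fetch, so locating any page costs one fetch for the tree plus one for the page.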
    • 127. Hybrid Solutions: TIMTOWTDI • Material paths for quick lookups, links for relative traversal • Tree object for relative lookups, secondary index for material paths
    • 132. Key Takeaways • Design for most common access pattern: the key is your index • Denormalize dependent data types • Build richer (or simpler) data structures • Use links to connect normalized or independent types
    • 136. Useful Tips • Use query or identity caches to reduce duplicate fetches • Store data in JSON, XML, or Erlang terms for MapReduce • Use Riak Search where appropriate to reduce complexity
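
The first tip can be illustrated with a small identity cache. This is a sketch under assumptions: the `IdentityCache` class and the `slow_fetch` function are hypothetical, and `slow_fetch` merely records calls in place of a real round trip to the store.

```python
class IdentityCache:
    """Remember objects by (bucket, key) for the life of one request so the
    same object is fetched at most once."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._objects = {}

    def get(self, bucket, key):
        ident = (bucket, key)
        if ident not in self._objects:
            self._objects[ident] = self._fetch(bucket, key)
        return self._objects[ident]

calls = []

def slow_fetch(bucket, key):
    """Stand-in for a network fetch; records each call."""
    calls.append((bucket, key))
    return {"bucket": bucket, "key": key}

cache = IdentityCache(slow_fetch)
first = cache.get("snippets", "top-nav")
second = cache.get("snippets", "top-nav")  # served from the cache
```

A page that renders the same snippet several times then pays for one fetch only; the cache should be discarded at the end of the request so stale objects are not reused.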
    • 141. Review • Analyze your relational model • Identify pain points, take stats • Design some alternatives • Test, Measure, Repeat!
    • 143. Plug • Interested in learning about support, consulting, or Enterprise features? Email info@basho.com or go to http://www.basho.com/contact.html to talk with us. • www.basho.com • sean@basho.com • @seancribbs