SlideShare a Scribd company logo
1 of 54
Download to read offline
Relationships are Hard
‣ Why you should never use a
RDBMS again
‣ Map/Reduce, Views, or other
NoSQL query options
‣ Why NoSQL databases are
the cats pajamas
‣ How to prove the theory of
relativity
What we won’t be covering
‣ Where NoSQL came from
‣ Some database theory
‣ When its good or not good to use NoSQL
‣ A few Key Patterns
‣ A few Data Modeling Patterns
‣ Samples and Ideas of when to use both
What we will be covering
‣ Husband
‣ Dad
‣ Coach (CC and Track)
‣ Co-Owner of CKH Consulting
‣ Runner
‣ Polyglot
The Obligatory Who Am I?
#noSQL
It’s Just a Buzz Word
What does it mean?
Brief History
‣ Characteristics
‣ non-relational
‣ open-source (some)
‣ cluster-friendly (some)
‣ 21st Century
‣ schema less
Brief History
‣ Types
‣ key/value
‣ MemcacheDB
‣ document
‣ Couchbase, Redis, MongoDB
‣ column
‣ Cassandra, HBase
‣ graph
‣ neo4j, InfiniteGraph
And now for some database
theory
CAP THEOREM
PICK TWO
CAP Theorem (pick 2 please)
‣ Consistency
‣ all nodes see the same data at the same
time
‣ Availability
‣ a guarantee that every client making a
request for data gets a response, even if
one or more nodes are down
‣ Partition Tolerance
‣ the system continues to operate despite
arbitrary message loss or failure of part of
the system
ACID
‣ Atomicity
‣ All or Nothing (transactions)
‣ Consistency
‣ data is never violated (constraints.
validation, etc)
‣ Isolation
‣ operations must not see data from other
non-finished operations(locks)
‣ Durability
‣ state of data is persisted when operation
finishes (server crash protection)
BASE
‣ Basically Available
‣ Should be up, but not guaranteed
‣ Soft state
‣ changes happening even without
input
‣ Eventual consistency
‣ At some point your changes will
be consistent
‣ Less data duplication
‣ Less storage space
‣ Higher data consistency/integrity
‣ Easy to query
‣ Updates are EASY
Normalisation
Normalisation
userID username groupID statusID
1 wither 1 1
2 zombie 1 1
3 slime 2 2
4 enderman 1 2
statusID status
1 active
2 inactive
groupID group
1 monster
2 block
reference
reference
‣ Minimize the need for joins
‣ Reduce the number of indexes
‣ Reduce the number of relationships
‣ Updates are harder (slower)
Denormalisation
Denormalisation
userID username group status
1 wither monster active
2 zombie monster active
3 slime block inactive
4 enderman monster inactive
WAKE UP
Keying in NoSQL
‣ Predictable Keys
‣ Counter-ID
‣ Lookup Pattern
‣ Counter/Lookup combined
‣ No built in key mechanisms in most NoSQL
‣ Use a Unique value for key (email, usernames, sku, isbn, etc)
‣ Users
‣ u::ender@dragon.com
‣ u::enderdragon
‣ Products
‣ p::231-0321404314123 (isbn)
‣ Unpredictable Keys (GUID, UUID, etc) require Views (Map-
Reduce indexes) to find their documents
Predictable Keys
‣ Similar to IDENTITY column in RDBMS
‣ Creating New Documents is a pair of operations in your
application, increment and add
‣ initialize one Key as an Atomic Counter (on app start)
‣ increment counter and save new value
‣ id = bucket. counter(“blog::slug::comment_count”, {delta: 1})
‣ Use the id as component of the key for the new document
‣ bucket.upsert(“blog::slug::c” + id, comment_data)
Counter-ID Keys
‣ Create simple documents that has referential data (Key) to primary
documents
‣ Primary Document u::abc123-12345-4312
‣ Lookup Document u::iron@golem.com = u::abc123-12345-4312
‣ Lookup documents aren’t JSON, they should be the key as a string so
you skip the JSON parsing
‣ Requires two get operations, first the lookup doc, then the primary doc
‣ key = bucket.get(“u::iron@golem.com”)
‣ doc = bucket.get(key)
Lookup Pattern
RDBMS vs NoSQL
Gets slower with
growth
Embedding vs Linking
‣ Embed copies of frequently accessed related
objects
‣ Users friends into the users profile “friendlist”
‣ Allows for quick fetch of users friendlist and all friend
profiles in one read
‣ Denormalized
Embedding
‣ Can cause data duplication
‣ Updates are painful
‣ Easy to become inconsistent
Embedding
‣ user.friendsList = [ {username: “han”, firstName:
“Han”, lastName:”Solo”}, {username:
“leia”,firstName:”Leia”, lastName:”Solo”}]
Embedding
‣ user.friendsList = [ “han”, “leia”]
‣ If you use linking you have to get the linked data
yourself
‣ The DB Driver doesn’t do it for you, you can
delete a link without the reference deleting
Linking
‣ Embedding when
‣ there are no or few updates of the data
‣ caching
‣ when time-to-live is very restricted
‣ Otherwise link the related data
‣ Partial embedding is also possible
Embedding vs Linking
‣ Logging data
‣ Lots of records
‣ Little need to access all the children
Parent-Referencing
‣ One - To - Few = Embedding
‣ One - To - Many = Linking
‣ One - To - Squillions = Parent Referencing
Rule of Thumb
Making a mental shift
‣ In SQL we tend to want to avoid hitting the database as
much as possible
‣ we know that its costly when tying up connection pools and
overloading db servers
‣ even with indexing SQL still gets bogged down by complex joins
and huge indexes
‣ In noSQL gets and sets ARE FAST, and not a bottleneck, this
is hard for many people to accept and absorb at first
Making a mental shift
Gets and Sets are fast…
Gets and Sets are fast…
Gets and Sets are fast…
Gets and Sets are fast…
‣ Denormalization takes getting used to
‣ Data integrity
‣ Schema changes are easy
Making a mental shift
noSQL is Web Scale
‣ Big Data
‣ Impedance mismatch
‣ Frequently Changing schema
3 Big reasons for a NoSQL DB
‣ Document DB where it makes sense
‣ Relational DB where it makes sense
‣ Graph DB where it makes sense
Polyglot Persistence
Simple e-Commerce Example
‣ Store order history (including price at the time of
order)
‣ Keep product inventory
‣ Know the number of orders per product
‣ Know the number of orders per customer
‣ Human readable category URLs
Specs
‣ Email as key
‣ Embedded addresses
‣ Orders linking
Customer
‣ slug as key
‣ Products count
‣ Orders count
‣ Products parent referencing
Category
‣ SKU as key
‣ Orders count
‣ Inventory
‣ Categories linking
Product
‣ User Session ID as key
‣ Products Count
‣ Total
‣ Products embedded (partially)
‣ Status (active, completed, expired)
Cart
‣ Order Counter for Key
‣ Embed customer info (partially)
‣ Embed product info (partially)
‣ Totals
Order
‣ Set cart status to completed
‣ decrement inventory, adjust “inCart”
‣ create/update customer record
‣ add order counter ID to customer orders
‣ Update order counts on category and product
Order
‣ Introduction to NoSQL (Martin Folwer)
‣ https://www.youtube.com/watch?v=qI_g07C_Q5I
‣ Data modeling in key/value and document database
‣ http://openslava.sk/2014/Presentations/Data-modeling-in-
KeyValue-and-Document-Stores_Philipp-Fehre.pdf
‣ NoSQL data modeling techniques
‣ https://highlyscalable.wordpress.com/2012/03/01/nosql-data-
modeling-techniques/
More Information
“Thank you.”
–hold up applause sign here
‣ Email - gratzc@compknowhow.com
‣ Blog - https://ckhconsulting.com/blog/
‣ Twitter - gratzc
‣ LinkedIn - gratzc
‣ League of Legends - gratzc
Contact Info

More Related Content

Similar to ITB_2023_Relationships_are_Hard_Data_modeling_with_NoSQL_Curt_Gratz.pdf

[Mas 500] Data Basics
[Mas 500] Data Basics[Mas 500] Data Basics
[Mas 500] Data Basics
rahulbot
 
NoSQLDatabases
NoSQLDatabasesNoSQLDatabases
NoSQLDatabases
Adi Challa
 

Similar to ITB_2023_Relationships_are_Hard_Data_modeling_with_NoSQL_Curt_Gratz.pdf (20)

ArangoDB
ArangoDBArangoDB
ArangoDB
 
Sql server performance tuning
Sql server performance tuningSql server performance tuning
Sql server performance tuning
 
SQL to NoSQL: Top 6 Questions
SQL to NoSQL: Top 6 QuestionsSQL to NoSQL: Top 6 Questions
SQL to NoSQL: Top 6 Questions
 
[Mas 500] Data Basics
[Mas 500] Data Basics[Mas 500] Data Basics
[Mas 500] Data Basics
 
NoSQL: An Analysis
NoSQL: An AnalysisNoSQL: An Analysis
NoSQL: An Analysis
 
NoSQL.pptx
NoSQL.pptxNoSQL.pptx
NoSQL.pptx
 
NoSQL for great good [hanoi.rb talk]
NoSQL for great good [hanoi.rb talk]NoSQL for great good [hanoi.rb talk]
NoSQL for great good [hanoi.rb talk]
 
Big Data in the Real World
Big Data in the Real WorldBig Data in the Real World
Big Data in the Real World
 
NoSQLDatabases
NoSQLDatabasesNoSQLDatabases
NoSQLDatabases
 
Nosql
NosqlNosql
Nosql
 
Database Survival Guide: Exploratory Webcast
Database Survival Guide: Exploratory WebcastDatabase Survival Guide: Exploratory Webcast
Database Survival Guide: Exploratory Webcast
 
Nosql
NosqlNosql
Nosql
 
Big data pipelines
Big data pipelinesBig data pipelines
Big data pipelines
 
Database Creation and Management with ML
Database Creation and Management with MLDatabase Creation and Management with ML
Database Creation and Management with ML
 
Database basics
Database basicsDatabase basics
Database basics
 
Philly Code Camp 2013 Mark Kromer Big Data with SQL Server
Philly Code Camp 2013 Mark Kromer Big Data with SQL ServerPhilly Code Camp 2013 Mark Kromer Big Data with SQL Server
Philly Code Camp 2013 Mark Kromer Big Data with SQL Server
 
PSSUG Nov 2012: Big Data with SQL Server
PSSUG Nov 2012: Big Data with SQL ServerPSSUG Nov 2012: Big Data with SQL Server
PSSUG Nov 2012: Big Data with SQL Server
 
NoSQL(NOT ONLY SQL)
NoSQL(NOT ONLY SQL)NoSQL(NOT ONLY SQL)
NoSQL(NOT ONLY SQL)
 
GraphQL vs. (the) REST
GraphQL vs. (the) RESTGraphQL vs. (the) REST
GraphQL vs. (the) REST
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 

More from Ortus Solutions, Corp

More from Ortus Solutions, Corp (20)

ITB2024 - Keynote Day 1 - Ortus Solutions.pdf
ITB2024 - Keynote Day 1 - Ortus Solutions.pdfITB2024 - Keynote Day 1 - Ortus Solutions.pdf
ITB2024 - Keynote Day 1 - Ortus Solutions.pdf
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
Ortus Government.pdf
Ortus Government.pdfOrtus Government.pdf
Ortus Government.pdf
 
Luis Majano The Battlefield ORM
Luis Majano The Battlefield ORMLuis Majano The Battlefield ORM
Luis Majano The Battlefield ORM
 
Brad Wood - CommandBox CLI
Brad Wood - CommandBox CLI Brad Wood - CommandBox CLI
Brad Wood - CommandBox CLI
 
Secure your Secrets and Settings in ColdFusion
Secure your Secrets and Settings in ColdFusionSecure your Secrets and Settings in ColdFusion
Secure your Secrets and Settings in ColdFusion
 
Daniel Garcia ContentBox: CFSummit 2023
Daniel Garcia ContentBox: CFSummit 2023Daniel Garcia ContentBox: CFSummit 2023
Daniel Garcia ContentBox: CFSummit 2023
 
ITB_2023_Human-Friendly_Scheduled_Tasks_Giancarlo_Gomez.pdf
ITB_2023_Human-Friendly_Scheduled_Tasks_Giancarlo_Gomez.pdfITB_2023_Human-Friendly_Scheduled_Tasks_Giancarlo_Gomez.pdf
ITB_2023_Human-Friendly_Scheduled_Tasks_Giancarlo_Gomez.pdf
 
ITB_2023_CommandBox_Multi-Server_-_Brad_Wood.pdf
ITB_2023_CommandBox_Multi-Server_-_Brad_Wood.pdfITB_2023_CommandBox_Multi-Server_-_Brad_Wood.pdf
ITB_2023_CommandBox_Multi-Server_-_Brad_Wood.pdf
 
ITB_2023_The_Many_Layers_of_OAuth_Keith_Casey_.pdf
ITB_2023_The_Many_Layers_of_OAuth_Keith_Casey_.pdfITB_2023_The_Many_Layers_of_OAuth_Keith_Casey_.pdf
ITB_2023_The_Many_Layers_of_OAuth_Keith_Casey_.pdf
 
ITB_2023_Extend_your_contentbox_apps_with_custom_modules_Javier_Quintero.pdf
ITB_2023_Extend_your_contentbox_apps_with_custom_modules_Javier_Quintero.pdfITB_2023_Extend_your_contentbox_apps_with_custom_modules_Javier_Quintero.pdf
ITB_2023_Extend_your_contentbox_apps_with_custom_modules_Javier_Quintero.pdf
 
ITB_2023_25_Most_Dangerous_Software_Weaknesses_Pete_Freitag.pdf
ITB_2023_25_Most_Dangerous_Software_Weaknesses_Pete_Freitag.pdfITB_2023_25_Most_Dangerous_Software_Weaknesses_Pete_Freitag.pdf
ITB_2023_25_Most_Dangerous_Software_Weaknesses_Pete_Freitag.pdf
 
ITB_2023_CBWire_v3_Grant_Copley.pdf
ITB_2023_CBWire_v3_Grant_Copley.pdfITB_2023_CBWire_v3_Grant_Copley.pdf
ITB_2023_CBWire_v3_Grant_Copley.pdf
 
ITB_2023_Practical_AI_with_OpenAI_-_Grant_Copley_.pdf
ITB_2023_Practical_AI_with_OpenAI_-_Grant_Copley_.pdfITB_2023_Practical_AI_with_OpenAI_-_Grant_Copley_.pdf
ITB_2023_Practical_AI_with_OpenAI_-_Grant_Copley_.pdf
 
ITB_2023_When_Your_Applications_Work_As_a_Team_Nathaniel_Francis.pdf
ITB_2023_When_Your_Applications_Work_As_a_Team_Nathaniel_Francis.pdfITB_2023_When_Your_Applications_Work_As_a_Team_Nathaniel_Francis.pdf
ITB_2023_When_Your_Applications_Work_As_a_Team_Nathaniel_Francis.pdf
 
ITB_2023_Faster_Apps_That_Wont_Get_Crushed_Brian_Klaas.pdf
ITB_2023_Faster_Apps_That_Wont_Get_Crushed_Brian_Klaas.pdfITB_2023_Faster_Apps_That_Wont_Get_Crushed_Brian_Klaas.pdf
ITB_2023_Faster_Apps_That_Wont_Get_Crushed_Brian_Klaas.pdf
 
ITB_2023_Chatgpt_Box_Scott_Steinbeck.pdf
ITB_2023_Chatgpt_Box_Scott_Steinbeck.pdfITB_2023_Chatgpt_Box_Scott_Steinbeck.pdf
ITB_2023_Chatgpt_Box_Scott_Steinbeck.pdf
 
ITB_2023_CommandBox_Task_Runners_Brad_Wood.pdf
ITB_2023_CommandBox_Task_Runners_Brad_Wood.pdfITB_2023_CommandBox_Task_Runners_Brad_Wood.pdf
ITB_2023_CommandBox_Task_Runners_Brad_Wood.pdf
 
ITB_2023_Create_as_many_web_sites_or_web_apps_as_you_want_George_Murphy.pdf
ITB_2023_Create_as_many_web_sites_or_web_apps_as_you_want_George_Murphy.pdfITB_2023_Create_as_many_web_sites_or_web_apps_as_you_want_George_Murphy.pdf
ITB_2023_Create_as_many_web_sites_or_web_apps_as_you_want_George_Murphy.pdf
 
ITB2023 Developing for Performance - Denard Springle.pdf
ITB2023 Developing for Performance - Denard Springle.pdfITB2023 Developing for Performance - Denard Springle.pdf
ITB2023 Developing for Performance - Denard Springle.pdf
 

Recently uploaded

Recently uploaded (20)

Crafting the Perfect Measurement Sheet with PLM Integration
Crafting the Perfect Measurement Sheet with PLM IntegrationCrafting the Perfect Measurement Sheet with PLM Integration
Crafting the Perfect Measurement Sheet with PLM Integration
 
Workforce Efficiency with Employee Time Tracking Software.pdf
Workforce Efficiency with Employee Time Tracking Software.pdfWorkforce Efficiency with Employee Time Tracking Software.pdf
Workforce Efficiency with Employee Time Tracking Software.pdf
 
Lessons Learned from Building a Serverless Notifications System.pdf
Lessons Learned from Building a Serverless Notifications System.pdfLessons Learned from Building a Serverless Notifications System.pdf
Lessons Learned from Building a Serverless Notifications System.pdf
 
Tree in the Forest - Managing Details in BDD Scenarios (live2test 2024)
Tree in the Forest - Managing Details in BDD Scenarios (live2test 2024)Tree in the Forest - Managing Details in BDD Scenarios (live2test 2024)
Tree in the Forest - Managing Details in BDD Scenarios (live2test 2024)
 
IT Software Development Resume, Vaibhav jha 2024
IT Software Development Resume, Vaibhav jha 2024IT Software Development Resume, Vaibhav jha 2024
IT Software Development Resume, Vaibhav jha 2024
 
A Guideline to Zendesk to Re:amaze Data Migration
A Guideline to Zendesk to Re:amaze Data MigrationA Guideline to Zendesk to Re:amaze Data Migration
A Guideline to Zendesk to Re:amaze Data Migration
 
What is an API Development- Definition, Types, Specifications, Documentation.pdf
What is an API Development- Definition, Types, Specifications, Documentation.pdfWhat is an API Development- Definition, Types, Specifications, Documentation.pdf
What is an API Development- Definition, Types, Specifications, Documentation.pdf
 
Workshop: Enabling GenAI Breakthroughs with Knowledge Graphs - GraphSummit Milan
Workshop: Enabling GenAI Breakthroughs with Knowledge Graphs - GraphSummit MilanWorkshop: Enabling GenAI Breakthroughs with Knowledge Graphs - GraphSummit Milan
Workshop: Enabling GenAI Breakthroughs with Knowledge Graphs - GraphSummit Milan
 
Implementing KPIs and Right Metrics for Agile Delivery Teams.pdf
Implementing KPIs and Right Metrics for Agile Delivery Teams.pdfImplementing KPIs and Right Metrics for Agile Delivery Teams.pdf
Implementing KPIs and Right Metrics for Agile Delivery Teams.pdf
 
Microsoft 365 Copilot; An AI tool changing the world of work _PDF.pdf
Microsoft 365 Copilot; An AI tool changing the world of work _PDF.pdfMicrosoft 365 Copilot; An AI tool changing the world of work _PDF.pdf
Microsoft 365 Copilot; An AI tool changing the world of work _PDF.pdf
 
COMPUTER AND ITS COMPONENTS PPT.by naitik sharma Class 9th A mittal internati...
COMPUTER AND ITS COMPONENTS PPT.by naitik sharma Class 9th A mittal internati...COMPUTER AND ITS COMPONENTS PPT.by naitik sharma Class 9th A mittal internati...
COMPUTER AND ITS COMPONENTS PPT.by naitik sharma Class 9th A mittal internati...
 
Odoo vs Shopify: Why Odoo is Best for Ecommerce Website Builder in 2024
Odoo vs Shopify: Why Odoo is Best for Ecommerce Website Builder in 2024Odoo vs Shopify: Why Odoo is Best for Ecommerce Website Builder in 2024
Odoo vs Shopify: Why Odoo is Best for Ecommerce Website Builder in 2024
 
Automate your OpenSIPS config tests - OpenSIPS Summit 2024
Automate your OpenSIPS config tests - OpenSIPS Summit 2024Automate your OpenSIPS config tests - OpenSIPS Summit 2024
Automate your OpenSIPS config tests - OpenSIPS Summit 2024
 
AI Hackathon.pptx
AI                        Hackathon.pptxAI                        Hackathon.pptx
AI Hackathon.pptx
 
OpenChain Webinar: AboutCode and Beyond - End-to-End SCA
OpenChain Webinar: AboutCode and Beyond - End-to-End SCAOpenChain Webinar: AboutCode and Beyond - End-to-End SCA
OpenChain Webinar: AboutCode and Beyond - End-to-End SCA
 
The Impact of PLM Software on Fashion Production
The Impact of PLM Software on Fashion ProductionThe Impact of PLM Software on Fashion Production
The Impact of PLM Software on Fashion Production
 
The mythical technical debt. (Brooke, please, forgive me)
The mythical technical debt. (Brooke, please, forgive me)The mythical technical debt. (Brooke, please, forgive me)
The mythical technical debt. (Brooke, please, forgive me)
 
Optimizing Operations by Aligning Resources with Strategic Objectives Using O...
Optimizing Operations by Aligning Resources with Strategic Objectives Using O...Optimizing Operations by Aligning Resources with Strategic Objectives Using O...
Optimizing Operations by Aligning Resources with Strategic Objectives Using O...
 
OpenChain @ LF Japan Executive Briefing - May 2024
OpenChain @ LF Japan Executive Briefing - May 2024OpenChain @ LF Japan Executive Briefing - May 2024
OpenChain @ LF Japan Executive Briefing - May 2024
 
A Deep Dive into Secure Product Development Frameworks.pdf
A Deep Dive into Secure Product Development Frameworks.pdfA Deep Dive into Secure Product Development Frameworks.pdf
A Deep Dive into Secure Product Development Frameworks.pdf
 

ITB_2023_Relationships_are_Hard_Data_modeling_with_NoSQL_Curt_Gratz.pdf

  • 2. ‣ Why you should never use a RDBMS again ‣ Map/Reduce, Views, or other NoSQL query options ‣ Why NoSQL databases are the cats pajamas ‣ How to prove the theory of relativity What we won’t be covering
  • 3. ‣ Where NoSQL came from ‣ Some database theory ‣ When its good or not good to use NoSQL ‣ A few Key Patterns ‣ A few Data Modeling Patterns ‣ Samples and Ideas of when to use both What we will be covering
  • 4. ‣ Husband ‣ Dad ‣ Coach (CC and Track) ‣ Co-Owner of CKH Consulting ‣ Runner ‣ Polyglot The Obligatory Who Am I?
  • 6. It’s Just a Buzz Word
  • 7. What does it mean?
  • 8. Brief History ‣ Characteristics ‣ non-relational ‣ open-source (some) ‣ cluster-friendly (some) ‣ 21st Century ‣ schema less
  • 9. Brief History ‣ Types ‣ key/value ‣ MemcacheDB ‣ document ‣ Couchbase, Redis, MongoDB ‣ column ‣ Cassandra, HBase ‣ graph ‣ neo4j, InfiniteGraph
  • 10. And now for some database theory
  • 12. CAP Theorem (pick 2 please) ‣ Consistency ‣ all nodes see the same data at the same time ‣ Availability ‣ a guarantee that every client making a request for data gets a response, even if one or more nodes are down ‣ Partition Tolerance ‣ the system continues to operate despite arbitrary message loss or failure of part of the system
  • 13. ACID ‣ Atomicity ‣ All or Nothing (transactions) ‣ Consistency ‣ data is never violated (constraints. validation, etc) ‣ Isolation ‣ operations must not see data from other non-finished operations(locks) ‣ Durability ‣ state of data is persisted when operation finishes (server crash protection)
  • 14.
  • 15. BASE ‣ Basically Available ‣ Should be up, but not guaranteed ‣ Soft state ‣ changes happening even without input ‣ Eventual consistency ‣ At some point your changes will be consistent
  • 16. ‣ Less data duplication ‣ Less storage space ‣ Higher data consistency/integrity ‣ Easy to query ‣ Updates are EASY Normalisation
  • 17. Normalisation userID username groupID statusID 1 wither 1 1 2 zombie 1 1 3 slime 2 2 4 enderman 1 2 statusID status 1 active 2 inactive groupID group 1 monster 2 block reference reference
  • 18. ‣ Minimize the need for joins ‣ Reduce the number of indexes ‣ Reduce the number of relationships ‣ Updates are harder (slower) Denormalisation
  • 19. Denormalisation userID username group status 1 wither monster active 2 zombie monster active 3 slime block inactive 4 enderman monster inactive
  • 21. Keying in NoSQL ‣ Predictable Keys ‣ Counter-ID ‣ Lookup Pattern ‣ Counter/Lookup combined
  • 22. ‣ No built in key mechanisms in most NoSQL ‣ Use a Unique value for key (email, usernames, sku, isbn, etc) ‣ Users ‣ u::ender@dragon.com ‣ u::enderdragon ‣ Products ‣ p::231-0321404314123 (isbn) ‣ Unpredictable Keys (GUID, UUID, etc) require Views (Map- Reduce indexes) to find their documents Predictable Keys
  • 23. ‣ Similar to IDENTITY column in RDBMS ‣ Creating New Documents is a pair of operations in your application, increment and add ‣ initialize one Key as an Atomic Counter (on app start) ‣ increment counter and save new value ‣ id = bucket. counter(“blog::slug::comment_count”, {delta: 1}) ‣ Use the id as component of the key for the new document ‣ bucket.upsert(“blog::slug::c” + id, comment_data) Counter-ID Keys
  • 24. ‣ Create simple documents that has referential data (Key) to primary documents ‣ Primary Document u::abc123-12345-4312 ‣ Lookup Document u::iron@golem.com = u::abc123-12345-4312 ‣ Lookup documents aren’t JSON, they should be the key as a string so you skip the JSON parsing ‣ Requires two get operations, first the lookup doc, then the primary doc ‣ key = bucket.get(“u::iron@golem.com”) ‣ doc = bucket.get(key) Lookup Pattern
  • 25. RDBMS vs NoSQL Gets slower with growth
  • 27. ‣ Embed copies of frequently accessed related objects ‣ Users friends into the users profile “friendlist” ‣ Allows for quick fetch of users friendlist and all friend profiles in one read ‣ Denormalized Embedding
  • 28. ‣ Can cause data duplication ‣ Updates are painful ‣ Easy to become inconsistent Embedding
  • 29. ‣ user.friendsList = [ {username: “han”, firstName: “Han”, lastName:”Solo”}, {username: “leia”,firstName:”Leia”, lastName:”Solo”}] Embedding
  • 30. ‣ user.friendsList = [ “han”, “leia”] ‣ If you use linking you have to get the linked data yourself ‣ The DB Driver doesn’t do it for you, you can delete a link without the reference deleting Linking
  • 31. ‣ Embedding when ‣ there are no or few updates of the data ‣ caching ‣ when time-to-live is very restricted ‣ Otherwise link the related data ‣ Partial embedding is also possible Embedding vs Linking
  • 32. ‣ Logging data ‣ Lots of records ‣ Little need to access all the children Parent-Referencing
  • 33. ‣ One - To - Few = Embedding ‣ One - To - Many = Linking ‣ One - To - Squillions = Parent Referencing Rule of Thumb
  • 35. ‣ In SQL we tend to want to avoid hitting the database as much as possible ‣ we know that its costly when tying up connection pools and overloading db servers ‣ even with indexing SQL still gets bogged down by complex joins and huge indexes ‣ In noSQL gets and sets ARE FAST, and not a bottleneck, this is hard for many people to accept and absorb at first Making a mental shift
  • 36. Gets and Sets are fast…
  • 37. Gets and Sets are fast…
  • 38. Gets and Sets are fast…
  • 39. Gets and Sets are fast…
  • 40. ‣ Denormalization takes getting used to ‣ Data integrity ‣ Schema changes are easy Making a mental shift
  • 41. noSQL is Web Scale
  • 42. ‣ Big Data ‣ Impedance mismatch ‣ Frequently Changing schema 3 Big reasons for a NoSQL DB
  • 43. ‣ Document DB where it makes sense ‣ Relational DB where it makes sense ‣ Graph DB where it makes sense Polyglot Persistence
  • 45. ‣ Store order history (including price at the time of order) ‣ Keep product inventory ‣ Know the number of orders per product ‣ Know the number of orders per customer ‣ Human readable category URLs Specs
  • 46. ‣ Email as key ‣ Embedded addresses ‣ Orders linking Customer
  • 47. ‣ slug as key ‣ Products count ‣ Orders count ‣ Products parent referencing Category
  • 48. ‣ SKU as key ‣ Orders count ‣ Inventory ‣ Categories linking Product
  • 49. ‣ User Session ID as key ‣ Products Count ‣ Total ‣ Products embedded (partially) ‣ Status (active, completed, expired) Cart
  • 50. ‣ Order Counter for Key ‣ Embed customer info (partially) ‣ Embed product info (partially) ‣ Totals Order
  • 51. ‣ Set cart status to completed ‣ decrement inventory, adjust “inCart” ‣ create/update customer record ‣ add order counter ID to customer orders ‣ Update order counts on category and product Order
  • 52. ‣ Introduction to NoSQL (Martin Folwer) ‣ https://www.youtube.com/watch?v=qI_g07C_Q5I ‣ Data modeling in key/value and document database ‣ http://openslava.sk/2014/Presentations/Data-modeling-in- KeyValue-and-Document-Stores_Philipp-Fehre.pdf ‣ NoSQL data modeling techniques ‣ https://highlyscalable.wordpress.com/2012/03/01/nosql-data- modeling-techniques/ More Information
  • 53. “Thank you.” –hold up applause sign here
  • 54. ‣ Email - gratzc@compknowhow.com ‣ Blog - https://ckhconsulting.com/blog/ ‣ Twitter - gratzc ‣ LinkedIn - gratzc ‣ League of Legends - gratzc Contact Info