SlideShare a Scribd company logo
1 of 33
Data Modeling for NoSQL

        Tony Tam
        @fehguy
Data Modeling?




  Smart
 Modeling
  makes
NoSQL work
Why Modeling Matters

• NoSQL => no joins
• What replaces joins?
 •   Hierarchy
 •   Duplication of data
 •   Different models for querying, indexing
• Your optimal data model is (probably) very
  different than with relational
 •   Simpler
 •   More like you develop
Stop Thinking Like This!




   endless
  layers of
 abstraction
(and misery)
Hierarchy before NoSQL

• Simple User Model
Hierarchy before NoSQL

• Tuned Queries
 •   Write some brittle SQL:
     •   “select user.id, … inner join settings on …
 •   Pick out the fields and construct object hierarchy
     (this gets nasty, fast)
     •   (outer joins for optional values?)
• Object fetching
 •   Queries follow object graph, PK/FK
 •   5 queries to fetch object in this example
Hierarchy before NoSQL
Hierarchy with NoSQL

• JSON structure mapped to objects
 •   Fetch json from MongoDB**
 •   Unmarshall into objects/tuples
 •   Use it




                  Using JSON4S
Hierarchy with NoSQL

              Focus on
                 your
             Software, n
             ot DB layer!
Hierarchy with NoSQL

• Write operations
 •   Atomic upsert (create, update or fail)




 •   Saves all levels of object atomically
 •   Reduces need for transactions
Hierarchy with NoSQL

  • Write operations
    •   Atomic upsert (create, update or fail)




    •   Saves all levels of object atomically
    •   Reduces need for transactions
                                                Convenienc
                                                e not magic
 All or
nothing
Unique Identifiers in your Data

• Relational design => PK/FK
 •   Often not “meaningful” identifiers for data


• User Data Model
Unique Identifiers in your Data

• Relational design => PK/FK
 •   Often not “meaningful” identifiers for data
                                  Unique by
• User Data Model                 username
Unique Identifiers in your Data

• Words      Ensured to
             be constant
Data Duplication

• Without Joins, what about SQL lookup
  tables?
 •   Duplication of data in NoSQL is required
• Trade storage for speed
Data Duplication
                                   …Can
•   Without Joins, what about SQL lookup
                                move logic
    tables?                        to app
    •   Duplication of data in NoSQL is required
• Trade storage for speed
Data Duplication

• Many fields don’t change, ever
• But… many do
 •   New decisions for the developer!
 •   Often background updates
Data Duplication
                                 How often
•   Many fields don’t change, ever does this
•   But… many do
                                   change?
    •   New decisions for the developer!
    •   Often background updates
Data Duplication
Reaching into Objects

• Incredible feature of MongoDB
 •   Dot syntax safely** traverses the object graph
Inner Indexes

• Convenience at a cost
 •   No index => table scan
 •   No value? => table scan
 •   No child value? => table scan
• Table scan with big collection?
• Can’t index everything!
  96GB of
 Indexes?
Inner Indexes

• This will should drive your Data Model
• Sparse Data test


                            Even with only
                              2000 non-
                            empty values!
Adding & Modifying

• Append in mongo is blazing fast
 •   “tail” of data is always in memory
 •   Pre-allocated data files


• Main expense is “index maintenance”
 •   Some marshalling/unmarshalling cost**


• Modifying? Object growth
 •   Pre-allocation of space built in collection design
Adding & Modifying

• Each object has allocated space
 •   Exceed that space, need to relocate object
 •   Leaves “hole” in collection
• Large increases to documents hurts your
  overall performance
• Your data model should strive for equally-
  sized objects as much as possible
Retrieval

• Many same rules apply as relational
• Indexes
 •   complex/inner or not
 •   Indexes in RAM? Yes
 •   Cardinality matters
• New(ish) considerations
 •   Complex hierarchy not free
     •   Marshalling  unmarshalling
Marshalling & Unmarshalling

                  Object
                 complexit
                    y
Records/sec
Marshalling & Unmarshalling

• All you can eat from your Data Model?
• Techniques have tremendous impact
 •   Development ease until it matters
 •   50% speed bump with manual mapping


                 Only demand
                 what you can
                  consume!
Making the most of _id

• Indexes matter
• Tailor your _id to be meaningful by access
  pattern
 •   It’s your first defense when auto-sharding
• Date-driven data?
 •   Monotonically _id value



 •   Ensures recent data is “hot”
Making the most of _id

• Other time-based data techniques

• Flexibility in querying
Making the most of _id

• Other time-based data techniques

• Flexibility in querying

    Case-
  sensitive
  REGEX is
   your pal
Making the most of _id

• Hot indexes are happy indexes
 •   Access should strive for right bias
• Random access with large indexes hit disk
                                           1
                                           7




                                               1   2
                                               5   7
Your Data Model

• NoSQL gets you started faster
• Many relational pain points are gone
• New considerations (easier?)
• Migration should be real effort
• Designed by access patterns over object
  structure
• Don’t prematurely optimize, but know
  where the knobs are
More Reading

• http://tech.wordnik.com
• http://github.com/wordnik/wordnik-oss
• http://developer.wordnik.com
• http://slideshare.net/fehguy

More Related Content

What's hot

Relational databases vs Non-relational databases
Relational databases vs Non-relational databasesRelational databases vs Non-relational databases
Relational databases vs Non-relational databasesJames Serra
 
Snowflake Architecture.pptx
Snowflake Architecture.pptxSnowflake Architecture.pptx
Snowflake Architecture.pptxchennakesava44
 
Data warehouse project on retail store
Data warehouse project on retail storeData warehouse project on retail store
Data warehouse project on retail storeSiddharth Chaudhary
 
Modeling with Document Database: 5 Key Patterns
Modeling with Document Database: 5 Key PatternsModeling with Document Database: 5 Key Patterns
Modeling with Document Database: 5 Key PatternsDan Sullivan, Ph.D.
 
DAX and Power BI Training - 002 DAX Level 1 - 3
DAX and Power BI Training - 002 DAX Level 1 - 3DAX and Power BI Training - 002 DAX Level 1 - 3
DAX and Power BI Training - 002 DAX Level 1 - 3Will Harvey
 
Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data EngineeringHadi Fadlallah
 
Mongodb - NoSql Database
Mongodb - NoSql DatabaseMongodb - NoSql Database
Mongodb - NoSql DatabasePrashant Gupta
 
Data modeling star schema
Data modeling star schemaData modeling star schema
Data modeling star schemaSayed Ahmed
 
Building an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureBuilding an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureJames Serra
 
Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)James Serra
 
What is NoSQL and CAP Theorem
What is NoSQL and CAP TheoremWhat is NoSQL and CAP Theorem
What is NoSQL and CAP TheoremRahul Jain
 

What's hot (20)

Relational databases vs Non-relational databases
Relational databases vs Non-relational databasesRelational databases vs Non-relational databases
Relational databases vs Non-relational databases
 
Snowflake Architecture.pptx
Snowflake Architecture.pptxSnowflake Architecture.pptx
Snowflake Architecture.pptx
 
Tableau
TableauTableau
Tableau
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Data cubes
Data cubesData cubes
Data cubes
 
Data warehouse project on retail store
Data warehouse project on retail storeData warehouse project on retail store
Data warehouse project on retail store
 
NOSQL vs SQL
NOSQL vs SQLNOSQL vs SQL
NOSQL vs SQL
 
Modeling with Document Database: 5 Key Patterns
Modeling with Document Database: 5 Key PatternsModeling with Document Database: 5 Key Patterns
Modeling with Document Database: 5 Key Patterns
 
OLAP v/s OLTP
OLAP v/s OLTPOLAP v/s OLTP
OLAP v/s OLTP
 
DAX and Power BI Training - 002 DAX Level 1 - 3
DAX and Power BI Training - 002 DAX Level 1 - 3DAX and Power BI Training - 002 DAX Level 1 - 3
DAX and Power BI Training - 002 DAX Level 1 - 3
 
Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data Engineering
 
Mongodb - NoSql Database
Mongodb - NoSql DatabaseMongodb - NoSql Database
Mongodb - NoSql Database
 
NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
 
RDBMS vs NoSQL
RDBMS vs NoSQLRDBMS vs NoSQL
RDBMS vs NoSQL
 
Data modeling star schema
Data modeling star schemaData modeling star schema
Data modeling star schema
 
Dimensional Modelling
Dimensional ModellingDimensional Modelling
Dimensional Modelling
 
Building an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureBuilding an Effective Data Warehouse Architecture
Building an Effective Data Warehouse Architecture
 
Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)
 
What is NoSQL and CAP Theorem
What is NoSQL and CAP TheoremWhat is NoSQL and CAP Theorem
What is NoSQL and CAP Theorem
 
Dax & sql in power bi
Dax & sql in power biDax & sql in power bi
Dax & sql in power bi
 

Similar to Data Modeling for NoSQL

A Case Study of NoSQL Adoption: What Drove Wordnik Non-Relational?
A Case Study of NoSQL Adoption: What Drove Wordnik Non-Relational?A Case Study of NoSQL Adoption: What Drove Wordnik Non-Relational?
A Case Study of NoSQL Adoption: What Drove Wordnik Non-Relational?DATAVERSITY
 
Build a modern data platform.pptx
Build a modern data platform.pptxBuild a modern data platform.pptx
Build a modern data platform.pptxIke Ellis
 
Using JPA applications in the era of NoSQL: Introducing Hibernate OGM
Using JPA applications in the era of NoSQL: Introducing Hibernate OGMUsing JPA applications in the era of NoSQL: Introducing Hibernate OGM
Using JPA applications in the era of NoSQL: Introducing Hibernate OGMPT.JUG
 
Sebastian Cohnen – Building a Startup with NoSQL - NoSQL matters Barcelona 2014
Sebastian Cohnen – Building a Startup with NoSQL - NoSQL matters Barcelona 2014Sebastian Cohnen – Building a Startup with NoSQL - NoSQL matters Barcelona 2014
Sebastian Cohnen – Building a Startup with NoSQL - NoSQL matters Barcelona 2014NoSQLmatters
 
What Drove Wordnik Non-Relational?
What Drove Wordnik Non-Relational?What Drove Wordnik Non-Relational?
What Drove Wordnik Non-Relational?DATAVERSITY
 
Running MongoDB in the Cloud
Running MongoDB in the CloudRunning MongoDB in the Cloud
Running MongoDB in the CloudTony Tam
 
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...Lucas Jellema
 
Data modeling trends for analytics
Data modeling trends for analyticsData modeling trends for analytics
Data modeling trends for analyticsIke Ellis
 
Inside Wordnik's Architecture
Inside Wordnik's ArchitectureInside Wordnik's Architecture
Inside Wordnik's ArchitectureTony Tam
 
PostgreSQL, your NoSQL database
PostgreSQL, your NoSQL databasePostgreSQL, your NoSQL database
PostgreSQL, your NoSQL databaseReuven Lerner
 
Why we love ArangoDB. The hunt for the right NosQL Database
Why we love ArangoDB. The hunt for the right NosQL DatabaseWhy we love ArangoDB. The hunt for the right NosQL Database
Why we love ArangoDB. The hunt for the right NosQL DatabaseAndreas Jung
 
Solr cloud the 'search first' nosql database extended deep dive
Solr cloud the 'search first' nosql database   extended deep diveSolr cloud the 'search first' nosql database   extended deep dive
Solr cloud the 'search first' nosql database extended deep divelucenerevolution
 
Demystifying data engineering
Demystifying data engineeringDemystifying data engineering
Demystifying data engineeringThang Bui (Bob)
 
Store
StoreStore
StoreESUG
 
Introducing Hibernate OGM: porting JPA applications to NoSQL, Sanne Grinovero...
Introducing Hibernate OGM: porting JPA applications to NoSQL, Sanne Grinovero...Introducing Hibernate OGM: porting JPA applications to NoSQL, Sanne Grinovero...
Introducing Hibernate OGM: porting JPA applications to NoSQL, Sanne Grinovero...OpenBlend society
 
Data Ingestion Engine
Data Ingestion EngineData Ingestion Engine
Data Ingestion EngineAdam Doyle
 
SeaJUG May 2012 mybatis
SeaJUG May 2012 mybatisSeaJUG May 2012 mybatis
SeaJUG May 2012 mybatisWill Iverson
 

Similar to Data Modeling for NoSQL (20)

A Case Study of NoSQL Adoption: What Drove Wordnik Non-Relational?
A Case Study of NoSQL Adoption: What Drove Wordnik Non-Relational?A Case Study of NoSQL Adoption: What Drove Wordnik Non-Relational?
A Case Study of NoSQL Adoption: What Drove Wordnik Non-Relational?
 
Build a modern data platform.pptx
Build a modern data platform.pptxBuild a modern data platform.pptx
Build a modern data platform.pptx
 
Using JPA applications in the era of NoSQL: Introducing Hibernate OGM
Using JPA applications in the era of NoSQL: Introducing Hibernate OGMUsing JPA applications in the era of NoSQL: Introducing Hibernate OGM
Using JPA applications in the era of NoSQL: Introducing Hibernate OGM
 
Sebastian Cohnen – Building a Startup with NoSQL - NoSQL matters Barcelona 2014
Sebastian Cohnen – Building a Startup with NoSQL - NoSQL matters Barcelona 2014Sebastian Cohnen – Building a Startup with NoSQL - NoSQL matters Barcelona 2014
Sebastian Cohnen – Building a Startup with NoSQL - NoSQL matters Barcelona 2014
 
What Drove Wordnik Non-Relational?
What Drove Wordnik Non-Relational?What Drove Wordnik Non-Relational?
What Drove Wordnik Non-Relational?
 
Running MongoDB in the Cloud
Running MongoDB in the CloudRunning MongoDB in the Cloud
Running MongoDB in the Cloud
 
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
 
Data modeling trends for analytics
Data modeling trends for analyticsData modeling trends for analytics
Data modeling trends for analytics
 
MongoDB
MongoDBMongoDB
MongoDB
 
Inside Wordnik's Architecture
Inside Wordnik's ArchitectureInside Wordnik's Architecture
Inside Wordnik's Architecture
 
PostgreSQL, your NoSQL database
PostgreSQL, your NoSQL databasePostgreSQL, your NoSQL database
PostgreSQL, your NoSQL database
 
Why we love ArangoDB. The hunt for the right NosQL Database
Why we love ArangoDB. The hunt for the right NosQL DatabaseWhy we love ArangoDB. The hunt for the right NosQL Database
Why we love ArangoDB. The hunt for the right NosQL Database
 
NOsql Presentation.pdf
NOsql Presentation.pdfNOsql Presentation.pdf
NOsql Presentation.pdf
 
Solr cloud the 'search first' nosql database extended deep dive
Solr cloud the 'search first' nosql database   extended deep diveSolr cloud the 'search first' nosql database   extended deep dive
Solr cloud the 'search first' nosql database extended deep dive
 
Why ruby and rails
Why ruby and railsWhy ruby and rails
Why ruby and rails
 
Demystifying data engineering
Demystifying data engineeringDemystifying data engineering
Demystifying data engineering
 
Store
StoreStore
Store
 
Introducing Hibernate OGM: porting JPA applications to NoSQL, Sanne Grinovero...
Introducing Hibernate OGM: porting JPA applications to NoSQL, Sanne Grinovero...Introducing Hibernate OGM: porting JPA applications to NoSQL, Sanne Grinovero...
Introducing Hibernate OGM: porting JPA applications to NoSQL, Sanne Grinovero...
 
Data Ingestion Engine
Data Ingestion EngineData Ingestion Engine
Data Ingestion Engine
 
SeaJUG May 2012 mybatis
SeaJUG May 2012 mybatisSeaJUG May 2012 mybatis
SeaJUG May 2012 mybatis
 

More from Tony Tam

A Tasty deep-dive into Open API Specification Links
A Tasty deep-dive into Open API Specification LinksA Tasty deep-dive into Open API Specification Links
A Tasty deep-dive into Open API Specification LinksTony Tam
 
API Design first with Swagger
API Design first with SwaggerAPI Design first with Swagger
API Design first with SwaggerTony Tam
 
Developing Faster with Swagger
Developing Faster with SwaggerDeveloping Faster with Swagger
Developing Faster with SwaggerTony Tam
 
Writer APIs in Java faster with Swagger Inflector
Writer APIs in Java faster with Swagger InflectorWriter APIs in Java faster with Swagger Inflector
Writer APIs in Java faster with Swagger InflectorTony Tam
 
Fastest to Mobile with Scalatra + Swagger
Fastest to Mobile with Scalatra + SwaggerFastest to Mobile with Scalatra + Swagger
Fastest to Mobile with Scalatra + SwaggerTony Tam
 
Swagger APIs for Humans and Robots (Gluecon)
Swagger APIs for Humans and Robots (Gluecon)Swagger APIs for Humans and Robots (Gluecon)
Swagger APIs for Humans and Robots (Gluecon)Tony Tam
 
Love your API with Swagger (Gluecon lightning talk)
Love your API with Swagger (Gluecon lightning talk)Love your API with Swagger (Gluecon lightning talk)
Love your API with Swagger (Gluecon lightning talk)Tony Tam
 
Swagger for-your-api
Swagger for-your-apiSwagger for-your-api
Swagger for-your-apiTony Tam
 
Swagger for startups
Swagger for startupsSwagger for startups
Swagger for startupsTony Tam
 
System insight without Interference
System insight without InterferenceSystem insight without Interference
System insight without InterferenceTony Tam
 
Keeping MongoDB Data Safe
Keeping MongoDB Data SafeKeeping MongoDB Data Safe
Keeping MongoDB Data SafeTony Tam
 
Scaling with swagger
Scaling with swaggerScaling with swagger
Scaling with swaggerTony Tam
 
Scala & Swagger at Wordnik
Scala & Swagger at WordnikScala & Swagger at Wordnik
Scala & Swagger at WordnikTony Tam
 
Introducing Swagger
Introducing SwaggerIntroducing Swagger
Introducing SwaggerTony Tam
 
Why Wordnik went non-relational
Why Wordnik went non-relationalWhy Wordnik went non-relational
Why Wordnik went non-relationalTony Tam
 
Building a Directed Graph with MongoDB
Building a Directed Graph with MongoDBBuilding a Directed Graph with MongoDB
Building a Directed Graph with MongoDBTony Tam
 
Managing a MongoDB Deployment
Managing a MongoDB DeploymentManaging a MongoDB Deployment
Managing a MongoDB DeploymentTony Tam
 
Keeping the Lights On with MongoDB
Keeping the Lights On with MongoDBKeeping the Lights On with MongoDB
Keeping the Lights On with MongoDBTony Tam
 
Migrating from MySQL to MongoDB at Wordnik
Migrating from MySQL to MongoDB at WordnikMigrating from MySQL to MongoDB at Wordnik
Migrating from MySQL to MongoDB at WordnikTony Tam
 

More from Tony Tam (19)

A Tasty deep-dive into Open API Specification Links
A Tasty deep-dive into Open API Specification LinksA Tasty deep-dive into Open API Specification Links
A Tasty deep-dive into Open API Specification Links
 
API Design first with Swagger
API Design first with SwaggerAPI Design first with Swagger
API Design first with Swagger
 
Developing Faster with Swagger
Developing Faster with SwaggerDeveloping Faster with Swagger
Developing Faster with Swagger
 
Writer APIs in Java faster with Swagger Inflector
Writer APIs in Java faster with Swagger InflectorWriter APIs in Java faster with Swagger Inflector
Writer APIs in Java faster with Swagger Inflector
 
Fastest to Mobile with Scalatra + Swagger
Fastest to Mobile with Scalatra + SwaggerFastest to Mobile with Scalatra + Swagger
Fastest to Mobile with Scalatra + Swagger
 
Swagger APIs for Humans and Robots (Gluecon)
Swagger APIs for Humans and Robots (Gluecon)Swagger APIs for Humans and Robots (Gluecon)
Swagger APIs for Humans and Robots (Gluecon)
 
Love your API with Swagger (Gluecon lightning talk)
Love your API with Swagger (Gluecon lightning talk)Love your API with Swagger (Gluecon lightning talk)
Love your API with Swagger (Gluecon lightning talk)
 
Swagger for-your-api
Swagger for-your-apiSwagger for-your-api
Swagger for-your-api
 
Swagger for startups
Swagger for startupsSwagger for startups
Swagger for startups
 
System insight without Interference
System insight without InterferenceSystem insight without Interference
System insight without Interference
 
Keeping MongoDB Data Safe
Keeping MongoDB Data SafeKeeping MongoDB Data Safe
Keeping MongoDB Data Safe
 
Scaling with swagger
Scaling with swaggerScaling with swagger
Scaling with swagger
 
Scala & Swagger at Wordnik
Scala & Swagger at WordnikScala & Swagger at Wordnik
Scala & Swagger at Wordnik
 
Introducing Swagger
Introducing SwaggerIntroducing Swagger
Introducing Swagger
 
Why Wordnik went non-relational
Why Wordnik went non-relationalWhy Wordnik went non-relational
Why Wordnik went non-relational
 
Building a Directed Graph with MongoDB
Building a Directed Graph with MongoDBBuilding a Directed Graph with MongoDB
Building a Directed Graph with MongoDB
 
Managing a MongoDB Deployment
Managing a MongoDB DeploymentManaging a MongoDB Deployment
Managing a MongoDB Deployment
 
Keeping the Lights On with MongoDB
Keeping the Lights On with MongoDBKeeping the Lights On with MongoDB
Keeping the Lights On with MongoDB
 
Migrating from MySQL to MongoDB at Wordnik
Migrating from MySQL to MongoDB at WordnikMigrating from MySQL to MongoDB at Wordnik
Migrating from MySQL to MongoDB at Wordnik
 

Recently uploaded

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 

Recently uploaded (20)

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 

Data Modeling for NoSQL

  • 1. Data Modeling for NoSQL Tony Tam @fehguy
  • 2. Data Modeling? Smart Modeling makes NoSQL work
  • 3. Why Modeling Matters • NoSQL => no joins • What replaces joins? • Hierarchy • Duplication of data • Different models for querying, indexing • Your optimal data model is (probably) very different than with relational • Simpler • More like you develop
  • 4. Stop Thinking Like This! endless layers of abstraction (and misery)
  • 5. Hierarchy before NoSQL • Simple User Model
  • 6. Hierarchy before NoSQL • Tuned Queries • Write some brittle SQL: • “select user.id, … inner join settings on … • Pick out the fields and construct object hierarchy (this gets nasty, fast) • (outer joins for optional values?) • Object fetching • Queries follow object graph, PK/FK • 5 queries to fetch object in this example
  • 8. Hierarchy with NoSQL • JSON structure mapped to objects • Fetch json from MongoDB** • Unmarshall into objects/tuples • Use it Using JSON4S
  • 9. Hierarchy with NoSQL Focus on your Software, n ot DB layer!
  • 10. Hierarchy with NoSQL • Write operations • Atomic upsert (create, update or fail) • Saves all levels of object atomically • Reduces need for transactions
  • 11. Hierarchy with NoSQL • Write operations • Atomic upsert (create, update or fail) • Saves all levels of object atomically • Reduces need for transactions Convenienc e not magic All or nothing
  • 12. Unique Identifiers in your Data • Relational design => PK/FK • Often not “meaningful” identifiers for data • User Data Model
  • 13. Unique Identifiers in your Data • Relational design => PK/FK • Often not “meaningful” identifiers for data Unique by • User Data Model username
  • 14. Unique Identifiers in your Data • Words Ensured to be constant
  • 15. Data Duplication • Without Joins, what about SQL lookup tables? • Duplication of data in NoSQL is required • Trade storage for speed
  • 16. Data Duplication …Can • Without Joins, what about SQL lookup move logic tables? to app • Duplication of data in NoSQL is required • Trade storage for speed
  • 17. Data Duplication • Many fields don’t change, ever • But… many do • New decisions for the developer! • Often background updates
  • 18. Data Duplication How often • Many fields don’t change, ever does this • But… many do change? • New decisions for the developer! • Often background updates
  • 20. Reaching into Objects • Incredible feature of MongoDB • Dot syntax safely** traverses the object graph
  • 21. Inner Indexes • Convenience at a cost • No index => table scan • No value? => table scan • No child value? => table scan • Table scan with big collection? • Can’t index everything! 96GB of Indexes?
  • 22. Inner Indexes • This will should drive your Data Model • Sparse Data test Even with only 2000 non- empty values!
  • 23. Adding & Modifying • Append in mongo is blazing fast • “tail” of data is always in memory • Pre-allocated data files • Main expense is “index maintenance” • Some marshalling/unmarshalling cost** • Modifying? Object growth • Pre-allocation of space built in collection design
  • 24. Adding & Modifying • Each object has allocated space • Exceed that space, need to relocate object • Leaves “hole” in collection • Large increases to documents hurts your overall performance • Your data model should strive for equally- sized objects as much as possible
  • 25. Retrieval • Many same rules apply as relational • Indexes • complex/inner or not • Indexes in RAM? Yes • Cardinality matters • New(ish) considerations • Complex hierarchy not free • Marshalling  unmarshalling
  • 26. Marshalling & Unmarshalling Object complexit y Records/sec
  • 27. Marshalling & Unmarshalling • All you can eat from your Data Model? • Techniques have tremendous impact • Development ease until it matters • 50% speed bump with manual mapping Only demand what you can consume!
  • 28. Making the most of _id • Indexes matter • Tailor your _id to be meaningful by access pattern • It’s your first defense when auto-sharding • Date-driven data? • Monotonically _id value • Ensures recent data is “hot”
  • 29. Making the most of _id • Other time-based data techniques • Flexibility in querying
  • 30. Making the most of _id • Other time-based data techniques • Flexibility in querying Case- sensitive REGEX is your pal
  • 31. Making the most of _id • Hot indexes are happy indexes • Access should strive for right bias • Random access with large indexes hit disk 1 7 1 2 5 7
  • 32. Your Data Model • NoSQL gets you started faster • Many relational pain points are gone • New considerations (easier?) • Migration should be real effort • Designed by access patterns over object structure • Don’t prematurely optimize, but know where the knobs are
  • 33. More Reading • http://tech.wordnik.com • http://github.com/wordnik/wordnik-oss • http://developer.wordnik.com • http://slideshare.net/fehguy

Editor's Notes

  1. Go ½ way, won’t get the promised results. You’ll look silly
  2. Order history won’t change, but customer will
  3. Order history won’t change, but customer will
  4. 150mb index is not efficient