SlideShare a Scribd company logo
NoSE: Schema Design for
NoSQL Applications
Michael J. Mior, Kenneth Salem, Ashraf Aboulnaga, Rui Liu
NoSE
● NoSQL App Development
● Problem Formulation
● NoSE Design and Implementation
● Evaluation
NoSQL
● Eventually consistent, horizontally scalable, flexible schema
● Many different types of NoSQL databases
○ Document stores
○ Key-value stores
○ Graph databases
○ …
○ Extensible record stores
Extensible Record Store Data Model
CREATE COLUMNFAMILY "ReservationsByGuest"(
"GuestID" uuid, "ResID" uuid,
"ResStartDate" timestamp,
"RoomID" uuid, PRIMARY KEY(("GuestID"),
"ResStartDate", "ResID", "RoomID")
);
Partitioning key
Clustering key
Database Application Development
1. Define application requirements
2. Decide on a data model for the target system
3. Implement the application according to the model
a. Database access
b. Application logic
Database Application Development
1. Define application requirements
2. Decide on a data model for the target system
3. Implement the application according to the model
a. Database access
b. Application logic
}NoSE
Schema Design Best Practices
Source: http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/
Schema Design Best Practices
Model column families around query patterns
Source: http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/
Schema Design Best Practices
Model column families around query patterns
But start your design with entities and relationships, if you can
Source: http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/
Schema Design Best Practices
Model column families around query patterns
But start your design with entities and relationships, if you can
De-normalize and duplicate for read performance
Source: http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/
Schema Design Best Practices
Model column families around query patterns
But start your design with entities and relationships, if you can
De-normalize and duplicate for read performance
But don’t de-normalize if you don’t need to
Source: http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/
Schema Design Best Practices
Model column families around query patterns
But start your design with entities and relationships, if you can
De-normalize and duplicate for read performance
But don’t de-normalize if you don’t need to
Leverage wide rows for ordering, grouping, and filtering
Source: http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/
Schema Design Best Practices
Model column families around query patterns
But start your design with entities and relationships, if you can
De-normalize and duplicate for read performance
But don’t de-normalize if you don’t need to
Leverage wide rows for ordering, grouping, and filtering
But don’t go too wide
Source: http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/
Schema Design Example
For a given guest, return the cities that guest has stayed in
CREATE COLUMNFAMILY "CitiesByGuest" ("GuestID" uuid,
"City" text, PRIMARY KEY(("GuestID"), "City"));
CREATE COLUMNFAMILY "HotelsByGuest" ("GuestID" uuid,
"HotelID" uuid, PRIMARY KEY(("GuestID"), "HotelID"));
CREATE COLUMNFAMILY "HotelsByID" ("HotelID" uuid,
"HotelCity" text, PRIMARY KEY(("HotelID"), "HotelCity"));
NoSE Overview
Input Output
Conceptual
schema
Workload
Selected column
families
Query
implementation
plans
NoSE
Application Conceptual Model
Hotel
HotelID
HotelName
HotelPhone
HotelAddress
HotelCity
HotelState
HotelZip
Room
RoomID
RoomNumber
RoomRate
RoomFloor
Reservation
ResID
ResStartDate
ResEndDate
Guest
GuestID
GuestName
GuestEmail
Point of Interest
POIID
POIName
POIDescription
Amenity
AmenityID
AmenityDescription
Application Workload
For a given guest, return the cities that guest has stayed in
SELECT Hotel.HotelCity FROM Hotel.Room.Reservation.Guest
WHERE Guest.GuestID = ?
Hotel
HotelID
HotelCity
Room
RoomID
Reservation
ResID
Guest
GuestID
NoSE Architecture
NoSE
Input OutputCandidate
Enumeration
Query Planning
Schema
Optimization
Plan
Recommendation
Conceptual
schema
Workload
Selected column
families
Query
implementation plans
Query Planning Example
SELECT Name FROM Hotel WHERE Hotel.State = ‘NY’ AND
Hotel.Reservation.Room.Guest.GuestID = ? ORDER BY Name
GuestID
↓
RoomID
RoomID
↓
HotelID
HotelID
↓
Name, State
Name
State
Schema Optimization
Construct a linear program to optimize execution time
Cost of using column family j to answer query i
Use of column family j for query i in the final plan
Presence of column family j in final schema
Size of column family j
Schema Optimization
Add constraints to ensure each query has a valid plan
Minimize the cost
Ensure column families used are present
Limit maximum storage space
Updates
● Updates make denormalization more expensive
● Add statements to update conceptual entities
● New column families are added to support updates
● Costs for updates are added to the linear program
Evaluation
● Application defined by the RUBiS online auction benchmark
● Generate a schema and query plans recommended by NoSE
● Two schemas for comparison
○ Normalized (as much as possible)
○ Expert-selected
Evaluation - Schema Performance
Conclusion
● NoSE automates schema design for NoSQL applications
● Conforms to best practices without requiring expertise
● Schemas are better than those produced manually with an
average of 1.8x and up to 125x performance improvement
Questions?
git.io/nose-icde

More Related Content

Viewers also liked

NoSQL and Data Modeling for Data Modelers
NoSQL and Data Modeling for Data ModelersNoSQL and Data Modeling for Data Modelers
NoSQL and Data Modeling for Data Modelers
Karen Lopez
 
Software Developer and Architecture @ LinkedIn (QCon SF 2014)
Software Developer and Architecture @ LinkedIn (QCon SF 2014)Software Developer and Architecture @ LinkedIn (QCon SF 2014)
Software Developer and Architecture @ LinkedIn (QCon SF 2014)
Sid Anand
 
Operational Analytics Using Spark and NoSQL Data Stores
Operational Analytics Using Spark and NoSQL Data StoresOperational Analytics Using Spark and NoSQL Data Stores
Operational Analytics Using Spark and NoSQL Data Stores
DATAVERSITY
 
Non-Relational Databases & Key/Value Stores
Non-Relational Databases & Key/Value StoresNon-Relational Databases & Key/Value Stores
Non-Relational Databases & Key/Value Stores
Joël Perras
 
Persistence Smoothie: Blending SQL and NoSQL (RubyNation Edition)
Persistence  Smoothie: Blending SQL and NoSQL (RubyNation Edition)Persistence  Smoothie: Blending SQL and NoSQL (RubyNation Edition)
Persistence Smoothie: Blending SQL and NoSQL (RubyNation Edition)
Michael Bleigh
 
Wakanda: NoSQL for Model-Driven Web applications - NoSQL matters 2012
Wakanda: NoSQL for Model-Driven Web applications - NoSQL matters 2012Wakanda: NoSQL for Model-Driven Web applications - NoSQL matters 2012
Wakanda: NoSQL for Model-Driven Web applications - NoSQL matters 2012Alexandre Morgaut
 
NoSQL meets Microservices
NoSQL meets MicroservicesNoSQL meets Microservices
NoSQL meets Microservices
ArangoDB Database
 
7. Key-Value Databases: In Depth
7. Key-Value Databases: In Depth7. Key-Value Databases: In Depth
7. Key-Value Databases: In Depth
Fabio Fumarola
 
Real-World NoSQL Schema Design
Real-World NoSQL Schema DesignReal-World NoSQL Schema Design
Real-World NoSQL Schema Design
DataWorks Summit/Hadoop Summit
 
Conceptos básicos. Seminario web 1: Introducción a NoSQL
Conceptos básicos. Seminario web 1: Introducción a NoSQLConceptos básicos. Seminario web 1: Introducción a NoSQL
Conceptos básicos. Seminario web 1: Introducción a NoSQL
MongoDB
 
Metodologías CMMI y PMI
Metodologías CMMI y  PMIMetodologías CMMI y  PMI
Metodologías CMMI y PMI
Miguel Veces
 
Slides: NoSQL Data Modeling Using JSON Documents – A Practical Approach
Slides: NoSQL Data Modeling Using JSON Documents – A Practical ApproachSlides: NoSQL Data Modeling Using JSON Documents – A Practical Approach
Slides: NoSQL Data Modeling Using JSON Documents – A Practical Approach
DATAVERSITY
 
Query mechanisms for NoSQL databases
Query mechanisms for NoSQL databasesQuery mechanisms for NoSQL databases
Query mechanisms for NoSQL databasesArangoDB Database
 
Advanced data modeling with apache cassandra
Advanced data modeling with apache cassandraAdvanced data modeling with apache cassandra
Advanced data modeling with apache cassandra
Patrick McFadin
 
The 7 Key Ingredients of Web Content and Experience Management
The 7 Key Ingredients of Web Content and Experience ManagementThe 7 Key Ingredients of Web Content and Experience Management
The 7 Key Ingredients of Web Content and Experience Management
rivetlogic
 
NoSQL Design Considerations and Lessons Learned
NoSQL Design Considerations and Lessons LearnedNoSQL Design Considerations and Lessons Learned
NoSQL Design Considerations and Lessons Learned
rivetlogic
 
An Introduction To NoSQL & MongoDB
An Introduction To NoSQL & MongoDBAn Introduction To NoSQL & MongoDB
An Introduction To NoSQL & MongoDB
Lee Theobald
 
Introduction to NoSQL Databases
Introduction to NoSQL DatabasesIntroduction to NoSQL Databases
Introduction to NoSQL DatabasesDerek Stainer
 

Viewers also liked (19)

NoSQL and Data Modeling for Data Modelers
NoSQL and Data Modeling for Data ModelersNoSQL and Data Modeling for Data Modelers
NoSQL and Data Modeling for Data Modelers
 
Software Developer and Architecture @ LinkedIn (QCon SF 2014)
Software Developer and Architecture @ LinkedIn (QCon SF 2014)Software Developer and Architecture @ LinkedIn (QCon SF 2014)
Software Developer and Architecture @ LinkedIn (QCon SF 2014)
 
Operational Analytics Using Spark and NoSQL Data Stores
Operational Analytics Using Spark and NoSQL Data StoresOperational Analytics Using Spark and NoSQL Data Stores
Operational Analytics Using Spark and NoSQL Data Stores
 
Non-Relational Databases & Key/Value Stores
Non-Relational Databases & Key/Value StoresNon-Relational Databases & Key/Value Stores
Non-Relational Databases & Key/Value Stores
 
Persistence Smoothie: Blending SQL and NoSQL (RubyNation Edition)
Persistence  Smoothie: Blending SQL and NoSQL (RubyNation Edition)Persistence  Smoothie: Blending SQL and NoSQL (RubyNation Edition)
Persistence Smoothie: Blending SQL and NoSQL (RubyNation Edition)
 
Wakanda: NoSQL for Model-Driven Web applications - NoSQL matters 2012
Wakanda: NoSQL for Model-Driven Web applications - NoSQL matters 2012Wakanda: NoSQL for Model-Driven Web applications - NoSQL matters 2012
Wakanda: NoSQL for Model-Driven Web applications - NoSQL matters 2012
 
NoSQL meets Microservices
NoSQL meets MicroservicesNoSQL meets Microservices
NoSQL meets Microservices
 
7. Key-Value Databases: In Depth
7. Key-Value Databases: In Depth7. Key-Value Databases: In Depth
7. Key-Value Databases: In Depth
 
Real-World NoSQL Schema Design
Real-World NoSQL Schema DesignReal-World NoSQL Schema Design
Real-World NoSQL Schema Design
 
Conceptos básicos. Seminario web 1: Introducción a NoSQL
Conceptos básicos. Seminario web 1: Introducción a NoSQLConceptos básicos. Seminario web 1: Introducción a NoSQL
Conceptos básicos. Seminario web 1: Introducción a NoSQL
 
Metodologías CMMI y PMI
Metodologías CMMI y  PMIMetodologías CMMI y  PMI
Metodologías CMMI y PMI
 
Slides: NoSQL Data Modeling Using JSON Documents – A Practical Approach
Slides: NoSQL Data Modeling Using JSON Documents – A Practical ApproachSlides: NoSQL Data Modeling Using JSON Documents – A Practical Approach
Slides: NoSQL Data Modeling Using JSON Documents – A Practical Approach
 
Query mechanisms for NoSQL databases
Query mechanisms for NoSQL databasesQuery mechanisms for NoSQL databases
Query mechanisms for NoSQL databases
 
Advanced data modeling with apache cassandra
Advanced data modeling with apache cassandraAdvanced data modeling with apache cassandra
Advanced data modeling with apache cassandra
 
Modelo CMMI
Modelo CMMIModelo CMMI
Modelo CMMI
 
The 7 Key Ingredients of Web Content and Experience Management
The 7 Key Ingredients of Web Content and Experience ManagementThe 7 Key Ingredients of Web Content and Experience Management
The 7 Key Ingredients of Web Content and Experience Management
 
NoSQL Design Considerations and Lessons Learned
NoSQL Design Considerations and Lessons LearnedNoSQL Design Considerations and Lessons Learned
NoSQL Design Considerations and Lessons Learned
 
An Introduction To NoSQL & MongoDB
An Introduction To NoSQL & MongoDBAn Introduction To NoSQL & MongoDB
An Introduction To NoSQL & MongoDB
 
Introduction to NoSQL Databases
Introduction to NoSQL DatabasesIntroduction to NoSQL Databases
Introduction to NoSQL Databases
 

Similar to NoSE: Schema Design for NoSQL Applications

You know what iMEAN? Using MEAN stack for application dev on Informix
You know what iMEAN? Using MEAN stack for application dev on InformixYou know what iMEAN? Using MEAN stack for application dev on Informix
You know what iMEAN? Using MEAN stack for application dev on Informix
Keshav Murthy
 
Building your First MEAN App
Building your First MEAN AppBuilding your First MEAN App
Building your First MEAN AppMongoDB
 
Bring your code to explore the Azure Data Lake: Execute your .NET/Python/R co...
Bring your code to explore the Azure Data Lake: Execute your .NET/Python/R co...Bring your code to explore the Azure Data Lake: Execute your .NET/Python/R co...
Bring your code to explore the Azure Data Lake: Execute your .NET/Python/R co...
Michael Rys
 
MongoDB at Scale
MongoDB at ScaleMongoDB at Scale
MongoDB at Scale
MongoDB
 
tranSMART Community Meeting 5-7 Nov 13 - Session 2: MongoDB: What, Why And When
tranSMART Community Meeting 5-7 Nov 13 - Session 2: MongoDB: What, Why And WhentranSMART Community Meeting 5-7 Nov 13 - Session 2: MongoDB: What, Why And When
tranSMART Community Meeting 5-7 Nov 13 - Session 2: MongoDB: What, Why And When
David Peyruc
 
Scaling MongoDB
Scaling MongoDBScaling MongoDB
Scaling MongoDB
MongoDB
 
Introduction to NoSQL Database
Introduction to NoSQL DatabaseIntroduction to NoSQL Database
Introduction to NoSQL Database
Mohammad Alghanem
 
Simplifying & accelerating application development with MongoDB's intelligent...
Simplifying & accelerating application development with MongoDB's intelligent...Simplifying & accelerating application development with MongoDB's intelligent...
Simplifying & accelerating application development with MongoDB's intelligent...
Maxime Beugnet
 
Considerations for using NoSQL technology on your next IT project - Akmal Cha...
Considerations for using NoSQL technology on your next IT project - Akmal Cha...Considerations for using NoSQL technology on your next IT project - Akmal Cha...
Considerations for using NoSQL technology on your next IT project - Akmal Cha...
jaxconf
 
ADL/U-SQL Introduction (SQLBits 2016)
ADL/U-SQL Introduction (SQLBits 2016)ADL/U-SQL Introduction (SQLBits 2016)
ADL/U-SQL Introduction (SQLBits 2016)
Michael Rys
 
Cassandra Data Modelling
Cassandra Data ModellingCassandra Data Modelling
Cassandra Data Modelling
Knoldus Inc.
 
SQL Server 2008 Data Mining
SQL Server 2008 Data MiningSQL Server 2008 Data Mining
SQL Server 2008 Data Mining
llangit
 
Tech Gupshup Meetup On MongoDB - 24/06/2016
Tech Gupshup Meetup On MongoDB - 24/06/2016Tech Gupshup Meetup On MongoDB - 24/06/2016
Tech Gupshup Meetup On MongoDB - 24/06/2016
Mukesh Tilokani
 
Big Data Analytics in the Cloud with Microsoft Azure
Big Data Analytics in the Cloud with Microsoft AzureBig Data Analytics in the Cloud with Microsoft Azure
Big Data Analytics in the Cloud with Microsoft Azure
Mark Kromer
 
Orm and hibernate
Orm and hibernateOrm and hibernate
Orm and hibernate
s4al_com
 
Azure Data Lake and U-SQL
Azure Data Lake and U-SQLAzure Data Lake and U-SQL
Azure Data Lake and U-SQL
Michael Rys
 
Siddhi - cloud-native stream processor
Siddhi - cloud-native stream processorSiddhi - cloud-native stream processor
Siddhi - cloud-native stream processor
Sriskandarajah Suhothayan
 
[WSO2Con Asia 2018] Patterns for Building Streaming Apps
[WSO2Con Asia 2018] Patterns for Building Streaming Apps[WSO2Con Asia 2018] Patterns for Building Streaming Apps
[WSO2Con Asia 2018] Patterns for Building Streaming Apps
WSO2
 
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and Spark
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and SparkVital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and Spark
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and Spark
Vital.AI
 
Moving from SQL Server to MongoDB
Moving from SQL Server to MongoDBMoving from SQL Server to MongoDB
Moving from SQL Server to MongoDB
Nick Court
 

Similar to NoSE: Schema Design for NoSQL Applications (20)

You know what iMEAN? Using MEAN stack for application dev on Informix
You know what iMEAN? Using MEAN stack for application dev on InformixYou know what iMEAN? Using MEAN stack for application dev on Informix
You know what iMEAN? Using MEAN stack for application dev on Informix
 
Building your First MEAN App
Building your First MEAN AppBuilding your First MEAN App
Building your First MEAN App
 
Bring your code to explore the Azure Data Lake: Execute your .NET/Python/R co...
Bring your code to explore the Azure Data Lake: Execute your .NET/Python/R co...Bring your code to explore the Azure Data Lake: Execute your .NET/Python/R co...
Bring your code to explore the Azure Data Lake: Execute your .NET/Python/R co...
 
MongoDB at Scale
MongoDB at ScaleMongoDB at Scale
MongoDB at Scale
 
tranSMART Community Meeting 5-7 Nov 13 - Session 2: MongoDB: What, Why And When
tranSMART Community Meeting 5-7 Nov 13 - Session 2: MongoDB: What, Why And WhentranSMART Community Meeting 5-7 Nov 13 - Session 2: MongoDB: What, Why And When
tranSMART Community Meeting 5-7 Nov 13 - Session 2: MongoDB: What, Why And When
 
Scaling MongoDB
Scaling MongoDBScaling MongoDB
Scaling MongoDB
 
Introduction to NoSQL Database
Introduction to NoSQL DatabaseIntroduction to NoSQL Database
Introduction to NoSQL Database
 
Simplifying & accelerating application development with MongoDB's intelligent...
Simplifying & accelerating application development with MongoDB's intelligent...Simplifying & accelerating application development with MongoDB's intelligent...
Simplifying & accelerating application development with MongoDB's intelligent...
 
Considerations for using NoSQL technology on your next IT project - Akmal Cha...
Considerations for using NoSQL technology on your next IT project - Akmal Cha...Considerations for using NoSQL technology on your next IT project - Akmal Cha...
Considerations for using NoSQL technology on your next IT project - Akmal Cha...
 
ADL/U-SQL Introduction (SQLBits 2016)
ADL/U-SQL Introduction (SQLBits 2016)ADL/U-SQL Introduction (SQLBits 2016)
ADL/U-SQL Introduction (SQLBits 2016)
 
Cassandra Data Modelling
Cassandra Data ModellingCassandra Data Modelling
Cassandra Data Modelling
 
SQL Server 2008 Data Mining
SQL Server 2008 Data MiningSQL Server 2008 Data Mining
SQL Server 2008 Data Mining
 
Tech Gupshup Meetup On MongoDB - 24/06/2016
Tech Gupshup Meetup On MongoDB - 24/06/2016Tech Gupshup Meetup On MongoDB - 24/06/2016
Tech Gupshup Meetup On MongoDB - 24/06/2016
 
Big Data Analytics in the Cloud with Microsoft Azure
Big Data Analytics in the Cloud with Microsoft AzureBig Data Analytics in the Cloud with Microsoft Azure
Big Data Analytics in the Cloud with Microsoft Azure
 
Orm and hibernate
Orm and hibernateOrm and hibernate
Orm and hibernate
 
Azure Data Lake and U-SQL
Azure Data Lake and U-SQLAzure Data Lake and U-SQL
Azure Data Lake and U-SQL
 
Siddhi - cloud-native stream processor
Siddhi - cloud-native stream processorSiddhi - cloud-native stream processor
Siddhi - cloud-native stream processor
 
[WSO2Con Asia 2018] Patterns for Building Streaming Apps
[WSO2Con Asia 2018] Patterns for Building Streaming Apps[WSO2Con Asia 2018] Patterns for Building Streaming Apps
[WSO2Con Asia 2018] Patterns for Building Streaming Apps
 
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and Spark
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and SparkVital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and Spark
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and Spark
 
Moving from SQL Server to MongoDB
Moving from SQL Server to MongoDBMoving from SQL Server to MongoDB
Moving from SQL Server to MongoDB
 

Recently uploaded

Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
Fwdays
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
RTTS
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
DianaGray10
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
CatarinaPereira64715
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 

Recently uploaded (20)

Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 

NoSE: Schema Design for NoSQL Applications

  • 1. NoSE: Schema Design for NoSQL Applications Michael J. Mior, Kenneth Salem, Ashraf Aboulnaga, Rui Liu
  • 2. NoSE ● NoSQL App Development ● Problem Formulation ● NoSE Design and Implementation ● Evaluation
  • 3. NoSQL ● Eventually consistent, horizontally scalable, flexible schema ● Many different types of NoSQL databases ○ Document stores ○ Key-value stores ○ Graph databases ○ … ○ Extensible record stores
  • 4. Extensible Record Store Data Model CREATE COLUMNFAMILY "ReservationsByGuest"( "GuestID" uuid, "ResID" uuid, "ResStartDate" timestamp, "RoomID" uuid, PRIMARY KEY(("GuestID"), "ResStartDate", "ResID", "RoomID") ); Partitioning key Clustering key
  • 5. Database Application Development 1. Define application requirements 2. Decide on a data model for the target system 3. Implement the application according to the model a. Database access b. Application logic
  • 6. Database Application Development 1. Define application requirements 2. Decide on a data model for the target system 3. Implement the application according to the model a. Database access b. Application logic }NoSE
  • 7. Schema Design Best Practices Source: http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/
  • 8. Schema Design Best Practices Model column families around query patterns Source: http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/
  • 9. Schema Design Best Practices Model column families around query patterns But start your design with entities and relationships, if you can Source: http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/
  • 10. Schema Design Best Practices Model column families around query patterns But start your design with entities and relationships, if you can De-normalize and duplicate for read performance Source: http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/
  • 11. Schema Design Best Practices Model column families around query patterns But start your design with entities and relationships, if you can De-normalize and duplicate for read performance But don’t de-normalize if you don’t need to Source: http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/
  • 12. Schema Design Best Practices Model column families around query patterns But start your design with entities and relationships, if you can De-normalize and duplicate for read performance But don’t de-normalize if you don’t need to Leverage wide rows for ordering, grouping, and filtering Source: http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/
  • 13. Schema Design Best Practices Model column families around query patterns But start your design with entities and relationships, if you can De-normalize and duplicate for read performance But don’t de-normalize if you don’t need to Leverage wide rows for ordering, grouping, and filtering But don’t go too wide Source: http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/
  • 14. Schema Design Example For a given guest, return the cities that guest has stayed in CREATE COLUMNFAMILY "CitiesByGuest" ("GuestID" uuid, "City" text, PRIMARY KEY(("GuestID"), "City")); CREATE COLUMNFAMILY "HotelsByGuest" ("GuestID" uuid, "HotelID" uuid, PRIMARY KEY(("GuestID"), "HotelID")); CREATE COLUMNFAMILY "HotelsByID" ("HotelID" uuid, "HotelCity" text, PRIMARY KEY(("HotelID"), "HotelCity"));
  • 15. NoSE Overview Input Output Conceptual schema Workload Selected column families Query implementation plans NoSE
  • 17. Application Workload For a given guest, return the cities that guest has stayed in SELECT Hotel.HotelCity FROM Hotel.Room.Reservation.Guest WHERE Guest.GuestID = ? Hotel HotelID HotelCity Room RoomID Reservation ResID Guest GuestID
  • 18. NoSE Architecture NoSE Input OutputCandidate Enumeration Query Planning Schema Optimization Plan Recommendation Conceptual schema Workload Selected column families Query implementation plans
  • 19. Query Planning Example SELECT Name FROM Hotel WHERE Hotel.State = ‘NY’ AND Hotel.Reservation.Room.Guest.GuestID = ? ORDER BY Name GuestID ↓ RoomID RoomID ↓ HotelID HotelID ↓ Name, State Name State
  • 20. Schema Optimization Construct a linear program to optimize execution time Cost of using column family j to answer query i Use of column family j for query i in the final plan Presence of column family j in final schema Size of column family j
  • 21. Schema Optimization Add constraints to ensure each query has a valid plan Minimize the cost Ensure column families used are present Limit maximum storage space
  • 22. Updates ● Updates make denormalization more expensive ● Add statements to update conceptual entities ● New column families are added to support updates ● Costs for updates are added to the linear program
  • 23. Evaluation ● Application defined by the RUBiS online auction benchmark ● Generate a schema and query plans recommended by NoSE ● Two schemas for comparison ○ Normalized (as much as possible) ○ Expert-selected
  • 24. Evaluation - Schema Performance
  • 25. Conclusion ● NoSE automates schema design for NoSQL applications ● Conforms to best practices without requiring expertise ● Schemas are better than those produced manually with an average of 1.8x and up to 125x performance improvement