Are you new to schema design for MongoDB, or are you looking for a more complete or agile process than what you are following currently? In this talk, we will guide you through the phases of a flexible methodology that you can apply to projects ranging from small to large with very demanding requirements.
MongoDB .local London 2019: A Complete Methodology to Data Modeling for MongoDB (Lisa Roth, PMP)
Are you new to schema design for MongoDB, or are you looking for a more complete or agile process than what you are following currently? In this talk, we will guide you through the phases of a flexible methodology that you can apply to projects ranging from small to large with very demanding requirements.
MongoDB World 2019: A Complete Methodology to Data Modeling for MongoDB (MongoDB)
Are you new to schema design for MongoDB, or are you looking for a more complete or agile process than what you are following currently? In this talk, we will guide you through the phases of a flexible methodology that you can apply to projects ranging from small to large with very demanding requirements.
MongoDB .local Toronto 2019: A Complete Methodology of Data Modeling for MongoDB (MongoDB)
This document discusses data modeling for MongoDB. It begins by recognizing the differences between document and tabular databases. It then outlines a methodology for modeling data in MongoDB, including describing the workload, identifying relationships, and applying patterns. Several patterns are discussed, such as schema versioning and computed fields. The document uses a coffee shop franchise example to demonstrate modeling real-world data in MongoDB.
The document discusses data modeling for MongoDB. It begins by recognizing the differences between modeling for a document database versus a relational database. It then outlines a flexible methodology for MongoDB modeling including defining the workload, identifying relationships between entities, and applying schema design patterns. Finally, it recognizes the need to apply patterns like schema versioning, subset, computed, bucket, and external reference when modeling for MongoDB.
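Of the patterns named in that summary, schema versioning is the simplest to sketch: each document carries a version field, and readers upgrade old shapes lazily instead of migrating the whole collection at once. The document shape and field names below are invented for illustration and are not taken from the talk:

```python
# Schema Versioning pattern (minimal sketch, hypothetical fields):
# a document records which schema version it uses, and a migrate-on-read
# helper upgrades older shapes to the latest one.

def upgrade_customer(doc):
    """Bring a customer document up to the latest schema version."""
    version = doc.get("schema_version", 1)
    if version == 1:
        # v1 stored a single phone string; v2 uses a list of contact points.
        doc["contacts"] = [{"type": "phone", "value": doc.pop("phone", None)}]
        doc["schema_version"] = 2
    return doc

old = {"_id": 1, "name": "Ada", "phone": "555-0100", "schema_version": 1}
new = upgrade_customer(old)
assert new["schema_version"] == 2
assert new["contacts"][0]["value"] == "555-0100"
```

In practice the upgraded document would be written back on read or touched by a background job, so the collection converges to the new shape without downtime.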
MongoDB .local London 2019: A Complete Methodology to Data Modeling for MongoDB (MongoDB)
Are you new to schema design for MongoDB, or are you looking for a more complete or agile process than what you are following currently? In this talk, we will guide you through the phases of a flexible methodology that you can apply to projects ranging from small to large with very demanding requirements.
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo... (MongoDB)
The document describes a methodology for data modeling with MongoDB. It begins by recognizing the differences between document and tabular databases, then outlines a three-step methodology: 1) describe the workload by listing queries, 2) identify and model relationships between entities, and 3) apply relevant patterns when modeling for MongoDB. The document uses examples around modeling a coffee shop franchise to illustrate modeling approaches and techniques.
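Step 1 of that methodology, describing the workload, can be captured as plain structured data: one entry per operation, with its rate and latency target. The operations and numbers below are hypothetical examples for a coffee-shop franchise, not figures from the talk:

```python
# Describe the workload: list each query/operation with an expected rate
# and a latency target. All entries here are illustrative assumptions.
workload = [
    {"op": "record a sale",        "type": "write", "per_day": 100_000, "latency_ms": 50},
    {"op": "daily revenue report", "type": "read",  "per_day": 1,       "latency_ms": 60_000},
    {"op": "look up store menu",   "type": "read",  "per_day": 500_000, "latency_ms": 10},
]

# Sorting by frequency highlights which operations the schema should favor.
hottest = max(workload, key=lambda q: q["per_day"])
print(hottest["op"])  # the most frequent operation drives the design
```

Even this small table already suggests a design decision: the schema should be optimized around the most frequent reads, while the rare reporting query can afford an aggregation at query time.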
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB (MongoDB)
Are you new to schema design for MongoDB, or are you looking for a more complete or agile process than what you are following currently? In this talk, we will guide you through the phases of a flexible methodology that you can apply to projects ranging from small to large with very demanding requirements.
MongoDB .local Bengaluru 2019: A Complete Methodology to Data Modeling for Mo... (MongoDB)
Are you new to schema design for MongoDB, or are you looking for a more complete or agile process than what you are following currently? In this talk, we will guide you through the phases of a flexible methodology that you can apply to projects ranging from small to large with very demanding requirements.
Data Modelling for MongoDB - MongoDB.local Tel Aviv (Norberto Leite)
At this point, you may be familiar with MongoDB and its Document Model.
However, what are the methods you can use to create an efficient database schema quickly and effectively?
This presentation will explore the different phases of a methodology to create a database schema. This methodology covers the description of your workload, the identification of the relationships between the elements (one-to-one, one-to-many and many-to-many) and an introduction to design patterns. Those patterns present practical solutions to different problems observed while helping our customers over the last 10 years.
In this session, you will learn about:
The differences between modeling for MongoDB versus a relational database.
A flexible methodology to model for MongoDB, which can be applied to simple projects, agile ones or more complex ones.
Overview of some common design patterns that help improve the performance of systems.
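The relationship step described above (one-to-one, one-to-many, many-to-many) mostly comes down to choosing between embedding and referencing for each one-to-many pair. A minimal sketch with made-up documents, not taken from the talk:

```python
# One-to-many: embed when the "many" side is small, bounded, and always
# read with its parent; reference when it is large, unbounded, or shared.
# The documents below are illustrative.

# Embedded: a store and its few, bounded opening hours, read together.
store = {
    "_id": "store_42",
    "city": "Chicago",
    "hours": [{"day": "Mon", "open": "07:00", "close": "19:00"}],
}

# Referenced: orders point back to the store, since one store may
# accumulate millions of orders that cannot live inside one document.
order = {"_id": "order_9001", "store_id": "store_42", "total": 11.50}

assert order["store_id"] == store["_id"]
```

The deciding questions are usually cardinality (how many on the "many" side?) and access pattern (is the child ever read without its parent?), which is exactly what the workload description from the previous step answers.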
MongoDB.local Sydney 2019: Data Modeling for MongoDB (MongoDB)
At this point, you may be familiar with MongoDB and its Document Model.
However, what are the methods you can use to create an efficient database schema quickly and effectively?
This presentation will explore the different phases of a methodology to create a database schema. This methodology covers the description of your workload, the identification of the relationships between the elements (one-to-one, one-to-many and many-to-many) and an introduction to design patterns. Those patterns present practical solutions to different problems observed while helping our customers over the last 10 years.
In this session, you will learn about:
The differences between modeling for MongoDB versus a relational database.
A flexible methodology to model for MongoDB, which can be applied to simple projects, agile ones or more complex ones.
Overview of some common design patterns that help improve the performance of systems.
MongoDB .local Munich 2019: A Complete Methodology to Data Modeling for MongoDB (MongoDB)
Are you new to schema design for MongoDB, or are you looking for a more complete or agile process than what you are following currently? In this talk, we will guide you through the phases of a flexible methodology that you can apply to projects ranging from small to large with very demanding requirements.
Relational data modeling trends for transactional applications (Ike Ellis)
This document provides a summary of Ike Ellis's presentation on data modeling priorities and design patterns for transactional applications. The presentation discusses how data modeling priorities have changed from focusing on writes and normalization to emphasizing reads, flexibility, and performance. It outlines several current design priorities, including optimizing the schema for reads, making it easy to change and discoverable, and designing for the network instead of the disk. The presentation concludes with practice modeling data for example transactional applications such as a blog, an online store, and refrigeration trucks.
This document discusses strategies for moving away from legacy code using behavior-driven development (BDD). It summarizes three popular options: 1) Rewriting the entire application from scratch using best practices, 2) Doing technical refactoring of the code, and 3) Taking a business-focused approach using the "BDD pipeline" which involves impact mapping, prioritizing features, example workshops, and BDD layers to support planned changes. The presenter argues that the third option of a BDD pipeline is preferable to a full rewrite or only technical refactoring as it focuses on delivering business value over time rather than rewriting the code.
Mendeley’s Research Catalogue: building it, opening it up and making it even ... (Kris Jack)
Presentation given at Workshop on Academic-Industrial Collaborations for Recommender Systems 2013 (http://bit.ly/114XDsE), JCDL'13. A walk through Mendeley as a platform, growing pains involved with engineering at a large scale, the data that we're making publicly available and some demos that have come out of academic collaborations.
The Path to Truly Understanding Your MongoDB Data (MongoDB)
1. The document discusses data visualization and analytics using MongoDB. It covers terminology, data growth trends, the importance of visualization, and different tools for visualizing MongoDB data including Compass, the BI Connector, and MongoDB Charts.
2. Examples of early data visualizations are shown and different architectures for analytics using hidden replicas are described.
3. The presentation emphasizes choosing the right solution based on needs, such as custom solutions, Compass, the BI Connector, or MongoDB Charts. A demo of the visualization lifecycle is promised.
Data Analytics: Understanding Your MongoDB Data (MongoDB)
This document discusses data visualization and analytics using MongoDB data. It covers the importance of data visualization, different architectures for analytics, and tooling options for visualizing MongoDB data, including building custom solutions, MongoDB Compass, the MongoDB BI Connector, and the new MongoDB Charts tool. The goal is to help users understand which visualization methods and tools are best suited to their specific needs and data.
Jay Runkel presented a methodology for sizing MongoDB clusters to meet the requirements of an application. The key steps are: 1) Analyze data size and index size, 2) Estimate the working set based on frequently accessed data, 3) Use a simplified model to estimate IOPS and adjust for real-world factors, 4) Calculate the number of shards needed based on storage, memory and IOPS requirements. He demonstrated this process for an application that collects mobile events, requiring a cluster that can store over 200 billion documents with 50,000 IOPS.
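The sizing steps in that summary effectively reduce to taking the maximum shard count demanded by any single resource: storage, working-set memory, or IOPS. A back-of-the-envelope sketch; the per-shard capacities are invented assumptions for illustration, not figures from the presentation:

```python
import math

# Shard count = max of what storage, RAM (working set), and IOPS each
# require. Per-shard capacities below are assumed for illustration only.
def shards_needed(data_tb, working_set_gb, iops,
                  tb_per_shard=2, ram_gb_per_shard=128, iops_per_shard=10_000):
    return max(
        math.ceil(data_tb / tb_per_shard),
        math.ceil(working_set_gb / ram_gb_per_shard),
        math.ceil(iops / iops_per_shard),
    )

# e.g. 40 TB of data, a 500 GB working set, and 50,000 IOPS:
print(shards_needed(40, 500, 50_000))  # storage dominates: 20 shards
```

The useful property of this framing is that it shows which resource is the bottleneck, so you know whether adding RAM, faster disks, or more shards is the cheapest fix.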
Myths & benefits of kanban @ATMs 2nd Meetup_aug05 (Anubhav Sinha)
The document discusses introducing Kanban using a systems thinking approach. It outlines analyzing the current delivery process, identifying sources of dissatisfaction, modeling workflow, and designing a Kanban system. It emphasizes an iterative approach to evolve the system over time based on learning. Key steps include understanding customer needs, analyzing demand and capabilities, socializing the design, and continually improving.
The document provides an overview of a presentation on schema design patterns for MongoDB databases. It introduces several common patterns including Attribute, Subset, Computed, Approximation, and Schema Versioning. For each pattern, it describes the problem it addresses, example use cases, and the general solution or approach. It also includes examples of how the patterns could address issues like large documents, working set size, CPU usage, write volume, and changing schemas. The presentation aims to provide a common methodology and vocabulary for designing MongoDB schemas.
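Of the patterns listed there, the Computed pattern is the easiest to show in a few lines: do the expensive work once at write time instead of on every read. The document shape below is a hypothetical sketch, not an example from the presentation:

```python
# Computed pattern (illustrative fields): store a precomputed aggregate
# alongside the raw data so frequent reads avoid recomputing it.

def record_review(movie, rating):
    """Update the raw counters and the computed average in one write."""
    movie["ratings_count"] += 1
    movie["ratings_sum"] += rating
    movie["avg_rating"] = movie["ratings_sum"] / movie["ratings_count"]
    return movie

movie = {"title": "Example", "ratings_count": 0, "ratings_sum": 0, "avg_rating": None}
record_review(movie, 4)
record_review(movie, 5)
assert movie["avg_rating"] == 4.5
```

The trade-off is the classic one the pattern catalog addresses: slightly more work per write in exchange for cheap reads, which pays off whenever reads greatly outnumber writes.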
Using Metrics for Fun, Developing with the KV Store + Javascript & News from ... (Harry McLaren)
We explore "Metrics, mstats and Me: Splunking Human Data" and also share some insights into the KV Store and JavaScript use in dashboards. We'll also re-cover the conf18 updates for those who couldn't attend our last session.
Operations for databases – The DevOps journey (Eduardo Piairo)
This document discusses the journey of an organization towards adopting DevOps practices for database operations. It describes moving from a process with separate database and development teams to integrating database operations into the development workflow using practices like source control for database changes, continuous integration of database changes, and establishing collaboration agreements between teams. The goal was to automate database operations, enable faster releases, and eliminate bottlenecks caused by manual database processes. Metrics showed benefits like increasing the percentage of automated database changes from 0% to 98% and improving the organization's ability to support multiple customers simultaneously.
MongoDB .local Toronto 2019: Finding the Right Atlas Cluster Size: Does this ... (MongoDB)
How do you determine whether your MongoDB Atlas cluster is over-provisioned, whether the new feature in your next application release will crush your cluster, or when to increase cluster size based upon planned usage growth? MongoDB Atlas provides over a hundred metrics enabling visibility into the inner workings of MongoDB performance, but how do you apply all this information to make capacity planning decisions? This presentation will enable you to effectively analyze your MongoDB performance to optimize your MongoDB Atlas spend and ensure smooth application operation into the future.
Rapid Development with Schemaless Data Models (MongoDB)
The document discusses the MyEdu Profile Project's decision to use MongoDB over MySQL for its user profile data. It chose MongoDB to allow for rapid iteration and continuous change while minimizing downtime for new features. While a schema-less design is harder to maintain, the project employs patterns to control document structure. The profile has since undergone several iterations on MongoDB, demonstrating its ability to scale. Lessons learned include proper indexing and not thinking of the data as if it were in MySQL.
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)... (DataStax)
Cassandra is a distributed database with features including, but not limited to, Secondary Indexes, UDFs, and Materialized Views, and with hardware requirements that are not especially strict.
It is important to use those features and select hardware correctly to make sure the use of Cassandra in your business can be as painless as possible.
I will address how these features are used in the wrong way, how hardware should be selected, and how to make Cassandra work in the best possible way.
Learning Objective #1:
Learn that Cassandra hardware requirements exist (and why), and the shortcomings in some of its features (Secondary Indexes, Compaction Strategies, etc.).
Learning Objective #2:
The most misused features and common hardware errors, and how they might seem harmless at first (on either a small cluster or even a single node).
Learning Objective #3:
How to correctly use Cassandra and its features to achieve smooth operation.
About the Speaker
Carlos Rolo Cassandra Consultant, Pythian
Carlos Rolo is a Cassandra MVP with deep expertise in distributed architecture technologies. Carlos is driven by challenge and enjoys the opportunity to discover new things. He has become known and trusted by customers and colleagues for his ability to understand complex problems and to work well under pressure. When Carlos isn't working, he can be found playing water polo or enjoying his local community.
Doing Analytics Right - Building the Analytics Environment (Tasktop)
Implementing analytics for development processes is challenging. As discussed in the previous webinars, the right analytics are determined by the goals of the organization, not by the available data. Implementing your analytics solutions will therefore require an efficient analytics and data architecture, including the ability to combine and stage data from heterogeneous sources. An architecture that excludes the ability to gain access to the necessary data will create a barrier to deploying your newly designed analytics program, and will force you back into the “light is brighter here” anti-pattern.
This webinar will describe the technical considerations of implementing the data architecture for your analytics program, and explain how Tasktop can help.
2013 CPM Conference, Nov 6th, NoSQL Capacity Planning (asya999)
This document discusses MongoDB capacity planning. It begins with a brief history of databases and the factors driving NoSQL adoption. It then discusses MongoDB's origins and key features like document storage, auto-sharding, and high availability. The document emphasizes that capacity planning requires understanding an application's requirements, resources used, and monitoring metrics over time. It provides examples of measuring and planning for storage, memory, CPU, and network resources as applications and data change. The goal of capacity planning is to continuously and proactively scale resources to meet evolving needs.
- The document evaluates criteria for choosing between NoSQL technologies like MongoDB and Redis.
- It discusses two use cases at Offers.com and how Redis was chosen for the first use case due to its fast reads/writes and data persistence, while MongoDB was chosen for the second use case due to its document-oriented data model and flexibility.
- Some downsides discussed are lack of data safety guarantees in MongoDB and lack of abstraction between NoSQL systems.
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas (MongoDB)
This presentation discusses migrating data from other data stores to MongoDB Atlas. It begins by explaining why MongoDB and Atlas are good choices for data management. Several preparation steps are covered, including sizing the target Atlas cluster, increasing the source oplog, and testing connectivity. Live migration, mongomirror, and dump/restore options are presented for migrating between replica sets or sharded clusters. Post-migration steps like monitoring and backups are also discussed. Finally, migrating from other data stores like AWS DocumentDB, Azure CosmosDB, DynamoDB, and relational databases is briefly covered.
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts! (MongoDB)
These days, everyone is expected to be a data analyst. But with so much data available, how can you make sense of it and be sure you're making the best decisions? One great approach is to use data visualizations. In this session, we take a complex dataset and show how the breadth of capabilities in MongoDB Charts can help you turn bits and bytes into insights.
More Related Content
Similar to MongoDB .local Chicago 2019: A Complete Methodology to Data Modeling for MongoDB
MongoDB .local Bengaluru 2019: A Complete Methodology to Data Modeling for Mo...MongoDB
Are you new to schema design for MongoDB, or are looking for a more complete or agile process than what you are following currently? In this talk we will guide you through the phases of a flexible methodology that you can apply to projects ranging from small to large with very demanding requirements.
Data Modelling for MongoDB - MongoDB.local Tel AvivNorberto Leite
At this point, you may be familiar with MongoDB and its Document Model.
However, what are the methods you can use to create an efficient database schema quickly and effectively?
This presentation will explore the different phases of a methodology to create a database schema. This methodology covers the description of your workload, the identification of the relationships between the elements (one-to-one, one-to-many and many-to-many) and an introduction to design patterns. Those patterns present practical solutions to different problems observed while helping our customers over the last 10 years.
In this session, you will learn about:
The differences between modeling for MongoDB versus a relational database.
A flexible methodology to model for MongoDB, which can be applied to simple projects, agile ones or more complex ones.
Overview of some common design patterns that help improve the performance of systems.
MongoDB.local Sydney 2019: Data Modeling for MongoDBMongoDB
At this point, you may be familiar with MongoDB and its Document Model.
However, what are the methods you can use to create an efficient database schema quickly and effectively?
This presentation will explore the different phases of a methodology to create a database schema. This methodology covers the description of your workload, the identification of the relationships between the elements (one-to-one, one-to-many and many-to-many) and an introduction to design patterns. Those patterns present practical solutions to different problems observed while helping our customers over the last 10 years.
In this session, you will learn about:
The differences between modeling for MongoDB versus a relational database.
A flexible methodology to model for MongoDB, which can be applied to simple projects, agile ones or more complex ones.
Overview of some common design patterns that help improve the performance of systems.
MongoDB .local Munich 2019: A Complete Methodology to Data Modeling for MongoDBMongoDB
Are you new to schema design for MongoDB, or are you looking for a more complete or agile process than what you are following currently? In this talk, we will guide you through the phases of a flexible methodology that you can apply to projects ranging from small to large with very demanding requirements.
Relational data modeling trends for transactional applicationsIke Ellis
This document provides a summary of Ike Ellis's presentation on data modeling priorities and design patterns for transactional applications. The presentation discusses how data modeling priorities have changed from focusing on writes and normalization to emphasizing reads, flexibility, and performance. It outlines several current design priorities including optimizing the schema for reads, making it easy to change and discoverable, and designing for the network instead of the disk. The presentation concludes with practicing modeling data for example transactional applications like a blog, online store, and refrigeration trucks.
This document discusses strategies for moving away from legacy code using behavior-driven development (BDD). It summarizes three popular options: 1) Rewriting the entire application from scratch using best practices, 2) Doing technical refactoring of the code, and 3) Taking a business-focused approach using the "BDD pipeline" which involves impact mapping, prioritizing features, example workshops, and BDD layers to support planned changes. The presenter argues that the third option of a BDD pipeline is preferable to a full rewrite or only technical refactoring as it focuses on delivering business value over time rather than rewriting the code.
Mendeley’s Research Catalogue: building it, opening it up and making it even ...Kris Jack
Presentation given at Workshop on Academic-Industrial Collaborations for Recommender Systems 2013 (http://bit.ly/114XDsE), JCDL'13. A walk through Mendeley as a platform, growing pains involved with engineering at a large scale, the data that we're making publicly available and some demos that have come out of academic collaborations.
The Path to Truly Understanding Your MongoDB DataMongoDB
1. The document discusses data visualization and analytics using MongoDB. It covers terminology, data growth trends, the importance of visualization, and different tools for visualizing MongoDB data including Compass, the BI Connector, and MongoDB Charts.
2. Examples of early data visualizations are shown and different architectures for analytics using hidden replicas are described.
3. The presentation emphasizes choosing the right solution based on needs, such as custom solutions, Compass, the BI Connector, or MongoDB Charts. A demo of the visualization lifecycle is promised.
Data Analytics: Understanding Your MongoDB DataMongoDB
This document discusses data visualization and analytics using MongoDB data. It covers the importance of data visualization, different architectures for analytics, and tooling options for visualizing MongoDB data, including building custom solutions, MongoDB Compass, the MongoDB BI Connector, and the new MongoDB Charts tool. The goal is to help users understand which visualization methods and tools are best suited to their specific needs and data.
Jay Runkel presented a methodology for sizing MongoDB clusters to meet the requirements of an application. The key steps are: 1) Analyze data size and index size, 2) Estimate the working set based on frequently accessed data, 3) Use a simplified model to estimate IOPS and adjust for real-world factors, 4) Calculate the number of shards needed based on storage, memory and IOPS requirements. He demonstrated this process for an application that collects mobile events, requiring a cluster that can store over 200 billion documents with 50,000 IOPS.
Myths & benefits of kanban @ATMs 2nd Meetup_aug05Anubhav Sinha
The document discusses introducing Kanban using a systems thinking approach. It outlines analyzing the current delivery process, identifying sources of dissatisfaction, modeling workflow, and designing a Kanban system. It emphasizes an iterative approach to evolve the system over time based on learning. Key steps include understanding customer needs, analyzing demand and capabilities, socializing the design, and continually improving.
The document provides an overview of a presentation on schema design patterns for MongoDB databases. It introduces several common patterns including Attribute, Subset, Computed, Approximation, and Schema Versioning. For each pattern, it describes the problem it addresses, example use cases, and the general solution or approach. It also includes examples of how the patterns could address issues like large documents, working set size, CPU usage, write volume, and changing schemas. The presentation aims to provide a common methodology and vocabulary for designing MongoDB schemas.
Using Metrics for Fun, Developing with the KV Store + Javascript & News from ...Harry McLaren
We explore "Metrics, mstats and Me: Splunking Human Data” and also have some insights into the KV Store and javascript use in dashboards. We’ll also re-cover the conf18 updates for those who couldn’t attend our last session.
Operations for databases – The DevOps journey Eduardo Piairo
This document discusses the journey of an organization towards adopting DevOps practices for database operations. It describes moving from a process with separate database and development teams to integrating database operations into the development workflow using practices like source control for database changes, continuous integration of database changes, and establishing collaboration agreements between teams. The goal was to automate database operations, enable faster releases, and eliminate bottlenecks caused by manual database processes. Metrics showed benefits like increasing the percentage of automated database changes from 0% to 98% and improving the organization's ability to support multiple customers simultaneously.
MongoDB .local Toronto 2019: Finding the Right Atlas Cluster Size: Does this ...MongoDB
How do you determine whether your MongoDB Atlas cluster is over provisioned, whether the new feature in your next application release will crush your cluster, or when to increase cluster size based upon planned usage growth? MongoDB Atlas provides over a hundred metrics enabling visibility into the inner workings of MongoDB performance, but how do apply all this information to make capacity planning decisions? This presentation will enable you to effectively analyze your MongoDB performance to optimize your MongoDB Atlas spend and ensure smooth application operation into the future.
Rapid Development with Schemaless Data ModelsMongoDB
The document discusses the MyEdu Profile Project's decision to use MongoDB over MySQL for its user profile data. It chose MongoDB to allow for rapid iteration and continuous change while minimizing downtime for new features. While a schema-less design is harder to maintain, the project employs patterns to control document structure. The profile has since undergone several iterations on MongoDB, demonstrating its ability to scale. Lessons include proper indexing and not thinking of data like MySQL.
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...DataStax
Cassandra is a distributed database with features included but not limited to Secundary Indexes, UDF, Materialized Views, etc. and not so strict hardware requirements.
It is important to use those features and select hardware correctly to make sure the use of Cassandra in your business can be as painless as possible.
I will address how these features are used in the wrong way, how hardware should be selected, and how to make Cassandra work in the best possible way.
Learning Objective #1:
Learn that Cassandra hardware requirements exist (and why) and the shortcomings in some of features(Secundary Indexes, Compaction Strategies, etc).
Learning Objective #2:
The most misused features and common hardware errors. How they might seem harmeless at first (either small cluster or even single node).
Learning Objective #3:
How to correctly use Cassandra and it's features and go for perfect operation.
About the Speaker
Carlos Rolo Cassandra Consultant, Pythian
Carlos Rolo is a Cassandra MVP, and has deep expertise with distributed architecture technologies. Carlos is driven by challenge, and enjoys the opportunities to discover new things.. He has become known and trusted by customers and colleagues for his ability to understand complex problems, and to work well under pressure. When Carlos isn't working he can be found playing water polo or enjoying the his local community.
Doing Analytics Right - Building the Analytics EnvironmentTasktop
Implementing analytics for development processes is challenging. As discussed in the previous webinars, the right analytics are determined by the goals of the organization, not by the available data. So implementing your analytics solutions will require an efficient analytics and data architecture, including the ability to combine and stage data from heterogeneous sources. An architecture that excludes the ability to gain access to the necessary data will create a barrier to deploying your newly designed analytics program, and will force you back into the “light is brighter here” anti-pattern.
This webinar will describe the technical considerations of implementing the data architecture for your analytics program, and explain how Tasktop can help.
2013 CPM Conference, Nov 6th, NoSQL Capacity Planningasya999
This document discusses MongoDB capacity planning. It begins with a brief history of databases and the factors driving NoSQL adoption. It then discusses MongoDB's origins and key features like document storage, auto-sharding, and high availability. The document emphasizes that capacity planning requires understanding an application's requirements, resources used, and monitoring metrics over time. It provides examples of measuring and planning for storage, memory, CPU, and network resources as applications and data change. The goal of capacity planning is to continuously and proactively scale resources to meet evolving needs.
- The document evaluates criteria for choosing between NoSQL technologies like MongoDB and Redis.
- It discusses two use cases at Offers.com and how Redis was chosen for the first use case due to its fast reads/writes and data persistence, while MongoDB was chosen for the second use case due to its document-oriented data model and flexibility.
- Some downsides discussed are lack of data safety guarantees in MongoDB and lack of abstraction between NoSQL systems.
Similar to MongoDB .local Chicago 2019: A Complete Methodology to Data Modeling for MongoDB (20)
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB
This presentation discusses migrating data from other data stores to MongoDB Atlas. It begins by explaining why MongoDB and Atlas are good choices for data management. Several preparation steps are covered, including sizing the target Atlas cluster, increasing the source oplog, and testing connectivity. Live migration, mongomirror, and dump/restore options are presented for migrating between replicasets or sharded clusters. Post-migration steps like monitoring and backups are also discussed. Finally, migrating from other data stores like AWS DocumentDB, Azure CosmosDB, DynamoDB, and relational databases are briefly covered.
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB
These days, everyone is expected to be a data analyst. But with so much data available, how can you make sense of it and be sure you're making the best decisions? One great approach is to use data visualizations. In this session, we take a complex dataset and show how the breadth of capabilities in MongoDB Charts can help you turn bits and bytes into insights.
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
MongoDB Kubernetes operator and MongoDB Open Service Broker are ready for production operations. Learn about how MongoDB can be used with the most popular container orchestration platform, Kubernetes, and bring self-service, persistent storage to your containerized applications. A demo will show you how easy it is to enable MongoDB clusters as an External Service using the Open Service Broker API for MongoDB
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
Humana, like many companies, is tackling the challenge of creating real-time insights from data that is diverse and rapidly changing. This is our journey of how we used MongoDB to combine traditional batch approaches with streaming technologies to provide continuous alerting capabilities from real-time data streams.
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB
Time series data is increasingly at the heart of modern applications - think IoT, stock trading, clickstreams, social media, and more. With the move from batch to real time systems, the efficient capture and analysis of time series data can enable organizations to better detect and respond to events ahead of their competitors or to improve operational efficiency to reduce cost and risk. Working with time series data is often different from regular application data, and there are best practices you should observe.
This talk covers:
Common components of an IoT solution
The challenges involved with managing time-series data in IoT applications
Different schema designs, and how these affect memory and disk utilization – two critical factors in application performance.
How to query, analyze and present IoT time-series data using MongoDB Compass and MongoDB Charts
At the end of the session, you will have a better understanding of key best practices in managing IoT time-series data with MongoDB.
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
Our clients have unique use cases and data patterns that mandate the choice of a particular strategy. To implement these strategies, it is mandatory that we unlearn a lot of relational concepts while designing and rapidly developing efficient applications on NoSQL. In this session, we will talk about some of our client use cases, the strategies we have adopted, and the features of MongoDB that assisted in implementing these strategies.
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB
Encryption is not a new concept to MongoDB. Encryption may occur in-transit (with TLS) and at-rest (with the encrypted storage engine). But MongoDB 4.2 introduces support for Client Side Encryption, ensuring the most sensitive data is encrypted before ever leaving the client application. Even full access to your MongoDB servers is not enough to decrypt this data. And better yet, Client Side Encryption can be enabled at the "flick of a switch".
This session covers using Client Side Encryption in your applications. This includes the necessary setup, how to encrypt data without sacrificing queryability, and what trade-offs to expect.
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB
MongoDB Kubernetes operator is ready for prime-time. Learn about how MongoDB can be used with the most popular orchestration platform, Kubernetes, and bring self-service, persistent storage to your containerized applications.
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
These days, everyone is expected to be a data analyst. But with so much data available, how can you make sense of it and be sure you're making the best decisions? One great approach is to use data visualizations. In this session, we take a complex dataset and show how the breadth of capabilities in MongoDB Charts can help you turn bits and bytes into insights.
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
When you need to model data, is your first instinct to start breaking it down into rows and columns? Mine used to be too. When you want to develop apps in a modern, agile way, NoSQL databases can be the best option. Come to this talk to learn how to take advantage of all that NoSQL databases have to offer and discover the benefits of changing your mindset from the legacy, tabular way of modeling data. We’ll compare and contrast the terms and concepts in SQL databases and MongoDB, explain the benefits of using MongoDB compared to SQL databases, and walk through data modeling basics so you feel confident as you begin using MongoDB.
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
Join this talk and test session with a MongoDB Developer Advocate where you'll go over the setup, configuration, and deployment of an Atlas environment. Create a service that you can take back in a production-ready state and prepare to unleash your inner genius.
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
The document discusses guidelines for ordering fields in compound indexes to optimize query performance. It recommends the E-S-R approach: placing equality fields first, followed by sort fields, and range fields last. This allows indexes to leverage equality matches, provide non-blocking sorts, and minimize scanning. Examples show how indexes ordered by these guidelines can support queries more efficiently by narrowing the search bounds.
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
Aggregation pipeline has been able to power your analysis of data since version 2.2. In 4.2 we added more power and now you can use it for more powerful queries, updates, and outputting your data to existing collections. Come hear how you can do everything with the pipeline, including single-view, ETL, data roll-ups and materialized views.
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
MongoDB Atlas Data Lake is a new service offered by MongoDB Atlas. Many organizations store long term, archival data in cost-effective storage like S3, GCP, and Azure Blobs. However, many of them do not have robust systems or tools to effectively utilize large amounts of data to inform decision making. MongoDB Atlas Data Lake is a service allowing organizations to analyze their long-term data to discover a wealth of information about their business.
This session will take a deep dive into the features that are currently available in MongoDB Atlas Data Lake and how they are implemented. In addition, we'll discuss future plans and opportunities and offer ample Q&A time with the engineers on the project.
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
Virtual assistants are becoming the new norm when it comes to daily life, with Amazon’s Alexa being the leader in the space. As a developer, not only do you need to make web and mobile compliant applications, but you need to be able to support virtual assistants like Alexa. However, the process isn’t quite the same between the platforms.
How do you handle requests? Where do you store your data and work with it to create meaningful responses with little delay? How much of your code needs to change between platforms?
In this session we’ll see how to design and develop applications known as Skills for Amazon Alexa powered devices using the Go programming language and MongoDB.
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
to Core Data, appreciated by hundreds of thousands of developers. Learn what makes Realm special and how it can be used to build better applications faster.
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
It has never been easier to order online and get delivered in under 48 hours, very often for free. This ease of use hides a complex market worth more than $8,000 billion.
Data is well known in the Supply Chain world (routes, information about goods, customs, ...), but the value of this operational data remains largely untapped. By combining business expertise and Data Science, Upply is redefining the fundamentals of the Supply Chain, enabling each market player to overcome the volatility and inefficiency of the market.
MongoDB .local Paris 2020: Les bonnes pratiques pour sécuriser MongoDBMongoDB
Every company is becoming a software company, providing client solutions to access a variety of services and information. Companies are now starting to leverage their data and obtain better insights for the business. A crucial challenge is ensuring that this data is always available and secure, in compliance with the company's business objectives and with countries' regulatory constraints. MongoDB provides the security layer you need; come and discover how to secure your data with MongoDB.
MongoDB .local Paris 2020: Tout savoir sur le moteur de recherche Full Text S...MongoDB
Come learn more about our new full-text search operator for MongoDB Atlas. It is a significant improvement to MongoDB's search capabilities, and it is also the simplest and most powerful full-text search solution for MongoDB Atlas databases.
This presentation matters to anyone who has implemented, or is considering implementing, search functionality in their MongoDB application.
You will see a demo of $searchBeta, learn how it works, discover specific features that help you obtain relevant search results, and learn how you can start using full-text search in your application today.
Full-RAG: A modern architecture for hyper-personalizationZilliz
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
“An Outlook of the Ongoing and Future Relationship between Blockchain Technologies and Process-aware Information Systems.” Invited talk at the joint workshop on Blockchain for Information Systems (BC4IS) and Blockchain for Trusted Data Sharing (B4TDS), co-located with with the 36th International Conference on Advanced Information Systems Engineering (CAiSE), 3 June 2024, Limassol, Cyprus.
Taking AI to the Next Level in Manufacturing.pdfssuserfac0301
Read Taking AI to the Next Level in Manufacturing to gain insights on AI adoption in the manufacturing industry, such as:
1. How quickly AI is being implemented in manufacturing.
2. Which barriers stand in the way of AI adoption.
3. How data quality and governance form the backbone of AI.
4. Organizational processes and structures that may inhibit effective AI adoption.
5. Ideas and approaches to help build your organization's AI strategy.
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Speck&Tech
ABSTRACT: At first glance, a Lego brick and the XZ backdoor might seem to have in common only the fact that both are building blocks, or dependencies, of creative and software projects. In reality, a Lego brick and the XZ backdoor case have much more than that in common.
Join the presentation to dive into a story of interoperability, standards and open formats, and then discuss the important role that contributors play in a sustainable open source community.
BIO: Advocate for free software and for standard, open formats. She has been an active member of the Fedora and openSUSE projects and co-founded the LibreItalia Association, where she was involved in several events, migrations and training activities related to LibreOffice. She previously worked on LibreOffice migrations and training courses for several public administrations and private companies. Since January 2020 she has worked at SUSE as a Software Release Engineer for Uyuni and SUSE Manager, and when she is not following her passion for computers and for Geeko she cultivates her curiosity about astronomy (which is where her nickname deneb_alpha comes from).
Programming Foundation Models with DSPy - Meetup SlidesZilliz
Prompting language models is hard, while programming language models is easy. In this talk, I will discuss the state-of-the-art framework DSPy for programming foundation models with its powerful optimizers and runtime constraint system.
Driving Business Innovation: Latest Generative AI Advancements & Success StorySafe Software
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
Climate Impact of Software Testing at Nordic Testing DaysKari Kakkonen
My slides at Nordic Testing Days 6.6.2024
Climate impact / sustainability of software testing is discussed in the talk. ICT and testing must carry their part of the global responsibility to help with climate warming. We can minimize the carbon footprint, but we can also have a carbon handprint, a positive impact on the climate. Quality characteristics can be extended with sustainability and then measured continuously. Test environments can be used less, at smaller scale, and on demand. Test techniques can be used to optimize or minimize the number of tests. Test automation can be used to speed up testing.
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-und-domino-lizenzkostenreduzierung-in-der-welt-von-dlau/
DLAU and the CCB and CCX licensing models have been a hot topic for many in the HCL community since last year. As a Notes or Domino customer, you may be struggling with unexpectedly high user counts and license fees. You may be wondering how this new kind of licensing works and what benefits it brings you. Above all, you surely want to stay within budget and save costs wherever possible. We understand, and we want to help!
We explain how to solve common configuration problems that can cause more users to be counted than necessary, and how to identify and remove superfluous or unused accounts to save money. There are also some approaches that can lead to unnecessary expenses, e.g. when a person document is used instead of a mail-in for shared mailboxes. We show you such cases and their solutions. And of course we explain the new licensing model.
Join this webinar, in which HCL Ambassador Marc Thomas and guest speaker Franz Walder introduce you to this new world. It gives you the tools and know-how to keep track of what is going on. You will be able to reduce your costs through an optimized Domino configuration and keep them low in the future.
These topics will be covered
- Reducing license costs by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how best to use it
- Tips for common problem areas, such as team mailboxes, functional/test users, etc.
- Practical examples and best practices to implement right away
Building Production Ready Search Pipelines with Spark and MilvusZilliz
Spark is the widely used ETL tool for processing, indexing and ingesting data to the serving stack for search. Milvus is a production-ready open-source vector database. In this talk we will show how to use Spark to process unstructured data to extract vector representations, and push the vectors to the Milvus vector database for search serving.
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
Removing Uninteresting Bytes in Software FuzzingAftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries: Libxml's xmllint, a tool for parsing XML documents, and Binutils' readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format) files. Our preliminary results show that AFL+DIAR not only discovers new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns, and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
UiPath Test Automation using UiPath Test Suite series, part 6DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
The UiPath Test Automation with generative AI and OpenAI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI with OpenAI's advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
OpenID AuthZEN Interop Read Out - AuthorizationDavid Brossard
During Identiverse 2024 and EIC 2024, members of the OpenID AuthZEN WG got together and demoed their authorization endpoints conforming to the AuthZEN API
GraphRAG for Life Science to increase LLM accuracyTomaz Bratanic
GraphRAG for the life science domain, where you retrieve information from biomedical knowledge graphs using LLMs to increase the accuracy and performance of generated answers
HCL Notes and Domino License Cost Reduction in the World of DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-and-domino-license-cost-reduction-in-the-world-of-dlau/
The introduction of DLAU and the CCB & CCX licensing model caused quite a stir in the HCL community. As a Notes and Domino customer, you may have faced challenges with unexpected user counts and license costs. You probably have questions on how this new licensing approach works and how to benefit from it. Most importantly, you likely have budget constraints and want to save money where possible. Don’t worry, we can help with all of this!
We’ll show you how to fix common misconfigurations that cause higher-than-expected user counts, and how to identify accounts which you can deactivate to save money. There are also frequent patterns that can cause unnecessary cost, like using a person document instead of a mail-in for shared mailboxes. We’ll provide examples and solutions for those as well. And naturally we’ll explain the new licensing model.
Join HCL Ambassador Marc Thomas in this webinar with a special guest appearance from Franz Walder. It will give you the tools and know-how to stay on top of what is going on with Domino licensing. You will be able to lower your costs through an optimized configuration and keep them low going forward.
These topics will be covered
- Reducing license cost by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how to best utilize it
- Tips for common problem areas, like team mailboxes, functional/test users, etc
- Practical examples and best practices to implement right away
3. Goals of the Presentation
• Document vs Tabular: recognize the differences
• Methodology: summarize the steps when modeling for MongoDB
• Patterns: recognize when to apply
8. #MDBLocal
Thinking in Documents
• Polymorphism: different documents may contain different fields
• Array: represents a "one-to-many" relation; each entry can be indexed separately
• Sub-document: groups related fields together
• JSON/BSON: documents are shown as JSON; BSON is the physical storage format
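The bullet points above can be illustrated with two hypothetical documents; all field names and values here are invented for illustration, not taken from the talk:

```python
# Two documents in the same hypothetical "people" collection.
# Polymorphism: they do not share the same set of fields.
actor = {
    "name": "Sam Lee",
    "date_of_birth": "1980-05-01",
    # Array: a one-to-many relation; an index on "movies" would
    # contain one entry per array element (a multikey index).
    "movies": ["Movie A", "Movie B"],
    # Sub-document: related fields grouped together.
    "address": {"city": "Chicago", "state": "IL"},
}
director = {
    "name": "Alex Kim",
    "awards": 3,  # a field the actor document does not have
}

# Both shapes coexist happily in one collection.
people = [actor, director]
assert "awards" not in actor and "movies" not in director
```

Shown here as plain Python dicts for readability; in the database these would be stored as BSON.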
12. #MDBLocal
Example: Modeling a Social Network
Solution A: Fan Out on writes (pre-aggregated data)
✓ Slower writes
✓ More storage space
✓ Duplication
✓ Faster reads
Solution B: Fan Out on reads
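A minimal sketch of the two approaches, using plain Python structures in place of collections (the data shapes are assumed, not from the talk):

```python
# Fan out on writes vs. fan out on reads for a tiny social feed.
followers = {"alice": ["bob", "carol"]}   # alice is followed by bob and carol
posts = []                                # fan-out-on-reads: one copy per post
feeds = {"bob": [], "carol": []}          # fan-out-on-writes: pre-aggregated feeds

def publish(author, text):
    # Fan out on reads: a single fast write; work is deferred to read time.
    posts.append({"author": author, "text": text})
    # Fan out on writes: slower write, duplicated storage, but reading a
    # feed becomes a cheap lookup of the pre-aggregated document.
    for f in followers.get(author, []):
        feeds[f].append({"author": author, "text": text})

def read_feed_fan_out_on_reads(user):
    # Must scan for everyone the user follows, on every read.
    following = [a for a, fs in followers.items() if user in fs]
    return [p for p in posts if p["author"] in following]

publish("alice", "hello")
# Same result, different cost profile.
assert read_feed_fan_out_on_reads("bob") == feeds["bob"]
```

The trade-off mirrors the slide: duplication and slower writes buy faster reads.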
13–17. #MDBLocal
Differences: Tabular vs Document

Steps to create the model
• Tabular: 1 – define schema; 2 – develop app and queries
• MongoDB: 1 – identify the queries; 2 – define schema

Initial schema
• Tabular: 3rd normal form; one possible solution
• MongoDB: many possible solutions

Final schema
• Tabular: likely denormalized
• MongoDB: few changes

Schema evolution
• Tabular: difficult and not optimal; likely downtime
• MongoDB: easy; no downtime

Performance
• Tabular: mediocre
• MongoDB: optimized
24. #MDBLocal
Actors, Movies and Reviews
• actors: name, date_of_birth
• movies: title, revenues
• reviews: name, rating
Fields in play: actor_name, date_of_birth, movie_title, revenues, reviewer_name, rating
25. #MDBLocal
Actors, Movies and Reviews
• actors: name, date_of_birth, movies: [ .. ]
• movies: title, revenues, actors: [ .. ]
• reviews: name, rating
Fields in play: actor_name, date_of_birth, movie_title, revenues, reviewer_name, rating
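A hypothetical set of documents for this model, with the actor/movie relation carried by arrays on both sides (all values are invented):

```python
# Documents following the slide's field names; values are illustrative only.
actor = {
    "name": "Sam Lee",
    "date_of_birth": "1980-05-01",
    "movies": ["Movie A", "Movie B"],    # one-to-many via an array
}
movie = {
    "title": "Movie A",
    "revenues": 1_000_000,
    "actors": ["Sam Lee", "Alex Kim"],   # many-to-many: arrays on both sides
}
review = {
    "movie_title": "Movie A",            # reference to the movie by title
    "reviewer_name": "Pat",
    "rating": 4,
}

# The arrays link the two collections in both directions.
assert movie["title"] in actor["movies"]
assert actor["name"] in movie["actors"]
```

Reviews stay in their own collection and point back at the movie, rather than being embedded.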
30–33. #MDBLocal
Case Study: Coffee Shop Franchises
Name: Beyond the Stars Coffee
Objective:
• 10 000 stores in the United States
• … then we expand to the rest of the World
Keys to success:
1. Best coffee in the world
2. Best Technology
34. #MDBLocal
Make the Best Coffee in the World
23g of ground coffee in, 20g of extracted coffee out, in approximately 20 seconds
1. Fill a small or regular cup with 80% hot water (not boiling but pretty hot). Your cup should be 150ml to 200ml in total volume, 80% of which will be hot water.
2. Grind 23g of coffee into your portafilter using the double basket. We use a scale that you can get here.
3. Draw 20g of coffee over the hot water by placing your cup on a scale, press tare and extract your shot.
37. #MDBLocal
Key to Success 2: Best Technology
a) Intelligent Shelves
• Measure inventory in real time
b) Intelligent Coffee Machines
• Weighings, temperature, time to produce, …
• Coffee perfection
c) Intelligent Data Storage
• MongoDB
44. #MDBLocal
1 – Workload: List Queries
1. Coffee weight on the shelves (write): a shelf sends information when coffee bags are added or removed
2. Coffee to deliver to stores (read): how much coffee we have to ship to each store in the coming days
3. Anomalies in the inventory (read): analytics
4. Making a cup of coffee (write): a coffee machine reports on the production of a coffee cup
5. Analysis of cups of coffee (read): analytics
6. Technical Support (read): helping our franchisees
45. #MDBLocal
1 – Workload: quantify/qualify the queries
1. Coffee weight on the shelves: 10/day per shelf per store => 1/sec; <1s, critical write
2. Coffee to deliver to stores: 1/day per store => 0.1/sec; <60s
3. Anomalies in the inventory: 24 reads/day; <5 mins, "collection scan"
4. Making a cup of coffee: 10 000 000 writes/day => 115 writes/sec; <100ms, non-critical write
… cups of coffee at rush hour: 3 000 000 writes/hr => 833 writes/sec; <100ms, non-critical write
5. Analysis of cups of coffee: 24 reads/day; stale data is fine, "collection scan"
6. Technical Support: 1000 reads/day; <1s
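The per-second figures above fall out of simple arithmetic. The store count and daily volumes come from the case study; the helper function and its name are illustrative.

```python
# Average a daily event count into a per-second rate, then apply it
# to the workload figures from the slide.

STORES = 10_000
SECONDS_PER_DAY = 86_400

def per_second(events_per_day):
    """Convert a daily event count to an average per-second rate."""
    return events_per_day / SECONDS_PER_DAY

cup_writes = per_second(10_000_000)      # query 4: ~115 writes/sec
rush_writes = 3_000_000 / 3_600          # rush hour: ~833 writes/sec
delivery_reads = per_second(1 * STORES)  # query 2: ~0.1 reads/sec
```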
47. #MDBLocal
Disk Space
Cups of coffee
• one year of data
• 10 000 stores x 1000/day x 365
• 3.7 billion/year
• 370 GB (100 bytes/cup of coffee)
Weighings
• one year of data
• 10 000 stores x 10/day x 365
• 36.5 million/year
• 3.7 GB (100 bytes/weighing)
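A quick back-of-the-envelope check of the figures above (10 000 stores, 100 bytes per document, one year of data):

```python
# Yearly document counts and storage for the two collections.

STORES = 10_000
BYTES_PER_DOC = 100

cups_per_year = STORES * 1_000 * 365     # 3.65 billion cup documents
weighings_per_year = STORES * 10 * 365   # 36.5 million weighing documents

cups_gb = cups_per_year * BYTES_PER_DOC / 1e9          # ~365 GB
weighings_gb = weighings_per_year * BYTES_PER_DOC / 1e9  # ~3.65 GB
```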
49. #MDBLocal
2 – Relations are still important

Document embedded in the parent document
• one-to-one (1-1): one read, no joins
• one-to-many (1-N): one read, no joins
• many-to-many (N-N): one read, no joins, duplication of information

Document referenced in the parent document
• one-to-one (1-1): smaller reads, many reads
• one-to-many (1-N): smaller reads, many reads
• many-to-many (N-N): smaller reads, many reads
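The embed/reference trade-off can be sketched with dicts standing in for documents and a second dict standing in for another collection. All names and values here are made up for illustration.

```python
# A tiny "movies" collection, keyed by _id.
movies = {"m1": {"_id": "m1", "title": "Space Beans", "revenues": 1_000_000}}

# Embedded: one read returns everything (duplicating movie data
# under every actor in a many-to-many relation).
actor_embedded = {"name": "Ann", "movies": [{"title": "Space Beans"}]}

# Referenced: a smaller parent document, but each id costs another
# read (an application-side join).
actor_referenced = {"name": "Ann", "movie_ids": ["m1"]}
titles = [movies[mid]["title"] for mid in actor_referenced["movie_ids"]]
```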
53. #MDBLocal
Schema Design Patterns Resources
A. Advanced Schema Design Patterns
• MongoDB World 2017
B. Blogs on Patterns, Ken Alger & Daniel Coupal
• https://www.mongodb.com/blog/post/building-with-patterns-a-summary
C. MongoDB University: M320 – Data Modeling
• https://university.mongodb.com/courses/M320/about
63. Takeaways from the Presentation
• Document vs Tabular: recognize the differences
• Methodology: summarize the steps when modeling for MongoDB
• Patterns: recognize when to apply them
66. Thank you for taking our FREE
MongoDB classes at
university.mongodb.com
76. #MDBLocal
Application Lifecycle
Modify the application
• Can read/process all versions of documents
• Have a different handler per version
• Reshape the document before processing it
Update all application servers
• Install the updated application
• Remove old processes
Once the migration is completed
• Remove the code that processes old versions
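The "different handler per version" idea can be sketched as a dispatch table that reshapes any supported document version to the latest before processing. The field names (schema_version, phone, phones) are illustrative, not from the talk.

```python
# Reshape a v1 contact document (single phone) to the v2 shape
# (array of phones) before the application processes it.

def handle_v1(doc):
    phone = doc.pop("phone")  # v1 stored a single phone number
    return {**doc, "phones": [phone], "schema_version": 2}

def handle_v2(doc):
    return doc  # already the latest shape

HANDLERS = {1: handle_v1, 2: handle_v2}

def read_contact(doc):
    version = doc.get("schema_version", 1)  # no field means version 1
    return HANDLERS[version](dict(doc))
```

Calling `read_contact({"name": "Ann", "phone": "555-0100"})` yields the v2 shape, while a v2 document passes through unchanged.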
77. #MDBLocal
Document Lifecycle
New documents
• the application writes them in the latest version
Existing documents
A) Use updates to documents
• transform them to the latest version when they are next updated
• documents that never need an update can keep their old version forever
B) or transform all documents in batch
• no worry even if the process takes days
79. #MDBLocal
Schema Versioning Pattern

Problem
● Avoid downtime while doing schema upgrades
● Upgrading all documents can take hours, days or even weeks when dealing with big data
● Don't want to update all documents

Solution
● Each document gets a "schema_version" field
● The application can handle all versions
● Choose your strategy to migrate the documents

Use Cases Examples
● Every application that uses a database, deployed in production and heavily used
● Systems with a lot of legacy data

Benefits and Trade-Offs
● No downtime needed
● Feel in control of the migration
● Less future technical debt
! May need 2 indexes for the same field during the migration period
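The batch-migration strategy can be sketched with a list standing in for a MongoDB collection; a real migration would issue bulk updates against the database instead. Field names (schema_version, phone, phones) are illustrative.

```python
# Two documents: a v1 document (no version field) and a v2 document.
collection = [
    {"_id": 1, "name": "Ann", "phone": "555-0100"},
    {"_id": 2, "name": "Bob", "phones": ["555-0101"], "schema_version": 2},
]

def migrate(doc):
    """Bring a document to the latest schema version, if needed."""
    if doc.get("schema_version", 1) == 1:
        doc["phones"] = [doc.pop("phone")]
        doc["schema_version"] = 2
    return doc

collection = [migrate(dict(d)) for d in collection]
```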
85. #MDBLocal
Computed Pattern

Problem
● Costly computation or manipulation of data
● Executed frequently on the same data, producing the same result

Solution
● Perform the operation and store the result in the appropriate document and collection
● If you need to redo the operations, keep their source data

Use Cases Examples
● Internet of Things (IoT)
● Event Sourcing
● Time Series Data
● Frequent Aggregation Framework queries

Benefits and Trade-Offs
● Read queries are faster
● Saving on resources like CPU and disk
! May be difficult to identify the need
! Avoid applying or overusing it unless needed
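The pattern can be sketched by maintaining a per-store summary as cup documents arrive, instead of re-aggregating every cup on each read. In MongoDB this could be an `$inc` update on a summary document; here a dict stands in for the summary collection, and the field names are made up.

```python
summaries = {}  # store_id -> computed summary document

def record_cup(store_id, grams_out):
    """Update the precomputed summary as each cup is produced."""
    s = summaries.setdefault(store_id, {"cups": 0, "total_grams": 0})
    s["cups"] += 1
    s["total_grams"] += grams_out

# Three cups arrive from one machine; reads of the summary stay O(1).
for grams in (20, 19, 21):
    record_cup("store-42", grams)
```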