Speaker: Charlie Swanson
Learn how MongoDB answers your queries from a query system engineer. If you've ever had a performance problem with a query but didn't know how to find the cause, or if you've ever needed to confirm that your shiny new index is being put to work, the explain command is an excellent place to start. MongoDB's explain system is a powerful tool for solving this type of problem, but can be intimidating and unwieldy to use. In this talk, we will discuss how the explain command works and break down its output into digestible pieces.
MongoDB World 2019: Tips and Tricks++ for Querying and Indexing MongoDBMongoDB
Query performance can either be a constant headache or the unsung hero of an application. MongoDB provides extremely powerful querying capabilities when used properly. As a senior member of the support team I will share more common mistakes observed and some tips and tricks for avoiding them.
Presented by Tom Schreiber, Senior Consulting Engineer, MongoDB
MongoDB supports a wide range of indexing options to enable fast querying of your data, but what are the right strategies for your application? In this talk we’ll cover how indexing works, the various indexing options, and use cases where each might be useful. We'll dive into common pitfalls using real-world examples to ensure that you're ready for scale. We'll show you the tools and techniques for diagnosing and tuning the performance of your MongoDB deployment. Whether you're running into problems or just want to optimize your performance, these skills will be useful.
MongoDB .local Toronto 2019: Aggregation Pipeline Power++: How MongoDB 4.2 Pi...MongoDB
The aggregation pipeline has been able to power your analysis of data since version 2.2. In 4.2 we added more power, and now you can use it for more powerful queries, updates, and outputting your data to existing collections. Come hear how you can do everything with the pipeline, including single-view, ETL, data roll-ups and materialized views.
MongoDB .local Toronto 2019: Tips and Tricks for Effective IndexingMongoDB
Query performance can either be a constant headache or the unsung hero of an application. MongoDB provides extremely powerful querying capabilities when used properly. I will share more common mistakes observed and some tips and tricks for avoiding them.
MongoDB World 2019: The Sights (and Smells) of a Bad QueryMongoDB
“Why is MongoDB so slow?” you may ask yourself on occasion. You’ve created indexes, you’ve learned how to use the aggregation pipeline. What the heck? Could it be your queries? This talk will outline what tools are at your disposal (both in MongoDB Atlas and in MongoDB server) to identify inefficient queries.
As your data grows, the need to establish proper indexes becomes critical to performance. MongoDB supports a wide range of indexing options to enable fast querying of your data, but what are the right strategies for your application?
In this talk we’ll cover how indexing works, the various indexing options, and use cases where each can be useful. We'll dive into common pitfalls using real-world examples to ensure that you're ready for scale.
MongoDB .local Houston 2019: Tips and Tricks++ for Querying and Indexing MongoDBMongoDB
Query performance can either be a constant headache or the unsung hero of an application. MongoDB provides extremely powerful and performant querying capabilities by efficiently leveraging indexes. As the lead query expert on our global support team I will share some best practices for querying arrays, discuss the importance of compound indexes, and walk through how indexes are traversed by the database.
Video available here: http://vivu.tv/portal/archive.jsp?flow=783-586-4282&id=1270584002677
We all know that MongoDB is one of the most flexible and feature-rich databases available. In this webinar we'll discuss how you can leverage this feature set and maintain high performance with your project's massive data sets and high loads. We'll cover how indexes can be designed to optimize the performance of MongoDB. We'll also discuss tips for diagnosing and fixing performance issues should they arise.
Find out which is faster, SQL or NoSQL, for traditional reporting tasks. Discover how you can optimise MongoDB aggregation pipelines and how to push complex computation down to the database.
Webinar: Working with Graph Data in MongoDBMongoDB
With the release of MongoDB 3.4, the number of applications that can take advantage of MongoDB has expanded. In this session we will look at using MongoDB for representing graphs and how graph relationships can be modeled in MongoDB.
We will also look at a new aggregation operation that we recently implemented for graph traversal and computing transitive closure. We will include an overview of the new operator and provide examples of how you can exploit this new feature in your MongoDB applications.
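As a rough illustration of what graph traversal and transitive closure mean in this setting, the sketch below walks a small in-memory list of documents that each hold references to other documents. The sample documents, the `connects_to` field and the `reachable` helper are all hypothetical, and this is plain Python rather than the MongoDB operator the session describes.

```python
from collections import deque

# Hypothetical documents: each one lists the _ids it links to.
airports = [
    {"_id": "JFK", "connects_to": ["BOS", "ORD"]},
    {"_id": "BOS", "connects_to": ["JFK", "PWM"]},
    {"_id": "ORD", "connects_to": ["JFK"]},
    {"_id": "PWM", "connects_to": ["BOS", "LHR"]},
    {"_id": "LHR", "connects_to": ["PWM"]},
    {"_id": "SYD", "connects_to": []},
]


def reachable(docs, start, max_depth=None):
    """Breadth-first traversal: map every reachable _id to its depth.

    Direct neighbours of `start` have depth 0. `max_depth` optionally
    caps how far the traversal goes.
    """
    by_id = {d["_id"]: d for d in docs}
    seen = {}
    queue = deque((n, 0) for n in by_id[start]["connects_to"])
    while queue:
        node, depth = queue.popleft()
        # Skip the start node, unknown references, and nodes already visited.
        if node == start or node in seen or node not in by_id:
            continue
        if max_depth is not None and depth > max_depth:
            continue
        seen[node] = depth
        queue.extend((n, depth + 1) for n in by_id[node]["connects_to"])
    return seen
```

Calling `reachable(airports, "JFK")` returns every airport connected to JFK through any number of hops, which is the transitive closure of the `connects_to` relation from that starting point. The isolated `SYD` document never appears in the result.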
^Regular Expressions is one of those tools that every developer should have in their toolbox. You can do your job without regular expressions, but knowing when and how to use them will make you a much more efficient and marketable developer. You'll learn how regular expressions can be used for validating user input, parsing text, and refactoring code. We'll also cover various tools that can be used to help you write and share expressions.$
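The three uses named above (validating input, parsing text, refactoring code) can each be sketched in a few lines with Python's standard `re` module. The username rule, the `key=value` line format and the `old_name`/`new_name` function names are made up for illustration only.

```python
import re

# Validation: a deliberately simple, hypothetical username rule
# (3 to 16 characters; letters, digits or underscore).
USERNAME = re.compile(r"[A-Za-z0-9_]{3,16}")


def is_valid_username(text):
    # fullmatch requires the whole string to match the pattern.
    return USERNAME.fullmatch(text) is not None


# Parsing: pull key=value pairs out of a log-style line.
PAIR = re.compile(r"(\w+)=(\S+)")


def parse_pairs(line):
    return dict(PAIR.findall(line))


# Refactoring: rename calls to a hypothetical old_name(...) function.
# \b keeps identifiers that merely end in "old_name" untouched.
def rename_calls(source):
    return re.sub(r"\bold_name\(", "new_name(", source)
```

For example, `parse_pairs("level=warn op=query ms=120")` yields a dictionary with the keys `level`, `op` and `ms`, and `rename_calls` leaves a call such as `my_old_name(2)` alone because the underscore before `old_name` is a word character.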
Speakers: Lars George and Jon Hsieh (Cloudera)
Today, there are hundreds of production HBase clusters running a multitude of applications and use cases. Many well-known implementations exercise opposite ends of HBase's capabilities, emphasizing either entity-centric schemas or event-based schemas. This talk presents these archetypes and others based on a use-case survey of clusters conducted by Cloudera's development, product, and services teams. By analyzing the data from the nearly 20,000 HBase cluster nodes Cloudera has under management, we'll categorize HBase users and their use cases into a few simple archetypes, describe workload patterns, and quantify the usage of advanced features. We'll also explain what an HBase user can do to alleviate pressure points from these fundamentally different workloads, and these results will provide insight into what lies in HBase's future.
In real life, almost any project deals with tree structures. Different kinds of taxonomies, site structures, etc. require modeling of hierarchy relations.
Typical approaches used:
● Model Tree Structures with Child References
● Model Tree Structures with Parent References
● Model Tree Structures with an Array of Ancestors
● Model Tree Structures with Materialized Paths
● Model Tree Structures with Nested Sets
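To make a few of these approaches concrete, the sketch below derives parent references, an array of ancestors and a materialized path for the same small tree. The category names, field names and comma-delimited path encoding are assumptions chosen for illustration; the documents are plain Python dictionaries, not the output of any database.

```python
# Hypothetical category tree expressed as child -> parent (None marks the root).
PARENT = {
    "Electronics": None,
    "Cameras": "Electronics",
    "Phones": "Electronics",
    "DSLR": "Cameras",
}


def ancestors(node):
    """Return the ancestors of `node`, ordered from the root down to its parent."""
    chain = []
    parent = PARENT[node]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain[::-1]


def as_documents():
    """Build one document per node carrying three representations side by side."""
    docs = []
    for node, parent in PARENT.items():
        anc = ancestors(node)
        docs.append({
            "_id": node,
            "parent": parent,                             # parent reference
            "ancestors": anc,                             # array of ancestors
            "path": "," + ",".join(anc + [node]) + ",",   # materialized path
        })
    return docs


def descendants_by_path(docs, node):
    """Find a subtree with a substring test on the materialized path."""
    needle = "," + node + ","
    return sorted(d["_id"] for d in docs if needle in d["path"] and d["_id"] != node)
```

The trade-off is visible even at this scale: the parent reference is the cheapest to maintain, while the ancestors array and the materialized path make subtree lookups a single test per document at the cost of rewriting those fields whenever a node moves.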
The openCypher Project - An Open Graph Query LanguageNeo4j
We want to present the openCypher project, whose purpose is to make Cypher available to everyone – every data store, every tooling provider, every application developer. openCypher is a continual work in progress. Over the next few months, we will move more and more of the language artifacts over to GitHub to make it available for everyone.
openCypher is an open source project that delivers four key artifacts released under a permissive license: (i) the Cypher reference documentation, (ii) a Technology compatibility kit (TCK), (iii) Reference implementation (a fully functional implementation of key parts of the stack needed to support Cypher inside a data platform or tool) and (iv) the Cypher language specification.
We are also seeking to make the process of specifying and evolving the Cypher query language as open as possible, and are actively seeking comments and suggestions on how to improve the Cypher query language.
The purpose of this talk is to provide more details regarding the above-mentioned aspects.
Scylla Summit 2022: Scylla 5.0 New Features, Part 1ScyllaDB
Discover the new features and capabilities of Scylla Open Source 5.0 directly from the engineers who developed it. This second block of lightning talks will cover the following topics:
- New IO Scheduler and Disk Parallelism
- Per-Service-Level Timeouts
- Better Workload Estimation for Backpressure and Out-of-Memory Conditions
- Large Partition Handling Improvements
- Optimizing Reverse Queries
To watch all of the recordings hosted during Scylla Summit 2022 visit our website here: https://www.scylladb.com/summit.
Webscale PostgreSQL - JSONB and Horizontal Scaling StrategiesJonathan Katz
All data is relational and can be represented through relational algebra, right? Perhaps, but there are other ways to represent data, and the PostgreSQL team continues to work on making it easier and more efficient to do so!
With the upcoming 9.4 release, PostgreSQL is introducing the "JSONB" data type which allows for fast, compressed, storage of JSON formatted data, and for quick retrieval. And JSONB comes with all the benefits of PostgreSQL, like its data durability, MVCC, and of course, access to all the other data types and features in PostgreSQL.
How fast is JSONB? How do we access data stored with this type? What can it do with the rest of PostgreSQL? What can't it do? How can we leverage this new data type and make PostgreSQL scale horizontally? Follow along with our presentation as we try to answer these questions.
Speaker: Charlie Swanson, Software Engineer, MongoDB
Level: 200 (Intermediate)
Track: How We Build MongoDB
Learn how MongoDB answers your queries from a query system engineer. If you've ever had a performance problem with a query but didn't know how to find the cause, or if you've ever needed to confirm that your shiny new index is being put to work, the explain command is an excellent place to start. MongoDB's explain system is a powerful tool for solving this type of problem, but can be intimidating and unwieldy to use. In this talk, we will discuss how the explain command works and break down its output into digestible pieces.
What You Will Learn:
- Exactly how indexes are used during your queries and aggregations
- How to diagnose your poorly performing operations
- How to tune your most important operations to ensure that they scale seamlessly
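One way to break explain output into digestible pieces is to walk the nested winning plan and list its stages. The sketch below assumes a document shaped like the `queryPlanner.winningPlan` section, with `stage`, `inputStage`, `inputStages` and `indexName` fields. The sample document is hand-written and heavily abbreviated, the helper names are hypothetical, and real output carries many more fields and can differ between server versions.

```python
def plan_stages(plan):
    """Yield each stage document of a nested plan, outermost first."""
    yield plan
    if "inputStage" in plan:
        yield from plan_stages(plan["inputStage"])
    for child in plan.get("inputStages", []):
        yield from plan_stages(child)


def summarize(explain_doc):
    """Reduce an explain-style document to stage names and index usage."""
    winning = explain_doc["queryPlanner"]["winningPlan"]
    stages = list(plan_stages(winning))
    return {
        "stages": [s["stage"] for s in stages],
        "indexes": [s["indexName"] for s in stages if "indexName" in s],
        "collection_scan": any(s["stage"] == "COLLSCAN" for s in stages),
    }


# Hand-written, abbreviated sample (an assumption, not captured server output).
sample = {
    "queryPlanner": {
        "winningPlan": {
            "stage": "FETCH",
            "inputStage": {"stage": "IXSCAN", "indexName": "status_1"},
        }
    }
}
```

Here `summarize(sample)` reports a `FETCH` stage fed by an `IXSCAN` on `status_1` and no collection scan, which is the kind of quick check that confirms a new index is being put to work.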
MongoDB.local DC 2018: Tips and Tricks for Avoiding Common Query PitfallsMongoDB
Query performance can either be a constant headache or the unsung hero of an application. MongoDB provides extremely powerful querying capabilities when used properly. As a member of the solutions architecture team, I will share common mistakes observed as well as tips and tricks for avoiding them.
I inherited a MongoDB database server with 60 collections and 100 or so indexes.
The business users are complaining about slow report completion times. What can I do to improve performance?
MongoDB .local Munich 2019: Still Haven't Found What You Are Looking For? Use...MongoDB
Come and hear more about our new full-text search operator for MongoDB Atlas. This is a significant enhancement to MongoDB search features and is the easiest and most powerful full-text search solution for databases on MongoDB Atlas.
This talk is important for anyone who has implemented search or is considering a search feature in their MongoDB application.
You will see a demo of $searchBeta, learn about how it works, discover specific features to help you deliver relevant search results, and learn how you can start using full-text search in your application today.
Building a real time big data analytics platform with solrTrey Grainger
Having “big data” is great, but turning that data into actionable intelligence is where the real value lies. This talk will demonstrate how you can use Solr to build a highly scalable data analytics engine to enable customers to engage in lightning fast, real-time knowledge discovery.
At CareerBuilder, we utilize these techniques to report the supply and demand of the labor force, compensation trends, customer performance metrics, and many live internal platform analytics. You will walk away from this talk with an advanced understanding of faceting, including pivot-faceting, geo/radius faceting, time-series faceting, function faceting, and multi-select faceting. You’ll also get a sneak peek at some new faceting capabilities just wrapping up development including distributed pivot facets and percentile/stats faceting, which will be open-sourced.
The presentation will be a technical tutorial, along with real-world use-cases and data visualizations. After this talk, you'll never see Solr as just a text search engine again.
The core Search frameworks in Liferay 7 have been significantly retooled to benefit not only from Liferay's new modular architecture, but also from one of the most innovative players in the market: Elasticsearch, which replaces Lucene as the default search engine in Portal. This session will cover topics like clustering and scalability, unveil improvements (both Elasticsearch and Solr) like aggregations, filters, geolocation, "more like this" and other new query types, and also hot new features for the Enterprise like out-of-the-box Marvel cluster monitoring and Shield security.
André "Arbo" Oliveira joined Liferay in early 2014 as a senior engineer and leads the Search Infrastructure team. He's been writing code for a living for 22 years, 14 of them as a Java developer and architect. Ever since discovering Elasticsearch, he's vowed never to write another SQL WHERE clause again.
Data Exploration with Apache Drill: Day 1Charles Givre
Study after study shows that data scientists and analysts spend between 50% and 90% of their time preparing their data for analysis. Using Drill, you can dramatically reduce the time it takes to go from raw data to insight. This course will show you how.
The course materials for this presentation are available at https://github.com/cgivre/data-exploration-with-apache-drill
Beyond PHP - It's not (just) about the codeWim Godden
Most PHP developers focus on writing code. But creating Web applications is about much more than just writing PHP. Take a step outside the PHP cocoon and into the big PHP ecosphere to find out how small code changes can make a world of difference on servers and network. This talk is an eye-opener for developers who spend over 80% of their time coding, debugging and testing.
MongoDB .local Chicago 2019: Still Haven't Found What You Are Looking For? Us...MongoDB
Come and hear more about our new full-text search operator for MongoDB Atlas. This is a significant enhancement to MongoDB search features and is the easiest and most powerful full-text search solution for databases on MongoDB Atlas.
This talk is important for anyone who has implemented search or is considering a search feature in their MongoDB application.
You will see a demo of $searchBeta, learn about how it works, discover specific features to help you deliver relevant search results, and learn how you can start using full-text search in your application today.
MongoDB .local Paris 2020: Tout savoir sur le moteur de recherche Full Text S...MongoDB
Come and learn more about our new full-text search operator for MongoDB Atlas. It is a significant enhancement to MongoDB's search features and is also the simplest and most powerful full-text search solution for MongoDB Atlas databases.
This presentation is important for anyone who has implemented, or is considering implementing, a search feature in their MongoDB application.
You will see a demo of $searchBeta, learn how it works, discover specific features that help you get relevant search results, and learn how you can start using full-text search in your application today.
N1QL = SQL + JSON. N1QL gives developers and enterprises an expressive, powerful, and complete language for querying, transforming, and manipulating JSON data. We begin with a brief overview. Couchbase 5.0 has language and performance improvements for pagination, index exploitation, integration, and more. We’ll walk through scenarios, features, and best practices.
Topic: Discover deep insights with Salesforce Einstein Analytics and Discovery
ImpactSalesforceSaturday Session
by @newdelhisfdcdug
Speaker: Jayant Joshi
AGENDA
a. What is SFDC Einstein Analytics?
b. Let us build great Visualizations using Einstein Analytics
c. Discover Deep Insights with Einstein Discovery
d. Demo and QA
https://newdelhisfdcdug.com/salesforce-einstein-analytics-and-discovery/
Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc...NoSQLmatters
Simon Elliston Ball – When to NoSQL and When to Know SQL
With NoSQL, NewSQL and plain old SQL, there are so many tools around it’s not always clear which is the right one for the job. This is a look at a series of NoSQL technologies, comparing them against traditional SQL technology. I’ll compare real use cases and show how they are solved with both NoSQL options, and traditional SQL servers, and then see who wins. We’ll look at some code and architecture examples that fit a variety of NoSQL techniques, and some where SQL is a better answer. We’ll see some big data problems, little data problems, and a bunch of new and old database technologies to find whatever it takes to solve the problem. By the end you’ll hopefully know more NoSQL, and maybe even have a few new tricks with SQL, and what’s more how to choose the right tool for the job.
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB
During this talk we'll navigate through a customer's journey as they migrate an existing MongoDB deployment to MongoDB Atlas. While the migration itself can be as simple as a few clicks, the prep/post effort requires due diligence to ensure a smooth transfer. We'll cover these steps in detail and provide best practices. In addition, we’ll provide an overview of what to consider when migrating other cloud data stores, traditional databases and MongoDB imitations to MongoDB Atlas.
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB
These days, everyone is expected to be a data analyst. But with so much data available, how can you make sense of it and be sure you're making the best decisions? One great approach is to use data visualizations. In this session, we take a complex dataset and show how the breadth of capabilities in MongoDB Charts can help you turn bits and bytes into insights.
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
MongoDB Kubernetes operator and MongoDB Open Service Broker are ready for production operations. Learn about how MongoDB can be used with the most popular container orchestration platform, Kubernetes, and bring self-service, persistent storage to your containerized applications. A demo will show you how easy it is to enable MongoDB clusters as an External Service using the Open Service Broker API for MongoDB.
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB
Are you new to schema design for MongoDB, or are you looking for a more complete or agile process than what you are following currently? In this talk, we will guide you through the phases of a flexible methodology that you can apply to projects ranging from small to large with very demanding requirements.
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
Humana, like many companies, is tackling the challenge of creating real-time insights from data that is diverse and rapidly changing. This is our journey of how we used MongoDB to combine traditional batch approaches with streaming technologies to provide continuous alerting capabilities from real-time data streams.
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB
Time series data is increasingly at the heart of modern applications - think IoT, stock trading, clickstreams, social media, and more. With the move from batch to real time systems, the efficient capture and analysis of time series data can enable organizations to better detect and respond to events ahead of their competitors or to improve operational efficiency to reduce cost and risk. Working with time series data is often different from regular application data, and there are best practices you should observe.
This talk covers:
Common components of an IoT solution
The challenges involved with managing time-series data in IoT applications
Different schema designs, and how these affect memory and disk utilization – two critical factors in application performance.
How to query, analyze and present IoT time-series data using MongoDB Compass and MongoDB Charts
At the end of the session, you will have a better understanding of key best practices in managing IoT time-series data with MongoDB.
Join this talk and test session with a MongoDB Developer Advocate where you'll go over the setup, configuration, and deployment of an Atlas environment. Create a service that you can take back in a production-ready state and prepare to unleash your inner genius.
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
Our clients have unique use cases and data patterns that mandate the choice of a particular strategy. To implement these strategies, it is mandatory that we unlearn a lot of relational concepts while designing and rapidly developing efficient applications on NoSQL. In this session, we will talk about some of our client use cases, the strategies we have adopted, and the features of MongoDB that assisted in implementing these strategies.
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB
Encryption is not a new concept to MongoDB. Encryption may occur in-transit (with TLS) and at-rest (with the encrypted storage engine). But MongoDB 4.2 introduces support for Client Side Encryption, ensuring the most sensitive data is encrypted before ever leaving the client application. Even full access to your MongoDB servers is not enough to decrypt this data. And better yet, Client Side Encryption can be enabled at the "flick of a switch".
This session covers using Client Side Encryption in your applications. This includes the necessary setup, how to encrypt data without sacrificing queryability, and what trade-offs to expect.
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB
MongoDB Kubernetes operator is ready for prime-time. Learn about how MongoDB can be used with the most popular orchestration platform, Kubernetes, and bring self-service, persistent storage to your containerized applications.
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
These days, everyone is expected to be a data analyst. But with so much data available, how can you make sense of it and be sure you're making the best decisions? One great approach is to use data visualizations. In this session, we take a complex dataset and show how the breadth of capabilities in MongoDB Charts can help you turn bits and bytes into insights.
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
When you need to model data, is your first instinct to start breaking it down into rows and columns? Mine used to be too. When you want to develop apps in a modern, agile way, NoSQL databases can be the best option. Come to this talk to learn how to take advantage of all that NoSQL databases have to offer and discover the benefits of changing your mindset from the legacy, tabular way of modeling data. We’ll compare and contrast the terms and concepts in SQL databases and MongoDB, explain the benefits of using MongoDB compared to SQL databases, and walk through data modeling basics so you feel confident as you begin using MongoDB.
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
Join this talk and test session with a MongoDB Developer Advocate where you'll go over the setup, configuration, and deployment of an Atlas environment. Create a service that you can take back in a production-ready state and prepare to unleash your inner genius.
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
Query performance should be the unsung hero of an application, but without proper configuration, can become a constant headache. When used properly, MongoDB provides extremely powerful querying capabilities. In this session, we'll discuss concepts like equality, sort, range, managing query predicates versus sequential predicates, and best practices to building multikey indexes.
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
Aggregation pipeline has been able to power your analysis of data since version 2.2. In 4.2 we added more power and now you can use it for more powerful queries, updates, and outputting your data to existing collections. Come hear how you can do everything with the pipeline, including single-view, ETL, data roll-ups and materialized views.
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
Are you new to schema design for MongoDB, or are you looking for a more complete or agile process than what you are following currently? In this talk, we will guide you through the phases of a flexible methodology that you can apply to projects ranging from small to large with very demanding requirements.
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
MongoDB Atlas Data Lake is a new service offered by MongoDB Atlas. Many organizations store long term, archival data in cost-effective storage like S3, GCP, and Azure Blobs. However, many of them do not have robust systems or tools to effectively utilize large amounts of data to inform decision making. MongoDB Atlas Data Lake is a service allowing organizations to analyze their long-term data to discover a wealth of information about their business.
This session will take a deep dive into the features that are currently available in MongoDB Atlas Data Lake and how they are implemented. In addition, we'll discuss future plans and opportunities and offer ample Q&A time with the engineers on the project.
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
Virtual assistants are becoming the new norm when it comes to daily life, with Amazon’s Alexa being the leader in the space. As a developer, not only do you need to make web and mobile compliant applications, but you need to be able to support virtual assistants like Alexa. However, the process isn’t quite the same between the platforms.
How do you handle requests? Where do you store your data and work with it to create meaningful responses with little delay? How much of your code needs to change between platforms?
In this session we’ll see how to design and develop applications known as Skills for Amazon Alexa powered devices using the Go programming language and MongoDB.
MongoDB .local Paris 2020: Realm: the secret ingredient for better app...MongoDB
… to Core Data, loved by hundreds of thousands of developers. Learn what makes Realm special and how it can be used to build better applications faster.
MongoDB .local Paris 2020: Upply @MongoDB: Upply: When Machine Learning...MongoDB
It has never been easier to order online and get your delivery in under 48 hours, very often for free. This ease of use hides a complex market worth more than $8 trillion.
Data is well known in the supply chain world (routes, information about goods, customs, …), but the value of this operational data remains largely untapped. By combining domain expertise and data science, Upply is redefining the fundamentals of the supply chain by helping every player overcome the volatility and inefficiency of the market.
Reading the .explain() Output
1. OCTOBER 12, 2017 | BESPOKE | SAN FRANCISCO
Deciphering Explain Output
Charlie Swanson
2. # M D B l o c a l
Knowledge: Understand how MongoDB answers queries
Debugging: Figure out what's going on
Best Practices: Learn some tricks to optimize your queries & aggregations
GOALS OF THIS TALK
3. # M D B l o c a l
Charlie Swanson
SENIOR ENGINEER - QUERY TEAM
NYC
4. # M D B l o c a l
01 Motivation: Why do you care? What is explain?
02 "queryPlanner" Verbosity: Describing considered plans
03 "executionStats" Verbosity: More details about winning plan
04 "allPlansExecution" Verbosity: More details about plan selection
05 Beyond Queries: Log messages, The profile, Other commands
6. # M D B l o c a l
• Is your query using an index? Which one?
Target Questions
🤔
7. # M D B l o c a l
• Is your query using an index? Which one?
• Is your query using an index to provide the sort?
Target Questions
🤔
8. # M D B l o c a l
• Is your query using an index? Which one?
• Is your query using an index to provide the sort?
• How many of the examined documents ended up matching?
Target Questions
🤔
9. # M D B l o c a l
• Is your query using an index? Which one?
• Is your query using an index to provide the sort?
• How many of the examined documents ended up matching?
• Why did the server choose to answer the query the way it did?
Target Questions
🤔
10. # M D B l o c a l
• Is your query using an index? Which one?
• Is your query using an index to provide the sort?
• How many of the examined documents ended up matching?
• Why was your winning plan chosen?
• Can my queries go faster?
Target Questions
🤔
12. # M D B l o c a l
How Can The Server Answer This Query?
13. # M D B l o c a l
OPTION 1: Collection scan
TWITTER.TWEETS
[Diagram: grid of documents in the collection; two are marked ✅]
14. # M D B l o c a l
COLLECTION SCAN:
TWITTER.TWEETS
[Diagram: grid of documents; two are marked ✅ and passed to SORT]
SORT
01. 204587
02. 190587
15. # M D B l o c a l
COLLECTION SCAN: TWITTER.TWEETS
[Diagram: grid of documents; two are marked ✅]
SORT
01. (204587, '@TaylorSwift')
02. (190587, '@Charlie')
PROJECTION
01. {nFavorites: 204587, username: '@TaylorSwift'}
02. {nFavorites: 190587, username: '@Charlie'}
16. # M D B l o c a l
COLLECTION SCAN
[Diagram: grid of documents; two are marked ✅]
SORT
01. (204587, '@TaylorSwift')
02. (190587, '@Charlie')
PROJECTION
01. {nFavorites: 204587, username: '@TaylorSwift'}
02. {nFavorites: 190587, username: '@Charlie'}
17. # M D B l o c a l
OPTION 1: Collection scan
SORT
PROJECT
COLLECTION SCAN
18. # M D B l o c a l
OPTION 2: INDEX SCAN TWITTER.TWEETS
[Diagram: grid of documents in the collection]
INDEX: {nFavorites: -1}
204587, 190587, 87983, 83092, 76032, 29023, …
19. # M D B l o c a l
TWITTER.TWEETS
[Diagram: grid of documents in the collection]
INDEX: {nFavorites: -1}
204587, 190587, 87983, 83092, 76032, 29023, …
Stop when entry is smaller than 100,000
20. # M D B l o c a l
TWITTER.TWEETS
[Diagram: grid of documents in the collection]
INDEX: {nFavorites: -1}
204587, 190587, 87983, 83092, 76032, 29023, …
SORTED!
21. # M D B l o c a l
TWITTER.TWEETS
[Diagram: grid of documents in the collection]
INDEX
204587, 190587, 87983, 83092, 123, 123, …
INDEX SCAN
SORT
24. # M D B l o c a l
OPTION 2: INDEX scan
FETCH
PROJECT
INDEX SCAN
25. # M D B l o c a l
Many Ways to Answer a Query, Which Was It?
SORT
PROJECT
COLLECTION SCAN
FETCH
PROJECT
INDEX SCAN
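The two options above can be sketched as a toy in-memory model. This is only an illustration of the idea, not how the server is implemented: the sample documents, the helper names (collectionScanPlan, buildIndex, indexScanPlan), and the counters are all made up for this example.

```javascript
// Toy in-memory model of the two plans; not MongoDB's actual implementation.
// The documents below are hypothetical sample data.
const tweets = [
  { _id: 1, username: "@Charlie", nFavorites: 190587, text: "..." },
  { _id: 2, username: "@MongoDB", nFavorites: 87983, text: "..." },
  { _id: 3, username: "@TaylorSwift", nFavorites: 204587, text: "..." },
  { _id: 4, username: "@MongoDBEng", nFavorites: 83092, text: "..." },
];

// PROJECT: keep only the two requested fields.
const project = (doc) => ({ nFavorites: doc.nFavorites, username: doc.username });

// Option 1: COLLECTION SCAN -> SORT -> PROJECT.
function collectionScanPlan(docs, min) {
  let docsExamined = 0;
  const matches = [];
  for (const doc of docs) {
    docsExamined++; // every document is looked at
    if (doc.nFavorites >= min) matches.push(doc);
  }
  matches.sort((a, b) => b.nFavorites - a.nFavorites); // in-memory sort
  return { results: matches.map(project), docsExamined };
}

// Build a stand-in for an index on {nFavorites: -1}:
// keys in descending order, each pointing at a document position.
function buildIndex(docs) {
  return docs
    .map((doc, position) => ({ key: doc.nFavorites, position }))
    .sort((a, b) => b.key - a.key);
}

// Option 2: INDEX SCAN -> FETCH -> PROJECT.
function indexScanPlan(docs, index, min) {
  let keysExamined = 0;
  let docsExamined = 0;
  const results = [];
  for (const entry of index) {
    keysExamined++;
    if (entry.key < min) break; // stop at the first entry below the bound
    docsExamined++; // FETCH the document the index entry points at
    results.push(project(docs[entry.position]));
  }
  return { results, keysExamined, docsExamined };
}

const viaCollScan = collectionScanPlan(tweets, 100000);
const viaIndex = indexScanPlan(tweets, buildIndex(tweets), 100000);
console.log(viaCollScan.docsExamined, viaIndex.docsExamined); // prints: 4 2
```

Both plans return the same two documents in the same order. The index plan gets there without a sort step and without touching documents below the bound.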
26. # M D B l o c a l
1. Command to explain execution of various other commands
2. Helper on shell cursor object
What is Explain?
27. OCTOBER 12, 2017 | BESPOKE | SAN FRANCISCO
What is Explain? - Command
28. OCTOBER 12, 2017 | BESPOKE | SAN FRANCISCO
What is Explain? - Shell Helper
29. OCTOBER 12, 2017 | BESPOKE | SAN FRANCISCO
What is Explain? - Shell Helper
???
30. # M D B l o c a l
Query Plans
SORT
PROJECT
COLLECTION SCAN
FETCH
PROJECT
INDEX SCAN
…OR
FETCH
INDEX SCAN INDEX SCAN
31. # M D B l o c a l
01 Motivation: Why do you care? What is explain?
02 "queryPlanner" Verbosity: Describing considered plans
03 "executionStats" Verbosity: More details about winning plan
04 "allPlansExecution" Verbosity: More details about plan selection
05 Beyond Queries: Log messages, The profile, Other commands
32. # M D B l o c a l
> db.collection.find(…).explain("queryPlanner")
{
"queryPlanner" : {
…
"winningPlan" : {…},
"rejectedPlans" : […]
},
…
}
Explain Output
The "queryPlanner" argument is optional; this is the default.
33. # M D B l o c a l
> db.collection.find(…).explain()
{queryPlanner: {
  winningPlan: {
    stage: "SORT",
    inputStage: {
      stage: "FETCH",
      inputStage: {
        stage: "IXSCAN"
      }
    }
  }
}}
Explain Output
FETCH
INDEX SCAN
{nFavorites: -1}
SORT
34. # M D B l o c a l
> db.collection.find(…).explain()
{
  …
  stage: "IXSCAN",
  keyPattern: {nFavorites: -1},
  indexBounds: {
    nFavorites: [ "[inf.0, 100000]" ]
  },
  … // Other index scan
    // specific stats.
  …
}
Explain Output
FETCH
INDEX SCAN
keyPattern: {
nFavorites: -1
}
indexBounds: […]
…
SORT
35. # M D B l o c a l
> db.collection.find(…).explain()
{
"queryPlanner" : {
…
"winningPlan" : {
// Encodes selected plan.
},
"rejectedPlans" : […]
},
…
}
Explain Output
FETCH
INDEX SCAN
SORT
36. # M D B l o c a l
> db.collection.find(…).explain()
{"queryPlanner" : {
   "winningPlan" : {
     "stage" : "SORT",
     "inputStage" : {…}
   },
   "rejectedPlans" : [
     {"stage" : "SORT",
      "inputStage" : {…}
     },
     …
   ]
}}
Explain Output
FETCH
INDEX SCAN
SORT
COLL_SCAN
SORT
38. # M D B l o c a l
Is Your Query Using the Index You Expect?
39. # M D B l o c a l
✓ SORT / FETCH / INDEX SCAN (keyPattern: {nFollowers: -1})
✗ SORT / COLLECTION SCAN
Is Your Query Using the Index You Expect?
40. # M D B l o c a l
Is Your Query Using the Index You Expect?
41. # M D B l o c a l
db.tweets.explain().find(
{nFavorites: {$gte: 100000}},
{_id: 0, nFavorites: 1, username: 1})
.sort({nFavorites: -1})
Is Your Query Using the Index You Expect?
42. # M D B l o c a l
db.tweets.explain().find(
{nFavorites: {$gte: 100000}},
{_id: 0, nFavorites: 1, username: 1}).sort({nFavorites: -1})
{ "queryPlanner": {
"winningPlan": {
"stage": "PROJECTION",
"inputStage": {
"stage": "FETCH",
"inputStage": {
"stage": "IXSCAN",
"keyPattern": {"nFavorites": -1},
"indexBounds": {
"nFavorites": ["[inf.0, 100000.0]"]
} } } },
"rejectedPlans": [ ] } }
Is Your Query Using the Index You Expect?
43. # M D B l o c a l
Is Your Query Using an Index to Provide the Sort?
44. # M D B l o c a l
✓ FETCH / INDEX SCAN
✗ SORT / FETCH / INDEX SCAN
Is Your Query Using an Index to Provide the Sort?
45. # M D B l o c a l
Is Your Query Using an Index to Provide the Sort?
46. # M D B l o c a l
db.tweets.explain().find(
{nFavorites: {$gte: 100000}},
{_id: 0, nFavorites: 1, username: 1}).sort({nFavorites: -1})
{ "queryPlanner": {
"winningPlan": {
"stage": "PROJECTION",
"inputStage": {
"stage": "FETCH",
"inputStage": {
"stage": "IXSCAN",
"keyPattern": {"nFavorites": -1},
"indexBounds": {
"nFavorites": ["[inf.0, 100000.0]"]
} } } },
"rejectedPlans": [ ] } }
NO SORT STAGE ✅
Is Your Query Using an Index to Provide the Sort?
47. # M D B l o c a l
SORT_MERGE is OK
✓SORT_MERGE
INDEX SCAN
FETCH
INDEX SCAN
48. # M D B l o c a l
BONUS: Is Your Query Using an Index to Provide the Projection?
49. # M D B l o c a l
Compound Index
[Diagram: grid of documents in the collection]
INDEX: {username: 1, nFavorites: -1}
"@Charlie" 29023
"@MongoDB" 87983
"@MongoDB" 60587
"@MongoDB" 7983
"@MongoDBEng" 83092
"@MongoDBEng" 76032
… …
50. # M D B l o c a l
db.tweets.find({
  username: {$in: ["@MongoDBEng", "@MongoDB"]},
  nFavorites: {$gt: 50000}})
Compound Index
TWITTER.TWEETS
[Diagram: grid of documents in the collection]
INDEX: {username: 1, nFavorites: -1}
"@Charlie" 29023
"@MongoDB" 87983
"@MongoDB" 60587
"@MongoDB" 7983
"@MongoDBEng" 83092
"@MongoDBEng" 76032
… …
51. # M D B l o c a l
db.tweets.find(
  {username: {$in: ["@MongoDBEng", "@MongoDB"]}, nFavorites: {$gt: 50000}},
  {_id: 0, username: 1, nFavorites: 1})
Compound Index
TWITTER.TWEETS
[Diagram: grid of documents in the collection]
INDEX: {username: 1, nFavorites: -1}
"@Charlie" 29023
"@MongoDB" 87983
"@MongoDB" 60587
"@MongoDB" 7983
"@MongoDBEng" 83092
"@MongoDBEng" 76032
… …
56. # M D B l o c a l
✓ PROJECTION / INDEX SCAN
✗ PROJECTION / FETCH / INDEX SCAN
Is your query using an index to provide the projection?
57. # M D B l o c a l
• Is your query using an index? Which one?
• Is your query using an index to provide the sort?
• Is your query using an index to provide the projection?
The Power of "QueryPlanner"
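The three questions above can be answered mechanically by walking the winningPlan tree. The sketch below is one way to do that; collectStages and summarizePlan are hypothetical helper names, not shell built-ins, and the sketch assumes the unsharded output shape shown on the earlier slides (newer server versions may use other stage names for projections).

```javascript
// Collect every stage of a plan tree. Stages with one child use "inputStage";
// stages with several children (for example SORT_MERGE) use "inputStages".
function collectStages(stage, out = []) {
  if (!stage) return out;
  out.push(stage);
  if (stage.inputStage) collectStages(stage.inputStage, out);
  for (const child of stage.inputStages || []) collectStages(child, out);
  return out;
}

// summarizePlan is a hypothetical helper, not part of the shell.
function summarizePlan(explainOutput) {
  const stages = collectStages(explainOutput.queryPlanner.winningPlan);
  const names = stages.map((s) => s.stage);
  const indexScans = stages.filter((s) => s.stage === "IXSCAN");
  return {
    // Is the query using an index? Which one?
    indexesUsed: indexScans.map((s) => s.keyPattern),
    usesCollectionScan: names.includes("COLLSCAN"),
    // A SORT stage means the index did not provide the sort (SORT_MERGE is fine).
    hasSortStage: names.includes("SORT"),
    // No FETCH above an index scan means the index also provided the projection.
    coveredByIndex: indexScans.length > 0 && !names.includes("FETCH"),
  };
}

// The explain output from the nFavorites example earlier in the deck.
const explainOutput = {
  queryPlanner: {
    winningPlan: {
      stage: "PROJECTION",
      inputStage: {
        stage: "FETCH",
        inputStage: {
          stage: "IXSCAN",
          keyPattern: { nFavorites: -1 },
          indexBounds: { nFavorites: ["[inf.0, 100000.0]"] },
        },
      },
    },
    rejectedPlans: [],
  },
};

const summary = summarizePlan(explainOutput);
console.log(JSON.stringify(summary));
```

In the shell, the same helper could be fed the document returned by db.tweets.find(…).sort(…).explain().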
58. # M D B l o c a l
db.tweets.explain().find({
createdDate: {$gte: <today>},
favorites: "@eliothorowitz"
})
Next Up: "It's using an index, so what's taking so long?"
FETCH
INDEX SCAN
keyPattern: {createdDate: 1}
59. # M D B l o c a l
db.tweets.explain("executionStats").find({
createdDate: {$gte: <today>},
favorites: "@eliothorowitz"
})
It's using an index, so what's taking so long?
FETCH
INDEX SCAN
keyPattern: {createdDate: 1}
12:02
12:03
12:04
…
INDEX SCAN
INDEX:
{createdDate: 1}
60. # M D B l o c a l
db.tweets.explain("executionStats").find({
createdDate: {$gte: <today>},
favorites: "@eliothorowitz"
})
{createdDate: 12:02,
favorites: [
"@MongoDB",
"@taylorswift"
]}
FETCH
FETCH
filter: {favorites: "@eliothorowitz"}
INDEX SCAN
keyPattern: {createdDate: 1}
It's using an index, so what's taking so long?
12:02
12:03
12:04
…
INDEX:
{createdDate: 1}
61. # M D B l o c a l
db.tweets.explain("executionStats").find({
createdDate: {$gte: <today>},
favorites: "@eliothorowitz"
})
{createdDate: 12:02,
favorites: [
"@MongoDB",
"@taylorswift"
]}
FETCH
FETCH
filter: {
favorites: "@eliothorowitz"
}
INDEX SCAN
keyPattern: {createdDate: 1}
❌
It's using an index, so what's taking so long?
12:02
12:03
12:04
…
FETCH
filter: {favorites: "@eliothorowitz"}
INDEX SCAN
keyPattern: {createdDate: 1}
INDEX:
{createdDate: 1}
62. # M D B l o c a l
db.tweets.explain("executionStats").find({
createdDate: {$gte: <today>},
favorites: "@eliothorowitz"
})
{createdDate: 12:03,
favorites: [
"@eliothorowitz",
"@taylorswift"
]}
FETCH
It's using an index, so what's taking so long?
12:02
12:03
12:04
…
FETCH
filter: {favorites: "@eliothorowitz"}
INDEX SCAN
keyPattern: {createdDate: 1}
INDEX:
{createdDate: 1}
63. # M D B l o c a l
• What percentage of the index keys in the scanned range ended up matching the predicate?
• What's the selectivity?
So how many of them were thrown out?
64. # M D B l o c a l
01 Motivation: Why do you care? What is explain?
02 "queryPlanner" Verbosity: Describing considered plans
03 "executionStats" Verbosity: More details about winning plan
04 "allPlansExecution" Verbosity: More details about plan selection
05 Beyond Queries: Log messages, The profile, Other commands
65. # M D B l o c a l
> db.collection.find(…).explain()
{"queryPlanner" : {
   "winningPlan" : {
     "stage" : "SORT",
     "inputStage" : {…}
   },
   "rejectedPlans" : [
     {"stage" : "SORT",
      "inputStage" : {…}
     },
     …
   ]
}}
Explain Mode: "queryPlanner"
FETCH
INDEX SCAN
(a, b)
SORT
FETCH
INDEX SCAN (a)
SORT
66. # M D B l o c a l
• "executionStats"
Explain Mode: "ExecutionStats"
FETCH
INDEX SCAN (a)
SORT
created by Mike Ashley from Noun Project
created by Creative Stall from Noun Project
67. # M D B l o c a l
> db.tweets.find(…).explain("executionStats")
{
"queryPlanner" : {
…
"winningPlan" : {…},
"rejectedPlans" : […]
},
"executionStats": { // New!
…,
"executionStages": {…}
}
…
}
Explain Mode: "executionStats"
68. # M D B l o c a l
> db.tweets.find(…).explain("executionStats")
{
"queryPlanner" : { /* Same as before. */ },
"executionStats": {
// Top-level stats.
"executionStages": {
stage: "SORT",
// Sort stats.
inputStage: {
// etc, etc.
}
}
}
…
}
Details: "ExecutionStats"
FETCH
SORT
INDEX SCAN
keyPattern: {hashtags: 1}
69. # M D B l o c a l
> db.tweets.find(…).explain("executionStats")
{
…,
"executionStats" : {
// Top-level stats.
"nReturned" : 390000,
"executionTimeMillis" : 4431,
"totalKeysExamined" : 390000,
"totalDocsExamined" : 390000,
"executionStages" : {…}
},
}
Details: "ExecutionStats"
FETCH
SORT
INDEX SCAN
keyPattern: {hashtags: 1}
70. # M D B l o c a l
db.tweets.find(…).explain("executionStats")
{
"executionStats" : {
// Top-level stats.
"executionStages" : {
"stage" : "SORT",
"nReturned" : 390000,
"executionTimeMillisEstimate" : 2030,
…
"sortPattern" : { "nFollowers" : 1 },
"memUsage" : 20280000,
"memLimit" : 33554432,
"inputStage" : {…}
}
}
}
Details: "ExecutionStats"
FETCH
SORT
INDEX SCAN
keyPattern: {hashtags: 1}
71. # M D B l o c a l
db.tweets.find(…).explain("executionStats")
{
"executionStats" : {
// Top-level stats.
"executionStages" : {
"stage" : "SORT",
"nReturned" : 390000,
"executionTimeMillisEstimate" : 2030,
…
"sortPattern" : { "nFollowers" : -1 },
"memUsage" : 20280000,
"memLimit" : 33554432,
"inputStage" : {…}
}
}
}
Details: "ExecutionStats"
FETCH
SORT
INDEX SCAN
keyPattern: {hashtags: 1}
72. # M D B l o c a l
db.tweets.find(…).explain("executionStats")
{
"executionStats" : {
// Top-level stats.
"executionStages" : {
"stage" : "SORT",
"nReturned" : 390000,
"executionTimeMillisEstimate" : 2030,
"works" : 780003,
"advanced" : 390000,
"needTime" : 390002,
"isEOF" : 1,
"sortPattern" : { "b" : 1 },
…
"inputStage" : {…}
}
}
}
?
Details: "ExecutionStats"
FETCH
SORT
INDEX SCAN
keyPattern: {hashtags: 1}
73. # M D B l o c a l
• These are all PlanStages
• SortStage
• FetchStage
• IndexScanStage
Execution Stats: works, advanced, etc.
FETCH
SORT
INDEX SCAN
keyPattern: {hashtags: 1}
74. # M D B l o c a l
• These are all PlanStages
• Each PlanStage implements work()
Execution Stats: works, advanced, etc.
FETCH
SORT
INDEX SCAN
keyPattern: {hashtags: 1}
75. # M D B l o c a l
• These are all PlanStages
• Each PlanStage implements work(),
returns one of:
• ADVANCED
• NEED_TIME
• IS_EOF
Execution Stats: works, advanced, etc.
FETCH
SORT
INDEX SCAN
keyPattern: {hashtags: 1}
76. # M D B l o c a l
EXECUTION STATS: WORKS, ADVANCED, ETC.
FETCH
SORT
INDEX SCAN
keyPattern: {hashtags: 1}
work()
77. # M D B l o c a l
EXECUTION STATS: WORKS, ADVANCED, ETC.
work()
FETCH
SORT
INDEX SCAN
keyPattern: {hashtags: 1}
work()
78. # M D B l o c a l
EXECUTION STATS: WORKS, ADVANCED, ETC.
work()
work()
FETCH
SORT
INDEX SCAN
keyPattern: {hashtags: 1}
work()
79. # M D B l o c a l
EXECUTION STATS: WORKS, ADVANCED, ETC.
work()
ADVANCED ID
FETCH
SORT
INDEX SCAN
keyPattern: {hashtags: 1}
work()
80. # M D B l o c a l
EXECUTION STATS: WORKS, ADVANCED, ETC.
ADVANCED {…}
ADVANCED
FETCH
SORT
INDEX SCAN
keyPattern: {hashtags: 1}
work()
81. # M D B l o c a l
SORT
EXECUTION STATS: WORKS, ADVANCED, ETC.
NEED_TIME
{
…
}
ADVANCED
ADVANCED
FETCH
INDEX SCAN
keyPattern: {hashtags: 1}
work()
82. # M D B l o c a l
db.tweets.find(…).explain("executionStats")
{
"executionStats" : {
// Top-level stats.
"executionStages" : {
"stage" : "SORT",
"nReturned" : 390000,
"executionTimeMillisEstimate" : 2030,
"works" : 780003,
"advanced" : 390000,
"needTime" : 390002,
"isEOF" : 1,
"sortPattern" : { "b" : 1 },
…
"inputStage" : {…}
}
}
}
Details: "executionStats"
FETCH
SORT
INDEX SCAN
keyPattern: {hashtags: 1}
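The work() walkthrough above can be mimicked with a toy model. This is a sketch under assumptions, not the server's code: ScanStage, SortStage, and runPlan are illustrative names, and the exact number of NEED_TIME results a real SORT stage reports may differ. What it does show is why a blocking stage reports roughly one NEED_TIME per input document, and why works = advanced + needTime + isEOF (the stats above satisfy this too: 390000 + 390002 + 1 = 780003).

```javascript
// Toy model of the work() protocol; a sketch, not the server's implementation.
const ADVANCED = "ADVANCED";
const NEED_TIME = "NEED_TIME";
const IS_EOF = "IS_EOF";

const newStats = () => ({ works: 0, advanced: 0, needTime: 0, isEOF: 0 });

// Leaf stage: hands back one item per work() call, like a scan.
class ScanStage {
  constructor(items) {
    this.items = items;
    this.pos = 0;
    this.stats = newStats();
  }
  work() {
    this.stats.works++;
    if (this.pos >= this.items.length) {
      this.stats.isEOF = 1;
      return { state: IS_EOF };
    }
    this.stats.advanced++;
    return { state: ADVANCED, value: this.items[this.pos++] };
  }
}

// Blocking stage: must consume its whole input before it can return anything.
class SortStage {
  constructor(child, compare) {
    this.child = child;
    this.compare = compare;
    this.buffer = [];
    this.sorted = false;
    this.pos = 0;
    this.stats = newStats();
  }
  work() {
    this.stats.works++;
    if (!this.sorted) {
      const result = this.child.work();
      if (result.state === ADVANCED) this.buffer.push(result.value);
      if (result.state === IS_EOF) {
        this.buffer.sort(this.compare);
        this.sorted = true;
      }
      this.stats.needTime++; // nothing to hand to the parent yet
      return { state: NEED_TIME };
    }
    if (this.pos < this.buffer.length) {
      this.stats.advanced++;
      return { state: ADVANCED, value: this.buffer[this.pos++] };
    }
    this.stats.isEOF = 1;
    return { state: IS_EOF };
  }
}

// Keep calling work() on the root until it reports IS_EOF.
function runPlan(root) {
  const results = [];
  for (;;) {
    const result = root.work();
    if (result.state === IS_EOF) return results;
    if (result.state === ADVANCED) results.push(result.value);
  }
}

const scan = new ScanStage([3, 1, 2]);
const sort = new SortStage(scan, (a, b) => a - b);
const sorted = runPlan(sort);
console.log(sorted, sort.stats);
```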
83. # M D B l o c a l
> db.tweets.find(…).explain("executionStats")
{"executionStats": {
"executionStages": {
stage: "SORT",
// Sort stats, includes "works", "advanced", …
inputStage: {
stage: "FETCH",
// Fetch stats, includes "works", "advanced", …
inputStage: {
// etc, etc.
}
}
}
}
…
}
Details: "executionStats"
FETCH
SORT
INDEX SCAN
keyPattern: {hashtags: 1}
84. # M D B l o c a l
> db.tweets.find(…).explain("executionStats")
{
"queryPlanner" : {
…
"winningPlan" : {…},
"rejectedPlans" : […]
},
"executionStats": { // New!
…,
"executionStages": {…}
}
…
}
Explain Mode: "executionStats"
86. # M D B l o c a l
How selective is your index?
87. # M D B l o c a l
db.tweets.explain("executionStats").find({
createdDate: {$gte: <today>},
favorites: "@eliothorowitz"
})
How selective is your index?
12:02
12:03
12:04
12:06
…
INDEX SCAN
FETCH
filter: {favorites: "@eliothorowitz"}
INDEX SCAN
keyPattern: {createdDate: 1}
88. # M D B l o c a l
db.tweets.explain("executionStats").find({
createdDate: {$gte: <today>},
favorites: "@eliothorowitz"
})
{
  "executionStats" : {
    "nReturned" : 314,
    "totalKeysExamined" : 2704, // < 12% matched
    …
  }
}
How selective is your index?
FETCH
filter: {favorites: "@eliothorowitz"}
INDEX SCAN
keyPattern: {createdDate: 1}
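The "< 12% matched" figure is just nReturned divided by totalKeysExamined. A minimal sketch of that arithmetic follows; indexSelectivity is a hypothetical helper name, and the input is the top-level executionStats document from the slide above.

```javascript
// indexSelectivity is a hypothetical helper: it compares how many results
// came back with how many index keys and documents had to be examined.
function indexSelectivity(executionStats) {
  const { nReturned, totalKeysExamined, totalDocsExamined } = executionStats;
  return {
    returnedPerKey: totalKeysExamined > 0 ? nReturned / totalKeysExamined : null,
    returnedPerDoc: totalDocsExamined > 0 ? nReturned / totalDocsExamined : null,
  };
}

// Numbers from the slide above.
const stats = { nReturned: 314, totalKeysExamined: 2704 };
const ratio = indexSelectivity(stats).returnedPerKey;
console.log((ratio * 100).toFixed(1) + "% of examined keys matched"); // prints: 11.6% of examined keys matched
```

A ratio close to 1 means nearly every key the scan touched ended up in the result. A low ratio, as here, means most of the scanned range was thrown away by the filter.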
89. # M D B l o c a l
What's the most expensive part of your plan?
90. # M D B l o c a l
db.tweets.explain("executionStats").find({
createdDate: {$gte: <today>},
favorites: "@eliothorowitz"
})
What's the most expensive part of your plan?
FETCH
filter: {favorites: "@eliothorowitz"}
INDEX SCAN
keyPattern: {createdDate: 1}
91. # M D B l o c a l
db.tweets.explain("executionStats").find(…)
What's the most expensive part of your plan?
FETCH
executionTimeMillisEstimate: 431
INDEX SCAN
executionTimeMillisEstimate: 67
92. # M D B l o c a l
db.tweets.explain("executionStats").find(…)
What's the most expensive part of your plan?
FETCH
works: 2705
advanced: 314
needTime: 2391
// 314/2705 ≈ 12%
INDEX SCAN
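To find the expensive stage, it can help to subtract each stage's input time from its own estimate. The sketch below assumes that a stage's executionTimeMillisEstimate includes the time spent in its input stages; that is an assumption worth checking against your server version. stageSelfTimes is a hypothetical helper name.

```javascript
// Walk executionStages and estimate the time spent in each stage itself.
// Assumption: a stage's executionTimeMillisEstimate includes its inputs' time.
function stageSelfTimes(stage, out = []) {
  const children = [];
  if (stage.inputStage) children.push(stage.inputStage);
  for (const child of stage.inputStages || []) children.push(child);
  const childTotal = children.reduce(
    (sum, c) => sum + (c.executionTimeMillisEstimate || 0),
    0
  );
  out.push({
    stage: stage.stage,
    selfMillis: (stage.executionTimeMillisEstimate || 0) - childTotal,
  });
  for (const c of children) stageSelfTimes(c, out);
  return out;
}

// Estimates from the slide above.
const executionStages = {
  stage: "FETCH",
  executionTimeMillisEstimate: 431,
  inputStage: { stage: "IXSCAN", executionTimeMillisEstimate: 67 },
};

const ranked = stageSelfTimes(executionStages).sort(
  (a, b) => b.selfMillis - a.selfMillis
);
console.log(ranked);
```

Under that assumption, FETCH accounts for 364 of the 431 ms, which fits the picture of many documents being fetched only to be filtered out.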
93. # M D B l o c a l
• "queryPlanner"
• Is your query using the index you expect?
• Is your query using an index to provide the sort?
• Is your query using an index to provide the projection?
• "executionStats"
• How selective is your index?
• Which part of your plan is the most expensive?
Our Progress
94. # M D B l o c a l
db.tweets.explain("executionStats").find({
createdDate: {$gte: <today>},
favorites: "@eliothorowitz"
})
• We had an index on {favorites: 1}. Would that have been faster?
Next up: "Why was this plan chosen?"
🤔
95. # M D B l o c a l
01 Motivation: Why do you care? What is explain?
02 "queryPlanner" Verbosity: Describing considered plans
03 "executionStats" Verbosity: More details about winning plan
04 "allPlansExecution" Verbosity: More details about plan selection
05 Beyond Queries: Log messages, The profile, Other commands
96. # M D B l o c a l
> db.collection.find(…).explain()
{"queryPlanner" : {
   "winningPlan" : {
     "stage" : "SORT",
     "inputStage" : {…}
   },
   "rejectedPlans" : [
     {"stage" : "SORT",
      "inputStage" : {…}
     },
     …
   ]
}}
Explain Mode: "QueryPlanner"
❓
FETCH
INDEX SCAN
(a, b)
SORT
FETCH
INDEX SCAN (a)
SORT
97. # M D B l o c a l
Query Planning
FETCH
INDEX SCAN (a, b)
SORT
FETCH
INDEX SCAN (a)
SORT
FETCH
INDEX SCAN (a, b)
98. # M D B l o c a l
• MultiPlanStage::pickBestPlan()
Query Planning
MATCH
COLL SCAN …
MATCH
COLL SCAN
COLL SCAN
FETCH
IX SCAN (ts)
MultiPlanStage
work()
work()
work()
99. # M D B l o c a l
• MultiPlanStage::pickBestPlan()
Query Planning
ADVANCED
NEED_TIME
ADVANCED
MATCH
COLL SCAN …
MATCH
COLL SCAN
COLL SCAN
FETCH
IX SCAN (ts)
MultiPlanStage
100. # M D B l o c a l
• MultiPlanStage::pickBestPlan()
Query Planning
Advances: 78 22 50
MATCH
COLL SCAN …
MATCH
COLL SCAN
COLL SCAN
FETCH
IX SCAN (ts)
MultiPlanStage
102. # M D B l o c a l
> db.tweets.find(…).explain("allPlansExecution")
{
"queryPlanner" : {
…
"winningPlan" : {…},
"rejectedPlans" : […]
},
"executionStats": {
…,
"executionStages": {…},
"allPlansExecution": […] // New!
}
…
}
Explain Mode: "allPlansExecution"
103. # M D B l o c a l
> db.collection.find(…).explain()
{"queryPlanner" : {
   "winningPlan" : {
     "stage" : "SORT",
     "inputStage" : {…}
   },
   "rejectedPlans" : [
     {"stage" : "SORT",
      "inputStage" : {…}
     },
     …
   ]
}}
Verbosity: "queryPlanner"
FETCH
INDEX SCAN
SORT
COLL_SCAN
SORT
104. # M D B l o c a l
> db.collection.find(…).explain("executionStats")
{"queryPlanner" : {
"winningPlan" : {…},
"rejectedPlans" : [
{…},
…
]
},
"executionStats": {
executionStages: {…}
}}
Verbosity: "executionStats"
FETCH
INDEX SCAN
SORT
SORT
FETCH
INDEX SCAN
SORT
106. # M D B l o c a l
> db.collection.find(…).explain("allPlansExecution")
{"queryPlanner" : {
   "winningPlan" : {…},
   "rejectedPlans" : [
     {…},
     …
   ]
 },
 "executionStats": {
   executionStages: {…},
   allPlansExecution: [
     {…},
     {…},
     …
   ]
}}
Verbosity: "allPlansExecution"
[Diagram: several candidate plan trees built from SORT, FETCH, and INDEX SCAN stages]
107. # M D B l o c a l
db.tweets.explain("allPlansExecution").find({
createdDate: {$gte: <today>},
favorites: "@eliothorowitz"
})
{
"executionStats": {
"allPlansExecution": [
{nReturned: 34,
executionStages: { /* Index Scan on "favorites" */ }
},
{nReturned: 101,
executionStages: { /* Index Scan on "createdDate" */ }
}
]
}
…
}
Explain Mode: "allPlansExecution"
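One way to read allPlansExecution is to compare how much each candidate produced for the work it did during the trial. The sketch below ranks candidates by nReturned / works. rankTrialPlans is a hypothetical helper, the works values in the sample are made up for illustration, and the server's real plan ranking takes more into account than this single ratio.

```javascript
// rankTrialPlans is a hypothetical helper. "Productivity" here is simply
// nReturned / works; the server's real ranking uses more than this one ratio.
function rankTrialPlans(allPlansExecution) {
  return allPlansExecution
    .map((plan, planIndex) => {
      const works = plan.executionStages.works;
      return {
        planIndex,
        nReturned: plan.nReturned,
        works,
        productivity: works > 0 ? plan.nReturned / works : 0,
      };
    })
    .sort((a, b) => b.productivity - a.productivity);
}

// nReturned values from the slide above; the works values are hypothetical.
const trial = [
  { nReturned: 34, executionStages: { stage: "FETCH", works: 101 } },  // index scan on "favorites"
  { nReturned: 101, executionStages: { stage: "FETCH", works: 101 } }, // index scan on "createdDate"
];

const rankedPlans = rankTrialPlans(trial);
console.log(rankedPlans[0].planIndex); // prints: 1
```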
108. # M D B l o c a l
01 Motivation: Why do you care? What is explain?
02 "queryPlanner" Verbosity: Describing considered plans
03 "executionStats" Verbosity: More details about winning plan
04 "allPlansExecution" Verbosity: More details about plan selection
05 Beyond Queries: Log messages, The profile, Other commands
109. # M D B l o c a l
• Queries with response time >100ms (server side) are logged:
Slow Queries
2017-05-25T10:01:58.917-0400 I COMMAND [conn7] command twitter.tweets appName: "MongoDB Shell" command: find { find: "tweets", filter: { nFavorites: { $gte: 10000.0 } }, limit: 20.0, singleBatch: false, sort: { nFavorites: -1.0, username: 1.0 }, projection: { _id: 0.0, nFavorites: 1.0, username: 1.0 } } planSummary: IXSCAN { nFavorites: -1 } keysExamined:359907 docsExamined:359907 hasSortStage:1 cursorExhausted:1 numYields:2871 nreturned:20 reslen:1087 locks:{ Global: { acquireCount: { r: 5744 } }, Database: { acquireCount: { r: 2872 } }, Collection: { acquireCount: { r: 2872 } } } protocol:op_command 1493ms
113. # M D B l o c a l
• Queries with response time >100ms (server side) are logged.
• The threshold is configurable via the profiler's 'slowms' parameter
Slow Queries
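The counters in the slow-query log line can be pulled out mechanically. The following is a minimal sketch, not part of the talk: the helper name parseSlowQueryCounters is hypothetical, and the sample string is a shortened copy of the log line from the slides (locks section omitted).

```javascript
// Hypothetical helper: extract the "name:value" counters from a slow-query log line.
function parseSlowQueryCounters(logLine) {
  const counters = {};
  for (const name of ["keysExamined", "docsExamined", "nreturned"]) {
    const match = logLine.match(new RegExp(name + ":(\\d+)"));
    if (match) counters[name] = Number(match[1]);
  }
  // The duration is the trailing "<n>ms" token at the end of the line.
  const duration = logLine.match(/(\d+)ms\s*$/);
  if (duration) counters.millis = Number(duration[1]);
  return counters;
}

// Shortened copy of the log line shown on the slides.
const logLine =
  "planSummary: IXSCAN { nFavorites: -1 } keysExamined:359907 " +
  "docsExamined:359907 hasSortStage:1 cursorExhausted:1 numYields:2871 " +
  "nreturned:20 reslen:1087 protocol:op_command 1493ms";

const counters = parseSlowQueryCounters(logLine);
// Documents examined per document returned: a large value hints at wasted work.
const docsPerResult = counters.docsExamined / counters.nreturned;
console.log(counters, docsPerResult);
```

Here the server examined 359907 documents to return 20, which is the kind of mismatch worth following up with explain.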
114. # M D B l o c a l
• If turned on, queries show up in the system.profile collection
The Profile
{ "op": "query",
  "ns": "twitter.tweets",
  "query": { "find": "tweets", "filter": { … }, "limit": 20, "sort": { … }, "projection": { … } },
  "millis": 1355,
  "planSummary": "IXSCAN { nFavorites: -1 }",
  "execStats": {
    "stage": "PROJECTION",
    "inputStage": {
      "stage": "SORT",
      "inputStage": {
        "stage": "FETCH",
        "inputStage": {
          "stage": "IXSCAN"
        } } } } }
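The nested execStats tree reads from the inside out: the innermost stage runs first and feeds the stage that wraps it. A small sketch of flattening that nesting follows; the function name stageChain is hypothetical, and it only follows a single inputStage link (plans with several children under inputStages are not handled).

```javascript
// Hypothetical helper: list the stages of a plan from the outermost stage inward.
// Only the single-child "inputStage" link is followed.
function stageChain(execStats) {
  const stages = [];
  for (let node = execStats; node; node = node.inputStage) {
    stages.push(node.stage);
  }
  return stages;
}

// The profile entry from the slide, with the elided fields left out.
const profileEntry = {
  op: "query",
  ns: "twitter.tweets",
  millis: 1355,
  planSummary: "IXSCAN { nFavorites: -1 }",
  execStats: {
    stage: "PROJECTION",
    inputStage: {
      stage: "SORT",
      inputStage: {
        stage: "FETCH",
        inputStage: { stage: "IXSCAN" },
      },
    },
  },
};

const chain = stageChain(profileEntry.execStats);
console.log(chain.join(" <- ")); // PROJECTION <- SORT <- FETCH <- IXSCAN
```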
115. # M D B l o c a l
db.runCommand({
explain: {/* command */},
verbosity: <queryPlanner|executionStats|allPlansExecution>
})
Other Commands
116. # M D B l o c a l
db.runCommand({
explain: {findAndModify: {…}},
verbosity: <queryPlanner|executionStats|allPlansExecution>
})
Other Commands
117. # M D B l o c a l
db.runCommand({
explain: {update: {…}},
verbosity: <queryPlanner|executionStats|allPlansExecution>
})
Other Commands
118. # M D B l o c a l
db.runCommand({
explain: {aggregate: {…}},
verbosity: <queryPlanner|executionStats|allPlansExecution>
})
Other Commands
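The same wrapper shape applies to every command on these slides, so it can be built by one function. This is a sketch under stated assumptions: buildExplainCommand is a hypothetical name, and the find command passed in is a trimmed version of the tweets query from the earlier log line.

```javascript
// The three verbosity modes named on the slides.
const VERBOSITIES = ["queryPlanner", "executionStats", "allPlansExecution"];

// Hypothetical helper: wrap any command document in an explain command.
function buildExplainCommand(command, verbosity) {
  if (!VERBOSITIES.includes(verbosity)) {
    throw new Error("unknown explain verbosity: " + verbosity);
  }
  return { explain: command, verbosity: verbosity };
}

const explainCmd = buildExplainCommand(
  {
    find: "tweets",
    filter: { nFavorites: { $gte: 10000 } },
    sort: { nFavorites: -1, username: 1 },
    limit: 20,
  },
  "executionStats"
);
console.log(JSON.stringify(explainCmd, null, 2));
// In the mongo shell, this document would be passed to db.runCommand(explainCmd).
```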
119. # M D B l o c a l
• Aggregation is special…
Aggregation
db.runCommand({
aggregate: "collection",
pipeline: […],
explain: <true|false>
})
120. # M D B l o c a l
• Aggregation was special…
Aggregation
• 3.4 and earlier:
db.runCommand({
aggregate: "collection",
pipeline: […],
explain: <true|false>
})
• 3.6 and beyond:
db.runCommand({
explain: {
aggregate: "collection",
pipeline: […]
},
verbosity: "…"
})
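A sketch of choosing between the two command shapes by server version. The function name buildAggregateExplain and the version-string parsing are assumptions for illustration; the sketch reproduces only the two shapes shown on the slide and omits any other fields a given server version may require.

```javascript
// Hypothetical helper: pick the aggregation explain shape for a server version.
function buildAggregateExplain(collection, pipeline, serverVersion, verbosity) {
  const [major, minor] = serverVersion.split(".").map(Number);
  const hasExplainCommand = major > 3 || (major === 3 && minor >= 6);
  if (hasExplainCommand) {
    // 3.6 and beyond: aggregate is wrapped like every other explainable command.
    return {
      explain: { aggregate: collection, pipeline: pipeline },
      verbosity: verbosity,
    };
  }
  // 3.4 and earlier: explain is a boolean flag on the aggregate command itself.
  return { aggregate: collection, pipeline: pipeline, explain: true };
}

const pipeline = [{ $match: { nFavorites: { $gte: 10000 } } }];
const oldStyle = buildAggregateExplain("tweets", pipeline, "3.4", "executionStats");
const newStyle = buildAggregateExplain("tweets", pipeline, "3.6", "executionStats");
console.log(JSON.stringify(oldStyle), JSON.stringify(newStyle));
```

Note that the older shape has no place for a verbosity, so the requested value is dropped there.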
121. # M D B l o c a l
db.collection.explain().aggregate([{$group: {…}}, {$project: {…}}])
{
stages: [
{$cursor: {…}},
{$group: {…}},
{$project: {…}},
]
}
Aggregation
122. # M D B l o c a l
db.collection.explain().aggregate([{$group: {…}}, {$project: {…}}])
{
stages: [
{$cursor: {
query: {…},
fields: {…},
queryPlanner: {/* same as query explain! */},
executionStats: {/* 3.6+ only, same as query explain! */}
}},
{$group: {…}},
{$project: {…}} ] }
Aggregation
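Because the $cursor stage carries the ordinary query explain, a script can pull it out and inspect it like any other plan. A minimal sketch follows; findCursorStage is a hypothetical name, and the sample output is a made-up placeholder in the shape shown on the slide (the slide elides the real contents).

```javascript
// Hypothetical helper: return the body of the $cursor stage, or null if absent.
function findCursorStage(aggExplain) {
  const stages = aggExplain.stages || [];
  const entry = stages.find((stage) =>
    Object.prototype.hasOwnProperty.call(stage, "$cursor")
  );
  return entry ? entry.$cursor : null;
}

// Placeholder explain output; the values are illustrative, not real server output.
const sampleExplain = {
  stages: [
    {
      $cursor: {
        query: {},
        fields: {},
        queryPlanner: { winningPlan: { stage: "COLLSCAN" } },
      },
    },
    { $group: {} },
    { $project: {} },
  ],
};

const cursorStage = findCursorStage(sampleExplain);
console.log(cursorStage.queryPlanner.winningPlan.stage);
```

Per the slide, an executionStats section would appear next to queryPlanner only on 3.6 and later.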
123. # M D B l o c a l
SUMMARY
01 Motivation: Why do you care? What is explain?
02 "queryPlanner" Verbosity: Describing considered plans
03 "executionStats" Verbosity: More details about winning plan
04 "allPlansExecution" Verbosity: More details about plan selection
05 Beyond Queries: Log messages, The profile, Other commands
129. # M D B l o c a l
02: explain won't solve all your problems
created by Mike Ashley from Noun Project
Your Application MongoDB
130. # M D B l o c a l
Can I see all the tweets with hashtag "#MDBlocal" from this hour?
02: explain won't solve all your problems
Your Application MongoDB
131. # M D B l o c a l
Hmm… Let me think about that…
02: explain won't solve all your problems
Your Application MongoDB
132. # M D B l o c a l
Ah! Here are your results!
02: explain won't solve all your problems
Your Application MongoDB
133. # M D B l o c a l
02: explain won't solve all your problems
What took you so long?!
Your Application MongoDB
134. # M D B l o c a l
- network latency
02: explain won't solve all your problems
What took you so long?!
Your Application MongoDB
135. # M D B l o c a l
- network latency
- large result set
02: explain won't solve all your problems
What took you so long?!
Your Application MongoDB
136. # M D B l o c a l
02: explain won't solve all your problems
Your Application MongoDB
- network latency
- large result set
- server contention
What took you so long?!
137. # M D B l o c a l
- network latency
- large result set
- server contention
- query planning problem
02: explain won't solve all your problems
What took you so long?!
Your Application MongoDB
138. # M D B l o c a l
• Is your query using an index? Which one?
• Is your query using an index to provide the sort?
• How many of the examined documents ended up matching?
• Why did the server choose to answer the query the way it did?
03: Target Questions
🤔
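The first three target questions can be answered mechanically from an executionStats section. The sketch below is illustrative only: summarizePlan is a hypothetical name, the sample input reuses the counts from the slow-query log line, and the index name "nFavorites_-1" is an assumption (the slides show only the key pattern).

```javascript
// Hypothetical helper: answer the target questions from an executionStats section.
function summarizePlan(executionStats) {
  const stages = [];
  let indexName = null;
  // Walk the single-child stage tree from the outermost stage inward.
  for (let node = executionStats.executionStages; node; node = node.inputStage) {
    stages.push(node.stage);
    if (node.stage === "IXSCAN") indexName = node.indexName;
  }
  const examined = executionStats.totalDocsExamined;
  return {
    usesIndex: indexName !== null,
    indexName: indexName,
    // A SORT stage means the sort was done in memory, not provided by an index.
    hasBlockingSort: stages.includes("SORT"),
    // Fraction of examined documents that were actually returned.
    matchRatio: examined > 0 ? executionStats.nReturned / examined : null,
  };
}

// Sample input built from the figures on the slides; the index name is assumed.
const sampleStats = {
  nReturned: 20,
  totalKeysExamined: 359907,
  totalDocsExamined: 359907,
  executionStages: {
    stage: "PROJECTION",
    inputStage: {
      stage: "SORT",
      inputStage: {
        stage: "FETCH",
        inputStage: { stage: "IXSCAN", indexName: "nFavorites_-1" },
      },
    },
  },
};

const summary = summarizePlan(sampleStats);
console.log(summary);
```

The fourth question, why the server chose this plan, is not answered by this summary; that is what the "allPlansExecution" verbosity is for.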
Editor's Notes
Intro:
message: what is this about, but really?
table of contents - value proposition
Terms to define:
Selectivity
Fetch vs. collection scan
Covering
Index keys
Fetch!
What is interesting about query to me?
What is this slide for?
Do they trust me? Are they qualified?
How do I think? Point of view?
Open source -> not easy to understand
Don't ask me, ask the database
I use explain, we're buddies
Add a bit of fluff before the table of contents.
Note about why we're going bottom to top here.
use index name, not key pattern
Empirical
Observational
Why? Expensive?
Talk about why multi planning is worth it
Make this more of a reminder, finish on a higher note.