Query Analyzing
Introduction to indexes
Indexes in Mongo
Managing indexes in MongoDB
Using indexes to sort query results
When should we use indexes?
When should we avoid using indexes?
Mongo indexes
1. MongoDB Indexes
● Query Analyzing
● Introduction to indexes
● Indexes in Mongo
● Managing indexes in MongoDB
● Using indexes to sort query results
● When should we use indexes?
● When should we avoid using indexes?
2. MongoDB Indexes
Creating an experimental collection
● Selecting a database
● Inserting fake data into our test collection "product": 900,000 documents
● Products are categorized into 10 categories, and the status of each product can be 0 or 1
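The exact setup script is not shown on the slides; a minimal mongo shell sketch of how such a collection might be populated (field values here are assumptions):

// Hypothetical data generator for the "product" test collection (900,000 documents).
for (var i = 1; i <= 900000; i++) {
    db.product.insert({
        name: "Prod " + i,
        category: (i % 10) + 1,   // 10 categories: 1..10
        status: i % 2             // status is either 0 or 1
    });
}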
3. MongoDB Indexes
Query Analyzing
Sample documents from the "product" collection:
_id        name    category  status
1236****   Prod 1  1         0
1237****   Prod 2  5         0
1238****   Prod 3  9         1
1238****   Prod 4  6         0
1239****   Prod 5  5         1
● db.product.find({category : 5});
● Searching for products in a specific category.
● Number of scanned documents.
● Mongo scans all documents in the collection and matches them against the given condition.
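The slides illustrate this with explain output; a minimal way to reproduce the check in the mongo shell is shown below (field names may differ in older releases, e.g. nscanned instead of totalDocsExamined):

// Without an index on "category", the whole collection is scanned.
var stats = db.product.find({ category: 5 }).explain("executionStats").executionStats;
stats.totalDocsExamined;   // close to 900000 -- every document is examined
stats.nReturned;           // only the matching documents are returned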
4. MongoDB Indexes
Query Analyzing
● If we search for a single document by "_id" using the "find" function, we get significantly different results.
● Right now, the only way to get that level of performance is to search by "_id".
● What makes this happen is the index on the "_id" field.
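For comparison, a hedged sketch of checking an _id lookup in the shell:

// Look up any existing document by _id and inspect the scanned-document count.
var someId = db.product.findOne()._id;
db.product.find({ _id: someId }).explain("executionStats").executionStats.totalDocsExamined;
// => 1: the default unique index on _id is used instead of a collection scan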
5. MongoDB Indexes
Introduction to indexes
● How does a normal query work? A "linear search".
● Indexes are the single most critical tool for increasing database performance.
● What is an index?
An index is a data structure that contains a copy of some of the data in the database.
6. MongoDB Indexes
Indexes In Mongo
● Index Types:
1) Default: the _id index.
2) Single: similar to the default, but can be created on any document field.
3) Compound: an index defined on multiple fields, e.g. employeeId and salary.
4) Multikey: used to index a field that contains an array.
   In a compound index, at most one of the indexed fields can hold an array.
5) GeoSpatial: a set of indexes and query mechanisms for handling geospatial information.
6) Text: used to support text search over string content.
7) Hashed: indexes the hash of a field's value.
   Doesn't support range-based queries or multikey arrays; supports only equality matches.
Indexes in Mongo are tree data structures. For more info, search for "B-tree".
● Index Properties:
1) Unique.
2) TTL: special indexes used to automatically remove documents after a certain period of time.
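Hedged shell sketches of how each type and property might be created (collection and field names are assumptions, not from the slides; older shells used ensureIndex instead of createIndex):

db.product.createIndex({ category: 1 });                    // single-field index
db.employees.createIndex({ employeeId: 1, salary: -1 });    // compound index
db.product.createIndex({ tags: 1 });                        // multikey (when "tags" holds arrays)
db.places.createIndex({ loc: "2dsphere" });                 // geospatial index
db.articles.createIndex({ body: "text" });                  // text index
db.users.createIndex({ email: "hashed" });                  // hashed index
db.users.createIndex({ email: 1 }, { unique: true });       // unique property
db.sessions.createIndex({ createdAt: 1 }, { expireAfterSeconds: 3600 }); // TTL property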
7. MongoDB Indexes
Indexes In Mongo
Single Index:
It can be applied to any field of a collection.
Embedded Fields & Embedded Documents:
● We can create indexes on fields within embedded documents.
● We can create indexes on whole embedded documents.
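Hedged examples of both cases (the "details" subdocument is an assumption for illustration):

db.product.createIndex({ name: 1 });                     // single-field index on a top-level field
db.product.createIndex({ "details.manufacturer": 1 });   // index on a field within an embedded document
db.product.createIndex({ details: 1 });                   // index on the embedded document as a whole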
8. MongoDB Indexes
Indexes In Mongo
Compound Index:
A single index structure that references more than one field (a maximum of 31) in the same collection.
Let's go back to our example:
We have 900,000 products.
Single document example.
Continue to the next slide.
9. MongoDB Indexes
Indexes In Mongo : Compound
Find query scenarios explained (see the MongoDB documentation for more about explain() results).
Now let's create a compound index on both the category and status fields:
Continue to the next slide.
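A minimal sketch of the index creation and the query scenarios it affects (the exact shell output shown on the slide is omitted):

// Create the compound index on category and status.
db.product.createIndex({ category: 1, status: 1 });

// Queries that can use the index (they match a prefix of the indexed fields):
db.product.find({ category: 5 });                 // index scan on the "category" prefix
db.product.find({ category: 5, status: 1 });      // index scan on both fields

// A query on "status" alone does not match a prefix, so the index is not used:
db.product.find({ status: 1 });                   // full collection scan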
10. MongoDB Indexes
Indexes In Mongo
Find query scenarios explained (see the MongoDB documentation for more about explain() results).
In the previous slide, the index did not take effect when we filtered documents by the "status" field: Mongo scanned all 900,000 documents. This is because indexes in Mongo are tree-structured, so a compound index can only be used by queries on a prefix of its fields.
Let's try to explain it again:
Tests from the previous slide:
11. MongoDB Indexes
Indexes In Mongo
Conclusions about compound indexes:
● Sorting with an index is supported by compound indexes as well as single-field indexes.
● Sort order can matter in determining whether the index can support a sort operation.
● Compound indexes support queries on index prefixes.
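A hedged illustration of the sort-order rule with the { category: 1, status: 1 } index:

db.product.find().sort({ category: 1, status: 1 });    // supported: same order as the index
db.product.find().sort({ category: -1, status: -1 });  // supported: exact reverse of the index
db.product.find().sort({ category: 1, status: -1 });   // not supported: directions neither match nor invert together
db.product.find({ category: 5 }).sort({ status: 1 });  // supported: equality on the prefix, sort on the next field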
12. MongoDB Indexes
Indexes In Mongo
Multikey Index:
To index a field that holds an array, MongoDB creates an index entry for each element of the array.
Creating a multikey index is the same as creating any other index; MongoDB automatically makes the index multikey if any indexed field contains an array.
Indexing arrays of embedded documents
Simple array indexing:
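Hedged sketches of both cases (the "tags" and "reviews" fields are assumptions for illustration):

// Simple array indexing: "tags" holds an array of strings.
db.product.insert({ name: "Prod X", tags: ["new", "sale", "featured"] });
db.product.createIndex({ tags: 1 });              // automatically becomes a multikey index

// Arrays of embedded documents: index a field inside each array element.
db.product.insert({ name: "Prod Y", reviews: [{ user: "a", rating: 5 }, { user: "b", rating: 3 }] });
db.product.createIndex({ "reviews.rating": 1 });  // multikey index on an embedded field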
13. MongoDB Indexes
GeoSpatial Index
GeoSpatial Index:
MongoDB offers a number of indexes that allow us to handle geospatial data: geographical information that points to a specific location using longitude and latitude, or x and y in Cartesian coordinates.
There are two surface types:
● Flat: for calculating distances on a Euclidean plane. Use a "2d" index. Supports data stored as legacy coordinate pairs [x, y] on a two-dimensional plane.
● Spherical: for calculating geometry over an Earth-like sphere. Use a "2dsphere" index. Supports data stored as GeoJSON objects and as legacy coordinate pairs.
14. MongoDB Indexes
GeoSpatial Index
Simple example of a 2d index:
Creating geospatial data for restaurants.
A sample of the documents generated by the previous code.
Creating the index:
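The data-generation and index-creation code for the 2d case appears as an image on the slide; a hedged reconstruction (the "places2d" collection name and coordinate ranges are assumptions):

// Generate legacy coordinate pairs [x, y] for the 2d index.
for (var i = 0; i < 1000; i++) {
    db.places2d.insert({
        loc: [Math.random() * 180 - 90, Math.random() * 180 - 90],  // [x, y] within the default 2d bounds
        name: "Restaurant Num. " + i,
        status: i % 2
    });
}
db.places2d.createIndex({ loc: "2d" });   // create the 2d index on the "loc" field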
15. MongoDB Indexes
GeoSpatial Index
Simple example of a 2dsphere index:
Creating geospatial data for restaurants (see the script below).
A sample of the documents generated by the previous code.
Creating the index:
Note: 2dsphere supports data stored as GeoJSON objects.
For more info about GeoJSON see geojson.org; geojson.io is a tool that shows how GeoJSON is structured.
// Generate 1000 restaurants with random GeoJSON Point locations.
for (var i = 0; i < 1000; i++) {
    // Build random signed coordinate strings such as "-42.317", then convert them to numbers with the unary +.
    var x = (Math.floor(Math.random() * 20) % 2 == 1 ? "" : "-") + (i + Math.floor(Math.random() * 150)) % 99 + '.' + (i + Math.floor(Math.random() * 100000)) % 1000;
    var y = (Math.floor(Math.random() * 20) % 2 == 1 ? "" : "-") + (i + Math.floor(Math.random() * 150)) % 99 + '.' + (i + Math.floor(Math.random() * 100000)) % 1000;
    db.places.insert({
        "loc": {
            type: "Point",
            coordinates: [+x, +y]   // GeoJSON order: [longitude, latitude]
        },
        name: "Restaurant Num. " + i,
        status: i % 2
    });
}
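The index creation itself appears as an image on the slide; presumably something like the following (a sketch, not the slide's exact command):

db.places.createIndex({ loc: "2dsphere" });   // 2dsphere index over the GeoJSON "loc" field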
16. MongoDB Indexes
GeoSpatial Index
Geospatial query operators (see the MongoDB documentation for more info):
● $geoWithin
● $geoIntersects
● $near
● $nearSphere
● $geometry
● $minDistance
● $maxDistance
● $center
● $centerSphere
● $box
● $polygon
● $uniqueDocs
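A hedged example of one of these operators, using the 2dsphere index created earlier (the query point and distance are assumptions):

db.places.find({
    loc: {
        $near: {
            $geometry: { type: "Point", coordinates: [ -73.98, 40.75 ] },  // [longitude, latitude]
            $maxDistance: 5000                                             // metres
        }
    }
});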