MongoDB can be used to store and query document-oriented data, and provides scalability through horizontal scaling. The document stores provide more flexibility than relational databases by allowing dynamic schemas with embedded documents. MongoDB combines the rich querying of relational databases with the flexibility and scalability of NoSQL databases. It uses indexes to improve query performance and supports features like aggregation, geospatial queries, and text search.
MongoDB is one of the most popular databases these days and there are a few reasons for such popularity. One of these reasons is the excellent integration with different programming languages and development frameworks.
In the case of Python we take it a few notches up (native use of dictionaries, integration with asynchronous libraries (twisted, gevent), good support for web frameworks like django, flask, bottle ... (mongoengine anyone?).
This talk is about the several different projects that we support, the way to effectively use Python and MongoDB together and a few other improvements and announcements.
Back to Basics, webinar 2: La tua prima applicazione MongoDBMongoDB
Questo è il secondo webinar della serie Back to Basics che ti offrirà un'introduzione al database MongoDB. In questo webinar ti dimostreremo come creare un'applicazione base per il blogging in MongoDB.
Conceptos básicos. Seminario web 5: Introducción a Aggregation FrameworkMongoDB
Este es el quinto seminario web de la serie Conceptos básicos, en la que se realiza una introducción a la base de datos MongoDB. En este seminario web, se analizan los aspectos básicos de Aggregation Framework.
Webinar: Getting Started with MongoDB - Back to BasicsMongoDB
Part one an Introduction to MongoDB. Learn how easy it is to start building applications with MongoDB. This session covers key features and functionality of MongoDB and sets out the course of building an application.
MongoDB is one of the most popular databases these days and there are a few reasons for such popularity. One of these reasons is the excellent integration with different programming languages and development frameworks.
In the case of Python we take it a few notches up (native use of dictionaries, integration with asynchronous libraries (twisted, gevent), good support for web frameworks like django, flask, bottle ... (mongoengine anyone?).
This talk is about the several different projects that we support, the way to effectively use Python and MongoDB together and a few other improvements and announcements.
Back to Basics, webinar 2: La tua prima applicazione MongoDBMongoDB
Questo è il secondo webinar della serie Back to Basics che ti offrirà un'introduzione al database MongoDB. In questo webinar ti dimostreremo come creare un'applicazione base per il blogging in MongoDB.
Conceptos básicos. Seminario web 5: Introducción a Aggregation FrameworkMongoDB
Este es el quinto seminario web de la serie Conceptos básicos, en la que se realiza una introducción a la base de datos MongoDB. En este seminario web, se analizan los aspectos básicos de Aggregation Framework.
Webinar: Getting Started with MongoDB - Back to BasicsMongoDB
Part one an Introduction to MongoDB. Learn how easy it is to start building applications with MongoDB. This session covers key features and functionality of MongoDB and sets out the course of building an application.
This presentation will demonstrate how you can use the aggregation pipeline with MongoDB similar to how you would use GROUP BY in SQL and the new stage operators coming 3.4. MongoDB’s Aggregation Framework has many operators that give you the ability to get more value out of your data, discover usage patterns within your data, or use the Aggregation Framework to power your application. Considerations regarding version, indexing, operators, and saving the output will be reviewed.
Webinaire 2 de la série « Retour aux fondamentaux » : Votre première applicat...MongoDB
Il s'agit du deuxième webinaire de la série « Retour aux fondamentaux » qui a pour but de vous présenter la base de données MongoDB. Dans ce webinaire, nous vous expliquerons comment créer une application de création de blogs dans MongoDB.
Back to Basics Webinar 5: Introduction to the Aggregation FrameworkMongoDB
This is the fifth webinar of a Back to Basics series that will introduce you to the MongoDB database. This webinar will introduce you to the aggregation framework.
These are slides from our Big Data Warehouse Meetup in April. We talked about NoSQL databases: What they are, how they’re used and where they fit in existing enterprise data ecosystems.
Mike O’Brian from 10gen, introduced the syntax and usage patterns for a new aggregation system in MongoDB and give some demonstrations of aggregation using the new system. The new MongoDB aggregation framework makes it simple to do tasks such as counting, averaging, and finding minima or maxima while grouping by keys in a collection, complementing MongoDB’s built-in map/reduce capabilities.
For more information, visit our website at http://casertaconcepts.com/ or email us at info@casertaconcepts.com.
For developers new to MongoDB and Node.js, however, some the common design patterns are very different than those of a RDBMS and traditional synchronous languages. Developers learning these technologies together may find it a bit bewildering. In reality, however, these tools fit perfectly together and enable I high degree of developer productivity and application performance.
This webinar will walk developers through common MongoDB development patterns in Node.js, such as efficiently loading data into MongoDB using MongoDB's bulk API, iterating through query results, and managing simultaneous asynchronous MongoDB queries to provide the best possible application performance. Working Node.js and MongoDB examples will be used throughout the presentation.
Webinar: Data Processing and Aggregation OptionsMongoDB
MongoDB scales easily to store mass volumes of data. However, when it comes to making sense of it all what options do you have? In this talk, we'll take a look at 3 different ways of aggregating your data with MongoDB, and determine the reasons why you might choose one way over another. No matter what your big data needs are, you will find out how MongoDB the big data store is evolving to help make sense of your data.
MongoDB: Comparing WiredTiger In-Memory Engine to RedisJason Terpko
This presentation will compare WiredTiger’s In-Memory Engine to Redis. We will review characteristics of each data store, how they are similar, and different. Understanding the similarities and differences will help you decide which data store is best suited for your key-value store needs.
Back to Basics: My First MongoDB ApplicationMongoDB
This Back to Basics webinar series will introduce you to NoSQL and the MongoDB database. You will find out what MongoDB is, why you would use it, and what you would use it for.
MongoDB Days UK: Using MongoDB and Python for Data Analysis PipelinesMongoDB
Presented by Eoin Brazil, Proactive Technical Services Engineer, MongoDB
Experience level: Advanced
MongoDB offers a flexible, scalable, and easy way to store your large data set. Python provides many useful data science tools (e.g. NumPy, SciPy, Scikit-learn, etc.). This talk will discuss the concerns for creating operational data analytic pipelines, introduce Monary as alternative for loading data into NumPy, and give examples of accessing data with Monary, as well as how to build scalable data analysis pipelines using these open source tools.
This presentation will demonstrate how you can use the aggregation pipeline with MongoDB similar to how you would use GROUP BY in SQL and the new stage operators coming 3.4. MongoDB’s Aggregation Framework has many operators that give you the ability to get more value out of your data, discover usage patterns within your data, or use the Aggregation Framework to power your application. Considerations regarding version, indexing, operators, and saving the output will be reviewed.
Webinaire 2 de la série « Retour aux fondamentaux » : Votre première applicat...MongoDB
Il s'agit du deuxième webinaire de la série « Retour aux fondamentaux » qui a pour but de vous présenter la base de données MongoDB. Dans ce webinaire, nous vous expliquerons comment créer une application de création de blogs dans MongoDB.
Back to Basics Webinar 5: Introduction to the Aggregation FrameworkMongoDB
This is the fifth webinar of a Back to Basics series that will introduce you to the MongoDB database. This webinar will introduce you to the aggregation framework.
These are slides from our Big Data Warehouse Meetup in April. We talked about NoSQL databases: What they are, how they’re used and where they fit in existing enterprise data ecosystems.
Mike O’Brian from 10gen, introduced the syntax and usage patterns for a new aggregation system in MongoDB and give some demonstrations of aggregation using the new system. The new MongoDB aggregation framework makes it simple to do tasks such as counting, averaging, and finding minima or maxima while grouping by keys in a collection, complementing MongoDB’s built-in map/reduce capabilities.
For more information, visit our website at http://casertaconcepts.com/ or email us at info@casertaconcepts.com.
For developers new to MongoDB and Node.js, however, some the common design patterns are very different than those of a RDBMS and traditional synchronous languages. Developers learning these technologies together may find it a bit bewildering. In reality, however, these tools fit perfectly together and enable I high degree of developer productivity and application performance.
This webinar will walk developers through common MongoDB development patterns in Node.js, such as efficiently loading data into MongoDB using MongoDB's bulk API, iterating through query results, and managing simultaneous asynchronous MongoDB queries to provide the best possible application performance. Working Node.js and MongoDB examples will be used throughout the presentation.
Webinar: Data Processing and Aggregation OptionsMongoDB
MongoDB scales easily to store mass volumes of data. However, when it comes to making sense of it all what options do you have? In this talk, we'll take a look at 3 different ways of aggregating your data with MongoDB, and determine the reasons why you might choose one way over another. No matter what your big data needs are, you will find out how MongoDB the big data store is evolving to help make sense of your data.
MongoDB: Comparing WiredTiger In-Memory Engine to RedisJason Terpko
This presentation will compare WiredTiger’s In-Memory Engine to Redis. We will review characteristics of each data store, how they are similar, and different. Understanding the similarities and differences will help you decide which data store is best suited for your key-value store needs.
Back to Basics: My First MongoDB ApplicationMongoDB
This Back to Basics webinar series will introduce you to NoSQL and the MongoDB database. You will find out what MongoDB is, why you would use it, and what you would use it for.
MongoDB Days UK: Using MongoDB and Python for Data Analysis PipelinesMongoDB
Presented by Eoin Brazil, Proactive Technical Services Engineer, MongoDB
Experience level: Advanced
MongoDB offers a flexible, scalable, and easy way to store your large data set. Python provides many useful data science tools (e.g. NumPy, SciPy, Scikit-learn, etc.). This talk will discuss the concerns for creating operational data analytic pipelines, introduce Monary as alternative for loading data into NumPy, and give examples of accessing data with Monary, as well as how to build scalable data analysis pipelines using these open source tools.
MongoDB Certification Study Group - May 2016Norberto Leite
Study group session to review the certification exam regarding material covered, exam structure and technical requirements. DBA and Developers track covered to ensure the technical expertise of individuals on subject matter topics specific to MongoDB
Data Engineering 101: Building your first data product by Jonathan Dinu PyDat...PyData
Often times there exists a divide between data teams, engineering, and product managers in organizations, but with the dawn of data driven companies/applications, it is more prescient now than ever to be able to automate your analyses to personalize your users experiences. LinkedIn's People you May Know, Netflix and Pandora's recommenders, and Amazon's eerily custom shopping experience have all shown us why it is essential to leverage data if you want to stay relevant as a company.
As data analyses turn into products, it is essential that your tech/data stack be flexible enough to run models in production, integrate with web applications, and provide users with immediate and valuable feedback. I believe Python is becoming the lingua franca of data science due to its flexibility as a general purpose performant programming language, rich scientific ecosystem (numpy, scipy, scikit-learn, pandas, etc.), web frameworks/community, and utilities/libraries for handling data at scale. In this talk I will walk through a fictional company bringing it's first data product to market. Along the way I will cover Python and data science best practices for such a pipeline, cover some of the pitfalls of what happens when you put models into production, and how to make sure your users (and engineers) are as happy as they can be.
https://github.com/Jay-Oh-eN/pydatasv2014
The basics of Python are rather straightforward. In a few minutes you can learn most of the syntax. There are some gotchas along the way that might appear tricky. This talk is meant to bring programmers up to speed with Python. They should be able to read and write Python.
MongoDB 2.4 enthält über hundert Verbesserungen, welche mehr Möglichkeiten und eine höhere Produktivität für Entwickler, erweiterte Datenbank Management Funktionalitäten und Geschwindigkeitsverbesserungen beinhalten. Im Webinar werden die wichtigsten Neuerungen anhand von Beispielen erläutert.
MongoDB for Coder Training (Coding Serbia 2013)Uwe Printz
Slides of my MongoDB Training given at Coding Serbia Conference on 18.10.2013
Agenda:
1. Introduction to NoSQL & MongoDB
2. Data manipulation: Learn how to CRUD with MongoDB
3. Indexing: Speed up your queries with MongoDB
4. MapReduce: Data aggregation with MongoDB
5. Aggregation Framework: Data aggregation done the MongoDB way
6. Replication: High Availability with MongoDB
7. Sharding: Scaling with MongoDB
As your data grows, the need to establish proper indexes becomes critical to performance. MongoDB supports a wide range of indexing options to enable fast querying of your data, but what are the right strategies for your application?
In this talk we’ll cover how indexing works, the various indexing options, and use cases where each can be useful. We'll dive into common pitfalls using real-world examples to ensure that you're ready for scale.
As Europe's leading economic powerhouse and the fourth-largest hashtag#economy globally, Germany stands at the forefront of innovation and industrial might. Renowned for its precision engineering and high-tech sectors, Germany's economic structure is heavily supported by a robust service industry, accounting for approximately 68% of its GDP. This economic clout and strategic geopolitical stance position Germany as a focal point in the global cyber threat landscape.
In the face of escalating global tensions, particularly those emanating from geopolitical disputes with nations like hashtag#Russia and hashtag#China, hashtag#Germany has witnessed a significant uptick in targeted cyber operations. Our analysis indicates a marked increase in hashtag#cyberattack sophistication aimed at critical infrastructure and key industrial sectors. These attacks range from ransomware campaigns to hashtag#AdvancedPersistentThreats (hashtag#APTs), threatening national security and business integrity.
🔑 Key findings include:
🔍 Increased frequency and complexity of cyber threats.
🔍 Escalation of state-sponsored and criminally motivated cyber operations.
🔍 Active dark web exchanges of malicious tools and tactics.
Our comprehensive report delves into these challenges, using a blend of open-source and proprietary data collection techniques. By monitoring activity on critical networks and analyzing attack patterns, our team provides a detailed overview of the threats facing German entities.
This report aims to equip stakeholders across public and private sectors with the knowledge to enhance their defensive strategies, reduce exposure to cyber risks, and reinforce Germany's resilience against cyber threats.
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...pchutichetpong
M Capital Group (“MCG”) expects to see demand and the changing evolution of supply, facilitated through institutional investment rotation out of offices and into work from home (“WFH”), while the ever-expanding need for data storage as global internet usage expands, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to see strong expected annual growth of 13% over the next 4 years.
Whilst competitive headwinds remain, represented through the recent second bankruptcy filing of Sungard, which blames “COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services”, the industry has seen key adjustments, where MCG believes that engineering cost management and technological innovation will be paramount to success.
MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment will be driving market momentum forward. The continuous injection of capital by alternative investment firms, as well as the growing infrastructural investment from cloud service providers and social media companies, whose revenues are expected to grow over 3.6x larger by value in 2026, will likely help propel center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies.
According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”
3. Databases
Relational Databases (RDBMS) “Les SGBDRs” (OLTP – Online Transaction Processing)
• Based on relational database model invented by E.F. Codd (IBM) in 1970
• SQL (SEQUEL): expressive query language
• Achieves performance increase through “vertical scaling” (higher performance node)
NoSQL (No SQL, or “Not just SQL”?) existed before RDBMS, resurged out of a need for more Scalability for
Web2.0
- Columnar, e.g. Cassandra, Hbase, Vertica
- Document Oriented, e.g. CouchDB, MongoDB, RethinkDB
- Graph, e.g. Neo4J
- Key-value, e.g. CouchDB, Dynamo, Riak, Redis, Oracle NoSQL, MUMPS
- Multi-model, e.g. Alchemy, ArangoDB
• Achieves performance increase through “horizontal scaling” (by adding nodes)
OLAP – Online Analytic Processing
4. Databases for scalability
The need for scalability requires distribution of processing making it more difficult
to satisfy all criteria, according to the CAP Theorem (2000)
- Consistency (Atomicity)
- Availability (A request will always receive a response)
- Partition Tolerance (if nodes become disconnected, the system continues to
function)
It is impossible to satisfy all three constraints (proved by MIT, 2002).
RDBMS - favour Consistency and Availability but cannot scale horizontally
NoSQL - favour Consistency and Partitioning or Availability and Partitioning
5. Relational vs. NoSQL
Relational databases provide
- Rich query language (SQL)
- Strong consistency
- Secondary indices
NoSQL provides
- Flexibility
- “dynamic schema” allows missing or extra fields, embedded structures
- Scalability
- Adding nodes allows to scale data size
- Performance
- Adding nodes allows to scale performance
MongoDB combines both sets of advantages
7. MongoDB – a Document-Oriented DB
Open source DB on github with commercial support from MongoDB Inc (was 10gen).
• Dynamic schema (schema-less) aids agile development
• A document is an associative array (possibly more than 1-level deep), e.g. JSON Object
• Uses BSON (binary JSON format) for serialization (easy to parse)
• Binaries downloadable for Linux(es), Windows, OSX, Solaris
• Drivers available in many languages
• Provides an aggregation framework
• Is the “M” in MEAN
8. MongoDB – Performance
• Written in C++
• It achieves high performance, especially on dynamic queries
• Allows primary & secondary indices (main tuning element)
• Is horizontally scalable across many nodes
• Extensive use of memory-mapped files
i.e. read-through write-through memory caching.
• Provides high availability with Master/slave replication
• Provides Sharding (horizontal partitioning of data across nodes)
• Horizontal partitioning : complete documents (or rows) are stored on a node
• Vertical partitioning: documents (or rows) are split across nodes
11. MongoDB – Enterprise Products
There are supported products for the enterprise
• MongoDB Enterprise: for production workloads
• MongoDB Compass: Data and Schema visualization tool
• MongoDB Connector for BI:
Allows users to visualize their MongoDB Enterprise data using existing relational
business intelligence tools such as Tableau.
Foreign data wrapper with PostgreSQL to provide a relational SQL view on
MongoDB data
12. MongoDB v3.2: new features
• The default storage engine is now WiredTiger
• Following acquisition of WiredTiger, Inc.
• Introduced as an optional engine in MongoDB 3.0
• Has more granular concurrency control and native compression (lowering storage costs,
increasing h/w utilization, throughput and providing more predictable performance)
• Replication election enhancements
• Config servers as replica sets
• readConcern, and document validations.
• OpsManager 2.0
13. MongoDB v3.2: new aggregation features
• $sample, $lookup
• $indexStats
• $filter,$slice,$arrayElemAt, $isArray, $concatArrays
• Partial Indexes
• Document Validation
• 4 bit testing operators
• 2 new Accumulators for $group
• 10 new Arithmetic expression operators
14. MongoDB components
Components
mongod - The database daemon process.
mongos - Sharding controller.
mongo - The database shell (uses interactive javascript).
Utilities
mongodump - MongoDB dump tool - for backups, snapshots, etc.
mongorestore - MongoDB restore a dump
mongoexport - Export a single collection to test (JSON, CSV)
mongoimport - Import from JSON or CSV
mongofiles - Utility for putting and getting files from MongoDB GridFS
mongostat - Show performance statistics
19. MongoDB shell
> mongo
MongoDB shell version: 3.2.3
connecting to: test
> show dbs
Money_UK 0.000GB
aggregation_example 0.000GB
local 0.000GB
test 0.000GB
> use Money_UK
switched to db Money_UK
> show collections
CARD
SAVING
22. MongoDB indexing
Indexing is the single biggest tunable performance factor.
Can index on a single field, e.g. on ascending or descending values:
db.newcoll.ensureIndex( { ‘field1’: 1 } ) // or -1: descending
Or subfields:
db.newcoll.ensureIndex( { ‘arr.subfield’: 1 } )
Or multiple fields:
db.newcoll.ensureIndex( { ‘arr.subfield’: 1, ‘field2’: -1 } )
23. MongoDB indexing - 2
It is also possible to provide hints to the indexer, on uniqueness, sparseness or to
request index creation as a background task, e.g.
db.newcoll.ensureIndex( { ‘field1’: 1 }, { ‘unique’: true} )
db.newcoll.ensureIndex( { ‘field1’: 1 , ‘field2’: 1}, { ‘sparse’: true} )
db.newcoll.ensureIndex( { ‘field1’: 1 }, { ‘background’: true} )
24. MongoDB searching
We can search based on a single field
db.newcoll.find( { ‘field1’: ‘any match’ } )
Or subfields:
db.newcoll.find( { ‘arr.subfield’: ‘sub field match’ } )
Or multiple fields:
db.newcoll.find( { ‘arr.subfield’: ‘sub field match’, ‘field2’: ‘match2’ } )
Available indices will be used if possible
25. MongoDB sorting
We can sort the results in ascending or descending order
db.newcoll.find().sort( { ‘field1’: 1 } ) // or -1: descending
or
db.newcoll.find().sort( { ‘field1’: 1, ‘field2’: -1 } )
Available indices will be used if possible
26. MongoDB projections
We can reduce the number of returned fields (the projection).
This may enable data access directly from the index.
In this example we limit the returned fields to ‘field1’ (and the ‘_id’ index).
db.newcoll.find({ ‘field1’: ‘example’ }, { ‘field1’: 1} )
In this example we limit the returned fields to ‘field1’ (without the ‘_id’ index).
db.newcoll.find({ ‘field1’: ‘example’ }, { ‘_id’: 0, ‘field1’: 1} )
We can also override the default index
db.newcoll.find({ ‘field1’: ‘example’ }, { ‘_id’: ‘field1’ } )
27. MongoDB explain plan
We can request explanation of how a query will be handled (showing possible index use)
> db.newcoll.find().explain()
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "test.newcoll",
"indexFilterSet" : false,
"parsedQuery" : {
"$and" : [ ]
},
"winningPlan" : {
"stage" : "COLLSCAN",
"filter" : {
"$and" : [ ]
},
"direction" : "forward"
},
"rejectedPlans" : [ ]
},
"serverInfo" : {
"host" : "MJBRIGHT7",
"port" : 27017,
"version" : "3.2.3",
"gitVersion" : "b326ba837cf6f49d65c2f85e1b70f6f31ece7937"
},
"ok" : 1
}
28. MongoDB special index types
• Geospatial Indexes (2d Sphere)
• Text Indexes
• TTL Collections (expireAfterSeconds)
• Hashed Indexes for sharding
35. PyMongo – ca y est, on arrive !
… there are alternatives to the official PyMongo driver.
• Motor is a full-featured, non-blocking MongoDB driver for Python Tornado
applications.
• TxMongo is an asynchronous Twisted Python driver for MongoDB.
36. PyMongo – ca y est, on arrive !
… and other Python/MongoDB tools
ORM-like Layers
New users are recommended to begin with PyMongo.
Nevertheless other implementations are available providing higher abstraction.
Humongolus, MongoKit, Ming, MongoAlchemy, MongoEngine, Minimongo, Manga,
MotorEngine
Framework Tools
This section lists tools and adapters that have been designed to work with various Python
frameworks and libraries.
Django MongoDB Engine, Mango, Django MongoEngine, mongodb_beaker, Log4Mongo,
MongoLog, C5t, rod.recipe.mongodb, repoze-what-plugins-mongodb, Mongobox, Flask-
MongoAlchemy, Flask-MongoKit, Flask-PyMongo.