Buzz Moschetti compares the development time and effort required to save and fetch contact data using MongoDB versus SQL over the course of two weeks. With SQL, each time a new field is added or the data structure changes, the SQL schema must be altered and code updated in multiple places. With MongoDB, the data structure can evolve freely without changes to the data access code - it remains a simple insert and find. By day 14, representing the more complex data structure in SQL would require flattening some data and storing it in non-ideal ways, while MongoDB continues to require no changes to the simple data access code.
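The contrast above can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the SQL side uses the standard-library `sqlite3` module, and a plain list of dicts stands in for a MongoDB collection so the example runs without a server. The table name, field names, and the `insert`/`find` helpers are all hypothetical.

```python
import sqlite3

# SQL side: adding a field means altering the schema and the access code.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contact (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO contact (name) VALUES (?)", ("Ada",))

# New requirement: store a title. The schema changes first...
conn.execute("ALTER TABLE contact ADD COLUMN title TEXT")
# ...then every INSERT and SELECT that should see the field is edited too.
conn.execute("INSERT INTO contact (name, title) VALUES (?, ?)", ("Bob", "Dr"))
sql_rows = conn.execute(
    "SELECT name, title FROM contact ORDER BY id"
).fetchall()

# Document side: the access code stays an insert and a find.
contacts = []  # hypothetical stand-in for a collection


def insert(doc):
    """Store a document of any shape."""
    contacts.append(dict(doc))


def find(criteria):
    """Return documents whose fields equal every value in criteria."""
    return [
        d for d in contacts
        if all(d.get(k) == v for k, v in criteria.items())
    ]


insert({"name": "Ada"})
insert({
    "name": "Bob",
    "title": "Dr",
    "phones": [{"type": "work", "number": "555-0100"}],
})
doc_rows = find({"name": "Bob"})
```

In the document half, the nested `phones` list was added without touching `insert` or `find`, which is the point the summary makes.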
As your data grows, the need to establish proper indexes becomes critical to performance. MongoDB supports a wide range of indexing options to enable fast querying of your data, but what are the right strategies for your application?
In this talk we’ll cover how indexing works, the various indexing options, and use cases where each can be useful. We'll dive into common pitfalls using real-world examples to ensure that you're ready for scale.
Video available here: http://vivu.tv/portal/archive.jsp?flow=783-586-4282&id=1270584002677
We all know that MongoDB is one of the most flexible and feature-rich databases available. In this webinar we'll discuss how you can leverage this feature set and maintain high performance with your project's massive data sets and high loads. We'll cover how indexes can be designed to optimize the performance of MongoDB. We'll also discuss tips for diagnosing and fixing performance issues should they arise.
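To make "how indexing works" concrete, here is a rough Python sketch that compares a full scan with a lookup through a sorted index. It is a simplification under stated assumptions: MongoDB indexes are B-tree structures, while this sketch uses a sorted list with binary search, and the `docs` data and function names are invented for illustration.

```python
import bisect

# 1000 hypothetical documents with a non-unique "age" field.
docs = [{"_id": i, "age": (i * 7) % 50} for i in range(1000)]


def scan(age):
    """Collection scan: every document is examined."""
    examined = 0
    hits = []
    for d in docs:
        examined += 1
        if d["age"] == age:
            hits.append(d["_id"])
    return hits, examined


# The "index": (value, position) pairs kept in sorted order.
index = sorted((d["age"], pos) for pos, d in enumerate(docs))
keys = [k for k, _ in index]


def index_lookup(age):
    """Binary-search the sorted keys, then touch only matching entries."""
    lo = bisect.bisect_left(keys, age)
    hi = bisect.bisect_right(keys, age)
    hits = [docs[pos]["_id"] for _, pos in index[lo:hi]]
    return hits, hi - lo


scan_hits, scan_examined = scan(21)
idx_hits, idx_examined = index_lookup(21)
```

Both calls return the same documents, but the scan examines all 1000 while the index lookup touches only the matching entries. The cost is that the index has to be kept sorted on every write.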
Query analysis
Introduction to indexes
Indexes in MongoDB
Managing indexes in MongoDB
Using indexes to sort query results
When should I use indexes?
When should we avoid using indexes?
We all know that MongoDB is one of the most flexible and feature-rich databases available. In this session we'll discuss how you can leverage this feature set and maintain high performance with your project's massive data sets and high loads. We'll cover how indexes can be designed to optimize the performance of MongoDB. We'll also discuss tips for diagnosing and fixing performance issues should they arise.
Presented by Tom Schreiber, Senior Consulting Engineer, MongoDB
MongoDB supports a wide range of indexing options to enable fast querying of your data, but what are the right strategies for your application? In this talk we’ll cover how indexing works, the various indexing options, and use cases where each might be useful. We'll dive into common pitfalls using real-world examples to ensure that you're ready for scale. We'll show you the tools and techniques for diagnosing and tuning the performance of your MongoDB deployment. Whether you're running into problems or just want to optimize your performance, these skills will be useful.
I inherited a MongoDB database server with 60 collections and 100 or so indexes.
The business users are complaining about slow report completion times. What can I do to improve performance?
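One possible first step for a server like this is to look for indexes that are never used, since each index adds write overhead. MongoDB's `$indexStats` aggregation stage reports per-index usage counters. The Python sketch below assumes that output has already been fetched into plain dicts with `name` and `accesses.ops` fields; the collection names, index names, and the `unused_indexes` helper are hypothetical, and the counters only cover the period since the server last started.

```python
def unused_indexes(stats_by_collection):
    """Map each collection to its indexes with zero recorded accesses.

    The mandatory _id index is skipped because it cannot be dropped.
    """
    report = {}
    for coll, stats in stats_by_collection.items():
        idle = sorted(
            s["name"] for s in stats
            if s["name"] != "_id_" and s["accesses"]["ops"] == 0
        )
        if idle:
            report[coll] = idle
    return report


# Invented sample in the shape of $indexStats output documents.
sample = {
    "orders": [
        {"name": "_id_", "accesses": {"ops": 0}},
        {"name": "customer_1", "accesses": {"ops": 5210}},
        {"name": "legacy_flag_1", "accesses": {"ops": 0}},
    ],
    "reports": [
        {"name": "_id_", "accesses": {"ops": 12}},
    ],
}

candidates = unused_indexes(sample)
```

An index flagged here is only a candidate for review: a report that runs once a quarter may depend on an index that looks idle today.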
MongoDB is an open-source database. CRUD in MongoDB differs from databases that use SQL statements: operations are expressed as JSON-style NoSQL queries, which I have tried to explain here.
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes MongoDB
This is the fourth webinar of a Back to Basics series that will introduce you to the MongoDB database. This webinar will introduce you to the aggregation framework.
Development time is wasted as the bulk of the work shifts from adding business features to struggling with the RDBMS. MongoDB, the leading NoSQL database, offers a flexible and scalable solution.
Building a Scalable Inbox System with MongoDB and Java antoinegirbal
Many user-facing applications present some kind of news feed/inbox system. You can think of Facebook, Twitter, or Gmail as different types of inboxes where the user can see data of interest, sorted by time, popularity, or other parameter. A scalable inbox is a difficult problem to solve: for millions of users, varied data from many sources must be sorted and presented within milliseconds. Different strategies can be used: scatter-gather, fan-out writes, and so on. This session presents an actual application developed by 10gen in Java, using MongoDB. This application is open source and is intended to show the reference implementation of several strategies to tackle this common challenge. The presentation also introduces many MongoDB concepts.
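The two strategies named above can be sketched in memory with Python. This is not the 10gen reference application, which is written in Java; the follower graph, function names, and message tuples are hypothetical, and dicts stand in for collections.

```python
from collections import defaultdict

# author -> list of followers (invented data)
followers = {"alice": ["bob", "carol"], "bob": ["carol"]}

# Strategy 1: fan-out on write. Each post is copied into every
# follower's inbox, so a read touches a single inbox.
inboxes = defaultdict(list)


def post_fanout_write(author, ts, text):
    for reader in followers.get(author, []):
        inboxes[reader].append((ts, author, text))


def read_fanout_write(reader, limit=10):
    return sorted(inboxes[reader], reverse=True)[:limit]


# Strategy 2: scatter-gather (fan-out on read). Each post is stored
# once, and a read gathers from every followed author and merges.
outboxes = defaultdict(list)


def post_fanout_read(author, ts, text):
    outboxes[author].append((ts, author, text))


def read_fanout_read(reader, limit=10):
    following = [a for a, fs in followers.items() if reader in fs]
    gathered = [m for a in following for m in outboxes[a]]
    return sorted(gathered, reverse=True)[:limit]


for ts, author, text in [(1, "alice", "hi"), (2, "bob", "yo"), (3, "alice", "again")]:
    post_fanout_write(author, ts, text)
    post_fanout_read(author, ts, text)

carol_a = read_fanout_write("carol")
carol_b = read_fanout_read("carol")
```

Both strategies return the same inbox. They differ in where the work happens: fan-out on write pays per follower at post time, while scatter-gather pays per followed author at read time.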
Mythbusting: Understanding How We Measure the Performance of MongoDB MongoDB
Benchmarking, benchmarking, benchmarking. We all do it; mostly it tells us what we want to hear, but it often hides a mountain of misinformation. In this talk we will walk through the pitfalls you might fall into by looking at some examples where things go wrong. We will then walk through how MongoDB performance is measured, the processes and methodology, and ways to present and look at the information.
MongoDB + Java - Everything you need to know Norberto Leite
Learn everything you need to know to get started building a MongoDB-based app in Java. We'll explore the relationship between MongoDB and various languages on the Java Virtual Machine such as Java, Scala, and Clojure. From there, we'll examine the popular frameworks and integration points between MongoDB and the JVM including Spring Data and object-document mappers like Morphia.
In real life, almost any project deals with tree structures. Different kinds of taxonomies, site structures, etc. require modeling of hierarchical relations.
Typical approaches used:
● Model Tree Structures with Child References
● Model Tree Structures with Parent References
● Model Tree Structures with an Array of Ancestors
● Model Tree Structures with Materialized Paths
● Model Tree Structures with Nested Sets
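Three of the approaches listed above can be compared on one small tree. The following Python sketch uses plain dicts as stand-ins for documents; the category names, field names, and helper functions are illustrative assumptions rather than content from the slides.

```python
import re

# Parent references: each node stores only its parent.
parent_refs = [
    {"_id": "Books", "parent": None},
    {"_id": "Programming", "parent": "Books"},
    {"_id": "Databases", "parent": "Programming"},
    {"_id": "MongoDB", "parent": "Databases"},
]


def ancestors_via_parent(node_id):
    """Walk up the tree one parent lookup at a time, root first."""
    by_id = {d["_id"]: d for d in parent_refs}
    chain = []
    parent = by_id[node_id]["parent"]
    while parent is not None:
        chain.append(parent)
        parent = by_id[parent]["parent"]
    return chain[::-1]


# Array of ancestors: each node stores its full ancestor list.
ancestor_arrays = [
    {"_id": d["_id"], "ancestors": ancestors_via_parent(d["_id"])}
    for d in parent_refs
]


def descendants_via_ancestors(node_id):
    """One membership test per document finds the whole subtree."""
    return [d["_id"] for d in ancestor_arrays if node_id in d["ancestors"]]


# Materialized paths: each node stores its ancestors as one string.
materialized = [
    {
        "_id": d["_id"],
        "path": "," + ",".join(d["ancestors"]) + "," if d["ancestors"] else None,
    }
    for d in ancestor_arrays
]


def descendants_via_path(node_id):
    """A pattern match on the path string finds the whole subtree."""
    pattern = re.compile("," + re.escape(node_id) + ",")
    return [
        d["_id"] for d in materialized
        if d["path"] and pattern.search(d["path"])
    ]
```

Parent references keep writes cheap but need one lookup per level to climb the tree. The other two models store redundant ancestry, so a subtree query is a single filter.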
Database Trends for Modern Applications: Why the Database You Choose Matters MongoDB
Matt Kalan, Senior Solutions Architect, MongoDB
Matt will explain how modern technology requirements have changed the requirements of the database. In order to handle agile development, big data, cloud, APIs, continuous availability, and unlimited scale while lowering costs, new capabilities are required. Do you need to tolerate the impedance mismatch between an object model and the relational model, or is there another way? We will walk through the application development process, to the code level, to compare using an RDBMS with MongoDB.
Benefits of Using MongoDB Over RDBMS (At An Evening with MongoDB Minneapolis ... MongoDB
Rapid Development and Performance By Transitioning from RDBMSs to MongoDB
Modern day application requirements demand rich & dynamic data structures, fast response times, easy scaling, and low TCO to match the rapidly changing customer & business requirements plus the powerful programming languages used in today's software landscape.
Traditional approaches to solutions development with RDBMSs increasingly expose the gap between the modern development languages and the relational data model, and between scaling up vs. scaling horizontally on commodity hardware. Development time is wasted as the bulk of the work has shifted from adding business features to struggling with the RDBMSs.
MongoDB, the premier NoSQL database, offers a flexible and scalable solution to focus on quickly adding business value again.
In this session, we will provide:
- Overview of MongoDB's capabilities
- Code-level exploration of the MongoDB programming model and APIs and how they transform the way developers interact with a database
- Update of the exciting features in MongoDB 3.0
Engineers often ask "how do I know if I should build my application on MongoDB?" IT executives ask a similar question, "which applications in my application portfolio should I migrate to MongoDB?" This presentation will present a framework for answering these questions.
We will cover two sets of criteria: (1) how to determine when to migrate a legacy application to MongoDB and (2) when to use MongoDB for new applications. The presentation will also include a brief introduction to MongoDB to provide enough technical background for analyzing when to use MongoDB.
Learning Objectives:
The basics of the MongoDB document model, query capabilities, and architecture required for analyzing when to use MongoDB
Criteria for determining when to use MongoDB to re-platform legacy applications
Criteria for determining when to use MongoDB for new applications
Intro to MongoDB
Get a jumpstart on MongoDB, use cases, and next steps for building your first app with Buzz Moschetti, MongoDB Enterprise Architect.
@BuzzMoschetti
MongoDB .local Chicago 2019: Practical Data Modeling for MongoDB: Tutorial MongoDB
For 30 years, developers have been taught that relational data modeling was THE way to model, but as more companies adopt MongoDB as their data platform, the approaches that work well in relational design actually work against you in a document model design. In this talk, we will discuss how to conceptually approach modeling data with MongoDB, focusing on practical foundational techniques paired with tips and tricks, and wrapping up with a discussion of design patterns that solve common real-world problems.
Data Modernization as the Foundation for Digital Transformation MongoDB
The digital economy forces companies to innovate in order to stay competitive. Nobody wants to be the next Blockbuster, driven to failure by competitors that are more agile and more efficient at exploiting technology.
NoSQL - MongoDB: agility, scalability, performance. I am going to talk about the basics of NoSQL and MongoDB. Why do some projects require an RDBMS and others a NoSQL database? What are the pros and cons of using NoSQL vs. SQL? How is data stored and transferred in MongoDB? What query language is used? How does MongoDB support high availability and automatic failover with the help of replication? What is sharding, and how does it help support scalability? We will also look at the newest levels of concurrency: collection-level and document-level.
Application Development & Database Choices: Postgres Support for non Relation... EDB
This talk will cover the advanced features of PostgreSQL that make it the most-loved RDBMS by developers and a great choice for non-relational workloads.
This webinar will explore:
- Global adoption of Postgres
- Document-centric applications
- Geographic Information Systems (GIS)
- Business intelligence
- Central data centers
- Server-side languages
NoSQL for the rest of us - a JBoss perspective over those hot tools and how y... Alexandre Porcelli
NoSQL technologies have become very popular in the market, but doubts always arise when we have to decide which technology to use. Each scenario, application, or set of non-functional requirements calls for a specific NoSQL/Big Data technology. In this presentation you will see how to differentiate the NoSQL technologies and how to apply them together with JBoss technologies, enabling truly modern architectures. During this talk we'll cover some of the most important tools, such as Infinispan, MongoDB, and Neo4j, from both the data model and architecture perspectives, and present a rationale for when each of them should be used in real-world scenarios.
Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc... NoSQLmatters
With NoSQL, NewSQL and plain old SQL, there are so many tools around it’s not always clear which is the right one for the job. This is a look at a series of NoSQL technologies, comparing them against traditional SQL technology. I’ll compare real use cases and show how they are solved with both NoSQL options, and traditional SQL servers, and then see who wins. We’ll look at some code and architecture examples that fit a variety of NoSQL techniques, and some where SQL is a better answer. We’ll see some big data problems, little data problems, and a bunch of new and old database technologies to find whatever it takes to solve the problem. By the end you’ll hopefully know more NoSQL, and maybe even have a few new tricks with SQL, and what’s more how to choose the right tool for the job.
L1 Intro to Relational DBMS LP.pdf Intro to Relational .docx DIPESH30
L1 Intro to Relational DBMS LP.pdf
Intro to Relational
Databases
CS 2215 Introduction to Databases
What Is a DBMS?
Database: A collection of information.
Eg ?
Examples: Library, University
Database Management System (DBMS) :
software package designed to store
and manage databases.
Eg: Oracle, SQL server, MySQL, Access
Files vs DBMS : Why bother with
databases ?
Why not just store all the data in a big file
and write C or Java programs to manipulate
the data?
Why Use a DBMS?
Naïve users sheltered from messy
details
Data integrity:
Eg: if Bob works in Marketing, make
sure there is a dept. called Marketing.
Reduced application development
time: Avoid writing special programs
from scratch each time to access
data.
Standard Application Interface:
increased reliability
Why Use a DBMS?
Data independence: easier
to make changes
If how data is stored changes,
don’t have to change views.
Forms, etc.
Security: easier to control
how data is shared
Concurrent access: allow
multiple users to access
simultaneously
But in a controlled way !
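The data-integrity example from the slides (Bob works in Marketing, so a department called Marketing has to exist) can be expressed as a foreign-key constraint. Below is a minimal Python sketch using the standard-library `sqlite3` module. The table and column names are assumptions, and SQLite is used only because it ships with Python, not because the course uses it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE dept (name TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE employee ("
    " name TEXT PRIMARY KEY,"
    " dept TEXT NOT NULL REFERENCES dept(name))"
)

conn.execute("INSERT INTO dept VALUES ('Marketing')")
conn.execute("INSERT INTO employee VALUES ('Bob', 'Marketing')")

# There is no 'Sales' department, so the DBMS refuses this row.
try:
    conn.execute("INSERT INTO employee VALUES ('Eve', 'Sales')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

The check lives in the database rather than in each application program, which is the advantage the slide points to over hand-written file handling.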
Different people involved
DBMS implementers: who build the DBMS like
Oracle, MS SQL server
End users: Use forms & reports, might write SQL
queries
DB application programmers: write programs
to make life easier for end users.
Eg: person who creates forms for library.
Must know how databases work
DB administrator (DBA):
Handles security and authorization
Crash recovery
Database tuning as needs evolve
Overview of course: Relational Model:
Student Database, Fig 1.2
STUDENT
Name StudentNumber Class Major
Smith 17 1 CS
Brown 8 2 CS
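The STUDENT relation above can be built and queried with Python's standard-library `sqlite3` module. This is a small sketch under stated assumptions: the column types and the choice of StudentNumber as primary key are not given in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT ("
    " Name TEXT,"
    " StudentNumber INTEGER PRIMARY KEY,"  # assumed key
    " Class INTEGER,"
    " Major TEXT)"
)
conn.executemany(
    "INSERT INTO STUDENT VALUES (?, ?, ?, ?)",
    [("Smith", 17, 1, "CS"), ("Brown", 8, 2, "CS")],
)

cur = conn.execute("SELECT * FROM STUDENT")
columns = [c[0] for c in cur.description]  # the attributes
rows = cur.fetchall()                      # the tuples

# A simple declarative query: which students are in class 2?
class_2 = conn.execute("SELECT Name FROM STUDENT WHERE Class = 2").fetchall()
```

The relation has four columns (attributes) and two rows (tuples), and the query states what is wanted without saying how to find it.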
Overview of course:
Data Models:
High level : Entity-Relationship (E-R) model
Intermediate level : relational model
Student database
Low level: physical database -
Covered in CSCI 4524 Advanced
Databases
Relational databases:
Integrity constraints
Good design : normalization
Query languages: Relational
algebra, SQL
Views, Assertions, Triggers
Relational Data Model
Relation: 2-dimensional table
All info stored in tables
Eg: student, course
See Elmasri Fig 1.2
Rows (or tuples): student : 2 rows
Records: a row may correspond with a
record in a file
Commonly used if we are talking about the
physical storage of databases
Columns (or attributes): student : 4
columns
Relational Data Model
Relational model proposed by E. F.
Codd 1970
Dominant model in commercial DBMS
products.
Eg: Oracle, SQL server, MySQL, Access.
Compared to previous models
(network, hierarchical etc):
Easier to understand info in tables
Casual user can write simple SQL queries
Complex queries much easier to
understand compared to previous models.
Basic Terminology
Relational Schema (or head): set of all the
column names i.e. what info is bei ...
Large Scale Fuzzy Name Matching with a Custom ML Pipeline in Batch and Stream... Databricks
ING bank is a Dutch multinational, multi-product bank that offers banking services to 33 million retail and commercial customers in over 40 countries. At this scale, ING naturally faces a multitude of data consolidation tasks across its disparate sources. A common consolidation problem is fuzzy name matching: given a name (streaming) or a list of names (batch), find out the most similar name(s) from a different list.
Popular methods such as Levenshtein distance are not appropriate because of the time complexity and sheer volume of names involved. In this talk, we will introduce how we use a Spark custom ML pipeline and Structured Streaming to build fuzzy name matching products in batch and streaming. This can successfully match 8000 names per second against a 10 million name list, using a ten-node cluster. Firstly, we will give an introduction to the name matching problem.
Secondly, we will explain why the Levenshtein distance approach is limited, and demonstrate a faster approach: token-based cosine similarity matching. Next, we will show how a ML pipeline helps to build an elegant solution. Here, we will deep dive into the detail of each stage, including customized preprocessing, tokenization, term-frequency, customized inverse document frequency, customized cosine similarity with distributed sparse matrix multiplication, and a customized supervision stage.
Finally, we will show how we deploy the ML pipeline within a batch data pipeline, and additionally as a fuzzy search engine in a streaming manner. The main conclusions will be: (1) a Spark custom ML pipeline provides a powerful way to handle complicated data science problems (2) a uniform ML pipeline can serve both batch and streaming products easily from the same codebase.
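The core idea of token-based cosine similarity matching can be sketched in plain Python on a single machine. It is not the Spark pipeline from the talk: the character 3-gram tokenizer, the smoothed IDF formula, the sample names, and all function names are assumptions chosen for illustration, and the distributed sparse matrix multiplication is replaced by a simple loop.

```python
import math
from collections import Counter


def tokens(name, n=3):
    """Lower-case, normalise spaces, pad, and split into character n-grams."""
    s = " " + " ".join(name.lower().split()) + " "
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def build_idf(names):
    """Smoothed inverse document frequency over the reference list."""
    df = Counter(t for name in names for t in set(tokens(name)))
    total = len(names)
    return {t: math.log((1 + total) / (1 + c)) + 1.0 for t, c in df.items()}


def vectorize(name, idf):
    """L2-normalised TF-IDF vector; tokens unseen in the list get weight 0."""
    tf = Counter(tokens(name))
    vec = {t: f * idf.get(t, 0.0) for t, f in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(a, b):
    """Dot product of two unit vectors stored as sparse dicts."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def best_match(query, candidates, idf):
    q = vectorize(query, idf)
    return max(candidates, key=lambda c: cosine(q, vectorize(c, idf)))


reference = ["ING Bank N.V.", "Orange Savings Bank", "Northwind Traders"]
idf = build_idf(reference)
match = best_match("ING bank nv", reference, idf)
```

Unlike pairwise edit distance, the score is a dot product of sparse vectors, which is what lets the full-scale version be phrased as a sparse matrix multiplication.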
Data Modeling, Normalization, and De-Normalization | PostgresOpen 2019 | Dimi... Citus Data
As a developer using PostgreSQL one of the most important tasks you have to deal with is modeling the database schema for your application. In order to achieve a solid design, it’s important to understand how the schema is then going to be used as well as the trade-offs it involves.
As Fred Brooks said: “Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won’t usually need your flowcharts; they’ll be obvious.”
In this talk we're going to see practical normalisation examples and their benefits, and also review some anti-patterns and their typical PostgreSQL solutions, including Denormalization techniques thanks to advanced Data Types.
Introduction to MongoDB
MongoDB Database
Document Model
BSON
Data Model
CRUD operations
High Availability and Scalability
Replication
Sharding
Hands-On MongoDB
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas MongoDB
During this talk we'll navigate through a customer's journey as they migrate an existing MongoDB deployment to MongoDB Atlas. While the migration itself can be as simple as a few clicks, the prep/post effort requires due diligence to ensure a smooth transfer. We'll cover these steps in detail and provide best practices. In addition, we’ll provide an overview of what to consider when migrating other cloud data stores, traditional databases and MongoDB imitations to MongoDB Atlas.
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts! MongoDB
These days, everyone is expected to be a data analyst. But with so much data available, how can you make sense of it and be sure you're making the best decisions? One great approach is to use data visualizations. In this session, we take a complex dataset and show how the breadth of capabilities in MongoDB Charts can help you turn bits and bytes into insights.
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel... MongoDB
MongoDB Kubernetes operator and MongoDB Open Service Broker are ready for production operations. Learn about how MongoDB can be used with the most popular container orchestration platform, Kubernetes, and bring self-service, persistent storage to your containerized applications. A demo will show you how easy it is to enable MongoDB clusters as an External Service using the Open Service Broker API for MongoDB.
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB MongoDB
Are you new to schema design for MongoDB, or are you looking for a more complete or agile process than what you are following currently? In this talk, we will guide you through the phases of a flexible methodology that you can apply to projects ranging from small to large with very demanding requirements.
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T... MongoDB
Humana, like many companies, is tackling the challenge of creating real-time insights from data that is diverse and rapidly changing. This is our journey of how we used MongoDB to combine traditional batch approaches with streaming technologies to provide continuous alerting capabilities from real-time data streams.
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data MongoDB
Time series data is increasingly at the heart of modern applications - think IoT, stock trading, clickstreams, social media, and more. With the move from batch to real time systems, the efficient capture and analysis of time series data can enable organizations to better detect and respond to events ahead of their competitors or to improve operational efficiency to reduce cost and risk. Working with time series data is often different from regular application data, and there are best practices you should observe.
This talk covers:
Common components of an IoT solution
The challenges involved with managing time-series data in IoT applications
Different schema designs, and how these affect memory and disk utilization – two critical factors in application performance.
How to query, analyze and present IoT time-series data using MongoDB Compass and MongoDB Charts
At the end of the session, you will have a better understanding of key best practices in managing IoT time-series data with MongoDB.
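To illustrate how schema design changes the number of documents stored, the Python sketch below compares one document per reading with one bucket document per sensor per hour. Bucketing is one commonly discussed design for time-series data; whether the talk uses this exact layout is an assumption, and the sensor data, field names, and function names are invented.

```python
def one_doc_per_reading(readings):
    """Simplest design: every measurement is its own document."""
    return [{"sensor": s, "ts": ts, "value": v} for s, ts, v in readings]


def bucket_per_hour(readings):
    """Bucketed design: one document per sensor per hour.

    Each bucket carries its samples plus running aggregates, so a
    summary query can read the aggregates without opening the samples.
    """
    buckets = {}
    for s, ts, v in readings:
        hour = ts - ts % 3600  # start of the hour, in epoch seconds
        b = buckets.setdefault(
            (s, hour),
            {"sensor": s, "hour": hour, "count": 0, "sum": 0.0, "samples": []},
        )
        b["samples"].append({"ts": ts, "value": v})
        b["count"] += 1
        b["sum"] += v
    return list(buckets.values())


# Two hours of one reading per minute from a hypothetical sensor "s1".
readings = [("s1", 3600 * h + 60 * m, 20.0 + m) for h in range(2) for m in range(60)]

flat = one_doc_per_reading(readings)
bucketed = bucket_per_hour(readings)
```

Here 120 readings become 120 documents in the flat design and 2 in the bucketed one. Fewer documents means fewer index entries, which is where the memory and disk trade-offs mentioned above come from.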
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys] MongoDB
Our clients have unique use cases and data patterns that mandate the choice of a particular strategy. To implement these strategies, it is mandatory that we unlearn a lot of relational concepts while designing and rapidly developing efficient applications on NoSQL. In this session, we will talk about some of our client use cases, the strategies we have adopted, and the features of MongoDB that assisted in implementing these strategies.
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2 MongoDB
Encryption is not a new concept to MongoDB. Encryption may occur in-transit (with TLS) and at-rest (with the encrypted storage engine). But MongoDB 4.2 introduces support for Client Side Encryption, ensuring the most sensitive data is encrypted before ever leaving the client application. Even full access to your MongoDB servers is not enough to decrypt this data. And better yet, Client Side Encryption can be enabled at the "flick of a switch".
This session covers using Client Side Encryption in your applications. This includes the necessary setup, how to encrypt data without sacrificing queryability, and what trade-offs to expect.
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ... MongoDB
MongoDB Kubernetes operator is ready for prime-time. Learn about how MongoDB can be used with the most popular orchestration platform, Kubernetes, and bring self-service, persistent storage to your containerized applications.
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts! MongoDB
These days, everyone is expected to be a data analyst. But with so much data available, how can you make sense of it and be sure you're making the best decisions? One great approach is to use data visualizations. In this session, we take a complex dataset and show how the breadth of capabilities in MongoDB Charts can help you turn bits and bytes into insights.
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset MongoDB
When you need to model data, is your first instinct to start breaking it down into rows and columns? Mine used to be too. When you want to develop apps in a modern, agile way, NoSQL databases can be the best option. Come to this talk to learn how to take advantage of all that NoSQL databases have to offer and discover the benefits of changing your mindset from the legacy, tabular way of modeling data. We’ll compare and contrast the terms and concepts in SQL databases and MongoDB, explain the benefits of using MongoDB compared to SQL databases, and walk through data modeling basics so you feel confident as you begin using MongoDB.
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
Join this talk and test session with a MongoDB Developer Advocate where you'll go over the setup, configuration, and deployment of an Atlas environment. Create a service that you can take back in a production-ready state and prepare to unleash your inner genius.
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
Query performance should be the unsung hero of an application, but without proper configuration, can become a constant headache. When used properly, MongoDB provides extremely powerful querying capabilities. In this session, we'll discuss concepts like equality, sort, range, managing query predicates versus sequential predicates, and best practices to building multikey indexes.
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
Aggregation pipeline has been able to power your analysis of data since version 2.2. In 4.2 we added more power and now you can use it for more powerful queries, updates, and outputting your data to existing collections. Come hear how you can do everything with the pipeline, including single-view, ETL, data roll-ups and materialized views.
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
Are you new to schema design for MongoDB, or are you looking for a more complete or agile process than what you are following currently? In this talk, we will guide you through the phases of a flexible methodology that you can apply to projects ranging from small to large with very demanding requirements.
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
MongoDB Atlas Data Lake is a new service offered by MongoDB Atlas. Many organizations store long term, archival data in cost-effective storage like S3, GCP, and Azure Blobs. However, many of them do not have robust systems or tools to effectively utilize large amounts of data to inform decision making. MongoDB Atlas Data Lake is a service allowing organizations to analyze their long-term data to discover a wealth of information about their business.
This session will take a deep dive into the features that are currently available in MongoDB Atlas Data Lake and how they are implemented. In addition, we'll discuss future plans and opportunities and offer ample Q&A time with the engineers on the project.
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
Virtual assistants are becoming the new norm when it comes to daily life, with Amazon’s Alexa being the leader in the space. As a developer, not only do you need to make web and mobile compliant applications, but you need to be able to support virtual assistants like Alexa. However, the process isn’t quite the same between the platforms.
How do you handle requests? Where do you store your data and work with it to create meaningful responses with little delay? How much of your code needs to change between platforms?
In this session we’ll see how to design and develop applications known as Skills for Amazon Alexa powered devices using the Go programming language and MongoDB.
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
aux Core Data, appréciée par des centaines de milliers de développeurs. Apprenez ce qui rend Realm spécial et comment il peut être utilisé pour créer de meilleures applications plus rapidement.
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
Il n’a jamais été aussi facile de commander en ligne et de se faire livrer en moins de 48h très souvent gratuitement. Cette simplicité d’usage cache un marché complexe de plus de 8000 milliards de $.
La data est bien connu du monde de la Supply Chain (itinéraires, informations sur les marchandises, douanes,…), mais la valeur de ces données opérationnelles reste peu exploitée. En alliant expertise métier et Data Science, Upply redéfinit les fondamentaux de la Supply Chain en proposant à chacun des acteurs de surmonter la volatilité et l’inefficacité du marché.
1. Reducing Development Time with MongoDB vs. SQL
Buzz Moschetti
Enterprise Architect, MongoDB
buzz.moschetti@mongodb.com
@buzzmoschetti
2. Who is your Presenter?
• Yes, I use “Buzz” on my business cards
• Former Investment Bank Chief Architect at
JPMorganChase and Bear Stearns before that
• Over 27 years of designing and building systems
• Big and small
• Super-specialized to broadly useful in any vertical
• “Traditional” to completely disruptive
• Advocate of language leverage and strong factoring
• Still programming – using emacs, of course
3. What Are Your Developers Doing All Day?
Adding and testing business features
OR
“Integrating/modifying other components and tools”
• Database(s)
• ETL and other data transfer operations
• Messaging
• Services (web & other)
• Other open source frameworks, including ORMs
4. Why Can’t We Just Save and Fetch Data?
Because the way we think about data at the business use case level…
…is different from the way it is implemented at the application/code level…
…which traditionally is VERY different from the way it is implemented at the database level
5. This Problem Isn’t New…
…but for the past 40 years, innovation at the business & application layers has outpaced innovation at the database layer

Business Data Goals
  1974: Capture my company’s transactions daily at 5:30PM EST, add them up on a nightly basis, and print a big stack of paper
  2014: Capture my company’s global transactions in realtime plus everything that is happening in the world (customers, competitors, business/regulatory/weather), producing any number of computed results, and passing this all in realtime to predictive analytics with model feedback; results in realtime to 10000s of mobile devices, multiple GUIs, and b2b and b2c channels
Release Schedule
  1974: Semi-Annually
  2014: Yesterday
Application/Code
  1974: COBOL, Fortran, Algol, PL/1, assembler, proprietary tools
  2014: C, C++, VB, C#, Java, javascript, groovy, ruby, perl, python, Obj-C, SmallTalk, Clojure, ActionScript, Flex, DSLs, spring, AOP, CORBA, ORM, third party software ecosystem, the whole open source movement, … and COBOL and Fortran
Database
  1974: I/VSAM, early RDBMS
  2014: Mature RDBMS, legacy I/VSAM, column & key/value stores, and…mongoDB
6. Exactly How Does MongoDB Change Things?
• MongoDB is designed from the ground up to address rich structure (maps of maps of lists of…), not rectangles
• Standard RDBMS interfaces (e.g. JDBC) do not exploit features of contemporary languages
• Rapid Application Development (RAD) and scripting in JavaScript, Python, Perl, Ruby, and Scala are impedance-matched to MongoDB
• In MongoDB, the data is the schema
• Shapes of data go in the same way they come out
7. Rectangles are 1974. Maps and Lists are 2014
{ customer_id : 1,
  first_name : "Mark",
  last_name : "Smith",
  city : "San Francisco",
  phones: [
    { type : "work",
      number: "1-800-555-1212"
    },
    { type : "home",
      number: "1-800-555-1313",
      DNC: true
    },
    { type : "home",
      number: "1-800-555-1414",
      DNC: true
    }
  ]
}
8. An Actual Code Example (Finally!)
Let’s compare and contrast RDBMS/SQL to MongoDB development using Java over the course of a few weeks.
Some ground rules:
1. Observe rules of Software Engineering 101: Assume separation of application, Data Access Layer, and persistor implementation
2. Data Access Layer must be able to
   a. Expose simple, functional, data-only interfaces to the application
      • No ORM, frameworks, compile-time bindings, special tools
   b. Exploit high performance features of persistor
3. Focus on core data handling code and avoid distractions that require the same amount of work in both technologies
   a. No exception or error handling
   b. Leave out DB connection and other setup resources
4. Day counts are a proxy for progress, not actual time to complete indicated task
9. The Task: Saving and Fetching Contact data
Start with this simple, flat shape in the Data Access Layer:
Map m = new HashMap();
m.put("name", "buzz");
m.put("id", "K1");
And assume we save it in this way:
save(Map m)
And assume we fetch one by primary key in this way:
Map m = fetch(String id)
Brace yourself…..
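As a rough illustration of the "simple, functional, data-only interface" the ground rules ask for, the sketch below pins the two operations down as a Java interface. The names ContactStore and InMemoryContactStore are hypothetical and do not appear in the slides, and the in-memory implementation is only a stand-in for a real persistor so the shape of the contract can be exercised without a database.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical Data Access Layer contract: plain maps in, plain maps out.
interface ContactStore {
    void save(Map<String, Object> m);
    Map<String, Object> fetch(String id);
}

// Stand-in persistor (an assumption for illustration, not a real database).
final class InMemoryContactStore implements ContactStore {
    private final Map<String, Map<String, Object>> byId = new HashMap<>();

    @Override
    public void save(Map<String, Object> m) {
        // Copy the map so later changes by the caller do not leak into the store.
        byId.put((String) m.get("id"), new HashMap<>(m));
    }

    @Override
    public Map<String, Object> fetch(String id) {
        Map<String, Object> found = byId.get(id);
        // Mirror the slides: return null when nothing matches the key.
        return found == null ? null : new HashMap<>(found);
    }
}
```

The application only ever sees save(Map) and fetch(String), so either persistence technology could sit behind the interface.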
10. Day 1: Initial efforts for both technologies
SQL
DDL: create table contact ( … )
init()
{
    contactInsertStmt = connection.prepareStatement
        ("insert into contact ( id, name ) values ( ?,? )");
    fetchStmt = connection.prepareStatement
        ("select id, name from contact where id = ?");
}
save(Map m)
{
    contactInsertStmt.setString(1, m.get("id"));
    contactInsertStmt.setString(2, m.get("name"));
    contactInsertStmt.execute();
}
Map fetch(String id)
{
    Map m = null;
    fetchStmt.setString(1, id);
    ResultSet rs = fetchStmt.executeQuery();
    if(rs.next()) {
        m = new HashMap();
        m.put("id", rs.getString(1));
        m.put("name", rs.getString(2));
    }
    return m;
}
MongoDB
DDL: none
save(Map m)
{
    collection.insert(m);
}
Map fetch(String id)
{
    Map m = null;
    DBObject dbo = new BasicDBObject();
    dbo.put("id", id);
    DBCursor c = collection.find(dbo);
    if(c.hasNext()) {
        m = (Map) c.next();
    }
    return m;
}
11. Day 2: Add simple fields
m.put("name", "buzz");
m.put("id", "K1");
m.put("title", "Mr.");
m.put("hireDate", new Date(2011, 11, 1));
• Capturing title and hireDate is part of adding a new business feature
• It was pretty easy to add two fields to the structure
• …but now we have to change our persistence code
Brace yourself (again) …..
12. SQL Day 2 (changes in bold)
DDL: alter table contact add title varchar(8);
alter table contact add hireDate date;
init()
{
    contactInsertStmt = connection.prepareStatement
        ("insert into contact ( id, name, title, hiredate ) values ( ?,?,?,? )");
    fetchStmt = connection.prepareStatement
        ("select id, name, title, hiredate from contact where id = ?");
}
save(Map m)
{
    contactInsertStmt.setString(1, m.get("id"));
    contactInsertStmt.setString(2, m.get("name"));
    contactInsertStmt.setString(3, m.get("title"));
    contactInsertStmt.setDate(4, m.get("hireDate"));
    contactInsertStmt.execute();
}
Map fetch(String id)
{
    Map m = null;
    fetchStmt.setString(1, id);
    ResultSet rs = fetchStmt.executeQuery();
    if(rs.next()) {
        m = new HashMap();
        m.put("id", rs.getString(1));
        m.put("name", rs.getString(2));
        m.put("title", rs.getString(3));
        m.put("hireDate", rs.getDate(4));
    }
    return m;
}
Consequences:
1. Code release schedule linked to database upgrade (new code cannot run on old schema)
2. Issues with case sensitivity starting to creep in (many RDBMS are case insensitive for column names, but code is case sensitive)
3. Changes require careful mods in 4 places
4. Beginning of technical debt
13. MongoDB Day 2
save(Map m)
{
    collection.insert(m);
}
Map fetch(String id)
{
    Map m = null;
    DBObject dbo = new BasicDBObject();
    dbo.put("id", id);
    DBCursor c = collection.find(dbo);
    if(c.hasNext()) {
        m = (Map) c.next();
    }
    return m;
}
Advantages:
1. Zero time and money spent on overhead code
2. Code and database not physically linked
3. New material with more fields can be added into existing collections; backfill is optional
4. Names of fields in database precisely match key names in code layer and directly match on name, not indirectly via positional offset
5. No technical debt is created
✔ NO CHANGE
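To make "backfill is optional" concrete, here is a minimal sketch of reading code that tolerates both Day 1 and Day 2 shapes in the same collection. The class MixedShapes, the method titleOf, and the fallback parameter are hypothetical names for illustration; plain java.util maps stand in for fetched documents.

```java
import java.util.HashMap;
import java.util.Map;

final class MixedShapes {
    // Documents saved before "title" existed simply lack the key, so the
    // reader supplies a fallback instead of requiring a backfill.
    static String titleOf(Map<String, Object> contact, String fallback) {
        Object title = contact.get("title");
        return title == null ? fallback : title.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> day1 = new HashMap<>();
        day1.put("id", "K1");
        day1.put("name", "buzz");

        Map<String, Object> day2 = new HashMap<>(day1);
        day2.put("title", "Mr.");

        System.out.println(titleOf(day1, "(none)"));
        System.out.println(titleOf(day2, "(none)"));
    }
}
```

The trade-off is that handling of absent fields moves into the reading code, which is where the slides argue it is cheapest to change.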
14. Day 3: Add list of phone numbers
m.put("name", "buzz");
m.put("id", "K1");
m.put("title", "Mr.");
m.put("hireDate", new Date(2011, 11, 1));
List list = new ArrayList();
Map n1 = new HashMap();
n1.put("type", "work");
n1.put("number", "1-800-555-1212");
list.add(n1);
Map n2 = new HashMap();
n2.put("type", "home");
n2.put("number", "1-866-444-3131");
list.add(n2);
m.put("phones", list);
• It was still pretty easy to add this data to the structure
• .. but meanwhile, in the persistence code …
REALLY brace yourself…
15. SQL Day 3 changes: Option 1: Assume just 1 work and 1 home phone number
DDL: alter table contact add work_phone varchar(16);
alter table contact add home_phone varchar(16);
init()
{
    contactInsertStmt = connection.prepareStatement
        ("insert into contact ( id, name, title, hiredate, work_phone, home_phone ) values ( ?,?,?,?,?,? )");
    fetchStmt = connection.prepareStatement
        ("select id, name, title, hiredate, work_phone, home_phone from contact where id = ?");
}
save(Map m)
{
    contactInsertStmt.setString(1, m.get("id"));
    contactInsertStmt.setString(2, m.get("name"));
    contactInsertStmt.setString(3, m.get("title"));
    contactInsertStmt.setDate(4, m.get("hireDate"));
    for(Map onePhone : m.get("phones")) {
        String t = onePhone.get("type");
        String n = onePhone.get("number");
        if(t.equals("work")) {
            contactInsertStmt.setString(5, n);
        } else if(t.equals("home")) {
            contactInsertStmt.setString(6, n);
        }
    }
    contactInsertStmt.execute();
}
Map fetch(String id)
{
    Map m = null;
    fetchStmt.setString(1, id);
    ResultSet rs = fetchStmt.executeQuery();
    if(rs.next()) {
        m = new HashMap();
        m.put("id", rs.getString(1));
        m.put("name", rs.getString(2));
        m.put("title", rs.getString(3));
        m.put("hireDate", rs.getDate(4));
        List list = new ArrayList();
        Map onePhone;
        onePhone = new HashMap();
        onePhone.put("type", "work");
        onePhone.put("number", rs.getString(5));
        list.add(onePhone);
        onePhone = new HashMap();
        onePhone.put("type", "home");
        onePhone.put("number", rs.getString(6));
        list.add(onePhone);
        m.put("phones", list);
    }
    return m;
}
This is just plain bad….
16. SQL Day 3 changes: Option 2: Proper approach with multiple phone numbers
DDL: create table phones ( … )
init()
{
    contactInsertStmt = connection.prepareStatement
        ("insert into contact ( id, name, title, hiredate ) values ( ?,?,?,? )");
    c2stmt = connection.prepareStatement
        ("insert into phones (id, type, number) values (?, ?, ?)");
    // contact.id is qualified because both tables have an id column
    fetchStmt = connection.prepareStatement
        ("select contact.id, name, title, hiredate, type, number from contact, phones where phones.id = contact.id and contact.id = ?");
}
save(Map m)
{
    startTrans();
    contactInsertStmt.setString(1, m.get("id"));
    contactInsertStmt.setString(2, m.get("name"));
    contactInsertStmt.setString(3, m.get("title"));
    contactInsertStmt.setDate(4, m.get("hireDate"));
    for(Map onePhone : m.get("phones")) {
        c2stmt.setString(1, m.get("id"));
        c2stmt.setString(2, onePhone.get("type"));
        c2stmt.setString(3, onePhone.get("number"));
        c2stmt.execute();
    }
    contactInsertStmt.execute();
    endTrans();
}
Map fetch(String id)
{
    Map m = null;
    fetchStmt.setString(1, id);
    ResultSet rs = fetchStmt.executeQuery();
    int i = 0;
    List list = new ArrayList();
    while (rs.next()) {
        if(i == 0) {
            m = new HashMap();
            m.put("id", rs.getString(1));
            m.put("name", rs.getString(2));
            m.put("title", rs.getString(3));
            m.put("hireDate", rs.getDate(4));
            m.put("phones", list);
        }
        Map onePhone = new HashMap();
        onePhone.put("type", rs.getString(5));
        onePhone.put("number", rs.getString(6));
        list.add(onePhone);
        i++;
    }
    return m;
}
This took time and money
17. SQL Day 5: Zombies! (zero or more between entities)
Whoops! And it’s also wrong!
We did not design the query accounting for contacts that have no phone number. Thus, we have to change the join to an outer join.
init()
{
    contactInsertStmt = connection.prepareStatement
        ("insert into contact ( id, name, title, hiredate ) values ( ?,?,?,? )");
    c2stmt = connection.prepareStatement
        ("insert into phones (id, type, number) values (?, ?, ?)");
    fetchStmt = connection.prepareStatement
        ("select A.id, A.name, A.title, A.hiredate, B.type, B.number from contact A left outer join phones B on (A.id = B.id) where A.id = ?");
}
But this ALSO means we have to change the unwind logic:
while (rs.next()) {
    if(i == 0) {
        // …
    }
    String s = rs.getString(5);
    if(s != null) {
        Map onePhone = new HashMap();
        onePhone.put("type", s);
        onePhone.put("number", rs.getString(6));
        list.add(onePhone);
    }
    i++;
}
This took more time and money!
…but at least we have a DAL… right?
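The changed unwind logic can be tried without a database. The sketch below is an illustration under stated assumptions: each String[] stands in for one result-set row (id, name, phone type, phone number), the phone columns are null when the outer join found no phone, and the class name OuterJoinUnwind is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class OuterJoinUnwind {
    // Rebuilds one contact (with its phone list) from left-outer-join rows.
    static Map<String, Object> unwind(List<String[]> rows) {
        Map<String, Object> m = null;
        List<Map<String, String>> phones = new ArrayList<>();
        for (String[] row : rows) {
            if (m == null) {
                // First row carries the parent columns.
                m = new HashMap<>();
                m.put("id", row[0]);
                m.put("name", row[1]);
                m.put("phones", phones);
            }
            // A null phone type marks a contact with no phones; skip it
            // instead of adding a "zombie" phone entry full of nulls.
            if (row[2] != null) {
                Map<String, String> onePhone = new HashMap<>();
                onePhone.put("type", row[2]);
                onePhone.put("number", row[3]);
                phones.add(onePhone);
            }
        }
        return m; // null when the contact id matched no rows at all
    }
}
```

A contact without phones comes back with an empty phones list, while an unknown id still yields null, which is the distinction the inner join could not express.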
18. MongoDB Day 3
Advantages:
1. Zero time and money spent on overhead code
2. No need to fear fields that are “naturally occurring” lists containing data specific to the parent structure and thus do not benefit from normalization and referential integrity
3. Safe from zombies and other undead distractions from productivity
save(Map m)
{
    collection.insert(m);
}
Map fetch(String id)
{
    Map m = null;
    DBObject dbo = new BasicDBObject();
    dbo.put("id", id);
    DBCursor c = collection.find(dbo);
    if(c.hasNext()) {
        m = (Map) c.next();
    }
    return m;
}
✔ NO CHANGE
19. By Day 14, our structure looks like this:
List list2 = new ArrayList();
Map n4 = new HashMap();
n4.put("geo", "US-EAST");
n4.put("startupApps", new String[] { "app1", "app2", "app3" } );
list2.add(n4);
Map n5 = new HashMap();
n5.put("geo", "EMEA");
n5.put("startupApps", new String[] { "app6" } );
n5.put("useLocalNumberFormats", false);
list2.add(n5);
m.put("preferences", list2);
List seclist = new ArrayList();
Map n6 = new HashMap();
n6.put("optOut", true);
n6.put("assertDate", someDate);
seclist.add(n6);
m.put("attestations", seclist);
m.put("security", mapOfDataCreatedByExternalSource);
• It was still pretty easy to add this data to the structure
• Want to guess what the SQL persistence code looks like?
• How about the MongoDB persistence code?
20. SQL Day 14
Error: Could not fit all the code into this space.
…actually, I didn’t want to spend 2 hours putting the code together..
But very likely, among other things:
• n4.put("startupApps", new String[]{"app1","app2","app3"});
  was implemented as a single semicolon-delimited string
• m.put("security", anotherMapOfData);
  was implemented by flattening it out and storing a subset of fields
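A small sketch of why the semicolon-delimited workaround is fragile. The class DelimitedColumn and its pack/unpack helpers are hypothetical; they only model what a single string column would hold, using standard String.join and String.split.

```java
import java.util.Arrays;
import java.util.List;

final class DelimitedColumn {
    // Flattens a list into the single string a varchar column would store.
    static String pack(List<String> apps) {
        return String.join(";", apps);
    }

    // Splits the stored string back into a list.
    static List<String> unpack(String column) {
        return Arrays.asList(column.split(";"));
    }

    public static void main(String[] args) {
        // Round trip works while no value contains the delimiter...
        System.out.println(unpack(pack(Arrays.asList("app1", "app2", "app3"))));
        // ...but a value containing ';' silently becomes two entries.
        System.out.println(unpack(pack(Arrays.asList("a;b", "c"))));
    }
}
```

Any fix (escaping, a different delimiter, a side table) is more overhead code, whereas a document store keeps the array as an array.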
21. MongoDB Day 14 – and every other day
Advantages:
1. Zero time and money spent on overhead code
2. Persistence is so easy and flexible and backward compatible that the persistor does not upward-influence the shapes we want to persist, i.e. the tail does not wag the dog
save(Map m)
{
    collection.insert(m);
}
Map fetch(String id)
{
    Map m = null;
    DBObject dbo = new BasicDBObject();
    dbo.put("id", id);
    DBCursor c = collection.find(dbo);
    if(c.hasNext()) {
        m = (Map) c.next();
    }
    return m;
}
✔ NO CHANGE
22. But what if we must do a join?
Both RDBMS and MongoDB will have a PhoneTransactions table/collection
{ customer_id : 1,
  first_name : "Mark",
  last_name : "Smith",
  city : "San Francisco",
  phones: [
    { type : "work",
      number: "1-800-555-1212"
    },
    { type : "home",
      number: "1-800-555-1313",
      DNC: true
    },
    { type : "home",
      number: "1-800-555-1414",
      DNC: true
    }
  ]
}
PhoneTransactions
{ number: "1-800-555-1212", target: "1-999-238-3423", duration: 20 }
{ number: "1-800-555-1212", target: "1-444-785-6611", duration: 243 }
{ number: "1-800-555-1414", target: "1-645-331-4345", duration: 132 }
{ number: "1-800-555-1414", target: "1-990-875-2134", duration: 71 }
23. SQL Join Attempt #1
select A.id, A.lname, B.type, B.number, C.target, C.duration
from contact A, phones B, phonestx C
where A.id = B.id and B.number = C.number
id | lname | type | number | target | duration
-----+--------------+------+----------------+----------------+----------
g9 | Moschetti | home | 1-900-555-1212 | 1-222-707-7070 | 23
g10 | Kalan | work | 1-999-444-9999 | 1-222-907-7071 | 7
g9 | Moschetti | work | 1-800-989-2231 | 1-987-707-7072 | 9
g9 | Moschetti | home | 1-900-555-1212 | 1-222-707-7071 | 7
g9 | Moschetti | home | 1-777-999-1212 | 1-222-807-7070 | 23
g9 | Moschetti | home | 1-777-999-1212 | 1-222-807-7071 | 7
g10 | Kalan | work | 1-999-444-9999 | 1-222-907-7070 | 23
g9 | Moschetti | home | 1-900-555-1212 | 1-222-707-7072 | 9
g10 | Kalan | work | 1-999-444-9999 | 1-222-907-7072 | 9
g9 | Moschetti | home | 1-777-999-1212 | 1-222-807-7072 | 9
How to turn this into a list of names – each with a list of numbers, each of those with a list of target numbers?
24. SQL Unwind Attempt #1
Map idmap = new HashMap();
ResultSet rs = fetchStmt.executeQuery();
while (rs.next()) {
    String id = rs.getString("id");
    String nmbr = rs.getString("number");
    List tnum;
    Map snum;
    // first level: contact id -> map of source numbers
    if((snum = (Map) idmap.get(id)) == null) {
        snum = new HashMap();
        idmap.put(id, snum);
    }
    // second level: source number -> list of target calls
    if((tnum = (List) snum.get(nmbr)) == null) {
        tnum = new ArrayList();
        snum.put(nmbr, tnum);
    }
    Map info = new HashMap();
    info.put("target", rs.getString("target"));
    info.put("duration", rs.getInt("duration"));
    tnum.add(info);
}
// idmap["g9"]["1-900-555-1212"] = ({target:1-222-707-7070,duration:23…)
25. SQL Join Attempt #2
select A.id, A.lname, B.type, B.number, C.target, C.duration
from contact A, phones B, phonestx C
where A.id = B.id and B.number = C.number order by A.id, B.number
id | lname | type | number | target | duration
-----+--------------+------+----------------+----------------+----------
g10 | Kalan | work | 1-999-444-9999 | 1-222-907-7072 | 9
g10 | Kalan | work | 1-999-444-9999 | 1-222-907-7070 | 23
g10 | Kalan | work | 1-999-444-9999 | 1-222-907-7071 | 7
g9 | Moschetti | home | 1-777-999-1212 | 1-222-807-7072 | 9
g9 | Moschetti | home | 1-777-999-1212 | 1-222-807-7070 | 23
g9 | Moschetti | home | 1-777-999-1212 | 1-222-807-7071 | 7
g9 | Moschetti | work | 1-800-989-2231 | 1-987-707-7072 | 9
g9 | Moschetti | home | 1-900-555-1212 | 1-222-707-7071 | 7
g9 | Moschetti | home | 1-900-555-1212 | 1-222-707-7072 | 9
g9 | Moschetti | home | 1-900-555-1212 | 1-222-707-7070 | 23
“Early bail out” from cursor is now possible – but logic to construct list of source and target numbers is similar
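A minimal sketch of what the early bail-out buys once rows arrive ordered by id and number. Everything here is an illustrative assumption: the Row class stands in for one joined result row, an Iterator stands in for the cursor, and SortedUnwind and targetsFor are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SortedUnwind {
    // Stand-in for one joined row: contact id, source number, target number.
    static final class Row {
        final String id;
        final String number;
        final String target;

        Row(String id, String number, String target) {
            this.id = id;
            this.number = number;
            this.target = target;
        }
    }

    // Collects source number -> target numbers for one contact. Because the
    // input is sorted by id, scanning stops as soon as the contact's block ends.
    static Map<String, List<String>> targetsFor(String wantedId, Iterator<Row> sortedRows) {
        Map<String, List<String>> byNumber = new LinkedHashMap<>();
        boolean seen = false;
        while (sortedRows.hasNext()) {
            Row r = sortedRows.next();
            if (r.id.equals(wantedId)) {
                seen = true;
                byNumber.computeIfAbsent(r.number, k -> new ArrayList<>()).add(r.target);
            } else if (seen) {
                break; // early bail out: no later row can belong to wantedId
            }
        }
        return byNumber;
    }
}
```

The nested-map construction is the same as in the unsorted attempt; only the ability to stop reading early is new.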
26. SQL is about Disassembly
Design a Big Query including business logic to grab all the data up front:
String s = "select A, B, C, D, E, F from T1,T2,T3 where T1.col = T2.col and T2.col2 = T3.col2 and X = Y and X2 != Y2 and G > 10 and G < 100 and TO_DATE(' …";
Throw it at the engine:
ResultSet rs = execute(s);
Disassemble Big Rectangle into usable objects with logic implicit in change in column values:
while(rs.next()) {
    if(new column1 value from T1) {
        set up new Object;
    }
    if(new column2 value from T2) {
        set up new Object2
    }
    if(new column3 value from T3) {
        set up new Object3
    }
    populate maps, lists and scalars
}
27. MongoDB is about Assembly
Assemble usable objects incrementally with explicit logic:
Cursor c = coll1.find({"X":"Y"});
while(c.hasNext()) {
    populate maps, lists and scalars;
    Cursor c2 = coll2.find(logic+key from c);
    while(c2.hasNext()) {
        populate maps, lists and scalars;
        Cursor c3 = coll3.find(logic+key from c2);
        while(c3.hasNext()) {
            populate maps, lists and scalars;
        }
    }
}
28. MongoDB “Join” using simple n+1
Map idmap = new HashMap();
DBCursor c = contacts.find();
while(c.hasNext()) {
    DBObject item = c.next();
    String id = (String) item.get("id");
    Map nummap = new HashMap();
    for(Map phone : (List<Map>) item.get("phones")) {
        String pnum = (String) phone.get("number");
        // one extra query per phone number: the "n" in n+1
        DBObject q = new BasicDBObject("number", pnum);
        DBCursor c2 = phonestx.find(q);
        List txs = new ArrayList();
        while(c2.hasNext()) {
            txs.add((Map) c2.next());
        }
        nummap.put(pnum, txs);
    }
    idmap.put(id, nummap);
}
// idmap["g9"]["1-900-555-1212"] = ({target:1-222-707-7070,duration:23…)
29. MongoDB “Join” in Efficient Two-Pass Mode
Map idmap = new HashMap();
Set uniquePhones = new HashSet();
DBCursor c = contacts.find();
// pass 1: index contacts by id and collect every distinct phone number
while(c.hasNext()) {
    DBObject item = c.next();
    idmap.put((String) item.get("id"), item);
    for(Map phone : (List<Map>) item.get("phones")) {
        uniquePhones.add((String) phone.get("number"));
    }
}
String[] phoneTxKeys = (String[]) uniquePhones.toArray(new String[0]);
// pass 2: one $in query fetches the transactions for all collected numbers
DBObject q = new BasicDBObject("number", new BasicDBObject("$in", phoneTxKeys));
DBCursor c2 = phonestx.find(q);
ArrayListMultimap<String,Map> numMap = ArrayListMultimap.create();
while(c2.hasNext()) {
    DBObject item = c2.next();
    numMap.put((String) item.get("number"), (Map) item);
}
// for each phone in idmap["g9"].phones:
//     List phoneTx = numMap[phone.number];
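The two-pass idea can be exercised without a server. In the sketch below, plain lists of maps stand in for the contacts and phonestx collections, and a single filtered scan stands in for the one $in query; the class TwoPassJoin and its method names are hypothetical and are not part of the MongoDB driver.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class TwoPassJoin {
    // Pass 1: walk the contacts once and collect every distinct phone number.
    @SuppressWarnings("unchecked")
    static Set<String> uniqueNumbers(List<Map<String, Object>> contacts) {
        Set<String> numbers = new HashSet<>();
        for (Map<String, Object> contact : contacts) {
            Object phones = contact.get("phones");
            if (phones == null) {
                continue; // contact without a phones list
            }
            for (Map<String, Object> phone : (List<Map<String, Object>>) phones) {
                numbers.add((String) phone.get("number"));
            }
        }
        return numbers;
    }

    // Pass 2: one sweep over the transactions (standing in for a single
    // $in query), grouping the matches by source number.
    static Map<String, List<Map<String, Object>>> groupByNumber(
            Set<String> numbers, List<Map<String, Object>> transactions) {
        Map<String, List<Map<String, Object>>> byNumber = new HashMap<>();
        for (Map<String, Object> tx : transactions) {
            String number = (String) tx.get("number");
            if (numbers.contains(number)) {
                byNumber.computeIfAbsent(number, k -> new ArrayList<>()).add(tx);
            }
        }
        return byNumber;
    }
}
```

Compared with n+1, the number of round trips no longer grows with the number of phone numbers; the cost is holding the collected keys in memory between the passes.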
30. But what about “real” queries?
• MongoDB query language is a physical map-of-map based structure, not a String
• Operators (e.g. AND, OR, GT, EQ, etc.) and arguments are keys and values in a cascade of Maps
• No grammar to parse, no templates to fill in, no whitespace, no escaping quotes, no parentheses, no punctuation
• The same paradigm used to manipulate data is used to manipulate query expressions
• …which is also, by the way, the same paradigm for working with MongoDB metadata and explain()
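Because a query expression is just nested maps and lists, ordinary data-handling code can inspect or rewrite it. A minimal sketch in Python, with no database involved; the helper names `fields_used` and `and_with` are invented for illustration and are not part of any driver:

```python
import datetime


def fields_used(expr):
    """Return the set of field names referenced in a query expression."""
    found = set()
    for key, value in expr.items():
        if key.startswith("$"):
            # Logical operators such as $and / $or hold a list of sub-expressions.
            if isinstance(value, list):
                for sub in value:
                    if isinstance(sub, dict):
                        found |= fields_used(sub)
        else:
            found.add(key)
    return found


def and_with(expr, extra):
    """Return a new expression that requires both expr and extra."""
    return {"$and": [expr, extra]}


query = {"$or": [
    {"phones.type": "work"},
    {"hiredate": {"$gt": datetime.datetime(2014, 2, 2)}},
]}
restricted = and_with(query, {"name": "jane"})
```

Here `fields_used(restricted)` yields the three field names `phones.type`, `hiredate` and `name`, all without parsing a single string.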
31. MongoDB Queries use familiar operators
SQL CLI:
select * from contact A, phones B where
A.did = B.did and B.type = 'work';
MongoDB CLI:
db.contact.find({"phones.type":"work"});
SQL in Java:
String s = "select * from contact A, phones B where A.did = B.did and B.type = 'work'";
ResultSet rs = execute(s);
MongoDB via Java driver:
DBObject expr = new BasicDBObject();
expr.put("phones.type", "work");
Cursor c = contact.find(expr);
Find all contacts with at least one work phone
32. MongoDB Queries are Expressive and Typed
SQL:
select A.did, A.lname, A.hiredate, B.type,
B.number from contact A left outer join phones B
on (B.did = A.did) where B.type = 'work' or
A.hiredate > '2014-02-02'::date
MongoDB CLI:
db.contacts.find({"$or": [
    {"phones.type": "work"},
    {"hiredate": {"$gt": new ISODate("2014-02-02")}}
]});
Find all contacts with at least one work phone or hired after 2014-02-02
33. MongoDB Queries Welcome Programmatic Construction
Fetch a given set of fields for contacts hired between date A and date B, inclusive:
fetch(String[] fields, Date date1, Date date2) {
    // Projection: 1 means "include this field"
    DBObject projection = new BasicDBObject();
    for(String f: fields) {
        projection.put(f, 1);
    }
    // Predicate: both date bounds must hold
    List<DBObject> exprs = new ArrayList<>();
    exprs.add(new BasicDBObject("hdate", new BasicDBObject("$gte",date1)));
    exprs.add(new BasicDBObject("hdate", new BasicDBObject("$lte",date2)));
    DBObject predicate = new BasicDBObject("$and", exprs);
    DBCursor c2 = contacts.find(predicate, projection);
}
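The same construction is even shorter in a language with native map literals. A minimal sketch in Python, assuming the same `hdate` field as the Java above; `build_fetch_spec` is a name invented for this example. It also uses the fact that two operators on the same field can share one sub-document, which is equivalent to the explicit `$and` form.

```python
import datetime


def build_fetch_spec(fields, date1, date2):
    """Build (predicate, projection) for contacts hired between two dates, inclusive."""
    # Projection: 1 means "include this field".
    projection = {f: 1 for f in fields}
    # Both bounds apply to the same field, so they can share one sub-document
    # instead of being wrapped in an explicit $and.
    predicate = {"hdate": {"$gte": date1, "$lte": date2}}
    return predicate, projection


predicate, projection = build_fetch_spec(
    ["name", "hdate"],
    datetime.datetime(2013, 1, 1),
    datetime.datetime(2013, 12, 31),
)
```

With PyMongo the pair could then be passed as `contacts.find(predicate, projection)`, where `contacts` is assumed to be a collection handle.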
34. SQL Queries Require Construction of a String
Fetch a given set of fields for contacts hired between date A and date B, inclusive:
fetch(String[] fields, Date date1, Date date2) {
    StringBuffer sb = new StringBuffer();
    sb.append("select ");
    for(int i = 0; i < fields.length; i++) {
        if(i != 0) { sb.append(","); }
        sb.append(fields[i]);
    }
    // Lowercase yyyy and dd: "YYYY" is week-year and "DD" is day-of-year
    DateFormat df = new SimpleDateFormat("yyyy-MM-dd");
    // Leading space keeps "from" from running into the last field name
    sb.append(" from contact where ");
    sb.append("hiredate >= ");
    sb.append("'");
    sb.append(df.format(date1));
    sb.append("'::date");
    sb.append(" and ");
    sb.append("hiredate <= ");
    sb.append("'");
    sb.append(df.format(date2));
    sb.append("'::date");
    ResultSet resultSet = stmt.executeQuery(sb.toString());
}
Careful! What happens if the fetch involved a join and the incoming field names needed to be prefixed, e.g.
select A.fields[0],
B.fields[1],
B.fields[2]
35. PreparedStatements ease the pain a bit…
Fetch a given set of fields for contacts hired between date A and date B, inclusive:
fetch(String[] fields, Date date1, Date date2) {
    StringBuffer sb = new StringBuffer();
    sb.append("select ");
    for(int i = 0; i < fields.length; i++) {
        if(i != 0) { sb.append(","); }
        sb.append(fields[i]);
    }
    sb.append(" from contact where ");
    sb.append("hiredate >= ? and hiredate <= ?");
    PreparedStatement px = conn.prepareStatement( sb.toString() );
    // Bind the bounds as SQL dates; java.sql.Date is built from epoch millis
    px.setDate(1, new java.sql.Date(date1.getTime()));
    px.setDate(2, new java.sql.Date(date2.getTime()));
    ResultSet resultSet = px.executeQuery();
}
36. MongoDB Facilitates Versatile Filtering
Pass an arbitrary filter expression to contacts:
/**
* Contract: Filter only that which you could see if you
* applied NO filter. So allow Mongo query language
* (MQL) to be used ABOVE the DAL as a filter spec.
* No MongoDB physical dependencies!
*/
fetch(Map mql) {
DBCursor c2 = contacts.find(new BasicDBObject(mql));
}
37. SQL Requires … MongoDB Query Language!
Pass an arbitrary filter expression to contacts:
fetch(Map mql) {
    StringBuffer sb = new StringBuffer();
    sb.append("select * from contact where ");
    walkMap(sb, mql);
    ResultSet resultSet = stmt.executeQuery(sb.toString());
}
walkMap(StringBuffer sb, Map<String,Object> fragment) {
    java.util.Iterator<String> ii = fragment.keySet().iterator();
    while(ii.hasNext()) {
        String key = ii.next();
        Object ov = fragment.get(key);
        process(sb, ov, key); // e.g. (hdate >= '2013-04-01'::date)
    }
}
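The slide leaves `process` undefined. As a rough idea of what such a walker involves, here is a minimal sketch in Python that translates a small subset of the query language into a parameterized WHERE clause. Everything here is an assumption made for illustration: the function names, the supported operators (`$and`, `$or` and six comparisons) and the use of `?` placeholders. Field names are trusted as-is, and dotted paths such as `phones.type`, which would imply a join, are not handled.

```python
# Comparison operators this sketch understands, mapped to SQL.
OPS = {"$gt": ">", "$gte": ">=", "$lt": "<", "$lte": "<=", "$eq": "=", "$ne": "<>"}


def walk_map(fragment, params):
    """Translate a small subset of MQL into a SQL WHERE fragment.

    Values are appended to params and replaced by '?' placeholders.
    """
    clauses = []
    for key, value in fragment.items():
        if key in ("$and", "$or"):
            joiner = " and " if key == "$and" else " or "
            parts = [walk_map(sub, params) for sub in value]
            clauses.append("(" + joiner.join(parts) + ")")
        elif isinstance(value, dict):
            # e.g. {"hdate": {"$gte": d1, "$lte": d2}}
            for op, operand in value.items():
                clauses.append("%s %s ?" % (key, OPS[op]))
                params.append(operand)
        else:
            # Bare value means equality.
            clauses.append("%s = ?" % key)
            params.append(value)
    return " and ".join(clauses)


def to_sql(mql):
    """Return (sql_text, params) for a filter over the contact table."""
    params = []
    where = walk_map(mql, params)
    return "select * from contact where " + where, params
```

Even this toy version needs recursion, an operator table and parameter bookkeeping, which is the slide's point: supporting arbitrary filters on top of SQL means re-implementing a query language.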
38. …and before you ask…
Yes, MongoDB query expressions support:
1. Sorting
2. Cursor size limit
3. Aggregation (“GROUP BY”) functions
39. Day 30: RAD on MongoDB with Python
import datetime
import pymongo
# Assumes a local mongod; the database and collection names are placeholders
coll = pymongo.MongoClient().test.contacts
def save(data):
    coll.insert_one(data)
def fetch(id):
    return coll.find_one({"id": id})
myData = {
    "name": "jane",
    "id": "K2",
    # no title? No problem
    # PyMongo stores datetime.datetime, not datetime.date
    "hireDate": datetime.datetime(2011, 11, 1),
    "phones": [
        { "type": "work",
          "number": "1-800-555-1212"
        },
        { "type": "home",
          "number": "1-866-444-3131"
        }
    ]
}
save(myData)
print(fetch("K2"))
expr = {"$or": [ {"phones.type": "work"}, {"hireDate": {"$gt": datetime.datetime(2014, 2, 2)}} ]}
for c in coll.find(expr):
    print([ k.upper() for k in sorted(c.keys()) ])
Advantages:
1. Far easier and faster to create scripts due to “fidelity-parity” of MongoDB map data and Python (and Perl, Ruby, and JavaScript) structures
2. Data types and structure in scripts are exactly the same as those read and written in Java and C++
40. Day 30: Polymorphic RAD on MongoDB with Python
import datetime
import pymongo
item = fetch("K8")
# item is:
{
    "name": "bob",
    "id": "K8",
    "personalData": {
        "preferedAirports": [ "LGA", "JFK" ],
        "travelTimeThreshold": { "value": 3,
                                 "units": "HRS"}
    }
}
item = fetch("K9")
# item is:
{
    "name": "steve",
    "id": "K9",
    "personalData": {
        "lastAccountVisited": {
            "name": "mongoDB",
            "when": datetime.datetime(2013, 11, 4)
        },
        "favoriteNumber": 3.14159
    }
}
Advantages:
1. Scripting languages easily digest shapes with common fields and dissimilar fields
2. Easy to create an information architecture where placeholder fields like personalData are “known” in the software logic to be dynamic
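As an illustration of how little code a scripting language needs to digest such dissimilar shapes, here is a minimal sketch in Python. The `describe` helper is invented for this example; it walks any nested map or list and reports each leaf with its type, without knowing anything about what `personalData` contains.

```python
def describe(value, path=""):
    """Yield (dotted_path, type_name) for every leaf in a nested structure."""
    if isinstance(value, dict):
        for key in sorted(value):
            child = path + "." + key if path else key
            yield from describe(value[key], child)
    elif isinstance(value, list):
        for i, elem in enumerate(value):
            yield from describe(elem, "%s[%d]" % (path, i))
    else:
        yield path, type(value).__name__


# Same shape as the "K8" record above.
bob = {
    "name": "bob",
    "id": "K8",
    "personalData": {
        "preferedAirports": ["LGA", "JFK"],
        "travelTimeThreshold": {"value": 3, "units": "HRS"},
    },
}

for leaf_path, type_name in describe(bob):
    print(leaf_path, type_name)
```

The same function works unchanged on the "K9" record, even though its `personalData` has entirely different fields.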
41. Day 30: (Not) RAD on top of SQL with Python
init()
{
    contactInsertStmt = connection.prepareStatement
        ("insert into contact ( id, name, title, hiredate ) values ( ?,?,?,? )");
    c2stmt = connection.prepareStatement
        ("insert into phones (id, type, number) values (?, ?, ?)");
    fetchStmt = connection.prepareStatement
        ("select contact.id, name, title, hiredate, type, number from contact, phones where phones.id = contact.id and contact.id = ?");
}
save(Map m)
{
    startTrans();
    // Insert the parent contact row first
    contactInsertStmt.setString(1, (String) m.get("id"));
    contactInsertStmt.setString(2, (String) m.get("name"));
    contactInsertStmt.setString(3, (String) m.get("title"));
    contactInsertStmt.setDate(4, (java.sql.Date) m.get("hireDate"));
    contactInsertStmt.execute();
    // Then one phones row per entry; all three placeholders must be bound
    for(Map onePhone : (List<Map>) m.get("phones")) {
        c2stmt.setString(1, (String) m.get("id"));
        c2stmt.setString(2, (String) onePhone.get("type"));
        c2stmt.setString(3, (String) onePhone.get("number"));
        c2stmt.execute();
    }
    endTrans();
}
Consequences:
1. All logic coded in the Java interface layer (unwinding contact, phones, preferences, etc.) needs to be rewritten in Python (unless Jython is used) … and/or Perl, C++, Scala, etc.
2. No robust way to handle polymorphic data other than BLOBing it
3. …and that will take real time and money!
42. The Fundamental Change with mongoDB
RDBMS was designed in an era when:
• CPU and disk were slow & expensive
• Memory was VERY expensive
• Network? What network?
• Languages had limited means to dynamically reflect on their types
• Languages had poor support for richly structured types
Thus, the database had to
• Act as combiner-coordinator of simpler types
• Define a rigid schema
• (Together with the code) optimize at compile-time, not run-time
In mongoDB, the data is the schema!
43. mongoDB and the Rich Map Ecosystem
Generic comparison of two records
Map expr = new HashMap();
expr.put("myKey", "K1");
DBObject a = collection.findOne(new BasicDBObject(expr));
expr.put("myKey", "K2");
DBObject b = collection.findOne(new BasicDBObject(expr));
List<MapDiff.Difference> d = MapDiff.diff((Map)a, (Map)b);
Getting default values for a thing on a certain date and then overlaying user preferences (like for a calculation run)
Map expr = new HashMap();
expr.put("myKey", "DEFAULT");
// new Date(2013, 11, 1) counts years from 1900 and months from 0;
// this builds 1 Nov 2013, assuming the 11 was meant as November
expr.put("createDate", new GregorianCalendar(2013, Calendar.NOVEMBER, 1).getTime());
DBObject a = collection.findOne(new BasicDBObject(expr));
expr.clear();
expr.put("myKey", "user1");
DBObject b = otherCollectionPerhaps.findOne(new BasicDBObject(expr));
MapStack s = new MapStack();
s.push((Map)a);
s.push((Map)b);
Map merged = s.project();
Runtime reflection of Maps and Lists enables generic powerful utilities (MapDiff, MapStack) to be created once and used for all kinds of shapes, saving time and money
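The slides do not show how MapDiff or MapStack are implemented. To give a sense of how small such generic utilities can be, here is a minimal sketch in Python of the two ideas. The names `map_diff` and `overlay`, the output format, and the choice to merge nested maps recursively are all assumptions made for this example, not a description of the original utilities.

```python
MISSING = object()  # marks "key not present on this side"


def map_diff(a, b, path=""):
    """Return a list of (path, value_in_a, value_in_b) for every difference."""
    diffs = []
    for key in sorted(set(a) | set(b)):
        child = path + "." + key if path else key
        va, vb = a.get(key, MISSING), b.get(key, MISSING)
        if isinstance(va, dict) and isinstance(vb, dict):
            # Both sides are maps: descend rather than report the whole map.
            diffs.extend(map_diff(va, vb, child))
        elif va != vb:
            diffs.append((child, va, vb))
    return diffs


def overlay(*layers):
    """Merge maps left to right; later layers win, nested maps merge recursively."""
    merged = {}
    for layer in layers:
        for key, value in layer.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                merged[key] = overlay(merged[key], value)
            else:
                merged[key] = value
    return merged


# Invented sample records: system defaults and one user's preferences.
defaults = {"myKey": "DEFAULT", "limits": {"max": 10, "min": 0}, "ccy": "USD"}
user = {"myKey": "user1", "limits": {"max": 50}}
merged = overlay(defaults, user)
```

Neither function knows anything about the fields it is handling, so both work on any record shape fetched from any collection.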
44. Lastly: A CLI with teeth
> db.contact.find({"SeqNum": {"$gt":10000}}).explain();
{
    "cursor" : "BasicCursor",
    "n" : 200000,
    //...
    "millis" : 223
}
Try a query and show the diagnostics
> for(v=[],i=0;i<3;i++) {
... n = i*50000;
... expr = {"SeqNum": {"$gt": n}};
... v.push( [n, db.contact.find(expr).explain().millis] ); }
Run it 3 times with smaller and smaller chunks and create a vector of timing result pairs (size,time)
> v
[ [ 0, 225 ], [ 50000, 222 ], [ 100000, 220 ] ]
Let’s see that vector
> load("jStat.js")
> jStat.stdev(v.map(function(p){return p[1];}))
2.0548046676563256
Use any other JavaScript you want inside the shell
> for(i=0;i<3;i++) {
... expr = {"SeqNum": {"$gt":i*1000}};
... db.foo.insert(db.contact.find(expr).explain()); }
Party trick: save the explain() output back into a collection!
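As a sanity check on the jStat figure above: it agrees with the population standard deviation (dividing by n rather than n-1) of the three timings. A short sketch in Python using only the standard library:

```python
import statistics

timings = [225, 222, 220]  # the millis values from the vector above

# Population standard deviation: sqrt(sum((x - mean)^2) / n)
spread = statistics.pstdev(timings)
print(spread)
```

A spread of about 2 ms on roughly 220 ms suggests that the query time barely depends on the size of the range, which is consistent with the `BasicCursor` (full scan) reported by explain().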
45. What Does All This Add Up To?
• MongoDB is easier than RDBMS/SQL for many problems
• Quicker to change
• Much better harmonized with modern languages
• Comprehensive indexing (arbitrary unique and non-unique secondaries, compound keys, geospatial, text search, TTL, etc.)
• Horizontally scalable to petabytes
• Isomorphic HA and DR
Modern Database for Modern Solutions