Tremendous innovation has taken place in the database space in the recent past. In the beginning, we had traditional file systems and hierarchical databases. We then graduated to relational database management systems, followed by database appliances and in-memory databases. Then came the data explosion. Hadoop was born to alleviate the challenges that followed, but it too failed to meet the new-age demands of business: batch-oriented processing was not good enough for the speed businesses were looking for. The need of the hour is a near-real-time solution that can process not only transactional but also analytical workloads at the same time. The new-age NoSQL databases bring all of this together. However, we need to unlearn a lot while designing applications on NoSQL, and set aside our relational thinking while building such solutions. In this session, we will talk about best practices for designing performant data models, along with tools and accelerators that lend predictability, especially while migrating from traditional RDBMS to modern NoSQL models. This will be substantiated with our experience developing such data models for clients across verticals.
Michael Poremba, Director, Data Architecture at Practice Fusion – MongoDB
Practice Fusion, the largest cloud-based electronic health records (EHR) system in the US, used by more than 100,000 health care providers managing over 100 million patient medical records, faced the need to move their four terabyte HIPAA audit reporting system off of a relational database. Practice Fusion selected MongoDB for their new HIPAA audit reporting system. Learn how the team designed and implemented a highly scalable system for storing protected health information in the cloud. This case study covers the move from a relational database to a document database; data modeling in JSON; sharding strategies; indexing; sharded cluster design supporting high availability and disaster recovery; performance testing; and data migration of billions of historical audit records.
The document discusses MongoDB and how it allows storing data in flexible, document-based collections rather than rigid tables. Some key points:
- MongoDB uses a flexible document model that allows embedding related data rather than requiring separate tables joined by foreign keys.
- It supports dynamic schemas that allow fields within documents to vary unlike traditional SQL databases that require all rows to have the same structure.
- Aggregation capabilities allow complex analytics to be performed directly on the data without requiring data warehousing or manual export/import like with SQL databases. Pipelines of aggregation operations can be chained together.
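To make the "chained pipeline" idea concrete, here is a minimal pure-Python simulation of the semantics of a `$match` stage followed by a `$group`/`$sum` stage. The collection, field names, and data are illustrative assumptions, not taken from the document; in MongoDB itself the pipeline would be passed to `aggregate()` as a list of stage documents.

```python
# Pure-Python simulation of a $match -> $group aggregation pipeline.
# All collection/field names and data are illustrative assumptions.

orders = [
    {"status": "shipped", "region": "EU", "total": 120},
    {"status": "shipped", "region": "US", "total": 80},
    {"status": "pending", "region": "EU", "total": 45},
    {"status": "shipped", "region": "EU", "total": 60},
]

def match(docs, predicate):
    """Mimics the $match stage: keep only documents satisfying the predicate."""
    return [d for d in docs if predicate(d)]

def group_sum(docs, key, value_field):
    """Mimics $group with a $sum accumulator, keyed on `key`."""
    totals = {}
    for d in docs:
        totals[d[key]] = totals.get(d[key], 0) + d[value_field]
    return totals

# Chain the stages, as an aggregation pipeline would.
shipped = match(orders, lambda d: d["status"] == "shipped")
revenue_by_region = group_sum(shipped, "region", "total")
print(revenue_by_region)  # {'EU': 180, 'US': 80}
```

The point of the sketch is that each stage consumes the previous stage's output, so analytics can run directly where the data lives instead of being exported to a warehouse first.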
Simplifying & accelerating application development with MongoDB's intelligent... – Maxime Beugnet
The document discusses MongoDB's Intelligent Operational Data Platform and how it allows developers to simplify application development. It highlights how MongoDB uses a document model which is more flexible than a relational database and allows for embedding of related data. MongoDB also provides features like multi-document transactions, full indexing capabilities, advanced aggregations, and change streams for building reactive applications in real-time.
Slidedeck presented at http://devternity.com/ on MongoDB internals. We review the usage patterns of MongoDB, the different storage engines and persistence models, as well as the definition of documents and general data structures.
MongoDB .local Houston 2019: Best Practices for Working with IoT and Time-ser... – MongoDB
Time series data is increasingly at the heart of modern applications - think IoT, stock trading, clickstreams, social media, and more. With the move from batch to real time systems, the efficient capture and analysis of time series data can enable organizations to better detect and respond to events ahead of their competitors or to improve operational efficiency to reduce cost and risk. Working with time series data is often different from regular application data, and there are best practices you should observe.
This talk covers:
Common components of an IoT solution
The challenges involved with managing time-series data in IoT applications
Different schema designs, and how these affect memory and disk utilization – two critical factors in application performance.
How to query, analyze and present IoT time-series data using MongoDB Compass and MongoDB Charts
At the end of the session, you will have a better understanding of key best practices in managing IoT time-series data with MongoDB.
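One way to see why schema design affects memory and disk utilization is to compare two common time-series layouts: one document per reading versus one bucket document per device-hour. The sketch below is a rough illustration under invented names and data; JSON string length is used as a crude stand-in for storage size (real BSON sizes and index overhead differ).

```python
# Rough comparison of two illustrative time-series schemas: one document
# per reading vs. one bucket document per device-hour. Names and data are
# assumptions; JSON length is only a crude proxy for on-disk size.
import json

readings = [{"device": "sensor-1",
             "ts": f"2019-06-01T10:00:{s:02d}Z",
             "temp": 20.0 + s % 3} for s in range(60)]

# Schema A: one document per reading (repeats field names and device id
# 60 times, and implies 60 index entries on device/ts).
size_per_reading = sum(len(json.dumps(d)) for d in readings)

# Schema B: one bucket per device-hour; device id and hour stored once.
bucket = {"device": "sensor-1",
          "hour": "2019-06-01T10:00Z",
          "measurements": [{"ts": d["ts"], "temp": d["temp"]} for d in readings]}
size_bucketed = len(json.dumps(bucket))

print(size_per_reading, size_bucketed)
```

Even in this toy form, the bucketed layout comes out smaller because per-document overhead is amortized across the hour's measurements.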
MongoDB .local London 2019: Best Practices for Working with IoT and Time-seri... – MongoDB
Time series data is increasingly at the heart of modern applications - think IoT, stock trading, clickstreams, social media, and more. With the move from batch to real time systems, the efficient capture and analysis of time series data can enable organizations to better detect and respond to events ahead of their competitors or to improve operational efficiency to reduce cost and risk. Working with time series data is often different from regular application data, and there are best practices you should observe.
This talk covers:
• Common components of an IoT solution
• The challenges involved with managing time-series data in IoT applications
• Different schema designs, and how these affect memory and disk utilization – two critical factors in application performance.
• How to query, analyze and present IoT time-series data using MongoDB Compass and MongoDB Charts
At the end of the session, you will have a better understanding of key best practices in managing IoT time-series data with MongoDB.
Evolving your Data Access with MongoDB Stitch – MongoDB
MongoDB Stitch is a platform that allows developers to build and deploy applications with MongoDB. It consists of four main services - QueryAnywhere for data access, Functions for server-side logic, Triggers for real-time notifications, and Mobile Sync for offline data synchronization. Stitch handles infrastructure concerns so developers can focus on writing code. It provides global data access, integrated authorization rules, and serverless hosting of business logic. This allows applications to be built more easily and deployed seamlessly across different platforms and locations.
Real-time big data analytics based on product recommendations case study – deep.bi
We started as an ad network. The challenge was to recommend the best product (out of millions) to the right person at a given moment (thousands of users within a second). We have delivered 5 billion ad views in the past 24 months. To put that in context: if we served 1 ad per second, it would take about 160 years to serve 5 billion ads.
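The scale claim checks out with a quick back-of-the-envelope calculation:

```python
# Sanity check: serving 5 billion ads at one per second really does take
# on the order of 160 years.
ads = 5_000_000_000
seconds_per_year = 60 * 60 * 24 * 365
years = ads / seconds_per_year
print(round(years, 1))  # prints 158.5
```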
So we needed a solution. SQL databases did not work. Popular NoSQL databases did not work. Standard data warehouse approaches (pre-aggregations, creating schemas) did not work either.
Rethinking all the problems posed by the huge data streams flowing to us every second, we built a complete solution based on open-source technologies and fresh, smart ideas from our engineering team. It is called deep.bi, and we now make it available to other companies.
deep.bi lets high-growth companies solve fast data problems by providing scalable, flexible and real-time data collection, enrichment and analytics.
It was built using:
- Node.js - API
- Kafka - collecting and distributing data
- Spark Streaming - ETL, data enrichments
- Druid - real-time analytics
- Cassandra - user events store
- Hadoop + Parquet + Spark - raw data store + ad-hoc queries
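The stack above implements a collect → enrich → aggregate flow at scale (Kafka for transport, Spark Streaming for enrichment, Druid for real-time rollups). As an in-process toy, under invented event names and reference data, the same flow looks like this:

```python
# Toy, in-process sketch of the collect -> enrich -> aggregate flow that
# Kafka / Spark Streaming / Druid implement at scale. Names and data are
# illustrative assumptions.

raw_events = [
    {"user": "u1", "product": "p9", "action": "view"},
    {"user": "u2", "product": "p9", "action": "click"},
    {"user": "u1", "product": "p3", "action": "click"},
]

def enrich(event, catalog):
    """Enrichment step: join each event with reference data (Spark's role)."""
    return {**event, "category": catalog.get(event["product"], "unknown")}

def rollup(events):
    """Real-time rollup: count events per (category, action), Druid-style."""
    counts = {}
    for e in events:
        key = (e["category"], e["action"])
        counts[key] = counts.get(key, 0) + 1
    return counts

catalog = {"p9": "shoes", "p3": "books"}
enriched = [enrich(e, catalog) for e in raw_events]
print(rollup(enriched))
```

The real system differs mainly in that each arrow is a distributed, fault-tolerant service rather than a function call.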
AWS re:Invent 2016: Building IoT Applications with AWS and Amazon Alexa (HLC304) – Amazon Web Services
Alexa, what is the Internet of Things? Now that technology is small enough to be embedded in everyday devices, Healthcare has an opportunity to exploit the extraordinary potential of connecting ordinary devices. In this presentation, we explain how to rapidly build an IoT system and how to drive the Cloud with your voice on an Amazon Echo. In addition to describing how to use Alexa, we explore using AWS IoT, Lambda, Amazon SNS, and DynamoDB.
Today’s highly connected world is flooding businesses with big and fast-moving data. The ability to trawl this data ocean and identify actionable insights can deliver a competitive advantage to any organization. The WSO2 Analytics Platform enables businesses to do just that by providing batch, real-time, interactive and predictive analysis capabilities all in one place.
In this tutorial we will
* Plug in the WSO2 Analytics Platform to some common business use cases
* Showcase the numerous capabilities of the platform
* Demonstrate how to collect data, analyze, predict and communicate effectively
* Demonstrate how it can analyze integration, security and IoT scenarios
Stick around till the end and you will walk away with the necessary skills to create a winning data strategy for your organization to stay ahead of its competition.
Cloud Adoption in Regulated Financial Services - SID328 - re:Invent 2017 – Amazon Web Services
Macquarie, a global provider of financial services, identified early on that it would require strong partnership between its business, technology and risk teams to enable the rapid adoption of AWS cloud technologies. As a result, Macquarie built a Cloud Governance Platform to enable its risk functions to move as quickly as its development teams. This platform has been the backbone of Macquarie’s adoption of AWS over the past two years and has enabled Macquarie to accelerate its use of cloud technologies for the benefit of clients across multiple global markets. This talk will outline the strategy that Macquarie embarked on, describe the platform they built, and provide examples for other organizations who are on a similar journey.
Dublin Ireland Spark Meetup October 15, 2015 – eddiebaggott
This document discusses Apache Spark DataFrames and their use for fraud detection. It provides an overview of DataFrames, how they can be created from various data sources and used for data aggregation, filtering, and querying. Examples are given for creating DataFrames from text files, performing SQL queries, and using DataFrames for profiling, charting, and data quality checks. A variety of fraud detection applications for DataFrames are also outlined.
This document discusses using Azure HDInsight for big data applications. It provides an overview of HDInsight and describes how it can be used for various big data scenarios like modern data warehousing, advanced analytics, and IoT. It also discusses the architecture and components of HDInsight, how to create and manage HDInsight clusters, and how HDInsight integrates with other Azure services for big data and analytics workloads.
- MongoDB is well-suited for IoT applications due to its ability to handle large volumes of variable data from sensors, perform analytics on both real-time and historical data, and scale horizontally to support growing workloads.
- Its flexible document model accommodates changing sensor schemas and nested/complex data structures from devices, while secondary indexes enable expressive queries.
- Time series data from sensors can be optimized in MongoDB using bucketing which improves write performance, storage usage, and analytics capabilities.
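The bucketing pattern mentioned above can be sketched in a few lines: instead of writing one document per sensor reading, readings for the same device-hour are appended to a single bucket document (in MongoDB this append would typically be an update using `$push`). The names and data below are assumptions for the example, and a plain dict stands in for the collection:

```python
# Illustrative sketch of the time-series bucketing pattern. A dict stands
# in for a MongoDB collection keyed by (device, hour); in MongoDB the
# append would be an update with $push and $inc. Names are assumptions.

buckets = {}

def record(device, hour, ts, value):
    key = (device, hour)
    bucket = buckets.setdefault(key, {"device": device, "hour": hour,
                                      "count": 0, "measurements": []})
    bucket["measurements"].append({"ts": ts, "value": value})
    bucket["count"] += 1

# An hour of once-a-minute readings lands in a single document.
for minute in range(60):
    record("sensor-1", "2019-06-01T10", f"10:{minute:02d}", 20 + minute % 5)

print(len(buckets), buckets[("sensor-1", "2019-06-01T10")]["count"])  # prints: 1 60
```

Writes touch one document per device-hour rather than creating sixty, which is where the write-performance and storage gains come from.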
Big Data Expo 2015 - Gigaspaces Making Sense of it all – BigDataExpo
NoSQL engines are often limited in the types of queries they can support due to the distributed nature of the data. In this session we will learn patterns for overcoming this limitation and combining multiple query semantics with NoSQL-based engines. We will demonstrate a combination of key/value, SQL-like, document-model and graph-based queries, as well as more advanced topics such as handling partial updates and querying through projection. We will also demonstrate how to create a mash-up between those APIs, i.e., write fast through a key/value API and execute complex queries on the same data through SQL.
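A toy illustration of that mash-up idea, under invented names and purely in memory: records are written through a fast key/value path and read back through a richer document-style query path with projection over the same data.

```python
# Toy multi-model "mash-up": O(1) key/value writes plus document-style
# filtered queries with projection over the same in-memory store.
# All names and data are illustrative assumptions.

store = {}

def put(key, doc):
    """Key/value write path: constant-time insert by primary key."""
    store[key] = doc

def query(predicate, projection=None):
    """Document-style read path: filter, then optionally project fields."""
    results = [d for d in store.values() if predicate(d)]
    if projection:
        results = [{f: d[f] for f in projection if f in d} for d in results]
    return results

put("o1", {"id": "o1", "status": "open", "total": 40, "customer": "acme"})
put("o2", {"id": "o2", "status": "closed", "total": 90, "customer": "acme"})
put("o3", {"id": "o3", "status": "open", "total": 15, "customer": "bob"})

open_orders = query(lambda d: d["status"] == "open", projection=["id", "total"])
print(open_orders)  # [{'id': 'o1', 'total': 40}, {'id': 'o3', 'total': 15}]
```

The design choice mirrors the session's theme: the write path never pays for query flexibility, while the read path can still express richer semantics over the same records.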
Building Analytics Applications with Streaming Expressions in Apache Solr - A... – Lucidworks
This document discusses building analytics applications with streaming expressions in Apache Solr. It introduces parallel computing frameworks, the streaming API, and streaming expressions. It provides examples of use cases like performing searches, facets, joins, and aggregations on real-time data from different sources. It also demonstrates how to execute expressions in parallel using worker collections and shuffling to improve performance.
The document discusses software design patterns for distributed applications. It begins with introductions and definitions of patterns, then discusses specific patterns like Table Module, Table Data Gateway, and Active Record that address problems like representing business entities, data access, and application distribution. The document also provides examples of applying these patterns to a revenue recognition problem domain.
The document discusses software design patterns for distributed applications. It introduces common patterns like Model-View-Controller (MVC), layers (presentation, business, data), and data access patterns (table data gateway, active record). It also provides examples of applying these patterns to problems like representing business entities and data, handling distributed transactions, and implementing specific business logic like revenue recognition. The goal of patterns is to provide reusable solutions to common problems in software architecture and design.
Docker Summit MongoDB - Data Democratization – Chris Grabosky
The document summarizes a presentation given by Chris Grabosky, a Solutions Architect at MongoDB. It discusses how containerization and data democratization can help organizations build cloud native applications. It highlights how MongoDB and Docker can be used together to build scalable and portable apps. It also covers data modeling approaches, workload isolation, analytics, and data access controls that MongoDB provides to help democratize data.
This document summarizes Giant Eagle's use of MongoDB for an expense reimbursement application. It discusses why Giant Eagle chose MongoDB over its traditional .NET/Oracle stack, the technology stack used, and MongoDB's architecture. Some limitations encountered included inability to directly select subdocuments and query across collections. Overall, while MongoDB has some immature areas, it was a good fit for the project's performance needs and flexibility.
tranSMART Community Meeting 5-7 Nov 13 - Session 2: MongoDB: What, Why And When – David Peyruc
tranSMART Community Meeting 5-7 Nov 13 - Session 2: MongoDB: What, Why And When
Massimo Brignoli, MongoDB Inc
The presentation will illustrate what MongoDB is, the advantages of the document based approach and some of the use cases where MongoDB is a perfect fit.
This document discusses MongoDB and the needs of Rivera Group, an IT services company. It notes that Rivera Group has been using MongoDB since 2012 to store large, multi-dimensional datasets with heavy read/write and audit requirements. The document outlines some of the challenges Rivera Group faces around indexing, aggregation, and flexibility in querying datasets.
Eagle6 is a product that uses system artifacts to create a replica model representing a near-real-time view of system architecture. Eagle6 was built to collect system data (log files, application source code, etc.) and to link system behaviors in such a way that the user can quickly identify risks associated with unknown or unwanted behavioral events that may have unforeseen impacts on seemingly unrelated downstream systems. This session presents the capabilities of the Eagle6 modeling product and how we are using MongoDB to support near-real-time analysis of large disparate datasets.
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas – MongoDB
This presentation discusses migrating data from other data stores to MongoDB Atlas. It begins by explaining why MongoDB and Atlas are good choices for data management. Several preparation steps are covered, including sizing the target Atlas cluster, increasing the source oplog, and testing connectivity. Live migration, mongomirror, and dump/restore options are presented for migrating between replica sets or sharded clusters. Post-migration steps like monitoring and backups are also discussed. Finally, migrating from other data stores like AWS DocumentDB, Azure CosmosDB, DynamoDB, and relational databases is briefly covered.
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts! – MongoDB
These days, everyone is expected to be a data analyst. But with so much data available, how can you make sense of it and be sure you're making the best decisions? One great approach is to use data visualizations. In this session, we take a complex dataset and show how the breadth of capabilities in MongoDB Charts can help you turn bits and bytes into insights.
Similar to MongoDB World 2019: Building an Efficient and Performant Data Model: Real World Challenges Faced and How We Solved Them
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
MongoDB Kubernetes operator and MongoDB Open Service Broker are ready for production operations. Learn about how MongoDB can be used with the most popular container orchestration platform, Kubernetes, and bring self-service, persistent storage to your containerized applications. A demo will show you how easy it is to enable MongoDB clusters as an External Service using the Open Service Broker API for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB
Are you new to schema design for MongoDB, or are you looking for a more complete or agile process than what you are following currently? In this talk, we will guide you through the phases of a flexible methodology that you can apply to projects ranging from small to large with very demanding requirements.
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
Humana, like many companies, is tackling the challenge of creating real-time insights from data that is diverse and rapidly changing. This is our journey of how we used MongoDB to combined traditional batch approaches with streaming technologies to provide continues alerting capabilities from real-time data streams.
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB
Time series data is increasingly at the heart of modern applications - think IoT, stock trading, clickstreams, social media, and more. With the move from batch to real time systems, the efficient capture and analysis of time series data can enable organizations to better detect and respond to events ahead of their competitors or to improve operational efficiency to reduce cost and risk. Working with time series data is often different from regular application data, and there are best practices you should observe.
This talk covers:
Common components of an IoT solution
The challenges involved with managing time-series data in IoT applications
Different schema designs, and how these affect memory and disk utilization – two critical factors in application performance.
How to query, analyze and present IoT time-series data using MongoDB Compass and MongoDB Charts
At the end of the session, you will have a better understanding of key best practices in managing IoT time-series data with MongoDB.
Join this talk and test session with a MongoDB Developer Advocate where you'll go over the setup, configuration, and deployment of an Atlas environment. Create a service that you can take back in a production-ready state and prepare to unleash your inner genius.
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
Our clients have unique use cases and data patterns that mandate the choice of a particular strategy. To implement these strategies, it is mandatory that we unlearn a lot of relational concepts while designing and rapidly developing efficient applications on NoSQL. In this session, we will talk about some of our client use cases, the strategies we have adopted, and the features of MongoDB that assisted in implementing these strategies.
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB
Encryption is not a new concept to MongoDB. Encryption may occur in-transit (with TLS) and at-rest (with the encrypted storage engine). But MongoDB 4.2 introduces support for Client Side Encryption, ensuring the most sensitive data is encrypted before ever leaving the client application. Even full access to your MongoDB servers is not enough to decrypt this data. And better yet, Client Side Encryption can be enabled at the "flick of a switch".
This session covers using Client Side Encryption in your applications. This includes the necessary setup, how to encrypt data without sacrificing queryability, and what trade-offs to expect.
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB
MongoDB Kubernetes operator is ready for prime-time. Learn about how MongoDB can be used with most popular orchestration platform, Kubernetes, and bring self-service, persistent storage to your containerized applications.
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
These days, everyone is expected to be a data analyst. But with so much data available, how can you make sense of it and be sure you're making the best decisions? One great approach is to use data visualizations. In this session, we take a complex dataset and show how the breadth of capabilities in MongoDB Charts can help you turn bits and bytes into insights.
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
When you need to model data, is your first instinct to start breaking it down into rows and columns? Mine used to be too. When you want to develop apps in a modern, agile way, NoSQL databases can be the best option. Come to this talk to learn how to take advantage of all that NoSQL databases have to offer and discover the benefits of changing your mindset from the legacy, tabular way of modeling data. We’ll compare and contrast the terms and concepts in SQL databases and MongoDB, explain the benefits of using MongoDB compared to SQL databases, and walk through data modeling basics so you feel confident as you begin using MongoDB.
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
Join this talk and test session with a MongoDB Developer Advocate where you'll go over the setup, configuration, and deployment of an Atlas environment. Create a service that you can take back in a production-ready state and prepare to unleash your inner genius.
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
The document discusses guidelines for ordering fields in compound indexes to optimize query performance. It recommends the E-S-R approach: placing equality fields first, followed by sort fields, and range fields last. This allows indexes to leverage equality matches, provide non-blocking sorts, and minimize scanning. Examples show how indexes ordered by these guidelines can support queries more efficiently by narrowing the search bounds.
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
Aggregation pipeline has been able to power your analysis of data since version 2.2. In 4.2 we added more power and now you can use it for more powerful queries, updates, and outputting your data to existing collections. Come hear how you can do everything with the pipeline, including single-view, ETL, data roll-ups and materialized views.
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
The document describes a methodology for data modeling with MongoDB. It begins by recognizing the differences between document and tabular databases, then outlines a three step methodology: 1) describe the workload by listing queries, 2) identify and model relationships between entities, and 3) apply relevant patterns when modeling for MongoDB. The document uses examples around modeling a coffee shop franchise to illustrate modeling approaches and techniques.
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
MongoDB Atlas Data Lake is a new service offered by MongoDB Atlas. Many organizations store long term, archival data in cost-effective storage like S3, GCP, and Azure Blobs. However, many of them do not have robust systems or tools to effectively utilize large amounts of data to inform decision making. MongoDB Atlas Data Lake is a service allowing organizations to analyze their long-term data to discover a wealth of information about their business.
This session will take a deep dive into the features that are currently available in MongoDB Atlas Data Lake and how they are implemented. In addition, we'll discuss future plans and opportunities and offer ample Q&A time with the engineers on the project.
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
Virtual assistants are becoming the new norm when it comes to daily life, with Amazon’s Alexa being the leader in the space. As a developer, not only do you need to make web and mobile compliant applications, but you need to be able to support virtual assistants like Alexa. However, the process isn’t quite the same between the platforms.
How do you handle requests? Where do you store your data and work with it to create meaningful responses with little delay? How much of your code needs to change between platforms?
In this session we’ll see how to design and develop applications known as Skills for Amazon Alexa powered devices using the Go programming language and MongoDB.
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
aux Core Data, appréciée par des centaines de milliers de développeurs. Apprenez ce qui rend Realm spécial et comment il peut être utilisé pour créer de meilleures applications plus rapidement.
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
Il n’a jamais été aussi facile de commander en ligne et de se faire livrer en moins de 48h très souvent gratuitement. Cette simplicité d’usage cache un marché complexe de plus de 8000 milliards de $.
La data est bien connu du monde de la Supply Chain (itinéraires, informations sur les marchandises, douanes,…), mais la valeur de ces données opérationnelles reste peu exploitée. En alliant expertise métier et Data Science, Upply redéfinit les fondamentaux de la Supply Chain en proposant à chacun des acteurs de surmonter la volatilité et l’inefficacité du marché.
MongoDB World 2019: Building an Efficient and Performant Data Model: Real World Challenges Faced and How We Solved Them

1. Building an efficient and performant data model
Real-world challenges faced and how we solved them
2. Navigating your next with Infosys
200,000+ employees globally
$10.9 billion in revenues
1,204 clients in over 45 countries
168,000+ employees trained in Design Thinking
World's largest corporate university
3. Modernization Practice to drive Transformation
Open Source COE: innovation and cost efficiency; ROI on current technology investment; alignment to modern architectures
Legacy / Mainframe Modernization: reduce dependency on the existing legacy estate
Public Cloud (Applications): scale and savings on the infrastructure with Cloud Native architecture; Infra as Code
DevOps: build the technology foundation of the Digital Platform; building a Cloud Native digital platform; digital tools adopting DevOps & Agile principles
4. Infosys Open Source: At a Glance
Solution Themes: Mainframe offload; Monolith to Microservices; RDBMS to ODBMS; Application Modernization on NoSQL; Big Data Analytics
Technology Platforms: Events & Streaming; API Management; RDBMS; Search & Insights; In-Memory; PaaS; Containers; UX; Integration & BPM; NoSQL; IaaS
Service Offerings:
Advisory: plan, accelerate open source adoption and manage associated risk
Architecture Consulting: make the right technological choices and establish the springboard for success
Implementation: deliver measurable benefits faster through agile & lean methods
Operations & Support: embed technology into the mainstream with continuous improvement
5. Considerations for Relational Data Models
Integrity
Structure of Data & Entities
Concurrency Control
Consistency
Schema Validations
6. The need for databases for non-transactional use cases and applications
Databases that support the Variety, Velocity and Volume of data in the digital world

Use Case → Preferred Type of Database
§ Caching data, user sessions and preferences, shopping cart data → Key-Value
§ IoT sensor data, logs, huge data sets → Columnar
§ Social and other networks, real-time routing, fraud detection → Graph
§ Web apps, product catalogs, operational data stores, performant reference data stores → Document
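To make the "Document" row concrete: a minimal, illustrative sketch of why a product catalog fits the document model. The field names and products below are invented for this example, not taken from the presentation.

```python
# Illustrative only: two catalog entries with different attribute sets can
# live side by side in one document collection; a relational model would
# need sparse columns or an entity-attribute-value side table instead.

def flatten_attrs(product):
    """Return (sku, sorted attribute names) so differing shapes are easy to compare."""
    return product["sku"], sorted(product.get("attrs", {}))

laptop = {"sku": "LP-100", "name": "Laptop", "attrs": {"cpu": "i7", "ram_gb": 16}}
shirt = {"sku": "SH-200", "name": "Shirt", "attrs": {"size": "M", "color": "blue"}}

catalog = [laptop, shirt]
shapes = dict(flatten_attrs(p) for p in catalog)
print(shapes)  # {'LP-100': ['cpu', 'ram_gb'], 'SH-200': ['color', 'size']}
```

Each document carries exactly the attributes its product needs, which is the flexibility the slide attributes to the document model.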
7. Features of NoSQL database alternatives
Features of modern databases:
Denormalized data: higher speed of retrieval, higher performance
Schema-free and unstructured data formats: flexibility to accommodate changes and various data types
Horizontal scaling on commodity servers: low cost, low complexity
Built-in replication, high availability and automated failover: no add-ons, low complexity
Consistent multi-platform experience: avoids platform lock-in, aligns to next-gen architecture
Open source: low license and storage cost, lower cost overall
9. Client Context: A multinational e-commerce company that provides order management, payment processing, order routing, fulfillment, and analytics services to its clients. The company set out to re-architect its order management system to a microservices-based architecture.

Identified 3 business areas for modernization: Order Capture, Supply Chain, Billing
Deep dive on understanding business processes and data flows
Identified dependencies that impact the data entities and attributes
Impacted ERDs mapped to 200 tables

Pattern #1: Reference by key or embedding the document
Guidance: Typically never embed more than a few hundred documents.

Inventory
{
  "_id" : ObjectId("…"),
  "item_zone" : " ",
  "item_id" : " ",
  "item_sku" : " ",
  "item_qty" : 500,
  "item_desc" : "XXXX",
  …
}

Inventory Audit
{
  "_id" : ObjectId("…"),
  "item_zone" : " ",
  "item_id" : " ",
  "item_sku" : " ",
  "item_qty" : 500,
  "inventory_chng" : [
    {"dts" : DTS#1},
    {"dts" : DTS#2},
    …]
}

Inventory Audit Detail
{
  "_id" : ObjectId("…"),
  "item_zone" : " ",
  "item_id" : " ",
  "item_sku" : " ",
  "item_qty" : 500,
  "dts" : DTS#1,
  "from_qty" : X,
  "to_qty" : Y,
  "ord_id" : "XXX"
}

Reference by key was recommended.
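A standalone sketch of the "reference by key" choice: audit entries grow without bound, so each change becomes its own detail document and the audit document keeps only a list of keys. The helper name and in-memory "collections" are illustrative; with pymongo these would be `insert_one`/`update_one` calls against real collections.

```python
# Parent document holds only bounded-size references; full change records
# go into a separate "collection", joined by the shared "dts" key.

audit = {"item_id": "A1", "inventory_chng": []}   # parent: references only
audit_details = []                                 # child collection

def record_change(dts, from_qty, to_qty, ord_id):
    """Append a reference to the parent and a full detail document to the child."""
    audit["inventory_chng"].append({"dts": dts})   # stays small per change
    audit_details.append({
        "item_id": audit["item_id"],
        "dts": dts,                                # the key joining both sides
        "from_qty": from_qty,
        "to_qty": to_qty,
        "ord_id": ord_id,
    })

record_change("2019-06-01T10:00", 500, 480, "ORD-1")
record_change("2019-06-01T11:00", 480, 460, "ORD-2")

# The parent document stays compact no matter how many changes accumulate.
print(len(audit["inventory_chng"]), len(audit_details))  # 2 2
```

Had the details been embedded instead, a busy SKU could push the parent document toward MongoDB's 16 MB document limit, which is why the slide's guidance caps embedding at a few hundred subdocuments.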
12. Client Context: A French multinational corporation specializing in energy management and automation. The company was implementing an IoT use case for recording sensor readings from multiple devices.

Pattern #2: Bucket pattern, for optimized index size and optimized read operations
Guidance: Optimize as per the application's access and aggregation needs.

Old Model
{ "_id" : ObjectId("…"),
  "s" : BinData(xx),
  "t" : ISODate(xxx),
  "v" : 15,
  "a" : {
    "Name" : "Quality",
    "SemanticRef" : "com.ref",
    "Value" : "Good"
  }
}

New Model
{ "_id" : ObjectId("…"),
  "s" : BinData(xx),
  "t" : ISODate(xxx),
  "v" : { "0" : {              # increment from bucket start
            "a" : {            # only exists if there are attributes
              "Quality" : { "v" : "Good",
                            "s" : "XXX" }  # SemanticRef
            }
          },
          "1" : { … }          # increment from beginning of bucket
  }
}

Bucket the observations.
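An illustrative in-memory version of the bucket pattern above: readings for the same sensor and hour share one document, keyed by the offset from the bucket start. The short field names follow the slide ("s" sensor, "t" bucket start, "v" values); the hourly granularity and minute offsets are assumptions for this sketch.

```python
# One document per (sensor, hour) instead of one per reading: fewer documents
# to index, and a whole hour of readings comes back in a single fetch.
from datetime import datetime

buckets = {}  # (sensor_id, bucket_start) -> bucket document

def add_reading(sensor_id, ts, value):
    """Upsert a reading into its hourly bucket, keyed by minute offset."""
    start = ts.replace(minute=0, second=0, microsecond=0)
    key = (sensor_id, start)
    bucket = buckets.setdefault(key, {"s": sensor_id, "t": start, "v": {}})
    offset = (ts - start).seconds // 60          # increment from bucket start
    bucket["v"][str(offset)] = value
    return bucket

add_reading("temp-1", datetime(2019, 6, 1, 9, 5), 15)
add_reading("temp-1", datetime(2019, 6, 1, 9, 20), 16)
add_reading("temp-1", datetime(2019, 6, 1, 10, 0), 17)   # new hour, new bucket

# Two readings collapsed into one 09:00 bucket; a second bucket for 10:00.
print(len(buckets))  # 2
```

In MongoDB itself the same shape is maintained with an upsert, e.g. `update_one({"s": …, "t": …}, {"$set": {"v.5": value}}, upsert=True)`, so the index only grows per bucket rather than per reading.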
17. How to provide predictability and get a head start on the NoSQL model when migrating from an RDBMS?
18. IDMC: Infosys Data Model Converter

EXTRACTION → ANALYSIS → PERSISTENCE → PROCESSING → DEPLOYMENT

Extraction (source RDBMS): table design, query pattern, data pattern
Analysis: entity and relationship; read and write query; data volatility and cardinality
Persistence: rules (Drools, SQLite)
Processing: target data-model generation, process rules
Deployment: deployment scripts to the target NoSQL database
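To give a feel for the rules stage, here is a hypothetical sketch of the kind of decision such a converter might encode (the slide names Drools; this is plain Python). The `Relationship` shape, the threshold, and the rule itself are invented for illustration and are not IDMC internals.

```python
# A toy rule mapping relational cardinality and volatility to an
# embed-vs-reference recommendation, mimicking one rule a model
# converter's rules engine could apply.
from dataclasses import dataclass

@dataclass
class Relationship:
    parent: str
    child: str
    avg_cardinality: int   # average number of child rows per parent row
    child_volatile: bool   # does the child change independently of the parent?

def model_decision(rel, embed_limit=100):
    """Recommend embedding small, stable child sets; otherwise reference by key."""
    if rel.child_volatile or rel.avg_cardinality > embed_limit:
        return "reference"
    return "embed"

print(model_decision(Relationship("order", "line_item", 5, False)))      # embed
print(model_decision(Relationship("item", "audit_detail", 5000, True)))  # reference
```

Running many such rules over the extracted table designs and query patterns is what lets a tool like this propose a target NoSQL model with some predictability, rather than leaving every embed-or-reference call to manual judgment.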