"MongoDB isn't just software we release; it's software we rely on every day as part of the MongoDB Management Service (MMS). We use MongoDB to store monitoring data and backups for the databases of over 50,000 customers. Get the inside scoop on running an overwhelmingly write-heavy application as MMS engineers Steve Briskin and John Morales explain the tweaks and optimizations they use to get the most out of every machine, including:
- RRD schema design to balance write throughput and read latency
- Warming indexes before inserts
- Backup snapshot storage design optimized for insert-only workloads
They will also discuss the process of migrating to 3.0 and how it has simplified their lives."
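As a hedged illustration of the RRD-style schema design mentioned above, a write-heavy metrics store often pre-allocates fixed-size time-slot documents; the sketch below shows the pattern in plain Python (field and function names are hypothetical, not MMS's actual schema):

```python
from datetime import datetime, timezone

def new_hour_doc(host_id, hour_start):
    """Pre-allocate one document per host per hour with 60 minute slots.
    Fixed-size documents let later writes update in place instead of
    growing and relocating the document, which helps write throughput."""
    return {
        "_id": f"{host_id}:{hour_start:%Y-%m-%dT%H}",
        "host": host_id,
        "hour": hour_start,
        "vals": [None] * 60,  # one slot per minute of the hour
    }

def record(doc, ts, value):
    """In-memory stand-in for a MongoDB update like
    {"$set": {f"vals.{ts.minute}": value}}."""
    doc["vals"][ts.minute] = value
    return doc

doc = new_hour_doc("web-01", datetime(2015, 6, 1, 12, tzinfo=timezone.utc))
record(doc, datetime(2015, 6, 1, 12, 5, tzinfo=timezone.utc), 0.73)
```

Because every document is created at full size, inserts become in-place updates, and a dashboard read fetches one document per hour rather than one per minute.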
Solr Power FTW: Powering NoSQL the World Over (Alex Pinkin)
Solr is an open-source, Lucene-based search platform originally developed by CNET and used by the likes of Netflix, Yelp, and StubHub, and it has been rapidly growing in popularity and features over the last few years. Learn how Solr can be used as a Not Only SQL (NoSQL) database along the lines of Cassandra, Memcached, and Redis. NoSQL data stores are regularly described as non-relational, distributed, and internet-scalable, and are used at both Facebook and Digg. This presentation will quickly cover the fundamentals of NoSQL data stores, the basics of Lucene, and what Solr brings to the table. Following that, we will dive into the technical details of making Solr your primary query engine on large-scale web applications, thus relegating your traditional relational database to little more than a simple key store. Real solutions to problems like handling four billion requests per month will be presented. We'll talk about sizing and configuring Solr instances to maintain rapid response times under heavy load. We'll show you how to change the schema on a live system with tens of millions of documents indexed while supporting real-time results. And finally, we'll answer your questions about ways to work around the lack of transactions in Solr and how you can do all of this in a highly available solution.
MongoDB Basics. Talk at the University of León, Spain. A full overview of MongoDB's power, characteristics, capabilities, and products. Updated to version 3.2.
Pomerania Cloud case study - OpenStack Day Warsaw 2017 (Łukasz Klimek)
We deployed an OpenStack-based public cloud, pomeraniacloud.pl, and integrated it with an e-commerce solution based on Drupal 7 and Drupal Commerce.
This presentation summarizes our work; it was presented at OpenStack Day Warsaw 2017.
hbaseconasia2017: HBase Practice At XiaoMi (HBaseCon)
Zheng Hu
We'll share some HBase experience at XiaoMi:
1. How we tuned G1GC for our HBase clusters.
2. Development and performance of the async HBase client.
Tags: hbaseconasia2017, hbasecon, hbase, xiaomi (https://www.eventbrite.com/e/hbasecon-asia-2017-tickets-34935546159#)
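The abstract doesn't spell out the G1GC settings, but an HBase region-server heap is typically tuned through flags like these; the values below are hypothetical starting points, not XiaoMi's production configuration:

```shell
# hbase-env.sh - illustrative G1GC starting point (values are hypothetical;
# tune against your own GC logs, not these numbers)
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS \
  -XX:+UseG1GC \
  -XX:MaxGCPauseMillis=90 \
  -XX:G1HeapRegionSize=32m \
  -XX:InitiatingHeapOccupancyPercent=65 \
  -Xms32g -Xmx32g"
```

Fixing `-Xms` equal to `-Xmx` and raising the pause target together trade a little throughput for the predictable latencies a region server needs.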
Logging at OVHcloud:
Logs Data Platform is OVHcloud's centralized log collection, analysis, and management platform. It was built to meet the challenge of indexing more than 4 trillion logs for a company like OVHcloud. This presentation describes the overall architecture of Logs Data Platform around its core components, Elasticsearch and Graylog, and walks through the scalability, availability, performance, and evolvability concerns that make up the daily work of the Observability team at OVHcloud.
Security Monitoring for Big Infrastructures without a Million Dollar Budget (Juan Berner)
Nowadays, in an increasingly complex and dynamic network, it's not enough to be a regex ninja and store only the logs you think you might need. From network traffic to custom logs, you won't know which logs will be crucial to stop the next attacker, and if you are not planning to spend half of your security budget on a commercial solution, we will show you a way to build your own SIEM with open source. The talk will go from how to build a powerful logging environment for your organization to scaling in the cloud and storing everything forever. We will walk through how to build such a system with open-source solutions such as Elasticsearch and Hadoop, and how to create your own custom monitoring rules to monitor everything you need. The talk will also cover how to secure the environment and allow restricted access to other teams, as well as how to avoid common pitfalls and meet compliance standards.
As service providers and primary code contributors in the Islandora Community, discoverygarden encounters customers who are ingesting, accessing, and storing high volumes of data. For example, a customer who had 150,000 objects in 2012 now has three million objects and expectations to grow to five million in the very short term. This is increasingly common.
As repositories grow in size they can encounter poor performance, particularly during large ingests and derivative generation. To accommodate growing repositories, caching mechanisms, infrastructure changes, and code updates are necessary.
The presentation will explore customer case studies that demonstrate interim solutions and the extensive, ongoing research and development to find long-term solutions.
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ... (Hernan Costante)
Nowadays, in an increasingly complex and dynamic network, it's not enough to be a regex ninja and store only the logs you think you might need. From network traffic to custom logs, you won't know which logs will be crucial to stop the next attacker, and if you are not planning to spend half of your security budget on a commercial solution, we will show you a way to build your own SIEM with open source. The talk will go from how to build a powerful logging environment for your organization to scaling in the cloud and storing everything forever. We will walk through how to build such a system with open-source solutions such as Elasticsearch and Hadoop, and how to create your own custom monitoring rules to monitor everything you need. The talk will also cover how to secure the environment and allow restricted access to other teams, as well as how to avoid common pitfalls and meet compliance standards.
Speaker: Jay Runkel
When architecting a MongoDB application, one of the most difficult questions to answer is: how much hardware (number of shards, number of replicas, and server specifications) will I need for an application? Similarly, when deploying in the cloud, how do you estimate your monthly AWS, Azure, or GCP costs given a description of a new application? While there isn't a precise formula for mapping application features (e.g., document structure, schema, query volumes) into servers, there are various strategies you can use to estimate MongoDB cluster sizing. This presentation will cover the questions you need to ask and describe how to use this information to estimate the required cluster size or cloud deployment cost.
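As a hedged sketch of the kind of estimation strategy such a session covers, a back-of-the-envelope model might size shards so the hot data and indexes fit in aggregate cache (the model, names, and numbers are illustrative assumptions, not a MongoDB formula):

```python
import math

def estimate_cluster(doc_count, avg_doc_bytes, index_bytes_per_doc,
                     working_set_fraction, ram_per_shard_gb):
    """Back-of-the-envelope sizing: how many shards are needed so the
    hot data plus indexes fit in aggregate cache. Purely illustrative."""
    data_gb = doc_count * avg_doc_bytes / 1e9
    index_gb = doc_count * index_bytes_per_doc / 1e9
    hot_gb = data_gb * working_set_fraction + index_gb
    # WiredTiger's cache defaults to roughly half of RAM
    cache_per_shard_gb = ram_per_shard_gb * 0.5
    shards = max(1, math.ceil(hot_gb / cache_per_shard_gb))
    return {"data_gb": data_gb, "index_gb": index_gb, "shards": shards}

# 1B docs of ~500 bytes, ~40 bytes of index per doc, 20% hot, 64 GB RAM/shard
plan = estimate_cluster(1_000_000_000, 500, 40, 0.2, 64)
```

A model like this is only a first pass; query volumes and workload shape can easily dominate the answer.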
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas (MongoDB)
Moving to a new home is daunting. Packing up all your things, getting a vehicle to move it all, unpacking it, updating your mailing address, and making sure you did not leave anything behind. Well, the move to MongoDB Atlas is similar, but all the logistics are already figured out for you by MongoDB.
As cloud adoption has grown rapidly in the last decade, DBAs can add more value to systems and bring more scalability to the database server. This talk was presented at the Open Source India 2018 conference by Kabilesh and Manosh of Mydbops. They share experiences and value additions delivered to customers during their consulting engagements.
Benchmarking your cloud performance with top 4 global public clouds (data://disrupted®)
In this presentation, we will present the performance measurement metrics of leading cloud providers: AWS, Google Cloud, Microsoft Azure, and DigitalOcean. We'll give you useful tools to measure your own cloud performance and a handy guide on how to calculate cloud TCO (total cost of ownership). In addition, you'll learn how to correctly estimate your market positioning and perform better than the cloud giants.
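A TCO calculation of the kind the guide describes can be sketched as a simple sum of cost buckets; the categories and prices below are illustrative assumptions, not any provider's actual rates:

```python
def monthly_tco(instances, price_per_hour, storage_gb, price_per_gb_month,
                egress_gb, price_per_egress_gb, ops_hours, ops_hourly_rate):
    """Sum the usual TCO buckets: compute, storage, network egress, and
    the operations labour that comparisons often forget."""
    compute = instances * price_per_hour * 730  # ~hours in a month
    storage = storage_gb * price_per_gb_month
    egress = egress_gb * price_per_egress_gb
    ops = ops_hours * ops_hourly_rate
    return compute + storage + egress + ops

# 4 VMs at $0.10/h, 500 GB storage, 100 GB egress, 10 ops hours/month at $80/h
total = monthly_tco(4, 0.10, 500, 0.02, 100, 0.09, 10, 80)
```

The point of writing it out is that the non-compute buckets, especially egress and labour, are where cloud bills usually surprise people.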
Boyan Krosnov is a Co-Founder and Chief Product Officer of StorPool Storage. He has been part of the technical teams building 5 service providers from scratch in 4 countries. In most of these projects, he has designed the architecture, led the technical teams, and managed the implementation of projects in the millions.
Silicon Valley Code Camp 2014 - Advanced MongoDB (Daniel Coupal)
MongoDB presentation from Silicon Valley Code Camp 2014.
A walkthrough of developing, deploying, and operating a MongoDB application, avoiding the most common pitfalls.
MongoDB performance tuning and monitoring with MMS (Nicholas Tang)
Using the MongoDB Monitoring Service to monitor your MongoDB instance(s) and track down performance issues - including two real-world examples of how we tracked down problems using MMS to understand the environment, figure out what changed, and help us rapidly drill into a successful diagnosis.
Most database products have their own auditing functionality or plugins, but these always involve overhead, which means they end up turned off or with only the bare minimum enabled.
In this workshop we will show how to get reliable logging for MySQL and MongoDB servers in a scalable and non-intrusive way, its drawbacks, and how we can build our own open-source tools to achieve results similar to most commercial products.
Tools to sniff, process, and act upon queries will be shared, and we will show how simple it is to set up and monitor a database environment so it can be replicated and grow horizontally. All the code needed will be published.
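As a minimal sketch of the "sniff, process, act" idea, assuming MySQL-style slow-log lines (the rule, field pattern, and threshold are all hypothetical; a real deployment would ship matches to Elasticsearch or an alerting pipeline rather than return a list):

```python
import re

# Matches the Query_time field of a MySQL-style slow-log line.
SLOW_LINE = re.compile(r"Query_time:\s+(?P<secs>\d+\.\d+)")

def flag_slow(lines, threshold=1.0):
    """Return the query times that exceed the threshold: the 'act'
    step of a sniff/process/act monitoring rule."""
    hits = []
    for line in lines:
        m = SLOW_LINE.search(line)
        if m and float(m.group("secs")) > threshold:
            hits.append(float(m.group("secs")))
    return hits

log = ["# Query_time: 0.20  Lock_time: 0.00",
       "# Query_time: 3.51  Lock_time: 0.01"]
```

Because the rule only reads log lines, it adds no overhead to the database server itself, which is the non-intrusive property the workshop aims for.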
Similar to Evolution and Scaling of MongoDB Management Service Running on MongoDB (20)
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas (MongoDB)
During this talk we'll navigate through a customer's journey as they migrate an existing MongoDB deployment to MongoDB Atlas. While the migration itself can be as simple as a few clicks, the prep/post effort requires due diligence to ensure a smooth transfer. We'll cover these steps in detail and provide best practices. In addition, we’ll provide an overview of what to consider when migrating other cloud data stores, traditional databases and MongoDB imitations to MongoDB Atlas.
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts! (MongoDB)
These days, everyone is expected to be a data analyst. But with so much data available, how can you make sense of it and be sure you're making the best decisions? One great approach is to use data visualizations. In this session, we take a complex dataset and show how the breadth of capabilities in MongoDB Charts can help you turn bits and bytes into insights.
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel... (MongoDB)
MongoDB Kubernetes operator and MongoDB Open Service Broker are ready for production operations. Learn how MongoDB can be used with the most popular container orchestration platform, Kubernetes, and bring self-service, persistent storage to your containerized applications. A demo will show you how easy it is to enable MongoDB clusters as an External Service using the Open Service Broker API for MongoDB.
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB (MongoDB)
Are you new to schema design for MongoDB, or are you looking for a more complete or agile process than what you are following currently? In this talk, we will guide you through the phases of a flexible methodology that you can apply to projects ranging from small to large with very demanding requirements.
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T... (MongoDB)
Humana, like many companies, is tackling the challenge of creating real-time insights from data that is diverse and rapidly changing. This is our journey of how we used MongoDB to combine traditional batch approaches with streaming technologies to provide continuous alerting capabilities from real-time data streams.
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data (MongoDB)
Time series data is increasingly at the heart of modern applications - think IoT, stock trading, clickstreams, social media, and more. With the move from batch to real time systems, the efficient capture and analysis of time series data can enable organizations to better detect and respond to events ahead of their competitors or to improve operational efficiency to reduce cost and risk. Working with time series data is often different from regular application data, and there are best practices you should observe.
This talk covers:
Common components of an IoT solution
The challenges involved with managing time-series data in IoT applications
Different schema designs, and how these affect memory and disk utilization – two critical factors in application performance.
How to query, analyze and present IoT time-series data using MongoDB Compass and MongoDB Charts
At the end of the session, you will have a better understanding of key best practices in managing IoT time-series data with MongoDB.
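One common schema-design choice a talk like this compares is bucketing: storing many readings per document with pre-computed aggregates. Below is a minimal in-memory sketch of the pattern, with hypothetical field names:

```python
from collections import defaultdict

def bucket(samples, bucket_seconds=3600):
    """Group raw (sensor_id, epoch_seconds, value) samples into one
    bucket per sensor per hour, keeping running aggregates so common
    rollup queries never have to touch the raw measurements."""
    docs = defaultdict(lambda: {"measurements": [], "count": 0, "sum": 0.0})
    for sensor, ts, value in samples:
        start = ts - ts % bucket_seconds  # floor to the bucket boundary
        doc = docs[(sensor, start)]
        doc["measurements"].append({"ts": ts, "v": value})
        doc["count"] += 1
        doc["sum"] += value
    return dict(docs)

docs = bucket([("s1", 3600, 1.0), ("s1", 3660, 3.0), ("s1", 7200, 5.0)])
```

Fewer, larger documents mean fewer index entries and less per-document overhead, which is exactly the memory-and-disk trade-off between schema designs that the session discusses.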
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys] (MongoDB)
Our clients have unique use cases and data patterns that mandate the choice of a particular strategy. To implement these strategies, it is mandatory that we unlearn a lot of relational concepts while designing and rapidly developing efficient applications on NoSQL. In this session, we will talk about some of our client use cases, the strategies we have adopted, and the features of MongoDB that assisted in implementing these strategies.
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2 (MongoDB)
Encryption is not a new concept to MongoDB. Encryption may occur in-transit (with TLS) and at-rest (with the encrypted storage engine). But MongoDB 4.2 introduces support for Client Side Encryption, ensuring the most sensitive data is encrypted before ever leaving the client application. Even full access to your MongoDB servers is not enough to decrypt this data. And better yet, Client Side Encryption can be enabled at the "flick of a switch".
This session covers using Client Side Encryption in your applications. This includes the necessary setup, how to encrypt data without sacrificing queryability, and what trade-offs to expect.
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ... (MongoDB)
MongoDB Kubernetes operator is ready for prime time. Learn how MongoDB can be used with the most popular orchestration platform, Kubernetes, and bring self-service, persistent storage to your containerized applications.
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts! (MongoDB)
These days, everyone is expected to be a data analyst. But with so much data available, how can you make sense of it and be sure you're making the best decisions? One great approach is to use data visualizations. In this session, we take a complex dataset and show how the breadth of capabilities in MongoDB Charts can help you turn bits and bytes into insights.
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset (MongoDB)
When you need to model data, is your first instinct to start breaking it down into rows and columns? Mine used to be too. When you want to develop apps in a modern, agile way, NoSQL databases can be the best option. Come to this talk to learn how to take advantage of all that NoSQL databases have to offer and discover the benefits of changing your mindset from the legacy, tabular way of modeling data. We’ll compare and contrast the terms and concepts in SQL databases and MongoDB, explain the benefits of using MongoDB compared to SQL databases, and walk through data modeling basics so you feel confident as you begin using MongoDB.
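The mindset shift from rows and columns to documents can be shown in a few lines; this sketch (with made-up tables and fields) folds a normalized customers/orders pair into a single embedded document:

```python
# Normalized SQL-style rows: customers and orders joined by a foreign key.
customers = [{"id": 1, "name": "Ada"}]
orders = [{"customer_id": 1, "item": "book", "qty": 2},
          {"customer_id": 1, "item": "pen", "qty": 5}]

def embed(customers, orders):
    """Fold child rows into the parent document: one read, no join."""
    by_id = {c["id"]: {**c, "orders": []} for c in customers}
    for o in orders:
        by_id[o["customer_id"]]["orders"].append(
            {"item": o["item"], "qty": o["qty"]})
    return list(by_id.values())

docs = embed(customers, orders)
```

The embedded shape matches how the application reads the data, which is the core of the document-model argument: model what you access together, not what normalizes neatly.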
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart (MongoDB)
Join this talk and test session with a MongoDB Developer Advocate where you'll go over the setup, configuration, and deployment of an Atlas environment. Create a service that you can take back in a production-ready state and prepare to unleash your inner genius.
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin... (MongoDB)
Query performance should be the unsung hero of an application, but without proper configuration it can become a constant headache. When used properly, MongoDB provides extremely powerful querying capabilities. In this session, we'll discuss concepts like equality, sort, and range predicates, managing query predicates versus sequential predicates, and best practices for building multikey indexes.
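The "equality, sort, range" concepts mentioned above are usually summarized as the ESR ordering rule for compound indexes; here is a tiny hedged sketch with made-up field names:

```python
def esr_index(equality, sort, range_fields):
    """Order compound-index keys Equality, then Sort, then Range:
    the common rule of thumb for compound-index key ordering."""
    return [(field, 1) for field in equality + sort + range_fields]

# e.g. find({status: ..., amount: {$gt: ...}}).sort({created_at: 1})
keys = esr_index(["status"], ["created_at"], ["amount"])
```

Equality keys first let the index narrow to a contiguous run, the sort key then returns documents in order without an in-memory sort, and range keys go last because everything after them can no longer be used for ordering.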
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++ (MongoDB)
Aggregation pipeline has been able to power your analysis of data since version 2.2. In 4.2 we added more power and now you can use it for more powerful queries, updates, and outputting your data to existing collections. Come hear how you can do everything with the pipeline, including single-view, ETL, data roll-ups and materialized views.
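As a hedged example of the 4.2-era "outputting your data to existing collections" capability, a daily rollup ending in $merge might look like this (collection and field names are hypothetical; shown as the Python dicts a driver such as PyMongo would send):

```python
# Roll completed orders up into per-day totals and upsert them into a
# materialized-view collection. Names like daily_totals are made up.
pipeline = [
    {"$match": {"status": "complete"}},
    {"$group": {"_id": {"$dateToString": {"format": "%Y-%m-%d",
                                          "date": "$created_at"}},
                "total": {"$sum": "$amount"}}},
    {"$merge": {"into": "daily_totals", "whenMatched": "replace"}},
]
```

Re-running the pipeline refreshes the materialized view in place, which is the single-view/rollup use the abstract alludes to.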
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo... (MongoDB)
Are you new to schema design for MongoDB, or are you looking for a more complete or agile process than what you are following currently? In this talk, we will guide you through the phases of a flexible methodology that you can apply to projects ranging from small to large with very demanding requirements.
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive (MongoDB)
MongoDB Atlas Data Lake is a new service offered by MongoDB Atlas. Many organizations store long term, archival data in cost-effective storage like S3, GCP, and Azure Blobs. However, many of them do not have robust systems or tools to effectively utilize large amounts of data to inform decision making. MongoDB Atlas Data Lake is a service allowing organizations to analyze their long-term data to discover a wealth of information about their business.
This session will take a deep dive into the features that are currently available in MongoDB Atlas Data Lake and how they are implemented. In addition, we'll discuss future plans and opportunities and offer ample Q&A time with the engineers on the project.
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang (MongoDB)
Virtual assistants are becoming the new norm when it comes to daily life, with Amazon’s Alexa being the leader in the space. As a developer, not only do you need to make web and mobile compliant applications, but you need to be able to support virtual assistants like Alexa. However, the process isn’t quite the same between the platforms.
How do you handle requests? Where do you store your data and work with it to create meaningful responses with little delay? How much of your code needs to change between platforms?
In this session we’ll see how to design and develop applications known as Skills for Amazon Alexa powered devices using the Go programming language and MongoDB.
MongoDB .local Paris 2020: Realm: the secret ingredient for better app... (MongoDB)
…to Core Data, appreciated by hundreds of thousands of developers. Learn what makes Realm special and how it can be used to build better applications, faster.
MongoDB .local Paris 2020: Upply @MongoDB: Upply: When Machine Learning... (MongoDB)
It has never been easier to order online and be delivered in under 48 hours, very often for free. This simplicity of use hides a complex market worth more than $8 trillion.
Data is well known in the supply chain world (routes, goods information, customs, ...), but the value of this operational data remains largely untapped. By combining business expertise and data science, Upply is redefining the fundamentals of the supply chain, enabling every player to overcome the volatility and inefficiency of the market.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova... (Ramesh Iyer)
In today's fast-changing business world, companies that fail to adapt and embrace new ideas often struggle to keep up with the competition. However, fostering a culture of innovation takes real work. It takes vision, leadership, and a willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
GraphRAG is All You Need? LLM & Knowledge Graph (Guy Korland)
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024 (Albert Hoitingh)
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview, including the concepts of Customer Key and Double Key Encryption.
Connector Corner: Automate dynamic content and events by pushing a button (DianaGray10)
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Securing your Kubernetes cluster: a step-by-step guide to success! (KatiaHIMEUR1)
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Key Trends Shaping the Future of Infrastructure (Cheryl Hung)
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Epistemic Interaction - tuning interfaces to provide information for AI support (Alan Dix)
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Elevating Tactical DDD Patterns Through Object Calisthenics (Dorra BARTAGUIZ)
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Evolution and Scaling of MongoDB Management Service Running on MongoDB
1. Evolution and Scaling of MongoDB
Management Service Running on MongoDB
Steve Briskin, Lead Engineer, MMS Backup
John Morales, Senior Engineer, MMS Monitoring
2. Agenda
● What is MongoDB Management Service
● MMS Backup
o Schema evolution and optimizations
o How we scaled
● MMS Monitoring
o Read-optimized time series schema
o Write-optimized time series schema
o Benchmarks
3. MongoDB Management Service (MMS)
mms.mongodb.com
Automation and Provisioning
Single-click provisioning, scaling &
upgrades, administrative tasks
Monitoring
Charts, dashboards, and alerts on 100+
metrics
Backup
Backup and restore, with point-in-time
recovery, and support for sharded clusters
4. MMS Backup
● Cloud Backup service
● Takes periodic snapshots
● Manages storage
● Premium features
o Point in time recovery
o Consistent snapshots of sharded clusters
6. Oplog Store
● “Circular Buffer” of operations
o Many concurrent inserts
o Time-bound (e.g. 24 hours)
o Lifecycle: insert, read once, delete
● Concerns
o Lock contention
o Data purging
o Freelist fragmentation
MMS Backup
Ingestion
Oplog Store
7. Oplog Store
● Developed with MongoDB 2.2
● Lock Contention
o DB per customer
● Data Purging
o Use TTL Index
● Freelist fragmentation
o Use Power Of 2 Allocation
MMS Backup
Ingestion
Oplog Store
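The TTL-index approach from the bullet above can be sketched as a mongo shell fragment (assuming a collection named `entries` with a `ts` timestamp field in each per-customer database; the names are illustrative, not from the deck):

```javascript
// A TTL index lets MongoDB expire documents older than the retention
// window automatically, replacing manual purge jobs for the
// "insert, read once, delete" lifecycle described on slide 6.
db.entries.createIndex({ ts: 1 }, { expireAfterSeconds: 24 * 60 * 60 }) // 24 hours
```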
8. Oplog Store on MongoDB 3.0
• MongoDB 3.0
o More granular locking
o Freelist management improvements
Upgrade
Upgrade
30% Faster
9. Snapshot Storage (Blockstore)
● Backup Snapshot Storage
o File storage in MongoDB
● Design
o Block: 64KB - 15MB of binary data; SHA-256 hash as a unique identifier
o File: List of blocks
o Insert only schema
o De-duplication and compression
MMS Backup
Backup Process
Snapshot Store
10. Blockstore
● Insert Only + Power Of 2 Allocation = Wasted Space
o Example: 9k document will use 16k
o Worst case: Need 2x disk space
● Writes are sporadic
o Indexes are cold and need to be paged in
o Can be slow and I/O-expensive
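The wasted-space arithmetic above can be sketched as follows (the smallest bucket size is illustrative; the 9KB example is from the slide):

```javascript
// Power-of-2 allocation rounds every record up to the next power-of-2
// bucket. For an insert-only workload the padding is never reused, so
// the rounding is pure waste.
function powerOf2Alloc(bytes) {
  let size = 32; // illustrative smallest bucket
  while (size < bytes) size *= 2;
  return size;
}

const doc = 9 * 1024;             // the slide's 9KB document
const alloc = powerOf2Alloc(doc); // rounded up to 16KB
const wasted = alloc - doc;       // 7KB of padding per document
```

In the worst case a document just over a power of 2 nearly doubles its footprint, which is where the "need 2x disk space" bound comes from.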
11. Blockstore
● Disable Power Of 2 Allocation
o MongoDB 2.2 - 2.6:
db.runCommand({collMod: "collection",
usePowerOf2Sizes: false})
o MongoDB 3.0:
db.runCommand({collMod: "collection",
noPadding: true})
● Warm indexes before bulk insertions
db.runCommand({touch: "collection",
index: true,
data: false})
12. Scaling - Replica Sets
• Started with a single replica set
• Split into purpose-based replica sets
Blockstore (Large HDDs)
Primary
Secondary Secondary
Backup Metadata (Small SSDs)
Primary
Secondary Secondary Secondary Arbiter
Oplog Store (Small HDDs)
Primary
Secondary Secondary Secondary Arbiter
13. Scaling - Application Sharding
• Application sharding for horizontal scaling
• Each customer is assigned to one replica set
Application
Customer A
Customer B
Customer C
Blockstore_1
Primary
Secondary Secondary
Blockstore_2
Primary
Secondary Secondary
Blockstore_0
Primary
Secondary Secondary
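The customer-to-replica-set assignment above can be sketched as follows (a minimal sketch with a static list of blockstores and naive round-robin placement; the deck does not specify the placement policy, and in practice assignments would be persisted in backup metadata):

```javascript
// Application-level sharding: each customer is pinned to exactly one
// blockstore replica set so all of that customer's data stays together.
const replicaSets = ["Blockstore_0", "Blockstore_1", "Blockstore_2"];
const assignments = new Map(); // would be persisted metadata in practice

function blockstoreFor(customerId) {
  if (!assignments.has(customerId)) {
    // naive round-robin placement; a real system could place by load
    assignments.set(
      customerId,
      replicaSets[assignments.size % replicaSets.length]
    );
  }
  return assignments.get(customerId);
}
```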
16. Introduction to MMS Monitoring
Design Objectives circa 2012
• Fast chart load times
• Chart ~80 metrics per host
• Minute-level resolution
Inherent Advantages
• Control our own rate of samples
Browser Users
Monitoring Agent
Metric Data
Sharded Cluster
Customer Deployment
mms.mongodb.com
17. Circa 2012: Read-Optimized Schema
{
hid: "id", // Host ID
cid: ObjectId("..."), // Group ID
g: "network", // Metric group
i: "bytesOut", // Specific metric
mn: { // hour's worth of points stored together
"00": {
n: NumberLong("..."), // value
t: 1430918626 // time
},
"01": {
...
},
...,
"59": { ... }
}
}
● Store points for same metric together
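With this layout, writing a sample means updating one minute slot ("00"–"59") inside the hour document. A minimal sketch of building that update, assuming the field names from the schema above (the helper name is illustrative):

```javascript
// Each new sample becomes a $set of a single minute slot inside the
// hour document, keeping all points for one metric together.
function minuteSlotUpdate(epochSeconds, value) {
  const minute = String(new Date(epochSeconds * 1000).getUTCMinutes())
    .padStart(2, "0");
  return { $set: { ["mn." + minute]: { n: value, t: epochSeconds } } };
}
```

Reading a 60-minute chart then touches at most two such documents, which is what makes this schema read-optimized.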
18. Circa 2012: Scaling up Writes
● Write Performance when Read-Optimized
○ Updates $set the time and value sub-doc
○ Documents grow, move on disk
○ I/O mostly random
● Mitigate
○ Ensure updates always in-place (MMAPv1-only)
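One common way to guarantee in-place updates under MMAPv1 is pre-allocation (a sketch of the general technique, not necessarily exactly what MMS did): create all 60 minute slots when the hour document is first written, so later $sets never grow the document and never force it to move on disk.

```javascript
// Pre-allocate every minute slot with fixed-size placeholders; later
// updates overwrite a slot of the same shape, so the document's size
// never changes and MMAPv1 can update it in place.
function preallocatedHourDoc(hid, group, metric) {
  const mn = {};
  for (let m = 0; m < 60; m++) {
    mn[String(m).padStart(2, "0")] = { n: 0, t: 0 }; // placeholder slot
  }
  return { hid, g: group, i: metric, mn };
}
```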
19. Circa 2012 to Today: Performance
Hooray
● Average chart load time: 15ms
● Today MMS actively monitoring 60k+ hosts
● Storing average of 128 metrics per host
20. 2015: What’s Next
Upcoming MMS Monitoring features
● High resolution monitoring
● Charting more metrics
Whoops
● Read-optimized schema inflexible
● Each new metric means new writes
21. Coming Soon: Write-Optimized
{
"_id" : “...”,
"n" : { // network
"bi" : NumberLong(123), // bytesIn
"bo" : NumberLong(234), // bytesOut
"r" : NumberLong(34), // requests
},
"e" : { // page faults
"pf" : NumberLong(3564),
},
"g" : { // queues
"cr" : NumberLong(3564),
},
...,
// sample time for all these points
"t" : ISODate("2015-06-02T15:35:43.189Z")
}
● Store points across metrics together
● Insert-only versus random updates
22. Benchmarking and Tradeoffs
Writes: time (millis) to ingest 4500 hosts
Read Latency: millis to read 24-hour chart
200x more write throughput
~18ms latency tradeoff
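The speaker notes give the raw numbers behind the "200x" figure: ingesting metrics from 4500 hosts took ~10 seconds with the read-optimized schema and ~50 milliseconds with the write-optimized one. As simple arithmetic:

```javascript
// Throughput comparison from the benchmark slide: full ingest of
// metrics from 4500 hosts under each schema.
const readOptimizedMs = 10000; // ~10 seconds (old schema)
const writeOptimizedMs = 50;   // ~50 millis (new schema)
const speedup = readOptimizedMs / writeOptimizedMs; // 200x throughput
```

The ~18ms figure is the other side of the tradeoff: a 24-hour chart now reads more (smaller) documents, so read latency grows slightly.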
23. Wrap Up and Q & A
● Tailoring configuration for workload
● Schema design and managing tradeoffs
● IOPS often the limiting resource
Editor's Notes
We’re talking architecture and evolution. We are not MongoDB kernel engineers; we are internal customers and use MongoDB just like you.
Automation: simplifies deployment of new clusters. Automatic provisioning of hosts in AWS and deployment and configuration of MongoDB. Orchestrates upgrades and other admin tasks.
Monitoring: Captures and alerts on critical metrics. You can’t optimize what you can’t measure. Pre-empts problems.
Backup: Once you have everything running, you need to back it up. Backup takes periodic snapshots of your data, offers some value added features like PIT recovery and consistent cluster snapshots, manages storage, and monitors for health.
Mention Ops Manager!
Cloud backup service
Capture write operations and rebuild that dataset on our servers
Take periodic snapshots
Result: fully managed backups with low overhead for the customer -- low system impact, low development and operational overhead.
Many customers = many concurrent writes = lock contention
Best way to purge old data
Freelist fragmentation that leads to wasted disk space. Common issue with insert/delete patterns.
Designed and developed ~3 years ago.
Power of 2 allocation allocates and frees space in more predictable chunks. Introduced in MongoDB 2.2, it was made the default in 2.6. All allocations are rounded up to the next power of 2; for example, 3.5KB -> 4KB.
Overall we are very happy with this decision.
Over the years we upgraded to 2.4, 2.6, and recently to 3.0.
March 3rd.
Locking - ~30% decrease in insert latency
Freelist – Reclaimed ~1TB of disk per month
Since oplogs are a circular buffer and customer activity is generally steady w/ a slight increase over time we should expect storage to be mostly flat with a slight increase.
We are using MongoDB to store MongoDB data files. Think of it as a filesystem backed by MongoDB.
Insert only pattern -- no deletes, mark and sweep for deleting old data.
Touch on other optimizations like de-duplication, compression, high availability, and scalability.
In most cases power of 2 is better. It is a MUST for update/delete patterns. But this case is different.
Wrap up: Different use cases require different tuning. MongoDB is flexible and provides all the power we need.
pros and cons
Operationally easier initially -- isolation, fewer components, fewer network hops
Small customers -- no one exceeded a single replica set. Assumption that usage would average out between customers sharing a blockstore
Load didn’t average out
Blockstores became either disk space bound or IO bound
Not hard to balance on either, very hard to balance on both.
We shard every collection
Our use case is perfectly suited for this since most data already has a hashed _id, so there is no additional hashed index requirement.
Load should be equally distributed. And it is!
Operationally easier since it looks like one large end-point. We use automation to manage it.
This is the topology that we’re currently migrating to and we’re very happy with the results thus far.
That concludes the backup portion of the talk. Now I’ll hand it over to John who’ll talk about schema evolution of our monitoring time series data.
Thanks Steve. I’m going to switch gears: a tale of two schemas and their tradeoffs.
Not a prescription of favoring one schema design over another, but rather a history of how the design evolved to meet changing application requirements
To begin, let me take you back to 2012, when MMS Monitoring was first getting off the ground: let’s build a SaaS monitoring product for MongoDB deployments.
We started with two broad goals: monitoring data as granular as it can be, without sacrificing responsiveness.
Arrived at these 3 objectives
From there we had to decide how to get metric data in, and eventually landed on the architecture shown in the high-level diagram on the left, which has few moving pieces.
One advantage of using an agent is that we control the sample rate and can make it a fixed interval.
How we’re going to store this metric data to meet our objectives, and where we ended up…
..is a Read-Optimized schema. Where the main idea is..
Here we have this cutout of the MMS user interface showing just 4 different charts about a mongod host, and on right how one series is laid out in the schema.
But how many points? How do we ensure an upper bound on the number of points? Well, remember I mentioned the monitoring agent..
What’s my network out over the past hour? Read only 2 documents.
For every metric you see, there will be similar documents for the different metric types (THERE’S MUCH TO TALK ABOUT ON THIS SLIDE)
Loading all 13 data series on this slide means 780 data points, but only 26 documents read.
Great, so looking like we can expect reads to be fast (after all, says read-optimized right on the tin) but what about getting the data in..
Write tradeoff -- existing documents mean updates that create subdocuments, which means random I/O
And so now three years later, how’d we do..
98%ile ~350ms most recently.
Looking ahead..
One of the things that guides the evolution of a product is user feedback -- you guys! -- and we’re listening to what we need to do.
Minute-level really ought to be enough for anyone, so we locked the lowest resolution our SCHEMA can express at 1 minute.
Each new metric means new documents per host. So like I mentioned, 60,000 is the current number of hosts. So one new metric means 60,000 more writes per minute.
In general, if we want to double the number of hosts and double the number of metrics, that’s 4x the number of documents. Can we do better?
Quadratic with increase of hosts and metrics
“As you can see, we put every metric into the one document, key abbreviation, etc.”
An insert-only workload means a more sequential access pattern for spinning disks
Scalable design: more samples = more documents, but more metrics != more documents
Per host, 1 large insert vs many small random updates
Time to completely ingest metrics from 4500 hosts was ~10 seconds; with the new schema design it is now 50 millis.
Write IOPS: 35x fewer
Only ~18ms cost to reading more documents
Now we’ve spoken about different ways to ingest data for write-heavy applications.
How we tailor MongoDB configuration for our workload – we saw one case (the oplog store) where the delete pattern makes usePowerOf2Sizes essential, and another (the blockstore) where it’s undesirable
Tradeoffs to read optimized and write-optimized schemas
Always tradeoffs - working to find balance between the two. neither approach is strictly superior
Write-heavy applications measure success in IOPS. Drawing down from budget - spend judiciously. Optimizing your access pattern for your disks
We scale with you guys and are fortunate that MongoDB has the flexibility to meet these different access patterns and use cases.
Recruit: the more people use it, the bigger MMS needs to be.
Recruit: MMS is a big focus of investment; if you found the contents of this talk interesting, you could be making a career out of it. Feel free to ask me about life at MongoDB or find one of our recruiters at the MongoDB booth.