Charity Majors works as a systems engineer at Parse, a platform for mobile developers. Parse uses MongoDB for various purposes, including storing user data, DDoS protection and query profiling, and analytics for billing and logging. Charity provides advice on best practices for running MongoDB in production environments at scale, such as using replica sets, taking regular snapshots, automating setup and maintenance with Chef, and using provisioned IOPS volumes to improve performance.
A Fotopedia presentation given at MongoDay 2012 in Paris, at the Xebia office.
Talk by Pierre Baillet and Mathieu Poumeyrol.
French article about the presentation:
http://www.touilleur-express.fr/2012/02/06/mongodb-retour-sur-experience-chez-fotopedia/
Video to come.
Relational databases are used extensively in many applications and systems, but they are not always the best data store solution to the problem at hand. In this session we discuss the limitations of RDBMS and show which NoSQL solutions can be used to overcome these limitations. We also cover migration topics, such as how to add NoSQL databases without adding complexity to your development and operations.
NoSQL datastores fall into the following categories: key-value stores, document databases, column-family stores, and graph databases. The traditional TPC-* tests are not sufficient for these heterogeneous database systems. MongoDB, CouchDB, Cassandra, HBase, Memcached, and others each belong to one of these four families, and YCSB can generate a common workload to simulate your use case and benchmark them.
For our eReader development project, we had to find persistent storage for our JSON documents. After an initial scan, we zeroed in on two products: DynamoDB and MongoDB. These slides take a deeper dive into the selection of our JSON data store.
Deploying any software can be a challenge if you don't understand how resources are used or how to plan for the capacity of your systems. Whether you need to deploy or grow a single MongoDB instance, a replica set, or tens of sharded clusters, you probably share the same challenges in trying to size that deployment.
Powering Microservices with Docker, Kubernetes, Kafka, and MongoDB - MongoDB
Speaker: Andrew Morgan, Principal Product Marketing Manager, MongoDB
Level: 100 (Beginner)
Track: Microservices
Organizations are building their applications around microservice architectures because of the flexibility, speed of delivery, and maintainability they deliver. Want to try out MongoDB on your laptop? Execute a single command and you have a lightweight, self-contained sandbox; another command removes all trace when you're done. Replicate your complete application for your development, test, operations, and support teams. This session introduces you to technologies such as Docker, Kubernetes, and Kafka, which are driving the microservices revolution. Learn about containers and orchestration – and most importantly, how to exploit them for stateful services such as MongoDB.
What You Will Learn:
- Why organizations are choosing to use microservice architectures, what benefits they deliver, and when they should - or shouldn't - be used.
- Technologies that are used to build microservices – including containers, orchestration and messaging systems.
- Why MongoDB is a good fit for microservices and what special steps need to be taken to make them work well together.
When dealing with infrastructure, we often go through the process of determining the resources needed to meet our application requirements. This talk looks into how MongoDB uses resources and which aspects should be considered to determine the sizing, capacity, and deployment of a MongoDB cluster, given the different scenarios, sets of operations, and storage engines available.
Practical Design Patterns for Building Applications Resilient to Infrastructu... - MongoDB
Speaker: Feng Qu, Sr MTS, eBay
Level: 200 (Intermediate)
Track: Developer
Building applications resilient to infrastructure failure is essential to systems that run in distributed environments, including those with a MongoDB database. For example, failure can come from computer resources, such as nodes, network switches, or the entire data center. On occasion, MongoDB nodes may be marked down by Operations to perform administrative tasks, such as a software upgrade, adding extra capacity, etc.
In this session, we will discuss how to build resilient applications using design patterns suited to enterprise-class MongoDB applications.
What You Will Learn:
- How to manage updates within a resilient architecture.
- Design patterns for resilient applications.
- Practical advice for deploying resilient enterprise applications.
Cassandra @ Sony: The good, the bad, and the ugly part 2 - DataStax Academy
This talk covers scaling Cassandra to a fast-growing user base. Alex and Isaias will cover new best practices and how to work with the strengths and weaknesses of Cassandra at large scale. They will discuss how to adapt to bottlenecks while providing a rich feature set to the PlayStation community.
Deploying any software can be a challenge if you don't understand how resources are used or how to plan for the capacity of your systems. Whether you need to deploy or grow a single MongoDB instance, a replica set, or tens of sharded clusters, you probably share the same challenges in trying to size that deployment.
This webinar will cover what resources MongoDB uses, and how to plan for their use in your deployment. Topics covered will include understanding how to model and plan capacity needs for new and growing deployments. The goal of this webinar will be to provide you with the tools needed to be successful in managing your MongoDB capacity planning tasks.
Speaker: Michael Cahill, Director of Engineering (Storage), MongoDB
Level: 300 (Advanced)
Track: How We Build MongoDB
When the WiredTiger storage engine was created, the use case we had in mind was applications with modest numbers of collections. That led to various choices during the design, such as storing each collection in a separate file. However, MongoDB customers have an enormous variety of use cases, including multi-tenant applications where each user has a separate database and each database contains hundreds of collections.
To support these applications efficiently, we have evolved the storage layer with better testing, better analysis tools, and more scalable data structures and algorithms. This session will explain how we can now run workloads with over a million collections.
What You Will Learn:
- How MongoDB represents collections and indexes in the storage layer.
- The system resources involved in accessing multiple collections and indexes from different client connections.
- The system limits MongoDB may hit as you add more collections, and how to increase or work around those limits.
MongoDB Introduction talk at Dr Dobbs Conference, MongoDB Evenings at Bangalo... - Prasoon Kumar
MongoDB is a leading NoSQL database. It is a horizontally scalable document datastore. In this introduction, given at the Dr Dobbs Conference in Bangalore and Pune in April 2014, I show schema design with an example blog application and Python code snippets. I delivered the same talk at the maiden MongoDB Evening events in Delhi and Gurgaon in May 2014.
When constructing a data model for the MongoDB collections behind a CMS, there are various options you can choose from, each of which has its strengths and weaknesses. The three basic patterns are:
1. Store each comment in its own document.
2. Embed all comments in the “parent” document.
3. A hybrid design that stores comments separately from the “parent” but aggregates them into a small number of documents, each containing many comments.
Code samples and wiki documentation are available at https://github.com/prasoonk/mycms_mongodb/wiki.
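The three patterns above can be sketched as plain Python dictionaries, without a database connection. This is only an illustration of the document shapes: the field names (`post_id`, `page`, `count`), the function names, and the bucket size of 3 are assumptions made for the example and are not taken from the linked repository.

```python
from typing import Any

Comment = dict[str, Any]


def one_document_per_comment(post_id: str, comments: list[Comment]) -> list[Comment]:
    """Pattern 1: every comment is its own document and points at its parent."""
    return [{"post_id": post_id, **comment} for comment in comments]


def embedded_in_parent(post_id: str, comments: list[Comment]) -> dict[str, Any]:
    """Pattern 2: the parent document carries the full list of comments."""
    return {"_id": post_id, "comments": list(comments)}


def hybrid_buckets(
    post_id: str, comments: list[Comment], bucket_size: int = 3
) -> list[dict[str, Any]]:
    """Pattern 3: comments live outside the parent, grouped into a few
    bucket documents that each hold up to `bucket_size` comments."""
    buckets = []
    for start in range(0, len(comments), bucket_size):
        chunk = comments[start:start + bucket_size]
        buckets.append(
            {
                "post_id": post_id,
                "page": start // bucket_size,  # hypothetical bucket index
                "count": len(chunk),
                "comments": chunk,
            }
        )
    return buckets


if __name__ == "__main__":
    sample = [{"author": f"user{i}", "text": f"comment {i}"} for i in range(7)]
    print(len(one_document_per_comment("post-1", sample)), "separate documents")
    print(len(embedded_in_parent("post-1", sample)["comments"]), "embedded comments")
    print([b["count"] for b in hybrid_buckets("post-1", sample)], "comments per bucket")
```

Roughly, pattern 1 favors flexible querying of individual comments, pattern 2 favors reading a post and its comments in one fetch, and pattern 3 trades between the two by bounding how large any single document can grow.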
When it comes time to select database software for your project, there are a bewildering number of choices. How do you know if your project is a good fit for a relational database, or whether one of the many NoSQL options is a better choice?
In this webinar you will learn when to use MongoDB and how to evaluate if MongoDB is a fit for your project. You will see how MongoDB's flexible document model is solving business problems in ways that were not previously possible, and how MongoDB's built-in features allow running at scale.
Topics covered include:
Performance and Scalability
MongoDB's Data Model
Popular MongoDB Use Cases
Customer Stories
Security is more critical than ever with new computing environments in the cloud and expanding access to the internet. There are a number of security protection mechanisms available for MongoDB to ensure you have a stable and secure architecture for your deployment. We'll walk through general security threats to databases and specifically how they can be mitigated for MongoDB deployments. Topics will include general security tools and how to configure those for MongoDB, an overview of security features available in MongoDB, including LDAP, SSL, x.509 and Authentication.
Some of the most common questions we hear from users relate to capacity planning and hardware choices. How many replicas do I need? Should I consider sharding right away? How much RAM will I need for my working set? SSD or HDD? No one likes spending a lot of cash on hardware, and cloud bills can be just as painful. MongoDB is different from traditional RDBMSs in its resource management, so you need to be mindful when deciding on the cluster layout and hardware. In this talk we will review the factors that drive the capacity requirements: volume of queries, access patterns, indexing, and working set size, among others. Attendees will gain additional insight as we go through a few real-world scenarios, as experienced with MongoDB Inc customers, and come up with their ideal cluster layout and hardware.
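As a rough illustration of the "how much RAM for my working set?" question, the sketch below uses a common rule of thumb: working set is approximately total index size plus the frequently accessed fraction of the data. The formula, the function names, the 75% headroom figure, and the sample sizes are all assumptions made for this example; they are not taken from the talk, and real sizing should be based on measured access patterns.

```python
def estimate_working_set_gb(index_gb: float, data_gb: float, hot_fraction: float) -> float:
    """Rule-of-thumb estimate: all indexes plus the 'hot' share of the data."""
    if not 0.0 <= hot_fraction <= 1.0:
        raise ValueError("hot_fraction must be between 0 and 1")
    return index_gb + data_gb * hot_fraction


def fits_in_ram(working_set_gb: float, ram_gb: float, usable_share: float = 0.75) -> bool:
    """Check the estimate against the share of RAM assumed usable for caching
    (the 0.75 default is a hypothetical headroom value)."""
    return working_set_gb <= ram_gb * usable_share


if __name__ == "__main__":
    # Hypothetical deployment: 20 GB of indexes, 400 GB of data, 25% of it hot.
    working_set = estimate_working_set_gb(index_gb=20, data_gb=400, hot_fraction=0.25)
    for ram_gb in (128, 256):
        print(f"{ram_gb} GB RAM sufficient: {fits_in_ram(working_set, ram_gb)}")
```

If the estimate does not fit on a single node, that is one signal to consider more memory or sharding, which is the kind of trade-off the talk sets out to examine.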
MongoDB 2.6 is the biggest MongoDB release ever. In this presentation you are going to explore which features, improvements and capabilities were added to the latest version and how you can smoothly upgrade your deployments.
An Elastic Metadata Store for eBay’s Media Platform - MongoDB
To build robust, multi-tenant, highly available storage services that meet the business's SLA, your databases have to be sharded. But if your service has to scale continuously through incremental additions of storage, without service interruption or human intervention, basic static sharding is not enough. At eBay, we are building MStore to solve this problem, with MongoDB as the storage engine. In this presentation, we will dive into the key design concepts of this solution.
We provide an overview of the expressive object model, secondary indexes, high availability, write scalability, query language support, performance benchmarks (database model, load characteristics, and consistency requirements), ease of use, and navigation aggregation.
Capacity Planning For Your Growing MongoDB Cluster - MongoDB
Your MongoDB deployment is growing, but are you prepared for that growth? Capacity planning is an essential practice when deploying any database system. You need to understand your usage patterns and determine the appropriate hardware based on your application's needs. Scaling reads and scaling writes will require different types of resources. With the proper tools in place, you can understand your working set, gain visibility into when it's time to add resources or start sharding and avoid performance issues. In this session, you'll learn how to use MongoDB Management Service and other tools to identify patterns and predict growth, ensuring your success with MongoDB.
Fondazione Bruno Kessler - Trento - Research seminar - 11 September 2013
The seminar presents the key features of VirtualSense, an open-HW platform for ultra-low-power wireless sensor networks, and discusses the state of the art, the open issues, and the ongoing research activities in the field of wireless sensor networks.
Also available at:
http://prezi.com/zx0hhbohspk8/virtualsense-and-the-state-of-the-art-in-ultra-low-power-wsns-short/
Challenges in opening up qualitative research data - lifeofdata
Reflections on the challenges encountered in enabling open research data for the Secret Life of a Weather Datum project. A pecha kucha presentation given at the iFutures 2015 PGR conference, Information School, University of Sheffield, July 2015.
Three leadership frameworks (Leadership Styles, Authentic Leadership, and Cynefin) are presented and applied to four different leaders, bearing in mind coaching within the three domains of leadership (the challenge leaders need to tackle, the context leaders operate in, and what the leaders bring into the equation).
YouTube presentation: http://www.youtube.com/watch?v=r-NI5jyzX7U
This was my assignment for the Leadership and Organisational Coaching module.
Strongly Typed Languages and Flexible Schemas - Norberto Leite
We like to use strongly typed languages alongside flexible-schema databases. What challenges do we face, and what strategies can we use, to maintain data coherence and format validation with tools and techniques such as ODMs, versioning, and migrations? We also review the tradeoffs of these strategies.
Big Data is on every CIO’s mind. It is presently synonymous with open source technologies like Hadoop, and the ‘NoSQL’ class of databases. Another technology that is shaking things up in Big Data is R (www.r-project.org, #rstats). R is an open source programming language and software environment designed for statistical computing and visualisation. The statistical software R is the fastest growing analytics platform in the world, and is established in both academia and companies for robustness, reliability and accuracy. For real big data analyses you have to access your data in your preferred database on the fly. In this talk I will give a short overview of R and the available connections to MongoDB, and present some big data analyses using R and MongoDB.
Any Data, Any Analytics, Simplified. Pentaho Business Analytics 5.0, purpose-built for the future of analytics, provides an open, unified platform to access, integrate and blend any data, in any environment, across a full spectrum of analytics. This presentation corresponds to a live demo of Pentaho Business Analytics, with special emphasis on what is new in Pentaho 5.0.
Ricardo Pires - BI Division Manager & Pentaho Official Trainer, @Xpand IT
Xpand IT presentation during the Pentaho & Big Data Ecosystem - Live Seminar 2013
In May of 2012, Socialcam exploded, gaining tens of millions of new users in just a few weeks. At the time, the service ran on 15 servers in a co-location facility in San Francisco. To meet new user traffic demands and continue to deliver maximum user satisfaction, Socialcam made the move to cloud services. With only two engineers and a constant barrage of users, there was limited time for technical transition, but Socialcam endured with no significant downtime. In this technical session, Socialcam co-founders Guillaume Luccisano and Ammon Bartram talk about their experience scaling Socialcam. They present the challenges they encountered, how they addressed them, and the technologies they used in the process. They focus particularly on how they used Amazon services in conjunction with their own hardware to keep Socialcam active with no significant downtime and no costly system redesign.
Understanding Elastic Block Store Availability and Performance - Amazon Web Services
Depending on your application needs, Elastic Block Store’s volumes can be configured for optimal performance and higher availability. In this session, we will present the different design characteristics of EBS Standard and Provisioned IOPS volumes, provide technical insights on how to think about EBS performance and availability, and share best practices to achieve higher availability and performance.
2019 PHP Serbia - Boosting your performance with Blackfire - Marko Mitranić
We aim to dispel the notion that large PHP applications tend to be sluggish, resource-intensive and slow compared to what the likes of Python, Erlang or even Node can do. The issue is not with optimising PHP internals - it's the lack of proper introspection tools and getting them into our everyday workflow that counts! In this workshop we will talk about our struggles with whipping PHP applications into shape, as well as work together on some of the more interesting examples of CPU or IO drain.
RDS for MySQL, No BS Operations and Patterns - Laine Campbell
Amazon's RDS for MySQL is a wonderful tool with significant value. It can also create a lot of havoc if you are not aware of its limitations and changes before you make it a core part of your environment. In this deck, we discuss those issues.
A presentation on the selection criteria, testing and evaluation, and successful zero-downtime migration to MongoDB. Details on Wordnik's speed and stability are also covered, as well as how NoSQL technologies have changed the way Wordnik scales.
Scaling on EC2 in a fast-paced environment (LISA'11 - Full Paper) - Nicolas Brousse
Managing a server infrastructure in a fast-paced environment like a start-up is challenging. You have little time for provisioning, testing, and planning, but you still need to prepare for scaling when your product reaches the tipping point. Amazon EC2 is one of the cloud providers that we experimented with while growing our infrastructure from 20 servers to 500 servers. In this paper we go over the pros and cons of managing EC2 instances with a mix of Bind, LDAP, SimpleDB and Python scripts; how we kept a smooth working process by using NFS, auto-mount and shell scripting; why we switched from managing our instances with tailor-made AMIs and shell scripting to the official Ubuntu AMI, cloud-init and Puppet; and finally, some rules we had to follow carefully to be able to handle billions of daily non-static HTTP requests across multiple Amazon EC2 regions.
Covers a broad overview of how to use AWS for building a scalable web app. Covers some of the AWS services in depth, and also gives recommendations on many services.
SUE 2018 - Migrating a 130TB Cluster from Elasticsearch 2 to 5 in 20 Hours Wi... - Fred de Villamil
The talk I gave at the Snow Unix Event in the Netherlands about upgrading a massive production Elasticsearch cluster from one major version to another without downtime and with a complete rollback plan.
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas - MongoDB
During this talk we'll navigate through a customer's journey as they migrate an existing MongoDB deployment to MongoDB Atlas. While the migration itself can be as simple as a few clicks, the prep/post effort requires due diligence to ensure a smooth transfer. We'll cover these steps in detail and provide best practices. In addition, we’ll provide an overview of what to consider when migrating other cloud data stores, traditional databases and MongoDB imitations to MongoDB Atlas.
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts! - MongoDB
These days, everyone is expected to be a data analyst. But with so much data available, how can you make sense of it and be sure you're making the best decisions? One great approach is to use data visualizations. In this session, we take a complex dataset and show how the breadth of capabilities in MongoDB Charts can help you turn bits and bytes into insights.
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel... - MongoDB
The MongoDB Kubernetes operator and the MongoDB Open Service Broker are ready for production operations. Learn how MongoDB can be used with the most popular container orchestration platform, Kubernetes, and bring self-service, persistent storage to your containerized applications. A demo will show you how easy it is to enable MongoDB clusters as an External Service using the Open Service Broker API for MongoDB.
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB (MongoDB)
Are you new to schema design for MongoDB, or are you looking for a more complete or agile process than what you are following currently? In this talk, we will guide you through the phases of a flexible methodology that you can apply to projects ranging from small to large with very demanding requirements.
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T... (MongoDB)
Humana, like many companies, is tackling the challenge of creating real-time insights from data that is diverse and rapidly changing. This is our journey of how we used MongoDB to combine traditional batch approaches with streaming technologies to provide continuous alerting capabilities from real-time data streams.
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data (MongoDB)
Time series data is increasingly at the heart of modern applications - think IoT, stock trading, clickstreams, social media, and more. With the move from batch to real time systems, the efficient capture and analysis of time series data can enable organizations to better detect and respond to events ahead of their competitors or to improve operational efficiency to reduce cost and risk. Working with time series data is often different from regular application data, and there are best practices you should observe.
This talk covers:
Common components of an IoT solution
The challenges involved with managing time-series data in IoT applications
Different schema designs, and how these affect memory and disk utilization – two critical factors in application performance.
How to query, analyze and present IoT time-series data using MongoDB Compass and MongoDB Charts
At the end of the session, you will have a better understanding of key best practices in managing IoT time-series data with MongoDB.
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys] (MongoDB)
Our clients have unique use cases and data patterns that mandate the choice of a particular strategy. To implement these strategies, it is mandatory that we unlearn a lot of relational concepts while designing and rapidly developing efficient applications on NoSQL. In this session, we will talk about some of our client use cases, the strategies we have adopted, and the features of MongoDB that assisted in implementing these strategies.
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2 (MongoDB)
Encryption is not a new concept to MongoDB. Encryption may occur in-transit (with TLS) and at-rest (with the encrypted storage engine). But MongoDB 4.2 introduces support for Client Side Encryption, ensuring the most sensitive data is encrypted before ever leaving the client application. Even full access to your MongoDB servers is not enough to decrypt this data. And better yet, Client Side Encryption can be enabled at the "flick of a switch".
This session covers using Client Side Encryption in your applications. This includes the necessary setup, how to encrypt data without sacrificing queryability, and what trade-offs to expect.
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ... (MongoDB)
The MongoDB Kubernetes operator is ready for prime time. Learn about how MongoDB can be used with the most popular orchestration platform, Kubernetes, and bring self-service, persistent storage to your containerized applications.
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts! (MongoDB)
These days, everyone is expected to be a data analyst. But with so much data available, how can you make sense of it and be sure you're making the best decisions? One great approach is to use data visualizations. In this session, we take a complex dataset and show how the breadth of capabilities in MongoDB Charts can help you turn bits and bytes into insights.
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset (MongoDB)
When you need to model data, is your first instinct to start breaking it down into rows and columns? Mine used to be too. When you want to develop apps in a modern, agile way, NoSQL databases can be the best option. Come to this talk to learn how to take advantage of all that NoSQL databases have to offer and discover the benefits of changing your mindset from the legacy, tabular way of modeling data. We’ll compare and contrast the terms and concepts in SQL databases and MongoDB, explain the benefits of using MongoDB compared to SQL databases, and walk through data modeling basics so you feel confident as you begin using MongoDB.
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart (MongoDB)
Join this talk and test session with a MongoDB Developer Advocate where you'll go over the setup, configuration, and deployment of an Atlas environment. Create a service that you can take back in a production-ready state and prepare to unleash your inner genius.
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin... (MongoDB)
Query performance should be the unsung hero of an application, but without proper configuration it can become a constant headache. When used properly, MongoDB provides extremely powerful querying capabilities. In this session, we'll discuss concepts like equality, sort, range, managing query predicates versus sequential predicates, and best practices for building multikey indexes.
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++ (MongoDB)
Aggregation pipeline has been able to power your analysis of data since version 2.2. In 4.2 we added more power and now you can use it for more powerful queries, updates, and outputting your data to existing collections. Come hear how you can do everything with the pipeline, including single-view, ETL, data roll-ups and materialized views.
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo... (MongoDB)
Are you new to schema design for MongoDB, or are you looking for a more complete or agile process than what you are following currently? In this talk, we will guide you through the phases of a flexible methodology that you can apply to projects ranging from small to large with very demanding requirements.
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive (MongoDB)
MongoDB Atlas Data Lake is a new service offered by MongoDB Atlas. Many organizations store long term, archival data in cost-effective storage like S3, GCP, and Azure Blobs. However, many of them do not have robust systems or tools to effectively utilize large amounts of data to inform decision making. MongoDB Atlas Data Lake is a service allowing organizations to analyze their long-term data to discover a wealth of information about their business.
This session will take a deep dive into the features that are currently available in MongoDB Atlas Data Lake and how they are implemented. In addition, we'll discuss future plans and opportunities and offer ample Q&A time with the engineers on the project.
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang (MongoDB)
Virtual assistants are becoming the new norm when it comes to daily life, with Amazon’s Alexa being the leader in the space. As a developer, not only do you need to make web and mobile compliant applications, but you need to be able to support virtual assistants like Alexa. However, the process isn’t quite the same between the platforms.
How do you handle requests? Where do you store your data and work with it to create meaningful responses with little delay? How much of your code needs to change between platforms?
In this session we’ll see how to design and develop applications known as Skills for Amazon Alexa powered devices using the Go programming language and MongoDB.
MongoDB .local Paris 2020: Realm: the secret ingredient for better app... (MongoDB)
to Core Data, appreciated by hundreds of thousands of developers. Learn what makes Realm special and how it can be used to build better applications faster.
MongoDB .local Paris 2020: Upply @MongoDB: Upply: When Machine Learning... (MongoDB)
It has never been easier to order online and get delivery in under 48 hours, very often for free. This ease of use hides a complex market worth more than $8 trillion.
Data is familiar territory in the supply chain world (routes, goods information, customs, ...), but the value of this operational data remains largely untapped. By combining domain expertise and data science, Upply is redefining the fundamentals of the supply chain by helping every player overcome the volatility and inefficiency of the market.
Neuro-symbolic is not enough, we need neuro-*semantic* (Frank van Harmelen)
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
The Art of the Pitch: WordPress Relationships and Sales (Laura Byrne)
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Securing your Kubernetes cluster: a step-by-step guide to success! (KatiaHIMEUR1)
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... (UiPathCommunity)
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024 (Tobias Schneck)
As AI technology pushes into IT, I found myself wondering, as an “infrastructure container Kubernetes guy”, how this fancy AI technology can be managed from an infrastructure operations point of view. Is it possible to apply our beloved cloud native principles as well? What benefits could the two technologies bring to each other?
Let me take these questions and offer a short journey through existing deployment models and use cases for AI software. Using practical examples, we discuss what cloud/on-premise strategy we may need to apply it to our own infrastructure and make it work from an enterprise perspective. I want to give an overview of infrastructure requirements and technologies, and of what could benefit or limit your AI use cases in an enterprise environment. An interactive demo will give you some insight into which approaches I already have working for real.
Essentials of Automations: Optimizing FME Workflows with Parameters (Safe Software)
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... (James Anderson)
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Connector Corner: Automate dynamic content and events by pushing a button (DianaGray10)
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Key Trends Shaping the Future of Infrastructure.pdf (Cheryl Hung)
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
GraphRAG is All You need? LLM & Knowledge Graph (Guy Korland)
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
1. Tuesday, December 4, 12
Hi! My name is Charity Majors, and I am a systems engineer at Parse.
Parse is a platform for mobile developers.
You can use our APIs to build apps for iOS, Android, and Windows phones. We take care of all of the provisioning and scaling for backend services, so you can focus on building your app and user experience.
2. Replica sets
• Always use replica sets
• Distribute across Availability Zones
• Avoid situations where you have even # voters
• More voters are better than fewer
First, the basics.
* Always run with replica sets. Never run with a single node, unless you really hate your data. And always distribute your replica set members across
as many different regions as possible. If you have three nodes, use three regions. Do not put two nodes in one region and one node in a second
region. Remember, you need at least two nodes to form a quorum in case of network split. And an even number of nodes can leave you stuck in a
situation where they can’t elect a master. If you need to run with an even number of nodes temporarily, either assign more votes to some nodes or add
an arbiter. But always, always think about how to protect yourself from situations where you can’t elect a master. Go for more votes rather than fewer,
because it’s easier to subtract if you have too many than to add if you have too few.
** Remember, if you end up with only one node, you have no way to add another node to the replica set. There was one time very early on, when we were
still figuring mongo out, that we had to recover from an outage by bringing up a node from snapshot with the same hostname so it would be recognized
as a member of the same replica set. Bottom line, you just really don’t want to be in this situation. Spread your eggs around in lots of baskets.
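The voting advice above comes down to simple majority arithmetic. Here is a minimal sketch of it in Python; the function names are made up for illustration and are not part of MongoDB or any driver.

```python
def majority(total_votes: int) -> int:
    """Smallest number of votes that is a strict majority."""
    return total_votes // 2 + 1


def can_elect_primary(reachable_votes: int, total_votes: int) -> bool:
    """A partition can elect a primary only if it holds a strict majority."""
    return reachable_votes >= majority(total_votes)


if __name__ == "__main__":
    # Three voters, one per zone: losing any one zone still leaves 2 of 3.
    print(can_elect_primary(2, 3))  # True
    # Four voters split 2/2 by a network partition: neither side has 3 of 4.
    print(can_elect_primary(2, 4))  # False
    # Adding an arbiter makes 5 voters; the side holding the arbiter has 3 of 5.
    print(can_elect_primary(3, 5))  # True
```

The 2/2 case is the "stuck" situation described above: both halves are healthy, but neither can reach a majority.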
3. Snapshots
• Snapshot often
• Lock Mongo
• Set snapshot node to priority = 0
• Always warm up a snapshot before promoting
• Warm up both indexes and data
Snapshots
* Snapshot regularly. We snapshot every 30 minutes. EBS snapshot actually does a differential backup, so subsequent snapshots will be faster the
more frequently you do them.
* Make sure you use a snapshot script that locks mongo. It’s not enough to just use ec2-create-snapshot on the RAID volumes, you also need to lock
mongo beforehand and unlock it after. We use a script called ec2-consistent-snapshot, though I think we may have modified it to add mongo support.
* Always set your snapshot node to config priority = 0. This will prevent it from ever getting elected master. You really, really do not want your
snapshotting host to ever become master, or your site will go down. We also like to set our primary priority to 3, and all non-snapshot secondaries to 2,
because priority 1 isn’t always visible from rs.conf(). That’s just a preference of ours.
* Never, ever switch primary over to a newly restored snapshot. Something a lot of people don’t seem to realize is that EBS blocks are actually
lazy-loaded off S3. You need to warm your fresh secondaries up. I mean, you think loading data into RAM from disk is bad, try loading into RAM from S3.
There’s just a *tiny* bit of latency there.
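Two of the points above, the lock around the snapshot and the priority layout, can be sketched in Python. This is not the ec2-consistent-snapshot script mentioned in the notes. It assumes `run_command` is any callable that sends an admin command to mongod (pymongo's `client.admin.command` would be one candidate), and the `fsyncUnlock` command name is the one used by current MongoDB releases, which may differ from what a 2012-era server accepted. The host names and helper names are hypothetical.

```python
from contextlib import contextmanager


def member_priorities(primary, secondaries, snapshot_host):
    """Member documents using the priorities from the notes: 3 for the
    preferred primary, 2 for ordinary secondaries, and 0 for the snapshot
    node so that it can never be elected."""
    hosts = [(primary, 3)] + [(h, 2) for h in secondaries] + [(snapshot_host, 0)]
    return [{"_id": i, "host": h, "priority": p} for i, (h, p) in enumerate(hosts)]


@contextmanager
def locked_for_snapshot(run_command):
    """Flush and lock writes on the snapshot node while the snapshot runs."""
    run_command({"fsync": 1, "lock": True})
    try:
        yield
    finally:
        # Always unlock, even if the snapshot step raised.
        run_command({"fsyncUnlock": 1})


if __name__ == "__main__":
    sent = []  # stands in for a real connection; records what would be sent
    with locked_for_snapshot(sent.append):
        sent.append("snapshot the volumes here")  # placeholder for the EBS call
    print(sent)
    print(member_priorities("db1:27017", ["db2:27017"], "db3:27017"))
```

The `finally` clause matters: a snapshot script that crashes while the node is still locked leaves that secondary unable to apply writes.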
Warming up
Lots of people seem to do this in different ways, and it kind of depends on how much data you have. If you have less data than you have RAM, you can
just use dd or vmtouch to load entire databases into memory. If you have more data than RAM, it’s a little bit trickier.
The way we do it is, first we run a script on the primary. It gets the current ops every quarter of a second or so for an hour, then sorts by most frequently
accessed collections. Then we take that list of collections and feed it into a warmup script on the secondary, which reads all the collections and indexes
into memory. The script is parallelized, but it still takes several hours to complete. You can also read collections into memory by doing a full table scan,
or a natural sort.
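The ranking step of that warm-up process might look roughly like the sketch below. It is not our actual script. It assumes each sample is a list of operation documents with an "ns" (namespace) field, as currentOp output has, and it leaves the polling and the secondary-side reads out entirely. The collection names in the demo are made up.

```python
from collections import Counter


def rank_collections(samples):
    """Rank namespaces by how often they appear in sampled operations.

    `samples` is an iterable of op lists, one list per polling interval.
    Each op is a dict with an "ns" key ("database.collection"); ops with
    no namespace are ignored.
    """
    counts = Counter(
        op["ns"] for ops in samples for op in ops if op.get("ns")
    )
    return [ns for ns, _ in counts.most_common()]


if __name__ == "__main__":
    samples = [
        [{"ns": "app.users"}, {"ns": "app.events"}],
        [{"ns": "app.users"}, {"ns": ""}],
        [{"ns": "app.users"}, {"ns": "app.events"}, {"ns": "app.installs"}],
    ]
    print(rank_collections(samples))
    # ['app.users', 'app.events', 'app.installs']
```

The resulting list is what would be handed to the warm-up script on the secondary, hottest collections first.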
God, what I wouldn’t give for block-level replication like Amazon’s RDS.
4. Chef everything
• Role attributes for backup volumes, cluster names
• Nodes are disposable
• Delete volumes and AWS attributes, run chef-client to reprovision
Chef
Moving along … chef! Everything we have is fully chef’d. It only takes us like 5 minutes to bring up a new node from snapshot. We use the opscode
MongoDB and AWS cookbooks, with some local modifications so they can handle PIOPS and the ebs_optimized dedicated NICs. We haven’t open
sourced these changes, but we probably can, if there’s any demand for them. It looks like this:
$ knife ec2 server create -r "role[mongo-replset1-iops]" -f m2.4xlarge -G db -x ubuntu --node-name db36 -I ami-xxxxxxxx -Z us-east-1d -E production
There are some neat things in the mongo cookbook. You can create a role attribute to define the cluster name, so it automatically comes up and joins
the cluster. The backup volumes for a cluster are also just attributes for the role. So it’s easy to create a mongo backups role that automatically backs
up whatever volumes are pointed to by that attribute.
We use the m2.4xlarge flavor, which has like 68 gigs of memory. We have about a terabyte of data per replica set, so 68 gigs is just barely enough for
the working set to fit into memory.
We used to use four EBS volumes RAID 10’d, but we don’t even bother with RAID 10 anymore, we just stripe PIOPS volumes. It’s faster for us to
reprovision a replica set member than repairing the RAID array. If an EBS volume dies, or the secondary falls too far behind, or whatever, we just delete
the volumes, remove the AWS attributes for the node in the chef node description, and re-run chef-client. It reprovisions new volumes for us from the
latest snapshot in a matter of minutes. For most problems, it’s faster for us to destroy and rebuild than attempt any sort of repair.
5. Before PIOPS / After PIOPS (two CloudWatch end-to-end latency graphs)
P-IOPS
And … we use PIOPS. We switched to Provisioned IOPS literally as soon as it was available. As you can see from this graph, it made a *huge*
difference for us.
These are end-to-end latency graphs in CloudWatch, from the point a request enters the ELB until the response goes back out. Note the different Y-axes!
Order of magnitude difference. The top Y-axis goes up to 2.5, the bottom one goes up to 0.6.
EBS is awful. It’s bursty, and flaky, and just generally everything you DON’T want in your database hardware. As you can see here in the top graph,
using 4 EBS volumes RAID 10’d, we had EBS spikes all the time. Any time one of the four EBS volumes had any sort of availability event, our end-to-end
latency took a hit. With PIOPS, our average latency dropped by half and went almost completely flat around 100 milliseconds.
So yes. Use PIOPS. Until recently you could only provision 1,000 IOPS per volume, but you can now provision up to 2,000 IOPS per volume.
And they guarantee a variability of less than 0.1%, which is exactly what you want in your database hardware.
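Since the volumes are striped, the per-volume cap translates into a volume count for a given IOPS target. A back-of-the-envelope sketch in Python follows. It assumes striping adds up per-volume IOPS linearly, which is an idealization, and the 4,000 IOPS target is a made-up figure, not a number from our clusters.

```python
def volumes_needed(target_iops: int, iops_per_volume: int) -> int:
    """Number of striped volumes needed to reach target_iops, rounded up."""
    if target_iops <= 0 or iops_per_volume <= 0:
        raise ValueError("IOPS values must be positive")
    # Integer ceiling division, avoids float rounding.
    return -(-target_iops // iops_per_volume)


if __name__ == "__main__":
    print(volumes_needed(4000, 1000))  # 4 volumes at the old 1,000 IOPS cap
    print(volumes_needed(4000, 2000))  # 2 volumes at the 2,000 IOPS cap
    print(volumes_needed(5000, 2000))  # 3, since 2 volumes would fall short
```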
6. Filesystem & misc
• Use ext4
• Raise file descriptor limits (cat /proc/<pid>/limits to verify)
• Sharding. Eventually you must shard.
Misc
Some small, miscellaneous details:
* Remember to raise your file descriptor limits. And test that they are actually getting applied. The best way to do this is to find the pid of your
mongodb process, and type “cat /proc/<pid>/limits”. We had a hard time getting sysvinit scripts to properly apply the increased limits, so we converted
to use upstart and have had no issues. I don’t know if Ubuntu no longer supports sysvinit very well, or what.
* We use ext4. Supposedly either ext4 or xfs will work, but I have been scarred by xfs file corruption way too many times to ever consider that. They
say it’s fixed, but I have like xfs PTSD or something.
* Sharding -- at some point you have to shard your data. The mongo built-in sharding didn’t work for us for a variety of reasons I won’t go into here.
We’re doing sharding at the app layer, the goal is to
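The file descriptor check from the first note above can be automated. The Python sketch below parses the text of /proc/<pid>/limits (Linux only) and pulls out the "Max open files" row. The sample text in the demo is illustrative and was not captured from a real mongod process.

```python
def max_open_files(limits_text):
    """Return (soft, hard) 'Max open files' limits from /proc/<pid>/limits text.

    Each value is an int, or None when the kernel reports 'unlimited'.
    """
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            soft, hard = line[len(label):].split()[:2]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max open files' line found")


if __name__ == "__main__":
    # On a live Linux box you would read open(f"/proc/{pid}/limits").read().
    sample = (
        "Limit                     Soft Limit           Hard Limit           Units\n"
        "Max cpu time              unlimited            unlimited            seconds\n"
        "Max open files            1024                 4096                 files\n"
    )
    print(max_open_files(sample))  # (1024, 4096)
```

A soft limit still sitting at a default like 1024 after you raised it in the init configuration means the new limits never reached the process.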
7. Parse runs on MongoDB
• DDoS protection and query profiling
• Billing and logging analytics
• User data
In summary, we are very excited about MongoDB. We love the fact that it fails over seamlessly between Availability Zones during an AZ event. And we
value the fact that its flexibility allows us to build our expertise and tribal knowledge around one primary database product, instead of a dozen different
ones.
In fact, we actually use MongoDB in at least three or four distinct ways. We use it for a high-writes DDoS and query analyzer cluster, where we process
a few hundred thousand writes per minute and expire the data every 10 minutes. We use it for our logging and analytics cluster, where we analyze all
our logs from S3 and generate billing data. And we use it to store all the app data for all of our users and their mobile apps.
Something like Parse wouldn’t even be possible without a nosql product as flexible and reliable as Mongo is. We’ve built our business around it, and
we’re very excited about its future.
Also, we’re hiring. See me if you’re interested. :)
Thank you! Any questions?