To understand how to make your application fast, it's important to understand what makes the database fast. We will take a detailed look at how to think about performance, and how different schema design choices affect your cluster's performance depending on the storage engines used and the physical resources available.
MongoDB Days Silicon Valley: Winning the Dreamforce Hackathon with MongoDB (MongoDB)
Presented by Greg Deeds, CEO, Technology Exploration Group
Experience level: Introductory
A two-person team using MongoDB and Salesforce.com built a geospatial machine learning tool from various datasets, with parsing, indexing, and MapReduce, in 24 hours. Greg Deeds, designer of the hack that beat 350 teams from around the world, will speak on getting to the winner's circle with MongoDB. MongoDB proved to be the team's secret weapon, leveling the playing field for the win!
Are you in the process of evaluating or migrating to MongoDB? We will cover key aspects of migrating to MongoDB from an RDBMS, including schema design, indexing strategies, data migration approaches as your implementation reaches various SDLC stages, and achieving operational agility through MongoDB Management Service (MMS).
MongoDB Days Silicon Valley: Introducing MongoDB 3.2 (MongoDB)
Presented by:
Eliot Horowitz, CTO and Co-Founder, MongoDB
Richard Kreuter, VP of Professional Services, MongoDB
Andrew Erlichson, VP of Engineering, Developer Experience, MongoDB
Intro to MongoDB
Get a jumpstart on MongoDB, use cases, and next steps for building your first app with Buzz Moschetti, MongoDB Enterprise Architect.
@BuzzMoschetti
Webinar: Best Practices for Getting Started with MongoDB (MongoDB)
MongoDB adoption continues to grow at a record pace due to the significant enhancements in developer productivity and scalability that the database provides. Occasionally, however, organizations new to the technology make mistakes that limit their ability to leverage the significant advantages MongoDB provides. This webinar will discuss some of the common mistakes made by users when they first start working with MongoDB, how to identify when you've made those mistakes, and how to resolve them.
Webinar: Developing with the modern App Stack: MEAN and MERN (with Angular2 a... (MongoDB)
Users increasingly demand a far richer experience from web applications – expecting the same level of performance and interactivity they get with native desktop and mobile apps.
At the same time, there's pressure on developers to deliver new applications faster and continually roll out enhancements, while ensuring that the application is highly available and can be scaled appropriately when needed.
Fortunately, there’s a set of open source technologies using JavaScript that make all of this possible.
Watch this presentation to learn about the two dominant JavaScript web app stacks – MEAN (MongoDB, Express, Angular, Node.js) and MERN (MongoDB, Express, React, Node.js).
These technologies are also used outside of the browser, delivering the best user experience regardless of whether users access your application from the desktop, from a mobile app, or even by voice.
By watching this presentation you will learn:
What these technologies are and how they’re used in combination:
NodeJS
MongoDB
Express
Angular2
ReactJS
How to get started building your own apps using these stacks
Some of the decisions to take:
Angular vs Angular2 vs ReactJS
JavaScript vs ES6 vs TypeScript
What should be implemented in the front-end vs the back-end
MongoDB Schema Design: Practical Applications and Implications (MongoDB)
Presented by Austin Zellner, Solutions Architect, MongoDB
Schema design is as much art as it is science, but it is central to understanding how to get the most out of MongoDB. Attendees will walk away with an understanding of how to approach schema design, what influences it, and the science behind the art. After this session, attendees will be ready to design new schemas, as well as re-evaluate existing schemas with a new mental model.
Back to Basics Webinar 1: Introduction to NoSQL (MongoDB)
This is the first webinar of a Back to Basics series that will introduce you to the MongoDB database, what it is, why you would use it, and what you would use it for.
Back to Basics Webinar 4: Advanced Indexing, Text Indexes and G... (MongoDB)
This is the fourth webinar in the Back to Basics series, which introduces the MongoDB database. This webinar looks at support for free-text and geospatial indexes.
Webinar: Schema Patterns and Your Storage Engine (MongoDB)
How do MongoDB’s different storage options change the way you model your data?
Each storage engine (WiredTiger, the In-Memory storage engine, MMAPv1, and other community-supported engines) persists data differently, writes data to disk in different formats, and handles memory resources in different ways.
This webinar will go through how to design applications around different storage engines based on your use case and data access patterns. We will look at concrete examples of schema design practices that were previously applied on MMAPv1 and whether those practices still apply to other storage engines like WiredTiger.
Topics for review: Schema design patterns and strategies, real-world examples, sizing and resource allocation of infrastructure.
MongoDB Days Silicon Valley: Jumpstart: Ops/Admin 101 (MongoDB)
Presented by Achille Brighton, Principal Consulting Engineer, MongoDB
Experience level: Introductory
New to MongoDB? We'll provide an overview of installation, high availability through replication, scale out through sharding, and options for monitoring and backup. No prior knowledge of MongoDB is assumed. This session will jumpstart your knowledge of MongoDB operations, providing you with context for the rest of the day's content.
Back to Basics: My First MongoDB Application (MongoDB)
This Back to Basics webinar series will introduce you to NoSQL and the MongoDB database. You will find out what MongoDB is, why you would use it, and what you would use it for.
Back to Basics Webinar 2: Your First MongoDB Application (MongoDB)
This is the second webinar in the Back to Basics series, which introduces the MongoDB database. In this webinar we will show how to build a blogging application in MongoDB.
Determining the root cause of performance issues is a critical task for Operations. In this webinar, we'll show you the tools and techniques for diagnosing and tuning the performance of your MongoDB deployment. Whether you're running into problems or just want to optimize your performance, these skills will be useful.
Back to Basics Webinar 3: Schema Design Thinking in Documents (MongoDB)
This is the third webinar in the Back to Basics series, which introduces the MongoDB database. This webinar explains the architecture of document databases.
Back to Basics Webinar 3: Schema Design Thinking in Documents (MongoDB)
This is the third webinar of a Back to Basics series that will introduce you to the MongoDB database. This webinar will explain the architecture of document databases.
This presentation was given at the LDS Tech SORT Conference 2011 in Salt Lake City. The slides are quite comprehensive, covering many topics on MongoDB. Rather than a traditional presentation, this was presented as more of a Q & A session. Topics covered include an introduction to MongoDB, use cases, schema design, high availability (replication), and horizontal scaling (sharding).
Back to Basics Webinar 5: Introduction to the Aggregation Framework (MongoDB)
This is the fifth webinar in the Back to Basics series, which introduces the MongoDB database. This webinar covers the basics of the Aggregation Framework.
MongoDB has taken a clear lead in adoption among the new generation of databases, including the enormous variety of NoSQL offerings. A key reason for this lead has been a unique combination of agility and scalability. Agility provides business units with a quick start and flexibility to maintain development velocity, despite changing data and requirements. Scalability maintains that flexibility while providing fast, interactive performance as data volume and usage increase. We'll address the key organizational, operational, and engineering considerations to ensure that agility and scalability stay aligned at increasing scale, from small development instances to web-scale applications. We will also survey some key examples of highly-scaled customer applications of MongoDB.
Webinar: MongoDB Schema Design and Performance Implications (MongoDB)
In this session, you will learn how to translate one-to-one, one-to-many and many-to-many relationships, and learn how MongoDB's JSON structures, atomic updates and rich indexes can influence your design. We will also explore implications of storage engines, indexing and query patterns, available tools and related new features in MongoDB 3.2.
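As a rough illustration of the one-to-many case mentioned above, the sketch below contrasts embedding with referencing using plain Python dictionaries as stand-ins for documents. All field names, identifiers, and values are hypothetical and are not taken from the webinar.

```python
# Two illustrative ways to model a one-to-many relationship
# (one author, many books) as documents. All names are hypothetical.

# Option 1: embed the "many" side inside the parent document.
author_embedded = {
    "_id": "author-1",
    "name": "Ada Example",
    "books": [
        {"title": "First Book", "year": 2014},
        {"title": "Second Book", "year": 2016},
    ],
}

# Option 2: keep each book as its own document that references the parent.
author_referenced = {"_id": "author-1", "name": "Ada Example"}
books_referenced = [
    {"_id": "book-1", "author_id": "author-1", "title": "First Book", "year": 2014},
    {"_id": "book-2", "author_id": "author-1", "title": "Second Book", "year": 2016},
]


def titles_embedded(author):
    """Read all titles from the single embedded document."""
    return [book["title"] for book in author["books"]]


def titles_referenced(author, books):
    """Read all titles by matching child documents on the parent's _id."""
    return [b["title"] for b in books if b["author_id"] == author["_id"]]
```

Embedding lets one read return the parent and its children together, and a change to a single document is atomic in MongoDB. Referencing keeps the parent small when the "many" side can grow without bound, at the cost of a second lookup.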
What enterprises can learn from Real Time Bidding (Aerospike)
Brian Bulkowski, CTO of Aerospike, the NoSQL database, discusses the software architecture pioneered by cutting-edge advertising optimization companies in 2008, made popular between 2009 and 2013, and now becoming more widely used in financial services, retail, social media, travel companies, and others. This architecture combines multiple big data analytics sources (HDFS-based batch engines using Hadoop, Hive, HBase, Vertica, Spark, and others, depending on analysis and query patterns) with an operational and application layer. The operational application layer consists of new internet application stacks, such as Node.js, Nginx, Jetty, Scala, and Go, and in-memory NoSQL databases such as MongoDB, Cassandra, and Aerospike.
The talk presents specific recommendations for building a high-performance operational layer: focusing on primary-key access at the operational layer, using Flash for the random in-memory NoSQL layer, and the benefits of open source.
This presentation was given at the Big Data Gurus meetup in Santa Clara, CA, on July 29, 2014. http://www.meetup.com/BigDataGurus/
This talk will introduce the philosophy and features of the open source, NoSQL MongoDB. We’ll discuss the benefits of the document-based data model that MongoDB offers by walking through how one can build a simple app to store books. We’ll cover inserting, updating, and querying the database of books.
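The insert, update, and query steps described above can be sketched without a running server by using a plain Python list as a stand-in for a collection. This is only an illustration of the document model: the helper names (`insert_book`, `find_books`, `update_books`) and the sample fields are hypothetical. With the PyMongo driver the analogous calls would be `insert_one`, `find`, and `update_one` on a collection object.

```python
# In-memory stand-in for a collection of book documents.
# Helper names are hypothetical; they only mimic the shape of driver calls.


def insert_book(collection, book):
    """Store a copy of the document so later caller edits do not leak in."""
    collection.append(dict(book))


def find_books(collection, query):
    """Return documents whose fields equal every key/value pair in query."""
    return [
        doc for doc in collection
        if all(doc.get(key) == value for key, value in query.items())
    ]


def update_books(collection, query, changes):
    """Set the given fields on every matching document; return the match count."""
    matched = find_books(collection, query)
    for doc in matched:
        doc.update(changes)
    return len(matched)


books = []
# Documents in one collection do not need identical fields.
insert_book(books, {"title": "Dune", "author": "Frank Herbert", "year": 1965})
insert_book(books, {"title": "Emma", "author": "Jane Austen", "year": 1815,
                    "tags": ["classic"]})

updated = update_books(books, {"title": "Dune"}, {"in_stock": True})
austen = find_books(books, {"author": "Jane Austen"})
```

The second book carries a `tags` field the first one lacks, which is the flexibility the document model is meant to offer.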
Rapid Application Design in Financial Services (Aerospike)
Applying internet NoSQL design patterns to fraud detection and risk scoring, including when to use SQL and when to use NoSQL. The state of NAND Flash and NVMe is also discussed, as well as storage class memory futures with Intel's 3D XPoint technology.
This talk was presented in LA at the following meetup:
http://www.meetup.com/scalela/events/233396111/
MongoDB Europe 2016 - Advanced MongoDB Aggregation Pipelines (MongoDB)
We will do a deep dive into the powerful query capabilities of MongoDB's Aggregation Framework, and show you how you can use MongoDB's built-in features to inspect the execution and tune the performance of your queries. And, last but not least, we will also give you a brief outlook into MongoDB 3.4's awesome new Aggregation Framework additions.
Webinar: High Performance MongoDB Applications with IBM POWER8 (MongoDB)
Innovative companies are building Internet of Things, mobile, content management, single view, and big data apps on top of MongoDB. In this session, we'll explore how the IBM POWER8 platform brings new levels of performance and ease of configuration to these solutions which already benefit from easier and faster design and development using MongoDB.
As we increasingly build applications to reach global audiences, the scalability and availability of your database across geographic regions becomes a critical consideration in systems selection and design.
Parse was a bold offering in the burgeoning space of Backend-as-a-Service, and we’re sorry to see them wind down.
If your application runs on Parse, you’ll need to migrate your data from the hosted service to your own database. Fortunately, MongoDB Cloud Manager makes running your own deployment easy. In this webinar we’ll use Cloud Manager to create and manage a new replica set, and detail the steps required to migrate from the Parse platform to your own deployment of MongoDB on Amazon Web Services.
Remaining Agile with Billions of Documents: Appboy and Creative MongoDB Schemas (MongoDB)
In this talk, Appboy co-founder and CIO Jon Hyman will discuss various schemas that Appboy has evolved to use on MongoDB, remaining agile as Appboy has grown to massive scale. Jon will discuss topics such as random sampling of documents, multivariate testing and multi-arm bandit optimization of such tests, field tokenization, and how Appboy stores multi-dimensional data on an individual user basis to be able to quickly optimize for the best time to deliver messages to end users. Appboy is the global leader in Marketing Automation for Apps, helping clients such as Urban Outfitters, Shutterfly, Kixeye, PicsArt, USA Today Sports, and iHeartRadio increase engagement through automated messaging. Each month, Appboy collects tens of billions of data points from hundreds of millions of monthly active users.
MongoDB Days UK: Building Apps with the MEAN Stack (MongoDB)
Presented by Norberto Leite, Developer Advocate, MongoDB
Experience level: Advanced
Get ready to be MEAN! The MEAN Stack (MongoDB, ExpressJS, AngularJS and Node.js) allows developers to do rapid application development and application scaffolding. In this session, Norberto will walk you through strategies and best practices for building applications on the MEAN stack, the benefits of using such an application stack and the key benefits of each of the individual components.
Migrating your business applications from your on-site or co-located datacenters to the AWS Cloud takes some planning, and a phased approach. This webinar looks at migration patterns from an architectural perspective and what tools and techniques are available to you.
Reasons to attend:
- Learn about planning your cloud migration strategy.
- This webinar will help you select the workloads that can easily be moved to the cloud.
- Evaluate the conditions and metrics required for a successful and cost effective migration.
Speaker: Drew DiPalma
Learn more about MongoDB Stitch, our new Backend as a Service (BaaS) that makes it easy for developers to create and launch applications across mobile and web platforms. Stitch provides a REST API on top of MongoDB with read, write, and validation rules built-in and full integration with the services you love. This talk will cover the what, why, and how of MongoDB Stitch. We'll discuss everything from features to the architecture. You'll walk away knowing how Stitch can kickstart your new project or take your existing application to the next level.
Moved to https://slidr.io/azzazzel/web-application-performance-tuning-beyond-xmx (Milen Dyankov)
This slide deck will be removed from here in the future. It has been moved to: https://slidr.io/azzazzel/web-application-performance-tuning-beyond-xmx
WSO2Con EU 2016: An Introduction to the WSO2 Analytics Platform (WSO2)
In today’s connected world, organizations have access to an enormous amount of data but use only a very small subset of it. This data can give you hindsight, oversight, insight and foresight about your enterprise and the world it communicates with. It can be leveraged to gain a considerable competitive advantage in the market.
The WSO2 Data Analytics platform lets you collect data, explore it through batch, real-time, interactive and predictive processing technologies and communicate your results. In this talk, we will discuss the WSO2 Data Analytics platform and how it brings together all analytics technologies into a single platform and user experience.
Comparison between OGC Sensor Observation Service and SensorThings API (SensorUp)
The recording of the webinar is here: https://www.youtube.com/watch?v=SyDSB5VM2Bw&list=PLUSJC5mjKZ9SIASpVJNWKWCSS9hVzjiFA&index=2
This webinar discussed the differences between the two OGC standards for IoT data exchange, i.e., OGC Sensor Observation Service and the OGC SensorThings API. It compares the two specifications in terms of interoperability, feature list, developer experience, efficiency, scalability/discoverability, and security. In summary, SOS and SensorThings are both interoperable. SensorThings can interoperate with SOS but not the other way around. SensorThings offers more features, better developer experience, better efficiency, and better scalability. In terms of security, SensorThings API can leverage the XML/SOAP security mechanisms by offering an SOS interface.
Building a complete social networking platform presents many challenges at scale. Socialite is a reference architecture and open source Java implementation of a scalable social feed service built on DropWizard and MongoDB. We'll provide an architectural overview of the platform, explaining how you can store an infinite timeline of data while optimizing indexing and sharding configuration for access to the most recent window of data. We'll also dive into the details of storing a social user graph in MongoDB.
Introducing MongoDB Stitch, Backend-as-a-Service from MongoDB (MongoDB)
Watch this webinar to learn about our new Backend as a Service (BaaS) – MongoDB Stitch.
MongoDB Stitch lets developers focus on building applications rather than on managing data manipulation code, service integration, or backend infrastructure. Whether you’re just starting up and want a fully managed backend as a service, or you’re part of an enterprise and want to expose existing MongoDB data to new applications, Stitch lets you focus on building the app users want, not on writing boilerplate backend logic.
This webinar will cover the what, why, and how of MongoDB Stitch. We’ll cover everything from the features it provides to the architecture that makes it possible. By the end of the session, you should understand how Stitch can kickstart your new project or take your existing application to the next level.
Attendees will learn:
- The basics of MongoDB Stitch and how to use it for new projects or to expose existing data to new applications
- How to control what data and services individual users can access
- How to integrate your favorite services with your MongoDB application without writing extra code
Cassandra's Sweet Spot - an introduction to Apache Cassandra (Dave Gardner)
Slides from my NoSQL Exchange 2011 talk introducing Apache Cassandra. This talk explained the fundamental concepts of Cassandra and then demonstrated how to build a simple ad-targeting application using PHP, with a focus on data modeling.
Video of talk: http://skillsmatter.com/podcast/home/cassandra/js-2880
Using Graph Analysis and Fraud Detection in the Fintech Industry (Stanka Dalekova)
Paysafe provides simple and secure payment solutions to businesses of all sizes around the world, processing billions of payment dollars a year. This, combined with its focus on a flawless customer experience and real-time money transfer, makes it a target for the “dark side” of the payments industry: fraudsters, money launderers, etc. With traditional data storage techniques such as relational technologies, it is almost impossible to see beyond individual accounts to the connections between them. In this session, see how Paysafe implemented the property graph technologies in Oracle Spatial and Graph and Oracle Database, including its fast, built-in, in-memory graph analytics, to perform fast graph queries that identify patterns of fraud.
We all know not to poke at alien life forms on another planet, right? But what about metrics: do you know how to pick, measure and draw conclusions from them? In this talk we will cover various Site Reliability Engineering topics, such as SLIs and SLOs, while we explore real-life examples of defining and implementing metrics in a system, using Prometheus, an open-source system monitoring and alerting platform, to demonstrate implementation. Let's get back to some real science.
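To make the SLI and SLO terms concrete, here is a minimal sketch that computes an availability SLI from request counts and the share of the error budget still unspent against an SLO target. The function names and the numbers are assumptions for illustration, not material from the talk. In practice the counts would come from a monitoring system such as Prometheus.

```python
def availability_sli(good_requests, total_requests):
    """Availability SLI: the fraction of requests that were served successfully."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return good_requests / total_requests


def error_budget_remaining(sli, slo):
    """Fraction of the error budget still unspent; negative if the SLO is missed.

    The budget is the failure rate the SLO tolerates (1 - slo); the spend is
    the failure rate actually observed (1 - sli).
    """
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be strictly between 0 and 1")
    allowed = 1.0 - slo
    spent = 1.0 - sli
    return (allowed - spent) / allowed


# Hypothetical window: 10,000 requests, 10 of them failed, SLO target 99.5%.
sli = availability_sli(good_requests=9_990, total_requests=10_000)
remaining = error_budget_remaining(sli, slo=0.995)
```

With these made-up numbers the observed failure rate is one fifth of what the SLO allows, so roughly 80% of the budget remains.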
Ustream vs Legacy, It's never too late to start your fight! #Jsist 2014 (Máté Nádasdi)
A general talk about Ustream's revolution on our aging codebase. We had to make critical changes to our codebase to achieve more stability and scalability on the client side.
This talk offers encouragement, with lots of tips and tricks, if you are in a similar legacy situation.
https://speakerdeck.com/matenadasdi/ustream-vs-legacy-its-never-too-late-to-start-your-fight-jsist-2014
During this session we will cover the best practices for implementing a product catalog with MongoDB. We will cover how to model an item properly when it can have thousands of variations and thousands of properties of interest. You'll learn how to index properly and allow for faceted search with millisecond response latency, and how to implement per-store, per-SKU pricing while still keeping a sane number of documents. We will also cover operational considerations, like how to bring the data closer to users to cut down network latency.
Microxchg Analyzing Response Time Distributions for Microservices (Adrian Cockcroft)
Research-oriented presentation @Microxchg Berlin, Feb 5th 2016. New code to collect histograms of response time and export them to a Monte Carlo simulation spreadsheet via getguesstimate.com
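Collecting a response-time histogram can be sketched as below. This is not the code from the presentation: the bucket boundaries, function names, and sample latencies are assumptions chosen for illustration.

```python
import bisect

# Hypothetical bucket upper bounds in milliseconds.
# One extra overflow bucket catches anything above the last bound.
BOUNDS_MS = [10, 50, 100, 500, 1000]


def build_histogram(latencies_ms, bounds=BOUNDS_MS):
    """Count samples per bucket; bucket i holds values in (bounds[i-1], bounds[i]]."""
    counts = [0] * (len(bounds) + 1)
    for value in latencies_ms:
        counts[bisect.bisect_left(bounds, value)] += 1
    return counts


def quantile_upper_bound(counts, fraction, bounds=BOUNDS_MS):
    """Upper bound of the bucket where the cumulative count reaches the fraction.

    Returns None when that point falls in the open-ended overflow bucket
    or when the histogram is empty.
    """
    threshold = fraction * sum(counts)
    running = 0
    for index, count in enumerate(counts):
        running += count
        if count and running >= threshold:
            return bounds[index] if index < len(bounds) else None
    return None


# Made-up sample latencies in milliseconds.
samples = [5, 8, 20, 30, 45, 70, 120, 400, 900, 1500]
histogram = build_histogram(samples)
median_bound = quantile_upper_bound(histogram, 0.5)
```

A histogram keeps the shape of the distribution, including the slow tail, which a single average would hide. The price is resolution: a quantile is only known to within its bucket.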
Similar to High Performance Applications with MongoDB (20)
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas (MongoDB)
During this talk we'll navigate through a customer's journey as they migrate an existing MongoDB deployment to MongoDB Atlas. While the migration itself can be as simple as a few clicks, the prep/post effort requires due diligence to ensure a smooth transfer. We'll cover these steps in detail and provide best practices. In addition, we’ll provide an overview of what to consider when migrating other cloud data stores, traditional databases and MongoDB imitations to MongoDB Atlas.
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts! (MongoDB)
These days, everyone is expected to be a data analyst. But with so much data available, how can you make sense of it and be sure you're making the best decisions? One great approach is to use data visualizations. In this session, we take a complex dataset and show how the breadth of capabilities in MongoDB Charts can help you turn bits and bytes into insights.
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
MongoDB Kubernetes operator and MongoDB Open Service Broker are ready for production operations. Learn about how MongoDB can be used with the most popular container orchestration platform, Kubernetes, and bring self-service, persistent storage to your containerized applications. A demo will show you how easy it is to enable MongoDB clusters as an External Service using the Open Service Broker API for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB
Are you new to schema design for MongoDB, or are you looking for a more complete or agile process than what you are following currently? In this talk, we will guide you through the phases of a flexible methodology that you can apply to projects ranging from small to large with very demanding requirements.
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
Humana, like many companies, is tackling the challenge of creating real-time insights from data that is diverse and rapidly changing. This is our journey of how we used MongoDB to combined traditional batch approaches with streaming technologies to provide continues alerting capabilities from real-time data streams.
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB
Time series data is increasingly at the heart of modern applications - think IoT, stock trading, clickstreams, social media, and more. With the move from batch to real time systems, the efficient capture and analysis of time series data can enable organizations to better detect and respond to events ahead of their competitors or to improve operational efficiency to reduce cost and risk. Working with time series data is often different from regular application data, and there are best practices you should observe.
This talk covers:
Common components of an IoT solution
The challenges involved with managing time-series data in IoT applications
Different schema designs, and how these affect memory and disk utilization – two critical factors in application performance.
How to query, analyze and present IoT time-series data using MongoDB Compass and MongoDB Charts
At the end of the session, you will have a better understanding of key best practices in managing IoT time-series data with MongoDB.
Join this talk and test session with a MongoDB Developer Advocate where you'll go over the setup, configuration, and deployment of an Atlas environment. Create a service that you can take back in a production-ready state and prepare to unleash your inner genius.
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
Our clients have unique use cases and data patterns that mandate the choice of a particular strategy. To implement these strategies, it is mandatory that we unlearn a lot of relational concepts while designing and rapidly developing efficient applications on NoSQL. In this session, we will talk about some of our client use cases, the strategies we have adopted, and the features of MongoDB that assisted in implementing these strategies.
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB
Encryption is not a new concept to MongoDB. Encryption may occur in-transit (with TLS) and at-rest (with the encrypted storage engine). But MongoDB 4.2 introduces support for Client Side Encryption, ensuring the most sensitive data is encrypted before ever leaving the client application. Even full access to your MongoDB servers is not enough to decrypt this data. And better yet, Client Side Encryption can be enabled at the "flick of a switch".
This session covers using Client Side Encryption in your applications. This includes the necessary setup, how to encrypt data without sacrificing queryability, and what trade-offs to expect.
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB
MongoDB Kubernetes operator is ready for prime-time. Learn about how MongoDB can be used with most popular orchestration platform, Kubernetes, and bring self-service, persistent storage to your containerized applications.
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
These days, everyone is expected to be a data analyst. But with so much data available, how can you make sense of it and be sure you're making the best decisions? One great approach is to use data visualizations. In this session, we take a complex dataset and show how the breadth of capabilities in MongoDB Charts can help you turn bits and bytes into insights.
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
When you need to model data, is your first instinct to start breaking it down into rows and columns? Mine used to be too. When you want to develop apps in a modern, agile way, NoSQL databases can be the best option. Come to this talk to learn how to take advantage of all that NoSQL databases have to offer and discover the benefits of changing your mindset from the legacy, tabular way of modeling data. We’ll compare and contrast the terms and concepts in SQL databases and MongoDB, explain the benefits of using MongoDB compared to SQL databases, and walk through data modeling basics so you feel confident as you begin using MongoDB.
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
Join this talk and test session with a MongoDB Developer Advocate where you'll go over the setup, configuration, and deployment of an Atlas environment. Create a service that you can take back in a production-ready state and prepare to unleash your inner genius.
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
Query performance should be the unsung hero of an application, but without proper configuration, can become a constant headache. When used properly, MongoDB provides extremely powerful querying capabilities. In this session, we'll discuss concepts like equality, sort, range, managing query predicates versus sequential predicates, and best practices to building multikey indexes.
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
Aggregation pipeline has been able to power your analysis of data since version 2.2. In 4.2 we added more power and now you can use it for more powerful queries, updates, and outputting your data to existing collections. Come hear how you can do everything with the pipeline, including single-view, ETL, data roll-ups and materialized views.
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
Are you new to schema design for MongoDB, or are you looking for a more complete or agile process than what you are following currently? In this talk, we will guide you through the phases of a flexible methodology that you can apply to projects ranging from small to large with very demanding requirements.
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
MongoDB Atlas Data Lake is a new service offered by MongoDB Atlas. Many organizations store long term, archival data in cost-effective storage like S3, GCP, and Azure Blobs. However, many of them do not have robust systems or tools to effectively utilize large amounts of data to inform decision making. MongoDB Atlas Data Lake is a service allowing organizations to analyze their long-term data to discover a wealth of information about their business.
This session will take a deep dive into the features that are currently available in MongoDB Atlas Data Lake and how they are implemented. In addition, we'll discuss future plans and opportunities and offer ample Q&A time with the engineers on the project.
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
Virtual assistants are becoming the new norm when it comes to daily life, with Amazon’s Alexa being the leader in the space. As a developer, not only do you need to make web and mobile compliant applications, but you need to be able to support virtual assistants like Alexa. However, the process isn’t quite the same between the platforms.
How do you handle requests? Where do you store your data and work with it to create meaningful responses with little delay? How much of your code needs to change between platforms?
In this session we’ll see how to design and develop applications known as Skills for Amazon Alexa powered devices using the Go programming language and MongoDB.
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
aux Core Data, appréciée par des centaines de milliers de développeurs. Apprenez ce qui rend Realm spécial et comment il peut être utilisé pour créer de meilleures applications plus rapidement.
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
Il n’a jamais été aussi facile de commander en ligne et de se faire livrer en moins de 48h très souvent gratuitement. Cette simplicité d’usage cache un marché complexe de plus de 8000 milliards de $.
La data est bien connu du monde de la Supply Chain (itinéraires, informations sur les marchandises, douanes,…), mais la valeur de ces données opérationnelles reste peu exploitée. En alliant expertise métier et Data Science, Upply redéfinit les fondamentaux de la Supply Chain en proposant à chacun des acteurs de surmonter la volatilité et l’inefficacité du marché.
GridMate - End to end testing is a critical piece to ensure quality and avoid...ThomasParaiso2
End to end testing is a critical piece to ensure quality and avoid regressions. In this session, we share our journey building an E2E testing pipeline for GridMate components (LWC and Aura) using Cypress, JSForce, FakerJS…
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Removing Uninteresting Bytes in Software FuzzingAftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format). Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices desire to take full advantage of the features
available on those devices, but many of the features provide convenience and capability but sacrifice security. This best practices guide outlines steps the users can take to better protect personal devices and information.
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionAggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfPeter Spielvogel
Building better applications for business users with SAP Fiori.
• What is SAP Fiori and why it matters to you
• How a better user experience drives measurable business benefits
• How to get started with SAP Fiori today
• How SAP Fiori elements accelerates application development
• How SAP Build Code includes SAP Fiori tools and other generative artificial intelligence capabilities
• How SAP Fiori paves the way for using AI in SAP apps
10. Is It Fast?
• In context of crossing the bridge, fast means:
– how long will it take one car
– how many cars can do it "at the same time"
11. Is It Fast?
Facts & Info
Opened to traffic
Upper level: October 25, 1931
Lower level: August 29, 1962
Bus Station opened: January 17, 1963
Length of bridge between anchorages: 4,760 feet
Width of bridge: 119 feet
Width of roadway: 90 feet
Height of tower above water: 604 feet
Water clearance at midspan: 212 feet
Number of toll lanes:
Upper level: 12
Lower level: 10
Palisades Interstate Parkway: 7*
* E-ZPass only overnight
2013 Traffic Volumes
Total New York-bound (eastbound) traffic: 49,402,245 vehicles
40. Unbounded growth
Deeply nested arrays
Really large documents
Schema Anti-Patterns: over-normalizing
you are over-normalizing if you are doing JOINS in your application instead of "finds"
88. Benchmark your own application
Use realistic workload
Use real data
Measure throughput and latency
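A minimal sketch of what "measure throughput and latency" could look like in practice, assuming a single-threaded client. The `benchmark` helper and the `fake_lookup` operation are hypothetical stand-ins; in a real test the operation would be one of your application's own requests, run against real data.

```python
import statistics
import time


def benchmark(operation, requests):
    """Run `operation` once per request and report throughput and latency."""
    latencies = []
    started = time.perf_counter()
    for request in requests:
        t0 = time.perf_counter()
        operation(request)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - started

    latencies.sort()
    # Approximate 99th percentile: the value about 99% of the way up the sorted list.
    p99_index = max(0, int(len(latencies) * 0.99) - 1)
    return {
        "count": len(latencies),
        "throughput_per_s": len(latencies) / elapsed,
        "latency_p50_s": statistics.median(latencies),
        "latency_p99_s": latencies[p99_index],
        "latency_max_s": latencies[-1],
    }


def fake_lookup(key):
    # Hypothetical stand-in for a real database call.
    return sum(range(key % 1000))


report = benchmark(fake_lookup, range(2000))
print(report)
```

Reporting percentiles rather than only an average is a deliberate choice here: a few slow requests can hide behind a good mean.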
Editor's Notes
What is fast? Before we can agree what our topic is, we have to literally define what fast means for you. For your application, for your users, for your stakeholders.
What's fast in one context may not be fast in another context. Let me give you an example.
For those unfamiliar with this area, here were my options: Holland, Lincoln and
By far the most scenic is the George Washington Bridge, the world's busiest motor vehicle bridge. It was twice as long as any previous suspension bridge when its design was finalized in 1923. Construction started in 1927 and the bridge first opened to traffic in 1931. In 1932, more than 5.5 million vehicles used the original six-lane roadway. Two center lanes were added in 1946, increasing capacity by one third. The six lanes of the lower roadway were completed in 1962, bringing the bridge to the 14 lanes it has today. So let me ask you this:
Is the George Washington Bridge fast? Well, that's a bit of a non sequitur as a question in a vacuum, isn't it? The bridge cannot be fast, it's not even going anywhere! But we all have context here. So what matters when I ask this question is whether it's a fast way to get from NJ to NY.
*For me* to get to NY "fast" meant to get across the Hudson river as quickly (and painlessly) as possible.
At the speed limit, which is 45 MPH, let's just say that driving across the GW bridge would take about one minute. But we wouldn't measure GW capacity by how long it took me; we would measure it by how many cars can make use of it: about 50M in a year just from NJ to NY.
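As a back-of-the-envelope check of those two numbers, here is a small sketch using only figures from the slide (the length between anchorages, the 45 MPH limit, and the 2013 eastbound volume). Averaging the yearly volume evenly over every minute of the year is a simplifying assumption; real traffic is anything but even.

```python
FEET_PER_MILE = 5280
SECONDS_PER_HOUR = 3600

bridge_length_ft = 4760        # length between anchorages, from the slide
speed_limit_mph = 45           # from the notes
eastbound_2013 = 49_402_245    # New York-bound vehicles in 2013, from the slide

# Latency: how long one car takes to cross at the speed limit.
speed_ft_per_s = speed_limit_mph * FEET_PER_MILE / SECONDS_PER_HOUR
crossing_latency_s = bridge_length_ft / speed_ft_per_s

# Throughput: how many cars cross per minute, averaged over the whole year.
seconds_per_year = 365 * 24 * SECONDS_PER_HOUR
avg_throughput_per_min = eastbound_2013 / seconds_per_year * 60

print(f"latency: about {crossing_latency_s:.0f} s per car")
print(f"throughput: about {avg_throughput_per_min:.0f} cars per minute on average")
```

That works out to roughly 72 seconds for one car and roughly 94 cars per minute in one direction: two very different answers to "is it fast?".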
Back to your application. In a vacuum, it's not slow or fast. When your stakeholders say "fast application", what we mean is that it performs whatever it is that it does for the end user quickly. I'm not the only car on the GWB, and there is never just one end user: we want the application to perform quickly and consistently for all end users. For the user, what matters is fast response time; for you, what matters is how many can use it simultaneously.
How many users or operations we can process at any given time, or in a given period of time, is what we call throughput. So latency is how long something takes; throughput is how many you can process "in parallel". You'd be surprised how often they get confused for one another...
One of the reasons that latency and throughput get conflated when talking about performance is that they are closely related. You can easily see in the single-threaded case that your latency directly impacts your throughput: the higher (WORSE) the latency, the lower the throughput. And sometimes, the lower the throughput, the higher the latency...
If your latency across the bridge is caused by delays at the toll booths, this is because each physical resource can only accommodate a fixed number of clients, so everyone has to wait. So they get worse together. Increasing latency can reduce your throughput, and decreasing throughput increases latency. That is undeniable; I'm sure we've all experienced it. A slightly less intuitive concept is that increasing throughput capacity may or may not reduce your latency. It depends how much of the latency is inherent in doing the operation itself and how much is caused by waiting due to... well, lack of throughput.
Adding more lanes without adding more toll booths will *not* help with either throughput or latency. [click] Adding more toll booths will likely reduce the time across the bridge, but only to a point, because no matter how many toll booths or lanes you add, the laws of physics (and speed limit laws) will make it hard to reduce the duration of our trip across the Hudson to less than about a minute.
Speed of light: it's not just a good idea, it's the law!
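The toll booth argument can be sketched as a tiny deterministic simulation. All the numbers here are made up for illustration (one car arriving every 5 seconds, 15 seconds to pay a toll, 72 seconds to drive across), and `average_trip_time` is a hypothetical helper, not a model of the real bridge.

```python
import heapq


def average_trip_time(num_cars, arrival_gap_s, booths, service_s, drive_s):
    """Average time per car: wait for a booth + pay the toll + drive across."""
    free_at = [0.0] * booths              # when each booth is next free
    heapq.heapify(free_at)
    total = 0.0
    for i in range(num_cars):
        arrival = i * arrival_gap_s
        booth_free = heapq.heappop(free_at)    # earliest available booth
        start = max(arrival, booth_free)       # queue if every booth is busy
        done = start + service_s
        heapq.heappush(free_at, done)
        total += (done - arrival) + drive_s
    return total / num_cars


for booths in (1, 2, 4, 8):
    avg = average_trip_time(num_cars=1000, arrival_gap_s=5, booths=booths,
                            service_s=15, drive_s=72)
    print(f"{booths} booth(s): average trip {avg:.1f} s")
```

With too few booths the queue keeps growing and the average trip time balloons. Once there are enough booths to keep up with arrivals, adding more changes nothing: the trip settles at the toll time plus the drive time, which no amount of extra capacity can remove.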
Why does all of this matter and how does this tie into your application design decisions? Well, just like getting across the Hudson requires a working vehicle, an open road and a variety of other favorable conditions, your application comprises many components, and all of them must be working together optimally to get the best possible performance to your end user. Focusing on speeding up the wrong component (one that is not the bottleneck) will be useless... [SYSTEM COMPONENTS]
Any one of them can slow down each user as they go through it, and that will increase latency, reducing your throughput significantly. You know, the opposite of a "fast application". Components can be split into two groups:
Physical components (resources) and conceptual components: your algorithms, data structures, schema, indexes, choice of storage engine, choice of OS and FS. All those choices affect how your physical resources will be used up. So if you don't design your application well, you will be unnecessarily exhausting some of these limited physical resources, causing your application to perform worse than it might with optimal design. [PAUSE] Of course, physical components must be properly sized and tuned. File system tuning, OS tuning: we don't have time to get into the specifics here, but there's lots of information available, so just keep in mind that if you don't follow best practices in configuring your file system, it's a little bit like trying to drive across the GW bridge with four flat tires. Let's just agree that's a bad idea? [PAUSE] Two big components we will focus on in detail are the parts of the DB:
[ schema / indexes ] [ STORAGE ENGINE ]
Schema design is the building block of your application, and getting it right is essential to making your application's DB requests efficient. We do that by structuring your data in a way that your application can easily read and write. This will minimize the resources used while minimizing the latency of each request.
Tailoring your schema design to fit your read and write patterns is like using the right tool for the job. Good schema design will always take into account data locality: that's co-locating data that you tend to get at the same time into the same documents. Now that's a rule of thumb, and there are definitely ways to take it too far. An important counterpoint is "don't store data in the document that you tend not to need immediately".
Imagine you have to get 50 people across the George Washington Bridge. Would you use a car and make over a dozen trips? Or would you use a much slower-moving bus and get the job done in a single trip? [PAUSE] On the other hand, if you have one passenger, you might get better gas mileage if you take a car rather than the bus.
Making lots of trips to fetch all the data you need for a single operation is an anti-pattern in schema design that we recognize as over-normalization. On the other hand, getting way more data each time than you need is usually a sign of the opposite problem; let's call it over-embedding. [ANTIPATTERNS]
Signs you might be over-embedding:
- Your documents tend to grow unbounded (you keep pushing more values into arrays, though you don't usually need them all).
- You have deeply nested arrays within arrays, but you usually need to work with only a small number of elements in them (NOT ALWAYS).
- Your documents are really large.
[PAUSE] Some of the signs you might be over-normalizing
One sign you might be over-normalizing: [CLICK] you keep implementing joins in your application for every "query".
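To make the "joins in your application" sign concrete, here is a toy sketch that counts round trips. `CountingCollection` is a hypothetical in-memory stand-in, not a real driver class, and the order, customer and item documents are invented for the example.

```python
class CountingCollection:
    """Toy in-memory stand-in for a collection that counts lookups."""

    def __init__(self, docs):
        self.docs = {doc["_id"]: doc for doc in docs}
        self.finds = 0

    def find_one(self, _id):
        self.finds += 1                # each call models one round trip
        return self.docs.get(_id)


# Over-normalized: the order only holds ids, so the application has to "join".
orders = CountingCollection([{"_id": 1, "customer_id": 10, "item_ids": [100, 101, 102]}])
customers = CountingCollection([{"_id": 10, "name": "Ada"}])
items = CountingCollection([{"_id": i, "sku": f"sku-{i}"} for i in (100, 101, 102)])

order = orders.find_one(1)
customer = customers.find_one(order["customer_id"])
order_items = [items.find_one(i) for i in order["item_ids"]]
normalized_round_trips = orders.finds + customers.finds + items.finds

# Embedded: data that is read together lives in one document.
orders_embedded = CountingCollection([{
    "_id": 1,
    "customer": {"_id": 10, "name": "Ada"},
    "items": [{"_id": i, "sku": f"sku-{i}"} for i in (100, 101, 102)],
}])
full_order = orders_embedded.find_one(1)
embedded_round_trips = orders_embedded.finds

print(normalized_round_trips, embedded_round_trips)
```

Here the normalized layout needs five lookups to assemble one order, while the embedded layout needs one. That is the car-versus-bus trade-off from earlier; whether embedding is right still depends on whether you really need all of that data together.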
Other signs you may run into trouble with your schema in the future: you haven't considered the relative SLAs of reads vs. writes. Usually, if we architect our system to make one of those faster, it's at the cost of the other (more on that when we come to indexes). So knowing up front which one you can afford to be a bit slower (higher latency) will help you make these trade-offs correctly.
Another one: you have lots of different types of documents in the same collection. Usually it's a sign of trouble. [PAUSE]
You have lots of different types of values in the same field across a collection (sometimes string, sometimes date, sometimes number). [PAUSE] That will bring you to the BIGGEST warning sign: your queries can't use indexes efficiently.
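A small sketch of why mixed value types in one field cause trouble. The documents and the `range_match` filter are invented for illustration; the filter is a toy that only compares numbers with numbers, which is an assumption made here to mimic how a numeric range condition typically ignores string values.

```python
docs = [
    {"_id": 1, "price": 20},
    {"_id": 2, "price": "35"},     # same field, but stored as a string
    {"_id": 3, "price": 50},
    {"_id": 4, "price": None},
]


def range_match(docs, field, low):
    """Toy range filter: only numeric values can match a numeric bound."""
    return [d["_id"] for d in docs
            if isinstance(d[field], (int, float)) and d[field] > low]


def types_in_field(docs, field):
    """List the distinct value types found in one field across documents."""
    return sorted({type(d[field]).__name__ for d in docs})


print(range_match(docs, "price", 30))
print(types_in_field(docs, "price"))
```

The document with the string "35" silently drops out of the "price greater than 30" result. A quick audit like `types_in_field` on a sample of your data is one way to spot this before it turns into a query problem.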
Can't use indexes efficiently: - unanchored or case-insensitive regexes - you need dozens of indexes for a single collection - worst, you have no idea what indexes you might possibly need on a collection. Which brings us to the other biggest determining factor of whether your application will be fast:
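As an aside on the regex point, the sketch below shows why anchoring matters. It is a toy model, not MongoDB's implementation: a sorted Python list stands in for an index on a hypothetical `name` field, and we simply count how many index keys each kind of search has to examine.

```python
import bisect
import re

# A sorted list of values stands in for an index on a "name" field.
names = sorted(f"user{i:05d}" for i in range(10000))


def prefix_search(index, prefix):
    """Anchored prefix match (like /^prefix/): binary-search to the first
    candidate, then walk only the contiguous range sharing the prefix."""
    start = bisect.bisect_left(index, prefix)
    matches, examined = [], 0
    for key in index[start:]:
        examined += 1
        if not key.startswith(prefix):
            break
        matches.append(key)
    return matches, examined


def unanchored_search(index, pattern):
    """Unanchored match (like /pattern/): the sort order does not help,
    so every key has to be tested."""
    regex = re.compile(pattern)
    matches, examined = [], 0
    for key in index:
        examined += 1
        if regex.search(key):
            matches.append(key)
    return matches, examined


prefix_matches, prefix_examined = prefix_search(names, "user0999")
regex_matches, regex_examined = unanchored_search(names, "0999")

print(f"anchored prefix examined {prefix_examined} keys")
print(f"unanchored regex examined {regex_examined} keys")
```

The anchored search touches only the small range of keys that can possibly match, while the unanchored one has to look at the whole index. The same reasoning applies to case-insensitive matching, where the stored sort order no longer lines up with the pattern.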
I wouldn't be exaggerating if I told you that when our support is dealing with a customer whose application is "slow", over 90% of the time the indexes are suboptimal or outright missing for some high percentage of the slow operations! And this is in spite of the fact that we constantly harp about how important indexing is to good performance, and of course *all* databases require indexing to work well, right? Let me show you how BAD life is with no indexes:
Here is my bridge analogy extended to such systems: imagine that every morning a bus, let's say NJ Transit, picks up passengers at bus stops and then heads across the GWB. How would it impact the "latency" of the trip if we threw away the schedule and the signed bus stops, and the bus just drove down every street to see if any of the people who wanted to go to NY were there? I don't imagine that would work very well. YOUR APP=query:{ } [PAUSE] And yet users deploy applications into production without having proper indexes in place - frequently because they didn't do proper testing - they didn't benchmark their application's performance (more about that at the end).
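The bus analogy can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `riders` collection and its `stop` field are made up, and a plain dictionary stands in for an index (a real database index is a tree structure, not a hash map). What matters is the number of documents examined.

```python
# Hypothetical collection of bus riders; "stop" is the field we query on.
riders = [{"_id": i, "stop": f"stop-{i % 500}"} for i in range(50000)]


def collection_scan(docs, stop):
    """No index: drive down every street, examining every document."""
    examined = 0
    found = []
    for doc in docs:
        examined += 1
        if doc["stop"] == stop:
            found.append(doc["_id"])
    return found, examined


def build_index(docs, field):
    """Build a simple lookup structure: field value -> list of _ids."""
    index = {}
    for doc in docs:
        index.setdefault(doc[field], []).append(doc["_id"])
    return index


def index_lookup(index, stop):
    """With an index: go straight to the signed bus stop."""
    found = index.get(stop, [])
    return found, len(found)


stop_index = build_index(riders, "stop")
scan_ids, scan_examined = collection_scan(riders, "stop-42")
index_ids, index_examined = index_lookup(stop_index, "stop-42")

print(f"scan examined {scan_examined}, index examined {index_examined}")
```

Both approaches find the same riders, but the scan looks at every document in the collection to do it. The index is not free either: it has to be built and kept up to date on every write, which is the read-versus-write trade-off mentioned earlier.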
I'm sure you are all excited to hear about how awesome WiredTiger is - and it is! But of course - the right tool for the job and all that. There are a couple of important differences between MMAP and WT that I want you to understand so you can take advantage of the strengths of each.
The most easily seen difference: WT has on-disk compression. MMAP does not. MMAP does X. WT does Y. Will it help with RAM? Yes - index prefix compression.
Index prefix compression 7X (1/7th) 20% or less!
40%
3%
We have our own application, Evergreen - our continuous integration system that runs thousands of tests and has TBs of log files. It was doing fine with MMAP, but with 10x compression in WT we are now able to keep 10x as many runs of history! There is a talk about it tomorrow afternoon.
If disk resource is a big limiting factor for your application, AND your data is highly compressible, AND you have CPU cycles available, then WT FTW!
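A quick way to get a feel for "highly compressible" is to compress a sample of your own data. The sketch below uses `zlib` from Python's standard library purely as a stand-in (WiredTiger uses snappy block compression by default, with zlib as an option), and the log lines are invented for illustration, so the ratio it prints says nothing about what you would see in a real deployment.

```python
import zlib

# Hypothetical, highly repetitive log lines, similar in spirit to test logs.
log_lines = [
    f"2015-12-02T10:00:{i % 60:02d} INFO test_suite_{i % 20} passed in {i % 7} ms\n"
    for i in range(5000)
]
raw = "".join(log_lines).encode("utf-8")

compressed = zlib.compress(raw)
ratio = len(raw) / len(compressed)
print(f"raw={len(raw)} bytes, compressed={len(compressed)} bytes, ratio={ratio:.1f}x")

# Compression is not free: it costs CPU on the way in and on the way out,
# which is why spare CPU cycles are part of the trade-off.
restored = zlib.decompress(compressed)
```

Data that is already compressed or close to random (images, encrypted blobs) will show a ratio near 1, in which case the CPU is spent for little gain.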
This is an interesting and complex one: CONCURRENCY impacts both latency and throughput. A lot has been said over the years about MMAP's coarse-grained concurrency. It's like having relatively few toll booths in front of the GWB. It can be a limiting factor. But for the actual execution of the operation, mmap is "faster", i.e. lower latency. WiredTiger has very fine-grained concurrency; in fact, it's not "document level *locking*": it uses clever lock-free algorithms to achieve a high degree of concurrency. But related to that, the latency of a single operation is higher than with mmap. WT: higher throughput, higher latency.
Why would granularity of locking impact latency this way? Imagine the GWB lanes again... MMAP is like having one toll booth (or one per lane): once you pay the toll, you *know* you are the only person in that lane, so you can go as fast as possible.
WiredTiger, well, I'm stretching the metaphor a little here, but imagine that there are no toll booths. Everyone has EZ-Pass or FasTrak or whatever. And you drive to your lane, BUT you might find yourself in contention with another car for this lane, and then one of you has to stop and try again. So first, you can't drive quite so fast, because you have to be able to notice another car in your lane in time to stop, and second, if you do meet contention, then you have to stop and try again. WRITE-CONFLICTS. NO BLIND WRITES.
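The "stop and try again" behavior can be sketched as optimistic concurrency with a version check. To be clear, this is a toy model of the general idea and not a description of WiredTiger's internals: the `VersionedStore` class, the `WriteConflict` exception, and the `interfere` hook are all hypothetical names invented for the illustration.

```python
class WriteConflict(Exception):
    """Raised when another writer changed the document first."""


class VersionedStore:
    """Toy store: each document carries a version number."""

    def __init__(self):
        self.docs = {}  # key -> (version, value)

    def read(self, key):
        return self.docs.get(key, (0, 0))

    def write_if_unchanged(self, key, expected_version, new_value):
        current_version, _ = self.read(key)
        if current_version != expected_version:
            raise WriteConflict(key)
        self.docs[key] = (current_version + 1, new_value)


def increment(store, key, interfere=None):
    """Read, modify, write; on conflict, stop and try again.
    Returns how many retries were needed."""
    retries = 0
    while True:
        version, value = store.read(key)
        if interfere is not None:
            interfere()       # another car enters our lane after we looked
            interfere = None  # only interfere once
        try:
            store.write_if_unchanged(key, version, value + 1)
            return retries
        except WriteConflict:
            retries += 1


store = VersionedStore()

# No contention: the write goes through on the first attempt.
uncontended_retries = increment(store, "counter")

# Contention: a second writer slips in between our read and our write.
contended_retries = increment(
    store, "counter", interfere=lambda: increment(store, "counter")
)

final_version, final_value = store.read("counter")
print(uncontended_retries, contended_retries, final_value)
```

No update is lost, because a write is never applied on top of a version the writer did not read. The cost is the extra check on every write and the repeated work whenever two writers meet on the same document.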
So when is this a big win for WiredTiger? Well, you have to have (a) multiple threads! Too few threads and you aren't winning big from the clever algorithms; (b) multiple threads have to be contending on the same collection (otherwise mmap has a collection-level lock); (c) multiple threads must NOT all be contending on a single document (if they are then, well, you see); (d) CPU available; but (e) you must not have significantly more threads than you have "lanes" - in this case CPU processors. Here are some "benchmarks":
Not contending on the same document!!! vs contending. Access distributions: uniform, latest, zipfian.
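How much threads contend on the same document depends heavily on the access distribution. The sketch below compares a uniform pattern with a zipfian-like one (weight 1/rank, which is an assumption; the exact distributions used in the benchmarks are not given here) and reports what share of operations hit the single hottest document. The "latest" distribution is not modeled.

```python
import random
from collections import Counter


def hottest_key_share(keys):
    """Fraction of all operations that landed on the most popular key."""
    counts = Counter(keys)
    return counts.most_common(1)[0][1] / len(keys)


rng = random.Random(42)  # fixed seed so runs are repeatable
num_docs, num_ops = 1000, 20000
doc_ids = list(range(num_docs))

# Uniform: every document is equally likely to be touched.
uniform_ops = rng.choices(doc_ids, k=num_ops)

# Zipfian-like: the document at rank r is chosen with weight 1/r, so a few
# "hot" documents receive a disproportionate share of the traffic.
zipf_weights = [1.0 / (rank + 1) for rank in range(num_docs)]
zipf_ops = rng.choices(doc_ids, weights=zipf_weights, k=num_ops)

uniform_share = hottest_key_share(uniform_ops)
zipf_share = hottest_key_share(zipf_ops)

print(f"hottest document share: uniform={uniform_share:.4f}, zipfian={zipf_share:.4f}")
```

With a skewed distribution, many concurrent writers end up in the same "lane", which is exactly the case where fine-grained concurrency helps least.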
You must not have significantly more threads than you have "lanes" - in this case CPU processors. If you have a huge number of threads which are all trying to do active work on a small number of cores, then you will waste a huge amount of resources on just context switching and not actually doing work, plus you have more threads contending on the same documents.
Even for read-heavy loads, with a huge number of threads all trying to do active work on a small number of cores, you will waste a huge amount of resources on just context switching and not actually doing work. That's concurrency and multithreading.
So please don't do any single threaded benchmarking of WT and then ask how come it's not as fast as you heard. But don't benchmark 500 threads on a 4-core laptop!
The other significant differentiator is the "write pattern". I'm not talking about compressing data on disk and using the disk IOPS a lot more judiciously than MMAP. I'm talking about write amplification. There is a big difference in how writes are done during updates: MMAP does "in place" updates; WT does "copy on write" on all updates. Here is an illustration using a document rather than a bridge.
Here's a time series document for a particular hour, with minutes and seconds. If you make an update to this document, mmap will overwrite the existing document with the new value.
Back to the original document: WiredTiger will rewrite the current document on update (or, more technically, the internal page that contains that document) as a new version of that page. This of course enables whoever was reading that page to keep reading the previous version of it, which will get recycled when everyone who was using it is done with it. USE CASE: think about the use case where you have a very high number of documents, which are nonetheless a small portion of your total data, that are being updated extremely frequently, over and over again.
I'm talking of course about a system like the MMS monitoring component, which receives a large number of performance metrics and updates counters inside documents that don't change, except for these numbers being incremented, for the duration of whatever the document represents. Here, with a schema heavily optimized to make sure updates are in place, performance is better with mmap even though it uses up more disk space (and RAM).
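The difference between the two write patterns can be illustrated with ordinary Python objects. This is a toy model under loud assumptions: the document shape is made up, a whole-document copy stands in for rewriting a page, and "fields written" is only a rough proxy for write amplification.

```python
import copy


def make_hour_doc():
    # Hypothetical time-series document: one counter per minute of the hour.
    return {"hour": 10, "minutes": {m: 0 for m in range(60)}}


# In-place update (the MMAP style described above): the existing document
# is overwritten, so anyone holding a reference sees the change.
in_place_doc = make_hour_doc()
in_place_reader = in_place_doc
in_place_doc["minutes"][5] += 1

# Copy-on-write (the WT style described above): the update produces a new
# version, and a reader of the previous version keeps seeing it unchanged.
versions = [make_hour_doc()]
cow_reader = versions[-1]                   # reader pins the current version
new_version = copy.deepcopy(versions[-1])   # the whole unit is rewritten
new_version["minutes"][5] += 1
versions.append(new_version)

# Rough proxy for write amplification: counters rewritten per logical change.
in_place_fields_written = 1
cow_fields_written = len(new_version["minutes"])

print(in_place_reader["minutes"][5], cow_reader["minutes"][5],
      versions[-1]["minutes"][5])
print(in_place_fields_written, cow_fields_written)
```

The copy-on-write reader gets a stable view without blocking the writer, but one small increment costs a rewrite of the whole unit. For a workload that is almost nothing but tiny counter increments, that extra work is what tips the balance.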
And this brings me to the most important point I'm going to make: all the generalizations are just that. No matter what I told you here today, no matter what you read on the internet, the only way to know for sure how fast your application is with your carefully selected schema and your carefully selected indexes is to stress test and measure it. The examples I used are both applications we run in-house that we benchmarked with both storage engines, with different configurations and physical resources, to make the most appropriate choices - you guys should do the same. Oh, and if you happen to be going back to Jersey tonight and you want to have predictable latency, do yourself a favor and take the train.
Thank you!