Managing a MongoDB deployment discusses operational best practices for MongoDB including redundancy with master-slave replication or replica sets, controlling blocking operations, handling eventual consistency, and scaling out deployments. It provides details on Wordnik's MongoDB architecture using Java application servers, REST APIs, master-slave replication across data centers, and tools they developed for backups, restores, and incremental replication. It emphasizes understanding your data and access patterns to choose the right operational solutions.
Migrating from MySQL to MongoDB at Wordnik (Tony Tam)
Wordnik migrated their live application from MySQL to MongoDB to address scaling issues. They moved over 5 billion documents totaling over 1.2 TB of data with zero downtime. The migration involved setting up MongoDB infrastructure, designing the data model and software to match their existing object model, migrating the data, and optimizing performance of the new system. They achieved insert rates of over 100,000 documents per second during the migration process and saw read speeds increase to 250,000 documents per second after completing the move to MongoDB.
A presentation on the selection criteria, testing and evaluation, and the successful zero-downtime migration to MongoDB. It also covers details on Wordnik's speed and stability, as well as how NoSQL technologies have changed the way Wordnik scales.
Prometheus lightning talk, Devops Dublin, March 2015 (Brian Brazil)
This document introduces Prometheus, an open-source monitoring system that allows instrumentation of everything, including RPCs, interfaces, business logic, and logs. It provides client libraries that make instrumentation easy across many languages. The Prometheus server can handle over a million time series in one instance with no external dependencies. It offers dashboards, expression queries, and alerts, and integrates with many systems. Time series carry structured labels, allowing flexible aggregation and complex math for rules and alerts. Prometheus costs less than $0.001 per time series per month; it was developed at SoundCloud, is used by companies such as Boxever and Docker, and has an active community.
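As a hedged sketch of what such instrumentation looks like with the official Python client library, prometheus_client (the metric names and endpoint are invented for illustration, not taken from the talk):

```python
# Instrumenting application code with the Prometheus Python client.
# Metric and endpoint names here are illustrative.
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

# A counter for handled requests, labeled by endpoint for flexible aggregation.
REQUESTS = Counter("app_requests_total", "Total requests handled", ["endpoint"])
# A histogram observing request latency in seconds.
LATENCY = Histogram("app_request_latency_seconds", "Request latency")

def handle_request(endpoint):
    with LATENCY.time():                          # records elapsed time on exit
        REQUESTS.labels(endpoint=endpoint).inc()
        time.sleep(random.uniform(0.001, 0.005))  # simulated work

start_http_server(8000)  # exposes /metrics for the Prometheus server to scrape
for _ in range(5):
    handle_request("/api/words")
```

The Prometheus server would then scrape `http://host:8000/metrics` on an interval and evaluate rules and alerts over the labeled series.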
Charity Majors works as a systems engineer at Parse, a platform for mobile developers. Parse uses MongoDB for various purposes, including storing user data, DDoS protection and query profiling, and analytics for billing and logging. Charity provides advice on best practices for running MongoDB in production environments at scale, such as using replica sets, taking regular snapshots, automating setup and maintenance with Chef, and using provisioned IOPS volumes to improve performance.
From MySQL to MongoDB at Wordnik (Tony Tam), MongoSF
Wordnik migrated their live application from MySQL to MongoDB to address scaling issues. They moved over 5 billion documents totaling over 1.2 TB of data with zero downtime. The migration involved setting up MongoDB infrastructure, designing the data model and software to match their existing object model, migrating the data, and optimizing performance of the new system. They achieved insert rates of over 100,000 documents per second during the migration process and saw read speeds increase to 250,000 documents per second after completing the move to MongoDB.
Sam Weaver, a MongoDB Product Manager, introduces MongoDB Compass. He discusses the need for Compass due to customer requests for quicker prototyping, less friction on handovers, and easier learning of MongoDB Query Language (MQL). He demos Compass' features like viewing schemas and sampling data from MongoDB databases. Finally, he outlines future plans like supporting more database operations and statistics, and sharing queries.
Testing Distributed Query Engine as a Service (takezoe)
Naoki Takezoe from Treasure Data discussed testing their distributed query engine Presto as a service. They developed a tool called presto-query-simulator to test using production data and queries in a safe manner. The tool reduces testing time by grouping similar queries and narrowing data scans. It also helps analyze results and find problematic queries. Future work includes running tests more frequently and improving coverage.
NoSQL datastores fall into four categories: key-value stores, document databases, column-family stores, and graph databases. The traditional TPC-* tests are not sufficient for these heterogeneous database systems. MongoDB, CouchDB, Cassandra, HBase, Memcached, and similar systems each belong to one of the four families, and a common workload can be generated with YCSB to simulate your use case and benchmark them.
MongoDB Introduction talk at Dr Dobbs Conference, MongoDB Evenings at Bangalore (Prasoon Kumar)
MongoDB is a leading NoSQL database: a horizontally scalable document datastore. In this introduction, given at the Dr Dobbs Conference in Bangalore and Pune in April 2014, I show schema design with an example blog application and Python code snippets. I delivered the same talk at the maiden MongoDB Evening events in Delhi and Gurgaon in May 2014.
When constructing a data model for your MongoDB collection for a CMS, there are various options to choose from, each with its own strengths and weaknesses. The three basic patterns are:
1. Store each comment in its own document.
2. Embed all comments in the “parent” document.
3. A hybrid design that stores comments separately from the “parent” but aggregates them into a small number of documents, each containing many comments.
Code samples and wiki documentation are available at https://github.com/prasoonk/mycms_mongodb/wiki.
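As a sketch of the three patterns as plain Python document shapes (field names and the bucket size are illustrative, not taken from the linked repository):

```python
# Sketch of the three comment-storage patterns as plain document shapes.
# Field names and the bucket size are illustrative.

# 1. One document per comment, linked to its parent article by id.
comment_per_doc = {"article_id": 42, "author": "alice", "text": "Nice post!"}

# 2. All comments embedded in the parent article document.
embedded = {
    "_id": 42,
    "title": "Schema design in MongoDB",
    "comments": [
        {"author": "alice", "text": "Nice post!"},
        {"author": "bob", "text": "Agreed."},
    ],
}

# 3. Hybrid: fixed-size "buckets" of comments per article, so each bucket
#    document holds up to N comments (here N = 100).
BUCKET_SIZE = 100

def bucket_key(article_id, comment_index):
    """Route a comment to its bucket document: (article_id, page number)."""
    return {"article_id": article_id, "page": comment_index // BUCKET_SIZE}

# The 150th comment on article 42 lands in the second bucket (page 1).
assert bucket_key(42, 150) == {"article_id": 42, "page": 1}
```

The hybrid bucketing keeps documents bounded in size (avoiding unbounded-array growth in pattern 2) while needing far fewer reads than one-document-per-comment.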
The document summarizes the key improvements in MongoDB version 2.6, including improved operations, integrated search capabilities, query system enhancements, improved security features, and better performance and stability. Some of the main updates are bulk write operations, background indexing and replication, storage allocation improvements to reduce fragmentation, full text search integration, index intersection capabilities, aggregation framework enhancements, and auditing functionality. The presentation provides details on each of these areas.
MongoDB .local Bengaluru 2019: Using MongoDB Services in Kubernetes: Any Plat... (MongoDB)
The MongoDB Kubernetes operator is ready for prime time. Learn how MongoDB can be used with Kubernetes, the most popular orchestration platform, to bring self-service, persistent storage to your containerized applications.
Web Performance: The Most Effective Techniques from Practice (Felix Gessert)
An average web page loads 2299 KB of data and makes 100 HTTP requests to do so. No one these days doubts that load times have an immense influence on user satisfaction and business metrics. But opinions diverge widely on which techniques effectively minimize load times. This talk gives a detailed overview of the most important web performance optimization techniques, from the critical rendering path to distributed caching infrastructures, using a real-world example.
Building Codealike: a journey into the developer analytics world (Oren Eini)
Codealike plugins for Visual Studio, Eclipse, and Chrome track developers while they code and perform analytic calculations at the millisecond level. For such write-heavy workloads, using RavenDB as the main and only database was not without challenges. In this talk, we reveal how we built and scaled such a solution, how we were able to improve performance with Voron, and take a glance at our own mistakes and architectural choices along the way.
MongoDB is a document-oriented, open-source database that is high-performing, horizontally scalable, and full-featured. It stores data in flexible JSON-like documents, which allows the schema to evolve over time. MongoDB can be easily scaled out across commodity servers and provides high availability with automatic replication and recovery. It supports dynamic queries and indexing and has drivers for many languages.
The document discusses MongoDB's plans to implement an encrypted storage engine. It will begin by explaining why MongoDB needs encryption at rest and how an encrypted storage engine benefits users. It then details how the encrypted storage engine will work, including using a key manager to handle encryption keys, encrypting data with WiredTiger storage engine using AES-256, and supporting encryption at the database level. It concludes by stating that the encrypted storage engine will be available in an upcoming MongoDB Enterprise Advanced release.
MMS - Monitoring, backup and management at a single clickMatias Cascallares
MMS is MongoDB's monitoring, backup, and management system that provides:
- Server and cluster monitoring with metrics, alerts and activity feeds
- Backup of replica sets and sharded clusters with initial sync and incremental backups stored as snapshots
- Restore of backups to point-in-time within the last 24 hours
- Automation capabilities for tasks like capacity resizing, provisioning machines, and rolling upgrades
It has cloud and on-premise deployment options with pricing based on usage for the cloud version. MMS aims to simplify managing large MongoDB deployments through monitoring, backups and automation.
- Rediff News uses MongoDB for its publishing system to manage the lifecycle of articles, store article metadata and roles, acquire external feeds, enable tagging and notifications, and power search and data visualization on maps.
- The system allows users to upload Excel data, match it to map attributes, generate articles using data science insights, and visualize data on interactive maps.
- Rediff's architecture uses POJOs to define schemas, custom collections to store different data types, and a REST layer to expose data resources and abstract storage from applications.
Living with SQL and NoSQL at craigslist, a Pragmatic Approach (Jeremy Zawodny)
From the 2012 Percona Live MySQL Conference in Santa Clara, CA.
Craigslist uses a variety of data storage systems in its backend systems: in-memory, SQL, and NoSQL. This talk is an overview of how craigslist works with a focus on the data storage and management choices that were made in each of its major subsystems. These include MySQL, memcached, Redis, MongoDB, Sphinx, and the filesystem. Special attention will be paid to the benefits and tradeoffs associated with choosing from the various popular data storage systems, including long-term viability, support, and ease of integration.
Presentation in video is here: https://youtu.be/7YHKwwTr-9I
Scalability is an important consideration when architecting a web app. There are multiple options for scaling the web app tier and the database tier. Those options are explained with examples from Microsoft Azure services. If you are a beginner and want to understand the fundamentals of scalability and resiliency, this presentation is for you.
Presented technologies and concepts: Scale Up, Scale Out, Microservices, VM, Load Balancer, CDN, Redis Cache, API Management, Queues, Sharding, DDD, NoSQL
MongoDB Days Silicon Valley: A Technical Introduction to WiredTiger (MongoDB)
Presented by Osmar Olivo, Product Manager, MongoDB
Experience level: Introductory
WiredTiger is MongoDB's first officially supported pluggable storage engine, as well as the new default engine in 3.2. It exposes several new features and configuration options. This talk highlights the major differences between the MMAPv1 and WiredTiger storage engines, including concurrency, compression, and caching.
Lessons Learned Migrating 2+ Billion Documents at Craigslist (Jeremy Zawodny)
Lessons Learned from Migrating 2+ Billion Documents at Craigslist outlines Craigslist's migration from MySQL to MongoDB. Some key lessons include: knowing your hardware limitations, that replica sets provide high availability during reboots, understanding your data types and sizes, and being aware of limitations with sharding and replica set re-sync processes. The migration addressed issues with their archive data storage and provided a more scalable and performant system.
What's new in MongoDB 2.6, India event (MongoDB APAC)
The document summarizes the key improvements in MongoDB 2.6, including improvements to operations, integrated search capabilities, the query system, security features, and performance and stability. Some of the main enhancements are bulk writes for importing large datasets, background indexing and replication, storage allocation improvements to reduce fragmentation, integrated text search support for multiple languages, new aggregation framework operators, and auditing capabilities.
This document discusses consuming web services from mobile applications. It covers common mobile development challenges like limited screen space, CPU power, and bandwidth. It then provides an overview of technologies used to access web services like XML, JSON, REST, and SOAP. Examples are given of using AsyncTask and Services in Android to make asynchronous web service calls. Code demonstrations and additional resources are also referenced.
Bloom Filters for Web Caching - Lightning Talk (Felix Gessert)
This document discusses using Bloom filters to cache dynamic web content for low latency. It describes how Bloom filters can be used to proactively revalidate cached data and check if it is still fresh. An end-to-end example is provided showing how Bloom filters could work from the browser cache to a CDN to check cached objects and estimate their time-to-live. Code snippets demonstrate integrating Bloom filters into querying and loading data from a backend database to leverage caching. The goal is to serve dynamic content from ubiquitous web caches for low latency with less processing.
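As a rough illustration of the idea (the parameters and key names are invented, and this is not the talk's actual implementation), a minimal Bloom filter and the freshness check might look like:

```python
# Minimal Bloom filter sketch (pure Python, illustrative parameters) for the
# freshness check described above: the server keeps a filter of cache entries
# it has invalidated; a client consults it before trusting its cached copy.
import hashlib

class BloomFilter:
    def __init__(self, size=1024, hashes=3):
        self.size, self.hashes, self.bits = size, hashes, 0

    def _positions(self, key):
        # Derive k independent bit positions from salted SHA-256 digests.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        # False means "definitely not present"; True may be a false positive.
        return all((self.bits >> pos) & 1 for pos in self._positions(key))

# The server adds URLs whose cached copies have become stale.
stale = BloomFilter()
stale.add("/api/listings/123")

# Client: revalidate with the origin only when the URL might be stale.
assert stale.might_contain("/api/listings/123")
# A key never added is almost always reported absent; false positives are
# possible but rare at this filter size, and they only cost an extra
# revalidation, never a stale read.
```

The asymmetry is what makes this safe for caching: a "no" answer is exact, so fresh cache hits are served with no round trip.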
The document discusses lessons learned from integrating MongoDB into eCommerce websites. Some key points:
- The EAV data model used by Magento is slow and performs poorly at scale, motivating a transition to MongoDB.
- Early approaches stored all product data in MongoDB but this broke features relying on SQL. A hybrid model using MongoDB for most attributes and MySQL for key fields worked better.
- The learning curve is high but storing data to match queries, managing transactions carefully, and using search engines are important. Near real-time processing can improve performance significantly.
- Backup and replication require special attention in distributed architectures. The open-source MongoGento module developed by Smile improves Magento performance.
Magento sites need optimization to load fast and provide a good user experience. Speeding up a site increases sales and improves SEO. Factors that impact load time include network transfers and the resource-intensive nature of Magento. Benchmarking tools like Apdex, FunkLoad, YSlow, and PageSpeed help measure performance and set goals, such as loading the homepage in under 1.5 seconds. Architectures must be sized properly and include techniques like splitting front and back ends, enabling caching, and using a CDN. The Nitrogento extension automates optimizations like block caching, sprite generation, and asset minification to significantly improve performance.
The document summarizes an author's first experience using Google AppEngine, a platform that allows developers to build and host web applications on Google's infrastructure without having to maintain servers. The author discusses how AppEngine provides a scalable hosting environment but has limitations like blacklisting of certain technologies, per-account application limits, and constraints on data storage and querying. The author also explores services provided by AppEngine and notes that while much is free, usage beyond quotas requires payment.
Scaling a web application involves making it able to handle growing amounts of work gracefully as demand increases. There are two main approaches: scaling up by upgrading hardware, which provides diminishing returns, and scaling out by adding more nodes. Scaling out involves sharding the database across several servers so that no single machine contains all the data. While relational databases guarantee consistency and are easier to use, they generally do not scale out as well as NoSQL databases, which can distribute data independently across shards for improved read and write throughput at the cost of consistency. The choice of database depends on the specific application's needs, and either relational or NoSQL options may be suitable depending on the situation.
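A toy sketch of the scale-out approach described above (shard names and the hash choice are illustrative): hashing a shard key routes each document to one node, so no single machine holds all the data and the same key always reads from where it was written:

```python
# Toy sketch of scale-out by sharding: route each document to one of several
# shards by hashing its shard key. Shard names are illustrative.
import hashlib

SHARDS = ["shard-a", "shard-b", "shard-c"]

def shard_for(key):
    """Deterministically map a shard key to a node."""
    digest = hashlib.md5(str(key).encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# Writes for different users spread across nodes...
placement = {user: shard_for(user) for user in ["u1", "u2", "u3", "u4"]}
# ...and the same key always routes to the same shard, so reads find their data.
assert shard_for("u1") == placement["u1"]
```

Real systems typically use consistent hashing or range-based chunks instead of a plain modulus, so that adding a node does not remap most keys; this sketch only shows the routing idea.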
Klmug presentation - Simple Analytics with MongoDB (Ross Affandy)
This document discusses using MongoDB for analytics and lessons learned from implementing analytics on a car listings website. It describes the technical stack including MongoDB, challenges with slow map reduce queries and server crashes, and solutions tried like moving to aggregation queries and increasing hardware. A key lesson was that data modeling is important - denormalizing data and simplifying queries through adding redundant data improved performance and solved the problem more cost effectively than increasing resources.
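The data-modeling lesson can be sketched as follows (collection and field names are invented, not from the talk): maintain a redundant, query-ready summary at write time instead of running a slow map-reduce at read time:

```python
# Sketch of the denormalization lesson: instead of recomputing an expensive
# aggregate (e.g. listings per make) at read time, keep a redundant counter
# that is updated on every write. Names are illustrative.

listings = []        # stands in for the listings collection
counts_by_make = {}  # redundant, query-ready summary

def add_listing(doc):
    listings.append(doc)
    # Maintain the denormalized counter at write time...
    counts_by_make[doc["make"]] = counts_by_make.get(doc["make"], 0) + 1

add_listing({"make": "Proton", "model": "Saga"})
add_listing({"make": "Proton", "model": "Wira"})
add_listing({"make": "Perodua", "model": "Myvi"})

# ...so the analytics read is a cheap lookup, not a collection scan.
assert counts_by_make == {"Proton": 2, "Perodua": 1}
```

The trade-off is extra work and storage on every write in exchange for constant-time reads, which is usually cheaper than scaling up hardware to make the aggregation fast.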
The document provides tips on common scalability mistakes made when designing web applications and strategies to avoid them. It discusses the importance of considering scalability from the beginning, avoiding blocking calls, caching frequently accessed data, optimizing database and file system usage, and using tools like profilers to identify bottlenecks. The key is designing applications that can scale both up and down based on current needs through a proactive, process-oriented approach.
Ruby on Rails Performance Tuning. Make it faster, make it better (WindyCityRa... - John McCaffrey
(reposting with clearer title)
Performance tuning presentation from WindyCityRails 2010.
Why performance matters
The right way to approach it
Front end testing tools
Automated testing tools
Common problems and the ways to solve them in Rails
Rails specific tools
bullet
slim_scrooge
rack bug
request log analyzer
rails indexes
This document discusses MongoDB and Mongoid. It begins with an introduction to NoSQL and MongoDB as a document-oriented database. It then compares MongoDB to SQL databases and describes how Mongoid allows mapping Ruby objects to MongoDB documents and embedded documents, similar to how ActiveRecord works with SQL databases. The document provides examples of using Mongoid to define models and queries. It notes pros and cons of Mongoid and encourages trying it. It concludes by sharing more information on MongoDB and Mongoid and announcing an upcoming meeting.
Management and Automation of MongoDB Clusters - Slides - Severalnines
Use MongoDB at Any Scale
As you scale, one of the challenges is optimizing your clusters and mitigating operational risk. Proper preparation can result in significant savings and reduced downtime.
This session covers:
* Deployment of dev/test/production environments across private data centers or public clouds
* What to monitor in production environments
* Management automation with ClusterControl from Severalnines
* How ClusterControl works with TokuMX
The session will give you the tools to more effectively manage your cluster, immediately. The presentation will include code samples and a live Q&A session.
This webinar is being delivered jointly by Severalnines & Tokutek. Severalnines provides automation and management tools to reduce the complexity of working with highly available database clusters. Tokutek provides high-performance and scalability for MongoDB, MySQL and MariaDB.
BP101 - 10 Things to Consider when Developing & Deploying Applications in Lar... - Martijn de Jong
Many common development techniques can cause dramatic effects when your application is rolled out over hundreds of servers. As a developer, you need a good understanding of certain parts of the infrastructure to build an application designed for wide-scale deployment. System administrators who review applications before deployment should know what to look for in the code to prevent problems when rolled out to production. This session takes a look at the area where Application Development and System Administration come together. You will hear about real-life problems, view examples of bad code as well as good code, and learn what you should consider when you have to develop or deploy an application which will be rolled out in a large-scale deployment, or how to "harden" your code to support large quantities of documents.
This document summarizes a presentation about building an IoT application using the MEAN stack. It discusses five key things they learned: performance depends on test data; MEAN is fast to develop with but frameworks can obscure what's happening so profiling is important; incremental aggregation works well for IoT; Node.js bottlenecks before MongoDB; and performance tuning patterns like identifying bottlenecks and slam-dunk optimizations. It also describes building user stories for an advertising application, modeling the data, initial measurements that guided prototyping, challenges of scaling, and using "boxes" to identify hot sales areas.
Shinken is a full rewrite of Nagios in Python that aims to solve issues with scaling, high availability, and simplifying administration for modern IT infrastructures. Key features include built-in high availability, multi-level load balancing, support for multiple platforms, faster performance, and advanced business rules. The Shinken web interface focuses on aggregating related elements and showing dependencies to help both technical and non-technical users understand business impacts. Advanced modules allow for discovery, triggers for passive data, and templating to reduce configuration complexity.
The 90-Day Startup with Google AppEngine for Java - David Chandler
The document discusses Google App Engine, a platform for developing and hosting web applications on Google's infrastructure. It provides an overview of App Engine and how to get started, discusses some limitations and tradeoffs compared to traditional web hosting, and recommends frameworks and techniques for building scalable applications on App Engine, including Objectify, Guice, and gwt-dispatch. It also notes that while App Engine is still relatively new, it has significant potential for developing scalable applications with minimal upfront costs.
The document provides an overview of scaling principles for web applications, beginning with optimizing a single server application and progressing to more advanced architectures involving load balancing, multiple web/application servers, and multiple database servers. It discusses profiling applications to identify bottlenecks, various caching and optimization strategies, Apache configuration for handling load, and links to additional resources on related topics.
The document provides an overview of scaling principles for web applications, beginning with optimizing a single server application and progressing to more advanced architectures involving load balancing, multiple web/application servers, and multiple database servers. It discusses profiling applications to identify bottlenecks, various caching and optimization strategies, Apache configuration for prefork MPM, and load balancing technologies like DNS round robin, Apache reverse proxy, HAProxy and Pound. Links are provided to additional resources on related topics.
Building configurable applications for the web - supertom
The document discusses building configurable web applications. It recommends making as many aspects of an application configurable as possible without changing code, such as hostnames, database names, timeouts and display settings. This allows applications to be deployed across multiple platforms and environments without redeploying code. The document also stresses the importance of writing debuggable and defensive code to make applications more stable and operable in different states.
Similar to Keeping the Lights On with MongoDB (20)
A Tasty deep-dive into Open API Specification Links - Tony Tam
From the March APICraft meetup in San Francisco, we dive into the details of one of the newest features of the Open API Specification (fka Swagger Specification) called links.
While not intended as a replacement for Hypermedia, the OAS 3.0 Links feature provides design-time designation for rich traversals between operations
Presented at JavaOne 2016.
Using Swagger has become the most popular way to describe REST APIs across the web, enabling people to more quickly understand and communicate with services, with developer-friendly documentation and rich, autogenerated client SDKs. As the API has moved more into being one of the most important aspects of a service, the Swagger definition has become increasingly more important and essential to the design phase. This presentation explains how the Swagger definition can be used to streamline the iteration process and enable client and server engineers to develop concurrently with complex APIs.
This document discusses how Swagger can be used to develop APIs faster. It describes what Swagger is, provides an example Swagger YAML file, and discusses how code can be generated from Swagger specifications. It also introduces Swagger Inflector, which uses the Swagger specification as the single source of truth to automatically route controllers, map models, and generate sample data when controllers are not implemented. The document encourages rethinking the DRY principle and maintaining the API specification as the central source.
Writer APIs in Java faster with Swagger Inflector - Tony Tam
Swagger provides a clean contract for your REST API. Swagger Inflector is a project which uses Swagger as the language of the API, automatically wiring REST endpoints directly to controllers in the Jersey 2.x framework. By doing so, the specification and code are always up to date, removing potentially error-prone redundant code and bringing development on the JDK up to speed with typeless languages.
Presentation by Tony Tam on using the Scalatra micro web framework with native support for Swagger. This gives the fastest possible server-to-mobile integration with Scala.
Swagger APIs for Humans and Robots (Gluecon) - Tony Tam
Presentation to Gluecon 2014 about Swagger for API development and adoption of services. Reverb also announced the Swagger 2.0 Working Group, with Apigee as a founding member
Love your API with Swagger (Gluecon lightning talk) - Tony Tam
Love your API with Swagger
Developers want to understand and integrate with APIs but have different workflows than API creators. Swagger makes APIs understandable, testable, discoverable, and ready to integrate through a simple JSON description that can be built directly or via code/YAML/GUI. It allows developers to try out APIs and generate SDKs so they can use services as they want rather than writing the API creator's software. The Swagger community supports many frameworks and it is open source under the Apache 2 license.
The document introduces Swagger, an open source framework for describing and documenting RESTful APIs. Swagger allows APIs to be defined in a machine-readable JSON format and generates documentation, client libraries, and servers from these definitions. This standardized interface for APIs has benefits like enabling parallel development, removing logic from clients, and facilitating code generation for multiple platforms and languages.
The document discusses an API-first or description-driven approach to API development using the Swagger specification. It argues that the traditional approach of building an API and then documenting it is broken because it puts API consumers at a disadvantage. Instead, it advocates describing the API interface first using Swagger, which allows developers to model and iterate on the API without writing any code. This benefits both API developers and consumers.
- Data modeling for NoSQL databases is different than relational databases and requires designing the data model around access patterns rather than object structure. Key differences include not having joins so data needs to be duplicated and modeling the data in a way that works for querying, indexing, and retrieval speed.
- The data model should focus on making the most of features like atomic updates, inner indexes, and unique identifiers. It's also important to consider how data will be added, modified, and retrieved factoring in object complexity, marshalling/unmarshalling costs, and index maintenance.
- The _id field can be tailored to the access patterns, such as using dates for time-series data to keep recent documents clustered together.
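As a sketch of that _id tailoring, here is a hypothetical Python helper (the names and key format are illustrative, not from the deck) that leads a time-series _id with a date bucket so recent documents cluster together and a day's data can be fetched with a simple range query:

```python
from datetime import datetime, timezone

def time_series_id(sensor_id, ts=None):
    """Build an _id that leads with a coarse date bucket.

    Documents from the same day cluster together in the index,
    so queries for recent data touch a narrow key range.
    """
    ts = ts or datetime.now(timezone.utc)
    # YYYYMMDD prefix keeps recent docs adjacent; the sensor id breaks ties.
    return f"{ts:%Y%m%d}:{sensor_id}"

def day_range(day):
    """Query filter selecting every _id for one day via a prefix range."""
    # ';' sorts immediately after ':' in ASCII, so this closes the
    # prefix range without needing a regex scan.
    return {"_id": {"$gte": f"{day}:", "$lt": f"{day};"}}
```

A query for one day then becomes `collection.find(day_range("20240601"))` — an indexed range scan on _id rather than a collection scan.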
This document discusses various techniques for monitoring applications without interfering with core engineering work. It recommends using open source tools like the Wordnik profiler, Swagger, and MongoDB's oplog to provide business metrics monitoring and allow product teams to define their own metrics. The oplog can be used to access real-time data changes and power use cases like alerts, analytics, and pushing data to external systems without interrupting application code.
This document discusses various strategies for backing up MongoDB data to keep it safe. It recommends:
1. Using mongodump for simple backups that can restore quickly but may be inconsistent.
2. Setting up replication for high availability, but also using mongodump for backups and testing restore processes.
3. Taking snapshots of the data files for consistent backups, but this requires downtime and gaps can occur between snapshots.
4. Using the oplog for incremental, continuous backups to avoid gaps without downtime using tools like the Wordnik Admin Tools. Testing backups is strongly recommended.
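The gap risk in points 3 and 4 can be checked mechanically. A minimal sketch, assuming you record each snapshot's timestamp and can read the oldest and newest timestamps still retained in the oplog (function and parameter names are illustrative):

```python
def incremental_chain_intact(snapshot_ts, oplog_first_ts, oplog_last_ts):
    """True if the oplog still covers every write since the snapshot.

    If the oldest retained oplog entry is newer than the snapshot, writes
    were rotated out of the capped oplog before they could be archived:
    the incremental chain has a gap and a fresh snapshot is needed before
    incremental backups are trustworthy again.
    """
    return oplog_first_ts <= snapshot_ts <= oplog_last_ts
```

Running this check on a schedule turns "gaps can occur" from a surprise at restore time into an alert at backup time.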
Wordnik's architecture is built around a large English word graph database and uses microservices and ephemeral Amazon EC2 storage. Key aspects include:
1) The system is built as independent microservices that communicate via REST APIs documented using Swagger specifications.
2) Databases for each microservice are kept small by design to facilitate operations like backups, replication, and index rebuilding.
3) Services are deployed across multiple Availability Zones and regions on ephemeral Amazon EC2 storage for high availability despite individual host failures.
This document discusses scaling applications and services. It recommends taking a vertical approach by breaking monolithic applications into microservices that communicate through APIs. The Swagger framework is presented as a way to document and test APIs. Swagger can generate client libraries and helps services scale by enabling asynchronous communication through websockets. Taking this vertical, microservices approach with Swagger improves scalability by allowing dedicated teams to own individual services and improves performance through asynchronous communication protocols.
A deck on the practical reasons why Wordnik moved to the Scala programming language. Also covered is the Swagger REST API framework which is available at http://swagger.wordnik.com
"Choosing proper type of scaling", Olena Syrota - Fwdays
Imagine an IoT processing system that is already quite mature and production-ready and for which client coverage is growing and scaling and performance aspects are life and death questions. The system has Redis, MongoDB, and stream processing based on ksqldb. In this talk, firstly, we will analyze scaling approaches and then select the proper ones for our system.
Programming Foundation Models with DSPy - Meetup Slides - Zilliz
Prompting language models is hard, while programming language models is easy. In this talk, I will discuss the state-of-the-art framework DSPy for programming foundation models with its powerful optimizers and runtime constraint system.
5th LF Energy Power Grid Model Meet-up Slides - DanBrown980551
5th Power Grid Model Meet-up
It is with great pleasure that we extend to you an invitation to the 5th Power Grid Model Meet-up, scheduled for 6th June 2024. This event will adopt a hybrid format, allowing participants to join us either through an online Microsoft Teams session or in person at TU/e located at Den Dolech 2, Eindhoven, Netherlands. The meet-up will be hosted by Eindhoven University of Technology (TU/e), a research university specializing in engineering science & technology.
Power Grid Model
The global energy transition is placing new and unprecedented demands on Distribution System Operators (DSOs). Alongside upgrades to grid capacity, processes such as digitization, capacity optimization, and congestion management are becoming vital for delivering reliable services.
Power Grid Model is an open source project from Linux Foundation Energy and provides a calculation engine that is increasingly essential for DSOs. It offers a standards-based foundation enabling real-time power systems analysis, simulations of electrical power grids, and sophisticated what-if analysis. In addition, it enables in-depth studies and analysis of the electrical power grid’s behavior and performance. This comprehensive model incorporates essential factors such as power generation capacity, electrical losses, voltage levels, power flows, and system stability.
Power Grid Model is currently being applied in a wide variety of use cases, including grid planning, expansion, reliability, and congestion studies. It can also help in analyzing the impact of renewable energy integration, assessing the effects of disturbances or faults, and developing strategies for grid control and optimization.
What to expect
For the upcoming meetup we are organizing, we have an exciting lineup of activities planned:
-Insightful presentations covering two practical applications of the Power Grid Model.
-An update on the latest advancements in Power Grid Model technology during the first and second quarters of 2024.
-An interactive brainstorming session to discuss and propose new feature requests.
-An opportunity to connect with fellow Power Grid Model enthusiasts and users.
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe - Precisely
Inconsistent user experience and siloed data, high costs, and changing customer expectations – Citizens Bank was experiencing these challenges while it was attempting to deliver a superior digital banking experience for its clients. Its core banking applications run on the mainframe and Citizens was using legacy utilities to get the critical mainframe data to feed customer-facing channels, like call centers, web, and mobile. Ultimately, this led to higher operating costs (MIPS), delayed response times, and longer time to market.
Ever-changing customer expectations demand more modern digital experiences, and the bank needed to find a solution that could provide real-time data to its customer channels with low latency and operating costs. Join this session to learn how Citizens is leveraging Precisely to replicate mainframe data to its customer channels and deliver on their “modern digital bank” experiences.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/temporal-event-neural-networks-a-more-efficient-alternative-to-the-transformer-a-presentation-from-brainchip/
Chris Jones, Director of Product Management at BrainChip , presents the “Temporal Event Neural Networks: A More Efficient Alternative to the Transformer” tutorial at the May 2024 Embedded Vision Summit.
The expansion of AI services necessitates enhanced computational capabilities on edge devices. Temporal Event Neural Networks (TENNs), developed by BrainChip, represent a novel and highly efficient state-space network. TENNs demonstrate exceptional proficiency in handling multi-dimensional streaming data, facilitating advancements in object detection, action recognition, speech enhancement and language model/sequence generation. Through the utilization of polynomial-based continuous convolutions, TENNs streamline models, expedite training processes and significantly diminish memory requirements, achieving notable reductions of up to 50x in parameters and 5,000x in energy consumption compared to prevailing methodologies like transformers.
Integration with BrainChip’s Akida neuromorphic hardware IP further enhances TENNs’ capabilities, enabling the realization of highly capable, portable and passively cooled edge devices. This presentation delves into the technical innovations underlying TENNs, presents real-world benchmarks, and elucidates how this cutting-edge approach is positioned to revolutionize edge AI across diverse applications.
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk - Fwdays
At this talk we will discuss DDoS protection tools and best practices, discuss network architectures and what AWS has to offer. Also, we will look into one of the largest DDoS attacks on Ukrainian infrastructure that happened in February 2022. We'll see, what techniques helped to keep the web resources available for Ukrainians and how AWS improved DDoS protection for all customers based on Ukraine experience
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency - ScyllaDB
Freshworks creates AI-boosted business software that helps employees work more efficiently and effectively. Managing data across multiple RDBMS and NoSQL databases was already a challenge at their current scale. To prepare for 10X growth, they knew it was time to rethink their database strategy. Learn how they architected a solution that would simplify scaling while keeping costs under control.
Main news related to the CCS TSI 2023 (2023/1695) - Jakub Marek
An English 🇬🇧 translation of a presentation to the speech I gave about the main changes brought by CCS TSI 2023 at the biggest Czech conference on Communications and signalling systems on Railways, which was held in Clarion Hotel Olomouc from 7th to 9th November 2023 (konferenceszt.cz). Attended by around 500 participants and 200 on-line followers.
The original Czech 🇨🇿 version of the presentation can be found here: https://www.slideshare.net/slideshow/hlavni-novinky-souvisejici-s-ccs-tsi-2023-2023-1695/269688092 .
The videorecording (in Czech) from the presentation is available here: https://youtu.be/WzjJWm4IyPk?si=SImb06tuXGb30BEH .
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/how-axelera-ai-uses-digital-compute-in-memory-to-deliver-fast-and-energy-efficient-computer-vision-a-presentation-from-axelera-ai/
Bram Verhoef, Head of Machine Learning at Axelera AI, presents the “How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-efficient Computer Vision” tutorial at the May 2024 Embedded Vision Summit.
As artificial intelligence inference transitions from cloud environments to edge locations, computer vision applications achieve heightened responsiveness, reliability and privacy. This migration, however, introduces the challenge of operating within the stringent confines of resource constraints typical at the edge, including small form factors, low energy budgets and diminished memory and computational capacities. Axelera AI addresses these challenges through an innovative approach of performing digital computations within memory itself. This technique facilitates the realization of high-performance, energy-efficient and cost-effective computer vision capabilities at the thin and thick edge, extending the frontier of what is achievable with current technologies.
In this presentation, Verhoef unveils his company’s pioneering chip technology and demonstrates its capacity to deliver exceptional frames-per-second performance across a range of standard computer vision networks typical of applications in security, surveillance and the industrial sector. This shows that advanced computer vision can be accessible and efficient, even at the very edge of our technological ecosystem.
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application... - Alex Pruden
Folding is a recent technique for building efficient recursive SNARKs. Several elegant folding protocols have been proposed, such as Nova, Supernova, Hypernova, Protostar, and others. However, all of them rely on an additively homomorphic commitment scheme based on discrete log, and are therefore not post-quantum secure. In this work we present LatticeFold, the first lattice-based folding protocol based on the Module SIS problem. This folding protocol naturally leads to an efficient recursive lattice-based SNARK and an efficient PCD scheme. LatticeFold supports folding low-degree relations, such as R1CS, as well as high-degree relations, such as CCS. The key challenge is to construct a secure folding protocol that works with the Ajtai commitment scheme. The difficulty is ensuring that extracted witnesses are low norm through many rounds of folding. We present a novel technique using the sumcheck protocol to ensure that extracted witnesses are always low norm no matter how many rounds of folding are used. Our evaluation of the final proof system suggests that it is as performant as Hypernova, while providing post-quantum security.
Paper Link: https://eprint.iacr.org/2024/257
Northern Engraving | Nameplate Manufacturing Process - 2024 - Northern Engraving
Manufacturing custom quality metal nameplates and badges involves several standard operations. Processes include sheet prep, lithography, screening, coating, punch press and inspection. All decoration is completed in the flat sheet with adhesive and tooling operations following. The possibilities for creating unique durable nameplates are endless. How will you create your brand identity? We can help!
Skybuffer SAM4U tool for SAP license adoption - Tatiana Kojar
Manage and optimize your license adoption and consumption with SAM4U, a complimentary SAP software asset management tool for customers.
SAM4U, an SAP complimentary software asset management tool for customers, delivers a detailed and well-structured overview of license inventory and usage with a user-friendly interface. We offer a hosted, cost-effective, and performance-optimized SAM4U setup in the Skybuffer Cloud environment. You retain ownership of the system and data, while we manage the ABAP 7.58 infrastructure, ensuring fixed Total Cost of Ownership (TCO) and exceptional services through the SAP Fiori interface.
2. Presentation Overview
* Data >>> code: treat it appropriately
* Manage and maintain Mongo
* Mongo is young (and robust!)
* Performance and features: the right hooks exist
3. Who is Wordnik
Wordnik is:
* The world’s largest English-language reference: ~10M words!
* Mapping every word, based on real data
* A (free) API to add word information, everywhere
4. Wordnik’s MongoDB Deployment
* Over 12 months with Mongo
* Corpus / UGC / structured data / statistics
* Master/slave, ~3TB data, ~12B records
* We love Mongo’s performance
* Read more: http://blog.wordnik.com/12-months-with-mongodb
5. Engineering + IT Ops
First, guiding principles:
* Know your data
* Don’t rely on IT magic
* Equal importance in web apps / SaaS
* Hold hands and be friends
* If you can’t manage it, don’t deploy it
7. How?
* Replicate! Is that enough? Well, not if your company is on the line
* Snapshot: every minute???
* Export often: really???
8. Then What?
Yes, Mongo can do incremental backups. Use the mongo slave mechanism:
* It’s exposed
* It’s supported
* It’s very easy
* It’s extremely fast
How?
* Snapshot your data
* Stream write ops to disk
* Repeat
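The "stream write ops to disk" step can be sketched without a live server. Assuming oplog-style entries are dicts carrying 'ts' (timestamp), 'ns' (namespace), 'op' ('i'/'u'/'d') and 'o' (the document) — those field names match MongoDB's oplog, everything else here is illustrative — a filter-and-append pass looks like:

```python
import json

def archive_ops(ops, out_file, collections=None):
    """Append write operations to an archive file, one JSON line each.

    `ops` is any iterable of oplog-style dicts. Only inserts, updates and
    deletes for the namespaces in `collections` (a set like {"db.words"})
    are kept, so the archive stays small. Returns the number of ops written.
    """
    written = 0
    for op in ops:
        if op.get("op") not in ("i", "u", "d"):
            continue  # skip no-ops and admin commands
        if collections and op["ns"] not in collections:
            continue  # only the collections you want
        out_file.write(json.dumps(op) + "\n")
        written += 1
    return written
```

In a real deployment the iterable would be a tailable cursor on the slave's oplog rather than a list, and the file would be compressed and rotated; here any iterable of dicts and any file-like object will do.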
9. Better than Free
Take our tools - they work!!!
* SnapshotUtil: selectively snapshot in BSON, index info too!
* IncrementalBackupUtil: tail the oplog, stream to disk; only the collections you want! Compress & rotate
* RestoreUtil: recover your snapshots, apply indexes yourself
* ReplayUtil: apply your incremental backups
10. What-if Scenarios
* One collection gets corrupt? Restore it, then apply all operations to it
* “My top developer dropped a collection!” Restore just that one, then apply operations to it until that point in time (POT)
* “We got hacked!” Restore it all, then apply operations until that POT
11. What Else Is Possible?
Replication: why not use built-in? Control, of course.
* Same logic as Incremental + Replay
* Add some filters and it gets interesting
12. Hot Datacenter
* Create incremental backups
* Compress
* Push to the hot DC in batch (SCP)
* Apply to its master with ReplayUtil
[Diagram: incremental backup files flow from the primary datacenter's master, via SCP, to the hot datacenter's master through ReplayUtil]
14. Multiple Upstream Masters
* Aggregate to a single collection
* Target can be a master!
[Diagram: masters A, B, and C each replicate db.page_views into one target db.page_views collection]
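Folding several upstream masters into one collection means rewriting each op before it is applied to the target. A hypothetical remap (the target namespace and _id prefixing scheme are illustrative, not from the deck) that retargets the namespace and tags _id with the source master so documents from different masters cannot collide:

```python
def remap_for_aggregate(op, source, target_ns="analytics.page_views"):
    """Rewrite one upstream op so several masters can feed one collection.

    The source master's name is folded into _id, so documents from
    master A and master B with the same original _id land as distinct
    documents on the aggregation target.
    """
    op = dict(op)            # don't mutate the caller's dict
    doc = dict(op["o"])
    doc["_id"] = f"{source}:{doc['_id']}"
    op["o"] = doc
    op["ns"] = target_ns
    return op
```

Each upstream tailer would run every op through this before handing it to the replay step; because the target receives ordinary writes, it can itself be a master, as the slide notes.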
15. Unblock MapReduce
* Map-reduce can lock up your server
* Replicate source data to another mongod
* Replicate results back to the master
[Diagram: the master replicates db.source_data to a dedicated MR server, which replicates db.summary_data back to the master]