Best practices for MySQL High Availability - Colin Charles
The MariaDB/MySQL world is full of tradeoffs, and choosing a high availability (HA) solution is no exception. This session aims to look at all the alternatives in an unbiased way. Preference is of course only given to open source solutions.
How do you choose between: asynchronous/semi-synchronous/synchronous replication, MHA (MySQL high availability tools), DRBD, Tungsten Replicator, or Galera Cluster? Do you integrate Pacemaker and Heartbeat like Percona Replication Manager? The cloud brings even more fun, especially if you are dealing with a hybrid cloud and must think about geographical redundancy.
What about newer solutions like using Consul for MySQL HA?
When you’ve decided on your solution, how do you provision and monitor these solutions?
This and more will be covered in a walkthrough of MySQL HA options and when to apply them.
MySQL Storage Engines - which do you use? TokuDB? MyRocks? InnoDB? - Sveta Smirnova
"MySQL Storage Engines - which do you use? TokuDB? MyRocks? InnoDB?" session at https://www.percona.com/live/17/sessions/mysql-storage-engines-which-do-you-use-tokudb-myrocks-innodb
David Mytton is a MongoDB master and the founder of Server Density. In this presentation David delves deeper into what's discussed in our how to monitor MongoDB tutorial (https://blog.serverdensity.com/monitor-mongodb/), with the aim of taking you through:
Key MongoDB metrics to monitor.
Non-critical MongoDB metrics to monitor.
Alerts to set for MongoDB on production.
Tools for monitoring MongoDB.
This document discusses how to tune Linux for optimal MongoDB performance. Key points include setting ulimits to allow for many processes and open files, disabling transparent huge pages, using the deadline IO scheduler, setting the dirty ratio and swappiness low, and ensuring consistent clocks with NTP. Monitoring tools like Percona PMM or Prometheus with Grafana dashboards can help analyze MongoDB and system metrics.
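The tunings summarized above can be sketched as a shell fragment to be run as root. This is an illustrative sketch only: the limit values, disk device, and NTP service name are assumptions that must be adapted to the host.

```shell
# Raise process/open-file limits for the mongod user (illustrative values).
cat >> /etc/security/limits.d/mongod.conf <<'EOF'
mongod soft nofile 64000
mongod hard nofile 64000
mongod soft nproc  64000
mongod hard nproc  64000
EOF

# Disable transparent huge pages for this boot.
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag

# Use the deadline I/O scheduler on the data disk (assuming /dev/sda).
echo deadline > /sys/block/sda/queue/scheduler

# Keep dirty ratios and swappiness low.
sysctl -w vm.dirty_ratio=15 vm.dirty_background_ratio=5 vm.swappiness=1

# Keep clocks consistent across the replica set.
systemctl enable --now ntpd
```

The THP and scheduler settings above do not survive a reboot; a real deployment would persist them via a boot-time unit or udev rule.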
M|18 Writing Stored Procedures in the Real World - MariaDB plc
This document discusses using MariaDB stored procedures and parallel processing to optimize the "Wordament" word game. It presents solutions to run the game using:
1) A single thread on one node.
2) Multiple threads on one node using MariaDB events.
3) Multiple threads across multiple nodes using MariaDB replication.
It concludes that MariaDB supports parallelism through events and replication, but could benefit from a thread API to more easily develop multithreaded stored procedure solutions.
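The event-based approach (solution 2 above) can be sketched with one-shot events: each event fires in its own connection thread, so the workers run in parallel. Everything here is hypothetical — the database name, credentials, and the `wordament_worker` procedure are stand-ins for whatever the real solution defines.

```shell
# Sketch: launch two parallel workers via one-shot MariaDB events.
# Assumes a stored procedure wordament_worker(n) already exists.
mysql -u app -p wordament <<'SQL'
SET GLOBAL event_scheduler = ON;
-- Each one-shot event executes immediately in its own thread,
-- so worker_1 and worker_2 run concurrently.
CREATE EVENT worker_1 ON SCHEDULE AT CURRENT_TIMESTAMP
  DO CALL wordament_worker(1);
CREATE EVENT worker_2 ON SCHEDULE AT CURRENT_TIMESTAMP
  DO CALL wordament_worker(2);
SQL
```

This is exactly the workaround the talk's conclusion alludes to: events stand in for the thread API that stored procedures lack.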
Automate MongoDB with MongoDB Management Service - MongoDB
MongoDB Management Service makes operations effortless, reducing complicated tasks to a single click. You can now provision machines, configure replica sets and sharded clusters, and upgrade your MongoDB deployment all through the MMS interface. We'll walk through demos of all the new MMS features, including provisioning, expanding and contracting a cluster, resizing the oplog, and managing users.
Josh Berkus
Most users know that PostgreSQL has a 23-year development history. But did you know that Postgres code is used for over a dozen other database systems? Thanks to our liberal licensing, many companies and open source projects over the years have taken the Postgres or PostgreSQL code, changed it, added things to it, and/or merged it into something else. Illustra, Truviso, Aster, Greenplum, and others have seen the value of Postgres not just as a database but as some darned good code they could use. We'll explore the lineage of these forks, and go into the details of some of the more interesting ones.
Redis: Database, cache, pub/sub and more at Jelly Button Games - Redis Labs
Nir Shney-Dor of Jelly Button Games talks about how he uses Redis across many different use cases: as a persistent database, as a cache, for pub/sub, for leaderboards, and more. A fun walkthrough of all the uses one can put Redis to.
The talk elaborates on how to detect and heal your MySQL topology with MySQL Orchestrator. It was delivered at the Mydbops database meetup on 27-04-2019 by Anil Yadav, Lead Database Engineer at OLA, and Krishna Ramanathan, Database Administrator III at OLA.
This document summarizes the results of benchmarking PostgreSQL database performance on several cloud platforms, including AWS EC2, RDS, Google Compute Engine, DigitalOcean, Rackspace, and Heroku.
The benchmarks tested small and large instance sizes across the clouds on different workload types, including in-memory and disk-based transactions and queries. Key metrics measured were transactions per second (TPS), load time to set up the database, and cost per TPS and load bandwidth.
The results show large performance and cost variations between clouds and instance types. In general, dedicated instances like EC2 outperformed shared instances, and DBaaS options like RDS were more expensive but offered higher availability. The document also discusses the challenges encountered in benchmarking across clouds.
Pacemaker is a high availability cluster resource manager that can be used to provide high availability for MySQL databases. It monitors MySQL instances, with data kept in sync between nodes via MySQL replication. If the primary MySQL node fails, Pacemaker detects the failure and fails over to the secondary node, bringing the MySQL service back online without downtime. Pacemaker manages shared storage and virtual IP failover so that connections are directed to the active MySQL node. It is important to monitor replication state and lag to ensure data consistency between nodes.
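With the pcs tooling, a minimal version of such a setup might look like the following. This is a sketch only: the IP address and resource names are made up, and a real cluster additionally needs fencing/STONITH and the replication topology configured.

```shell
# Virtual IP that always follows the active MySQL node.
pcs resource create mysql-vip ocf:heartbeat:IPaddr2 \
    ip=192.0.2.10 cidr_netmask=24 op monitor interval=10s

# MySQL itself, started and health-checked by Pacemaker.
pcs resource create mysql-db ocf:heartbeat:mysql \
    op monitor interval=30s

# Keep the VIP on the same node as MySQL, and start MySQL first.
pcs constraint colocation add mysql-vip with mysql-db INFINITY
pcs constraint order mysql-db then mysql-vip
```

The colocation and ordering constraints are what make failover transparent to clients: when MySQL moves, the VIP moves with it.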
This document discusses Linux huge pages, including:
- What huge pages are and how they can reduce memory management overhead by allocating larger blocks of memory
- How to configure huge pages on Linux, including installing required packages, mounting the huge page filesystem, and setting kernel parameters
- When huge pages should be configured, such as for data-intensive or latency-sensitive applications like databases, but that testing is required due to disadvantages like reduced swappability
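Concretely, a basic huge-page setup on Linux might look like this. It is a root-only sketch; the page count and mount point are illustrative and must be sized for the application.

```shell
# Reserve 512 huge pages (512 * 2 MiB = 1 GiB on typical x86-64).
sysctl -w vm.nr_hugepages=512

# Mount hugetlbfs so applications can map huge pages from it.
mkdir -p /dev/hugepages
mount -t hugetlbfs none /dev/hugepages

# Verify the reservation took effect.
grep -i huge /proc/meminfo
```

As the summary notes, this memory is pinned and unswappable, which is precisely why testing is required before enabling it in production.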
Massively Scaled High Performance Web Services with PHP - Demin Yin
Over the years, people have questioned if PHP is a good choice for building web services. In this talk, I will share how we use PHP on the backend for Glu Mobile’s flagship mobile game Design Home, enabling it to regularly rank amongst the top free mobile games in the Apple App Store and the Google Play Store. We will deep dive into the thought processes, development, testing, and deployment strategy, showcasing what we have achieved with PHP.
The document summarizes the results of benchmarking and comparing the performance of PostgreSQL databases hosted on Amazon EC2, RDS, and Heroku. It finds that EC2 provides the most configuration options but requires more management, RDS offers simplified deployment but less configuration options, and Heroku requires no management but has limited configuration and higher costs. Benchmark results show EC2 performing best for raw performance while RDS and Heroku trade off some performance for manageability. Heroku was the most expensive option.
This document provides an overview of MariaDB Galera Cluster and discusses some key features of Galera Cluster version 4, including huge transaction support through streaming replication and optimizing handling of inconsistencies to avoid unnecessary cluster-wide shutdowns. It summarizes Seppo Jaakola's presentation on the state of Galera Cluster and the roadmap for future releases.
MySQL/MariaDB replication is asynchronous. You can make replication faster by using better hardware (faster CPUs, more RAM, or quicker disks), or you can use parallel replication to remove its single-threaded limitation; but lag can still happen. This talk is not about making replication faster; it is about how to deal with its asynchronous nature, including the (in)famous lag.
We will start by explaining the consequences of asynchronous replication and how/when lag can happen. Then, we will present the solution used at Booking.com to avoid both creating lag and minimize the consequence of stale reads on slaves (hint: this solution does not mean reading from the master because this does not scale).
Once all above is well understood, we will discuss how Booking.com’s solution can be improved: this solution was designed years ago and we would do this differently if starting from scratch today. Finally, I will present an innovative way to avoid lag: the no-slave-left-behind MariaDB patch.
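As background for the lag discussion, the classic (and admittedly coarse) way to observe lag on a slave is the `Seconds_Behind_Master` field of `SHOW SLAVE STATUS`. A monitoring one-liner, assuming a reachable slave and suitable credentials:

```shell
# Print the slave's reported lag in seconds (NULL means replication is broken).
mysql -e "SHOW SLAVE STATUS\G" |
  awk -F': ' '/Seconds_Behind_Master/ {print $2}'
```

The talk's point is precisely that this number is not enough on its own: it measures the slave's view of lag, not whether a given read will be stale.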
Highly Available MySQL/PHP Applications with mysqlnd - Jervin Real
This document discusses how to achieve high availability in PHP/MySQL applications using the mysqlnd driver. It describes different MySQL high availability configurations including master-slave replication, multi-master replication using Galera or NDB Cluster, and how mysqlnd's mysqlnd_ms plugin allows applications to connect to these clustered MySQL instances in a highly available manner by handling failover between nodes. The document provides examples of mysqlnd_ms connection configuration for both master-slave and multi-master setups.
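For reference, a mysqlnd_ms server-list configuration is a JSON file along these lines; the host names here are hypothetical, and the file is referenced from the `mysqlnd_ms.config_file` php.ini setting.

```json
{
    "myapp": {
        "master": {
            "master_0": { "host": "db-master.example.com", "port": "3306" }
        },
        "slave": {
            "slave_0": { "host": "db-slave-1.example.com", "port": "3306" },
            "slave_1": { "host": "db-slave-2.example.com", "port": "3306" }
        }
    }
}
```

The application then opens a connection using the section name ("myapp") as the host, and the plugin routes statements to master or slaves and handles failover between the listed nodes.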
HighLoad Solutions On MySQL / Xiaobin Lin (Alibaba) - Ontico
The document summarizes several issues encountered with high load on Alibaba MySQL databases and solutions implemented:
1) Hotspot updating of a single row caused deadlocks; implementing queueing on the primary key resolved this.
2) Unexpected long transactions under high load led to clients waiting long periods; committing transactions early where possible addressed this.
3) More than 50,000 active threads overwhelmed MySQL's capabilities; implementing actions based on low and high thread thresholds helped.
The document discusses different database architectures including master-slave, master-master, and MySQL cluster. Master-slave involves one master node that handles writes and multiple read-only slave nodes. Master-master allows writes and reads on all nodes but has weaker consistency. MySQL cluster provides high availability, no single point of failure, and automatic sharding but has some limitations. The author has compiled pros and cons of each and decided MySQL cluster is best for their use case.
Tomcat, Undertow, Jetty, Nginx Unit: pros and cons - Geraldo Netto
A quick comparison between Tomcat, Undertow, Jetty, Nginx Unit regarding features, performance, scalability, security, maintainability and extensibility
The Complete MariaDB Server Tutorial - Percona Live 2015 - Colin Charles
The document provides an overview of the Complete MariaDB Server Tutorial presentation. It introduces MariaDB and discusses what it is, its goals of being compatible with MySQL and having stable releases. It also covers MariaDB architecture, installation, utilities, and storage engines.
This document discusses scalable architectures and provides an overview of load balancing, web servers, and database servers. It begins with an introduction to load balancing options like software-based solutions (HAProxy, Pound, Varnish) and hardware appliances. It then covers setting up multiple web servers and load balancing between them, as well as database replication and partitioning strategies to scale databases. The goal is to help applications grow from a single server to distributed, scalable architectures.
Next Generation DevOps in Drupal: DrupalCamp London 2014 - Barney Hanlon
In this talk, Barney discusses and demonstrates how to:
- Use nginx, Varnish and Apache together in a "SPDY sandwich" to support HTTP 2.0
- Set up SSL properly to mitigate attack vectors
- Improve performance with mod_pagespeed and nginx
- Deploy Drupal sites with Docker containers
Barney is a Technical Team Leader at Inviqa, a Drupal Association member, and writes for Techportal on using technologies to improve website performance. He first started using PHP professionally in 2003 and has over seventeen years' experience in software development. He is an advocate of the Scrum methodology and has an interest in performance optimization, researching and speaking on various techniques to improve user experience through faster load times.
MySQL 5.7 provides significant performance improvements and new features over previous versions. Benchmark tests showed it was 3x faster than MySQL 5.6 for SQL point selects and connection requests, and 1.5x faster for OLTP read/write workloads. New features include enhanced InnoDB storage engine capabilities, improved replication, JSON data type support, and increased security.
Tim Vaillancourt is a senior technical operations architect specializing in MongoDB. He has over 10 years of experience tuning Linux for database workloads and monitoring technologies like Nagios, MRTG, Munin, Zabbix, Cacti, and Graphite. He discussed the various MongoDB storage engines including MMAPv1, WiredTiger, RocksDB, and TokuMX. Key metrics for monitoring the different engines include lock ratio, page faults, background flushing times, checkpoints/compactions, replication lag, and scanned/moved documents. High-level operating system metrics like CPU, memory, disk, and network utilization are also important for ensuring MongoDB has sufficient resources.
Threads and processes are abstractions of execution units that differ in how they share resources. The key differences are that threads share memory and file descriptors by default, while processes do not. Underneath, both are created using the clone() system call but with different flags. Whether to use threads or processes depends on the specific workload and how system calls like malloc(), read/write, and setreuid() are implemented at the OS level. Understanding these implementation details is important for choosing the best approach.
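One way to see the "threads are just clone() tasks" view from user space on Linux: every schedulable unit shows up as a directory under /proc/&lt;pid&gt;/task, whether or not it shares memory with its parent. A small sketch:

```shell
# A plain (single-threaded) shell has exactly one entry under task/:
# its own thread-group leader.
nproc_tasks=$(ls /proc/$$/task | wc -l)
echo "shell tasks: $nproc_tasks"

# A multi-threaded daemon would list one directory per thread here,
# e.g.:  ls /proc/<mysqld-pid>/task | wc -l
```

Separate processes, by contrast, each get their own /proc/&lt;pid&gt; entry with a single task, which is the same kernel bookkeeping seen from the other side.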
Linux Memory Basics for SysAdmins - ChinaNetCloud Training (ChinaNetCloud)
ChinaNetCloud training for Linux Memory Basics for SysAdmins.
This is an introduction to general Linux memory for troubleshooting, monitoring, and basic understanding.
This document discusses using Pacemaker with MySQL for high availability (HA). It covers key concepts in HA including eliminating single points of failure. It then discusses various MySQL HA solutions like replication, DRBD, MySQL Cluster, and using Linux HA tools like Pacemaker. Pacemaker manages resources across nodes to ensure services are always running, and can monitor and migrate MySQL and other services in an HA cluster. The document provides configuration examples and best practices for setting up MySQL HA with Pacemaker.
MariaDB / MySQL tripping hazard and how to get out again? - FromDual GmbH
The document discusses common pitfalls and mistakes when using MariaDB/MySQL databases and how to avoid or recover from them: incompatibilities between different versions and forks of MariaDB and MySQL, keeping implementations simple to avoid unnecessary complexity, and problems that can arise from table locking, disk space usage, and other operational concerns.
This document discusses Chartbeat's use of MongoDB and Amazon EC2. Chartbeat stores real-time analytics data and historical data in MongoDB clusters running on EC2. They faced challenges with disappearing EC2 instances, poor I/O performance on EBS volumes, and unpredictable EC2 performance. To address these, Chartbeat uses replica sets for high availability, preallocates data to reduce fragmentation, and heavily monitors servers and MongoDB for issues. Automating processes and monitoring are important strategies for stable MongoDB on EC2.
This document outlines best practices for MySQL database administration including database design and planning, installation and configuration, optimization, replication, backup, and monitoring. It discusses topics such as database structure, storage engines, configuration variables, indexing, replication components, backup methods, and using tools like MySQL Enterprise Backup, mysqldump, and monitoring queries. GTID replication is also covered, explaining how it solves problems and can be enabled to uniquely identify transactions across servers.
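For the GTID replication point above, enabling GTIDs boils down to a my.cnf fragment like the following on every server (a sketch; the values are illustrative, and `server_id` must be unique per server):

```ini
[mysqld]
server_id                = 1          # unique on each server
log_bin                  = mysql-bin
gtid_mode                = ON
enforce_gtid_consistency = ON
```

Replicas can then be pointed at the master with `CHANGE MASTER TO ... MASTER_AUTO_POSITION = 1`, which is what lets transactions be identified uniquely across servers instead of by binlog file and position.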
This document discusses different memory management techniques used in operating systems. It covers basic concepts like logical vs physical addresses and address translation. It then describes swapping, a technique whereby processes can be temporarily moved out of main memory, as well as contiguous allocation of memory to processes. The key techniques of paging and segmentation are explained in detail: paging divides memory into fixed-size pages and uses a page table to map logical to physical addresses, while segmentation allows programs to be composed of different segments such as code and data.
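The page-table mapping described above is just integer arithmetic: split the logical address into a page number and an offset, look up the page's frame, and recombine. A tiny worked example with 4 KiB pages (the frame number stands in for a hypothetical page-table entry):

```shell
page_size=4096
vaddr=20000

page=$((vaddr / page_size))           # page number: 4
offset=$((vaddr % page_size))         # offset within the page: 3616
frame=9                               # hypothetical page-table entry: page 4 -> frame 9
paddr=$((frame * page_size + offset)) # physical address: 40480

echo "page=$page offset=$offset paddr=$paddr"
```

Because the page size is a power of two, hardware does this with a shift and a mask rather than division.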
The document provides guidance on deploying MongoDB in production environments. It discusses sizing hardware requirements for memory, CPU, and disk I/O. It also covers installing and upgrading MongoDB, considerations for cloud platforms like EC2, security, backups, durability, scaling out, and monitoring. The focus is on performance optimization and ensuring data integrity and high availability.
Redis Developers Day 2014 - Redis Labs Talks (Redis Labs)
These are the slides the Redis Labs team used to accompany the session we gave during the first-ever Redis Developers Day on October 2nd, 2014, in London. They include some of the ideas we've come up with to tackle operational challenges in the hyper-dense, multi-tenant Redis deployments that our service, Redis Cloud, consists of.
This document provides an overview of big data ecosystems, including common log formats, compression techniques, data collection methods, distributed storage options like HDFS and S3, distributed processing frameworks like Hadoop MapReduce and Storm, workflow managers, real-time storage options, and other related topics. It describes technologies like Kafka, HBase, Cassandra, Pig, Hive, Oozie, and Azkaban; compares advantages and disadvantages of HDFS, S3, HBase and other storage systems; and provides references for further information.
Deploying Containers and Managing Them - Docker, Inc.
The document discusses managing Docker containers across multiple hosts. It introduces Dockermix/Maestro for defining deployments in YAML and synchronizing containers. It covers allocating CPU/RAM resources, potential scheduling solutions like Mesos and Omega, and advanced networking techniques like using Open vSwitch to bypass iptables overhead. Useful links are provided for Maestro, container metrics, Pipework for networking containers, and a Docker API pull request for resource allocation.
- MongoDB is an open-source document database that provides high performance, a rich query language, high availability through clustering, and horizontal scalability through sharding. It stores data in BSON format and supports indexes, backups, and replication.
- MongoDB is best for operational applications using unstructured or semi-structured data that require large scalability and multi-datacenter support. It is not recommended for applications with complex calculations, financial data, or those that scan large subsets of the data.
- The next session will provide a security and replication overview and include demonstrations of installation, document creation, queries, indexes, backups, and replication and sharding if possible.
- The document provides guidance on deploying MongoDB including sizing hardware, installing and upgrading MongoDB, configuration considerations for EC2, security, backups, durability, scaling out, and monitoring. Key aspects discussed are profiling and indexing queries for performance, allocating sufficient memory, CPU and disk I/O, using 64-bit OSes, ext4/XFS filesystems, upgrading to even version numbers, and replicating for high availability and backups.
This document provides a summary of a presentation on becoming an accidental PostgreSQL database administrator (DBA). It covers topics like installation, configuration, connections, backups, monitoring, slow queries, and getting help. The presentation aims to help those suddenly tasked with DBA responsibilities to not panic and provides practical advice on managing a PostgreSQL database.
This document summarizes the Massive Storage Engine 2.0, which was built to address scaling issues with file- and memory-based backends in handling workloads with gigabytes of content. It features allocation that is fragmentation-proof and can scale to over 100 terabytes, with an LFU eviction approach. The architecture uses threads for reliable allocation across multiple segments with reduced locking. It also supports an optional persistent datastore by mirroring metadata to disk in an asynchronous manner with minimal impact to performance. Evaluation showed it handles larger files well and recovers quickly from crashes by reading the stored book of metadata.
This document provides a summary of a presentation on practical MySQL tuning. It discusses measuring critical system resources like CPU, memory, I/O and network usage to identify bottlenecks. It also covers rough tuning of MySQL parameters like the InnoDB buffer pool size, log file size and key buffer size. Further tuning includes application optimizations like query tuning with EXPLAIN, index tuning, and schema design. The presentation also discusses scaling MySQL through approaches like caching, sharding, replication and optimizing architecture and data distribution. Regular performance monitoring is emphasized to simulate increased load and aid capacity planning.
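As a concrete example of the measurement-driven tuning the talk describes, the InnoDB buffer pool hit rate can be derived from two `SHOW GLOBAL STATUS` counters. The sketch below assumes the counter values have already been fetched from the server; the numbers are made up for illustration:

```python
def buffer_pool_hit_rate(read_requests, disk_reads):
    """InnoDB buffer pool hit rate: fraction of logical reads served
    from memory rather than disk.

    read_requests = Innodb_buffer_pool_read_requests (logical reads)
    disk_reads    = Innodb_buffer_pool_reads (reads that missed the pool)
    """
    if read_requests == 0:
        return 1.0  # no reads yet, nothing to miss
    return 1.0 - (disk_reads / read_requests)

# Illustrative counter values; a rate well below ~0.99 on a steady
# workload often suggests the buffer pool is too small for the working set.
rate = buffer_pool_hit_rate(read_requests=1_000_000, disk_reads=15_000)
print(f"{rate:.3f}")  # 0.985
```

This is only one signal; as the talk notes, it should be read alongside I/O and memory metrics before resizing `innodb_buffer_pool_size`.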
The document discusses memory management and file systems. It covers topics like paging, segmentation, contiguous and non-contiguous allocation, and disk scheduling algorithms. Paging divides memory into fixed-sized blocks called frames and logical memory into pages. It uses a page table to translate logical addresses to physical frame addresses. Swapping allows processes to be temporarily moved to disk to free up memory frames. Contiguous allocation allocates each process to a single block of contiguous memory.
The document describes Massive Storage Engine 2.0, which was built to address scaling issues with file- and memory-based backends in handling gigabytes of content. It uses an allocation algorithm that is fragmentation-proof and supports up to 100+ terabytes of storage per node. It also uses an LFU eviction approach rather than LRU to achieve higher cache hit rates. The architecture uses threading, multiple active segments, and "hole expansion" to improve performance. An optional persistent datastore mirrors metadata to disk for crash recovery with little overhead. The system has been deployed successfully on several public and private CDNs for applications like video distribution.
GlusterFS is a distributed file system that shards and replicates files across multiple servers without a central metadata server. It uses modular "translators" to handle functions like replication and distribution. Some challenges GlusterFS faces include multi-tenancy, distributed quota management, efficient data rebalancing, reducing replication latency, optimizing directory traversal, and handling many small files. The speaker argues these challenges are not unique to GlusterFS and that incremental, modular improvements are preferable to monolithic solutions.
The document discusses how content marketing is a path to reaching customers' goals, not just a company's marketing goals. It emphasizes creating high-quality, useful content that serves customers' needs over superficial or keyword-focused content. The key is developing content that provides long-term value for customers through discipline, honesty and developing one's unique voice.
This document discusses improving incident response procedures through practices such as checklists, documented procedures, realistic incident simulations and postmortems. It recommends extended use of checklists to guide responses while still allowing for experience and independent thought. Regular incident response simulations that test both general processes and specific failures can help refine procedures and build confidence. Postmortems should objectively review incidents, suggest improvements and run through scenarios again over time to prevent complacency.
The document discusses how to handle incidents, downtime, and outages. It cites Q1 2015 downtime costs for companies of $2.9 billion, $870 million, and $4.1 billion. It recommends preparing for incidents with on-call staff and documentation, responding quickly by following an incident response checklist and notifying stakeholders, and performing a postmortem within days to analyze what failed and how to prevent future issues.
Scaling humans - Ops teams and incident management (Server Density)
The document discusses the costs of downtime for companies and best practices for incident management teams to prepare for, respond to, and review incidents to minimize downtime costs. It notes that downtime costs companies billions per quarter and recommends teams prepare documentation and contact information, have on-call rotation schedules, log all responses to incidents, provide frequent status updates, gather teams to escalate incidents as needed, and conduct post-mortem reviews within days of an incident.
Containers seem to have suddenly become the hot new thing everyone is talking about, but what are they?
Why are they important?
How should you use them and what does it mean for cloud infrastructure? This talk will examine the history, technical details and strategy around containerisation from the perspective of developers and operations, consider internal container OSs like Rocket and Ubuntu Core as well as management layers like Docker and Apache Mesos and take a look at why cloud providers are launching their own services around them.
Presented by David Mytton at Datacloud Monaco 2015-06-04
Why Puppet? Why now? Can you get by without using any config management? You probably think you don't have time, or that your project is too small. What can using Puppet really add? How can you justify investing time up front? Maybe you can just do it later?
Getting started with config management can often seem like a big project, especially if you only manage a few systems or have a small team. This talk will examine why you should use Puppet from the beginning: what you can do with Puppet that you couldn't do otherwise, how much time it will save, and why it's especially important if you think your project has even the smallest chance of scaling in the future.
Presented by David Mytton at Puppet Camp London 2015-04-13
Infrastructure choices - cloud vs colo vs bare metal (Server Density)
This document compares cloud, colocation, and bare metal infrastructure options. It covers key performance considerations including CPU, memory, disk, and network latency and bandwidth. Colocation keeps hardware at a location you choose while retaining internal skills, but requires weighing total spend, hardware specifications, and power usage. Cloud infrastructure suits elastic workloads, demand spikes, and unknown requirements, while bare metal is preferable for managed hardware replacement and networking needs. Ultimately, the best option depends on an organization's specific workload characteristics and skills.
The customer lifecycle - from visitor to customer. Techniques for driving traffic, trials, nurturing, conversion, success monitoring and handling churn.
Presented by David Mytton at Startup Camp Berlin 2015-03-13.
DevOps Incident Handling - Making friends, not enemies (Server Density)
David Mytton CEO of Server Density presented this talk to the DevOps Meetup in London. It takes you through how to handle DevOps incidents, outages and downtime -- and more specifically how to make friends, not enemies in the process.
Joined by Rick Nelson, Technical Solutions Architect from NGINX, Server Density takes you through the dos and don'ts of monitoring NGINX: critical and non-critical metrics to monitor, important alerts to configure, and the best monitoring tools available.
The document discusses high performance infrastructure at Server Density, which runs 150 servers, has been operating since June 2009, migrated from MySQL to MongoDB, and stores 25TB of data per month. Key performance aspects discussed are using fast networks like 10 Gigabit Ethernet on AWS, provisioning plenty of memory, preferring SSDs over spinning disks, and accounting for replication lag between locations. The document also compares cloud, dedicated servers, and colocation, and discusses monitoring, backups, dealing with outages, and other operational aspects.
The document discusses Server Density's architecture which includes 100 Ubuntu servers with 50% being virtual, using Nginx, Python, and MongoDB. It handles 25TB of data per month. Puppet is used for configuration, failover, code deploys, and system updates. The document also considers colocating servers versus using a dedicated provider and factors like hardware specs, costs, skills required, and fun.
NoSQL databases are often touted for their performance, and whilst it's true that they usually offer great performance out of the box, it still really depends on how you deploy your infrastructure. Dedicated vs cloud? In memory vs on disk? Spindle vs SSD? Replication lag. Multi data centre deployment.
This talk considers all the infrastructure requirements of a successful high performance infrastructure with hints and tips that can be applied to any NoSQL technology. It includes things like OS tweaks, disk benchmarks, replication, monitoring and backups.
Remote startup - building a company from everywhere in the world (Server Density)
This document discusses how Server Density grew from 2 employees in 2009 to 12 employees in 2013 while being fully remote. It outlines the company's timeline and some advantages of being fully remote such as access to worldwide talent, lower costs, and no distractions of an office. However, it also notes disadvantages like collaboration being more difficult without in-person interactions. While the company added an office in 2012, the document emphasizes that working remotely is a mindset that is difficult to adopt after initially co-locating. Effective communication and availability are important for fully distributed teams.
This document discusses MongoDB infrastructure at Server Density. It notes that Server Density uses 27 MongoDB nodes to store 20TB of data per month, having migrated from MySQL. Some key reasons for choosing MongoDB include replication, official drivers, easy deployment, and fast performance out of the box. The document then discusses various MongoDB performance and infrastructure considerations like network throughput, replication lag, failover processes, disk types, backups, and monitoring.
StartOps: Growing an ops team from 1 founder (Server Density)
Bootstrapped startups don't have the luxury of a full team of ops engineers available to respond to issues 24/7, so how can you survive on your own? This talk will tell the story of how to run your infrastructure as a single founder through to growing that into a team of on call engineers. It will include some interesting war stories as well as tips and suggestions for how to run ops at a startup.
Presented at DevOpsDays London 2013 by David Mytton.
MongoDB: Optimising for Performance, Scale & Analytics (Server Density)
MongoDB is easy to download and run locally but requires some thought and further understanding when deploying to production. At scale, schema design, indexes and query patterns really matter. So does data structure on disk, sharding, replication and data centre awareness. This talk will examine these factors in the context of analytics, and more generally, to help you optimise MongoDB for any scale.
Presented at MongoDB Days London 2013 by David Mytton.
This document discusses adding Forge modules to Puppet Enterprise. It describes moving the HTTP load balancer from Pound to nginx, keeping existing manifests pulled from GitHub, and using the Puppet Forge to install the puppetlabs/nginx module or integrating it via Git submodules. It also covers parameterizing classes on the Puppet Enterprise console and merging site.pp files, as well as updating nginx configurations on the fly and considering alternative nginx modules on the Forge.
Session 1 - Intro to Robotic Process Automation (UiPathCommunity)
👉 Check out our full 'Africa Series - Automation Student Developers (EN)' page to register for the full program:
https://bit.ly/Automation_Student_Kickstart
In this session, we shall introduce you to the world of automation and the UiPath Platform, and guide you through installing and setting up UiPath Studio on your Windows PC.
📕 Detailed agenda:
What is RPA? Benefits of RPA?
RPA Applications
The UiPath End-to-End Automation Platform
UiPath Studio CE Installation and Setup
💻 Extra training through UiPath Academy:
Introduction to Automation
UiPath Business Automation Platform
Explore automation development with UiPath Studio
👉 Register here for our upcoming Session 2 on June 20: Introduction to UiPath Studio Fundamentals: https://community.uipath.com/events/details/uipath-lagos-presents-session-2-introduction-to-uipath-studio-fundamentals/
GlobalLogic Java Community Webinar #18 “How to Improve Web Application Perfor...” (GlobalLogic Ukraine)
During the talk we will answer why application performance needs to be improved and what the most effective ways to do so are. We will also discuss what a cache is, what kinds of caches exist, and, most importantly, how to find a performance bottleneck.
Video and event details: https://bit.ly/45tILxj
Conversational agents, or chatbots, are increasingly used to access all sorts of services using natural language. While open-domain chatbots - like ChatGPT - can converse on any topic, task-oriented chatbots - the focus of this paper - are designed for specific tasks, like booking a flight, obtaining customer support, or setting an appointment. Like any other software, task-oriented chatbots need to be properly tested, usually by defining and executing test scenarios (i.e., sequences of user-chatbot interactions). However, there is currently a lack of methods to quantify the completeness and strength of such test scenarios, which can lead to low-quality tests, and hence to buggy chatbots.
To fill this gap, we propose adapting mutation testing (MuT) for task-oriented chatbots. To this end, we introduce a set of mutation operators that emulate faults in chatbot designs, an architecture that enables MuT on chatbots built using heterogeneous technologies, and a practical realisation as an Eclipse plugin. Moreover, we evaluate the applicability, effectiveness and efficiency of our approach on open-source chatbots, with promising results.
AppSec PNW: Android and iOS Application Security with MobSF (Ajin Abraham)
Mobile Security Framework - MobSF is a free and open source automated mobile application security testing environment designed to help security engineers, researchers, developers, and penetration testers to identify security vulnerabilities, malicious behaviours and privacy concerns in mobile applications using static and dynamic analysis. It supports all the popular mobile application binaries and source code formats built for Android and iOS devices. In addition to automated security assessment, it also offers an interactive testing environment to build and execute scenario based test/fuzz cases against the application.
This talk covers:
Using MobSF for static analysis of mobile applications.
Interactive dynamic security assessment of Android and iOS applications.
Solving Mobile app CTF challenges.
Reverse engineering and runtime analysis of Mobile malware.
How to shift left and integrate MobSF/mobsfscan SAST and DAST in your build pipeline.
"NATO Hackathon Winner: AI-Powered Drug Search", Taras Kloba (Fwdays)
This session details how PostgreSQL's features and Azure AI Services can be used to significantly enhance the search functionality of any application.
In this session, we'll share insights on how we used PostgreSQL to facilitate precise searches across multiple fields in our mobile application. The techniques include using LIKE and ILIKE operators and integrating a trigram-based search to handle potential misspellings, thereby increasing the search accuracy.
We'll also discuss how the azure_ai extension on PostgreSQL databases in Azure and Azure AI Services were utilized to create vectors from user input, a feature beneficial when users wish to find specific items based on text prompts. While our application's case study involves a drug search, the techniques and principles shared in this session can be adapted to improve search functionality in a wide range of applications. Join us to learn how PostgreSQL and Azure AI can be harnessed to enhance your application's search capability.
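The trigram matching behind this misspelling tolerance (as provided in PostgreSQL by the pg_trgm extension) can be approximated in a few lines. This is a simplified re-implementation for illustration, not the extension's exact algorithm, though it mimics pg_trgm's lowercasing and word padding:

```python
def trigrams(text):
    """Extract pg_trgm-style trigrams: lowercase each word and pad it
    with two leading spaces and one trailing space before slicing."""
    grams = set()
    for word in text.lower().split():
        padded = "  " + word + " "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams

def similarity(a, b):
    """Shared trigrams divided by the union, as pg_trgm's similarity() does."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

# A misspelled query still scores well against the stored drug name:
print(round(similarity("paracetamol", "parasetamol"), 2))  # 0.6
```

In production you would keep this inside the database (`CREATE EXTENSION pg_trgm;` plus a GIN trigram index) rather than computing it in application code; the sketch only shows why near-misses still rank highly.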
From Natural Language to Structured Solr Queries using LLMs (Sease)
This talk draws on experimentation to enable AI applications with Solr. One important use case is to use AI for better accessibility and discoverability of the data: while User eXperience techniques, lexical search improvements, and data harmonization can take organizations to a good level of accessibility, a structural (or “cognitive” gap) remains between the data user needs and the data producer constraints.
That is where AI – and most importantly, Natural Language Processing and Large Language Model techniques – could make a difference. This natural language, conversational engine could facilitate access and usage of the data leveraging the semantics of any data source.
The objective of the presentation is to propose a technical approach and a way forward to achieve this goal.
The key concept is to enable users to express their search queries in natural language, which the LLM then enriches, interprets, and translates into structured queries based on the Solr index’s metadata.
This approach leverages the LLM’s ability to understand the nuances of natural language and the structure of documents within Apache Solr.
The LLM acts as an intermediary agent, offering a transparent experience to users automatically and potentially uncovering relevant documents that conventional search methods might overlook. The presentation will include the results of this experimental work, lessons learned, best practices, and the scope of future work that should improve the approach and make it production-ready.
This talk will cover ScyllaDB architecture from the cluster-level view and zoom in on data distribution and internal node architecture. In the process, we will learn the secret sauce behind ScyllaDB's high availability and superior performance. We will also touch on the upcoming changes to ScyllaDB architecture: the move to strongly consistent metadata and tablets.
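The token-based data distribution mentioned above can be sketched as a minimal consistent-hash ring. The node names and hash function here are invented for illustration; ScyllaDB's real implementation layers Murmur3 partitioning, vnodes, and replication strategies on top of this basic idea:

```python
import bisect
import hashlib

class TokenRing:
    """Minimal hash ring: each (virtual) node owns a token, and a key
    is placed on the first node whose token is >= hash(key), wrapping
    around at the end of the ring."""

    def __init__(self, nodes, vnodes=8):
        # Each physical node gets several virtual tokens for smoother balance.
        self.ring = sorted(
            (self._hash(f"{n}-{v}"), n) for n in nodes for v in range(vnodes)
        )
        self.tokens = [t for t, _ in self.ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key):
        i = bisect.bisect(self.tokens, self._hash(key)) % len(self.ring)
        return self.ring[i][1]

ring = TokenRing(["node-a", "node-b", "node-c"])
print(ring.node_for("user:42"))  # deterministic placement for this key
```

Because placement depends only on the key's hash and the token layout, any coordinator node can route a request without consulting central metadata, which is part of what the cluster-level view in the talk explains.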
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk (Fwdays)
In this talk we will discuss DDoS protection tools and best practices, network architectures, and what AWS has to offer. We will also look into one of the largest DDoS attacks on Ukrainian infrastructure, which happened in February 2022, and see what techniques helped to keep web resources available for Ukrainians and how AWS improved DDoS protection for all customers based on the Ukraine experience.
"Choosing proper type of scaling", Olena Syrota (Fwdays)
Imagine an IoT processing system that is already quite mature and production-ready, whose client coverage is growing, and for which scaling and performance are life-and-death questions. The system uses Redis, MongoDB, and stream processing based on ksqlDB. In this talk we will first analyze scaling approaches and then select the proper ones for our system.
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc... (DanBrown980551)
This LF Energy webinar took place June 20, 2024. It featured:
-Alex Thornton, LF Energy
-Hallie Cramer, Google
-Daniel Roesler, UtilityAPI
-Henry Richardson, WattTime
In response to the urgency and scale required to effectively address climate change, open source solutions offer significant potential for driving innovation and progress. Currently, there is a growing demand for standardization and interoperability in energy data and modeling. Open source standards and specifications within the energy sector can also alleviate challenges associated with data fragmentation, transparency, and accessibility. At the same time, it is crucial to consider privacy and security concerns throughout the development of open source platforms.
This webinar will delve into the motivations behind establishing LF Energy’s Carbon Data Specification Consortium. It will provide an overview of the draft specifications and the ongoing progress made by the respective working groups.
Three primary specifications will be discussed:
-Discovery and client registration, emphasizing transparent processes and secure and private access
-Customer data, centering around customer tariffs, bills, energy usage, and full consumption disclosure
-Power systems data, focusing on grid data, inclusive of transmission and distribution networks, generation, intergrid power flows, and market settlement data
inQuba Webinar: Mastering Customer Journey Management with Dr Graham Hill (LizaNolte)
Webinar recording: 'Mastering Customer Journey Management with Dr. Graham Hill'. We hope you find it both insightful and enjoyable.
In this webinar, we explored essential aspects of Customer Journey Management and personalization. Here’s a summary of the key insights and topics discussed:
Key Takeaways:
Understanding the Customer Journey: Dr. Hill emphasized the importance of mapping and understanding the complete customer journey to identify touchpoints and opportunities for improvement.
Personalization Strategies: We discussed how to leverage data and insights to create personalized experiences that resonate with customers.
Technology Integration: Insights were shared on how inQuba’s advanced technology can streamline customer interactions and drive operational efficiency.
QA or the Highway - Component Testing: Bridging the gap between frontend appl... (zjhamm304)
These are the slides for the presentation, "Component Testing: Bridging the gap between frontend applications" that was presented at QA or the Highway 2024 in Columbus, OH by Zachary Hamm.
Getting the Most Out of ScyllaDB Monitoring: ShareChat's Tips (ScyllaDB)
ScyllaDB monitoring provides a lot of useful information, but sometimes it's not easy to find the root of a problem when something is wrong, or even to estimate remaining capacity from the load on the cluster. This talk shares our team's practical tips on: 1) how to find the root of a problem from metrics when ScyllaDB is slow; 2) how to interpret load and plan capacity for the future; 3) compaction strategies and how to choose the right one; and 4) important metrics that aren't available in the default monitoring setup.
What is an RPA CoE? Session 2 – CoE Roles (DianaGray10)
In this session, we will review the players involved in the CoE and how each role impacts opportunities.
Topics covered:
• What roles are essential?
• What place in the automation journey does each role play?
Speaker:
Chris Bolin, Senior Intelligent Automation Architect Anika Systems
Northern Engraving | Nameplate Manufacturing Process - 2024 (Northern Engraving)
Manufacturing custom quality metal nameplates and badges involves several standard operations. Processes include sheet prep, lithography, screening, coating, punch press and inspection. All decoration is completed in the flat sheet with adhesive and tooling operations following. The possibilities for creating unique durable nameplates are endless. How will you create your brand identity? We can help!
Lee Barnes - Path to Becoming an Effective Test Automation Engineer (leebarnesutopia)
So… you want to become a Test Automation Engineer (or hire and develop one)? While there's quite a bit of information available about the important technical and tool skills to master, there's not enough discussion around the path to becoming an effective Test Automation Engineer who knows how to add VALUE. In my experience this has led to a proliferation of engineers who are proficient with tools and building frameworks but have skill and knowledge gaps, especially in software testing, that reduce the value they deliver with test automation.
In this talk, Lee will share his lessons learned from over 30 years of working with, and mentoring, hundreds of Test Automation Engineers. Whether you’re looking to get started in test automation or just want to improve your trade, this talk will give you a solid foundation and roadmap for ensuring your test automation efforts continuously add value. This talk is equally valuable for both aspiring Test Automation Engineers and those managing them! All attendees will take away a set of key foundational knowledge and a high-level learning path for leveling up test automation skills and ensuring they add value to their organizations.