This document discusses various database technologies and their performance characteristics. It begins by outlining some key metrics for evaluating databases, including throughput, response time variance, and efficiency/cost. It then discusses storage strategies using flash and spinning disks to reduce costs while improving performance. The document goes on to examine technologies like B-trees, log structured merge trees, and their variants. It analyzes each in terms of read amplification, write amplification, space amplification, and how these factors influence hardware requirements and lifespan. Overall, the document provides a technical overview for choosing an appropriate database technology based on workload and performance goals.
How do you operate over 1,200 deployments on a single BOSH Director? In the past many talks have had the Topic of Cloud Foundry at scale. But how about the underlying automation layer? BOSH has its own set of challenges and limits for running VMs and Deployments at scale. Learn which obstacles and limits came up and how we solved them with the help of the BOSH core development team. Learn how we monitor the directors, be it via logging and metrics or performance indicators. We’ll also show you how we automate BOSH itself to ensure the best experience for end users, and to keep them blissfully unaware of the complexity of the processes working on their behalf After this talk you will also be able to run at least 1,200 deployments on your directors.
[NetApp] Managing Big Workspaces with Storage MagicPerforce
If you work with large volumes of data—multimedia assets, video game art, or firmware designs—you understand the pain of trying to quickly get a copy of source and build assets. But if you have the right storage system, you can be up and running with a new Perforce workspace in minutes instead of hours. See a simple procedure for fast workspace cloning using a few Perforce commands and NetApp FlexClone.
How to Meet Your P99 Goal While Overcommitting Another WorkloadScyllaDB
Meeting a tight P99 latency goal is hard, it's harder when running
multiple workloads with a mix of real time sensitive and analytical workloads.
In this presentation, I will cover the Scylla schedulers and controllers and demonstrate how they guarantee a good level of resource isolation.
Considerations for building your private cloud folsom update 041513OpenStack Foundation
The document discusses considerations for building a private cloud using OpenStack Folsom. It covers topics like defining a private cloud, sizing flavors and capacity planning, networking with nova-network, image management with Glance, storage options, and example architectures. The presentation aims to help architects build private clouds that can easily scale and have good performance.
This document outlines the key concepts of Google's Bigtable distributed database system. It discusses Bigtable's data model, APIs, implementation details including its use of GFS and Chubby, refinements to improve performance, and lessons learned. The document poses many questions about Bigtable's design and implementation for further discussion.
Cloud-Friendly Hadoop and Hive - StampedeCon 2013StampedeCon
`At the StampedeCon 2013 Big Data conference in St. Louis, Shrikanth Shankar, Head of Engineering at Qubole, presented Cloud-Friendly Hadoop and Hive. The cloud reduces the barrier to entry for many small and medium size enterprises into analytics. Hadoop and related frameworks like Hive, Oozie, Sqoop are becoming tools of choice for deriving insights from data. However, these frameworks were designed for in-house datacenters, which have different tradeoffs from a cloud environment, and making them run well in the cloud presents some challenges. In this talk, Shrikanth Shankar, Head of Engineering at Qubole, describes how these experiences taught us to extend Hadoop and Hive to exploit these new tradeoffs. Use cases will be presented that show how the challenges at large scale at Facebook are now making it extremely easy for a significantly smaller end user to leverage these technologies in the cloud.
How do you operate over 1,200 deployments on a single BOSH Director? In the past many talks have had the Topic of Cloud Foundry at scale. But how about the underlying automation layer? BOSH has its own set of challenges and limits for running VMs and Deployments at scale. Learn which obstacles and limits came up and how we solved them with the help of the BOSH core development team. Learn how we monitor the directors, be it via logging and metrics or performance indicators. We’ll also show you how we automate BOSH itself to ensure the best experience for end users, and to keep them blissfully unaware of the complexity of the processes working on their behalf After this talk you will also be able to run at least 1,200 deployments on your directors.
[NetApp] Managing Big Workspaces with Storage MagicPerforce
If you work with large volumes of data—multimedia assets, video game art, or firmware designs—you understand the pain of trying to quickly get a copy of source and build assets. But if you have the right storage system, you can be up and running with a new Perforce workspace in minutes instead of hours. See a simple procedure for fast workspace cloning using a few Perforce commands and NetApp FlexClone.
How to Meet Your P99 Goal While Overcommitting Another WorkloadScyllaDB
Meeting a tight P99 latency goal is hard, it's harder when running
multiple workloads with a mix of real time sensitive and analytical workloads.
In this presentation, I will cover the Scylla schedulers and controllers and demonstrate how they guarantee a good level of resource isolation.
Considerations for building your private cloud folsom update 041513OpenStack Foundation
The document discusses considerations for building a private cloud using OpenStack Folsom. It covers topics like defining a private cloud, sizing flavors and capacity planning, networking with nova-network, image management with Glance, storage options, and example architectures. The presentation aims to help architects build private clouds that can easily scale and have good performance.
This document outlines the key concepts of Google's Bigtable distributed database system. It discusses Bigtable's data model, APIs, implementation details including its use of GFS and Chubby, refinements to improve performance, and lessons learned. The document poses many questions about Bigtable's design and implementation for further discussion.
Cloud-Friendly Hadoop and Hive - StampedeCon 2013StampedeCon
`At the StampedeCon 2013 Big Data conference in St. Louis, Shrikanth Shankar, Head of Engineering at Qubole, presented Cloud-Friendly Hadoop and Hive. The cloud reduces the barrier to entry for many small and medium size enterprises into analytics. Hadoop and related frameworks like Hive, Oozie, Sqoop are becoming tools of choice for deriving insights from data. However, these frameworks were designed for in-house datacenters, which have different tradeoffs from a cloud environment, and making them run well in the cloud presents some challenges. In this talk, Shrikanth Shankar, Head of Engineering at Qubole, describes how these experiences taught us to extend Hadoop and Hive to exploit these new tradeoffs. Use cases will be presented that show how the challenges at large scale at Facebook are now making it extremely easy for a significantly smaller end user to leverage these technologies in the cloud.
This document provides an overview and best practices for using Redis beyond basic operations. It discusses techniques for ensuring data persistence and safety through replication, snapshots, and append-only files. It also covers reducing memory usage through optimized data structures, scaling read and write capabilities by sharding and using slaves, and executing complex queries efficiently. The goal is to help users leverage Redis' advanced features like high performance, replication, and unique data models for their use cases.
Redis is an in-memory database that offers high performance, replication, and unique data structures. This document outlines various Redis topics like persistence using snapshots and AOF files, replication of data across servers, replacing failed masters, transactions, reducing memory usage with specialized data structures like ziplists and intsets, and scaling Redis through sharding, increasing read/write capabilities, and using specialized commands. The document provides technical details on configuring and implementing these Redis features.
Scylla Summit 2018: Rebuilding the Ceph Distributed Storage Solution with Sea...ScyllaDB
This document summarizes plans to rebuild the Ceph distributed storage system using Seastar, a framework for high-performance event-driven applications. Ceph is an open-source distributed storage platform that provides object, block, and file storage at scale. It uses a thread pool model that has limitations around lock contention and context switching. Seastar uses an asynchronous message passing model without locks that could improve Ceph's performance. The plan is to backfill Ceph components starting with critical I/O paths to prioritize basic functionality, then add supporting features later to fully rebuild Ceph on Seastar.
Virtual memory allows processes to access memory addresses that exceed the amount of physical memory available. When a process references a memory page that is not in RAM, a page fault occurs which brings the missing page into memory from disk. Page replacement algorithms are used to determine which page to remove from RAM to make room for the faulting page. The working set model aims to keep the active pages used by each process in memory to reduce thrashing, which occurs when the total memory demand exceeds the available RAM.
Virtual Memory
• Copy-on-Write
• Page Replacement
• Allocation of Frames
• Thrashing
• Operating-System Examples
Background
Page Table When Some PagesAre Not in Main Memory
Steps in Handling a Page Fault
This document discusses optimizing an Apache Pulsar cluster to handle 10 PB of data per day for a financial customer. Initial estimates showed the cluster would need over 1000 VMs using HDD storage. Various optimizations were implemented, including eliminating the journal, using direct I/O, compression, and C++ client optimizations. This reduced the estimated number of needed VMs to 200 using L-SSD storage per VM. The optimized cluster can now meet the customer's requirements of processing 10 PB of data per day with 3 hours of retention and zone failure protection.
Accelerating HBase with NVMe and Bucket CacheNicolas Poggi
on-Volatile-Memory express (NVMe) standard promises and order of magnitude faster storage than regular SSDs, while at the same time being more economical than regular RAM on TB/$. This talk evaluates the use cases and benefits of NVMe drives for its use in Big Data clusters with HBase and Hadoop HDFS.
First, we benchmark the different drives using system level tools (FIO) to get maximum expected values for each different device type and set expectations. Second, we explore the different options and use cases of HBase storage and benchmark the different setups. And finally, we evaluate the speedups obtained by the NVMe technology for the different Big Data use cases from the YCSB benchmark.
In summary, while the NVMe drives show up to 8x speedup in best case scenarios, testing the cost-efficiency of new device technologies is not straightforward in Big Data, where we need to overcome system level caching to measure the maximum benefits.
P99 Pursuit: 8 Years of Battling P99 LatencyScyllaDB
Performance engineering is a Sisyphean hill climb for perfection. Those who climb the hill are hardly ever satisfied with the results. You should always ask yourself where the bottleneck is today and what’s holding you back. Great performance improves your software. It enables you to run fewer layers, manage 10x less machines, simplifies your stack, and more.
In this keynote session, ScyllaDB CEO Dor Laor will cover the principles for successful creation of projects like ScyllaDB, KVM, the Linux kernel and explain why they spurred his vision for the P99 CONF.
Kafka on ZFS: Better Living Through Filesystems confluent
(Hugh O'Brien, Jet.com) Kafka Summit SF 2018
You’re doing disk IO wrong, let ZFS show you the way. ZFS on Linux is now stable. Say goodbye to JBOD, to directories in your reassignment plans, to unevenly used disks. Instead, have 8K Cloud IOPS for $25, SSD speed reads on spinning disks, in-kernel LZ4 compression and the smartest page cache on the planet. (Fear compactions no more!)
Learn how Jet’s Kafka clusters squeeze every drop of disk performance out of Azure, all completely transparent to Kafka.
-Striping cheap disks to maximize instance IOPS
-Block compression to reduce disk usage by ~80% (JSON data)
-Instance SSD as the secondary read cache (storing compressed data), eliminating >99% of disk reads and safe across host redeployments
-Upcoming features: Compressed blocks in memory, potentially quadrupling your page cache (RAM) for free
We’ll cover:
-Basic Principles
-Adapting ZFS for cloud instances (gotchas)
-Performance tuning for Kafka
-Benchmarks
Percona live linux filesystems and my sqlMichael Zhang
The document discusses Linux filesystems and MySQL tuning. It begins with an overview of basic Linux IO concepts like directory structure, LVM, RAID, SSDs, and filesystem concepts. It then covers filesystem choices and best practices for MySQL tuning, benchmarks, and AWS EC2 deployments. The goal is to provide optimization strategies for storing MySQL data and logs on Linux filesystems.
ZFS is a filesystem developed for Solaris that provides features like cheap snapshots, replication, and checksumming. It can be used for databases. While it has benefits, random writes become sequential which can hurt performance. The OpenZFS project continues developing ZFS and improved the I/O scheduler to provide smoother write latency compared to the original ZFS write throttle. Tuning parameters in OpenZFS give better control over throughput and latency. Measuring performance is important for optimizing ZFS for database use.
These are the slides from a tutorial I presented at LOPSA-East in 2013. It covers spinning media and and solid state drives in detail.
A video of the presentation can be found on YouTube: http://www.youtube.com/watch?v=G3wf1HMr6b0
Queueing Theory is perhaps one of the most important mathematical theories in systems design and analysis, yet only few engineers learn it. This talk teaches the basics of queueing theory and explores the ramifications of queue behavior on system performance and resiliency. This talk aims to give practical skills that can be applied better build and tune your systems. The talk covers:
- Queueing delays
- Queueing capacity
- Little's Law and how to apply it
- Proper sizing of thread and connection pools
This chapter contains information for memory compilers available in STDL80 cell library. These are
complete compilers that consist of various generators to satisfy the requirements of the circuit at hand. Each
of the final building block, the physical layout, will be implemented as a stand-alone, densely packed,
pitch-matched array. Using this complex layout generator and adopting state-of-the-art logic and circuit
design technique, these memory cells can realize extreme density and performance. In each layout
generator, we added an option which makes the aspect ratio of the physical layout selectable so that the
ASIC designers can choose the aspect ratio according to the convenience of the chip level layout.
Queueing Theory is perhaps one of the most important mathematical theories in systems design and analysis, yet only few engineers learn it. This talk teaches the basics of queueing theory and explores the ramifications of queue behavior on system performance and resiliency. This talk aims to give practical skills that can be applied better build and tune your systems. The talk covers:
- Queueing delays
- Queueing capacity
- Little's Law and how to apply it
- Proper sizing of thread and connection pools
This document discusses how Cassandra's storage engine was optimized for spinning disks but remains well-suited for solid state drives. It describes how Cassandra uses LSM trees with sequential, append-only writes to disks, avoiding the random read/write patterns that cause issues for SSDs like write amplification and reduced lifetime from excessive garbage collection. While SSDs have benefits like fast random access, Cassandra's design circumvents problems they were meant to solve, keeping write amplification close to 1 and leveraging SSDs' fast sequential throughput.
Migrating from InnoDB and HBase to MyRocks at FacebookMariaDB plc
Migrating large databases at Facebook from InnoDB to MyRocks and HBase to MyRocks resulted in significant space savings of 2-4x and improved write performance by up to 10x. Various techniques were used for the migrations such as creating new MyRocks instances without downtime, loading data efficiently, testing on shadow instances, and promoting MyRocks instances as masters. Ongoing work involves optimizations like direct I/O, dictionary compression, parallel compaction, and dynamic configuration changes to further improve performance and efficiency.
An updated talk about how to use Solr for logs and other time-series data, like metrics and social media. In 2016, Solr, its ecosystem, and the operating systems it runs on have evolved quite a lot, so we can now show new techniques to scale and new knobs to tune.
We'll start by looking at how to scale SolrCloud through a hybrid approach using a combination of time- and size-based indices, and also how to divide the cluster in tiers in order to handle the potentially spiky load in real-time. Then, we'll look at tuning individual nodes. We'll cover everything from commits, buffers, merge policies and doc values to OS settings like disk scheduler, SSD caching, and huge pages.
Finally, we'll take a look at the pipeline of getting the logs to Solr and how to make it fast and reliable: where should buffers live, which protocols to use, where should the heavy processing be done (like parsing unstructured data), and which tools from the ecosystem can help.
Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...Lucidworks
The document summarizes key points from a presentation on optimizing Solr and log pipelines for time-series data. The presentation covered using time-based Solr collections that rotate based on size, tiering hot and cold clusters, tuning OS and Solr settings, parsing logs, buffering pipelines, and shipping logs using protocols like UDP, TCP, and Kafka. The overall conclusions were that tuning segments per tier and max merged segment size improved indexing throughput, and that simple, reliable pipelines like Filebeat to Kafka or rsyslog over UNIX sockets generally work best.
This document discusses scaling MySQL databases in Amazon Web Services. It provides an overview of using Amazon RDS versus managing MySQL databases on EC2 instances. While RDS offers ease of use, it has higher costs and less flexibility. The document recommends using EC2 for high performance or flexible setups, and automating database provisioning, backups, and failover. It also discusses sharding databases across multiple instances, using replication and multiple availability zones for resiliency, and tools for monitoring and operations visibility.
This document provides an overview and best practices for using Redis beyond basic operations. It discusses techniques for ensuring data persistence and safety through replication, snapshots, and append-only files. It also covers reducing memory usage through optimized data structures, scaling read and write capabilities by sharding and using slaves, and executing complex queries efficiently. The goal is to help users leverage Redis' advanced features like high performance, replication, and unique data models for their use cases.
Redis is an in-memory database that offers high performance, replication, and unique data structures. This document outlines various Redis topics like persistence using snapshots and AOF files, replication of data across servers, replacing failed masters, transactions, reducing memory usage with specialized data structures like ziplists and intsets, and scaling Redis through sharding, increasing read/write capabilities, and using specialized commands. The document provides technical details on configuring and implementing these Redis features.
Scylla Summit 2018: Rebuilding the Ceph Distributed Storage Solution with Sea...ScyllaDB
This document summarizes plans to rebuild the Ceph distributed storage system using Seastar, a framework for high-performance event-driven applications. Ceph is an open-source distributed storage platform that provides object, block, and file storage at scale. It uses a thread pool model that has limitations around lock contention and context switching. Seastar uses an asynchronous message passing model without locks that could improve Ceph's performance. The plan is to backfill Ceph components starting with critical I/O paths to prioritize basic functionality, then add supporting features later to fully rebuild Ceph on Seastar.
Virtual memory allows processes to access memory addresses that exceed the amount of physical memory available. When a process references a memory page that is not in RAM, a page fault occurs which brings the missing page into memory from disk. Page replacement algorithms are used to determine which page to remove from RAM to make room for the faulting page. The working set model aims to keep the active pages used by each process in memory to reduce thrashing, which occurs when the total memory demand exceeds the available RAM.
Virtual Memory
• Copy-on-Write
• Page Replacement
• Allocation of Frames
• Thrashing
• Operating-System Examples
Background
Page Table When Some PagesAre Not in Main Memory
Steps in Handling a Page Fault
This document discusses optimizing an Apache Pulsar cluster to handle 10 PB of data per day for a financial customer. Initial estimates showed the cluster would need over 1000 VMs using HDD storage. Various optimizations were implemented, including eliminating the journal, using direct I/O, compression, and C++ client optimizations. This reduced the estimated number of needed VMs to 200 using L-SSD storage per VM. The optimized cluster can now meet the customer's requirements of processing 10 PB of data per day with 3 hours of retention and zone failure protection.
Accelerating HBase with NVMe and Bucket CacheNicolas Poggi
on-Volatile-Memory express (NVMe) standard promises and order of magnitude faster storage than regular SSDs, while at the same time being more economical than regular RAM on TB/$. This talk evaluates the use cases and benefits of NVMe drives for its use in Big Data clusters with HBase and Hadoop HDFS.
First, we benchmark the different drives using system level tools (FIO) to get maximum expected values for each different device type and set expectations. Second, we explore the different options and use cases of HBase storage and benchmark the different setups. And finally, we evaluate the speedups obtained by the NVMe technology for the different Big Data use cases from the YCSB benchmark.
In summary, while the NVMe drives show up to 8x speedup in best case scenarios, testing the cost-efficiency of new device technologies is not straightforward in Big Data, where we need to overcome system level caching to measure the maximum benefits.
P99 Pursuit: 8 Years of Battling P99 LatencyScyllaDB
Performance engineering is a Sisyphean hill climb for perfection. Those who climb the hill are hardly ever satisfied with the results. You should always ask yourself where the bottleneck is today and what’s holding you back. Great performance improves your software. It enables you to run fewer layers, manage 10x less machines, simplifies your stack, and more.
In this keynote session, ScyllaDB CEO Dor Laor will cover the principles for successful creation of projects like ScyllaDB, KVM, the Linux kernel and explain why they spurred his vision for the P99 CONF.
Kafka on ZFS: Better Living Through Filesystems confluent
(Hugh O'Brien, Jet.com) Kafka Summit SF 2018
You’re doing disk IO wrong, let ZFS show you the way. ZFS on Linux is now stable. Say goodbye to JBOD, to directories in your reassignment plans, to unevenly used disks. Instead, have 8K Cloud IOPS for $25, SSD speed reads on spinning disks, in-kernel LZ4 compression and the smartest page cache on the planet. (Fear compactions no more!)
Learn how Jet’s Kafka clusters squeeze every drop of disk performance out of Azure, all completely transparent to Kafka.
-Striping cheap disks to maximize instance IOPS
-Block compression to reduce disk usage by ~80% (JSON data)
-Instance SSD as the secondary read cache (storing compressed data), eliminating >99% of disk reads and safe across host redeployments
-Upcoming features: Compressed blocks in memory, potentially quadrupling your page cache (RAM) for free
We’ll cover:
-Basic Principles
-Adapting ZFS for cloud instances (gotchas)
-Performance tuning for Kafka
-Benchmarks
Percona live linux filesystems and my sqlMichael Zhang
The document discusses Linux filesystems and MySQL tuning. It begins with an overview of basic Linux IO concepts like directory structure, LVM, RAID, SSDs, and filesystem concepts. It then covers filesystem choices and best practices for MySQL tuning, benchmarks, and AWS EC2 deployments. The goal is to provide optimization strategies for storing MySQL data and logs on Linux filesystems.
ZFS is a filesystem developed for Solaris that provides features like cheap snapshots, replication, and checksumming. It can be used for databases. While it has benefits, random writes become sequential which can hurt performance. The OpenZFS project continues developing ZFS and improved the I/O scheduler to provide smoother write latency compared to the original ZFS write throttle. Tuning parameters in OpenZFS give better control over throughput and latency. Measuring performance is important for optimizing ZFS for database use.
These are the slides from a tutorial I presented at LOPSA-East in 2013. It covers spinning media and and solid state drives in detail.
A video of the presentation can be found on YouTube: http://www.youtube.com/watch?v=G3wf1HMr6b0
Queueing Theory is perhaps one of the most important mathematical theories in systems design and analysis, yet only few engineers learn it. This talk teaches the basics of queueing theory and explores the ramifications of queue behavior on system performance and resiliency. This talk aims to give practical skills that can be applied better build and tune your systems. The talk covers:
- Queueing delays
- Queueing capacity
- Little's Law and how to apply it
- Proper sizing of thread and connection pools
This chapter contains information for memory compilers available in STDL80 cell library. These are
complete compilers that consist of various generators to satisfy the requirements of the circuit at hand. Each
of the final building block, the physical layout, will be implemented as a stand-alone, densely packed,
pitch-matched array. Using this complex layout generator and adopting state-of-the-art logic and circuit
design technique, these memory cells can realize extreme density and performance. In each layout
generator, we added an option which makes the aspect ratio of the physical layout selectable so that the
ASIC designers can choose the aspect ratio according to the convenience of the chip level layout.
Queueing Theory is perhaps one of the most important mathematical theories in systems design and analysis, yet only few engineers learn it. This talk teaches the basics of queueing theory and explores the ramifications of queue behavior on system performance and resiliency. This talk aims to give practical skills that can be applied better build and tune your systems. The talk covers:
- Queueing delays
- Queueing capacity
- Little's Law and how to apply it
- Proper sizing of thread and connection pools
This document discusses how Cassandra's storage engine was optimized for spinning disks but remains well-suited for solid state drives. It describes how Cassandra uses LSM trees with sequential, append-only writes to disks, avoiding the random read/write patterns that cause issues for SSDs like write amplification and reduced lifetime from excessive garbage collection. While SSDs have benefits like fast random access, Cassandra's design circumvents problems they were meant to solve, keeping write amplification close to 1 and leveraging SSDs' fast sequential throughput.
Migrating from InnoDB and HBase to MyRocks at FacebookMariaDB plc
Migrating large databases at Facebook from InnoDB to MyRocks and HBase to MyRocks resulted in significant space savings of 2-4x and improved write performance by up to 10x. Various techniques were used for the migrations such as creating new MyRocks instances without downtime, loading data efficiently, testing on shadow instances, and promoting MyRocks instances as masters. Ongoing work involves optimizations like direct I/O, dictionary compression, parallel compaction, and dynamic configuration changes to further improve performance and efficiency.
An updated talk about how to use Solr for logs and other time-series data, like metrics and social media. In 2016, Solr, its ecosystem, and the operating systems it runs on have evolved quite a lot, so we can now show new techniques to scale and new knobs to tune.
We'll start by looking at how to scale SolrCloud through a hybrid approach using a combination of time- and size-based indices, and also how to divide the cluster in tiers in order to handle the potentially spiky load in real-time. Then, we'll look at tuning individual nodes. We'll cover everything from commits, buffers, merge policies and doc values to OS settings like disk scheduler, SSD caching, and huge pages.
Finally, we'll take a look at the pipeline of getting the logs to Solr and how to make it fast and reliable: where should buffers live, which protocols to use, where should the heavy processing be done (like parsing unstructured data), and which tools from the ecosystem can help.
Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...Lucidworks
The document summarizes key points from a presentation on optimizing Solr and log pipelines for time-series data. The presentation covered using time-based Solr collections that rotate based on size, tiering hot and cold clusters, tuning OS and Solr settings, parsing logs, buffering pipelines, and shipping logs using protocols like UDP, TCP, and Kafka. The overall conclusions were that tuning segments per tier and max merged segment size improved indexing throughput, and that simple, reliable pipelines like Filebeat to Kafka or rsyslog over UNIX sockets generally work best.
This document discusses scaling MySQL databases in Amazon Web Services. It provides an overview of using Amazon RDS versus managing MySQL databases on EC2 instances. While RDS offers ease of use, it has higher costs and less flexibility. The document recommends using EC2 for high performance or flexible setups, and automating database provisioning, backups, and failover. It also discusses sharding databases across multiple instances, using replication and multiple availability zones for resiliency, and tools for monitoring and operations visibility.
Scylla Summit 2018: Make Scylla Fast Again! Find out how using Tools, Talent,...ScyllaDB
Scylla strives to deliver high throughput at low, consistent latencies under any scenario. But in the field things can and do get slower than one would like. Some of those issues come from bad data modelling and anti-patterns. Some others from lack of resources and bad system configuration, and in rare cases even product malfunction.
But how to tell them apart? And once you do, how to understand how to fix your application or reconfigure your system? Scylla has a rich ecosystem of tools available to answer those questions and in this talk we’ll discuss the proper use of some of them and how to take advantage of each tool’s strength. We will discuss real examples using tools like CQL tracing, nodetool commands, the Scylla monitor and others.
From HDFS to S3: Migrate Pinterest Apache Spark ClustersDatabricks
The document discusses Pinterest migrating their Apache Spark clusters from HDFS to S3 storage. Some key points:
1) Migrating to S3 provided significantly better performance due to the higher IOPS of modern EC2 instances compared to their older HDFS nodes. Jobs saw 25-35% improvements on average.
2) S3 is eventually consistent while HDFS is strongly consistent, so they implemented the S3Committer to handle output consistency issues during job failures.
3) Metadata operations like file moves were very slow in S3, so they optimized jobs to reduce unnecessary moves using techniques like multipart uploads to S3.
What Every Developer Should Know About Database Scalabilityjbellis
Replication. Partitioning. Relational databases. Bigtable. Dynamo. There is no one-size-fits-all approach to scaling your database, and the CAP theorem proved that there never will be. This talk will explain the advantages and limits of the approaches to scaling traditional relational databases, as well as the tradeoffs made by the designers of newer distributed systems like Cassandra. These slides are from Jonathan Ellis's OSCON 09 talk: http://en.oreilly.com/oscon2009/public/schedule/detail/7955
This document discusses storing data on disks and in files for database management systems. It covers several key topics:
1) The memory hierarchy from main memory to disks and tapes and why databases must store most data on disk for capacity and cost reasons.
2) Disk drive architecture including how data is stored, read, and written in blocks and the implications for performance like seek times and rotational delays.
3) File structure including heap files, how records and pages are organized on disk, and the impact of layout on performance through aspects like locality of reference.
UKOUG, Lies, Damn Lies and I/O StatisticsKyle Hailey
1. Many factors can cause storage performance anomalies that make benchmarking difficult. Caching, shared infrastructure, I/O consolidation and fragmentation, and tiered storage are some of the top issues.
2. It is important to use real workloads, capture latency histograms rather than just averages, ensure results are reproducible, and run tests long enough to reach steady state.
3. Proper testing methodology is required to accurately characterize storage performance and avoid anomalies. Tools like FIO can help simulate real workloads.
One-cloud — система управления дата-центром в Одноклассниках / Олег Анастасье...Ontico
HighLoad++ 2017
Зал «Калининград», 8 ноября, 15:00
Тезисы:
http://www.highload.ru/2017/abstracts/2964.html
Одноклассники состоят из более чем восьми тысяч железных серверов, расположенных в нескольких дата-центрах. Каждая из этих машин была специализированной под конкретную задачу - как для обеспечения изоляции отказов, так и для обеспечения автоматизированного управления инфраструктурой.
...
Масштабируя DNS / Артем Гавриченков (Qrator Labs)Ontico
HighLoad++ 2017
Зал «Калининград», 8 ноября, 16:00
Тезисы:
http://www.highload.ru/2017/abstracts/3032.html
Протокол DNS на семь лет старше, чем Всемирная паутина. Стандарты RFC 882 и 883, определяющие основную функциональность системы доменных имён, появились в конце 1983 года, а первая реализация последовала уже годом позже. Естественно, что у технологии столь старой и при этом по сей день активнейшим образом используемой просто не могли не накопиться особенности, неочевидные обыкновенным пользователям.
...
Создание BigData-платформы для ФГУП Почта России / Андрей Бащенко (Luxoft)Ontico
HighLoad++ 2017
Зал «Калининград», 8 ноября, 13:00
Тезисы:
http://www.highload.ru/2017/abstracts/3010.html
В этом докладе я расскажу, как BigData-платформа помогает трансформировать Почту России, как мы управляем построением и развитием платформы. Расскажу про найденные удачные решения, например, как разбиение на продукты с понятными SLA и интерфейсами между ними помогло нам сохранять управляемость с ростом масштабов проекта.
...
Готовим тестовое окружение, или сколько тестовых инстансов вам нужно / Алекса...Ontico
HighLoad++ 2017
Зал «Кейптаун», 8 ноября, 10:00
Тезисы:
http://www.highload.ru/2017/abstracts/2914.html
Казалось бы, что нужно для организации тестового окружения? Тестовая железка и копия боевого окружения - и тестовый сервер готов. Но как быть, когда проект сложный? А когда большой? А если нужно тестировать одновременно много версий? А если все это вместе?
Организация тестирования большого развивающегося проекта, где одновременно в разработке и тестировании около полусотни фич - достаточно непростая задача. Ситуация обычно осложняется тем, что иногда есть желание потрогать еще не полностью готовый функционал. В таких ситуациях часто возникает вопрос: "А куда это можно накатить и где покликать?"
...
Новые технологии репликации данных в PostgreSQL / Александр Алексеев (Postgre...Ontico
HighLoad++ 2017
Зал «Кейптаун», 8 ноября, 18:00
Тезисы:
http://www.highload.ru/2017/abstracts/2854.html
Из этого доклада вы узнаете о возможностях репликации и автофейловера PostgreSQL, в том числе о возможностях, ставших доступных в PostgreSQL 10.
Среди прочих, будет затронуты следующие темы:
* Виды репликации и решаемые с ее помощью проблемы.
* Настройка потоковой репликации.
* Настройка логической репликации.
* Настройка автофейловера / HA средствами Stolon и Consul.
После прослушивания доклада вы сможете самостоятельно настраивать репликацию и автофейловер PostgreSQL.
PostgreSQL Configuration for Humans / Alvaro Hernandez (OnGres)Ontico
HighLoad++ 2017
Зал «Кейптаун», 8 ноября, 17:00
Тезисы:
http://www.highload.ru/2017/abstracts/3096.html
PostgreSQL is the world’s most advanced open source database. Indeed! With around 270 configuration parameters in postgresql.conf, plus all the knobs in pg_hba.conf, it is definitely ADVANCED!
How many parameters do you tune? 1? 8? 32? Anyone ever tuned more than 64?
No tuning means below par performance. But how to start? Which parameters to tune? What are the appropriate values? Is there a tool --not just an editor like vim or emacs-- to help users manage the 700-line postgresql.conf file?
Join this talk to understand the performance advantages of appropriately tuning your postgresql.conf file, showcase a new free tool to make PostgreSQL configuration possible for HUMANS, and learn the best practices for tuning several relevant postgresql.conf parameters.
Inexpensive Datamasking for MySQL with ProxySQL — Data Anonymization for Deve...Ontico
HighLoad++ 2017
Зал «Кейптаун», 8 ноября, 16:00
Тезисы:
http://www.highload.ru/2017/abstracts/3115.html
During this session we will cover the last development in ProxySQL to support regular expressions (RE2 and PCRE) and how we can use this strong technique in correlation with ProxySQL's query rules to anonymize live data quickly and transparently. We will explain the mechanism and how to generate these rules quickly. We show live demo with all challenges we got from the Community and we finish the session by an interactive brainstorm testing queries from the audience.
Опыт разработки модуля межсетевого экранирования для MySQL / Олег Брославский...Ontico
HighLoad++ 2017
Зал «Кейптаун», 8 ноября, 15:00
Тезисы:
http://www.highload.ru/2017/abstracts/2957.html
Расскажем о нашем опыте разработки модуля межсетевого экрана для MySQL с использованием генератора парсеров ANTLR и языка Kotlin.
Подробно рассмотрим следующие вопросы:
— когда и почему целесообразно использовать ANTLR;
— особенности разработки ANTLR-грамматики для MySQL;
— сравнение производительности рантаймов для ANTLR в рамках задачи синтаксического анализа MySQL (C#, Java, Kotlin, Go, Python, PyPy, C++);
— вспомогательные DSL;
— микросервисная архитектура модуля экранирования SQL;
— полученные результаты.
ProxySQL Use Case Scenarios / Alkin Tezuysal (Percona)Ontico
HighLoad++ 2017
Зал «Кейптаун», 8 ноября, 14:00
Тезисы:
http://www.highload.ru/2017/abstracts/3114.html
ProxySQL aims to be the most powerful proxy in the MySQL ecosystem. It is protocol-aware and able to provide high availability (HA) and high performance with no changes in the application, using several built-in features and integration with clustering software. During this session we will quickly introduce its main features, so to better understand how it works. We will then describe multiple use case scenarios in which ProxySQL empowers large MySQL installations to provide HA with zero downtime, read/write split, query rewrite, sharding, query caching, and multiplexing using SSL across data centers.
MySQL Replication — Advanced Features / Петр Зайцев (Percona)Ontico
HighLoad++ 2017
Зал «Кейптаун», 8 ноября, 13:00
Тезисы:
http://www.highload.ru/2017/abstracts/2954.html
MySQL Replication is powerful and has added a lot of advanced features through the years. In this presentation we will look into replication technology in MySQL 5.7 and variants focusing on advanced features, what do they mean, when to use them and when not, Including.
When should you use STATEMENT, ROW or MIXED binary log format?
What is GTID in MySQL and MariaDB and why do you want to use them?
What is semi-sync replication and how is it different from lossless semi-sync?
...
Внутренний open-source. Как разрабатывать мобильное приложение большим количе...Ontico
HighLoad++ 2017
Зал «Кейптаун», 8 ноября, 12:00
Тезисы:
http://www.highload.ru/2017/abstracts/3120.html
Количество разработчиков мобильных приложений Сбербанк Онлайн с начала 2016 года выросло на порядок. Для того чтобы продолжать выпускать качественный продукт, мы кардинально перестраиваем процесс разработки.
Количество внутренних заказчиков тех или иных доработок в какой-то момент выросло настолько, что разработчики стали узким местом. Мы внедрили культуру разработки, которую можно условно назвать "внутренним open-source", сохранив за собой контроль над архитектурой и качеством проекта, но позволив разрабатывать новые фичи всем желающим.
...
Подробно о том, как Causal Consistency реализовано в MongoDB / Михаил Тюленев...Ontico
HighLoad++ 2017
Зал «Мумбай», 8 ноября, 18:00
Тезисы:
http://www.highload.ru/2017/abstracts/2836.html
При использовании Eventually Consistent распределенных баз данных нет гарантий, что чтение возвращает результаты последних изменений данных, если чтение и запись производятся на разных узлах. Это ограничивает пропускную способность системы. Поддержка свойства Causal Consistency снимает это ограничение, что позволяет улучшить масштабируемость, не требуя изменений в коде приложения.
...
Балансировка на скорости проводов. Без ASIC, без ограничений. Решения NFWare ...Ontico
HighLoad++ 2017
Зал «Мумбай», 8 ноября, 16:00
Тезисы:
http://www.highload.ru/2017/abstracts/2858.html
Аудитория Одноклассников превышает 73 миллиона человек в России, СНГ и странах дальнего зарубежья. При этом ОК.ru - первая социальная сеть по просмотрам видео в рунете и крупнейшая сервисная платформа.
Качественный и количественный рост DDoS-атак за последние годы превращает их в одну из первоочередных проблем для крупнейших интернет-ресурсов. В зависимости от вектора атаки “узким” местом становится та или иная часть инфраструктуры. В частности, при SYN-flood первый удар приходится на систему балансировки трафика. От ее производительности зависит успех в противостоянии атаке.
...
Перехват трафика — мифы и реальность / Евгений Усков (Qrator Labs)Ontico
HighLoad++ 2017
Зал «Мумбай», 8 ноября, 15:00
Тезисы:
http://www.highload.ru/2017/abstracts/3008.html
Никогда не было и вот снова случилось! Компания Google в результате перенаправления трафика сделала недостпуными в Японии несколько тысяч различных сервисов, большинство из которых никак не связано с самой компанией Google. Однако, подобные инциденты происходят с завидной регулярностью, вот только не всегда попадают в большие СМИ. У таких инцидентов могут быть разные причины, начиная от ошибок сетевых инженеров и заканчивая государственным регулированием.
...
И тогда наверняка вдруг запляшут облака! / Алексей Сушков (ПЕТЕР-СЕРВИС)Ontico
HighLoad++ 2017
Зал «Мумбай», 8 ноября, 14:00
Тезисы:
http://www.highload.ru/2017/abstracts/2925.html
Облака и виртуализация – современные тренды развития IT-технологий. Операторы связи строят свои TelcoClouds на стандартах NFV (Network Functions Virtualization) и SDN (Software-Defined Networking). В докладе начнем с основ виртуализации, далее разберемся, для чего используются NFV и SDN, потом полетим к облакам и вернемся на землю для решения практических задач!
...
Как мы заставили Druid работать в Одноклассниках / Юрий Невиницин (OK.RU)Ontico
HighLoad++ 2017
Зал «Мумбай», 8 ноября, 10:00
Тезисы:
http://www.highload.ru/2017/abstracts/3045.html
Как мы заставили Druid работать в Одноклассниках.
«Druid is a high-performance, column-oriented, distributed data store» http://druid.io.
Мы расскажем о том, как, внедрив Druid, мы справились с ситуацией, когда MSSQL-based система статистики на 50 терабайт стала:
- медленной: средняя скорость ответа была в разы меньше требуемой (и увеличилась в 20 раз);
- нестабильной: в час пик статистика отставала до получаса (теперь ничего не отстает);
- дорогой: изменилась политика лицензирования Microsoft, расходы на лицензии могли составить миллионы долларов.
...
Разгоняем ASP.NET Core / Илья Вербицкий (WebStoating s.r.o.)Ontico
HighLoad++ 2017
Зал «Рио-де-Жанейро», 8 ноября, 18:00
Тезисы:
http://www.highload.ru/2017/abstracts/2905.html
Прошло более года с того момента, как Microsoft выпустила первую версию своего нового фреймворка для разработки web-приложений ASP.NET Core, и с каждым днем он находит все больше поклонников. ASP.NET Core базируется на платформе .NET Core, кроссплатформенной версии платформы .NET c открытым исходным кодом. Теперь у С#-разработчиков появилась возможность использовать Mac в качестве среды разработки, и запускать приложения на Linux или внутри Docker-контейнеров.
...
100500 способов кэширования в Oracle Database или как достичь максимальной ск...Ontico
HighLoad++ 2017
Зал «Рио-де-Жанейро», 8 ноября, 14:00
Тезисы:
http://www.highload.ru/2017/abstracts/2913.html
Изначально будут раскрыты базовые причины, которые заставили появиться такой части механизма СУБД, как кэш результатов, и почему в ряде СУБД он есть или отсутствует.
Будут рассмотрены различные варианты кэширования результатов как sql-запросов, так и результатов хранимой в БД бизнес-логики. Произведено сравнение способов кэширования (программируемые вручную кэши, стандартный функционал) и даны рекомендации, когда и в каких случаях данные способы оптимальны, а порой опасны.
...
Apache Ignite Persistence: зачем Persistence для In-Memory, и как он работает...Ontico
HighLoad++ 2017
Зал «Рио-де-Жанейро», 8 ноября, 13:00
Тезисы:
http://www.highload.ru/2017/abstracts/2947.html
Apache Ignite — Open Source платформа для высокопроизводительной распределенной работы с большими данными с применением SQL или Java/.NET/C++ API. Ignite используют в самых разных отраслях. Сбербанк, ING, RingCentral, Microsoft, e-Therapeutics — все эти компании применяют решения на основе Ignite. Размеры кластеров разнятся от всего одного узла до нескольких сотен, узлы могут быть расположены в одном ЦОД-е или в нескольких геораспределенных.
...
HighLoad++ 2017
Зал «Рио-де-Жанейро», 8 ноября, 12:00
Тезисы:
http://www.highload.ru/2017/abstracts/3005.html
Когда мы говорим о нагруженных системах и базах данных с большим числом параллельных коннектов, особый интерес представляет практика эксплуатации и сопровождения таких проектов. В том числе инструменты и механизмы СУБД, которые могут быть использованы DBA и DevOps-инженерами для решения задач мониторинга жизнедеятельности базы данных и ранней диагностики возможных проблем.
...
Dive into the realm of operating systems (OS) with Pravash Chandra Das, a seasoned Digital Forensic Analyst, as your guide. 🚀 This comprehensive presentation illuminates the core concepts, types, and evolution of OS, essential for understanding modern computing landscapes.
Beginning with the foundational definition, Das clarifies the pivotal role of OS as system software orchestrating hardware resources, software applications, and user interactions. Through succinct descriptions, he delineates the diverse types of OS, from single-user, single-task environments like early MS-DOS iterations, to multi-user, multi-tasking systems exemplified by modern Linux distributions.
Crucial components like the kernel and shell are dissected, highlighting their indispensable functions in resource management and user interface interaction. Das elucidates how the kernel acts as the central nervous system, orchestrating process scheduling, memory allocation, and device management. Meanwhile, the shell serves as the gateway for user commands, bridging the gap between human input and machine execution. 💻
The narrative then shifts to a captivating exploration of prominent desktop OSs, Windows, macOS, and Linux. Windows, with its globally ubiquitous presence and user-friendly interface, emerges as a cornerstone in personal computing history. macOS, lauded for its sleek design and seamless integration with Apple's ecosystem, stands as a beacon of stability and creativity. Linux, an open-source marvel, offers unparalleled flexibility and security, revolutionizing the computing landscape. 🖥️
Moving to the realm of mobile devices, Das unravels the dominance of Android and iOS. Android's open-source ethos fosters a vibrant ecosystem of customization and innovation, while iOS boasts a seamless user experience and robust security infrastructure. Meanwhile, discontinued platforms like Symbian and Palm OS evoke nostalgia for their pioneering roles in the smartphone revolution.
The journey concludes with a reflection on the ever-evolving landscape of OS, underscored by the emergence of real-time operating systems (RTOS) and the persistent quest for innovation and efficiency. As technology continues to shape our world, understanding the foundations and evolution of operating systems remains paramount. Join Pravash Chandra Das on this illuminating journey through the heart of computing. 🌟
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on automated letter generation for Bonterra Impact Management using Google Workspace or Microsoft 365.
Interested in deploying letter generation automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfMalak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxSitimaJohn
Ocean Lotus cyber threat actors represent a sophisticated, persistent, and politically motivated group that poses a significant risk to organizations and individuals in the Southeast Asian region. Their continuous evolution and adaptability underscore the need for robust cybersecurity measures and international cooperation to identify and mitigate the threats posed by such advanced persistent threat groups.
Programming Foundation Models with DSPy - Meetup SlidesZilliz
Prompting language models is hard, while programming language models is easy. In this talk, I will discuss the state-of-the-art framework DSPy for programming foundation models with its powerful optimizers and runtime constraint system.
Taking AI to the Next Level in Manufacturing.pdfssuserfac0301
Read Taking AI to the Next Level in Manufacturing to gain insights on AI adoption in the manufacturing industry, such as:
1. How quickly AI is being implemented in manufacturing.
2. Which barriers stand in the way of AI adoption.
3. How data quality and governance form the backbone of AI.
4. Organizational processes and structures that may inhibit effective AI adoption.
6. Ideas and approaches to help build your organization's AI strategy.
Driving Business Innovation: Latest Generative AI Advancements & Success StorySafe Software
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
Building Production Ready Search Pipelines with Spark and MilvusZilliz
Spark is the widely used ETL tool for processing, indexing and ingesting data to serving stack for search. Milvus is the production-ready open-source vector database. In this talk we will show how to use Spark to process unstructured data to extract vector representations, and push the vectors to Milvus vector database for search serving.
Digital Marketing Trends in 2024 | Guide for Staying AheadWask
https://www.wask.co/ebooks/digital-marketing-trends-in-2024
Feeling lost in the digital marketing whirlwind of 2024? Technology is changing, consumer habits are evolving, and staying ahead of the curve feels like a never-ending pursuit. This e-book is your compass. Dive into actionable insights to handle the complexities of modern marketing. From hyper-personalization to the power of user-generated content, learn how to build long-term relationships with your audience and unlock the secrets to success in the ever-shifting digital landscape.
Introduction of Cybersecurity with OSS at Code Europe 2024Hiroshi SHIBATA
I develop the Ruby programming language, RubyGems, and Bundler, which are package managers for Ruby. Today, I will introduce how to enhance the security of your application using open-source software (OSS) examples from Ruby and RubyGems.
The first topic is CVE (Common Vulnerabilities and Exposures). I have published CVEs many times. But what exactly is a CVE? I'll provide a basic understanding of CVEs and explain how to detect and handle vulnerabilities in OSS.
Next, let's discuss package managers. Package managers play a critical role in the OSS ecosystem. I'll explain how to manage library dependencies in your application.
I'll share insights into how the Ruby and RubyGems core team works to keep our ecosystem safe. By the end of this talk, you'll have a better understanding of how to safeguard your code.
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfChart Kalyan
A Mix Chart displays historical data of numbers in a graphical or tabular form. The Kalyan Rajdhani Mix Chart specifically shows the results of a sequence of numbers over different periods.
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...alexjohnson7307
Predictive maintenance is a proactive approach that anticipates equipment failures before they happen. At the forefront of this innovative strategy is Artificial Intelligence (AI), which brings unprecedented precision and efficiency. AI in predictive maintenance is transforming industries by reducing downtime, minimizing costs, and enhancing productivity.
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
Mark Callaghan, Facebook
1. MySQL versus something else
Evaluating alternative databases
Mark Callaghan
Small Data Engineer
October, 2013
Friday, October 25, 13
2. What metric is important?
▪
Throughput
▪
Throughput while minimizing response time variance
▪
Efficiency - reduce cost while meeting response time goals
Friday, October 25, 13
3. My focus is storage efficiency
▪
Use flash to get IOPs
▪
Use spinning disks to get capacity
▪
Use both to reduce cost while improving quality of service
frequent
reads
frequent
writes
read IOPs
write IOPs
flash
yes
yes
yes
maybe
flash
yes
no
yes
no
SATA, /dev/null
no
yes
no
maybe
SATA, /dev/null
no
no
no
no
device
Friday, October 25, 13
4. What technology would you choose today?
▪
How do you value flexibility?
▪
▪
Servers you buy today will be in production for a few years
▪
▪
Newer & faster hardware arrives each year
Software can last even longer in production
We have several generations of HW on the small data tiers
▪
Pure-disk (SAS array + HW RAID)
▪
Flashcache (SATA array + HW RAID, flash)
▪
Pure-flash
Friday, October 25, 13
5. Common definitions
▪
Sorted run - rows stored in key order
▪
may be stored using many range-partitioned files
▪
Memtable - sorted run in memory
▪
L0 - 1 or more sorted runs on disk
▪
L1, L2, ... Lmax - each is 1 sorted run on disk
▪
▪
▪
Lmax is the largest level
by size L1 < L2 ... < Lmax
live% - percentage of live data in the database
Friday, October 25, 13
6. Amplification factors
▪
Framework for describing efficiency of database algorithms
▪
How much is done physically in response to a logical change?
▪
▪
Write amplification
▪
▪
Read amplification
Space amplification
Can determine
▪
How many disks or flash you must buy
▪
How long your flash might last
▪
Whether you can buy lower endurance flash
Friday, October 25, 13
7. Read amplification
▪
Read-amp == disk reads per query
▪
▪
Assume some data is in cache
▪
▪
Separate results for point query versus short range scan
Assume the index is covering for the query
Example: b-tree with all non-leaf levels in cache
▪
Point read-amp - 1 disk read to get the leaf block
▪
Short range read-amp - 1 or 2 disk reads to get the leaf blocks
Friday, October 25, 13
8. Read amplification and bloom filters
▪
Bloom filter summary
▪
f(key) -> { no, maybe }
▪
Use ~10 bits/row to get reasonable false positive rate
▪
Great for avoiding disk reads on point queries
▪
Bonus - prevent disk reads for keys that don’t exist
▪
Useless for general range scans like “select x where y < 100”
▪
Can be useful for equality prefix like “select x where q = 10 and y < 100”
▪
▪
use bloom filter on q
Too many bloom filter checks can hurt response time
▪
Friday, October 25, 13
each sorted run on disk needs a bloom filter check
9. Write amplification
▪
Write-amp == bytes written per byte changed
▪
▪
▪
Insert 100 bytes with write-amp=5 and 500 bytes will be written
For now ignore penalty from small random writes
Some writes done immediately, others are deferred
▪
Immediate -> redo log
▪
Deferred -> b-tree dirty pages not forced on commit, LSM compaction
Friday, October 25, 13
10. Write amplification, part 2
▪
HW can increase write-amp
▪
▪
▪
Read live pages and write them elsewhere when cleaning flash blocks
Only a cost for algorithms that do small random writes
Redo log writes can increase write-amp
▪
Writes must be done to a multiple of 512 or larger
▪
Insert 100 byte row, force 512 byte sector for redo has write-amp=5
Friday, October 25, 13
11. Why write amplification matters
▪
Write endurance for flash device
▪
▪
▪
The wrong algorithm can wear out the device too soon
The right algorithm might let you buy lower cost/endurance device
Write-amp can predict peak performance
▪
If storage can sustain 400 MB/second of writes
▪
And write-amp is 10
▪
Then database can sustain 40 MB/second of changes
Friday, October 25, 13
12. Simple request - make counting faster
▪
Some web-scale workloads need to maintain counts
▪
▪
▪
Database is IO-bound
Workload should be write-heavy, counters might not be read
update foo set count = count + 1 where key = ‘bar’
▪
Read-modify-write
▪
Write-only: write delta, merge deltas later when queried/compacted
Friday, October 25, 13
13. Space amplification
▪
Space-amp == sizeof(database files) / sizeof(data)
▪
▪
Assume database files are in steady state (fragmented & compacted)
▪
▪
Ignore secondary indexes
Space-amp == 100 / %live
Things that change space amplification
▪
B-tree fragmentation
▪
Old versions of rows that are yet to be collected
▪
Compression
▪
Per row/page metadata (rollback pointer, transaction ID, ...)
Friday, October 25, 13
14. Space versus write amplification
▪
Sorry for the confusion
▪
▪
▪
Databases store N blocks in 1 extent
Flash devices store N pages in 1 block
Copy out
▪
Read live data from the cleaned extent, write it elsewhere
▪
Cost is a function of the percentage of live data
▪
Larger live% means less space and more write amplification
▪
Smaller live% means more space and less write amplification
Friday, October 25, 13
15. Space versus write amplification
Old flash block assuming all blocks have 25% live pages
75 dead pages
25 live pages
Block cleaning copies 25 pages
New flash block
75 pages ready for new writes
25 copied
pages
Write 100 pages total per 75 new page writes:
* %live is 25%
* write-amp is 100 / (100 - %live) == 100 / 75
* space-amp is 100 / %live == 4
Friday, October 25, 13
16. Disclaimer
▪
There are many assumptions in the rest of the slides.
▪
Assumption #1: workloads have no skew.
▪
▪
▪
▪
Most real workloads have skew.
Lets save skew for a much longer discussion
Assumption #2: workload is update-only
I am trying to start a discussion rather than solve everything.
▪
This won’t be confused as a lecture on algorithm analysis.
▪
We might disagree on technology, but we can agree on terminology
Friday, October 25, 13
17. Database algorithms
▪
B-tree
▪
▪
▪
Update-in-place (UIP)
Copy-on-write using sequential (COW-S) and random (COW-R) writes
Log structured merge tree (LSM)
▪
▪
▪
LevelDB-style compaction (leveled)
HBase-style compaction (n-files, size-tiered)
Other
▪
Log-only - Bitcask
▪
Memtable + L1 - Sophia via Sphia.org
▪
Memtable, L0, L1 - MaSM
▪
TokuDB/TokuMX - fractional cascading
Friday, October 25, 13
19. B-tree: UIP and COW-R
▪
When non-leaf levels are in cache
▪
▪
Point read-amp is 1, range read-amp is 1 or 2
When dirty pages are forced after each row change
▪
Write-amp is sizeof(page) / sizeof(row)
▪
More write-amp from torn-page protection
▪
Add +1 for redo log
▪
Include HW write-amp when using flash
▪
Forcing data pages too soon increases write-amp
Friday, October 25, 13
20. B-tree: UIP and COW-R, space amplification
▪
Fragmentation because b-tree pages are not full on average
▪
▪
▪
After a page split 1 full page becomes 2 half-full pages
With InnoDB we have many indexes with pages that are ~60% full
Fixed page size reduces compression, with InnoDB 2X compression
▪
Default fixed page size is 8kb
▪
Compress 16kb to 6kb, still write out 8kb
▪
It is hard to use a compression window larger than one page
▪
Per-row metadata uses 13+ bytes on InnoDB
Friday, October 25, 13
21. B-tree: COW-S
▪
Read amplification is the same as for UIP and COW-R
▪
Write amplification
▪
▪
Has SW write-amp, cost of cleaning previously written extents
▪
▪
Smaller page size from better compression and no fragmentation
No HW write-amp on flash
Space amplification
▪
Compresses better than UIP/COW-R because page size not fixed
▪
Almost no fragmentation
▪
Space-amp from old versions of pages that have yet to be cleaned
▪
More (less) space-amp means less (more) write-amp
Friday, October 25, 13
22. LSM with leveled compaction
▪
Implemented by LevelDB and Cassandra
▪
Database is memtable, L0, L1, ..., Lmax
▪
Less read-amp and space-amp, more write-amp
▪
Similar to original LSM design from paper by O’Neil
▪
Difference is the use of many range-partitioned files per level
▪
▪
▪
Increases write-amp by a small amount
Prevents temporary doubling of Lmax during compaction
Compaction from L1 to L2
▪
reads N bytes from L1
▪
reads 10*N bytes from L2
▪
writes 10*N + N bytes back to L2
Friday, October 25, 13
23. LSM with leveled compaction
memtable
keys: 00..01
keys: 0..99
keys: 11..19
keys: 0..99
keys: 0..99
keys: 90..99
Level 0 (1GB)
Level 1 (1GB)
10X more data
keys:
000..001
Friday, October 25, 13
keys:
002...003
keys: 90..99
Level 2 (10GB)
24. LSM with leveled compaction
▪
Point read amplification
▪
▪
Range read amplification
▪
▪
1 disk read per level and per L0 file, bloom filters don’t help
Write amplification
▪
▪
1 bloom filter check per L0 file and per level for L1->Lmax + 1 disk read
10 per level starting with L2 + 1 for redo + 1 for L0 + ~1 for L1
Space amplification
▪
Friday, October 25, 13
1.1 assuming 90% of data is on the maximum level
25. LSM with n-files compaction
▪
Implemented by Hbase, WiredTiger and Cassandra
▪
Database is memtable, L0, L1
▪
Files in L0 have varying sizes
▪
Less write-amp, more read-amp and space-amp
▪
Compaction cost determined by:
▪
▪
▪
#files merged at a time
sizeof(L1) / sizeof(file created by memtable flush)
If memtable is 1 GB, L1 is 64 GB, 2 files are merged at a time
▪
then a row is written to files of size 1, 2, 4, 8, 16, 32 and 64 GB
▪
write-amp is 7
Friday, October 25, 13
27. LSM with n-files compaction
▪
Point read amplification
▪
▪
Range read amplification
▪
▪
1 bloom filter check per file + 1 disk read
1 disk read per file, bloom filters don’t help with range scans
Write amplification
▪
▪
Trade write for space amplification
▪
▪
Usually much less than leveled compaction
Add 1 for redo
Space amplification
▪
Friday, October 25, 13
Usually greater than 2
28. Log-only
▪
Bitcask (part of Riak/Basho) is an example of this
▪
Data is written 1+ times
▪
▪
▪
Write data once to a log
Write again when row is live during log cleaning
Copy data from tail to head of log when out of disk space
Friday, October 25, 13
29. Log-only
new data
newest log file
Log 4
Log 3
live data
Log 2
oldest log file
Friday, October 25, 13
Log 1
cleaner
dead data
/dev/null
30. Log-only
▪
Point read amplification is 1
▪
Range read amplification is one per value in the range
▪
Write and space amplification are related
▪
▪
▪
Write amplification is 100 / (100 - %live)
Space amplification is 100 / %live
When 66% of the data in the logs is live
▪
Space-amp is 1.5
▪
Write-amp is 3
Friday, October 25, 13
31. Memtable + L1
▪
I think Sophia (sphia.org) is an example of this
▪
Database is memtable, L1
▪
Do compaction between memtable & L1 when memtable is full
▪
Great when database on disk not too much bigger than RAM
Friday, October 25, 13
33. Memtable + L1
▪
Point read amplification is 1
▪
Range read amplification is 1
▪
Write amplification
▪
▪
▪
The ratio sizeof(database) / sizeof(memtable)
+1 for redo log
Space amplification is 1
Friday, October 25, 13
34. Memtable + L0 + L1
▪
MaSM is an example of this
▪
Database is memtable, L0, L1
▪
▪
▪
sizeof(L0) == sizeof(L1)
Looks like file structures from 2-pass external sort
Tradeoffs
▪
Minimize write-amp
▪
Maximize read-amp
Friday, October 25, 13
35. Memtable + L0 + L1
memtable
L0
L0
L0
L0
L0
Merge all on compaction
L1
Friday, October 25, 13
36. Memtable + L0 + L1
▪
Point read amplification is 1 disk read + many bloom filter checks
▪
Range read amplification 1 disk read per L0 file + 1
▪
Write amplification is 3
▪
▪
Write to redo log, L0 and L1
Space amplification is 2
Friday, October 25, 13
37. TokuDB, TokuMX
▪
Read amplification
▪
▪
▪
1 disk read for point queries
1 or 2 disk reads for range read queries
Write amplification
▪
▪
▪
10 per level + 1 for redo
Won’t use as many levels as LevelDB
Space amplification
▪
No internal fragmentation, variable sizes pages written
▪
Similar to LevelDB
Friday, October 25, 13
38. Database algorithms
point
read-amp
range
read-amp
write-amp
space-amp
UIP b-tree
1
1 or 2
page/row * HW GC
1.5 to 2
COW-R b-tree
1
1 or 2
page/row * HW GC
1.5 to 2
COW-S b-tree
1
1 or 2
page/row * SW GC
1
LSM leveled
1 + N*bloom
N
10 per level
1.1 X
LSM n-files
1 + N*bloom
N
can be < 10
can be > 2
log-only
1
N
1 / (1 - %live)
1 / %live
memtable+L1
1
1
database/mem
1
1 + N*bloom
N
3
2
1
2
10 per level
1.1 X
algorithm
memtable+L0+L1
tokudb
Friday, October 25, 13
39. Two things to remember
▪
You can trade space/read versus write amplification
▪
▪
▪
Switch database algorithms or tune existing algorithm
Hard to minimize read, write & space amplification
One size doesn’t fit all
▪
The workload I care about has different types of indexes
▪
▪
▪
Friday, October 25, 13
Some indexes should be optimized for short range scans
Other indexes can be optimized for write amplification
would be nice to support both in one database engine