The document discusses compaction in RocksDB, an embedded key-value storage engine. It describes the two compaction styles in RocksDB: level style compaction and universal style compaction. Level style compaction stores data in multiple levels and performs compactions by merging files from lower to higher levels. Universal style compaction keeps all files in level 0 and performs compactions by merging adjacent files in time order. The document provides details on the compaction process and configuration options for both styles.
Anoop Sam John and Ramkrishna Vasudevan (Intel)
HBase provides an LRU-based on-heap cache, but its size (and so the total data size that can be cached) is limited by Java’s max heap space. This talk highlights our work under HBASE-11425 to allow the HBase read path to work directly from the off-heap area.
Some key value stores using log-structure - Zhichao Liang
These slides present three key-value stores that use a log structure: Riak, RethinkDB, and LevelDB. Note that the claim that RethinkDB employs an append-only B-tree is an estimate, made by combining guesswork with reasoning.
RocksDB is an embedded key-value store written in C++ and optimized for fast storage environments like flash or RAM. It uses a log-structured merge tree to store data by writing new data sequentially to a write-ahead log and an in-memory memtable, periodically flushing the memtable to disk as sorted SSTables. It reads from the memtable and SSTables, and performs background compaction to merge SSTables and remove overwritten data. RocksDB supports two compaction styles: level style, which stores SSTables in multiple levels sorted by age, and universal style, which stores all SSTables in level 0 sorted by time.
1. Log structured merge trees store data in multiple levels with different storage speeds and costs, requiring data to periodically merge across levels.
2. This structure allows fast writes by storing new data in faster levels before merging to slower levels, and efficient reads by querying multiple levels and merging results.
3. The merging process involves loading, sorting, and rewriting levels to consolidate and propagate deletions and updates between levels.
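The merge step described in item 3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of any system listed here: each level is assumed to be a sorted list of `(key, value)` pairs with unique keys, the newer level wins when both contain the same key, and deletions are represented by a hypothetical `TOMBSTONE` marker. The names `merge_levels` and `TOMBSTONE` are invented for this example.

```python
TOMBSTONE = object()  # hypothetical marker standing in for a deleted key


def merge_levels(newer, older, drop_tombstones=False):
    """Merge two sorted runs of (key, value) pairs; the newer run wins on ties.

    Assumes keys are unique within each run and both runs are sorted by key.
    """
    merged = []
    i = j = 0
    while i < len(newer) or j < len(older):
        take_newer = j >= len(older) or (
            i < len(newer) and newer[i][0] <= older[j][0]
        )
        if take_newer:
            key, value = newer[i]
            i += 1
            # Skip the stale version of the same key in the older run.
            if j < len(older) and older[j][0] == key:
                j += 1
        else:
            key, value = older[j]
            j += 1
        # A tombstone may only be discarded when no older level could
        # still hold the key, e.g. when merging into the last level.
        if value is TOMBSTONE and drop_tombstones:
            continue
        merged.append((key, value))
    return merged
```

Keeping tombstones by default reflects the "propagate deletions" part of the description: a delete marker has to travel down with the data until it reaches the level that holds the value it hides.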
HBase Accelerated introduces an in-memory flush and compaction pipeline for HBase to improve performance of real-time workloads. By keeping data in memory longer and avoiding frequent disk flushes and compactions, it reduces I/O and improves read and scan latencies. Evaluation on workloads with high update rates and small working sets showed the new approach significantly outperformed the default HBase implementation by serving most data from memory. Work is ongoing to further optimize the in-memory representation and memory usage.
HBase 2.0 is the next stable major release for Apache HBase, scheduled for early 2017. It is the biggest and most exciting milestone release from the Apache community after 1.0. HBase 2.0 contains a large number of features that have been long in development, including rewritten region assignment, perf improvements (RPC, rewritten write pipeline, etc.), async clients, a C++ client, offheaping of the memstore and other buffers, Spark integration, and shading of dependencies, as well as many other fixes and stability improvements. We will go into technical details on some of the most important improvements in the release, as well as the implications for users in terms of API and upgrade paths. Existing users of HBase/Phoenix and operators managing HBase clusters will benefit the most, as they can learn about the new release and its long list of features. We will also briefly cover earlier 1.x release lines, compatibility, and upgrade paths for existing users, and conclude with an outlook on the next level of initiatives for the project.
HBaseCon 2013: Apache HBase at Pinterest - Scaling Our Feed Storage - Cloudera, Inc.
Pinterest uses Apache HBase to store data for users' personalized "following feeds" at scale. This involves storing billions of pins and updates per day. Some key challenges addressed are handling high throughput writes from fanouts, providing low latency reads, and resolving potential data inconsistencies from race conditions. Optimizations to HBase include increased memstore size, block cache tuning, and prefix compression. Maintaining high availability involves writing to dual clusters, tight Zookeeper timeouts, and automated repairs.
HBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, Salesforce - Cloudera, Inc.
The strength of an open source project resides entirely in its developer community; a strong democratic culture of participation and hacking makes for a better piece of software. The key requirement is having developers who are not only willing to contribute, but also knowledgeable about the project’s internal structure and architecture. This session will introduce developers to the core internal architectural concepts of HBase, not just “what” it does from the outside, but “how” it works internally, and “why” it does things a certain way. We’ll walk through key sections of code and discuss key concepts like the MVCC implementation and memstore organization. The goal is to convert serious “HBase Users” into “HBase Developer Users”, and give voice to some of the deep knowledge locked in the committers’ heads.
In this session, you will learn about the work Xiaomi has done to improve the availability and stability of our HBase clusters, including cross-site data and service backup and a coordinated compaction framework. You'll also learn about the Themis framework, which supports cross-row transactions on HBase based on Google's Percolator algorithm, and its usage in Xiaomi's applications.
RocksDB is an embedded key-value store that is optimized for fast storage. It uses a log-structured merge-tree to organize data on storage. Optimizing RocksDB for open-channel SSDs would allow controlling data placement to exploit flash parallelism and minimize overhead. This could be done by mapping RocksDB files like SSTables and logs to virtual blocks that map to physical flash blocks in a way that considers data access patterns and flash characteristics. This would improve performance by reducing writes and garbage collection.
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook - The Hive
This presentation describes the reasons why Facebook decided to build yet another key-value store, the vision and architecture of RocksDB and how it differs from other open source key-value stores. Dhruba describes some of the salient features in RocksDB that are needed for supporting embedded-storage deployments. He explains typical workloads that could be the primary use-cases for RocksDB. He also lays out the roadmap to make RocksDB the key-value store of choice for highly-multi-core processors and RAM-speed storage devices.
HBase is an open source, distributed, column-oriented database modeled after Google's Bigtable that runs on top of Hadoop. The presenter discusses HBase's architecture, performance improvements in version 0.20 including major gains from new file formats and compression, and StumbleUpon's extensive use of HBase, including supporting over 9 billion rows in a single table with high import and read speeds.
This document discusses how to set up HBase with Docker in three configurations: single-node standalone, pseudo-distributed single-machine, and fully-distributed cluster. It describes features of HBase like consistent reads/writes, automatic sharding, and failover. It provides instructions for installing HBase on a single node using Docker, including building an image and running it with ports exposed. It also covers running HBase in pseudo-distributed mode, with the processes running as separate containers, and interacting with the HBase shell.
Postgres & Redis Sitting in a Tree - Rimas Silkaitis, Heroku - Redis Labs
Postgres and Redis Sitting in a Tree | In today’s world of polyglot persistence, it’s likely that companies will be using multiple data stores for storing and working with data based on the use case. Typically a company will start with a relational database like Postgres and then add Redis for more high velocity use-cases. What if you could tie the two systems together to enable so much more?
MySQL shell and its utilities - Praveen GR (Mydbops Team) - Mydbops
This document provides an overview of MySQL shell utilities including the Upgrade Checker Utility, Table Export Utility, Parallel Table Import Utility, Dump Utility, Dump Loading Utility, and JSON Import Utility. It describes the purpose and notable options of each utility for upgrading databases, exporting and importing tables in parallel, taking backups, and importing JSON data.
This document summarizes a presentation about optimizing for low latency in HBase. It discusses how to measure latency, the write and read paths in HBase, sources of latency like garbage collection and compactions, and techniques for reducing latency like streaming puts, block caching, and timeline consistency. The key points are that single puts can achieve millisecond latency, while garbage collection and machine failures can cause pauses ranging from tens of milliseconds to seconds, and that optimizing for the "magical 1%" of requests after the 99th percentile is important to improve average latency.
This document demonstrates how to use Scala and Spark to analyze text data from the Bible. It shows how to install Scala and Spark, load a text file of the Bible into a Spark RDD, perform searches to count verses containing words like "God" and "Love", and calculate statistics on the data like the total number of words and unique words used in the Bible. Example commands and outputs are provided.
This document discusses techniques for optimizing data loading and unloading performance in Oracle databases. It describes how to use external tables with compression and parallelism to load data very quickly at over 300 MB/s. For unloading, it recommends using parallel table functions and compression to achieve similar high speeds. Various monitoring tools are also covered, including OS-level tools like vmstat and iostat as well as database-level tools like AWR and SQL Monitor reports. Live demonstrations are provided of loading and unloading 1 TB of data and using monitoring reports to analyze query performance.
This document summarizes the challenges and solutions for maintaining large PostgreSQL databases at Emma, including:
- Maintaining terabytes of data across multiple clusters up to version 9.0
- Facing performance issues when the hardware load was pushed to its limits
- Dealing with huge catalogs containing millions of data points that caused slow performance
- Addressing problems like bloat, backups that took hours, system resource exhaustion, and transaction wraparound issues
- Implementing solutions such as scripts to clean up bloat, sharding to a Linux filesystem, and increasing autovacuum thresholds
The document discusses different types of block caches in HBase including LruBlockCache, SlabCache, and BucketCache. It explains that block caching improves performance by storing frequently accessed blocks in faster memory rather than slower disk storage. Each block cache has its own configuration options and memory usage characteristics. Benchmark results show that the off-heap BucketCache provides strong performance due to its use of off-heap memory for the L2 cache.
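To make the block-caching idea concrete, here is a minimal Python sketch of a least-recently-used block cache. It is an illustration only and does not reproduce HBase's LruBlockCache, SlabCache, or BucketCache: the class name `LruBlockCacheSketch`, the `load_block` callback (standing in for a slow disk read), and the capacity measured in number of blocks are all assumptions made for this example.

```python
from collections import OrderedDict


class LruBlockCacheSketch:
    """Toy LRU cache keyed by block id; capacity is a block count (assumption)."""

    def __init__(self, capacity, load_block):
        self.capacity = capacity
        self.load_block = load_block  # hypothetical slow path, e.g. a disk read
        self.blocks = OrderedDict()   # oldest entry first, newest entry last
        self.hits = 0
        self.misses = 0

    def get(self, block_id):
        if block_id in self.blocks:
            # Cache hit: mark the block as most recently used.
            self.blocks.move_to_end(block_id)
            self.hits += 1
            return self.blocks[block_id]
        # Cache miss: fall back to the slow path and remember the result.
        self.misses += 1
        data = self.load_block(block_id)
        self.blocks[block_id] = data
        if len(self.blocks) > self.capacity:
            # Evict the least recently used block.
            self.blocks.popitem(last=False)
        return data
```

A real block cache sizes itself in bytes rather than block counts, which is why the configuration options and memory usage characteristics mentioned above matter in practice.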
The Hive Think Tank: Rocking the Database World with RocksDB - The Hive
RocksDB is a new storage engine for MySQL that provides better storage efficiency than InnoDB. It achieves lower space amplification and write amplification than InnoDB through its use of compression and log-structured merge trees. While MyRocks (RocksDB integrated with MySQL) currently has some limitations like a lack of support for online DDL and spatial indexes, work is ongoing to address these limitations and integrate additional RocksDB features to fully support MySQL workloads. Testing at Facebook showed MyRocks uses less disk space and performs comparably to InnoDB for their queries.
HBaseCon 2012 | Content Addressable Storages for Fun and Profit - Berk Demir,... - Cloudera, Inc.
This document summarizes Berk D. Demir's design for a content addressable storage system to store and serve large amounts of static assets with low latency, high availability, and without data duplication. The key aspects of the design are:
1) Using HBase as the underlying distributed database to store immutable rows of metadata and blob content in a single table with different column families based on access patterns.
2) Addressing content via a cryptographic hash of the content rather than a database key to allow immutable and deduplicated storage.
3) Serving the stored content via HTTP using common verbs and headers to provide a simple interface for clients.
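The content-addressing idea in point 2 can be sketched as follows. This is a hedged illustration, not the design from the talk: an in-memory dictionary stands in for the HBase table, SHA-256 is assumed as the cryptographic hash (the summary does not name one), and the `ContentStore` class with its `put` and `get` methods is hypothetical.

```python
import hashlib


class ContentStore:
    """Toy content-addressable store: the address is a hash of the content."""

    def __init__(self):
        self.blobs = {}  # stand-in for the backing table (assumption)

    def put(self, content: bytes) -> str:
        # SHA-256 is an assumed choice; any cryptographic hash would do here.
        address = hashlib.sha256(content).hexdigest()
        # Identical content maps to the same address, so it is stored once
        # and never overwritten with different bytes.
        self.blobs.setdefault(address, content)
        return address

    def get(self, address: str) -> bytes:
        return self.blobs[address]
```

Because the address is derived from the bytes themselves, storing the same asset twice yields the same address and a single stored copy, which is where the immutability and deduplication properties come from.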
The document summarizes the HBase 1.0 release which introduces major new features and interfaces including a new client API, region replicas for high availability, online configuration changes, and semantic versioning. It describes goals of laying a stable foundation, stabilizing clusters and clients, and making versioning explicit. Compatibility with earlier versions is discussed and the new interfaces like ConnectionFactory, Connection, Table and BufferedMutator are introduced along with examples of using them.
Digital Library Collection Management using HBase - HBaseCon
Speaker: Ron Buckley (OCLC)
OCLC has been working over the last year to move its massive repository to HBase. This talk will focus on the impetus behind the move, implementation details and technology choices we've made (key design, shredding PDFs and other digital objects into HBase, scaling), and the value-add that HBase brings to digital collection management.
Evolution of MongoDB Replicaset and Its Best Practices - Mydbops
MongoDB 4.0 introduced several exciting and long-awaited features. The speaker will focus on the prime features, the kinds of problems they solve, and the best practices for deploying replica sets.
This document discusses database performance factors for developers. It covers topics like query execution plans, table indexes, table partitioning, and performance troubleshooting. The goal is to help developers understand how to optimize database performance. It provides examples and recommends analyzing execution plans, properly indexing tables, partitioning large tables, and using a structured approach to troubleshooting performance issues.
This document discusses various SQL Server disaster recovery strategies including log shipping, database mirroring, replication, and maintaining disaster recovery plans and documentation. Log shipping uses transaction logs to copy changes from a primary to standby server. Database mirroring maintains an up-to-date copy of a database on a mirror server. Replication can be used to distribute data changes in near real-time. The document emphasizes the importance of regularly testing disaster recovery plans and keeping recovery documentation up-to-date.
Scaling sql server 2014 parallel insert - Chris Adkin
A slide deck on how to get the best possible performance out of the parallel insert feature introduced in SQL Server 2014 as presented at SQL Bits XIV.
This document discusses SQL Server table partitioning and provides guidance on when it is helpful to use partitioning. It describes the key concepts of partitioning such as partition functions, ranges, schemes and switching partitions. It also outlines some of the fine print around limitations, parallelism, locking and maintenance. The document concludes that the client should use partitioning if their workload exhibits queries by region, they can optimize queries for it, have the disk and memory resources to support it and can test it adequately.
This document provides an overview of SQL Server performance tuning. It discusses monitoring tools and dynamic management views that can be used to identify performance issues. Several common performance problems are described such as those related to CPU, memory, I/O, and blocking. The document also covers query tuning, indexing, and optimizing joins. Overall it serves as a guide to optimizing SQL Server performance through monitoring, troubleshooting, and addressing issues at the server, database, and query levels.
This document discusses improving MySQL application performance with Sphinx. It provides an introduction to Sphinx, describing it as a standalone full-text search engine that can be scaled horizontally and has many features beyond full-text search. It explains that Sphinx indexes data separately from MySQL and must be queried separately, though it can return attribute values to MySQL. The document outlines important facts about MySQL's index usage and limitations, and Sphinx's grouping, attribute storage, and block-based data organization to optimize attribute filtering. It provides an example comparing full-text search performance between MySQL and Sphinx.
The document discusses SQL Server performance monitoring and tuning. It recommends taking a holistic view of the entire system landscape, including hardware, software, systems and networking components. It outlines various tools for performance monitoring, and provides guidance on identifying and addressing common performance issues like high CPU utilization, disk I/O issues and poorly performing queries.
Ten query tuning techniques every SQL Server programmer should know | Kevin Kline
From the noted database expert and author of 'SQL in a Nutshell' - SELECT statements have a reputation for being very easy to write, but hard to write very well. This session will take you through ten of the most problematic patterns and anti-patterns when writing queries and how to deal with them all. Loaded with live demonstrations and useful techniques, this session will teach you how to take your SQL Server queries from mundane to masterful.
Performance tuning and optimization (ppt) | Harish Chand
The document discusses various ways to improve client/server performance at both the client and server level. It addresses:
1) Client performance can be improved by optimizing hardware and software. Hardware optimizations include using the fastest available components, while software optimizations involve improving the operating system and applications.
2) Server performance can also be improved through hardware upgrades like adding network cards, as well as implementing high-performance file systems and offloading processing to servers.
3) Database performance optimizations involve efficient index design, query design, and database normalization to minimize network traffic and process data faster.
SQL Server 2014 Memory Optimised Tables - Advanced | Tony Rogerson
Hekaton is a large piece of kit; this session will focus on the internals of how in-memory tables and native stored procedures work and interact – Database structure: use of File Stream, backup/restore considerations in HA and DR as well as Database Durability, in-memory table make up: hash and range indexes, row chains, Multi-Version Concurrency Control (MVCC). Design considerations and gotchas to watch out for.
The session will be demo led.
Note: the session will assume the basics of Hekaton are known, so it is recommended you attend the Basics session.
SQL Server 2014 Extreme Transaction Processing (Hekaton) - Basics | Tony Rogerson
Far from Hekaton being an extension of DBCC PINTABLE, it’s a huge new piece of functionality that can significantly improve the scalability of various data based scenarios – not just OLTP but also ETL and real-time BI.
This session will introduce Hekaton features, how and when to use it; it will be demo led giving Hekaton end-to-end: enabling it, create tables, index design, query considerations, native stored procedures, durability [or not], introduce methods of identifying what to put in memory or not.
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516] | Malin Weiss
Microservices can provide terabytes of data in microseconds by mapping data from SQL databases into in-memory key-value stores and column key stores within JVMs. This is done through periodic synchronization of changed data from databases into memory and mapping the in-memory data into fast access structures. The in-memory data is then exposed through Java Stream and REST APIs to microservices for high performance querying and analysis of large datasets. This architecture allows microservices to quickly share access to large datasets and restart rapidly by reloading from the synchronized persistent stores.
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516] | Speedment, Inc.
By leveraging memory-mapped files, Speedment and the Chronicle Engine support large Java maps that can easily exceed the size of your server’s RAM. Because the Java maps are mapped onto files, these maps can be shared instantly between several microservice JVMs and new microservice instances can be added, removed, or restarted very quickly. Data can be retrieved with predictable ultralow latency for a wide range of operations. The solution can be synchronized with an underlying database so that your in-memory maps will be consistently “alive.” The mapped files can be tens of terabytes, which has been done in real-world deployment cases, and a large number of microservices can share these maps simultaneously. Learn more in this session.
AWS re:Invent 2016: Streaming ETL for RDS and DynamoDB (DAT315) | Amazon Web Services
During this session Greg Brandt and Liyin Tang, Data Infrastructure engineers from Airbnb, will discuss the design and architecture of Airbnb's streaming ETL infrastructure, which exports data from RDS for MySQL and DynamoDB into Airbnb's data warehouse, using a system called SpinalTap. We will also discuss how we leverage Spark Streaming to compute derived data from tracking topics and/or database tables, and HBase to provide immediate data access and generate cleanly time-partitioned Hive tables.
This document summarizes a lecture on key-value storage systems. It introduces the key-value data model and compares it to relational databases. It then describes Cassandra, a popular open-source key-value store, including how it maps keys to servers, replicates data across multiple servers, and performs reads and writes in a distributed manner while maintaining consistency. The document also discusses Cassandra's use of gossip protocols to manage cluster membership.
Inside sql server in memory oltp sql sat nyc 2017 | Bob Ward
This document provides a high-level summary of In-Memory OLTP in SQL Server:
- In-Memory OLTP stores and processes transactional data entirely in memory using natively compiled stored procedures to avoid concurrency bottlenecks like locks and latches.
- Data is stored in memory-optimized tables using either a hash index or range index for fast lookup. Transactions are logged and written to checkpoint files for durability.
- The Hekaton engine handles all transaction processing in memory without locks by using techniques like multi-version concurrency control and lock-free data structures. Checkpoint files are used to reconstruct the database after a restart.
- Natively compiled stored procedures provide improved performance by
Performance Tuning RocksDB for Kafka Streams’ State Stores | confluent
Performance Tuning RocksDB for Kafka Streams’ State Stores, Bruno Cadonna, Contributor to Apache Kafka & Software Developer at Confluent and Dhruba Borthakur, CTO & Co-founder Rockset
Meetup link: https://www.meetup.com/Berlin-Apache-Kafka-Meetup-by-Confluent/events/273823025/
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,... | confluent
RocksDB is the default state store for Kafka Streams. In this talk, we will discuss how to improve single node performance of the state store by tuning RocksDB and how to efficiently identify issues in the setup. We start with a short description of the RocksDB architecture. We discuss how Kafka Streams restores the state stores from Kafka by leveraging RocksDB features for bulk loading of data. We give examples of hand-tuning the RocksDB state stores based on Kafka Streams metrics and RocksDB’s metrics. At the end, we dive into a few RocksDB command line utilities that allow you to debug your setup and dump data from a state store. We illustrate the usage of the utilities with a few real-life use cases. The key takeaway from the session is the ability to understand the internal details of the default state store in Kafka Streams so that engineers can fine-tune their performance for different varieties of workloads and operate the state stores in a more robust manner.
This document provides an overview of distributed databases and the Yahoo! Cloud Serving Benchmark (YCSB). It discusses NoSQL databases Cassandra and HBase and how YCSB can be used to benchmark their performance. Experiments were conducted on Amazon EC2 using YCSB to load data and run workloads on Cassandra and HBase clusters. The results showed Cassandra had lower latency and higher throughput than HBase. YCSB provides a way to compare the performance of different databases.
MariaDB ColumnStore is a high performance columnar storage engine for MariaDB that supports analytical workloads on large datasets. It uses a distributed, massively parallel architecture to provide faster and more efficient queries. Data is stored column-wise which improves compression and enables fast loading and filtering of large datasets. The cpimport tool allows loading data into MariaDB ColumnStore in bulk from CSV files or other sources, with options for centralized or distributed parallel loading. Proper sizing of ColumnStore deployments depends on factors like data size, workload, and hardware specifications.
Виталий Бондаренко "Fast Data Platform for Real-Time Analytics. Architecture ... | Fwdays
We will start from understanding how Real-Time Analytics can be implemented on Enterprise Level Infrastructure and will go into details and discover how different cases of business intelligence can be used in real-time on streaming data. We will cover different Stream Data Processing Architectures and discuss their benefits and disadvantages. I'll show with live demos how to build a Fast Data Platform in Azure Cloud using open source projects: Apache Kafka, Apache Cassandra, Mesos. Also I'll show examples and code from real projects.
Tim Vaillancourt is a senior technical operations architect specializing in MongoDB. He has over 10 years of experience tuning Linux for database workloads and monitoring technologies like Nagios, MRTG, Munin, Zabbix, Cacti, and Graphite. He discussed the various MongoDB storage engines including MMAPv1, WiredTiger, RocksDB, and TokuMX. Key metrics for monitoring the different engines include lock ratio, page faults, background flushing times, checkpoints/compactions, replication lag, and scanned/moved documents. High-level operating system metrics like CPU, memory, disk, and network utilization are also important for ensuring MongoDB has sufficient resources.
Take an in-depth look at data warehousing with Amazon Redshift and get answers to your technical questions. We will cover performance tuning techniques that take advantage of Amazon Redshift's columnar technology and massively parallel processing architecture. We will also discuss best practices for migrating from existing data warehouses, optimizing your schema, loading data efficiently, and using work load management and interleaved sorting.
Apache Drill is a distributed SQL query engine that enables fast analytics over NoSQL databases and distributed file systems. It has a plugin-based architecture that allows it to access different data sources. For NoSQL databases, Drill leverages secondary indexes to generate index-based query plans for predicates on non-key columns. For distributed file systems like HDFS, Drill performs partition pruning based on directory metadata and filter pushdown based on Parquet row group statistics to speed up queries. Drill's extensible framework allows data sources to provide metadata like indexes, statistics, and partitioning functions to optimize query execution.
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba | Michael Stack
Yun Zhang
Track 2: Ecology and Solutions
https://open.mi.com/conference/hbasecon-asia-2019
THE COMMUNITY EVENT FOR APACHE HBASE™
July 20th, 2019 - Sheraton Hotel, Beijing, China
https://hbase.apache.org/hbaseconasia-2019/
From: DataWorks Summit 2017 - Munich - 20170406
HBase has established itself as the backend for many operational and interactive use-cases, powering well-known services that support millions of users and thousands of concurrent requests. In terms of features HBase has come a long way, offering advanced options such as multi-level caching on- and off-heap, pluggable request handling, fast recovery options such as region replicas, table snapshots for data governance, tuneable write-ahead logging and so on. This talk is based on the research for the upcoming second release of the speaker's HBase book, correlated with the practical experience in medium to large HBase projects around the world. You will learn how to plan for HBase, starting with the selection of the matching use-cases, to determining the number of servers needed, leading into performance tuning options. There is no reason to be afraid of using HBase, but knowing its basic premises and technical choices will make using it much more successful. You will also learn about many of the new features of HBase up to version 1.3, and where they are applicable.
HBase is a distributed, column-oriented database that stores data in tables divided into rows and columns. It is optimized for random, real-time read/write access to big data. The document discusses HBase's key concepts like tables, regions, and column families. It also covers performance tuning aspects like cluster configuration, compaction strategies, and intelligent key design to spread load evenly. Different use cases are suitable for HBase depending on access patterns, such as time series data, messages, or serving random lookups and short scans from large datasets. Proper data modeling and tuning are necessary to maximize HBase's performance.
Best Practices for Migrating Your Data Warehouse to Amazon Redshift | Amazon Web Services
by Darin Briskman, Technical Evangelist, AWS
You can gain substantially more business insights and save costs by migrating your existing data warehouse to Amazon Redshift. This session will cover the key benefits of migrating to Amazon Redshift, migration strategies, and tools and resources that can help you in the process. We’ll learn about AWS Database Migration Service and AWS Schema Migration Tool, which were recently enhanced to import data from six common data warehouse platforms. Level: 200
We are pleased to share with you the latest VCOSA statistical report on the cotton and yarn industry for the month of March 2024.
Starting from January 2024, the full weekly and monthly reports will only be available for free to VCOSA members. To access the complete weekly report with figures, charts, and detailed analysis of the cotton fiber market in the past week, interested parties are kindly requested to contact VCOSA to subscribe to the newsletter.
Codeless Generative AI Pipelines
(GenAI with Milvus)
https://ml.dssconf.pl/user.html#!/lecture/DSSML24-041a/rate
Discover the potential of real-time streaming in the context of GenAI as we delve into the intricacies of Apache NiFi and its capabilities. Learn how this tool can significantly simplify the data engineering workflow for GenAI applications, allowing you to focus on the creative aspects rather than the technical complexities. I will guide you through practical examples and use cases, showing the impact of automation on prompt building. From data ingestion to transformation and delivery, witness how Apache NiFi streamlines the entire pipeline, ensuring a smooth and hassle-free experience.
Timothy Spann
https://www.youtube.com/@FLaNK-Stack
https://medium.com/@tspann
https://www.datainmotion.dev/
milvus, unstructured data, vector database, zilliz, cloud, vectors, python, deep learning, generative ai, genai, nifi, kafka, flink, streaming, iot, edge
4. SQL Server Memory Pools
[Diagram labels: Stable Storage (MDF/NDF Files); Buffer Pool (Table Data, data in/out as required); Memory Internal Structures (proc/log cache etc.); SQL Server Memory Space; Memory Optimised Tables]
5. Memory Optimised Tables
SQL Server Memory Pools – MOT aggression
[Diagram labels: Stable Storage (MDF/NDF Files); Buffer Pool (Table Data); Memory Internal Structures (proc/log cache etc.); SQL Server Memory Space]
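6. Create Resource Pool
-- Reserve a fixed share of server memory for memory-optimised tables
CREATE RESOURCE POOL mem_xtp_pool
WITH ( MAX_MEMORY_PERCENT = 50,
       MIN_MEMORY_PERCENT = 50
);
ALTER RESOURCE GOVERNOR RECONFIGURE;
-- Bind the database to the pool
EXEC sp_xtp_bind_db_resource_pool 'xtp_demo', 'mem_xtp_pool';
-- The binding takes effect once the database has been taken offline and brought back online
ALTER DATABASE xtp_demo SET OFFLINE WITH ROLLBACK IMMEDIATE;
ALTER DATABASE xtp_demo SET ONLINE;
Best Practice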
8. CREATE DATABASE
-- Add a filegroup for memory-optimised data, then three containers for it
ALTER DATABASE xtp_basics ADD
    FILEGROUP xtp_demo_mod
    CONTAINS MEMORY_OPTIMIZED_DATA;
ALTER DATABASE xtp_basics ADD
    FILE ( NAME = N'xtp_basics_mod1',
           FILENAME = N'c:\SQLDATA\inmem\xtp_basics_mod1',
           MAXSIZE = 4GB),
         ( NAME = N'xtp_basics_mod2',
           FILENAME = N'e:\SQLDATA\inmem\xtp_basics_mod2',
           MAXSIZE = 4GB),
         ( NAME = N'xtp_basics_mod3',
           FILENAME = N'c:\SQLDATA\inmem\xtp_basics_mod3',
           MAXSIZE = 4GB)
    TO FILEGROUP xtp_demo_mod;
GO
9. Multiple Containers – Load Balancing
• Specify an odd number of Files in the File Group
• CFP {Data and Delta files} allocated in round robin
• If only two files – Data will always be on “1” and Delta on “2”
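The round-robin point above can be checked with a small simulation. This is only a sketch under stated assumptions: it assumes Data and Delta files are created alternately and handed to containers in strict rotation, and the function name `allocate_round_robin` is made up for illustration, not part of SQL Server.

```python
from collections import defaultdict


def allocate_round_robin(num_containers, num_pairs):
    """Place Data and Delta files on containers in strict rotation.

    Returns, for each file kind, the sorted list of 1-based container
    numbers that ever received a file of that kind.
    """
    placement = defaultdict(set)
    next_container = 0
    for _ in range(num_pairs):
        # Each checkpoint file pair is one Data file followed by one Delta file.
        for kind in ("data", "delta"):
            placement[kind].add(next_container + 1)
            next_container = (next_container + 1) % num_containers
    return {kind: sorted(used) for kind, used in placement.items()}


two = allocate_round_robin(2, 6)    # even number of containers
three = allocate_round_robin(3, 6)  # odd number of containers
print(two)
print(three)
```

With two containers the rotation never shifts, so every Data file lands on container 1 and every Delta file on container 2. With three containers both kinds of file end up spread across all of them, which is the reason for the odd-number advice.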
10. Life of a Row
[Diagram: Memory, CFP (Data / Delta), CFP (Data / Delta) with no active rows; stages 2. CHECKPOINT, 3. MERGE, 4. GARBAGE COLLECT]
1. Write to storage LDF – offline checkpoint worker writes to CFP (triggered when 512MiB has been written to the log, or a transaction is bigger than the CFP size)
2. Close CFP and mark ACTIVE (Durable Checkpoint established)
3. ACTIVE CFPs with >= 50% free space can be merged.
4. Files with no active rows can be deleted
11. Offline Checkpoint Worker
• After 512MiB data written to log (from all DB activity) or Manual CHECKPOINT
• CFP will become ACTIVE if amount of data written in single transaction warrants it
• CFP state: UNDER CONSTRUCTION (on recovery, data taken from transaction log)
• On CHECKPOINT
  • UNDER CONSTRUCTION CFP closed, becomes ACTIVE
  • Now have a durable checkpoint (otherwise use the log)
• Reiterate the need for Odd containers – read from container ‘A’ and write new CFP to container ‘B’
16. Multi-Version Concurrency Control (MVCC)
• SNAPSHOT isolation
• Update is actually INSERT and Tombstone
• Row Versions are kept in memory
• Compare And Swap replaces Latching
• Each in-memory row has a Start and End timestamp (on row header)
• Each database has:
  • xtp_transaction_id counter (increment on BEGIN TRAN, reset on SQL restart)
  • Global Transaction timestamp (used on COMMIT)
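The versioning rules on this slide can be sketched in a few lines. This is a simplified model and not how the engine is implemented: it keeps only a global commit timestamp (no xtp_transaction_id, no compare-and-swap), and the names `RowVersion` and `VersionedTable` are invented for illustration.

```python
import math
from dataclasses import dataclass

INFINITY = math.inf  # end timestamp of a version that is still current


@dataclass
class RowVersion:
    key: int
    value: str
    begin_ts: float            # commit timestamp that created this version
    end_ts: float = INFINITY   # commit timestamp that replaced it


class VersionedTable:
    def __init__(self):
        self.versions = []  # all row versions are kept in memory
        self.commit_ts = 0  # stand-in for the global transaction timestamp

    def insert(self, key, value):
        self.commit_ts += 1
        self.versions.append(RowVersion(key, value, self.commit_ts))
        return self.commit_ts

    def update(self, key, value):
        """An update is an INSERT of a new version plus a tombstone on the old one."""
        old = next(v for v in self.versions
                   if v.key == key and v.end_ts == INFINITY)
        self.commit_ts += 1
        old.end_ts = self.commit_ts  # tombstone: close the old version
        self.versions.append(RowVersion(key, value, self.commit_ts))
        return self.commit_ts

    def read(self, key, snapshot_ts):
        """Return the value visible to a snapshot taken at snapshot_ts."""
        for v in self.versions:
            if v.key == key and v.begin_ts <= snapshot_ts < v.end_ts:
                return v.value
        return None


table = VersionedTable()
t1 = table.insert(1, "a")
t2 = table.update(1, "b")
print(table.read(1, t1), table.read(1, t2), len(table.versions))
```

A reader whose snapshot was taken at `t1` still sees the old value after the update, and both versions remain in memory until something cleans them up.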
17. Multi-Version Concurrency Control (MVCC)
• SNAPSHOT isolation
• Update is actually INSERT and Tombstone
• Row Versions are kept in memory
• Compare And Swap replaces Latching
• Memory Garbage Collection cleans up versions no longer required (stale data rows)
• Versions no longer required determined by active transactions – may be inter-connection dependencies
• Your in-memory table can double, triple, x 100 etc. in size depending on access patterns (data mods + query durations)
• Row chains can expand dramatically and cause really poor performance
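To see why a long-running transaction makes a table balloon, here is a rough sketch. It assumes a version may be discarded only once its end timestamp is no later than the oldest active snapshot; `collect_garbage` and the timestamps are hypothetical, chosen only to show the effect.

```python
import math


def collect_garbage(versions, oldest_active_ts):
    """Keep only the versions that some active snapshot could still read.

    Each version is a (begin_ts, end_ts) pair; a version is stale once it
    ended at or before the oldest active snapshot timestamp.
    """
    return [(begin, end) for (begin, end) in versions if end > oldest_active_ts]


# One row inserted at timestamp 1 and then updated 100 times (timestamps 2..101).
versions = [(ts, ts + 1) for ts in range(1, 101)] + [(101, math.inf)]

# A long-running transaction still holds a snapshot from timestamp 1.
while_blocked = collect_garbage(versions, oldest_active_ts=1)

# Once it finishes, the oldest active snapshot moves up to timestamp 101.
after_release = collect_garbage(versions, oldest_active_ts=101)

print(len(while_blocked), len(after_release))
```

While the old snapshot is open, all 101 versions of the single row have to stay; once it closes, only the current version survives.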
21. MVCC Summary
• Scalability achieved because MVCC removes need for latching and locking (SNAPSHOT isolation)
• Row Versions (BeginTS, EndTS) are the mainstay
• For UPDATEs the biggest issue is a consequence of MVCC
  • Row Versions!
• Garbage collection can be slow to clean up stale rows
• In-Memory is not for large UPDATEs
24. Hash is used as bucket position in array of Memory Pointers aka the Hash Index
[Diagram label: Bucket position]
25. Hash Collisions
• Example: 1 million unique values hashed into 5000 hash values?
  • Multiple unique values must hash to the same hash value
  • Multiple key (real data) values hang off same hash bucket
• Termed: Row Chain
  • Chain of connected rows
• SQL Server 2014 Hash function is Balanced and follows a Poisson distribution: when the bucket count matches the number of distinct index keys, roughly 1/3 of buckets are empty, 1/3 contain one index key and 1/3 contain two index keys
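The numbers on this slide are easy to reproduce. The sketch below uses a plain modulo as a stand-in for the real hash function (an assumption, since the actual function is not shown in the deck) and evaluates the Poisson probabilities for the case where there are as many buckets as distinct keys.

```python
import math
from collections import Counter


def chain_lengths(num_keys, bucket_count):
    """Count how many distinct keys land in each bucket (modulo stand-in hash)."""
    return Counter(key % bucket_count for key in range(num_keys))


def poisson_pmf(k, mean):
    """Probability that a bucket receives exactly k keys."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


# 1 million unique values squeezed into 5000 buckets.
counts = chain_lengths(1_000_000, 5000)
average_chain = sum(counts.values()) / 5000

# Bucket count equal to the number of distinct keys: mean of 1 key per bucket.
empty = poisson_pmf(0, 1)
one_key = poisson_pmf(1, 1)
two_or_more = 1 - empty - one_key

print(average_chain)
print(round(empty, 3), round(one_key, 3), round(two_or_more, 3))
```

Even before any row versions are added, each bucket in the 5000-bucket case carries a chain of 200 rows on average. With one bucket per key the Poisson split works out at about 37% empty, 37% with one key and 26% with two or more, which is roughly the "thirds" rule of thumb.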
28. Memory Optimised Table Row Header
[Diagram labels: Hash Index; Memory Location; row header with up to 8 Index Memory Pointers]
• Above is a 3 row – row chain, all three rows hash to bucket #3
• Most recent row is first row in chain
• Rows in stable storage
  • accessed using files, pages and extents
• Rows in memory optimised
  • accessed using memory pointers
• Timestamp required for MVCC
30. Row Header – Multiple Index Pointers
[Diagram labels: Memory Pointers per Hash Bucket; Data Rows (Index #‘x’ Pointer is next memory pointer in ‘row’ chain); Row chain is defined through the Row Header]
31. Hash Index Scan (Table Scan)
• Scan follows Hash Index (8 byte per bucket)
• Jump to each first row in row chain
• Read row chain (lots of bytes per row – header + data)
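The structure described on the last few slides (an array of bucket pointers, rows linked through a pointer in the row header, newest row first) can be modelled roughly as follows. It is a toy sketch: keys are plain integers hashed with modulo, each row has a single index pointer rather than up to eight, and `Row` and `HashIndex` are illustrative names.

```python
class Row:
    def __init__(self, key, payload, next_row=None):
        self.key = key
        self.payload = payload
        self.next_row = next_row  # "index pointer" in the row header


class HashIndex:
    def __init__(self, bucket_count):
        # One memory pointer per bucket; None means the bucket is empty.
        self.buckets = [None] * bucket_count

    def _bucket(self, key):
        return key % len(self.buckets)  # stand-in for the real hash function

    def insert(self, key, payload):
        pos = self._bucket(key)
        # The most recent row becomes the first row in the chain.
        self.buckets[pos] = Row(key, payload, self.buckets[pos])

    def seek(self, key):
        """Equality lookup: walk one row chain only."""
        row = self.buckets[self._bucket(key)]
        while row is not None:
            if row.key == key:
                yield row.payload
            row = row.next_row

    def scan(self):
        """Table scan: visit every bucket and read every row chain in full."""
        for head in self.buckets:
            row = head
            while row is not None:
                yield row.key, row.payload
                row = row.next_row


index = HashIndex(bucket_count=4)
for key in (3, 7, 11):  # all three keys hash to bucket 3
    index.insert(key, f"row-{key}")

found = list(index.seek(7))
scanned = list(index.scan())
print(found, scanned)
```

The three rows form one chain hanging off bucket 3 with the newest row at the head. A seek still has to step past every row in that chain that does not match, which is why long chains hurt.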
32. Hash Index - thoughts
• Base BUCKET_COUNT on cardinality of the column
• Don’t use where low cardinality – high rows: gives large row chains – performance will suck
• Equality queries only
• Good candidates are FK on joins
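One way to turn the cardinality advice into a number is sketched below. It assumes two things that are not stated in the slides: the commonly quoted guidance of sizing BUCKET_COUNT at one to two times the number of distinct values, and SQL Server rounding the requested count up to the next power of two. The function names and the `headroom` parameter are hypothetical.

```python
def suggest_bucket_count(distinct_values, headroom=2):
    """Round distinct_values * headroom up to the next power of two."""
    target = max(1, distinct_values * headroom)
    return 1 << (target - 1).bit_length()


def average_chain_length(total_rows, bucket_count):
    """Average number of rows per bucket, ignoring extra row versions."""
    return total_rows / bucket_count


high_cardinality = suggest_bucket_count(1_000_000)   # mostly unique column
exact_power = suggest_bucket_count(1024, headroom=1)  # already a power of two
low_cardinality_chain = average_chain_length(1_000_000, suggest_bucket_count(10))

print(high_cardinality, exact_power, low_cardinality_chain)
```

For a column with only 10 distinct values across a million rows, no bucket count helps: the rows still pile into at most 10 chains, so the low-cardinality warning holds regardless of sizing.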
33. What makes a Row Chain grow?
• Hash collisions
• Row versions from updates
• An update is a {delete, insert}
• Late garbage collection because of long running transactions
• Garbage collection can’t keep up because of box load and amount to do {active threads do garbage collection on completion of work}
• Deleted rows may still be in memory for some time even though no other connections have a snapshot
34. Effect of Row Chains on Memory
• MEMORY RESOUCES.sql
35. Range Index (Bw Tree ≈ B+Tree)
[Diagram: B+ Tree of 8KiB pages from Root to Leaf; 1 row tracks a page in level below; leaf row per table row: {index-col}{pointer}]
36. Range Index (Bw Tree ≈ B+Tree)
[Diagram: B+ Tree of 8KiB pages from Root to Leaf; 1 row tracks a page in level below; leaf row per table row: {index-col}{pointer}]
37. Range Index (Bw Tree ≈ B+Tree)
[Diagram: Bw Tree, 8KiB, Root to Leaf; leaf row per unique value: {index-col}{memory-pointer}, leading to a Row Chain]
• Nodes 1 row – 8KiB
• Leaf is pointer to Row Chain
• Nodes don’t change – add/merge
38. Range Index (Bw Tree ≈ B+Tree)
[Diagram, side by side:
B+Tree: 8KiB; Root to Leaf; 1 row tracks a page in level below; leaf row per table row: {index-col}{pointer}
≈
Bw Tree: 8KiB; Root to Leaf; leaf row per unique value: {index-col}{memory-pointer}, leading to a Row Chain]
39. Index recommendations
• Use HASH when
  • Mostly unique values
  • Query uses expressions “Equality” i.e. = or !=
  • Set BUCKET_COUNT appropriately
  • Seriously think if you are considering composite index
• Use RANGE when
  • Expressions are not Equality e.g. <, >, BETWEEN
  • Composite Index
  • Low Cardinality
Best Practice