Hekaton is the original project name for In-Memory OLTP and just sounds cooler as a session title. Keeping up the tradition of deep technical “Inside” sessions at PASS, this half-day talk will take you behind the scenes and under the covers of how the In-Memory OLTP functionality works with SQL Server.
We will cover “everything Hekaton”, including how it is integrated with the SQL Server engine architecture. We will explore how data is stored in memory and on disk, how I/O works, and how natively compiled procedures are built and executed. We will also look at how Hekaton integrates with the rest of the engine, including Backup, Restore, Recovery, High Availability, Transaction Logging, and Troubleshooting.
Demos are a must for a half-day session like this, and what would an inside session be if we didn’t bring out the Windows Debugger? As with previous “Inside…” talks I’ve presented at PASS, this session is level 500 and not for the faint of heart. So read through the docs on In-Memory OLTP and bring some extra pain reliever, because we move fast and go deep.
This session will appear as two sessions in the program guide but is not a Part I and II. It is one complete session with a small break so you should plan to attend it all to get the maximum benefit.
Any DBA from beginner to advanced level, who wants to fill in some gaps in his/her knowledge about Performance Tuning on an Oracle Database, will benefit from this workshop.
This talk provides an in-depth overview of the key concepts of Apache Calcite. It explores the Calcite catalog, parsing, validation, and optimization with various planners.
Oracle Database Performance Tuning Advanced Features and Best Practices for DBAs - Zohar Elkayam
Oracle Week 2017 slides.
Agenda:
Basics: How and What To Tune?
Using the Automatic Workload Repository (AWR)
Using AWR-Based Tools: ASH, ADDM
Real-Time Database Operation Monitoring (12c)
Identifying Problem SQL Statements
Using SQL Performance Analyzer
Tuning Memory (SGA and PGA)
Parallel Execution and Compression
Oracle Database 12c Performance New Features
Apache Calcite: A Foundational Framework for Optimized Query Processing Over ... - Julian Hyde
A talk given at ACM SIGMOD 2018 in support of the paper <a href="https://arxiv.org/abs/1802.10233"> Calcite: A Foundational Framework for Optimized Query Processing Over Heterogeneous Data Sources</a>.
Apache Calcite is a foundational software framework that provides query processing, optimization, and query language support to many popular open-source data processing systems such as Apache Hive, Apache Storm, Apache Flink, Druid, and MapD. Calcite's architecture consists of a modular and extensible query optimizer with hundreds of built-in optimization rules, a query processor capable of processing a variety of query languages, an adapter architecture designed for extensibility, and support for heterogeneous data models and stores (relational, semi-structured, streaming, and geospatial). This flexible, embeddable, and extensible architecture is what makes Calcite an attractive choice for adoption in big-data frameworks. It is an active project that continues to introduce support for the new types of data sources, query languages, and approaches to query processing and optimization.
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB... - Michael Stack
Wellington Chevreuil of Cloudera
Track 1: Internals
https://open.mi.com/conference/hbasecon-asia-2019
THE COMMUNITY EVENT FOR APACHE HBASE™
July 20th, 2019 - Sheraton Hotel, Beijing, China
https://hbase.apache.org/hbaseconasia-2019/
Best Practices for the Most Impactful Oracle Database 18c and 19c Features - Markus Michalewicz
This presentation answers the question, “What’s new in Oracle Database 19c ?” in a slightly different way: by providing best practices and a deep dive into the most impactful high availability (HA), scalability, and lifecycle management features in Oracle Database 12c, 18c and 19c, including a short roadmap of features yet to come in the next generation Oracle Database.
This deck was first presented during OOW19 together with Mauricio Feria, who reported on two of his customers and how they have used Oracle Database HA features and Maximum Availability Architecture (MAA) to improve their businesses.
Don’t optimize my queries, optimize my data! - Julian Hyde
Your queries won't run fast if your data is not organized right. Apache Calcite optimizes queries, but can we evolve it so that it can optimize data? We had to solve several challenges. Users are too busy to tell us the structure of their database, and the query load changes daily, so Calcite has to learn and adapt.
We talk about new algorithms we developed for gathering statistics on massive database, and how we infer and evolve the data model based on the queries, suggesting materialized views that will make your queries run faster without you changing them.
A talk given by Julian Hyde at DataEngConf NYC, Columbia University, on 2017/10/30.
Experiences Migrating Hive Workload to SparkSQL with Jie Xiong and Zhan Zhang - Databricks
At Facebook, millions of Hive queries are executed on a daily basis, and the workload contributes to important analytics that drive product decisions and insights. Spark SQL in Apache Spark provides much of the same functionality as Hive query language (HQL) more efficiently, and Facebook is building a framework to migrate existing production Hive workload to Spark SQL with minimal user intervention.
Before Facebook began large-scale migration to SparkSQL, they worked on identifying the gap between HQL and SparkSQL. They built an offline syntax analysis tool that parses, analyzes, optimizes and generates physical plans on daily HQL workload. In this session, they’ll share their results. After finding their syntactic analysis encouraging, they built tooling for offline semantic analysis where they run HQL queries in their Spark shadow cluster and validate the outputs. Output validation is necessary since the runtime behavior in Spark SQL may be different from HQL. They have built a migration framework that supports HQL in both Hive and Spark execution engines, can shadow and validate HQL workloads in Spark, and makes it easy for users to convert their workloads.
Apache Calcite (a tutorial given at BOSS '21) - Julian Hyde
Apache Calcite is a dynamic data management framework. Think of it as a toolkit for building databases: it has an industry-standard SQL parser, validator, highly customizable optimizer (with pluggable transformation rules and cost functions, relational algebra, and an extensive library of rules), but it has no preferred storage primitives.
In this tutorial (given at BOSS '21 in Copenhagen as part of VLDB '21) the attendees will use Apache Calcite to build a fully fledged query processor from scratch with very few lines of code. This processor is a full implementation of SQL over an Apache Lucene storage engine. (Lucene does not support SQL queries and lacks a declarative language for performing complex operations such as joins or aggregations.) Attendees will also learn how to use Calcite as an effective tool for research.
Presenters: Julian Hyde and Stamatis Zampetakis
Building robust CDC pipeline with Apache Hudi and Debezium - Tathastu.ai
We have covered the need for CDC and the benefits of building a CDC pipeline. We will compare various CDC streaming and reconciliation frameworks. We will also cover the architecture and the challenges we faced while running this system in the production. Finally, we will conclude the talk by covering Apache Hudi, Schema Registry and Debezium in detail and our contributions to the open-source community.
All of the Performance Tuning Features in Oracle SQL Developer - Jeff Smith
An overview of all of the performance tuning instrumentation, tools, and features in Oracle SQL Developer. Get help making those applications and their queries more performant.
Troubleshooting Complex Performance issues - Oracle SEG$ contention - Tanel Poder
From Tanel Poder's Troubleshooting Complex Performance Issues series - an example of Oracle SEG$ internal segment contention due to some direct path insert activity.
Cost-based Query Optimization in Apache Phoenix using Apache Calcite - Julian Hyde
This talk, given by Maryann Xue and Julian Hyde at Hadoop Summit, San Jose on June 30th, 2016, describes how we re-engineered Apache Phoenix with a cost-based optimizer based on Apache Calcite.
Apache Phoenix has rapidly become a workhorse in many organizations, providing a convenient standard SQL interface to HBase suitable for a wide variety of workloads from transactions to ETL and analytics. But Phoenix's initial query optimizer was based on static optimization procedures and thus could not choose between several potential plans or indices based on cost metrics.
We describe how we rebuilt Phoenix's parser and query optimizer using the Calcite framework, improving Phoenix's performance and SQL compliance. The new architecture uses relational algebra as an intermediate language, and this enables you to switch in other engines, especially those also based on Calcite. As an example of this, we demonstrate querying a Phoenix database via Apache Drill.
Developer’s guide to contributing code to Kafka with Mickael Maison and Tom B... - HostedbyConfluent
Contributing code to an open source project can sometimes feel really difficult. The process is often different for each project and requires you to develop, build and test your change and then it needs to be accepted by the project. For Kafka, certain types of changes also require you to go through the Kafka Improvement Proposal (KIP) process.
In this talk, we will cover in detail the process to contribute code to Apache Kafka, from setting up a development environment, to building the code, running tests and opening a PR. We will also look at the KIP process, describe what each section of the document is for, the importance of finding consensus, and what happens when it gets voted. We will share, from a committers point of view, what we look for when reviewing a KIP and give some tips to help you get through the process successfully.
At the end of this talk, you will be able to get started contributing code to Kafka and understand how to get from idea, to KIP, to released feature.
SQL Server In-Memory OLTP: What Every SQL Professional Should Know - Bob Ward
Perhaps you have heard the term “In-Memory” but not sure what it means. If you are a SQL Server Professional then you will want to know. Even if you are new to SQL Server, you will want to learn more about this topic. Come learn the basics of how In-Memory OLTP technology in SQL Server 2016 and Azure SQL Database can boost your OLTP application by 30X. We will compare how In-Memory OTLP works vs “normal” disk-based tables. We will discuss what is required to migrate your existing data into memory optimized tables or how to build a new set of data and applications to take advantage of this technology. This presentation will cover the fundamentals of what, how, and why this technology is something every SQL Server Professional should know
SQL Server R Services: What Every SQL Professional Should Know - Bob Ward
SQL Server 2016 introduces a new platform for building intelligent, advanced analytic applications called SQL Server R Services. This session is for the SQL Server Database professional to learn more about this technology and its impact on managing a SQL Server environment. We will cover the basics of this technology but also look at how it works, troubleshooting topics, and even usage case scenarios. You don't have to be a data scientist to understand SQL Server R Services but you need to know how this works so come upgrade you career by learning more about SQL Server and advanced analytics.
Brk3288 SQL Server v.Next with support on Linux, Windows and containers was... - Bob Ward
SQL Server is bringing its world-class RDBMS to Linux and Windows with SQL Server v.Next. In this session you will learn what’s next for SQL Server on Linux and how application developers and IT architects can now leverage the enterprise-class features of SQL Server in every edition on Linux, Windows and containers.
Based on the popular blog series, join me in taking a deep dive and a behind the scenes look at how SQL Server 2016 “It Just Runs Faster”, focused on scalability and performance enhancements. This talk will discuss the improvements, not only for awareness, but expose design and internal change details. The beauty behind ‘It Just Runs Faster’ is your ability to just upgrade, in place, and take advantage without lengthy and costly application or infrastructure changes. If you are looking at why SQL Server 2016 makes sense for your business you won’t want to miss this session.
Brk3043 Azure SQL DB intelligent cloud database for app developers - Wash DC - Bob Ward
Make building and maintaining applications easier and more productive. With built-in intelligence that learns app patterns and adapts to maximize performance, reliability, and data protection, SQL Database is a cloud database built for developers. The session covers our most advanced features to-date including Threat Detection, auto-tuned performance and actionable recommendations across performance and security aspects. Case studies and live demos help you understand how choosing SQL Database will make a difference for your app and your company.
Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc... - NoSQLmatters
Simon Elliston Ball – When to NoSQL and When to Know SQL
With NoSQL, NewSQL and plain old SQL, there are so many tools around it’s not always clear which is the right one for the job. This is a look at a series of NoSQL technologies, comparing them against traditional SQL technology. I’ll compare real use cases and show how they are solved with both NoSQL options and traditional SQL servers, and then see who wins. We’ll look at some code and architecture examples that fit a variety of NoSQL techniques, and some where SQL is a better answer. We’ll see some big data problems, little data problems, and a bunch of new and old database technologies to find whatever it takes to solve the problem. By the end you’ll hopefully know more NoSQL, and maybe even have a few new tricks with SQL, and what’s more how to choose the right tool for the job.
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635) - Michael Rys
Data Lakes have become a new tool in building modern data warehouse architectures. In this presentation we will introduce Microsoft's Azure Data Lake offering and its new big data processing language called U-SQL that makes Big Data Processing easy by combining the declarativity of SQL with the extensibility of C#. We will give you an initial introduction to U-SQL by explaining why we introduced U-SQL and showing with an example of how to analyze some tweet data with U-SQL and its extensibility capabilities and take you on an introductory tour of U-SQL that is geared towards existing SQL users.
slides for SQL Saturday 635, Vancouver BC, Aug 2017
Introduction to the basics of Information Retrieval (IR) with an emphasis on Apache Solr/Lucene. A lecture I gave during the JOSA Data Science Bootcamp.
Brad McGehee's presentation on "How to Interpret Query Execution Plans in SQL Server 2005/2008".
Presented to the San Francisco SQL Server User Group on March 11, 2009.
Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor... - Michael Rys
More and more customers who are looking to modernize analytics needs are exploring the data lake approach in Azure. Typically, they are most challenged by a bewildering array of poorly integrated technologies and a variety of data formats, data types not all of which are conveniently handled by existing ETL technologies. In this session, we’ll explore the basic shape of a modern ETL pipeline through the lens of Azure Data Lake. We will explore how this pipeline can scale from one to thousands of nodes at a moment’s notice to respond to business needs, how its extensibility model allows pipelines to simultaneously integrate procedural code written in .NET languages or even Python and R, how that same extensibility model allows pipelines to deal with a variety of formats such as CSV, XML, JSON, Images, or any enterprise-specific document format, and finally explore how the next generation of ETL scenarios are enabled though the integration of Intelligence in the data layer in the form of built-in Cognitive capabilities.
In these Microsoft slides we can see what SQL Server 2014 brings in areas such as: memory-optimized tables, changes to cardinality estimation, backup encryption, architecture improvements, Always On, changes to Resource Governor, and data files in Azure.
2021 04-20 apache arrow and its impact on the database industry.pptx - Andrew Lamb
The talk will motivate why Apache Arrow and related projects (e.g. DataFusion) is a good choice for implementing modern analytic database systems. It reviews the major components in most databases and explains where Apache Arrow fits in, and explains additional integration benefits from using Arrow.
An introduction to the different types of NoSQL and some guidance on when to choose them, and when to use plain old SQL. Focuses on developer productivity, intuitive code, and system issues including scaling and usage patterns. As delivered at JavaOne 2014 in San Francisco
This session shows an overview of the features and architecture of SQL Server on Linux and Containers. It covers install, config, performance, security, HADR, Docker containers, and tools. Find the demos on http://aka.ms/bobwardms
Inside SQL Server In-Memory OLTP
1. Inside SQL Server In-Memory OLTP
Bob Ward, Principal Architect, Microsoft, Data Group, Tiger Team
Latch free since 2007
Based on this whitepaper, internal docs, source code, and reverse engineering
Credits to Kalen Delaney, Jos de Bruijn, Sunil Agarwal, Cristian Diaconu, Craig Freedman, Alejandro Hernandez Saenz, Jack Li, Mike Zwilling and the entire Hekaton team
Deck and demos at http://aka.ms/bobwardms
2.
3. This is a 500 Level Presentation
It is ONE talk with a 15-minute break for 3 hours
Great primer and customer stories by @jdebruijn
Get and run the demo yourself from GitHub
4. Let’s Do This
What, How, Why, and Architecture
Data and Indexes
Concurrency, Transactions, and Logging
Checkpoint and Recovery
Natively Compiled Procedures
Resource Management
Based on SQL Server 2016
Check out the Bonus Material section if time permits
5. The Research behind the concept
OLTP Through the Looking Glass, and What We Found There by Stavros Harizopoulos, Daniel J. Abadi, Samuel Madden, Michael Stonebraker, SIGMOD 2008
TPC-C hand-coded on top of the SHORE storage engine
6. The Path to In-Memory OLTP
Project Verde: 2007
“Early” Hekaton: 2009/2010
Hekaton becomes In-Memory OLTP: SQL Server 2014
In-Memory OLTP Unleashed: SQL Server 2016
7. What, Why, and How
Keep SQL Server relevant in a world of high-speed hardware with dense cores, fast I/O, and inexpensive massive memory
The need for high-speed OLTP transactions at microsecond speed
Reduce the number of instructions to execute a transaction
Find areas of latency for a transaction and reduce its footprint
Use multi-version optimistic concurrency and “lock free” algorithms
Use DLLs and natively compiled procs
XTP = eXtreme Transaction Processing
HK = Hekaton
Minimal diagnostics in the critical path
8. • Supported on Always On Availability Groups
• Supports ColumnStore Indexes for HTAP applications
• Supported in Azure SQL Database
• Cross container transactions (disk and in-memory in one transaction)
• Table variables supported
• SQL surface area expanded in SQL Server 2016. Complete support here
• LOB data types (ex. Varchar(max)) supported
• BACKUP/RESTORE complete functionality
• Use Transaction Performance Analysis Report for migration
In-Memory OLTP Capabilities
Preview
Hear the case studies
9. The Path of In-Memory OLTP Transactions
“Normal” INSERT: obtain locks; latch page; modify page; update index pages (more locks and latches); release latches and locks; INSERT log record; COMMIT transaction = log record and flush. Along the way: page splits, system transactions = log flush, spinlocks.
In-Memory OLTP INSERT: insert row into memory; maintain index in memory; COMMIT transaction = insert HK log record and flush. No index logging, SCHEMA_ONLY = no logging, no latch, no spinlock.
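To illustrate the SCHEMA_ONLY = no logging point above, here is a minimal sketch (the table and column names are made up for illustration) of a non-durable memory-optimized table; only its schema survives a restart, and its DML generates no log records for the data:

create table dbo.session_cache
(session_id int not null
    primary key nonclustered hash with (bucket_count = 1000000),
 payload varchar(2000) not null)
with (memory_optimized = on, durability = schema_only)  -- data is never logged or checkpointed; lost on restart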
14. Creating the database for In-Memory OLTP
filegroup <fg_name> contains memory_optimized_data (name = <name>, filename = ‘<drive and path><folder name>’)
Filestream container: we create this; create multiple across drives
MEMORYCLERK_XTP: Name = DB_ID_<dbid>; your data goes here, so this gets big
MEMOBJ_XTPDB: two of these for each db; much smaller but can grow some over time
HK system tables created (hash index based tables)
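A minimal sketch of that filegroup syntax in a full CREATE DATABASE statement (the database name, file names, and paths are illustrative assumptions, not from the deck):

create database imoltp
on primary (name = imoltp_data, filename = 'c:\data\imoltp_data.mdf'),
filegroup imoltp_mod contains memory_optimized_data
    (name = imoltp_mod1, filename = 'c:\data\imoltp_mod1'),   -- filestream containers
    (name = imoltp_mod2, filename = 'd:\data\imoltp_mod2')    -- spread across drives
log on (name = imoltp_log, filename = 'c:\data\imoltp_log.ldf');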
15. In-Memory OLTP Threads
User transactions run on user schedulers; one garbage collection (GC) thread per NUMA node
Per database: CHECKPOINT CONTROLLER, CHECKPOINT CLOSE, LOG FLUSH, and Data Serializer threads, plus In-Memory thread pools for CHECKPOINT I/O
XTP_OFFLINE_CKPT runs preemptively; work is spread across user schedulers, hidden schedulers, and preemptive threads
You could see session_id > 50
See dm_xtp_threads
18. Creating a Table
create table starsoftheteam
(player_number int identity
primary key nonclustered hash with (bucket_count = 1048576),
player_name char(1000) not null)
with (memory_optimized = on, durability = schema_and_data)
All tables require at least one index
Schema stored in SQL system catalog
Hash index built in memory
HK system tables populated (ex. “root” table)
Schema DLL built
Perform a checkpoint: FS container files created and written
Default index type = range
19. Inside Data and Rows
How do you find data rows in a normal SQL table?
• Heap = Use IAM pages
• Clustered Index = Find root page and traverse index
What about an in-memory table?
• Hash index table pointer known in HK metadata for a table. Hash the index key, go to bucket pointer, traverse chain to find row
• Page Mapping Table used for range indexes and has a known pointer in HK metadata. Traverse the range index which points to data rows
• Data exists in memory as pointers to rows (aka a heap). No page structure
All data rows have a known header but data is opaque to the HK engine
• Schema DLL and/or Native Compiled Proc DLL knows the format of the row data
• Schema DLL and/or Native Compiled Proc DLL knows how to find the “key” inside the index
Compute your estimated row size here
“bag of bytes”
Hash index scan is possible
20. An In-Memory OLTP Data Row
Your data: “bytes” to the HK engine
Points to another row in the “chain”; one pointer for each index on the table
Timestamp of INSERT
Halloween protection
Number of pointers = # indexes
Rows don’t have to be contiguous in memory
Timestamp of DELETE; ∞ = “not deleted”
21. Hash indexes
columns = name char(100), city char(100)
hash index on name bucket_count = 8 (hash on name); hash index on city bucket_count = 8 (hash on city)
Example rows (begin ts, end ts): 100,∞ Bob Dallas; 200,∞ Bob Fort Worth; 90,∞ Ginger Austin; 50,∞ Ryan Arlington; 70,∞ Ginger Houston
Chain count = 3; hash collision; duplicate key != collision
Begin ts reflects order of inserts
22. Hash Indexes
When should I use these?
Single row lookups by exact key value
Support multi-column indexes
Do I care about bucket counts and chain length?
The key is to keep the chains short
Key values with high cardinality are best and typically result in short chains
A larger number of buckets keeps collisions low and chains shorter
Monitor and adjust with dm_db_xtp_hash_index_stats (see the query sketch below)
SQL catalog view sys.hash_indexes also provided
A high empty-bucket % is OK, except that it uses memory and can affect scans
Monitor average and max chain length (average chain of 1 is ideal; < 10 acceptable)
Rebuild with ALTER TABLE; it could take some time
REBUILD can block interop with SCH-S; native procs are rebuilt, so they are blocked
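A minimal monitoring sketch, assuming the starsoftheteam table from the earlier slide; it joins the hash-index DMV mentioned above to sys.hash_indexes to see bucket counts and chain lengths:

select object_name(h.object_id) as table_name, i.name as index_name,
       h.total_bucket_count, h.empty_bucket_count,
       h.avg_chain_length, h.max_chain_length      -- want avg ~1, max < 10
from sys.dm_db_xtp_hash_index_stats as h
join sys.hash_indexes as i
  on h.object_id = i.object_id and h.index_id = i.index_id
where h.object_id = object_id('dbo.starsoftheteam');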
24. Range Indexes
When should I use these?
Frequent multi-row searches based on a “range” criterion (ex. >, <); see the example after this list
ORDER BY on the index key
First column of multi-column index searches
“I don’t know” or “I don’t want to maintain buckets”: the default for an index
You can always put the pkey on hash and range indexes on other columns for seeks
What is different about them?
Contains a tree of pages, but pages are not a fixed size like SQL B-tree pages
Pointers are logical IDs that refer to the Page Mapping Table
Updates are contained in “delta” records instead of modifying the page
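A hedged sketch of when a range (nonclustered) index pays off, reusing the starsoftheteam table from the earlier slide (the index name and predicate are illustrative):

alter table dbo.starsoftheteam
    add index ix_player_name nonclustered (player_name);   -- range index, no bucket_count to size

select player_number, player_name
from dbo.starsoftheteam
where player_name >= 'M'      -- range predicate: a hash index cannot seek on this
order by player_name;         -- ORDER BY on the index key avoids a sort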
26. The SCHEMA DLL
• The HK Engine doesn’t understand row formats or key value comparisons
• So when a table is built or altered, we create, compile, and build a DLL bound to the table, used for interop queries
dbid, object_id
Symbol file
Rebuilt when the database comes online or the table is altered
cl.exe output
Generated C source
t = schema DLL, p = native proc
27. The Row Payload
create table letsgomavs
(col1 int not null identity primary key
nonclustered hash with (bucket_count =
1000000),
col2 int not null,
col3 varchar(100)
)
with (memory_optimized = on, durability
= schema_and_data)
struct NullBitsStruct_2147483654
{
unsigned char hkc_isnull_3:1;
};
struct hkt_2147483654
{
long hkc_1;
long hkc_2;
struct NullBitsStruct_2147483654 null_bits;
unsigned short hkvdo[2];
};
Row layout: Header | hkc_1 | hkc_2 | Null bits | Offset array | col3 (“your data”, nullable)
28. Access data with the SCHEMA DLL
Interop code to “crack” rows via the schema DLL
Lookup a row in a hash index seek: find the hash index in metadata; hash the key and find the bucket; call the CompareSKeyToRow routine; if the row is found, “crack” the row
To “crack” row values: find the table metadata in the Root Table; get the OpaqueData pointer; get the HkOffsetInfo pointer; find the columns and offsets; use this to “crack” the row values
31. Multi-Version Optimistic Concurrency (MVCC)
• Rows never change: UPDATE = DELETE + INSERT
Immutable
• UPDATE and DELETE create versions
• Timestamps in rows for visibility and transaction correctness
Versions
• Assume no conflicts
• Snapshots + conflict detection
• Guarantee correct transaction isolation at commit
Optimistic
Versions are NOT in tempdb
Pessimistic = locks
Errors may occur
32. In-Memory OLTP versioning
Side-by-side timeline (timestamps 100 to 106) for a disk-based table and an in-memory table: row (1, ‘Row 1’) already exists; T1 and T2 BEGIN TRAN, (2, ‘Row 2’) is inserted, both sessions SELECT, both COMMIT
Disk-based table: read committed and SERIALIZABLE = blocked; RCSI = Row 1 and 2; Snapshot = Row 1
In-memory table: Snapshot always used = Row 1; RCSI = Row 1; SERIALIZABLE = Msg 41325
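A small sketch of the isolation behavior described above: memory-optimized tables always use MVCC snapshots, and you can either hint the isolation level per table reference or elevate READ COMMITTED automatically at the database level (table and database names are assumed from the earlier examples):

-- elevate READ COMMITTED interop access to SNAPSHOT automatically
alter database current set memory_optimized_elevate_to_snapshot = on;

-- or be explicit per table reference
select player_name
from dbo.starsoftheteam with (snapshot)
where player_number = 1;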
33. What were those timestamps for?
TS = 243: SELECT Name, City FROM T1
“write set” = logged records; “read set”
BEGIN TRAN TX3 -- TS 246
SELECT City FROM T1 WITH (REPEATABLEREAD) WHERE Name = 'Jane';
UPDATE T1 WITH (REPEATABLEREAD) SET City = 'Helsinki' WHERE Name = 'Susan';
COMMIT TRAN -- commits at timestamp 255
Rows: Greg, Lisbon; Susan, Bogota; Jane, Helsinki
FAIL = ‘Jane’ changed after I started but before I committed and I’m REPEATABLEREAD. With disk-based SQL the update to Jane would have been blocked
Commit dependency
34. Transaction Phases and In-Memory OLTP
Update conflicts can happen here
COMMIT started
Ensure isolation levels are accurate; could result in failure
Wait for dependencies, flush log records, update timestamps
COMMIT completed
Look out for foreign keys
dm_db_xtp_transactions
35. It is really “lock free”
Code executing transactions in the Hekaton kernel is lock, latch, and spinlock free
Locks
• Only SCH-S needed for interop queries
• Database locks
Latches
• No pages = no page latches
• Latch XEvents don’t fire inside HK
Spinlocks
• Spinlock class never used
The “host” may wait: tlog, SQLOS
We use Thread Local Storage (TLS) to “guard” pointers
We can “walk” lists lock free; retries may be needed
Atomic “Compare and Swap” (CAS) to modify
IsXTPSupported() = cmpxchg16b
Spinlocks use CAS (‘cmpxchg’) to “acquire” and “release”
Great blog post here
*XTP* wait types are not in transaction code
Deadlock free
CMED_HASH_SET fix
37. Logging for a simple disk based INSERT
use dontgobluejays
create table starsoftheteam (player_no int primary key nonclustered, player_name
char(100) not null)
One row already exists
begin tran
insert into starsoftheteam values (1, 'I really don''t like the Blue Jays')
commit tran
LOP_BEGIN_XACT (124 bytes)
LOP_INSERT_ROWS – heap page (204 bytes)
LOP_INSERT_ROWS – ncl index (116 bytes)
LOP_COMMIT_XACT (84 bytes)
4 log records, 528 bytes
38. Logging for a simple in-mem OLTP INSERT
use gorangers
create table starsoftheteam (player_no int primary key nonclustered,
player_name char(100) not null)
with (memory_optimized = on, durability = schema_and_data)
begin tran
insert into starsoftheteam values (1, 'Let''s go Rangers')
commit tran
3 log records, 436 bytes
LOP_BEGIN_XACT (144 bytes)
LOP_HK (208 bytes)
LOP_COMMIT_XACT (84 bytes)
The LOP_HK record contains HK_LOP_BEGIN_TX, HK_LOP_INSERT_ROW, HK_LOP_COMMIT_TX
fn_dblog_xtp()
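A sketch of how you can inspect these records yourself, using the undocumented sys.fn_dblog log reader (the fn_dblog_xtp variant named on the slide decodes the HK_LOP_* sub-records); treat this as exploratory, not production, tooling:

begin tran
insert into starsoftheteam values (2, 'Let''s go Rangers')
commit tran

select [Current LSN], Operation, [Log Record Length]
from sys.fn_dblog(null, null)        -- undocumented log reader
where Operation in ('LOP_BEGIN_XACT', 'LOP_HK', 'LOP_COMMIT_XACT');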
39. Multi-row INSERT disk-based heap table
LOP_BEGIN_XACT
LOP_INSERT_ROWS – heap page
LOP_INSERT_ROWS – ncl index
LOP_BEGIN_XACT
LOP_MODIFY_ROW – PFS
LOP_HOBT_DELTA
LOP_FORMAT_PAGE
LOP_COMMIT_XACT
LOP_SET_FREE_SPACE - PFS
LOP_COMMIT_XACT
100 rows
Log flush
PFS latch multiple times
213 log records @ 33KB
PFS latch
Metadata access
1 pair for every row
Alloc new page twice
40. Compared to In-Memory OLTP
LOP_BEGIN_XACT (144 bytes)
LOP_HK (11988 bytes)
LOP_COMMIT_XACT (84 bytes)
The LOP_HK record contains HK_LOP_BEGIN_TX, 100 x HK_LOP_INSERT_ROW, HK_LOP_COMMIT_TX
Only inserted into log cache at commit; log flush
ROLLBACK = more log records for SQL, 0 for HK
If LOP_HK is too big we may need more than one
42. In-Memory OLTP CHECKPOINT facts
Why CHECKPOINT?
Speed up database startup. Otherwise we would have to keep a huge tlog around.
We only write data based on committed log records
Independent of SQL recovery intervals or background tasks (1.5Gb log increase)
All data written in pairs of data and delta files
Preemptive tasks using WriteFile (async with FILE_FLAG_NO_BUFFERING)
Data is always appended
I/O is continuous but CHECKPOINT event is based on 1.5Gb log growth
Instant File Initialization matters for PRECREATED files
Data = INSERTs and UPDATEs
Delta = filter for what rows in Data are actually deleted
Periodically we will MERGE several Data/Delta pairs.
Data files: typically 128MB but can be 1GB
Delta files: typically 8MB but can be 128MB
Log truncation eligible at the CHECKPOINT event
No WAL protocol
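A quick sketch for watching those data/delta files, using the DMV referenced on the next slide (column names are from sys.dm_db_xtp_checkpoint_files; the exact set varies a little between SQL Server 2014 and 2016):

select file_type_desc, state_desc,
       count(*) as file_count,
       sum(file_size_in_bytes) / 1024 / 1024 as size_mb
from sys.dm_db_xtp_checkpoint_files
group by file_type_desc, state_desc
order by file_type_desc, state_desc;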
43. CHECKPOINT FILE types and states
File states: PRECREATED, UNDER CONSTRUCTION, ACTIVE, WAITING FOR TRUNCATION, FREE; file types: DATA, DELTA, ROOT
The diagram steps through: after the first CREATE TABLE; INSERT data rows; a CHECKPOINT event (files now cover 0 to 100 TS); INSERT more data rows (101 to 200 TS)
A DATA/DELTA pair is a Checkpoint File Pair (CFP), covers any transaction in its TS range, and is what is referenced in the tlog
ACTIVE files will be used to populate the table on startup and then the log is applied
WAITING FOR TRUNCATION files can be reused or deleted at log truncation
We constantly keep PRECREATED and FREE files available
44. CHECKPOINT DATA file format
Layout: a fixed File Header block (“CKPT”, TYPE), then repeating Block Header + Data Rows blocks; the block header is fixed, the length depends on TYPE, and data rows are row headers and payload in 256KB blocks with padding
Every file type starts with a File Header block of 4KB
ROOT and DELTA files use 4KB blocks
Find the list @ dm_db_xtp_checkpoint_files
45. In-Memory OLTP Recovery
Read details from the database Boot Page (DBINFO)
Load the ROOT file for system tables and file metadata
Load ACTIVE DATA files, filtered by DELTA files, streamed in parallel
Redo COMMITTED transactions greater than the last CHECKPOINT LSN
Checksum failures = RECOVERY_PENDING on startup
We can create a thread per CPU for this
We are a bit ERRORLOG “happy”
47. Natively Compiled Procedures
Native = C code vs T-SQL
Built at CREATE PROC time or recompile
Rebuilt when database comes online
‘XTP Native DLL’ in dm_os_loaded_modules
48. The structure of a natively compiled proc
create procedure cowboys_proc_scan
with native_compilation, schemabinding as
begin atomic with
(transaction isolation level = snapshot, language = N'English')
select player_number, player_name from dbo.starsoftheteam
..
end
Compile this into a DLL
schemabinding is required: no referenced object/column can be dropped or altered; no SCH lock required
Everything in the atomic block is a single tran
These options are required; there are other options
No “*” allowed; fully qualified names
Bigger surface area in SQL Server 2016
Restrictions and Best Practices are here
Isolation levels still use MVCC
Your queries
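A small sketch that executes the proc above and confirms its compiled DLL is loaded, using the dm_os_loaded_modules marker mentioned on the previous slide (the filter on description is an assumption that matches the 'XTP Native DLL' label shown there):

exec dbo.cowboys_proc_scan;

select name, description
from sys.dm_os_loaded_modules
where description = 'XTP Native DLL';   -- one entry per loaded schema/native proc DLL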
49. The lifecycle of compilation
Mixed Abstract Tree (MAT) built
• Abstraction with flow and SQL-specific info
• Query trees into MAT nodes
• XML file in the DLL dir
Converted to Pure Imperative Tree (PIT)
• General nodes of a tree that are SQL agnostic; C-like data structures
• Easier to turn into the final C code
Final C code built and written
• C instead of C++ is a simpler and faster compile
• No user-defined strings or SQL identifiers, to prevent injection
Call cl.exe to compile, link, and generate the DLL
• Much of what is needed to execute is in the DLL
• Some calls into the HK Engine
All files are in BINN\XTP: VC has compiler files, DLLs and libs; Gen has HK headers and libs
Call cl.exe to compile and link; done in memory
xtp_matgen XEvent
52. Inside the Memory of In-Memory OLTP
Hekaton implements its own memory management system built on SQLOS
• MEMORYCLERK_XTP (DB_ID_<dbid>) uses SQLOS Page allocator
• Variable heaps created per table and range index
• Hash indexes using partitioned memory objects for buckets
• “System” memory for database independent tasks
• Memory only limited by the OS (24TB in Windows Server 2016)
• Details in dm_db_xtp_memory_consumers and dm_xtp_system_memory_consumers
In-Memory does recognize SQL Server Memory Pressure
• Garbage collection is triggered
• If OOM, no inserts allowed but you may be able to DELETE to free up space
Hash index bucket memory is allocated at create index time
Locked Pages and Large Pages settings apply
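A monitoring sketch for the two DMVs named above; run the first in the user database for per-table/index heaps and hash buckets, and the second for instance-level HK system memory:

select memory_consumer_type_desc, memory_consumer_desc,
       sum(allocated_bytes) / 1024 / 1024 as allocated_mb,
       sum(used_bytes) / 1024 / 1024 as used_mb
from sys.dm_db_xtp_memory_consumers
group by memory_consumer_type_desc, memory_consumer_desc;

select memory_consumer_type_desc, allocated_bytes, used_bytes
from sys.dm_xtp_system_memory_consumers;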
53. Resource Governor and In-Memory OLTP
Remember this is ALL memory
SQL Server limits HK to ~70-90% of target depending on size
Freeing up memory = drop db (immediate) or delete data/drop table (deferred)
Binding to your own Resource Pool
Best way to control and monitor memory usage of HK tables
sp_xtp_bind_db_resource_pool required to bind db to RG pool
What about CPU and I/O?
RG I/O does not apply because we don’t use SQLOS for checkpoint I/O
RG CPU does apply for user tasks because we run under SQLOS host
No classifier function needed
Not the same as memory
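A minimal sketch of the binding described above (the pool name, limits, and database name are illustrative assumptions); per documented behavior the database has to come offline/online for the binding to take effect:

create resource pool imoltp_pool
    with (min_memory_percent = 50, max_memory_percent = 50);
alter resource governor reconfigure;

exec sp_xtp_bind_db_resource_pool
    @database_name = N'gorangers', @pool_name = N'imoltp_pool';

alter database gorangers set offline with rollback immediate;
alter database gorangers set online;    -- binding takes effect here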
55. Walk away with this
Want 30x faster for OLTP? Go In-Memory OLTP
Transactions are truly lock, latch, and spinlock free
Hash index for single row; Range for multi-row
Reduce transaction latency for super speed
SQL Server 2016 major step up from 2014
56. We are not done
• Move to GA for Azure SQL Database
• Increase surface area of features support for SQL Server
• Push the limits of hardware
• Speeding up recovery and HA with NVDIMM and RDMA
• Explore further areas of code optimization
• SQL 2016 CU2 fix for session state workloads Read more here
• Range index performance enhancements
57. Resources
• SQL Server In-Memory OLTP Internals for SQL Server 2016
• In-Memory OLTP Videos: What it is and When/How to use it
• Explore In-Memory OLTP architectures and customer case studies
• Review In-Memory OLTP in SQL Server 2016 and Azure SQL Database
• In-Memory OLTP (In-Memory Optimization) docs
59. Troubleshooting
From the CSS team:
• Validation errors: check out this blog
• Upgrade from 2014 to 2016 can take time
• Large checkpoint files in 2016 can take up more space and increase recovery times
• Log growths and XTP_CHECKPOINT waits: Hotfix 6051103
• Checkpoint files shut down with no detailed info: https://support.microsoft.com/en-us/kb/3090141
• Unable to rebuild the log: 6424109 (by design)
• NEVER set the HK filegroup offline, because you can't set it online again (a filestream limitation)
• You can't remove the HK filegroup after you add it
• LOB support in 2016 can cause memory issues
Other CSS observations
• A hash index doesn't have a concept of a "covered index"
• Low memory can cause recovery to fail (because we need everything in memory)
• No DBCC CHECKDB support (but checkpoint files and backups have checksums)
61. The Challenge
[Slide diagram: the traditional SQL Server engine architecture — Client Access (TDS handler, Session Mgmt, SNI), Backup/Restore, DBCC, Bulk load, Exception handling, Event logging, XEvents, Security, Support Utilities, UDTs & CLR, Stored Procs, Service Broker, T-SQL Parsing & Algebrizer, Query Optimization, Metadata, Procedure Cache, UCS, Query Execution & Expression Services, Access Methods, Transactions, Lock Manager, Buffer Pool, Logging & Recovery, File Manager, Database Manager, and SOS (process model: scheduling & synchronization; memory management & caching; I/O) — annotated with the approximate share of OLTP execution time spent in each layer: Network/TDS ~10%, T-SQL interpreter ~10%, Query Execution/Expressions/Access Methods ~35%, Transaction/Lock/Log Managers plus SOS, OS, and I/O ~45%.]
62. The In-Memory OLTP Thread Model
SQLOS tasks and worker threads are the foundation
• "User" tasks run transactions
• Hidden schedulers are used for critical background tasks, with SQLOS workers dedicated to HK
• Some tasks are dedicated while others use a "worker" pool
63. In-Memory OLTP Thread Pool
NON-PREEMPTIVE POOL
• Background workers needed on demand
• Example: parallel ALTER TABLE
WORKERS POOL
• Other workers needed on demand (non-preemptive)
• Example: parallel MERGE operation of checkpoint files
PREEMPTIVE POOL
• Background workers needed on demand, running preemptively
• Example: serializers that write out checkpoint files
Diagnostics
• Non-preemptive and workers pools: Command = XTP_THREAD_POOL, wait_type = DISPATCHER_QUEUE_SEMAPHORE
• Preemptive pool: Command = UNKNOWN TOKEN, wait_type = XTP_PREEMPTIVE_TASK
• Ideal count = <# schedulers>; idle timeout = 10 secs
• Pools run on both hidden (H) and user (U) schedulers
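The commands and wait types above can be watched from the request DMV; a minimal sketch (the background sessions show up only when these tasks are active or idle-waiting):
-- Look for In-Memory OLTP thread-pool background tasks and their waits
SELECT session_id, command, wait_type, status
FROM sys.dm_exec_requests
WHERE command IN ('XTP_THREAD_POOL', 'UNKNOWN TOKEN', 'XTP_CKPT_AGENT')
   OR wait_type IN ('DISPATCHER_QUEUE_SEMAPHORE', 'XTP_PREEMPTIVE_TASK');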
64. Data Modifications
• Multi-version optimistic concurrency prevents all blocking
• ALL UPDATEs are a DELETE followed by an INSERT
• DELETEd rows are not automatically removed from memory
• Deleted rows no longer visible to any active transaction become stale
• The Garbage Collection process removes stale rows from memory (the in-memory analog of page deallocation in SQL Server)
• TRUNCATE TABLE is not supported
65. Garbage Collection
This is the only way to reduce the size of HK tables
• Stale rows to be removed are queued and removed by multiple workers
• User workers can clean up stale rows as part of normal transactions
• Dedicated GC threads per NODE clean up stale rows (awoken based on the # of deletes)
• SQL Server memory pressure can accelerate garbage collection
• Dusty corner scan cleanup handles rows no transaction ever reaches
• Long-running transactions may prevent removal of deleted rows (visible != stale)
Diagnostics
• rows_expired = rows that are stale
• rows_expired_removed = stale rows removed from memory
• Perfmon: SQL Server 2016 XTP Garbage Collection
• dm_db_xtp_gc_cycle_stats
• dm_xtp_gc_queue_stats
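A sketch of reading those counters, run in the context of the memory-optimized database:
-- Stale-row counters per memory-optimized index
SELECT OBJECT_NAME(object_id) AS table_name, index_id,
       rows_expired, rows_expired_removed
FROM sys.dm_db_xtp_index_stats;
-- Garbage collection queue depth and activity
SELECT * FROM sys.dm_xtp_gc_queue_stats;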
66. Diagnostics for Native Procs
Query Plans and Stats
• SHOWPLAN_XML is the only option; the supported operator list can be found here
• sp_xtp_control_query_exec_stats and sp_xtp_control_proc_exec_stats turn on runtime statistics
Query Store
• Plans are stored by default
• The runtime stats store needs the above stored procedures to enable it (needs a recompile)
XEvent and SQLTrace
• sp_statement_completed works for XEvent but with more limited information
• SQLTrace only picks up overall proc execution at the batch level
Read the docs
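A minimal sketch of turning the collection on instance-wide and then reading the results; both procedures add overhead and are off by default:
-- Enable statistics collection for natively compiled procedures and their queries
EXEC sys.sp_xtp_control_proc_exec_stats @new_collection_value = 1;
EXEC sys.sp_xtp_control_query_exec_stats @new_collection_value = 1;
-- Native procs now report into the same DMVs as interpreted procedures
SELECT OBJECT_NAME(object_id, database_id) AS proc_name,
       execution_count, total_worker_time, total_elapsed_time
FROM sys.dm_exec_procedure_stats;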
67. Undo, tempdb, and BULK
Multi-row insert test
• Undo required 415 log records @ 46KB
• Hekaton required no logging (remember: SCHEMA_ONLY has no logging or I/O)
• Tempdb has a similar # of log records as a disk-based table but less per log record (aka minimal logging)
BULK INSERT
• BULK INSERT for in-memory tables executes and is logged just like INSERT
• Minimally logged BULK INSERT took 271 log records @ 27KB
• Latches are required for GAM, PFS, and system table pages
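For reference, a hedged sketch of the SCHEMA_ONLY durability option mentioned above; the table and column names are illustrative:
-- A non-durable memory-optimized table: the schema is persisted, the data is not,
-- so inserts generate no transaction log records and no checkpoint file I/O
CREATE TABLE dbo.SessionState
(
    SessionId INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
    SessionData VARBINARY(8000) NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);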
68. In-Memory OLTP pushes the log to the limit
Multiple log writers in SQL Server 2016
• One per NODE, up to four
• We were able to increase log throughput from 600 to 900 MB/sec
You could go to delayed durability
• The database is always consistent
• You could lose transactions
Log at the speed of memory
• The tail of the log is a "memcpy" to commit (watch the video)
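A sketch of the delayed durability trade-off described above; the database name is illustrative:
-- Let transactions commit before their log records are flushed to disk.
-- The database stays consistent, but the most recent commits can be lost on a failure.
ALTER DATABASE MyHKDatabase SET DELAYED_DURABILITY = ALLOWED;
-- Or force it for every transaction in the database
ALTER DATABASE MyHKDatabase SET DELAYED_DURABILITY = FORCED;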
69. The Merge Process
• Over time we could accumulate many CFPs
• Why not consolidate data and delta files into a smaller number of pairs?
• A MERGE TARGET file is the target of the merge of files; it becomes ACTIVE
• ACTIVE files that were the source of a merge become WAITING FOR LOG TRUNCATION
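To watch CFPs move through these states, a minimal sketch against the checkpoint-file DMV, run in the memory-optimized database (SQL Server 2016 column names):
-- Checkpoint file pairs grouped by type and state (ACTIVE, MERGE TARGET,
-- WAITING FOR LOG TRUNCATION, ...)
SELECT file_type_desc, state_desc,
       COUNT(*) AS file_count,
       SUM(file_size_in_bytes) / 1024 / 1024 AS total_size_mb
FROM sys.dm_db_xtp_checkpoint_files
GROUP BY file_type_desc, state_desc
ORDER BY file_type_desc, state_desc;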
70. Bob Ward
Principal Architect, Microsoft
Bob Ward is a Principal Architect for the Microsoft Data Group (Tiger Team). Bob has worked for Microsoft for 23 years supporting and speaking on every version of SQL Server shipped, from OS/2 1.1 to SQL Server 2016. He has worked in customer support as a principal escalation engineer and Chief Technology Officer (CTO), interacting with some of the largest SQL Server deployments in the world. Bob is a well-known speaker on SQL Server, often presenting talks on internals and troubleshooting at events such as the PASS Summit, SQLBits, SQLIntersection, and Microsoft Ignite. You can find him on Twitter at @bobwardms or read his blog at https://blogs.msdn.microsoft.com/bobsql.
/bobwardms @bobwardms Bob Ward
71. Explore Everything PASS Has to Offer
• Free online webinar events
• Free 1-day local training events
• Local user groups around the world
• Online special interest user groups
• Business analytics training
• Volunteering opportunities
• PASS community newsletter
• BA Insights newsletter
• Free online resources
72. Session Evaluations
Your feedback is important and valuable. 3 ways to access:
• Go to passSummit.com
• Download the GuideBook App and search: PASS Summit 2016
• Follow the QR code link displayed on session signage throughout the conference venue and in the program guide
Submit by 5pm Friday, November 6th to WIN prizes
The point of this slide is to talk about capabilities that this deck will not cover
Notice that log records are not even created until COMMIT.
That is the Greek word "Hekaton", which means one hundred; our original goal was to achieve 100x performance with In-Memory OLTP
Follow the readme.txt file in the inmem_oltp_transaction_compare folder
A request with command = XTP_CKPT_AGENT is used to keep FREE files replenished for CHECKPOINT files
Follow the readme.txt file in the inmem_oltp_architecture folder
Hash collision on name index with Ryan because it is a different value than Bob. The first two Bob rows are technically not a collision because they are the same key value
Follow the readme.txt file in the inmem_oltp_data_and_indexes folder
We will discuss the compilation process when talking about natively compiled DLLs
Follow the instructions in the readme.txt file in the inmem_oltp_schema_dll folder
T-SQL IsXTPSupported() = Windows API IsProcessorFeaturePresent(PF_COMPARE_EXCHANGE128)
= _InterlockedCompareExchange128() = Cmpxchg16b instruction
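The T-SQL side of that check is a one-liner:
-- Returns 1 if the hardware/OS supports In-Memory OLTP (128-bit compare-exchange available)
SELECT SERVERPROPERTY('IsXTPSupported') AS IsXTPSupported;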
Follow the instructions in the readme.txt file in the inmem_oltp_concurrency folder
We can switch to "large" files (1GB) if checkpoint has determined it can read 200MB/sec and the machine has 16 cores and 128GB for SQL Server
Follow the instructions in readme.txt in the inmem_oltp_checkpoint folder
Show the XML representing the MAT TREE
Find g_Bindings for a native proc DLL, set a breakpoint, then step through it
Probably cut slide
Each pool is sized to the number of schedulers after the first item is queued
Timeout after 10 secs of idle
NON-PREEMPTIVE and WORKERS POOL
Command = XTP_THREAD_POOL
Wait type = DISPATCHER_QUEUE_SEMAPHORE when idle
PREEMPTIVE POOL
Command = UNKNOWN TOKEN
Wait type = XTP_PREEMPTIVE_TASK