The document discusses the role of the SQL query optimizer in generating efficient query plans. It describes the optimizer's multi-stage process of parsing the SQL statement, binding objects, optimizing through different search levels, applying logical and physical properties and over 350 rules to simplify and optimize the query tree, and selecting the cheapest plan. It notes challenges like a large number of possible join orders and timeouts during complex optimizations.
Introducing 3 FREE Smart solutions for SQL Server (Adi Sapir, Docco Labs)
As database experts, we work with SQL Server databases on a daily basis. We face the same problems every SQL administrator and developer does, and we spend our time writing solutions for these problems! In this session, Adi will introduce the following three totally FREE solutions:
· ClipTable – A revolutionary new *anything* to SQL Table importer
· Database File Explorer – a much easier way to explore our database->filegroups->files->storage mapping
· Log Table Viewer – a complete client/server logger solution for SQL Server
Hekaton is the original project name for In-Memory OLTP and just sounds cooler for a title name. Keeping up the tradition of deep technical “Inside” sessions at PASS, this half-day talk will take you behind the scenes and under the covers on how the In-Memory OLTP functionality works with SQL Server.
We will cover “everything Hekaton”, including how it is integrated with the SQL Server engine architecture. We will explore how data is stored in memory and on disk, how I/O works, and how natively compiled procedures are built and executed. We will also look at how Hekaton integrates with the rest of the engine, including backup, restore, recovery, high availability, transaction logging, and troubleshooting.
Demos are a must for a half-day session like this, and what would an inside session be if we didn’t bring out the Windows Debugger? As with previous “Inside…” talks I’ve presented at PASS, this session is level 500 and not for the faint of heart. So read through the docs on In-Memory OLTP and bring some extra pain reliever as we move fast and go deep.
This session will appear as two sessions in the program guide but is not a Part I and II. It is one complete session with a small break so you should plan to attend it all to get the maximum benefit.
Postgres has the unique ability to act as a powerful data aggregator or information hub in many IT centers bringing together data from different databases and in different formats.
This presentation reviews Postgres' extensibility, foreign data wrappers, and ability to work with structured relational and unstructured NoSQL-like information such as documents and key-value data.
The Postgres capabilities are unrivaled in enabling a complete view of customers or businesses, analyzing disparate data together, and breaking down data silos within the enterprise.
If you would like to listen to the recording please visit EnterpriseDB > Resources > Webcasts > Ondemand Webcasts.
To speak to someone about EnterpriseDB's solutions and services please email sales@enterprisedb.com.
It has just been a few months since PostgreSQL 9.5 was released. We have got some of our customers excited about the great new features and performance enhancements in v9.5. But here we are, already taking a peek into the next version, and we find it awesome! One of the most awaited features – parallelism – makes it to Postgres. The infrastructure for parallelism has been added over the last few releases, but the first parallel operation in query execution will be seen only in v9.6.
There exist some valid reasons to rebuild indexes on an Oracle database (not many). This presentation is about some of those reasons and how to automate such online index rebuild.
SQL Server In-Memory OLTP: What Every SQL Professional Should Know (Bob Ward)
Perhaps you have heard the term “In-Memory” but are not sure what it means. If you are a SQL Server professional, then you will want to know. Even if you are new to SQL Server, you will want to learn more about this topic. Come learn the basics of how In-Memory OLTP technology in SQL Server 2016 and Azure SQL Database can boost your OLTP application by 30X. We will compare how In-Memory OLTP works vs. “normal” disk-based tables. We will discuss what is required to migrate your existing data into memory-optimized tables, or how to build a new set of data and applications to take advantage of this technology. This presentation will cover the fundamentals of what, how, and why this technology is something every SQL Server professional should know.
Ingesting Data from Kafka to JDBC with Transformation and Enrichment (Apache Apex)
Presenter - Dr Sandeep Deshmukh, Committer Apache Apex, DataTorrent engineer
Abstract:
Ingesting and extracting data from Hadoop can be a frustrating, time-consuming activity for many enterprises. Apache Apex Data Ingestion is a standalone big data application that simplifies the collection, aggregation, and movement of large amounts of data to and from Hadoop for a more efficient data processing pipeline. Apache Apex Data Ingestion makes configuring and running Hadoop data ingestion and data extraction a point-and-click process, enabling a smooth, easy path to your Hadoop-based big data project.
In this series of talks, we cover how Hadoop ingestion is made easy using Apache Apex. This third talk in the series focuses on ingesting unbounded data from Kafka to JDBC with a couple of processing operators: Transform and Enrich.
Based on the popular blog series, join me in taking a deep dive and a behind the scenes look at how SQL Server 2016 “It Just Runs Faster”, focused on scalability and performance enhancements. This talk will discuss the improvements, not only for awareness, but expose design and internal change details. The beauty behind ‘It Just Runs Faster’ is your ability to just upgrade, in place, and take advantage without lengthy and costly application or infrastructure changes. If you are looking at why SQL Server 2016 makes sense for your business you won’t want to miss this session.
Tempto is a product test framework that allows developers to write and execute tests for SQL databases running on Hadoop. Individual test requirements, such as data generation, HDFS file copy/storage of generated data, and schema creation, are expressed declaratively and are automatically fulfilled by the framework. Developers can write tests using Java (with a TestNG-like paradigm and AssertJ-style assertions) or by providing query files with expected results. We will show how we use it for Presto product tests.
Benchto is a benchmark framework that provides an easy and manageable way to define, run, and analyze macro benchmarks in a clustered environment. Understanding the behavior of distributed systems is hard and requires good visibility into the state of the cluster and the internals of the tested system. This project was developed for repeatable benchmarking of Hadoop SQL engines, most importantly Presto.
Percona XtraDB Cluster (Galera) is one of the best database solutions providing synchronous replication. Features like automatic recovery, GTIDs, and multi-threaded replication, along with XtraDB and XtraBackup, make it powerful.
A good solution for MySQL HA.
In this second part on Redshift, we present the case study of Movile, a mobile commerce leader with 50 million users, and analyze advanced topics such as compression, built-in SQL macros, and multidimensional indexes for large databases.
Intro to Apache Apex - Next Gen Platform for Ingest and Transform (Apache Apex)
Introduction to Apache Apex - The next generation native Hadoop platform. This talk will cover details about how Apache Apex can be used as a powerful and versatile platform for big data processing. Common usage of Apache Apex includes big data ingestion, streaming analytics, ETL, fast batch alerts, real-time actions, threat detection, etc.
Bio:
Pramod Immaneni is an Apache Apex PMC member and senior architect at DataTorrent, where he works on Apache Apex and specializes in big data platforms and applications. Prior to DataTorrent, he was co-founder and CTO of Leaf Networks LLC, eventually acquired by Netgear Inc., where he built products in the core networking space and was granted patents in peer-to-peer VPNs.
Evolving the Optimal Relevancy Ranking Model at Dice.com (Simon Hughes)
This is a talk about gathering a golden test set of relevancy judgements, either using manual annotators or search log mining, to use in either an automated or manual relevancy tuning process. We also discuss the dangers of positive feedback loops when building closed-loop machine learning models for search and recommendation.
When it comes to user experience, a snappy application beats a glamorous one. Nothing frustrates an end user more than a slow application. Did you know that any wait time greater than one second will break a user's concentration and cause them to feel frustration? How can we create applications that meet user expectations? This class will cover all things performance, from design to delivery. We will go over application design, user interface guidelines, caching guidelines, code optimizations, and query optimizations.
Grails has great performance characteristics but as with all full stack frameworks, attention must be paid to optimize performance. In this talk Lari will discuss common missteps that can easily be avoided and share tips and tricks which help profile and tune Grails applications.
Dynamic data processing tools to minimize time spent on chromatogram review and integration (Dynamic Data Linking, SmartLink, Cobra, SmartPeaks).
Learn more about our chromatography data system Chromeleon: http://www.thermoscientific.com/en/about-us/general-landing-page/chromeleon-resource-center.html?ca=chromeleon
MySQL Optimization from a Developer's Point of View (Sachin Khosla)
Optimization from a developer's point of view. Optimization is not only the duty of a DBA; it should be done by everyone involved in the ecosystem.
Before migrating from 10g to 11g or 12c, take the following considerations into account. It is not as simple as just switching the database engine; considerations must also be made at the application level.
Big Data and New Challenges for DBAs (Michael Naumov, LivePerson)
Hadoop has become a popular platform for managing large datasets of structured and unstructured data. It does not replace existing infrastructures, but instead augments them. Most companies will still use relational databases for transactional processing and low-latency queries, but can benefit from Hadoop for reporting, machine learning or ETL. This session will cover:
What is Hadoop and why do I care?
What do people do with Hadoop?
How can SQL Server DBAs add Hadoop to their architecture?
Common Errors That Affect Performance (Adi Cohen, Naya-Tech)
There are a few common errors that have a negative effect on performance. In this session we will review some of them, see why they impact performance and provide alternative solutions. Among the issues we will cover are:
· Misunderstanding of the query plan when using procedures
· Query plan differences between procedures and ad-hoc batches
· The differences between a temporary table and a table variable
· And many more…
Who is afraid of Columnstore Indexes? (Michael Zilberstein, DB-Art)
This talk describes the new SQL Server 2012 feature called the “columnstore index”. In this session we will learn about the differences between columnstore indexes and the B-Tree indexes we are used to working with. We will see when it is best to use this new index and when not to. We will cover the limitations that a columnstore index imposes on the tables that use it, and how to live with those limitations. As in all my sessions, I won't let you go without some internals – how a columnstore index is organized at the physical level and how the Query Processor works with this new type of index. And of course: Demos, Demos, Demos…
2. OPTIMIZER’S ROLE
• The optimizer should generate an efficient plan to access the data that we need to work with.
3. WHY DO WE NEED IT?
• A SQL statement defines what we are looking for. It doesn’t define how to get the data.
• The optimizer has to generate a query plan which tells the server how to get the data.
9. PARSING
• Checks that the syntax is correct
• Creates a logical tree that represents the query
10. DEMO 1
• See the logical tree that was produced by the parse step
11. BINDING
• Takes the logical tree that was produced by the parsing step
• Makes sure that all objects referenced by the logical tree exist and that the user can see them
13. TRIVIAL PLAN
• Very simple queries with a very simple logical tree won’t get full optimization
• The optimizer will generate a plan that is called a trivial plan
• The trivial plan is very cheap to generate and it won’t be inserted into the plan cache
14. TRIVIAL PLAN
• If the trivial plan’s cost is more than the value configured as “cost threshold for parallelism”, the query will get full optimization
• You can disable the trivial plan by using trace flag 8757
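A quick way to see whether a statement received a trivial plan is to inspect its showplan XML, where the StatementOptmLevel attribute reads TRIVIAL or FULL. A minimal sketch (the dbo.Orders table is a hypothetical example):

```sql
-- Request the estimated plan as XML instead of executing the query.
SET SHOWPLAN_XML ON;
GO
-- A single-table equality lookup is usually simple enough for a
-- trivial plan; check StatementOptmLevel in the returned XML.
SELECT OrderID FROM dbo.Orders WHERE OrderID = 42;
GO
SET SHOWPLAN_XML OFF;
GO
```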
15. SIMPLIFICATION
• The optimizer rewrites the tree to make it simpler. There are a few simplification methods:
– Constant folding
– Domain simplification
– Contradiction detection
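Two of these methods are easy to observe in an execution plan. A small sketch, again using a hypothetical dbo.Orders table:

```sql
-- Constant folding: the optimizer evaluates constant expressions at
-- compile time, so this predicate appears in the plan as Quantity = 30.
SELECT * FROM dbo.Orders WHERE Quantity = 10 + 20;

-- Contradiction detection: this predicate can never be true, so the
-- optimizer replaces the table access with an empty Constant Scan.
SELECT * FROM dbo.Orders WHERE Quantity > 10 AND Quantity < 5;
```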
17. OPTIMIZATION LEVELS
• If no trivial plan was found, SQL Server starts optimizing the query.
• SQL Server has 3 stages, called:
– Search 0 (Transaction processing)
– Search 1 (Quick plan)
– Search 2 (Full optimization)
18. OPTIMIZATION LEVELS
• Each level has an entry and a termination condition.
• The termination condition can be that a good enough plan was found, or that too much time has passed
• Optimization can begin at a low search step and progress to a higher search step
19. PROPERTIES
• Each node in the logical tree has properties attached to it.
• There are 2 types of properties:
– Logical properties (node cost, output columns, type information and nullability, etc.)
– Physical properties (sort order, partition information, etc.)
20. RULES
• The optimizer has a set of more than 350 rules that it uses to optimize the query.
• The rules help the optimizer modify the logical tree in a way that doesn’t affect the query’s results
• The rules also dictate the physical implementation of the logical tree
21. RULES
• There are four types of rules that can be used:
– Simplification rules
– Exploration rules
– Implementation rules
– Physical property enforcement rules
22. MEMO
• All the trees are stored in memory in a structure that is called the Memo
• Each optimization has its own Memo
• A single Memo can get to a size of 1.6 GB
23. OPTIMIZATION PROBLEMS
• On rare occasions there can be a timeout for the optimization process
• Most times when this happens, it will be at the Search 2 stage, and the server will use the query plan that was produced at the Search 1 stage.
• Sometimes the server stops optimization because of memory pressure
24. DMVs
• There are 2 DMVs that can give us information about the optimizer:
– sys.dm_exec_query_optimizer_info shows information about the optimization process
– sys.dm_exec_query_transformation_stats shows information about rule usage
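Both DMVs can be queried directly. A minimal sketch of how one might read them (counter names and availability vary by SQL Server build, and the transformation-stats view typically requires elevated permissions):

```sql
-- Cumulative optimizer statistics since the server started; look for
-- counters such as 'optimizations', 'trivial plan', 'search 0',
-- 'search 1', 'search 2', and 'timeout'.
SELECT counter, occurrence, value
FROM sys.dm_exec_query_optimizer_info
ORDER BY occurrence DESC;

-- Which transformation rules the optimizer has attempted, and how
-- often they succeeded.
SELECT name, promise_total, built_substitute, succeeded
FROM sys.dm_exec_query_transformation_stats
ORDER BY succeeded DESC;
```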