The document outlines the key phases and concepts in query optimization: 1) Parsing the SQL query into an internal representation like a query tree, 2) Applying transformation rules to put the query in canonical form, 3) Estimating the costs of different execution plans, and 4) Selecting the lowest cost plan. Key topics covered include relational algebra trees, transformation rules, heuristic strategies like pushing down selections, and using statistics and cost models to choose the most efficient query execution plan.
2. LECTURE PLAN
⇒ Motivation for Query Optimisation
⇒ Phases of Query Processing
⇒ Query Trees
⇒ RA Transformation Rules
⇒ Heuristic Processing Strategies
⇒ Cost Estimation for RA Operations
3. Motivation for Query Optimisation
List all the managers that work in the sales department.
SELECT *
FROM emp, dept
WHERE emp.deptno = dept.deptno
AND emp.job = 'Manager'
AND dept.name = 'Sales';
There are at least three alternative ways of representing this query as a Relational Algebra expression:
σ(job = ‘Manager’) ∧ (name = ‘Sales’) ∧ (emp.deptno = dept.deptno) (EMP X DEPT)
σ(job = ‘Manager’) ∧ (name = ‘Sales’) (EMP ⋈ emp.deptno = dept.deptno DEPT)
(σ(job = ‘Manager’) (EMP)) ⋈ emp.deptno = dept.deptno (σ(name = ‘Sales’) (DEPT))
4. Motivation for Query Optimisation
σ(job = ‘Manager’) ∧ (name=‘Sales’) ∧ (emp.deptno = dept.deptno) (EMP X DEPT)
Metrics:
1000 tuples in the EMP relation
50 tuples in the DEPT relation
50 employees are Managers (one per department)
5 separate Sales departments (across the country)
Cost of processing this query alternative:
Cartesian product of EMP and DEPT:
(1000 + 50) record I/O’s to read the relations
+ (1000 * 50) record I/O’s to create an intermediate relation to store result
Selection on result of Cartesian product:
(1000 * 50) record I/O’s to read tuples and compare against predicate
Total cost of the query:
(1000 + 50) + 2*(1000 * 50) = 101,050 record I/O’s.
5. Motivation for Query Optimisation
Metrics:
1000 tuples in the EMP relation
50 tuples in the DEPT relation
50 employees are Managers (one per department)
5 separate Sales departments (across the country)
Cost of processing the following query alternative:
σ(job = ‘Manager’) ∧ (name = ‘Sales’) (EMP ⋈ emp.deptno = dept.deptno DEPT)
Join of EMP and DEPT over deptno:
(1000 + 50) record I/O’s to read the relations
+ (1000) record I/O’s to create an intermediate relation to store join result
Selection on result of Join:
(1000) record I/O’s to read each tuple and compare against predicate
Total cost of the query:
(1000 + 50) + 2*(1000) = 3,050 record I/O’s.
6. Motivation for Query Optimisation
Cost of processing the following query:
(σ(job = ‘Manager’) (EMP)) ⋈ emp.deptno = dept.deptno (σ(name = ‘Sales’) (DEPT))
Select ‘Managers’ in EMP:
(1000) record I/O’s to read the relation
+ (50) record I/O’s to create an intermediate relation to store select result
Select ‘Sales’ in DEPT:
(50) record I/O’s to read the relation
+ (5) record I/O’s to create an intermediate relation to store select result
Join of previous two selections over deptno:
(50 + 5) record I/O’s to read the relations
Total cost of the query:
(1000 + 2*(50) + 5 + (50 + 5)) = 1,160 record I/O’s.
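The three totals can be reproduced with a few lines of arithmetic. The sketch below uses only the metrics given on the slides; the function and constant names are illustrative, and the cost model is the same simple record-I/O count used above, not a real optimiser's model.

```python
# Metrics from the slides.
EMP_TUPLES = 1000
DEPT_TUPLES = 50
MANAGERS = 50      # tuples surviving job = 'Manager'
SALES_DEPTS = 5    # tuples surviving name = 'Sales'


def cost_product_then_select():
    """Cartesian product first, then one selection over the result."""
    read_inputs = EMP_TUPLES + DEPT_TUPLES
    write_product = EMP_TUPLES * DEPT_TUPLES
    read_product = EMP_TUPLES * DEPT_TUPLES
    return read_inputs + write_product + read_product


def cost_join_then_select():
    """Join on deptno first; every employee matches exactly one department."""
    read_inputs = EMP_TUPLES + DEPT_TUPLES
    write_join = EMP_TUPLES
    read_join = EMP_TUPLES
    return read_inputs + write_join + read_join


def cost_select_then_join():
    """Both selections first, then a join of the two small results."""
    select_emp = EMP_TUPLES + MANAGERS        # read EMP, write 50 managers
    select_dept = DEPT_TUPLES + SALES_DEPTS   # read DEPT, write 5 departments
    join = MANAGERS + SALES_DEPTS             # read both intermediate results
    return select_emp + select_dept + join


for label, cost in [
    ("product, then select", cost_product_then_select()),
    ("join, then select", cost_join_then_select()),
    ("select, then join", cost_select_then_join()),
]:
    print(f"{label}: {cost} record I/Os")
```

The only thing that differs between the three functions is the size of the intermediate results, which is what the transformation rules and heuristics later in the lecture aim to reduce.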
8. Query Processing Stage - 1
Cast the query into internal form
This involves the conversion of the original (SQL)
query into some internal representation more suitable
for machine manipulation.
The internal representation typically chosen is either
some kind of ‘abstract syntax tree’, or a relational
algebra ‘query tree’.
9. Relational Algebra Query Trees
A Relational Algebra query can be represented as a ‘query tree’. For
example the query to list all the managers that work in the sales
department could be described as one of the following:
σ(job = ‘Manager’) ∧ (name=‘Sales’) ∧ (emp.deptno = dept.deptno) (EMP X DEPT)
σ(job = ‘Manager’) ∧ (name = ‘Sales’) ∧ (emp.deptno = dept.deptno)    (Root)
                          |
                          X                                          (Intermediate operations)
                         / \
                      EMP   DEPT                                     (Leaves)
11. Relational Algebra Query Trees
Alternative ‘query tree’ for the query to list all the managers that work in the sales department:
σ(job = ‘Manager’) ∧ (name = ‘Sales’) (EMP ⋈ emp.deptno = dept.deptno DEPT)
σ(job = ‘Manager’) ∧ (name = ‘Sales’)
                 |
     ⋈ emp.deptno = dept.deptno
            /         \
         EMP           DEPT
12. Relational Algebra Query Trees
Alternative ‘query tree’ for the query to list all the managers that work in the sales department:
(σ(job = ‘Manager’) (EMP)) ⋈ emp.deptno = dept.deptno (σ(name = ‘Sales’) (DEPT))
        ⋈ emp.deptno = dept.deptno
           /                 \
σ(job = ‘Manager’)      σ(name = ‘Sales’)
          |                    |
         EMP                  DEPT
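A query tree is easy to model in code: leaves are relations and internal nodes are operations applied to their children. The sketch below is one possible representation, not how any particular DBMS stores its plans. The class names, the `evaluate` function and the toy rows are all assumptions made for illustration. It builds the three trees for the managers-in-sales query and evaluates each one bottom-up.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Relation:          # leaf node
    name: str
    rows: list


@dataclass
class Select:            # sigma
    predicate: Callable[[dict], bool]
    child: object


@dataclass
class Product:           # X
    left: object
    right: object


@dataclass
class Join:              # theta-join
    condition: Callable[[dict], bool]
    left: object
    right: object


def evaluate(node):
    """Evaluate a tree bottom-up; attributes are qualified as 'relation.attr'."""
    if isinstance(node, Relation):
        return [{f"{node.name}.{k}": v for k, v in row.items()} for row in node.rows]
    if isinstance(node, Select):
        return [row for row in evaluate(node.child) if node.predicate(row)]
    if isinstance(node, Product):
        return [{**l, **r} for l in evaluate(node.left) for r in evaluate(node.right)]
    if isinstance(node, Join):
        pairs = ({**l, **r} for l in evaluate(node.left) for r in evaluate(node.right))
        return [row for row in pairs if node.condition(row)]
    raise TypeError(f"unknown node type: {type(node).__name__}")


# Toy data (hypothetical rows, not from the slides).
emp = Relation("emp", [
    {"name": "Ann", "job": "Manager", "deptno": 10},
    {"name": "Bob", "job": "Clerk", "deptno": 10},
    {"name": "Cy", "job": "Manager", "deptno": 20},
])
dept = Relation("dept", [
    {"deptno": 10, "name": "Sales"},
    {"deptno": 20, "name": "Research"},
])


def is_manager(row):
    return row["emp.job"] == "Manager"


def is_sales(row):
    return row["dept.name"] == "Sales"


def same_dept(row):
    return row["emp.deptno"] == row["dept.deptno"]


tree1 = Select(lambda r: is_manager(r) and is_sales(r) and same_dept(r), Product(emp, dept))
tree2 = Select(lambda r: is_manager(r) and is_sales(r), Join(same_dept, emp, dept))
tree3 = Join(same_dept, Select(is_manager, emp), Select(is_sales, dept))

for tree in (tree1, tree2, tree3):
    print(evaluate(tree))
```

All three trees return the same single row (Ann, the sales manager); they differ only in how many intermediate rows are produced on the way.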
13. Query Processing Stage - 2
Convert to canonical form
Find a more ‘efficient’ representation of the query by
converting the internal representation into some
equivalent (canonical) form through the application
of a set of well-defined ‘transformation rules’.
The set of transformation rules to apply will
generally be the result of the application of specific
heuristic processing strategies associated with
particular DBMSs.
14. 1. Conjunctive selection operations can cascade into
individual selection operations (and vice versa).
Sometimes referred to as cascade of selection.
σp∧q∧r(R) = σp(σq(σr(R)))
Example:
σdeptno=10 ∧sal>1000(Emp) = σdeptno=10(σsal>1000(Emp))
Transformation Rules for RA Operations
15. 2. Commutativity of selection
σp(σq(R)) = σq(σp(R))
Example:
σsal>1000(σdeptno=10(Emp)) = σdeptno=10(σsal>1000(Emp))
Transformation Rules for RA Operations
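Rules 1 and 2 can be checked on a small in-memory relation. This is a minimal sketch in which a relation is a list of dictionaries; the `select` helper and the sample rows are assumptions made for illustration.

```python
def select(predicate, rows):
    """sigma: keep the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


emp = [
    {"name": "Ann", "deptno": 10, "sal": 1500},
    {"name": "Bob", "deptno": 10, "sal": 900},
    {"name": "Cy", "deptno": 20, "sal": 2000},
]


def in_dept_10(row):
    return row["deptno"] == 10


def well_paid(row):
    return row["sal"] > 1000


# Rule 1: a conjunctive selection equals a cascade of single selections.
conjunctive = select(lambda r: in_dept_10(r) and well_paid(r), emp)
cascaded = select(in_dept_10, select(well_paid, emp))

# Rule 2: the order of the cascaded selections does not matter.
swapped = select(well_paid, select(in_dept_10, emp))

print(conjunctive == cascaded == swapped)
```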
16. 3. In a sequence of projection operations, only the last
in the sequence is required.
ΠLΠM … ΠN(R) = ΠL (R)
Example:
Πdeptno(Πdeptno, name(Dept)) = Πdeptno(Dept)
Transformation Rules for RA Operations
17. 4. Commutativity of selection and projection.
ΠA1,…,Am(σp(R)) = σp(ΠA1,…,Am(R))
where the attributes used in p ⊆ {A1, A2, …, Am}
Example:
Πname, job(σname=‘Smith’(Emp)) = σname=‘Smith’(Πname, job(Emp))
Transformation Rules for RA Operations
Selection predicate (p) is only
made up of projected attributes
18. 5. Commutativity of theta-join (and Cartesian product).
R ⋈p S = S ⋈p R
Transformation Rules for RA Operations
R X S = S X R
Example:
EMP ⋈ emp.deptno = dept.deptno DEPT
= DEPT ⋈ emp.deptno = dept.deptno EMP
NOTE: Theta-join is a generalisation
of both the equi-join and natural-join
19. 6. Commutativity of selection and theta-join
(or Cartesian product).
Transformation Rules for RA Operations
Example:
(σemp.deptno=10 (EMP)) ⋈ emp.deptno = dept.deptno DEPT
= σemp.deptno=10 (EMP ⋈ emp.deptno = dept.deptno DEPT)
(σp(R)) ⋈r S = σp(R ⋈r S)
where the attributes used in p ⊆ {A1, A2, …, Am}, the attributes of R
Selection predicate (p) is only
made up of attributes of R, one of the relations being joined
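Rule 6 is what makes it possible to push a selection below a join. The sketch below checks the equivalence on toy data and shows why the pushed-down form is cheaper: the join receives fewer input rows. The helper names and sample rows are assumptions made for illustration, and the join is a naive nested loop.

```python
def select(predicate, rows):
    """sigma: keep the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def theta_join(condition, left, right):
    """Naive nested-loop theta-join over two lists of dictionaries."""
    return [{**l, **r} for l in left for r in right if condition(l, r)]


emp = [
    {"ename": "Ann", "emp.deptno": 10},
    {"ename": "Bob", "emp.deptno": 10},
    {"ename": "Cy", "emp.deptno": 20},
    {"ename": "Di", "emp.deptno": 30},
]
dept = [
    {"dept.deptno": 10, "dname": "Sales"},
    {"dept.deptno": 20, "dname": "Research"},
    {"dept.deptno": 30, "dname": "Operations"},
]


def same_dept(l, r):
    return l["emp.deptno"] == r["dept.deptno"]


def dept_10(row):
    return row["emp.deptno"] == 10


# Selection applied after the join: all four employees are joined first.
joined_all = theta_join(same_dept, emp, dept)
late = select(dept_10, joined_all)

# Selection pushed below the join: only two employees reach the join.
filtered_emp = select(dept_10, emp)
pushed = theta_join(same_dept, filtered_emp, dept)

print(pushed == late, len(joined_all), len(filtered_emp))
```

The predicate refers only to attributes of EMP, which is the condition under which the selection can move to the EMP side of the join.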
20. 7. Commutativity of projection and theta-join
(or Cartesian product).
Transformation Rules for RA Operations
Example:
Π job, location, deptno (EMP ⋈ emp.deptno = dept.deptno DEPT)
= (Π job, deptno (EMP)) ⋈ emp.deptno = dept.deptno (Π location, deptno (DEPT))
ΠL(R ⋈r S) = (ΠL1(R)) ⋈r (ΠL2(S))
Project attributes L = L1 ∪ L2, where L1 are attributes of R, and
L2 are attributes of S. L will also contain the join attributes
21. 8. Commutativity of union and intersection
(but not set difference).
R ∪ S = S ∪ R
R ∩ S = S ∩ R
Transformation Rules for RA Operations
22. Transformation Rules for RA Operations
9. Commutativity of selection and set operations
(union, intersection, and set difference).
Union
σp(R ∪ S) = σp(R) ∪ σp(S)
Intersection
σp(R ∩ S) = σp(R) ∩ σp(S)
Set Difference
σp(R - S) = σp(R) - σp(S)
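Rule 9 can be verified with Python sets standing in for union-compatible relations. The values and the predicate below are arbitrary illustrations. The last check shows why the operand order must be preserved for set difference, which (unlike union and intersection) is not commutative.

```python
def select(predicate, relation):
    """sigma over a relation modelled as a set of values."""
    return {value for value in relation if predicate(value)}


def is_even(value):
    return value % 2 == 0


R = {1, 2, 3, 4, 5, 6}
S = {4, 5, 6, 7, 8}

union_ok = select(is_even, R | S) == select(is_even, R) | select(is_even, S)
intersection_ok = select(is_even, R & S) == select(is_even, R) & select(is_even, S)
difference_ok = select(is_even, R - S) == select(is_even, R) - select(is_even, S)

# Swapping the operands of the difference gives a different relation.
swapped_difference = select(is_even, S) - select(is_even, R)

print(union_ok, intersection_ok, difference_ok)
print(select(is_even, R - S), swapped_difference)
```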
23. 10. Commutativity of projection and union
ΠL(R ∪ S) = ΠL(S) ∪ ΠL(R)
Transformation Rules for RA Operations
24. 11. Associativity of natural join (and Cartesian product)
Natural Join
(R ⋈ S) ⋈ T = R ⋈ (S ⋈ T)
Cartesian Product
(R X S) X T = R X (S X T)
Transformation Rules for RA Operations
25. Transformation Rules for RA Operations
12. Associativity of union and intersection (but not set
difference)
Union
(R ∪ S) ∪ T = R ∪ (S ∪ T)
Intersection
(R ∩ S) ∩ T = R ∩ (S ∩ T)
26. Heuristic Processing Strategies
Perform selection operations as early as possible
Translate a Cartesian product and subsequent
selection (whose predicate represents a join condition)
into a join operation.
Use associativity of binary operations to ensure
that the most restrictive selection operations are
executed first
Perform projections as early as possible.
Compute common expressions once
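The first two heuristics can be sketched as a tiny rewrite step. The function below takes a selection over a Cartesian product, with the predicate already split into conjuncts (rule 1), and pushes each single-relation conjunct down to its relation while turning the cross-relation conjunct into a join condition. The tuple-based tree encoding and the name `rewrite_select_over_product` are assumptions made for illustration; a real optimiser works on a much richer plan representation.

```python
def rewrite_select_over_product(conjuncts, left, right):
    """Rewrite sigma_{c1 AND c2 AND ...}(left X right).

    Each conjunct is a pair (text, relations_it_mentions).
    """
    left_only = [text for text, rels in conjuncts if rels == {left}]
    right_only = [text for text, rels in conjuncts if rels == {right}]
    join_conditions = [text for text, rels in conjuncts if rels == {left, right}]

    def wrap(relation, predicates):
        leaf = ("rel", relation)
        if not predicates:
            return leaf
        # Heuristic: perform selections as early as possible.
        return ("select", " AND ".join(predicates), leaf)

    new_left = wrap(left, left_only)
    new_right = wrap(right, right_only)
    if join_conditions:
        # Heuristic: product + join-condition selection becomes a join.
        return ("join", " AND ".join(join_conditions), new_left, new_right)
    return ("product", new_left, new_right)


conjuncts = [
    ("emp.job = 'Manager'", {"EMP"}),
    ("dept.name = 'Sales'", {"DEPT"}),
    ("emp.deptno = dept.deptno", {"EMP", "DEPT"}),
]
optimised = rewrite_select_over_product(conjuncts, "EMP", "DEPT")
print(optimised)
```

Applied to the managers-in-sales query, the rewrite turns the first (most expensive) query tree into the third (cheapest) one.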
28. Query Processing Stage - 3
Choose candidate low-level procedures
Consider the (optimised canonical) query as a series
of low-level operations (join, restrict, etc…).
For each of these operations generate alternative
execution strategies and calculate the cost of such
strategies on the basis of statistical information held
about the database tables (files).
29. Query Processing Stage - 4
Generate query plans and choose the cheapest
Construct a set of ‘candidate’ Query Execution Plans (QEPs).
Each QEP is constructed by selecting a candidate
implementation procedure for each operation in the canonical
query and then combining them to form a string of associated
operations.
Each QEP will have an (estimated) cost associated with it – the
sum of the cost of each of its operations.
Choose the QEP with the least cost.
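Stage 4 amounts to enumerating combinations and taking the minimum. In the sketch below each operation has a few candidate implementation procedures, a QEP is one choice per operation, and its cost is the sum of the chosen costs. All cost figures are hypothetical placeholders, not values from the slides, and the operations are treated as independent, which real optimisers cannot always assume.

```python
from itertools import product

# Hypothetical estimated costs (block accesses) per candidate procedure.
candidates = {
    "select EMP where job = 'Manager'": {"linear search": 1000, "secondary index": 60},
    "select DEPT where name = 'Sales'": {"linear search": 50, "hash key": 5},
    "join on deptno": {"block nested loop": 250, "hash join": 165},
}

operations = list(candidates)
plans = []
for choice in product(*(candidates[op].items() for op in operations)):
    # choice holds one (procedure, cost) pair per operation, in order.
    procedures = {op: name for op, (name, _) in zip(operations, choice)}
    total_cost = sum(cost for _, cost in choice)
    plans.append((total_cost, procedures))

best_cost, best_plan = min(plans, key=lambda plan: plan[0])
print(len(plans), best_cost, best_plan)
```

With independent operations the cheapest plan is simply the cheapest procedure for each operation; the enumeration becomes necessary once one choice affects the cost of another, for example when a sort-merge join leaves its output sorted for a later operation.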
30. Cost Based Optimisation
Cost Based Optimisation (stages 3 & 4)
A good declarative query optimiser does not rely
solely on heuristic processing strategies.
It chooses the QEP with the lowest estimated cost.
After heuristic rules are applied to a query, there still
remain a number of alternative ways to execute it.
The Query Optimiser estimates the cost of executing
each one (or at least a number) of these alternatives, and
selects the cheapest one.
31. Costs associated with query execution
Secondary storage access costs:
Searching for data blocks on disk,
Reading data blocks from disk
Writing data blocks to disk
Storage costs
Cost of storing intermediate (temp) files
Computation costs
Cost of CPU usage
Main memory usage costs
Cost of buffering data
Communication costs
Cost of moving data across
32. Database statistics used in cost estimation
Information held on each relation:
number of tuples
number of blocks
blocking factor
primary access method
primary access attributes
secondary indexes
secondary indexing attributes
number of levels for each index
number of distinct values of each attribute
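Several of these statistics are derived from one another. The sketch below shows two common textbook derivations: the number of blocks from the tuple count and blocking factor, and the expected number of tuples matching an equality predicate from the distinct-value count, assuming values are uniformly distributed. The class name, field names and the blocking factor of 20 are assumptions made for illustration; the tuple and department counts come from the earlier slides.

```python
import math
from dataclasses import dataclass, field


@dataclass
class RelationStats:
    n_tuples: int
    blocking_factor: int                              # tuples per block
    n_distinct: dict = field(default_factory=dict)    # attribute -> distinct values

    @property
    def n_blocks(self):
        """Blocks needed to hold the relation."""
        return math.ceil(self.n_tuples / self.blocking_factor)

    def selection_cardinality(self, attribute):
        """Average number of tuples per attribute value (uniform assumption)."""
        return self.n_tuples / self.n_distinct[attribute]


emp = RelationStats(n_tuples=1000, blocking_factor=20, n_distinct={"deptno": 50})
print(emp.n_blocks, emp.selection_cardinality("deptno"))
```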
33. Physical Data Structures – File Types
Heap (Sequential, Unordered)
no key columns
queries, other than appends, scan every page
rows are appended at the end
duplicate rows are allowed
Ordered
physically sorted data file with no index
Hash (Random, Direct)
data is located based on the (calculated) value of a hash field (key)
Indexed Sequential (ISAM)
sorted data file with a primary index
B+ Tree
dynamic multilevel index
reuses deleted space on associated data pages
34. Strategies for implementing the RESTRICT operation
Different access strategies are used depending upon the structure of
the file in which the relation is stored, and whether the
predicate attribute(s) have been indexed/hashed. Each uses a
different cost algorithm (which refers to specific database statistics).
Linear Search (Heap)
Binary Search (Ordered)
Equality on Hash Key
Equality condition on primary key
Inequality condition on primary key
Equality condition on secondary index
Inequality condition on secondary B+ Tree index
If the selection predicate is a composite (AND & OR) then there
are additional cost considerations!
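To make "a different cost algorithm" concrete, the sketch below gives commonly quoted textbook estimates, in block accesses, for the first three strategies. These formulas are simplifying assumptions (no overflow chains for hashing, a single matching record for key equality), not the slides' own definitions, and the function names are illustrative.

```python
import math


def linear_search_cost(n_blocks, key_equality=False):
    """Heap file: scan every block, or half on average when an
    equality on a key lets the scan stop at the first match."""
    return math.ceil(n_blocks / 2) if key_equality else n_blocks


def binary_search_cost(n_blocks):
    """Ordered file, equality on the ordering key."""
    return max(1, math.ceil(math.log2(n_blocks)))


def hash_lookup_cost():
    """Equality on the hash key, assuming no overflow."""
    return 1


N_BLOCKS = 50  # e.g. 1000 tuples at a hypothetical 20 tuples per block
print(linear_search_cost(N_BLOCKS))
print(linear_search_cost(N_BLOCKS, key_equality=True))
print(binary_search_cost(N_BLOCKS))
print(hash_lookup_cost())
```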
35. Strategies for implementing the JOIN operation
Different access strategies are used depending upon the structure of the
files in which the relations to be joined are stored, and whether
the join attributes have been indexed/hashed. Each uses its
own cost algorithm (which refers to specific database statistics).
Block nested loop join
Indexed nested loop join
Sort-merge join
Hash join
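As a rough illustration of how the join strategies compare, the sketch below applies simplified textbook estimates for three of them (indexed nested loop is omitted because its cost depends on the index type). The formulas assume a minimal buffer for the nested loop, inputs already sorted for sort-merge, and a partitioned hash join; the block counts are hypothetical. Treat the numbers as a way to compare shapes of cost, not as predictions.

```python
def block_nested_loop_cost(outer_blocks, inner_blocks):
    """One buffer block per relation: the inner relation is
    rescanned once for every block of the outer relation."""
    return outer_blocks + outer_blocks * inner_blocks


def sort_merge_cost(r_blocks, s_blocks):
    """Merge phase only; assumes both inputs are already sorted
    on the join attribute."""
    return r_blocks + s_blocks


def hash_join_cost(r_blocks, s_blocks):
    """Read and write both inputs to partition them, then read
    every partition once more to probe."""
    return 3 * (r_blocks + s_blocks)


EMP_BLOCKS = 50   # hypothetical
DEPT_BLOCKS = 3   # hypothetical

estimates = {
    "block nested loop (DEPT outer)": block_nested_loop_cost(DEPT_BLOCKS, EMP_BLOCKS),
    "block nested loop (EMP outer)": block_nested_loop_cost(EMP_BLOCKS, DEPT_BLOCKS),
    "sort-merge (inputs pre-sorted)": sort_merge_cost(EMP_BLOCKS, DEPT_BLOCKS),
    "hash join": hash_join_cost(EMP_BLOCKS, DEPT_BLOCKS),
}
cheapest = min(estimates, key=estimates.get)
print(estimates)
print(cheapest)
```

Under these formulas the nested loop is cheaper with the smaller relation as the outer input, which is one reason the optimiser needs the block counts held in the database statistics.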
36. Query Optimisation Summary
The aims of query processing are to transform a query
written in a high-level language (SQL), into a correct and
efficient execution strategy expressed in a low-level
language (Relational Algebra), and to execute the strategy to
retrieve the required data.
There are many equivalent transformations of the same
high-level query; the DBMS has to choose the one that minimises
resource usage.
There are two main techniques for query optimisation. The
first uses heuristic rules that order the operations in a query.
The second compares different execution strategies for those
operations, based on their relative costs, and selects the least
resource intensive (cheapest) ones.