The document transcribes a database-systems lecture covering B+ trees, hash-based indexing, external sorting, join algorithms, and query optimization. For external sorting it describes a two-phase approach using limited buffer space in memory: the first phase creates sorted runs, and the second phase repeatedly merges runs in pairs until a single sorted run is produced, using three buffer pages - two for input runs and one for the merged output. Merging compares the next elements of two sorted runs and writes the smallest to the output page.
2. logistics
• assignment 1 is up
• cowbook available at learning center beta, otakaari 1 x
• if you do not have access to the lab, provide Aalto username or email today!!
• if you do not have access to mycourses, i will post material (slides and assignments) also at http://michalis.co/moderndb/
3. in this lecture...
b+ trees and hash-based indexing
external sorting
join algorithms
query optimization
5. b+ trees
leaf nodes
contain data-entries
sequentially linked
each node stored in one page
data entries can be any one of the three alternative types
type 1: data records; type 2: (k, rid); type 3: (k, rids)
at least 50% capacity - except for root!
non-leaf nodes
index entries
used to direct search
in the examples that follow...
alternative 2 is used
all nodes have between d and 2d key entries
d is the order of the tree
6. b+ trees
closer look at non-leaf nodes
index entries, used to direct search: search key values and pointers
P0 | K1 | P1 | K2 | P2 | ... | Km | Pm
k* < K1 → follow P0;  K1 ≤ k* < K2 → follow P1;  ...;  Km ≤ k* → follow Pm
leaf nodes
contain data-entries
sequentially linked
7. b+ trees
most widely used index
search and updates at cost log_F(N) (cost = number of page I/Os)
F = fanout (number of pointers per index node); N = number of leaf pages
efficient equality and range queries
non-leaf nodes: index entries, used to direct search
leaf nodes: contain data-entries, sequentially linked
8. example b+ tree - search
search begins at root, and key comparisons direct it to a leaf
search for 5*; search for all data entries >= 24*
(figure: root holds keys 13, 17, 24, 30; leaves hold 2* 3* 5* 7* | 14* 16* | 19* 20* 22* | 24* 27* 29* | 33* 34* 38* 39*)
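a minimal sketch of the search just traced, assuming simple in-memory nodes; the Node fields and function names are illustrative (real nodes live in disk pages):

class Node:
    def __init__(self, keys, children=None, entries=None, next_leaf=None):
        self.keys = keys            # search key values K1 .. Km
        self.children = children    # pointers P0 .. Pm (non-leaf nodes only)
        self.entries = entries      # data entries (k, rid) (leaf nodes only)
        self.next_leaf = next_leaf  # sequential link between leaves

def search_leaf(node, k):
    # descend from the root: key comparisons direct the search to a leaf
    while node.children is not None:
        i = 0
        while i < len(node.keys) and k >= node.keys[i]:
            i += 1                  # follow Pi where Ki <= k* < K(i+1)
        node = node.children[i]
    return node

def range_scan(root, low):
    # all data entries with key >= low, via the sequential leaf links
    leaf = search_leaf(root, low)
    while leaf is not None:
        for key, rid in leaf.entries:
            if key >= low:
                yield key, rid
        leaf = leaf.next_leaf

on the tree above, search_leaf(root, 5) follows 5 < 13 into the leftmost leaf, and range_scan(root, 24) walks the leaf links from 24* onward.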
9. inserting a data entry
1. find correct leaf L
2. place data entry onto L
a. if L has enough space, done!
b. else must split L into L and L2
• redistribute entries evenly
• copy up the middle key to parent of L
• insert entry pointing to L2 to parent of L
the above happens recursively
when index nodes are split, push up middle key
splits grow the tree
root split increases height
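a minimal sketch of step 2b for a leaf, reusing the Node class from the search sketch; the capacity 2d and the (key, rid) entry format are assumptions:

import bisect

def insert_into_leaf(leaf, entry, d):
    # entry is a (key, rid) pair; leaf.entries is kept sorted
    bisect.insort(leaf.entries, entry)          # 2. place data entry onto L
    if len(leaf.entries) <= 2 * d:
        return None                             # 2a. enough space: done!
    mid = len(leaf.entries) // 2                # 2b. split L into L and L2,
    l2 = Node(keys=[], entries=leaf.entries[mid:],   # redistribute evenly
              next_leaf=leaf.next_leaf)
    leaf.entries = leaf.entries[:mid]
    leaf.next_leaf = l2
    # the middle key is *copied* up: it continues to appear in leaf L2
    return l2.entries[0][0], l2   # parent inserts (key, pointer to L2)

an index-node split is analogous, except the middle key is pushed up (moved, not copied).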
10. example b+ tree
insert 8*
(figure: before the insert, the root holds 13, 17, 24, 30; inserting 8* overfills leaf L = 2* 3* 5* 7*)
split! L is split into L = 2* 3* and L2 = 5* 7* 8*
the middle key 5 is copied up (and continues to appear in the leaf)
the parent now overflows too: split parent node!
the middle key 17 is pushed up into a new root
(figure: after the insert, the new root holds 17, directing search with < 17 / ≥ 17; its children hold 5, 13 and 24, 30; the new entry 5 directs search with < 5 / ≥ 5)
12. deleting a data entry
inverse of insertion
insertion: add data entry; when nodes overflow, split nodes & re-distribute entries
deletion: remove data entry; when nodes are less than half-full, re-distribute entries & (maybe) merge nodes
13. b+ trees in practice
typical order d = 100, fill-factor = 67%
average fan-out 133
typical capacities:
for height 4: 133^4 = 312,900,721 records
for height 3: 133^3 = 2,352,637 records
can often hold top levels in main memory
level 1: 1 page = 8 KBytes
level 2: 133 pages = 1 MByte
level 3: 17,689 pages = 133 MBytes
15. hash-based index
the index supports equality queries
does not support range queries
static and dynamic variants exist
data entries organized in M buckets
bucket = a collection of pages
the data entry for a record with search key value key is assigned to bucket h(key) mod M
hash function h(key), e.g., h(key) = α·key + β
(figure: key → h → h(key) mod M → one of buckets 0, 1, 2, ..., M-1)
16. static hashing
number of buckets M is fixed
start with one page per bucket
allocated sequentially, never de-allocated
can use overflow pages
(figure: key → h → h(key) mod M → buckets 0 ... M-1)
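a minimal sketch of static hashing as just described; PAGE_CAPACITY and the class name are illustrative:

PAGE_CAPACITY = 4                # data entries per page (illustrative)

class StaticHashIndex:
    def __init__(self, M, h=hash):
        self.M, self.h = M, h
        # number of buckets is fixed; each bucket starts with one page
        self.buckets = [[[]] for _ in range(M)]

    def insert(self, key, rid):
        pages = self.buckets[self.h(key) % self.M]
        if len(pages[-1]) == PAGE_CAPACITY:
            pages.append([])     # primary page full: add an overflow page
        pages[-1].append((key, rid))

    def lookup(self, key):
        # equality query: walk the (possibly long) overflow chain
        pages = self.buckets[self.h(key) % self.M]
        return [rid for page in pages for k, rid in page if k == key]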
17. static hashing
drawback
long overflow chains can degrade performance
dynamic hashing techniques
adapt index to data size
extendible and linear hashing
18. extendible hashing
problem: bucket becomes full
one solution
double the number of buckets...
...and redistribute data entries
however
reading and re-writing all buckets is expensive
better idea:
use directory of pointers to buckets
double number of ‘logical’ buckets…
but split ‘physically’ only the overflown bucket
directory much smaller than data entry pages - good!
no overflow pages - good!
19. example
(figure: directory with global depth 2 and slots 00, 01, 10, 11, each pointing to a bucket of data entries, all with local depth 2; bucket A: 4* 12* 32* 16*, bucket B: 1* 5* 21* 13*, bucket C: 10*, bucket D: 15* 7* 19*)
directory is an array of size M = 4 = 2^2
to find the bucket for r, take the last 2 bits of h(r)
here h(r) = key
e.g., if h(r) = 5 = binary 101, it is in the bucket pointed to by 01
global depth = 2 = min. bits enough to enumerate buckets
local depth = min. bits to identify an individual bucket = 2
20. insertion
(figure: same directory and buckets as in the previous example)
try to insert the entry into the corresponding bucket
if the bucket is full, increase its local depth by 1 and split the bucket (allocate a new bucket, re-distribute entries)
if necessary, double the directory: i.e., when for the split bucket local depth > global depth
when the directory doubles, increase the global depth by 1
21. example
insert record with h(r) = 20 = binary 10100 → bucket A
split bucket A!
allocate a new page, redistribute according to modulo 2M = 8, i.e., the 3 least significant bits
we'll have more than 4 buckets now, so double the directory!
(figure: directory with global depth 2 and slots 00, 01, 10, 11; buckets A-D with local depth 2; A: 4* 12* 32* 16*, B: 1* 5* 21* 13*, C: 10*, D: 15* 7* 19*)
22. example
(figure: after the split, the directory has slots 000-111 and global depth 3; bucket A: 32* 16* with local depth 3; its split image A2: 4* 12* 20* with local depth 3; buckets B: 1* 5* 21* 13*, C: 10*, D: 15* 7* 19* keep local depth 2)
split bucket A and redistribute entries
update local depth
double the directory
update global depth
update pointers
23. notes
20 = binary 10100
last 2 bits (00) tell us r belongs in A or A2
last 3 bits needed to tell which
global depth of directory
number of bits enough to determine which bucket any entry belongs to
local depth of a bucket
number of bits enough to determine if an entry belongs to this bucket
when does bucket split cause directory doubling?
before insert, local depth of bucket = global depth
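a minimal sketch tying these notes together, assuming python's built-in hash as h and a page capacity of 4; class and field names are illustrative:

PAGE_CAPACITY = 4

class Bucket:
    def __init__(self, local_depth):
        self.local_depth = local_depth
        self.entries = []

class ExtendibleHash:
    def __init__(self):
        self.global_depth = 2
        self.directory = [Bucket(2) for _ in range(4)]   # 4 = 2^2 slots

    def _slot(self, key):
        # last global_depth bits of h(key) pick the directory slot
        return hash(key) & ((1 << self.global_depth) - 1)

    def insert(self, key):
        bucket = self.directory[self._slot(key)]
        if len(bucket.entries) < PAGE_CAPACITY:
            bucket.entries.append(key)
            return
        # full bucket: double the directory only if local depth = global depth
        if bucket.local_depth == self.global_depth:
            self.directory += self.directory
            self.global_depth += 1
        # split only the overflown bucket: +1 local depth, re-distribute
        bucket.local_depth += 1
        image = Bucket(bucket.local_depth)
        bit = 1 << (bucket.local_depth - 1)   # newly distinguishing bit
        old = bucket.entries
        bucket.entries = [k for k in old if not hash(k) & bit]
        image.entries = [k for k in old if hash(k) & bit]
        # re-point the directory slots whose distinguishing bit is set
        for i, b in enumerate(self.directory):
            if b is bucket and i & bit:
                self.directory[i] = image
        self.insert(key)                      # retry; may split again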
24. example
(figure: directory with global depth 3; bucket A: 32* 16* (local depth 3), A2: 4* 12* 20* (local depth 3), B: 1* 5* 21* 13* (local depth 2), C: 10* (local depth 2), D: 15* 7* 19* (local depth 2))
insert h(r) = 17
split bucket B
(figure: after the split, bucket B holds 1* 17* and its split image B2 holds 5* 21* 13*, both with local depth 3; global depth stays 3; other buckets not shown)
25. comments on extendible hashing
if directory fits in memory,
equality query answered in one disk access
answered = retrieve rid
directory grows in spurts
if hash values are skewed, it might grow large
delete: reverse algorithm
empty bucket can be merged with its ‘split image’
when can the directory be halved?
when all directory elements point to same bucket as their ‘split image’
27. create index
CREATE INDEX indexb
ON students (age, grade)
USING BTREE;
CREATE INDEX indexh
ON students (age, grade)
USING HASH;
DROP INDEX indexh
ON students;
29. the sorting problem
setting
a relation R, stored over N disk pages
3 ≤ B < N pages available in memory (buffer pages)
task
sort records of R and store result on disk
sort by a function of record field values f(r)
why
applications need records ordered
part of join implementation (soon...)
30. sorting with 3 buffer pages
2 phases
(figure: input relation R, N pages stored on disk → buffer (memory used by the dbms) → output: R sorted by f(r), N pages stored on disk)
31. sorting with 3 buffer pages - first phase
pass 0: output N runs
run: sorted sub-file
after the first phase: one run is one page
how: load one page at a time, sort it in-memory, output to disk
only 1 buffer page is needed for the first phase
(figure: input relation R, N pages on disk → buffer → output: N runs, run #1, run #2, ..., run #N)
32. sorting with 3 buffer pages - second phase
pass 1, 2, ...: halve the number of runs
how: scan pairs of runs, each in its own input page, merge in-memory into a new run, output to disk
(figure: runs #1 ... #N on disk → input page 1, input page 2, output page in the buffer → output: half as many runs, run #1 ... run #N/2)
33-43. merge?
merge the two sorted input pages into the output page, maintaining sorted order: compare the next smallest value from each page, move the smallest to the output page
worked example (values are f(r)): input page 1 holds 8 4 3 1, input page 2 holds 7 6 5 2
move 1, then 2, then 3, then 4 → the output page holds 4 3 2 1
the output page is full, what do we do? write it to disk!
continue: move 5, then 6, then 7 → input page 2 is empty!
what do we do? if the input run has more pages, load the next one; here the run is done, so move 8
the second output page 8 7 6 5 is written, both input pages are empty, and the merge is complete
(see the sketch below)
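a minimal sketch of the two-run merge just traced; pages are modeled as python lists of PAGE_SIZE values, and pages1 / pages2 / write_page are illustrative stand-ins for run I/O:

PAGE_SIZE = 4                 # values per page (illustrative)

def merge_two_runs(pages1, pages2, write_page):
    # pages1 / pages2: iterators over the sorted pages of each run
    # write_page: called whenever the output page is full
    def load(run):            # input page is empty: load the run's next page
        return list(next(run, []))
    in1, in2 = load(pages1), load(pages2)
    out = []
    while in1 or in2:
        # compare the next smallest value from each page,
        # move the smallest to the output page
        if not in2 or (in1 and in1[0] <= in2[0]):
            out.append(in1.pop(0))
        else:
            out.append(in2.pop(0))
        if not in1: in1 = load(pages1)
        if not in2: in2 = load(pages2)
        if len(out) == PAGE_SIZE:
            write_page(out)   # output page is full: write it to disk!
            out = []
    if out:
        write_page(out)

# merging the runs [[1, 3, 4, 8]] and [[2, 5, 6, 7]]
# writes the pages [1, 2, 3, 4] and then [5, 6, 7, 8]
merge_two_runs(iter([[1, 3, 4, 8]]), iter([[2, 5, 6, 7]]), print)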
44. sorting with 3 buffer pages - second phase
pass 1, 2, ...: halve the number of runs
after ⌈log2 N⌉ passes... we are done!
(figure: runs #1 ... #N merged pairwise through input page 1, input page 2, output page → output: R sorted by f(r), N pages stored on disk)
45. sorting with B buffer pages
same approach
(figure: input relation R, N pages stored on disk → buffer of B pages: page #1, page #2, ..., page #(B-1), page #B → output: R sorted by f(r), N pages stored on disk)
46. sorting with B buffer pages - first phase
pass 0: output ⌈N/B⌉ runs
how: load R into memory in chunks of B pages, sort each chunk in-memory, output to disk
(figure: input relation R → B buffer pages → output: run #1, run #2, ..., run #⌈N/B⌉)
47. sorting with B buffer pages - second phase
pass 1, 2, ...: merge runs in groups of B-1
(figure: runs #1, #2, ..., #⌈N/B⌉ → input pages #1 ... #(B-1) plus one output page → output: R sorted by f(r))
48. sorting with B buffer pages
how many passes in total?
let N1 = ⌈N/B⌉ (the number of runs after phase 1)
total number of passes = 1 + ⌈log_{B-1}(N1)⌉  (phase 1 + phase 2)
(figure: as before)
49. sorting with B buffer pages
how many page I/Os per pass?
2N: N input, N output
(figure: as before)
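an end-to-end sketch of the algorithm with B buffer pages, modeling the relation as a list of pages; heapq.merge stands in for the streaming (B-1)-way in-memory merge, and PAGE_SIZE is illustrative:

import heapq, math

PAGE_SIZE = 4   # records per page (illustrative)

def paginate(records):
    return [records[i:i + PAGE_SIZE] for i in range(0, len(records), PAGE_SIZE)]

def external_sort(pages, B, f=lambda r: r):
    # phase 1 (pass 0): sort B-page chunks in memory -> ceil(N/B) runs
    runs = []
    for i in range(0, len(pages), B):
        chunk = [r for page in pages[i:i + B] for r in page]
        runs.append(paginate(sorted(chunk, key=f)))
    # phase 2 (passes 1, 2, ...): merge runs in groups of B-1
    while len(runs) > 1:
        next_runs = []
        for i in range(0, len(runs), B - 1):
            group = [[r for page in run for r in page]
                     for run in runs[i:i + B - 1]]
            next_runs.append(paginate(list(heapq.merge(*group, key=f))))
        runs = next_runs
    return runs[0] if runs else []

# the pass count matches the formula above: with N = 1000 pages and B = 11,
# N1 = ceil(1000 / 11) = 91 runs, so 1 + ceil(log_10(91)) = 3 passes,
# i.e., 3 * 2N = 6000 page I/Os
N, B = 1000, 11
print(1 + math.ceil(math.log(math.ceil(N / B), B - 1)))   # -> 3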
51. joins
so far, we have seen queries that
operate on a single relation
but we can also have queries that
combine information
from two or more relations
52. joins
students
sid    name         username  age
53666  Sam Jones    jones     22
53688  Alice Smith  smith     22
53650  Jon Edwards  jon       23
dbcourse
sid    points  grade
53666  92      A
53650  65      C
SELECT *
FROM students S, dbcourse C
WHERE S.sid = C.sid
what does this compute?
S.sid  S.name       S.username  S.age  C.sid  C.points  C.grade
53666  Sam Jones    jones       22     53666  92        A
53650  Jon Edwards  jon         23     53650  65        C
53-55. joins
SELECT *
FROM students S, dbcourse C
WHERE S.sid = C.sid
intuitively...
take all pairs of records from S and C (the "cross product" S x C):
S record #1 with C record #1, S record #1 with C record #2, S record #1 with C record #3, ...,
S record #2 with C record #1, S record #2 with C record #2, ...
keep only the pairs that satisfy the WHERE condition
output the join result: S ⋈_{S.sid = C.sid} C
the cross product is expensive to materialize!
56. in what follows...
algorithms to compute joins
without materializing cross product
SELECT *
FROM students S, dbcourse C
WHERE S.sid = C.sid
assuming WHERE condition is equality condition
as in the example
assumption is not essential, though
58. the join problem
input
relation R: M pages on disk, pR records per page
relation S: N pages on disk, pS records per page
M ≤ N
output
R ⋈_{R.a = S.b} S
59. if there is enough memory...
load both relations in memory
(figure: R (M pages) and S (N pages) held in memory, plus output pages)
for each record r in R
  for each record s in S
    if r.a = s.b: store (r, s) in output pages
output not necessarily to disk
I/O cost (ignoring final output cost): M + N pages
we only have to scan each relation once
60. page-oriented simple nested loops join
join using 3 memory (buffer) pages
(figure: 1 input page for the outer relation R, 1 input page for the inner relation S, 1 output page)
for each page P of R
  for each page Q of S
    compute P join Q; store in output page
I/O cost (pages): M + M * N
R (outer relation) is scanned once
S (inner relation) is scanned M times
61. block nested loops join
join using B memory (buffer) pages
(figure: B-2 input pages for the outer relation R, 1 input page for the inner relation S, 1 output page)
for each block P of (B-2) pages of R
  for each page Q of S
    compute P join Q; store in output page
I/O cost (pages): M + ⌈M/(B-2)⌉ * N
R (outer relation) is scanned once
S (inner relation) is scanned ⌈M/(B-2)⌉ times
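a minimal sketch of block nested loops over paginated relations; records are modeled as dicts with join attributes "a" and "b" (illustrative):

def block_nested_loops_join(R_pages, S_pages, B):
    # R_pages / S_pages: lists of pages; each page is a list of record dicts
    out = []
    for i in range(0, len(R_pages), B - 2):      # outer: B-2 pages of R
        block = [r for page in R_pages[i:i + B - 2] for r in page]
        for q in S_pages:                        # inner: S, one page at a time
            for s in q:
                for r in block:                  # compute P join Q
                    if r["a"] == s["b"]:
                        out.append((r, s))
    return out

a real system would build an in-memory hash table over the block instead of scanning it for each S record; the I/O pattern is the same.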
62. index nested loop join
relation S has an index on the join attribute
use one page to make a pass over R
use the index to retrieve only matching records of S
(figure: 1 input page for the outer relation R; the inner relation S is accessed through its index; 1 output page)
for each record r of R
  for each record s of S with s.b = r.a   // query the index
    add (r, s) to output
63. index nested loop join
cost
M to scan R
plus (M · pR) × (cost of an index query on S), where M · pR = number of records of R
the cost of a query on S: covered in previous lectures
total: M + M · pR × (cost of query on S)
64. sort-merge join
two phases
sort and merge
sort
R and S on the join attribute
using external sort algorithm
cost O(n log n), n: number of relation pages
merge sorted R and S
66-79. sort-merge join: merge
(figure: current pages in memory, only join attributes shown; R.a page: 1 2 6 7 8 9, S.b page: 3 4 5 6 9 10; cursors r and s step through them)
main loop
repeat while r != s:
  advance r until r >= s
  advance s until s >= r
output (r, s)
advance r and s
assumption: b is a key for S
walkthrough: r = 1, s = 3 → advance r to 6; advance s to 6 → output (6, 6); advance both to r = 7, s = 9 → advance r to 9 → output (9, 9); advance both; the R page is exhausted → new page for R! (R.a: 13 14 15 17 19 23) → advance s to 10, then S is exhausted too
cost for merge: M + N
what if this assumption does not hold?
(see the sketch below)
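a minimal sketch of the merge step, assuming as above that b is a key for S (duplicates in R.a are allowed); the record format is illustrative:

def merge_join(R, S):
    # R, S: lists of records (dicts), sorted on join attributes "a" and "b"
    out, i, j = [], 0, 0
    while i < len(R) and j < len(S):
        if R[i]["a"] < S[j]["b"]:
            i += 1                       # advance r until r >= s
        elif R[i]["a"] > S[j]["b"]:
            j += 1                       # advance s until s >= r
        else:
            # r == s: output (r, s); since b is a key for S, every R record
            # with this value matches the single S record
            v = R[i]["a"]
            while i < len(R) and R[i]["a"] == v:
                out.append((R[i], S[j]))
                i += 1
            j += 1                       # advance r and s
    return out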
81. sort-merge join: merge
join attribute is not a key on either relation
(figure: the pages of sorted R and sorted S; the parts with R.a = 15 and S.b = 15, and with R.a = 20 and S.b = 20, each span several pages)
parts of R and S with the same join attribute value: must output the cross product of these parts
82. sort-merge join: merge
join attribute is not a key on either relation
modify the algorithm to perform a join over these areas, e.g., page-oriented simple nested loops
for that algorithm, the worst case cost is M + MN
83. hash join
two phases: partition and probe
partition: partition each relation into partitions, using the same hash function on the join attribute
probe: join the corresponding partitions
84. hash join - partition
B buffer pages available
scan R with 1 input page
hash into B-1 partitions on the join attribute, using h(R.a)
use B-1 pages to hold the partitions; flush a partition page when it is full or when the scan of R ends
(figure: R → 1 input page → h(R.a) → B-1 partition/output pages → B-1 partitions of R on disk)
85. hash join - partition
B buffer pages available
scan S with 1 input page
hash into B-1 partitions on the join attribute, using h(S.b)
use B-1 pages to hold the partitions; flush when full or when the scan of S ends
(figure: S → 1 input page → h(S.b) → B-1 partition/output pages → B-1 partitions of S on disk, alongside the B-1 partitions of R)
86. hash join - probe
B buffer pages available
load the k-th partition of R into memory (assuming it fits in B-2 pages)
scan the k-th partition of S one page at a time;
for each record of S, probe the partition of R for matching records;
store matches in the output page; flush when full or done
this holds when the size of each partition fits in B-2 pages:
approximately B-2 > M / (B-1), i.e., B > √M
variant: re-partition the partition of R in-memory with a second hash function h2, and probe using h2
(figure: B-2 input pages hold the k-th partition of R; 1 input page scans the k-th partition of S; 1 output page)
87. hash join
cost
partition phase: read and write each relation once: 2M + 2N
probing phase: read each partition once (final output cost ignored): M + N
total: 3M + 3N
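a minimal sketch of both phases, with k standing in for the B-1 partitions and a python dict playing the role of the in-memory rehash (h2) of the probe variant:

def hash_join(R, S, k):
    # partition phase: hash each relation into k partitions using the
    # same hash function on the join attribute
    R_parts = [[] for _ in range(k)]
    S_parts = [[] for _ in range(k)]
    for r in R:
        R_parts[hash(r["a"]) % k].append(r)
    for s in S:
        S_parts[hash(s["b"]) % k].append(s)
    # probe phase: only corresponding partitions can contain matches
    out = []
    for Rp, Sp in zip(R_parts, S_parts):
        table = {}                       # in-memory table over the R partition
        for r in Rp:
            table.setdefault(r["a"], []).append(r)
        for s in Sp:                     # scan the S partition, probe the table
            for r in table.get(s["b"], []):
                out.append((r, s))
    return out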
89. query optimization
once we submit a query
the dbms is responsible for
efficient computation
the same query can be
executed in many ways
each is an ‘execution plan’
or ‘query evaluation plan’
90. example
select *
from students
where sid = 100
execution plan = annotated relational algebra tree
(plan: σ_{sid = 100} (on-the-fly) over students (scan))
access path: how we retrieve data from the relation: scan or index?
the algorithm used by each operator is also annotated
another execution plan:
(plan: σ_{sid = 100} (query index stud_btree) over students (b+ tree index stud_btree on sid))
91. example
select *
from students S, dbcourse C
where S.sid = C.sid
execution plan:
(plan: ⋈_{C.sid = S.sid} (block-nested-loops) over students S (scan) and dbcourse C (scan))
another execution plan:
(plan: ⋈_{C.sid = S.sid} (index-nested-loops) over dbcourse C (scan, outer) and students S (index stud_btree, inner))
92. which plan to choose?
dbms estimates cost for a number of execution plans
(not all possible plans, necessarily!)
the estimates follow the cost analysis
we presented earlier
dbms picks the execution plan with
minimum estimated cost
96. references
● “cowbook”, database management systems, by ramakrishnan and gehrke
● “elmasri”, fundamentals of database systems, elmasri and navathe
● other database textbooks
credits
some slides based on material from
database management systems, by ramakrishnan and gehrke
99. deleting a data entry
1. start at root, find leaf L of entry
2. remove the entry, if it exists
○ if L is at least half-full, done!
○ else
■ try to re-distribute, borrowing from sibling
● adjacent node with same parent as L
■ if that fails, merge L into sibling
○ if merge occurred,
must delete L from parent of L
merge could propagate to root
110. example b+ tree
during deletion of 24* -- different example
(figure: root holds the splitting entry 22; its left child holds 5 13 17 20 (full), its right child holds 30; leaves: 2* 3* | 5* 7* 8* | 14* 16* | 17* 18* | 20* 21* | 22* 27* 29* | 33* 34* 38* 39*)
left child of root has many entries (full)
can redistribute entries of index nodes, pushing through the root splitting entry
111. example b+ tree
(figure: after redistribution, the root holds 17; its left child holds 5 13, its right child holds 20 22 30; leaves unchanged)
114. linear hashing
dynamic hashing
uses overflow pages; no directory
splits a bucket in round-robin fashion when an overflow occurs
M = 2^level: number of buckets at the beginning of the round
pointer next ∈ [0, M) points at the next bucket to split
the split-image buckets of the already-split buckets 0 .. next-1 are appended after the original M
115. linear hashing
to allocate entries, use
H0(key) = h(key) mod M, or
H1(key) = h(key) mod 2M
i.e., level or level+1 least significant bits of h(key)
to locate the bucket for a key
first use H0(key)
if H0(key) is less than next, then it refers to a split bucket:
use H1(key) to determine if it refers to the original bucket or its split image
116. linear hashing
in the middle of a round...
(figure: the M buckets that existed at the beginning of this round -- this is the range of H0; pointer next marks the bucket to be split next; buckets before next have already been split in this round; the split-image buckets, created through splitting of other buckets in this round, follow the original M)
if H0(key) falls in the already-split range, must use H1(key) to decide if the entry is in a split-image bucket
> is a directory necessary?
117. linear hashing inserts
insert
find bucket by applying H0 / H1 and
insert if there is space
if bucket to insert into is full:
add overflow page, insert data entry,
split next bucket and increment next
since buckets are split round-robin,
long overflow chains don’t develop!
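a minimal sketch of the H0 / H1 bucket lookup from slide 115; h, M, and next_to_split are the quantities defined above:

def bucket_for(key, h, M, next_to_split):
    # M = 2^level buckets at the beginning of the round
    b = h(key) % M                       # H0(key)
    if b < next_to_split:
        # bucket b was already split this round: H1 decides between the
        # original bucket b and its split image at position b + M
        b = h(key) % (2 * M)             # H1(key), i.e., b or b + M
    return b

# with M = 4, next = 1 and h the identity:
print(bucket_for(43, lambda x: x, 4, 1))   # -> 3 (unsplit bucket 11)
print(bucket_for(44, lambda x: x, 4, 1))   # -> 4 (split image of bucket 00)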
118. example - insert h(r) = 43
on a split, H1 is used to redistribute entries
(figure, for illustration only: M = 4, so H0 uses 2 bits and H1 uses 3 bits; before the insert, next = 0 and the primary pages hold: bucket 00: 32* 44* 36*, bucket 01: 9* 25* 5*, bucket 10: 14* 18* 10* 30*, bucket 11: 31* 35* 7* 11*)
43* = binary 101011 → H0 = 11: bucket 11 is full, so an overflow page holding 43* is added, and the next bucket is split
(figure: after the split, next = 1; bucket 000 holds 32*, its split image 100 holds 44* 36*; buckets 01, 10, 11 are unchanged, with the overflow page 43* chained to bucket 11)