Relational Algebra
Data Models
• A database models some
portion of the real world.
• A data model is the link between
the user’s view of the world and
the bits stored in the computer.
• Many models have been
proposed.
• We will concentrate on the
Relational Model.
Student (sid: string, name: string, login:
string, age: integer, gpa:real)
Describing Data:
Data Models
• A data model is a collection of concepts for
describing data.
• A database schema is a description of a
particular collection of data, using a given data
model.
• The relational model of data is the most widely
used model today.
o Main concept: relation, basically a table with rows and columns.
o Every relation has a schema, which describes the columns, or fields.
Need to design a
data model
Data Model
A data schema
Need to model the business
Relational Query
Languages
• Query languages:
o Allow manipulation and retrieval of data from a database.
• Relational model supports simple,
powerful QLs:
o Strong formal foundation based on logic.
o Allows for much optimization.
• Query Languages != programming
languages!
o QLs not expected to be “Turing complete”.
o QLs not intended to be used for complex calculations.
o QLs support easy, efficient access to large data sets.
Formal Relational Query
Languages
Two mathematical Query Languages form the
basis for “real” languages (e.g. SQL), and for
implementation:
• Relational Algebra: More operational, very
useful for representing execution plans.
• Relational Calculus: Lets users describe what
they want, rather than how to compute it.
(Non-operational, declarative.)
• Understanding Algebra & Calculus is key to
understanding SQL and query processing!
Relational Database:
Definitions
• Relational database: a set of relations.
• Relation: made up of 2 parts:
o Schema : specifies name of relation, plus name
and type of each column.
• E.g. Students(sid: string, name: string, login:
string, age: integer, gpa: real)
o Instance : a table, with rows and columns.
• #rows = cardinality
• #fields = degree / arity
• Can think of a relation as a set of rows or tuples.
o i.e., all rows are distinct
Set and Bag
Formal distinction between two kinds of collections of objects:
Set: all objects in the collection are unique
Bag: the objects are not necessarily unique (the same object
may occur more than once)
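As a rough illustration of the distinction, the sketch below stores the same values once as a set and once as a bag. It assumes Python's built-in set and collections.Counter as stand-ins for the two notions; the values are the ages from the example instance S2 used later in this deck.

```python
from collections import Counter

# Ages of the four sailors in example instance S2.
ages = [35.0, 55.5, 35.0, 35.0]

as_set = set(ages)       # a set keeps each distinct value once
as_bag = Counter(ages)   # a bag also records how often each value occurs

print(sorted(as_set))        # [35.0, 55.5]
print(as_bag[35.0])          # 3
print(sum(as_bag.values()))  # 4
```

The set has two members while the bag still has four, which is the difference that matters once duplicates are (or are not) eliminated from query results.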
Preliminaries
• A query is applied to relation instances, and the
result of a query is also a relation instance.
o Schemas of input relations for a query are fixed (but query will run
regardless of instance!)
o The schema for the result of a given query is also fixed! Determined by
definition of query language constructs.
• Positional vs. named-field notation:
o Positional notation easier for formal definitions, named-field notation more
readable.
o Both used in SQL
Algebra
• In math, algebraic operations like +, -, x, /.
• Operate on numbers: input are numbers, output are
numbers.
• Can also do Boolean algebra on sets, using union,
intersect, difference.
• Focus on algebraic identities, e.g.
o x (y+z) = xy + xz.
• (Relational algebra lies between propositional and 1st-order logic.)
Relational Algebra
• Every operator takes one or two relation instances
• Result is also a relation
A relational algebra expression is a relation
Algebra is closed
F( R ) -> R
F(R1,R2) -> R
Relational Algebra in a
DBMS
[Figure: query processing inside a DBMS. Labels: SQL query, parser, relational algebra expression, query optimizer, optimized relational algebra expression, query execution plan, code generator, executable code.]
Introduction to Relational Algebra
• Introduced by E. F.
Codd in 1970.
• Codd proposed such
an algebra as a basis
for database query
languages.
Terminology
• Relation - a set of tuples.
• Tuple - a collection of attributes which describe
some real world entity.
• Attribute - a real world role played by a named
domain.
• Domain - a set of atomic values.
• Set - a mathematical definition for a collection of
objects which contains no duplicates.
Relational Algebra
• Basic operations:
o Selection ( σ ) Selects a subset of rows from relation.
o Projection ( π ) Deletes unwanted columns from relation.
o Cross-product ( × ) Allows us to combine two relations.
o Set-difference ( − ) Tuples in reln. 1, but not in reln. 2.
o Union ( ∪ ) Tuples in reln. 1 or in reln. 2.
• Additional operations:
o Intersection, join, division, renaming: Not essential, but (very!) useful.
Closed Algebra
Since each operation returns a relation, operations
can be composed! (Algebra is “closed”.)
All these operations take relation instances as input
And all these operations give a relation instance as output
Example Instances
sid bid day
22 101 10/10/96
58 103 11/12/96
R1
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.5
58 rusty 10 35.0
S1
sid sname rating age
28 yuppy 9 35.0
31 lubber 8 55.5
44 guppy 5 35.0
58 rusty 10 35.0
S2
S1(sid,sname,rating,age)
S2(sid,sname,rating,age)
R1(sid,bid,day)
Projection
R1 := PROJ_L(R2)
R1 := π_L(R2)
• L is a list of attributes from the schema of R2.
• R1 is constructed by looking at each tuple of R2,
extracting the attributes on list L, in the order
specified, and creating from those components a
tuple for R1.
• Eliminate duplicate tuples, if any.
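The construction described above can be sketched in a few lines. The encoding is an assumption of this sketch, not something the slides prescribe: a relation is a tuple of attribute names plus a Python set of rows, and the helper name project is hypothetical.

```python
# Example instance S2, encoded as (schema, set of rows).
S2_SCHEMA = ("sid", "sname", "rating", "age")
S2_ROWS = {
    (28, "yuppy", 9, 35.0),
    (31, "lubber", 8, 55.5),
    (44, "guppy", 5, 35.0),
    (58, "rusty", 10, 35.0),
}

def project(schema, rows, attrs):
    """Keep only the attributes in attrs, in the order given.

    Building a set removes duplicate tuples automatically.
    """
    positions = [schema.index(a) for a in attrs]
    return tuple(attrs), {tuple(row[i] for i in positions) for row in rows}

age_schema, age_rows = project(S2_SCHEMA, S2_ROWS, ["age"])
print(age_schema)        # ('age',)
print(sorted(age_rows))  # [(35.0,), (55.5,)]
```

Three sailors share the age 35.0, yet the result holds it once, because the result is a set.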
Projection
• Deletes attributes that are not in projection list.
• Schema of result contains exactly the fields in the
projection list, with the same names that they had in
the (only) input relation.
π_{sname,rating}(S2)
Schema: Result(sname,rating)
Projection
sname rating
yuppy 9
lubber 8
guppy 5
rusty 10
π_{sname,rating}(S2)
• Deletes attributes that are not in projection list.
sid sname rating age
28 yuppy 9 35.0
31 lubber 8 55.5
44 guppy 5 35.0
58 rusty 10 35.0
Schema: Result(sname,rating)
Projection
age
35.0
55.5
π_{age}(S2)
• Deletes attributes that are not in projection list.
sid sname rating age
28 yuppy 9 35.0
31 lubber 8 55.5
44 guppy 5 35.0
58 rusty 10 35.0
Schema: Result(age)
Duplicates are eliminated
(sets not bags)
Selection
R1 := SELECT_C(R2)
R1 := σ_C(R2)
• C is a condition (as in “if” statements) that refers to
attributes of R2.
• R1 is all those tuples of R2 that satisfy C.
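A minimal sketch of selection, under the same assumed encoding (schema tuple plus a set of rows). The condition C is passed as a Python function over a row viewed as a dictionary; the helper name select is hypothetical.

```python
S2_SCHEMA = ("sid", "sname", "rating", "age")
S2_ROWS = {
    (28, "yuppy", 9, 35.0),
    (31, "lubber", 8, 55.5),
    (44, "guppy", 5, 35.0),
    (58, "rusty", 10, 35.0),
}

def select(schema, rows, condition):
    """Return the rows that satisfy condition; the schema is unchanged."""
    kept = {row for row in rows if condition(dict(zip(schema, row)))}
    return schema, kept

# Selection with the condition rating > 8 on S2.
result_schema, high_rated = select(S2_SCHEMA, S2_ROWS, lambda t: t["rating"] > 8)
print(sorted(high_rated))
```

With the instance S2 this keeps the rows for yuppy and rusty, and the result schema is the schema of S2.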
Selection
σ_{rating>8}(S2)
sid sname rating age
28 yuppy 9 35.0
58 rusty 10 35.0
Selects rows that satisfy selection condition.
sid sname rating age
28 yuppy 9 35.0
31 lubber 8 55.5
44 guppy 5 35.0
58 rusty 10 35.0
Selection
σ_{rating>8}(S2)
sid sname rating age
28 yuppy 9 35.0
58 rusty 10 35.0
Schema of result identical to schema of input relation.
sid sname rating age
28 yuppy 9 35.0
31 lubber 8 55.5
44 guppy 5 35.0
58 rusty 10 35.0
S2
Result(sid,sname,rating,age)
Composite
We have two operations
Each operation, σ and π, has a relation as input
Each operation has a relation as output
i.e., Relational Algebra is closed
Thus we can combine them into composite functions
π_{sname,rating}(σ_{rating>8}(S2))
Composite
σ_{rating>8}(S2)
sid sname rating age
28 yuppy 9 35.0
58 rusty 10 35.0
π_{sname,rating}(σ_{rating>8}(S2))
sname rating
yuppy 9
rusty 10
sid sname rating age
28 yuppy 9 35.0
31 lubber 8 55.5
44 guppy 5 35.0
58 rusty 10 35.0
S2
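Because both operators return a relation, the output of one can be fed straight into the other. The sketch below composes a selection with a projection on S2; the (schema, rows) encoding and the helper names select and project are assumptions made for illustration.

```python
S2_SCHEMA = ("sid", "sname", "rating", "age")
S2_ROWS = {
    (28, "yuppy", 9, 35.0),
    (31, "lubber", 8, 55.5),
    (44, "guppy", 5, 35.0),
    (58, "rusty", 10, 35.0),
}

def select(schema, rows, condition):
    """Rows satisfying condition; schema unchanged."""
    return schema, {r for r in rows if condition(dict(zip(schema, r)))}

def project(schema, rows, attrs):
    """Rows restricted to attrs; duplicates vanish because rows form a set."""
    positions = [schema.index(a) for a in attrs]
    return tuple(attrs), {tuple(r[i] for i in positions) for r in rows}

# Inner step: selection rating > 8. Outer step: projection on sname, rating.
inner = select(S2_SCHEMA, S2_ROWS, lambda t: t["rating"] > 8)
schema, rows = project(*inner, ["sname", "rating"])
print(schema)        # ('sname', 'rating')
print(sorted(rows))  # [('rusty', 10), ('yuppy', 9)]
```

Each helper takes a (schema, rows) pair and returns one, which is what closure means in practice.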
More operations
Union
Intersection
difference
Similar to the normal set operations
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.5
58 rusty 10 35.0
S1
Prerequisite:
Union compatibility
(tuples of both relations have the same structure)
Union Compatible
• Same number of fields.
• `Corresponding’ fields
have the same type.
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.5
58 rusty 10 35.0
sid sname rating age
28 yuppy 9 35.0
31 lubber 8 55.5
44 guppy 5 35.0
58 rusty 10 35.0
S1
S2
Schema of S1 = Schema of S2
S1(sid,sname,rating,age)
S2(sid,sname,rating,age)
Union
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.5
58 rusty 10 35.0
44 guppy 5 35.0
28 yuppy 9 35.0
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.5
58 rusty 10 35.0
sid sname rating age
28 yuppy 9 35.0
31 lubber 8 55.5
44 guppy 5 35.0
58 rusty 10 35.0
S1
S2
Duplicates
Compute: S1 ∪ S2
Union Compatible
S1(sid,sname,rating,age)
S2(sid,sname,rating,age)
Result(sid,sname,rating,age)
The same schema
Intersection
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.5
58 rusty 10 35.0
sid sname rating age
28 yuppy 9 35.0
31 lubber 8 55.5
44 guppy 5 35.0
58 rusty 10 35.0
S1
S2
Compute: S1 ∩ S2
Union Compatible Duplicates
sid sname rating age
31 lubber 8 55.5
58 rusty 10 35.0
S1(sid,sname,rating,age)
S2(sid,sname,rating,age)
Result(sid,sname,rating,age)
The same schema
Difference
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.5
58 rusty 10 35.0
sid sname rating age
28 yuppy 9 35.0
31 lubber 8 55.5
44 guppy 5 35.0
58 rusty 10 35.0
S1
S2
Compute: S1 - S2
Union Compatible Take away
Duplicates
sid sname rating age
22 dustin 7 45.0
S1(sid,sname,rating,age)
S2(sid,sname,rating,age)
Result(sid,sname,rating,age)
The same schema
Union, Intersection, Set-
Difference
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.5
58 rusty 10 35.0
44 guppy 5 35.0
28 yuppy 9 35.0
S1 ∪ S2
sid sname rating age
31 lubber 8 55.5
58 rusty 10 35.0
S1 ∩ S2
sid sname rating age
22 dustin 7 45.0
S1 − S2
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.5
58 rusty 10 35.0
sid sname rating age
28 yuppy 9 35.0
31 lubber 8 55.5
44 guppy 5 35.0
58 rusty 10 35.0
S1
S2
All have the same schema
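With relations encoded as Python sets of tuples (an assumption of this sketch), the three operations map directly onto the built-in set operators. The union_compatible helper is hypothetical; it approximates "corresponding fields have the same type" by comparing Python types position by position.

```python
S1 = {
    (22, "dustin", 7, 45.0),
    (31, "lubber", 8, 55.5),
    (58, "rusty", 10, 35.0),
}
S2 = {
    (28, "yuppy", 9, 35.0),
    (31, "lubber", 8, 55.5),
    (44, "guppy", 5, 35.0),
    (58, "rusty", 10, 35.0),
}

def union_compatible(r, s):
    """True if every row of r and s has the same sequence of field types."""
    signatures = {tuple(type(v) for v in row) for row in r | s}
    return len(signatures) <= 1

if not union_compatible(S1, S2):
    raise ValueError("S1 and S2 are not union compatible")

union = S1 | S2          # tuples in S1 or in S2, duplicates kept once
intersection = S1 & S2   # tuples in both
difference = S1 - S2     # tuples in S1 but not in S2

print(len(union))          # 5
print(sorted(difference))  # [(22, 'dustin', 7, 45.0)]
```

The rows for lubber and rusty occur in both inputs, so they appear once in the union and make up the whole intersection.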
Cross-Product
R3 := R1 × R2
• Pair each tuple t1 of R1 with each tuple t2 of R2.
• The concatenation of t1 and t2 is a tuple of R3.
• Schema of R3 is the attributes of R1 and then R2, in order.
• But beware attribute A of the same name in R1 and R2:
use R1.A and R2.A (rename)
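A possible sketch of the product, including the qualification of clashing attribute names. The (schema, rows) encoding and the function name cross are assumptions; prefixing with the relation name follows the R1.A / R2.A convention stated above.

```python
from itertools import product

R1_SCHEMA = ("sid", "bid", "day")
R1_ROWS = {(22, 101, "10/10/96"), (58, 103, "11/12/96")}
S1_SCHEMA = ("sid", "sname", "rating", "age")
S1_ROWS = {
    (22, "dustin", 7, 45.0),
    (31, "lubber", 8, 55.5),
    (58, "rusty", 10, 35.0),
}

def cross(name_r, schema_r, rows_r, name_s, schema_s, rows_s):
    """Pair every tuple of R with every tuple of S.

    Attribute names that occur in both schemas are prefixed with the
    relation name so that the result schema stays unambiguous.
    """
    clash = set(schema_r) & set(schema_s)

    def qualify(name, schema):
        return [f"{name}.{a}" if a in clash else a for a in schema]

    schema = tuple(qualify(name_r, schema_r) + qualify(name_s, schema_s))
    rows = {t1 + t2 for t1, t2 in product(rows_r, rows_s)}
    return schema, rows

schema, rows = cross("R1", R1_SCHEMA, R1_ROWS, "S1", S1_SCHEMA, S1_ROWS)
print(schema)
print(len(rows))  # 6
```

Two tuples in R1 and three in S1 give six tuples in the product, each with seven fields.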
Cross-Product
sid bid day
22 101 10/10/96
58 103 11/12/96
R1(sid,bid,day)
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.5
58 rusty 10 35.0
S1(sid,sname,rating,age)
Schema of cross product
Result(R1.sid,bid,day,S1.sid,sname,rating,age)
Renaming attribute
Cross-Product
sid bid day
22 101 10/10/96
58 103 11/12/96
R1(sid,bid,day)
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.5
58 rusty 10 35.0
S1(sid,sname,rating,age)
Pair each tuple t1 of R1 with each tuple t2 of S1.
(sid) sname rating age (sid) bid day
22 dustin 7 45.0 22 101 10/10/96
22 dustin 7 45.0 58 103 11/12/96
31 lubber 8 55.5 22 101 10/10/96
31 lubber 8 55.5 58 103 11/12/96
58 rusty 10 35.0 22 101 10/10/96
58 rusty 10 35.0 58 103 11/12/96
Cross-Product
• Each row of S1 is paired with each row of R1.
• Result schema has one field per field of S1 and R1,
with field names `inherited’ if possible.
• Conflict: Both S1 and R1 have a field called sid.
ρ(C(1 → sid1, 5 → sid2), S1 × R1)
(sid) sname rating age (sid) bid day
22 dustin 7 45.0 22 101 10/10/96
22 dustin 7 45.0 58 103 11/12/96
31 lubber 8 55.5 22 101 10/10/96
31 lubber 8 55.5 58 103 11/12/96
58 rusty 10 35.0 22 101 10/10/96
58 rusty 10 35.0 58 103 11/12/96
* Renaming operator:
Renaming
• The RENAME operator gives a new schema to a
relation.
• R1 := RENAMER1(A1,…,An)(R2) makes R1 be a relation
with attributes A1,…,An and the same tuples as
R2.
• Simplified notation: R1(A1,…,An) := R2.
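Renaming leaves the tuples alone and only replaces the schema. A small sketch, assuming the (schema, rows) encoding; the function name rename and the target names sid1 and sid2 are illustrative choices.

```python
def rename(rows, new_schema):
    """R1(A1,...,An) := R2: same tuples as R2, new attribute names."""
    rows = set(rows)
    if any(len(row) != len(new_schema) for row in rows):
        raise ValueError("the new schema must name every column")
    return tuple(new_schema), rows

# Two rows of the product S1 × R1, where the name sid occurs twice.
product_rows = {
    (22, "dustin", 7, 45.0, 22, 101, "10/10/96"),
    (58, "rusty", 10, 35.0, 58, 103, "11/12/96"),
}

c_schema, c_rows = rename(
    product_rows,
    ["sid1", "sname", "rating", "age", "sid2", "bid", "day"],
)
print(c_schema)
```

After renaming, the two sid columns can be told apart by name instead of by position.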
Relational Algebra Operations
Composite Functions
• Projection
• Selection
• Product
• Union
• Intersection
• Difference
Relational algebra is closed
Can form composite functions, as in our earlier example
This is where the power of relational algebra comes into play
Can form useful composite functions, such as joins and division
Conditional Joins
(Theta Join)
Selects the rows of a cross product that satisfy a given
condition
• Result schema same as that of cross-product.
• Sometimes called a theta-join.
R ⋈_c S = σ_c(R × S)
Cross product
Selection
Joins
R ⋈_c S = σ_c(R × S)
S1 ⋈_{S1.sid < R1.sid} R1
1. Perform the cross product S1 × R1
2. Then perform the selection S1.sid < R1.sid
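The two steps can be sketched as one function that builds the pairs and keeps those passing the condition. Relations are sets of tuples here (an assumption), the name theta_join is hypothetical, and the condition shown is S1.sid < R1.sid.

```python
from itertools import product

S1 = {
    (22, "dustin", 7, 45.0),
    (31, "lubber", 8, 55.5),
    (58, "rusty", 10, 35.0),
}
R1 = {(22, 101, "10/10/96"), (58, 103, "11/12/96")}

def theta_join(r, s, condition):
    """Cross product followed by selection: keep pairs where condition holds."""
    return {t1 + t2 for t1, t2 in product(r, s) if condition(t1, t2)}

# sid is field 0 in both S1 tuples and R1 tuples.
result = theta_join(S1, R1, lambda s, r: s[0] < r[0])
for row in sorted(result):
    print(row)
```

Only two of the six product tuples survive. A real system would try to avoid materializing the full product, but the result is the same.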
Cross-Product
sid bid day
22 101 10/10/96
58 103 11/12/96
R1(sid,bid,day)
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.5
58 rusty 10 35.0
S1(sid,sname,rating,age)
Pair each tuple t1 of R1 with each tuple t2 of S1.
(sid) sname rating age (sid) bid day
22 dustin 7 45.0 22 101 10/10/96
22 dustin 7 45.0 58 103 11/12/96
31 lubber 8 55.5 22 101 10/10/96
31 lubber 8 55.5 58 103 11/12/96
58 rusty 10 35.0 22 101 10/10/96
58 rusty 10 35.0 58 103 11/12/96
Selection
(sid) sname rating age (sid) bid day
22 dustin 7 45.0 58 103 11/12/96
31 lubber 8 55.5 58 103 11/12/96
S1.sid < R1.sid
(sid) sname rating age (sid) bid day
22 dustin 7 45.0 22 101 10/10/96
22 dustin 7 45.0 58 103 11/12/96
31 lubber 8 55.5 22 101 10/10/96
31 lubber 8 55.5 58 103 11/12/96
58 rusty 10 35.0 22 101 10/10/96
58 rusty 10 35.0 58 103 11/12/96
Condition: S1.sid < R1.sid
Joins
• Condition Join: R ⋈_c S = σ_c(R × S)
• Result schema same as that of cross-product.
• Fewer tuples than cross-product, so it might be
possible to compute it more efficiently.
• Sometimes called a theta-join.
(sid) sname rating age (sid) bid day
22 dustin 7 45.0 58 103 11/12/96
31 lubber 8 55.5 58 103 11/12/96
S1 ⋈_{S1.sid < R1.sid} R1
Conditional Joins
(Special Case: Equi-Join)
The condition is equality
Selects those rows where the compared attributes are equal
• (for example, two primary keys)
• Again, result schema same as that of cross-product.
R ⋈_c S = σ_c(R × S)
Cross product
Selection
Joins
R ⋈_c S = σ_c(R × S)
1. Perform the cross product S1 × R1
2. Then perform the selection R1.sid = S1.sid
S1 ⋈_{sid} R1
Cross-Product
sid bid day
22 101 10/10/96
58 103 11/12/96
R1(sid,bid,day)
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.5
58 rusty 10 35.0
S1(sid,sname,rating,age)
Pair each tuple t1 of R1 with each tuple t2 of S1.
(sid) sname rating age (sid) bid day
22 dustin 7 45.0 22 101 10/10/96
22 dustin 7 45.0 58 103 11/12/96
31 lubber 8 55.5 22 101 10/10/96
31 lubber 8 55.5 58 103 11/12/96
58 rusty 10 35.0 22 101 10/10/96
58 rusty 10 35.0 58 103 11/12/96
Selection
S1.sid = R1.sid
(sid) sname rating age (sid) bid day
22 dustin 7 45.0 22 101 10/10/96
22 dustin 7 45.0 58 103 11/12/96
31 lubber 8 55.5 22 101 10/10/96
31 lubber 8 55.5 58 103 11/12/96
58 rusty 10 35.0 22 101 10/10/96
58 rusty 10 35.0 58 103 11/12/96
Condition: S1.sid = R1.sid
sid sname rating age bid day
22 dustin 7 45.0 101 10/10/96
58 rusty 10 35.0 103 11/12/96
Equi-Join
• Equi-Join: A special case of condition join where the
condition c contains only equalities.
• Result schema similar to cross-product, but with only one copy
of the fields for which equality is specified.
sid sname rating age bid day
22 dustin 7 45.0 101 10/10/96
58 rusty 10 35.0 103 11/12/96
S1 ⋈_{sid} R1
Natural Join
• Natural Join: Equijoin on all common fields.
sid bid day
22 101 10/10/96
58 103 11/12/96
R1(sid,bid,day)
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.5
58 rusty 10 35.0
S1(sid,sname,rating,age)
sid sname rating age bid day
22 dustin 7 45.0 101 10/10/96
58 rusty 10 35.0 103 11/12/96
Notation: S1 ⋈ R1
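A natural join can be sketched by finding the shared attribute names, matching on all of them, and emitting each shared field once. The (schema, rows) encoding and the name natural_join are assumptions of this sketch.

```python
S1_SCHEMA = ("sid", "sname", "rating", "age")
S1_ROWS = {
    (22, "dustin", 7, 45.0),
    (31, "lubber", 8, 55.5),
    (58, "rusty", 10, 35.0),
}
R1_SCHEMA = ("sid", "bid", "day")
R1_ROWS = {(22, 101, "10/10/96"), (58, 103, "11/12/96")}

def natural_join(schema_r, rows_r, schema_s, rows_s):
    """Equijoin on all common fields, keeping one copy of each."""
    common = [a for a in schema_r if a in schema_s]
    extra = [a for a in schema_s if a not in common]
    out_schema = tuple(schema_r) + tuple(extra)
    out_rows = set()
    for t1 in rows_r:
        d1 = dict(zip(schema_r, t1))
        for t2 in rows_s:
            d2 = dict(zip(schema_s, t2))
            if all(d1[a] == d2[a] for a in common):
                out_rows.add(t1 + tuple(d2[a] for a in extra))
    return out_schema, out_rows

schema, rows = natural_join(S1_SCHEMA, S1_ROWS, R1_SCHEMA, R1_ROWS)
print(schema)  # ('sid', 'sname', 'rating', 'age', 'bid', 'day')
for row in sorted(rows):
    print(row)
```

The only common field of S1 and R1 is sid, so this reproduces the equi-join result with a single sid column. lubber has no reservation in R1 and drops out.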
Division
• Not supported as a primitive operator, but useful for
expressing queries like:
Find sailors who have reserved all boats.
• Let A have 2 fields, x and y; B have only field y:
o A/B = { ⟨x⟩ ∈ π_x(A) | ∀⟨y⟩ ∈ B : ⟨x, y⟩ ∈ A }
o i.e., A/B contains all x tuples (sailors) such that for every y tuple (boat) in B, there
is an xy tuple in A.
o Or: If the set of y values (boats) associated with an x value (sailor) in A contains all
y values in B, the x value is in A/B.
• In general, x and y can be any lists of fields; y is the list of
fields in B, and x ∪ y is the list of fields of A.
Division (con’t)
sno pno
s1 p1
s1 p2
s1 p3
s1 p4
s2 p1
s2 p2
s3 p2
s4 p2
s4 p4
Examples of Division A/B
pno
p2
sno
s1
s2
s3
s4
A
B1
A/B1
Which have
p2 in A
Examples of Division A/B
sno pno
s1 p1
s1 p2
s1 p3
s1 p4
s2 p1
s2 p2
s3 p2
s4 p2
s4 p4
pno
p2
p4
sno
s1
s4
A
B2
A/B2
Which have both
p2 and p4
Examples of Division A/B
sno pno
s1 p1
s1 p2
s1 p3
s1 p4
s2 p1
s2 p2
s3 p2
s4 p2
s4 p4
pno
p1
p2
p4
sno
s1
A
B3
A/B3
Which has
p1, p2 and p4
Expressing A/B Using
Basic Operators
• Division is not an essential operator; just a useful shorthand.
o (Also true of joins, but joins are so common that systems implement joins
specially.)
• Idea: For A/B, compute all x values that are not
`disqualified’ by some y value in B.
o x value is disqualified if by attaching y value from B, we obtain an xy tuple that
is not in A.
Disqualified x values: π_x((π_x(A) × B) − A)
A/B: π_x(A) − all disqualified tuples
Expressing A/B Using
Basic Operators
Disqualified x values: π_x((π_x(A) × B) − A)
sno pno
s1 p1
s1 p2
s1 p3
s1 p4
s2 p1
s2 p2
s3 p2
s4 p2
s4 p4
pno
p2
p4
1. Project sno from A: π_x(A), where x is the attributes of A that are
not in B (the result holds each sno only once).
2. Cross with B: the result has the same schema as A.
3. Subtract the rows that also appear in A.
4. Project sno from what is left: this is the set of “disqualified” sno values.
Expressing A/B Using
Basic Operators
Disqualified x values: π_x((π_x(A) × B) − A)
sno pno
s1 p1
s1 p2
s1 p3
s1 p4
s2 p1
s2 p2
s3 p2
s4 p2
s4 p4
pno
p2
p4
A/B = π_x(A) − (the set of “disqualified” x values)
Subtract out the disqualified tuples;
if something remains, then it is in the answer
sno
s1
s4
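The construction from basic operators can be checked on the example relation A and the three divisors B1, B2, B3. This sketch assumes A is a set of (sno, pno) pairs and each B is a set of pno values; the function name divide is hypothetical.

```python
A = {
    ("s1", "p1"), ("s1", "p2"), ("s1", "p3"), ("s1", "p4"),
    ("s2", "p1"), ("s2", "p2"),
    ("s3", "p2"),
    ("s4", "p2"), ("s4", "p4"),
}
B1 = {"p2"}
B2 = {"p2", "p4"}
B3 = {"p1", "p2", "p4"}

def divide(a, b):
    """A/B built from projection, cross product and set difference."""
    xs = {x for x, _ in a}                          # projection of A on x
    candidates = {(x, y) for x in xs for y in b}    # crossed with B
    disqualified = {x for x, _ in candidates - a}   # x with a missing (x, y)
    return xs - disqualified

print(sorted(divide(A, B1)))  # ['s1', 's2', 's3', 's4']
print(sorted(divide(A, B2)))  # ['s1', 's4']
print(sorted(divide(A, B3)))  # ['s1']
```

For B2, the pairs (s2, p4) and (s3, p4) are missing from A, so s2 and s3 are disqualified and only s1 and s4 remain.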
SQL and
Relational Algebra
• Project
o SELECT X FROM TABLE
• Select
o select * from E where salary < 200
• Product
o select * from E, D
• Union
o UNION
• Intersection
o INTERSECT
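The correspondence can be tried out with Python's built-in sqlite3 module. This is only a sketch: the tables are the example instances S1 and S2 rather than the E and D tables named above, DISTINCT is added to the projection so that it matches the set semantics of the algebra, and EXCEPT is used for difference, which the list above does not mention.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for table in ("S1", "S2"):
    conn.execute(
        f"CREATE TABLE {table} (sid INTEGER, sname TEXT, rating INTEGER, age REAL)"
    )
conn.executemany(
    "INSERT INTO S1 VALUES (?, ?, ?, ?)",
    [(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)],
)
conn.executemany(
    "INSERT INTO S2 VALUES (?, ?, ?, ?)",
    [(28, "yuppy", 9, 35.0), (31, "lubber", 8, 55.5),
     (44, "guppy", 5, 35.0), (58, "rusty", 10, 35.0)],
)

def run(sql):
    """Execute a query and return all result rows as a list of tuples."""
    return conn.execute(sql).fetchall()

projection = run("SELECT DISTINCT age FROM S2")
selection = run("SELECT * FROM S2 WHERE rating > 8")
product_rows = run("SELECT * FROM S1, S2")
union = run("SELECT * FROM S1 UNION SELECT * FROM S2")
intersection = run("SELECT * FROM S1 INTERSECT SELECT * FROM S2")
difference = run("SELECT * FROM S1 EXCEPT SELECT * FROM S2")

print(len(product_rows))  # 12
print(difference)         # [(22, 'dustin', 7, 45.0)]
```

Without DISTINCT, the projection query would return the age 35.0 three times, since SQL works on bags by default.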
Schemas for Results
• Union, intersection, and difference: the schemas of
the two operands must be the same, so use that
schema for the result.
• Selection: schema of the result is the same as the
schema of the operand.
• Projection: list of attributes tells us the schema.
Schemas for Results (2)
• Product: schema is the attributes of both relations.
o Use R.A, etc., to distinguish two attributes named A.
• Theta-join: same as product.
• Natural join: union of the attributes of the two
relations.
• Renaming: the operator tells the schema.
Example tables
Sailors(sid: integer, sname: string, rating: integer, age: real)
Boats(bid: integer, bname: string, color: string)
Reserves(sid: integer, bid: integer, day: date)
Examples
Reserves
Sailors
Boats
sid bid day
22 101 10/10/96
58 103 11/12/96
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.5
58 rusty 10 35.0
bid bname color
101 Interlake Blue
102 Interlake Red
103 Clipper Green
104 Marine Red
Notation
Find names of sailors who’ve reserved
boat #103
• Solution 1: π_{sname}((σ_{bid=103} Reserves) ⋈ Sailors)
• Solution 2: ρ(Temp1, σ_{bid=103} Reserves)
ρ(Temp2, Temp1 ⋈ Sailors)
π_{sname}(Temp2)
• Solution 3: π_{sname}(σ_{bid=103}(Reserves ⋈ Sailors))
Find names of sailors who’ve reserved a
red boat
• Information about boat color only available in Boats;
so need an extra join:
π_{sname}((σ_{color='red'} Boats) ⋈ Reserves ⋈ Sailors)
• A more efficient solution:
π_{sname}(π_{sid}((π_{bid} σ_{color='red'} Boats) ⋈ Reserves) ⋈ Sailors)
• A query optimizer can find this given the first solution!
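The boat #103 and red-boat queries can be run on the example tables to see that the alternative formulations agree. This is a sketch under stated assumptions: rows are Python dictionaries, the helpers select, project and natural_join are hypothetical, and the reservation (31, 102) is an invented extra row, added only because no sailor in the example instance has reserved a red boat.

```python
SAILORS = [
    {"sid": 22, "sname": "dustin", "rating": 7, "age": 45.0},
    {"sid": 31, "sname": "lubber", "rating": 8, "age": 55.5},
    {"sid": 58, "sname": "rusty", "rating": 10, "age": 35.0},
]
BOATS = [
    {"bid": 101, "bname": "Interlake", "color": "Blue"},
    {"bid": 102, "bname": "Interlake", "color": "Red"},
    {"bid": 103, "bname": "Clipper", "color": "Green"},
    {"bid": 104, "bname": "Marine", "color": "Red"},
]
RESERVES = [
    {"sid": 22, "bid": 101, "day": "10/10/96"},
    {"sid": 58, "bid": 103, "day": "11/12/96"},
    {"sid": 31, "bid": 102, "day": "11/10/96"},  # hypothetical, not in the slides
]

def select(rows, condition):
    return [row for row in rows if condition(row)]

def project(rows, attrs):
    unique = {tuple(row[a] for a in attrs) for row in rows}  # drops duplicates
    return [dict(zip(attrs, values)) for values in unique]

def natural_join(r, s):
    """Join on every attribute name the two inputs share."""
    return [{**a, **b} for a in r for b in s
            if all(a[k] == b[k] for k in a.keys() & b.keys())]

def names(rows):
    return {row["sname"] for row in rows}

def is_103(t):
    return t["bid"] == 103

def is_red(t):
    return t["color"] == "Red"

# Boat #103: select before the join (Solution 1) or after it (Solution 3).
sol1 = project(natural_join(select(RESERVES, is_103), SAILORS), ["sname"])
sol3 = project(select(natural_join(RESERVES, SAILORS), is_103), ["sname"])

# Red boat: the first solution, then the variant with early projections.
red_first = project(
    natural_join(natural_join(select(BOATS, is_red), RESERVES), SAILORS),
    ["sname"],
)
red_bids = project(select(BOATS, is_red), ["bid"])
red_sids = project(natural_join(red_bids, RESERVES), ["sid"])
red_efficient = project(natural_join(red_sids, SAILORS), ["sname"])

print(names(sol1), names(sol3))                # {'rusty'} {'rusty'}
print(names(red_first), names(red_efficient))  # {'lubber'} {'lubber'}
```

The early projections shrink the intermediate relations to single columns before each join, which is the kind of rewrite an optimizer looks for.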
Find sailors who’ve reserved a red or a
green boat
• Can identify all red or green boats, then find
sailors who have reserved one of these boats:
ρ(Tempboats, σ_{color='red' ∨ color='green'} Boats)
π_{sname}(Tempboats ⋈ Reserves ⋈ Sailors)
• What happens if ∨ is replaced by ∧ in this query?
Find sailors who’ve reserved a red and a
green boat
• Previous approach won’t work! Must identify
sailors who’ve reserved red boats, sailors who’ve
reserved green boats, then find the intersection
(note that sid is a key for Sailors):
ρ(Tempred, π_{sid}((σ_{color='red'} Boats) ⋈ Reserves))
ρ(Tempgreen, π_{sid}((σ_{color='green'} Boats) ⋈ Reserves))
π_{sname}((Tempred ∩ Tempgreen) ⋈ Sailors)
Find the names of sailors who’ve reserved all
boats
• Uses division; schemas of the input relations to /
must be carefully chosen:
ρ(Tempsids, (π_{sid,bid} Reserves) / (π_{bid} Boats))
π_{sname}(Tempsids ⋈ Sailors)
• To find sailors who’ve reserved all ‘Interlake’ boats:
..... / π_{bid}(σ_{bname='Interlake'} Boats)
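The "red or green", "red and green" and "all 'Interlake' boats" queries can be compared on a small instance. This is a sketch with assumptions stated up front: rows are dictionaries, the helpers sids_for_colors and names are hypothetical, and the two reservations of boat 102 are invented rows, added because the example instance is too small to tell the three queries apart.

```python
SAILORS = [
    {"sid": 22, "sname": "dustin"},
    {"sid": 31, "sname": "lubber"},
    {"sid": 58, "sname": "rusty"},
]
BOATS = [
    {"bid": 101, "bname": "Interlake", "color": "Blue"},
    {"bid": 102, "bname": "Interlake", "color": "Red"},
    {"bid": 103, "bname": "Clipper", "color": "Green"},
    {"bid": 104, "bname": "Marine", "color": "Red"},
]
RESERVES = [
    {"sid": 22, "bid": 101, "day": "10/10/96"},
    {"sid": 58, "bid": 103, "day": "11/12/96"},
    {"sid": 22, "bid": 102, "day": "10/11/96"},  # hypothetical, not in the slides
    {"sid": 58, "bid": 102, "day": "11/13/96"},  # hypothetical, not in the slides
]

def sids_for_colors(colors):
    """sids of sailors who reserved a boat whose color is in colors."""
    return {r["sid"] for r in RESERVES for b in BOATS
            if r["bid"] == b["bid"] and b["color"] in colors}

def names(sids):
    return {s["sname"] for s in SAILORS if s["sid"] in sids}

# Red OR green: a single selection on Boats is enough.
red_or_green = names(sids_for_colors({"Red", "Green"}))

# Replacing "or" by "and" inside the selection asks for one boat with
# both colors; no such boat exists, so the answer is empty.
both_colors = {b["bid"] for b in BOATS
               if b["color"] == "Red" and b["color"] == "Green"}
naive_and = names({r["sid"] for r in RESERVES if r["bid"] in both_colors})

# Red AND green: intersect the two sid sets, then look up the names.
red_and_green = names(sids_for_colors({"Red"}) & sids_for_colors({"Green"}))

# All 'Interlake' boats: division of (sid, bid) pairs by Interlake bids.
pairs = {(r["sid"], r["bid"]) for r in RESERVES}
interlake = {b["bid"] for b in BOATS if b["bname"] == "Interlake"}
sids = {sid for sid, _ in pairs}
disqualified = {sid for sid in sids for bid in interlake
                if (sid, bid) not in pairs}
all_interlake = names(sids - disqualified)

print(sorted(red_or_green))   # ['dustin', 'rusty']
print(sorted(naive_and))      # []
print(sorted(red_and_green))  # ['rusty']
print(sorted(all_interlake))  # ['dustin']
```

The intersection is taken on sid rather than on sname because sid is a key for Sailors, while two different sailors could share a name.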
Duplicates
• Duplicate rows not allowed in a relation
• However, in SQL, duplicate elimination from a query result is
costly and not done automatically; it must be
explicitly requested:
SELECT DISTINCT …..
FROM …..
Operations on Bags
• Selection applies to each tuple, so its effect on
bags is like its effect on sets.
• Projection also applies to each tuple, but as a bag
operator, we do not eliminate duplicates.
• Products and joins are done on each pair of tuples,
so duplicates in bags have no effect on how we
operate.
Beware: Bag Laws != Set
Laws
• Some, but not all algebraic laws that hold for sets
also hold for bags.
• Example: the commutative law for union (R UNION
S = S UNION R ) does hold for bags.
o Since addition is commutative, adding the number of times x appears in R
and S doesn’t depend on the order of R and S.
Relational Algebra
• Relational Algebra and Relational Calculus have
substantial expressive power. In particular, they can
express
• Natural Join
• Quotient
• Unions of conjunctive queries
• …
• However, they cannot express recursive queries.
Equivalences
The same relational algebraic expression can be
written in many different ways. The order in which
tuples appear in relations is never significant.
• A ∪ B <=> B ∪ A
• A ∩ B <=> B ∩ A
• A ⋈ B <=> B ⋈ A
• (A - B) is not the same as (B - A)
• σ_{c1}(σ_{c2}(A)) <=> σ_{c2}(σ_{c1}(A)) <=> σ_{c1 ∧ c2}(A)
• π_{a1}(A) <=> π_{a1}(π_{a1,etc}(A)), where etc is any
attributes of A.
• ...
Operations on Bags
(and why we care)
• Union: {a,b,b,c} U {a,b,b,b,e,f,f} =
{a,a,b,b,b,b,b,c,e,f,f}
o add the number of occurrences
• Difference: {a,b,b,b,c,c} – {b,c,c,c,d} = {a,b,b}
o subtract the number of occurrences (never going below zero)
• Intersection: {a,b,b,b,c,c}∩{b,b,c,c,c,c,d} = {b,b,c,c}
o minimum of the two numbers of occurrences
• Selection: preserve the number of occurrences
• Projection: preserve the number of occurrences (no
duplicate elimination)
• Cartesian product, join: no duplicate elimination
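The three bag rules can be checked with Python's collections.Counter, used here as an assumed stand-in for a bag: its +, - and & operators add counts, subtract counts (dropping anything that would fall to zero or below) and take the minimum count.

```python
from collections import Counter

# Each string is read as a bag of one-letter elements.
union = Counter("abbc") + Counter("abbbeff")
difference = Counter("abbbcc") - Counter("bcccd")
intersection = Counter("abbbcc") & Counter("bbccccd")

print(sorted(union.elements()))
# ['a', 'a', 'b', 'b', 'b', 'b', 'b', 'c', 'e', 'f', 'f']
print(sorted(difference.elements()))
# ['a', 'b', 'b']
print(sorted(intersection.elements()))
# ['b', 'b', 'c', 'c']
```

In the difference, d never shows up in the result because it does not occur in the left operand, and c disappears because it occurs more often on the right than on the left.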
Notation
Summary
• The relational model has rigorously defined query
languages that are simple and powerful.
• Relational algebra is more operational; useful as
internal representation for query evaluation plans.
• Several ways of expressing a given query; a query
optimizer should choose the most efficient version.

Relational algebra

  • 1.
  • 2.
    Data Models • ADatabase models some portion of the real world. • Data Model is link between user’s view of the world and bits stored in computer. • Many models have been proposed. • We will concentrate on the Relational Model. 10101 11101 Student (sid: string, name: string, login: string, age: integer, gpa:real)
  • 3.
    Describing Data: Data Models •A data model is a collection of concepts for describing data. • A database schema is a description of a particular collection of data, using a given data model. • The relational model of data is the most widely used model today. o Main concept: relation, basically a table with rows and columns. o Every relation has a schema, which describes the columns, or fields.
  • 4.
    Need to designa data model Data Model A data schema Need to model the business
  • 5.
    Relational Query Languages • Querylanguages: o Allow manipulation and retrieval of data from a database. • Relational model supports simple, powerful QLs: o Strong formal foundation based on logic. o Allows for much optimization. • Query Languages != programming languages! o QLs not expected to be “Turing complete”. o QLs not intended to be used for complex calculations. o QLs support easy, efficient access to large data sets.
  • 6.
    Formal Relational Query Languages Twomathematical Query Languages form the basis for “real” languages (e.g. SQL), and for implementation: ¶ Relational Algebra: More operational, very useful for representing execution plans. · Relational Calculus: Lets users describe what they want, rather than how to compute it. · (Non-operational, declarative.) * Understanding Algebra & Calculus is key to * understanding SQL, query processing!
  • 7.
    Relational Database: Definitions • Relationaldatabase: a set of relations. • Relation: made up of 2 parts: o Schema : specifies name of relation, plus name and type of each column. • E.g. Students(sid: string, name: string, login: string, age: integer, gpa: real) o Instance : a table, with rows and columns. • #rows = cardinality • #fields = degree / arity • Can think of a relation as a set of rows or tuples. o i.e., all rows are distinct
  • 8.
    Set and Bag Aset of objects…. Formal distinction Set: All objects in the “set” are unique If the objects are not unique, then it is a Bag
  • 9.
    Preliminaries • A queryis applied to relation instances, and the result of a query is also a relation instance. o Schemas of input relations for a query are fixed (but query will run regardless of instance!) o The schema for the result of a given query is also fixed! Determined by definition of query language constructs. • Positional vs. named-field notation: o Positional notation easier for formal definitions, named-field notation more readable. o Both used in SQL
  • 10.
    Algebra • In math,algebraic operations like +, -, x, /. • Operate on numbers: input are numbers, output are numbers. • Can also do Boolean algebra on sets, using union, intersect, difference. • Focus on algebraic identities, e.g. o x (y+z) = xy + xz. • (Relational algebra lies between propositional and 1st-order logic.) 3 4 7+
  • 11.
    Relational Algebra • Everyoperator takes one or two relation instances • Result is also a relation A relational algebra expression is a relation Algebra is closed F( R ) -> R F(R1,R2) -> R
  • 12.
    12 Relational Algebra ina DBMS parser SQL query Relational algebra expression Optimized Relational algebra expression Query optimizer Code generator Query execution plan Executable code DBMS
  • 13.
    Introduction to RelationalAlgebra • Introduced by E. F. Codd in 1970. • Codd proposed such an algebra as a basis for database query languages.
  • 14.
    Terminology • Relation -a set of tuples. • Tuple - a collection of attributes which describe some real world entity. • Attribute - a real world role played by a named domain. • Domain - a set of atomic values. • Set - a mathematical definition for a collection of objects which contains no duplicates.
  • 15.
    Relational Algebra • Basicoperations: o Selection ( 𝛔) Selects a subset of rows from relation. o Projection ( π) Deletes unwanted columns from relation. o Cross-product ( X ) Allows us to combine two relations. o Set-difference ( - ) Tuples in reln. 1, but not in reln. 2. o Union ( U ) Tuples in reln. 1 and in reln. 2. • Additional operations: o Intersection, join, division, renaming: Not essential, but (very!) useful.
  • 16.
    Closed Algebra Since eachoperation returns a relation, operations can be composed! (Algebra is “closed”.) All these operations have a relation instance as input And all these operations give an instance relation as output
  • 17.
    Example Instances sid bidday 22 101 10/10/96 58 103 11/12/96 R1 sid sname rating age 22 dustin 7 45.0 31 lubber 8 55.5 58 rusty 10 35.0 S1 sid sname rating age 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 58 rusty 10 35.0 S2 S1(sid,sname,rating,age) S1(sid,sname,rating,age) R1(sid,bid,day)
  • 18.
    18 Projection R1 := PROJL(R2) R1 := πL (R2) • L is a list of attributes from the schema of R2. • R1 is constructed by looking at each tuple of R2, extracting the attributes on list L, in the order specified, and creating from those components a tuple for R1. • Eliminate duplicate tuples, if any.
  • 19.
    Projection • Deletes attributesthat are not in projection list. • Schema of result contains exactly the fields in the projection list, with the same names that they had in the (only) input relation. sname rating S , ( )2 Schema: Result(sname,rating)
  • 20.
    Projection sname rating yuppy 9 lubber8 guppy 5 rusty 10 sname rating S , ( )2 • Deletes attributes that are not in projection list. sid sname rating age 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 58 rusty 10 35.0 Schema: Result(sname,rating)
  • 21.
    Projection age 35.0 55.5 age S( )2 •Deletes attributes that are not in projection list. sid sname rating age 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 58 rusty 10 35.0 Schema: Result(age) Duplicates are eliminated (sets not bags)
  • 22.
    22 Selection R1 := SELECTC(R2) R1 := 𝛔C (R2) • C is a condition (as in “if” statements) that refers to attributes of R2. • R1 is all those tuples of R2 that satisfy C.
  • 23.
    Selection rating S 8 2( ) sid snamerating age 28 yuppy 9 35.0 58 rusty 10 35.0 Selects rows that satisfy selection condition. sid sname rating age 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 58 rusty 10 35.0
  • 24.
    Selection rating S 8 2( ) sid snamerating age 28 yuppy 9 35.0 58 rusty 10 35.0 Schema of result identical to schema of input relation. sid sname rating age 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 58 rusty 10 35.0 S2 Result(sid,sname,rating,age)
  • 25.
    Composite We have twooperations Each operation, 𝛔 and π, have relations as input Each operation has a relation as output i.e., Relational Algebra is closed Thus we can combine them into composite functions  sname rating rating S , ( ( )) 8 2
  • 26.
    Composite rating S 8 2( )sid snamerating age 28 yuppy 9 35.0 58 rusty 10 35.0 sname rating yuppy 9 rusty 10  sname rating rating S , ( ( )) 8 2 sid sname rating age 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 58 rusty 10 35.0 2 S2
  • 27.
    More operations Union Intersection difference Similar tothe normal set operations sid sname rating age 22 dustin 7 45.0 31 lubber 8 55.5 58 rusty 10 35.0 S1 Prerequisite: Union compatibility (tuples are the same)
  • 28.
    Union Compatible • Samenumber of fields. • `Corresponding’ fields have the same type. sid sname rating age 22 dustin 7 45.0 31 lubber 8 55.5 58 rusty 10 35.0 sid sname rating age 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 58 rusty 10 35.0 S1 S2Schema of S1 = Schema of S1 S1(sid,sname,rating,age) S2(sid,sname,rating,age)
  • 29.
    Union sid sname ratingage 22 dustin 7 45.0 31 lubber 8 55.5 58 rusty 10 35.0 44 guppy 5 35.0 28 yuppy 9 35.0 sid sname rating age 22 dustin 7 45.0 31 lubber 8 55.5 58 rusty 10 35.0 sid sname rating age 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 58 rusty 10 35.0 S1 S2 Duplicates Compute: S1 U S2 Union Compatable S1(sid,sname,rating,age) S2(sid,sname,rating,age) Result(sid,sname,rating,age) The same schema
  • 30.
    Intersection sid sname ratingage 22 dustin 7 45.0 31 lubber 8 55.5 58 rusty 10 35.0 sid sname rating age 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 58 rusty 10 35.0 S1 S2 Compute: S1 ∩ S2 Union Compatible Duplicates sid sname rating age 31 lubber 8 55.5 58 rusty 10 35.0 S1(sid,sname,rating,age) S2(sid,sname,rating,age) Result(sid,sname,rating,age) The same schema
  • 31.
    Difference sid sname ratingage 22 dustin 7 45.0 31 lubber 8 55.5 58 rusty 10 35.0 sid sname rating age 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 58 rusty 10 35.0 S1 S2 Compute: S1 - S2 Union Compatible Take away Duplicates sid sname rating age 22 dustin 7 45.0 S1(sid,sname,rating,age) S2(sid,sname,rating,age) Result(sid,sname,rating,age) The same schema
  • 32.
    Union, Intersection, Set- Difference sidsname rating age 22 dustin 7 45.0 31 lubber 8 55.5 58 rusty 10 35.0 44 guppy 5 35.0 28 yuppy 9 35.0 S S1 2 sid sname rating age 31 lubber 8 55.5 58 rusty 10 35.0 S S1 2 sid sname rating age 22 dustin 7 45.0 S S1 2 sid sname rating age 22 dustin 7 45.0 31 lubber 8 55.5 58 rusty 10 35.0 sid sname rating age 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 58 rusty 10 35.0 S1 S2 All have the same schema
  • 33.
    33 Cross-Product R3 := R1* R2 • Pair each tuple t1 of R1 with each tuple t2 of R2. • Concatenation t1 and t2 is a tuple of R3. • Schema of R3 is the attributes of R1 and then R2, in order. • But beware attribute A of the same name in R1 and R2: use R1.A and R2.A (rename)
  • 34.
    Cross-Product sid bid day 22101 10/10/96 58 103 11/12/96 R1(sid,bid,day) sid sname rating age 22 dustin 7 45.0 31 lubber 8 55.5 58 rusty 10 35.0 S1(sid,sname,rating,age) Schema of cross product Result(R1.sid,bid,day,S1.sid,sname,rating,age) Renaming attribute
  • 35.
    Cross-Product sid bid day 22101 10/10/96 58 103 11/12/96 R1(sid,bid,day) sid sname rating age 22 dustin 7 45.0 31 lubber 8 55.5 58 rusty 10 35.0 S1(sid,sname,rating,age) Pair each tuple t1 of R1 with each tuple t2 of S1. (sid) sname rating age (sid) bid day 22 dustin 7 45.0 22 101 10/10/96 22 dustin 7 45.0 58 103 11/12/96 31 lubber 8 55.5 22 101 10/10/96 31 lubber 8 55.5 58 103 11/12/96 58 rusty 10 35.0 22 101 10/10/96 58 rusty 10 35.0 58 103 11/12/96 1 2 3 4 5 6 3 2 1 6 5 4
  • 36.
    Cross-Product
    • Each row of S1 is paired with each row of R1.
    • Result schema has one field per field of S1 and R1, with field names 'inherited' if possible.
    • Conflict: both S1 and R1 have a field called sid.
    Result((sid), sname, rating, age, (sid), bid, day):
    (22, dustin, 7, 45.0, 22, 101, 10/10/96)
    (22, dustin, 7, 45.0, 58, 103, 11/12/96)
    (31, lubber, 8, 55.5, 22, 101, 10/10/96)
    (31, lubber, 8, 55.5, 58, 103, 11/12/96)
    (58, rusty, 10, 35.0, 22, 101, 10/10/96)
    (58, rusty, 10, 35.0, 58, 103, 11/12/96)
    • Renaming operator: ρ(C(1 → sid1, 5 → sid2), S1 × R1)
  • 37.
    Renaming
    • The RENAME operator gives a new schema to a relation.
    • R1 := RENAME_{R1(A1,…,An)}(R2) makes R1 be a relation with attributes A1,…,An and the same tuples as R2.
    • Simplified notation: R1(A1,…,An) := R2.
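The cross product and a position-based rename can be sketched in Python as follows. The helper names `cross_product` and `rename_positions`, and the new field names `sid1` and `sid2`, are assumptions made for this illustration; relations are modeled as a schema tuple plus a list of row tuples.

```python
from itertools import product

# Instances and schemas from the slides.
S1_SCHEMA = ("sid", "sname", "rating", "age")
S1 = [(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)]
R1_SCHEMA = ("sid", "bid", "day")
R1 = [(22, 101, "10/10/96"), (58, 103, "11/12/96")]


def cross_product(left_schema, left_rows, right_schema, right_rows):
    """Concatenate every left tuple with every right tuple."""
    schema = left_schema + right_schema
    rows = [left + right for left, right in product(left_rows, right_rows)]
    return schema, rows


def rename_positions(schema, renames):
    """Rename fields by 1-based position (hypothetical helper)."""
    new_schema = list(schema)
    for position, name in renames.items():
        new_schema[position - 1] = name
    return tuple(new_schema)


schema, rows = cross_product(S1_SCHEMA, S1, R1_SCHEMA, R1)
# Both inputs contribute a field called sid, so fields 1 and 5 are renamed.
c_schema = rename_positions(schema, {1: "sid1", 5: "sid2"})

print(c_schema)
print(len(rows))
```

Renaming by position avoids ambiguity here, because before the rename the name `sid` refers to two different fields of the product.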
  • 38.
  • 39.
    Composite Functions
    Basic operators: Projection, Selection, Product, Union, Intersection, Difference.
    Relational algebra is closed: every operator takes relations as input and returns a relation, so operators can be composed, as in the earlier examples.
    This is where the power of relational algebra comes into play: we can form useful composite functions, such as joins and division.
  • 40.
    Conditional Joins (Theta-Join)
    Select out the rows of a cross product that satisfy a given condition c:
    R ⋈_c S = σ_c(R × S)  (a cross product followed by a selection)
    • Result schema is the same as that of the cross-product.
    • Sometimes called a theta-join.
  • 41.
    Joins
    R ⋈_c S = σ_c(R × S)
    Example: S1 ⋈_{S1.sid < R1.sid} R1
    1. Perform the cross product S1 × R1.
    2. Then perform the selection.
  • 42.
    Cross-Product
    R1(sid, bid, day): (22, 101, 10/10/96), (58, 103, 11/12/96)
    S1(sid, sname, rating, age): (22, dustin, 7, 45.0), (31, lubber, 8, 55.5), (58, rusty, 10, 35.0)
    Pair each tuple t1 of R1 with each tuple t2 of S1.
    Result((sid), sname, rating, age, (sid), bid, day):
    (22, dustin, 7, 45.0, 22, 101, 10/10/96)
    (22, dustin, 7, 45.0, 58, 103, 11/12/96)
    (31, lubber, 8, 55.5, 22, 101, 10/10/96)
    (31, lubber, 8, 55.5, 58, 103, 11/12/96)
    (58, rusty, 10, 35.0, 22, 101, 10/10/96)
    (58, rusty, 10, 35.0, 58, 103, 11/12/96)
  • 43.
    Selection: S1.sid < R1.sid
    Cross product S1 × R1, with schema ((sid), sname, rating, age, (sid), bid, day), where the first sid is S1.sid and the second is R1.sid:
    (22, dustin, 7, 45.0, 22, 101, 10/10/96)
    (22, dustin, 7, 45.0, 58, 103, 11/12/96)
    (31, lubber, 8, 55.5, 22, 101, 10/10/96)
    (31, lubber, 8, 55.5, 58, 103, 11/12/96)
    (58, rusty, 10, 35.0, 22, 101, 10/10/96)
    (58, rusty, 10, 35.0, 58, 103, 11/12/96)
    Rows that satisfy S1.sid < R1.sid:
    (22, dustin, 7, 45.0, 58, 103, 11/12/96)
    (31, lubber, 8, 55.5, 58, 103, 11/12/96)
  • 44.
    Joins
    • Condition join: R ⋈_c S = σ_c(R × S)
    • Result schema is the same as that of the cross-product.
    • Fewer tuples than the cross-product, so it might be possible to compute it more efficiently.
    • Sometimes called a theta-join.
    Example: S1 ⋈_{S1.sid < R1.sid} R1, with schema ((sid), sname, rating, age, (sid), bid, day):
    (22, dustin, 7, 45.0, 58, 103, 11/12/96)
    (31, lubber, 8, 55.5, 58, 103, 11/12/96)
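A condition join can be sketched directly from its definition as a selection over a cross product. The sketch below assumes the condition S1.sid < R1.sid, which is the condition the two result rows on the slide satisfy; the function name `theta_join` is hypothetical.

```python
from itertools import product

S1 = [(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)]
R1 = [(22, 101, "10/10/96"), (58, 103, "11/12/96")]


def theta_join(left_rows, right_rows, condition):
    """Cross product followed by a selection on each concatenated pair."""
    return [
        left + right
        for left, right in product(left_rows, right_rows)
        if condition(left, right)
    ]


# Field 0 of each tuple is sid, so the condition reads S1.sid < R1.sid.
result = theta_join(S1, R1, lambda s, r: s[0] < r[0])
for row in result:
    print(row)
```

This naive version still enumerates all six pairs. A real system can often avoid materializing the full cross product, which is why the slide notes that a join might be computed more efficiently.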
  • 45.
    Conditional Joins (Special Case: Equi-Join)
    The condition is an equality: R ⋈_c S = σ_c(R × S)  (a cross product followed by a selection)
    • Selects out those rows where the compared attributes are equal (for example, two primary keys).
    • Again, result schema is the same as that of the cross-product.
  • 46.
    Joins
    R ⋈_c S = σ_c(R × S)
    Example: S1 ⋈_{sid} R1
    1. Perform the cross product S1 × R1.
    2. Then perform the selection R1.sid = S1.sid.
  • 47.
    Cross-Product
    R1(sid, bid, day): (22, 101, 10/10/96), (58, 103, 11/12/96)
    S1(sid, sname, rating, age): (22, dustin, 7, 45.0), (31, lubber, 8, 55.5), (58, rusty, 10, 35.0)
    Pair each tuple t1 of R1 with each tuple t2 of S1.
    Result((sid), sname, rating, age, (sid), bid, day):
    (22, dustin, 7, 45.0, 22, 101, 10/10/96)
    (22, dustin, 7, 45.0, 58, 103, 11/12/96)
    (31, lubber, 8, 55.5, 22, 101, 10/10/96)
    (31, lubber, 8, 55.5, 58, 103, 11/12/96)
    (58, rusty, 10, 35.0, 22, 101, 10/10/96)
    (58, rusty, 10, 35.0, 58, 103, 11/12/96)
  • 48.
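    Selection: S1.sid = R1.sid
    Cross product S1 × R1, with schema ((sid), sname, rating, age, (sid), bid, day), where the first sid is S1.sid and the second is R1.sid:
    (22, dustin, 7, 45.0, 22, 101, 10/10/96)
    (22, dustin, 7, 45.0, 58, 103, 11/12/96)
    (31, lubber, 8, 55.5, 22, 101, 10/10/96)
    (31, lubber, 8, 55.5, 58, 103, 11/12/96)
    (58, rusty, 10, 35.0, 22, 101, 10/10/96)
    (58, rusty, 10, 35.0, 58, 103, 11/12/96)
    Rows that satisfy S1.sid = R1.sid, keeping one copy of sid; Result(sid, sname, rating, age, bid, day):
    (22, dustin, 7, 45.0, 101, 10/10/96)
    (58, rusty, 10, 35.0, 103, 11/12/96)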
  • 49.
    Equi-Join
    • Equi-Join: a special case of condition join where the condition c contains only equalities.
    • Result schema is similar to that of the cross-product, but with only one copy of the fields for which equality is specified.
    Example: S1 ⋈_{sid} R1, with schema (sid, sname, rating, age, bid, day):
    (22, dustin, 7, 45.0, 101, 10/10/96)
    (58, rusty, 10, 35.0, 103, 11/12/96)
  • 50.
    Natural Join
    • Natural Join: equijoin on all common fields.
    R1(sid, bid, day): (22, 101, 10/10/96), (58, 103, 11/12/96)
    S1(sid, sname, rating, age): (22, dustin, 7, 45.0), (31, lubber, 8, 55.5), (58, rusty, 10, 35.0)
    S1 ⋈ R1, with schema (sid, sname, rating, age, bid, day):
    (22, dustin, 7, 45.0, 101, 10/10/96)
    (58, rusty, 10, 35.0, 103, 11/12/96)
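A natural join can be sketched in Python by modeling each row as a dict keyed by field name, so that the common fields can be found automatically. The function name `natural_join` and the dict representation are assumptions made for this illustration.

```python
S1 = [
    {"sid": 22, "sname": "dustin", "rating": 7, "age": 45.0},
    {"sid": 31, "sname": "lubber", "rating": 8, "age": 55.5},
    {"sid": 58, "sname": "rusty", "rating": 10, "age": 35.0},
]
R1 = [
    {"sid": 22, "bid": 101, "day": "10/10/96"},
    {"sid": 58, "bid": 103, "day": "11/12/96"},
]


def natural_join(left_rows, right_rows):
    """Equijoin on all common fields, keeping one copy of each common field."""
    joined = []
    for left in left_rows:
        for right in right_rows:
            common = left.keys() & right.keys()
            if all(left[field] == right[field] for field in common):
                # Merging the dicts keeps a single copy of the common fields.
                joined.append({**left, **right})
    return joined


result = natural_join(S1, R1)
for row in result:
    print(row)
```

If the two relations share no field names, the equality test is vacuously true for every pair and the function degenerates to a cross product, which matches the usual definition of natural join.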
  • 51.
  • 52.
    Division
    • Not supported as a primitive operator, but useful for expressing queries like: Find sailors who have reserved all boats.
    • Let A have 2 fields, x and y; B have only field y:
      o A/B = { ⟨x⟩ | ∀ ⟨y⟩ ∈ B : ⟨x, y⟩ ∈ A }
      o i.e., A/B contains all x tuples (sailors) such that for every y tuple (boat) in B, there is an xy tuple in A.
      o Or: if the set of y values (boats) associated with an x value (sailor) in A contains all y values in B, the x value is in A/B.
    • In general, x and y can be any lists of fields; y is the list of fields in B, and x ∪ y is the list of fields of A.
  • 53.
  • 54.
    Examples of Division A/B
    A(sno, pno): (s1, p1), (s1, p2), (s1, p3), (s1, p4), (s2, p1), (s2, p2), (s3, p2), (s4, p2), (s4, p4)
    B1(pno): p2
    A/B1(sno): s1, s2, s3, s4 (the sno values that have p2 in A)
  • 55.
    Examples of Division A/B
    A(sno, pno): (s1, p1), (s1, p2), (s1, p3), (s1, p4), (s2, p1), (s2, p2), (s3, p2), (s4, p2), (s4, p4)
    B2(pno): p2, p4
    A/B2(sno): s1, s4 (the sno values that have both p2 and p4)
  • 56.
    Examples of Division A/B
    A(sno, pno): (s1, p1), (s1, p2), (s1, p3), (s1, p4), (s2, p1), (s2, p2), (s3, p2), (s4, p2), (s4, p4)
    B3(pno): p1, p2, p4
    A/B3(sno): s1 (the only sno value that has p1, p2 and p4)
  • 57.
    Expressing A/B Using Basic Operators
    • Division is not an essential operator; just a useful shorthand.
      o (Also true of joins, but joins are so common that systems implement joins specially.)
    • Idea: for A/B, compute all x values that are not 'disqualified' by some y value in B.
      o An x value is disqualified if, by attaching a y value from B, we obtain an xy tuple that is not in A.
    Disqualified x values: π_x((π_x(A) × B) - A)
    A/B: π_x(A) - all disqualified tuples
  • 58.
    Expressing A/B Using Basic Operators
    Disqualified x values: π_x((π_x(A) × B) - A)
    A(sno, pno): (s1, p1), (s1, p2), (s1, p3), (s1, p4), (s2, p1), (s2, p2), (s3, p2), (s4, p2), (s4, p4)
    B(pno): p2, p4
    1. Project sno from A: π_x(A), where x is the list of attributes of A that are not in B (here sno).
    2. Cross with B: π_x(A) × B has the same schema as A.
    3. Subtract the rows that are in A.
    4. Project sno from what remains: this is the set of "disqualified" rows (here s2 and s3, which both lack p4).
  • 59.
    Expressing A/B Using Basic Operators
    A(sno, pno): (s1, p1), (s1, p2), (s1, p3), (s1, p4), (s2, p1), (s2, p2), (s3, p2), (s4, p2), (s4, p4)
    B(pno): p2, p4
    The set of "disqualified" x values is π_x((π_x(A) × B) - A).
    Subtract out the disqualified tuples: A/B = π_x(A) - all disqualified tuples.
    If something remains, then it is in the answer: sno = s1, s4.
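The construction of A/B from basic operators can be sketched in Python with sets. The function name `divide` is hypothetical, and the sketch assumes A has exactly two fields (x, y) while B is given as a plain set of y values.

```python
# A(sno, pno) from the division examples.
A = {
    ("s1", "p1"), ("s1", "p2"), ("s1", "p3"), ("s1", "p4"),
    ("s2", "p1"), ("s2", "p2"),
    ("s3", "p2"),
    ("s4", "p2"), ("s4", "p4"),
}


def divide(a, b):
    """A/B built from projection, cross product and set difference."""
    xs = {x for x, _ in a}                          # project x from A
    candidates = {(x, y) for x in xs for y in b}    # cross with B
    missing = candidates - a                        # xy tuples that are not in A
    disqualified = {x for x, _ in missing}          # project x from the missing tuples
    return xs - disqualified                        # remove the disqualified x values


print(sorted(divide(A, {"p2"})))
print(sorted(divide(A, {"p2", "p4"})))
print(sorted(divide(A, {"p1", "p2", "p4"})))
```

Each line of `divide` corresponds to one step of the expression, so the intermediate sets can be printed to follow the walk-through on a concrete instance.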
  • 60.
    SQL and Relational Algebra
    • Project: SELECT X FROM TABLE
    • Select: SELECT * FROM E WHERE salary < 200
    • Product: SELECT * FROM E, D
    • Union: UNION
    • Intersection: INTERSECT
  • 61.
    Schemas for Results
    • Union, intersection, and difference: the schemas of the two operands must be the same, so use that schema for the result.
    • Selection: schema of the result is the same as the schema of the operand.
    • Projection: the list of attributes tells us the schema.
  • 62.
    Schemas for Results (2)
    • Product: schema is the attributes of both relations.
      o Use R.A, etc., to distinguish two attributes named A.
    • Theta-join: same as product.
    • Natural join: union of the attributes of the two relations.
    • Renaming: the operator tells the schema.
  • 63.
    Example tables
    Sailors(sid: integer, sname: string, rating: integer, age: real)
    Boats(bid: integer, bname: string, color: string)
    Reserves(sid: integer, bid: integer, day: date)
  • 64.
    Examples
    Reserves(sid, bid, day): (22, 101, 10/10/96), (58, 103, 11/12/96)
    Sailors(sid, sname, rating, age): (22, dustin, 7, 45.0), (31, lubber, 8, 55.5), (58, rusty, 10, 35.0)
    Boats(bid, bname, color): (101, Interlake, Blue), (102, Interlake, Red), (103, Clipper, Green), (104, Marine, Red)
  • 65.
  • 66.
    Find names of sailors who’ve reserved boat #103
    • Solution 1: π_{sname}((σ_{bid=103} Reserves) ⋈ Sailors)
    • Solution 2: ρ(Temp1, σ_{bid=103} Reserves); ρ(Temp2, Temp1 ⋈ Sailors); π_{sname}(Temp2)
    • Solution 3: π_{sname}(σ_{bid=103}(Reserves ⋈ Sailors))
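The alternative formulations of the boat #103 query can be compared on the example instance with a small Python sketch. The helpers `select`, `project` and `natural_join` are hypothetical stand-ins for σ, π and ⋈, with rows modeled as dicts; they are an illustration, not how a database system evaluates the query.

```python
Sailors = [
    {"sid": 22, "sname": "dustin", "rating": 7, "age": 45.0},
    {"sid": 31, "sname": "lubber", "rating": 8, "age": 55.5},
    {"sid": 58, "sname": "rusty", "rating": 10, "age": 35.0},
]
Reserves = [
    {"sid": 22, "bid": 101, "day": "10/10/96"},
    {"sid": 58, "bid": 103, "day": "11/12/96"},
]


def select(rows, predicate):
    """Selection: keep the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def project(rows, fields):
    """Projection with duplicate elimination (result is a set of tuples)."""
    return {tuple(row[field] for field in fields) for row in rows}


def natural_join(left_rows, right_rows):
    """Equijoin on all common fields."""
    return [
        {**left, **right}
        for left in left_rows
        for right in right_rows
        if all(left[f] == right[f] for f in left.keys() & right.keys())
    ]


def is_103(row):
    return row["bid"] == 103


# Select first, then join, then project.
sol1 = project(natural_join(select(Reserves, is_103), Sailors), ["sname"])

# Same plan, with named intermediate results.
temp1 = select(Reserves, is_103)
temp2 = natural_join(temp1, Sailors)
sol2 = project(temp2, ["sname"])

# Join first, then select, then project.
sol3 = project(select(natural_join(Reserves, Sailors), is_103), ["sname"])

print(sol1, sol2, sol3)
```

All three produce the same answer, but the select-first plans join only one Reserves row with Sailors, whereas the join-first plan builds the full join before filtering.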
  • 67.
    Find names of sailors who’ve reserved a red boat
    • Information about boat color is only available in Boats, so we need an extra join: π_{sname}((σ_{color='red'} Boats) ⋈ Reserves ⋈ Sailors)
    • A more efficient solution: π_{sname}(π_{sid}((π_{bid} σ_{color='red'} Boats) ⋈ Reserves) ⋈ Sailors)
    • A query optimizer can find this given the first solution!
  • 68.
    Find sailors who’ve reserved a red or a green boat
    • Can identify all red or green boats, then find sailors who have reserved one of these boats:
      ρ(Tempboats, (σ_{color='red' ∨ color='green'} Boats))
      π_{sname}(Tempboats ⋈ Reserves ⋈ Sailors)
    • What happens if ∨ is replaced by ∧ in this query?
  • 69.
    Find sailors who’ve reserved a red and a green boat
    • Previous approach won’t work! Must identify sailors who’ve reserved red boats, sailors who’ve reserved green boats, then find the intersection (note that sid is a key for Sailors):
      ρ(Tempred, π_{sid}((σ_{color='red'} Boats) ⋈ Reserves))
      ρ(Tempgreen, π_{sid}((σ_{color='green'} Boats) ⋈ Reserves))
      π_{sname}((Tempred ∩ Tempgreen) ⋈ Sailors)
  • 70.
    Find the names of sailors who’ve reserved all boats
    • Uses division; schemas of the input relations to / must be carefully chosen:
      ρ(Tempsids, (π_{sid,bid} Reserves) / (π_{bid} Boats))
      π_{sname}(Tempsids ⋈ Sailors)
    • To find sailors who’ve reserved all ‘Interlake’ boats: ..... / π_{bid}(σ_{bname='Interlake'} Boats)
  • 71.
    Duplicates
    • Duplicate rows are not allowed in a relation.
    • However, duplicate elimination from a query result is costly and not done automatically; it must be explicitly requested: SELECT DISTINCT ….. FROM …..
  • 72.
    Operations on Bags
    • Selection applies to each tuple, so its effect on bags is like its effect on sets.
    • Projection also applies to each tuple, but as a bag operator, we do not eliminate duplicates.
    • Products and joins are done on each pair of tuples, so duplicates in bags have no effect on how we operate.
  • 73.
    Beware: Bag Laws != Set Laws
    • Some, but not all, algebraic laws that hold for sets also hold for bags.
    • Example: the commutative law for union (R UNION S = S UNION R) does hold for bags.
      o Since addition is commutative, adding the number of times x appears in R and S doesn’t depend on the order of R and S.
  • 74.
    Relational Algebra
    • Relational Algebra and Relational Calculus have substantial expressive power. In particular, they can express:
      o Natural join
      o Quotient
      o Unions of conjunctive queries
      o …
    • However, they cannot express recursive queries.
  • 75.
    Equivalences
    The same relational algebraic expression can be written in many different ways. The order in which tuples appear in relations is never significant.
    • A ⋈ B <=> B ⋈ A
    • A ∩ B <=> B ∩ A
    • A ∪ B <=> B ∪ A
    • (A - B) is not the same as (B - A)
    • σ_{c1}(σ_{c2}(A)) <=> σ_{c2}(σ_{c1}(A)) <=> σ_{c1 ∧ c2}(A)
    • π_{a1}(A) <=> π_{a1}(π_{a1,etc}(A)), where etc is any attributes of A.
    • ...
  • 76.
    Operations on Bags (and why we care)
    • Union: {a,b,b,c} ∪ {a,b,b,b,e,f,f} = {a,a,b,b,b,b,b,c,e,f,f}
      o add the number of occurrences
    • Difference: {a,b,b,b,c,c} - {b,c,c,c,d} = {a,b,b}
      o subtract the number of occurrences (never going below zero)
    • Intersection: {a,b,b,b,c,c} ∩ {b,b,c,c,c,c,d} = {b,b,c,c}
      o minimum of the two numbers of occurrences
    • Selection: preserve the number of occurrences
    • Projection: preserve the number of occurrences (no duplicate elimination)
    • Cartesian product, join: no duplicate elimination
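The bag operations can be tried out with Python's `collections.Counter`, which stores one occurrence count per element. Using `Counter` as a bag is an assumption made for this illustration: `+` adds counts, `-` subtracts counts and drops anything that falls to zero or below, and `&` takes the minimum count.

```python
from collections import Counter

# Bag union: add the number of occurrences.
union = Counter("abbc") + Counter("abbbeff")

# Bag difference: subtract the number of occurrences, never going below zero.
difference = Counter("abbbcc") - Counter("bcccd")

# Bag intersection: minimum of the two numbers of occurrences.
intersection = Counter("abbbcc") & Counter("bbccccd")

print(sorted(union.elements()))
print(sorted(difference.elements()))
print(sorted(intersection.elements()))
```

Note that bag difference can never introduce an element that is absent from the left operand, so d does not appear in the difference even though it occurs in the right operand.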
  • 77.
  • 78.
    Summary
    • The relational model has rigorously defined query languages that are simple and powerful.
    • Relational algebra is more operational; useful as an internal representation for query evaluation plans.
    • There are several ways of expressing a given query; a query optimizer should choose the most efficient version.