HW 2 - SQL
The database you will use for this assignment contains information related to Major League
Baseball (MLB) about players, teams, and games. The relations are:
Players(playerID, playerName, team, position, birthYear)
● playerID is a player identifier used in MLB, and all players throughout the history of
baseball have a unique ID
● playerName is the player’s name
● team is the name of the MLB team the player is currently playing on (or the last team the
player played for if they are not currently playing)
● position is the position of the player
● birthYear is the year the player was born
Teams(teamID, teamName, home, leagueName)
● teamID is a unique ID internal to MLB.
● teamName is the name of the team
● home is the home city of the team
● leagueName is the league the team is in, i.e. either “National” or “American”, which
stands for “National League” and “American League”, respectively
Games(gameID, homeTeamID, guestTeamID, date)
● gameID is a unique ID used internally in MLB
● homeTeamID is the ID of the home team
● guestTeamID is the ID of the visiting team
● date is the date of the game.
A sample instance of this database is given at the end of this homework handout. Since it is just
one instance of the database designed to give you some intuition, you should not “customize”
your answer to work only with this instance.
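To illustrate what “customizing” an answer to one instance means, here is a minimal sketch using Python's built-in sqlite3 module. The query (names of teams in the same league as the Yankees) is not one of the assigned queries, and the second instance, in which the Yankees are recorded in the National league, is purely hypothetical. The Teams rows are copied from the sample instance.

```python
import sqlite3

# Teams rows copied from the sample instance at the end of the handout.
TEAMS = [
    (1, "Phillies", "Philadelphia", "National"),
    (2, "Braves", "Atlanta", "National"),
    (3, "Yankees", "New York", "American"),
    (4, "Twins", "Minnesota", "American"),
    (5, "Rangers", "Texas", "American"),
    (6, "Cubs", "Chicago", "National"),
]

# Customized to the sample: hard-codes 'American', a fact read off this instance.
STATE_DEPENDENT = """
    SELECT teamName FROM Teams
    WHERE leagueName = 'American'
    ORDER BY teamName
"""

# Looks the league up in whatever instance is currently loaded.
STATE_INDEPENDENT = """
    SELECT teamName FROM Teams
    WHERE leagueName IN (SELECT leagueName FROM Teams WHERE teamName = 'Yankees')
    ORDER BY teamName
"""

def run(conn, query):
    """Return the first column of every result row as a list."""
    return [row[0] for row in conn.execute(query)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Teams (teamID INTEGER, teamName TEXT, home TEXT, leagueName TEXT)"
)
conn.executemany("INSERT INTO Teams VALUES (?, ?, ?, ?)", TEAMS)

# On the sample instance both queries agree.
sample_dependent = run(conn, STATE_DEPENDENT)
sample_independent = run(conn, STATE_INDEPENDENT)

# Hypothetical second instance: the Yankees are recorded in the National league.
conn.execute("UPDATE Teams SET leagueName = 'National' WHERE teamName = 'Yankees'")
other_dependent = run(conn, STATE_DEPENDENT)
other_independent = run(conn, STATE_INDEPENDENT)

print(sample_dependent == sample_independent)  # True
print(other_dependent)     # ['Rangers', 'Twins']
print(other_independent)   # ['Braves', 'Cubs', 'Phillies', 'Yankees']
```

The two queries agree on the sample instance but disagree once the data changes. Only the second one still answers the question as stated.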
1. (10 points each) Write the following queries in SQL, using the schema provided
above. (Note: Your queries must not be “state-dependent”, that is, they should work without
modification even if another instance of the database is given.)
(a) Print the names of all players who were born in 1970 and played for the Braves.
(b) Print the names of teams that do not have a pitcher.
(c) Print the names of all players who have played in the National League.
(d) Print all gameIDs with Phillies as the home team.
2. (15 points each) Write the following queries in SQL, using the schema provided
above.
(a) Print all teamIDs where the team played against the Phillies but not against the Braves.
(b) Print all tuples (playerID1, playerID2, team) where playerID1 and playerID2 are (or have
been) on the same team. Avoid listing self-references or duplicates, e.g. do not allow
(1,1,“Braves”) or both (2,5,“Phillies”) and (5,2,“Phillies”).
(c) Print all tuples (teamID1, league1, teamID2, league2, date) where teamID1 and teamID2
played against each other in a World Series game. Although there is no direct information
about the World Series games in the relations, we can infer that when two teams from different
leagues play each other, it is a World Series game. So, in this relation, league1 and league2
should be different leagues.
(d) List all cities that have a team in all leagues. For example, there are currently two leagues
(National and American). Although not shown in this instance, New York is home to the Mets in
the National league as well as the Yankees in the American league (Chicago also has one in each
league, for those of you who are baseball fans). Remember that your query must work over all
instances of this schema, even if there are more than two leagues in the instance.
Players
playerID  playerName      team       position    birthYear
1         Javy Lopez      Braves     Catcher     1970
2         Cliff Lee       Phillies   Pitcher     1978
3         Derek Jeter     Yankees    Infielder   1974
4         Skip Schumaker  Cardinals  Infielder   1980
5         Dominic Brown   Phillies   Outfielder  1987
Teams
teamID  teamName  home          leagueName
1       Phillies  Philadelphia  National
2       Braves    Atlanta       National
3       Yankees   New York      American
4       Twins     Minnesota     American
5       Rangers   Texas         American
6       Cubs      Chicago       National
Games
gameID  homeTeamID  guestTeamID  date
1       3           6            04/21/2010
2       1           4            04/21/2010
3       2           5            04/30/2010
4       6           3            05/02/2010
5       4           5            05/02/2010
6       1           5            05/06/2010
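For trying queries out, the sample instance above can be loaded into an in-memory SQLite database. The following is a minimal sketch, not part of the assignment. The column types are assumptions (the handout does not specify them), dates are stored as text exactly as printed, and no key or foreign-key constraints are declared.

```python
import sqlite3

# Column types are assumptions; the handout gives only attribute names.
SCHEMA = """
CREATE TABLE Players (playerID INTEGER, playerName TEXT, team TEXT,
                      position TEXT, birthYear INTEGER);
CREATE TABLE Teams   (teamID INTEGER, teamName TEXT, home TEXT, leagueName TEXT);
CREATE TABLE Games   (gameID INTEGER, homeTeamID INTEGER, guestTeamID INTEGER,
                      date TEXT);
"""

# Rows copied from the sample instance in the handout.
PLAYERS = [
    (1, "Javy Lopez", "Braves", "Catcher", 1970),
    (2, "Cliff Lee", "Phillies", "Pitcher", 1978),
    (3, "Derek Jeter", "Yankees", "Infielder", 1974),
    (4, "Skip Schumaker", "Cardinals", "Infielder", 1980),
    (5, "Dominic Brown", "Phillies", "Outfielder", 1987),
]
TEAMS = [
    (1, "Phillies", "Philadelphia", "National"),
    (2, "Braves", "Atlanta", "National"),
    (3, "Yankees", "New York", "American"),
    (4, "Twins", "Minnesota", "American"),
    (5, "Rangers", "Texas", "American"),
    (6, "Cubs", "Chicago", "National"),
]
GAMES = [
    (1, 3, 6, "04/21/2010"),
    (2, 1, 4, "04/21/2010"),
    (3, 2, 5, "04/30/2010"),
    (4, 6, 3, "05/02/2010"),
    (5, 4, 5, "05/02/2010"),
    (6, 1, 5, "05/06/2010"),
]

def load_sample_instance():
    """Create the three relations in memory and insert the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO Players VALUES (?, ?, ?, ?, ?)", PLAYERS)
    conn.executemany("INSERT INTO Teams VALUES (?, ?, ?, ?)", TEAMS)
    conn.executemany("INSERT INTO Games VALUES (?, ?, ?, ?)", GAMES)
    conn.commit()
    return conn

conn = load_sample_instance()
# Sanity check: row counts per relation.
counts = {
    table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    for table in ("Players", "Teams", "Games")
}
print(counts)  # {'Players': 5, 'Teams': 6, 'Games': 6}
```

Because this is only one instance, a query that returns the expected rows here is not thereby shown to be correct for every instance of the schema.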
HW1 - Relational Algebra
The database you will use for this assignment contains
information related to Major League
Baseball (MLB) about players, teams, and games. The relations
are:
Players(playerID, playerName, team, position, birthYear)
● playerID is a player identifier used in MLB, and all players
throughout the history of
baseball have a unique ID
● playerName is the player’s name
● team is the name of the MLB team the player is currently
playing on (or the last team the
player played for if they are not currently playing)
● position is the position of the player
● birthYear is the year the player was born
Teams(teamID, teamName, home, leagueName)
● teamID is a unique ID internal to MLB.
● teamName is the name of the team
● home is the home city of the team
● leagueName is the league the team is in, i.e. either “National”
or “American”, which
stands for “National League” and “American League”,
respectively
Games(gameID, homeTeamID, guestTeamID, date)
● gameID is a unique ID used internally in MLB
● homeTeamID is the ID of the home team
● guestTeamID is the ID of the visiting team
● date is the date of the game.
A sample instance of this database is given at the end of this
homework handout. Since it is just
one instance of the database designed to give you some
intuition, you should not “customize”
your answer to work only with this instance.
1. (5 points each) Consider the schema given above:
(a) Give a primary key for each relation. Are there any relations
for which there is an alternate
candidate key which you have not chosen as the primary key? If
yes, mention the relations,
candidate keys and the reason (if any) for your choice of the
primary key.
(b) State all referential integrity constraints (inclusion
dependencies) that should hold on these
relations.
(c) Note that there is no way to represent the fact that a player
may have played on several
different teams (for example, Javy Lopez played for the Braves,
Orioles, and Red Sox before
retiring), or that they are currently retired. How would you
modify the schema to take this into
account? (Hint: try to do it in a way that information is not
repeated unnecessarily.)
------------------------
For the next parts, if a query is long, feel free to break it up into
a series of queries with
intermediate answers stored in temporary relations (e.g. “let
temp =.....”). You may also use just
the first letter of each relation name since they are unique (e.g.
“P” for “Players”). Also, for
ease of typing, you can use words for operations (e.g., “proj” for
projection).
Note: Your queries must not be “state-dependent”, that is, they
should work without modification
even if another instance of the database is given.
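As an illustration of the conventions just described, here is a sketch of a short query broken into two steps with a temporary relation. The query (names of teams in the American League) is not one of the assigned queries, and T abbreviates Teams as permitted above.

```latex
\begin{align*}
\text{temp}   &= \sigma_{\text{leagueName} = \text{`American'}}(T) \\
\text{answer} &= \pi_{\text{teamName}}(\text{temp})
\end{align*}
```

Written with words for the operations, the same two steps might read “let temp = select[leagueName = 'American'](T)” followed by “proj[teamName](temp)”. The handout names only “proj”, so the word “select” and the bracket syntax are assumptions made for this example.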
2. (8 points each) Write the following queries in relational
algebra, using the
schema provided above.
(a) Print the names of all players who were born in 1970 and
played for the Braves.
(b) Print the names of teams that do not have a pitcher.
(c) Print the names of all players who have played in the National
League.
(d) Print all gameIDs with Phillies as the home team.
(e) Print all teamIDs where the team played against the Phillies
but not against the Braves.
3. (15 points each) Write the following queries in relational
algebra, using the
schema provided above.
(a) Define a relation Members(playerID1, playerID2, team)
where playerID1 and playerID2 are
(or have been) on the same team. Avoid listing self-references
or duplicates, e.g. do not allow
(1,1,“Braves”) or both (2,5,“Phillies”) and (5,2,“Phillies”).
(b) Define a relation WorldSeries(teamID1, league1, teamID2,
league2, date) where teamID1
and teamID2 played against each other in a World Series game.
Although there is no direct
information about the World Series in the relations, we can
infer that when two teams from
different leagues play each other, it is a World Series game. So,
in this relation, league1 and
league2 should be different leagues.
(c) Define a relation AllLeagues(city) for which each city has a
team in all leagues. For
example, there are currently two leagues (National and
American). Although not shown in this
instance, New York is home to the Mets in the National league
as well as the Yankees in the
American league (Chicago also has one in each league, for those
of you who are baseball
fans). Remember that your query must work over all instances
of this schema, even if there
are more than two leagues in the instance.
Players
playerID  playerName      team       position    birthYear
1         Javy Lopez      Braves     Catcher     1970
2         Cliff Lee       Phillies   Pitcher     1978
3         Derek Jeter     Yankees    Infielder   1974
4         Skip Schumaker  Cardinals  Infielder   1980
5         Dominic Brown   Phillies   Outfielder  1987
Teams
teamID  teamName  home          leagueName
1       Phillies  Philadelphia  National
2       Braves    Atlanta       National
3       Yankees   New York      American
4       Twins     Minnesota     American
5       Rangers   Texas         American
6       Cubs      Chicago       National
Games
gameID  homeTeamID  guestTeamID  date
1       3           6            04/21/2010
2       1           4            04/21/2010
3       2           5            04/30/2010
4       6           3            05/02/2010
5       4           5            05/02/2010
6       1           5            05/06/2010