4. Introduction
• What is SQL?
• SQL stands for Structured Query Language.
What does "query language" mean?
You are at a supermarket.
You want to know how many employees work there.
The manager counts from a list and replies.
Here, English is the query language.
What does "structured" mean?
Suppose the manager is a computer.
The query must be structured so that the computer can understand it.
• SQL is the language that the computer (i.e. the database server) understands.
7. SQL Recap
What are a DB and a table?
DB stands for database.
Tables are smaller sub-sections of a DB.
Relational Database
The sales table references the employees table (via employee_id) and the items table (via item_id).
We have a group of tables such that each table is connected to one or more other tables.
What is the advantage?
Consider what would happen if the tables were not related:
Every detail of each item and employee would have to be stored in the sales table
A lot of redundant data
Wasted space
Poorly organized data
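To make the advantage concrete, here is a minimal sketch using Python's built-in sqlite3 module, chosen only so that the example is self-contained. The table names and the employee_id and item_id columns come from the slide; every other column and all of the data are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE items     (item_id     INTEGER PRIMARY KEY, name TEXT, price REAL);
-- sales stores only the two ids, not a copy of the employee and item details
CREATE TABLE sales (
    sale_id     INTEGER PRIMARY KEY,
    employee_id INTEGER REFERENCES employees(employee_id),
    item_id     INTEGER REFERENCES items(item_id)
);
INSERT INTO employees VALUES (1, 'Asha'), (2, 'Ben');
INSERT INTO items     VALUES (10, 'Milk', 1.5), (11, 'Bread', 2.0);
INSERT INTO sales     VALUES (100, 1, 10), (101, 1, 11), (102, 2, 10);
""")

# The details are looked up through the ids whenever they are needed.
sales_report = conn.execute("""
    SELECT s.sale_id, e.name, i.name, i.price
    FROM sales s
    JOIN employees e ON e.employee_id = s.employee_id
    JOIN items i     ON i.item_id     = s.item_id
    ORDER BY s.sale_id
""").fetchall()

for row in sales_report:
    print(row)
```

Because the price of an item lives in one place only, changing it means updating a single row in items rather than every sale that mentions it.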
8. Entity-Relationship diagrams
• ER diagrams are pictures that represent how the data in a database are supposed to be
connected or related to one another.
11. Relational schemas
• Relational schemas are basically maps of the database.
• Sometimes relational schemas are missing important information that is contained in an ER
diagram, so it is useful to practice and understand both tools.
• ER diagrams represent the concepts that database architects have to implement, but they don't
yet represent how a database is actually organized.
• To describe the model of what a database truly looks like, you use a relational schema.
13. Critical points
• Each entity in the ER diagram is converted into a table in the relational schema.
• Primary Keys , Foreign Keys
• Translating ER to Relational Schema
15. Constraints
• SQL constraints are integrity rules: conditions that the data in a column must keep
satisfying whenever rows are inserted, updated or deleted.
Constraints can be specified when the table is first created with a CREATE TABLE statement,
or later, when the structure of an existing table is modified with an ALTER TABLE
statement.
• SQL constraints are used to enforce the rules of the table. If an action on the table
would violate a constraint, the action is aborted.
16. • CREATE TABLE Persons (
-- NOT NULL, DEFAULT and UNIQUE
ID int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int,
City varchar(255) DEFAULT 'Sandnes',
UNIQUE (ID)
);
• CREATE TABLE Persons (
-- PRIMARY KEY and CHECK
ID int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int,
PRIMARY KEY (ID),
CHECK (Age >= 18)
);
• CREATE TABLE Orders (
-- FOREIGN KEY: PersonID must match an existing Persons.ID
OrderID int NOT NULL,
OrderNumber int NOT NULL,
PersonID int,
PRIMARY KEY (OrderID),
FOREIGN KEY (PersonID) REFERENCES Persons(ID)
);
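The following sketch shows constraints aborting bad statements. It runs the table definitions through Python's built-in sqlite3 module (an assumption made only to keep the example self-contained), merges the Persons variants into one table, and assumes the foreign key points at Persons(ID), the column that the Persons table actually defines. The inserted rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Persons (
    ID int NOT NULL,
    LastName varchar(255) NOT NULL,
    FirstName varchar(255),
    Age int,
    City varchar(255) DEFAULT 'Sandnes',
    PRIMARY KEY (ID),
    CHECK (Age >= 18)
);
CREATE TABLE Orders (
    OrderID int NOT NULL,
    OrderNumber int NOT NULL,
    PersonID int,
    PRIMARY KEY (OrderID),
    FOREIGN KEY (PersonID) REFERENCES Persons(ID)
);
""")

# A valid row; City is omitted, so the DEFAULT value is used.
conn.execute("INSERT INTO Persons (ID, LastName, Age) VALUES (1, 'Hansen', 30)")

# Each statement below breaks exactly one constraint.
bad_statements = {
    "NOT NULL":    "INSERT INTO Persons (ID, LastName, Age) VALUES (2, NULL, 40)",
    "CHECK":       "INSERT INTO Persons (ID, LastName, Age) VALUES (3, 'Berg', 17)",
    "PRIMARY KEY": "INSERT INTO Persons (ID, LastName, Age) VALUES (1, 'Dahl', 50)",
    "FOREIGN KEY": "INSERT INTO Orders (OrderID, OrderNumber, PersonID) VALUES (1, 77, 999)",
}
rejected = []
for constraint, sql in bad_statements.items():
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError as exc:
        rejected.append(constraint)
        print(f"{constraint} violation aborted the insert: {exc}")

default_city = conn.execute("SELECT City FROM Persons WHERE ID = 1").fetchone()[0]
person_count = conn.execute("SELECT COUNT(*) FROM Persons").fetchone()[0]
print(default_city, person_count)
```

Only the first, valid row ends up in Persons; every other statement is rejected before it changes the table.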
17. SQL
1. DDL – Data Definition Language (CREATE, ALTER, DROP)
2. DQL – Data Query Language (SELECT)
3. DML – Data Manipulation Language (INSERT, UPDATE, DELETE)
4. DCL – Data Control Language (GRANT, REVOKE)
26. Analytical Functions: ORDER BY
• analytic_function_name([argument_list])
OVER (
[PARTITION BY partition_expression, …]
[ORDER BY sort_expression [ASC|DESC], …])
analytic_function_name: the name of the function, such as RANK(), SUM() or FIRST_VALUE().
• Example task: calculate the running average revenue and the running total revenue for each agent in the third quarter.
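A running average and running total per agent could be sketched as follows. The orders table, its columns (agent_code, ord_date, ord_amount) and its rows are hypothetical, and the query is run through Python's sqlite3 module purely for convenience (window functions need SQLite 3.25 or newer).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (agent_code TEXT, ord_date TEXT, ord_amount REAL);
INSERT INTO orders VALUES
    ('A1', '2020-06-30', 999),  -- second quarter, filtered out below
    ('A1', '2020-07-05', 100),
    ('A1', '2020-08-10', 300),
    ('A1', '2020-09-15', 200),
    ('A2', '2020-07-20', 500),
    ('A2', '2020-09-01', 100);
""")

# PARTITION BY restarts the calculation for every agent; ORDER BY makes the
# aggregate accumulate row by row in date order instead of covering the whole partition.
running = conn.execute("""
    SELECT agent_code, ord_date, ord_amount,
           AVG(ord_amount) OVER (PARTITION BY agent_code ORDER BY ord_date) AS running_avg,
           SUM(ord_amount) OVER (PARTITION BY agent_code ORDER BY ord_date) AS running_total
    FROM orders
    WHERE ord_date BETWEEN '2020-07-01' AND '2020-09-30'
    ORDER BY agent_code, ord_date
""").fetchall()

for row in running:
    print(row)
```

Dropping the ORDER BY inside OVER would give every row of an agent the same average and total for the whole quarter.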
28. LEAD() and LAG()
• The LEAD() function, as the name suggests, fetches the value of a specific column from a following row and
returns it in the current row. In PostgreSQL, LEAD() is usually called with two arguments (an optional third sets a default value):
• the column (or expression) from which the next value is fetched
• the offset of the target row relative to the current row (1, the default, means the very next row).
• LAG() is just the opposite of LEAD(): it fetches values from preceding rows.
• Example task: for each order, find the last highest amount for which the same agent sold an order.
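One possible reading of this task is sketched below: within each agent's orders sorted from highest to lowest, LAG() returns the amount one row earlier (the next higher one) and LEAD() the amount one row later. The orders table, its columns and its rows are hypothetical, and the query is run with Python's sqlite3 module (SQLite 3.25+) rather than PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (ord_id INTEGER, agent_code TEXT, ord_amount INTEGER);
INSERT INTO orders VALUES
    (1, 'A1', 500), (2, 'A1', 300), (3, 'A1', 100),
    (4, 'A2', 400), (5, 'A2', 250);
""")

lead_lag = conn.execute("""
    SELECT agent_code, ord_amount,
           LAG(ord_amount, 1)  OVER (PARTITION BY agent_code ORDER BY ord_amount DESC) AS last_highest,
           LEAD(ord_amount, 1) OVER (PARTITION BY agent_code ORDER BY ord_amount DESC) AS next_lower
    FROM orders
    ORDER BY agent_code, ord_amount DESC
""").fetchall()

# Rows without a previous/next row in their partition get NULL (None in Python).
for row in lead_lag:
    print(row)
```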
30. RANK(), DENSE_RANK() and ROW_NUMBER()
• RANK() and DENSE_RANK() are numbering functions. They assign an integer value to a row depending
upon the partition and the ordering. These functions are essential for finding the nth highest/lowest
record in a table.
• DENSE_RANK() and RANK() differ in that the former gives consecutive ranks, while in the
latter the rank after a tie is skipped. For example, ranking using DENSE_RANK() would be something like
(1,2,2,3), whereas ranking using RANK() would be (1,2,2,4). ROW_NUMBER() ignores ties and simply numbers the rows (1,2,3,4).
• Example task: what are the second highest order values for each month?
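The sketch below compares the three numbering functions on tied values and then uses DENSE_RANK() to pick the second highest order value per month. The orders table and its data are invented, and strftime() is SQLite's way of extracting the month (PostgreSQL would use a different function); the queries are run through Python's sqlite3 module (SQLite 3.25+).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (ord_date TEXT, ord_amount INTEGER);
INSERT INTO orders VALUES
    ('2020-07-01', 500), ('2020-07-09', 500), ('2020-07-15', 300), ('2020-07-20', 100),
    ('2020-08-03', 400), ('2020-08-12', 200);
""")

# Side-by-side numbering of the July orders, which contain a tie at 500.
july_ranks = conn.execute("""
    SELECT ord_amount,
           RANK()       OVER (ORDER BY ord_amount DESC) AS rnk,
           DENSE_RANK() OVER (ORDER BY ord_amount DESC) AS dense_rnk,
           ROW_NUMBER() OVER (ORDER BY ord_amount DESC) AS row_num
    FROM orders
    WHERE strftime('%Y-%m', ord_date) = '2020-07'
    ORDER BY ord_amount DESC, row_num
""").fetchall()

# DENSE_RANK() = 2 is the second highest distinct value, even after a tie for first place.
second_highest = conn.execute("""
    SELECT month, ord_amount
    FROM (
        SELECT strftime('%Y-%m', ord_date) AS month, ord_amount,
               DENSE_RANK() OVER (PARTITION BY strftime('%Y-%m', ord_date)
                                  ORDER BY ord_amount DESC) AS dense_rnk
        FROM orders
    )
    WHERE dense_rnk = 2
    ORDER BY month
""").fetchall()

print(july_ranks)
print(second_highest)
```

With RANK() the July partition has no row ranked 2 at all, which is why DENSE_RANK() is the safer choice for "nth highest value" questions.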
32. CUME_DIST()
• The CUME_DIST() function calculates the cumulative distribution of values within a given partition. It
computes the fraction of rows in the partition whose value is less than or equal to that of the current row. It is very
helpful when we have to fetch only the top n% of the results.
• Example task: calculate the revenue percentile for each order in August and September.
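A sketch of this calculation follows, with a hypothetical orders table and made-up amounts, partitioned by month so that each order is compared only with orders from the same month. It is run through Python's sqlite3 module (SQLite 3.25+), so the month is extracted with SQLite's strftime().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (ord_date TEXT, ord_amount INTEGER);
INSERT INTO orders VALUES
    ('2020-07-30', 900),  -- July, filtered out below
    ('2020-08-02', 100), ('2020-08-10', 200), ('2020-08-11', 200), ('2020-08-25', 400),
    ('2020-09-05', 50),  ('2020-09-18', 150);
""")

percentiles = conn.execute("""
    SELECT strftime('%m', ord_date) AS month, ord_amount,
           CUME_DIST() OVER (PARTITION BY strftime('%m', ord_date)
                             ORDER BY ord_amount) AS percentile
    FROM orders
    WHERE strftime('%m', ord_date) IN ('08', '09')
    ORDER BY month, ord_amount
""").fetchall()

# Tied amounts share a percentile: 3 of the 4 August orders are <= 200, so both get 0.75.
for month, amount, pct in percentiles:
    print(month, amount, round(pct, 2))
```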
34. Interview Questions
• Write a SQL query to get the second highest salary from the Employee table. For example, given the
Employee table below, the query should return 200 as the second highest salary. If there is no second
highest salary, then the query should return
• Write a SQL query to find all duplicate emails in a table named Person.
SELECT DISTINCT a.Email FROM Person a, Person b WHERE a.Email = b.Email AND a.id != b.id;
35. • Given a Weather table, write a SQL query to find the Ids of all dates with a higher temperature
than the previous day (yesterday).
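A self-join is one way to compare each day with the day before it, as in this sketch. The slide names only the Weather table, so the columns Id, RecordDate and Temperature and all rows are assumptions; date(..., '+1 day') is SQLite's date arithmetic, and the query is run through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Weather (Id INTEGER PRIMARY KEY, RecordDate TEXT, Temperature INTEGER);
INSERT INTO Weather VALUES
    (1, '2015-01-01', 10),
    (2, '2015-01-02', 25),
    (3, '2015-01-03', 20),
    (4, '2015-01-04', 30),
    (5, '2015-01-06', 35);  -- no row for 2015-01-05, so there is no "yesterday" to compare with
""")

# Pair each row (today) with the row dated exactly one day earlier (yesterday).
warmer_ids = [row[0] for row in conn.execute("""
    SELECT today.Id
    FROM Weather today
    JOIN Weather yesterday
      ON today.RecordDate = date(yesterday.RecordDate, '+1 day')
    WHERE today.Temperature > yesterday.Temperature
    ORDER BY today.Id
""")]

print(warmer_ids)
```

Joining on the date rather than on Id - 1 keeps the result correct when days are missing or the ids are not in date order.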
• Mary is a teacher in a middle school, and she has a table seat storing students' names and their corresponding
seat ids. The column id is a continuous increment. Mary wants to swap the seats of adjacent students.
Can you write a SQL query to output the result for Mary?
37. • Think of a CASE WHEN THEN statement like an IF statement in coding.
• The first WHEN statement checks whether there is an odd number of rows and, if there is, ensures
that the id of the last row does not change, since that student has nobody to swap with.
• The second WHEN statement adds 1 to each odd id (e.g. 1,3,5 becomes 2,4,6).
• Similarly, the third WHEN statement subtracts 1 from each even id (2,4,6 becomes 1,3,5).
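The three branches described above could be written as in the following sketch. It is a reconstruction from the description, not necessarily the exact query from the slide; the student names are placeholders, the result column is aliased new_id to avoid any ambiguity with the original id, and the query is run through Python's sqlite3 module.

```python
import sqlite3

SWAP_QUERY = """
    SELECT CASE
               -- odd number of rows: the last student keeps the same seat
               WHEN (SELECT MAX(id) FROM seat) % 2 = 1
                    AND id = (SELECT MAX(id) FROM seat) THEN id
               -- odd ids move one seat up
               WHEN id % 2 = 1 THEN id + 1
               -- even ids move one seat down
               WHEN id % 2 = 0 THEN id - 1
           END AS new_id,
           student
    FROM seat
    ORDER BY new_id
"""

def swap_seats(students):
    """Seat the students at ids 1..n, then return the rows after swapping neighbours."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE seat (id INTEGER PRIMARY KEY, student TEXT)")
    conn.executemany("INSERT INTO seat (id, student) VALUES (?, ?)",
                     list(enumerate(students, start=1)))
    return conn.execute(SWAP_QUERY).fetchall()

print(swap_seats(["Abbot", "Doris", "Emerson", "Green", "Jeames"]))
```

The order of the branches matters: the last-row check has to come first, otherwise the final odd id would be pushed to a seat that does not exist.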
38. • 5 Common SQL Interview Problems for Data Scientists 📘
• 46 Questions to test a Data Scientist on SQL 📘
• 30 SQL Interview Questions curated for FAANG by an Ex-Facebook Data Scientist 📘
• SQL Interview Questions 📘
• How to ace Data Science Interviews - SQL 📘
• 3 Must Know SQL Questions to pass your Data Science Interview 📘
• 10 frequently asked SQL Queries in Interviews 📘
• Technical Data Science Interview Questions: SQL and Coding 📘
• How to optimize SQL Queries - Datacamp 📘
• Ten SQL Concepts You Should Know for Data Science Interviews 📘
• https://www.nicksingh.com/posts/30-sql-and-database-design-questions-from-real-data-science-interviews
• https://365datascience.com/sql-interview-questions/
• https://www.java67.com/2013/04/10-frequently-asked-sql-query-interview-questions-answers-database.html
• https://github.com/rbhatia46/Data-Science-Interview-Resources
• https://data36.com/sql-interview-questions-tech-screening-data-analysts/
• https://gdcoder.com/how-to-nail-a-data-scientist-sql-interview-includes-sql-code/
• https://www.datacamp.com/community/tutorials/sql-tutorial-query