Structured Query Language (SQL)
Lesson 1 - Introduction
Topics Covered
▪ What is Data?
▪ What are Databases?
▪ Need of Databases
▪ Types of Databases
▪ How Data is Stored in Relational Databases?
▪ What is SQL?
▪ Why SQL?
▪ What is Data Science?
▪ Why is SQL required for Data Science?
Database
Topic 1
What is Data?
▪ Derived from the Latin word “datum”, meaning a single fact, entity or point of matter
▪ Data is a collection of facts, information, or knowledge
▪ Examples
– Twitter tweets
– Company’s financial report
– Newsletter
– Sales data, Employee data, etc.
What are Databases?
A database is an organized collection of structured information, or data, typically stored electronically in a computer system [Source - Wikipedia]
Source – Pixabay.com
Features of a Database
▪ Data are raw facts that constitute the building blocks of information.
▪ A database is a collection of information together with a means to manipulate that data. It allows:
– Easy access
– Fast access
– Easier processing of data
Need of Databases
▪ Data is easier to store
▪ Data is easier to manage
▪ Multiple views of the data are possible
▪ Improved sharing of data
▪ Higher security of data
▪ Data quality can be enforced
▪ Better-integrated data
Types of Databases
▪ 1960s – Traditional files: data is stored in flat files
▪ 1970s – Hierarchical and network-based: files stored in a parent/child manner
▪ 1980s – Relational: data stored in a tabular manner
▪ 1990s – Object-oriented and object-relational: data was created as objects
▪ 2000s – Data warehouse, distributed database and big data: large data stored across networks
Types of databases
▪ Relational Databases
▪ Hierarchical Databases
▪ Graph based Databases
▪ Operational Databases
▪ Distributed Databases
▪ Data Warehouse
Database Management System (DBMS)
As per Techopedia,
A database management system (DBMS) is a software package
designed to define, manipulate, retrieve and manage data in a
database.
Source – BBC Bitesize
DBMS - Application
Relational Database Management System (RDBMS)
Topic 2
Relational Database Management System (RDBMS)
▪ A relational database is a digital database based on
the relational model of data, as proposed by E. F. Codd in 1970.
▪ A software system used to maintain relational databases is
a relational database management system (RDBMS).
▪ An RDBMS organizes data into tables, records and fields.
▪ Structured Query Language (SQL) is used for querying and
maintaining the database.
▪ Some popular RDBMSs are MySQL, MS SQL Server, IBM DB2, Oracle, PostgreSQL, Microsoft Access, etc.
Popular Relational Databases
Basic Elements of RDBMS
▪ Tables – collections of rows and columns, e.g. the Customer table as shown
▪ Records or tuples – each represents one row of the table
▪ Fields, column names or attributes – the columns of the table
▪ Keys – establish relationships between tables, e.g. Cust Number
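The elements listed above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the `customer` and `orders` tables, their column names and the sample rows are hypothetical and do not come from the slide's Customer table.

```python
import sqlite3

# In-memory database; every table, column and value here is an illustrative assumption.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Table: a named collection of rows and columns.
# Fields: cust_number, name, city. Key: cust_number (primary key).
conn.execute("""
    CREATE TABLE customer (
        cust_number INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        city        TEXT
    )
""")

# A second table whose cust_number field is a foreign key pointing back to customer.
conn.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        cust_number INTEGER NOT NULL REFERENCES customer(cust_number),
        amount      REAL
    )
""")

# Records: each tuple below becomes one row of its table.
conn.executemany(
    "INSERT INTO customer (cust_number, name, city) VALUES (?, ?, ?)",
    [(1, "Asha", "Pune"), (2, "Ben", "Leeds")],
)
conn.executemany(
    "INSERT INTO orders (order_id, cust_number, amount) VALUES (?, ?, ?)",
    [(10, 1, 250.0), (11, 1, 75.5), (12, 2, 40.0)],
)

# The key relates the two tables: match each order to the customer who placed it.
rows = conn.execute("""
    SELECT c.name, o.order_id, o.amount
    FROM customer AS c
    JOIN orders AS o ON o.cust_number = c.cust_number
    ORDER BY o.order_id
""").fetchall()

for row in rows:
    print(row)
```

With foreign keys enabled, an order that refers to a customer number missing from `customer` is rejected, which is one way a key keeps related tables consistent.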
SQL
Topic 3
What is SQL?
▪ SQL stands for Structured Query Language, pronounced "S-Q-L" or "See-Quel".
▪ Initially developed at IBM by Donald Chamberlin and Raymond Boyce in the early 1970s
▪ Became an ANSI standard in 1986 and an ISO standard in 1987
▪ SQL skills are in high demand in the industry
▪ SQL is the standard language for Relational Databases.
▪ SQL is used to create, insert, search, update and delete database records.
▪ SQL can also perform other operations, including optimization and maintenance of databases.
▪ Relational databases like MySQL, Oracle, MS SQL Server, Sybase, etc. use SQL.
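The create, insert, search, update and delete operations mentioned above might look like this when run through Python's built-in sqlite3 module. It is a minimal sketch: the `employee` table and its rows are hypothetical, and other database systems may differ slightly in SQL dialect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Create: define a (hypothetical) employee table.
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)

# Insert: add three records.
conn.executemany(
    "INSERT INTO employee (emp_id, name, dept, salary) VALUES (?, ?, ?, ?)",
    [
        (1, "Asha", "Sales", 52000.0),
        (2, "Ben", "Sales", 48000.0),
        (3, "Chen", "Finance", 61000.0),
    ],
)

# Search: fetch only the records that match a condition.
sales_staff = conn.execute(
    "SELECT name, salary FROM employee WHERE dept = ? ORDER BY name", ("Sales",)
).fetchall()

# Update: change a field of an existing record.
conn.execute("UPDATE employee SET salary = salary + 2000 WHERE emp_id = ?", (2,))

# Delete: remove a record.
conn.execute("DELETE FROM employee WHERE emp_id = ?", (3,))

remaining = conn.execute(
    "SELECT emp_id, name, salary FROM employee ORDER BY emp_id"
).fetchall()

print(sales_staff)
print(remaining)
```

The `?` placeholders pass values separately from the SQL text, which is the usual way to supply parameters with sqlite3.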
Why SQL?
▪ Popular because it is easy to understand
▪ It is a declarative language: a query states what data is wanted, not how to fetch it
▪ Reads and is written much like the English language
▪ Runs directly against the stored data inside the database, so access is typically fast
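To make "declarative" concrete, the sketch below answers the same question twice: once with an explicit Python loop that spells out each step, and once with a single SQL statement that only describes the result. The sample data and the `employee` table are assumptions made for illustration.

```python
import sqlite3

# Hypothetical sample data: (name, dept, salary)
employees = [
    ("Asha", "Sales", 52000.0),
    ("Ben", "Sales", 48000.0),
    ("Chen", "Finance", 61000.0),
]

# Imperative style: spell out HOW to find the answer, step by step.
imperative_result = []
for name, dept, salary in employees:
    if salary > 50000:
        imperative_result.append(name)
imperative_result.sort()

# Declarative style: state WHAT is wanted and let the database decide how to get it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary REAL)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", employees)
declarative_result = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM employee WHERE salary > 50000 ORDER BY name"
    )
]

print(imperative_result)
print(declarative_result)
```

The SQL statement also reads close to an English sentence: select the name from employee where salary is greater than 50000.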
Introduction to Data Science
Topic 4
What is Data Science?
Book on Prediction of Cancer Patient Outcomes Based on Artificial Intelligence - By Suk Lee, Eunbin Ju, etc.
How Does SQL Fit into Data Science?
▪ Since data is at the core of Data Science, there is a frequent need to store and access data
▪ The data could be very large, on the order of millions or billions of data points
▪ Hence SQL is needed to store, query and summarize the data efficiently
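One common pattern, sketched here under assumptions, is to let the database summarize a large table so that only a few result rows reach the analysis code. The `sales` table, its columns and the handful of sample rows are hypothetical stand-ins for a much larger data set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")

# A few sample rows standing in for millions of data points.
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [
        ("North", 120.0),
        ("South", 80.0),
        ("North", 200.0),
        ("South", 20.0),
        ("East", 50.0),
    ],
)

# The database groups and aggregates; only one summary row per region comes back.
summary = conn.execute("""
    SELECT region,
           COUNT(*)    AS n_sales,
           SUM(amount) AS total,
           AVG(amount) AS average
    FROM sales
    GROUP BY region
    ORDER BY region
""").fetchall()

for region, n_sales, total, average in summary:
    print(region, n_sales, total, average)
```

The summary rows can then be passed on to whatever analysis or modelling step comes next.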
Big Picture
Source – www.sql-datatools.com
Key Takeaways
▪ Data is everywhere, hence databases are used to store and manage it
▪ Databases can be relational or non-relational
▪ The basic elements of an RDBMS are tables, rows, columns and keys
▪ SQL is a structured query language and a vital tool for
accessing and manipulating data in the entire Data Science
life cycle
Queries?

Data Analytics - SQL Lesson 01 - Introduction.pptx


Editor's Notes

  • #8 https://mhaadi.wordpress.com/2010/10/18/the-evolution-of-database/
  • #20 Points to cover on these slides (Data Science):
      – Big Data – stored in various databases, formats, etc.
      – Data processing – where SQL can be used
      – Analysis – where SQL can be used
      – Machine Learning – the extracted data (using SQL, etc.) is used to build models, and the models are used for prediction, etc.
  • #22 Points to explain: data stored in various data sources is used by different groups (purchase, sales, etc.) to perform tasks such as analysis, business intelligence, analytics, etc.