SlideShare a Scribd company logo
P 
P 
D 
Manipulating Data C 
with Style in SQL 
An introduction to SQL, the interface language to 
most of the world’s structured data, and 
practices for readable and reusable SQL code 
Ryan B. Harvey 
! 
October 14, 2014
P 
P 
D 
Relational Data C 
•Relational data is organized in tables 
consisting of columns and rows 
•Fields (columns) consist of a column 
name and data type constraint 
•Records (rows) in a table have a common 
field (column) structure and order 
•Records (rows) are linked across tables 
by key fields 
Relational Data Model: Codd, Edgar F. “A Relational Model of Data for Large Shared Data Banks” (1970)
P 
P 
D 
Intro to SQL C 
•SQL (“Structured Query Language”) is a 
declarative data definition and query 
language for relational data 
•SQL is an ISO/IEC standard with many 
implementations in common database 
management systems (a few below) 
Structured Query Language: ISO/IEC 9075 (standard), first appeared 1974, current version SQL:2011
P 
P 
D 
C Which database system 
should I use? 
1. Use the one your data is in 
2. Unless you need specific things 
(performance, functions, etc.), 
use the one you know best 
3. If you need other stuff or you’ve never 
used a database before: 
A. SQLite: FOSS, one file db, easy/limited 
B. PostgreSQL: FOSS, Enterprise-ready 
The above are my opinions based on experience. Others may disagree, and that’s OK.
P 
P 
D 
SQL: Working with Objects C 
feature comparison 
•Data Definition Language (DB Objects) 
•CREATE (table, index, view, function, …) 
•ALTER (table, index, view, function, …) 
•DROP (table, index, view, function, …)
P 
P 
D 
SQL: Working with Rows C 
feature comparison 
•Data Manipulation Language (Records) 
aka Query Language 
•SELECT … FROM … 
•INSERT INTO … 
•UPDATE … SET … 
•DELETE FROM …
P 
P 
D 
SQL: SELECT Statement C 
•SELECT <col_list> FROM <table> … 
•Merging/Column Binding: JOIN clause 
•Row binding: UNION clause 
•Filtering: WHERE clause 
•Aggregation: GROUP BY clause 
•Aggregated filtering: HAVING clause 
•Sorting: ORDER BY clause 
feature comparison
P 
P 
D 
C Intro to Relational Algebra 
•Basic operators 
SELECT WHERE, HAVING 
PROJECT <COL_LIST> 
RENAME AS 
! 
•Join operators: inner/outer, cartesian 
•Set operators: union, intersect, set 
minus, and, or, etc. 
•SELECT name, id FROM t1 WHERE id<3 
AND dob<DATE ‘2004-01-01’ 
(ΠNAME,ID σID<3 ∧ DOB<(1/1/2004) T1) 
For a very detailed Intro to Relational Algebra, see lecture notes from 2005 databases course, IT U of Copenhagen
P 
P 
D 
SQL Beginner Resources C 
•Basic SQL Commands Reference: 
http://www.cs.utexas.edu/~mitra/ 
csFall2013/cs329/lectures/sql.html
P 
P 
D 
C 
SQL: Common Table 
Expressions (CTEs) 
•WITH <name> [(<col_list>)] AS (SELECT …) 
•SELECT <col_list> FROM <table or CTE> … 
•Merging/Column Binding: JOIN clause 
•Row binding: UNION clause 
•Filtering: WHERE clause 
•Aggregation: GROUP BY clause 
•Aggregated filtering: HAVING clause 
•Sorting: ORDER BY clause 
Same 
as 
before! 
feature comparison
P 
P 
D 
SQL: Views from SELECTs C 
•CREATE VIEW <name> AS … 
•SELECT <col_list> FROM <table> … 
•Merging/Column Binding: JOIN clause 
•Row binding: UNION clause 
•Filtering: WHERE clause 
•Aggregation: GROUP BY clause 
•Aggregated filtering: HAVING clause 
•Sorting: ORDER BY clause 
feature comparison
P 
P 
D 
SQL: Functions from Views C 
•CREATE FUNCTION <name> (<params>) AS … 
•SELECT … <params> … 
•Merging/Column Binding: JOIN clause 
•Row binding: UNION clause 
•Filtering: WHERE clause 
•Aggregation: GROUP BY clause 
•Aggregated filtering: HAVING clause 
•Sorting: ORDER BY clause 
feature comparison
P 
P 
D 
SQL: Tuning with EXPLAIN C 
•EXPLAIN <options> SELECT … 
Same 
•rows scanned: COST option 
as 
•wordy response: VERBOSE option 
before! 
•output formatting: FORMAT option 
•actually run it: ANALYZE option 
•runtime (only with ANALYZE): TIMING option 
•(EXPLAIN is not part of the SQL standard, 
but most implementations support it)
P 
P 
D 
SQL: Tuning using Indexes C 
•CREATE INDEX <name> ON <table> 
(<col_list|expression>) … 
•UNIQUE indices for key fields 
•Use functions in expressions: 
What’s in 
your 
WHERE 
clause? 
feature comparison 
LOWER(<text_col>), INT(<num_col>) 
•Specify ordering (ASC, DESC, NULLS FIRST, 
etc.) and method (BTREE, HASH, GIST, etc.) 
•Partial indexes via WHERE clause
P 
P 
D 
C SQL in other languages 
(or, accessing data in databases via sql in other languages) 
•R with libraries 
•RPostgreSQL, dplyr 
! 
•Python with modules 
•psycopg2, SQLAlchemy
P 
P 
D 
C SQL in other languages 
(or, operating on other languages’ data structures via sql) 
•R with libraries 
•RSQLite, sqldf 
! 
•Python with modules 
•Pandas, PandaSQL 
Mostly, 
Data 
Frames.
P 
P 
D 
C 
Slides and code are 
available on GitHub at 
nihonjinrxs/polyglot-october2014!
P 
P 
D 
C 
Ryan B. Harvey 
http://datascientist.guru 
ryan.b.harvey@gmail.com 
@nihonjinrxs 
+ryan.b.harvey 
Employment & Affiliations* 
IT Project Manager 
Office of Management and Budget 
Executive Office of the President 
! 
Thank you! 
! 
Questions? 
Data Scientist & Software Architect 
Kitchology Inc. 
! 
Research Affiliate 
Norbert Wiener Center for Harmonic Analysis & Applications 
College of Computer, Mathematical & Natural Sciences 
University of Maryland at College Park 
* My remarks, presentation and prepared materials are my own, and do not represent the views of my employers.

More Related Content

What's hot

Bcp
BcpBcp
Scalable Data Analysis in R -- Lee Edlefsen
Scalable Data Analysis in R -- Lee EdlefsenScalable Data Analysis in R -- Lee Edlefsen
Scalable Data Analysis in R -- Lee Edlefsen
Revolution Analytics
 
Bcp
BcpBcp
Xxx test
Xxx testXxx test
Xxx test
Breach_P
 
Sparql
SparqlSparql
Unit iv xml
Unit iv xmlUnit iv xml
Unit iv xml
smitha273566
 
Information Retrieval - Data Science Bootcamp
Information Retrieval - Data Science BootcampInformation Retrieval - Data Science Bootcamp
Information Retrieval - Data Science Bootcamp
Kais Hassan, PhD
 
Xml basics
Xml basicsXml basics
Xml basics
Kumar
 
Introduction to XML
Introduction to XMLIntroduction to XML
Introduction to XML
Abhra Basak
 
การใช้ภาษา SQL
การใช้ภาษา SQLการใช้ภาษา SQL
การใช้ภาษา SQL
Kornnicha Wonglai
 
Libreoffice conference 2014 Easy hacks to improve writer - ooxml interoperabi...
Libreoffice conference 2014 Easy hacks to improve writer - ooxml interoperabi...Libreoffice conference 2014 Easy hacks to improve writer - ooxml interoperabi...
Libreoffice conference 2014 Easy hacks to improve writer - ooxml interoperabi...
Sushil Shinde
 
Unit3wt
Unit3wtUnit3wt
Unit3wt
vamsi krishna
 
XML Technologies
XML TechnologiesXML Technologies
XML Technologies
hamsa nandhini
 
Ch2 neworder
Ch2 neworderCh2 neworder
Ch2 neworder
davidlahr32
 
Dynamic Publishing with Arbortext Data Merge
Dynamic Publishing with Arbortext Data MergeDynamic Publishing with Arbortext Data Merge
Dynamic Publishing with Arbortext Data Merge
Clay Helberg
 
fundamentals of XML
fundamentals of XMLfundamentals of XML
fundamentals of XML
hamsa nandhini
 
XSLT presentation
XSLT presentationXSLT presentation
XSLT presentation
Miguel Angel Teheran Garcia
 
Twinkle: A SPARQL Query Tool
Twinkle: A SPARQL Query ToolTwinkle: A SPARQL Query Tool
Twinkle: A SPARQL Query Tool
Leigh Dodds
 
List and images in html
List and images in htmlList and images in html
List and images in html
prithvisawla
 
XMLT
XMLTXMLT

What's hot (20)

Bcp
BcpBcp
Bcp
 
Scalable Data Analysis in R -- Lee Edlefsen
Scalable Data Analysis in R -- Lee EdlefsenScalable Data Analysis in R -- Lee Edlefsen
Scalable Data Analysis in R -- Lee Edlefsen
 
Bcp
BcpBcp
Bcp
 
Xxx test
Xxx testXxx test
Xxx test
 
Sparql
SparqlSparql
Sparql
 
Unit iv xml
Unit iv xmlUnit iv xml
Unit iv xml
 
Information Retrieval - Data Science Bootcamp
Information Retrieval - Data Science BootcampInformation Retrieval - Data Science Bootcamp
Information Retrieval - Data Science Bootcamp
 
Xml basics
Xml basicsXml basics
Xml basics
 
Introduction to XML
Introduction to XMLIntroduction to XML
Introduction to XML
 
การใช้ภาษา SQL
การใช้ภาษา SQLการใช้ภาษา SQL
การใช้ภาษา SQL
 
Libreoffice conference 2014 Easy hacks to improve writer - ooxml interoperabi...
Libreoffice conference 2014 Easy hacks to improve writer - ooxml interoperabi...Libreoffice conference 2014 Easy hacks to improve writer - ooxml interoperabi...
Libreoffice conference 2014 Easy hacks to improve writer - ooxml interoperabi...
 
Unit3wt
Unit3wtUnit3wt
Unit3wt
 
XML Technologies
XML TechnologiesXML Technologies
XML Technologies
 
Ch2 neworder
Ch2 neworderCh2 neworder
Ch2 neworder
 
Dynamic Publishing with Arbortext Data Merge
Dynamic Publishing with Arbortext Data MergeDynamic Publishing with Arbortext Data Merge
Dynamic Publishing with Arbortext Data Merge
 
fundamentals of XML
fundamentals of XMLfundamentals of XML
fundamentals of XML
 
XSLT presentation
XSLT presentationXSLT presentation
XSLT presentation
 
Twinkle: A SPARQL Query Tool
Twinkle: A SPARQL Query ToolTwinkle: A SPARQL Query Tool
Twinkle: A SPARQL Query Tool
 
List and images in html
List and images in htmlList and images in html
List and images in html
 
XMLT
XMLTXMLT
XMLT
 

Similar to Manipulating Data in Style with SQL

Data Wrangling in SQL & Other Tools :: Data Wranglers DC :: June 4, 2014
Data Wrangling in SQL & Other Tools :: Data Wranglers DC :: June 4, 2014Data Wrangling in SQL & Other Tools :: Data Wranglers DC :: June 4, 2014
Data Wrangling in SQL & Other Tools :: Data Wranglers DC :: June 4, 2014
Ryan B Harvey, CSDP, CSM
 
U-SQL Meta Data Catalog (SQLBits 2016)
U-SQL Meta Data Catalog (SQLBits 2016)U-SQL Meta Data Catalog (SQLBits 2016)
U-SQL Meta Data Catalog (SQLBits 2016)
Michael Rys
 
3 CityNetConf - sql+c#=u-sql
3 CityNetConf - sql+c#=u-sql3 CityNetConf - sql+c#=u-sql
3 CityNetConf - sql+c#=u-sql
Łukasz Grala
 
Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...
Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...
Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...
Michael Rys
 
Taming the Data Science Monster with A New ‘Sword’ – U-SQL
Taming the Data Science Monster with A New ‘Sword’ – U-SQLTaming the Data Science Monster with A New ‘Sword’ – U-SQL
Taming the Data Science Monster with A New ‘Sword’ – U-SQL
Michael Rys
 
Oracle 11g developer on linux training in bangalore
Oracle 11g developer on linux training in bangaloreOracle 11g developer on linux training in bangalore
Oracle 11g developer on linux training in bangalore
Suvash Chowdary
 
Oracle 11g developer on linux training in bangalore
Oracle 11g developer on linux training in bangaloreOracle 11g developer on linux training in bangalore
Oracle 11g developer on linux training in bangalore
Suvash Chowdary
 
Introduction to Oracle
Introduction to OracleIntroduction to Oracle
Introduction to Oracle
Achmad Solichin
 
Introduction to Oracle
Introduction to OracleIntroduction to Oracle
Introduction to Oracle
Achmad Solichin
 
Intro
IntroIntro
MIS5101 WK10 Outcome Measures
MIS5101 WK10 Outcome MeasuresMIS5101 WK10 Outcome Measures
MIS5101 WK10 Outcome Measures
Steven Johnson
 
Sql tutorial
Sql tutorialSql tutorial
Sql tutorial
Rumman Ansari
 
Database Programming
Database ProgrammingDatabase Programming
Database Programming
Henry Osborne
 
The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)
The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)
The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)
Michael Rys
 
Inside SQL Server In-Memory OLTP
Inside SQL Server In-Memory OLTPInside SQL Server In-Memory OLTP
Inside SQL Server In-Memory OLTP
Bob Ward
 
The openCypher Project - An Open Graph Query Language
The openCypher Project - An Open Graph Query LanguageThe openCypher Project - An Open Graph Query Language
The openCypher Project - An Open Graph Query Language
Neo4j
 
BTEC- HND In Computing-Creating Table_Week6.pptx
BTEC- HND In Computing-Creating Table_Week6.pptxBTEC- HND In Computing-Creating Table_Week6.pptx
BTEC- HND In Computing-Creating Table_Week6.pptx
TTKCreation
 
How to build a data dictionary
How to build a data dictionaryHow to build a data dictionary
How to build a data dictionary
Piotr Kononow
 
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
Michael Rys
 
SQL Complete Tutorial. All Topics Covered
SQL Complete Tutorial. All Topics CoveredSQL Complete Tutorial. All Topics Covered
SQL Complete Tutorial. All Topics Covered
Danish Mehraj
 

Similar to Manipulating Data in Style with SQL (20)

Data Wrangling in SQL & Other Tools :: Data Wranglers DC :: June 4, 2014
Data Wrangling in SQL & Other Tools :: Data Wranglers DC :: June 4, 2014Data Wrangling in SQL & Other Tools :: Data Wranglers DC :: June 4, 2014
Data Wrangling in SQL & Other Tools :: Data Wranglers DC :: June 4, 2014
 
U-SQL Meta Data Catalog (SQLBits 2016)
U-SQL Meta Data Catalog (SQLBits 2016)U-SQL Meta Data Catalog (SQLBits 2016)
U-SQL Meta Data Catalog (SQLBits 2016)
 
3 CityNetConf - sql+c#=u-sql
3 CityNetConf - sql+c#=u-sql3 CityNetConf - sql+c#=u-sql
3 CityNetConf - sql+c#=u-sql
 
Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...
Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...
Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...
 
Taming the Data Science Monster with A New ‘Sword’ – U-SQL
Taming the Data Science Monster with A New ‘Sword’ – U-SQLTaming the Data Science Monster with A New ‘Sword’ – U-SQL
Taming the Data Science Monster with A New ‘Sword’ – U-SQL
 
Oracle 11g developer on linux training in bangalore
Oracle 11g developer on linux training in bangaloreOracle 11g developer on linux training in bangalore
Oracle 11g developer on linux training in bangalore
 
Oracle 11g developer on linux training in bangalore
Oracle 11g developer on linux training in bangaloreOracle 11g developer on linux training in bangalore
Oracle 11g developer on linux training in bangalore
 
Introduction to Oracle
Introduction to OracleIntroduction to Oracle
Introduction to Oracle
 
Introduction to Oracle
Introduction to OracleIntroduction to Oracle
Introduction to Oracle
 
Intro
IntroIntro
Intro
 
MIS5101 WK10 Outcome Measures
MIS5101 WK10 Outcome MeasuresMIS5101 WK10 Outcome Measures
MIS5101 WK10 Outcome Measures
 
Sql tutorial
Sql tutorialSql tutorial
Sql tutorial
 
Database Programming
Database ProgrammingDatabase Programming
Database Programming
 
The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)
The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)
The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)
 
Inside SQL Server In-Memory OLTP
Inside SQL Server In-Memory OLTPInside SQL Server In-Memory OLTP
Inside SQL Server In-Memory OLTP
 
The openCypher Project - An Open Graph Query Language
The openCypher Project - An Open Graph Query LanguageThe openCypher Project - An Open Graph Query Language
The openCypher Project - An Open Graph Query Language
 
BTEC- HND In Computing-Creating Table_Week6.pptx
BTEC- HND In Computing-Creating Table_Week6.pptxBTEC- HND In Computing-Creating Table_Week6.pptx
BTEC- HND In Computing-Creating Table_Week6.pptx
 
How to build a data dictionary
How to build a data dictionaryHow to build a data dictionary
How to build a data dictionary
 
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
 
SQL Complete Tutorial. All Topics Covered
SQL Complete Tutorial. All Topics CoveredSQL Complete Tutorial. All Topics Covered
SQL Complete Tutorial. All Topics Covered
 

Recently uploaded

Kubernetes at Scale: Going Multi-Cluster with Istio
Kubernetes at Scale:  Going Multi-Cluster  with IstioKubernetes at Scale:  Going Multi-Cluster  with Istio
Kubernetes at Scale: Going Multi-Cluster with Istio
Severalnines
 
Alluxio Webinar | 10x Faster Trino Queries on Your Data Platform
Alluxio Webinar | 10x Faster Trino Queries on Your Data PlatformAlluxio Webinar | 10x Faster Trino Queries on Your Data Platform
Alluxio Webinar | 10x Faster Trino Queries on Your Data Platform
Alluxio, Inc.
 
14 th Edition of International conference on computer vision
14 th Edition of International conference on computer vision14 th Edition of International conference on computer vision
14 th Edition of International conference on computer vision
ShulagnaSarkar2
 
Malibou Pitch Deck For Its €3M Seed Round
Malibou Pitch Deck For Its €3M Seed RoundMalibou Pitch Deck For Its €3M Seed Round
Malibou Pitch Deck For Its €3M Seed Round
sjcobrien
 
Assure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyesAssure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
美洲杯赔率投注网【​网址​🎉3977·EE​🎉】
美洲杯赔率投注网【​网址​🎉3977·EE​🎉】美洲杯赔率投注网【​网址​🎉3977·EE​🎉】
美洲杯赔率投注网【​网址​🎉3977·EE​🎉】
widenerjobeyrl638
 
Safelyio Toolbox Talk Softwate & App (How To Digitize Safety Meetings)
Safelyio Toolbox Talk Softwate & App (How To Digitize Safety Meetings)Safelyio Toolbox Talk Softwate & App (How To Digitize Safety Meetings)
Safelyio Toolbox Talk Softwate & App (How To Digitize Safety Meetings)
safelyiotech
 
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
Bert Jan Schrijver
 
How Can Hiring A Mobile App Development Company Help Your Business Grow?
How Can Hiring A Mobile App Development Company Help Your Business Grow?How Can Hiring A Mobile App Development Company Help Your Business Grow?
How Can Hiring A Mobile App Development Company Help Your Business Grow?
ToXSL Technologies
 
Orca: Nocode Graphical Editor for Container Orchestration
Orca: Nocode Graphical Editor for Container OrchestrationOrca: Nocode Graphical Editor for Container Orchestration
Orca: Nocode Graphical Editor for Container Orchestration
Pedro J. Molina
 
Superpower Your Apache Kafka Applications Development with Complementary Open...
Superpower Your Apache Kafka Applications Development with Complementary Open...Superpower Your Apache Kafka Applications Development with Complementary Open...
Superpower Your Apache Kafka Applications Development with Complementary Open...
Paul Brebner
 
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt
Enhanced Screen Flows UI/UX using SLDS with Tom KittEnhanced Screen Flows UI/UX using SLDS with Tom Kitt
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt
Peter Caitens
 
The Comprehensive Guide to Validating Audio-Visual Performances.pdf
The Comprehensive Guide to Validating Audio-Visual Performances.pdfThe Comprehensive Guide to Validating Audio-Visual Performances.pdf
The Comprehensive Guide to Validating Audio-Visual Performances.pdf
kalichargn70th171
 
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
The Third Creative Media
 
Photoshop Tutorial for Beginners (2024 Edition)
Photoshop Tutorial for Beginners (2024 Edition)Photoshop Tutorial for Beginners (2024 Edition)
Photoshop Tutorial for Beginners (2024 Edition)
alowpalsadig
 
Operational ease MuleSoft and Salesforce Service Cloud Solution v1.0.pptx
Operational ease MuleSoft and Salesforce Service Cloud Solution v1.0.pptxOperational ease MuleSoft and Salesforce Service Cloud Solution v1.0.pptx
Operational ease MuleSoft and Salesforce Service Cloud Solution v1.0.pptx
sandeepmenon62
 
Migration From CH 1.0 to CH 2.0 and Mule 4.6 & Java 17 Upgrade.pptx
Migration From CH 1.0 to CH 2.0 and  Mule 4.6 & Java 17 Upgrade.pptxMigration From CH 1.0 to CH 2.0 and  Mule 4.6 & Java 17 Upgrade.pptx
Migration From CH 1.0 to CH 2.0 and Mule 4.6 & Java 17 Upgrade.pptx
ervikas4
 
All you need to know about Spring Boot and GraalVM
All you need to know about Spring Boot and GraalVMAll you need to know about Spring Boot and GraalVM
All you need to know about Spring Boot and GraalVM
Alina Yurenko
 
Manyata Tech Park Bangalore_ Infrastructure, Facilities and More
Manyata Tech Park Bangalore_ Infrastructure, Facilities and MoreManyata Tech Park Bangalore_ Infrastructure, Facilities and More
Manyata Tech Park Bangalore_ Infrastructure, Facilities and More
narinav14
 
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSISDECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
Tier1 app
 

Recently uploaded (20)

Kubernetes at Scale: Going Multi-Cluster with Istio
Kubernetes at Scale:  Going Multi-Cluster  with IstioKubernetes at Scale:  Going Multi-Cluster  with Istio
Kubernetes at Scale: Going Multi-Cluster with Istio
 
Alluxio Webinar | 10x Faster Trino Queries on Your Data Platform
Alluxio Webinar | 10x Faster Trino Queries on Your Data PlatformAlluxio Webinar | 10x Faster Trino Queries on Your Data Platform
Alluxio Webinar | 10x Faster Trino Queries on Your Data Platform
 
14 th Edition of International conference on computer vision
14 th Edition of International conference on computer vision14 th Edition of International conference on computer vision
14 th Edition of International conference on computer vision
 
Malibou Pitch Deck For Its €3M Seed Round
Malibou Pitch Deck For Its €3M Seed RoundMalibou Pitch Deck For Its €3M Seed Round
Malibou Pitch Deck For Its €3M Seed Round
 
Assure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyesAssure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyes
 
美洲杯赔率投注网【​网址​🎉3977·EE​🎉】
美洲杯赔率投注网【​网址​🎉3977·EE​🎉】美洲杯赔率投注网【​网址​🎉3977·EE​🎉】
美洲杯赔率投注网【​网址​🎉3977·EE​🎉】
 
Safelyio Toolbox Talk Softwate & App (How To Digitize Safety Meetings)
Safelyio Toolbox Talk Softwate & App (How To Digitize Safety Meetings)Safelyio Toolbox Talk Softwate & App (How To Digitize Safety Meetings)
Safelyio Toolbox Talk Softwate & App (How To Digitize Safety Meetings)
 
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
 
How Can Hiring A Mobile App Development Company Help Your Business Grow?
How Can Hiring A Mobile App Development Company Help Your Business Grow?How Can Hiring A Mobile App Development Company Help Your Business Grow?
How Can Hiring A Mobile App Development Company Help Your Business Grow?
 
Orca: Nocode Graphical Editor for Container Orchestration
Orca: Nocode Graphical Editor for Container OrchestrationOrca: Nocode Graphical Editor for Container Orchestration
Orca: Nocode Graphical Editor for Container Orchestration
 
Superpower Your Apache Kafka Applications Development with Complementary Open...
Superpower Your Apache Kafka Applications Development with Complementary Open...Superpower Your Apache Kafka Applications Development with Complementary Open...
Superpower Your Apache Kafka Applications Development with Complementary Open...
 
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt
Enhanced Screen Flows UI/UX using SLDS with Tom KittEnhanced Screen Flows UI/UX using SLDS with Tom Kitt
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt
 
The Comprehensive Guide to Validating Audio-Visual Performances.pdf
The Comprehensive Guide to Validating Audio-Visual Performances.pdfThe Comprehensive Guide to Validating Audio-Visual Performances.pdf
The Comprehensive Guide to Validating Audio-Visual Performances.pdf
 
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
 
Photoshop Tutorial for Beginners (2024 Edition)
Photoshop Tutorial for Beginners (2024 Edition)Photoshop Tutorial for Beginners (2024 Edition)
Photoshop Tutorial for Beginners (2024 Edition)
 
Operational ease MuleSoft and Salesforce Service Cloud Solution v1.0.pptx
Operational ease MuleSoft and Salesforce Service Cloud Solution v1.0.pptxOperational ease MuleSoft and Salesforce Service Cloud Solution v1.0.pptx
Operational ease MuleSoft and Salesforce Service Cloud Solution v1.0.pptx
 
Migration From CH 1.0 to CH 2.0 and Mule 4.6 & Java 17 Upgrade.pptx
Migration From CH 1.0 to CH 2.0 and  Mule 4.6 & Java 17 Upgrade.pptxMigration From CH 1.0 to CH 2.0 and  Mule 4.6 & Java 17 Upgrade.pptx
Migration From CH 1.0 to CH 2.0 and Mule 4.6 & Java 17 Upgrade.pptx
 
All you need to know about Spring Boot and GraalVM
All you need to know about Spring Boot and GraalVMAll you need to know about Spring Boot and GraalVM
All you need to know about Spring Boot and GraalVM
 
Manyata Tech Park Bangalore_ Infrastructure, Facilities and More
Manyata Tech Park Bangalore_ Infrastructure, Facilities and MoreManyata Tech Park Bangalore_ Infrastructure, Facilities and More
Manyata Tech Park Bangalore_ Infrastructure, Facilities and More
 
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSISDECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
 

Manipulating Data in Style with SQL

  • 1. P P D Manipulating Data C with Style in SQL An introduction to SQL, the interface language to most of the world’s structured data, and practices for readable and reusable SQL code Ryan B. Harvey ! October 14, 2014
  • 2. P P D Relational Data C •Relational data is organized in tables consisting of columns and rows •Fields (columns) consist of a column name and data type constraint •Records (rows) in a table have a common field (column) structure and order •Records (rows) are linked across tables by key fields Relational Data Model: Codd, Edgar F. “A Relational Model of Data for Large Shared Data Banks” (1970)
  • 3. P P D Intro to SQL C •SQL (“Structured Query Language”) is a declarative data definition and query language for relational data •SQL is an ISO/IEC standard with many implementations in common database management systems (a few below) Structured Query Language: ISO/IEC 9075 (standard), first appeared 1974, current version SQL:2011
  • 4. P P D C Which database system should I use? 1. Use the one your data is in 2. Unless you need specific things (performance, functions, etc.), use the one you know best 3. If you need other stuff or you’ve never used a database before: A. SQLite: FOSS, one file db, easy/limited B. PostgreSQL: FOSS, Enterprise-ready The above are my opinions based on experience. Others may disagree, and that’s OK.
  • 5. P P D SQL: Working with Objects C feature comparison •Data Definition Language (DB Objects) •CREATE (table, index, view, function, …) •ALTER (table, index, view, function, …) •DROP (table, index, view, function, …)
  • 6. P P D SQL: Working with Rows C feature comparison •Data Manipulation Language (Records) aka Query Language •SELECT … FROM … •INSERT INTO … •UPDATE … SET … •DELETE FROM …
  • 7. P P D SQL: SELECT Statement C •SELECT <col_list> FROM <table> … •Merging/Column Binding: JOIN clause •Row binding: UNION clause •Filtering: WHERE clause •Aggregation: GROUP BY clause •Aggregated filtering: HAVING clause •Sorting: ORDER BY clause feature comparison
  • 8. P P D C Intro to Relational Algebra •Basic operators SELECT WHERE, HAVING PROJECT <COL_LIST> RENAME AS ! •Join operators: inner/outer, cartesian •Set operators: union, intersect, set minus, and, or, etc. •SELECT name, id FROM t1 WHERE id<3 AND dob<DATE ‘2004-01-01’ (ΠNAME,ID σID<3 ∧ DOB<(1/1/2004) T1) For a very detailed Intro to Relational Algebra, see lecture notes from 2005 databases course, IT U of Copenhagen
  • 9. P P D SQL Beginner Resources C •Basic SQL Commands Reference: http://www.cs.utexas.edu/~mitra/ csFall2013/cs329/lectures/sql.html
  • 10. P P D C SQL: Common Table Expressions (CTEs) •WITH <name> [(<col_list>)] AS (SELECT …) •SELECT <col_list> FROM <table or CTE> … •Merging/Column Binding: JOIN clause •Row binding: UNION clause •Filtering: WHERE clause •Aggregation: GROUP BY clause •Aggregated filtering: HAVING clause •Sorting: ORDER BY clause Same as before! feature comparison
  • 11. P P D SQL: Views from SELECTs C •CREATE VIEW <name> AS … •SELECT <col_list> FROM <table> … •Merging/Column Binding: JOIN clause •Row binding: UNION clause •Filtering: WHERE clause •Aggregation: GROUP BY clause •Aggregated filtering: HAVING clause •Sorting: ORDER BY clause feature comparison
  • 12. P P D SQL: Functions from Views C •CREATE FUNCTION <name> (<params>) AS … •SELECT … <params> … •Merging/Column Binding: JOIN clause •Row binding: UNION clause •Filtering: WHERE clause •Aggregation: GROUP BY clause •Aggregated filtering: HAVING clause •Sorting: ORDER BY clause feature comparison
  • 13. P P D SQL: Tuning with EXPLAIN C •EXPLAIN <options> SELECT … Same •rows scanned: COST option as •wordy response: VERBOSE option before! •output formatting: FORMAT option •actually run it: ANALYZE option •runtime (only with ANALYZE): TIMING option •(EXPLAIN is not part of the SQL standard, but most implementations support it)
  • 14. P P D SQL: Tuning using Indexes C •CREATE INDEX <name> ON <table> (<col_list|expression>) … •UNIQUE indices for key fields •Use functions in expressions: What’s in your WHERE clause? feature comparison LOWER(<text_col>), INT(<num_col>) •Specify ordering (ASC, DESC, NULLS FIRST, etc.) and method (BTREE, HASH, GIST, etc.) •Partial indexes via WHERE clause
  • 15. P P D C SQL in other languages (or, accessing data in databases via sql in other languages) •R with libraries •RPostgreSQL, dplyr ! •Python with modules •psycopg2, SQLAlchemy
  • 16. P P D C SQL in other languages (or, operating on other languages’ data structures via sql) •R with libraries •RSQLite, sqldf ! •Python with modules •Pandas, PandaSQL Mostly, Data Frames.
  • 17. P P D C Slides and code are available on GitHub at nihonjinrxs/polyglot-october2014!
  • 18. P P D C Ryan B. Harvey http://datascientist.guru ryan.b.harvey@gmail.com @nihonjinrxs +ryan.b.harvey Employment & Affiliations* IT Project Manager Office of Management and Budget Executive Office of the President ! Thank you! ! Questions? Data Scientist & Software Architect Kitchology Inc. ! Research Affiliate Norbert Wiener Center for Harmonic Analysis & Applications College of Computer, Mathematical & Natural Sciences University of Maryland at College Park * My remarks, presentation and prepared materials are my own, and do not represent the views of my employers.