This document discusses different types of joins in Greenplum including inner joins, left outer joins, right outer joins, full outer joins, and cross joins. It provides syntax examples and explanations of when each join type would be used. It also covers PostgreSQL database concepts like schemas, roles, users, privileges, and table constraints.
4. Join Types
Inner Join
Left Outer Join
Right Outer Join
Full Outer Join
Cross Join
Greenplum Confidential
5. Inner Join
The syntax for an INNER JOIN is simple!
Also known as a SIMPLE JOIN or EQUI-JOIN.
SELECT
EMP.EMPLOYEE_ID
, EMP.EMPLOYEE_NAME
, DPT.DEPT_NAME
FROM
EMPLOYEES EMP
, DEPARTMENTS DPT
WHERE
EMP.DPT_ID = DPT.DPT_ID;
This query only retrieves rows based on the following join condition:
a corresponding row must exist in each table,
thus the term "INNER JOIN".
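The matching rule above can be checked with a small runnable sketch. It uses Python's built-in sqlite3 module as a stand-in for Greenplum (an assumption made only so the example runs anywhere), and the sample rows are hypothetical.

```python
import sqlite3

# SQLite stands in for Greenplum here; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (dpt_id INTEGER, dept_name TEXT);
    CREATE TABLE employees (employee_id INTEGER, employee_name TEXT, dpt_id INTEGER);
    INSERT INTO departments VALUES (10, 'Sales'), (20, 'Support');
    INSERT INTO employees VALUES (1, 'Ann', 10), (2, 'Bob', 20), (3, 'Cy', 99);
""")

# Same shape as the slide's query: comma-separated tables plus a WHERE join condition.
inner_rows = conn.execute("""
    SELECT emp.employee_id, emp.employee_name, dpt.dept_name
    FROM employees emp, departments dpt
    WHERE emp.dpt_id = dpt.dpt_id
    ORDER BY emp.employee_id
""").fetchall()

# Employee 3 points at department 99, which does not exist, so that row is dropped.
print(inner_rows)
```

Only employees with a corresponding department row survive the join.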
6. Left OUTER Join
LEFT OUTER JOIN returns all rows from the "left" table, and only those rows from
the "right" table that match the left one.
This is handy when some rows in the "left" table may have no match in the "right"
table, but you still want to keep them and get a description or other information
from the "right" table where it exists. Unmatched rows get NULL in the columns
of the "right" table.
SELECT
t.transid
, c.custname
FROM
facts.transaction t
LEFT OUTER JOIN dimensions.customer c
ON c.customerid = t.customerid;
You may also abbreviate LEFT OUTER JOIN to LEFT JOIN.
…
FROM
facts.transaction t
LEFT JOIN dimensions.customer c
ON c.customerid = t.customerid;
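A minimal sketch of the LEFT OUTER JOIN behavior, again assuming Python's built-in sqlite3 as a stand-in for Greenplum. The table names fact_transaction and dim_customer are hypothetical substitutes for facts.transaction and dimensions.customer, since SQLite has no schemas of that kind.

```python
import sqlite3

# Hypothetical tables and rows; SQLite stands in for Greenplum.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_customer (customerid INTEGER, custname TEXT);
    CREATE TABLE fact_transaction (transid INTEGER, customerid INTEGER);
    INSERT INTO dim_customer VALUES (1, 'Acme');
    INSERT INTO fact_transaction VALUES (100, 1), (101, 2);
""")

left_rows = conn.execute("""
    SELECT t.transid, c.custname
    FROM fact_transaction t
    LEFT OUTER JOIN dim_customer c
      ON c.customerid = t.customerid
    ORDER BY t.transid
""").fetchall()

# Transaction 101 has no matching customer, so custname comes back as NULL (None).
print(left_rows)
```

Every transaction is kept; the customer name is simply empty where no match exists.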
7. Right OUTER Join
RIGHT OUTER JOIN is the inverse of the LEFT OUTER JOIN.
There is no difference in the functionality of these two. Just be careful when
designating the left and right tables!
SELECT
t.transid
, c.custname
FROM
dimensions.customer c
RIGHT OUTER JOIN facts.transaction t
ON c.customerid = t.customerid;
You may also abbreviate RIGHT OUTER JOIN to RIGHT JOIN.
…
FROM dimensions.customer c
RIGHT JOIN facts.transaction t
ON c.customerid = t.customerid;
8. Full OUTER Join
This interesting join is used when you want to retrieve every row, whether there is
a match or not based on the join criteria! When would you use this? An example
is detecting "true changes" for a table.
SELECT
COALESCE( ORIG.transid, STG.transid) AS transid
, CASE
WHEN ORIG.transid IS NULL AND STG.transid IS NOT NULL THEN 'INSERT'
WHEN ORIG.transid IS NOT NULL AND STG.transid IS NOT NULL THEN 'UPDATE'
WHEN ORIG.transid IS NOT NULL AND STG.transid IS NULL THEN 'DELETE'
ELSE 'UNKNOWN'
END AS Process_type
FROM
facts.transaction ORIG
FULL OUTER JOIN staging.transaction_w STG
ON STG.transid = ORIG.transid;
The COALESCE function ensures that we have a transaction id
to work with in subsequent processing.
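The classification performed by the CASE expression can be sketched in plain Python. This is an illustration of the logic only, not of how Greenplum executes the join; the id sets and the classify helper are hypothetical.

```python
# Hypothetical transaction ids present in the original and staging tables.
orig_ids = {1, 2, 3}
stg_ids = {2, 3, 4}


def classify(transid, orig, stg):
    """Mirror the CASE expression: look at which side of the join has the key."""
    in_orig, in_stg = transid in orig, transid in stg
    if not in_orig and in_stg:
        return "INSERT"
    if in_orig and in_stg:
        return "UPDATE"
    if in_orig and not in_stg:
        return "DELETE"
    return "UNKNOWN"


# The union of both key sets plays the role of FULL OUTER JOIN plus COALESCE:
# every key appears once, whichever side it came from.
process_types = {tid: classify(tid, orig_ids, stg_ids) for tid in sorted(orig_ids | stg_ids)}
print(process_types)
```

Keys found only in staging are inserts, keys found only in the original are deletes, and keys found on both sides are update candidates.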
9. CROSS JOIN (Cartesian Product)
These joins are usually created unintentionally!
• Every row in the "left" table is joined with every row in the "right"
table!
• So, if you have a billion row table and CROSS JOIN it to a 100 row
table, your answer set will have 100 billion rows!
• There are uses for deliberate CROSS JOINs; they will be shown later.
SELECT
t.transid
, t.transdate
, COALESCE( c.custname, 'Unknown Customer') AS custname
FROM
facts.transaction t
, dimensions.customer c;
Q: What is wrong with this query?
Why will it perform a CROSS JOIN?
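The row multiplication can be observed directly with a small sketch, assuming Python's built-in sqlite3 as a stand-in for Greenplum and hypothetical tables fact_transaction and dim_customer. It compares the row count of a comma-separated FROM list with and without a join condition.

```python
import sqlite3

# Hypothetical tables and rows; SQLite stands in for Greenplum.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fact_transaction (transid INTEGER, customerid INTEGER);
    CREATE TABLE dim_customer (customerid INTEGER, custname TEXT);
    INSERT INTO fact_transaction VALUES (100, 1), (101, 2), (102, 1);
    INSERT INTO dim_customer VALUES (1, 'Acme'), (2, 'Globex');
""")

# No join condition: every transaction is paired with every customer.
cross_count = conn.execute(
    "SELECT COUNT(*) FROM fact_transaction t, dim_customer c"
).fetchone()[0]

# With a join condition, each transaction pairs only with its own customer.
joined_count = conn.execute(
    "SELECT COUNT(*) FROM fact_transaction t, dim_customer c "
    "WHERE c.customerid = t.customerid"
).fetchone()[0]

print(cross_count, joined_count)
```

With 3 transactions and 2 customers, the unconditioned query yields 3 x 2 rows, while the conditioned one yields one row per transaction.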
11. PostgreSQL Schemas – the Search Path
PostgreSQL will search for tables that are named without
qualification (schema name not included) by use of the
search_path variable.
You can change the search_path parameter using the SET
command:
SET search_path TO myschema,public;
SHOW search_path;
search_path
--------------
myschema,public
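The lookup order implied by search_path can be modeled in a few lines of Python. This is a simplified sketch of the idea (first schema in the path that contains the name wins), not PostgreSQL's actual resolver; resolve_table and the schema contents are hypothetical.

```python
def resolve_table(name, search_path, schemas):
    """Return the first schema on the search path that contains the table, or None."""
    for schema in search_path:
        if name in schemas.get(schema, set()):
            return schema
    return None


# Hypothetical catalog: "orders" exists in both schemas, "customers" only in public.
schemas = {"myschema": {"orders"}, "public": {"orders", "customers"}}
search_path = ["myschema", "public"]

resolved_orders = resolve_table("orders", search_path, schemas)
resolved_customers = resolve_table("customers", search_path, schemas)
resolved_missing = resolve_table("no_such_table", search_path, schemas)
print(resolved_orders, resolved_customers, resolved_missing)
```

An unqualified name that exists in several schemas resolves to the one listed earliest in the path, which is why the order given to SET search_path matters.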
12. PostgreSQL – Roles
• Database roles are conceptually completely separate from
operating system users.
• Database roles are global across all databases on the same
cluster installation.
• A database user is a role with the LOGIN privilege.
• It is frequently convenient to group users together to ease
management of privileges. That way, privileges can be granted
to, or revoked from, one group instead of many users separately.
13. PostgreSQL – Roles and Privileges
• When an object is created, it is assigned an owner. The default
owner is the role that executed the creation statement.
• For most kinds of objects, the initial state is that only the owner (or
a superuser) can do anything with it.
• To allow other roles to use it, privileges must be granted.
• There are several different kinds of privileges:
SELECT    Read-only access; cannot create or modify data
INSERT    Permission to add rows to the object
UPDATE    Permission to make changes to existing data
DELETE    Permission to remove data from the object
CREATE    Permission to create objects
CONNECT   Permission to connect (applies to a database)
EXECUTE   Permission to execute functions / procedures
USAGE     Permission to use objects (applies to a schema)
14. PostgreSQL – Users
• In PostgreSQL a user is a role with login privileges.
• Privileges on database objects should be granted to a group role
rather than a user role.
• Users may have special (system) privileges not associated with
database objects.
SUPERUSER
PASSWORD
CREATEDB
CREATEROLE
Tip: It is good practice to create a role that has the CREATEDB and CREATEROLE
privileges, but is not a superuser, and then use this role for all routine management
of databases and roles. This approach avoids the dangers of operating as a
superuser for tasks that do not really require it.
15. PostgreSQL Table Basics - Constraints
• PRIMARY KEY constraint ensures row uniqueness.
A primary key indicates that a column or group of columns can be used as
a unique identifier for rows in the table.
Technically, a primary key constraint is simply a combination of a UNIQUE
constraint and a NOT NULL constraint.
• FOREIGN KEY constraint specifies that values in a column (or a
group of columns) of a given table must match the values
appearing in another table.
NOTE: NOT SUPPORTED IN GREENPLUM
16. PostgreSQL DDL – CREATE TABLE
• Supports standard ANSI CREATE TABLE syntax:
CREATE TABLE dimensions.store
(
storeId SMALLINT NOT NULL,
storeName VARCHAR(40) NOT NULL,
address VARCHAR(50) NOT NULL,
city VARCHAR(40) NOT NULL,
state CHAR(2) NOT NULL,
zipcode CHAR(8) NOT NULL
);
17. PostgreSQL DDL – MODIFY TABLE
• You can modify certain attributes of a table:
Rename columns
Rename tables
Add/Remove columns
- ALTER TABLE product ADD COLUMN description text;
Add/Remove constraints
- ALTER TABLE product ALTER COLUMN prod_no SET NOT NULL;
Change default values
- ALTER TABLE product ALTER COLUMN prod_no SET DEFAULT -999;
Change column data types
- ALTER TABLE product ALTER COLUMN price TYPE numeric(10,2);
18. PostgreSQL DDL – DROP TABLE
• A table can be removed from the database with the DROP
TABLE statement:
DROP TABLE dimensions.customer_old;
• If the table you are dropping does not exist, you can
avoid error messages by adding "IF EXISTS" to the
DROP TABLE statement:
DROP TABLE IF EXISTS dimensions.customer_old;
19. PostgreSQL Column Basics - Constraints
• A column can be assigned a DEFAULT value.
• CHECK constraint is the most generic constraint type. It allows
you to specify that the value in a certain column must satisfy a
Boolean expression. EXAMPLE: The column price must be greater than zero:
price numeric CHECK (price > 0),
• NOT NULL constraint simply specifies that a column must not
assume the null value. EXAMPLE: The column transactionid may not be NULL:
transactionid integer NOT NULL,
• UNIQUE constraints ensure that the data contained in a column or
a group of columns is unique with respect to all the rows in the
table.
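The effect of these column constraints can be seen in a short sketch, assuming Python's built-in sqlite3 as a stand-in for Greenplum. The product table, its rows, and the try_insert helper are hypothetical; error messages and types differ in PostgreSQL, but the accept/reject outcome is the point here.

```python
import sqlite3

# Hypothetical table combining NOT NULL, UNIQUE and CHECK; SQLite stands in for Greenplum.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE product (
        prod_no INTEGER NOT NULL UNIQUE,
        price   NUMERIC CHECK (price > 0)
    )
""")
conn.execute("INSERT INTO product VALUES (1, 9.99)")


def try_insert(prod_no, price):
    """Attempt an insert and report whether a constraint rejected it."""
    try:
        conn.execute("INSERT INTO product VALUES (?, ?)", (prod_no, price))
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


results = {
    "negative price": try_insert(2, -5),    # violates CHECK (price > 0)
    "null prod_no": try_insert(None, 5),    # violates NOT NULL
    "duplicate prod_no": try_insert(1, 5),  # violates UNIQUE
    "valid row": try_insert(3, 5),
}
print(results)
```

Each violating row is refused by the database itself, so the rules hold regardless of which application writes to the table.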
20. PostgreSQL Indexes
• PostgreSQL provides several index types:
B-tree
Hash
GiST (Generalized Search Tree)
GIN (Generalized Inverted Index)
• Each index type uses a different algorithm that is best suited
to different types of queries.
• By default, the CREATE INDEX command creates a B-tree index.
21. Constants in PostgreSQL
• There are three kinds of implicitly-typed constants in
PostgreSQL:
Strings
Bit strings
Numbers
• Constants can also be specified with explicit types, which can
enable more accurate representation and more efficient
handling by the system.
22. Casting Data Types in PostgreSQL
• A constant of an arbitrary type can be entered using any one of the
following explicit notations:
type 'string'               REAL '2.117902'
'string'::type              '2.117902'::REAL
CAST ( 'string' AS type )   CAST('2.117902' AS REAL)
The result is a constant of the indicated type.
• If there is no ambiguity, the explicit type cast can be omitted (for
example, when the constant is assigned directly to a table column), in
which case it is automatically converted to the column type.
• The string constant can be written using either regular SQL notation or
dollar-quoting.
23. PostgreSQL Comments
• Standard SQL Comments
-- This is a comment
• “C” style Comments
/* This is a comment block
that ends with the */
27. PostgreSQL Functions & Operators
• PostgreSQL has a wide range of built-in functions and
operators for all data types.
• Users can define custom operators and functions.
Use caution: new operators or functions may adversely affect
Greenplum MPP functionality.
28. Built-In Functions (select function)
CURRENT_DATE returns the current system date
Example: 2006-11-06
CURRENT_TIME returns the current system time
Example: 16:50:54
CURRENT_TIMESTAMP returns the current system date and time
Example: 2008-01-06 16:51:44.430000+00:00
LOCALTIME returns the current time in the session time zone, without a
time zone offset
Example: 19:50:54
LOCALTIMESTAMP returns the current date and time in the session time
zone, without a time zone offset
Example: 2008-01-06 19:51:44.430000
CURRENT_ROLE or USER returns the current database user
Example: jdoe
29. PostgreSQL Comparison Operators
The usual comparison operators are available:
Operator               Description
=                      Equal to
!= or <>               Not equal to
>                      Greater than
>=                     Greater than or equal to
<                      Less than
<=                     Less than or equal to
x BETWEEN y AND z      Shorthand for x >= y AND x <= z
x IS NULL              True if x has no value
'abc' LIKE '%abcde%'   Pattern matching
POSIX comparisons      POSIX pattern matching
31. PostgreSQL Mathematical Functions
Handy Mathematical functions:
Function           Returns         Description                                 Example         Result
Abs                Same as input   Absolute value                              Abs(-998.2)     998.2
Ceiling(numeric)   Numeric         Smallest integer not less than argument     Ceiling(48.2)   49
Floor(numeric)     Numeric         Largest integer not greater than argument   Floor(48.2)     48
Pi()               Numeric         The π constant                              Pi()            3.14159…
Random()           Numeric         Random value between 0.0 and 1.0            Random()        0.87663
Round()            Numeric         Round to nearest integer                    Round(22.7)     23
32. PostgreSQL String Functions
Commonly used string functions:
Function                                    Returns   Description                                 Example                            Result
String || String                            Text      String concatenation                        'my' || 'my'                       'mymy'
Char_length(string)                         Integer   Number of chars in string                   Char_length('mymy')                4
Position(string in string)                  Integer   Location of specified substring             Position('my' in 'ohmy')           3
Lower(string)                               Text      Converts to lower case                      Lower('MYMY')                      'mymy'
Upper(string)                               Text      Converts to upper case                      Upper('mymy')                      'MYMY'
Substring(string from n for n)              Text      Displays portion of string                  Substring('myohmy' from 3 for 2)   'oh'
Trim(both|leading|trailing from string)     Text      Remove leading and/or trailing characters   Trim(' mymy ')                     'mymy'
33. PostgreSQL String Functions
Handy string functions:
Function                                    Returns   Description               Example                             Result
Initcap(string)                             Text      Changes case              Initcap('my my')                    'My My'
Length(string)                              Integer   Returns string length     Length('mymy')                      4
Split_part(string, delimiter, occurrence)   Text      Separates delimited list  Split_part('one|two|three','|',2)   'two'
34. PostgreSQL Date Functions
Commonly used date functions:
Function                         Returns     Description                                    Example                                                                                       Result
Age(timestamp, timestamp)        Interval    Difference in years, months and days           Age(current_timestamp, timestamp '2008-08-12')                                                0 years 1 month 11 days
Extract(field from timestamp)    Integer     Returns year, month, day, hour, minute         Extract(day from current_date)                                                                11
                                             or second
Now()                            Timestamp   Returns current date & time                    Now()                                                                                         2008-09-22 11:00:01
Overlaps                         Boolean     Simplifies comparing date ranges               WHERE (DATE '2008-01-01', DATE '2008-02-11') overlaps (DATE '2008-02-01', DATE '2008-09-11')   TRUE
36. Set Operations – EXCEPT – Example
Returns all rows from the first SELECT statement
except for those also returned by the second SELECT.
SELECT t.transid
, c.custname
FROM facts.transaction t
JOIN dimensions.customer c
ON c.customerid = t.customerid
EXCEPT
SELECT t.transid
, c.custname
FROM facts.transaction t
JOIN dimensions.customer c
ON c.customerid = t.customerid
WHERE t.transdate BETWEEN
'2008-01-01' AND '2008-01-21';
RESULTS = TABLE A MINUS TABLE B, where TABLE A is the first SELECT and
TABLE B is the second.
37. Set Operations – UNION – Example
SELECT t.transid
, c.custname
FROM facts.transaction t
JOIN dimensions.customer c
ON c.customerid = t.customerid
WHERE t.transdate BETWEEN
'2008-01-01' AND '2008-05-17'
UNION
SELECT t.transid
, c.custname
FROM facts.transaction t
JOIN dimensions.customer c
ON c.customerid = t.customerid
WHERE t.transdate BETWEEN
'2008-01-01' AND '2008-01-21';
The UNION operation DOES NOT ALLOW DUPLICATE ROWS
38. Set Operations – UNION ALL - Example
SELECT t.transid
, c.custname
FROM facts.transaction t
JOIN dimensions.customer c
ON c.customerid = t.customerid
WHERE t.transdate BETWEEN
'2008-01-01' AND '2008-05-17'
UNION ALL
SELECT t.transid
, c.custname
FROM facts.transaction t
JOIN dimensions.customer c
ON c.customerid = t.customerid
WHERE t.transdate BETWEEN
'2008-01-01' AND '2008-01-21';
The UNION ALL operation MAY CREATE DUPLICATE ROWS
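The difference between EXCEPT, UNION and UNION ALL is easiest to see side by side. The sketch below assumes Python's built-in sqlite3 as a stand-in for Greenplum; the tables a and b, their rows, and the run helper are hypothetical.

```python
import sqlite3

# Hypothetical single-column tables; SQLite stands in for Greenplum.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (transid INTEGER);
    CREATE TABLE b (transid INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (2), (3);
""")


def run(set_operator):
    """Combine the two SELECTs with the given set operator and return the ids."""
    sql = f"SELECT transid FROM a {set_operator} SELECT transid FROM b ORDER BY transid"
    return [row[0] for row in conn.execute(sql)]


union_rows = run("UNION")          # duplicates removed
union_all_rows = run("UNION ALL")  # duplicates kept
except_rows = run("EXCEPT")        # rows of a that are not in b
print(union_rows, union_all_rows, except_rows)
```

Rows 2 and 3 appear in both inputs: UNION returns them once, UNION ALL returns them twice, and EXCEPT removes them from the first input.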
39. Greenplum allows Sub-Queries
IN clause is fully supported …
ORIGINAL QUERY:
SELECT *
FROM facts.transaction t
WHERE t.customerid IN (SELECT customerid
FROM dimensions.customer c);
HOWEVER, THIS WILL PERFORM BETTER IN MOST CASES:
SELECT t.*
FROM facts.transaction t
INNER JOIN dimensions.customer c
ON c.customerid = t.customerid;
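A small sketch comparing the two formulations, assuming Python's built-in sqlite3 as a stand-in for Greenplum and hypothetical tables fact_transaction and dim_customer. It says nothing about performance; it only checks that the result sets agree, and shows the one case where they do not.

```python
import sqlite3

# Hypothetical tables and rows; SQLite stands in for Greenplum.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_customer (customerid INTEGER, custname TEXT);
    CREATE TABLE fact_transaction (transid INTEGER, customerid INTEGER);
    INSERT INTO dim_customer VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO fact_transaction VALUES (100, 1), (101, 2), (102, 3);
""")

IN_SQL = """
    SELECT t.transid FROM fact_transaction t
    WHERE t.customerid IN (SELECT customerid FROM dim_customer c)
    ORDER BY t.transid
"""
JOIN_SQL = """
    SELECT t.transid FROM fact_transaction t
    INNER JOIN dim_customer c ON c.customerid = t.customerid
    ORDER BY t.transid
"""


def ids(sql):
    return [row[0] for row in conn.execute(sql)]


# customerid is unique in dim_customer, so both queries return the same rows.
in_rows, join_rows = ids(IN_SQL), ids(JOIN_SQL)

# A duplicate customer row changes the join result but not the IN result.
conn.execute("INSERT INTO dim_customer VALUES (1, 'Acme (duplicate)')")
in_rows_dup, join_rows_dup = ids(IN_SQL), ids(JOIN_SQL)
print(in_rows, join_rows, in_rows_dup, join_rows_dup)
```

The join rewrite is a safe replacement for IN only when the joined column is unique in the subquery's table; otherwise matching rows are repeated.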
41. Types of Functions
Greenplum supports several function types:
query language functions (functions written in SQL)
procedural language functions (functions written in, for example, PL/pgSQL or PL/Tcl)
internal functions
C-language functions
This class will only present the first two types of functions:
Query Language and Procedural Language
42. Query Language Function – More Rules
• May contain SELECT statements
• May contain DML statements
Insert, Update or Delete statements
• May not contain:
Rollback, Savepoint, Begin or Commit commands
• Last statement must be SELECT unless the
return type is void.
43. Query Language Function – Example
This function has no parameters and returns no value:
CREATE FUNCTION public.clean_customer()
RETURNS void AS 'DELETE FROM dimensions.customer
WHERE state = ''NA''; '
LANGUAGE SQL;
It is executed as a SELECT statement:
SELECT public.clean_customer();
clean_customer
-----------
(1 row)
44. Query Language Function – With Parameter
CREATE FUNCTION public.clean_specific_customer(which char(2))
RETURNS void AS 'DELETE FROM dimensions.customer
WHERE state = $1; '
LANGUAGE SQL;
It is executed as a SELECT statement:
SELECT public.clean_specific_customer('NA');
clean_specific_customer
-----------------------
(1 row)
45. SQL Functions Returning Sets
When an SQL function is declared as returning SETOF sometype, the
function's final SELECT query is executed to completion, and each
row it outputs is returned as an element of the result set.
CREATE FUNCTION dimensions.getstatecust(char)
RETURNS SETOF customer AS $$
SELECT *
FROM dimensions.customer WHERE state = $1;
$$ LANGUAGE SQL;
SELECT *,UPPER(city)
FROM dimensions.getstatecust('WA');
This function returns all qualifying rows.
46. Procedural Language Functions
• PL/pgSQL Procedural Language for Greenplum
Can be used to create functions and procedures
Adds control structures to the SQL language
Can perform complex computations
Inherits all user-defined types, functions, and operators
Can be defined to be trusted by the server
Is (relatively) easy to use.
47. Structure of PL/pgSQL
• PL/pgSQL is a block-structured language.
CREATE FUNCTION somefunc() RETURNS integer AS $$
<< outerblock >>
DECLARE
    quantity integer := 30;
BEGIN
    RAISE NOTICE 'Quantity here is %', quantity;  -- Prints 30
    quantity := 50;
    -- Create a subblock
    DECLARE
        quantity integer := 80;
    BEGIN
        RAISE NOTICE 'Quantity here is %', quantity;  -- Prints 80
        RAISE NOTICE 'Outer quantity here is %', outerblock.quantity;  -- Prints 50
    END;
    RAISE NOTICE 'Quantity here is %', quantity;  -- Prints 50
    RETURN quantity;
END;
$$ LANGUAGE plpgsql;
48. PL/pgSQL Declarations
• All variables used in a block must be declared in the declarations
section of the block.
• SYNTAX:
name [ CONSTANT ] type [ NOT NULL ] [ { DEFAULT | := } expression ];
• EXAMPLES:
user_id integer;
quantity numeric(5);
url varchar;
myrow tablename%ROWTYPE;
myfield tablename.columnname%TYPE;
somerow RECORD;
49. PL/pgSQL %Types
• %TYPE provides the data type of a variable or
table column.
• You can use this to declare variables that will
hold database values.
SYNTAX: variable%TYPE
EXAMPLES:
custid customer.customerid%TYPE;
tid transaction.transid%TYPE;
50. PL/pgSQL Basic Statements
• Assignment
variable := expression;
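• Executing a function or query and discarding its result (no RETURN)
PERFORM myfunction(myparm1,myparm2);
• Single row results
Use SELECT … INTO mytarget; where mytarget is a declared variable
(for example, one declared with %TYPE).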
51. PL/pgSQL – Executing Dynamic SQL
• Oftentimes you will want to generate dynamic
commands inside your PL/pgSQL functions
• This is for commands that will involve different
tables or different data types each time they are
executed.
• To handle this sort of problem, the EXECUTE
statement is provided:
EXECUTE command-string [ INTO [STRICT] target ];
52. PL/pgSQL – Dynamic SQL Example
Dynamic values that are to be inserted into the constructed query
require careful handling since they might themselves contain quote
characters.
EXECUTE 'UPDATE tbl SET '
|| quote_ident(colname)
|| ' = '
|| quote_literal(newvalue)
|| ' WHERE key = '
|| quote_literal(keyvalue);
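To see the quoting in action, the statement can be wrapped in a function and called with a value that contains a quote character. This is a sketch under assumptions: the function name set_column_value and the column note are hypothetical, and tbl is the placeholder table from the slide.

```sql
-- Hypothetical wrapper around the EXECUTE statement shown above.
CREATE OR REPLACE FUNCTION set_column_value(colname text, newvalue text, keyvalue text)
RETURNS void AS $$
BEGIN
    EXECUTE 'UPDATE tbl SET '
        || quote_ident(colname)       -- quotes the identifier only if needed
        || ' = '
        || quote_literal(newvalue)    -- doubles any embedded single quotes
        || ' WHERE key = '
        || quote_literal(keyvalue);
END;
$$ LANGUAGE plpgsql;

-- Builds and runs: UPDATE tbl SET note = 'O''Reilly' WHERE key = '42'
SELECT set_column_value('note', 'O''Reilly', '42');
```

Note that quote_literal returns NULL for a NULL argument, which makes the whole command string NULL, so callers should handle NULL values before building the command.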
53. PL/pgSQL – FOUND
• FOUND is set by:
A SELECT INTO statement sets FOUND true if a row is assigned, false if
no row is returned.
A PERFORM statement sets FOUND true if it produces (and discards)
one or more rows, false if no row is produced.
UPDATE, INSERT, and DELETE statements set FOUND true if at least
one row is affected, false if no row is affected.
A FETCH statement sets FOUND true if it returns a row, false if no row
is returned.
A MOVE statement sets FOUND true if it successfully repositions the
cursor, false otherwise.
A FOR statement sets FOUND true if it iterates one or more times, else
false.
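A typical use of FOUND is to check whether an UPDATE touched any rows. The sketch below is illustrative only: the function name rename_store and the storename column are assumptions, while the store table and storeid column appear elsewhere in this deck.

```sql
-- Hypothetical example: assumes store(storeid integer, storename text).
CREATE OR REPLACE FUNCTION rename_store(p_id integer, p_name text)
RETURNS boolean AS $$
BEGIN
    UPDATE store SET storename = p_name WHERE storeid = p_id;

    -- FOUND is false when the UPDATE affected no rows
    IF NOT FOUND THEN
        RAISE NOTICE 'No store with id %', p_id;
        RETURN false;
    END IF;

    RETURN true;
END;
$$ LANGUAGE plpgsql;
```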
54. PL/pgSQL – Returning from a Function
• RETURN NEXT and RETURN QUERY do not actually return from the
function; they simply append zero or more rows to the function's result
set.
EXAMPLE:
CREATE OR REPLACE FUNCTION getAllStores()
RETURNS SETOF store AS
$BODY$
DECLARE r store%rowtype;
BEGIN
FOR r IN SELECT * FROM store WHERE storeid > 0 LOOP
-- can do some processing here
RETURN NEXT r;
-- return current row of SELECT
END LOOP;
RETURN;
END
$BODY$
LANGUAGE plpgsql;
55. PL/pgSQL – Conditionals
IF – THEN – ELSE conditionals let you execute commands based on
certain conditions.
IF boolean-expression THEN
statements
[ ELSIF boolean-expression THEN
statements
[ ELSIF boolean-expression THEN
statements
... ] ]
[ ELSE
statements ]
END IF;
TIP: Put the most commonly occurring condition first!
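As a concrete instance of the syntax above, here is a small sketch; the function name sign_label is hypothetical and chosen only for illustration.

```sql
-- Hypothetical example of IF / ELSIF / ELSE.
CREATE OR REPLACE FUNCTION sign_label(n integer) RETURNS text AS $$
BEGIN
    IF n > 0 THEN
        RETURN 'positive';
    ELSIF n < 0 THEN
        RETURN 'negative';
    ELSE
        RETURN 'zero';   -- also reached when n is NULL
    END IF;
END;
$$ LANGUAGE plpgsql;

SELECT sign_label(-5);  -- returns 'negative'
```

The branches are tested in order and only the first one that evaluates to true runs, which is why ordering the conditions by likelihood can save work.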
56. PL/pgSQL – Simple Loops
• With the LOOP, EXIT, CONTINUE, WHILE, and FOR statements,
you can arrange for your PL/pgSQL function to repeat a series of
commands.
LOOP SYNTAX:
[ <<label>> ]
LOOP
statements
END LOOP [ label ];
57. PL/pgSQL – Simple Loops
EXIT EXAMPLES:
LOOP
-- some computations
IF count > 0
THEN EXIT;
-- exit loop
END IF;
END LOOP;
LOOP
-- some computations
EXIT WHEN count > 0;
-- same result as previous example
END LOOP;
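<<ablock>>
BEGIN
-- some computations
IF stocks > 100000
THEN EXIT ablock;
-- causes exit from the BEGIN block; newer PostgreSQL-based releases
-- require the label, because an unlabelled EXIT only matches loops
END IF;
END;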
58. PL/pgSQL – Simple Loops
CONTINUE can be used within all loops.
CONTINUE EXAMPLE:
LOOP
-- some computations
EXIT WHEN count > 100;
CONTINUE WHEN count < 50;
-- some computations for count IN [50 .. 100]
END LOOP;
59. PL/pgSQL – Simple Loops
The WHILE statement repeats a sequence of statements so long as
the boolean-expression evaluates to true. The expression is checked
just before each entry to the loop body.
WHILE EXAMPLE:
WHILE customerid < 50 LOOP
-- some computations
END LOOP;
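A complete function built around a WHILE loop might look like the following sketch; the function name sum_to is hypothetical.

```sql
-- Hypothetical example: sums the integers 1..n with a WHILE loop.
CREATE OR REPLACE FUNCTION sum_to(n integer) RETURNS integer AS $$
DECLARE
    i     integer := 1;
    total integer := 0;
BEGIN
    -- The condition is checked before each pass, so the body
    -- never runs when n is less than 1.
    WHILE i <= n LOOP
        total := total + i;
        i := i + 1;
    END LOOP;
    RETURN total;
END;
$$ LANGUAGE plpgsql;

SELECT sum_to(4);  -- 1 + 2 + 3 + 4 = 10
```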
60. PL/pgSQL – Simple For (integer) Loops
This form of FOR creates a loop that iterates over a range of integer
values.
EXAMPLE:
FOR i IN 1..3 LOOP
-- i will take on the values 1,2,3 within the loop
END LOOP;
FOR i IN REVERSE 3..1 LOOP
-- i will take on the values 3,2,1 within the loop
END LOOP;
FOR i IN REVERSE 8..1 BY 2 LOOP
-- i will take on the values 8,6,4,2 within the loop
END LOOP;
61. PL/pgSQL – Simple For (Query) Loops
Using a different type of FOR loop, you can iterate through the results
of a query and manipulate that data accordingly.
EXAMPLE:
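CREATE FUNCTION nukedimensions() RETURNS integer AS $$
DECLARE mdims RECORD;
BEGIN
-- pg_class has no schema column; join pg_namespace to filter by schema
FOR mdims IN
SELECT c.relname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'dimensions' AND c.relkind = 'r'
LOOP
-- Now "mdims" holds one table name from the dimensions schema
EXECUTE 'TRUNCATE TABLE dimensions.' || quote_ident(mdims.relname);
END LOOP;
RETURN 1;
END;
$$ LANGUAGE plpgsql;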
62. PL/pgSQL – Trapping Errors
• By default, any error occurring in a PL/pgSQL function aborts
execution of the function, and indeed of the surrounding
transaction as well.
• You can trap errors and recover from them by using a BEGIN block
with an EXCEPTION clause.
BEGIN
statements
EXCEPTION WHEN condition [OR condition ... ] THEN
handler_statements
[ WHEN condition [ OR condition ... ] THEN
handler_statements ... ]
END;
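The sketch below shows the EXCEPTION clause trapping one specific condition. The function name safe_divide is hypothetical and the choice to return NULL is only one possible recovery strategy.

```sql
-- Hypothetical example: traps division_by_zero and returns NULL instead.
CREATE OR REPLACE FUNCTION safe_divide(a numeric, b numeric)
RETURNS numeric AS $$
BEGIN
    RETURN a / b;
EXCEPTION
    WHEN division_by_zero THEN
        RAISE NOTICE 'division by zero, returning NULL';
        RETURN NULL;
END;
$$ LANGUAGE plpgsql;

SELECT safe_divide(10, 0);  -- raises a NOTICE and returns NULL
```

A block with an EXCEPTION clause costs more to enter and exit than one without, so it is best reserved for code that really needs to recover from an error.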
63. PL/pgSQL – Common Exception Conditions
Condition                  Description
NO_DATA                    No data matches predicates
CONNECTION_FAILURE         SQL cannot connect
DIVISION_BY_ZERO           Invalid division operation, 0 in divisor
INTERVAL_FIELD_OVERFLOW    Insufficient precision for timestamp
NULL_VALUE_VIOLATION       Tried to insert NULL into NOT NULL column
RAISE_EXCEPTION            Error raising an exception
PLPGSQL_ERROR              PL/pgSQL error encountered at run time
NO_DATA_FOUND              Query returned zero rows
TOO_MANY_ROWS              Anticipated single row return, received more than 1 row