U-SQL Meta Data Catalog (SQLBits 2016)

Michael Rys
Principal Program Manager, Big Data @ Microsoft
@MikeDoesBigData, {mrys, usql}@microsoft.com
2016/04/04

Meta Data Object Model
[Diagram: ADLA Catalog object hierarchy. Recoverable labels:
• Containers: ADLA Catalog, Database, Schema (cardinalities [1,n], [1,n], [0,n])
• Objects: tables, views, TVFs, procedures, external tables, table types, data sources, credentials, statistics, clustered index, partitions
• C# objects: assemblies, functions, UDAggs, UDTs, extractors, processors, reducers, combiners, outputters, appliers
• Legend: abstract objects vs. user objects; relationships "Refers to", "Contains", "Implemented and named by"; MD name vs. C# name]

U-SQL Catalog
• Naming
• Discovery
• Sharing
• Securing
Naming
• Default database and schema context: master.dbo
• Quote identifiers with []: [my table]
• Stores data in ADL Storage /catalog folder
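The naming rules above can be sketched in a short script. This is a minimal illustration, not taken from the deck: the database SampleDb, the table [my table], and the output paths are hypothetical and assumed to exist already.

```usql
// With no USE statement, one-part names resolve against master.dbo,
// so this reads master.dbo.[my table] (hypothetical table).
@a = SELECT * FROM [my table];

// Change the default context (SampleDb is a hypothetical database).
USE DATABASE SampleDb;
USE SCHEMA dbo;

// The same one-part name now resolves to SampleDb.dbo.[my table].
@b = SELECT * FROM [my table];

// A three-part name ignores the current context.
@c = SELECT * FROM master.dbo.[my table];

OUTPUT @a TO "/output/a.csv" USING Outputters.Csv();
OUTPUT @b TO "/output/b.csv" USING Outputters.Csv();
OUTPUT @c TO "/output/c.csv" USING Outputters.Csv();
```

The square brackets are only needed because the name contains a space; plain identifiers such as SampleDb can be written unquoted.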
Discovery
• Visual Studio Server Explorer
• Azure Data Lake Analytics Portal
• SDKs and Azure PowerShell commands
Sharing
• Within an Azure Data Lake Analytics account
Securing
• Secured with AAD principals at catalog level (inherited from ADL Storage)

Views and TVFs
• Views for simple cases
• TVFs for parameterization and most cases
Views
CREATE VIEW V AS EXTRACT…
CREATE VIEW V AS SELECT …
• Cannot contain user-defined objects (such as UDFs or UDOs)
• Will be inlined
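As a hedged sketch of the view pattern above: the view name dbo.TripsView, the file /data/trips.csv, and its two columns are hypothetical. The definition and the query are shown as two separate scripts, since that is the pattern the product documentation uses.

```usql
// Script 1: define a view over a file using only built-in objects.
// /data/trips.csv and its columns are hypothetical.
CREATE VIEW IF NOT EXISTS dbo.TripsView AS
    EXTRACT driver_id int,
            city string
    FROM "/data/trips.csv"
    USING Extractors.Csv();
```

```usql
// Script 2: query the view; the compiler inlines its definition here.
@london =
    SELECT driver_id, city
    FROM dbo.TripsView
    WHERE city == "London";

OUTPUT @london TO "/output/london.csv" USING Outputters.Csv();
```

Because a view takes no parameters, the city filter has to live in the calling script; a TVF can take it as an argument instead.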
Table-Valued Functions (TVFs)
CREATE FUNCTION F (@arg string = "default")
RETURNS @res [TABLE ( … )]
AS BEGIN … @res = … END;
• Provides parameterization
• One or more results
• Can contain multiple statements
• Can contain user-code (needs assembly reference)
• Will always be inlined
• Infers schema or checks against specified return schema
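A minimal sketch of a parameterized TVF along the lines above. The function name dbo.TripsByCity, the input file, and its columns are hypothetical; the explicit RETURN follows the pattern in the product documentation's samples.

```usql
// Script 1: a TVF with a default parameter value and a declared return schema.
CREATE FUNCTION IF NOT EXISTS dbo.TripsByCity(@city string = "London")
RETURNS @res TABLE (driver_id int, city string)
AS
BEGIN
    // Multiple statements are allowed inside the body.
    @trips =
        EXTRACT driver_id int,
                city string
        FROM "/data/trips.csv"          // hypothetical input file
        USING Extractors.Csv();

    @res =
        SELECT driver_id, city
        FROM @trips
        WHERE city == @city;

    RETURN;
END;
```

```usql
// Script 2: call the TVF with the default and with an explicit argument.
@default_city = dbo.TripsByCity(DEFAULT);
@leeds = SELECT * FROM dbo.TripsByCity("Leeds") AS t;

OUTPUT @default_city TO "/output/default_city.csv" USING Outputters.Csv();
OUTPUT @leeds TO "/output/leeds.csv" USING Outputters.Csv();
```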

Procedures
Allows encapsulation of non-DDL scripts
CREATE PROCEDURE P (@arg string = "default")
AS
BEGIN
…;
OUTPUT @res TO …;
INSERT INTO T …;
END;
• Provides parameterization
• No result but writes into file or table
• Can contain multiple statements
• Can contain user code (needs assembly reference)
• Will always be inlined
• Cannot contain DDL (no CREATE, DROP)
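The procedure pattern above might look like the following sketch. The procedure name dbo.ExportTrips, the input file, and the output path are hypothetical.

```usql
// Script 1: a procedure that returns nothing and writes its result to a file.
CREATE PROCEDURE dbo.ExportTrips(@city string = "London")
AS
BEGIN
    @trips =
        EXTRACT driver_id int,
                city string
        FROM "/data/trips.csv"          // hypothetical input file
        USING Extractors.Csv();

    @res =
        SELECT driver_id, city
        FROM @trips
        WHERE city == @city;

    // No CREATE or DROP here: only DML and OUTPUT are allowed in the body.
    OUTPUT @res TO "/output/exported_trips.csv" USING Outputters.Csv();
END;
```

```usql
// Script 2: invoke the procedure as a statement.
dbo.ExportTrips("Leeds");
```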

Table types
Enables you to name a table schema
Provides reuse for function/procedure definitions
CREATE TYPE T AS TABLE(c1 string, c2 int );
CREATE FUNCTION F (@table_arg T)
RETURNS @res T
AS BEGIN … @res = … END;
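A hedged sketch of a table type reused for both a parameter and a return value. The names dbo.TripRow and dbo.LondonOnly and the input file are hypothetical, and the call with a rowset variable as the argument is an assumption about how table-typed parameters are passed.

```usql
// Script 1: name the schema once, then reuse it in the function signature.
CREATE TYPE dbo.TripRow AS TABLE(driver_id int, city string);

CREATE FUNCTION IF NOT EXISTS dbo.LondonOnly(@trips dbo.TripRow)
RETURNS @res dbo.TripRow
AS
BEGIN
    @res =
        SELECT driver_id, city
        FROM @trips
        WHERE city == "London";
    RETURN;
END;
```

```usql
// Script 2: pass a rowset whose schema matches the table type.
@in =
    EXTRACT driver_id int,
            city string
    FROM "/data/trips.csv"              // hypothetical input file
    USING Extractors.Csv();

@out = dbo.LondonOnly(@in);

OUTPUT @out TO "/output/london_trips.csv" USING Outputters.Csv();
```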

Tables
• CREATE TABLE
• CREATE TABLE AS SELECT
CREATE TABLE T (col1 int
               , col2 string
               , col3 SQL.MAP<string,string>
               , INDEX idx CLUSTERED (col1 ASC)
                 PARTITIONED BY HASH (col1)
               );
• Structured Data
• Built-in Data types only (no UDTs)
• Clustered index (must be specified): row-oriented
• Fine-grained partitioning (must be specified): HASH, DIRECT HASH, RANGE, ROUND ROBIN
CREATE TABLE T (INDEX idx CLUSTERED …) AS SELECT …;
CREATE TABLE T (INDEX idx CLUSTERED …) AS EXTRACT…;
CREATE TABLE T (INDEX idx CLUSTERED …) AS myTVF(DEFAULT);
• Infer the schema from the query
• Still requires index and partitioning
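The two table-creation forms above can be sketched as follows. The table names and values are hypothetical. The PARTITIONED BY HASH spelling follows this deck; later U-SQL releases, as far as I recall, spell the fine-grained clause DISTRIBUTED BY, so adjust it if the compiler rejects it. The scripts are kept separate because the documentation notes that a table cannot be read in the script that creates it.

```usql
// Script 1: explicit schema, clustered index, and hash partitioning,
// followed by an insert from literal rows.
CREATE TABLE IF NOT EXISTS dbo.Trips
(
    driver_id int,
    city string,
    INDEX idx CLUSTERED (driver_id ASC)
    PARTITIONED BY HASH (driver_id)
);

INSERT INTO dbo.Trips
SELECT *
FROM (VALUES (1, "London"), (2, "Leeds")) AS t(driver_id, city);
```

```usql
// Script 2: CREATE TABLE AS SELECT infers the columns from the query,
// but the index and partitioning still have to be spelled out.
CREATE TABLE dbo.LondonTrips
(
    INDEX idx CLUSTERED (driver_id ASC)
    PARTITIONED BY HASH (driver_id)
) AS
SELECT driver_id, city
FROM dbo.Trips
WHERE city == "London";
```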

Additional Resources
Documentation
U-SQL DDL: https://msdn.microsoft.com/en-us/library/azure/mt621299.aspx
Sample Projects
https://github.com/Azure/usql/tree/master/Examples/AmbulanceDemos/AmbulanceDemos/2-Ambulance-Structured%20Data
https://github.com/Azure/usql/tree/master/Examples/TweetAnalysis
