BASEL BERN BRUGG DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. GENF
HAMBURG KOPENHAGEN LAUSANNE MÜNCHEN STUTTGART WIEN ZÜRICH
External Tables
- Not *Just* Loading a CSV File
Kim Berg Hansen
Senior Consultant
About me
External Tables - Not *Just* Loading a CSV File2 9/21/2018
• Danish geek
• SQL & PL/SQL developer since 2000
• Developer at Trivadis since 2016
http://www.trivadis.dk
• Oracle Certified Expert in SQL
• Oracle ACE Director
• Blogger at http://www.kibeha.dk
• SQL quizmaster at
http://devgym.oracle.com
• Likes to cook
• Reads sci-fi
• Member of Danish Beer Enthusiasts
3 Membership Tiers
• Oracle ACE Director
• Oracle ACE
• Oracle ACE Associate
bit.ly/OracleACEProgram
500+ Technical Experts
Helping Peers Globally
Connect:
Nominate yourself or someone you know: acenomination.oracle.com
@oracleace
Facebook.com/oracleaces
oracle-ace_ww@oracle.com
About Trivadis
Trivadis is a market leader in IT consulting, system integration, solution engineering
and the provision of IT services focusing on and
technologies in Switzerland, Germany, Austria and Denmark.
We offer our services in the following strategic business fields:
Trivadis Services takes over the interacting operation of your IT systems.
O P E R A T I O N
COPENHAGEN
MUNICH
LAUSANNE
BERN
ZURICH
BRUGG
GENEVA
HAMBURG
DÜSSELDORF
FRANKFURT
STUTTGART
FREIBURG
BASEL
VIENNA
With over 600 specialists and IT experts in your region
14 Trivadis branches and more than
600 employees
260 Service Level Agreements
Over 4,000 training participants
Research and development budget:
EUR 5.0 million
Financially self-supporting and
sustainably profitable
Experience from more than 1,900
projects per year at over 800
customers
External Tables - Not *Just* Loading a CSV File
1. Access Drivers, Parameters, Locations
2. Definition versus Runtime
3. Error Handling, Logging Files
4. Flat Files input
5. Preprocessor
6. Multiple Files, Parallelism, Partition Pruning
7. Trusted Relied Constraints
8. SQL*Loader as Generator
9. External Table with Datapump Dump Files
10. HDFS / HIVE
Access Drivers
Parameters
Locations
External Tables
A way to treat a file outside of the database as a rowsource
Enables SELECT from the file with all the power of SQL
– Without necessarily loading the data into a table in the database
Different filetypes supported with different Access Drivers
select t1.col1, t2.col2
from db_tab t1
join ext_tab t2
on t2.fk = t1.pk
where t1.grp = 'FOO';
Creation
Definition created in data dictionary* like normal table (only data is outside DB)
(* in 18c not necessarily - more on that later)
Specify type (access driver), directory and location (file)
Specify access parameters depending on access driver
create table ext_tab (fk number, col2 varchar2(10))
organization external (
type oracle_loader
access parameters (
records delimited by newline
fields terminated by ";" optionally enclosed by '"'
( fk integer external(6), col2 char(10) )
)
location (ext_dir:'file.txt')
);
Access Driver
Keyword TYPE specifies which access driver to use
ORACLE_LOADER
– Flat files - alternative to SQL*Loader
ORACLE_DATAPUMP
– Data Pump dump files - can also write files (once - at creation time)
ORACLE_HDFS (12.2) Oracle Big Data SQL
– Read datafiles from HDFS (by creating a HIVE table)
ORACLE_HIVE (12.2) Oracle Big Data SQL
– Read datafiles from HDFS by querying a HIVE catalog
Access Parameters
Specific for each Access Driver type
Tells DB the metadata of the file, how to get the values of each column
18c doc states that opaque_format_spec in quotes is used for inline EXTERNAL and
EXTERNAL MODIFY, while without quotes is used for CREATE TABLE
– This appears to be a doc bug - without quotes always seems to work
Or a subquery can return the access parameters
Location
Keyword LOCATION contains one or more filenames
For ORACLE_LOADER and ORACLE_DATAPUMP files in filesystem
– DIRECTORY object must be created and privileges granted
– DIRECTORY object specified for file: DIRNAME:'file.txt'
– Or DEFAULT DIRECTORY specifies directory for files where dir. is omitted
– (12.1) Location supports wildcards * and ?
For ORACLE_HDFS location specifies hdfs:/... style URI
For ORACLE_HIVE location unused - access parameters specifies cluster/table
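A minimal sketch of the directory setup described above, reusing the ext_tab definition from the Creation slide. The OS path /data/ext, the grantee app_user and the two file names are assumptions for illustration only.

```sql
-- Run as a user allowed to create directory objects.
-- The path and the grantee are hypothetical.
create or replace directory ext_dir as '/data/ext';
grant read, write on directory ext_dir to app_user;

create table ext_tab (fk number, col2 varchar2(10))
organization external (
  type oracle_loader
  default directory ext_dir   -- used for files listed without a directory prefix
  access parameters (
    records delimited by newline
    fields terminated by ';' optionally enclosed by '"'
    ( fk integer external(6), col2 char(10) )
  )
  location ('file1.txt', ext_dir:'file2.txt')
);
```

WRITE on the directory is granted here because the log and bad files are written to the default directory unless redirected.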
Definition versus Runtime
Definition in Data Dictionary
Define with CREATE TABLE
Change with ALTER TABLE
– Often useful to change LOCATION
– Some restrictions on what can be altered - see manual of each version
Change the projection with ALTER TABLE
– PROJECT COLUMN ALL / PROJECT COLUMN REFERENCED
- The latter may cause inconsistencies if errors in un-referenced columns
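The changes listed above could look like this against the ext_tab table from the Creation slide; the file name is a made-up example.

```sql
-- Point the existing definition at a different file (hypothetical name).
alter table ext_tab location ('file_2018_09.txt');

-- Validate every field of every row, whether the query references it or not.
alter table ext_tab project column all;

-- Only process the fields a query references; can be faster, but results
-- may differ between queries if unreferenced fields hold bad data.
alter table ext_tab project column referenced;
```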
Overrides at Runtime (12.2)
SELECT ... FROM EXT_TAB EXTERNAL MODIFY (...)
– Modify default directory and/or location
- Allows each session/query to read own (identically structured) file(s)
– Modify reject limit
– Modify badfile / logfile / discardfile
Careful with your security
– A user with SELECT privilege on the external table can potentially read all files in
the DIRECTORY objects he has READ privilege on
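A sketch of a runtime override (12.2 or later), assuming ext_tab exists as defined earlier and that a file named session_file.txt (hypothetical) with the same structure sits in the table's default directory.

```sql
-- The dictionary definition of ext_tab is untouched;
-- only this query reads the other file and uses the other reject limit.
select fk, col2
from ext_tab external modify (
       location ('session_file.txt')
       reject limit 100
     )
where fk > 0;
```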
Everything at Runtime (18.1)
Inline definition of External Table
Nothing in data dictionary (hence also less information for the optimizer)
select fk, col2
from external (
(fk number, col2 varchar2(10))
type oracle_loader
access parameters (
records delimited by newline
fields terminated by ";" optionally enclosed by '"'
( fk integer external(6), col2 char(10) )
)
location (ext_dir:'file.txt')
);
Error Handling
Logging Files
Errors in the Data
Errors in the data may or may not return an error
– REJECT LIMIT 0 (default) = first occurrence of bad data throws error
– REJECT LIMIT {int} = error is thrown once more than {int} rows have been rejected
– REJECT LIMIT UNLIMITED = no errors thrown
Bad rows of data are copied to the BADFILE
Note: If you have ALTER TABLE ... PROJECT COLUMN REFERENCED
– When column with bad data is in SELECT list => row goes to BADFILE
– When column with bad data is not in SELECT list => row is selected
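One way to observe the projection effect noted above, assuming ext_tab from the Creation slide points at a file where some rows have a non-numeric value in the fk field.

```sql
alter table ext_tab reject limit unlimited;

-- COUNT(*) references no columns, so with REFERENCED no field is validated
-- and the rows with bad data are counted as well.
alter table ext_tab project column referenced;
select count(*) from ext_tab;

-- With ALL every field is validated; rows with bad data go to the BADFILE
-- and are left out of the count.
alter table ext_tab project column all;
select count(*) from ext_tab;
```

If the two counts differ, the difference is the number of rejected rows.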
Logging Files
Three parameter pairs
– NOLOGFILE / LOGFILE dir_obj:'ext.log'
– NOBADFILE / BADFILE dir_obj:'ext.bad'
– NODISCARDFILE / DISCARDFILE dir_obj:'ext.dcs'
Can use symbol substitution for uniqueness
- %p = Process id of user process doing the SELECT
- %a = Agent number of slave process by parallel access
Each of them defaults to {table_name}_%p.{ext}
BADFILE contains those rows that could not be imported
DISCARDFILE contains those rows that were skipped by LOAD WHEN clause
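A sketch that names all three files explicitly and uses LOAD WHEN so that the discard file has something to receive. The table name ext_tab_logged, the file names and the "SKIP" marker value are assumptions; the rest follows the ext_tab definition from the Creation slide.

```sql
create table ext_tab_logged (fk number, col2 varchar2(10))
organization external (
  type oracle_loader
  default directory ext_dir
  access parameters (
    records delimited by newline
    load when (col2 != "SKIP")
    badfile ext_dir:'ext_tab_%p.bad'
    discardfile ext_dir:'ext_tab_%p.dcs'
    logfile ext_dir:'ext_tab_%a_%p.log'
    fields terminated by ';' optionally enclosed by '"'
    ( fk integer external(6), col2 char(10) )
  )
  location ('file.txt')
)
reject limit unlimited;
```

The sketch puts no comments inside the ACCESS PARAMETERS clause on purpose, since the access driver is stricter about comment placement than SQL is.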
Flat Files input
Overall file characteristics
CHARACTERSET
– Which character set the file uses (default is the DB character set, not the client's)
LANGUAGE
– Which language is used for month names, AM/PM, etc. in the file
TERRITORY
– Which decimal / thousands separators, week numbering, etc. are used in the file
DATA IS BIG ENDIAN / DATA IS LITTLE ENDIAN
– Which endianness was used by the platform where the file originated
Records
FIXED
– Each record a fixed length (in bytes)
VARIABLE
– Start of each record contains a character count
DELIMITED BY
– Each record ends with a given string
XMLTAG
– Each record is the content within a given XML tag: <MYTAG>....</MYTAG>
Fields
Field list for the file does not necessarily match the field list for the table directly; it can be mapped differently
ALL FIELDS OVERRIDE - states that the field list does match the table columns directly
– Then only list the fields that need extra info, such as a non-default date format
FIELD NAMES clause tells how to handle a first line that contains field names
– Can be ignored or can map fields automatically by field name
TERMINATED BY / [OPTIONALLY] ENCLOSED BY
FIELDS CSV
– WITH / WITHOUT EMBEDDED - does file contain record delim within string fields
– TERMINATED / ENCLOSED - override default , and "
Specifying Field Positions (when not delimited)
Start position
– Digit is position directly
– * means the start is the char after the end of previous field
– *+{offset} or *-{offset} means plus or minus offset chars after end of previous field
End can be specified as position (Digit) or as length (+Digit)
STRING SIZES ARE IN
– Parameter says if positions are measured in bytes or chars (for multibyte charsets)
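A sketch of positional fields, assuming a hypothetical file fixed.txt where each line holds a 6-character id, a 10-character name, one filler character and a 3-character code. Table, column and file names are made up for illustration.

```sql
create table ext_fixed (id number, name varchar2(10), code varchar2(3))
organization external (
  type oracle_loader
  default directory ext_dir
  access parameters (
    records delimited by newline
    string sizes are in characters
    fields rtrim (
      id   position(1:6)    char(6),
      name position(*:+10)  char(10),
      code position(*+1:+3) char(3)
    )
  )
  location ('fixed.txt')
);
```

Here id uses absolute start and end positions, name starts right after id and is given by length, and code skips one character after name before taking three.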
Datatypes
INTEGER, DECIMAL, FLOAT, DOUBLE
– Specifying EXTERNAL means the numbers are represented as strings in the file
– Without EXTERNAL they are binary, in the format a C program would write them
- Access parameter DATA IS BIG / LITTLE ENDIAN used here
RAW, VARRAW, VARRAWC
– Binary data, fixed length or variable with first bytes indicating length
ORACLE_DATE, ORACLE_NUMBER
– Binary representations of Oracle DATE or NUMBER datatype
Datatypes (continued)
CHAR, VARCHAR, VARCHARC
– Character data, fixed length or variable with first bytes indicating length
– VARCHAR length indicator is bytes, VARCHARC length indicator is characters
– CHAR also used for DATE, TIMESTAMP, INTERVAL:
– DATE_FORMAT {type} MASK "{format mask}"
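A sketch combining ALL FIELDS OVERRIDE with a date mask, assuming a hypothetical semicolon-delimited file emp.txt whose fields come in the same order as the table columns. Only the date field is listed, because it is the only one that needs extra information.

```sql
create table ext_emp_dates (emp_id number, ename varchar2(20), hiredate date)
organization external (
  type oracle_loader
  default directory ext_dir
  access parameters (
    records delimited by newline
    fields terminated by ';' optionally enclosed by '"'
    all fields override
    ( hiredate char(10) date_format date mask "yyyy-mm-dd" )
  )
  location ('emp.txt')
);
```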
COLUMN TRANSFORMS
{column_name} FROM {transformation}
– NULL - sets column in all rows to NULL
– CONSTANT - sets column in all rows to specified literal
– CONCAT - sets column to concatenation of field(s) and/or literal(s)
– STARTOF - sets column to a substring from the start of a field
– LOBFILE - sets column to a LOB loaded from another file
directory object / filename can be a field or literal
Preprocessor
Preprocessor
PREPROCESSOR [{directory}:]{script_or_exe_file}
Must have EXECUTE privilege on directory object
Can be different directory than the datafile - this is recommended for security
Preprocessor script/exe will be called with filename from LOCATION as parameter
Standard output from script/exe will become the input for the EXTERNAL TABLE
Cannot specify arguments directly
– if executable requires arguments, must wrap it in a script
Windows script (batch file) must have suffix .bat or .cmd
Windows batch file must start with @echo off
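A sketch of the uncompress case on a Unix-like server, assuming zcat is found in /bin there and that the data file is a gzipped version of the file from the Creation slide. The directory object exec_dir, the grantee app_user and the table name are hypothetical.

```sql
-- Separate directory object for the executable, with EXECUTE but no WRITE.
create or replace directory exec_dir as '/bin';
grant execute on directory exec_dir to app_user;

create table ext_tab_gz (fk number, col2 varchar2(10))
organization external (
  type oracle_loader
  default directory ext_dir
  access parameters (
    records delimited by newline
    preprocessor exec_dir:'zcat'
    fields terminated by ';' optionally enclosed by '"'
    ( fk integer external(6), col2 char(10) )
  )
  location ('file.txt.gz')
);
```

zcat receives the full path of file.txt.gz as its only argument, and the rows are parsed from what it writes to standard output.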
Uses
Uncompress (gunzip / zcat)
– Process compressed file and stream uncompressed data as external table input
Directory listing
– Preprocessor script does ls / dir
Changing file content
– Do transformations with sed before the data is used for external table input
curl calls
– get http resources and feed them to external table input
Your imagination is the limit
Multiple Files
Parallelism
Partition Pruning
Multiple Files
LOCATION can contain multiple files, with or without directory specification
– If without, directory specified in DEFAULT DIRECTORY is used
Selecting from the external table reads all the files (except by partition pruning)
If field names are in first row, it can be in either just first file or all files
– Specify which with FIELD NAMES FIRST / ALL
Parallelism
Multiple files
– Each file specified in LOCATION is handled by its own slave process
- parallel degree not helpful to set larger than number of files
– This also means that PREPROCESSOR is called once per file by the slave process
Large files
– ORACLE_LOADER parallel select can attempt to assign file chunks to slaves
– Cannot always be done, for example not with:
- Named pipes as input
- Multibyte charactersets (unless fixed byte length records)
- Variable length records with length indicator bytes
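A sketch of matching the parallel degree to the number of files, assuming four identically structured files (hypothetical names) in the default directory of ext_tab.

```sql
alter table ext_tab location ('file1.txt', 'file2.txt', 'file3.txt', 'file4.txt');

-- Default degree for queries against the table.
alter table ext_tab parallel 4;

-- Or request the degree for a single query with a hint.
select /*+ parallel(t 4) */ count(*)
from ext_tab t;
```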
Partition Pruning (12.2)
Can be partitioned with RANGE, INTERVAL, LIST or composites of them
Each partition has one or more files in LOCATION clause
When optimizer does partition pruning, for an external table that means it only scans
the file(s) of that partition
DB trusts that the files of each partition contain only the specified partition key value(s)
If key values are wrong in the files:
– you can get output that does not match WHERE clause
– you may have data you cannot query with WHERE clause
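A sketch of a list-partitioned external table (12.2 or later). The table ext_sales, its columns and the file names are assumptions for illustration; each file is expected to hold only the rows of its own region.

```sql
create table ext_sales (region varchar2(2), amount number)
organization external (
  type oracle_loader
  default directory ext_dir
  access parameters (
    records delimited by newline
    fields terminated by ';'
    ( region char(2), amount char(20) )
  )
)
reject limit unlimited
partition by list (region) (
  partition p_dk values ('DK') location ('sales_dk.txt'),
  partition p_de values ('DE') location ('sales_de1.txt', 'sales_de2.txt')
);

-- With pruning, only sales_dk.txt needs to be scanned for this query.
select sum(amount) from ext_sales where region = 'DK';
```

If sales_dk.txt held a row with region DE, that row would be invisible to a query filtering on region = 'DE', because the database never opens that file for that predicate.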
Trusted Relied Constraints
Purposes of Constraints
On regular tables integrity constraints can be enforced
– Not possible to enforce on external tables - data comes from elsewhere
- Except NOT NULL constraint can be enforced - nulls go to bad file
– But you can say "trust me" and use RELY DISABLE on constraints (12.2)
- can do that for primary key, foreign key, unique constraints
- but not check constraint
With knowledge of the constraints, optimizer can make assumptions
that enable choosing more optimal access plans
– This also works with the trusted constraints on external tables
- QUERY_REWRITE_INTEGRITY = trusted or stale_tolerated
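A sketch of a trusted constraint on the ext_tab table from the Creation slide (12.2 or later), under the assumption that the values of fk really are unique in the file. The constraint name is made up.

```sql
-- Declared but never checked: the database takes our word for it.
alter table ext_tab
  add constraint ext_tab_fk_uq unique (fk) rely disable;

-- Let the optimizer use RELY constraints in this session.
alter session set query_rewrite_integrity = trusted;
```

Since nothing enforces the constraint, a file that violates it can lead to wrong results rather than an error.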
SQL*Loader as Generator
SQL*Loader for Creating External Tables
You have a SQL*Loader control file?
You want to do the same load (or almost) with an external table?
Use SQL*Loader parameter EXTERNAL_TABLE=GENERATE_ONLY
SQL*Loader won't load but instead writes the external table code to the log file
This code you can execute or edit as you wish
External Table with
Datapump Dump Files
Write (once) to Dump File
CTAS for ORACLE_DATAPUMP access driver
This created external table can be read, but not modified
create table ext_emp_tab
organization external (
type oracle_datapump
default directory ext_dir
location ('ext_emp.dmp')
)
as select * from emp;
Driver Parameters for Write
COMPRESSION
– ENABLED BASIC / LOW / MEDIUM / HIGH
- requires Advanced Compression option
ENCRYPTION
– ENABLED / DISABLED
VERSION
– COMPATIBLE / LATEST / version number
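A sketch of passing one of these write parameters, based on the CTAS from the previous slide. The table and dump file names are hypothetical, and as noted above the compression setting depends on the Advanced Compression option being licensed.

```sql
create table ext_emp_comp
organization external (
  type oracle_datapump
  default directory ext_dir
  access parameters (compression enabled basic)
  location ('ext_emp_comp.dmp')
)
as select * from emp;
```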
Parallel Write to Multiple Files
CTAS for ORACLE_DATAPUMP access driver
Parallel degree and number of files should match
– If number of files > parallel, extra files unused
– If parallel > number of files, parallel is reduced to number of files
create table ext_emp_tab
organization external (
type oracle_datapump
default directory ext_dir
location ('ext_emp1.dmp', 'ext_emp2.dmp', 'ext_emp3.dmp')
)
parallel 3
as select * from emp;
External Table to Read Dump File
Create external table on an existing Dump File (for example from other DB)
Dump file can be from other DB charset, other DB endianness
Reading from multiple files requires that all have been written with identical metadata
– Ext.table name, column names/types, charset, timezone must be identical
create table ext_emp_tab (
emp_id number, ename varchar2(20)
) organization external (
type oracle_datapump
default directory ext_dir
location ('ext_emp1.dmp', 'ext_emp2.dmp', 'ext_emp3.dmp')
);
HDFS / HIVE
Oracle Big Data SQL
External HDFS / HIVE tables for Oracle Big Data SQL (licensed product)
– Hadoop Clusters on Oracle Big Data Appliance
– Database on Exadata
HIVE metadata exposed to database
– ORACLE_HIVE external tables can just specify columns and HIVE cluster/table
– Can override mappings if desired
ORACLE_HDFS you specify HIVE style metadata directly, no table in HIVE catalog
Advantages
Big Data SQL Engine
– SmartScan on Hadoop
– Fast direct reads
– Oracle PQ => Hadoop parallelism
Advantages of Hadoop data directly in SQL
– Immediate use by anything that uses SELECT
– Fine-grained access control of Hadoop
– Data redaction, data masking
Questions & Answers
Kim Berg Hansen
Senior Consultant
email kim.berghansen@trivadis.com
twitter @kibeha
blog http://www.kibeha.dk

More Related Content

What's hot

0104 abap dictionary
0104 abap dictionary0104 abap dictionary
0104 abap dictionaryvkyecc1
 
documents writing with LATEX
documents writing with LATEXdocuments writing with LATEX
documents writing with LATEXAnusha Vajrapu
 
FileMan Training Part 2
FileMan Training Part 2FileMan Training Part 2
FileMan Training Part 2ckuyehar
 
ABAP Material 05
ABAP Material 05ABAP Material 05
ABAP Material 05warcraft_c
 
File organization and indexing
File organization and indexingFile organization and indexing
File organization and indexingraveena sharma
 
Sas short course_presentation_11-4-09
Sas short course_presentation_11-4-09Sas short course_presentation_11-4-09
Sas short course_presentation_11-4-09Prashant Ph
 
FileMan Training Part 3
FileMan Training Part 3FileMan Training Part 3
FileMan Training Part 3ckuyehar
 
FileMan Training Part 1
FileMan Training Part 1FileMan Training Part 1
FileMan Training Part 1ckuyehar
 
Cross-File Navigation & Enhanced Interactivity with TimeSavers + Navigation A...
Cross-File Navigation & Enhanced Interactivity with TimeSavers + Navigation A...Cross-File Navigation & Enhanced Interactivity with TimeSavers + Navigation A...
Cross-File Navigation & Enhanced Interactivity with TimeSavers + Navigation A...Shlomo Perets
 
Lecture03 abap on line
Lecture03 abap on lineLecture03 abap on line
Lecture03 abap on lineMilind Patil
 
File Organization
File OrganizationFile Organization
File OrganizationManyi Man
 

What's hot (20)

Introduction to Latex
Introduction to LatexIntroduction to Latex
Introduction to Latex
 
0104 abap dictionary
0104 abap dictionary0104 abap dictionary
0104 abap dictionary
 
documents writing with LATEX
documents writing with LATEXdocuments writing with LATEX
documents writing with LATEX
 
Introduction to LaTeX
Introduction to LaTeXIntroduction to LaTeX
Introduction to LaTeX
 
Latex intro
Latex introLatex intro
Latex intro
 
Training basic latex
Training basic latexTraining basic latex
Training basic latex
 
FileMan Training Part 2
FileMan Training Part 2FileMan Training Part 2
FileMan Training Part 2
 
ABAP Material 05
ABAP Material 05ABAP Material 05
ABAP Material 05
 
SAS - overview of SAS
SAS - overview of SASSAS - overview of SAS
SAS - overview of SAS
 
File organization and indexing
File organization and indexingFile organization and indexing
File organization and indexing
 
Sap abap material
Sap abap materialSap abap material
Sap abap material
 
Sas short course_presentation_11-4-09
Sas short course_presentation_11-4-09Sas short course_presentation_11-4-09
Sas short course_presentation_11-4-09
 
FileMan Training Part 3
FileMan Training Part 3FileMan Training Part 3
FileMan Training Part 3
 
LaTeX Basics
LaTeX BasicsLaTeX Basics
LaTeX Basics
 
FileMan Training Part 1
FileMan Training Part 1FileMan Training Part 1
FileMan Training Part 1
 
Sas
Sas Sas
Sas
 
LaTeX for beginners
LaTeX for beginnersLaTeX for beginners
LaTeX for beginners
 
Cross-File Navigation & Enhanced Interactivity with TimeSavers + Navigation A...
Cross-File Navigation & Enhanced Interactivity with TimeSavers + Navigation A...Cross-File Navigation & Enhanced Interactivity with TimeSavers + Navigation A...
Cross-File Navigation & Enhanced Interactivity with TimeSavers + Navigation A...
 
Lecture03 abap on line
Lecture03 abap on lineLecture03 abap on line
Lecture03 abap on line
 
File Organization
File OrganizationFile Organization
File Organization
 

Similar to External Tables - not just loading a csv file

Changing platforms of Oracle database
Changing platforms of Oracle databaseChanging platforms of Oracle database
Changing platforms of Oracle databasePawanbir Singh
 
Accessing external hadoop data sources using pivotal e xtension framework (px...
Accessing external hadoop data sources using pivotal e xtension framework (px...Accessing external hadoop data sources using pivotal e xtension framework (px...
Accessing external hadoop data sources using pivotal e xtension framework (px...Sameer Tiwari
 
data loading and unloading in IBM Netezza by www.etraining.guru
data loading and unloading in IBM Netezza by www.etraining.gurudata loading and unloading in IBM Netezza by www.etraining.guru
data loading and unloading in IBM Netezza by www.etraining.guruRavikumar Nandigam
 
Working with the IFS on System i
Working with the IFS on System iWorking with the IFS on System i
Working with the IFS on System iChuck Walker
 
Pivotal greenplum external tables
Pivotal greenplum external tablesPivotal greenplum external tables
Pivotal greenplum external tablesRajesh Goyal
 
IBM Db2 11.5 External Tables
IBM Db2 11.5 External TablesIBM Db2 11.5 External Tables
IBM Db2 11.5 External TablesPhil Downey
 
Be A Hero: Transforming GoPro Analytics Data Pipeline
Be A Hero: Transforming GoPro Analytics Data PipelineBe A Hero: Transforming GoPro Analytics Data Pipeline
Be A Hero: Transforming GoPro Analytics Data PipelineChester Chen
 
Take your database source code and data under control
Take your database source code and data under controlTake your database source code and data under control
Take your database source code and data under controlMarcin Przepiórowski
 
Apache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversApache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversScyllaDB
 
Transformation Processing Smackdown; Spark vs Hive vs Pig
Transformation Processing Smackdown; Spark vs Hive vs PigTransformation Processing Smackdown; Spark vs Hive vs Pig
Transformation Processing Smackdown; Spark vs Hive vs PigLester Martin
 
SQL/MED: Doping for PostgreSQL
SQL/MED: Doping for PostgreSQLSQL/MED: Doping for PostgreSQL
SQL/MED: Doping for PostgreSQLPeter Eisentraut
 
An overview of snowflake
An overview of snowflakeAn overview of snowflake
An overview of snowflakeSivakumar Ramar
 
SQLDAY 2023 Chodkowski Adrian Databricks Performance Tuning
SQLDAY 2023 Chodkowski Adrian Databricks Performance TuningSQLDAY 2023 Chodkowski Adrian Databricks Performance Tuning
SQLDAY 2023 Chodkowski Adrian Databricks Performance TuningSeeQuality.net
 
Federated Queries Across Both Different Storage Mediums and Different Data En...
Federated Queries Across Both Different Storage Mediums and Different Data En...Federated Queries Across Both Different Storage Mediums and Different Data En...
Federated Queries Across Both Different Storage Mediums and Different Data En...VMware Tanzu
 

Similar to External Tables - not just loading a csv file (20)

Changing platforms of Oracle database
Changing platforms of Oracle databaseChanging platforms of Oracle database
Changing platforms of Oracle database
 
Accessing external hadoop data sources using pivotal e xtension framework (px...
Accessing external hadoop data sources using pivotal e xtension framework (px...Accessing external hadoop data sources using pivotal e xtension framework (px...
Accessing external hadoop data sources using pivotal e xtension framework (px...
 
IDUG 2015 NA Data Movement Utilities final
IDUG 2015 NA Data Movement Utilities finalIDUG 2015 NA Data Movement Utilities final
IDUG 2015 NA Data Movement Utilities final
 
Cdi implementation
Cdi implementationCdi implementation
Cdi implementation
 
Less17 Util
Less17  UtilLess17  Util
Less17 Util
 
data loading and unloading in IBM Netezza by www.etraining.guru
data loading and unloading in IBM Netezza by www.etraining.gurudata loading and unloading in IBM Netezza by www.etraining.guru
data loading and unloading in IBM Netezza by www.etraining.guru
 
Working with the IFS on System i
Working with the IFS on System iWorking with the IFS on System i
Working with the IFS on System i
 
Pivotal greenplum external tables
Pivotal greenplum external tablesPivotal greenplum external tables
Pivotal greenplum external tables
 
IBM Db2 11.5 External Tables
IBM Db2 11.5 External TablesIBM Db2 11.5 External Tables
IBM Db2 11.5 External Tables
 
Be A Hero: Transforming GoPro Analytics Data Pipeline
Be A Hero: Transforming GoPro Analytics Data PipelineBe A Hero: Transforming GoPro Analytics Data Pipeline
Be A Hero: Transforming GoPro Analytics Data Pipeline
 
Take your database source code and data under control
Take your database source code and data under controlTake your database source code and data under control
Take your database source code and data under control
 
introduction-stata.pptx
introduction-stata.pptxintroduction-stata.pptx
introduction-stata.pptx
 
Apache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversApache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the Covers
 
Transformation Processing Smackdown; Spark vs Hive vs Pig
Transformation Processing Smackdown; Spark vs Hive vs PigTransformation Processing Smackdown; Spark vs Hive vs Pig
Transformation Processing Smackdown; Spark vs Hive vs Pig
 
SQL/MED: Doping for PostgreSQL
SQL/MED: Doping for PostgreSQLSQL/MED: Doping for PostgreSQL
SQL/MED: Doping for PostgreSQL
 
Sas classes in mumbai
Sas classes in mumbaiSas classes in mumbai
Sas classes in mumbai
 
An overview of snowflake
An overview of snowflakeAn overview of snowflake
An overview of snowflake
 
SQLDAY 2023 Chodkowski Adrian Databricks Performance Tuning
SQLDAY 2023 Chodkowski Adrian Databricks Performance TuningSQLDAY 2023 Chodkowski Adrian Databricks Performance Tuning
SQLDAY 2023 Chodkowski Adrian Databricks Performance Tuning
 
Federated Queries Across Both Different Storage Mediums and Different Data En...
Federated Queries Across Both Different Storage Mediums and Different Data En...Federated Queries Across Both Different Storage Mediums and Different Data En...
Federated Queries Across Both Different Storage Mediums and Different Data En...
 
Hadoop
HadoopHadoop
Hadoop
 

More from Kim Berg Hansen

When 7 bit-ascii ain't enough - about NLS, collation, charsets, unicode and s...
When 7 bit-ascii ain't enough - about NLS, collation, charsets, unicode and s...When 7 bit-ascii ain't enough - about NLS, collation, charsets, unicode and s...
When 7 bit-ascii ain't enough - about NLS, collation, charsets, unicode and s...Kim Berg Hansen
 
When 7-bit ASCII ain't enough - about NLS, Collation, Charsets, Unicode and such
When 7-bit ASCII ain't enough - about NLS, Collation, Charsets, Unicode and suchWhen 7-bit ASCII ain't enough - about NLS, Collation, Charsets, Unicode and such
When 7-bit ASCII ain't enough - about NLS, Collation, Charsets, Unicode and suchKim Berg Hansen
 
Analytic Views in Oracle 12.2
Analytic Views in Oracle 12.2Analytic Views in Oracle 12.2
Analytic Views in Oracle 12.2Kim Berg Hansen
 
Uses of row pattern matching
Uses of row pattern matchingUses of row pattern matching
Uses of row pattern matchingKim Berg Hansen
 
Read, store and create xml and json
Read, store and create xml and jsonRead, store and create xml and json
Read, store and create xml and jsonKim Berg Hansen
 
Oracle database - Get external data via HTTP, FTP and Web Services
Oracle database - Get external data via HTTP, FTP and Web ServicesOracle database - Get external data via HTTP, FTP and Web Services
Oracle database - Get external data via HTTP, FTP and Web ServicesKim Berg Hansen
 
Oracle database - Analytic functions - Advanced cases
Oracle database - Analytic functions - Advanced casesOracle database - Analytic functions - Advanced cases
Oracle database - Analytic functions - Advanced casesKim Berg Hansen
 
Real cases of indispensability of Oracle SQL analytic functions
Real cases of indispensability of Oracle SQL analytic functionsReal cases of indispensability of Oracle SQL analytic functions
Real cases of indispensability of Oracle SQL analytic functionsKim Berg Hansen
 
Really using Oracle analytic SQL functions
Really using Oracle analytic SQL functionsReally using Oracle analytic SQL functions
Really using Oracle analytic SQL functionsKim Berg Hansen
 

More from Kim Berg Hansen (10)

When 7 bit-ascii ain't enough - about NLS, collation, charsets, unicode and s...
When 7 bit-ascii ain't enough - about NLS, collation, charsets, unicode and s...When 7 bit-ascii ain't enough - about NLS, collation, charsets, unicode and s...
When 7 bit-ascii ain't enough - about NLS, collation, charsets, unicode and s...
 
When 7-bit ASCII ain't enough - about NLS, Collation, Charsets, Unicode and such
When 7-bit ASCII ain't enough - about NLS, Collation, Charsets, Unicode and suchWhen 7-bit ASCII ain't enough - about NLS, Collation, Charsets, Unicode and such
When 7-bit ASCII ain't enough - about NLS, Collation, Charsets, Unicode and such
 
Analytic Views in Oracle 12.2
Analytic Views in Oracle 12.2Analytic Views in Oracle 12.2
Analytic Views in Oracle 12.2
 
Uses of row pattern matching
Uses of row pattern matchingUses of row pattern matching
Uses of row pattern matching
 
Read, store and create xml and json
Read, store and create xml and jsonRead, store and create xml and json
Read, store and create xml and json
 
Data twisting
Data twistingData twisting
Data twisting
 
Oracle database - Get external data via HTTP, FTP and Web Services
Oracle database - Get external data via HTTP, FTP and Web ServicesOracle database - Get external data via HTTP, FTP and Web Services
Oracle database - Get external data via HTTP, FTP and Web Services
 
Oracle database - Analytic functions - Advanced cases
Oracle database - Analytic functions - Advanced casesOracle database - Analytic functions - Advanced cases
Oracle database - Analytic functions - Advanced cases
 
Real cases of indispensability of Oracle SQL analytic functions
Real cases of indispensability of Oracle SQL analytic functionsReal cases of indispensability of Oracle SQL analytic functions
Real cases of indispensability of Oracle SQL analytic functions
 
Really using Oracle analytic SQL functions
Really using Oracle analytic SQL functionsReally using Oracle analytic SQL functions
Really using Oracle analytic SQL functions
 

External Tables - not just loading a csv file

  • 1. BASEL BERN BRUGG DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. GENF HAMBURG KOPENHAGEN LAUSANNE MÜNCHEN STUTTGART WIEN ZÜRICH External Tables - Not *Just* Loading a CSV File Kim Berg Hansen Senior Consultant
  • 2. About me
      • Danish geek
      • SQL & PL/SQL developer since 2000
      • Developer at Trivadis since 2016, http://www.trivadis.dk
      • Oracle Certified Expert in SQL
      • Oracle ACE Director
      • Blogger at http://www.kibeha.dk
      • SQL quizmaster at http://devgym.oracle.com
      • Likes to cook
      • Reads sci-fi
      • Member of Danish Beer Enthusiasts
      External Tables - Not *Just* Loading a CSV File, 9/21/2018
  • 3. Oracle ACE Program: 500+ Technical Experts Helping Peers Globally
      3 Membership Tiers
        • Oracle ACE Director
        • Oracle ACE
        • Oracle ACE Associate
      bit.ly/OracleACEProgram
      Connect: @oracleace, Facebook.com/oracleaces, oracle-ace_ww@oracle.com
      Nominate yourself or someone you know: acenomination.oracle.com
  • 4. About Trivadis
      Trivadis is a market leader in IT consulting, system integration, solution engineering and the provision of IT services focusing on Oracle and Microsoft technologies in Switzerland, Germany, Austria and Denmark.
      We offer our services in the following strategic business fields: Business Intelligence, Application Development, Infrastructure Engineering and Training.
      OPERATION: Trivadis Services takes over the interacting operation of your IT systems.
  • 5. With over 600 specialists and IT experts in your region
      Branches: COPENHAGEN, MUNICH, LAUSANNE, BERN, ZURICH, BRUGG, GENEVA, HAMBURG, DÜSSELDORF, FRANKFURT, STUTTGART, FREIBURG, BASEL, VIENNA
      14 Trivadis branches and more than 600 employees
      260 Service Level Agreements
      Over 4,000 training participants
      Research and development budget: EUR 5.0 million
      Financially self-supporting and sustainably profitable
      Experience from more than 1,900 projects per year at over 800 customers
  • 6. External Tables - Not *Just* Loading a CSV File
      1. Access Drivers, Parameters, Locations
      2. Definition versus Runtime
      3. Error Handling, Logging Files
      4. Flat Files input
      5. Preprocessor
      6. Multiple Files, Parallelism, Partition Pruning
      7. Trusted Relied Constraints
      8. SQL*Loader as Generator
      9. External Table with Datapump Dump Files
      10. HDFS / HIVE
  • 7. Access Drivers, Parameters, Locations
  • 9. External Tables
      A way to treat a file outside of the database as a rowsource
      Enables SELECT from the file with all the power of SQL
        – Without necessarily loading the data into a table in the database
      Different file types are supported with different Access Drivers
          select t1.col1, t2.col2
          from db_tab t1
          join ext_tab t2 on t2.fk = t1.pk
          where t1.grp = 'FOO';
  • 10. Creation
      Definition created in data dictionary* like a normal table (only the data is outside the DB)
        (* in 18c not necessarily - more on that later)
      Specify type (access driver), directory and location (file)
      Specify access parameters depending on access driver
          create table ext_tab (fk number, col2 varchar2(10))
          organization external (
             type oracle_loader
             access parameters (
                records delimited by newline
                fields terminated by ";" optionally enclosed by '"'
                ( fk integer external(6), col2 char(10) )
             )
             location (ext_dir:'file.txt')
          );
  • 11. Access Driver
      Keyword TYPE specifies which access driver to use
      ORACLE_LOADER
        – Flat files - alternative to SQL*Loader
      ORACLE_DATAPUMP
        – Data Pump dump files - can also write files (once - at creation time)
      ORACLE_HDFS (12.2) Oracle Big Data SQL
        – Read datafiles from HDFS (by creating a HIVE table)
      ORACLE_HIVE (12.2) Oracle Big Data SQL
        – Read datafiles from HDFS by querying a HIVE catalog
  • 12. Access Parameters
      Specific for each Access Driver type
      Tells the DB the metadata of the file and how to get the values of each column
      18c doc states opaque_format_spec in quotes is used for INLINE EXTERNAL and EXTERNAL_MODIFY, while without quotes is used for CREATE TABLE
        – This appears to be a doc bug - without quotes seems always to work
      Or a subquery can return the access parameters
  • 13. Location
      Keyword LOCATION contains one or more filenames
      For ORACLE_LOADER and ORACLE_DATAPUMP: files in filesystem
        – DIRECTORY object must be created and privileges granted
        – DIRECTORY object specified for file: DIRNAME:'file.txt'
        – Or DEFAULT DIRECTORY specifies the directory for files where the directory is omitted
        – (12.1) Location supports wildcards * and ?
      For ORACLE_HDFS: location specifies hdfs:/... style URI
      For ORACLE_HIVE: location unused - access parameters specify cluster/table
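As a rough illustration of the DIRECTORY setup described on slide 13, the sketch below creates a directory object, grants privileges on it, and lets DEFAULT DIRECTORY supply the directory for a file listed without one. The OS path `/data/ext`, the user `app_user`, the second directory object `arch_dir` and the file names are assumptions made up for this example; they are not from the slides.

```sql
-- Assumed OS path and user name; adjust to your environment.
-- WRITE is granted because log and bad files land in the same directory here.
create or replace directory ext_dir as '/data/ext';
grant read, write on directory ext_dir to app_user;

create table ext_tab (fk number, col2 varchar2(10))
organization external (
   type oracle_loader
   default directory ext_dir
   access parameters (
      records delimited by newline
      fields terminated by ';' optionally enclosed by '"'
      ( fk integer external(6), col2 char(10) )
   )
   -- file1.txt is read from the default directory (ext_dir),
   -- file2.txt from the explicitly named (hypothetical) directory arch_dir.
   location ('file1.txt', arch_dir:'file2.txt')
);
```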
  • 14. Definition versus Runtime
  • 16. Definition in Data Dictionary
      Define with CREATE TABLE
      Change with ALTER TABLE
        – Often useful to change LOCATION
        – Some restrictions on what can be altered - see the manual of each version
      Change the projection with ALTER TABLE
        – PROJECT COLUMN ALL / PROJECT COLUMN REFERENCED
          - The latter may cause inconsistencies if there are errors in un-referenced columns
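The ALTER TABLE changes mentioned on slide 16 might look like the following minimal sketch. It reuses the `ext_tab` table and `ext_dir` directory from the earlier slides; the file name `file_2018_09.txt` is an assumption for illustration.

```sql
-- Point the existing definition at a different file.
alter table ext_tab location (ext_dir:'file_2018_09.txt');

-- Process all columns for every row, so rejected rows are the same
-- no matter which columns a query selects.
alter table ext_tab project column all;

-- Process only the columns a query references; this can be faster,
-- but which rows are rejected may then depend on the select list.
alter table ext_tab project column referenced;
```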
  • 17. Overrides at Runtime (12.2)
      SELECT ... FROM EXT_TAB EXTERNAL MODIFY (...)
        – Modify default directory and/or location
          - Allows each session/query to read its own (identically structured) file(s)
        – Modify reject limit
        – Modify badfile / logfile / discardfile
      Careful with your security
        – A user with SELECT privilege on the external table can potentially read all files in the DIRECTORY objects on which the user has READ privilege
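A query-level override as described on slide 17 could be sketched like this, assuming the `ext_tab` definition from slide 10. The file name `session_42.txt`, the reject limit of 10 and the WHERE condition are made-up values for the example.

```sql
-- Read a different (identically structured) file for this query only;
-- the definition stored in the data dictionary is left unchanged.
select fk, col2
from   ext_tab external modify (
          default directory ext_dir
          location ('session_42.txt')
          reject limit 10
       )
where  fk > 1000;
```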
  • 18. Everything at Runtime (18.1)
      Inline definition of External Table
      Nothing in data dictionary (hence also less information for the optimizer)
          select fk, col2
          from external (
             (fk number, col2 varchar2(10))
             type oracle_loader
             access parameters (
                records delimited by newline
                fields terminated by ";" optionally enclosed by '"'
                ( fk integer external(6), col2 char(10) )
             )
             location (ext_dir:'file.txt')
          );
  • 19. Error Handling, Logging Files
  • 21. Errors in the Data
      Errors in the data may or may not return an error
        – REJECT LIMIT 0 (default) = first occurrence of bad data throws an error
        – REJECT LIMIT {int} = an error is thrown once more than {int} rows have been rejected
        – REJECT LIMIT UNLIMITED = no errors thrown
      Bad rows of data are copied to the BADFILE
      Note: If you have ALTER TABLE ... PROJECT COLUMN REFERENCED
        – When the column with bad data is in the SELECT list => row goes to BADFILE
        – When the column with bad data is not in the SELECT list => row is selected
  • 22. Logging Files
      Three parameter pairs
        – NOLOGFILE / LOGFILE dir_obj:'ext.log'
        – NOBADFILE / BADFILE dir_obj:'ext.bad'
        – NODISCARDFILE / DISCARDFILE dir_obj:'ext.dcs'
      Can use symbol substitution for uniqueness
          - %p = Process id of the user process doing the SELECT
          - %a = Agent number of the slave process in parallel access
      Each of them defaults to {table_name}_%p.{ext}
      BADFILE contains the rows that could not be loaded
      DISCARDFILE contains the rows that were skipped by the LOAD WHEN clause
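Putting slides 21 and 22 together, a definition with an explicit reject limit and explicitly named logging files might look like the sketch below. The separate directory object `ext_log_dir`, the table name `ext_tab_logged` and the file names are assumptions for illustration; the `%p` substitution is the one described on the slide.

```sql
create table ext_tab_logged (fk number, col2 varchar2(10))
organization external (
   type oracle_loader
   default directory ext_dir
   access parameters (
      records delimited by newline
      badfile ext_log_dir:'ext_tab_%p.bad'
      logfile ext_log_dir:'ext_tab_%p.log'
      nodiscardfile
      fields terminated by ';' optionally enclosed by '"'
      ( fk integer external(6), col2 char(10) )
   )
   location ('file.txt')
)
-- Up to 10 bad rows go to the bad file; one more and the query fails.
reject limit 10;
```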
  • 23. Flat Files input
  • 25. Overall file characteristics
      CHARACTERSET
        – Which character set the file uses (default is the DB character set, not the client's)
      LANGUAGE
        – Which language is used for month names, AM/PM, etc. in the file
      TERRITORY
        – How decimal / thousand separators, week numbers, etc. appear in the file
      DATA IS BIG ENDIAN / DATA IS LITTLE ENDIAN
        – Which endianness was used by the platform where the file originated
  • 26. Records
      FIXED
        – Each record has a fixed length (in bytes)
      VARIABLE
        – Start of each record contains a character count
      DELIMITED BY
        – Each record ends with a given string
      XMLTAG
        – Each record is the content within a given XML tag: <MYTAG>....</MYTAG>
  • 27. Fields
      The field list for the file does not necessarily match the field list for the table directly; it can map differently
      ALL FIELDS OVERRIDE - tells that the field list does match the table columns directly
        – Then only list the fields that need extra info, like a non-default date format
      FIELD NAMES clause tells how to handle a first line that contains field names
        – Can be ignored or can map fields automatically by field name
      TERMINATED BY / [OPTIONALLY] ENCLOSED BY
      FIELDS CSV
        – WITH / WITHOUT EMBEDDED - does the file contain record delimiters within string fields
        – TERMINATED / ENCLOSED - override the defaults , and "
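A minimal sketch of the FIELDS CSV clause from slide 27, assuming a release that supports it (12.2 or later) and a hypothetical file `emp.csv` with one header line. The table, column and file names are made up for the example. For simplicity the header line is skipped with SKIP 1 rather than handled with the FIELD NAMES clause the slide mentions.

```sql
create table ext_emp_csv (
   emp_id number,
   ename  varchar2(20),
   hired  date
)
organization external (
   type oracle_loader
   default directory ext_dir
   access parameters (
      records delimited by newline skip 1
      fields csv without embedded
      ( emp_id char(10),
        ename  char(20),
        hired  char(10) date_format date mask "YYYY-MM-DD" )
   )
   location ('emp.csv')
)
reject limit unlimited;
```

FIELDS CSV defaults to a comma as terminator and a double quote as optional enclosure, so neither needs to be spelled out here.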
  • 28. Specifying Field Positions (when not delimited)
      Start position
        – A number is the position directly
        – * means the start is the char after the end of the previous field
        – *+{offset} or *-{offset} means plus or minus offset chars after the end of the previous field
      End can be specified as a position (number) or as a length (+number)
      STRING SIZES ARE IN
        – Parameter says whether positions are measured in bytes or chars (for multibyte charsets)
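The position notation from slide 28 could be used as in the following sketch. The table `ext_fixed`, its columns and the file `fixed.txt` are hypothetical: the first field occupies characters 1 to 6, the second starts right after it and is 10 characters long, and the third starts 2 characters further on and is 4 characters long.

```sql
create table ext_fixed (fk number, col2 varchar2(10), col3 varchar2(4))
organization external (
   type oracle_loader
   default directory ext_dir
   access parameters (
      records delimited by newline
      string sizes are in characters
      fields (
         fk   position(1:6)    char(6),
         col2 position(*:+10)  char(10),
         col3 position(*+2:+4) char(4)
      )
   )
   location ('fixed.txt')
);
```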
  • 29. Datatypes
      INTEGER, DECIMAL, FLOAT, DOUBLE
        – Specifying EXTERNAL means the numbers are represented as strings in the file
        – Without EXTERNAL they are binary, in the format a C program would write
          - Access parameter DATA IS BIG / LITTLE ENDIAN is used here
      RAW, VARRAW, VARRAWC
        – Binary data, fixed length or variable with first bytes indicating length
      ORACLE_DATE, ORACLE_NUMBER
        – Binary representations of the Oracle DATE or NUMBER datatype
  • 30. Datatypes (continued)
      CHAR, VARCHAR, VARCHARC
        – Character data, fixed length or variable with first bytes indicating length
        – VARCHAR length indicator is in bytes, VARCHARC length indicator is in characters
        – CHAR is also used for DATE, TIMESTAMP, INTERVAL:
        – DATE_FORMAT {type} MASK "{format mask}"
  • 31. COLUMN TRANSFORMS
      {column_name} FROM {transformation}
        – NULL - sets column in all rows to NULL
        – CONSTANT - sets column in all rows to the specified literal
        – CONCAT - sets column to a concatenation of field(s) and/or literal(s)
        – STARTOF - sets column to a substring from the start of a field
        – LOBFILE - sets column to a LOB loaded from another file; directory object / filename can be a field or a literal
  • 32. Preprocessor
  • 34. Preprocessor
      PREPROCESSOR [{directory}:]{script_or_exe_file}
      Must have EXECUTE privilege on the directory object
      Can be a different directory than the datafile - this is recommended for security
      Preprocessor script/exe will be called with the filename from LOCATION as parameter
      Standard output from the script/exe becomes the input for the EXTERNAL TABLE
      Cannot specify arguments directly
        – If the executable requires arguments, it must be wrapped in a script
      Windows script (batch file) must have suffix .bat or .cmd
      Windows batch file must start with @echo off
  • 35. Uses
      Uncompress (gunzip / zcat)
        – Process a compressed file and stream the uncompressed data as external table input
      Directory listing
        – Preprocessor script does ls / dir
      Changing file content
        – Do transformations with sed before the data is used for external table input
      curl calls
        – Get http resources and feed them to external table input
      Your imagination is the limit
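The uncompress use case from slides 34 and 35 might be set up roughly as follows. The directory object `exec_dir`, its path `/opt/ext_exec` (assumed to hold `zcat` or a wrapper script around it), the user `app_user`, the table name and the file name are all assumptions for this sketch.

```sql
-- Keep executables in their own directory object, separate from the data.
create or replace directory exec_dir as '/opt/ext_exec';
grant execute on directory exec_dir to app_user;

create table ext_tab_gz (fk number, col2 varchar2(10))
organization external (
   type oracle_loader
   default directory ext_dir
   access parameters (
      records delimited by newline
      preprocessor exec_dir:'zcat'
      fields terminated by ';' optionally enclosed by '"'
      ( fk integer external(6), col2 char(10) )
   )
   -- zcat is called with this file as its argument and its standard
   -- output is parsed as if it were the data file.
   location ('file.txt.gz')
);
```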
  • 36. Multiple Files, Parallelism, Partition Pruning
  • 38. Multiple Files
      LOCATION can contain multiple files, with or without directory specification
        – If without, the directory specified in DEFAULT DIRECTORY is used
      Selecting from the external table reads all the files (except with partition pruning)
      If field names are in the first row, they can be in either just the first file or all files
        – Specify which with FIELD NAMES FIRST / ALL
  • 39. Parallelism
      Multiple files
        – Each file specified in LOCATION is handled by one slave process
          - Setting the parallel degree larger than the number of files does not help
        – This also means that the PREPROCESSOR is called for each file by its slave process
      Large files
        – ORACLE_LOADER parallel select can attempt to assign file chunks to slaves
        – Cannot always be done, for example not with:
          - Named pipes as input
          - Multibyte character sets (unless fixed byte length records)
          - Variable length records with length indicator bytes
  • 40. Partition Pruning (12.2)
      Can be partitioned with RANGE, INTERVAL, LIST or composites of them
      Each partition has one or more files in its LOCATION clause
      When the optimizer does partition pruning, for an external table that means it only scans the file(s) of that partition
      DB trusts that the files of each partition only contain the specified partition key value(s)
      If key values are wrong in the files:
        – You can get output that does not match the WHERE clause
        – You may have data you cannot query with a WHERE clause
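A list-partitioned external table as described on slide 40 could be sketched like this. The table `ext_sales`, its columns, the partition names and the file names are hypothetical. Note that the database does not check that `sales_dk.txt` really contains only 'DK' rows.

```sql
create table ext_sales (region varchar2(2), amount number)
organization external (
   type oracle_loader
   default directory ext_dir
   access parameters (
      records delimited by newline
      fields terminated by ';'
      ( region char(2), amount char(12) )
   )
)
reject limit unlimited
partition by list (region) (
   partition p_dk values ('DK') location ('sales_dk.txt'),
   partition p_ch values ('CH') location ('sales_ch.txt')
);

-- With pruning, only sales_dk.txt should need to be scanned.
select sum(amount) from ext_sales where region = 'DK';
```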
  • 41. Trusted Relied Constraints
  • 43. Purposes of Constraints
      On regular tables integrity constraints can be enforced
        – Not possible to enforce on external tables - data comes from elsewhere
          - Except that a NOT NULL constraint can be enforced - nulls go to the bad file
        – But you can say "trust me" and use RELY DISABLE on constraints (12.2)
          - Can do that for primary key, foreign key, unique constraints
          - But not check constraints
      With knowledge of the constraints, the optimizer can make assumptions that enable choosing more optimal access plans
        – This also works with the trusted constraints on external tables
          - QUERY_REWRITE_INTEGRITY = trusted or stale_tolerated
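The "trust me" constraints from slide 43 might be declared as in the sketch below, reusing `ext_tab` and `db_tab` from slide 9. The constraint names are made up, and the foreign key assumes that `db_tab.pk` already has a primary key or unique constraint.

```sql
-- Declared but not enforced: the database relies on the files being correct.
alter table ext_tab
   add constraint ext_tab_pk primary key (fk) rely disable;

alter table ext_tab
   add constraint ext_tab_db_tab_fk foreign key (fk)
   references db_tab (pk) rely disable;

-- Let the optimizer use RELY constraints in this session.
alter session set query_rewrite_integrity = trusted;
```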
  • 44. SQL*Loader as Generator
  • 46. SQL*Loader for Creating External Tables
      You have a SQL*Loader control file?
      You want to do the same load (or almost) with an external table?
      Use SQL*Loader parameter EXTERNAL_TABLE=GENERATE_ONLY
      SQL*Loader won't load but will instead write the code to the log file
      This code you can execute or edit as you wish
  • 47. External Table with Datapump Dump Files
  • 49. Write (once) to Dump File
      CTAS for ORACLE_DATAPUMP access driver
      The created external table can be read, but not modified
          create table ext_emp_tab
          organization external (
             type oracle_datapump
             default directory ext_dir
             location ('ext_emp.dmp')
          )
          as select * from emp;
  • 50. Driver Parameters for Write
      COMPRESSION
        – ENABLED BASIC / LOW / MEDIUM / HIGH
          - Requires Advanced Compression option
      ENCRYPTION
        – ENABLED / DISABLED
      VERSION
        – COMPATIBLE / LATEST / version number
  • 51. Parallel Write to Multiple Files
      CTAS for ORACLE_DATAPUMP access driver
      Parallel degree and number of files should match
        – If number of files > parallel, extra files are unused
        – If parallel > number of files, parallel is reduced to the number of files
          create table ext_emp_tab
          organization external (
             type oracle_datapump
             default directory ext_dir
             location ('ext_emp1.dmp', 'ext_emp2.dmp', 'ext_emp3.dmp')
          )
          parallel 3
          as select * from emp;
  • 52. External Table to Read Dump File
      Create an external table on an existing dump file (for example from another DB)
      Dump file can be from a DB with another charset or another endianness
      Reading from multiple files requires that all have been written with identical metadata
        – Ext. table name, column names/types, charset, timezone must be identical
          create table ext_emp_tab (
             emp_id number,
             ename varchar2(20)
          )
          organization external (
             type oracle_datapump
             default directory ext_dir
             location ('ext_emp1.dmp', 'ext_emp2.dmp', 'ext_emp3.dmp')
          );
  • 53. HDFS / HIVE
  • 55. Oracle Big Data SQL
      External HDFS / HIVE tables for Oracle Big Data SQL (licensed product)
        – Hadoop clusters on Oracle Big Data Appliance
        – Database on Exadata
      HIVE metadata exposed to the database
        – ORACLE_HIVE external tables can just specify columns and HIVE cluster/table
        – Can override mappings if desired
      With ORACLE_HDFS you specify HIVE style metadata directly, with no table in the HIVE catalog
  • 56. Advantages
      Big Data SQL Engine
        – SmartScan on Hadoop
        – Fast direct reads
        – Oracle PQ => Hadoop parallelism
      Advantages of Hadoop data directly in SQL
        – Immediate use by anything that uses SELECT
        – Fine-grained access control of Hadoop
        – Data redaction, data masking
  • 57. Questions & Answers
      Kim Berg Hansen, Senior Consultant
      email kim.berghansen@trivadis.com
      twitter @kibeha
      blog http://www.kibeha.dk
      9/21/2018

Editor's Notes

  1. “Our focus as IT consultants and system integrator lies on the business fields of Business Intelligence, Application Development, Infrastructure Engineering and Training. We have a separate division – Trivadis Services – which takes over the operation, maintenance and ongoing development of individual systems such as databases and specific applications, or we can also outsource the responsibility for more complex environments. We provide our services throughout Switzerland, Germany, Austria and Denmark and concentrate on Oracle and Microsoft technologies.”
  2. “We are a non-affiliated and profitable company with over 600 employees. Regional proximity to our customers is one of our key considerations. We achieve this by operating 14 branch operations in Switzerland, Germany, Austria and Denmark. We successfully completed more than 1900 customer projects during the last business year. Additionally, we also support our customers with over 200 Service Level Agreements. The basis for this sustained technological excellence is reflected in our research and development budget. Every year we invest around 5 million Swiss francs in analyzing and evaluating new technologies and developing our methods and products.”