(305) 4-2
Advanced DataStage
Module Objectives
After this module, you will be able to:
Use DataStage to access
Line-terminated sequential data
Non-line-terminated sequential data
Column-delimited data
Fixed-width columns
Fixed-format sequential data
Variable-format sequential data
Use COMMON variables to store values and accumulations between reads
Merge sequential data using the Merge plug-in
Sequential Record Types
Line (record) terminators
Column delimited
For example: comma-delimited, tab-delimited
Fixed-width columns
Lengths specified by the metadata
No line terminators
Fixed-width columns
Lengths specified by the metadata
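The record types above can be illustrated with a minimal Python sketch. It is an analogy rather than DataStage code: the function names, the sample data, and the column widths in `WIDTHS` are all assumptions made for the example.

```python
import io

def parse_delimited(line, delimiter=","):
    """Split one line-terminated record into columns on a delimiter."""
    return line.rstrip("\r\n").split(delimiter)

def parse_fixed_width(record, widths):
    """Slice one record into columns using widths taken from the metadata."""
    columns, start = [], 0
    for width in widths:
        columns.append(record[start:start + width])
        start += width
    return columns

def read_unterminated(stream, widths):
    """Yield fixed-width records from a stream that has no line terminators."""
    record_length = sum(widths)
    while True:
        block = stream.read(record_length)
        if len(block) < record_length:
            break  # end of data; a short trailing block is ignored in this sketch
        yield parse_fixed_width(block, widths)

WIDTHS = [3, 5, 2]  # hypothetical column lengths (id, name, state)

delimited_row = parse_delimited("101,Smith,NY\n")
fixed_row = parse_fixed_width("101SmithNY", WIDTHS)
unterminated_rows = list(
    read_unterminated(io.StringIO("101SmithNY102JonesCA"), WIDTHS)
)
```

Without line terminators, the total record length (the sum of the column widths) is the only thing that marks where one record ends and the next begins.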
Variable-Record Formats
Sequential files may contain records with varying formats
First field indicates type of format
Records (lines) may or may not be terminated
Fields may or may not be delimited
Require special processing
Can’t specify the different formats (column definitions) in the Sequential stage
Need to determine record type before processing
Line-terminated records can be read in as a single field of data and then parsed
Can’t read in a non-line-terminated record until its type is determined
Accessing Multi-Format Records
Line terminated:
Using Sequential stage
Specify line terminator
Specify column delimiter using a character not in the data
Define a single column to store the whole record
Parse using FIELD function or substring operator
Using DataStage BASIC
OPENSEQ to open file
READSEQ to read a record
Parse using FIELD function or substring operator
CLOSESEQ to close the file
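The line-terminated approach can be sketched in Python as follows. This is an analogy, not DataStage BASIC: the `field` helper only imitates the idea of the FIELD function (return the n-th delimited substring, counting from 1), and the record layouts (a comma-delimited "H" header and a fixed-width "D" detail) are hypothetical.

```python
import io

def field(text, delimiter, occurrence):
    """Return the occurrence-th (1-based) delimited substring, or "" if absent."""
    parts = text.split(delimiter)
    return parts[occurrence - 1] if 1 <= occurrence <= len(parts) else ""

def parse_record(line):
    """Read the whole line as one field, then parse it by record type."""
    record = line.rstrip("\r\n")
    record_type = record[:1]  # the first character identifies the format
    if record_type == "H":
        # Hypothetical header layout: H,<order id>,<customer>
        return {"type": "H",
                "order_id": field(record, ",", 2),
                "customer": field(record, ",", 3)}
    if record_type == "D":
        # Hypothetical detail layout: type(1) item(4) quantity(3), fixed width
        return {"type": "D",
                "item": record[1:5],
                "quantity": int(record[5:8])}
    raise ValueError(f"unknown record type: {record_type!r}")

SAMPLE = "H,1001,Acme\nDA100005\nDB200012\n"  # made-up sample data

# Iterating the stream line by line stands in for repeated sequential reads.
parsed = [parse_record(line) for line in io.StringIO(SAMPLE)]
```

Each line is read whole and only then split, which is why a single catch-all column works for this kind of file.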
Accessing Multi-Format Records
No line terminators:
Don’t use Sequential stage
Using DataStage BASIC
OPENSEQ to open file
READBLK to read first byte of a new record to determine its type
READBLK the rest of the record based on its type
Parse using substring operator based on record type
CLOSESEQ to close the file
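The two-step block read described above can be sketched in Python. Here `stream.read(n)` merely stands in for a block read of n bytes; it is not the BASIC statement itself. The record types and the lengths in `RECORD_LENGTHS` are assumptions made for the example.

```python
import io

# Hypothetical layouts: number of bytes that follow the one-byte type code.
RECORD_LENGTHS = {b"H": 8, b"D": 6}

def read_records(stream):
    """Read type-prefixed records from a byte stream with no line terminators."""
    records = []
    while True:
        type_byte = stream.read(1)  # step 1: read one byte to learn the type
        if not type_byte:
            break  # end of file
        if type_byte not in RECORD_LENGTHS:
            raise ValueError(f"unknown record type: {type_byte!r}")
        length = RECORD_LENGTHS[type_byte]
        body = stream.read(length)  # step 2: read the rest, sized by the type
        if len(body) < length:
            raise ValueError("truncated record")
        text = body.decode("ascii")
        if type_byte == b"H":
            # Header: order id (4 bytes), customer (4 bytes)
            records.append(("H", text[0:4], text[4:8]))
        else:
            # Detail: item (4 bytes), quantity (2 bytes)
            records.append(("D", text[0:4], int(text[4:6])))
    return records

DATA = b"H1001AcmeDA10005DB20012"  # made-up sample: one header, two details
records = read_records(io.BytesIO(DATA))
```

The type byte has to be read first because, with no terminator, nothing else says how many bytes belong to the current record.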
Using Stage Variables
Use to store and accumulate values between reads
Persistent within a transformer
Define in transformer
Click icon to view
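The idea of a value that persists between reads can be illustrated with a small Python analogy. The class, the attribute name, and the row layout are hypothetical; in a real job the stage variable is defined in the transformer itself.

```python
class RunningTotalTransformer:
    """Keeps state across rows, the way a stage variable persists between reads."""

    def __init__(self):
        self.running_total = 0  # plays the role of the stage variable

    def process(self, row):
        # Update the persistent value first, then emit it with the row.
        self.running_total += row["amount"]
        return {**row, "running_total": self.running_total}

rows = [  # made-up input rows
    {"id": 1, "amount": 10},
    {"id": 2, "amount": 25},
    {"id": 3, "amount": 5},
]

transformer = RunningTotalTransformer()
output = [transformer.process(row) for row in rows]
totals = [row["running_total"] for row in output]
```

The state lives in the transformer object, not in the rows, so each row sees the accumulation of every row processed before it.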
Exercise Part I: Using BASIC
Read sequential records with a BASIC program
Read from a line-terminated, column-delimited file
Read from a line-terminated file with fixed-length records
Read from a non-line-terminated file with fixed-length records
Read from a line-terminated file with varying-format records
Use stage variables to accumulate a running total
Merging Sequential Data
Install the Merge plug-in
Add Merge plug-in stage to job
Define merge
Names and locations of input files
Temporary directory
Type of join
Input columns for each input file
Columns (key) to merge by
Output columns
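What a merge definition does can be sketched in Python: join two sequential inputs on a key column under a chosen join type. This is a conceptual illustration only, not the plug-in's implementation. The function name, the two join-type labels, and the sample files are assumptions.

```python
import csv
import io

def merge_rows(left_rows, right_rows, key, join_type="inner"):
    """Join two lists of dict rows on `key` ("inner" or "left_outer")."""
    if join_type not in ("inner", "left_outer"):
        raise ValueError(f"unsupported join type: {join_type}")
    # Index the right-hand input by key so each left row is matched quickly.
    right_index = {}
    for row in right_rows:
        right_index.setdefault(row[key], []).append(row)
    right_columns = [c for c in (right_rows[0] if right_rows else {}) if c != key]
    merged = []
    for left in left_rows:
        matches = right_index.get(left[key], [])
        for right in matches:
            merged.append({**left, **{c: right[c] for c in right_columns}})
        if not matches and join_type == "left_outer":
            # Keep the unmatched left row, with empty right-hand columns.
            merged.append({**left, **{c: "" for c in right_columns}})
    return merged

CUSTOMERS = "cust_id,name\n1,Acme\n2,Globex\n"  # made-up first input file
ORDERS = "cust_id,total\n1,250\n1,75\n"          # made-up second input file

customers = list(csv.DictReader(io.StringIO(CUSTOMERS)))
orders = list(csv.DictReader(io.StringIO(ORDERS)))

inner = merge_rows(customers, orders, "cust_id", "inner")
left_outer = merge_rows(customers, orders, "cust_id", "left_outer")
```

The join type determines only what happens to unmatched rows: the inner join drops customer 2, while the left outer join keeps it with an empty `total`.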
Exercise Part II: Merge Plug-In
If necessary, install the Merge plug-in server and client components
Merge sequential data using the Merge plug-in