(305) 4-2
Advanced DataStage
Module Objectives
Build controlling jobs to manage job execution flow
Load a Star Schema
Assign surrogate keys
Load dimension tables
Load fact table
Add parameters to jobs to introduce flexibility
Programmatically set job parameters
Check for duplicates
Load Changed Data Capture records
(305) 4-3
Controlling Jobs
Execute and control networks of jobs
Can contain stages, but usually don’t
Job control code is executed before the stages
Can be nested
(305) 4-6
Loading the Star
Load dimension tables
Load hashed file lookup tables
Used as lookups for loading fact table
Faster than lookups directly to dimension tables
Load fact table
Automate the load using a controlling job
Load all dimension tables in parallel
Load all lookups in parallel
Load fact table only if all dimension tables are successfully loaded
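The load sequence above could be sketched in job control code roughly as follows. This is a minimal sketch under stated assumptions, not the course solution: the job names (LoadCustomersD, LoadProductsD, LoadFactTable) are hypothetical, only two dimension jobs are shown, and the hashed file lookup loads are left out for brevity.

```basic
* Attach the dimension load jobs (hypothetical job names)
hCust = DSAttachJob("LoadCustomersD", DSJ.ERRFATAL)
hProd = DSAttachJob("LoadProductsD", DSJ.ERRFATAL)

* DSRunJob starts a job and returns, so both jobs run in parallel
ErrCode = DSRunJob(hCust, DSJ.RUNNORMAL)
ErrCode = DSRunJob(hProd, DSJ.RUNNORMAL)

* Block until each job has finished
ErrCode = DSWaitForJob(hCust)
ErrCode = DSWaitForJob(hProd)

CustStatus = DSGetJobInfo(hCust, DSJ.JOBSTATUS)
ProdStatus = DSGetJobInfo(hProd, DSJ.JOBSTATUS)

* Load the fact table only if every dimension load finished OK
If CustStatus = DSJS.RUNOK And ProdStatus = DSJS.RUNOK Then
   hFact = DSAttachJob("LoadFactTable", DSJ.ERRFATAL)
   ErrCode = DSRunJob(hFact, DSJ.RUNNORMAL)
   ErrCode = DSWaitForJob(hFact)
End Else
   Call DSLogWarn("Dimension load failed; fact table not loaded", "JobControl")
End
```

Whether a job that finished with warnings (DSJS.RUNWARN) should also count as successful is a design choice; this sketch treats only DSJS.RUNOK as success.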
(305) 4-11
Job Control Functions
DSAttachJob    Establish job handle
DSRunJob       Run the job
DSWaitForJob   Wait for the job to finish
DSGetJobInfo   Get job information
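The four functions are typically called in the order listed. A minimal sketch for a single controlled job, assuming a hypothetical job named LoadTimeD:

```basic
* "LoadTimeD" is a hypothetical job name
hJob = DSAttachJob("LoadTimeD", DSJ.ERRFATAL)
ErrCode = DSRunJob(hJob, DSJ.RUNNORMAL)
ErrCode = DSWaitForJob(hJob)

* Ask for the finishing status of the job
Status = DSGetJobInfo(hJob, DSJ.JOBSTATUS)
If Status = DSJS.RUNFAILED Then
   Call DSLogWarn("LoadTimeD failed", "JobControl")
End

* Release the handle when it is no longer needed
ErrCode = DSDetachJob(hJob)
```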
(305) 4-13
Job Parameters
Used to introduce flexibility
Entered in Job Properties window
Can be used by the job to specify file names and locations
Can be used in expressions for column derivations
Can be used in controlling jobs to pass values to controlled jobs
Must be resolved at execution time
(305) 4-14
Syntax
Name can be any alphanumeric string not starting with #
Hash marks (#) delimit parameters whose values are passed to the operating system (e.g., file name, directory name)
Examples
#Filename# in a Sequential stage
StartDate in a constraint, derivation, or job control code
(305) 4-15
Resolving Parameter Values
Parameter entry window is displayed when the job runs in standalone mode
Default values can be assigned
Can be set by controlling job using DSSetParam
Set statically (hard-coded)
Set dynamically
Read from file
Read from a variable or other job parameter
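The static and dynamic cases could look roughly like this in a controlling job. This is a sketch only: the job name LoadFactTable, the parameter names SourceDir and StartDate, and the directory path are hypothetical, and StartDate is assumed to be a parameter of the controlling job itself.

```basic
* "LoadFactTable", "SourceDir" and "StartDate" are hypothetical names
hJob = DSAttachJob("LoadFactTable", DSJ.ERRFATAL)

* Static: the value is hard-coded in the job control code
ErrCode = DSSetParam(hJob, "SourceDir", "/data/incoming")

* Dynamic: pass along the controlling job's own StartDate parameter
ErrCode = DSSetParam(hJob, "StartDate", StartDate)

* Parameters must be set before the job is run
ErrCode = DSRunJob(hJob, DSJ.RUNNORMAL)
```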
(305) 4-22
Exercise Part III: Controlling Program Flow
Test for bad data
Test job status codes
Get job link information
Set user status codes
Test user status codes
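The checks listed above might be combined along these lines. This is a hedged sketch, not the exercise solution: the job, stage and link names and the "BADDATA" user status value are all hypothetical.

```basic
* Job, stage and link names below are hypothetical
hJob = DSAttachJob("LoadCustomersD", DSJ.ERRFATAL)
ErrCode = DSRunJob(hJob, DSJ.RUNNORMAL)
ErrCode = DSWaitForJob(hJob)

* Test the job status code
Status = DSGetJobInfo(hJob, DSJ.JOBSTATUS)
If Status <> DSJS.RUNOK And Status <> DSJS.RUNWARN Then
   Call DSLogFatal("LoadCustomersD did not finish cleanly", "JobControl")
End

* Get link information: number of rows sent down the reject link
RejectCount = DSGetLinkInfo(hJob, "xfmCustomers", "Rejects", DSJ.LINKROWCOUNT)

* Test the user status set inside the controlled job
UserStatus = DSGetJobInfo(hJob, DSJ.USERSTATUS)
If RejectCount > 0 Or UserStatus = "BADDATA" Then
   Call DSLogWarn("Bad data found: " : RejectCount : " rejected rows", "JobControl")
End
```

The controlled job would set its own user status, for example from a routine, with Call DSSetUserStatus("BADDATA").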
(305) 4-23
Exercise Part IV: Update the dimension tables
Check for duplicate records
Update records in dimension tables
(305) 4-24
Loading a Parameter File
Example parameter hashed file
Dimension: record key
MaxKey: value of last surrogate key

Dimension     MaxKey
CustomersD    234
ProductsD     1233
TimeD         9878
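One way to populate such a file is directly from BASIC code, although a job with a Hashed File stage would do equally well. The sketch below assumes the hashed file MAXKEY already exists in the project and stores MaxKey in field 1 of each record; the values are the ones from the example table.

```basic
* Assumes the hashed file MAXKEY already exists in the project
OPEN "MAXKEY" TO H.MAXKEY ELSE
   Call DSLogFatal("Cannot open hashed file MAXKEY", "JobControl")
END

* Field 1 of each record holds the last surrogate key used
PARAMREC = ""
PARAMREC<1> = 234
WRITE PARAMREC ON H.MAXKEY, "CustomersD"

PARAMREC<1> = 1233
WRITE PARAMREC ON H.MAXKEY, "ProductsD"

PARAMREC<1> = 9878
WRITE PARAMREC ON H.MAXKEY, "TimeD"

CLOSE H.MAXKEY
```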
(305) 4-25
Reading the Hashed File
OPEN "MAXKEY" TO H.MAXKEY ELSE
   * . . . error handling code
END
READ PARAMREC FROM H.MAXKEY, "CustomersD" ELSE
   * . . . error handling code
END
MaxKey = PARAMREC<1>
"MAXKEY": file name
H.MAXKEY: file handle
"CustomersD": dimension key
PARAMREC: dynamic array
PARAMREC<1>: get MaxKey value from array
(305) 4-26
Setting the Parameter
READ PARAMREC FROM H.MAXKEY, "CustomersD"
. . .
MaxKey = PARAMREC<1>
hJob2 = DSAttachJob("jcLoadCustomersD", DSJ.ERRFATAL)
ErrCode = DSSetParam(hJob2, "MaxKey", MaxKey)
ErrCode = DSRunJob(hJob2, DSJ.RUNNORMAL)
. . .
"jcLoadCustomersD": job name
"MaxKey": parameter name
MaxKey: parameter value
(305) 4-27
Exercise Part V: Changed Data Capture
Load a parameter file with the maximum key
Read maximum key from parameter file
Generate surrogate key values sequentially from maximum key value
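The steps above could be tied together in a controlling job roughly as follows. This is a sketch under several assumptions, none of which come from the exercise itself: the stage name xfmCustomers and link name ToCustomersD are hypothetical, the controlled job is assumed to generate one key per output row (for example with a Transformer derivation such as MaxKey + @OUTROWNUM), and writing the new maximum back to the parameter file is an added step.

```basic
* Read the last surrogate key used for the dimension
OPEN "MAXKEY" TO H.MAXKEY ELSE
   Call DSLogFatal("Cannot open hashed file MAXKEY", "JobControl")
END
READ PARAMREC FROM H.MAXKEY, "CustomersD" ELSE
   Call DSLogFatal("No MaxKey record for CustomersD", "JobControl")
END
MaxKey = PARAMREC<1>

* Pass the key to the load job and run it
hJob2 = DSAttachJob("jcLoadCustomersD", DSJ.ERRFATAL)
ErrCode = DSSetParam(hJob2, "MaxKey", MaxKey)
ErrCode = DSRunJob(hJob2, DSJ.RUNNORMAL)
ErrCode = DSWaitForJob(hJob2)

* Assumption: one new key per row on the output link (hypothetical names)
If DSGetJobInfo(hJob2, DSJ.JOBSTATUS) = DSJS.RUNOK Then
   RowsLoaded = DSGetLinkInfo(hJob2, "xfmCustomers", "ToCustomersD", DSJ.LINKROWCOUNT)
   PARAMREC<1> = MaxKey + RowsLoaded
   WRITE PARAMREC ON H.MAXKEY, "CustomersD"
End
```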