(305) 4-1
Advanced DataStage
Module 6: Job Control
(305) 4-2
Module Objectives
• Build controlling jobs to manage job execution flow
• Load a Star Schema
• Assign surrogate keys
• Load dimension tables
• Load fact table
• Add parameters to jobs to introduce flexibility
• Programmatically set job parameters
• Check for duplicates
• Load Changed Data Capture records
(305) 4-3
Controlling Jobs
• Execute and control networks of jobs
• Can contain stages, but usually don’t
• Job control code is executed before the stages
• Can be nested
(305) 4-4
Job Networks
[Diagram: networks of Job1, Job2, and other jobs, illustrating jobs run in serial and jobs run in parallel]
(305) 4-5
Star Schema Example
[Diagram: star schema with callouts for the fact table, a dimension table, the surrogate key, and the natural key]
(305) 4-6
Loading the Star
• Load dimension tables
• Load hashed file lookup tables
   • Used as lookups for loading fact table
   • Faster than lookups directly to dimension tables
• Load fact table
• Automate the load using a controlling job
   • Load all dimension tables in parallel
   • Load all lookups in parallel
   • Load fact table only if all dimension tables are successfully loaded
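The automation steps above can be sketched in job control code roughly as follows. This is a minimal sketch under stated assumptions, not the course solution: all job names and the "LoadStarControl" routine name are hypothetical, and the hashed file lookup loads are left out for brevity. The functions are the ones listed on the Job Control Functions slide, plus DSLogWarn for the log message. DSRunJob starts a job and returns without waiting for it, which is what lets the three dimension loads run in parallel.

```basic
* Sketch only: job names below are hypothetical.
hDim1 = DSAttachJob("LoadCustomersD", DSJ.ERRFATAL)
hDim2 = DSAttachJob("LoadProductsD", DSJ.ERRFATAL)
hDim3 = DSAttachJob("LoadTimeD", DSJ.ERRFATAL)

* Start all three dimension loads; each call returns immediately.
ErrCode = DSRunJob(hDim1, DSJ.RUNNORMAL)
ErrCode = DSRunJob(hDim2, DSJ.RUNNORMAL)
ErrCode = DSRunJob(hDim3, DSJ.RUNNORMAL)

* Wait until every dimension load has finished.
ErrCode = DSWaitForJob(hDim1)
ErrCode = DSWaitForJob(hDim2)
ErrCode = DSWaitForJob(hDim3)

S1 = DSGetJobInfo(hDim1, DSJ.JOBSTATUS)
S2 = DSGetJobInfo(hDim2, DSJ.JOBSTATUS)
S3 = DSGetJobInfo(hDim3, DSJ.JOBSTATUS)

* Load the fact table only if all dimension loads finished OK.
If S1 = DSJS.RUNOK And S2 = DSJS.RUNOK And S3 = DSJS.RUNOK Then
   hFact = DSAttachJob("LoadFact", DSJ.ERRFATAL)
   ErrCode = DSRunJob(hFact, DSJ.RUNNORMAL)
   ErrCode = DSWaitForJob(hFact)
End Else
   Call DSLogWarn("A dimension load did not finish OK; fact table not loaded", "LoadStarControl")
End
```

As written, a dimension job that finishes with warnings (DSJS.RUNWARN) also blocks the fact load; whether that is the desired rule is a design choice.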
(305) 4-7
Load Dimension Tables
[Diagram: three dimension load jobs labeled Job 1, Job 2, and Job 3]
(305) 4-8
Load Lookup Tables
(305) 4-9
Load Fact Table
[Diagram: fact table load job with callouts for the dimension lookups and the fact table]
(305) 4-10
Controlling Job
[Screenshot callout: Add code to run the selected job]
(305) 4-11
Job Control Functions
• DSAttachJob: Establish job handle
• DSRunJob: Run the job
• DSWaitForJob: Wait for the job to finish
• DSGetJobInfo: Get job information
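Used together on a single job, the four functions typically appear in this order. The sketch below is illustrative: the job name "LoadCustomersD" is hypothetical, and DSDetachJob, which releases the handle, is not listed on the slide.

```basic
* Sketch only: "LoadCustomersD" is a hypothetical job name.
hJob = DSAttachJob("LoadCustomersD", DSJ.ERRFATAL)   ;* establish job handle
ErrCode = DSRunJob(hJob, DSJ.RUNNORMAL)              ;* run the job
ErrCode = DSWaitForJob(hJob)                         ;* wait for the job to finish
Status = DSGetJobInfo(hJob, DSJ.JOBSTATUS)           ;* get job information (here, the finishing status)
ErrCode = DSDetachJob(hJob)                          ;* release the handle (not on the slide)
```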
(305) 4-12
Exercise Part I: Loading a Star
• Load dimension tables
• Load lookup tables
• Load the fact table
• Automate the process
(305) 4-13
Job Parameters
• Used to introduce flexibility
• Entered in Job Properties window
• Can be used by the job to specify file names and locations
• Can be used in expressions for column derivations
• Can be used in controlling jobs to pass values to controlled jobs
• Must be resolved at execution time
(305) 4-14
Syntax
• Name can be any alphanumeric string not starting with #
• Hash marks (#) used to delineate parameters that will be passed to operating system (e.g., file name, directory name)
• Examples
   • #Filename# in a Sequential stage
   • StartDate in a constraint or derivation or job control
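The two forms might look like this in practice. This is a sketch: the #SourceDir# parameter and the InLink.OrderDate column are hypothetical names introduced only for illustration.

```basic
* Sequential File stage, file name property. Hash marks delimit each parameter
* (SourceDir is a hypothetical second parameter):
*    #SourceDir#/#Filename#

* Transformer constraint. The parameter is referenced by name, without hash marks
* (InLink.OrderDate is a hypothetical input column):
InLink.OrderDate >= StartDate
```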
(305) 4-15
Resolving Parameter Values
• Parameter entry window is displayed when the job runs in standalone mode
• Default values can be assigned
• Can be set by controlling job using DSSetParam
   • Set statically (hard-coded)
   • Set dynamically
      • Read from file
      • Read from a variable or other job parameter
(305) 4-16
Using DSSetParam
[Screenshot callouts: Parameters; Hard-coded values]
(305) 4-17
Setting Parameter Dynamically
[Screenshot callouts: Parameter; Controlling job parameter]
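Since the screenshots for these two slides are not recoverable, here is a minimal sketch of both styles. The job name, the parameter names, and the file name are hypothetical. It assumes the controlling job has its own StartDate parameter, which job control code can reference by name.

```basic
* Sketch only: job, parameter, and file names are hypothetical.
hJob = DSAttachJob("LoadCustomersD", DSJ.ERRFATAL)

* Static: the value is hard-coded in the controlling job.
ErrCode = DSSetParam(hJob, "Filename", "customers.txt")

* Dynamic: pass the controlling job's own StartDate parameter to the controlled job.
ErrCode = DSSetParam(hJob, "StartDate", StartDate)

ErrCode = DSRunJob(hJob, DSJ.RUNNORMAL)
```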
(305) 4-18
Exercise Part II: Set Job Parameters
• Add job parameters
• Set job parameters using DSSetParam
(305) 4-19
Conditions To Control Flow
• Return status codes:
   Status = DSGetJobInfo(JobHandle, DSJ.JOBSTATUS)
   If Status = DSJS.RUNOK Then …
   If Status = DSJS.RUNFAILED Then …
   If Status = DSJS.RUNWARN Then …
• Link record counts:
   DSGetLinkInfo(JobHandle, StageName, LinkName, DSJ.LINKROWCOUNT)
• User-defined conditions:
   Call DSSetUserStatus("job termination code")
   UserStatus = DSGetJobInfo(JobHandle, DSJ.USERSTATUS)
   If UserStatus = "job termination code" Then …
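The three kinds of condition can be combined in one controlling job. The following is a sketch under stated assumptions: the job, stage, and link names, the "DUPLICATES" user status string, and the "ControlJob" routine name are all hypothetical, and the controlled job is assumed to set its user status with DSSetUserStatus.

```basic
* Sketch only: job, stage, link, and status names are hypothetical.
hJob = DSAttachJob("LoadCustomersD", DSJ.ERRFATAL)
ErrCode = DSRunJob(hJob, DSJ.RUNNORMAL)
ErrCode = DSWaitForJob(hJob)

Status = DSGetJobInfo(hJob, DSJ.JOBSTATUS)
RejectRows = DSGetLinkInfo(hJob, "xfmCustomers", "Rejects", DSJ.LINKROWCOUNT)
UserStatus = DSGetJobInfo(hJob, DSJ.USERSTATUS)

Begin Case
   Case Status = DSJS.RUNFAILED
      * DSLogFatal logs the message and aborts the controlling job.
      Call DSLogFatal("LoadCustomersD failed", "ControlJob")
   Case RejectRows > 0
      Call DSLogWarn(RejectRows : " rows went down the reject link", "ControlJob")
   Case UserStatus = "DUPLICATES"
      Call DSLogWarn("Controlled job reported duplicate records", "ControlJob")
   Case 1
      Call DSLogInfo("LoadCustomersD finished without reported problems", "ControlJob")
End Case
```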
(305) 4-20
Checking Job Status and Link Info
[Screenshot callouts: Request Status; Check Status; Get Link Info (number of rows through link)]
(305) 4-21
Checking User Status
[Screenshot callouts: Request User Status; Check User Status]
(305) 4-22
Exercise Part III: Controlling Program Flow
• Test for bad data
• Test job status codes
• Get job link information
• Set user status codes
• Test user status codes
(305) 4-23
Exercise Part IV: Update the Dimension Tables
• Check for duplicate records
• Update records in dimension tables
(305) 4-24
Loading a Parameter File
Example parameter hashed file

Dimension (key)   MaxKey (value of last surrogate key)
CustomersD        234
ProductsD         1233
TimeD             9878
(305) 4-25
Reading the Hashed File
* Open the hashed file: "MAXKEY" is the filename, H.MAXKEY is the file handle
OPEN "MAXKEY" TO H.MAXKEY ELSE
   * . . . Error Handling Code
END
* Read the record for dimension key "CustomersD" into the dynamic array PARAMREC
READ PARAMREC FROM H.MAXKEY, "CustomersD" ELSE
   * . . . Error Handling Code
END
* Get MaxKey value from array
MaxKey = PARAMREC<1>
(305) 4-26
Setting the Parameter
READ PARAMREC FROM H.MAXKEY, "CustomersD"
. . .
MaxKey = PARAMREC<1>
* Attach by job name, then set the parameter named "MaxKey" to the value held in MaxKey
hJob2 = DSAttachJob("jcLoadCustomersD", DSJ.ERRFATAL)
ErrCode = DSSetParam(hJob2, "MaxKey", MaxKey)
ErrCode = DSRunJob(hJob2, DSJ.RUNNORMAL)
. . .
(305) 4-27
Exercise Part V: Changed Data Capture
• Load a parameter file with the maximum key
• Read maximum key from parameter file
• Generate surrogate key values sequentially from maximum key value
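One possible shape for the last two steps is sketched below; it is not the exercise solution. It assumes MaxKey is a job parameter of the load job, that the stage name "xfmCustomers" and link name "ToCustomersD" (both hypothetical) identify the output link, and that every row on that link becomes a new dimension row. @INROWNUM is the Transformer system variable holding the current input row number.

```basic
* Transformer derivation for the surrogate key column:
MaxKey + @INROWNUM
```

After the load job finishes, the controlling job could store the new maximum back in the parameter hashed file, reusing hJob2, H.MAXKEY, PARAMREC, and MaxKey from the earlier slides:

```basic
* Sketch only: stage and link names are hypothetical.
ErrCode = DSWaitForJob(hJob2)
RowCount = DSGetLinkInfo(hJob2, "xfmCustomers", "ToCustomersD", DSJ.LINKROWCOUNT)

* Advance the stored maximum key by the number of rows loaded.
PARAMREC<1> = MaxKey + RowCount
WRITE PARAMREC ON H.MAXKEY, "CustomersD"
```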

DS41_DS305_M06_JobControl.ppt
