The document discusses various workflow tasks in Informatica PowerCenter, including session, command, email, decision, assignment, timer, control, event raise, and event wait tasks. It provides examples of how to use these tasks to control workflow execution based on conditions, variables, events, and timing requirements. Specifically, it presents a business case where sessions need to wait for indicator files but only within a specific time window each day, using assignment, file wait, timer, and command tasks along with link logic.
Workflow Manager - In Detail
Workflow manager is the key to loading the final product
into the target database(s).
Used to manage how jobs run: order and criteria
Used for scheduling job runs
Used to notify users when a job has completed / failed
Used to partition loads and perform performance tuning
Register Server
Similar to Relational Connection
dialog
Same parameters apply with the exception of
the new variables for Workflow logs.
Assign to Workflows
While folders are closed, it is possible to assign a server to a particular session.
This dialog allows individually or globally selected sessions to be assigned to run on a particular server.
Links and Conditions
Definition
Links and their underlying conditions are what provide process control to the workflow. When an attached link condition resolves to TRUE, the attached object may begin processing. There can be no looping, and links can only execute once per workflow. However, more complex branching and decisions can be made by combining multiple links to a single object or branching into decision-type paths. Each link has its own expression editor and can utilize upstream resolved object variables or user-defined variables for its own evaluation.
(Screenshot callout: Link condition)
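The link behaviour described on this slide can be sketched in Python. This is only an illustrative model, not PowerCenter code: the `Link` class, the `runnable_targets` function, and the task names `s_load`, `s_next`, and `e_alert` are hypothetical. Each link carries a condition that is evaluated once against the variables of completed upstream tasks.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Variables of completed upstream tasks, keyed by task name (hypothetical model).
TaskVars = Dict[str, Dict[str, object]]


@dataclass
class Link:
    source: str
    target: str
    condition: Callable[[TaskVars], bool]  # stands in for the link's expression


def runnable_targets(links: List[Link], task_vars: TaskVars) -> List[str]:
    """Evaluate every link exactly once; return targets whose condition is TRUE."""
    return [link.target for link in links if link.condition(task_vars)]


task_vars: TaskVars = {"s_load": {"Status": "SUCCEEDED", "SrcSuccessRows": 120}}
links = [
    Link("s_load", "s_next", lambda v: v["s_load"]["Status"] == "SUCCEEDED"),
    Link("s_load", "e_alert", lambda v: v["s_load"]["Status"] == "FAILED"),
]
started = runnable_targets(links, task_vars)  # only the success branch starts
```

Because each link is evaluated a single time and there is no way back to an earlier task, the model has no loops, which mirrors the restriction stated above.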
Links and Conditions
Object Variables
The default set of object variables from a session can provide more information than just a status of ‘Completed’. More complex evaluation can be done for ErrorCode, StartTime, SrcSuccessRows, etc.
In addition to the default object variables, user-defined variables can be created and populated via parameter files or changed in the workflow via Assignment tasks. Also, any upstream task that has completed can have its variables utilized in downstream link conditions.
(Screenshot callout: Object Variables)
Tasks
Local Tasks –
Sessions
Commands
Email
Decision
Assignment
Timer
Control
Event Raise
Event Wait
Global (Reusable) Tasks –
Sessions
Commands
Email
Tasks are the default units of work for building the workflow.
Global tasks are reusable across workflows. Local tasks are
independent and self-contained within workflows.
Sessions
Session -> Workflow Notification
Options can be set to treat conditional links attached to the object as AND/OR functionality.
There is also a control option to fail the parent (container) if the task fails or does not run.
Disabling a task in a workflow allows the task to be skipped instead of having to remove it.
(Screenshot callout: Updated parameters)
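The AND/OR treatment of input links can be illustrated with a short Python sketch. The function name `task_may_start` and the `treat_input_links_as` parameter are hypothetical, chosen only to mirror the option described above; the sketch assumes the task has at least one input link.

```python
from typing import Iterable


def task_may_start(link_results: Iterable[bool], treat_input_links_as: str = "OR") -> bool:
    """Decide whether a task may start, given the results of its input links.

    AND: every input link condition must be TRUE.
    OR:  one TRUE input link condition is enough.
    """
    results = list(link_results)
    if treat_input_links_as == "AND":
        return all(results)
    if treat_input_links_as == "OR":
        return any(results)
    raise ValueError("treat_input_links_as must be 'AND' or 'OR'")
```

With two input links where only one condition is TRUE, the task starts under OR but not under AND.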
Sessions - Continued
Components
The area where commands or email unique to this object can be defined. Alternatively, you can select a reusable task to use.
(Screenshot callout: Choice of reusable or local command)
Non-Reusable Commands
Components
Whether the object is reusable or non-reusable, it is necessary to name it, since there is potential to promote it.
(Screenshot callouts: Option for local or reusable; Name of command object)
Non-Reusable Commands
Components
The properties tab allows for error control for commands/tasks.
(Screenshot callout: Error Control for multiple commands/tasks)
Sessions - Continued
Partitions
New partitioning scheme allows for repartitioning after Source Qualifier at almost any other transformation object in the mapping. There are four main partition types: Pass Through, Round Robin, Hash Auto Keys, Hash User Keys.
(Screenshot callouts: Add Partition points; Change Partition Type)
Session Partitions (Partition Points)
Partition points mark thread boundaries as well as divide
the pipeline into stages.
The partition point at the source qualifier marks the boundary between the first (reader) and second
(transformation) stages. The partition point at the Aggregator transformation marks the boundary
between the second and third (transformation) stages. The partition point at the target instance marks
the boundary between the third (transformation) and fourth (writer) stage.
Session Partitions (Partition Types)
Round-robin partitioning. The Informatica Server distributes data
evenly among all partitions. Use round-robin partitioning where you
want each partition to process approximately the same number of
rows.
Hash partitioning. The Informatica Server applies a hash function
to a partition key to group data among partitions.
Key range partitioning. You specify one or more ports to form a
compound partition key.
Pass-through partitioning. The Informatica Server passes all rows
at one partition point to the next partition point without redistributing
them. Choose pass-through partitioning where you want to create an
additional pipeline stage to improve performance, but do not want to
change the distribution of data across partitions.
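The four partition types listed above can be sketched in Python to show how each one distributes rows. This is a conceptual illustration only, not how the Informatica Server is implemented: the function names are hypothetical, `zlib.crc32` is a stand-in for whatever hash function the server actually applies, and treating each key range as start-inclusive and end-exclusive is an assumption.

```python
import zlib
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, object]


def round_robin(rows: Sequence[Row], n: int) -> List[List[Row]]:
    """Deal rows out in turn so each partition gets roughly the same count."""
    parts: List[List[Row]] = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def hash_partition(rows: Sequence[Row], n: int, key: str) -> List[List[Row]]:
    """Group rows so equal key values always land in the same partition."""
    parts: List[List[Row]] = [[] for _ in range(n)]
    for row in rows:
        digest = zlib.crc32(str(row[key]).encode("utf-8"))  # stand-in hash
        parts[digest % n].append(row)
    return parts


def key_range(rows: Sequence[Row], key: str,
              ranges: Sequence[Tuple[int, int]]) -> List[List[Row]]:
    """Route each row to the partition whose [start, end) range holds its key."""
    parts: List[List[Row]] = [[] for _ in ranges]
    for row in rows:
        for i, (start, end) in enumerate(ranges):
            if start <= row[key] < end:
                parts[i].append(row)
                break
        else:
            raise ValueError(f"no range covers key value {row[key]!r}")
    return parts


def pass_through(parts: List[List[Row]]) -> List[List[Row]]:
    """Hand partitions to the next stage without redistributing any rows."""
    return parts
```

Round-robin balances row counts regardless of content, while hash partitioning trades balance for the guarantee that related rows stay together, which is the property grouping transformations depend on.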
Partitions Defined
First stage. To read data from the three flat files concurrently, you must specify three
partitions at the source qualifier. Accept the default partition type, pass-through.
Second Stage. Since the source files vary in size, each partition processes a different
amount of data. Set a partition point at the Filter transformation, and choose round-
robin partitioning to balance the load going into the Filter transformation.
Third Stage. To eliminate overlapping groups in the Sorter and Aggregator
transformations, use hash auto-keys partitioning at the Sorter transformation. This
causes the Informatica Server to group all items with the same description into the
same partition before the Sorter and Aggregator transformations process the rows.
Fourth Stage. Since the target tables are partitioned by key range, specify key range
partitioning at the target to optimize writing data to the target.
Command Tasks
Command
The command object can be created globally under the Task Developer. It can also be promoted here from within a mapping. The command task is used to call shell commands during the workflow.
(Screenshot callout: Created in Task Developer)
Command Tasks
Command
The properties section houses the option to either run all commands regardless of earlier results, or to run each command only if the previous command completes. The Commands tab is where the actual commands are created. One command per line.
(Screenshot callout: Process Control for multiple commands)
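The two process-control choices for multiple commands can be sketched with Python's standard `subprocess` module. This is an illustrative model rather than what the server executes: the function `run_commands` and its `run_if_previous_completed` flag are hypothetical names, and a non-zero exit code is assumed to mean the command did not complete.

```python
import subprocess
from typing import List


def run_commands(commands: List[str], run_if_previous_completed: bool = True) -> List[int]:
    """Run shell commands in order and return the exit code of each one run.

    With run_if_previous_completed=True, stop after the first non-zero exit
    code; with False, run every command regardless of earlier results.
    """
    exit_codes: List[int] = []
    for command in commands:
        result = subprocess.run(command, shell=True)
        exit_codes.append(result.returncode)
        if run_if_previous_completed and result.returncode != 0:
            break  # later commands are skipped
    return exit_codes
```

For example, given three commands where the second fails, the default setting returns two exit codes, while `run_if_previous_completed=False` returns all three.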
Email Tasks
Email
Email task is very similar to the command task since it can be either created in the Task Developer or promoted from a mapping. The properties tab allows for an expression editor for text creation utilizing the built-in variables.
(Screenshot callouts: Email text creation dialog; Built-in Variables)
Workflow Variables
Pre-defined Variables
This is the list of all pre-defined task-level variables available to evaluate upon.
Variable | Task Type | Datatype
Condition | Decision task | Integer
EndTime | All tasks | Date/time
ErrorCode | All tasks | Integer
ErrorMsg | All tasks | Nstring*
FirstErrorCode | Session task | Integer
FirstErrorMsg | Session task | Nstring*
PrevTaskStatus | All tasks | Integer
SrcFailedRows | Session task | Integer
SrcSuccessRows | Session task | Integer
StartTime | All tasks | Date/time
Status** | All tasks | Integer
TgtFailedRows | Session task | Integer
TgtSuccessRows | Session task | Integer
** Supported Status returns: ABORTED, DISABLED, FAILED, NOTSTARTED, STARTED, STOPPED, SUCCEEDED
Workflow Variables
User-defined Variables
Variables are created at the container level, much like in mappings (Workflows=Mappings, Worklets=Mapplets). Once created, values can be passed to objects within the same container for evaluation. (Assignment Task can modify/calculate variables.)
(Screenshot callout: Edit Variables)
Workflow Variables
User-defined Variables
A user-defined variable can assist in more complex evaluations. In the pictured example, an external parameter file contains the number of expected rows. This in turn is evaluated against the actual rows successfully read from an upstream session. The $ prefix is reserved for pre-defined variables. User-defined variables should use the $$ prefix.
(Screenshot callouts: User Defined Variables; Pre-Defined Variable)
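The expected-rows check can be modelled in Python as a rough sketch. Several things here are assumptions for illustration: the parameter file is treated as plain `name=value` lines with bracketed section headers skipped, and the variable name `$$ExpectedRows`, the section header text, and both function names are hypothetical.

```python
from typing import Dict


def parse_parameters(text: str) -> Dict[str, str]:
    """Collect name=value pairs, skipping blank lines and [section] headers."""
    params: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("[") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        params[name.strip()] = value.strip()
    return params


def rows_match(params: Dict[str, str], session_vars: Dict[str, int],
               name: str = "$$ExpectedRows") -> bool:
    """Compare the user-defined expected count with the session's SrcSuccessRows."""
    return int(params[name]) == session_vars["SrcSuccessRows"]


# Hypothetical parameter file content and upstream session variables.
parameter_text = "[MyFolder.WF:wf_demo]\n$$ExpectedRows=500\n"
params = parse_parameters(parameter_text)
link_is_true = rows_match(params, {"SrcSuccessRows": 500})
```

The result of `rows_match` plays the role of the link condition: the downstream task would start only when the counts agree.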
Assignment Task
Usage
The assignment task allows the user to assign a value to a user-defined workflow variable. To use the assignment task, first create and add the assignment task to the workflow. Then configure the assignment task by assigning values or expressions to user-defined variables. This assigned value will then be used for the remainder of the workflow.
(Screenshot callout: Edit Variables)
Event Task
Usage
Event tasks are used to specify the sequence of task execution. The event is triggered based on the completion of a sequence of tasks. The Event-Raise and Event-Wait tasks are the two tasks used to work with events in a workflow.
(Screenshot callout: Edit Events)
Event Task
Usage
If using Event tags, an Event Raise is used in conjunction with an Event Wait. In the pictured example, two branches are executed in parallel. The second session of the lower branch will remain in stasis until the upper branch completes, triggering the event. The lower branch's event wait task recognizes the event and allows the second session to start.
(Screenshot callouts: Event Raise; Event Wait)
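The raise/wait pairing between two parallel branches can be sketched with Python's `threading.Event`. This is an analogy, not PowerCenter behaviour: the function `run_two_branches` and the log messages are hypothetical, and threads stand in for workflow branches.

```python
import threading
from typing import List


def run_two_branches() -> List[str]:
    """Run two branches in parallel; the lower branch waits for the upper one."""
    log: List[str] = []
    log_lock = threading.Lock()
    upper_done = threading.Event()  # stands in for a user-defined event

    def record(message: str) -> None:
        with log_lock:
            log.append(message)

    def upper_branch() -> None:
        record("upper: session finished")
        upper_done.set()  # Event Raise

    def lower_branch() -> None:
        record("lower: first session finished")
        upper_done.wait()  # Event Wait: blocks until the event is raised
        record("lower: second session started")

    threads = [
        threading.Thread(target=lower_branch),
        threading.Thread(target=upper_branch),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return log
```

Whatever order the threads are scheduled in, the second session of the lower branch is logged only after the upper branch has finished.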
Event Raise
Usage
To configure the Event Raise task, the drop-down box allows for selection of the appropriate user-defined Event tag. This will create an entry in the repository for a matching event wait to look for.
Event Wait
Usage
The event wait can be configured to wait either for an Event Raise (user-defined event) or for the existence of an indicator file.
(Screenshot callouts: User Defined Event; Indicator File)
Event Wait
Usage
The properties section of the Event Wait task allows for further definition of behavior. If your workflow has failed/suspended after the Event Raise but before the Event Wait has resolved, then the Enable Past Events option is able to recognize that the event has already happened. If working with indicator files, you have the ability to either delete the file or allow it to stay in case some downstream Event Waits are also looking for that file.
(Screenshot callouts: Resume/Restart Support; Flat-file Cleanup)
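Waiting on an indicator file, with the choice to delete or keep it, can be sketched as a simple polling loop in Python. The function `wait_for_indicator` and its parameters are hypothetical, and the `timeout_seconds` argument is an addition for this sketch that the slide does not describe.

```python
import os
import time
from typing import Optional


def wait_for_indicator(path: str, poll_seconds: float = 1.0,
                       timeout_seconds: Optional[float] = None,
                       delete_after: bool = False) -> bool:
    """Poll until the indicator file exists; return False if the timeout passes.

    delete_after=True removes the file once seen (flat-file cleanup);
    leave it False when downstream waits also look for the same file.
    """
    deadline = None if timeout_seconds is None else time.monotonic() + timeout_seconds
    while not os.path.exists(path):
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
    if delete_after:
        os.remove(path)
    return True
```

Keeping the file in place lets several waits react to one indicator, at the cost of having to clean it up some other way before the next run.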
Decision Task
Usage
The decision task allows for True/False based branching of process ordering. The Decision task can house multiple conditions, and therefore downstream links can be evaluated simply upon the Decision being True or False.
**Note: it is possible to base the decision on the SUCCEEDED or FAILED status of a previous task. However, if the workflow is set to suspend on error, then that branch is suspended and the decision won’t trigger on a FAILED condition.
Control Task
Usage
The control task is used within a branch to stop processing at a chosen level during the workflow. Consider, for example, the case where too many sessions have too many failed rows. The options allow for different levels, from failing at the object level to aborting the whole workflow.
Timer Task
Usage
The timer task has two main ways to be utilized. The first is by absolute time, that is, a time evaluated against server time or taken from a user-defined variable (that contains the date/time stamp to start).
Timer Task
Usage
The second usage is by relative time, which offers options for time calculated from when the process reached this (Timer) task, from the start of the container of this task, or from the start of the absolute top-level workflow.
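The absolute and relative timer modes can be compared with a small Python sketch. The function `timer_fire_time`, its parameters, and the reference-point keys `"task"`, `"parent"`, and `"top_workflow"` are hypothetical labels for the three relative options described above.

```python
from datetime import datetime, timedelta
from typing import Dict, Optional


def timer_fire_time(*, absolute: Optional[datetime] = None,
                    offset: Optional[timedelta] = None,
                    relative_to: str = "task",
                    starts: Optional[Dict[str, datetime]] = None) -> datetime:
    """Return the moment the timer completes.

    Absolute mode: the given date/time is used as-is.
    Relative mode: offset is added to the start time of the chosen
    reference point ("task", "parent", or "top_workflow").
    """
    if absolute is not None:
        return absolute
    if offset is None or starts is None:
        raise ValueError("relative mode needs both offset and starts")
    return starts[relative_to] + offset


starts = {
    "top_workflow": datetime(2024, 1, 1, 22, 0),
    "parent": datetime(2024, 1, 1, 22, 5),
    "task": datetime(2024, 1, 1, 22, 10),
}
fire_at = timer_fire_time(offset=timedelta(minutes=30), relative_to="parent", starts=starts)
```

The same 30-minute offset produces three different completion times depending on the reference point, which is why the choice matters for timers nested inside worklets.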
Practical
Business Case
Three sessions each need to wait for indicator file(s) before starting. The window of opportunity is only between 10PM and 2AM (next morning). A cutoff time is needed to stop the process (the polling, not runs already in progress) so that new activity does not continue between 2AM and 10PM. The workflow is scheduled to run every day at 10PM.
Objects Used:
•Assignment Task – Assigns the appropriate cutoff time for logic
•File Wait Tasks – Polls for the appropriate Indicator files
•Timer Task – Assigned to start based on the variable assigned by the Assignment task
•Command Tasks – After cutoff time the commands will put an indicator file to release the polling
Link Logic – The remainder of the logic is contained within the links themselves. The main sessions compare the end time of the file wait tasks against the cutoff time. If the end time is within the cutoff, the sessions run; if it is past the cutoff, they do not. The cutoff branch also checks whether the file wait tasks are still running past the cutoff. If they are, the command tasks fire.
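The cutoff logic of this business case can be sketched in Python. The slide does not say how the Assignment task computes the cutoff, so this sketch assumes it is the workflow start time plus four hours (10PM to 2AM); the function names are hypothetical, and treating an end time exactly equal to the cutoff as "within cutoff" is also an assumption.

```python
from datetime import datetime, timedelta


def cutoff_for(workflow_start: datetime, window_hours: int = 4) -> datetime:
    """Assignment task stand-in: cutoff = start of the window plus its length."""
    return workflow_start + timedelta(hours=window_hours)


def session_may_run(file_wait_end: datetime, cutoff: datetime) -> bool:
    """Session link: run only if the file wait ended within the cutoff."""
    return file_wait_end <= cutoff


def release_needed(file_wait_still_running: bool, now: datetime, cutoff: datetime) -> bool:
    """Cutoff-branch link: fire the command task if polling has run over."""
    return file_wait_still_running and now >= cutoff


start = datetime(2024, 1, 1, 22, 0)  # scheduled 10PM start
cutoff = cutoff_for(start)           # 2AM the next morning
```

When `release_needed` is true, the command task writes the indicator file so the file wait ends, and `session_may_run` then evaluates to false because the wait ended after the cutoff.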
Error Types
Transformation Error
− Data row has only passed partway through the mapping
transformation logic
− An error occurs within a transformation
Data reject
− Data row is fully transformed according to the mapping
logic
− Due to a data issue, it cannot be written to the target
− A data reject can be forced by an Update Strategy
Error Types
Error Log Options are set in the Session task (via Workflow Manager)
Error Type | Logging OFF (default) | Logging ON
Transformation errors | Written to session log, then discarded | Appended to flat file or relational tables. Only fatal errors written to session log
Data rejects | Appended to reject file (one .bad file per target) | Written to row error tables or file
Error Logging Off
Transformation Errors:
− Details and data written to the session log
− Data row is discarded
− If data flows concatenated, corresponding rows in
parallel flow are also discarded
Data Rejects
− Conditions causing data to be rejected include:
  • Target database constraint violations, out-of-space errors, log space errors, null values not accepted
  • Data-driven records containing the value ‘3’ or DD_REJECT (the reject has been forced by an Update Strategy)
  • Target table property ‘reject truncated/overflowed rows’
Error Logging to a Relational Database
Option set in Session Configuration
Results written to several tables:
− PMERR_SESS: Stores metadata about the session run
such as workflow name, session name, repository name,
etc.
− PMERR_MSG: Error messages for a row of data are
logged in this table
− PMERR_TRANS: Metadata about the transformation
such as transformation group name, source name, port
names with datatypes are logged in this table
− PMERR_DATA: The row data of the error row as well as
the source row data is logged here. The row data is in a
string format such as [indicator1:data1 | indicator2 : data2]
Error Logging to a Flat File
Option set in Session Configuration
Format: Session metadata followed by de-normalized error information
Sample session metadata:
Repository GID: 510u6f02-8733-11d7-9db7-00e01823c14d
Repository: RowErrorLogging
Workflow: w_unitTests
Session: s_customers
Mapping: m_customers
Workflow Run ID: 6079
Worklet Run ID: 0
Session Instance ID: 806
Session Start Time: 10/19/2004 11:24:15
Session Start Time (UTC): 1066587856
Row data format:
Transformation || Transformation Mapplet Name || Transformation
Group || Partition Index || Transformation Row ID || Error
Sequence || Error Timestamp || Error UTC Time || Error Code ||
Error Message || Error Type || Transformation Data || Source
Mapplet Name || Source Name || Source Row ID || Source Row Type
|| Source Data
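A row in the format above can be split into named fields with a short Python sketch. The field list is taken from the row data format on this slide; the function name `parse_error_row` is hypothetical, and the sketch assumes that `||` never appears inside a field value, which real transformation or source data may violate.

```python
from typing import Dict, List

# Field names in the order given by the row data format.
FIELDS: List[str] = [
    "Transformation", "Transformation Mapplet Name", "Transformation Group",
    "Partition Index", "Transformation Row ID", "Error Sequence",
    "Error Timestamp", "Error UTC Time", "Error Code", "Error Message",
    "Error Type", "Transformation Data", "Source Mapplet Name", "Source Name",
    "Source Row ID", "Source Row Type", "Source Data",
]


def parse_error_row(line: str) -> Dict[str, str]:
    """Split one '||'-delimited error row and label each value by field name."""
    values = [value.strip() for value in line.split("||")]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))
```

Raising on a field-count mismatch makes rows that contain an embedded delimiter fail loudly instead of being silently mislabeled.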
Log Source Row Data
Separate checkbox in session task
Logs the source row associated with the error row
Logs metadata about source, e.g. Source Qualifier,
source row ID, and source row type
NOTE: Source row logging is not available
downstream of an Aggregator, Joiner, Sorter, or
other transformation (where output rows are not
uniquely correlated with input rows).