Scheduling BODS Jobs Sequentially and Conditionally
created by Anoop Kumar on Dec 25, 2012 3:26 PM, last modified by Anoop Kumar on Dec 27, 2012 12:09 PM
Version 7
Introduction:
This article provides various solutions for scheduling multiple BODS batch jobs (Jobs) sequentially and conditionally. As you are aware, BODS does not contain an inbuilt mechanism to chain multiple Jobs within one parent Job; the default way of working is to chain multiple workflows within a Job. However, a workflow cannot be executed on its own and needs a Job to execute it, and in various scenarios there is a need to sequence the Jobs and run them conditionally. The approaches provided below can be used where chaining multiple workflows within a Job is not enough and Jobs have to be chained/sequenced.
The advantages of using the below approaches are:
1. There is no need for a third-party scheduling tool. Various features within BODS can be combined to create a Job which acts as a Parent Job and can be scheduled to trigger Jobs one after the other. The Parent Job acts as a sequencer of Jobs.
2. We can avoid scheduling each and every Job and only schedule Parent Jobs.
3. Using the web services approach, Global Variables can be passed via XML file to the Jobs in a simplified manner.
4. Using the web services approach, the developer only needs access to a folder that the Job Server can access, in order to place the XML files, and does not require access to the Job Server itself.
5. Avoids loading a Job with too many workflows.
6. Time-based scheduling (for example, scheduling Jobs at 10-minute intervals) can be avoided, and hence there will not be any overlap if the preceding Job takes more than 10 minutes.
7. As the Child Jobs and the Parent Job each have their own trace logs, it is easier to troubleshoot in case of any issues.
8. At any point, Child Jobs can also be run independently in the production environment; this would not be possible if the entire Job logic were put into a workflow.
Scheduling BODS Jobs Sequentially:
If the requirement is just to sequence the Jobs so that they execute one after the other, irrespective of whether the preceding Job completes successfully or terminates with an error, then one of the below approaches can be used. Note that in the example provided below it is assumed that the Jobs do not have any Global Variables. The approach for chaining/sequencing Jobs with Global Variables is explained in the later part of the article.
Sequencing using Script:
Consider two simple Jobs: Job1 and Job2 are to be executed in sequence, and Job2 does not have any business dependency on Job1. The requirement is to execute only one Job at a time, i.e. Job1 can be run first and then Job2, or the other way round; the only criterion is that no two Jobs should run at the same time. This restriction could be for various reasons, such as efficient utilization of the Job Server, or because both Jobs use the same temp tables.
Steps to Sequence Jobs using Script:
1. Export the Jobs as .bat files (Windows) using the Export Execution Command from the Management Console.
2. Check the availability of the Job1.bat and Job2.bat files on the Job Server.
3. Create a new Parent Job (call it Schedule_Jobs) with just one Script object.
4. In the Script, call Job1 and Job2 one after another using the exec function as given below:
Print('Trigger Job1');
Print(exec('C:\Program Files (x86)\Business Objects\BusinessObjects Data Services\log\Job1.bat', '', 8));
Print('Trigger Job2');
Print(exec('C:\Program Files (x86)\Business Objects\BusinessObjects Data Services\log\Job2.bat', '', 8));
When the Schedule_Jobs Parent Job is run, it triggers Job1, and then after completion (successful completion/termination) of Job1 it triggers Job2. The Parent Job can now be scheduled in the Management Console to run at a scheduled time, and it will trigger both Job1 and Job2 in sequence as required. Note that if Job1 hangs for some reason, Schedule_Jobs will wait until Job1 comes out of the hung state and returns control to Schedule_Jobs. In this way any number of Jobs can be sequenced.
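For readers who want to see the mechanics outside Designer, the parent script's behavior can be sketched in plain Python: each exported launch script is run to completion before the next one starts, regardless of its exit status. This is only an illustrative stand-in for the exec() calls above; the command strings here are hypothetical, not real BODS launch scripts.

```python
import subprocess

def run_jobs_in_sequence(batch_files):
    """Run each exported launch script to completion, one after another.

    Mirrors the parent job's exec() calls: the next job starts only after
    the previous one returns, regardless of its exit status.
    """
    results = {}
    for bat in batch_files:
        # subprocess.run blocks until the launched job finishes,
        # just as exec() holds the parent job until the .bat returns
        completed = subprocess.run(bat, shell=True, capture_output=True, text=True)
        results[bat] = completed.returncode
        print(f"{bat} finished with return code {completed.returncode}")
    return results
```

Because the launcher blocks on each command, a hung first job delays the second indefinitely, which is exactly the behavior described above for Schedule_Jobs.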
Sequencing using Webservices:
If the same two Jobs (Job1 and Job2) have to be executed in sequence using web services, the below approach can be used.
1. Publish both Job1 and Job2 as web services from the Management Console.
2. Pick up the web service URL using the View WSDL option; the link will be as given below:
http://<hostname>:28080/DataServices/servlet/webservices?ver=2.1&wsdlxml
3. In Designer, create a new Datastore with Datastore type as WebService and provide the web service URL fetched from the View WSDL option.
4. Import the published Jobs as functions.
5. Create a simple Parent Job (called Simple_Schedule) to trigger Job1 and Job2.
6. In the Call_Job1 query object, call Job1 as shown in the diagrams below. As no inputs are required for Job1, the DI_ROW_ID from Row_Generation, or Null, can be passed to Job1.
7. Similarly, call Job2 in the Call_Job2 query object.
When the Simple_Schedule Parent Job is run, it triggers Job1, and then after completion (successful completion/termination) of Job1 it triggers Job2. The Parent Job can now be scheduled in the Management Console to run at a scheduled time, and it will trigger both Job1 and Job2 in sequence as required. Note that if Job1 hangs for some reason, the Parent Job will wait until Job1 comes out of the hung state and returns control to the Parent Job. In this way any number of Jobs can be sequenced.
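Under the hood, the imported function wraps a SOAP call to the URL from step 2. The request can be sketched as below; note that the element names and envelope shape here are illustrative assumptions for a generic SOAP launch request, not the exact contract of the generated WSDL, which should always be taken from the View WSDL output.

```python
def build_launch_envelope(job_name, repo_name):
    """Build an illustrative SOAP envelope for launching a published batch job.

    The body element names (<JobName>_Job, repoName) are assumptions for
    illustration only; the real payload is dictated by the WSDL served at
    .../DataServices/servlet/webservices.
    """
    return (
        '<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">'
        "<soapenv:Body>"
        f"<{job_name}_Job>"
        f"<repoName>{repo_name}</repoName>"
        f"</{job_name}_Job>"
        "</soapenv:Body>"
        "</soapenv:Envelope>"
    )
```

The Call_Job1 and Call_Job2 query objects effectively build and send one such request each, which is why the mapping step below only needs a dummy input column when the job takes no global variables.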
Scheduling BODS Jobs Conditionally:
In most cases, Jobs are dependent on other Jobs, and some Jobs should only run after all the Jobs they depend on have run successfully. In these scenarios Jobs should be scheduled to run conditionally.
Conditional Execution using Script:
Let's consider that Job2 should be triggered after successful completion (not termination) of Job1, and that Job2 should not be triggered if Job1 fails.
The Job status can be obtained from the repository table/view ALVW_HISTORY. The status of the latest run instance of Job1 should be checked, and based on that Job2 should be triggered. To do this:
1. Create the repository database schema as a new Datastore (call it BODS_REPO).
2. Import the ALVW_HISTORY view from the Datastore.
3. Create a new Parent Job Conditionally_Schedule_Using_Script with just one Script object.
4. Create two variables, $JobStatus and $MaxTimestamp, in the Parent Job.
5. Between the exec functions, place the status-check code as given below:
Print('Trigger Job1');
Print(exec('C:\Program Files (x86)\Business Objects\BusinessObjects Data Services\log\Job1.bat', '', 8));

# Remain idle for 2 secs so that the job status is stable (status moves from S to D for a successful job and E for an error)
Sleep(2000);

# Pick up the latest job start time
$MaxTimestamp = sql('BODS_REPO', 'SELECT MAX(START_TIME) FROM DataServices.ALVW_HISTORY WHERE SERVICE = \'Job1\'');
Print($MaxTimestamp);

# Check the latest status of the preceding job
$JobStatus = sql('BODS_REPO', 'SELECT STATUS FROM DataServices.ALVW_HISTORY WHERE SERVICE = \'Job1\' AND START_TIME = \'[$MaxTimestamp]\'');
Print($JobStatus);

if ($JobStatus = 'E')
begin
    Print('First Job Failed');
    raise_exception('First Job Failed');
end
else
begin
    Print('First Job Success, Second Job will be Triggered');
end

Print('Trigger Job2');
Print(exec('C:\Program Files (x86)\Business Objects\BusinessObjects Data Services\log\Job2.bat', '', 8));
Using the above code in the Script, when the Parent Job is run it will trigger Job1, and only if Job1 has completed successfully will it trigger Job2. If Job1 fails, the Parent Job is terminated using the raise_exception function. This approach can be used to conditionally schedule any number of Jobs.
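The decision logic in that script can also be sketched outside BODS. In the sketch below, an in-memory SQLite table stands in for the repository view (the table and column names SERVICE, START_TIME, and STATUS follow ALVW_HISTORY; everything else, including the status codes, is a simplifying assumption):

```python
import sqlite3

def latest_status(conn, job_name):
    """Return the STATUS of the most recent run of job_name,
    mirroring the two sql() calls in the parent job's script."""
    cur = conn.execute(
        "SELECT MAX(START_TIME) FROM ALVW_HISTORY WHERE SERVICE = ?", (job_name,)
    )
    max_ts = cur.fetchone()[0]
    cur = conn.execute(
        "SELECT STATUS FROM ALVW_HISTORY WHERE SERVICE = ? AND START_TIME = ?",
        (job_name, max_ts),
    )
    return cur.fetchone()[0]

def should_trigger_next(conn, preceding_job):
    """Trigger the next job only if the preceding job did not end in error."""
    status = latest_status(conn, preceding_job)
    if status == "E":
        # analogue of raise_exception('First Job Failed')
        raise RuntimeError("First Job Failed")
    return True
```

The two-step lookup (max start time first, then the status for that run) matters because ALVW_HISTORY keeps one row per run, and only the newest row reflects the job that was just triggered.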
Conditional Execution using Webservices:
To conditionally execute a Job (published as a web service) based on the status of a preceding Job (again published as a web service), the same concept used in Conditional Execution using Script can be applied, i.e. call Job1, check the status of Job1, and then, if Job1 is successful, trigger Job2.
1. Create a Parent Job with two DataFlows and a Script in between the DataFlows.
2. Use the first DataFlow to call the First Job (refer to the section above for details on calling a Job as a web service within another Job).
3. Use the second DataFlow to call the Second Job.
4. Use the Script to check the status of the First Job.
The Script will contain the below code to check the status:
# Wait for 2 seconds
Sleep(2000);

# Pick up the latest job start time
$MaxTimestamp = sql('BODS_REPO', 'SELECT MAX(START_TIME) FROM DataServices.ALVW_HISTORY WHERE SERVICE = \'Job1\'');
Print($MaxTimestamp);

# Check the latest status of the preceding job
$JobStatus = sql('BODS_REPO', 'SELECT STATUS FROM DataServices.ALVW_HISTORY WHERE SERVICE = \'Job1\' AND START_TIME = \'[$MaxTimestamp]\'');
Print($JobStatus);

if ($JobStatus = 'E')
begin
    Print('First Job Failed');
    raise_exception('First Job Failed');
end
else
begin
    Print('First Job Success, Second Job will be Triggered');
end
Using the above code in the Script, when the Parent Job is run it will trigger Job1, and only if Job1 has completed successfully will it trigger Job2. This approach can be used to conditionally schedule any number of Jobs that are published as web services.
Conditional Execution using Webservices: Jobs with Global Variables
When Jobs have Global Variables for which values need to be passed at trigger time, this needs to be handled differently, as the Job, when called as a web service, expects the Global Variables to be mapped. So the idea is to pass either null values (for a scheduled run) or actual values (for a manual trigger) using an XML file as input.
Let's assume that the First Job has two Global Variables, $GV1Path and $GV2Filename, and that the Second Job does not have any Global Variables; the requirement is to trigger Job2 immediately after successful completion of Job1.
1. Similar to the Parent Job above, create a Parent Job with two DataFlows and a Script in between the DataFlows.
2. Use the first DataFlow to call the First Job (refer to the sections above for details on calling a Job as a web service within another Job). Instead of using a Row Generation object, use an XML input file as the source.
The XSD for the input XML file is as given below. If there are more Global Variables in the Job, then elements GV3, GV4, and so on should be added to the schema.
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="FIRSTJOB">
<xs:complexType>
<xs:sequence>
<xs:element name="GLOBALVARIABLES">
<xs:complexType>
<xs:sequence>
<xs:element type="xs:string" name="GV1"/>
<xs:element type="xs:string" name="GV2"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
The input XML file used is as given below:
<FIRSTJOB>
<GLOBALVARIABLES>
<GV1>testpath1</GV1>
<GV2>testfilename</GV2>
</GLOBALVARIABLES>
</FIRSTJOB>
3. In the "WebService Function Call" in the "call_FirstJob" query object, map the Global Variables as shown below.
4. Use the second DataFlow to call the Second Job. As this Job does not contain Global Variables, a Row Generation object is enough (as in the previous section).
5. Use the Script object to check the status of the First Job.
Using the above approach, when the Parent Job is run it will trigger the First Job and pass the Global Variables present in the input XML file, and only if the First Job has completed successfully will it trigger the Second Job. This approach can be used to conditionally schedule any number of Jobs that are published as web services. For every Job that has Global Variables, an XSD and an XML file should be created. Passing Global Variables from the XML file to the web service seems to work only when the parameters are passed in the right order; hence it is good practice to name the Global Variables with a naming convention like $GV1<name>, $GV2<name>, and so on.
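Since the ordering of the values matters, it can be worth sanity-checking an input file before a run. A minimal sketch using only the standard library, assuming the FIRSTJOB/GLOBALVARIABLES layout shown above (reading from a string instead of a file is just for illustration):

```python
import xml.etree.ElementTree as ET

def read_global_variables(xml_text):
    """Extract the GLOBALVARIABLES element names and values, in document order.

    Order is preserved deliberately: as noted above, the values appear to be
    matched to the job's global variables positionally, not by name.
    """
    root = ET.fromstring(xml_text)
    gv_block = root.find("GLOBALVARIABLES")
    return [(el.tag, el.text) for el in gv_block]

# The same sample file shown in the section above
sample = """<FIRSTJOB>
  <GLOBALVARIABLES>
    <GV1>testpath1</GV1>
    <GV2>testfilename</GV2>
  </GLOBALVARIABLES>
</FIRSTJOB>"""
```

A check like this can catch a GV2 element accidentally placed before GV1 before the mismatch silently feeds the wrong value into a global variable.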
Data Services sequential and conditional batch job scheduling &
launching
Posted by Scott Broadway in Data Services and Data Quality on Feb 1, 2013 3:13:05 AM
I really appreciate the quality of Anoop Kumar's recent article "Scheduling BODS Jobs Sequentially and Conditionally". And the technical accuracy is high -- yes, you can accomplish what you are trying to do with the techniques discussed in the article. Love the visuals, too.
However.
I cannot really recommend this kind of solution. Data Services is not an enterprise scheduling or orchestration tool. This approach suffers a bit from Maslow's law of the instrument: "if the only tool you have is a hammer...treat everything as if it were a nail." Yes, I love Data Services, and Data Services is capable of doing all of these things. Is it the best tool for this job?
Not exactly. And this question is answered in the first paragraph that mentions chaining workflows. Data Services already gives you the capability to encapsulate, chain together, and provide conditional execution of workflows. If jobs only contain one dataflow each, why are you calling them jobs, and why do you want to execute these jobs together as a unit? Data Services is a programming language like other programming languages, and some discretion needs to be taken for encapsulation and reusability.
I do really like the use of web services for batch job launching. It is a fantastic feature that is underutilized by DS customers. Instead, I see so many folks struggling to maintain tens and sometimes hundreds of batch scripts. This is great for providing plenty of billable work for the administration team, but it isn't very good for simplifying the DS landscape. The web services approach here will work and seems elegant, but the section about "sequencing using web services" does not sequence the jobs at all. It just sequences the launching. Batch jobs launched as web services are asynchronous... you call the SOAP function to launch the job, and the web service provider replies back with whether the job was launched successfully. This does not provide any indication of whether the job has completed yet. You must keep a copy of the job's runID (provided to you as a reply when you launch the job successfully) and use the runID to check back with the DS web service function Get_BatchJob_Status (see section 3.3.3.3 in the DS 4.1 Integrator's Guide). [Note: scheduling and orchestration tools are great for programming this kind of logic.]
Notice how it would be very hard to get true dependent web services scheduling in DS, since you would have to implement this kind of design inside of a batch job:
1. Have a dataflow that launches Job1 and returns the runID to the parent object as a variable.
2. Pass the runID variable to a looping workflow.
3. In the looping workflow, pass the runID to a dataflow that checks to see if Job1 has completed successfully.
4. When completed successfully, exit the loop.
5. Have a dataflow that launches Job2 and returns the runID to the parent object as a variable.
6. Pass the runID variable to another looping workflow.
7. In the looping workflow, pass the runID to a dataflow that checks to see if Job2 has completed successfully.
8. When completed successfully, exit the loop.
9. Build your own custom logic into both of those looping workflows to run a raise_exception() if the runID of the job crashes with an error.
10. Encapsulate the whole thing with Try/Catch to send an email notification if an exception is raised.
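The launch-and-poll pattern those looping workflows would have to implement can be sketched as follows. The launch and get_status callables are hypothetical stand-ins for the generated launch function and Get_BatchJob_Status, not their real signatures, and the status strings are illustrative:

```python
import time

def run_job_and_wait(launch, get_status, job_name, poll_seconds=1, max_polls=60):
    """Launch a job, then poll its runID until it succeeds or fails.

    launch(job_name) -> runID and get_status(runID) -> one of
    'RUNNING', 'SUCCEEDED', 'ERROR' are assumed interfaces standing in
    for the DS web service calls described above.
    """
    run_id = launch(job_name)
    for _ in range(max_polls):
        status = get_status(run_id)
        if status == "SUCCEEDED":
            return run_id
        if status == "ERROR":
            # analogue of raise_exception() in the looping workflow
            raise RuntimeError(f"{job_name} (runID {run_id}) failed")
        time.sleep(poll_seconds)
    raise TimeoutError(f"{job_name} still running after {max_polls} polls")

def run_dependent_jobs(launch, get_status, jobs):
    """Job N+1 starts only after job N has completed successfully."""
    return [run_job_and_wait(launch, get_status, j, poll_seconds=0) for j in jobs]
```

Note how much machinery (runID bookkeeping, a polling loop, timeout and error branches) is needed just to recover the synchronous behavior that a plain workflow chain gives you for free.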
This convoluted design is functionally IDENTICAL to the following and does not rely on web services:
1. Encapsulate the logic for Job1 inside of Workflow1.
2. Encapsulate the logic for Job2 inside of Workflow2.
3. Put Workflow1 and Workflow2 inside JobA.
4. Use Try/Catch to catch errors and send emails.
I'm also hesitant to recommend a highly customized DS job launching solution because of supportability. When you encapsulate your ETL job launching and orchestration in an ETL job, it's not very supportable by the consultants and administrators who will inherit this highly custom solution. This is why you invest in a tool like Tidal, Control-M, Maestro, Tivoli, Redwood, etc., so that the scheduling tool encapsulates your scheduling and monitoring and notification logic. Put the job execution logic into your batch jobs, and keep the two domains separate (and separately documentable). If you come to me with a scheduling/launching problem with your DS-based, highly customized job launching solution, I'm going to tell you to reproduce the problem without the customized job launching solution. If you can't reproduce the problem in a normal fashion with out-of-the-box DS scheduling and launching, you own responsibility for investigating the problem yourself. And this increases the cost to you of owning and operating DS.
If you really want to get fancy with conditional execution of workflows inside of a job, that is pretty easy to do.
Set up substitution parameters to control whether you want to run Workflow1, Workflow2, Workflow3, etc. [Don't use Global Variables. You really need to stop using Global Variables so much...your doctor called me and we had a nice chat. Please read this twice and call me in the morning.]
Ok, so you have multiple substitution parameters. Now, set up multiple substitution parameter configurations with $$Workflow1=TRUE, $$Workflow2=TRUE, $$Workflow3=TRUE, or $$Workflow1=TRUE, $$Workflow2=FALSE, $$Workflow3=FALSE, etc. Put these substitution parameter configurations into multiple system configurations, e.g. RunAllWorkflows or RunWorkflows12.
In your job, use Conditional blocks to evaluate whether $$Workflow1=TRUE -- if so, run Workflow1. Else continue with the rest of the job. Then on to another Conditional that evaluates $$Workflow2, etc.
Depending on which workflows you want to execute, just call the job with a different system configuration.
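The effect of this setup can be sketched as a simple lookup: a system configuration name selects a set of $$WorkflowN flags, and each Conditional block runs its workflow only when its flag is TRUE. The configuration names follow the examples above; the function and override mechanics are an illustrative model, not DS internals.

```python
# Each system configuration maps substitution parameters to values,
# mirroring the RunAllWorkflows / RunWorkflows12 examples above.
SYSTEM_CONFIGS = {
    "RunAllWorkflows": {"$$Workflow1": "TRUE", "$$Workflow2": "TRUE", "$$Workflow3": "TRUE"},
    "RunWorkflows12": {"$$Workflow1": "TRUE", "$$Workflow2": "TRUE", "$$Workflow3": "FALSE"},
}

def workflows_to_run(config_name, overrides=None):
    """Resolve which workflows the job's Conditional blocks would execute.

    overrides models a runtime substitution-parameter override, applied
    on top of the values from the chosen system configuration.
    """
    params = dict(SYSTEM_CONFIGS[config_name])
    params.update(overrides or {})
    # A Conditional block guards each workflow: run it only when its flag is TRUE
    return [name.lstrip("$") for name, value in sorted(params.items()) if value == "TRUE"]
```

The point of the design is visible here: which workflows run is decided entirely by configuration at launch time, with no edits to the job itself.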
Yes, you can include the SystemConfiguration name when you call a batch job via the command line or via a web service call.
o For web services, you just need to enable Job Attributes in the Management Console -> Administrator -> Web Services (see section 3.1.1.1 step 9 in the DS 4.1 Integrator's Guide) and specify the SystemConfiguration name inside of the element:
<job_system_profile>RunAllWorkflows</job_system_profile>.
o For command line launching, use the al_engine flag:
-KspRunAllWorkflows
Yes, you can override your own substitution parameters at runtime.
o For Web Services, enable Job Attributes and specify the overrides inside of the tags:
<substitutionParameters>
<parameter name="$$Workflow1">TRUE</parameter>
<parameter name="$$Workflow2">FALSE</parameter>
</substitutionParameters>
o For command line launching, use the al_engine flag:
-CSV"$$Workflow1=TRUE;$$Workflow2=FALSE" (put a list of substitution parameters in quotes, and separate them with semicolons)