Day 1 Data Stage Administrator And Director 11.0

DATA STAGE BASICS

Published in: Business, Technology

  1. DataStage Administrator and Director Basic. C3: Protected
  2. About the Author
     Created By: Mandhagini P.S (127057)
     Credential Information: An expert in DataStage having 3 years of IT experience
     Version and Date: DS/PPT/1106/1.0
     ©Copyright 2005, Cognizant Academy, All Rights Reserved
  3. Icons Used: Questions, Hands-on Exercise, A Welcome Break, Test Your Understanding, Coding Standards, Reference, Demo, Key Contacts
  4. DataStage Administrator and Director: Overview
     Introduction: DataStage is a widely used data warehousing (DW) tool for developing complex ETL jobs. It supports real-time integration and provides a user-friendly interface, with many features that make back-end querying easier.
     DataStage Administrator lets you set up DataStage projects and carry out general administration of DataStage.
     DataStage Director lets you monitor, schedule, and run jobs, and view the job log after a job has run.
  5. DataStage Administrator and Director: Objectives
     After completing this chapter, you will be able to:
     • Identify what the DataStage tool is
     • Define DataStage Administrator
     • Work with DataStage Administrator
     • Explain DataStage Director
     • Work with DataStage Director
  6. DataStage Administrator: Logging In
     • Logging into a DataStage server using the Administrator requires the host name of the server (the fully qualified name if necessary) or the server’s IP address, and an operating system username and password.
     • For UNIX servers, users logging in as root, as a root-equivalent account, or as dsadm have full administrative rights.
     • For Windows servers, users who are members of the Local Administrators group (standalone server) or the Domain Administrators group (domain controller or servers in an Active Directory forest) have full administrative rights.
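
The login rules on slide 6 can be read as a small decision function. The sketch below is only an illustration of that reading: the function name and parameters are hypothetical, the group names are copied from the slide wording, and treating uid 0 as "root or root-equivalent" is an assumption that the slides do not state.

```python
from typing import Iterable

# Group names as worded on the slide; how membership is looked up is out of scope.
WINDOWS_ADMIN_GROUPS = {"Local Administrators", "Domain Administrators"}


def has_full_admin_rights(platform: str, username: str = "",
                          uid: int = -1, groups: Iterable[str] = ()) -> bool:
    """Return True if the account gets full administrative rights (hypothetical helper)."""
    if platform == "unix":
        # Assumption: uid 0 stands in for "root or a root-equivalent account".
        return uid == 0 or username == "dsadm"
    if platform == "windows":
        return any(group in WINDOWS_ADMIN_GROUPS for group in groups)
    raise ValueError("platform must be 'unix' or 'windows'")
```

For example, `has_full_admin_rights("unix", username="dsadm", uid=501)` is true, while a Windows account that belongs only to an ordinary users group is not.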
  7. DataStage Administrator: Logging In (Contd.)
     The Administrator Login dialog box:
     • Enter the hostname or IP address of the server where DataStage is installed.
     • Enter your operating system username and password.
  8. Viewing the Project List
     • This page lists the DataStage projects, and shows the pathname of the selected project in the Project pathname field. The Projects page has the following buttons:
       – Add: Adds new DataStage projects. This button is enabled only if you have administrator status.
       – Delete: Deletes projects. This button is enabled only if you have administrator status.
       – Properties: Views or sets the properties of the selected project.
       – NLS: Lets you change project maps and locales (if the NLS option was installed during the server installation).
       – Command: Issues DataStage Engine commands directly from the selected project.
  9. Adding Projects
     • Provided that you have the proper permissions, you can add as many projects to the DataStage server as necessary.
     • In normal projects, any DataStage developer can create, delete, or modify any object within the project once it has been created.
     Tip: The default directory path in which to create projects is located under the root directory of the DataStage server installation. For example, if the server was installed to /appl/Ascential/DataStage, the projects would be installed to /appl/Ascential/DataStage/Projects/{project name}.
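
The path rule in the tip on slide 9 can be written out as a short helper. This is a sketch only: `project_path` and the project name `Sales` are made up for illustration, and the install root is the example path from the slide.

```python
import posixpath


def project_path(install_root: str, project_name: str) -> str:
    """Return the default directory for a project under the install root (hypothetical helper)."""
    if not project_name or "/" in project_name:
        raise ValueError("project name must be a single path component")
    # The slide's layout: {install root}/Projects/{project name}
    return posixpath.join(install_root, "Projects", project_name)


print(project_path("/appl/Ascential/DataStage", "Sales"))
# prints: /appl/Ascential/DataStage/Projects/Sales
```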
  10. Deleting Projects
      • Highlight the project to be deleted.
      • Make sure you have a current backup of your project, just in case!
  11. General Project Options
      • Enable job administration in Director: enabling this feature lets the user run Cleanup Resources and Clear Status File from the Job menu of DataStage Director.
      • Enable Runtime Column Propagation for Parallel Jobs: if you enable this feature, stages in parallel jobs can handle undefined columns that they encounter when the job is run, and propagate these columns through to the rest of the stages in the job.
      • Auto-purge of job log: this setting automatically purges job log entries based on the auto-purge action setting. For example, if you choose to auto-purge up to the previous 3 job runs, entries for the previous 3 job runs are kept as new job runs are completed.
  12. General Project Options (Contd.)
      • Auto-purge settings for job logs (not a global or retroactive setting)
      • Create environment variables
  13. Setting Project-wide Environment Variables
      • You can set project-wide defaults for general environment variables or ones specific to parallel jobs from this page.
      • You can also specify new variables. All of these are then available to be used in jobs.
      • In each of the categories except User Defined, only the default value can be modified. In the User Defined category, users can create new environment variables and assign default values.
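
The rules on slide 13 (defaults can be changed in every category, but new variables can be created only under User Defined) can be modelled in a few lines. This is a sketch under stated assumptions, not how DataStage stores its settings: the class, its methods, and the variable names `EXAMPLE_TMP` and `EXAMPLE_REGION` are all hypothetical.

```python
from typing import Dict

USER_DEFINED = "User Defined"


class ProjectEnvironment:
    """Project-wide environment variable defaults, grouped by category (illustrative model)."""

    def __init__(self, defaults: Dict[str, Dict[str, str]]):
        # Copy the input so the caller's dictionaries are not modified.
        self._defaults = {cat: dict(variables) for cat, variables in defaults.items()}
        self._defaults.setdefault(USER_DEFINED, {})

    def set_default(self, name: str, value: str) -> None:
        """Change the default of an existing variable, whatever its category."""
        for variables in self._defaults.values():
            if name in variables:
                variables[name] = value
                return
        raise KeyError(f"{name} is not defined; new variables go under {USER_DEFINED}")

    def add_user_defined(self, name: str, default: str) -> None:
        """Create a new variable; only the User Defined category accepts new names."""
        if any(name in variables for variables in self._defaults.values()):
            raise ValueError(f"{name} already exists")
        self._defaults[USER_DEFINED][name] = default

    def as_job_environment(self) -> Dict[str, str]:
        """Flatten all categories into the set of defaults available to jobs."""
        merged: Dict[str, str] = {}
        for variables in self._defaults.values():
            merged.update(variables)
        return merged
```

A typical use would be to build the object from the predefined categories, call `set_default` to adjust an existing value, and call `add_user_defined` for anything new.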
  14. Setting Project-wide Environment Variables (Contd.)
  15. Enable Server-Side Job Tracing
      You can trace the activities on the server to help diagnose project problems.
      • Enable or disable tracing in the project
      • View or delete the currently highlighted file
      • Trace files that have been created
  16. Validating User Account for Job Scheduling
      • This tab applies to Windows NT/2000 servers only.
      • DataStage uses the Windows NT Schedule service to schedule jobs.
      • Select a user account with proper access to the DataStage project.
      • Verify that the currently selected user account can schedule jobs.
  17. Performance Tuning Options
      Some performance tuning options are:
      • Row buffering
      • Hashed file stage caching
  18. Server Commands
      • Select a project and click ‘Command’.
      • Enter a valid DataStage command.
      • When you execute the command, a new window shows the response from the engine.
  19. Assigning Roles (Operator/Developer) to User Accounts
      There are four roles for a DataStage user account:
      • DataStage Developer: Has full access to all areas of a DataStage project.
      • DataStage Production Manager: Has full access to all areas of a DataStage project, and can also create and manipulate protected projects.
      • DataStage Operator: Has permission to run and manage DataStage jobs.
      • <None>: Does not have permission to log on to DataStage.
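
The four roles on slide 19 map naturally onto a small permission table. The sketch below encodes only what the slide states; the enum and function names are hypothetical, and it assumes that "full access" includes the ability to run jobs.

```python
from enum import Enum


class Role(Enum):
    DEVELOPER = "DataStage Developer"
    PRODUCTION_MANAGER = "DataStage Production Manager"
    OPERATOR = "DataStage Operator"
    NONE = "<None>"


def can_log_on(role: Role) -> bool:
    # <None> is the only role without permission to log on.
    return role is not Role.NONE


def has_full_project_access(role: Role) -> bool:
    return role in (Role.DEVELOPER, Role.PRODUCTION_MANAGER)


def can_manipulate_protected_projects(role: Role) -> bool:
    # Only the Production Manager role is described as handling protected projects.
    return role is Role.PRODUCTION_MANAGER


def can_run_jobs(role: Role) -> bool:
    # Assumption: full access implies running jobs, so every role that can log on qualifies.
    return can_log_on(role)
```

Under this reading, `can_manipulate_protected_projects(Role.OPERATOR)` is false: an Operator can run and manage jobs but cannot manipulate a protected project.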
  20. Assigning Roles (Operator/Developer) to User Accounts (Contd.)
      Select the user role to be assigned to particular user accounts.
  21. Settings for Parallel Jobs
      • Enable Runtime Column Propagation for Parallel Jobs: When this feature is enabled, stages in parallel jobs can handle undefined columns that they encounter when the job is run, and propagate these columns through to the rest of the job.
      • Enable Remote Execution of Parallel Jobs: Select this to specify that parallel jobs in this project are to be deployed on a USS (UNIX System Services) machine. When this option is selected, the Remote tab is enabled and you can specify details about the jobs that are deployed.
  22. Settings for Parallel Jobs (Contd.)
      Enable these options.
  23. Settings for Parallel Jobs (Contd.)
  24. DataStage Director: Logging In
      • Logging into a DataStage server using the Director requires the host name of the server (the fully qualified name if necessary) or the server’s IP address, and an operating system username and password.
  25. DataStage Director: Logging In (Contd.)
      The Director Login dialog box:
      • Enter the hostname or IP address of the server where DataStage is installed.
      • Enter your operating system username and password.
      • Select the project to attach to.
  26. Viewing the Job Run Status
      • The Job Status view shows the status of all the jobs in the currently selected job category, or, if the job category pane is hidden, in the current project. The view has the following columns:
        – Job name: The name of the job.
        – Status: The status of the job.
        – Started on date: The time and date a job was started. These fields are only filled in for a job with a status of Running.
        – Last ran on date: The time and date the job was finished, stopped, or aborted. These columns are blank for jobs that have never been run.
        – Description: A description of the job, if available.
      • To view more details about a job’s status, select the job and do one of the following:
        – Choose View -> Detail.
        – Right-click to display the shortcut menu and choose Detail.
        – Double-click the job.
  27. Viewing the Job Run Status (Contd.)
      Detailed information about a job’s status
  28. Validating a Job
      • You can check that a job or job invocation will run successfully by validating it.
      • Jobs should be validated before running them for the first time, or after making any significant changes to job parameters.
      When a server job is validated, the following checks are made without actually extracting, converting, or writing data:
      • Connections are made to the data sources or data warehouse.
      • SQL SELECT statements are prepared.
      • Files are opened. Intermediate files in Hashed File, UniVerse, or ODBC stages that use the local data source are created, if they do not already exist.
  29. Validating a Job (Contd.)
      Click Validate when Job Run Options and parameters have been set.
  30. Running a Job
      Click Run when Job Run Options, parameters, and tracing options have been set.
  31. Monitoring a Job
      • Expand the tree to see all links attached to an active stage.
      • Optionally show CPU utilization for each active stage.
  32. Stopping a Job
      Click the Stop button to stop a running job.
  33. Resetting a Job
      • If a job has stopped or aborted, it is difficult to determine whether all the required data was written to the target data tables. When a job has a status of Stopped or Aborted, you must reset it before running the job again. By resetting a job, you set it back to a runnable state and, optionally, return your target files to the state they were in before the job was run.
      • To reset a job or job invocation:
        1. Select the job or invocation you want to reset in the Job Status view.
        2. Choose Job -> Reset or click the Reset button on the toolbar. A message box appears.
        3. Click Yes to reset the tables. All the files in the job are reinstated to the state they were in before the job was run. The job’s status is updated to “Has been reset”.
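
The reset rule on slide 33 amounts to a small state check: a Stopped or Aborted job cannot run until it has been reset. The sketch below models only the statuses named in the slides; the function names are hypothetical, and refusing to reset a job in any other status is an assumption made for this example rather than documented Director behavior.

```python
# Statuses that block a rerun, as named on the slide.
NEEDS_RESET = frozenset({"Stopped", "Aborted"})
RESET_STATUS = "Has been reset"


def must_reset_before_run(status: str) -> bool:
    """True if the job has to be reset before it can be run again."""
    return status in NEEDS_RESET


def reset_job(status: str) -> str:
    """Return the status after a reset (assumption: only Stopped/Aborted jobs are reset)."""
    if status not in NEEDS_RESET:
        raise ValueError(f"job with status {status!r} does not need a reset in this sketch")
    return RESET_STATUS


def run_job(status: str) -> str:
    """Return the status after starting a run, refusing jobs that still need a reset."""
    if must_reset_before_run(status):
        raise RuntimeError("reset the job before running it again")
    return "Running"
```

So an Aborted job goes through `reset_job` first, which yields "Has been reset", and only then through `run_job`.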
  34. Resetting a Job (Contd.)
      Click the Reset button to return a job to a runnable state.
  35. Interpreting the Job Execution Details in Log View
      • Current run: black
      • Previous run: blue
      • (…): additional information is available for this entry
  36. Log Event Detail Window
      • Detail information can be copied to the system clipboard and pasted into a text editor, which is useful for sending errors to support.
      • Additional lines of information regarding this particular event
  37. Filtering Log Events
      • Where to start showing log entries
      • Where to stop showing log entries
      • What type of log entries to show
      • How many log entries to show
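
The four filter settings on slide 37 (start, stop, type, and count) can be pictured as one filtering function over a list of log entries. This is an illustration, not the Director's implementation: the `LogEntry` shape and the type labels are hypothetical, and keeping the most recent entries when a count is given is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional, Set


@dataclass
class LogEntry:
    when: datetime
    kind: str      # hypothetical labels such as "Info", "Warning", "Fatal"
    message: str


def filter_log(entries: Iterable[LogEntry],
               start: Optional[datetime] = None,
               stop: Optional[datetime] = None,
               kinds: Optional[Set[str]] = None,
               limit: Optional[int] = None) -> List[LogEntry]:
    """Apply the start, stop, type, and count filters; None means 'no restriction'."""
    selected = [
        e for e in entries
        if (start is None or e.when >= start)
        and (stop is None or e.when <= stop)
        and (kinds is None or e.kind in kinds)
    ]
    selected.sort(key=lambda e: e.when)
    if limit is not None:
        # Assumption: a count keeps the most recent entries.
        selected = selected[max(0, len(selected) - limit):]
    return selected
```

Calling `filter_log(entries, kinds={"Warning", "Fatal"}, limit=50)` would then show at most the 50 most recent warnings and fatal errors.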
  38. Clearing Log Entries
      • Immediately delete log entries or automatically purge entries
      • Which entries to remove immediately
      • Which entries to remove automatically
  39. Clearing Log Entries (Contd.)
      Options in Auto-Purge:
      • Up to previous (job runs): Purges old log entries, leaving the specified number of recent job run entries in the file.
      • Older than (days): Purges all log entries older than the specified number of days.
      Specify the number of job run entries or days by clicking the arrow buttons or entering the value directly.
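
The two Auto-Purge options on slide 39 differ in what they count: job runs versus calendar days. The sketch below shows both rules on a simplified log; representing each entry as a `(run number, timestamp)` pair is an assumption made for the example, and the function names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Entry = Tuple[int, datetime]  # (run number, timestamp): a hypothetical log entry shape


def purge_up_to_previous(entries: List[Entry], keep_runs: int) -> List[Entry]:
    """Keep only the entries that belong to the `keep_runs` most recent job runs."""
    if keep_runs < 0:
        raise ValueError("keep_runs must not be negative")
    recent_runs = set(sorted({run for run, _ in entries}, reverse=True)[:keep_runs])
    return [entry for entry in entries if entry[0] in recent_runs]


def purge_older_than(entries: List[Entry], days: int, now: datetime) -> List[Entry]:
    """Drop every entry whose timestamp is more than `days` days before `now`."""
    cutoff = now - timedelta(days=days)
    return [entry for entry in entries if entry[1] >= cutoff]
```

With four runs in the log, `purge_up_to_previous(entries, 3)` drops only the entries of the oldest run, which matches the "previous 3 job runs" example given for the project-level auto-purge setting.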
  40. Schedule View
  41. Scheduling a Job Execution
      You can schedule a job to run in a number of ways:
      • Once today at a specified time
      • Once tomorrow at a specified time
      • On a specific day and at a particular time
      • Daily at a particular time
      • On the next occurrence of a particular date and time
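
Three of the scheduling choices on slide 41 reduce to simple date arithmetic, sketched below. The functions are hypothetical and only compute when a run would fall due; they do not talk to the scheduler, and rejecting a "today" time that has already passed is an assumption.

```python
from datetime import datetime, time, timedelta


def once_today(now: datetime, at: time) -> datetime:
    """Run time for 'once today at a specified time'."""
    run_at = datetime.combine(now.date(), at)
    if run_at <= now:
        # Assumption: a time that has already passed today is rejected.
        raise ValueError("that time has already passed today")
    return run_at


def once_tomorrow(now: datetime, at: time) -> datetime:
    """Run time for 'once tomorrow at a specified time'."""
    return datetime.combine(now.date() + timedelta(days=1), at)


def next_daily(now: datetime, at: time) -> datetime:
    """Next run time for 'daily at a particular time'."""
    candidate = datetime.combine(now.date(), at)
    return candidate if candidate > now else candidate + timedelta(days=1)
```

For a job scheduled daily at 09:00, calling `next_daily` at 14:00 returns 09:00 on the following day.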
  42. Scheduling a Job Execution (Contd.)
      Select a job and click the Schedule button.
  43. Rescheduling a Job Execution
      Select a previously scheduled job and click the Reschedule button.
  44. Un-scheduling a Job Execution
      Right-click a previously scheduled job and click Unschedule.
  45. Cleaning Up Resources
      • If the Enable Job Administration in Director option has been set in the DataStage Administrator, certain functions are available to help you clean up the resources of a job that has hung or aborted, or to return a job to a state in which you can rerun it after the cause of the problem has been fixed.
      • Use these functions with care, and only after you have tried to reset the job and you are sure it has hung or aborted.
      • The Cleanup Resources command lets you:
        – View and end job processes
        – View and release the associated locks
  46. Cleaning Up Resources (Contd.)
      • Operating system’s process ID number
      • Logout (kill) selected O/S process
      • Engine locks associated with processes
  47. Clearing the Status File
      Select a hung job and select Clear Status File from the Job menu.
  48. Clearing the Status File (Contd.)
      Before you clear a status file you should:
      • Try to reset the job.
      • Ensure that all the job’s processes have ended.
  49. Allow time for questions from participants
  50. Test Your Understanding
      • What is the use of having User Defined Environment Variables?
      • Can a DataStage operator manipulate a protected project?
      • What is the default cache size of a hashed file?
      • When will “Clear Status File” be enabled in Director?
      • What does (…) in the job log mean?
      • Where do you see the CPU utilization of each stage in a job?
  51. DataStage Administrator and Director: Summary
      • DataStage is an ETL tool widely used in data warehousing. It has 4 components: Administrator, Director, Designer, and Manager.
      • Administrator can be used to:
        – Create or delete projects
        – Assign roles to user accounts
        – Set project-specific environment variables
        – Enable tracing and performance tuning
      • Director can be used to:
        – View job statistics
        – Validate, run, monitor, stop, reset, and schedule jobs
        – View logs, filter log events, and clear log entries
        – Clean up job resources
  52. DataStage Administrator and Director: Source
      • DataStage 7.5.1 manual
      Disclaimer: Parts of the content of this course are based on the materials available from the Web sites and books listed above. The materials that can be accessed from linked sites are not maintained by Cognizant Academy and we are not responsible for the contents thereof. All trademarks, service marks, and trade names in this course are the marks of the respective owner(s).
  53. You have successfully completed DataStage Administrator and Director.
