DB2 Basics



  1. DB2 Basics
     DB2 (Database 2) is an IBM product. It is the relational database management system for MVS and exists as one or more subsystems within MVS.
     DB2 Objects: Database, Table space, Table, Index space, Index, Storage group, View, Synonym, Alias
     STOGROUPS
     • For DB2 datasets, there are 2 options for storage allocation: (1) Storage group (2) User-defined VSAM
     • With storage groups, dataset allocation is performed by DB2 through IBM's Data Facility Product (DFP).
     • Max no of volumes per stogroup is 133 (ideally 3 or 4), and all volumes must be of the same type (3380, 3390, etc.).
     • CREATE STOGROUP TESTSG1 VOLUMES('VOL1', 'VOL2',…)
     • The default stogroup is SYSDEFLT. It is created when DB2 is installed. Its use should be avoided.
     • The underlying VSAM datasets are created and maintained by DB2.
     Using user-defined VSAM datasets
     • 2 types of VSAM datasets can be used to represent DB2 table spaces and index spaces: ESDS and LDS.
     • They are not used as plain VSAM datasets; DB2 accesses them through the VSAM Media Manager. DB2 also does additional formatting of the datasets, so they can't be treated like standard VSAM.
     • DB2 can use LDS more efficiently, as it has a 4K CI size and, unlike ESDS, no control information.
     • DEFINE CLUSTER -
         (NAME(vcat.dsndbc.ddddddd.ssssssss.I0001.Annn) -
          LINEAR -
          REUSE -
          VOLUMES(vol1,vol2,…) -
          CYLINDER(pri sec) -
          SHAREOPTIONS(3 3)) -
         DATA -
         (NAME(vcat.dsndbd.ddddddd.ssssssss.I0001.Annn))
       vcat represents an ICF catalog.
     BUFFERPOOL
     Data is first read from the VSAM dataset that holds the table into a buffer pool, and from there it is sent to the requester. There are 60 buffer pool options in total: 50 4K buffer pools (BP0 thru BP49) and 10 32K buffer pools (BP32K thru BP32K9).
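The buffer pool naming above can be made concrete with a small sketch. This is plain Python, not anything DB2 provides; the function name `buffer_pool_names` is hypothetical, and the assumption is that the 32K pools are named BP32K, BP32K1, ... BP32K9 as the range in the text suggests.

```python
def buffer_pool_names():
    """Return the 60 buffer pool names described in the text (illustrative only)."""
    # 50 pools with 4K pages: BP0 .. BP49
    four_k = [f"BP{i}" for i in range(50)]
    # 10 pools with 32K pages: BP32K, then BP32K1 .. BP32K9 (assumed naming)
    thirty_two_k = ["BP32K"] + [f"BP32K{i}" for i in range(1, 10)]
    return four_k + thirty_two_k


names = buffer_pool_names()
print(len(names), names[0], names[49], names[50], names[-1])
```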
  2. Database
     • The total collection of stored data is divided into a number of user databases and a few system databases.
     • A database is a group of logically related table spaces and index spaces, which in turn contain tables and indexes respectively.
     • The database is the unit of START and STOP for the system administrator.
     • The default database is DSNDB04, which is created during installation.
     • The parameters used for creation are:
       o STOGROUP - default is SYSDEFLT
       o BUFFERPOOL - default is BP0
     • DBD is the Database Descriptor. It is a control structure used by DB2 to manage the objects that are under the control of a given database.
     • Whenever any DB2 object in a database is created, altered or dropped, the DBD is modified.
     • The DBD contains a mapping of the table spaces, tables and indexes defined in a database.
     • An X lock is acquired on the DBD during the execution of DDL, so it is better to execute DDL when there is little or no activity.
     CREATE DATABASE DATPRPDB
       STOGROUP D3DPSMS
       BUFFERPOOL BP3
       INDEXBP BP4
       CCSID EBCDIC;
     Table Space
     • A table space contains one or more tables. An index space contains exactly one index. A table and all of its indexes are contained within a single database.
     • It is a dynamically extended collection of pages. A page is a block of physical storage and is the unit of I/O. Pages are all 4K for index spaces and 32K or 4K for table spaces.
     • The table space is the unit of recovery and reorganization.
     Types of table spaces
     Simple
       o Can contain one or more tables. Within a single page, records from more than one table can be interleaved.
       o Disadvantages: affects concurrent access, data availability and space management.
     Partitioned
       o Can contain exactly one table, one so large that it can't be maintained as a single unit.
       o It is partitioned based on the value ranges of one or more columns in the table. A clustering index must be defined on this column or combination of columns. The key can be max 40 bytes. Columns in a partitioning index cannot be updated.
       o Each partition is independent of the others. Individual partitions can be associated with different storage groups.
       o Reasons to partition:
         - To isolate specific data areas in dedicated datasets
         - To improve data availability
         - To improve recoverability
         - To encourage parallelism (query parallelism breaks the data access for a query into multiple I/O streams that are executed in parallel, which reduces the overall elapsed time)
       o For a partitioned table space, individual partitions can be reorganized or recovered.
  3. Segmented
       o Can contain one or more tables. The table space is divided into segments. Each segment consists of n pages, where n is a multiple of 4 and <= 64, and can contain records of only one table.
     Advantages over simple TS:
       o Concurrent access is possible, as the data is in different segments.
       o Handles free space more efficiently.
       o Mass delete is more efficient, as data is deleted using the space map without reading it (unlike simple TS).
       o Space can be reclaimed from dropped tables immediately.
     Parameters for segmented TS:
       o SEGSIZE - the no of pages assigned to a segment
     Parameters for partitioned TS:
       o NUMPARTS - no of partitions
       o LARGE - allows as much as 1 TB of data in a table space, with max 254 partitions. Without it, the max storage amount is 64 GB and the max no of partitions is 64.
     LOCKSIZE - the type of locking for a particular table space. Options are:
       o TABLESPACE - used when shared access by multiple users is not required or when the data is read-only
       o TABLE - applicable only to segmented TS, when shared access is not required
       o PAGE - to facilitate shared access (i.e. concurrency)
       o ROW - to facilitate shared access to data that could be on the same page. Used only when page locks are not sufficient for concurrent access.
       o ANY - let DB2 decide for each case, starting from page lock
     LOCKMAX - max no of row or page level locks for any one user in a table space. If this max is reached, the lock is escalated to a table or table space lock. Values can be:
       o 0 - lock escalation should never occur
       o SYSTEM - defaults to the system-wide value specified in DSNZPARMS
       o An integer from 1 to 2,147,483,647
     PRIQTY, SECQTY
     PCTFREE - what % of each page should remain free for future inserts
     FREEPAGE - the no of pages after which an empty page is left
     These 2 are useful in reducing the frequency of reorganization, reducing contention and increasing the efficiency of insertion.
     MAXROWS - max no of rows on a table space page. Default is 255.
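The LOCKMAX behaviour described above (escalate once the limit is reached, never escalate when the value is 0) can be sketched as a toy counter. This is only an illustration in Python under those stated rules: the class `TableSpaceLocks` and its method are hypothetical, and real DB2 lock management is far more involved.

```python
class TableSpaceLocks:
    """Toy model of one user's page locks on a table space (not DB2 code)."""

    def __init__(self, lockmax):
        self.lockmax = lockmax        # 0 means lock escalation never occurs
        self.page_locks = set()       # pages currently locked by this user
        self.escalated = False

    def lock_page(self, page_no):
        """Lock a page and report which lock level is now in effect."""
        if self.escalated:
            return "TABLESPACE"
        self.page_locks.add(page_no)
        # Assumption taken from the text: escalate when the max is reached.
        if self.lockmax > 0 and len(self.page_locks) >= self.lockmax:
            self.page_locks.clear()   # page locks are replaced by one gross lock
            self.escalated = True
            return "TABLESPACE"
        return "PAGE"


limited = TableSpaceLocks(lockmax=3)
limited_levels = [limited.lock_page(p) for p in range(5)]
never = TableSpaceLocks(lockmax=0)
never_levels = [never.lock_page(p) for p in range(5)]
print(limited_levels)
print(never_levels)
```

The trade-off the parameter controls is visible here: a small LOCKMAX keeps the lock count low but gives up concurrency early, while 0 keeps fine-grained locks at the cost of holding many of them.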
  5. DB2 DATA TYPES
     Numeric
       o Smallint - 2 bytes (binary integer)
       o Integer - 4 bytes (binary integer)
       o Decimal(p,q) - packed decimal; p digits in total, of which q are decimal digits; (p+1)/2 bytes for odd p or (p+2)/2 bytes for even p; p <= 31
       o Float(p) - if p < 22, single precision, 4 bytes; else double precision, 8 bytes; p < 54
       Note: All have a DEFAULT value of 0.
     String
       o Character - string of n bytes (fixed), where n < 255
         Note: DEFAULT is blanks.
       o Varchar - string of n bytes (variable); the stored length is n+2, the extra 2 bytes holding the actual length. Max size < page size of the table space.
         Note: DEFAULT is a null (empty) string.
     Date/Time
       o Date - unsigned packed decimal in the format yyyymmdd - 4 bytes
       o Time - unsigned packed decimal in the format hhmmss - 3 bytes
       o Timestamp - unsigned packed decimal in the format yyyymmddhhmmssnnnnnn - 10 bytes
       Note: DEFAULT is CURRENT DATE/TIME/TIMESTAMP.
     Table
     • A table that is physically stored within a table space contains one or more stored records.
     • There is one stored record per row in the base table (the table that is visible to the user).
     • The stored record is a byte string containing:
       o A prefix denoting the internal system identifier
       o For each field, a length prefix denoting the actual length, followed by the actual field value in an encoded form
     • Stored records are internally addressed by RID (record identifier). A RID consists of the page number within the table space and a byte offset from the foot of the page. The entry at that offset in turn contains the byte offset of the record from the top of the page. This way, when records are rearranged within a page, the RID won't change; only the local offset at the foot of the page changes.
     Special Registers: These are zero-argument built-in scalar functions, which return a scalar value.
       1. USER - returns the primary authorisation ID.
       2. CURRENT SQLID - returns the current authorisation ID. This could be the primary ID or a secondary ID assigned to a specific functional area.
       3. CURRENT SERVER - returns the ID of the current server (useful in distributed database mgmt).
       4. CURRENT PACKAGESET - returns the ID of the collection that is currently in use.
       5. CURRENT DATE/TIME/TIMESTAMP - returns the current date/time/timestamp.
       6. CURRENT TIMEZONE - returns a time duration representing the displacement of the local time zone from Greenwich Mean Time.
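The storage sizes quoted for DECIMAL and VARCHAR above reduce to simple arithmetic, sketched below in Python. The helper names `decimal_bytes` and `varchar_bytes` are hypothetical; the formulas are the ones given in the text (two digits per byte plus a sign nibble for packed decimal, and a 2-byte length prefix for VARCHAR).

```python
def decimal_bytes(p):
    """Bytes used by a packed DECIMAL(p, q): (p+1)/2 for odd p, (p+2)/2 for even p."""
    if p < 1:
        raise ValueError("precision must be at least 1")
    return p // 2 + 1          # integer form of both cases above


def varchar_bytes(actual_length):
    """Bytes used by a VARCHAR value: the data plus a 2-byte length field."""
    return actual_length + 2


print(decimal_bytes(9), decimal_bytes(10), varchar_bytes(12))
```

For example, the DECIMAL(9,2) salary columns used later in the EMPLOYEE table would take 5 bytes each under this formula.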
  6. CREATE TABLE DEPARTMENT
       (DEPTNO    CHAR(3)     NOT NULL
       ,DEPTNAME  VARCHAR(29) NOT NULL
       ,MGRNO     CHAR(6)
       ,ADMRDEPT  CHAR(3)     NOT NULL WITH DEFAULT '200'
       ,LOCATION  CHAR(16)    NOT NULL WITH DEFAULT
       ,PRIMARY KEY(DEPTNO)
       )
       -----IN <DATABASE>.<TABLESPACE> (the default is used if not specified)
     ;
     Note: A table can also be created as
     CREATE TABLE DUPDEPT1 LIKE DEPARTMENT;
     CREATE TABLE DUPDEPT2 LIKE DEPARTMENT (DEPT_PHONE CHAR(10));
     CREATE TABLE EMPLOYEE
       (EMPNO     CHAR(6)     NOT NULL
       ,FNAME     VARCHAR(12) NOT NULL
       ,LNAME     VARCHAR(15) NOT NULL
       ,WORKDEPT  CHAR(3)
       ,PHONENO   CHAR(7)
       ,HIREDATE  DATE
       ,JOB       CHAR(8)
       ,SEX       CHAR(1)
       ,DOB       DATE
       ,SALARY    DECIMAL(9,2)
       ,BONUS     DECIMAL(9,2)
       ,COMM      DECIMAL(9,2)
       ,PRIMARY KEY(EMPNO))
       WITH RESTRICT ON DROP --> this prevents the table from being dropped
     ;
     ALTER TABLE:
     ALTER TABLE EMPLOYEE ADD TAX DECIMAL(9,2);
     This command is also used to add or drop constraints (primary key, foreign key, check).
     Types of Constraints:
     (1) Check Constraint:
     --> Used to enforce a specific restriction on the values of a column. It is checked for every update/insert.
     --> Can use comparison operators, BETWEEN, IN, LIKE, NULL, and can combine multiple conditions using 'and' / 'or'.
     --> CONSTRAINT PHONE_CHK CHECK (PHONENO >= '0000000' AND PHONENO <= '9999999')
         The 2nd operand can be another column, but it should be of the same data type as the first.
     --> Advantages:
         (1) Basic business rules can be applied at the database level. This saves additional programming in the applications that modify the data.
         (2) It ensures consistency and data integrity: rules cannot be bypassed by ad hoc data modification, as they are applied every time data is modified.
     --> Watch out for the following (these are not checked):
         (1) Conditions that contradict each other, like check(Phoneno > '1000000' and phoneno < '0999999')
         (2) Conditions that contradict the defaults
         (3) Redundant conditions are allowed; this could impact performance
  7. (2) Referential Integrity:
     --> A means of ensuring data integrity between related tables using a parent-child relationship. The table with the primary key is the parent. The table with the foreign key is the child.
     --> ALTER TABLE EMPLOYEE ADD FOREIGN KEY FKEY1 (WORKDEPT) REFERENCES DEPARTMENT ON DELETE CASCADE
         ALTER TABLE EMPLOYEE ADD FOREIGN KEY FKEY2 (WORKDEPT) REFERENCES DEPARTMENT ON DELETE RESTRICT
         ALTER TABLE DEPARTMENT ADD FOREIGN KEY FKEY3 (ADMRDEPT) REFERENCES DEPARTMENT ON DELETE CASCADE
         --> A self-referencing constraint should always use 'ON DELETE CASCADE'.
         ALTER TABLE DEPARTMENT ADD FOREIGN KEY FKEY4 (MGRNO) REFERENCES EMPLOYEE ON DELETE SET NULL
     Note: When the same constraint is enforced through the application, it consumes more resources.
     (3) Primary Key: a unique identifier for each row in a table. It can be made of one or more columns. These columns can never be NULL.
     Note: A table can be defined without a primary key. When one is defined, the table is available for use only once uniqueness is enforced by defining a UNIQUE INDEX on the same columns.
     Other unique keys in the table are called alternate keys. To enforce them we also need unique indexes, and they can't be NULL either. They are defined as UNIQUE(<COLUMN>) or <column> CHAR(3) NOT NULL UNIQUE.
     Other Integrity Constraints:
     (1) Checking of data types
     (2) Checking NOT NULL
     (3) Unique indexes on columns not defined as primary key / alternate key
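The check constraint and ON DELETE CASCADE rules above can be tried out in a small, runnable sketch. It uses Python's built-in sqlite3 module purely as a stand-in for DB2, so the DDL is SQLite's dialect (constraints declared inline, foreign keys switched on by a PRAGMA) rather than the DB2 statements shown on the slides; the tables are cut-down versions of DEPARTMENT and EMPLOYEE and the helper `try_insert` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite-specific; DB2 always enforces RI

conn.execute("CREATE TABLE department (deptno CHAR(3) NOT NULL PRIMARY KEY)")
conn.execute("""
    CREATE TABLE employee (
        empno    CHAR(6) NOT NULL PRIMARY KEY,
        workdept CHAR(3) REFERENCES department (deptno) ON DELETE CASCADE,
        phoneno  CHAR(7),
        CONSTRAINT phone_chk CHECK (phoneno >= '0000000' AND phoneno <= '9999999')
    )""")
conn.execute("INSERT INTO department VALUES ('A00')")
conn.execute("INSERT INTO employee VALUES ('000010', 'A00', '1234567')")


def try_insert(row):
    """Attempt an insert and report whether the database accepted it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


bad_phone = try_insert(("000020", "A00", "ABCDEFG"))   # violates phone_chk
bad_dept = try_insert(("000030", "Z99", "7654321"))    # no parent row Z99

# Deleting the parent row removes its children because of ON DELETE CASCADE.
conn.execute("DELETE FROM department WHERE deptno = 'A00'")
remaining = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(bad_phone, bad_dept, remaining)
```

Both rejections happen inside the database, which is the point made above: no application code had to re-implement the rule.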
  8. Index Space
     • There is always a one-to-one correspondence between index and index space. The index space is created automatically when we create an index.
     • It can be partitioned if the index it contains is a clustering index for a partitioned table space. Others are simple.
     • It is the unit of recovery and reorganization. For a partitioned index space, individual partitions can be reorganized or recovered.
     Index
     • An index defines the logical ordering imposed on the stored data.
     • Indexes are defined on one column or a combination of columns.
     • Useful for fast sequential access of the indexed data.
     • For each distinct value of the index, the pointers (i.e. the RIDs) of all the stored records that have that value are stored.
     • An index scan is used when an exhaustive search is to be done based on the index value. It follows the sequence in which the indexed values are stored.
     • A table space scan follows the physical sequence of the records.
     • Clustering index - one for which the records are physically maintained in the sequence defined by the index. The index controls the physical placement of the indexed records.
     CREATE UNIQUE INDEX SACM2.ACICACCT
       ON SACM2.ACTBACCT (ACCT_ID ASC)
       CLUSTER
       (PART 1 VALUES('4384340002425997')
          USING STOGROUP DUDTSMS PRIQTY 25200 SECQTY 7200
          ERASE NO FREEPAGE 31 PCTFREE 15
       ,PART 2 VALUES('4851270009632812')
          USING STOGROUP DUDTSMS PRIQTY 25200 SECQTY 7200
          ERASE NO FREEPAGE 31 PCTFREE 15
       )
       BUFFERPOOL BP4
       CLOSE YES;
     CREATE UNIQUE INDEX RACA.ACI1SPAA
       ON RACA.ACTBSPAA (ACCT_ID ASC)
       USING STOGROUP D2DPSMS PRIQTY 360000 SECQTY 131068
       ERASE NO FREEPAGE 0 PCTFREE 10
       BUFFERPOOL BP4
       CLOSE YES
       PIECESIZE 2097152 K;
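The idea that an index stores, for each distinct value, the RIDs of the records holding that value can be sketched with ordinary Python containers. This is a conceptual model only, not DB2's actual index page layout; the sample records, the (page, slot) form of the RIDs and the function `build_index` are all made up for illustration.

```python
from collections import defaultdict


def build_index(records):
    """Map each distinct indexed value to the RIDs holding it, in value order."""
    index = defaultdict(list)
    for rid, value in records:
        index[value].append(rid)
    return dict(sorted(index.items()))   # keys kept in the index's logical order


# (RID, indexed value) pairs in physical order; a RID here is (page, slot).
records = [((1, 0), "SMITH"), ((1, 1), "ADAMS"), ((2, 0), "JONES"), ((2, 1), "ADAMS")]
index = build_index(records)

index_scan_order = list(index)                      # sequence of the indexed values
table_space_scan_order = [v for _, v in records]    # physical sequence of records
print(index["ADAMS"], index_scan_order, table_space_scan_order)
```

With a clustering index the two orders would coincide, because the records themselves would be kept in the sequence of the index.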
  9. VIEWS
     -> Views are virtual tables based on one or more base tables.
     -> When data in the base table changes, the changes are also reflected in the view.
     -> The view is executed only when it is accessed.
     -> A view must be defined for a specific use and for one of the following advantages:
     • To provide row and column level security, by limiting the select items to the columns the user should have access to and/or by limiting the rows with proper conditions in the WHERE clause.
     • To ensure optimal access, by using join criteria and indexed columns in predicates.
     • To ensure the same calculation, by using data derivation formulas in the select list.
     • To mask the complexity of a query from DB2 beginners.
     • To support domains.
       Note: A domain is the valid range of values that a column can contain. The table CHECK constraint is used to create domains. WITH CHECK OPTION is used to ensure that data modification through the view (update/insert) conforms to the WHERE conditions specified in the view definition. This option is of 2 types:
       o WITH CASCADED CHECK OPTION - the check option is applied to the current view and to all the views it accesses, regardless of whether it is specified on them or not.
       o WITH LOCAL CHECK OPTION - the check option is applied only to the views where it is specified.
     • To rename columns so that the user can understand them better.
     -> Insert/Update/Delete cannot be done on a view with:
       o Joins
       o Functions
       o Distinct
       o Group By and Having
     -> Insert cannot be done on a view:
       o With derived data
       o With constants
       o That omits columns which have no default value
     CREATE VIEW HIGH_PAID_EMP
       (EMP_NO, EMP_FIRST_NAME, EMP_LAST_NAME, DEPT, JOB, SALARY)
     AS SELECT EMPNO, FNAME, LNAME, WORKDEPT, JOB, SALARY
        FROM EMP
        WHERE SALARY > 75000
     WITH CHECK OPTION
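Two of the points above, that a view reflects changes in its base table and that it can rename and restrict columns, can be seen in a short runnable sketch. It uses Python's sqlite3 module as a stand-in for DB2; SQLite has no WITH CHECK OPTION, so that part of HIGH_PAID_EMP is left out, and the column renaming is done with AS aliases. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno TEXT, fname TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [("000010", "ANNE", 90000), ("000020", "BOB", 50000)],
)

# A cut-down HIGH_PAID_EMP: fewer columns, friendlier names, restricted rows.
conn.execute("""
    CREATE VIEW high_paid_emp AS
    SELECT empno AS emp_no, fname AS emp_first_name, salary
    FROM emp
    WHERE salary > 75000""")

query = "SELECT emp_no FROM high_paid_emp ORDER BY emp_no"
before = [row[0] for row in conn.execute(query)]

# Changing the base table changes what the view returns; nothing is refreshed.
conn.execute("UPDATE emp SET salary = 80000 WHERE empno = '000020'")
after = [row[0] for row in conn.execute(query)]
print(before, after)
```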
  10. SYNONYMS
      • An alternative private name for a table or a view.
      • It can be used only by its creator.
      • It cannot refer to a remote table.
      • When a table/view is dropped, all synonyms defined on it are also dropped.
      CREATE SYNONYM TESTSYS FOR <CREATOR>.<TABLENAM>
      ALIASES
      • An alternative name for a table or a view.
      • It can be used by everyone.
      • It can refer to a remote table.
      • When a table/view is dropped, the aliases defined on it are not dropped; they must be dropped explicitly.
      • It provides a useful level of indirection between application programs and the data tables.
      CREATE ALIAS TESTALS FOR <CREATOR>.<TABLENAM>
      SQL - Structured Query Language
      It is the standard query language used by many relational database products like DB2, Oracle, Sybase, etc.
      Features of SQL (or sequel):
      • It is a high level language compared to procedural languages, because it provides a greater degree of abstraction. In procedural languages we process record by record: we specify what data is to be accessed and how to access it. SQL, on the other hand, requires the programmer to specify only what is needed, not how to retrieve it. The optimal instructions for data navigation (called the access path) are determined by the database itself.
      • SQL is not only a query language; it is also used to define data structures, to insert, modify and delete data, and to control access to the data. The language is common to the different users, like DBAs, application programmers, etc.
      • All database operations using SQL are at set level, unlike record level processing using flat files. A query on a table returns its result as a subset of the original table. Similarly, updates and deletes can also be done at set level.
      Based on functionality, SQL can be classified into 3 categories:
      1) Data Definition Language (DDL) - creates and maintains physical data structures using the CREATE, ALTER and DROP verbs.
      2) Data Manipulation Language (DML) - accesses and modifies data using INSERT, SELECT, UPDATE and DELETE.
      3) Data Control Language (DCL) - controls data security using GRANT and REVOKE.
      Based on the type of usage, SQL can be classified into 2 categories:
      1) Static SQL - embedded in an application program written in a high level language like COBOL.
      2) Dynamic SQL - either embedded in an application program, where it is constructed at run time, or executed online at a terminal.
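The contrast between set-level and record-level processing can be shown side by side. The sketch below uses Python's sqlite3 module as a stand-in for DB2, with an invented three-row employee table: the first half states only what should change and lets the database find the rows, while the second half walks a flat list record by record the way a file-processing program would.

```python
import sqlite3

rows = [("000010", "A00", 1000), ("000020", "A00", 2000), ("000030", "B01", 3000)]

# Set level: one statement describes the change for every qualifying row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (empno TEXT, workdept TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", rows)
cursor = conn.execute("UPDATE employee SET salary = salary + 100 WHERE workdept = 'A00'")
rows_updated = cursor.rowcount
set_level = [r[0] for r in conn.execute("SELECT salary FROM employee ORDER BY empno")]

# Record level: the program itself reads, tests and rewrites each record.
record_level = []
for empno, workdept, salary in rows:
    if workdept == "A00":
        salary += 100
    record_level.append(salary)

print(rows_updated, set_level, record_level)
```

Both halves produce the same salaries; the difference is who decides how the rows are located.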
  11. Rules for SQL
      1) Every query must access at least one object and select at least one item.
      2) The items selected can be one or more columns from a table, a literal, a special register, an expression result, or an embedded select that returns a single row.
      3) The object accessed can be a table, a view, an alias or a fullselect.
      Examples:
      1) Getting the maximum average salary and the dept that has that value.
      • FETCH FIRST n ROWS - limits the fetch to n rows.
      • Correlation name - an alternate name used for a FROM object in a query (AS …). It can be referred to in the WHERE clause, HAVING clause, etc.
      • Rename - an alternate name used for a selected item (AS …). It can be used only in ORDER BY.
      • While creating tables, we specify for every column whether it can hold null values or not. A null value simply means the value is not known, in other words not defined. The presence of a null value for a column in any row is identified using 'IS (NOT) NULL'. Null values can affect the result of a query in many ways. For example, a function like Avg ignores null values, so its result is not the same as the average over all rows. So when we query items that can have undefined values, we need to locate the null values using IS (NOT) NULL.
      SQL Predicates - conditions used to identify the desired rows in a query. A condition can be True, False or Unknown.
      • Basic predicate - a simple comparison of 2 values.
        Where Expression1 (OP) Expression2
      • Quantified predicate - compares 1 or more values with a collection of values (from a fullselect).
        Where Expression1 (OP) Some/Any/All (fullselect) (single/multi value)
      • Between predicate - compares a value with a range of values.
        Where Expression1 between Start-value and End-value
      • IN predicate - compares 1 or more values with a list of values (a multi-value comparison is always with the result of a fullselect).
      • LIKE predicate - does a partial check on strings. '%' skips a string of any length; '_' skips a single byte.
      • NULL predicate - checks whether the value of an expression is NULL or not.
      Defining and Using Temporary Tables - WITH expression
      A temporary table provides data that is fixed for the duration of the SQL statement. An uncorrelated subquery is the same in this sense, but a correlated subquery is resolved multiple times per SQL statement. We can define a temp table, then query from it or join it with a real table. We can also use a temp table to insert data into real tables. Wherever we use a temp table, we can use a fullselect in its place.
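The effect of nulls on Avg, and the use of a WITH temporary table, can both be checked with a few rows. This sketch runs the SQL through Python's sqlite3 module as a stand-in for DB2 (the behaviour shown is standard SQL, but the table and values are invented; COALESCE is the function the slides list as a synonym for Value).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (empno TEXT, bonus REAL)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("000010", 100.0), ("000020", None), ("000030", 200.0)],
)


def one_value(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


# Avg skips the null bonus: (100 + 200) / 2, not (100 + 0 + 200) / 3.
avg_ignoring_nulls = one_value("SELECT AVG(bonus) FROM employee")
avg_nulls_as_zero = one_value("SELECT AVG(COALESCE(bonus, 0)) FROM employee")

# IS NULL is the only way to locate the undefined values.
unknown = [r[0] for r in conn.execute("SELECT empno FROM employee WHERE bonus IS NULL")]

# A WITH temporary table exists only for the duration of this one statement.
known_count = one_value("""
    WITH temp1 (empno, bonus) AS
        (SELECT empno, bonus FROM employee WHERE bonus IS NOT NULL)
    SELECT COUNT(*) FROM temp1""")
print(avg_ignoring_nulls, avg_nulls_as_zero, unknown, known_count)
```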
  12. Use of CAST: It is used to convert one data type to another, as CAST(col AS <new data type>).
      E.g.: CAST(salary AS INTEGER), where salary is decimal
            CAST(job AS CHAR(3))
      Padding with blanks or truncation will occur depending on the case.
      VALUES: This is used to define a set of rows and columns which will be assigned to a view or a temporary table. The data can be treated as though it were an ordinary table, except that updates can't be performed.
      Create View view1 (col1,col2,col3) As Values ((c11,c12,c13),(c21,c22,c23)…)
      With Temp1 (col1,col2,col3) As Values ((c11,c12,c13),(c21,c22,c23)…)
      CASE: This enables conditional processing within an SQL statement. It works in 2 ways: (1) every condition in a WHEN clause is checked independently; (2) every WHEN checks for equality against a common expression.
      It can be used in the list of select items:
      Select Dept, Project_id,
             Case LOB
               When 'F' Then 'Financial'
               When 'R' Then 'Retail'
               When 'I' Then 'Insurance'
               Else 'Others'
             End as Business
      From Project
      It can be used in the where clause (the WHEN clauses are tested in order, so the narrower condition comes first):
      Select Dept, Project_id
      From Project
      Where Case
              When Salary between 0 and 999 Then Salary + 25
              When Salary > 2000 Then Salary + 100
              When Salary > 1000 Then Salary + 50
            End > Netpay
      It can be used inside a function:
      Select Sales_mgr,
             Max(Case Month when 1 then Sales_Amt else null end) as Jan,
             Max(Case Month when 2 then Sales_Amt else null end) as Feb,
             ………….
             Max(Case Month when 12 then Sales_Amt else null end) as Dec
      From Sales
      where year = 2001
      It can be used to set different values in the SET clause of an UPDATE statement (and can also be nested).
  13. Update Project
      Set Salary = Case Designation
                     When 'Associate' then 1000
                     When 'Senior Associate' then
                       Case Role
                         When 'tl' then 2000
                         When 'pl' then 3000
                         Else null
                       end
                     Else null
                   End
      It can be used to avoid a divide by 0 error:
      Select c1, c2,
             Case c2 When 0 then Null Else c1/c2 End as c3
      from Table1
      Functions: They are of 2 types:
      (1) One that acts on a particular column in all the rows that got selected (column fn).
      (2) One that acts on one row at a time (scalar fn).
      Column functions: Func (All/Distinct expression)
      1) Avg - gets the avg value of a set of numeric values which are not null.
      2) Max - gets the maximum value of a set of values.
      3) Min - gets the minimum value of a set of values.
      4) Sum - gets the sum of numeric values which are not null.
         Note: All 4 return null when no matching rows are found.
      5) Count - Count(*) gets a count of matching rows. Count(All/Distinct expression) gets a count of rows with a non-null value for the expression. It returns 0 when no rows satisfy the WHERE clause. When used with a Group By, it returns no rows in that case.
      6) Count_Big - similar to Count; used when the count is too large.
      Scalar Functions:
      1) Abs - gets the absolute value of a number (both decimal and double will return double).
      2) Ascii - gets the ASCII value of the leftmost character in the input string.
      3) Ceil - gets the smallest integer value >= input.
      4) Char - acts differently depending on the type of input.
         o If the input is char, it acts like Substr: Char(str1,3) gives the first 3 chars.
         o If the input is date/time, it converts it into a char string in different formats: Char(DOB,USA/EUR).
         o If the input is numeric, it converts it to a left-justified char string.
         Note: DIGITS is similar to Char; it gives the equivalent char string for a numeric input.
      5) Chr - returns the character whose ASCII value is the input, in the range 0 to 255.
      6) Coalesce - a synonym for Value. It returns the first non-null value in a list of compatible input values: Coalesce(c1,c2,c3).
  14. 7) Concat - concatenates 2 strings. It can be used as str1 || str2, as str1 concat str2, or as Concat(str1,str2).
      8) Date - converts the input into the equivalent date.
         o If the input represents a valid date/timestamp, it is converted as it is.
         o If the input is 7 bytes long, it is assumed to be yyyynnn (Julian) and converted.
         o If the input is numeric, it is taken as the no of days since 0001-01-01.
      Continuation of Scalar Functions: Int/Integer (numeric, decimal or valid char input), Lcase/Lower, Ucase/Upper, Left, etc.
      Select
        -- can be used with a timestamp also
        hour('23:12:58'), minute('23:12:58'), second('23:12:58'),
        microsecond('1996-08-15-15.12.124455'),
        -- can use a timestamp also
        month('1996-11-21'), monthname('1996-11-21'),
        week('2002-07-21'), year('2002-07-21'),
        int('+123'), int('-123'), int('+12.3'), int(' 124 '), int(-2.4),
        lcase(name), ucase(name),
        -- the blanks are not removed
        left(' India ',3), length(left(' India ',3)),
        right(' India',3),
        -- always returns the same value for all rows, except for varchar
        length(name),
        ltrim(' Indians '), length(ltrim(' Indians ')),
        rtrim(' India '), length(rtrim(' India ')),
        -- returns the table partition where the row is; returns 0 if the table is not partitioned;
        -- can't be used with fields in a Group By, as the value can't be associated with a specific row
        nodenumber(empno),
        -- returns the partition map index of the row
        partition(empno),
        -- nullif(a,b): if a = b then null, else the first value
        nullif(dept,'100'),
        -- locate(find-str,base-str,start-pos): returns 0 if the string is not found
        locate('C','ABCDEF',2),
        -- posstr(base-str,find-str): similar to locate, except that the first 2 parms are reversed
        -- and there is no 3rd parm; faster, as it is a sysibm function
        posstr('ABCDEF','C'),
        -- repeat(string,no of times)
        repeat('India',5),
        -- replace(string,from,to): if 'from' is not found, returns the original string
        replace('Search','ar','AR'),
        -- returns a varchar string of 5 blanks
        space(5),
        -- substr(string,start,len): error if len > (length(string)-start)
        substr(name,3,5),
        -- translate(base-str,to,from,substitute): similar to replace, but works char by char;
        -- an unmatched char is replaced by a blank or the substitute
        translate('India'),           -- to upper case
        translate('abcd','A','ab'),
        -- generates a random float value in the range 0 to 1 inclusive
        rand()
      from emp0710
  15. Order By - It acts on the final output of a query and sorts the data in the requested form. It can't be used on intermediate results (like a subquery). We can order by column name (this need not be one of the selected columns), by column no, or by an expression like substr(name,2).
      ----- Use of Order By ----
      select empno,salary,bonus from emp0710 order by 2                 --> using column no
      select empno,salary,bonus from emp0710 order by dob desc          --> using a column which is not in the select
      select empno,salary,bonus from emp0710 order by substr(empno,2)   --> using a fn output
      Group By - It is used to combine multiple rows into 1. Having is used to select which of the groups are to be retrieved.
      Rules:
      o There can be only one Group By per select, but we can have one per select in a multi-select query.
      o The select items must either be those in the Group By or be column functions.
      o The result of the query should have a distinct set of rows, where the unique identifier is the fields grouped on.
      o All null values in the Group By fields are considered equal.
      o We can group by fields as well as expressions, e.g. salary+comm:
        Select Avg(Bonus), count(*), sum(salary+comm)
        From Employee
        Where Dept <> 100
        Group by salary+comm
        Having Avg(Bonus) > 10000
      ---- Group By ---------
      Select Dept, Avg(Bonus), count(*), count(Empno), sum(salary+bonus)
      From Employee
      Where Dept <> 100
      Group by Dept
      Having Avg(salary) > 8000
      Select Dept, Avg(Bonus), count(*), count(Empno), sum(salary+bonus)
      From Employee
      Group by Dept
      Having Avg(salary) > 8000
      ----> To test and see: Where year(hiredate) <> 2001
      Select year(hiredate), Avg(Bonus), count(empno), sum(salary+bonus)
      From Employee
      Group by year(hiredate)
  16. Joins - They are used to get data from more than one table, linking the tables based on the relationship between them. They can be coded in 2 forms:
      (1) Select c1,c2,….
          From Table1 T1, Table2 T2
          where T1.Id = T2.Id and T1.c1 = someval
          Note: This is mainly suitable only for inner joins.
      (2) Select v1.Id, v1.name, v2.job
          From Staff v1
          Inner Join Staff v2 On v1.id = v2.id
          Inner Join Staff v3 On v2.id = v3.id
          where ……..
          Note: The 2nd style joins 2 tables at a time. The ON condition constrains the join. WHERE conditions are applied after the join to constrain the result.
      Join Types:
      o Inner Join (the default): It always checks for matching values between the 2 tables based on the condition specified and returns data from the rows that match.
      o Outer Join: Useful when we want the rows that have matching values and also the ones that don't have matching rows in the other table.
        - Left Outer Join: all rows in the first table plus matching rows in the 2nd table.
        - Right Outer Join: all rows in the 2nd table plus matching rows in the 1st table.
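The difference between an inner join and a left outer join, and the way Having filters whole groups, can be verified on a handful of rows. The sketch below uses Python's sqlite3 module as a stand-in for DB2; the three departments and three employees are invented, with department C01 deliberately left without employees.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (deptno TEXT, deptname TEXT)")
conn.execute("CREATE TABLE employee (empno TEXT, workdept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO department VALUES (?, ?)",
    [("A00", "PLANNING"), ("B01", "SALES"), ("C01", "SUPPORT")],
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("000010", "A00", 9000), ("000020", "A00", 11000), ("000030", "B01", 5000)],
)

join_sql = """
    SELECT d.deptno, e.empno
    FROM department d {kind} JOIN employee e ON d.deptno = e.workdept
    ORDER BY d.deptno, e.empno"""

# Inner join: only departments that have a matching employee row.
inner = conn.execute(join_sql.format(kind="INNER")).fetchall()
# Left outer join: every department; unmatched ones carry NULL (None) for empno.
left_outer = conn.execute(join_sql.format(kind="LEFT OUTER")).fetchall()

# Group By collapses rows per department; Having then keeps or drops whole groups.
grouped = conn.execute("""
    SELECT workdept, COUNT(*), AVG(salary)
    FROM employee
    GROUP BY workdept
    HAVING AVG(salary) > 8000""").fetchall()
print(inner, left_outer, grouped)
```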
  18. ON A.EMPNO = B.MGRNO
      SUBQUERIES: A subquery is used to relate one table to another, or to multiple others, without actually doing a join.
      Syntax: Expression OP SOME/ANY/ALL (SUBQUERY)
              OR
              Expression (NOT) EXISTS/IN (SUBQUERY)
      SOME   : the predicate is true if any row matches (same for ANY)
      ALL    : the predicate is true ONLY if ALL rows match
      IN     : matches one of the values listed by the subquery
      EXISTS : there exists at least one matching row in the subquery
      They are of 2 types:
      (1) Uncorrelated subquery: the results of the subquery have no direct relationship with the current row.
          Example:
          SELECT EMPNO, WORKDEPT
          FROM EMPLOYEE A
          WHERE WORKDEPT = ANY (SELECT DEPTNO FROM DEPARTMENT);
      (2) Correlated subquery: the results of the subquery can be evaluated only based on the value in the current row.
          SELECT A.EMPNO, A.WORKDEPT
          FROM EMPLOYEE A
          WHERE A.WORKDEPT = (SELECT B.DEPTNO FROM DEPARTMENT B WHERE A.WORKDEPT = B.DEPTNO);
      Examples:
      SELECT EMPNO, WORKDEPT FROM EMPLOYEE A WHERE WORKDEPT = ANY (SELECT DEPTNO FROM DEPARTMENT);
      SELECT EMPNO, WORKDEPT FROM EMPLOYEE A WHERE WORKDEPT = SOME (SELECT DEPTNO FROM DEPARTMENT);
      SELECT EMPNO, WORKDEPT FROM EMPLOYEE A WHERE WORKDEPT IN (SELECT DEPTNO FROM DEPARTMENT);
      SELECT * FROM EMPLOYEE A WHERE A.SALARY >= ALL (SELECT SALARY FROM EMPLOYEE B)
      ---- TO SELECT THOSE DEPARTMENTS WHICH HAVE NO EMPLOYEES ----
      SELECT A.DEPTNO, A.DEPTNAME
      FROM DEPARTMENT A
      WHERE NOT EXISTS (SELECT DISTINCT(B.WORKDEPT) FROM EMPLOYEE B WHERE B.WORKDEPT = A.DEPTNO)
      OR
      SELECT A.DEPTNO, A.DEPTNAME
      FROM DEPARTMENT A
      WHERE A.DEPTNO NOT IN (SELECT DISTINCT(B.WORKDEPT) FROM EMPLOYEE B WHERE B.WORKDEPT IS NOT NULL)
      Note: WORKDEPT can be null. Without the IS NOT NULL condition, a single null WORKDEPT makes the NOT IN predicate Unknown for every department, and no rows are returned.
      ---- TO SELECT THOSE DEPARTMENTS WHICH HAVE MORE THAN 1 EMPLOYEE ----
      SELECT A.DEPTNO, A.DEPTNAME
      FROM DEPARTMENT A
      WHERE 1 < (SELECT COUNT(*) FROM EMPLOYEE B WHERE B.WORKDEPT = A.DEPTNO)
      ---- MULTI-FIELD SUBQUERIES ----
      SELECT * FROM EMPLOYEE A
      WHERE (A.BONUS, A.SALARY) > (SELECT AVG(B.BONUS), AVG(B.SALARY) FROM EMPLOYEE B)
  19. DB2 Application Development
      For an application program to access DB2 data, the SQL statements have to be embedded in the program statements of the high level language, like COBOL, PL/1, etc.
      Embedded SQL can be executed in:
      • MVS Batch using CAF
      • TSO Batch
      • DL/I Batch
      • TSO (interactive)
      SQL statements that can be embedded are:
      SQL Type              SQL Statements
      DCL                   Grant, Revoke
      DDL                   Create, Alter, Drop
      DML                   Insert, Select, Update, Delete
      Dynamic SQL           Describe, Execute, Execute Immediate, Prepare
      Stored Procedures     Call, Allocate Cursor, Associate Locators
      Embedding Control     Declare, Open, Fetch, Close
      Transaction Control   Commit, Rollback
      Package Control       Set Current Packageset
      General               Explain, Lock Table, Set
      Error Handling        Whenever
      Comparison of an application that accesses data from flat files vs an application accessing DB2 data
  20. 20. DB2 Access File Access All the accessed tables are declared. Files are defined in the FD section Cursors are declared No DD card in JCL DD card required except for files allocated dynamically Cursors are opened Files are opened Cursors are closed Files are closed Set level processing Record level processing Access to table can be at Column level Always a full record is read or written Status of SQL returned by Sqlcode Status indicated by file-status Cursors are used to simulate record level READ/WRITE used for record-level processing processing  All Embedded SQL Statements are delimited using Exec SQL Sql statements End-Exec (Starting in column 12) This helps the precompiler parse the embedded sql in the source code  Though not mandatory, declare all the tables which are used in the application. This should be declared in the working-storage section before any other db2-related variable. This helps in reducing the precompiler’s work. This is done using DCLGEN command. It reads the DB2 catalog to determine the structure of the table and builds a Coobl Copybook. This copybook has the Embedded ‘Declare table’ statement and also working-storage host variable definitions for each column in the table. However, it is not mandatory that only this copybook should be used in the program. We can also hardcode the same information in the program, but not a good practice.  Include a structure called SQLCA (SQL Communication Area) as Exec SQL Include SQLCA End-Exec The fields defined in it are used to describe the success or failure of execution of each embedded Sql. The important fields in this are SQLCODE – Return Code of the last executed sql passed by DB2 to the application 0 -> Successful Execution +ve -> Successful Execution, but with an exception -ve -> Execution Failed Note: Move the Sqlcode into a formatted field to display . SQLERRD – an array containing 6 rows used to diagnose error conditions. 
  SQLERRD(3) -> number of rows inserted, deleted or updated by the SQL statement
  SQLERRD(5) -> column position of the syntax error in a dynamic SQL statement
SQLWARN0: contains W if any other SQLWARN field (SQLWARN1 through SQLWARNA) contains W.
SQLSTATE: similar to SQLCODE. It has a 2-character class code denoting the error type and a 3-character subclass code denoting the exact error within that type.
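The SQLCODE sign convention and the SQLSTATE class/subclass split described above can be sketched outside COBOL. This is only an illustration in Python, not DB2 code: the function names `classify_sqlcode` and `split_sqlstate` are hypothetical, and the sketch assumes SQLSTATE is always a 5-character string.

```python
from typing import Tuple


def classify_sqlcode(sqlcode: int) -> str:
    """Map an SQLCODE to the three outcomes listed in the notes."""
    if sqlcode == 0:
        return "success"   # 0 -> successful execution
    if sqlcode > 0:
        return "warning"   # +ve -> successful, but with an exception
    return "error"         # -ve -> execution failed


def split_sqlstate(sqlstate: str) -> Tuple[str, str]:
    """Split a 5-character SQLSTATE into (class code, subclass code)."""
    if len(sqlstate) != 5:
        raise ValueError("SQLSTATE must be exactly 5 characters")
    return sqlstate[:2], sqlstate[2:]
```

For example, `classify_sqlcode(100)` gives "warning" (row not found), and `split_sqlstate("02000")` gives the class "02" and subclass "000".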
  21. 21. A group of SQLCODEs is associated with a single SQLSTATE.
The return code can be checked after every SQL statement and action taken depending on the requirement.
SQL has an error-trapping statement: WHENEVER. When WHENEVER is used, it applies to all the subsequent SQL statements in that application. It can be used in 3 cases:
    Exec Sql Whenever NOT FOUND Continue End-Exec                (* not found, sqlcode +100)
    Exec Sql Whenever SQLWARNING Go to Warning-para End-Exec     (* +ve sqlcode but not +100, or sqlwarn0 = 'W')
    Exec Sql Whenever SQLERROR Go to Error-para End-Exec         (* -ve sqlcode)
But use of WHENEVER is not recommended.
• Host Variables: A host variable is an area of storage allocated by the host language and referenced in an SQL statement. Host variables are the means of moving data from the program to the database and vice versa. They are also used as the values to compare against in predicates, and are populated dynamically at run time.
Host Structures: These are group-level data items containing a set of host variables. If all of them are to be used as the target of a retrieval, the INTO clause needs to refer only to :Group_Data.
Null values: if a null value is returned and no null indicator is supplied, the SQLCODE is -305. Null indicators are defined as Pic S9(4) Comp. Possible values in the null indicator:
  0   -> column is not null
  -1  -> null data returned from the database
  -2  -> data conversion error
  +ve -> column is not null; the value is the original length of the column if the column is a character type and was truncated on retrieval.
The SQL defined in a cursor may be executed at the time of OPEN Cursor or at every FETCH. CLOSE Cursor releases all the resources used by the cursor.
A cursor cannot be used for updates if it is defined with any one of the following:
  UNION
  DISTINCT
  GROUP BY
  ORDER BY
  JOIN
  SUBQUERY
  CORRELATED SUBQUERY
  TABLES IN RO/UT MODES
  READ-ONLY VIEWS
  22. 22. Use FOR UPDATE OF for both Delete and Update. This helps in ensuring data integrity: the data page is locked when the fetch happens, which ensures no other updates happen.
Note: LOAD outperforms individual inserts by 50-75%.
An ISOLATION LEVEL specified at the statement level is preferable because it can vary from one statement to another, and it overrides the one in the plan or package.
Cursors are implicitly closed when a COMMIT is issued or when the program ends.
Declare the cursor with the WITH HOLD option (declare trial1 cursor with hold for select….). Using WITH HOLD retains the result table for the select defined in the cursor, so we can continue to process the cursor from where we left off. Otherwise the cursor has to be opened again and repositioned.
While declaring cursors it is good practice to specify 'FOR FETCH (READ) ONLY' or 'FOR UPDATE OF Col1, Col2, ...'.
At the SQL level we can also specify isolation levels using the WITH clause. This overrides what was specified during Bind Package/Plan. It can be done for:
  Declare Cursor
  Select Into
  Insert
  Update with Where
  Delete with Where
COMMIT-RESTART Logic: COMMIT is the DB2 statement which saves the updates made to DB2 tables since the start of the program or the last COMMIT. It does so by physically writing all the changes to the DB2 log. When a program abends, the changes are rolled back to the last commit / sync point.
When is a COMMIT issued: A unit of work is a portion of processing that reaches a logical completion and helps in achieving data integrity. The updates done during a unit of work are saved by issuing a DB2 COMMIT. For example, suppose we have an application that updates 2 tables which are related to each other. Suppose the update of table1 is committed and the program abends before committing the update to the other. The data in the 2 tables will no longer be in sync.
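The two-table example above can be tried out in miniature. The sketch below is an assumption-laden illustration, not DB2 code: it uses Python's built-in sqlite3 module as a stand-in for DB2, and the tables `table1`/`table2` and the functions `make_db`, `move_qty` and `quantities` are hypothetical names. It shows that when both updates sit in one unit of work, a failure between them rolls back the first update as well.

```python
import sqlite3


def make_db():
    """Create two related tables in an in-memory database (stand-in for DB2)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, qty INTEGER)")
    conn.execute("CREATE TABLE table2 (id INTEGER PRIMARY KEY, qty INTEGER)")
    conn.execute("INSERT INTO table1 VALUES (1, 10)")
    conn.execute("INSERT INTO table2 VALUES (1, 0)")
    conn.commit()
    return conn


def move_qty(conn, qty, fail_midway=False):
    """Update both tables as ONE unit of work: commit only at the logical end."""
    try:
        conn.execute("UPDATE table1 SET qty = qty - ? WHERE id = 1", (qty,))
        if fail_midway:
            # Simulates an abend after the first update but before the second.
            raise RuntimeError("simulated abend between the two updates")
        conn.execute("UPDATE table2 SET qty = qty + ? WHERE id = 1", (qty,))
        conn.commit()      # both updates are saved together
    except Exception:
        conn.rollback()    # back to the last commit point
        raise


def quantities(conn):
    """Return the current (table1.qty, table2.qty) pair."""
    a = conn.execute("SELECT qty FROM table1 WHERE id = 1").fetchone()[0]
    b = conn.execute("SELECT qty FROM table2 WHERE id = 1").fetchone()[0]
    return a, b
```

Had a commit been issued between the two updates, the simulated abend would have left table1 changed and table2 unchanged, which is exactly the out-of-sync state the notes warn about.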
To achieve data integrity, we must issue a COMMIT only when the processing reaches a logical end.
Checkpoint-Restart Logic: Application programs which process large volumes of data and modify DB2 tables need to incorporate restart logic for when a system error occurs. This lets processing continue from the row following the last successfully processed row.
• Create a DB2 table which will have
  23. 23. Pgm_name Job_name Commit_Frequency No_of_Commits Checkpoint_Key Checkpoint_Time
The unique key will be Pgm_Name, Job_Name.
Declare 2 cursors, both with ORDER BY on the columns that form the unique index of the table being processed. One cursor is used for normal processing. The other is used when restarting the application, to reposition at the row following the last saved row. This is done using additional predicates like Where Key > Last_Key.
For every row that is fetched, processing is done and tables are updated. Then, based on elapsed time since the last commit, the commit frequency, etc., a COMMIT is issued.
Before every COMMIT is issued, the ckpt_restart table is updated with the current key for which processing is complete and the current timestamp.
If the program is restarted, it reads the key in the ckpt table and uses it in the 2nd cursor to reposition. It then continues processing.
After the processing is complete for all the rows, the ckpt table keys are set to default values.
If file inputs are used for processing, we need to reposition the input file by saving the read_count in the ckpt table and, upon restart, performing reads in a loop until the count is > the read_count. Or else, if the file is sorted on the same key that is stored in the ckpt table, we can reposition to the next key.
If there are file outputs, they need to have DISP=MOD in the JCL so that records continue to be appended.

Dynamic SQLs
Static SQL is SQL that is hardcoded; only the values of host variables used in predicates can change during execution. Static SQL is prepared before execution.
In dynamic SQL, the columns, tables, and predicates are decided at run time. Dynamic SQL is prepared at the time of execution.
Dynamic SQL is of 4 classes:
Execute Immediate: used to implicitly prepare and execute complete SQL statements. It cannot be used to SELECT data. It is not useful if the same statement has to be executed multiple times, because the prepared statement is destroyed immediately after execution and must be prepared and executed again.
    01 SQL-TEXT.
       49 SQL-LEN PIC S9(4) COMP.
       49 SQL-TXT PIC X(100).
    Move +38 to SQL-LEN.
  24. 24. MOVE "DELETE FROM EMPL2 WHERE EMPNO = '100' " to SQL-TXT
    EXEC SQL EXECUTE IMMEDIATE :SQL-TEXT END-EXEC
Non-select Dynamic SQL: used to explicitly prepare and execute complete SQL statements. The statement preparation and execution are kept separate: PREPARE and EXECUTE. It cannot be used to SELECT data.
    EXEC SQL PREPARE STMT1 FROM :SQL-TEXT END-EXEC
    EXEC SQL EXECUTE STMT1 END-EXEC
The equivalent of a static SQL host variable is the parameter marker.
    Move +38 to SQL-LEN.
    MOVE "DELETE FROM EMPL2 WHERE EMPNO = ? " to SQL-TXT
    EXEC SQL PREPARE STMT1 FROM :SQL-TEXT END-EXEC
    Move '100' to Var1.
    EXEC SQL EXECUTE STMT1 USING :var1 END-EXEC
This is useful because the statement is prepared once and executed many times.
Fixed-List Select: used to explicitly prepare and execute SQL select statements when the columns to be retrieved are fixed.
    Move 'Select Projno, Projname, Empno from Proj where Projno = ? and Proj_Start_Date = ?' to SQL-TXT.
    Exec SQL Declare CSR Cursor for FSQL End-Exec
    Exec SQL Prepare FSQL from :SQL-TEXT End-Exec
    Exec SQL Open CSR using :var1, :var2 End-Exec
    Exec SQL Fetch CSR into :Ws-Projno, :Ws-Projname, :Ws-Empno
  25. 25. End-Exec
Note: This is like a static SQL statement. But suppose at run time we need to execute the following:
    Move 'Select Projno, Projname, Empno from Proj where Projno = ? and Proj_End_Date = ?' to SQL-TXT.
Since the second predicate in both statements has the same datatype, we can use the same var2. Otherwise we would need a different OPEN for the 2nd type.
Varying-List Select: used to explicitly prepare and execute SQL select statements when the columns to be retrieved are also not fixed.
SQLDA: the SQL Descriptor Area, used by DB2 to communicate information about the dynamic SQL (fixed-list and varying-list select) to an application program, such as the type of SQL and the number and data types of the columns being returned by the SQL. It is included as
    Exec SQL
      Include SQLDA
    End-Exec
We use the SQLDA to code the PREPARE and FETCH statements, and also have steps to store host variable addresses in the SQLDA.
Suppose an application has to read an SQL statement from a terminal and execute it. Have 2 SQLDAs: one is the full include and the other has only minimal information (MINSQLDA). Read the SQL at run time from the source and Move 'sql' to SQL-TXT.
    Exec SQL Prepare FSQL Into MINSQLDA From :SQL-TEXT End-Exec
    If SQLD in MINSQLDA = 0    /* no columns returned, i.e., a non-select */
      Exec SQL Execute Immediate :SQL-TEXT End-Exec
    Else
      Exec SQL Declare CSR Cursor for FSQL End-Exec
      Exec SQL Prepare FSQL Into SQLDA From :SQL-TEXT End-Exec
      Exec SQL OPEN CSR End-Exec
  26. 26. For each host variable to be retrieved:
        Set SQLDATA to the address of the host variable which will hold the retrieved data
        Set SQLIND to the address of the null-indicator field
      Exec SQL Fetch CSR Using Descriptor SQLDA End-Exec
In summary: first determine whether the statement is a select or not. If it is not a select, use Execute Immediate. If it is a select, prepare the SQL statement.
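The "select or non-select" decision and the use of parameter markers can be illustrated with a small runnable analogue. This is a sketch under stated assumptions, not DB2 code: it uses Python's sqlite3 module as a stand-in, where `?` plays the role of the parameter marker and `cursor.description` being `None` plays the role of SQLD = 0 (no result columns). The names `run_dynamic` and `make_demo_db` are hypothetical; the EMPL2 table is taken from the examples above.

```python
import sqlite3


def make_demo_db():
    """Build a tiny EMPL2 table in memory (stand-in for a DB2 table)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE EMPL2 (EMPNO TEXT PRIMARY KEY)")
    conn.executemany("INSERT INTO EMPL2 VALUES (?)", [("100",), ("200",)])
    conn.commit()
    return conn


def run_dynamic(conn, sql_text, params=()):
    """Run SQL text that is only known at run time.

    Non-select: returns the number of rows affected.
    Select: returns the fetched rows as a list of tuples.
    """
    cur = conn.execute(sql_text, params)   # '?' markers are bound from params
    if cur.description is None:
        # No result columns, so this was not a select (analogue of SQLD = 0).
        conn.commit()
        return cur.rowcount
    return cur.fetchall()
```

Usage: `run_dynamic(conn, "DELETE FROM EMPL2 WHERE EMPNO = ?", ("100",))` returns the count of deleted rows, while passing a SELECT returns the rows themselves. The caller never needs to know in advance which kind of statement it was given.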
  27. 27. Program Preparation Steps
Program Preparation: This involves preparation of an executable load module and a DB2 application plan for each application. They can be used only together. The steps involved are as follows:
  Source Program + copybooks -> Precompiler -> DBRM and Modified Source program
  Modified Source -> Cobol Compiler -> Compiled Source -> Linkage Editor -> Load
  DBRM -> (Bind Package -> Package) -> Bind Plan -> Plan -> DB2 Directory
The DCLGEN command, when used for a particular table, reads the DB2 Catalog to determine the table definition and builds a COBOL copybook. This copybook contains the Declare Table statement and host variable definitions for the columns in that table.
Precompilation:
• Done by the precompiler
• Expands DB2 INCLUDE members
• Comments out all the embedded SQL text and, for every executable SQL statement, adds a call to the DB2 runtime interface module DSNHLI along with the necessary parameters.
• Extracts all the SQL statements and places them in the Database Request Module (DBRM)
• Places a unique timestamp token in the modified source and the DBRM
• Reports the success or failure of the precompile process.
• DB2 need not be operational at the time of precompilation.
Compilation:
• The modified source code is compiled using the standard COBOL compiler.
• DB2 need not be operational at the time of compilation.
Bind:
• Invokes the DB2 Optimiser; the execution happens in the Relational Data Services component of DB2.
• It reads each SQL statement from the DBRM and comes up with an optimized access path for that statement.
• Bind is of 2 types:
  Bind Package accepts as input a single DBRM and produces execution logic for only the SQL statements in that DBRM. A package is not executable. One or more packages are bound into an application plan.
  Bind Plan accepts as input one or more DBRMs and one or more packages produced by Bind Package. It produces executable SQL logic for all the SQL statements in the DBRMs attached to it.
Plan is the unit of execution, but it can be used only with the corresponding load module.
Bind Plan / Package do the following:
• Read the SQL statements and check the syntax
• Ensure that the tables and columns accessed are the same as what is recorded in
  28. 28. the DB2 Catalog.
• Validate the bind authorisation of the user.
• Determine the optimal access path for each SQL statement based on the DB2 Catalog statistics (such as the availability of indexes, organization of data, table size).
Linkage: The compiled source is link-edited to create the executable load module. The link-edit step must include an appropriate COBOL-DB2 interface module, which will vary depending on the execution environment.
A DB2 program has 2 components used in execution: one is the load module and the other is the application plan.
Execution of a DB2 batch program using the TSO command:
    //step001  Exec Pgm=IKJEFT01
    //SYSTSPRT DD Sysout=*
    //SYSPRINT DD Sysout=*
    //SYSOUT   DD Sysout=*
    //SYSTSIN  DD *
     DSN SYSTEM(DSN)
     RUN PROGRAM(PROGRAM1) PLAN(PLN01) LIB(xxx.xx.xxxxx)
    /*
    //*
SYSDBRM has information on every DBRM which is bound into a plan.
SYSSTMT has all the SQL statements of a DBRM which are bound into a plan.
SYSPACKSTMT has all the SQL statements of a DBRM which are bound into a package.
During BIND PLAN, the following DB2 Catalog tables are read (where two names are paired, the second is the package equivalent):
  SYSCOLDIST
  SYSCOLDISTSTATS
  SYSCOLSTATS
  SYSCOLUMNS
  SYSINDEXES
  SYSINDEXSTATS
  SYSPLAN / SYSPACKAGE
  SYSPLANAUTH / SYSPACKAUTH
  SYSTABLES
  SYSTABLESPACE
  SYSTABSTATS
  SYSUSERAUTH
And information about the plan is stored in the following DB2 tables:
  SYSDBRM
  SYSPACKAUTH
  SYSPACKLIST
  SYSPLAN / SYSPACKAGE
  SYSPLANAUTH / SYSPACKAUTH
  SYSPLANDEP / SYSPACKDEP
  SYSPLSYSTEM / SYSPKSYSTEM
  SYSSTMT / SYSPACKSTMT
  29. 29. SYSTABAUTH
During Bind Package, the package is associated with a specific collection id.
    //BIND     EXEC PGM=IKJEFT01,COND=(5,LT)
    //STEPLIB  DD DISP=SHR,DSN=DSN710.SDSNLOAD
    //DBRMLIB  DD DISP=OLD,DSN=FDR4378.BABU.FILES
    //SYSPRINT DD SYSOUT=*
    //SYSTSPRT DD SYSOUT=*
    //SYSUDUMP DD SYSOUT=*
    //SYSTSIN  DD *
     DSN S(DSN)
     BIND PLAN(FDRPLN01) -
          QUALIFIER(ID) -
          MEMBER(DBRM1,DBRM2,….) -
          PKLIST(COL1.PACKAGE1,COL2.*) -
          ACTION(REPLACE) -
          RETAIN -
          VALIDATE(BIND) -
          ACQUIRE(USE) -
          RELEASE(COMMIT) -
          ISOLATION(CS) -
          DEGREE(ANY) -
          EXPLAIN(NO)
    /*
Collection: It is a logical grouping of packages.
During Bind Plan, we specify the DBRMs which are directly bound to the plan and we also specify the packages which are bound to the plan. The packages are specified as collection_id.package, or as collection_id.* to refer to all the packages in that collection.
The use of Package Bind provides the following advantages:
(1) Reduced bind time: Suppose a plan has more than one package bound to it. If changes are made to only one DBRM, we only need to Bind Package for that DBRM and put it in the same collection which is associated with the plan. There is no need to Bind Plan again.
(2) With the use of Bind at package level, the isolation and release parameters are specified at the DBRM level instead of the plan level.
(3) Packages can be versioned and maintained. That is, multiple packages can exist with the same name in the same collection and in turn be bound to a single plan. At run time, depending on the CONTOKEN of the load module, the corresponding package is picked and executed. By default, DB2 uses the timestamp as the version identifier.
Advantages of using collections:
(1) The same DBRM can be bound into different packages which are associated with different collection ids, using different qualifiers during BIND. Both collections are then used in the BIND PLAN.
At run time, the same application can be made to access different sets of DB2 objects by using 'SET CURRENT PACKAGESET = collection-id'.
(2) The packages of logically related DB2 programs are grouped into the same collection.
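How a PKLIST, a collection and the CONTOKEN work together can be pictured with a toy model. The sketch below is a deliberately simplified assumption, not DB2's actual search algorithm: the function `resolve_package`, its arguments, and the token values "T1"/"T2" are all hypothetical. It assumes the PKLIST entries are searched in order, that `collection.*` matches any package in that collection, and that a non-blank package set restricts the search to that one collection.

```python
def resolve_package(pklist, packages, program, contoken, packageset=""):
    """Pick the package to run for a program (toy model only).

    pklist     -- ordered entries such as "COL1.PACKAGE1" or "COL2.*"
    packages   -- set of (collection, package name, contoken) tuples that exist
    program    -- package name requested by the load module
    contoken   -- consistency token carried by the load module
    packageset -- optional collection id (models SET CURRENT PACKAGESET)
    """
    for entry in pklist:
        collection, _, name = entry.partition(".")
        if packageset and collection != packageset:
            continue                  # restricted to a single collection
        if name != "*" and name != program:
            continue                  # this entry names a different package
        key = (collection, program, contoken)
        if key in packages:
            return key                # first match in PKLIST order wins
    return None                       # nothing matched
```

With `pklist = ["COL1.PACKAGE1", "COL2.*"]`, the same program resolves to the COL1 copy by default and to the COL2 copy once the package set is "COL2", which is the effect the notes attribute to SET CURRENT PACKAGESET.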
  30. 30. BIND Parameters
(1) QUALIFIER: The ID specified here qualifies all the tables which are referred to in that DBRM. If QUALIFIER is not specified, the default qualifier is the owner, which is the primary/secondary auth id.
(2) ACTION: ADD / REPLACE
(3) Isolation level: specifies the mode of page locking used when a program is executed. There are 4 types:
  CS (Cursor Stability): releases page locks used for read as soon as another page is accessed. This helps in improving concurrency.
  UR (Uncommitted Read): also called dirty read. It allows DB2 data to be accessed without taking any locks, that is, it can read data which is being changed by some other application. This avoids concurrency problems, but doesn't guarantee accurate data. This isolation applies only to read statements; for others CS is applied. It is to be avoided if the results have to be 100% accurate.
  RR (Repeatable Read): this is the default. All page locks are held till a COMMIT is issued. It is useful only when the application requires that the same row be accessed more than once and that the data returned by each fetch be consistent, so that data integrity is ensured. It offers very poor concurrency.
  RS (Read Stability): similar to RR. It doesn't allow the page lock to be released till the COMMIT point, but allows new data to be inserted.
(4) ACQUIRE and RELEASE: These 2 parameters specify the mode of tablespace locking used. ACQUIRE is specified only at plan level, not at package level; RELEASE can also be specified for a package.
  ACQUIRE
    Use: ts locks are acquired when the tablespace is first accessed (default)
    Allocate: ts locks are acquired when the plan is allocated.
  RELEASE
    Commit: ts locks are released during Commit or Rollback (default)
    Deallocate: ts locks are released when the plan is deallocated.
(5) VALIDATE: Method of checking the existence and validity of DB2 tables and DB2 access authorization.
    Bind: the validation is done at the time of Bind
    Run: the validation is done each time the plan is executed
(6) FLAG(I): returns all the information (warning, error and completion messages) indicating the success or failure of Bind.
  31. 31. (7) EXPLAIN(YES): records the access paths chosen for the SQL statements in the PLAN_TABLE.
(8) RETAIN: For existing plans, we should specify the RETAIN parameter to retain the Bind and Execute authority already granted for this plan. Otherwise all the authority will be revoked.
(9) DEGREE(ANY): indicates to the optimizer that it should consider parallelism while executing queries against partitioned tablespaces.
(10) NODEFER(PREPARE) or DEFER(PREPARE): used when accessing DB2 data in a remote location.
(11) SQLERROR(CONTINUE / NOPACKAGE): CONTINUE continues the Bind even if an error occurs.

DB2 Data types and their COBOL Equivalents
DB2 Data Type    Internal Length                     Cobol Ws
Char(n)          n                                   Pic X(n)
Varchar(n)       n+2                                 Pic X(n) (with S9(4) Comp length field)
Smallint         2                                   Pic S9(4) Comp
Integer          4                                   Pic S9(9) Comp
Decimal(p,q)     p/2 + 1                             Pic S9(p-q)V9(q) Comp-3
Float(n)         8 (double precision, n > 21)        Comp-2
Float(n)         4 (single precision, n <= 21)       Comp-1
Date             4                                   Pic X(10)
Time             3                                   Pic X(8)
Timestamp        10                                  Pic X(26)
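The length rules in the table can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, p/2 is assumed to mean integer division (so Decimal(7,2) takes 4 bytes), and the Float boundary is taken as n up to 21 for 4 bytes and above 21 for 8 bytes.

```python
def decimal_bytes(p: int) -> int:
    """Internal length of Decimal(p,q): p/2 + 1, using integer division."""
    return p // 2 + 1


def varchar_bytes(n: int) -> int:
    """Internal length of Varchar(n): n data bytes plus a 2-byte length field."""
    return n + 2


def float_bytes(n: int) -> int:
    """Internal length of Float(n): 4 bytes up to n = 21, otherwise 8 bytes."""
    return 4 if n <= 21 else 8


def cobol_decimal_picture(p: int, q: int) -> str:
    """COBOL picture for Decimal(p,q): p-q integer digits, q decimal digits."""
    if q == 0:
        return f"PIC S9({p}) COMP-3"
    return f"PIC S9({p - q})V9({q}) COMP-3"
```

For instance, a Decimal(7,2) column takes `decimal_bytes(7)` = 4 bytes and maps to `PIC S9(5)V9(2) COMP-3`.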
  32. 32. DB2 Utilities
Data Consistency Utilities:
(1) CHECK Utility
• Checks referential integrity between 2 tables and resolves the referential constraint violations by moving the erroneous rows to exception tables.
• Checks the status of check constraints
• Checks DB2 indexes for consistency between the index data and the corresponding table data: no corresponding row for a given index value / no corresponding index value for a table row / for a matching RID, the value in the index and the value in the table don't match.
(2) REPAIR Utility
• Used to reset pending flags that are unnecessary or were set erroneously.
• Used to sync up the information about a DB2 database which is stored in the SYSDATABASE table with the definition of the DBD in the DB2 directory.
(3) REPORT Utility
• Reports the names of tablespaces and tables which are tied together by referential integrity
(4) COPY Utility
• Used to create a full or incremental image copy of a full tablespace or of a partition in a tablespace.
• Full copy copies all the data in the tablespace or partition.
• Incremental copy copies only those pages that have been modified since the last image copy was taken.
• For every run, a row is inserted in the SYSCOPY table with information on the image copy dataset name and the date and time of the copy. The image copy datasets are useful for tablespace recovery.
(5) MERGECOPY Utility
• Merges multiple image copy datasets into a single image copy dataset
(6) QUIESCE Utility
• Used to record a point of consistency for related application or system tablespaces.
• It ensures that all tablespaces in the scope are referentially intact.
(7) RECOVER Utility
• RECOVER TABLESPACE restores tablespaces to a current or previous state. First it checks which image copies are available for that tablespace and checks the DB2 logs for interim changes. Then it applies the image copy data and the log data to the tablespace being recovered.
• RECOVER INDEX is used to recreate indexes from current data. Indexes are always recovered from the actual table data on which the index is created, and not from image copy or log data.
DB2 Logs: DB2 keeps a log of all changes made to tablespaces. All updates are recorded in the DB2 active log. When the active log becomes full, DB2 creates an archive log. There could be multiple archive logs created during application processing. All the information is stored in the DB2 Directory's SYSIBM.SYSLGRNG table.
(8) LOAD Utility: used to do bulk inserts into a table. It can add to or replace existing data.
(9) RUNSTATS: This utility collects statistical information for Tables, Tablespaces,
  33. 33. Partitions, Indexes and Columns in tables. It can be used to update all the information in the relevant catalog tables, or simply to generate all the information in the form of a report. It is this statistical data which is used by the DB2 optimiser to arrive at an optimal access path. It is good to run RUNSTATS after every LOAD. After running RUNSTATS, the plans and packages with static SQL have to be rebound so that their access paths are based on the recent statistics.
(10) REORG: Used to reorganize DB2 tablespaces and indexes, thereby improving the efficiency of access to those objects. REORG reclusters data and resets free space to the amount specified in the CREATE DDL.

Catalog Tables:
1. Syschecks: one row for each check constraint
2. Syscolumns: one row for every column of each table and view defined
3. Sysdatabase: one row for each database
4. Sysdbrm: one row for each DBRM bound into DB2 plans
5. Sysforeignkeys: contains information about all columns participating in foreign keys
6. Sysindexes: one row for every index
7. Syspackage: contains one row for each package
8. Syspackauth: contains the privileges held by DB2 users on packages
9. Syspackdep: cross-reference of DB2 objects on which each package is dependent
10. Syspacklist: contains the list of DB2 packages that have been bound into application plans
11. Syspackstmt: contains SQL statements for every DB2 package
12. Sysplan: contains one row for each plan
13. Sysplanauth: contains the plan (bind and execute) privileges held by DB2 users
14. Sysplandep: cross-reference of DB2 objects used by each plan
15. Sysrels: contains details of referential integrity constraints: the constraint name, parent table, child table, delete rule, etc.
16. Sysstmt: contains SQL statements for every plan
17. Sysstogroup
18. Syssynonyms
19. Systabauth: contains information on the table privileges held by DB2 users
20. Systablepart: contains information on tablespace partitions
21. Systables
22. Systablespace
23. Sysviewdep: contains a cross-reference of DB2 objects on which each view depends
24. Sysviews: contains one row for each view