Authored by Peter Duray-Bito
Flawless COBOL copybook documentation
One of the big challenges in taking in data files is mapping mainframe COBOL copybooks to the local
data warehouse. Copybooks often contain cryptic field names and inconsistent OCCURS and REDEFINES
segments, and they lack any sort of field start and end positions.
There are a few commercial mainframe copybook wizard programs, such as TextPipe, but the task is
small enough to be handled in Microsoft Excel. Furthermore, Microsoft Access can be used to import
data files using specifications created in Excel, which verifies the accuracy of the file mapping.
Import to Excel
Export the copybook from the mainframe as a .txt file and open it in Excel.
1. Visually examine each line to determine if it’s a true data field or an OCCURS, REDEFINES or
some other sort of artifact that does not truly take up field space.
2. Delete all these extraneous lines and you should end up with lines that add up to the LRECL
(record length) of the file.
3. Using Excel’s Text to Columns (under the Data menu) selection, separate the Field Name from
the PIC clause.
4. While it’s tempting to continue to use Text to Columns to neatly break up the PIC clause, implied
decimals (e.g. PIC 9(3)V9(2)) make this a cantankerous task.
5. Instead, look at each PIC clause, work out the number of characters the field occupies, and
enter it in a new column, let’s call it “Source Field Length”, to the right of the PIC clause.
The V marks an implied decimal point and takes up no space, so PIC 9(3)V9(2) occupies 5 characters.
6. When completed, highlighting the Source Field Length column should give you a Sum on Excel’s
Status Bar (lower right) that equals LRECL.
7. From here you can create additional columns such as “Source Field Start Position” and build a
formula to increment the starting position:
a. For the first line, enter the start position as 1.
b. For the next line, add the value of the prior line’s Source Field Length to the prior line’s
Source Field Start Position (a start of 1 and a length of 3 give a next start of 4).
c. Drag the fill handle at the lower right of the cell to copy the formula down to the last
line. You now have the starting position for each field.
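The length and start-position steps above can also be sketched in code, as a cross-check on the hand-entered numbers. The Python below is a minimal illustration and not part of the Excel workflow. It assumes every field is stored as plain display characters (no COMP or COMP-3 usage, no separate sign, no P scaling), so that S and V occupy no space. The function names and the sample field names are hypothetical.

```python
import re

def pic_length(pic):
    """Return the character count of a display-usage picture string.

    Assumption: no COMP/COMP-3 usage and no separate sign, so S (sign)
    and V (implied decimal) take up no space in the record.
    """
    pic = pic.upper().strip().rstrip(".")
    # Expand repeat counts: 9(3) -> 999, X(5) -> XXXXX
    expanded = re.sub(r"(.)\((\d+)\)",
                      lambda m: m.group(1) * int(m.group(2)), pic)
    # Count every picture character except S and V
    return sum(1 for ch in expanded if ch not in "SV")

def layout(fields):
    """Turn (name, picture) pairs into (name, start, length) rows.

    Start positions are 1-based; each start is the prior start plus
    the prior length, the same rule as the spreadsheet formula.
    """
    rows = []
    start = 1
    for name, pic in fields:
        length = pic_length(pic)
        rows.append((name, start, length))
        start += length
    return rows

# Hypothetical copybook fields, after removing non-data lines
fields = [("CUST-ID", "9(3)"), ("CUST-NAME", "X(5)"), ("BALANCE", "S9(5)V99")]
rows = layout(fields)
total = sum(length for _, _, length in rows)
print(rows)   # [('CUST-ID', 1, 3), ('CUST-NAME', 4, 5), ('BALANCE', 9, 7)]
print(total)  # 15, which should equal the LRECL of the file
```

If the total does not match the LRECL, either a non-data line was left in or a field uses a storage format that this sketch does not handle.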
Create Access Import Specification table
Create another Worksheet in the same workbook that contains the work completed above.
1. Access Import Specifications contain four columns:
a. Field Name
b. Data Type
c. Start
d. Width
2. Enter the four names above as column names in your new Worksheet.
3. Using the = formula, refer to the Field Name in the first Worksheet under the Field Name column
and copy down all your field names.
4. For the purposes of documentation, you can set every Data Type to Text. This will simplify your
first file import because all we’re doing at this point is making sure the copybook is accurate.
5. For the Start column, use the = formula to refer to the Source Field Start Position column in the
first Worksheet.
6. For the Width column, use the = formula to refer to the Source Field Length column.
7. You should have something that looks like:
Field Name   Data Type   Start   Width
FIELD1       Text        1       3
FIELD2       Text        4       5
FIELD3       Text        9       7
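The same four-column layout could also be produced as tab-delimited text, which Excel opens directly. The Python below is only a sketch of that idea. The function name is hypothetical, the rows reuse the example values from the table above, and whether text produced this way pastes into Access the same way as cells copied from Excel is an untested assumption.

```python
def import_spec_rows(layout_rows, data_type="Text"):
    """Build tab-delimited lines in the order Field Name, Data Type, Start, Width.

    layout_rows is a list of (name, start, width) tuples. Every field is
    given the same data type, matching the all-Text approach above.
    """
    lines = []
    for name, start, width in layout_rows:
        lines.append("\t".join([name, data_type, str(start), str(width)]))
    return "\n".join(lines)

# Example values taken from the table above
example_rows = [("FIELD1", 1, 3), ("FIELD2", 4, 5), ("FIELD3", 9, 7)]
spec_text = import_spec_rows(example_rows)
print(spec_text)
```

No header row is emitted, because only the data rows are needed for the specification.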
Import sample file into Access
Create a database in Access and import a sample data file.
1. Under the External Data menu, select Import text file.
2. Specify the data file and click through to the Import Text Wizard.
3. Click the Advanced… button (lower right).
4. In the Field Information section of the Import Specification screen, highlight the Field Name, Data
Type, Start and Width fields by holding down the Shift key and dragging the mouse pointer
across the columns.
5. Copy the columns from the Excel spreadsheet without the Header row (Field Name, Data Type,
Start and Width) and paste into the Access Field Information table (pasting data into this table
is undocumented in Access).
6. Click on Save As… and save the Import Specification.
7. Import the file.
You can now test the mapping accuracy of the copybook by visually examining the table in Access
as well as running queries against this data. At the same time, you can generate a published data
dictionary by enhancing the Excel spreadsheet with additional descriptive text (such as plain English field
names) and exporting to an Adobe PDF document.
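As a quick cross-check outside Access, the same Start and Width values can be used to slice a sample record directly. The Python below is a minimal sketch of that idea, not a replacement for the Access import. It assumes the data file has already been converted to plain text with one fixed-length record per line, and the function names and sample record are hypothetical.

```python
def slice_record(record, spec):
    """Split one fixed-width record into named fields.

    spec is a list of (name, start, width) tuples with 1-based starts,
    the same values as the Start and Width columns.
    """
    return {name: record[start - 1:start - 1 + width]
            for name, start, width in spec}

def widths_match_lrecl(spec, lrecl):
    """True when the field widths add up to the record length."""
    return sum(width for _, _, width in spec) == lrecl

# Spec values from the example table; the record itself is made up
spec = [("FIELD1", 1, 3), ("FIELD2", 4, 5), ("FIELD3", 9, 7)]
sample = "123ABCDE0012345"

fields = slice_record(sample, spec)
print(fields)  # {'FIELD1': '123', 'FIELD2': 'ABCDE', 'FIELD3': '0012345'}
print(widths_match_lrecl(spec, len(sample)))  # True
```

A field that shows letters where digits are expected, or values that straddle two columns, points to a wrong length somewhere above it in the copybook mapping.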