How to load comma delimited data using CA Fast Load?

Document ID : KB000025679
Last Modified Date : 14/02/2018
Show Technical Document Details

Description:

How do I load data into a DB2 table after transfer from my PC after export from a spreadsheet package like Microsoft Excel?

The ability to transfer table data to a DB2 table from the non mainframe environment allows that data to be used on the mainframe DB2 environment. It would also alleviate the need to enter the data manually. Such data may exist on PC databases or Spreadsheets.

Solution:

CA Fast Load for DB2 for z/OS (PFL) has a data input format called DELIMITED which is able to process data records which may be exported from programs such as spreadsheets. Once the data is exported from the PC program and loaded to the mainframe in sequential format with an appropriate file transfer program it can be loaded by Fast Load. Most mainframe emulator programs have such file transfer functions.

Spreadsheet programs can produce "comma delimited" as a valid format for exported data. This output data is then written into the correct columns separated by the column delimiter character. The file type on the PC is ".CSV". Microsoft Excel defines this file type as "Microsoft Office Excel Comma Separated Values File".

A comma delimited record simply means that the record is written with a delimiter or "separator" between data columns. A user can choose various characters to be a delimiter. The actual field may also be written with field delimiters around the data itself. Field delimiters are used in case there are characters in the data that could be misinterpreted as column delimiters but are needed as text in a field regardless, such as a comma after someone's name.

Another strategy is to use a column delimiter character that could not be misinterpreted as a column delimiter because it does not exist as data in any of the rows.
Eg:

'111111','JOE','BLOGG;S JR',1111.22

In the above example there are four columns each separated by a "," comma as a column delimiter and the data itself has quotes as field delimiters. The 2nd last column has a ";" semicolon inside the field delimiters and will be loaded with the rest of the field as such. The last column is a decimal.

Data Considerations

The character to be used as a column delimiter with the data? The character (if any) to be used as a field delimiter with the data? How should a NULL column be represented in the data record? Non-traditional data types like LOB's, BLOB's and CLOB's can't be loaded.

Illustration

Consider this code:
The INPUT-FORMAT is DELIMITED.

  1. The COMMA keyword specifies that a "," will be the delimiter for columns. The normal default is a "," comma.

  2. The input data is treated as EXTERNAL data so columns with field delimiters are treated as character data and those without are treated as numeric data. Numeric data should not have any decimal points embedded.
    FASTLOAD
    COPY-BUFFERS 10
    SPACE-DEFN NO
    DISPLAY-STATUS 10000
    ESTIMATED-ROWS 2000000
    INPUT-FORMAT DELIMITED    <---------- delimited format input data
    QUOTE NONE                <---------- do not use any field delimiter
    COMMA ','                 <----------, is my column delimiter
    IO-BUFFERS 50
    MAXTASKS 1
    NONLEAF-PCTFREE LEAF
    OUTPUT-CONTROL ALL
    ENFORCE CONSTRAINTS
    UNLOAD CONTINUE
    RECLUSTER NO SORT-CLINDX
    SORTSIZE 4M
    SORTNUM 4
    STOP-LIMIT 1
    VSAM-BUFFERS 96
    RESUME YES
    INTO TABLE creator.TBEMP

Expected input

The above load cards would expect input records like this:

111111,JOE,BLOGG;S JR,1111.22

Microsoft Excel considerations

When a comma delimited file is produced with Microsoft Excel the user saves the file to a CSV file type which gives you comma separated columns and no field delimiters. There is no option for alternate column delimiters other than comma's. When decimal columns are saved to CSV format with the intention of loading to a DB2 table then the correct COLUMN FORMAT for decimal places should be observed. Rejects will be produced if the number of decimal places does not match.

Decimal Point considerations

Fast Load provides a DECPT keyword which may be used to signify a decimal point other than the one specified in DSNHDECP for the load data. The literal used must not be the same as a delimiter being used. Microsoft Excel has options to signify alternate decimal point characters in order to correctly output the data as a numeric value. In EXCEL look in TOOLS, OPTIONS, INTERNATIONAL, DECIMAL SEPARATOR.

File Transfer considerations

The file transfer method used by the user should enable the file to be translated from ASCII(or UNICODE) to EBCDIC. Check to make sure that the data transfer has performed the correct translation to EBCDIC. This will depend on the capability of the file transfer program. Read the file transfer instructions carefully. The aim is to have a pure text file as input to Fast Load including any numeric's.

Fast Unload considerations

If Fast Unload is used to create the delimited input file, the NULL-FIELD value must be defaulted or specified as EMPTY. Therefore when a NULL is encountered in the table data it will be represented as two consecutive column delimiters with no space between them.
Fast Unload is able to generate NULL columns with QUOTES around them but Fast Load will not recognise these as nulls. It loads them as a space and may cause rejects on referential integrity.

Eg: This represents two null columns

With NULL-FIELD EMPTY,,,
With NULL-FIELD QUOTES,'','',

OTHER RELEVANT INFORMATION

Documentation References
Fast Load User Guide, KEYWORD descriptions for COMMA, QUOTE, DECPT and INPUT-FORMAT DELIMITED.