A table may contain a set of records that have become too old to be kept online; such records can be archived to reduce the amount of data stored.
I would like to archive this DB2 data.
Archiving data provides many benefits.
- Disk space is released for use by more recent user data.
- Storage costs are reduced because less space is used on the hardware.
- Maintenance of database objects becomes more efficient because utility processing times are faster.
- Finally, and most importantly, application performance improves because records can be accessed faster.
The benefits of archiving data should be viewed as a whole, since there is a ripple effect on the environment: when some applications run faster than before, others that had previously been delayed will run sooner or faster because more CPU is available.
RC/Extract is able to implement an archival process.
Prerequisite knowledge: This is an advanced topic, so readers should already be familiar with the normal RC/Extract extract process.
Which rows will be removed from the source table(s) for archiving?
The records to be removed are selected during the extract process. The source definition defines the specific records to be extracted and, optionally, later archived (removed from the source tables). If only a subset of the source records is required, the source definition must select them either with WHERE clauses or with the KEYFILE function.
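For example, an age-based archival selection in a source definition might use a WHERE clause like the one below. The ORDERS table and ORDER_DATE column are hypothetical names used purely for illustration; your own table and retention rule would differ.

```sql
-- Illustrative only: select rows older than seven years for archiving.
-- ORDERS and ORDER_DATE are hypothetical, not part of RC/Extract.
SELECT *
FROM   ORDERS
WHERE  ORDER_DATE < CURRENT DATE - 7 YEARS
```

In the source definition only the WHERE clause itself would be entered; rows matching it are the ones extracted and later deleted from the source table.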
Which RC/Extract functions carry out archival functions?
These functions are only available from the RC/Extract Command Center.
On the Source Definition: The "Z" line command "SINGLE STEP EXTRACT/DELETE/LOAD IN BATCH"
This function does the extract from the source tables, the delete from the source tables, and an optional load to the target definition tables. It is run from an existing source definition. The extract and the delete can be done using SQL or a CA fast utility such as CA Fast Unload for DB2 for z/OS and CA Fast Load for DB2 for z/OS.
On the Target Definition: The "P" line command "PURGE/DELETE ROWS USING SQL" or
The "Z" line command "PURGE/DELETE ROWS USING UTILITIES"
These functions are used on a TARGET definition to remove the extracted rows from the source tables, either using SQL or using a CA fast utility such as CA Fast Unload for DB2 for z/OS and CA Fast Load for DB2 for z/OS. They would be used after an extract has been carried out in the normal way, as a one-off removal from the source tables.
In this document I will examine the "SINGLE STEP EXTRACT/DELETE/LOAD IN BATCH" function in detail.
Create a source definition
The initial task is to create a new source definition specifying the tables to be extracted, as normal. More emphasis must now be placed on the selection of rows from the source tables, using WHERE clauses or the KEYFILE function in the source definition. You might carry out a test extract to ensure that the required rows have been selected. The rows extracted are the ones that will be removed from the source tables when the "Z" command is used.
Use the "Z" line command on the new source definition.
Enter the "Z" line command against the new source definition and specify the Extract Options as you would when doing a normal extract. Set up the extract options and continue using PF3. SQL or Fast Unload can be used. These options are the same as your test extract.
The next step is the Delete Options screen
View/Update Selection List ===> N
Data Deletion Method ===> S
Load Targets ===> N
On this screen the list of tables to have data archived can be viewed using the "Y" option for the selection list; it defaults to all the tables in the source definition. The deletion method (SQL or Utility) can be chosen here, as can whether or not to load the target tables now. You might not want to load the target tables yet, but instead hold the data in the extract object. When finished, press PF3 to go on to the next step.
SQL Delete Options, Commit Frequency OR RC/Extract Utilities Delete Options
If SQL was chosen as the deletion method, you can set the commit frequency for the processing. If Utility was chosen, a MODEL selection screen is displayed so that a JCL model can be chosen to generate the utility JCL cards. Note that the chosen MODEL must have the CA utilities activated; an IBM utility model is not used. After the commit frequency or the model is set, press PF3 to go on.
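Conceptually, the SQL deletion method has the net effect of the delete below, with RC/Extract issuing intermediate COMMITs according to the chosen commit frequency so that locks and log use stay bounded. This is a sketch only; the table and column names are hypothetical, and the engine's internal processing may differ.

```sql
-- Illustrative sketch: the net effect of the SQL deletion method is a
-- delete of exactly the rows selected by the extract. RC/Extract commits
-- periodically per the chosen commit frequency; names are hypothetical.
DELETE FROM ORDERS
WHERE  ORDER_DATE < CURRENT DATE - 7 YEARS;
```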
The Target Definition
The next step is to choose the target objects that will optionally receive the archived data if the load step is chosen. This target definition works the same way as a normal target definition. Press PF3 when you have finished selecting the appropriate target tables.
Load Target Options
This step is displayed only if you chose to load the target tables during this process. Both SQL and Utility loads are offered, and you can select load and allocation options depending on the load type. For the Utility option (CA Fast Load) there are choices such as Load Resume/Replace; for SQL Load the options include "Empty Target Table Before Loading" and "Commit Frequency for Load Operation". Press PF3 to go on to the last step.
If you are using Fast Load to load a target table that already contains records, make sure the RESUME setting in the Load Target options is correctly set to YES or REPLACE, depending on what should happen to the rows already in the target tables. For the SQL option, the equivalent is the "Empty Target Table Before Loading" option.
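The RESUME/REPLACE distinction is the same as in the standard DB2 LOAD utility language, sketched below in IBM-style syntax for illustration. The actual CA Fast Load control cards are generated by RC/Extract from the chosen model and may differ in detail; ARCHIVE.ORDERS is a hypothetical table name.

```sql
-- Illustrative, IBM-style LOAD syntax; RC/Extract generates the real
-- CA Fast Load control cards. ARCHIVE.ORDERS is hypothetical.

-- RESUME YES appends the new rows to the data already in the target:
LOAD DATA RESUME YES
  INTO TABLE ARCHIVE.ORDERS

-- REPLACE discards the existing contents and loads only the new rows:
LOAD DATA REPLACE
  INTO TABLE ARCHIVE.ORDERS
```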
Submit generated JCL.
The JCL is now ready to be generated to do the EXTRACT/DELETE/LOAD. The Batch JCL Specification is shown, as with any other Batch Processor submission. Depending on whether SQL or utility options were chosen above, the processing is handled by the RCX engine using SQL, or by Fast Unload and/or Fast Load.
The RCX engine writes the data rows into the extract object and then removes the rows to be archived from the source tables. If Utility was selected, RC/Extract generates all the Fast Unload and Fast Load statements needed to unload all the tables (if more than one) from the tablespace and load the required records. If only a subset of rows was selected from a source table during the extract, only those rows are removed from that source table. If SQL is used, the RC/Extract engine handles everything with its own SQL process; no externalized SQL statements are generated.
When ready, the JCL can be submitted. The JCL is generated in a single member and is made up of a number of Batch Processor sync points. Under some conditions a restart can be performed at one of the sync points, as with any other Batch Processor job.
The records that have been removed from the source tables are placed in an extract object in the normal way. This means they can be loaded to another target table, as discussed above. It also means that if a new target definition is later created against this extract object, with the original source table or tables as its target, the same rows can be loaded back to their original source tables if required.