Host Encyclopedia Subset ID Management

Document ID : KB000027689
Last Modified Date : 14/02/2018

Summary:

This article discusses how to manage subset IDs used by the Gen Host Encyclopedia: what they are, how they are used, and how to avoid running out. Note that this article applies only to Host Encyclopedias. The Client Server Encyclopedia does not have the same limitations with regard to subset IDs.

We are all familiar with the 10-digit Object ID used by Gen to keep track of objects within models in the Host Encyclopedia. The Object ID is actually a number derived from two separate values: the Subset ID and a Range Number. A maximum of 8191 unique subset IDs can be allocated for a Host Encyclopedia. A subset ID is either allocated explicitly, as a row within the DSUBID table, or used implicitly, when an object ID exists in the Host Encyclopedia but no corresponding row exists for that subset ID within the DSUBID table. The Range Number is a sequential number from 1 to 262,143; its upper bound is the maximum number of objects that can be created within a particular subset ID. Because an object ID (column OBJ_ID) in the DOBJ table may exist only once, unique object IDs are allocated using an algorithm that includes a unique subset ID.
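
As a rough illustration of these limits, the following Python sketch multiplies the two figures quoted above. The constant and function names are illustrative only and are not part of Gen, and the sketch does not reproduce the algorithm Gen actually uses to derive an object ID.

```python
# Limits quoted in this article for a Host Encyclopedia.
MAX_SUBSET_IDS = 8191        # maximum number of unique subset IDs
MAX_RANGE_NUMBER = 262_143   # range numbers run from 1 to 262,143


def theoretical_object_capacity() -> int:
    """Upper bound on objects if every subset ID's range were completely filled."""
    return MAX_SUBSET_IDS * MAX_RANGE_NUMBER


print(f"{theoretical_object_capacity():,}")  # 2,147,213,313
```

This ceiling is reached only if every range is filled completely; a subset ID that holds a single object still counts as one of the 8191.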

To understand how critical subset IDs are to the Host Encyclopedia, recall that a model is defined by a special subset called "ALL". Large models, over 262,143 objects, consist of multiple subsets chained together and may be referred to for convenience by the "ALL" designation. In addition, when a Subset Definition is created, it is assigned an explicit subset ID.

 

Instructions:

Detecting a Subset ID Shortage

The problem that can arise from inefficient use of subset IDs can best be seen with the following example. Suppose that a model exists on the Host Encyclopedia and that a Subset Definition is created for use by a developer. The definition is assigned one of the available subset IDs. The subset is then checked out to the developer's workstation, where a single Entity Type is created. The new Entity Type is assigned a unique object ID based on the assigned subset ID and the next available sequential range number (the number 000001, in this example). Now suppose that the developer checks in the subset. The new Entity Type object is added to the model with the object ID that was assigned to it at the workstation. Finally, suppose that the developer now deletes the Subset Definition. At this point, because an object was created with an object ID that falls within the range belonging to the subset ID assigned to the definition, the subset ID is no longer available for reuse. When this happens, not only has the ability to define an additional 262,142 objects been lost for that model but, more importantly, the subset ID and its corresponding object ID range are no longer available for use by any other model.

To appreciate the situation this can cause, you need only consider the extreme case. Suppose a single model consists of 100,000 objects. Suppose every time a Subset Definition is created, it is used to create a single new object and then deleted. By the time the model had grown to 108,190 objects, all of the available subset IDs would be used. The result would be a virtual shutdown of the Host Encyclopedia. Any activity that would require a new subset ID, such as Model Copy or Generate New Model, would fail with the following error:

GETID: ALL SUBSET IDENTIFIERS ARE IN USE ONLOC=FNDGAP

Version Control functions use the subset ID assigned to the model itself (subset "ALL"). However, Migration and Un-adoption would fail with the above error because these Version Control functions must reserve additional subset IDs during initialization and thus require available subset IDs to process.
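
The arithmetic behind the extreme case above can be sketched in Python as follows. The function name is hypothetical, and the assumption that exactly one subset ID is already held by the model's own "ALL" subset is inferred from the 108,190 figure rather than stated explicitly in this article.

```python
MAX_SUBSET_IDS = 8191  # limit quoted in this article


def objects_at_exhaustion(initial_objects: int, ids_already_used: int = 1) -> int:
    """Model size at which the last subset ID is consumed.

    Assumes every new Subset Definition adds exactly one object and is then
    deleted, so each new object permanently consumes one subset ID.
    """
    remaining_ids = MAX_SUBSET_IDS - ids_already_used
    return initial_objects + remaining_ids


print(objects_at_exhaustion(100_000))  # 108190
```

In other words, only 8,190 single-object subsets separate a healthy encyclopedia from one that can no longer allocate a subset ID.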

Making a subset ID available for reuse requires that all object usages of that subset ID be removed from the encyclopedia. To be reused, a subset ID must not exist in the DSUBID table and must not have object references in the encyclopedia tables (DOBJ, DPRP, DTXT, DASC). In addition, if a model has been used to create a family model, whether by creating a model from a subset, by copying the model, or by processing the model with Version Control Migration or Adoption, the objects in the family model contain object ancestry references from the parent model. Object ancestry, stored as Original Object ID in the DOBJ table, prevents reuse of the corresponding subset ID. Models that have been backed up using the Host Model Management Backup Utility also prevent subset ID reuse because backed up models contain objects associated with a corresponding subset ID.

The first step is to analyze current subset ID usage to find out how subsets are being used and how many subset IDs are available. To do this, you need to understand the different types of subset ID usage:

  • Explicit: A subset ID that exists as a row in the DSUBID table with a valid model ID (zero or greater) is assigned to an active model. It is referred to as an Explicit Subset ID.
  • Implicit: Ancestry Implicit subset IDs are those for which object identifiers within the subset's available range exist in the encyclopedia only as object ancestry, stored in DOBJ column OBJ_ORG_ID (also known as the Original Object ID).
  • Reserved: Reserved Subset IDs are assigned during normal encyclopedia processing such as model check out and in conjunction with activities within Version Control. Reserved Subset IDs are used for extending the current subset object range once it has been filled (262,143 objects created).
  • Backup: This subset ID usage can be attributed to those subset IDs that correspond to backup models referenced in the DCOPY/DCPYUS table only.

Determining how many Explicit, Implicit, Reserved, and Backup Subset IDs exist, and the total number of subset IDs used and available, can be accomplished by running the foreground command "TSO %TIUSSID". This command generates batch JCL that can be used to produce the ENCYCLOPEDIA SUBSET USAGE REPORT. At the end of the report, the totals for subset identifiers are displayed by usage type as follows:

  • Total # of subsets with no objects associated. These subsets should be deleted if no longer needed. Each one of these will recover a subset ID, for reuse, if deleted.
  • Total # of explicit subsets. This presents the number of rows in DSUBID table where column S_MODEL_ID > 0.
  • Total # of implicit subsets. This presents the number of subset IDs where no row exists in the DSUBID table, but where there are ancestry implicit object IDs or objects in the encyclopedia tables that correspond to a subset ID.
  • Total # of reserved subsets. This presents the number of rows in the DSUBID table where column S_MODEL_ID < 0.
  • Total # of object subsets defined in backups only. This presents the number of subset IDs referenced in the DCOPY/DCPYUS table only.
  • Grand total # of subsets used.
  • Total # of subsets available.
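
The usage types and report totals described above can be summarized as a small decision function. This Python sketch is illustrative only: the function and parameter names are hypothetical, it is not how TIUSSID is implemented, and it treats a model ID of zero as explicit (following the definition of Explicit above, although the report total is described as counting S_MODEL_ID > 0).

```python
from typing import Optional


def classify_subset_id(s_model_id: Optional[int],
                       has_object_refs: bool,
                       in_backup_tables: bool) -> str:
    """Classify one subset ID by usage type.

    s_model_id       -- S_MODEL_ID of its DSUBID row, or None if no row exists
    has_object_refs  -- True if objects or ancestry (OBJ_ORG_ID) fall in its range
    in_backup_tables -- True if it is referenced in DCOPY/DCPYUS
    """
    if s_model_id is not None:
        # A DSUBID row exists: a negative model ID marks a reserved subset ID.
        return "reserved" if s_model_id < 0 else "explicit"
    if has_object_refs:
        # No DSUBID row, but objects or ancestry still use the range.
        return "implicit"
    if in_backup_tables:
        # Referenced only by backup data.
        return "backup"
    return "available"


print(classify_subset_id(None, True, False))  # implicit
```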

Correcting a Subset ID Shortage

Once current subset usage and availability have been ascertained, the appropriate action can be taken. If all of the encyclopedia's subset IDs have been used, or if the encyclopedia is close to its limit, the next step is to recover as many subset IDs as possible. Focus on Reserved Subset IDs first. If a significant number of subset IDs are reported as reserved, they can be recovered by running the encyclopedia orphan cleanup utility (UCLEN) job. Normally, after UCLEN has been run, the total number of reserved subset IDs should be less than one hundred. Running UCLEN will also remove orphan objects, which may result in the recovery of additional subset IDs for reuse. Orphan objects are those objects that reference a model ID with no corresponding row in the DMDL table. Since orphan objects naturally accumulate during encyclopedia activity over time, running UCLEN could result in significant subset ID recovery. It is recommended that the UCLEN utility be run on a regular, scheduled basis.

To recover Explicit Subset IDs, it is recommended that unnecessary models and subsets be deleted. Each subset that the Encyclopedia Subset Usage Report (TIUSSID) lists as having no objects in the encyclopedia should free one subset ID for reuse when it is deleted. Another option is to break out some of the models by moving them into a new encyclopedia. Model Copy can also be used to increase the number of subset IDs available by compressing the object IDs within large models into a smaller number of subset IDs within subset "ALL". Note that merely copying a model may not result in any net increase in the number of available IDs; therefore, it cannot be used as a primary means of recovering subsets. The Model Copy technique involves copying each model one at a time, deleting the original model, and then renaming the new model to the original name. The model copy option that copies the model without Subset Definitions is preferred because it has the potential to free the most subset IDs. If any Subset Definitions must be copied with the model, delete unnecessary Subset Definitions before running the model copy. When new "ALL" subsets are created by model copy, the object IDs in the model are filled sequentially within the subset range. The drawback, for subset ID recovery, is that even though the object IDs are allocated sequentially, they retain their ancestry (original object ID) from the source model. If you try this approach, you might start by copying only the largest models. The following SQL can be used to report models sorted by number of objects:

SELECT OBJ_MODEL_ID, COUNT(*), MODEL_NAME
FROM owner.DOBJ, owner.DMDL
WHERE OBJ_MODEL_ID = MODEL_ID
GROUP BY OBJ_MODEL_ID, MODEL_NAME
ORDER BY 2;
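
To judge whether copying a given model is likely to help, it can be useful to compare the number of subset IDs the model occupies today with the fewest it could occupy after compression. The following Python sketch computes that lower bound from an object count such as the one returned by the query above. The function name is hypothetical, and the estimate ignores subset IDs held by ancestry or by Subset Definitions.

```python
MAX_RANGE_NUMBER = 262_143  # objects per subset ID, as quoted in this article


def min_subset_ids_for_model(object_count: int) -> int:
    """Fewest subset IDs that could hold a model's objects when each
    range is filled sequentially (every model needs at least one)."""
    if object_count < 0:
        raise ValueError("object_count must not be negative")
    # Integer ceiling division avoids floating-point rounding.
    needed = (object_count + MAX_RANGE_NUMBER - 1) // MAX_RANGE_NUMBER
    return max(1, needed)


print(min_subset_ids_for_model(600_000))  # 3
```

A model whose objects are spread across many more subset IDs than this lower bound is a better candidate for Model Copy than one that is already close to it.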

Removing unnecessary model backups is another task that can increase the number of subset IDs available for reuse. This involves running SQL against the DCOPY/DCPYUS encyclopedia tables to determine if any models have been backed up using the Model Management Backup Utility. Any subset ID referenced in these tables cannot be reused even if the model no longer exists in the encyclopedia. The following SQL can be run to report any backups that may exist:

SELECT COPY_MODEL_NAME, COPY_MODEL_ID, COPY_DATE, COPY_TIME
FROM owner.DCOPY;

If any backed up models are identified by the above SQL, and they are no longer needed, the following SQL can be used to delete DCOPY/DCPYUS rows that correspond to any unneeded backups:

FIRST:   DELETE
         FROM owner.DCPYUS
         WHERE CPYUS_MODEL_ID = the copy_model_id from the SQL SELECT output above
           AND CPYUS_DATE = the copy_date from the SQL SELECT output above
           AND CPYUS_TIME = the copy_time from the SQL SELECT output above
SECOND:  DELETE
         FROM owner.DCOPY
         WHERE COPY_MODEL_ID = the copy_model_id from the SQL SELECT output above
           AND COPY_DATE = the copy_date from the SQL SELECT output above
           AND COPY_TIME = the copy_time from the SQL SELECT output above

Warning: Once data has been deleted from DCOPY/DCPYUS, the model backup associated with that data can never be restored to the encyclopedia.
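
If several backups must be removed, the two statements can be generated rather than typed by hand. The following Python sketch builds them in the order given above (DCPYUS first, then DCOPY) with parameter markers in place of the values from the SELECT output. The function name is hypothetical; how the statements are executed and how the date and time values are bound depend on the SQL tool in use and are not covered here.

```python
from typing import List


def build_backup_deletes(owner: str) -> List[str]:
    """Return the two DELETE statements in the order they must be run.

    Each statement expects three bound values, in this order:
    copy_model_id, copy_date, copy_time (from the SELECT on DCOPY).
    """
    return [
        f"DELETE FROM {owner}.DCPYUS "
        "WHERE CPYUS_MODEL_ID = ? AND CPYUS_DATE = ? AND CPYUS_TIME = ?",
        f"DELETE FROM {owner}.DCOPY "
        "WHERE COPY_MODEL_ID = ? AND COPY_DATE = ? AND COPY_TIME = ?",
    ]


for statement in build_backup_deletes("owner"):
    print(statement)
```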

The remaining obstacle to subset reuse may require running the TIUSIDU utility. Beginning in Gen v6.5, the TIUSIDU utility was provided to recover subset IDs that are used implicitly by ancestry only (Original Object IDs). This utility allows ancestry implicit subset IDs to be made available for reuse. If the TIUSSID: ENCYCLOPEDIA SUBSET USAGE REPORT indicates that a significant percentage of subset ID usage can be attributed to ancestry implicit subset IDs, and if the host encyclopedia is running critically short of subset identifiers, TIUSIDU can be very useful. This utility updates original object IDs that refer to non-existent objects so that they reference an existing object. It therefore makes it possible to recover ancestry implicit subset IDs while preserving ancestry. It is important to note that TIUSIDU uses extensive resources. For performance, an additional index is needed on table DSETDF. The other methods and techniques for recovering subset IDs discussed in this article should be performed before working with the TIUSIDU utility.

Foreground command TSO %TIUSIDU will generate batch JCL that can be used to run the utility. Report mode or update mode can be selected. All models and subsets must be checked into the encyclopedia prior to running this utility in update mode. The default report mode does not have this requirement. Both a detail and a summary report can be produced.

Preventing a Subset ID Shortage

Thus far, this article has addressed detection and correction of the problems relating to a limited number of subset IDs. Once corrected, however, the problem must be prevented from occurring in the future. There is no ultimate solution. Utilization of subset IDs is a normal by-product of Gen development. There currently are no plans to remove the subset limitation from the Host Encyclopedia. You can, however, take steps to minimize the rate at which subset IDs are created.

These steps involve relearning how to use Subset Definitions. Most of us create subsets as needed and then delete them rather than modifying and reusing them. This methodology, however, has led to the problem at hand. What is needed is a new paradigm for managing subsets: one that treats Subset Definitions as a project resource, to be managed in the same way that time and cost are managed, and one that promotes frugality of subsets over ease of use. Simply put, Subset Definitions need to be continuously reused until the maximum number of objects has been created, and only then should the definition be deleted. This means not only that developers will need to modify existing definitions to meet their changing needs, but also that definitions will have to be shared among developers within the same project. Developers should be limited as to the number of definitions they are allowed to create (one being the ideal). Data Base Administrators performing Technical Design need to use one of the existing Subset Definitions rather than being provided with one for their own use.

In conjunction with the new subset paradigm, a new Model Management methodology may be required, where model copies are not used for creating model backups and the Gen Backup/Restore utility function is used only in the most business-critical situations. Model backups and archiving should be done by performing a DOWNLOAD with the UPLOAD option specified. The resulting userid.IEF.TRAN file should be saved for possible future upload.

Finally, Model Owners need to monitor the creation and usage of Subset Definitions associated with their models. In addition, Encyclopedia Administrators need to monitor overall subset creation and usage by periodically running the Encyclopedia Subset Usage Report (TIUSSID), so that they know in advance when the number of available subset IDs is becoming a problem.
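
One simple way to act on the report is to compare its "Grand total # of subsets used" figure with the 8191 limit each time the report is run. The following Python sketch shows the idea. The function name and the 80 percent warning threshold are assumptions chosen for illustration, not values recommended by Gen.

```python
from typing import Tuple

MAX_SUBSET_IDS = 8191  # limit quoted in this article


def subset_id_headroom(used: int, warn_fraction: float = 0.80) -> Tuple[int, bool]:
    """Return (available subset IDs, True if usage has reached the warning level).

    used          -- "Grand total # of subsets used" from the TIUSSID report
    warn_fraction -- hypothetical threshold; tune it to local practice
    """
    if not 0 <= used <= MAX_SUBSET_IDS:
        raise ValueError("used must be between 0 and 8191")
    available = MAX_SUBSET_IDS - used
    needs_attention = used >= warn_fraction * MAX_SUBSET_IDS
    return available, needs_attention


print(subset_id_headroom(7500))  # (691, True)
```

Recording the result of each run makes it possible to see how quickly subset IDs are being consumed, not just how many remain.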

 

Additional Information:

In summary, we must modify our view of the Host Encyclopedia in general, and the use of subsets in particular. Instead of viewing the encyclopedia as a collection of objects whose 10-digit unique IDs give a theoretical maximum number of objects that can be created, we must view the encyclopedia as a collection of objects that exist within a set of subsets. Not only does this view yield a smaller theoretical limit on the number of objects that can be created but, depending on how efficiently subsets are used, the practical limit can be significantly lower still. A new paradigm must be developed around the concept of subsets as a project resource that must be monitored and controlled. The keywords for this paradigm must be to reuse and not abuse encyclopedia subsets.