qos_processor won't start after upgrade to 9.0.2

Document ID : KB000124969
Last Modified Date : 01/02/2019
Show Technical Document Details
Issue:
After upgraded from 8.5.1 to 9.0.2, qos_processor won't start properly. From the loglevel 5 qos_processor probe the following error is reported prior to the probe failing to start:

Jan 18 12:09:33:155 [Qos Monitor Loader, qos_processor] QosProcessor Coordinator aborting: null
Jan 18 12:09:33:156 [Qos Monitor Loader, qos_processor] java.lang.NullPointerException
at com.nimsoft.slm.qos.processor.qosmessage.QosMonitor.<init>(QosMonitor.java:141)
at com.nimsoft.slm.qos.processor.qosmessage.dao.QosMonitorDao.loadQosMonitors(QosMonitorDao.java:108)

Deleting the probe, its directory and the directory contents then redeploying the probe does not correct this issue.
Environment:
UIM Server:  9.02
qos_processor:  9.02
Cause:
After successfully completing the 9.02 UIM Server and UMP upgrade installs, the primary hub robot was stopped to force a fail over to the HA backup hub that was still running the UIM 8.51 core probes (data_engine and qos_processor).  Due to a schema change in the UIM database this is not supported since one of the schema changes was to the creation of the value in the checksum field in the S_QOS_DATA table.  Prior to 9.02, the Qos, source, and target fields of the first instance of a QoS message were used to compute the checksum hash value,  In 9.02, the value of the origin field was added to the hash value.  This causes corruption in the S_QOS_DATA table.
Resolution:
The minimal preferred upgrade method is as follows:

1. Deactivate the HA probe on the primary HA backup hub. This is to prevent a fail over to the primary HA backup hub.
2. Run the 9.02 UIM installer
3. Run the 9.02 UMP installer
4. Upgrade the robot on the HA backup hub (7.97)
5. Put a copy of the <installPath>\Nimsoft\security\certificate.pem file from the primary hub on the HA backup hub
6. Add the following key to the robot.cfg file on the HA backup hub in the <controller> section:

cryptkey = <absolute path to the copy of>\certificate.pem

7. Restart the robot on the HA backup hub
8. Deploy the 7.97 hub to the HA backup hub
9. Deploy all the the remaining 9.02 core probes from the primary hub to the HA backup hub
10. Activate the HA probe on the primary HA backup hub.

At this point you should be able to fail over the the HA backup hub with no problems.