What is 'regularized' data in eHealth?

Document ID : KB000017456
Last Modified Date : 14/02/2018


What is the difference between regularized data and traditional statistical data, raw or rolled up, within the eHealth Oracle Database?


Before the Cognos reporting tool that underlies eHealth reporting functionality was introduced, all reporting ran against the traditional statistical data present in the database.

The introduction of the Cognos reporting tool, with its report customization and configuration abilities, brought a requirement for a very specific data style.

Historically, all collected Statistics data was added to the database in raw, as-polled form, time stamped with the time the poll was received. Then, per the configured Statistics Rollup schedule, that data is rolled up over time from as-polled, to hourly, to daily data points. Eventually the data is aged out of the database based on that rollup schedule.

The Cognos reporting tools also require raw as-polled data sets to produce reports. However, instead of values time stamped with the time the poll was received, they require values in a regularized format that is time aligned on known bounds.

To meet this requirement, regularized data and its related views and tables were added to the existing eHealth Oracle database. These tables present the data to Cognos in clear five-minute intervals, giving the data time stamps a consistent and regular interval. eHealth legacy reports perform the same alignment, but the report engine does it at report run time.
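The idea of time aligning as-polled samples on five-minute bounds can be sketched in a few lines of Python. This is only an illustration, not the eHealth implementation: the function names are hypothetical, the time stamps are plain epoch seconds, and averaging the values that fall into the same interval is an assumption made for the example, since this article does not describe how eHealth combines them.

```python
from collections import defaultdict

INTERVAL_SECONDS = 300  # the five-minute interval described above


def align_down(epoch_seconds, interval=INTERVAL_SECONDS):
    """Return the start of the interval that contains epoch_seconds."""
    return epoch_seconds - (epoch_seconds % interval)


def regularize(samples, interval=INTERVAL_SECONDS):
    """Group (timestamp, value) pairs by aligned interval start.

    Averaging the values in each interval is an assumption for this
    sketch; the real regularization logic is not documented here.
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        buckets[align_down(timestamp, interval)].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


# As-polled samples arrive at irregular times (epoch seconds, value).
polls = [(1000, 10.0), (1100, 14.0), (1290, 20.0), (1510, 30.0)]
aligned = regularize(polls)
print(aligned)  # {900: 12.0, 1200: 20.0, 1500: 30.0}
```

Every key in the result is a multiple of 300 seconds, which is the property that gives a reporting tool a consistent, regular interval to work with.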

When polled data is returned to eHealth, eHealth first places it in STSample files in $NH_HOME/tmp, from which it is added to the Oracle DB as traditional Statistics data.

Before Cognos was added to the system, the STSample files were discarded after the raw as-polled data was written to the DB. Now the files are left in place for the regularized data insertion process, which reads them, inserts the data into the DB, and then deletes the files from the system.

The result is an Oracle DB that contains two sets of tables with effectively the same data: one for traditional Statistics data and one for the regularized data required by Cognos. The regularized data is compressed with Oracle compression and uses about 50% less space than the legacy Statistics data.

How this data is rolled up, how often, and how long the data is kept around in the DB at each stage of its life, is different for each data set.

Traditional Statistics data is stored and rolled up out of the database on the following schedule:

As-Polled: stored for 2 days
1-Hour Samples: stored for 6 weeks
1-Day Samples: stored for 70 weeks

These settings are stored in the configuration of the default Statistics Rollup job. To see how your system is configured to store the data, review the Statistics Rollup job configuration in the eHealth OCE UI: in the 'Tasks and Information' area, open 'Scheduled Job Management' and select the 'Scheduled Jobs' option. Use this view to change the scheduled data storage values. The same view can also be used to stop data rollup activity, but doing so is not recommended: the database will quickly grow in size, which degrades eHealth performance and may bring the eHealth system down if the server's disks fill up.

Disabling the traditional Statistics Rollup job also increases the resulting backup size. If you plan to take this step, it is highly recommended that you first run the pending change through the eHealth sizing wizard to ensure sufficient disk space is available. If it is not, the disk may quickly fill up and cause the Oracle DB to go down, potentially leaving the production eHealth environment down until disk space is made available.

By default, regularized data is stored and rolled up out of the database on the same schedule as the legacy Statistics data:

Fast granularity: 60 seconds
Base granularity: 5 minutes
Hourly granularity: 60 minutes
Base duration: 2 days
Hourly duration: 6 weeks
Daily duration: 70 weeks

This information is obtained by running the following command on the eHealth server:

nhModifyTimeAlignSettings -list

The same command is used to modify the current default settings if needed.

To see the full usage options, which show what changes are possible, run the command as follows:

nhModifyTimeAlignSettings -h
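If the settings need to be captured from a script, for example to record them before a change, the command can be wrapped as in the following Python sketch. The wrapper function name is hypothetical, only the two options mentioned in this article (-list and -h) are allowed, and the sketch assumes the command is on the PATH of the user running it. The output is returned as raw text because its exact format is not documented here.

```python
import shutil
import subprocess

ALLOWED_FLAGS = ("-list", "-h")  # the only options this article mentions


def run_time_align_settings(flag="-list", executable="nhModifyTimeAlignSettings"):
    """Run the command with one allowed flag and return its text output.

    Returns None when the executable cannot be found on PATH, which is
    the expected result on any machine that is not an eHealth server.
    """
    if flag not in ALLOWED_FLAGS:
        raise ValueError(f"unsupported flag: {flag!r}")
    if shutil.which(executable) is None:
        return None
    completed = subprocess.run(
        [executable, flag],
        capture_output=True,
        text=True,
        check=False,  # do not raise; usage output may come with a nonzero exit code
    )
    # Return both streams, since help text may be written to stderr.
    return completed.stdout + completed.stderr
```

Calling run_time_align_settings("-list") on the eHealth server returns the current settings as text, and run_time_align_settings("-h") returns the usage text.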

Note that the two sets of rollup schedule settings, one for the traditional Statistics data and one for the regularized data, act independently of each other by design. Changing the settings for one does not change or impact the other in any way.
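To put the default regularized settings listed earlier in perspective, the following Python sketch works out how many data points each stage holds for a single polled value. It is arithmetic on the listed defaults only. The one-day granularity for the daily stage is an assumption, since the listing gives only a daily duration, and the fast granularity is left out because no matching duration is listed.

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_WEEK = 7 * SECONDS_PER_DAY

# Defaults copied from the regularized data listing in this article.
BASE_GRANULARITY = 5 * 60            # 5 minutes
HOURLY_GRANULARITY = 60 * 60         # 60 minutes
DAILY_GRANULARITY = SECONDS_PER_DAY  # assumption: one point per day
BASE_DURATION = 2 * SECONDS_PER_DAY
HOURLY_DURATION = 6 * SECONDS_PER_WEEK
DAILY_DURATION = 70 * SECONDS_PER_WEEK


def points_retained(duration, granularity):
    """Number of data points one stage holds for a single polled value."""
    return duration // granularity


base_points = points_retained(BASE_DURATION, BASE_GRANULARITY)
hourly_points = points_retained(HOURLY_DURATION, HOURLY_GRANULARITY)
daily_points = points_retained(DAILY_DURATION, DAILY_GRANULARITY)

print(base_points, hourly_points, daily_points)  # 576 1008 490
```

Each stage is treated on its own here. Whether the durations overlap or follow one another is not stated in this article, so the figures are a per-stage estimate and not a total row count.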

The regularized data is also used for:

  • Integrations with other CA products such as CAPC

  • Performance Reporting

  • Live reporting

The Performance and Live Reporting options are accessed via the eHealth reporting web site. Both are found within the "Reports & Live Reports" tab.

Note that the views created against the regularized data provide better performance than those against the legacy Statistics data and are the ones CA suggests using. They are described in the eHealth Admin guide.