How to purge or reduce size of APM database in Postgres?

Document ID : KB000048617
Last Modified Date : 27/02/2019
Show Technical Document Details
Introduction:
This is one of a series of articles on APM database maintenance.
Environment:
All supported APM releases.
Instructions:
  1. You can run vacuum manually to clean all tables (it can be easily run from PgAdmin).

    Connect to the postgres, right click on APM schema/database

    Figure 1

    Select Vacuum

    Figure 2

    Vacuum is automatic process, however, it runs on tables only when certain threshold is met. Default settings can be changed manually in C:\Program Files\CA APM\database\data or using PgAdmin

    Figure 3

    Figure 4
     
  2. Doing re-indexing may also help reduce the size of the database (it can be easily run from PgAdmin).

    Figure 5
     
  3. The issue could be related to database size growing fast due to large number of invalid defects. For testing purpose, change the defect retention settings (Incident Management Settings) from default of 7 days to 2 days, the EM should bring the disk usage further down overnight.

    Figure 6
     
  4. In the CEM console, ensure that the "Capture Comprehensive Defect Details" check box is not selected on the Setup > Domain page.

    Figure 7

    If you select this, the CEM > Incident Management > Defect details page can display additional data about query and post parameters and the request and response bodies. For testing purpose, disable it.
     
  5. Gradually reduce retention settings in CEM UI > Setup > Domain tab (reduce it about 20-30% every day).

    Figure 8

    From APM_Sizing_Performance_EN:

    "CA CEM Data Retention Considerations

    You can configure the length of time that various kinds of monitored user data are retained in the APM database. These settings are ultimately a business decision you base on the requirements at your organization. However, make retention setting decisions with an understanding of the capacity consequences.

    Increasing data retention increases the duration of daily statistics aggregation runs. This situation can result in the following performance issues:

    Retention settings have a direct impact on APM database disk space requirements. Periodically monitor available space on the database data disk. If the percentage of available space drops below 25 percent, possible remedial actions include:
     
    • More contention for resources with the Collector that runs the daily stats aggregation service.
       
    • Reduced Collector capacity.
       
    • If the APM database logs are not on separate disks, move the logs to separate disks.
       
    • If you plan to have the logs continue to share the data disk, delete old log files.
       
  6. Reduce retention periods
     
  7. Upgrade to a larger capacity disk subsystem.

    One common cause for a database bottleneck is that the resources available to the APM database, primarily memory, are inadequate for the data retention settings.

    Reducing the data retention period has temporary side effects on database maintenance tasks. The side effects include longer execution times and higher memory requirements for aggregation and cleanup tasks during the 24-hour period following a reduction in retention times.

    Wait from 24 to 72 hours before assessing the effect of a change on retention settings, so that all aggregation, cleanup, and maintenance tasks have run. If older data is being kept for historical purposes and not being included in ongoing analysis or reporting, consider backing up and archiving data. Coordinate these tasks with reduced retention periods."
     
  8. Check 'badfiles' directory in APM installation and delete files in there.
    By default, the data files sent to the Enterprise Manager from the TIMs are stored in this directory: cem/tess/badfiles
     
  9. Run useful queries to help you troubleshoot the issue.
     
    1. Run this query to find largest tables in DB:

      SELECT relname,relfilenode,relpages,(relpages*8/(1024*1024)) as disk_space_in_GB FROM pg_class ORDER BY relpages DESC;
       
    2. You can validate whether the tables were auto vacuumed/analyzed or not by running the following query in customer environment:

      SELECT relname, n_tup_ins, n_tup_upd, n_tup_del, n_tup_hot_upd, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze FROM pg_stat_user_tables;
       
    3. Run the below query to find out db size in MB:

      SELECT pg_database.datname,pg_size_pretty(pg_database_size(pg_database.datname)) AS size FROM pg_database;
Additional Information:
For Appmap database issues, see these knowledge documents:

https://comm.support.ca.com/kb/apm-database-is-growing-fast-too-many-appmapstates-tables-is-there-a-property-to-prune-those-tables/kb000004232

https://comm.support.ca.com/kb/unable-to-purge-appmapstates-tables/kb000044847

https://comm.support.ca.com/kb/how-can-we-optimize-our-apm-database-that-is-growing-larger-very-quickly/kb000016443