Customize Data Repository Vertica Database backup log file storage location

Document ID : KB000032095
Last Modified Date : 14/02/2018

Description:

By default, the Vertica database backup log files are stored in the /tmp directory on the Data Repository server. This location can be changed to a custom directory.

Solution:

By default, the log files that Vertica creates during each backup run are saved to the /tmp directory.

This location can be changed, and the process to do so is fairly simple. The change is most often made when the /tmp directory cannot be allocated enough disk space to store these log files properly.

Before configuring a new location for the Vertica backup log files, we should consider where the new directory will reside and how much storage space to allocate to it. This is very important because the Vertica installation does not clean up these logs automatically, so it's possible that this location could fill up and cause issues with the Vertica backup processing.
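Because the logs are never cleaned up automatically, it can help to review periodically which files in the log location are old enough to remove. The following is a minimal sketch only, not part of the Vertica tooling: the function name and the 30-day threshold are hypothetical, and the script only lists candidates without deleting anything. Review the list before removing files, since the chosen directory may also hold files that are not backup logs.

```python
import os
import time


def list_stale_files(log_dir, max_age_days, now=None):
    """Return paths of files under log_dir last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    stale = []
    # os.walk also descends into subdirectories and yields nothing
    # if log_dir does not exist.
    for root, _dirs, files in os.walk(log_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.getmtime(path) < cutoff:
                stale.append(path)
    return sorted(stale)
```

For example, list_stale_files("/db_backup", 30) would return the files under the example custom directory used later in this article that have not been modified in the last 30 days.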

The Vertica 6.0 user documentation states the following regarding the use of a directory other than /tmp for backup log file storage:

Estimating Log File Disk Requirements

When you run the vbr.py --setupconfig command to create your configuration file (see Configuring Advanced VBR Parameters (page 453)), one of the parameters is tempDir. This parameter specifies the backup host location where vbr.py writes its log files, and some other temp files (of negligible size). The default location is the /tmp/vbr directory on each backup host. You can change the default by specifying a different path in the configuration file.

The temporary storage directory also contains local log files describing the progress, throughput, and any errors encountered for each node. Each time you run vbr.py, the script creates a separate log file, so the directory size increases over time, depending on how frequently you run the utility. The utility stores older log files in separate subdirectories, each named with a timestamp. When the vbr.py utility is executing, it updates the current log file each second with one line of information, so the log file size is proportional to the actual backup transfer size. As a rough estimate, a one-hour backup will populate the log file with 3600 lines, or approximately 100KB. Vertica recommends allotting 1GB disk space on each backup host for the vbr.py log files. The vbr.py log files are not removed automatically, so delete older log files manually as necessary.
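To make the sizing guidance concrete, the arithmetic can be sketched as below. This is a rough estimate only. It uses just the figures quoted above (one log line per second, about 100KB for 3600 lines, and a 1GB allotment), and the function names are hypothetical.

```python
import math

# Figures taken from the quoted Vertica guidance.
LINES_PER_SECOND = 1             # one log line written per second
KB_PER_3600_LINES = 100          # a one-hour backup logs roughly 100KB
RECOMMENDED_BUDGET_KB = 1024 * 1024  # 1GB allotted per backup host


def estimate_log_kb(backup_seconds):
    """Approximate log size in KB for one backup run of the given duration."""
    lines = backup_seconds * LINES_PER_SECOND
    return lines * KB_PER_3600_LINES / 3600


def runs_until_full(backup_seconds, budget_kb=RECOMMENDED_BUDGET_KB):
    """Approximate number of runs whose logs fit in the budget if nothing is deleted."""
    return math.floor(budget_kb / estimate_log_kb(backup_seconds))


print(estimate_log_kb(2 * 3600))   # 200.0 (KB for a two-hour backup)
print(runs_until_full(2 * 3600))   # 5242 runs fit in 1GB
```

The estimate covers only the vbr.py log files. If the custom directory is shared with other data, the space available for logs is correspondingly smaller.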

Once the new log location is set up correctly, and we have confirmed that the "dradmin" user is able to write to the custom directory that will house the log files, we need to make the change to the DB configuration for backups.
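One way to confirm that the directory is usable is a small check run as the dradmin user. The sketch below is an illustration, not a Vertica utility: the function name is hypothetical, and the 1GB default threshold simply reuses the Vertica recommendation quoted above.

```python
import os
import shutil
import tempfile


def check_log_dir(path, min_free_bytes=1024 ** 3):
    """Report whether path exists, is writable by the current user, and has enough free space."""
    result = {"exists": os.path.isdir(path), "writable": False, "enough_space": False}
    if not result["exists"]:
        return result
    try:
        # Creating (and automatically removing) a real file is a stronger
        # test than a permission-bit check.
        with tempfile.TemporaryFile(dir=path):
            result["writable"] = True
    except OSError:
        pass
    result["enough_space"] = shutil.disk_usage(path).free >= min_free_bytes
    return result
```

Running check_log_dir("/db_backup") while logged in as dradmin should report all three values as True before the backup configuration is changed. The path here is the example custom directory used later in this article.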

To do this, we re-run the /opt/vertica/bin/vbr.py script to create a new, or update the existing, backup configuration *.ini file.

NOTE: 
It is highly recommended that proper steps be taken to ensure the safety of the system before making these changes. This should include taking a snapshot or backup image of the DR server for use in restoring the system should it be necessary. It should also include running a backup of the database for use in a database restore should it be necessary. To take a backup of the Vertica DB, run the following command as the dradmin user:

$DR_HOME/vertica/bin/vbr.py --task backup --config-file <absolutePath_to_Vertica_backupConfig_ini_file> --showconfig

Note the use of the --showconfig flag. As seen in the example backup run below, it prints what is currently configured for the backup processing. This is useful when updating the configuration as it provides a readily available list of configuration settings that can be reused or changed as needed.

An example of this might look like the following output from the command line:

bash-3.2$ /opt/vertica/bin/vbr.py --task backup --config-file /DR_Data/backupConfig.ini --showconfig 
{'port_rsync': 50000, 'dryrun': False, 'hardLinkLocal': False, 'optNodes': [], 'verticaBinDir': '/opt/vertica/bin', 'restorePointLimit': 1, 'downNodes': [], 'verticaConfig': False, 'fullTopology': [], 'archiveSpecified': '', 'objects': None, 'dbOptions': '', 'configFileName': '/DR_Data/backupConfig.ini', 'nodeStates': {}, 'dbUser': 'dradmin', 'checksum': False, 'bwlimit': 0, 'overwrite': True, 'topology': [{'dbHost': '', 'dbDir': '', 'dbNode': 'v_xr1200x_node0001', 'backupHost': '127.0.0.1', 'backupDir': '/DR_Data/backups'}], 'dbPort': None, 'dbPassword': '<password>', 'encrypt': False, 'dbPromptForPassword': False, 'sqlCached': '', 'retryDelay': 1, 'logFileName': None, 'tmpDir': '/tmp', 'snapshotName': 'XR1200XBackup', 'debug': 0, 'retryCount': 2, 'dbInitiator': None, 'dbName': 'XR1200X', 'upNodes': [], 'copyCtx': 'backup'} 
Preparing... 
Found Database port: 5433 
Copying...

Note that in this example the tmpDir value shows that the log files are stored in /tmp. This is an example of a default backup configuration created after initial DR/Vertica installation.
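When the configuration line is long, the tmpDir value can be hard to spot by eye. As a minimal sketch, and assuming the output has the same shape as the example above (a single line holding a Python-style dictionary, followed by progress messages), the value can be extracted as follows. The function name is hypothetical, and other vbr.py versions may print the configuration differently.

```python
import ast


def parse_showconfig(output):
    """Return the settings dictionary from captured --showconfig output, or None if absent."""
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("{") and line.endswith("}"):
            # literal_eval only accepts literals, so no code in the text is executed.
            return ast.literal_eval(line)
    return None


# Shortened sample based on the example output above.
sample = """{'dbUser': 'dradmin', 'tmpDir': '/tmp', 'snapshotName': 'XR1200XBackup', 'logFileName': None}
Preparing...
Found Database port: 5433
Copying..."""

settings = parse_showconfig(sample)
print(settings["tmpDir"])  # /tmp
```

Keep in mind that the full output also contains the dbPassword value, so captured output should be handled with care.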

You must log in to the DR server as the dradmin or equivalent user to run the vbr.py script. To update the backup configuration, run the following command:

$DR_HOME/vertica/bin/vbr.py --setupconfig

Answer all necessary questions. The last question will be:

Change advanced settings? (n) [y/n]:

Answer yes ('y') to this question and new questions will be presented related to the advanced settings. Most of these values can and should be left as default as they are only used when directing the backup and its log files to be written to a different host than the DR itself. This is most commonly used for disaster recovery scenarios or Vertica configurations where the DB is spread out among more than one system.

The first question presented is the one we're focused on in this solution. It will be:

Temp directory (/tmp/vbr):

Enter the desired new location here by specifying the absolute path.

Once this is finished, the next backup run should store its log files in the specified custom location instead of the default /tmp location. To validate this, first examine the newly created *.ini file and confirm that the 'tempDir' value is set to the new custom directory.
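The check of the *.ini file can also be scripted. The sketch below searches every section for a tempDir key rather than relying on a particular section name. The function name is hypothetical, and the [Misc] section in the sample is an assumption made for illustration, so the actual file layout may differ.

```python
import configparser


def find_temp_dir(ini_text):
    """Return (section, value) for the first tempDir key found, or None."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names case-sensitive as written
    parser.read_string(ini_text)
    for section in parser.sections():
        for key, value in parser.items(section):
            if key.lower() == "tempdir":
                return section, value
    return None


# Hypothetical file contents; only the tempDir key name comes from this article.
sample_ini = """[Misc]
snapshotName = XR1200XBackup
tempDir = /db_backup
"""

print(find_temp_dir(sample_ini))  # ('Misc', '/db_backup')
```

To check a real configuration, read the file contents first, for example with open(path).read(), and pass the resulting text to find_temp_dir.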

Lastly, to test a backup run and validate that the correct directory is used, run a backup once more with the following command:

$DR_HOME/vertica/bin/vbr.py --task backup --config-file <absolutePath_to_Vertica_backupConfig_ini_file> --showconfig

In the resulting output, we should see the custom directory set in the tmpDir variable value, as in this example output:

bash-3.2$ /opt/vertica/bin/vbr.py --task backup --config-file /db_backup/backupConfig.ini --showconfig 
{'port_rsync': 50000, 'dryrun': False, 'hardLinkLocal': False, 'optNodes': [], 'verticaBinDir': '/opt/vertica/bin', 'restorePointLimit': 1, 'downNodes': [], 'verticaConfig': 'True', 'fullTopology': [], 'archiveSpecified': '', 'objects': 'public', 'dbOptions': '', 'configFileName': '/db_backup/backupConfig.ini', 'nodeStates': {}, 'dbUser': 'dradmin', 'checksum': False, 'bwlimit': 0, 'overwrite': True, 'topology': [{'dbHost': '', 'dbDir': '', 'dbNode': 'v_xr1200x_node0001', 'backupHost': '127.0.0.1', 'backupDir': '/DR_Data/backups'}], 'dbPort': None, 'dbPassword': '<password>', 'encrypt': False, 'dbPromptForPassword': False, 'sqlCached': '', 'retryDelay': 1, 'logFileName': None, 'tmpDir': '/db_backup', 'snapshotName': 'XR1200XBackup', 'debug': 0, 'retryCount': 2, 'dbInitiator': None, 'dbName': 'XR1200X', 'upNodes': [], 'copyCtx': 'backup'} 
Preparing...

Note that, compared to the earlier example, the tmpDir value is no longer the default /tmp location. As a final confirmation that the change is working, look in that directory for files named "vbr_history_<DATE>", which are created in that location for each backup that is performed.
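This final confirmation can be sketched as a short script that lists the matching files, newest first. It assumes only the "vbr_history_" name prefix mentioned above, since the exact date format in the file names is not specified here. The function name is hypothetical.

```python
import glob
import os


def list_history_files(tmp_dir):
    """Return vbr_history_* files in tmp_dir, most recently modified first."""
    # glob.escape protects against special characters in the directory name.
    pattern = os.path.join(glob.escape(tmp_dir), "vbr_history_*")
    files = [p for p in glob.glob(pattern) if os.path.isfile(p)]
    return sorted(files, key=os.path.getmtime, reverse=True)
```

After a test backup, list_history_files("/db_backup") should return at least one entry whose modification time matches the time of that run. An empty list suggests that the logs are still being written elsewhere.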