How much space is required for Vertica database backups

Document ID : KB000048102
Last Modified Date : 14/02/2018

Description:

Recommendations on space requirements for Vertica database backups on the Data Repository system are now included in the IM Sizing Spreadsheet, which is available from the support.ca.com site. On the product support page for Infrastructure Management, the spreadsheet is listed under the Recommended Reading section.

Solution:

Vertica support is generally correct in stating "Restore points not inherently increasing disk footprint". However, the Data Repository behaves somewhat differently, because the application continually prunes data partitions once the data has aged past its retention period.

For example, by default we retain as-polled "rate" data for 45 days. Once we have 45 days of data, we prune the oldest data each night, which maintains a 45-day rolling window. From a backup perspective, however, if you have a restore point limit of 7 (which equates to 7 days when performing daily backups), the backup must hold 45 days + 7 additional days of data to support a restore to any of the 7 points, because each restore point requires its own 45-day "look back" of historical data.

We actually have ~15% more data on our backup disk than in our source database. The 15% delta can be approximated by the following calculation, which assumes the same volume of data each day:

Cost of 7 additional days of data = total disk footprint of source database * 7/45

Example:

A given system has a source DB that is ~3.4 TB with 45 days of rate data retention (we have other retention periods as well for aggregate data, but that usually accounts for about 10% or less of total volume):


 (3.4 TB / 45 days) = ~0.08 TB per day
 7 additional days * 0.08 TB per day = 0.56 TB

Our backup data directory footprint is just over 4 TB, which is what we would expect given the calculation:


 (Source DB footprint = 3.4 TB) + (7 additional days = 0.56 TB) = ~4 TB

The cost of each restore point equates to an additional day of data once retention periods have been met (assuming you have daily backups).

We then expanded our restore point count to 14 and estimate the backup size to be ~30% greater than the source database (14/45 of the source footprint).
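
The arithmetic above can be reproduced for other retention periods and restore point counts. The following is a minimal Python sketch, not part of the product or the IM Sizing Spreadsheet. The function names are hypothetical, and it makes the same assumptions as the text: constant daily data volume, daily backups, and retention periods already met.

```python
def estimate_backup_footprint_tb(source_tb, retention_days=45, restore_points=7):
    """Approximate backup footprint in TB.

    Assumes constant daily volume and one backup (restore point) per day,
    as in the article. Hypothetical helper, for illustration only.
    """
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    if restore_points < 0:
        raise ValueError("restore_points must not be negative")
    daily_tb = source_tb / retention_days   # average data volume per day
    extra_tb = daily_tb * restore_points    # one extra day per restore point
    return source_tb + extra_tb


def backup_overhead_percent(retention_days=45, restore_points=7):
    """Backup size above the source database footprint, as a percentage."""
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    return 100.0 * restore_points / retention_days


if __name__ == "__main__":
    # Figures from the example: 3.4 TB source DB, 45 days of rate data retention.
    for points in (7, 14):
        total = estimate_backup_footprint_tb(3.4, 45, points)
        overhead = backup_overhead_percent(45, points)
        print(f"{points} restore points: ~{total:.2f} TB (+{overhead:.0f}%)")
```

Because the sketch does not round the daily volume up to 0.08 TB (the unrounded value is about 0.076 TB), its totals come out slightly lower than the rounded figures in the example.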

From a capacity planning/"sizing guide" perspective, the estimate for the source database already contains a buffer of ~40-50%, since this is the Vertica-recommended free space on disk over and above the disk footprint. The sizing guide is very conservative in that it suggests an estimate of backup required space = source database required space (keep in mind that the source database required space already contains the 40-50% buffer). So, unless a customer is maintaining more than 30 daily restore points, the "sizing guide" guidance is accurate.