Thursday, March 29, 2012

Estimation of Full Backup size

Has anyone come across a more accurate method than sp_spaceused to estimate the size of a full database backup for SQL Server 2000 ?

I have found this to have too great a variance (even after running updateusage) to rely on any accuracy for it. I have also looked at perhaps using the ALLOCATED Pages indicated in the GAM pages but this also seems to be pretty inaccurate.

I have a number of servers where space can be limited and backups using Maintenance Plans have occasionally failed because they delete the old backups AFTER they do the latest one. I am writing a script which can check the space remaining and adjust the backup accordingly but the variance I have observed so far with sp_spaceused is too great.

Any ideas welcomed.

Full backups copy all pages within allocated extents. So, there may be a few pages which are not allocated, but which are copied to the backup. Also, there is a small percentage of overhead in the backup format, but this should be predictable.

The most accurate estimate would be to use updateusage and then look at the reserved column of sp_spaceused.

What is the variance that you are seeing? It should be fairly small.

The real bottom line is that if you have servers which you know occasionally run out of backup space, they need to have storage added so that you can ensure the safety of your databases. You can minimize the space requirements by compressing the backups and/or copying them off the server as soon as they are created, but you need the space there.

|||

Thanks for that Kevin

I need to do a few more tests to confirm the variance that I am seeing - it seemed to be inconsistent for a couple of databases (shooting up to 4MB) after being within about 85KB - 90KB for all of the others so there may have been something else happening.

You mention all pages within allocated extents - does that mean that I could use the GAM or SGAM pages to count the number of allocated extents and use this as an estimate? I already have a method of doing this for estimating differential backups using the DIFF_MAP pages.

Martin

|||

The other component of a backup is that the portion of the log created during the execution of the backup is also backed up. This is required in order to get a consistent view of the data at a single point in time.

Depending on the activity level and size of your database, this could easily account for this much variability and more.

Lets look at this from a different angle:

What are your business requirements around backups that are driving your backup plans? If we understand those, we can perhaps better optimize your strategy. I find that the best way to craft a backup strategy is to start from the recovery requirements and work backwards to come up with a plan which will support those needs.

What are the sizes of the databases you are backing up?

No comments:

Post a Comment