Reducing Size of SQL Backup

  • Is your database in “Simple” recovery mode? If so, it will produce far fewer transaction log entries, and the backup will be smaller. Recommended for development – but not for production.
  • If it’s in “FULL” recovery mode, do you take regular transaction log backups? Without them, the transaction log grows continuously, which inflates the overall backup size.
  • Have you run a DBCC SHRINKDATABASE(yourdatabasename) on it lately?
  • Do you have any log / logging tables in your database that are just filling up over time? Can you remove some of those entries?
  • At design time, use the smallest data type that you can use.
  • Do both full and differential backups.
  • Use 7-Zip to compress the backup files.
  • Drop all non-clustered indexes – if performance on the database is not important.
  • Rebuild tables with a 100% fill factor.
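
The recovery-model checks above can be sketched in T-SQL. This is a minimal illustration, not a production script – the database name is a placeholder, and shrinking a database causes fragmentation, so use it sparingly:

```sql
-- Check the current recovery model (replace YourDatabase with your own database name)
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = 'YourDatabase';

-- Switch a development database to SIMPLE recovery to reduce log growth
ALTER DATABASE YourDatabase SET RECOVERY SIMPLE;

-- Reclaim unused space (caution: shrinking fragments indexes)
DBCC SHRINKDATABASE (YourDatabase);
```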

Drop All Non-Clustered Indexes

There are two kinds of indexes – clustered and non-clustered. Clustered indexes aren’t really indexes at all; they define how the table itself is laid out on disk. Dropping a clustered index doesn’t save space. Non-clustered indexes, on the other hand, take up space because they’re additional copies of parts of the table. Non-clustered indexes can help performance, but they can be rebuilt from scratch as long as you save the index definition first.

Which of these takes less space to back up:

  • The string “CREATE INDEX IX_EmployeeName ON dbo.Employees (LastName, FirstName)”, or
  • An index with thousands or millions of rows in it

The answer is the string. To minimize what we’re backing up, we need to:

  1. Script out all of the non-clustered indexes in the database
  2. Save those definitions to a table (or a stored procedure)
  3. Create a stored proc that will loop through those index definitions and recreate them later
  4. Drop the indexes

If our database’s content is, say, 30% indexes, we just dropped the size of our backup by 30%. The downside is that after the database is restored somewhere else, it’ll take longer for the database to be fully available, because we have to run the stored proc to recreate all our indexes. If we’re running SQL Server Enterprise Edition, our database will be online while the indexes are recreated.

Here are a couple of scripts to generate T-SQL scripts for all of the indexes in a database:

To make these work, you’ll want to adapt them so that they insert the scripts into a table for later execution, and then drop the indexes. I know, I know, you’re going to want the script I’m using at StackOverflow, but I’m not quite comfortable publicizing that code just yet. It’s still a little duct-tape-y. I’ll revisit that in the future though.

This index removal task is extremely fast, only a matter of seconds, because SQL Server is just dumping metadata. The next step, however, is much more intensive.

Rebuild Tables with 100% Fill Factor

SQL Server stores data in pages, and you can think of these much like the pages in the phone book. If we tried to fill the pages completely full, then whenever we needed to add a single record, we’d have all kinds of shuffling to do. When we know we’re going to be adding a lot of data, we’re better off leaving some working area on each page so that we can add records without doing a lot of shuffling every time. The default amount of space we use on each page is known as the Fill Factor.

Normally, StackOverflow’s tables wouldn’t have a 100% fill factor because insert performance would be horrible. However, in our backup scenario, we want to cram all of the pages completely full. The solution is to rebuild all of the clustered indexes with a 100% fill factor. The lower your default fill factor, the more space you’ll save by rebuilding to 100% fill factor. This step takes a lot of IO crunching, and again, it only makes sense when you’ve got a second database server available to do all this work.
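
A minimal sketch of that rebuild step, assuming the non-clustered indexes have already been dropped so only the clustered indexes remain (the `ALL` keyword would otherwise rebuild every index on each table):

```sql
-- Build and run one REBUILD statement per table, packing pages completely full.
DECLARE @sql nvarchar(max) = N'';

SELECT @sql += N'ALTER INDEX ALL ON '
             + QUOTENAME(SCHEMA_NAME(t.schema_id)) + '.' + QUOTENAME(t.name)
             + N' REBUILD WITH (FILLFACTOR = 100);' + CHAR(13)
FROM sys.tables t;

EXEC sp_executesql @sql;
```

On Enterprise Edition you could add `ONLINE = ON` to the WITH clause to keep the tables available during the rebuild.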


Last modified: July 29, 2020
