The SQL Server Database Engine automatically modifies indexes whenever insert, update, or delete operations are made to the underlying data. Over time these modifications can cause the information in the index to become scattered in the database (fragmented). Fragmentation exists when indexes have pages in which the logical ordering, based on the key value, does not match the physical ordering inside the data file. Heavily fragmented indexes can degrade query performance and cause your application to respond slowly, especially scan operations.
You can remedy index fragmentation by reorganizing or rebuilding an index.
SQL index maintenance guidelines
Now that we have a general idea of what the problem with fragmentation is, let’s focus on how to solve it. We can fix this in two ways. There are pros and cons for both of these ways, and which one we choose will depend on the environment, situation, database size, etc. Both have the same goal and that is achieving the sequential everything:
- Rebuild – this creates a brand-new SQL index. Rebuilding is cleaner, easier and usually much quicker (this is noticeably quicker on a large database). We do lose a little bit of concurrency when doing an online rebuilt because it creates an SCH-M lock and on top of that if you run this during a busy day and it’s sucking up too many resources, you will probably have to cancel the whole operation. This means that it will actually roll-back everything like it was a transaction.
- Reorganize – this fixes physical order and compact pages. Reorganizing is better for concurrency and furthermore, if the operation is canceled like in the example before, it will simply just stop, and we will not lose the work it did right until the moment when it was canceled.
Reorganizing an index uses minimal system resources and is an online operation. This means long-term blocking table locks are not held and queries or updates to the underlying table can continue during the ALTER INDEX REORGANIZE
transaction.
Rebuilding an index drops and re-creates the index. Depending on the type of index and Database Engine version, this can be done online or offline.
Space Required
Rebuilding an index requires building a new index before dropping the old index, regardless of the amount of fragmentation present in the old index. This means you need to have enough free space to accommodate the new index.
Reorganizing an index first squishes the index rows together to try to deallocate some index pages, and then shuffles the remaining pages in place to make their physical (allocation) order the same as the logical (key) order. This only requires a single 8-KB page, as a temporary storage for pages being moved around. So an index reorganize is extremely space efficient.
Algorithm Speed
An index rebuild will always build a new index, even if there’s no fragmentation. The length of time the rebuild takes is related to the size of the index, not the amount of fragmentation in it.
Reorganizing an index only deals with the fragmentation that exists, and the more fragmentation there is, the longer a reorganize will take.
Interruptible or Not
An index rebuild operation cannot be interrupted without it rolling back everything it’s done so far – it’s atomic – all or nothing. In SQL Server 2017, however, there is a resumable-online index rebuild feature.
An index reorganize can be interrupted and the worst that will happen is that a single page move operation is rolled back.
Statistics
An index rebuild will always rebuild the index column statistics with the equivalent of a full scan (or sampled, for an index partition or if the index is partitioned).
An index reorganize does not see a total view of the index and so cannot update statistics, meaning that manual index statistics maintenance is required.
Transaction Log Generated
In the FULL recovery mode, an index rebuild is fully logged, so the transaction log will have to accommodate the full size of the index in a single transaction. This also means the entire generated transaction log may need to be mirrored, sent to your AG replicas, scanned by replication, backed up, and so on.
In the SIMPLE and BULK_LOGGED recovery modes, the amount of transaction log generated by an offline index rebuild will be minimal (online index rebuild is always fully logged) – just the allocations of pages and extents. However, the next log backup performed (either in BULK_LOGGED or after switching to FULL) will also contain all the extents changed by the rebuild, and so the log backup will be roughly the same size as if the rebuild was done in the FULL recovery mode. The benefits are in time and the fact that the transaction log itself does not have to accommodate the full size of the index during the rebuild in a single transaction.
In all recovery modes, reorganizing an index is fully logged, but is performed as a series of small transactions so should not cause the transaction log to grow inordinately. And of course, transaction log is only generated for the operations performed, which may be less for a reorganize as it only deals with fragmentation that exists.
Locks Required
An offline index rebuild of any index holds a schema-modification (i.e. super-exclusive) table lock – no updates or reads of the entire table.
An online index rebuild of any index acquires a short-term shared table lock at the start of the operation, holds an intent-shared table lock throughout the operation (which will only block exclusive and schema-modification table locks), and then acquires a short-term schema-modification table lock at the end of the operation.
An index reorganize holds an intent-exclusive table lock throughout the operation, which will only block shared, exclusive, and schema-modification table locks.
Sources:
Comments