SQL Server Index Design Basics and Guidelines

Workload type

Before creating an index, you should understand the workload type of the database. On Online Transaction Processing (OLTP) database, workloads are used for transactional systems, in which most of the submitted queries are data modification queries. In contrast, Online Analytical Processing (OLAP) database workloads are used for data warehousing systems, in which most of the submitted queries are data retrieval queries that filter, group, aggregate and join large data sets quickly.

Creating a large number of indexes on a database table affects data modification (e.g. Updates) operations performance. When you add or modify a row in the underlying table, the row will also be adjusted appropriately in all related table indexes. Because of that, you need to avoid creating a large number of indexes on the heavily modified tables and create the minimum possible number of indexes, with the least possible number of columns on each index. You can overcome this problem by writing queries that add or modify rows in batches, rather than writing a single query for each insert or modify operation. For Online Analytical Processing (OLAP) workloads, in which tables have low modification requirements, you can create a large number of indexes that improve the performance of the data retrieval operations.

Table Size

It is not recommended to create indexes on small tables, as it takes the SQL Server Engine less time scanning the underlying table than traversing the index when searching for a specific data. In this case, the index will not be used and still affect the data modification performance, as it will be always adjusted when modifying the underlying table’s data.

Table Columns

In addition to database workload characteristics, the characteristics of the table columns that are used in the submitted queries should be also considered when designing an index. For instance, the columns with exact numeric data types, such as INT and BIGINT data types and that are UNIQUE and NOT NULL are considered optimal columns to participate in the index key.

In most cases, a long-running query is caused by indexing a column with few unique values. Although it is not possible to add the columns with ntext, text, image, varchar(max), nvarchar(max), and varbinary(max) data types to the index key columns, it is possible to add these data types to non-key index columns, but only in case of critical need. A column with XML data type can be added only to an XML index type.

Columns Order and Sorting

It is recommended to create the indexes on columns that are used in the query predicates and join conditions in the correct order that is specified in the predicate. In this way, the goal is keeping the index key short, without including the rarely used columns, in order to minimize the index complexity, storage and maintenance overheads. You can also improve query performance by creating a covering index that contains all data required by the query, without the need to read from the underlying table.

It is also recommended when designing an index, to consider if the columns that participate in the index key will be sorted in ascending or descending order, with the ascending order as the default order, depending on the system requirements.

Index Type

Studying the available types of SQL Server indexes is also recommended, in order to decide which index type will enhance the performance of the current workload, such as Clustered indexes that can be used to sort huge tables, Columnstore Indexes that can be used to enhance the processing performance of the Online Analytical Processing (OLAP) read only workload of data warehouses databases or Filtered indexes for columns that have well-defined subsets of data, such as NULL values, distinct ranges or categorized values.

Index Storage

The storage location of the index may also affect the performance of the queries reading from the index. By default, the index will be stored in the same filegroup as the underlying table on which the index is created. If you design a Non-Clustered index to be stored in a data file different from the underlying table data file and located in a separate disk drive, or horizontally partition the index to span multiple filegroups, the performance of the queries that are reading from the index will be improved, due to the I/O performance enhancement resulted from hitting on different data files and disk drives at the same time.

The initial storage of the index can be also optimized by setting the FILLFACTOR option to a value different from the default value of 0 or 100. FILLFACTOR is the value that determines the percentage of space on each leaf-level page to be filled with data. Setting the FILLFACTOR to 90% when you create or rebuild an index, SQL Server will try to leave 10% of each leaf page empty, reserving the remainder on each page as free space for future growth to prevent the page splitting and index fragmentation performance problems.

Sources:

SQL Server index design basics and guidelines