MCSA Certification Tutorial: Understanding Indexes in SQL Server
An index is used as a reference to the actual text. The benefit of having an index can be derived from a simple explanation such that of a book. A book contains hundreds of pages with dozens of chapters, and a table of contents in one of the first pages.
What if the table of contents or index was not present? The reader would be skimming through all the pages just to reach a certain chapter or page number. Having an index provides an ordered representation of data.
The same logic applies to database tables as well. A database is a collection of data stored to provide meaningful information in an organized and structured manner. It provides classification and helps make retrieving data easier with simple queries.
To accomplish that goal, databases are managed through database management systems, such as Microsoft’s SQL Server. By applying the same analogy, we can say that without proper indexing, the entire data stored in SQL Server database tables will be scanned. As you continue to add more tables, making the database more complex to handle, the lack of an index will prove to be time-consuming and resource-intensive for the user.
Every organization uses and maintains a database. The bigger the organization, the bigger the database and the more complex it is to manage it. It is why professional certification such as MSCA certification provides a great deal of value to an individual’s career as a database manager.
Indexes in databases are optional. Some people choose to build their databases without one, simply because indexes require maintenance. When the tables will be updated, the indexes will have to be updated accordingly. As we said earlier, indexes are used as references that point to some data in the table. This relationship has to be kept in order to avoid bad retrievals.
Not to mention, just as an index would require additional pages in the book, an index in SQL Server database would require additional storage.
An index is built using keys from one or more columns in the table. The keys are stored in a B-tree structure that allows SQL Server to retrieve information (rows) associated with the key.
Based on the way they store data, what purpose they accomplish, and the way they are defined, indexes are classified into many types. Let us look at some of them below:
Clustered indexes allow users to sort and store data rows available in the table or view according to their key values. There can only be one index in a Table. A clustered index is automatically created when we create a primary key.
It determines the physical order of data in a table, by sorting the rows physically in memory. Data pages are stored in the leaf nodes of a clustered index. This allows clustered indexes to be fast in retrieving data. But aside from improving query performance, the existence of a clustered index can be used to control table fragmentation by rebuilding or reorganizing data rows on demand.
A good example of a clustered index is a phone book in your cellphone. A phonebook sorts every individual entry alphabetically. When you search for a name, the corresponding details such as phone number, email address, home address, etc. appear with the search.
A table is called a clustered table if there is a clustered index in the table. A heap is when a table has data rows which are unsorted due to no definition of clustered index.
Here is a simple query to create a clustered index:
CREATE CLUSTERED INDEX IX_TestTable_TestColumn1
ON dbo.TestTable (TestColumn1);
The above query will create a clustered table with column name TestColumn1. Once the table has been created, any data rows we add will be in a sorted order.
The second main type is non-clustered. While it is similar to clustered, it is different because of that fact that it does not share the same table. They have a structure that is separate from the table they reference. Hence, non-clustered indexes require additional storage space. It is worth noting that an index is by-default a non-clustered index.
The non-clustered indexes have non-clustered index key values. Each key value directly points to a data row that contains the key value. We previously mentioned how an unsorted table is called a heap. The structure of non-clustered indexes depends on whether the data pages stored are in a heap or clustered table.
Data stored in non-clustered index itself can be either ascending or descending, it does not influence the storage of data in the table.
Let us take an example of a simple non-clustered query.
CREATE NONCLUSTERED INDEX IX_ProductVendor_VendorID
ON Purchasing.ProductVendor (BusinessEntityID);
The above query will create a non-clustered index on the Purchasing.ProductVendor table using the BusinessEntityID column. The business ID is what will become the reference that points to the data rows associated with each entity’s ID.
Unique indexes are used to maintain data integrity by checking for data and ensuring that no two rows of data share identical key values. Upon the creation of a unique index, SQL server checks the values in the columns for uniqueness.
The creation of a unique index fails if it detects rows with duplicate key values, ensuring that no two values in the table’s column are duplicate.
Here is a simple example of a unique index:
CREATE UNIQUE INDEX AK_UNITMeasure_Name
ON Production.UnitMeasure (Name);
The query will create a unique index on the Production.UnitMeasure table using the Name column.
A covering index is a type that includes columns needed to process a query request. We can take the most classic example of the phonebook again. A query which uses the value stored in the ContactID column will also retrieve the corresponding information stored in FirstName and LastName.
Composite Indexes are simply a group of two or more key columns, also referred to as multi-column index. SQL Server 2005 and 2008 and onwards allow addition of up to 16 columns in an index, provided they meet the 900-byte limit.
Clustered and Non-Clustered indexes can both be composite indexes.
SQL Server is extensive in its functionality, and more is added through seasonal and critical updates. Indexing is only a part of how you can create a database that works fast on queries and is reliable in data retrieval. The full suite of functionality can be learned through MSCA certification programs.