Using SPARSE Columns

Have you ever seen the SPARSE keyword used in a table definition? I came across this for the first time when looking at a holiday table that had a HolidayText column with a SPARSE property.

So, what’s a SPARSE column?

Sparse columns were introduced in SQL Server 2008. They’re intended to reduce the storage space of NULL values. The downside, however, is that they add some overhead when retrieving the non-NULL values stored. When creating or altering a table, the “SPARSE” keyword would immediately follow the data type.

How much space-saving are we talking?

The answer, unfortunately, is “it depends.” It depends on the data type. Microsoft has a great chart in their documentation explaining the space required for data types with a percentage of NULL values.

In the following demo, we’ll be taking a look at using a datetime data type as a SPARSE column. We’ll populate 50,000 rows of data into two heaps, both being identical except one will have a SPARSE datetime while the other does not. We’re also going to leave 90% of the datetime columns in each table NULL.

CREATE TABLE dbo.NoSparse
(
	NoSparseID int NOT NULL IDENTITY,
	NoSparseDate datetime NULL
);

CREATE TABLE dbo.YesSparse
(
	YesSparseID int NOT NULL IDENTITY,
	SparseDate datetime SPARSE NULL
);

WITH X1 (N) AS (SELECT N FROM (VALUES (1), (1), (1), (1), (1), (1), (1), (1),
                                      (1), (1), (1), (1), (1), (1), (1), (1)) X (N)),
     X2 (N) AS (SELECT A.N FROM X1 AS A, X1 AS B),
	 X3 (N) AS (SELECT A.N FROM X2 AS A, X2 AS B),
	 Tally (Number) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM X3)
INSERT INTO dbo.NoSparse (NoSparseDate)
OUTPUT inserted.NoSparseDate INTO dbo.YesSparse (SparseDate)
SELECT TOP (45000) NULL
FROM Tally
UNION ALL
SELECT TOP (5000) CURRENT_TIMESTAMP
FROM Tally;

I’ll talk more on that OUTPUT and why I built that CTE like that in a later post. Hopefully you also noticed that Table Value Constructor.

If we execute sp_spaceused on each object, we can get an idea of what the storage looks like for each.

EXEC sp_spaceused @objname = N'dbo.NoSparse';
EXEC sp_spaceused @objname = N'dbo.YesSparse';

I also could have used sys.dm_db_index_physical_stats to do this.

namerowsreserveddataindex_sizeunused
NoSparse500001096 KB1040 KB8 KB48 KB
YesSparse50000776 KB744 KB8 KB24 KB

The table with the sparse column saved us about 320 KB in this scenario.

This doesn’t mean we need to go make all nullable columns SPARSE, but it should be considered in certain scenarios. Also, please keep in mind there are plenty of limitations and restrictions with using these as well.

Leave a Reply

Your email address will not be published. Required fields are marked *