Sparse column is a new feature on SQL Server 2008. Microsoft says "Sparse columns reduce the space requirements for null values at the cost of more overhead to retrieve nonnull values". So sparse column can be implemented on column that field isn't often to be filled by values. Using this, we can save up database space lower cause when you store null values on sparse column, it takes no space at all. but becarefull about using sparse column, it has a drawback. when you store nonnulls values, it will take 4 bytes than normal column, even when your column is only save a bit.
On my computer, i have tried to implement sparse column. At the first time, i created 2 tables that contains sparse and not. here is the script:
/* creating table */
CREATE TABLE TABLEWITHSPARSE
(
FIELD_A INT IDENTITY (1,1) NOT NULL PRIMARY KEY,
FIELD_B VARCHAR(25) SPARSE,
FIELD_C VARCHAR(10) NOT NULL
)
CREATE TABLE TABLEWITHOUTSPARSE
(
FIELD_A INT IDENTITY (1,1) NOT NULL PRIMARY KEY,
FIELD_B VARCHAR(25) NULL,
FIELD_C VARCHAR(10) NOT NULL
)
After that, we have to fill up the data. the data on sparse column will be null and nonnull values to prove the theory. so here is the script:
/* Inserting data on sparse table */
DECLARE @A INT = 1, @B VARCHAR(25), @C VARCHAR(10) = 'N'
WHILE @A <= 50000
BEGIN
IF (@A >= 45000) SET @B = 'TEST'
INSERT INTO TABLEWITHSPARSE VALUES (@B,@C)
SET @A = @A + 1
END
/* Inserting data on non sparse table */
DECLARE @A INT = 1, @B VARCHAR(25), @C VARCHAR(10) = 'N'
WHILE @A <= 50000
BEGIN
IF (@A >= 45000) SET @B = 'TEST'
INSERT INTO TABLEWITHOUTSPARSE VALUES (@B,@C)
SET @A = @A + 1
END
Well, After we did it, we can see the result. we can see how's good sparse column to store data when your data is many null values. if we refer to example that have created, we have 50,000 data and 45,000 data is null values. so we have 5,000 data that contains nonull values. it means only 10% data is stored with nonnull values. Ok, let's we execute this query to find the data space used on each tables.
/* Executing table with non sparse column */
exec sp_executesql N'
declare @PageSize float
select @PageSize=v.low/1024.0
from master.dbo.spt_values v
where v.number=@_msparam_0 and v.type=@_msparam_1
SELECT
ISNULL((select @PageSize * SUM(CASE WHEN a.type <> 1 THEN a.used_pages WHEN p.index_id < object_id =" i.object_id" index_id =" i.index_id" container_id =" p.partition_id" object_id =" tbl.object_id),0.0)" name="@_msparam_2" _msparam_0="N'1',@_msparam_1=" _msparam_2="" _msparam_3="">
exec sp_executesql N'
declare @PageSize float
select @PageSize=v.low/1024.0 from master.dbo.spt_values v where v.number=@_msparam_0 and v.type=@_msparam_1
SELECT
ISNULL((select @PageSize * SUM(CASE WHEN a.type <> 1 THEN a.used_pages WHEN p.index_id < object_id =" i.object_id" index_id =" i.index_id" container_id =" p.partition_id" object_id =" tbl.object_id),0.0)" name="@_msparam_2" _msparam_0="N'1',@_msparam_1=" _msparam_2="" _msparam_3="">
here is the result:
DataSpaceUsed - For Sparse Column used
----------------------
960KB
(1 row(s) affected)
DataSpaceUsed - For NON Sparse Column used
----------------------
1016KB
(1 row(s) affected)
As just you saw, the sparse column stores a litte more than non sparse column. So, it's true that sparse column can reduce null values. But remember, sparse column has drawback when the data has many values that isn't null values. So, i have tried to find out it to. and here is the script:
/* Deleting table */
TRUNCATE TABLE tablewithsparse
TRUNCATE TABLE tablewithoutsparse
/* Inserting data on sparse table */
DECLARE @A INT = 1, @B VARCHAR(25), @C VARCHAR(10) = 'N'
WHILE @A <= 50000
BEGIN
IF (@A >= 25000) SET @B = 'TEST'
INSERT INTO TABLEWITHSPARSE VALUES (@B,@C)
SET @A = @A + 1
END
/* Inserting data on non-sparse table */
DECLARE @A INT = 1, @B VARCHAR(25), @C VARCHAR(10) = 'N'
WHILE @A <= 50000
BEGIN
IF (@A >= 25000) SET @B = 'TEST'
INSERT INTO TABLEWITHSPARSE VALUES (@B,@C)
SET @A = @A + 1
END
After that, you can execute the same query for seeing the result. here is the result:
DataSpaceUsed - For Sparse Column used
----------------------
1240
(1 row(s) affected)
DataSpaceUsed - For NON-Sparse Column used
----------------------
1096
(1 row(s) affected)
As you just saw, the sparse colum table have a little bit higher for stored data than non-sparse column table. it's caused we stored the data is about 50% data is nonnull values. And sparse column have a drawback to store the data like that.
And one more thing, i haven't tried the sparse column about getting performance using it or not. but if the microsoft said there will be more overhead on it, maybe it's true, and the logic i think is reasonable. hehehehehehehehehe ...
Any comments?
Tuesday, May 27, 2008
Sparse Column on SQL Server 2008 - At a glance
Labels:
SQL Server
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment