comparing data compression and backup compression

MSSQL supports on-the-fly compression of backups – lightweight and fast. You can also pack data inside the database using DATA_COMPRESSION = PAGE or ROW. As we remember, packed data is poorly packed. How will data packaging affect the backup size?

Test base

We create a large table test in a clean database (don’t forget to convert it to SIMPLE):

create table seed (n int)
GO
set nocount on
declare @n int=9999 while @n>=0 begin
  insert into seed select @n
  set @n=@n-1
  end
GO
create table test (n int identity, a int, b int, str varchar(128))
GO
create clustered index PK_n on test (n)
GO
insert into test (a,b,str)
  select L.n,R.n,'seed '+convert(varchar,L.n)+' and '+convert(varchar,R.n)
    from seed L, seed R
GO
create index IXa on test (a)
GO
create index IXb on test (b)
GO
create index IXstr on test (str)
GO

Let's write down the sizes of indexes and backups – packed and regular. Then apply to all indexes DATA_COMPRESSION = PAGE

alter index PK_n on test REBUILD with (data_compression=PAGE)
alter index IXa on test REBUILD with (data_compression=PAGE)
alter index IXb on test REBUILD with (data_compression=PAGE)
alter index IXstr on test REBUILD with (data_compression=PAGE)

and we'll write everything down again.

Finally, apply DATA_COMPRESSION = ROW

alter index PK_n on test REBUILD with (data_compression=ROW)
alter index IXa on test REBUILD with (data_compression=ROW)
alter index IXb on test REBUILD with (data_compression=ROW)
alter index IXstr on test REBUILD with (data_compression=ROW)

results

All dimensions are given in Mb

Index

no compression

compression PAGE

compression ROW

PK_n

4355

1992

3646

IXa

1355

854

1152

IXb

1355

963

1152

IXstr

3091

1239

3174

Now consider the size of full backups, also in megabytes:

Backup

no compression

compression PAGE

compression ROW

no compression

10417

5189

9364

compression

1934

2331

2285

conclusions

I haven’t discovered America, the results are quite expected, but I wanted to make sure again

  • Packaging a backup always reduces its size, regardless of the type of packaging of the data inside, so this is useful.

  • As expected, the backup decreases the most if the data inside the database is not compressed (up to 5 times). In reality, this coefficient is usually 3-4

  • compression=PAGE greatly reduces the amount of data and increases the cache hits ratio. Obviously, at the cost of increased CPU

  • compression=ROW works worse and sometimes even increases the space occupied by data.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *