XFS, Reflink and Fast Clone. Made for each other

As we all know, XFS is a high-performance journaling file system born in the bowels of Silicon Graphics. It is so performant because it can handle many parallel I/O streams, so if you need a file system with easily scalable bandwidth that doesn't degrade when working with several devices at once, you've come to the right place. But today we won't be praising XFS as a whole, just one specific feature of it: reflink. It allows the same data blocks to be shared between files, giving us deduplication and the ability to make fast copy-on-write snapshots.
It would be a shame to pass over such exciting functionality, so today we'll look at how reflink can help everyone responsible for backups, and what Veeam Backup & Replication 10 has to offer in this area.


Without going into too much detail: reflink support in XFS appeared about three years ago. There was nothing groundbreaking about it, since Microsoft has had its own implementation since 2012 (in ReFS), and reflink exists in one form or another in Btrfs and OCFS2. Perhaps in other file systems too; readers, please correct me if I have unfairly left someone out.

So what do technologies like this give us fighters on the backup front? The most important thing, of course, is saving precious space in our backup repositories thanks to the deduplication mechanism built into the file system. That is the first and most obvious benefit. Less obvious is the performance boost for heavy file operations: if an application knows how to work with reflink, copying a huge file can be reduced to updating metadata, with no need to actually move any data blocks. Which, as you can imagine, is orders of magnitude faster.
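You can see the difference for yourself with a quick experiment on any reflink-enabled XFS mount (a rough sketch: the file name and size are arbitrary, and we will build such a mount below):

dd if=/dev/urandom of=demo bs=1M count=4096      # a 4 GiB file with random content
time cp --reflink=never demo demo-full           # an ordinary copy: every block is read and rewritten
time cp --reflink=always demo demo-clone         # a reflink clone: only metadata is updated

The ordinary copy takes as long as the disk needs to push 4 GiB through; the clone finishes almost instantly, because no data blocks are moved.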
In the context of Veeam, this speedup allowed us to accelerate an operation like Synthetic Full many times over (you can't even imagine how much). Creating a synthetic full is your repository's worst nightmare. Surely you remember how tech bloggers love to torture some poor hard drive with a random-read test, gleefully laughing at the performance graph as it drops through the floor? Well, creating a synthetic backup is not a test but a real workload consisting of a constant stream of reads and writes scattered all over the disk. If all you get is a drop in drive performance, you got off lightly: some storage systems simply hang in the middle of the operation.

Accordingly, if we don't need to race around the entire disk hunting for the required blocks and copying them to a new location, and can simply update the metadata instead, the effect is remarkable. So it's no surprise that we at Veeam have long recommended using this feature, which we call Fast Clone. As already mentioned, Veeam first supported it on Microsoft ReFS. On our forum there is a testimonial from a customer who, using ReFS, managed to fit 99 terabytes of real data into 15 terabytes of occupied space, and that without any expensive deduplicating appliances.
And now it’s time for XFS to get its share of fame.

Preparation

First of all, reflink is only available in relatively recent XFS releases, since it requires kernel-level support. Your favorite Ubuntu 16.04 won't do here, so upgrade to 18.04, or better yet, to 20.04.
What else? CRC checking must be enabled on the file system, and here is an interesting point: the data block size must be between 1 KiB and 4 KiB. Technically the upper bound is the memory page size (the default); larger blocks of up to 64 KiB are possible, but that requires rebuilding the kernel, and many report that such a configuration is unstable. As the official man page puts it, XFS on Linux can only mount filesystems with pagesize or smaller blocks.
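A quick way to sanity-check these prerequisites before going any further (a small sketch; the version thresholds in the comments are for orientation only):

toor@ubuntuxfs:~$ uname -r           # reflink support stopped being experimental around kernel 4.16
toor@ubuntuxfs:~$ getconf PAGE_SIZE  # the memory page size, i.e. the practical upper bound for the block size
toor@ubuntuxfs:~$ mkfs.xfs -V        # xfsprogs version; starting with 5.1, reflink=1 is the default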
Let's first verify that everything is in order on our machine:

toor@ubuntuxfs:~$ sudo lshw -class disk -class storage -short

H/W path      Device      Class      Description
================================================
/0/2          scsi0      storage
/0/2/0.0.0    /dev/sda   disk       5368MB Virtual Disk
/0/2/0.0.2    /dev/sdb   disk       21GB Virtual Disk

Then we create the file system we need with the command below. Why create it from scratch? Because you cannot enable the reflink flag on an already created file system; such is the restriction imposed by the developers.

toor@ubuntuxfs:~$ sudo mkfs.xfs -b size=4096 -m reflink=1,crc=1 /dev/sdb

We should see output like this, with crc=1 and reflink=1, which is exactly what we need. To be fair, crc=1 is the default, but these are laboratory conditions, so we set it explicitly for clarity.

meta-data=/dev/sdb               isize=512    agcount=4, agsize=1310720 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1
data     =                       bsize=4096   blocks=5242880, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

We create a folder for backups and mount the new file system there:

toor@ubuntuxfs:~$ sudo mkdir /backups
toor@ubuntuxfs:~$ sudo mount /dev/sdb /backups
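
To make the mount survive a reboot, it also makes sense to add an /etc/fstab entry (a sketch; substitute the real UUID that blkid reports on your system):

toor@ubuntuxfs:~$ sudo blkid /dev/sdb
toor@ubuntuxfs:~$ echo 'UUID=<uuid-from-blkid> /backups xfs defaults 0 0' | sudo tee -a /etc/fstab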

And finally, to make sure everything is fine, let's check:

toor@ubuntuxfs:~$ df -hT /backups
Filesystem     Type      Size  Used Avail Use% Mounted on
/dev/sdb       xfs        20G  17M   20G   1% /backups
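
If you want to double-check the reflink flag on the already mounted file system, xfs_info prints the same summary that mkfs showed us above:

toor@ubuntuxfs:~$ sudo xfs_info /backups | grep reflink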

Test

Now let's check in the lab how this vaunted XFS and its reflink actually behave. To do that, we'll generate a file with random content using everyone's favorite method of redirecting output from /dev/urandom:

root@ubuntu: dd if=/dev/urandom of=test bs=1M count=10240
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB, 10 GiB) copied, 71.9184 s, 149 MB/s

We won't see any deduplication here, since the system doesn't use reflink on its own: however much space we ask for is exactly how much gets consumed. What really interests us right now is how much space the data itself took and how much went to metadata. Let's check.

root@ubuntu:/backups# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb         20G   11G  9.9G  51% /backups

So for ten gigabytes of data we end up with roughly a gigabyte of metadata on top.
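To make sure this extra gigabyte is not hiding inside the file itself, du can compare the file's logical size with the blocks actually allocated to it (both should come out at 10G here, so the overhead lives in the file system metadata):

root@ubuntu:/backups# du -h --apparent-size test   # logical file size
root@ubuntu:/backups# du -h test                   # blocks actually allocated to the file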
Now let's copy this file, forcing the use of reflink, and look at the result.

root@ubuntu: cp -v --reflink=always test test-one
'test' -> 'test-one'

Let's check:

root@ubuntu:/backups# ls -hsl
total 20G
10G -rw-r--r-- 1 root root 10G Jun 17 09:45 test
10G -rw-r--r-- 1 root root 10G Jun 17 10:01 test-one

root@ubuntu:/backups# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb         20G   11G  9.9G  51% /backups

Excellent! We have two ten-gigabyte files that together occupy eleven gigabytes on disk. Glory to technology! But remember that, just as with ReFS, none of this comes for free. In ReFS a single block can be reused at most 8175 times, and XFS has a similar limit: the number depends on how many clone records, that is, reference entries in the metadata, can be stored. And that is exactly the metadata we saw above. The good news is that in XFS this metadata grows dynamically, and the theoretical limit is much higher than in ReFS.

Let's see how the data is laid out on disk.

root@ubuntu:/backups# filefrag -v test test-one
Filesystem type is: 58465342
File size of test is 10737418240 (2621440 blocks of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0.. 1048559:         24..   1048583: 1048560:             shared
   1:  1048560.. 2097135:    1310733..   2359308: 1048576:    1048584: shared
   2:  2097136.. 2621439:    2624013..   3148316: 524304:    2359309: last,shared,eof
test: 3 extents found
File size of test-one is 10737418240 (2621440 blocks of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0.. 1048559:         24..   1048583: 1048560:             shared
   1:  1048560.. 2097135:    1310733..   2359308: 1048576:    1048584: shared
   2:  2097136.. 2621439:    2624013..   3148316: 524304:    2359309: last,shared,eof
test-one: 3 extents found

As we can see, both files consist of three extents located at exactly the same physical offsets, down to the block. So reflink works just as we expected.
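By the way, it's easy to watch copy-on-write kick in: overwrite a small piece of one of the clones, and only the affected range stops being shared while the rest of the file stays deduplicated. A little sketch using xfs_io, which ships with xfsprogs:

root@ubuntu:/backups# xfs_io -c 'pwrite 0 1m' test-one   # overwrite the first megabyte of the clone
root@ubuntu:/backups# filefrag -v test test-one          # the overwritten range of test-one is no longer shared
root@ubuntu:/backups# df -h /backups                     # used space grows only by about the rewritten megabyte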

Now let's move on to practical exercises with Veeam.

Practice

We won't reinvent the wheel here, so we'll add our Linux machine as a repository in the completely standard way.
And don't forget to choose the right drive.

Most importantly, don't forget to tick Use Fast Cloning on XFS volumes, otherwise the whole trick won't work. Also check that all the checkboxes behind the Advanced button are cleared.

When the wizard finishes its work, you can proceed to creating backups. As you remember, at the beginning I talked about jobs that create synthetic full points. So at the Storage step, don't forget to select the right repository, go to the Advanced tab, select Create synthetic full backups periodically and choose on which days they will be created. We usually advise picking a weekend, since these are heavy operations and there is no need to put extra load on your storage during the week.

It also won't hurt to enable a periodic backup health check on the Maintenance tab. Too many of us have heard sad stories about data corruption on systems like ReFS and XFS. Yes, similar mechanisms are built into the file system itself, but if they always worked, we wouldn't be reading so many fascinating stories of heroic data recovery. Practice shows that if you have RAID with redundancy, you can sleep more or less peacefully; if not, all these self-check mechanisms can do is say: "Oops, that's it."

All other settings are standard and don't deserve special attention. Once the job has been created and runs successfully, on a day when a synthetic full is scheduled you should see a line like this in the log:

The [fast clone] suffix means that the synthetic point was created not just somehow, but using Fast Clone (XFS reflink) for the entire contents of the backup. There is also a [partial fast clone] variant, when a mixed chain is created: yes, we can build synthetics even on a non-aligned file system. Like ReFS, Veeam can only clone blocks on file system block boundaries, so the data blocks of backup files are written with an alignment equal to the block size. This is done automatically when the repository is created, and there is even a separate checkbox for it in the repository settings.
Now let's go back to the repository and look at the backup files. Specifically, we'll check what the most recent synthetic .vbk consists of.
The output for a real file can be very long, so I'll show only its beginning and end.

root@ubuntu:/backups/Agent Linux/192.168.91.226# filefrag -v 'Agent Linux - 192.168.91.226D2020-06-18T062036_1F09.vbk'
Filesystem type is: 58465342
File size of Agent Linux - 192.168.91.226D2020-06-18T062036_1F09.vbk is 2430484480 (593380 blocks of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..       0:    1777650..   1777650:      1:
   1:        1..    4360:    2039034..   2043393:   4360:    1777651:
   2:     4361..    5341:    1315106..   1316086:    981:    2043394: shared
   3:     5342..    5345:    1986271..   1986274:      4:    1316087:
   4:     5346..    5348:    1316091..   1316093:      3:    1986275: shared
   5:     5349..    5570:    1986275..   1986496:    222:    1316094: shared
   6:     5571..    5603:    1316319..   1316351:     33:    1986497: shared
   7:     5604..    5781:    1986497..   1986674:    178:    1316352: shared
   8:     5782..    5800:    1974097..   1974115:     19:    1986675: shared
   9:     5801..    5872:    1316508..   1316579:     72:    1974116: shared

....
 925:   545910..  546109:    2534022..   2534221:    200:    1853810: shared
 926:   546110..  546299:    1776748..   1776937:    190:    2534222: shared
 927:   546300..  546477:    2534222..   2534399:    178:    1776938: shared
 928:   546478..  546623:    1854227..   1854372:    146:    2534400: shared
 929:   546624..  547203:    2534400..   2534979:    580:    1854373: shared
 930:   547204..  548096:    1855025..   1855917:    893:    2534980: shared
 931:   548097..  549585:    2534980..   2536468:   1489:    1855918: shared
 932:   549586..  551487:    1857319..   1859220:   1902:    2536469: shared
 933:   551488..  551787:    2536469..   2536768:    300:    1859221: shared
 934:   551788..  553011:    2037808..   2039031:   1224:    2536769:
 935:   553012..  577866:    1929924..   1954778:  24855:    2039032: shared
 936:   577867..  578291:    2536769..   2537193:    425:    1954779: shared
 937:   578292..  592732:    1954913..   1969353:  14441:    2537194: shared
 938:   592733..  593373:    2537194..   2537834:    641:    1969354: shared
 939:   593374..  593375:    1777645..   1777646:      2:    2537835: shared
 940:   593376..  593379:    1969356..   1969359:      4:    1777647: last,eof
Agent Linux - 192.168.91.226D2020-06-18T062036_1F09.vbk: 941 extents found

As we can see, it consists almost entirely of reused shared blocks. The extents without the shared flag belong to the latest increment, written just before the synthetic full was built.
But what kind of gain did these shared blocks actually give us? Let's look:

root@ubuntu:/backups/Agent Linux/192.168.91.226# df -h .
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb         20G  3.8G   17G  19% /backups

Only 3.8 gigabytes actually used on disk. But what about the files themselves?

root@ubuntu:/backups/Agent Linux/192.168.91.226# ls -hsl
total 7.2G
 56K -rw-rw-rw- 1 root root  54K Jun 18 13:20 'Agent Linux - 192.168.91.226.vbm'
1.8G -rw-r--r-- 1 root root 1.8G Jun 17 13:53 'Agent Linux - 192.168.91.226D2020-06-17T065146_8546.vbk'
 63M -rw-r--r-- 1 root root  63M Jun 17 13:57 'Agent Linux - 192.168.91.226D2020-06-17T065727_0228.vib'
317M -rw-r--r-- 1 root root 317M Jun 17 14:03 'Agent Linux - 192.168.91.226D2020-06-17T070240_FD8B.vib'
383M -rw-r--r-- 1 root root 383M Jun 17 14:09 'Agent Linux - 192.168.91.226D2020-06-17T070826_5860.vib'
2.4G -rw-r--r-- 1 root root 2.4G Jun 17 15:04 'Agent Linux - 192.168.91.226D2020-06-18T020624_DF44.vbk'
2.3G -rw-r--r-- 1 root root 2.3G Jun 18 13:20 'Agent Linux - 192.168.91.226D2020-06-18T062036_1F09.vbk'

The files themselves add up to 7.2 gigabytes. That's the kind of win we get.

With that, I consider the task of demonstrating the benefits of Fast Clone accomplished. As we can see, it's not only about saving space, even though these days people tend to think it's easier to buy another storage box and throw more disks into it. It also saves time and helps you fit into the required backup window: while the speed of a conventional synthetic full is limited by the performance of the disk subsystem, with ReFS/XFS the load is largely computational, and CPU and RAM resources are usually in much better supply.

And before saying goodbye, let me leave you a few useful links:

helpcenter.veeam.com/docs/backup/vsphere/backup_repository_block_cloning.html?ver=100 – Fast Clone in the documentation
www.veeam.com/kb3150 – Fast Clone and Nutanix Mine
blogs.oracle.com/linux/xfs-data-block-sharing-reflink – a great article on XFS and reflink on the Oracle blog
