Proxmox backup of container way bigger than original

Sep 1, 2022
I have a privileged container hosting an MS SQL server (for WSUS and other stuff). This is the configuration:

Code:
arch: amd64
cores: 4
features: nesting=1
hostname: mssql
memory: 2048
nameserver: 192.168.13.1
net0: name=eth0,bridge=vmbr0,gw=192.168.13.2,hwaddr=66:BC:CF:B2:F4:4B,ip=192.168.30.6/24,tag=30,type=veth
onboot: 1
ostype: ubuntu
rootfs: SSD:subvol-112-disk-1,size=16G
swap: 512

The virtual disk only contains about 5GB of data:

Code:
root@pve1:~# zfs list
NAME                    USED  AVAIL     REFER  MOUNTPOINT
SSD/subvol-112-disk-1  4.95G  11.1G     4.95G  /SSD/subvol-112-disk-1

The Storage SSD is a ZFS mirror-0 of two 4TB SSDs.

There seems to be something odd with the ZFS volume: when I back up the container, in my case to a Proxmox Backup Server, it traverses ~500GB (!) and reports that as the size of the backup. Fortunately, the compressed size is only about 3GB, but the backup of that container takes forever compared to its size. Here is the log of the backup:

Code:
INFO: Starting Backup of VM 112 (lxc)
INFO: Backup started at 2022-12-09 23:10:20
INFO: status = running
INFO: CT Name: mssql
INFO: including mount point rootfs ('/') in backup
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: create storage snapshot 'vzdump'
INFO: creating Proxmox Backup Server archive 'ct/112/2022-12-09T22:10:20Z'
INFO: run: /usr/bin/proxmox-backup-client backup --crypt-mode=none pct.conf:/var/tmp/vzdumptmp3563993_112/etc/vzdump/pct.conf root.pxar:/mnt/vzsnap0 --include-dev /mnt/vzsnap0/./ --skip-lost-and-found --exclude=/tmp/?* --exclude=/var/tmp/?* --exclude=/var/run/?*.pid --backup-type ct --backup-id 112 --backup-time 1670623820 --repository root@pam@192.168.1.1:Host-ZFS
INFO: Starting backup: ct/112/2022-12-09T22:10:20Z
INFO: Client name: pve1
INFO: Starting backup protocol: Fri Dec  9 23:10:20 2022
INFO: No previous manifest available.
INFO: Upload config file '/var/tmp/vzdumptmp3563993_112/etc/vzdump/pct.conf' to 'root@pam@192.168.1.1:8007:Host-ZFS' as pct.conf.blob
INFO: Upload directory '/mnt/vzsnap0' to 'root@pam@192.168.1.1:8007:Host-ZFS' as root.pxar.didx
INFO: root.pxar: had to backup 13.975 GiB of 486.277 GiB (compressed 3.281 GiB) in 1932.87s
INFO: root.pxar: average backup speed: 7.404 MiB/s
INFO: root.pxar: backup was done incrementally, reused 472.302 GiB (97.1%)
INFO: Uploaded backup catalog (746.738 KiB)
INFO: Duration: 1933.17s
INFO: End Time: Fri Dec  9 23:42:33 2022
INFO: cleanup temporary 'vzdump' snapshot
INFO: Finished Backup of VM 112 (00:32:14)
INFO: Backup finished at 2022-12-09 23:42:34
INFO: Backup job finished successfully

It's not just the backup that takes that long, but also moving the volume to another storage. I have another ZFS storage on that node, a RAIDZ2 of six 1TB SAS disks, which I used to check what happens when I move the 16GB volume to it. That takes about as long as the backup, which tells me that Proxmox again traverses ~500GB.

There are about 10 other containers on the same node and storage that do not show this behaviour. They are all unprivileged, however (if that matters).

I am curious about your ideas.

On another note: I deleted all previously created backups on the backup server. Why does the log above claim to have made an "incremental backup" then? Are there some remnants lingering on the server?
 
Hey, what disk size did you specify for the container? Proxmox shows the full disk size while running the backup task to PBS, even if it's mostly empty. For example, if you defined a 500 GB drive while setting up the container but it currently uses only 100 GB, it will still show that it's backing up 500 GB of data. However, it's never going to exceed 100 GB thanks to compression (in fact it will be much smaller, with MSSQL most likely under 10 GB). This is because most of the data are zeroes, which compress very well. However, if you used no compression at all, you'd be right: the backup would take the entire 500 GB, because it would create a full (virtual) drive image including the empty sectors.
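
(As an aside, the effect is easy to reproduce, as a rough sketch: a file full of zeroes compresses down to almost nothing. zstd is used here because PBS compresses its chunks with zstd.)

Code:
# 1 GiB of zeroes shrinks to a few KiB once compressed
dd if=/dev/zero of=/tmp/zeroes.img bs=1M count=1024
zstd -q /tmp/zeroes.img -o /tmp/zeroes.img.zst
ls -lh /tmp/zeroes.img /tmp/zeroes.img.zst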
 
As Dunuin already said, the disk was created as a 16GB disk. I also posted the zfs list output above to verify that the ZFS subvolume matches the size specified for the LXC.
 
Do you maybe mount some other stuff inside that privileged LXC? Like SMB/NFS shares that the backup might also need to go through?
 

No. See the output of df -h:

Code:
root@mssql:~# df -h
Filesystem             Size  Used Avail Use% Mounted on
SSD/subvol-112-disk-1   16G  4.9G   12G  31% /
none                   492K  4.0K  488K   1% /dev
tmpfs                   32G     0   32G   0% /dev/shm
tmpfs                  6.3G  1.6M  6.3G   1% /run
tmpfs                  5.0M     0  5.0M   0% /run/lock
tmpfs                  6.3G     0  6.3G   0% /run/user/0
 
Hi,
maybe you have a sparse file inside your container? You can check with e.g. the find three-liner mentioned here. There is also an open feature request to make handling sparse files more efficient in PBS.
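
(For reference, a minimal sketch of such a check, assuming GNU find; the exact three-liner behind the link may differ. find's %S prints the ratio of allocated blocks to apparent size, so a value far below 1 on a large file points to a sparse file.)

Code:
# list large files whose allocated blocks are far smaller than their apparent size
find / -xdev -type f -size +100M -printf '%S\t%s\t%p\n' 2>/dev/null | awk '$1 < 0.5'
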
That was actually the case: the file /var/log/lastlog, which is a sparse file, had an apparent size of about 470GB according to ls. From what I read about this file, that comes from large gaps between the user IDs of the recorded last logins.
This container was unprivileged before, and a join via sssd and our Active Directory didn't play well with the user ID mapping. I suppose the reported large user IDs are a leftover of that.
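
(For a rough sanity check, assuming the usual 292-byte lastlog record on x86_64: lastlog is indexed by UID with fixed-size records, so a single login from an AD-mapped UID around 1.8 billion already accounts for an apparent size in that range.)

Code:
# lastlog apparent size ≈ highest UID seen * record size (292 bytes on x86_64)
echo $(( 1800000000 * 292 ))   # 525600000000 bytes, i.e. roughly 490 GiB of mostly holes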

To solve my problem, I reset the file using the advice from here:
Code:
# cat > /var/log/lastlog

Since the container is now privileged and sssd plays nicely with the user ID mapping, the file no longer grows that big, and my backups are now much smaller and finish much quicker.
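
(If you want to verify the same thing on your own container, a short sketch: compare the apparent size against the space actually allocated; truncate also works as an alternative to the cat redirect above.)

Code:
# apparent size (huge) vs. space actually allocated on disk (small)
du -h --apparent-size /var/log/lastlog
du -h /var/log/lastlog
# alternative way to reset the file
truncate -s 0 /var/log/lastlog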
 
Thank you so much for bringing this up. I just installed sssd and had the same problem; backups would take ages!

THANK YOU!
 