[SOLVED] LXC Restore - no space left on device

voarsh

Member
Nov 20, 2020
Creating filesystem with 786432 4k blocks and 196608 inodes
Filesystem UUID: b091a25d-4e2d-4506-8f0f-88e70c6cb0a3
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912

Allocating group tables: 0/24 done
Writing inode tables: 0/24 done
Creating journal (16384 blocks): done
Multiple mount protection is enabled with update interval 5 seconds.
Writing superblocks and filesystem accounting information: 0/24 done

extracting archive '/mnt/pve/FourTBpveIPC2Expansion/dump/vzdump-lxc-102-2021_01_08-17_43_57.tar.zst'
tar: ./var/lib/postgresql/11/main/base/16387/3258: Wrote only 1024 of 5120 bytes
tar: ./var/lib/postgresql/11/main/base/16387/17412: Cannot write: No space left on device
tar: ./var/lib/postgresql/11/main/base/16387/112: Cannot write: No space left on device


- omitting long error

tar: ./var/lib/ghostscript/CMap/UniJISX02132004-UTF32-V: Cannot create symlink to '/usr/share/poppler/cMap/Adobe-Japan1/UniJISX02132004-UTF32-V': No space left on device
tar: ./var/lib/ghostscript/CMap/UniJISX02132004-UTF32-H: Cannot create symlink to '/usr/share/poppler/cMap/Adobe-Japan1/UniJISX02132004-UTF32-H': No space left on device
tar: ./etc/ssl/certs/SwissSign_Gold_CA_-_G2.pem: Cannot create symlink to '/usr/share/ca-certificates/mozilla/SwissSign_Gold_CA_-_G2.crt': No space left on device
tar: ./etc/ssl/certs/OISTE_WISeKey_Global_Root_GA_CA.pem: Cannot create symlink to '/usr/share/ca-certificates/mozilla/OISTE_WISeKey_Global_Root_GA_CA.crt': No space left on device
tar: ./etc/ssl/certs/SSL.com_Root_Certification_Authority_ECC.pem: Cannot create symlink to '/usr/share/ca-certificates/mozilla/SSL.com_Root_Certification_Authority_ECC.crt': No space left on device
tar: ./etc/ssl/certs/Buypass_Class_3_Root_CA.pem: Cannot create symlink to '/usr/share/ca-certificates/mozilla/Buypass_Class_3_Root_CA.crt': No space left on device

- omitting long error

Total bytes read: 3029975040 (2.9GiB, 21MiB/s)
tar: Exiting with failure status due to previous errors
TASK ERROR: unable to restore CT 102 - command 'lxc-usernsexec -m u:0:100000:65536 -m g:0:100000:65536 -- tar xpf - --zstd --totals --one-file-system -p --sparse --numeric-owner --acls --xattrs '--xattrs-include=user.*' '--xattrs-include=security.capability' '--warning=no-file-ignored' '--warning=no-xattr-write' -C /var/lib/lxc/102/rootfs --skip-old-files --anchored --exclude './dev/*'' failed: exit code 2


I've read this might have something to do with the disk size of the LXC.
I'm not sure what to do. It's a 1.2 GB backup and a 3G disk (allocation).
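
For what it's worth, the numbers in the log above already hint at how tight a 3G disk is. Here is a rough sketch of the arithmetic, using only values copied from that log. It ignores inode tables, reserved blocks and the rounding of small files up to whole 4k blocks, so the real headroom is smaller than what it prints:

```shell
#!/bin/sh
# Numbers copied from the restore log above.
fs_blocks=786432        # "Creating filesystem with 786432 4k blocks"
block_size=4096         # 4k blocks
journal_blocks=16384    # "Creating journal (16384 blocks)"
tar_bytes=3029975040    # "Total bytes read: 3029975040"

fs_bytes=$((fs_blocks * block_size))
journal_bytes=$((journal_blocks * block_size))

# Space left for data once the journal is subtracted.
# Other ext4 metadata is NOT accounted for here (simplifying assumption).
usable_bytes=$((fs_bytes - journal_bytes))

echo "filesystem: $fs_bytes bytes"
echo "journal:    $journal_bytes bytes"
echo "tar stream: $tar_bytes bytes"
echo "headroom:   $((usable_bytes - tar_bytes)) bytes"
```

The 1.2 GB figure is the size of the zstd-compressed archive, whereas tar reports about 3.03 GB read while extracting, which is nearly the whole 3 GiB filesystem before any overhead is counted.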


pct restore --rootfs 3 102 vzdump-lxc-102-2021_01_08-08_30_02.tar.zst -storage local-lvm

unable to restore CT 102 - can't find file 'vzdump-lxc-102-2021_01_08-08_30_02.tar.zst'

- It can't seem to find the archive, so nothing gets restored to local-lvm.
The backup is in local.

It only seems to do something when I run it over SSH from inside /var/lib/vz/dump/
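
The "can't find file" error is consistent with the archive name being resolved relative to the current directory, which would explain why the command only works from inside the dump directory. Below is a minimal sketch that builds the command with an absolute path instead. `DUMP_DIR` and `archive_path` are made-up names for illustration, /var/lib/vz/dump is assumed to be where the `local` storage keeps its backups, and the command is only printed, not run:

```shell
#!/bin/sh
# Assumed dump directory of the "local" storage; adjust if yours differs.
DUMP_DIR=/var/lib/vz/dump

# Turn a bare archive name into an absolute path; leave real paths alone.
archive_path() {
    case "$1" in
        */*) printf '%s\n' "$1" ;;
        *)   printf '%s/%s\n' "$DUMP_DIR" "$1" ;;
    esac
}

archive=$(archive_path vzdump-lxc-102-2021_01_08-08_30_02.tar.zst)

# Printed rather than executed, so it can be checked before running it by hand.
echo "pct restore 102 $archive --rootfs 8 --storage local-lvm"
```

With the absolute path in place, the command should no longer depend on which directory it is started from.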
 
Code:
pct restore --rootfs 8 102 vzdump-lxc-102-2021_01_08-08_30_02.tar.zst -storage local-lvm

I had to make the disk larger, although I do not know why.
 
I'm a little confused. I found this thread, and it mirrors my experience.

I took a look at the filesystem of the backup after restoring it, having manually adjusted the disk size via the console command above.

This is a Debian Buster container I use for running Pi-hole. I'm setting it up on another node manually so I can take the first one down for an upgrade. The source LXC on Node 1 reports that it is using 947MB of a 2.00GB boot disk, which breaks down as:

0 ./sys
512 ./boot
512 ./home
512 ./media
512 ./mnt
512 ./srv
4.0K ./proc
23K ./root
27K ./tmp
64K ./run
158K ./opt
7.5M ./dev
254M ./var
322M ./usr
375M ./etc
958M .


I backed the container up to a storage server shared across all nodes and tried to restore it via the GUI on Node 2, which failed with the "out of space" errors noted in the posts above.


I then restored it manually with an enlarged disk, mounted the filesystem using "mount /dev/pve/vm-104-disk-0 /mnt/pve/pihole/" and compared the du result:

4.0K ./boot
4.0K ./dev
4.0K ./home
4.0K ./media
4.0K ./mnt
4.0K ./proc
4.0K ./srv
4.0K ./sys
4.0K ./tmp
16K ./lost+found
16K ./run
20K ./root
272K ./opt
509M ./usr
665M ./var
747M ./etc
1.9G .

Everything has doubled in size. Is this because the block size has jumped from 512 bytes to 4k, or is something else going on? I am perplexed.



EDIT: Answering my own confused question:
Node 1 had a ZFS rpool with compression on. Node 2 has ext4 with no compression. I checked, and compressratio was 2.13. So that explains that.
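
As a quick sanity check of that explanation, here is a rough sketch that scales the space used on the compressed source by the reported compressratio. `estimate_uncompressed` is a made-up helper name, and the result is only an estimate, because compressratio is an average over the whole dataset:

```shell
#!/bin/sh
# The ratio itself can be read on the source node with something like:
#   zfs get compressratio <dataset>
# and GNU du can show logical instead of on-disk sizes with:
#   du -sh --apparent-size .

# estimate_uncompressed USED_MB RATIO
# Prints the estimated size in MB once the data is written without compression.
estimate_uncompressed() {
    awk -v used="$1" -v ratio="$2" 'BEGIN { printf "%.0f\n", used * ratio }'
}

# 947MB used on Node 1, compressratio 2.13 (both figures from this post).
estimate_uncompressed 947 2.13
```

947MB at a ratio of 2.13 comes out at roughly 2.0GB, which is in the same range as the 1.9G that du reports on the ext4 restore and right at the limit of the original 2.00GB disk.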
 
Thanks for posting your findings. There must be more to this, however, as I seem to be running into the same issue. While your solution might work for me as well, your hypothesis (compression) wouldn't check out in my case. My layout is as follows:

Code:
sda              8:0    0 465.8G  0 disk
├─sda1           8:1    0  1007K  0 part
├─sda2           8:2    0     1G  0 part  /boot/efi
├─sda3           8:3    0    19G  0 part  /
└─sda4           8:4    0 445.8G  0 part
  └─fast_crypt 253:0    0 445.7G  0 crypt /mnt/fast
sdb \
sdc  | ZFS                 ~8.0T             /mnt/slow
sdd /

So, essentially two storage targets in the system, /mnt/fast and /mnt/slow. /mnt/slow contains the backups I'm restoring from.

* Restoring to /mnt/slow works.
* Restoring to /mnt/fast fails with the seemingly same errors ("No space left on device").

I only have this issue with one particular machine I try to restore. All others restore to either of the two storage targets without trouble. The first thing that distinguishes this machine from the others is that it has the following features enabled. None of the machines that restore successfully have any of these enabled:

* fuse=1
* mount=nfs
* nesting=1

The root partition is very limited (20GiB), so that might still be an issue, though I wouldn't know why exactly. Temporary storage for tar, perhaps? But 16GB of free space shouldn't be a bottleneck, if any is required at all:

/dev/sda3 19G 2.9G 16G 16% /

I had Btrfs on /mnt/fast first. I tried it as storage type BTRFS as well as Directory, with the same issue. I've since reformatted to XFS, with no change.

Lack of space on the root partition seems illogical as the restore to /mnt/slow works just fine. Any ideas?

Thank you~

---

PS.: Restoring to /mnt/slow and then trying to move the individual root device of the container to /mnt/fast yields the same errors.
 
TL;DR: Solved, issue found:

* Storage A compresses more than storage B. The VM's filesystem is actually larger than the nominal size of the disk but fits (on storage A) because it compresses well.
* Storage B compresses less (or not at all), so the disk image it creates is too small to hold the uncompressed or insufficiently compressed filesystem, and the restore runs out of space.
* Do a manual CLI restore, or do the magic below to increase the disk size, and restore again.
* Great success.



I've figured this out (apologies if this overlaps with what has already been said above). I kept wondering how, or why, it would run out of space: was the error literal, or a symptom of something else? Then I looked at a backup archive of the VM and realized that its contents are considerably larger than the archive itself. I have two storage targets of different types (ZFS and Btrfs). On the original ZFS target, the contents of the VM's root volume compress well enough to fit within the capacity of the root volume/disk image (and also into a backup archive that is deceptively compact). The other filesystem, although it (again, deceptively) has compression enabled, does not compress the raw disk image it creates enough to fit the contents. More specifically:

* Root disk on ZFS: the disk image is set to 2GiB, and the VM's filesystem fits within it even though it contains large log files of around 3.8GiB (of course it's logs crashing the party again...).
* The VM backs up to a compact archive of less than 2GiB as well.
* The VM fails to restore to the other storage target: "no space left".
* A manual CLI restore, or the magic explained below, resizes the disk to a capacity large enough to fit the filesystem even in an uncompressed or less compressed state. Re-attempting the restore to the second storage target then succeeds.

For a manual CLI restore, something like this can be used to increase the size of the root volume to be created; --storage is optional:

* pct restore <containerId> <pathToBackupArchive> --rootfs <sizeInGiB> --storage <nameOfTargetStorage>
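
To pick a value for <sizeInGiB>, one option is to start from the uncompressed size of the archive rather than the size of the .tar.zst file. Below is a minimal sketch; `rootfs_gib` is a hypothetical helper, and the 20% default headroom is an arbitrary assumption, not a Proxmox recommendation:

```shell
#!/bin/sh
# The uncompressed size of an archive could be measured with, for example:
#   zstd -dc <pathToBackupArchive> | wc -c

# rootfs_gib BYTES [HEADROOM_PERCENT]
# Prints a whole number of GiB: BYTES plus headroom, rounded up.
rootfs_gib() {
    bytes=$1
    headroom=${2:-20}                 # assumed default, tune to taste
    gib=$((1024 * 1024 * 1024))
    padded=$((bytes * (100 + headroom) / 100))
    echo $(( (padded + gib - 1) / gib ))
}

# Example input: the "Total bytes read" figure from the log in the first post.
rootfs_gib 3029975040
```

Treat the result as a starting point only. Filesystem metadata and many small files can need more room than the raw byte count suggests, so erring on the large side is cheap insurance.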

Magic to increase bounds of image:

* /etc/pve/nodes/<nodeName>/lxc/<vid>.conf: rootfs: [...],size=2G: Increase this size to fit the uncompressed filesystem.
* Back up the machine.
* Restore the machine from this backup. Proxmox now creates a new disk image with the new capacity, so no more running out of space while writing the pesky logs... or whatever it might be.
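
The config edit in the first step of the list above can also be scripted. Here is a small sketch using sed; `bump_rootfs_size` is a made-up name, the volume name in the sample line is hypothetical, and it is safer to try this on a copy of the config first:

```shell
#!/bin/sh
# bump_rootfs_size NEW_SIZE
# Reads a container config on stdin and writes it to stdout with the size=
# value on the rootfs line replaced. All other lines pass through unchanged.
bump_rootfs_size() {
    sed -E "/^rootfs:/ s/size=[0-9.]+[KMGT]?/size=$1/"
}

# Sample line with a hypothetical volume name, for illustration only.
printf 'rootfs: local-zfs:subvol-102-disk-0,size=2G\n' | bump_rootfs_size 8G
```

Because only the line starting with rootfs: is touched, size= values on additional mount points are left as they are.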

PS: I had mounted the Btrfs storage with compression on, albeit soft compression. Given that Proxmox creates a single raw disk file, Btrfs probably did not end up compressing it: the beginning of the image is probably fairly incompressible (binaries), which causes Btrfs to abandon compression for that entire file. The large files that do compress on the ZFS storage target (zvol, individual files, per-file compression) therefore do not get compressed and blow up to 100% of their raw size. Writing them into the new, size-limited image during the restore then runs out of space. It makes sense but is quite unintuitive, as nothing obviously violates the 2GiB limit of the root disk.
 