[SOLVED] Error code 11 when moving storage from zfspool to local: "no space left on device" but there is plenty of space

jsabater

I have a 5-node cluster running LXC containers only, and I am trying to move the 3 GB root disk of one of them from the zfspool pool to the local pool (ext4) on the same node, but I am getting this error:

Code:
Formatting '/var/lib/vz/images/106/vm-106-disk-0.raw', fmt=raw size=3221225472 preallocation=off
Creating filesystem with 786432 4k blocks and 196608 inodes
Filesystem UUID: c05d40e3-b520-4ea5-a175-9ac6491af8bf
Superblock backups stored on blocks:
    32768, 98304, 163840, 229376, 294912
rsync: [receiver] write failed on "/var/lib/lxc/106/.copy-volume-1/home/ansible/.cache/ansible-compat/21d1f8/collections/ansible_collections/community/aws/plugins/modules/elb_target_group.py": No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(378) [receiver=3.2.3]
rsync: [sender] write error: Broken pipe (32)
TASK ERROR: command 'rsync --stats -X -A --numeric-ids -aH --whole-file --sparse --one-file-system '--bwlimit=0' /var/lib/lxc/106/.copy-volume-2/ /var/lib/lxc/106/.copy-volume-1' failed: exit code 11

The LXC is just an Ansible Controller with 1 core, 512 MB of RAM and 3 GB of storage on the zfspool pool, so nothing "fancy".

Is there a way to do this, or does the storage of my LXC have to stay on ZFS once it is there?
 
Thanks for your reply, but that does not seem to be the problem. The local pool has more than 400 GB of free space, although, indeed, it is only 500 GB in size whereas the zfspool pool is 4 TB.

Could it be attempting to provision more storage than needed on the local pool, either by mistake or for lack of some parameter I could add if I did this from the console (via pct move-volume, I presume, but I haven't seen any relevant option in the command's man page)?
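For reference, this is roughly the CLI command I had in mind (just a sketch on my part; as far as I can tell it offers no option to resize the volume during the move):

Code:
# move the root disk of container 106 from the zfspool storage to the local storage
pct move-volume 106 rootfs local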
 
Then you should post more details, like the LXC config, the free storage of the pools and so on...
 
These are the versions I am running on the node where this container is located, named proxmox3:

Code:
# pveversion -v
proxmox-ve: 7.3-1 (running kernel: 5.15.85-1-pve)
pve-manager: 7.3-6 (running version: 7.3-6/723bb6ec)
pve-kernel-helper: 7.3-4
pve-kernel-5.15: 7.3-2
pve-kernel-5.15.85-1-pve: 5.15.85-1
pve-kernel-5.15.83-1-pve: 5.15.83-1
ceph-fuse: 14.2.21-1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.3
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.3-1
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.3-2
libpve-guest-common-perl: 4.2-3
libpve-http-server-perl: 4.1-5
libpve-storage-perl: 7.3-2
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-1
lxcfs: 5.0.3-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.3.3-1
proxmox-backup-file-restore: 2.3.3-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.5
pve-cluster: 7.3-2
pve-container: 4.4-2
pve-docs: 7.3-1
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-7
pve-firmware: 3.6-3
pve-ha-manager: 3.5.1
pve-i18n: 2.8-2
pve-qemu-kvm: 7.2.0-5
pve-xtermjs: 4.16.0-1
qemu-server: 7.3-3
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+2
vncterm: 1.7-1
zfsutils-linux: 2.1.9-pve1

There's enough disk space in both pools, as can be seen here:

Code:
# df -h
Filesystem                 Size  Used Avail Use% Mounted on
udev                        32G     0   32G   0% /dev
tmpfs                      6.3G 1004K  6.3G   1% /run
/dev/md2                   407G  5.7G  381G   2% /
tmpfs                       32G   66M   32G   1% /dev/shm
tmpfs                      5.0M     0  5.0M   0% /run/lock
/dev/md1                   989M  159M  779M  17% /boot
zfspool                    3.6T  128K  3.6T   1% /zfspool
zfspool/subvol-106-disk-0  3.0G  2.0G  1.1G  67% /zfspool/subvol-106-disk-0
zfspool/subvol-108-disk-0   10G  5.8G  4.3G  58% /zfspool/subvol-108-disk-0
zfspool/subvol-109-disk-0   50G  947M   50G   2% /zfspool/subvol-109-disk-0
/dev/fuse                  128M  168K  128M   1% /etc/pve
tmpfs                      6.3G     0  6.3G   0% /run/user/0

The specific container I am trying to operate on is #106:

Code:
# pct list
VMID       Status     Lock         Name              
106        running                 ansible1          
108        running                 minio1            
109        running                 postgresql3        
root@proxmox3 ~ # pct df 106
MP     Volume                    Size Used Avail Use% Path
rootfs zfspool:subvol-106-disk-0 3.0G 2.0G  1.0G  0.7 /

Finally, I found this blog post from 2017. I take it that must have been the way to do it before the LXC > Resources > Root disk > Move storage option was added to the WebGUI?
 

Attachments

  • proxmox3.png
  • ansible1.png
  • ansible1-resources.png
I also tried restoring the LXC from the backup on Proxmox Backup Server (pbs1), without success. The error is the same. The node I was restoring it onto has the same configuration as proxmox3, that is, a local pool and a zfspool pool. Both pools have sufficient disk space to hold a 3 GB image. Here is the output, and attached is a screenshot of the restore window:

Code:
recovering backed-up configuration from 'pbs1:backup/ct/106/2023-04-03T06:15:01Z'
Formatting '/var/lib/vz/images/112/vm-112-disk-0.raw', fmt=raw size=3221225472 preallocation=off
Creating filesystem with 786432 4k blocks and 196608 inodes
Filesystem UUID: a7c19be5-2bb1-4b62-8631-b1b1b74234e9
Superblock backups stored on blocks:
    32768, 98304, 163840, 229376, 294912
restoring 'pbs1:backup/ct/106/2023-04-03T06:15:01Z' now..
Error: error extracting archive - error at entry "index.html": failed to copy file contents: No space left on device (os error 28)
TASK ERROR: unable to restore CT 112 - command 'lxc-usernsexec -m u:0:100000:65536 -m g:0:100000:65536 -- /usr/bin/proxmox-backup-client restore '--crypt-mode=none' ct/106/2023-04-03T06:15:01Z root.pxar /var/lib/lxc/112/rootfs --allow-existing-dirs --repository backupuser@pbs@192.168.1.10:local' failed: exit code 255

May this be a bug in Proxmox? I recall being able to do this about a year ago without any issues. Well, at least that is how I remember it.
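For reference, this is roughly the CLI equivalent of the restore I attempted through the WebGUI (storage and backup names as in my setup):

Code:
# restore the PBS backup as CT 112 with its root disk on the local storage
pct restore 112 pbs1:backup/ct/106/2023-04-03T06:15:01Z --storage local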
 

Attachments

  • ansible1_restore_from_pbs1.png
I think it is. Did you try restoring it to the ZFS pool as well?
Maybe you'll find some more error hints in other log files, but I don't know where to look for them.
 
Indeed I can restore a backup of the LXC on proxmox5 if I select the zfspool storage of that node (same name, same setup, I did that on purpose).

Should I open a bug at the bug tracker?
 
I don't know the process. I think the staff should post their opinion.
 
I did a quick reproduction with a small, empty container and it worked. So do you have any special configuration for your container?
I am using 7.2-11.
 
I would say that no, I don't have anything "special". I created the ZFS pool via the WebGUI, in mirror mode, with default values. As I said, I saw this working about a year ago (probably on 7.0 to 7.2, but don't take my word for it). It's not that big a disk for this to be an issue.

These are the LXC options:
  • Start at boot: No
  • Start/Shutdown order: order=any
  • OS Type: debian
  • Architecture: amd64
  • /dev/console: Enabled
  • TTY count: 2
  • Console mode: tty
  • Protection: No
  • Unprivileged container: Yes
  • Features: nesting=1
I think that somehow the image must have become "corrupted", or some flag is not adequately set. I just don't know where to look or what to look for. Any hints?
 
Maybe the volume compresses so well on ZFS that it stores more than 3 GB of logical data (zfs get all zfspool/subvol-106-disk-0 should tell you)? You could try making the volume bigger and then retrying the move volume operation.
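The properties of interest there would be roughly these (a focused query instead of the full listing):

Code:
# logicalreferenced vs. refquota shows how much uncompressed data has to fit into a volume of that size
zfs get compressratio,logicalreferenced,refquota zfspool/subvol-106-disk-0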
 
Thanks for your reply, Fabian. It does, indeed, compress fairly well, but not to the point where 400 GB of free space would not suffice. Do you see anything in the output of the command you suggested that gives a clue as to what's happening, @fabian?

Code:
# zfs get all zfspool/subvol-106-disk-0
NAME                       PROPERTY              VALUE                       SOURCE
zfspool/subvol-106-disk-0  type                  filesystem                  -
zfspool/subvol-106-disk-0  creation              Wed Apr 20 13:40 2022       -
zfspool/subvol-106-disk-0  used                  1.99G                       -
zfspool/subvol-106-disk-0  available             1.01G                       -
zfspool/subvol-106-disk-0  referenced            1.99G                       -
zfspool/subvol-106-disk-0  compressratio         2.15x                       -
zfspool/subvol-106-disk-0  mounted               yes                         -
zfspool/subvol-106-disk-0  quota                 none                        default
zfspool/subvol-106-disk-0  reservation           none                        default
zfspool/subvol-106-disk-0  recordsize            128K                        default
zfspool/subvol-106-disk-0  mountpoint            /zfspool/subvol-106-disk-0  default
zfspool/subvol-106-disk-0  sharenfs              off                         default
zfspool/subvol-106-disk-0  checksum              on                          default
zfspool/subvol-106-disk-0  compression           on                          inherited from zfspool
zfspool/subvol-106-disk-0  atime                 on                          default
zfspool/subvol-106-disk-0  devices               on                          default
zfspool/subvol-106-disk-0  exec                  on                          default
zfspool/subvol-106-disk-0  setuid                on                          default
zfspool/subvol-106-disk-0  readonly              off                         default
zfspool/subvol-106-disk-0  zoned                 off                         default
zfspool/subvol-106-disk-0  snapdir               hidden                      default
zfspool/subvol-106-disk-0  aclmode               discard                     default
zfspool/subvol-106-disk-0  aclinherit            restricted                  default
zfspool/subvol-106-disk-0  createtxg             19231                       -
zfspool/subvol-106-disk-0  canmount              on                          default
zfspool/subvol-106-disk-0  xattr                 sa                          local
zfspool/subvol-106-disk-0  copies                1                           default
zfspool/subvol-106-disk-0  version               5                           -
zfspool/subvol-106-disk-0  utf8only              off                         -
zfspool/subvol-106-disk-0  normalization         none                        -
zfspool/subvol-106-disk-0  casesensitivity       sensitive                   -
zfspool/subvol-106-disk-0  vscan                 off                         default
zfspool/subvol-106-disk-0  nbmand                off                         default
zfspool/subvol-106-disk-0  sharesmb              off                         default
zfspool/subvol-106-disk-0  refquota              3G                          local
zfspool/subvol-106-disk-0  refreservation        none                        default
zfspool/subvol-106-disk-0  guid                  145368298871665946          -
zfspool/subvol-106-disk-0  primarycache          all                         default
zfspool/subvol-106-disk-0  secondarycache        all                         default
zfspool/subvol-106-disk-0  usedbysnapshots       0B                          -
zfspool/subvol-106-disk-0  usedbydataset         1.99G                       -
zfspool/subvol-106-disk-0  usedbychildren        0B                          -
zfspool/subvol-106-disk-0  usedbyrefreservation  0B                          -
zfspool/subvol-106-disk-0  logbias               latency                     default
zfspool/subvol-106-disk-0  objsetid              1391                        -
zfspool/subvol-106-disk-0  dedup                 off                         default
zfspool/subvol-106-disk-0  mlslabel              none                        default
zfspool/subvol-106-disk-0  sync                  standard                    default
zfspool/subvol-106-disk-0  dnodesize             legacy                      default
zfspool/subvol-106-disk-0  refcompressratio      2.15x                       -
zfspool/subvol-106-disk-0  written               1.99G                       -
zfspool/subvol-106-disk-0  logicalused           3.89G                       -
zfspool/subvol-106-disk-0  logicalreferenced     3.89G                       -
zfspool/subvol-106-disk-0  volmode               default                     default
zfspool/subvol-106-disk-0  filesystem_limit      none                        default
zfspool/subvol-106-disk-0  snapshot_limit        none                        default
zfspool/subvol-106-disk-0  filesystem_count      none                        default
zfspool/subvol-106-disk-0  snapshot_count        none                        default
zfspool/subvol-106-disk-0  snapdev               hidden                      default
zfspool/subvol-106-disk-0  acltype               posix                       local
zfspool/subvol-106-disk-0  context               none                        default
zfspool/subvol-106-disk-0  fscontext             none                        default
zfspool/subvol-106-disk-0  defcontext            none                        default
zfspool/subvol-106-disk-0  rootcontext           none                        default
zfspool/subvol-106-disk-0  relatime              off                         default
zfspool/subvol-106-disk-0  redundant_metadata    all                         default
zfspool/subvol-106-disk-0  overlay               on                          default
zfspool/subvol-106-disk-0  encryption            off                         default
zfspool/subvol-106-disk-0  keylocation           none                        default
zfspool/subvol-106-disk-0  keyformat             none                        default
zfspool/subvol-106-disk-0  pbkdf2iters           0                           default
zfspool/subvol-106-disk-0  special_small_blocks  0                           default
 
The issue is not the free space on the storage, but the size of the volume itself. You are trying to move almost 4G of logical data into a volume that is only 3G in size ;) That cannot work if the target volume doesn't support/use compression.
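In other words, something along these lines should let the move go through (how much to grow the volume depends on the logical data; +3G is just an example):

Code:
# grow the container's root disk so the uncompressed data fits, then retry the move
pct resize 106 rootfs +3G
pct move-volume 106 rootfs local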
 
Ah, I see. That, indeed, makes sense.

Would making the volume bigger, say from 3 GB to 6 GB, and then immediately trying to move the storage from the zfspool pool to the local pool solve the issue?

Also, were this to be the solution, would you like me to open a ticket on the bug tracker asking for the WebGUI and tools to check for this scenario and offer a solution instead of just failing?

Thanks! Much appreciated.
 
Would making the volume bigger, say from 3 GB to 6 GB, and then immediately trying to move the storage from the zfspool pool to the local pool solve the issue?
Yes, this is what I suggested earlier in this thread ;)

I don't think this warrants a bug report.
 
Just to confirm that it worked. I tried adding 1 GB at a time and at 5 GB the process went fine.

Again, thank you very much for your time, Fabian. Marking the thread as solved.

P.S. Wouldn't it be nice for the command being executed via the WebGUI to check the compression ratio and do some basic math, so that it could tell the user what to do, e.g. "increase storage to 5 GB to be able to run this task"? :)
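Something along these lines is the kind of back-of-the-envelope check I mean (purely illustrative on my part, not an existing Proxmox feature; it assumes a refquota is set on the subvol, as it is here):

Code:
#!/bin/sh
# purely illustrative: warn if the logical (uncompressed) data would not fit into the volume's quota
vol=zfspool/subvol-106-disk-0
logical=$(zfs get -Hp -o value logicalreferenced "$vol")
quota=$(zfs get -Hp -o value refquota "$vol")
if [ "$logical" -gt "$quota" ]; then
    echo "warning: $vol holds $logical bytes of logical data but its quota is only $quota bytes;"
    echo "grow the volume before moving it to a storage without compression"
fi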
 
The problem is that the upper layer doesn't know whether the lower storage layer uses compression, nor how much it affects the data.
 
Couldn't it be inferred from the compressratio property (and maybe some others) reported by the zfs get all command? As in, at least, providing a warning to the user and a lead to a solution in case he or she gets the "no space left on device" error?

Just trying to be helpful here :)
 
Yes, it could, but that information is rather storage-specific and is currently not exposed in a way that the upper layers, which are not concerned with the storage type, can handle.