Moving CT volume hangs

SlothCroissant

Feb 26, 2019
When migrating a CT storage volume from local to local-zfs (stop the CT, then "Move Storage" in the UI), the process hangs and never completes (30+ minutes so far). This happens with *all* my CTs (I am waiting on 5 hung move operations as we speak).

Here is the status of the move job:

Code:
Number of files: 55,516 (reg: 44,506, dir: 7,450, link: 3,527, dev: 2, special: 31)
Number of created files: 55,515 (reg: 44,506, dir: 7,449, link: 3,527, dev: 2, special: 31)
Number of deleted files: 0
Number of regular files transferred: 44,498
Total file size: 5,489,999,322 bytes
Total transferred file size: 5,485,749,574 bytes
Literal data: 5,485,749,574 bytes
Matched data: 0 bytes
File list size: 1,572,643
File list generation time: 0.008 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 5,490,606,875
Total bytes received: 897,364

sent 5,490,606,875 bytes  received 897,364 bytes  61,357,589.26 bytes/sec
total size is 5,489,999,322  speedup is 1.00
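
A quick sanity check on the summary above, as a minimal sketch: assuming rsync's reported rate is simply total bytes (sent plus received) divided by elapsed time, the copy itself took about a minute and a half, so the 30+ minute hang happens after the data transfer. The variable names are illustrative only.

```shell
# Figures copied from the rsync summary above.
sent=5490606875
received=897364
rate=61357589.26   # bytes/sec as reported by rsync

# Assumption: rate = (sent + received) / elapsed, so elapsed = total / rate.
elapsed=$(awk -v s="$sent" -v r="$received" -v bps="$rate" \
    'BEGIN { printf "%.1f", (s + r) / bps }')
echo "rsync ran for about ${elapsed} s"
```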

And the resulting volume is indeed created in ZFS (you can see all 5 disks are provisioned, actually):

Code:
root@ff-pve01:~# zfs list
NAME                            USED  AVAIL     REFER  MOUNTPOINT
pbs-ssd                         117G   782G       96K  /pbs-ssd
pbs-ssd/vm-100-disk-0          43.4G   782G     43.4G  -
pbs-ssd/vm-100-disk-1          73.4G   782G     73.4G  -
rpool                          56.2G  3.32T      104K  /rpool
rpool/ROOT                     15.8G  3.32T       96K  /rpool/ROOT
rpool/ROOT/pve-1               15.8G  3.32T     15.8G  /
rpool/data                     40.3G  3.32T      136K  /rpool/data
rpool/data/subvol-1000-disk-0  1.88G  30.1G     1.88G  /rpool/data/subvol-1000-disk-0
rpool/data/subvol-1003-disk-0  3.14G  60.9G     3.14G  /rpool/data/subvol-1003-disk-0
rpool/data/subvol-7001-disk-0  3.31G  4.69G     3.31G  /rpool/data/subvol-7001-disk-0
rpool/data/subvol-8001-disk-0  1.74G  14.3G     1.74G  /rpool/data/subvol-8001-disk-0
rpool/data/subvol-8002-disk-0  1.57G  14.4G     1.57G  /rpool/data/subvol-8002-disk-0
rpool/data/vm-100-disk-0       2.92G  3.32T     2.92G  -
rpool/data/vm-1500-disk-0      1.16G  3.32T     1.16G  -
rpool/data/vm-1501-disk-0      1.52G  3.32T     1.52G  -
rpool/data/vm-2001-disk-0      23.1G  3.32T     23.1G  -

Similar threads without any solution, aside from "a reboot fixed it" or "a recovery from backup fixed it":

* https://forum.proxmox.com/threads/moving-lxc-disk-to-a-mounted-directory-hangs.81623/
* https://forum.proxmox.com/threads/v...ontainer-from-local-to-nfs-never-ends.120957/
* https://forum.proxmox.com/threads/pct-move_volume-hangs-forever.122616/#post-532927

Code:
root@ff-pve01:~# pveversion -v
proxmox-ve: 7.4-1 (running kernel: 5.15.107-2-pve)
pve-manager: 7.4-4 (running version: 7.4-4/4a8501a8)
pve-kernel-5.15: 7.4-3
pve-kernel-5.15.107-2-pve: 5.15.107-2
pve-kernel-5.15.102-1-pve: 5.15.102-1
ceph-fuse: 15.2.17-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx4
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libproxmox-rs-perl: 0.2.1
libpve-access-control: 7.4-3
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.4-1
libpve-guest-common-perl: 4.2-4
libpve-http-server-perl: 4.2-3
libpve-rs-perl: 0.7.6
libpve-storage-perl: 7.4-3
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
openvswitch-switch: 2.15.0+ds1-2+deb11u4
proxmox-backup-client: 2.4.2-1
proxmox-backup-file-restore: 2.4.2-1
proxmox-kernel-helper: 7.4-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.7.0
pve-cluster: 7.3-3
pve-container: 4.4-4
pve-docs: 7.4-2
pve-edk2-firmware: 3.20230228-2
pve-firewall: 4.3-2
pve-firmware: 3.6-5
pve-ha-manager: 3.6.1
pve-i18n: 2.12-1
pve-qemu-kvm: 7.2.0-8
pve-xtermjs: 4.16.0-2
qemu-server: 7.4-3
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.11-pve1
 
Hi,
please check your system logs and post the output of ps faxl.
 
For anyone who comes across this: I just had the same problem and was able to trace it to a hanging call to `fuser`. That process was hanging because of an NFS mount from a server that had gone away. After unmounting the NFS-mounted directory (unrelated to the CT in question), the copy process finished immediately.
 
I can confirm what @helgew wrote. I was unable to move a CT too, caused by a hanging `fuser` process (also due to an NFS mount from a server that had gone away). In my case, I had to open a shell and kill the hanging process. After that the CT move completed.
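
For anyone wanting to check for the same cause, here is a minimal sketch of the diagnosis described in the two posts above: look for processes stuck in uninterruptible sleep (state `D`) and probe each mount with a time limit. The function names `filter_dstate` and `check_mount` are made up for this example, and the 5-second limit is an arbitrary assumption, not a Proxmox default.

```shell
# Keep the header line plus any process whose STAT column starts with D
# (uninterruptible sleep). Expects ps output with STAT as the 2nd column.
filter_dstate() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Probe a mount point without risking a hang of our own shell:
# stat can block indefinitely on a dead NFS server, so bound it with timeout.
check_mount() {
    if timeout -k 2 5 stat -t -- "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "unresponsive or missing: $1"
        return 1
    fi
}

# Typical use on the PVE host (not run here):
#   ps -eo pid,stat,wchan:32,args | filter_dstate
#   findmnt -rn -t nfs,nfs4 -o TARGET | while read -r m; do check_mount "$m"; done
#   umount -f -l /path/to/stale/mount   # only once you are sure which one is dead
```

If a `fuser` process shows up in the `D` list and one of the NFS mounts is reported as unresponsive, that matches the situation described here. Whether a forced or lazy unmount is safe depends on what else is using that mount.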
 
This may be the relevant section of the output of
Code:
ps faxl

1708076474681.png

Other than that, I could not find anything :(

I am moving a volume from my ZFS storage to a NAS.
1708076546930.png

I can already see that the file has been created on the NAS, but the disk is only 8 GB and the link is Gigabit Ethernet, which I have confirmed with iperf on multiple occasions.
It's been stuck there for about 30 minutes.
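
For scale, a rough back-of-the-envelope estimate, assuming "8GB" means 8 x 10^9 bytes and the link sustains the full 1 Gbit/s with no protocol overhead (real transfers will be somewhat slower):

```shell
disk_bytes=8000000000    # assumed: 8 GB, decimal
link_bps=1000000000      # assumed: 1 Gbit/s line rate

# bytes -> bits, then divide by the link speed in bits per second
seconds=$(( disk_bytes * 8 / link_bps ))
echo "ideal transfer time: ${seconds} s"
```

Even allowing generously for overhead, that is on the order of a minute or two, so 30 minutes points to something blocking rather than a slow copy.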
 