I have a single node in a 3-node cluster that ALWAYS crashes on an incoming migration. I have reinstalled the OS on it twice. This began after upgrading to v8.x and was not an issue on v7.x. I have tried it with an add-on USB 2.5GbE NIC on a dedicated migration LAN and with the onboard 1GbE shared LAN. Hosts are Beelink SER5pro. I just ran `apt update && apt upgrade -y && reboot` on the erring host.
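Since the failure mode is the destination host rebooting mid-transfer, I first confirm that pve2 really went down and pull the logs from the boot during which the incoming migration was running. A minimal sketch of what I run from another node, assuming root SSH access between nodes and a persistent journal on pve2 (both true in my setup):
Code:
```
# confirm pve2 actually rebooted (recent reboot records from wtmp)
ssh root@pve2 'last reboot | head'
# last lines of the previous boot's journal, i.e. the boot that died mid-migration
ssh root@pve2 'journalctl -b -1 -n 100 --no-pager'
# only error-priority messages from that previous boot
ssh root@pve2 'journalctl -b -1 -p err --no-pager'
```
If the journal of the crashed boot simply ends with nothing unusual logged, that points at a hard reset rather than a clean kernel panic.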
Steps to reproduce:
1. Right-click a VM, click migrate
2. Choose pve2, no other options
3. Watch the migration in the task viewer.
4. The task fails after an arbitrary number of GB transferred; it appears to stall at the moment the destination host restarts.
Output from the task viewer:
Code:
```
2023-08-23 10:30:03 use dedicated network address for sending migration traffic (10.2.2.62)
2023-08-23 10:30:03 starting migration of VM 105 to node 'pve2' (10.2.2.62)
2023-08-23 10:30:03 found local disk 'local-zfs:vm-105-disk-0' (attached)
2023-08-23 10:30:03 found local disk 'local-zfs:vm-105-disk-1' (attached)
2023-08-23 10:30:03 found generated disk 'local-zfs:vm-105-disk-2' (in current VM config)
2023-08-23 10:30:03 copying local disk images
2023-08-23 10:30:04 full send of rpool/data/vm-105-disk-0@__migration__ estimated size is 573K
2023-08-23 10:30:04 total estimated size is 573K
2023-08-23 10:30:05 successfully imported 'local-zfs:vm-105-disk-0'
2023-08-23 10:30:05 volume 'local-zfs:vm-105-disk-0' is 'local-zfs:vm-105-disk-0' on the target
2023-08-23 10:30:05 full send of rpool/data/vm-105-disk-1@__migration__ estimated size is 13.9G
2023-08-23 10:30:05 total estimated size is 13.9G
2023-08-23 10:30:06 TIME SENT SNAPSHOT rpool/data/vm-105-disk-1@__migration__
2023-08-23 10:30:06 10:30:06 27.5M rpool/data/vm-105-disk-1@__migration__
2023-08-23 10:30:07 10:30:07 59.1M rpool/data/vm-105-disk-1@__migration__
2023-08-23 10:30:08 10:30:08 90.8M rpool/data/vm-105-disk-1@__migration__
2023-08-23 10:30:09 10:30:09 123M rpool/data/vm-105-disk-1@__migration__
2023-08-23 10:30:10 10:30:10 154M rpool/data/vm-105-disk-1@__migration__
2023-08-23 10:30:11 10:30:11 186M rpool/data/vm-105-disk-1@__migration__
2023-08-23 10:30:13 10:30:12 218M rpool/data/vm-105-disk-1@__migration__
2023-08-23 10:30:14 10:30:13 249M rpool/data/vm-105-disk-1@__migration__
2023-08-23 10:30:15 10:30:15 281M rpool/data/vm-105-disk-1@__migration__
2023-08-23 10:30:16 10:30:16 313M rpool/data/vm-105-disk-1@__migration__
2023-08-23 10:30:17 10:30:17 345M rpool/data/vm-105-disk-1@__migration__
2023-08-23 10:30:18 10:30:18 376M rpool/data/vm-105-disk-1@__migration__
2023-08-23 10:30:19 10:30:19 408M rpool/data/vm-105-disk-1@__migration__
2023-08-23 10:30:20 10:30:20 439M rpool/data/vm-105-disk-1@__migration__
2023-08-23 10:30:21 10:30:21 471M rpool/data/vm-105-disk-1@__migration__
2023-08-23 10:30:22 10:30:22 494M rpool/data/vm-105-disk-1@__migration__
...message repeats every second
2023-08-23 10:30:58 10:30:58 494M rpool/data/vm-105-disk-1@__migration__
2023-08-23 10:30:58 client_loop: send disconnect: Broken pipe
2023-08-23 10:30:58 command 'zfs send -Rpv -- rpool/data/vm-105-disk-1@__migration__' failed: got signal 13
send/receive failed, cleaning up snapshot(s)..
2023-08-23 10:30:58 ERROR: storage migration for 'local-zfs:vm-105-disk-1' to storage 'local-zfs' failed - command 'set -o pipefail && pvesm export local-zfs:vm-105-disk-1 zfs - -with-snapshots 0 -snapshot __migration__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve2' root@10.2.2.62 -- pvesm import local-zfs:vm-105-disk-1 zfs - -with-snapshots 0 -snapshot __migration__ -delete-snapshot __migration__ -allow-rename 1' failed: exit code 255
2023-08-23 10:30:58 aborting phase 1 - cleanup resources
2023-08-23 10:30:58 ERROR: command '/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve2' root@10.2.2.62 pvesm free local-zfs:vm-105-disk-0' failed: exit code 255
2023-08-23 10:30:58 ERROR: migration aborted (duration 00:00:55): storage migration for 'local-zfs:vm-105-disk-1' to storage 'local-zfs' failed - command 'set -o pipefail && pvesm export local-zfs:vm-105-disk-1 zfs - -with-snapshots 0 -snapshot __migration__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve2' root@10.2.2.62 -- pvesm import local-zfs:vm-105-disk-1 zfs - -with-snapshots 0 -snapshot __migration__ -delete-snapshot __migration__ -allow-rename 1' failed: exit code 255
TASK ERROR: migration aborted
```
If run from the terminal, the output is as follows:
Code:
```
root@pve3:~# qm migrate 105 pve2
2023-08-23 10:35:28 use dedicated network address for sending migration traffic (10.2.2.62)
2023-08-23 10:35:29 starting migration of VM 105 to node 'pve2' (10.2.2.62)
2023-08-23 10:35:29 found local disk 'local-zfs:vm-105-disk-0' (attached)
2023-08-23 10:35:29 found local disk 'local-zfs:vm-105-disk-1' (attached)
2023-08-23 10:35:29 found generated disk 'local-zfs:vm-105-disk-2' (in current VM config)
2023-08-23 10:35:29 copying local disk images
2023-08-23 10:35:30 full send of rpool/data/vm-105-disk-0@__migration__ estimated size is 573K
2023-08-23 10:35:30 total estimated size is 573K
2023-08-23 10:35:30 volume 'rpool/data/vm-105-disk-0' already exists - importing with a different name
2023-08-23 10:35:30 successfully imported 'local-zfs:vm-105-disk-1'
2023-08-23 10:35:31 volume 'local-zfs:vm-105-disk-0' is 'local-zfs:vm-105-disk-1' on the target
2023-08-23 10:35:33 full send of rpool/data/vm-105-disk-1@__migration__ estimated size is 13.9G
2023-08-23 10:35:33 total estimated size is 13.9G
2023-08-23 10:35:33 volume 'rpool/data/vm-105-disk-1' already exists - importing with a different name
2023-08-23 10:35:35 TIME SENT SNAPSHOT rpool/data/vm-105-disk-1@__migration__
2023-08-23 10:35:35 10:35:35 33.2M rpool/data/vm-105-disk-1@__migration__
2023-08-23 10:35:36 10:35:36 64.8M rpool/data/vm-105-disk-1@__migration__
2023-08-23 10:35:37 10:35:37 96.5M rpool/data/vm-105-disk-1@__migration__
2023-08-23 10:35:38 10:35:38 128M rpool/data/vm-105-disk-1@__migration__
2023-08-23 10:35:39 10:35:39 160M rpool/data/vm-105-disk-1@__migration__
2023-08-23 10:35:40 10:35:40 165M rpool/data/vm-105-disk-1@__migration__
...message repeats every second
2023-08-23 10:36:08 10:36:08 165M rpool/data/vm-105-disk-1@__migration__
2023-08-23 10:36:08 client_loop: send disconnect: Broken pipe
2023-08-23 10:36:09 command 'zfs send -Rpv -- rpool/data/vm-105-disk-1@__migration__' failed: got signal 13
send/receive failed, cleaning up snapshot(s)..
2023-08-23 10:36:10 ERROR: storage migration for 'local-zfs:vm-105-disk-1' to storage 'local-zfs' failed - command 'set -o pipefail && pvesm export local-zfs:vm-105-disk-1 zfs - -with-snapshots 0 -snapshot __migration__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve2' root@10.2.2.62 -- pvesm import local-zfs:vm-105-disk-1 zfs - -with-snapshots 0 -snapshot __migration__ -delete-snapshot __migration__ -allow-rename 1' failed: exit code 255
2023-08-23 10:36:10 aborting phase 1 - cleanup resources
2023-08-23 10:36:11 ERROR: migration aborted (duration 00:00:43): storage migration for 'local-zfs:vm-105-disk-1' to storage 'local-zfs' failed - command 'set -o pipefail && pvesm export local-zfs:vm-105-disk-1 zfs - -with-snapshots 0 -snapshot __migration__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve2' root@10.2.2.62 -- pvesm import local-zfs:vm-105-disk-1 zfs - -with-snapshots 0 -snapshot __migration__ -delete-snapshot __migration__ -allow-rename 1' failed: exit code 255
migration aborted
```
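The failing command in both logs is just a `pvesm export ... | ssh ... pvesm import ...` pipeline, i.e. a ZFS send/receive over SSH to 10.2.2.62. To see whether the PVE layer matters at all, I plan to run the equivalent transfer by hand; a sketch, where the snapshot name `manualtest` and the temporary target dataset name are my own choices, not anything Proxmox generates:
Code:
```
# on the source node (pve3): snapshot the disk and stream it to pve2 over SSH
zfs snapshot rpool/data/vm-105-disk-1@manualtest
zfs send -v rpool/data/vm-105-disk-1@manualtest \
  | ssh root@10.2.2.62 zfs receive rpool/data/vm-105-disk-1-manualtest

# clean up on both ends afterwards
zfs destroy rpool/data/vm-105-disk-1@manualtest
ssh root@10.2.2.62 zfs destroy -r rpool/data/vm-105-disk-1-manualtest
```
If this plain send/receive also reboots pve2, the problem sits below Proxmox (ZFS receive load, NIC driver, or hardware on that host) rather than in the migration code itself.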
Erring host (pve2):
Code:
```
root@pve2:~# pveversion -v
proxmox-ve: 8.0.2 (running kernel: 6.2.16-10-pve)
pve-manager: 8.0.4 (running version: 8.0.4/d258a813cfa6b390)
pve-kernel-6.2: 8.0.5
proxmox-kernel-helper: 8.0.3
proxmox-kernel-6.2.16-10-pve: 6.2.16-10
proxmox-kernel-6.2: 6.2.16-10
proxmox-kernel-6.2.16-6-pve: 6.2.16-7
pve-kernel-6.2.16-3-pve: 6.2.16-3
ceph-fuse: 17.2.6-pve1+3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.25-pve1
libproxmox-acme-perl: 1.4.6
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.1
libpve-access-control: 8.0.4
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.0.7
libpve-guest-common-perl: 5.0.4
libpve-http-server-perl: 5.0.4
libpve-network-perl: 0.8.1
libpve-rs-perl: 0.8.5
libpve-storage-perl: 8.0.2
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-2
proxmox-backup-client: 3.0.2-1
proxmox-backup-file-restore: 3.0.2-1
proxmox-kernel-helper: 8.0.3
proxmox-mail-forward: 0.2.0
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.0.6
pve-cluster: 8.0.3
pve-container: 5.0.4
pve-docs: 8.0.4
pve-edk2-firmware: 3.20230228-4
pve-firewall: 5.0.3
pve-firmware: 3.7-1
pve-ha-manager: 4.0.2
pve-i18n: 3.0.5
pve-qemu-kvm: 8.0.2-4
pve-xtermjs: 4.16.0-3
qemu-server: 8.0.6
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.1.12-pve1
```
Source host (pve3):
Code:
```
root@pve3:~# pveversion -v
proxmox-ve: 8.0.2 (running kernel: 6.2.16-3-pve)
pve-manager: 8.0.4 (running version: 8.0.4/d258a813cfa6b390)
pve-kernel-6.2: 8.0.5
proxmox-kernel-helper: 8.0.3
proxmox-kernel-6.2.16-6-pve: 6.2.16-7
proxmox-kernel-6.2: 6.2.16-7
pve-kernel-6.2.16-4-pve: 6.2.16-5
pve-kernel-6.2.16-3-pve: 6.2.16-3
ceph-fuse: 17.2.6-pve1+3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-3
libknet1: 1.25-pve1
libproxmox-acme-perl: 1.4.6
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.1
libpve-access-control: 8.0.4
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.0.7
libpve-guest-common-perl: 5.0.4
libpve-http-server-perl: 5.0.4
libpve-network-perl: 0.8.1
libpve-rs-perl: 0.8.5
libpve-storage-perl: 8.0.2
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-2
proxmox-backup-client: 3.0.2-1
proxmox-backup-file-restore: 3.0.2-1
proxmox-kernel-helper: 8.0.3
proxmox-mail-forward: 0.2.0
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.0.6
pve-cluster: 8.0.3
pve-container: 5.0.4
pve-docs: 8.0.4
pve-edk2-firmware: 3.20230228-4
pve-firewall: 5.0.3
pve-firmware: 3.7-1
pve-ha-manager: 4.0.2
pve-i18n: 3.0.5
pve-qemu-kvm: 8.0.2-4
pve-xtermjs: 4.16.0-3
qemu-server: 8.0.6
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.1.12-pve1
```