TASK ERROR: failed to get ip for node 'pve01' in network '10.100.100.231/24'

jsterr
Hello Proxmox,

when I start a bulk migration from pve02 to pve01, 7 VMs migrate without any problem, but one fails with this error:
Code:
task started by HA resource agent
TASK ERROR: failed to get ip for node 'pve01' in network '10.100.100.231/24'

10.100.100.231 is the Ceph IP of the pve01 node. I started the bulk migration with multiple parallel jobs. After the failure I tried again, migrating only the failed VM, and it was successful:
Code:
task started by HA resource agent
2021-11-22 15:26:51 use dedicated network address for sending migration traffic (10.100.100.231)
2021-11-22 15:26:51 starting migration of VM 108 to node 'pve01' (10.100.100.231)
2021-11-22 15:26:51 starting VM 108 on remote node 'pve01'
2021-11-22 15:26:52 start remote tunnel
2021-11-22 15:26:53 ssh tunnel ver 1
2021-11-22 15:26:53 starting online/live migration on unix:/run/qemu-server/108.migrate
2021-11-22 15:26:53 set migration capabilities
2021-11-22 15:26:53 migration downtime limit: 100 ms
2021-11-22 15:26:53 migration cachesize: 512.0 MiB
2021-11-22 15:26:53 set migration parameters
2021-11-22 15:26:53 start migrate command to unix:/run/qemu-server/108.migrate
2021-11-22 15:26:54 migration active, transferred 544.9 MiB of 3.9 GiB VM-state, 560.8 MiB/s
2021-11-22 15:26:55 migration active, transferred 928.4 MiB of 3.9 GiB VM-state, 456.8 MiB/s
2021-11-22 15:26:56 migration active, transferred 1.3 GiB of 3.9 GiB VM-state, 411.4 MiB/s
2021-11-22 15:26:57 migration active, transferred 1.7 GiB of 3.9 GiB VM-state, 368.8 MiB/s
2021-11-22 15:26:58 migration active, transferred 2.0 GiB of 3.9 GiB VM-state, 350.0 MiB/s
2021-11-22 15:26:59 migration active, transferred 2.6 GiB of 3.9 GiB VM-state, 339.4 MiB/s
2021-11-22 15:27:00 average migration speed: 573.8 MiB/s - downtime 50 ms
2021-11-22 15:27:00 migration status: completed
2021-11-22 15:27:02 migration finished successfully (duration 00:00:11)
TASK OK

Is this a bug, or possibly a problem with a saturated link? If it is caused by link saturation, can this be avoided somehow?
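If it does turn out to be link saturation from running several jobs in parallel, one thing that can be tuned is a dedicated migration network together with a migration bandwidth limit in /etc/pve/datacenter.cfg. This is only a sketch: the subnet matches the Ceph network from this thread, but the bandwidth cap of 512000 KiB/s (roughly 500 MiB/s) is just an example value, not something confirmed to fix this error.
Code:
migration: secure,network=10.100.100.0/24
bwlimit: migration=512000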
 
I had similar problems when I tried to migrate, for example from node 1 to node 2. First the error was: ERROR: online migrate failure - unable to detect remote migration address. Then it became: TASK ERROR: failed to get ip for node 'pve02' in network '10.0.20.xxx/24'

In the first case it was still possible to migrate while the VM was shut down; in the second case migration didn't work at all.

The second error appeared after I checked datacenter.cfg, saw that there were no Migration Settings, and then selected one of the available options in the GUI. So that didn't solve it either.

The strange thing is that it always worked before, and I haven't changed anything network-related. The only thing I can recall is that Ceph was upgraded from version 16 to 17. But the problem is not limited to Ceph; VMs on ZFS don't migrate either.
 
Hi,
please post the output of pveversion -v on both nodes, the config of the VM (qm config <ID>), and the output of
Code:
ssh -e none -o BatchMode=yes -o HostKeyAlias=<target node name> root@<target node IP from corosync.conf> pvecm mtunnel -migration_network 10.0.20.0/24 -get_migration_ip
where you replace the pieces inside <here> with the appropriate information.
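For example, with hypothetical values filled in (target node pve01 with corosync IP 10.0.10.1, run from another node), the command might look like this:
Code:
ssh -e none -o BatchMode=yes -o HostKeyAlias=pve01 root@10.0.10.1 pvecm mtunnel -migration_network 10.0.20.0/24 -get_migration_ip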
 
I have executed the command you gave me, and it returned the correct IP address of the node every time. I have 3 nodes, so I ran it in 6 different combinations, 2 on each node, to get the IP addresses. I didn't see anything strange there.

This is the pveversion -v output:
Code:
root@pve01:~# pveversion -v
proxmox-ve: 7.3-1 (running kernel: 5.15.74-1-pve)
pve-manager: 7.3-3 (running version: 7.3-3/c3928077)
pve-kernel-5.15: 7.2-14
pve-kernel-helper: 7.2-14
pve-kernel-5.13: 7.1-9
pve-kernel-5.11: 7.0-10
pve-kernel-5.15.74-1-pve: 5.15.74-1
pve-kernel-5.15.64-1-pve: 5.15.64-1
pve-kernel-5.13.19-6-pve: 5.13.19-15
pve-kernel-5.13.19-2-pve: 5.13.19-4
pve-kernel-5.11.22-7-pve: 5.11.22-12
pve-kernel-5.11.22-4-pve: 5.11.22-9
ceph: 17.2.5-pve1
ceph-fuse: 17.2.5-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-5
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.3-1
libpve-guest-common-perl: 4.2-3
libpve-http-server-perl: 4.1-5
libpve-storage-perl: 7.3-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.3.1-1
proxmox-backup-file-restore: 2.3.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.0-1
proxmox-widget-toolkit: 3.5.3
pve-cluster: 7.3-1
pve-container: 4.4-2
pve-docs: 7.3-1
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-7
pve-firmware: 3.5-6
pve-ha-manager: 3.5.1
pve-i18n: 2.8-1
pve-qemu-kvm: 7.1.0-4
pve-xtermjs: 4.16.0-1
qemu-server: 7.3-1
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+2
vncterm: 1.7-1
zfsutils-linux: 2.1.6-pve1

And this is the output of qm config <ID>:
Code:
root@pve01:~# qm config 105
agent: 1
boot: order=scsi0;ide2;net0
cores: 8
cpu: host
ide2: none,media=cdrom
memory: 8192
meta: creation-qemu=7.1.0,ctime=1673887847
name: Asterisk19
net0: virtio=9A:4C:5C:A7:66:D9,bridge=vmbr1,firewall=1,tag=100
numa: 0
ostype: l26
scsi0: ceph-nvme:vm-105-disk-0,cache=writeback,iothread=1,size=64G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=40b6268e-c8a9-4507-8412-a66a60557ae7
sockets: 1
vmgenid: 9234f235-f0cd-47b0-8452-9362769a1816

root@pve01:~# qm config 100
agent: 1
boot: order=scsi0;net0
cores: 4
memory: 4096
meta: creation-qemu=6.1.0,ctime=1644385197
name: debian11
net0: virtio=1A:60:1D:58:3B:36,bridge=vmbr1,firewall=1,tag=21
numa: 0
ostype: l26
scsi0: ceph-nvme:vm-100-disk-0,size=20G
scsihw: virtio-scsi-pci
smbios1: uuid=7e6daef5-6082-4b3a-b9eb-a4653c404afb
sockets: 1
vmgenid: ce548b52-a562-4d94-8464-eb3d47bbb6f6
 
I don't have any special SSH banners, but I do have Neofetch installed. When I execute that command with get_ssh_info, the Neofetch banner is displayed first and then I get the correct IP address, but only the IP address of <target node IP from corosync.conf>.
 
Can you try without neofetch? Any additional output likely messes up the parsing.
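A quick way to check for this (a sketch; replace the placeholder with the corosync IP of the target node) is to run a non-interactive SSH command similar to what the migration does. It should print nothing at all:
Code:
# any output here (neofetch, banners, messages from shell rc files) ends up
# mixed into the output that Proxmox tries to parse
ssh -e none -o BatchMode=yes root@<target node IP from corosync.conf> /bin/true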
 
Hi Fiona, many thanks for your input. Indeed the SSH banner, or in this case Neofetch, seems to have been causing the problem. I have uninstalled it, and the first impression is that live migration is now working on every node again.
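For anyone who wants to keep Neofetch for interactive logins, an alternative to uninstalling (a hypothetical ~/.bashrc snippet, assuming Neofetch is started from root's .bashrc on the nodes) is to run it only in interactive shells, so non-interactive SSH commands stay silent:
Code:
# run neofetch only for interactive shells; "$-" contains "i" when interactive,
# so batch ssh commands (like the ones Proxmox issues) produce no extra output
case $- in
    *i*) command -v neofetch >/dev/null 2>&1 && neofetch ;;
esac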
 
Thanks Guys,

The issue was Neofetch. I uninstalled it and the migrations are working flawlessly.

Great Job,
Michael
 
