TASK ERROR: failed to get ip for node 'pve01' in network '10.100.100.231/24'

jsterr
Hello Proxmox,

when I start a bulk migration from pve02 to pve01, 7 VMs migrate without any problem, but one fails with this error:
Code:
task started by HA resource agent
TASK ERROR: failed to get ip for node 'pve01' in network '10.100.100.231/24'

10.100.100.231 is the Ceph IP of the pve01 node. I started the bulk migration with multiple parallel jobs. After the failure I tried again, migrating only the failed VM, and it was successful:
Code:
task started by HA resource agent
2021-11-22 15:26:51 use dedicated network address for sending migration traffic (10.100.100.231)
2021-11-22 15:26:51 starting migration of VM 108 to node 'pve01' (10.100.100.231)
2021-11-22 15:26:51 starting VM 108 on remote node 'pve01'
2021-11-22 15:26:52 start remote tunnel
2021-11-22 15:26:53 ssh tunnel ver 1
2021-11-22 15:26:53 starting online/live migration on unix:/run/qemu-server/108.migrate
2021-11-22 15:26:53 set migration capabilities
2021-11-22 15:26:53 migration downtime limit: 100 ms
2021-11-22 15:26:53 migration cachesize: 512.0 MiB
2021-11-22 15:26:53 set migration parameters
2021-11-22 15:26:53 start migrate command to unix:/run/qemu-server/108.migrate
2021-11-22 15:26:54 migration active, transferred 544.9 MiB of 3.9 GiB VM-state, 560.8 MiB/s
2021-11-22 15:26:55 migration active, transferred 928.4 MiB of 3.9 GiB VM-state, 456.8 MiB/s
2021-11-22 15:26:56 migration active, transferred 1.3 GiB of 3.9 GiB VM-state, 411.4 MiB/s
2021-11-22 15:26:57 migration active, transferred 1.7 GiB of 3.9 GiB VM-state, 368.8 MiB/s
2021-11-22 15:26:58 migration active, transferred 2.0 GiB of 3.9 GiB VM-state, 350.0 MiB/s
2021-11-22 15:26:59 migration active, transferred 2.6 GiB of 3.9 GiB VM-state, 339.4 MiB/s
2021-11-22 15:27:00 average migration speed: 573.8 MiB/s - downtime 50 ms
2021-11-22 15:27:00 migration status: completed
2021-11-22 15:27:02 migration finished successfully (duration 00:00:11)
TASK OK

Is this a bug, or possibly a problem with a saturated link? If it is caused by link saturation, can this be avoided somehow?
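If it does turn out to be link saturation from running several jobs in parallel, one thing that can be tuned is a dedicated migration network together with a migration bandwidth limit in /etc/pve/datacenter.cfg. This is only a sketch: the subnet matches the Ceph network from this thread, but the bandwidth cap of 512000 KiB/s (roughly 500 MiB/s) is just an example value, not something confirmed to fix this error.
Code:
migration: secure,network=10.100.100.0/24
bwlimit: migration=512000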
 
I had similar problems when I tried to migrate, for example from node 1 to node 2. First the error was: ERROR: online migrate failure - unable to detect remote migration address. Then it became: TASK ERROR: failed to get ip for node 'pve02' in network '10.0.20.xxx/24'

In the first case it was still possible to migrate while the VM was shut down; in the second case migration didn't work at all.

The second error appeared after I checked datacenter.cfg, saw that there were no Migration Settings, and then selected one of the available options in the GUI. So that didn't solve it either.

The strange thing is that it always worked before, and I haven't changed anything network-related. The only thing I can recall is that Ceph was upgraded from version 16 to 17. But the problem is not limited to Ceph; VMs on ZFS don't migrate either.
 
Hi,
please post the output of pveversion -v on both nodes, the config of the VM (qm config <ID>), and the output of
Code:
ssh -e none -o BatchMode=yes -o HostKeyAlias=<target node name> root@<target node IP from corosync.conf> pvecm mtunnel -migration_network 10.0.20.0/24 -get_migration_ip
where you replace the pieces inside <here> with the appropriate information.
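For example, with hypothetical values filled in (target node pve01 with corosync IP 10.0.10.1, run from another node), the command might look like this:
Code:
ssh -e none -o BatchMode=yes -o HostKeyAlias=pve01 root@10.0.10.1 pvecm mtunnel -migration_network 10.0.20.0/24 -get_migration_ip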
 
I have executed the command you gave me, and it returned the correct IP address of the node every time. I have 3 nodes, so I ran it in 6 different combinations, 2 on each node, to get the IP addresses. I didn't see anything strange there.

This is the pveversion -v output:
Code:
root@pve01:~# pveversion -v
proxmox-ve: 7.3-1 (running kernel: 5.15.74-1-pve)
pve-manager: 7.3-3 (running version: 7.3-3/c3928077)
pve-kernel-5.15: 7.2-14
pve-kernel-helper: 7.2-14
pve-kernel-5.13: 7.1-9
pve-kernel-5.11: 7.0-10
pve-kernel-5.15.74-1-pve: 5.15.74-1
pve-kernel-5.15.64-1-pve: 5.15.64-1
pve-kernel-5.13.19-6-pve: 5.13.19-15
pve-kernel-5.13.19-2-pve: 5.13.19-4
pve-kernel-5.11.22-7-pve: 5.11.22-12
pve-kernel-5.11.22-4-pve: 5.11.22-9
ceph: 17.2.5-pve1
ceph-fuse: 17.2.5-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-5
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.3-1
libpve-guest-common-perl: 4.2-3
libpve-http-server-perl: 4.1-5
libpve-storage-perl: 7.3-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.3.1-1
proxmox-backup-file-restore: 2.3.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.0-1
proxmox-widget-toolkit: 3.5.3
pve-cluster: 7.3-1
pve-container: 4.4-2
pve-docs: 7.3-1
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-7
pve-firmware: 3.5-6
pve-ha-manager: 3.5.1
pve-i18n: 2.8-1
pve-qemu-kvm: 7.1.0-4
pve-xtermjs: 4.16.0-1
qemu-server: 7.3-1
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+2
vncterm: 1.7-1
zfsutils-linux: 2.1.6-pve1

And this is the output of qm config <ID>:
Code:
root@pve01:~# qm config 105
agent: 1
boot: order=scsi0;ide2;net0
cores: 8
cpu: host
ide2: none,media=cdrom
memory: 8192
meta: creation-qemu=7.1.0,ctime=1673887847
name: Asterisk19
net0: virtio=9A:4C:5C:A7:66:D9,bridge=vmbr1,firewall=1,tag=100
numa: 0
ostype: l26
scsi0: ceph-nvme:vm-105-disk-0,cache=writeback,iothread=1,size=64G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=40b6268e-c8a9-4507-8412-a66a60557ae7
sockets: 1
vmgenid: 9234f235-f0cd-47b0-8452-9362769a1816

root@pve01:~# qm config 100
agent: 1
boot: order=scsi0;net0
cores: 4
memory: 4096
meta: creation-qemu=6.1.0,ctime=1644385197
name: debian11
net0: virtio=1A:60:1D:58:3B:36,bridge=vmbr1,firewall=1,tag=21
numa: 0
ostype: l26
scsi0: ceph-nvme:vm-100-disk-0,size=20G
scsihw: virtio-scsi-pci
smbios1: uuid=7e6daef5-6082-4b3a-b9eb-a4653c404afb
sockets: 1
vmgenid: ce548b52-a562-4d94-8464-eb3d47bbb6f6
 
I don't have any special SSH banners, but I do have Neofetch installed. When I execute that command with get_ssh_info, the Neofetch banner is displayed first and then I get the correct IP address, but only the IP address of <target node IP from corosync.conf>.
 
Can you try without neofetch? Any additional output likely messes up the parsing.
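A quick way to check for this (a sketch; replace the placeholder with the corosync IP of the target node) is to run a non-interactive SSH command similar to what the migration does. It should print nothing at all:
Code:
# any output here (neofetch, banners, messages from shell rc files) ends up
# mixed into the output that Proxmox tries to parse
ssh -e none -o BatchMode=yes root@<target node IP from corosync.conf> /bin/true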
 
Hi Fiona, many thanks for your input. Indeed the SSH banner, or in this case Neofetch, seems to have been causing the problem. I have uninstalled it, and the first impression is that live migration is now working on every node again.
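For anyone who wants to keep Neofetch for interactive logins, an alternative to uninstalling (a hypothetical ~/.bashrc snippet, assuming Neofetch is started from root's .bashrc on the nodes) is to run it only in interactive shells, so non-interactive SSH commands stay silent:
Code:
# run neofetch only for interactive shells; "$-" contains "i" when interactive,
# so batch ssh commands (like the ones Proxmox issues) produce no extra output
case $- in
    *i*) command -v neofetch >/dev/null 2>&1 && neofetch ;;
esac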
 
Thanks Guys,

The issue was Neofetch. I uninstalled it and the migrations are working flawlessly.

Great Job,
Michael
 
