Set up Ceph, now VM migration is broken

devb0x

New Member
Dec 29, 2024
Hi all.

I have a 4-node Proxmox cluster. I got 4 x 3.2 TB NVMe drives from eBay and installed Ceph on the cluster.
Ceph is running without any issues or failures. I created a pool, which was assigned to all 4 nodes without a problem.
However, when I try to configure HA for some of my VMs and test migration, I get the following error, and I have no clue how to fix it or how to diagnose the cause:
Code:
could not get migration ip: multiple, different, IP address configured for network '10.20.66.8/23'

TASK ERROR: command '/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=Thor' -o 'UserKnownHostsFile=/etc/pve/nodes/Thor/ssh_known_hosts' -o 'GlobalKnownHostsFile=none' root@10.20.66.7 pvecm mtunnel -migration_network 10.20.66.8/23 -get_migration_ip' failed: exit code 255

Has anyone come across this error before or know how to fix it? I don't see anything out of the ordinary that could be wrong with ceph or the cluster. Provided additional screenshots below:

Error
proxmox-migration-error.png

Ceph state
ceph-no-errors.png
ceph-state2.png
ceph-osds.png

VM104 - Hardware
vm104-hardware.png

Datacenter Options
datacenter-options.png
 
Can you show the network configuration of each node? Alternatively, take a close look at each node's network interfaces and make sure they are configured properly, specifically the interfaces used for migration with IPs in 10.20.66.0/23.
 
Sure, attached are screenshots of the network sections from all nodes in the cluster. They are all set up the same way: each has vmbr0 as the main Proxmox interface, vmbr1 as the interface for VMs/LXCs, and vmbr2 for HA-specific traffic, mainly used by my HA Postgres DB that is scaled across the cluster. I also attached a screenshot of the datacenter.cfg. I asked ChatGPT about the issue, and it suggested setting the migration section of datacenter.cfg to 10.20.66.0/23, but that seemed like it might break my cluster, so I have not done it. It is also not shown as an option in the UI, so I figured it probably would not fix the issue I am having:

Baldar Network config:
baldar-network-config.png

Heimdall Network config:
Heimdall-Network-config.png

Odin Network config:
odin-network-config.png

Thor Network config:
thor-network-config.png

Datacenter.cfg :
datacenter.cfg.png
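
On the question of whether 10.20.66.0/23 differs from the 10.20.66.8/23 shown in the error message: a minimal Python sketch using the standard `ipaddress` module suggests both values describe the same subnet, with 10.20.66.8 being one host address inside it. This only illustrates the CIDR arithmetic; it does not show how Proxmox itself parses the setting, and the variable names are made up for the example.

```python
import ipaddress

# Value taken from the error message in the first post (host address + prefix).
configured = ipaddress.ip_interface("10.20.66.8/23")

# Value suggested for the migration setting (network address + prefix).
suggested = ipaddress.ip_network("10.20.66.0/23")

# The network that 10.20.66.8/23 belongs to.
print(configured.network)

# Whether both notations refer to the same subnet.
print(configured.network == suggested)

# Whether the node address from the failing ssh command is inside that subnet.
print(ipaddress.ip_address("10.20.66.7") in suggested)
```

A /23 spans 10.20.66.0 through 10.20.67.255, so any node interface with an address in that range counts as being on this network.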
 
Baldar's PVE interface is a /32 when it should be a /23. The error message doesn't mention this host, so I'm not sure how it could be the cause, but it is a problem nonetheless.

You've made the same configuration error on other interfaces. Set them all to /23, then try your migrations again.
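
To illustrate why the prefix length matters, here is a rough Python sketch. The interface addresses in `node_interfaces` are hypothetical (the real ones are only visible in the screenshots), and `addresses_in_network` is a helper written for this example, not a Proxmox function. The error text suggests that a node reported more than one address inside the migration network, which is the situation the last lines reproduce.

```python
import ipaddress

def addresses_in_network(interface_cidrs, network_cidr):
    """Return the interface addresses that fall inside network_cidr."""
    # strict=False accepts a value with host bits set, such as 10.20.66.8/23.
    network = ipaddress.ip_network(network_cidr, strict=False)
    return [
        str(iface.ip)
        for iface in map(ipaddress.ip_interface, interface_cidrs)
        if iface.ip in network
    ]

# With a /32, the interface's own network contains only itself,
# so a peer such as 10.20.66.8 is not considered on-link.
as_32 = ipaddress.ip_interface("10.20.66.7/32")
as_23 = ipaddress.ip_interface("10.20.66.7/23")
peer = ipaddress.ip_address("10.20.66.8")
print(peer in as_32.network)  # False
print(peer in as_23.network)  # True

# Hypothetical node with two bridges that both sit inside 10.20.66.0/23.
node_interfaces = ["10.20.66.7/23", "10.20.67.7/23", "192.168.1.7/24"]
matches = addresses_in_network(node_interfaces, "10.20.66.8/23")
print(matches)  # more than one match means the choice is ambiguous
```

If a check like this returns more than one address on a node, either the migration network needs to be narrower or the extra address needs to move to a different subnet.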

Unrelated to migrations:
  • You should have an odd number of Ceph monitors, so either remove one for a total of three or add one outside of the cluster for a total of five.
  • Please post your Ceph configuration, or at least the public/cluster network configuration for each node.
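
The reasoning behind the monitor-count advice can be sketched in a few lines of Python. Ceph monitors need a strict majority to form a quorum, so a fourth monitor does not let the cluster survive any more failures than three do. The function names below are made up for the illustration.

```python
def quorum_size(monitors):
    """Smallest strict majority of the given number of monitors."""
    return monitors // 2 + 1

def tolerated_failures(monitors):
    """How many monitors can be down while a majority is still up."""
    return monitors - quorum_size(monitors)

for count in (3, 4, 5):
    print(count, "monitors: quorum", quorum_size(count),
          "tolerates", tolerated_failures(count), "failure(s)")
```

Three and four monitors both tolerate a single failure, while five tolerate two, which is why the counts are usually kept at three or five.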
 
OK, I fixed the network issue and got it figured out. Thank you, everyone, for all the help. The issue was the migration settings in the Proxmox UI: I cleared out the migration IP, so it now just says "secure". I also changed Ceph to remove the extra monitor. I initially added 4 but changed it so that only 3 monitors are now listed. The reason I had 4 was that I have 4 nodes and wanted to ensure replication would always work as expected. I will add a new Proxmox node in the future when I get another Dell R740xd; then I'll add two more monitors, for a total of five, to keep the number odd.

proxmox-migration-issue.png
 
