Error message 595 when migrating manually

Martino Rabe

Renowned Member
Jul 24, 2016
Hello,

I installed Proxmox 8.2.7 on two systems and added both to a cluster. I configured replication, which seems to be working fine, and added all LXC containers to it. The problem is that I am not able to use or manually migrate the LXC containers when one node goes down. I only get the error message "No route to host (595)" when I click on an LXC container, or "Connection error 595: No route to host" when I try to migrate a container to the working node.
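In case it helps, the setup itself was the standard procedure, roughly equivalent to the following commands (the join IP and the replication schedule below are just placeholders, not my exact values):

Code:
# on the first node
pvecm create proxmox-cluster

# on the second node, joining the existing cluster (placeholder IP)
pvecm add <ip-of-first-node>

# replication job for CT 101 towards redmountain (placeholder schedule)
pvesr create-local-job 101-0 redmountain --schedule '*/15'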

The output of pvecm status (when one node is down) is:

Code:
root@redmountain:~# pvecm status
Cluster information
-------------------
Name:             proxmox-cluster
Config Version:   2
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Wed Oct 30 17:42:04 2024
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          0x00000002
Ring ID:          2.2c9
Quorate:          No

Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      1
Quorum:           2 Activity blocked
Flags:        

Membership information
----------------------
    Nodeid      Votes Name
0x00000002          1 192.168.5.54 (local)
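
If I read this correctly, the remaining node only has 1 of the 2 expected votes, so it is not quorate and activity is blocked. The same quorum state can also be checked directly with corosync, in case that output is useful:

Code:
corosync-quorumtool -s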

I even configured HA (even though I know that 3 nodes are necessary for automatic failover to work), but I still get the same error when I click on migrate. When both nodes are up, I am able to migrate the LXC container to the other node without any problem.
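For completeness, the HA configuration corresponds roughly to the following (CT 101 as an HA resource; the exact options may differ from what I actually set):

Code:
# CT 101 as an HA resource
ha-manager add ct:101 --state started
# check HA state
ha-manager status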

Maybe I completely misunderstand the concept, or I made some configuration error that I am not able to find?
Does anyone have a hint for me? I didn't find any other thread about the 595 error that describes the same environment.

If you need any logs, please let me know; I am happy to provide the information.
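For example, I could post the output of the following commands if that helps:

Code:
journalctl -b -u corosync -u pve-cluster
pvesr status
cat /etc/pve/corosync.conf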
 

Attachments

  • Bildschirmfoto vom 2024-10-30 17-05-19.png
As additional information, this is the log when both nodes are up and I manually migrate one LXC container:

Code:
2024-10-31 18:53:53 shutdown CT 101
2024-10-31 18:54:05 starting migration of CT 101 to node 'redmountain' (192.168.5.54)
2024-10-31 18:54:05 found local volume 'zfs-disk:subvol-101-disk-0' (in current VM config)
2024-10-31 18:54:05 start replication job
2024-10-31 18:54:05 guest => CT 101, running => 0
2024-10-31 18:54:05 volumes => zfs-disk:subvol-101-disk-0
2024-10-31 18:54:06 create snapshot '__replicate_101-0_1730397245__' on zfs-disk:subvol-101-disk-0
2024-10-31 18:54:06 using secure transmission, rate limit: none
2024-10-31 18:54:06 incremental sync 'zfs-disk:subvol-101-disk-0' (__replicate_101-0_1730394003__ => __replicate_101-0_1730397245__)
2024-10-31 18:54:07 send from @__replicate_101-0_1730394003__ to zfs-disk/subvol-101-disk-0@__replicate_101-0_1730397245__ estimated size is 12.7M
2024-10-31 18:54:07 total estimated size is 12.7M
2024-10-31 18:54:07 TIME        SENT   SNAPSHOT zfs-disk/subvol-101-disk-0@__replicate_101-0_1730397245__
2024-10-31 18:54:08 18:54:08   7.76M   zfs-disk/subvol-101-disk-0@__replicate_101-0_1730397245__
2024-10-31 18:54:09 successfully imported 'zfs-disk:subvol-101-disk-0'
2024-10-31 18:54:09 delete previous replication snapshot '__replicate_101-0_1730394003__' on zfs-disk:subvol-101-disk-0
2024-10-31 18:54:10 (remote_finalize_local_job) delete stale replication snapshot '__replicate_101-0_1730394003__' on zfs-disk:subvol-101-disk-0
2024-10-31 18:54:10 end replication job
2024-10-31 18:54:10 # /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=redmountain' -o 'UserKnownHostsFile=/etc/pve/nodes/redmountain/ssh_known_hosts' -o 'GlobalKnownHostsFile=none' root@192.168.5.54 pvesr set-state 101 \''{"local/robustoak":{"fail_count":0,"last_node":"robustoak","last_iteration":1730397245,"last_try":1730397245,"storeid_list":["zfs-disk"],"duration":4.63319,"last_sync":1730397245}}'\'
2024-10-31 18:54:11 start final cleanup
2024-10-31 18:54:12 start container on target node
2024-10-31 18:54:12 # /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=redmountain' -o 'UserKnownHostsFile=/etc/pve/nodes/redmountain/ssh_known_hosts' -o 'GlobalKnownHostsFile=none' root@192.168.5.54 pct start 101
2024-10-31 18:54:15 migration finished successfully (duration 00:00:23)
TASK OK
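
The migration above was started from the GUI; as far as I understand, it should correspond to something like this restart migration on the CLI:

Code:
pct migrate 101 redmountain --restart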

So migration is only a problem when one node is down.