When a machine fails, recovery requires several tricks at least 50% of the time.
I am using Proxmox VE 5.4-13. A machine failed and was isolated, etc. Some machines migrated properly, but three are in a "difficult" state.
Two instances migrated to a machine that is not even in the HA cluster defined for them. One uses a raw image (not synced to that machine) and the other Ceph (not available on that machine).
One instance is on the right machine, but does not start (systemctl status pve-container@130.service):
Code:
May 19 04:42:11 p3 lxc-start[24666]: lxc-start: 130: lxccontainer.c: wait_on_daemonized_start: 856 No such file or directory - Failed to receive the container state
May 19 04:42:11 p3 lxc-start[24666]: lxc-start: 130: tools/lxc_start.c: main: 330 The container failed to start
May 19 04:42:11 p3 lxc-start[24666]: lxc-start: 130: tools/lxc_start.c: main: 333 To get more details, run the container in foreground mode
May 19 04:42:11 p3 lxc-start[24666]: lxc-start: 130: tools/lxc_start.c: main: 336 Additional information can be obtained by setting the --logfile and --logpriority options
May 19 04:42:11 p3 systemd[1]: pve-container@130.service: Control process exited, code=exited status=1
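The log itself points at the next diagnostic step: run the container in the foreground with a log file. A minimal sketch, assuming container ID 130 as above (paths and flags per standard lxc-start usage, not taken from the post):

```shell
# Run container 130 in the foreground (-F) with debug-level logging,
# writing the log to a file we can inspect afterwards.
lxc-start -n 130 -F -l DEBUG -o /tmp/lxc-130.log

# Then read the log for the actual failure reason (missing rootfs,
# unavailable storage, etc.):
less /tmp/lxc-130.log
```

On a container whose disk is on storage that is absent from the node, the debug log typically shows the mount of the rootfs failing, which confirms the disk-placement problem rather than a container-level fault.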
What I would like to do is move these machines back to the machine that rebooted, without syncing/moving the disk image (the image is not available on one machine, and possibly out of sync for machine no. 130 above).
1. Any idea how to do that?
[EDIT: I managed to move the Ceph-based machine without a hassle, and I moved the raw-image-based machine using my procedure below. I fixed the ZFS machine by starting it on the server it was located on, after disabling HA. An answer to this question would still be useful for a future failure.]
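Disabling HA for a single resource, so it can be started by hand on the node where its disk actually lives, can be done with ha-manager. A sketch, assuming the container ID 130 from above (substitute vm:ID for a VM):

```shell
# Take the container out of HA management so the CRM stops trying
# to place/start it:
ha-manager set ct:130 --state disabled

# Start it manually on the node that has the disk:
pct start 130

# Once things are back to normal, hand it back to HA:
ha-manager set ct:130 --state started
```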
2. Any idea why the machines moved to a server not in the HA cluster configuration (or how to find out why)?
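One way to investigate question 2 is to inspect the HA group configuration and the HA manager logs. Note that in Proxmox VE, an HA group that is not marked "restricted" only expresses a preference: if no group member is available, resources may be recovered onto nodes outside the group. A sketch for checking this (file paths are the standard pmxcfs locations):

```shell
# HA group definitions: look for a "restricted" flag on your group.
cat /etc/pve/ha/groups.cfg

# Which resources are assigned to which group:
cat /etc/pve/ha/resources.cfg

# Current view of the HA manager (master, service states, placement):
ha-manager status

# Recovery decisions are logged by the CRM/LRM services:
journalctl -u pve-ha-crm -u pve-ha-lrm --since yesterday
</imports>
```

If the group lacks `restricted`, that alone would explain machines landing on nodes outside the group during a failover.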
Here is a procedure I put together in the past for a machine with a raw image in a similar situation. I cannot apply it this time, because I need to fix ZFS-based machines, not a raw-image machine whose disk is simply out of sync:
Procedure to move a raw image based VM back to the original server without "syncing" the image:
VM 107 was found on p1, where a copy of its raw image was at /mnt/bigdisk/vz/images/107/vm-107-disk-0.raw. This image was smaller than the image at the same location on p5.
Migration to p1 was never configured; it is unexplained how VM 107 ended up assigned to p1 (this happened again just now).
The goal was to run VM 107 on p5 with the image that was already on p5. The problem was twofold: migration was not possible because the existing image on p5 blocked it, yet that was precisely the image I wanted to keep.
Solution:
- Rename /mnt/bigdisk/vz/images/107/vm-107-disk-0.raw on both machines (to /mnt/bigdisk/vz/images/107/vm-107-disk-0.raw.org).
- touch /mnt/bigdisk/vz/images/107/vm-107-disk-0.raw on p1, to leave a size-0 image that will not start.
- Through the Proxmox interface, request a migration to p5, which succeeds because the image on p5 is no longer blocking.
- With a size-0 image, the migration is almost immediate.
- Since the machine has not started, on p5, mv /mnt/bigdisk/vz/images/107/vm-107-disk-0.raw.org back to /mnt/bigdisk/vz/images/107/vm-107-disk-0.raw.
- Then start VM 107 on p5 successfully.
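The steps above can be sketched as shell commands. This assumes VM 107 is a qemu VM (hence qm; a container would use pct), and uses the CLI `qm migrate` in place of the GUI migration the post describes:

```shell
IMG=/mnt/bigdisk/vz/images/107/vm-107-disk-0.raw

# On BOTH p1 and p5: set the real image aside.
mv "$IMG" "$IMG.org"

# On p1 (the wrong node): leave a 0-byte placeholder, so migration
# has an image to move but the VM cannot accidentally boot from it.
touch "$IMG"

# On p1: migrate to p5; with a 0-byte image this is near-instant.
qm migrate 107 p5

# On p5: restore the desired image over the placeholder, then start.
mv "$IMG.org" "$IMG"
qm start 107
```

The key design point is that the migration only moves metadata plus an empty file, so the large (and in this case wrong) source image is never copied over the good one on p5.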