Hello,
I have a 3-node cluster running v7.1-1 (running kernel: 5.13.19-3-pve). Server 03 has a running backup task that:
- Can't be stopped from the webUI (the button is greyed out).
- Can't be killed, not even with "-9".
- Shows nothing when I try to view its log; after a few seconds an "invalid ticket" error appears and I'm asked to log in again.
The sources are all the VMs and CTs on this server (some on a Ceph pool, a few on local disks). The destination is a PBS server that is working correctly (at least there's network connectivity from all 3 nodes, and the other two nodes created their backups correctly tonight).
Also:
- I can't access /etc/pve/nodes/server03 from any node.
- I can access /etc/pve/* on any node.
- Any qm command run on server 03 just hangs. Not even closing the shell that launched it terminates the process, and I have to use kill -9 from another terminal to stop it. That means I can't migrate VMs from 03 to the other two nodes.
- VMs on every server are running correctly.
- In the webUI, only the node I connect to shows a green tick; the others have a greyed-out question mark. That is, 01 sees just 01 with a green tick and the other 2 with a question mark. Node 02 sees only 02 as OK, and the same goes for 03: it sees only itself as OK.
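One thing that can be checked from a shell on server 03 is whether the stuck backup task and the hanging qm commands are in uninterruptible sleep (state D in /proc/<pid>/stat). That is a common reason for a process to ignore kill -9, but it is only an assumption here, not something the symptoms above confirm. A minimal Python sketch that lists such processes; the function names are made up for illustration:

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    start = text.index("(")
    end = text.rindex(")")
    pid = int(text[:start].strip())
    comm = text[start + 1:end]
    state = text[end + 1:].split()[0]
    return pid, comm, state


def uninterruptible_processes(proc_root="/proc"):
    """Return a sorted list of (pid, comm) for processes in state D."""
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return []  # no procfs available at this path
    found = []
    for entry in entries:
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or the file is unreadable
        if state == "D":
            found.append((pid, comm))
    return sorted(found)


if __name__ == "__main__":
    for pid, comm in uninterruptible_processes():
        print(pid, comm)
```

If the backup task or the qm processes show up in this list, they are blocked inside the kernel waiting on I/O, and signals will not be delivered until that I/O completes or fails.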
I can't find any specific error in the logs that might give me a clue about what happened.
Is there any way to stop the backup task and recover the node without restarting? There are a few vital VMs there and I would like to preserve their uptime as much as possible.
Thanks!