Hi,
Tonight some of my backups failed with the following error:
Code:
102: 2021-04-22 04:28:21 ERROR: Backup of VM 102 failed - unable to open file '/etc/pve/nodes/proxmox/qemu-server/102.conf.tmp.3183' - Permission denied
So it seems to be a problem with the lock files.
However, my guess is that this is only a symptom.
After restarting node1 of the cluster, all VMs on that node run fine and none are locked anymore.
However, node2 now seems to be stalled.
In the web UI I get:
Code:
VM6655:1 GET https://192.168.42.23:8006/api2/json/nodes/proxmix/time 401 (permission denied - invalid PVE ticket)
When I log into node2 via SSH, any command related to VMs hangs, for example qm list or similar.
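To check which commands hang without blocking the SSH session, something like the following could be used. This is only a minimal sketch: `probe_cmd` is a hypothetical helper name, it assumes `timeout` from GNU coreutils is available, and if the process is stuck in uninterruptible sleep even `timeout` may not return.

```shell
# probe_cmd is a hypothetical helper, not part of Proxmox VE.
# Usage: probe_cmd <seconds> <command> [args...]
probe_cmd() {
    probe_limit="$1"
    shift
    # timeout(1) exits with status 124 when the time limit is reached
    timeout "$probe_limit" "$@" >/dev/null 2>&1
    probe_rc=$?
    if [ "$probe_rc" -eq 124 ]; then
        echo "HUNG: $*"
    elif [ "$probe_rc" -eq 0 ]; then
        echo "OK: $*"
    else
        echo "FAILED($probe_rc): $*"
    fi
}
```

For example, `probe_cmd 10 qm list` would report HUNG if qm list does not finish within 10 seconds.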
I currently assume it has something to do with corosync.
However, that looks fine at first sight:
Code:
root@proxmix:~# pvecm status
Cluster information
-------------------
Name: Virtualizers
Config Version: 2
Transport: knet
Secure auth: on
Quorum information
------------------
Date: Thu Apr 22 10:12:35 2021
Quorum provider: corosync_votequorum
Nodes: 2
Node ID: 0x00000002
Ring ID: 1.a85
Quorate: Yes
Votequorum information
----------------------
Expected votes: 2
Highest expected: 2
Total votes: 2
Quorum: 2
Flags: Quorate
Membership information
----------------------
Nodeid Votes Name
0x00000001 1 192.168.42.24
0x00000002 1 192.168.42.23 (local)
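The quorum state in the output above can also be checked from a script. A minimal sketch, assuming the output keeps the "Quorate: Yes" line format shown here; `is_quorate` is a hypothetical helper name that reads the status text on stdin:

```shell
# is_quorate is a hypothetical helper, not part of Proxmox VE.
# It succeeds only if the input contains a line "Quorate: Yes".
is_quorate() {
    # Match on the first field so the "Flags: Quorate" line is ignored
    awk '$1 == "Quorate:" { found = ($2 == "Yes") } END { exit found ? 0 : 1 }'
}
```

It could be used as `pvecm status | is_quorate && echo "cluster is quorate"`. Note that a quorate cluster only tells you corosync membership is fine, not that the other services on the node are responsive.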
It looks like the replication runner is not starting (I'm not using any replication):
Code:
● pvesr.service - Proxmox VE replication runner
Loaded: loaded (/lib/systemd/system/pvesr.service; static; vendor preset: enabled)
Active: activating (start) since Thu 2021-04-22 09:56:00 CEST; 19min ago
Main PID: 1618 (pvesr)
Tasks: 1 (limit: 4915)
Memory: 72.8M
CGroup: /system.slice/pvesr.service
└─1618 /usr/bin/perl -T /usr/bin/pvesr run --mail 1
Apr 22 09:56:00 proxmix systemd[1]: Starting Proxmox VE replication runner...
As a result, no VM currently starts on my second node, and I can't manage it via the web UI of node1. In addition, I can't log into its own web UI:
Connection failure. Network error or Proxmox VE services not running?
I also checked NTP: the time seems to be the same on both machines, and I can't see any NTP-related errors.
Any ideas?
Code:
root@proxmix:~# pveversion -v
proxmox-ve: 6.3-1 (running kernel: 5.11.7-1-pve)
pve-manager: 6.3-6 (running version: 6.3-6/2184247e)
pve-kernel-5.11: 7.0-0+3~bpo10
pve-kernel-5.4: 6.3-8
pve-kernel-helper: 6.3-8
pve-kernel-5.11.7-1-pve: 5.11.7-1~bpo10
pve-kernel-5.4.106-1-pve: 5.4.106-1
pve-kernel-5.4.73-1-pve: 5.4.73-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.1.0-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.20-pve1
libproxmox-acme-perl: 1.0.8
libproxmox-backup-qemu0: 1.0.3-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.3-5
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.1-1
libpve-storage-perl: 6.3-7
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.12-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-9
pve-cluster: 6.2-1
pve-container: 3.3-4
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.2-2
pve-ha-manager: 3.1-1
pve-i18n: 2.3-1
pve-qemu-kvm: 5.2.0-5
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-10
smartmontools: 7.2-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.4-pve1