Hi All,
After some successful tests with PBS (it actually ran flawlessly from day one), I have encountered massive problems over the last two weeks, to the point where my complete datacenter more or less went down...
In short: whenever I run backups to PBS, I end up having to reboot almost every node in the Proxmox VE infrastructure...
* pvestatd, pvedaemon and pveproxy hang on all nodes.
* Random VMs hang and consume 100% CPU on one core (worst of all: they are still pingable...)
In short: it is currently unusable, unfortunately.
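For what it's worth, this is roughly how I have been checking the stuck services and guests while it happens (VM ID 101 is just a placeholder for one of the affected guests):

# on a PVE node: check the services that hang
systemctl status pvestatd pvedaemon pveproxy
journalctl -u pvestatd -u pvedaemon -u pveproxy --since "1 hour ago"

# look at one of the affected VMs and see whether the guest agent still answers
qm status 101
qm agent 101 ping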
I have PBS running as a VM (I do not back this one up with PBS, though). The datastore is on an NFS share, and the network connection is 10GbE throughout...
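In case it matters, this is how I verify the datastore and the NFS mount on the PBS VM; the mount point /mnt/pbs-datastore below is just an example name, not the real path:

# on the PBS VM: list the configured datastores and their paths
proxmox-backup-manager datastore list

# confirm that the NFS mount backing the datastore is present and has space
findmnt -t nfs,nfs4
df -h /mnt/pbs-datastore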
PVE (output of pveversion -v):
proxmox-ve: 6.2-1 (running kernel: 5.4.60-1-pve)
pve-manager: 6.2-11 (running version: 6.2-11/22fb4983)
pve-kernel-5.4: 6.2-6
pve-kernel-helper: 6.2-6
pve-kernel-5.3: 6.1-6
pve-kernel-5.4.60-1-pve: 5.4.60-2
pve-kernel-5.4.55-1-pve: 5.4.55-1
pve-kernel-5.4.44-2-pve: 5.4.44-2
pve-kernel-4.15: 5.4-8
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-4.15.18-20-pve: 4.15.18-46
pve-kernel-4.15.18-12-pve: 4.15.18-36
ceph-fuse: 12.2.13-pve1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 3.0.0-1+pve2
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.5
libpve-access-control: 6.1-2
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.2-1
libpve-guest-common-perl: 3.1-3
libpve-http-server-perl: 3.0-6
libpve-storage-perl: 6.2-6
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.2-12
pve-cluster: 6.1-8
pve-container: 3.1-13
pve-docs: 6.2-5
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-2
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-1
pve-qemu-kvm: 5.0.0-13
pve-xtermjs: 4.7.0-2
qemu-server: 6.2-14
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.4-pve1
PBS is on this release:
ii proxmox-backup 1.0-4 all Proxmox Backup Server metapackage
ii proxmox-backup-client 0.8.17-1 amd64 Proxmox Backup Client tools
ii proxmox-backup-docs 0.8.17-1 all Proxmox Backup Documentation
ii proxmox-backup-server 0.8.17-1 amd64 Proxmox Backup Server daemon with tools and GUI
ii proxmox-mini-journalreader 1.1-1 amd64 Minimal systemd Journal Reader
ii proxmox-widget-toolkit 2.2-12 all ExtJS Helper Classes for Proxmox
Any idea what's the cause here?
Also, most notably affected are Linux systems with qemu-guest-agent installed; systems without the guest agent seem to suffer less (and Windows systems have not been affected at all so far...)
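To narrow that down, I have been checking on each node which guests actually have the agent enabled and whether it still responds, roughly like this (again, 101 is just an example VM ID):

# list local VM configs that have an agent entry (agent: 1 means enabled)
grep -l '^agent:' /etc/pve/qemu-server/*.conf

# check a single guest's agent setting and whether the agent answers
qm config 101 | grep '^agent'
qm agent 101 ping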
Still, it is an awkward situation, because even with test runs I already hit massive problems...
Any idea?
Thanks!
Tobias