Hi,
The I/O issue I had in https://forum.proxmox.com/threads/i...5-39-1-pve-bug-soft-lockup-inside-vms.113373/ isn't fixed by the VM disk parameter aio=thread…
Today I live-migrated a VM with two disks (25G + 75G) to a node, and after that many (all?) VMs on that host had issues. CPU usage went up and I got "CPU stuck for 94271s" messages in the VM console. Perhaps the migration speed of >650MB/s is a problem for the kernel to satisfy the requests of the guests? But why CPU stuck??
94271s is more than 26 hours, so the log file shows entries dated tomorrow. It also shows that network access is broken (gateway not reachable):
Code:
more /var/log/ha-debug
Oct 07 13:38:40 vappdb04-prod heartbeat: [1409]: info: log-rotate detected on logfile /var/log/ha-debug
Oct 07 13:38:40 vappdb04-prod heartbeat: [1409]: info: log-rotate detected on logfile /var/log/ha-log
Oct 07 13:38:40 vappdb04-prod heartbeat: [1409]: WARN: Gmain_timeout_dispatch: Dispatch function for send local status was delayed 101217210 ms (> 510 ms) before being called (GSource: 0x5619a0d86cc0)
Oct 07 13:38:40 vappdb04-prod heartbeat: [1409]: info: Gmain_timeout_dispatch: started at 1790452651 should have started at 1780330930
Oct 07 13:38:40 vappdb04-prod heartbeat: [1409]: CRIT: Late heartbeat: Node vappdb04-prod: interval 101218210 ms (> deadtime)
Oct 07 13:38:40 vappdb04-prod heartbeat: [1409]: WARN: node 10.XXX.XXX.1: is dead
Oct 07 13:38:40 vappdb04-prod heartbeat: [1409]: WARN: node vappdb03-prod: is dead
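As a quick sanity check on the numbers above, here is a minimal sketch that converts the "CPU stuck" time and the heartbeat delay into hours. It is not part of the original output; the variable and function names are only illustrative, and the log line is shortened.

```python
import re
from datetime import timedelta

# Value taken from the "CPU stuck for 94271s" message in the VM console.
stuck = timedelta(seconds=94271)

# One heartbeat warning from the ha-debug log (shortened for readability).
line = ("WARN: Gmain_timeout_dispatch: Dispatch function for send local status "
        "was delayed 101217210 ms (> 510 ms) before being called")

# Pull the delay in milliseconds out of the log line.
match = re.search(r"delayed (\d+) ms", line)
delay = timedelta(milliseconds=int(match.group(1)))


def hours(td):
    """Return a timedelta as a floating-point number of hours."""
    return td.total_seconds() / 3600


print(f"CPU stuck:       {stuck} ({hours(stuck):.1f} h)")
print(f"heartbeat delay: {delay} ({hours(delay):.1f} h)")
```

Both values come out at more than a day, which matches the guest clock jumping ahead rather than a real stall of that length.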
The pveversion of the target host:
Code:
pveversion -v
proxmox-ve: 7.2-1 (running kernel: 5.15.53-1-pve)
pve-manager: 7.2-11 (running version: 7.2-11/b76d3178)
pve-kernel-helper: 7.2-12
pve-kernel-5.15: 7.2-10
pve-kernel-5.11: 7.0-10
pve-kernel-5.15.53-1-pve: 5.15.53-1
pve-kernel-5.11.22-7-pve: 5.11.22-12
pve-kernel-5.11.22-5-pve: 5.11.22-10
pve-kernel-5.11.22-4-pve: 5.11.22-9
ceph-fuse: 15.2.17-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve1
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.2-2
libpve-guest-common-perl: 4.1-2
libpve-http-server-perl: 4.1-3
libpve-storage-perl: 7.2-8
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
openvswitch-switch: 2.15.0+ds1-2+deb11u1
proxmox-backup-client: 2.2.6-1
proxmox-backup-file-restore: 2.2.6-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.1
pve-cluster: 7.2-2
pve-container: 4.2-2
pve-docs: 7.2-2
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-6
pve-firmware: 3.5-2
pve-ha-manager: 3.4.0
pve-i18n: 2.7-2
pve-qemu-kvm: 7.0.0-3
pve-xtermjs: 4.16.0-1
qemu-server: 7.2-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.5-pve1
The host has an AMD CPU with the latest BIOS update.
After rebooting the host, the VMs started normally.
Udo