Tbh, here if an ACPI/QEMU shutdown fails, it simply times out and doesn't stop the VM.
Will try that when my ACPI shutdown won't work anymore.
By the way... shouldn't that job finish with a warning instead of an OK, if a hard stop was required because the shutdown failed?
With PVE 7.4 it shut down that TrueNAS VM so that it stopped before the timeout triggered.
Tbh, here if an ACPI/QEMU shutdown fails, it simply times out and doesn't stop the VM.
I just get an error, something like "failed to shutdown: timed out" (after around 2 minutes).
Then I can either console into the VM and do a "shutdown now", or hard stop it myself.
I think the only moment it actually does a hard stop is when I shut down or reboot the whole node while the VM is still running on it,
and the VM isn't configured for ACPI or has no QEMU agent.
It then waits 2 minutes for the shutdown, and if that times out, I think it will hard stop the VM.
But by default, at least here, it just times out without doing anything.
I'm a bit confused about what ACPI or the qemu-guest-agent should actually do by default.
With PVE 7.4 it shut down that TrueNAS VM so that it stopped before the timeout triggered.
But now with PVE 8 it looks like the shutdown task just kills the VM without actually waiting for the guest OS to be properly shut down. :/
As a workaround I could use the TrueNAS API or the TrueNAS webUI to shut down the VM from within the guest OS, but that would still be problematic in case NUT triggers a shutdown of all the servers, as PVE will then try to shut down the VM on its own.
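As a rough sketch of that workaround: TrueNAS exposes a REST API that can trigger a clean in-guest shutdown from outside. The hostname and API key below are placeholders, and the exact endpoint should be checked against the API docs of your TrueNAS version before relying on it:

```shell
# Hypothetical example: ask TrueNAS itself to shut down cleanly via its
# v2.0 REST API. "truenas.local" and $TRUENAS_API_KEY are placeholders.
curl -s -X POST \
  -H "Authorization: Bearer $TRUENAS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{}' \
  https://truenas.local/api/v2.0/system/shutdown
```

This only helps for manually initiated shutdowns, though; it doesn't change what PVE does on its own when NUT tells the node to power off.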
In the past I already had a problem where sometimes it tried to shut down a VM (not sure if it was my OPNsense VMs or the TrueNAS VMs): after the shutdown task I could see in the guest's console that the guest OS was shutting down and stopping services, but then it got stuck, and the shutdown task failed with the VM still running but not responding (because the guest OS had already shut down nearly everything), so I had to run a stop task to stop it (which was fine, as the filesystems were unmounted by then).
But in the last months there were no shutdown problems at all.
Each VM got a "Shutdown timeout" option in the webUI at VM -> Options -> Start/Shutdown Order. But here that is everywhere set to the default, so it should be the default 180 seconds and not kill the VM after 15 seconds: https://pve.proxmox.com/pve-docs/chapter-qm.html#qm_startup_and_shutdown
Same thing in the "Shutdown timeout" description: they mention 180s by default there.
--> I don't know where to change the default value of 180 seconds (probably with some CLI command).
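Not verified here, but as a sketch: the per-VM value behind that webUI option can presumably also be set from the CLI with `qm set` (the `down` field of the `startup` property is the shutdown timeout in seconds), and a single shutdown can override it with `--timeout`. VM ID 100 and the 300s value are just examples:

```shell
# Example only (VM ID 100 and 300s are placeholders).
# Set the per-VM shutdown timeout via the startup property:
qm set 100 --startup down=300

# Or override the timeout for one shutdown, and ask qm to make sure
# the VM really stops (hard stop) if the guest doesn't power off:
qm shutdown 100 --timeout 300 --forceStop 1
```

Whether there is a cluster-wide default separate from the per-VM option, I don't know either; the docs only mention the 180s default.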
Exactly, that's what I meant.
[79000.018843] pverados[1002335]: segfault at 55a429fac030 ip 000055a429fac030 sp 00007ffd0bdeb2c8 error 14 in perl[55a429f80000+195000] likely on CPU 0 (core 0, socket 0)
[79000.018858] Code: Unable to access opcode bytes at 0x55a429fac006.
[112691.083445] pverados[1426585]: segfault at 55a429fac030 ip 000055a429fac030 sp 00007ffd0bdeb2c8 error 14 in perl[55a429f80000+195000] likely on CPU 3 (core 3, socket 0)
[112691.083459] Code: Unable to access opcode bytes at 0x55a429fac006.
[115751.124815] pverados[1464845]: segfault at 55a429fac030 ip 000055a429fac030 sp 00007ffd0bdeb2c8 error 14 in perl[55a429f80000+195000] likely on CPU 22 (core 6, socket 0)
[115751.124830] Code: Unable to access opcode bytes at 0x55a429fac006.
[116981.038112] pverados[1480841]: segfault at 55a429fac030 ip 000055a429fac030 sp 00007ffd0bdeb2c8 error 14 in perl[55a429f80000+195000] likely on CPU 3 (core 3, socket 0)
[116981.038126] Code: Unable to access opcode bytes at 0x55a429fac006.
[118159.853135] pverados[1495409]: segfault at 55a429fac030 ip 000055a429fac030 sp 00007ffd0bdeb2c8 error 14 likely on CPU 27 (core 11, socket 0)
[118159.853145] Code: Unable to access opcode bytes at 0x55a429fac006.
[126120.960397] pverados[1596000]: segfault at 55a429fac030 ip 000055a429fac030 sp 00007ffd0bdeb2c8 error 14 in perl[55a429f80000+195000] likely on CPU 24 (core 8, socket 0)
[126120.960412] Code: Unable to access opcode bytes at 0x55a429fac006.
[131860.866128] pverados[1668760]: segfault at 55a429fac030 ip 000055a429fac030 sp 00007ffd0bdeb2c8 error 14 in perl[55a429f80000+195000] likely on CPU 29 (core 13, socket 0)
[131860.866145] Code: Unable to access opcode bytes at 0x55a429fac006.
[146685.756874] pverados[1858147]: segfault at 55fa0ab86e90 ip 000055fa07a3609d sp 00007fff61dbff30 error 7 in perl[55fa0795b000+195000] likely on CPU 14 (core 14, socket 0)
[146685.756886] Code: 0f 95 c2 c1 e2 05 08 55 00 41 83 47 08 01 48 8b 53 08 22 42 23 0f b6 c0 66 89 45 02 49 8b 07 8b 78 60 48 8b 70 48 44 8d 6f 01 <44> 89 68 60 41 83 fd 01 0f 8f 4d 04 00 00 48 8b 56 08 49 63 c5 48
[147674.484918] pverados[1870446]: segfault at 55fa08306910 ip 000055fa07a43c36 sp 00007fff61dc01a0 error 7 in perl[55fa0795b000+195000] likely on CPU 3 (core 3, socket 0)
[147674.484931] Code: 01 08 49 89 c6 e8 8a b1 02 00 48 8b 85 e0 00 00 00 4c 8b 44 24 08 48 8b 40 18 48 85 c0 0f 84 e3 02 00 00 48 8b 15 a2 52 21 00 <c7> 40 20 ff ff ff ff 66 48 0f 6e c8 48 89 50 28 48 8b 10 48 8b 12
[149555.347900] pverados[1894875]: segfault at 55fa0ab86e90 ip 000055fa07a3609d sp 00007fff61dbff30 error 7 in perl[55fa0795b000+195000] likely on CPU 16 (core 0, socket 0)
[149555.347913] Code: 0f 95 c2 c1 e2 05 08 55 00 41 83 47 08 01 48 8b 53 08 22 42 23 0f b6 c0 66 89 45 02 49 8b 07 8b 78 60 48 8b 70 48 44 8d 6f 01 <44> 89 68 60 41 83 fd 01 0f 8f 4d 04 00 00 48 8b 56 08 49 63 c5 48
[153775.382424] pverados[1947540]: segfault at 55fa08306910 ip 000055fa07a43c36 sp 00007fff61dc01a0 error 7 in perl[55fa0795b000+195000] likely on CPU 15 (core 15, socket 0)
[153775.382438] Code: 01 08 49 89 c6 e8 8a b1 02 00 48 8b 85 e0 00 00 00 4c 8b 44 24 08 48 8b 40 18 48 85 c0 0f 84 e3 02 00 00 48 8b 15 a2 52 21 00 <c7> 40 20 ff ff ff ff 66 48 0f 6e c8 48 89 50 28 48 8b 10 48 8b 12
see https://forum.proxmox.com/threads/pverados-segfault.130628/#post-575970
I have updated my test cluster (5 nodes). All works fine, but on a few nodes I see some errors in dmesg:
Hi, how did you fix this problem? I have the same error message.
Still no luck. I changed the USB disk and used dd on my Linux machine. The real fatal error is:
Code:
zstd uncompress failed with error code 20
FATAL ERROR: writer: failed to read/uncompress file /target/usr/lib/python3.11/re/__pycache__/_parser.cpython-311.pyc
https://pasteboard.co/tQJeNgT5edAh.jpg
https://pasteboard.co/aDPZIGG4sKNp.jpg
Hi!
Is it possible to make the "-dbg" version of the kernels available?
Code:
$> aptitude search linux-image | grep -i pve
v   linux-image-6.2.16-1-pve-amd64   -
v   linux-image-6.2.16-2-pve-amd64   -
v   linux-image-6.2.16-3-pve-amd64   -
v   linux-image-6.2.16-4-pve-amd64   -
v   linux-image-6.2.16-5-pve-amd64   -
Example (base Debian):
Code:
$> aptitude search linux-image-6.1.0-9
p   linux-image-6.1.0-9-amd64       - Linux 6.1 for 64-bit PCs (signed)
p   linux-image-6.1.0-9-amd64-dbg   - Debug symbols for linux-image-6.1.0-9-amd64
-device piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2
Yeah, if you install in BIOS mode then UEFI boot won't be set up, and switching later cannot work without any interaction. E.g., if the UEFI interface isn't available, we cannot register a boot entry in the EFI vars.
I changed BIOS to UEFI, but Proxmox boot failed; it shows the message: "Not bootable device ..."
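For what it's worth, a hedged sketch of the manual interaction that would be needed: if the disk already has a usable ESP partition, `proxmox-boot-tool` can set up UEFI booting on it from a booted (or live/rescue) system. The partition path below is a placeholder, and a system installed purely in BIOS mode may not have an ESP at all, in which case this won't apply:

```shell
# Example only: /dev/sda2 is a placeholder for your EFI system partition.
# Check the current boot setup first:
proxmox-boot-tool status

# Format the ESP and register it so the UEFI boot entries get created:
proxmox-boot-tool format /dev/sda2
proxmox-boot-tool init /dev/sda2
```

Formatting the partition is destructive, so double-check the device name before running this.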