[Solved] Instant Reboot on PVE when running container/VM

chrsgrhmgrhm

Jul 3, 2024
Hello all.
I'm running a new install of PVE (8.2.4) and am experiencing some weird reboots.
The reboots are not graceful, and nothing is logged about what went wrong. I just get "-- Reboot --". Sometimes the host stays up for hours, sometimes only for minutes.
At first Memtest was not able to complete, although no explicit errors appeared on the console. Sometimes it ran for 15 minutes, sometimes for hours; I just had not seen it finish. Update: Memtest has since passed.
I ran prime95 for 4 hours without issues. I stopped it early, so I can't say whether it would have completed.

I turned off auto-start on all VMs and LXCs and rebooted, and the host ran fine all night (12+ hours). I then started a VM (TrueNAS) and the host crashed within 15 minutes. I let it come back up, waited an hour without reboots, then started an LXC (Heimdall) and the host crashed within 30 minutes.

I'm running journalctl -f and nothing is logged before the reboot.
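For anyone following along: besides watching journalctl -f live, it can help to look at the tail of the previous boot after the host comes back. The sketch below assumes the journal is persistent (the "-- Reboot --" marker suggests it is here). The function names are made up for illustration; the journalctl options are standard.

```shell
#!/bin/sh
# Sketch only: helpers for inspecting what was logged just before a crash.
# Function names are hypothetical; journalctl flags are standard systemd ones.

# Print the N lines that precede each "-- Reboot --" marker in a saved log
# read from stdin (for example a syslog export from the GUI).
# Uses grep -B, which is available in GNU and BSD grep.
lines_before_reboot() {
    n="${1:-20}"
    grep -B "$n" -e '-- Reboot --'
}

# Show the end of the previous boot's journal, then only kernel messages
# of priority "warning" or worse from that same boot.
show_previous_boot() {
    journalctl --list-boots --no-pager
    journalctl -b -1 -n "${1:-50}" --no-pager
    journalctl -k -b -1 -p warning --no-pager
}
```

Run show_previous_boot as root right after an unexpected reboot. If the previous boot simply stops mid-stream with no shutdown messages, the reset was most likely triggered outside normal kernel logging (power, hardware, or a watchdog).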

My hardware:
Intel i5 12400 - *NEW*
ASRock Z690m PG RIPTIDE mATX - *USED*
Crucial CT2K32G48C40U5 2x32GB DDR5-4800 - *NEW*
1x Seagate IronWolf 6TB 7200rpm - *NEW*
1x Kingston NV2 500GB M.2 - *NEW*
LEPA G1200 PSU - *USED*
Fractal Node 804 Case - *USED*

I read through this post: https://forum.proxmox.com/threads/proxmox-mystery-random-reboots.125001/ and changed the following line in my /etc/default/grub: GRUB_CMDLINE_LINUX_DEFAULT="quiet pci=assign-busses apicmaintimer idle=poll reboot=cold,hard". That did not fix it.
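A note on the GRUB change: editing the file alone does nothing until the GRUB config is regenerated and the host is rebooted, so it is worth confirming that the options actually reached the running kernel. This is a minimal sketch assuming the host boots via GRUB. Hosts that boot via systemd-boot (for example ZFS root on UEFI) keep the command line in /etc/kernel/cmdline and apply it with proxmox-boot-tool refresh. The helper names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: verify that kernel options from /etc/default/grub are active.
# Helper names are hypothetical.
#
# After editing GRUB_CMDLINE_LINUX_DEFAULT, as root:
#   update-grub    # regenerate /boot/grub/grub.cfg
#   reboot

# has_kernel_param CMDLINE PARAM
# Succeeds if PARAM appears as a whole space-separated word in CMDLINE.
has_kernel_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# check_running_cmdline PARAM...
# Reports, for each PARAM, whether the running kernel was booted with it.
check_running_cmdline() {
    cmdline=$(cat /proc/cmdline)
    for param in "$@"; do
        if has_kernel_param "$cmdline" "$param"; then
            echo "active:  $param"
        else
            echo "missing: $param"
        fi
    done
}
```

Example: check_running_cmdline pci=assign-busses apicmaintimer idle=poll reboot=cold,hard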

Attached are some system logs from the GUI, some journalctl -f logs, and then the pveversion -v.

What I haven't had the time to try yet:
Test individual sticks of RAM, in different slots (mobo is used).
Test PSU (it's a few years old, but nothing has been concerning up to this point).
Hook it up to a monitor to see if the video out shows anything.

Here you can see the moments where it reboots:
[Attached screenshot: 1720020367724.png]

Passed memtest.

Code:
root@pve:~# memtester 1024 3
memtester version 4.6.0 (64-bit)
Copyright (C) 2001-2020 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).


pagesize is 4096
pagesizemask is 0xfffffffffffff000
want 1024MB (1073741824 bytes)
got  1024MB (1073741824 bytes), trying mlock ...locked.
Loop 1/3:
  Stuck Address       : ok         
  Random Value        : ok
  Compare XOR         : ok
  Compare SUB         : ok
  Compare MUL         : ok
  Compare DIV         : ok
  Compare OR          : ok
  Compare AND         : ok
  Sequential Increment: ok
  Solid Bits          : ok         
  Block Sequential    : ok         
  Checkerboard        : ok         
  Bit Spread          : ok         
  Bit Flip            : ok         
  Walking Ones        : ok         
  Walking Zeroes      : ok         
  8-bit Writes        : ok
  16-bit Writes       : ok


Loop 2/3:
  Stuck Address       : ok         
  Random Value        : ok
  Compare XOR         : ok
  Compare SUB         : ok
  Compare MUL         : ok
  Compare DIV         : ok
  Compare OR          : ok
  Compare AND         : ok
  Sequential Increment: ok
  Solid Bits          : ok         
  Block Sequential    : ok         
  Checkerboard        : ok         
  Bit Spread          : ok         
  Bit Flip            : ok         
  Walking Ones        : ok         
  Walking Zeroes      : ok         
  8-bit Writes        : ok
  16-bit Writes       : ok


Loop 3/3:
  Stuck Address       : ok         
  Random Value        : ok
  Compare XOR         : ok
  Compare SUB         : ok
  Compare MUL         : ok
  Compare DIV         : ok
  Compare OR          : ok
  Compare AND         : ok
  Sequential Increment: ok
  Solid Bits          : ok         
  Block Sequential    : ok         
  Checkerboard        : ok         
  Bit Spread          : ok         
  Bit Flip            : ok         
  Walking Ones        : ok         
  Walking Zeroes      : ok         
  8-bit Writes        : ok
  16-bit Writes       : ok
 
root@pve:~# pveversion -v
proxmox-ve: 8.2.0 (running kernel: 6.8.8-2-pve)
pve-manager: 8.2.4 (running version: 8.2.4/faa83925c9641325)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.8: 6.8.8-2
proxmox-kernel-6.8.8-2-pve-signed: 6.8.8-2
proxmox-kernel-6.8.4-2-pve-signed: 6.8.4-2
ceph-fuse: 17.2.7-pve3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx8
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.1
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.3
libpve-access-control: 8.1.4
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.7
libpve-cluster-perl: 8.0.7
libpve-common-perl: 8.2.1
libpve-guest-common-perl: 5.1.3
libpve-http-server-perl: 5.1.0
libpve-network-perl: 0.9.8
libpve-rs-perl: 0.8.9
libpve-storage-perl: 8.2.3
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.2.4-1
proxmox-backup-file-restore: 3.2.4-1
proxmox-firewall: 0.4.2
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.6
proxmox-widget-toolkit: 4.2.3
pve-cluster: 8.0.7
pve-container: 5.1.12
pve-docs: 8.2.2
pve-edk2-firmware: 4.2023.08-4
pve-esxi-import-tools: 0.7.1
pve-firewall: 5.0.7
pve-firmware: 3.12-1
pve-ha-manager: 4.0.5
pve-i18n: 3.2.2
pve-qemu-kvm: 9.0.0-3
pve-xtermjs: 5.3.0-3
qemu-server: 8.2.1
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.4-pve1

Linux pve 6.8.8-2-pve #1 SMP PREEMPT_DYNAMIC PMX 6.8.8-2 (2024-06-24T09:00Z) x86_64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Wed Jul 3 09:34:00 CDT 2024 on pts/0
root@pve:~# journalctl -f
Jul 03 09:39:35 pve systemd[1532]: Listening on gpg-agent-extra.socket - GnuPG cryptographic agent and passphrase cache (restricted).
Jul 03 09:39:35 pve systemd[1532]: Listening on gpg-agent-ssh.socket - GnuPG cryptographic agent (ssh-agent emulation).
Jul 03 09:39:35 pve systemd[1532]: Listening on gpg-agent.socket - GnuPG cryptographic agent and passphrase cache.
Jul 03 09:39:35 pve systemd[1532]: Reached target sockets.target - Sockets.
Jul 03 09:39:35 pve systemd[1532]: Reached target basic.target - Basic System.
Jul 03 09:39:35 pve systemd[1532]: Reached target default.target - Main User Target.
Jul 03 09:39:35 pve systemd[1532]: Startup finished in 87ms.
Jul 03 09:39:35 pve systemd[1]: Started user@0.service - User Manager for UID 0.
Jul 03 09:39:35 pve systemd[1]: Started session-1.scope - Session 1 of User root.
Jul 03 09:39:35 pve login[1547]: ROOT LOGIN on '/dev/pts/0'
Jul 03 09:39:37 pve pveproxy[1222]: proxy detected vanished client connection
Jul 03 09:39:44 pve kernel: EXT4-fs (dm-8): 1 orphan inode deleted
Jul 03 09:39:44 pve kernel: EXT4-fs (dm-8): recovery complete
Jul 03 09:39:44 pve kernel: EXT4-fs (dm-8): mounted filesystem fb298ffc-6366-4159-9f7a-4b0478c28a38 r/w with ordered data mode. Quota mode: none.
Jul 03 09:39:44 pve audit[1584]: AVC apparmor="STATUS" operation="profile_load" profile="/usr/bin/lxc-start" name="lxc-103_</var/lib/lxc>" pid=1584 comm="apparmor_parser"
Jul 03 09:39:44 pve kernel: audit: type=1400 audit(1720017584.721:28): apparmor="STATUS" operation="profile_load" profile="/usr/bin/lxc-start" name="lxc-103_</var/lib/lxc>" pid=1584 comm="apparmor_parser"
Jul 03 09:39:45 pve kernel: vmbr0: port 2(veth103i0) entered blocking state
Jul 03 09:39:45 pve kernel: vmbr0: port 2(veth103i0) entered disabled state
Jul 03 09:39:45 pve kernel: veth103i0: entered allmulticast mode
Jul 03 09:39:45 pve kernel: veth103i0: entered promiscuous mode
Jul 03 09:39:45 pve kernel: eth0: renamed from vethwtRQbR
Jul 03 09:39:45 pve kernel: vmbr0: port 2(veth103i0) entered blocking state
Jul 03 09:39:45 pve kernel: vmbr0: port 2(veth103i0) entered forwarding state
Jul 03 09:39:48 pve pveproxy[1224]: proxy detected vanished client connection
Jul 03 09:39:50 pve kernel: EXT4-fs (dm-12): recovery complete
Jul 03 09:39:50 pve kernel: EXT4-fs (dm-12): mounted filesystem abd61e48-00dd-4abd-8012-455bc1522954 r/w with ordered data mode. Quota mode: none.
Jul 03 09:39:50 pve audit[1987]: AVC apparmor="STATUS" operation="profile_load" profile="/usr/bin/lxc-start" name="lxc-101_</var/lib/lxc>" pid=1987 comm="apparmor_parser"
Jul 03 09:39:50 pve kernel: audit: type=1400 audit(1720017590.347:29): apparmor="STATUS" operation="profile_load" profile="/usr/bin/lxc-start" name="lxc-101_</var/lib/lxc>" pid=1987 comm="apparmor_parser"
Jul 03 09:39:50 pve kernel: vmbr0: port 3(fwpr101p0) entered blocking state
Jul 03 09:39:50 pve kernel: vmbr0: port 3(fwpr101p0) entered disabled state
Jul 03 09:39:50 pve kernel: fwpr101p0: entered allmulticast mode
Jul 03 09:39:50 pve kernel: fwpr101p0: entered promiscuous mode
Jul 03 09:39:50 pve kernel: vmbr0: port 3(fwpr101p0) entered blocking state
Jul 03 09:39:50 pve kernel: vmbr0: port 3(fwpr101p0) entered forwarding state
Jul 03 09:39:50 pve kernel: fwbr101i0: port 1(fwln101i0) entered blocking state
Jul 03 09:39:50 pve kernel: fwbr101i0: port 1(fwln101i0) entered disabled state
Jul 03 09:39:50 pve kernel: fwln101i0: entered allmulticast mode
Jul 03 09:39:50 pve kernel: fwln101i0: entered promiscuous mode
Jul 03 09:39:50 pve kernel: fwbr101i0: port 1(fwln101i0) entered blocking state
Jul 03 09:39:50 pve kernel: fwbr101i0: port 1(fwln101i0) entered forwarding state
Jul 03 09:39:50 pve kernel: fwbr101i0: port 2(veth101i0) entered blocking state
Jul 03 09:39:50 pve kernel: fwbr101i0: port 2(veth101i0) entered disabled state
Jul 03 09:39:50 pve kernel: veth101i0: entered allmulticast mode
Jul 03 09:39:50 pve kernel: veth101i0: entered promiscuous mode
Jul 03 09:39:50 pve kernel: eth0: renamed from veth3LfQgT
Jul 03 09:39:50 pve pvedaemon[1213]: <root@pam> end task UPID:pve:000005AD:0000250B:6685628A:vzstart:101:root@pam: OK
Jul 03 09:39:50 pve pvedaemon[1212]: <root@pam> end task UPID:pve:000005A2:000022E6:66856285:vzstart:103:root@pam: OK
Jul 03 09:39:50 pve pvestatd[1200]: modified cpu set for lxc/101: 0
Jul 03 09:39:50 pve pvestatd[1200]: modified cpu set for lxc/103: 0
Jul 03 09:39:50 pve pvestatd[1200]: modified cpu set for lxc/101: 1
Jul 03 09:39:51 pve pvestatd[1200]: status update time (41.996 seconds)
Jul 03 09:39:51 pve kernel: fwbr101i0: port 2(veth101i0) entered blocking state
Jul 03 09:39:51 pve kernel: fwbr101i0: port 2(veth101i0) entered forwarding state
 
Appears to be solved. I had a cluster set up from when I was trying out clustering (which I no longer want to do).
I removed it, and the host has now been up for 3 days.
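This thread does not establish why removing the cluster stopped the reboots. One possibility, which is only an assumption here, is watchdog-based self-fencing: a node with active HA resources that loses quorum resets itself on purpose, and that looks like an instant reboot with nothing logged. For anyone with the same symptom, the sketch below checks whether a leftover cluster config exists and whether the node is quorate. The function names are made up for illustration; pvecm status and ha-manager status are the standard PVE commands.

```shell
#!/bin/sh
# Sketch only: check for a leftover cluster config and its quorum state.
# Function names are hypothetical.

# is_quorate: reads `pvecm status` output on stdin and succeeds if the
# "Quorate:" line says Yes.
is_quorate() {
    grep -Eq '^Quorate:[[:space:]]+Yes'
}

# cluster_report: run as root on the PVE host.
cluster_report() {
    if [ ! -f /etc/pve/corosync.conf ]; then
        echo "no corosync.conf: node is not part of a cluster"
        return 0
    fi
    if pvecm status | is_quorate; then
        echo "cluster configured, quorate"
    else
        echo "cluster configured, NOT quorate"
    fi
    # Lists HA resources, if any; self-fencing only applies when HA is in use.
    ha-manager status
}
```

If cluster_report shows a configured but non-quorate cluster with HA resources, that would be consistent with the fencing explanation. It is not proof that fencing caused the reboots reported above.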
 
