Proxmox system crash by zvol Tainted

cola16

Proxmox crashed three days ago.
When I run `history` in the shell, I don't see any commands from three days ago. The syslog only shows errors from other Proxmox nodes. It ends with the following; these are the last entries before the system started again.

```
Apr 12 00:00:56 pve3.rack.cola16 systemd[1]: Starting dpkg-db-backup.service - Daily dpkg database backup service...
Apr 12 00:00:56 pve3.rack.cola16 systemd[1]: Starting logrotate.service - Rotate log files...
Apr 12 00:00:56 pve3.rack.cola16 systemd[1]: dpkg-db-backup.service: Deactivated successfully.
Apr 12 00:00:56 pve3.rack.cola16 systemd[1]: Finished dpkg-db-backup.service - Daily dpkg database backup service.
Apr 12 00:00:56 pve3.rack.cola16 ceph-mgr[2332]: 2024-04-12T00:00:56.620+0900 7f6abaccf6c0 -1 received signal: Hangup from killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw rbd-mirror cephfs-mirror (PID: 152265) UID: 0
Apr 12 00:00:56 pve3.rack.cola16 ceph-mds[2329]: 2024-04-12T00:00:56.620+0900 7f58f0d0c6c0 -1 received signal: Hangup from killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw rbd-mirror cephfs-mirror (PID: 152265) UID: 0
Apr 12 00:00:56 pve3.rack.cola16 ceph-mon[2333]: 2024-04-12T00:00:56.620+0900 7fb0f36596c0 -1 received signal: Hangup from killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw rbd-mirror cephfs-mirror (PID: 152265) UID: 0
Apr 12 00:00:56 pve3.rack.cola16 ceph-mon[2333]: 2024-04-12T00:00:56.620+0900 7fb0f36596c0 -1 mon.pve3@2(peon) e31 *** Got Signal Hangup ***
Apr 12 00:00:56 pve3.rack.cola16 ceph-osd[117181]: 2024-04-12T00:00:56.620+0900 7f18f9cc16c0 -1 received signal: Hangup from killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw rbd-mirror cephfs-mirror (PID: 152265) UID: 0
Apr 12 00:00:56 pve3.rack.cola16 ceph-osd[117223]: 2024-04-12T00:00:56.620+0900 7fc80e5e26c0 -1 received signal: Hangup from killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw rbd-mirror cephfs-mirror (PID: 152265) UID: 0
Apr 12 00:00:56 pve3.rack.cola16 ceph-mds[2329]: 2024-04-12T00:00:56.668+0900 7f58f0d0c6c0 -1 received signal: Hangup from (PID: 152266) UID: 0
Apr 12 00:00:56 pve3.rack.cola16 ceph-osd[117223]: 2024-04-12T00:00:56.668+0900 7fc80e5e26c0 -1 received signal: Hangup from (PID: 152266) UID: 0
Apr 12 00:00:56 pve3.rack.cola16 ceph-mon[2333]: 2024-04-12T00:00:56.668+0900 7fb0f36596c0 -1 received signal: Hangup from (PID: 152266) UID: 0
Apr 12 00:00:56 pve3.rack.cola16 ceph-mon[2333]: 2024-04-12T00:00:56.668+0900 7fb0f36596c0 -1 mon.pve3@2(peon) e31 *** Got Signal Hangup ***
Apr 12 00:00:56 pve3.rack.cola16 ceph-mgr[2332]: 2024-04-12T00:00:56.668+0900 7f6abaccf6c0 -1 received signal: Hangup from (PID: 152266) UID: 0
Apr 12 00:00:56 pve3.rack.cola16 ceph-osd[117181]: 2024-04-12T00:00:56.668+0900 7f18f9cc16c0 -1 received signal: Hangup from (PID: 152266) UID: 0
Apr 12 00:00:57 pve3.rack.cola16 systemd[1]: Reloading pveproxy.service - PVE API Proxy Server...
Apr 12 00:00:57 pve3.rack.cola16 pveproxy[152285]: send HUP to 2816
Apr 12 00:00:57 pve3.rack.cola16 pveproxy[2816]: received signal HUP
Apr 12 00:00:57 pve3.rack.cola16 pveproxy[2816]: server closing
Apr 12 00:00:57 pve3.rack.cola16 pveproxy[2816]: server shutdown (restart)
Apr 12 00:00:57 pve3.rack.cola16 systemd[1]: Reloaded pveproxy.service - PVE API Proxy Server.
Apr 12 00:00:58 pve3.rack.cola16 systemd[1]: Reloading spiceproxy.service - PVE SPICE Proxy Server...
Apr 12 00:00:58 pve3.rack.cola16 spiceproxy[152287]: send HUP to 2822
Apr 12 00:00:58 pve3.rack.cola16 spiceproxy[2822]: received signal HUP
Apr 12 00:00:58 pve3.rack.cola16 spiceproxy[2822]: server closing
Apr 12 00:00:58 pve3.rack.cola16 spiceproxy[2822]: server shutdown (restart)
Apr 12 00:00:58 pve3.rack.cola16 systemd[1]: Reloaded spiceproxy.service - PVE SPICE Proxy Server.
Apr 12 00:00:58 pve3.rack.cola16 pvefw-logger[1649]: received terminate request (signal)
Apr 12 00:00:58 pve3.rack.cola16 pvefw-logger[1649]: stopping pvefw logger
Apr 12 00:00:58 pve3.rack.cola16 systemd[1]: Stopping pvefw-logger.service - Proxmox VE firewall logger...
Apr 12 00:00:58 pve3.rack.cola16 spiceproxy[2822]: restarting server
Apr 12 00:00:58 pve3.rack.cola16 spiceproxy[2822]: starting 1 worker(s)
Apr 12 00:00:58 pve3.rack.cola16 spiceproxy[2822]: worker 152295 started
Apr 12 00:00:58 pve3.rack.cola16 systemd[1]: pvefw-logger.service: Deactivated successfully.
Apr 12 00:00:58 pve3.rack.cola16 systemd[1]: Stopped pvefw-logger.service - Proxmox VE firewall logger.
Apr 12 00:00:58 pve3.rack.cola16 systemd[1]: pvefw-logger.service: Consumed 2.001s CPU time.
Apr 12 00:00:58 pve3.rack.cola16 systemd[1]: Starting pvefw-logger.service - Proxmox VE firewall logger...
Apr 12 00:00:58 pve3.rack.cola16 pvefw-logger[152298]: starting pvefw logger
Apr 12 00:00:58 pve3.rack.cola16 systemd[1]: Started pvefw-logger.service - Proxmox VE firewall logger.
Apr 12 00:00:58 pve3.rack.cola16 systemd[1]: logrotate.service: Deactivated successfully.
Apr 12 00:00:58 pve3.rack.cola16 systemd[1]: Finished logrotate.service - Rotate log files.
Apr 12 00:00:58 pve3.rack.cola16 pveproxy[2816]: restarting server
Apr 12 00:00:58 pve3.rack.cola16 pveproxy[2816]: starting 3 worker(s)
Apr 12 00:00:58 pve3.rack.cola16 pveproxy[2816]: worker 152302 started
Apr 12 00:00:58 pve3.rack.cola16 pveproxy[2816]: worker 152303 started
Apr 12 00:00:58 pve3.rack.cola16 pveproxy[2816]: worker 152304 started
Apr 12 00:01:02 pve3.rack.cola16 pveproxy[152303]: detected empty handle
Apr 12 00:01:02 pve3.rack.cola16 pveproxy[142793]: detected empty handle
Apr 12 00:01:03 pve3.rack.cola16 spiceproxy[2823]: worker exit
Apr 12 00:01:03 pve3.rack.cola16 spiceproxy[2822]: worker 2823 finished
Apr 12 00:01:03 pve3.rack.cola16 pveproxy[134773]: worker exit
Apr 12 00:01:03 pve3.rack.cola16 pveproxy[2816]: worker 134773 finished
Apr 12 00:01:03 pve3.rack.cola16 pveproxy[2816]: worker 143516 finished
Apr 12 00:01:03 pve3.rack.cola16 pveproxy[2816]: worker 142793 finished
Apr 12 00:01:05 pve3.rack.cola16 pveproxy[152357]: worker exit
Apr 12 00:01:06 pve3.rack.cola16 pveproxy[152358]: worker exit
Apr 12 00:01:22 pve3.rack.cola16 pveproxy[152302]: detected empty handle
Apr 12 00:01:22 pve3.rack.cola16 pveproxy[152303]: detected empty handle
Apr 12 00:01:42 pve3.rack.cola16 pveproxy[152304]: detected empty handle
Apr 12 00:01:42 pve3.rack.cola16 pveproxy[152302]: detected empty handle
Apr 12 00:02:02 pve3.rack.cola16 pveproxy[152303]: detected empty handle
Apr 12 00:02:02 pve3.rack.cola16 pveproxy[152304]: detected empty handle
```

What can I do?
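
One way to pin down when the node actually went down is to look for a large jump between consecutive syslog timestamps. The following is only a minimal sketch: it assumes the traditional `Mon DD HH:MM:SS` prefix seen in the log above, an English locale for month names, and a year supplied by hand (the log lines do not carry one). The names `parse_timestamp` and `find_gaps` and the 5-minute threshold are made up for illustration.

```python
from datetime import datetime, timedelta


def parse_timestamp(line, year=2024):
    """Parse the leading 'Apr 12 00:00:56' stamp; return None if the line has none."""
    try:
        # The first 15 characters hold the month, day and time in classic syslog format.
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def find_gaps(lines, threshold=timedelta(minutes=5), year=2024):
    """Return (last_line_before_gap, first_line_after_gap, gap) for each long silence."""
    gaps = []
    prev_ts, prev_line = None, None
    for line in lines:
        ts = parse_timestamp(line, year)
        if ts is None:
            continue  # skip continuation lines or lines in another format
        if prev_ts is not None and ts - prev_ts >= threshold:
            gaps.append((prev_line, line, ts - prev_ts))
        prev_ts, prev_line = ts, line
    return gaps
```

Feeding it the lines of the syslog file (for example `find_gaps(open(path).read().splitlines())`, with `path` pointing at wherever the log lives on your node) lists each silence together with the last entry written before it. That last entry is the closest the log gets to the moment of the crash.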
 
Any changes or improvements to this? I haven't updated in probably three weeks or more, and all of a sudden today I have had two complete hard locks of the system (the first ever in 1.5 years of running Proxmox).
Reviewing the logs, I am finding *nothing* that seems related to the issue or looks like a smoking gun. Just the usual stuff logged up until around the time it goes down.

Edit: Ran an update/upgrade in the meantime. So far no issues... (PVE 8.2.2, Linux 6.8.4)
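
When reviewing logs by eye turns up nothing, a quick scan for typical kernel crash markers can at least confirm that none were written. This is a rough sketch, not a complete diagnostic: the marker list is my own selection of strings the Linux kernel commonly prints around crashes, and `find_crash_hints` is a hypothetical helper name.

```python
import re

# Strings commonly seen in kernel messages around crashes; an assumed,
# non-exhaustive selection that you may want to extend.
CRASH_MARKERS = re.compile(
    r"Tainted:|Call Trace:|Kernel panic|BUG:|Oops|\[Hardware Error\]|mce:|Out of memory"
)


def find_crash_hints(lines):
    """Return (line_number, line) pairs for lines containing a crash marker."""
    return [
        (number, line)
        for number, line in enumerate(lines, start=1)
        if CRASH_MARKERS.search(line)
    ]
```

An empty result does not rule out a crash. On a hard lock the kernel often cannot write anything to disk, which would match logs that simply stop.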
 
Unfortunately, that did not work.
I reinstalled Proxmox. That did not work either.

Maybe it's a hardware issue.

The puzzling part is that there were no issues for about 2.5 months after the machine was configured.
The last hardware change was replacing the RAM.
After the RAM replacement, I had no issues for two weeks, and then this happened.
 
I used the memory overclock setting to lower the memory clock by 200 MHz, and the system has been running fine for 2 days and 7 hours now.
I'll try lowering it further if it shuts down again.
The RAM passed memtest for 8 hours at normal clocks (before I lowered the clock), but I think it's just a matter of the memory clock.
 