PBS crashes without error message

Ray_

Member
Nov 5, 2021
Hey,

Since Friday, my PBS has been crashing during backups.

Here are the last lines of syslog before I had to hard-reset the server because it had become completely unresponsive: no web interface, no response to pings.
Code:
Feb 14 00:33:48 backup proxmox-backup-proxy[1842]: vm/107/2022-02-04T23:00:42Z remove
Feb 14 00:33:48 backup proxmox-backup-proxy[1842]: removing backup snapshot "/backups/vm/107/2022-02-04T23:00:42Z"
Feb 14 00:33:48 backup proxmox-backup-proxy[1842]: vm/107/2022-02-05T22:50:50Z keep
Feb 14 00:33:48 backup proxmox-backup-proxy[1842]: vm/107/2022-02-06T22:52:41Z keep
Feb 14 00:33:48 backup proxmox-backup-proxy[1842]: vm/107/2022-02-07T23:20:36Z keep
Feb 14 00:33:48 backup proxmox-backup-proxy[1842]: vm/107/2022-02-08T23:11:47Z keep
Feb 14 00:33:48 backup proxmox-backup-proxy[1842]: vm/107/2022-02-09T23:20:10Z keep
Feb 14 00:33:49 backup proxmox-backup-proxy[1842]: vm/107/2022-02-10T23:13:32Z keep
Feb 14 00:33:49 backup proxmox-backup-proxy[1842]: vm/107/2022-02-13T23:32:41Z keep
Feb 14 00:33:49 backup proxmox-backup-proxy[1842]: TASK OK
Feb 14 00:33:49 backup proxmox-backup-proxy[1842]: Upload backup log to Backups/vm/107/2022-02-13T23:32:41Z/client.log.blob
Feb 14 00:33:52 backup proxmox-backup-proxy[1842]: starting new backup on datastore 'Backups': "vm/108/2022-02-13T23:33:49Z"
Feb 14 00:33:52 backup proxmox-backup-proxy[1842]: download 'index.json.blob' from previous backup.
Feb 14 00:33:52 backup proxmox-backup-proxy[1842]: register chunks in 'drive-sata0.img.fidx' from previous backup.
Feb 14 00:33:53 backup proxmox-backup-proxy[1842]: download 'drive-sata0.img.fidx' from previous backup.
Feb 14 00:33:53 backup proxmox-backup-proxy[1842]: created new fixed index 1 ("vm/108/2022-02-13T23:33:49Z/drive-sata0.img.fidx")
Feb 14 00:33:53 backup proxmox-backup-proxy[1842]: register chunks in 'drive-sata1.img.fidx' from previous backup.
Feb 14 00:33:53 backup proxmox-backup-proxy[1842]: download 'drive-sata1.img.fidx' from previous backup.
Feb 14 00:33:53 backup proxmox-backup-proxy[1842]: created new fixed index 2 ("vm/108/2022-02-13T23:33:49Z/drive-sata1.img.fidx")
Feb 14 00:33:53 backup proxmox-backup-proxy[1842]: add blob "/backups/vm/108/2022-02-13T23:33:49Z/qemu-server.conf.blob" (323 bytes, comp: 323)
Feb 14 00:45:30 backup proxmox-backup-proxy[1842]: write rrd data back to disk
Feb 14 00:45:30 backup proxmox-backup-proxy[1842]: starting rrd data sync
Feb 14 00:45:30 backup proxmox-backup-proxy[1842]: rrd journal successfully committed (23 files in 0.280 seconds)


Is there any other log file that could give me more information? There is nothing interesting in kern.log.

Sincerely,
Ray_
 
Can you run 'dmesg -w' during a backup (e.g. in an SSH session)?
 
I don't think this will help, since the kern.log from the last crash has nothing in it around the time of the crash.
AFAIK, dmesg and kern.log contain the same information.
 
Yes, but if it crashes due to a faulty disk or similar, the last kernel messages may never be written to the log files on disk. In an SSH session you might still see them.
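
A minimal sketch of that idea, run from a second machine so the output survives a hard hang of the server. The SSH target `root@backup` and the log file name are placeholders, not values confirmed in this thread, and `stamp_lines` is a hypothetical helper that adds the local time to each line.

```shell
#!/usr/bin/env bash
# Run this on a SECOND machine, not on the PBS host itself.
# Assumption: root@backup is a placeholder for your own SSH target.
PBS_HOST="${PBS_HOST:-root@backup}"

# Hypothetical helper: prefix every line read from stdin with the local time,
# so the last line in the file shows when the server stopped responding.
stamp_lines() {
    local line
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$line"
    done
}

# Follow the remote kernel ring buffer and append it to a local file.
capture_kernel_log() {
    ssh "$PBS_HOST" 'dmesg --follow' | stamp_lines | tee -a "pbs-dmesg-$(date +%F).log"
}

# Start the capture manually once the functions are loaded:
#   capture_kernel_log
```

Because the file is written on the second machine, whatever the kernel printed last is still there after the reset, even if it never reached kern.log on the server.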
 
Weirdly enough, it has been working without a hiccup for the last 2 days.

Very strange. I'll keep dmesg running for a few more days just in case it happens again.
 
After 9 days of nothing happening, it crashed again today.
Weirdly enough, it happened 30 minutes after the backup and all tasks had finished.

Syslog and dmesg -w show nothing besides the finished tasks.
 
In that case I would look into the hardware, e.g. run a memtest and maybe try a different PSU.
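
Before scheduling downtime for a memtest, it may be worth scanning the saved kernel log for hardware error reports. This is a rough sketch: the pattern list is an assumption about typical kernel wording, not an exhaustive or official list, and `hw_error_lines` is a hypothetical helper name. An empty result does not rule out a hardware fault.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only the lines from stdin that look like
# hardware error reports (machine checks, ECC/EDAC events, disk I/O errors).
# The patterns are assumptions; extend them for your hardware.
hw_error_lines() {
    grep -Ei 'mce:|machine check|edac|hardware error|i/o error'
}

# Example usage on the server after a reboot:
#   hw_error_lines < /var/log/kern.log
# If the systemd journal is stored persistently, the previous boot can be
# read directly:
#   journalctl -k -b -1 | hw_error_lines
```

Note that grep exits with status 1 when nothing matches, so a non-zero exit here simply means no matching lines were found.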
 
Since I host this instance in a datacenter, I contacted their support. Let's see what comes from this.

I am currently running the 5.13 kernel. Do you think an update to 5.15 would help?
I had to do this for my PVE hosts, since 5.13 plays badly with iSCSI boot.
 
Ray_ said: "I am currently running the 5.13 kernel. Do you think an update to 5.15 would help?"
Only if it is a kernel issue or driver problem (but then I'd expect something in the logs).
If it's a hardware issue, a newer kernel is unlikely to solve it.
 
The data center technician scanned the whole system including hard drives and found no faulty parts.
 
Ray_ said: "The data center technician scanned the whole system including hard drives and found no faulty parts."
Not to disregard the data center or the technician, but I have seen such weird errors with faulty PSUs that behaved normally most of the time, and I don't think there is a way to 'scan' a PSU for faults other than replacing it and testing again.

Another thing you could do is check whether the BIOS/firmware is up to date. Also, if the server has an iKVM/iDRAC/iLO etc., check whether there is anything in the logs there.
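
A hedged sketch of how those two checks could be done from the running system. It assumes `dmidecode` and `ipmitool` are installed (they are separate packages and may not be), that the commands are run as root, and that the board exposes its management controller to the OS via IPMI. `run_if_present` and `firmware_report` are hypothetical helper names. On iDRAC or iLO the same event log is usually also visible in the vendor's web interface.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a heading, then run the command if the tool
# exists; otherwise report that it is missing instead of failing.
run_if_present() {
    local label="$1"; shift
    printf '== %s ==\n' "$label"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || printf '(%s failed; root privileges or hardware support may be missing)\n' "$1"
    else
        printf '(%s is not installed)\n' "$1"
    fi
}

# Collect the installed BIOS version and the management controller's
# System Event Log in one report.
firmware_report() {
    run_if_present 'BIOS version'      dmidecode -s bios-version
    run_if_present 'BIOS release date' dmidecode -s bios-release-date
    run_if_present 'BMC event log'     ipmitool sel elist
}

# Run manually once the functions are loaded:
#   firmware_report
```

The BIOS version can then be compared by hand against the latest release on the board vendor's support page. Event log entries around the crash time, such as power or voltage events, would be the interesting ones.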
 
