[SOLVED] Backup fails - SEGFAULT tokio-runtime-w

this is getting stranger and stranger. I did another build with a little more debug output, same place:

Code:
80f2dc97936b4922246a7347f3d8cc44b63551ad6df4e8334f418129c5b9976b  proxmox-backup-server_0.9.1-1_amd64.deb
42c7aed01f3595cb37068b5658a169f847d98e0dd75065e7429abaacb6d73f5b  proxmox-backup-server-dbgsym_0.9.1-1_amd64.deb
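
For anyone following along, here is a minimal sketch for checking a downloaded package against SHA-256 hashes like the ones above before installing it. The helper names `sha256_of` and `matches` are made up for illustration and are not part of any Proxmox tooling.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks to keep memory low."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's digest with an expected hex string (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Calling `matches("proxmox-backup-server_0.9.1-1_amd64.deb", "80f2dc97...")` with the full hash from the listing should return `True` only if the download is intact.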

could you describe your PBS setup a bit more? is it running bare-metal, or in a VM?
 
Yes, it's getting weirder. I'm having trouble reproducing the issue with the new debug variant; the backup appears to work. I have started a bigger backup job now, so I'll see in a while if any of the VMs/CTs trigger the issue again.

My PBS host (an old HTPC I previously had lying unused), yes, it's bare metal:
  • 2 Core AMD E-350 Processor (integrated w/ motherboard)
  • 4 GB RAM (DDR3)
  • 1 Gbit/s Ethernet
  • 40 GB SSD for /
  • 8TB HDD for primary backup storage
  • 8TB HDD for secondary backup storage
  • 2TB HDD, unused
It's connected to my 3-node PVE cluster on the same backend network.
 
if you can't reproduce it anymore, I can do another build with similar output but less invasive changes, if that does reproduce it again we at least know where to look more closely.
 
also it would be interesting to know how much load you get during a backup.. e.g., is memory getting scarce?
 
Sigh... I had a power failure and now my PBS won't boot; it's either the motherboard or the PSU that didn't like losing power like that. It will take a while before I can get the system up and running again.

About the load, I can't look it up right now but I know the CPU was running at 100% for most operations. About RAM I'm not sure but I don't remember seeing it close to the limit, but that could just be that I checked at the wrong time.
 
I've got my machine working again (with a minor hardware upgrade while I was at it). I also updated PBS to 0.9-4. The backups are working again now and I have not been able to reproduce the issue anymore.

For my purposes I'm quite content at the moment, but if you wish to continue digging I may be able to give it a few more tries, if you need it.

In either case I'll mark the thread as solved.
 
if it does not show up again I'd chalk it up to hardware ;) out of curiosity - which components did you upgrade?
 
I switched out the motherboard, but since I didn't get the same one that I had previously and the CPU is integrated, I also got a new CPU (now a quad-core Celeron instead of the dual-core AMD E-350). Also, because of a mistake when reading the specs, I had to get a new DDR3 memory stick too, but with the same amount of RAM (4 GB): the new motherboard wanted SO-DIMM, while the old one had DIMM.

Just guessing, but I would think that my CPU was probably one of the weakest ones your PBS has been tested on, so I would imagine that the cause is a race condition or some other timing-dependent fault that only appears under very specific conditions. Things like this are usually affected by debug prints and other unrelated changes, which also matches what we saw previously.
 
Hi, I have the same issue with PBS 1.0-1 on a motherboard with an AMD E-350 APU (connected to PVE 6.3-2).

From /var/log/messages on PBS:

Apr 16 00:15:35 pveb kernel: [21282.006266] tokio-runtime-w[1354]: segfault at 5559439bf000 ip 00005559435f7c80 sp 00007fd97f40eb10 error 4 in proxmox-backup-proxy[555942f42000+796000]
Apr 16 00:15:35 pveb kernel: [21282.006291] Code: 00 00 00 01 00 00 48 8b 53 08 49 8d 40 ff 44 89 c6 83 e6 07 48 83 f8 07 72 71 48 89 f0 4c 29 c0 66 2e 0f 1f 84 00 00 00 00 00 <0f> b6 1a 48 31 fb 48 0f af d9 0f b6 7a 01 48 31 df 48 0f af f9 0f
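
As a reading aid for the kernel line above, here is a rough sketch that pulls the fields out of an x86 "segfault at" message, decodes the low bits of the page-fault error code, and computes the instruction pointer's offset into the mapped region. The function name `decode_segfault` is invented for this example. Whether the offset corresponds directly to an address in the debug symbols depends on how the binary is mapped, so treat it as a starting point only.

```python
import re

# Matches kernel lines of the form:
#   comm[pid]: segfault at ADDR ip IP sp SP error ERR in OBJECT[BASE+SIZE]
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def decode_segfault(line):
    """Return a dict describing the fault, or None if the line does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    err = int(m["err"], 16)
    info = {
        "comm": m["comm"],
        "pid": int(m["pid"]),
        "fault_addr": int(m["addr"], 16),
        "ip": ip,
        # x86 page-fault error code: bit 0 = present, bit 1 = write, bit 2 = user
        "cause": "protection violation" if err & 0x1 else "page not present",
        "access": "write" if err & 0x2 else "read",
        "mode": "user" if err & 0x4 else "kernel",
    }
    if m["obj"]:
        info["object"] = m["obj"]
        # Offset of the faulting instruction inside the mapped region
        info["offset"] = ip - int(m["base"], 16)
    return info
```

For the line posted above this yields a user-mode read of a page that is not present, at offset 0x6b5c80 into the proxmox-backup-proxy mapping.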

The CPU is always pegged at 100% within a few seconds of starting a backup job over the 1 Gbit/s network, and as a result the job dies too. After that, Linux (PBS) goes back to a normal state. Should I do the same and change my hardware? :)
 
@vadymati if you follow the debugging instructions from earlier in this thread, could you get us a backtrace if the crash is reproducible? thanks!
 
Hi. Sorry for the late response. I rolled back to v1 (and, for unrelated reasons, lost access to this platform for a few years) until today. After upgrading to the latest 2.x.x the situation hasn't changed, so I decided to try v3, and that didn't work either. I have attached a zip with process- and platform-specific logs. It would be great if you could take a look at them. Thanks.
 

Attachments: pvebv3.zip (41.5 KB)

@vadymati there is no indication of a segfault in your posted files - just a failure to parse a UPID, a potentially corrupted journal file, and a crash with an assert. I'd check your on-disk files (using debsums) and your memory, those things (especially combined) are an indication of something not working properly..
 
Hi, triple-checked (a bunch of tests from inside the OS and booting from a USB drive). Nothing. I think it could be a problem with some sort of codepage on this platform and Rust(?). systemctl status proxmox-backup* returns unreadable symbols in the date/time field, as you can see in the attached logs (pveb Sep u u: u: u pveb proxmox-backup-proxy[661]: server shutting down, waiting for active workers to complete). Later, after upgrading to the latest version (3.0.4), I started getting codepage errors on the other PVE hosts. Part of the log from the PBS host (2023-10-22T07:38:55.726642-04:00 pveb Oct �� ��:��:�� pveb proxmox-backup-proxy[680]: GET /api2/json/admin/datastore/WD01/status: 400 Bad Request: [client [::ffff:10.0.100.13]:50632] unable to read '"/run/proxmox-backup/active-operations/WD01"' - stream did not contain valid UTF-8). I also tried reinstalling different UTF-8 system locales on PBS, and that did not help either.
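
In case it helps someone hitting the same "stream did not contain valid UTF-8" message, here is a small sketch for finding where a file stops being valid UTF-8, for example the active-operations file named in the log. The helper names are hypothetical and this only inspects the bytes; it says nothing about why they were written that way.

```python
def first_invalid_utf8(data: bytes):
    """Return the byte offset of the first invalid UTF-8 sequence, or None if valid."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


def check_file(path):
    """Read a file as raw bytes and report the first invalid UTF-8 offset, if any."""
    with open(path, "rb") as fh:
        return first_invalid_utf8(fh.read())
```

A result of `None` means the file decodes cleanly; an integer is the offset worth looking at with a hex viewer.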

It's not relevant to me anymore because I've switched to Intel (which runs fine).
 