[SOLVED] Unable to start RAS daemon on PVE 8.2

PapaGigas

Member
Mar 18, 2023
Hi,

I'm having trouble starting the RAS daemon on PVE 8.2. Here’s the log output:

Code:
root@pve:~# systemctl status rasdaemon
× rasdaemon.service - RAS daemon to log the RAS events
     Loaded: loaded (/lib/systemd/system/rasdaemon.service; enabled; preset: enabled)
     Active: failed (Result: signal) since Sat 2024-08-03 13:35:18 WEST; 11s ago
   Duration: 160ms
    Process: 60258 ExecStart=/usr/sbin/rasdaemon -f -r (code=killed, signal=BUS)
    Process: 60259 ExecStartPost=/usr/sbin/rasdaemon --enable (code=exited, status=0/SUCCESS)
   Main PID: 60258 (code=killed, signal=BUS)
        CPU: 9.092s

Can anyone help, please?
 
Unfortunately I have no idea what a "rasdaemon" is. Does it come with PVE?

No, it doesn't come with Proxmox. Here’s what the RAS daemon is:

The rasdaemon program is a daemon which monitors the platform Reliability, Availability and Serviceability (RAS) reports from the Linux kernel trace events. These trace events are logged in /sys/kernel/debug/tracing, and the daemon reports them via syslog/journald.

I mainly use it to monitor ECC memory for any errors, which is why I’ve installed it on Proxmox! ;)
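
For anyone who wants to look at the ECC counters directly: when an EDAC driver for the memory controller is loaded, the kernel exposes per-controller error counts under /sys/devices/system/edac/mc. The following is only a minimal sketch and not part of rasdaemon; the function name `ecc_counts` and its optional directory argument are made up for illustration.

```shell
#!/bin/sh
# Sketch: print corrected/uncorrected ECC error counts per memory controller.
# The optional first argument overrides the EDAC sysfs directory
# (hypothetical convenience, handy for trying the function on a copy).
ecc_counts() {
    root="${1:-/sys/devices/system/edac/mc}"
    found=0
    for mc in "$root"/mc*; do
        # Skip the unexpanded glob when no controller directory exists.
        [ -d "$mc" ] || continue
        found=1
        ce=$(cat "$mc/ce_count" 2>/dev/null || echo "?")
        ue=$(cat "$mc/ue_count" 2>/dev/null || echo "?")
        echo "$(basename "$mc"): corrected=$ce uncorrected=$ue"
    done
    [ "$found" -eq 1 ] || echo "no EDAC memory controllers found"
}

ecc_counts
```

On a machine without ECC or without a loaded EDAC memory-controller driver, this should print the "no EDAC memory controllers found" line. On a system where the drivers are loaded, `ras-mc-ctl --error-count` and `ras-mc-ctl --summary` should report similar information.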
 
a daemon which monitors the platform Reliability
Out of curiosity I installed it on a test machine, on bare-metal hardware (Ryzen Threadripper). Without configuring anything I got:
Code:
~# systemctl  status rasdaemon.service  
● rasdaemon.service - RAS daemon to log the RAS events
     Loaded: loaded (/lib/systemd/system/rasdaemon.service; enabled; preset: enabled)
     Active: active (running) since Sun 2024-08-04 19:09:15 CEST; 6min ago
I had no particular expectations. Do I have to configure something? I ask because:
Code:
~# ras-mc-ctl --status
ras-mc-ctl: drivers not loaded.

It requires the hardware to support "something", right? Looks like "just" ECC, which this machine (a homelab box) does not have. A basic module is loaded:
Code:
~# lsmod | grep -i edac
edac_mce_amd           28672  0

Sorry, no useful help from me for you, and it's a tool not useful for me --> will "purge" it...
 
Sorry, no useful help from me for you

No problem. I've managed to get it running:

Code:
root@pve:~# systemctl status rasdaemon
● rasdaemon.service - RAS daemon to log the RAS events
     Loaded: loaded (/lib/systemd/system/rasdaemon.service; enabled; preset: enabled)
     Active: active (running) since Mon 2024-08-05 14:13:02 WEST; 2min 51s ago
    Process: 821798 ExecStartPost=/usr/sbin/rasdaemon --enable (code=exited, status=0/SUCCESS)
   Main PID: 821797 (rasdaemon)
      Tasks: 256 (limit: 154308)
     Memory: 37.0M
        CPU: 3.149s
     CGroup: /system.slice/rasdaemon.service
             └─821797 /usr/sbin/rasdaemon -f -r

But it failed again... this is what it shows in syslog:

Code:
Aug 05 20:01:17 pve kernel: traps: rasdaemon[88259] trap stack segment ip:784b8a8137f4 sp:784afdffea20 error:0 in libsqlite3.so.0.8.6[784b8a747000+f4000]
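
For readers trying to interpret the trap line: the bracketed part gives the start address and size of the libsqlite3 mapping, so subtracting the start from the instruction pointer gives the position of the fault within that mapping. A small sketch of the arithmetic, using only the numbers from the log line; `fault_offset` is a hypothetical helper name, and the result is relative to the mapped region named in the log, which is not necessarily the same as an offset into the library file.

```shell
#!/bin/sh
# Sketch: compute where inside a mapping a faulting address lies.
# $1 = instruction pointer, $2 = start address of the mapping (both hex).
fault_offset() {
    printf '0x%x\n' "$(( $1 - $2 ))"
}

# Values copied from the trap line:
#   ip:784b8a8137f4 ... libsqlite3.so.0.8.6[784b8a747000+f4000]
fault_offset 0x784b8a8137f4 0x784b8a747000
```

The result is 0xcc7f4, which is below the mapping size of 0xf4000, so the faulting instruction does lie inside the libsqlite3 region the kernel reported.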

I'm now running Memtest86+ to check whether faulty RAM is the cause.

Thanks anyway! ;)
 
After running Memtest86+ with no errors detected, I noticed that x2APIC was enabled in the BIOS. I changed it to xAPIC, and now it's working fine. I'll mark this thread as solved in case anyone else encounters this issue. ;)
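
To confirm which APIC mode the kernel ended up in after changing the BIOS setting, one option is to search the kernel log for x2APIC messages. This is a rough sketch, assuming the kernel prints a line containing both "x2apic" and "enabled" when that mode is active (the exact wording can differ between kernel versions); `apic_mode` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: classify kernel log text read from stdin.
# Assumption: an active x2APIC mode shows up as a line matching
# "x2apic ... enabled" (case-insensitive); wording varies by kernel version.
apic_mode() {
    if grep -qi 'x2apic.*enabled'; then
        echo "x2apic"
    else
        echo "no x2apic message found"
    fi
}

# Usage on a live system (reading the kernel log may require root):
#   dmesg | apic_mode
```

A "no x2apic message found" result only means no matching line was in the log that was read; it is not proof that the system runs in xAPIC mode, for example if the boot messages have already rotated out of the ring buffer.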
 