pveproxy not starting

LittleI

Jun 17, 2024
Hi All,

I decided I should be more conscientious about monitoring the RAID on my home server,
so I installed storcli and MegaRAID Storage Manager using the instructions here: https://forums.servethehome.com/ind...aid-storage-manager-on-debian-10-omv-5.27676/

(storcli was already installed, and the server had been rebooted since.)

I also did
Code:
update-rc.d vivaldiframeworkd defaults

However, on reboot the Proxmox GUI has stopped working.

So I have tried reverting my changes:
Code:
service vivaldiframeworkd stop
update-rc.d vivaldiframeworkd remove
dpkg -r megaraid-storage-manager

I have rebooted and still have no access to the GUI.

pveproxy start error:
Code:
× pveproxy.service - PVE API Proxy Server
     Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Mon 2024-06-17 14:09:39 BST; 25s ago
    Process: 1483 ExecStartPre=/usr/bin/pvecm updatecerts --silent (code=exited, status=0/SUCCESS)
    Process: 1485 ExecStart=/usr/bin/pveproxy start (code=exited, status=255/EXCEPTION)
        CPU: 1.018s

Jun 17 14:09:39 pve systemd[1]: pveproxy.service: Scheduled restart job, restart counter is at 5.
Jun 17 14:09:39 pve systemd[1]: Stopped pveproxy.service - PVE API Proxy Server.
Jun 17 14:09:39 pve systemd[1]: pveproxy.service: Consumed 1.018s CPU time.
Jun 17 14:09:39 pve systemd[1]: pveproxy.service: Start request repeated too quickly.
Jun 17 14:09:39 pve systemd[1]: pveproxy.service: Failed with result 'exit-code'.
Jun 17 14:09:39 pve systemd[1]: Failed to start pveproxy.service - PVE API Proxy Server.

Journalctl shows this:
Code:
Jun 17 14:09:36 pve systemd[1]: Starting pveproxy.service - PVE API Proxy Server...
Jun 17 14:09:37 pve pveproxy[1424]: start failed - Unrecognised protocol tcp at /usr/share/perl5/PVE/Daemon.pm line 833.
Jun 17 14:09:37 pve pveproxy[1424]: start failed - Unrecognised protocol tcp at /usr/share/perl5/PVE/Daemon.pm line 833.

pvedaemon is not showing any errors.

Any pointers or help as to what to do next would be most appreciated :-)
 
I'd probably try uninstalling and purging the stuff that seems to have broken it, and if that works then you're probably in an ok position to experiment a bit more carefully. :)
 
"So I have tried reverting my changes:"
Ahhh. That bit just sank in.

I'd guess that when you installed the megaraid-storage-manager package it (or a dependency) probably installed some config files or similar.

If that's the case, then doing an apt purge PACKAGE_NAME can sometimes fix the problem, as it'll (generally) remove any leftover config files from the given package.
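
As a rough sketch of the purge idea: dpkg marks a package that was removed but still has config files on disk with the status "rc" in the first column of dpkg -l. The helper name list_residual_config below is made up for illustration; it only filters text, so it changes nothing on the system.

```shell
# Print the names of packages in dpkg's "rc" state
# (removed, but config files remain on disk).
# Reads `dpkg -l`-style text on stdin, so it can be tried on saved output.
list_residual_config() {
    awk '$1 == "rc" { print $2 }'
}

# Typical use on the affected host (left as comments so nothing runs by accident):
#   dpkg -l | list_residual_config
#   apt purge megaraid-storage-manager
```

If megaraid-storage-manager shows up in that list after the dpkg -r from the first post, a purge would be the next thing to try.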
 
Based on a similar symptom I saw not long ago, you probably have an issue with local hostname resolution.
Thanks for the pointer, it appears I do have a DNS issue:

Code:
ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=117 time=7.73 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=117 time=7.89 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=117 time=7.86 ms

ping google.com
ping: google.com: Temporary failure in name resolution

nslookup google.com
Server:         8.8.8.8
Address:        8.8.8.8#53

Non-authoritative answer:
Name:   google.com
Address: 142.250.200.14
Name:   google.com
Address: 2a00:1450:4009:81f::200e

nslookup localhost
Server:         8.8.8.8
Address:        8.8.8.8#53

** server can't find localhost: NXDOMAIN

cat /etc/resolv.conf
search lan
nameserver 8.8.8.8
nameserver 8.8.4.4
nameserver 1.1.1.1

I'm not sure where else to look?
 
Run the following commands and examine their output:
cat /etc/hosts
cat /etc/resolv.conf
cat /etc/hostname
hostnamectl
ping localhost
ping $HOSTNAME
cat /etc/systemd/resolved.conf
systemctl status systemd-resolved
resolvectl status

possibly
sudo systemctl restart systemd-resolved

Continue monitoring "journalctl".
Possibly restart your network service and/or reboot. It's unclear how a RAID software package could have messed up your networking. Perhaps you changed something a while back and the change only took effect recently, after a service restart.
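
One thing worth keeping in mind with the checks above: nslookup talks to the DNS server directly, while ping resolves names through the system resolver (NSS), which normally consults /etc/hosts first. That is why nslookup can succeed while ping fails. Below is a minimal sketch for checking whether a name is present in a hosts-format file; the function name hosts_has_entry is hypothetical.

```shell
# Return 0 if the name in $2 appears as a hostname or alias
# in the hosts-format file $1; return 1 otherwise.
# Commented-out lines and trailing comments are ignored.
hosts_has_entry() {
    awk -v name="$2" '
        { sub(/#.*/, "") }                                   # strip comments
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit !found }
    ' "$1"
}

# Example use on the host (left as comments):
#   hosts_has_entry /etc/hosts localhost && echo "localhost is listed"
#   hosts_has_entry /etc/hosts "$(hostname)" || echo "hostname missing from /etc/hosts"
#   getent hosts localhost    # resolves through NSS, the same path ping uses
```

If the entries are present but getent hosts still fails for a non-root user, the problem is more likely access to the files than their contents.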

Good luck


 
Thanks for the pointers, I note the following errors:

Code:
hostnamectl
Failed to query system properties: The name org.freedesktop.hostname1 was not provided by any .service files

ping localhost
ping: localhost: Temporary failure in name resolution

ping $HOSTNAME
ping: pve: Temporary failure in name resolution

cat /etc/systemd/resolved.conf
cat: /etc/systemd/resolved.conf: No such file or directory

sudo systemctl restart systemd-resolved
Failed to restart systemd-resolved.service: Unit systemd-resolved.service not found.

resolvectl status
-bash: resolvectl: command not found
 
"The first three issues need to be addressed to get your system up and running. Since you only provided the errors, I can't offer more specific guidance at this time."

I only listed the errors in case you spotted a pattern and could point me at where to look :)

What I have found is that D-Bus has a permission error:

Code:
D-Bus System Message Bus
     Loaded: loaded (/lib/systemd/system/dbus.service; static)
     Active: active (running) since Mon 2024-06-17 18:16:22 BST; 2s ago
TriggeredBy: ● dbus.socket
       Docs: man:dbus-daemon(1)
   Main PID: 76099 (dbus-daemon)
      Tasks: 1 (limit: 115665)
     Memory: 660.0K
        CPU: 6ms
     CGroup: /system.slice/dbus.service
             └─76099 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only


Jun 17 18:16:22 pve systemd[1]: Starting dbus.service - D-Bus System Message Bus...
Jun 17 18:16:22 pve dbus-daemon[76099]: [system] AppArmor D-Bus mediation is enabled
Jun 17 18:16:22 pve dbus-daemon[76099]: Cannot setup inotify for '/usr/share/dbus-1/system.d'; error 'Permission denied'
Jun 17 18:16:22 pve systemd[1]: Started dbus.service - D-Bus System Message Bus.
root@pve:~#

Code:
drwxr-xr-x 2 root root 4096 May 27 13:46 system.d

The permissions look OK to me, but I changed the directory to 0777 as a test and restarted the dbus service, with no luck.

I also stopped AppArmor and retried.

Any suggestions as to what else could cause the D-Bus permission error?
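
The listing above covers only the final directory. A "Permission denied" can also come from any parent directory that lacks the execute (search) bit for other users, which matters for daemons that do not run as root. Here is a minimal sketch that walks up from a path and reports such directories. It assumes GNU stat, as shipped with Debian, and the function name untraversable_ancestors is made up.

```shell
# Print each directory on the way up from the given path (including the
# path itself, if it is a directory) that lacks the "others execute" bit,
# i.e. cannot be traversed by users other than the owner and group.
untraversable_ancestors() {
    p=$1
    while [ -n "$p" ] && [ "$p" != "/" ] && [ "$p" != "." ]; do
        if [ -d "$p" ]; then
            mode=$(stat -c '%a' "$p")
            # The last octal digit is the "others" permission; odd means x is set.
            case $mode in
                *[1357]) ;;
                *) echo "$p $mode" ;;
            esac
        fi
        p=$(dirname "$p")
    done
}

# Example use on the host (left as a comment):
#   untraversable_ancestors /usr/share/dbus-1/system.d
```

Any line of output names a directory worth comparing against a healthy system; namei -l on the same path shows similar information.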
 
Good Afternoon,

Did you ever resolve this issue? I went through almost the same series of events, and I am experiencing the exact same problem you described.

Thank you
 
@littlel @PeterLHess I've just hit this problem myself, exactly the same issue after installing the MegaRAID tools as per that first post, and I've been able to fix it.

Initially I needed the tools to upgrade the ServeRAID M5225 firmware, because of a known bug that prevents RAID-0 disks from being re-added if they fall offline. As a separate issue, the command line tools didn't say that the firmware couldn't be updated because cached data was still pending write-back to some of the disks; only the MSM tool reported this (and let me flush it), after which I could successfully apply the firmware.

Anyway, firmware problem fixed, I noticed the Proxmox GUI stopped working much like yourselves in this post, and from following some of the other advice here (and in other posts) found that DNS resolution was failing. I could "nslookup www.google.co.uk" for example, but couldn't ping the hostname.

/etc/nsswitch.conf and /etc/resolv.conf both looked OK. I also debugged some strange dbus "permission denied" errors. No luck.

I stumbled across another post (https://forum.proxmox.com/threads/pveproxy-failed-to-get-address-info.33398/) where a friendly peep @joerg picked up on something around incorrect directory permissions, and this ended up ultimately fixing my problem.

After removing the MSM tools (you may be able to let this coexist, but by this time I had enough of it :D) I noticed the following directories had modified permissions:

/etc
/usr
/usr/lib

All were owned by root:root but were 0744 (or drwxr--r--) and I had to do the following to fix:

chmod 0755 /{etc,usr,usr/lib}

After that, I ran systemctl restart pveproxy, followed by systemctl status pveproxy to confirm it was all OK, and I was then able to reconnect to the GUI on https://x.x.x.x:8006.
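
A plausible explanation for why this works (an assumption, not something confirmed in this thread): pveproxy runs as the unprivileged www-data user, and with /etc at 0744 such a user cannot open files like /etc/hosts or /etc/protocols, which would fit both the name resolution failures and the "Unrecognised protocol tcp" error. Below is a minimal sketch for checking the modes before and after the chmod. It assumes GNU stat, and the function name report_wrong_modes is hypothetical.

```shell
# Print each given directory whose permission bits differ from the
# expected octal mode passed as the first argument.
report_wrong_modes() {
    expected=$1
    shift
    for d in "$@"; do
        actual=$(stat -c '%a' "$d") || continue   # skip paths that cannot be read
        [ "$actual" = "$expected" ] || echo "$d: $actual (expected $expected)"
    done
}

# Example use on the host (left as a comment):
#   report_wrong_modes 755 /etc /usr /usr/lib
```

No output means all listed directories already have the expected mode.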

Hope that helps you fine peeps!

Andy
 