Can't load RADOS for module PVE::RADOS, GUI and CLI tools fail

CheesyPete

Hello, my Proxmox server has been running happily for a few months since the last reboot. I tried to start a new container and got this message:
Code:
Can't load '/usr/lib/x86_64-linux-gnu/perl5/5.36/auto/PVE/RADOS/RADOS.so' for module PVE::RADOS: libboost_iostreams.so.1.74.0: cannot open shared object file: No such file or directory at /usr/lib/x86_64-linux-gnu/perl-base/DynaLoader.pm line 201, <DATA> line 960.
 at /usr/share/perl5/PVE/Storage/RBDPlugin.pm line 16.
Compilation failed in require at /usr/share/perl5/PVE/Storage/RBDPlugin.pm line 16, <DATA> line 960.
BEGIN failed--compilation aborted at /usr/share/perl5/PVE/Storage/RBDPlugin.pm line 16, <DATA> line 960.
Compilation failed in require at /usr/share/perl5/PVE/Storage.pm line 34, <DATA> line 960.
BEGIN failed--compilation aborted at /usr/share/perl5/PVE/Storage.pm line 34, <DATA> line 960.
Compilation failed in require at /usr/share/perl5/PVE/GuestHelpers.pm line 8, <DATA> line 960.
BEGIN failed--compilation aborted at /usr/share/perl5/PVE/GuestHelpers.pm line 8, <DATA> line 960.
Compilation failed in require at /usr/share/perl5/PVE/CLI/pct.pm line 14, <DATA> line 960.
BEGIN failed--compilation aborted at /usr/share/perl5/PVE/CLI/pct.pm line 14, <DATA> line 960.
Compilation failed in require at /usr/sbin/pct line 6, <DATA> line 960.
BEGIN failed--compilation aborted at /usr/sbin/pct line 6, <DATA> line 960.
After a restart, nothing Proxmox-related works. The server boots fine and I can SSH into it, but there is no GUI. Restarting pveproxy.service and pvedaemon.service, or even running pct list, gives the same error, and journalctl -b -u pvedaemon.service is littered with it as well.

I have tried:
  • Updating packages
  • Reinstalling librados2
  • Reinstalling qemu-server
  • Reinstalling pve-ha-manager
  • Removing all zfs pools
  • Disabling my NFS exports
Version:
Code:
root@pve:~# pveversion
pve-manager/8.3.2/3e76eec21c4a14a7 (running kernel: 6.8.12-5-pve)

Could something have been updated while the server was running and broken it? I only noticed the error when I tried to start a container, so the problem may have existed for a while.
Any help is much appreciated. Let me know if more information is needed or if there is anything I should try.
 
Hi,
please check that the file from the error message exists:
Code:
No such file or directory at /usr/lib/x86_64-linux-gnu/perl-base/DynaLoader.pm
If not, try reinstalling the perl-base package. I'd also install the debsums package and run debsums -c to check that files from installed packages are correct.
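For what it's worth, the loader line in the first post names libboost_iostreams.so.1.74.0 as the object that cannot be opened. A minimal sketch for listing every shared library that RADOS.so fails to resolve follows. `list_missing_libs` is a hypothetical helper name, and the `dpkg -S` lookup will only help if dpkg's own file lists are still intact.

```shell
# Print the names of shared libraries that ldd reports as "not found".
# Reads ldd output on stdin so the parsing can also be tried on saved output.
list_missing_libs() {
    awk '/=> not found/ { print $1 }' | sort -u
}

# Path taken from the error message in the first post.
so=/usr/lib/x86_64-linux-gnu/perl5/5.36/auto/PVE/RADOS/RADOS.so
if [ -e "$so" ]; then
    ldd "$so" | list_missing_libs
fi

# To see which installed package is supposed to ship a missing library:
#   dpkg -S libboost_iostreams.so.1.74.0
```

An empty result would suggest the shared libraries resolve and the problem lies elsewhere.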
 
Hi, thank you for responding. I do have a file at /usr/lib/x86_64-linux-gnu/perl-base/DynaLoader.pm. I have run debsums -c and the output I get is:
Code:
root@pve:~# debsums -c
Can't locate Dpkg/Conf.pm in @INC (you may need to install the Dpkg::Conf module) (@INC contains: /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.36.0 /usr/local/share/perl/5.36.0 /usr/lib/x86_64-linux-gnu/perl5/5.36 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl-base /usr/lib/x86_64-linux-gnu/perl/5.36 /usr/share/perl/5.36 /usr/local/lib/site_perl) at /usr/bin/debsums line 22.
BEGIN failed--compilation aborted at /usr/bin/debsums line 22.

My pveversion -v output is below:
Code:
root@pve:~# pveversion -v
proxmox-ve: 8.3.0 (running kernel: 6.8.12-5-pve)
pve-manager: 8.3.2 (running version: 8.3.2/3e76eec21c4a14a7)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.8: 6.8.12-5
proxmox-kernel-6.8.12-5-pve-signed: 6.8.12-5
proxmox-kernel-6.8.12-2-pve-signed: 6.8.12-2
pve-kernel-5.4: 6.4-13
pve-kernel-5.4.166-1-pve: 5.4.166-1
pve-kernel-5.4.73-1-pve: 5.4.73-1
ceph-fuse: 16.2.15+ds-0+deb12u1
corosync: 3.1.7-pve3
criu: 3.17.1-2+deb12u1
glusterfs-client: 10.3-5
ifupdown: residual config
ifupdown2: 3.2.0-1+pmx11
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.1
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.4
libpve-access-control: 8.2.0
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.10
libpve-cluster-perl: 8.0.10
libpve-common-perl: 8.2.9
libpve-guest-common-perl: 5.1.6
libpve-http-server-perl: 5.1.2
libpve-network-perl: 0.10.0
libpve-rs-perl: 0.9.1
libpve-storage-perl: 8.3.3
libqb0: 1.0.5-1
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.5.0-1
proxmox-backup-client: 3.3.2-1
proxmox-backup-file-restore: 3.3.2-2
proxmox-firewall: 0.6.0
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.3.1
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.7
proxmox-widget-toolkit: 4.3.3
pve-cluster: 8.0.10
pve-container: 5.2.3
pve-docs: 8.3.1
pve-edk2-firmware: 4.2023.08-4
pve-esxi-import-tools: 0.7.2
pve-firewall: 5.1.0
pve-firmware: 3.14-2
pve-ha-manager: 4.0.6
pve-i18n: 3.3.2
pve-qemu-kvm: 9.0.2-4
pve-xtermjs: 5.3.0-3
qemu-server: 8.3.3
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.6-pve1

I have been backing up my containers using the built-in Proxmox tools, so I can reinstall if needed.
 
What do the following show:
cat /etc/apt/sources.list
cat /etc/apt/sources.list.d/ceph.list
 
Sounds like a more general issue with your Perl installation (the debsums command uses Perl too). Please try reinstalling the Perl packages. Most of what debsums needs should be covered by apt install --reinstall perl perl-base perl-modules libdpkg-perl libfile-fnmatch-perl. Then see whether you can run debsums afterwards.

I'd also check the system logs/journal for disk/filesystem-related errors and check the physical disk health with e.g. smartctl.
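A rough sketch of that check, assuming the kernel messages of interest contain one of a few common phrases (the pattern list is an illustrative guess, not exhaustive) and that /dev/sda stands in for the real device:

```shell
# Keep only log lines that look like disk or filesystem trouble.
# The patterns below are illustrative assumptions; extend them as needed.
filter_disk_errors() {
    grep -Ei 'i/o error|medium error|ext4-fs error|uncorrect'
}

# Typical use on the affected host (run as root):
#   journalctl -b -k --no-pager | filter_disk_errors
#   smartctl -H /dev/sda        # overall health verdict
#   smartctl -a /dev/sda        # full attribute and error log
```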
 
Code:
root@pve:~# cat /etc/apt/sources.list
deb http://deb.debian.org/debian bookworm main contrib
deb http://deb.debian.org/debian bookworm-updates main contrib
deb http://security.debian.org/debian-security bookworm-security main contrib
root@pve:~# cat /etc/apt/sources.list.d/ceph.list
# deb https://enterprise.proxmox.com/debian/ceph-quincy bookworm enterprise
# deb http://download.proxmox.com/debian/ceph-quincy bookworm no-subscription
# deb https://enterprise.proxmox.com/debian/ceph-reef bookworm enterprise
# deb http://download.proxmox.com/debian/ceph-reef bookworm no-subscription

My root pool is a ZFS mirror on 2 SSDs. zpool status shows no errors, there are no scrub failures, and the SMART test came out fine. I tried removing one drive to force booting from the other, with the same result.

However after reinstalling those packages I get:
Code:
root@pve:~# debsums -c
debsums: missing file /usr/share/aclocal/autoopts.m4 (from autogen package)
debsums: missing file /usr/bin/x86_64-linux-gnu-cpp-10 (from cpp-10 package)
debsums: missing file /usr/lib/gcc/x86_64-linux-gnu/10/cc1 (from cpp-10 package)
debsums: missing file /usr/share/lintian/overrides/cpp-10 (from cpp-10 package)
debsums: missing file /bin/fusermount3 (from fuse3 package)
debsums: missing file /sbin/mount.fuse3 (from fuse3 package)
debsums: missing file /usr/share/doc/fuse3/changelog.Debian.gz (from fuse3 package)
debsums: missing file /usr/share/doc/fuse3/changelog.gz (from fuse3 package)
debsums: missing file /usr/share/doc/fuse3/copyright (from fuse3 package)
/usr/share/initramfs-tools/hooks/fuse
debsums: missing file /usr/share/man/man1/fusermount3.1.gz (from fuse3 package)
debsums: missing file /usr/share/man/man8/mount.fuse3.8.gz (from fuse3 package)
debsums: missing file /usr/share/lintian/overrides/g++ (from g++ package)
debsums: missing file /usr/bin/x86_64-linux-gnu-g++-12 (from g++-12 package)
debsums: missing file /usr/lib/gcc/x86_64-linux-gnu/12/cc1plus (from g++-12 package)
debsums: missing file /usr/lib/gcc/x86_64-linux-gnu/12/g++-mapper-server (from g++-12 package)
debsums: missing file /usr/share/doc/gcc-12-base/C++/README.C++ (from g++-12 package)
debsums: missing file /usr/share/doc/gcc-12-base/C++/changelog.gz (from g++-12 package)
debsums: missing file /usr/share/lintian/overrides/g++-12 (from g++-12 package)
debsums: missing file /usr/bin/c89-gcc (from gcc package)
debsums: missing file /usr/bin/c99-gcc (from gcc package)
This goes on for a good while with more packages too.

I am thinking this is now unfixable?
My container backups are on a separate zpool that seems fine. Is there anything I need to do before I reinstall Proxmox and try to restore those containers?
 
I am thinking this is now unfixable?
You could attempt to reinstall all the packages that have missing files. The question is what led to the removal of these files in the first place. Maybe a half-aborted or misplaced remove command? Shell history and system logs could give a hint if you are lucky.
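One way to attempt that, sketched under the assumption that every missing-file line carries the "(from <package> package)" suffix seen in the debsums output. `packages_with_missing_files` is a hypothetical helper, and the file under /root is a placeholder. Lines that list only a path (changed rather than missing files) are ignored here.

```shell
# Extract the unique package names from `debsums -c` "missing file" lines.
packages_with_missing_files() {
    sed -n 's/^debsums: missing file .* (from \(.*\) package)$/\1/p' | LC_ALL=C sort -u
}

# Typical use on the affected host (run as root):
#   debsums -c 2>&1 | packages_with_missing_files > /root/broken-packages.txt
#   apt install --reinstall $(cat /root/broken-packages.txt)
#
# Recent package removals, if any, are usually recorded in the apt log:
#   grep -B3 '^Remove:' /var/log/apt/history.log
```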
My container backups are on a separate zpool that seems fine, is there anything I need to do before I reinstall proxmox and try to restore those containers?
It is best to save any configuration from /etc and /etc/pve that you might still need, along with anything else important on the root disk.
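A minimal sketch of such a backup, assuming /etc/pve is still mounted (it is a FUSE view provided by the pve-cluster service). The helper name and the destination path are hypothetical. The second argument exists only so the function can be tried against a scratch directory.

```shell
# Archive <root>/etc (which includes /etc/pve when it is mounted) into a
# gzip-compressed tarball. <root> defaults to /.
backup_etc() {
    archive=$1
    root=${2:-/}
    tar -czf "$archive" -C "$root" etc
}

# Typical use on the affected host (run as root); the target path is an
# assumption and should live on a pool that survives the reinstall:
#   backup_etc /backuppool/pve-etc-backup.tar.gz
#   tar -tzf /backuppool/pve-etc-backup.tar.gz | grep etc/pve/
```

Listing the archive afterwards is a cheap way to confirm that the files under etc/pve were actually captured.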
 
Hi, thanks again to both of you for your help. I tried the full update/upgrade, but sadly it did not help, and I could not find anything in the logs.
I have just reinstalled and set up the server again and it all works fine. Thanks for the tip about carrying over /etc.
I also found it useful to run
Code:
zfs create rpool/data/subvol-xxx-disk-x
for all container disks stored on my rpool so I could re-create them from backups.
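As a sketch, that step can be looped over the container IDs. The IDs (100, 101), the disk index 0, and the helper name are placeholders, not values from this thread. The function only prints the commands so they can be reviewed before being run.

```shell
# Print one `zfs create` command per container ID, following the
# rpool/data/subvol-<id>-disk-<n> naming used above (disk index 0 assumed).
print_zfs_create_cmds() {
    for id in "$@"; do
        printf 'zfs create rpool/data/subvol-%s-disk-0\n' "$id"
    done
}

# Review the output first, then pipe it to sh to execute:
#   print_zfs_create_cmds 100 101
#   print_zfs_create_cmds 100 101 | sh
```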

I still have no idea what caused the error in the first place, sorry if this is of little use to anyone who experiences this after me.