I need help getting my server working properly again; I'm not sure what caused the issue.
I think it is related to the steps I followed while trying to set up GPU passthrough to a VM. I had trouble getting it to work, so I tried to bind vfio-pci by device ID, and then tried to load vfio-pci early using the mkinitcpio tool.
UPDATE:
The degraded systemctl status has been fixed. The problem was that the /etc/hosts file contained a different IP address instead of the correct one; I'm not sure how it got changed. I also had no free disk space on Proxmox. I was able to free some space using these commands:
How to clean disk space
fstrim / -v
How to free space from journal logs
sudo journalctl --vacuum-time=3h
https://linuxhandbook.com/clear-systemd-journal-logs/
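A quick way to check for the two causes mentioned above might look like the following sketch. It assumes a POSIX shell with awk; `hosts_ip_for` is a hypothetical helper name, not a Proxmox tool. It prints the address that /etc/hosts records for the machine's own host name so it can be compared with the addresses actually configured, then shows how full the root filesystem is.

```shell
#!/bin/sh
# Hypothetical helper: print the first address that a hosts-format file
# maps to a given name.
# Arguments: $1 = host name, $2 = hosts file (defaults to /etc/hosts).
hosts_ip_for() {
    awk -v name="$1" '
        /^[ \t]*#/ { next }                  # skip comment lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break         # ignore trailing comments
                if ($i == name) { print $1; exit }
            }
        }
    ' "${2:-/etc/hosts}"
}

# Address recorded in /etc/hosts for this machine's own host name.
node=$(uname -n)
echo "hosts entry for $node: $(hosts_ip_for "$node")"

# Addresses actually configured on the interfaces, for comparison.
if command -v ip >/dev/null 2>&1; then
    ip -4 -brief addr show || true
fi

# How full the root filesystem is, and how much space the journal occupies.
df -h /
if command -v journalctl >/dev/null 2>&1; then
    journalctl --disk-usage || true
fi
```

If the address printed for the host name does not match any configured interface address, the /etc/hosts entry is the first thing to correct.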
I was following this tutorial for GPU Passthrough:
PCI passthrough via OVMF - ArchWiki (archlinux.org)
https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF
I was trying to regenerate the initramfs using mkinitcpio:
Code:
# mkinitcpio -P
More details on:
https://wiki.archlinux.org/title/Mkinitcpio
I had these errors when I executed that command:
Code:
==> Building image from preset: mkinitcpio.d/example.preset: 'default'
-> -k /boot/vmlinuz-linux -c /etc/mkinitcpio.conf -g /tmp/initramfs-linux.img -U /efi/EFI/Linux/arch-linux.efi --microcode /boot/*-ucode.img
==> ERROR: Unable to write to path: `/efi/EFI/Linux/arch-linux.efi'
==> Building image from preset: mkinitcpio.d/example.preset: 'fallback'
-> -k /boot/vmlinuz-linux -c /etc/mkinitcpio.conf -g /tmp/initramfs-linux-fallback.img -S autodetect -U /efi/EFI/Linux/arch-linux-fallback.efi --microcode /boot/*-ucode.img
==> ERROR: Unable to write to path: `/efi/EFI/Linux/arch-linux-fallback.efi'
==> Building image from preset: mkinitcpio.d/hook.preset: 'default'
functions: line 201: printf: `P': invalid format character
-> -k /boot/vmlinuz-==> ERROR: specified kernel image does not exist: `/boot/vmlinuz-%PKGBASE%'
==> Building image from preset: mkinitcpio.d/hook.preset: 'fallback'
functions: line 201: printf: `P': invalid format character
-> -k /boot/vmlinuz-==> ERROR: specified kernel image does not exist: `/boot/vmlinuz-%PKGBASE%'
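Note that mkinitcpio is the Arch Linux initramfs generator, whereas Proxmox VE is built on Debian, where the initramfs is normally rebuilt with update-initramfs from initramfs-tools. A minimal sketch of loading the vfio modules early the Debian way could look like this. The helper name `ensure_modules` is hypothetical, and the module list is an assumption that depends on the kernel version (older kernels also list vfio_virqfd), so it should be checked against the Proxmox PCI passthrough documentation before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: append each module name to the given file unless
# that exact line is already present, so repeated runs add no duplicates.
# Arguments: $1 = modules file, remaining arguments = module names.
ensure_modules() {
    file=$1
    shift
    touch "$file"
    for mod in "$@"; do
        grep -qxF -- "$mod" "$file" || echo "$mod" >> "$file"
    done
}

# On the real host this would be run as root, followed by an initramfs
# rebuild for all installed kernels (left commented out on purpose):
#   ensure_modules /etc/modules vfio vfio_iommu_type1 vfio_pci
#   update-initramfs -u -k all
```

Keeping the edit idempotent means the helper can be re-run after a failed attempt without leaving duplicate lines in /etc/modules.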
After that, I guess it messed up my Proxmox server.
My server had been working fine; I even installed a VM and tried GPU passthrough, and it was working. After I restarted the server, the web GUI stopped working. I connected to the server over SSH, and systemctl status shows the following:
Code:
root@pxserver:/# systemctl status -l
● pxserver
State: degraded
Jobs: 0 queued
Failed: 9 units
Since: Sun 2022-02-27 23:46:34 -04; 31min ago
Here are the failed units:
Code:
root@pxserver:/# systemctl --failed
UNIT LOAD ACTIVE SUB DESCRIPTION
● lightdm.service loaded failed failed Light Display Manager
● man-db.service loaded failed failed Daily man-db regeneration
● pve-cluster.service loaded failed failed The Proxmox VE cluster filesystem
● pve-firewall.service loaded failed failed Proxmox VE firewall
● pve-guests.service loaded failed failed PVE guests
● pve-ha-crm.service loaded failed failed PVE Cluster HA Resource Manager Daemon
● pve-ha-lrm.service loaded failed failed PVE Local HA Resource Manager Daemon
● pvescheduler.service loaded failed failed Proxmox VE scheduler
● pvestatd.service loaded failed failed PVE Status Daemon
LOAD = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB = The low-level unit activation state, values depend on unit type.
9 loaded units listed.
When I run apt update, it shows:
Code:
root@pxserver:~# apt update
Hit:1 http://deb.debian.org/debian bullseye InRelease
Err:1 http://deb.debian.org/debian bullseye InRelease
Splitting up /var/lib/apt/lists/deb.debian.org_debian_dists_bullseye_InRelease into data and signature failed
Hit:2 http://ftp.debian.org/debian bullseye InRelease
Err:2 http://ftp.debian.org/debian bullseye InRelease
Splitting up /var/lib/apt/lists/ftp.debian.org_debian_dists_bullseye_InRelease into data and signature failed
Get:3 http://ftp.debian.org/debian bullseye-updates InRelease [39.4 kB]
Err:3 http://ftp.debian.org/debian bullseye-updates InRelease
Error writing to file - write (28: No space left on device) [IP: 151.101.218.132 80]
Ign:4 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 InRelease
Get:5 http://deb.debian.org/debian bullseye-updates InRelease [39.4 kB]
Err:5 http://deb.debian.org/debian bullseye-updates InRelease
Error writing to file - write (28: No space left on device) [IP: 151.101.218.132 80]
Hit:6 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 Release
Hit:7 http://download.proxmox.com/debian/pve bullseye InRelease
Err:8 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 Release.gpg
At least one invalid signature was encountered.
Err:7 http://download.proxmox.com/debian/pve bullseye InRelease
Splitting up /var/lib/apt/lists/download.proxmox.com_debian_pve_dists_bullseye_InRelease into data and signature failed
Ign:9 https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64 InRelease
Hit:10 https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64 Release
Err:11 https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64 Release.gpg
At least one invalid signature was encountered.
Hit:12 https://apt.iteas.at/iteas buster InRelease
Err:12 https://apt.iteas.at/iteas buster InRelease
Splitting up /var/lib/apt/lists/apt.iteas.at_iteas_dists_buster_InRelease into data and signature failed
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
All packages are up to date.
W: An error occurred during the signature verification. The repository is not updated and the previous index files will be used. GPG error: http://deb.debian.org/debian bullseye InRelease: Splitting up /var/lib/apt/lists/deb.debian.org_debian_dists_bullseye_InRelease into data and signature failed
W: An error occurred during the signature verification. The repository is not updated and the previous index files will be used. GPG error: http://ftp.debian.org/debian bullseye InRelease: Splitting up /var/lib/apt/lists/ftp.debian.org_debian_dists_bullseye_InRelease into data and signature failed
W: An error occurred during the signature verification. The repository is not updated and the previous index files will be used. GPG error: http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 Release: At least one invalid signature was encountered.
W: An error occurred during the signature verification. The repository is not updated and the previous index files will be used. GPG error: http://download.proxmox.com/debian/pve bullseye InRelease: Splitting up /var/lib/apt/lists/download.proxmox.com_debian_pve_dists_bullseye_InRelease into data and signature failed
W: An error occurred during the signature verification. The repository is not updated and the previous index files will be used. GPG error: https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64 Release: At least one invalid signature was encountered.
W: An error occurred during the signature verification. The repository is not updated and the previous index files will be used. GPG error: https://apt.iteas.at/iteas buster InRelease: Splitting up /var/lib/apt/lists/apt.iteas.at_iteas_dists_buster_InRelease into data and signature failed
W: Failed to fetch http://ftp.debian.org/debian/dists/bullseye/InRelease Splitting up /var/lib/apt/lists/ftp.debian.org_debian_dists_bullseye_InRelease into data and signature failed
W: Failed to fetch http://ftp.debian.org/debian/dists/bullseye-updates/InRelease Error writing to file - write (28: No space left on device) [IP: 151.101.218.132 80]
W: Failed to fetch http://deb.debian.org/debian/dists/bullseye/InRelease Splitting up /var/lib/apt/lists/deb.debian.org_debian_dists_bullseye_InRelease into data and signature failed
W: Failed to fetch http://deb.debian.org/debian/dists/bullseye-updates/InRelease Error writing to file - write (28: No space left on device) [IP: 151.101.218.132 80]
W: Failed to fetch http://download.proxmox.com/debian/pve/dists/bullseye/InRelease Splitting up /var/lib/apt/lists/download.proxmox.com_debian_pve_dists_bullseye_InRelease into data and signature failed
W: Failed to fetch https://apt.iteas.at/iteas/dists/buster/InRelease Splitting up /var/lib/apt/lists/apt.iteas.at_iteas_dists_buster_InRelease into data and signature failed
W: Failed to fetch http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/Release.gpg At least one invalid signature was encountered.
W: Failed to fetch https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64/Release.gpg At least one invalid signature was encountered.
W: Some index files failed to download. They have been ignored, or old ones used instead.
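The "Splitting up ... into data and signature failed" errors are commonly seen when the cached index files under /var/lib/apt/lists are truncated or corrupt, which would be consistent with the "No space left on device" errors in the same output. Assuming that is the cause here, one possible cleanup once space has been freed is to delete the cached lists and let apt download them again. `clear_apt_lists` is a hypothetical helper, not part of apt.

```shell
#!/bin/sh
# Hypothetical helper: remove cached index files from an APT lists
# directory so that the next "apt update" downloads them again.
# Argument: $1 = lists directory (defaults to /var/lib/apt/lists).
clear_apt_lists() {
    dir=${1:-/var/lib/apt/lists}
    # Delete regular files only; subdirectories such as partial/ and the
    # lock file stay in place.
    find "$dir" -type f ! -name lock -exec rm -f {} +
}

# On the real host this would be run as root once disk space is available
# (left commented out on purpose):
#   df -h /var/lib/apt/lists
#   clear_apt_lists
#   apt update
```

This only touches downloaded package indexes, not installed packages or anything on the pve-data LV.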
Any help on how to repair my Proxmox server without touching the pve-data LV would be appreciated. I don't want to lose my configurations and VMs.