[SOLVED] Problem with missing /etc

joker2048 · Feb 29, 2024

Dear all,

I don't know why, but my folder /etc/pve is empty, so PVE will not start (of course).

I have backups of my VMs, but these backups are a few days old (due to another problem which is out of scope here).

The VM disks on my Proxmox host seem to be there. Is there a good way to combine my slightly old backups with these disks to recover the most current state?

I have another host on which I have already installed a fresh Proxmox instance.

Any ideas on this? Should I use the dd command (and if so, how)?

Yours
joker2048

leesteken · Feb 29, 2024

joker2048 said:
I don't know why, but my folder /etc/pve is empty, so PVE will not start (of course).

It's a database that is exposed as a filesystem (see https://pve.proxmox.com/pve-docs/pve-admin-guide.html#chapter_pmxcfs). It's probably empty because some of the PVE services did not start due to some other problem. This is not uncommon (did you search the forum?), but I don't have the details at hand.
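A minimal sketch for checking this on the host, assuming the default mount point /etc/pve and the default database location /var/lib/pve-cluster/config.db described in the linked admin guide. The function name pmxcfs_state is made up for illustration.

```shell
#!/bin/sh
# Report whether a directory is a mount point and whether the backing
# database file exists. Both paths are parameters; the defaults are the
# standard PVE locations (an assumption if your setup differs).
pmxcfs_state() {
    mnt="${1:-/etc/pve}"
    db="${2:-/var/lib/pve-cluster/config.db}"

    if mountpoint -q "$mnt" 2>/dev/null; then
        echo "mounted"
    else
        echo "not-mounted"
    fi

    if [ -f "$db" ]; then
        echo "db-present"
    else
        echo "db-missing"
    fi
}

# On the PVE host, call it without arguments to use the defaults:
#   pmxcfs_state
```

If this reports "not-mounted" together with "db-present", the configuration data is most likely still on disk and only the mount is missing.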

joker2048 said:
I have backups of my VMs, but these backups are a few days old (due to another problem which is out of scope here).

Maybe it is related and that's why Proxmox is not loading.

joker2048 said:
The VM disks on my Proxmox host seem to be there. Is there a good way to combine my slightly old backups with these disks to recover the most current state?

I have another host on which I have already installed a fresh Proxmox instance.

Any ideas on this? Should I use the dd command (and if so, how)?

No, definitely do not blindly copy the Proxmox configuration from another system. Sorry for not being more helpful right now, but I wanted to stop you from ruining this system beyond repair.

EDIT: What is the output of systemctl status pve-cluster? Do any problems stand out in journalctl -b 0 (use the arrow keys to scroll)?

joker2048 · Feb 29, 2024

Oh ok, good point.

I suspected an error in one of my own scripts, because I replace the .pem and .key files with a wildcard certificate that I use on my external servers (split-DNS setup in my homelab). This script deletes files (normally only the corresponding .key and .pem), so perhaps that is the cause. I had not considered that /etc/pve is a mount point.
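As a sketch only, this is one way such a cleanup step could be guarded so that it removes exactly the named certificate files and nothing else. The function name remove_old_cert and the file names in the usage comment are hypothetical and are not taken from the actual script.

```shell
#!/bin/sh
# Remove only the explicitly listed certificate files from a directory.
# Names containing a path separator, or not ending in .pem or .key,
# are skipped with a message on stderr.
remove_old_cert() {
    dir="$1"
    shift
    for name in "$@"; do
        case "$name" in
            */*) echo "skip (path separators not allowed): $name" >&2; continue ;;
            *.pem|*.key) ;;
            *) echo "skip (not .pem/.key): $name" >&2; continue ;;
        esac
        if [ -f "$dir/$name" ]; then
            rm -- "$dir/$name"
        fi
    done
}

# Hypothetical usage:
#   remove_old_cert /path/to/cert/dir old-cert.pem old-cert.key
```

Passing the file names explicitly, instead of using a glob or a recursive delete, limits the damage if the directory argument is ever wrong.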

I am not aiming to blindly copy configurations. The approach is to set up a new PVE host and restore the backups there. As stated, these are a little bit old. I would then replace the disk images (copying them from the old system to the new one), so that I have the complete configuration and setup (from the backups) and the most current files (from the disk copy). As this is a "new" system, any operation should be harmless.
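The disk copy step could look roughly like the following sketch. It assumes the VM is shut down, that source and target are plain image files or block devices of matching size, and that both are reachable from the machine running the command. The function name copy_image and the paths in the usage comment are hypothetical.

```shell
#!/bin/sh
# Copy a disk image with dd and verify the result byte for byte.
# Returns non-zero if the copy fails or the two files differ.
copy_image() {
    src="$1"
    dst="$2"
    # conv=fsync flushes the written data to disk before dd exits.
    dd if="$src" of="$dst" bs=4M conv=fsync || return 1
    cmp -s "$src" "$dst"
}

# Hypothetical usage:
#   copy_image /path/to/old/vm-disk.raw /path/to/new/vm-disk.raw
```

The cmp step reads both images again, which doubles the I/O but catches a truncated or corrupted copy before the VM is started from it.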

My output:
root@pve:~# pvesm list
ipcc_send_rec[1] failed: Connection refused
ipcc_send_rec[2] failed: Connection refused
ipcc_send_rec[3] failed: Connection refused
Unable to load access control list: Connection refused
root@pve:~# systemctl status pve-cluster
× pve-cluster.service - The Proxmox VE cluster filesystem
Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; preset: enabled)
Active: failed (Result: exit-code) since Thu 2024-02-29 14:22:26 CET; 5h 44min ago
Process: 1011 ExecStart=/usr/bin/pmxcfs (code=exited, status=255/EXCEPTION)
CPU: 21ms

Feb 29 14:22:26 pve systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 5.
Feb 29 14:22:26 pve systemd[1]: Stopped pve-cluster.service - The Proxmox VE cluster filesystem.
Feb 29 14:22:26 pve systemd[1]: pve-cluster.service: Start request repeated too quickly.
Feb 29 14:22:26 pve systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Feb 29 14:22:26 pve systemd[1]: Failed to start pve-cluster.service - The Proxmox VE cluster filesystem.
root@pve:~# journalctl -b 0
Feb 29 14:22:19 pve kernel: Linux version 6.5.11-7-pve (build@proxmox) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC PMX 6.5.11-7 (2023-12-05T09:44Z) ()
Feb 29 14:22:19 pve kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-6.5.11-7-pve root=/dev/mapper/pve-root ro quiet
Feb 29 14:22:19 pve kernel: KERNEL supported cpus:
Feb 29 14:22:19 pve kernel: Intel GenuineIntel
Feb 29 14:22:19 pve kernel: AMD AuthenticAMD
Feb 29 14:22:19 pve kernel: Hygon HygonGenuine
Feb 29 14:22:19 pve kernel: Centaur CentaurHauls
Feb 29 14:22:19 pve kernel: zhaoxin Shanghai
Feb 29 14:22:19 pve kernel: BIOS-provided physical RAM map:
Feb 29 14:22:19 pve kernel: BIOS-e820: [mem 0x0000000000000000-0x0000000000057fff] usable
Feb 29 14:22:19 pve kernel: BIOS-e820: [mem 0x0000000000058000-0x0000000000058fff] reserved
Feb 29 14:22:19 pve kernel: BIOS-e820: [mem 0x0000000000059000-0x000000000009efff] usable
Feb 29 14:22:19 pve kernel: BIOS-e820: [mem 0x000000000009f000-0x000000000009ffff] reserved
Feb 29 14:22:19 pve kernel: BIOS-e820: [mem 0x0000000000100000-0x00000000c840bfff] usable
Feb 29 14:22:19 pve kernel: BIOS-e820: [mem 0x00000000c840c000-0x00000000c8412fff] ACPI NVS
Feb 29 14:22:19 pve kernel: BIOS-e820: [mem 0x00000000c8413000-0x00000000dba84fff] usable
Feb 29 14:22:19 pve kernel: BIOS-e820: [mem 0x00000000dba85000-0x00000000dbb0efff] reserved
Feb 29 14:22:19 pve kernel: BIOS-e820: [mem 0x00000000dbb0f000-0x00000000dbb28fff] ACPI data
Feb 29 14:22:19 pve kernel: BIOS-e820: [mem 0x00000000dbb29000-0x00000000dbc91fff] ACPI NVS
Feb 29 14:22:19 pve kernel: BIOS-e820: [mem 0x00000000dbc92000-0x00000000dbf6dfff] reserved
Feb 29 14:22:19 pve kernel: BIOS-e820: [mem 0x00000000dbf6e000-0x00000000dbffefff] type 20
Feb 29 14:22:19 pve kernel: BIOS-e820: [mem 0x00000000dbfff000-0x00000000dbffffff] usable
Feb 29 14:22:19 pve kernel: BIOS-e820: [mem 0x00000000dd000000-0x00000000df1fffff] reserved
Feb 29 14:22:19 pve kernel: BIOS-e820: [mem 0x00000000f8000000-0x00000000fbffffff] reserved
Feb 29 14:22:19 pve kernel: BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
Feb 29 14:22:19 pve kernel: BIOS-e820: [mem 0x00000000fed00000-0x00000000fed03fff] reserved
Feb 29 14:22:19 pve kernel: BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
Feb 29 14:22:19 pve kernel: BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
Feb 29 14:22:19 pve kernel: BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
Feb 29 14:22:19 pve kernel: BIOS-e820: [mem 0x0000000100000000-0x000000041fdfffff] usable
Feb 29 14:22:19 pve kernel: NX (Execute Disable) protection: active
Feb 29 14:22:19 pve kernel: efi: EFI v2.3.1 by American Megatrends
Feb 29 14:22:19 pve kernel: efi: ACPI=0xdbb14000 ACPI 2.0=0xdbb14000 SMBIOS=0xf04c0
Feb 29 14:22:19 pve kernel: efi: Remove mem83: MMIO range=[0xf8000000-0xfbffffff] (64MB) from e820 map
Feb 29 14:22:19 pve kernel: e820: remove [mem 0xf8000000-0xfbffffff] reserved
Feb 29 14:22:19 pve kernel: efi: Not removing mem84: MMIO range=[0xfec00000-0xfec00fff] (4KB) from e820 map
Feb 29 14:22:19 pve kernel: efi: Not removing mem85: MMIO range=[0xfed00000-0xfed03fff] (16KB) from e820 map
Feb 29 14:22:19 pve kernel: efi: Not removing mem86: MMIO range=[0xfed1c000-0xfed1ffff] (16KB) from e820 map
Feb 29 14:22:19 pve kernel: efi: Not removing mem87: MMIO range=[0xfee00000-0xfee00fff] (4KB) from e820 map
Feb 29 14:22:19 pve kernel: efi: Remove mem88: MMIO range=[0xff000000-0xffffffff] (16MB) from e820 map
Feb 29 14:22:19 pve kernel: e820: remove [mem 0xff000000-0xffffffff] reserved
Feb 29 14:22:19 pve kernel: secureboot: Secure boot could not be determined (mode 0)
Feb 29 14:22:19 pve kernel: SMBIOS 2.8 present.
Feb 29 14:22:19 pve kernel: DMI: /D54250WYK, BIOS WYLPT10H.86A.0021.2013.1017.1606 10/17/2013
Feb 29 14:22:19 pve kernel: tsc: Fast TSC calibration using PIT
Feb 29 14:22:19 pve kernel: tsc: Detected 1895.620 MHz processor
Feb 29 14:22:19 pve kernel: e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
Feb 29 14:22:19 pve kernel: e820: remove [mem 0x000a0000-0x000fffff] usable
Feb 29 14:22:19 pve kernel: last_pfn = 0x41fe00 max_arch_pfn = 0x400000000
Feb 29 14:22:19 pve kernel: total RAM covered: 16334M
Feb 29 14:22:19 pve kernel: Found optimal setting for mtrr clean up
Feb 29 14:22:19 pve kernel: gran_size: 64K chunk_size: 64M num_reg: 9 lose cover RAM: 0G
Feb 29 14:22:19 pve kernel: MTRR map: 9 entries (5 fixed + 4 variable; max 25), built from 10 variable MTRRs
Feb 29 14:22:19 pve kernel: x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT
Feb 29 14:22:19 pve kernel: e820: update [mem 0xdd000000-0xffffffff] usable ==> reserved
Feb 29 14:22:19 pve kernel: last_pfn = 0xdc000 max_arch_pfn = 0x400000000
Feb 29 14:22:19 pve kernel: found SMP MP-table at [mem 0x000fd6b0-0x000fd6bf]

leesteken · Feb 29, 2024

joker2048 said:
root@pve:~# systemctl status pve-cluster
× pve-cluster.service - The Proxmox VE cluster filesystem
Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; preset: enabled)
Active: failed (Result: exit-code) since Thu 2024-02-29 14:22:26 CET; 5h 44min ago
Process: 1011 ExecStart=/usr/bin/pmxcfs (code=exited, status=255/EXCEPTION)
CPU: 21ms

Feb 29 14:22:26 pve systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 5.
Feb 29 14:22:26 pve systemd[1]: Stopped pve-cluster.service - The Proxmox VE cluster filesystem.
Feb 29 14:22:26 pve systemd[1]: pve-cluster.service: Start request repeated too quickly.
Feb 29 14:22:26 pve systemd[1]: pve-cluster.service: Failed with result 'exit-code'.

I suspected an error in one of my own scripts, because I replace the .pem and .key files with a wildcard certificate that I use on my external servers (split-DNS setup in my homelab). This script deletes files (normally only the corresponding .key and .pem), so perhaps that is the cause. I had not considered that /etc/pve is a mount point.

That might be useful information to figure out why pve-cluster is failing.

joker2048 said:
I am not aiming to blindly copy configurations. The approach is to set up a new PVE host and restore the backups there.

That would almost certainly work.

joker2048 said:
root@pve:~# journalctl -b 0
Feb 29 14:22:19 pve kernel: Linux version 6.5.11-7-pve (build@proxmox) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC PMX 6.5.11-7 (2023-12-05T09:44Z) ()
Feb 29 14:22:19 pve kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-6.5.11-7-pve root=/dev/mapper/pve-root ro quiet
Feb 29 14:22:19 pve kernel: KERNEL supported cpus:
Feb 29 14:22:19 pve kernel: Intel GenuineIntel
Feb 29 14:22:19 pve kernel: AMD AuthenticAMD
Feb 29 14:22:19 pve kernel: Hygon HygonGenuine
Feb 29 14:22:19 pve kernel: Centaur CentaurHauls
Feb 29 14:22:19 pve kernel: zhaoxin Shanghai
...

There is no point in showing the beginning of the logs. You need to scroll manually using the arrow keys, looking for relevant error messages. But it's already clear that the pve-cluster service is failing, and I bet it has something to do with a mistake in the certificates or the way they were installed. But I don't know much about that, sorry.
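As an alternative to scrolling, the journal can be narrowed down to the lines that mention the failing service. A minimal sketch follows; the helper name pve_cluster_lines is made up, and the pattern is an assumption about which lines are relevant.

```shell
#!/bin/sh
# Keep only log lines that mention the cluster filesystem.
# Reads log text on stdin, so it works on live output and on saved logs.
pve_cluster_lines() {
    grep -E 'pmxcfs|pve-cluster'
}

# Typical use on the host (current boot, no pager):
#   journalctl -b 0 --no-pager | pve_cluster_lines
# journalctl can also filter by unit directly:
#   journalctl -b 0 -u pve-cluster --no-pager
```

Running journalctl with -e jumps to the end of the log, which is usually where the relevant errors of a failed start are.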

joker2048 · Feb 29, 2024

leesteken said:
There is no point in showing the beginning of the logs

Yes, you are right. Sorry, I was in a hurry. The end of the log reports a certificate error, but this is misleading.

The information that /etc/pve is backed by a database (SQLite) was the key. I deleted everything inside /etc/pve, rebooted, and everything works like a charm. First action: update the backups.

One topic left: what should be included in a backup of the PVE host itself? /etc/pve? I think not. The database mounted at /etc/pve?
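Not an authoritative answer, but one possible approach as a sketch: according to the admin guide linked earlier in this thread, pmxcfs keeps its data in the database file /var/lib/pve-cluster/config.db, while /etc/pve is only the mounted view of it. Archiving the files as seen through the mount is one simple option. The function name backup_tree and the target path in the usage comment are hypothetical.

```shell
#!/bin/sh
# Archive the contents of a directory into a gzip-compressed tarball.
# -C keeps the paths inside the archive relative to the source directory.
backup_tree() {
    src="$1"
    archive="$2"
    tar -czf "$archive" -C "$src" .
}

# Hypothetical usage while pve-cluster is running and /etc/pve is mounted:
#   backup_tree /etc/pve /path/to/backups/etc-pve.tar.gz
```

Such an archive is only useful if it is taken while the mount is healthy, so checking that /etc/pve is not empty before archiving would be a sensible addition.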
