[SOLVED] VMs won't autostart after reboot and host/vm status is unknown

Feb 13, 2022
Hello,

Not sure what happened to Proxmox, but I wanted to change the outlets my server was plugged into, so I shut down each of my four VMs and then shut down Proxmox. When it booted back up, it didn't auto-start any of the VMs, and there are little grey question marks beside the host and the VMs (see image below).

When Proxmox boots, it displays the following just before wiping the screen and displaying the login prompt:

Code:
[FAILED] Failed to start Import ZFS pools by device scanning.
[FAILED] Failed to start Network UPS Tools - power device driver controller.
[FAILED] Failed to start Network UPS Tools - power device monitor and shutdown controller.
[FAILED] Failed to start PVE Status Daemon.
[FAILED] Failed to start Proxmox VE firewall.
[FAILED] Failed to start PVE Cluster HA Resource Manager Daemon.
[FAILED] Failed to start Proxmox VE scheduler.
[FAILED] Failed to start PVE Local HA Resource Manager Daemon.

I have a TrueNAS VM on this host that uses 6 of the attached drives. I suspect the ZFS pool error comes from that and is not an actual issue with the Proxmox boot drive.

The two NUT errors are because I haven't set up NUT yet.

The remaining Proxmox errors are concerning, though. I can connect to the host over SSH and manually start several of the services, which then lets me start the VMs manually. After a reboot, though, I'm back in the same situation. I have posted the output of a few systemctl status calls below. The common thread seems to be "ipcc_send_rec failed".
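
To see which processes hit this error during a boot, a small filter along these lines could tally the matching journal lines per process. This is only a sketch: the function name ipcc_failures_by_process is made up for illustration, and it assumes the default syslog-style journalctl output, where the fifth field looks like "process[pid]:".

```shell
# Hypothetical helper: reads journal text on stdin and counts
# "ipcc_send_rec[N] failed" lines per logging process.
# Assumes lines shaped like:
#   Feb 12 19:06:04 host02 pve-firewall[2749]: ipcc_send_rec[1] failed: ...
ipcc_failures_by_process() {
    awk '/ipcc_send_rec\[[0-9]+\] failed/ {
        name = $5                          # e.g. "pve-firewall[2749]:"
        sub(/\[[0-9]+\]:$/, "", name)      # strip the "[pid]:" suffix
        count[name]++
    }
    END { for (n in count) print count[n], n }' | sort -rn
}
```

It could be fed from the current boot's journal with: journalctl -b --no-pager | ipcc_failures_by_process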

And finally, I can start these services manually:
pve-firewall.service
pvescheduler.service
pve-ha-lrm.service
pve-ha-crm.service
pvestatd.service

But not these:
pve-guests.service <-- fails to start
pve-manager.service <-- fails to start (calls pve-guests.service)
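
For a quick overview of all of these units at once, a loop like the following could print the state of each. It is a rough sketch: the function name report_units and the SYSTEMCTL override are assumptions added for illustration, and the unit list is simply the services named in this post plus pve-cluster.

```shell
# Units mentioned in this post, plus pve-cluster.
PVE_UNITS="pve-cluster pve-firewall pvescheduler pve-ha-lrm pve-ha-crm pvestatd pve-guests pve-manager"
SYSTEMCTL="${SYSTEMCTL:-systemctl}"   # overridable so the loop can be dry-run

# Hypothetical helper: prints one "unit  state" line per unit.
report_units() {
    for unit in $PVE_UNITS; do
        # is-active prints the state and returns non-zero unless it is "active"
        state=$("$SYSTEMCTL" is-active "$unit.service" 2>/dev/null) || true
        printf '%-24s %s\n' "$unit.service" "${state:-unknown}"
    done
}
```

Running report_units right after boot and again after starting the services by hand should show which units change state.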

Any ideas what might be happening? I can post some more logs as well if that would help.


Code:
proxmox-ve: 7.1-1 (running kernel: 5.13.19-3-pve)
pve-manager: 7.1-10 (running version: 7.1-10/6ddebafe)
pve-kernel-helper: 7.1-8
pve-kernel-5.13: 7.1-6
pve-kernel-5.13.19-3-pve: 5.13.19-7
pve-kernel-5.13.19-2-pve: 5.13.19-4
ceph-fuse: 15.2.15-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.1
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-6
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.1-2
libpve-guest-common-perl: 4.0-3
libpve-http-server-perl: 4.1-1
libpve-storage-perl: 7.0-15
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.11-1
lxcfs: 4.0.11-pve1
novnc-pve: 1.3.0-1
proxmox-backup-client: 2.1.4-1
proxmox-backup-file-restore: 2.1.4-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-5
pve-cluster: 7.1-3
pve-container: 4.1-3
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-4
pve-ha-manager: 3.3-3
pve-i18n: 2.6-2
pve-qemu-kvm: 6.1.0-3
pve-xtermjs: 4.16.0-1
qemu-server: 7.1-4
smartmontools: 7.2-1
spiceterm: 3.2-2
swtpm: 0.7.0~rc1+2
vncterm: 1.7-1
zfsutils-linux: 2.1.2-pve1


Code:
systemctl status pve-cluster.service
# Shows pve-cluster is active

systemctl status pve-firewall.service
● pve-firewall.service - Proxmox VE firewall
     Loaded: loaded (/lib/systemd/system/pve-firewall.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Sat 2022-02-12 19:06:04 EST; 16min ago
    Process: 2718 ExecStartPre=/usr/bin/update-alternatives --set ebtables /usr/sbin/ebtables-legacy (code=exited, status=0/SUCCESS)
    Process: 2730 ExecStartPre=/usr/bin/update-alternatives --set iptables /usr/sbin/iptables-legacy (code=exited, status=0/SUCCESS)
    Process: 2738 ExecStartPre=/usr/bin/update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy (code=exited, status=0/SUCCESS)
    Process: 2749 ExecStart=/usr/sbin/pve-firewall start (code=exited, status=111)
        CPU: 467ms

Feb 12 19:06:04 host02 pve-firewall[2749]: ipcc_send_rec[1] failed: Connection refused
Feb 12 19:06:04 host02 pve-firewall[2749]: ipcc_send_rec[2] failed: Connection refused
Feb 12 19:06:04 host02 pve-firewall[2749]: ipcc_send_rec[3] failed: Connection refused
Feb 12 19:06:04 host02 pve-firewall[2749]: Unable to load access control list: Connection refused
Feb 12 19:06:04 host02 pve-firewall[2749]: ipcc_send_rec[1] failed: Connection refused
Feb 12 19:06:04 host02 pve-firewall[2749]: ipcc_send_rec[2] failed: Connection refused
Feb 12 19:06:04 host02 pve-firewall[2749]: ipcc_send_rec[3] failed: Connection refused
Feb 12 19:06:04 host02 systemd[1]: pve-firewall.service: Control process exited, code=exited, status=111/n/a
Feb 12 19:06:04 host02 systemd[1]: pve-firewall.service: Failed with result 'exit-code'.
Feb 12 19:06:04 host02 systemd[1]: Failed to start Proxmox VE firewall.


systemctl status pvescheduler.service
● pvescheduler.service - Proxmox VE scheduler
     Loaded: loaded (/lib/systemd/system/pvescheduler.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Sat 2022-02-12 19:06:04 EST; 17min ago
    Process: 2724 ExecStart=/usr/bin/pvescheduler start (code=exited, status=111)
        CPU: 553ms

Feb 12 19:06:04 host02 pvescheduler[2724]: ipcc_send_rec[1] failed: Connection refused
Feb 12 19:06:04 host02 pvescheduler[2724]: ipcc_send_rec[2] failed: Connection refused
Feb 12 19:06:04 host02 pvescheduler[2724]: ipcc_send_rec[3] failed: Connection refused
Feb 12 19:06:04 host02 pvescheduler[2724]: Unable to load access control list: Connection refused
Feb 12 19:06:04 host02 pvescheduler[2724]: ipcc_send_rec[1] failed: Connection refused
Feb 12 19:06:04 host02 pvescheduler[2724]: ipcc_send_rec[2] failed: Connection refused
Feb 12 19:06:04 host02 pvescheduler[2724]: ipcc_send_rec[3] failed: Connection refused
Feb 12 19:06:04 host02 systemd[1]: pvescheduler.service: Control process exited, code=exited, status=111/n/a
Feb 12 19:06:04 host02 systemd[1]: pvescheduler.service: Failed with result 'exit-code'.
Feb 12 19:06:04 host02 systemd[1]: Failed to start Proxmox VE scheduler.



systemctl status pve-guests.service
● pve-guests.service - PVE guests
     Loaded: loaded (/lib/systemd/system/pve-guests.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Sat 2022-02-12 19:06:06 EST; 17min ago
    Process: 3135 ExecStartPre=/usr/share/pve-manager/helpers/pve-startall-delay (code=exited, status=0/SUCCESS)
    Process: 3136 ExecStart=/usr/bin/pvesh --nooutput create /nodes/localhost/startall (code=exited, status=111)
   Main PID: 3136 (code=exited, status=111)
        CPU: 819ms

Feb 12 19:06:05 host02 systemd[1]: Starting PVE guests...
Feb 12 19:06:06 host02 pvesh[3136]: ipcc_send_rec[1] failed: Connection refused
Feb 12 19:06:06 host02 pvesh[3136]: ipcc_send_rec[2] failed: Connection refused
Feb 12 19:06:06 host02 pvesh[3136]: ipcc_send_rec[3] failed: Connection refused
Feb 12 19:06:06 host02 pvesh[3136]: Unable to load access control list: Connection refused
Feb 12 19:06:06 host02 systemd[1]: pve-guests.service: Main process exited, code=exited, status=111/n/a
Feb 12 19:06:06 host02 systemd[1]: pve-guests.service: Failed with result 'exit-code'.
Feb 12 19:06:06 host02 systemd[1]: Failed to start PVE guests.

etc.

Screen Shot 2022-02-12 at 7.12.23 PM.png
 
Can you post the journal for a complete boot (for example, journalctl -b after a reboot)?

What's the status of the 'pve-cluster' service?
 
OK, thanks for the logs, but they are rather short (only ~1.5 minutes). Can you wait a little longer (at least a few minutes) and then collect the logs again?
 
Hi dcsapak,

Thanks for checking them out. I actually just removed the logs since everything booted up fine that time so the logs didn't show the issue. I did some more investigating and I figured out what was causing the issue on my previous boots. I had implemented a version of a host backup script found in the discussion here: https://forum.proxmox.com/threads/how-to-backup-proxmox-configuration-files.67789/

The backup target was a share on my NAS. The NAS is a VM hosted by this host. So the problem was that the backup script was running when Proxmox booted, but because my NAS VM hadn't started up yet, the location the script was trying to back up to was not available. Part of the backup script included stopping several services, then making the backup, then restarting the services. So the script stopped the services, then failed when the backup location couldn't be found, and never restarted the services!

So thank you so much for looking into this and pointing me to the journalctl -b logs, I was able to find the script errors in there and correct the issue! Now the script checks to make sure the backup target exists before doing anything :) And I stopped it running during boot up and it just runs at midnight now, when the nas should be running :)
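
For anyone with a similar setup, the fix looks roughly like the sketch below. It is not the actual script from the linked thread: the paths, the service list, and the function names are placeholders chosen for illustration. The two ideas are to check the target before stopping anything, and to restart the services even when the backup step itself fails.

```shell
# Hypothetical defaults; none of these values come from the original script.
BACKUP_SOURCE="${BACKUP_SOURCE:-/etc}"
BACKUP_TARGET="${BACKUP_TARGET:-/mnt/nas/pve-backup}"
SERVICES="${SERVICES:-pvestatd pvescheduler}"
SYSTEMCTL="${SYSTEMCTL:-systemctl}"   # set to "echo" for a dry run

# Succeeds only if the target is an existing, writable directory.
target_available() {
    [ -d "$1" ] && [ -w "$1" ]
}

run_backup() {
    local rc=0 svc
    # Check first, so nothing is stopped when the NAS share is missing.
    if ! target_available "$BACKUP_TARGET"; then
        echo "backup target $BACKUP_TARGET is not available, nothing done" >&2
        return 1
    fi

    for svc in $SERVICES; do
        "$SYSTEMCTL" stop "$svc" || echo "could not stop $svc" >&2
    done

    # Remember a tar failure instead of bailing out before the restart.
    tar -czf "$BACKUP_TARGET/host-config-$(date +%F).tar.gz" \
        -C "$BACKUP_SOURCE" . || rc=$?

    for svc in $SERVICES; do
        "$SYSTEMCTL" start "$svc" || echo "could not start $svc" >&2
    done
    return "$rc"
}
```

One caveat: if the target is a mount point, the empty directory can still exist while the share is unmounted, so a stricter check such as mountpoint -q on that path may be worth adding. The sketch also does not handle the script being killed between the stop and the start.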

Problem solved!
 
Hi,

I need assistance because some of my VMs are not booting up after the server restarts. I have multiple servers and some of the VMs are affected.

They give me the error below.
 

Attachment: Screenshot_152.jpg

Please don't hijack an old thread. Instead, open your own thread with a bit more information (e.g. the full error message and a more detailed description of the setup and the issue).
 
