[SOLVED] PVE Shutdown/Reboot takes more than 10 minutes

fpdragon

1707118295751.png

1707118496936.png

I am not sure where to start.
The system runs on my new refurbished HPE DL380 Gen9.

I hope you can help. Or is this simply normal on server hardware?
Judging from the output, though, these timeouts and watchdogs look rather strange.
Thanks.
 

Attachments

  • 1707118402125.png
Thanks Udo.

Code:
~# journalctl -b 0 -p 3
Feb 05 11:20:06 ProxHpDL380 kernel: [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 38d is 330)
Feb 05 11:20:08 ProxHpDL380 kernel: i8042: Can't read CTR while initializing i8042
Feb 05 11:20:06 ProxHpDL380 systemd-modules-load[1293]: Failed to find module 'vfio_virqfd #not needed if on kernel 6.2 or newer'
Feb 05 11:20:14 ProxHpDL380 pmxcfs[2740]: [quorum] crit: quorum_initialize failed: 2
Feb 05 11:20:14 ProxHpDL380 pmxcfs[2740]: [quorum] crit: can't initialize service
Feb 05 11:20:14 ProxHpDL380 pmxcfs[2740]: [confdb] crit: cmap_initialize failed: 2
Feb 05 11:20:14 ProxHpDL380 pmxcfs[2740]: [confdb] crit: can't initialize service
Feb 05 11:20:14 ProxHpDL380 pmxcfs[2740]: [dcdb] crit: cpg_initialize failed: 2
Feb 05 11:20:14 ProxHpDL380 pmxcfs[2740]: [dcdb] crit: can't initialize service
Feb 05 11:20:14 ProxHpDL380 pmxcfs[2740]: [status] crit: cpg_initialize failed: 2
Feb 05 11:20:14 ProxHpDL380 pmxcfs[2740]: [status] crit: can't initialize service
Feb 05 11:20:18 ProxHpDL380 smartd[2420]: Device: /dev/sdf [SAT], 341 Currently unreadable (pending) sectors
Feb 05 11:20:18 ProxHpDL380 smartd[2420]: Device: /dev/sdf [SAT], 340 Offline uncorrectable sectors
Feb 05 11:20:28 ProxHpDL380 pve-guests[2994]: CT is locked (disk)

It seems that there are several things going on.

About the firmware bug: I read that this is hardware specific and that Linux wants to take over some sensors. I guess it is not critical.

The quorum thing... I previously ran this machine standalone, and the shutdown took just as long.

/dev/sdf... That was new to me. This is one of several disks that are passed through to a VM. The SMART data looks OK and the disk was working fine; at least I have not found an issue. Is the disk broken? Could this disk cause the long shutdown delay and maybe other problems as well?

What do you say?
 
It seems like your disk has problems, and this can certainly lead to long startup and shutdown times. Try replacing the disk /dev/sdf.
Unfortunately, not every error is caught by SMART, but try a SMART long test:
smartctl -t long /dev/sdf
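
Once the long test has finished, the result and the sector counters can be read back with smartctl. The following is only a minimal sketch: `bad_sector_counts` is a hypothetical helper name, and it assumes the usual ATA attribute table printed by `smartctl -A`, where the attribute name is the second column and the raw value the tenth.

```shell
# Hypothetical helper: reads `smartctl -A` output on stdin and prints the
# raw values of the two attributes that correspond to the smartd warnings.
# Assumes the standard ATA table layout (name in column 2, raw value in column 10).
bad_sector_counts() {
    awk '$2 == "Current_Pending_Sector" || $2 == "Offline_Uncorrectable" { print $2, $10 }'
}

# Usage on the host, as root, after the long test has completed:
#   smartctl -l selftest /dev/sdf              # outcome of the self-test
#   smartctl -A /dev/sdf | bad_sector_counts   # pending / uncorrectable counts
```

Non-zero values that do not go away are a common reason to replace a disk, even if the overall SMART health status still reports a pass.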
 
I set /dev/sdf to "retired", rebuilt my storage pool, and removed the disk.
Now the shutdown takes only seconds.

Thanks a lot for the help.

Code:
~# journalctl  -b 0  -p 3
was the solution
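
For anyone who finds this thread with the same symptom, here is a rough sketch of how the affected disks could be pulled out of that journal output. `smartd_devices` is a hypothetical helper name, and it assumes smartd logs its warnings in the `Device: /dev/... [SAT], ...` form shown earlier in this thread.

```shell
# Hypothetical helper: reads journal lines on stdin and prints each
# device that smartd complained about, once per device.
smartd_devices() {
    grep -o 'Device: /dev/[a-z0-9]*' | sort -u | cut -d' ' -f2
}

# Usage on the host:
#   journalctl -b 0 -p 3 -t smartd | smartd_devices
# With a persistent journal, `-b -1` selects the previous boot instead,
# which is where messages from a slow shutdown end up after a reboot.
```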
 
