Hello. Relatively new Proxmox user here. I ran several independent Proxmox hosts for a few days, then joined them into a cluster and set up HA failover and such. I've really enjoyed learning Proxmox. However, there's one thing that has bothered me, and I can't seem to come up with the proper keywords to get an answer, so I figured I'd ask here.
Before I clustered the machines together, when I started or stopped a VM from the WebGUI, the VM would "power on" or "power off" nearly instantly. Less than 1 second would be my best guess. However, ever since I set up the cluster, VMs take 15 seconds (and sometimes more) before they power on (or off).
I can see this delay in the task log at the bottom of the WebGUI. I just shut down and started 2 VMs for maintenance, and one showed:
HA 100 - Start 22:23:03
VM 100 - Start 22:23:18
That's 15 seconds. (This is about the shortest I've seen this year.)
The other:
HA 117 - Start 20:32:48
VM 117 - Start 20:33:07
That's 19 seconds.
Still, I've had others take 30 seconds or so.
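For anyone who wants to check my arithmetic, this is roughly how I'm working out the gap between the "HA ... - Start" and "VM ... - Start" entries. It's just a minimal sketch: the function name gap_seconds is something I made up for illustration, and it assumes both timestamps are HH:MM:SS values copied by hand from the task log and are less than a day apart.

```python
from datetime import datetime

def gap_seconds(ha_start: str, vm_start: str) -> int:
    """Seconds between the HA task start and the VM task start.

    Hypothetical helper: both arguments are HH:MM:SS strings copied
    by hand from the WebGUI task log.
    """
    fmt = "%H:%M:%S"
    delta = datetime.strptime(vm_start, fmt) - datetime.strptime(ha_start, fmt)
    # The modulo handles a pair of timestamps that straddles midnight.
    return int(delta.total_seconds()) % 86400

# The two pairs of task-log entries quoted above.
print(gap_seconds("22:23:03", "22:23:18"))  # VM 100
print(gap_seconds("20:32:48", "20:33:07"))  # VM 117
```

It prints 15 and 19 for the two VMs above, which matches what I counted by eye.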
Likewise, if I shut down a VM from within the VM itself, the console goes offline, and it's almost always 15+ seconds before the Proxmox WebGUI actually shows the VM as powered off.
This is particularly annoying when I shut down the entire cluster for maintenance and then have to start the VMs up afterwards. I end up waiting 20+ seconds for each VM to "power on" before I try the next one; otherwise, the VMs seem to take even longer than 20 seconds to power on.
The *only* thing that I find even remotely out of place is that the Datacenter Summary page shows "Ceph" as "HEALTH_WARN", and if I click on it, it says "OSD count 0 < osd_pool_default_size 2". Unless I'm really lost on what Ceph is (and I will admit it's my weakest part of Proxmox), I don't use Ceph at all, so I've ignored this since I noticed it.
All of the hosts are overpowered for their workload, all have redundant 10Gb networking, and all seem to be generally healthy. top shows 1m, 5m, and 15m load averages below 0.80, and about 180GB of RAM free out of 256GB.
I do have a qdevice, but these problems started before I added it.
Can someone explain what is going on "behind the scenes" that takes all that time? I'd like to find and fix the problem (assuming there is a problem). This has been going on for quite some time. I think I first made the HA cluster on 7.1, and the delay still persists today on 7.4.
Thanks!