Looking for some information...

nikatjef · Mar 15, 2024

Greetings,

Very new to Proxmox, so I may have missed discussion of this elsewhere, but when you use `qm list` or query the API to get a list of VMs, templates show up with a status of stopped. This causes various monitoring systems to falsely report these templates as stopped VMs instead of ignoring them.

I would like to see how difficult it would be to patch the system to update the status from stopped to template to fix these monitoring systems... To do that I need to figure out which of the git repositories would contain the requisite code. Any guidance would be greatly appreciated.

bbgeek17 · Mar 15, 2024

Sounds like a very complicated and error-prone way to solve a rather simple challenge:

{
  "cpu": 0,
  "disk": 0,
  "diskread": 0,
  "diskwrite": 0,
  "id": "qemu/100",
  "maxcpu": 1,
  "maxdisk": 34359738368,
  "maxmem": 2147483648,
  "mem": 0,
  "name": "template",
  "netin": 0,
  "netout": 0,
  "node": "pve7demo1",
  "status": "stopped",
  "template": 1,
  "type": "qemu",
  "uptime": 0,
  "vmid": 100
}
{
  "cpu": 0.0153864002981679,
  "disk": 0,
  "diskread": 15880192,
  "diskwrite": 1319182336,
  "id": "qemu/3000",
  "maxcpu": 1,
  "maxdisk": 8589934592,
  "maxmem": 8589934592,
  "mem": 603291648,
  "name": "vm3000",
  "netin": 19893136,
  "netout": 93358,
  "node": "pve7demo2",
  "status": "running",
  "template": 0,
  "type": "qemu",
  "uptime": 288598,
  "vmid": 3000
}
{
  "cpu": 0.00102395490198625,
  "disk": 0,
  "diskread": 408521728,
  "diskwrite": 1225148928,
  "id": "qemu/3011",
  "maxcpu": 1,
  "maxdisk": 8589934592,
  "maxmem": 8589934592,
  "mem": 1431818240,
  "name": "vm3011",
  "netin": 544066691,
  "netout": 5138010,
  "node": "pve7demo1",
  "status": "running",
  "template": 0,
  "type": "qemu",
  "uptime": 128872,
  "vmid": 3011
}

pvesh get /cluster/resources -type vm --output-format json |jq '.[]|select(.template != 1)|.vmid'
3000
3011
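
For monitoring scripts that are not built around jq, the same filter could be sketched in Python. This is only a rough sketch: the function names `fetch_vm_resources` and `non_template_vmids` are made up for illustration, `SAMPLE` is a trimmed copy of the three entries above, and treating a missing `template` key as 0 is a defensive assumption rather than documented behavior.

```python
import json
import subprocess


def fetch_vm_resources():
    """Run the pvesh query from this thread and return the parsed JSON list."""
    out = subprocess.run(
        ["pvesh", "get", "/cluster/resources", "--type", "vm",
         "--output-format", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def non_template_vmids(resources):
    """Return the vmids of all guests that are not flagged as templates."""
    # Assumption: an entry without a "template" key is not a template.
    return [r["vmid"] for r in resources if r.get("template", 0) != 1]


# Trimmed copies of the three entries shown above.
SAMPLE = [
    {"vmid": 100, "name": "template", "status": "stopped", "template": 1},
    {"vmid": 3000, "name": "vm3000", "status": "running", "template": 0},
    {"vmid": 3011, "name": "vm3011", "status": "running", "template": 0},
]

print(non_template_vmids(SAMPLE))
```

On a PVE node, `non_template_vmids(fetch_vm_resources())` should give the same list as the jq pipeline.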

Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox

nikatjef · Mar 15, 2024

Doesn't solve the `qm list` output or the output of the API query... Maybe you are suggesting that it would be better to alter the 5+ monitoring systems which are throwing false positive alerts?

bbgeek17 · Mar 15, 2024

The API query returns JSON, making it much easier to process as shown above than to modify the core functionality of an appliance.

For any reliable monitoring, one should avoid using "qm list" as the CLI text output is never guaranteed to be stable. Granted, it likely hasn't changed in PVE in years.

If you insist, you can hunt for the relevant code here: https://github.com/proxmox.

Introducing new conditional logic will likely break multiple internal, CLI, and GUI portions of PVE that don't expect a new state. Additionally, I'd imagine you are not guaranteed to know that a VM is a template when you query its running state, as that state may be coming from processes in the kernel. Keep in mind that PVE, unlike OpenStack, does not keep track of the VM state in a database.

Since you asked for my opinion: yes, I do think it would be easier to adjust the monitoring tools to properly interpret the state of the system they are supposed to monitor than to custom-patch core functionality of PVE. Such patches would need to be maintained out of tree forever, as they will likely be overwritten on each upgrade. I expect you'll have to modify tens if not hundreds of files.
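
To illustrate what "adjusting the monitoring side" might look like, here is a minimal sketch of a shim that derives a "template" status from the existing `template` flag without touching PVE. The names `effective_status` and `stopped_vms` are hypothetical, and the stopped VM 3020 in the sample data is invented purely to exercise the logic; the other two entries are trimmed from the output earlier in the thread.

```python
def effective_status(resource):
    """Return "template" for templates, otherwise the status PVE reports."""
    # Assumption: an entry without a "template" key is not a template.
    if resource.get("template", 0) == 1:
        return "template"
    return resource["status"]


def stopped_vms(resources):
    """Return vmids that are genuinely stopped, ignoring templates."""
    return [r["vmid"] for r in resources if effective_status(r) == "stopped"]


RESOURCES = [
    {"vmid": 100, "status": "stopped", "template": 1},
    {"vmid": 3000, "status": "running", "template": 0},
    # Hypothetical stopped VM, not part of the original output.
    {"vmid": 3020, "status": "stopped", "template": 0},
]

for r in RESOURCES:
    print(r["vmid"], effective_status(r))
print("alert on:", stopped_vms(RESOURCES))
```

A shim like this would sit between the API and the alerting rules, so only the monitoring configuration changes and nothing has to be re-applied after a PVE upgrade.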

Good luck

