[SOLVED] Debian CPU hot-unplug error: 400 Parameter verification failed

Superfish1000

I just configured hotplug for CPUs and RAM on one of my VMs and have been testing it. Hotplug for CPU and RAM works, but unplug fails every time with the error:

Code:
Parameter verification failed. (400)

vcpus: hotplug problem - 400 Parameter verification failed. vcpus: error unplugging cpu4


Parameter verification failed. (400)

memory: hotplug problem - 400 Parameter verification failed. dimm9: error unplug memory module

Simultaneously the VM hangs and I lose the ability to interact with the console and SSH.

NUMA is enabled, Hotplug is obviously enabled, VM has hotplug modules added as per wiki.

I've been trying to figure out what this might mean and if it's possible to fix it but so far I haven't even found a thread mentioning this error let alone a way to fix it. Any help would be appreciated. Software versions below.

One thing I've noticed that seems off, though, is that it keeps saying "vcpus: error unplugging cpu4" or "cpu3" when I only have 4 or 3 CPUs in the system. I'm not sure if it says this for human readability or if it's a flaw in the code, because there is no cpu4/cpu3 in the system. CPU numbering starts at cpu0, so if it's actually trying to remove cpu4/cpu3 when there are 4 or 3 CPUs respectively, it will always fail.


Code:
QEMU emulator version 4.1.1 (pve-qemu-kvm_4.1.1)
Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers

pve-manager/6.1-8/806edfe1 (running kernel: 5.3.18-3-pve)

Linux scylla 5.3.18-3-pve #1 SMP PVE 5.3.18-3 (Tue, 17 Mar 2020 16:33:19 +0100) x86_64 GNU/Linux

VM OS:
Linux 4.9.0-12-amd64 #1 SMP Debian 4.9.210-1 (2020-01-20) x86_64 GNU/Linux
 
hi,

could you post:

* qm config VMID

* pveversion -v
 
Code:
root@scylla:~# qm config 104
agent: 1
balloon: 0
bootdisk: scsi0
cores: 8
cpu: qemu64
hotplug: disk,network,usb,memory,cpu
ide2: none,media=cdrom
memory: 6144
name: TCR-Refined
net0: virtio=XX:XX:XX:XX:XX:XX,bridge=vmbr0,tag=10
numa: 1
ostype: l26
parent: BeforeIBreakIt
scsi0: Mnemosyne:104/vm-104-disk-0.qcow2,size=32G
scsihw: virtio-scsi-pci
smbios1: uuid=724c76fe-f920-4142-837a-b659e464df5c
sockets: 1
vcpus: 2
vga: qxl
vmgenid: c2dd2463-e79b-4c98-8db0-f05ac3023467
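
For anyone checking the same settings on their own VM, here is a minimal Python sketch that reads `qm config` style output and summarises the hotplug-related options shown above. It is not part of Proxmox; the function names `parse_qm_config` and `hotplug_summary` are made up for illustration, and the only assumption beyond the listing is that the maximum vCPU count equals sockets × cores.

```python
def parse_qm_config(text):
    """Parse `key: value` lines as printed by `qm config` into a dict."""
    config = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as MAC addresses survive.
        key, sep, value = line.partition(":")
        if sep:
            config[key.strip()] = value.strip()
    return config


def hotplug_summary(config):
    """Summarise the settings relevant to CPU and memory hotplug."""
    features = set(config.get("hotplug", "").split(","))
    # Assumption: vCPUs can be plugged up to sockets * cores.
    max_vcpus = int(config.get("sockets", 1)) * int(config.get("cores", 1))
    vcpus = int(config.get("vcpus", max_vcpus))
    return {
        "cpu_hotplug": "cpu" in features,
        "memory_hotplug": "memory" in features,
        "numa": config.get("numa") == "1",
        "vcpus": vcpus,
        "max_vcpus": max_vcpus,
    }


# Subset of the config posted above.
SAMPLE = """\
cores: 8
hotplug: disk,network,usb,memory,cpu
numa: 1
sockets: 1
vcpus: 2
"""

if __name__ == "__main__":
    print(hotplug_summary(parse_qm_config(SAMPLE)))
```

With the config above this reports CPU and memory hotplug enabled, NUMA on, and 2 of 8 possible vCPUs plugged.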


Code:
root@scylla:~# pveversion -v
proxmox-ve: 6.1-2 (running kernel: 5.3.18-3-pve)
pve-manager: 6.1-8 (running version: 6.1-8/806edfe1)
pve-kernel-helper: 6.1-7
pve-kernel-5.3: 6.1-6
pve-kernel-5.0: 6.0-11
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.3.18-2-pve: 5.3.18-2
pve-kernel-5.0.21-5-pve: 5.0.21-10
pve-kernel-5.0.21-4-pve: 5.0.21-9
pve-kernel-5.0.21-3-pve: 5.0.21-7
pve-kernel-5.0.15-1-pve: 5.0.15-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.3-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 2.0.1-1+pve8
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.15-pve1
libpve-access-control: 6.0-6
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.0-17
libpve-guest-common-perl: 3.0-5
libpve-http-server-perl: 3.0-5
libpve-storage-perl: 6.1-5
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 3.2.1-1
lxcfs: 4.0.1-pve1
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.1-3
pve-cluster: 6.1-4
pve-container: 3.0-23
pve-docs: 6.1-6
pve-edk2-firmware: 2.20200229-1
pve-firewall: 4.0-10
pve-firmware: 3.0-6
pve-ha-manager: 3.0-9
pve-i18n: 2.0-4
pve-qemu-kvm: 4.1.1-4
pve-xtermjs: 4.3.0-1
qemu-server: 6.1-7
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.3-pve1
 
hi,

thank you

we'll try reproducing the error and get back to you if we need more info

maybe you know when it started happening. was it after some change in the config? any details you can give?
 
Unfortunately I have no idea about a possible timeline. I started testing this feature the day before I made the first post. I actually didn't know it was possible to hotplug CPU and RAM on a VM until I spoke with a friend of mine. Since I've found very little information on it other than the documentation, I thought a bug was more likely, since a bug in a rarely used feature is less likely to be found. On top of that, the "error unplugging cpu4" message made me suspicious from a programming standpoint, as there is no "CPU4" in the system.

If you need anything from me to debug let me know. I can even provide the VM disk file if that will help.
 
which iso did you use? it'd be useful if you could link it here.
 
I believe it was this one. Its hash matches the only version I have on my NAS at the moment, and I don't see why I would have deleted any other images.

It has been a few months since I made this VM though. I use it to run a Minecraft server and decided to test hotplug on it because it was already configured, backed up and I would use hotplug on it if I could get it working anyway.

https://cdimage.debian.org/cdimage/archive/9.9.0/amd64/iso-cd/debian-9.9.0-amd64-netinst.iso
 
Hi,

CPU hot-unplug should work 100%.
Do you have any kernel messages in your VM?


Memory hot-unplug does not work reliably (because of Linux kernel memory management). You can simulate hot-unplug with memory ballooning: setting shares=0 will force the memory value to the minimum, so you can adjust the minimum memory to plug/unplug memory (up to the maximum memory).
 
I'm not sure, but hotplug cpu method has changed recently

https://git.proxmox.com/?p=qemu-ser...;hpb=485449e37b3e864523a693c946e08ea46650b927

Code:
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -4504,7 +4504,7 @@ sub qemu_cpu_hotplug {
                my $retry = 0;
                my $currentrunningvcpus = undef;
                while (1) {
-                   $currentrunningvcpus = mon_cmd($vmid, "query-cpus");
+                   $currentrunningvcpus = mon_cmd($vmid, "query-cpus-fast");
                    last if scalar(@{$currentrunningvcpus}) == $i-1;
                    raise_param_exc({ vcpus => "error unplugging cpu$i" }) if $retry > 5;
                    $retry++;

Maybe the new query-cpus-fast method returns a different list of CPUs (vcpus/cores) than the previous method.
That could explain why it tries to unplug cpu3/cpu4?
(The cpu number here is the number of cores currently plugged.)
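
To make the retry logic in the diff easier to follow, here is a simplified Python model of it. This is only a sketch, not the actual Perl code: the names `unplug_order` and `wait_for_unplug` are hypothetical, the loop bounds (counting down from the current vCPU count to the target) are an assumption not shown in the quoted hunk, and any delay between polls is left out.

```python
def unplug_order(current_vcpus, target_vcpus):
    """Device IDs the unplug loop would remove, highest first.

    Assumption: the loop counts i down from the current number of
    plugged vCPUs to target + 1, so i is a 1-based count, not a CPU index.
    """
    return [f"cpu{i}" for i in range(current_vcpus, target_vcpus, -1)]


def wait_for_unplug(i, query_running_vcpus, max_retries=5):
    """Poll until i - 1 vCPUs are reported as running, else raise.

    Mirrors the quoted hunk: stop when the count equals i - 1,
    raise once the retry counter exceeds 5, otherwise retry.
    """
    retry = 0
    while True:
        if query_running_vcpus() == i - 1:
            return
        if retry > max_retries:
            raise RuntimeError(f"error unplugging cpu{i}")
        retry += 1


if __name__ == "__main__":
    print(unplug_order(4, 2))
    try:
        # A guest that never releases the CPU keeps reporting 4 vCPUs.
        wait_for_unplug(4, lambda: 4)
    except RuntimeError as err:
        print(err)
```

Under these assumptions, "cpu4" in the error message is the count of plugged vCPUs at the time of the unplug, which would explain why it appears on a guest whose highest CPU is cpu3.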
 
It was my understanding that hot-unplug worked which is why it surprised me when it didn't. As for kernel messages, I just checked and there are no messages as far as I can tell. The entire OS seems to hang the moment the memory or CPU are unplugged so I'm not surprised.


As for the code, I'm still reading it but depending on what is stored in these variables this might be the faulty line.
I don't know enough about this code to say for sure, but just currentvcpus should be the value in the VM config file which is a 1-starting count.
I'm not sure about what qemu_devicedel will take but if it's looking for a 0-starting count then it will fail.

Code:
qemu_devicedel($vmid, "cpu$i");

"query-cpus-fast" vs "query-cpus" is because the command "query-cpus" is deprecated. I don't know if that could cause an issue somehow but perhaps. The change was made on the 6th of February and if no one else found it yet I could see that being the cause of the issue.

https://git.proxmox.com/?p=qemu-server.git;a=commit;h=65af8c312e685eaf50ca2e6d2f733231873a22c5


EDIT: I just edited the code to adjust the CPU value and it doesn't seem to have any effect, so that doesn't seem to be the issue. I'm going to keep testing.
 
hi,

i was able to reproduce the issue (still going through the code to find where it happens).

i was also able to find a workaround for the meantime, please verify this:
- debian 9.9.0 guest
- i try first 2 cores and 2 sockets, enable numa and cpu hotplugging in the options
- start vm

vm should start normally. if we go to the cpu settings now:
- VCPUs will say 4 but greyed out
- set vcpus to 1

Code:
Parameter verification failed. (400)

vcpus: hotplug problem - 400 Parameter verification failed. vcpus: error unplugging cpu4

comes up.

press OK and stop the vm.

start the vm again.

go to settings and change vcpus, it should work...

while looking for the cause i found this in the journal:

Code:
May 06 17:00:33 solar-jupiter pvedaemon[1049]: <root@pam> update VM 10000: -cores 2 -delete cpulimit,cpuunits,cpu -numa 1 -sockets 2 -vcpus 1
May 06 17:00:33 solar-jupiter pvedaemon[1049]: cannot delete 'cpulimit' - not set in current configuration!
May 06 17:00:33 solar-jupiter pvedaemon[1049]: cannot delete 'cpuunits' - not set in current configuration!
May 06 17:00:33 solar-jupiter pvedaemon[1049]: cannot delete 'cpu' - not set in current configuration!

seems like cpulimit,cpuunits,cpu options are being deleted in each update of vcpus (so the errors can be a symptom or a cause)

let me know if the workaround solves the issue for now, it should be patched soon with the new versions
 
also i got some kernel messages in the vm right after 'Parameter verification failed'. could you check that as well, and post it here?
 
I'm not sure exactly what you mean by "press OK and stop the vm, start the vm again, then go to settings and change vcpus".
I followed what you said and the VM froze after I got the error. When I stopped it and started it again the VM only had 1 vCPU like I had told it to set itself to.

This is basically the same as shutting the VM down and adding or removing cores though which I could do before.

As for the errors, I stupidly didn't even think to check it before but I'm getting the same errors in the syslog. Knowing this now I have been playing with config settings.

I changed the CPU configuration from the dual socket 2 core config you mentioned to a single socket 8 core and started with 2 vCPUs.
Next, I added a vCPU to make sure that the change was detected in the VM.

Finally, I went to remove a vCPU but ALSO changed the value of cpulimit and cpuunits. This resulted in this error.

Code:
May 06 18:00:46 scylla pvedaemon[8856]: <root@pam> update VM 122: -cores 8 -cpulimit 128 -cpuunits 1023 -delete cpu -numa 1 -sockets 1 -vcpus 2
May 06 18:00:46 scylla pvedaemon[8856]: cannot delete 'cpu' - not set in current configuration!

If I do not change the value of cpuunits from 1024 to something else then it will give me the error

Code:
May 06 18:07:02 scylla pvedaemon[8856]: <root@pam> update VM 122: -cores 8 -cpulimit 128 -delete cpuunits,cpu -numa 1 -sockets 1 -vcpus 2
May 06 18:07:02 scylla pvedaemon[8856]: cannot delete 'cpuunits' - not set in current configuration!
May 06 18:07:02 scylla pvedaemon[8856]: cannot delete 'cpu' - not set in current configuration!

Next I tried changing the CPU to qemu64 instead of kvm64 which adds the 'cpu' value to the config file. This allowed me to clear all of the errors from the log.

Code:
May 06 18:15:22 scylla pvedaemon[15871]: <root@pam> update VM 122: -cores 8 -cpu qemu64 -cpulimit 128 -cpuunits 1023 -numa 1 -sockets 1 -vcpus 3
May 06 18:15:44 scylla pvedaemon[15871]: <root@pam> update VM 122: -cores 8 -cpu qemu64 -cpulimit 128 -cpuunits 1023 -numa 1 -sockets 1 -vcpus 2

Unfortunately, the error is still happening. Whatever is actually causing the issue I don't believe it is directly related to the config file errors as even once they are gone it still errors out.
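
In case it helps anyone compare their own journal entries, here is a small Python sketch that splits a pvedaemon "update VM" line like the ones above into the options being set and the options being deleted. The helper name `parse_update` is hypothetical, and it assumes every `-flag` is followed by exactly one value, which holds for the lines in this thread but may not hold in general.

```python
import re

# Matches the "update VM <id>: <args>" part of a pvedaemon journal line.
UPDATE_RE = re.compile(r"update VM (\d+): (.*)$")


def parse_update(line):
    """Return (vmid, options, deleted) for an 'update VM' line, else None."""
    match = UPDATE_RE.search(line)
    if match is None:
        return None
    args = match.group(2).split()
    # Assumption: arguments strictly alternate between -flag and value.
    options = {flag.lstrip("-"): value for flag, value in zip(args[0::2], args[1::2])}
    # The -delete flag carries a comma-separated list of option names.
    deleted = [name for name in options.pop("delete", "").split(",") if name]
    return int(match.group(1)), options, deleted


if __name__ == "__main__":
    line = (
        "May 06 18:07:02 scylla pvedaemon[8856]: <root@pam> update VM 122: "
        "-cores 8 -cpulimit 128 -delete cpuunits,cpu -numa 1 -sockets 1 -vcpus 2"
    )
    print(parse_update(line))
```

Listing the `deleted` names next to the "cannot delete ... not set in current configuration!" messages makes it easy to see that each warning corresponds to one entry in the `-delete` list.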
 
yes, and then you should be able to change the amount of vcpus after the vm starts running, and hotplug should work (at least here it does)

VM freezing sounds like there has to be a kernel panic or similar.

please check /var/log/syslog in the vm.

also switch to a console with ctrl-alt-f[1-9] and then try triggering the error. this will show us any kernel messages in the vm
 
[Attachment: TEMP.jpg]

I'm not getting any errors in the syslog and the console froze just like the desktop.

When I do what you described, the VM starts with 1 vCPU and I can add more, but I still can't remove them. The VM hangs if I try to remove a CPU at any point after this. I tested this on a second server just in case, and the issue still happens there.

I'm fairly sure I have everything right but I've never tested this before and have thus never had it working, so it's definitely possible I made a mistake somewhere.

I created a file under: /lib/udev/rules.d/80-hotplug-cpu-mem.rules
containing
Code:
SUBSYSTEM=="cpu", ACTION=="add", TEST=="online", ATTR{online}=="0", ATTR{online}="1"
I added the kernel boot parameter
Code:
memhp_default_state=online
And I added
Code:
CONFIG_MOVABLE_NODE=YES
and
Code:
movable_node
as kernel config and kernel boot parameters respectively.

This is all I saw mentioned in the Wiki. Is there anything else I might be missing VM config wise?

Though it was not mentioned in the wiki I also installed the QEMU guest agent just in case it was required.
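
For reference, the udev rule above brings a newly added CPU online automatically. A rough manual equivalent, as a Python sketch, is below. It assumes the standard Linux sysfs layout where each hot-pluggable CPU has a `cpuN/online` file; the function name `online_all_cpus` is made up, and it needs root inside the guest to actually change anything.

```python
from pathlib import Path


def online_all_cpus(sysfs_cpu_dir="/sys/devices/system/cpu"):
    """Write 1 to every cpuN/online file that currently reads 0.

    Returns the names of the CPUs that were switched online.
    CPUs without an 'online' file (often cpu0) are skipped.
    """
    changed = []
    for online in sorted(Path(sysfs_cpu_dir).glob("cpu[0-9]*/online")):
        if online.read_text().strip() == "0":
            online.write_text("1")
            changed.append(online.parent.name)
    return changed


# Example (run as root inside the guest):
#     print(online_all_cpus())
```

The directory is a parameter only so the function can be tried against a scratch directory before pointing it at the real sysfs tree.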
 
okay it seems like the kernel of debian 9.9 doesn't fully support cpu hotplugging. i was able to get it working on a stock debian buster (newer kernel) and it worked fine. try that and let me know (debian 10.2 to be specific)
 
It seems like that is the answer.

Wow, now I feel kind of dumb. I updated my VM to Debian 10 and the issue is completely gone. Proxmox still logs errors if I remove the cpu, cpuunits or cpulimit fields, but other than that the issue is gone. Memory hot-unplug is working flawlessly now too.
I'm super glad it works now. I can definitely use this feature.

Thanks so much for your help.
 
glad to be of help!

please mark the thread as [SOLVED] so others know what to expect
 
