Hotplug memory limits total memory to 44 GB

wosp
Member · Apr 18, 2015 · The Netherlands
Situation:

The physical host has 2 CPU sockets, each with a 6-core CPU and HT enabled. So 2 x 6 x 2 = 24 vCPUs.
Each CPU has 48 GB of memory installed in a dual-channel configuration:

[ 16 GB ] [ 8 GB ]
[ 16 GB ] [ 8 GB ]

So, total system memory is 96 GB. Host is running PVE 4.3-9.

When I have a VM with the following configuration:

Code:
boot: dc
bootdisk: scsi0
cores: 12
hotplug: disk,network,usb,memory,cpu
ide2: none,media=cdrom
memory: 46080
name: test
net0: virtio=96:74:4E:9F:19:6D,bridge=vmbr0208
numa: 1
onboot: 1
ostype: l26
protection: 1
scsi0: SSD-cluster:vm-141-disk-1,discard=on,size=1G
scsihw: virtio-scsi-single
smbios1: uuid=403224b4-698c-4bde-8823-4113f6f23844
sockets: 1
vcpus: 2

The VM can't boot. Error:

kvm: -device pc-dimm,id=dimm59,memdev=mem-dimm59,node=0: a used vhost backend has no free memory slots left
TASK ERROR: start failed: command '/usr/bin/kvm -id 141 -chardev 'socket,id=qmp,path=/var/run/qemu-server/141.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -pidfile /var/run/qemu-server/141.pid -daemonize -smbios 'type=1,uuid=403224b4-698c-4bde-8823-4113f6f23844' -name test -smp '2,sockets=1,cores=12,maxcpus=12' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vga cirrus -vnc unix:/var/run/qemu-server/141.vnc,x509,password -cpu kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,enforce -m 'size=1024,slots=255,maxmem=4194304M' -object 'memory-backend-ram,id=ram-node0,size=1024M' -numa 'node,nodeid=0,cpus=0-11,memdev=ram-node0' -object 'memory-backend-ram,id=mem-dimm0,size=512M' -device 'pc-dimm,id=dimm0,memdev=mem-dimm0,node=0' -object 'memory-backend-ram,id=mem-dimm1,size=512M' -device 'pc-dimm,id=dimm1,memdev=mem-dimm1,node=0' -object 'memory-backend-ram,id=mem-dimm2,size=512M' -device 'pc-dimm,id=dimm2,memdev=mem-dimm2,node=0' -object 'memory-backend-ram,id=mem-dimm3,size=512M' -device 'pc-dimm,id=dimm3,memdev=mem-dimm3,node=0' -object 'memory-backend-ram,id=mem-dimm4,size=512M' -device 'pc-dimm,id=dimm4,memdev=mem-dimm4,node=0' -object 'memory-backend-ram,id=mem-dimm5,size=512M' -device 'pc-dimm,id=dimm5,memdev=mem-dimm5,node=0' -object 'memory-backend-ram,id=mem-dimm6,size=512M' -device 'pc-dimm,id=dimm6,memdev=mem-dimm6,node=0' -object 'memory-backend-ram,id=mem-dimm7,size=512M' -device 'pc-dimm,id=dimm7,memdev=mem-dimm7,node=0' -object 'memory-backend-ram,id=mem-dimm8,size=512M' -device 'pc-dimm,id=dimm8,memdev=mem-dimm8,node=0' -object 'memory-backend-ram,id=mem-dimm9,size=512M' -device 'pc-dimm,id=dimm9,memdev=mem-dimm9,node=0' -object 'memory-backend-ram,id=mem-dimm10,size=512M' -device 'pc-dimm,id=dimm10,memdev=mem-dimm10,node=0' -object 'memory-backend-ram,id=mem-dimm11,size=512M' -device 'pc-dimm,id=dimm11,memdev=mem-dimm11,node=0' -object 'memory-backend-ram,id=mem-dimm12,size=512M' -device 'pc-dimm,id=dimm12,memdev=mem-dimm12,node=0' -object 'memory-backend-ram,id=mem-dimm13,size=512M' -device 'pc-dimm,id=dimm13,memdev=mem-dimm13,node=0' -object 'memory-backend-ram,id=mem-dimm14,size=512M' -device 'pc-dimm,id=dimm14,memdev=mem-dimm14,node=0' -object 'memory-backend-ram,id=mem-dimm15,size=512M' -device 'pc-dimm,id=dimm15,memdev=mem-dimm15,node=0' -object 'memory-backend-ram,id=mem-dimm16,size=512M' -device 'pc-dimm,id=dimm16,memdev=mem-dimm16,node=0' -object 'memory-backend-ram,id=mem-dimm17,size=512M' -device 'pc-dimm,id=dimm17,memdev=mem-dimm17,node=0' -object 'memory-backend-ram,id=mem-dimm18,size=512M' -device 'pc-dimm,id=dimm18,memdev=mem-dimm18,node=0' -object 'memory-backend-ram,id=mem-dimm19,size=512M' -device 'pc-dimm,id=dimm19,memdev=mem-dimm19,node=0' -object 'memory-backend-ram,id=mem-dimm20,size=512M' -device 'pc-dimm,id=dimm20,memdev=mem-dimm20,node=0' -object 'memory-backend-ram,id=mem-dimm21,size=512M' -device 'pc-dimm,id=dimm21,memdev=mem-dimm21,node=0' -object 'memory-backend-ram,id=mem-dimm22,size=512M' -device 'pc-dimm,id=dimm22,memdev=mem-dimm22,node=0' -object 'memory-backend-ram,id=mem-dimm23,size=512M' -device 'pc-dimm,id=dimm23,memdev=mem-dimm23,node=0' -object 'memory-backend-ram,id=mem-dimm24,size=512M' -device 'pc-dimm,id=dimm24,memdev=mem-dimm24,node=0' -object 'memory-backend-ram,id=mem-dimm25,size=512M' -device 'pc-dimm,id=dimm25,memdev=mem-dimm25,node=0' -object 'memory-backend-ram,id=mem-dimm26,size=512M' -device 'pc-dimm,id=dimm26,memdev=mem-dimm26,node=0' -object 
'memory-backend-ram,id=mem-dimm27,size=512M' -device 'pc-dimm,id=dimm27,memdev=mem-dimm27,node=0' -object 'memory-backend-ram,id=mem-dimm28,size=512M' -device 'pc-dimm,id=dimm28,memdev=mem-dimm28,node=0' -object 'memory-backend-ram,id=mem-dimm29,size=512M' -device 'pc-dimm,id=dimm29,memdev=mem-dimm29,node=0' -object 'memory-backend-ram,id=mem-dimm30,size=512M' -device 'pc-dimm,id=dimm30,memdev=mem-dimm30,node=0' -object 'memory-backend-ram,id=mem-dimm31,size=512M' -device 'pc-dimm,id=dimm31,memdev=mem-dimm31,node=0' -object 'memory-backend-ram,id=mem-dimm32,size=1024M' -device 'pc-dimm,id=dimm32,memdev=mem-dimm32,node=0' -object 'memory-backend-ram,id=mem-dimm33,size=1024M' -device 'pc-dimm,id=dimm33,memdev=mem-dimm33,node=0' -object 'memory-backend-ram,id=mem-dimm34,size=1024M' -device 'pc-dimm,id=dimm34,memdev=mem-dimm34,node=0' -object 'memory-backend-ram,id=mem-dimm35,size=1024M' -device 'pc-dimm,id=dimm35,memdev=mem-dimm35,node=0' -object 'memory-backend-ram,id=mem-dimm36,size=1024M' -device 'pc-dimm,id=dimm36,memdev=mem-dimm36,node=0' -object 'memory-backend-ram,id=mem-dimm37,size=1024M' -device 'pc-dimm,id=dimm37,memdev=mem-dimm37,node=0' -object 'memory-backend-ram,id=mem-dimm38,size=1024M' -device 'pc-dimm,id=dimm38,memdev=mem-dimm38,node=0' -object 'memory-backend-ram,id=mem-dimm39,size=1024M' -device 'pc-dimm,id=dimm39,memdev=mem-dimm39,node=0' -object 'memory-backend-ram,id=mem-dimm40,size=1024M' -device 'pc-dimm,id=dimm40,memdev=mem-dimm40,node=0' -object 'memory-backend-ram,id=mem-dimm41,size=1024M' -device 'pc-dimm,id=dimm41,memdev=mem-dimm41,node=0' -object 'memory-backend-ram,id=mem-dimm42,size=1024M' -device 'pc-dimm,id=dimm42,memdev=mem-dimm42,node=0' -object 'memory-backend-ram,id=mem-dimm43,size=1024M' -device 'pc-dimm,id=dimm43,memdev=mem-dimm43,node=0' -object 'memory-backend-ram,id=mem-dimm44,size=1024M' -device 'pc-dimm,id=dimm44,memdev=mem-dimm44,node=0' -object 'memory-backend-ram,id=mem-dimm45,size=1024M' -device 'pc-dimm,id=dimm45,memdev=mem-dimm45,node=0' -object 'memory-backend-ram,id=mem-dimm46,size=1024M' -device 'pc-dimm,id=dimm46,memdev=mem-dimm46,node=0' -object 'memory-backend-ram,id=mem-dimm47,size=1024M' -device 'pc-dimm,id=dimm47,memdev=mem-dimm47,node=0' -object 'memory-backend-ram,id=mem-dimm48,size=1024M' -device 'pc-dimm,id=dimm48,memdev=mem-dimm48,node=0' -object 'memory-backend-ram,id=mem-dimm49,size=1024M' -device 'pc-dimm,id=dimm49,memdev=mem-dimm49,node=0' -object 'memory-backend-ram,id=mem-dimm50,size=1024M' -device 'pc-dimm,id=dimm50,memdev=mem-dimm50,node=0' -object 'memory-backend-ram,id=mem-dimm51,size=1024M' -device 'pc-dimm,id=dimm51,memdev=mem-dimm51,node=0' -object 'memory-backend-ram,id=mem-dimm52,size=1024M' -device 'pc-dimm,id=dimm52,memdev=mem-dimm52,node=0' -object 'memory-backend-ram,id=mem-dimm53,size=1024M' -device 'pc-dimm,id=dimm53,memdev=mem-dimm53,node=0' -object 'memory-backend-ram,id=mem-dimm54,size=1024M' -device 'pc-dimm,id=dimm54,memdev=mem-dimm54,node=0' -object 'memory-backend-ram,id=mem-dimm55,size=1024M' -device 'pc-dimm,id=dimm55,memdev=mem-dimm55,node=0' -object 'memory-backend-ram,id=mem-dimm56,size=1024M' -device 'pc-dimm,id=dimm56,memdev=mem-dimm56,node=0' -object 'memory-backend-ram,id=mem-dimm57,size=1024M' -device 'pc-dimm,id=dimm57,memdev=mem-dimm57,node=0' -object 'memory-backend-ram,id=mem-dimm58,size=1024M' -device 'pc-dimm,id=dimm58,memdev=mem-dimm58,node=0' -object 'memory-backend-ram,id=mem-dimm59,size=1024M' -device 'pc-dimm,id=dimm59,memdev=mem-dimm59,node=0' -k en-us -device 
'pci-bridge,id=pci.3,chassis_nr=3,bus=pci.0,addr=0x5' -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:5bee6a0b193' -drive 'if=none,id=drive-ide2,media=cdrom,aio=threads' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=100' -device 'virtio-scsi-pci,id=virtioscsi0,bus=pci.3,addr=0x1' -drive 'file=rbd:cl1/vm-141-disk-1:mon_host=192.168.110.131\:6789;192.168.110.133\:6789;192.168.110.135\:6789:id=admin:auth_supported=cephx:keyring=/etc/pve/priv/ceph/SSD-cluster.keyring,if=none,id=drive-scsi0,discard=on,format=raw,cache=none,aio=native,detect-zeroes=unmap' -device 'scsi-hd,bus=virtioscsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=200' -netdev 'type=tap,id=net0,ifname=tap141i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=96:74:4E:9F:19:6D,netdev=net0,bus=pci.0,addr=0x12,id=net0'' failed: exit code 1

With 44 GB of memory (45056 MB) the VM boots without any problem. So I thought it might have something to do with the 48 GB installed per physical CPU, since I had assigned only 1 socket with 12 cores to the VM. So I changed it to 2 sockets with 12 cores each (24 cores in total): same problem/error. When I disable NUMA and memory hotplug support, I can boot the VM with 45 GB, or even with 80 GB, with only 1 socket with 12 cores assigned. So it doesn't seem to be related to the amount of memory per physical CPU.
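For completeness, the tests above expressed as qm commands (just a sketch that mirrors what I described; adjust the VMID and values as needed):

Code:
qm set 141 --memory 45056                           # 44 GB -> boots fine
qm set 141 --memory 46080                           # 45 GB -> fails with the error above
qm set 141 --sockets 2 --cores 12                   # 2 sockets x 12 cores -> same error
qm set 141 --numa 0 --hotplug disk,network,usb,cpu  # NUMA off, memory hotplug off
qm set 141 --memory 81920                           # 80 GB -> boots fine
qm start 141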

Can anyone explain this behavior to me, or is this a bug?
 

spirit
Famous Member · Apr 2, 2010 · www.groupe-cyllene.com
Hmm, it seems that QEMU limits the number of pc-dimm devices since this commit:

https://lists.nongnu.org/archive/html/qemu-devel/2015-07/msg05998.html

cat /sys/module/vhost/parameters/max_mem_regions = 64

When the memory hotplug feature was introduced in Proxmox, this check didn't exist.

Proxmox hotplugs virtual memory DIMMs, from 512 MB up to X GB, then 1024 MB up to X GB, and so on.
That works fine up to 4 TB, but with this new limit of 64 regions it's a problem.
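You can see the layout in the kvm command line above: 1024 MB of static memory (-m size=1024), dimm0-dimm31 at 512 MB each, and dimm32 onward at 1024 MB each. A rough back-of-the-envelope check of the 44 GB ceiling, assuming a few of the 64 vhost regions are used for things other than the hotplug DIMMs:

Code:
# 1024 MB static + 32 x 512 MB DIMMs + 27 x 1024 MB DIMMs
echo $(( 1024 + 32*512 + 27*1024 ))   # 45056 MB = 44 GB, the maximum that still boots
# the next 1024 MB DIMM (dimm59 in the error above) is the one that gets refused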

I'll send a mail to the pve-devel mailing list.
 

spirit
Famous Member · Apr 2, 2010 · www.groupe-cyllene.com
Can you try adding a file

/etc/modprobe.d/vhost.conf
with content:

options vhost max_mem_regions=509

then reboot,

do a
"cat /sys/module/vhost/parameters/max_mem_regions"

to verify

and if it's OK, try to add more memory to the VM.
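For convenience, the same steps as a single copy-paste block (run as root on the PVE host; this is just the procedure above, nothing extra):

Code:
echo 'options vhost max_mem_regions=509' > /etc/modprobe.d/vhost.conf
reboot
# after the reboot:
cat /sys/module/vhost/parameters/max_mem_regions   # should now print 509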
 

wosp
Member · Apr 18, 2015 · The Netherlands
Okay, it works for me. I have tested with up to 91 GB and the VM now boots without any problem.
When I move the VM to a host without this change applied, it will not boot. So this seems to be the solution. :)
 

udo
Famous Member · Apr 22, 2009 · Ahrensburg, Germany
spirit said:
I would like to know if it's working for you. (I haven't tested it yet; I need to find a spare server.) If it's working, yes, it'll be applied by default in PVE.
Hi Spirit,
it looks like the issue is still (or again?) present on current PVE nodes:
Code:
pveversion
pve-manager/6.2-11/22fb4983 (running kernel: 5.4.55-1-pve)

qm start 436
kvm: -device pc-dimm,id=dimm61,memdev=mem-dimm61,node=1: a used vhost backend has no free memory slots left
start failed: QEMU exited with code 1
Tried to start a 48 GB VM. After applying vhost.conf and rebooting, the VM starts.

Udo
 

spirit
Famous Member · Apr 2, 2010 · www.groupe-cyllene.com
udo said:
Hi Spirit,
it looks like the issue is still (or again?) present on current PVE nodes:
Code:
pveversion
pve-manager/6.2-11/22fb4983 (running kernel: 5.4.55-1-pve)

qm start 436
kvm: -device pc-dimm,id=dimm61,memdev=mem-dimm61,node=1: a used vhost backend has no free memory slots left
start failed: QEMU exited with code 1
Tried to start a 48 GB VM. After applying vhost.conf and rebooting, the VM starts.

Udo
Hi Udo, I just hit this bug again when trying to hotplug memory. It seems we forgot to push the vhost option by default. I'll look into whether we can bump it in the kernel directly.
 

alex.tls
New Member · Oct 17, 2020
Hi Spirit,

this issue (or bug) still exists in Proxmox Virtual Environment 6.4-9.
Just to let you know.

BR
ALEX
 

VLD
New Member · Aug 9, 2021
I just ran into this with PVE 6.4-13. The vhost.conf fix resolved it for me.
 
