Hotplug memory limits total memory to 44GB

wosp

Situation:

The physical host has 2 CPU sockets, each with 6 cores and HT enabled. So, 2 x 6 x 2 = 24 vCPUs.
Each CPU has 48 GB of memory installed in a dual-channel configuration:

[ 16 GB ] [ 8 GB ]
[ 16 GB ] [ 8 GB ]

So, total system memory is 96 GB. Host is running PVE 4.3-9.

When I have a VM with the following configuration:

Code:
boot: dc
bootdisk: scsi0
cores: 12
hotplug: disk,network,usb,memory,cpu
ide2: none,media=cdrom
memory: 46080
name: test
net0: virtio=96:74:4E:9F:19:6D,bridge=vmbr0208
numa: 1
onboot: 1
ostype: l26
protection: 1
scsi0: SSD-cluster:vm-141-disk-1,discard=on,size=1G
scsihw: virtio-scsi-single
smbios1: uuid=403224b4-698c-4bde-8823-4113f6f23844
sockets: 1
vcpus: 2

The VM can't boot. Error:

Code:
kvm: -device pc-dimm,id=dimm59,memdev=mem-dimm59,node=0: a used vhost backend has no free memory slots left
TASK ERROR: start failed: command '/usr/bin/kvm -id 141 -chardev 'socket,id=qmp,path=/var/run/qemu-server/141.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -pidfile /var/run/qemu-server/141.pid -daemonize -smbios 'type=1,uuid=403224b4-698c-4bde-8823-4113f6f23844' -name test -smp '2,sockets=1,cores=12,maxcpus=12' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vga cirrus -vnc unix:/var/run/qemu-server/141.vnc,x509,password -cpu kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,enforce -m 'size=1024,slots=255,maxmem=4194304M' -object 'memory-backend-ram,id=ram-node0,size=1024M' -numa 'node,nodeid=0,cpus=0-11,memdev=ram-node0' -object 'memory-backend-ram,id=mem-dimm0,size=512M' -device 'pc-dimm,id=dimm0,memdev=mem-dimm0,node=0' -object 'memory-backend-ram,id=mem-dimm1,size=512M' -device 'pc-dimm,id=dimm1,memdev=mem-dimm1,node=0' -object 'memory-backend-ram,id=mem-dimm2,size=512M' -device 'pc-dimm,id=dimm2,memdev=mem-dimm2,node=0' -object 'memory-backend-ram,id=mem-dimm3,size=512M' -device 'pc-dimm,id=dimm3,memdev=mem-dimm3,node=0' -object 'memory-backend-ram,id=mem-dimm4,size=512M' -device 'pc-dimm,id=dimm4,memdev=mem-dimm4,node=0' -object 'memory-backend-ram,id=mem-dimm5,size=512M' -device 'pc-dimm,id=dimm5,memdev=mem-dimm5,node=0' -object 'memory-backend-ram,id=mem-dimm6,size=512M' -device 'pc-dimm,id=dimm6,memdev=mem-dimm6,node=0' -object 'memory-backend-ram,id=mem-dimm7,size=512M' -device 'pc-dimm,id=dimm7,memdev=mem-dimm7,node=0' -object 'memory-backend-ram,id=mem-dimm8,size=512M' -device 'pc-dimm,id=dimm8,memdev=mem-dimm8,node=0' -object 'memory-backend-ram,id=mem-dimm9,size=512M' -device 'pc-dimm,id=dimm9,memdev=mem-dimm9,node=0' -object 'memory-backend-ram,id=mem-dimm10,size=512M' -device 'pc-dimm,id=dimm10,memdev=mem-dimm10,node=0' -object 'memory-backend-ram,id=mem-dimm11,size=512M' -device 'pc-dimm,id=dimm11,memdev=mem-dimm11,node=0' -object 'memory-backend-ram,id=mem-dimm12,size=512M' -device 'pc-dimm,id=dimm12,memdev=mem-dimm12,node=0' -object 'memory-backend-ram,id=mem-dimm13,size=512M' -device 'pc-dimm,id=dimm13,memdev=mem-dimm13,node=0' -object 'memory-backend-ram,id=mem-dimm14,size=512M' -device 'pc-dimm,id=dimm14,memdev=mem-dimm14,node=0' -object 'memory-backend-ram,id=mem-dimm15,size=512M' -device 'pc-dimm,id=dimm15,memdev=mem-dimm15,node=0' -object 'memory-backend-ram,id=mem-dimm16,size=512M' -device 'pc-dimm,id=dimm16,memdev=mem-dimm16,node=0' -object 'memory-backend-ram,id=mem-dimm17,size=512M' -device 'pc-dimm,id=dimm17,memdev=mem-dimm17,node=0' -object 'memory-backend-ram,id=mem-dimm18,size=512M' -device 'pc-dimm,id=dimm18,memdev=mem-dimm18,node=0' -object 'memory-backend-ram,id=mem-dimm19,size=512M' -device 'pc-dimm,id=dimm19,memdev=mem-dimm19,node=0' -object 'memory-backend-ram,id=mem-dimm20,size=512M' -device 'pc-dimm,id=dimm20,memdev=mem-dimm20,node=0' -object 'memory-backend-ram,id=mem-dimm21,size=512M' -device 'pc-dimm,id=dimm21,memdev=mem-dimm21,node=0' -object 'memory-backend-ram,id=mem-dimm22,size=512M' -device 'pc-dimm,id=dimm22,memdev=mem-dimm22,node=0' -object 'memory-backend-ram,id=mem-dimm23,size=512M' -device 'pc-dimm,id=dimm23,memdev=mem-dimm23,node=0' -object 'memory-backend-ram,id=mem-dimm24,size=512M' -device 'pc-dimm,id=dimm24,memdev=mem-dimm24,node=0' -object 'memory-backend-ram,id=mem-dimm25,size=512M' -device 'pc-dimm,id=dimm25,memdev=mem-dimm25,node=0' -object 'memory-backend-ram,id=mem-dimm26,size=512M' -device 'pc-dimm,id=dimm26,memdev=mem-dimm26,node=0' -object 
'memory-backend-ram,id=mem-dimm27,size=512M' -device 'pc-dimm,id=dimm27,memdev=mem-dimm27,node=0' -object 'memory-backend-ram,id=mem-dimm28,size=512M' -device 'pc-dimm,id=dimm28,memdev=mem-dimm28,node=0' -object 'memory-backend-ram,id=mem-dimm29,size=512M' -device 'pc-dimm,id=dimm29,memdev=mem-dimm29,node=0' -object 'memory-backend-ram,id=mem-dimm30,size=512M' -device 'pc-dimm,id=dimm30,memdev=mem-dimm30,node=0' -object 'memory-backend-ram,id=mem-dimm31,size=512M' -device 'pc-dimm,id=dimm31,memdev=mem-dimm31,node=0' -object 'memory-backend-ram,id=mem-dimm32,size=1024M' -device 'pc-dimm,id=dimm32,memdev=mem-dimm32,node=0' -object 'memory-backend-ram,id=mem-dimm33,size=1024M' -device 'pc-dimm,id=dimm33,memdev=mem-dimm33,node=0' -object 'memory-backend-ram,id=mem-dimm34,size=1024M' -device 'pc-dimm,id=dimm34,memdev=mem-dimm34,node=0' -object 'memory-backend-ram,id=mem-dimm35,size=1024M' -device 'pc-dimm,id=dimm35,memdev=mem-dimm35,node=0' -object 'memory-backend-ram,id=mem-dimm36,size=1024M' -device 'pc-dimm,id=dimm36,memdev=mem-dimm36,node=0' -object 'memory-backend-ram,id=mem-dimm37,size=1024M' -device 'pc-dimm,id=dimm37,memdev=mem-dimm37,node=0' -object 'memory-backend-ram,id=mem-dimm38,size=1024M' -device 'pc-dimm,id=dimm38,memdev=mem-dimm38,node=0' -object 'memory-backend-ram,id=mem-dimm39,size=1024M' -device 'pc-dimm,id=dimm39,memdev=mem-dimm39,node=0' -object 'memory-backend-ram,id=mem-dimm40,size=1024M' -device 'pc-dimm,id=dimm40,memdev=mem-dimm40,node=0' -object 'memory-backend-ram,id=mem-dimm41,size=1024M' -device 'pc-dimm,id=dimm41,memdev=mem-dimm41,node=0' -object 'memory-backend-ram,id=mem-dimm42,size=1024M' -device 'pc-dimm,id=dimm42,memdev=mem-dimm42,node=0' -object 'memory-backend-ram,id=mem-dimm43,size=1024M' -device 'pc-dimm,id=dimm43,memdev=mem-dimm43,node=0' -object 'memory-backend-ram,id=mem-dimm44,size=1024M' -device 'pc-dimm,id=dimm44,memdev=mem-dimm44,node=0' -object 'memory-backend-ram,id=mem-dimm45,size=1024M' -device 'pc-dimm,id=dimm45,memdev=mem-dimm45,node=0' -object 'memory-backend-ram,id=mem-dimm46,size=1024M' -device 'pc-dimm,id=dimm46,memdev=mem-dimm46,node=0' -object 'memory-backend-ram,id=mem-dimm47,size=1024M' -device 'pc-dimm,id=dimm47,memdev=mem-dimm47,node=0' -object 'memory-backend-ram,id=mem-dimm48,size=1024M' -device 'pc-dimm,id=dimm48,memdev=mem-dimm48,node=0' -object 'memory-backend-ram,id=mem-dimm49,size=1024M' -device 'pc-dimm,id=dimm49,memdev=mem-dimm49,node=0' -object 'memory-backend-ram,id=mem-dimm50,size=1024M' -device 'pc-dimm,id=dimm50,memdev=mem-dimm50,node=0' -object 'memory-backend-ram,id=mem-dimm51,size=1024M' -device 'pc-dimm,id=dimm51,memdev=mem-dimm51,node=0' -object 'memory-backend-ram,id=mem-dimm52,size=1024M' -device 'pc-dimm,id=dimm52,memdev=mem-dimm52,node=0' -object 'memory-backend-ram,id=mem-dimm53,size=1024M' -device 'pc-dimm,id=dimm53,memdev=mem-dimm53,node=0' -object 'memory-backend-ram,id=mem-dimm54,size=1024M' -device 'pc-dimm,id=dimm54,memdev=mem-dimm54,node=0' -object 'memory-backend-ram,id=mem-dimm55,size=1024M' -device 'pc-dimm,id=dimm55,memdev=mem-dimm55,node=0' -object 'memory-backend-ram,id=mem-dimm56,size=1024M' -device 'pc-dimm,id=dimm56,memdev=mem-dimm56,node=0' -object 'memory-backend-ram,id=mem-dimm57,size=1024M' -device 'pc-dimm,id=dimm57,memdev=mem-dimm57,node=0' -object 'memory-backend-ram,id=mem-dimm58,size=1024M' -device 'pc-dimm,id=dimm58,memdev=mem-dimm58,node=0' -object 'memory-backend-ram,id=mem-dimm59,size=1024M' -device 'pc-dimm,id=dimm59,memdev=mem-dimm59,node=0' -k en-us -device 
'pci-bridge,id=pci.3,chassis_nr=3,bus=pci.0,addr=0x5' -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:5bee6a0b193' -drive 'if=none,id=drive-ide2,media=cdrom,aio=threads' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=100' -device 'virtio-scsi-pci,id=virtioscsi0,bus=pci.3,addr=0x1' -drive 'file=rbd:cl1/vm-141-disk-1:mon_host=192.168.110.131\:6789;192.168.110.133\:6789;192.168.110.135\:6789:id=admin:auth_supported=cephx:keyring=/etc/pve/priv/ceph/SSD-cluster.keyring,if=none,id=drive-scsi0,discard=on,format=raw,cache=none,aio=native,detect-zeroes=unmap' -device 'scsi-hd,bus=virtioscsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=200' -netdev 'type=tap,id=net0,ifname=tap141i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=96:74:4E:9F:19:6D,netdev=net0,bus=pci.0,addr=0x12,id=net0'' failed: exit code 1

With 44 GB of memory (45056 MB) the VM boots without any problem. So I thought it possibly had something to do with the 48 GB installed per physical CPU, since I had assigned only 1 socket with 12 cores to the VM. So I changed to 2 sockets with 12 cores each (24 cores in total): same problem/error. When I disable NUMA and memory hotplug support, I can boot the VM with 45 GB, or even 80 GB, with only 1 socket and 12 cores assigned. So it doesn't seem to have anything to do with the memory per CPU.

Can anyone explain this behavior to me, or is this a bug?
 
Hmm, it seems that QEMU limits the number of pc-dimm devices since this commit:

https://lists.nongnu.org/archive/html/qemu-devel/2015-07/msg05998.html

cat /sys/module/vhost/parameters/max_mem_regions = 64

When the memory hotplug feature was introduced in Proxmox, this check didn't exist yet.

Proxmox hotplugs virtual memory DIMMs: 512M modules up to a certain total, then 1024M modules up to the next threshold, and so on. That works fine up to 4 TB, but with this new limit of 64 memory regions it's a problem.
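
Rough arithmetic (a sketch, based on the DIMM sizes visible in the kvm command above: a 1024M static memory backend, then 32 x 512M DIMMs, then 1024M DIMMs):

Code:
# memory: 46080 (45 GB) vs. 45056 (44 GB), plain shell arithmetic
echo $(( 1024 + 32*512 ))           # 17408 MB covered by static RAM + dimm0..dimm31
echo $(( (46080 - 17408) / 1024 ))  # 28 more 1024M DIMMs -> 60 DIMMs, start fails at dimm59
echo $(( (45056 - 17408) / 1024 ))  # 27 more 1024M DIMMs -> 59 DIMMs, VM still boots

The limit of 64 counts every memory region the vhost backend has to map, not only the hotplugged DIMMs, which would explain why the start already fails at dimm59 rather than at exactly 64 DIMMs.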

I'll send a mail to the pve-devel mailing list.
 
Can you try to add a file

/etc/modprobe.d/vhost.conf

with the content:

options vhost max_mem_regions=509

then reboot,

do a
"cat /sys/module/vhost/parameters/max_mem_regions"

to verify,

and if it's OK, try to add more memory to the VM.
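
For reference, the same steps as shell commands on the node (a sketch; the new value only takes effect once the vhost module is reloaded, hence the reboot):

Code:
# write the module option described above
echo 'options vhost max_mem_regions=509' > /etc/modprobe.d/vhost.conf
# reboot so the vhost module picks up the new parameter
reboot
# after the reboot, verify the running value (should now print 509)
cat /sys/module/vhost/parameters/max_mem_regions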
 
Works! Thanks.
Will this be the default in PVE in the future, or do I need to change it on all my nodes manually?

I would like to know if it's working for you. (I haven't tested it yet; I need to find a spare server.)
If it's working, yes, it will be applied by default in PVE.
 
Okay, well, it works for me. I have tested with up to 91 GB and the VM boots without any problem now.
When I move the VM to a host without this change applied, it will not boot. So this seems to be the solution. :)
 
Hi Spirit,
it looks like the issue still exists (or exists again?) on current PVE nodes:
Code:
pveversion
pve-manager/6.2-11/22fb4983 (running kernel: 5.4.55-1-pve)

qm start 436
kvm: -device pc-dimm,id=dimm61,memdev=mem-dimm61,node=1: a used vhost backend has no free memory slots left
start failed: QEMU exited with code 1
I tried to start a 48 GB VM. After applying vhost.conf and rebooting, the VM starts.

Udo
 
Hi Udo, I just hit this bug again when trying to hotplug memory. It seems we forgot to push the vhost option by default. I'll look into whether we can bump it in the kernel directly.
 
Hi Spirit,

this issue (or bug) still exists in Virtual Environment 6.4-9.
Just letting you know.

BR
ALEX
 
I just ran into this with PVE 6.4-13. The vhost.conf fix resolved it for me.
 
Hi there,

just letting you know this is still an issue with PVE 7.2-11.

Regards,
Wolfgang
You still need to add

Code:
/etc/modprobe.d/vhost.conf

options vhost max_mem_regions=509


Also, I sent patches for virtio-mem some months ago, but they have not been applied yet.
 
