Proxmox and Intel Optane DCPMM

Please take the following with a big grain of salt as I do not have any hands-on experience with persistent memory.

What do you want to achieve? I took a quick look, and AFAICT the memory is exposed as a block device (/dev/pmem0), which they format with XFS so they can place a file on it; that file is then attached to the VM as a simulated NVDIMM device.

I am not sure why they do it like that. If all you want is a disk in the VM that is stored on the fast pmem device, I would go the usual PVE route. If you want to pass the pmem device through to the VM directly and don't care about migrating that VM to another node, you can create a new VM disk and, instead of defining a storage on which to place it, assign the device directly:
Code:
qm set <vmid> -scsi5 file=/dev/pmem0

This will create a new disk with bus type SCSI and bus ID 5 that uses the /dev/pmem0 device directly.

If you want to store multiple disk images, or you have a cluster where each node is configured similarly, you can also format the pmem device with a file system, make sure it is mounted (/etc/fstab), and create a directory storage on top of it (sketch below). This lets you use the regular PVE tools to create disk images there, and if you migrate to another node, the disk image will also be transferred.
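Untested on my side since I don't have such hardware, but it should look roughly like this (device name, mount point and storage ID are just examples):
Code:
# format the pmem block device and mount it permanently
mkfs.xfs /dev/pmem0
mkdir -p /mnt/pmem0
echo '/dev/pmem0 /mnt/pmem0 xfs defaults 0 0' >> /etc/fstab
mount /mnt/pmem0

# register it as a directory storage so the usual PVE tools can place disk images on it
pvesm add dir pmem-dir --path /mnt/pmem0 --content images,rootdir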

I hope that helps :)
 
Hey @aaron ,

Thank you for your insight. I can give a bit more details here.

PMem (NVDIMM) in AppDirect mode (storage) can either be wrapped in a regular block device with a filesystem on top of it, or exposed as an NVDIMM and then formatted/mounted with a "PMem aware" FS.

1) The benefit of exposing it as a regular block device instead of using a PMem-aware FS is compatibility - the OS doesn't need to know it's working with PMem. The downside is a performance hit, since you still go through the OS storage stack and page cache. While faster than many SSDs, your latency is an order of magnitude higher than it could be with a "PMem aware" FS or with applications using it directly as an NVDIMM device.

2) If you expose it as an NVDIMM, PMem configuration tools such as ndctl will recognize it as such and extra manipulation options become available. Then, inside the VM, the device is recognized as an NVDIMM and allows the "DAX" mount option, which makes the FS "PMem aware" and bypasses the page cache instead of going through the traditional storage stack (rough guest-side sketch below). That lets you achieve PMem-level I/O latency, in many cases 10x lower than with a regular block device, which dramatically improves IOPS for random 4K read/write access. It also allows you to use libpmem directly in your apps.
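Just as an illustration of what that looks like inside the guest (device names and mount points are examples; double-check against the ndctl docs for your setup):
Code:
# inside the VM: the device shows up as an NVDIMM
# create an fsdax namespace if none exists yet, then mount with DAX
ndctl create-namespace --mode=fsdax
mkfs.ext4 /dev/pmem0
mkdir -p /mnt/pmem
mount -o dax /dev/pmem0 /mnt/pmem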

Now, #1 is implemented in VMware as "Persistent memory storage type" (vPMemDisk), so you can use it as a main root disk for your VM. It only performs slightly better than fast NVMe SSDs, depending on configuration. I suspect what you suggested in your post may work for this scenario.

#2 is implemented in VMware as vPMem, which is what that Intel article describes - passing through a virtual NVDIMM.

I actually made it work yesterday with Proxmox; it's not fancy, but it works. I used the "args" option in the qemu-server config and it appears to work just fine, but "memory hotplug" needs to be disabled:

Code:
args: -machine nvdimm=on -m slots=2,maxmem=1T -object memory-backend-file,id=mem1,share,mem-path=/pmemfs0/pmem0,size=100G,align=2M -device nvdimm,memdev=mem1,id=nv1,label-size=2M
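For reference, the host side is prepared roughly as in the Intel guide - /pmemfs0/pmem0 is a file on the pmem block device, which is formatted with XFS and mounted with DAX (sketch from memory, adjust device names and sizes to your setup):
Code:
# create an fsdax namespace for the AppDirect region (if not already present)
ndctl create-namespace --mode=fsdax
# format and mount the resulting /dev/pmem0 with DAX
# (reflink disabled because it conflicts with DAX on older kernels)
mkfs.xfs -f -m reflink=0 /dev/pmem0
mkdir -p /pmemfs0
mount -o dax /dev/pmem0 /pmemfs0
# backing file referenced by mem-path in the "args" line above
truncate -s 100G /pmemfs0/pmem0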

So, if you are comfortable with manual edits and with having no sane way of migrating NVDIMM PMem VMs, you can certainly make it work with Proxmox. For comparison, VMware allows migration of vPMemDisk via Storage vMotion, and vPMem VMs can be migrated as usual between hosts with PMem installed.

Hope it helps
 
Thank you for the explanation, I am definitely a bit wiser now :)

The only thing that bugs me about that Intel guide is that the NVDIMM device you configure for the VM is backed by a file on an XFS file system, which in turn sits on the /dev/pmem0 device.

I did read a bit through the QEMU docs about nvdimms, and what that does is emulate NVDIMMs for the guest. I am not sure if there is a saner way to pass NVDIMMs through to the guest more directly so that the guest knows about them.
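Either way, if the emulation works as described, the guest should be able to see the device with the usual tooling (assuming ndctl is installed in the guest):
Code:
# list NVDIMMs, regions and namespaces as seen by the guest
ndctl list -DRN
# the resulting namespace should show up as a pmem block device
lsblk /dev/pmem0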
 
