VirtioFS support

yaro014

Active Member
Dec 27, 2012
You can make it a systemd service so you don't have to daemonize, just like it's shown in the post I mentioned earlier.
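For anyone looking for a starting point, a minimal unit along these lines should work (a sketch only; the binary path, socket path, and source directory are copied from the hookscript examples in this thread and need adjusting to your setup):

```
# /etc/systemd/system/virtiofsd-shared.service (sketch; adjust paths/options)
[Unit]
Description=virtiofsd for a shared directory
Before=pve-guests.service

[Service]
# Run in the foreground (-f) and let systemd supervise it; no --daemonize needed
ExecStart=/usr/lib/kvm/virtiofsd -f --syslog --socket-path=/var/run/shared-fs.sock -o source=/zp0/ct0/subvol-103-disk-1 -o cache=always
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Enable it with systemctl enable --now virtiofsd-shared.service.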
 

Rphoton

New Member
Oct 31, 2019
You can make it a systemd service so you don't have to daemonize, just like it's shown in the post I mentioned earlier.
Oh, that sounds much more elegant. Sorry, I got lost reading your post; I had already read it yesterday and it seemed much more complicated, with multiple mounts and dynamic configuration, and I'm just trying a single mount point. I'll check it out in more depth.
 

ryba84

Member
May 27, 2020
You can use a simpler hookscript:
Code:
#!/bin/bash

function launch() {
        # Start virtiofsd for this share; adjust the socket path and source dir
        nohup /usr/lib/kvm/virtiofsd -f --socket-path=/var/run/shared-fs.sock -o source=/zp0/ct0/subvol-103-disk-1 -o cache=always --syslog --daemonize &> /dev/null &
        return 0
}

if [ "$2" = "pre-start" ]; then
        launch
fi
exit 0
I've tried moving my Samba file server to KVM, but for now it's not working. I cannot access the shares (permissions). With the systemd service, QEMU cannot access the socket file.
 

RudyBzh

Member
Jul 9, 2020

This shouldn't be relevant; anyway, yes, they are the same, but on the host I don't have the users present in the guest.


Could you explain what you did in more detail? What I reported seems to be a virtiofsd-related problem. What are the permissions of your directory?

Thanks a lot for all your inputs about this.
I was experiencing the exact same issue but couldn't understand why it was working on some VMs and not on others. The "small" difference between primary and secondary group is hard to identify...
I had a look at the related mailing list (where I don't really understand 50% of it) and I'm wondering how long it will take to be solved...
Please post here if you see a fix.
 

ajvpot

New Member
Jan 20, 2022
Hello! Is there any chance we will see support for the DAX feature of virtiofs? I tried to enable it but got an error message. This feature should help with performance because it avoids duplicating files between host and guest memory. It requires virtiofs to be built with some flags.
DAX info

https://virtio-fs.gitlab.io/howto-qemu.html says the flags are
CONFIG_DAX
CONFIG_FS_DAX
CONFIG_DAX_DRIVER
CONFIG_ZONE_DEVICE
 

lpfister9

New Member
Jun 6, 2022
Since this post is about the first result on Google for this topic and I've just spent two days getting virtio-fs to work on Proxmox (7.2-3), I wanted to put the most important info here for future people in my situation, as the docs are really all over the place.
What I learned is:
  • use hugepages
  • do NOT enable NUMA in Proxmox

Required preparation on the host:
To set up working virtio-fs drives in your VMs, the following setup worked for me:
First, set up hugepages in /etc/fstab by adding the following line:

hugetlbfs /dev/hugepages hugetlbfs defaults

Reboot Proxmox (maybe you can mount it somehow without a reboot, but I did not test that).
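(Untested here, but once the fstab line is in place, the mount can probably be activated without a reboot using the standard mount command:)

```
mount -t hugetlbfs hugetlbfs /dev/hugepages
# or, since the entry is already in /etc/fstab:
mount /dev/hugepages
```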
Then reserve some space for hugepages:

echo 2000 > /proc/sys/vm/nr_hugepages

This reserves 2000 × 2 MB = 4 GB of your RAM for hugepages, 2 MB being the default hugepage size in my setup. Change that number to match how much RAM the VMs using your shared drive will have (e.g., for two VMs with 1 GB of RAM each, reserve a little over 2 GB for hugepages).
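The arithmetic above can be sanity-checked in the shell (illustrative numbers; the 2048 KiB page size matches the 2 MB default mentioned above):

```shell
page_kib=2048                        # default hugepage size: 2 MB = 2048 KiB
want_mib=$(( 2 * 1024 + 256 ))       # e.g. two 1 GB VMs plus 256 MB headroom
pages=$(( want_mib * 1024 / page_kib ))
echo "$pages"                        # the value to write to /proc/sys/vm/nr_hugepages
```

Here that prints 1152, i.e. a little over 2 GB reserved.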

Next, prepare a folder on your host that you'll share with the VMs. I created an LVM volume, formatted it as ext4, and mounted it on /mnt/sharevolumes/fileshare.

Creating a VM that can mount your folder directly:

Start virtiofsd to create a socket that the VM will use to access your storage. While debugging, I used the following command to see its output:
/usr/bin/virtiofsd -f -d --socket-path=/var/<socketname>.sock -o source=/mnt/sharevolumes/fileshare -o cache=always -o posix_lock -o flock
Once you get it working, remove -d (the debug flag) and set it up as a service (I set it up as a service unit instantiated from a template, so the service only needs to be configured once and one instance can be started per VM).

With that done, you can edit your VM to add the virtio-fs volume. As mentioned above, make sure you do not enable NUMA in Proxmox. The settings that made it work for me had to be added as args:
args: -chardev socket,id=char0,path=/var/virtiofsd1.sock -device vhost-user-fs-pci,queue-size=1024,chardev=char0,tag=fileshare -object memory-backend-memfd,id=mem,hugetlb=yes,hugetlbsize=2097152,prealloc=yes,size=3G,share=on -mem-path /dev/hugepages -numa node,memdev=mem
I apologise for the bad readability, but it is copied straight from the working config.
This has to be put in /etc/pve/qemu-server/<vmID>.conf as a new line in addition to the existing config there. For reference, I'll paste my complete <vmID>.conf file at the end of the post.

For these args, you have to set the following yourself:
  • path=/<path-to-your-socket>
    • the socket will be created at this location; use the same location you started the virtiofsd socket in. Since each VM needs its own socket, you'll have to adjust this inside each config file.
  • tag=<tag>
    • the tag under which you'll be able to mount the share in the guest OS
  • hugetlbsize=2097152
    • the hugepage block size in bytes; the default is 2 MB, but if you change it, change it here too
  • size=<VM's RAM>
    • has to match your VM's memory; you can use 1G for a gigabyte and similar.
  • -mem-path /dev/hugepages
    • the hugepages mount path you put in /etc/fstab earlier; use the same one here

After adding these args, make sure your socket is running and start the VM.
Inside the guest OS you should now be able to mount the virtio-fs volume using the tag you've specified in the args.

mount -t virtiofs <tag> <mount-point>

For example, what I used:
mount -t virtiofs fileshare /mnt/fileshare/
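To make the guest mount survive reboots, the same tag can go into the guest's /etc/fstab (a sketch; the tag and mount point match the example above, and nofail is optional insurance in case the socket isn't up):

```
fileshare  /mnt/fileshare  virtiofs  defaults,nofail  0  0
```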



<vmID>.conf:

Code:
args: -chardev socket,id=char0,path=/var/virtiofsd1.sock -device vhost-user-fs-pci,queue-size=1024,chardev=char0,tag=fileshare -object memory-backend-memfd,id=mem,hugetlb=yes,hugetlbsize=2097152,prealloc=yes,size=3G,share=on -mem-path /dev/hugepages -numa node,memdev=mem
boot: order=scsi0;ide2;net0
cores: 2
ide2: local:iso/ubuntu-22.04-live-server-amd64.iso,media=cdrom,size=1432338K
memory: 3072
meta: creation-qemu=6.2.0,ctime=1654416192
name: cloudinittests
net0: virtio=C6:28:4A:61:E7:AA,bridge=vmbr0,firewall=1
numa: 0
onboot: 1
ostype: l26
scsi0: wd2tb:vm-110-disk-0,size=10G
scsihw: virtio-scsi-pci
smbios1: uuid=3939eba6-46aa-4e53-860d-b039eecbcfd6
sockets: 1
vmgenid: 70e27a5e-c8cd-43f7-ad6d-0e93980fb691
 

nomizfs

Active Member
Jan 7, 2015
As mentioned above, make sure you do not enable numa in proxmox
Hello, I'm about to try this, and I will read up on https://virtio-fs.gitlab.io/howto-qemu.html, but I just wanted to ask in advance: did you actually try enabling NUMA, and what happened?

I want to use virtiofs in a VM that has NUMA enabled and uses CPU pinning, because it has a GPU with PCIe passthrough. Is the issue with NUMA related to hugepages?

regards, mike

EDIT: I've now learned that it is possible to set hugepages per NUMA node; I will try to achieve this. I'm reading these links:

https://community.intel.com/t5/Embe...ge-affinity-for-specific-NUMA-node/m-p/265872

https://doc.dpdk.org/guides/linux_gsg/sys_reqs.html#use-of-hugepages-in-the-linux-environment
 

nomizfs

Active Member
Jan 7, 2015
Hello! Is there any chance we will see support for the DAX feature of virtiofs? I tried to enable it but got an error message. This feature should help with performance because it avoids duplicating files between host and guest memory. It requires virtiofs to be built with some flags.
DAX info

https://virtio-fs.gitlab.io/howto-qemu.html says the flags are
CONFIG_DAX
CONFIG_FS_DAX
CONFIG_DAX_DRIVER
CONFIG_ZONE_DEVICE
Hello, I asked a Linux OpenZFS developer about this, and he says it will probably not work. He then said that he is planning to start using virtiofs himself at some point, and when he does he will take a look at DAX and at the possibility of adding OpenZFS support for it.
 

alpha754293

New Member
Jan 8, 2023
So I've been playing around with this and experimenting with it and I have a few questions.

(I'm new to Proxmox and this level of virtualisation, so please forgive my stupid questions.)

1) I was reading this thread about the user and permissions management.

Stupid question: how do I create a user (and group) in Proxmox VE 7.3-3 such that the shared folder on the host has the same permissions as that folder when mounted in the VM guest?

I was able to create the folder as root on the host just fine, but when I tried to change the permissions of the shared folder on the host to match my Ubuntu 20.04 VMs/guests, apparently I didn't have the same group or user. If I create the user via the Proxmox VE GUI with the PAM realm and then try to change/set the password, it says that the user doesn't exist. If I use Proxmox VE authentication instead, I can set the password via the GUI, but the entry still doesn't show up in /etc/passwd. So, I'm pretty sure that I'm doing something wrong here.


Solved.

I had to do it through SSH/the command line. (It was still weird that when I created my user account via the GUI and then tried to set the password, it said that my user account didn't exist, despite the fact that it was right there.)

I am not sure which command finally took care of it, whether it was pveum or useradd on the host itself, but either way, I got it working.

2) Right now, I have been running /usr/lib/kvm/virtiofsd manually each time I stop and start the VM guests. I've noticed that if I try to use the same socket twice (i.e. create one socket and have two VMs connect to it), it doesn't let me do that. Does this mean that I need to run/start a new virtiofsd for each VM/guest? I get the feeling that this would be somewhat inefficient, because you'd end up spawning as many virtiofsd processes as you have VMs running.

If you can please educate me in regards to this, that would be greatly appreciated.


Also solved, apparently.

It looks like, with the auto-start script for the socket, a new one is created each time you start a VM that has the hookscript attached.

3) I also read, in this thread, about putting the Perl script into /var/lib/vz/snippets, along with the bash shell script to start it. I also read @yaro014's thread about trying to pass arguments to those scripts (because @Rphoton's example has, for example, the VMID hardcoded into the script). So, how do I make it automatically open the socket for each of my VMs after each stop and/or before each start? I'm not a programmer or a developer, so I can read the scripts, but I'm not knowledgeable enough to make sense of how they can create a new socket if the VMID is hardcoded into them (or does that not matter?).

(I'm an idiot when it comes to these things; therefore, an idiot's guide to deploying this, and any help that the team can provide, would be greatly appreciated.)

Thank you.


Also solved.

I put both the virtiofs.pl Perl script and launch-virtio-daemon.sh into /var/lib/vz/snippets/.

Hookscript was "attached" to the VM via this command:
qm set 100 --hookscript local:snippets/virtiofs.pl

launch-virtio-daemon.sh contents here:
Code:
#!/usr/bin/bash


function launch() {

    nohup /usr/lib/kvm/virtiofsd --syslog --daemonize --socket-path=/var/run/shared-fs.sock -o source=/myfs/ -o cache=always &> /dev/null  &
    return 0
}

launch
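Regarding question 2 above (one socket per VM): since the Perl hookscript receives the VMID, the launch script can derive a per-VM socket path instead of hardcoding one. A sketch (untested; paths follow the examples above, sock_for is a helper name I made up, and it assumes the Perl script is changed to pass the VMID through, e.g. system('/var/lib/vz/snippets/launch-virtio-daemon.sh', $vmid);):

```shell
#!/usr/bin/bash
# Hypothetical per-VM variant of launch-virtio-daemon.sh.
# The hookscript passes the VMID as $1.

vmid="$1"

# One socket per VM, since virtiofsd accepts only a single client per socket
sock_for() { printf '/var/run/virtiofsd-%s.sock' "$1"; }

launch() {
    nohup /usr/lib/kvm/virtiofsd --syslog --daemonize \
        --socket-path="$(sock_for "$vmid")" \
        -o source=/myfs/ -o cache=always &> /dev/null &
    return 0
}

launch
```

Each VM's args line then points its chardev at its own /var/run/virtiofsd-<vmID>.sock.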

The flag "--cache=always" doesn't work; use "-o cache=always" as above instead. (At least as of this writing, with Proxmox VE 7.3-3.)

The perl hook script contents here:
Code:
#!/usr/bin/perl

# Example hook script for PVE guests (hookscript config option)
# You can set this via pct/qm with
# pct set <vmid> -hookscript <volume-id>
# qm set <vmid> -hookscript <volume-id>
# where <volume-id> has to be an executable file in the snippets folder
# of any storage with directories e.g.:
# qm set 100 -hookscript local:snippets/hookscript.pl

use strict;
use warnings;

print "GUEST HOOK: " . join(' ', @ARGV). "\n";

# First argument is the vmid

my $vmid = shift;

# Second argument is the phase

my $phase = shift;

if ($phase eq 'pre-start') {

    # First phase 'pre-start' will be executed before the guest
    # is started. Exiting with a code != 0 will abort the start

    print "$vmid is starting, doing preparations.\n";

    system('/var/lib/vz/snippets/launch-virtio-daemon.sh');

    # print "preparations failed, aborting."
    # exit(1);

} elsif ($phase eq 'post-start') {

    # Second phase 'post-start' will be executed after the guest
    # successfully started.

    print "$vmid started successfully.\n";

} elsif ($phase eq 'pre-stop') {

    # Third phase 'pre-stop' will be executed before stopping the guest
    # via the API. Will not be executed if the guest is stopped from
    # within e.g., with a 'poweroff'

    print "$vmid will be stopped.\n";

} elsif ($phase eq 'post-stop') {

    # Last phase 'post-stop' will be executed after the guest stopped.
    # This should even be executed in case the guest crashes or stopped
    # unexpectedly.

    print "$vmid stopped. Doing cleanup.\n";

} else {
    die "got unknown phase '$phase'\n";
}

exit(0);

Seems like it worked.

Ubuntu was able to mount it without any issues via:
mount -t virtiofs myfs /myfs

SLES, on the other hand, does not and would not mount it. (I tried SLES15 SP4 and SLES12 SP4; neither worked.)

I looked at the SLES documentation and tried the command they show for mounting 9p, and that didn't work either.

It said that the /myfs "special device" wasn't present or something along those lines.

Pity. So, for that, I ended up installing nfs-kernel-server on the host itself, created the NFS export, and mounted the same directory in SLES over NFS instead. It worked well enough: I could write to it (via the virtio network interface) at 301 MB/s and read from it at 785 MB/s, vs. virtiofs on Ubuntu, which could write at around 496 MB/s and read at 1700 MB/s (averaged between two Ubuntu VMs reading/writing to the host separately and sequentially).

So at least I was able to get Ubuntu up and running with that.

Tomorrow, I think that I am going to try CentOS and then also Windows.
 

nomizfs

Active Member
Jan 7, 2015
For me, as virtiofs doesn't support NUMA (the VM won't start with NUMA enabled, just as user lpfister9 says), it's basically useless. The Intel docs I linked to describe how to tie hugepages to a specific NUMA node, but virtiofs doesn't care, and as such is useless.

virtiofs is useless, and basically, as this is 2023, Linux seems useless... still struggling with Wayland, Vulkan, GPU drivers, the works. We're lucky to have a functioning bash environment and are supposed to be content with that, lol. It's all BS.
 

alpha754293

New Member
Jan 8, 2023
For me, as virtiofs doesn't support NUMA (the VM won't start with NUMA enabled, just as user lpfister9 says), it's basically useless. The Intel docs I linked to describe how to tie hugepages to a specific NUMA node, but virtiofs doesn't care, and as such is useless.

virtiofs is useless, and basically, as this is 2023, Linux seems useless... still struggling with Wayland, Vulkan, GPU drivers, the works. We're lucky to have a functioning bash environment and are supposed to be content with that, lol. It's all BS.
I don't know about NUMA. (I've been reading the documentation about it as well, and I'm still a little fuzzy as to what the benefits of NUMA would be (for a single socket system).)

Like I understand how NUMA can be useful for a multi-socket system, but with consolidation, the number of sockets may be declining.

For the features that you are looking for, I am not sure if there's a better alternative that you can get with a free option.

(I'm not sure if VMware ESXi supports the features that you're looking for, but my understanding is that they DON'T have a free option where you can download it and try it out. A LOT of tech YouTubers talk about the per-CPU-core licensing cost of VMware ESXi.)

In my testing of virtio-fs, it's faster than Oracle VirtualBox (with shared folders between the host and the VM guests). xcp-ng doesn't even support it because Xen itself is a Type-1 hypervisor. TrueNAS' VM capabilities also didn't have anything like virtio-fs (that I was able to find during the course of my research).

So, I'm not sure what other option there is that would be able to do these kinds of things, with the features that you're looking for.
 

nomizfs

Active Member
Jan 7, 2015
I don't know about NUMA. (I've been reading the documentation about it as well, and I'm still a little fuzzy as to what the benefits of NUMA would be (for a single socket system).)

Like I understand how NUMA can be useful for a multi-socket system, but with consolidation, the number of sockets may be declining.

For the features that you are looking for, I am not sure if there's a better alternative that you can get with a free option.

(I'm not sure if VMware ESXi supports the features that you're looking for, but my understanding is that they DON'T have a free option where you can download it and try it out. A LOT of tech YouTubers talk about the per-CPU-core licensing cost of VMware ESXi.)

In my testing of virtio-fs, it's faster than Oracle VirtualBox (with shared folders between the host and the VM guests). xcp-ng doesn't even support it because Xen itself is a Type-1 hypervisor. TrueNAS' VM capabilities also didn't have anything like virtio-fs (that I was able to find during the course of my research).

So, I'm not sure what other option there is that would be able to do these kinds of things, with the features that you're looking for.
Well, there's something called 9pfs, which recently got a major performance improvement:

https://wiki.qemu.org/Documentation/9p
https://www.phoronix.com/news/QEMU-7.2-Released

My main complaint is with Linux overall: sure, you can build something functional, but only by spending hours, days, weeks and months studying every single little detail yourself. Everything changes and breaks all the time.

The main problem is all these stupid distros: there are thousands of two-to-three-man projects that never amount to anything and always die. Nobody is focused on a single OS. Then you go to Debian and it's Debian OR Ubuntu, but no single project. Then there are the stupid dependencies, where any little piece of code can break the whole system, because everything depends on everything else. And unless every little piece of code is just perfect, your whole system is nothing but a heap of trash.

Update your Linux GPU driver? Well, that's a life-or-death event.

The kernel is stupid, the package managers are stupid, and the whole system is stupid.
 

alpha754293

New Member
Jan 8, 2023
Well, there's something called 9pfs, which recently got a major performance improvement
Not available for xcp-ng. (At least not when I tried to Google it.)

According to a presentation by Stefan Hajnoczi, Senior Principal Software Engineer at Red Hat, virtio-fs is faster than 9p.

I don't know much about either (not in detail), so that can be taken with a grain of salt. Or not.

My main complaint is with Linux overall: sure, you can build something functional, but only by spending hours, days, weeks and months studying every single little detail yourself. Everything changes and breaks all the time.

The main problem is all these stupid distros: there are thousands of two-to-three-man projects that never amount to anything and always die. Nobody is focused on a single OS. Then you go to Debian and it's Debian OR Ubuntu, but no single project. Then there are the stupid dependencies, where any little piece of code can break the whole system, because everything depends on everything else. And unless every little piece of code is just perfect, your whole system is nothing but a heap of trash.

Update your Linux GPU driver? Well, that's a life-or-death event.

The kernel is stupid, the package managers are stupid, and the whole system is stupid.
Varies.

I agree with most of what you said, and believe me, I've had to use rpmfind.net to track down some ridiculous, archaic dependency before, so I agree with just about everything you said there.

But that raises the question, which OS do you prefer?

I think that ALL OSes suffer from their own respective faults, one way or another, and that based on that, they ALL suck for all of their own individual reasons.

In my home lab, I run ALMOST everything, although I finally powered down my Solaris system, maybe a year or two ago, because the Oracle VirtualBox VM running Oracle Solaris 11.4 decided NOT to have an internet connection anymore, and I couldn't figure out how to fix it no matter what I did or tried. (And I tried.) The wired connection just stopped/died, inexplicably, for no apparent reason.

Beyond that, I have Windows systems, macOS systems, BSD systems (technically), and Linux systems.

And I can tell you that NONE of them are perfect as they ALL have some problems with it in one form or another.

Some stuff just runs better in Linux. Some stuff runs better in MacOS. Some stuff runs better in BSD. And some stuff runs better in Windows.

There is no one-size-fits-all solution, so I use ALL of the tools that are available, to try and get what I want done, completed.
 
