[TUTORIAL] virtiofsd in PVE 8.0.x

BobC

virtiofsd is used to "pass through" parts of a host filesystem to a virtual machine with "local filesystem semantics and performance". Proxmox moved to a Rust-based version of virtiofsd in PVE 8, located here. It is installed as a separate package called "virtiofsd" in PVE 8, whereas a different, non-Rust version of virtiofsd was included in PVE 7.x. Note that there are several front-ends and back-ends that can be used; what is documented below works for me. Make sure you test well before putting anything into "production".

As was pointed out recently (thank you @fabian), a future PVE 8.x version will have WebUI support for virtiofsd (see here for the patch). I fully expect these instructions will have a limited "shelf life", but until then, below is what I did to implement the new virtiofsd in PVE 8.0.x, in four areas:

  1. <VMID>.conf changes
  2. virtiofsd on the host
  3. hookscript additions
  4. Mount inside the Linux VM

Directions​

1. Add an "args:" line to the VM's VMID.conf file located in /etc/pve/qemu-server.​

Please note the documentation here states this is for experts only. :cool:
Bash:
args: -chardev socket,id=virtfs0,path=/run/changeme.sock -device vhost-user-fs-pci,queue-size=1024,chardev=virtfs0,tag=changeme_tag -object memory-backend-file,id=mem,size=32768M,mem-path=/dev/shm,share=on -numa node,memdev=mem

Substitute "changeme.sock", "changeme_tag" and "size" above to appropriate values where:
changeme.sock=full path name of the socket file used for IPC between VM and host
changem_tag=name used inside the VM for the mount command, or in /etc/fstab
size=size of RAM used by the VM (only tested with it equal in size to VM ram)

NOTE: once an "args:" line has been added, the VM will not start unless the socket file exists. virtiofsd is what creates the socket file (see next step).
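
A quick way to check whether the socket already exists (using the same hypothetical name as above):
Bash:
test -S /run/changeme.sock && echo "socket present" || echo "socket missing"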

Confirm args: is a part of the configuration with the command:
qm config VMID

And to "see" the way KVM/Qemu runs the VM:
qm showcmd VMID --pretty

Note the last 3 or 4 lines should contain the components of the args: line.
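
For example, with a hypothetical VMID of 100, the relevant pieces can be filtered out of the showcmd output like this:
Bash:
qm showcmd 100 --pretty | grep -E 'chardev|vhost-user-fs|memory-backend'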

2. Start virtiofsd on the host.​

To test, virtiofsd can be started manually, but it needs to be started before the VM so it can create the socket (*.sock) file referenced in "args:". To start it manually, use:

/usr/libexec/virtiofsd --syslog --socket-path /run/changeme.sock --shared-dir /dir/to/share/with/vm --announce-submounts --inode-file-handles=mandatory

Substitute:
changeme.sock with the name used in args: in step (1)
/dir/to/share/with/vm with the underlying host directory exposed to the VM

Instead of "--syslog", use "--log-level debug" to troubleshoot. For additional/different options, see the virtiofsd documentation REAME file here.

To manually start it as a daemon, preface it with nohup and background it with ampersand (&) like:
nohup command_above &

Special NOTE: this version of virtiofsd has to be "double-forked" to run as a daemon. This means once started, there will be two virtiofsd processes, a "parent" and a "child", and if you examine the PIDs, the "grandparent" must be PID 1. More on this in the next section.
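
The parent/child pair and their PPIDs can be inspected with, for example:
Bash:
ps -o pid,ppid,cmd -C virtiofsd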

Once running, start the VM as you normally would and see how to mount the shared directory in Linux in step (4).

3. Start virtiofsd in a hookscript​


This will automatically start virtiofsd before the VM.
If needed, see Hookscripts in the PVE documentation here.
Of the four phases of a hookscript, virtiofsd needs to run in the pre-start phase.

If your hookscripts are written in Perl, I cannot comment on how to start virtiofsd, since all of my hookscripts are in Bash.
(Please feel free to add a perl equivalent in the comments.)

If your hookscripts are written in Bash, starting virtiofsd can be problematic since it needs to be double-forked. For complete coverage of what that means, see here. After trying many approaches, including Bash here documents, sourcing that here doc, eval, setsid(1), nohup and a few others, the way I found that worked was to create a one-time systemd service unit with systemd-run(1).

In the "pre-start" phase of your bash hookscript, start virtiofsd with:
Bash:
 systemd-run --unit=changeme_service /usr/libexec/virtiofsd --syslog --socket-path /run/changeme.sock --shared-dir /dir/to/share/with/vm --announce-submounts --inode-file-handles=mandatory

Substitute:
changeme_service with a systemd service name of your choice
changeme.sock with the name used in the args: line in step (1)
/dir/to/share/with/vm for the underlying host directory exposed to the VM

Afterwards, the systemd service unit can then be examined with:
systemctl status changeme_service # or the name you changed it to
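
For reference, a minimal Bash hookscript along these lines might look like the following sketch; the unit name, socket path and shared directory are placeholders and must match the args: line from step (1):
Bash:
#!/bin/bash
# PVE calls the hookscript as: <script> <vmid> <phase>
vmid="$1"
phase="$2"

if [ "$phase" = "pre-start" ]; then
    # transient one-shot unit, so virtiofsd can double-fork without being killed
    systemd-run --unit="virtiofsd-${vmid}" \
        /usr/libexec/virtiofsd --syslog \
        --socket-path "/run/changeme.sock" \
        --shared-dir /dir/to/share/with/vm \
        --announce-submounts --inode-file-handles=mandatory
fi

exit 0

The script has to be executable, live on snippet-capable storage, and be attached to the VM, e.g. with "qm set VMID --hookscript local:snippets/virtiofsd-hook.sh" (adjust the storage and filename to your setup).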

4. Mount it inside the Linux VM​

To mount it manually, make (or determine) a directory to use as a mount point:
mkdir /mnt/mountpoint

Then issue the following mount using the changeme_tag defined in step 1:
mount -t virtiofs changeme_tag /mnt/mountpoint

Test with:
ls -l /mnt/mountpoint
If something is wrong, the command will appear to "hang".

Once it is working, and with a working hookscript, add the mount to /etc/fstab to make it permanent, like the following:
Code:
changeme_tag /mnt/mountpoint virtiofs defaults 0 0

See fstab(5) for other options if desired.
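
For example, if the guest should still boot when the share happens to be unavailable, nofail can be added (a suggestion, not something covered above):
Code:
changeme_tag /mnt/mountpoint virtiofs defaults,nofail 0 0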

Lastly, virtiofsd will shut down automatically after the VM that is using it stops. There is no need to stop it from the hookscript; this is by design.
 
I switched to using the Rust version of virtiofsd back in PVE 7 and meant to do a write-up of the process but never got around to it.

The last time I touched this was back in May last year, so I don't remember much, but I use the following Perl hookscript to create a templated systemd service file for starting virtiofsd on a particular directory, enable it for a particular VMID (to launch whenever the VM launches and stop when it stops), and add the arguments to the VM's configuration. It currently uses hardcoded associations and at the moment does not have any cleanup steps, but it may be helpful as a stepping stone for someone else (or I might get around to completing it eventually; alternatively, if this is something I can implement within PVE itself, that might motivate me to fix it up and submit a changeset. Edit: I forgot, there's already an in-queue implementation that doesn't make use of the VM's systemd scope).

Perl:
#!/usr/bin/perl

use strict;
use warnings;

my %associations = (
  100 => ['/zpool/audio', '/zpool/books', '/zpool/games', '/zpool/work'],
  101 => ['/zpool/audio', '/zpool/games'],
);

use PVE::QemuServer;

use Template;
my $tt = Template->new;

print "GUEST HOOK: " . join(' ', @ARGV) . "\n";

my $vmid = shift;
my $conf = PVE::QemuConfig->load_config($vmid);
my $vfs_args_file = "/run/$vmid.virtfs";

my $phase = shift;

my $unit_tpl = "[Unit]
Description=virtiofsd filesystem share at [% share %] for VM %i
StopWhenUnneeded=true

[Service]
Type=simple
PIDFile=/run/virtiofsd/.run.virtiofsd.%i-[% share_id %].sock.pid
ExecStart=/usr/lib/kvm/virtiofsd -f --socket-path=/run/virtiofsd/%i-[% share_id %].sock -o source=[% share %] -o cache=always

[Install]
RequiredBy=%i.scope\n";

if ($phase eq 'pre-start') {
  print "$vmid is starting, doing preparations.\n";

  my $vfs_args = "-object memory-backend-memfd,id=mem,size=$conf->{memory}M,share=on -numa node,memdev=mem";
  my $char_id = 0;

  # TODO: Have removal logic. Probably need to glob the systemd directory for matching files.
  for (@{$associations{$vmid}}) {
    my $share_id = $_ =~ s/^\///r =~ s/\//_/gr;
    my $unit_name = 'virtiofsd-' . $share_id;
    my $unit_file = '/etc/systemd/system/' . $unit_name . '@.service';
    print "attempting to install unit $unit_name...\n";
    if (not -e $unit_file) {
      $tt->process(\$unit_tpl, { share => $_, share_id => $share_id }, $unit_file)
        || die $tt->error(), "\n";
      system("/usr/bin/systemctl daemon-reload");
      system("/usr/bin/systemctl enable $unit_name\@$vmid.service");
    }
    system("/usr/bin/systemctl start $unit_name\@$vmid.service");
    $vfs_args .= " -chardev socket,id=char$char_id,path=/run/virtiofsd/$vmid-$share_id.sock";
    $vfs_args .= " -device vhost-user-fs-pci,chardev=char$char_id,tag=$share_id";
    $char_id += 1;
  }

  open(FH, '>', $vfs_args_file) or die $!;
  print FH $vfs_args;
  close(FH);

  print $vfs_args . "\n";
  if (defined($conf->{args}) && not $conf->{args} =~ /$vfs_args/) {
    print "Appending virtiofs arguments to VM args.\n";
    $conf->{args} .= " $vfs_args";
  } else {
    print "Setting VM args to generated virtiofs arguments.\n";
    $conf->{args} = " $vfs_args";
  }
  PVE::QemuConfig->write_config($vmid, $conf);
}
elsif($phase eq 'post-start') {
  print "$vmid started successfully.\n";
  my $vfs_args = do {
    local $/ = undef;
    open my $fh, "<", $vfs_args_file or die $!;
    <$fh>;
  };

  if ($conf->{args} =~ /$vfs_args/) {
    print "Removing virtiofs arguments from VM args.\n";
    $conf->{args} =~ s/\ *$vfs_args//g;
    print $conf->{args};
    $conf->{args} = undef if $conf->{args} =~ /^$/;
    PVE::QemuConfig->write_config($vmid, $conf);
  }
}
elsif($phase eq 'pre-stop') {
  #print "$vmid will be stopped.\n";
}
elsif($phase eq 'post-stop') {
  #print "$vmid stopped. Doing cleanup.\n";
} else {
  die "got unknown phase '$phase'\n";
}

exit(0);

I have yet to use this with PVE 8, however. If I understand correctly, the key difference is just that `/usr/lib/kvm/virtiofsd` changes to `/usr/libexec/virtiofsd`. But I'll eventually test things out on PVE 8 and report back.
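
A quick way to check which of the two paths exists on a given host:
Bash:
ls -l /usr/lib/kvm/virtiofsd /usr/libexec/virtiofsd 2>/dev/null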
 
What can this be used for in a production environment? Does someone have some nice use cases I might be forgetting?
 
What can this be used for in a production environment? Does someone have some nice use cases I might be forgetting?

virtiofs effectively provides a native-performing (read: not bottlenecked by a network protocol) KVM equivalent of LXC bind mounts for mounting host storage inside of a guest as a shared filesystem. The same use cases for those bind mounts would also apply here.
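
For comparison, the LXC side of that analogy is a single bind-mount line in the container config (paths here are just examples):
Code:
mp0: /zpool/share,mp=/mnt/share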
 
The last time I touched this was back in May last year, so I don't remember much, but I use the following Perl hookscript to create a templated systemd service file for starting virtiofsd on a particular directory, enable it for a particular VMID (to launch whenever the VM launches and stop when it stops), and add the arguments to the VM's configuration.

Oh wow, thank you so much! I had a few ZFS virtiofs shares around which I set up manually in args/systemd, but always dreaded the process. With the hook script, I can now set this up across the fleet and finally ensure the few directories I care about in each VM can actually be on my ZFS pool directly, and back those up specifically rather than having to gruesomely back up the whole VMs.

Edit: Interestingly, the config updated by the hookscript's pre-start phase does not seem to be reloaded before the VM starts, so the args are not present on the command executed to start the VM, despite the args being present in the config file. But this small change to Proxmox fixes it.
 
I switched to using the Rust version of virtiofsd back in PVE 7 and meant to do a write-up of the process but never got around to it.

The last time I touched this was back in May last year, so I don't remember much, but I use the following Perl hookscript to create a templated systemd service file for starting virtiofsd on a particular directory, enable it for a particular VMID (to launch whenever the VM launches and stop when it stops), and add the arguments to the VM's configuration. It currently uses hardcoded associations and at the moment does not have any cleanup steps, but it may be helpful as a stepping stone for someone else (or I might get around to completing it eventually; alternatively, if this is something I can implement within PVE itself, that might motivate me to fix it up and submit a changeset. Edit: I forgot, there's already an in-queue implementation that doesn't make use of the VM's systemd scope).

Thanks so much for posting this!

Running into an error (Proxmox 8):
Code:
qm start 102
GUEST HOOK: 102 pre-start
102 is starting, doing preparations.
attempting to install unit virtiofsd-mnt_local_...
DIRECTORY DOES EXIST!
-object memory-backend-memfd,id=mem,size=2048M,share=on -numa node,memdev=mem -chardev socket,id=char0,path=/run/virtiofsd/102-mnt_local_.sock -device vhost-user-fs-pci,chardev=char0,tag=mnt_local_
Setting VM args to generated virtiofs arguments.
unable to parse value of 'args' - got undefined value
GUEST HOOK: 102 post-start
102 started successfully.
Removing virtiofs arguments from VM args.
hookscript error for 102 on post-start: command '/var/lib/vz/snippets/virtiofs-hook.pl 102 post-start' failed: exit code 255

Any thoughts? I'm not seeing why that's happening. I added some debugging, and:

Perl:
elsif($phase eq 'post-start') {
  print "$vmid started successfully.\n";
  my $vfs_args = do {
    local $/ = undef;
    open my $fh, "<", $vfs_args_file or die $!;
    <$fh>;
  };

  if ($conf->{args} =~ /$vfs_args/) {
    print "Removing virtiofs arguments from VM args.\n";
    print "conf->args = $conf->{args}\n" if $DEBUG;
    print "vfs_args = $vfs_args\n" if $DEBUG;
    $conf->{args} =~ s/\ *$vfs_args//g;
    print $conf->{args};
    $conf->{args} = undef if $conf->{args} =~ /^$/;
    print "conf->{args} = $conf->{args}\n" if $DEBUG;
    PVE::QemuConfig->write_config($vmid, $conf);
  }

Results in:
Code:
qm start 102
GUEST HOOK: 102 pre-start
102 is starting, doing preparations.
attempting to install unit virtiofsd-mnt_local_...
DIRECTORY DOES EXIST!
-object memory-backend-memfd,id=mem,size=2048M,share=on -numa node,memdev=mem -chardev socket,id=char0,path=/run/virtiofsd/102-mnt_local_.sock -device vhost-user-fs-pci,chardev=char0,tag=mnt_local_
Setting VM args to generated virtiofs arguments.
vfs_args: -object memory-backend-memfd,id=mem,size=2048M,share=on -numa node,memdev=mem -chardev socket,id=char0,path=/run/virtiofsd/102-mnt_local_.sock -device vhost-user-fs-pci,chardev=char0,tag=mnt_local_
Use of uninitialized value in concatenation (.) or string at /var/lib/vz/snippets/virtiofs-hook.pl line 96.
unable to parse value of 'args' - got undefined value
GUEST HOOK: 102 post-start
102 started successfully.
Removing virtiofs arguments from VM args.
conf->args = -object memory-backend-memfd,id=mem,size=2048M,share=on -numa node,memdev=mem -chardev socket,id=char0,path=/run/virtiofsd/102-mnt_local_.sock -device vhost-user-fs-pci,chardev=char0,tag=mnt_local_
vfs_args = -object memory-backend-memfd,id=mem,size=2048M,share=on -numa node,memdev=mem -chardev socket,id=char0,path=/run/virtiofsd/102-mnt_local_.sock -device vhost-user-fs-pci,chardev=char0,tag=mnt_local_
conf->{args} =
hookscript error for 102 on post-start: command '/var/lib/vz/snippets/virtiofs-hook.pl 102 post-start' failed: exit code 255

I'm just struggling to figure out why post-start is failing.
It looks like

Perl:
    $conf->{args} = undef if $conf->{args} =~ /^$/;

Is undefining the variable as intended, but I don't understand the intent. :)
 
Got it working. Leaving last post for posterity, unless others suggest I delete it (sorry, I don't make my way into forums much anymore).

Anyway - THANK YOU, again, @sikha !

here's my modified script for Proxmox 8:

Perl:
#!/usr/bin/perl

use strict;
use warnings;

my %associations = (
  102 => ['/mnt/local'],
  #  101 => ['/zpool/audio', '/zpool/games'],
);

use PVE::QemuServer;

use Template;
my $tt = Template->new;

print "GUEST HOOK: " . join(' ', @ARGV) . "\n";

my $vmid = shift;
my $conf = PVE::QemuConfig->load_config($vmid);
my $vfs_args_file = "/run/$vmid.virtfs";
my $virtiofsd_dir = "/run/virtiofsd/";
my $DEBUG = 1;
my $phase = shift;

my $unit_tpl = "[Unit]
Description=virtiofsd filesystem share at [% share %] for VM %i
StopWhenUnneeded=true

[Service]
Type=simple
RuntimeDirectory=virtiofsd
PIDFile=/run/virtiofsd/.run.virtiofsd.%i-[% share_id %].sock.pid
ExecStart=/usr/libexec/virtiofsd --log-level debug --socket-path /run/virtiofsd/%i-[% share_id %].sock --shared-dir [% share %] --cache=auto --announce-submounts --inode-file-handles=mandatory

[Install]
RequiredBy=%i.scope\n";

if ($phase eq 'pre-start') {
  print "$vmid is starting, doing preparations.\n";

  my $vfs_args = "-object memory-backend-memfd,id=mem,size=$conf->{memory}M,share=on -numa node,memdev=mem";
  my $char_id = 0;

  # TODO: Have removal logic. Probably need to glob the systemd directory for matching files.
  for (@{$associations{$vmid}}) {
    my $share_id = $_ =~ s/^\///r =~ s/\//_/gr;
    my $unit_name = 'virtiofsd-' . $share_id;
    my $unit_file = '/etc/systemd/system/' . $unit_name . '@.service';
    print "attempting to install unit $unit_name...\n";
    if (not -d $virtiofsd_dir) {
        print "ERROR: $virtiofsd_dir does not exist!\n";
    }
    else { print "DIRECTORY DOES EXIST!\n"; }

    if (not -e $unit_file) {
      $tt->process(\$unit_tpl, { share => $_, share_id => $share_id }, $unit_file)
        || die $tt->error(), "\n";
      system("/usr/bin/systemctl daemon-reload");
      system("/usr/bin/systemctl enable $unit_name\@$vmid.service");
    }
    system("/usr/bin/systemctl start $unit_name\@$vmid.service");
    $vfs_args .= " -chardev socket,id=char$char_id,path=/run/virtiofsd/$vmid-$share_id.sock";
    $vfs_args .= " -device vhost-user-fs-pci,chardev=char$char_id,tag=$share_id";
    $char_id += 1;
  }

  open(FH, '>', $vfs_args_file) or die $!;
  print FH $vfs_args;
  close(FH);

  print $vfs_args . "\n";
  if (defined($conf->{args}) && not $conf->{args} =~ /$vfs_args/) {
    print "Appending virtiofs arguments to VM args.\n";
    $conf->{args} .= " $vfs_args";
  } else {
    print "Setting VM args to generated virtiofs arguments.\n";
    print "vfs_args: $vfs_args\n" if $DEBUG;
    $conf->{args} = " $vfs_args";
  }
  PVE::QemuConfig->write_config($vmid, $conf);
}
elsif($phase eq 'post-start') {
  print "$vmid started successfully.\n";
  my $vfs_args = do {
    local $/ = undef;
    open my $fh, "<", $vfs_args_file or die $!;
    <$fh>;
  };

  if ($conf->{args} =~ /$vfs_args/) {
    print "Removing virtiofs arguments from VM args.\n";
    print "conf->args = $conf->{args}\n" if $DEBUG;
    print "vfs_args = $vfs_args\n" if $DEBUG;
    $conf->{args} =~ s/\ *$vfs_args//g;
    print $conf->{args};
    $conf->{args} = undef if $conf->{args} =~ /^$/;
    print "conf->args = $conf->{args}\n" if $DEBUG;
    PVE::QemuConfig->write_config($vmid, $conf) if defined($conf->{args});
  }
}
elsif($phase eq 'pre-stop') {
  #print "$vmid will be stopped.\n";
}
elsif($phase eq 'post-stop') {
  #print "$vmid stopped. Doing cleanup.\n";
} else {
  die "got unknown phase '$phase'\n";
}

exit(0);

I had to modify:
1. The ExecStart command with new args for virtiofsd (the old args did not work, so the new args may require some tuning)
2. Added a check for /run/virtiofsd/ existing (otherwise, virtiofsd couldn't write its socket)
3. The post-start phase to not rewrite the config (PVE::QemuConfig->write_config) if args is undefined
 
Edit: Interestingly, the config updated by the hookscript's pre-start phase does not seem to be reloaded before the VM starts, so the args are not present on the command executed to start the VM, despite the args being present in the config file. But this small change to Proxmox fixes it.
Yeah, that was actually one issue I ran into when I tried to set up virtiofsd in a new VM earlier this month, for the first time in a year, but I just ignored it for the time being (I ran it with the hookscript once, then removed the hookscript from the VM config since the args still remained, and manually enabled the systemd units).

And thanks @oztiks for trying it out with PVE 8! I'll be sure to keep your post for reference when I get around to upgrading.

Perl:
$conf->{args} = undef if $conf->{args} =~ /^$/;
Is undefining the variable as intended, but I don't understand the intent. :)
I think that was intended to remove the `args` parameter from the VM config file if it was empty. My script was meant to just temporarily update the VM args before launch but I think that the cleanup part broke or I never properly got it working. I can't quite remember exactly what I did 15 months ago, lol.
 
Thanks for the Rust-compatible script! I will try to update to Proxmox 8 + Rust today; I'm having "too many files" issues with some VMs despite LimitNOFile being set to infinite in systemd.

Code:
$conf->{args} = undef if $conf->{args} =~ /^$/;

Yeah, I got rid of `$conf->{args} = undef if $conf->{args} =~ /^$/;` yesterday. Intent seems to be removing the `args` completely from the config, yeah. Not strictly needed, so it's ok.

Edit: I updated to Proxmox 8 successfully and am using the new script above :)
 
Added
Code:
RuntimeDirectory=virtiofsd
to the systemd unit generation above; otherwise, upon reboot, /run/virtiofsd doesn't exist and the VM fails to start.
 
Oh, interesting, thanks for posting.

Would this allow me to pass Ceph volumes through to a guest?
 
Oh, interesting, thanks for posting.

Would this allow me to pass Ceph volumes through to a guest?

I tried this, so far no success.

I can start the VM, with virtiofsd loaded, active and running on the host.
[screenshot: virtiofsd service shown as active and running on the host]

When I try to create the mount on the VM it fails:

Bash:
root@lnxsrv01:~# mount -t virtiofs mnt_pve_cephfs_docker /mnt/docker
mount: /mnt/docker: wrong fs type, bad option, bad superblock on mnt_pve_cephfs_docker, missing codepage or helper program, or other error.

But the same happens when I use a local ext4 host filesystem.

@scyto: I did a quick write-up: virtiofs.md. Perhaps it helps you!
 
@scyto: I did a quick write-up: virtiofs.md. Perhaps it helps you!
Yes it does, thanks!

I also spent the last 24 hours reading more on CephFS and Docker, so now I am getting dangerous, thinking I know what I am talking about when I absolutely don't.

What do you think about using one of these (as the CSI driver doesn't seem tested yet...) instead of trying to use virtiofs? My thinking is that the FUSE client / kernel running in the Docker host VM will send its comms down (up?) the QEMU networking stack, over vmbr0, straight to the closest Ceph monitor (hopefully itself, as this is a home lab where the QEMU hosts and Ceph hosts are the same nodes).

i.e. the traffic should never really leave the kernel, or rather will never hit the wire, and should run at memory speed?

https://github.com/Brindster/docker-plugin-cephfs
https://github.com/flaviostutz/cepher
https://gitlab.com/n0r1sk/docker-volume-cephfs
 
Yes it does, thanks!

I also spent the last 24 hours reading more on CephFS and Docker, so now I am getting dangerous, thinking I know what I am talking about when I absolutely don't.

What do you think about using one of these (as the CSI driver doesn't seem tested yet...) instead of trying to use virtiofs? My thinking is that the FUSE client / kernel running in the Docker host VM will send its comms down (up?) the QEMU networking stack, over vmbr0, straight to the closest Ceph monitor (hopefully itself, as this is a home lab where the QEMU hosts and Ceph hosts are the same nodes).

i.e. the traffic should never really leave the kernel, or rather will never hit the wire, and should run at memory speed?

https://github.com/Brindster/docker-plugin-cephfs
https://github.com/flaviostutz/cepher
https://gitlab.com/n0r1sk/docker-volume-cephfs

Interesting. The Docker CSI Driver plugin for CephFS seems worth trying, but I had some issues with that before, not understanding how to get to the loopback Ceph network that we have.

I'm also planning to use CephFS for media files, so I still need virtiofs too. :)
 
not understanding how to get to the loopback Ceph network that we have.
Oh, I think I figured that one out... I was overthinking it for weeks. I just put this on my router:

[screenshot: static IPv6 routes configured on the router]

Now any VM running on vmbr0 with an IPv6 address on my LAN can communicate with my 3 monitors on my private Ceph network. The traffic will never hit the slow networks: it will either stay in kernel / loopback (thesis) and go to the node the VM is on at memory speed, or at worst be routed across the private Ceph public network to one of the other nodes. (My public and private networks are the same because it's a 3-node homelab using a thunderbolt-net mesh.)

The router was the quick and easy way to test; I could of course have set these manual routes inside each VM instead if I cared about isolation.

fc00:: addresses are the Ceph mesh network, which has no connection other than to each node.
2600:: addresses are the nodes' vmbr0 IPv6 addresses and are fully routable on my LAN.
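
On a plain Linux router, the equivalent would be static routes along these lines (addresses are purely illustrative):
Bash:
# route each node's Ceph mesh address via its routable vmbr0 address
ip -6 route add fc00::81/128 via 2001:db8::11
ip -6 route add fc00::82/128 via 2001:db8::12
ip -6 route add fc00::83/128 via 2001:db8::13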
 
I'm also planning to use CephFS for media files, so I still need virtiofs too. :)
Redundant media files - neat!

I still might use virtioFS (I want to test both it and the Docker plugins to see if there is a meaningful difference). Does the script you have make this more fragile than if it were just built in?
 
Redundant media files - neat!

I still might use virtioFS (I want to test both it and the Docker plugins to see if there is a meaningful difference). Does the script you have make this more fragile than if it were just built in?
It's (not that bad) on 3+1 Erasure Coded SSD Pools! :)
 
Oh, I think I figured that one out... I was overthinking it for weeks. I just put this on my router:

[screenshot: static IPv6 routes configured on the router]

Now any VM running on vmbr0 with an IPv6 address on my LAN can communicate with my 3 monitors on my private Ceph network. The traffic will never hit the slow networks: it will either stay in kernel / loopback (thesis) and go to the node the VM is on at memory speed, or at worst be routed across the private Ceph public network to one of the other nodes. (My public and private networks are the same because it's a 3-node homelab using a thunderbolt-net mesh.)

The router was the quick and easy way to test; I could of course have set these manual routes inside each VM instead if I cared about isolation.

fc00:: addresses are the Ceph mesh network, which has no connection other than to each node.
2600:: addresses are the nodes' vmbr0 IPv6 addresses and are fully routable on my LAN.
Need to study this more, or hoping to find a nice write-up somewhere! ;)
 
Yes it does, thanks!

I also spent the last 24 hours reading more on CephFS and Docker, so now I am getting dangerous, thinking I know what I am talking about when I absolutely don't.

What do you think about using one of these (as the CSI driver doesn't seem tested yet...) instead of trying to use virtiofs? My thinking is that the FUSE client / kernel running in the Docker host VM will send its comms down (up?) the QEMU networking stack, over vmbr0, straight to the closest Ceph monitor (hopefully itself, as this is a home lab where the QEMU hosts and Ceph hosts are the same nodes).

i.e. the traffic should never really leave the kernel, or rather will never hit the wire, and should run at memory speed?

https://github.com/Brindster/docker-plugin-cephfs
https://github.com/flaviostutz/cepher
https://gitlab.com/n0r1sk/docker-volume-cephfs
Nice, but the plugins seem a bit 'unmaintained'!
I'm leaning towards this glusterfs-volume-plugin.
 
