How do I exclude a SATA controller?

peter247

I'm passing the full controller through to my virtualised TrueNAS, which looks to be working well, BUT when I tried stopping it and restarting it I was left with a load of errors in the logs.
That gets me concerned about data corruption. How do I stop the driver for that controller from loading?
04:00.0 SATA controller: JMicron Technology Corp. JMB58x AHCI SATA controller
backups is a dataset on the TrueNAS ZFS pool, shared back to Proxmox, so that's the problem.

Mar 21 15:01:51 nas kernel: I/O error, dev loop1, sector 0 op 0x1:(WRITE) flags 0x3800 phys_seg 1 prio class 2
Mar 21 15:01:51 nas kernel: Buffer I/O error on dev loop1, logical block 0, lost sync page write
Mar 21 15:01:51 nas kernel: EXT4-fs (loop1): I/O error while writing superblock
Mar 21 15:01:51 nas kernel: EXT4-fs (loop1): Remounting filesystem read-only
Mar 21 15:01:59 nas pvestatd[1248]: storage 'backups' is not online
Mar 21 15:02:09 nas pvestatd[1248]: storage 'backups' is not online
Mar 21 15:02:19 nas pvestatd[1248]: storage 'backups' is not online
Mar 21 15:02:22 nas kernel: I/O error, dev loop1, sector 40580032 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
Mar 21 15:02:29 nas pvestatd[1248]: storage 'backups' is not online
Mar 21 15:02:32 nas kernel: I/O error, dev loop1, sector 40580032 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
Mar 21 15:02:40 nas pvestatd[1248]: storage 'backups' is not online
Mar 21 15:02:42 nas kernel: I/O error, dev loop1, sector 40580032 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
Mar 21 15:02:49 nas pvestatd[1248]: storage 'backups' is not online
Mar 21 15:02:53 nas kernel: I/O error, dev loop1, sector 40580032 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
Mar 21 15:02:58 nas pvestatd[1248]: storage 'backups' is not online
Mar 21 15:03:01 nas kernel: CIFS: Status code returned 0xc000006d STATUS_LOGON_FAILURE
Mar 21 15:03:01 nas kernel: CIFS: VFS: \\truenas Send error in SessSetup = -13
Mar 21 15:03:01 nas kernel: CIFS: Status code returned 0xc000006d STATUS_LOGON_FAILURE
Mar 21 15:03:01 nas kernel: CIFS: VFS: \\truenas Send error in SessSetup = -13
Mar 21 15:03:01 nas kernel: CIFS: Status code returned 0xc000006d STATUS_LOGON_FAILURE
Mar 21 15:03:01 nas kernel: CIFS: VFS: \\truenas Send error in SessSetup = -13
Mar 21 15:03:01 nas kernel: I/O error, dev loop1, sector 40580032 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
Mar 21 15:03:01 nas kernel: CIFS: Status code returned 0xc000006d STATUS_LOGON_FAILURE
 
It looks like you're already passing the controller through to your TrueNAS VM, which is the right approach.

From what you're describing, the issue might be that the host is still loading the AHCI driver for that controller. When the VM stops, the device gets reattached to the host, which can lead to resets or errors in the logs. That would also explain your concern about potential data consistency issues.

You might want to try binding the controller to vfio-pci so the host never touches it at all.

For example:

1. Identify the device ID:
lspci -nn

Look for your JMicron controller, e.g. something like 197b:0585

2. Bind it to vfio:
echo "options vfio-pci ids=197b:0585" > /etc/modprobe.d/vfio.conf
update-initramfs -u

3. Reboot

This should ensure the controller is reserved for passthrough only and not claimed by the host when the VM stops.
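If grepping through the full `lspci -nn` output feels error-prone, the vendor:device pair can be extracted mechanically. A minimal sketch, using a hardcoded sample line since the real output differs per machine (on a live host you would pipe `lspci -nn | grep -i jmicron` in instead):

```shell
# Pull the [vendor:device] ID out of an `lspci -nn` line.
# Sample line hardcoded for illustration; output varies per machine.
line='04:00.0 SATA controller [0106]: JMicron Technology Corp. JMB58x AHCI SATA controller [197b:0585]'
# The vendor:device pair is the last [xxxx:xxxx] bracket on the line
# (the [0106] class code has no colon, so it is not matched).
id=$(printf '%s\n' "$line" | grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' | tail -n1 | tr -d '[]')
echo "$id"   # prints 197b:0585
```

That `197b:0585` value is what goes into the `vfio-pci ids=` option.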

Not entirely sure if this is the best way, but you might want to give this a try.
If anyone has a better approach, I'd really appreciate hearing about it.
 
Thank you.
05:00.0 SATA controller [0106]: JMicron Technology Corp. JMB58x AHCI SATA controller [197b:0585]
You got that spot on!!!!
BUT the disks are still showing under Disks on Proxmox.
 
Hi peter247,

To find out why the disks are still showing up, we need to verify if that SATA controller is successfully bound to the vfio-pci driver, rather than the host's default driver (like ahci).

Could you run the following command to check the kernel driver currently in use for that specific PCI device?

lspci -nnk -s 05:00.0

For example, here is the output from one of my SATA controllers that is not passed through to a VM (so it's still being used by the Proxmox host):

root@pve:~# lspci -nnk -s 03:00.0
03:00.0 SATA controller [0106]: ASMedia Technology Inc. ASM1062 Serial ATA Controller [1b21:0612] (rev 02)
Subsystem: ASMedia Technology Inc. ASM1061/ASM1062 Serial ATA Controller [1b21:1060]
Kernel driver in use: ahci
Kernel modules: ahci

Please check the output and look for the line that says Kernel driver in use:.
  • If it says vfio-pci, then the device is correctly isolated.
  • If it still says ahci (or something else), it means the Proxmox host is still claiming the controller, which is why the disks are still visible in the GUI.
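If you want to script that check, the field can be extracted with awk. A small sketch, with sample `lspci -nnk` output embedded so it is self-contained (on the host you would run `lspci -nnk -s 05:00.0` instead):

```shell
# Extract just the "Kernel driver in use" value from lspci -nnk output.
# Sample output embedded here; replace with the real command on a host.
sample='05:00.0 SATA controller [0106]: JMicron Technology Corp. JMB58x AHCI SATA controller [197b:0585]
	Subsystem: JMicron Technology Corp. Device [197b:0000]
	Kernel driver in use: ahci
	Kernel modules: ahci'
drv=$(printf '%s\n' "$sample" | awk -F': ' '/Kernel driver in use/ {print $2}')
echo "$drv"   # prints ahci
```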
Let us know what output you get!
 
But note that PVE also attaches this driver dynamically if possible, so this is not total proof. You'd need to look at the configs and logs as well, such as:
Bash:
find /etc/modprobe.d/* -exec tail -n+1 {} +
tail -n+1 /proc/cmdline /etc/kernel/cmdline /etc/default/grub
journalctl -b0 -g "vfio|05:00"
 
You don't stop it; you let VFIO have a go at it first. It's likely ahci. pulipulichen's command should show it.
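Another way to see what a PCI device is bound to, without parsing lspci at all, is to read the sysfs `driver` symlink. A minimal sketch: the function takes the device directory as an argument so it can be demoed against a mock tree below (on a real host you would call it as `driver_of /sys/bus/pci/devices/0000:05:00.0`):

```shell
# Report which kernel driver a PCI device is bound to via its sysfs
# `driver` symlink; prints "(none)" if nothing has claimed it.
driver_of() {
    if [ -e "$1/driver" ]; then
        basename "$(readlink -f "$1/driver")"
    else
        echo "(none)"
    fi
}

# Demo against a mock sysfs layout (hypothetical paths, safe to run anywhere):
mock=$(mktemp -d)
mkdir -p "$mock/drivers/vfio-pci" "$mock/dev"
ln -s "$mock/drivers/vfio-pci" "$mock/dev/driver"
result=$(driver_of "$mock/dev")
echo "$result"   # prints vfio-pci
rm -rf "$mock"
```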
 
You are right:
root@nas:~# lspci -nnk -s 05:00.0
05:00.0 SATA controller [0106]: JMicron Technology Corp. JMB58x AHCI SATA controller [197b:0585]
Subsystem: JMicron Technology Corp. Device [197b:0000]
Kernel driver in use: ahci
Kernel modules: ahci
root@nas:~#
 
Yes, but I specifically recommended against blacklisting and using /etc/modprobe.d/*.conf, in favour of softdep and kernel args. You can look into how they differ and decide for yourself, of course, but I'd give that a try first.
 
Yes, but I specifically recommended against blacklisting and using /etc/modprobe.d/*.conf. You can do what you want, of course, but I'd give that a try first.
I've gone with:
echo "options vfio-pci ids=197b:0585" > /etc/modprobe.d/vfio.conf
update-initramfs -u
modprobe.d is more or less empty, just vfio.conf and a blacklist for nvidiafb.
So
 
That explains why the passthrough isn't working yet. Since lspci shows the controller is still using the ahci driver, it confirms that ahci is grabbing the hardware before vfio-pci has a chance to bind to it.

As @Impact suggested, you can use softdep to force vfio-pci to load before ahci. You can try the following steps to resolve the driver conflict:

1. Create the configuration file to define the dependency:
echo "softdep ahci pre: vfio-pci" > /etc/modprobe.d/vfio-softdep.conf
2. Update the initramfs and reboot the system:

update-initramfs -u -k all
reboot

3. After rebooting, verify the driver status again:
lspci -nnk -s 05:00.0
If the configuration is correct, "Kernel driver in use" should now show vfio-pci instead of ahci.
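Before rebooting, it may be worth confirming the softdep line actually landed in the config file. A trivial sketch, using a temp file in place of /etc/modprobe.d/vfio-softdep.conf so it is safe to run anywhere:

```shell
# Sanity-check that the softdep directive is present in the config.
# A temp file stands in for /etc/modprobe.d/vfio-softdep.conf here.
conf=$(mktemp)
echo "softdep ahci pre: vfio-pci" > "$conf"
if grep -q '^softdep ahci pre: vfio-pci$' "$conf"; then
    status="softdep configured"
else
    status="softdep missing"
fi
echo "$status"   # prints softdep configured
rm -f "$conf"
```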
 