PCI Passthrough with SATA controller - can't reset pci device

navrone77

Member
Mar 27, 2014
Hello,

On my newly built server I want to set up a storage VM running OpenMediaVault. To get full access to SMART data and HDD spin-down control, I'd like to pass through the entire onboard SATA controller.
The Proxmox installation is on an SSD attached to a PCIe SATA RAID controller (3ware 9650SE-2LP) in single-drive mode.

My problem: when I try to pass through the SATA controller, the following error appears when the VM starts:
Code:
TASK ERROR: can't reset pci device '00:1f.2'


My VM config:
Code:
balloon: 512
bootdisk: virtio0
cores: 1
ide2: local:iso/openmediavault_0.5.0.24_amd64.iso,media=cdrom
memory: 4096
name: OpenMediaVault-VM
net0: virtio=2E:A1:6F:01:28:3D,bridge=vmbr0
ostype: l26
sockets: 1
virtio0: local:110/vm-110-disk-1.qcow2,size=10G
hostpci0: 00:1f.2

I can't find any additional logs that would be useful for debugging.
syslog:
Code:
Mar 27 12:01:55 proxmox pvedaemon[2688]: <root@pam> starting task UPID:proxmox:00000AE4:0000116C:53340523:qmstart:110:root@pam:
Mar 27 12:01:55 proxmox pvedaemon[2788]: start VM 110: UPID:proxmox:00000AE4:0000116C:53340523:qmstart:110:root@pam:
Mar 27 12:01:55 proxmox kernel: ahci 0000:00:1f.2: PCI INT B disabled
Mar 27 12:01:55 proxmox pvedaemon[2788]: can't reset pci device '00:1f.2'
Mar 27 12:01:55 proxmox kernel: pci-stub 0000:00:1f.2: claimed by stub
Mar 27 12:01:55 proxmox pvedaemon[2688]: <root@pam> end task UPID:proxmox:00000AE4:0000116C:53340523:qmstart:110:root@pam: can't reset pci device '00:1f.2'


Here is an overview of the relevant hardware:
Code:
Supermicro X10SLM+-LN4F (Haswell generation) with newest firmware
Intel Xeon Processor E3-1230v3 Socket 1150
LSI 3ware 9650SE-2LP controller in PCIe slot

The newest Proxmox version is installed and the system is up to date.
Detailed software setup (pveversion -v):
Code:
proxmox-ve-2.6.32: 3.2-121 (running kernel: 2.6.32-27-pve)
pve-manager: 3.2-1 (running version: 3.2-1/1933730b)
pve-kernel-2.6.32-27-pve: 2.6.32-121
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.5-1
pve-cluster: 3.0-12
qemu-server: 3.1-15
pve-firmware: 1.1-2
libpve-common-perl: 3.0-14
libpve-access-control: 3.0-11
libpve-storage-perl: 3.0-19
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-6
vzctl: 4.0-1pve5
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.7-4
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.2-1

VT-d is enabled in the BIOS and seems to be working correctly: passing through the onboard network interfaces worked just fine. That NIC passthrough is currently not configured, so it cannot be interfering here.
Here is the output of dmesg | grep -e DMAR -e IOMMU:
Code:
ACPI: DMAR 00000000dd904398 00080 (v01 INTEL HSW 00000001 INTL 00000001)
Intel-IOMMU: enabled
dmar: IOMMU 0: reg_base_addr fed90000 ver 1:0 cap d2008c20660462 ecap f010da
IOMMU 0xfed90000: using Queued invalidation
IOMMU: Setting RMRR:
IOMMU: Setting identity map for device 0000:00:1d.0 [0xdf69b000 - 0xdf6aa000]
IOMMU: Setting identity map for device 0000:00:1a.0 [0xdf69b000 - 0xdf6aa000]
IOMMU: Setting identity map for device 0000:00:14.0 [0xdf69b000 - 0xdf6aa000]
IOMMU: Prepare 0-16MiB unity mapping for LPC
IOMMU: Setting identity map for device 0000:00:1f.0 [0x0 - 0x1000000]

Overview of the devices (lspci):
Code:
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v3 Processor DRAM Controller (rev 06)
00:14.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI (rev 04)
00:16.0 Communication controller: Intel Corporation 8 Series/C220 Series Chipset Family MEI Controller #1 (rev 04)
00:16.1 Communication controller: Intel Corporation 8 Series/C220 Series Chipset Family MEI Controller #2 (rev 04)
00:1a.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #2 (rev 04)
00:1c.0 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #1 (rev d4)
00:1c.2 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #3 (rev d4)
00:1c.3 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #4 (rev d4)
00:1c.4 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #5 (rev d4)
00:1c.6 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #7 (rev d4)
00:1c.7 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #8 (rev d4)
00:1d.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation C224 Series Chipset Family Server Standard SKU LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] (rev 04)
00:1f.3 SMBus: Intel Corporation 8 Series/C220 Series Chipset Family SMBus Controller (rev 04)
00:1f.6 Signal processing controller: Intel Corporation 8 Series Chipset Family Thermal Management Controller (rev 04)
01:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev 03)
02:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 30)
03:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)
04:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)
05:00.0 RAID bus controller: 3ware Inc 9650SE SATA-II RAID PCIe (rev 01)
06:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)
07:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)

The corresponding SATA controller has the following specs:
Code:
00:1f.2 SATA controller: Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] (rev 04) (prog-if 01 [AHCI 1.0])
       Subsystem: Super Micro Computer Inc Device 0806
       Flags: 66MHz, medium devsel, IRQ 19
       I/O ports at f050 [size=8]
       I/O ports at f040 [size=4]
       I/O ports at f030 [size=8]
       I/O ports at f020 [size=4]
       I/O ports at f000 [size=32]
       Memory at f7612000 (32-bit, non-prefetchable) [size=2K]
       Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
       Capabilities: [70] Power Management version 3
       Capabilities: [a8] SATA HBA v1.0
       Kernel driver in use: pci-stub


As described in the KVM Howto, I set up the stub device with the following commands:
Code:
lspci -n
00:1f.2 0106: 8086:8c02 (rev 04)

echo "8086 8c02" > /sys/bus/pci/drivers/pci-stub/new_id
echo 0000:00:1f.2 > /sys/bus/pci/devices/0000:00:1f.2/driver/unbind
echo 0000:00:1f.2 > /sys/bus/pci/drivers/pci-stub/bind

lspci -k
00:1f.2 SATA controller: Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] (rev 04)
       Subsystem: Super Micro Computer Inc Device 0806
       Kernel driver in use: pci-stub
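
For reference, the three echo commands above can be wrapped in a small helper. This is only a sketch: the function name stub_bind and the SYSFS_ROOT override are made up for illustration (the override exists purely so the function can be dry-run against a fake directory tree), and it assumes the pci-stub driver is available on the host. It reads the vendor and device IDs from sysfs instead of copying them by hand from lspci -n.

```shell
#!/bin/sh
# Hypothetical helper: hand one PCI function over to pci-stub.
# SYSFS_ROOT is an illustrative override, not a kernel or Proxmox setting.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

stub_bind() {
    bdf="$1"    # full PCI address, e.g. 0000:00:1f.2
    dev="$SYSFS_ROOT/bus/pci/devices/$bdf"
    stub="$SYSFS_ROOT/bus/pci/drivers/pci-stub"

    [ -d "$dev" ]  || { echo "no such device: $bdf" >&2; return 1; }
    [ -d "$stub" ] || { echo "pci-stub driver not available" >&2; return 1; }

    # sysfs reports the IDs as 0x8086 / 0x8c02; strip the prefix so the
    # value written matches the manual command ("8086 8c02").
    vendor=$(sed 's/^0x//' "$dev/vendor")
    device=$(sed 's/^0x//' "$dev/device")

    # Same order as the manual steps: register the ID, release the
    # device from its current driver (ahci here), bind it to the stub.
    echo "$vendor $device" > "$stub/new_id"
    if [ -e "$dev/driver" ]; then
        echo "$bdf" > "$dev/driver/unbind"
    fi
    echo "$bdf" > "$stub/bind"
}
```

Run as root on the host, for example stub_bind 0000:00:1f.2, and confirm the result with lspci -k as shown above.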

I noticed that the SATA controller's IRQ (IRQ 19) is also used by two of the four onboard network cards. Could this cause trouble? Since disabling the network cards is not an option, how can I influence the IRQ assignment?
I also tried the new 3.10 kernel (same error) and the "allow_unsafe_assigned_interrupts" option, both without luck.
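
To see which drivers currently have a handler registered on a given IRQ line, /proc/interrupts can be read directly. The sketch below is an illustration under assumptions, not something taken from the KVM howto: irq_line and irq_is_shared are hypothetical names, and the "comma means shared" test relies on the kernel listing several handlers on one line separated by commas. Keep in mind that a device bound to pci-stub registers no handler, so it will not show up in this file at all.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting one IRQ line.
# The optional second argument (a file in /proc/interrupts format)
# is only there so the functions can be tried on saved output.

# Print the /proc/interrupts row for IRQ number $1.
irq_line() {
    awk -v irq="$1" '$1 == (irq ":")' "${2:-/proc/interrupts}"
}

# Succeed if more than one handler is listed on IRQ $1.
irq_is_shared() {
    irq_line "$@" | grep -q ','
}
```

For example, irq_line 19 shows who is on IRQ 19 right now, and irq_is_shared 19 && echo shared gives a quick yes/no.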


Here are the kernel messages for the device 00:1f.2:
Code:
Mar 27 12:01:16 proxmox kernel: pci 0000:00:1f.2: reg 10: [io  0xf050-0xf057]
Mar 27 12:01:16 proxmox kernel: pci 0000:00:1f.2: reg 14: [io  0xf040-0xf043]
Mar 27 12:01:16 proxmox kernel: pci 0000:00:1f.2: reg 18: [io  0xf030-0xf037]
Mar 27 12:01:16 proxmox kernel: pci 0000:00:1f.2: reg 1c: [io  0xf020-0xf023]
Mar 27 12:01:16 proxmox kernel: pci 0000:00:1f.2: reg 20: [io  0xf000-0xf01f]
Mar 27 12:01:16 proxmox kernel: pci 0000:00:1f.2: reg 24: [mem 0xf7612000-0xf76127ff]
Mar 27 12:01:16 proxmox kernel: pci 0000:00:1f.2: PME# supported from D3hot
Mar 27 12:01:16 proxmox kernel: pci 0000:00:1f.2: PME# disabled
Mar 27 12:01:16 proxmox kernel: ahci 0000:00:1f.2: version 3.0
Mar 27 12:01:16 proxmox kernel: ahci 0000:00:1f.2: PCI INT B -> GSI 19 (level, low) -> IRQ 19
Mar 27 12:01:16 proxmox kernel: ahci 0000:00:1f.2: irq 39 for MSI/MSI-X
Mar 27 12:01:16 proxmox kernel: ahci 0000:00:1f.2: AHCI 0001.0300 32 slots 6 ports 6 Gbps 0x3f impl SATA mode
Mar 27 12:01:16 proxmox kernel: ahci 0000:00:1f.2: flags: 64bit ncq pm led clo pio slum part ems apst
Mar 27 12:01:16 proxmox kernel: ahci 0000:00:1f.2: setting latency timer to 64
Mar 27 12:01:55 proxmox kernel: ahci 0000:00:1f.2: PCI INT B disabled
Mar 27 12:01:55 proxmox pvedaemon[2788]: can't reset pci device '00:1f.2'
Mar 27 12:01:55 proxmox kernel: pci-stub 0000:00:1f.2: claimed by stub
Mar 27 12:01:55 proxmox pvedaemon[2688]: <root@pam> end task UPID:proxmox:00000AE4:0000116C:53340523:qmstart:110:root@pam: can't reset pci device '00:1f.2'


Does anyone have an idea what could cause this issue or how to debug it further? If more log output is needed, please let me know.

Any help is greatly appreciated, thanks in advance.
 
I have some news on this issue. After finding this thread, I tried assigning the PCI device manually ... and it seems to work!

Code:
qm> device_add pci-assign,host=00:1f.2,id=intel_sata
PCI region 5 at address 0xf7612000 has size 0x800, which is not a multiple of 4K.  You might experience some performance hit due to that.

qm> info pci
....
  Bus  0, device   4, function 0:
    SATA controller: PCI device 8086:8c02
      IRQ 10.
      BAR0: I/O at 0xffffffffffffffff [0x0006].
      BAR1: I/O at 0xffffffffffffffff [0x0002].
      BAR2: I/O at 0xffffffffffffffff [0x0006].
      BAR3: I/O at 0xffffffffffffffff [0x0002].
      BAR4: I/O at 0xffffffffffffffff [0x001e].
      BAR5: 32 bit memory at 0xffffffffffffffff [0x000007fe].
      id "intel_sata"
....

However, the two drives attached to the Intel SATA controller are not recognized in OpenMediaVault...

Code:
root@openmediavault:~# fdisk -l


Disk /dev/vda: 10.7 GB, 10737418240 bytes
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000d8e9d


   Device Boot      Start         End      Blocks   Id  System
/dev/vda1   *           1        1246    10005504   83  Linux
/dev/vda2            1246        1306      477185    5  Extended
/dev/vda5            1246        1306      477184   82  Linux swap / Solaris


But one step at a time: what is the difference between assigning a device manually and doing so via the VM config file? In general, passthrough seems to work.
 
Maybe something about the motherboard doesn't like passthrough of the onboard SATA controller? Can you swap them, i.e. pass through the 3ware and use the motherboard SATA for the Proxmox install?
 
Huh, yeah, I could try that, but I don't really see how it would help me.

As I stated, passthrough in general is working, and the 3ware controller is a completely different device with a different driver. Is there anything else I could try?
 
PCI passthrough is kind of arcane. Some controllers work, some don't. Some onboard controllers have issues because of the PCI bridge they sit behind, or for other reasons. I've seen an HBA that would not pass through properly in slot X but did after being moved to slot Y. I was proposing a simple test to see whether this is that kind of problem; had it succeeded, it would have solved your problem. What I got in return was snideness. If I had had a better alternative, I would have proposed it first, don't you think?
 
Maybe this will help:

Try displaying the bus tree with "lspci -t",

then look for your SATA controller. Maybe there is a parent bus node that must be passed through too?

The tree should look like:

-[0000:00]-+-00.0
           +-01.0-[01]--+-00.0
           |            \-00.1
           +-01.1-[02]----00.0
           +-14.0
           +-1a.0
           +-1c.0-[03-04]----00.0-[04]----00.0
           +-1c.2-[05]----00.0
           +-1c.3-[06]----00.0
           +-1d.0
           +-1f.0
           +-1f.2
           +-1f.3
           \-1f.6
 
I had this thought too, but I don't see any dependency there. Or am I wrong?

Code:
-[0000:00]-+-00.0
           +-14.0
           +-16.0
           +-16.1
           +-1a.0
           +-1c.0-[01-02]----00.0-[02]----00.0
           +-1c.2-[03]----00.0
           +-1c.3-[04]----00.0
           +-1c.4-[05]----00.0
           +-1c.6-[06]----00.0
           +-1c.7-[07]----00.0
           +-1d.0
           +-1f.0
           +-1f.2
           +-1f.3
           \-1f.6

The device in question is 1f.2. It looks to me like it is attached directly to the root bus and not behind a parent bridge, right?
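
Another way to double-check this without reading the tree by eye: each entry under /sys/bus/pci/devices is a symlink into /sys/devices, and the resolved path contains every bridge the device sits behind. A minimal sketch, where pci_parent_chain and the SYSFS_ROOT override are hypothetical names chosen for illustration:

```shell
#!/bin/sh
# Hypothetical helper: list the PCI bridges above a device.
# SYSFS_ROOT is an illustrative override for trying this on a fake tree.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

pci_parent_chain() {
    # Resolve the symlink to the real location under .../devices/
    real=$(readlink -f "$SYSFS_ROOT/bus/pci/devices/$1") || return 1
    # Split the path into components and keep only those that look like
    # a PCI address (dddd:bb:dd.f), excluding the device itself.
    echo "$real" | tr '/' '\n' |
        awk -v self="$1" '/^[0-9a-f]+:[0-9a-f]+:[0-9a-f]+\.[0-7]$/ && $0 != self'
}
```

If pci_parent_chain 0000:00:1f.2 prints nothing, the controller sits directly on the root bus; a device behind a root port would print that port's address.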
 
As I just noticed, my post from yesterday somehow didn't get approved.

Anyway, I found out that, according to the thread here, pass-through works when the device is added on the fly.

The following procedure worked for me:

Code:
qm> device_add pci-assign,host=00:1f.2,id=intel_sata

PCI region 5 at address 0xf7612000 has size 0x800, which is not a multiple of 4K.  You might experience some performance hit due to that.

Regardless of that warning, the SATA controller got passed through to the VM.

Code:
.....
Info pci: 
  Bus  0, device   4, function 0:
    SATA controller: PCI device 8086:8c02
      IRQ 10.
      BAR0: I/O at 0xffffffffffffffff [0x0006].
      BAR1: I/O at 0xffffffffffffffff [0x0002].
      BAR2: I/O at 0xffffffffffffffff [0x0006].
      BAR3: I/O at 0xffffffffffffffff [0x0002].
      BAR4: I/O at 0xffffffffffffffff [0x001e].
      BAR5: 32 bit memory at 0xffffffffffffffff [0x000007fe].
      id "intel_sata"
.....

So it looks like the controller really can be attached to a VM manually. The only issue I encountered was that the drives attached to the passed-through controller were not recognized inside the VM.
I guess this could be because the device was assigned at runtime.

The interesting question is: how does the manual "device_add pci-assign" command differ from specifying a hostpci device in the VM config?
 
Hm, I see, but why would the device reset fail? I bound the controller to pci-stub, so its driver is not loaded on the host. What else could prevent the reset?
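
One thing that might be worth checking here (an assumption; nothing in this thread confirms that this is what the start task trips over): the kernel only creates a reset file in the device's sysfs directory when it has found some way to reset that single function. The lspci output earlier lists only MSI, Power Management and SATA HBA capabilities for 00:1f.2, so it is at least plausible that no such method exists. The sketch below merely tests for the file; has_reset_file, report_reset and the SYSFS_ROOT override are hypothetical names.

```shell
#!/bin/sh
# Hypothetical check: does the kernel expose a reset attribute?
# SYSFS_ROOT is an illustrative override for trying this on a fake tree.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Succeed if <device>/reset exists for the PCI address in $1.
has_reset_file() {
    [ -e "$SYSFS_ROOT/bus/pci/devices/$1/reset" ]
}

# Print a one-line summary for the PCI address in $1.
report_reset() {
    if has_reset_file "$1"; then
        echo "$1: reset file present"
    else
        echo "$1: no reset file found"
    fi
}
```

Running report_reset 0000:00:1f.2 on the host and comparing it with a device that passes through fine (one of the I210 NICs, for example) might show whether the two differ in this respect.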

Bump. Any ideas about this? It looks like I'm stuck at this point...
 
Bumping this again, because I have the same issue with the built-in SATA controller of my ASRock Z87E-ITX.
 
