Passing through HDDs with the qm set command

ieronymous

Active Member
Apr 1, 2019
Hi
I've seen a different way to pass disks to a VM when an HBA card is out of the question for some reason. You find the ID of the drive (disk/by-id, which, unlike sda, sdb, etc., doesn't change) and you pass it through to the VM with the command
Code:
qm set <vm_id> -scsi(0,1,2,3...n) /dev/disk/by-id/ata-<serial_of_disk>
Even though it works and the VM sees the drives, what about the SMART data of the disks passed through to the VM this way? I'm asking because there was a debate about whether or not the disks will be recognized as virtual disks and therefore disk information will not be displayed. As a matter of fact, on TrueNAS's disks page the serial number column is empty. Wouldn't that cause confusion when one of the disks needs to be replaced? You need to manually write down which disk corresponds to da0, da1, etc.
I've also seen the disks passed through as -virtio(1,2,3...n) instead of -scsi(1,2,3...n). I don't know if that matters or not, or in which cases you would use scsi, virtio, sata, etc. Can you elaborate on this?
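For reference, this is roughly the workflow I mean on the host (the VM id and the serial below are made up; substitute your own):
Code:
# list the stable names of all drives
ls -l /dev/disk/by-id/
# attach one drive to VM 100 as its second SCSI device
qm set 100 -scsi1 /dev/disk/by-id/ata-EXAMPLE_MODEL_SERIAL123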

PS: My use case is an R730 with 16 SAS drives on an embedded HBA330 mini monolithic. The problem is that the backplane is an expander, and I can't pass through the onboard HBA controller, because I need 8 of those drives as VM storage (to be created inside Proxmox) and the other 8 need to be passed to the virtualized TrueNAS. A complicated plan, but due to the company's budget there isn't much I can do about a separate machine for TrueNAS.
 
Passthrough with "qm set" is not real physical passthrough like passing through an HBA with PCI passthrough. All your VMs will see are virtual disks, so your disks' LBA will be reported as 512B instead of 4K, and the VM won't be able to use SMART.
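You can see this from inside the guest; something along these lines (guest device name assumed to be /dev/sda) shows the emulated sector sizes and that SMART has no real device to query:
Code:
# inside the VM: the virtual disk reports 512B sectors and a QEMU model string
lsblk -o NAME,PHY-SEC,LOG-SEC,MODEL /dev/sda
# SMART returns nothing useful here, since no real device is exposed
smartctl -a /dev/sda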
 
....so there isn't any other way of doing it?
 
If you want direct and physical access to the real disks with no abstraction/virtualization layer in between...no. Then PCI passthrough of the complete HBA is the only option.
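For completeness, passing the whole controller looks roughly like this (the PCI address and VM id are examples; IOMMU must be enabled and configured):
Code:
# find the HBA's PCI address
lspci | grep -i sas
# pass the whole controller to VM 100
qm set 100 -hostpci0 01:00.0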
 
Then PCI passthrough of the complete HBA is the only option.
I know about that, but in my case it can't be done. Even though the R730 has all the connections for a second 8-bay backplane (mine is the 8-bay model), it refuses to play along and gives (probably) false error messages about the cable or the card being disconnected. If I press the continue key it sees all the disks, but I can't rely on a server that needs help to boot into the main OS (Proxmox). The other option was to try the 16-bay backplane (which has an expander). That way, though, I can't pass through only half the disks (8 of 16) to the VM, because those would be virtual ones and not physical.
My only other solution, and probably the last one, is to use a DAS, but all the brand-name solutions have a weird card inside that converts the internal cables to external ones (SFF-8088, if I recall correctly), and I don't know if that has to be flashed to IT mode as well. I don't know if that is even possible with unknown cards, and I haven't seen a guide for doing it. So.... that is why I asked in the first place.
 
If you have dedicated bays, you can likely add a dedicated HBA for them as well?
That might be an alternative to your external DAS approach.

What makes me curious: aside from the fact that HDD identification within the VM is more difficult, why exactly do you insist on having SMART values within the VM? You can monitor them from the host as well. Have you thought of this? Asking because I was in exactly the same situation (with 2 disks "passed through", though).

I am using ZFS vdev aliases to make my life a little easier. This is what my VM config looks like:
Code:
scsi1: /dev/disk/by-vdev/C1-S6,backup=0,discard=on,iothread=1,replicate=0,size=488386584K
scsi2: /dev/disk/by-vdev/C1-S7,backup=0,discard=on,iothread=1,replicate=0,size=488386584K
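The aliases come from /etc/zfs/vdev_id.conf; a minimal sketch, with placeholder by-id names:
Code:
# /etc/zfs/vdev_id.conf -- map stable names to physical disks
alias C1-S6 /dev/disk/by-id/ata-EXAMPLE_SERIAL_1
alias C1-S7 /dev/disk/by-id/ata-EXAMPLE_SERIAL_2
After editing the file, udevadm trigger repopulates /dev/disk/by-vdev.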
 
If you have dedicated bays, you can likely add a dedicated HBA for them as well?
That might be an alternative to your external DAS approach.
But I am doing this. I have 2 HBA adapters inside the server: the embedded H330 mini mono (flashed to IT mode), which connects to the first 8 drives, and a PCIe one (HBA330) connected to the other 8 disks. It doesn't play along, due to that damn error message.
What makes me curious: aside from the fact that HDD identification within the VM is more difficult, why exactly do you insist on having SMART values within the VM? You can monitor them from the host as well. Have you thought of this?
I should have started with this... the VM will be TrueNAS, and I want to pass the disks as physically as possible.
Your method passes them as vdevs, so there's an abstraction layer once more. I don't want that.
By the way, I am not sure that if you pass the drives to a VM, Proxmox can still monitor them.
 
By the way, I am not sure that if you pass the drives to a VM, Proxmox can still monitor them.
Using "qm set" passthrough is similar to using normal virtual disks stored on any storage. Just that the data is not stored on a zvol, LVM-Thin-LV or a qcow2 file but on a physical disks. Everything your VM will see are the virtual disks with no access to the real hardware, so it shouldn't be a problem to monitor the disks using PVE. Virtualization overhead isn't that bad, but you still got some using "qm set". And not sure how bad it would be to build a ZFS pool out of "qm set" virtual disks, but in general you don't want any abstraction layer between ZFS and the physical disks, which you will get when using "qm set". Same as with a HW raid controller...ZFS will work with it in between, but isn'T recommended: https://openzfs.github.io/openzfs-d...uning/Hardware.html#hardware-raid-controllers
 
I'm not sure how bad it would be to build a ZFS pool out of "qm set" virtual disks, but in general you don't want any abstraction layer between ZFS and the physical disks, which you will get when using "qm set".
Thank you for your thorough answer. I am aware of that, and I don't want to risk finding out on a production system what implications an action like that might have. I am sticking with the other two options:
- find a way to bypass the error message with the second HBA adapter controlling the second set of eight drives
- find a guide on how to build a DAS (if something ready out of the box isn't available), or whatever it might be called, as long as it serves my purposes.
I don't know if I'd have to flash the expander inside, since no ready-made solution mentions what it uses internally.
 
And I'm not sure how bad it would be to build a ZFS pool out of "qm set" virtual disks, but in general you don't want any abstraction layer between ZFS and the physical disks, which you will get when using "qm set".
I have not done any performance tests, but for me, with a mirrored ZFS pool on a backup server for special purposes, I can confirm this works without issues. Regular scrubs don't show problems either.


By the way, I am not sure that if you pass the drives to a VM, Proxmox can still monitor them.
I can only report that the disks are still monitored on my host. So I am tempted to say it works.

What is the exact error message from your other HBA? You mentioned it, but I can't find the exact wording (or am I just overlooking it?!)
 
Regular scrubs don't show problems either.
.... maybe because they can't be recognized? Not confident, just asking. Maybe a known defective drive in a pool set up this way could give the answer. If a scrub could find errors that had already been found on a main system where that same drive was passed through physically, then we could have a more definite answer about that.

What is the exact error message from your other HBA? You mentioned it, but I can't find the exact wording (or am I just overlooking it?!)
err msg.jpg
None of the above messages is true. I don't know what error flag those geniuses at Dell came up with, even though all the equipment is Dell-compatible (HBA, cables, backplane). I have been trying to solve this for 2.5 months now.

The other possible explanation might be the second white connector for data, or the black one for power (see pic below). I have tried so many combinations of cables and connectors that I can't remember whether using the first 8-bay backplane with the second set of connectors (white and black) was giving me the issue or not. I think it did give me the error.

mother connection.jpg.png
 
You could always approach the problem by asking if you really need to run TrueNAS as a VM.

I appreciate the convenience factor on offer, but anything TrueNAS does, you can do in other ways: run Samba either on the host or in a VM for file shares, and if you want some GUI control, consider Webmin. It's a general management tool, but it does have fileshare management as part of the toolbox.
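As a rough idea, a minimal Samba share in /etc/samba/smb.conf only needs a few lines (share name, path and user are placeholders):
Code:
[tank]
   path = /tank/share
   browseable = yes
   read only = no
   valid users = youruser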
 
.... maybe because they can't be recognized?
If that were the case, then ZFS would be seriously broken. On a scrub, all data is read and compared against the checksums. If that didn't detect issues, I don't know what else it would be for.
If that does not make you confident, OK. I am fine with it.
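That is also easy to check yourself; on a pool named e.g. "tank":
Code:
# read every block and verify it against its checksum
zpool scrub tank
# any READ/WRITE/CKSUM errors would show up here
zpool status -v tank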


None of the above messages is true. I don't know what error flag those geniuses at Dell came up with, even though all the equipment is Dell-compatible (HBA, cables, backplane). I have been trying to solve this for 2.5 months now.
This looks to me like some inband and/or sideband signaling is expected but not working.
Special cables are needed for that, AFAIK.
See:
https://serverfault.com/questions/5676/what-is-a-raid-controllers-sideband-cable-used-for

/edit: can you configure the iDRAC to ignore this? It seems the management component throws this error.
 
If that were the case, then ZFS would be seriously broken. On a scrub, all data is read and compared against the checksums. If that didn't detect issues, I don't know what else it would be for.
If that does not make you confident, OK. I am fine with it.
There could be other complications. For example, virtual QEMU disks report a 512B physical sector size by default, even if the actual HDDs use 4K physical sectors. So if you don't explicitly create the pool with an ashift of 12, ZFS could create it with an ashift of 9, which could create a lot of overhead.
And caching could be problematic if ZFS thinks the data is stored on the disk, but in reality it's just cached by virtio.
And what the guest sees is not always what's actually on the disk. For example, I once had an HDD where fdisk on the host wasn't able to see any partitions, but fdisk in the guest was seeing them. And I've also seen the opposite, where the host was showing partitions but the guest couldn't see them. Since then I don't trust qm set passthrough anymore for important data.
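If someone goes this route anyway, at least force the ashift at pool creation so the 512B report can't mislead ZFS (pool and device names are examples):
Code:
# force 4K alignment regardless of the reported sector size
zpool create -o ashift=12 tank mirror /dev/sdb /dev/sdc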
 
On a scrub, all data is read and compared against the checksums. If that didn't detect issues, I don't know what else it would be for.
If that does not make you confident, OK. I am fine with it.
Fair enough. That leaves us with the extra layer then, which, even if it's a small one, still exists.

some inband and/or sideband signaling is expected but not working.
exactly

/edit: can you configure the iDRAC to ignore this? It seems the management component throws this error.
I searched every menu and couldn't find a relevant field. On Dell's forum someone might have said something, but the others having the same problem either left it unsolved, changed cabling and were OK (I've already done this), or are still waiting for a solution. None mentioned anything about changing options in the BIOS / iDRAC menus.

So if you don't explicitly create the pool with an ashift of 12, ZFS could create it with an ashift of 9, which could create a lot of overhead.
My disks (I know you weren't answering me) are 2.5-inch 1.2TB 10,000rpm SAS3 12G drives. Both the physical and logical sector size are 512B. I have created the RAID 10 ZFS pool with ashift 9.
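(For anyone double-checking their own drives, smartctl reports both values; the device name is an example:)
Code:
# logical and physical sector size as reported by the drive itself
smartctl -i /dev/sda | grep -i 'sector size'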

I appreciate the convenience factor on offer, but anything TrueNAS does, you can do in other ways: run Samba either on the host or in a VM for file shares, and if you want some GUI control, consider Webmin. It's a general management tool, but it does have fileshare management as part of the toolbox.
That is what I am already doing now, by serving the storage (RAID level 10 on ext4) via iSCSI through Webmin instead of shares. I'm tired of having to monitor a dozen things and want to have pure ZFS (I know Ubuntu, for example, can be installed on a ZFS filesystem). I want to accomplish this the way I described. The hardware only gets in my way.
 
This seems to be the cable that is responsible for the communication between the iDRAC and the chassis:
I didn't know that, so I'm hoping this could be the solution, or at least a new trial-and-error attempt.
Here are the part numbers of all the equipment needed. I omitted the part number of the steel 8-bay cage, but it has no electronic or electrical parts; it's just a steel cage which slides in from the front. What is also missing is the SAS cable between the HBA and the backplane. I used Delock's, Silverstone's and Supermicro's. Maybe that could have been the problem all along? I can't use the embedded ones, because they are of a different type; the onboard PERC 330 is a mini mono.
On the other hand, if the SAS cable were the problem, I wouldn't get the same error message when connecting the onboard controller to the second backplane via the secondary data and power motherboard connections (pic above, in my previous post). I even get the error without any SAS cables for the second HBA controller, just by connecting the second backplane to the white data cable and the black power cable.
It is driving me crazy.

Specs:
backplane : P/N 0TGNMY
power cable : P/N 0123W8
signal cable : P/N 0TRFPV

responsible for the communication between the iDRAC and the chassis:
I believe it is between the backplane and the motherboard. If the iDRAC takes its info from the motherboard afterwards.... probably.

The second R730, with 16 bay slots, has this expander, and the mini monolithic HBA adapter uses the cable below:
16 bay cabling and mini mono sas cable
This server was the reason I asked about qm set passthrough in the first place.
 
I believe it is between the backplane and the motherboard. If the iDRAC takes its info from the motherboard afterwards.... probably.
Sorry, I did simplify things a bit.
IMHO: technically, the cable connects the drive cage (backplane) with the mainboard, which allows communication with the iDRAC via its BMC (baseboard management controller).

However, what is important, and what you can see in the picture (that's why I pointed to this source), is that there are only power and some other connectors, so no SAS. That brought me to the conclusion that this could be the missing element. Using the part number you can even get them from eBay. I couldn't find a good picture there, though.

/edit: crap. You are using an R730. The cable I researched is for the R720. I have likely mixed this up with another thread.
;)
 
