PVE4 & SRP storage from another ZFS pool via SRPT

elurex

Active Member
Oct 28, 2015
Taiwan
Hi All

I am new to PVE4 (less than 24 hours), but I am very familiar with Ubuntu, InfiniBand, and ZFS.

My setup

PVE4 node: Mellanox ConnectX-3 VPI dual-port card (SR-IOV can be initialized, but there are currently some issues with IOMMU grouping; see the quick check after this list)
SAN node: Mellanox ConnectX-3 VPI dual-port card, with a ZFS pool serving virtual block devices over RDMA (Ubuntu 14.04.3, but it could be another PVE4 node as well)
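
For the IOMMU grouping issue mentioned above, a quick way to see how the devices are grouped (just a sysfs walk, nothing SRP-specific) is:
Code:
find /sys/kernel/iommu_groups/ -type l | sort
Each symlink is a device, and the directory it sits in is its IOMMU group.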

in /etc/modules
Code:
mlx4_en
mlx4_ib
ib_ipoib
ib_umad
ib_srp
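
/etc/modules only takes effect at the next boot; to bring the stack up immediately on a running node, the same modules can be loaded by hand (assuming the standard Debian module names listed above):
Code:
for m in mlx4_en mlx4_ib ib_ipoib ib_umad ib_srp; do modprobe $m; done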

and I have a startup script that executes the following
Code:
echo "id_ext=xxxxxx,ioc_guid=xxxxxx,dgid=xxxxx,pkey=ffff,service_id=xxxxx" > /sys/class/infiniband_srp/srp-mlx4_0-1/add_target

The disks are found (controller SCST_BIO)
Code:
: ~# lsscsi
[0:0:1:0]    cd/dvd  MATSHITA DVD-ROM SR-8178  PZ16  /dev/sr0
[4:2:0:0]    disk    LSI      MegaRAID SAS RMB 1.40  /dev/sda
[5:0:0:0]    disk    SCST_BIO kvm-node0         310  /dev/sdb
[5:0:0:1]    disk    SCST_BIO kvm-node1         310  /dev/sdc
[5:0:0:2]    disk    SCST_BIO kvm-node2         310  /dev/sdd
[5:0:0:3]    disk    SCST_BIO kvm-node3         310  /dev/sde
[5:0:0:4]    disk    SCST_BIO kvm-node4         310  /dev/sdf
[5:0:0:5]    disk    SCST_BIO kvm-node5         310  /dev/sdg

I am assuming that adding them to a VM should follow the PVE wiki article Physical_disk_to_kvm
(I can't even include URLs in my post yet)
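
If that route works here too, it should boil down to one qm set per disk; the VM ID and the by-id name below are just placeholders (using /dev/disk/by-id rather than /dev/sdX, since the sdX letters can change between boots):
Code:
ls -l /dev/disk/by-id/            # find the stable name pointing at /dev/sdb etc.
qm set 100 -virtio1 /dev/disk/by-id/scsi-<your-kvm-node0-id>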

Is there a more native way for PVE to support these virtual block devices, other than ZFS over iSCSI? The disks are not actually local (they are multipathed), so when I set up a PVE cluster it should be able to live-migrate without needing to move data (since that is actually handled by the SAN).
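
From what I understand, the usual shared-LUN pattern would be to put LVM on top of the multipathed SRP LUN and add it as a shared LVM storage, which the cluster can then live-migrate on without copying data; a minimal sketch, with made-up multipath, volume group, and storage names:
Code:
pvcreate /dev/mapper/kvm-node0                 # multipath device of the SRP LUN (name is made up)
vgcreate vg_srp /dev/mapper/kvm-node0
pvesm add lvm srp-san --vgname vg_srp --shared 1 --content images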

Or do I need to use the Ceph storage cluster method described in the PVE wiki article Storage:_Ceph (I can run Ceph on top of ZFS), together with Mellanox docs/DOC-2141? The downside is that it is not RDMA over an InfiniBand network but RDMA over an Ethernet network (a huge performance drop).

I am trying to move away from VMware to PVE4 because it supports LXC and ZFS natively (ZFS on root! However, many people use zpools in stripe-mirror mode, and the installer is currently limited to RAID1 or RAIDZx). Eventually I plan to run four PVE4 nodes, with two of them acting as SANs (backups can be done using pve-zsync, which is basically a zfs send | zfs recv script), all interconnected with Mellanox ConnectX-3 VPI cards.
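
On the stripe-mirror point: from what I can tell the installer's RAID1 limit doesn't rule it out, since a second mirror vdev can be attached after installation; the device names and the pve-zsync job below are only placeholders:
Code:
# extend the installer-created RAID1 rpool into a stripe of mirrors
zpool add rpool mirror /dev/disk/by-id/ata-DISK3 /dev/disk/by-id/ata-DISK4
# example pve-zsync job pushing VM 100's disks to the second SAN node
pve-zsync create --source 100 --dest 10.0.0.2:tank/backup --name nightly --maxsnap 7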

PVE4 supports both InfiniBand and ZFS, but somehow, when the two are used together, some of the HPC features are missing.
 
