Hi all,
I have the following setup:
6 x physical nodes, all set up in a Proxmox PVE cluster (2 x E5-2667 v2 CPUs and 256 GB RAM per box)
2 x SAN controllers in my redundant Dell Compellent SAN, each with its own portal IP.
Everything is up and running, and since Proxmox won't show all the LUNs from one primary portal IP, I had to set up 2 x iSCSI storages - one for each controller.
So I have:
SAN (35 LUNs)
SAN2 (35 LUNs)
And on top of those I set up 70 LVM storages, as per the recommendations.
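To make the layout concrete, here is roughly what the relevant part of /etc/pve/storage.cfg ends up looking like with that approach - the portal IPs, target IQNs, storage names and the base volume ID below are just placeholders, not my real values, and the lvm block repeats for each of the 70 LUNs:
Code:
iscsi: SAN
        portal 10.10.10.1
        target iqn.2002-03.com.compellent:5000d31000aaaa01
        content none

iscsi: SAN2
        portal 10.10.10.2
        target iqn.2002-03.com.compellent:5000d31000aaaa02
        content none

lvm: lun01
        vgname lun01
        base SAN:0.0.1.scsi-36000d31000aaaa0100000000000000a1
        shared 1
        content images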
In the CLI the LUNs are reported 100% correctly, and multipathing etc. is all set up.
Is there no way to get Proxmox to see all the LUNs and both controllers from just one iSCSI entry?
I basically set up some VMs and all was fine and dandy - until they started randomly corrupting and ended up damaged beyond repair.
They were set up by picking one of the LVM storages as the disk storage, using VirtIO for disk + network, and the Ubuntu installation itself was done with guided use of the entire disk on LVM.
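For reference, the CLI equivalent of how those VMs were created looks roughly like this (VM ID, name, sizes and the ISO path are example values, and lun01 stands for whichever LVM storage was picked):
Code:
qm create 101 --name ubuntu-test --memory 8192 --cores 4 --ostype l26 \
    --net0 virtio,bridge=vmbr0 \
    --virtio0 lun01:100 \
    --ide2 local:iso/ubuntu-server-amd64.iso,media=cdrom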
One VM had more than 2,500 inode errors when I tried to repair it, and ultimately it was so ruined that the only option was to delete it and start over.
So there seems to be some form of data misalignment happening with that model.
Right now I have started over entirely: I set up the iSCSI storages to allow using the LUNs directly, and I am installing with guided use of the entire disk without LVM - so compared to before I removed two layers of LVM - and so far the system hasn't corrupted yet.
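In storage.cfg terms, the change for this test basically boils down to serving the LUNs directly from the iSCSI storages instead of defining LVM on top of them - something like this (placeholder values again):
Code:
iscsi: SAN
        portal 10.10.10.1
        target iqn.2002-03.com.compellent:5000d31000aaaa01
        content images

iscsi: SAN2
        portal 10.10.10.2
        target iqn.2002-03.com.compellent:5000d31000aaaa02
        content images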
So there are multiple questions here:
1) Is it possible to get Proxmox to show only one SAN with all the actual LUNs, no matter which controller they are on?
2) Has anyone experienced the corruption issues we are having, or have ideas on why it happens and what to do about it?
Just pasting my multipath conf here for reference (local disks sda and sdb are blacklisted).
multipath.conf
Code:
defaults {
    polling_interval     2
    path_selector        "round-robin 0"
    path_grouping_policy multibus
    getuid_callout       "/lib/udev/scsi_id -g -u -d /dev/%n"
    rr_min_io            100
    failback             immediate
    no_path_retry        queue
}

devices {
    # Compellent-specific settings
    device {
        vendor               "COMPELNT"
        product              "Compellent Vol"
        path_grouping_policy multibus
        getuid_callout       "/lib/udev/scsi_id --whitelisted --replace-whitespace --device=/dev/%n"
        path_selector        "round-robin 0"
        path_checker         tur
        features             "0"
        hardware_handler     "0"
        prio                 const
        failback             immediate
        rr_weight            uniform
        no_path_retry        queue
        rr_min_io            1000
        rr_min_io_rq         1
    }
}

blacklist {
    # local disks (sda and sdb, as mentioned above)
    wwid 36c81f660e05508001a6cf31a13d4e399
    wwid 1Dell_Internal_Dual_SD_0123456789AB
}
From a CLI point of view, with multipath -v3, multipath -f, multipath -l etc., everything is looking good and correct.
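For anyone who wants to double-check the same things, this is roughly the set of commands I go through on a node (map names will of course differ):
Code:
# list the active iSCSI sessions to both controllers
iscsiadm -m session

# show the current multipath maps and all their paths
multipath -ll

# verbose dry run, shows why paths are (or are not) grouped
multipath -v3

# flush a single stale map and let multipath re-create it
multipath -f <mapname>
multipath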
But something is still behind the corruption that's happening, so good ideas or experiences are welcome.
PS: I'll update later on how the testing with direct LUNs and no LVM involved at all goes.