Given my configuration, what is the best way to achieve reliable storage?

surfrock66

Active Member
Feb 10, 2020
I'm a bit overwhelmed setting up the shared storage for a new Proxmox cluster in my lab. This is our previous-generation hardware that was sitting idle; each of the 10 servers has 4 ~300GB SAS disks. The cluster is connected to our production Dell SAN, where a first 2TB test LUN is presented to Proxmox over iSCSI. I have a VM on the cluster backed by the SAN, and it's working fine.

When I set these up, each server's storage was put into a RAID-5 of roughly 850GB; this was the configuration we used for the previous hypervisor on these hosts, XenServer, and it worked well and we could tolerate drive failures seamlessly. The RAID controller is a Dell PERC, but that doesn't matter much.

I partitioned it so the OS uses 100GB of that space. We were not planning on using the local-lvm storage on the individual hosts, so we didn't worry about it too much. The rough storage layout is this:

Code:
root@PROX-01:~# lsblk
NAME                                       MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                                          8:0    0 836.6G  0 disk
├─sda1                                       8:1    0  1007K  0 part
├─sda2                                       8:2    0   512M  0 part
├─sda3                                       8:3    0  99.5G  0 part
│ ├─pve-swap                               253:0    0     8G  0 lvm  [SWAP]
│ ├─pve-root                               253:1    0  24.8G  0 lvm  /
│ ├─pve-data_tmeta                         253:2    0     1G  0 lvm
│ │ └─pve-data                             253:4    0  52.4G  0 lvm
│ └─pve-data_tdata                         253:3    0  52.4G  0 lvm
│   └─pve-data                             253:4    0  52.4G  0 lvm
└─sda4                                       8:4    0 736.6G  0 part
sdb                                          8:16   0     2T  0 disk
└─DS--PROX--Servers-vm--100--disk--0 253:5    0    80G  0 lvm
sr0                                         11:0    1  1024M  0 rom

So now I'm stuck. Proxmox can't do snapshots on LVM over iSCSI, and Ceph doesn't want to work on storage backed by a RAID controller. When I set this up I intended to try some sort of distributed storage on /dev/sda4, because the guide made the RAID issue sound like a warning rather than a hard stop, but it seems an OSD simply will not create on a RAID-backed device.

Theoretically, if I blew my nodes away and got rid of the RAID, I could make an OS disk and have 3 data disks per host, and then build a proper Ceph setup, but if the OS disk dies I'm toast. These are Dell 11th-generation PowerEdge servers, and I don't trust the disks even though many have been replaced over the years. Between orchestration and backups I'm sure I could rebuild a host quickly, but I don't want to deal with that, and it feels like a downgrade from what the RAID-5 gives us now.

I have zero budget for new hardware; this is a lab experiment because we're having issues with, and may want to move away from, VMware in our production environment. I can't get a ZFS storage device: my existing Compellent is only 50% provisioned, so there's no business driver or urgency to add something that presents ZFS out of the box.

Am I out of luck getting shared, fault-tolerant storage that supports snapshots given my current hardware configuration? I'm totally open to the idea that I have a terrible fundamental misunderstanding of what's going on here; feel free to tell me if I did something dumb.
 
I don't think you are misunderstanding anything in a major way. You are correct that:
- LVM layered on any shared storage (iSCSI included) must be thick/standard LVM, and snapshots are not supported (see the sketch after this list)
- ZFS over iSCSI is a special implementation that requires very specific software. If you were to implement it, you would need to architect high availability for that software on your own.
- Using Ceph on RAID-backed storage is not recommended for many reasons. You did not post what prevented you from forcing the system to work this way; it should be possible, in theory, even though it's discouraged. It's just a lab in the end.
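
For reference, thick LVM on top of an iSCSI LUN looks roughly like this in /etc/pve/storage.cfg (the IDs, portal, target and VG name below are placeholders, not your actual values):

Code:
iscsi: dell-san
        portal 10.0.0.5
        target iqn.2002-03.com.compellent:lab-lun0
        content none

lvm: san-lvm
        vgname vg_san
        shared 1
        content images

Even set up this way, snapshots are not available; the VM disks are plain logical volumes on the shared VG.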

Perhaps you can have your RAID card mirror 2 disks for boot and present the others directly for Ceph?
You can also forgo the RAID card, set up mdraid RAID1 for the boot disk and let Ceph handle the rest. That would be outside the standard PVE install.
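
A rough sketch of the mdraid route, assuming two disks are left for the OS (device names are examples; the PVE installer won't set this up for you, so it means a Debian install with the PVE packages on top):

Code:
# mirror the two OS disks; the remaining disks stay raw for Ceph
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb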

The table here is the appropriate reference: https://pve.proxmox.com/wiki/Storage
You need anything with shared=yes && snapshots=yes. Considering the limits of your lab as described, it's either Ceph or GlusterFS+qcow2 (sketch below).
Ceph is distributed storage, so it depends heavily on network connectivity. Can your lab handle that?
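
If you went the GlusterFS route instead, the storage definition is only a few lines (server addresses and volume name below are placeholders); you then pick qcow2 as the disk format when creating VM disks to get snapshots:

Code:
glusterfs: lab-gluster
        server 10.0.0.11
        server2 10.0.0.12
        volume pve-images
        content images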


 
Yeah, the networking stack is shared with production; we built this connected to both the prod and lab VLANs in case we decide it's reliable enough. There are 4 NICs per server, bonded with LACP, but we don't have them all cabled up until we clear some ports on the switch from a decommissioned phone system.
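
For reference, the bonds are the usual LACP setup in /etc/network/interfaces, roughly like this (interface names and addresses here are placeholders, not our real ones):

Code:
auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2 eno3 eno4
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3

auto vmbr0
iface vmbr0 inet static
        address 10.10.10.11/24
        gateway 10.10.10.1
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0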

When I ran the osd create command, it aborted, and my searching said that to bypass that I'd have to drop down to ceph-volume lvm create commands, which seemed ill-advised.

Code:
root@PROX-01:~# pveceph osd create /dev/sda4
unable to get device info for '/dev/sda4'
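
The workaround I kept finding was to skip pveceph and call ceph-volume directly against the partition, something like this (I have not actually run it):

Code:
ceph-volume lvm create --data /dev/sda4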
 
I did do that, but the results made me think it was a bad idea. Because the OS and Ceph would be backed by the same RAID-5, I read this:

"Do not mix the OS disk with Ceph. Ceph will thrash the performance of the disk. Besides that the OS might grind to a halt, also Ceph won't benefit."

And I thought I'd end up doing more harm than good with this method. I think I'm going to do what you said originally: remake the RAID with 2 disks, then use the other 2 for Ceph. It'll suck to rebuild, but it's a good test of our orchestration and documentation.
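
Once the PERC is reconfigured, the plan on each node is basically this (disk names will depend on how the controller presents the pass-through disks):

Code:
# clear whatever is left on the two pass-through disks, then create the OSDs
wipefs -a /dev/sdb
wipefs -a /dev/sdc
pveceph osd create /dev/sdb
pveceph osd create /dev/sdc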
 
