[SOLVED] Suggestions on mirroring boot drives, update! Booting directly from NVMe on older servers.

rsr911

Member
Nov 16, 2023
First my setup:

Servers are Lenovo RD440s: 8 SAS bays, all used for Ceph drives. Currently the former DVD slot has an adapter to run a single SATA SSD as my boot drive. I have no standard internal SATA power connectors and only one SATA port. My HBA is full with the 8 Ceph drives attached. The board does have two SAS connectors, and I believe it also has two internal USB ports.

So how can I set up a mirrored pair of drives for boot?

Ideas so far:

1) Figure out a power splitter from the laptop-size DVD interface and use two SAS ports off the motherboard for two SATA SSDs. Mirror with ZFS, or with onboard RAID if I can figure out how to get its BIOS to engage with my RAID/HBA cards installed (this is probably just a BIOS setting, like turning off the card's BIOS).

2) Use SATA SSDs in external USB enclosures but put them inside the servers. Mirror with ZFS.

3) Add a bootable SATA card to the last available slot and again figure out internal power.

4) Use the RAID/HBA cards I already have and again figure out the internal power.

I'm sure someone else has faced this issue using older hardware and/or all the bays for storage. I'm open to suggestions. I do want to use real drives though, not thumb drives or microSD cards, as these will be production servers.
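
If I go the ZFS route (ideas 1 or 2), my understanding is that an existing single-disk rpool can be turned into a mirror after the fact, roughly like this. A rough sketch only, with placeholder device names, assuming the standard PVE ZFS-on-root layout (ESP on -part2, pool on -part3):

Code:
# Replicate the partition table from the current boot SSD to the new one
sgdisk /dev/disk/by-id/CURRENT-SSD -R /dev/disk/by-id/NEW-SSD
sgdisk -G /dev/disk/by-id/NEW-SSD              # give the copy new unique GUIDs
# Attach the new ZFS partition to the single-disk rpool, turning it into a mirror
zpool attach rpool /dev/disk/by-id/CURRENT-SSD-part3 /dev/disk/by-id/NEW-SSD-part3
# Make the new disk bootable as well
proxmox-boot-tool format /dev/disk/by-id/NEW-SSD-part2
proxmox-boot-tool init /dev/disk/by-id/NEW-SSD-part2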
 
Or avoid this whole issue and run the hypervisor off ramdisk.
Servers are 128GB each but can go to 196GB. Are you pulling my leg or is this a viable option? If it is then it would write once to disk on shutdown correct? How would I do this?
 
I would start with something like here: https://gist.github.com/avinash-oza/9791c4edd78a03540dc69d6fbf21bd9c

It's viable for any system. Of course, I assume the servers have battery backup so that loss of power is not an issue, and yes, I would probably dump to disk on restart/shutdown. We are talking mostly about /var; even the cluster filesystem is eventually synced into the config.db file there on each individual node.

EDIT: Now, loading ~2GB into RAM initially, especially if it is over the network, will take a while, but how often does one boot a hypervisor, and how critical is the speed there compared with usual server boot times ...
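
The "dump to disk" part is conceptually just a sync of the RAM-backed /var to persistent storage; a minimal sketch, with /persist/var as a made-up on-disk target, hooked into whatever shutdown unit or script you prefer:

Code:
# Sync the volatile /var back to persistent storage (run from a shutdown hook)
rsync -aHAX --delete /var/ /persist/var/
# ... and the reverse at boot, before services start
rsync -aHAX /persist/var/ /var/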
 
Last edited:
  • Like
Reactions: rsr911
I have a large APC Smart-UPS in the rack, but it's running dumb right now. Haven't taken the time to tie it into the servers for soft shutdown yet.

An alternative idea is a 2.5" adapter I found for dual B+M-key NVMes that uses a SATA interface as JBOD. Then I could mirror with ZFS. But how much longer will those types of drives be available? I think the easy way is to use the onboard SAS if I can adapt the DVD power.

But running in RAM is a neat idea. I'll read that link.
 

The conventional wisdom is that everything fails, all of the time. In the case of a cluster, what is the added value of mirrored drives? Anything critical should be an HA setup; another cluster node will take over while you go change the faulty component. You can minimize the chance of it failing by getting a reliable SATA DOM, but generally I would not bother having the system drive mirrored; you also do not have a failsafe for e.g. a motherboard component short circuit.
 
  • Like
Reactions: rsr911
USB drives will cause pain for sure, one day or another. And screw SATA DOMs; Proxmox logs quite a lot. You say you have a proper disk bay available, so use that one.

Sounds like you are running a cluster anyway, so I'd recommend choosing a proper datacenter SSD as the boot drive and not trying to mirror it at all.
That's what HA is about: you can afford to lose a node for a while.

I prefer mirrored drives, but in this situation a single robust disk sounds more viable than patchworking around it.
 
  • Like
Reactions: rsr911
So how can I set up a mirrored pair of drives for boot?
Not practical in your environment, but you have to ask yourself what you are after. You already have a cluster; if a member node dies you can just replace it. If you're running this for "production" (e.g., you make money with it) you may want to rethink your hardware choices, but if there is no monetary value to downtime, it's probably not worth investing money, time, or effort.
 
  • Like
Reactions: rsr911
I do have one other potential option. I have Optane NVMe drives that I am not using at present. The machine will not boot from NVMe. However, on a home PC of similar age that would not boot from an NVMe, I used a small SATA SSD for the /boot and EFI partitions and put / on the NVMe. It still boots and runs fine.

I have spare everything: drives, NICs, HBAs, motherboards, RAM, power supplies, etc. I'm just not used to running a server boot drive in any configuration besides mirrored. I also have yet to figure out a good backup solution for the boot drives. I have just recently set up a server with PVE and PBS on top. There I went conventional: mirrored SSDs for boot, SSD-cached HDDs in RAID 5 for backup storage, all running on an Avago 9400-series RAID card, with spares for all of that as well. I'm new to Proxmox. I don't know what happens if I lose a boot drive and need to reinstall. I believe the VM config files are stored on the boot drive, and I know all the other settings are as well. It just seems to me that belt and suspenders is the way to go, but my servers don't really have what I need for mirrored boot drives. I just know I've got this cluster set up and will be using it in my production environment soon. Well, actually I'm testing, then tearing it down and making a five-node cluster. Once it's up and going I don't want to have to mess with it much.

For reference, I'm old school here. The first server I ever built had a three-channel ultra-wide SCSI controller in it: two mirrored drives for boot with a hot spare, and a 12-drive RAID array with a hot spare. And cold spares on the shelf. If I came in to a beeping machine, I swapped in a new drive, let it rebuild, and went about my day. We've been running older ESXi for 8 years with a couple of Windows Server VMs. No HA, no cluster. Machine images stored on SAS RAID in the machines and backed up to mirrored NAS boxes. It all worked fine but had its issues from time to time. Now we've suffered a fire in one building. We went with Cat6a everywhere, and fiber and Cat6a between buildings. So I'm trying to make a robust system that is as set-it-and-forget-it as is realistically possible.

Anyway, I do know I won't be using these Optanes. They are 118GB. And for boot purposes alone pretty much any SSD would do, but I can use the ones I have, or just put in the best enterprise ones I can afford and figure out backing them up at least weekly.

In simple terms company ownership is so impressed with Proxmox we will be licensing everything once all the servers are running and dumping an essentially unused VMware license for three servers. The difference, as I understand it, is that ESXi runs in RAM after boot and doesn't beat up the boot drive with lots of writes. For that matter, one wonders: do I just drop in an enterprise SAS HDD for boot and be done with it? With backups, of course. I have nearly new ones on the shelf.
 
I do have one other potential option. I have Optane NVMe drives that I am not using at present. The machine will not boot from NVMe. However, on a home PC of similar age that would not boot from an NVMe, I used a small SATA SSD for the /boot and EFI partitions and put / on the NVMe. It still boots and runs fine.

In the standard PVE install with e.g. ZFS, /boot is even shoved onto the same partition as the ESP. I would go as far as saying you can totally have the ESP and boot partition on a USB stick. Not to mention you can have two of them set up in a boot chain if you really wanted, at zero cost.
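
If you wanted to try that, proxmox-boot-tool is what keeps multiple ESPs in sync on PVE; a rough sketch, with a placeholder partition name for the USB stick (or second disk):

Code:
# Format the spare partition as an ESP and register it with proxmox-boot-tool
proxmox-boot-tool format /dev/sdX2
proxmox-boot-tool init /dev/sdX2
proxmox-boot-tool status     # list all ESPs currently being kept in sync
proxmox-boot-tool refresh    # re-copy kernels/initrds (normally done on updates)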

I have spare everything: drives, NICs, HBAs, motherboards, RAM, power supplies, etc. I'm just not used to running a server boot drive in any configuration besides mirrored. I also have yet to figure out a good backup solution for the boot drives.

This is the missing thing with PVE as of today.

I have just recently set up a server with PVE and PBS on top. There I went conventional: mirrored SSDs for boot, SSD-cached HDDs in RAID 5 for backup storage, all running on an Avago 9400-series RAID card, with spares for all of that as well. I'm new to Proxmox. I don't know what happens if I lose a boot drive and need to reinstall. I believe the VM config files are stored on the boot drive, and I know all the other settings are as well. It just seems to me that belt and suspenders is the way to go, but my servers don't really have what I need for mirrored boot drives. I just know I've got this cluster set up and will be using it in my production environment soon. Well, actually I'm testing, then tearing it down and making a five-node cluster. Once it's up and going I don't want to have to mess with it much.

What's worth backing up, though, is /etc/pve, which is in fact basically a shared filesystem (stored as an SQLite config.db file in /var on each individual node). So if your node fails, you still have this filesystem available on the remaining nodes in that shared location. If you e.g. run it with Ceph you can have an HA setup, and it will just start the VMs on a good node without much of a hiccup.
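
To make that concrete, this is roughly where the pieces live on a stock node (standard paths; the exact listing obviously differs per cluster):

Code:
ls /etc/pve/qemu-server/              # this node's VM configs, as seen through pmxcfs
ls /etc/pve/nodes/                    # per-node directories for every cluster member
ls -l /var/lib/pve-cluster/config.db  # the SQLite database backing /etc/pve locally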

I would recommend experimenting a lot before going to production with anything on PVE.

For reference here I'm old school. First server I ever built had a three channel ultrawide SCSI controller in it. Two mirrored drives for boot with hot spare and a 12 drive raid array with hotspare. And cold spares on the shelf. If I came in to a beeping machine I swapped in a new drive, let it rebuild and went about my day. We've been running older ESXI for 8 years with a couple of windows server VMs. No HA, no cluster. Machine images stored on SAS Raid in the machines and backed up to mirrored NAS boxes. It all worked fine but had its issues from time to time. Now we've suffered a fire in one building. We went with cat6a everywhere and fiber and cat6a between buildings. So I'm trying to make a robust system that is as set it and forget it as is realistically possible.

I'm not sure about PVE being this effortless when it comes to e.g. failed drives.

Anyway I do know I won't be using these optanes. They are 118gb. And for boot purposes only pretty much any ssd would do but I can use the ones I have. Or just put in this best enterprise ones I can afford and figure out backing them up at least weekly.

You could simply test putting the ESP + /boot on a USB stick and have that load root off the Optane.

In simple terms company ownership is so impressed with Proxmox we will be licensing everything once all the servers are running and dumping an essentially unused VMware license for three servers. The difference, as I understand it, is that ESXI runs in ram after boot and doesn't beat up the boot drive with lots of writes. For that matter one wonders, do I just drop in a SAS enterprise HDD for boot and be done with it? With backups of course. I have nearly new ones on the shelf.

Unpopular opinion: this is what's better about ESXi, conceptually, as a hypervisor.
 
  • Like
Reactions: rsr911
Well, I think I have my answer! I can leave the current boot SSDs and just install /boot and the ESP on them. At 1TB they will probably last forever, lol. Then put the system on the Optanes, which have a very high DWPD rating. Or, like you said, just get a pair of quality USB sticks for a mirrored boot and run the system on the Optanes. I am thinking, though, that boot probably only changes with kernel upgrades, so maybe just Clonezilla.

Any reason I can't do a cron job to backup /etc/pve and /var using rsync to my NAS? I mean it's Debian after all. My home PC is on Ubuntu currently, with a cron rsync job to my old PC, which is now a home NAS with 8 HDDs in RAID. Hell, that cron job even WOLs the NAS machine, since I only use it for backups. (The home PC has an SSD HW RAID array just for fun.) Since at home it's only weekly backups, it makes more sense for it to spin up, back up, and shut down. So I know how to write those scripts. I could do daily and weekly of those directories. Alternatively I could make a small Linux VM on my PVE/PBS machine as a target for these backups and let PBS back that VM up to the big RAID array. For off-site I'm installing a backup machine at home; just going to send a weekly over the wire. PBS backups will get copied to a NAS in another location at work.
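
Roughly what I have in mind, just as a sketch with a made-up MAC address, NAS hostname, and target paths:

Code:
# /etc/cron.d/pve-config-backup - weekly, Sunday 02:30 (hypothetical file name)
30 2 * * 0  root  /usr/local/sbin/backup-pve-config.sh

# /usr/local/sbin/backup-pve-config.sh (sketch)
#!/bin/sh
wakeonlan AA:BB:CC:DD:EE:FF                    # wake the NAS (wakeonlan package)
sleep 120                                      # give it time to come up
rsync -a /etc/pve/ backup@nas:/backups/$(hostname)/etc-pve/
rsync -a /var/lib/pve-cluster/ backup@nas:/backups/$(hostname)/pve-cluster/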
 
Well, I think I have my answer! I can leave the current boot SSDs and just install /boot and the ESP on them. At 1TB they will probably last forever, lol. Then put the system on the Optanes, which have a very high DWPD rating. Or, like you said, just get a pair of quality USB sticks for a mirrored boot and run the system on the Optanes. I am thinking, though, that boot probably only changes with kernel upgrades, so maybe just Clonezilla.

I do not know if this will be of any help, but here we were once discussing how one could have the ESP twice. I do not see much value in it if you are running a cluster, since you can always regenerate these partitions, but in case you wanted to read about some rather peculiar mdadm setups:
https://forum.proxmox.com/threads/proxmox-8-luks-encryption-question.137150/page-2#post-611562

Any reason I can't do a cron job to backup /etc/pve and /var using rsync to my NAS? I mean it's Debian after all. My home PC is on Ubuntu currently, with a cron rsync job to my old PC, which is now a home NAS with 8 HDDs in RAID. Hell, that cron job even WOLs the NAS machine, since I only use it for backups. (The home PC has an SSD HW RAID array just for fun.) Since at home it's only weekly backups, it makes more sense for it to spin up, back up, and shut down. So I know how to write those scripts. I could do daily and weekly of those directories. Alternatively I could make a small Linux VM on my PVE/PBS machine as a target for these backups and let PBS back that VM up to the big RAID array. For off-site I'm installing a backup machine at home; just going to send a weekly over the wire. PBS backups will get copied to a NAS in another location at work.

Maybe have a look here, where we discussed the relationship between config.db and what gets mounted into /etc/pve:
https://forum.proxmox.com/threads/q...kip-etc-pve-in-the-backup.137539/#post-613235

You can back it all up, but the fact is you cannot rely on /etc/pve being in a consistent state. Better than no backup at all, but I personally would rather have the config.db files, which are left in a consistent state on the individual nodes. This only matters if you e.g. lost the whole cluster, in which case I would actually rather deploy a new cluster (maybe with a set of scripts), but an entirely new one, and restore the VMs from backups (supposedly off-site anyway).
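
For a consistent snapshot of that database, SQLite's online backup can be used on a live node; a minimal sketch, assuming the default path and that the sqlite3 CLI is installed (apt install sqlite3):

Code:
# Take a consistent copy of the cluster configuration database
sqlite3 /var/lib/pve-cluster/config.db ".backup /root/config.db.backup"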

One thing I would warn about, though, is the mess that comes with dead nodes. If you ever have a node die, it's not a problem per se, of course, but PVE is not really built to handle e.g. a replacement node being on the same IP, let alone having the same name. So as much as it might seem absurd: say you have nodes pve121, pve122, pve123 ... and your pve122 dies, and it was on 10.x.y.122. You will save yourself a lot of headaches if you spin up a fresh new node (not restored from backups), put it on 10.x.y.124, name it pve124, let it join the cluster and re-distribute your VMs as you wish.
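
For reference, the bare commands for that kind of replacement would look something like this (reusing the example names/IPs above; only once the dead node is powered off for good):

Code:
# On any surviving cluster member: drop the dead node from the cluster
pvecm delnode pve122
# On the freshly installed replacement node (new name, new IP): join the cluster
pvecm add 10.x.y.121          # IP of any existing cluster member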
 
  • Like
Reactions: rsr911
Thank you. I'll read all that.

I have read about PVE not liking the same name and IP. I set aside a whole range of IPs for this purpose. My Ceph public and Ceph cluster networks are on isolated switches, so no problems there either. I will most likely keep a copy of the network config handy where I can just increment the IPs for management, public, coro0, backup/coro1, Ceph public, and Ceph cluster.

I do realize I'm going overboard with this whole belt-and-suspenders thing. It's likely just force of habit from my pre-VM-hypervisor days and always running HW RAID. I was once such a RAID fanatic that I beta tested Windows 2000 service packs to help fix driver issues with MegaRAID cards back in the 2002-2003 time frame. I worked on SP3 and SP4. That was my pre-Linux days as well. Nothing like hearing your home rig spin up 6 15k Cheetah drives in RAID 0 for "ultimate" speed... easily beaten by a modern SAS HDD, lol. Ever still the RAID fanatic, my home 12th-gen Intel has three Gen4 NVMes in mdraid 0 and backs up to HW SATA RAID SSDs. Why? Idk, 17GB/s looks neat on a benchmark?

Thanks much. I'm off to read those links you added.
 
Any reason I can't do a cron job to backup /etc/pve and /var using rsync to my NAS?
I think you're trying to fix a non-existent problem.
First my setup:

Servers are Lenovo RD440s: 8 SAS bays, all used for Ceph drives.
Ceph implies a cluster. /etc/pve does NOT NEED TO BE BACKED UP. Restoring a "backed up" /etc/pve could have dire consequences and you REALLY don't want to. Think of /etc/pve as your "vCenter appliance" in the sense that it doesn't depend on any of its server parents to function, but unlike vCenter, all hosts contain a copy of it and negotiate the "real" version when they enter quorum. As for /var, I'm assuming you want that for logging; the proper way is to have a log facility like an ELK stack with a Grafana front end.
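
Even plain remote syslog forwarding (a far simpler stand-in than a full ELK stack, not what was suggested above) at least gets the logs off the node; a sketch with a made-up log host:

Code:
# Forward all syslog messages to a central log host over TCP, then restart rsyslog
echo '*.* @@loghost.example.com:514' > /etc/rsyslog.d/90-forward.conf
systemctl restart rsyslog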

The whole point of a cluster, much like it was with your vSphere environment, is that the physical nodes are cattle, not sheep. It really doesn't matter if one of them dies; you can build a new server from scratch and add it to the pool. In general, having a mirrored boot environment can help avoid downtime on a single node, but in a cluster a node outage does not impact your production in the first place, making it a nice-to-have and not a must-have. In your case, you're going through a bunch of gymnastics that would serve very little benefit. If you had concerns about the relative age of your hardware (the RD440 was introduced in January 2014) and wanted to replace it with more modern gear, you can and should plan out the replacement with dual boot devices.

but PVE is not really built to handle e.g. a replacement node being on the same IP, let alone having the same name.
Why is this important? again, cattle not sheep...
You will save yourself a lot of headaches if you spin up a fresh new node (not restored from backups), put it on 10.x.y.124, name it pve124, let it join the cluster and re-distribute your VMs as you wish.
exactly!
 
  • Like
Reactions: rsr911
Thank you. I'll read all that.

I have read about PVE not liking the same name and IP. I set aside a whole range of IPs for this purpose. My Ceph public and Ceph cluster networks are on isolated switches, so no problems there either. I will most likely keep a copy of the network config handy where I can just increment the IPs for management, public, coro0, backup/coro1, Ceph public, and Ceph cluster.

There are also some bugs (yes, I wrote it out here, but of course it will not be put into the official docs like this) that e.g. fail to clean up all the records of the dead node, so if your new node then has the same name and even the same IP, it is a time bomb. There are also nuances in how e.g. corosync names the nodes, what record they have in the hosts file, and maybe even what your DNS resolves them to, not to mention that e.g. ssh is run with -o HostKeyAlias. So basically it's a precaution against all those pitfalls not to reuse anything.

I do realize I'm going overboard with this whole belt-and-suspenders thing. It's likely just force of habit from my pre-VM-hypervisor days and always running HW RAID. I was once such a RAID fanatic that I beta tested Windows 2000 service packs to help fix driver issues with MegaRAID cards back in the 2002-2003 time frame. I worked on SP3 and SP4. That was my pre-Linux days as well. Nothing like hearing your home rig spin up 6 15k Cheetah drives in RAID 0 for "ultimate" speed... easily beaten by a modern SAS HDD, lol. Ever still the RAID fanatic, my home 12th-gen Intel has three Gen4 NVMes in mdraid 0 and backs up to HW SATA RAID SSDs. Why? Idk, 17GB/s looks neat on a benchmark?

There's nothing wrong with RAID; it's just that when you think of clusters nowadays with Ceph, it becomes added complexity (and cost) for no real benefit. There was one case where someone insisted on everything being mdadm, but that was a single-node install (no cluster), so there was a place for it there. It's like how in the past you would likely have run your hypervisor off a mirrored SD card module, and that was also just fine, but it was not barraging them with 10GB+ per day of writes. You can use RAID for performance (there are some people who need to have it explained that RAID is not a backup), you can use it for redundancy, but again, with a cluster ... it should not be needed.

Thanks much. I'm off to read those links you added.

It's really good to dry-run everything, even test for failures (before you are in production). You may encounter some interesting situations, and whilst the forums will likely find you a quick answer, it's always better when whatever happens is something you have seen before rather than something unexpected.
 
  • Like
Reactions: rsr911
Why is this important? again, cattle not sheep...

It matters to mention to someone new asking about PVE because somehow the docs fail to mention this. If you can find any piece of the docs that clearly states e.g. never to re-use an IP address after a dead, correctly removed node, please provide a link. And of course it is also not in the docs that there are SSH-related bugs that cause issues if you ever find yourself with a newly installed node having the same name as a (cleanly) previously removed one. So it is important to mention, since the docs are ashamed to do so.
 
  • Like
Reactions: rsr911
It matters to mention to someone new asking about PVE because somehow the docs fail to mention this
I see what you mean. Fair enough. The issue isn't that there is a problem with this as a requirement, but rather the lapse in setting expectations. You're right, of course: you have to read between the lines in the node removal section of https://pve.proxmox.com/wiki/Cluster_Manager.

In simple terms company ownership is so impressed with Proxmox we will be licensing everything once all the servers are running and dumping an essentially unused VMware license for three servers.
The Proxmox team is so small as to be a rounding error vs. VMware's staff of engineers, TSRs, tech writers, etc. To operate Proxmox reliably in the wild, the operator has to build a knowledge base from official and non-official sources (e.g., the above comment). If your company is so impressed with it, they really should be asking themselves whether that feeling would remain the same should you quit.
 
  • Like
Reactions: rsr911
I see what you mean. Fair enough. The issue isn't that there is a problem with this as a requirement, but rather the lapse in setting expectations. You're right, of course: you have to read between the lines in the node removal section of https://pve.proxmox.com/wiki/Cluster_Manager.

Sorry for nitpicking, but not even reading between the lines gets one the part about the IP address. There's a section on removing a node and then (confusingly for a first-time reader) one on separating a node (not recommended). But even then, there's just this:

"As mentioned above, it is critical to power off the node before removal, and make sure that it will not power on again (in the existing cluster network) with its current configuration. If you power on the node as it is, the cluster could end up broken, and it could be difficult to restore it to a functioning state."

One would expect e.g. a node to be uniquely identifiable and, if you follow the above, not to have to worry about a completely NEW node joining the cluster that happens to have something like an IP address previously used by a node which was correctly removed from the cluster with the delnode command.

So all those features are bug-ridden, and the PVE team is sugarcoating in the docs the real reason never to reuse anything (i.e. you have to actively and manually take measures to avoid doing so). Now, I just chose not to sugarcoat it, because the next thing this forum does to these people when they come back here is tell them they did not read the docs, etc. No, it should be mentioned in the docs that, due to bugs and the lack of automated checks, you must manually ensure that e.g. IPs are never reused, not even after dead nodes, and not even after nodes successfully removed with the proper command.

The Proxmox team is so small as to be a rounding error vs. VMware's staff of engineers, TSRs, tech writers, etc. To operate Proxmox reliably in the wild, the operator has to build a knowledge base from official and non-official sources (e.g., the above comment). If your company is so impressed with it, they really should be asking themselves whether that feeling would remain the same should you quit.

I have made enough snide remarks above, so I won't join in here. But as always: caveat emptor.
 
  • Like
Reactions: rsr911
The Proxmox team is so small as to be a rounding error vs. VMware's staff of engineers, TSRs, tech writers, etc. To operate Proxmox reliably in the wild, the operator has to build a knowledge base from official and non-official sources (e.g., the above comment). If your company is so impressed with it, they really should be asking themselves whether that feeling would remain the same should you quit.
Good point. However, we are a small family-owned company, and as a 25-year employee and vice president, the family has been transferring me stock to the point that when our president (also their brother) retires, I will own a controlling interest in the company. In short, I have no reason to leave. Second, Proxmox wasn't my idea. I've hired a younger guy I'm training as my assistant but also to be my main IT guy. After months of struggling with VMware, trying to make the two-node-plus-witness vSAN we were told we could make, only to find we had outdated hardware at every turn, this younger guy suggested we just give Proxmox a try. Since he's been busy doing other things for me, I started working on Proxmox. I had a cluster up and running on a single NIC in less than a week. Then I tore that down, put in all my drives and NICs, ran into a now-solved issue with Ceph IPs, and have this test cluster up and running.

I emphasize test at this point. We've put up a Linux VM and a Windows Server VM and have been doing failure testing: pulling power on a server, unplugging network cables, pulling drives, etc. No, rebuilding a missing OSD isn't as simple as old-school HW RAID, but a three-node, eventually five-node, cluster is far more fault tolerant.
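
For anyone curious, the drive-pull recovery we practiced boils down to roughly this (a sketch only; the OSD id and device are placeholders, and you wait for the cluster to settle between steps):

Code:
ceph osd out osd.7                 # let Ceph rebalance data away from the failed OSD
pveceph osd destroy 7 --cleanup    # remove the OSD once it is down/out and rebalanced
pveceph osd create /dev/sdX        # create a new OSD on the replacement drive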

Why old equipment? Well, our fire happened in May 2021 at the height of pandemic shortages. Our insurance payout did not cover the increased cost of building materials and other inflated items. We're done rebuilding now, but we had our first losing year since the 2008 recession. I already had three RD440s running on single CPUs and only 32GB of RAM each. Now I have 4, soon to be 5, with both CPUs and 128GB of RAM each. The long-term plan is to replace nodes one by one as the budget allows. That's actually one reason I knew about the issue of using the same name and IP for a replacement, and why I've reserved IPs in my network for future nodes. My idea, right or wrong, is to buy new servers, fail an old one, install the new one, and let Ceph populate it. I'd rather save money now and keep giving my people deserved raises than just dump a fortune into new or newish hardware. Plus, I'm thinking I can transfer much of what I have to new servers, like the dual-port NICs, of which I have four per server. At 10G everywhere, my network is 10x faster than it's ever been.

I bought these Optanes because VMware needed them for vSAN cache. May as well use them if I can. Failing that, replace the current consumer drives with enterprise drives, like I need to do on the Ceph cluster itself.

So in short, even with old hardware, this cluster is a rather big performance jump: from two single-CPU machines running individually on HDDs to five dual-CPU nodes with four times the RAM, 10G networking, and all-SSD storage. In baseline testing my VMs are clearly much faster. I'm not bottlenecked by a slow network, slow drives, not enough cores and RAM, etc.

Seems to me the solution here is to use the Optanes as described or install small enterprise drives for boot, and maybe not even worry about backups of the boot drives.

Having said that, please correct me if I'm wrong, but it seems like clusters are a lot like hardware RAID cards: replacement servers need to be "clean", just like RAID cards will only rebuild onto fresh, empty drives. Which means to me that if I have a server fail, I just replace what's broken, do a clean install, configure networking, and join it to the cluster.
 
I wanted to add one more thing. Just before our fire we paid about $7k for VMware 7 licenses and support. It was a total pain in the butt. The interface is not intuitive, hardware support sucks, and their compatibility guide is useless. Now they've been sold to Broadcom for $61 billion. Sure, the world relies on them, but if people like me don't support products like Proxmox, then VMware will never face any real competition. And with financial support they will only get better. In my eyes Proxmox is like the Ubuntu of the hypervisor world: a simple-to-set-up-and-use system with lots of hardware support and a good community behind it, making it less intimidating to newbies. One shouldn't need a degree in IT to get a small company up and running on a solid server solution. At least that's my opinion. Who knows, maybe in a few years I'll be the guy on here helping the new guys. I hope so at least. So I'm not going to knock them too much for maybe not being entirely enterprise-ready; everyone has to start somewhere. /end soapbox lol.
 
  • Like
Reactions: Etienne Charlier
