Storage question, I'm stumped after hours of research :(! Shrink local and increase Directory storage

If you plan to reinstall everything in any case, I would strongly suggest NOT adding the third node to the cluster, because if the cluster somehow gets broken in the process you won't have a running system at all. Instead I would proceed like this:

  1. Back up everything to a medium which is NOT on one of your nodes, e.g. your NAS or an external disk drive. If you have a ProxmoxBackupServer, that's the best option (because with it you have live-restore); otherwise just use the regular vzdump/VMA-archive backup integrated in ProxmoxVE. That way, in case something goes wrong, you can restore your VMs and containers and have your services running again. I would also test the restore before proceeding (a command sketch follows below this list):
    All existing configuration in /etc/pve is overwritten when joining a cluster. In particular, a joining node cannot hold any guests, since guest IDs could otherwise conflict, and the node will inherit the cluster’s storage configuration. To join a node with existing guests, as a workaround, you can create a backup of each guest (using vzdump) and restore it under a different ID after joining. If the node’s storage layout differs, you will need to re-add the node’s storages, and adapt each storage’s node restriction to reflect on which nodes the storage is actually available.
    https://pve.proxmox.com/wiki/Cluster_Manager#pvecm_join_node_to_cluster
  2. Install the third node as a single node; afterwards install the Proxmox Datacenter Manager as a VM (or alternatively use the qm remote-migrate and pct remote-migrate command line tools, see the sketch below this list) to migrate all workloads to the new single node.
  3. Afterwards wipe your two old nodes and migrate/restore your workloads to one of them (NOT both! If you add a node with VMs or containers to a cluster, you will lose their config!).
  4. Build a cluster by adding the empty node to the node with the guests, as described in the docs (see the command sketch below this list): https://pve.proxmox.com/wiki/Cluster_Manager#pvecm_join_node_to_cluster
  5. Afterwards wipe your third node (you migrated all the stuff from it to your new cluster, right?), reinstall it and add it to the cluster. Or alternatively (if you didn't have a ProxmoxBackupServer before): install a ProxmoxBackupServer on it and use it as a QDevice: https://pve.proxmox.com/wiki/Cluster_Manager#_corosync_external_vote_support
    Please note that adding the PBS host as a QDevice will also allow anyone with root access to one of your cluster nodes to access the PBS. This might not be what you want: e.g. if somebody hacks your cluster, you don't want to give them access to your backups. One way around this is a setup like the one described by Proxmox developer @aaron here: https://forum.proxmox.com/threads/planning-advice.169434/post-791732
    Basically you install ProxmoxVE and ProxmoxBackupServer on the same host without adding it to the cluster. Afterwards you create a small Debian container or VM (whatever you prefer) to act as the QDevice, so somebody with root access on the Proxmox cluster will only be able to connect to the QDevice container/VM.
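
For step 1, a minimal command sketch, assuming a VM with ID 100, a container with ID 101 and a backup storage called "nas-backup" (all IDs, names and paths are placeholders for your own setup):

  # Back up a VM and a container to a storage that is NOT on the nodes themselves
  vzdump 100 --storage nas-backup --mode snapshot --compress zstd
  vzdump 101 --storage nas-backup --mode snapshot --compress zstd

  # Test the restore under a NEW ID, so nothing conflicts after joining a cluster
  # (adjust archive filename and target storage to your environment)
  qmrestore /mnt/pve/nas-backup/dump/vzdump-qemu-100-<timestamp>.vma.zst 200 --storage local-zfs
  pct restore 201 /mnt/pve/nas-backup/dump/vzdump-lxc-101-<timestamp>.tar.zst --storage local-zfs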
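
For step 2, the command-line alternative to the Datacenter Manager is remote migration. A rough sketch, assuming you created an API token on the target node; all IDs, addresses, storage and bridge names are examples, so check man qm / man pct for the exact syntax of your PVE version:

  # Migrate VM 100 to the new single node (keeping ID 100), authenticated with an API token
  qm remote-migrate 100 100 \
    'apitoken=PVEAPIToken=root@pam!migrate=<secret>,host=192.168.1.50,fingerprint=<target-fingerprint>' \
    --target-storage local-zfs --target-bridge vmbr0 --online

  # Same idea for a container; containers are restarted on the target
  pct remote-migrate 101 101 \
    'apitoken=PVEAPIToken=root@pam!migrate=<secret>,host=192.168.1.50,fingerprint=<target-fingerprint>' \
    --target-storage local-zfs --target-bridge vmbr0 --restart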
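
And for steps 4 and 5, the cluster and QDevice part boils down to a few commands; hostnames and IPs are again just examples, and the QDevice setup needs corosync-qnetd on the external host and corosync-qdevice on the cluster nodes (see the linked docs):

  # On the node that already holds the guests: create the cluster
  pvecm create mycluster

  # On the freshly wiped, EMPTY node: join it to the cluster
  pvecm add 192.168.1.10        # IP of the node that created the cluster

  # Later, from one cluster node: add the external host as a QDevice
  # (it only provides a quorum vote, it does not run guests)
  pvecm qdevice setup 192.168.1.30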


Some other things to consider:
  • Since you mentioned that you are using LVM as storage at the moment: what do you actually need a cluster for? Usually one would use a cluster if one wants high availability together with a shared storage (e.g. NFS on a NAS or an FC-attached SAN) or ZFS for storage replication. If you don't need this, you probably don't need a cluster with its added complexity and failure possibilities.
  • One example: a cluster needs a dedicated network link for cluster communication due to corosync's low-latency requirements (see https://pve.proxmox.com/wiki/Cluster_Manager#pvecm_cluster_requirements ). So if you don't have this dedicated link, I wouldn't run anything serious on the cluster. For this reason, in my homelab I have a single node for my always-on services and a two-node+QDevice cluster as a "playground". None of my machines has a dedicated link for corosync, so I don't run anything important on the cluster; for playing with stuff it's OK, because nothing important will be lost if it breaks. So if you don't need high availability, you probably don't need a cluster. For having a single pane of glass for management or migrating guests between different single nodes/clusters you can also use (as already mentioned) the Datacenter Manager.
  • LVM is (as you discovered) quite inflexible in terms of allocating directory/block-level storage. ZFS is way more flexible in this regard and has some other nice features too (e.g. filesystem-level compression, storage replication, "cheap" snapshots):
    • https://pve.proxmox.com/wiki/Storage:_ZFS
    • https://forum.proxmox.com/threads/f...y-a-few-disks-should-i-use-zfs-at-all.160037/
    • So if you are planning to reinstall everything in any case, you could also use this as a reason to re-setup everything with ZFS. Another benefit is that you could set up everything (OS plus VMs plus containers) on a two-disk mirror. Since mini PCs often have only limited storage slots, this is what I'm doing myself on my Lenovo ThinkCentre ProxmoxVE nodes. The mirror also has the advantage that in case of one failing disk the system should continue to run and even heal some errors by itself (thanks to ZFS bitrot detection plus autohealing magic). A short command sketch follows below this list.
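
If you go the ZFS route, a small sketch of what that can look like; the pool and storage names and the disk IDs are made up, and the root mirror itself is simply selected as "ZFS (RAID1)" in the ProxmoxVE installer:

  # Check the health of the root pool created by the installer
  zpool status rpool

  # If you add two more disks later: build a separate mirrored pool
  # and register it as VM/container storage (device names are examples)
  zpool create -o ashift=12 tank mirror /dev/disk/by-id/ata-DISK1 /dev/disk/by-id/ata-DISK2
  pvesm add zfspool tank-vmdata --pool tank --content images,rootdir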
 
... dedicated link I wouldn't run anything serious on the cluster (for this reason in my homelab I have a single-node for my always-on-services

That's surprising to me. While you do have a cluster (your playground), your always-on services have no redundancy/HA?

In my Homelab I do it the other way around: nearly everything is in my cluster, approx. half of it with HA. For "dangerous experiments" I use either a standalone PVE or a virtualized cluster.

None of my machines has a dedicated link for corosync so I don't run anything important on the cluster

Okay. That's a valid reason!

My setup has two NICs in all nodes. One is dedicated to storage - as simple as possible, using a dumb/unmanaged switch. The other one carries a handful of frontend VLANs; everything is tagged there. Each of them also carries an independent Corosync ring. Sounds good, right?
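
For reference, such a two-link setup shows up in /etc/pve/corosync.conf roughly like this per node; the addresses are made up, and the file should only be edited following the procedure in the Cluster Manager documentation:

  node {
    name: pve1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.10.10.11    # storage network = Corosync link 0
    ring1_addr: 192.168.10.11  # frontend/VLAN network = Corosync link 1
  }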

I monitor the "ping" duration on all of them. And I can see that each one got saturated from time to time (during a backup wave or while moving virtual disks) the storage network is congested. The other physical network is for user traffic. And some internal services (and most external CDNs) can saturate my LAN. So yes, sharing Corosync with other services seems to be risky; that's to be expected.

One option I was thinking about was to add a (third) dedicated network for Corosync. This would require me to utilize USB NICs, and I do not like that.

Until now I was lucky enough not to have a single HA-induced reboot. (Knock on wood!) That's why I did not add those USB thingies yet...


It's great to have several options - choose your poison ;-)
 
This would require me to utilize USB NICs, and I do not like that.

I'm actually considering adding USB NICs to my two-node Lenovo micro-PC cluster to have a dedicated network link for corosync, as I have no free space/slots for an additional PCI network adapter. I would also love to have faster transfers for storage replication (a 2.5 Gbit USB NIC is still faster than the onboard 1 Gbit port), but I'm not totally convinced that using USB for that is a good idea. With corosync I can still use another network as a fallback; I'm not sure whether this would be possible with the migration/replication network, since the documentation only describes how to have a fallback for corosync, not for migration: https://pve.proxmox.com/wiki/Cluster_Manager#_guest_migration
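
For completeness, pinning migration traffic to a specific network is a single line in /etc/pve/datacenter.cfg (the subnet is just an example); as far as I know there is no second/fallback network option there, unlike corosync's multiple links:

  migration: secure,network=10.10.10.0/24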

There is also another reason why everything important runs on my single node: it's by far the most beefy system (Xeon, amount of RAM) and the only one with ECC support and enough SATA slots to separate OS and VM/LXC/data storage. Most of my services are also services where I could live with the minimal downtime needed to migrate/live-restore them on my cluster in case of an emergency. There are two exceptions though: my TrueNAS system and my Pi-hole. For Pi-hole I'm using keepalived to have high availability (my main Pi-hole is on my single node, two others are in the cluster): https://forum.proxmox.com/threads/pi-hole-lxc-with-gravity-sync.109881/
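
The keepalived part is essentially one VRRP instance per Pi-hole with different priorities; a minimal sketch of /etc/keepalived/keepalived.conf on the main instance, where the interface name, router ID and virtual IP are made up (the other Pi-holes use state BACKUP and a lower priority):

  vrrp_instance PIHOLE {
      state MASTER            # BACKUP on the secondary Pi-holes
      interface eth0
      virtual_router_id 51
      priority 150            # lower value on the backups
      advert_int 1
      virtual_ipaddress {
          192.168.1.53/24     # the VIP your clients use as DNS server
      }
  }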

For TrueNAS the data is replicated to my ProxmoxBackupServer, which could also serve as a fileserver in case of an emergency (PVE is also installed on that node, but only to host other backup-specific services, so I could spin up a fileserver VM quite fast if needed; even TrueNAS would be possible thanks to the installed HBA). In fact my PBS was my first combined PBS/PVE/NAS before I got the beefy machine ;) Now I don't run TrueNAS on it since it's not needed for the backups.
I remember @LnxBil mentioning several times that he doesn't run a cluster in his home for similar reasons as mine (to avoid the complexity and pitfalls of a cluster). He is also saving power, since his "playground" systems are mostly turned off, but he can start them if he needs more resources.
 
With corosync I can still use another network as fallback I'm not sure whether this would be possible with the migration/replication network
Well, I am fairly sure Corosync does not "know" (or care about) the other services running on one of the rings. Why would it?

...to avoid the complexity
Yes, complexity is a beast - sooner or later it bites! But, hey, it's a homelab - and I want to have fun :-)

(And yes, one needs to have a very specific mindset to call maintaining a Cluster "fun"...)