Failover IP Question

pr0927 · Aug 18, 2024

Hi all, I'm not quite at the stage of setting up an HA cluster, but I'm marching in that direction. However, one thing has been confusing me, and looking online hasn't been too helpful - and ChatGPT consulting has been...not super confidence inspiring.

Basically, I have three nodes I will be putting in a cluster (with roughly equal storage and ZFS-replicated content). My understanding is that in the event of failover or migration, the VMs and LXCs on a different node are "copies" of one another and will spin up with the assigned IP address - this makes reverse proxy pointing and other local network navigating relatively seamless.

I have one Debian VM running Docker containers. I also have one LXC running Cockpit for SMB sharing.

My Proxmox host is running an NFS server (on the host itself) - so that I can map NFS shares to containers for certain volume paths, since I cannot pass through my whole SATA controller to a VM - performance has been perfectly fine and working great! I did not setup my NFS shares through Cockpit to minimize any additional overhead.

My concern is when a given node goes offline, even if I have the NFS server manually installed on each node with the same shares and all...the IP address of the given node that will take over, will be different.

Meaning, any Docker containers pointing to NFS mounts will now be pointing at NFS paths that are currently offline, since they use IP addresses.

No Gluster or Ceph here - just doing ZFS replication between the systems, not a separate fileserver NFS pool or anything.

What is the smart person's workaround here? Is there any kind of virtual IP that all nodes can share and assume in the event they are the "active" node?

ChatGPT mentioned something about running "keepalived" on each node, but I've also been arguing with it over every tiny thing I have been doing, so I don't know how correct it is...

Would appreciate any help/explanation for this - especially before I go down a road from which making changes will be a real pain.

gurubert · Aug 19, 2024

As there is no NFS failover you would always have to remount the share anyway. So there is no point in using a service IP for that.

You could install keepalived on the Proxmox nodes and configure a service IP with it.

pr0927 · Aug 19, 2024

Uhoh, this thread suddenly resurfaced from Narnia? I couldn't find evidence of it anywhere! Ended up reposting and the repost worked:

https://forum.proxmox.com/threads/2nd-try-failover-ip-help-questions.153022/

gurubert said:
As there is no NFS failover you would always have to remount the share anyway. So there is no point in using a service IP for that.

Sorry, I'm a bit confused here. If I have a Debian VM that is mounting an NFS share automatically, then wouldn't it auto-mount on another node too, so long as what it points to still "exists?" When I reboot Proxmox or the VM, I don't need to remount it. I have the NFS server on the host auto-starting up too. If I configured this in identical fashion on all 3 nodes, wouldn't the only hangup be that the mount path is pointing to a particular IP (meaning if node 1 is 192.168.0.80, and goes down, a spun-up identical Debian VM on a different node is now still pointing to 192.168.0.80, whereas node 2 might have an IP of 192.168.0.90).

What does keepalived do in this scenario?

gurubert · Aug 19, 2024

Keepalived is responsible to configure a service IP on a network interface on the nodes it runs on.

Unfortunately the Proxmox cluster service is not able to do so.

pr0927 · Aug 19, 2024

OK so if I do run keepalived on each node, and configure it properly, would it (in theory) resolve this issue for me?

Still a little confused by what you were saying about the NFS shares.

gurubert · Aug 19, 2024

It will resolve the issue that you can remount the NFS share from the same server IP.

A running VM on an unaffected Proxmox node will also have to remount the NFS share as the NFS session will not transfer to the new Proxmox node.

pr0927 · Aug 19, 2024

OK, right - that makes sense. So if I understand correctly - if I set that up right, then in the event of an entire node going down, a second node can takeover, and so long as it has the IP changed correctly, when the Debian VM spins up, it should point to the same IP it always points to for NFS mounts, which should then be what the second node now identifies as?

gurubert · Aug 19, 2024

Yes, the service IP is an additional IP address on a network interface of your choice.

pr0927 · Aug 19, 2024

Got it - thank you, appreciate your time! Now to learn how to do this keepalived stuff...

Also I shouldn't forget to ask - the overhead on an LXC running Cockpit for running is NFS server would likely be noticeable over just continuing to host the NFS service it on the Proxmox host itself, right?

gurubert · Aug 19, 2024

A container introduces no measurable overhead as the processes within run on the host kernel.

pr0927 · Aug 19, 2024

Oh interesting...so now I wonder if I should just run not only SMB off the Cockpit LXC, but also NFS - do you have any recommendation for doing this versus continuing with the host-based NFS service? The container is unprivileged.

VictorSTS · Aug 19, 2024

You can't do any proper HA if you depend on the resources of a single host. If Ceph isn't an option for whatever reason, I would just add the drives in the PVE host that currently have the NFS data as a ZFS storage to PVE and setup a VM there using as much as 80% of that space. Then create a VM with it's drives in that new ZFS storage and configure NFS there. Use ZFS replication as with other VMs to have your data in at least two hosts of the cluster. Much easier backups, easier disaster recovery, no host config needed, no keepalived, etc. You have a fully fledged hypervisor, just use it

pr0927 said:
Oh interesting...so now I wonder if I should just run not only SMB off the Cockpit LXC, but also NFS - do you have any recommendation for doing this versus continuing with the host-based NFS service? The container is unprivileged.

NFS server or client in an unprivileged CT requires special configs [1] and may cause container/server deadlocks.

[1] https://forum.proxmox.com/threads/tutorial-mounting-nfs-share-to-an-unprivileged-lxc.138506/

pr0927 · Aug 19, 2024

VictorSTS said:
You can't do any proper HA if you depend on the resources of a single host. If Ceph isn't an option for whatever reason, I would just add the drives in the PVE host that currently have the NFS data as a ZFS storage to PVE and setup a VM there using as much as 80% of that space. Then create a VM with it's drives in that new ZFS storage and configure NFS there. Use ZFS replication as with other VMs to have your data in at least two hosts of the cluster. Much easier backups, easier disaster recovery, no host config needed, no keepalived, etc. You have a fully fledged hypervisor, just use it

So this would be great, and I'd have passed through the storage to the VMs, but since the drives are SATA, and they're entirely separate from the 2xM.2 NVMe SSDs that are dedicated to VMs and LXCs, I'd need to pass them through - and I'd have to pass through the SATA controller. My Proxmox nodes all use a SATA boot drive, so I cannot do that. :/

That's why I ended up running the NFS server off the host itself.

VictorSTS said:
NFS server or client in an unprivileged CT requires special configs [1] and may cause container/server deadlocks.

[1] https://forum.proxmox.com/threads/tutorial-mounting-nfs-share-to-an-unprivileged-lxc.138506/

Hmm, I think these are similar approaches, but what I was proposing was a bit different - bind-mapping the drives to the LXC which runs Cockpit. Then from Cockpit, run the SMB shares (doing exactly this already) and the NFS shares (the idea I posited above). That way the LXC has a given IP address which could be static and carry over to another node, and any VMs relying on mounting an NFS share could point to this Cockpit LXC IP instead. If the LXC being unprivileged would cause issues - what if I just made it privileged? Is the security risk great for an LXC that would not be pointed at the outside world via reverse proxy, etc.?

VictorSTS · Aug 19, 2024

pr0927 said:
So this would be great, and I'd have passed through the storage to the VMs, but since the drives are SATA, and they're entirely separate from the 2xM.2 NVMe SSDs that are dedicated to VMs and LXCs, I'd need to pass them through - and I'd have to pass through the SATA controller. My Proxmox nodes all use a SATA boot drive, so I cannot do that. :/

What?

Not talking about pass through at all. Add the drives as a storage to the PVE host ifself. I.e. create a ZFS pool and add it as a storage to the host where you can host VM's into.

pr0927 · Aug 19, 2024

VictorSTS said:
Not talking about pass through at all. Add the drives as a storage to the PVE host ifself. I.e. create a ZFS pool and add it as a storage to the host where you can host VM's into.

Oh I should probably explain my architecture first then.

Each of the 3 nodes has 2x2TB M.2 NVMe SSDs dedicated to VMs and LXCs - already in one ZFS pool. Each of the 3 nodes also has 4x8TB HDDs, in another ZFS pool, for slower data storage - media, documents, etc. I also have a third ZFS pool on all 3 of a 4TB SATA SSD for Frigate NVR recordings.

Each of the nodes are virtually identical - two literally are, and the third one is analogous in most regards. I want the VMs/LXCs to have extremely fast underlying storage for any apps they are running in Docker or otherwise. The Cockpit LXC actually has the ZFS pools bindmounted, and then Cockpit itself then shares these mounts via SMB.

Maybe I've confused myself somewhere, but everything I'd read was saying that NFS shares are the best for a VM running docker, for apps to tap into, since the drives couldn't be passed-through to the VM easily given the SATA controller thing I mentioned. And that since I'd bind-mounted the ZFS shares to the LXC, they couldn't be bind-mounted to multiple VMs (I have 2 Debian VMs actually, both with Docker apps referencing the NFS shares).

Hopefully this isn't all insane...

pr0927 · Aug 20, 2024

VictorSTS said:
Not talking about pass through at all. Add the drives as a storage to the PVE host ifself. I.e. create a ZFS pool and add it as a storage to the host where you can host VM's into.

Bumping to not be forgotten. xD

You've made me suddenly think/wonder if all my hassle was for naught?

There must have been something that made me not run the NFS server inside the Cockpit LXC (maybe it was that mention of privileged versus unprivileged issues?) or not attaching ZFS pools to the VM? Was it corruption risk from attaching ZFS pools to multiple VMs (if even possible?).

I need the SMB shares to access the content of those shares from my PC over the network as well. And my understanding was NFS is preferred over SMB for Linux stuff and for mapping directly into containers/VMs and more performant?

And I thought I read it wasn't great to run SMB and NFS out of the same container anyway?

But moreover, I was under the impression that if I bind-mounted the ZFS pools to the Cockpit LXC, then trying to attach them to any VM would not work (or only work for one VM, and then an NFS server would have to happen anyway somewhere - my comment earlier about corruption risk?).

I saw a lot of threads and blogs with people recommending NFS server stuff to just be run off the Proxmox host - figured this must also be common practice for a good reason?

Am I way out in left-field?

pr0927 · Aug 20, 2024

Following up to this - I think I came across several advisories that multiple VMs sharing the same ZFS pool is a path to corruption, and my memory is being jogged that this is what set me down this path.

Saw a mixed amount of recommendations to run the NFS service off the host itself or on one VM (if the ZFS pool was attached to it, to share out to other VMs). Also saw people mentioning that for best performance and least abstraction layer overhead, to just run NFS off the Proxmox host.

So, understanding that Proxmox is meant to be a hypervisor - and that this is for a homelab (one for which I still value reliability, of course), it seems that it's "less recommended, but has its merits" to run off the host?

In which case keepalived is my only option?

The alternative being an LXC privileged container with Cockpit for NFS sharing out, since it might be less advised to run NFS out of the existing Cockpit LXC I have for SMB shares?

Search

Search

Failover IP Question

pr0927

New Member

gurubert

Distinguished Member

pr0927

New Member

gurubert

Distinguished Member

pr0927

New Member

gurubert

Distinguished Member

pr0927

New Member

gurubert

Distinguished Member

pr0927

New Member

gurubert

Distinguished Member

pr0927

New Member

VictorSTS

Distinguished Member

pr0927

New Member

VictorSTS

Distinguished Member

pr0927

New Member

pr0927

New Member

pr0927

New Member

We value your privacy