Hello all,
I have a small homelab server that I'm running Proxmox on. I also have an offsite machine that I use for backups of highly important data. That offsite machine was running ESXi for a while, but I recently did some work on it and am switching it over to Proxmox as well. So, I started looking into clustering for it. Right now I'm having problems getting the cluster to work. Every time I try to join the primary node's cluster, it adds the node but then deletes the nodes directory and never finishes the job on the offsite server. And on the main server, an error pops up about not being able to find /etc/pve/nodes/offsite/pve-ssl.pem.
I'm just guessing here, but my guess is that it has to do with network communication. In my current test setup, the offsite machine is in the same location as the primary, but on a separate subnet that simulates the network it will be on offsite. Due to how it's configured, the offsite machine can reach the primary machine, but the primary machine can't see a machine in that subnet. So, my question is: in a traditional cluster configuration, are the nodes expected to have a site-to-site VPN established outside of the nodes so they can communicate as if they were on the same network? Or are there certain ports / port-forwarding rules I need to configure to make it work? I haven't deeply considered the networking side of things yet; I need to do more research there. I also need to figure out what I would use, and how it would be configured, if I wanted, say, a domain that tries the primary server first and falls back to the offsite server if the primary is unreachable. Any pointers on where to start looking for that would of course be welcome as well.
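For reference, here's a rough sketch of how I've been trying to check whether the primary can actually reach the offsite node on the ports the Proxmox docs mention (corosync on UDP 5404-5405, the web UI on TCP 8006, SSH on TCP 22) — the IP is just a placeholder for my test subnet:

```shell
# Placeholder IP for the offsite node in my simulated offsite subnet
OFFSITE=192.168.50.10

# Web UI and SSH are plain TCP, so nc can confirm they're reachable
nc -zv "$OFFSITE" 8006
nc -zv "$OFFSITE" 22

# Corosync is UDP (5404-5405); nc -u just fires a datagram and can't
# confirm delivery, so watching with tcpdump on the far side is a
# more reliable check
nc -zuv "$OFFSITE" 5404-5405
```

I mention it mainly because the UDP check is the one I'm least sure I'm doing right.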
Next thing is, is this even the right setup for what I want to do?
I'll elaborate on my goals for the offsite server. First and foremost, its primary duty is to back up my critical data. I currently have this configured via a scheduled cron job running an rsync pull task, and it's working. Before, on ESXi, I had this configured by passing through a SATA controller to a TrueNAS VM, but since Proxmox supports ZFS directly I figured I'd just do it as a cron job without the VM.
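For context, the cron entry looks roughly like this — the paths and hostname here are placeholders, not my real ones:

```shell
# /etc/cron.d/offsite-backup (illustrative; host and paths are placeholders)
# Pull critical data from the primary every night at 02:00,
# writing straight into a ZFS dataset on the offsite Proxmox host
0 2 * * * root rsync -a --delete primary.example.lan:/tank/critical/ /backup/critical/
```

Nothing fancy — archive mode plus --delete so the offsite copy mirrors the primary.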
A secondary task I would like is backing up select VMs / containers from my primary server. Having the ability to spin up those VMs / containers on the offsite server if the main server is down / unreachable would be nice to have. But I don't require it to be automated, as long as the backed-up data is at least somewhat current (within the last hour or so) and can be synced back to the primary server when it's available again. I'm not running any mission-critical VMs that require 99.999% uptime or anything like that, but if I could make it fully automated and transparent, that would be a nice bonus.
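From what I've read so far, the manual version of this would be vzdump on the primary plus a restore on the offsite box. A rough sketch, with the VM ID, storage names, and paths all made up:

```shell
# On the primary: snapshot-mode backup of VM 100 (ID is a placeholder)
vzdump 100 --mode snapshot --storage local --compress zstd

# Ship the dump to the offsite node (path is a placeholder)
rsync -a /var/lib/vz/dump/vzdump-qemu-100-*.zst offsite:/var/lib/vz/dump/

# On the offsite node: restore it, to be started manually only if
# the primary is actually down (storage name is a placeholder)
qmrestore /var/lib/vz/dump/vzdump-qemu-100-<timestamp>.zst 100 --storage local-zfs
```

If that's roughly the right shape, I could just cron the first two steps to get the "within the last hour-ish" freshness I'm after.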
I may also run separate VMs / containers only on the primary or offsite server, depending on various needs / desires. It's even possible I might one day find a reason to have a VM running on the offsite server with the primary server as its backup. For example, the offsite location has a faster upload speed, so if I wanted to run a public website or something, I would prefer to run it primarily off the offsite server to take advantage of that.
I had a test cluster set up previously, and one thing I noticed that I really did not like is my inability to do certain things on the primary server if the offsite server was unavailable, because of quorum. From what I've read so far, I think I can get around this by changing how many votes the primary machine gets, or by running a Raspberry Pi as a voting server as well. I haven't gotten back to playing with that yet since I'm having problems getting the cluster to actually function atm. Just figured I'd mention it here in case it's relevant.
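In case it helps the discussion, the two quorum workarounds I've read about would look roughly like this (the QDevice IP is a placeholder for the Pi):

```shell
# Option 1: temporarily tell corosync to expect only one vote, so the
# primary stays writable while the offsite node is unreachable
pvecm expected 1

# Option 2: add a Raspberry Pi running corosync-qnetd as an external
# third vote (run from a cluster node; 192.168.1.50 is a placeholder)
pvecm qdevice setup 192.168.1.50
```

I haven't actually tried either yet, so corrections welcome if I've misunderstood how they're meant to be used.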
I don't think I have any real capability of running a reliable shared file system between both servers. I've started reading a little about Ceph to see if it would be useful for this configuration, but it's completely new to me and I haven't gotten into it much yet.
I'd welcome any advice on this setup. Just please keep in mind this is a homelab, not an enterprise configuration, so some things might not be configured in an industry-standard way. And I don't have a huge corporate budget to throw at it.