How do I remove Replication jobs from missing node?

dtiKev

Well-Known Member
Apr 7, 2018
Hi,
I recently added a second production server and joined it to the cluster. After this I migrated a few machines over from another cluster member that was subpar. I added new replication jobs to make sure that all machines were replicating between the two main nodes, and I deleted the jobs that referenced the weaker server.

For whatever reason they were only scheduled to be removed rather than removed right away. After waiting about 30 minutes, I went ahead and powered down the machine that I wanted to remove. I ran the node removal command and the GUI updated a bit later.

Now hours later I still see the replication jobs referencing the now gone node. The error is:

Code:
Removal Scheduled,hostname lookup 'pvn0' failed - failed to get address info for: pvn0: Name or service not known.

I know it's not hurting anything, but is there a way to clean this out of the GUI?

Thanks,
dK
 
Hi,

As the error message tells you, it can't resolve the hostname to an IP address.
Do you have cluster quorum?
 
The IP address wouldn't do any good as pvn0 is no longer part of the cluster. I do have a quorum between the two active nodes.

As explained, I added a new node and migrated VMs to it. I set up new replication between the two "good" nodes, then deleted the replication jobs that referred to pvn0, which I intended to remove.

At this point I noticed in the replication screens that the pvn0 replication jobs were being scheduled for removal. I waited about 30 minutes and then ran the command to remove pvn0 from my cluster. That worked at the CLI, and the GUI updated to show only the two "good" nodes. The problem is that the replication jobs that used to be active for pvn0 never got removed.

So what I'm asking is, is there a way to clean that up? pvn0 is no longer part of the cluster and the machine no longer exists. The replication jobs are not active and I understand it's probably not hurting anything... it's just a visual status that appears like there's an error.

The next-sync status on the pvn0 replication jobs reads as pending even though they are not active jobs. Their normal run time is Sunday. Will it clear automatically?
 
You have to clean up manually.
Destroy the replicated dataset by hand and remove the stale jobs from /etc/pve/replication.cfg.
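
For anyone who prefers to script the config edit, here is a minimal Python sketch of the second step only. It assumes replication.cfg uses the usual section layout (an unindented header line such as "local: 100-0" followed by indented "key value" lines, one of them "target <node>"). The function name drop_jobs_for_target and the sample job IDs are made up for illustration. It works on a string, so nothing is written to /etc/pve until you check the result and save it yourself. Keep a copy of the original file first. Destroying the leftover dataset is not covered here.

```python
def drop_jobs_for_target(cfg_text: str, node: str) -> str:
    """Return cfg_text without the job sections whose target is `node`.

    Assumed layout (not verified against every PVE version):
        local: 100-0
            target pvn0
            schedule */15
    """
    sections = []  # each section is a list of lines: header + indented properties
    for line in cfg_text.splitlines():
        if not line.strip():
            continue                   # blank separators are re-added on output
        if line[0].isspace() and sections:
            sections[-1].append(line)  # indented line belongs to the current job
        else:
            sections.append([line])    # unindented line starts a new job section

    def targets(section):
        # Collect the value of every "target <node>" property in this section.
        found = []
        for prop in section[1:]:
            parts = prop.split()
            if len(parts) >= 2 and parts[0] == "target":
                found.append(parts[1])
        return found

    kept = [s for s in sections if node not in targets(s)]
    return "".join("\n".join(s) + "\n\n" for s in kept)
```

Typical use would be to read /etc/pve/replication.cfg into a string, call drop_jobs_for_target(text, "pvn0"), compare the result with the original, and only then write it back.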
 
Hello, is this still considered as good practice? I mean manually removing lines from a file inside /etc/pve?
Thanks
 
Hello, is this still considered as good practice?
Yes. If Proxmox VE did not want users to edit these files, we would not expose them.
The /etc/pve/ mount is an SQLite-based filesystem.
 
