Ex-cluster node has frequent writes to pve-cluster and pve-replication

chill

Hi,

Running Proxmox VE 8.0.4 on a new install. This node was joined to a cluster, had VMs migrated onto it, and was then removed from that cluster by following "Separate a Node Without Reinstalling" from the wiki. So far so good.
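For reference, the steps from that wiki article are roughly the following (quoting from memory, so check the wiki before running anything; the pmxcfs -l step starts the cluster filesystem in local mode):

Code:
# systemctl stop pve-cluster corosync
# pmxcfs -l
# rm /etc/pve/corosync.conf
# rm -rf /etc/corosync/*
# killall pmxcfs
# systemctl start pve-cluster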

The issue I see is that the following files are written to / re-created every 60 seconds. Taking two ZFS snapshots a minute apart and diffing them, I see:

Code:
 # zfs diff rpool/ROOT/pve-1@snap1 rpool/ROOT/pve-1@snap2
M       /var/lib/pve-manager
M       /var/lib/pve-cluster/config.db-wal
M       /var/lib/pve-cluster/config.db-shm
+       /var/lib/pve-manager/pve-replication-state.json
-       /var/lib/pve-manager/pve-replication-state.json

The file /var/lib/pve-manager/pve-replication-state.json only contains {}. Also, pvesr status shows nothing, and in the log for pvescheduler I see the following after startup:

Code:
pvescheduler[101516]: jobs: cfs-lock 'file-jobs_cfg' error: no quorum!
pvescheduler[101515]: replication: cfs-lock 'file-replication_cfg' error: no quorum!

Is there a timer or something I need to stop here?
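For what it's worth, these are the standard systemd commands I would use to look for one (nothing Proxmox-specific, just generic checks):

Code:
# systemctl list-timers --all
# systemctl status pvescheduler.service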

Thanks!
 
Hi there, thanks for the reply. Each of those views in the GUI is empty and shows a popup saying "Replication needs at least two nodes".
 
Hi,

Has anyone found a solution? I just found out I have a similar problem. The following files are written to excessively, every minute! This node is not part of a cluster and has no replication jobs at all.

Code:
/var/lib/pve-manager
/var/lib/pve-manager/pve-replication-state.json
/var/lib/pve-cluster/config.db-shm
/var/lib/pve-cluster/config.db-wal

I've also turned off HA and corosync to minimize disk writes. Are there any other services that should be stopped?
Code:
systemctl disable --now pve-ha-crm.service
systemctl disable --now pve-ha-lrm.service
systemctl disable --now corosync.service
 
Those files are updated by the pve-cluster service, which is needed even if you don't have a cluster, as it provides the Proxmox Cluster File System (pmxcfs) that backs the /etc/pve directory where the configuration resides [1]. It shouldn't be a problem at all.

[1] https://pve.proxmox.com/wiki/Service_daemons#pve-cluster
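You can see the FUSE mount that pmxcfs provides with standard tools, just as an illustration:

Code:
# findmnt /etc/pve
# systemctl status pve-cluster.service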
Thank you for your reply.

Since this node is not part of a cluster, would it be OK to put /var/lib/pve-manager and /var/lib/pve-cluster on ZFS datasets with the sync=disabled property? That would disable synchronous writes and hopefully reduce pve-cluster's constant disk writes. This ZFS pool uses a special vdev on solid-state disks, so the constant writes are not good for them.
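Something like this is what I have in mind (a sketch only: the dataset name is made up, the service would have to be stopped and the existing files moved onto the new dataset first, and I haven't verified that pve-cluster tolerates sync=disabled):

Code:
# zfs create -o mountpoint=/var/lib/pve-cluster rpool/pve-cluster-data
# zfs set sync=disabled rpool/pve-cluster-data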
 
I wouldn't mess with that at all. If the few bytes per minute written by those processes have any impact on your drives, replace them with proper hardware. Anything else will write more bytes than that: logs, backups, tasks, the VMs themselves, etc.

Well, as I keep finding more and more of these threads, I notice the same pattern: everyone is advised to use PLP SSDs rather than to look into the implementation reasons for the very unusual write amplification:

https://forum.proxmox.com/threads/etc-pve-500k-600m-amplification.154074/#post-701223

If anyone (@chill @tkittich) caught in this is willing to try an alternative pmxcfs, please let me know. In turn, it would help find out which processes are actually writing beyond one block size and which are just writing constantly, which would be the next candidates for optimisation. A crude way to watch the writers is shown below.
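For example, fatrace (from the Debian package of the same name) can attribute file events to processes; something like this, filtered to the paths in question, is a rough starting point:

Code:
# apt install fatrace
# fatrace | grep -e pve-cluster -e pve-manager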