[SOLVED] Slow access to pmxcfs in PVE 5.2 cluster

Hi Proxmoxers,

What could be causing slow access (read and write) to pmxcfs, which is mounted at /etc/pve, in a PVE 5.2 cluster?

As a test, it takes more than 10 seconds to create an empty file inside /etc/pve. There is no performance issue on the local storage, which I confirmed by mounting pmxcfs locally.
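A quick way to reproduce the measurement looks roughly like this (the file names are just examples):

Code:
# timing an empty file creation on pmxcfs
time touch /etc/pve/testfile
rm /etc/pve/testfile

# same test on node-local storage for comparison
time touch /tmp/testfile
rm /tmp/testfile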
 
The most likely cause in a cluster would be the cluster network (all writes to /etc/pve need to be synchronized across all cluster nodes).

* Check your journal (especially for messages from pmxcfs and corosync)
* you can use `omping` to get some measurements of your cluster network (example invocations below)

See our docs for some omping invocations (and general information on our cluster-stack's network requirements):
https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_cluster_network
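For example (node names below are placeholders; the linked chapter has the exact invocations):

Code:
# recent messages from corosync and pmxcfs (pve-cluster)
journalctl -b -u corosync -u pve-cluster

# short latency/loss measurement between cluster nodes
# (run the same command on all listed nodes at the same time)
omping -c 10000 -i 0.001 -F -q node1 node2 node3

# longer test, mainly useful to catch multicast querier timeouts
omping -c 600 -i 1 -q node1 node2 node3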
 
Hi Stoiko, thanks for the advice.

Due to network constraints, we are using a custom corosync configuration with UDPU to support more than 16 nodes. We are also running a mixed cluster of PVE 5.0 and PVE 5.2. From the corosync journal, I could only find one node flapping: it keeps rejoining the corosync cluster without ever having left. From the pmxcfs journal, writes often fail after retrying several times.
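For reference, the unicast transport is selected with the transport option in the totem section of corosync.conf; a trimmed-down sketch (addresses, names and IDs here are just placeholders):

Code:
totem {
  version: 2
  cluster_name: examplecluster
  transport: udpu
  interface {
    ringnumber: 0
    bindnetaddr: 10.100.100.0
  }
}

nodelist {
  node {
    name: pmx1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.100.100.101
  }
  # one node {} entry per cluster member
}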

Is there a benchmark tool to test corosync with UDPU, since I believe omping only tests multicast?
 
udpu (unicast) does not scale as well as multicast - and 16 nodes is quite a large cluster - so my first guess is that you've reached the limit (and gone a bit beyond) of what is possible with your current network infrastructure.

omping sends both unicast and multicast packets - so you can still use it to check the latency.
What kind of network is your corosync running on?

You could consider creating a separate corosync ring with dedicated nics (and use a simple unmanaged switch, those usually don't interfere with multicast)
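A rough sketch of what such an extra ring looks like in corosync.conf (subnets and addresses are placeholders, and every node needs a matching ring1_addr):

Code:
totem {
  rrp_mode: passive
  interface {
    ringnumber: 0
    bindnetaddr: 10.100.100.0
  }
  # second ring on the dedicated corosync network
  interface {
    ringnumber: 1
    bindnetaddr: 10.200.200.0
  }
}

nodelist {
  node {
    name: pmx1
    nodeid: 1
    ring0_addr: 10.100.100.101
    ring1_addr: 10.200.200.101
  }
}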
 
I see, thanks for the information about the limitation.

We have a similar setup that does not have this issue. I'll give omping a try then. Corosync is running on top of an OVS bridge + LACP bond.

Unfortunately, adding a new NIC is not an option.
 
Hi Stoiko,

The omping test completed successfully without packet loss on unicast, and the latency is below 0.2 ms.

Reducing the cluster down to 6 nodes did not help. Could the frequent membership re-formations be causing the slow pmxcfs access?

The corosync and pve-cluster journal entries are as follows.
Code:
Dec 21 16:05:27 pmx1 corosync[4952]: notice  [TOTEM ] A new membership (10.100.100.101:78124) was formed. Members
Dec 21 16:05:27 pmx1 corosync[4952]:  [TOTEM ] A new membership (10.100.100.101:78124) was formed. Members
Dec 21 16:05:27 pmx1 corosync[4952]: warning [CPG   ] downlist left_list: 0 received
Dec 21 16:05:27 pmx1 corosync[4952]: warning [CPG   ] downlist left_list: 0 received
Dec 21 16:05:27 pmx1 corosync[4952]:  [CPG   ] downlist left_list: 0 received
Dec 21 16:05:27 pmx1 corosync[4952]: warning [CPG   ] downlist left_list: 0 received
Dec 21 16:05:27 pmx1 corosync[4952]:  [CPG   ] downlist left_list: 0 received
Dec 21 16:05:27 pmx1 corosync[4952]: warning [CPG   ] downlist left_list: 0 received
Dec 21 16:05:27 pmx1 corosync[4952]:  [CPG   ] downlist left_list: 0 received
Dec 21 16:05:27 pmx1 corosync[4952]: warning [CPG   ] downlist left_list: 0 received
Dec 21 16:05:27 pmx1 corosync[4952]:  [CPG   ] downlist left_list: 0 received
Dec 21 16:05:27 pmx1 corosync[4952]: warning [CPG   ] downlist left_list: 0 received
Dec 21 16:05:27 pmx1 corosync[4952]:  [CPG   ] downlist left_list: 0 received
Dec 21 16:05:27 pmx1 corosync[4952]:  [CPG   ] downlist left_list: 0 received
Dec 21 16:05:27 pmx1 corosync[4952]:  [CPG   ] downlist left_list: 0 received
Dec 21 16:05:27 pmx1 corosync[4952]:  [CPG   ] downlist left_list: 0 received
Dec 21 16:05:27 pmx1 corosync[4952]:  [CPG   ] downlist left_list: 0 received
Dec 21 16:05:27 pmx1 corosync[4952]:  [CPG   ] downlist left_list: 0 received
Dec 21 16:05:36 pmx1 pmxcfs[4925]: [status] notice: cpg_send_message retry 10
Dec 21 16:05:37 pmx1 pmxcfs[4925]: [status] notice: cpg_send_message retry 20
Dec 21 16:05:38 pmx1 pmxcfs[4925]: [status] notice: cpg_send_message retry 30
Dec 21 16:05:39 pmx1 pmxcfs[4925]: [status] notice: cpg_send_message retry 40
Dec 21 16:05:40 pmx1 pmxcfs[4925]: [status] notice: cpg_send_message retry 50
Dec 21 16:05:41 pmx1 pmxcfs[4925]: [status] notice: cpg_send_message retry 60
Dec 21 16:05:42 pmx1 pmxcfs[4925]: [status] notice: cpg_send_message retry 70
Dec 21 16:05:43 pmx1 pmxcfs[4925]: [status] notice: cpg_send_message retry 80
Dec 21 16:05:44 pmx1 pmxcfs[4925]: [status] notice: cpg_send_message retry 90
Dec 21 16:05:45 pmx1 pmxcfs[4925]: [status] notice: cpg_send_message retry 100
Dec 21 16:05:45 pmx1 pmxcfs[4925]: [status] notice: cpg_send_message retried 100 times
Dec 21 16:05:45 pmx1 pmxcfs[4925]: [status] crit: cpg_send_message failed: 6
Dec 21 16:05:46 pmx1 pmxcfs[4925]: [status] notice: cpg_send_message retry 10
Dec 21 16:05:47 pmx1 pmxcfs[4925]: [status] notice: cpg_send_message retry 20
Dec 21 16:05:48 pmx1 pmxcfs[4925]: [status] notice: cpg_send_message retry 30
Dec 21 16:05:49 pmx1 pmxcfs[4925]: [status] notice: cpg_send_message retry 40
Dec 21 16:05:50 pmx1 pmxcfs[4925]: [status] notice: cpg_send_message retry 50
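A quick way to cross-check membership, quorum and ring state between these re-formations (standard PVE/corosync tools):

Code:
# quorum and membership as seen by this node
pvecm status

# status of the corosync ring(s) on this node
corosync-cfgtool -s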
 
Using tcpdump to troubleshoot the corosync issue, I managed to find some old Proxmox nodes which were removed from the cluster but had accidentally been turned on again. Removing the old nodes without reinstalling, as documented in the Proxmox Cluster Manager documentation, fixed the issue.
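For anyone hitting the same symptoms, a rough sketch of the approach (interface and node names are placeholders; corosync uses UDP ports 5404/5405 by default, and the removal procedure itself is the one from the pvecm chapter):

Code:
# watch corosync traffic to spot senders that should no longer be cluster members
tcpdump -n -i eth0 udp and portrange 5404-5405

# once the stale node is powered off, remove it from the cluster configuration
pvecm delnode oldnode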
 
Glad the issue was found! Thanks for updating the thread and marking it as solved!
 