Problem with LXC container on PVE8 due to mmp_update_interval being too big.

VictorSTS

Hello,

<TLDR>
It seems that PVE, LXC or even Ceph changes ext4's mmp_update_interval dynamically.
Why, when and how does it do that?
</TLDR>

Full details below:

In a PVE 8.1 cluster with Ceph 18.2.1 storage, I had a situation yesterday where a privileged LXC (ID 200) with a 4.2 TB ext4 disk as mp0 somehow got stuck just after a backup, while the backup snapshot was being deleted. The snapshot got removed from the storage, but it was still present in the CT config in "deleted" state.

These are the last backup log lines:
Code:
INFO: Duration: 10188.46s
INFO: End Time: Thu Jul 31 01:59:51 2025
INFO: adding notes to backup
INFO: cleanup temporary 'vzdump' snapshot
Removing snap: 100% complete...done.
2025-07-31T02:14:57.974+0200 7f930e8086c0 -1 librbd::ImageWatcher: 0x7f92fc007550 image watch failed: 140269269914816, (107) Transport endpoint is not connected
2025-07-31T02:14:57.974+0200 7f930e8086c0 -1 librbd::Watcher: 0x7f92fc007550 handle_error: handle=140269269914816: (107) Transport endpoint is not connected

Every other backup seemed to be OK, and there were no Ceph errors/warnings in the logs, etc.

The real issue came next: the node became "greyed out" as pvestatd couldn't refresh state info, because the lxc-info processes for CT 200 got stuck and the CT itself was completely unresponsive. While I was reviewing logs, the whole host hung (no ping, no IPMI console, fully frozen), as if the kernel had gotten completely stuck in an I/O deadlock (which may happen, as the LXC uses the Ceph KRBD kernel-mode driver to access the storage).
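For reference, and not part of my original troubleshooting: this kind of stuck task can usually be spotted from another shell on the node with something like the following (standard procps/dmesg tooling assumed):

Code:
# Tasks in uninterruptible I/O sleep (state "D"), typical of a KRBD/ext4 deadlock
ps -eo pid,stat,wchan:32,cmd | awk '$2 ~ /D/'
# Hung-task warnings reported by the kernel
dmesg | grep -i "blocked for more than"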

The host had to be power reset and it booted OK. Then, when trying to start the CT, this showed up in the journal and on the console:

Code:
kernel: EXT4-fs warning (device rbd1): ext4_multi_mount_protect:328: MMP interval 720 higher than expected, please wait

This essentially means that the kernel will wait 720 seconds times 4 (nearly 48 minutes!) before allowing access to that disk of the CT. Meanwhile, the CT was stuck in the starting state but obviously wasn't working. If left in that state long enough, pvestatd would become unresponsive again and the whole node would turn "grey". I didn't wait long enough to check whether the whole host would fully freeze again.
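For anyone hitting the same warning: once the RBD image is mapped, the filesystem's MMP settings can be inspected from the superblock (the device name below is just an example, adjust it to whatever your image maps to):

Code:
dumpe2fs -h /dev/rbd1 | grep -i mmp
# or
tune2fs -l /dev/rbd1 | grep -i mmp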

To sort it out, I killed the CT, manually mapped the RBD image and reduced mmp_update_interval:

Code:
tune2fs -E mmp_update_interval=30 /dev/rbd0
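In case it helps someone else, the full sequence was roughly the following (pool and image names are placeholders, adjust them to your setup; the device node may also differ):

Code:
# Map the CT's mp0 image manually (placeholder names)
rbd map <pool>/<image-of-mp0>
# Lower the MMP update interval so the mount doesn't wait ~48 minutes
tune2fs -E mmp_update_interval=30 /dev/rbd0
# Check the filesystem before handing it back to the CT
fsck.ext4 -f /dev/rbd0
# Unmap so PVE can map it again when the CT starts
rbd unmap /dev/rbd0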

After that, I ran fsck.ext4 and no issues were found, so I simply started the LXC. It took the expected 30 seconds I had set for mmp_update_interval and booted perfectly fine. Curious about it, I did an orderly shutdown of the CT, mapped the RBD image again and checked mmp_update_interval: it was set to "5" instead of the "30" I had manually set.
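To reproduce that check, the value can be read back while the CT is stopped and the image is manually mapped again (same placeholder names as above):

Code:
rbd map <pool>/<image-of-mp0>
tune2fs -l /dev/rbd0 | grep -i mmp
rbd unmap /dev/rbd0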

Why, when and how does mmp_update_interval get changed?
 