$> tune2fs -l /dev/sda1 | grep -ie "Block count" -ie "Reserved block count" -ie "Block size"
The limit (in GB):
("Block count" - "Reserved block count") * "Block size" / 1024^3
$> tune2fs -l /dev/sda1 | grep -ie "Block count" -ie "Reserved block count" -ie "Block size" | awk '{print $(NF-0)}' | tr "\n" " " | awk '{print ($1 - $2) * $3 / 1024^3 " GB" }'
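As a worked example with made-up numbers (not taken from any server in this thread): a block count of 26214400, a reserved block count of 1310720 (the default 5%) and a block size of 4096 would give

$> echo $(( (26214400 - 1310720) * 4096 / 1024**3 )) GB
95 GB

i.e. on that 100 GiB filesystem unprivileged users can only write about 95 GiB before they start getting "no space left on device" errors.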
Thanks for your reply. It's not a lack of space; there's enough space available on all these servers.
Hi!
How much free space do you have on the storage?
The EXT4 filesystem has 5% reserved space by default; maybe you're hitting that limit? (When you hit it, you can't write any more.)
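If the reserved blocks do turn out to be the problem, the percentage can be lowered with tune2fs. A minimal sketch, assuming /dev/sda1 is the affected filesystem (the reservation only blocks unprivileged users; root can still write into it):

$> tune2fs -l /dev/sda1 | grep -i "Reserved block count"
$> tune2fs -m 1 /dev/sda1   # lower the reservation from the default 5% to 1%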
It happened on another VM, not sure if related. The output of this command above was 107. I grabbed kernel logs from the guest:

Should it happen again, you can check how many open file descriptors there are with
ls -1 /proc/$(cat /var/run/qemu-server/123.pid)/fd/ | wc -l
replacing 123 with the actual ID of your VM.
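To see whether that count is anywhere near the limit of the process, the limits can be read from procfs as well; a sketch, again with 123 standing in for the actual VM ID:

$> grep "Max open files" /proc/$(cat /var/run/qemu-server/123.pid)/limits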
This might actually be unrelated: the timestamp in the VM is 20:03 EST, and this server had a running backup that failed due to a hardware problem with the backup server at around the same time. This behaviour isn't really ideal (no iothread, VirtIO SCSI controller).
INFO: 69% (7.7 GiB of 11.0 GiB) in 1m 33s, read: 101.3 MiB/s, write: 101.3 MiB/s
INFO: 69% (7.7 GiB of 11.0 GiB) in 17m 15s, read: 0 B/s, write: 0 B/s
ERROR: backup write data failed: command error: write_data upload error: pipelined request failed: timed out
INFO: aborting backup job
INFO: resuming VM again
ERROR: Backup of VM 412 failed - backup write data failed: command error: write_data upload error: pipelined request failed: timed out
INFO: Failed at 2024-02-02 01:14:54
INFO: Starting Backup of VM 413 (qemu)
INFO: Backup started at 2024-02-02 01:14:54
INFO: status = running
The system logs show failed IO and that the filesystem is remounted read-only because of that.
When the connection to the backup target is lost or too slow, it's unfortunately expected. See my reply here: https://forum.proxmox.com/threads/i...m-in-the-vm-after-fsfreeze.141080/post-631577
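For reference, the guest-side evidence for this usually shows up in the kernel log; a rough sketch of how to look for it (the exact message wording differs between kernel versions):

$> journalctl -k -b | grep -iE "I/O error|EXT4-fs error|read-only"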
Does ext4 in the guest still crash after disabling discard? We also have an Ubuntu VM with 64G storage; ext4 crashes (the fs goes read-only) once every two weeks, while other VMs are working normally (some of them also have discard on, though). Also not sure if it's the same cause as this.

Thank you very much for your reply. Unfortunately, in most cases the guest console got spammed with systemd messages stating the disk was read-only. I was able to grab the error right after in one or two instances; I don't have a screenshot, but it was something like this:
validate_block_bitmap comm_fstrim bad block bitmap checksum
I had assumed it might be fstrim on the host/guest, so I disabled fstrim.timer on both, but it still happened. I was also able to trigger it on two guests by running fstrim -v / on the host, but I couldn't reproduce it after that.
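For reference, disabling and verifying the timer on a systemd-based system looks roughly like this:

$> systemctl disable --now fstrim.timer
$> systemctl status fstrim.timer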
No backups or snapshots at all.
I will try the jq command if/when this happens again. My most recent change was to disable discard on the VM disks on PVE; I'll report back if this reoccurs.
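For reference, the discard flag lives in the disk line of the VM config and can be checked and changed with qm; a sketch using a hypothetical VM ID, storage and disk name (the whole drive string has to be given again when setting it):

$> qm config 100 | grep scsi0
$> qm set 100 --scsi0 local-lvm:vm-100-disk-0,discard=ignore   # discard=ignore disables it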
Disabling/enabling discard didn't seem to help much. It has gotten a lot less frequent recently though, not sure what has changed.