rbd snapshot "write 1000 at 0 result -30" error

alexskysilk · Jan 26, 2018

While running backups, vzdump got stuck on a specific container. There is no outward indication of a fault, but the task isn't moving and the syslog is being spammed with

rbd: rbd54: write 1000 at 0 result -30

The vzdump processes are not in a D state, so all appears normal, but the size of the output file has not changed in half an hour.
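For reference, the trailing -30 in that kernel message is a negated errno value; on Linux, errno 30 is EROFS (read-only file system), which would be consistent with something attempting a write against the read-only snapshot device. Below is a minimal sketch for decoding such lines. The function name `decode_rbd_message` is hypothetical, and treating the length and offset as hexadecimal is an assumption about how the kernel rbd driver prints them, not something confirmed in this thread.

```python
import errno
import os
import re

# Matches kernel messages shaped like the one quoted above:
#   rbd: rbd54: write 1000 at 0 result -30
RBD_MSG = re.compile(
    r"rbd: (?P<dev>rbd\d+): (?P<op>\w+) (?P<length>[0-9a-f]+) "
    r"at (?P<offset>[0-9a-f]+) result (?P<result>-?\d+)"
)


def decode_rbd_message(line):
    """Split an rbd kernel message into fields and name the errno."""
    match = RBD_MSG.search(line)
    if match is None:
        return None
    code = abs(int(match.group("result")))
    return {
        "device": match.group("dev"),
        "operation": match.group("op"),
        # Assumption: length and offset are printed in hexadecimal.
        "length": int(match.group("length"), 16),
        "offset": int(match.group("offset"), 16),
        "errno_name": errno.errorcode.get(code, "UNKNOWN"),
        "errno_text": os.strerror(code),
    }


if __name__ == "__main__":
    print(decode_rbd_message("rbd: rbd54: write 1000 at 0 result -30"))
```

Under the hexadecimal assumption, the line above would describe a 4096-byte write at offset 0 on rbd54 being rejected with EROFS.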

Code:

1320536 ?        S      0:00 /bin/bash -c set -o pipefail && tar cpf - --totals --one-file-system -p --sparse --numeric-owner --acls --xattrs '--xattrs-include=user.*' '--xattrs-include=security.capability' '--warning=no-file-ignored' '--warning=no-xattr-write' --one-file-system '--warning=no-file-ignored' '--directory=/mnt/pve/altbackup/dump/vzdump-lxc-32427-2018_01_25-14_29_04.tmp' ./etc/vzdump/pct.conf '--directory=/mnt/vzsnap0' --no-anchored '--exclude=lost+found' --anchored '--exclude=./tmp/?*' '--exclude=./var/tmp/?*' '--exclude=./var/run/?*.pid' ./ | lzop >/mnt/pve/altbackup/dump/vzdump-lxc-32427-2018_01_25-14_29_04.tar.dat
1320537 ?        R     30:51 tar cpf - --totals --one-file-system -p --sparse --numeric-owner --acls --xattrs --xattrs-include=user.* --xattrs-include=security.capability --warning=no-file-ignored --warning=no-xattr-write --one-file-system --warning=no-file-ignored --directory=/mnt/pve/altbackup/dump/vzdump-lxc-32427-2018_01_25-14_29_04.tmp ./etc/vzdump/pct.conf --directory=/mnt/vzsnap0 --no-anchored --exclude=lost+found --anchored --exclude=./tmp/?* --exclude=./var/tmp/?* --exclude=./var/run/?*.pid ./
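The state column in the listing above (S for the shell, R for tar) can also be read straight from /proc on Linux. The following is only a sketch, and the helper names `parse_stat_state` and `process_state` are hypothetical. It relies on the documented layout of /proc/<pid>/stat, where the state letter is the first field after the parenthesised command name.

```python
def parse_stat_state(stat_text):
    """Return the one-letter state from the contents of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split after the LAST closing parenthesis.
    """
    after_comm = stat_text[stat_text.rfind(")") + 1:]
    return after_comm.split()[0]


def process_state(pid):
    """Read the current state of a process (Linux only)."""
    with open(f"/proc/{pid}/stat") as handle:
        return parse_stat_state(handle.read())


if __name__ == "__main__":
    import os
    # "D" would indicate uninterruptible sleep, "R" running, "S" sleeping.
    print(process_state(os.getpid()))
```

A state of D would point to a process blocked in the kernel, typically on I/O; R with growing CPU time, as shown for tar here, means the process is still being scheduled even though the output is not advancing.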

I finally killed the vzdump task; the snapshot was NOT deleted and remained mapped. Running fsck -n /dev/rbd54 (the snapshot) yielded "fsck.ext4: MMP: open with O_DIRECT failed while reading MMP block." I then unmapped and deleted the snapshot.

Afterwards, a manual backup of the container succeeded without issue.

1. why did this happen, and how to identify?
2. If a snapshot is unreadable, vzdump should fail for that container, perform cleanup and resume to the next scheduled task. Should I submit this as a bug?
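On the "how to identify" part: since the only visible symptom was an output file that stopped growing, one way to catch this from outside vzdump is to watch the file size over a time window. This is a rough sketch only; `is_stalled` and its parameters are hypothetical and not part of vzdump or any Proxmox tooling.

```python
import os
import time


def is_stalled(path, window_seconds, poll_seconds=60.0,
               getsize=os.path.getsize, sleep=time.sleep):
    """Return True if the size of `path` stays unchanged for the whole window.

    `getsize` and `sleep` are injectable so the check can be exercised
    without real files or real waiting.
    """
    start_size = getsize(path)
    waited = 0.0
    while waited < window_seconds:
        sleep(poll_seconds)
        waited += poll_seconds
        if getsize(path) != start_size:
            return False  # the file changed size, so the backup is progressing
    return True


if __name__ == "__main__":
    import sys
    # Example: python stall_check.py /path/to/output 1800
    print(is_stalled(sys.argv[1], float(sys.argv[2])))
```

With a 1800-second window this would flag the half-hour stall described above; what to do next (kill the task, unmap the snapshot) is left to the operator.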

Alwin · Jan 26, 2018

alexskysilk said:
fsck.ext4: MMP: open with O_DIRECT failed while reading MMP block.

You might have been running into the multi mount protection.
https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout#Multiple_Mount_Protection

alexskysilk said:
1. why did this happen, and how to identify?
2. If a snapshot is unreadable, vzdump should fail for that container, perform cleanup and resume to the next scheduled task. Should I submit this as a bug?

Does it look similar to this?
https://bugzilla.proxmox.com/show_bug.cgi?id=1541

alexskysilk · Jan 26, 2018

Alwin said:
You might have been running into the multi mount protection.

Maybe, but fsck -n is supposed to open the file system read-only. Also, when I unmapped the rbd, it did not complain about being in use.

Alwin said:
Does it look similar to this?
https://bugzilla.proxmox.com/show_bug.cgi?id=1541

No. In that case the task failed; in my case the task never failed on its own and had to be killed manually.

Alwin · Jan 29, 2018

Is this post connected to your other thread?
https://forum.proxmox.com/threads/lxc-container-stuck-on-startup-hangs-pveproxy.40235/

alexskysilk · Jan 29, 2018

I can't say for certain. While both symptoms occurred during vzdump jobs, this fault is interruptible (killable) and only happened once; my other thread is about a more severe bug (it's not killable) that is also more reproducible.
