Snapshot removal fails after backup

e100

Storage setup: LVM on top of DRBD.

vzdump was unable to remove one of the two snapshots that were created while backing up a single VM.


Backup log:
Code:
Apr 16 20:15:06 INFO: Starting Backup of VM 101 (qemu)
Apr 16 20:15:06 INFO: status = running
Apr 16 20:15:07 INFO: backup mode: snapshot
Apr 16 20:15:07 INFO: ionice priority: 7
Apr 16 20:15:07 INFO: suspend vm to make snapshot
Apr 16 20:15:07 INFO:   Logical volume "vzsnap-vm6-0" created
Apr 16 20:15:08 INFO:   Logical volume "vzsnap-vm6-1" created
Apr 16 20:15:08 INFO: resume vm
Apr 16 20:15:08 INFO: vm is online again after 1 seconds
Apr 16 20:15:08 INFO: creating archive '/backup/dump/vzdump-qemu-101-2012_04_16-20_15_06.tar.lzo'
Apr 16 20:15:08 INFO: adding '/backup/dump/vzdump-qemu-101-2012_04_16-20_15_06.tmp/qemu-server.conf' to archive ('qemu-server.conf')
Apr 16 20:15:08 INFO: adding '/dev/drbd1-vm6-vm5/vzsnap-vm6-0' to archive ('vm-disk-virtio1.raw')
Apr 16 21:11:16 INFO: adding '/dev/drbd1-vm6-vm5/vzsnap-vm6-1' to archive ('vm-disk-virtio0.raw')
Apr 16 21:17:05 INFO: Total bytes written: 356482288640 (91.46 MiB/s)
Apr 16 21:17:05 INFO: archive file size: 127.36GB
Apr 16 21:20:14 INFO: lvremove failed - trying again in 8 seconds
Apr 16 21:21:22 INFO: lvremove failed - trying again in 16 seconds
Apr 16 21:22:39 INFO: lvremove failed - trying again in 32 seconds
Apr 16 21:24:11 ERROR: command 'lvremove -f /dev/drbd1-vm6-vm5/vzsnap-vm6-1' failed: got lock timeout - aborting command
Apr 16 21:24:11 INFO: Finished Backup of VM 101 (01:09:05)

Kernel messages related to lvremove:
Code:
Apr 16 21:19:31 vm6 kernel: lvremove      D ffff88063787b480     0 396621 393033    0 0x00000000
Apr 16 21:19:31 vm6 kernel: ffff88025ea3fb18 0000000000000082 ffff88025ea3fad8 ffffffff8140c1bc
Apr 16 21:19:31 vm6 kernel: 0000000000000008 0000000000001000 0000000000000000 000000000000000c
Apr 16 21:19:31 vm6 kernel: ffff88025ea3fb08 ffff88063787ba20 ffff88025ea3ffd8 ffff88025ea3ffd8
Apr 16 21:19:31 vm6 kernel: Call Trace:
Apr 16 21:19:31 vm6 kernel: [<ffffffff8140c1bc>] ? dm_table_unplug_all+0x5c/0x100
Apr 16 21:19:31 vm6 kernel: [<ffffffff81512663>] io_schedule+0x73/0xc0
Apr 16 21:19:31 vm6 kernel: [<ffffffff811cc42e>] __blockdev_direct_IO_newtrunc+0x6ee/0xb80
Apr 16 21:19:31 vm6 kernel: [<ffffffff811cc91e>] __blockdev_direct_IO+0x5e/0xd0
Apr 16 21:19:31 vm6 kernel: [<ffffffff811c91f0>] ? blkdev_get_blocks+0x0/0xc0
Apr 16 21:19:31 vm6 kernel: [<ffffffff811ca047>] blkdev_direct_IO+0x57/0x60
Apr 16 21:19:31 vm6 kernel: [<ffffffff811c91f0>] ? blkdev_get_blocks+0x0/0xc0
Apr 16 21:19:31 vm6 kernel: [<ffffffff81123084>] generic_file_aio_read+0x744/0x7d0
Apr 16 21:19:31 vm6 kernel: [<ffffffff811cab21>] ? blkdev_open+0x71/0xc0
Apr 16 21:19:31 vm6 kernel: [<ffffffff8118c173>] ? __dentry_open+0x113/0x330
Apr 16 21:19:31 vm6 kernel: [<ffffffff8122d2e8>] ? devcgroup_inode_permission+0x48/0x50
Apr 16 21:19:31 vm6 kernel: [<ffffffff8118ea5a>] do_sync_read+0xfa/0x140
Apr 16 21:19:31 vm6 kernel: [<ffffffff8119fcc4>] ? user_path_at+0x64/0xa0
Apr 16 21:19:31 vm6 kernel: [<ffffffff810944b0>] ? autoremove_wake_function+0x0/0x40
Apr 16 21:19:31 vm6 kernel: [<ffffffff811c962c>] ? block_ioctl+0x3c/0x40
Apr 16 21:19:31 vm6 kernel: [<ffffffff811a22d2>] ? vfs_ioctl+0x22/0xa0
Apr 16 21:19:31 vm6 kernel: [<ffffffff811a247a>] ? do_vfs_ioctl+0x8a/0x5d0
Apr 16 21:19:31 vm6 kernel: [<ffffffff8118f435>] vfs_read+0xb5/0x1a0
Apr 16 21:19:31 vm6 kernel: [<ffffffff8118f571>] sys_read+0x51/0x90
Apr 16 21:19:31 vm6 kernel: [<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
Apr 16 21:19:31 vm6 kernel: vgs           D ffff880637bdc400     0 396625   4040    0 0x00000000
Apr 16 21:19:31 vm6 kernel: ffff88051973bb18 0000000000000082 ffff88051973bad8 ffffffff8140c1bc
Apr 16 21:19:31 vm6 kernel: 0000000000000008 0000000000001000 0000000000000000 000000000000000c
Apr 16 21:19:31 vm6 kernel: ffff88051973bb08 ffff880637bdc9a0 ffff88051973bfd8 ffff88051973bfd8
Apr 16 21:19:31 vm6 kernel: Call Trace:
Apr 16 21:19:31 vm6 kernel: [<ffffffff8140c1bc>] ? dm_table_unplug_all+0x5c/0x100
Apr 16 21:19:31 vm6 kernel: [<ffffffff81512663>] io_schedule+0x73/0xc0
Apr 16 21:19:31 vm6 kernel: [<ffffffff811cc42e>] __blockdev_direct_IO_newtrunc+0x6ee/0xb80
Apr 16 21:19:31 vm6 kernel: [<ffffffff811cc91e>] __blockdev_direct_IO+0x5e/0xd0
Apr 16 21:19:31 vm6 kernel: [<ffffffff811c91f0>] ? blkdev_get_blocks+0x0/0xc0
Apr 16 21:19:31 vm6 kernel: [<ffffffff811ca047>] blkdev_direct_IO+0x57/0x60
Apr 16 21:19:31 vm6 kernel: [<ffffffff811c91f0>] ? blkdev_get_blocks+0x0/0xc0
Apr 16 21:19:31 vm6 kernel: [<ffffffff81123084>] generic_file_aio_read+0x744/0x7d0
Apr 16 21:19:31 vm6 kernel: [<ffffffff811cab21>] ? blkdev_open+0x71/0xc0
Apr 16 21:19:31 vm6 kernel: [<ffffffff8118c173>] ? __dentry_open+0x113/0x330
Apr 16 21:19:31 vm6 kernel: [<ffffffff8122d2e8>] ? devcgroup_inode_permission+0x48/0x50
Apr 16 21:19:31 vm6 kernel: [<ffffffff8118ea5a>] do_sync_read+0xfa/0x140
Apr 16 21:19:31 vm6 kernel: [<ffffffff8119fcc4>] ? user_path_at+0x64/0xa0
Apr 16 21:19:31 vm6 kernel: [<ffffffff810944b0>] ? autoremove_wake_function+0x0/0x40
Apr 16 21:19:31 vm6 kernel: [<ffffffff811c962c>] ? block_ioctl+0x3c/0x40
Apr 16 21:19:31 vm6 kernel: [<ffffffff811a22d2>] ? vfs_ioctl+0x22/0xa0
Apr 16 21:19:31 vm6 kernel: [<ffffffff811a247a>] ? do_vfs_ioctl+0x8a/0x5d0
Apr 16 21:19:31 vm6 kernel: [<ffffffff8118f435>] vfs_read+0xb5/0x1a0
Apr 16 21:19:31 vm6 kernel: [<ffffffff8118f571>] sys_read+0x51/0x90
Apr 16 21:19:31 vm6 kernel: [<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
Apr 16 21:21:31 vm6 kernel: lvremove      D ffff88063787b480     0 396621 393033    0 0x00000000
Apr 16 21:21:31 vm6 kernel: ffff88025ea3fb18 0000000000000082 ffff88025ea3fad8 ffffffff8140c1bc
Apr 16 21:21:31 vm6 kernel: 0000000000000008 0000000000001000 0000000000000000 000000000000000c
Apr 16 21:21:31 vm6 kernel: ffff88025ea3fb08 ffff88063787ba20 ffff88025ea3ffd8 ffff88025ea3ffd8
Apr 16 21:21:31 vm6 kernel: Call Trace:
Apr 16 21:21:31 vm6 kernel: [<ffffffff8140c1bc>] ? dm_table_unplug_all+0x5c/0x100
Apr 16 21:21:31 vm6 kernel: [<ffffffff81512663>] io_schedule+0x73/0xc0
Apr 16 21:21:31 vm6 kernel: [<ffffffff811cc42e>] __blockdev_direct_IO_newtrunc+0x6ee/0xb80
Apr 16 21:21:31 vm6 kernel: [<ffffffff811cc91e>] __blockdev_direct_IO+0x5e/0xd0
Apr 16 21:21:31 vm6 kernel: [<ffffffff811c91f0>] ? blkdev_get_blocks+0x0/0xc0
Apr 16 21:21:31 vm6 kernel: [<ffffffff811ca047>] blkdev_direct_IO+0x57/0x60
Apr 16 21:21:31 vm6 kernel: [<ffffffff811c91f0>] ? blkdev_get_blocks+0x0/0xc0
Apr 16 21:21:31 vm6 kernel: [<ffffffff81123084>] generic_file_aio_read+0x744/0x7d0
Apr 16 21:21:31 vm6 kernel: [<ffffffff811cab21>] ? blkdev_open+0x71/0xc0
Apr 16 21:21:31 vm6 kernel: [<ffffffff8118c173>] ? __dentry_open+0x113/0x330
Apr 16 21:21:31 vm6 kernel: [<ffffffff8122d2e8>] ? devcgroup_inode_permission+0x48/0x50
Apr 16 21:21:31 vm6 kernel: [<ffffffff8118ea5a>] do_sync_read+0xfa/0x140
Apr 16 21:21:31 vm6 kernel: [<ffffffff8119fcc4>] ? user_path_at+0x64/0xa0
Apr 16 21:21:31 vm6 kernel: [<ffffffff810944b0>] ? autoremove_wake_function+0x0/0x40
Apr 16 21:21:31 vm6 kernel: [<ffffffff811c962c>] ? block_ioctl+0x3c/0x40
Apr 16 21:21:31 vm6 kernel: [<ffffffff811a22d2>] ? vfs_ioctl+0x22/0xa0
Apr 16 21:21:31 vm6 kernel: [<ffffffff811a247a>] ? do_vfs_ioctl+0x8a/0x5d0
Apr 16 21:21:31 vm6 kernel: [<ffffffff8118f435>] vfs_read+0xb5/0x1a0
Apr 16 21:21:31 vm6 kernel: [<ffffffff8118f571>] sys_read+0x51/0x90
Apr 16 21:21:31 vm6 kernel: [<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
Apr 16 21:21:31 vm6 kernel: vgs           D ffff880637bdc400     0 396625   4040    0 0x00000000
Apr 16 21:21:31 vm6 kernel: ffff88051973bb18 0000000000000082 ffff88051973bad8 ffffffff8140c1bc
Apr 16 21:21:31 vm6 kernel: 0000000000000008 0000000000001000 0000000000000000 000000000000000c
Apr 16 21:21:31 vm6 kernel: ffff88051973bb08 ffff880637bdc9a0 ffff88051973bfd8 ffff88051973bfd8
Apr 16 21:21:31 vm6 kernel: Call Trace:
Apr 16 21:21:31 vm6 kernel: [<ffffffff8140c1bc>] ? dm_table_unplug_all+0x5c/0x100
Apr 16 21:21:31 vm6 kernel: [<ffffffff81512663>] io_schedule+0x73/0xc0
Apr 16 21:21:31 vm6 kernel: [<ffffffff811cc42e>] __blockdev_direct_IO_newtrunc+0x6ee/0xb80
Apr 16 21:21:31 vm6 kernel: [<ffffffff811cc91e>] __blockdev_direct_IO+0x5e/0xd0
Apr 16 21:21:31 vm6 kernel: [<ffffffff811c91f0>] ? blkdev_get_blocks+0x0/0xc0
Apr 16 21:21:31 vm6 kernel: [<ffffffff811ca047>] blkdev_direct_IO+0x57/0x60
Apr 16 21:21:31 vm6 kernel: [<ffffffff811c91f0>] ? blkdev_get_blocks+0x0/0xc0
Apr 16 21:21:31 vm6 kernel: [<ffffffff81123084>] generic_file_aio_read+0x744/0x7d0
Apr 16 21:21:31 vm6 kernel: [<ffffffff811cab21>] ? blkdev_open+0x71/0xc0
Apr 16 21:21:31 vm6 kernel: [<ffffffff8118c173>] ? __dentry_open+0x113/0x330
Apr 16 21:21:31 vm6 kernel: [<ffffffff8122d2e8>] ? devcgroup_inode_permission+0x48/0x50
Apr 16 21:21:31 vm6 kernel: [<ffffffff8118ea5a>] do_sync_read+0xfa/0x140
Apr 16 21:21:31 vm6 kernel: [<ffffffff8119fcc4>] ? user_path_at+0x64/0xa0
Apr 16 21:21:31 vm6 kernel: [<ffffffff810944b0>] ? autoremove_wake_function+0x0/0x40
Apr 16 21:21:31 vm6 kernel: [<ffffffff811c962c>] ? block_ioctl+0x3c/0x40
Apr 16 21:21:31 vm6 kernel: [<ffffffff811a22d2>] ? vfs_ioctl+0x22/0xa0
Apr 16 21:21:31 vm6 kernel: [<ffffffff811a247a>] ? do_vfs_ioctl+0x8a/0x5d0
Apr 16 21:21:31 vm6 kernel: [<ffffffff8118f435>] vfs_read+0xb5/0x1a0
Apr 16 21:21:31 vm6 kernel: [<ffffffff8118f571>] sys_read+0x51/0x90
Apr 16 21:21:31 vm6 kernel: [<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
 
Can you please increase the limit from 10k characters per post to 30k?
It sure is annoying having to make two posts just to provide all of the relevant data, and I have seen others having this problem too.

Rest of the kernel messages related to lvremove:
Code:
Apr 16 21:23:31 vm6 kernel: lvremove      D ffff88063787b480     0 396621 393033    0 0x00000000
Apr 16 21:23:31 vm6 kernel: ffff88025ea3fb18 0000000000000082 ffff88025ea3fad8 ffffffff8140c1bc
Apr 16 21:23:31 vm6 kernel: 0000000000000008 0000000000001000 0000000000000000 000000000000000c
Apr 16 21:23:31 vm6 kernel: ffff88025ea3fb08 ffff88063787ba20 ffff88025ea3ffd8 ffff88025ea3ffd8
Apr 16 21:23:31 vm6 kernel: Call Trace:
Apr 16 21:23:31 vm6 kernel: [<ffffffff8140c1bc>] ? dm_table_unplug_all+0x5c/0x100
Apr 16 21:23:31 vm6 kernel: [<ffffffff81512663>] io_schedule+0x73/0xc0
Apr 16 21:23:31 vm6 kernel: [<ffffffff811cc42e>] __blockdev_direct_IO_newtrunc+0x6ee/0xb80
Apr 16 21:23:31 vm6 kernel: [<ffffffff811cc91e>] __blockdev_direct_IO+0x5e/0xd0
Apr 16 21:23:31 vm6 kernel: [<ffffffff811c91f0>] ? blkdev_get_blocks+0x0/0xc0
Apr 16 21:23:31 vm6 kernel: [<ffffffff811ca047>] blkdev_direct_IO+0x57/0x60
Apr 16 21:23:31 vm6 kernel: [<ffffffff811c91f0>] ? blkdev_get_blocks+0x0/0xc0
Apr 16 21:23:31 vm6 kernel: [<ffffffff81123084>] generic_file_aio_read+0x744/0x7d0
Apr 16 21:23:31 vm6 kernel: [<ffffffff811cab21>] ? blkdev_open+0x71/0xc0
Apr 16 21:23:31 vm6 kernel: [<ffffffff8118c173>] ? __dentry_open+0x113/0x330
Apr 16 21:23:31 vm6 kernel: [<ffffffff8122d2e8>] ? devcgroup_inode_permission+0x48/0x50
Apr 16 21:23:31 vm6 kernel: [<ffffffff8118ea5a>] do_sync_read+0xfa/0x140
Apr 16 21:23:31 vm6 kernel: [<ffffffff8119fcc4>] ? user_path_at+0x64/0xa0
Apr 16 21:23:31 vm6 kernel: [<ffffffff810944b0>] ? autoremove_wake_function+0x0/0x40
Apr 16 21:23:31 vm6 kernel: [<ffffffff811c962c>] ? block_ioctl+0x3c/0x40
Apr 16 21:23:31 vm6 kernel: [<ffffffff811a22d2>] ? vfs_ioctl+0x22/0xa0
Apr 16 21:23:31 vm6 kernel: [<ffffffff811a247a>] ? do_vfs_ioctl+0x8a/0x5d0
Apr 16 21:23:31 vm6 kernel: [<ffffffff8118f435>] vfs_read+0xb5/0x1a0
Apr 16 21:23:31 vm6 kernel: [<ffffffff8118f571>] sys_read+0x51/0x90
Apr 16 21:23:31 vm6 kernel: [<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
Apr 16 21:23:31 vm6 kernel: vgs           D ffff880637bdc400     0 396625   4040    0 0x00000000
Apr 16 21:23:31 vm6 kernel: ffff88051973bb18 0000000000000082 ffff88051973bad8 ffffffff8140c1bc
Apr 16 21:23:31 vm6 kernel: 0000000000000008 0000000000001000 0000000000000000 000000000000000c
Apr 16 21:23:31 vm6 kernel: ffff88051973bb08 ffff880637bdc9a0 ffff88051973bfd8 ffff88051973bfd8
Apr 16 21:23:31 vm6 kernel: Call Trace:
Apr 16 21:23:31 vm6 kernel: [<ffffffff8140c1bc>] ? dm_table_unplug_all+0x5c/0x100
Apr 16 21:23:31 vm6 kernel: [<ffffffff81512663>] io_schedule+0x73/0xc0
Apr 16 21:23:31 vm6 kernel: [<ffffffff811cc42e>] __blockdev_direct_IO_newtrunc+0x6ee/0xb80
Apr 16 21:23:31 vm6 kernel: [<ffffffff811cc91e>] __blockdev_direct_IO+0x5e/0xd0
Apr 16 21:23:31 vm6 kernel: [<ffffffff811c91f0>] ? blkdev_get_blocks+0x0/0xc0
Apr 16 21:23:31 vm6 kernel: [<ffffffff811ca047>] blkdev_direct_IO+0x57/0x60
Apr 16 21:23:31 vm6 kernel: [<ffffffff811c91f0>] ? blkdev_get_blocks+0x0/0xc0
Apr 16 21:23:31 vm6 kernel: [<ffffffff81123084>] generic_file_aio_read+0x744/0x7d0
Apr 16 21:23:31 vm6 kernel: [<ffffffff811cab21>] ? blkdev_open+0x71/0xc0
Apr 16 21:23:31 vm6 kernel: [<ffffffff8118c173>] ? __dentry_open+0x113/0x330
Apr 16 21:23:31 vm6 kernel: [<ffffffff8122d2e8>] ? devcgroup_inode_permission+0x48/0x50
Apr 16 21:23:31 vm6 kernel: [<ffffffff8118ea5a>] do_sync_read+0xfa/0x140
Apr 16 21:23:31 vm6 kernel: [<ffffffff8119fcc4>] ? user_path_at+0x64/0xa0
Apr 16 21:23:31 vm6 kernel: [<ffffffff810944b0>] ? autoremove_wake_function+0x0/0x40
Apr 16 21:23:31 vm6 kernel: [<ffffffff811c962c>] ? block_ioctl+0x3c/0x40
Apr 16 21:23:31 vm6 kernel: [<ffffffff811a22d2>] ? vfs_ioctl+0x22/0xa0
Apr 16 21:23:31 vm6 kernel: [<ffffffff811a247a>] ? do_vfs_ioctl+0x8a/0x5d0
Apr 16 21:23:31 vm6 kernel: [<ffffffff8118f435>] vfs_read+0xb5/0x1a0
Apr 16 21:23:31 vm6 kernel: [<ffffffff8118f571>] sys_read+0x51/0x90
Apr 16 21:23:31 vm6 kernel: [<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
 
And the final portion of the lvremove-related kernel messages:
Code:
Apr 16 21:25:31 vm6 kernel: lvremove      D ffff88063787b480     0 396621      1    0 0x00000000
Apr 16 21:25:31 vm6 kernel: ffff88025ea3fb18 0000000000000082 ffff88025ea3fad8 ffffffff8140c1bc
Apr 16 21:25:31 vm6 kernel: 0000000000000008 0000000000001000 0000000000000000 000000000000000c
Apr 16 21:25:31 vm6 kernel: ffff88025ea3fb08 ffff88063787ba20 ffff88025ea3ffd8 ffff88025ea3ffd8
Apr 16 21:25:31 vm6 kernel: Call Trace:
Apr 16 21:25:31 vm6 kernel: [<ffffffff8140c1bc>] ? dm_table_unplug_all+0x5c/0x100
Apr 16 21:25:31 vm6 kernel: [<ffffffff81512663>] io_schedule+0x73/0xc0
Apr 16 21:25:31 vm6 kernel: [<ffffffff811cc42e>] __blockdev_direct_IO_newtrunc+0x6ee/0xb80
Apr 16 21:25:31 vm6 kernel: [<ffffffff811cc91e>] __blockdev_direct_IO+0x5e/0xd0
Apr 16 21:25:31 vm6 kernel: [<ffffffff811c91f0>] ? blkdev_get_blocks+0x0/0xc0
Apr 16 21:25:31 vm6 kernel: [<ffffffff811ca047>] blkdev_direct_IO+0x57/0x60
Apr 16 21:25:31 vm6 kernel: [<ffffffff811c91f0>] ? blkdev_get_blocks+0x0/0xc0
Apr 16 21:25:31 vm6 kernel: [<ffffffff81123084>] generic_file_aio_read+0x744/0x7d0
Apr 16 21:25:31 vm6 kernel: [<ffffffff811cab21>] ? blkdev_open+0x71/0xc0
Apr 16 21:25:31 vm6 kernel: [<ffffffff8118c173>] ? __dentry_open+0x113/0x330
Apr 16 21:25:31 vm6 kernel: [<ffffffff8122d2e8>] ? devcgroup_inode_permission+0x48/0x50
Apr 16 21:25:31 vm6 kernel: [<ffffffff8118ea5a>] do_sync_read+0xfa/0x140
Apr 16 21:25:31 vm6 kernel: [<ffffffff8119fcc4>] ? user_path_at+0x64/0xa0
Apr 16 21:25:31 vm6 kernel: [<ffffffff810944b0>] ? autoremove_wake_function+0x0/0x40
Apr 16 21:25:31 vm6 kernel: [<ffffffff811c962c>] ? block_ioctl+0x3c/0x40
Apr 16 21:25:31 vm6 kernel: [<ffffffff811a22d2>] ? vfs_ioctl+0x22/0xa0
Apr 16 21:25:31 vm6 kernel: [<ffffffff811a247a>] ? do_vfs_ioctl+0x8a/0x5d0
Apr 16 21:25:31 vm6 kernel: [<ffffffff8118f435>] vfs_read+0xb5/0x1a0
Apr 16 21:25:31 vm6 kernel: [<ffffffff8118f571>] sys_read+0x51/0x90
Apr 16 21:25:31 vm6 kernel: [<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
Apr 16 21:25:31 vm6 kernel: vgs           D ffff880637bdc400     0 396625   4040    0 0x00000000
Apr 16 21:25:31 vm6 kernel: ffff88051973bb18 0000000000000082 ffff88051973bad8 ffffffff8140c1bc
Apr 16 21:25:31 vm6 kernel: 0000000000000008 0000000000001000 0000000000000000 000000000000000c
Apr 16 21:25:31 vm6 kernel: ffff88051973bb08 ffff880637bdc9a0 ffff88051973bfd8 ffff88051973bfd8
Apr 16 21:25:31 vm6 kernel: Call Trace:
Apr 16 21:25:31 vm6 kernel: [<ffffffff8140c1bc>] ? dm_table_unplug_all+0x5c/0x100
Apr 16 21:25:31 vm6 kernel: [<ffffffff81512663>] io_schedule+0x73/0xc0
Apr 16 21:25:31 vm6 kernel: [<ffffffff811cc42e>] __blockdev_direct_IO_newtrunc+0x6ee/0xb80
Apr 16 21:25:31 vm6 kernel: [<ffffffff811cc91e>] __blockdev_direct_IO+0x5e/0xd0
Apr 16 21:25:31 vm6 kernel: [<ffffffff811c91f0>] ? blkdev_get_blocks+0x0/0xc0
Apr 16 21:25:31 vm6 kernel: [<ffffffff811ca047>] blkdev_direct_IO+0x57/0x60
Apr 16 21:25:31 vm6 kernel: [<ffffffff811c91f0>] ? blkdev_get_blocks+0x0/0xc0
Apr 16 21:25:31 vm6 kernel: [<ffffffff81123084>] generic_file_aio_read+0x744/0x7d0
Apr 16 21:25:31 vm6 kernel: [<ffffffff811cab21>] ? blkdev_open+0x71/0xc0
Apr 16 21:25:31 vm6 kernel: [<ffffffff8118c173>] ? __dentry_open+0x113/0x330
Apr 16 21:25:31 vm6 kernel: [<ffffffff8122d2e8>] ? devcgroup_inode_permission+0x48/0x50
Apr 16 21:25:31 vm6 kernel: [<ffffffff8118ea5a>] do_sync_read+0xfa/0x140
Apr 16 21:25:31 vm6 kernel: [<ffffffff8119fcc4>] ? user_path_at+0x64/0xa0
Apr 16 21:25:31 vm6 kernel: [<ffffffff810944b0>] ? autoremove_wake_function+0x0/0x40
Apr 16 21:25:31 vm6 kernel: [<ffffffff811c962c>] ? block_ioctl+0x3c/0x40
Apr 16 21:25:31 vm6 kernel: [<ffffffff811a22d2>] ? vfs_ioctl+0x22/0xa0
Apr 16 21:25:31 vm6 kernel: [<ffffffff811a247a>] ? do_vfs_ioctl+0x8a/0x5d0
Apr 16 21:25:31 vm6 kernel: [<ffffffff8118f435>] vfs_read+0xb5/0x1a0
Apr 16 21:25:31 vm6 kernel: [<ffffffff8118f571>] sys_read+0x51/0x90
Apr 16 21:25:31 vm6 kernel: [<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
Apr 16 21:27:31 vm6 kernel: lvremove      D ffff88063787b480     0 396621      1    0 0x00000000
Apr 16 21:27:31 vm6 kernel: ffff88025ea3fb18 0000000000000082 ffff88025ea3fad8 ffffffff8140c1bc
Apr 16 21:27:31 vm6 kernel: 0000000000000008 0000000000001000 0000000000000000 000000000000000c
Apr 16 21:27:31 vm6 kernel: ffff88025ea3fb08 ffff88063787ba20 ffff88025ea3ffd8 ffff88025ea3ffd8
Apr 16 21:27:31 vm6 kernel: Call Trace:
Apr 16 21:27:31 vm6 kernel: [<ffffffff8140c1bc>] ? dm_table_unplug_all+0x5c/0x100
Apr 16 21:27:31 vm6 kernel: [<ffffffff81512663>] io_schedule+0x73/0xc0
Apr 16 21:27:31 vm6 kernel: [<ffffffff811cc42e>] __blockdev_direct_IO_newtrunc+0x6ee/0xb80
Apr 16 21:27:31 vm6 kernel: [<ffffffff811cc91e>] __blockdev_direct_IO+0x5e/0xd0
Apr 16 21:27:31 vm6 kernel: [<ffffffff811c91f0>] ? blkdev_get_blocks+0x0/0xc0
Apr 16 21:27:31 vm6 kernel: [<ffffffff811ca047>] blkdev_direct_IO+0x57/0x60
Apr 16 21:27:31 vm6 kernel: [<ffffffff811c91f0>] ? blkdev_get_blocks+0x0/0xc0
Apr 16 21:27:31 vm6 kernel: [<ffffffff81123084>] generic_file_aio_read+0x744/0x7d0
Apr 16 21:27:31 vm6 kernel: [<ffffffff811cab21>] ? blkdev_open+0x71/0xc0
Apr 16 21:27:31 vm6 kernel: [<ffffffff8118c173>] ? __dentry_open+0x113/0x330
Apr 16 21:27:31 vm6 kernel: [<ffffffff8122d2e8>] ? devcgroup_inode_permission+0x48/0x50
Apr 16 21:27:31 vm6 kernel: [<ffffffff8118ea5a>] do_sync_read+0xfa/0x140
Apr 16 21:27:31 vm6 kernel: [<ffffffff8119fcc4>] ? user_path_at+0x64/0xa0
Apr 16 21:27:31 vm6 kernel: [<ffffffff810944b0>] ? autoremove_wake_function+0x0/0x40
Apr 16 21:27:31 vm6 kernel: [<ffffffff811c962c>] ? block_ioctl+0x3c/0x40
Apr 16 21:27:31 vm6 kernel: [<ffffffff811a22d2>] ? vfs_ioctl+0x22/0xa0
Apr 16 21:27:31 vm6 kernel: [<ffffffff811a247a>] ? do_vfs_ioctl+0x8a/0x5d0
Apr 16 21:27:31 vm6 kernel: [<ffffffff8118f435>] vfs_read+0xb5/0x1a0
Apr 16 21:27:31 vm6 kernel: [<ffffffff8118f571>] sys_read+0x51/0x90
Apr 16 21:27:31 vm6 kernel: [<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
Apr 16 21:27:31 vm6 kernel: vgs           D ffff880637bdc400     0 396625   4040    0 0x00000000
Apr 16 21:27:31 vm6 kernel: ffff88051973bb18 0000000000000082 ffff88051973bad8 ffffffff8140c1bc
Apr 16 21:27:31 vm6 kernel: 0000000000000008 0000000000001000 0000000000000000 000000000000000c
Apr 16 21:27:31 vm6 kernel: ffff88051973bb08 ffff880637bdc9a0 ffff88051973bfd8 ffff88051973bfd8
Apr 16 21:27:31 vm6 kernel: Call Trace:
Apr 16 21:27:31 vm6 kernel: [<ffffffff8140c1bc>] ? dm_table_unplug_all+0x5c/0x100
Apr 16 21:27:31 vm6 kernel: [<ffffffff81512663>] io_schedule+0x73/0xc0
Apr 16 21:27:31 vm6 kernel: [<ffffffff811cc42e>] __blockdev_direct_IO_newtrunc+0x6ee/0xb80
Apr 16 21:27:31 vm6 kernel: [<ffffffff811cc91e>] __blockdev_direct_IO+0x5e/0xd0
Apr 16 21:27:31 vm6 kernel: [<ffffffff811c91f0>] ? blkdev_get_blocks+0x0/0xc0
Apr 16 21:27:31 vm6 kernel: [<ffffffff811ca047>] blkdev_direct_IO+0x57/0x60
Apr 16 21:27:31 vm6 kernel: [<ffffffff811c91f0>] ? blkdev_get_blocks+0x0/0xc0
Apr 16 21:27:31 vm6 kernel: [<ffffffff81123084>] generic_file_aio_read+0x744/0x7d0
Apr 16 21:27:31 vm6 kernel: [<ffffffff811cab21>] ? blkdev_open+0x71/0xc0
Apr 16 21:27:31 vm6 kernel: [<ffffffff8118c173>] ? __dentry_open+0x113/0x330
Apr 16 21:27:31 vm6 kernel: [<ffffffff8122d2e8>] ? devcgroup_inode_permission+0x48/0x50
Apr 16 21:27:31 vm6 kernel: [<ffffffff8118ea5a>] do_sync_read+0xfa/0x140
Apr 16 21:27:31 vm6 kernel: [<ffffffff8119fcc4>] ? user_path_at+0x64/0xa0
Apr 16 21:27:31 vm6 kernel: [<ffffffff810944b0>] ? autoremove_wake_function+0x0/0x40
Apr 16 21:27:31 vm6 kernel: [<ffffffff811c962c>] ? block_ioctl+0x3c/0x40
Apr 16 21:27:31 vm6 kernel: [<ffffffff811a22d2>] ? vfs_ioctl+0x22/0xa0
Apr 16 21:27:31 vm6 kernel: [<ffffffff811a247a>] ? do_vfs_ioctl+0x8a/0x5d0
Apr 16 21:27:31 vm6 kernel: [<ffffffff8118f435>] vfs_read+0xb5/0x1a0
Apr 16 21:27:31 vm6 kernel: [<ffffffff8118f571>] sys_read+0x51/0x90
Apr 16 21:27:31 vm6 kernel: [<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
 
Our on-call technician for last night provided some additional details.

The VM that was being backed up seems to have had all of its IO stalled sometime after the backup completed.
He reported that running lvscan would hang too.
Using the Stop command in the GUI did not work; it just hung and never completed.
The VM description was not displaying in the GUI, just the VMID.
The node that ran the backup showed up as Red in the Proxmox GUI.

Eventually he rebooted the node.
After it came up we were able to run lvremove to remove the snapshot.
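
For reference, the post-reboot cleanup comes down to finding the leftover vzdump snapshot LVs and removing them; a minimal sketch, assuming the volume group and snapshot names from the backup log above.
Code:
# Hedged sketch; names taken from the log above, adjust to what "lvs" reports.
lvs | grep vzsnap                               # list any leftover vzdump snapshot LVs
lvremove -f /dev/drbd1-vm6-vm5/vzsnap-vm6-1     # remove the snapshot that was stuck
lvs                                             # confirm only the real VM disks remain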

# pveversion -v
Code:
pve-manager: 2.0-59 (pve-manager/2.0/18400f07)
running kernel: 2.6.32-11-pve
proxmox-ve-2.6.32: 2.0-66
pve-kernel-2.6.32-11-pve: 2.6.32-66
pve-kernel-2.6.32-7-pve: 2.6.32-60
lvm2: 2.02.88-2pve2
clvm: 2.02.88-2pve2
corosync-pve: 1.4.3-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.8-3
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.7-2
pve-cluster: 1.0-26
qemu-server: 2.0-38
pve-firmware: 1.0-15
libpve-common-perl: 1.0-26
libpve-access-control: 1.0-18
libpve-storage-perl: 2.0-17
vncterm: 1.0-2
vzctl: 3.0.30-2pve2
vzprocps: 2.0.11-2
vzquota: 3.0.12-3
pve-qemu-kvm: 1.0-9
ksm-control-daemon: 1.1-1
 
There was no indication that the snapshot had run out of space; I have the snapshot size set to 14GB in /etc/vzdump.conf.

If it had run out of space, it must have done so after the backup finished, since the backup itself completed.
Had it run out of space before the backup completed, the backup would have failed with an IO error on the snapshot.

So at the time the lvremove failed, it is very unlikely that the snapshot was out of space, since the backup was a success.
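
For anyone checking the same thing, here is a minimal sketch of how to verify the configured snapshot size and how full a snapshot actually is while a backup runs. It assumes the 'size' option (value in MB) in /etc/vzdump.conf and the volume group name from the log above, so adjust the names as needed.
Code:
# Hedged sketch; 'size' (in MB) and the VG name are assumptions taken from this thread.
grep -i '^size' /etc/vzdump.conf                     # e.g. "size: 14336" for 14GB snapshots
lvs -o lv_name,lv_size,snap_percent drbd1-vm6-vm5    # Snap% shows how full each snapshot is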
 
Maybe the snapshot runs out of space?

http://bugzilla.openvz.org/show_bug.cgi?id=2205

Not sure if this is the same bug, but maybe you can test that?

I set the snapshot size to 500MB in /etc/vzdump.conf and ran a manual backup.
While it was running I used sdelete -c c:\ and sdelete -c e:\ inside the guest to write lots of data to the disks, causing the snapshots to fill up.

When they got full, the kernel said:
Code:
device-mapper: snapshots: Invalidating snapshot: Unable to allocate exception.
device-mapper: snapshots: Invalidating snapshot: Unable to allocate exception.


Those messages were never logged during the original issue.

The backup is still running; I assume it will fail at some point.
I will post back when the test backup with full snapshots completes and let you know how it turns out.
So far everything seems to be working just fine.
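
A minimal sketch of how the fill-up can be watched from the host while sdelete runs in the guest; the volume group name is the one from this thread and may need adjusting.
Code:
# Hedged sketch; VG name taken from the logs above.
watch -n 5 'lvs -o lv_name,snap_percent drbd1-vm6-vm5'   # snapshot usage climbing toward 100%
dmesg | grep -i 'Invalidating snapshot'                   # logged once a snapshot overflows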
 
Can you reproduce it when the VM is on local storage (not on DRBD)?

I have not been able to reproduce it even on DRBD.
It has only happened once so far. Given that there are other threads reporting issues with LVM, I wonder if there is a bug somewhere in the kernel.

This is one bug I found that seems interesting: https://bugzilla.redhat.com/show_bug.cgi?id=570359
 
Yes, there is definitely a bug somewhere. The problem is that it is not reproducible, so we are unable to debug it.

Yes, that is one bug, but we have a workaround for that in vzdump:

see: https://bugzilla.proxmox.com/show_bug.cgi?id=127

But that does not really explain the crash.

Please try to set 'issue_discard = 0' in /etc/lvm/lvm.conf - maybe that helps (just a guess).
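
A quick way to check that setting; note this assumes the option as it appears in a stock lvm.conf, where it is spelled issue_discards (plural) inside the devices { } section.
Code:
# Hedged sketch; verifies the discard option LVM actually reads.
grep -n 'issue_discards' /etc/lvm/lvm.conf
# expected when discards are disabled (also the default):
#   issue_discards = 0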
 
Seems like most of the reports of similar issues point the blame at udev.

Very frustrating that this problem persists and is so random it is hard to pin down.

Dietmar, I will try setting issue_discard = 0 but I doubt it will help since my device does not support TRIM anyway.

This server has run a snapshot backup every day for about a month now. Only once did it have the problem that prompted me to start this thread.

If it happens again, are there any commands I should run to gather more info to help solve this?
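
For reference, a minimal sketch of the kind of state that could be captured before rebooting if the hang recurs; these are standard tools, and only the SysRq line changes anything (it dumps blocked-task stacks to the kernel log).
Code:
# Hedged sketch of diagnostics to capture while lvremove/lvscan are hung.
ps axo pid,stat,wchan:32,cmd | awk '$2 ~ /D/'    # processes stuck in uninterruptible sleep
dmsetup info -c                                   # device-mapper devices, incl. vzsnap-* snapshots
lvs -o lv_name,lv_size,snap_percent               # snapshot fill level, if lvs still responds
cat /proc/drbd                                    # DRBD connection and sync state
echo w > /proc/sysrq-trigger                      # blocked-task stacks end up in dmesg/syslog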
 
This bug suggests calling 'udevadm settle' before calling LVM commands in a script.
https://bugzilla.redhat.com/show_bug.cgi?id=570359

Maybe calling udevadm settle before each lvremove would help?
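
A minimal sketch of that idea, assuming the snapshot name from the original backup log:
Code:
# Hedged sketch: let udev finish processing pending events before the removal.
udevadm settle --timeout=30
lvremove -f /dev/drbd1-vm6-vm5/vzsnap-vm6-1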
 
I just uploaded the new packages to the 'pvetest' repository.

Everybody is invited to test ;-) This should fix the snapshot-related problems with vzdump.
 
