Aaaaand - vzdump did it again.
On one box the problem can now be more or less reproduced.
The final dump goes like this:
Dec 11 06:23:34 INFO: Starting Backup of VM 76023 (openvz)
Dec 11 06:23:34 INFO: CTID 76023 exist mounted running
Dec 11 06:23:34 INFO: status = running
Dec 11 06:23:34 INFO: backup mode: snapshot
Dec 11 06:23:34 INFO: bandwidth limit: 71200 KB/s
Dec 11 06:23:34 INFO: ionice priority: 7
Dec 11 06:23:34 INFO: creating lvm snapshot of /dev/mapper/pve-data ('/dev/pve/vzsnap-k82-0')
Dec 11 06:23:34 INFO: Logical volume "vzsnap-k82-0" created
Dec 11 06:23:42 INFO: creating archive '/mnt/pve/kXXbackup/dump/vzdump-openvz-76023-2013_12_11-06_23_34.tar.lzo'
Dec 11 06:25:53 INFO: Total bytes written: 2005463040 (1.9GiB, 15MiB/s)
Dec 11 06:26:04 INFO: archive file size: 906MB
Dec 11 06:26:04 INFO: delete old backup '/mnt/pve/kXXbackup/dump/vzdump-openvz-76023-2013_12_07-08_33_25.tar.lzo'
Dec 11 06:26:06 INFO: umount: /mnt/vzsnap0: device is busy.
Dec 11 06:26:06 INFO: (In some cases useful info about processes that use
Dec 11 06:26:06 INFO: the device is found by lsof(8) or fuser(1))
Dec 11 06:26:06 ERROR: command 'umount /mnt/vzsnap0' failed: exit code 1
After this, things go completely wrong again and the whole system eventually freezes; the only way to recover is a reset.
During the last moments of the box I manually tried the umount again - that session hung until the reset.
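For the next occurrence I plan to capture what is still holding the mountpoint before everything locks up - roughly the following, which is just my diagnostic guess (standard tools, not yet tested in this exact situation):

    fuser -vm /mnt/vzsnap0
    lsof /mnt/vzsnap0
    lvs -a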
The snapshot is of course still there after the reboot:
--- Logical volume ---
LV Path /dev/pve/vzsnap-k82-0
LV Name vzsnap-k82-0
VG Name pve
LV UUID 6F2qjA-cAlT-xQtb-TJRj-fl3l-bRaI-Snh5kr
LV Write Access read/write
LV Creation host, time k82, 2013-12-11 06:23:34 +0100
LV Status available
# open 0
LV Size 17.70 GiB
Current LE 4530
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:7
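Cleaning up after the reboot is presumably just a matter of removing the leftover snapshot before the next backup run, something along these lines (my assumption; the LV name is taken from the lvdisplay output above, and the umount is only there in case the mountpoint is still registered):

    umount /mnt/vzsnap0 2>/dev/null
    lvremove -f /dev/pve/vzsnap-k82-0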
The problem became obvious a few days ago, when the very same thing happened for the first time in months. The load on this system has increased slightly over the last weeks, but it is still well below any real limits.
The filesystem is ext3:
/dev/mapper/pve-data on /var/lib/vz type ext3 (rw,relatime,errors=continue,user_xattr,acl,barrier=0,data=ordered)
/dev/sda1 on /boot type ext3 (rw,relatime,errors=continue,user_xattr,acl,barrier=0,data=ordered)
And we are running this:
pveversion -v
proxmox-ve-2.6.32: 3.1-114 (running kernel: 2.6.32-26-pve)
pve-manager: 3.1-24 (running version: 3.1-24/060bd5a6)
pve-kernel-2.6.32-24-pve: 2.6.32-111
pve-kernel-2.6.32-25-pve: 2.6.32-113
pve-kernel-2.6.32-22-pve: 2.6.32-107
pve-kernel-2.6.32-26-pve: 2.6.32-114
pve-kernel-2.6.32-23-pve: 2.6.32-109
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.0-2
pve-cluster: 3.0-8
qemu-server: 3.1-8
pve-firmware: 1.0-23
libpve-common-perl: 3.0-9
libpve-access-control: 3.0-8
libpve-storage-perl: 3.0-18
pve-libspice-server1: 0.12.4-2
vncterm: 1.1-6
vzctl: 4.0-1pve4
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.4-17
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.1-1
Please help - this is about to kill our Proxmox plans.
thank you
hk