We are using proxmox 3.1.
We use an openvz container as bacula backup server. The backups are made to a NAS device, which is mounted using cifs. Over "mound --bind" we use it insider the container.
Small test backups work without any problems, but as soon as we do a real backup the container always hangs after a few minutes. => It isn't possible to connect over ssh anymore (it asks for the password, but after entering it, it hangs), also no other service responds.
Over the proxmox web interface it isn't possible to shut the container down. Only stopping it works, but also gives a short error message:
(Once I had to restart the whole server to kill the container, but I couldn't reproduce this behaviour, so maybe there was an other independent problem)
Dmesg also gives following messages
Backups from the host system over vzdump to the NAS device using the same cifs mount work without any problems.
anybody knows a solution? if not I'm going to install the debian system as a kvm host.
proxmox-ve-2.6.32: 3.1-114 (running kernel: 2.6.32-26-pve)
pve-manager: 3.1-24 (running version: 3.1-24/060bd5a6)
pve-kernel-2.6.32-26-pve: 2.6.32-114
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.0-2
pve-cluster: 3.0-8
qemu-server: 3.1-8
pve-firmware: 1.0-23
libpve-common-perl: 3.0-9
libpve-access-control: 3.0-8
libpve-storage-perl: 3.0-18
pve-libspice-server1: 0.12.4-2
vncterm: 1.1-6
vzctl: 4.0-1pve4
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.4-17
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.1-1
We use an openvz container as bacula backup server. The backups are made to a NAS device, which is mounted using cifs. Over "mound --bind" we use it insider the container.
Small test backups work without any problems, but as soon as we do a real backup the container always hangs after a few minutes. => It isn't possible to connect over ssh anymore (it asks for the password, but after entering it, it hangs), also no other service responds.
Over the proxmox web interface it isn't possible to shut the container down. Only stopping it works, but also gives a short error message:
(Once I had to restart the whole server to kill the container, but I couldn't reproduce this behaviour, so maybe there was an other independent problem)
Manualy unmounting it afterwards works.Removing stale lock file /var/lib/vz/lock/100.lck
Killing container ...
Container was stopped
Can't umount /var/lib/vz/root/100: Device or resource busy
TASK ERROR: command 'vzctl stop 100 --fast' failed: exit code 51
Dmesg also gives following messages
INFO: task sshd:438586 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
sshd D ffff880beda4e200 0 438586 6761 100 0x00000000
ffff8810766c3e68 0000000000000082 ffff8810766c3f18 ffffffff811b179e
ffff8810766c3e28 00000000ffffff9c ffff8810766c3e70 ffff880394922000
ffff8810766c3e08 000000012a50c760 ffff8810766c3fd8 ffff8810766c3fd8
Call Trace:
[<ffffffff811b179e>] ? do_filp_open+0x14e/0xca0
[<ffffffff811ad443>] ? putname+0x33/0x50
[<ffffffff815427c2>] __mutex_lock_slowpath+0x122/0x170
[<ffffffff815421da>] mutex_lock+0x2a/0x50
[<ffffffff8119f616>] generic_file_llseek+0x36/0x70
[<ffffffff8119f6c7>] sys_lseek+0x77/0xb0
[<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
INFO: task sshd:438586 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
sshd D ffff880beda4e200 0 438586 6761 100 0x00000000
ffff8810766c3e68 0000000000000082 ffff8810766c3f18 ffffffff811b179e
ffff8810766c3e28 00000000ffffff9c ffff8810766c3e70 ffff880394922000
ffff8810766c3e08 000000012a50c760 ffff8810766c3fd8 ffff8810766c3fd8
Call Trace:
[<ffffffff811b179e>] ? do_filp_open+0x14e/0xca0
[<ffffffff811ad443>] ? putname+0x33/0x50
[<ffffffff815427c2>] __mutex_lock_slowpath+0x122/0x170
[<ffffffff815421da>] mutex_lock+0x2a/0x50
[<ffffffff8119f616>] generic_file_llseek+0x36/0x70
[<ffffffff8119f6c7>] sys_lseek+0x77/0xb0
[<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
INFO: task sshd:443157 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
sshd D ffff8803957f3460 0 443157 6761 100 0x00000004
ffff880394ff9e68 0000000000000082 0000000000000000 ffffffff811b179e
ffff880394ff9e28 00000000ffffff9c ffff880394ff9e70 ffff880397a46000
ffff880394ff9e08 000000012a52a35e ffff880394ff9fd8 ffff880394ff9fd8
Call Trace:
[<ffffffff811b179e>] ? do_filp_open+0x14e/0xca0
[<ffffffff815427c2>] __mutex_lock_slowpath+0x122/0x170
[<ffffffff815421da>] mutex_lock+0x2a/0x50
[<ffffffff8119f616>] generic_file_llseek+0x36/0x70
[<ffffffff8119f6c7>] sys_lseek+0x77/0xb0
[<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
CT: 100: stopped
CT: 100: started
INFO: task cron:478585 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
cron D ffff88087a1d8880 0 478585 446520 100 0x00000000
ffff880377bebe68 0000000000000082 0000000000000000 ffffffff811b179e
ffff880377bebe28 00000000ffffff9c ffff880377bebe70 ffff8810781b9000
ffff880377bebe08 000000012ad6c419 ffff880377bebfd8 ffff880377bebfd8
Call Trace:
[<ffffffff811b179e>] ? do_filp_open+0x14e/0xca0
[<ffffffff815427c2>] __mutex_lock_slowpath+0x122/0x170
[<ffffffff815421da>] mutex_lock+0x2a/0x50
[<ffffffff8119f616>] generic_file_llseek+0x36/0x70
[<ffffffff8119f6c7>] sys_lseek+0x77/0xb0
[<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
INFO: task cron:478585 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
cron D ffff88087a1d8880 0 478585 446520 100 0x00000000
ffff880377bebe68 0000000000000082 0000000000000000 ffffffff811b179e
ffff880377bebe28 00000000ffffff9c ffff880377bebe70 ffff8810781b9000
ffff880377bebe08 000000012ad6c419 ffff880377bebfd8 ffff880377bebfd8
Call Trace:
[<ffffffff811b179e>] ? do_filp_open+0x14e/0xca0
[<ffffffff815427c2>] __mutex_lock_slowpath+0x122/0x170
[<ffffffff815421da>] mutex_lock+0x2a/0x50
[<ffffffff8119f616>] generic_file_llseek+0x36/0x70
[<ffffffff8119f6c7>] sys_lseek+0x77/0xb0
[<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
INFO: task cron:478585 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
cron D ffff88087a1d8880 0 478585 446520 100 0x00000000
ffff880377bebe68 0000000000000082 0000000000000000 ffffffff811b179e
ffff880377bebe28 00000000ffffff9c ffff880377bebe70 ffff8810781b9000
ffff880377bebe08 000000012ad6c419 ffff880377bebfd8 ffff880377bebfd8
Call Trace:
[<ffffffff811b179e>] ? do_filp_open+0x14e/0xca0
[<ffffffff815427c2>] __mutex_lock_slowpath+0x122/0x170
[<ffffffff815421da>] mutex_lock+0x2a/0x50
[<ffffffff8119f616>] generic_file_llseek+0x36/0x70
[<ffffffff8119f6c7>] sys_lseek+0x77/0xb0
[<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
INFO: task cron:478585 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
cron D ffff88087a1d8880 0 478585 446520 100 0x00000000
ffff880377bebe68 0000000000000082 0000000000000000 ffffffff811b179e
ffff880377bebe28 00000000ffffff9c ffff880377bebe70 ffff8810781b9000
ffff880377bebe08 000000012ad6c419 ffff880377bebfd8 ffff880377bebfd8
Call Trace:
[<ffffffff811b179e>] ? do_filp_open+0x14e/0xca0
[<ffffffff815427c2>] __mutex_lock_slowpath+0x122/0x170
[<ffffffff815421da>] mutex_lock+0x2a/0x50
[<ffffffff8119f616>] generic_file_llseek+0x36/0x70
[<ffffffff8119f6c7>] sys_lseek+0x77/0xb0
[<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
INFO: task cron:478585 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
cron D ffff88087a1d8880 0 478585 446520 100 0x00000000
ffff880377bebe68 0000000000000082 0000000000000000 ffffffff811b179e
ffff880377bebe28 00000000ffffff9c ffff880377bebe70 ffff8810781b9000
ffff880377bebe08 000000012ad6c419 ffff880377bebfd8 ffff880377bebfd8
Call Trace:
[<ffffffff811b179e>] ? do_filp_open+0x14e/0xca0
[<ffffffff815427c2>] __mutex_lock_slowpath+0x122/0x170
[<ffffffff815421da>] mutex_lock+0x2a/0x50
[<ffffffff8119f616>] generic_file_llseek+0x36/0x70
[<ffffffff8119f6c7>] sys_lseek+0x77/0xb0
[<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
INFO: task cron:478585 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
cron D ffff88087a1d8880 0 478585 446520 100 0x00000000
ffff880377bebe68 0000000000000082 0000000000000000 ffffffff811b179e
ffff880377bebe28 00000000ffffff9c ffff880377bebe70 ffff8810781b9000
ffff880377bebe08 000000012ad6c419 ffff880377bebfd8 ffff880377bebfd8
Call Trace:
[<ffffffff811b179e>] ? do_filp_open+0x14e/0xca0
[<ffffffff815427c2>] __mutex_lock_slowpath+0x122/0x170
[<ffffffff815421da>] mutex_lock+0x2a/0x50
[<ffffffff8119f616>] generic_file_llseek+0x36/0x70
[<ffffffff8119f6c7>] sys_lseek+0x77/0xb0
[<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
INFO: task cron:478585 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
cron D ffff88087a1d8880 0 478585 446520 100 0x00000000
ffff880377bebe68 0000000000000082 0000000000000000 ffffffff811b179e
ffff880377bebe28 00000000ffffff9c ffff880377bebe70 ffff8810781b9000
ffff880377bebe08 000000012ad6c419 ffff880377bebfd8 ffff880377bebfd8
Call Trace:
[<ffffffff811b179e>] ? do_filp_open+0x14e/0xca0
[<ffffffff815427c2>] __mutex_lock_slowpath+0x122/0x170
[<ffffffff815421da>] mutex_lock+0x2a/0x50
[<ffffffff8119f616>] generic_file_llseek+0x36/0x70
[<ffffffff8119f6c7>] sys_lseek+0x77/0xb0
[<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
CT: 100: stopped
Backups from the host system over vzdump to the NAS device using the same cifs mount work without any problems.
anybody knows a solution? if not I'm going to install the debian system as a kvm host.