clusternode freeze during vm-load (openvz postgresql)

udo

Hi all,
I just had a cluster node freeze during a PostgreSQL database import in an OpenVZ VM.

The first import aborted, and the second produced a lot of "duplicate key value" errors (that's expected, because I didn't drop the database first).
Suddenly the error messages stopped, and top on the PVE host showed a high load:
Code:
top - 13:38:25 up 7 days, 21:11,  1 user,  load average: 6.02, 1.57, 0.54
Tasks: 213 total,  13 running, 200 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.1%us, 75.2%sy,  0.0%ni, 24.5%id,  0.1%wa,  0.1%hi,  0.1%si,  0.0%st
Mem:   7914084k total,  3173840k used,  4740244k free,    85948k buffers
Swap:  7340024k total,    25892k used,  7314132k free,  2470876k cached

    PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                
   1820 root      20   0  158m  23m 3908 R  100  0.3  14:49.97 pvestatd                                                                               
 661045 root      20   0  185m  34m 5472 R  100  0.5   0:26.23 pvedaemon                                                                              
 665985 root      20   0  183m  32m 5304 R  100  0.4   0:50.48 pvedaemon                                                                              
    901 root      20   0     0    0    0 S    1  0.0   3:16.65 kjournald
The load rose to 7.33 (nine seconds later) and the whole node froze. After a reset there was nothing in the logs...

Version:
Code:
pve-manager: 2.0-7 (pve-manager/2.0/de5d8ab1)
running kernel: 2.6.32-6-pve
proxmox-ve-2.6.32: 2.0-46
pve-kernel-2.6.32-6-pve: 2.6.32-46
lvm2: 2.02.86-1pve1
clvm: 2.02.86-1pve1
corosync-pve: 1.4.1-1
openais-pve: 1.1.4-1
libqb: 0.5.1-1
redhat-cluster-pve: 3.1.7-1
pve-cluster: 1.0-9
qemu-server: 2.0-2
pve-firmware: 1.0-13
libpve-common-perl: 1.0-6
libpve-access-control: 1.0-1
libpve-storage-perl: 2.0-4
vncterm: 1.0-2
vzctl: 3.0.29-3pve2
vzdump: 1.2.6-1
vzprocps: 2.0.11-2
vzquota: 3.0.12-3
pve-qemu-kvm: 0.15.0-1
ksm-control-daemon: 1.1-1
After the reboot I did the same thing, but this time the database was imported without trouble.
How can it happen that pvestatd and pvedaemon block the system?
I guess it's not an I/O problem, because the processes have high CPU usage (top shows 75.2%sy but only 0.1%wa); if they were waiting for I/O it should look different.
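To make that reasoning concrete, here is a minimal Python sketch that parses the Cpu(s) line from the top output above and compares system time with I/O wait. The helper names and the 20% threshold are assumptions chosen for illustration, not anything that PVE or top defines.

```python
import re

def parse_cpu_line(line):
    """Turn a top 'Cpu(s):' summary line into {field: percent}."""
    # Each field looks like '75.2%sy'; capture the number and the label.
    return {name: float(value)
            for value, name in re.findall(r"([\d.]+)%(\w+)", line)}

def looks_io_bound(cpu, wa_threshold=20.0):
    """Hypothetical rule of thumb: high iowait suggests an I/O bottleneck."""
    return cpu.get("wa", 0.0) >= wa_threshold

# The Cpu(s) line copied from the top output in this post.
line = "Cpu(s):  0.1%us, 75.2%sy,  0.0%ni, 24.5%id,  0.1%wa,  0.1%hi,  0.1%si,  0.0%st"
cpu = parse_cpu_line(line)

print("system:", cpu["sy"], "iowait:", cpu["wa"])
print("looks I/O bound:", looks_io_bound(cpu))
```

With the values from this node, almost all of the busy time is kernel (sy) time and iowait is close to zero, which is why it does not look like the processes were waiting on disk.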

Udo
 
Can you test with the latest kernel (pve-kernel-2.6.32-6-pve: 2.6.32-52)? Just run apt-get update/upgrade.
 
Oh, I just noticed that it's not in the repo yet. We'll try to make it publicly available soon.
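Once the package is available, a quick way to see whether a node is still on the older kernel package is to compare the package revision from the version listing with the suggested one. The sketch below is only an illustration in Python: the helper names are made up, and it compares just the number after the last dash, which is a simplification and not full Debian version ordering.

```python
def parse_versions(text):
    """Parse 'name: version' lines (as in the version listing) into a dict."""
    versions = {}
    for line in text.splitlines():
        name, sep, version = line.partition(": ")
        if sep:  # skip lines without a 'name: version' shape
            versions[name.strip()] = version.strip()
    return versions

def package_revision(version):
    """Return the number after the last dash, e.g. '2.6.32-46' -> 46."""
    return int(version.rsplit("-", 1)[1])

# Two lines copied from the version listing earlier in this thread.
listing = """running kernel: 2.6.32-6-pve
pve-kernel-2.6.32-6-pve: 2.6.32-46"""

installed = parse_versions(listing)["pve-kernel-2.6.32-6-pve"]
suggested = "2.6.32-52"  # the version named in the reply above

needs_upgrade = package_revision(installed) < package_revision(suggested)
print(installed, "->", suggested, "upgrade needed:", needs_upgrade)
```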