Hello,
Yesterday I upgraded from PVE 3.2 to 3.4, and today I am having serious problems. The load on the node suddenly spiked at 12:00 and I can't find the cause. Some servers hang; the diagnostic output is below:
Code:
# iotop -d 10 -P
Total DISK READ: 21.17 K/s | Total DISK WRITE: 2.82 M/s
  PID  PRIO  USER    DISK READ   DISK WRITE  SWAPIN      IO>  COMMAND
10287  be/4  root     0.00 B/s     0.00 B/s  0.00 %  13.75 %  kvm -id 104
 8691  be/4  root     6.33 K/s     8.61 K/s  0.00 %  13.54 %  kvm -id 140
 9633  be/4  root     0.00 B/s     0.00 B/s  0.00 %  12.79 %  kvm -id 111
10059  be/4  root     0.00 B/s     0.00 B/s  0.00 %  10.82 %  kvm -id 156
 8895  be/4  root     0.00 B/s     0.00 B/s  0.00 %   7.17 %  kvm -id 117
 9178  be/4  root     0.00 B/s     0.00 B/s  0.00 %   5.01 %  kvm -id 119
 9277  be/4  root   405.13 B/s    44.31 K/s  0.00 %   2.08 %  kvm -id 108
 7534  be/0  root     0.00 B/s     0.00 B/s  0.00 %   0.27 %  [txg_sync]
10858  be/4  root   229.47 K/s   752.69 K/s  0.00 %   0.02 %  kvm -id 113
12155  be/4  root    30.46 K/s   329.96 K/s  0.00 %   0.01 %  kvm -id 116
 8423  be/4  root   810.26 B/s    25.72 K/s  0.00 %   0.01 %  kvm -id 112
10481  be/4  root     0.00 B/s     4.35 K/s  0.00 %   0.01 %  kvm -id 106
 1083  be/3  root     0.00 B/s  1215.38 B/s  0.00 %   0.00 %  [jbd2/dm-0-8]
 2554  be/3  root     0.00 B/s  1620.51 B/s  0.00 %   0.00 %  [jbd2/sda4-8]
10076  be/4  root     0.00 B/s     9.50 K/s  0.00 %   0.00 %  kvm -id 110
30156  be/4  root     0.00 B/s    11.87 K/s  0.00 %   0.00 %  kvm -id 109
29999  be/4  root   405.13 B/s   130.16 K/s  0.00 %   0.00 %  kvm -id 153
 9801  be/4  root     0.00 B/s     4.75 K/s  0.00 %   0.00 %  kvm -id 131
65437  be/4  root     0.00 B/s     2.77 K/s  0.00 %   0.00 %  kvm -id 124
64509  be/4  root     0.00 B/s     3.17 K/s  0.00 %   0.00 %  kvm -id 102
11121  be/4  root   810.26 B/s     5.54 K/s  0.00 %   0.00 %  kvm -id 130
 9751  be/4  root     0.00 B/s     2.77 K/s  0.00 %   0.00 %  kvm -id 129
11169  be/4  root   405.13 B/s   405.13 B/s  0.00 %   0.00 %  kvm -id 127
Code:
# zpool iostat -v 10
...
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
pool2       2.20T   799G      0     66  32.7K  3.09M
  pve-csv2  2.20T   799G      0     66  32.7K  3.09M
cache           -      -      -      -      -      -
  sdb       55.9G  7.62M      0      1  21.0K   256K
----------  -----  -----  -----  -----  -----  -----
Code:
# iostat -d -x 10
...
Device:  rrqm/s  wrqm/s    r/s     w/s   rkB/s    wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sda        0.00    1.30   1.10  237.30    5.40  3124.20     26.26      0.02   0.07     6.36     0.04   0.06   1.33
sdc        0.00    0.00   0.00    0.00    0.00     0.00      0.00      0.00   0.00     0.00     0.00   0.00   0.00
sdb        0.00    0.00   2.10    0.00   75.75     0.00     72.14      0.00   0.67     0.67     0.00   0.67   0.14
dm-0       0.00    0.00   0.20    1.70    1.20    13.60     15.58      0.00   1.05    10.00     0.00   1.05   0.20
dm-1       0.00    0.00   0.00    0.00    0.00     0.00      0.00      0.00   0.00     0.00     0.00   0.00   0.00
dm-2       0.00    0.00   0.00    0.00    0.00     0.00      0.00      0.00   0.00     0.00     0.00   0.00   0.00
dm-3       0.00    0.00   0.00    0.00    0.00     0.00      0.00      0.00   0.00     0.00     0.00   0.00   0.00
dm-4       0.00    0.00   0.10   98.30    0.60  2755.00     56.01      0.00   0.05     7.00     0.04   0.05   0.45
Code:
# top
top - 16:11:27 up 1 day, 2:07, 2 users, load average: 4.94, 5.70, 6.35
Tasks: 1087 total, 1 running, 1086 sleeping, 0 stopped, 0 zombie
%Cpu(s): 2.5 us, 1.4 sy, 0.0 ni, 95.7 id, 0.3 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem: 128840 total, 76732 used, 52107 free, 95 buffers
MiB Swap: 65535 total, 0 used, 65535 free, 4892 cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+  COMMAND
12155 root      20   0 5157m 3.5g 4116 S   16  2.8  12:09.68  kvm
29999 root      20   0 9231m 7.8g 3900 S   15  6.2  49:53.35  kvm
 9801 root      20   0 4852m 4.1g 3960 S   15  3.3 465:40.06  kvm
64509 root      20   0 10.8g  10g 3972 S   10  8.0 245:27.25  kvm
11169 root      20   0 1406m 1.0g 3772 S    8  0.8 114:50.26  kvm
 8423 root      20   0 3676m 3.1g 3808 S    6  2.5 113:30.77  kvm
10858 root      20   0 9313m 5.2g 3788 S    5  4.2  89:14.78  kvm
Code:
# pveversion --verbose
proxmox-ve-2.6.32: 3.3-147 (running kernel: 3.10.0-1-pve)
pve-manager: 3.4-1 (running version: 3.4-1/3f2d890e)
pve-kernel-3.10.0-1-pve: 3.10.0-5
pve-kernel-2.6.32-28-pve: 2.6.32-124
pve-kernel-2.6.32-37-pve: 2.6.32-147
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.7-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.10-2
pve-cluster: 3.0-16
qemu-server: 3.3-20
pve-firmware: 1.1-3
libpve-common-perl: 3.0-24
libpve-access-control: 3.0-16
libpve-storage-perl: 3.0-31
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-8
vzctl: 4.0-1pve6
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 2.1-12
ksm-control-daemon: 1.1-1
glusterfs-client: 3.5.2-1
Code:
# pveperf
CPU BOGOMIPS: 110201.04
REGEX/SECOND: 921999
HD SIZE: 62.87 GB (/dev/mapper/pve-root)
BUFFERED READS: 499.85 MB/sec
AVERAGE SEEK TIME: 9.07 ms
FSYNCS/SECOND: 4271.21
Code:
# pveperf /pool2/VMs/images/
CPU BOGOMIPS: 110201.04
REGEX/SECOND: 942748
HD SIZE: 2970.82 GB (pool2/VMs)
FSYNCS/SECOND: 4683.57
Code:
# pveperf /mnt/sda4/images/
CPU BOGOMIPS: 110201.04
REGEX/SECOND: 923666
HD SIZE: 3023.67 GB (/dev/sda4)
BUFFERED READS: 349.94 MB/sec
AVERAGE SEEK TIME: 9.98 ms
FSYNCS/SECOND: 2474.56
Code:
# qm list | grep -v stopped
VMID  NAME       STATUS   MEM(MB)  BOOTDISK(GB)  PID
 102  server102  running    10240         50.00  64509
 104  server104  running      512          2.00  10287
 106  server106  running     2048        300.00  10481
 108  server108  running     2048          4.00  9277
 109  server109  running     4096         50.00  30156
 110  server110  running     1024        150.00  100765
 111  server111  running     2048        100.00  9633
 112  server112  running     3072         32.00  8423
 113  server113  running     8192         48.00  10858
 115  server115  running     1024         50.00  10631
 116  server116  running     4096         32.00  12155
 117  server117  running     2048          4.00  8895
 119  server119  running     1024         16.00  9178
 122  server122  running     1024         50.00  10779
 124  server124  running     3072         40.00  65437
 127  server127  running     1024         30.00  11169
 128  server128  running     2048          8.00  11283
 129  server129  running     4096         50.00  9751
 130  server130  running     1024         50.00  11121
 131  server131  running     4096         40.00  9801
 140  server140  running     1024        150.00  8691
 153  server153  running     8192        150.00  29999
 156  server156  running     2048        150.00  10059
Thanks.