PVE 1.8 - disk performance becomes worse over time

filgood
New Member, joined Mar 20, 2011
Hi,

I've been a happy user since version 1.6. Recently I built a new Proxmox host with an Areca 1222 controller (+BBU) and 8 disks in RAID 10. For the first 12 hours or so after boot, my VMs (KVM) run smoothly, but when I check again the next day, the IO of the host (and hence the VMs) has slowed to a crawl. Any clue what might be the cause?

When I boot the system, my pveperf output is:

prox-001:~# pveperf
CPU BOGOMIPS: 26486.14
REGEX/SECOND: 1301124
HD SIZE: 94.49 GB (/dev/mapper/pve-root)
BUFFERED READS: 528.27 MB/sec
AVERAGE SEEK TIME: 9.94 ms
FSYNCS/SECOND: 1984.11
DNS EXT: 215.94 ms
DNS INT: 68.79 ms (henri.local)



Next Day:

prox-001:~# pveperf
CPU BOGOMIPS: 26486.14
REGEX/SECOND: 1237770
HD SIZE: 94.49 GB (/dev/mapper/pve-root)
BUFFERED READS: 1.74 MB/sec
AVERAGE SEEK TIME: 106.38 ms
FSYNCS/SECOND: 3.44
DNS EXT: 163.14 ms
DNS INT: 200.01 ms (henri.local)


prox-001:~# pveversion -v
pve-manager: 1.8-18 (pve-manager/1.8/6070)
running kernel: 2.6.32-4-pve
proxmox-ve-2.6.32: 1.8-33
pve-kernel-2.6.32-4-pve: 2.6.32-33
qemu-server: 1.1-30
pve-firmware: 1.0-11
libpve-storage-perl: 1.0-17
vncterm: 0.9-2
vzctl: 3.0.27-1pve1
vzdump: 1.2-13
vzprocps: 2.0.11-2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.14.1-1
ksm-control-daemon: 1.0-6

Many thanks for any help,

fil
 
Hi esco,

Thanks for your quick reply. There seems to be very little activity:

Total DISK READ: 0 B/s | Total DISK WRITE: 47.52 K/s
PID USER DISK READ DISK WRITE SWAPIN IO> COMMAND
3851 root 0 B/s 7.92 K/s 0.00 % 28.72 % kvm -monitor unix:/var/run/qemu-server/106.mon,server,nowait -vnc unix:/var/run/qemu-server/106.vnc,password -pidfile /var/run/qemu-
3850 root 0 B/s 7.92 K/s 0.00 % 19.91 % kvm -monitor unix:/var/run/qemu-server/106.mon,server,nowait -vnc unix:/var/run/qemu-server/106.vnc,password -pidfile /var/run/qemu-
3847 root 0 B/s 7.92 K/s 0.00 % 19.88 % kvm -monitor unix:/var/run/qemu-server/106.mon,server,nowait -vnc unix:/var/run/qemu-server/106.vnc,password -pidfile /var/run/qemu-
850 root 0 B/s 0 B/s 0.00 % 19.87 % [kjournald]
3849 root 0 B/s 11.88 K/s 0.00 % 19.68 % kvm -monitor unix:/var/run/qemu-server/106.mon,server,nowait -vnc unix:/var/run/qemu-server/106.vnc,password -pidfile /var/run/qemu-
3848 root 0 B/s 3.96 K/s 0.00 % 9.92 % kvm -monitor unix:/var/run/qemu-server/106.mon,server,nowait -vnc unix:/var/run/qemu-server/106.vnc,password -pidfile /var/run/qemu-
3846 root 0 B/s 7.92 K/s 0.00 % 6.42 % kvm -monitor unix:/var/run/qemu-server/106.mon,server,nowait -vnc unix:/var/run/qemu-server/106.vnc,password -pidfile /var/run/qemu-
2560 root 0 B/s 0 B/s 0.00 % 0.00 % atd
1 root 0 B/s 0 B/s 0.00 % 0.00 % init [2]
2 root 0 B/s 0 B/s 0.00 % 0.00 % [kthreadd]


Thanks, filgood
 
A RAID rebuild?
 
Hi Tom,

I've just checked via the web interface on the areca controller and the volumes are all in a normal state.


Volume Set Name: raid10-000000002
Raid Set Name: Raid Set # 000
Volume Capacity: 1466.0GB
SCSI Ch/Id/Lun: 0/0/1
Raid Level: Raid 1+0
Stripe Size: 64KBytes
Block Size: 512Bytes
Member Disks: 8
Cache Mode: Write Back
Tagged Queuing: Enabled
Volume State: Normal
 
The only thing I've spotted is that the driver for the Areca RAID controller seems a bit old, but I'm not sure if this could be the cause (the firmware on the controller is the latest available):

Jun 22 17:25:29 prox-001 kernel: ARECA RAID ADAPTER0: FIRMWARE VERSION V1.49 2010-12-02
Jun 22 17:25:29 prox-001 kernel: scsi0 : Areca SAS Host Adapter RAID Controller( RAID6 capable)
Jun 22 17:25:29 prox-001 kernel: Driver Version 1.20.00.15 2008/02/27
 
Some iostat output (sda, sdb and sdc are volumes configured on the RAID controller for Proxmox; sda is where Proxmox is installed, sdb is where the VM is running):

prox-001:~# iostat
Linux 2.6.32-4-pve (prox-001) 06/23/11 _x86_64_
avg-cpu: %user %nice %system %iowait %steal %idle
3.28 0.00 2.13 22.18 0.00 72.40
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sdb 32.98 1222.55 488.02 79153623 31596719
sda 3.08 219.32 37.10 14199846 2401808
sda1 0.00 0.05 0.00 3146 16
sda2 3.08 219.27 37.10 14196460 2401792
sdc 0.00 0.04 0.00 2600 0
dm-0 0.00 0.01 0.00 928 0
dm-1 8.69 57.11 17.35 3697802 1123136
dm-2 22.73 162.07 19.75 10492978 1278656
dm-3 33.09 1222.45 488.02 79147151 31596719
sdd 0.00 0.06 0.00 3988 0
sdd1 0.00 0.05 0.00 3156 0


Anyone have any clue?
 
Hi Tom,

Could you explain the steps needed to test/move to this new kernel?

Thanks, filgood
 
Some iostat output (quoted from the post above, sda/sdb/sdc being the RAID-controller volumes). Anyone have any clue?
Hi,
my Areca RAIDs perform well (also a 1222).
You can try the following: use the cli64 program from Areca to inspect the controller:
Code:
cli64 hw info
cli64 disk info
# and then
cli64 disk info drv=X
# any errors on the disks?
You can also use cli64 interactively.

You can also try "iostat -dm 5 sda sdb sdc" for actual IO, before and during pveperf.
I assume the values are also bad with "pveperf /var/lib/vz"?

Udo
 
Could you explain the steps needed to test/move to this new kernel? (quoted from above)
Hi,
I don't think that will solve the problem; the driver is a little old but runs well for me:
Code:
modinfo arcmsr
filename:       /lib/modules/2.6.35-1-pve/kernel/drivers/scsi/arcmsr/arcmsr.ko
version:        Driver Version 1.20.00.15 2008/11/03
license:        Dual BSD/GPL
description:    ARECA (ARC11xx/12xx/13xx/16xx) SATA/SAS RAID Host Bus Adapter
author:         Erich Chen <support@areca.com.tw>
srcversion:     28E7351425AE492268C29B5
alias:          pci:v000017D3d00001681sv*sd*bc*sc*i*
alias:          pci:v000017D3d00001680sv*sd*bc*sc*i*
alias:          pci:v000017D3d00001381sv*sd*bc*sc*i*
alias:          pci:v000017D3d00001380sv*sd*bc*sc*i*
alias:          pci:v000017D3d00001280sv*sd*bc*sc*i*
alias:          pci:v000017D3d00001270sv*sd*bc*sc*i*
alias:          pci:v000017D3d00001260sv*sd*bc*sc*i*
alias:          pci:v000017D3d00001230sv*sd*bc*sc*i*
alias:          pci:v000017D3d00001220sv*sd*bc*sc*i*
alias:          pci:v000017D3d00001210sv*sd*bc*sc*i*
alias:          pci:v000017D3d00001202sv*sd*bc*sc*i*
alias:          pci:v000017D3d00001201sv*sd*bc*sc*i*
alias:          pci:v000017D3d00001200sv*sd*bc*sc*i*
alias:          pci:v000017D3d00001170sv*sd*bc*sc*i*
alias:          pci:v000017D3d00001160sv*sd*bc*sc*i*
alias:          pci:v000017D3d00001130sv*sd*bc*sc*i*
alias:          pci:v000017D3d00001120sv*sd*bc*sc*i*
alias:          pci:v000017D3d00001110sv*sd*bc*sc*i*
depends:        
vermagic:       2.6.35-1-pve SMP mod_unload modversions 

# pveperf /var/lib/vz
CPU BOGOMIPS:      27293.34
REGEX/SECOND:      1104643
HD SIZE:           543.34 GB (/dev/mapper/pve-data)
BUFFERED READS:    461.76 MB/sec
AVERAGE SEEK TIME: 5.48 ms
FSYNCS/SECOND:     5420.88
DNS EXT:           61.61 ms
DNS INT:           0.47 ms

Udo
 
Hi Udo,

Please find below the output of cli64. All individual drives look fine to me, no?

Many thanks for all your help...much appreciated!

~ filgood

prox-001:~/areca# ./cli64 hw info
The Hardware Monitor Information
=====================================================
[Controller H/W Monitor]
CPU Temperature : 44 C
Controller Temp. : 38 C
CPU Fan : 3350 RPM
12V : 11.977 V
5V : 5.026 V
3.3V : 3.312 V
DDR-II +1.8V : 1.840 V
PCI-E +1.8V : 1.840 V
CPU +1.8V : 1.840 V
CPU +1.2V : 1.216 V
DDR-II +0.9V : 0.912 V
Battery Status : 100%
[Enclosure#1 : ARECA SAS RAID AdapterV1.0]
=====================================================
prox-001:~/areca# ./cli64 disk info
# Enc# Slot# ModelName Capacity Usage
===============================================================================
1 01 Slot#1 WDC WD20EARS-00MVWB0 2000.4GB Raid Set # 000
2 01 Slot#2 WDC WD20EARS-00MVWB0 2000.4GB Raid Set # 000
3 01 Slot#3 WDC WD20EARS-00MVWB0 2000.4GB Raid Set # 000
4 01 Slot#4 WDC WD20EARS-00MVWB0 2000.4GB Raid Set # 000
5 01 Slot#5 WDC WD20EARS-00MVWB0 2000.4GB Raid Set # 000
6 01 Slot#6 WDC WD20EARS-00J2GB0 2000.4GB Raid Set # 000
7 01 Slot#7 WDC WD20EARS-00MVWB0 2000.4GB Raid Set # 000
8 01 Slot#8 WDC WD20EARS-00MVWB0 2000.4GB Raid Set # 000
===============================================================================
GuiErrMsg<0x00>: Success.

prox-001:~/areca# ./cli64 disk info drv=1
Drive Information
===============================================================
Device Type : SATA(5001B4D411E26010)
Device Location : Enclosure#1 Slot#1
Model Name : WDC WD20EARS-00MVWB0
Serial Number : WD-WCAZA7721462
Firmware Rev. : 51.0AB51
Disk Capacity : 2000.4GB
Device State : NORMAL
Timeout Count : 0
Media Error Count : 0
Device Temperature : 27 C
SMART Read Error Rate : 200(51)
SMART Spinup Time : 253(21)
SMART Reallocation Count : 200(140)
SMART Seek Error Rate : 200(0)
SMART Spinup Retries : 100(0)
SMART Calibration Retries : 100(0)
===============================================================
GuiErrMsg<0x00>: Success.
prox-001:~/areca# ./cli64 disk info drv=2
Drive Information
===============================================================
Device Type : SATA(5001B4D411E26011)
Device Location : Enclosure#1 Slot#2
Model Name : WDC WD20EARS-00MVWB0
Serial Number : WD-WCAZA7687416
Firmware Rev. : 51.0AB51
Disk Capacity : 2000.4GB
Device State : NORMAL
Timeout Count : 0
Media Error Count : 0
Device Temperature : 28 C
SMART Read Error Rate : 200(51)
SMART Spinup Time : 253(21)
SMART Reallocation Count : 200(140)
SMART Seek Error Rate : 200(0)
SMART Spinup Retries : 100(0)
SMART Calibration Retries : 100(0)
===============================================================
GuiErrMsg<0x00>: Success.
prox-001:~/areca# ./cli64 disk info drv=3
Drive Information
===============================================================
Device Type : SATA(5001B4D411E26012)
Device Location : Enclosure#1 Slot#3
Model Name : WDC WD20EARS-00MVWB0
Serial Number : WD-WMAZA0286862
Firmware Rev. : 51.0AB51
Disk Capacity : 2000.4GB
Device State : NORMAL
Timeout Count : 0
Media Error Count : 0
Device Temperature : 27 C
SMART Read Error Rate : 200(51)
SMART Spinup Time : 173(21)
SMART Reallocation Count : 200(140)
SMART Seek Error Rate : 200(0)
SMART Spinup Retries : 100(0)
SMART Calibration Retries : 100(0)
===============================================================
GuiErrMsg<0x00>: Success.
prox-001:~/areca# ./cli64 disk info drv=4
Drive Information
===============================================================
Device Type : SATA(5001B4D411E26013)
Device Location : Enclosure#1 Slot#4
Model Name : WDC WD20EARS-00MVWB0
Serial Number : WD-WMAZA0485226
Firmware Rev. : 51.0AB51
Disk Capacity : 2000.4GB
Device State : NORMAL
Timeout Count : 0
Media Error Count : 0
Device Temperature : 27 C
SMART Read Error Rate : 200(51)
SMART Spinup Time : 165(21)
SMART Reallocation Count : 200(140)
SMART Seek Error Rate : 200(0)
SMART Spinup Retries : 100(0)
SMART Calibration Retries : 100(0)
===============================================================
GuiErrMsg<0x00>: Success.
prox-001:~/areca# ./cli64 disk info drv=5
Drive Information
===============================================================
Device Type : SATA(5001B4D411E26014)
Device Location : Enclosure#1 Slot#5
Model Name : WDC WD20EARS-00MVWB0
Serial Number : WD-WCAZA7702814
Firmware Rev. : 51.0AB51
Disk Capacity : 2000.4GB
Device State : NORMAL
Timeout Count : 0
Media Error Count : 0
Device Temperature : 28 C
SMART Read Error Rate : 200(51)
SMART Spinup Time : 253(21)
SMART Reallocation Count : 200(140)
SMART Seek Error Rate : 200(0)
SMART Spinup Retries : 100(0)
SMART Calibration Retries : 100(0)
===============================================================
GuiErrMsg<0x00>: Success.
prox-001:~/areca# ./cli64 disk info drv=6
Drive Information
===============================================================
Device Type : SATA(5001B4D411E26015)
Device Location : Enclosure#1 Slot#6
Model Name : WDC WD20EARS-00J2GB0
Serial Number : WD-WCAYY0318302
Firmware Rev. : 80.00A80
Disk Capacity : 2000.4GB
Device State : NORMAL
Timeout Count : 0
Media Error Count : 0
Device Temperature : 29 C
SMART Read Error Rate : 200(51)
SMART Spinup Time : 164(21)
SMART Reallocation Count : 200(140)
SMART Seek Error Rate : 200(0)
SMART Spinup Retries : 100(0)
SMART Calibration Retries : 100(0)
===============================================================
GuiErrMsg<0x00>: Success.
prox-001:~/areca# ./cli64 disk info drv=7
Drive Information
===============================================================
Device Type : SATA(5001B4D411E26016)
Device Location : Enclosure#1 Slot#7
Model Name : WDC WD20EARS-00MVWB0
Serial Number : WD-WCAZA7655673
Firmware Rev. : 51.0AB51
Disk Capacity : 2000.4GB
Device State : NORMAL
Timeout Count : 0
Media Error Count : 0
Device Temperature : 27 C
SMART Read Error Rate : 200(51)
SMART Spinup Time : 253(21)
SMART Reallocation Count : 200(140)
SMART Seek Error Rate : 200(0)
SMART Spinup Retries : 100(0)
SMART Calibration Retries : 100(0)
===============================================================
GuiErrMsg<0x00>: Success.
prox-001:~/areca# ./cli64 disk info drv=8
Drive Information
===============================================================
Device Type : SATA(5001B4D411E26017)
Device Location : Enclosure#1 Slot#8
Model Name : WDC WD20EARS-00MVWB0
Serial Number : WD-WCAZA7702887
Firmware Rev. : 51.0AB51
Disk Capacity : 2000.4GB
Device State : NORMAL
Timeout Count : 0
Media Error Count : 0
Device Temperature : 27 C
SMART Read Error Rate : 200(51)
SMART Spinup Time : 253(21)
SMART Reallocation Count : 200(140)
SMART Seek Error Rate : 200(0)
SMART Spinup Retries : 100(0)
SMART Calibration Retries : 100(0)
===============================================================
GuiErrMsg<0x00>: Success.
 
Hi Udo,

Although I'm running the 2.6.32 kernel, the driver version seems to be the same in both (1.20.00.15, though the build dates differ):

prox-001:~/areca# modinfo arcmsr
filename: /lib/modules/2.6.32-4-pve/kernel/drivers/scsi/arcmsr/arcmsr.ko
version: Driver Version 1.20.00.15 2008/02/27
license: Dual BSD/GPL
description: ARECA (ARC11xx/12xx/13xx/16xx) SATA/SAS RAID HOST Adapter
author: Erich Chen <support@areca.com.tw>
srcversion: C08ACC0E8B613C89690FAB9
alias: pci:v000017D3d00001681sv*sd*bc*sc*i*
alias: pci:v000017D3d00001680sv*sd*bc*sc*i*
alias: pci:v000017D3d00001381sv*sd*bc*sc*i*
alias: pci:v000017D3d00001380sv*sd*bc*sc*i*
alias: pci:v000017D3d00001280sv*sd*bc*sc*i*
alias: pci:v000017D3d00001270sv*sd*bc*sc*i*
alias: pci:v000017D3d00001260sv*sd*bc*sc*i*
alias: pci:v000017D3d00001230sv*sd*bc*sc*i*
alias: pci:v000017D3d00001220sv*sd*bc*sc*i*
alias: pci:v000017D3d00001210sv*sd*bc*sc*i*
alias: pci:v000017D3d00001202sv*sd*bc*sc*i*
alias: pci:v000017D3d00001201sv*sd*bc*sc*i*
alias: pci:v000017D3d00001200sv*sd*bc*sc*i*
alias: pci:v000017D3d00001170sv*sd*bc*sc*i*
alias: pci:v000017D3d00001160sv*sd*bc*sc*i*
alias: pci:v000017D3d00001130sv*sd*bc*sc*i*
alias: pci:v000017D3d00001120sv*sd*bc*sc*i*
alias: pci:v000017D3d00001110sv*sd*bc*sc*i*
depends:
vermagic: 2.6.32-4-pve SMP mod_unload modversions


prox-001:~/areca# pveperf /var/lib/vz
CPU BOGOMIPS: 26486.14
REGEX/SECOND: 1307734
HD SIZE: 1230.21 GB (/dev/mapper/pve-data)
BUFFERED READS: 2.52 MB/sec
AVERAGE SEEK TIME: 103.23 ms
FSYNCS/SECOND: 2.90
DNS EXT: 172.20 ms
DNS INT: 115.39 ms
prox-001:~/areca#
 
I've just done the iostat as you suggested (running pveperf during part of it) and see no difference.


Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sdb 43.80 0.21 1.31 1 6
sda 7.20 0.05 0.00 0 0
sdc 0.00 0.00 0.00 0 0
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sdb 43.00 0.56 0.51 2 2
sda 11.60 0.06 0.02 0 0
sdc 0.00 0.00 0.00 0 0
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sdb 43.20 0.67 0.36 3 1
sda 11.80 0.97 0.01 4 0
sdc 0.00 0.00 0.00 0 0
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sdb 43.80 0.75 0.20 3 1
sda 15.20 0.35 0.01 1 0
sdc 0.00 0.00 0.00 0 0
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sdb 43.60 0.92 0.32 4 1
sda 20.00 0.12 0.00 0 0
sdc 0.00 0.00 0.00 0 0
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sdb 43.60 0.53 0.27 2 1
sda 13.60 0.08 0.05 0 0
sdc 0.00 0.00 0.00 0 0
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sdb 43.80 0.52 0.09 2 0
sda 12.20 0.11 0.02 0 0
sdc 0.00 0.00 0.00 0 0
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sdb 42.00 0.44 0.15 2 0
sda 11.80 0.13 0.00 0 0
sdc 0.00 0.00 0.00 0 0
 
Hi Udo, as the kernel difference was the only thing I could spot versus your setup, I've now migrated to the 2.6.35 kernel. I've just booted the system (the VM is still booting up in the background); find below the output of pveperf:

prox-001:~# pveperf /var/lib/vz
CPU BOGOMIPS: 26485.61
REGEX/SECOND: 1256772
HD SIZE: 1230.21 GB (/dev/mapper/pve-data)
BUFFERED READS: 494.61 MB/sec
AVERAGE SEEK TIME: 20.42 ms
FSYNCS/SECOND: 2834.29
DNS EXT: 90.99 ms
DNS INT: 97.63 ms (henri.local)

prox-001:~# pveversion -v
pve-manager: 1.8-18 (pve-manager/1.8/6070)
running kernel: 2.6.35-1-pve
proxmox-ve-2.6.35: 1.8-11
pve-kernel-2.6.32-4-pve: 2.6.32-33
pve-kernel-2.6.35-1-pve: 2.6.35-11
qemu-server: 1.1-30
pve-firmware: 1.0-11
libpve-storage-perl: 1.0-17
vncterm: 0.9-2
vzctl: 3.0.27-1pve1
vzdump: 1.2-13
vzprocps: 2.0.11-2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.14.1-1
ksm-control-daemon: 1.0-6
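For anyone wanting to do the same: the steps I used were roughly the following (package names taken from my pveversion output above; double-check the exact names with apt before installing):

```shell
# install the 2.6.35 kernel packages alongside the existing 2.6.32 kernel
apt-get update
apt-get install pve-kernel-2.6.35-1-pve proxmox-ve-2.6.35

# reboot into the new kernel (grub should pick it up automatically)
reboot

# after reboot, confirm the running kernel
uname -r
```

The old kernel stays installed, so you can still select 2.6.32 from the boot menu if anything goes wrong.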


I will monitor the performance over the next 24 hours and let the forum know if it degrades again over time.

Many thanks so far for all your help...

~ filgood
 
(quoting filgood's post above about migrating to the 2.6.35 kernel)
Hmm,
I have one machine on an old 2.6.18 kernel that has run without trouble for over half a year now, so I don't know whether migrating to 2.6.35 will solve the issue.
But take a look with top (the CPU/memory rows). Compare the values during good and bad performance (io-wait, system and so on); perhaps a hint can be found there.

Udo
 
Just found a hint: this IRQ 17 message popped up, and the IO speed dropped to the very slow levels reported.


Jun 23 17:41:17 prox-001 kernel: irq 17: nobody cared (try booting with the "irqpoll" option)
Jun 23 17:41:17 prox-001 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-4-pve #1
Jun 23 17:41:17 prox-001 kernel: Call Trace:
Jun 23 17:41:17 prox-001 kernel: <IRQ> [<ffffffff81097bfd>] ? __report_bad_irq+0x30/0x7d
Jun 23 17:41:17 prox-001 kernel: [<ffffffff81097d4f>] ? note_interrupt+0x105/0x16e
Jun 23 17:41:17 prox-001 kernel: [<ffffffff810165b1>] ? read_tsc+0xa/0x20
Jun 23 17:41:17 prox-001 kernel: [<ffffffff810983b4>] ? handle_fasteoi_irq+0x93/0xb5
Jun 23 17:41:17 prox-001 kernel: [<ffffffff8101333f>] ? handle_irq+0x17/0x1d
Jun 23 17:41:17 prox-001 kernel: [<ffffffff81012999>] ? do_IRQ+0x57/0xb6
Jun 23 17:41:17 prox-001 kernel: [<ffffffff81011593>] ? ret_from_intr+0x0/0x11
Jun 23 17:41:17 prox-001 kernel: <EOI> [<ffffffffa01324f9>] ? acpi_idle_enter_bm+0x27d/0x2af [processor]
Jun 23 17:41:17 prox-001 kernel: [<ffffffffa01324f2>] ? acpi_idle_enter_bm+0x276/0x2af [processor]
Jun 23 17:41:17 prox-001 kernel: [<ffffffff812508ae>] ? cpuidle_idle_call+0x94/0xee
Jun 23 17:41:17 prox-001 kernel: [<ffffffff8100ff09>] ? cpu_idle+0xa2/0xda
Jun 23 17:41:17 prox-001 kernel: [<ffffffff81528140>] ? early_idt_handler+0x0/0x71
Jun 23 17:41:17 prox-001 kernel: [<ffffffff81528cea>] ? start_kernel+0x3f2/0x3fe
Jun 23 17:41:17 prox-001 kernel: [<ffffffff815283b7>] ? x86_64_start_kernel+0xf9/0x106
Jun 23 17:41:17 prox-001 kernel: handlers:
Jun 23 17:41:17 prox-001 kernel: [<ffffffffa001ab61>] (arcmsr_do_interrupt+0x0/0x26 [arcmsr])
Jun 23 17:41:17 prox-001 kernel: [<ffffffffa00efb12>] (irq_handler+0x0/0x3d6 [firewire_ohci])
Jun 23 17:41:17 prox-001 kernel: [<ffffffffa008f848>] (usb_hcd_irq+0x0/0x7e [usbcore])
Jun 23 17:41:17 prox-001 kernel: Disabling IRQ #17
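The handler list in the trace shows three drivers sharing IRQ 17 (arcmsr, firewire_ohci, usb_hcd). A quick way to see which devices are registered on an IRQ, and whether its counters still increase after the "Disabling IRQ" message, is to look at /proc/interrupts (a generic sketch, not specific to this box):

```shell
# list the devices registered on IRQ 17 and its per-CPU interrupt counts
grep ' 17:' /proc/interrupts

# watch the counts live; if they stop increasing while the Areca is busy,
# the kernel really has disabled the IRQ
watch -n1 "grep ' 17:' /proc/interrupts"
```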



Below you can see that IRQ 17 belongs to the Areca RAID controller:

Jun 23 21:13:08 prox-001 kernel: arcmsr 0000:02:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
Jun 23 21:13:08 prox-001 kernel: arcmsr 0000:02:00.0: setting latency timer to 64
Jun 23 21:13:08 prox-001 kernel: usb 1-1: new high speed USB device using ehci_hcd and address 2
Jun 23 21:13:08 prox-001 kernel: ARECA RAID ADAPTER4: FIRMWARE VERSION V1.49 2010-12-02
Jun 23 21:13:08 prox-001 kernel: scsi4 : Areca SAS Host Adapter RAID Controller( RAID6 capable)
Jun 23 21:13:08 prox-001 kernel: Driver Version 1.20.00.15 2008/11/03
Jun 23 21:13:08 prox-001 kernel: scsi 4:0:0:0: Direct-Access Areca raid10-000000001 R001 PQ: 0 ANSI: 5
Jun 23 21:13:08 prox-001 kernel: scsi 4:0:0:1: Direct-Access Areca raid10-000000002 R001 PQ: 0 ANSI: 5
Jun 23 21:13:08 prox-001 kernel: scsi 4:0:0:2: Direct-Access Areca raid10-000000003 R001 PQ: 0 ANSI: 5
Jun 23 21:13:08 prox-001 kernel: scsi 4:0:0:3: Direct-Access Areca raid10-000000004 R001 PQ: 0 ANSI: 5
Jun 23 21:13:08 prox-001 kernel: scsi 4:0:16:0: Processor Areca RAID controller R001 PQ: 0 ANSI: 0
Jun 23 21:13:08 prox-001 kernel: sd 4:0:0:0: [sda] 2863280128 512-byte logical blocks: (1.46 TB/1.33 TiB)
Jun 23 21:13:08 prox-001 kernel: sd 4:0:0:0: Attached scsi generic sg0 type 0
Jun 23 21:13:08 prox-001 kernel: sd 4:0:0:0: [sda] Write Protect is off
Jun 23 21:13:08 prox-001 kernel: sd 4:0:0:0: [sda] Mode Sense: cb 00 00 08
Jun 23 21:13:08 prox-001 kernel: sd 4:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Jun 23 21:13:08 prox-001 kernel: sd 4:0:0:1: [sdb] 2863280128 512-byte logical blocks: (1.46 TB/1.33 TiB)
Jun 23 21:13:08 prox-001 kernel: sd 4:0:0:1: Attached scsi generic sg1 type 0
Jun 23 21:13:08 prox-001 kernel: sd 4:0:0:1: [sdb] Write Protect is off
Jun 23 21:13:08 prox-001 kernel: sd 4:0:0:1: [sdb] Mode Sense: cb 00 00 08
Jun 23 21:13:08 prox-001 kernel: sd 4:0:0:1: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Jun 23 21:13:08 prox-001 kernel: sda: sda1 sda2
Jun 23 21:13:08 prox-001 kernel: sd 4:0:0:2: [sdc] 2863280128 512-byte logical blocks: (1.46 TB/1.33 TiB)
Jun 23 21:13:08 prox-001 kernel: sd 4:0:0:2: Attached scsi generic sg2 type 0
Jun 23 21:13:08 prox-001 kernel: sd 4:0:0:0: [sda] Attached SCSI disk
Jun 23 21:13:08 prox-001 kernel: sd 4:0:0:2: [sdc] Write Protect is off
Jun 23 21:13:08 prox-001 kernel: sd 4:0:0:2: [sdc] Mode Sense: cb 00 00 08
Jun 23 21:13:08 prox-001 kernel: sdb:
Jun 23 21:13:08 prox-001 kernel: sd 4:0:0:3: Attached scsi generic sg3 type 0
Jun 23 21:13:08 prox-001 kernel: scsi 4:0:16:0: Attached scsi generic sg4 type 3
Jun 23 21:13:08 prox-001 kernel: sd 4:0:0:2: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Jun 23 21:13:08 prox-001 kernel: sd 4:0:0:3: [sdd] 3517577216 512-byte logical blocks: (1.80 TB/1.63 TiB)
Jun 23 21:13:08 prox-001 kernel: sd 4:0:0:3: [sdd] Write Protect is off
Jun 23 21:13:08 prox-001 kernel: sdc:
Jun 23 21:13:08 prox-001 kernel: sd 4:0:0:3: [sdd] Mode Sense: cb 00 00 08
Jun 23 21:13:08 prox-001 kernel: sd 4:0:0:3: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Jun 23 21:13:08 prox-001 kernel: sdd: unknown partition table
Jun 23 21:13:08 prox-001 kernel: sd 4:0:0:1: [sdb] Attached SCSI disk
Jun 23 21:13:08 prox-001 kernel: unknown partition table
Jun 23 21:13:08 prox-001 kernel: sd 4:0:0:3: [sdd] Attached SCSI disk
Jun 23 21:13:08 prox-001 kernel: unknown partition table
Jun 23 21:13:08 prox-001 kernel: sd 4:0:0:2: [sdc] Attached SCSI disk


Any idea how I can fix this? How do I set this irqpoll option for the kernel?

Many thanks, filgood
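For the record, irqpoll is a kernel boot parameter, so it goes on the kernel command line in the bootloader config. A sketch for both bootloader generations (a PVE 1.x system may still be on grub-legacy; adjust to whatever your box actually uses):

```shell
# grub-legacy: append irqpoll to the kernel line in /boot/grub/menu.lst, e.g.
#   kernel /boot/vmlinuz-2.6.32-4-pve root=/dev/mapper/pve-root ro irqpoll

# grub2: add irqpoll to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, e.g.
#   GRUB_CMDLINE_LINUX_DEFAULT="quiet irqpoll"
# then regenerate the config:
update-grub

# after the next reboot, verify the option is active:
cat /proc/cmdline
```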
 
The new kernel (2.6.35) does not cause this IRQ reset, but my VM (SBS 2011) freezes up when using the virtio network driver. I've since upgraded the BIOS of the motherboard to see if that helps and will report back.

~ filgood