Very bad I/O on Proxmox 4.4

TECOB
Member
Apr 28, 2017
Hi,

We are getting very bad performance on a server running Proxmox 4.4.

# pveperf
CPU BOGOMIPS: 38398.64
REGEX/SECOND: 2075874
HD SIZE: 94.37 GB (/dev/dm-0)
BUFFERED READS: 0.46 MB/sec
AVERAGE SEEK TIME: 4.52 ms
FSYNCS/SECOND: 1281.17
DNS EXT: 55.85 ms
DNS INT: 5.66 ms (tecob.com)
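
For anyone who wants to reproduce that buffered-read figure outside pveperf, a raw sequential read straight from the device (bypassing the page cache) should give a comparable number; the bs/count values here are arbitrary:

# dd if=/dev/dm-0 of=/dev/null bs=1M count=1024 iflag=direct

If hdparm is installed, hdparm -tT /dev/dm-0 gives similar cached and uncached read figures.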

# df -h
Filesystem Size Used Avail Use% Mounted on
udev 10M 0 10M 0% /dev
tmpfs 6.3G 12M 6.3G 1% /run
/dev/dm-0 95G 17G 74G 19% /
tmpfs 16G 49M 16G 1% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 16G 0 16G 0% /sys/fs/cgroup
/dev/mapper/backup-backup 509G 186G 298G 39% /backup
/dev/fuse 30M 24K 30M 1% /etc/pve

The hardware RAID reports OK, but a vzdump of an LXC container now takes hours, where it normally finishes in 10 minutes...

iotop shows disk reads and writes peaking at around 800 K/s at most...
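
Per-device latency and utilisation can also be checked alongside iotop with iostat from the sysstat package (illustrative invocation; extended stats in MB at a 2-second interval):

# iostat -xm 2

High await values together with %util near 100% on the underlying sd/dm devices would point at the storage layer rather than at any single process.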

This server was running very well, but during the night it suddenly reached a load of more than 200, caused by long I/O delays.

Has anyone run into this kind of problem?

Regards
 
Have you tested your hard disks with SMART (see the example below)? Or is there a special RAID utility that comes with the controller?
What is "/dev/mapper/backup-backup": a single disk, or also a RAID? Which RAID, hardware or software? And what type of server do you have, HP/Dell ...?
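
For disks behind a MegaRAID controller, smartctl can usually address the individual drives through the controller; the device numbers below are placeholders and have to match the controller's device IDs:

# smartctl -a -d megaraid,0 /dev/sda
# smartctl -a -d megaraid,1 /dev/sda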
 
Hi,

Server info (from OVH)
CPU: Intel(R) Xeon(R) CPU D-1521 @ 2.40GHz
RAM: 32GB
Disk: hardware RAID, 3x 600GB SAS 15K, with an 80GB SSD CacheCade

There's an LVM-Thin pool on this host and /backup is the directory containing the dumps. Everything sits on the RAID described above.

All disks report SMART OK, as shown by the MegaCli output below:
# /opt/MegaRAID/MegaCli/MegaCli64 -CfgDsply -a0

==============================================================================
Adapter: 0
Product Name: LSI MegaRAID SAS 9271-4i
Memory: 1024MB
BBU: Present
Serial No: SV32308875
==============================================================================
Number of DISK GROUPS: 2

DISK GROUP: 0
Number of Spans: 1
SPAN: 0
Span Reference: 0x00
Number of PDs: 3
Number of VDs: 1
Number of dedicated Hotspares: 0
Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name :
RAID Level : Primary-5, Secondary-0, RAID Level Qualifier-3
Size : 1.089 TB
Sector Size : 512
Is VD emulated : No
Parity Size : 558.406 GB
State : Optimal
Strip Size : 256 KB
Number Of Drives : 3
Span Depth : 1
Default Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
Bad Blocks Exist: No
PI type: No PI

Is VD Cached: Yes
Cache Cade Type : Read and Write
Physical Disk Information:
Physical Disk: 0
Enclosure Device ID: 252
Slot Number: 0
Drive's position: DiskGroup: 0, Span: 0, Arm: 0
Enclosure position: N/A
Device Id: 25
WWN: 5000CCA05958D663
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS

Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 558.406 GB [0x45cd0000 Sectors]
Sector Size: 512
Logical Sector Size: 512
Physical Sector Size: 512
Firmware state: Online, Spun Up
Commissioned Spare : No
Emergency Spare : No
Device Firmware Level: A703
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x5000cca05958d661
SAS Address(1): 0x0
Connected Port Number: 3(path0)
Inquiry Data: HGST HUC156060CSS200 A7030XHKVL0V
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 12.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature :41C (105.80 F)
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No



Physical Disk: 1
Enclosure Device ID: 252
Slot Number: 1
Drive's position: DiskGroup: 0, Span: 0, Arm: 1
Enclosure position: N/A
Device Id: 24
WWN: 5000CCA0597F5B3B
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS

Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 558.406 GB [0x45cd0000 Sectors]
Sector Size: 512
Logical Sector Size: 512
Physical Sector Size: 512
Firmware state: Online, Spun Up
Commissioned Spare : No
Emergency Spare : No
Device Firmware Level: A703
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x5000cca0597f5b39
SAS Address(1): 0x0
Connected Port Number: 2(path0)
Inquiry Data: HGST HUC156060CSS200 A7030XJ818WP
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 12.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature :38C (100.40 F)
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No



Physical Disk: 2
Enclosure Device ID: 252
Slot Number: 2
Drive's position: DiskGroup: 0, Span: 0, Arm: 2
Enclosure position: N/A
Device Id: 26
WWN: 5000CCA05958E727
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS

Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 558.406 GB [0x45cd0000 Sectors]
Sector Size: 512
Logical Sector Size: 512
Physical Sector Size: 512
Firmware state: Online, Spun Up
Commissioned Spare : No
Emergency Spare : No
Device Firmware Level: A703
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x5000cca05958e725
SAS Address(1): 0x0
Connected Port Number: 1(path0)
Inquiry Data: HGST HUC156060CSS200 A7030XHKWPMV
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 12.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature :39C (102.20 F)
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No



CACHECADE DISK GROUP: 0
CacheCade Virtual Drive Information:
CacheCade Virtual Drive: 0 (Target Id: 1)
Virtual Drive Type : CacheCade
Name :
RAID Level : Primary-0, Secondary-0
State : Optimal
Size : 74.0 GB
Target Id of the Associated LDs : 0
Default Cache Policy : WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
Current Cache Policy : WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
Physical Disk Information:
Physical Disk: 0
Enclosure Device ID: 252
Slot Number: 3
Drive's position: DiskGroup: 1, Span: 0, Arm: 0
Enclosure position: N/A
Device Id: 20
WWN: 55cd2e404c063f3b
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA

Raw Size: 74.530 GB [0x950f8b0 Sectors]
Non Coerced Size: 74.030 GB [0x940f8b0 Sectors]
Coerced Size: 74.0 GB [0x9400000 Sectors]
Sector Size: 512
Logical Sector Size: 512
Physical Sector Size: 4096
Firmware state: Online, Spun Up
Commissioned Spare : No
Emergency Spare : No
Device Firmware Level: 0130
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x4433221100000000
Connected Port Number: 0(path0)
Inquiry Data: BTWA525200VJ080BGN INTEL SSDSC2BB080G6 G2010130
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Solid State Device
Drive: Not Certified
Drive Temperature :34C (93.20 F)
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No
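
For completeness, since the cache policy above drops the write cache on a bad BBU ("No Write Cache if Bad BBU"), the battery state can be queried as well; standard MegaCli invocation as far as I know:

# /opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuStatus -a0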

The system was rebooted 2 days ago and I noticed something strange on that first day, but I thought a CT was overloading the server. This morning, though, the load was above 200 with lots of I/O wait. We have also attached an NFS NAS to make dumps go faster, but even reads from the local disk are really slow...

The LVM-Thin pool has never been full. There's no explanation, as far as I can tell.

Does anyone have an idea?

regards,
 
Here's the output of pveversion:

# pveversion -v
proxmox-ve: 4.4-87 (running kernel: 4.4.59-1-pve)
pve-manager: 4.4-13 (running version: 4.4-13/7ea56165)
pve-kernel-4.4.35-1-pve: 4.4.35-77
pve-kernel-4.4.59-1-pve: 4.4.59-87
pve-kernel-4.4.49-1-pve: 4.4.49-86
lvm2: 2.02.116-pve3
corosync-pve: 2.4.2-2~pve4+1
libqb0: 1.0.1-1
pve-cluster: 4.0-49
qemu-server: 4.0-110
pve-firmware: 1.1-11
libpve-common-perl: 4.0-94
libpve-access-control: 4.0-23
libpve-storage-perl: 4.0-76
pve-libspice-server1: 0.12.8-2
vncterm: 1.3-2
pve-docs: 4.4-4
pve-qemu-kvm: 2.7.1-4
pve-container: 1.0-99
pve-firewall: 2.0-33
pve-ha-manager: 1.0-40
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u3
lxc-pve: 2.0.7-4
lxcfs: 2.0.6-pve1
criu: 1.6.0-1
novnc-pve: 0.5-9
smartmontools: 6.5+svn4324-1~pve80
zfsutils: 0.6.5.9-pve15~bpo80
 
Your buffered reads on that device (/dev/dm-0) are incredibly bad, as are your seek times. Do you have another controller handy?
 
Hmm... on PVE 4 I had a similar problem when I attached a faulty NFS share; the load went very high (150+ on 12 cores). Can you test a backup to another location? (See the sketch at the end of this post.)

If that's not it, maybe you can test another controller.

"This server was running very well, but during the night it suddenly reached a load of more than 200, caused by long I/O delays."
And is this only when a backup is running?
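
For the backup-to-another-place test, something like this would do (the CT ID and target path are placeholders):

# vzdump 101 --dumpdir /mnt/other-storage --mode snapshot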
 
Hi,

We have stopped all CTs and none are running. Those values are for a single dump with everything stopped.
The NFS was not attached at first; we attached it later to dump to the NFS NAS. We have seen high I/O on NFS shares with a poor connection, but that's not the issue here: those values are from before the NAS was attached, and the current NAS has a good connection.
And no, this is not only during dumps. When the host was at a load over 200, all the CTs (about 15) were running on it. We decided to stop all of them and move them to another server, and only then saw that the dumps are really slow...

In fact, everything comes back to slow access to the disks. I don't know why...

@jeffwadsworth: what do you mean by another controller handy? If you mean replacing the RAID controller, that is not possible.

 
I would suggest you get another LSI MegaRAID SAS 9271-4i and switch it out to test. The configuration information is kept on the disks.
Make sure the firmware is the same.
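
Before swapping anything, it may also be worth grepping the kernel log for controller or SCSI errors; an illustrative filter:

# dmesg -T | grep -iE 'megaraid|sd[a-z]|i/o error|timeout'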
 
In case this is involved, I can see all these processes:

# ps ax | grep arc_prune
1415 ? S 0:00 [arc_prune]
1416 ? S 0:00 [arc_prune]
1417 ? S 0:00 [arc_prune]
1418 ? S 0:00 [arc_prune]
1419 ? S 0:00 [arc_prune]
1420 ? S 0:00 [arc_prune]
1421 ? S 0:00 [arc_prune]
1422 ? S 0:00 [arc_prune]
22546 pts/1 S+ 0:00 grep arc_prune

# ps ax | grep spl
1345 ? S< 0:00 [spl_kmem_cache]
1346 ? S< 0:00 [spl_kmem_cache]
1347 ? S< 0:00 [spl_kmem_cache]
1348 ? S< 0:00 [spl_kmem_cache]
1349 ? S< 0:00 [spl_system_task]
1350 ? S< 0:00 [spl_system_task]
1351 ? S< 0:00 [spl_system_task]
1352 ? S< 0:00 [spl_system_task]
1353 ? S< 0:00 [spl_system_task]
1354 ? S< 0:00 [spl_system_task]
1355 ? S< 0:00 [spl_system_task]
1356 ? S< 0:00 [spl_system_task]
1357 ? S< 0:00 [spl_system_task]
1358 ? S< 0:00 [spl_system_task]
1359 ? S< 0:00 [spl_system_task]
1360 ? S< 0:00 [spl_system_task]
1361 ? S< 0:00 [spl_system_task]
1362 ? S< 0:00 [spl_system_task]
1363 ? S< 0:00 [spl_system_task]
1364 ? S< 0:00 [spl_system_task]
1365 ? S< 0:00 [spl_system_task]
1366 ? S< 0:00 [spl_system_task]
1367 ? S< 0:00 [spl_system_task]
1368 ? S< 0:00 [spl_system_task]
1369 ? S< 0:00 [spl_system_task]
1370 ? S< 0:00 [spl_system_task]
1371 ? S< 0:00 [spl_system_task]
1372 ? S< 0:00 [spl_system_task]
1373 ? S< 0:00 [spl_system_task]
1374 ? S< 0:00 [spl_system_task]
1375 ? S< 0:00 [spl_system_task]
1376 ? S< 0:00 [spl_system_task]
1377 ? S< 0:00 [spl_system_task]
1378 ? S< 0:00 [spl_system_task]
1379 ? S< 0:00 [spl_system_task]
1380 ? S< 0:00 [spl_system_task]
1381 ? S< 0:00 [spl_system_task]
1382 ? S< 0:00 [spl_system_task]
1383 ? S< 0:00 [spl_system_task]
1384 ? S< 0:00 [spl_system_task]
1385 ? S< 0:00 [spl_system_task]
1386 ? S< 0:00 [spl_system_task]
1387 ? S< 0:00 [spl_system_task]
1388 ? S< 0:00 [spl_system_task]
1389 ? S< 0:00 [spl_system_task]
1390 ? S< 0:00 [spl_system_task]
1391 ? S< 0:00 [spl_system_task]
1392 ? S< 0:00 [spl_system_task]
1393 ? S< 0:00 [spl_system_task]
1394 ? S< 0:00 [spl_system_task]
1395 ? S< 0:00 [spl_system_task]
1396 ? S< 0:00 [spl_system_task]
1397 ? S< 0:00 [spl_system_task]
1398 ? S< 0:00 [spl_system_task]
1399 ? S< 0:00 [spl_system_task]
1400 ? S< 0:00 [spl_system_task]
1401 ? S< 0:00 [spl_system_task]
1402 ? S< 0:00 [spl_system_task]
1403 ? S< 0:00 [spl_system_task]
1404 ? S< 0:00 [spl_system_task]
1405 ? S< 0:00 [spl_system_task]
1406 ? S< 0:00 [spl_system_task]
1407 ? S< 0:00 [spl_system_task]
1408 ? S< 0:00 [spl_system_task]
1409 ? S< 0:00 [spl_system_task]
1410 ? S< 0:00 [spl_system_task]
1411 ? S< 0:00 [spl_system_task]
1412 ? S< 0:00 [spl_system_task]
1413 ? S< 0:00 [spl_dynamic_tas]
22563 pts/1 S+ 0:00 grep spl


And this is very strange because, as far as I remember, the filesystem is ext4... aren't those processes related to ZFS?
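
For the record, arc_prune and the spl_* kernel threads belong to the ZFS/SPL modules, which can be loaded even when no ZFS pool is configured. Whether they are loaded can be checked with something like:

# lsmod | grep -E 'zfs|spl'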

Regards
 
Hi @LnxBil

Thanks, we have removed the module, but it has no effect on buffered reads. It's still as slow as before and I/O blocks everything.
We have also blacklisted the module and rebooted, but it's still the same.
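
The blacklisting was done roughly along these lines (the file name is arbitrary; standard Debian mechanism):

# echo "blacklist zfs" >> /etc/modprobe.d/blacklist-zfs.conf
# echo "blacklist spl" >> /etc/modprobe.d/blacklist-zfs.conf
# update-initramfs -u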

regards,
 
Are there events recorded in the controller log? Have you tried booting another live Linux and checking whether the performance is also bad there?
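
The controller's event log can be dumped to a file with something along these lines (MegaCli flags from memory, so double-check them):

# /opt/MegaRAID/MegaCli/MegaCli64 -AdpEventLog -GetEvents -f events.log -a0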
 
Hi @LnxBil

There's no way for us to use a live distro, but since we have tried everything else, we have decided to reinstall Proxmox so that at least we know whether it's a hardware problem or a problem with the Proxmox GNU/Linux software.
We're going to install the same version, with a setup as similar as possible to the current one.

We'll keep you informed of the results and try everything before we abandon this server at the end of the month.

Regards
 
ok,

We have reinstalled the server with the same Proxmox version, and it appears the problem is a hardware one. Something is wrong with the RAID: we are getting the same bad pveperf results, with very low buffered reads and continuous freezes of the system.

That, at least, is very good news, even if we've been struggling with this for more than 2 weeks.

Thanks all for your help.

Regards
 
