Proxmox 3.4 freezes on HP DL 380 Gen 9

jmanhique

Renowned Member
Aug 2, 2015
Good day all,

I'm having an annoying problem with my newly installed Proxmox 3.4 on HP Proliant DL 380 Gen 9.

This is the third time it has happened: at approximately 02:00 the server starts logging the following messages:

/var/log/messages
(...)
Aug 2 02:03:07 pve43 kernel: __ratelimit: 30 callbacks suppressed
Aug 2 02:03:07 pve43 kernel: lost page write due to I/O error on dm-3
Aug 2 02:03:07 pve43 kernel: lost page write due to I/O error on dm-3
Aug 2 02:03:07 pve43 kernel: lost page write due to I/O error on dm-3
Aug 2 02:03:08 pve43 kernel: lost page write due to I/O error on dm-3
Aug 2 02:03:08 pve43 kernel: lost page write due to I/O error on dm-3
Aug 2 02:03:08 pve43 kernel: lost page write due to I/O error on dm-3
Aug 2 02:03:08 pve43 kernel: lost page write due to I/O error on dm-3
Aug 2 02:03:08 pve43 kernel: lost page write due to I/O error on dm-3
Aug 2 02:03:08 pve43 kernel: lost page write due to I/O error on dm-3
Aug 2 02:03:09 pve43 kernel: lost page write due to I/O error on dm-3
(...)


/var/log/syslog

(...)
Aug 2 02:03:09 pve43 kernel: EXT3-fs error (device dm-3): ext3_get_inode_loc: unable to read inode block - inode=150865615, block=603455596
Aug 2 02:03:09 pve43 kernel: EXT3-fs (dm-3): I/O error while writing superblock
Aug 2 02:03:09 pve43 kernel: EXT3-fs error (device dm-3): ext3_get_inode_loc: unable to read inode block - inode=150865615, block=603455596
Aug 2 02:03:10 pve43 kernel: EXT3-fs (dm-3): I/O error while writing superblock
Aug 2 02:03:10 pve43 kernel: EXT3-fs error (device dm-3): ext3_get_inode_loc: unable to read inode block - inode=150865615, block=603455596
Aug 2 02:03:10 pve43 kernel: EXT3-fs (dm-3): I/O error while writing superblock
Aug 2 02:03:10 pve43 kernel: EXT3-fs error (device dm-3): ext3_get_inode_loc: unable to read inode block - inode=150865615, block=603455596
Aug 2 02:03:10 pve43 kernel: EXT3-fs (dm-3): I/O error while writing superblock
Aug 2 02:03:10 pve43 kernel: EXT3-fs error (device dm-3): ext3_get_inode_loc: unable to read inode block - inode=150865615, block=603455596
Aug 2 02:03:10 pve43 kernel: EXT3-fs (dm-3): I/O error while writing superblock
Aug 2 02:03:10 pve43 kernel: EXT3-fs error (device dm-3): ext3_get_inode_loc: unable to read inode block - inode=150865615, block=603455596
Aug 2 02:03:10 pve43 kernel: EXT3-fs (dm-3): I/O error while writing superblock
Aug 2 02:03:10 pve43 kernel: EXT3-fs error (device dm-3): ext3_get_inode_loc: unable to read inode block - inode=150865615, block=603455596
Aug 2 02:03:10 pve43 kernel: EXT3-fs (dm-3): I/O error while writing superblock
Aug 2 02:03:10 pve43 kernel: EXT3-fs error (device dm-3): ext3_get_inode_loc: unable to read inode block - inode=150865615, block=603455596
Aug 2 02:03:10 pve43 kernel: EXT3-fs (dm-3): I/O error while writing superblock
Aug 2 02:03:10 pve43 kernel: EXT3-fs error (device dm-3): ext3_get_inode_loc: unable to read inode block - inode=150865615, block=603455596
Aug 2 02:03:10 pve43 kernel: EXT3-fs (dm-3): I/O error while writing superblock
Aug 2 02:03:10 pve43 kernel: EXT3-fs error (device dm-3): ext3_get_inode_loc: unable to read inode block - inode=150865615, block=603455596
Aug 2 02:03:10 pve43 kernel: EXT3-fs (dm-3): I/O error while writing superblock
Aug 2 02:03:10 pve43 kernel: EXT3-fs error (device dm-3): ext3_get_inode_loc: unable to read inode block - inode=150865615, block=603455596
Aug 2 02:03:11 pve43 kernel: EXT3-fs (dm-3): I/O error while writing superblock
Aug 2 02:03:11 pve43 kernel: EXT3-fs error (device dm-3): ext3_get_inode_loc: unable to read inode block - inode=150865615, block=603455596
(...)

And the server freezes.

Before Aug 2, the same thing happened on Jul 31:
Jul 31 01:48:51 pve43 kernel: lost page write due to I/O error on dm-2
Jul 31 01:48:51 pve43 kernel: lost page write due to I/O error on dm-2
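
For anyone triaging similar logs, it can help to collapse the repeated kernel messages into one count per date, minute, and device, so the affected dm-N node and the start time of each incident stand out. The following is only a minimal sketch: it assumes the classic syslog line layout shown in the excerpts above, and the function name summarize_io_errors is made up for illustration.

```python
import re
from collections import Counter

# Classic syslog layout, e.g.
#   "Aug 2 02:03:07 pve43 kernel: lost page write due to I/O error on dm-3"
# (assumption: all lines of interest follow this layout)
LINE_RE = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+(?P<hhmm>\d{2}:\d{2}):\d{2}"
    r"\s+\S+\s+kernel:\s+(?P<msg>.*)$"
)
# Device-mapper node named in the message, e.g. "dm-3"
DEVICE_RE = re.compile(r"\b(dm-\d+)\b")


def summarize_io_errors(lines):
    """Count kernel error messages per (date, HH:MM, device-mapper node)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match or "error" not in match.group("msg"):
            continue  # not a kernel line, or not an error message
        device = DEVICE_RE.search(match.group("msg"))
        if device:
            date = f"{match.group('month')} {match.group('day')}"
            counts[(date, match.group("hhmm"), device.group(1))] += 1
    return counts
```

Feeding it the lines of /var/log/messages and /var/log/syslog (for example via open(path)) gives one counter entry per incident minute and device, which makes it easier to compare the Jul 31 (dm-2) and Aug 2 (dm-3) events.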

I have to switch the server off and on again to get it to boot.

~# pveversion
pve-manager/3.4-6/102d4547 (running kernel: 2.6.32-39-pve)

~# pvdisplay
/dev/pve/vzsnap-pve43-0: read failed after 0 of 4096 at 2813501308928: Input/output error
/dev/pve/vzsnap-pve43-0: read failed after 0 of 4096 at 2813501366272: Input/output error
/dev/pve/vzsnap-pve43-0: read failed after 0 of 4096 at 0: Input/output error
/dev/pve/vzsnap-pve43-0: read failed after 0 of 4096 at 4096: Input/output error
--- Physical volume ---
PV Name /dev/sda3
VG Name pve
PV Size 2.73 TiB / not usable 1.94 MiB
Allocatable yes
PE Size 4.00 MiB
Total PE 715334
Free PE 3839
Allocated PE 711495
PV UUID eikhY0-fVwe-XEaj-iZ8F-FpRa-Rl5E-fNVRPK
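
The extent counts in the pvdisplay output above can be turned into sizes with a little arithmetic, which shows how much unallocated space is left in the pve volume group. This is a small sketch using only the numbers printed above; the helper name pe_to_gib is hypothetical. Whether the remaining free space is related to the read errors on vzsnap-pve43-0 is an open question here, not something established in this thread.

```python
# Values copied from the pvdisplay output above
PE_SIZE_MIB = 4        # "PE Size 4.00 MiB"
TOTAL_PE = 715334      # "Total PE"
FREE_PE = 3839         # "Free PE"
ALLOCATED_PE = 711495  # "Allocated PE"


def pe_to_gib(extents, pe_size_mib=PE_SIZE_MIB):
    """Convert a number of physical extents to GiB."""
    return extents * pe_size_mib / 1024


# Sanity check: allocated + free extents add up to the total
assert ALLOCATED_PE + FREE_PE == TOTAL_PE

free_gib = pe_to_gib(FREE_PE)          # unallocated space in the VG
total_tib = pe_to_gib(TOTAL_PE) / 1024  # should agree with "PV Size 2.73 TiB"
print(f"free: {free_gib:.2f} GiB, total: {total_tib:.2f} TiB")
```

Roughly 15 GiB of the 2.73 TiB physical volume is unallocated, so anything that needs free extents in this volume group has to fit in that space.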



The guys from HP say that there's nothing wrong with the hardware, and I'm lost on what to do :s

Can you help?

Thanks in advance
 
ext3_get_inode_loc: unable to read inode block - inode=150865615, block=603455596

this looks like either a damaged file system or a damaged hard drive.

you should:
* reboot your system and run fsck on all the drives ( shutdown -Fr now does that )
* if the fsck is successful, then you're probably good
* if you get these failures again, you should think about replacing the hard drive
 
Thanks for the reply, adamb. This issue is eating my brain. An HP specialist said that there isn't anything wrong with the hardware. I'm kinda lost. Is there any other test you would advise me to do?

Thanks again
 
Can you provide a bit more information about the server? Exact raid card model and specs?

(1) Dynamic Smart Array B140i and/or
(1) Smart Array P440
(1) Smart Array P840

I am pretty sure all of my machines have the P440.
 
Hi. It's a Smart Array P440.

The server is an
HP ProLiant DL380 Gen9 with 24 x Intel Xeon CPU E5-2620 v2 @ 2.40GHz.
It has 64GB of RAM and 6x10 600 GB (SATA/SAS 3.0) running RAID 1+0.


It's running Proxmox 3.4-6 102d4547 (installed w/o any changes).

There's a USB 3.0 disk plugged in for backups (starting at 00:00) every day.

Do you need any more info that I can provide?

 
Is the entire server on the latest HP update disk? I would definitely start by eliminating that USB 3.0 disk from the situation.
 
I recently upgraded to the latest SPP (HP Service Pack for ProLiant), if that's the question (I didn't quite get it). So you suggest removing the USB disk?
 
