FS problems on proxmox during the vzdump?

RRJ

Member
Apr 14, 2010
Estonia, Tallinn
Hello,
I've got 2 servers with Proxmox installed on them. The problem is that one of them crashes with a kernel panic. Today was the 2nd time. I'll post a screenshot with the last kernel messages (I've got an iLO on it, so I can see them):
sector_IO_error.PNG
Any console command now replies with this:
services:~# ps axu
-bash: /bin/ps: Input/output error
services:~# dmesg
-bash: /bin/dmesg: Input/output error

And 2 minutes ago it dropped the SSH connection :(

Is that a faulty sda disk or something else?

On the second one, dmesg fills up with logs like this during vzdump every night:

Code:
EXT3 FS on dm-3, internal journal
ext3_orphan_cleanup: deleting unreferenced inode 16229416
ext3_orphan_cleanup: deleting unreferenced inode 16245672
ext3_orphan_cleanup: deleting unreferenced inode 16662705
EXT3-fs: dm-3: 3 orphan inodes deleted
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting.  Commit interval 5 seconds
EXT3 FS on dm-3, internal journal
ext3_orphan_cleanup: deleting unreferenced inode 16229416
ext3_orphan_cleanup: deleting unreferenced inode 16245672
ext3_orphan_cleanup: deleting unreferenced inode 16662705
EXT3-fs: dm-3: 3 orphan inodes deleted
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting.  Commit interval 5 seconds
EXT3 FS on dm-3, internal journal
ext3_orphan_cleanup: deleting unreferenced inode 16229416
ext3_orphan_cleanup: deleting unreferenced inode 16245672
ext3_orphan_cleanup: deleting unreferenced inode 16662705
EXT3-fs: dm-3: 3 orphan inodes deleted
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.

What could be the reason?
 
Check the hard disks for errors. Is this a single drive or a RAID volume?
 
Hi, and thanks for the reply.

Which drive should I check? The one that gave me the kernel panic, or the other one, which floods dmesg?
By the way, the one with the panic came back online after a reboot.
 
The dmesg notifications are expected; they are not errors.
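To make the distinction above concrete, here is a minimal Python sketch that sorts kernel log lines into "expected" mount messages and real errors. The patterns are copied from the log excerpts in this thread; grouping them this way, and the `classify` and `summarize` names, are assumptions made for illustration, not an official list.

```python
import re
from collections import Counter

# Messages seen on every mount of the backup snapshot in this thread.
# Treating exactly these as harmless is an assumption for this sketch.
EXPECTED = [
    r"kjournald starting",
    r"internal journal",
    r"ext3_orphan_cleanup: deleting unreferenced inode",
    r"orphan inodes? deleted",
    r"recovery complete",
    r"mounted filesystem with ordered data mode",
    r"barriers disabled",
]

# Messages that indicate a real problem in the later posts of this thread.
ERRORS = [
    r"\berror\b",
    r"IO failure",
    r"Invalidating snapshot",
    r"Aborting journal",
]


def classify(line):
    """Return 'error', 'expected', or 'other' for one kernel log line."""
    if any(re.search(p, line) for p in ERRORS):
        return "error"
    if any(re.search(p, line) for p in EXPECTED):
        return "expected"
    return "other"


def summarize(lines):
    """Count how many lines fall into each class."""
    return Counter(classify(line) for line in lines)
```

Feeding it the output of dmesg or /var/log/messages line by line gives a quick count. If only "expected" lines show up around backup time, there is probably nothing to fix.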
 
I just got the same errors on one of my servers yesterday.
/var/log/messages looks something like this:
Code:
Nov  2 22:00:06 proxmox3 kernel: EXT3-fs: barriers disabled
Nov  2 22:00:06 proxmox3 kernel: kjournald starting.  Commit interval 5 seconds
Nov  2 22:00:06 proxmox3 kernel: EXT3-fs (dm-3): using internal journal
Nov  2 22:00:06 proxmox3 kernel: EXT3-fs (dm-3): 24 orphan inodes deleted
Nov  2 22:00:06 proxmox3 kernel: EXT3-fs (dm-3): recovery complete
Nov  2 22:00:06 proxmox3 kernel: EXT3-fs (dm-3): mounted filesystem with ordered data mode
Nov  2 22:04:13 proxmox3 kernel: EXT3-fs: barriers disabled
Nov  2 22:04:13 proxmox3 kernel: kjournald starting.  Commit interval 5 seconds
Nov  2 22:04:13 proxmox3 kernel: EXT3-fs (dm-3): using internal journal
Nov  2 22:04:13 proxmox3 kernel: EXT3-fs (dm-3): 24 orphan inodes deleted
Nov  2 22:04:13 proxmox3 kernel: EXT3-fs (dm-3): recovery complete
Nov  2 22:04:13 proxmox3 kernel: EXT3-fs (dm-3): mounted filesystem with ordered data mode
Nov  2 22:05:32 proxmox3 kernel: EXT3-fs: barriers disabled
...
Nov  2 23:15:41 proxmox3 kernel: kjournald starting.  Commit interval 5 seconds
Nov  2 23:15:41 proxmox3 kernel: EXT3-fs (dm-3): using internal journal
Nov  2 23:15:41 proxmox3 kernel: EXT3-fs (dm-3): 24 orphan inodes deleted
Nov  2 23:15:41 proxmox3 kernel: EXT3-fs (dm-3): recovery complete
Nov  2 23:15:41 proxmox3 kernel: EXT3-fs (dm-3): mounted filesystem with ordered data mode
Nov  3 01:38:08 proxmox3 kernel: __ratelimit: 6 callbacks suppressed
Nov  3 01:38:08 proxmox3 kernel: lost page write due to I/O error on dm-3
Nov  3 01:38:08 proxmox3 kernel: lost page write due to I/O error on dm-3
Nov  3 01:38:08 proxmox3 kernel: lost page write due to I/O error on dm-3
Nov  3 01:38:08 proxmox3 kernel: lost page write due to I/O error on dm-3
Nov  3 01:38:08 proxmox3 kernel: lost page write due to I/O error on dm-3
Nov  3 01:38:08 proxmox3 kernel: lost page write due to I/O error on dm-3
Nov  3 01:38:08 proxmox3 kernel: lost page write due to I/O error on dm-3
Nov  3 01:38:08 proxmox3 kernel: lost page write due to I/O error on dm-3
Nov  3 01:38:08 proxmox3 kernel: lost page write due to I/O error on dm-3
Nov  3 01:38:08 proxmox3 kernel: lost page write due to I/O error on dm-3
Nov  3 01:38:25 proxmox3 kernel: __ratelimit: 11 callbacks suppressed
Nov  3 01:38:25 proxmox3 kernel: lost page write due to I/O error on dm-3
Nov  3 01:38:25 proxmox3 kernel: lost page write due to I/O error on dm-3
Nov  3 01:38:25 proxmox3 kernel: lost page write due to I/O error on dm-3
Nov  3 01:38:25 proxmox3 kernel: lost page write due to I/O error on dm-3
Nov  3 01:38:25 proxmox3 kernel: lost page write due to I/O error on dm-3
Nov  3 01:38:25 proxmox3 kernel: lost page write due to I/O error on dm-3
Nov  3 01:38:25 proxmox3 kernel: lost page write due to I/O error on dm-3
Nov  3 01:38:25 proxmox3 kernel: lost page write due to I/O error on dm-3
Nov  3 01:38:25 proxmox3 kernel: lost page write due to I/O error on dm-3
Nov  3 01:38:25 proxmox3 kernel: lost page write due to I/O error on dm-3
Nov  3 01:43:28 proxmox3 kernel: EXT3-fs: barriers disabled
Nov  3 01:43:28 proxmox3 kernel: kjournald starting.  Commit interval 5 seconds
Nov  3 01:43:28 proxmox3 kernel: EXT3-fs (dm-3): using internal journal
Nov  3 01:43:28 proxmox3 kernel: EXT3-fs (dm-3): 24 orphan inodes deleted
Nov  3 01:43:28 proxmox3 kernel: EXT3-fs (dm-3): recovery complete
Nov  3 01:43:28 proxmox3 kernel: EXT3-fs (dm-3): mounted filesystem with ordered data mode
...
Then the last 6 lines repeat multiple times.
I presume dm-3 is the default /dev/mapper/pve-data volume. 22:00 is when the backup starts.
Can anyone tell me what's going on? It looks to me like the fs on pve-data has errors, so when an LVM snapshot is taken and mounted, fsck finds the same errors over and over again. What bothers me is where the errors came from, and the "lost page write due to I/O error on dm-3" message.

dmesg shows the following:
Code:
device-mapper: snapshots: Invalidating snapshot: Unable to allocate exception.
EXT3-fs error (device dm-3): ext3_get_inode_loc: unable to read inode block - inode=88326401, block=353304594
__ratelimit: 6 callbacks suppressed
Buffer I/O error on device dm-3, logical block 0
lost page write due to I/O error on dm-3
EXT3-fs (dm-3): I/O error while writing superblock
EXT3-fs (dm-3): error in ext3_reserve_inode_write: IO failure
Buffer I/O error on device dm-3, logical block 0
lost page write due to I/O error on dm-3
EXT3-fs (dm-3): I/O error while writing superblock
EXT3-fs error (device dm-3): ext3_get_inode_loc: unable to read inode block - inode=88326405, block=353304594
Buffer I/O error on device dm-3, logical block 0
lost page write due to I/O error on dm-3
EXT3-fs (dm-3): I/O error while writing superblock
EXT3-fs (dm-3): error in ext3_reserve_inode_write: IO failure
Buffer I/O error on device dm-3, logical block 0
lost page write due to I/O error on dm-3
EXT3-fs (dm-3): I/O error while writing superblock
EXT3-fs error (device dm-3): ext3_get_inode_loc: unable to read inode block - inode=88326416, block=353304594
Buffer I/O error on device dm-3, logical block 0
lost page write due to I/O error on dm-3
EXT3-fs (dm-3): I/O error while writing superblock
EXT3-fs (dm-3): error in ext3_reserve_inode_write: IO failure
Buffer I/O error on device dm-3, logical block 0
lost page write due to I/O error on dm-3
EXT3-fs (dm-3): I/O error while writing superblock
EXT3-fs error (device dm-3): ext3_get_inode_loc: unable to read inode block - inode=88326402, block=353304594
Buffer I/O error on device dm-3, logical block 0
lost page write due to I/O error on dm-3
EXT3-fs (dm-3): I/O error while writing superblock
EXT3-fs (dm-3): error in ext3_reserve_inode_write: IO failure
Buffer I/O error on device dm-3, logical block 0
lost page write due to I/O error on dm-3
EXT3-fs (dm-3): I/O error while writing superblock
EXT3-fs error (device dm-3): ext3_get_inode_loc: unable to read inode block - inode=88326403, block=353304594
Buffer I/O error on device dm-3, logical block 0
lost page write due to I/O error on dm-3
EXT3-fs (dm-3): I/O error while writing superblock
EXT3-fs (dm-3): error in ext3_reserve_inode_write: IO failure
Buffer I/O error on device dm-3, logical block 0
lost page write due to I/O error on dm-3
EXT3-fs (dm-3): I/O error while writing superblock
EXT3-fs error (device dm-3): ext3_get_inode_loc: unable to read inode block - inode=88326415, block=353304594
EXT3-fs (dm-3): I/O error while writing superblock
EXT3-fs (dm-3): error in ext3_reserve_inode_write: IO failure
EXT3-fs (dm-3): I/O error while writing superblock
...
EXT3-fs error (device dm-3): ext3_get_inode_loc: unable to read inode block - inode=88326410, block=353304594
EXT3-fs (dm-3): I/O error while writing superblock
EXT3-fs (dm-3): error in ext3_reserve_inode_write: IO failure
EXT3-fs (dm-3): I/O error while writing superblock
Aborting journal on device dm-3.
JBD: I/O error detected when updating journal superblock for dm-3.
EXT3-fs (dm-3): error: ext3_journal_start_sb: Detected aborted journal
EXT3-fs (dm-3): error: remounting filesystem read-only
__ratelimit: 11 callbacks suppressed
Buffer I/O error on device dm-3, logical block 348160006
lost page write due to I/O error on dm-3
Buffer I/O error on device dm-3, logical block 348160007
lost page write due to I/O error on dm-3
Buffer I/O error on device dm-3, logical block 348160008
lost page write due to I/O error on dm-3
...
Buffer I/O error on device dm-3, logical block 348160015
lost page write due to I/O error on dm-3
EXT3-fs (dm-3): error: ext3_put_super: Couldn't clean up the journal

Could anyone please tell me if these I/O faults are hardware or LVM related?
The hardware is an LSI SAS MegaRAID controller with BBU, in RAID 1.
The PVE version is 1.9:
Code:
proxmox3:~# pveversion --verbose
pve-manager: 1.9-24 (pve-manager/1.9/6542)
running kernel: 2.6.32-6-pve
proxmox-ve-2.6.32: 1.9-47
pve-kernel-2.6.32-6-pve: 2.6.32-47
qemu-server: 1.1-32
pve-firmware: 1.0-14
libpve-storage-perl: 1.0-19
vncterm: 0.9-2
vzctl: 3.0.29-2pve1
vzdump: 1.2-16
vzprocps: 2.0.11-2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.15.0-1
ksm-control-daemon: 1.0-6
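
A note on the dmesg output above: its first line, "Invalidating snapshot: Unable to allocate exception", is the message device-mapper usually prints when an LVM snapshot has run out of copy-on-write space. After that, access to the snapshot fails with I/O errors, which can look like a disk fault. Whether that is what happened on this server is not confirmed anywhere in the thread. The Python sketch below only checks the ordering of the two messages in a log; the function names are made up for illustration.

```python
def first_index(lines, needle):
    """Return the index of the first line containing needle, or None."""
    for i, line in enumerate(lines):
        if needle in line:
            return i
    return None


def snapshot_invalidated_before_io_errors(lines):
    """True if a snapshot invalidation is logged before the first I/O error.

    This is only a hint that the I/O errors may come from an overflowed
    snapshot rather than from the disk. It proves nothing about the hardware.
    """
    invalidated = first_index(lines, "Invalidating snapshot")
    io_error = first_index(lines, "I/O error")
    if invalidated is None or io_error is None:
        return False
    return invalidated < io_error
```

If the function returns True for a night's log, it may be worth checking how full the backup snapshot gets before suspecting the drives. If I/O errors appear with no invalidation message, as in the first post, a hardware check is the more likely next step.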
 
Hi RRJ and Jekader,

We keep having the same issues. How did you solve it?
 
Oh, necroposting, I luv this :D

To solve this problem I replaced my faulty drive.
 
