I/O Errors - Possible Corruption

Craig Lowe

Renowned Member
Feb 7, 2016
I recently got Proxmox up and running after months of deliberating over which hypervisor to use. Last night I had almost finished transferring my applications to containers, and the system was almost fully configured and running. I was happy with the progress and decided to call it a night for server configuration. An hour later I tried to watch a film on one of my HTPCs and the NFS wasn't responding, so I went up to look at the server and found that the main VM had stopped due to an I/O error. I tried several times to restart the VM, but each time it stopped with an I/O error before it finished booting. At that point I decided to restart the server, and since then I have been unable to get almost anything up and running. The server repeatedly displays the following errors:

Buffer I/O error on dev loop0, logical block XXXX, lost async page write
EXT4-fs (loop0): error loading journal
loop: Write error at byte offset XXXXXX, length 4096
blk_update_request: I/O error, dev loop0, sector XXXXX

Thousands of these errors are displayed. I cannot load the management web console or SSH into the system, and none of the LXCs or VMs appear to be running either. I have tried to umount the /dev/pve-mapper/data drive and fsck it, but so far I have been unable to get it to unmount. I also tried removing it from /etc/fstab and running fsck after a reboot, but it said there was nothing wrong with the drive.

I really am at a dead end with this. I am not new to Linux, but I am obviously very new to Proxmox, and my online searching has led me to a complete stalemate in trying to rectify this problem, so any help at all is much appreciated!

Thank you,
Craig.

UPDATE:

I have just run ifdown and ifup on the network interfaces, and I can now access the web management and SSH into the server. I have run pveversion -v, as support staff always seem to ask for it first, so...

root@pve:~# pveversion -v
proxmox-ve: 4.1-26 (running kernel: 4.2.6-1-pve)
pve-manager: 4.1-1 (running version: 4.1-1/2f9650d4)
pve-kernel-4.2.6-1-pve: 4.2.6-26
lvm2: 2.02.116-pve2
corosync-pve: 2.3.5-2
libqb0: 0.17.2-1
pve-cluster: 4.0-29
qemu-server: 4.0-41
pve-firmware: 1.1-7
libpve-common-perl: 4.0-41
libpve-access-control: 4.0-10
libpve-storage-perl: 4.0-38
pve-libspice-server1: 0.12.5-2
vncterm: 1.2-1
pve-qemu-kvm: 2.4-17
pve-container: 1.0-32
pve-firewall: 2.0-14
pve-ha-manager: 1.0-14
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u1
lxc-pve: 1.1.5-5
lxcfs: 0.13-pve1
cgmanager: 0.39-pve1
criu: 1.6.0-1
zfsutils: 0.6.5-pve6~jessie
 
Hi,
sounds strange!

What kind of underlying storage system is it? RAID (controller/level)? Single disk (type, SMART values)?

Any special mount options? (What does the output of "mount" look like?)

Udo
 
Sorry for the extremely late reply.

As it turns out, yes, I had run out of disk space on the host >.<. Complete noob mistake! I moved the host to smaller drives and didn't shrink the VMs.
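
For anyone who lands here with the same loop0 errors, below is a rough sketch of the check that would have caught this. It compares how large the disk images claim to be (apparent size) with how much space they actually occupy and how much the filesystem has left. The function names are made up for illustration, and the directory you pass in is an assumption: on a default install the local image store is usually /var/lib/vz/images, but check your own storage configuration. It also assumes Linux, where st_blocks is counted in 512-byte units.

```python
import os
import shutil
import stat


def image_sizes(root):
    """Return (apparent_bytes, allocated_bytes) summed over regular files under root."""
    apparent = 0
    allocated = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks, sockets and other special files
            apparent += st.st_size
            # Assumption: Linux reports st_blocks in 512-byte units.
            allocated += st.st_blocks * 512
    return apparent, allocated


def overcommitted(root):
    """True if the images could still grow by more than the free space left."""
    apparent, allocated = image_sizes(root)
    free = shutil.disk_usage(root).free
    return (apparent - allocated) > free
```

Calling overcommitted("/var/lib/vz/images") on the host (path assumed, as above) returns True when sparse images could outgrow the disk, which is the situation I ended up in after moving to smaller drives. It only looks at file-based images, not LVM volumes.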

Thanks!
 
Happens to the best of us :-) Well, it's happened to me anyway - and on a production system with the whole world screaming at me. Disk space was literally the last thing I thought of.

You'd think that seeing as this is the most basic issue to befall a virtual machine host, somebody would have worked out a way of showing the admin a message that said maybe, "Out of disk space."
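
Until something like that exists, a do-it-yourself version is not much code. This is only a minimal sketch, not anything Proxmox ships: the function name, the 10% threshold and the message wording are all my own assumptions, and you would still need to run it from cron or similar and send the result somewhere you will see it.

```python
import shutil


def space_warning(path, min_free_fraction=0.10):
    """Return a warning string if the filesystem holding path is low on space, else None."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return None  # pseudo-filesystem with no capacity to measure
    free_fraction = usage.free / usage.total
    if free_fraction < min_free_fraction:
        return "Out of disk space soon: {} has {:.0%} free".format(path, free_fraction)
    return None
```

For example, space_warning("/") returns None while more than 10% of the root filesystem is free, and a message once it drops below that.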
 
Yeah, I wondered why there was no specific error message, but then again, if you're running an advanced server platform you should probably know what you're doing >.> haha. It's so easy to overlook the basics, but lesson learnt!