Troubleshoot io-error status

flucimuc

Feb 27, 2020
Hi all,

For a few days now I have been getting an "io-error" status for one of my VMs when I try to boot it.
I can't explain why this happened; this setup has been running without problems for the last 10 months.

The host runs Proxmox 6.1-7, and the VM that has the problem runs OpenMediaVault 4.1.x.
The host has 8 hard disks and one SSD installed; the hard disks are split into two ZFS RAID pools (volume1 and volume2).

The VM itself was set up with two virtual disks: one holds the operating system, the other only data.
The OS disk is located on the SSD and the data disk (vm-100-disk-0) is on volume1.

If I detach the data disk, I can boot the VM normally.
So my guess is that something is wrong with the RAID, but I can't find anything out of the ordinary.
My second guess would be that the data disk on volume1 is somehow corrupt.
I've run a "zpool scrub" on volume1, but no errors were found.
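For reference, this is roughly how I started the scrub and checked the result afterwards:

Code:
root@pve:/# zpool scrub volume1        # start a scrub of the pool
root@pve:/# zpool status -v volume1    # watch scrub progress and look for READ/WRITE/CKSUM errors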

Here is some info regarding the storage situation:

Code:
root@pve:/# df -h
Filesystem            Size  Used Avail Use% Mounted on
udev                   16G     0   16G   0% /dev
tmpfs                 3.1G  9.2M  3.1G   1% /run
/dev/mapper/pve-root   57G  7.2G   47G  14% /
tmpfs                  16G   37M   16G   1% /dev/shm
tmpfs                 5.0M     0  5.0M   0% /run/lock
tmpfs                  16G     0   16G   0% /sys/fs/cgroup
/dev/nvme0n1p2        511M  304K  511M   1% /boot/efi
volume2                11T  256K   11T   1% /volume2
volume1               2.0M  256K  1.8M  13% /volume1
/dev/fuse              30M   16K   30M   1% /etc/pve
tmpfs                 3.1G     0  3.1G   0% /run/user/0
root@pve:/# lvs
  LV            VG  Attr       LSize    Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  data          pve twi-aotz-- <147.38g             39.62  2.86
  root          pve -wi-ao----   58.00g
  swap          pve -wi-ao----    8.00g
  vm-100-disk-0 pve Vwi-aotz--   60.00g data        63.63
  vm-101-disk-0 pve Vwi-aotz--   25.00g data        80.86
root@pve:/# vgs
  VG  #PV #LV #SN Attr   VSize   VFree
  pve   1   5   0 wz--n- 232.38g 16.00g
root@pve:/# pvs
  PV             VG  Fmt  Attr PSize   PFree
  /dev/nvme0n1p3 pve lvm2 a--  232.38g 16.00g
root@pve:/# zpool list -v
NAME        SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
volume1    36.2T  35.1T  1.13T        -         -    11%    96%  1.00x    ONLINE  -
  raidz1   36.2T  35.1T  1.13T        -         -    11%  96.9%      -  ONLINE
    sda        -      -      -        -         -      -      -      -  ONLINE
    sdb        -      -      -        -         -      -      -      -  ONLINE
    sdc        -      -      -        -         -      -      -      -  ONLINE
    sdd        -      -      -        -         -      -      -      -  ONLINE
volume2    14.5T  53.3G  14.5T        -         -     0%     0%  1.00x    ONLINE  -
  raidz1   14.5T  53.3G  14.5T        -         -     0%  0.35%      -  ONLINE
    sde        -      -      -        -         -      -      -      -  ONLINE
    sdf        -      -      -        -         -      -      -      -  ONLINE
    sdg        -      -      -        -         -      -      -      -  ONLINE
    sdh        -      -      -        -         -      -      -      -  ONLINE
root@pve:/# zpool status -v
  pool: volume1
 state: ONLINE
  scan: scrub repaired 0B in 0 days 19:38:40 with 0 errors on Wed Feb 26 13:07:48 2020
config:

        NAME        STATE     READ WRITE CKSUM
        volume1     ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            sda     ONLINE       0     0     0
            sdb     ONLINE       0     0     0
            sdc     ONLINE       0     0     0
            sdd     ONLINE       0     0     0

errors: No known data errors

  pool: volume2
 state: ONLINE
  scan: scrub repaired 0B in 0 days 00:00:02 with 0 errors on Sun Feb  9 00:24:05 2020
config:

        NAME        STATE     READ WRITE CKSUM
        volume2     ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            sde     ONLINE       0     0     0
            sdf     ONLINE       0     0     0
            sdg     ONLINE       0     0     0
            sdh     ONLINE       0     0     0

errors: No known data errors

One thing that seems strange to me: why does "df -h" show a much smaller size for volume1? Is the volume not mounted correctly?
Now to my question: what can I do to fix this problem? If it cannot be fixed, is there any way I can save my data?

Thanks in advance for your help!
Sorry if I forgot to add some important information.
 
volume1 is at 96% capacity and has most probably stopped working. Your ZFS pool is too full, and you will get I/O errors when writing to it.
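A quick way to confirm where the space went (and whether there are snapshots that could be destroyed to free some of it) would be roughly:

Code:
root@pve:/# zpool list volume1               # overall size, allocated space and capacity of the pool
root@pve:/# zfs list -o space -r volume1     # usage broken down into data, snapshots and children
root@pve:/# zfs list -t snapshot -r volume1  # snapshots that could be removed to regain space

Freeing even a few percent should let the pool accept writes again.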
 
I was able to fix the problem by deleting volume2 and adding its 4 disks to the zpool volume1.
I have not fully understood yet why volume1 was already full, but at least the problem is solved for now and I can boot the VM and access my data.
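
For anyone running into the same thing, the steps were roughly the following, assuming the four freed disks are added as another raidz1 vdev (matching the existing layout) and that nothing on volume2 needs to be kept - double-check the device names against your own setup first:

Code:
root@pve:/# zpool destroy volume2                      # removes the second pool - all data on it is gone!
root@pve:/# zpool add volume1 raidz sde sdf sdg sdh    # add the freed disks as an additional raidz vdev to volume1
root@pve:/# zpool list -v volume1                      # verify the new vdev and the extra free space

If volume2 was also defined as a storage in Proxmox, that entry needs to be removed there as well (via the GUI or pvesm remove).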

Thanks to all who helped me.
 
Great catch, @LnxBil - I did not spot that one either :/
Doh. It seems we (humans) need to make the same mistakes over and over again ... :rolleyes:
 
