Hi all,
For the past few days I have been getting an "io-error" message for one of my VMs when I try to boot it.
I can't explain why this happened; the setup has been running without problems for the last 10 months.
The host runs Proxmox 6.1-7, and the affected VM runs OpenMediaVault 4.1.x.
The host has 8 hard disks and one SSD installed; the hard disks are split into two ZFS RAID pools (volume1 and volume2).
The VM itself was set up with two virtual disks: the operating system on one, and only data on the other.
The OS disk is located on the SSD, and the data disk (vm-100-disk-0) is on volume1.
If I detach the data disk, I can boot the VM normally.
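For reference, this is roughly how I check the attached disks and detach the data disk from the command line (the disk ID scsi1 below is only a placeholder, the real ID may differ):
Code:
# Show which virtual disks are attached to VM 100:
qm config 100
# Remove the data disk from the VM config (scsi1 is just an example ID;
# the underlying volume itself is not deleted by this):
qm set 100 --delete scsi1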
So my guess is that something is wrong with the RAID, but I can't find anything out of the ordinary.
My second guess would be that the data disk on volume1 is somehow corrupt.
I've run a "zpool scrub" on volume1, but no errors were found.
Here is some info regarding the storage situation:
Code:
root@pve:/# df -h
Filesystem            Size  Used  Avail Use% Mounted on
udev                   16G     0    16G   0% /dev
tmpfs                 3.1G  9.2M   3.1G   1% /run
/dev/mapper/pve-root   57G  7.2G    47G  14% /
tmpfs                  16G   37M    16G   1% /dev/shm
tmpfs                 5.0M     0   5.0M   0% /run/lock
tmpfs                  16G     0    16G   0% /sys/fs/cgroup
/dev/nvme0n1p2        511M  304K   511M   1% /boot/efi
volume2                11T  256K    11T   1% /volume2
volume1               2.0M  256K   1.8M  13% /volume1
/dev/fuse              30M   16K    30M   1% /etc/pve
tmpfs                 3.1G     0   3.1G   0% /run/user/0

root@pve:/# lvs
  LV            VG  Attr       LSize    Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  data          pve twi-aotz-- <147.38g             39.62  2.86
  root          pve -wi-ao----   58.00g
  swap          pve -wi-ao----    8.00g
  vm-100-disk-0 pve Vwi-aotz--   60.00g  data        63.63
  vm-101-disk-0 pve Vwi-aotz--   25.00g  data        80.86

root@pve:/# vgs
  VG  #PV #LV #SN Attr   VSize   VFree
  pve   1   5   0 wz--n- 232.38g 16.00g

root@pve:/# pvs
  PV             VG  Fmt  Attr PSize   PFree
  /dev/nvme0n1p3 pve lvm2 a--  232.38g 16.00g

root@pve:/# zpool list -v
NAME        SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
volume1    36.2T  35.1T  1.13T        -         -    11%    96%  1.00x  ONLINE  -
  raidz1   36.2T  35.1T  1.13T        -         -    11%  96.9%      -  ONLINE
    sda        -      -      -        -         -      -      -      -  ONLINE
    sdb        -      -      -        -         -      -      -      -  ONLINE
    sdc        -      -      -        -         -      -      -      -  ONLINE
    sdd        -      -      -        -         -      -      -      -  ONLINE
volume2    14.5T  53.3G  14.5T        -         -     0%     0%  1.00x  ONLINE  -
  raidz1   14.5T  53.3G  14.5T        -         -     0%  0.35%      -  ONLINE
    sde        -      -      -        -         -      -      -      -  ONLINE
    sdf        -      -      -        -         -      -      -      -  ONLINE
    sdg        -      -      -        -         -      -      -      -  ONLINE
    sdh        -      -      -        -         -      -      -      -  ONLINE

root@pve:/# zpool status -v
  pool: volume1
 state: ONLINE
  scan: scrub repaired 0B in 0 days 19:38:40 with 0 errors on Wed Feb 26 13:07:48 2020
config:

        NAME        STATE     READ WRITE CKSUM
        volume1     ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            sda     ONLINE       0     0     0
            sdb     ONLINE       0     0     0
            sdc     ONLINE       0     0     0
            sdd     ONLINE       0     0     0

errors: No known data errors

  pool: volume2
 state: ONLINE
  scan: scrub repaired 0B in 0 days 00:00:02 with 0 errors on Sun Feb  9 00:24:05 2020
config:

        NAME        STATE     READ WRITE CKSUM
        volume2     ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            sde     ONLINE       0     0     0
            sdf     ONLINE       0     0     0
            sdg     ONLINE       0     0     0
            sdh     ONLINE       0     0     0

errors: No known data errors
Something that seems strange to me: why does "df -h" show a much smaller size for volume1? Is the volume not mounted correctly?
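To double-check this, I could run something like the following (just a sketch of the commands, I haven't included the output here):
Code:
# Check whether volume1 is actually mounted, and how much space ZFS itself
# reports for the pool's datasets and zvols:
zfs get mounted,mountpoint volume1
zfs list -o name,used,avail,refer,mountpoint -r volume1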
Now to my question: what can I do to fix this problem? If it cannot be fixed, is there any way I can save my data?
Thanks in advance for your help!
Sorry if I forgot to add any important information.