io-error

julienW

Member
Dec 22, 2022
Hi

All my VMs on a Proxmox server show the orange triangle with an io-error status and are frozen.
It seems that my VMs cannot access their hard drives.

At Proxmox boot I see the following messages:
Bash:
Mar 11 09:30:24 proxGPU2 systemd[1]: zfs-import@capacity.service: Main process exited, code=exited, status=1/FAILURE
Mar 11 09:30:24 proxGPU2 systemd[1]: zfs-import@capacity.service: Failed with result 'exit-code'.
Mar 11 09:30:24 proxGPU2 systemd[1]: Failed to start Import ZFS pool capacity.
Mar 11 09:30:35 proxGPU2 kernel:  zd0: p1
Mar 11 09:30:35 proxGPU2 kernel:  zd16: p1 p2 < p5 >
Mar 11 09:30:35 proxGPU2 kernel:  zd32: p1
Mar 11 09:30:35 proxGPU2 kernel:  zd48: p1
Mar 11 09:30:36 proxGPU2 kernel:  zd64: p1 p2 < p5 p6 > p3
Mar 11 09:30:36 proxGPU2 kernel:  zd96: p1 p2
Mar 11 09:30:36 proxGPU2 kernel:  zd112: p1
Mar 11 09:30:36 proxGPU2 kernel:  zd144: p1 p2 p3
Mar 11 09:30:36 proxGPU2 kernel:  zd160: p1
Mar 11 09:30:37 proxGPU2 kernel:  zd176: p1
Mar 11 09:30:37 proxGPU2 kernel:  zd208: p1
Mar 11 09:30:37 proxGPU2 kernel:  zd240: p1 p2
Mar 11 09:30:37 proxGPU2 systemd[1]: Finished Import ZFS pools by cache file.
Mar 11 09:30:37 proxGPU2 systemd[1]: Reached target ZFS pool import target.
Mar 11 09:30:37 proxGPU2 systemd[1]: Starting Mount ZFS filesystems...
Mar 11 09:30:37 proxGPU2 systemd[1]: Starting Wait for ZFS Volume (zvol) links in /dev...
Mar 11 09:30:37 proxGPU2 zvol_wait[4052]: Testing 16 zvol links
Mar 11 09:30:37 proxGPU2 zvol_wait[4052]: All zvol links are now present.
Mar 11 09:30:37 proxGPU2 systemd[1]: Finished Wait for ZFS Volume (zvol) links in /dev.

In my web interface I can see my pool.

Any ideas on how to investigate these errors?

Thanks in advance
 
Hi,
About the zfs-import@capacity.service failure in your boot log: that likely just means that the pool was already imported via the cache file, not via the zfs-import@capacity.service service.
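If you want to verify that (a quick sketch, assuming the pool is really named capacity as in your logs):
Bash:
# Compare the two import paths: the per-pool unit that failed and the cache-file unit
systemctl status zfs-import@capacity.service
systemctl status zfs-import-cache.service

# Confirm the pool is registered in the cache file
zpool get cachefile capacity
A failing per-pool unit is harmless if the cache-file unit already imported the pool, since a second import attempt of an already-imported pool simply exits with an error.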
What does zpool status -v show? Please share the system logs/journal from around the time the issue happened (or the full system logs for the current boot).
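For reference, something like this collects what's being asked for (the time window below is only an example based on the timestamps in your log):
Bash:
# Current pool health, including per-file error details if any
zpool status -v

# Full journal for the current boot
journalctl -b

# Or narrow it to the time window around the incident (adjust the times)
journalctl --since "09:20" --until "09:40"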
 
Since the pool is visible in the web interface, the import itself seems to have worked via the cache file, but there might still be an issue with access.

Running zpool status -v is a good first step to check for errors. Also, try zfs list to see if the datasets are accessible. If the pool is degraded or has missing devices, that could explain the IO errors. If zpool status doesn't show anything unusual, checking the full system logs (journalctl -xe or dmesg) could help identify the cause.
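A minimal sequence along those lines (assuming the pool name capacity from the logs above):
Bash:
# Datasets and mountpoints, to confirm they are accessible
zfs list -o name,used,avail,mountpoint

# Kernel messages mentioning ZFS or I/O errors
dmesg | grep -iE 'zfs|i/o error'

# Recent journal entries with extra explanations
journalctl -xe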