io error on Truenas VM

Bronz · Mar 6, 2023

Hello, I'm a newbie here. I just got into Proxmox and I'm trying to build a home server on an old laptop (HP Pavilion dv6-7080ee) for simple SMB file sharing and a movie database (normal file sharing and Jellyfin, each in a separate VM). I am using 4 external USB HDDs (3 x 500 GB, 1 x 1 TB). I found a tutorial on how to pass through HDDs without an HBA. I occasionally encounter an io-error, and TrueNAS is stuck until I stop the VM and start it again. This happens after 1-2 days of VM runtime.

Dunuin · Mar 6, 2023

Did you run smartctl -a /dev/yourDisk to see how healthy your disks are? In case of 3.5" HDDs they can get really hot when running 24/7. Also make sure not to stack them, as vibrations can cause write/read errors too.
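As a rough sketch of the check suggested here, the loop below prints the overall SMART health verdict for several disks in one go. The function name `check_disks` is hypothetical, and `-d sat` is an assumption for USB-SATA enclosures; drop it if smartctl detects the device type on its own.

```shell
# Print the overall SMART health verdict for each disk given as an argument.
# "-d sat" is an assumption for USB-SATA bridges, not a required option.
check_disks() {
    for dev in "$@"; do
        printf '%s: ' "$dev"
        # -H asks smartctl for the overall-health self-assessment result
        smartctl -H -d sat "$dev" | grep -i 'overall-health' \
            || echo 'no health verdict returned'
    done
}
```

It could be called as `check_disks /dev/sda /dev/sdb`, with the device names replaced by the actual disks.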

fiona · Mar 6, 2023

Hi,
apart from what @Dunuin said, I'd also check /var/log/syslog from around the time the issue occurs. There might be related messages telling you more. Please also share the output of qm config <ID> for the affected VM and pveversion -v.
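A minimal sketch of how that information might be gathered on the host. The function names `scan_log` and `collect_info` are hypothetical, and the pattern list is only an assumption about which messages tend to accompany disk or USB trouble; it is not exhaustive.

```shell
# Print log lines that often accompany disk or USB trouble.
# The pattern list is an assumption and is not exhaustive.
scan_log() {
    grep -E -i 'I/O error|USB disconnect|removed ATA device|reconnected ATA device' \
        "${1:-/var/log/syslog}"
}

# Save the VM config and the package versions to one file for posting.
collect_info() {
    { qm config "$1"; pveversion -v; } > "$2"
}
```

For example, `scan_log /var/log/syslog` narrows a large log down to candidate lines, and `collect_info <ID> vm-info.txt` writes both command outputs to a single attachment.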

Bronz · Mar 6, 2023

Dunuin said:
Did you run smartctl -a /dev/yourDisk to see how healthy your disks are? In case of 3.5" HDDs they can get really hot when running 24/7. Also make sure not to stack them, as vibrations can cause write/read errors too.

Hello Dunuin, I attached the output, and no, they are not stacked.

Bronz · Mar 6, 2023

fiona said:
Hi,
apart from what @Dunuin said, I'd also check /var/log/syslog from around the time the issue occurs. There might be related messages telling you more. Please also share the output of qm config <ID> for the affected VM and pveversion -v.

Hello Fiona, thanks for the reply too. The syslog was too large to upload, so I had to omit the recurring log message " Mar 04 00:00:47 proxmox kernel: CIFS: VFS: \\192.168.178.29\HomeServer BAD_NETWORK_NAME: \\192.168.178.29\HomeServer " to be able to upload the text file. I also included the qm config output and pveversion -v in the same text file.

apmuthu · Mar 6, 2023

# tail -n15 /var/log/syslog

Code:

Mar  6 20:03:10 ipa kernel: [2191387.109617] audit: type=1400 audit(1678113190.636:72732): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-102_</var/lib/lxc>" name="/run/systemd/unit-root/proc/" pid=4433 comm="(netdata)" fstype="proc" srcname="proc" flags="rw, nosuid, nodev, noexec"

Any worries?

Dunuin · Mar 6, 2023

The Seagate Momentus reports a lot of raw read and seek errors, as well as loads of data that had to be fixed by ECC.

Bronz · Mar 6, 2023

Dunuin said:
The Seagate Momentus reports a lot of raw read and seek errors, as well as loads of data that had to be fixed by ECC.

So should I use ECC memory? Or what does that mean?

Dunuin · Mar 6, 2023

Bronz said:
So should I use ECC memory? Or what does that mean?

It's always better to use ECC memory, but that is not what is meant there.
The disk itself has ECC to fix minor errors, and that counter shows how much data was fixed by it.
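To look at just those counters, something like the sketch below could be used. The function name `smart_attrs` is hypothetical, `-d sat` is again an assumption for USB enclosures, and the two sector-count attributes are added here as an assumption about what else is worth watching. Not every drive reports every attribute, and raw values are vendor-specific, so treat the numbers as a hint rather than a verdict.

```shell
# Print the SMART attributes discussed above for one disk.
# Names follow smartctl's usual labels; a drive may not report all of them.
smart_attrs() {
    smartctl -A -d sat "$1" | grep -E \
        'Raw_Read_Error_Rate|Seek_Error_Rate|Hardware_ECC_Recovered|Reallocated_Sector_Ct|Current_Pending_Sector'
}
```

Usage would be `smart_attrs /dev/yourDisk`.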

fiona · Mar 7, 2023

When did the error occur? In the syslog I can see

Code:

Mar 04 19:47:46 proxmox smartd[773]: Device: /dev/sdc [SAT], removed ATA device: No such device
Mar 04 19:47:46 proxmox smartd[773]: Device: /dev/sdd [SAT], reconnected ATA device

but no direct messages related to IO errors.

Next time it happens, the following script should tell you which drive got the error:

Code:

root@pve701 ~ # cat query-block.pm
#!/bin/perl

use strict;
use warnings;

use PVE::QemuServer::Monitor qw(mon_cmd);

my $vmid = shift or die "need to specify vmid\n";

my $res = eval { mon_cmd($vmid, "query-block" ) };
die $@ if $@;
for my $blockdev ($res->@*) {
    print $blockdev->{device} . " got status " . $blockdev->{'io-status'} . "\n";
}

root@pve701 ~ # perl query-block.pm 169
drive-ide2 got status ok
drive-scsi0 got status nospace
drive-scsi1 got status ok
drive-scsi2 got status ok

Bronz · Mar 24, 2023

Hi again, the io-error appeared again. Can you help me with the code? How can I use it? The VM ID is 101; it's my TrueNAS server.

fiona · Mar 24, 2023

Bronz said:
Hi again, the io-error appeared again. Can you help me with the code? How can I use it? The VM ID is 101; it's my TrueNAS server.

Copy the contents

Code:

#!/bin/perl

use strict;
use warnings;

use PVE::QemuServer::Monitor qw(mon_cmd);

my $vmid = shift or die "need to specify vmid\n";

my $res = eval { mon_cmd($vmid, "query-block" ) };
die $@ if $@;
for my $blockdev ($res->@*) {
    print $blockdev->{device} . " got status " . $blockdev->{'io-status'} . "\n";
}

to a file called query-block.pm and then run it with perl query-block.pm 101
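With several drives attached, it can help to show only the ones that are not ok. A small sketch, assuming the script prints lines in the "<device> got status <state>" form shown earlier; the function name `failed_drives` is hypothetical.

```shell
# Read "<device> got status <state>" lines on stdin and print only
# the drives whose state is something other than "ok".
failed_drives() {
    awk 'NF >= 4 && $2 == "got" && $3 == "status" && $4 != "ok" { print $1, $4 }'
}
```

It would be used as `perl query-block.pm 101 | failed_drives`.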

Bronz · Mar 24, 2023

Thanks for the swift reply. I got this from the script:

root@proxmox:~# perl query-block.pm 101
drive-ide2 got status ok
drive-scsi0 got status ok
drive-scsi1 got status ok
drive-scsi2 got status nospace
drive-scsi3 got status ok
drive-scsi4 got status ok
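To see which physical disk sits behind a name such as drive-scsi2, the matching line of the VM config can be looked up. This is a sketch that assumes the QEMU drive name is the config key with a "drive-" prefix, as the output above suggests; the function name `drive_backing` is hypothetical.

```shell
# Print the VM config line behind a QEMU drive name such as "drive-scsi2".
drive_backing() {
    vmid=$1
    key=${2#drive-}          # "drive-scsi2" -> "scsi2"
    qm config "$vmid" | grep "^${key}:"
}
```

For this VM that would be `drive_backing 101 drive-scsi2`.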

fiona · Mar 27, 2023

Bronz said:
drive-scsi2 got status nospace

Now we know it was this drive (in this case). Is the drive actually full? Is it still using aio=threads? AFAICT, QEMU will interpret short writes (which, in practice, almost never happen) as out-of-space for aio=io_uring and aio=native, but not for aio=threads.
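One way to check which aio mode each disk currently uses is to read it from the VM config. A sketch, with the hypothetical function name `aio_modes`; "default" here only means that no aio= option is set on that line, not any particular mode.

```shell
# List each disk line of a VM config with its aio mode, or "default"
# when no aio= option is set on that line.
aio_modes() {
    qm config "$1" | awk -F': ' '/^(scsi|sata|ide|virtio)[0-9]+: / {
        mode = "default"
        if (match($0, /aio=[a-z_]+/)) mode = substr($0, RSTART + 4, RLENGTH - 4)
        print $1, mode
    }'
}
```

Running `aio_modes 101` prints one line per disk, for example the key followed by threads, native, io_uring, or default.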

Bronz · Mar 27, 2023

fiona said:
Now we know it was this drive (in this case). Is the drive actually full? Is it still using aio=threads? AFAICT, QEMU will interpret short writes (which, in practice, almost never happen) as out-of-space for aio=io_uring and aio=native, but not for aio=threads.

Hello again, yes, I am using aio=threads, but the drive is not full; in fact, it has more than 400 GB free.
