USB drive becomes inaccessible after device changes from /dev/sda to /dev/sdb

davidand · Jun 10, 2020

I have a problem with external storage on Proxmox.

I added a USB drive as storage to my Proxmox host a few months ago, this way:

1) Connected the USB drive to the Proxmox machine and added it as LVM storage on /dev/sda:

2) Created a new 1000 GB virtual hard drive on it:

Everything was working fine.

3) Until one day, after a reboot, this external drive became /dev/sdb:

4) It is still visible in Proxmox UI:

5) However, it's not accessible from inside the VM - it fails with an Input/output error:

Bash:

user@unifi-video:~$ lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0    7:0    0   97M  1 loop /snap/core/9289
loop1    7:1    0 93.9M  1 loop /snap/core/9066
sr0     11:0    1 1024M  0 rom
vda    252:0    0   16G  0 disk
├─vda1 252:1    0    1M  0 part
└─vda2 252:2    0   16G  0 part /
vdb    252:16   0 1000G  0 disk

user@unifi-video:~$ sudo cat /dev/vdb
cat: /dev/vdb: Input/output error

Any ideas how to fix this?

LnxBil · Jun 11, 2020

With LVM, it does not matter what underlying name the device has. The VM would also not start if the device it uses is not accessible.
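To see for yourself that the device name is not what LVM keys on, you can ask LVM which physical volume currently backs a volume group. This is a minimal sketch, not an official procedure: `pv_for_vg` is a hypothetical helper name, and `wd3tb` is the VG name that shows up in the lvs output further down this thread. LVM locates physical volumes by the UUID in their on-disk label, so whichever /dev/sdX name the kernel assigns, the VG still resolves.

```shell
# Print the physical volume(s) currently backing a volume group.
# LVM finds PVs by the UUID in their on-disk label, so the /dev/sdX name
# printed here may differ between boots without affecting the VG.
pv_for_vg() {
    pvs --noheadings -o pv_name,vg_name | awk -v vg="$1" '$2 == vg { print $1 }'
}

# On the PVE host, as root (substitute your own VG name):
# pv_for_vg wd3tb
```

If this prints the USB drive under its new name, the sda-to-sdb change is unlikely to be the cause by itself.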

Next would be to check if the disk itself has problems, so please post the output of the following commands in CODE tags:
- dmesg (on the pve host)
- lvs (on the pve host)

davidand · Jun 11, 2020

Hi @LnxBil - thanks for your response. I was also afraid that the disk might have problems, although I don't see any indication in S.M.A.R.T., it's a pretty new drive, and the machine has never lost power.

dmesg: (nothing else - just the following repeating indefinitely)

Code:

[2734239.386950] mce: CPU2: Core temperature above threshold, cpu clock throttled (total events = 294487)
[2734239.386951] mce: CPU6: Core temperature above threshold, cpu clock throttled (total events = 294451)
[2734239.386952] mce: CPU6: Package temperature above threshold, cpu clock throttled (total events = 456604)
[2734239.386985] mce: CPU4: Package temperature above threshold, cpu clock throttled (total events = 456370)
[2734239.386986] mce: CPU5: Package temperature above threshold, cpu clock throttled (total events = 456402)
[2754159.521981] mce: CPU0: Package temperature/speed normal
[2754159.521981] mce: CPU1: Package temperature/speed normal
[2754159.521982] mce: CPU4: Package temperature/speed normal
[2754159.521983] mce: CPU7: Package temperature/speed normal
[2754159.521983] mce: CPU3: Package temperature/speed normal
[2754159.522184] mce: CPU6: Core temperature/speed normal
[2754159.522770] mce: CPU2: Core temperature/speed normal
[2754159.523289] mce: CPU6: Package temperature/speed normal
[2754159.523887] mce: CPU2: Package temperature/speed normal
[2754459.523036] mce: CPU6: Core temperature above threshold, cpu clock throttled (total events = 295476)
[2754459.523036] mce: CPU2: Core temperature above threshold, cpu clock throttled (total events = 295510)
[2754459.523038] mce: CPU2: Package temperature above threshold, cpu clock throttled (total events = 457935)
[2754459.523072] mce: CPU4: Package temperature above threshold, cpu clock throttled (total events = 457679)
[2754459.523073] mce: CPU5: Package temperature above threshold, cpu clock throttled (total events = 457711)
[2754459.523074] mce: CPU1: Package temperature above threshold, cpu clock throttled (total events = 457722)
[2754459.523074] mce: CPU0: Package temperature above threshold, cpu clock throttled (total events = 457679)
[2754459.523076] mce: CPU7: Package temperature above threshold, cpu clock throttled (total events = 457714)
[2754459.523076] mce: CPU3: Package temperature above threshold, cpu clock throttled (total events = 457714)
[2754459.523619] mce: CPU6: Package temperature above threshold, cpu clock throttled (total events = 457917)
[2754459.524170] mce: CPU0: Package temperature/speed normal
[2754459.524170] mce: CPU4: Package temperature/speed normal
[2754459.524171] mce: CPU5: Package temperature/speed normal
[2754459.524173] mce: CPU7: Package temperature/speed normal
[2754459.524174] mce: CPU1: Package temperature/speed normal
[2754459.524175] mce: CPU3: Package temperature/speed normal
[2754459.524183] mce: CPU2: Core temperature/speed normal
[2754459.524772] mce: CPU6: Core temperature/speed normal

lvs:

Code:

  LV                           VG    Attr       LSize    Pool Origin                       Data%  Meta%  Move Log Cpy%Sync Convert
  data                         pve   twi-aotz-- <349.31g                                   27.11  1.74
  root                         pve   -wi-ao----   96.00g
  snap_vm-100-disk-0_beforenpm pve   Vri---tz-k   64.00g data
  snap_vm-101-disk-0_beforenpm pve   Vri---tz-k   16.00g data
  swap                         pve   -wi-ao----    8.00g
  vm-100-disk-0                pve   Vwi-aotz--   64.00g data snap_vm-100-disk-0_beforenpm 96.38
  vm-100-state-beforenpm       pve   Vwi-a-tz--   <8.49g data                              40.86
  vm-101-disk-0                pve   Vwi-aotz--   16.00g data snap_vm-101-disk-0_beforenpm 36.99
  vm-101-state-beforenpm       pve   Vwi-a-tz--   <2.49g data                              39.16
  vm-102-disk-0                pve   Vwi-aotz--   16.00g data                              82.61
  vm-103-disk-0                pve   Vwi-aotz--   16.00g data                              45.37
  vm-102-disk-0                wd3tb -wi-ao---- 1000.00g

I don't see anything wrong here.

The only thing I don't like is that the logical volume format is `raw` (on my screen #4 above) - but honestly, I don't know what it looked like before the problem happened.

LnxBil · Jun 12, 2020

davidand said:
The only thing I don't like is that the logical volume format is `raw` (on my screen #4 above) - but honestly, I don't know what it looked like before the problem happened.

LVM is always raw, which means that there is no file involved and the data is stored directly on a block device, which is normally much faster than going through an additional file layer.

Could you please run

Code:

dmesg | grep -v mce

Also please "run" into the problem in the VM before, so that we may get additional information. Can you also post possible relevant entries from the dmesg of your guest?

davidand · Jun 12, 2020

Hi @LnxBil - sure, here it is:

1) dmesg | grep -v mce - is totally empty on the Proxmox host. dmesg from the host is attached in the dmesg_host.txt file

2) dmesg from the guest is attached in the dmesg_guest.txt file

3) I'm sorry but what do you mean by 'Also please "run" into the problem in the VM before, so that we may get additional information.'?

Thanks!

LnxBil · Jun 13, 2020

davidand said:
3) I'm sorry but what do you mean by 'Also please "run" into the problem in the VM before, so that we may get additional information.'?

It is best to collect the log files after you have reproduced the problem, so that any related errors are captured. Without any log files, we cannot proceed.

You have to go through your /var/log/syslog* files and strip out all the mce entries; maybe something will turn up.
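One way to do that filtering is a small shell filter. This is only a sketch: `strip_mce` is a hypothetical helper name, and the pattern assumes the thermal messages all look like the `mce: CPU...` lines posted earlier in the thread.

```shell
# Drop the mce thermal-throttling lines from whatever is piped in,
# leaving everything else (disk, USB, LVM messages) untouched.
strip_mce() {
    grep -v 'mce: CPU'
}

# On the PVE host; zcat -f reads plain and gzip-compressed rotations alike:
# zcat -f /var/log/syslog* | strip_mce | less
```

Reproduce the error in the guest first (for example with the `cat /dev/vdb` from the first post), then look at the most recent lines that remain.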

davidand · Jun 14, 2020

Hi @LnxBil - the problem is actually present all the time, the virtual drive never ever connects to the guest.

However, I opened /var/log/syslog on both the **host** and **guest** machines, then issued a cat /dev/vdb (which is the virtual disk), and below is the result - the lines in red were added to the guest's syslog.

Is that of any help?

The disk should be healthy on the proxmox host:

LnxBil · Jun 15, 2020

Thank you for the detailed answer. Unfortunately, I cannot see any problems.

Can you try to attach the disk as SCSI (with the VirtIO SCSI controller) instead of VirtIO? SCSI is the preselected default for Linux VMs.

What OS version are you running in that specific VM? Maybe we run into this problem. Could you please post the output of pveversion -v.

davidand · Jun 15, 2020

Hi @LnxBil - you are a genius, thanks!

Confirming that detaching the Virtio virtual disk and reattaching it as SCSI worked! If it helps anyone, here is how to do it:

For the record, my guest VM is Ubuntu 18.04.4 LTS
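The same detach-and-reattach can also be done from the host shell with `qm`. This is a minimal sketch under stated assumptions, not necessarily the exact steps used above: `reattach_as_scsi` is a hypothetical helper, VM ID 102 and the volume name are inferred from the `vm-102-disk-0` line in the lvs output, the storage ID is assumed to match the VG name `wd3tb`, and the slot names `virtio1` and `scsi1` are guesses. Check `qm config <vmid>` for the real values before running anything.

```shell
# Move a disk from a VirtIO Block slot to a SCSI slot on the Proxmox host.
# All four arguments are placeholders; take the real ones from `qm config <vmid>`.
reattach_as_scsi() {
    vmid=$1 old_slot=$2 new_slot=$3 volume=$4
    # Detach only: without --force the volume is kept and listed as "unusedN".
    qm set "$vmid" --delete "$old_slot" || return 1
    # Select the VirtIO SCSI controller and attach the same volume to a SCSI slot.
    qm set "$vmid" --scsihw virtio-scsi-pci "--$new_slot" "$volume"
}

# Hypothetical values for this thread, with the VM shut down first:
# qm shutdown 102
# reattach_as_scsi 102 virtio1 scsi1 wd3tb:vm-102-disk-0
# qm start 102
```

Inside the guest the disk then shows up as /dev/sdX rather than /dev/vdb, so anything that refers to the old device name needs updating.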

davidand · Jun 15, 2020

BTW Can I mark this thread as 'Solved' somehow?

LnxBil · Jun 18, 2020

Nice that it worked out. In addition to just "replugging" the disk, you also need to change the boot order if this disk is your boot disk. Just for people finding this thread in the future.

davidand · Jun 19, 2020

@LnxBil thanks. For me it was a secondary disk and the UUID of the disk remained the same (which I found strange), so after replugging, /etc/fstab just started working.

LnxBil · Jun 19, 2020

davidand said:
@LnxBil thanks. For me it was a secondary disk and the UUID of the disk remained the same (which I found strange), so after replugging, /etc/fstab just started working.

The UUID is stored on the disk itself (in the filesystem metadata), so it stays the same no matter how the disk is attached.
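This is also why fstab entries written with `UUID=` survive a bus change while entries written with a device name do not: a VirtIO Block disk appears in the guest as /dev/vdX, a SCSI disk as /dev/sdX. A minimal sketch for spotting entries at risk before switching; `fstab_by_name` is a hypothetical helper, not a standard tool.

```shell
# List fstab entries that mount by kernel device name instead of UUID/LABEL.
# Reads /etc/fstab unless another file is given as the first argument.
fstab_by_name() {
    awk '$1 ~ /^\/dev\/(vd|sd)[a-z]/ { print $1, $2 }' "${1:-/etc/fstab}"
}

# In the guest:
# fstab_by_name    # device and mount point of entries that may break
# lsblk -f         # shows the UUID of each filesystem for rewriting them
```

No output means every entry already uses a stable identifier, which matches what was observed here.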
