USB drive becomes inaccessible after device changes from /dev/sda to /dev/sdb

Feb 17, 2020
I have a problem with external storage on Proxmox :( I added a USB drive as storage to my Proxmox host a few months ago, like this:

1) Connected the USB drive to the Proxmox machine and added it as LVM storage on /dev/sda:

1591803453448.png

1591803478438.png

2) Created a new 1000 GB virtual hard drive on it:

1591803509941.png

Everything was working fine.

3) Until one day, after a reboot, this external drive became /dev/sdb:

1591803977794.png

4) It is still visible in Proxmox UI:

1591803790852.png

5) However, it's not accessible from inside the VM: reading it fails with an Input/output error :(

Bash:
user@unifi-video:~$ lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0    7:0    0   97M  1 loop /snap/core/9289
loop1    7:1    0 93.9M  1 loop /snap/core/9066
sr0     11:0    1 1024M  0 rom
vda    252:0    0   16G  0 disk
├─vda1 252:1    0    1M  0 part
└─vda2 252:2    0   16G  0 part /
vdb    252:16   0 1000G  0 disk

user@unifi-video:~$ sudo cat /dev/vdb
cat: /dev/vdb: Input/output error

Any ideas how to fix this?
 

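With LVM, it does not matter what name the underlying device has: LVM identifies a physical volume by the UUID stored in its on-disk metadata, not by its /dev/sdX name. The VM would also not start if the device it uses were not accessible.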

The next step would be to check whether the disk itself has problems, so please post the output of the following commands in CODE tags:
- dmesg (on the pve host)
- lvs (on the pve host)
 
Hi @LnxBil - thanks for your response. I was also afraid that the disk might have problems, although I don't see any indication of that in S.M.A.R.T. It's a pretty new drive, and the machine has never lost power.

dmesg: (nothing else - just the following repeating indefinitely)
Code:
[2734239.386950] mce: CPU2: Core temperature above threshold, cpu clock throttled (total events = 294487)
[2734239.386951] mce: CPU6: Core temperature above threshold, cpu clock throttled (total events = 294451)
[2734239.386952] mce: CPU6: Package temperature above threshold, cpu clock throttled (total events = 456604)
[2734239.386985] mce: CPU4: Package temperature above threshold, cpu clock throttled (total events = 456370)
[2734239.386986] mce: CPU5: Package temperature above threshold, cpu clock throttled (total events = 456402)
[2754159.521981] mce: CPU0: Package temperature/speed normal
[2754159.521981] mce: CPU1: Package temperature/speed normal
[2754159.521982] mce: CPU4: Package temperature/speed normal
[2754159.521983] mce: CPU7: Package temperature/speed normal
[2754159.521983] mce: CPU3: Package temperature/speed normal
[2754159.522184] mce: CPU6: Core temperature/speed normal
[2754159.522770] mce: CPU2: Core temperature/speed normal
[2754159.523289] mce: CPU6: Package temperature/speed normal
[2754159.523887] mce: CPU2: Package temperature/speed normal
[2754459.523036] mce: CPU6: Core temperature above threshold, cpu clock throttled (total events = 295476)
[2754459.523036] mce: CPU2: Core temperature above threshold, cpu clock throttled (total events = 295510)
[2754459.523038] mce: CPU2: Package temperature above threshold, cpu clock throttled (total events = 457935)
[2754459.523072] mce: CPU4: Package temperature above threshold, cpu clock throttled (total events = 457679)
[2754459.523073] mce: CPU5: Package temperature above threshold, cpu clock throttled (total events = 457711)
[2754459.523074] mce: CPU1: Package temperature above threshold, cpu clock throttled (total events = 457722)
[2754459.523074] mce: CPU0: Package temperature above threshold, cpu clock throttled (total events = 457679)
[2754459.523076] mce: CPU7: Package temperature above threshold, cpu clock throttled (total events = 457714)
[2754459.523076] mce: CPU3: Package temperature above threshold, cpu clock throttled (total events = 457714)
[2754459.523619] mce: CPU6: Package temperature above threshold, cpu clock throttled (total events = 457917)
[2754459.524170] mce: CPU0: Package temperature/speed normal
[2754459.524170] mce: CPU4: Package temperature/speed normal
[2754459.524171] mce: CPU5: Package temperature/speed normal
[2754459.524173] mce: CPU7: Package temperature/speed normal
[2754459.524174] mce: CPU1: Package temperature/speed normal
[2754459.524175] mce: CPU3: Package temperature/speed normal
[2754459.524183] mce: CPU2: Core temperature/speed normal
[2754459.524772] mce: CPU6: Core temperature/speed normal

lvs:
Code:
  LV                           VG    Attr       LSize    Pool Origin                       Data%  Meta%  Move Log Cpy%Sync Convert
  data                         pve   twi-aotz-- <349.31g                                   27.11  1.74
  root                         pve   -wi-ao----   96.00g
  snap_vm-100-disk-0_beforenpm pve   Vri---tz-k   64.00g data
  snap_vm-101-disk-0_beforenpm pve   Vri---tz-k   16.00g data
  swap                         pve   -wi-ao----    8.00g
  vm-100-disk-0                pve   Vwi-aotz--   64.00g data snap_vm-100-disk-0_beforenpm 96.38
  vm-100-state-beforenpm       pve   Vwi-a-tz--   <8.49g data                              40.86
  vm-101-disk-0                pve   Vwi-aotz--   16.00g data snap_vm-101-disk-0_beforenpm 36.99
  vm-101-state-beforenpm       pve   Vwi-a-tz--   <2.49g data                              39.16
  vm-102-disk-0                pve   Vwi-aotz--   16.00g data                              82.61
  vm-103-disk-0                pve   Vwi-aotz--   16.00g data                              45.37
  vm-102-disk-0                wd3tb -wi-ao---- 1000.00g

I don't see anything wrong here :( The only thing I don't like is that the logical volume format is `raw` (in screenshot #4 above), but honestly, I don't know what it looked like before the problem happened.
 
The only thing I don't like is that the logical volume format is `raw` (in screenshot #4 above), but honestly, I don't know what it looked like before the problem happened.

LVM is always raw, which means that there is no file involved and the data is stored directly on a block device, which is normally much faster than going through an additional file layer.
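
To make the "directly on a block device" part concrete: an LV is exposed on the host as /dev/<vg>/<lv>, so for the `lvs` output above that would presumably be /dev/wd3tb/vm-102-disk-0. Here is a minimal sketch of a read test; `read_test` is a hypothetical helper name, not a Proxmox tool, and the 16 MiB read size is arbitrary. Running it against the LV on the host and against /dev/vdb in the guest could help show which layer returns the I/O error.

```shell
# Hypothetical helper: read the first 16 MiB of a device (or file) and report
# whether the read succeeded. Read-only; nothing is written to the device.
read_test() {
    dev=$1
    if dd if="$dev" of=/dev/null bs=1048576 count=16 2>/dev/null; then
        echo "read OK: $dev"
    else
        echo "read FAILED: $dev"
        return 1
    fi
}

# Assumed paths, taken from the lvs and lsblk output in this thread (run as root):
#   on the PVE host:  read_test /dev/wd3tb/vm-102-disk-0
#   inside the guest: read_test /dev/vdb
```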

Could you please run

Code:
dmesg | grep -v mce

Also please "run" into the problem in the VM before, so that we may get additional information. Can you also post any possibly relevant entries from the dmesg of your guest?
 
Hi @LnxBil - sure, here it is:

1) `dmesg | grep -v mce` produces no output at all on the Proxmox host. The full dmesg from the host is attached as dmesg_host.txt

2) dmesg from the guest is attached in the dmesg_guest.txt file

3) I'm sorry but what do you mean by 'Also please "run" into the problem in the VM before, so that we may get additional information.'?

Thanks!
 

3) I'm sorry but what do you mean by 'Also please "run" into the problem in the VM before, so that we may get additional information.'?

It is best to collect the log files after you have reproduced the problem, so that any related errors are recorded. Without log files, we cannot proceed.

You have to go through your /var/log/syslog* files and strip away all mce entries; maybe something will pop up.
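
Rotated syslog files are usually gzip-compressed on Debian-based systems, so a plain grep over /var/log/syslog* would not search them properly. A minimal sketch, assuming the usual logrotate naming (syslog, syslog.1, syslog.2.gz, ...); `strip_mce` is a hypothetical helper name, and the `mce: ` pattern is taken from the dmesg lines posted above:

```shell
# Hypothetical helper: print every line from the given log files that is NOT
# an mce message. Files ending in .gz are decompressed on the fly.
strip_mce() {
    for f in "$@"; do
        case "$f" in
            *.gz) gzip -dc -- "$f" ;;
            *)    cat -- "$f" ;;
        esac
    done | grep -v 'mce: '
}

# Example (run as root on the host or in the guest):
#   strip_mce /var/log/syslog* | less
```

If nothing at all is printed, the logs contain only mce messages, which matches what the `dmesg | grep -v mce` check showed.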
 
Hi @LnxBil - the problem is actually present all the time; the virtual drive never connects to the guest.

However, I opened /var/log/syslog on both the **host** and **guest** machines, then ran `cat /dev/vdb` (the virtual disk). The result is below; the lines in red were added to the guest's syslog.

Is that of any help?

1592165861637.png

The disk should be healthy on the proxmox host:

1592165979586.png
 
Thank you for the detailed answer. Unfortunately, I cannot see any problems.

Can you try to attach the disk as SCSI (with the VirtIO SCSI controller) instead of VirtIO? SCSI is the preselected default for Linux VMs.

What OS version are you running in that specific VM? Maybe we are running into this problem. Could you please post the output of `pveversion -v`?
 
Hi @LnxBil - you are a genius, thanks!

Confirming that detaching the Virtio virtual disk and reattaching it as SCSI worked! If it helps anyone, here is how to do it:

1592208752029.png
1592207136952.png
1592207150819.png
1592207165100.png

For the record, my guest VM is Ubuntu 18.04.4 LTS
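
For anyone who prefers the shell over the GUI, the same change can presumably be made with `qm` on the PVE host. This is only a sketch: the VM ID 102 and the volume name are inferred from the `lvs` output earlier in the thread, while the storage ID `wd3tb` (it may differ from the VG name) and the slots `virtio1` and `scsi1` are assumptions, so check `qm config 102` first. The hypothetical helper below only prints the commands so they can be reviewed before running them.

```shell
# Hypothetical helper: print (not run) the qm commands that detach a disk from
# a VirtIO Block slot and reattach the same volume on a SCSI slot.
print_reattach_cmds() {
    vmid=$1      # VM ID (assumed: 102)
    old_slot=$2  # current VirtIO Block slot (assumed: virtio1)
    new_slot=$3  # target SCSI slot (assumed: scsi1)
    volume=$4    # volume ID as <storage>:<volume> (assumed: wd3tb:vm-102-disk-0)

    echo "qm shutdown $vmid"
    # Detaching leaves the volume in the VM config as an unused disk; no data is removed.
    echo "qm set $vmid --delete $old_slot"
    # Select the VirtIO SCSI controller and attach the existing volume to the SCSI slot.
    echo "qm set $vmid --scsihw virtio-scsi-pci --$new_slot $volume"
    echo "qm start $vmid"
}

print_reattach_cmds 102 virtio1 scsi1 wd3tb:vm-102-disk-0
```

After the switch, the disk shows up in the guest as /dev/sdX instead of /dev/vdb, so any mount entries that refer to /dev/vdb need adjusting.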
 
Nice that it worked out. Just for people finding this thread in the future: in addition to "replugging" the disk, you also need to change the boot order if this disk had been your boot disk.
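
A sketch of that boot order change from the shell, assuming VM ID 102 and that the disk was reattached as `scsi0` (both hypothetical here, since the disk in this thread was not a boot disk). The `order=` syntax is what recent Proxmox VE releases use; older releases used a separate `bootdisk` option instead, so check `man qm` for your version. The hypothetical helper only prints the command for review:

```shell
# Hypothetical helper: print (not run) the qm command that makes the given
# disk slot the first boot device of a VM.
print_boot_order_cmd() {
    vmid=$1   # VM ID (assumed: 102)
    disk=$2   # new boot disk slot (assumed: scsi0)
    echo "qm set $vmid --boot order=$disk"
}

print_boot_order_cmd 102 scsi0
```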
 
