Complete USB bus crashing - When using external SSD - SATA to USB 3.0

gfngfn256

I never had this in the past. I regularly copy files from my host machine to an external SSD (2TB), but now, the minute I start a file copy, rsync, etc., the whole USB bus crashes & no USB port is available until I reboot the machine.

I believe this problem began very recently - IIRC I last did this without issue 3 weeks ago - so I imagine it started with the latest kernel update or one of the packages. I keep this home lab PVE instance up to date all the time:

Code:
root@MINS-PRXMX:~# pveversion -v
proxmox-ve: 8.2.0 (running kernel: 6.8.4-2-pve)
pve-manager: 8.2.2 (running version: 8.2.2/9355359cd7afbae4)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.8: 6.8.4-2
proxmox-kernel-6.8.4-2-pve-signed: 6.8.4-2
proxmox-kernel-6.5.13-5-pve-signed: 6.5.13-5
proxmox-kernel-6.5: 6.5.13-5
ceph-fuse: 17.2.7-pve3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx8
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.1
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.3
libpve-access-control: 8.1.4
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.6
libpve-cluster-perl: 8.0.6
libpve-common-perl: 8.2.1
libpve-guest-common-perl: 5.1.1
libpve-http-server-perl: 5.1.0
libpve-network-perl: 0.9.8
libpve-rs-perl: 0.8.8
libpve-storage-perl: 8.2.1
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.2.2-1
proxmox-backup-file-restore: 3.2.2-1
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.2.3
pve-cluster: 8.0.6
pve-container: 5.0.11
pve-docs: 8.2.2
pve-edk2-firmware: 4.2023.08-4
pve-esxi-import-tools: 0.7.0
pve-firewall: 5.0.6
pve-firmware: 3.11-1
pve-ha-manager: 4.0.4
pve-i18n: 3.2.2
pve-qemu-kvm: 8.1.5-6
pve-xtermjs: 5.3.0-3
qemu-server: 8.2.1
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.3-pve2

I tried 2 different SATA to USB cables I have - same result.

I used a USB 3.1 flash drive - no problems whatsoever.

I must point out I'm using VeraCrypt on that drive with an exFAT volume, but I don't believe that's linked (I could test further).

Attached Journal log for the relevant period - I've noted the various stages involved.

I have no passthrough of any kind on this machine. (Minisforum NPB7, Intel i7-13700H, 32GB DDR5, 1TB NVMe + 2TB SATA SSD internal)
 

Attachment: Journal2.log (17.1 KB)
I'm wondering whether the new Intel default of intel_iommu=on in the current 6.8 kernel could be the culprit (causing the DMA errors, etc.).
See this post.
 
Code:
May 02 18:21:09 MINS-PRXMX kernel: DMAR: DRHD: handling fault status reg 3
May 02 18:21:09 MINS-PRXMX kernel: DMAR: [DMA Read NO_PASID] Request device [00:14.0] fault addr 0xff330000 [fault reason 0x06] PTE Read access is not set

I believe this is where things go wrong.
By the look of things this could be broken hardware, a broken BIOS or a buggy kernel.
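To see which device the DMAR fault refers to, a small sketch like the following may help. It pulls the PCI address out of fault lines such as the one quoted above; the function name dmar_fault_devices is made up for illustration. On many Intel platforms 00:14.0 is the USB xHCI controller, but that is an assumption to verify with lspci on the actual machine.

```shell
#!/bin/sh
# Print the unique PCI addresses named in DMAR fault lines read from stdin.
# (dmar_fault_devices is a hypothetical helper name, not an existing tool.)
dmar_fault_devices() {
    sed -n 's/.*DMAR: \[DMA [^]]*] Request device \[\([0-9a-fA-F:.]*\)].*/\1/p' | sort -u
}

# Typical use on the affected host (left as comments so nothing runs by itself):
#   journalctl -k -b | dmar_fault_devices
#   lspci -nn -s 00:14.0    # show which device sits at the reported address
```

If the address turns out to be the USB controller, that would fit with the whole bus going away at once.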

Do you have IOMMU enabled in your BIOS? Maybe try disabling that and removing
Code:
intel_iommu=on
since you don't need any passthrough?
 
I'll vote for a buggy kernel on this one.
If going back to kernel 6.5 does not help, then it could very well be broken hardware, either the motherboard or the USB-SATA cable.
 
The DMAR fault quoted above is typically the IOMMU taking issue with a PCI(e) device. Try intel_iommu=off (Ubuntu enabled it by default on kernel 6.8, and so did Proxmox), or maybe iommu=pt is good enough (it uses identity mapping for devices that are not passed through).
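Before changing anything, it may be worth checking which IOMMU options the running kernel was actually booted with. A minimal sketch, assuming a POSIX shell; the helper name iommu_opts is hypothetical. No output means neither option is set explicitly, so the kernel default applies.

```shell
#!/bin/sh
# Print the IOMMU-related options found in a kernel command line string.
# (iommu_opts is a hypothetical helper name.)
iommu_opts() {
    for opt in $1; do
        case "$opt" in
            intel_iommu=*|iommu=*) printf '%s\n' "$opt" ;;
        esac
    done
}

# On the running host:
#   iommu_opts "$(cat /proc/cmdline)"
#
# To change the options on Proxmox VE (assuming a standard boot setup):
#   GRUB:         edit GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, then run update-grub
#   systemd-boot: edit /etc/kernel/cmdline, then run proxmox-boot-tool refresh
# A reboot is needed either way.
```

Trying iommu=pt first seems the gentler of the two, since it leaves the IOMMU available should passthrough be wanted later.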
 
As I pointed out above - my thoughts exactly.
I don't believe it's HW related, since it's always worked in the past.
That DMAR message also led me to believe that it's a system/OS/kernel issue with a PCI device.
So I'll probably have to go down the "adjust command line" route. I've never had to do it in the past, so I don't like going there.
 
Just to recap, the problem occurred on a drive that is completely encrypted with VeraCrypt.

So I've done some further testing:

1. I've not adjusted any IOMMU settings.
2. I've tested the same SATA to USB adapter with a different drive (HDD vs SSD) & found no problem (non-VeraCrypt).
3. I've created a VeraCrypt-encrypted file/volume on another HDD & found no problem.

So for now I believe the problem only occurs when using a drive fully encrypted with VeraCrypt. Please note this drive is not divided into partitions at all, so for example /dev/sdb (with no partitions present within it) is totally encrypted & files are placed within that.
Possibly if I'd partitioned it and then encrypted, for example, /dev/sdb1 (vs /dev/sdb), the problem wouldn't occur. Further testing is necessary to establish that.
 
Watching this thread. I thought the USB ports on my motherboard might be dying, but came across this thread. I have the same scenario, backing up to a Veracrypt-encrypted partition on the external USB drive. Once I try copying data to the decrypted and mounted drive, USB starts to lose its mind--my UPS connection via USB gets dropped and my other USB drive starts to fail.

I've been following this same procedure for at least a year, running a backup script with a rotation of external drives without issue until this month.

Hoping for a resolution to this problem!
 
Your experience of a full USB meltdown matches mine exactly - & yes, it only began recently (6.8 kernel maybe?).

Veracrypt-encrypted partition on the external USB drive
Just to clarify - you have multiple partitions on the disk & one of them like /dev/sda2 is encrypted?
 
There is only one partition on the drive, /dev/sdh1, and it is encrypted.
 
I would have to reboot my machine to verify
Not sure why. Just connect the device & unlock with VeraCrypt. Just don't write to it. I never had the bug without trying to write to it.
Anyway I imagine I've grasped your situation. So don't bother just because of me.

To summarize so far: the USB crash happens when writing to a completely VeraCrypt-encrypted partition.

And like you, I did it for a long while before without incident, prior to the above-mentioned updates/kernel (?).

I'd like to add that, since I've discovered other USB bugs with the newer kernel (including Thunderbolt problems & general USB speed problems - read these forums), I suspect our issue is also linked to this.
 
yes it only began recently (6.8 kernel maybe?)
Just to double check, have you tried dropping back to kernel 6.5 (even if just temporarily) to check if the problem goes away?

i.e. to confirm the theory that it's the 6.5 → 6.8 kernel jump
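For anyone wanting to try this without committing to it, something along these lines might work. The sketch lists the installed kernels of a given series from pveversion -v output (installed_kernels is a made-up helper name); the commented commands then boot the older kernel once. This assumes proxmox-boot-tool's kernel pinning is available on your install, so check proxmox-boot-tool help first.

```shell
#!/bin/sh
# From `pveversion -v` output on stdin, print installed kernel versions of
# the series given as $1 (e.g. 6.5). installed_kernels is a hypothetical helper.
installed_kernels() {
    sed -n "s/^proxmox-kernel-\($1\.[0-9][0-9.-]*-pve\)\(-signed\)\{0,1\}: .*/\1/p"
}

# On the host (left as comments so nothing is changed by accident):
#   pveversion -v | installed_kernels 6.5
#   proxmox-boot-tool kernel list
#   proxmox-boot-tool kernel pin 6.5.13-5-pve --next-boot   # applies to the next boot only
#   proxmox-boot-tool kernel unpin                          # remove a pin again
```

With --next-boot the pin is meant to apply to a single reboot, which keeps the test low-risk on an otherwise working system.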
 
have you tried dropping back to kernel 6.5 (even if just temporarily)
Though I have been tempted to try - I'll refrain from doing it on a working system. I stick to: "If it ain't broken why fix it?"

As a workaround I copy the data I need to a USB flash drive & from there on another PC (yes Windows!) I transfer the data from the USB flash drive to the VeraCrypt disk.

I'd like to add that I found the speeds while copying to the USB flash drive (using rsync) atrocious as well. I never had that in the past. I attribute all this to the general USB problems associated with this update/kernel (as in my post above).
 
TL;DR: The 6.8 kernel is to blame. Reverting to 6.5 fixes it for me.

Yesterday, I updated Proxmox. I also noticed that my VeraCrypt version was older (1.25.9), so I updated to 1.26.7. After a reboot, I tried to write to the mounted VeraCrypt disk. Still the same USB issues. :(

Then I decided to reboot again and manually chose to boot using the 6.5 kernel. Everything worked like a charm writing to VeraCrypt, and the USB bus continued to function admirably.

So it seems there is some kind of bug with the 6.8 kernel, USB, and VeraCrypt.
 
