I'm experiencing persistent write stalling issues when copying large files from a Windows client to a Samba share running in an LXC container on my Proxmox VE server. The storage for the LXC is an NVMe SSD in an external USB 3.0 enclosure.
Problem Description:
- Large file copies (e.g., movie files >1GB) to the Samba share start with good speed (sometimes >100 MB/s, likely caching) but then quickly degrade and eventually stall completely, with the transfer speed dropping to 0 MB/s.
- During the stall, iostat on the Proxmox VE host shows the USB drive (/dev/sda) at nearly 100% utilization (%util) with very high I/O wait times (%iowait), but actual write throughput (wkB/s or wMB/s) drops to zero or near zero.
- Copying files from the server (Samba share) to the Windows client is fast and works fine.
- Small file copies to the server generally work without issue.
- This same NVMe drive and USB enclosure combination previously worked without such stalling issues when connected directly to my Windows PC.
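For anyone who wants to reproduce the measurement without iostat, here is a minimal sketch of how the same "busy but not writing" signature could be derived from two samples of /sys/block/sda/stat. It assumes the field layout from the kernel's block stat documentation (field 7 = sectors written in 512-byte units, field 10 = milliseconds with I/O in flight). The function names and the thresholds (90% busy, under 100 kB/s) are my own picks for illustration, not anything standard.

```shell
# disk_delta SAMPLE1 SAMPLE2 INTERVAL_MS
# SAMPLE1/SAMPLE2 are the contents of /sys/block/<dev>/stat taken INTERVAL_MS apart.
# Prints "<util_percent> <write_kB_per_s>" (both truncated to integers).
disk_delta() {
    printf '%s\n%s\n' "$1" "$2" | awk -v ms="$3" '
        NR == 1 { w1 = $7; t1 = $10 }
        NR == 2 {
            util = ($10 - t1) * 100 / ms     # share of the interval the device was busy
            wkbs = ($7 - w1) * 500 / ms      # sectors * 512 / 1024 * 1000 / ms
            printf "%d %d\n", util, wkbs
        }'
}

# is_stalled UTIL_PERCENT WRITE_KBS
# Hypothetical thresholds: nearly saturated device, almost no data written.
is_stalled() {
    [ "$1" -ge 90 ] && [ "$2" -lt 100 ]
}
```

A possible way to use it during a copy: read the stat file, sleep 1, read it again, then run `set -- $(disk_delta "$a" "$b" 1000)` and `is_stalled "$1" "$2" && echo "stall"`.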
- Proxmox VE Version: 8.4.0 (Kernel: 6.8.12-10-pve)
- Output of pveversion -v:
Code:
proxmox-ve: 8.4.0 (running kernel: 6.8.12-10-pve)
pve-manager: 8.4.1 (running version: 8.4.1/2a5fa54a8503f96d)
# ... (include the rest of your pveversion -v output) ...
zfsutils-linux: 2.2.7-pve2
- NVMe Drive: WD Blue SN580 1TB
- USB Enclosure: Kinsound M.2 NVMe SSD Enclosure (Identified by dmesg as using JMicron chip with idVendor=152d, idProduct=0583)
- Proxmox Host Motherboard: BIOSTAR Group TB360-BTC PRO 2.0 (BIOS 5.13 06/08/2021)
- ZFS Pool ("tank") Configuration: Single device pool using partition /dev/sda4 from the USB NVMe.
- Output of zpool status tank:
Code:
  pool: tank
 state: ONLINE
config:

        NAME                                        STATE     READ WRITE CKSUM
        tank                                        ONLINE       0     0     0
          usb-JMicron_Tech_DD56419883890-0:0-part4  ONLINE       0     0     0
          # Or sda4 if that's what it shows currently

errors: No known data errors
- LXC Container ("100 (media)"): [Specify OS if known, e.g., Ubuntu 22.04]. Runs Samba server.
- Relevant Mount Point for LXC Data: mp0: tank:subvol-100-disk-1,mp=/data,size=300G
- Client OS: Windows 10/11 [Specify]
- Initial Observation: lsblk -D /dev/sda initially showed DISC-MAX: 0B, indicating no OS-level TRIM support for the USB enclosure. zpool trim tank failed with "no devices in pool support trim operations".
- Researched JMicron JMS583 TRIM Issues on Linux: Found forum discussions suggesting a udev rule can help for devices reporting lbpme=0 but LBPU=1.
- SCSI Capability Check:
- sg_readcap -l /dev/sda reported: Logical block provisioning: lbpme=0, lbprz=0
- sg_vpd -p lbpv /dev/sda reported: Unmap command supported (LBPU): 1
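The check above boils down to "lbpme=0 but LBPU=1". A small sketch of that test, assuming the two tools print the lines exactly as quoted above (the function name is made up for illustration, and other sg3_utils versions may format the output differently):

```shell
# needs_provisioning_override READCAP_OUTPUT LBPV_OUTPUT
# Returns 0 when the device reports lbpme=0 (no logical block provisioning
# advertised) while the LBP VPD page still claims UNMAP support (LBPU=1).
needs_provisioning_override() {
    lbpme=$(printf '%s\n' "$1" | sed -n 's/.*lbpme=\([01]\).*/\1/p' | head -n 1)
    lbpu=$(printf '%s\n' "$2" | sed -n 's/.*(LBPU): *\([01]\).*/\1/p' | head -n 1)
    [ "$lbpme" = 0 ] && [ "$lbpu" = 1 ]
}
```

It could be called as `needs_provisioning_override "$(sg_readcap -l /dev/sda)" "$(sg_vpd -p lbpv /dev/sda)" && echo "udev override applies"`.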
- Applied Udev Rule: Created /etc/udev/rules.d/90-usb-nvme-jms583-trim.rules with:
Code:
ACTION=="add|change", ATTRS{idVendor}=="152d", ATTRS{idProduct}=="0583", SUBSYSTEM=="scsi_disk", ATTR{provisioning_mode}="unmap", ATTR{manage_start_stop}="1"
- Result of Udev Rule: After reloading rules and replugging the device, lsblk -D /dev/sda now shows DISC-MAX: 4G for /dev/sda and /dev/sda4.
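To double-check that the rule actually took effect (rather than relying on lsblk alone), the relevant sysfs attributes can be read directly. This is a rough sketch: the function name and the optional second argument (an alternative sysfs root, only there so it can be tried against a fake directory tree) are my own additions.

```shell
# show_discard_state DEV [SYSFS_ROOT]
# Prints the SCSI provisioning mode and the block layer's discard limit for DEV.
show_discard_state() {
    dev=$1
    root=${2:-/sys}
    # The scsi_disk directory is named after the H:C:T:L address, e.g. 6:0:0:0.
    for f in "$root"/block/"$dev"/device/scsi_disk/*/provisioning_mode; do
        [ -r "$f" ] && echo "provisioning_mode=$(cat "$f")"
    done
    f="$root/block/$dev/queue/discard_max_bytes"
    [ -r "$f" ] && echo "discard_max_bytes=$(cat "$f")"
    return 0
}
```

With the rule applied, `show_discard_state sda` should report provisioning_mode=unmap and a non-zero discard_max_bytes, which is the value lsblk -D presents as DISC-MAX.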
- ZFS TRIM Test: zpool trim tank then completed successfully without errors.
- ZFS autotrim: zpool get autotrim tank showed it was off by default. I enabled it with zpool set autotrim=on tank.
- Current Problem - UNMAP Errors: Despite DISC-MAX showing 4G and manual zpool trim working once initially, file copies still stall. Furthermore, dmesg now shows critical errors during TRIM/DISCARD operations when autotrim is active or when writes occur:
Code:
[ 85.821360] sd 6:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[ 85.821365] sd 6:0:0:0: [sda] tag#0 Sense Key : Illegal Request [current]
[ 85.821366] sd 6:0:0:0: [sda] tag#0 Add. Sense: Logical block address out of range
[ 85.821368] sd 6:0:0:0: [sda] tag#0 CDB: Unmap/Read sub-channel 42 00 00 00 00 00 00 00 18 00
[ 85.821369] critical target error, dev sda, sector 1116008816 op 0x3:(DISCARD) flags 0x0 phys_seg 1 prio class 0
[ 85.821371] zio pool=tank vdev=/dev/disk/by-id/usb-JMicron_Tech_DD56419883890-0:0-part4 error=121 type=6 offset=464021282816 size=40960 flags=524480
- (These errors repeat)
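As a sanity check on the "out of range" claim, the numbers in the log can be worked through by hand. The sketch below uses only values from the dmesg lines above, except for the total sector count, which is an assumption (1953525168 is the usual 512-byte sector count for a nominal 1 TB drive; the real value would come from blockdev --getsz /dev/sda). It also assumes the zio offset is relative to the start of the partition. The helper names are made up.

```shell
# Byte offset on the whole disk for a 512-byte kernel sector number.
disk_byte_offset() { echo $(( $1 * 512 )); }

# Partition start implied by the kernel sector and the ZFS zio offset,
# assuming the zio offset counts from the start of the partition.
implied_partition_start() { echo $(( $1 * 512 - $2 )); }

# Number of 16-byte block descriptors in an UNMAP parameter list.
# CDB bytes 7-8 hold the parameter list length; 8 bytes of it are header.
unmap_descriptor_count() {
    set -- $1
    len=$(( 0x$8$9 ))
    echo $(( (len - 8) / 16 ))
}

# lba_in_range START_SECTOR NUM_SECTORS TOTAL_SECTORS
lba_in_range() { [ $(( $1 + $2 )) -le "$3" ]; }

sector=1116008816          # from "critical target error, dev sda, sector ..."
zio_offset=464021282816    # from "zio ... offset=464021282816"
zio_size=40960             # from "zio ... size=40960" (80 sectors)
total_sectors=1953525168   # ASSUMPTION: typical 1 TB drive, not read from my device

echo "disk byte offset:        $(disk_byte_offset "$sector")"                        # 571396513792
echo "implied partition start: $(implied_partition_start "$sector" "$zio_offset")"   # 107375230976 = 102401 MiB
echo "UNMAP descriptors:       $(unmap_descriptor_count '42 00 00 00 00 00 00 00 18 00')"
lba_in_range "$sector" $(( zio_size / 512 )) "$total_sectors" && echo "LBA range is inside the assumed capacity"
```

If the assumed capacity is right, the discarded range sits well inside the drive, so the "Logical block address out of range" sense data would not be explained by ZFS asking for an invalid address. The implied partition start can be compared against /sys/block/sda/sda4/start (in sectors) to confirm the offset assumption.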
- Current State: Because of these UNMAP errors, I have set autotrim back to off (zpool set autotrim=off tank) and removed the udev rule (deleted /etc/udev/rules.d/90-usb-nvme-jms583-trim.rules, reloaded the rules, and replugged the drive). The stalling issue persists with TRIM disabled.
- ZFS sync property for LXC data dataset: zfs get sync tank/subvol-100-disk-1 reports standard.
Questions:
- Given the "Logical block address out of range" errors during DISCARD operations even when DISC-MAX is non-zero, does this confirm a faulty TRIM/UNMAP implementation in my Kinsound (JMicron JMS583) enclosure's firmware when used with Linux?
- Has anyone found a reliable fix or workaround for these specific UNMAP errors with 152d:0583 devices on recent Proxmox/Linux kernels, beyond the standard udev provisioning_mode tweak?
- Could the failing UNMAP commands (while TRIM was enabled) have caused or worsened the write stalls, given that the stalls also occur with autotrim off and the udev rule removed?
- Are there any other ZFS settings or kernel parameters I should investigate for a single-device ZFS pool on a USB-attached NVMe that might improve sustained write stability?
- Is it likely I need a different USB NVMe enclosure with a chipset known for better Linux compatibility (e.g., newer Realtek, ASMedia)?