We are seeing a problem with Western Digital Red NAS HDDs that we believe is caused by a bug in the default kernel used by current versions of Proxmox VE 5.
We tested two WD20EFRX discs in three different Proxmox VE servers. The servers use different hardware, but unfortunately all are running up-to-date Proxmox VE 5.3 with the same kernel 4.15.18-11-pve #1 SMP PVE 4.15.18-34 (Mon, 25 Feb 2019 14:51:06 +0100) x86_64.
All three servers reproduce the same COMRESET failed (errno=-16) error. The dmesg output from server 3 is shown below:
root@server3:~# dmesg | grep ata
[ 0.000000] BIOS-e820: [mem 0x00000000bc855000-0x00000000bc85dfff] ACPI data
[ 0.000000] Memory: 16256600K/16690700K available (12300K kernel code, 2480K rwdata, 4288K rodata, 2424K init, 2416K bss, 434100K reserved, 0K cma-reserved)
[ 0.082500] libata version 3.00 loaded.
[ 1.034853] scsi host0: ata_generic
[ 1.034995] scsi host1: ata_generic
[ 1.035018] ata1: PATA max UDMA/100 cmd 0xf130 ctl 0xf120 bmdma 0xf0f0 irq 18
[ 1.035020] ata2: PATA max UDMA/100 cmd 0xf110 ctl 0xf100 bmdma 0xf0f8 irq 18
[ 1.388146] Write protecting the kernel read-only data: 20480k
[ 1.705685] ata3: SATA max UDMA/133 abar m2048@0xfb925000 port 0xfb925100 irq 29
[ 1.705686] ata4: DUMMY
[ 1.705687] ata5: DUMMY
[ 1.705689] ata6: SATA max UDMA/133 abar m2048@0xfb925000 port 0xfb925280 irq 29
[ 1.705691] ata7: SATA max UDMA/133 abar m2048@0xfb925000 port 0xfb925300 irq 29
[ 1.705693] ata8: SATA max UDMA/133 abar m2048@0xfb925000 port 0xfb925380 irq 29
[ 2.019178] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 2.019210] ata8: SATA link down (SStatus 0 SControl 300)
[ 2.019299] ata7: SATA link down (SStatus 0 SControl 300)
[ 2.021069] ata3.00: supports DRM functions and may not be fully accessible
[ 2.022174] ata3.00: ATA-11: Samsung SSD 860 EVO 250GB, RVT01B6Q, max UDMA/133
[ 2.022177] ata3.00: 488397168 sectors, multi 1: LBA48 NCQ (depth 31/32), AA
[ 2.024940] ata3.00: supports DRM functions and may not be fully accessible
[ 2.028079] ata3.00: configured for UDMA/133
[ 2.028634] ata3.00: Enabling discard_zeroes_data
[ 2.028838] ata3.00: Enabling discard_zeroes_data
[ 2.030182] ata3.00: Enabling discard_zeroes_data
[ 7.056067] ata6: link is slow to respond, please be patient (ready=0)
[ 11.736068] ata6: COMRESET failed (errno=-16)
[ 17.088068] ata6: link is slow to respond, please be patient (ready=0)
[ 19.788080] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 19.788376] ata6.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
[ 30.148069] ata6: link is slow to respond, please be patient (ready=0)
[ 34.828070] ata6: COMRESET failed (errno=-16)
[ 40.180071] ata6: link is slow to respond, please be patient (ready=0)
[ 42.520085] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 42.520377] ata6.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
[ 42.520381] ata6: limiting SATA link speed to 1.5 Gbps
[ 52.932072] ata6: link is slow to respond, please be patient (ready=0)
[ 57.612072] ata6: COMRESET failed (errno=-16)
[ 62.964072] ata6: link is slow to respond, please be patient (ready=0)
[ 65.304083] ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 65.304380] ata6.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
[ 75.716074] ata6: link is slow to respond, please be patient (ready=0)
[ 80.396074] ata6: COMRESET failed (errno=-16)
[ 85.748075] ata6: link is slow to respond, please be patient (ready=0)
[ 87.968086] ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 88.402834] EXT4-fs (dm-1): mounted filesystem with ordered data mode. Opts: (null)
[ 88.783893] systemd[1]: Listening on LVM2 metadata daemon socket.
root@server3:~#
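To make the pattern in a log like the one above easier to see, the error messages can be tallied per ATA port. This is only a minimal sketch: the function name summarize_ata_errors and the category names are our own illustration, and the regular expressions simply match the message texts that appear in the output above.

```python
import re
from collections import defaultdict

# Message fragments taken from the dmesg output above.
PATTERNS = {
    "comreset_failed": re.compile(r"COMRESET failed"),
    "identify_failed": re.compile(r"failed to IDENTIFY"),
    "slow_link": re.compile(r"link is slow to respond"),
}

# Matches "ata6:" or "ata6.00:" right after the timestamp; captures the port number.
PORT_RE = re.compile(r"\]\s+ata(\d+)(?:\.\d+)?:")


def summarize_ata_errors(dmesg_text):
    """Count the known error messages per ATA port in a block of dmesg text."""
    counts = defaultdict(lambda: defaultdict(int))
    for line in dmesg_text.splitlines():
        port = PORT_RE.search(line)
        if not port:
            continue  # not an ataN message
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[f"ata{port.group(1)}"][name] += 1
    # Convert nested defaultdicts to plain dicts for easier printing.
    return {port: dict(c) for port, c in counts.items()}
```

Fed the output above, this should report 4 COMRESET failures, 3 IDENTIFY failures and 8 slow-link messages, all on ata6, while ata3 (the Samsung SSD) has none.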
The disc does not appear under /dev/sd*.
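For anyone who wants to check the same thing on their own system, here is a rough sketch that lists the whole-disk sd* nodes and the model string the kernel reports for each. The helper names list_sd_disks and disk_model are made up for this example, and it assumes the usual /sys/block/<name>/device/model layout; a drive that fails to initialize like ours would simply be missing from the list.

```python
import glob
import os


def list_sd_disks(dev_dir="/dev"):
    """Return whole-disk sd* names (e.g. sda), skipping partitions such as sda1."""
    names = (os.path.basename(p) for p in glob.glob(os.path.join(dev_dir, "sd*")))
    return sorted(n for n in names if not n[-1].isdigit())


def disk_model(name, sys_block="/sys/block"):
    """Read the model string exposed in sysfs for a disk, or None if unavailable."""
    try:
        with open(os.path.join(sys_block, name, "device", "model")) as f:
            return f.read().strip()
    except OSError:
        return None


if __name__ == "__main__":
    for disk in list_sd_disks():
        print(disk, disk_model(disk))
```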
Server hardware:
- Server 1: i7 7700, ASRock Z270-PRO4, Corsair 16GB RAM
- Server 2: i5 4590, ASRock Z87-EXTREME3, Corsair 8GB RAM
- Server 3: i5 2400, Intel BLKDQ67SWB3, Kingston 16GB RAM
Searching the web, we found an Ubuntu bug report with a possible patch:
Search for: Linux 4.15 and onwards fails to initialize some hard drives (new users aren't allowed to post external links)
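Since the bug title quoted above mentions Linux 4.15 and onwards, a quick way to see whether a given kernel falls in that range is to compare the release string. This is just a sketch with hypothetical helper names (kernel_version, in_reported_range); it only compares version numbers and says nothing about whether a particular kernel build actually has the problem or already carries a fix.

```python
import re


def kernel_version(release):
    """Extract (major, minor) from a release string such as '4.15.18-11-pve'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if not match:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def in_reported_range(release, first_affected=(4, 15)):
    """True if the release is 4.15 or newer, going only by the bug title."""
    return kernel_version(release) >= first_affected
```

On a running system the release string can be obtained with platform.release() or uname -r; for our servers, in_reported_range("4.15.18-11-pve") is true.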
We have yet to test the patch in a Proxmox VE system (we don't currently have a test system on hand, and we're reluctant to build and apply a custom kernel on our three production servers).
Has anyone else come across this bug?
Other Western Digital HDDs do not appear to be affected: we have several WD Gold and WD Purple HDDs deployed in the three servers, and none of them show this error.