Hi there,
I recently noticed a problem with my home server. When I logged in to my IPMI KVM, I was greeted by a lot of ATA exceptions and failed commands.
Code:
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.3.10-1-pve root=/dev/mapper/pve-root ro quiet libata.force=noncq
[ 0.109971] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.3.10-1-pve root=/dev/mapper/pve-root ro quiet libata.force=noncq
[ 0.178635] Memory: 32700520K/33443260K available (14339K kernel code, 2396K rwdata, 4848K rodata, 2664K init, 5048K bss, 742740K reserved, 0K cma-reserved)
[ 0.626359] libata version 3.00 loaded.
[ 1.393373] acpi_cpufreq: overriding BIOS provided _PSD data
[ 1.418341] Write protecting the kernel read-only data: 22528k
[ 1.548981] ata1: SATA max UDMA/133 abar m4096@0xefc02000 port 0xefc02100 irq 52
[ 1.548982] ata2: SATA max UDMA/133 abar m4096@0xefc02000 port 0xefc02180 irq 53
[ 1.548983] ata3: SATA max UDMA/133 abar m4096@0xefc02000 port 0xefc02200 irq 54
[ 1.548985] ata4: SATA max UDMA/133 abar m4096@0xefc02000 port 0xefc02280 irq 55
[ 1.548986] ata5: SATA max UDMA/133 abar m4096@0xefc02000 port 0xefc02300 irq 56
[ 1.548987] ata6: SATA max UDMA/133 abar m4096@0xefc02000 port 0xefc02380 irq 57
[ 1.548989] ata7: SATA max UDMA/133 abar m4096@0xefc02000 port 0xefc02400 irq 58
[ 1.548990] ata8: SATA max UDMA/133 abar m4096@0xefc02000 port 0xefc02480 irq 59
[ 1.861106] ata7: SATA link down (SStatus 0 SControl 300)
[ 1.861356] ata6: SATA link down (SStatus 0 SControl 300)
[ 1.861640] ata8: SATA link down (SStatus 0 SControl 300)
[ 2.022250] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 2.022268] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 2.022346] ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 2.022361] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 2.022378] ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 2.022645] ata3.00: FORCE: horkage modified (noncq)
[ 2.022669] ata2.00: FORCE: horkage modified (noncq)
[ 2.022729] ata3.00: ATA-9: WDC WD10EFRX-68FYTN0, 82.00A82, max UDMA/133
[ 2.022731] ata3.00: 1953525168 sectors, multi 16: LBA48 NCQ (not used)
[ 2.022755] ata2.00: ATA-9: WDC WD10EFRX-68FYTN0, 82.00A82, max UDMA/133
[ 2.022756] ata2.00: 1953525168 sectors, multi 16: LBA48 NCQ (not used)
[ 2.023120] ata4.00: FORCE: horkage modified (noncq)
[ 2.023187] ata4.00: ATA-8: WDC WD10EFRX-68JCSN0, 01.01A01, max UDMA/133
[ 2.023188] ata4.00: 1953525168 sectors, multi 16: LBA48 NCQ (not used)
[ 2.023224] ata1.00: FORCE: horkage modified (noncq)
[ 2.023237] ata3.00: configured for UDMA/133
[ 2.023273] ata2.00: configured for UDMA/133
[ 2.023287] ata1.00: ATA-8: WDC WD10EFRX-68JCSN0, 01.01A01, max UDMA/133
[ 2.023289] ata1.00: 1953525168 sectors, multi 16: LBA48 NCQ (not used)
[ 2.023802] ata5.00: FORCE: horkage modified (noncq)
[ 2.023879] ata5.00: supports DRM functions and may not be fully accessible
[ 2.023880] ata5.00: ATA-9: Samsung SSD 850 EVO M.2 250GB, EMT21B6Q, max UDMA/133
[ 2.023881] ata5.00: 488397168 sectors, multi 1: LBA48 NCQ (not used)
[ 2.024024] ata4.00: configured for UDMA/133
[ 2.024144] ata1.00: configured for UDMA/133
[ 2.026084] ata5.00: supports DRM functions and may not be fully accessible
[ 2.026880] ata5.00: configured for UDMA/133
[ 5.373771] EXT4-fs (dm-1): mounted filesystem with ordered data mode. Opts: (null)
[ 2053.836204] ata3.00: exception Emask 0x10 SAct 0x0 SErr 0x400000 action 0x6 frozen
[ 2053.836231] ata3.00: irq_stat 0x08000000, interface fatal error
[ 2053.836248] ata3: SError: { Handshk }
[ 2053.836260] ata3.00: failed command: WRITE DMA EXT
[ 2053.836276] ata3.00: cmd 35/00:00:00:90:bc/00:0a:0e:00:00/e0 tag 21 dma 1310720 out
[ 2053.836314] ata3.00: status: { DRDY }
[ 2053.836327] ata3: hard resetting link
[ 2054.312186] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 2054.313186] ata3.00: configured for UDMA/133
[ 2054.313204] ata3: EH complete
[ 3754.882265] ata3.00: exception Emask 0x10 SAct 0x0 SErr 0x400000 action 0x6 frozen
[ 3754.882295] ata3.00: irq_stat 0x08000000, interface fatal error
[ 3754.882311] ata3: SError: { Handshk }
[ 3754.882323] ata3.00: failed command: WRITE DMA EXT
[ 3754.882339] ata3.00: cmd 35/00:00:c0:cd:09/00:0a:0f:00:00/e0 tag 2 dma 1310720 out
[ 3754.882378] ata3.00: status: { DRDY }
[ 3754.882391] ata3: hard resetting link
[ 3755.358268] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 3755.359289] ata3.00: configured for UDMA/133
[ 3755.359303] ata3: EH complete
[ 3826.353826] ata3.00: exception Emask 0x10 SAct 0x0 SErr 0x400000 action 0x6 frozen
[ 3826.353857] ata3.00: irq_stat 0x08000000, interface fatal error
[ 3826.353875] ata3: SError: { Handshk }
[ 3826.353887] ata3.00: failed command: WRITE DMA EXT
[ 3826.353904] ata3.00: cmd 35/00:00:80:7c:4c/00:0a:0f:00:00/e0 tag 20 dma 1310720 out
[ 3826.353944] ata3.00: status: { DRDY }
[ 3826.353958] ata3: hard resetting link
[ 3826.829822] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 3826.830842] ata3.00: configured for UDMA/133
[ 3826.830856] ata3: EH complete
[ 3877.329530] ata3.00: exception Emask 0x10 SAct 0x0 SErr 0x400000 action 0x6 frozen
[ 3877.329561] ata3.00: irq_stat 0x08000000, interface fatal error
[ 3877.329579] ata3: SError: { Handshk }
[ 3877.329591] ata3.00: failed command: WRITE DMA EXT
[ 3877.329608] ata3.00: cmd 35/00:00:80:16:7a/00:0a:0f:00:00/e0 tag 2 dma 1310720 out
[ 3877.329647] ata3.00: status: { DRDY }
[ 3877.329660] ata3: hard resetting link
[ 3877.805536] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 3877.806514] ata3.00: configured for UDMA/133
[ 3877.806528] ata3: EH complete
At first I disabled NCQ because I read that it could be the cause, but that changed nothing except the error message, which went from "failed command write fpdma queued" to the one above. I swapped my HDDs between the bays to check whether the fault lies with the HDD or with the cable/mainboard. The errors didn't change, so I assume it's the cable/mainboard. How can I determine which port is the root of the problem?
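
A rough sketch of how the log could be narrowed down to a port, in case it is useful: it counts libata exception lines per ataN port in saved dmesg output, and maps /dev/sdX disks to their ataN port by resolving the /sys/block symlinks. The function names are made up for illustration, and the sysfs layout (an ataN component in the resolved path) is an assumption that holds on typical AHCI systems but may differ elsewhere.

```python
import glob
import os
import re
from collections import Counter

# Matches libata exception lines such as
# "[ 2053.836204] ata3.00: exception Emask 0x10 ..."
EXCEPTION_RE = re.compile(r"\b(ata\d+)(?:\.\d+)?: exception\b")

# Matches the ataN component of a resolved sysfs path such as
# ".../ata3/host2/target2:0:0/2:0:0:0/block/sdc" (assumed layout)
SYSFS_PORT_RE = re.compile(r"/(ata\d+)/")


def count_exceptions(dmesg_text):
    """Return a Counter mapping a port name (e.g. 'ata3') to its exception count."""
    return Counter(m.group(1) for m in EXCEPTION_RE.finditer(dmesg_text))


def port_from_sysfs_path(resolved_path):
    """Extract 'ataN' from a resolved /sys/block/<disk> path, or None if absent."""
    m = SYSFS_PORT_RE.search(resolved_path)
    return m.group(1) if m else None


def map_disks_to_ports():
    """Map each sdX disk to its ataN port by resolving its sysfs symlink."""
    mapping = {}
    for link in glob.glob("/sys/block/sd*"):
        mapping[os.path.basename(link)] = port_from_sysfs_path(os.path.realpath(link))
    return mapping
```

Calling `count_exceptions(open("dmesg.txt").read())` on a saved log shows which port the exceptions come from, and `map_disks_to_ports()` shows which disk currently sits on that port. If the errors stay on the same ataN after the disks are swapped, that points at the port or cable rather than at the drive.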