[solved] Proxmox 9: system disk crash, web GUI not responding after 10 min with kernel above 6.8.12-13-pve

denut

New Member
Apr 16, 2025
Hello,

I have been using Proxmox on my server since version 8.

Motherboard: X399 Designare EX
Processor: Threadripper 1950X
Boot disk: on the SATA controller

Everything was working perfectly. After upgrading to Proxmox 9 it still worked, and it kept working through every kernel update up to 6.8.12-13.

Since kernel 6.14.xx-x, the system crashes after anywhere from 5 to 45 minutes and never runs for more than 1 hour. The VMs stay up, but the web interface is either unreachable, or reachable with the console shell and VM shutdown/reboot not working.

In the system log I have:

Oct 16 06:55:46 pve-tr4 kernel: ata1.00: failed command: READ FPDMA QUEUED
Oct 16 06:55:46 pve-tr4 kernel: ata1.00: cmd 60/08:10:08:32:64/00:00:0c:00:00/40 tag 2 ncq dma 4096 in
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Oct 16 06:55:46 pve-tr4 kernel: ata1.00: status: { DRDY }
Oct 16 06:55:46 pve-tr4 kernel: ata1.00: failed command: WRITE FPDMA QUEUED
Oct 16 06:55:46 pve-tr4 kernel: ata1.00: cmd 61/08:48:08:10:5e/00:00:01:00:00/40 tag 9 ncq dma 4096 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)


and:



Oct 16 06:55:46 pve-tr4 kernel: ata1.00: failed command: WRITE FPDMA QUEUED
Oct 16 06:55:46 pve-tr4 kernel: ata1.00: cmd 61/08:50:28:30:25/00:00:01:00:00/40 tag 10 ncq dma 4096 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Oct 16 06:55:46 pve-tr4 kernel: ata1.00: status: { DRDY }


and:



Oct 16 06:55:46 pve-tr4 kernel: ata1: hard resetting link
Oct 16 06:55:46 pve-tr4 kernel: ata1: SATA link down (SStatus 0 SControl 300)
Oct 16 06:55:46 pve-tr4 kernel: ata1: hard resetting link
Oct 16 06:55:46 pve-tr4 kernel: ata1: SATA link down (SStatus 0 SControl 300)
Oct 16 06:55:46 pve-tr4 kernel: ata1: limiting SATA link speed to <unknown>
Oct 16 06:55:46 pve-tr4 kernel: ata1: hard resetting link
Oct 16 06:55:46 pve-tr4 kernel: ata1: SATA link down (SStatus 0 SControl 3F0)
Oct 16 06:55:46 pve-tr4 kernel: ata1.00: disable device




Oct 16 06:55:46 pve-tr4 kernel: Buffer I/O error on device dm-1, logical block 42500
Oct 16 06:55:46 pve-tr4 kernel: sd 0:0:0:0: [sda] tag#24 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=45s
Oct 16 06:55:46 pve-tr4 kernel: sd 0:0:0:0: [sda] tag#24 Sense Key : Not Ready [current]
Oct 16 06:55:46 pve-tr4 kernel: sd 0:0:0:0: [sda] tag#24 Add. Sense: Logical unit not ready, hard reset required
Oct 16 06:55:46 pve-tr4 kernel: sd 0:0:0:0: [sda] tag#24 CDB: Write(10) 2a 00 01 25 40 38 00 00 08 00
Oct 16 06:55:46 pve-tr4 kernel: EXT4-fs warning (device dm-1): ext4_end_bio:353: I/O error 10 writing to inode 5637299 starting block 42503)
Oct 16 06:55:46 pve-tr4 kernel: Buffer I/O error on device dm-1, logical block 42503
Oct 16 06:55:46 pve-tr4 kernel: EXT4-fs warning (device dm-1): ext4_end_bio:353: I/O error 10 writing to inode 5768643 starting block 4652177)
Oct 16 06:55:46 pve-tr4 kernel: Buffer I/O error on device dm-1, logical block 4652177
Oct 16 06:55:46 pve-tr4 kernel: ata1: EH complete
Oct 16 06:55:46 pve-tr4 kernel: Aborting journal on device dm-1-8.
Oct 16 06:55:46 pve-tr4 kernel: ata1.00: detaching (SCSI 0:0:0:0)
Oct 16 06:55:46 pve-tr4 kernel: EXT4-fs error (device dm-1) in ext4_reserve_inode_write:5941: Journal has aborted
Oct 16 06:55:46 pve-tr4 kernel: EXT4-fs (dm-1): ext4_do_writepages: jbd2_start: 11263 pages, ino 5636746; err -30
Oct 16 06:55:46 pve-tr4 kernel: EXT4-fs error (device dm-1): ext4_journal_check_start:84: comm pmxcfs: Detected aborted journal
Oct 16 06:55:46 pve-tr4 kernel: EXT4-fs error (device dm-1): ext4_journal_check_start:84: comm kworker/u131:1: Detected aborted journal



For now I am running with kernel 6.8.12-13 pinned, but I would like to upgrade.

Does anybody have an idea what causes this problem?

Thanks
 
Hi,
what does smartctl report about the disk health? If all looks fine there, it might be a kernel regression. You could try the newer opt-in 6.17 kernel to see if it's fixed there.
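
A minimal sketch of that disk health check, assuming smartmontools is installed and that the boot disk is /dev/sda, as the [sda] lines in the log suggest. The helper name smart_raw is hypothetical, and the attribute names in the usage comments are common on SATA drives but vary by vendor, so they may not exist on this particular disk.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): read `smartctl -A` output on
# stdin and print the RAW_VALUE column (10th field) of one named attribute.
smart_raw() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Typical use on the host, as root (device name is an assumption):
#   smartctl -H /dev/sda                                    # overall health self-assessment
#   smartctl -A /dev/sda | smart_raw Reallocated_Sector_Ct  # remapped sectors
#   smartctl -A /dev/sda | smart_raw UDMA_CRC_Error_Count   # link-level CRC errors
```

If the health check passes and these counters stay flat while the errors only appear on newer kernels, that would be consistent with the kernel-side explanation rather than a failing disk.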
 
Hello,

Thanks, fiona, for your answer.

I tested kernel 6.17.2-1: same problem, after 10 minutes the web interface is not reachable.

Sans titre 1.jpg

Kernel 6.8.12-13-pve is pinned again.

Waiting for a solution.
 
What exact disk model do you have? Are there firmware updates available?
 
It may be power saving on the SATA link. Search this forum, there is a topic about it. You probably need to add a kernel option in GRUB to disable power saving on the SATA link. Try adding ahci.mobile_lpm_policy=0; if that does not help, try libata.noacpi=1. Remember to run update-grub after the change.
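
A rough sketch of how that option could be added, assuming the host boots via GRUB and keeps its kernel command line in GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub (the usual Debian layout). The function name add_kernel_param is hypothetical, and it relies on GNU sed's -i option. Hosts that boot with systemd-boot instead of GRUB are not covered here.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): append a kernel parameter to the
# GRUB_CMDLINE_LINUX_DEFAULT line of the given file, unless it is already there.
add_kernel_param() {
    file=$1
    param=$2
    # Do nothing if the parameter is already on the line.
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*${param}" "$file"; then
        return 0
    fi
    # Insert the parameter just before the closing double quote.
    sed -i "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 ${param}\"|" "$file"
}

# Typical use on the host, as root:
#   cp /etc/default/grub /etc/default/grub.bak      # keep a backup
#   add_kernel_param /etc/default/grub ahci.mobile_lpm_policy=0
#   update-grub
#   reboot
#   cat /proc/cmdline    # after the reboot, the parameter should be listed
```

Editing the file by hand works just as well; the helper only makes the change repeatable without adding the option twice.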
 
Hello, thanks very much.

It looks like it is working: the server has been powered on for 2 hours with no more problems, with the "ahci.mobile_lpm_policy=0" option added to GRUB and kernel Linux 6.17.2-1-pve.

Have a nice day!
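
For anyone who wants to confirm the fix over a longer uptime, here is a small sketch that counts the ATA error markers seen earlier in this thread in the kernel log of the current boot. The function name count_ata_events is hypothetical, and the patterns are taken only from the log excerpts in the first post, so other failure messages would not be counted.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): read kernel log lines on stdin
# and print how many match the ATA failure markers from the first post.
count_ata_events() {
    awk '/failed command: (READ|WRITE) FPDMA QUEUED|hard resetting link|SATA link down/ { n++ }
         END { print n + 0 }'
}

# Typical use on the host:
#   journalctl -k -b | count_ata_events      # kernel messages of the current boot
#   cat /sys/class/scsi_host/host*/link_power_management_policy
#                                            # current link power policy per SATA host
```

A count that stays at 0 after several hours of uptime on the newer kernel would support the conclusion that link power management was the trigger.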