ceph-osd crashes with kernel 6.17.2-1-pve on Dell system

https://bugzilla.kernel.org/show_bug.cgi?id=220693

Niklas Cassel 2026-04-21 09:24:15 UTC
(In reply to Phill from comment #47)
> I have been seeing similar issues in 6.17 kernels on my proxmox host since
> they switched from 6.14 to 6.17. After finding this bugzilla posting, i was
> able to get the 6.17 kernel booted up without issues using a udev rule. Not
> sure if that is helpful at all, but here it is. if nothing else, it might
> help other users who stumble upon this.
>
>
> # cat /etc/udev/rules.d/99-dellboss-max-sectors.rules
> ACTION=="add|change", SUBSYSTEM=="block", ENV{ID_MODEL}=="DELLBOSS_VD", ATTR{queue/max_sectors_kb}="1280"

Hello Phill,

For this specific drive, there has been a quirk added:
https://git.kernel.org/pub/scm/linu.../?id=2e983271363108b3813b38754eb96d9b1cb252bb

This commit was first included in v6.19, so I would expect your problem to
be fixed if you upgrade to v6.19+.
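A minimal sketch for checking both points on a host, assuming the upstream version in the release string is a reasonable proxy for "contains the v6.19 quirk" (distribution kernels may backport or drop patches, so this is not proof). The helper names `kernel_at_least` and `max_sectors_kb` are hypothetical; the sysfs attribute `queue/max_sectors_kb` is the same one the udev rule above sets.

```python
import platform
import re
from pathlib import Path


def kernel_at_least(release: str, major: int, minor: int) -> bool:
    """Return True if a release string such as '6.17.2-1-pve' is >= major.minor."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if m is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return (int(m.group(1)), int(m.group(2))) >= (major, minor)


def max_sectors_kb(device: str, sysfs_root: str = "/sys/block") -> int:
    """Read queue/max_sectors_kb for a block device name such as 'sda'."""
    path = Path(sysfs_root) / device / "queue" / "max_sectors_kb"
    return int(path.read_text().strip())


if __name__ == "__main__":
    release = platform.release()
    # Only compares version numbers; it does not inspect the kernel source.
    print(release, "is based on v6.19 or later:", kernel_at_least(release, 6, 19))
```

On an affected host, calling `max_sectors_kb("sda")` (with the device name adjusted to the actual boot device) before and after applying the udev rule shows whether the rule took effect.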
 
I tried the latest kernel available from community repositories:
Code:
Linux pmx-05 7.0.0-3-pve #1 SMP PREEMPT_DYNAMIC PMX 7.0.0-3 (2026-04-21T22:56Z) x86_64 GNU/Linux

For now, the Ceph issues seem to be gone: all OSDs on the updated host show up/in, which was never the case with the 6.17 kernels (tried up to 6.17.13-6).
 
Unfortunately, I am still having the same issue even with a 7.x kernel:

Linux pve 7.0.2-6-pve #1 SMP PREEMPT_DYNAMIC PMX 7.0.2-6 (2026-05-20T08:55Z) x86_64 GNU/Linux

2026-05-25T04:56:28.342706+00:00 pve kernel: Call Trace:
2026-05-25T04:56:28.342707+00:00 pve kernel: <TASK>
2026-05-25T04:56:28.342709+00:00 pve kernel: megasas_queue_command+0x125/0x1d0 [megaraid_sas]
2026-05-25T04:56:28.342709+00:00 pve kernel: scsi_queue_rq+0x82e/0xcf0
2026-05-25T04:56:28.342720+00:00 pve kernel: blk_mq_dispatch_rq_list+0x136/0x7b0
 
On an HP DL380 G11 with MegaRAID, no problem.
 
We've been hitting this issue on the same hardware: a Dell R660xs with a PERC H755N RAID controller on the latest firmware. The NVMe drives are passed through as non-RAID disks, and we are running a Ceph cluster as well.

Stable on 6.14; crashes randomly, usually during boot, on 6.17, 6.19, and 7.0.

Code:
[    4.863109] sd 0:2:0:0: [sda] tag#3471 page boundary ptr_sgl: 0x00000000d22f0376
[    4.863574] BUG: unable to handle page fault for address: ff860691c3bd4000
[    4.863927] #PF: supervisor write access in kernel mode
[    4.864278] #PF: error_code(0x0002) - not-present page
[    4.864478] systemd[1]: Finished systemd-random-seed.service - Load/Save OS Random Seed.
[    4.864594] PGD 100000067 P4D 100860067 PUD 100861067 PMD 12b5d4067 PTE 0
[    4.865322] Oops: Oops: 0002 [#1] SMP NOPTI
[    4.865326] CPU: 2 UID: 0 PID: 483 Comm: systemd-modules Tainted: G S         O        7.0.2-6-pve #1 PREEMPT(lazy) 
[    4.866091] Tainted: [S]=CPU_OUT_OF_SPEC, [O]=OOT_MODULE
[    4.866092] Hardware name: Dell Inc. PowerEdge R660xs/0GK9K6, BIOS 2.10.1 04/01/2026
[    4.866093] RIP: 0010:megasas_build_and_issue_cmd_fusion+0xe88/0x1850 [megaraid_sas]
[    4.867165] Code: 20 48 89 d1 48 83 e1 fc 83 e2 01 48 0f 45 d9 4c 8b 73 10 44 8b 6b 18 4c 89 f9 4c 8d 79 08 45 85 fa 0f 84 fd 03 00 00 45 29 cc <4c> 89 31 48 83 c0 08 41 83 c0 01 45 29 cd 45 85 e4 7f ab 44 89 c0
[    4.867167] RSP: 0018:ff860691c8a87360 EFLAGS: 00010206
[    4.867170] RAX: 00000000fe471000 RBX: ff3fc503ad9db248 RCX: ff860691c3bd4000
[    4.867171] RDX: ff860691c3bd4008 RSI: ff3fc503ad9db110 RDI: 0000000000000000
[    4.867173] RBP: ff860691c8a87430 R08: 0000000000000200 R09: 0000000000001000
[    4.867174] R10: 0000000000000fff R11: 0000000000001000 R12: 0000000000024000
[    4.867175] R13: 0000000000025000 R14: 00000000faa00000 R15: ff860691c3bd4008
[    4.867177] FS:  00007947774499c0(0000) GS:ff3fc54258e0f000(0000) knlGS:0000000000000000
[    4.870152] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    4.870154] CR2: ff860691c3bd4000 CR3: 00000001306a5005 CR4: 0000000000f71ef0
[    4.870156] PKRU: 55555554
[    4.870157] Call Trace:
[    4.870159]  <TASK>
[    4.871863]  megasas_queue_command+0x125/0x1d0 [megaraid_sas]
[    4.871874]  scsi_queue_rq+0x82e/0xcf0
[    4.871879]  blk_mq_dispatch_rq_list+0x136/0x7b0
[    4.871883]  ? sbitmap_get+0x73/0x180
[    4.871886]  __blk_mq_sched_dispatch_requests+0x415/0x620
[    4.871890]  blk_mq_sched_dispatch_requests+0x2d/0x80
[    4.871892]  blk_mq_run_hw_queue+0x2c3/0x330
[    4.871895]  blk_mq_dispatch_list+0x16a/0x4d0
[    4.871898]  blk_mq_flush_plug_list+0x62/0x1d0
[    4.871901]  __blk_flush_plug+0xef/0x150
[    4.871904]  blk_finish_plug+0x30/0x50
[    4.871907]  read_pages+0x19e/0x240
[    4.871911]  page_cache_ra_order+0x271/0x410
[    4.871914]  page_cache_async_ra+0x1a9/0x260
[    4.871916]  ? filemap_get_read_batch+0x15b/0x2e0
[    4.871919]  filemap_readahead.isra.0+0x73/0xb0
[    4.871922]  filemap_get_pages+0x31a/0x7e0
[    4.871926]  ? copy_page_to_iter+0x9f/0x190
[    4.871929]  filemap_read+0x114/0x4a0
[    4.871934]  generic_file_read_iter+0xbb/0x110
[    4.871937]  ext4_file_read_iter+0x60/0x200
[    4.871940]  __kernel_read+0x196/0x310
[    4.871943]  kernel_read+0x44/0x60
[    4.871945]  kernel_read_file+0x182/0x2f0
[    4.871949]  init_module_from_file+0xd2/0x160
[    4.871953]  idempotent_init_module+0x110/0x300
[    4.871957]  __x64_sys_finit_module+0x73/0xf0
[    4.871959]  x64_sys_call+0xe2e/0x2390
[    4.871962]  do_syscall_64+0x11c/0x14e0
[    4.871964]  ? do_syscall_64+0x15a/0x14e0
[    4.871966]  ? exc_page_fault+0x92/0x1c0
[    4.871969]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[    4.871971] RIP: 0033:0x79477771a7b9
[    4.871974] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 27 66 0d 00 f7 d8 64 89 01 48
[    4.871975] RSP: 002b:00007fff554f74b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[    4.871978] RAX: ffffffffffffffda RBX: 000060c266c59c00 RCX: 000079477771a7b9
[    4.871979] RDX: 0000000000000000 RSI: 000060c266c6a080 RDI: 0000000000000006
[    4.871981] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[    4.871982] R10: 0000000000000000 R11: 0000000000000246 R12: 000060c266c6a080
[    4.871983] R13: 0000000000020000 R14: 000060c266c59900 R15: 0000000000000000
[    4.871985]  </TASK>
[    4.871986] Modules linked in: spl(O) vhost_net vhost vhost_iotlb tap efi_pstore nfnetlink dmi_sysfs ip_tables x_tables autofs4 btrfs libblake2b xor raid6_pq cdc_ether usbnet mii hid_generic usbkbd usbmouse usbhid hid dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio ice libeth_xdp gnss i2c_i801 libeth idxd libie xhci_pci i2c_mux libie_adminq ahci xhci_hcd megaraid_sas idxd_bus spi_intel_pci libie_fwlog tg3 i2c_smbus libahci spi_intel i2c_ismt wmi pinctrl_emmitsburg
[    4.881114] CR2: ff860691c3bd4000
[    4.881116] ---[ end trace 0000000000000000 ]---
[    4.974108] RIP: 0010:megasas_build_and_issue_cmd_fusion+0xe88/0x1850 [megaraid_sas]
[    4.974335] Code: 20 48 89 d1 48 83 e1 fc 83 e2 01 48 0f 45 d9 4c 8b 73 10 44 8b 6b 18 4c 89 f9 4c 8d 79 08 45 85 fa 0f 84 fd 03 00 00 45 29 cc <4c> 89 31 48 83 c0 08 41 83 c0 01 45 29 cd 45 85 e4 7f ab 44 89 c0
[    4.974764] RSP: 0018:ff860691c8a87360 EFLAGS: 00010206
[    4.974991] RAX: 00000000fe471000 RBX: ff3fc503ad9db248 RCX: ff860691c3bd4000
[    4.975216] RDX: ff860691c3bd4008 RSI: ff3fc503ad9db110 RDI: 0000000000000000
[    4.975444] RBP: ff860691c8a87430 R08: 0000000000000200 R09: 0000000000001000
[    4.975673] R10: 0000000000000fff R11: 0000000000001000 R12: 0000000000024000
[    4.975903] R13: 0000000000025000 R14: 00000000faa00000 R15: ff860691c3bd4008
[    4.976134] FS:  00007947774499c0(0000) GS:ff3fc54258e0f000(0000) knlGS:0000000000000000
[    4.976370] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    4.976606] CR2: ff860691c3bd4000 CR3: 00000001306a5005 CR4: 0000000000f71ef0
[    4.976848] PKRU: 55555554
[    4.977089] note: systemd-modules[483] exited with irqs disabled
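To compare reports like this one against others in the thread, a small script can pull the key fields out of an oops. This is only a sketch for reading the log, not a diagnosis: `summarize_oops` is a hypothetical helper, and the four sample lines are copied from the trace above.

```python
import re

# Four lines copied verbatim from the oops above.
OOPS_SAMPLE = """\
[    4.863574] BUG: unable to handle page fault for address: ff860691c3bd4000
[    4.866093] RIP: 0010:megasas_build_and_issue_cmd_fusion+0xe88/0x1850 [megaraid_sas]
[    4.867170] RAX: 00000000fe471000 RBX: ff3fc503ad9db248 RCX: ff860691c3bd4000
[    4.870154] CR2: ff860691c3bd4000 CR3: 00000001306a5005 CR4: 0000000000f71ef0
"""


def summarize_oops(text: str) -> dict:
    """Extract the faulting address, function, module and matching registers."""
    fault = int(re.search(r"page fault for address: ([0-9a-f]+)", text).group(1), 16)
    rip = re.search(r"RIP: 0010:([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?", text)
    regs = {}
    for name, value in re.findall(r"\b(R[A-Z0-9]{2}|CR[0-9]):\s+([0-9a-f]{16})\b", text):
        # Keep the first occurrence so a later user-space dump does not overwrite it.
        regs.setdefault(name, int(value, 16))
    return {
        "fault_address": hex(fault),
        "function": rip.group(1),
        "module": rip.group(2),
        "page_aligned": fault % 4096 == 0,
        "registers_at_fault_address": sorted(n for n, v in regs.items() if v == fault),
    }


if __name__ == "__main__":
    for key, value in summarize_oops(OOPS_SAMPLE).items():
        print(f"{key}: {value}")
```

For the sample, the faulting function is `megasas_build_and_issue_cmd_fusion` in `megaraid_sas`, and the faulting address is page-aligned and equal to RCX, which fits the "page boundary ptr_sgl" message printed just before the fault.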
 