Hello.
I reported a ZFS issue here: PANIC: rpool: blkptr at ... DVA 0 has invalid OFFSET 18388167655883276288 #12019
The IO delay on the node rises from minute to minute. After some hours the node stops responding completely. Services already in RAM (like Ceph) keep running.
After a long time the cluster shows this node as unavailable, but it still responds to ping and accepts TCP connections on the SSH port (no login is possible). It requires a manual reset, which brings it back to life for a short while, until one of the VMs touches the problematic ZFS area.
I know that no software is perfect and that OpenZFS raises a kernel PANIC. But why is the node not rebooted when the kernel cmdline contains panic=30?
Additionally, I think that in this case the local watchdog should detect the issue and reboot the node.
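For anyone who wants to check the same setting on their own node, here is a minimal sketch of how the panic timeout could be verified. It assumes a Linux host where /proc/cmdline and /proc/sys/kernel/panic are readable. The helper names parse_panic_timeout and read_text are made up for this example, and taking the last panic= token when several are present is an assumption of the sketch.

```python
from pathlib import Path
from typing import Optional


def parse_panic_timeout(cmdline: str) -> Optional[int]:
    """Return the value of panic=N from a kernel command line, or None.

    Assumption of this sketch: if panic= appears more than once,
    the last occurrence is the one reported.
    """
    value = None
    for token in cmdline.split():
        if token.startswith("panic="):
            try:
                value = int(token[len("panic="):])
            except ValueError:
                pass  # ignore malformed values such as "panic=abc"
    return value


def read_text(path: str) -> Optional[str]:
    """Read a small text file, returning None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    cmdline = read_text("/proc/cmdline") or ""
    # Value requested at boot time via the kernel command line.
    print("panic= on cmdline:", parse_panic_timeout(cmdline))
    # Value the running kernel reports (seconds before reboot; 0 means no reboot).
    print("kernel.panic sysctl:", read_text("/proc/sys/kernel/panic"))
```

If both values show 30 and the node still hangs instead of rebooting, one possible explanation is that the PANIC message from OpenZFS did not go through the kernel's own panic path. That is only a guess from the symptoms and would need confirmation.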