ZFS device fault for pool nvme

corianito

Hello! Today I received an email saying this:

ZFS has detected that a device was removed.

impact: Fault tolerance of the pool may be compromised.
eid: 405
class: statechange
state: REMOVED
host: sevilla5
time: 2024-06-16 10:36:53+0200
vpath: /dev/disk/by-id/nvme-HUSMR7676BDP3Y1_SDM00001F934-part1
vphys: pci-0000:84:00.0-nvme-1
vguid: 0xD26E23B18BC6084C
devid: nvme-HUSMR7676BDP3Y1_SDM00001F934-part1
pool: nvme (0x996F9702537A4C86)

The data center sent me a picture of the server: all the disks have their light on, but one of them is not in use. Can this be fixed from within Proxmox, or do I simply have to replace the disk? This is the first time I have faced this problem.
 
Did you (or someone else) remove the NVMe from the PCIe bus? Was a VM started with PCI(e) passthrough of a device in the same IOMMU group? Was the driver unloaded? It really depends on why the NVMe device went missing. I think most of those causes can be checked with a reboot (without automatically starting VMs that use PCI(e) passthrough).
Maybe you could look at the system log to find out whether there were errors or other clues about why the device was removed.
 
Hello!

Really, nobody did anything. It happened for no apparent reason while I was sleeping, hahaha.

NVMe 1/3 simply disappeared. Can you give me a hint about which commands could help me see the problem?

Greetings, and thank you very much for taking the time to respond.
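
As a starting point for the question about commands, here is a minimal sketch of read-only checks, not a definitive procedure. The pool name, PCI address, and time window are taken from the alert email above; the `run` helper is a hypothetical wrapper added only so the script skips tools that are not installed (`nvme` comes from the nvme-cli package, which may not be present). None of these commands change the pool.

```shell
#!/bin/sh
# Read-only diagnostics for the REMOVED device reported in the alert.
POOL="nvme"          # pool name from the alert ("pool: nvme")
PCI_ADDR="84:00.0"   # from "vphys: pci-0000:84:00.0-nvme-1"

# Hypothetical helper: run a tool only if it is installed, and keep going on failure.
run() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "== $*"
        "$@" || echo "(command failed: $*)"
    else
        echo "== skipped: $1 not installed"
    fi
}

run zpool status -v "$POOL"   # which vdev is REMOVED and whether the pool is DEGRADED
run zpool events -v           # recent ZFS events with their details
run lspci -s "$PCI_ADDR"      # is the NVMe controller still visible on the PCIe bus?
run nvme list                 # NVMe namespaces the kernel currently sees (nvme-cli)

# Kernel messages around the time of the alert (2024-06-16 10:36:53).
run journalctl -k --no-pager --since "2024-06-16 10:30" --until "2024-06-16 10:45"
```

If `lspci` no longer lists the controller, the device dropped off the bus and the kernel log around the alert time is the place to look for the reason. If the controller and namespace are still listed, `zpool status` shows whether ZFS still considers the device removed.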