Hi,
This morning one of my NVMe drives "disappeared" from the server, but the server's iLO still reports it as online and healthy.
The drive is a WDC Gold SN600 and its device name is /dev/nvme1n1. I'm using this drive in a Ceph pool.
Here is my pveversion -v output:
		Code:
	
	proxmox-ve: 6.3-1 (running kernel: 5.4.78-2-pve)
pve-manager: 6.3-3 (running version: 6.3-3/eee5f901)
pve-kernel-5.4: 6.3-3
pve-kernel-helper: 6.3-3
pve-kernel-5.4.78-2-pve: 5.4.78-2
pve-kernel-5.4.73-1-pve: 5.4.73-1
ceph: 15.2.8-pve2
ceph-fuse: 15.2.8-pve2
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 3.0.0-1+pve3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.7
libproxmox-backup-qemu0: 1.0.2-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.3-2
libpve-guest-common-perl: 3.1-4
libpve-http-server-perl: 3.1-1
libpve-storage-perl: 6.3-4
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.6-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-3
pve-cluster: 6.2-1
pve-container: 3.3-2
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-8
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-3
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.5-pve1
lsblk -l output without rbd blocks:
		Code:
	
	NAME                                                                                                  MAJ:MIN  RM   SIZE RO TYPE MOUNTPOINT
sda                                                                                                     8:0     0 447.1G  0 disk
├─sda1                                                                                                  8:1     0  1007K  0 part
├─sda2                                                                                                  8:2     0   512M  0 part /boot/efi
└─sda3                                                                                                  8:3     0 446.6G  0 part
  ├─pve-swap                                                                                          253:1     0     8G  0 lvm  [SWAP]
  ├─pve-root                                                                                          253:2     0    96G  0 lvm  /
  ├─pve-data_tmeta                                                                                    253:3     0   3.3G  0 lvm
  │ └─pve-data                                                                                        253:5     0 320.1G  0 lvm
  └─pve-data_tdata                                                                                    253:4     0 320.1G  0 lvm
    └─pve-data                                                                                        253:5     0 320.1G  0 lvm
nvme0n1                                                                                               259:0     0   1.8T  0 disk
└─nvme0n1p1                                                                                           259:5     0   1.8T  0 part
  ├─vg--local3-lv--local3_tmeta                                                                       253:0     0   112M  0 lvm
  │ └─vg--local3-lv--local3-tpool                                                                     253:7     0   1.8T  0 lvm
  │   ├─vg--local3-lv--local3                                                                         253:12    0   1.8T  0 lvm
  │   ├─vg--local3-vm--6000--disk--0                                                                  253:13    0    20G  0 lvm
  │   ├─vg--local3-vm--6006--disk--0                                                                  253:14    0    20G  0 lvm
  │   ├─vg--local3-vm--6007--disk--0                                                                  253:15    0    20G  0 lvm
  │   ├─vg--local3-vm--6010--disk--0                                                                  253:16    0    20G  0 lvm
  │   ├─vg--local3-vm--6011--disk--0                                                                  253:17    0    20G  0 lvm
  │   ├─vg--local3-vm--6013--disk--0                                                                  253:18    0    20G  0 lvm
  │   ├─vg--local3-vm--6026--disk--2                                                                  253:19    0    15G  0 lvm
  │   ├─vg--local3-vm--6004--disk--0                                                                  253:20    0    20G  0 lvm
  │   ├─vg--local3-vm--6018--disk--0                                                                  253:21    0    20G  0 lvm
  │   ├─vg--local3-vm--6023--disk--0                                                                  253:22    0    20G  0 lvm
  │   ├─vg--local3-vm--6025--disk--0                                                                  253:23    0    20G  0 lvm
  │   ├─vg--local3-vm--6025--disk--1                                                                  253:24    0    25G  0 lvm
  │   ├─vg--local3-vm--6026--disk--0                                                                  253:25    0    20G  0 lvm
  │   ├─vg--local3-vm--6026--disk--1                                                                  253:26    0    60G  0 lvm
  │   ├─vg--local3-vm--1003--disk--0                                                                  253:27    0    20G  0 lvm
  │   └─vg--local3-vm--1003--disk--1                                                                  253:28    0    60G  0 lvm
  └─vg--local3-lv--local3_tdata                                                                       253:6     0   1.8T  0 lvm
    └─vg--local3-lv--local3-tpool                                                                     253:7     0   1.8T  0 lvm
      ├─vg--local3-lv--local3                                                                         253:12    0   1.8T  0 lvm
      ├─vg--local3-vm--6000--disk--0                                                                  253:13    0    20G  0 lvm
      ├─vg--local3-vm--6006--disk--0                                                                  253:14    0    20G  0 lvm
      ├─vg--local3-vm--6007--disk--0                                                                  253:15    0    20G  0 lvm
      ├─vg--local3-vm--6010--disk--0                                                                  253:16    0    20G  0 lvm
      ├─vg--local3-vm--6011--disk--0                                                                  253:17    0    20G  0 lvm
      ├─vg--local3-vm--6013--disk--0                                                                  253:18    0    20G  0 lvm
      ├─vg--local3-vm--6026--disk--2                                                                  253:19    0    15G  0 lvm
      ├─vg--local3-vm--6004--disk--0                                                                  253:20    0    20G  0 lvm
      ├─vg--local3-vm--6018--disk--0                                                                  253:21    0    20G  0 lvm
      ├─vg--local3-vm--6023--disk--0                                                                  253:22    0    20G  0 lvm
      ├─vg--local3-vm--6025--disk--0                                                                  253:23    0    20G  0 lvm
      ├─vg--local3-vm--6025--disk--1                                                                  253:24    0    25G  0 lvm
      ├─vg--local3-vm--6026--disk--0                                                                  253:25    0    20G  0 lvm
      ├─vg--local3-vm--6026--disk--1                                                                  253:26    0    60G  0 lvm
      ├─vg--local3-vm--1003--disk--0                                                                  253:27    0    20G  0 lvm
      └─vg--local3-vm--1003--disk--1                                                                  253:28    0    60G  0 lvm
nvme1n1                                                                                               259:1     0   1.8T  0 disk
└─ceph--f107a279--77ae--4003--8523--b62d356df5bd-osd--block--95f1cb37--6324--472a--9cd7--0ba92770f3b5 253:8     0   1.8T  0 lvm
nvme3n1                                                                                               259:2     0   1.8T  0 disk
└─ceph--6ed9e93b--685a--4b3a--ae25--ca14e772d7ee-osd--block--0b46114b--f375--4ad7--9f80--914e4b802ea4 253:10    0   1.8T  0 lvm
nvme4n1                                                                                               259:3     0   1.8T  0 disk
└─ceph--cd46f637--5533--49bb--aa42--09efc22d89dc-osd--block--f74e3f69--c6a9--49fb--af63--ebac9a98bc22 253:11    0   1.8T  0 lvm
and also dmesg | grep nvme:
		Code:
	
	[8599639.793030] nvme nvme2: I/O 677 QID 7 timeout, aborting
[8599639.793038] nvme nvme2: I/O 678 QID 7 timeout, aborting
[8599639.793040] nvme nvme2: I/O 679 QID 7 timeout, aborting
[8599639.793043] nvme nvme2: I/O 680 QID 7 timeout, aborting
[8599670.508517] nvme nvme2: I/O 677 QID 7 timeout, reset controller
[8599701.231967] nvme nvme2: I/O 0 QID 0 timeout, reset controller
[8599742.451266] nvme nvme2: Device not ready; aborting reset
[8599742.492142] nvme nvme2: Abort status: 0x371
[8599742.492144] nvme nvme2: Abort status: 0x371
[8599742.492145] nvme nvme2: Abort status: 0x371
[8599742.492146] nvme nvme2: Abort status: 0x371
[8599753.139132] nvme nvme2: Device not ready; aborting reset
[8599753.139528] nvme nvme2: Removing after probe failure status: -19
[8599763.758945] nvme nvme2: Device not ready; aborting reset
[8599763.759494] blk_update_request: I/O error, dev nvme2n1, sector 1802065328 op 0x0:(READ) flags 0x0 phys_seg 2 prio class 0
[8599763.759497] blk_update_request: I/O error, dev nvme2n1, sector 2048 op 0x0:(READ) flags 0x0 phys_seg 32 prio class 0
[8599763.759501] blk_update_request: I/O error, dev nvme2n1, sector 2797415056 op 0x1:(WRITE) flags 0x8800 phys_seg 3 prio class 0
[8599763.759503] blk_update_request: I/O error, dev nvme2n1, sector 1802230728 op 0x0:(READ) flags 0x0 phys_seg 2 prio class 0
[8599763.759506] blk_update_request: I/O error, dev nvme2n1, sector 2285580784 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[8599763.759508] blk_update_request: I/O error, dev nvme2n1, sector 1802234456 op 0x0:(READ) flags 0x0 phys_seg 2 prio class 0
[8599763.759514] blk_update_request: I/O error, dev nvme2n1, sector 0 op 0x0:(READ) flags 0x0 phys_seg 32 prio class 0
[8599763.759515] blk_update_request: I/O error, dev nvme2n1, sector 2797406496 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 0
[8599763.759519] blk_update_request: I/O error, dev nvme2n1, sector 1802229048 op 0x0:(READ) flags 0x0 phys_seg 2 prio class 0
[8599763.759520] blk_update_request: I/O error, dev nvme2n1, sector 2287970768 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Any ideas? I can't reboot the server right now.
Thanks
 
	