NVME is not displayed in /dev

Shuvi

New Member
Dec 19, 2024
Good evening,
I have just changed my case and booted Proxmox. Everything works except the NVMe drive, which contains my LVM with all disks.
I have tried:
1. Cut power and hold the power button for 10 seconds (this worked before when the SSD was not recognized)
2. Remove the SSD and test it in Windows (Disk Management): recognized normally, no partitions present
3. Query dmesg and lspci: nothing significant found
Bash:
dmesg | grep -i 'nvme\|error'
[    2.931854] ERST: Error Record Serialization Table (ERST) support is initialized.
[    3.360148] BERT: Error records from previous boot:
[    3.360150] [Hardware Error]: event severity: fatal
[    3.360153] [Hardware Error]:  Error 0, type: fatal
[    3.360154] [Hardware Error]:  fru_text: DIMM Locate:  P0E0
[    3.360156] [Hardware Error]:   section_type: memory error
[    3.360158] [Hardware Error]:    error_status: Storage error in DRAM memory (0x0000000000040400)
[    3.360161] [Hardware Error]:   node:0 card:6 module:0
[    3.360163] [Hardware Error]:   error_type: 15, physical memory map-out event
[    3.489247] RAS: Correctable Errors collector initialized.
[    4.199681] nvme nvme0: pci function 0000:83:00.0
[    4.214242] nvme nvme0: Shutdown timeout set to 10 seconds
[    4.230167] nvme nvme0: allocated 64 MiB host memory buffer.
[    4.257974] nvme nvme0: 16/0/0 default/read/poll queues
lspci -nn | grep -i nvme
83:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller PM9C1a (DRAM-less) [144d:a80d]
4. Desperately ask ChatGPT and test around
5. Open a thread here
It seems to be some software error; I have not done anything else to the NVMe.

If anyone can help me, that would be really great! ;)
 
Show output of lsblk, too.
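It may also help to check whether the kernel created a block device for the controller at all, since the dmesg output shows nvme0 being initialized. A minimal sketch, assuming the usual sysfs layout (controllers under /sys/class/nvme, disks under /sys/block); the function name nvme_summary and its optional root argument are made up for illustration:

```shell
#!/bin/sh
# Count NVMe controllers the driver has bound and NVMe block devices the
# kernel exposes. The optional argument (hypothetical, for illustration)
# lets you point the function at a different sysfs root; default is /sys.
nvme_summary() {
    root="${1:-/sys}"
    ctrls=0
    disks=0
    # Controllers show up as /sys/class/nvme/nvme0, nvme1, ...
    for c in "$root"/class/nvme/nvme*; do
        [ -e "$c" ] && ctrls=$((ctrls + 1))
    done
    # Disks (namespaces) show up as /sys/block/nvme0n1, ...
    for d in "$root"/block/nvme*; do
        [ -e "$d" ] && disks=$((disks + 1))
    done
    echo "controllers=$ctrls block_devices=$disks"
}

nvme_summary
```

If this reports a controller but zero block devices, the driver has bound to the device but no namespace was exposed as a disk, which would be consistent with nothing showing up as /dev/nvme0n1.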
Bash:
NAME               MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
sda                  8:0    0 14.6T  0 disk
├─sda1               8:1    0    2G  0 part
└─sda2               8:2    0 14.6T  0 part
sdb                  8:16   0 14.6T  0 disk
├─sdb1               8:17   0    2G  0 part
└─sdb2               8:18   0 14.6T  0 part
sdc                  8:32   0 14.6T  0 disk
├─sdc1               8:33   0    2G  0 part
└─sdc2               8:34   0 14.6T  0 part
sdd                  8:48   0 14.6T  0 disk
├─sdd1               8:49   0    2G  0 part
└─sdd2               8:50   0 14.6T  0 part
sde                  8:64   0  9.1T  0 disk
├─sde1               8:65   0  9.1T  0 part
└─sde9               8:73   0    8M  0 part
sdf                  8:80   0  9.1T  0 disk
├─sdf1               8:81   0  9.1T  0 part
└─sdf9               8:89   0    8M  0 part
sdg                  8:96   0  9.1T  0 disk
├─sdg1               8:97   0  9.1T  0 part
└─sdg9               8:105  0    8M  0 part
sdh                  8:112  0  9.1T  0 disk
├─sdh1               8:113  0  9.1T  0 part
└─sdh9               8:121  0    8M  0 part
sdi                  8:128  1 59.8G  0 disk
├─sdi1               8:129  1 1007K  0 part
├─sdi2               8:130  1  512M  0 part /boot/efi
└─sdi3               8:131  1 59.3G  0 part
  ├─pve-swap       252:20   0  7.4G  0 lvm  [SWAP]
  ├─pve-root       252:21   0   25G  0 lvm  /
  ├─pve-data_tmeta 252:22   0    1G  0 lvm 
  │ └─pve-data     252:24   0 17.5G  0 lvm 
  └─pve-data_tdata 252:23   0 17.5G  0 lvm 
    └─pve-data     252:24   0 17.5G  0 lvm
 
The dmesg output says "Hardware Error"; maybe the NVMe died?
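Note that the grep pattern 'nvme\|error' also matches the firmware error records, and the BERT line says those records come from the previous boot. To read the two apart, something like this could work. It is only a rough sketch, and the helper names nvme_only and hw_errors_only are made up here:

```shell
#!/bin/sh
# Keep only kernel messages that mention the NVMe driver or device.
nvme_only() {
    grep -i 'nvme'
}

# Keep only the firmware-reported hardware error records.
hw_errors_only() {
    grep -F '[Hardware Error]'
}

# Example use on a live system (reading dmesg may require root):
#   dmesg | nvme_only
#   dmesg | hw_errors_only
```

With the log posted in this thread, the first filter would leave only the four nvme0 lines, none of which is an error message.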
I think this has something to do with my RAM. There is one stick that just doesn't want to be recognized, but that is a CPU fault (I have swapped the stick, but the result is always the same), and it has not caused any problems so far except that I have a little less RAM.
I mean, it worked normally before, until I had to reboot because of the changes I made.

As already mentioned, it is also recognized in Windows and initialized in Linux.
 
It looks like the PCI address of the NVMe may be mapped into the bad DIMM, so as a follow-up error the device is not exposed in /dev, but it is seen in your Windows PC because that one doesn't have a bad DIMM.
 
I've removed the DIMM and nothing changed, except that this part is now missing :(
 
Unexpected, but that's life ... this laptop/PC M.2 NVMe isn't certified for your host hardware, right? Just use it in another machine; shit happens any day, anywhere.
 
I'm using a Samsung 990 EVO 2TB in a Supermicro H11SSL-i mainboard, and this config did its job just fine, so why should it have died now? And what am I doing wrong?
 
Update: I created a small partition in Windows, briefly tested some data on it, and everything was fine. I formatted it again, and now the fucking SSD is recognized. No idea why all of a sudden.
 
