Hi,
Reaching out to hear whether anyone else has run into this issue. In our 5-node Supermicro Ceph setup we mainly use the NVMe model Micron_9300_MTFDHAL7T6TDP. The problem is that when we add a new disk, the disk in the slot next to it drops out briefly. We do not yet know whether this is related to the newer disk model we are adding, Micron_7450_MTFDKCC7T6TFR.
The slot layout is shown below. In this case, when I insert a disk in slot 7, the disk in slot 5 drops out for a split second.
1 3 5 7 9
0 2 4 6 8
Checking the logs, we see lots of I/O errors for the failing disk, and /dev/nvme8n1 changes to /dev/nvme8n2. This is really annoying and causes a lot of trouble for us, since we now have to migrate everything away from a node and shut it down before doing any disk work.
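In case anyone wants to compare against their own logs, here is a rough sketch of how the relevant lines could be pulled out of a saved kernel log (for example the output of `journalctl -k` written to a file). The function name `nvme_error_lines` and the matching rule (an NVMe device name plus the word "error" on the same line) are just assumptions for illustration, not the exact filter we used.

```python
import re

# Matches NVMe device names such as nvme8 or nvme8n1 anywhere in a line.
NVME_RE = re.compile(r"\bnvme\d+(?:n\d+)?")

def nvme_error_lines(log_text, device=None):
    """Return log lines that mention an NVMe device and the word 'error'.

    If device is given (e.g. "nvme8"), keep only lines that mention that
    controller or one of its namespaces (nvme8n1, nvme8n2, ...).
    """
    hits = []
    for line in log_text.splitlines():
        names = NVME_RE.findall(line)
        if not names or "error" not in line.lower():
            continue
        if device is not None and not any(
            n == device or n.startswith(device + "n") for n in names
        ):
            continue
        hits.append(line)
    return hits
```

Calling `nvme_error_lines(open("kernel.log").read(), "nvme8")` would then list only the error lines for that one controller, where `kernel.log` is a hypothetical file name.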
Some info about the node, which runs Proxmox VE 7.4-3 on kernel Linux 5.15.104-1-pve (#1 SMP PVE 5.15.104-2, 2023-04-12T11:23Z):
Code:
Firmware Revision: 03.10.30
Firmware Build Time: 11/18/2022
BIOS Version: 2.5
BIOS Build Time: 09/14/2022
Redfish Version: 1.8.0
CPLD Version: A2.C5.08
Manufacturer: South Pole AB
Product Name: AS -1124US-TNRP
Serial No.:
-FRU Information
FRU Device ID: 0
Chassis Info:
Chassis Type: Other
Chassis Part Number: CSE-119UHTS-R1K22HP-A
Chassis Serial Number:
-Board Info:
Language: English
Board Manufacturer: Supermicro
Board Product Name: H12DSU-iN
Board Serial Num:
Board Part Num: H12DSU-iN
-Product Info:
Language: English
Manufacturer Name: South Pole AB
Product Name:
Product PartNum: AS -1124US-TNRP
Product Version:
Product SerialNum:
AssetTag:
It would be really interesting to hear whether we're alone with this issue.
--Mats