Summary:
Cause:
I found the following problem with the NVMe storage devices after the servers had been powered on and idle for a week without a reboot:
The problem is present on all 4 nodes; some NVMe storage devices slow down, others do not.
If I reboot the system, the problem disappears (though not completely, see below) and recurs within a couple of days.
Another oddity is how the devices are detected:
root@host2:~# smartctl -i /dev/nvme0 | grep -i model
Model Number: SAMSUNG MZVPW256HEGL-00000
root@host2:~# smartctl -i /dev/nvme0n1 | grep -i model
Model Number: INTEL SSDPE21D280GA
and
root@host2:~# smartctl -i /dev/nvme4 | grep -i model
Model Number: INTEL SSDPE21D280GA
root@host2:~# smartctl -i /dev/nvme4n1 | grep -i model
Model Number: SAMSUNG MZVPW256HEGL-00000
I think the kernel driver is naming the NVMe devices incorrectly.
Decision:
Conclusion:
I use the following platform, Supermicro A+ Server 2123BT-HNC0R, with this per-node configuration:
NVME: 4x 2.5" U.2 Intel Optane 900P 280GB
1x M.2 Samsung SM961 256GB (NVMe)
SAS: 2x 2.5" SSD Samsung PM1633a 7.68TB
FC: QLogic QLE8362 (attached to an FC switch, to use pools exported from external storage)
CPU: 2x AMD EPYC 7601 with SMT (64 cores/128 threads)
Memory: DDR4 ECC 2 TiB
Network: 2x 10Gbps Intel X550T Ethernet, bonding mode balance-alb
OS: Debian GNU/Linux 9.5 (stretch) with Linux kernel version 4.15.18-1-pve #1 SMP PVE 4.15.18-17 (Mon, 30 Jul 2018 12:53:35 +0200)
Virtualization: Proxmox 5.2-2, PVE Manager Version pve-manager/5.2-7/8d88e66a
Ceph Version: luminous 12.2.7-pve1
Cause:
I found the following problem with the NVMe storage devices after the servers had been powered on and idle for a week without a reboot:
Code:
root@host2:~# nvme list
Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     PHMXXXXXXXXX280AGN   INTEL SSDPE21D280GA                      1         280.07 GB / 280.07 GB      512 B + 0 B      E2010325
/dev/nvme1n1     PHMXXXXXXXXX280AGN   INTEL SSDPE21D280GA                      1         280.07 GB / 280.07 GB      512 B + 0 B      E2010325
/dev/nvme2n1     PHMXXXXXXXXX280AGN   INTEL SSDPE21D280GA                      1         280.07 GB / 280.07 GB      512 B + 0 B      E2010325
/dev/nvme3n1     PHMXXXXXXXXX280AGN   INTEL SSDPE21D280GA                      1         280.07 GB / 280.07 GB      512 B + 0 B      E2010325
/dev/nvme4n1     S346XXXXXXXXXXX      SAMSUNG MZVPW256HEGL-00000               1         239.87 GB / 256.06 GB      512 B + 0 B      CXZ7500Q
Code:
root@host2:~# for d in {0..4}; do hdparm -Tt --direct /dev/nvme${d}n1; done
/dev/nvme0n1:
Timing O_DIRECT cached reads: 4870 MB in 2.00 seconds = 2435.09 MB/sec
Timing O_DIRECT disk reads: 7276 MB in 3.00 seconds = 2425.19 MB/sec
/dev/nvme1n1:
Timing O_DIRECT cached reads: 2758 MB in 2.00 seconds = 1379.17 MB/sec
Timing O_DIRECT disk reads: 2726 MB in 3.00 seconds = 908.07 MB/sec
/dev/nvme2n1:
Timing O_DIRECT cached reads: 614 MB in 2.12 seconds = 290.25 MB/sec
Timing O_DIRECT disk reads: 64 MB in 3.13 seconds = 20.42 MB/sec
/dev/nvme3n1:
Timing O_DIRECT cached reads: 4716 MB in 2.00 seconds = 2358.23 MB/sec
Timing O_DIRECT disk reads: 6068 MB in 3.00 seconds = 2022.55 MB/sec
/dev/nvme4n1:
Timing O_DIRECT cached reads: 2106 MB in 2.00 seconds = 1052.56 MB/sec
Timing O_DIRECT disk reads: 3886 MB in 3.00 seconds = 1295.33 MB/sec
root@host4:~# for d in {0..4}; do hdparm -Tt --direct /dev/nvme${d}n1; done
/dev/nvme0n1:
Timing O_DIRECT cached reads: 16 MB in 2.12 seconds = 7.56 MB/sec
Timing O_DIRECT disk reads: 24 MB in 3.17 seconds = 7.58 MB/sec
/dev/nvme1n1:
Timing O_DIRECT cached reads: 16 MB in 2.12 seconds = 7.55 MB/sec
Timing O_DIRECT disk reads: 22 MB in 3.03 seconds = 7.26 MB/sec
/dev/nvme2n1:
Timing O_DIRECT cached reads: 4814 MB in 2.00 seconds = 2406.97 MB/sec
Timing O_DIRECT disk reads: 7204 MB in 3.00 seconds = 2400.93 MB/sec
/dev/nvme3n1:
Timing O_DIRECT cached reads: 1010 MB in 2.07 seconds = 488.49 MB/sec
Timing O_DIRECT disk reads: 290 MB in 3.02 seconds = 95.92 MB/sec
/dev/nvme4n1:
Timing O_DIRECT cached reads: 3256 MB in 2.00 seconds = 1627.93 MB/sec
Timing O_DIRECT disk reads: 6434 MB in 3.00 seconds = 2144.20 MB/sec
If I reboot the system, the problem disappears (though not completely, see below) and recurs within a couple of days.
Code:
After another reboot:
root@host2:~# for d in {0..4}; do hdparm -Tt --direct /dev/nvme${d}n1; done
/dev/nvme0n1:
Timing O_DIRECT cached reads: 4782 MB in 2.00 seconds = 2391.16 MB/sec
Timing O_DIRECT disk reads: 1482 MB in 3.00 seconds = 493.41 MB/sec
/dev/nvme1n1:
Timing O_DIRECT cached reads: 4868 MB in 2.00 seconds = 2434.48 MB/sec
Timing O_DIRECT disk reads: 4566 MB in 3.00 seconds = 1521.70 MB/sec
/dev/nvme2n1:
Timing O_DIRECT cached reads: 4880 MB in 2.00 seconds = 2440.29 MB/sec
Timing O_DIRECT disk reads: 7016 MB in 3.00 seconds = 2338.38 MB/sec
/dev/nvme3n1:
Timing O_DIRECT cached reads: 4870 MB in 2.00 seconds = 2435.76 MB/sec
Timing O_DIRECT disk reads: 7048 MB in 3.00 seconds = 2349.15 MB/sec
/dev/nvme4n1:
Timing O_DIRECT cached reads: 3042 MB in 2.00 seconds = 1521.08 MB/sec
Timing O_DIRECT disk reads: 6612 MB in 3.00 seconds = 2203.62 MB/sec
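To compare runs like the ones above without reading every line, the hdparm output can be condensed to one line per device. This is only a minimal sketch that assumes the output format shown above; the function name `flag_slow_nvme` and the default threshold of 1000 MB/sec are illustrative choices of mine, not part of hdparm.

```shell
#!/bin/sh
# Condense `hdparm -Tt --direct` output read from stdin to one line per
# device, using the "disk reads" rate.
# $1 = minimum acceptable rate in MB/sec (assumed default: 1000).
flag_slow_nvme() {
    awk -v min="${1:-1000}" '
        # Device header lines look like "/dev/nvme0n1:"
        /^[[:space:]]*\/dev\// { dev = $1; sub(/:$/, "", dev) }
        # The rate is the second-to-last field: "... = 2425.19 MB/sec"
        /disk reads:/ {
            rate = $(NF - 1)
            printf "%s %s MB/sec %s\n", dev, rate, ((rate + 0 < min + 0) ? "SLOW" : "ok")
        }
    '
}
```

It could be used as `for d in 0 1 2 3 4; do hdparm -Tt --direct /dev/nvme${d}n1; done | flag_slow_nvme 1000`. With the first host2 run above, nvme1n1 (908.07 MB/sec) and nvme2n1 (20.42 MB/sec) would be marked SLOW.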
Another oddity is how the devices are detected:
root@host2:~# smartctl -i /dev/nvme0 | grep -i model
Model Number: SAMSUNG MZVPW256HEGL-00000
root@host2:~# smartctl -i /dev/nvme0n1 | grep -i model
Model Number: INTEL SSDPE21D280GA
and
root@host2:~# smartctl -i /dev/nvme4 | grep -i model
Model Number: INTEL SSDPE21D280GA
root@host2:~# smartctl -i /dev/nvme4n1 | grep -i model
Model Number: SAMSUNG MZVPW256HEGL-00000
I think the kernel driver is naming the NVMe devices incorrectly.
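One way to look at this more closely is to compare the model that sysfs reports for each controller (nvmeX) with the model it reports for each namespace block device (nvmeXnY). A possible explanation, which nothing in this post confirms, is that the kernel numbers controllers and namespaces independently, so nvme0 and nvme0n1 need not be the same physical drive. The sketch below assumes the usual sysfs layout (`/sys/class/nvme/*/model` and `/sys/block/*/device/model`); the function name `nvme_models` and its optional root argument are hypothetical and exist only so the function can be tried against a test directory.

```shell
#!/bin/sh
# Print the model string sysfs reports for every NVMe controller and for
# every NVMe namespace block device, so both views can be compared.
# $1 = sysfs root (hypothetical parameter for testing, defaults to /sys).
nvme_models() {
    root="${1:-/sys}"
    for f in "$root"/class/nvme/nvme*/model; do
        [ -r "$f" ] || continue
        ctrl=$(basename "$(dirname "$f")")
        # sysfs pads the model with trailing spaces; strip them
        printf 'controller %s: %s\n' "$ctrl" "$(sed 's/[[:space:]]*$//' "$f")"
    done
    for f in "$root"/block/nvme*n*/device/model; do
        [ -r "$f" ] || continue
        ns=$(basename "$(dirname "$(dirname "$f")")")
        printf 'namespace %s: %s\n' "$ns" "$(sed 's/[[:space:]]*$//' "$f")"
    done
}
```

Running `nvme_models` on an affected node and comparing the two lists (together with the serial numbers from `nvme list`) should show whether only the numbering differs or whether a device really is misidentified.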
Decision:
Conclusion: