Very slow read performance NVMe DC1000M

mmenaz

Hi, I have a U.2 Kingston DC1000M 1.92TB as secondary storage (ZFS formatted) that I use "just in case I need speed".
When installed it was very fast, as expected, e.g. a read performance of 1454.99 MB/sec:
Bash:
root@proxmm01:~# hdparm -tT /dev/nvme0n1
/dev/nvme0n1:
 Timing cached reads:   20680 MB in  2.00 seconds = 10349.84 MB/sec
 HDIO_DRIVE_CMD(identify) failed: Inappropriate ioctl for device
 Timing buffered disk reads: 4366 MB in  3.00 seconds = 1454.99 MB/sec

Now it is incredibly slow in READ performance (117.83 MB/sec!):
Bash:
root@proxmm01:~# hdparm -tT /dev/nvme0n1
/dev/nvme0n1:
 Timing cached reads:   23346 MB in  2.00 seconds = 11689.69 MB/sec
 Timing buffered disk reads: 356 MB in  3.02 seconds = 117.83 MB/sec
and also
Bash:
root@proxmm01:~# hdparm -tT --direct /dev/nvme0n1
/dev/nvme0n1:
 Timing O_DIRECT cached reads:   518 MB in  2.00 seconds = 258.92 MB/sec
 Timing O_DIRECT disk reads: 374 MB in  3.01 seconds = 124.23 MB/sec
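If it's useful, the same thing can be cross-checked with a read-only fio run against the raw device (just a sketch, assuming fio is installed; --readonly makes sure nothing gets written):
Bash:
# sequential 1M reads straight from the device, bypassing the page cache
fio --name=seqread --filename=/dev/nvme0n1 --rw=read --bs=1M \
    --direct=1 --ioengine=libaio --iodepth=16 --runtime=30 --time_based --readonly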

Write performance is still fast
Bash:
# pveperf /nvmep1
CPU BOGOMIPS:      86229.72
REGEX/SECOND:      3773896
HD SIZE:           1532.55 GB (nvmep1)
FSYNCS/SECOND:     15554.55

Here is some other info about the system (and the temperature is OK, 42°C):

Bash:
root@proxmm01:~# dmesg |grep -i nvme
[    1.016219] nvme nvme0: pci function 0000:01:00.0
[    1.025079] nvme nvme0: missing or invalid SUBNQN field.
[    1.026616] nvme nvme0: 15/0/0 default/read/poll queues
[    1.046485]  nvme0n1: p1 p2

and
Bash:
# lspci -vvv | egrep "0[0-9]:|Width\ "
[...]
01:00.0 Non-Volatile memory controller: Kingston Technology Company, Inc. Device 500b (rev 03) (prog-if 02 [NVM Express])
                LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM not supported
                LnkSta: Speed 8GT/s, Width x4

Bash:
# pveversion -v
proxmox-ve: 8.0.2 (running kernel: 6.2.16-6-pve)
pve-manager: 8.0.4 (running version: 8.0.4/d258a813cfa6b390)
pve-kernel-6.2: 8.0.5
proxmox-kernel-helper: 8.0.3
pve-kernel-5.15: 7.4-4
proxmox-kernel-6.2.16-6-pve: 6.2.16-7
proxmox-kernel-6.2: 6.2.16-7
pve-kernel-6.2.16-5-pve: 6.2.16-6
pve-kernel-5.15.108-1-pve: 5.15.108-1
ceph-fuse: 16.2.11+ds-2
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-3
libknet1: 1.25-pve1
libproxmox-acme-perl: 1.4.6
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.1
libpve-access-control: 8.0.4
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.0.7
libpve-guest-common-perl: 5.0.4
libpve-http-server-perl: 5.0.4
libpve-rs-perl: 0.8.5
libpve-storage-perl: 8.0.2
libqb0: 1.0.5-1
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-2
proxmox-backup-client: 3.0.2-1
proxmox-backup-file-restore: 3.0.2-1
proxmox-kernel-helper: 8.0.3
proxmox-mail-forward: 0.2.0
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.2
proxmox-widget-toolkit: 4.0.6
pve-cluster: 8.0.3
pve-container: 5.0.4
pve-docs: 8.0.4
pve-edk2-firmware: 3.20230228-4
pve-firewall: 5.0.3
pve-firmware: 3.7-1
pve-ha-manager: 4.0.2
pve-i18n: 3.0.5
pve-qemu-kvm: 8.0.2-4
pve-xtermjs: 4.16.0-3
qemu-server: 8.0.6
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.1.12-pve1

Backing up a disk from there to any destination is incredibly slow: a small VM went from 1 minute (at the beginning of 2021) to 13 minutes now!
Unfortunately, since I use it rarely, I'm not able to pinpoint what happened that made read performance so slow.
I initially thought it was because at some point a 4-port NIC stopped working and I replaced it with 2 x 2-port NICs (some PCIe conflict?), but now I have a 4-port NIC again (same brand/model) and I've removed the 2 x 2-port NICs.
Maybe a new Proxmox version/kernel? Any ideas?


My (PC-based) "home server" is the only computer with an NVMe interface, so I can't move the storage to, for example, my PC.

Any ideas?
 
I have seen a lot of threads about the DC1000M, but bad read performance with ZFS was never the topic, so it's probably not a hardware problem.
 
I mostly use NVMe-to-Thunderbolt or -USB adapters for off-line backups or for migration.

In a case like this it might help you diagnose whether it's a drive issue or an OS issue: you have to split the problem space...
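A cheap first step is to ask the drive itself how it's doing, independent of ZFS and the filesystem (assuming nvme-cli and smartmontools are installed):
Bash:
# controller health counters: temperature, available spare, media errors, thermal throttle events
nvme smart-log /dev/nvme0
# same data plus the error log via smartmontools
smartctl -a /dev/nvme0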
I was able to grab a cheap M.2 PCIe adapter and tested the drive in another PC.
hdparm is slow there as well, even if a bit faster (216 MB/s vs 124 MB/s... though in a "degraded" bus state, OS Kubuntu 23.10 "beta"), and I've opened a ticket with Kingston support.
I'm really puzzled by this problem (I hope an updated firmware and a complete wipe will fix it, but I have no Windows PC here, so I have to wait for them to provide something that nvme fw-download can flash).
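For reference, once they send an image, the nvme-cli flow I expect to use is roughly this (just a sketch: the file name is a placeholder and the right slot/commit action depend on what the controller supports):
Bash:
# hypothetical image name - whatever Kingston provides
nvme fw-download /dev/nvme0 --fw=dc1000m_update.bin
# commit to a firmware slot and activate on the next reset (slot/action are drive-dependent)
nvme fw-commit /dev/nvme0 --slot=1 --action=1
nvme reset /dev/nvme0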
I'll keep you informed, thanks.
 
Update: after a "low level" format everything works again as expected, even after re-partitioning and ZFS formatting:
Bash:
# secure-erase the whole drive (-s1 = user data erase), then recreate the layout
nvme format -s1 /dev/nvme0
# data partition plus a small SLOG partition (bf01 = Solaris/ZFS partition type)
sgdisk -n 0:0:+1650GiB -t 0:bf01 /dev/nvme0n1
sgdisk -n 0:0:+10GiB -t 0:bf01 -c 0:SLOG1 /dev/nvme0n1
# recreate the pool and set the usual properties
zpool create -o cachefile=none -o ashift=12 nvmep1 /dev/disk/by-id/nvme-eui.0026b72825364bf5-part1
zfs set atime=off compression=on xattr=sa nvmep1
hdparm -tT --direct /dev/nvme0n1
/dev/nvme0n1:
 Timing O_DIRECT cached reads:   6130 MB in  2.00 seconds = 3065.47 MB/sec
 Timing O_DIRECT disk reads: 9464 MB in  3.00 seconds = 3154.33 MB/sec
and pveperf, which was always good, is still good:
Bash:
# pveperf /nvmep1/
CPU BOGOMIPS:      86242.08
REGEX/SECOND:      4147418
HD SIZE:           1597.00 GB (nvmep1)
FSYNCS/SECOND:     17848.80
I'm waiting to hear back from Kingston support, to see whether they can provide a firmware fix, if one even exists, or an explanation of what happened and how to prevent it from happening again.
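In the meantime, to catch a regression earlier next time, something as simple as logging a periodic direct-read test should do (a sketch; the log path and schedule are arbitrary):
Bash:
# e.g. from a weekly cron job: raw read throughput plus the drive's health counters
hdparm -t --direct /dev/nvme0n1 >> /var/log/nvme0-readcheck.log
nvme smart-log /dev/nvme0 >> /var/log/nvme0-readcheck.log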