performance problems

rvdk92

Member
Nov 8, 2020
Hello,

I have a strange problem in one of our servers that I think is related to the ZFS volume.

If I run a du or ncdu command on this server to read the files, it is extremely slow: it takes about 30 to 40 minutes per TB. I don't notice the problem when writing to it.

Before making the backup, our backup client first reads the size of the storage vault, and it does this with du, so this now takes far too long.
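For reference, the slow part can be reproduced with a simple timed scan; the vault path here is just a placeholder:

    # Time a recursive size scan of the storage vault (path is a placeholder)
    time du -sh /mnt/vault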

I just added a temporary disk on the local datastore of the same server and copied a 30GB folder to it, and when I run the same command there it is very fast.

Here's the setup.

PVE-A12
32 x Intel(R) Xeon(R) Gold 6134 CPU @ 3.20GHz (2 Sockets)
192GB RAM
6 x SAS WD DC HC550 18TB SAS 12Gbps 3.5
2 x HP 1.92TB Enterprise SSD SATA 6G P/N: 838403-005
2 x Samsung 250GB M.2 SATA SSD (for OS)

ZFS config:
(screenshot zpool.png: ZFS pool configuration)


A single VM is running on this, Debian 12.

(screenshot vm.png: VM hardware configuration)

I saw that I had originally created it as a VirtIO disk, but I converted it to scsi1; unfortunately, no difference.
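For completeness, the way the disks are attached (controller type, cache mode, iothread) can be checked on the PVE host; the VMID 100 here is a placeholder:

    # Dump the VM config and show only the disk lines (100 is a placeholder VMID)
    qm config 100 | grep -E 'scsi|virtio|ide|sata'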

If I run du on scsi0, I get all the files and the 40GB total back within 4 seconds; if I move scsi0 to DATASTORE-ZFS, it also takes tens of minutes.

We have several Debian Linux servers on different types of hardware and it is fast everywhere, even on very old hardware with a HW RAID controller and slow HDDs in RAID5. Nothing there is anywhere near as slow as this.

The only difference I can think of compared with our other servers is the special device with SSDs, so perhaps something in that config is not right. I have tried all the different options, unfortunately without positive results.
The attached txt file contains the ZFS parameters.
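For what it's worth, these are the checks that seem relevant for a special device setup; DATASTORE-ZFS is assumed to be the pool name:

    # Show all vdevs, including the special mirror, and their usage
    zpool list -v DATASTORE-ZFS

    # Check whether small data blocks are also routed to the special vdev
    zfs get special_small_blocks,recordsize DATASTORE-ZFS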

Does anybody have an idea?
 

Attachments

  • zfsconfig.txt
35.4 KB
I'm not sure if this is related, but why is scsi1 53TB in size? Even if it's not related, that might be a bad idea, because the pool does not have that much space.
 
I'm not sure if this is related, but why is scsi1 53TB in size? Even if it's not related, that might be a bad idea, because the pool does not have that much space.

Hi, 53TB is the size of the disk; the pool is 59TB.

(screenshot: 1707999810394.png)
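For reference, the provisioned size versus the actually allocated space of the zvol can be compared like this; the zvol name is an assumption based on the usual Proxmox naming:

    # Compare provisioned size (volsize) with actually allocated space
    zfs list -t volume -o name,volsize,used,refer DATASTORE-ZFS/vm-100-disk-1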
 
OK, still a bad idea, as ZFS gets really slow when you go over 80% usage.
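A quick way to see how close the pool is to that threshold (DATASTORE-ZFS assumed as the pool name):

    # Show pool fill level and fragmentation
    zpool list -o name,size,allocated,free,capacity,fragmentation,health DATASTORE-ZFS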

I added 2 additional 20TB drives this morning and added them to the pool as a mirror.
Now it is back at 74%.

Unfortunately it still has the same effect; it is still just as slow.
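One way to narrow this down further might be to watch per-vdev I/O while du runs, to see whether the metadata reads actually hit the special SSDs or the HDDs; the pool name and path are placeholders as above:

    # In one shell: per-vdev I/O statistics, refreshed every 5 seconds
    zpool iostat -v DATASTORE-ZFS 5

    # In another shell: trigger the slow metadata scan
    du -sh /mnt/vault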
 
If I run a du or ncdu command on this server to read the files, it is extremely slow: it takes about 30 to 40 minutes per TB. I don't notice the problem when writing to it.
Metadata operations and data operations are not directly connected. Your special device should help, as should the ARC, but the greater the amount of metadata (e.g., the number of files and their attendant information), the slower metadata operations will be. This is not unique to ZFS; I have a legacy 250TB XFS over LSI RAID that can take days to update duc (which itself exists to make your life easier for du operations on a large filesystem; it may be useful to you: https://github.com/zevv/duc/blob/master/doc/duc.md)
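In case it helps, basic duc usage looks roughly like this (see the linked docs; the path is a placeholder):

    # Build or update the index database once, e.g. from a nightly cron job
    duc index /mnt/vault

    # Afterwards sizes are read from the database instead of walking the filesystem
    duc ls -Fg /mnt/vault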
 
