Sudden High I/O Delay -- System becomes unresponsive

Jugrnot (Member, joined Dec 18, 2019)
My previously rock solid system has become nearly unusable and I'm not even remotely sure where to look. I understand my Proxmox version is in need of upgrading, but unfortunately there are some issues with me doing so at this time. Please help?!

Last night around 2300 hours the system became completely unresponsive. After hitting the reset button, I managed to log in and found that something had filled up the / root partition, which I cleared out so I could get my CTs/VMs started up again. Ever since then, the system has continued to exhibit >70% I/O wait and becomes unresponsive. Checking top along with iotop, nothing shows any actual load on the disks, yet the I/O wait remains very high and the system becomes unresponsive for long periods of time. For example, running "pct list" takes minutes to display anything.

Proxmox is bare metal on a ZFS mirror vdev of 2x HGST HUS72302CLAR2000 7200 rpm 2 TB disks. A ZFS scrub hasn't found any issues with the disks, nor has smartctl (as far as I can decipher its output, anyway). Disk1 and Disk2 smartctl details.

As another test, I shut down all CTs but one and all VMs but two to see if anything changes, and it does not. The three I left running are necessary for my network/internet functionality and have very little overhead. With nothing running but those three, I still see this:

[Screenshot: 1722801174631.png]

Once again, checking iotop/htop/top, nothing is accessing the disks. Running 'pveversion -v' at the console took almost 45 seconds to display its output.
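For reference, a few generic commands (standard Linux tooling, nothing Proxmox-specific) that can help confirm where I/O wait is coming from when top/iotop show nothing obvious:

```shell
# Per-device utilization and average wait times (needs the sysstat package):
command -v iostat >/dev/null && iostat -x 1 5 || true

# Kernel pressure-stall information for I/O (available when PSI is enabled):
[ -r /proc/pressure/io ] && cat /proc/pressure/io || true

# Processes stuck in uninterruptible sleep (state D), i.e. blocked on I/O:
ps -eo pid,stat,wchan:32,comm | awk '$2 ~ /^D/'
```

A high `%util` with low throughput in iostat, or many processes sitting in state D, points at a device (or pool) stalling rather than at genuine workload.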

Code:
root@pve2 ~# pveversion -v
proxmox-ve: 6.4-1 (running kernel: 5.4.203-1-pve)
pve-manager: 6.4-15 (running version: 6.4-15/af7986e6)
pve-kernel-5.4: 6.4-20
pve-kernel-helper: 6.4-20
pve-kernel-5.3: 6.1-6
pve-kernel-5.4.203-1-pve: 5.4.203-1
pve-kernel-5.4.195-1-pve: 5.4.195-1
pve-kernel-5.4.140-1-pve: 5.4.140-1
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.3.10-1-pve: 5.3.10-1
ceph-fuse: 12.2.11+dfsg1-2.1+deb10u1
corosync: 3.1.5-pve2~bpo10+1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 3.0.0-1+pve4~bpo10
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.22-pve2~bpo10+1
libproxmox-acme-perl: 1.1.0
libproxmox-backup-qemu0: 1.1.0-1
libpve-access-control: 6.4-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.4-5
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.2-5
libpve-storage-perl: 6.4-1
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.1.14-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.6-2
pve-cluster: 6.4-1
pve-container: 3.3-6
pve-docs: 6.4-2
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-4
pve-firmware: 3.3-2
pve-ha-manager: 3.1-1
pve-i18n: 2.3-1
pve-qemu-kvm: 5.2.0-8
pve-xtermjs: 4.7.0-3
qemu-server: 6.4-2
smartmontools: 7.2-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.7-pve1
 
Look at
Code:
zpool iostat -vy 1

I noticed my local-zfs volume was full. Deleting some old snapshots while running that command yields:

[Screenshot: 1722803354993.png]

I/O wait varies from 14% to 70%, still with only one CT and two VMs running.
 
If ZFS fills up or runs low on free space, it can cause high system load, and the amount of free space you have now isn't much either.

You can also run this to check which specific operations are loading the pool.
Code:
zpool iostat -q
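Alongside that, the standard ZFS space-accounting commands show where the space is actually going (the pool name rpool is assumed here; substitute yours):

```shell
# Overall pool capacity, fragmentation, and health:
zpool list rpool

# Space used by each dataset, broken down into snapshots, children,
# and reservations:
zfs list -o space -r rpool

# All snapshots, sorted by how much space destroying each would free:
zfs list -t snapshot -o name,used -s used -r rpool
```

The `USEDSNAP` column from `zfs list -o space` is usually the first place to look when a pool fills up unexpectedly.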
 
I quite regret only using 2 TB disks for my install. I'm guessing the only realistic way to upgrade the bare-metal install to larger disks would be a fresh install? Given my node is basically down at the moment, maybe that would be my best bet.

My hesitation to update Proxmox comes from my lack of knowledge of ZFS pools. Specifically, my concern is importing my ZFS storage pool, which holds about 50 TB of data I'd be rather upset to lose; that's why I'm still on PVE 6.4.
 
I quite regret only using 2 TB disks for my install. I'm guessing the only realistic way to upgrade the bare-metal install to larger disks would be a fresh install? Given my node is basically down at the moment, maybe that would be my best bet.
Not necessarily. ZFS provides a lot of possibilities for this situation. You can attach another pair of disks to the pool, if you have the hardware capacity, and then detach the smaller ones.
You can also use the zfs send | zfs receive technique to replace the disks in this pool.

Also look for what could be taking up so much data on this pool; maybe move backups, templates, or ISO files to another location.
 
Not necessarily. ZFS provides a lot of possibilities for this situation. You can attach another pair of disks to the pool, if you have the hardware capacity, and then detach the smaller ones.
You can also use the zfs send | zfs receive technique to replace the disks in this pool.

Also look for what could be taking up so much data on this pool; maybe move backups, templates, or ISO files to another location.

I did a bit of googling and believe I might have found a solution to this.

Can I literally just add, say, two 8 TB disks to the mirror via 'attach', let the 8 TB disks resilver, then detach both of the 2 TB disks, followed by 'zpool online -e rpool' to expand out to 8 TB? Will that actually work on a live Proxmox filesystem? Is there any real danger in detaching the 2 TB disks while the system is live?

Thanks a lot for your guidance on this, I'm learning a lot more.
 
Yes, you can do that. The pool will evacuate data from the disconnected disks. You have to plan it well and practice a bit, because these disks are bootable: you have to prepare the new disks to be bootable, create the partitions, and refresh the system boot.
With this method, if the disks are healthy, everything should work.
So you use zpool add poolxxx mirror /dev/disk/by-id/partx /dev/disk/by-id/party
and disconnect with zpool remove poolxxx mirror-0
https://openzfs.github.io/openzfs-docs/man/master/8/zpool-remove.8.html
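For reference, the attach/resilver/detach route asked about above would look roughly like this. The device paths are placeholders, and on a Proxmox root pool the new disks first need the same partition layout and a refreshed bootloader (as noted above) before attaching:

```shell
# Placeholder device names; use your real /dev/disk/by-id/ paths.
# Attach one new disk alongside each old one (the mirror grows to a
# 3-way, then 4-way mirror):
zpool attach rpool /dev/disk/by-id/old-2tb-1-part3 /dev/disk/by-id/new-8tb-1-part3
zpool attach rpool /dev/disk/by-id/old-2tb-2-part3 /dev/disk/by-id/new-8tb-2-part3

# Wait until the resilver completes before touching the old disks:
zpool status rpool

# Then detach the old 2 TB disks:
zpool detach rpool /dev/disk/by-id/old-2tb-1-part3
zpool detach rpool /dev/disk/by-id/old-2tb-2-part3

# Expand each remaining device to use its full capacity
# (not needed if the pool has autoexpand=on):
zpool online -e rpool /dev/disk/by-id/new-8tb-1-part3
zpool online -e rpool /dev/disk/by-id/new-8tb-2-part3
```

Note that `zpool online -e` takes a device argument per disk; it is the per-device equivalent of setting `autoexpand=on` before the attach.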
 
Hello Team

Network issue when HA failover takes place

I have 3 nodes with clustering and HA enabled. When a node goes down, the VMs are moved to another node, but the network is not working and the VMs are restarted.
How can I achieve that, when a node goes down and the VMs move to another node, the VMs keep running without a restart and the network stays pingable?
For my setup with 3 nodes:
1) Cluster enabled
2) HA enabled
3) Ceph configuration is done
4) Ceph monitor is also configured, but no OSD is configured, because my VMs are running on a shared drive. Do we still need Ceph OSDs? I don't have much storage; I have a shared drive.

Please advise how, when my VMs are moved to another node, the network can keep working without disconnection.
 

Attachments

  • 2024-08-02 16_12_40-gva-esx-srv-01 - Proxmox Virtual Environment.png
  • 2024-08-02 16_11_31-gva-esx-srv-01 - Proxmox Virtual Environment.png
  • 2024-08-02 16_08_10-gva-esx-srv-01 - Proxmox Virtual Environment.png
  • 2024-08-02 16_06_59-gva-esx-srv-01 - Proxmox Virtual Environment.png
  • 2024-08-02 14_15_39-gva-esx-srv-01 - Proxmox Virtual Environment.png
