VM I/O error when mounting a RAIDZ partition

Edu

New Member
May 18, 2022
Hello everybody,

First of all, I should say I'm a total beginner regarding these matters, so please bear with me for a moment. Thank you.

My company accidentally filled a RAIDZ partition (sdb1) to 100% of its capacity yesterday, which caused the VM where that partition was mounted to become unresponsive.
We stopped and then started the VM successfully, but the partition is now unmounted: it no longer appears in df (see below).
1652942596458.png

When checking with lsblk, I can see the disk and its partition, but indeed without a mount point (see figure below for details).
1652942547303.png

The problem is that when I try to mount the partition from the console with the usual mount /dev/sdb1 <directory>, the terminal hangs indefinitely. In the Proxmox VE dashboard, the VM then shows a warning icon indicating io-error, even though the Status field still reports it as running (I don't know why, because the VM is effectively unresponsive). The figure below shows the specific error.
1652944919924.png

I assume the problem comes from the lack of any free space in the partition, so do you have any ideas about how I can proceed without losing the data? I have searched and searched for solutions and have had no luck yet... The fact that I'm only a beginner does not help at all, to be honest.

Thank you very much in advance for your time!
 
Bumping... I'd be very grateful if someone could take a shot at this, thanks.
 
A ZFS raidz is never just one partition on a disk.

You will have to use the ZFS tools. What output do you get if you run
Code:
zpool status

and

Code:
zpool import
?
 
Thank you very much aaron, I appreciate your time a lot.

Running zpool status outputs "no pools available".

Running zpool import outputs "no pools available to import"

Cheers.
 
Okay, let me make sure I understand this correctly. The disk inside the VM ran full? The screenshots are from within the VM?

Did you run the zpool commands on the Proxmox VE host or within the VM?
 
Thanks again aaron. I'll try to explain in a bit more detail.

One of the VM's drives ran full because someone uploaded way too many files via FTP. That is what caused the problem, as far as we know.

The disk in question is a ZFS SSD12TB-RAIDZ with 10.82 TiB allocated (and 357.5 GiB free). Here is a screenshot taken from the Disks tab of the node:
1653050706194.png

Now, the disk is installed inside the VM and shows a size of 8.65 TiB (as far as I know, this reduction from the nominal 10.82 TiB is due to the RAIDZ setup and is OK). It can be inspected in the Summary tab, which of course shows 100% storage usage (see image below):
1653050937707.png

On the other hand, the Content tab shows the following:
1653051081080.png


Now, regarding your questions:
The disk inside the VM ran full?
Yes. I believe there must be about 200 MB of free space or something like that, but for all practical purposes it appears to be completely full.
The screenshots are from within the VM?
The screenshots are taken from the terminal of the VM, you are right.
Did you run the zpool commands on the Proxmox VE host or within the VM?
I ran the commands from the VM terminal. I will try using the Proxmox VE host if you suggest it.

Hoping to hear from you again! Thanks in advance.
 

Okay. Well, to get a more complete picture, please run the zpool status and zfs list commands on the Proxmox VE host, as well as zpool get all <pool name>.
In the output of the zfs list command, you will see the vm-5001-disk-0 dataset. Please run zfs get all <pool>/vm-5001-disk-0. The path might be a bit different.
Please also share the storage configuration: cat /etc/pve/storage.cfg

With that, we should get a good picture of the current state. The big problem is that once a ZFS pool is full, IO becomes problematic: ZFS is copy-on-write, so it needs a bit of free space even to modify or delete existing data.
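
In case it helps, the commands requested above could be collected in one pass with something like the following. This is only a sketch: the pool name SSD12TB-RAIDZ and the dataset path are taken from this thread and may differ on your host, and the POOL, ZVOL and RUN variables are hypothetical names introduced here for illustration. By default it only prints the commands; setting RUN=1 on the Proxmox VE host runs them.

```shell
#!/bin/sh
# Sketch: gather the requested diagnostics on the Proxmox VE host.
# POOL and ZVOL are assumptions; adjust them to what `zfs list` shows.
POOL="${POOL:-SSD12TB-RAIDZ}"
ZVOL="${ZVOL:-$POOL/vm-5001-disk-0}"

# Print the diagnostic commands, one per line, for a given pool and dataset.
diag_commands() {
    printf '%s\n' \
        "zpool status" \
        "zfs list" \
        "zpool get all $1" \
        "zfs get all $2" \
        "cat /etc/pve/storage.cfg"
}

# Default is a dry run that only lists the commands.
# With RUN=1 each command is executed and its output is labelled.
diag_commands "$POOL" "$ZVOL" | while IFS= read -r cmd; do
    echo "### $cmd"
    if [ "${RUN:-0}" = "1" ]; then
        sh -c "$cmd" </dev/null
    fi
done
```

All of these commands are read-only, so they should be safe to run even while the pool is full.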
 
For the future, you might want to use some kind of quota. ZFS supports this, and with it you can, for example, ensure that your ZFS pool can never be filled beyond 90%. That way you can't end up in a situation like the current one, where ZFS stops working because it has completely run out of space. Besides that, a ZFS pool gets slow as it fills up, which is why it is recommended not to fill a pool beyond 80%. That is another reason why quotas make sense.
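
As a rough illustration of the quota idea, the sketch below works out 90% of a pool's usable space and prints the matching zfs set quota command instead of applying it. The helper names quota_bytes and pool_total_bytes are made up for this example, the 90% figure is just the value mentioned above, and treating "used + available" of the pool's root dataset as the usable total is an assumption. Nothing happens unless POOL is set and the zfs tool is present.

```shell
#!/bin/sh
# Sketch: compute a quota as a percentage of a pool's usable space.
# Helper names and the 90% value are illustrative assumptions.

# quota_bytes TOTAL_BYTES PERCENT -> quota in bytes (integer arithmetic)
quota_bytes() {
    echo $(( $1 / 100 * $2 ))
}

# Usable space of the pool's root dataset, taken as used + available.
# -H drops the header, -p prints exact byte values.
pool_total_bytes() {
    used=$(zfs get -Hp -o value used "$1")
    avail=$(zfs get -Hp -o value available "$1")
    echo $(( $used + $avail ))
}

# Only print the command; review it and run it by hand if it looks right.
if [ -n "${POOL:-}" ] && command -v zfs >/dev/null 2>&1; then
    total=$(pool_total_bytes "$POOL")
    echo "zfs set quota=$(quota_bytes "$total" 90) $POOL"
fi
```

For example, running it as POOL=SSD12TB-RAIDZ sh quota.sh (the file name is arbitrary) on the host would print a command to review. A quota like this guards against future overfilling; it does not free any space on a pool that is already full.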
 
Thanks @Dunuin for the suggestion, it seems like something we should have thought about earlier... Very useful indeed.

@aaron, I ran the commands from the Proxmox VE Shell with the following outputs:
  • zpool status: the first result is the problematic disk; it seems no errors are found...
    1653287585482.png

  • zfs list: here we can see 0B available in the disk
    1653287683631.png
  • Additionally, the zpool get all SSD12TB-RAIDZ command shows the following (I split the screenshot into two images):
    1653288501254.png
    1653288517743.png


  • zfs get all SSD12TB-RAIDZ/vm-5001-disk-0:
    1653288143450.png


  • Finally, the storage configuration as shown in /etc/pve/storage.cfg:
    1653288306569.png
Thanks again for your invaluable help! Cheers!
 

