f*ed up VM storage setup - need help recovering VM disk ZFS dataset

dakazze

New Member
Nov 25, 2022
First off, this is a super embarrassing fuckup that happened because I rushed setting up my backup server...

When creating some VMs I did not check the storage location, which defaulted to a NAS NFS share. The NAS is still there and the NFS share is working, but after rebooting the NAS while the VMs were running I can't start these VMs because of "TASK ERROR: volume 'NAS:103/vm-103-disk-0.qcow2' does not exist".
One of these VMs involves a lot of work to set up again, so I would really appreciate it if there were a way to get it back up so I can move the disks to where they belong.

There have been no major changes to the NAS dataset, and rebooting the host did not help.

That's the disk entry set in the vm.conf:
Code:
scsi0: NAS:103/vm-103-disk-0.qcow2,iothread=1,size=32G


Code:
root@proxBu:/mnt# zfs list
NAME                           USED  AVAIL  REFER  MOUNTPOINT
rpool                          101G   728G    25K  /rpool
rpool/ROOT                    5.65G   728G    24K  /rpool/ROOT
rpool/ROOT/pve-1              5.65G   728G  5.65G  /
rpool/data                    57.1G   728G    25K  /rpool/data
rpool/data/subvol-100-disk-1   840M  99.2G   840M  /rpool/data/subvol-100-disk-1
rpool/data/subvol-104-disk-0  12.0M  3.99G  11.8M  /rpool/data/subvol-104-disk-0
rpool/data/vm-101-disk-0      35.5K   728G  35.5K  -
rpool/data/vm-101-disk-1      11.0G   728G  11.0G  -
rpool/data/vm-102-disk-0      81.5K   728G  44.5K  -
rpool/data/vm-103-disk-0      26.7G   728G  26.7G  -
rpool/data/vm-105-disk-0      10.7G   728G  10.2G  -
rpool/data/vm-105-state-uh    2.22G   728G  2.22G  -
rpool/data/vm-106-disk-0      5.66G   728G  5.66G  -
rpool/var-lib-vz              38.3G   728G  38.3G  /var/lib/vz

I am a noob when it comes to ZFS, so my first thought was to use zfs set mountpoint on vm-103-disk, which fails with "does not apply to datasets of this type".
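For context on that error: the entries with "-" in the MOUNTPOINT column are zvols (block devices), not filesystems, so they have no mountpoint to set and normally appear as device nodes under /dev/zvol/. A minimal sketch of how to list them, assuming the standard `zfs list -H -o name,type` output; the helper name `zvol_devices` is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper (not part of ZFS or Proxmox): reads tab-separated
# "name<TAB>type" lines, as printed by `zfs list -H -o name,type`, and
# prints the expected device node for every dataset of type "volume".
zvol_devices() {
    tab=$(printf '\t')
    while IFS="$tab" read -r name type; do
        if [ "$type" = "volume" ]; then
            printf '/dev/zvol/%s\n' "$name"
        fi
    done
}

# On the host this would be fed from the real command:
#   zfs list -H -o name,type | zvol_devices
# Here it runs on two sample lines based on the listing above:
printf 'rpool/data\tfilesystem\nrpool/data/vm-103-disk-0\tvolume\n' | zvol_devices
```

With the sample input only the zvol line is printed, as /dev/zvol/rpool/data/vm-103-disk-0.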

Any ideas?
 
If I understood correctly, you probably have a locked NFS mount because of the NAS reboot. Did you reboot the host?
Sorry, sometimes I have a hard time expressing myself in English.

The short version:

1. Created VM and mistakenly set storage to NAS (different machine), which is an NFS share
2. I did not notice this mistake and everything worked fine for 2-3 weeks
3. I rebooted the NAS while the VM was still up because I did not know about the mistake yet
4. Ever since, I can't start the VM because "volume 'NAS:103/vm-103-disk-0.qcow2' does not exist"
5. Rebooted the host --> NFS share is accessible, path is the same, got R/W access
6. VM still can't boot because it can't find its boot disk
7. frustration / no idea what to do / angry at self for stupid mistake
 
1. Created VM and mistakenly set storage to NAS (different machine), which is an NFS share
2. I did not notice this mistake and everything worked fine for 2-3 weeks
That's not a bad mistake; we also have all our VMs and LXCs on a NAS and it works great.
3. I rebooted the NAS while the VM was still up because I did not know about the mistake yet
That's OK: the kernel throttles the application (down to zero), which from the kernel's point of view is the VM, because it temporarily cannot write its I/O data to the unavailable NFS server. Once the NFS mount is valid again, the outstanding data is written and the VM gets CPU time slices (for generating new I/O) again.
4. Ever since, I can't start the VM because "volume 'NAS:103/vm-103-disk-0.qcow2' does not exist"
5. Rebooted the host --> NFS share is accessible, path is the same, got R/W access
6. VM still can't boot because it can't find its boot disk
The problem is on the NFS server! It is probably (?) related to your ZFS setup there.
7. frustration / no idea what to do / angry at self for stupid mistake
I understand, but there is still nothing wrong with the VM storage definition.
 
Thanks for taking the time to reply, I appreciate the help!

That's not a bad mistake; we also have all our VMs and LXCs on a NAS and it works great.
Yeah, I know, but it was still a mistake that led to the current issue.

The problem is on the NFS server! It is probably (?) related to your ZFS setup there.
Any idea how I might be able to find and fix the cause?

There have been no changes to the dataset, the share or the permissions. The share is accessible from the host and it has full access. Still, I can neither start the VM nor move the disk because "volume 'NAS:103/vm-103-disk-0.qcow2' does not exist".
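To narrow down where that "does not exist" comes from, it can help to check the exact file path the volume ID would map to. A minimal sketch, assuming the default layout for NFS storages in PVE (mounted under /mnt/pve/<storage ID>, disk images under images/<VMID>/). The function name `volid_to_path` is hypothetical, and the real mount path should be taken from /etc/pve/storage.cfg:

```shell
#!/bin/sh
# Hypothetical helper: turn "<storage>:<vmid>/<file>" into the file path
# expected on a directory-style storage (assumed default layout).
volid_to_path() {
    volid=$1
    mount_root=${2:-/mnt/pve}   # assumption: default NFS mount root in PVE
    storage=${volid%%:*}        # part before the first colon, e.g. "NAS"
    rel=${volid#*:}             # part after it, e.g. "103/vm-103-disk-0.qcow2"
    printf '%s/%s/images/%s\n' "$mount_root" "$storage" "$rel"
}

path=$(volid_to_path 'NAS:103/vm-103-disk-0.qcow2')
echo "$path"

# Report whether a file is actually present at that location.
if [ -e "$path" ]; then
    echo "file is present"
else
    echo "no file at this path"
fi
```

On a PVE host, `pvesm path NAS:103/vm-103-disk-0.qcow2` should report the path the storage layer itself resolves, which is the more reliable check if the storage uses a non-default mount path.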

Sadly, I don't even have an idea where to start doing research -.-

The NFS target is a ZFS dataset managed by TrueNAS on a different machine. There are no obvious issues on that side and zfs list does not show anything that is not mounted. Is it possible that the disk was deleted during a scrub because the mountpoint did not exist, or that the volume got damaged by the unexpected shutdown? (yes, I am a noob...)
 
The NFS target is a ZFS dataset managed by TrueNAS on a different machine. There are no obvious issues on that side and zfs list does not show anything that is not mounted.
root@proxBu:/mnt# zfs list
rpool/data/vm-103-disk-0 26.7G 728G 26.7G -
So "proxBu" is your TrueNAS, right?
What happens if you run "dd if=/dev/zvol/rpool/data/vm-103-disk-0 of=/dev/null bs=1024k" on proxBu (which takes a while)
and "ls -l /dev/zvol/rpool/data/vm-103-disk-0"?
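Since the dataset shows "-" as its mountpoint, the read test would normally go against the device node under /dev/zvol. A minimal sketch of a quicker variant of the check, assuming that device path; the helper name `check_zvol` is made up, and it only reads the first 64 MiB instead of the whole disk:

```shell
#!/bin/sh
# Hypothetical helper: confirm that a path is a block device and that the
# first 64 MiB can be read from it (a quick check, not a full-disk read).
check_zvol() {
    dev=$1
    if [ ! -b "$dev" ]; then
        echo "not a block device: $dev"
        return 1
    fi
    if dd if="$dev" of=/dev/null bs=1M count=64 2>/dev/null; then
        echo "readable: $dev"
    else
        echo "read error: $dev"
        return 1
    fi
}

# Run against the zvol from the listing; "|| true" keeps the script going
# if the check fails.
check_zvol /dev/zvol/rpool/data/vm-103-disk-0 || true
```

A "not a block device" result would point at a missing device node rather than at damaged data, which is a different problem from a read error.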
 
I'm still wondering what you are doing... The zfs list output shows a PVE installation, and you said you have a TrueNAS VM which then itself has a passthrough disk. The NAS exports shares (over the NFS or SMB protocol) or block volumes (over iSCSI) to clients. In the case of a share, the client (PVE) sets up e.g. QEMU image files (as written: 'NAS:103/vm-103-disk-0.qcow2') for a VM. If it uses iSCSI, it gets a block device (on which the client even COULD create a filesystem of its own and mount it if it wished to, but probably not in this case) and sets up raw files for a VM.
You are talking about mounting, but zfs list shows zvols on the PVE host...
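The mismatch described here can be made visible by comparing the VMID from the config entry with the zvol names on the host. A minimal sketch, assuming the vm-<VMID>-disk-<N> naming seen in the listing; the helper name `local_disks_for` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: from dataset names on stdin (one per line, as printed
# by `zfs list -H -o name`), print those that look like disks of a given VMID.
local_disks_for() {
    vmid=$1
    while read -r name; do
        case "$name" in
            */vm-"$vmid"-disk-*) printf '%s\n' "$name" ;;
        esac
    done
}

# On the host: zfs list -H -o name | local_disks_for 103
# Here it runs on three sample names taken from the listing:
printf '%s\n' rpool/data/vm-102-disk-0 rpool/data/vm-103-disk-0 rpool/data/vm-105-state-uh |
    local_disks_for 103
```

If a local zvol turns up for a VM whose config points at the NAS storage, `qm rescan --vmid <VMID>` is the usual way to have PVE list unreferenced disks as unused entries. Whether that applies in this case is not confirmed anywhere in the thread.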
 
