qcow2 file on guest suddenly gone

marigo

Well-Known Member
Mar 7, 2014
Hi Community,

It seems that Proxmox lost a qcow2 file of a guest machine. I can't find it anymore on my local hd. The strange thing here is that my guest is still running and also can access the files which are on the qcow2 file.

This is actually the second time I have experienced this kind of behaviour. The last time I just created a new one, but this time it's a 2TB file and I would like to recover it.

This is the output from fdisk of the physical disk where the qcow2 file was hosted.
Hopefully someone can help.

Code:
Disk /dev/sda: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: CF29375E-3AD5-4881-BC90-483D920EF428

Device     Start        End    Sectors  Size Type
/dev/sda1   2048 7814035455 7814033408  3.7T Microsoft basic data
 
What is your VM's VMID? What is the path of the virtual HDD in the PVE GUI? Normally on an LVM system the HDDs are in the folder:

/var/lib/vz/images/VMID
 
Hi fireon,

My VMID is 108, and there is a virtual hd at the absolute path you gave, but that is the primary hd of the guest. My guest has two virtual hd's.
The first is where the OS is installed; no problems with that. The second one is the one that is missing. It is located on a different physical hd and used for storage.

--localhd -> (/var/lib/vz/images/108/vm-108-disk-1.qcow2)
--localhd2 -> (/mnt/localhd2/images/108/vm-108-disk-1.qcow2) <--- this one is missing on the physical disk.

I can't make changes to this file because it's gone. The strange thing here is that I can still access the files via the running guest, so somewhere there is still a mount point, but it is invisible.
 
If you think the file was erroneously deleted and you still have your VM running you can check /proc to verify (and save) the file.
Note that being able to access a deleted file is perfectly normal; it's even common practice for temp files you don't want to share with other tasks (see O_TMPFILE in open(2)).

Here's how you can verify the file's deleted but still in use by the VM:
Find the VM's PID with `ps`, look through /proc/$PID/fd for the file descriptor pointing to the file and see if it is flagged as '(deleted)'.
Example:
Code:
# pgrep -f 'kvm -id 3000'
7337
# ls -l /proc/7337/fd/*
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/0 -> /dev/null
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/1 -> /dev/null
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/10 -> anon_inode:[eventfd]
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/11 -> socket:[539548]
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/12 -> socket:[602240]
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/13 -> /run/qemu-server/3000.pid
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/14 -> /dev/kvm
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/15 -> anon_inode:kvm-vm
lr-x------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/16 -> pipe:[598397]
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/17 -> /dev/net/tun
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/18 -> /dev/vhost-net
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/19 -> anon_inode:[eventfd]
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/2 -> /dev/null
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/20 -> anon_inode:[eventfd]
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/21 -> /var/lib/vz/images/3000/vm-3000-disk-1.qcow2
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/22 -> anon_inode:[eventfd]
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/23 -> /var/lib/vz/images/3000/vm-3000-disk-3.qcow2 (deleted)
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/24 -> anon_inode:[eventfd]
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/25 -> anon_inode:kvm-vcpu
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/26 -> anon_inode:kvm-vcpu
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/27 -> socket:[602281]
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/28 -> anon_inode:[eventfd]
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/29 -> anon_inode:[eventfd]
lr-x------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/3 -> /dev/urandom
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/30 -> anon_inode:[eventfd]
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/31 -> anon_inode:[eventfd]
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/32 -> anon_inode:[eventfd]
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/33 -> anon_inode:[eventfd]
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/4 -> anon_inode:[signalfd]
l-wx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/5 -> pipe:[601202]
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/6 -> anon_inode:[eventpoll]
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/7 -> anon_inode:[eventfd]
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/8 -> anon_inode:[eventpoll]
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/9 -> anon_inode:[eventfd]
#

Note the line:
Code:
lrwx------ 1 root root 64 Mar  2 13:15 /proc/7337/fd/23 -> /var/lib/vz/images/3000/vm-3000-disk-3.qcow2 (deleted)
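
For anyone who would rather not eyeball the full listing, here is a minimal sketch that prints only the descriptors flagged as deleted. The function name `list_deleted_fds` is made up for illustration; it relies only on `readlink` and the /proc layout shown above.

```shell
# List the file descriptors of a process that point at deleted files.
# Usage: list_deleted_fds PID
# (list_deleted_fds is a hypothetical helper name, not a Proxmox tool.)
list_deleted_fds() {
    pid="$1"
    for fd in /proc/"$pid"/fd/*; do
        # readlink fails if the descriptor vanished in the meantime; skip it.
        target=$(readlink "$fd") || continue
        case "$target" in
            *" (deleted)") printf '%s -> %s\n' "$fd" "$target" ;;
        esac
    done
}
```

Assuming the pid file from the listing above holds the PID of the kvm process, this could be called as `list_deleted_fds "$(cat /run/qemu-server/3000.pid)"`.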

Note that you can actually access the file via that path. Unfortunately you probably cannot link it back into the directory structure directly (I'm not sure why that restriction applies; there's a mechanism to prevent specifically this for temporary files, and it might just trigger here, too).
However, you can read from it and copy it normally with 'cp', though naturally you need enough free space :\ ('dd' with 'conv=sparse' might be better suited).
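
A rough sketch of the 'dd' variant, assuming GNU dd. `rescue_fd` is a hypothetical wrapper name, and the PID, descriptor number and destination are whatever applies on your host. It refuses to overwrite an existing destination, since a second attempt should not clobber a copy you already have.

```shell
# Copy a still-open (possibly deleted) file out of /proc, keeping holes sparse.
# Usage: rescue_fd PID FD DESTINATION
# (rescue_fd is a hypothetical helper name used only for this example.)
rescue_fd() {
    src="/proc/$1/fd/$2"
    dest="$3"
    # Never overwrite an existing file at the destination.
    if [ -e "$dest" ]; then
        echo "refusing to overwrite $dest" >&2
        return 1
    fi
    # conv=sparse skips writing blocks that are all zeros, so unallocated
    # regions of the image do not consume space in the copy.
    dd if="$src" of="$dest" bs=1M conv=sparse
}
```

With the example listing above this would be something like `rescue_fd 7337 23 /mnt/backup/vm-3000-disk-3.qcow2`, where /mnt/backup is just a placeholder for a storage with enough free space.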
 
Hi wbumiller,

Thanks for pointing me to where to look. I have found the line in "/proc" that indeed says the file is deleted.

Code:
 Mar  2 16:25 /proc/4820/fd/21 -> /mnt/localhd/images/108/vm-108-disk-1.qcow2 (deleted)
I don't know who deleted this, but it wasn't me. And there is no one else who has access to this machine. I use this only for private use.

I think you mean that I can copy the "/proc/4820/fd/21" file which should contain the same as the old "/mnt/localhd/images/108/vm-108-disk-1.qcow2" which isn't accessible anymore.

Can you let me know?
 
Copying the /proc/4820/fd/21 file does seem to make sense, but you will likely get some file system corruption in the VM when you use the new copy because the underlying file keeps changing underneath it as your currently running VM writes to it. Maybe if you took a VM snapshot first then you could roll back to that upon restore.

Another thought I had was trying the "move disk" option from the Hardware tab of the UI. I suspect that KVM does the actual copy based on its own file handle since it has to keep track of changes while it does that work. Worth a try.

Just do not under any circumstance shut down the VM until you've made a copy.
 
Hi vkhera,

Thank you for your thoughts.

When I try to move the hd, Proxmox complains that the "source" isn't available anymore. So I guess this will not work.
I will try to copy the fd/21 file to another storage medium and back up all files on the guest, just for safety. :)

Once fd/21 is copied I will rename it to a *.qcow2 file and try to remount and re-enable the file in the guest.

I will post the results here.
 
I recommend remounting the file system in the VM read-only; then you don't need to deal with changing files, but it's effectively downtime for the duration of the process.
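
As a sketch of what that could look like inside the guest (run as root): `remount_readonly` is a made-up function name and /srv/storage below is a placeholder for wherever the data disk is mounted. Note that the remount is refused as busy while any process still has a file open for writing on that file system, so services writing there need to be stopped first.

```shell
# Flush pending writes, then remount a file system read-only.
# Usage: remount_readonly MOUNTPOINT
# (remount_readonly is a hypothetical helper; run it inside the guest as root.)
remount_readonly() {
    mountpoint_path="$1"
    # Push dirty pages to disk before switching to read-only.
    sync
    mount -o remount,ro "$mountpoint_path"
}
```

For example `remount_readonly /srv/storage`, and `mount -o remount,rw /srv/storage` to switch back once the copy on the host has finished.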
 
I am now backing up the files to different storage devices. I found out that I had no backup device which can hold one 2TB file.
When done I will recreate a new 2TB file and copy the content back to the new file.

I will definitely have to shop for a bigger storage device to hold big backups. ;)
 
Hah yes, using a disk bigger than the backup storage isn't the safest strategy :D
But the question remains how the file got deleted in the first place...
 
Well, that's indeed the big question. This also wasn't the first time I have encountered this strange behavior. I had it before, but the importance of the first "magic" deletion was very low. This time it is my virtual NAS, where about every (home) client has files stored and which serves as a backup for those clients.

I really like Proxmox VE, but I use it for private use. I have no subscription, so I really appreciate how the Proxmox staff and community help people to solve their problems. Most of the time topics are posted to get help, so this line is a very big compliment to all staff and community!

@wbumiller, I can give you access to the environment if you want to investigate it?
 
Thanks, but I wouldn't really know what to look for without any pointers as to when and how it happened or how to reproduce it. I'm not aware of any ways to trigger this currently (even taking manual fiddling with files in /etc/pve (removing locks while tasks are running etc.) into account)
 
What's the file system type of the physical disk? I'm curious as your fdisk output is showing "Microsoft basic data" for the partition type, and I'd expect that to say Linux.
 
Code:
df -T
...
/dev/sda1            ext4     3845577736 406789780 3243420740  12% /mnt/localhd
...
Looks like it's ext4.

I have shut down the VM and of course the file is gone. I have recreated the hd for the VM and I am copying the files back to the virtual hd.
 
I don't know your storage backend or your infrastructure, but I think this unintended deletion can occur when you have 2 Proxmox nodes that are not in a cluster and use the same mount point for VM disks. If you define a VM with the same VMID on both nodes, deleting the VM on one node will delete it on the second as well.
 
I do indeed have two Proxmox servers, but they have their own storage backends.

I have copied all data back last weekend. And everything is fine again. :)

Thank you for your brainpower and hints/tips.
 
