Please help: VM disappeared during migration

I have a small cluster with two hardware servers, HP1 and HP2. On HP1 I have a VM (CentOS) running as HA, and everything has been running fine for a long time.

Today I tried to copy a lot of files from a USB disk to the VM, and then everything seemed to stop. The Proxmox GUI was no longer available, and the VM was unreachable.

I rebooted the hardware server, and the Proxmox GUI worked again. There I can see that my VM was moved away from HP1 to HP2. But I can't start my VM on HP2; it gives me this error:
Code:
TASK ERROR: timeout: no zvol device link for 'vm-100-disk-0' found after 300 sec found.

And my VM is not to be found on HP1. I guess that my VM became unresponsive, and so Proxmox tried to move it from HP1 to HP2. But then Proxmox on HP1 crashed.

Is there any way for me to recover the VM from HP1? I do have a backup, but it takes a long time (more than 6 hours) to restore it.

Any help appreciated,
Jesper, Denmark
 
Can you still see the virtual disk in whatever storage HP1 was using? If so, it may be possible to recover the VM by moving the VM's config file back to HP1. You should be able to move the config by powering off the VM and then running: mv /etc/pve/nodes/HP2/qemu-server/<VMID>.conf /etc/pve/nodes/HP1/qemu-server/<VMID>.conf
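
A minimal sketch of that step, assuming the standard Proxmox VE layout where VM configs live under /etc/pve/nodes/<node>/qemu-server/. The function name move_vm_config and the PVE_ROOT override are hypothetical, added only so the move can be rehearsed against a scratch directory first. Before moving anything, it is worth confirming on HP1 that the zvol still exists, for example with zfs list -t volume | grep vm-100-disk-0. Since this VM is under HA, the HA manager may try to move it again, so it is probably wise to take it out of HA first (an assumption about this setup, not something confirmed in the thread).

```shell
# Hypothetical helper: move a VM config from one cluster node to another.
# PVE_ROOT defaults to /etc/pve; override it to rehearse on a scratch copy.
move_vm_config() {
    src_node=$1
    dst_node=$2
    vmid=$3
    root=${PVE_ROOT:-/etc/pve}
    src="$root/nodes/$src_node/qemu-server/$vmid.conf"
    dst="$root/nodes/$dst_node/qemu-server/$vmid.conf"

    # Refuse to continue if the source config is missing.
    if [ ! -f "$src" ]; then
        echo "no config at $src" >&2
        return 1
    fi

    # Refuse to overwrite a config that already exists on the target node.
    if [ -e "$dst" ]; then
        echo "config already exists at $dst" >&2
        return 1
    fi

    mv "$src" "$dst"
}
```

With the VM powered off, move_vm_config HP2 HP1 100 would correspond to the mv command above. The two checks are there so that a typo in the node name or VMID fails loudly instead of clobbering another config.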
 
Hello Dylan, thanks a lot for your suggestion, but it's too late. It's still good to know, should this happen to me again. The VM acts as mail and web server for a few domains, so after a few hours I decided to restore from my backup.

Fortunately this went smoothly, but resulted in a loss of emails that were sent and received between the time of the backup and the time of the crash. Thanks to Fabian, I was able to remount the filesystem of my VM on HP1, and recover the lost emails: https://forum.proxmox.com/threads/mounting-the-filesystem-from-a-lost-vm.102610/

I think the crash may have been caused by sharing of USB-ports. I tried to copy a lot of large files from a USB-stick to my VM, when almost immediately both Proxmox and my VM became unresponsive. Should I in general be cautious when trying to access USB-sticks from my VM?
 
Hi, sorry I couldn't get to the question in time, but I'm glad the recovery process went smoothly :)

I can't say I have much personal experience with USB passthrough, but I haven't come across many recent issues with it. Are you using Host or Spice USB passthrough?
Could you also post the output of pveversion -v and qm config <VMID>?
 
Hello Dylan, thanks for your interest.

To make my USB-stick accessible from the VM, I did:
Code:
# qm monitor 100
qm> device_add usb-host,vendorid=0x781,productid=0x5583,id=someid
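
For comparison, Proxmox can also record a host USB passthrough in the VM config with qm set <VMID> -usb0 host=<vendor>:<product>, rather than hot-adding the device through the monitor. The sketch below only builds and prints that command from the IDs used above; the function name usb_passthrough_cmd is hypothetical, and whether the config-based route would have behaved any differently in this crash is not known.

```shell
# Hypothetical helper: print (not run) the qm command for host USB passthrough.
# IDs are read as hex, with or without a leading 0x, and zero-padded to 4 digits.
usb_passthrough_cmd() {
    vmid=$1
    vendor=$(printf '%04x' "0x${2#0x}")
    product=$(printf '%04x' "0x${3#0x}")
    echo "qm set $vmid -usb0 host=$vendor:$product"
}
```

Calling usb_passthrough_cmd 100 0x781 0x5583 prints the command for the SanDisk stick in this thread, with the vendor ID padded to 0781 as it appears in the kernel log.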

Code:
# pveversion -v
proxmox-ve: 7.1-1 (running kernel: 5.13.19-2-pve)
pve-manager: 7.1-8 (running version: 7.1-8/5b267f33)
pve-kernel-helper: 7.1-6
pve-kernel-5.13: 7.1-5
pve-kernel-5.11: 7.0-10
pve-kernel-5.13.19-2-pve: 5.13.19-4
pve-kernel-5.13.19-1-pve: 5.13.19-3
pve-kernel-5.11.22-7-pve: 5.11.22-12
pve-kernel-5.11.22-1-pve: 5.11.22-2
ceph-fuse: 15.2.13-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.0
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-5
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.0-14
libpve-guest-common-perl: 4.0-3
libpve-http-server-perl: 4.0-4
libpve-storage-perl: 7.0-15
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.11-1
lxcfs: 4.0.11-pve1
novnc-pve: 1.3.0-1
proxmox-backup-client: 2.1.2-1
proxmox-backup-file-restore: 2.1.2-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-4
pve-cluster: 7.1-3
pve-container: 4.1-3
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-4
pve-ha-manager: 3.3-1
pve-i18n: 2.6-2
pve-qemu-kvm: 6.1.0-3
pve-xtermjs: 4.12.0-1
qemu-server: 7.1-4
smartmontools: 7.2-1
spiceterm: 3.2-2
swtpm: 0.7.0~rc1+2
vncterm: 1.7-1
zfsutils-linux: 2.1.1-pve3

I don't have the config of my original VM anymore, but here's the one for the restored backup:
Code:
# qm config 102
boot: order=scsi0;ide2;net0
cores: 2
description:  ...
ide2: none,media=cdrom
memory: 20480
name: SME10
net0: virtio=DA:34:09:6F:DB:38,bridge=vmbr1,firewall=1
net1: e1000=6A:D3:24:DF:42:79,bridge=vmbr0,firewall=1
numa: 0
onboot: 1
ostype: l26
scsi0: local-zfs:vm-102-disk-0,size=512G
scsi2: /dev/disk/by-id/ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M0KH2SX9,backup=0,size=1953514584K
scsihw: virtio-scsi-pci
smbios1: uuid=9d410adc-5bcd-439a-94f0-514185fcd010
sockets: 1
vmgenid: 02d055f1-1124-4500-a09a-b7f3da050546

Here's the contents of /var/log/syslog around the time of the crash:
Code:
Jan  5 13:57:43 sonja kernel: [4123481.657855] usb 3-2: new SuperSpeed USB device number 11 using xhci_hcd
Jan  5 13:57:43 sonja kernel: [4123481.682484] usb 3-2: New USB device found, idVendor=0781, idProduct=5583, bcdDevice= 1.00
Jan  5 13:57:43 sonja kernel: [4123481.682997] usb 3-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Jan  5 13:57:43 sonja kernel: [4123481.683514] usb 3-2: Product: Ultra Fit
Jan  5 13:57:43 sonja kernel: [4123481.684011] usb 3-2: Manufacturer: SanDisk
Jan  5 13:57:43 sonja kernel: [4123481.684507] usb 3-2: SerialNumber: 4C530001031202110073
Jan  5 13:57:43 sonja kernel: [4123481.707473] usb-storage 3-2:1.0: USB Mass Storage device detected
Jan  5 13:57:43 sonja kernel: [4123481.729932] scsi host7: usb-storage 3-2:1.0
Jan  5 13:57:44 sonja kernel: [4123482.742518] scsi 7:0:0:0: Direct-Access     SanDisk  Ultra Fit        1.00 PQ: 0 ANSI: 6
Jan  5 13:57:44 sonja kernel: [4123482.743269] sd 7:0:0:0: Attached scsi generic sg5 type 0
Jan  5 13:57:44 sonja kernel: [4123482.746008] sd 7:0:0:0: [sdf] 120127488 512-byte logical blocks: (61.5 GB/57.3 GiB)
Jan  5 13:57:44 sonja kernel: [4123482.749009] sd 7:0:0:0: [sdf] Write Protect is off
Jan  5 13:57:44 sonja kernel: [4123482.749796] sd 7:0:0:0: [sdf] Mode Sense: 43 00 00 00
Jan  5 13:57:44 sonja kernel: [4123482.751672] sd 7:0:0:0: [sdf] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
Jan  5 13:57:44 sonja kernel: [4123482.815876]  sdf: sdf1
Jan  5 13:57:44 sonja kernel: [4123482.858645] sd 7:0:0:0: [sdf] Attached SCSI removable disk
Jan  5 13:58:18 sonja pvedaemon[2208230]: <root@pam> starting task UPID:sonja:002905C2:189400EC:61D595EA:vncshell::root@pam:
Jan  5 13:58:18 sonja pvedaemon[2688450]: starting termproxy UPID:sonja:002905C2:189400EC:61D595EA:vncshell::root@pam:
Jan  5 13:58:19 sonja pvedaemon[543400]: <root@pam> successful auth for user 'root@pam'
Jan  5 13:58:19 sonja systemd[1]: Created slice User Slice of UID 0.
Jan  5 13:58:19 sonja systemd[1]: Starting User Runtime Directory /run/user/0...
Jan  5 13:58:19 sonja systemd[1]: Finished User Runtime Directory /run/user/0.
Jan  5 13:58:19 sonja systemd[1]: Starting User Manager for UID 0...
Jan  5 13:58:19 sonja systemd[2688461]: Queued start job for default target Main User Target.
Jan  5 13:58:19 sonja systemd[2688461]: Created slice User Application Slice.
Jan  5 13:58:19 sonja systemd[2688461]: Reached target Paths.
Jan  5 13:58:19 sonja systemd[2688461]: Reached target Timers.
Jan  5 13:58:19 sonja systemd[2688461]: Listening on GnuPG network certificate management daemon.
Jan  5 13:58:19 sonja systemd[2688461]: Listening on GnuPG cryptographic agent and passphrase cache (access for web browsers).
Jan  5 13:58:19 sonja systemd[2688461]: Listening on GnuPG cryptographic agent and passphrase cache (restricted).
Jan  5 13:58:19 sonja systemd[2688461]: Listening on GnuPG cryptographic agent (ssh-agent emulation).
Jan  5 13:58:19 sonja systemd[2688461]: Listening on GnuPG cryptographic agent and passphrase cache.
Jan  5 13:58:19 sonja systemd[2688461]: Reached target Sockets.
Jan  5 13:58:19 sonja systemd[2688461]: Reached target Basic System.
Jan  5 13:58:19 sonja systemd[2688461]: Reached target Main User Target.
Jan  5 13:58:19 sonja systemd[2688461]: Startup finished in 188ms.
Jan  5 13:58:19 sonja systemd[1]: Started User Manager for UID 0.
Jan  5 13:58:19 sonja systemd[1]: Started Session 1380 of user root.
Jan  5 14:01:06 sonja kernel: [4123685.027960] EXT4-fs (sdd1): recovery complete
Jan  5 14:01:06 sonja kernel: [4123685.045088] EXT4-fs (sdd1): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
Jan  5 14:01:27 sonja kernel: [4123706.486351] BPF:      type_id=10 bits_offset=152
Jan  5 14:01:27 sonja kernel: [4123706.486914] BPF:
Jan  5 14:01:27 sonja kernel: [4123706.487452] BPF:Invalid name
Jan  5 14:01:27 sonja kernel: [4123706.487985] BPF:
Jan  5 14:01:27 sonja kernel: [4123706.487985]
Jan  5 14:01:27 sonja kernel: [4123706.489110] failed to validate module [exfat] BTF: -22
Jan  5 14:02:42 sonja kernel: [4123781.262483] BPF:      type_id=10 bits_offset=152
Jan  5 14:02:42 sonja kernel: [4123781.263057] BPF:
Jan  5 14:02:42 sonja kernel: [4123781.263572] BPF:Invalid name
Jan  5 14:02:42 sonja kernel: [4123781.264073] BPF:
Jan  5 14:02:42 sonja kernel: [4123781.264073]
Jan  5 14:02:42 sonja kernel: [4123781.265091] failed to validate module [exfat] BTF: -22
Jan  5 14:02:59 sonja pvedaemon[543400]: <root@pam> successful auth for user 'root@pam'
Jan  5 14:03:16 sonja kernel: [4123815.405833] BPF:      type_id=10 bits_offset=152
Jan  5 14:03:16 sonja kernel: [4123815.406336] BPF:
Jan  5 14:03:16 sonja kernel: [4123815.406811] BPF:Invalid name
Jan  5 14:03:16 sonja kernel: [4123815.407270] BPF:
Jan  5 14:03:16 sonja kernel: [4123815.407270]
Jan  5 14:03:16 sonja kernel: [4123815.408142] failed to validate module [exfat] BTF: -22
Jan  5 14:04:49 sonja kernel: [4123907.629264] BPF:      type_id=10 bits_offset=152
Jan  5 14:04:49 sonja kernel: [4123907.629683] BPF:
Jan  5 14:04:49 sonja kernel: [4123907.630084] BPF:Invalid name
Jan  5 14:04:49 sonja kernel: [4123907.630434] BPF:
Jan  5 14:04:49 sonja kernel: [4123907.630434]
Jan  5 14:04:49 sonja kernel: [4123907.631117] failed to validate module [exfat] BTF: -22
Jan  5 14:10:50 sonja pvedaemon[2706353]: starting vnc proxy UPID:sonja:00294BB1:18952690:61D598DA:vncproxy:100:root@pam:
Jan  5 14:10:50 sonja pvedaemon[543400]: <root@pam> starting task UPID:sonja:00294BB1:18952690:61D598DA:vncproxy:100:root@pam:
Jan  5 14:14:55 sonja pveproxy[2659]: worker 2648453 finished
Jan  5 14:14:55 sonja pveproxy[2659]: starting 1 worker(s)
Jan  5 14:14:55 sonja pveproxy[2659]: worker 2717853 started
Jan  5 14:14:57 sonja pveproxy[2717846]: got inotify poll request in wrong process - disabling inotify
Jan  5 14:17:01 sonja CRON[2721078]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Jan  5 14:18:58 sonja pvedaemon[2218507]: <root@pam> successful auth for user 'root@pam'
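
To make an excerpt like this easier to skim, here is a small sketch that pulls out the kernel USB events and counts the exfat BTF validation failures (four of them in the log above). The function names are hypothetical, and the patterns are matched only against the line formats visible in this excerpt.

```shell
# Hypothetical helpers for skimming a syslog file around an incident.

# Print kernel lines that mention USB or the usb-storage driver.
usb_events() {
    grep -E 'kernel: .*(usb|USB)' "$1"
}

# Count how often the exfat module failed BTF validation.
count_exfat_btf_failures() {
    grep -c 'failed to validate module \[exfat\] BTF' "$1"
}
```

Both take a log path, for example usb_events /var/log/syslog. They only filter what is already logged, so they cannot show what happened after the last line written before the hang.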
 
