[SOLVED] Can't migrate VM on Ceph when source node _also_ has local-zfs but VM isn't using it

ikogan

I have four nodes set up with Ceph, one of which _also_ has ZFS set up. All of my ISOs, etc. are on ZFS. This particular VM was running on a node _without_ ZFS, just Ceph. I had to take that node offline today, so I migrated the VM to the one _with_ ZFS. After bringing the node back up, I cannot migrate the VM back to it because, for some reason, Proxmox complains that the "local-zfs" storage is not available. The VM I'm trying to migrate isn't using "local-zfs":

Code:
agent: 1
balloon: 2048
bootdisk: scsi0
cores: 2
cpu: Broadwell-noTSX,flags=+pcid;+spec-ctrl
hotplug: disk,network,usb,memory
ide2: none,media=cdrom
memory: 6144
name: IPA-Freyr
net0: virtio=6E:1B:98:D3:A2:8E,bridge=vmbr0
net1: virtio=96:F1:D1:65:23:3B,bridge=vmbr1
numa: 1
onboot: 1
ostype: l26
rng0: source=/dev/urandom
scsi0: cluster:vm-110-disk-0,discard=on,size=15G,ssd=1
scsihw: virtio-scsi-pci
serial0: socket
smbios1: uuid=07daab8b-bbcb-434c-952c-a7bf9b60ca8e
sockets: 1
vmgenid: 2e46e650-7b52-440e-8a1d-b60e4896db42

Storage configuration:

Code:
dir: local
    path /var/lib/vz
    content vztmpl,iso,backup,snippets
    maxfiles 5
    shared 0

lvmthin: local-lvm
    thinpool data
    vgname pve
    content rootdir,images

rbd: cluster
    content images,rootdir
    krbd 1
    monhost 10.11.1.1;10.11.1.2;10.11.1.3,10.11.1.4
    pool rbd
    username admin

zfspool: local-zfs-disk
    pool Data/Virtualization/Disk
    content rootdir,images
    nodes perun
    sparse 1

nfs: nas
    export /data/Virtualization/Data
    path /mnt/pve/nas
    server storage.domain.private
    content rootdir,images,vztmpl,iso,backup,snippets
    maxfiles 5
    options vers=4.2

dir: local-zfs
    path /Data/Virtualization/Data
    content vztmpl,iso,snippets
    maxfiles 5
    nodes perun
    shared 0

Both online and offline migration fails for this VM:

Code:
task started by HA resource agent
2020-05-07 22:42:23 use dedicated network address for sending migration traffic (10.13.1.3)
2020-05-07 22:42:23 starting migration of VM 110 to node 'triglav' (10.13.1.3)
2020-05-07 22:42:23 ERROR: Failed to sync data - storage 'local-zfs' is not available on node 'triglav'
2020-05-07 22:42:23 aborting phase 1 - cleanup resources
2020-05-07 22:42:23 ERROR: migration aborted (duration 00:00:00): Failed to sync data - storage 'local-zfs' is not available on node 'triglav'
TASK ERROR: migration aborted
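
For completeness, this is roughly how the same migration would be requested from the CLI (VMID and target node as in the log above; since the guest is HA-managed, the request may get routed through the HA stack):

Code:
# request an online migration of VM 110 to node 'triglav'
qm migrate 110 triglav --online

# or ask the HA manager directly, matching how the failing task was started
ha-manager migrate vm:110 triglav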
 
What's your 'pveversion -v' output? What does 'pvesm list local-zfs' say?
 
pveversion -v:

Code:
proxmox-ve: 6.1-2 (running kernel: 5.3.18-3-pve)
pve-manager: 6.1-11 (running version: 6.1-11/f2f18736)
pve-kernel-helper: 6.1-9
pve-kernel-5.3: 6.1-6
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.3.18-2-pve: 5.3.18-2
ceph: 14.2.9-pve1
ceph-fuse: 14.2.9-pve1
corosync: 3.0.3-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ifupdown2: residual config
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.15-pve1
libproxmox-acme-perl: 1.0.2
libpve-access-control: 6.0-7
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.1-1
libpve-guest-common-perl: 3.0-10
libpve-http-server-perl: 3.0-5
libpve-storage-perl: 6.1-7
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.2-1
lxcfs: 4.0.3-pve2
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.1-6
pve-cluster: 6.1-8
pve-container: 3.1-4
pve-docs: 6.1-6
pve-edk2-firmware: 2.20200229-1
pve-firewall: 4.1-2
pve-firmware: 3.0-7
pve-ha-manager: 3.0-9
pve-i18n: 2.1-1
pve-qemu-kvm: 4.1.1-4
pve-xtermjs: 4.3.0-1
qemu-server: 6.1-20
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.3-pve1

pvesm list local-zfs:

Code:
Volid                                                                 Format  Type            Size VMID
local-zfs:iso/CentOS-7-x86_64-Minimal-1810.iso                        iso     iso        962592768
local-zfs:iso/CentOS-8-x86_64-1905-dvd1.iso                           iso     iso       7135559680
local-zfs:iso/ClearOS-DVD-x86_64.iso                                  iso     iso       1141899264
local-zfs:iso/clonezilla-live-20170220-yakkety-amd64.iso              iso     iso        276824064
local-zfs:iso/Fedora-Server-dvd-x86_64-30-1.2.iso                     iso     iso       3177185280
local-zfs:iso/Fedora-Server-netinst-x86_64-31-1.9.iso                 iso     iso        681574400
local-zfs:iso/Fedora-Workstation-Live-x86_64-31-1.9.iso               iso     iso       1929379840
local-zfs:iso/rancheros-proxmoxve-autoformat.iso                      iso     iso        150994944
local-zfs:iso/rancheros-proxmoxve.iso                                 iso     iso        150994944
local-zfs:iso/sysresccd-20161103-4.9.0.iso                            iso     iso        660402176
local-zfs:iso/tails-i386-2.10.iso                                     iso     iso       1209116672
local-zfs:iso/ubuntu-18.04-desktop-amd64.iso                          iso     iso       1921843200
local-zfs:iso/ubuntu-18.04.2-live-server-amd64.iso                    iso     iso        874512384
local-zfs:iso/ubuntu-19.10-live-server-amd64.iso                      iso     iso        883949568
local-zfs:iso/virtio-win.iso                                          iso     iso        371732480
local-zfs:vztmpl/ubuntu-18.04-standard_18.04.1-1_amd64.tar.gz         tgz     vztmpl     213430501
local-zfs:vztmpl/ubuntu-19.04-standard_19.04-1_amd64.tar.gz           tgz     vztmpl     213467952
local-zfs:vztmpl/centos-7-default_20161207_amd64.tar.xz               txz     vztmpl      65763092
local-zfs:vztmpl/centos-8-default_20191016_amd64.tar.xz               txz     vztmpl     106244064
local-zfs:vztmpl/fedora-30-default_20190718_amd64.tar.xz              txz     vztmpl      70874204
 
I can't reproduce this here - the check is only supposed to trigger if there are disks on that storage. Can you try the following:
perl -e 'use strict; use warnings; use Data::Dumper; use PVE::Storage; print Dumper(PVE::Storage::vdisk_list(PVE::Storage::config(), "local-zfs", 110))'
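
One usage note: PVE::Storage::config() parses the storage configuration under /etc/pve, which regular users cannot read, so the command should be run from a root shell; otherwise it will most likely just report that the storage does not exist.

Code:
# run from a root shell (the storage config under /etc/pve is not readable
# by unprivileged users); lists all volumes on 'local-zfs' owned by VMID 110
perl -e 'use strict; use warnings; use Data::Dumper; use PVE::Storage; print Dumper(PVE::Storage::vdisk_list(PVE::Storage::config(), "local-zfs", 110))'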
 
Yeah, that's what I was thinking too; it seems very strange. When I run that, I get "storage 'local-zfs' does not exist". Is there a way to increase debug output during that migration?
 
Whoops, I forgot to run that as root. Here's the actual output:

Code:
Use of uninitialized value $node in concatenation (.) or string at /usr/share/perl5/PVE/Storage.pm line 145, <DATA> line 755.
Use of uninitialized value $node in concatenation (.) or string at /usr/share/perl5/PVE/Storage.pm line 145, <DATA> line 755.
Use of uninitialized value $node in concatenation (.) or string at /usr/share/perl5/PVE/Storage.pm line 145, <DATA> line 755.
Debug: local-zfs - Debug: local-zfs - Debug: local-zfs - Use of uninitialized value $node in concatenation (.) or string at /usr/share/perl5/PVE/Storage.pm line 145.
Debug: local-zfs - $VAR1 = {                         
          'local-zfs' => [                           
                           {                         
                             'parent' => undef,      
                             'size' => '10737418240',                                                     
                             'format' => 'qcow2',    
                             'vmid' => '110',        
                             'used' => 3280748544,   
                             'volid' => 'local-zfs:110/vm-110-disk-0.qcow2',                              
                             'ctime' => 1575778516   
                           }                         
                         ]                           
        }
 
Wow, OK, so there apparently is indeed a "110/vm-110-disk-0.qcow2" in that directory, left over from back in November. I guess it never got deleted. I'm assuming 'pvesm list' didn't show it because the storage's content types don't include images. Moving that disk out of the way allowed the migration to continue. Would you consider this a bug? Maybe the GUI should display a warning when there's a weird situation like this?
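
In case anyone else runs into this, a minimal sketch of that cleanup, assuming the standard dir-storage layout of <path>/images/<vmid>/ under the path from the storage config above ('/root/orphaned-disks' is just an arbitrary example destination):

Code:
# locate any stray disk images for VMID 110 under the 'local-zfs' dir storage
find /Data/Virtualization/Data/images -maxdepth 2 -name 'vm-110-*' -ls

# move the stale image out of the storage tree instead of deleting it,
# so it can be restored if something still turns out to need it
mkdir -p /root/orphaned-disks
mv /Data/Virtualization/Data/images/110/vm-110-disk-0.qcow2 /root/orphaned-disks/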
 
