[SOLVED] impossible to migrate any VM.

dominix

Hi everyone,

I have a Proxmox 5 cluster that I intend to upgrade to 6 and then 7. The cluster is made of two DL380 servers and an HPE MSA FC disk bay.
Code:
  pve-manager/5.4-15/d0ec33c6 (running kernel: 4.15.18-30-pve)
When I try to migrate any VM, none of them succeeds; every VM fails with an error:
Code:
 can't activate LV '/dev/msa/vm-202-disk-0':   Refusing activation of partial LV msa/vm-202-disk-0.  Use '--activationmode partial' to override.
ERROR: online migrate failure - command '/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=srv2' root@192.168.1.2 qm start 202 --skiplock --migratedfrom srv1 --migration_type secure --stateuri unix --machine pc-i440fx-2.12' failed: exit code 255
aborting phase 2 - cleanup resources
migrate_cancel

Any help or tips appreciated.

For information:
Code:
srv1:~# pvdisplay 
  --- Physical volume ---
  PV Name               /dev/mapper/3600508b1001cbdeec7db635a26adea6c-part3
  VG Name               pve
  PV Size               279,11 GiB / not usable 4,28 MiB
  Allocatable           yes (but full)
  PE Size               4,00 MiB
  Total PE              71452
  Free PE               0
  Allocated PE          71452
  PV UUID               bNlB11-Nz8M-umpI-hP20-rE2g-uV71-hlJbHH
   
  --- Physical volume ---
  PV Name               /dev/mapper/3600c0ff0003c10d77c8bc75b01000000-part2
  VG Name               msa
  PV Size               2,91 TiB / not usable 2,98 MiB
  Allocatable           yes 
  PE Size               4,00 MiB
  Total PE              761746
  Free PE               142712
  Allocated PE          619034
  PV UUID               UaQXZL-d14G-WGkY-qtH9-Hvi7-DlOB-xllMuj
   
  --- Physical volume ---
  PV Name               /dev/mapper/3600c0ff0003c115d7d8bc75b01000000-part1
  VG Name               msa
  PV Size               2,90 TiB / not usable 3,00 MiB
  Allocatable           yes 
  PE Size               4,00 MiB
  Total PE              761508
  Free PE               761508
  Allocated PE          0
  PV UUID               JXSAxF-NT5q-gtAb-B09U-6QZt-urVn-PNTlEP


srv1:~# vgdisplay 
  --- Volume group ---
  VG Name               pve
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  11
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                3
  Open LV               3
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               279,11 GiB
  PE Size               4,00 MiB
  Total PE              71452
  Alloc PE / Size       71452 / 279,11 GiB
  Free  PE / Size       0 / 0   
  VG UUID               fN0OW5-ERLy-5HFC-RvxC-t64r-cYJK-0lOpVp
   
  --- Volume group ---
  VG Name               msa
  System ID             
  Format                lvm2
  Metadata Areas        2
  Metadata Sequence No  135
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                18
  Open LV               16
  Max PV                0
  Cur PV                2
  Act PV                2
  VG Size               5,81 TiB
  PE Size               4,00 MiB
  Total PE              1523254
  Alloc PE / Size       619034 / 2,36 TiB
  Free  PE / Size       904220 / 3,45 TiB
  VG UUID               lUDjoI-manq-GY8d-NAZ6-q9Q0-bWOg-cHEDAt

srv1:~# lvs
  LV            VG  Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  vm-100-disk-0 msa -wi-ao---- 200,00g                                                    
  vm-100-disk-1 msa -wi-ao---- 500,00g                                                    
  vm-101-disk-0 msa -wi-ao---- 100,00g                                                    
  vm-102-disk-0 msa -wi-ao----  20,00g                                                    
  vm-103-disk-0 msa -wi-ao---- 250,00g                                                    
  vm-104-disk-0 msa -wi-ao----  60,00g                                                    
  vm-105-disk-0 msa -wi-ao---- 100,00g                                                    
  vm-106-disk-1 msa -wi-ao----  40,10g                                                    
  vm-106-disk-2 msa -wi-ao----  80,00g                                                    
  vm-107-disk-2 msa -wi-ao----  40,00g                                                    
  vm-108-disk-0 msa -wi-ao---- 150,00g                                                    
  vm-109-disk-0 msa -wi-ao---- 150,00g                                                    
  vm-110-disk-0 msa -wi-a-----  12,00g                                                    
  vm-110-disk-1 msa -wi-a----- 100,00g                                                    
  vm-113-disk-0 msa -wi-ao----  64,00g                                                    
  vm-201-disk-0 msa -wi-ao----  32,00g                                                    
  vm-201-disk-1 msa -wi-ao---- 320,00g                                                    
  vm-202-disk-0 msa -wi-ao---- 200,00g                                                    
  data          pve -wi-ao---- 219,11g                                                    
  root          pve -wi-ao----  50,00g                                                    
  swap          pve -wi-ao----  10,00g
 
A few more details...
I can move a disk from the MSA bay to an NFS share (NAS) used for backups, then move the VM back and forth between the nodes. But when I try to move the disk back to the MSA, I get errors (from either node).

Code:
create full clone of drive scsi0 (NAS:113/vm-113-disk-0.qcow2)
  Cannot change VG msa while PVs are missing.
  Consider vgreduce --removemissing.
TASK ERROR: storage migration failed: error with cfs lock 'storage-msa': lvcreate 'msa/pve-vm-113' error:   Cannot process volume group msa

However, I can migrate any VM if it is stopped.
These operations were possible a few months ago without errors. Did an update cause this mess?
It looks like the VGs are not seen identically from one node to the other.
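
One way to test that hypothesis is to ask LVM on each node whether it considers a VG partial. The sketch below is only an illustration: `partial_vgs` is a hypothetical helper name, and it assumes the output of `vgs --noheadings -o vg_name,vg_attr`, where the 4th character of the attribute string is `p` when one or more PVs of the VG are missing.

```shell
#!/bin/sh
# Hypothetical helper: read `vgs --noheadings -o vg_name,vg_attr` output
# on stdin and print the names of VGs whose 4th attribute character is
# "p" (partial, i.e. at least one PV is missing on this node).
partial_vgs() {
    awk 'substr($2, 4, 1) == "p" { print $1 }'
}

# Intended use on each node (needs root and the LVM tools):
#   vgs --noheadings -o vg_name,vg_attr | partial_vgs
```

Running it on srv1 and on srv2 and comparing the results should show whether only one node sees `msa` as partial.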
 
OK, this last part looks like an LVM issue. I had to run

Code:
vgextend --restoremissing msa <each PV on the vg>

to fix it. Now I can move my VM disks back from NFS to the MSA storage bay, at least on node 1, but not on node 2.
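
For anyone repeating this on a VG with several PVs, here is a rough sketch that prints one `vgextend --restoremissing` command per PV still flagged as missing. `restore_missing_cmds` is a hypothetical helper name; it assumes the output of `pvs --noheadings --separator '|' -o pv_name,vg_name,pv_attr`, where the 3rd character of the attribute string is `m` for a missing PV. It only prints the commands so they can be reviewed first, and it skips entries such as `[unknown]` whose device is not actually present.

```shell
#!/bin/sh
# Hypothetical helper: $1 is the VG name; stdin is the output of
#   pvs --noheadings --separator '|' -o pv_name,vg_name,pv_attr
# Prints (does not run) a vgextend command for each PV of that VG that
# carries the "missing" flag and has a real device path.
restore_missing_cmds() {
    awk -F'|' -v vg="$1" '
        {
            # Trim the padding pvs puts around each field
            for (i = 1; i <= NF; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
        }
        $2 == vg && substr($3, 3, 1) == "m" && $1 ~ /^\// {
            print "vgextend --restoremissing " vg " " $1
        }'
}

# Intended use (needs root and the LVM tools):
#   pvs --noheadings --separator '|' -o pv_name,vg_name,pv_attr \
#       | restore_missing_cmds msa
```

If the printed commands look right, they can be pasted into a root shell one at a time.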
 
There was an electrical disturbance on site that I was not aware of.
It looks like one Fibre Channel switch froze, or something like that (half frozen, half working), so I asked someone on site to power-cycle it.

On node 2 there were some missing physical volumes:
Code:
pvdisplay
...
  --- Physical volume ---
  PV Name               [unknown]
  VG Name               msa
  PV Size               2,90 TiB / not usable 3,00 MiB
  Allocatable           yes
  PE Size               4,00 MiB
  Total PE              761508
  Free PE               761508
  Allocated PE          0
  PV UUID               JXSAxF-NT5q-gtAb-B09U-6QZt-urVn-PNTlEP

I rescanned the LUNs using
Code:
# ls /sys/class/fc_host
host1 host9

# echo "1" > /sys/class/fc_host/host1/issue_lip

on both nodes, and now the storage is clean again. I can move VMs from one node to the other.
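
The command above targets `host1` only. As a sketch under assumptions, the same rescan can be looped over every FC host that exposes `issue_lip`. `rescan_fc_hosts` is a hypothetical helper name, and issuing a LIP on every port is my assumption, not something done in this thread; a LIP resets the link, so it may briefly disturb I/O on that port.

```shell
#!/bin/sh
# Hypothetical helper: write "1" to issue_lip for every FC host found
# under the given directory (default: /sys/class/fc_host).
rescan_fc_hosts() {
    base="${1:-/sys/class/fc_host}"
    for host in "$base"/host*; do
        # Skip when the glob matched nothing or the attribute is absent
        [ -e "$host/issue_lip" ] || continue
        echo "1" > "$host/issue_lip"
        echo "issued LIP on ${host##*/}"
    done
}

# Intended use (as root, on each node):
#   rescan_fc_hosts
```

After the rescan, `pvdisplay` should no longer list a PV named `[unknown]` if the path to the LUN is back.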

End of game. Sysadmin wins :)
 