Bulk migrate via command line

churnd

Active Member
Aug 11, 2013
Sometimes I want to bulk migrate my VMs from the command line rather than using the web interface. To do this, I just run a for loop like so:

Code:
for vm in $(qm list | awk '{print $1}' | grep -Eo '[0-9]{1,3}'); do qm migrate $vm node2 --online; done

Sometimes this works great, sometimes it doesn't. When it doesn't, I get an error in the logs just saying "service 'vm:205' - migration failed (exit code 1)". It seems to happen only on the VMs managed by HA, but I can't say that with 100% certainty. However, I notice that if I do them one at a time, it works fine. Is there a problem with using the for loop this way? Is there a better way to bulk migrate via command line?
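
For reference, a slightly more defensive variant of the same loop, which skips the header line of qm list and reports any VM whose migration exits non-zero (node2 is assumed as the target, as above), might look like this:

Code:
# sketch: migrate every listed VM to node2, reporting failures
for vm in $(qm list | awk 'NR>1 {print $1}'); do
    if ! qm migrate "$vm" node2 --online; then
        echo "migration of VM $vm failed" >&2
    fi
done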
 
No, it should be OK like it is, on PVE 4.x at least.

Can you please post the output from:
Code:
pveversion -v

Code:
proxmox-ve: 4.1-34 (running kernel: 4.2.6-1-pve)
pve-manager: 4.1-5 (running version: 4.1-5/f910ef5c)
pve-kernel-4.2.6-1-pve: 4.2.6-34
pve-kernel-4.2.2-1-pve: 4.2.2-16
pve-kernel-4.2.3-2-pve: 4.2.3-22
lvm2: 2.02.116-pve2
corosync-pve: 2.3.5-2
libqb0: 0.17.2-1
pve-cluster: 4.0-30
qemu-server: 4.0-46
pve-firmware: 1.1-7
libpve-common-perl: 4.0-43
libpve-access-control: 4.0-11
libpve-storage-perl: 4.0-38
pve-libspice-server1: 0.12.5-2
vncterm: 1.2-1
pve-qemu-kvm: 2.4-21
pve-container: 1.0-37
pve-firewall: 2.0-15
pve-ha-manager: 1.0-18
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u1
lxc-pve: 1.1.5-5
lxcfs: 0.13-pve3
cgmanager: 0.39-pve1
criu: 1.6.0-1
zfsutils: 0.6.5-pve7~jessie

I did wonder if throwing a "sleep 5" into the loop would help, in that maybe it doesn't like them being submitted that close together.
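
A minimal sketch of that idea, assuming the same node2 target as above, would just be the loop with a pause between submissions:

Code:
# sketch: same loop as before, but wait 5 seconds between submissions
for vm in $(qm list | awk 'NR>1 {print $1}'); do
    qm migrate "$vm" node2 --online
    sleep 5
done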
 
I did wonder if throwing a "sleep 5" into the loop would help, in that maybe it doesn't like them being submitted that close together.


The difference between an HA-managed machine and a "normal" one is that when we perform an action (shutdown, start, migrate) on an HA resource (e.g. through `qm` or `pct`), the original task only redirects it to the HA manager, which executes the action in a background worker.

So for a non-HA-managed machine your script waits for each migration to finish before it starts the next, but if it's HA-managed it only changes the status of all the machines given to migrate, and they all get migrated at once, or at least four at a time, because we limit the number of active HA workers to 4.
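
A rough way to serialize HA-managed migrations from the shell, assuming ha-manager status keeps reporting the service in a migrate state while the migration is still in flight, could look like this:

Code:
# sketch: submit one migration, then wait until the HA manager no longer
# reports the service as migrating before submitting the next one
# (the grep on the ha-manager status output is an approximation)
for vm in $(qm list | awk 'NR>1 {print $1}'); do
    qm migrate "$vm" node2 --online
    while ha-manager status | grep -q "vm:$vm.*migrate"; do
        sleep 2
    done
done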

I tried to reproduce your problem but haven't managed to yet; I'll retry with more VMs.

Can you look in the log (journalctl) to see what happens in the background task? The error message only tells you that this task failed.
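
Assuming the HA services are the usual pve-ha-lrm and pve-ha-crm units, something along these lines should pull the relevant background-task logs:

Code:
# sketch: show the HA manager logs around the failed migration
journalctl -u pve-ha-lrm -u pve-ha-crm --since "1 hour ago"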
 
Thanks for the follow-up!

I was able to more or less reproduce this a few days ago and I'm on it.

The current state is that when a migration fails, we place the service in the started/stopped state (depending on whether it's enabled or not) on the original node.

Here we run into a race condition where the API says the migration failed, but shortly after that it succeeds nonetheless.
The HA manager then places the service in the started state on the origin node while it is already running on the target node => node mismatch.
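
A quick way to see whether you have hit that mismatch is to compare what the HA manager reports with where the guest actually runs, e.g. for the vm:205 from the error above:

Code:
# sketch: where does the HA manager think the service is started?
ha-manager status | grep 'vm:205'
# and where does it actually run? (run on each node)
qm list | awk '$1 == 205'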

Can you please open a bug report on https://bugzilla.proxmox.com/ for the ha-manager with the summed-up info, or at least a link to this thread?
It makes it easier to track the issue :)

A bug regarding failures of concurrent live migrations in our qemu-server package was also already fixed and is in the pvetest repo; this should also fix some issues related to this one.
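
If you want to try that fix, the pvetest repository can be enabled on a PVE 4.x (Jessie) node roughly like this; a sketch assuming the standard repository layout, and pvetest packages are of course not meant for production:

Code:
# sketch: enable the pvetest repository, then update
echo "deb http://download.proxmox.com/debian jessie pvetest" > /etc/apt/sources.list.d/pvetest.list
apt-get update && apt-get dist-upgrade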
 
