Stuck migration of containers

James Crook · Feb 3, 2021

So two questions really.

Is it possible to cancel or stop migrations once they are in the queue? I.e. I run

Code:

pvenode migrateall ProxMoxNode2 --maxworkers 1

but after a few containers it stops, because one container fails to shut down, and there are 30 more queued behind that locked one.

Is it possible to change the "Maximal Workers/bulk-action" setting after I have pushed all the containers to the queue? I.e.
I run the command above and get a locked container, which stops the queue. Can I change the maximal workers in the GUI to 2, so that it processes two at a time? Since only one is locked, it would continue along, ending with just one stuck in stopping and none stuck in the queue.

James Crook · Feb 3, 2021

To answer my second question:
Yes, if you change the option in the GUI, it will apply to the next items in the queue to be actioned.

Whether it would allow you to increase from one to two to get through the list I have yet to test (I suspect not, as a worker submits jobs to process and needs to keep running, but it is waiting for the previous job to complete).

fiona · Feb 5, 2021

Hi,
I'm probably too late by now, but for future reference if somebody runs into this: when you click on the Migrate all VMs and Containers task log in the GUI there should be a Stop button. Note that in the GUI you can also de-select the problematic container before starting the bulk migration.

James Crook · Feb 5, 2021

I couldn't see a stop, as I had set it through a crontab job.

It's always a different container that locks up on shutdown. The logs show the container can't remount root, so some kind of race in journald, I think.

After going down a rabbit hole, I think it has to do with CentOS 7 being on systemd 219 and Proxmox wanting 220+.

fiona · Feb 8, 2021

James Crook said:
I couldn't see a stop, as I had set it through a crontab job.

Even if you start it on the CLI, there should be a task log in the GUI. Of course instead of clicking Stop in the GUI, you can also kill -2 the migrateall process. Then it won't start any additional migrations after those that are currently running.
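
As a rough sketch of the kill -2 route (not an official tool): the helper below sends SIGINT to every process whose command line matches a pattern. The function name interrupt_matching and the pattern "migrateall" are assumptions; how the bulk-migration process actually appears in the process list may differ, so check with pgrep -af first.

```shell
#!/bin/sh
# Hypothetical helper: send SIGINT (signal 2) to every process whose
# command line matches the given pattern.
interrupt_matching() {
    pattern="$1"
    sent=0
    # pgrep -f matches against the full command line, one PID per line.
    for pid in $(pgrep -f "$pattern"); do
        # Skip this shell itself in case its own command line matches.
        [ "$pid" = "$$" ] && continue
        kill -2 "$pid" && sent=$((sent + 1))
    done
    # Succeed only if at least one process was signalled.
    [ "$sent" -gt 0 ]
}

# Example (assumed pattern; the [m] keeps the pattern from matching itself):
# interrupt_matching '[m]igrateall'
```

Migrations that are already running are left to finish; only the start of further ones is prevented, as described above.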

James Crook said:
It's always a different container that locks up on shutdown. The logs show the container can't remount root, so some kind of race in journald, I think.

After going down a rabbit hole, I think it has to do with CentOS 7 being on systemd 219 and Proxmox wanting 220+.

That sounds unfortunate. Do you have any additional information on this?

James Crook · Feb 8, 2021

I might have forgotten that we don't log into the web console as root, so that might have been why we didn't see the migration task.

fiona said:
That sounds unfortunate. Do you have any additional information on this?

Not a huge amount. When it happens, the journald service inside the container is in an error state.
There is also a message in dmesg stating it couldn't remount root.

It doesn't happen every time, and the trouble is that when it does, the customer needs it fixed quickly (as the container is half off).

Using "pct enter xxx" and issuing poweroff shuts down the container the rest of the way, allowing the migration to finish. I kind of stopped looking into it when I realised it was an old systemd that is unsupported from a Proxmox point of view.
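
The workaround above could be scripted roughly as follows. This is only a sketch: it swaps the interactive pct enter for pct exec, and assumes pct exec works on a half-stopped container the same way pct enter does, which has not been confirmed here. The function name finish_shutdown and the PCT override variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: run poweroff inside a container that is stuck
# half-way through shutdown, so the pending migration can continue.
finish_shutdown() {
    vmid="$1"
    # PCT is an assumed override (e.g. PCT=echo for a dry run);
    # by default the real pct binary is used.
    "${PCT:-pct}" exec "$vmid" -- poweroff
}

# Dry run, only prints the arguments that would be passed to pct:
# PCT=echo finish_shutdown 123
```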
