Experiencing slow migration on LXC containers

buco · Nov 12, 2023

Hi,

I'm experimenting with an HA cluster setup. For the moment I don't have ZFS (pseudo-shared, i.e. replicated) storage yet. I'm trying to migrate an LXC container from one node to the other. It works, but it shuts down the container and then sits for a very long time on:

Code:

2023-11-12 06:34:08 starting migration of CT 105 to node 'pve2' (10.10.10.162)
2023-11-12 06:34:08 found local volume 'local:105/vm-105-disk-0.raw' (in current VM config)
2023-11-12 06:34:09 Formatting '/var/lib/vz/images/105/vm-105-disk-0.raw', fmt=raw size=53687091200 preallocation=off

Then it seems to be ready, starts the actual migration, and finishes.

But the "Formatting" step takes a really long time.

The container is very minimal, neither host is under any load, both have plenty of free RAM, and all storage is on NVMe disks.

I'm trying to understand why it is taking so long. What is happening during "Formatting"?

My storage.cfg:

Code:

root@hades:~# cat /etc/pve/storage.cfg 
dir: local
    path /var/lib/vz
    content images,snippets,rootdir,vztmpl,backup
    shared 0

dir: LocalBackups
    path /var/lib/vz/localbackups
    content iso
    prune-backups keep-all=1
    shared 0

lvm: sabnzb
    vgname pve
    content rootdir,images
    shared 0

zfspool: backup
    pool backup
    content images,rootdir
    mountpoint /backup
    nodes hades

dir: MainBackup
    path /backup
    content vztmpl,iso,backup
    prune-backups keep-all=1
    shared 0

root@hades:~#

The containers I tried to move which went unexpectedly slow:

Code:

root@pve2:~# cat /etc/pve/lxc/10{4,5}.conf
arch: amd64
cores: 1
features: nesting=1
hostname: intranet.smetnet.be
memory: 256
nameserver: 10.10.10.7
net0: name=eth0,bridge=vmbr0,firewall=1,gw=10.10.10.1,hwaddr=DE:36:6A:46:B7:20,ip=10.10.10.8/32,type=veth
onboot: 1
ostype: debian
rootfs: local:104/vm-104-disk-0.raw,size=25G
searchdomain: smetnet.be
swap: 256
tags: working
unprivileged: 1
arch: amd64
cores: 1
features: nesting=1
hostname: gitea.smetnet.be
memory: 512
nameserver: 10.10.10.7
net0: name=eth0,bridge=vmbr0,firewall=1,gw=10.10.10.1,hwaddr=F6:11:40:8A:08:E7,ip=10.10.10.9/32,type=veth
onboot: 1
ostype: debian
rootfs: local:105/vm-105-disk-0.raw,size=50G
searchdomain: smetnet.be
swap: 512
tags: working
unprivileged: 1
root@pve2:~#

LnxBil · Nov 12, 2023

buco said:
I'm trying to understand why it is taking so long. What is happening during "Formatting"?

The data has to be physically moved from one host to the other. This is called a shared-nothing migration. On the destination, a new volume is created (a raw image on your 'local' directory storage, judging by the log), a filesystem is created on it (= formatted), and the files are copied over.
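
To put a rough number on that copy: the size in the "Formatting" line is the full volume size, not the space actually used inside the container. A minimal sketch of the arithmetic follows. It assumes the whole raw image is streamed over the network, and it assumes a throughput of 110 MiB/s (roughly a saturated 1 Gbit/s link). Neither assumption is confirmed by the log above, and the function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Size reported by the "Formatting" line in the migration log, in bytes.
size_bytes=53687091200

# Convert bytes to whole GiB (1 GiB = 1024^3 bytes).
bytes_to_gib() {
    echo $(( $1 / 1024 / 1024 / 1024 ))
}

# Estimate the seconds needed to copy <bytes> at <rate> MiB/s.
# Integer division, so the result is rounded down.
transfer_seconds() {
    local bytes=$1 rate_mib_s=$2
    echo $(( bytes / (rate_mib_s * 1024 * 1024) ))
}

echo "image size: $(bytes_to_gib "$size_bytes") GiB"
# 110 MiB/s is an ASSUMED rate, not a measured one.
echo "estimated copy time: $(transfer_seconds "$size_bytes" 110) s"
```

For CT 105 this gives 50 GiB and, at the assumed rate, about 465 seconds. A faster migration network or a smaller rootfs shortens that proportionally.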

In general, LXC containers cannot be live-migrated, because a container has to be stopped on the source node before it can be started on the other side. If you want "real" live migration, go with a "real" VM (QEMU/KVM), which can do live storage migration and live RAM migration without downtime. This will be even slower in your setup, though, because a cluster with "real" HA relies on (distributed or dedicated) shared storage. ZFS replication is one way to go, but it is not "real" HA, because you will lose more data (everything written since the last replication run) if the node the VM runs on fails.
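
The difference between the two guest types shows up on the command line as well. The sketch below only prints the migrate command for review and does not run it. The helper name migrate_cmd and VMID 200 are made up for illustration; the pct and qm options are the ones I'd expect to need for a restart migration of a container and an online migration of a VM with local disks, so check them against the man pages on your version.

```shell
#!/usr/bin/env bash
# Print (not run) the migrate command for a guest so it can be reviewed first.
# Usage: migrate_cmd <ct|vm> <vmid> <target-node>
migrate_cmd() {
    local kind=$1 vmid=$2 target=$3
    case "$kind" in
        # Containers: stop on the source, copy, start on the target.
        ct) echo "pct migrate $vmid $target --restart" ;;
        # VMs: keep running while RAM and local disks are copied.
        vm) echo "qm migrate $vmid $target --online --with-local-disks" ;;
        *)  echo "unknown guest type: $kind" >&2; return 1 ;;
    esac
}

migrate_cmd ct 105 pve2
migrate_cmd vm 200 pve2   # VMID 200 is a hypothetical example
```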
