#!/bin/bash
# Wrapper for lxc-freeze: abort the real binary if it has not returned after 5 seconds.
timeout 5 /usr/bin/lxc-freeze.real "$@"
INFO: starting new backup job: vzdump 3122 --compress lzo --node sys3 --storage dump-save --mode snapshot --remove 0
INFO: Starting Backup of VM 3122 (lxc)
INFO: status = running
INFO: backup mode: snapshot
INFO: ionice priority: 8
INFO: create storage snapshot snapshot
sys3 ~ # pveversion -v
proxmox-ve: 4.1-32 (running kernel: 4.2.6-1-pve)
pve-manager: 4.1-4 (running version: 4.1-4/ccba54b0)
pve-kernel-4.2.6-1-pve: 4.2.6-32
pve-kernel-4.2.2-1-pve: 4.2.2-16
pve-kernel-4.2.3-1-pve: 4.2.3-18
pve-kernel-4.2.3-2-pve: 4.2.3-22
lvm2: 2.02.116-pve2
corosync-pve: 2.3.5-2
libqb0: 0.17.2-1
pve-cluster: 4.0-30
qemu-server: 4.0-44
pve-firmware: 1.1-7
libpve-common-perl: 4.0-43
libpve-access-control: 4.0-11
libpve-storage-perl: 4.0-38
pve-libspice-server1: 0.12.5-2
vncterm: 1.2-1
pve-qemu-kvm: 2.4-20
pve-container: 1.0-37
pve-firewall: 2.0-15
pve-ha-manager: 1.0-17
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u1
lxc-pve: 1.1.5-5
lxcfs: 0.13-pve3
cgmanager: 0.39-pve1
criu: 1.6.0-1
zfsutils: 0.6.5-pve6~jessie
@RobFantini
So you see an lxc-freeze stuck in the process list? Can you check the container's process list for processes in a different state? All frozen processes should be listed in state "D".
ps -exf -ostat,pid,comm
Also:
The file /sys/fs/cgroup/freezer/lxc/containerid/tasks should list all the frozen tasks, while cgroup.procs should list all processes that are meant to be frozen.
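To make that check easier to repeat, here is a minimal sketch of a helper that prints the freezer state of a cgroup and the kernel state of every process listed in its cgroup.procs. The function name freezer_report is made up for illustration, and it assumes a cgroup v1 freezer hierarchy where freezer.state reads THAWED, FREEZING or FROZEN. It reads /proc directly instead of calling ps, so it only works on Linux.

```shell
#!/bin/bash
# Hypothetical helper (not part of LXC or Proxmox).
# Usage: freezer_report /sys/fs/cgroup/freezer/lxc/<containerid>
freezer_report() {
    local cgdir="$1" pid state comm

    # cgroup v1 freezer: THAWED, FREEZING or FROZEN
    if [ -r "$cgdir/freezer.state" ]; then
        echo "freezer.state: $(cat "$cgdir/freezer.state")"
    fi

    # One line per process: state letter, PID, command name.
    while read -r pid; do
        state=$(awk '/^State:/ {print $2}' "/proc/$pid/status" 2>/dev/null)
        comm=$(cat "/proc/$pid/comm" 2>/dev/null)
        # A PID that exited in the meantime is reported as "gone".
        echo "${state:-gone} $pid ${comm:-?}"
    done < "$cgdir/cgroup.procs"
}
```

If freezer.state stays at FREEZING and some of the listed processes never reach state "D", those are likely the ones holding up lxc-freeze.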
ps -exf -ostat,pid,comm
...
Ss 18273 task UPID:sys3:
S 18276 \_ lxc-freeze
But where are the other processes, the ones that run inside the container? There should at the very least be an init process and some gettys etc.
sys3 /sys/fs/cgroup/freezer/lxc/3122 # cat cgroup.procs
8942
9811
9937
10005
10075
10134
10143
10197
10546
10843
10853
10895
10896
15764
16601
21128
22915
24399
24400
24401
24402
24403
26951
INFO: starting new backup job: vzdump 3122 --compress lzo --node sys3 --storage dump-save --mode snapshot --remove 0
INFO: Starting Backup of VM 3122 (lxc)
INFO: status = running
INFO: backup mode: snapshot
INFO: ionice priority: 8
INFO: create storage snapshot snapshot
ERROR: Backup of VM 3122 failed - VM is locked (snapshot)
INFO: Backup job finished with errors
TASK ERROR: job errors
I have exactly the same problem.
My workaround, until the Proxmox team can fix this, was to schedule each CT in its own backup job, 15 minutes apart.
But the next morning, when I checked the backup logs, my plan had fallen apart.
The very first backup job got stuck at "suspend vm", and of course the other jobs couldn't get the global lock and failed too.
This is what we've settled on to get backups done, as of now:
lxc : 'stop' mode backups are working.
kvm : 'suspend' works here.
I've set up two backup jobs per node: 'stop' mode for lxc and 'suspend' mode for kvm.
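If you want to script that split rather than click it together in the GUI, something like the following might do as a starting point. It is only a sketch: vzdump_cmd is a made-up function name, the --compress and --storage values are copied from the log earlier in this thread, and it only prints the command line instead of running vzdump.

```shell
#!/bin/bash
# Hypothetical helper: print the vzdump command for one guest,
# using 'stop' mode for lxc and 'suspend' mode for kvm.
vzdump_cmd() {
    local vmid="$1" type="$2" mode
    case "$type" in
        lxc) mode=stop ;;
        kvm) mode=suspend ;;
        *)   echo "unknown guest type: $type" >&2; return 1 ;;
    esac
    # Storage name and compression taken from the log above; adjust to your setup.
    echo "vzdump $vmid --compress lzo --storage dump-save --mode $mode"
}
```

Calling vzdump_cmd 4444 lxc prints a command line with --mode stop, which you can review before putting it into a cron job.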
I tried this too, but on one occasion vzdump failed to restart the container after backing it up.
INFO: restarting vm
INFO: lxc-start: lxc_start.c: main: 344 The container failed to start.
command 'lxc-start -n 4444' failed: exit code 1
4444: Jan 19 02:03:59 INFO: Starting Backup of VM 4444 (lxc)
4444: Jan 19 02:03:59 INFO: status = running
4444: Jan 19 02:03:59 INFO: backup mode: stop
4444: Jan 19 02:03:59 INFO: ionice priority: 8
4444: Jan 19 02:03:59 INFO: stopping vm
4444: Jan 19 02:04:11 INFO: creating archive '/bkup/dump/vzdump-lxc-4444-2016_01_19-02_03_59.tar.lzo'
4444: Jan 19 02:05:19 INFO: Total bytes written: 7220039680 (6.8GiB, 101MiB/s)
4444: Jan 19 02:05:19 INFO: archive file size: 2.02GB
4444: Jan 19 02:05:19 INFO: delete old backup '/bkup/dump/vzdump-lxc-4444-2016_01_09-02_23_51.tar.lzo'
4444: Jan 19 02:05:19 INFO: delete old backup '/bkup/dump/vzdump-lxc-4444-2016_01_16-02_19_46.tar.lzo'
4444: Jan 19 02:05:19 INFO: restarting vm
4444: Jan 19 02:05:25 INFO: Finished Backup of VM 4444 (00:01:26)