Proxmox VE vzdump backup of VMs

Jera92

Dear members of the Proxmox forum,

Since yesterday I've been experiencing a problem with the vzdump backup tool.
When the backup of my VMs starts, everything goes well at first, but after about 5-10 minutes I receive the following error message: unable to find configuration file for VM 100 - no such machine.
After that the backup is aborted.

I back up my VMs to an external hard drive attached to the PVE server.
The version I'm running: pve-manager/5.4-5/c6fdb264 (running kernel: 4.15.18-14-pve)

I thought maybe there is a file or directory missing in the /var/lock directory, but I'm not sure about that.
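In case it helps, this is how I looked at it (I'm assuming qemu-server keeps its per-VM lock files under /var/lock/qemu-server/, so this may be the wrong place to look):

Code:
ls -l /var/lock/qemu-server/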

Can somebody help me with this?
 
* Can you post the full output of such a backup command? (and the command itself — see below for where the full log usually ends up)
* Is there anything weird/interesting in /var/log/syslog while this issue happens?
* Are you backing up a CT or a VM?
* Can you post the config for this guest? (`pct config CTID` or `qm config VMID`)
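(for the full output: vzdump normally also writes a .log file next to the archive in the dump directory, so something like this should find it — the path below is just an example, adjust it to your backup storage:)

Code:
ls -lt /path/to/dumpdir/vzdump-*.log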
 
Thank you for your reply, oguz.

The backup command: vzdump 100 101 104 106 --mailto "email-address" --compress gzip --mailnotification failure --mode snapshot --quiet 1 --storage back-up --node proxmox-sam
The output of one VM (same for the others):
100: 2019-05-18 00:30:02 INFO: Starting Backup of VM 100 (qemu)
100: 2019-05-18 00:30:02 INFO: status = running
100: 2019-05-18 00:30:03 INFO: update VM 100: -lock backup
100: 2019-05-18 00:30:03 INFO: VM Name: Fedorasrv28
100: 2019-05-18 00:30:03 INFO: include disk 'scsi0' 'local-lvm:vm-100-disk-0' 30G
100: 2019-05-18 00:30:03 INFO: include disk 'scsi1' 'local-lvm:vm-100-disk-1' 60G
100: 2019-05-18 00:30:03 INFO: backup mode: snapshot
100: 2019-05-18 00:30:03 INFO: ionice priority: 7
100: 2019-05-18 00:30:03 INFO: skip unused drive 'DATA:vm-100-disk-1' (not included into backup)
100: 2019-05-18 00:30:03 INFO: snapshots found (not included into backup)
100: 2019-05-18 00:30:03 INFO: creating archive '/mnt/bkp/dump/vzdump-qemu-100-2019_05_18-00_30_02.vma.gz'
100: 2019-05-18 00:30:03 INFO: started backup task 'b81c5695-37ff-44ba-8f72-4f4e402b6e14'
100: 2019-05-18 00:30:06 INFO: status: 0% (197656576/96636764160), sparse 0% (135872512), duration 3, read/write 65/20 MB/s
100: 2019-05-18 00:30:21 INFO: status: 1% (1615462400/96636764160), sparse 1% (1279037440), duration 18, read/write 94/18 MB/s
100: 2019-05-18 00:30:33 INFO: status: 2% (1934753792/96636764160), sparse 1% (1285128192), duration 30, read/write 26/26 MB/s
100: 2019-05-18 00:31:11 INFO: status: 3% (2907832320/96636764160), sparse 1% (1305821184), duration 68, read/write 25/25 MB/s
100: 2019-05-18 00:31:58 INFO: status: 4% (3869048832/96636764160), sparse 1% (1314619392), duration 115, read/write 20/20 MB/s
100: 2019-05-18 00:32:42 INFO: status: 5% (4850188288/96636764160), sparse 1% (1391579136), duration 159, read/write 22/20 MB/s
100: 2019-05-18 00:32:43 ERROR: unable to find configuration file for VM 100 - no such machine
100: 2019-05-18 00:32:43 INFO: aborting backup job
100: 2019-05-18 00:32:43 ERROR: unable to find configuration file for VM 100 - no such machine
100: 2019-05-18 00:32:45 ERROR: Backup of VM 100 failed - unable to find configuration file for VM 100 - no such machine

I'm backing up VMs; I'm not using containers at the moment.
I checked the syslog, but I found nothing that could explain the backup failure.
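For reference, I searched along these lines (assuming the default syslog location):

Code:
grep -i vzdump /var/log/syslog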
 
The output of qm config 100:

Code:
agent: 1
bootdisk: scsi0
cores: 2
keyboard: fr-be
memory: 2048
name: Fedorasrv28
net0: virtio=B6:9A:CC:FD:55:CC,bridge=vmbr0
numa: 0
onboot: 1
ostype: l26
parent: snapRestoreDisk
scsi0: local-lvm:vm-100-disk-0,size=30G
scsi1: local-lvm:vm-100-disk-1,size=60G
scsihw: virtio-scsi-pci
smbios1: uuid=359112dd-86d0-477e-82bf-eb038abd4b96
sockets: 2
unused0: DATA:vm-100-disk-1
vga: qxl
 
hi.

a few questions:

* are you on the latest pve version? (output of `pveversion -v` is useful)
* do you have a cluster? (if yes, details)
* what kind of storage is on `/mnt/bkp`? sometimes there are network-related issues on nfs/smb storages which might cause problems during backup
* did you try with another backup mode? ("suspend" or "stop")
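for the last point, something like this should do (reusing the storage name "back-up" from your command; adjust as needed):

Code:
vzdump 100 --mode suspend --storage back-up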
 
Hi

* are you on the latest pve version? (output of `pveversion -v` is useful)
the output:
Code:
proxmox-ve: 5.4-1 (running kernel: 4.15.18-14-pve)
pve-manager: 5.4-5 (running version: 5.4-5/c6fdb264)
pve-kernel-4.15: 5.4-2
pve-kernel-4.15.18-14-pve: 4.15.18-38
pve-kernel-4.15.18-13-pve: 4.15.18-37
pve-kernel-4.15.18-10-pve: 4.15.18-32
corosync: 2.4.4-pve1
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-9
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-51
libpve-guest-common-perl: 2.0-20
libpve-http-server-perl: 2.0-13
libpve-storage-perl: 5.0-42
libqb0: 1.0.3-1~bpo9
lvm2: 2.02.168-pve6
lxc-pve: 3.1.0-3
lxcfs: 3.0.3-pve1
novnc-pve: 1.0.0-3
proxmox-widget-toolkit: 1.0-26
pve-cluster: 5.0-37
pve-container: 2.0-37
pve-docs: 5.4-2
pve-edk2-firmware: 1.20190312-1
pve-firewall: 3.0-20
pve-firmware: 2.0-6
pve-ha-manager: 2.0-9
pve-i18n: 1.1-4
pve-libspice-server1: 0.14.1-2
pve-qemu-kvm: 3.0.1-2
pve-xtermjs: 3.12.0-1
qemu-server: 5.0-51
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.13-pve1~bpo2

* do you have a cluster? (if yes, details)
No, I don't; it's a standalone server.

* what kind of storage is on `/mnt/bkp`? sometimes there are network related issues on nfs/smb storages which might cause problems during backup
The /mnt/bkp directory is a mounted external hard drive formatted as ext4. I'm not using NFS or Samba storage.
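For reference, the mount itself can be double-checked with something like this (standard tools, nothing PVE-specific):

Code:
findmnt /mnt/bkp
dmesg | grep -i -E 'usb|ata|error'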

* did you try with another backup mode? ("suspend" or "stop")
I tried both, but they ended up with the same result.

Thank you for your answer!
 
hi again.

* are you on the latest pve version? (output of `pveversion -v` is useful)
the output:

looks like your packages aren't the latest ones.

* try an upgrade with `apt update` and then `apt full-upgrade`, followed by rebooting the server.
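i.e.:

Code:
apt update
apt full-upgrade
reboot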

The /mnt/bkp directory is a mounted External hard drive formatted in ext4. I'm not using a nfs or samba storage space.

alright. this rules out any network-related issues. maybe there's something wrong with your storage.

* try running a smartctl test on your device.

Code:
smartctl --test=short /your/device

wait a few minutes and run:

Code:
smartctl -a /your/device
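
you can also pull just the self-test log once the short test finishes:

Code:
smartctl -l selftest /your/device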

that's all i can think of for now.
 
* try an upgrade with `apt update` and then `apt full-upgrade`, followed by rebooting the server.
I did a complete upgrade of my pve server and then rebooted.
After the reboot I triggered a manual backup process, but I still receive the error message for all my VMs.
100: 2019-05-18 00:32:43 ERROR: unable to find configuration file for VM 100 - no such machine
100: 2019-05-18 00:32:43 INFO: aborting backup job
100: 2019-05-18 00:32:43 ERROR: unable to find configuration file for VM 100 - no such machine
100: 2019-05-18 00:32:45 ERROR: Backup of VM 100 failed - unable to find configuration file for VM 100 - no such machine

* try running a smartctl test on your device.
I ran the smartctl test on all my hard drives, but I didn't find any problems that could cause the backup failure.

I checked the system logs, but I still can't find anything that points to the issue.
 
hi.

can you check and send the outputs of:

Code:
ls -arilh /etc/pve
ls -arilh /etc/pve/qemu-server/
systemctl status pve-cluster
ps aux | grep pmxcfs

there should be VM config files at /etc/pve/qemu-server/VMID.conf, and you should see a running pmxcfs process.
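a quick extra check: /etc/pve is a fuse filesystem provided by pmxcfs, so the following should show a fuse mount there — if it doesn't, pmxcfs isn't actually serving the configs:

Code:
findmnt /etc/pve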
 
Hi Oguz, here you can find the output:

ls -arilh /etc/pve
total 8.0K
3 -rw-r----- 1 root www-data 880 May 14 11:59 vzdump.cron
104717 -r--r----- 1 root www-data 483 Jan 1 1970 .vmlist
104715 -r--r----- 1 root www-data 445 Jan 1 1970 .version
104723 -rw-r----- 1 root www-data 60 Jul 30 2018 user.cfg
104726 -rw-r----- 1 root www-data 412 Apr 20 13:30 storage.cfg
104716 -r--r----- 1 root www-data 1.5K Jan 1 1970 .rrd
104721 lrwxr-xr-x 1 root www-data 0 Jan 1 1970 qemu-server -> nodes/proxmox-sam/qemu-server
13 -rw-r----- 1 root www-data 1.7K Jul 31 2018 pve-www.key
16 -rw-r----- 1 root www-data 2.1K Jul 31 2018 pve-root-ca.pem
4 drwx------ 2 root www-data 0 Jul 31 2018 priv
104718 lrwxr-xr-x 1 root www-data 0 Jan 1 1970 openvz -> nodes/proxmox-sam/openvz
6 drwxr-xr-x 2 root www-data 0 Jul 31 2018 nodes
104722 -r--r----- 1 root www-data 206 Jan 1 1970 .members
104719 lrwxr-xr-x 1 root www-data 0 Jan 1 1970 lxc -> nodes/proxmox-sam/lxc
638 lrwxr-xr-x 1 root www-data 0 Jan 1 1970 local -> nodes/proxmox-sam
104727 drwxr-xr-x 2 root www-data 0 Apr 10 10:09 firewall
104714 -rw-r----- 1 root www-data 2 Jan 1 1970 .debug
104725 -rw-r----- 1 root www-data 56 May 26 09:44 datacenter.cfg
104724 -rw-r----- 1 root www-data 374 Apr 12 20:24 corosync.conf
104720 -r--r----- 1 root www-data 8.8K Jan 1 1970 .clusterlog
28 -rw-r----- 1 root www-data 451 Jul 31 2018 authkey.pub
3276801 drwxr-xr-x 93 root root 4.0K May 23 16:28 ..
1 drwxr-xr-x 2 root www-data 0 Jan 1 1970 .

ls -arilh /etc/pve/qemu-server/
104721 lrwxr-xr-x 1 root www-data 0 Jan 1 1970 /etc/pve/qemu-server -> nodes/proxmox-sam/qemu-server

systemctl status pve-cluster
pve-cluster.service - The Proxmox VE cluster filesystem
Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2019-05-28 13:29:31 CEST; 4 days ago
Process: 18787 ExecStartPost=/usr/bin/pvecm updatecerts --silent (code=exited, status=0/SUCCESS)
Process: 18765 ExecStart=/usr/bin/pmxcfs (code=exited, status=0/SUCCESS)
Main PID: 18774 (pmxcfs)
Tasks: 7 (limit: 4915)
Memory: 36.9M
CPU: 5min 31.306s
CGroup: /system.slice/pve-cluster.service
└─18774 /usr/bin/pmxcfs

Jun 01 08:29:05 proxmox-sam.jdr-mdr.be pmxcfs[18774]: [dcdb] notice: data verification successful
Jun 01 09:29:05 proxmox-sam.jdr-mdr.be pmxcfs[18774]: [dcdb] notice: data verification successful
Jun 01 10:29:05 proxmox-sam.jdr-mdr.be pmxcfs[18774]: [dcdb] notice: data verification successful
Jun 01 11:29:05 proxmox-sam.jdr-mdr.be pmxcfs[18774]: [dcdb] notice: data verification successful
Jun 01 12:29:05 proxmox-sam.jdr-mdr.be pmxcfs[18774]: [dcdb] notice: data verification successful
Jun 01 13:29:05 proxmox-sam.jdr-mdr.be pmxcfs[18774]: [dcdb] notice: data verification successful
Jun 01 14:29:05 proxmox-sam.jdr-mdr.be pmxcfs[18774]: [dcdb] notice: data verification successful
Jun 01 15:29:05 proxmox-sam.jdr-mdr.be pmxcfs[18774]: [dcdb] notice: data verification successful
Jun 01 16:29:05 proxmox-sam.jdr-mdr.be pmxcfs[18774]: [dcdb] notice: data verification successful
Jun 01 17:29:05 proxmox-sam.jdr-mdr.be pmxcfs[18774]: [dcdb] notice: data verification successful

ps aux | grep pmxcfs
root 3391 0.0 0.0 12784 940 pts/0 S+ 17:46 0:00 grep pmxcfs
root 18774 0.0 0.4 675840 38536 ? Ssl May28 5:30 /usr/bin/pmxcfs
 
