Freeze during and after backup

riri1310

Member
Jul 14, 2017
14
0
6
44
Hi, I use the built-in backup to back up all the VMs in snapshot mode because they are all running.
I noticed some downtime on SSH connections (I run monit on the VMs, and monit reported an SSH connection problem). So during the backup the VMs are sometimes unavailable, and the strange thing is that after the backup completes, Proxmox freezes all the VMs and I need a hard reboot to restart everything.

Thanks for your help. If you need any command-line output, feel free to ask.
 
Jan 16, 2018
168
27
28
What is the backup target? I had some trouble with NFS as a target; the NFS client in Linux tends to eat up resources.
It looks like CIFS works more reliably in this case.
 

Humbug

Member
Nov 14, 2012
30
1
8
I'm not an expert, but I have similar problems (not involving NFS, though). What you could try as a workaround is a) enable the CFQ I/O scheduler on the host and then b) run vzdump from the shell with its ionice parameter. See man vzdump.
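
A rough sketch of that workaround follows. It only prints the two commands instead of running them, so nothing changes until you paste them into a root shell yourself. The disk name `sda`, the storage name `my-backup-storage`, and the priority 7 are placeholder assumptions, and CFQ has to be offered by the running kernel for the first command to work.

```shell
#!/bin/sh
# Sketch only: DISK and STORAGE are placeholders, not values confirmed for any host.
DISK="${DISK:-sda}"
STORAGE="${STORAGE:-my-backup-storage}"

# Print the command that selects CFQ for one disk (to be run as root on the host).
scheduler_cmd() {
    printf 'echo cfq > /sys/block/%s/queue/scheduler\n' "$1"
}

# Print a vzdump call for all guests in snapshot mode with an explicit ionice priority.
vzdump_cmd() {
    printf 'vzdump --all --mode snapshot --ionice %s --storage %s\n' "$1" "$2"
}

scheduler_cmd "$DISK"
vzdump_cmd 7 "$STORAGE"
```

The scheduler change made this way does not survive a reboot, which makes it easy to back out if it does not help.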
 

riri1310

Humbug said:
I'm not an expert, but I have similar problems (not involving NFS, though). What you could try as a workaround is a) enable the CFQ I/O scheduler on the host and then b) run vzdump from the shell with its ionice parameter. See man vzdump.

Hello, my vzdump is already using "ionice priority: 7". Should I use the CFQ I/O scheduler too?

Thanks for your kind help on this; my server always gets stuck after those backups...
 

Humbug

I just read that ionice generally does not work on NFS:
https://forum.proxmox.com/threads/about-of-ionice-in-vzdump.16485/#post-84955

Maybe you can use another location instead?

riri1310 said:
Hello, my vzdump is already using "ionice priority: 7". Should I use the CFQ I/O scheduler too?

As far as I know, none of the other available I/O schedulers honour ionice parameters, so you would have to use CFQ with it. I think it's still worth a try.

Maybe you can tell us a bit more about:
- which version of PVE your host runs
- how many guests you have
- which OS / version your guests use
- how much free RAM your host has
- whether you use Kernel Samepage Merging (KSM)
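
The details in the list above could be gathered in one go with something like the sketch below. The helper names `run_if_present` and `collect_host_info` are made up for this example; the commands they call are the standard ones on a PVE host, and anything that is not installed is simply skipped.

```shell
#!/bin/sh
# Run a command only if it exists, so the script also works on hosts without PVE tools.
run_if_present() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "### $*"
        "$@"
    else
        echo "### $1: not installed, skipped"
    fi
}

collect_host_info() {
    run_if_present pveversion -v      # PVE and package versions
    run_if_present qm list            # KVM guests
    run_if_present pct list           # LXC containers
    run_if_present free -m            # RAM and swap in MiB
    # KSM state: 0 = stopped, 1 = running, 2 = stopped and pages unmerged
    if [ -r /sys/kernel/mm/ksm/run ]; then
        echo "### KSM run flag: $(cat /sys/kernel/mm/ksm/run)"
    fi
}

collect_host_info
```

Guest OS versions are not covered here; those still have to be checked inside each guest.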
 

riri1310

Thanks

Here are some details:

#pveversion -v
Code:
proxmox-ve: 5.2-2 (running kernel: 4.15.18-5-pve)
pve-manager: 5.2-10 (running version: 5.2-10/6f892b40)
pve-kernel-4.15: 5.2-8
pve-kernel-4.15.18-5-pve: 4.15.18-24
corosync: 2.4.2-pve5
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.0-8
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-40
libpve-guest-common-perl: 2.0-18
libpve-http-server-perl: 2.0-11
libpve-storage-perl: 5.0-30
libqb0: 1.0.1-1
lvm2: 2.02.168-pve6
lxc-pve: 3.0.2+pve1-3
lxcfs: 3.0.2-2
novnc-pve: 1.0.0-2
proxmox-widget-toolkit: 1.0-20
pve-cluster: 5.0-30
pve-container: 2.0-29
pve-docs: 5.2-8
pve-firewall: 3.0-14
pve-firmware: 2.0-5
pve-ha-manager: 2.0-5
pve-i18n: 1.0-6
pve-libspice-server1: 0.12.8-3
pve-qemu-kvm: 2.11.2-1
pve-xtermjs: 1.0-5
qemu-server: 5.0-38
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.11-pve1~bpo1
#cat /sys/block/sda/queue/scheduler
Code:
noop [deadline] cfq
The host has 4 containers, all running Debian 9.

# free -m
              total        used        free      shared  buff/cache   available
Mem:          64110       24917        8502         306       30690       38174
Swap:          4095         402        3693


I think I will change the scheduler from noop to deadline (#cat /sys/block/sda/queue/scheduler) and see what effect it has...

Thanks
 

Humbug

riri1310 said:
#cat /sys/block/sda/queue/scheduler
Code:
noop [deadline] cfq
I think I will change the scheduler from noop to deadline (#cat /sys/block/sda/queue/scheduler) and see what effect it has...

Looks like you're currently running deadline (the one in brackets is active). Yes, give it a try.
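
To read the active scheduler without eyeballing the brackets, a small helper like this would do. The function name `active_scheduler` is invented for the example; the sample line is the output quoted above.

```shell
#!/bin/sh
# Extract the bracketed (active) entry from a scheduler line such as "noop [deadline] cfq".
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Sample input taken from the post above.
active_scheduler "noop [deadline] cfq"
```

On the host itself you would pass in the live value, for example `active_scheduler "$(cat /sys/block/sda/queue/scheduler)"`.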
 

riri1310

You're right, I copied the output after making the change ;)

But the bad news is that I just ran a test and the system still freezes, even with the deadline scheduler...

I have this in syslog:

Code:
Nov 11 13:02:51 ns3091370 pvestatd[4659]: storage 'ftpbackup_ns3091370' is not online
Nov 11 13:02:52 ns3091370 pvestatd[4659]: status update time (6.146 seconds)
It looks like the NFS backup storage had trouble connecting to the host (it is a backup storage provided by OVH).
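
To see how often and for which storage this happens, the syslog can be filtered for the pvestatd message. This is only a sketch: the function name `offline_storages` is made up, and the pattern assumes the message format shown in the two log lines above.

```shell
#!/bin/sh
# Read syslog lines on stdin and print each storage that pvestatd reported as not online.
offline_storages() {
    sed -n "s/.*pvestatd\[[0-9]*]: storage '\([^']*\)' is not online.*/\1/p" | sort -u
}

# Demo with the two log lines from this post.
offline_storages <<'EOF'
Nov 11 13:02:51 ns3091370 pvestatd[4659]: storage 'ftpbackup_ns3091370' is not online
Nov 11 13:02:52 ns3091370 pvestatd[4659]: status update time (6.146 seconds)
EOF
```

On the host, `offline_storages < /var/log/syslog` would run it against the real log, and `pvesm status` shows whether the storage is reachable at the moment.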
 
