[SOLVED] OpenVZ containers going offline every night

Yep, no idea either. It might be worth testing whether disabling the backup job (e.g. for one night) also prevents the error.

If you know a time slot where you can afford to have some downtime, you might also run the whole vzdump command live in shell to see what happens.
 
Thanks for the help, everyone!

I really appreciate it.

"If you know a time slot where you can afford to have some downtime, you might also run the whole vzdump command live in shell to see what happens."
What does the command do? Does it just launch the backup job like the cron job would?
 
You're welcome ;)

Yes, it just launches the backup task. If the container stays up, then it must be something different.
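For reference, a manual run for a single container might look like the sketch below. The container ID 627 is the one from the log in this thread; the log path and the helper name manual_backup_cmd are made up for illustration, and the guard simply skips the run on machines that have no vzdump installed.

```shell
#!/bin/sh
# Run the backup for one container by hand and keep a copy of the output.
CTID=627                               # container ID from this thread
LOG="/tmp/vzdump-manual-$CTID.log"     # hypothetical log location

# Hypothetical helper: print the command so it can be checked before running.
manual_backup_cmd() {
    printf 'vzdump %s\n' "$1"
}

echo "Will run: $(manual_backup_cmd "$CTID")"

# Only execute on a host that actually has vzdump installed.
if command -v vzdump >/dev/null 2>&1; then
    vzdump "$CTID" 2>&1 | tee "$LOG"
fi
```

Watching the container from a second terminal while this runs shows whether the backup itself takes it offline.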

Oh, so it launches the backup task for one particular machine? I thought it launched all of them. Anyhow, this is what I get. It looks like Proxmox isn't even able to suspend the container, so that command doesn't seem to be what is taking it down.

Here's what I got:

INFO: starting new backup job: vzdump 627
INFO: Starting Backup of VM 627 (openvz)
INFO: CTID 627 exist mounted running
INFO: status = running
INFO: mode failure - unable to dump into snapshot (use option --dumpdir)
INFO: trying 'suspend' mode instead
INFO: backup mode: suspend
INFO: ionice priority: 7
INFO: starting first sync /var/lib/vz/private/627/ to /var/lib/vz/dump/vzdump-openvz-627-2016_11_28-21_04_30.tmp
INFO: Number of files: 33299
INFO: Number of files transferred: 26386
INFO: Total file size: 1006611966 bytes
INFO: Total transferred file size: 1002145245 bytes
INFO: Literal data: 1002146768 bytes
INFO: Matched data: 0 bytes
INFO: File list size: 796176
INFO: File list generation time: 0.001 seconds
INFO: File list transfer time: 0.000 seconds
INFO: Total bytes sent: 1004205758
INFO: Total bytes received: 525105
INFO: sent 1004205758 bytes received 525105 bytes 18780016.13 bytes/sec
INFO: total size is 1006611966 speedup is 1.00
INFO: first sync finished (53 seconds)
INFO: suspend vm
INFO: Setting up checkpoint...
INFO: suspend...
INFO: Can not suspend container: Invalid argument
INFO: Error: unsupported fs type fuse
INFO: Checkpointing failed
ERROR: Backup of VM 627 failed - command 'vzctl --skiplock chkpnt 627 --suspend' failed: exit code 16
INFO: Backup job finished with errors
job errors
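To pull the relevant lines out of a long log like the one above, a small filter can help. This is only a sketch: the function name vzdump_failures and the list of patterns are my own choice, based on the messages that appear in this particular log, and other failures may use different wording.

```shell
#!/bin/sh
# Hypothetical helper: read a vzdump log on stdin and print only the
# lines that describe a failure.
vzdump_failures() {
    grep -E '^ERROR:|Error:|Can not|failed'
}

# Sample input copied from the log in this thread.
vzdump_failures <<'EOF'
INFO: backup mode: suspend
INFO: suspend vm
INFO: Can not suspend container: Invalid argument
INFO: Error: unsupported fs type fuse
INFO: Checkpointing failed
ERROR: Backup of VM 627 failed - command 'vzctl --skiplock chkpnt 627 --suspend' failed: exit code 16
EOF
```

On this log it leaves the four lines from "Can not suspend container" through the final ERROR line, which point at the fuse filesystem as the reason the checkpoint failed.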
 
I have done a lot of testing, and eventually I figured out that it's not the whole server that goes down, but merely Apache.

Well, the server did get suspended during backup, but the fact that I kept seeing downtime after disabling backups was caused by Apache _also_ being restarted during log rotation.
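For anyone hitting the same thing, here is a rough way to check which logrotate configs restart or reload Apache. It is a sketch only: the function name find_apache_restarts is made up, /etc/logrotate.d is assumed to be where the configs live (the usual location on Debian-based systems), and the pattern only catches a service name and a restart/reload keyword on the same line.

```shell
#!/bin/sh
# Hypothetical helper: list logrotate config files that mention apache/httpd
# together with "restart" or "reload" on the same line.
find_apache_restarts() {
    dir=${1:-/etc/logrotate.d}   # assumed default location
    grep -rlE '(apache2?|httpd).*(restart|reload)|(restart|reload).*(apache2?|httpd)' "$dir" 2>/dev/null
}

find_apache_restarts /etc/logrotate.d ||
    echo "no logrotate config mentions an Apache restart or reload"
```

If a file shows up, its postrotate section is the place to look, and the time logrotate runs can be compared against the time of the nightly downtime.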

Thanks for all your help.
 
