Hello Proxmox community.
I have a Proxmox server (7.4-3) where I've set up 3 different backup jobs for a 3-2-1 backup setup. I configured them to send email on failure only, which I thought would be just fine.
That server failed: it froze completely, and the only way out was to cold boot the machine. After the reboot, Proxmox would not boot (it got stuck on cleaning up ZFS or something like that). I live-booted Debian and tried to retrieve at least the VM config files, but nothing was there at all. So I told myself it would be fine, since I have backups.
That's where I nearly had a stroke. Backups had been missing for about a year! I don't think the backup jobs stopped running; I think they were failing, but I believe the emails were never sent. (This happens on some of my Proxmox machines: some of them send email and some of them don't. The setup is always the same, so I don't know why.) I can't confirm this, since a new PVE OS was already installed. I was VERY VERY lucky: I hadn't touched the VM configuration for over a year, so I could use the config files from a year ago to restore VM IDs etc. My ZFS pool was fine, so I imported it and rescanned the storage to bring the disks back. The day was saved with no data lost at all; again, I was very lucky.
However, this left me seriously concerned about knowing whether a backup ran and whether it succeeded, using a third-party solution as well rather than relying on emails alone, which failed in my scenario. I use healthchecks.io to monitor almost everything in my IT business.
With healthchecks.io you can signal start, end, and also failure, if you wish, with simple wget or curl commands. Examples below:
wget https://mydomain.com/ping/d353a7c6-d8c9d1f-ae963d23d566/start = this signals the start of the healthcheck monitor
wget https://mydomain.com/ping/d353a7c6-d8c9d1f-ae963d23d566 = this signals the end (success) of the healthcheck monitor
wget https://mydomain.com/ping/d353a7c6-d8c9d1f-ae963d23d566/5 = this signals failure with exit code 5
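To make the three signals concrete, this is roughly how I wrap other jobs today. It is only a minimal sketch: the function names (hc_url, hc_ping, hc_wrap) and the HC_URL variable are my own, and the URL is the same placeholder as in the examples above.

```shell
#!/bin/sh
# Base ping URL for one check (placeholder from the examples above).
HC_URL="${HC_URL:-https://mydomain.com/ping/d353a7c6-d8c9d1f-ae963d23d566}"

# Build the ping URL for a signal: "start", "" (success), or a numeric exit code.
hc_url() {
    if [ -n "$1" ]; then
        printf '%s/%s\n' "$HC_URL" "$1"
    else
        printf '%s\n' "$HC_URL"
    fi
}

# Send one ping. A monitoring outage must never break the job itself,
# so any wget error is swallowed here.
hc_ping() {
    wget -q -T 10 -t 3 -O /dev/null "$(hc_url "$1")" || true
}

# Run a command wrapped in start / success / failure pings
# and hand the command's exit code back to the caller.
hc_wrap() {
    hc_ping start
    rc=0
    "$@" || rc=$?
    if [ "$rc" -eq 0 ]; then
        hc_ping ""
    else
        hc_ping "$rc"
    fi
    return "$rc"
}
```

Usage would be something like `hc_wrap /path/to/some-job.sh`; a non-zero exit code from the job is reported as the failure code.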
My question is:
How can I implement this for my configured backup jobs (these were set up using the web GUI only)? From my research, I understand there is a hook script option for manually created backup jobs, which, if I understand correctly, then have to be put into crontab in order to be executed? One possible solution would be to write a shell script that runs vzdump, but that defeats the purpose of the nice, simple web GUI configuration. Also, if I am right, I would then have to run prune jobs manually? If this is the only way of doing it, could Proxmox staff please consider adding healthchecks.io support in a future release? There seems to be a company, Checkmk, whose product can monitor backups, but I read here on the forum that it doesn't monitor backups well.
https://healthchecks.io
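For reference, this is roughly what I understand the hook approach would look like. It is a sketch under assumptions, not something I have confirmed with GUI-created jobs: I assume vzdump calls the hook with the phase name as the first argument (job-start, backup-abort, job-end, job-abort, as in the example hook script shipped with PVE), and that job-end can still follow after a single guest backup was aborted, hence the marker file. The function name phase_to_url, the FLAG path, and the script location are hypothetical; /fail is the explicit failure endpoint of healthchecks.io.

```shell
#!/bin/sh
# Hypothetical location: /usr/local/bin/vzdump-healthchecks.sh
HC_URL="${HC_URL:-https://mydomain.com/ping/d353a7c6-d8c9d1f-ae963d23d566}"
# Hypothetical marker file that remembers a failed guest backup within one job.
FLAG="${FLAG:-/run/vzdump-hc-failed}"

# Print the ping URL for a vzdump phase, or nothing if the phase needs no ping.
phase_to_url() {
    case "$1" in
        job-start)
            rm -f "$FLAG"                       # new job, forget old failures
            printf '%s/start\n' "$HC_URL" ;;
        backup-abort)
            : > "$FLAG" ;;                      # remember the failure, no ping yet
        job-end)
            if [ -e "$FLAG" ]; then
                printf '%s/fail\n' "$HC_URL"    # at least one guest backup failed
            else
                printf '%s\n' "$HC_URL"         # everything succeeded
            fi ;;
        job-abort)
            printf '%s/fail\n' "$HC_URL" ;;
    esac
}

# Ping only for phases that map to a URL; never fail the backup over a ping.
url="$(phase_to_url "${1:-}")"
if [ -n "$url" ]; then
    wget -q -T 10 -t 3 -O /dev/null "$url" || true
fi
```

As far as I can tell, such a script is registered node-wide with a `script: /usr/local/bin/vzdump-healthchecks.sh` line in /etc/vzdump.conf. If that also applies to jobs created in the web GUI, no crontab entry would be needed, but all three of my jobs would then ping the same check, so please correct me if I have this wrong.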
Any help from staff or someone else would be greatly appreciated,
thank you.
Ladislav