Hello all.
Sorry for my English, I'm not a native speaker )
I'd like to share my Proxmox VM backup/restore/consistency-check bash script.
I'm posting it for two reasons:
- it could be useful for someone
- I'd like to get some critique. Perhaps this approach is fundamentally wrong and you all use other methods.
So, here is how it works.
We have a main Proxmox server and a backup Proxmox server, which is ready to run the VMs if the main server goes down.
(Yes, Proxmox has a replication feature and that's great, but it requires ZFS, and ZFS has noticeably lower performance and much higher RAM requirements.)
1. We make a VM backup via the built-in Proxmox feature: Datacenter - Backup
2. Copy the .lzo dump to the backup server via scp, triggered by cron
3. Restore the VM on the backup server
4. Edit the VM config file:
- disable the network interface, to avoid IP address conflicts when two copies of the VM are running
- remove CD-ROM drives
- disable start on boot
- reduce memory (our backup server is not as powerful as the main one)
- prepend the current date to the VM's name, so we know when the copy was restored
5. Start the VM
6. Check whether the VM booted. We can't ping it, because its network interface is disabled, so we use qm agent ping instead.
7. If the VM started successfully, we shut it down and remove the .lzo file
That's it.
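Step 2 (copying the dump to the backup server) is not part of the script below. A minimal sketch of how it might look on the main server, assuming passwordless root SSH keys are already set up and `backup-srv` is the backup host (the file name and host name are just placeholders):

```shell
# /etc/cron.d/copy-vzdump (hypothetical) - push today's dumps to the backup server.
# Note: % is special in crontab entries and must be escaped as \%.
30 3 * * * root scp /var/lib/vz/dump/*$(date +\%Y_\%m_\%d)*.lzo root@backup-srv:/var/lib/vz/dump/
```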
So, here is a simplified version of the script, which is run by cron on the backup server:
Code:
#!/bin/bash

### VM restore function ###
function restoreVMID {
    qm stop "$1"
    sleep 20s
    qmrestore /var/lib/vz/dump/vzdump-qemu-$1-$(date +%Y_%m_%d)*.vma.lzo "$1" --unique 1 --storage local-zfs --force 1
    if [ $? -ne 0 ]; then
        echo "VM $1 recovery failed!"
        return 123
    fi
    echo "VM $1 recovered successfully"
    ### Edit VM config: keep NIC link down, drop cdrom/onboot, cap RAM, date-stamp the name ###
    sed -i 's/firewall=.*/firewall=1,link_down=1/' /etc/pve/qemu-server/$1.conf
    sed -i '/cdrom/d' /etc/pve/qemu-server/$1.conf
    sed -i '/onboot/d' /etc/pve/qemu-server/$1.conf
    sed -i 's/memory:.*/memory: 4096/' /etc/pve/qemu-server/$1.conf
    sed -i "s/\(.*\)name: /\1name: $(date +%Y.%m.%d)-/" /etc/pve/qemu-server/$1.conf
}

### Count backups, restore our VMs and test availability ###
countLZO=0
# NOTE: cut -c 30-32 assumes 3-digit VMIDs at a fixed position in the path
for i in $(ls /var/lib/vz/dump/*$(date +%Y_%m_%d)*.lzo | cut -c 30-32 | uniq); do
    if ! restoreVMID "$i"; then
        echo "$(date +%Y.%m.%d\ %H:%M) VM $i recovery failed!"
        continue
    fi
    qm start "$i"
    sleep 3m                       # give the guest time to boot
    if qm agent "$i" ping; then    # guest NIC is down, so ping the QEMU agent instead
        echo "agent ping $i is OK"
        rm /var/lib/vz/dump/vzdump-qemu-$i*.vma.lzo
        countLZO=$((countLZO+1))
    else
        echo "VM $i agent ping error!"
    fi
    qm stop "$i"
done
echo "Total backup count = $countLZO"

### Count existing VMs ###
countVM=$(qm list | awk 'NR!=1' | wc -l)
echo "LZO $countLZO VM $countVM"
echo "FINISH"
exit 0
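One fragile spot in the script is `cut -c 30-32`: it picks the VMID out of the filename by absolute character position, so it silently breaks for VMIDs with more or fewer than three digits, or if the dump directory path ever changes. A sketch of a position-independent alternative, assuming the standard `vzdump-qemu-<vmid>-<timestamp>` naming:

```shell
# Extract VMIDs by pattern instead of fixed character positions
for f in /var/lib/vz/dump/vzdump-qemu-*-$(date +%Y_%m_%d)*.vma.lzo; do
    [ -e "$f" ] || continue     # skip if the glob matched nothing
    basename "$f" | sed -E 's/^vzdump-qemu-([0-9]+)-.*/\1/'
done | sort -u
```

This way a 4- or 5-digit VMID (e.g. 5005) is extracted just as reliably as a 3-digit one.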