Problem restoring backups from Proxmox Backup Server

tirhnossa

Hey guys,

I'm new to the forum and looking for help with a problem I've been facing for some time; I haven't found anyone who has managed to solve it.

I'm Brazilian and I'm using a translator, so some things may be poorly translated.

I have backups of my local PVE virtual machines, made by a Proxmox Backup Server (also local). Some machines have had problems, and when I try to restore a backup I always get the error below after the restore reaches about 20%:

restore failed: reading file "/datastore/POOL-BKP/.chunks/0bb5/0bb5ea2de6ee7f33a1887ede72c4deb1268dc9fbd2cc1d5228e6f20a01b8bb8b" failed: Input/output error (os error 5)
Logical volume "vm-101-disk-0" successfully removed
temporary volume 'local-lvm:vm-101-disk-0' sucessfully removed
error before or during data restore, some or all disks were not completely restored. VM 101 state is NOT cleaned up.
TASK ERROR: command '/usr/bin/pbs-restore --repository backup@pbs@172.31.0.10:POOL-BKP vm/100/2024-05-07T03:00:02Z drive-ide0.img.fidx /dev/pve/vm-101-disk-0 --verbose --format raw --skip-zero' failed: exit code 255

If anyone has been through this, could you help me resolve it or instruct me on how to proceed? (I've already tried to restore 3 machines in the past, from different periods and with different operating systems, and I always get this error.)

Versions of the virtualization and backup systems:

Proxmox Virtual Environment 7.1-10
Proxmox Backup Server 3.1-2
 

Hi,
this error indicates that your Proxmox Backup Server might have an issue with the storage underlying the datastore. Please check the systemd journal on the PBS for errors; you can get a paginated view of the journal since boot, in reverse chronological order, by running journalctl -r -b
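For example (the dd line below reuses the chunk path from your error message; if it hits the same I/O error, the datastore disk itself is the problem):

Bash:
# journal since boot, newest entries first, paginated
journalctl -r -b

# restrict the output to warnings and errors
journalctl -b -p warning

# try reading the failing chunk directly
dd if=/datastore/POOL-BKP/.chunks/0bb5/0bb5ea2de6ee7f33a1887ede72c4deb1268dc9fbd2cc1d5228e6f20a01b8bb8b of=/dev/null bs=1M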
 
As a side note: schedule verification jobs in PBS. They will detect this kind of failure (among others), so you know something is wrong with your backups well before you need to restore any of them.
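If you prefer the command line over the GUI, a verification job can also be managed with proxmox-backup-manager. A minimal sketch, assuming your datastore is named POOL-BKP (the job ID is just an example):

Bash:
# create a daily verification job for the POOL-BKP datastore
proxmox-backup-manager verify-job create verify-pool-bkp \
    --store POOL-BKP --schedule daily

# list the configured verification jobs
proxmox-backup-manager verify-job list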
 
Good morning,

When I executed the command, it returned this log. Could this be the problem?

May 27 08:31:17 pbs login[7156]: ROOT LOGIN on '/dev/pts/0'
May 27 08:31:17 pbs systemd[1]: Started session-156.scope - Session 156 of User root.
May 27 08:31:17 pbs systemd[1]: Started user@0.service - User Manager for UID 0.
May 27 08:31:17 pbs systemd[7141]: Startup finished in 318ms.
May 27 08:31:17 pbs systemd[7141]: Reached target default.target - Main User Target.
May 27 08:31:17 pbs systemd[7141]: Reached target basic.target - Basic System.
May 27 08:31:17 pbs systemd[7141]: Reached target sockets.target - Sockets.
May 27 08:31:17 pbs systemd[7141]: Listening on gpg-agent.socket - GnuPG cryptographic agent and passphrase cache.
May 27 08:31:17 pbs systemd[7141]: Listening on gpg-agent-ssh.socket - GnuPG cryptographic agent (ssh-agent emulation).
May 27 08:31:17 pbs systemd[7141]: Listening on gpg-agent-extra.socket - GnuPG cryptographic agent and passphrase cache (restricted).
May 27 08:31:17 pbs systemd[7141]: Listening on gpg-agent-browser.socket - GnuPG cryptographic agent and passphrase cache (access for web browse>
May 27 08:31:17 pbs systemd[7141]: Listening on dirmngr.socket - GnuPG network certificate management daemon.
May 27 08:31:17 pbs systemd[7141]: Reached target timers.target - Timers.
May 27 08:31:17 pbs systemd[7141]: Reached target paths.target - Paths.
May 27 08:31:17 pbs systemd[7141]: Created slice app.slice - User Application Slice.
May 27 08:31:17 pbs systemd[7141]: Queued start job for default target default.target.
May 27 08:31:16 pbs (systemd)[7141]: pam_unix(systemd-user:session): session opened for user root(uid=0) by (uid=0)
May 27 08:31:16 pbs systemd[1]: Starting user@0.service - User Manager for UID 0...
May 27 08:31:16 pbs systemd[1]: Finished user-runtime-dir@0.service - User Runtime Directory /run/user/0.
May 27 08:31:16 pbs systemd[1]: Starting user-runtime-dir@0.service - User Runtime Directory /run/user/0...
May 27 08:31:16 pbs systemd[1]: Created slice user-0.slice - User Slice of UID 0.
May 27 08:31:16 pbs systemd-logind[465]: New session 156 of user root.
May 27 08:31:16 pbs login[7135]: pam_unix(login:session): session opened for user root(uid=0) by root(uid=0)
May 27 08:25:43 pbs postfix/smtp[7119]: D1AE739C084D: to=<ti@rhnossa.com.br>, relay=none, delay=63374, delays=63368/0.02/6.1/0, dsn=4.4.3, statu>
May 27 08:25:37 pbs postfix/qmgr[760]: D1AE739C084D: from=<root@pbs.rhnossa.local>, size=1208, nrcpt=1 (queue active)
May 27 08:20:42 pbs postfix/smtp[7115]: 2339D39C0856: to=<ti@rhnossa.com.br>, relay=none, delay=29826, delays=29820/0.02/6.1/0, dsn=4.4.3, statu>
May 27 08:20:36 pbs postfix/qmgr[760]: 2339D39C0856: from=<root@pbs.rhnossa.local>, size=2097, nrcpt=1 (queue active)
May 27 08:20:36 pbs proxmox-backup-[758]: pbs proxmox-backup-proxy[758]: rrd journal successfully committed (25 files in 0.109 seconds)
May 27 08:20:36 pbs proxmox-backup-[758]: pbs proxmox-backup-proxy[758]: starting rrd data sync
May 27 08:20:36 pbs proxmox-backup-[758]: pbs proxmox-backup-proxy[758]: write rrd data back to disk
May 27 08:19:28 pbs smartd[464]: Device: /dev/sda [SAT], 571 Offline uncorrectable sectors
May 27 08:19:28 pbs smartd[464]: Device: /dev/sda [SAT], 571 Currently unreadable (pending) sectors
May 27 08:17:01 pbs CRON[7109]: pam_unix(cron:session): session closed for user root
May 27 08:17:01 pbs CRON[7110]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)

Hi,
this error indicates that your Proxmox Backup Server might have an issue with the storage underlying the datastore. Please check the systemd journal on the PBS for errors; you can get a paginated view of the journal since boot, in reverse chronological order, by running journalctl -r -b
 
As a side note: schedule verification jobs in PBS. They will detect this kind of failure (among others), so you know something is wrong with your backups well before you need to restore any of them.
Good morning, how are you?

I activated the verification job in this tab; would that be correct?

(screenshot attached)
 
May 27 08:19:28 pbs smartd[464]: Device: /dev/sda [SAT], 571 Offline uncorrectable sectors
May 27 08:19:28 pbs smartd[464]: Device: /dev/sda [SAT], 571 Currently unreadable (pending) sectors
There is some problem with your /dev/sda disk. Replace it asap.
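To see the full picture behind those smartd messages, you can query the disk directly with smartmontools (already running on your PBS, judging by the log):

Bash:
# full SMART health report for the suspect disk
smartctl -a /dev/sda

# optionally run a short self-test, then check the results with -a again
smartctl -t short /dev/sda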

I activated the verification job in this tab; would that be correct?
Yes. If possible, change the verification time so it will not overlap with backup or garbage collection tasks on the server.
 
There is some problem with your /dev/sda disk. Replace it asap.


Yes. If possible, change the verification time so it will not overlap with backup or garbage collection tasks on the server.
So, in this case, since /dev/sda is my entire disk, do I actually need to replace it?
(screenshot attached)

Regarding the verification job, thanks for the tip; I've already changed the schedule.
 
I would definitely replace it: it will fail sooner rather than later. Add a new drive, create a partition and format it as ext4. Create a new datastore on it and sync the backups from the old HDD to the new one. Hopefully, most backups will sync correctly.
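For reference, a rough CLI sketch of that procedure. The datastore name, mount path, and credentials below are placeholders; note that a sync job pulls through a "remote" entry, which here simply points back at the same PBS host:

Bash:
# create the new datastore on the replacement disk (path is an example)
proxmox-backup-manager datastore create POOL-BKP-NEW /mnt/new-disk/backup

# register this PBS as a "remote" so a sync job can pull from it
proxmox-backup-manager remote create local-pbs \
    --host 127.0.0.1 --auth-id backup@pbs --password 'PASSWORD' \
    --fingerprint '<server certificate fingerprint>'

# pull everything from the old datastore into the new one
proxmox-backup-manager sync-job create move-to-new \
    --store POOL-BKP-NEW --remote local-pbs --remote-store POOL-BKP

Snapshots whose chunks sit on the bad sectors will fail to sync; the rest should come across intact.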
 
I would definitely replace it: it will fail sooner rather than later. Add a new drive, create a partition and format it as ext4. Create a new datastore on it and sync the backups from the old HDD to the new one. Hopefully, most backups will sync correctly.
Thank you, I will try this
 
So, in this case, since /dev/sda is my entire disk, do I actually need to replace it?
As sda is also your boot/system drive, if you want to take the risk and just "clone" it to another disk, this would be the walk-through. But a fresh install of Proxmox/PBS on another disk is the recommended route.

Bash:
# sda => current (failing) disk
# sdb => new disk

# Copy the partition layout to the new disk
sgdisk /dev/sda -R /dev/sdb
sgdisk -G /dev/sdb

# Create the LVM physical volume on the new disk and add it to the VG
pvcreate /dev/sdb3
vgextend pve /dev/sdb3

# Move the data from the old physical volume to the new one
pvmove /dev/sda3 /dev/sdb3

# Remove the old disk from the VG/PV
vgreduce pve /dev/sda3
pvremove /dev/sda3

umount /boot/efi

# Recreate the boot partition
proxmox-boot-tool format /dev/sdb2
proxmox-boot-tool init /dev/sdb2

blkid /dev/sdb* | grep vfat
# Take the UUID value from the command above ^
# Edit /etc/fstab, adding the boot partition (no quotes around the UUID):
    UUID=3139-05AE /boot/efi vfat defaults 0 1

mount /boot/efi

proxmox-boot-tool refresh

Then just remove the old HDD and leave the new one running. You don't even need to reboot.
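Before pulling the old disk, it is worth confirming that the new one really is set up to boot and holds the data:

Bash:
# the ESP on /dev/sdb2 should show up as configured and in sync
proxmox-boot-tool status

# the "pve" volume group should now sit on /dev/sdb3 only
pvs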
 
