Problem restoring backups from Proxmox Backup Server

tirhnossa

Hey guys,

I'm new to the forum and looking for help with a problem I've been facing for some time; I haven't found anyone who has managed to solve it.

I'm Brazilian and I'm using a translator, so some things may be poorly translated.

I have backups of my local PVE virtual machines, made by a Proxmox Backup Server (also local). Some machines have had problems, and when I try to restore a backup I always get the error below after the restore reaches about 20%:

restore failed: reading file "/datastore/POOL-BKP/.chunks/0bb5/0bb5ea2de6ee7f33a1887ede72c4deb1268dc9fbd2cc1d5228e6f20a01b8bb8b" failed: Input/output error (os error 5)
Logical volume "vm-101-disk-0" successfully removed
temporary volume 'local-lvm:vm-101-disk-0' sucessfully removed
error before or during data restore, some or all disks were not completely restored. VM 101 state is NOT cleaned up.
TASK ERROR: command '/usr/bin/pbs-restore --repository backup@pbs@172.31.0.10:POOL-BKP vm/100/2024-05-07T03:00:02Z drive-ide0.img.fidx /dev/pve/vm-101-disk-0 --verbose --format raw --skip-zero' failed: exit code 255

If anyone has been through this, could you help me resolve it or instruct me on how to proceed? (I've already tried to restore 3 machines in the past, from different periods and with different operating systems, and I always get this error.)

Versions of the virtualization and backup systems:

Proxmox Virtual Environment 7.1-10
Proxmox Backup Server 3.1-2
 

Hi,
this error indicates that your Proxmox Backup Server might have an issue with the storage underlying the datastore. Please check the systemd journal on the PBS for errors; you can get a paginated view of the journal since boot, in reverse chronological order, by running journalctl -r -b
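For example (the dd line below reuses the chunk path from your error message; if it hits the same I/O error, the datastore disk itself is the problem):

Bash:
# journal since boot, newest entries first, paginated
journalctl -r -b

# restrict the output to warnings and errors
journalctl -b -p warning

# try reading the failing chunk directly
dd if=/datastore/POOL-BKP/.chunks/0bb5/0bb5ea2de6ee7f33a1887ede72c4deb1268dc9fbd2cc1d5228e6f20a01b8bb8b of=/dev/null bs=1M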
 
As a side note: schedule verification jobs in PBS. They will detect this kind of failure (among others), so you know something is wrong with your backups well before you need to restore any of them.
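If you prefer the command line over the GUI, a verification job can also be managed with proxmox-backup-manager. A minimal sketch, assuming your datastore is named POOL-BKP (the job ID is just an example):

Bash:
# create a daily verification job for the POOL-BKP datastore
proxmox-backup-manager verify-job create verify-pool-bkp \
    --store POOL-BKP --schedule daily

# list the configured verification jobs
proxmox-backup-manager verify-job list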
 
Good morning,

When I executed the command, it returned this log. Could this be the problem?

May 27 08:31:17 pbs login[7156]: ROOT LOGIN on '/dev/pts/0'
May 27 08:31:17 pbs systemd[1]: Started session-156.scope - Session 156 of User root.
May 27 08:31:17 pbs systemd[1]: Started user@0.service - User Manager for UID 0.
May 27 08:31:17 pbs systemd[7141]: Startup finished in 318ms.
May 27 08:31:17 pbs systemd[7141]: Reached target default.target - Main User Target.
May 27 08:31:17 pbs systemd[7141]: Reached target basic.target - Basic System.
May 27 08:31:17 pbs systemd[7141]: Reached target sockets.target - Sockets.
May 27 08:31:17 pbs systemd[7141]: Listening on gpg-agent.socket - GnuPG cryptographic agent and passphrase cache.
May 27 08:31:17 pbs systemd[7141]: Listening on gpg-agent-ssh.socket - GnuPG cryptographic agent (ssh-agent emulation).
May 27 08:31:17 pbs systemd[7141]: Listening on gpg-agent-extra.socket - GnuPG cryptographic agent and passphrase cache (restricted).
May 27 08:31:17 pbs systemd[7141]: Listening on gpg-agent-browser.socket - GnuPG cryptographic agent and passphrase cache (access for web browse>
May 27 08:31:17 pbs systemd[7141]: Listening on dirmngr.socket - GnuPG network certificate management daemon.
May 27 08:31:17 pbs systemd[7141]: Reached target timers.target - Timers.
May 27 08:31:17 pbs systemd[7141]: Reached target paths.target - Paths.
May 27 08:31:17 pbs systemd[7141]: Created slice app.slice - User Application Slice.
May 27 08:31:17 pbs systemd[7141]: Queued start job for default target default.target.
May 27 08:31:16 pbs (systemd)[7141]: pam_unix(systemd-user:session): session opened for user root(uid=0) by (uid=0)
May 27 08:31:16 pbs systemd[1]: Starting user@0.service - User Manager for UID 0...
May 27 08:31:16 pbs systemd[1]: Finished user-runtime-dir@0.service - User Runtime Directory /run/user/0.
May 27 08:31:16 pbs systemd[1]: Starting user-runtime-dir@0.service - User Runtime Directory /run/user/0...
May 27 08:31:16 pbs systemd[1]: Created slice user-0.slice - User Slice of UID 0.
May 27 08:31:16 pbs systemd-logind[465]: New session 156 of user root.
May 27 08:31:16 pbs login[7135]: pam_unix(login:session): session opened for user root(uid=0) by root(uid=0)
May 27 08:25:43 pbs postfix/smtp[7119]: D1AE739C084D: to=<ti@rhnossa.com.br>, relay=none, delay=63374, delays=63368/0.02/6.1/0, dsn=4.4.3, statu>
May 27 08:25:37 pbs postfix/qmgr[760]: D1AE739C084D: from=<root@pbs.rhnossa.local>, size=1208, nrcpt=1 (queue active)
May 27 08:20:42 pbs postfix/smtp[7115]: 2339D39C0856: to=<ti@rhnossa.com.br>, relay=none, delay=29826, delays=29820/0.02/6.1/0, dsn=4.4.3, statu>
May 27 08:20:36 pbs postfix/qmgr[760]: 2339D39C0856: from=<root@pbs.rhnossa.local>, size=2097, nrcpt=1 (queue active)
May 27 08:20:36 pbs proxmox-backup-[758]: pbs proxmox-backup-proxy[758]: rrd journal successfully committed (25 files in 0.109 seconds)
May 27 08:20:36 pbs proxmox-backup-[758]: pbs proxmox-backup-proxy[758]: starting rrd data sync
May 27 08:20:36 pbs proxmox-backup-[758]: pbs proxmox-backup-proxy[758]: write rrd data back to disk
May 27 08:19:28 pbs smartd[464]: Device: /dev/sda [SAT], 571 Offline uncorrectable sectors
May 27 08:19:28 pbs smartd[464]: Device: /dev/sda [SAT], 571 Currently unreadable (pending) sectors
May 27 08:17:01 pbs CRON[7109]: pam_unix(cron:session): session closed for user root
May 27 08:17:01 pbs CRON[7110]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)

Hi,
this error indicates that your Proxmox Backup Server might have an issue with the storage underlying the datastore. Please check the systemd journal on the PBS for errors; you can get a paginated view of the journal since boot, in reverse chronological order, by running journalctl -r -b
 
As a side note: schedule verification jobs in PBS. They will detect this kind of failure (among others), so you know something is wrong with your backups well before you need to restore any of them.
Good morning, how are you?

I activated the verification job in this tab; would that be correct?

(screenshot attached)
 
May 27 08:19:28 pbs smartd[464]: Device: /dev/sda [SAT], 571 Offline uncorrectable sectors
May 27 08:19:28 pbs smartd[464]: Device: /dev/sda [SAT], 571 Currently unreadable (pending) sectors
There is some problem with your /dev/sda disk. Replace it asap.
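To see the full picture behind those smartd messages, you can query the disk directly with smartmontools (already running on your PBS, judging by the log):

Bash:
# full SMART health report for the suspect disk
smartctl -a /dev/sda

# optionally run a short self-test, then check the results with -a again
smartctl -t short /dev/sda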

I activated the verification job in this tab; would that be correct?
Yes. If possible, change the verification time so it will not overlap with backup or garbage collection tasks on the server.
 
There is some problem with your /dev/sda disk. Replace it asap.


Yes. If possible, change the verification time so it will not overlap with backup or garbage collection tasks on the server.
So, in this case, since /dev/sda is my entire disk, do I actually need to replace it?
(screenshot attached)

Regarding the verification job, thanks for the tip; I've already changed the schedule.
 
I would definitely replace it: it will fail sooner rather than later. Add a new drive, create a partition and format it as ext4. Create a new datastore on it and sync the backups from the old HDD to the new one. Hopefully, most backups will sync correctly.
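For reference, a rough CLI sketch of that procedure. The datastore name, mount path, and credentials below are placeholders; note that a sync job pulls through a "remote" entry, which here simply points back at the same PBS host:

Bash:
# create the new datastore on the replacement disk (path is an example)
proxmox-backup-manager datastore create POOL-BKP-NEW /mnt/new-disk/backup

# register this PBS as a "remote" so a sync job can pull from it
proxmox-backup-manager remote create local-pbs \
    --host 127.0.0.1 --auth-id backup@pbs --password 'PASSWORD' \
    --fingerprint '<server certificate fingerprint>'

# pull everything from the old datastore into the new one
proxmox-backup-manager sync-job create move-to-new \
    --store POOL-BKP-NEW --remote local-pbs --remote-store POOL-BKP

Snapshots whose chunks sit on the bad sectors will fail to sync; the rest should come across intact.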
 
I would definitely replace it: it will fail sooner rather than later. Add a new drive, create a partition and format it as ext4. Create a new datastore on it and sync the backups from the old HDD to the new one. Hopefully, most backups will sync correctly.
Thank you, I will try this
 
So, in this case, since /dev/sda is my entire disk, do I actually need to replace it?
As sda is also your boot/system drive, if you want to take the risk and just "clone" it to another disk, this would be the walk-through. But a fresh install of Proxmox/PBS on another disk is the recommended route.

Bash:
# sda => current (failing) disk
# sdb => new disk

# Copy the partition layout to the new disk
sgdisk /dev/sda -R /dev/sdb
sgdisk -G /dev/sdb

# Create the LVM physical volume on the new disk and add it to the VG
pvcreate /dev/sdb3
vgextend pve /dev/sdb3

# Move the data from the old physical volume to the new one
pvmove /dev/sda3 /dev/sdb3

# Remove the old disk from the VG/PV
vgreduce pve /dev/sda3
pvremove /dev/sda3

umount /boot/efi

# Recreate the boot partition
proxmox-boot-tool format /dev/sdb2
proxmox-boot-tool init /dev/sdb2

blkid /dev/sdb* | grep vfat
# Take the UUID value from the command above ^
# Edit /etc/fstab, adding the boot partition (no quotes around the UUID):
    UUID=3139-05AE /boot/efi vfat defaults 0 1

mount /boot/efi

proxmox-boot-tool refresh

Then just remove the old HDD and leave the new one running. You don't even need to reboot.
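Before pulling the old disk, it is worth confirming that the new one really is set up to boot and holds the data:

Bash:
# the ESP on /dev/sdb2 should show up as configured and in sync
proxmox-boot-tool status

# the "pve" volume group should now sit on /dev/sdb3 only
pvs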
 
