[SOLVED] I've LOST and CORRUPTED my data on Proxmox VE 7.3-4 after running out of space on PBS 3.2-2!

VGusev2007
Dear all!

I have been using Proxmox VE 7.3-4 with PBS 3.2-2 for a long time.

My PBS ran out of space, and after that my backup job got stuck with this error message:

Code:
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task '5c30694c-25d4-4696-b113-75462f864802'
INFO: resuming VM again
INFO: scsi0: dirty-bitmap status: OK (560.0 MiB of 16.2 GiB dirty)
INFO: using fast incremental mode (dirty-bitmap), 560.0 MiB dirty of 16.2 GiB total
INFO:  12% (72.0 MiB of 560.0 MiB) in 2s, read: 36.0 MiB/s, write: 34.0 MiB/s
ERROR: backup write data failed: command error: write_data upload error: pipelined request failed: inserting chunk on store 'storage01' failed for 88a1a151f283827462d8ab2f879f4e4683386fb5a2fe43a109363643584129f7 - fchmod "/srv/storage01/.chunks/88a1/88a1a151f283827462d8ab2f879f4e4683386fb5a2fe43a109363643584129f7.tmp_AAMSiH" failed: EDQUOT: Quota exceeded
INFO: aborting backup job
INFO: resuming VM again
ERROR: Backup of VM 8998 failed - backup write data failed: command error: write_data upload error: pipelined request failed: inserting chunk on store 'storage01' failed for 88a1a151f283827462d8ab2f879f4e4683386fb5a2fe43a109363643584129f7 - fchmod "/srv/storage01/.chunks/88a1/88a1a151f283827462d8ab2f879f4e4683386fb5a2fe43a109363643584129f7.tmp_AAMSiH" failed: EDQUOT: Quota exceeded
INFO: Failed at 2024-10-25 22:00:11
...

After that, several of my VMs were left with a corrupted filesystem inside!

Like this one:

Code:
[Mon Oct 28 15:06:45 2024] systemd-journald[362]: Failed to write entry (22 items, 753 bytes), ignoring: Read-only file system

dmesg also showed a lot of I/O errors.

fsck.ext4 showed a lot of inode errors and the like inside those VMs!

I use ZFS as the backend storage.
I never had a problem like this before I started using PBS!

Is it dangerous to use PBS in a production environment?
 
You need to read man zfsprops and look for "Quota". You get an overview of the space used with zfs list -o space; then you ask for details about the dataset used for the PBS datastore with zfs get all <yourpbsdataset>.

In the end - if you have unassigned space available in the pool for this - you can increase the quota with zfs set quota=<newsetting> <yourpbsdataset>. :)

If the complete pool is actually full, you need to delete some unused data... which may be a problem in itself...
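
A minimal sketch of that workflow, assuming the datastore lives on a hypothetical dataset rpool/pbs/storage01 (substitute your own pool and dataset names):

Code:
# Overview of used/available space per dataset
zfs list -o space

# Inspect quota and usage on the datastore dataset
zfs get quota,refquota,used,available rpool/pbs/storage01

# Raise the quota - only works if the pool itself still has free space
zfs set quota=2T rpool/pbs/storage01

If the pool itself is full, the usual way out is to prune old snapshots in PBS and then run garbage collection (e.g. proxmox-backup-manager garbage-collection start storage01); keep in mind that GC only reclaims chunks that have been unreferenced for a while, so the space does not come back immediately.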
 
No. I realized that the problem comes from the design of Proxmox itself. To try to mitigate the impact of this problem, I have to consider using PVE 8.2 with backup fleecing (an advanced feature), or switching to pve-zsync.
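
For reference, a sketch of how fleecing can be enabled on PVE 8.2, either for a single run or as a node-wide default; the storage IDs here (pbs-storage01, local-zfs) are placeholders:

Code:
# One-off backup with fleecing, staging fleecing data on a fast local storage
vzdump 106 --storage pbs-storage01 --fleecing enabled=1,storage=local-zfs

# Or set it as the node-wide default in /etc/vzdump.conf (same property-string syntax):
# fleecing: enabled=1,storage=local-zfs

With fleecing, the old contents of blocks the guest overwrites during a backup are staged on the fleecing storage, so guest writes do not have to wait on the (possibly slow or full) backup target.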
 

So I would just point out that if this is occurring in PVE 8, you should file a Bugzilla report [1]. Also, not too long ago it was made clear to me in no uncertain terms by Proxmox staff that users running PVE 7 are not in scope for anything anyhow, as it is EOL.

[1] https://bugzilla.proxmox.com/
 
That problem is common (and still present) and is described in the wiki: https://pve.proxmox.com/wiki/Backup_and_Restore - see the section "VM Backup Fleecing".

If you can answer my other topic, which is related to this problem, I would be glad: https://forum.proxmox.com/threads/what-will-happen-if-out-of-space-backup-fleecing.156581/
 

I do not see why fleecing should corrupt your guest disks. I intentionally left your solo post unanswered because usually with zero answers staff will eventually reply. It's odd to me as described above.
 
If the backup process hangs for any reason, there are a lot of problems afterwards... You can see that in the screenshots. This is my production environment. Thank you for not answering me in that topic. I hope that Proxmox staff will answer me.

[screenshots attached]
 

I just find it completely weird that filling up the target storage of a backup has any impact on your guest disks. I think staff eventually use some filter for threads with no answers and reply to such threads (if they feel they can reply something). Another trick you can employ is to post this to the PBS forum. That one is so slow that whatever you post there will stay on the first page for a long time. :)
 
But if you can make a reproducer with PVE 8 (with or without fleecing, doesn't matter), I would not even wait and would file a Bugzilla report right away. This is not documented in any way and does not even make sense to me - consider that in some scenarios the admins running the PVE cluster and PBS might be completely different people.
 
Yes. If I get the chance, I will test that, but I'm sure the problem is still here because of the design. You need a fast and robust backup server to do any kind of online backup. I have used Proxmox VE since v4.0 and the problem is still here. It's the design. I hoped that PBS used a different design, but it doesn't; I've googled it, and because of this I've closed this topic as solved, since there is no new useful info for other users.
 
Also, I suppose the problem comes not only from Proxmox but from QEMU itself.
 
Yeah, that really looks like two different, independent problems!
Sad, but no. It happened because my PBS server stopped writing any new data while trying to take a new backup. After several hours my Proxmox VMs just hung. I logged into them, and all of my VMs that had some writes during the backup had switched to read-only with a lot of errors. I could only fix that from a LiveCD. This is sad but true.
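
For anyone who ends up in the same read-only state, a rough sketch of the LiveCD repair; the device name /dev/sda1 is just an example, and the filesystem must not be mounted while fsck runs:

Code:
# Boot the VM from a live ISO, then identify the guest disk
lsblk

# Check and repair interactively
fsck.ext4 -f /dev/sda1

# Or auto-answer "yes" to all repair prompts (use with care)
fsck.ext4 -f -y /dev/sda1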

Backend storage for the VMs/LXCs (with ext4 inside) is on ZFS?
And the backend storage for PBS is on ZFS too, but both on different hosts?
Yes, to both questions.
 
Would you mind posting the configuration of a couple of the VMs that suffered the issue?
Hey! Yes, no problem.

The first one:

Code:
#Current state: production.
balloon: 0
boot: order=virtio0
cores: 4
ide2: none,media=cdrom
memory: 20480
name: mail
net0: virtio=52:54:00:3D:1D:10,bridge=vmbr0,firewall=1,tag=1
numa: 1
ostype: l26
parent: auto-hourly-31-10-2024-12-10-05
protection: 1
smbios1: uuid=5928ee84-1926-4ff4-82fa-f37dc443fcf3
sockets: 1
tablet: 0
virtio0: local-zfs:vm-106-disk-1,size=70G
virtio1: local-zfs:vm-106-disk-2,size=6066G

The second one:
Code:
boot: order=scsi0
cores: 4
ide2: none,media=cdrom
memory: 4096
name: antivirus
net0: virtio=00:0C:29:00:36:B2,bridge=vmbr0,tag=1
numa: 1
ostype: l26
parent: snap24072024
protection: 1
scsi0: local-zfs:vm-125-disk-0,size=25G
smbios1: uuid=0e87280d-abab-4842-bc82-484fc1737bd7
sockets: 1
vmgenid: cac2d8a0-d29f-4f1b-83d8-332dec70ea26

I had problems with my mail and antivirus VMs during that time. Both of them had some writes during the backup. They're critical for me.
 
Oh, thank you so much. No, I didn't post it to BZ because PBS, or online backup in general, is wrong by design, and there is some info about that in the Proxmox wiki. Just don't use it, or upgrade to PVE 8.2 and use a local SSD as temporary cache. But I don't know what will happen if the local SSD runs out of space during a backup. I think it will hit the same problem.
 
