Proxmox/OMV/HD Passthrough

revvr

New Member
Sep 15, 2023
I have a server running Proxmox. Within that, I have a VM running Open Media Vault. OMV has 4 hard drives that are passed through to it to create a ZFS storage pool.

I also have a Synology NAS that I want to back up via HyperBackup to that ZFS storage pool.

The issue that I'm having is that the first backup from Synology completes successfully. However, when running a data integrity check on it, it fails every time and says that some files of the backup have changed.

I thought that the idea of passing through hard drives meant that nothing else touches them, so I'm wondering what is changing the contents of these backups.

I'm starting to think that I did not pass these drives through correctly after noticing that Proxmox is still running the SMART checks on these drives and I cannot see SMART values in OMV. If these drives were passed through correctly, would that mean that I should see SMART values in Proxmox or in the VM that is using them? What else could I do to verify that I passed them through correctly?
 
It's normal when using disk passthrough (and not PCI passthrough of a disk controller) that SMART is only available via the Proxmox host. See other threads about this on this forum for more information.
Unless you also mount or use the disks or partitions that you pass through, they should not be touched. Make sure there is nothing else using them, as filesystems cannot handle concurrent use of a disk.
Check that Proxmox does not detect and automatically import the ZFS pool, which does appear to happen to other people here. They only notice it when there are two rpools and boot fails.
It could also be that your drives are giving errors or your system memory is causing errors. Does a scrub of your ZFS pool show errors inside the OMV VM?
 
Thanks. I don't mount them anywhere else. I also don't see a ZFS pool within Proxmox.

A scrub of the ZFS pool in OMV comes back clean, zero errors, and I've done it twice.

Before going with OMV, I was testing out Unraid and created a ZFS pool using these disks. However, that ZFS pool was corrupted as well, which is why I took the opportunity to jump to OMV.

I'm truly at a loss as to what is happening. I suppose my next step is to do a memtest, which sucks because I have to take all the VMs offline while I do that.

The hard drives are manufacturer refurbished from Server Part Deals. I've had them for 2 weeks. All SMART indicators appear to be in good shape.
 
If ZFS claims everything is alright, then it's not your drives. And it is also not outside interference because ZFS would detect that (if you restart the VM before a scrub).
Looks like it's not something inside the VM, since it has to be a common factor between Unraid and OMV. Maybe your backup software is not working in this setup. Have you tried copying the data yourself and comparing the contents?
 
Updating this thread.

I nuked my OMV and installed Unraid again, created ZFS data pools, and ran Hyper Backup to it. I managed to create a first backup in a day. I then immediately ran an Integrity Check on it and it was found corrupted.
I have a few theories and observations:

1) I have a fairly large collection of photos and videos. The broken files are always among these photos and videos (phone-taken). This time around, I had fewer broken files (just 3 vs. the 15-20 that I had with OMV). I find it odd that it is always these files that are broken. None of the hundreds of files I have in other directories (Word, PDF, PPTX, DOCX, etc.) are affected. The other constant is that files generated by the Synology Photos app are always among the broken ones.

These are the ones from today:

Code:
Backup task was unable to run due to errors found at the backup destination.
The following files were found broken in the latest integrity check and cannot be restored.
There may be other broken files which were not detected this time.
If you have further questions, please contact Synology support.


Broken file(s):
Version: 2023-10-22 04:23:48-0700, shared folder: homes

  File    30320526     2023-01-22 02:22:10  username/Photos/MobileBackup/Other/Google Photos 2016/@eaDir/VID_20160713_102141297.mp4/SYNOPHOTO_FILM_M.mp4
  File    14029        2023-01-22 02:21:34  username/Photos/MobileBackup/Other/Google Photos 2016/@eaDir/VID_20160713_102141297.mp4/SYNOPHOTO_THUMB_SM.jpg
  File    222404246    2023-01-15 03:33:46  username/Photos/MobileBackup/Other/Google Photos 2016/VID_20160713_102917577.mp4

As a reminder, I have 5 other backups of the same content that are working properly. One inside the NAS, one in an external drive attached to the NAS, one on an offline Rpi with a drive attached to it, one on Google Photos, one on Hetzner. None of these present any issues.

2) This might be related to the use of ZFS? Both OMV and Unraid ran ZFS file systems.

3) Use of non-ECC memory at destination computer. OMV/Unraid run on a cheap N5105 board that does not support ECC memory.

I think I'm going to nuke this Unraid install and try Unraid with BTRFS and see if anything changes. If that fails too, I'll have to take the whole thing offline to run a memtest, but at that point I think I might as well just replace the whole thing with something else.

@leesteken When you say copy the data myself, do you mean using scp or rsync manually? How would I compare everything afterwards?
 
Proxmox has had a lot of problems on N5105, as a search of the forum will show. But this feels less and less like a Proxmox issue.
Maybe your network cable is causing silent corruptions or maybe the source (on the NAS) is already corrupted. Or maybe there is some ransomware running and encrypting stuff (with a bug in it, so you notice before it is ready)?

How does your backup software check the integrity? Does it read all files from the source again and check whether the bytes are identical? Or does it have knowledge of photo and video formats (and apparently not Office documents) and find faults in them (which could already be in the originals)? Can you check files (photos and documents) yourself on the source and on copies that you make yourself?
When you say copy the data myself, do you mean using scp or rsync manually? How would I compare everything afterwards?
Yes, just copy files. Use a file compare utility (or diff) and check some files manually.
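
For anyone who wants to automate the "copy and compare" check suggested above, here is a minimal sketch in Python. It assumes the source folder and the manually made copy are both reachable as local or mounted paths. The function names `sha256_of` and `compare_trees` are made up for illustration and are not part of Hyper Backup or any other tool mentioned in this thread.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(source_root, copy_root):
    """Compare every regular file under source_root with its counterpart in copy_root.

    Returns (missing, different): relative paths that are absent from the copy,
    and relative paths whose contents differ between source and copy.
    """
    source_root, copy_root = Path(source_root), Path(copy_root)
    missing, different = [], []
    for src in sorted(source_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_root)
        dst = copy_root / rel
        if not dst.is_file():
            missing.append(rel.as_posix())
        elif sha256_of(src) != sha256_of(dst):
            different.append(rel.as_posix())
    return missing, different
```

Calling `compare_trees` with the two folder paths returns two lists. Any entry in the second list is a file whose bytes differ between source and copy, which is worth opening by hand on both sides. Note that a mismatch only says the two reads disagree; it does not say which side (source, network, or destination) introduced the change.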
 
Updating.

I tried backing up to a single disk with BTRFS to the same server and got the same problem. So at least I can discard the ZFS theory.

I doubt I have ransomware in here. I've always been very careful with these files and none of the other backups are failing integrity checks.

A backup to the Raspberry Pi I keep mostly offline was successful. It doesn't seem to be related to the NAS or its network.

This leaves me with:

1) Server (destination) memory issues
2) Server (destination) hard drives

I'm going to attach an external hard drive to the server, pass it through to Unraid, and try to back up there. If that works, it has to be related to the server's hard drives? If it doesn't, I plan on running memtest on this thing. If that also passes, I guess I'm out of options?

I'm starting to suspect the hard drives themselves, but I don't have any other problems as far as I can tell. I bought four Seagate Exos X16 manufacturer refurbished drives. They all pass S.M.A.R.T.
 
Updating again.

I tried running a backup to a known-good external USB drive. This backup failed the integrity check. On one hand, this is good because it rules out the other hard drives in the server. On the other, it is bad because I still have no clue as to what is causing these backups to fail while the other backups to different destinations continue to work and pass integrity checks.

I now pulled the server from the rack and am running Memtest86+ on it. I'm also running a memory test on the NAS. If these both pass, I don't really know what else to do other than give up...
 
Last update:

Well, I think I have finally identified the problem. Bad RAM stick in the Synology NAS. Synology support may have been right all along.

For the record (and for those who Google), as I was running the memory test on the NAS, I noticed it got stuck at 75.xx% and did not move for a while. I went to the unit, rebooted it, and tried again. Again, it got stuck at a different point of the test. I decided to pull the aftermarket memory I had put in there and ran the test again. It passed.

Bash:
2023-10-28T04:07:13-07:00 Jupiter findhostd[14449]: util_fhost.c:1195 Memtest passed!

While the NAS was doing its memory test, I went ahead and put the suspected faulty stick into another machine that I have here and ran Memtest86+. It crashed Memtest within a few minutes and rebooted the machine. I did this a few more times with the same result, which I believe means that it is a bad stick.

The server (or target computer that Hyper Backup was backing up to) is on its 3rd Memtest86+ pass with zero errors. I'm running all 10 tests on it (6 hours so far) and I will leave it running overnight for good measure.

I only wish I had run memtest earlier in the game, but the laziness of pulling the server out of the rack beat me every time. I should have at least suspected the aftermarket RAM in the NAS... By the way, the aftermarket RAM is NEMIX 16GB (1X16GB) 1.2V DDR4 2666MHZ PC4-21300 ECC UDIMM 2RX8. I will now do a trial by fire of their RMA process...

Edit: I was finally able to generate a backup and check integrity. Everything now looks good, so confirming it was the bad RAM.
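
For readers who cannot take a machine offline for a memory test right away, a rough first check is to hash the same large file several times on the suspect machine and see whether the digests agree. The sketch below is only an illustration of that idea, not what Synology's memory test or Memtest86+ does, and the function names `repeated_digests` and `reads_are_stable` are made up here. Repeated reads are often served from the page cache, so a disagreement points at RAM or the I/O path, while agreement proves nothing and is no substitute for a real memory test.

```python
import hashlib


def repeated_digests(path, passes=5, chunk_size=1024 * 1024):
    """Hash the same file `passes` times and return the set of distinct digests."""
    seen = set()
    for _ in range(passes):
        digest = hashlib.sha256()
        with open(path, "rb") as handle:
            for chunk in iter(lambda: handle.read(chunk_size), b""):
                digest.update(chunk)
        seen.add(digest.hexdigest())
    return seen


def reads_are_stable(path, passes=5):
    """Return True if every pass over the file produced the same SHA-256 digest."""
    return len(repeated_digests(path, passes)) == 1
```

If `reads_are_stable` returns False for a file that nothing else is writing to, the machine is returning different bytes for the same data, and a full memory test is the next step.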
 