We uploaded two ISOs on June 11 via the Proxmox GUI, using its checksum-matching feature while doing so.
Code:
root@pve1:/var/log/pve/tasks# pvenode task list --type imgcopy
┌──────────────────────────────────────────────────────────────┬─────────┬────┬───────────────┬────────────┬────────────┬────────┐
│ UPID │ Type │ ID │ User │ Starttime │ Endtime │ Status │
╞══════════════════════════════════════════════════════════════╪═════════╪════╪═══════════════╪════════════╪════════════╪════════╡
│ UPID:pve1:0012779F:277C022A:66686D36:imgcopy::redacted@pve: │ imgcopy │ │ redacted@pve │ 1718119734 │ 1718119750 │ OK │
├──────────────────────────────────────────────────────────────┼─────────┼────┼───────────────┼────────────┼────────────┼────────┤
│ UPID:pve1:0012CDF8:277DA7EC:6668716D:imgcopy::redacted@pve: │ imgcopy │ │ redacted@pve │ 1718120813 │ 1718120824 │ OK │
├──────────────────────────────────────────────────────────────┼─────────┼────┼───────────────┼────────────┼────────────┼────────┤
Code:
root@pve1:/var/log/pve/tasks# pvenode task log UPID:pve1:0012779F:277C022A:66686D36:imgcopy::redacted@pve:
starting file import from: /var/tmp/pveupload-7199be1779e22d8a81f11dac849eda05
calculating checksum...OK, checksum verified
target node: pve1
target file: /mnt/pve/cephfs/template/iso/Rocky-8.10-x86_64-minimal.iso
file size is: 2694053888
command: cp -- /var/tmp/pveupload-7199be1779e22d8a81f11dac849eda05 /mnt/pve/cephfs/template/iso/Rocky-8.10-x86_64-minimal.iso
finished file import successfully
TASK OK
Code:
root@pve1:/var/log/pve/tasks# pvenode task log UPID:pve1:0012CDF8:277DA7EC:6668716D:imgcopy::redacted@pve:
starting file import from: /var/tmp/pveupload-7c584847348af7323f4b9506856bb773
calculating checksum...OK, checksum verified
target node: pve1
target file: /mnt/pve/cephfs/template/iso/Rocky-9.4-x86_64-minimal.iso
file size is: 1829634048
command: cp -- /var/tmp/pveupload-7c584847348af7323f4b9506856bb773 /mnt/pve/cephfs/template/iso/Rocky-9.4-x86_64-minimal.iso
finished file import successfully
TASK OK
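Going by the order of the log lines, the checksum appears to be calculated on the temporary upload under /var/tmp before the cp to cephfs runs, so "checksum verified" may not say anything about the copy that landed on cephfs. A minimal sketch of re-hashing the stored copy ourselves, assuming Python 3 on the node; the function name sha256_of is ours and not part of any Proxmox tooling:

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in chunks so a multi-GB ISO is never held in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps calling fh.read() until it returns b"" (EOF).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling sha256_of("/mnt/pve/cephfs/template/iso/Rocky-8.10-x86_64-minimal.iso") and comparing the result with the published checksum would show whether the copy on cephfs still matches, independently of what the uploader reported.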
- The Proxmox GUI uploader reported that the hashes matched at the time of the original upload.
- We have located the original ISOs that were uploaded; they are still intact and match the checksums published by Rocky Linux. So the originals are good.
When comparing the corrupted ISOs to the verified/original ISOs, we see the following files differ:
Rocky-8.10-x86_64-minimal.iso
Code:
media.repo
BaseOS/repodata/5a3a9e9fc6a304fdf3a12a4fc8f37fd4efd76524fcd808a060139147308d7a41-primary.xml.gz
BaseOS/repodata/6e26cc2b8c46d5e2c47fe9892f436e48353c750873082a3b9b07132b09abcb40-other.xml.gz
BaseOS/repodata/71f62d6dadfbf3238ce701da43cb69958ce4c546cc370f92e70ba933f3193c23-comps-BaseOS.x86_64.xml
BaseOS/repodata/e105891d2832b712e68b45a603e895845e4df1c99d988936f02d3e899f68b5e5-comps-BaseOS.x86_64.xml.xz
BaseOS/repodata/repomd.xml
Minimal/repodata/0a0ee3d6de957f97960893014ede3f247303f7770819f3ecf9ae30beed45675e-comps-Minimal.x86_64.xml.xz
Minimal/repodata/1cb61ea996355add02b1426ed4c1780ea75ce0c04c5d1107c025c3fbd7d8bcae-primary.xml.gz
Minimal/repodata/22305a97eed1bed923f2cfa37086b208bc9ebcc1e4426384efff558576f40edd-other.sqlite.xz
Minimal/repodata/2b13cd3f9d81647fd31aa16de1b16b582efd9566f8c4334e4561a030f3777c37-comps-Minimal.x86_64.xml
Minimal/repodata/3e3eaeee784726c6a95c8b0b4b776eeb0adef3c9f88bc94df600e571dd030e0c-primary.sqlite.xz
Minimal/repodata/8a1d161ad47cce30bb3c704a541481224c9d490f98f9edb3980d1793922df099-filelists.sqlite.xz
Minimal/repodata/95a4415d859d7120efb6b3cf964c07bebbff9a5275ca673e6e74a97bcbfb2a5f-filelists.xml.gz
Minimal/repodata/ef3e20691954c3d1318ec3071a982da339f4ed76967ded668b795c9e070aaab6-other.xml.gz
Minimal/repodata/repomd.xml
Rocky-9.4-x86_64-minimal.iso
Code:
minimal/repodata/bd201f63f99e67d65f859f38ab472022f055238d74c78c6dd407ef57c4f0f90d-primary.sqlite.bz2
minimal/repodata/d250f7f881bb991be3648c021fb305dd6085b902321b26f52033500ebff7cae1-x86_64.xml.gz
minimal/repodata/repomd.xml
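For anyone wanting to repeat this kind of comparison, here is a rough sketch of one way to produce such a list. It assumes the good and the suspect ISO are each loop-mounted read-only at two directories of your choosing; the function name differing_files and the mount points are hypothetical:

```python
import filecmp
import os


def differing_files(good_root, suspect_root):
    """Return sorted relative paths of files under `good_root` whose bytes
    differ from, or that are missing under, `suspect_root`."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(good_root):
        for name in filenames:
            good = os.path.join(dirpath, name)
            rel = os.path.relpath(good, good_root)
            suspect = os.path.join(suspect_root, rel)
            # shallow=False forces a comparison of the file contents; size and
            # mtime alone would miss corruption that preserves the file size.
            if not os.path.isfile(suspect) or not filecmp.cmp(good, suspect, shallow=False):
                changed.append(rel)
    return sorted(changed)
```

For example, differing_files("/mnt/iso-good", "/mnt/iso-suspect") would return the relative paths that differ between the two mounted images.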
In both cases, the XML files above had their contents replaced by null values (^@^@^@^@^@^@^@) while the original file size was retained, and "noeol" (no end of line) is present. In the case of the compressed files, none could be decompressed fully, but zcat-ing them to a text file revealed that some were partly intact, with the output stopping many lines short, again while the original file size was retained. For example, the file "5a3a9e9fc6a304fdf3a12a4fc8f37fd4efd76524fcd808a060139147308d7a41-primary.xml.gz", when zcat-ed out, ends abruptly at line 120,607, whereas the original file is 137,636 lines. Presumably, null values make up the difference (and zcat doesn't output them).
Test uploads since this was noticed have been successful and verified to be intact. However, we upgraded our Proxmox environment to the latest version on June 18, so we are no longer testing on the same version of Proxmox that performed the original uploads.
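To put numbers on the null-filled regions described above, a small scanner can list where long runs of zero bytes sit inside a corrupted ISO. This is only a sketch under our own assumptions: the name zero_runs and the 4096-byte threshold are arbitrary choices, and an intact ISO will also contain some legitimate zero padding, so the output is best compared against the same scan of the original image.

```python
import re

_ZEROS = re.compile(rb"\x00+")


def zero_runs(path, min_len=4096, chunk_size=1024 * 1024):
    """Yield (offset, length) for every run of 0x00 bytes of at least
    `min_len` bytes in the file at `path`."""
    run_start = run_end = None  # the most recent zero run, not yet reported
    pos = 0                     # file offset of the current chunk
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            for m in _ZEROS.finditer(chunk):
                start, end = pos + m.start(), pos + m.end()
                if run_end == start:
                    # The run continues across a chunk boundary: extend it.
                    run_end = end
                else:
                    if run_start is not None and run_end - run_start >= min_len:
                        yield run_start, run_end - run_start
                    run_start, run_end = start, end
            pos += len(chunk)
    # Report the last run once the end of the file is reached.
    if run_start is not None and run_end - run_start >= min_len:
        yield run_start, run_end - run_start
```

For each reported offset, offset % (4 * 1024 * 1024) shows where the run falls relative to a 4 MiB boundary, which we understand to be the default CephFS object size. Treat that figure as an assumption and check the actual file layout before reading anything into the alignment.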
Obviously, this gives us concern. The upload apparently succeeded and the checksum matched. Can we trust that checksum matching code? Even worse to consider: what could possibly cause these ISOs, sitting in cephfs, to become corrupted spontaneously, in situ?
Might there be any known bugs or issues with the uploader, the checksum matching, or Ceph that might account for this? If not, we're pretty concerned about what appears to be spontaneous data corruption in a Ceph cluster that reports as healthy and is giving us no other issues.
Thanks!
--Brian