FIPS mode on tape library - PBS errors out on labeling

Mar 3, 2025
Hi all,

We've run into a problem commissioning our new tape backup solution. Since we're in a heavily regulated industry, we're required to adhere to FIPS 140-2, which means using HPE's MSL Encryption Kit for validated FIPS 140-2 encryption on our Ultrium 9-SCSI drive. This seems to be causing an issue with PBS though: when we try to label a tape, we receive the error "status decode data encryption caps page failed - drive does not support AES-GCM encryption." As best I can tell, this is due to PBS trying to query/set an encryption key on the tape drive, which is not allowed in FIPS mode because the drive wants to use its own key and key management.

Is there a way around this? Is there any way to tell PBS that encryption is being handled by the tape drive/library itself and not to worry about it? I've tried creating a media pool with encryption turned off, but even trying to label tapes into that pool returns the same error (most likely because PBS can't read/check the encryption status due to FIPS mode, I'm guessing). Has anyone else managed to fix this?

Thank you very much in advance.
 
Hi,

It seems that the drive returns unexpected data when it is in FIPS mode. Currently we assume that we can set the encryption mode on the drive, so that we can be sure we're in the right mode; otherwise we might end up writing encrypted data when we wanted to write unencrypted data, or vice versa. I can imagine implementing a 'drive managed' mode where we don't set the encryption mode (or autodetecting FIPS mode), but AFAIK we don't have a library/drive here that supports such a mode, so it would be great if you could post the output of some of the commands we run.

for starters i'd need the output of
Code:
sg_raw -r 16k <tape device> A2 20 00 10 00 00 00 00 FF FF 00 00

(this is the scsi command we decode where the error occurs)

where <tape device> is something like /dev/tape/by-id/scsi-xxxxxx-sg
you can find the path of your drives by using
Code:
proxmox-tape drive list

also if you would not mind opening an enhancement request here: https://bugzilla.proxmox.com so we can better keep track of it

thanks!
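
For readers wondering what the raw bytes passed to sg_raw mean, here is a minimal sketch that rebuilds the same 12-byte CDB. It assumes the standard SECURITY PROTOCOL IN layout (opcode 0xA2, security protocol 0x20 for tape data encryption, page 0x0010 for the Data Encryption Capabilities page). The function and constant names are made up for illustration and are not taken from PBS.

```python
import struct

# Values as used in the sg_raw command above; the names are illustrative only.
SECURITY_PROTOCOL_IN = 0xA2               # SCSI opcode
TAPE_DATA_ENCRYPTION = 0x20               # security protocol
DATA_ENCRYPTION_CAPABILITIES = 0x0010     # protocol-specific page code


def build_spin_cdb(protocol: int, page: int, allocation_length: int) -> bytes:
    """Pack a 12-byte SECURITY PROTOCOL IN command descriptor block."""
    return struct.pack(
        ">BBHBBIBB",
        SECURITY_PROTOCOL_IN,  # byte 0: opcode
        protocol,              # byte 1: security protocol
        page,                  # bytes 2-3: security protocol specific (page code)
        0,                     # byte 4: INC_512 bit and reserved bits
        0,                     # byte 5: reserved
        allocation_length,     # bytes 6-9: maximum number of bytes to return
        0,                     # byte 10: reserved
        0,                     # byte 11: control
    )


cdb = build_spin_cdb(TAPE_DATA_ENCRYPTION, DATA_ENCRYPTION_CAPABILITIES, 0xFFFF)
print(cdb.hex(" ").upper())
```

The printed bytes can be compared against the arguments given to sg_raw above.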
 
Thank you so much dcsapak - I really appreciate the response and am more than happy to help out. I know FIPS mode isn't a common need, but with the regulatory landscape changing here in the US I can see it becoming more necessary. Our entire Proxmox stack (8-node cluster with Ceph, PBS, and tape library) is still in the commissioning phase, so it's the perfect time to put it to use in testing out new features.

I've opened an enhancement request here: https://bugzilla.proxmox.com/show_bug.cgi?id=6326

On running the raw scsi command, I received the following:

Code:
root@pvepbs01:~# proxmox-tape drive list
┌─────────┬────────────────────────────────────┬────────────────┬────────┬────────────────┬────────────┐
│ name    │ path                               │ changer        │ vendor │ model          │ serial     │
╞═════════╪════════════════════════════════════╪════════════════╪════════╪════════════════╪════════════╡
│ lto0_01 │ /dev/tape/by-id/scsi-CZ2D2302T8-sg │ tape_library01 │ HPE    │ Ultrium_9-SCSI │ CZ2D2302T8 │
└─────────┴────────────────────────────────────┴────────────────┴────────┴────────────────┴────────────┘
root@pvepbs01:~# sg_raw -r 16k /dev/tape/by-id/scsi-CZ2D2302T8-sg A2 20 00 10 00 00 00 00 FF FF 00 00
SCSI Status: Good

Received 92 bytes of data:
 00     00 10 00 58 0a 00 00 00  00 00 00 00 00 00 00 00    ...X............
 10     00 00 00 00 01 00 00 14  3f 34 00 20 00 0c 00 20    ........?4. ...
 20     eb 00 00 00 00 00 00 00  00 01 00 14 02 00 00 14    ................
 30     3f 3c 00 20 00 3c 00 20  eb 00 00 00 00 00 00 00    ?<. .<. ........
 40     00 01 00 14 03 00 00 14  3f 3c 00 20 00 3c 00 20    ........?<. .<.
 50     eb 00 00 00 00 00 00 00  00 01 00 14                ............

Thank you again, and please don't hesitate to let me know how else I can help!
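
As a rough aid for reading the dump above, here is a minimal sketch that decodes it. It assumes the SSC-style Data Encryption Capabilities layout: a 20-byte header followed by 24-byte algorithm descriptors, with the low two bits of descriptor byte 4 as the encrypt capability and the next two bits as the decrypt capability. The offsets, the capability names, and the check in `host_controllable_aes_gcm` are my reading of the standard and of the error message, not something confirmed in this thread.

```python
import struct

# Response bytes copied from the sg_raw output above (92 bytes).
RESPONSE_HEX = """
00 10 00 58 0a 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 01 00 00 14  3f 34 00 20 00 0c 00 20
eb 00 00 00 00 00 00 00  00 01 00 14 02 00 00 14
3f 3c 00 20 00 3c 00 20  eb 00 00 00 00 00 00 00
00 01 00 14 03 00 00 14  3f 3c 00 20 00 3c 00 20
eb 00 00 00 00 00 00 00  00 01 00 14
"""

# Assumed meaning of the 2-bit capability fields.
CAPABILITY = {0: "none", 1: "software", 2: "hardware", 3: "external control"}
HEADER_LEN = 20             # assumed: page code, page length, flags, reserved
AES_GCM_CODE = 0x00010014   # assumed security algorithm code for AES-GCM


def decode_caps_page(data: bytes) -> list:
    """Split the page into algorithm descriptors and extract a few fields."""
    page_code, page_len = struct.unpack_from(">HH", data, 0)
    if page_code != 0x0010:
        raise ValueError(f"unexpected page code {page_code:#06x}")
    end = min(len(data), 4 + page_len)
    descriptors = []
    offset = HEADER_LEN
    while offset + 24 <= end:
        index, _reserved, desc_len, caps = struct.unpack_from(">BBHB", data, offset)
        (key_size,) = struct.unpack_from(">H", data, offset + 10)
        (algorithm,) = struct.unpack_from(">I", data, offset + 20)
        descriptors.append({
            "index": index,
            "encrypt": caps & 0b11,          # bits 1-0 of descriptor byte 4
            "decrypt": (caps >> 2) & 0b11,   # bits 3-2 of descriptor byte 4
            "key_size": key_size,
            "algorithm": algorithm,
        })
        offset += 4 + desc_len  # 4-byte descriptor header plus its body
    return descriptors


def host_controllable_aes_gcm(descriptors: list):
    """Hypothetical check: first index offering host-set hardware AES-GCM, else None."""
    for d in descriptors:
        if (d["encrypt"] == 2 and d["decrypt"] == 2
                and d["algorithm"] == AES_GCM_CODE and d["key_size"] == 32):
            return d["index"]
    return None


data = bytes.fromhex("".join(RESPONSE_HEX.split()))
descriptors = decode_caps_page(data)
for d in descriptors:
    print(d["index"], CAPABILITY[d["encrypt"]], CAPABILITY[d["decrypt"]],
          d["key_size"], hex(d["algorithm"]))
print("host-controllable AES-GCM index:", host_controllable_aes_gcm(descriptors))
```

If that reading is right, all three descriptors advertise the AES-GCM algorithm code with a 32-byte key, but with capability value 3 instead of 2, which would be consistent with a check for host-controllable hardware encryption coming up empty.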
 
thanks

i sent a patch in the meantime. If you could test it, that would be great

we put a temporary build with that commit in http://download.proxmox.com/temp/tape-encryption-skip-fix/
(only the backup-server package should be necessary)

the checksums are (since it's just http):
99b3bdbaaebec77b924e43ed62544bfcc5b29be1095dcefc435d4f502c2ea965 proxmox-backup-client_3.4.0-1_amd64.deb
539f60c92c46e074f298080dd0a3a748e9163c65db0f2a0ea8b8bb2f4dc20949 proxmox-backup-client-dbgsym_3.4.0-1_amd64.deb
9c9136b3a60564d3f518b94d2929dfe4bcbe73c2f6bcfb5f4123d20e69148468 proxmox-backup-client-static_3.4.0-1_amd64.deb
13da0c04add88df044792dab57572ddd5b40741c77280a54a5706ce1dbc8dd44 proxmox-backup-client-static-dbgsym_3.4.0-1_amd64.deb
7531137fc11d7ca47252729074ae94a3da9920de5c7c491a25a36628fc64caf5 proxmox-backup-docs_3.4.0-1_all.deb
970331e3fb233f4500ef0887249cecee706eb42f157aff02224af7390e672c26 proxmox-backup-file-restore_3.4.0-1_amd64.deb
3221277ced070088ee82fb857901b6785e93fb64361afbc17fa2549417c803de proxmox-backup-file-restore-dbgsym_3.4.0-1_amd64.deb
524f278133131ae3418b5ef12aee878487a41091c9afe14d9f0204cb18b51db1 proxmox-backup-server_3.4.0-1_amd64.deb
c9696432e4df65fad355baef663e458bafec89030558e93fe99cf160931f8b53 proxmox-backup-server-dbgsym_3.4.0-1_amd64.deb

with that package you should be able to back up, but only if the PBS tape encryption is off (since we can't manage it)
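
Since the download is plain http, the package should be checked against the list above before installing. A minimal sketch, assuming the 64-hex-digit values are SHA-256 digests and that the .deb sits in the current directory; the helper names are illustrative.

```python
import hashlib
from pathlib import Path

# Checksum copied from the list above; only the server package is needed.
EXPECTED = {
    "proxmox-backup-server_3.4.0-1_amd64.deb":
        "524f278133131ae3418b5ef12aee878487a41091c9afe14d9f0204cb18b51db1",
}


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large packages are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare case-insensitively, ignoring surrounding whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


for name, expected in EXPECTED.items():
    package = Path(name)
    if not package.exists():
        print(f"{name}: not found, skipping")
    elif matches(package, expected):
        print(f"{name}: OK")
    else:
        print(f"{name}: MISMATCH, do not install")
```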
 
Thanks again, Dominik - installing only proxmox-backup-server_3.4.0-1_amd64.deb worked like a charm!

Code:
2025-04-15T09:06:24-05:00: could not set encryption mode on drive: decode data encryption caps page failed - drive does not support AES-GCM encryption, ignoring.
2025-04-15T09:06:46-05:00: Label media 'CJP190L8' for pool 'test-pool'
2025-04-15T09:06:57-05:00: TASK OK

Never thought I'd be so happy to see "ignoring" in a log file. With this I was able to format the few tapes we'd used in the past for testing and get our entire set labeled. Actual backups to the tapes will happen in a few days as this server doesn't have its storage yet (still commissioning everything), however I should be able to run a few checks with the smaller VMs we have in the cluster for testing. Again, thank you so much for the amazingly fast response - so glad we decided to go with Proxmox for our new virtualization stack!

As always, let me know if there's anything else you'd like us to test while this stack is still being commissioned - we're more than happy to assist. Thanks again!
 
Alright, was able to make a backup of one of the small testing VMs and tried to write to tape - here's the log:

Code:
2025-04-15T10:37:57-05:00: Starting tape backup job 'test-store:test-single-tape-pool:lto0_01:test-store-single-tape'
2025-04-15T10:37:57-05:00: update media online status
2025-04-15T10:37:57-05:00: starting new media set - reason: policy is AlwaysCreate
2025-04-15T10:37:57-05:00: media set uuid: 94757cdd-c20f-4e7b-bfb2-35f7e4db0591
2025-04-15T10:37:57-05:00: found 1 groups (out of 1 total)
2025-04-15T10:37:57-05:00: latest-only: true (only considering latest snapshots)
2025-04-15T10:37:57-05:00: backup snapshot "vm/101/2025-04-15T15:28:23Z"
2025-04-15T10:37:57-05:00: allocated new writable media 'CJP180L8'
2025-04-15T10:37:57-05:00: trying to load media 'CJP180L8' into drive 'lto0_01'
2025-04-15T10:38:38-05:00: could not set encryption mode on drive: decode data encryption caps page failed - drive does not support AES-GCM encryption, ignoring.
2025-04-15T10:38:38-05:00: found media label CJP180L8 (23460ce5-68bf-4c29-8d6b-f4610748f83b)
2025-04-15T10:38:38-05:00: writing new media set label
2025-04-15T10:38:56-05:00: queued notification (id=2507200b-3539-43a0-b5ba-68db4581059b)
2025-04-15T10:38:56-05:00: TASK ERROR: Set encryption mode not what was desired (set: true, wanted: false)

Looks like there's one more check to disable somewhere, as the pool it was using is set to no encryption:

Code:
root@pvepbs01:/# proxmox-tape pool list
┌───────────────────────┬────────────┬───────────┬──────────┬─────────┐
│ name                  │ allocation │ retention │ template │ encrypt │
╞═══════════════════════╪════════════╪═══════════╪══════════╪═════════╡
│ test-pool             │ continue   │ overwrite │          │ no      │
├───────────────────────┼────────────┼───────────┼──────────┼─────────┤
│ test-single-tape-pool │ always     │ keep      │          │ no      │
└───────────────────────┴────────────┴───────────┴──────────┴─────────┘

Let me know what you'd like to try next, and thank you again!
 
Thanks for testing!

"Looks like there's one more check to disable somewhere, as the pool it was using is set to no encryption"

ah yes, of course, i forgot that. we have added an additional assertion that verifies that the requested encryption mode is really set; we have to omit that too in this situation.
(as I wrote, we don't have any hardware here that supports such a library-managed encryption mode, so I can't test this exact scenario here...)

I'll prepare a new set of patches, and see that i update packages for you to test, stay tuned.
 
ok, i sent a v2:
https://lore.proxmox.com/pbs-devel/20250416070703.493585-1-d.csapak@proxmox.com/

and new packages were uploaded to the same folder as before
but with different names + different checksums:
c8481f4b32ea1faca8634f23c6926e77a7c3d3422c2e46e4991e42a76bf5a002 proxmox-backup-client_3.4.0-1+testv2_amd64.deb
94a0af26f94de6d44bb5d27cb29397dfd76d6733e3c03a41f742d7781ec351f3 proxmox-backup-client-dbgsym_3.4.0-1+testv2_amd64.deb
fd3bce3d973ad7f87b057675c8939127b29a04edeedfece41726783ea8ebc554 proxmox-backup-client-static_3.4.0-1+testv2_amd64.deb
758d2d2cef74a407e4d93a176c91b1810da58347df71d0a7396f706de138b5b4 proxmox-backup-client-static-dbgsym_3.4.0-1+testv2_amd64.deb
af29e7867970a012181b7f7db7bbc621b833a014eae0aabda37020bbd2cd5275 proxmox-backup-docs_3.4.0-1+testv2_all.deb
edf73ff4ab5c3761fd6f1a0a25b8453cd28f3966b5ed9c261364368b42f4ad14 proxmox-backup-file-restore_3.4.0-1+testv2_amd64.deb
32387e3d7ffdb5ea1f3eae0e0a76e8a0abb150b832c746d647a1ee83565aa720 proxmox-backup-file-restore-dbgsym_3.4.0-1+testv2_amd64.deb
7fc0dca1a5e25736a0ce59f6b111fd1abb39b8ae8e7bf53a54943e24b68d8812 proxmox-backup-server_3.4.0-1+testv2_amd64.deb
36214f794e7d08b11629470c24369ebd85f8457c1b2dba5de9a8bcce0cca1291 proxmox-backup-server-dbgsym_3.4.0-1+testv2_amd64.deb
 
Absolutely fantastic - thank you Dominik! With that patch I was able to write to a tape with no problems:

Code:
2025-04-16T09:02:26-05:00: Starting tape backup job 'test-store:test-single-tape-pool:lto0_01:test-store-single-tape'
2025-04-16T09:02:26-05:00: update media online status
2025-04-16T09:02:26-05:00: media set uuid: 94757cdd-c20f-4e7b-bfb2-35f7e4db0591
2025-04-16T09:02:26-05:00: found 1 groups (out of 1 total)
2025-04-16T09:02:26-05:00: latest-only: true (only considering latest snapshots)
2025-04-16T09:02:26-05:00: backup snapshot "vm/101/2025-04-15T15:28:23Z"
2025-04-16T09:02:26-05:00: allocated new writable media 'CJP180L8'
2025-04-16T09:02:26-05:00: trying to load media 'CJP180L8' into drive 'lto0_01'
2025-04-16T09:03:07-05:00: could not set encryption mode on drive: decode data encryption caps page failed - drive does not support setting AES-GCM encryption, ignoring.
2025-04-16T09:03:07-05:00: found media label CJP180L8 (23460ce5-68bf-4c29-8d6b-f4610748f83b)
2025-04-16T09:03:07-05:00: moving to end of media
2025-04-16T09:03:07-05:00: arrived at end of media
2025-04-16T09:03:13-05:00: wrote 970 chunks (1042.02 MB at 153.91 MB/s)
2025-04-16T09:03:13-05:00: end backup test-store:"vm/101/2025-04-15T15:28:23Z"
2025-04-16T09:03:13-05:00: percentage done: 100.00% (1/1 snapshots)
2025-04-16T09:03:16-05:00: append media catalog
2025-04-16T09:03:16-05:00: rewind media
2025-04-16T09:04:23-05:00: exported media 'CJP180L8' to import/export slot 24
2025-04-16T09:04:23-05:00: queued notification (id=6cb72bf8-9704-4085-9a2d-d5d1150ecce9)
2025-04-16T09:04:23-05:00: TASK OK

I also saw your note in the mailing list response about not being able to read labels:

"Note that in contrast to normal operation, the tape label will also be encrypted then and will not be readable in case the encryption key is lost or changed."

And that's definitely something we've been keeping in mind but bears repeating to anyone else reading this thread down the line: When the library is managing its keys the encryption becomes completely transparent to PBS, which means backing up those keys becomes critical. As part of the MSL Encryption Kit from HPE we received two identical key storage units, and their instructions state specifically to keep them mirrored and keep one offsite.

That sidebar for posterity over, I deleted the snapshot from the datastore and went to restore, forgetting that I needed to move the tape from the import/export slot first:

Code:
2025-04-16T09:32:12-05:00: Mediaset '94757cdd-c20f-4e7b-bfb2-35f7e4db0591'
2025-04-16T09:32:12-05:00: Pool: test-single-tape-pool
2025-04-16T09:32:12-05:00: Datastore(s): test-store
2025-04-16T09:32:12-05:00: Drive: lto0_01
2025-04-16T09:32:12-05:00: Required media list: CJP180L8
2025-04-16T09:32:12-05:00: trying to load media 'CJP180L8' into drive 'lto0_01'
2025-04-16T09:32:12-05:00: Unit Attention, Additional sense: Import or export element accessed
2025-04-16T09:32:13-05:00: could not load tape into drive - unable to load media 'CJP180L8' - inside import/export slot
2025-04-16T09:32:13-05:00: Please insert media 'CJP180L8' into changer 'tape_library01'
2025-04-16T09:32:13-05:00: queued notification (id=b43217d8-f9b3-4731-af6b-0031bdb7ecb8)
2025-04-16T09:34:12-05:00: could not set encryption mode on drive: decode data encryption caps page failed - drive does not support setting AES-GCM encryption, ignoring.
2025-04-16T09:34:12-05:00: found media label CJP180L8 (23460ce5-68bf-4c29-8d6b-f4610748f83b)
2025-04-16T09:34:12-05:00: File 2: chunk archive for datastore 'test-store'
2025-04-16T09:34:16-05:00: restored 0 B (0 B/s)
2025-04-16T09:34:16-05:00: register 970 chunks
2025-04-16T09:34:16-05:00: File 3: snapshot archive test-store:vm/101/2025-04-15T15:28:23Z
2025-04-16T09:34:16-05:00: restore snapshot vm/101/2025-04-15T15:28:23Z
2025-04-16T09:34:16-05:00: File 4: skip catalog '23460ce5-68bf-4c29-8d6b-f4610748f83b'
2025-04-16T09:34:16-05:00: detected EOT after 5 files
2025-04-16T09:34:16-05:00: Restore mediaset '94757cdd-c20f-4e7b-bfb2-35f7e4db0591' done
2025-04-16T09:34:16-05:00: TASK WARNINGS: 1

And sure enough, the backup shows up right where it should, with PVE picking it up as well. The only way to streamline this process would be some sort of "When media is inserted into import/export slot, move it into a free spot in the library" rule, but that's incredibly minor and not part of this.

Thank you again, Dominik! I'm consistently amazed at how easy Proxmox has made this entire process - not just the tape backup side of things, but the entire setup of and migration to our new stack. I can't tell you how much we appreciate the quick response on getting our tape library up and running within our regulatory requirements. Again, thanks, and as always please let us know if you'd like to use our unit for testing over the next few weeks as we finish commissioning the system.
 
great to hear that it works! glad to help and bring pbs/pve to more people ;) (i guess you won't be the only one that requires FIPS compliant backups)

as for the import/export slot loading, it could be done, but as you said it's a separate issue (a bug report would help to triage and track this)

if you wouldn't mind, there are two things that you could do:

* could you test with encryption enabled on the pbs side? this should fail in the first step when we try to set the encryption (or at the latest with the separate check), just want to make sure our assumptions hold the other way round ;)
* if all is well and you don't mind your e-mail address/name being in the commit, you could reply to my patch on the mailing list with a line that reads:
Code:
Tested-by: Your Name <your@email>

this might speed up the inclusion of the patch

thanks!
 
Can do - thank you Dominik! I tested creating a pool that uses encryption, and the process fails on the second check when PBS attempts to set the encryption mode:

Code:
2025-04-17T08:53:45-05:00: Starting tape backup job 'test-store:test-enc-pool:lto0_01:test-enc-failure'
2025-04-17T08:53:45-05:00: update media online status
2025-04-17T08:53:45-05:00: media set uuid: 5f319a2a-083a-46da-ab92-7a649fa4cfd6
2025-04-17T08:53:45-05:00: found 1 groups (out of 1 total)
2025-04-17T08:53:45-05:00: latest-only: true (only considering latest snapshots)
2025-04-17T08:53:45-05:00: backup snapshot "vm/101/2025-04-15T15:28:23Z"
2025-04-17T08:53:45-05:00: allocated new writable media 'CJP193L8'
2025-04-17T08:53:45-05:00: trying to load media 'CJP193L8' into drive 'lto0_01'
2025-04-17T08:53:45-05:00: could not set encryption mode on drive: decode data encryption caps page failed - drive does not support setting AES-GCM encryption, ignoring.
2025-04-17T08:53:45-05:00: found media label CJP193L8 (e53e0bc9-beb8-4566-828b-e6696258a2cd)
2025-04-17T08:53:45-05:00: writing new media set label
2025-04-17T08:53:57-05:00: queued notification (id=41654cc6-f689-4954-804d-c17bbfcb68c7)
2025-04-17T08:53:57-05:00: TASK ERROR: could not set encryption mode on drive: decode data encryption caps page failed - drive does not support setting AES-GCM encryption

And I'm just about to send the email to the mailing list. Thank you again for all your help, and please don't hesitate to reach out if you'd like us to test anything further!
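
For anyone reading later, the behaviour seen across the test runs in this thread can be summarized in a small sketch. This is only a model of what the logs show (a warning plus "ignoring" when the pool is unencrypted, a task error when the pool wants PBS-side encryption). It is not the actual PBS code, and all names are hypothetical.

```python
import logging

log = logging.getLogger("tape-sketch")


class EncryptionModeError(Exception):
    """Raised when the pool wants PBS-side encryption but the drive refuses it."""


def prepare_drive(drive_accepts_host_keys: bool, pool_encrypted: bool) -> str:
    """Return who ends up responsible for encryption, or raise if the job must fail."""
    if drive_accepts_host_keys:
        # Normal operation: PBS sets the drive's encryption mode itself.
        return "pbs-encrypted" if pool_encrypted else "unencrypted"
    if pool_encrypted:
        # PBS cannot load its key into the drive, so the job must not continue.
        raise EncryptionModeError("could not set encryption mode on drive")
    # Unencrypted pool on a drive that manages its own keys: warn and carry on.
    log.warning("could not set encryption mode on drive, ignoring.")
    return "drive-managed"


print(prepare_drive(drive_accepts_host_keys=False, pool_encrypted=False))
```

The point of the split is that a failure to set the mode is only tolerated when PBS did not want encryption in the first place, so data can never be written unencrypted into a pool that is configured as encrypted.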