Problems using encrypted ZFS storage

marcelprox

New Member
Dec 7, 2019
Hello everyone,

I have a Dell T30 and I would like to use two mirrored 1 TB SSDs for my data and VMs. Since I want to encrypt my data, I am trying to use an encrypted ZFS setup. I thought everything was working, but after a reboot of my Proxmox host I am no longer able to start my VMs. For now I don't have any production data or VMs because I am (happily) just playing around, but I would like to use this as a production system once everything runs smoothly.
These are the steps I thought would set up everything I need:
  • zpool create ssdRaidPool mirror /dev/sdb /dev/sdc
  • zpool set feature@encryption=enabled ssdRaidPool
  • zfs create -o encryption=on -o keyformat=passphrase ssdRaidPool/encryptedSsd
  • pvesm add zfspool encryptedSsdStorage -pool ssdRaidPool/encryptedSsd

After that I make sure encryption works and type:
  • zfs load-key ssdRaidPool/encryptedSsd
I get the message "Key load error: Key already loaded for 'ssdRaidPool/encryptedSsd'". Everything is fine, I am able to create new VMs and LXC containers - no problem, everything works well.
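To double-check the dataset state at that point, something like the following (a simple sanity check using standard zfs properties) should show that the key is loaded and the dataset is mounted:
  • zfs get encryption,keystatus,mounted ssdRaidPool/encryptedSsd
keystatus should read "available" and mounted should read "yes".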

BUT: after a reboot I am unable to start my VMs and containers... Right after rebooting Proxmox I unlock my storage as above (zfs load-key ssdRaidPool/encryptedSsd) and try to start my machines, but the error is always:

"Job for pve-container@100.service failed because the control process exited with error code.

See "systemctl status pve-container@100.service" and "journalctl -xe" for details.
TASK ERROR: command 'systemctl start pve-container@100' failed: exit code 1
"



systemctl status pve-container@100.service gives me:
pve-container@100.service - PVE LXC Container: 100
Loaded: loaded (/lib/systemd/system/pve-container@.service; static; vendor preset: enabled)
Active: failed (Result: exit-code) since Wed 2020-01-08 21:01:46 CET; 13min ago
Docs: man:lxc-start
man:lxc
man:pct
Process: 7870 ExecStart=/usr/bin/lxc-start -n 100 (code=exited, status=1/FAILURE)

Jan 08 21:01:46 pve systemd[1]: Starting PVE LXC Container: 100...
Jan 08 21:01:46 pve lxc-start[7870]: lxc-start: 100: lxccontainer.c: wait_on_daemonized_start: 865 No such file or directory - Failed to receive the container state
Jan 08 21:01:46 pve lxc-start[7870]: lxc-start: 100: tools/lxc_start.c: main: 329 The container failed to start
Jan 08 21:01:46 pve lxc-start[7870]: lxc-start: 100: tools/lxc_start.c: main: 332 To get more details, run the container in foreground mode
Jan 08 21:01:46 pve lxc-start[7870]: lxc-start: 100: tools/lxc_start.c: main: 335 Additional information can be obtained by setting the --logfile and --logpriority options
Jan 08 21:01:46 pve systemd[1]: pve-container@100.service: Control process exited, code=exited, status=1/FAILURE
Jan 08 21:01:46 pve systemd[1]: pve-container@100.service: Failed with result 'exit-code'.
Jan 08 21:01:46 pve systemd[1]: Failed to start PVE LXC Container: 100.




I can't find anything useful with journalctl -xe.
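As the log output suggests, more detail could probably be gathered by starting the container in the foreground with a logfile (the path /tmp/lxc-100.log is just an example):
  • lxc-start -n 100 -F -l DEBUG -o /tmp/lxc-100.log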


Any ideas what I'm doing wrong here :/?

Thanks in advance!
 
Hi,

The problem is that the datasets are not mounted.
I guess that after loading the key, a restart of the mount service
systemctl restart zfs-mount.service
should be enough.
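In other words, something along these lines after the reboot (a sketch, using your pool/dataset names):
  • zfs load-key ssdRaidPool/encryptedSsd
  • systemctl restart zfs-mount.service
  • pct start 100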
 
Unfortunately not :/.
After systemctl status zfs-mount.service I get the following output:
● zfs-mount.service - Mount ZFS filesystems
Loaded: loaded (/lib/systemd/system/zfs-mount.service; enabled; vendor preset: enabled)
Active: active (exited) since Thu 2020-01-09 18:54:04 CET; 25s ago
Docs: man:zfs(8)
Process: 936 ExecStart=/sbin/zfs mount -a (code=exited, status=0/SUCCESS)
Main PID: 936 (code=exited, status=0/SUCCESS)

Jan 09 18:54:04 pve systemd[1]: Starting Mount ZFS filesystems...
Jan 09 18:54:04 pve systemd[1]: Started Mount ZFS filesystems.
root@pve:~#



If I try systemctl stop zfs-mount.service and after that systemctl start zfs-mount.service, I get this status:

root@pve:~# systemctl start zfs-mount.service
Job for zfs-mount.service failed because the control process exited with error code.
See "systemctl status zfs-mount.service" and "journalctl -xe" for details.
root@pve:~# systemctl status zfs-mount.service
● zfs-mount.service - Mount ZFS filesystems
Loaded: loaded (/lib/systemd/system/zfs-mount.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Thu 2020-01-09 18:57:39 CET; 4s ago
Docs: man:zfs(8)
Process: 2054 ExecStart=/sbin/zfs mount -a (code=exited, status=1/FAILURE)
Main PID: 2054 (code=exited, status=1/FAILURE)

Jan 09 18:57:38 pve systemd[1]: Starting Mount ZFS filesystems...
Jan 09 18:57:39 pve zfs[2054]: cannot mount '/ssdRaidPool/encryptedSpace': directory is not empty
Jan 09 18:57:39 pve systemd[1]: zfs-mount.service: Main process exited, code=exited, status=1/FAILURE
Jan 09 18:57:39 pve systemd[1]: zfs-mount.service: Failed with result 'exit-code'.
Jan 09 18:57:39 pve systemd[1]: Failed to start Mount ZFS filesystems.
 
Ok, wrong lead :D...
It has nothing to do with ZFS encryption!
Actually I found this thread because I had problems with a second Proxmox installation where (unencrypted) containers also didn't boot up after a reboot.
So, in short, the workaround from this thread (remove the leftover dirs in the subvols, unmount & mount, start up the containers) works - roughly as sketched below!
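The sequence I used, roughly (a sketch - the dataset and subvol names are just examples, and only directories that are really empty leftovers should be removed):
  • zfs unmount ssdRaidPool/encryptedSsd        (only if it was mounted over the leftovers)
  • rmdir /ssdRaidPool/encryptedSsd/subvol-100-disk-0
  • zfs mount -a
  • pct start 100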

No problems with encrypted ZFS volumes at all.

Thank you nevertheless!
 