Hi,
A while ago I installed PVE 7.2 and set up a LVM-Thin pool on top of a LUKS container on top of a mdadm raid1. The LUKS container gets automatically unlocked on boot via "/etc/crypttab" with a keyfile, and PVE's LVM-Thin storage then uses that thin pool. That worked fine so far, even 2 or 3 days ago when I last rebooted my PVE node.
But today I upgraded my PVE node (the packages released today or yesterday) and it wanted me to reboot again because of the new firmware package. I did, but afterwards my LVM-Thin storage wasn't working anymore.
APT upgrades since the last reboot, when it still worked:
Code:
Start-Date: 2022-12-15 15:46:42
Commandline: apt-get dist-upgrade
Upgrade: pve-firmware:amd64 (3.5-6, 3.6-1), libproxmox-acme-perl:amd64 (1.4.2, 1.4.3), libproxmox-acme-plugins:amd64 (1.4.2, 1.4.3), pve-kernel-helper:amd64 (7.2-14, 7.3-1)
End-Date: 2022-12-15 15:48:51
I didn't change any host configs and didn't install anything new.
I checked my mdadm raid1 and it was healthy:
Code:
root@j3710:~# cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md127 : active raid1 sda5[0] sdb5[1]
      31439872 blocks super 1.2 [2/2] [UU]

unused devices: <none>
lsblk showed that my LUKS container wasn't opened. So I checked my crypttab:
Code:
root@j3710:~# cat /etc/crypttab
# <target name> <source device> <key file> <options>
luks_raid1 /dev/md/j3710:md_1 /root/.keys/luks_raid1.key luks
Then I checked the "/dev/md" folder:
Code:
root@j3710:~# ls -l /dev/md
total 0
lrwxrwxrwx 1 root root 8 Dec 15 16:16 md_1 -> ../md127
So what was previously always called "/dev/md/j3710:md_1" is now just "/dev/md/md_1".
So I changed the crypttab to "/dev/md/md_1", rebooted, and everything was working again.
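For reference, the working crypttab line now looks like this (same target name and keyfile as before, only the source device path changed):
Code:
# <target name> <source device> <key file> <options>
luks_raid1 /dev/md/md_1 /root/.keys/luks_raid1.key luks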
The question now is: why did that change? And will it switch back to "/dev/md/j3710:md_1" in the future and fail again?
I did a bit of research and found this post, which explains it:
https://unix.stackexchange.com/a/533941
HOMEHOST
The homehost line gives a default value for the --homehost= option to mdadm. There should normally be only one other word on the line. It should either be a host name, or one of the special words <system>, <none> and <ignore>. If <system> is given, then the gethostname(2) system call is used to get the host name. This is the default.
[...]
When arrays are created, this host name will be stored in the metadata. When arrays are assembled using auto-assembly, arrays which do not record the correct homehost name in their metadata will be assembled using a "foreign" name. A "foreign" name always ends with a digit string preceded by an underscore to differentiate it from any possible local name. e.g. /dev/md/1_1 or /dev/md/home_0.
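As a quick sanity check of that mechanism, one can compare the host part stored in the array metadata with the running hostname (a sketch; the device node is the one from "/proc/mdstat" above):
Code:
# The "Name" line shows the stored homehost:name pair and whether mdadm
# considers the array local to this host.
mdadm --detail /dev/md127 | grep Name
# Should print e.g.: Name : j3710:md_1  (local to host j3710)
hostname
# Should print: j3710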
My mdadm.conf is indeed set to use "HOMEHOST <system>":
Code:
root@j3710:~# cat /etc/mdadm/mdadm.conf
# mdadm.conf
#
# !NB! Run update-initramfs -u after updating this file.
# !NB! This will ensure that initramfs has an uptodate copy.
#
# Please refer to mdadm.conf(5) for information about this file.
#
# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
#DEVICE partitions containers
# automatically tag new arrays as belonging to the local system
HOMEHOST <system>
# instruct the monitoring daemon where to send mail alerts
MAILADDR <redacted>
# definitions of existing MD arrays
# This configuration was auto-generated on Wed, 16 Nov 2022 18:27:10 +0100 by mkconf
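Side note: the config above contains no explicit ARRAY definition. If the naming keeps flip-flopping, one option might be to pin the device node with an explicit ARRAY line (a sketch, built from the "mdadm --examine --brief --scan" output shown below; remember the update-initramfs note at the top of the file):
Code:
# Pin the assembled name for this array explicitly, so auto-assembly
# (homehost matching) no longer decides what shows up under /dev/md/.
ARRAY /dev/md/md_1 metadata=1.2 UUID=2cdcdb2b:faa6069f:4d0b4501:842554f6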
And my hostname is still "j3710", and it was also "j3710" when the array was created.
My array is still named "j3710:md_1":
Code:
root@j3710:~# mdadm --examine --brief --scan --config=partitions
ARRAY /dev/md/md_1 metadata=1.2 UUID=2cdcdb2b:faa6069f:4d0b4501:842554f6 name=j3710:md_1
root@j3710:~# mdadm --detail /dev/md/md_1
/dev/md/md_1:
           Version : 1.2
     Creation Time : Wed Nov 16 18:29:13 2022
        Raid Level : raid1
        Array Size : 31439872 (29.98 GiB 32.19 GB)
     Used Dev Size : 31439872 (29.98 GiB 32.19 GB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent

       Update Time : Sat Dec  3 22:28:10 2022
             State : clean
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0

Consistency Policy : resync

              Name : j3710:md_1  (local to host j3710)
              UUID : 2cdcdb2b:faa6069f:4d0b4501:842554f6
            Events : 71

    Number   Major   Minor   RaidDevice State
       0       8        5        0      active sync   /dev/sda5
       1       8       21        1      active sync   /dev/sdb5
So why is this now called "/dev/md/md_1" instead of the previous "/dev/md/j3710:md_1"?
And could I fix this by using the above UUID in my crypttab, like this?
Code:
# <target name> <source device> <key file> <options>
luks_raid1 UUID=2cdcdb2b:faa6069f:4d0b4501:842554f6 /root/.keys/luks_raid1.key luks
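Though I'm not sure that UUID format would even resolve: crypttab's "UUID=" goes through "/dev/disk/by-uuid", which holds the blkid UUID of the LUKS header, not mdadm's colon-separated array UUID; the mdadm array UUID shows up under "/dev/disk/by-id/md-uuid-*" instead. A sketch of how to look both up:
Code:
# blkid/LUKS UUID of the container -- what "UUID=" in crypttab resolves:
cryptsetup luksUUID /dev/md/md_1
blkid /dev/md/md_1
# udev symlink derived from the mdadm array UUID, independent of the
# /dev/md/ name, also usable as the <source device>:
ls -l /dev/disk/by-id/md-uuid-*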
If that works, it will fix the problem on this host. But I have other PVE hosts that also run a LUKS-encrypted swap on top of a mdadm raid1 (I know this is problematic, but no one has been able to tell me a better way to get mirrored swap), and there I wouldn't be able to use UUIDs, as cryptsetup reformats the swap partition on each reboot, so the UUID would change every time.
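For the swap hosts, one idea might be to reference the raid via the udev "md-uuid" symlink mentioned above: it is derived from the mdadm array UUID, so it shouldn't care about the /dev/md/ name or about the swap signature being recreated inside the container on every boot. A sketch, assuming such a symlink exists on those hosts (the target name, UUID, and cipher options here are placeholders):
Code:
# /etc/crypttab
# <target name> <source device> <key file> <options>
swap_crypt /dev/disk/by-id/md-uuid-aaaaaaaa:bbbbbbbb:cccccccc:dddddddd /dev/urandom swap,cipher=aes-xts-plain64,size=512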
So the question is... why is it now called "/dev/md/md_1" when my mdadm.conf is set to neither "HOMEHOST <none>" nor "HOMEHOST <ignore>"?
@Stoiko Ivanov:
This thread is a continuation of this post.