[SOLVED] Solaris QEMU ZPools

could you point exactly to the line in the logfile where you were dropped into the emergency shell / initramfs? what message do you see there?
Yesterday I did an upgrade to PVE 8.1.4, hoping the issue might have been fixed - unfortunately not.

Anyway - I attached the logfile from the reboot after the upgrade. The emergency prompt appears directly after line 1853.

I also attached a photo from the screen.
 

Attachments

  • pve.814.log (204.6 KB)
  • pve.814.jpg (339.7 KB)
can you disable nfs mounts to make the errors below go away, to make sure it's not the nfs mounts causing this problem?

there is a ticket and linked threads on the failing of the zfs-import systemd service at https://bugzilla.proxmox.com/show_bug.cgi?id=4835 ; these appear to be non-critical and shouldn't be the cause of the emergency shell

afaik this is the relevant part of your log:
1605 Feb 14 08:17:49 proxmox systemd[1]: Mounting nfs-docs-oreilly.mount - /nfs/docs/oreilly...
1606 Feb 14 08:17:49 proxmox mount[1197]: mount: /nfs/docs/oreilly: special device /nfs/docs/oreilly_v6.iso does not exist.
1607 Feb 14 08:17:49 proxmox mount[1197]: dmesg(1) may have more information after failed mount system call.
1608 Feb 14 08:17:49 proxmox systemd[1]: nfs-docs-oreilly.mount: Mount process exited, code=exited, status=32/n/a
1609 Feb 14 08:17:49 proxmox systemd[1]: nfs-docs-oreilly.mount: Failed with result 'exit-code'.
1610 Feb 14 08:17:49 proxmox systemd[1]: Failed to mount nfs-docs-oreilly.mount - /nfs/docs/oreilly.
1611 Feb 14 08:17:49 proxmox systemd[1]: Dependency failed for nfs-server.service - NFS server and services.
1612 Feb 14 08:17:49 proxmox systemd[1]: Dependency failed for nfs-mountd.service - NFS Mount Daemon.
1613 Feb 14 08:17:49 proxmox systemd[1]: nfs-mountd.service: Job nfs-mountd.service/start failed with result 'dependency'.
1614 Feb 14 08:17:49 proxmox systemd[1]: Dependency failed for nfs-idmapd.service - NFSv4 ID-name mapping service.
1615 Feb 14 08:17:49 proxmox systemd[1]: nfs-idmapd.service: Job nfs-idmapd.service/start failed with result 'dependency'.
1616 Feb 14 08:17:49 proxmox systemd[1]: nfs-server.service: Job nfs-server.service/start failed with result 'dependency'.
1617 Feb 14 08:17:49 proxmox systemd[1]: Dependency failed for local-fs.target - Local File Systems.
1618 Feb 14 08:17:49 proxmox systemd[1]: local-fs.target: Job local-fs.target/start failed with result 'dependency'.
1619 Feb 14 08:17:49 proxmox systemd[1]: local-fs.target: Triggering OnFailure= dependencies.
1620 Feb 14 08:17:49 proxmox systemd[1]: systemd-ask-password-console.path: Deactivated successfully.
1796 Feb 14 08:17:50 proxmox zpool[1281]: cannot import 'rpool': a pool with that name already exists
1797 Feb 14 08:17:50 proxmox zpool[1281]: use the form 'zpool import <pool | id> <newpool>' to give it a new name
1798 Feb 14 08:17:50 proxmox systemd[1]: zfs-import@rpool.service: Main process exited, code=exited, status=1/FAILURE
1799 Feb 14 08:17:50 proxmox systemd[1]: zfs-import@rpool.service: Failed with result 'exit-code'.
1800 Feb 14 08:17:50 proxmox systemd[1]: Failed to start zfs-import@rpool.service - Import ZFS pool rpool.
1801 Feb 14 08:17:50 proxmox zpool[1280]: cannot import 'nfs': no such pool available
1802 Feb 14 08:17:50 proxmox systemd[1]: zfs-import@nfs.service: Main process exited, code=exited, status=1/FAILURE
1803 Feb 14 08:17:50 proxmox systemd[1]: zfs-import@nfs.service: Failed with result 'exit-code'.
1804 Feb 14 08:17:50 proxmox systemd[1]: Failed to start zfs-import@nfs.service - Import ZFS pool nfs.
 
zfs-import failing has no effect other than that pool possibly not being imported. in this case we have three import services running:
- zfs-import-cache (using the cache file)
- zfs-import@rpool (makes no sense, the rpool is always imported by the initrd!)
- zfs-import@nfs (this races with PVE services activating that pool, and it seems the PVE services win ;) - or it is in the cache file, and thus already imported via that service)
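to check which of these import units actually exist and are enabled on a given host, something along these lines works (unit names as above, output is host-specific):

Code:
systemctl list-unit-files 'zfs-import*'
systemctl list-units --all 'zfs-import*'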

the cause for the "rescue shell" is

Code:
1618 Feb 14 08:17:49 proxmox systemd[1]: local-fs.target: Job local-fs.target/start failed with result 'dependency'.

since without mounting all local file systems configured to be mounted at boot time, systemd doesn't know whether it's safe to proceed.

and it seems that the NFS-related lines above it (why those are in the dependency chain for local-fs I don't know) are to blame.
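to see why local-fs.target pulls those mount units in at all, the dependency tree can be inspected roughly like this (unit name taken from the log above):

Code:
systemctl list-dependencies local-fs.target
systemctl show -p RequiredBy,WantedBy nfs-docs-oreilly.mount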
 
can you disable nfs mounts, to make sure it's not the nfs mounts causing this problem?
Actually, PVE is not an NFS client. One of the pools is named "nfs" because it's hosting filesystems shared in my environment. PVE is exporting some of the filesystems:
/ssd/home 192.168.125.0/24(rw,sync,no_root_squash,no_subtree_check)
/nfs/docs 192.168.125.0/24(rw,sync,no_root_squash,no_subtree_check)
/nfs/docs/oreilly 192.168.125.0/24(rw,sync,no_root_squash,no_subtree_check)
/nfs/backup 192.168.125.0/24(rw,sync,no_root_squash,no_subtree_check)
/nfs/tmp 192.168.125.0/24(rw,sync,no_root_squash,no_subtree_check)

/nfs/docs/oreilly is a loopback mount of an ISO CD image:

/nfs/docs/oreilly_v6.iso /nfs/docs/oreilly iso9660 ro 0 0

I commented it out in fstab. This seems to solve the emergency mode issue (aside from the volmode setting).
 
The question arising now is: Why is "local-fs.target" failing with the ISO loopback mount?
Mounting it manually with
mount -t iso9660 -o ro /nfs/docs/oreilly_v6.iso /nfs/docs/oreilly
works fine.
 
because you didn't specify that it is a loopback, and the generated unit is bogus:

Code:
1606 Feb 14 08:17:49 proxmox mount[1197]: mount: /nfs/docs/oreilly: special device /nfs/docs/oreilly_v6.iso does not exist.
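i.e., with the loop option spelled out the entry would presumably look like this (the ordering on ZFS turns out to matter as well, see further down):

Code:
/nfs/docs/oreilly_v6.iso /nfs/docs/oreilly iso9660 loop,ro 0 0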
 
I've set up a little PVE lab on my Dell Precision M4600: installed PVE 8.1.4 on a ZPool and created two guests, a CT with Ubuntu 20.04 and a KVM running Solaris 11. Both run fine and rebooting causes no problems, even with volmode left at its default - no trouble.

So far we've had the loopback mount, which created an error.

I set up another VM (KVM) with Debian 12 on ext4, copied my ISO image to the VM and created the fstab entry for mounting it:
/ssd/docs/oreilly_v6.iso /ssd/docs/oreilly iso9660 loop,ro 0 0
Rebooted the VM - everything fine.

Lastly I copied the ISO to the lab PVE and created the same fstab entry. The next reboot of my lab PVE ended in emergency mode:
-- Boot d16c9e4512484d57905fbc5064202e7e --
Feb 14 17:53:33 pvetest systemd[1]: Mounting ssd-docs-oreilly.mount - /ssd/docs/oreilly...
Feb 14 17:53:33 pvetest mount[599]: mount: /ssd/docs/oreilly: failed to setup loop device for /ssd/docs/oreilly_v6.iso.
Feb 14 17:53:33 pvetest systemd[1]: ssd-docs-oreilly.mount: Mount process exited, code=exited, status=32/n/a
Feb 14 17:53:33 pvetest systemd[1]: ssd-docs-oreilly.mount: Failed with result 'exit-code'.
Feb 14 17:53:33 pvetest systemd[1]: Failed to mount ssd-docs-oreilly.mount - /ssd/docs/oreilly.

Commenting the fstab line out from the emergency shell lets me continue the boot process.

Lastly I commented the fstab line for the loopback mount back in, ran a systemctl daemon-reload and mounted the ISO just with
mount /ssd/docs/oreilly
Everything fine.

To me it seems that PVE doesn't handle loopback mounts in fstab correctly during the boot sequence.
 
could you post the output of "systemctl show ssd-docs-oreilly.mount"? note that none of this is PVE, it's all systemd ;) one thing that you might need to add is an explicit dependency on the mountpoint where the iso file is located - e.g., if it is on one of your non-/ ZFS pools, those need to be imported and mounted first. "man fstab" has more details on the syntax and what not, but you can also write a mount unit instead of the fstab line if you are more comfortable with that.
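a hand-written mount unit equivalent to that fstab line could look roughly like this (file name and paths taken from your lab setup; the explicit ordering on zfs-mount.service is an assumption about the ISO living on a ZFS dataset):

Code:
# /etc/systemd/system/ssd-docs-oreilly.mount - sketch; the unit name must match the mount path
[Unit]
Description=O'Reilly ISO loop mount
Requires=zfs-mount.service
After=zfs-mount.service

[Mount]
What=/ssd/docs/oreilly_v6.iso
Where=/ssd/docs/oreilly
Type=iso9660
Options=loop,ro

[Install]
WantedBy=local-fs.target

it would then be enabled with "systemctl enable --now ssd-docs-oreilly.mount" instead of keeping the fstab line.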
 
could you post the output of "systemctl show ssd-docs-oreilly.mount"?
Here it is:

root@pvetest:~# systemctl show --no-pager ssd-docs-oreilly.mount
Where=/ssd/docs/oreilly
What=/ssd/docs/oreilly_v6.iso
Options=loop,ro
Type=iso9660
TimeoutUSec=1min 30s
ControlPID=0
DirectoryMode=0755
SloppyOptions=no
LazyUnmount=no
ForceUnmount=no
ReadWriteOnly=no
Result=exit-code
UID=[not set]
GID=[not set]
Slice=system.slice
ControlGroupId=0
MemoryCurrent=[not set]
MemoryAvailable=infinity
CPUUsageNSec=3037000
TasksCurrent=[not set]
IPIngressBytes=[no data]
IPIngressPackets=[no data]
IPEgressBytes=[no data]
IPEgressPackets=[no data]
IOReadBytes=18446744073709551615
IOReadOperations=18446744073709551615
IOWriteBytes=18446744073709551615
IOWriteOperations=18446744073709551615
Delegate=no
CPUAccounting=yes
CPUWeight=[not set]
StartupCPUWeight=[not set]
CPUShares=[not set]
StartupCPUShares=[not set]
CPUQuotaPerSecUSec=infinity
CPUQuotaPeriodUSec=infinity
IOAccounting=no
IOWeight=[not set]
StartupIOWeight=[not set]
BlockIOAccounting=no
BlockIOWeight=[not set]
StartupBlockIOWeight=[not set]
MemoryAccounting=yes
DefaultMemoryLow=0
DefaultMemoryMin=0
MemoryMin=0
MemoryLow=0
MemoryHigh=infinity
MemoryMax=infinity
MemorySwapMax=infinity
MemoryLimit=infinity
DevicePolicy=auto
TasksAccounting=yes
TasksMax=9386
IPAccounting=no
ManagedOOMSwap=auto
ManagedOOMMemoryPressure=auto
ManagedOOMMemoryPressureLimit=0
ManagedOOMPreference=none
UMask=0022
LimitCPU=infinity
LimitCPUSoft=infinity
LimitFSIZE=infinity
LimitFSIZESoft=infinity
LimitDATA=infinity
LimitDATASoft=infinity
LimitSTACK=infinity
LimitSTACKSoft=8388608
LimitCORE=infinity
LimitCORESoft=0
LimitRSS=infinity
LimitRSSSoft=infinity
LimitNOFILE=524288
LimitNOFILESoft=1024
LimitAS=infinity
LimitASSoft=infinity
LimitNPROC=31288
LimitNPROCSoft=31288
LimitMEMLOCK=8388608
LimitMEMLOCKSoft=8388608
LimitLOCKS=infinity
LimitLOCKSSoft=infinity
LimitSIGPENDING=31288
LimitSIGPENDINGSoft=31288
LimitMSGQUEUE=819200
LimitMSGQUEUESoft=819200
LimitNICE=0
LimitNICESoft=0
LimitRTPRIO=0
LimitRTPRIOSoft=0
LimitRTTIME=infinity
LimitRTTIMESoft=infinity
OOMScoreAdjust=0
CoredumpFilter=0x33
Nice=0
IOSchedulingClass=2
IOSchedulingPriority=4
CPUSchedulingPolicy=0
CPUSchedulingPriority=0
CPUAffinityFromNUMA=no
NUMAPolicy=n/a
TimerSlackNSec=50000
CPUSchedulingResetOnFork=no
NonBlocking=no
StandardInput=null
StandardOutput=journal
StandardError=inherit
TTYReset=no
TTYVHangup=no
TTYVTDisallocate=no
SyslogPriority=30
SyslogLevelPrefix=yes
SyslogLevel=6
SyslogFacility=3
LogLevelMax=-1
LogRateLimitIntervalUSec=0
LogRateLimitBurst=0
SecureBits=0
CapabilityBoundingSet=cap_chown cap_dac_override cap_dac_read_search cap_fowner cap_fsetid cap_kill cap_setgid cap_setuid cap_setpcap cap_linux_immutable cap_net_bind_service cap_net_broadcast cap_net_admin cap_net_raw cap_ipc_lock cap_ipc_owner cap_sys_module cap_sys_rawio cap_sys_chroot cap_sys_ptrace cap_sys_pacct cap_sys_admin cap_sys_boot cap_sys_nice cap_sys_resource cap_sys_time cap_sys_tty_config cap_mknod cap_lease cap_audit_write cap_audit_control cap_setfcap cap_mac_override cap_mac_admin cap_syslog cap_wake_alarm cap_block_suspend cap_audit_read cap_perfmon cap_bpf cap_checkpoint_restore
DynamicUser=no
RemoveIPC=no
PrivateTmp=no
PrivateDevices=no
ProtectClock=no
ProtectKernelTunables=no
ProtectKernelModules=no
ProtectKernelLogs=no
ProtectControlGroups=no
PrivateNetwork=no
PrivateUsers=no
PrivateMounts=no
PrivateIPC=no
ProtectHome=no
ProtectSystem=no
SameProcessGroup=yes
UtmpMode=init
IgnoreSIGPIPE=yes
NoNewPrivileges=no
SystemCallErrorNumber=2147483646
LockPersonality=no
RuntimeDirectoryPreserve=no
RuntimeDirectoryMode=0755
StateDirectoryMode=0755
CacheDirectoryMode=0755
LogsDirectoryMode=0755
ConfigurationDirectoryMode=0755
TimeoutCleanUSec=infinity
MemoryDenyWriteExecute=no
RestrictRealtime=no
RestrictSUIDSGID=no
RestrictNamespaces=no
MountAPIVFS=no
KeyringMode=shared
ProtectProc=default
ProcSubset=all
ProtectHostname=no
KillMode=control-group
KillSignal=15
RestartKillSignal=15
FinalKillSignal=9
SendSIGKILL=yes
SendSIGHUP=no
WatchdogSignal=6
Id=ssd-docs-oreilly.mount
Names=ssd-docs-oreilly.mount
Requires=system.slice
RequiredBy=local-fs.target
Conflicts=umount.target
Before=umount.target local-fs.target
After=system.slice systemd-journald.socket -.mount local-fs-pre.target ssd.mount
RequiresMountsFor=/ssd/docs/oreilly_v6.iso /ssd/docs
Documentation="man:fstab(5)" "man:systemd-fstab-generator(8)"
Description=/ssd/docs/oreilly
LoadState=loaded
ActiveState=failed
FreezerState=running
SubState=failed
FragmentPath=/run/systemd/generator/ssd-docs-oreilly.mount
SourcePath=/etc/fstab
UnitFileState=generated
UnitFilePreset=enabled
StateChangeTimestamp=Thu 2024-02-15 10:10:23 CET
StateChangeTimestampMonotonic=8965269
InactiveExitTimestamp=Thu 2024-02-15 10:10:23 CET
InactiveExitTimestampMonotonic=8960110
ActiveEnterTimestampMonotonic=0
ActiveExitTimestampMonotonic=0
InactiveEnterTimestamp=Thu 2024-02-15 10:10:23 CET
InactiveEnterTimestampMonotonic=8965269
CanStart=yes
CanStop=yes
CanReload=yes
CanIsolate=no
CanFreeze=no
StopWhenUnneeded=no
RefuseManualStart=no
RefuseManualStop=no
AllowIsolate=no
DefaultDependencies=yes
OnSuccessJobMode=fail
OnFailureJobMode=replace
IgnoreOnIsolate=yes
NeedDaemonReload=no
JobTimeoutUSec=infinity
JobRunningTimeoutUSec=infinity
JobTimeoutAction=none
ConditionResult=yes
AssertResult=yes
ConditionTimestamp=Thu 2024-02-15 10:10:23 CET
ConditionTimestampMonotonic=8941726
AssertTimestamp=Thu 2024-02-15 10:10:23 CET
AssertTimestampMonotonic=8941728
Transient=no
Perpetual=no
StartLimitIntervalUSec=10s
StartLimitBurst=5
StartLimitAction=none
FailureAction=none
SuccessAction=none
InvocationID=cd119ce26cce4a4388f83b0662b24536
CollectMode=inactive
 
and what are /ssd and /ssd/docs? are these ZFS datasets?
 
yeah, so you need to tell systemd that there is a dependency there, else mounting it at boot time is inherently racy (or possibly even worse - always ordered wrong!)
 
did you also setup a ZFS pool there? or is the iso file on / in that case?
 
That's an excellent point!
In the standard Debian installation ZFS is not included, so I used a single ext4 partition. I'll install ZFS, set up a ZPool and re-run the test.
 
OK, I did some testing including ZFS.

ISO mount handled by this fstab entry:

/export/docs/oreilly_v6.iso /export/docs/oreilly iso9660 loop,ro 0 0

1. ISO and its mountpoint on ext4 -> OK
2. ISO and its mountpoint on ZFS -> failure
3. ISO on ext4, mountpoint on ZFS -> OK
4. ISO on ZFS, mountpoint on ext4 -> failure

I would say this is an interoperability issue between ZFS and fstab.
 
well, yes and no. like I wrote above - systemd doesn't know about the ZFS mountpoints until "zfs-mount.service" runs, so you need to order your mounts after that if they require ZFS datasets to be mounted (or use the zfs mount generator). if you give systemd an incomplete picture of the dependencies between units, then it cannot handle everything properly (or only by luck, and not deterministically)
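for what it's worth, the ordering systemd actually ended up with on a given boot can be checked with something like this (unit name from your lab setup; output is system-specific):

Code:
systemd-analyze critical-chain ssd-docs-oreilly.mount
systemctl list-dependencies --after ssd-docs-oreilly.mount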
 
I checked the dependencies of zfs-mount.service. It says
Before=local-fs.target zfs-share.service
Afaik local-fs.target is evaluating fstab and mounting the filesystems. According to the dependency of zfs-mount.service the ZFS mounts should be completed before local-fs.target. But for some reason systemd-fstab-generator inserts a dependency in local-fs.target that depends on the completion of zfs-mount.service - so we have a deadlock :-(
 
I think you misunderstand how the unit ordering works there (or how systemd targets work?). just order your fstab line after zfs-mount.service, the fstab manpage tells you how.
 
OK, I think I got you. Maybe I was a bit spoiled by Solaris, working with ZFS from the very beginning in 2005 ;-)

I was a bit confused by the statement about "ordering the fstab lines", because the fstab on my lab PVE just had these lines:
# <file system> <mount point> <type> <options> <dump> <pass>
proc /proc proc defaults 0 0
# oreilly
/ssd/docs/oreilly_v6.iso /ssd/docs/oreilly iso9660 loop,ro 0 0

For the loopback mount I now added
x-systemd.requires=zfs-mount.service

That solved the issue.
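Combining the loop option with that ordering hint, the resulting fstab entry presumably ends up like this (paths as in my lab setup):

Code:
/ssd/docs/oreilly_v6.iso /ssd/docs/oreilly iso9660 loop,ro,x-systemd.requires=zfs-mount.service 0 0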

Thanx a lot!
 
