TASK ERROR: IOMMU not present

NOIDSR

Hi there,

Newbie here. I have been successfully using Proxmox for some time, until just recently one of the 3 SSDs in my array failed and the system is now in degraded mode. No big deal, I disconnected the failed drive, and it should still work in a degraded state since the 2 other drives are connected. But now when I start any of my VMs I get "TASK ERROR: IOMMU not present" and the VMs do not start. Any ideas how to troubleshoot? Besides physically disconnecting the SSD I have not changed anything. Thanks!
 
It should only complain about IOMMU when you start VMs with PCI passthrough. Maybe the BIOS settings got reset and IOMMU is not (fully) enabled?
Was the SSD connected to an M.2 slot? Then it is a PCIe device, and removing (or adding) one can change the PCI-id of other devices by one.
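
Both points can be checked from the host shell, roughly like this (a sketch; the exact dmesg strings differ between Intel and AMD platforms):
Code:
# See whether the kernel enabled an IOMMU at all (look for DMAR / AMD-Vi lines)
dmesg | grep -i -e dmar -e iommu -e amd-vi

# List PCI devices with their IDs, to compare against the IDs configured for passthrough
lspci -nn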
 
Thanks for your reply! I have been using my VMs with passthrough. I saved the BIOS configuration and loaded it again in case something got reset. No change. The SSD was not in an M.2 slot. In the hardware section of each VM, if I double-click a PCI device it shows a message that no IOMMU is detected. If I remove those devices from the VM's hardware, then I can load the VM in the console.
 

Attachments

  • Screen Shot 2021-08-29 at 08.08.43.png
It's possible that your edits to the kernel cmdline didn't sync to all of the members of the RAID set. Run "cat /proc/cmdline" to see the kernel command line that actually booted, and verify that it contains your intel_iommu/iommu arguments as expected.
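
For example, the check might look roughly like this (the arguments shown are only illustrative; yours may differ):
Code:
cat /proc/cmdline
# Should contain the passthrough-related arguments you configured, e.g. something like:
# ... root=ZFS=rpool/ROOT/pve-1 boot=zfs intel_iommu=on ...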
 
OK, here is what I get when I run this command. My Linux beginner knowledge is very limited, thanks for the guidance.

Screen Shot 2021-08-29 at 08.32.54.png
 
I'm assuming you have an Intel CPU, because the AMD IOMMU is enabled by default. Your /proc/cmdline does not contain intel_iommu=on and that is probably why it is not working. Have a look at /etc/default/grub and /etc/kernel/cmdline to check if intel_iommu=on is there. You are booting from ZFS, so it is likely you are using systemd-boot instead of GRUB. But it also looks like your system is quite behind on updates, so you could be booting with GRUB. Don't update now, as that will only make debugging this a moving target, and maybe it only looks old because of this issue.
Maybe all you need is to run update-grub or update-initramfs -u (which will also update systemd-boot).
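
Roughly, and depending on which bootloader is actually in use, that would be (a sketch assuming a standard Proxmox install; only the relevant one is needed):
Code:
# If booting via GRUB: regenerate /boot/grub/grub.cfg from /etc/default/grub
update-grub

# If booting via systemd-boot (typical for ZFS + UEFI installs): refresh the loader
# entries from /etc/kernel/cmdline (triggered through the initramfs update hook)
update-initramfs -u

# Reboot, then re-check what actually booted
cat /proc/cmdline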
 
I ran the updates you suggested, no change.

/etc/kernel/cmdline looks like this:

root=ZFS=rpool/ROOT/pve-1 boot=zfs intel_iommu=on textonly video=astdrmfb video=efifb:off pcie_acs_override=downstream

/etc/default/grub looks like this:

# If you change this file, run 'update-grub' afterwards to update
# /boot/grub/grub.cfg.
# For full documentation of the options in this file, see:
# info -f grub -n 'Simple configuration'

GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="Proxmox Virtual Environment"
GRUB_CMDLINE_LINUX_DEFAULT="quiet"
GRUB_CMDLINE_LINUX="root=ZFS=rpool/ROOT/pve-1 boot=zfs"

# Disable os-prober, it might add menu entries for each guest
GRUB_DISABLE_OS_PROBER=true

# Uncomment to enable BadRAM filtering, modify to suit your needs
# This works with Linux (no patch required) and with any kernel that obtains
# the memory map information from GRUB (GNU Mach, kernel of FreeBSD ...)
#GRUB_BADRAM="0x01234567,0xfefefefe,0x89abcdef,0xefefefef"

# Uncomment to disable graphical terminal (grub-pc only)
#GRUB_TERMINAL=console

# The resolution used on graphical terminal
# note that you can use only modes which your graphic card supports via VBE
# you can see them in real GRUB with the command `vbeinfo'
#GRUB_GFXMODE=640x480

# Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux
#GRUB_DISABLE_LINUX_UUID=true

# Disable generation of recovery mode menu entries
GRUB_DISABLE_RECOVERY="true"

# Uncomment to get a beep at grub start
#GRUB_INIT_TUNE="480 440 1"
 
Your kernel parameters are configured for systemd-boot but you appear to be booting with GRUB. I don't know why this was changed by replacing a drive.
A work-around is to also add the kernel parameters to GRUB by changing it to GRUB_CMDLINE_LINUX="root=ZFS=rpool/ROOT/pve-1 boot=zfs intel_iommu=on textonly video=astdrmfb video=efifb:off pcie_acs_override=downstream" and run update-grub.
After that you can try to figure out why you are no longer booting in UEFI mode and switched to legacy or CSM mode.
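
For reference, the work-around described above would look roughly like this (the parameter list is copied from the /etc/kernel/cmdline shown earlier):
Code:
# In /etc/default/grub, change only this line:
GRUB_CMDLINE_LINUX="root=ZFS=rpool/ROOT/pve-1 boot=zfs intel_iommu=on textonly video=astdrmfb video=efifb:off pcie_acs_override=downstream"

# Then apply it and reboot:
update-grub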
 
OK, I can do the work-around, but how do I get back to the original state? I have not replaced the drive yet, just disconnected it, as it was preventing the system from loading and getting stuck in the BIOS at startup. As I understand it, we are only talking about passthrough here, since Proxmox itself starts correctly, I think. I start my passthrough VMs from the web dashboard, not automatically at Proxmox startup. Also the question is: was it configured to boot with GRUB before and the kernel params were changed somehow, or was it systemd-boot and something was changed in GRUB...? Thanks
 
It appears like your system used to boot in UEFI mode using systemd-boot, but was changed to legacy boot via GRUB. I think UEFI mode requires a GPT and will fall back to legacy if it finds an MBR. Maybe the failed drive was used to boot and contained a GPT, and the current boot drive contains an MBR. Or a BIOS setting was changed.
Maybe try booting from the third drive? What is the output of gdisk -l for each of your drives in /dev/ ?
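
A quick way to collect that for every drive in one go (a sketch; adjust the device globs if they don't match your system):
Code:
lsblk -e7 -o NAME,SIZE,TYPE,MOUNTPOINT
for disk in /dev/sd? /dev/nvme?n?; do
    echo "===== $disk ====="
    gdisk -l "$disk"
done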
 
Here you go......

root@V1:/dev# lsblk -e7
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 931.5G 0 disk
├─sda1 8:1 0 931.5G 0 part
└─sda9 8:9 0 8M 0 part
sdb 8:16 0 894.3G 0 disk
├─sdb1 8:17 0 1007K 0 part
├─sdb2 8:18 0 512M 0 part
└─sdb3 8:19 0 869.5G 0 part
sdc 8:32 0 5.5T 0 disk
├─sdc1 8:33 0 5.5T 0 part
└─sdc9 8:41 0 8M 0 part
sdd 8:48 0 931.5G 0 disk
├─sdd1 8:49 0 16M 0 part
└─sdd2 8:50 0 931.5G 0 part
zd0 230:0 0 1M 0 disk
zd16 230:16 0 1M 0 disk
zd32 230:32 0 1M 0 disk
zd48 230:48 0 1M 0 disk
zd64 230:64 0 100G 0 disk
├─zd64p1 230:65 0 499M 0 part
├─zd64p2 230:66 0 99M 0 part
├─zd64p3 230:67 0 16M 0 part
└─zd64p4 230:68 0 99.4G 0 part
zd80 230:80 0 100G 0 disk
├─zd80p1 230:81 0 200M 0 part
└─zd80p2 230:82 0 99.8G 0 part
zd96 230:96 0 100G 0 disk
├─zd96p1 230:97 0 499M 0 part
├─zd96p2 230:98 0 99M 0 part
├─zd96p3 230:99 0 16M 0 part
└─zd96p4 230:100 0 99.4G 0 part
zd112 230:112 0 100G 0 disk
├─zd112p1 230:113 0 200M 0 part
└─zd112p2 230:114 0 99.8G 0 part
nvme0n1 259:0 0 931.5G 0 disk
├─nvme0n1p1 259:1 0 200M 0 part
└─nvme0n1p2 259:2 0 931.2G 0 part
root@V1:/dev#

------------------------------------------------------------------
root@V1:/dev# gdisk -l sda
GPT fdisk (gdisk) version 1.0.3

Partition table scan:
MBR: protective
BSD: not present
APM: not present
GPT: present

Found valid GPT with protective MBR; using GPT.
Disk sda: 1953525168 sectors, 931.5 GiB
Sector size (logical/physical): 512/512 bytes
Disk identifier (GUID): 1419AB66-D6DC-1841-8A81-E084A6381661
Partition table holds up to 128 entries
Main partition table begins at sector 2 and ends at sector 33
First usable sector is 34, last usable sector is 1953525134
Partitions will be aligned on 2048-sector boundaries
Total free space is 3437 sectors (1.7 MiB)

Number Start (sector) End (sector) Size Code Name
1 2048 1953507327 931.5 GiB BF01 zfs-dedc892affa076cc
9 1953507328 1953523711 8.0 MiB BF07
---------------------------------------------------------------------------
root@V1:/dev# gdisk -l sdb
GPT fdisk (gdisk) version 1.0.3

Partition table scan:
MBR: protective
BSD: not present
APM: not present
GPT: present

Found valid GPT with protective MBR; using GPT.
Disk sdb: 1875385008 sectors, 894.3 GiB
Sector size (logical/physical): 512/512 bytes
Disk identifier (GUID): 057F0DC6-4E25-4122-A1F7-3CC45AC71A32
Partition table holds up to 128 entries
Main partition table begins at sector 2 and ends at sector 33
First usable sector is 34, last usable sector is 1875384974
Partitions will be aligned on 8-sector boundaries
Total free space is 50862734 sectors (24.3 GiB)

Number Start (sector) End (sector) Size Code Name
1 34 2047 1007.0 KiB EF02
2 2048 1050623 512.0 MiB EF00
3 1050624 1824522240 869.5 GiB BF01
-----------------------------------------------------------------------------------
root@V1:/dev# gdisk -l sdc
GPT fdisk (gdisk) version 1.0.3

Partition table scan:
MBR: protective
BSD: not present
APM: not present
GPT: present

Found valid GPT with protective MBR; using GPT.
Disk sdc: 11721045168 sectors, 5.5 TiB
Sector size (logical/physical): 512/4096 bytes
Disk identifier (GUID): 008C82D7-3574-FD4E-8771-FD37D03F6C9E
Partition table holds up to 128 entries
Main partition table begins at sector 2 and ends at sector 33
First usable sector is 34, last usable sector is 11721045134
Partitions will be aligned on 2048-sector boundaries
Total free space is 3181 sectors (1.6 MiB)

Number Start (sector) End (sector) Size Code Name
1 2048 11721027583 5.5 TiB BF01 zfs-2565cb981bc7c7a0
9 11721027584 11721043967 8.0 MiB BF07
----------------------------------------------------------------------------------------------
root@V1:/dev# gdisk -l sdd
GPT fdisk (gdisk) version 1.0.3

The protective MBR's 0xEE partition is oversized! Auto-repairing.

Partition table scan:
MBR: protective
BSD: not present
APM: not present
GPT: present

Found valid GPT with protective MBR; using GPT.
Disk sdd: 1953525168 sectors, 931.5 GiB
Sector size (logical/physical): 512/512 bytes
Disk identifier (GUID): 941D14C9-FC8E-4BEE-957D-F8D551D1B0C5
Partition table holds up to 128 entries
Main partition table begins at sector 2 and ends at sector 33
First usable sector is 34, last usable sector is 1953525134
Partitions will be aligned on 8-sector boundaries
Total free space is 3471 sectors (1.7 MiB)

Number Start (sector) End (sector) Size Code Name
1 34 32767 16.0 MiB 0C01 Microsoft reserved ...
2 32768 1953521663 931.5 GiB 0700 Basic data partition
 
They all seem to have a GPT. How about /dev/nvme0n1? Which one is your boot drive? /dev/sdb appears to have an ESP partition which would allow for UEFI boot. Maybe the NVME drive has one as well? The other drives can only boot with GRUB.
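
One way to spot ESPs without reading every gdisk listing is to print the partition type for each partition (a sketch; an ESP shows up as type code EF00 in gdisk, or as PARTTYPE c12a7328-f81f-11d2-ba4b-00a0c93ec93b in lsblk):
Code:
lsblk -o NAME,SIZE,PARTTYPE,FSTYPE /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/nvme0n1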
 
Negative. The NVMe drive is the one connected to my macOS VM.



I believe sda and sdb are the two remaining array drives, so either one can be used. But if I understand correctly, the failed one could have been used for boot. Since it is gone now, maybe the boot is messed up...
 
Try setting the drive that is /dev/sdb as your boot drive in the BIOS and see if it boots with the systemd-boot menu (kernel selection in center of the screen) instead of GRUB (kernel selection in top-left of screen).
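
Once it is back up, whether the system actually booted in UEFI mode (and therefore via systemd-boot) can be double-checked like this (a sketch):
Code:
# This directory only exists when the kernel was booted in UEFI mode
[ -d /sys/firmware/efi ] && echo "UEFI boot" || echo "legacy/CSM boot"

# In UEFI mode, the firmware boot entries (e.g. "Linux Boot Manager") can be listed with
efibootmgr -v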
 
OK, that worked. Thank you very much. As you can see in the screenshots, I changed the boot sequence from P4 to Linux Boot Manager and it did the trick. Not sure I understand what happened, but I like the result. :)

Now that this is working again, I feel I need to upgrade to a later Proxmox version, as it's been a while. I am not a subscriber though. What is the best way to do this without messing up the existing working system? Or is it better to do nothing since everything is working? If the new update would affect performance noticeably, then I would take the risk.
 

Attachments

  • IMG_20210829_210730.jpg
  • IMG_20210829_204431.jpg
Hey, I'm also having the same issue now. I don't have an /etc/kernel/cmdline file. /etc/default/grub looks like this:
Code:
# If you change this file, run 'update-grub' afterwards to update
# /boot/grub/grub.cfg.
# For full documentation of the options in this file, see:
#   info -f grub -n 'Simple configuration'

GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet"
GRUB_CMDLINE_LINUX=""
intel_iommu=on

# Uncomment to enable BadRAM filtering, modify to suit your needs
# This works with Linux (no patch required) and with any kernel that obtains
# the memory map information from GRUB (GNU Mach, kernel of FreeBSD ...)
#GRUB_BADRAM="0x01234567,0xfefefefe,0x89abcdef,0xefefefef"

# Uncomment to disable graphical terminal (grub-pc only)
#GRUB_TERMINAL=console

# The resolution used on graphical terminal
# note that you can use only modes which your graphic card supports via VBE
# you can see them in real GRUB with the command `vbeinfo'
#GRUB_GFXMODE=640x480

# Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux
#GRUB_DISABLE_LINUX_UUID=true

# Uncomment to disable generation of recovery mode menu entries
#GRUB_DISABLE_RECOVERY="true"

# Uncomment to get a beep at grub start
#GRUB_INIT_TUNE="480 440 1"
Any idea what might be the problem in this case? I have virtualization enabled in BIOS.
 
the 'intel_iommu=on' is in the wrong place in your example,

instead of
Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet"
GRUB_CMDLINE_LINUX=""
intel_iommu=on

you need to have
Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
GRUB_CMDLINE_LINUX=""

(don't forget to 'update-grub' afterwards)
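
After update-grub and a reboot, something like this can be used to verify that the IOMMU is actually active (a sketch):
Code:
# The booted command line should now contain intel_iommu=on
cat /proc/cmdline

# With a working IOMMU this prints one group directory per device
find /sys/kernel/iommu_groups/ -maxdepth 1 -type d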
 
After doing so, I am still getting the same error. Maybe my hardware doesn't support it. I have an Intel J1900. It seems it supports VT-x, but not VT-d. Thanks for your help!
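
For reference, whether the platform exposes VT-d at all can be checked from the host like this (a rough sketch; no DMAR/IOMMU lines in dmesg usually means the CPU/board does not provide it, or it is disabled in the firmware):
Code:
dmesg | grep -i -e dmar -e iommu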
 
