Remove io_uring as the default (important)

freebee

Member
May 8, 2020
Observations
On my servers, I work with NVMe, ZFS, and Windows and Linux guest machines.
Especially on older Linux guests (like SUSE Linux Enterprise 12 SP4), I experienced three major problems:
1. After a VM backup, an IBM DB2 database got corrupted. Nobody was using it at the time.
2. During a backup from this server, I got corrupted data, which stopped the backup. I tried again and it ran fine to the end.
3. The I/O latency inside a VM sometimes goes very high, and the network stops for about one second.

I changed to aio=threads a few weeks ago, and there have been no more problems; everything was solved.
Maybe io_uring is designed for the in-kernel Linux file systems; ZFS is not in its scope.

Problems with security
Google, on the other hand, disabled io_uring in their products; see the official report:
https://security.googleblog.com/2023/06/learnings-from-kctf-vrps-42-linux.html
It is a greater source of vulnerabilities and a headache for us, hosting data for others.

So, is it correct for Proxmox to use io_uring as the default storage configuration?
Wouldn't it be better to go back to threads or native?
If someone wants to choose io_uring, fine, but don't make it the default.
 
We took an extensive look at aio=native vs. aio=iouring. Our guidance and bias are towards aio=native for O_DIRECT raw block storage.
If you are running on top of local storage that can block inline of I/O submission (i.e., ZFS, XFS, etc.), io_uring is technically better because it will not block. However, you are not the first person to see consistency issues with io_uring.
If you can accept occasional latency spikes, aio=native might be a more conservative option that is also faster than aio=threads. If not, consider using aio=native with underlying storage that can't block.
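If you want to follow this advice, the aio mode can be set per virtual disk. A minimal sketch, assuming a hypothetical VM with ID 100 and a volume on a storage named local-lvm (both placeholders, not from this thread):

```shell
# Hypothetical example: set aio=native on the scsi0 disk of VM 100.
# Note that "qm set --scsi0 ..." replaces the whole disk line, so carry over
# any other options (cache, discard, iothread, ...) your disk already has.
qm set 100 --scsi0 local-lvm:vm-100-disk-0,aio=native
```

The same can be done from the GUI under the disk's advanced options; `aio` accepts `io_uring`, `native`, or `threads`.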

Here's more technical information on aio=native and aio=io_uring, along with a thorough performance comparison:
https://kb.blockbridge.com/technote/proxmox-aio-vs-iouring/

PS: don't confuse aio=threads with iothreads :)


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Hi. Thank you for the answer.
io_uring is not designed for OpenZFS (the default used by Proxmox) and brings no performance benefit there:
https://github.com/openzfs/zfs/issues/8716

There are other problems pointed out in this email thread about io_uring and POSIX read/write concurrency guarantees:
https://lore.kernel.org/all/ff3be659-e054-88c3-7b4b-c511f679333d@nedprod.com/T/

Keeping this as the default for those who use OpenZFS is not the right thing for the Proxmox project. With OpenZFS there is no benefit at all, and given reports like mine, going back to the default from 6.x would be better. I'm also considering the security angle from the Google report.
 
I can confirm that I also did not have a good experience with io_uring at the time it was introduced and became the default in Proxmox 7 ( https://bugzilla.proxmox.com/show_bug.cgi?id=1453#c23 ).

Because I also don't put enough trust into io_uring, we have configured all our VMs (qcow2 on top of ZFS datasets) with aio=threads, virtio-scsi-single, and iothread=1 since then. While I know this may not be optimal for performance, it has been running reliably for us, and I cannot remember us ever having any data corruption (10 TB of VM storage, ZFS datasets replicated with syncoid).
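For reference, a configuration like the one described above can be sketched as follows; the VM ID 100 and the storage name local-zfs are placeholders, not taken from the post:

```shell
# Hedged sketch of the described setup: aio=threads plus one dedicated IO
# thread per disk, with the single-queue virtio-scsi controller.
# VM ID, storage, and volume names are placeholders.
qm set 100 --scsihw virtio-scsi-single
qm set 100 --scsi0 local-zfs:vm-100-disk-0,aio=threads,iothread=1
```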
 
Hi,
Problems with security
Google, on the other hand, disabled io_uring in their products; see the official report:
https://security.googleblog.com/2023/06/learnings-from-kctf-vrps-42-linux.html
It is a greater source of vulnerabilities and a headache for us, hosting data for others.
don't have time to write an essay right now, but to avoid scaring people: with Proxmox VE, you can't just do arbitrary io_uring calls, except if you have CLI access as an @pam user. Otherwise, the VM's IO just goes through QEMU's io_uring calls which use liburing and you can't influence which calls are done, just write different data.

So, is it correct for Proxmox to use io_uring as the default storage configuration?
Wouldn't it be better to go back to threads or native?
If someone wants to choose io_uring, fine, but don't make it the default.
For almost all people it works fine and has better performance, and if you're not happy with it, you can opt out. It's true there were some issues at the time it was introduced, but most of those got addressed by continued development in newer kernels, and for the few storage configurations still affected, we actually do avoid the io_uring default: https://git.proxmox.com/?p=qemu-ser...f8c027d995f8cf0764aa6de512b9bb8;hb=HEAD#l1576
 
Observations
On my servers, I work with NVMe, ZFS, and Windows and Linux guest machines.
Especially on older Linux guests (like SUSE Linux Enterprise 12 SP4), I experienced three major problems:
1. After a VM backup, an IBM DB2 database got corrupted. Nobody was using it at the time.
2. During a backup from this server, I got corrupted data, which stopped the backup. I tried again and it ran fine to the end.
3. The I/O latency inside a VM sometimes goes very high, and the network stops for about one second.

I changed to aio=threads a few weeks ago, and there have been no more problems; everything was solved.
Maybe io_uring is designed for the in-kernel Linux file systems; ZFS is not in its scope.

Problems with security
Google, on the other hand, disabled io_uring in their products; see the official report:
https://security.googleblog.com/2023/06/learnings-from-kctf-vrps-42-linux.html
It is a greater source of vulnerabilities and a headache for us, hosting data for others.

So, is it correct for Proxmox to use io_uring as the default storage configuration?
Wouldn't it be better to go back to threads or native?
If someone wants to choose io_uring, fine, but don't make it the default.
To be sure: is this with ZFS as block storage (zvols), not .raw or .qcow2 files on top of a ZFS filesystem?
 
Hi,

don't have time to write an essay right now, but to avoid scaring people: with Proxmox VE, you can't just do arbitrary io_uring calls, except if you have CLI access as an @pam user. Otherwise, the VM's IO just goes through QEMU's io_uring calls which use liburing and you can't influence which calls are done, just write different data.


For almost all people it works fine and has better performance, and if you're not happy with it, you can opt out. It's true there were some issues at the time it was introduced, but most of those got addressed by continued development in newer kernels, and for the few storage configurations still affected, we actually do avoid the io_uring default: https://git.proxmox.com/?p=qemu-ser...f8c027d995f8cf0764aa6de512b9bb8;hb=HEAD#l1576
Hello. Thank you for the answer.
I'm just asking not to make it the default in all configurations:
  • It is not designed for use with OpenZFS.
  • It has been linked to data loss.
  • It has security problems.
Google is more radical, removing it from their kernels.
If we know it could cause problems, it is not rational to make it the default.
I'm not being radical, but I think the choice to make it the default may have been premature, even though companies like Red Hat promote its use in production. Whether the Proxmox project accepts or rejects this suggestion is another thing I can't control.

Kind regards.
 
Hello. Thank you for the answer.
I'm just asking not to make it the default in all configurations:
  • It is not designed for use with OpenZFS.
  • It has been linked to data loss.
  • It has security problems.
Google is more radical, removing it from their kernels.

If we know it could cause problems, it is not rational to make it the default.
I already wrote why the security issues faced by Google are not really relevant here. And I'm not seeing many people complaining about the default, for most it works perfectly fine and gives better performance. If you have a concrete bug about the data loss issue, please report it!

Hi. Thank you for the answer.
io_uring is not designed for OpenZFS (the default used by Proxmox) and brings no performance benefit there:
https://github.com/openzfs/zfs/issues/8716
You can use io_uring with underlying ZFS storage perfectly fine. AFAIU, that request is about leveraging more of the io_uring interface within ZFS itself.
 
I already wrote why the security issues faced by Google are not really relevant here. And I'm not seeing many people complaining about the default, for most it works perfectly fine and gives better performance. If you have a concrete bug about the data loss issue, please report it!


You can use io_uring with underlying ZFS storage perfectly fine. AFAIU, that request is about leveraging more of the io_uring interface within ZFS itself.
Meltdown and Spectre drew no complaints from most users either, and they were a big problem.
But the discussion here is about the problems pointed out in this thread.

The fact that OpenZFS has a tool to test io_uring that reports no bugs doesn't settle it; not all conditions were tested (for example, VM -> backup -> OpenZFS through the io_uring interface).
In one of my problems (a checksum error), systemd-journald broke and zpool status showed a checksum error count of 1. I rebooted the server, ran a zfs scrub, and the data validated (no problems found). smartctl showed no disk errors or violations either. I tested the ECC memory, and there was nothing there too. Temperature was 17 degrees. In another episode, the backup failed with a corrupted-data message. That time I did not reboot; I repeated the process and it completed fine.

The concerns about io_uring are registered here. Since we are talking about the data layer, this is relevant. This kind of thing is very dangerous.
I want to alert people like me who use Proxmox, since the project doesn't take any responsibility for lost data.
The intention is a better experience, and for users to know about the io_uring risks.
I and others have reported this. If the staff members ignore it, no problem.

Best regards.
 
I don't get your point about Spectre/Meltdown, because they are old and most people are aware of them.

Bash:
./spectre-meltdown-checker.sh
Spectre and Meltdown mitigation detection tool v0.45

Checking for vulnerabilities on current system
Kernel is Linux 5.15.107-2-pve #1 SMP PVE 5.15.107-2 (2023-05-10T09:10Z) x86_64
CPU is AMD Ryzen 5 3600 6-Core Processor

Hardware check
* Hardware support (CPU microcode) for mitigation techniques
  * Indirect Branch Restricted Speculation (IBRS)
    * SPEC_CTRL MSR is available:  YES
    * CPU indicates IBRS capability:  NO
    * CPU indicates preferring IBRS always-on:  NO
    * CPU indicates preferring IBRS over retpoline:  YES
  * Indirect Branch Prediction Barrier (IBPB)
    * CPU indicates IBPB capability:  YES  (IBPB_SUPPORT feature bit)
  * Single Thread Indirect Branch Predictors (STIBP)
    * SPEC_CTRL MSR is available:  YES
    * CPU indicates STIBP capability:  YES  (AMD STIBP feature bit)
    * CPU indicates preferring STIBP always-on:  NO
  * Speculative Store Bypass Disable (SSBD)
    * CPU indicates SSBD capability:  YES  (AMD SSBD in SPEC_CTRL)
  * L1 data cache invalidation
    * CPU indicates L1D flush capability:  NO
  * CPU supports Transactional Synchronization Extensions (TSX):  NO
  * CPU supports Software Guard Extensions (SGX):  NO
  * CPU supports Special Register Buffer Data Sampling (SRBDS):  NO
  * CPU microcode is known to cause stability problems:  NO  (family 0x17 model 0x71 stepping 0x0 ucode 0x8701021 cpuid 0x870f10)
  * CPU microcode is the latest known available version:  YES  (latest version is 0x8701021 dated 2020/01/25 according to builtin firmwares DB v222+i20220208)
* CPU vulnerability to the speculative execution attack variants
  * Affected by CVE-2017-5753 (Spectre Variant 1, bounds check bypass):  YES
  * Affected by CVE-2017-5715 (Spectre Variant 2, branch target injection):  YES
  * Affected by CVE-2017-5754 (Variant 3, Meltdown, rogue data cache load):  NO
  * Affected by CVE-2018-3640 (Variant 3a, rogue system register read):  NO
  * Affected by CVE-2018-3639 (Variant 4, speculative store bypass):  YES
  * Affected by CVE-2018-3615 (Foreshadow (SGX), L1 terminal fault):  NO
  * Affected by CVE-2018-3620 (Foreshadow-NG (OS), L1 terminal fault):  NO
  * Affected by CVE-2018-3646 (Foreshadow-NG (VMM), L1 terminal fault):  NO
  * Affected by CVE-2018-12126 (Fallout, microarchitectural store buffer data sampling (MSBDS)):  NO
  * Affected by CVE-2018-12130 (ZombieLoad, microarchitectural fill buffer data sampling (MFBDS)):  NO
  * Affected by CVE-2018-12127 (RIDL, microarchitectural load port data sampling (MLPDS)):  NO
  * Affected by CVE-2019-11091 (RIDL, microarchitectural data sampling uncacheable memory (MDSUM)):  NO
  * Affected by CVE-2019-11135 (ZombieLoad V2, TSX Asynchronous Abort (TAA)):  NO
  * Affected by CVE-2018-12207 (No eXcuses, iTLB Multihit, machine check exception on page size changes (MCEPSC)):  NO
  * Affected by CVE-2020-0543 (Special Register Buffer Data Sampling (SRBDS)):  NO

CVE-2017-5753 aka 'Spectre Variant 1, bounds check bypass'
* Mitigated according to the /sys interface:  YES  (Mitigation: usercopy/swapgs barriers and __user pointer sanitization)
* Kernel has array_index_mask_nospec:  YES  (1 occurrence(s) found of x86 64 bits array_index_mask_nospec())
* Kernel has the Red Hat/Ubuntu patch:  NO
* Kernel has mask_nospec64 (arm64):  NO
* Kernel has array_index_nospec (arm64):  NO
> STATUS:  NOT VULNERABLE  (Mitigation: usercopy/swapgs barriers and __user pointer sanitization)

CVE-2017-5715 aka 'Spectre Variant 2, branch target injection'
* Mitigated according to the /sys interface:  YES  (Mitigation: Retpolines, IBPB: conditional, STIBP: disabled, RSB filling, PBRSB-eIBRS: Not affected)
* Mitigation 1
  * Kernel is compiled with IBRS support:  YES
    * IBRS enabled and active:  NO
  * Kernel is compiled with IBPB support:  YES
    * IBPB enabled and active:  YES
* Mitigation 2
  * Kernel has branch predictor hardening (arm):  NO
  * Kernel compiled with retpoline option:  YES
    * Kernel compiled with a retpoline-aware compiler:  YES  (kernel reports full retpoline compilation)
> STATUS:  NOT VULNERABLE  (Full retpoline + IBPB are mitigating the vulnerability)

CVE-2017-5754 aka 'Variant 3, Meltdown, rogue data cache load'
* Mitigated according to the /sys interface:  YES  (Not affected)
* Kernel supports Page Table Isolation (PTI):  YES
  * PTI enabled and active:  NO
  * Reduced performance impact of PTI:  NO  (PCID/INVPCID not supported, performance impact of PTI will be significant)
* Running as a Xen PV DomU:  NO
> STATUS:  NOT VULNERABLE  (your CPU vendor reported your CPU model as not affected)

CVE-2018-3640 aka 'Variant 3a, rogue system register read'
* CPU microcode mitigates the vulnerability:  YES
> STATUS:  NOT VULNERABLE  (your CPU vendor reported your CPU model as not affected)

CVE-2018-3639 aka 'Variant 4, speculative store bypass'
* Mitigated according to the /sys interface:  YES  (Mitigation: Speculative Store Bypass disabled via prctl and seccomp)
* Kernel supports disabling speculative store bypass (SSB):  YES  (found in /proc/self/status)
* SSB mitigation is enabled and active:  YES  (per-thread through prctl)
* SSB mitigation currently active for selected processes:  YES  (chronyd haveged systemd-journald systemd-logind udevadm)
> STATUS:  NOT VULNERABLE  (Mitigation: Speculative Store Bypass disabled via prctl and seccomp)

CVE-2018-3615 aka 'Foreshadow (SGX), L1 terminal fault'
* CPU microcode mitigates the vulnerability:  N/A
> STATUS:  NOT VULNERABLE  (your CPU vendor reported your CPU model as not affected)

CVE-2018-3620 aka 'Foreshadow-NG (OS), L1 terminal fault'
* Mitigated according to the /sys interface:  YES  (Not affected)
* Kernel supports PTE inversion:  YES  (found in kernel image)
* PTE inversion enabled and active:  NO
> STATUS:  NOT VULNERABLE  (your CPU vendor reported your CPU model as not affected)

CVE-2018-3646 aka 'Foreshadow-NG (VMM), L1 terminal fault'
* Information from the /sys interface: Not affected
* This system is a host running a hypervisor:  YES
* Mitigation 1 (KVM)
  * EPT is disabled:  N/A  (the kvm_intel module is not loaded)
* Mitigation 2
  * L1D flush is supported by kernel:  YES  (found flush_l1d in kernel image)
  * L1D flush enabled:  NO
  * Hardware-backed L1D flush supported:  NO  (flush will be done in software, this is slower)
  * Hyper-Threading (SMT) is enabled:  NO
> STATUS:  NOT VULNERABLE  (your CPU vendor reported your CPU model as not affected)

CVE-2018-12126 aka 'Fallout, microarchitectural store buffer data sampling (MSBDS)'
* Mitigated according to the /sys interface:  YES  (Not affected)
* Kernel supports using MD_CLEAR mitigation:  YES  (found md_clear implementation evidence in kernel image)
* Kernel mitigation is enabled and active:  NO
* SMT is either mitigated or disabled:  NO
> STATUS:  NOT VULNERABLE  (your CPU vendor reported your CPU model as not affected)

CVE-2018-12130 aka 'ZombieLoad, microarchitectural fill buffer data sampling (MFBDS)'
* Mitigated according to the /sys interface:  YES  (Not affected)
* Kernel supports using MD_CLEAR mitigation:  YES  (found md_clear implementation evidence in kernel image)
* Kernel mitigation is enabled and active:  NO
* SMT is either mitigated or disabled:  NO
> STATUS:  NOT VULNERABLE  (your CPU vendor reported your CPU model as not affected)

CVE-2018-12127 aka 'RIDL, microarchitectural load port data sampling (MLPDS)'
* Mitigated according to the /sys interface:  YES  (Not affected)
* Kernel supports using MD_CLEAR mitigation:  YES  (found md_clear implementation evidence in kernel image)
* Kernel mitigation is enabled and active:  NO
* SMT is either mitigated or disabled:  NO
> STATUS:  NOT VULNERABLE  (your CPU vendor reported your CPU model as not affected)

CVE-2019-11091 aka 'RIDL, microarchitectural data sampling uncacheable memory (MDSUM)'
* Mitigated according to the /sys interface:  YES  (Not affected)
* Kernel supports using MD_CLEAR mitigation:  YES  (found md_clear implementation evidence in kernel image)
* Kernel mitigation is enabled and active:  NO
* SMT is either mitigated or disabled:  NO
> STATUS:  NOT VULNERABLE  (your CPU vendor reported your CPU model as not affected)

CVE-2019-11135 aka 'ZombieLoad V2, TSX Asynchronous Abort (TAA)'
* Mitigated according to the /sys interface:  YES  (Not affected)
* TAA mitigation is supported by kernel:  YES  (found tsx_async_abort in kernel image)
* TAA mitigation enabled and active:  NO
> STATUS:  NOT VULNERABLE  (your CPU vendor reported your CPU model as not affected)

CVE-2018-12207 aka 'No eXcuses, iTLB Multihit, machine check exception on page size changes (MCEPSC)'
* Mitigated according to the /sys interface:  YES  (Not affected)
* This system is a host running a hypervisor:  YES
* iTLB Multihit mitigation is supported by kernel:  YES  (found itlb_multihit in kernel image)
* iTLB Multihit mitigation enabled and active:  NO
> STATUS:  NOT VULNERABLE  (your CPU vendor reported your CPU model as not affected)

CVE-2020-0543 aka 'Special Register Buffer Data Sampling (SRBDS)'
* Mitigated according to the /sys interface:  YES  (Not affected)
* SRBDS mitigation control is supported by the kernel:  YES  (found SRBDS implementation evidence in kernel image. Your kernel is up to date for SRBDS mitigation)
* SRBDS mitigation control is enabled and active:  NO
> STATUS:  NOT VULNERABLE  (your CPU vendor reported your CPU model as not affected)

> SUMMARY: CVE-2017-5753:OK CVE-2017-5715:OK CVE-2017-5754:OK CVE-2018-3640:OK CVE-2018-3639:OK CVE-2018-3615:OK CVE-2018-3620:OK CVE-2018-3646:OK CVE-2018-12126:OK CVE-2018-12130:OK CVE-2018-12127:OK CVE-2019-11091:OK CVE-2019-11135:OK CVE-2018-12207:OK CVE-2020-0543:OK

Need more detailed information about mitigation options? Use --explain
A false sense of security is worse than no security at all, see --disclaimer

As you can see, the Proxmox kernel applies all measures to mitigate the known problems. So there is no point anymore.

I also don't want to join this whataboutism debate between the Google security announcement and the real effect on Proxmox VE operation.

I just want to say that I run ZFS with io_uring and I can't confirm your integrity issues. I also run with ECC memory and much hotter temperatures (at the moment) and experience no issues at all. I think something is going wrong in your rig.

If this whole drama affected you seriously, I would expect you to hold a 25/8 premium subscription, and I'd recommend you consult the support team ASAP. Otherwise, as @fiona already said, please stop making a drama.

I mean, everybody should have their own threat model, which can now be used to evaluate the amount of drama warranted by the information the Google Security Team provided.

In my case, I don't care, because my setup is far enough away from the internet and I am not interesting enough for anybody to care about it.
 
I don't get your point about Spectre/Meltdown, because they are old and most people are aware of them.

As you can see, the Proxmox kernel applies all measures to mitigate the known problems. So there is no point anymore.

I just want to say that I run ZFS with io_uring and I can't confirm your integrity issues. I also run with ECC memory and much hotter temperatures (at the moment) and experience no issues at all. I think something is going wrong in your rig.

If this whole drama affected you seriously, I would expect you to hold a premium subscription and consult the support team.
Hi.
Don't take this personally.
I just mentioned Meltdown as an example of a problem that most people didn't know about.
The fact that your scenario works doesn't mean there are no bugs.
Again, that is not an adequate response; it doesn't change the io_uring problems other people have had.
If you feel offended in some way by this "drama", just ignore it and don't waste time.

Kind regards.
 
I would recommend you take a walk outside and let off some steam. I don't like it when people see themselves as more important than the rest of our more than 8 billion fellow humans.

Nobody forces you to use io_uring in your setup. Change the default yourself and be happy.
 
I would recommend you take a walk outside and let off some steam. I don't like it when people see themselves as more important than the rest of our more than 8 billion fellow humans.

Nobody forces you to use io_uring in your setup. Change the default yourself and be happy.
Don't be emotional. This forum is technical.

Kind regards.
 
Hi,

don't have time to write an essay right now, but to avoid scaring people: with Proxmox VE, you can't just do arbitrary io_uring calls, except if you have CLI access as an @pam user. Otherwise, the VM's IO just goes through QEMU's io_uring calls which use liburing and you can't influence which calls are done, just write different data.


For almost all people it works fine and has better performance, and if you're not happy with it, you can opt out. It's true there were some issues at the time it was introduced, but most of those got addressed by continued development in newer kernels, and for the few storage configurations still affected, we actually do avoid the io_uring default: https://git.proxmox.com/?p=qemu-ser...f8c027d995f8cf0764aa6de512b9bb8;hb=HEAD#l1576
For clarification: is it safe to use io_uring on an LVM-thin raw volume with no cache and IO threads?
Is it only regular LVM volumes that can sometimes hang with io_uring and no cache?
 
Hi,
For clarification: is it safe to use io_uring on an LVM-thin raw volume with no cache and IO threads?
Is it only regular LVM volumes that can sometimes hang with io_uring and no cache?
Yes, those problems were limited to regular LVM. LVM-Thin is the default for Proxmox VE installations, and io_uring is the default for LVM-thin disks (and for all storages except those that had problems), so if there were any issues, I'd expect quite a few reports, and there aren't.
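If you want to check which of your VMs explicitly override the aio setting, one way is to search the VM configs; this is an illustrative sketch, not from the thread (the path is the standard Proxmox VE config location):

```shell
# List every disk line that sets an explicit aio= mode across all VM configs.
# /etc/pve/qemu-server/ is the standard Proxmox VE location for VM configs;
# disks without an explicit aio= use the storage-dependent default.
grep -H 'aio=' /etc/pve/qemu-server/*.conf
```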
 
