Any chance of newer ZFS and LVM packages in PVE 3.4?

gkovacs

Currently Proxmox VE 3.4 includes the following LVM and ZFS versions:
lvm2: 2.02.98-pve4
zfs: 0.6.3-3~wheezy
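
(For reference, roughly how these versions can be checked on a stock 3.4 box; pveversion may not list every storage-related package, so dpkg-query is the fallback:)

    # Proxmox's own version summary usually includes lvm2 and the zfs packages
    pveversion -v | grep -E 'lvm2|zfs'
    # Fallback: query dpkg directly for anything pveversion does not show
    dpkg-query -W -f='${Package} ${Version}\n' lvm2 'zfs*' 2>/dev/null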

Proxmox VE 4.1 ships newer versions of both, which bring many bugfixes (especially in ZFS 0.6.5) and new features (like SSD caching in LVM2).

- Is it possible to backport these newer versions to Proxmox VE 3.4?
- Does the Proxmox team plan to do that?
- Is PVE 3.4 going to be supported with package updates at all?
 
We still do security fixes, and important bug fixes. But we will not add new features to 3.4.
 
We still do security fixes, and important bug fixes. But we will not add new features to 3.4.

Well, the current ZFS release 0.6.5.3 contains hundreds of important bugfixes since 0.6.3 (among them fixes for many deadlocks and kernel panics), so I really hope you will find the time to include it in your upcoming PVE 3.4 fixes...
https://github.com/zfsonlinux/zfs/releases

Also, shipping identical ZFS versions would help us (together with the ZFS on Linux developers) diagnose a serious data corruption bug that is probably caused by the 4.2 kernel and might be corrupting data for many users:
https://github.com/zfsonlinux/zfs/issues/3990

In this post Brian Behlendorf (lead ZoL dev) specifically asks us to test this issue with an older kernel, so it would be great if you could help us find this bug!
https://github.com/zfsonlinux/zfs/issues/3990#issuecomment-163740395
 
@gkovacs

I read the zfsonlinux thread you mention. Did you have the Linux swap file on a ZVOL when you had those data corruptions? Did you have any KVM machines running with direct hardware access?

Also, I came across this guy, who is doing some heavy testing of ZoL on a large NUMA system and contributing to improvements in ZoL: http://blog.servercentral.com/zfs-thangs
Maybe he could test for data corruptions.

Regarding the memory bank issue: that is interesting. It may be that with four DIMM sockets occupied the system splits the memory into two zones bound to specific cores of the CPU, whereas with just two DIMM sockets occupied all cores have to contend for the same memory in one SMP zone. This should be visible in numactl --hardware.

This might also affect DMA mappings, but I do not understand why that should affect LVM differently than ZFS. Or would it be because ZFS uses more RAM than LVM, thereby thrashing some DMA area under memory pressure? This is pure speculation, and maybe on high-RAM systems that never come close to swapping it is not visible. Or newer kernel features like transparent hugepages, and the resulting compaction/migration of pages caused by the RAM usage of the ARC, are involved. As the swap-on-zvol issues demonstrate, I think there is some negative impact on stability from either the newer ZFS on Linux code, the kernel, or both.
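
(For anyone who wants to check this on their own hardware, roughly; dmidecode is just one extra way to see the DIMM population, and the output format differs between boards:)

    # Show NUMA nodes; a single-socket board normally reports only one node
    numactl --hardware
    # Cross-check which DIMM slots are actually populated and at what speed
    dmidecode --type memory | grep -E 'Locator|Size|Speed'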
 
@gkovacs

I read the zfsonlinux thread you mention. Did you have the Linux swap file on a ZVOL when you had those data corruptions? Did you have any KVM machines running with direct hardware access?

Yes, I did have swap on ZFS (not sure if it was a ZVOL; it was a default Proxmox ZFS RAID10 install). I did not have ANY virtual machines running; I only restored containers and VMs to test for data corruption. Also, swap was rarely (if ever) used, as no guests were running.
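
(For what it's worth, this is roughly how to check whether swap sits on a ZVOL on such an install; rpool/swap is the name the default installer tends to use, so treat that as an assumption:)

    # Active swap devices; on a default Proxmox ZFS install this is typically
    # a zvol exposed as /dev/zdN
    cat /proc/swaps
    # List zvols; if something like rpool/swap matches the device above,
    # swap is indeed on a ZVOL
    zfs list -t volume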

Also, I came across this guy, who is doing some heavy testing of ZoL on a large NUMA system and contributing to improvements in ZoL: http://blog.servercentral.com/zfs-thangs
Maybe he could test for data corruptions.

Regarding the memory bank issue: that is interesting. It may be that with four DIMM sockets occupied the system splits the memory into two zones bound to specific cores of the CPU, whereas with just two DIMM sockets occupied all cores have to contend for the same memory in one SMP zone. This should be visible in numactl --hardware.

I have checked numactl with all the different memory configurations, but I only ever had a single zone (zone 0), never more. Since it's a single-socket system, that's what I was expecting.

This might also affect DMA mappings, but I do not understand why that should affect LVM differently than ZFS. Or would it be because ZFS uses more RAM than LVM, thereby thrashing some DMA area under memory pressure? This is pure speculation, and maybe on high-RAM systems that never come close to swapping it is not visible. Or newer kernel features like transparent hugepages, and the resulting compaction/migration of pages caused by the RAM usage of the ARC, are involved. As the swap-on-zvol issues demonstrate, I think there is some negative impact on stability from either the newer ZFS on Linux code, the kernel, or both.

We also experienced some very rare data corruption on ext4 / LVM, but only under MySQL (some InnoDB indexes got corrupted). However, MySQL has already demonstrated weird issues under LXC, so I'm not sure it's a related problem.

Maybe it's connected to the checksum computation somehow? Both ZFS and MySQL/InnoDB create checksums on their disk writes... then again, ext4 does as well on the journal, and we never had problems there.
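
(One way to at least have ZFS verify its own checksums end to end is a scrub; "rpool" is just the assumed pool name here:)

    # Re-read and checksum every block in the pool
    zpool scrub rpool
    # Watch progress and the per-device READ/WRITE/CKSUM error counters
    zpool status -v rpool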

Anyhow, I would really love to test the same ZFS version that is included in Proxmox 4.1 under Proxmox 3.4: if it works fine there, then this is most likely a 4.2 kernel issue (that would be my current bet). I hope that @dietmar and @tom sympathize with our intention of finding this bug and will help us out by creating a ZFS 0.6.5.3 package for PVE 3.4.
 
Hello @gkovacs

On PVE 3.4 you can install the package pve-kernel-2.6.32-42-pve (at least if you have the no-subscription repository; I didn't check the others). It gives you a 0.6.5.2-47_g7c033da ZFS kernel module.

With the different memory DIMM configurations, was your system always running in dual-channel mode? You really should take swap out of the equation: either run with no swap, or use a separate swap device that is not on ZFS.
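
Roughly like this, assuming the no-subscription repository is already configured (device names below are placeholders):

    # Install the newer 2.6.32 kernel that bundles the 0.6.5.x ZFS module
    apt-get update
    apt-get install pve-kernel-2.6.32-42-pve
    # Reboot into it, then confirm the loaded ZFS module version
    cat /sys/module/zfs/version

    # Take swap out of the equation: disable it entirely, or move it to a
    # non-ZFS device (e.g. a spare partition /dev/sdX1 -- placeholder name)
    swapoff -a
    # mkswap /dev/sdX1 && swapon /dev/sdX1   # and adjust /etc/fstab to match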
 
Hello @gkovacs

On PVE 3.4 you can install the package pve-kernel-2.6.32-42-pve (at least if you have the no-subscription repository; I didn't check the others). It gives you a 0.6.5.2-47_g7c033da ZFS kernel module.

With the different memory DIMM configurations, was your system always running in dual-channel mode? You really should take swap out of the equation: either run with no swap, or use a separate swap device that is not on ZFS.

Thank you for this tip. However, I sincerely hope that @dietmar follows this thread and will help us with the bug hunt by releasing the same ZFS packages for PVE 3.4 as are included in PVE 4.1.
 
We still do security fixes, and important bug fixes. But we will not add new features to 3.4.

According to the ZFS on Linux GitHub, they just released a new version, and it's compatible with 2.6.32 - 4.4 kernels.
https://github.com/zfsonlinux/zfs/releases/tag/zfs-0.6.5.4

Would it be possible to release it for both 3.4 and 4.1? If yes, then we could test the checksum / data corruption bug that's currently affecting the kernel in PVE 4.1 (on the same hardware where we found it before).
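
If that happens, we would record the exact versions on both test boxes so that only the kernel differs between the two runs; a minimal sketch:

    # Run on both the PVE 3.4 and the PVE 4.1 machine before testing
    uname -r
    cat /sys/module/zfs/version
    dpkg -l | grep -E 'zfs|spl'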
 
