Proxmox on XFS (MDADM RAID 10) + Guest on XFS

Hello,

Sorry if this is too noob a question; I'm just getting started with Proxmox. The goal is to deploy a Proxmox node that will in turn host Linux guests. The drive configuration is 4 x 1TB SSDs (local). I am used to XFS on bare-metal servers, so I am thinking of sticking with it.

Proxmox Host/Hypervisor: XFS on MDADM RAID 10

Should I be using XFS as the filesystem on the guests as well? Is that possible, or should I stick to ext4? Guest VMs will be primarily CentOS 7/8 VMs with cPanel.

Thanks
 
Keep in mind that mdraid isn't officially supported, and no one will test whether a Proxmox update will destroy your array. I'm also using mdraid, and over the last few weeks I've seen several people here with corrupted mdraid arrays. If you have enough RAM and don't need encryption, I would really recommend using ZFS as software RAID, or using hardware RAID.
 
Hi,

Proxmox does not support MD RAID; if you use it, you're mostly on your own. If you want a software RAID with actual safety, corruption and recovery checks in place, we recommend using ZFS.
 
I see. The only problem I have with ZFS is that I would be completely new to it, which makes me worried about things going wrong. As a rule of thumb, how much RAM overhead does ZFS usually require?
 
By default ZFS will use up to 50% of your RAM, but you can limit that. The rule of thumb is 4GB + 1GB per 1TB of raw storage capacity, or 4GB + 5GB per 1TB of raw capacity if you want to use deduplication. ZFS is great and powerful (self-healing filesystem, compression, deduplication, snapshots, no bit rot, reliable and so on), but you really need to be willing to learn, because it is a little bit "special". :D
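If RAM is tight, you can cap the ARC. A minimal sketch for a Proxmox/Debian host (the 8 GiB value is purely an illustration, not a recommendation for your box):

# /etc/modprobe.d/zfs.conf -- limit the ARC to 8 GiB (value is in bytes)
options zfs zfs_arc_max=8589934592

# apply the new limit immediately, without a reboot
echo 8589934592 > /sys/module/zfs/parameters/zfs_arc_max

# rebuild the initramfs so the module option is also applied at boot
update-initramfs -u -k all

The sysfs write takes effect right away; the modprobe option makes the limit persistent across reboots once the initramfs is rebuilt.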
 
Tell me more about the "special" bits; that is what is important :)
 
You should watch some ZFS introductions on YouTube. There is a lot of stuff that makes ZFS great and better than other filesystems, but also many traps where you need to configure/optimize something up front that can't be changed later without destroying the pool or virtual disks first (ashift and volblocksize, for example, are fixed once set).
And it doesn't work like your typical filesystem.

And there is a lot of "ZFS speak" you need to learn to understand it. For example (a few example commands follow after this list):
mirror = raid1
stripe = raid0
striped mirror = raid10
raidz1 = raid5
raidz2 = raid6
raidz3 = like raid5/raid6 but with 3 drives' worth of parity
pool = array
zvol = a "virtual" block device
dataset = a filesystem
vdev = a virtual device the pool is built from: a single physical HDD/SSD or a group of them (e.g. a mirror or raidz)
recordsize = the (maximum) block size of a dataset
volblocksize = the block size of a zvol
scrubbing = reading all data back and verifying checksums to find errors
ZIL = the intent log, an area on disk where sync writes are temporarily stored
SLOG = a dedicated device that holds the ZIL
ARC = read cache in RAM
L2ARC = read cache on disk
CoW = copy on write
ashift = the sector size (as a power of two) ZFS uses when writing to your vdevs
...
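To make a few of those terms concrete, here is a minimal sketch with made-up names (pool "tank", devices /dev/sda through /dev/sdd; ashift=12 assumes 4K-sector drives; in practice you would rather use /dev/disk/by-id paths):

# a striped mirror (raid10) pool built from four disks
zpool create -o ashift=12 tank mirror /dev/sda /dev/sdb mirror /dev/sdc /dev/sdd

# a dataset (filesystem) with compression enabled
zfs create -o compression=lz4 tank/data

# a 100G zvol (virtual block device) for a VM disk, with an 8K volblocksize
zfs create -V 100G -o volblocksize=8k tank/vmdisk

# scrub: read everything back and verify the checksums
zpool scrub tank
zpool status tank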
 
You can also check out our documentation regarding ZFS: https://pve.proxmox.com/pve-docs/chapter-sysadmin.html#chapter_zfs

In general I'd recommend a RAID10 for your setup; reserve about 8 to 10 GiB of RAM for it and you will be mostly golden.
You can do the setup either as a root pool, by setting up ZFS directly with the Proxmox VE ISO installer, or, if you have an additional disk for the root partition, set up plain XFS (or ext4) there and create the ZFS pool on the four 1TB disks afterwards using the web interface.

IMO, having 4 SSDs in a RAID10 avoids a few "optimization" issues completely: no need to add a ZFS log or other caching-like fast special device, as would be tempting if this were a 4-spinner-disk setup.
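If you end up creating the pool by hand instead of through the installer or web interface, registering it as VM storage from the CLI would look roughly like this (storage ID "tank-vm" and pool name "tank" are placeholders; double-check the options against the pvesm man page):

# register an existing ZFS pool as storage for VM disks and container volumes
pvesm add zfspool tank-vm -pool tank -content images,rootdir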
 
Thanks. I'm going to do some reading on this over the weekend. What is the recommended FS for the guest VMs in such a case? XFS?
 
Keep in mind that mdraid isn't officially supported, and no one will test whether a Proxmox update will destroy your array. I'm also using mdraid, and over the last few weeks I've seen several people here with corrupted mdraid arrays. If you have enough RAM and don't need encryption, I would really recommend using ZFS as software RAID, or using hardware RAID.

I have been using mdadm raid6/raid10 arrays for guest VPSes for almost 2 years in a heavily loaded environment.
The mdadm arrays are added to the VPSes as disks and formatted with XFS inside the guest OS.

The mdadm arrays are not visible in the Proxmox disks UI. How could a Proxmox update destroy the mdadm arrays, and why?
Can you point me to the cases where Proxmox destroyed data on mdadm arrays? Was LVM configured on top of the array in Proxmox?

By default ZFS will use up to 50% of your RAM, but you can limit that. The rule of thumb is 4GB + 1GB per 1TB of raw storage capacity, or 4GB + 5GB per 1TB of raw capacity if you want to use deduplication. ZFS is great and powerful (self-healing filesystem, compression, deduplication, snapshots, no bit rot, reliable and so on), but you really need to be willing to learn, because it is a little bit "special". :D

My servers have 32GB/64GB of RAM, used almost completely by the VPS applications.
The disk subsystem is 60x6TB = 360TB or 45x4TB = 180TB of raw space.

What are the benefits of ZFS for this configuration compared to XFS/mdadm? What would the memory requirement for ZFS be in this case?
 
I have been using mdadm raid6/raid10 arrays for guest VPSes for almost 2 years in a heavily loaded environment.
The mdadm arrays are added to the VPSes as disks and formatted with XFS inside the guest OS.

The mdadm arrays are not visible in the Proxmox disks UI. How could a Proxmox update destroy the mdadm arrays, and why?
Can you point me to the cases where Proxmox destroyed data on mdadm arrays? Was LVM configured on top of the array in Proxmox?
You can search the forums. In the last few weeks I saw 2 or 3 threads where the rootfs got corrupted, and in every case Proxmox was installed on an mdadm array.
I personally use mdraid too, and as far as I know mdraid is implemented in the Linux kernel, while Proxmox uses its own patched kernel based on Ubuntu's. So if Proxmox updates the kernel, it could be that this causes mdraid to break. Because it is not officially supported, no one will test whether a kernel update breaks mdraid before the release.
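If you do stay on mdraid, a quick sanity check after each kernel update costs nothing (md0 is just a placeholder array name):

# state of all md arrays known to the kernel
cat /proc/mdstat

# detailed health of a specific array
mdadm --detail /dev/md0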
My servers have 32GB/64GB of RAM, used almost completely by the VPS applications.
The disk subsystem is 60x6TB = 360TB or 45x4TB = 180TB of raw space.

What are the benefits of ZFS for this configuration compared to XFS/mdadm? What would the memory requirement for ZFS be in this case?
ZFS isn't just RAID. It's a complete, very advanced copy-on-write filesystem: bit rot protection, self-checking data integrity that can repair damaged data, block-level compression, deduplication, snapshots, replication, encryption, ...

The rule of thumb for ZFS is 4GB + 1GB per 1TB of raw capacity, or 4GB + 5GB per 1TB of raw capacity if you want to use deduplication. So with 360TB/180TB of raw capacity (roughly 364GB or 184GB of RAM by that rule), your 32/64GB of RAM might be a little bit on the low side. ;)
 
You can search the forums. In the last few weeks I saw 2 or 3 threads where the rootfs got corrupted, and in every case Proxmox was installed on an mdadm array.
I personally use mdraid too, and as far as I know mdraid is implemented in the Linux kernel, while Proxmox uses its own patched kernel based on Ubuntu's. So if Proxmox updates the kernel, it could be that this causes mdraid to break. Because it is not officially supported, no one will test whether a kernel update breaks mdraid before the release.

There are probably many more cases where the filesystem got corrupted and Proxmox was not installed on an mdadm array :)
Was the mdadm array really the root cause of the corruption?

Not sure whether this issue is still relevant: https://forum.proxmox.com/threads/proxmox-installation-dont-see-raid1.44465/

Usually I install Debian on an mdadm raid1 array on the system drives and configure several mdadm raid6/raid10 arrays for the guest OSes.
Why would Proxmox remove mdadm support from its Debian-based kernel?
 
There are probably many more cases where the filesystem got corrupted and Proxmox was not installed on an mdadm array :)
Was the mdadm array really the root cause of the corruption?
Not sure, but even the staff asked first whether mdraid was used... and yes, it was in those cases.
Usually I install Debian on an mdadm raid1 array on the system drives and configure several mdadm raid6/raid10 arrays for the guest OSes.
Why would Proxmox remove mdadm support from its Debian-based kernel?
If you install Proxmox on top of Debian, the Proxmox packages will replace the Debian kernel with their own customized kernel; it's also recommended to delete the original Debian kernel afterwards. It's not about "removing" mdadm from the kernel, but the kernel gets patched to support ZFS and other things, and that might create bugs and problems, and no one will care about them because it is not supported.
So mdadm will most likely work, but don't rely on it if you really care about your data or need reliability.
 
Would Proxmox still be able to break md RAID if you only pass through LVM volumes created on that array? I've seen threads on using md RAID with dm-integrity to come, slightly, closer to ZFS's capabilities.

Would it, for example, be possible (feasible or not) to initialise the HDDs, create the integrity & RAID layers, and create the LVM volumes (which I want in Proxmox) on another PC, then install Proxmox on an M.2 drive in the server, attach the storage, and pass through the individual volumes as needed? I'm the only user and don't need many volumes (max. 5), so doing it manually doesn't matter to me (ZFS will be an option in 3 years).
 
Of course it is possible; it's Linux, so (almost) anything can be done. For example, you can also just install Debian on mdadm and install PVE on top afterwards.
I loved mdadm and LVM for decades, then discovered ZFS and never wanted to go back.
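Purely as a sketch of the manual route you describe (vg0, vm_disk1 and VM ID 100 are made-up names; check the qm syntax against the current docs before relying on it):

# assemble the pre-built md arrays and activate the LVM volume group
mdadm --assemble --scan
vgchange -ay vg0

# attach an existing logical volume to VM 100 as an extra SCSI disk
qm set 100 -scsi1 /dev/vg0/vm_disk1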
 
Thanks for your response!
Unfortunately I couldn't get Debian (without a DE) to work on my i3-10100 + Nvidia FX1800. The screen turns off after boot, while the keyboard (I can still reboot), mouse & SSH remain accessible. I saw various threads about backporting kernel 5.14 for 10th-gen Intel but couldn't make it work smoothly. Guess I found one thing Linux can't do, or more accurately, one thing a semi-amateur can't do in Linux ;)

About an hour before you replied, I pulled the plug and started over with Ubuntu Server + LVM; everything has worked smoothly out of the box so far. Except that I don't like netplan, and I can't access the PVE web UI since none of my network changes in netplan (or virt-net) seem to be persistent (it stays on the 192.168.122.x subnet while my private net is 10.0.0.x).

What kind of hardware are you using with ZFS? What made you decide to move? How does your power usage compare to mdadm+lvm?

ZFS is on my wishlist, after some badly needed hardware upgrades (& once vdev expansion is introduced).
 
What kind of hardware are you using with ZFS?

The smallest setup was a Pi 2, but that was more of a proof of concept; it was very slow. The smallest setup currently in production is an Atom with 16GB RAM and two 240 GB enterprise SSDs in a mirror. All other machines are at least workstations, so dual Xeon and at least 64GB of RAM.
I also use ZFS on my iMac, but only as data storage that can be synced to my beefier machines.

What made you decide to move?
I always try out new things, and ZFS blew my mind with how easy a lot of things are compared to mdadm/LVM. I gradually upgraded everything to ZFS so I could benefit from it everywhere: e.g. shipping secondary backups from the backup servers off-site is now very fast thanks to the incremental sends.

How does your power usage compare to mdadm+lvm?
I compared a lot of different aspects, but I never compared that.
 
