Newbie Questions: How do I start?

ylavi · Oct 11, 2022

I'm not sure yet exactly what questions to ask... The most fundamental is - are there any sites (including pay sites) which have good tutorials (ideally video, or alternatively written) on setting up and working with PVE environments? Or do I learn from blogs and the Q&A sites such as these forums?

I've been using Linux for years (almost exclusively Ubuntu), on the desktop and running a home server - a bare-metal 32-bit install which is now getting too old. It's time to start again with virtualisation.

I haven't yet done the (ideally first and final) PVE installation. I'd appreciate feedback on my plans so I know what to do while installing. Hopefully that will help me understand what other questions I should ask.

My hardware will be something like the following (if it all works together and passes my testing):

Dell T410 - Xeon E5504, 32GB RAM, Dell H310 in IT mode, 400GB SAS SSD (STEC), 600GB SAS HDD (Dell/Seagate 15K), 3TB SAS HDD (Dell/Seagate 7.2K), SuperMicro I350-T2 NIC (in addition to two on-board Ethernet ports)
Gigabyte consumer motherboard with early generation i3, 16GB RAM, 750GB SATA HDD, 1TB SATA HDD, Intel I340-T4 NIC (in addition to one on-board Ethernet port)

I'm not concerned about the current storage space; my current server has a few tens of GB of data. There will be room for expansion here, and I didn't invest heavily in this hardware; I want to see how things go for a while and then I'll decide whether to upgrade or replace.

All the drives are intended to be used independently. Certainly I wouldn't risk making volumes which span devices, and these drives are not appropriate for mirroring (largely because I don't intend to). I am concerned about what was stated in another thread "Its no problem to use the same disk for OS + VM /LXC storage. Bigger problem would be that you can't mirror your storage, so everything is lost when your disk fails (and it will fail...the question is just when and not if)." but my idea is to have the less powerful machine available (and updated frequently) as a backup for the main one at times when the latter is out of action, whether planned or unplanned.

What do I need to know about setting up PVE so that I'd be able to (1) move VMs between hosts (to prepare for scheduled downtime) and (2) back up VMs between hosts (to be ready for unscheduled downtime). Do I need PVE clustering? I should have several gigabit ethernet ports available on each host - how many do I connect directly between the two hosts, and for what purposes?

Probably my core VM will be Nethserver, on which will run NextCloud. As mentioned above we don't have a lot of data, and it's mostly accessed as a regular file system from Windows and Linux. In discussions on the Nethserver forum I did see comments amounting to "put your data directly on ZFS on Proxmox and then make it available to NextCloud but make sure you have ECC RAM and enough of it" - I should be OK with the latter now on the Dell server although perhaps it would stretch the other machine.

Where do I learn best about ZFS? I don't want to find myself with issues and not have a clue what's going on. Not that I know anything about LVM yet... Would I be better off avoiding ZFS altogether? I noticed mentions of qcow2 on top of ZFS over here. Does the strategy above (data in storage managed and shared directly by Proxmox) make sense, especially if I want to have a fallback server available? I imagine I should have a storage appliance VM and back that up to the other machine (or perhaps have a second one on the other machine and sync the data between them, although then I'd need a plan for switching to the backup).

I'll start with that and will welcome and appreciate your comments and advice. Thanks!

LnxBil · Oct 12, 2022

Only a few quick answers:

ylavi said:
are there any sites (including pay sites) which have good tutorials (ideally video, or alternatively written) on setting up and working with PVE environments? Or do I learn from blogs and the Q&A sites such as these forums?

What about a book? PVE is based on Debian with an Ubuntu kernel, so you'll feel at home if you come from Ubuntu.

My approach is always to play around with the system and try to understand it, so go ahead and install it inside your current virtualization product (e.g. VirtualBox) and play around. I got started exactly like that.

ylavi said:
Do I need PVE clustering?

You would need 3 devices (2 servers + 1 QDevice), but I don't see why you should do that.

You could do replication from one system to the other, and you could create regular backups to third-party storage, from which you can then restore on the other machine.

Dunuin · Oct 12, 2022

ylavi said:
I'm not sure yet exactly what questions to ask... The most fundamental is - are there any sites (including pay sites) which have good tutorials (ideally video, or alternatively written) on setting up and working with PVE environments? Or do I learn from blogs and the Q&A sites such as these forums?

There are a lot of YouTubers doing PVE tutorials all the time. But I would also recommend reading the full PVE documentation. And you can use the advanced forum search to search for threads tagged as "Tutorial".

ylavi said:
My hardware will be something like the following (if it all works together and passes my testing):

Dell T410 - Xeon E5504, 32GB RAM, Dell H310 in IT mode, 400GB SAS SSD (STEC), 600GB SAS HDD (Dell/Seagate 15K), 3TB SAS HDD (Dell/Seagate 7.2K), SuperMicro I350-T2 NIC (in addition to two on-board Ethernet ports)

Gigabyte consumer motherboard with early generation i3, 16GB RAM, 750GB SATA HDD, 1TB SATA HDD, Intel I340-T4 NIC (in addition to one on-board Ethernet port)

I'm not concerned about the current storage space; my current server has a few tens of GB of data. There will be room for expansion here, and I didn't invest heavily in this hardware; I want to see how things go for a while and then I'll decide whether to upgrade or replace.

All the drives are intended to be used independently. Certainly I wouldn't risk making volumes which span devices, and these drives are not appropriate for mirroring (largely because I don't intend to). I am concerned about what was stated in another thread "Its no problem to use the same disk for OS + VM /LXC storage. Bigger problem would be that you can't mirror your storage, so everything is lost when your disk fails (and it will fail...the question is just when and not if)." but my idea is to have the less powerful machine available (and updated frequently) as a backup for the main one at times when the latter is out of action, whether planned or unplanned.

There are 3 problems with relying on backups without mirroring.
1.) Downtime. Your server will crash and services will be offline until you fix it. You will have to decide whether that's a problem or not.
2.) Data loss. You will lose all data written since your last backup. Even with daily backups you will lose everything you did that day. Again, you will have to decide whether you care about losing a day of work or not.
3.) Silent data corruption over time (bit rot). When not using a filesystem like ZFS that supports checksumming and bit rot protection, you will never know whether your data got corrupted or not. Data will degrade more or less over time. Let's say you store 100,000 home photos on it and 10 of them get corrupted each year. You don't regularly open those pictures to verify that they are still viewable and not just a pixel mess. Maybe you want to view a picture again in 5 years and then realize that this picture got silently corrupted. You then check all your backups, but you only have 3 years of backups and that picture got corrupted 4 years ago, so all your backups just contain the corrupted, unusable image. But you never noticed it, because the picture was still there with the same name and size.
With ZFS this wouldn't happen. ZFS checks all data at regular intervals (scrubs), identifies corrupted data and automatically fixes the corruption. But to be able to fix it, ZFS needs redundant data (a second copy or parity), so you need at least a mirror.
So in short: You can't really rely on backups if you don't know whether the data you are about to back up is still healthy or not.
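
To make the checksum point a bit more concrete, here is a minimal Python sketch. It is only an illustration of the idea, not how ZFS is actually implemented: SHA-256 stands in for the block checksum, two in-memory byte arrays stand in for the two disks of a mirror, and the names `checksum` and `read_with_repair` are made up for this example.

```python
import hashlib


def checksum(block: bytes) -> str:
    """SHA-256 digest, standing in for a per-block filesystem checksum."""
    return hashlib.sha256(block).hexdigest()


def read_with_repair(copies: list, expected: str) -> bytes:
    """Return the first copy that matches the checksum and heal the others."""
    good = next((bytes(c) for c in copies if checksum(bytes(c)) == expected), None)
    if good is None:
        # No healthy copy left: corruption is detected but cannot be fixed.
        raise IOError("all copies are corrupted; nothing to repair from")
    for c in copies:
        if checksum(bytes(c)) != expected:
            c[:] = good  # overwrite the bad copy with the good one
    return good


photo = b"holiday photo bytes"
stored_sum = checksum(photo)  # recorded when the data was written
mirror = [bytearray(photo), bytearray(photo)]  # two hypothetical "disks"

mirror[0][0] ^= 0x01  # flip a single bit on the first disk
# The file still has the same size, so nothing looks wrong from the outside.
assert len(mirror[0]) == len(photo)

restored = read_with_repair(mirror, stored_sum)
print(restored == photo, bytes(mirror[0]) == photo)
```

With only a single copy, the same function can still tell you that the data is bad, but it has nothing to repair it from, which is the reason for wanting at least a mirror.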

ylavi said:
What do I need to know about setting up PVE so that I'd be able to (1) move VMs between hosts (to prepare for scheduled downtime) and (2) back up VMs between hosts (to be ready for unscheduled downtime). Do I need PVE clustering? I should have several gigabit ethernet ports available on each host - how many do I connect directly between the two hosts, and for what purposes?

If you want high availability or a cluster you need at least 3 always-running hosts. But it's also possible to manually move guests between non-clustered hosts by doing backups and restores with a shared backup storage like an NFS/SMB share for vzdump or a Proxmox Backup Server (PBS).
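
As a rough sketch of that manual backup-and-restore route, the Python below only builds and prints the two command lines; it does not run anything. The VM ID 100, the storage names "backup-nfs" and "local-zfs", and the archive path are all made-up placeholders, and the helper functions are hypothetical. Please check the vzdump and qmrestore man pages on your own installation before relying on any option shown here.

```python
import shlex


def backup_command(vmid: int, storage: str) -> list:
    """Build the vzdump call to run on the source host."""
    return ["vzdump", str(vmid), "--storage", storage, "--mode", "snapshot"]


def restore_command(archive: str, new_vmid: int, target_storage: str) -> list:
    """Build the qmrestore call to run on the other host."""
    return ["qmrestore", archive, str(new_vmid), "--storage", target_storage]


# Hypothetical values: VM 100, a shared storage called "backup-nfs",
# and an archive path invented for this example.
archive = "/mnt/pve/backup-nfs/dump/vzdump-qemu-100-2022_10_12-03_00_00.vma.zst"

print(shlex.join(backup_command(100, "backup-nfs")))
print(shlex.join(restore_command(archive, 100, "local-zfs")))
```

The point is just the order of operations: the first host writes the backup to storage that both hosts can reach, and the second host restores from it.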

ylavi said:
Where do I learn best about ZFS?

That's a complex topic and can't be explained in a short tutorial. I don't know of a single tutorial that will tell you everything you need to know. I had in mind to write a tutorial explaining the common ZFS beginner errors people are always stumbling upon, but I haven't started that yet.
It would be easier to point you to specific ZFS tutorials if you encounter a problem or if something specific isn't clear.

nunner · Oct 12, 2022

I'm not sure yet exactly what questions to ask... The most fundamental is - are there any sites (including pay sites) which have good tutorials (ideally video, or alternatively written) on setting up and working with PVE environments?

The PVE Administration Guide [1] and wiki [2] would be a good place to start.

[1] https://pve.proxmox.com/pve-docs/index.html
[2] https://pve.proxmox.com/wiki/Main_Page

ylavi · Oct 12, 2022

Thanks to all of you, especially @Dunuin, who made some particularly useful and important points.

I understand the implications of backup frequency vs the ongoing nature of mirrors. And I see the strengths of ZFS with the error correction based on parity data. But things become less clear to me when I think about how the picture includes hosts and guests. And this brings me back to one of my original questions about storing data directly in the storage of a PVE host.

Perhaps some of this will be clear from the PVE documentation but it helps me to have an idea of what I will find before I start reading...

I should add that my limited experience with virtualisation is mostly from VirtualBox and therefore I may have some preconceptions which aren't relevant to PVE (or at least not the only option).

The questions which arise from the answers above are:

If my data is in some sort of guest partition (which surely has its own file system?), does the guest manage its own ZFS on the virtual disk, or can it somehow benefit from the host's ZFS? My secondary machine isn't so powerful (and memory isn't ECC) so I was considering not using ZFS on those drives.

If using guest partitions like that, could I make two of those as mirrors and put them on different devices on the host? In what ways might that be better or worse than the guest having one partition which sits on a mirror managed by the host? I would be happier not to have the complexity of mirrored drives within guests but the cost of that is mirroring physical drives.

Can guests be moved directly between hosts by copying over guest partition files (without some intermediate location) and then creating a new guest definition on the target? Is that basically what the backup-restore process does anyway?

Would keeping my data directly on a PVE partition (shared by PVE to guests and whoever, not with passthrough of the device to a guest) have benefits? Might it be harder to move around or back up? To start with I'm wondering which storage structure (if any) could save versions of files such that they are available to Windows users with the "previous versions" feature and/or with the versions feature in Nextcloud, and these can be saved in a backup. Is there some way to do [something like] that, or is it more straightforward to back up frequently and deal with the process of extracting individual files from appropriate backups?

Dunuin · Oct 12, 2022

ylavi said:
If my data is in some sort of guest partition (which surely has its own file system?)

Jup.

ylavi said:
, does the guest manage its own ZFS on the virtual disk, or can it somehow benefit from the host's ZFS? My secondary machine isn't so powerful (and memory isn't ECC) so I was considering not using ZFS on those drives.

The guest won't use ZFS unless you also format your virtual disk with ZFS, on top of the ZFS of the host that stores the virtual disk. But this isn't recommended, as ZFS has massive overhead and this overhead would multiply when running ZFS on top of ZFS. But let's say you format your virtual disk with a fast but simple filesystem like ext4 or xfs, which on its own won't give you bit rot protection or block-level compression: it would still benefit from the ZFS of the host that is storing the virtual disk, as the host's ZFS will verify and compress all data that is stored on it, including the virtual disks.

ylavi said:
If using guest partitions like that, could I make two of those as mirrors and put them on different devices on the host? In what ways might that be better or worse than the guest having one partition which sits on a mirror managed by the host? I would be happier not to have the complexity of mirrored drives within guests but the cost of that is mirroring physical drives.

Usually you want your host storage mirrored and no redundancy in your guests. That way there is only one RAID you have to manage and monitor, and all guests benefit from this redundancy, even if they only have a single virtual disk.
There might be some use cases where it makes sense to do the RAID inside the guest (for example a TrueNAS VM with an HBA passed through via PCI passthrough), but this should be a rare instance.

ylavi said:
Can guests be moved directly between hosts by copying over guest partition files (without some intermediate location) and then creating a new guest definition on the target? Is that basically what the backup-restore process does anyway?

This would be a migration and is only supported between nodes of a cluster and for that, you would need at least 3 hosts because you always need a quorum. Without a cluster you can only do a backup and restore. So one host would need to write a backup of the VM to the shared backup storage or the PBS and after that the other host would need to restore that backup, reading it from the shared backup storage or PBS. So without a cluster a direct transfer of VMs between two hosts isn't possible.
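
The quorum requirement comes down to a simple majority rule, which can be sketched in a few lines of Python. This assumes one vote per cluster member and is not PVE's actual code; the function names are made up for illustration.

```python
def quorum_needed(total_votes: int) -> int:
    """Strict majority of all configured votes."""
    return total_votes // 2 + 1


def has_quorum(total_votes: int, votes_online: int) -> bool:
    """True if the members still online hold a majority."""
    return votes_online >= quorum_needed(total_votes)


for total in (2, 3):
    survives = has_quorum(total, total - 1)
    print(f"{total} votes: need {quorum_needed(total)}, survives one failure: {survives}")
```

With 2 votes a majority is 2, so losing either member loses quorum. With 3 votes a majority is still 2, so one member can be down. The third vote is what matters, which is why a QDevice was mentioned earlier in the thread as a possible stand-in for a third host.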

ylavi said:
Would keeping my data directly on a PVE partition (shared by PVE to guests and whoever, not with passthrough of the device to a guest) have benefits? Might it be harder to move around or back up? To start with I'm wondering which storage structure (if any) could save versions of files such that they are available to Windows users with the "previous versions" feature and/or with the versions feature in Nextcloud, and these can be saved in a backup. Is there some way to do [something like] that, or is it more straightforward to back up frequently and deal with the process of extracting individual files from appropriate backups?

That really depends. First, you can't easily share folders from the host with a VM. This only works with SMB/NFS shares, and PVE has no NAS functionality. So you would need to set up an SMB server on your own, using the CLI, to bring folders from the host into the VM. The easiest option would be to also run a NAS VM with a virtual disk that contains the data that should be shared between different VMs.
A TrueNAS VM with its ZFS, for example, can serve SMB shares with shadow copies, which support versioning. With that, mounted SMB shares in Windows will support "previous versions".
I think you can also set that up on your own by running an SMB server with ZFS on the PVE host itself... but that won't be easy to set up.

bobmc · Oct 12, 2022

ylavi said:
If my data is in some sort of guest partition (which surely has its own file system?), does the guest manage its own ZFS on the virtual disk, or can it somehow benefit from the host's ZFS? My secondary machine isn't so powerful (and memory isn't ECC) so I was considering not using ZFS on those drives.

When you create a VM you will normally create a virtual disk as part of the process. If you're using ZFS as storage, then a ZVOL will be created on ZFS as this virtual disk. So the VM is not aware of ZFS in any way - it just 'sees' a hard drive which it will format and store data on in the normal way. The Proxmox host has no direct access to or visibility of this data, but you can back up this data without the VM being aware of the backup happening. The VM will still benefit from ZFS features like parity, mirroring, snapshots and bit-rot detection.

ylavi · Oct 14, 2022

Thanks for the answers! A lot of the terms mentioned here in passing give me starting points for learning more!
I'll do some reading, and start experimenting when my hardware is ready (I expect I will order more disks and set up mirrors)
