I created my first PVE node over a year ago and have used it successfully, but I still have many questions. The documentation explains the options, but everything is very complex and it's not always clear what the best practice is for my situation.
In my current environment, I have one legacy server running many services, including SMTP, IMAP, HTTP/HTTPS, and SSH. There are about 600 GB of user data on it. Another machine has about 60 GB and serves users mostly via SSH. I have a few lesser machines with only a few GB of data and one or two non-critical services each.
Up until now, I've been relying on data backups and detailed notes of how my servers are configured, so that I can install and configure a new server from scratch should something fail badly. As a result, I had not used vzdump at all until this week. I started testing and using it as a means to move VMs from one independent PVE node to a new one that I set up. I've moved 3 small servers so far, but the 600 GB and 60 GB servers seem too large to move this way. I'm hoping to get a cluster going and do an online migration soon to avoid any significant downtime.
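For what it's worth, the move process I've been using for the small servers looks roughly like this (the VMID, node name, and storage names are placeholders from my setup, not a recommendation):

```shell
# On the old node: dump the VM to a file (stop mode for a consistent image)
vzdump 101 --mode stop --compress lzo --dumpdir /var/lib/vz/dump

# Copy the dump to the new node
scp /var/lib/vz/dump/vzdump-qemu-101-*.vma.lzo root@new-node:/var/lib/vz/dump/

# On the new node: restore the dump onto local-lvm under the same VMID
qmrestore /var/lib/vz/dump/vzdump-qemu-101-<timestamp>.vma.lzo 101 --storage local-lvm
```

This works fine for the small guests, but at 600 GB the dump/copy/restore cycle would mean hours of downtime, which is why I'm looking at cluster live migration instead.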
In the longer term, I'm thinking of recreating all these VMs and hosts with different settings, hoping to optimize for storage efficiency, speed up online migration, and so on.
Questions:
1) What should I do with my user data? I have two good servers available with 24 cores and 64 GB of memory each. Each is currently configured with 4 TB of storage in a RAID 10, and there is extra storage on one of the nodes (and more drive bays if needed). I installed PVE with the default options, which creates a "local" storage of about 100 GB of type directory, and a "local-lvm" storage of type LVM-thin using the rest, which is nearly 4 TB. The local directory storage is set up for backup dumps, but 100 GB seems like very little space for this when I have nearly 4 TB for VMs. I certainly can't dump the 600 or 60 GB servers there. This makes me think I should perhaps move all the data to a separate NFS server, or maybe a VM running NFS. Alternatively, I could give each VM a separate virtual drive for data that is marked not to be backed up. Or maybe the data can live directly on some physical storage on a node. I already back up the data to a backup server using rsync anyway, so it really doesn't need to be in the backup dumps.
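To be concrete about the "data disk excluded from backups" idea: as I understand it, individual disks can be flagged so vzdump skips them. Something like this (VMID 100, the scsi1 slot, and the 600 GB size are just placeholders for my mail server):

```shell
# Attach a second 600 GB virtual disk on local-lvm for user data,
# flagged so that vzdump does not include it in backup dumps
qm set 100 --scsi1 local-lvm:600,backup=0
```

The OS disk would still get dumped normally, while the big data disk stays covered only by my existing rsync backups. Please correct me if this flag doesn't behave the way I think it does.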
2) When creating VM hardware for a new Linux guest, in the Hard Disk section in particular, there are many options, and the documentation (section 10.2.4 here: https://pve.proxmox.com/pve-docs/pve-admin-guide.html#qm_virtual_machines_settings ) stops short of clearly recommending anything. So far I have stuck with the default options (Bus/Device SCSI, local-lvm storage, No cache). Should I be using something other than SCSI here?
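In CLI terms, what I believe the defaults give me is roughly the following (VMID and disk size are placeholders; the scsihw value is my guess at what the GUI selects):

```shell
# SCSI disk on local-lvm with the VirtIO SCSI controller and no cache
qm set 100 --scsihw virtio-scsi-pci --scsi0 local-lvm:32,cache=none
```

My understanding is that SCSI on a VirtIO SCSI controller is effectively paravirtualized anyway, so I'm unsure whether switching the bus type would gain me anything.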
3) What about the CPU? I have seen recommendations to use only 1 or 2 cores for the CPU, but this seems like it could be very limiting. Is there never a reason to go to 4 cores for a potentially busy server?
4) I'm also concerned about memory. My Debian Linux mail server is set to a fixed 16 GB of memory, and it always uses all of it. It runs fine but seems happy to use as much memory as I give it; I've read that this is intentional Linux memory management (unused RAM is put to work as page cache). I would like to use dynamically allocated memory with ballooning, but I'm concerned that the guests will just request more and more memory. My multi-user SSH server has 8 GB and uses all of it as well, and I also have a minor Apache 2 server with 4 GB and a Windows Server 2016 guest with 4 GB (both running fine).
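If I understand the ballooning feature correctly, the configuration I'd be moving to for the mail server is along these lines (VMID is a placeholder; the 8 GB floor is an arbitrary guess on my part):

```shell
# Let the mail server's memory float between 8 GB and 16 GB,
# with the host reclaiming RAM via the balloon driver under pressure
qm set 100 --memory 16384 --balloon 8192
```

My worry is exactly the scenario above: if the guest kernel fills all available RAM with cache by design, will the host ever actually get memory back?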
5) When installing the guest, is there any reason not to use the ext4 file system? I have read that ZFS can help greatly when dumping virtual disks that contain a lot of zeroed blocks. Does this mean ZFS storage on the host, or a ZFS file system in the guest?
6) When it comes to vzdump, stop mode supposedly creates a consistent dump. What does this mean exactly? I take it to mean that the dump contains exactly the data present at the moment the machine is shut down (even though this mode restarts the VM and allows file system changes in the guest while the backup process is still running). Snapshot mode seems to work the same way, except the VM is never shut down at all. So what exactly is the risk of inconsistency? The vzdump documentation is not very clear about this. And what is the actual risk in restoring from an inconsistent dump?
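For reference, these are the three modes as I currently understand them (comments are my interpretation, which is part of what I'm asking to have confirmed or corrected):

```shell
vzdump 101 --mode stop      # shut down guest, start backup, restart guest; most consistent
vzdump 101 --mode suspend   # suspend guest briefly while the backup is prepared
vzdump 101 --mode snapshot  # live snapshot, no downtime; relies on snapshot consistency
```

What I can't tell from the docs is what concretely differs between the image captured by stop mode and the one captured by snapshot mode, given that both let the guest keep running while the dump is written out.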
That's it for now. Thank you for considering my questions.