I created my first PVE node over a year ago and have used it successfully, but I still have many questions. The documentation explains the options, but everything is very complex and it's not always clear what the best practice is for my situation.
In my current environment, I have one legacy server running many services, including SMTP, IMAP, HTTP/HTTPS, and SSH. There are about 600 GB of user data on it. Another machine has about 60 GB and serves users mostly via SSH. I have a few lesser machines with only a few GB of data and one or two non-critical services each.
Up until now, I've been relying on data backups and detailed notes of how my servers are configured, so that I can install and configure a new server from scratch should something fail badly. As a result, I had not used vzdump at all until this week. I started testing and using it as a means to move VMs from one independent PVE node to a new one that I set up. I've moved 3 small servers so far, but the 600 GB and 60 GB servers seem too large to move this way. I'm hoping to get a cluster going and do an online migration soon to avoid any significant downtime.
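For what it's worth, the move process I've been using for the small servers looks roughly like this (the VMID, node name, and storage names are placeholders from my setup, not a recommendation):

```shell
# On the old node: dump the VM to a file (stop mode for a consistent image)
vzdump 101 --mode stop --compress lzo --dumpdir /var/lib/vz/dump

# Copy the dump to the new node
scp /var/lib/vz/dump/vzdump-qemu-101-*.vma.lzo root@new-node:/var/lib/vz/dump/

# On the new node: restore the dump onto local-lvm under the same VMID
qmrestore /var/lib/vz/dump/vzdump-qemu-101-<timestamp>.vma.lzo 101 --storage local-lvm
```

This works fine for the small guests, but at 600 GB the dump/copy/restore cycle would mean hours of downtime, which is why I'm looking at cluster live migration instead.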
In the longer term, I'm thinking of recreating all these VMs and hosts with different settings, hoping to optimize for storage efficiency, speed up online migration, and so on.
Questions:
1) What should I do with my user data? I have two good servers available with 24 cores and 64 GB of memory each. Each is currently configured with 4 TB of storage in a RAID 10, and there is extra storage on one of the nodes (and more drive bays if needed). I installed PVE with the default options, which creates a "local" storage of about 100 GB of type directory, and a "local-lvm" storage of type LVM-thin using the rest, which is nearly 4 TB. The local directory storage is set up for backup dumps, but 100 GB seems like very little space for this when I have nearly 4 TB for VMs. I certainly can't dump the 600 or 60 GB servers there. This makes me think I should perhaps move all the data to a separate NFS server, or maybe a VM running NFS. Alternatively, I could give each VM a separate virtual drive for data that is marked not to be backed up. Or maybe the data can live directly on some physical storage on a node. I already back up the data to a backup server using rsync anyway, so it really doesn't need to be in the backup dumps.
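To be concrete about the "data disk excluded from backups" idea: as I understand it, individual disks can be flagged so vzdump skips them. Something like this (VMID 100, the scsi1 slot, and the 600 GB size are just placeholders for my mail server):

```shell
# Attach a second 600 GB virtual disk on local-lvm for user data,
# flagged so that vzdump does not include it in backup dumps
qm set 100 --scsi1 local-lvm:600,backup=0
```

The OS disk would still get dumped normally, while the big data disk stays covered only by my existing rsync backups. Please correct me if this flag doesn't behave the way I think it does.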
2) When creating VM hardware for a new Linux guest, in the Hard Disk section in particular, there are many options, and the documentation (section 10.2.4 here: https://pve.proxmox.com/pve-docs/pve-admin-guide.html#qm_virtual_machines_settings ) stops short of clearly recommending anything. So far I have stuck with the default options (Bus/Device SCSI, local-lvm storage, No cache). Should I be using something other than SCSI here?
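In CLI terms, what I believe the defaults give me is roughly the following (VMID and disk size are placeholders; the scsihw value is my guess at what the GUI selects):

```shell
# SCSI disk on local-lvm with the VirtIO SCSI controller and no cache
qm set 100 --scsihw virtio-scsi-pci --scsi0 local-lvm:32,cache=none
```

My understanding is that SCSI on a VirtIO SCSI controller is effectively paravirtualized anyway, so I'm unsure whether switching the bus type would gain me anything.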
3) What about the CPU? I have seen recommendations to use only 1 or 2 cores for the CPU, but this seems like it could be very limiting. Is there never a reason to go to 4 cores for a potentially busy server?
4) I'm also concerned about memory. My Debian Linux mail server is set to a fixed 16 GB of memory, and it always uses all of it. It runs fine but seems happy to use as much memory as I give it; I've read that this is intentional Linux memory management (unused RAM is put to work as page cache). I would like to use dynamically allocated memory with ballooning, but I'm concerned that the guests will just request more and more memory. My multi-user SSH server has 8 GB and uses all of it as well, and I also have a minor Apache 2 server with 4 GB and a Windows Server 2016 guest with 4 GB (both running fine).
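If I understand the ballooning feature correctly, the configuration I'd be moving to for the mail server is along these lines (VMID is a placeholder; the 8 GB floor is an arbitrary guess on my part):

```shell
# Let the mail server's memory float between 8 GB and 16 GB,
# with the host reclaiming RAM via the balloon driver under pressure
qm set 100 --memory 16384 --balloon 8192
```

My worry is exactly the scenario above: if the guest kernel fills all available RAM with cache by design, will the host ever actually get memory back?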
5) When installing the guest, is there any reason not to use the ext4 file system? I have read that ZFS can help greatly when dumping virtual disks that contain a lot of zeroed blocks. Does this mean ZFS storage on the host, or a ZFS file system in the guest?
6) When it comes to vzdump, stop mode supposedly creates a consistent dump. What does this mean exactly? I take it to mean that the dump contains exactly the data present at the moment the machine is shut down (even though this mode restarts the VM and allows file system changes in the guest while the backup process is still running). Snapshot mode seems to work the same way, except the VM is never shut down at all. So what exactly is the risk of inconsistency? The vzdump documentation is not very clear about this. And what is the actual risk in restoring from an inconsistent dump?
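For reference, these are the three modes as I currently understand them (comments are my interpretation, which is part of what I'm asking to have confirmed or corrected):

```shell
vzdump 101 --mode stop      # shut down guest, start backup, restart guest; most consistent
vzdump 101 --mode suspend   # suspend guest briefly while the backup is prepared
vzdump 101 --mode snapshot  # live snapshot, no downtime; relies on snapshot consistency
```

What I can't tell from the docs is what concretely differs between the image captured by stop mode and the one captured by snapshot mode, given that both let the guest keep running while the dump is written out.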
That's it for now. Thank you for considering my questions.