Optimizing PVE containers for backup / storage tiering

wbk

Renowned Member
Oct 27, 2019
Hi all,

I'd like to ask for input on something I'm considering while migrating and consolidating some machines.

Current situation:
* Multiple machines (some containers on PVE, some bare metal) have multiple years of data
* New data, in aggregate over these installations, is added at about 50-100 GB per month
* Older data is rarely edited, and not frequently read
* Backup to PBS is already a multi-hour exercise

Target situation:
* One larger PVE, that hosts data from the multiple current sources
* Backups frequent enough that, in case of disaster, I lose less than a week of data
* On-line availability of the whole catalog (on-line as in: "no tape drive / bluray / USB HDD", not as in "everything needs to be accessible by smartphone worldwide")

Considering backups already take a while, I thought of tiering the storage and running three containers with storage servers instead of one:
* One would hold data from the last 6-9 months or so, which for the near future requires less than a TB of storage. It would see weekly backups.
* The next would hold data up to roughly two or two and a half years old, requiring some two TB of storage. As it sees few edits, a backup every quarter (i.e., after each refresh from the tier above) should do.
* The last would hold historic data, older than that. It would see an influx of about half a TB every half year; with backup retention in the tier above, one backup per year should do.
* The storage servers in each of the containers would provide year-based directories that are mounted inside the container that hosts the front-facing server.
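The "refresh from the tier above" step could be a small verify-then-delete script. A minimal sketch, assuming both tiers' shares are mounted on the same host (the paths, function name, and the `cp`/`diff` approach are my assumptions; `rsync -aH` would be the usual choice for large trees):

```shell
# Hypothetical helper: move one year's directory from a hotter tier to a
# colder one, deleting the source only after the copy verifies.
rotate_year() {
    year=$1 src=$2 dst=$3
    mkdir -p "$dst/$year"
    cp -a "$src/$year/." "$dst/$year/"      # copy the whole year directory
    if diff -r "$src/$year" "$dst/$year" >/dev/null; then
        rm -rf "$src/$year"                 # source and copy agree: drop source
    else
        echo "verification failed for $year, keeping source" >&2
        return 1
    fi
}
```

Usage would be something like `rotate_year 2025 /mnt/tier-hot /mnt/nc2025`, run once a year before the colder tier's backup.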

This is on a home network; data links are gigabit ethernet, with no near-term option of upgrading. The main container now runs Yunohost with Nextcloud as target for the data.

I guess storage tiering is standard practice once data becomes unwieldy. Are there some best practices in relation to Proxmox? Is there a more sane method than I described above?
 
To make this more concrete, and perhaps elicit some speculation about the pros and cons, this is how I plan to set up the constellation:
  • PVE node runs seven containers:
    • yunohost: runs, among others, Nextcloud
    • nc2000 : stores data up to and including the year 2000
    • nc2010 : stores data up to and including the year 2010
    • nc2015 : stores data up to and including the year 2015
    • nc2020 : stores data up to and including the year 2020
    • nc2025 : stores data up to and including the year 2025
    • nc2028 : stores data up to and including the year 2028
  • PBS backs up:
    • daily
      • yunohost
      • nc2028
    • monthly
      • nc2025
    • quarterly
      • nc2020
    • yearly
      • nc2015
      • nc2010
      • nc2000
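That schedule can be set up as backup jobs in the PVE GUI; as a plain-cron sketch with `vzdump`, it could look like the following (the CT IDs 100-106, the PBS storage name "pbs", and the timings are my assumptions):

```shell
# /etc/cron.d/tiered-backups -- hypothetical schedule
# daily: yunohost (100) and the hot tier nc2028 (106)
30 2 * * *        root vzdump 100 106 --storage pbs --mode snapshot --quiet 1
# monthly: nc2025 (105)
30 3 1 * *        root vzdump 105 --storage pbs --mode snapshot --quiet 1
# quarterly: nc2020 (104)
30 3 1 1,4,7,10 * root vzdump 104 --storage pbs --mode snapshot --quiet 1
# yearly: nc2015, nc2010, nc2000 (101-103)
30 4 2 1 *        root vzdump 101 102 103 --storage pbs --mode snapshot --quiet 1
```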
The nc* containers:
  • privileged (if need be)
  • mount a large enough storage at /mnt
  • bind-mount storage at /srv/nextcloud_20xy
  • export the nextcloud_20xy directory
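For one tier container, those four bullets could look roughly like this (CT ID 106 for nc2028, the `/tank` host path, and the node's address are assumptions on my part):

```shell
# On the PVE host: attach the tier's storage to the container at /mnt
# (shown as a host-directory mount point; a dedicated volume works too):
pct set 106 -mp0 /tank/nc2028,mp=/mnt

# Inside nc2028: bind-mount the payload to the export path...
mkdir -p /srv/nextcloud_2028
echo '/mnt/nextcloud_2028  /srv/nextcloud_2028  none  bind  0 0' >> /etc/fstab
mount /srv/nextcloud_2028

# ...and export it over NFS, restricted to the PVE node's address:
echo '/srv/nextcloud_2028  192.168.1.10(rw,sync,no_subtree_check)' >> /etc/exports
exportfs -ra
```

Note that running an NFS server inside an LXC container is what usually forces the "privileged (if need be)" part.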
The node:
  • mounts the exports so that Yunohost can use them without being privileged
The Yunohost container:
  • bind-mounts per-user directories from the exported data containers at the correct place in the Nextcloud directory structure
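Put together, the chain from a tier container down into Nextcloud might look like this (the hostname, CT ID 100 for yunohost, and the Nextcloud data path are assumptions):

```shell
# On the PVE node: mount the NFS export from nc2028 (/etc/fstab on the node):
#   nc2028.lan:/srv/nextcloud_2028  /mnt/nc2028  nfs  vers=4,soft  0 0

# Hand the mounted export to the unprivileged yunohost container (CT 100):
pct set 100 -mp1 /mnt/nc2028,mp=/srv/nextcloud_2028

# Inside yunohost: bind a user's year directory into Nextcloud's data tree,
# then have Nextcloud pick it up (external storage, or occ files:scan):
mount --bind /srv/nextcloud_2028/alice \
  /home/yunohost.app/nextcloud/data/alice/files/archive-2028
```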

The main reason for tiering the storage this way and spreading it over multiple containers is to be able to keep frequent backups from PVE to PBS, and from PBS to remote and tape.
With the storage tiered like this, each container can stay (way) under a terabyte.

Any thoughts?