PVE backup to local USB - I've messed it up

theUtmost

New Member
Jan 23, 2024
Hi folks
Sorry, yet another newbie to PVE here with pesky questions :p

Short version:
I have a single nvme inside a Lenovo ThinkCentre M720q with PVE installed.
I HAD configured a 1 TB external USB drive as storage for backups.
At some stage, something went wrong, I guess the USB cable disconnected or similar (hey, it happens).
Rather than the backups simply failing (as I expected them to), PVE kept merrily backing up to the mount point for the USB: /mnt/usb
Which, of course, means every backup cycle has been eating more of the LVM storage on the internal NVMe instead of landing on the USB drive!

I have temporarily disabled backups, and disconnected the USB drive.
Right this minute, with the USB drive unplugged, lsblk shows:

Code:
NAME                         MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
nvme0n1                      259:0    0 238.5G  0 disk
├─nvme0n1p1                  259:1    0  1007K  0 part
├─nvme0n1p2                  259:2    0     1G  0 part /boot/efi
└─nvme0n1p3                  259:3    0 237.5G  0 part
  ├─pve-swap                 252:0    0   7.6G  0 lvm  [SWAP]
  ├─pve-root                 252:1    0  69.5G  0 lvm  /
  ├─pve-data_tmeta           252:2    0   1.4G  0 lvm 
  │ └─pve-data-tpool         252:4    0 141.5G  0 lvm 
  │   ├─pve-data             252:5    0 141.5G  1 lvm 
  │   └─pve-vm--101--disk--0 252:6    0    40G  0 lvm 
  └─pve-data_tdata           252:3    0 141.5G  0 lvm 
    └─pve-data-tpool         252:4    0 141.5G  0 lvm 
      ├─pve-data             252:5    0 141.5G  1 lvm 
      └─pve-vm--101--disk--0 252:6    0    40G  0 lvm

As you can see, that's a single NVMe, running only one VM.

Can someone please give me some guidance on a more robust mounting method (i.e. one that survives reboots and USB cable disconnections) for a local USB drive to be used for backups?

Ideally, I would also like to rule out the USB drive ever being used as a "system" drive.
That goes for any and all future USB drives that I might want to plug in, e.g. if I need to change the USB backup drive to a different model/brand/size/serial#.

The whole system did become unresponsive several times recently. I had to force a power-off of the ThinkCentre USFF PC with a 4-second press of the power button, as I could not reach it via the WebUI or SSH even though it responded to PING on the LAN. I will ask for further advice in a separate thread about those issues, but first I should get the internal & external drives set up correctly for my needs.

I'm also wondering whether I should just go ahead and delete everything (while USB backup drive is disconnected) from the mount point : /mnt/usb?
Or do I need to look for important things in there first?

During one of the weird system lockups I have had within the last week, I WAS able to get into the PVE WebUI, but some services were NOT running.
E.g. I couldn't get to monitd or the syslog page, or launch a shell from the WebUI, but I DID look at the Disks page, which weirdly didn't list the nvme, only the USB drive as /dev/sda1 (I think, this is from memory).

That's the only reason I'm mentioning this aspect though, while the PVE node seems to be ok for now, I want to make sure it's safe to delete everything from the failed /mnt/usb dir

Also, I think it would be safest to reformat and repartition the external USB, for new round of backups, right?
Any advice will be greatly appreciated, as I find my way around PVE, TIA folks!
theUtmost
 
If you mark the storage as a mount point using the checkbox then the backups will fail if the drive is disconnected. It should be fine to delete what's in /mnt/usb. Proxmox would not have put anything but backups there.
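
For anyone wondering what that check amounts to: a directory only counts as a mount point while a filesystem is actually mounted on it. Below is a minimal sketch of the same idea in shell, assuming a Linux-style mounts table. The function name is_mounted and its optional second argument are made up for illustration and are not part of PVE.

```shell
#!/bin/sh
# Hypothetical helper, not a PVE command: succeed only if DIR appears as a
# mount target in the mounts table (default: /proc/mounts).
is_mounted() {
    dir=${1%/}                # drop a trailing slash: /mnt/usb/ -> /mnt/usb
    [ -n "$dir" ] || dir=/    # "/" itself becomes empty after stripping
    table=${2:-/proc/mounts}
    # Field 2 of each line in the table is the mount target.
    awk -v d="$dir" '$2 == d { found = 1 } END { exit !found }' "$table"
}

# Example guard for a manual copy job:
# is_mounted /mnt/usb || { echo "USB drive not mounted, aborting" >&2; exit 1; }
```

On a live system, mountpoint -q /mnt/usb from util-linux should answer the same question without any helper.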
Excellent thanks for that.
Ok I'll delete the content of /mnt/usb then.
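
Before deleting, it may be worth listing what is actually sitting in that directory. Below is a rough sketch, assuming the USB drive is unplugged so that /mnt/usb is just a plain directory on the root volume. The helper name list_by_size is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list regular files under DIR, largest first, with
# disk usage in KiB, so they can be reviewed before anything is removed.
list_by_size() {
    # -xdev keeps find on a single filesystem; -type f skips directories.
    find "$1" -xdev -type f -exec du -k {} + | sort -rn
}

# Example (run only while the USB drive is unplugged):
# list_by_size /mnt/usb | head -n 20
```

The -xdev flag is there as a safety net: if the drive does turn out to be mounted somewhere below the directory, its contents are not walked.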

Where do I mark the storage as a mount point?
(I expect I am probably looking in the wrong area, sorry, it's not obvious with what I am looking at)

Cheers, theUtmost
 
I am looking in Datacenter > Storage

I have defined a Type = Directory
I gave it an ID: BUSB, and the directory is set to /mnt/usb/
I've expanded the Advanced options and there is no mention of a checkbox for mount point that I can find.
I swapped the View between Server View / Folder View / Pool view but that doesn't seem to give me any other options, but I will keep looking, thanks
 
You can do it from the command line:

pvesm set BUSB --is_mountpoint yes

Could have sworn there was a setting in the GUI for that but I can't find it either.
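
For the "survives reboots and disconnections" part, a common approach is to mount the drive by UUID in /etc/fstab with nofail, and then set is_mountpoint on the storage as above. This is only a sketch: the helper name usb_fstab_line, the ext4 default and the 10s timeout are assumptions for illustration, and the UUID shown is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: print (not write) an fstab line for a backup drive.
# UUID, mount point and filesystem type are supplied by the caller.
usb_fstab_line() {
    uuid=$1; mnt=$2; fstype=${3:-ext4}
    # nofail: boot continues if the drive is absent.
    # x-systemd.device-timeout: do not wait long for a missing device.
    printf 'UUID=%s %s %s defaults,nofail,x-systemd.device-timeout=10s 0 2\n' \
        "$uuid" "$mnt" "$fstype"
}

# Example with a placeholder UUID (take the real one from `blkid /dev/sda1`):
# usb_fstab_line 0000-PLACEHOLDER /mnt/usb >> /etc/fstab
# mount /mnt/usb
# pvesm set BUSB --is_mountpoint yes
```

Mounting by UUID means it does not matter which /dev/sdX name the drive gets. A replacement drive would need its own UUID in the fstab line.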
Thanks, I appreciate the additional help here.
That gave me an error:
Code:
-bash: /usr/sbin/pvesm: Input/output error

I think my system is unhappy somewhere else.
I started following along this Youtube video as well, to see what differed with my setup:
https://www.youtube.com/watch?v=lZjMxdBPH7M

First interesting discrepancy was, when I ran fdisk -l
It listed the partitions as expected, but then, after the /dev/mapper section for the vm = 101 that is running on this PVE node, it said:

Code:
Partition 1 does not start on physical sector boundary.
Partition 2 does not start on physical sector boundary.
Partition 3 does not start on physical sector boundary.
Partition 4 does not start on physical sector boundary.

Next thing I ran timed out then gave input/output error:
Code:
blkid -o list

-bash: /usr/sbin/blkid: Input/output error

Then, I was going to try the method of mounting the drive UUID in fstab, but when I tried to nano the /etc/fstab:

Code:
root@pve:/mnt/usb# nano /etc/fstab
-bash: /usr/bin/nano: Input/output error
root@pve:/mnt/usb# vi /etc/fstab
-bash: /usr/bin/vi: Input/output error
root@pve:/mnt/usb# cat /etc/fstab
# <file system> <mount point> <type> <options> <dump> <pass>
/dev/pve/root / ext4 errors=remount-ro 0 1
UUID=406B-963F /boot/efi vfat defaults 0 1
/dev/pve/swap none swap sw 0 0
proc /proc proc defaults 0 0
root@pve:/mnt/usb# sudo nano /etc/fstab
-bash: sudo: command not found
root@pve:/mnt/usb# ls -l /etc/fstab
-rw-r--r-- 1 root root 207 Oct  8 05:15 /etc/fstab
root@pve:/mnt/usb# lsattr /etc/fstab
-bash: /usr/bin/lsattr: Input/output error
root@pve:/mnt/usb# su nano /etc/fstab
-bash: /usr/bin/su: Input/output error
root@pve:/mnt/usb# df -h
-bash: /usr/bin/df: Input/output error
root@pve:/mnt/usb# pvesm set BUSB --is_mountpoint yes
-bash: /usr/sbin/pvesm: Input/output error

As you can see from my other attempts, I also tried su and sudo, including on your suggested pvesm command.
Interesting that cat DOES show the fstab content, so I presume it's a permissions issue that I can't figure out.
sudo says command not found and su gives Input/output error.

Not sure where to from here but I will keep looking and reading thanks for your help so far!
 
ummm.
I just went back into pve (node) > Disks
and it no longer even lists the nvme drive - it doesn't exist.
But the /dev/sda and the /dev/sda1 partition ON that - do.
the partition, brand, model, serial all do match my actual USB drive.

The nvme has vanished, but the system is kinda still running? (from RAM I guess, log writes aren't going anywhere useful and this maybe explains some gaps in syslog...)

df gave input/output error

mount gives:

Code:
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,relatime)
udev on /dev type devtmpfs (rw,nosuid,relatime,size=8083932k,nr_inodes=2020983,mode=755,inode64)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,nodev,noexec,relatime,size=1623520k,mode=755,inode64)
/dev/mapper/pve-root on / type ext4 (ro,relatime,errors=remount-ro)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,inode64)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k,inode64)
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
efivarfs on /sys/firmware/efi/efivars type efivarfs (rw,nosuid,nodev,noexec,relatime)
bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=30,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=224)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime)
tracefs on /sys/kernel/tracing type tracefs (rw,nosuid,nodev,noexec,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,nosuid,nodev,noexec,relatime)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M)
fusectl on /sys/fs/fuse/connections type fusectl (rw,nosuid,nodev,noexec,relatime)
configfs on /sys/kernel/config type configfs (rw,nosuid,nodev,noexec,relatime)
ramfs on /run/credentials/systemd-sysusers.service type ramfs (ro,nosuid,nodev,noexec,relatime,mode=700)
ramfs on /run/credentials/systemd-tmpfiles-setup-dev.service type ramfs (ro,nosuid,nodev,noexec,relatime,mode=700)
/dev/nvme0n1p2 on /boot/efi type vfat (rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro)
ramfs on /run/credentials/systemd-sysctl.service type ramfs (ro,nosuid,nodev,noexec,relatime,mode=700)
ramfs on /run/credentials/systemd-tmpfiles-setup.service type ramfs (ro,nosuid,nodev,noexec,relatime,mode=700)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,nosuid,nodev,noexec,relatime)
sunrpc on /run/rpc_pipefs type rpc_pipefs (rw,relatime)
lxcfs on /var/lib/lxcfs type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
/dev/fuse on /etc/pve type fuse (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other)
tmpfs on /run/user/0 type tmpfs (rw,nosuid,nodev,relatime,size=1623516k,nr_inodes=405879,mode=700,inode64)

which seems to be lacking a root?
I am very confused.
I did look at the SMART status of both the nvme and the USB drive earlier today and both said status was passed.
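
For what it's worth, the mount output above does contain a root entry: /dev/mapper/pve-root on / type ext4 (ro,relatime,errors=remount-ro). The ro flag, together with errors=remount-ro, suggests root was remounted read-only after filesystem errors, which would fit the Input/output errors seen earlier. Below is a small sketch for spotting this in mount output. Both helper names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `mount` output on stdin and print the option
# list of the filesystem mounted on /.
root_mount_opts() {
    # Line format: DEVICE on TARGET type FSTYPE (OPTIONS)
    awk '$2 == "on" && $3 == "/" { gsub(/[()]/, "", $6); print $6 }'
}

# Hypothetical helper: succeed if the root filesystem is mounted read-only.
root_is_readonly() {
    # Split the options one per line so "errors=remount-ro" is not mistaken
    # for the plain "ro" flag.
    root_mount_opts | tr ',' '\n' | grep -qx 'ro'
}

# Example on a live system:
# mount | root_is_readonly && echo "root is mounted read-only"
```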
 
Yep - and PVE webUI has died on me again.
At this stage I still have an active shell session which was initiated originally FROM the webui, to the NIC I use as a management interface on the ThinkCentre PC. It has a static IP address with no gateway, in the same subnet as my current PC, so I haven't lost access to that (yet).
Based on past experience though this shell access will go away soon too.

It's interesting that the single vm I'm running on PVE (OPNsense), which provides internet connection, DHCP and DNS to my home subnet, is still mostly working, so that must be entirely in RAM also. Not a bad effort and kudos to the devs on both sides!

But what on earth is going on with my storage!???
It's no wonder I am having problems if the nvme has just vanished!
 
Shell access to PVE has stopped responding now.
I still get a reply to PING on that static IP address running on the motherboard NIC of the ThinkCentre, but it seems like most of the services are now dead on PVE.
But not all.
Amazingly - the OPNsense vm running on it is still up (but having issues).
The OPNsense vm can still be accessed via its own webUI, but it shows not all functions are available.
Eg the update checker got stuck on retrieving internal update status (because I guess that is trying to look at actual files on disk which are unavailable).
OPNsense dashboard shows all services running (RAM again I guess), and gateway is online, BUT LAN and WAN interfaces are listed as DOWN.
OPNsense is still (right now) servicing internet, DHCP and DNS for LAN clients.

Based on prior experience though - that will stop working at some stage, and then I will be forced to poweroff/on the ThinkCentre PC to restore functioning.
When that happens, I will leave the USB backup drive detached, and instead try a backup regime to SMB fileserver.
At least as a trial to see if it improves reliability of this system.
I preferred the USB backup destination since I wanted a fully localised (to the pve node itself) backup that didn't rely on any aspect of network comms, since, in the near future I wanted to embark on more network changes - which I didn't want breaking the pve backups...

I did run Lenovo manufacturer UEFI diagnostics before I installed pve on this USFF ThinkCentre PC and it found no issues.

Aside from: try another box or another nvme drive, can anyone suggest (when I can get back into the system of course!) where to start looking to troubleshoot a potential issue with either system config related to storage or hardware (especially NVMe M.2 SSD related) ?
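
One possible starting point, once the box is back up, is the kernel log from around the time the nvme vanished. Below is a sketch of a filter for that. The helper name storage_errors and the search patterns are assumptions, not an exhaustive list of what the kernel may print.

```shell
#!/bin/sh
# Hypothetical helper: keep only kernel-log lines that mention the NVMe
# driver, block-layer I/O errors, or ext4 errors. Reads stdin.
storage_errors() {
    grep -Ei 'nvme|i/o error|ext4-fs error'
}

# Examples on a live system. The previous boot's journal may have gaps if
# the disk holding the logs had already disappeared.
# journalctl -k -b -1 | storage_errors
# dmesg | storage_errors
# smartctl -a /dev/nvme0
```

Since dmesg reads the in-memory kernel ring buffer, it may still show something in a session that is open while the problem is happening, even when nothing can be written to disk.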
Cheers, theUtmost
 
