I've really screwed up bad, PLEASE HELP!

I have three ZFS pools on my server:

"TANK" - a 6x10TB RAIDz2 for my data
"nvme-pool" - a single 500GB NVMe disk used for fast storage for LXC and VMs.

Then there's the disk that Proxmox is installed on, but I'm not 100% sure whether I should be calling it rpool or bpool, or something else, so I'll just refer to it as Boot-SSD.

Boot-SSD - a single SATA disk for the Proxmox install and a small number of scripts.


Here's the screw up....

I was moving some files around within a dataset on TANK, via the shell. Basically, I was moving all the files in any folders within my current directory up into my current directory. I ran the command mv /*/* instead of mv ../*/*.
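(In hindsight, previewing the glob expansion first would have made the mistake obvious; echo just prints what mv would have been handed, without moving anything:)

Code:
# harmless preview of what the shell expands the globs to
echo mv /*/*
echo mv ../*/*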

A bunch of text appeared in the shell, which I very stupidly didn't screenshot and now can't remember exactly, but I think it included messages along the lines of "/usr/bin/ls doesn't exist". When I ran ls afterwards, it said there was no such file or directory as /usr/bin/ls.

I've shut down the server and physically disconnected the TANK pool as it's the one that contains the really important data and I don't want to take any risks.

When I rebooted the server, it booted into grub rescue. When I got to the stage of running insmod normal, it said that normal couldn't be found.

If I boot the Proxmox install, it fails to execute "/sbin/init" saying that the file or folder is not found, and it kernel panics.

I don't know what to do. Please help me.
 
I'm starting to think that I've completely borked my install...

I had a spare SSD lying around, so I did a fresh install of Proxmox onto "Fresh-SSD", and then tried importing my pools.

Thankfully though, it appears my data on `TANK` and `nvme-pool` is intact. So I still have the important data on TANK and all of my LXC subvols on nvme-pool... the original Proxmox install disk looks like it might be another matter.



Running ls -a rpool/ shows . .. data ROOT. "data" contains some old, unused LXC subvols and "ROOT" contains a folder called "pve-1" with nothing in it. So my newb conclusion is that "data" and "ROOT" are still there because they are datasets, but everything else was destroyed by my stupidity. It also appears that the snapshots for "rpool", "data", and "ROOT" contain no real data. Here's a sample snippet of the output of du -h from within /rpool/.zfs/snapshot:

Code:
512     ./autosnap_2021-06-24_16:00:03_hourly/data
512     ./autosnap_2021-06-24_16:00:03_hourly/ROOT
1.5K    ./autosnap_2021-06-24_16:00:03_hourly
512     ./autosnap_2021-06-23_03:00:03_hourly/data
512     ./autosnap_2021-06-23_03:00:03_hourly/ROOT
1.5K    ./autosnap_2021-06-23_03:00:03_hourly
512     ./autosnap_2021-06-24_00:00:03_daily/data
512     ./autosnap_2021-06-24_00:00:03_daily/ROOT
1.5K    ./autosnap_2021-06-24_00:00:03_daily
512     ./autosnap_2021-06-24_17:00:03_hourly/data
512     ./autosnap_2021-06-24_17:00:03_hourly/ROOT
1.5K    ./autosnap_2021-06-24_17:00:03_hourly
93K     .


As things stand, is it possible for me to recover any data from the original proxmox install? (surely I can recover it from a snapshot?!)
 
One last comment before I turn in.

I'm at the point now where I'm starting to feel that perhaps I should take this as a blessing in disguise. (Fingers crossed) I don't think I've lost anything really important... aside from the time I put into setting up my original install (smartd, crontab, UPS, users / user mappings, etc.). So I'm starting to think that perhaps I should just start again from scratch, take what I've learnt since I started running Proxmox, and hopefully end up with a new install that's cleaner and better configured.

The thing that's stopping me, though, is that I could really do with reusing the disk the original install is on... So I guess my question at this point is: given the information in my previous replies, have I confirmed that my original install is gone and that there's no feasible recovery method?
 
have I confirmed that my original install is gone and that there's no feasible recovery method?
If you did an install on ZFS, your boot pool will be called rpool. If rpool has snapshots, you can boot from a rescue disk (e.g. the Proxmox installer) and from a shell perform
zpool import rpool
zfs rollback rpool@snapshotname

Once rebooted, the OS might fail and drop to GRUB; you'll need to perform zpool import -f before you'll be able to proceed.
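Roughly, the full sequence from the rescue shell would be something like this (the snapshot name below is a placeholder; pick a real one from the list output):

Code:
# import the boot pool under a temporary altroot so it doesn't mount over the rescue system
zpool import -f -R /mnt rpool
# see which snapshots actually exist
zfs list -t snapshot -r rpool
# roll the root dataset back (rollback only goes to the most recent snapshot unless you add -r)
zfs rollback rpool/ROOT/pve-1@some-snapshot
# export cleanly, then reboot from the original disk
zpool export rpool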

But honestly, as long as your vdisks are on another pool, just reinstall. It's easier.
 
Two things come to mind.

1.) RAID never replaces a backup. If you are that worried about destroying your data, you should really back up all your pools. I, for example, buy everything twice. Once a week a script replicates all data from the main pools to the backup pools, so it wouldn't really be a problem if I destroyed a complete pool, because there is always a duplicate of it. The backup pools are part of a separate backup server that is powered off most of the time, and even when it isn't powered off, all backup pools are set to read-only so I can't destroy my backups by mistake.

2.) Your nvme-pool isn't that safe. If you only use a single drive, ZFS can't heal itself because there is no redundancy if bit rot occurs. If you don't have a free M.2 slot or PCIe x4 slot for another NVMe drive, there is the "copies=2" ZFS option. If you use it, everything is written twice (cutting the capacity and performance in half), so you get redundancy and self-healing works. But a mirror would be a much better choice. With a mirror you would lose the same capacity but get 2-4x more performance, and you won't lose data if an NVMe SSD fails.
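To sketch both options (the device names are placeholders, check yours with zpool status):

Code:
# option 1: single drive, store every block twice (only applies to data written from now on)
zfs set copies=2 nvme-pool
# option 2 (better): attach a second NVMe to turn the single-disk vdev into a mirror
zpool attach nvme-pool /dev/disk/by-id/EXISTING-NVME /dev/disk/by-id/NEW-NVME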
 
If you did an install on ZFS, your boot pool will be called rpool. If rpool has snapshots, you can boot from a rescue disk (e.g. the Proxmox installer) and from a shell perform
zpool import rpool
zfs rollback rpool@snapshotname

Once rebooted, the OS might fail and drop to GRUB; you'll need to perform zpool import -f before you'll be able to proceed.

But honestly, as long as your vdisks are on another pool, just reinstall. It's easier.

When I roll back, the boot behaviour is the same as it was before I rolled anything back. The boot stalls and I get run-init: can't execute '/sbin/init': No such file or directory, which causes a kernel panic.

I think I'm just gonna fresh install. It'll get me back up and running quicker.
 
Two things come to mind.

1.) RAID never replaces a backup. If you are that worried about destroying your data, you should really back up all your pools. I, for example, buy everything twice. Once a week a script replicates all data from the main pools to the backup pools, so it wouldn't really be a problem if I destroyed a complete pool, because there is always a duplicate of it. The backup pools are part of a separate backup server that is powered off most of the time, and even when it isn't powered off, all backup pools are set to read-only so I can't destroy my backups by mistake.

2.) Your nvme-pool isn't that safe. If you only use a single drive, ZFS can't heal itself because there is no redundancy if bit rot occurs. If you don't have a free M.2 slot or PCIe x4 slot for another NVMe drive, there is the "copies=2" ZFS option. If you use it, everything is written twice (cutting the capacity and performance in half), so you get redundancy and self-healing works. But a mirror would be a much better choice. With a mirror you would lose the same capacity but get 2-4x more performance, and you won't lose data if an NVMe SSD fails.

1. I hear ya... that's why the really important stuff on TANK is almost entirely backed up to another ZFS pool on a TrueNAS server, which has been off for a couple of weeks (so I should have 99.9% of the data elsewhere). But this is definitely a lesson learned; I had been planning to replicate the boot and nvme-pool data to TANK but never got around to it. I won't make the same mistake with the new install.

2. Again, I understand the risk I'm taking with my nvme-pool, but it is temporary. I plan to buy a second disk as soon as I have the money. Like the boot disk, the plan was to back it up to my TANK pool, but I've had some IRL stuff that's kept me distracted, stressed, and short of time. But the (thankfully metaphorical) bite marks on my ass have caught my attention.

The question I'm left wondering about is what happened to my rpool snapshots. The snapshots exist but don't contain any data, so either my sanoid config was messed up on rpool (but working on tank and nvme-pool) or the mv /*/* destroyed their data somehow... surely the snapshots would have been read-only though!?
 
When I roll back, the boot behaviour is the same as it was before I rolled anything back. The boot stalls and I get run-init: can't execute '/sbin/init': No such file or directory, which causes a kernel panic.

I think I'm just gonna fresh install. It'll get me back up and running quicker.
You can boot from a live Linux that supports ZFS (like Ubuntu). That way you can boot and import your old pool into that live system. Then you can roll back the snapshots and back up your config files.
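From the live system that would look roughly like this (the destination paths are just examples):

Code:
# import the old boot pool under /mnt so its datasets don't mount over the live system
zpool import -f -R /mnt rpool
# mount the root dataset if it didn't mount automatically
zfs mount rpool/ROOT/pve-1
# copy out what you need, e.g. the pmxcfs database (which holds the guest configs) and network config
cp -a /mnt/var/lib/pve-cluster /root/old-pve-cluster
cp -a /mnt/etc/network/interfaces /root/old-interfaces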
 
You can boot from a live Linux that supports ZFS (like Ubuntu). That way you can boot and import your old pool into that live system. Then you can roll back the snapshots and back up your config files.
That's what I've done, both with an Ubuntu live CD and from a fresh install of Proxmox (on a different disk) but the snapshots are all small (see the second post of this thread) and the boot behaviour remains the same.
 
For what it's worth, I've attached the old install disk to my PC and imported the pool as altrpool. I then ran sudo zfs list -r altrpool.


Code:
NAME                              USED  AVAIL     REFER  MOUNTPOINT
altrpool                         30.7G   198G      104K  /altrpool
altrpool/ROOT                    17.7G   198G       96K  /altrpool/ROOT
altrpool/ROOT/pve-1              17.7G   198G     17.0G  /
altrpool/data                    12.9G   198G      120K  /altrpool/data
altrpool/data/subvol-100-disk-0   402M  7.61G      401M  /altrpool/data/subvol-100-disk-0
altrpool/data/subvol-106-disk-0   978M  3.07G      957M  /altrpool/data/subvol-106-disk-0
altrpool/data/subvol-107-disk-0   863M  7.16G      863M  /altrpool/data/subvol-107-disk-0
altrpool/data/subvol-108-disk-0   873M  15.1G      873M  /altrpool/data/subvol-108-disk-0
altrpool/data/subvol-111-disk-0   536M  7.52G      493M  /altrpool/data/subvol-111-disk-0
altrpool/data/vm-301-disk-0      9.36G   198G     9.36G  -


As I'm sure has become clear by now, I don't really know a great deal, but to my eyes it looks like there's still some data in altrpool/ROOT/pve-1 (mounted at /)?... but there is no .zfs directory or snapshots in pve-1, and I don't know how else I would access them if they do indeed still exist.
 
As I'm sure has become clear by now, I don't really know a great deal, but to my eyes it looks like there's still some data in altrpool/ROOT/pve-1 (mounted at /)?... but there is no .zfs directory or snapshots in pve-1, and I don't know how else I would access them if they do indeed still exist.
Use sudo zfs list -t snapshot -r altrpool if you want to see snapshots.
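And if snapshots do show up there, their contents are reachable through the (hidden by default) .zfs directory of the mounted dataset, e.g.:

Code:
# list every snapshot on the imported pool
zfs list -t snapshot -r altrpool
# check whether the .zfs directory is hidden or visible
zfs get snapdir altrpool/ROOT/pve-1
# make .zfs show up in normal directory listings
zfs set snapdir=visible altrpool/ROOT/pve-1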
 
There's good news and bad news....

The good news is that I've got a second NVMe disk and I'm ready to sort out a mirrored boot disk.

The bad news is that in the process of send | recv'ing a snapshot of nvme-pool to my rpool disk, I think I've found the reason why my data disappeared on my original install... and quite frankly, I'm scared.



I've just tried to zfs send | recv a manually created snapshot from one pool to another. Here's the command I used to create the snapshot: zfs snapshot -r nvme-pool@migrate-20210629. I then used this command to send it to my other pool: zfs send -v nvme-pool@migrate-20210629 | zfs recv rpool/nvme-pool.

First of all, the send | recv size was around 400M rather than ~60G. I checked the destination dataset and sure enough, the folders (NextcloudBackupFolder and CustomScripts) at the root of nvme-pool were copied over, but the data within the child datasets on nvme-pool wasn't.
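(As an aside, a dry run with -n would apparently have shown me the stream size up front; something like this prints an estimate without actually sending anything:)

Code:
# dry run: estimate the stream size without sending
zfs send -n -v nvme-pool@migrate-20210629
# and the size of a full recursive stream
zfs send -n -v -R nvme-pool@migrate-20210629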

My next course of action was to check the snapshot itself by cd'ing to nvme-pool/.zfs/snapshot and running du -h migrate-20210629, and sure enough, it shows the data within the folders on nvme-pool but nothing within the subvol-* child datasets. Here's a sample of the output:

Code:
26K     migrate-20210629/NextcloudBackupFolder/data/data/appdata_3b786c6d0a7c1/css/settings
10K     migrate-20210629/NextcloudBackupFolder/data/data/appdata_3b786c6d0a7c1/css/files_sharing
26K     migrate-20210629/NextcloudBackupFolder/data/data/appdata_3b786c6d0a7c1/css/calendar
492K    migrate-20210629/NextcloudBackupFolder/data/data/appdata_3b786c6d0a7c1/css
18M     migrate-20210629/NextcloudBackupFolder/data/data/appdata_3b786c6d0a7c1
512     migrate-20210629/NextcloudBackupFolder/data/data/Catherine
54M     migrate-20210629/NextcloudBackupFolder/data/data
54M     migrate-20210629/NextcloudBackupFolder/data
317M    migrate-20210629/NextcloudBackupFolder
512     migrate-20210629/subvol-105-disk-0
512     migrate-20210629/subvol-113-disk-0
36K     migrate-20210629/CustomScripts
512     migrate-20210629/subvol-107-disk-0
512     migrate-20210629/subvol-120-disk-0
512     migrate-20210629/subvol-202-disk-0
512     migrate-20210629/subvol-101-disk-0
512     migrate-20210629/subvol-122-disk-0
512     migrate-20210629/subvol-103-disk-0
317M    migrate-20210629

This is where it gets really scary... the same is true of the snapshots on my data pool.

I snapshot my pools automatically using sanoid. Here's the contents of my sanoid.conf:

Code:
####################
# sanoid.conf file #
####################


[rpool]
        use_template = production
        recursive = yes

[nvme-pool]
        use_template = production
        recursive = yes

[tank]
        use_template = production
        recursive = yes


#############################
# templates below this line #
#############################

[template_production]
        # store hourly snapshots for 48h
        hourly = 48

        # store 14 days of daily snaps
        daily = 14

        # store back 0 months of monthly
        monthly = 0

        # store back 0 yearly (remove manually if too large)
        yearly = 0

        # create new snapshots
        autosnap = yes

        # clean old snapshots
        autoprune = yes

I'm pretty sure the config is okay but in any event, even when I manually create a snapshot with the `-r` option... it doesn't work!

What the heck is wrong?!
 
I just don't understand what's going on... I've booted my FreeNAS / TrueNAS install and imported the nvme-pool. I've then taken a recursive snapshot of the nvme-pool via the TrueNAS GUI, and then send | recv'd it via the shell to a newly created pool (pool980).

When I navigate pool980, the dataset structure from nvme-pool has been recreated, and the data in any folders at the root of nvme-pool has been copied to the new pool. But the "subvol" datasets are entirely empty, including not having a .zfs folder.

When I run zfs list poo980/nvme-pool/subvol-101-disk-0, it says (on TrueNAS) "cannot open 'poo980/nvme-pool/subvol-101-disk-0' : dataset does not exists".

One other thing I tried was creating a manual, recursive snapshot of just nvme-pool/subvol-101-disk-0 via the TrueNAS GUI. I then send | recv'd via the shell to pool980 and it worked... complete with a hidden .zfs folder.

I don't know whether I'm missing something basic, or something serious is wrong.

[EDIT] (Whilst still on my TrueNAS install) I've just tried sending the snapshot I described in the first paragraph; this time, however, I used the -R flag in my command: zfs send -R nvme-pool@manual-2021-06-30_00-06 | zfs recv pool980/manual-test. Everything replicated properly. So I then tried again, but this time I tried replicating a snapshot created by sanoid (zfs send -R nvme-pool@autosnap_2021-06-29_22:00:01_hourly | zfs recv pool980/sanoid-test) and it failed, saying:

Code:
cannot send nvme-pool@autosnap_2021-06-29_22:00:01_hourly recursively: snapshot nvme-pool/subvol-103-disk-0@autosnap_2021-06-29_22:00:01_hourly does not exist
warning: cannot send 'nvme-pool@autosnap_2021-06-29_22:00:01_hourly': backup failed
cannot receive: failed to read from stream
 
I just don't understand what's going on... I've booted my FreeNAS / TrueNAS install and imported the nvme-pool. I've then taken a recursive snapshot of the nvme-pool via the TrueNAS GUI, and then send | recv'd it via the shell to a newly created pool (pool980).

When I navigate pool980, the dataset structure from nvme-pool has been recreated, and the data in any folders at the root of nvme-pool has been copied to the new pool. But the "subvol" datasets are entirely empty, including not having a .zfs folder.

When I run zfs list poo980/nvme-pool/subvol-101-disk-0, it says (on TrueNAS) "cannot open 'poo980/nvme-pool/subvol-101-disk-0' : dataset does not exists".

One other thing I tried was creating a manual, recursive snapshot of just nvme-pool/subvol-101-disk-0 via the TrueNAS GUI. I then send | recv'd via the shell to pool980 and it worked... complete with a hidden .zfs folder.

I don't know whether I'm missing something basic, or something serious is wrong.
Probably stupid question… you took a recursive snapshot. Did you use the -R flag to send?
 
Probably stupid question… you took a recursive snapshot. Did you use the -R flag to send?
Thank you for your response. Before you posted, I had started editing my previous comment with info regarding use of the -R flag. Please see the edit at the bottom of my previous comment for more details.
 
Your prior post was deleted, not edited.

That's weird. So you can't see https://forum.proxmox.com/threads/ive-really-screwed-up-bad-please-help.91408/#post-399582 ? I can see it whilst I'm logged in but not when I open a new private browser window. If anyone could enlighten me as to why it was deleted, I'd be grateful.


Anyway... here's the edit and a small amount of new info-


[EDIT] (Whilst still on my TrueNAS install) I've just tried sending the snapshot I described in the first paragraph; this time, however, I used the -R flag in my command: zfs send -R nvme-pool@manual-2021-06-30_00-06 | zfs recv pool980/manual-test. Everything replicated properly. So I then tried again, but this time I tried replicating a snapshot created by sanoid (zfs send -R nvme-pool@autosnap_2021-06-29_22:00:01_hourly | zfs recv pool980/sanoid-test) and it failed, saying:

Code:
cannot send nvme-pool@autosnap_2021-06-29_22:00:01_hourly recursively: snapshot nvme-pool/subvol-103-disk-0@autosnap_2021-06-29_22:00:01_hourly does not exist
warning: cannot send 'nvme-pool@autosnap_2021-06-29_22:00:01_hourly': backup failed
cannot receive: failed to read from stream


Since writing that, I've booted back into my Proxmox install to try the zfs send -R nvme-pool@autosnap_2021-06-29_22:00:01_hourly | zfs recv pool980/sanoid-test command again and I get the same result.
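For reference, a check along these lines shows whether the child datasets actually have a snapshot with that exact name:

Code:
# list every snapshot on the pool and filter for that hour's autosnap
zfs list -t snapshot -r nvme-pool -o name | grep autosnap_2021-06-29_22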
 
It is really hard to know exactly what is wrong or what you are doing because you are inconsistent in your command descriptions. You need to be using zfs SEND, not zfs SENT. I can't tell if that is a typo only on the forum or in the command you are issuing. The reason you originally only copied 400MB is because that is the command you issued. So far, zfs send | recv has behaved entirely as expected. ZFS is highly stable and predictable. Slow down and read some manuals before you cause any inadvertent data loss.
 
Also keep in mind that zfs send | zfs receive between Proxmox and TrueNAS can cause problems if the two are running different OpenZFS versions. ZFS is only backwards compatible, so sending data in one direction (from a lower to a higher version) will generally work, but not necessarily in the other direction (higher to lower).
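If in doubt, compare the versions on both machines before sending. On reasonably recent OpenZFS something like this works (older FreeBSD-based FreeNAS releases may not have the zfs version subcommand):

Code:
# userland and kernel module versions of OpenZFS
zfs version
# pool features this system supports
zpool upgrade -v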
 
Also keep in mind that zfs send | zfs receive between Proxmox and TrueNAS can cause problems if the two are running different OpenZFS versions. ZFS is only backwards compatible, so sending data in one direction (from a lower to a higher version) will generally work, but not necessarily in the other direction (higher to lower).

I am aware of that but thank you for mentioning it.

It is really hard to know exactly what is wrong or what you are doing because you are inconsistent in your command descriptions. You need to be using zfs SEND, not zfs SENT. I can't tell if that is a typo only on the forum or in the command you are issuing. The reason you originally only copied 400MB is because that is the command you issued. So far, zfs send | recv has behaved entirely as expected. ZFS is highly stable and predictable. Slow down and read some manuals before you cause any inadvertent data loss.

If the "sent" was within a string I had formatted as inline code, then it was a (now fixed) typo.

I think at this point, I need to clarify that there are two main issues, so to speak.

First of all, I may or may not have destroyed my old install with a careless use of the command mv /*/*, which should have been mv ../*/*. At this point, I have a new install up and working, and I've almost entirely replicated the functionality of my old install. But there are one or two things I would like to be able to recover from my original install, namely my container config files. There's still a lingering question in my mind as to whether I can recover that data, given what I've learnt in the last couple of days. So I'm going to take some time and see whether I can accomplish anything by playing around with an image of my original install.

The second issue I've been having is sending and receiving snapshots. I think there were two main reasons for my confusion. The first, to my shame, is that I didn't realize you had to use the -R flag when sending a recursive snapshot. Perhaps I'd just forgotten about the flag, but in any event, lesson learned.

But... my issues with sending and receiving snapshots weren't solely down to the lack of an -R flag. As I said in a previous post (https://forum.proxmox.com/threads/ive-really-screwed-up-bad-please-help.91408/post-399589), even when I used the -R flag, send | recvs still failed. I now know why: it's because I was trying to send a snapshot that was created by Sanoid.

Sanoid, by design, handles the creation of snapshots individually, even when you've configured it to recursively snapshot your pool or dataset in the sanoid.conf file. This can mean that child datasets have timestamps in their names that do not match their "parent" dataset, which means that a recursive zfs send | zfs recv command will fail, but using Syncoid will work. This information comes direct from Jim Salter himself, the guy who created Sanoid. Here's a link to the reddit thread where he gave me the info: https://old.reddit.com/r/zfs/comments/oafmjk/i_think_i_might_thave_a_major_issue_with_my/
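So the practical fix for replicating these datasets is to let Syncoid drive the send/receive. Something like this is what I'll be using (the target name is just a placeholder):

Code:
# recursively replicate nvme-pool and all of its child datasets
syncoid --recursive nvme-pool rpool/nvme-pool-backup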

Just in case the thread disappears though-


either you cannot use zfs send | recv to send Sanoid snapshots and this behaviour is to be expected
Syncoid is zfs send/receive; it's just zfs send/receive done for you. But it literally executes commands on the command line, just as you might (if with a bit more redirection to get extra features).
All syncoid does is make life easier, particularly when it comes to automation.
I find that nvme-pool/subvol-103-disk-0@autosnap_2021-06-29_22:00:01_hourly does not exist but a snapshot with a different timestamp, such as nvme-pool/subvol-103-disk-0@autosnap_2021-06-29_22:00:03_hourly, does.
This is why you can't do a simple zfs send -R—because the snapshots weren't taken with -R in the first place. By default, Sanoid takes snapshots individually, so you can have slightly different timestamps on child datasets than their parents.
Syncoid also manages datasets individually, which is why a recursive syncoid succeeds where your manual attempts failed—any individual dataset replicates fine, but when you try to do a -R recursion, it fails.
I designed it this way because it's more robust. As you're seeing, any time you've got snapshots on a child that don't exist on a parent (or vice versa) things get decidedly flaky with ZFS recursion.
If you really, really want to use ZFS recursion, you'll want to edit your Sanoid configs a bit. recursive=yes causes Sanoid to take snapshots individually, as you've seen. recursive=zfs causes Sanoid to take snapshots using ZFS built-in recursion, with the -R flag.
Using sanoid recursive=zfs will get you child snapshots with the same timestamps and therefore (hopefully) working zfs recursion in replication. But I don't typically recommend it, because, again... it's less flexible, and (as you're seeing) less reliable.
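For reference, if I did want ZFS-native recursion, the change (as I understand it) would just be swapping the recursive setting in the sanoid.conf I posted above, e.g.:

Code:
[nvme-pool]
        use_template = production
        # use ZFS built-in recursion (-r) so child snapshots share the parent's timestamp
        recursive = zfs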
 
