ZFS data loss

Rafael Gómez

New Member
Nov 4, 2018
Spain
Hi, everyone.

I'm running Proxmox on a workstation. The system is on an LVM SSD. A 5-disk (2 TB each) raidz1 pool is used as storage for virtual disks and data. The system had been running for 3 years without notable problems. The ZFS pool is mounted in a folder of the Debian base system. A folder with data is shared with the VMs through Samba.

The power supply failed suddenly, with massive damage: the mainboard and half of the RAM were destroyed.

I replaced the mainboard with the same model, installed 8 GB of RAM for now, and assembled the system again. Due to a mistake, only the system disk and 2 of the hard disks were detected on the first boot. The system was turned off and the rest of the disks were added successfully.

The system and VMs work again, but only the subvols on the RAID survived. All shared data was lost.

The ZFS pool appeared to be OK, but the data wasn't there:

root@cloud:~# zpool status
  pool: AlmacenZ1
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(5) for details.
  scan: scrub repaired 0B in 19h15m with 0 errors on Sun Nov 4 03:12:59 2018
config:

        NAME                        STATE     READ WRITE CKSUM
        AlmacenZ1                   ONLINE       0     0     0
          raidz1-0                  ONLINE       0     0     0
            wwn-0x50024e9206796bc1  ONLINE       0     0     0
            wwn-0x5000039ffaee6354  ONLINE       0     0     0
            sdd                     ONLINE       0     0     0
            wwn-0x50014ee20a595e97  ONLINE       0     0     0
            wwn-0x50014ee2b5052af4  ONLINE       0     0     0

errors: No known data errors

The dataset called "almacen" (the one that contains the interesting data, the share) shows 6.12 TB used (that's OK) and 597 GB of free space (also correct). But I cannot see the files and directories on this dataset (over 6 TB of data are "invisible").

root@cloud:/AlmacenZ1/almacen# zfs list
NAME                          USED  AVAIL  REFER  MOUNTPOINT
AlmacenZ1                    6.43T   597G   153K  /AlmacenZ1
AlmacenZ1/almacen            6.12T   597G  6.12T  /AlmacenZ1/almacen
AlmacenZ1/subvol-104-disk-1  1.09G  28.9G  1.09G  /AlmacenZ1/subvol-104-disk-1
AlmacenZ1/subvol-113-disk-2  6.32G   114G  6.32G  /AlmacenZ1/subvol-113-disk-2
AlmacenZ1/subvol-114-disk-0  13.1G  6.95G  13.1G  /AlmacenZ1/subvol-114-disk-0
AlmacenZ1/vm-100-disk-1      30.9G   598G  29.8G  -
AlmacenZ1/vm-101-disk-1       103G   659G  41.1G  -
AlmacenZ1/vm-102-disk-1       191M   597G   191M  -
AlmacenZ1/vm-103-disk-1      30.9G   597G  30.9G  -
AlmacenZ1/vm-109-disk-1       127G   597G   127G  -

As you can see, I'm not an experienced Linux system admin.

Is there any possibility of getting my data back?

Thanks for reading and for your help.
 
Hi,

If a ZFS pool is OK (as I see in your case), then it is very unlikely that you have lost any data.
 
I agree. Some months ago, one of the hard disks died and it was replaced without problems or downtime. Quite a clean and easy process.

If files are still there... where are they?

As I said, I'm not a sysadmin. Any clue in the right direction is welcome.
 
If the files that you need are not where you expected them to be, then it can only be one of these possibilities:

- someone or something has deleted them, or has just moved them to another location

If you have set up some kind of ZFS snapshot, then you can find the last snapshot that still has these files.
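
For example (a minimal sketch; the snapshot and file names here are hypothetical):

Code:
# list all snapshots on the system
zfs list -t snapshot
# snapshots are also browsable under the hidden .zfs directory of each dataset
ls /AlmacenZ1/almacen/.zfs/snapshot/
# copy a lost file back out of a snapshot (hypothetical names)
cp -a /AlmacenZ1/almacen/.zfs/snapshot/daily-2018-11-03/somefile /AlmacenZ1/almacen/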
 
I really don't know; I think both possibilities can be ruled out.
- The server didn't go online after the crash, so nobody had access to delete the data. My first access was through the Proxmox web interface, and the data was already gone.
- There are no snapshots configured:

root@cloud:/# zfs list -t snapshot
no datasets available

In the next quote we can see the dataset "/AlmacenZ1/almacen" with 6.12 TB used and 597 GB available (everything as expected). Then "du /AlmacenZ1/almacen -h" says that only 107 MB are used.

That difference is the amount of lost data.

root@cloud:/AlmacenZ1# zfs list
NAME                          USED  AVAIL  REFER  MOUNTPOINT
AlmacenZ1                    6.43T   597G   153K  /AlmacenZ1
AlmacenZ1/almacen            6.12T   597G  6.12T  /AlmacenZ1/almacen
AlmacenZ1/subvol-104-disk-1  1.09G  28.9G  1.09G  /AlmacenZ1/subvol-104-disk-1
AlmacenZ1/subvol-113-disk-2  6.32G   114G  6.32G  /AlmacenZ1/subvol-113-disk-2
AlmacenZ1/subvol-114-disk-0  13.1G  6.95G  13.1G  /AlmacenZ1/subvol-114-disk-0
AlmacenZ1/vm-100-disk-1      30.9G   598G  29.8G  -
AlmacenZ1/vm-101-disk-1       103G   659G  41.1G  -
AlmacenZ1/vm-102-disk-1       191M   597G   191M  -
AlmacenZ1/vm-103-disk-1      30.9G   597G  30.9G  -
AlmacenZ1/vm-109-disk-1       127G   597G   127G  -

root@cloud:/AlmacenZ1# du /AlmacenZ1/almacen -h
4.0K    /AlmacenZ1/almacen/archivos-proxmox/private
4.0K    /AlmacenZ1/almacen/archivos-proxmox/template/cache
4.0K    /AlmacenZ1/almacen/archivos-proxmox/template/iso
12K     /AlmacenZ1/almacen/archivos-proxmox/template
4.0K    /AlmacenZ1/almacen/archivos-proxmox/images
107M    /AlmacenZ1/almacen/archivos-proxmox/dump
107M    /AlmacenZ1/almacen/archivos-proxmox
107M    /AlmacenZ1/almacen
root@cloud:/AlmacenZ1#
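
A quick check when "zfs list" and "du" disagree like this is whether the dataset is actually mounted at all (a minimal sketch, reusing the dataset name from above):

Code:
# "mounted" should say "yes" for a healthy filesystem dataset
zfs get mounted,mountpoint AlmacenZ1/almacen
# findmnt shows what, if anything, is really mounted at that path
findmnt /AlmacenZ1/almacen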
 
This is the result of "zfs get all AlmacenZ1/almacen":

root@cloud:~# zfs get all AlmacenZ1/almacen
NAME               PROPERTY              VALUE                  SOURCE
AlmacenZ1/almacen  type                  filesystem             -
AlmacenZ1/almacen  creation              Wed Jun 28 18:10 2017  -
AlmacenZ1/almacen  used                  6.12T                  -
AlmacenZ1/almacen  available             597G                   -
AlmacenZ1/almacen  referenced            6.12T                  -
AlmacenZ1/almacen  compressratio         1.00x                  -
AlmacenZ1/almacen  mounted               no                     -
AlmacenZ1/almacen  quota                 none                   default
AlmacenZ1/almacen  reservation           none                   default
AlmacenZ1/almacen  recordsize            128K                   default
AlmacenZ1/almacen  mountpoint            /AlmacenZ1/almacen     default
AlmacenZ1/almacen  sharenfs              off                    default
AlmacenZ1/almacen  checksum              on                     default
AlmacenZ1/almacen  compression           off                    default
AlmacenZ1/almacen  atime                 on                     default
AlmacenZ1/almacen  devices               on                     default
AlmacenZ1/almacen  exec                  on                     default
AlmacenZ1/almacen  setuid                on                     default
AlmacenZ1/almacen  readonly              off                    default
AlmacenZ1/almacen  zoned                 off                    default
AlmacenZ1/almacen  snapdir               hidden                 default
AlmacenZ1/almacen  aclinherit            restricted             default
AlmacenZ1/almacen  createtxg             1154                   -
AlmacenZ1/almacen  canmount              on                     default
AlmacenZ1/almacen  xattr                 on                     default
AlmacenZ1/almacen  copies                1                      default
AlmacenZ1/almacen  version               5                      -
AlmacenZ1/almacen  utf8only              off                    -
AlmacenZ1/almacen  normalization         none                   -
AlmacenZ1/almacen  casesensitivity       sensitive              -
AlmacenZ1/almacen  vscan                 off                    default
AlmacenZ1/almacen  nbmand                off                    default
AlmacenZ1/almacen  sharesmb              off                    default
AlmacenZ1/almacen  refquota              none                   default
AlmacenZ1/almacen  refreservation        none                   default
AlmacenZ1/almacen  guid                  12035703846605059671   -
AlmacenZ1/almacen  primarycache          all                    default
AlmacenZ1/almacen  secondarycache        all                    default
AlmacenZ1/almacen  usedbysnapshots       0B                     -
AlmacenZ1/almacen  usedbydataset         6.12T                  -
AlmacenZ1/almacen  usedbychildren        0B                     -
AlmacenZ1/almacen  usedbyrefreservation  0B                     -
AlmacenZ1/almacen  logbias               latency                default
AlmacenZ1/almacen  dedup                 off                    default
AlmacenZ1/almacen  mlslabel              none                   default
AlmacenZ1/almacen  sync                  standard               default
AlmacenZ1/almacen  dnodesize             legacy                 default
AlmacenZ1/almacen  refcompressratio      1.00x                  -
AlmacenZ1/almacen  written               6.12T                  -
AlmacenZ1/almacen  logicalused           6.13T                  -
AlmacenZ1/almacen  logicalreferenced     6.13T                  -
AlmacenZ1/almacen  volmode               default                default
AlmacenZ1/almacen  filesystem_limit      none                   default
AlmacenZ1/almacen  snapshot_limit        none                   default
AlmacenZ1/almacen  filesystem_count      none                   default
AlmacenZ1/almacen  snapshot_count        none                   default
AlmacenZ1/almacen  snapdev               hidden                 default
AlmacenZ1/almacen  acltype               off                    default
AlmacenZ1/almacen  context               none                   default
AlmacenZ1/almacen  fscontext             none                   default
AlmacenZ1/almacen  defcontext            none                   default
AlmacenZ1/almacen  rootcontext           none                   default
AlmacenZ1/almacen  relatime              off                    default
AlmacenZ1/almacen  redundant_metadata    all                    default
AlmacenZ1/almacen  overlay               off                    default
root@cloud:~#

I didn't know this command. Now I can see that it is not mounted! Time to investigate why, and how to mount it.
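
Trying the mount by hand usually reveals the reason; a minimal sketch (the error shown is what ZFS typically prints when the mountpoint directory is not empty):

Code:
root@cloud:~# zfs mount AlmacenZ1/almacen
cannot mount '/AlmacenZ1/almacen': directory is not empty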
 
I think I now understand the problem: you pointed me in the right direction.

When the system first started with only a couple of the pool's disks, Proxmox re-created the storage folders in /AlmacenZ1/almacen.
Then, when ZFS tried to mount the filesystem, the mountpoint contained those files and the process could not be completed.

I tried to delete the folders that Proxmox creates, but they were re-created immediately. I guess I need to stop some Proxmox service to avoid the folder creation.

Does anyone know how I can do that?

EDIT:
I just disabled the storage via the web interface, manually deleted the files, and ran "zfs mount -a"; the filesystem was mounted.
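
Roughly these steps from the shell (a sketch; "almacen" as the Proxmox storage ID and the recreated directory name are assumptions based on the "du" output above):

Code:
# disable the Proxmox storage so its directories are not re-created (assumed storage ID)
pvesm set almacen --disable 1
# remove what Proxmox created inside the (otherwise empty) mountpoint
rm -r /AlmacenZ1/almacen/archivos-proxmox
# mount all ZFS filesystems
zfs mount -a
# re-enable the storage once the dataset is mounted
pvesm set almacen --disable 0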

All data is safe. Thanks for your assistance.
 
Good to see that you solved your problem. I suspected something like this, that your dataset was not mounted.
 
A little trick for next time:
Code:
zfs mount -O -a

That tells ZFS that it's OK to "O"verlay existing directories, which will allow the mounts to succeed.
 
Thanks for your help.

I'm slowly reading Sun's ZFS documentation and looking for a new, more convenient backup strategy. This has been a good warning!
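
For instance, snapshots plus "zfs send" are one common building block for such a strategy (a minimal sketch; the snapshot name, target host, and target pool are hypothetical):

Code:
# take a read-only point-in-time snapshot of the share
zfs snapshot AlmacenZ1/almacen@backup-2018-11-10
# replicate it to a pool on another machine over ssh (-u: do not mount on receive)
zfs send AlmacenZ1/almacen@backup-2018-11-10 | ssh backuphost zfs receive -u backuppool/almacen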
 
Hi Rafael!

You should start reading and then test your knowledge using plain files, without any risk. You can create a new pool using 2 small files (1 GB each) and then test some worst-case scenarios: replacing a file-backed disk, and so on. Note down on paper what succeeded and what didn't work.
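
For example (a minimal sketch; the file paths and pool name are made up):

Code:
# create two 1 GB sparse files to act as disks
truncate -s 1G /tmp/zdisk1.img /tmp/zdisk2.img
# build a mirrored practice pool from them (file vdevs need absolute paths)
zpool create testpool mirror /tmp/zdisk1.img /tmp/zdisk2.img
zpool status testpool
# simulate replacing a failed disk with a third file
truncate -s 1G /tmp/zdisk3.img
zpool replace testpool /tmp/zdisk1.img /tmp/zdisk3.img
# destroy everything when done practicing
zpool destroy testpool
rm /tmp/zdisk*.img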

Good luck!
 
