Hello everyone,
When I run a Proxmox backup to a Proxmox Backup Server, the backup fails with:
Code:
ERROR: backup write data failed: command error: write_data upload error: pipelined request failed: Structure needs cleaning (os error 117)
INFO: aborting backup job
ERROR: Backup of VM 100 failed - backup write data failed: command error: write_data upload error: pipelined request failed: Structure needs cleaning (os error 117)
I am stuck with this error and have no idea how to get more information about it. From searching many threads it looks related to a filesystem error, but every scenario I found looks different from mine, and honestly I am not sure I am following the right clues. Thanks in advance to anyone who might have some tips for me.
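If it helps to narrow this down, the only extra places I know to look for detail are the task log on the PVE node and the journal of the backup services on the PBS side, roughly like this:
Code:
# on the PVE node: find the full task log of the failed backup job
grep -r "os error 117" /var/log/pve/tasks/
# on the PBS server: recent messages from the backup services
journalctl -u proxmox-backup -u proxmox-backup-proxy --since "1 hour ago"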
My scenario:
I am migrating my VMware ESXi nodes to Proxmox, starting with only one node:
Proxmox VE 6.3-3 single node
TrueNAS Core 12.0-U2 with one ZFS pool shared with iSCSI
Storage is ZFS over iSCSI and I am using FreeNAS-API iSCSI Provider by https://github.com/TheGrandWazoo/freenas-proxmox
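For reference, the storage definition in /etc/pve/storage.cfg looks roughly like this (the portal and target values below are placeholders, and I am leaving out the FreeNAS API credential options that the plugin adds):
Code:
zfs: freenas-iscsi
        iscsiprovider freenas
        pool pool01
        portal 192.168.1.10
        target iqn.2005-10.org.freenas.ctl:proxmox
        blocksize 4k
        sparse 1
        content images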
I migrated an ESXi Virtual Machine using this tutorial: http://wiki.ixcsoft.com.br/index.php/Convertendo_discos_.vmdk_para_Proxmox
1. Transferred the flat vmdk from the ESXi datastore to an NFS share using scp. It is an 11 TB vmdk file;
2. Converted the flat-vmdk with: qemu-img convert source_file.vmdk dest_file.raw
3. Created a Proxmox VM with a disk size of 15TB;
4. dd'd the raw output of qemu-img convert onto the VM disk: dd if=/path_to_vmdk-flat.raw of=/dev/zvol/pool01/vm-100-disk-1 bs=1M status=progress
5. Started the VM and grew the partition, PV, LV and filesystem to use the new 15 TB size (parted; resizepart; pvresize; lvextend; xfs_growfs -- roughly the sequence sketched below). This ran with no errors and I can see the new size in df -h, so it looks good.
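For completeness, the grow sequence in step 5 was roughly the following (from memory; partition 3 is the LVM PV, and the var LV with its /var mount point is just the example I am showing here):
Code:
# grow partition 3 to the end of the disk (interactive parted session)
parted /dev/sda
(parted) resizepart 3 100%
(parted) quit
# let LVM see the bigger PV, grow the LV, then grow the XFS filesystem online
pvresize /dev/sda3
lvextend -l +100%FREE /dev/cl_mailserver/var
xfs_growfs /var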
The problem happens when I run the backup to the Proxmox Backup Server; the error is the one quoted above. But I don't know what is broken: the ZFS zvol, or the VM's filesystem? I investigated the VM filesystem with the following commands:
Booted the VM from a CentOS 7 CD into Troubleshooting mode:
Code:
# fdisk -l /dev/sda
WARNING: fdisk GPT support is currently new, and therefore in an experimental phase. Use at your own discretion.
Disk /dev/sda: 16492.7 GB, 16492674416640 bytes, 32212254720 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: gpt
Disk identifier: 59C3315C-52CE-469C-807C-1A4A1F694775
# Start End Size Type Name
1 2048 4095 1M BIOS boot
2 4096 2101247 1G Microsoft basic
3 2101248 32212254686 15T Linux LVM
I checked the LVM volumes with xfs_repair to see whether any errors were reported, but the output looks fine to me:
xfs_repair /dev/cl_mailserver/root
xfs_repair /dev/cl_mailserver/var
xfs_repair /dev/sda2
Code:
sh-4.2# xfs_repair /dev/cl_mailbox01/var
Phase 1 - find and verify superblock…
Phase 2 - using internal log
- zero log…
- scan filesystem freespace and inode maps…
- found root inode chunk
Phase 3 - for each AG…
- scan and clear agi unlinked lists…
- process known inodes and perform inode discovery…
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- process newly discovered inodes…
Phase 4 - check for duplicate blocks…
- setting up duplicate extent list…
- check for inodes claiming duplicate blocks…
- agno = 0
- agno = 1
- agno = 2
- agno = 3
Phase 5 - rebuild AG headers and trees…
- reset superblock…
Phase 6 - check inode connectivity…
- resetting contents of realtime bitmap and summary inodes
- traversing filesystem …
- traversal finished …
- moving disconnected inodes to lost+found …
Phase 7 - verify and correct link counts…
done
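(For what it's worth, I understand xfs_repair can also be run in a read-only, no-modify mode that just reports problems, e.g.:)
Code:
# no-modify mode: scan and report only, do not write to the filesystem
xfs_repair -n /dev/cl_mailserver/root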
But when I check the BIOS boot partition, it returns this:
Code:
# e2fsck /dev/sda1
e2fsck 1.42.9 (28-Dec-2013)
ext2fs_open2: Bad magic number in super-block
e2fsck: Superblock invalid, trying backup blocks...
e2fsck: Bad magic number in super-block while trying to open /dev/sda1
The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>
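If it would help, I can also post what blkid and lsblk report for that partition; I would collect that with something like:
Code:
blkid /dev/sda1
lsblk -f /dev/sda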
I couldn't find another thread that matches my environment.
I am not sure the backup problem is related to the superblock errors on the BIOS boot partition /dev/sda1, but that is the only filesystem error I have found so far. Then I followed this clue:
Code:
# mke2fs -n /dev/sda1
mke2fs 1.42.9 (28-Dec-2013)
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
Stride=0 blocks, Stripe width=0 blocks
128 inodes, 1024 blocks
51 blocks (4.98%) reserved for the super user
First data block=1
Maximum filesystem blocks=1048576
1 block group
8192 blocks per group, 8192 fragments per group
128 inodes per group
# dumpe2fs /dev/sda1
dumpe2fs 1.42.9 (28-Dec-2013)
dumpe2fs: Bad magic number in super-block while trying to open /dev/sda1
Couldn't find valid filesystem superblock.
# mke2fs -S /dev/sda1
mke2fs 1.42.9 (28-Dec-2013)
Discarding device blocks: done
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
Stride=0 blocks, Stripe width=0 blocks
128 inodes, 1024 blocks
51 blocks (4.98%) reserved for the super user
First data block=1
Maximum filesystem blocks=1048576
1 block group
8192 blocks per group, 8192 fragments per group
128 inodes per group
Allocating group tables: done
Writing superblocks and filesystem accounting information: done
The command below outputs a very long list of errors, each followed by Clear? yes. I trimmed the output for brevity; a second e2fsck -y /dev/sda1 afterwards reports no errors.
Code:
# e2fsck -y /dev/sda1
e2fsck 1.42.9 (28-Dec-2013)
/dev/sda1 contains a file system with errors, check forced.
Resize inode not valid. Recreate? yes
Pass 1: Checking inodes, blocks, and sizes
Inode 1 has EXTENTS_FL flag set on filesystem without extents support.
Clear? yes
Root inode is not a directory. Clear? yes
Quota inode is not in use, but contains data. Clear? yes
Quota inode is not in use, but contains data. Clear? yes
Inode 5 has EXTENTS_FL flag set on filesystem without extents support.
Clear? yes
Inode 6 has EXTENTS_FL flag set on filesystem without extents support.
Clear? yes
Journal inode is not in use, but contains data. Clear? yes
# e2fsck -y /dev/sda1
e2fsck 1.42.9 (28-Dec-2013)
/dev/sda1: clean, 14/128 files, 27/1024 blocks
# grub2-install /dev/sda
Installing for i386-pc platform.
Installation finished. No error reported.
# e2fsck -y /dev/sda1
e2fsck 1.42.9 (28-Dec-2013)
ext2fs_open2: Bad magic number in super-block
e2fsck: Superblock invalid, trying backup blocks...
e2fsck: Bad magic number in super-block while trying to open /dev/sda1
The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>
I don't know whether /dev/sda1 is the cause of the backup error, but I find it curious that after I reinstall GRUB, the superblock errors come back.
Any clues on this?