Hello everyone,
When I run a Proxmox backup to a Proxmox Backup Server, the backup fails with:
Code:
ERROR: backup write data failed: command error: write_data upload error: pipelined request failed: Structure needs cleaning (os error 117)
INFO: aborting backup job
ERROR: Backup of VM 100 failed - backup write data failed: command error: write_data upload error: pipelined request failed: Structure needs cleaning (os error 117)
I am stuck with this error and have no idea how to get more information about it. From searching many threads it looks related to a filesystem error, but every scenario I found looks different from mine, and honestly I am not sure I am following the right clues. Thanks in advance to anyone who might have some tips for me.
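If it helps to narrow this down, the only extra places I know to look for detail are the task log on the PVE node and the journal of the backup services on the PBS side, roughly like this:
Code:
# on the PVE node: find the full task log of the failed backup job
grep -r "os error 117" /var/log/pve/tasks/
# on the PBS server: recent messages from the backup services
journalctl -u proxmox-backup -u proxmox-backup-proxy --since "1 hour ago"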
My scenario:
I am migrating my VMware ESXi nodes to Proxmox, starting with only one node:
Proxmox VE 6.3-3 single node
TrueNAS Core 12.0-U2 with one ZFS pool shared with iSCSI
Storage is ZFS over iSCSI and I am using FreeNAS-API iSCSI Provider by https://github.com/TheGrandWazoo/freenas-proxmox
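For reference, the storage definition in /etc/pve/storage.cfg looks roughly like this (the portal and target values below are placeholders, and I am leaving out the FreeNAS API credential options that the plugin adds):
Code:
zfs: freenas-iscsi
        iscsiprovider freenas
        pool pool01
        portal 192.168.1.10
        target iqn.2005-10.org.freenas.ctl:proxmox
        blocksize 4k
        sparse 1
        content images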
I migrated an ESXi Virtual Machine using this tutorial: http://wiki.ixcsoft.com.br/index.php/Convertendo_discos_.vmdk_para_Proxmox
1. Transferred the flat vmdk from the ESXi datastore to an NFS share using scp. It is an 11 TB vmdk file;
2. Converted the flat-vmdk with: qemu-img convert source_file.vmdk dest_file.raw
3. Created a Proxmox VM with a disk size of 15TB;
4. dd'd the raw output of qemu-img convert onto the VM disk: dd if=/path_to_vmdk-flat.raw of=/dev/zvol/pool01/vm-100-disk-1 bs=1M status=progress
5. Started the VM and grew the partition, PV, LV and filesystem to use the new 15 TB size (parted; resizepart; pvresize; lvextend; xfs_growfs -- roughly the sequence sketched below). This ran with no errors and I can see the new size in df -h, so it looks good.
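For completeness, the grow sequence in step 5 was roughly the following (from memory; partition 3 is the LVM PV, and the var LV with its /var mount point is just the example I am showing here):
Code:
# grow partition 3 to the end of the disk (interactive parted session)
parted /dev/sda
(parted) resizepart 3 100%
(parted) quit
# let LVM see the bigger PV, grow the LV, then grow the XFS filesystem online
pvresize /dev/sda3
lvextend -l +100%FREE /dev/cl_mailserver/var
xfs_growfs /var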
The problem happens when I run the backup to the Proxmox Backup Server; the error is the one quoted above. But I don't know what is broken: the ZFS zvol, or the VM's filesystem? I investigated the VM filesystem with the following commands:
Booted the VM from a CentOS 7 CD into Troubleshooting mode:
Code:
# fdisk -l /dev/sda
WARNING: fdisk GPT support is currently new, and therefore in an experimental phase. Use at your own discretion.
Disk /dev/sda: 16492.7 GB, 16492674416640 bytes, 32212254720 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: gpt
Disk identifier: 59C3315C-52CE-469C-807C-1A4A1F694775
# Start End Size Type Name
1 2048 4095 1M BIOS boot
2 4096 2101247 1G Microsoft basic
3 2101248 32212254686 15T Linux LVM
I checked the LVM volumes with xfs_repair to see whether any errors were reported, but the output looks fine to me:
xfs_repair /dev/cl_mailserver/root
xfs_repair /dev/cl_mailserver/var
xfs_repair /dev/sda2
Code:
sh-4.2# xfs_repair /dev/cl_mailbox01/var
Phase 1 - find and verify superblock…
Phase 2 - using internal log
- zero log…
- scan filesystem freespace and inode maps…
- found root inode chunk
Phase 3 - for each AG…
- scan and clear agi unlinked lists…
- process known inodes and perform inode discovery…
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- process newly discovered inodes…
Phase 4 - check for duplicate blocks…
- setting up duplicate extent list…
- check for inodes claiming duplicate blocks…
- agno = 0
- agno = 1
- agno = 2
- agno = 3
Phase 5 - rebuild AG headers and trees…
- reset superblock…
Phase 6 - check inode connectivity…
- resetting contents of realtime bitmap and summary inodes
- traversing filesystem …
- traversal finished …
- moving disconnected inodes to lost+found …
Phase 7 - verify and correct link counts…
done
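(For what it's worth, I understand xfs_repair can also be run in a read-only, no-modify mode that just reports problems, e.g.:)
Code:
# no-modify mode: scan and report only, do not write to the filesystem
xfs_repair -n /dev/cl_mailserver/root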
But when I check the BIOS boot partition, it returns this:
Code:
# e2fsck /dev/sda1
e2fsck 1.42.9 (28-Dec-2013)
ext2fs_open2: Bad magic number in super-block
e2fsck: Superblock invalid, trying backup blocks...
e2fsck: Bad magic number in super-block while trying to open /dev/sda1
The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>
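If it would help, I can also post what blkid and lsblk report for that partition; I would collect that with something like:
Code:
blkid /dev/sda1
lsblk -f /dev/sda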
I couldn't find another thread that matches my environment.
I am not sure the backup problem is related to the superblock errors on the BIOS boot partition /dev/sda1, but that is the only filesystem error I have found so far. Then I followed this clue:
Code:
# mke2fs -n /dev/sda1
mke2fs 1.42.9 (28-Dec-2013)
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
Stride=0 blocks, Stripe width=0 blocks
128 inodes, 1024 blocks
51 blocks (4.98%) reserved for the super user
First data block=1
Maximum filesystem blocks=1048576
1 block group
8192 blocks per group, 8192 fragments per group
128 inodes per group
# dumpe2fs /dev/sda1
dumpe2fs 1.42.9 (28-Dec-2013)
dumpe2fs: Bad magic number in super-block while trying to open /dev/sda1
Couldn't find valid filesystem superblock.
# mke2fs -S /dev/sda1
mke2fs 1.42.9 (28-Dec-2013)
Discarding device blocks: done
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
Stride=0 blocks, Stripe width=0 blocks
128 inodes, 1024 blocks
51 blocks (4.98%) reserved for the super user
First data block=1
Maximum filesystem blocks=1048576
1 block group
8192 blocks per group, 8192 fragments per group
128 inodes per group
Allocating group tables: done
Writing superblocks and filesystem accounting information: done
The command below outputs a very long list of errors, each followed by Clear? yes. I trimmed the output for brevity; a second e2fsck -y /dev/sda1 afterwards reports no errors.
Code:
# e2fsck -y /dev/sda1
e2fsck 1.42.9 (28-Dec-2013)
/dev/sda1 contains a file system with errors, check forced.
Resize inode not valid. Recreate? yes
Pass 1: Checking inodes, blocks, and sizes
Inode 1 has EXTENTS_FL flag set on filesystem without extents support.
Clear? yes
Root inode is not a directory. Clear? yes
Quota inode is not in use, but contains data. Clear? yes
Quota inode is not in use, but contains data. Clear? yes
Inode 5 has EXTENTS_FL flag set on filesystem without extents support.
Clear? yes
Inode 6 has EXTENTS_FL flag set on filesystem without extents support.
Clear? yes
Journal inode is not in use, but contains data. Clear? yes
# e2fsck -y /dev/sda1
e2fsck 1.42.9 (28-Dec-2013)
/dev/sda1: clean, 14/128 files, 27/1024 blocks
# grub2-install /dev/sda
Installing for i386-pc platform.
Installation finished. No error reported.
# e2fsck -y /dev/sda1
e2fsck 1.42.9 (28-Dec-2013)
ext2fs_open2: Bad magic number in super-block
e2fsck: Superblock invalid, trying backup blocks...
e2fsck: Bad magic number in super-block while trying to open /dev/sda1
The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>
I don't know whether /dev/sda1 is the cause of the backup error, but I find it curious that after I reinstall GRUB, the superblock errors come back.
Any clues on this?