As far as I know, this issue is only fixed in 4.18 and later. You may be able to use a kernel from Ubuntu or Debian Backports, but I didn't have any luck due to missing ZFS support and/or hardware modules in those kernels. I'm currently building my own kernels to track 4.19 + ZFS + the hardware I need...
Proxmox is (Debian) Linux, so you may want to Google "Linux backup software". If you used ZFS for your install, you can take a snapshot and send it to a file.
The issue with just copying files is that the system won't be in a crash-consistent state and you won't be getting a copy of the Master...
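If you do go the ZFS route, a rough sketch of the snapshot-and-send approach (the snapshot name and output path are placeholders, and "rpool" assumes the installer's default pool name):
zfs snapshot -r rpool@backup1
zfs send -R rpool@backup1 > /mnt/external/rpool-backup1.zfs
That file can later be restored with zfs receive.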
You may want to check out:
https://www.phoronix.com/scan.php?page=article&item=freebsd-12-zfs&num=1
Every filesystem has a use case where it shines. If you're looking for raw sequential throughput, no CoW filesystem is going to compete with ext4 in RAID0. You can try these safe tuning options...
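Assuming ZFS here, a few properties that are commonly tuned and generally considered low-risk (dataset name is a placeholder, adjust to your pool):
zfs set atime=off tank/data
zfs set compression=lz4 tank/data
zfs set xattr=sa tank/data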
The issue is still present but less frequently encountered in the 4.15.x line. See: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1779678
I saw it as recently as 4.15.18-8-pve and moved to custom 4.18 and 4.19 kernels afterwards. As in the bug report, I haven't seen the issue on these...
If the new drive is at least as big as the one you're replacing, you can add it without any partitioning; ZFS will take care of that for you during the replace operation.
You can use either format to reference the drive, but /dev/disk/by-id is the recommended approach as it won't vary if you...
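For example, you can list the stable names with:
ls -l /dev/disk/by-id/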
Looking at the output of your zpool status, I see the "old" (original?) drive and another drive that was presumably the first replacement that failed. I would be inclined to leave those for now, add the new drive and do:
zpool replace POOLNAME ORIGINAL_DRIVE SECOND_NEW_DRIVE
If that does not...
Yes, a reboot will clear it up -- I'm not aware of any way to recover a system in this state without a reboot. My experience has been the same as in that Ubuntu kernel bug report; it's an infrequent condition that presents like a deadlock. We typically go months between incidents on 4.15 kernels...
When the startup is hung, do
grep copy_net_ns /proc/*/stack
If that returns anything, you're having this issue. I can confirm that the issue is still present (but less frequent) on recent 4.15.x Proxmox PVE kernels. The Ubuntu kernel team acknowledged this bug...
As far as I know, it's not related to ZFS. All of the patches are in the mainline kernel (which doesn't contain any ZFS code) related to network namespacing, NFS and cgroups.
I suspect it's related to specific workloads within the containers as we only ever saw it on hypervisors in one...
You may have a cable or SATA port issue. Check dmesg and /var/log/syslog for ATA errors. Then reconcile against:
https://lime-technology.com/wiki/index.php/The_Analysis_of_Drive_Issues#Drive_Interface_Issues
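For example, something along these lines should surface most link/CRC problems:
dmesg | grep -iE 'ata[0-9]|error'
grep -iE 'ata[0-9]|i/o error' /var/log/syslog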
What filesystem type do you have on /dev/md1? Is it BTRFS by chance? If it is and you're using snapshots, they will continue to reference any files you delete from the live filesystem. So to clear space you would need to remove files from the live filesystem AND delete any snapshots that were...
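If it does turn out to be BTRFS, a rough sketch (/mnt/md1 is a placeholder for wherever /dev/md1 is mounted, and the snapshot paths depend on how they were created):
btrfs subvolume list /mnt/md1
btrfs subvolume delete /mnt/md1/SNAPSHOT_NAME
btrfs filesystem df /mnt/md1
Space is reclaimed in the background after a snapshot is deleted, so it can take a little while to show up.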
NOTE: This is just a caution for the Proxmox kernel team and anyone that might be building their own kernels. As far as I know, the problematic change is not present in the official Proxmox 4.15 kernel.
An issue with identical symptoms has emerged in 4.16 and is patched in 4.17. See...
Some things to check / try:
* Use arp-scan to make sure you don't have any duplicated MAC addresses on your network (see the example after this list)
* Install / use a different NIC
* Confirm there is no firewall running on your workstation or anywhere between you and the Proxmox server
* If you're using an ethernet cable on...
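For the arp-scan item above, something like this will print any MAC address that answers for more than one IP (vmbr0 is a guess, substitute your bridge or NIC):
arp-scan --interface=vmbr0 --localnet | awk '/^[0-9]/ {print $2}' | sort | uniq -d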
Also, if you're running zfs auto snapshot on the receiving end, you need to disable snapshots for the receiving filesystem on HOST2 while the first send is in progress:
zfs set com.sun:auto-snapshot=false zfs-vps1/subvol-105-disk-1
After step V, re-enable snapshots on the receiving end:
zfs inherit com.sun:auto-snapshot zfs-vps1/subvol-105-disk-1
That does sound like your bases are covered on the power side, unless the failover / detection circuit is faulty. You can test this by removing and reinserting the power supplies one at a time to confirm the power transfers properly.
If you haven't explicitly set up the kernel to reboot on panic, then it would normally hang rather than reboot for most hardware faults (with a trace on console). If that server has a management device, you can check the log entries (SEL) using ipmitool. If not, reboot and enter the BIOS to see...
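For example (assuming ipmitool is installed and talking to the local BMC):
ipmitool sel elist | tail -n 20
And if you'd rather the box come back on its own after a panic:
sysctl -w kernel.panic=10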