Hello,
I receive many emails with this error:
Code:
command '/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=HOSTNAME' root@1.2.3.4 -- pvesr prepare-local-job 184-0 --scan kvm kvm:vm-184-disk-0 --last_sync 1633621331' failed: exit code 4
The 9-node cluster has worked very well for a year, but I have since added VMs (120 VMs now), and I think the error is due to a timeout of the ZFS listing on the receiving node.
The backup node holds all the replicas and does nothing else (no running VMs):
Code:
zpool status -v
  pool: kvm
 state: ONLINE
  scan: scrub repaired 0B in 0 days 07:26:51 with 0 errors on Sun Sep 12 07:50:52 2021
config:

        NAME                                                     STATE     READ WRITE CKSUM
        kvm                                                      ONLINE       0     0     0
          mirror-0                                               ONLINE       0     0     0
            ata-HGST_HUH721008ALE600_2SG05MBF-part4              ONLINE       0     0     0
            ata-HGST_HUH721008ALE600_2SG08U8F-part4              ONLINE       0     0     0
        logs
          mirror-1                                               ONLINE       0     0     0
            ata-Micron_5100_MTFDDAK240TCB_18271D4AB52F-part3     ONLINE       0     0     0
            ata-SAMSUNG_MZ7WD240HAFV-00003_S16LNYAD904567-part3  ONLINE       0     0     0

errors: No known data errors
Code:
ii pve-cluster 6.2-1 amd64 "pmxcfs" distributed cluster filesystem for Proxmox Virtual Environment.
ii pve-container 3.2-3 all Proxmox VE Container management tool
ii pve-docs 6.2-6 all Proxmox VE Documentation
ii pve-edk2-firmware 2.20200531-1 all edk2 based firmware modules for virtual machines
ii pve-firewall 4.1-3 amd64 Proxmox VE Firewall
ii pve-firmware 3.1-3 all Binary firmware code for the pve-kernel
ii pve-ha-manager 3.1-1 amd64 Proxmox VE HA Manager
ii pve-i18n 2.2-2 all Internationalization support for Proxmox VE
ii pve-kernel-5.4 6.3-1 all Latest Proxmox VE Kernel Image
ii pve-kernel-5.4.73-1-pve 5.4.73-1 amd64 The Proxmox PVE Kernel Image
ii pve-kernel-helper 6.3-1 all Function for various kernel maintenance tasks.
ii pve-lxc-syscalld 0.9.1-1 amd64 PVE LXC syscall daemon
ii pve-manager 6.2-15 amd64 Proxmox Virtual Environment Management Tools
ii pve-qemu-kvm 5.1.0-6 amd64 Full virtualization on x86 hardware
ii pve-xtermjs 4.7.0-2 amd64 Binaries built from the Rust termproxy crate
ii pve-zsync 2.0-3 all Proxmox VE ZFS syncing tool
ii smartmontools 7.1-pve2 amd64 control and monitor storage systems using S.M.A.R.T.
ii zfs-zed 0.8.5-pve1 amd64 OpenZFS Event Daemon
ii zfsutils-linux 0.8.5-pve1 amd64 command-line tools to manage OpenZFS filesystems
When I run the command
Code:
zfs list -o name,volsize,origin,type,refquota -t volume,filesystem -Hrp
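sometimes the output is very fast, which I think means it is cached, but sometimes it takes 5-10 seconds.
I think this causes the random replication failures.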
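To put numbers on the variation, a small sketch like the one below could time the listing several times in a row. The helper name `time_runs` is made up for illustration, and the script assumes GNU `date` (for the `%N` nanosecond format), which should be the case on a Proxmox host.

```shell
# Run a command several times and print how long each run took.
# Usage: time_runs <count> <command> [args...]
# NOTE: time_runs is a hypothetical helper, not part of Proxmox or ZFS.
time_runs() {
    count=$1; shift
    i=1
    while [ "$i" -le "$count" ]; do
        start=$(date +%s%N)              # nanoseconds since epoch (GNU date)
        "$@" > /dev/null 2>&1            # discard output, only timing matters
        end=$(date +%s%N)
        echo "run $i: $(( (end - start) / 1000000 )) ms"
        i=$((i + 1))
    done
}

# Example (run on the receiving node), using the same listing as above:
# time_runs 10 zfs list -o name,volsize,origin,type,refquota -t volume,filesystem -Hrp
```

Running this while replication jobs are active and again while the node is idle might show whether the slow runs line up with the failures.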
Is there a way to improve caching? I currently have two SSDs as log devices, but I could use them for something else.