I'm seeing quite a few ZFS replication failures and I'm not sure how to diagnose the root cause.
I'm just replicating 1 VM to a second Proxmox host. The second host is replicating 1 VM back to the first. Each VM has 2 virtual disks with iothread=1. I've had the same issue with iothread disabled as well. I have the replication set to run every 5 minutes.
What are some things to look at?
Code:
Feb 05 11:40:09 vmhost3.dev20.example.com pvesr[206905]: command 'zfs destroy zfs-dev20/vm-104-disk-1@__replicate_104-0_1580920200__' failed: got timeout
Feb 05 11:41:59 vmhost3.dev20.example.com pvesr[206905]: command 'zfs destroy zfs-dev20/vm-104-disk-1@__replicate_104-0_1580919600__' failed: got timeout
Feb 05 11:45:09 vmhost3.dev20.example.com pvesr[270130]: 104-0: got unexpected replication job error - command 'zfs snapshot zfs-dev20/vm-104-disk-0@__replicate_104-0_1580921100__' failed: got timeout
Feb 05 11:50:07 vmhost3.dev20.example.com pvesr[331309]: command 'zfs destroy zfs-dev20/vm-104-disk-0@__replicate_104-0_1580921100__' failed: got timeout
Feb 05 11:51:35 vmhost3.dev20.example.com pvesr[331309]: command 'zfs destroy zfs-dev20/vm-104-disk-0@__replicate_104-0_1580920800__' failed: got timeout
Feb 05 11:51:40 vmhost3.dev20.example.com pvesr[331309]: command 'zfs destroy zfs-dev20/vm-104-disk-1@__replicate_104-0_1580920800__' failed: got timeout
Feb 05 11:56:17 vmhost3.dev20.example.com pvesr[392435]: command 'zfs destroy zfs-dev20/vm-104-disk-0@__replicate_104-0_1580921400__' failed: got timeout
Feb 05 11:56:25 vmhost3.dev20.example.com pvesr[392435]: command 'zfs destroy zfs-dev20/vm-104-disk-1@__replicate_104-0_1580921400__' failed: got timeout
Feb 05 12:05:07 vmhost3.dev20.example.com pvesr[518555]: 104-0: got unexpected replication job error - command 'zfs snapshot zfs-dev20/vm-104-disk-0@__replicate_104-0_1580922300__' failed: got timeout
Feb 05 12:10:10 vmhost3.dev20.example.com pvesr[576697]: command 'zfs destroy zfs-dev20/vm-104-disk-0@__replicate_104-0_1580922300__' failed: got timeout
Feb 05 12:10:16 vmhost3.dev20.example.com pvesr[576697]: 104-0: got unexpected replication job error - command 'zfs snapshot zfs-dev20/vm-104-disk-0@__replicate_104-0_1580922600__' failed: got timeout
Feb 05 12:20:06 vmhost3.dev20.example.com pvesr[690055]: command 'zfs destroy zfs-dev20/vm-104-disk-0@__replicate_104-0_1580922600__' failed: got timeout
Feb 05 12:21:50 vmhost3.dev20.example.com pvesr[690055]: command 'zfs destroy zfs-dev20/vm-104-disk-0@__replicate_104-0_1580922000__' failed: got timeout
Feb 05 12:22:00 vmhost3.dev20.example.com pvesr[690055]: command 'zfs destroy zfs-dev20/vm-104-disk-1@__replicate_104-0_1580922000__' failed: got timeout
Feb 05 12:25:09 vmhost3.dev20.example.com pvesr[768434]: 104-0: got unexpected replication job error - command 'zfs snapshot zfs-dev20/vm-104-disk-0@__replicate_104-0_1580923500__' failed: got timeout
Feb 05 12:30:09 vmhost3.dev20.example.com pvesr[818339]: 104-0: got unexpected replication job error - command 'zfs snapshot zfs-dev20/vm-104-disk-0@__replicate_104-0_1580923800__' failed: got timeout
Feb 05 12:40:06 vmhost3.dev20.example.com pvesr[944399]: command 'zfs destroy zfs-dev20/vm-104-disk-0@__replicate_104-0_1580923800__' failed: got timeout
Feb 05 12:41:51 vmhost3.dev20.example.com pvesr[944399]: command 'zfs destroy zfs-dev20/vm-104-disk-0@__replicate_104-0_1580923200__' failed: got timeout
Feb 05 12:41:58 vmhost3.dev20.example.com pvesr[944399]: command 'zfs destroy zfs-dev20/vm-104-disk-1@__replicate_104-0_1580923200__' failed: got timeout
Feb 05 12:45:08 vmhost3.dev20.example.com pvesr[1021813]: 104-0: got unexpected replication job error - command 'zfs snapshot zfs-dev20/vm-104-disk-0@__replicate_104-0_1580924700__' failed: got timeout
Feb 05 12:55:21 vmhost3.dev20.example.com pvesr[1146265]: command 'zfs destroy zfs-dev20/vm-104-disk-0@__replicate_104-0_1580925300__' failed: got timeout
Feb 05 12:55:21 vmhost3.dev20.example.com pvesr[1146265]: 104-0: got unexpected replication job error - command 'zfs snapshot zfs-dev20/vm-104-disk-1@__replicate_104-0_1580925300__' failed: got timeout
Feb 05 13:00:10 vmhost3.dev20.example.com pvesr[1208270]: 104-0: got unexpected replication job error - command 'zfs snapshot zfs-dev20/vm-104-disk-0@__replicate_104-0_1580925600__' failed: got timeout
Feb 05 13:16:09 vmhost3.dev20.example.com pvesr[1402147]: command 'zfs destroy zfs-dev20/vm-104-disk-0@__replicate_104-0_1580926200__' failed: got timeout
Feb 05 13:16:16 vmhost3.dev20.example.com pvesr[1402147]: command 'zfs destroy zfs-dev20/vm-104-disk-1@__replicate_104-0_1580926200__' failed: got timeout
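
To see which operations and disks time out most often, a log excerpt like the one above could be tallied with a short script. This is only a minimal sketch in Python: the regular expression is an assumption derived from the lines shown here, not from a documented log format, and `tally_timeouts` is a hypothetical helper name that is not part of pvesr or ZFS.

```python
import re
from collections import Counter

# Pattern inferred from the pvesr lines in this post (an assumption, not a
# documented format): captures the zfs operation, the dataset and the snapshot.
LINE_RE = re.compile(
    r"command 'zfs (?P<op>\w+) (?P<dataset>[^@']+)@(?P<snap>[^']+)' failed: got timeout"
)


def tally_timeouts(lines):
    """Count timed-out zfs commands per (operation, dataset) pair."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[(match.group("op"), match.group("dataset"))] += 1
    return counts


# Two lines copied from the log above, used as sample input.
sample = [
    "Feb 05 11:40:09 vmhost3.dev20.example.com pvesr[206905]: command 'zfs destroy "
    "zfs-dev20/vm-104-disk-1@__replicate_104-0_1580920200__' failed: got timeout",
    "Feb 05 11:45:09 vmhost3.dev20.example.com pvesr[270130]: 104-0: got unexpected "
    "replication job error - command 'zfs snapshot "
    "zfs-dev20/vm-104-disk-0@__replicate_104-0_1580921100__' failed: got timeout",
]

if __name__ == "__main__":
    # Print one row per (operation, dataset) with its timeout count.
    for (op, dataset), n in sorted(tally_timeouts(sample).items()):
        print(f"{op:9s} {dataset:30s} {n}")
```

Feeding it the full journal output instead of `sample` would show whether the timeouts cluster on `zfs snapshot` or `zfs destroy`, and on one disk or both, which may help narrow down where to look next.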