Good morning,
I have been facing an odd problem for a while now: my ZFS disk replications fail far more often on weekends.
Every weekend I receive roughly 200 mails from two nodes in a cluster, with the following error message:
Node 1:
Code:
Replication job 105-0 with target 'proxmox2' and schedule 'mon..sat 10,30,50' failed!
Last successful sync: 2023-06-24 09:30:53
Next sync try: 2023-06-24 09:55:00
Failure count: 1
Error:
command 'zfs snapshot tank1/data1/vm-105-disk-0@__replicate_105-0_1687593216__' failed: got timeout
Node 2:
Code:
Replication job 102-0 with target 'proxmox1' and schedule 'mon..sat 00,20,40' failed!
Last successful sync: 2023-06-24 09:41:55
Next sync try: 2023-06-24 10:05:00
Failure count: 1
Error:
command 'zfs snapshot tank1/data1/vm-102-disk-0@__replicate_102-0_1687593695__' failed: got timeout
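A note on reading the snapshot names in these mails: the trailing number in `__replicate_105-0_1687593216__` looks like a Unix timestamp taken when the job run started (an assumption based on the values, not something the error message states). The minimal Python sketch below, with a hypothetical helper name `snapshot_start_utc`, decodes it to UTC so it can be compared with the journal timestamp of the failure.

```python
import re
from datetime import datetime, timezone

# Matches the replication snapshot suffix, e.g. @__replicate_105-0_1687593216__
# Assumption: the trailing number is a Unix timestamp (seconds since epoch).
SNAP_RE = re.compile(r"@__replicate_(?P<job>\d+-\d+)_(?P<epoch>\d+)__")


def snapshot_start_utc(line):
    """Return (job id, UTC datetime) parsed from a snapshot name, or None."""
    m = SNAP_RE.search(line)
    if m is None:
        return None
    started = datetime.fromtimestamp(int(m.group("epoch")), tz=timezone.utc)
    return m.group("job"), started


line = ("command 'zfs snapshot tank1/data1/vm-105-disk-0"
        "@__replicate_105-0_1687593216__' failed: got timeout")
print(snapshot_start_utc(line))
```

For the Node 1 mail this gives 07:53:36 UTC. If the nodes run on CEST (UTC+2), which is assumed here, that is 09:53:36 local time, about 46 seconds before the 09:54:22 journal entry for the same snapshot.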
Proxmox VE Version:
Kernel Version Linux 5.15.107-1-pve #1 SMP PVE 5.15.107-1 (2023-04-20T10:05Z)
PVE Manager Version pve-manager/7.4-3/9002ab8a
The replications are staggered by 10 minutes so that they do not get in each other's way, which works fine during the week.
The replications do not take long, about 30 seconds on average.
The nodes run four and five VMs, respectively.
Node 1:
journalctl -u pvescheduler.service
Code:
Jun 24 09:53:36 proxmox1 pvescheduler[2449473]: 104-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-104-disk-0@__replicate_104-0_1687593003__' failed: got timeout
Jun 24 09:54:22 proxmox1 pvescheduler[2449473]: 105-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-105-disk-0@__replicate_105-0_1687593216__' failed: got timeout
Jun 24 11:53:46 proxmox1 pvescheduler[170416]: 104-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-104-disk-0@__replicate_104-0_1687600203__' failed: got timeout
Jun 24 12:05:11 proxmox1 pvescheduler[972162]: 104-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-104-disk-0@__replicate_104-0_1687600863__' failed: got timeout
Jun 24 12:37:11 proxmox1 pvescheduler[1882496]: command 'zfs destroy tank1/data1/vm-105-disk-0@__replicate_105-0_1687601403__' failed: got timeout
Jun 24 21:11:21 proxmox1 pvescheduler[2491271]: 104-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-104-disk-0@__replicate_104-0_1687633803__' failed: got timeout
Jun 25 02:02:04 proxmox1 pvescheduler[4150683]: <root@pam> starting task UPID:proxmox1:003F559C:1670DEB0:649783FC:vzdump::root@pam:
Jun 26 00:13:39 proxmox1 pvescheduler[1739862]: 104-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-104-disk-0@__replicate_104-0_1687731000__' failed: got timeout
Jun 26 00:20:34 proxmox1 pvescheduler[1739862]: 105-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-105-disk-0@__replicate_105-0_1687731219__' failed: got timeout
Jun 26 00:26:40 proxmox1 pvescheduler[1739862]: 108-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-108-disk-0@__replicate_108-0_1687731634__' failed: got timeout
Jun 26 00:32:15 proxmox1 pvescheduler[1739862]: 109-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-109-disk-0@__replicate_109-0_1687732000__' failed: got timeout
Jun 26 00:38:21 proxmox1 pvescheduler[1223911]: command 'zfs destroy tank1/data1/vm-104-disk-0@__replicate_104-0_1687731000__' failed: got timeout
Jun 26 00:43:54 proxmox1 pvescheduler[1223911]: 104-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-104-disk-0@__replicate_104-0_1687732380__' failed: got timeout
Jun 26 00:49:14 proxmox1 pvescheduler[1223911]: command 'zfs destroy tank1/data1/vm-105-disk-0@__replicate_105-0_1687731219__' failed: got timeout
Jun 26 00:55:03 proxmox1 pvescheduler[1223911]: 105-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-105-disk-0@__replicate_105-0_1687733034__' failed: got timeout
Jun 26 01:02:09 proxmox1 pvescheduler[1223911]: command 'zfs destroy tank1/data1/vm-108-disk-0@__replicate_108-0_1687731634__' failed: got timeout
Jun 26 01:08:09 proxmox1 pvescheduler[1223911]: 108-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-108-disk-0@__replicate_108-0_1687733703__' failed: got timeout
Jun 26 01:14:13 proxmox1 pvescheduler[1223911]: command 'zfs destroy tank1/data1/vm-109-disk-0@__replicate_109-0_1687732000__' failed: got timeout
Jun 26 01:19:27 proxmox1 pvescheduler[1223911]: 109-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-109-disk-0@__replicate_109-0_1687734489__' failed: got timeout
Jun 26 01:24:39 proxmox1 pvescheduler[629666]: command 'zfs destroy tank1/data1/vm-104-disk-0@__replicate_104-0_1687732380__' failed: got timeout
Jun 26 01:30:31 proxmox1 pvescheduler[629666]: 104-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-104-disk-0@__replicate_104-0_1687735200__' failed: got timeout
Jun 26 01:37:05 proxmox1 pvescheduler[629666]: command 'zfs destroy tank1/data1/vm-105-disk-0@__replicate_105-0_1687733034__' failed: got timeout
Jun 26 01:43:15 proxmox1 pvescheduler[629666]: 105-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-105-disk-0@__replicate_105-0_1687735831__' failed: got timeout
Jun 26 01:50:07 proxmox1 pvescheduler[629666]: command 'zfs destroy tank1/data1/vm-108-disk-0@__replicate_108-0_1687733703__' failed: got timeout
Jun 26 01:56:41 proxmox1 pvescheduler[629666]: 108-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-108-disk-0@__replicate_108-0_1687736595__' failed: got timeout
Jun 26 02:02:01 proxmox1 pvescheduler[3405527]: <root@pam> starting task UPID:proxmox1:00340114:16F4B3B8:6498D579:vzdump::root@pam:
Jun 26 02:02:10 proxmox1 pvescheduler[629666]: command 'zfs destroy tank1/data1/vm-109-disk-0@__replicate_109-0_1687734489__' failed: got timeout
Jun 26 02:10:38 proxmox1 pvescheduler[3483801]: command 'zfs destroy tank1/data1/vm-108-disk-0@__replicate_108-0_1687736595__' failed: got timeout
Node 2:
journalctl -u pvescheduler.service
Code:
Jun 24 21:52:25 proxmox2 pvescheduler[497827]: 106-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-106-disk-0@__replicate_106-0_1687636279__' failed: got timeout
Jun 24 22:13:42 proxmox2 pvescheduler[465118]: command 'zfs destroy tank1/data1/vm-101-disk-0@__replicate_101-0_1687635784__' failed: got timeout
Jun 24 22:14:46 proxmox2 pvescheduler[465118]: 101-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-101-disk-0@__replicate_101-0_1687637584__' failed: got timeout
Jun 24 22:16:15 proxmox2 pvescheduler[919679]: command 'zfs destroy tank1/data1/vm-102-disk-0@__replicate_102-0_1687635964__' failed: got timeout
Jun 24 22:16:28 proxmox2 pvescheduler[919679]: 102-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-102-disk-0@__replicate_102-0_1687637764__' failed: got timeout
Jun 24 22:18:25 proxmox2 pvescheduler[1184182]: command 'zfs destroy tank1/data1/vm-103-disk-0@__replicate_103-0_1687636084__' failed: got timeout
Jun 24 22:18:51 proxmox2 pvescheduler[1184182]: 103-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-103-disk-0@__replicate_103-0_1687637884__' failed: got timeout
Jun 24 22:19:25 proxmox2 pvescheduler[1304452]: command 'zfs destroy tank1/data1/vm-107-disk-0@__replicate_107-0_1687636137__' failed: got timeout
Jun 24 22:20:36 proxmox2 pvescheduler[1304452]: 107-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-107-disk-0@__replicate_107-0_1687637944__' failed: got timeout
Jun 24 22:22:35 proxmox2 pvescheduler[1719912]: command 'zfs destroy tank1/data1/vm-106-disk-0@__replicate_106-0_1687636279__' failed: got timeout
Jun 24 22:23:13 proxmox2 pvescheduler[1719912]: 106-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-106-disk-0@__replicate_106-0_1687638124__' failed: got timeout
Jun 24 22:43:35 proxmox2 pvescheduler[482491]: command 'zfs destroy tank1/data1/vm-101-disk-0@__replicate_101-0_1687637584__' failed: got timeout
Jun 24 22:44:39 proxmox2 pvescheduler[482491]: 101-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-101-disk-0@__replicate_101-0_1687639384__' failed: got timeout
Jun 24 22:46:22 proxmox2 pvescheduler[902775]: command 'zfs destroy tank1/data1/vm-102-disk-0@__replicate_102-0_1687637764__' failed: got timeout
Jun 24 22:47:09 proxmox2 pvescheduler[902775]: 102-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-102-disk-0@__replicate_102-0_1687639564__' failed: got timeout
Jun 24 22:48:36 proxmox2 pvescheduler[1278040]: command 'zfs destroy tank1/data1/vm-103-disk-0@__replicate_103-0_1687637884__' failed: got timeout
Jun 24 22:49:34 proxmox2 pvescheduler[1278040]: 103-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-103-disk-0@__replicate_103-0_1687639684__' failed: got timeout
Jun 24 22:50:11 proxmox2 pvescheduler[1278040]: command 'zfs destroy tank1/data1/vm-107-disk-0@__replicate_107-0_1687637944__' failed: got timeout
Jun 24 22:52:09 proxmox2 pvescheduler[1278040]: 107-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-107-disk-0@__replicate_107-0_1687639774__' failed: got timeout
Jun 24 22:52:46 proxmox2 pvescheduler[1278040]: command 'zfs destroy tank1/data1/vm-106-disk-0@__replicate_106-0_1687638124__' failed: got timeout
Jun 24 22:53:30 proxmox2 pvescheduler[1278040]: 106-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-106-disk-0@__replicate_106-0_1687639929__' failed: got timeout
Jun 24 23:13:26 proxmox2 pvescheduler[1691635]: command 'zfs destroy tank1/data1/vm-101-disk-0@__replicate_101-0_1687639384__' failed: got timeout
Jun 24 23:14:20 proxmox2 pvescheduler[1691635]: 101-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-101-disk-0@__replicate_101-0_1687641184__' failed: got timeout
Jun 24 23:16:31 proxmox2 pvescheduler[2152732]: command 'zfs destroy tank1/data1/vm-102-disk-0@__replicate_102-0_1687639564__' failed: got timeout
Jun 24 23:17:01 proxmox2 pvescheduler[2152732]: 102-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-102-disk-0@__replicate_102-0_1687641364__' failed: got timeout
Jun 24 23:18:24 proxmox2 pvescheduler[2461804]: command 'zfs destroy tank1/data1/vm-103-disk-0@__replicate_103-0_1687639684__' failed: got timeout
Jun 24 23:18:45 proxmox2 pvescheduler[2461804]: 103-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-103-disk-0@__replicate_103-0_1687641484__' failed: got timeout
Jun 24 23:20:36 proxmox2 pvescheduler[2784014]: command 'zfs destroy tank1/data1/vm-107-disk-0@__replicate_107-0_1687639774__' failed: got timeout
Jun 24 23:22:13 proxmox2 pvescheduler[2784014]: 107-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-107-disk-0@__replicate_107-0_1687641604__' failed: got timeout
Jun 24 23:22:46 proxmox2 pvescheduler[2784014]: command 'zfs destroy tank1/data1/vm-106-disk-0@__replicate_106-0_1687639929__' failed: got timeout
Jun 24 23:23:34 proxmox2 pvescheduler[2784014]: 106-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-106-disk-0@__replicate_106-0_1687641733__' failed: got timeout
Jun 25 02:02:05 proxmox2 pvescheduler[315156]: <root@pam> starting task UPID:proxmox2:0004CF32:16667D8B:649783FD:vzdump::root@pam:
Jun 26 02:02:00 proxmox2 pvescheduler[964754]: <root@pam> starting task UPID:proxmox2:000EB899:16EA51DE:6498D578:vzdump::root@pam:
Jun 26 02:06:46 proxmox2 pvescheduler[1027658]: 101-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-101-disk-0@__replicate_101-0_1687737780__' failed: got timeout
Jun 26 02:07:46 proxmox2 pvescheduler[1027658]: 102-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-102-disk-0@__replicate_102-0_1687738006__' failed: got timeout
Jun 26 02:08:35 proxmox2 pvescheduler[1027658]: command 'zfs destroy tank1/data1/vm-103-disk-1@__replicate_103-0_1687730460__' failed: got timeout
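To see whether the timeouts cluster around particular hours, the journal output above can be bucketed per host, day, hour and ZFS operation. The following is a minimal Python sketch; the function name `count_timeouts` and the regular expression are illustrative and only assume the line format shown in the excerpts above.

```python
import re
from collections import Counter

# Assumes the default journalctl line format seen in the excerpts above.
LINE_RE = re.compile(
    r"^(?P<mon>\w{3}) +(?P<day>\d+) (?P<hour>\d{2}):\d{2}:\d{2} "
    r"(?P<host>\S+) pvescheduler\[\d+\]: "
    r".*command 'zfs (?P<op>snapshot|destroy) .*' failed: got timeout"
)


def count_timeouts(lines):
    """Count 'got timeout' errors per (host, day, hour, zfs operation)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m:  # non-matching lines (e.g. vzdump task starts) are skipped
            day = f"{m['mon']} {int(m['day']):02d}"
            counts[(m["host"], day, m["hour"], m["op"])] += 1
    return counts


# Sample lines copied from the Node 1 journal above.
SAMPLE = [
    "Jun 24 09:53:36 proxmox1 pvescheduler[2449473]: 104-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-104-disk-0@__replicate_104-0_1687593003__' failed: got timeout",
    "Jun 24 09:54:22 proxmox1 pvescheduler[2449473]: 105-0: got unexpected replication job error - command 'zfs snapshot tank1/data1/vm-105-disk-0@__replicate_105-0_1687593216__' failed: got timeout",
    "Jun 24 12:37:11 proxmox1 pvescheduler[1882496]: command 'zfs destroy tank1/data1/vm-105-disk-0@__replicate_105-0_1687601403__' failed: got timeout",
    "Jun 25 02:02:04 proxmox1 pvescheduler[4150683]: <root@pam> starting task UPID:proxmox1:003F559C:1670DEB0:649783FC:vzdump::root@pam:",
]

for key, n in sorted(count_timeouts(SAMPLE).items()):
    print(key, n)
```

To run it over the full journal, the lines from `journalctl -u pvescheduler.service` could be passed in place of `SAMPLE`, for example by reading them from `sys.stdin`. The resulting buckets can then be compared with whatever else is scheduled on the nodes at those hours.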
Does anyone have an idea why this only fails on weekends? Are there any logs I can search through?
I have already disabled replication on Sundays, since no new data should be added on that day.
Replication jobs Node 1:
Replication jobs Node 2:
Regards!