I know. hence my comments (check #38).
ps.: i do not think you see any edits i made past the initial post.
that is correct, I just realized that emails do not get sent when a post is edited, only when a post is added... I'll check my forum settings.
dell1 ~ # omping -c 600 -i 1 -q sys3-corosync sys5-corosync dell1-corosync
sys3-corosync : waiting for response msg
sys5-corosync : waiting for response msg
sys5-corosync : joined (S,G) = (*, 232.43.211.234), pinging
sys3-corosync : waiting for response msg
sys3-corosync : joined (S,G) = (*, 232.43.211.234), pinging
sys5-corosync : given amount of query messages was sent
sys3-corosync : given amount of query messages was sent
sys3-corosync : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.077/0.245/0.306/0.031
sys3-corosync : multicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.082/0.259/0.319/0.032
sys5-corosync : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.143/0.253/11.557/0.463
sys5-corosync : multicast, xmt/rcv/%loss = 600/599/0% (seq>=2 0%), min/avg/max/std-dev = 0.146/0.260/11.575/0.464
sys3 ~ # omping -c 600 -i 1 -q sys3-corosync sys5-corosync dell1-corosync
sys5-corosync : waiting for response msg
dell1-corosync : waiting for response msg
sys5-corosync : joined (S,G) = (*, 232.43.211.234), pinging
dell1-corosync : joined (S,G) = (*, 232.43.211.234), pinging
sys5-corosync : given amount of query messages was sent
dell1-corosync : given amount of query messages was sent
sys5-corosync : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.116/0.196/1.173/0.050
sys5-corosync : multicast, xmt/rcv/%loss = 600/599/0% (seq>=2 0%), min/avg/max/std-dev = 0.124/0.214/1.192/0.050
dell1-corosync : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.143/0.256/3.950/0.157
dell1-corosync : multicast, xmt/rcv/%loss = 600/599/0% (seq>=2 0%), min/avg/max/std-dev = 0.160/0.268/3.960/0.157
sys5 ~ # omping -c 600 -i 1 -q sys3-corosync sys5-corosync dell1-corosync
sys3-corosync : waiting for response msg
dell1-corosync : waiting for response msg
sys3-corosync : waiting for response msg
dell1-corosync : waiting for response msg
sys3-corosync : joined (S,G) = (*, 232.43.211.234), pinging
dell1-corosync : joined (S,G) = (*, 232.43.211.234), pinging
sys3-corosync : given amount of query messages was sent
dell1-corosync : given amount of query messages was sent
sys3-corosync : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.081/0.202/0.315/0.034
sys3-corosync : multicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.108/0.224/0.322/0.032
dell1-corosync : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.145/0.238/0.346/0.036
dell1-corosync : multicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.164/0.249/0.355/0.036
Both of the pings below were run from sys5.
ping -c 1000 -i 0.1 10.2.8.181
ping -c 1000 -i 0.1 10.2.8.42
--- 10.2.8.42 ping statistics ---
1000 packets transmitted, 1000 received, 0% packet loss, time 99900ms
rtt min/avg/max/mdev = 0.064/0.176/0.287/0.029 ms
sys5 ~
--- 10.2.8.181 ping statistics ---
1000 packets transmitted, 1000 received, 0% packet loss, time 99897ms
rtt min/avg/max/mdev = 0.117/0.194/0.304/0.031 ms
sys5 ~ #
ping -c 1000 -i 0.1 10.2.8.19
--- 10.2.8.19 ping statistics ---
1000 packets transmitted, 1000 received, 0% packet loss, time 99926ms
rtt min/avg/max/mdev = 0.101/0.185/0.263/0.035 ms
ping -c 1000 -i 0.1 10.2.8.42
--- 10.2.8.42 ping statistics ---
1000 packets transmitted, 1000 received, 0% packet loss, time 99923ms
rtt min/avg/max/mdev = 0.073/0.193/0.275/0.035 ms
This morning:
sys5 is all green,
dell1 and sys3 show only localhost green.
In the past sys3 was all green.
There was a backup on sys5 last night.
Via automatic (scheduled) backups, or by hand?
Basically replicate the backup procedure from your nightly scheduled backup 1:1. Then just run it a minute from now instead of at the scheduled time.
109: Nov 21 22:27:19 INFO: transferred 159995 MB in 1636 seconds (97 MB/s)
109: Nov 23 00:42:18 INFO: transferred 159995 MB in 9735 seconds (16 MB/s)
1747: Nov 21 22:46:12 INFO: transferred 39728 MB in 609 seconds (65 MB/s)
1747: Nov 23 03:46:37 INFO: transferred 39728 MB in 9308 seconds (4 MB/s)
3902: Nov 21 22:48:03 INFO: transferred 8589 MB in 20 seconds (429 MB/s)
3902: Nov 23 03:53:18 INFO: transferred 8589 MB in 81 seconds (106 MB/s)
Nov 22 22:00:02 sys5 vzdump[23130]: INFO: Starting Backup of VM 109 (qemu)
Nov 22 22:00:03 sys5 qm[23133]: <root@pam> update VM 109: -lock backup
Nov 22 22:02:20 sys5 corosync[8309]: [MAIN ] Corosync main process was not scheduled for 6762.6118 ms (threshold is 1320.0000 ms). Consider token timeout increase.
Nov 22 22:02:20 sys5 corosync[8309]: [TOTEM ] A processor failed, forming new configuration.
Nov 22 22:02:20 sys5 pve-firewall[8317]: firewall update time (5.092 seconds)
Nov 22 22:02:20 sys5 corosync[8309]: [TOTEM ] A new membership (10.2.8.19:14180) was formed. Members joined: 1 3 left: 1 3
Nov 22 22:02:20 sys5 corosync[8309]: [TOTEM ] Failed to receive the leave message. failed: 1 3
Nov 22 22:02:20 sys5 corosync[8309]: [QUORUM] Members[3]: 4 1 3
Nov 22 22:02:20 sys5 corosync[8309]: [MAIN ] Completed service synchronization, ready to provide service.
Nov 22 22:02:26 sys5 pvestatd[23475]: status update time (21.081 seconds)
Nov 22 22:02:26 sys5 pmxcfs[16758]: [status] notice: RRDC update error /var/lib/rrdcached/db/pve2-vm/107: -1
Nov 22 22:02:26 sys5 pmxcfs[16758]: [status] notice: RRDC update error /var/lib/rrdcached/db/pve2-vm/5544: -1
Nov 22 22:02:26 sys5 pmxcfs[16758]: [status] notice: RRDC update error /var/lib/rrdcached/db/pve2-vm/3103: -1
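The "Consider token timeout increase" hint in that corosync log can be acted on in /etc/corosync/corosync.conf. A minimal sketch of the relevant fragment; the 10000 ms value is an assumption for illustration, not a recommendation from this thread, and the change has to be propagated to all nodes before restarting corosync:

```
totem {
  version: 2
  # token: total token timeout in milliseconds.
  # Raising it makes the cluster more tolerant of scheduling stalls
  # (like the 6762 ms one logged above) at the cost of slower
  # failure detection. Default-derived threshold here was 1320 ms.
  token: 10000
}
```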
vzdump --mailnotification always --node sys5 --storage nfs-pve --mode snapshot --exclude 77904,8102 --mailto fbcadmin --all 1 --quiet 1 --compress lzo
109: Nov 22 22:00:02 INFO: Starting Backup of VM 109 (qemu)
109: Nov 22 22:00:02 INFO: status = running
109: Nov 22 22:00:03 INFO: update VM 109: -lock backup
109: Nov 22 22:00:03 INFO: backup mode: snapshot
109: Nov 22 22:00:03 INFO: ionice priority: 7
109: Nov 22 22:00:03 INFO: creating archive '/mnt/pve/nfs-pve/dump/vzdump-qemu-109-2015_11_22-22_00_02.vma.lzo'
109: Nov 22 22:00:03 INFO: started backup task '3a6ca23d-a2de-4af9-8b26-b70d17228357'
109: Nov 22 22:00:06 INFO: status: 0% (231866368/159995920384), sparse 0% (5517312), duration 3, 77/75 MB/s
109: Nov 22 22:00:59 INFO: status: 1% (1604059136/159995920384), sparse 0% (26710016), duration 56, 25/25 MB/s
109: Nov 22 22:04:03 INFO: status: 2% (3265134592/159995920384), sparse 0% (47767552), duration 240, 9/8 MB/s
109: Nov 22 22:05:21 INFO: status: 3% (4838785024/159995920384), sparse 0% (237436928), duration 318, 20/17 MB/s
109: Nov 22 22:08:21 INFO: status: 4% (6504251392/159995920384), sparse 0% (270823424), duration 498, 9/9 MB/s
109: Nov 22 22:10:09 INFO: status: 5% (8088715264/159995920384), sparse 0% (317431808), duration 606, 14/14 MB/s
109: Nov 22 22:10:17 INFO: status: 6% (9725870080/159995920384), sparse 0% (1478684672), duration 614, 204/59 MB/s
109: Nov 22 22:10:22 INFO: status: 7% (11249254400/159995920384), sparse 1% (2859257856), duration 619, 304/28 MB/s
109: Nov 22 22:10:27 INFO: status: 8% (13103464448/159995920384), sparse 2% (4700565504), duration 624, 370/2 MB/s
109: Nov 22 22:10:32 INFO: status: 9% (14650376192/159995920384), sparse 3% (6091571200), duration 629, 309/31 MB/s
109: Nov 22 22:10:42 INFO: status: 10% (16145514496/159995920384), sparse 4% (7089979392), duration 639, 149/49 MB/s
109: Nov 22 22:10:57 INFO: status: 11% (17610440704/159995920384), sparse 5% (8027828224), duration 654, 97/35 MB/s
109: Nov 22 22:11:23 INFO: status: 12% (19201916928/159995920384), sparse 5% (8063328256), duration 680, 61/59 MB/s
109: Nov 22 22:16:09 INFO: status: 13% (20822360064/159995920384), sparse 5% (8098742272), duration 966, 5/5 MB/s
109: Nov 22 22:16:54 INFO: status: 14% (22427140096/159995920384), sparse 5% (8122384384), duration 1011, 35/35 MB/s
109: Nov 22 22:19:38 INFO: status: 15% (24077008896/159995920384), sparse 5% (8151613440), duration 1175, 10/9 MB/s
109: Nov 22 22:21:33 INFO: status: 16% (25710559232/159995920384), sparse 5% (8177446912), duration 1290, 14/13 MB/s
109: Nov 22 22:23:03 INFO: status: 17% (27205632000/159995920384), sparse 5% (8200376320), duration 1380, 16/16 MB/s
109: Nov 22 22:23:54 INFO: status: 18% (28811853824/159995920384), sparse 5% (8229777408), duration 1431, 31/30 MB/s
109: Nov 22 22:25:47 INFO: status: 19% (30447173632/159995920384), sparse 5% (8254017536), duration 1544, 14/14 MB/s
Nov 23 15:50:01 dell1 CRON[25215]: (root) CMD (pve-zsync sync --source 4526 --dest 10.2.2.46:tank/pve-zsync-bkup --name etherpad-syncjob --maxsnap 12 --method ssh)
Nov 23 15:50:01 dell1 CRON[25217]: (root) CMD (pve-zsync sync --source 3106 --dest 10.2.2.46:tank/pve-zsync-bkup --name mediawiki-syncjob --maxsnap 12 --method ssh)
Nov 23 15:50:01 dell1 CRON[25218]: (root) CMD (pve-zsync sync --source 3122 --dest 10.2.2.46:tank/pve-zsync-bkup --name ona-syncjob --maxsnap 12 --method ssh)
Nov 23 15:50:01 dell1 CRON[25216]: (root) CMD (pve-zsync sync --source 101 --dest 10.2.2.46:tank/pve-zsync-bkup --name ldap-syncjob --maxsnap 12 --method ssh)
Nov 23 15:50:01 dell1 CRON[25220]: (root) CMD (pve-zsync sync --source 3551 --dest 10.2.2.46:tank/pve-zsync-bkup --name nodejs-syncjob --maxsnap 12 --method ssh)
Nov 23 15:50:01 dell1 CRON[25219]: (root) CMD (pve-zsync sync --source 4501 --dest 10.2.2.46:tank/pve-zsync-bkup --name pro4-ray-syncjob --maxsnap 48 --method ssh)
Nov 23 15:50:38 dell1 pveproxy[27822]: worker exit
Nov 23 15:50:38 dell1 pveproxy[9058]: worker 27822 finished
Nov 23 15:50:38 dell1 pveproxy[9058]: starting 1 worker(s)
Nov 23 15:50:38 dell1 pveproxy[9058]: worker 26069 started
Nov 23 15:50:42 dell1 corosync[7713]: [TOTEM ] A processor failed, forming new configuration.
Nov 23 15:50:42 dell1 corosync[7713]: [TOTEM ] A new membership (10.2.8.19:15060) was formed. Members
$ModLoad ommail
$ActionMailSMTPServer localhost
$ActionMailFrom rsyslog@myplace.com
$ActionMailTo someone@myplace.com
$template mailSubject,"A processor failed line in syslog on %hostname%"
$template mailBody,"Check pve web pages, they may be red.\r\n\r\n%msg%"
$ActionMailSubject mailSubject
# Only send an email every 15 minutes
# $ActionExecOnlyOnceEveryInterval 900
# This if/then must all be on one line
if $msg contains 'A processor failed' then :ommail:;mailBody
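As a quick sanity check of the trigger string, independent of rsyslog itself, you can test whether a sample log line would satisfy the `contains 'A processor failed'` condition before wiring up mail delivery. A minimal shell sketch (the sample line is taken from the corosync logs earlier in the thread):

```shell
#!/bin/sh
# Sample corosync log line from this thread
line='corosync[7713]: [TOTEM ] A processor failed, forming new configuration.'

# Mirror the rsyslog condition: if $msg contains 'A processor failed'
case "$line" in
  *"A processor failed"*) result="match" ;;
  *)                      result="no match" ;;
esac

echo "$result: rsyslog rule would fire for this line"
```

To test end to end instead, `logger "A processor failed (test)"` from a shell will inject a matching line into syslog and should trigger the mail action.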
I wonder why the separate VLANs for every service, prioritised at the switch level, did not do the trick.
Any chance one of the services you mentioned above did not run in its own VLAN, and/or used a higher VLAN priority than the corosync VLAN?
Good thing it's fixed (took long enough).
Yeah, unless you have a NIC for every single VLAN (in your case that would be ...5??), that is rather hard to do (at some point you run out of NICs). It also feels tedious.
I currently have around 80 VLANs in my network, with up to 40 OVS IntPorts (in different VLANs) running over a single bond... think about that!
For a cluster, I do not think it is normal for one node to have /etc/pve writable and the others not.
Is that true?
No, this is strange.