Replication not happening, help.

Manny Vazquez

Well-Known Member
Jul 12, 2017
Miami, FL USA
I do not understand what has happened. I have 3 hosts, all 3 running the same version, 5.0.23.
All 3 are on ZFS.

Replication worked fine for a while across the hard drives from one host to another, but now it is stuck on 2 of the hosts.

In the attached images:
vm1 no longer exists. It was reformatted as VM4, but I cannot remove the name from the list (this does not matter).

You can see that VM2 replication is working fine against the other 2 hosts. I had just added a new VM (5000) to the server when I took this screenshot, but as you can see, the other VMs' replications ran fine on 10/18 and are scheduled to run again on 10/19.

In the other images, for VM3 and VM4, the status is listed as OK in both cases, but Next Sync says "pending", and it has been like this for a couple of weeks already.

Any ideas would be really appreciated.

Attachments: Screen Shot 2018-10-18 at 2.36.34 PM.png, Screen Shot 2018-10-18 at 2.37.16 PM.png, Screen Shot 2018-10-18 at 2.38.07 PM.png, Screen Shot 2018-10-18 at 2.41.27 PM.png, Screen Shot 2018-10-18 at 2.46.06 PM.png, Screen Shot 2018-10-18 at 2.46.21 PM.png
 
Is the 'pvesr.timer' loaded and active on all nodes?

Code:
systemctl status pvesr.timer
systemctl list-timers
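
For anyone who wants to run that check on every node in one go, here is a minimal sketch. It assumes root SSH access between the nodes, and the hostnames in the example call are placeholders, not names confirmed anywhere in this thread. The helper name check_pvesr_timer is made up for illustration.

```shell
# Sketch: report the state of pvesr.timer on each node given as an argument.
# Assumption: root SSH between nodes works without a password prompt.
check_pvesr_timer() {
    for node in "$@"; do
        # `systemctl is-active` prints the unit state (active, inactive, failed, ...)
        # and exits non-zero when the unit is not active, hence the `|| true`.
        state=$(ssh "root@$node" systemctl is-active pvesr.timer 2>/dev/null) || true
        # An empty result means the command produced no output (for example, SSH failed).
        printf '%s: pvesr.timer is %s\n' "$node" "${state:-unreachable}"
    done
}

# Example call (placeholder hostnames):
# check_pvesr_timer vm2 vm3 vm4
```

Any node that does not report "active" is worth a closer look with the two commands above.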
 
Just to note, this version is quite outdated. Please upgrade to the latest version, 5.2, so we do not hunt for already fixed bugs.

See https://pve.proxmox.com/wiki/Downlo...Proxmox_Virtual_Environment_5.x_to_latest_5.2
Thanks Tom.

I wish I could afford to pay for commercial support, but my budget at this moment does not allow for the 1,500 euros for the basic plan. Maybe in a year, if this finally takes off and I can actually stop putting money into this organization.

This may sound stupid to people who know as much as you do, but while trying to follow the upgrade instructions I ran into a "couple" of questions.
1.- It says "Check your sources.list file, should look like this:", but the only reference I see to sources.list is in "cat /etc/apt/sources.list.d/pve-enterprise.list". Is that the right place?
2.- Can I update a host WHILE the VMs are in production, or should I move the VMs to another host?
3.- Do you have any idea how long the update takes?

Also, is there another way to fix this problem without actually upgrading? I am getting 3 more servers (donated, since we are actually a not-for-profit company), and I could set up these new servers with the new version, put them into the cluster, and then, once I have moved all the VMs, upgrade (with CD) the older ones.

What do you think would be best?
I am just afraid that if one of these nodes goes down, I have no way to recover the VMs that are in that node.

Please, I implore for help.
 
1.- It says "Check your sources.list file, should look like this:", but the only reference I see to sources.list is in "cat /etc/apt/sources.list.d/pve-enterprise.list". Is that the right place?

Not really, the upgrade instructions are quite clear about the location of the files.
https://pve.proxmox.com/wiki/Downlo...Proxmox_Virtual_Environment_5.x_to_latest_5.2

2.- Can I update a host WHILE the VMs are in production, or should I move the VMs to another host?

If you can use live migration (shared storage), you can upgrade host by host without stopping the VMs.
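
As a rough illustration of the host-by-host approach, the sketch below turns the output of `qm list` into one online-migration command per running VM. It only prints the commands and runs nothing. It applies only if live migration is actually possible in your setup (shared storage, as noted above). The target node name and the helper name print_migrate_cmds are placeholders for illustration.

```shell
# Sketch: read `qm list` output on stdin and print (not run) an online
# migration command for every VM whose status is "running".
# $1 is the target node name (a placeholder in the example below).
print_migrate_cmds() {
    target=$1
    # qm list columns: VMID NAME STATUS MEM(MB) BOOTDISK(GB) PID
    # The header line is skipped automatically because its third field is "STATUS".
    awk -v target="$target" \
        '$3 == "running" { printf "qm migrate %s %s --online\n", $1, target }'
}

# Example, on the node you are about to upgrade:
# qm list | print_migrate_cmds vm3
```

Review the printed lines before running them, then upgrade the emptied host and repeat for the next one.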

3.- Do you have any idea how long the update takes?

Depends on your hardware and network. In my environments it just takes a few minutes.

Also, is there another way to fix this problem without actually upgrading?

Maybe, but without deeper analysis I cannot advise. In any case, you should always run the current version, otherwise you miss bug fixes.

What do you think would be best?

Always run the latest version.

I am just afraid that if one of these nodes goes down, I have no way to recover the VMs that are in that node.

No way to recover? You should re-plan your backup setup.
 
Hi, I just wanted to add this here for the benefit of others who might read it.
I figured out a way to get this going without having to upgrade (I will do it, but on my own time, not in this rush).
First I had to delete all the replication jobs:

>> pvesr delete <vmid-jobid> --force

Once all the replication jobs were deleted (they were not working anyhow, and even deleting them via the GUI was not working), I proceeded to check the timer.

>> systemctl status pvesr.timer

Here I saw that the timer had been stuck since June.
So I looked for the process ID (PID) of the stuck pvesr process in order to kill it.

>> ps aux | grep -v grep | grep pvesr

Once I had the PID, I killed the process.

>> kill <pid>

It restarted almost immediately.

I added a few replication jobs and they started immediately, as if making up for lost time :) At this moment I have VMs replicating properly, and HA "should" once again be working fine.
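
For reference, the steps above can be sketched as two small shell helpers. This is only a sketch under assumptions: the helper names are made up, and the column layout of `pvesr list` is not relied on beyond the job ID being the first field in the form <vmid>-<number>. Both helpers only print things, so nothing is deleted or killed until you run the result yourself.

```shell
# Print a forced delete command for each replication job ID found in
# `pvesr list` output on stdin. Lines whose first field is not of the
# form <vmid>-<number> (such as a header) are ignored.
print_delete_cmds() {
    awk '$1 ~ /^[0-9]+-[0-9]+$/ { printf "pvesr delete %s --force\n", $1 }'
}

# Print the PID (second column) of every pvesr process found in `ps aux`
# output on stdin. The [p] in the pattern keeps this awk command from
# matching its own entry in the process list.
pvesr_pids() {
    awk 'NR > 1 && /[p]vesr/ { print $2 }'
}

# Typical sequence, run as root on the affected node:
# pvesr list | print_delete_cmds     # review the output, then run those lines
# systemctl status pvesr.timer       # check since when the timer has been waiting
# ps aux | pvesr_pids                # PIDs to pass to kill
```

After killing the stuck process, check `systemctl status pvesr.timer` again before re-creating the replication jobs.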
 
