Run replication manually

I have two nodes, PVE1 and PVE2.

PVE2 runs all my VMs/CTs under normal conditions and runs 24x7.

PVE1 is a smaller backup node which I'd like to use if PVE2 fails or if there is any hardware or software maintenance I'd like to do. In that case, I just migrate all VMs/CTs to PVE1 in the interim.

But it also means PVE1 is unused most of the time and hence kept powered off. I am using an existing RPi0 as a QDevice to keep the cluster quorate.

I also want to replicate everything to PVE1: first, to make live migrations much faster, and second, so that if anything very bad happens to PVE2's disks, PVE1 serves as a backup and can continue running VMs/CTs from a recent state.


For this setup, I want to manually turn on PVE1 every day or so and manually initiate all replication. Once replication is finished, I want to shut PVE1 down and power it off again.

How can I manually initiate all replication and make sure a full run is complete before powering off?
 
You can do it through scripting, using the API from the node itself (pvesh)

https://pve.proxmox.com/pve-docs/api-viewer/#/nodes/{node}/replication/{id}

One option would be to launch the replication using schedule_now and then check its progress through the status endpoint.

Thank you, this really looks very promising!!

I tried as below, but how would you read the status?

Code:
root@pve1:~# pvesh create /nodes/pve2/replication/401-0/schedule_now


root@pve1:~# pvesh get /nodes/pve2/replication/401-0/status
┌────────────┬────────────┐
│ key        │ value      │
╞════════════╪════════════╡
│ duration   │ 580.699161 │
├────────────┼────────────┤
│ fail_count │ 0          │
├────────────┼────────────┤
│ guest      │ 401        │
├────────────┼────────────┤
│ id         │ 401-0      │
├────────────┼────────────┤
│ jobnum     │ 0          │
├────────────┼────────────┤
│ last_sync  │ 1727764103 │
├────────────┼────────────┤
│ last_try   │ 1727820967 │
├────────────┼────────────┤
│ next_sync  │ 1728201600 │
├────────────┼────────────┤
│ pid        │ 229146     │
├────────────┼────────────┤
│ schedule   │ sun 01:00  │
├────────────┼────────────┤
│ source     │ pve2       │
├────────────┼────────────┤
│ target     │ pve1       │
├────────────┼────────────┤
│ type       │ local      │
├────────────┼────────────┤
│ vmtype     │ qemu       │
└────────────┴────────────┘
root@pve1:~#

EDIT: Maybe it's done when the output does not include the pid field, last_sync == last_try, and fail_count == 0? Is this reliable? Wouldn't it be much nicer to just have the output of the replication run, so it's evident what the issue is if something goes wrong?


PS: Does it matter if I run this on PVE1 or PVE2, since they are in a cluster?
 
The timestamps in last_sync and last_try are expressed in Unix time format [0] [1].
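For example, you can convert them to a readable date on the shell with GNU date (using the last_sync value from your output above):

Code:
# Convert a Unix timestamp (seconds since 1970-01-01 UTC) to local time
date -d @1727764103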

In your script, you could store the timestamp before you start the replication with schedule_now, and when it is done check whether the last_sync value is greater than the one you've stored. Checking fail_count would be interesting too; I suggest you check what information the API shows when there's a failed replication and compare.
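Putting that together, an untested bash sketch might look like this. It reuses the node pve2 and job id 401-0 from your output, assumes jq is installed for JSON parsing, and assumes last_sync is only updated after a successful run, which is worth verifying as mentioned above:

Code:
#!/bin/bash
# Sketch: trigger one replication job and wait until it has finished.
# Assumes node pve2 and job 401-0 as above; requires jq.
NODE=pve2
JOB=401-0

# Remember when we started, to distinguish a fresh sync from an old one.
START=$(date +%s)

# Trigger the replication job immediately.
pvesh create /nodes/$NODE/replication/$JOB/schedule_now

# Poll the job status until last_sync is newer than our start time.
while true; do
    STATUS=$(pvesh get /nodes/$NODE/replication/$JOB/status --output-format json)
    FAIL_COUNT=$(echo "$STATUS" | jq -r '.fail_count // 0')
    LAST_SYNC=$(echo "$STATUS" | jq -r '.last_sync // 0')
    if [ "$FAIL_COUNT" -gt 0 ]; then
        echo "replication $JOB failed (fail_count=$FAIL_COUNT)" >&2
        exit 1
    fi
    if [ "$LAST_SYNC" -ge "$START" ]; then
        echo "replication $JOB finished at $(date -d @"$LAST_SYNC")"
        break
    fi
    sleep 10
done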

PS: Does it matter if I run this on PVE1 or PVE2, since they are in a cluster?
You should be able to run the command from any node; just use the {node} where the VM is running when you call the endpoint:
pvesh create /nodes/{node}/replication/{id}/schedule_now
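
Since you want to run all replication jobs, you could also enumerate them first via the replication index endpoint and trigger each one, e.g. (a sketch, assuming the job list returns an id field as shown in the API viewer):

Code:
# List all replication jobs known on pve2 and trigger each one
for id in $(pvesh get /nodes/pve2/replication --output-format json | jq -r '.[].id'); do
    pvesh create /nodes/pve2/replication/$id/schedule_now
done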


[0] https://en.wikipedia.org/wiki/Unix_time
[1] https://www.epochconverter.com
 
