transfer rate drops to a crawl

Apr 27, 2024
I've got a lot of sync jobs running site to site to support a migration.
I need them to run flat out, not lag for a day.

What is happening when a sync job drops to a few KB/s and just stays that way until you kill it?
There's nothing in the logs about it, other than the dramatic gap between log entries.

Are there priorities for the types of jobs?
Does a datastore read always win out over a datastore write? Seems that way.

How does PBS decide to split up its bandwidth?
When a sync that was using most of the bandwidth finishes, why doesn't a concurrent sync pick up speed and use that now available bandwidth?
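To rule out configured caps on my end, these are the only knobs I know of to check. Hedged sketch: exact subcommands and flags depend on the PBS version, so verify against proxmox-backup-manager(1) on your install.

```shell
# List global traffic-control rules, which cap bandwidth for any
# matching client connections (available in PBS 2.2 and later):
proxmox-backup-manager traffic-control list

# List sync jobs; a per-job rate-in limit, if one is set, caps
# that job's pull bandwidth on its own:
proxmox-backup-manager sync-job list
```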

I understand that contention on the source server or network issues could cause this. If I could lay the blame on either of those, it would be fixable, but I don't see them. Yet.

I'm not seeing log errors for locked backups, or at least not frequently. Mostly I cause those myself when I try to execute a sync job that's already running.

I need this to work. Now. I'm considering wiping a datastore and trying again.
 
Hmm. Maybe I fixed it?

I was getting this ZFS error:

Mismatch between pool hostid and system hostid on imported pool.
I did the fix where you generate a new hostid, toggle multihost on for the pool, and then back off.
The sync job immediately started working at its prior top speed.
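For reference, this is the sequence I mean, as a hedged sketch: "tank" is a hypothetical pool name, and the pool-modifying commands are commented out since you'll want to run them deliberately.

```shell
# Show the current system hostid -- ZFS compares this against the
# hostid stamped into the pool when it was last imported.
hostid

# The actual fix modifies the pool, so it is commented out here.
# Regenerate /etc/hostid, then toggle multihost on and off so ZFS
# rewrites the hostid stored in the pool to match the system:
#   zgenhostid -f
#   zpool set multihost=on tank
#   zpool set multihost=off tank
```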

I didn't cause this issue. My weekday nemesis decided to redo a custom drive pool. I hope I've fixed it. Really under the gun here.