Replication error: I/O error

tomas12343

I tried to replicate a VM with 3 virtual HDDs totalling about 2.5 TB of disk space. At first I had some storage problems (as mentioned in my previous post, they were resolved). Now the replication starts fine, but on the second HDD I get an error with the following log:

2020-06-12 02:52:00 208-0: start replication job
2020-06-12 02:52:00 208-0: guest => VM 208, running => 7308
2020-06-12 02:52:00 208-0: volumes => Disk5:vm-208-disk-0,Disk5:vm-208-disk-1,Disk5:vm-208-disk-2
2020-06-12 02:52:01 208-0: create snapshot '__replicate_208-0_1591919520__' on Disk5:vm-208-disk-0
2020-06-12 02:52:01 208-0: create snapshot '__replicate_208-0_1591919520__' on Disk5:vm-208-disk-1
2020-06-12 02:52:01 208-0: create snapshot '__replicate_208-0_1591919520__' on Disk5:vm-208-disk-2
2020-06-12 02:52:01 208-0: using secure transmission, rate limit: none
2020-06-12 02:52:01 208-0: full sync 'Disk5:vm-208-disk-0' (__replicate_208-0_1591919520__)
2020-06-12 02:52:02 208-0: full send of Disk5/vm-208-disk-0@__replicate_208-0_1591919520__ estimated size is 922G
2020-06-12 02:52:02 208-0: total estimated size is 922G
2020-06-12 02:52:03 208-0: TIME SENT SNAPSHOT Disk5/vm-208-disk-0@__replicate_208-0_1591919520__
2020-06-12 02:52:03 208-0: Disk5/vm-208-disk-0 name Disk5/vm-208-disk-0 -
2020-06-12 02:52:03 208-0: volume 'Disk5/vm-208-disk-0' already exists
2020-06-12 02:52:03 208-0: warning: cannot send 'Disk5/vm-208-disk-0@__replicate_208-0_1591919520__': signal received
2020-06-12 02:52:03 208-0: cannot send 'Disk5/vm-208-disk-0': I/O error
2020-06-12 02:52:03 208-0: command 'zfs send -Rpv -- Disk5/vm-208-disk-0@__replicate_208-0_1591919520__' failed: exit code 1
2020-06-12 02:52:03 208-0: delete previous replication snapshot '__replicate_208-0_1591919520__' on Disk5:vm-208-disk-0
2020-06-12 02:52:03 208-0: delete previous replication snapshot '__replicate_208-0_1591919520__' on Disk5:vm-208-disk-1
2020-06-12 02:52:04 208-0: delete previous replication snapshot '__replicate_208-0_1591919520__' on Disk5:vm-208-disk-2
2020-06-12 02:52:04 208-0: end replication job with error: command 'set -o pipefail && pvesm export Disk5:vm-208-disk-0 zfs - -with-snapshots 1 -snapshot __replicate_208-0_1591919520__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=emron-backup' root@192.168.10.223 -- pvesm import Disk5:vm-208-disk-0 zfs - -with-snapshots 1 -allow-rename 0' failed: exit code 255

My best guess is that the previous replication failed on hdd 2 (on the backup storage I only see hdd 0 and 1), and when the replication starts again it complains that hdd 0 already exists. But I cannot test this because the replication takes about 6-8 hours and I cannot wait at the PC for that long (replication starts at midnight because the first run slows down the server), and I cannot find the previous replication logs..
Any ideas??
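If it really is a leftover dataset from the failed run, I guess something like this on the destination node would show it and let me clean it up before retrying (the dataset names are just taken from the log above, and `zfs destroy` may need `-r` if snapshots were left behind):

`zfs list -r Disk5 | grep vm-208`
`zfs destroy Disk5/vm-208-disk-0`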
 
check the journal on both nodes - there should be more information regarding the source of the I/O error
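for example, restricted to the time window of the failed run (timestamps taken from your log, adjust as needed):

`journalctl --since "2020-06-12 02:50" --until "2020-06-12 03:00"`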

I hope this helps!
I tried to, but I cannot find the journal on the destination.

What I did was start the replication and then deactivate it, so that I could see the first log once it finished. The log is:

cannot receive new filesystem stream: out of space
2020-06-13 07:53:28 208-0: cannot open 'Disk5/vm-208-disk-2': dataset does not exist
2020-06-13 07:53:28 208-0: command 'zfs recv -F -- Disk5/vm-208-disk-2' failed: exit code 1
2020-06-13 07:53:28 208-0: delete previous replication snapshot '__replicate_208-0_1591994040__' on Disk5:vm-208-disk-0
2020-06-13 07:53:29 208-0: delete previous replication snapshot '__replicate_208-0_1591994040__' on Disk5:vm-208-disk-1
2020-06-13 07:53:31 208-0: delete previous replication snapshot '__replicate_208-0_1591994040__' on Disk5:vm-208-disk-2
2020-06-13 07:53:31 208-0: end replication job with error: command 'set -o pipefail && pvesm export Disk5:vm-208-disk-2 zfs - -with-snapshots 1 -snapshot __replicate_208-0_1591994040__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=emron-backup' root@192.168.10.223 -- pvesm import Disk5:vm-208-disk-2 zfs - -with-snapshots 1 -allow-rename 0' failed: exit code 1

I had a problem with space on the source disk: although there seemed to be enough space, when I started the replication it turned out there wasn't (the system reserves space for the replication), so I added another HDD to the pool.

Maybe the two disks are not sharing space and the first HDD fills up during the replication?
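To check whether the disks in the pool really do share space, I suppose the per-vdev allocation would show it (Disk5 is the pool name from the log):

`zpool list -v Disk5`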
 
I tried to, but I cannot find the journal on the destination.
the journal can be read with `journalctl` - see `man journalctl`

cannot receive new filesystem stream: out of space
seems there is not enough space - as you analyzed correctly

check the output of:
* `zpool status`
* `zpool list`
* `zfs list`
* `zfs get all $dataset` - for one of the datasets you receive into
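for a zvol you already received, the space-related properties are probably the interesting part, e.g. (dataset name taken from your log, adjust as needed):

`zfs get volsize,refreservation,used,available Disk5/vm-208-disk-0`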
 
I am waiting for a new HDD for the destination pool. I will report back on how it goes. I really hope it is that simple!
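Once the new disk arrives I assume extending the pool is just a `zpool add` (the device name below is a placeholder, and a dry run with `-n` first seems safer, since an added vdev is hard to remove again):

`zpool add -n Disk5 /dev/sdX`
`zpool add Disk5 /dev/sdX`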
 

About

The Proxmox community has been around for many years and offers help and support for Proxmox VE, Proxmox Backup Server, and Proxmox Mail Gateway.
We think our community is one of the best thanks to people like you!

Get your subscription!

The Proxmox team works very hard to make sure you are running the best software and getting stable updates and security enhancements, as well as quick enterprise support. Tens of thousands of happy customers have a Proxmox subscription. Get yours easily in our online shop.

Buy now!