You haven't updated librbd to Jewel (10.2.0)? (I mean on your Proxmox 4.3 node.)
No; there are no pending upgrades for the ceph stuff on the proxmox nodes.
You haven't updated librbd to Jewel (10.2.0)? (I mean on your Proxmox 4.3 node.)
dpkg -l | egrep -i '(ceph|rbd|rados)'
ii ceph-common 0.80.8-1~bpo70+1 amd64 common utilities to mount and interact with a ceph storage cluster
ii libcephfs1 0.80.7-2+deb8u1 amd64 Ceph distributed file system client library
ii librados2 0.80.8-1~bpo70+1 amd64 RADOS distributed object store client library
ii librados2-perl 1.0-3 amd64 Perl bindings for librados
ii librbd1 0.80.8-1~bpo70+1 amd64 RADOS block device client library
ii python-ceph 0.80.8-1~bpo70+1 amd64 Python libraries for the Ceph distributed filesystem
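For anyone wanting to script the same check, here is a minimal sketch that pulls the version column out of `dpkg -l` output and compares it against 10.2.0 (the Jewel version mentioned above). The helper names `pkg_version` and `version_lt` are made up for illustration, and the comparison assumes GNU `sort -V`; on a real Debian node `dpkg --compare-versions` follows Debian version rules more precisely.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Proxmox or Ceph.

# Print the version of one installed ("ii") package from `dpkg -l` output on stdin.
pkg_version() {
    # $1 = package name, e.g. librbd1; a ":arch" suffix in the name column is ignored
    awk -v pkg="$1" '$1 == "ii" { n = $2; sub(/:.*/, "", n); if (n == pkg) { print $3; exit } }'
}

# Succeed if version $1 sorts strictly before version $2 (GNU `sort -V` ordering).
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Usage on a node:
#   v=$(dpkg -l | pkg_version librbd1)
#   if version_lt "$v" 10.2.0; then echo "librbd1 $v is older than Jewel"; fi
# Stricter alternative using Debian's own ordering:
#   dpkg --compare-versions "$v" lt 10.2.0 && echo "older than Jewel"
```

Fed the listing above, `pkg_version librbd1` would pick out the `0.80.8-1~bpo70+1` string, which sorts before 10.2.0.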
I'm surprised that you are able to connect to your Jewel cluster.
Using an old librbd against a newer Ceph cluster version is neither supported nor tested by the Ceph team.
Using a new librbd with an old Ceph cluster is OK and tested.
0.80.8 is the one hosted in the Debian jessie main repository.
Mir, where did *you* get 0.80.8 on 4.3?
The packages from Inktank are built on and for Debian jessie, the same way packages are made for Debian jessie-backports, so this is a "supported" install.
What about using a new librbd with Proxmox in violation of version-specific dependencies? Is that tested?
I think we have found the cause of your migration problems.
I'm surprised that you are able to connect to your Jewel cluster.
Using an old librbd against a newer Ceph cluster version is neither supported nor tested by the Ceph team.
Hi,
Currently my efforts are focused on finding a repro that does not involve corrupting live production data. So far, no luck.
deb http://download.ceph.com/debian-firefly/ jessie main
Have you configured the tunables on your Jewel cluster to be compatible with Firefly librbd?
Per the Ceph developers, mismatched client/server versions should not cause any issues without supplemental stupidity (such as using features or tunables not supported by the older client); they say they work very hard to preserve compatibility.
Again, qemu drive-mirror aborts if any I/O error (read or write) occurs during the migration (on source or destination).
And even in the case of supplemental stupidity, errors would be the expected result, not silent data corruption.
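On the tunables question, a rough sketch of how one might check which CRUSH tunables profile the cluster reports. It reads the JSON printed by `ceph osd crush show-tunables` and assumes that output carries a `"profile"` field. The helper names are hypothetical, and the list of profiles treated as safe for a Firefly-era client is a conservative assumption, not something the Ceph team has confirmed in this thread.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Ceph.

# Print the tunables profile name from `ceph osd crush show-tunables` JSON on stdin.
tunables_profile() {
    sed -n 's/.*"profile"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Succeed if the profile is one a Firefly-era client is assumed to understand.
# Anything newer (hammer, jewel) or unrecognised is treated as "not ok".
firefly_client_ok() {
    case "$1" in
        legacy|argonaut|bobtail|firefly) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on a cluster node:
#   p=$(ceph osd crush show-tunables | tunables_profile)
#   firefly_client_ok "$p" || echo "profile '$p' may be too new for 0.80.x clients"
```

This only looks at the profile name. It says nothing about RBD image features, which would need a separate check with `rbd info`.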
This is strange. For image creation, the command executed by Proxmox is the same.
By contrast, we did set up a test with two Proxmox 4.3 nodes, one running Jewel and one on the default 0.80.7. We promptly got a "function not implemented" error when trying to create an image on the Firefly cluster, or when trying to migrate an image from a Jewel cluster to a Firefly cluster. And live migrating any VM from the Proxmox-jewel node to the Proxmox-firefly node fails: "Error: start failed."