rbd Error, can't make new virtual machines

tchmnkyz

New Member
Mar 19, 2013
Hey Guys,

So I have my Ceph cluster running nicely, and everything had been great until I did a recent update. Now, after the upgrade (apt-get dist-upgrade), I get the following error when trying to create VMs:

TASK ERROR: create failed - rbd create vm-101-disk-1' error: rbd: create error: (22) Invalid argument

I am not sure what other info to give to help get this issue resolved. Please see the info below:

From the Ceph Cluster
# ceph -v
ceph version 0.56.3 (6eb7e15a4783b122e9b0c85ea9ba064145958aa5)

From the first node in my cluster

root@node01:~ # pveversion -v
pve-manager: 2.3-13 (pve-manager/2.3/7946f1f1)
running kernel: 2.6.32-18-pve
proxmox-ve-2.6.32: 2.3-88
pve-kernel-2.6.32-16-pve: 2.6.32-82
pve-kernel-2.6.32-18-pve: 2.6.32-88
pve-kernel-2.6.32-17-pve: 2.6.32-83
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.4-4
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.93-2
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.9-1
pve-cluster: 1.0-36
qemu-server: 2.3-18
pve-firmware: 1.0-21
libpve-common-perl: 1.0-48
libpve-access-control: 1.0-26
libpve-storage-perl: 2.3-6
vncterm: 1.0-3
vzctl: 4.0-1pve2
vzprocps: 2.0.11-2
vzquota: 3.1-1
pve-qemu-kvm: 1.4-8
ksm-control-daemon: 1.1-1

root@node01:~ # ceph -v
ceph version 0.56.3 (6eb7e15a4783b122e9b0c85ea9ba064145958aa5)

Any help anyone can give would be greatly appreciated!
 
Hi, what is your Ceph cluster version? 0.56 is the minimum now.
 
All of the nodes in the cluster are on the same Ceph version:

ceph version 0.56.3 (6eb7e15a4783b122e9b0c85ea9ba064145958aa5)

I took that from each of the servers and made sure they all match.
 
Oh, I also forgot to say that the RBD storage configuration changed slightly in Proxmox 2.3:

http://pve.proxmox.com/wiki/Storage:_Ceph



rbd: mycephcluster
     monhost 192.168.0.1:6789;192.168.0.2:6789;192.168.0.3:6789
     pool rbd (optional, default = rbd)
     username admin (optional, default = admin)
     content images
 
rbd: Ceph01
monhost 10.15.8.50:6789;10.15.8.51:6789;10.15.8.52:6789
pool vms
content images
username admin

It is set up in a similar fashion.
 
OK, I think I may have found more insight into it. When trying to manually create test images, it seems that with my version of rbd, "--format 2" fails to create the image, but when it is set to 1 it creates just fine. So my question is: if I change the file "/usr/share/perl5/PVE/Storage/RBDPlugin.pm" to only use format 1, will that temporarily resolve my issue? The chunk in question is:

<pre>
sub alloc_image {
    my ($class, $storeid, $scfg, $vmid, $fmt, $name, $size) = @_;

    die "illegal name '$name' - should be 'vm-$vmid-*'\n"
        if $name && $name !~ m/^vm-$vmid-/;

    $name = &$find_free_diskname($storeid, $scfg, $vmid);

    my $cmd = &$rbd_cmd($scfg, $storeid, 'create', '--format', 2, '--size', ($size/1024), $name);
    run_command($cmd, errmsg => "rbd create $name' error", errfunc => sub {});

    return $name;
}
</pre>

So my thought was to change this 2 to a 1 and give that a try. Please let me know what you think!
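For what it's worth, the command line the plugin builds can be sketched in shell before touching RBDPlugin.pm; the VMID, size, and image name below are made up for illustration:

```shell
# Build the same command line the plugin would run, with the format flag
# in a variable so the fallback to the old format is easy to test.
RBD_FORMAT=1   # set back to 2 once every daemon actually runs >= 0.56
VMID=101
SIZE_MB=4096   # the plugin converts its size argument to MB ($size/1024)
CMD="rbd create --format $RBD_FORMAT --size $SIZE_MB vm-$VMID-disk-1"
echo "$CMD"
# Run the printed command against the cluster: format 1 should succeed
# on any Ceph version, while format 2 needs >= 0.56 on all daemons.
```

Running the echoed command by hand reproduces the failure outside of Proxmox, which narrows the problem down to Ceph itself.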
 
I guess we don't really want to use that old format. Maybe you need to create a new pool to allow format 2 - can you please test that?
 
Just created a brand new pool and then tried again and it still errors out when trying to use format 2.
 
I found the issue. It turns out that when Ceph updates its deb packages, it does not restart the daemons. As such, the version running in memory was actually 0.48.3. I restarted all of the nodes in the cluster and the problem was fixed!
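For anyone else who hits this, the mismatch can be spotted by comparing the installed package version against what a long-running daemon reports; the sketch below uses hard-coded example versions, and the daemon name and socket path are assumptions for a default install:

```shell
# Version the package manager installed ("ceph -v" from the CLI reports this):
installed="0.56.3"   # e.g. from: dpkg-query -W -f='${Version}\n' ceph
# Version a running daemon is actually executing (hypothetical value here;
# query it with: ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok version)
running="0.48.3"

if [ "$installed" != "$running" ]; then
    echo "daemon restart needed: running $running, installed $installed"
    # e.g. service ceph restart, or restart each node in turn
fi
```

Repeating the check on every node catches stragglers that apt upgraded but never restarted.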
 
