Proxmox Ceph Problem

Aug 8, 2024
Hello,

For the past two weeks, I've been encountering an issue where I can no longer clone or move a disk to Ceph storage. Here’s the cloning output:
Code:
create full clone of drive scsi0 (Ceph-VM-Pool:vm-120-disk-0)
transferred 0.0 B of 32.0 GiB (0.00%)
qemu-img: Could not open 'zeroinit:rbd:Ceph-VM-Pool/vm-111-disk-0:conf=/etc/pve/ceph.conf:id=admin:keyring=/etc/pve/priv/ceph/Ceph-VM-Pool.keyring': Could not open 'rbd:Ceph-VM-Pool/vm-111-disk-0:conf=/etc/pve/ceph.conf:id=admin:keyring=/etc/pve/priv/ceph/Ceph-VM-Pool.keyring': No such file or directory
Removing image: 1% complete...
[...]
Removing image: 100% complete...done.
TASK ERROR: clone failed: copy failed: command '/usr/bin/qemu-img convert -p -n -f raw -O raw 'rbd:Ceph-VM-Pool/vm-120-disk-0:conf=/etc/pve/ceph.conf:id=admin:keyring=/etc/pve/priv/ceph/Ceph-VM-Pool.keyring' 'zeroinit:rbd:Ceph-VM-Pool/vm-111-disk-0:conf=/etc/pve/ceph.conf:id=admin:keyring=/etc/pve/priv/ceph/Ceph-VM-Pool.keyring'' failed: exit code 1

I came across posts describing a similar issue, but unfortunately they didn't resolve my problem. My current version is: pve-manager/8.2.4/faa83925c9641325 (running kernel: 6.8.8-4-pve).
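
If the full package list is needed, it can be pulled with pveversion -v; the grep below is just a convenience to narrow the output down to the QEMU-related packages:

Code:
# print all PVE-related package versions, then filter for the QEMU packages
pveversion -v | grep -i qemu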

Thanks in advance!
 
Hello,

Could you please send us the output of:

From the host hosting VM 111:
- cat /etc/pve/storage.cfg
- cat /etc/pve/qemu-server/111.conf
- cat /etc/pve/ceph.conf
- cat /etc/network/interfaces

From a node in the Ceph cluster:
- ceph -s
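
If it is easier, the host-side files can be collected in one go; here is a minimal sketch (adjust the file list if your paths differ):

Code:
# dump the requested configuration files from the host running the clone task
for f in /etc/pve/storage.cfg /etc/pve/qemu-server/111.conf \
         /etc/pve/ceph.conf /etc/network/interfaces; do
    echo "===== $f ====="
    cat "$f"
done

# and from any node in the Ceph cluster:
ceph -s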
 
I am unable to retrieve the configuration of VM 111 because that is the ID of the newly cloned VM; the source VM is 120. Please don't be surprised by the network configuration: this is a specialized Proxmox cluster in a data center at a hosting provider, hence the unusual setup. Below are the outputs:

Code:
cat /etc/pve/storage.cfg

dir: local
        disable
        path /var/lib/vz
        content iso,vztmpl,backup
        shared 0

zfspool: local-zfs
        pool rpool/data
        content images,rootdir
        sparse 1

cephfs: Ceph-Pool
        path /mnt/pve/Ceph-Pool
        content iso,backup,vztmpl
        fs-name Ceph-Pool

rbd: Ceph-VM-Pool
        content rootdir,images
        krbd 0
        pool Ceph-VM-Pool

pbs: S56-BackupSRV
        datastore Cluster_Backup
        server —removed—
        content backup
        encryption-key —removed—
        fingerprint —removed—
        prune-backups keep-all=1
        username backupuser@pbs

Code:
cat /etc/pve/qemu-server/120.conf

boot: order=scsi0;ide2;net0
cores: 1
cpu: x86-64-v2-AES
ide2: Ceph-Pool:iso/linuxmint-21.3-cinnamon-64bit.iso,media=cdrom,size=2995344K
memory: 2048
meta: creation-qemu=8.1.5,ctime=1716295137
name: Management.Showforces.Node1
net0: virtio=BC:24:11:92:85:E2,bridge=vmbr6
numa: 0
ostype: l26
scsi0: Ceph-VM-Pool:vm-120-disk-0,iothread=1,size=32G
scsihw: virtio-scsi-single
smbios1: uuid=21e97f45-6e54-4f2c-8e24-d85a4c8bb8b4
sockets: 1
tags: Showforces
vga: vmware,clipboard=vnc
vmgenid: 5142632b-2b5f-4fad-8a43-b3c0bcd994a0

Code:
cat /etc/pve/ceph.conf

[global]
        auth_client_required = cephx
        auth_cluster_required = cephx
        auth_service_required = cephx
        cluster_network = 10.10.11.1/29
        fsid = f844f866-897e-4a69-a547-05581cc8f1a3
        mon_allow_pool_delete = true
        mon_host = 10.10.11.1 10.10.11.2 10.10.11.3
        ms_bind_ipv4 = true
        ms_bind_ipv6 = false
        osd_pool_default_min_size = 2
        osd_pool_default_size = 3
        public_network = 10.10.11.1/29

[client]
        keyring = /etc/pve/priv/$cluster.$name.keyring

[client.crash]
        keyring = /etc/pve/ceph/$cluster.$name.keyring

[mds]
        keyring = /var/lib/ceph/mds/ceph-$id/keyring

[mds.node1]
        host = node1
        mds_standby_for_name = pve

[mds.node2]
        host = node2
        mds_standby_for_name = pve

[mds.node3]
        host = node3
        mds_standby_for_name = pve

[mon.node1]
        public_addr = 10.10.11.1

[mon.node2]
        public_addr = 10.10.11.2

[mon.node3]
        public_addr = 10.10.11.3

Code:
cat /etc/network/interfaces

# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage parts of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!

auto lo
iface lo inet loopback

iface eno1 inet manual

iface enxaaf31a9525c8 inet manual

iface eno2 inet manual

auto ens21f0
iface ens21f0 inet manual

auto ens21f1
iface ens21f1 inet manual

auto ens23f0
iface ens23f0 inet manual

auto ens23f1
iface ens23f1 inet manual

auto bond0
iface bond0 inet static
        address 10.10.10.1/29
        bond-slaves ens21f0 ens21f1
        bond-miimon 100
        bond-mode broadcast
#Bond Corosync

auto bond1
iface bond1 inet static
        address 10.10.11.1/29
        bond-slaves ens23f0 ens23f1
        bond-miimon 100
        bond-mode broadcast
        mtu 9000
#Bond Ceph (MTU:9000)

auto vmbr0
iface vmbr0 inet static
        address —removed—
        gateway —removed—
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
        pointtopoint —removed—

auto vmbr3
iface vmbr3 inet static
        address —removed—
        bridge-ports eno1.4000
        bridge-stp off
        bridge-fd 0
        mtu 1400
#Wan-Adressen (MTU:1400)

auto vmbr4
iface vmbr4 inet manual
        bridge-ports none
        bridge-stp off
        bridge-fd 0
#Netz —removed—

auto vmbr5
iface vmbr5 inet manual
        bridge-ports none
        bridge-stp off
        bridge-fd 0
#Netz —removed—

auto vmbr6
iface vmbr6 inet manual
        bridge-ports none
        bridge-stp off
        bridge-fd 0
#Netz —removed—

auto vmbr7
iface vmbr7 inet manual
        bridge-ports none
        bridge-stp off
        bridge-fd 0
#Netz —removed—

auto vmbr8
iface vmbr8 inet manual
        bridge-ports none
        bridge-stp off
        bridge-fd 0
#Netz —removed—

auto vmbr1
iface vmbr1 inet manual
        bridge-ports none
        bridge-stp off
        bridge-fd 0
#Netz —removed—

source /etc/network/interfaces.d/*

Code:
root@node1:~# ceph -s
  cluster:
    id:     f844f866-897e-4a69-a547-05581cc8f1a3
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum node1,node2,node3 (age 13d)
    mgr: node3(active, since 13d), standbys: node2, node1
    mds: 1/1 daemons up, 2 standby
    osd: 6 osds: 6 up (since 13d), 6 in (since 8w)

  data:
    volumes: 1/1 healthy
    pools:   4 pools, 97 pgs
    objects: 310.71k objects, 1.2 TiB
    usage:   3.5 TiB used, 7.0 TiB / 10 TiB avail
    pgs:     97 active+clean

  io:
    client:   0 B/s rd, 276 KiB/s wr, 0 op/s rd, 76 op/s wr
 
Please, would you be so kind as to edit the post and use "Code" blocks as provided by the message editor. It is very hard to read it like that.
 
I changed the storage.cfg file to include the monhost option, but unfortunately it didn't help. Here is the output from cloning:

Code:
create full clone of drive scsi0 (Ceph-VM-Pool:vm-120-disk-0)
transferred 0.0 B of 32.0 GiB (0.00%)
qemu-img: Could not open 'zeroinit:rbd:Ceph-VM-Pool/vm-111-disk-0:mon_host=10.10.11.1;10.10.11.2;10.10.11.3:auth_supported=cephx:id=admin:keyring=/etc/pve/priv/ceph/Ceph-VM-Pool.keyring': Could not open 'rbd:Ceph-VM-Pool/vm-111-disk-0:mon_host=10.10.11.1;10.10.11.2;10.10.11.3:auth_supported=cephx:id=admin:keyring=/etc/pve/priv/ceph/Ceph-VM-Pool.keyring': No such file or directory
Removing image: 1% complete...
[...]
Removing image: 100% complete...done.
TASK ERROR: clone failed: copy failed: command '/usr/bin/qemu-img convert -p -n -f raw -O raw 'rbd:Ceph-VM-Pool/vm-120-disk-0:mon_host=10.10.11.1;10.10.11.2;10.10.11.3:auth_supported=cephx:id=admin:keyring=/etc/pve/priv/ceph/Ceph-VM-Pool.keyring' 'zeroinit:rbd:Ceph-VM-Pool/vm-111-disk-0:mon_host=10.10.11.1;10.10.11.2;10.10.11.3:auth_supported=cephx:id=admin:keyring=/etc/pve/priv/ceph/Ceph-VM-Pool.keyring'' failed: exit code 1
 
Could you please post the contents of storage.cfg? From the command above it seems you used `;` as a separator for the IPs, while it should be whitespace (` `).
 
I used spaces as in the article. It looks like the ';' is inserted automatically.

Code:
root@node1:~# cat /etc/pve/storage.cfg
dir: local
        disable
        path /var/lib/vz
        content iso,vztmpl,backup
        shared 0

zfspool: local-zfs
        pool rpool/data
        content images,rootdir
        sparse 1

cephfs: Ceph-Pool
        path /mnt/pve/Ceph-Pool
        content iso,backup,vztmpl
        fs-name Ceph-Pool

rbd: Ceph-VM-Pool
        monhost 10.10.11.1 10.10.11.2 10.10.11.3
        content rootdir,images
        krbd 0
        pool Ceph-VM-Pool



root@node1:~#
 
I tried manually creating a VM disk and mapping it using the same keyring path, and it worked without any issues. So I'm assuming the problem lies in the Proxmox Ceph configuration used for cloning. I'm still trying to solve this, but I would be thankful for any new tips or suggestions. :)
Code:
root@node3:~# rbd create Ceph-VM-Pool/vm-200-disk-0 --size 10G --id admin --keyring /etc/pve/priv/ceph/Ceph-VM-Pool.keyring
root@node3:~# rbd map Ceph-VM-Pool/vm-200-disk-0 --id admin --keyring /etc/pve/priv/ceph/Ceph-VM-Pool.keyring

/dev/rbd0
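
Since the clone goes through qemu-img (librbd) rather than the kernel client, I assume the same access path can also be tested directly, and the test image removed again afterwards, roughly like this (same id and keyring path as above):

Code:
# open the existing source image through QEMU's rbd driver, as the clone job does
qemu-img info 'rbd:Ceph-VM-Pool/vm-120-disk-0:conf=/etc/pve/ceph.conf:id=admin:keyring=/etc/pve/priv/ceph/Ceph-VM-Pool.keyring'

# unmap and delete the manually created test image
rbd unmap /dev/rbd0
rbd rm Ceph-VM-Pool/vm-200-disk-0 --id admin --keyring /etc/pve/priv/ceph/Ceph-VM-Pool.keyring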
 
You seem to have only one Ceph network (10.10.11.1/29); public_network and cluster_network are identical. You do not need to specify the cluster_network in the configuration in this case.
Can I just comment out the cluster_network line like this, and do I need to recreate all the MONs?
Code:
[global]
        auth_client_required = cephx
        auth_cluster_required = cephx
        auth_service_required = cephx
#       cluster_network = 10.10.11.1/29
        fsid = f844f866-897e-4a69-a547-05581cc8f1a3
        mon_allow_pool_delete = true
        mon_host = 10.10.11.1 10.10.11.2 10.10.11.3
        ms_bind_ipv4 = true
        ms_bind_ipv6 = false
        osd_pool_default_min_size = 2
        osd_pool_default_size = 3
        public_network = 10.10.11.1/29
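
For reference, I assume the values the daemons are currently running with can be checked through their admin sockets on the respective node, e.g.:

Code:
# ask the local monitor and an OSD which networks they are actually using
# (run on the node that hosts mon.node1 / osd.0; default socket paths assumed)
ceph daemon mon.node1 config get cluster_network
ceph daemon osd.0 config get public_network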
 
I resolved the issue by installing version 8.1.5-6 of the pve-qemu-kvm package with this command:
Code:
apt install pve-qemu-kvm:amd64=8.1.5-6
Version 9.0.0-5 is installed on the other nodes, but I am no longer able to install it for testing purposes on the node where I manually installed 8.1.5-6. Is this package version no longer available, or has it been removed?
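
To see which versions the configured repositories still offer, and to keep the downgraded package from being upgraded again, something like this should work:

Code:
# list all pve-qemu-kvm versions currently available from the repositories
apt-cache policy pve-qemu-kvm
apt list -a pve-qemu-kvm

# optionally pin the working version so it is not replaced on the next upgrade
apt-mark hold pve-qemu-kvm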
 
