Problems with Datapool

micwol

Hello,

the following command:

rbd ls -p Datapool --long

gives this output:

2024-10-04T11:54:59.695+0200 7b52cbe24780 -1 librbd::api::Image: list_images: error listing v1 images: (108) Cannot send after transport endpoint shutdown
rbd: listing images failed: (108) Cannot send after transport endpoint shutdown

and that happens on 2 of the nodes.
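Error (108) is ESHUTDOWN and it shows up while librbd tries to list old v1 images, so my guess is that the client connection from those two nodes gets cut off somewhere. A rough sketch of what could be checked next on one of the failing nodes (debug options as generally documented for the Ceph CLI tools, not verified here):

Code:
# more verbose output from the failing listing
rbd ls -p Datapool --long --debug-rbd=20 --debug-ms=1

# does the plain rados layer see the pool contents at all?
rados -p Datapool ls | head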

On the 3rd node I get this:

NAME           SIZE     PARENT  FMT  PROT  LOCK
vm-100-disk-0  8 GiB            2
vm-101-disk-0  250 GiB          2          excl


What problem do the other 2 nodes have?
All 3 nodes are running current software.
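Since only 2 of the 3 nodes are affected, comparing the nodes directly seems like the obvious next step; a minimal sketch, assuming the standard Proxmox file locations (ceph.conf and the admin keyring are distributed via /etc/pve, so they should be identical on every node):

Code:
# run on each of the 3 nodes and compare the results
ceph -s
md5sum /etc/ceph/ceph.conf /etc/pve/priv/ceph.client.admin.keyring
rbd ls -p Datapool

The installed package versions: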


Code:
proxmox-ve: 8.2.0 (running kernel: 6.8.12-2-pve)
pve-manager: 8.2.7 (running version: 8.2.7/3e0176e6bb2ade3b)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.8: 6.8.12-2
proxmox-kernel-6.8.12-2-pve-signed: 6.8.12-2
proxmox-kernel-6.8.12-1-pve-signed: 6.8.12-1
proxmox-kernel-6.8.8-2-pve-signed: 6.8.8-2
proxmox-kernel-6.8.4-2-pve-signed: 6.8.4-2
ceph: 18.2.4-pve3
ceph-fuse: 18.2.4-pve3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx9
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.1
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.4
libpve-access-control: 8.1.4
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.7
libpve-cluster-perl: 8.0.7
libpve-common-perl: 8.2.3
libpve-guest-common-perl: 5.1.4
libpve-http-server-perl: 5.1.1
libpve-network-perl: 0.9.8
libpve-rs-perl: 0.8.10
libpve-storage-perl: 8.2.5
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.4.0-4
proxmox-backup-client: 3.2.7-1
proxmox-backup-file-restore: 3.2.7-1
proxmox-firewall: 0.5.0
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.7
proxmox-widget-toolkit: 4.2.3
pve-cluster: 8.0.7
pve-container: 5.2.0
pve-docs: 8.2.3
pve-edk2-firmware: 4.2023.08-4
pve-esxi-import-tools: 0.7.2
pve-firewall: 5.0.7
pve-firmware: 3.13-2
pve-ha-manager: 4.0.5
pve-i18n: 3.2.3
pve-qemu-kvm: 9.0.2-3
pve-xtermjs: 5.3.0-3
qemu-server: 8.2.4
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.6-pve1
 
Code:
  cluster:
    id:     d95d1195-ecca-4149-9f2a-a52aa7c92e75
    health: HEALTH_WARN
            1 daemons have recently crashed
 
  services:
    mon: 3 daemons, quorum pve3,pve1,pve2 (age 51m)
    mgr: pve3(active, since 64m), standbys: pve1, pve2
    mds: 1/1 daemons up, 2 standby
    osd: 3 osds: 3 up (since 51m), 3 in (since 5d)
 
  data:
    volumes: 1/1 healthy
    pools:   4 pools, 97 pgs
    objects: 119.08k objects, 465 GiB
    usage:   1.4 TiB used, 6.8 TiB / 8.2 TiB avail
    pgs:     97 active+clean
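Regarding the "1 daemons have recently crashed" warning, the crash itself can presumably be identified with the standard crash tooling (the crash ID below is only a placeholder):

Code:
ceph crash ls
ceph crash info <crash-id>    # ID taken from the list above
ceph crash archive-all        # only clears the warning once the crash is understood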
 
After rebooting the 3rd node I get this:

Code:
 cluster:
    id:     d95d1195-ecca-4149-9f2a-a52aa7c92e75
    health: HEALTH_ERR
            Module 'devicehealth' has failed: disk I/O error
            1 daemons have recently crashed
            1 mgr modules have recently crashed
 
  services:
    mon: 3 daemons, quorum pve3,pve1,pve2 (age 3m)
    mgr: pve1(active, since 4m), standbys: pve2, pve3
    mds: 1/1 daemons up, 2 standby
    osd: 3 osds: 3 up (since 2m), 3 in (since 5d)
 
  data:
    volumes: 1/1 healthy
    pools:   4 pools, 97 pgs
    objects: 119.08k objects, 465 GiB
    usage:   1.4 TiB used, 6.8 TiB / 8.2 TiB avail
    pgs:     96 active+clean
             1  active+clean+scrubbing+deep
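As far as I understand it, the 'devicehealth' module keeps its data inside the cluster (the .mgr pool in recent Ceph releases), so the "disk I/O error" may mean the active mgr could not reach that storage rather than pointing at a failing local disk; that is only an assumption. A rough sketch of what could be checked (the node name is an example):

Code:
ceph health detail
ceph osd pool ls detail                  # is the .mgr pool present and healthy?
systemctl restart ceph-mgr@pve1.service  # restart the active mgr on its node; adjust the name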
 
I deleted and recreated the managers one after the other on node 1 and node 2.
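For reference, destroying and recreating a manager can be done in the GUI or, presumably equivalently, with the standard Proxmox CLI, one node at a time (the node name is an example):

Code:
pveceph mgr destroy pve1
pveceph mgr create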
Now "ceph -s" shows this:

Code:
cluster:
    id:     d95d1195-ecca-4149-9f2a-a52aa7c92e75
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum pve3,pve1,pve2 (age 65m)
    mgr: pve3(active, since 2m), standbys: pve2, pve1
    mds: 1/1 daemons up, 2 standby
    osd: 3 osds: 3 up (since 65m), 3 in (since 5d)
 
  data:
    volumes: 1/1 healthy
    pools:   4 pools, 97 pgs
    objects: 119.08k objects, 465 GiB
    usage:   1.4 TiB used, 6.8 TiB / 8.2 TiB avail
    pgs:     97 active+clean

However, my "Datapool" still does not work:

"rbd error: rbd: listing images failed: (108) Cannot send after transport endpoint shutdown (500)"
 
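The trailing (500) suggests this message comes from the Proxmox storage layer (GUI/pvestatd) rather than directly from Ceph. A minimal sketch of how that could be narrowed down, assuming the standard PVE 8 service names:

Code:
# does the listing also fail on the CLI, or only in the GUI?
rbd ls -p Datapool --long

# any stale client blocklist entries left over from the crash?
ceph osd blocklist ls

# restart the PVE status/API daemons to rule out a stuck worker
systemctl restart pvestatd pvedaemon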
