Proxmox 7.4, 3-node cluster, Ceph OK but RADOS seems corrupted

Romainp

Active Member
Jan 23, 2018
Hi all!
Hope someone can help and guide me in the right direction :)
I have a 3-node cluster with Proxmox 7.4.x.
After upgrading the last of my 3 nodes from 7.2 to 7.3, I started to have some issues.
Ceph seems to be working fine and reports healthy:
Code:
root@socsmi-srv1:~# pveceph status
  cluster:
    id:     ba2ffda3-4124-4c6d-8386-b6d07aa6f47e
    health: HEALTH_OK


  services:
    mon: 3 daemons, quorum socsmi-srv1,socsmi-srv2,socsmi-srv3 (age 100m)
    mgr: socsmi-srv3(active, since 3h), standbys: socsmi-srv2, socsmi-srv1
    mds: 1/1 daemons up, 2 standby
    osd: 71 osds: 71 up (since 2h), 71 in (since 7M)


  data:
    volumes: 1/1 healthy
    pools:   3 pools, 80 pgs
    objects: 1.59M objects, 5.8 TiB
    usage:   18 TiB used, 99 TiB / 116 TiB avail
    pgs:     80 active+clean


  io:
    client:   0 B/s rd, 968 KiB/s wr, 0 op/s rd, 19 op/s wr

But on every operation on a VM I get rbd errors:

Code:
root@socsmi-srv1:~# rbd ls -l -p .mgr
rbd: error opening base-101-disk-0: (2) No such file or directory
rbd: error opening base-107-disk-0: (2) No such file or directory
rbd: error opening base-114-disk-0: (2) No such file or directory
rbd: error opening vm-100-disk-0: (2) No such file or directory
rbd: error opening vm-102-disk-0: (2) No such file or directory
rbd: error opening vm-102-disk-1: (2) No such file or directory
rbd: error opening vm-103-disk-0: (2) No such file or directory
rbd: error opening vm-103-disk-1: (2) No such file or directory
rbd: error opening vm-104-disk-0: (2) No such file or directory
rbd: error opening vm-105-disk-0: (2) No such file or directory
rbd: error opening vm-105-disk-1: (2) No such file or directory
rbd: error opening vm-105-disk-2: (2) No such file or directory
rbd: error opening vm-105-disk-3: (2) No such file or directory
rbd: error opening vm-105-disk-4: (2) No such file or directory
rbd: error opening vm-106-disk-0: (2) No such file or directory
rbd: error opening vm-106-disk-1: (2) No such file or directory
rbd: error opening vm-106-state-Before_upgrade_7_0: (2) No such file or directory
rbd: error opening vm-108-disk-0: (2) No such file or directory
rbd: error opening vm-109-disk-0: (2) No such file or directory
rbd: error opening vm-109-disk-1: (2) No such file or directory
rbd: error opening vm-109-disk-2: (2) No such file or directory
rbd: error opening vm-110-disk-0: (2) No such file or directory
rbd: error opening vm-111-disk-0: (2) No such file or directory
rbd: error opening vm-112-disk-0: (2) No such file or directory
rbd: error opening vm-113-disk-0: (2) No such file or directory
rbd: error opening vm-113-disk-1: (2) No such file or directory
rbd: error opening vm-115-disk-0: (2) No such file or directory
rbd: error opening vm-115-disk-1: (2) No such file or directory
rbd: error opening vm-116-disk-0: (2) No such file or directory
rbd: error opening vm-116-disk-1: (2) No such file or directory
rbd: error opening vm-116-disk-2: (2) No such file or directory
rbd: error opening vm-117-disk-0: (2) No such file or directory
rbd: error opening vm-117-state-patching: (2) No such file or directory
rbd: error opening vm-118-disk-0: (2) No such file or directory
rbd: error opening vm-118-state-patching: (2) No such file or directory
rbd: error opening vm-119-disk-0: (2) No such file or directory
rbd: error opening vm-119-state-patching: (2) No such file or directory
rbd: error opening vm-120-disk-0: (2) No such file or directory
rbd: error opening vm-121-disk-0: (2) No such file or directory
rbd: error opening vm-122-disk-0: (2) No such file or directory
rbd: error opening vm-123-disk-0: (2) No such file or directory
rbd: error opening vm-124-disk-0: (2) No such file or directory
rbd: error opening vm-125-disk-0: (2) No such file or directory
rbd: error opening vm-126-disk-0: (2) No such file or directory
rbd: error opening vm-126-disk-1: (2) No such file or directory
NAME  SIZE  PARENT  FMT  PROT  LOCK
rbd: listing images failed: (2) No such file or directory
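
To see at a glance which VMs are affected, the error lines above can be grouped by VM ID. This is only a minimal sketch: the function names `failed_images` and `failed_per_vm` are made up for illustration, and it assumes the error lines have exactly the `rbd: error opening <image>: ...` shape shown in the listing.

```shell
# Hypothetical helpers, not part of Ceph or Proxmox.

# Print the image name from each "rbd: error opening <image>: ..." line on stdin.
failed_images() {
    sed -n 's/^rbd: error opening \([^:]*\): .*/\1/p'
}

# Count failing images per VM ID, i.e. the number in base-<id>-... or vm-<id>-...
failed_per_vm() {
    failed_images | sed -E -n 's/^(base|vm)-([0-9]+)-.*/\2/p' | sort -n | uniq -c
}
```

It could be fed from the same command as above, for example `rbd ls -l -p .mgr 2>&1 | failed_per_vm`, since the errors are likely written to stderr.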


When I try to delete an image:

Code:
root@socsmi-srv1:~# rbd rm vm-126-disk-0 -p .mgr --debug-rados=20
2023-05-25T16:56:55.858-0400 7f899d4ce4c0  1 librados: starting msgr at
2023-05-25T16:56:55.858-0400 7f899d4ce4c0  1 librados: starting objecter
2023-05-25T16:56:55.858-0400 7f899d4ce4c0  1 librados: setting wanted keys
2023-05-25T16:56:55.858-0400 7f899d4ce4c0  1 librados: calling monclient init
2023-05-25T16:56:55.862-0400 7f899d4ce4c0  1 librados: init done
2023-05-25T16:56:55.862-0400 7f899d4ce4c0 10 librados: wait_for_osdmap waiting
2023-05-25T16:56:55.866-0400 7f899d4ce4c0 10 librados: wait_for_osdmap done waiting
2023-05-25T16:56:55.866-0400 7f899d4ce4c0 10 librados: call oid=rbd_directory nspace=
2023-05-25T16:56:55.866-0400 7f899d4ce4c0 10 librados: Objecter returned from call r=0
2023-05-25T16:56:55.890-0400 7f899a004700 10 librados: set snap write context: seq = 4d and snaps = [4d]
2023-05-25T16:56:55.890-0400 7f8999803700 20 librados: queue_aio_write 0x559863880260 completion 0x7f8980001f30 write_seq 1
2023-05-25T16:56:55.894-0400 7f899b006700 20 librados: complete_aio_write 0x7f8980001f30
2023-05-25T16:56:55.898-0400 7f899a004700 20 librados: queue_aio_write 0x559863880260 completion 0x7f8980001f30 write_seq 2
2023-05-25T16:56:55.898-0400 7f899c008700 20 librados: complete_aio_write 0x7f8980001f30
2023-05-25T16:56:55.902-0400 7f899a004700 -1 librbd::object_map::RefreshRequest: failed to load object map: rbd_object_map.f05bbf6e1ad169
2023-05-25T16:56:55.902-0400 7f899a004700 -1 librbd::object_map::InvalidateRequest: 0x7f8980015740 should_complete: r=0
2023-05-25T16:56:55.902-0400 7f8999803700 10 librados: handle_notify 63625645522979 cookie 140228534818720 notifier_id 35743643 len 26
2023-05-25T16:56:55.902-0400 7f899a004700 10 librados: finish linger op 0x7f8980017810 acked (0)
2023-05-25T16:56:55.902-0400 7f899a004700 10 librados: operator() completed notify (linger op 0x7f8980017810), ec = system:0
2023-05-25T16:56:55.902-0400 7f899a004700 20 librados: queue_aio_write 0x559863880260 completion 0x559863661100 write_seq 3
2023-05-25T16:56:55.906-0400 7f899c008700 20 librados: complete_aio_write 0x559863661100
2023-05-25T16:56:55.906-0400 7f8999803700 20 librados: queue_aio_write 0x559863880260 completion 0x559863661100 write_seq 4
2023-05-25T16:56:55.906-0400 7f899b006700 20 librados: complete_aio_write 0x559863661100
2023-05-25T16:56:55.906-0400 7f899a004700 10 librados: handle_notify 63625645522980 cookie 140228534818720 notifier_id 35743643 len 26
2023-05-25T16:56:55.906-0400 7f8999803700 10 librados: finish linger op 0x7f8978003060 acked (0)
2023-05-25T16:56:55.910-0400 7f8999803700 10 librados: operator() completed notify (linger op 0x7f8978003060), ec = system:0
2023-05-25T16:56:55.910-0400 7f899a004700 10 librados: async_watch_flush enter
2023-05-25T16:56:55.910-0400 7f899a004700 10 librados: async_watch_flush exit
2023-05-25T16:56:55.910-0400 7f89897fa700 20 librados: flush_aio_writes
2023-05-25T16:56:55.910-0400 7f89897fa700 20 librados: flush_aio_writes
Removing image: 0% complete...failed.
2023-05-25T16:56:55.918-0400 7f89897fa700 20 librados: flush_aio_writes
2023-05-25T16:56:55.918-0400 7f89897fa700 20 librados: flush_aio_writes
rbd: error opening image vm-126-disk-0: (2) No such file or directory
rbd: image has snapshots with linked clones - these must be deleted or flattened before the image can be removed.
2023-05-25T16:56:55.918-0400 7f899d4ce4c0 10 librados: watch_flush enter
2023-05-25T16:56:55.918-0400 7f899d4ce4c0 10 librados: watch_flush exit
2023-05-25T16:56:55.918-0400 7f899d4ce4c0  1 librados: shutdown
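
Most of the debug output above is routine librados tracing. A small filter makes the relevant lines easier to spot. This is a sketch under two assumptions: that the third whitespace-separated field of each log line is the log level, with `-1` marking errors (as the two `librbd::object_map` lines suggest), and that the name `ceph_errors` and the file name in the usage note are purely illustrative.

```shell
# Hypothetical helper: keep only error-level log lines (level field "-1")
# and the plain "rbd: ..." messages printed by the rbd tool itself.
ceph_errors() {
    awk '$3 == "-1" || /^rbd: /'
}
```

With the output above saved to a file (say `rbd-rm-debug.log`), `ceph_errors < rbd-rm-debug.log` would leave only the two object map lines and the two `rbd:` messages.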

I'm not sure what to do next at this point.

Thanks in advance for your comments and help.
 
