df command hangs partway through

Ayush (Oct 27, 2023)
Hi Team,

I have a 3-node cluster, and all three nodes have Ceph and ZFS pools. Recently we have been facing an issue where running the df command hangs partway through its output. Ceph status shows OK and healthy.

How can I find the cause of this issue?
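A df hang like this is usually one unresponsive mount point blocking a statfs() call, rather than df itself. A minimal way to narrow down which mount is at fault (a sketch; the 5-second timeout is an arbitrary choice):

strace -tt -e trace=statfs df -h     # the last statfs() printed before the stall names the blocking filesystem

findmnt -rno TARGET | while read -r m; do
    timeout 5 stat -f "$m" >/dev/null 2>&1 || echo "unresponsive: $m"
done     # probes every mount with a timeout so a stuck one cannot block the shell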
 
pvecm status
Cluster information
-------------------
Name: HA
Config Version: 3
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Sat Feb 3 14:39:50 2024
Quorum provider: corosync_votequorum
Nodes: 3
Node ID: 0x00000001
Ring ID: 1.1245
Quorate: Yes

Votequorum information
----------------------
Expected votes: 3
Highest expected: 3
Total votes: 3
Quorum: 2
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 172.16.100.171 (local)
0x00000002 1 172.16.100.173
0x00000003 1 172.16.100.172


root@171:~# ceph osd df
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
1 ssd 3.63869 1.00000 3.6 TiB 409 GiB 408 GiB 17 KiB 1.5 GiB 3.2 TiB 10.99 1.01 88 up
3 ssd 1.74660 1.00000 1.7 TiB 193 GiB 192 GiB 2 KiB 1.1 GiB 1.6 TiB 10.80 0.99 41 up
0 ssd 1.74660 1.00000 1.7 TiB 174 GiB 173 GiB 19 KiB 1019 MiB 1.6 TiB 9.73 0.89 38 up
2 ssd 3.63869 1.00000 3.6 TiB 428 GiB 427 GiB 8 KiB 1.3 GiB 3.2 TiB 11.50 1.05 91 up
4 ssd 3.63869 1.00000 3.6 TiB 404 GiB 403 GiB 19 KiB 1.6 GiB 3.2 TiB 10.86 0.99 87 up
5 ssd 1.74660 1.00000 1.7 TiB 198 GiB 197 GiB 7 KiB 709 MiB 1.6 TiB 11.06 1.01 42 up
TOTAL 16 TiB 1.8 TiB 1.8 TiB 75 KiB 7.3 GiB 14 TiB 10.93


root@171:~# ceph -s
cluster:
id: 47061c54-d430-47c6-afa6-952da8e88877
health: HEALTH_OK

services:
mon: 3 daemons, quorum 172,171,173 (age 2w)
mgr: 172(active, since 4M), standbys: irage173, irage171
osd: 6 osds: 6 up (since 11d), 6 in (since 11d)

data:
pools: 2 pools, 129 pgs
objects: 153.62k objects, 600 GiB
usage: 1.8 TiB used, 14 TiB / 16 TiB avail
pgs: 129 active+clean

io:
client: 0 B/s rd, 79 KiB/s wr, 0 op/s rd, 4 op/s wr
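Since Ceph itself reports HEALTH_OK here, the hang is more likely caused by another configured storage (for example a stale NFS/CIFS mount) than by the Ceph cluster. A sketch of how to check the other storages on each node:

root@171:~# pvesm status     # lists every Proxmox storage and whether it is active; note this can itself hang on a dead storage
root@171:~# zpool status -x     # prints "all pools are healthy" or the details of any degraded ZFS pool
root@171:~# cat /etc/pve/storage.cfg     # shows which storages are configured, including any network mounts that df will touch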
 
Hi Team,

It shows the following errors in dmesg:


[ 2.743699] AppArmor: AppArmor Filesystem Enabled
[ 2.813210] ERST: Error Record Serialization Table (ERST) support is initialized.
[ 3.287863] RAS: Correctable Errors collector initialized.
[ 6.529780] EXT4-fs (dm-2): mounted filesystem 98630d6b-8864-4cdf-be53-bc0da31b6525 with ordered data mode. Quota mode: none.
[ 7.496684] ACPI Error: No handler for Region [SYSI] (0000000096bc81c9) [IPMI] (20221020/evregion-130)
[ 7.496789] ACPI Error: Region IPMI (ID=7) has no handler (20221020/exfldio-261)
[ 7.496894] ACPI Error: Aborting method \_SB.PMI0._GHL due to previous error (AE_NOT_EXIST) (20221020/psparse-529)
[ 7.496998] ACPI Error: Aborting method \_SB.PMI0._PMC due to previous error (AE_NOT_EXIST) (20221020/psparse-529)
[ 7.655095] ZFS: Loaded module v2.1.12-pve1, ZFS pool version 5000, ZFS filesystem version 5
[ 57.248652] usb 1-1.5: Failed to suspend device, error -71
[ 730.770999] pverados[8198]: segfault at 55b0f8c0a030 ip 000055b0f8c0a030 sp 00007ffdeebc9228 error 14 in perl[55b0f8bde000+195000] likely on CPU 61 (core 10, socket 1)
[ 750.313475] pverados[8280]: segfault at 55b0f8c0a030 ip 000055b0f8c0a030 sp 00007ffdeebc9228 error 14 in perl[55b0f8bde000+195000] likely on CPU 57 (core 8, socket 1)
[ 6950.517683] pverados[33951]: segfault at 55b0f8c0a030 ip 000055b0f8c0a030 sp 00007ffdeebc9228 error 14 in perl[55b0f8bde000+195000] likely on CPU 60 (core 10, socket 0)
[ 7030.427364] pverados[34284]: segfault at 55b0f8c0a030 ip 000055b0f8c0a030 sp 00007ffdeebc9228 error 14 in perl[55b0f8bde000+195000] likely on CPU 46 (core 1, socket 0)
[ 8319.630367] pverados[39599]: segfault at 55b0f8c0a030 ip 000055b0f8c0a030 sp 00007ffdeebc9228 error 14 in perl[55b0f8bde000+195000] likely on CPU 51 (core 3, socket 1)
[ 9060.214712] pverados[42668]: segfault at 55b0f8c0a030 ip 000055b0f8c0a030 sp 00007ffdeebc9228 error 14 in perl[55b0f8bde000+195000] likely on CPU 27 (core 18, socket 1)
[ 9870.067009] pverados[46039]: segfault at 55b0f8c0a030 ip 000055b0f8c0a030 sp 00007ffdeebc9228 error 14 in perl[55b0f8bde000+195000] likely on CPU 25 (core 17, socket 1)
[12040.270463] pverados[55012]: segfault at 55b0f8c0a030 ip 000055b0f8c0a030 sp 00007ffdeebc9228 error 14 in perl[55b0f8bde000+195000] likely on CPU 6 (core 3, socket 0)
[12479.673193] pverados[56811]: segfault at 55b0f8c0a030 ip 000055b0f8c0a030 sp 00007ffdeebc9228 error 14 in perl[55b0f8bde000+195000] likely on CPU 39 (core 26, socket 1)
[13270.610175] pverados[60109]: segfault at 55b0f8c0a030 ip 000055b0f8c0a030 sp 00007ffdeebc9228 error 14 in perl[55b0f8bde000+195000] likely on CPU 2 (core 1, socket 0)
[13589.895751] pverados[61422]: segfault at 55b0f8c0a030 ip 000055b0f8c0a030 sp 00007ffdeebc9228 error 14 in perl[55b0f8bde000+195000] :
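Note that every pverados segfault above faults at the same address (55b0f8c0a030) inside the perl mapping, which points at a corrupted in-memory copy of the perl binary (page cache) or bad RAM, rather than at unrelated crashes. Two quick checks, as a sketch (the package names are assumptions):

root@171:~# sync && echo 3 > /proc/sys/vm/drop_caches     # drops the page cache so perl is re-read from disk on its next start
root@171:~# debsums -s perl perl-base librados2-perl     # needs "apt install debsums" first; prints only files whose checksums no longer match the package

If the segfaults return after the cache drop and the packages verify clean, running a memtest86+ pass on this node would be a sensible next step.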
 
