Kernel error message on one server every 5 minutes

Fiko

Member
Nov 21, 2019
Hi, can you help?
I get this message every 5 minutes on one of my Proxmox servers:
Aug 27 16:10:27 proxprod2 kernel: [36585190.379894] JBD2: Detected IO errors while flushing file data on dm-7-8
Aug 27 16:15:27 proxprod2 kernel: [36585490.277685] JBD2: Detected IO errors while flushing file data on dm-7-8
Aug 27 16:20:28 proxprod2 kernel: [36585791.066524] JBD2: Detected IO errors while flushing file data on dm-7-8
Aug 27 16:25:28 proxprod2 kernel: [36586090.856836] JBD2: Detected IO errors while flushing file data on dm-7-8

lvdisplay gives me this mapping of dm devices to LV names:
root@proxprod2:~# lvdisplay|awk '/LV Name/{n=$3} /Block device/{d=$3; sub(".*:","dm-",d); print d,n;}'
dm-0 osd-block-b619c0ee-8dd7-45b6-ab76-906c220a46d2
dm-1 osd-block-58e61a40-d6c4-4e23-a597-e6cd0dde9f28
dm-2 osd-block-e4a16ca6-2a93-481e-afe8-1c26f867e5d2
dm-3 osd-block-a3342286-f579-4a18-a286-9f6ed8a8f1af
dm-4 osd-block-60718446-31a9-4b15-95bf-78fd68a56a9d
dm-5 osd-block-0a396437-9af2-4555-8a48-b6d925968fa1
dm-8 osd-block-a5170d92-84d0-4bef-9160-679543474064
dm-6 swap
dm-7 root
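As a cross-check (a sketch using standard Linux tools; dm-7 is the device named in the JBD2 messages above), the dm number can be resolved to its LV name and underlying disk directly, without parsing lvdisplay:

```shell
dev=dm-7   # the device from the kernel log

if [ -e "/sys/block/$dev/dm/name" ]; then
    # Resolve the dm number to its mapped name (e.g. pve-root)
    cat "/sys/block/$dev/dm/name"
    # Walk from the mapped device down to the underlying physical disk(s)
    lsblk --inverse "/dev/$dev"
else
    echo "no such device-mapper device on this host: $dev"
fi
```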

And df shows me this:
root@proxprod2:~# df
Filesystem 1K-blocks Used Available Use% Mounted on
udev 264067696 0 264067696 0% /dev
tmpfs 52825088 4222040 48603048 8% /run
/dev/mapper/pve-root 237096824 11267884 215545704 5% /
tmpfs 264125428 58128 264067300 1% /dev/shm
tmpfs 5120 0 5120 0% /run/lock
tmpfs 264125428 0 264125428 0% /sys/fs/cgroup
/dev/nvme6n1p2 523248 312 522936 1% /boot/efi
tmpfs 264125428 24 264125404 1% /var/lib/ceph/osd/ceph-7
tmpfs 264125428 24 264125404 1% /var/lib/ceph/osd/ceph-24
tmpfs 264125428 24 264125404 1% /var/lib/ceph/osd/ceph-2
tmpfs 264125428 24 264125404 1% /var/lib/ceph/osd/ceph-17
tmpfs 264125428 24 264125404 1% /var/lib/ceph/osd/ceph-3
tmpfs 264125428 24 264125404 1% /var/lib/ceph/osd/ceph-23
tmpfs 264125428 52 264125376 1% /var/lib/ceph/osd/ceph-49
16.1.0.200:/RaidDisk 1942631296 153141888 1789489408 8% /mnt/pve/qnap-01
16.1.0.200:/USBDisk1 2918608992 221216 2918387776 1% /mnt/pve/qnap-02
16.1.0.200:/USBDisk2 4864342432 234827136 4629515296 5% /mnt/pve/qnap-03
16.1.0.200:/USBDisk3 2918608992 221216 2918387776 1% /mnt/pve/qnap-04
/dev/fuse 30720 268 30452 1% /etc/pve
tmpfs 52825084 0 52825084 0% /run/user/0
 
I would check whether your hardware is OK... This message usually indicates a failing disk.

Can you also post the output of lsblk and cat /etc/pve/storage.cfg? What storage technology are you using?
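For the hardware check, something along these lines would be a start (a sketch; smartmontools and nvme-cli are assumed to be installed, and /dev/nvme6n1 is taken from the df output above):

```shell
disk=/dev/nvme6n1   # disk hosting the root/boot partitions, per the df output

# SMART self-assessment and error counters (requires smartmontools)
command -v smartctl >/dev/null && smartctl -a "$disk"

# NVMe-native health log: media errors, spare capacity, wear (requires nvme-cli)
command -v nvme >/dev/null && nvme smart-log "$disk"

# Recent I/O errors the kernel has logged
dmesg --level=err,crit 2>/dev/null | grep -iE 'nvme|i/o error' || true
```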
 
Hi Stefan R,
We use Ceph, and have QNAP network drives for ISOs / LXC templates / backups.

Attached lsblk.txt


==================
root@proxprod2:~# cat /etc/pve/storage.cfg
dir: local
path /var/lib/vz
content rootdir
shared 0

rbd: BD_3Replication
content images,rootdir
krbd 1
pool BD_3Replication

rbd: VM_2Replication
content images,rootdir
krbd 1
pool VM_2Replication

nfs: qnap-01
export /RaidDisk
path /mnt/pve/qnap-01
server 16.1.0.200
content vztmpl,backup,iso
options vers=3
prune-backups keep-last=7

nfs: qnap-02
export /USBDisk1
path /mnt/pve/qnap-02
server 16.1.0.200
content iso,backup,vztmpl
prune-backups keep-last=3

nfs: qnap-03
export /USBDisk2
path /mnt/pve/qnap-03
server 16.1.0.200
content vztmpl,backup,iso
prune-backups keep-last=3

nfs: qnap-04
export /USBDisk3
path /mnt/pve/qnap-04
server 16.1.0.200
content vztmpl,backup,iso
prune-backups keep-last=3
 


Right, JBD2 is part of ext4, and dm-7 seems to point to your root volume, so... a bad root disk? nvme6n1?
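To confirm that theory, the LVM stack under the root filesystem can be traced to its physical volume (a sketch; /dev/mapper/pve-root comes from the df output, and pve/root assumes the volume group is named pve, as that mapper name suggests):

```shell
# Show which physical device(s) the root LV actually sits on
lvs -o lv_name,devices pve/root 2>/dev/null || lvs -o lv_name,devices 2>/dev/null

# And the full block-device stack beneath the root filesystem
[ -e /dev/mapper/pve-root ] && lsblk --inverse /dev/mapper/pve-root || true
```

If the stack ends at nvme6n1, the SMART/NVMe health logs for that disk should settle it.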