[SOLVED] pvestatd[1873]: unable to activate storage 'cephfs' - directory '/mnt/pve/cephfs' does not exist or is unreachable

lazypaul

Member
Aug 20, 2020
My cluster is throwing the errors below. Can anyone help?

Code:
Sep 08 18:18:12 g8kvm13 pvestatd[1873]: unable to activate storage 'cephfs' - directory '/mnt/pve/cephfs' does not exist or is unreachable
Sep 08 18:18:22 g8kvm13 pvestatd[1873]: got timeout
Sep 08 18:18:22 g8kvm13 pvestatd[1873]: unable to activate storage 'cephfs' - directory '/mnt/pve/cephfs' does not exist or is unreachable
Sep 08 18:18:32 g8kvm13 pvestatd[1873]: got timeout
Sep 08 18:18:32 g8kvm13 pvestatd[1873]: unable to activate storage 'cephfs' - directory '/mnt/pve/cephfs' does not exist or is unreachable
Sep 08 18:18:42 g8kvm13 pvestatd[1873]: got timeout
Sep 08 18:18:42 g8kvm13 pvestatd[1873]: unable to activate storage 'cephfs' - directory '/mnt/pve/cephfs' does not exist or is unreachable
Sep 08 18:18:52 g8kvm13 pvestatd[1873]: got timeout
Sep 08 18:18:52 g8kvm13 pvestatd[1873]: unable to activate storage 'cephfs' - directory '/mnt/pve/cephfs' does not exist or is unreachable
Sep 08 18:19:00 g8kvm13 systemd[1]: Starting Proxmox VE replication runner...
Sep 08 18:19:01 g8kvm13 systemd[1]: pvesr.service: Succeeded.
Sep 08 18:19:01 g8kvm13 systemd[1]: Started Proxmox VE replication runner.
Sep 08 18:19:02 g8kvm13 kernel: mce: [Hardware Error]: Machine check events logged
Sep 08 18:19:02 g8kvm13 kernel: mce: [Hardware Error]: Machine check events logged
Sep 08 18:19:02 g8kvm13 pvestatd[1873]: got timeout
Sep 08 18:19:02 g8kvm13 pvestatd[1873]: unable to activate storage 'cephfs' - directory '/mnt/pve/cephfs' does not exist or is unreachable
Sep 08 18:19:13 g8kvm13 pvestatd[1873]: got timeout
Sep 08 18:27:22 g8kvm13 pvestatd[1873]: unable to activate storage 'cephfs' - directory '/mnt/pve/cephfs' does not exist or is unreachable
Sep 08 18:27:27 g8kvm13 pmxcfs[1591]: [status] notice: received log
Sep 08 18:27:32 g8kvm13 pvestatd[1873]: got timeout
Sep 08 18:27:32 g8kvm13 pvestatd[1873]: unable to activate storage 'cephfs' - directory '/mnt/pve/cephfs' does not exist or is unreachable
Sep 08 18:27:43 g8kvm13 pvestatd[1873]: got timeout
Sep 08 18:27:43 g8kvm13 pvestatd[1873]: unable to activate storage 'cephfs' - directory '/mnt/pve/cephfs' does not exist or is unreachable
Sep 08 18:27:51 g8kvm13 pmxcfs[1591]: [status] notice: received log
Sep 08 18:27:52 g8kvm13 pvestatd[1873]: got timeout
Sep 08 18:27:52 g8kvm13 pvestatd[1873]: unable to activate storage 'cephfs' - directory '/mnt/pve/cephfs' does not exist or is unreachable
Sep 08 18:28:00 g8kvm13 systemd[1]: Starting Proxmox VE replication runner...
Sep 08 18:30:02 g8kvm13 systemd[1]: pvesr.service: Succeeded.
Sep 08 18:30:02 g8kvm13 systemd[1]: Started Proxmox VE replication runner.
Sep 08 18:30:02 g8kvm13 pvestatd[1873]: got timeout
Sep 08 18:30:02 g8kvm13 pvestatd[1873]: unable to activate storage 'cephfs' - directory '/mnt/pve/cephfs' does not exist or is unreachable
Sep 08 18:30:12 g8kvm13 pvestatd[1873]: got timeout
Sep 08 18:30:12 g8kvm13 pvestatd[1873]: unable to activate storage 'cephfs' - directory '/mnt/pve/cephfs' does not exist or is unreachable
Sep 08 18:30:22 g8kvm13 pvestatd[1873]: got timeout
Sep 08 18:30:22 g8kvm13 pvestatd[1873]: unable to activate storage 'cephfs' - directory '/mnt/pve/cephfs' does not exist or is unreachable
Sep 08 18:31:02 g8kvm13 pvestatd[1873]: unable to activate storage 'cephfs' - directory '/mnt/pve/cephfs' does not exist or is unreachable
Sep 08 18:31:36 g8kvm13 kernel: mce_notify_irq: 13 callbacks suppressed
Sep 08 18:31:36 g8kvm13 kernel: mce: [Hardware Error]: Machine check events logged
Sep 08 18:31:43 g8kvm13 pvestatd[1873]: got timeout
 
Hi,

do you see problems when you call

Code:
ras-mc-ctl --summary

You may have to install "rasdaemon" first.
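In case it is missing, installing and enabling it usually looks like this (a minimal sketch using the standard Debian package and service names):

Code:
apt install rasdaemon
systemctl enable --now rasdaemon
ras-mc-ctl --summary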
 
It seems there are no errors. The log keeps showing these messages, and Ceph reports that its health is OK, but running "df -h" hangs there.


Code:
root@g8kvm37:~# ras-mc-ctl --summary
No Memory errors.
No PCIe AER errors.
No Extlog errors.
No MCE errors.
root@g8kvm37:~#
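As a side note, a dead network mount can be inspected without hanging another shell the way "df -h" does; for example (a sketch, not from the original post):

Code:
# findmnt only reads /proc/self/mountinfo, so it never blocks on a dead mount
findmnt /mnt/pve/cephfs
# check the kernel log for ceph client complaints (lost MON sessions, hung requests, ...)
dmesg | grep -i ceph | tail -n 20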
 
Can you please send the /etc/pve/storage.cfg and the output of "mount"?
 
Code:
root@g8kvm37:~# cat /etc/pve/storage.cfg
dir: local
    path /var/lib/vz
    content iso,vztmpl,backup

lvmthin: local-lvm
    thinpool data
    vgname pve
    content images,rootdir

rbd: G8KvmData
    content rootdir,images
    krbd 0
    pool G8KvmData

cephfs: cephfs
    path /mnt/pve/cephfs
    content backup,iso,vztmpl

lvm: test-nvme
    vgname test-nvme
    content images,rootdir
    nodes g8kvm06
    shared 0

pbs: pbs-data-g9
    datastore pbs-data
    server 10.0.142.0
    content backup
    fingerprint 67:1f:7c:39:ce:1b:f7:40:64:ac:c2:cd:16:57:e5:cd:3a:ae:72:2b:00:af:1a:16:59:05:ea:1b:87:a7:43:3a
    maxfiles 100
    username root@pam

root@g8kvm37:~# mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,relatime)
udev on /dev type devtmpfs (rw,nosuid,relatime,size=49449688k,nr_inodes=12362422,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=9894896k,mode=755)
/dev/mapper/pve-root on / type ext4 (rw,relatime,errors=remount-ro)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
none on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=33,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=46821)
mqueue on /dev/mqueue type mqueue (rw,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M)
sunrpc on /run/rpc_pipefs type rpc_pipefs (rw,relatime)
fusectl on /sys/fs/fuse/connections type fusectl (rw,relatime)
configfs on /sys/kernel/config type configfs (rw,relatime)
lxcfs on /var/lib/lxcfs type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
/dev/fuse on /etc/pve type fuse (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other)
10.0.141.1,10.0.141.2,10.0.141.3,10.0.141.4,10.0.141.5:/ on /mnt/pve/cephfs type ceph (rw,relatime,name=admin,secret=<hidden>,acl)
tmpfs on /var/lib/ceph/osd/ceph-305 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-306 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-307 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-308 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-309 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-310 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-311 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-312 type tmpfs (rw,relatime)
tmpfs on /run/user/0 type tmpfs (rw,nosuid,nodev,relatime,size=9894892k,mode=700)
tracefs on /sys/kernel/debug/tracing type tracefs (rw,relatime)
root@g8kvm37:~#
 
Can you write or read to/from /mnt/pve/cephfs?
 
I cannot enter the folder; even running "chmod 755 cephfs" makes no difference.

Code:
root@g8kvm37:/mnt/pve#
root@g8kvm37:/mnt/pve# cd cephfs
-bash: cd: cephfs: Permission denied
root@g8kvm37:/mnt/pve# ls -al
ls: cannot access 'cephfs': Permission denied
total 8
drwxr-xr-x 3 root root 4096 Sep  4 17:27 .
drwxr-xr-x 4 root root 4096 Sep  4 17:27 ..
d????????? ? ?    ?       ?            ? cephfs
root@g8kvm37:/mnt/pve#
root@g8kvm37:/mnt/pve#
root@g8kvm37:/mnt/pve#
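The question marks in the "ls" output are typical of a mount point whose connection has gone stale. One way to see whether processes are stuck on it (a sketch, not from the thread):

Code:
# processes in uninterruptible sleep (state D) are usually the ones blocked on the dead mount
ps -eo pid,stat,wchan:32,cmd | awk '$2 ~ /D/'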
 
Is the secret correct?
Compare these two secrets:

Code:
ceph auth get-key client.admin
cat /etc/pve/priv/ceph/cephfs.secret
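A quick way to compare the two without eyeballing long keys (a sketch; the $() substitutions strip trailing newlines on both sides, so only the key material is compared):

Code:
[ "$(ceph auth get-key client.admin)" = "$(cat /etc/pve/priv/ceph/cephfs.secret)" ] \
    && echo "secrets match" || echo "secrets differ"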
 
Rebooting the node fixes it, but I still have many nodes (more than 10) with this problem. Is rebooting the only option?
 
Try to unmount the dir.
It gets mounted automatically again if there are no kernel problems.
If there are hanging processes in the kernel, you have to reboot all nodes.
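On an affected node that would roughly be the following (a sketch; the lazy variant is only needed when the plain umount hangs or reports the target as busy):

Code:
umount /mnt/pve/cephfs
# if the mount is hung, detach it lazily instead:
umount -l /mnt/pve/cephfs
# pvestatd should then remount the storage on its next activation cycle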
 
Unmounting the folder worked. Thank you very much!



Another question: when Ceph has a package update, does the node need to be rebooted?
 
No, there is no need to reboot the node.

You have to restart the ceph services to get the latest version running.
This can be done over the GUI.
A node reboot is only necessary if a new kernel is available.
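For reference, restarting the Ceph daemons from the CLI roughly corresponds to the following (a sketch; the mon/mgr instance names are assumed to match the hostname, which is the usual Proxmox default):

Code:
systemctl restart ceph-mon@$(hostname).service   # only on nodes that run a monitor
systemctl restart ceph-mgr@$(hostname).service   # only on nodes that run a manager
systemctl restart ceph-osd.target                # restarts all OSD services on this node
systemctl restart ceph-mds.target                # only if the node runs an MDS for CephFS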
 
One more question: do you know why, even with no HA enabled (no groups, no resources), restarting corosync could make the cluster reboot?
 
There is a discussion on the developer list about this.
I expect the problem will be resolved soon with an update.
 
