One-node Ceph cluster - please help!

BAlekssei

Hello!
I have a one-node Ceph cluster. I needed to reboot the node, so I stopped Ceph manually, and after booting I get:
Code:
# service ceph start mon
=== mon.0 ===
Starting Ceph mon.0 on proxmox2...
2015-08-28 10:15:41.811943 7f5d7d833840 -1 unable to read magic from mon data
failed: 'ulimit -n 32768;  /usr/bin/ceph-mon -i 0 --pid-file /var/run/ceph/mon.0.pid -c /etc/ceph/ceph.conf --cluster ceph '


and

# ceph -s
2015-08-28 10:16:20.862987 7f79701aa700  0 -- :/1091923 >> 192.168.12.3:6789/0 pipe(0x1c62120 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x1c5ea90).fault
2015-08-28 10:16:23.863186 7f7968dfa700  0 -- :/1091923 >> 192.168.12.3:6789/0 pipe(0x1c66c40 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x1c6aee0).fault


Please help!

# ceph -v
ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)


ceph.conf
Code:
[global]
     auth client required = cephx
     auth cluster required = cephx
     auth service required = cephx
     auth supported = cephx
     cluster network = 192.168.12.0/24
     filestore xattr use omap = true
     fsid = 9085c611-1591-4aa5-a38e-90d21a9feb71
     keyring = /etc/pve/priv/$cluster.$name.keyring
     osd journal size = 5120
     osd pool default min size = 1
     public network = 192.168.12.0/24

[osd]
     keyring = /var/lib/ceph/osd/ceph-$id/keyring

[mon.0]
     host = proxmox2
     mon addr = 192.168.12.3:6789
 
Code:
ip-192.168.12.3

# netstat -an | grep -i listen | grep :
tcp        0      0 0.0.0.0:8006            0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:59015           0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:85            0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:3128            0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN
tcp6       0      0 :::111                  :::*                    LISTEN
tcp6       0      0 :::47858                :::*                    LISTEN
tcp6       0      0 :::22                   :::*                    LISTEN
tcp6       0      0 ::1:25                  :::*                    LISTEN
 
Hi,
I mean the command "ip addr", not just the IP address.
Is the keyring usable?
Code:
ls -l /etc/pve/priv/
How does the mon dir look?
Code:
find /var/lib/ceph/mon/ -ls
Udo
 
Code:
ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 10:c3:7b:4b:bb:82 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::12c3:7bff:fe4b:bb82/64 scope link
       valid_lft forever preferred_lft forever
3: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether 10:c3:7b:4b:bb:82 brd ff:ff:ff:ff:ff:ff
    inet 192.168.12.3/24 brd 192.168.12.255 scope global vmbr0
    inet6 fe80::12c3:7bff:fe4b:bb82/64 scope link
       valid_lft forever preferred_lft forever
4: venet0: <BROADCAST,POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/void
    inet6 fe80::1/128 scope link
       valid_lft forever preferred_lft forever

/# ls -l /etc/pve/priv/
total 3
-rw------- 1 root www-data 1679 Apr 30 19:18 authkey.key
-rw------- 1 root www-data  791 Aug 27 21:56 authorized_keys
drwx------ 2 root www-data    0 May 28 22:17 ceph
-rw------- 1 root www-data   63 May 28 22:14 ceph.client.admin.keyring
-rw------- 1 root www-data  214 May 28 22:14 ceph.mon.keyring
drwx------ 2 root www-data    0 May 28 22:20 lock
-rw------- 1 root www-data 1679 Apr 30 19:18 pve-root-ca.key
-rw------- 1 root www-data    3 Aug 27 20:09 pve-root-ca.srl


find /var/lib/ceph/mon/ -ls
4161618    4 drwxr-xr-x   4 root     root         4096 Aug 28 02:40 /var/lib/ceph/mon/
4169729    4 drwxr-xr-x   3 root     root         4096 Aug 28 02:41 /var/lib/ceph/mon/ceph-0_backup
4161626    4 -rw-r--r--   1 root     root           77 May 28 22:14 /var/lib/ceph/mon/ceph-0_backup/keyring
4161623    4 drwxr-xr-x   2 root     root         4096 Aug 28 02:59 /var/lib/ceph/mon/ceph-0_backup/store.db
4161554 2068 -rw-r--r--   1 root     root      2110387 Aug 27 13:01 /var/lib/ceph/mon/ceph-0_backup/store.db/740940.sst
4161555 1032 -rw-r--r--   1 root     root      1052470 Aug 27 13:01 /var/lib/ceph/mon/ceph-0_backup/store.db/740947.sst
4161565    0 -rw-r--r--   1 root     root            0 May 28 22:14 /var/lib/ceph/mon/ceph-0_backup/store.db/LOCK
4161566    4 -rw-r--r--   1 root     root          187 Aug 28 02:15 /var/lib/ceph/mon/ceph-0_backup/store.db/LOG.old
4161567    4 -rw-r--r--   1 root     root           16 Aug 28 02:23 /var/lib/ceph/mon/ceph-0_backup/store.db/CURRENT
4161568    4 -rw-r--r--   1 root     root          187 Aug 28 02:23 /var/lib/ceph/mon/ceph-0_backup/store.db/LOG
4161569 2116 -rw-r--r--   1 root     root      2162126 Aug 27 13:01 /var/lib/ceph/mon/ceph-0_backup/store.db/740939.sst
4161570    4 -rw-r--r--   1 root     root         3218 Aug 27 20:47 /var/lib/ceph/mon/ceph-0_backup/store.db/740949.sst
4161571    0 -rw-r--r--   1 root     root            0 Aug 28 02:23 /var/lib/ceph/mon/ceph-0_backup/store.db/740980.log
4161572 2060 -rw-r--r--   1 root     root      2101804 Aug 27 13:01 /var/lib/ceph/mon/ceph-0_backup/store.db/740946.sst
4161573 2124 -rw-r--r--   1 root     root      2169663 Aug 27 13:01 /var/lib/ceph/mon/ceph-0_backup/store.db/740943.sst
4161574 2064 -rw-r--r--   1 root     root      2108320 Aug 27 13:01 /var/lib/ceph/mon/ceph-0_backup/store.db/740941.sst
4161575 2076 -rw-r--r--   1 root     root      2119045 Aug 27 13:01 /var/lib/ceph/mon/ceph-0_backup/store.db/740942.sst
4161576 2092 -rw-r--r--   1 root     root      2136268 Aug 27 13:01 /var/lib/ceph/mon/ceph-0_backup/store.db/740945.sst
4161577   68 -rw-r--r--   1 root     root        65536 Aug 28 02:23 /var/lib/ceph/mon/ceph-0_backup/store.db/MANIFEST-740979
4161578 2132 -rw-r--r--   1 root     root      2176718 Aug 27 13:01 /var/lib/ceph/mon/ceph-0_backup/store.db/740944.sst
4161615    4 drwxr-xr-x   3 root     root         4096 Aug 28 03:07 /var/lib/ceph/mon/ceph-0
4161553    4 -rw-r--r--   1 root     root           77 May 28 22:14 /var/lib/ceph/mon/ceph-0/keyring
4169774    4 drwxr-xr-x   2 root     root         4096 Aug 28 10:15 /var/lib/ceph/mon/ceph-0/store.db
4161636 2068 -rw-r--r--   1 root     root      2110387 Aug 27 13:01 /var/lib/ceph/mon/ceph-0/store.db/740940.sst
4161659 1032 -rw-r--r--   1 root     root      1052470 Aug 27 13:01 /var/lib/ceph/mon/ceph-0/store.db/740947.sst
4161625    0 -rw-r--r--   1 root     root            0 May 28 22:14 /var/lib/ceph/mon/ceph-0/store.db/LOCK
4169782    4 -rw-r--r--   1 root     root          187 Aug 28 10:02 /var/lib/ceph/mon/ceph-0/store.db/LOG.old
4169781    4 -rw-r--r--   1 root     root           16 Aug 28 10:15 /var/lib/ceph/mon/ceph-0/store.db/CURRENT
4169778    4 -rw-r--r--   1 root     root          187 Aug 28 10:15 /var/lib/ceph/mon/ceph-0/store.db/LOG
4169780    4 -rw-r--r--   1 root     root        65536 Aug 28 10:15 /var/lib/ceph/mon/ceph-0/store.db/MANIFEST-740993
4161633 2116 -rw-r--r--   1 root     root      2162126 Aug 27 13:01 /var/lib/ceph/mon/ceph-0/store.db/740939.sst
4161631    4 -rw-r--r--   1 root     root         3218 Aug 27 20:47 /var/lib/ceph/mon/ceph-0/store.db/740949.sst
4161658 2060 -rw-r--r--   1 root     root      2101804 Aug 27 13:01 /var/lib/ceph/mon/ceph-0/store.db/740946.sst
4161648 2124 -rw-r--r--   1 root     root      2169663 Aug 27 13:01 /var/lib/ceph/mon/ceph-0/store.db/740943.sst
4161644 2064 -rw-r--r--   1 root     root      2108320 Aug 27 13:01 /var/lib/ceph/mon/ceph-0/store.db/740941.sst
4161645 2076 -rw-r--r--   1 root     root      2119045 Aug 27 13:01 /var/lib/ceph/mon/ceph-0/store.db/740942.sst
4161650 2092 -rw-r--r--   1 root     root      2136268 Aug 27 13:01 /var/lib/ceph/mon/ceph-0/store.db/740945.sst
4169779    0 -rw-r--r--   1 root     root            0 Aug 28 10:15 /var/lib/ceph/mon/ceph-0/store.db/740994.log
4161649 2132 -rw-r--r--   1 root     root      2176718 Aug 27 13:01 /var/lib/ceph/mon/ceph-0/store.db/740944.sst
 
Hi,
can you post both keyrings:
Code:
cat /etc/pve/priv/ceph.mon.keyring

cat /var/lib/ceph/mon/ceph-0/keyring
and is there any different error message if you start the mon in the foreground?
Code:
/usr/bin/ceph-mon -i 0 --pid-file /var/run/ceph/mon.0.pid -c /etc/ceph/ceph.conf --cluster ceph -d
Udo
 
Code:
root@proxmox2:/# cat /etc/pve/priv/ceph.mon.keyring
[mon.]
        key = AQD7aGdVoJd4BBAAa6viCohhhFPdFSP3i82bHA==
        caps mon = "allow *"
[client.admin]
        key = AQD7aGdViM+xABAA0YrvGspr1LMD3dgGWqrbUw==
        auid = 0
        caps mds = "allow"
        caps mon = "allow *"
        caps osd = "allow *"

Code:
root@proxmox2:/# cat /var/lib/ceph/mon/ceph-0/keyring
[mon.]
        key = AQD7aGdVoJd4BBAAa6viCohhhFPdFSP3i82bHA==
        caps mon = "allow *"

Code:
root@proxmox2:/# /usr/bin/ceph-mon -i 0 --pid-file /var/run/ceph/mon.0.pid -c /etc/ceph/ceph.conf --cluster ceph -d
2015-08-30 23:19:59.945149 7f6ca0d65840  0 ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b), process ceph-mon, pid 358707
2015-08-30 23:19:59.982831 7f6ca0d65840 -1 unable to read magic from mon data
 
Hi,
sorry - I'm out of ideas...

Perhaps this describes a possibility for you: http://www.sebastien-han.fr/blog/2015/01/29/ceph-recover-a-rbd-image-from-a-dead-cluster/

Udo
 
Code:
# rbd info red
2015-08-31 22:57:02.507222 7f4c22b11700  0 -- :/1460958 >> 192.168.12.3:6789/0 pipe(0x3b0d050 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x3b0a160).fault
2015-08-31 22:57:05.507289 7f4c15ab5700  0 -- :/1460958 >> 192.168.12.3:6789/0 pipe(0x3b13010 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x3b0c790).fault
2015-08-31 22:57:08.507467 7f4c22b11700  0 -- :/1460958 >> 192.168.12.3:6789/0 pipe(0x3b0cea0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x3b11140).fault
2015-08-31 22:57:11.507602 7f4c15ab5700  0 -- :/1460958 >> 192.168.12.3:6789/0 pipe(0x3b13010 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x3b172b0).fault

((( And what is this "unable to read magic from mon data" about?

and

Code:
# rbd -p rbd map red
ERROR: Module rbd not found.
FATAL: Module rbd not found.
rbd: failed to load rbd kernel module (1)
rbd: sysfs write failed
rbd: map failed: (2) No such file or directory

Code:
locate rbd.ko
/lib/modules/2.6.32-37-pve/kernel/drivers/block/drbd/drbd.ko
 
The "rbd info" command tried to connect to the mon (defined in /etc/ceph/ceph.conf), but because no mon is running (the mon process listens on port 6789) you get these fault errors.
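You can verify this, for example, by checking whether anything is listening on the mon port (no output means the mon is down):
Code:
netstat -an | grep 6789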
As for the "unable to read magic from mon data" error: never had this... perhaps ask on the ceph-users mailing list?!

Regarding the failed "rbd map": yes, PVE uses Ceph from inside qemu (librbd) and not the kernel rbd module.

Udo
 
I just encountered the same issue on a single-node Ceph install and was able to solve it. Since this is the only Google hit covering the symptoms, I thought I ought to post the steps leading to a fruitful solution:

The symptom
It manifests like this: you try to start one of your VMs and it does not start, and the task log shows something similar to this:
/dev/rbd0
/dev/rbd1
/dev/rbd2
rbd: sysfs write failed
TASK ERROR: can't mount rbd volume vm-1093-disk-1: rbd: sysfs write failed



Solution

0. You have already tried a normal scrub that did not reveal any errors, or generated errors that you have already fixed.
Code:
ceph osd scrub osd.X
1. Manually trigger a deep scrub for every OSD by executing the following command, where X is your OSD number (a loop over all OSDs is sketched after this list).
Code:
ceph osd deep-scrub osd.X
2. Wait until the deep scrubs have finished by watching:
Code:
ceph -w
3. Once there are no more deep scrubs happening, check which placement groups (PGs) have been revealed as broken:
Code:
ceph health detail
You should see a bunch of broken ones, as in this example output:
pg 18.6d is active+clean+inconsistent, acting [3,6,5,4,9,7,12,11,10]
pg 17.67 is active+clean+inconsistent, acting [0,5,4]
pg 18.66 is active+clean+inconsistent, acting [0,4,7,12,11,9,5,6,10]
pg 17.62 is active+clean+inconsistent, acting [6,10,5]
pg 17.7f is active+clean+inconsistent, acting [0,7,5]
pg 17.77 is active+clean+inconsistent, acting [6,9,5]
pg 17.74 is active+clean+inconsistent, acting [2,3,5]
pg 17.48 is active+clean+inconsistent, acting [6,14,9]
pg 18.47 is active+clean+inconsistent, acting [11,10,6,4,7,15,5,3,0]
pg 18.41 is active+clean+inconsistent, acting [6,4,2,7,10,5,9,8,0]
pg 18.40 is active+clean+inconsistent, acting [1,6,4,8,5,2,11,3,7]
pg 17.43 is active+clean+inconsistent, acting [11,5,7]
pg 17.28 is active+clean+inconsistent, acting [4,14,1]
pg 17.25 is active+clean+inconsistent, acting [1,6,7]
pg 17.20 is active+clean+inconsistent, acting [12,7,4]
pg 18.37 is active+clean+inconsistent, acting [15,13,10,7,0,2,11,9,4]
pg 17.32 is active+clean+inconsistent, acting [0,5,9]
pg 17.9 is active+clean+inconsistent, acting [7,4,10]
pg 17.7 is active+clean+inconsistent, acting [10,7,11]
pg 18.1 is active+clean+inconsistent, acting [3,6,12,11,4,7,5,13,0]
pg 17.0 is active+clean+inconsistent, acting [7,2,4]
pg 18.1c is active+clean+inconsistent, acting [1,4,7,8,9,3,6,5,10]
pg 18.12 is active+clean+inconsistent, acting [6,4,2,10,8,7,0,5,1]
38 scrub errors

If all PGs list exactly the same number inside [ ], it is likely there is an issue with osd.<said number>. Otherwise you just happen to have some broken PGs that you need to fix.

4. Fix the broken PGs by issuing the following command for every broken PG (see also the sketch below):
Code:
ceph pg repair X

e.g.
ceph pg repair 18.6d

5. Once you have fixed all PG errors, you'll be able to start your VM again.
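If you have many OSDs and PGs, steps 1 and 4 can also be scripted. Here is a minimal sketch, assuming the "ceph health detail" output format shown above:
Code:
# 1. kick off a deep scrub on every OSD
for osd in $(ceph osd ls); do
    ceph osd deep-scrub osd.$osd
done

# 4. after the scrubs have finished, repair every PG reported as inconsistent
ceph health detail | awk '$1 == "pg" && /inconsistent/ {print $2}' | while read pg; do
    ceph pg repair $pg
done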
 
Hi,
just out of curiosity, how should a pg repair work on a single node? Or do you still have a replica count of 3, but on one node only?

Udo
 
Yes, on a single node you NEED to replicate over the bucket type OSD, whereas on multi-node you want to go via the bucket type host, rack, or a higher bucket type.

In this particular case (I'm running tests on this system) I have a k=9, m=3 EC pool, a corresponding SSD cache tier (3/1), and a replicated 3/1 pool, all on top of an 18-HDD-OSD (multiple sizes, speeds, ages), 4-SSD system.
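For reference, replicating across OSDs on a single node usually means a CRUSH rule that does chooseleaf on the OSD bucket type instead of host. A sketch of such a rule (the root name "default" and the ruleset number are assumptions; adjust them to your crushmap):
Code:
# decompiled crushmap excerpt: place replicas on different OSDs, not different hosts
rule replicated_osd {
        ruleset 1
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type osd
        step emit
}
Alternatively, setting "osd crush chooseleaf type = 0" in ceph.conf before creating the cluster gives the default rule the same behaviour.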
 
