Proxmox 5 beta OSD trouble...

Discussion in 'Proxmox VE: Installation and configuration' started by Vassilis Kipouros, Mar 23, 2017.

  1. Vassilis Kipouros

    Joined:
    Nov 2, 2016
    Messages:
    46
    Likes Received:
    3
    After upgrading to Proxmox 5.0 BETA, I added a new OSD to my Ceph cluster:

    create OSD on /dev/sdg (xfs)
    Caution: invalid backup GPT header, but valid main header; regenerating
    backup header from main header.

    ****************************************************************************
    Caution: Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
    verification and recovery are STRONGLY recommended.
    ****************************************************************************
    GPT data structures destroyed! You may now partition the disk using fdisk or
    other utilities.
    Creating new GPT entries.
    The operation has completed successfully.
    Setting name!
    partNum is 1
    REALLY setting name!
    The operation has completed successfully.
    Setting name!
    partNum is 0
    REALLY setting name!
    The operation has completed successfully.
    meta-data=/dev/sdg1 isize=2048 agcount=4, agsize=60719917 blks
    = sectsz=512 attr=2, projid32bit=1
    = crc=1 finobt=1, sparse=0, rmapbt=0, reflink=0
    data = bsize=4096 blocks=242879665, imaxpct=25
    = sunit=0 swidth=0 blks
    naming =version 2 bsize=4096 ascii-ci=0 ftype=1
    log =internal log bsize=4096 blocks=118593, version=2
    = sectsz=512 sunit=0 blks, lazy-count=1
    realtime =none extsz=4096 blocks=0, rtextents=0
    The operation has completed successfully.
    TASK OK

    But the OSD never appears in the GUI and never gets added to the Ceph configuration.
    The disk's usage column reports "partitions" instead of osd.x.

    After rebooting the node and retrying the add (ceph-disk zap /dev/sdx, then pveceph createosd /dev/sdx),
    the situation is still the same...

    The rest of the cluster seems to work fine.
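    As a sanity check, something like the following should show whether the OSD was ever actually registered (just a sketch, assuming the standard ceph-disk layout and that the disk is /dev/sdg):

    Code:
    ceph osd tree      # list all OSDs the cluster knows about, with their up/in state
    ceph-disk list     # show how ceph-disk sees the local disks and partitions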

    Please help!
     
  2. tom

    tom Proxmox Staff Member
    Staff Member

    Joined:
    Aug 29, 2006
    Messages:
    13,641
    Likes Received:
    420
    Did you re-use the disk, or was it a brand new one?

    I assume there is some old data on it that is preventing its use.
     
  3. Vassilis Kipouros

    Joined:
    Nov 2, 2016
    Messages:
    46
    Likes Received:
    3
    I'm using ceph-disk zap, and I've done it twice...

    Nothing reports that the use is prevented... everything reports OK...

    ????

    I also have a clock skew warning that refuses to clear even two hours after rebooting the node...
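    (A quick way to check time sync on each node, assuming systemd-timesyncd or ntpd handles timekeeping:)

    Code:
    timedatectl status                  # "NTP synchronized: yes" expected on every node
    systemctl status systemd-timesyncd  # or ntpd/chrony, whichever service is in use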

    Performance is also a bit weird: CTs and VMs take ages to start.
    When trying to bulk-start them I get this:

    Can't call method "has_lock" on an undefined value at /usr/share/perl5/PVE/API2/Nodes.pm line 1300.
    Can't call method "has_lock" on an undefined value at /usr/share/perl5/PVE/API2/Nodes.pm line 1300.
    Can't call method "has_lock" on an undefined value at /usr/share/perl5/PVE/API2/Nodes.pm line 1300.
    Use of uninitialized value in string eq at /usr/share/perl5/PVE/API2/Nodes.pm line 1382.
    Use of uninitialized value in string eq at /usr/share/perl5/PVE/API2/Nodes.pm line 1386.
    Use of uninitialized value in concatenation (.) or string at /usr/share/perl5/PVE/API2/Nodes.pm line 1392.
    unknown VM type ''
    Use of uninitialized value in string eq at /usr/share/perl5/PVE/API2/Nodes.pm line 1382.
    Use of uninitialized value in string eq at /usr/share/perl5/PVE/API2/Nodes.pm line 1386.
    Use of uninitialized value in concatenation (.) or string at /usr/share/perl5/PVE/API2/Nodes.pm line 1392.
    unknown VM type ''
    Use of uninitialized value in string eq at /usr/share/perl5/PVE/API2/Nodes.pm line 1382.
    Use of uninitialized value in string eq at /usr/share/perl5/PVE/API2/Nodes.pm line 1386.
    Use of uninitialized value in concatenation (.) or string at /usr/share/perl5/PVE/API2/Nodes.pm line 1392.
    unknown VM type ''
    TASK OK
     
    #3 Vassilis Kipouros, Mar 23, 2017
    Last edited: Mar 23, 2017
  4. tom

    tom Proxmox Staff Member
    Staff Member

    Joined:
    Aug 29, 2006
    Messages:
    13,641
    Likes Received:
    420
    This is a known issue, and the question remains: used disk or brand new? If these are used disks, please wipe them completely and try again.

    To get a better understanding of your problems: how did you install the 5.0 beta, or how did you upgrade? It seems something went wrong here.
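    A full wipe could look roughly like this (a sketch only, not an official procedure; it is destructive, so double-check the device name first):

    Code:
    wipefs -a /dev/sdg                            # remove all filesystem/RAID/partition signatures
    sgdisk --zap-all /dev/sdg                     # destroy GPT and MBR data structures
    dd if=/dev/zero of=/dev/sdg bs=1M count=200   # optionally zero out the start of the disk as well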
     
  5. Vassilis Kipouros

    Joined:
    Nov 2, 2016
    Messages:
    46
    Likes Received:
    3
    The disk was not brand new. (Doesn't ceph-disk zap nuke everything on the disk? What do you mean by "wipe completely"? fdisk?)

    I edited my sources.list to this and then dist-upgraded all three nodes:


    deb http://ftp.gr.debian.org/debian stretch main contrib

    # security updates
    deb http://security.debian.org stretch/updates main contrib

    # PVE pvetest repository provided by proxmox.com,
    # NOT recommended for production use
    deb http://download.proxmox.com/debian stretch pvetest
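
    (The dist-upgrade on each node being the usual apt sequence, roughly:)

    Code:
    apt-get update
    apt-get dist-upgrade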
     
  6. fabian

    fabian Proxmox Staff Member
    Staff Member

    Joined:
    Jan 7, 2016
    Messages:
    3,390
    Likes Received:
    523
    what does "ceph version" say?
     
  7. Ivan Dimitrov

    Ivan Dimitrov New Member

    Joined:
    Jul 14, 2016
    Messages:
    15
    Likes Received:
    1
    Hi, my VM autostart fails with the following error:
    Code:
    Mar 23 21:56:23 convergence qm[15052]: start failed: org.freedesktop.DBus.Error.InvalidArgs: CPUShares value out of range
    Mar 23 21:56:23 convergence qm[15049]: <root@pam> end task UPID:convergence:00003ACC:0000384E:58D43677:qmstart:179:root@pam: start failed: org.freedesktop.DBus.Error.InvalidArgs: CPUShares value out of range
    I have given this VM 50000 CPU shares, which was supposed to be the highest possible value. Not sure if this is a bug or whether the option is deprecated.
    After commenting out the CPU shares setting I have not yet restarted the host to see if there will be other problems.
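    A possible workaround until the bug is resolved might be to lower the shares value in the VM config (a sketch; 179 is the VMID from the task log above, and 1024 is just the usual default):

    Code:
    qm config 179 | grep cpuunits   # show the current cpuunits setting for the VM
    qm set 179 --cpuunits 1024      # set it back to a value systemd accepts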
     
  8. fabian

    fabian Proxmox Staff Member
    Staff Member

    Joined:
    Jan 7, 2016
    Messages:
    3,390
    Likes Received:
    523
    thanks, filed as https://bugzilla.proxmox.com/show_bug.cgi?id=1321
     
  9. Vassilis Kipouros

    Joined:
    Nov 2, 2016
    Messages:
    46
    Likes Received:
    3
    ceph version 10.2.6 (656b5b63ed7c43bd014bcafd81b001959d5f089f)

    Do I need to upgrade Ceph somehow?

    Autostarting the VMs and containers also does not work...
     
  10. Vassilis Kipouros

    Joined:
    Nov 2, 2016
    Messages:
    46
    Likes Received:
    3
    I tried re-adding the OSD after zeroing out the disk with dd.
    Same behaviour.
     
  11. fabian

    fabian Proxmox Staff Member
    Staff Member

    Joined:
    Jan 7, 2016
    Messages:
    3,390
    Likes Received:
    523
    yes. "pveceph install" will install Ceph Luminous Beta packages, but please read the Ceph release notes for Kraken (especially about sort bitwise!) and the notes for Luminous before upgrading. note that the upgrade path is not yet very tested.
     
  12. Vassilis Kipouros

    Joined:
    Nov 2, 2016
    Messages:
    46
    Likes Received:
    3
    just "pveceph install" on all nodes?

    which part of the release notes is important?
    can you provide a link?
     
  13. Vassilis Kipouros

    Joined:
    Nov 2, 2016
    Messages:
    46
    Likes Received:
    3
    cluster 664d7895-6e32-4f53-a8bc-46050faca724
    health HEALTH_WARN
    clock skew detected on mon.2
    Monitor clock skew detected
    monmap e3: 3 mons at {0=192.168.100.1:6789/0,1=192.168.100.2:6789/0,2=192.168.100.3:6789/0}
    election epoch 416, quorum 0,1,2 0,1,2
    osdmap e3106: 17 osds: 17 up, 17 in
    flags sortbitwise,require_jewel_osds
    pgmap v1232212: 1088 pgs, 2 pools, 5612 GB data, 1403 kobjects
    11231 GB used, 4511 GB / 15743 GB avail
    1088 active+clean
    client io 26403 kB/s rd, 26632 B/s wr, 100 op/s rd, 3 op/s wr


    I see I have the sortbitwise flag set...
     
  14. fabian

    fabian Proxmox Staff Member
    Staff Member

    Joined:
    Jan 7, 2016
    Messages:
    3,390
    Likes Received:
    523
    sortbitwise seems the most important, we actually hosed a test cluster by not setting that before upgrading.
     
  15. Vassilis Kipouros

    Joined:
    Nov 2, 2016
    Messages:
    46
    Likes Received:
    3
    After doing: ceph osd set require_kraken_osds

    ceph -s reports both:

    osdmap e3161: 17 osds: 17 up, 17 in
    flags sortbitwise,require_jewel_osds,require_kraken_osds

    Do I need to remove the jewel reference?
    How do I do it? (I tried ceph osd unset require_jewel_osds, but it does not work.)
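    (One hedged sanity check before flipping such flags is to confirm that every OSD daemon is actually running the new release, e.g.:)

    Code:
    ceph tell osd.* version   # report the running version of every OSD daemon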
     
  16. Vassilis Kipouros

    Joined:
    Nov 2, 2016
    Messages:
    46
    Likes Received:
    3
    I upgraded Ceph, but the issue with the OSD never being added still persists...
    I ceph-disk zapped the disk with the new Ceph, then added it via the GUI...
    Everything reports OK but the OSD never appears... (help... pls)

    create OSD on /dev/sdg (xfs)
    Caution: invalid backup GPT header, but valid main header; regenerating
    backup header from main header.

    ****************************************************************************
    Caution: Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
    verification and recovery are STRONGLY recommended.
    ****************************************************************************
    GPT data structures destroyed! You may now partition the disk using fdisk or
    other utilities.
    Creating new GPT entries.
    The operation has completed successfully.
    Setting name!
    partNum is 1
    REALLY setting name!
    The operation has completed successfully.
    The operation has completed successfully.
    Setting name!
    partNum is 0
    REALLY setting name!
    The operation has completed successfully.
    meta-data=/dev/sdg1 isize=2048 agcount=4, agsize=60719917 blks
    = sectsz=512 attr=2, projid32bit=1
    = crc=1 finobt=1, sparse=0, rmapbt=0, reflink=0
    data = bsize=4096 blocks=242879665, imaxpct=25
    = sunit=0 swidth=0 blks
    naming =version 2 bsize=4096 ascii-ci=0 ftype=1
    log =internal log bsize=4096 blocks=118593, version=2
    = sectsz=512 sunit=0 blks, lazy-count=1
    realtime =none extsz=4096 blocks=0, rtextents=0
    The operation has completed successfully.
    TASK OK
     
  17. Vassilis Kipouros

    Joined:
    Nov 2, 2016
    Messages:
    46
    Likes Received:
    3
    osd.17 never pops up after all the above...

    [attached screenshots: upload_2017-3-24_23-9-1.png, upload_2017-3-24_23-10-28.png]
     
  18. fabian

    fabian Proxmox Staff Member
    Staff Member

    Joined:
    Jan 7, 2016
    Messages:
    3,390
    Likes Received:
    523
    What does the journal from around the time of the OSD creation say? You can limit the output with "--since" and "--until", e.g.:

    Code:
    journalctl --since "2017-03-24 22:56:00" --until "2017-03-24 23:10:00"
     
  19. Vassilis Kipouros

    Joined:
    Nov 2, 2016
    Messages:
    46
    Likes Received:
    3

    root@pve1:~# journalctl --since "2017-03-27 18:00:00" --until "2017-03-27 18:10:00"
    -- Logs begin at Mon 2017-03-27 17:33:58 EEST, end at Mon 2017-03-27 18:30:06 EEST. --
    Mar 27 18:02:44 pve1 pvedaemon[1929]: <root@pam> starting task UPID:pve1:00003A7B:0002A3D0:58D92994:vncshell::root@pam:
    Mar 27 18:02:44 pve1 pvedaemon[14971]: starting vnc proxy UPID:pve1:00003A7B:0002A3D0:58D92994:vncshell::root@pam:
    Mar 27 18:02:44 pve1 pvedaemon[14971]: launch command: /usr/bin/vncterm -rfbport 5900 -timeout 10 -authpath /nodes/pve1 -perm Sys.Console -notls -listen localhost -c /bin/login -f root
    Mar 27 18:02:45 pve1 login[14973]: pam_unix(login:session): session opened for user root by (uid=0)
    Mar 27 18:02:45 pve1 systemd[1]: Created slice User Slice of root.
    Mar 27 18:02:45 pve1 systemd[1]: Starting User Manager for UID 0...
    Mar 27 18:02:45 pve1 systemd-logind[1117]: New session 4 of user root.
    Mar 27 18:02:45 pve1 systemd[14974]: pam_unix(systemd-user:session): session opened for user root by (uid=0)
    Mar 27 18:02:45 pve1 systemd[1]: Started Session 4 of user root.
    Mar 27 18:02:45 pve1 systemd[14974]: Listening on GnuPG cryptographic agent and passphrase cache.
    Mar 27 18:02:45 pve1 systemd[14974]: Reached target Timers.
    Mar 27 18:02:45 pve1 systemd[14974]: Reached target Paths.
    Mar 27 18:02:45 pve1 systemd[14974]: Listening on GnuPG cryptographic agent and passphrase cache (restricted).
    Mar 27 18:02:45 pve1 systemd[14974]: Listening on GnuPG network certificate management daemon.
    Mar 27 18:02:45 pve1 systemd[14974]: Listening on GnuPG cryptographic agent (ssh-agent emulation).
    Mar 27 18:02:45 pve1 systemd[14974]: Listening on GnuPG cryptographic agent (access for web browsers).
    Mar 27 18:02:45 pve1 systemd[14974]: Starting D-Bus User Message Bus Socket.
    Mar 27 18:02:45 pve1 systemd[14974]: Listening on D-Bus User Message Bus Socket.
    Mar 27 18:02:45 pve1 systemd[14974]: Reached target Sockets.
    Mar 27 18:02:45 pve1 systemd[14974]: Reached target Basic System.
    Mar 27 18:02:45 pve1 systemd[14974]: Reached target Default.
    Mar 27 18:02:45 pve1 systemd[14974]: Startup finished in 31ms.
    Mar 27 18:02:45 pve1 systemd[1]: Started User Manager for UID 0.
    Mar 27 18:02:45 pve1 login[14996]: ROOT LOGIN on '/dev/pts/4'
    Mar 27 18:02:55 pve1 kernel: Alternate GPT is invalid, using primary GPT.
    Mar 27 18:02:55 pve1 kernel: sdg: sdg1 sdg2
    Mar 27 18:02:55 pve1 kernel: Alternate GPT is invalid, using primary GPT.
    Mar 27 18:02:55 pve1 kernel: sdg: sdg1 sdg2
    Mar 27 18:02:57 pve1 kernel: sdg:
    Mar 27 18:03:13 pve1 pvedaemon[1929]: <root@pam> starting task UPID:pve1:00003BDE:0002AF0A:58D929B1:cephcreateosd:sdg:root@pam:
    Mar 27 18:03:13 pve1 kernel: Alternate GPT is invalid, using primary GPT.
    Mar 27 18:03:13 pve1 kernel: sdg:
    Mar 27 18:03:16 pve1 kernel: sdg:
    Mar 27 18:03:16 pve1 kernel: sdg:
    Mar 27 18:03:17 pve1 kernel: sdg: sdg2
    Mar 27 18:03:18 pve1 systemd[1]: Starting Ceph disk activation: /dev/sdg2...
    Mar 27 18:03:18 pve1 kernel: sdg: sdg2
    Mar 27 18:03:18 pve1 kernel: sdg: sdg2
    Mar 27 18:03:18 pve1 sh[15419]: main_trigger: main_trigger: Namespace(cluster='ceph', dev='/dev/sdg2', dmcrypt=None, dmcrypt_key_dir='/etc/ceph/dmcrypt-keys', func=<function main_trigger at 0x7fb394b16398>,
    Mar 27 18:03:18 pve1 sh[15419]: command: Running command: /sbin/init --version
    Mar 27 18:03:18 pve1 sh[15419]: command_check_call: Running command: /bin/chown ceph:ceph /dev/sdg2
    Mar 27 18:03:18 pve1 sh[15419]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
    Mar 27 18:03:18 pve1 sh[15419]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
    Mar 27 18:03:18 pve1 sh[15419]: main_trigger: trigger /dev/sdg2 parttype 45b0969e-9b03-4f30-b4c6-b4b80ceff106 uuid 4925f1a4-ad1f-408c-b61f-b42266b2177c
    Mar 27 18:03:18 pve1 sh[15419]: command: Running command: /usr/sbin/ceph-disk --verbose activate-journal /dev/sdg2
    Mar 27 18:03:19 pve1 systemd[1]: Stopped Ceph disk activation: /dev/sdg2.
    Mar 27 18:03:19 pve1 systemd[1]: Starting Ceph disk activation: /dev/sdg2...
    Mar 27 18:03:19 pve1 sh[15419]: main_trigger:
    Mar 27 18:03:19 pve1 sh[15419]: main_trigger: get_dm_uuid: get_dm_uuid /dev/sdg2 uuid path is /sys/dev/block/8:98/dm/uuid
    Mar 27 18:03:19 pve1 sh[15419]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
    Mar 27 18:03:19 pve1 sh[15419]: command: Running command: /usr/bin/ceph-osd --get-device-fsid /dev/sdg2
    Mar 27 18:03:19 pve1 sh[15419]: get_space_osd_uuid: Journal /dev/sdg2 has OSD UUID 00000000-0000-0000-0000-000000000000
    Mar 27 18:03:19 pve1 sh[15419]: main_activate_space: activate: OSD device not present, not starting, yet
    Mar 27 18:03:19 pve1 systemd[1]: Stopped Ceph disk activation: /dev/sdg2.
    Mar 27 18:03:19 pve1 systemd[1]: Starting Ceph disk activation: /dev/sdg2...
    Mar 27 18:03:19 pve1 sh[15475]: main_trigger: main_trigger: Namespace(cluster='ceph', dev='/dev/sdg2', dmcrypt=None, dmcrypt_key_dir='/etc/ceph/dmcrypt-keys', func=<function main_trigger at 0x7f71b371e398>,
    Mar 27 18:03:19 pve1 sh[15475]: command: Running command: /sbin/init --version
    Mar 27 18:03:19 pve1 sh[15475]: command_check_call: Running command: /bin/chown ceph:ceph /dev/sdg2
    Mar 27 18:03:19 pve1 sh[15475]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
    Mar 27 18:03:19 pve1 sh[15475]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
    Mar 27 18:03:19 pve1 sh[15475]: main_trigger: trigger /dev/sdg2 parttype 45b0969e-9b03-4f30-b4c6-b4b80ceff106 uuid 4925f1a4-ad1f-408c-b61f-b42266b2177c
    Mar 27 18:03:19 pve1 sh[15475]: command: Running command: /usr/sbin/ceph-disk --verbose activate-journal /dev/sdg2
    Mar 27 18:03:19 pve1 sh[15475]: main_trigger:
    Mar 27 18:03:19 pve1 sh[15475]: main_trigger: get_dm_uuid: get_dm_uuid /dev/sdg2 uuid path is /sys/dev/block/8:98/dm/uuid
    Mar 27 18:03:19 pve1 sh[15475]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
    Mar 27 18:03:19 pve1 sh[15475]: command: Running command: /usr/bin/ceph-osd --get-device-fsid /dev/sdg2
    Mar 27 18:03:19 pve1 sh[15475]: get_space_osd_uuid: Journal /dev/sdg2 has OSD UUID 00000000-0000-0000-0000-000000000000
    Mar 27 18:03:19 pve1 sh[15475]: main_activate_space: activate: OSD device not present, not starting, yet
    Mar 27 18:03:20 pve1 kernel: sdg: sdg2
    Mar 27 18:03:20 pve1 sh[15492]: main_trigger: main_trigger: Namespace(cluster='ceph', dev='/dev/sdg2', dmcrypt=None, dmcrypt_key_dir='/etc/ceph/dmcrypt-keys', func=<function main_trigger at 0x7fecbf24d398>,
    Mar 27 18:03:20 pve1 sh[15492]: command: Running command: /sbin/init --version
    Mar 27 18:03:20 pve1 sh[15492]: command_check_call: Running command: /bin/chown ceph:ceph /dev/sdg2
    Mar 27 18:03:20 pve1 sh[15492]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
    Mar 27 18:03:20 pve1 sh[15492]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
    Mar 27 18:03:20 pve1 sh[15492]: main_trigger: trigger /dev/sdg2 parttype 45b0969e-9b03-4f30-b4c6-b4b80ceff106 uuid 4925f1a4-ad1f-408c-b61f-b42266b2177c
    Mar 27 18:03:20 pve1 sh[15492]: command: Running command: /usr/sbin/ceph-disk --verbose activate-journal /dev/sdg2
    Mar 27 18:03:20 pve1 systemd[1]: Stopped Ceph disk activation: /dev/sdg2.
    Mar 27 18:03:20 pve1 systemd[1]: Starting Ceph disk activation: /dev/sdg2...
    Mar 27 18:03:20 pve1 sh[15492]: main_trigger:
    Mar 27 18:03:20 pve1 sh[15492]: main_trigger: get_dm_uuid: get_dm_uuid /dev/sdg2 uuid path is /sys/dev/block/8:98/dm/uuid
    Mar 27 18:03:20 pve1 sh[15492]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
    Mar 27 18:03:20 pve1 sh[15492]: command: Running command: /usr/bin/ceph-osd --get-device-fsid /dev/sdg2
    Mar 27 18:03:20 pve1 sh[15492]: get_space_osd_uuid: Journal /dev/sdg2 has OSD UUID 00000000-0000-0000-0000-000000000000
    Mar 27 18:03:20 pve1 sh[15492]: main_activate_space: activate: OSD device not present, not starting, yet
    Mar 27 18:03:20 pve1 kernel: sdg: sdg2
    Mar 27 18:03:20 pve1 kernel: sdg: sdg1 sdg2
    Mar 27 18:03:20 pve1 sh[15530]: main_trigger: main_trigger: Namespace(cluster='ceph', dev='/dev/sdg2', dmcrypt=None, dmcrypt_key_dir='/etc/ceph/dmcrypt-keys', func=<function main_trigger at 0x7f408dcf7398>,
    Mar 27 18:03:20 pve1 sh[15530]: command: Running command: /sbin/init --version
    Mar 27 18:03:20 pve1 sh[15530]: command_check_call: Running command: /bin/chown ceph:ceph /dev/sdg2
    Mar 27 18:03:20 pve1 sh[15530]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
    Mar 27 18:03:20 pve1 sh[15530]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
    Mar 27 18:03:20 pve1 sh[15530]: main_trigger: trigger /dev/sdg2 parttype 45b0969e-9b03-4f30-b4c6-b4b80ceff106 uuid 4925f1a4-ad1f-408c-b61f-b42266b2177c
    Mar 27 18:03:20 pve1 sh[15530]: command: Running command: /usr/sbin/ceph-disk --verbose activate-journal /dev/sdg2
    Mar 27 18:03:21 pve1 systemd[1]: Stopped Ceph disk activation: /dev/sdg2.
    Mar 27 18:03:21 pve1 systemd[1]: Starting Ceph disk activation: /dev/sdg2...
    Mar 27 18:03:21 pve1 sh[15530]: main_trigger:
    Mar 27 18:03:21 pve1 sh[15530]: main_trigger: get_dm_uuid: get_dm_uuid /dev/sdg2 uuid path is /sys/dev/block/8:98/dm/uuid
    Mar 27 18:03:21 pve1 sh[15530]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
    Mar 27 18:03:21 pve1 sh[15530]: command: Running command: /usr/bin/ceph-osd --get-device-fsid /dev/sdg2
    Mar 27 18:03:21 pve1 sh[15530]: get_space_osd_uuid: Journal /dev/sdg2 has OSD UUID 00000000-0000-0000-0000-000000000000
    Mar 27 18:03:21 pve1 sh[15530]: main_activate_space: activate: OSD device not present, not starting, yet
    Mar 27 18:03:21 pve1 systemd[1]: Stopped Ceph disk activation: /dev/sdg2.
    Mar 27 18:03:21 pve1 systemd[1]: ceph-disk@dev-sdg2.service: Start request repeated too quickly.
    Mar 27 18:03:21 pve1 systemd[1]: Failed to start Ceph disk activation: /dev/sdg2.
    Mar 27 18:03:21 pve1 systemd[1]: ceph-disk@dev-sdg2.service: Unit entered failed state.
    Mar 27 18:03:21 pve1 systemd[1]: ceph-disk@dev-sdg2.service: Failed with result 'start-limit-hit'.

    Mar 27 18:03:21 pve1 sh[15593]: main_trigger: main_trigger: Namespace(cluster='ceph', dev='/dev/sdg2', dmcrypt=None, dmcrypt_key_dir='/etc/ceph/dmcrypt-keys', func=<function main_trigger at 0x7f6ba7770398>,
    Mar 27 18:03:21 pve1 sh[15593]: command: Running command: /sbin/init --version
    Mar 27 18:03:21 pve1 sh[15593]: command_check_call: Running command: /bin/chown ceph:ceph /dev/sdg2
    Mar 27 18:03:21 pve1 sh[15593]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
    Mar 27 18:03:21 pve1 sh[15593]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
    Mar 27 18:03:21 pve1 sh[15593]: main_trigger: trigger /dev/sdg2 parttype 45b0969e-9b03-4f30-b4c6-b4b80ceff106 uuid 4925f1a4-ad1f-408c-b61f-b42266b2177c
    Mar 27 18:03:21 pve1 sh[15593]: command: Running command: /usr/sbin/ceph-disk --verbose activate-journal /dev/sdg2
    Mar 27 18:03:21 pve1 sh[15593]: main_trigger:
    Mar 27 18:03:21 pve1 sh[15593]: main_trigger: get_dm_uuid: get_dm_uuid /dev/sdg2 uuid path is /sys/dev/block/8:98/dm/uuid
    Mar 27 18:03:21 pve1 sh[15593]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
    Mar 27 18:03:21 pve1 sh[15593]: command: Running command: /usr/bin/ceph-osd --get-device-fsid /dev/sdg2
    Mar 27 18:03:21 pve1 sh[15593]: get_space_osd_uuid: Journal /dev/sdg2 has OSD UUID 00000000-0000-0000-0000-000000000000
    Mar 27 18:03:21 pve1 sh[15593]: main_activate_space: activate: OSD device not present, not starting, yet
    Mar 27 18:03:22 pve1 kernel: sdg: sdg1 sdg2
    Mar 27 18:03:22 pve1 systemd[1]: ceph-disk@dev-sdg2.service: Start request repeated too quickly.
    Mar 27 18:03:22 pve1 systemd[1]: Failed to start Ceph disk activation: /dev/sdg2.
    Mar 27 18:03:22 pve1 systemd[1]: ceph-disk@dev-sdg2.service: Failed with result 'start-limit-hit'.
    Mar 27 18:03:27 pve1 kernel: XFS (sdg1): Mounting V5 Filesystem
    Mar 27 18:03:27 pve1 kernel: XFS (sdg1): Ending clean mount
    Mar 27 18:03:28 pve1 kernel: XFS (sdg1): Unmounting Filesystem
    Mar 27 18:03:28 pve1 kernel: sdg: sdg1 sdg2
    Mar 27 18:03:28 pve1 kernel: sdg: sdg1 sdg2
    Mar 27 18:03:28 pve1 systemd[1]: Starting Ceph disk activation: /dev/sdg2...
    Mar 27 18:03:28 pve1 sh[15781]: main_trigger: main_trigger: Namespace(cluster='ceph', dev='/dev/sdg2', dmcrypt=None, dmcrypt_key_dir='/etc/ceph/dmcrypt-keys', func=<function main_trigger at 0x7fe9020b8398>,
    Mar 27 18:03:28 pve1 sh[15781]: command: Running command: /sbin/init --version
    Mar 27 18:03:28 pve1 sh[15781]: command_check_call: Running command: /bin/chown ceph:ceph /dev/sdg2
    Mar 27 18:03:28 pve1 sh[15781]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
    Mar 27 18:03:28 pve1 sh[15781]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
    Mar 27 18:03:28 pve1 sh[15781]: main_trigger: trigger /dev/sdg2 parttype 45b0969e-9b03-4f30-b4c6-b4b80ceff106 uuid 4925f1a4-ad1f-408c-b61f-b42266b2177c
    Mar 27 18:03:28 pve1 sh[15781]: command: Running command: /usr/sbin/ceph-disk --verbose activate-journal /dev/sdg2
    Mar 27 18:03:29 pve1 systemd[1]: Stopped Ceph disk activation: /dev/sdg2.
    Mar 27 18:03:29 pve1 systemd[1]: Starting Ceph disk activation: /dev/sdg2...
    Mar 27 18:03:29 pve1 sh[15781]: main_trigger:
    Mar 27 18:03:29 pve1 sh[15781]: main_trigger: get_dm_uuid: get_dm_uuid /dev/sdg2 uuid path is /sys/dev/block/8:98/dm/uuid
    Mar 27 18:03:29 pve1 sh[15781]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
    Mar 27 18:03:29 pve1 sh[15781]: command: Running command: /usr/bin/ceph-osd --get-device-fsid /dev/sdg2
    Mar 27 18:03:29 pve1 sh[15781]: get_space_osd_uuid: Journal /dev/sdg2 has OSD UUID 00000000-0000-0000-0000-000000000000
    Mar 27 18:03:29 pve1 sh[15781]: main_activate_space: activate: OSD device not present, not starting, yet
    Mar 27 18:03:29 pve1 kernel: sdg: sdg1 sdg2
    Mar 27 18:03:29 pve1 sh[15803]: main_trigger: main_trigger: Namespace(cluster='ceph', dev='/dev/sdg2', dmcrypt=None, dmcrypt_key_dir='/etc/ceph/dmcrypt-keys', func=<function main_trigger at 0x7f85ccd1a398>,
    Mar 27 18:03:29 pve1 sh[15803]: command: Running command: /sbin/init --version
    Mar 27 18:03:29 pve1 sh[15803]: command_check_call: Running command: /bin/chown ceph:ceph /dev/sdg2
    Mar 27 18:03:29 pve1 sh[15803]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
    Mar 27 18:03:29 pve1 sh[15803]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
    Mar 27 18:03:29 pve1 sh[15803]: main_trigger: trigger /dev/sdg2 parttype 45b0969e-9b03-4f30-b4c6-b4b80ceff106 uuid 4925f1a4-ad1f-408c-b61f-b42266b2177c
    Mar 27 18:03:29 pve1 sh[15803]: command: Running command: /usr/sbin/ceph-disk --verbose activate-journal /dev/sdg2
    Mar 27 18:03:29 pve1 systemd[1]: Stopped Ceph disk activation: /dev/sdg2.
    Mar 27 18:03:29 pve1 systemd[1]: Starting Ceph disk activation: /dev/sdg2...
    Mar 27 18:03:30 pve1 sh[15803]: main_trigger:
    Mar 27 18:03:30 pve1 sh[15803]: main_trigger: get_dm_uuid: get_dm_uuid /dev/sdg2 uuid path is /sys/dev/block/8:98/dm/uuid
    Mar 27 18:03:30 pve1 sh[15803]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
    Mar 27 18:03:30 pve1 sh[15803]: command: Running command: /usr/bin/ceph-osd --get-device-fsid /dev/sdg2
    Mar 27 18:03:30 pve1 sh[15803]: get_space_osd_uuid: Journal /dev/sdg2 has OSD UUID 00000000-0000-0000-0000-000000000000
    Mar 27 18:03:30 pve1 sh[15803]: main_activate_space: activate: OSD device not present, not starting, yet
    Mar 27 18:03:30 pve1 pvedaemon[1929]: <root@pam> end task UPID:pve1:00003BDE:0002AF0A:58D929B1:cephcreateosd:sdg:root@pam: OK
    Mar 27 18:03:30 pve1 sh[15834]: main_trigger: main_trigger: Namespace(cluster='ceph', dev='/dev/sdg2', dmcrypt=None, dmcrypt_key_dir='/etc/ceph/dmcrypt-keys', func=<function main_trigger at 0x7f625e117398>,
    Mar 27 18:03:30 pve1 sh[15834]: command: Running command: /sbin/init --version
    Mar 27 18:03:30 pve1 sh[15834]: command_check_call: Running command: /bin/chown ceph:ceph /dev/sdg2
    Mar 27 18:03:30 pve1 sh[15834]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
    Mar 27 18:03:30 pve1 sh[15834]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
    Mar 27 18:03:30 pve1 sh[15834]: main_trigger: trigger /dev/sdg2 parttype 45b0969e-9b03-4f30-b4c6-b4b80ceff106 uuid 4925f1a4-ad1f-408c-b61f-b42266b2177c
    Mar 27 18:03:30 pve1 sh[15834]: command: Running command: /usr/sbin/ceph-disk --verbose activate-journal /dev/sdg2
    Mar 27 18:03:30 pve1 sh[15834]: main_trigger:
    Mar 27 18:03:30 pve1 sh[15834]: main_trigger: get_dm_uuid: get_dm_uuid /dev/sdg2 uuid path is /sys/dev/block/8:98/dm/uuid
    Mar 27 18:03:30 pve1 sh[15834]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
    Mar 27 18:03:30 pve1 sh[15834]: command: Running command: /usr/bin/ceph-osd --get-device-fsid /dev/sdg2
    Mar 27 18:03:30 pve1 sh[15834]: get_space_osd_uuid: Journal /dev/sdg2 has OSD UUID 00000000-0000-0000-0000-000000000000
    Mar 27 18:03:30 pve1 sh[15834]: main_activate_space: activate: OSD device not present, not starting, yet
    Mar 27 18:03:30 pve1 systemd[1]: Started Ceph disk activation: /dev/sdg2.
    Mar 27 18:04:05 pve1 smartd[1109]: Device: /dev/sda [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 69 to 68
    Mar 27 18:04:05 pve1 smartd[1109]: Device: /dev/sdc [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 117 to 118
    Mar 27 18:04:05 pve1 smartd[1109]: Device: /dev/sdd [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 153 to 150
    Mar 27 18:04:06 pve1 smartd[1109]: Device: /dev/sde [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 116 to 115
    Mar 27 18:04:06 pve1 smartd[1109]: Device: /dev/sdf [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 59 to 58
    Mar 27 18:04:06 pve1 smartd[1109]: Device: /dev/sdf [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 41 to 42
    Mar 27 18:07:39 pve1 pveproxy[2843]: worker 2845 finished
    Mar 27 18:07:39 pve1 pveproxy[2843]: starting 1 worker(s)
    Mar 27 18:07:39 pve1 pveproxy[2843]: worker 17382 started
    Mar 27 18:07:43 pve1 pveproxy[17381]: got inotify poll request in wrong process - disabling inotify
    Mar 27 18:09:57 pve1 pvedaemon[1931]: <root@pam> successful auth for user 'root@pam'
    root@pve1:~#
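
    What stands out above is ceph-disk@dev-sdg2.service hitting systemd's start rate limit ("start-limit-hit"). If that is the blocker, something along these lines might get the activation retried manually (a sketch only; adjust the partition names):

    Code:
    systemctl reset-failed ceph-disk@dev-sdg2.service   # clear the rate-limited/failed unit state
    ceph-disk --verbose activate /dev/sdg1              # try activating the data partition by hand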
     
    #19 Vassilis Kipouros, Mar 27, 2017
    Last edited: Mar 27, 2017
  20. Vassilis Kipouros

    Joined:
    Nov 2, 2016
    Messages:
    46
    Likes Received:
    3
    Also, out of the blue, one OSD is reported as nearly full, which is probably a false alarm...

    How can one OSD be nearly full when all of them have the same weight???
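    (Even with equal weights, CRUSH placement is pseudo-random, so per-OSD utilisation can drift apart. A quick way to see the actual fill level of each OSD might be:)

    Code:
    ceph osd df   # show utilisation and variance per OSD
    # if one OSD really is overfull, "ceph osd reweight-by-utilization" can rebalance it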

    [attached screenshot: upload_2017-3-27_18-38-35.png]
     
    #20 Vassilis Kipouros, Mar 27, 2017
    Last edited: Mar 27, 2017