Proxmox 5 beta OSD trouble...

Vassilis Kipouros

After upgrading to Proxmox 5.0 BETA, I tried to add a new OSD to my Ceph cluster.

create OSD on /dev/sdg (xfs)
Caution: invalid backup GPT header, but valid main header; regenerating
backup header from main header.

****************************************************************************
Caution: Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
verification and recovery are STRONGLY recommended.
****************************************************************************
GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.
Creating new GPT entries.
The operation has completed successfully.
Setting name!
partNum is 1
REALLY setting name!
The operation has completed successfully.
Setting name!
partNum is 0
REALLY setting name!
The operation has completed successfully.
meta-data=/dev/sdg1 isize=2048 agcount=4, agsize=60719917 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=1, sparse=0, rmapbt=0, reflink=0
data = bsize=4096 blocks=242879665, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=1
log =internal log bsize=4096 blocks=118593, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
The operation has completed successfully.
TASK OK

But the OSD never appears in the GUI and never gets added to the Ceph configuration.
The disk usage column reports "partitions" instead of osd.x.

After rebooting the node and retrying the add (ceph-disk zap /dev/sdx, then pveceph createosd /dev/sdx),
the situation is still the same...

The rest of the cluster seems to work fine.

Please help!
 
Did you re-use the disk, or was it a brand new one?

I assume there is some old data on it that is preventing its use.
 
I'm using ceph-disk zap, and I've done it twice...

And nothing reports that its use is prevented... everything reports OK...

I also have a clock skew warning that refuses to clear, even two hours after the node reboot...

Performance is also a bit weird... CTs and VMs take ages to start.
When trying to bulk-start them, I get this:

Can't call method "has_lock" on an undefined value at /usr/share/perl5/PVE/API2/Nodes.pm line 1300.
Can't call method "has_lock" on an undefined value at /usr/share/perl5/PVE/API2/Nodes.pm line 1300.
Can't call method "has_lock" on an undefined value at /usr/share/perl5/PVE/API2/Nodes.pm line 1300.
Use of uninitialized value in string eq at /usr/share/perl5/PVE/API2/Nodes.pm line 1382.
Use of uninitialized value in string eq at /usr/share/perl5/PVE/API2/Nodes.pm line 1386.
Use of uninitialized value in concatenation (.) or string at /usr/share/perl5/PVE/API2/Nodes.pm line 1392.
unknown VM type ''
Use of uninitialized value in string eq at /usr/share/perl5/PVE/API2/Nodes.pm line 1382.
Use of uninitialized value in string eq at /usr/share/perl5/PVE/API2/Nodes.pm line 1386.
Use of uninitialized value in concatenation (.) or string at /usr/share/perl5/PVE/API2/Nodes.pm line 1392.
unknown VM type ''
Use of uninitialized value in string eq at /usr/share/perl5/PVE/API2/Nodes.pm line 1382.
Use of uninitialized value in string eq at /usr/share/perl5/PVE/API2/Nodes.pm line 1386.
Use of uninitialized value in concatenation (.) or string at /usr/share/perl5/PVE/API2/Nodes.pm line 1392.
unknown VM type ''
TASK OK
 
I'm using ceph-disk zap, and I've done it twice...

And nothing reports that its use is prevented... everything reports OK...

This is a known issue, and the question remains: used disk or brand new? If these are used disks, please wipe them completely and try again.

To get a better understanding of your problems: how did you install the 5.0 beta, or how did you upgrade? It seems something went wrong here.
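
"Wipe completely" here means removing the old partition tables and any leftover signatures, not just zapping. A minimal sketch, assuming the OSD disk is /dev/sdg (double-check the device name first, this is destructive):

Code:
# remove all filesystem/RAID/partition-table signatures
wipefs --all /dev/sdg
# destroy both the main and the backup GPT structures
sgdisk --zap-all /dev/sdg
# zero the start of the disk to clear any remaining metadata
dd if=/dev/zero of=/dev/sdg bs=1M count=200 oflag=direct
# make the kernel re-read the now-empty partition table
partprobe /dev/sdg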
 
The disk was not brand new. (Doesn't ceph-disk zap nuke everything on the disk? What do you mean by "wipe completely"? fdisk?)

I edited my sources.list to this and then dist-upgraded all three nodes (see the sketch after the listing):


deb http://ftp.gr.debian.org/debian stretch main contrib

# security updates
deb http://security.debian.org stretch/updates main contrib

# PVE pve-no-subscription repository provided by proxmox.com,
# NOT recommended for production use
deb http://download.proxmox.com/debian stretch pvetest
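
The dist-upgrade itself was roughly the standard sequence on each node:

Code:
apt update
apt dist-upgrade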
 
The disk was not brand new. (Doesn't ceph-disk zap nuke everything on the disk? What do you mean by "wipe completely"? fdisk?)

I edited my sources.list to this and then dist-upgraded all three nodes:


deb http://ftp.gr.debian.org/debian stretch main contrib

# security updates
deb http://security.debian.org stretch/updates main contrib

# PVE pve-no-subscription repository provided by proxmox.com,
# NOT recommended for production use
deb http://download.proxmox.com/debian stretch pvetest

What does "ceph version" say?
 
Hi, my VM fails to autostart with the following error:
Code:
Mar 23 21:56:23 convergence qm[15052]: start failed: org.freedesktop.DBus.Error.InvalidArgs: CPUShares value out of range
Mar 23 21:56:23 convergence qm[15049]: <root@pam> end task UPID:convergence:00003ACC:0000384E:58D43677:qmstart:179:root@pam: start failed: org.freedesktop.DBus.Error.InvalidArgs: CPUShares value out of range
I gave this VM 50000 CPU shares, which was supposed to be the highest possible value. I am not sure whether this is a bug or the option is deprecated.
After commenting out the CPU shares, I have not yet restarted the host to see whether there will be other problems.
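
As a workaround sketch (the VMID 179 is taken from the log above), the value could be lowered back into the range systemd actually accepts; as far as I know, systemd's CPUShares is limited to 2..262144:

Code:
# reset the CPU shares to a value within systemd's accepted range
qm set 179 --cpuunits 1024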
 
Hi, my VM fails to autostart with the following error:
Code:
Mar 23 21:56:23 convergence qm[15052]: start failed: org.freedesktop.DBus.Error.InvalidArgs: CPUShares value out of range
Mar 23 21:56:23 convergence qm[15049]: <root@pam> end task UPID:convergence:00003ACC:0000384E:58D43677:qmstart:179:root@pam: start failed: org.freedesktop.DBus.Error.InvalidArgs: CPUShares value out of range
I gave this VM 50000 CPU shares, which was supposed to be the highest possible value. I am not sure whether this is a bug or the option is deprecated.
After commenting out the CPU shares, I have not yet restarted the host to see whether there will be other problems.

Thanks, filed as https://bugzilla.proxmox.com/show_bug.cgi?id=1321
 
Yes. "pveceph install" will install the Ceph Luminous beta packages, but please read the Ceph release notes for Kraken (especially the part about sortbitwise!) and the notes for Luminous before upgrading. Note that the upgrade path is not yet well tested.
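
For reference, setting that flag before the upgrade is a single command; "ceph -s" should afterwards list it under flags:

Code:
# mark the cluster as using bitwise object sort order (required before Kraken/Luminous)
ceph osd set sortbitwise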
 
cluster 664d7895-6e32-4f53-a8bc-46050faca724
health HEALTH_WARN
clock skew detected on mon.2
Monitor clock skew detected
monmap e3: 3 mons at {0=192.168.100.1:6789/0,1=192.168.100.2:6789/0,2=192.168.100.3:6789/0}
election epoch 416, quorum 0,1,2 0,1,2
osdmap e3106: 17 osds: 17 up, 17 in
flags sortbitwise,require_jewel_osds
pgmap v1232212: 1088 pgs, 2 pools, 5612 GB data, 1403 kobjects
11231 GB used, 4511 GB / 15743 GB avail
1088 active+clean
client io 26403 kB/s rd, 26632 B/s wr, 100 op/s rd, 3 op/s wr


I see I have the sortbitwise flag set...
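
As for the clock skew on mon.2, a quick check sketch (assuming the nodes run systemd-timesyncd or ntpd; adjust for whatever time-sync daemon is actually installed):

Code:
# is the clock on this node considered synchronized?
timedatectl status
# if ntpd is in use, inspect its peers
ntpq -p
# restart the sync daemon on the skewed node, then re-check "ceph -s"
systemctl restart systemd-timesyncd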
 
sortbitwise seems the most important one; we actually hosed a test cluster by not setting it before upgrading.
 
After doing: ceph osd set require_kraken_osds

ceph -s now reports both:

osdmap e3161: 17 osds: 17 up, 17 in
flags sortbitwise,require_jewel_osds,require_kraken_osds

Do I need to remove the jewel reference?
How do I do it? (I tried ceph osd unset require_jewel_osds, but it does not work.)
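
(As far as I know, the require_*_osds flags are meant to be permanent: they record the minimum OSD release in the OSD map, which would explain why unsetting them is refused. The currently set flags can be confirmed with:)

Code:
# show the cluster-wide OSD map flags
ceph osd dump | grep flags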
 
sortbitwise seems the most important one; we actually hosed a test cluster by not setting it before upgrading.

I upgraded Ceph, but the issue with the OSD never being added still persists...
I ran ceph-disk zap on the disk with the new Ceph version and then added it via the GUI...
Everything reports OK, but the OSD never appears... (help... please)

create OSD on /dev/sdg (xfs)
Caution: invalid backup GPT header, but valid main header; regenerating
backup header from main header.

****************************************************************************
Caution: Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
verification and recovery are STRONGLY recommended.
****************************************************************************
GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.
Creating new GPT entries.
The operation has completed successfully.
Setting name!
partNum is 1
REALLY setting name!
The operation has completed successfully.
The operation has completed successfully.
Setting name!
partNum is 0
REALLY setting name!
The operation has completed successfully.
meta-data=/dev/sdg1 isize=2048 agcount=4, agsize=60719917 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=1, sparse=0, rmapbt=0, reflink=0
data = bsize=4096 blocks=242879665, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=1
log =internal log bsize=4096 blocks=118593, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
The operation has completed successfully.
TASK OK
 
What does the journal from around the time of OSD creation say? You can limit the output with "--since" and "--until", e.g.:

Code:
journalctl --since "2017-03-24 22:56:00" --until "2017-03-24 23:10:00"
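
It may also help to see how ceph-disk itself classifies the partitions on that disk (a diagnostic sketch; the device name is taken from your output):

Code:
# show how ceph-disk interprets the partitions (data/journal/other)
ceph-disk list /dev/sdg
# check whether the OSD ever made it into the CRUSH map
ceph osd tree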
 
What does the journal from around the time of OSD creation say? You can limit the output with "--since" and "--until", e.g.:

Code:
journalctl --since "2017-03-24 22:56:00" --until "2017-03-24 23:10:00"


root@pve1:~# journalctl --since "2017-03-27 18:00:00" --until "2017-03-27 18:10:00"
-- Logs begin at Mon 2017-03-27 17:33:58 EEST, end at Mon 2017-03-27 18:30:06 EEST. --
Mar 27 18:02:44 pve1 pvedaemon[1929]: <root@pam> starting task UPID:pve1:00003A7B:0002A3D0:58D92994:vncshell::root@pam:
Mar 27 18:02:44 pve1 pvedaemon[14971]: starting vnc proxy UPID:pve1:00003A7B:0002A3D0:58D92994:vncshell::root@pam:
Mar 27 18:02:44 pve1 pvedaemon[14971]: launch command: /usr/bin/vncterm -rfbport 5900 -timeout 10 -authpath /nodes/pve1 -perm Sys.Console -notls -listen localhost -c /bin/login -f root
Mar 27 18:02:45 pve1 login[14973]: pam_unix(login:session): session opened for user root by (uid=0)
Mar 27 18:02:45 pve1 systemd[1]: Created slice User Slice of root.
Mar 27 18:02:45 pve1 systemd[1]: Starting User Manager for UID 0...
Mar 27 18:02:45 pve1 systemd-logind[1117]: New session 4 of user root.
Mar 27 18:02:45 pve1 systemd[14974]: pam_unix(systemd-user:session): session opened for user root by (uid=0)
Mar 27 18:02:45 pve1 systemd[1]: Started Session 4 of user root.
Mar 27 18:02:45 pve1 systemd[14974]: Listening on GnuPG cryptographic agent and passphrase cache.
Mar 27 18:02:45 pve1 systemd[14974]: Reached target Timers.
Mar 27 18:02:45 pve1 systemd[14974]: Reached target Paths.
Mar 27 18:02:45 pve1 systemd[14974]: Listening on GnuPG cryptographic agent and passphrase cache (restricted).
Mar 27 18:02:45 pve1 systemd[14974]: Listening on GnuPG network certificate management daemon.
Mar 27 18:02:45 pve1 systemd[14974]: Listening on GnuPG cryptographic agent (ssh-agent emulation).
Mar 27 18:02:45 pve1 systemd[14974]: Listening on GnuPG cryptographic agent (access for web browsers).
Mar 27 18:02:45 pve1 systemd[14974]: Starting D-Bus User Message Bus Socket.
Mar 27 18:02:45 pve1 systemd[14974]: Listening on D-Bus User Message Bus Socket.
Mar 27 18:02:45 pve1 systemd[14974]: Reached target Sockets.
Mar 27 18:02:45 pve1 systemd[14974]: Reached target Basic System.
Mar 27 18:02:45 pve1 systemd[14974]: Reached target Default.
Mar 27 18:02:45 pve1 systemd[14974]: Startup finished in 31ms.
Mar 27 18:02:45 pve1 systemd[1]: Started User Manager for UID 0.
Mar 27 18:02:45 pve1 login[14996]: ROOT LOGIN on '/dev/pts/4'
Mar 27 18:02:55 pve1 kernel: Alternate GPT is invalid, using primary GPT.
Mar 27 18:02:55 pve1 kernel: sdg: sdg1 sdg2
Mar 27 18:02:55 pve1 kernel: Alternate GPT is invalid, using primary GPT.
Mar 27 18:02:55 pve1 kernel: sdg: sdg1 sdg2
Mar 27 18:02:57 pve1 kernel: sdg:
Mar 27 18:03:13 pve1 pvedaemon[1929]: <root@pam> starting task UPID:pve1:00003BDE:0002AF0A:58D929B1:cephcreateosd:sdg:root@pam:
Mar 27 18:03:13 pve1 kernel: Alternate GPT is invalid, using primary GPT.
Mar 27 18:03:13 pve1 kernel: sdg:
Mar 27 18:03:16 pve1 kernel: sdg:
Mar 27 18:03:16 pve1 kernel: sdg:
Mar 27 18:03:17 pve1 kernel: sdg: sdg2
Mar 27 18:03:18 pve1 systemd[1]: Starting Ceph disk activation: /dev/sdg2...
Mar 27 18:03:18 pve1 kernel: sdg: sdg2
Mar 27 18:03:18 pve1 kernel: sdg: sdg2
Mar 27 18:03:18 pve1 sh[15419]: main_trigger: main_trigger: Namespace(cluster='ceph', dev='/dev/sdg2', dmcrypt=None, dmcrypt_key_dir='/etc/ceph/dmcrypt-keys', func=<function main_trigger at 0x7fb394b16398>,
Mar 27 18:03:18 pve1 sh[15419]: command: Running command: /sbin/init --version
Mar 27 18:03:18 pve1 sh[15419]: command_check_call: Running command: /bin/chown ceph:ceph /dev/sdg2
Mar 27 18:03:18 pve1 sh[15419]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
Mar 27 18:03:18 pve1 sh[15419]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
Mar 27 18:03:18 pve1 sh[15419]: main_trigger: trigger /dev/sdg2 parttype 45b0969e-9b03-4f30-b4c6-b4b80ceff106 uuid 4925f1a4-ad1f-408c-b61f-b42266b2177c
Mar 27 18:03:18 pve1 sh[15419]: command: Running command: /usr/sbin/ceph-disk --verbose activate-journal /dev/sdg2
Mar 27 18:03:19 pve1 systemd[1]: Stopped Ceph disk activation: /dev/sdg2.
Mar 27 18:03:19 pve1 systemd[1]: Starting Ceph disk activation: /dev/sdg2...
Mar 27 18:03:19 pve1 sh[15419]: main_trigger:
Mar 27 18:03:19 pve1 sh[15419]: main_trigger: get_dm_uuid: get_dm_uuid /dev/sdg2 uuid path is /sys/dev/block/8:98/dm/uuid
Mar 27 18:03:19 pve1 sh[15419]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
Mar 27 18:03:19 pve1 sh[15419]: command: Running command: /usr/bin/ceph-osd --get-device-fsid /dev/sdg2
Mar 27 18:03:19 pve1 sh[15419]: get_space_osd_uuid: Journal /dev/sdg2 has OSD UUID 00000000-0000-0000-0000-000000000000
Mar 27 18:03:19 pve1 sh[15419]: main_activate_space: activate: OSD device not present, not starting, yet
Mar 27 18:03:19 pve1 systemd[1]: Stopped Ceph disk activation: /dev/sdg2.
Mar 27 18:03:19 pve1 systemd[1]: Starting Ceph disk activation: /dev/sdg2...
Mar 27 18:03:19 pve1 sh[15475]: main_trigger: main_trigger: Namespace(cluster='ceph', dev='/dev/sdg2', dmcrypt=None, dmcrypt_key_dir='/etc/ceph/dmcrypt-keys', func=<function main_trigger at 0x7f71b371e398>,
Mar 27 18:03:19 pve1 sh[15475]: command: Running command: /sbin/init --version
Mar 27 18:03:19 pve1 sh[15475]: command_check_call: Running command: /bin/chown ceph:ceph /dev/sdg2
Mar 27 18:03:19 pve1 sh[15475]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
Mar 27 18:03:19 pve1 sh[15475]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
Mar 27 18:03:19 pve1 sh[15475]: main_trigger: trigger /dev/sdg2 parttype 45b0969e-9b03-4f30-b4c6-b4b80ceff106 uuid 4925f1a4-ad1f-408c-b61f-b42266b2177c
Mar 27 18:03:19 pve1 sh[15475]: command: Running command: /usr/sbin/ceph-disk --verbose activate-journal /dev/sdg2
Mar 27 18:03:19 pve1 sh[15475]: main_trigger:
Mar 27 18:03:19 pve1 sh[15475]: main_trigger: get_dm_uuid: get_dm_uuid /dev/sdg2 uuid path is /sys/dev/block/8:98/dm/uuid
Mar 27 18:03:19 pve1 sh[15475]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
Mar 27 18:03:19 pve1 sh[15475]: command: Running command: /usr/bin/ceph-osd --get-device-fsid /dev/sdg2
Mar 27 18:03:19 pve1 sh[15475]: get_space_osd_uuid: Journal /dev/sdg2 has OSD UUID 00000000-0000-0000-0000-000000000000
Mar 27 18:03:19 pve1 sh[15475]: main_activate_space: activate: OSD device not present, not starting, yet
Mar 27 18:03:20 pve1 kernel: sdg: sdg2
Mar 27 18:03:20 pve1 sh[15492]: main_trigger: main_trigger: Namespace(cluster='ceph', dev='/dev/sdg2', dmcrypt=None, dmcrypt_key_dir='/etc/ceph/dmcrypt-keys', func=<function main_trigger at 0x7fecbf24d398>,
Mar 27 18:03:20 pve1 sh[15492]: command: Running command: /sbin/init --version
Mar 27 18:03:20 pve1 sh[15492]: command_check_call: Running command: /bin/chown ceph:ceph /dev/sdg2
Mar 27 18:03:20 pve1 sh[15492]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
Mar 27 18:03:20 pve1 sh[15492]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
Mar 27 18:03:20 pve1 sh[15492]: main_trigger: trigger /dev/sdg2 parttype 45b0969e-9b03-4f30-b4c6-b4b80ceff106 uuid 4925f1a4-ad1f-408c-b61f-b42266b2177c
Mar 27 18:03:20 pve1 sh[15492]: command: Running command: /usr/sbin/ceph-disk --verbose activate-journal /dev/sdg2
Mar 27 18:03:20 pve1 systemd[1]: Stopped Ceph disk activation: /dev/sdg2.
Mar 27 18:03:20 pve1 systemd[1]: Starting Ceph disk activation: /dev/sdg2...
Mar 27 18:03:20 pve1 sh[15492]: main_trigger:
Mar 27 18:03:20 pve1 sh[15492]: main_trigger: get_dm_uuid: get_dm_uuid /dev/sdg2 uuid path is /sys/dev/block/8:98/dm/uuid
Mar 27 18:03:20 pve1 sh[15492]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
Mar 27 18:03:20 pve1 sh[15492]: command: Running command: /usr/bin/ceph-osd --get-device-fsid /dev/sdg2
Mar 27 18:03:20 pve1 sh[15492]: get_space_osd_uuid: Journal /dev/sdg2 has OSD UUID 00000000-0000-0000-0000-000000000000
Mar 27 18:03:20 pve1 sh[15492]: main_activate_space: activate: OSD device not present, not starting, yet
Mar 27 18:03:20 pve1 kernel: sdg: sdg2
Mar 27 18:03:20 pve1 kernel: sdg: sdg1 sdg2
Mar 27 18:03:20 pve1 sh[15530]: main_trigger: main_trigger: Namespace(cluster='ceph', dev='/dev/sdg2', dmcrypt=None, dmcrypt_key_dir='/etc/ceph/dmcrypt-keys', func=<function main_trigger at 0x7f408dcf7398>,
Mar 27 18:03:20 pve1 sh[15530]: command: Running command: /sbin/init --version
Mar 27 18:03:20 pve1 sh[15530]: command_check_call: Running command: /bin/chown ceph:ceph /dev/sdg2
Mar 27 18:03:20 pve1 sh[15530]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
Mar 27 18:03:20 pve1 sh[15530]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
Mar 27 18:03:20 pve1 sh[15530]: main_trigger: trigger /dev/sdg2 parttype 45b0969e-9b03-4f30-b4c6-b4b80ceff106 uuid 4925f1a4-ad1f-408c-b61f-b42266b2177c
Mar 27 18:03:20 pve1 sh[15530]: command: Running command: /usr/sbin/ceph-disk --verbose activate-journal /dev/sdg2
Mar 27 18:03:21 pve1 systemd[1]: Stopped Ceph disk activation: /dev/sdg2.
Mar 27 18:03:21 pve1 systemd[1]: Starting Ceph disk activation: /dev/sdg2...
Mar 27 18:03:21 pve1 sh[15530]: main_trigger:
Mar 27 18:03:21 pve1 sh[15530]: main_trigger: get_dm_uuid: get_dm_uuid /dev/sdg2 uuid path is /sys/dev/block/8:98/dm/uuid
Mar 27 18:03:21 pve1 sh[15530]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
Mar 27 18:03:21 pve1 sh[15530]: command: Running command: /usr/bin/ceph-osd --get-device-fsid /dev/sdg2
Mar 27 18:03:21 pve1 sh[15530]: get_space_osd_uuid: Journal /dev/sdg2 has OSD UUID 00000000-0000-0000-0000-000000000000
Mar 27 18:03:21 pve1 sh[15530]: main_activate_space: activate: OSD device not present, not starting, yet
Mar 27 18:03:21 pve1 systemd[1]: Stopped Ceph disk activation: /dev/sdg2.
Mar 27 18:03:21 pve1 systemd[1]: ceph-disk@dev-sdg2.service: Start request repeated too quickly.
Mar 27 18:03:21 pve1 systemd[1]: Failed to start Ceph disk activation: /dev/sdg2.
Mar 27 18:03:21 pve1 systemd[1]: ceph-disk@dev-sdg2.service: Unit entered failed state.
Mar 27 18:03:21 pve1 systemd[1]: ceph-disk@dev-sdg2.service: Failed with result 'start-limit-hit'.

Mar 27 18:03:21 pve1 sh[15593]: main_trigger: main_trigger: Namespace(cluster='ceph', dev='/dev/sdg2', dmcrypt=None, dmcrypt_key_dir='/etc/ceph/dmcrypt-keys', func=<function main_trigger at 0x7f6ba7770398>,
Mar 27 18:03:21 pve1 sh[15593]: command: Running command: /sbin/init --version
Mar 27 18:03:21 pve1 sh[15593]: command_check_call: Running command: /bin/chown ceph:ceph /dev/sdg2
Mar 27 18:03:21 pve1 sh[15593]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
Mar 27 18:03:21 pve1 sh[15593]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
Mar 27 18:03:21 pve1 sh[15593]: main_trigger: trigger /dev/sdg2 parttype 45b0969e-9b03-4f30-b4c6-b4b80ceff106 uuid 4925f1a4-ad1f-408c-b61f-b42266b2177c
Mar 27 18:03:21 pve1 sh[15593]: command: Running command: /usr/sbin/ceph-disk --verbose activate-journal /dev/sdg2
Mar 27 18:03:21 pve1 sh[15593]: main_trigger:
Mar 27 18:03:21 pve1 sh[15593]: main_trigger: get_dm_uuid: get_dm_uuid /dev/sdg2 uuid path is /sys/dev/block/8:98/dm/uuid
Mar 27 18:03:21 pve1 sh[15593]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
Mar 27 18:03:21 pve1 sh[15593]: command: Running command: /usr/bin/ceph-osd --get-device-fsid /dev/sdg2
Mar 27 18:03:21 pve1 sh[15593]: get_space_osd_uuid: Journal /dev/sdg2 has OSD UUID 00000000-0000-0000-0000-000000000000
Mar 27 18:03:21 pve1 sh[15593]: main_activate_space: activate: OSD device not present, not starting, yet
Mar 27 18:03:22 pve1 kernel: sdg: sdg1 sdg2
Mar 27 18:03:22 pve1 systemd[1]: ceph-disk@dev-sdg2.service: Start request repeated too quickly.
Mar 27 18:03:22 pve1 systemd[1]: Failed to start Ceph disk activation: /dev/sdg2.
Mar 27 18:03:22 pve1 systemd[1]: ceph-disk@dev-sdg2.service: Failed with result 'start-limit-hit'.
Mar 27 18:03:27 pve1 kernel: XFS (sdg1): Mounting V5 Filesystem
Mar 27 18:03:27 pve1 kernel: XFS (sdg1): Ending clean mount
Mar 27 18:03:28 pve1 kernel: XFS (sdg1): Unmounting Filesystem
Mar 27 18:03:28 pve1 kernel: sdg: sdg1 sdg2
Mar 27 18:03:28 pve1 kernel: sdg: sdg1 sdg2
Mar 27 18:03:28 pve1 systemd[1]: Starting Ceph disk activation: /dev/sdg2...
Mar 27 18:03:28 pve1 sh[15781]: main_trigger: main_trigger: Namespace(cluster='ceph', dev='/dev/sdg2', dmcrypt=None, dmcrypt_key_dir='/etc/ceph/dmcrypt-keys', func=<function main_trigger at 0x7fe9020b8398>,
Mar 27 18:03:28 pve1 sh[15781]: command: Running command: /sbin/init --version
Mar 27 18:03:28 pve1 sh[15781]: command_check_call: Running command: /bin/chown ceph:ceph /dev/sdg2
Mar 27 18:03:28 pve1 sh[15781]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
Mar 27 18:03:28 pve1 sh[15781]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
Mar 27 18:03:28 pve1 sh[15781]: main_trigger: trigger /dev/sdg2 parttype 45b0969e-9b03-4f30-b4c6-b4b80ceff106 uuid 4925f1a4-ad1f-408c-b61f-b42266b2177c
Mar 27 18:03:28 pve1 sh[15781]: command: Running command: /usr/sbin/ceph-disk --verbose activate-journal /dev/sdg2
Mar 27 18:03:29 pve1 systemd[1]: Stopped Ceph disk activation: /dev/sdg2.
Mar 27 18:03:29 pve1 systemd[1]: Starting Ceph disk activation: /dev/sdg2...
Mar 27 18:03:29 pve1 sh[15781]: main_trigger:
Mar 27 18:03:29 pve1 sh[15781]: main_trigger: get_dm_uuid: get_dm_uuid /dev/sdg2 uuid path is /sys/dev/block/8:98/dm/uuid
Mar 27 18:03:29 pve1 sh[15781]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
Mar 27 18:03:29 pve1 sh[15781]: command: Running command: /usr/bin/ceph-osd --get-device-fsid /dev/sdg2
Mar 27 18:03:29 pve1 sh[15781]: get_space_osd_uuid: Journal /dev/sdg2 has OSD UUID 00000000-0000-0000-0000-000000000000
Mar 27 18:03:29 pve1 sh[15781]: main_activate_space: activate: OSD device not present, not starting, yet
Mar 27 18:03:29 pve1 kernel: sdg: sdg1 sdg2
Mar 27 18:03:29 pve1 sh[15803]: main_trigger: main_trigger: Namespace(cluster='ceph', dev='/dev/sdg2', dmcrypt=None, dmcrypt_key_dir='/etc/ceph/dmcrypt-keys', func=<function main_trigger at 0x7f85ccd1a398>,
Mar 27 18:03:29 pve1 sh[15803]: command: Running command: /sbin/init --version
Mar 27 18:03:29 pve1 sh[15803]: command_check_call: Running command: /bin/chown ceph:ceph /dev/sdg2
Mar 27 18:03:29 pve1 sh[15803]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
Mar 27 18:03:29 pve1 sh[15803]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
Mar 27 18:03:29 pve1 sh[15803]: main_trigger: trigger /dev/sdg2 parttype 45b0969e-9b03-4f30-b4c6-b4b80ceff106 uuid 4925f1a4-ad1f-408c-b61f-b42266b2177c
Mar 27 18:03:29 pve1 sh[15803]: command: Running command: /usr/sbin/ceph-disk --verbose activate-journal /dev/sdg2
Mar 27 18:03:29 pve1 systemd[1]: Stopped Ceph disk activation: /dev/sdg2.
Mar 27 18:03:29 pve1 systemd[1]: Starting Ceph disk activation: /dev/sdg2...
Mar 27 18:03:30 pve1 sh[15803]: main_trigger:
Mar 27 18:03:30 pve1 sh[15803]: main_trigger: get_dm_uuid: get_dm_uuid /dev/sdg2 uuid path is /sys/dev/block/8:98/dm/uuid
Mar 27 18:03:30 pve1 sh[15803]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
Mar 27 18:03:30 pve1 sh[15803]: command: Running command: /usr/bin/ceph-osd --get-device-fsid /dev/sdg2
Mar 27 18:03:30 pve1 sh[15803]: get_space_osd_uuid: Journal /dev/sdg2 has OSD UUID 00000000-0000-0000-0000-000000000000
Mar 27 18:03:30 pve1 sh[15803]: main_activate_space: activate: OSD device not present, not starting, yet
Mar 27 18:03:30 pve1 pvedaemon[1929]: <root@pam> end task UPID:pve1:00003BDE:0002AF0A:58D929B1:cephcreateosd:sdg:root@pam: OK
Mar 27 18:03:30 pve1 sh[15834]: main_trigger: main_trigger: Namespace(cluster='ceph', dev='/dev/sdg2', dmcrypt=None, dmcrypt_key_dir='/etc/ceph/dmcrypt-keys', func=<function main_trigger at 0x7f625e117398>,
Mar 27 18:03:30 pve1 sh[15834]: command: Running command: /sbin/init --version
Mar 27 18:03:30 pve1 sh[15834]: command_check_call: Running command: /bin/chown ceph:ceph /dev/sdg2
Mar 27 18:03:30 pve1 sh[15834]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
Mar 27 18:03:30 pve1 sh[15834]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
Mar 27 18:03:30 pve1 sh[15834]: main_trigger: trigger /dev/sdg2 parttype 45b0969e-9b03-4f30-b4c6-b4b80ceff106 uuid 4925f1a4-ad1f-408c-b61f-b42266b2177c
Mar 27 18:03:30 pve1 sh[15834]: command: Running command: /usr/sbin/ceph-disk --verbose activate-journal /dev/sdg2
Mar 27 18:03:30 pve1 sh[15834]: main_trigger:
Mar 27 18:03:30 pve1 sh[15834]: main_trigger: get_dm_uuid: get_dm_uuid /dev/sdg2 uuid path is /sys/dev/block/8:98/dm/uuid
Mar 27 18:03:30 pve1 sh[15834]: command: Running command: /sbin/blkid -o udev -p /dev/sdg2
Mar 27 18:03:30 pve1 sh[15834]: command: Running command: /usr/bin/ceph-osd --get-device-fsid /dev/sdg2
Mar 27 18:03:30 pve1 sh[15834]: get_space_osd_uuid: Journal /dev/sdg2 has OSD UUID 00000000-0000-0000-0000-000000000000
Mar 27 18:03:30 pve1 sh[15834]: main_activate_space: activate: OSD device not present, not starting, yet
Mar 27 18:03:30 pve1 systemd[1]: Started Ceph disk activation: /dev/sdg2.
Mar 27 18:04:05 pve1 smartd[1109]: Device: /dev/sda [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 69 to 68
Mar 27 18:04:05 pve1 smartd[1109]: Device: /dev/sdc [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 117 to 118
Mar 27 18:04:05 pve1 smartd[1109]: Device: /dev/sdd [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 153 to 150
Mar 27 18:04:06 pve1 smartd[1109]: Device: /dev/sde [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 116 to 115
Mar 27 18:04:06 pve1 smartd[1109]: Device: /dev/sdf [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 59 to 58
Mar 27 18:04:06 pve1 smartd[1109]: Device: /dev/sdf [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 41 to 42
Mar 27 18:07:39 pve1 pveproxy[2843]: worker 2845 finished
Mar 27 18:07:39 pve1 pveproxy[2843]: starting 1 worker(s)
Mar 27 18:07:39 pve1 pveproxy[2843]: worker 17382 started
Mar 27 18:07:43 pve1 pveproxy[17381]: got inotify poll request in wrong process - disabling inotify
Mar 27 18:09:57 pve1 pvedaemon[1931]: <root@pam> successful auth for user 'root@pam'
root@pve1:~#
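
The interesting part seems to be ceph-disk@dev-sdg2.service failing with 'start-limit-hit'. A possible next step (a sketch; assumes the data partition is /dev/sdg1, per the mkfs output above):

Code:
# clear the start-rate-limit failure on the activation unit
systemctl reset-failed ceph-disk@dev-sdg2.service
# try activating the OSD data partition manually to surface the real error
ceph-disk --verbose activate /dev/sdg1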
 
Also, out of the blue, one OSD is reported as partially full, which is probably a false alarm...

How can one OSD be partially full when they all have the same weight?

[attached screenshot: OSD usage overview]
 