pve > ceph > osd displays "partial read (500)"

RobFantini

'partial read (500)' displays in a rectangle.

The same happens when clicking on 'pools'.

I suspect it is due to running vzdump backups on the same network as Ceph, but the error continues an hour after the backup, and the network is 10G.

Does anyone have a clue what else could cause the error?
 
More info: a pve-zsync job that runs every 15 minutes had this occur for the 2nd time in 3 days:

Code:
Date: Fri, 20 Jan 2017 13:30:01
From: Cron Daemon <root@f..>
To: root@...
Subject: Cron <root@sys1> pve-zsync sync --source 10.2.2.65:111 --dest tank/pve-zsync/15Minutes --name
  pro4-15min --maxsnap 96 --method ssh

COMMAND:
  ssh root@10.2.2.65 -- pvesm path kvm-zfs:vm-111-disk-1
GET ERROR:
  Permission denied, please try again.
Permission denied, please try again.
Permission denied (publickey,password).

Job --source 10.2.2.65:111 --name pro4-15min got an ERROR!!!
ERROR Message:

The next pve-zsync run worked.

Also, the log view works on pve.
 
OK, I found a workaround, though I'm still not sure of the cause.

Notes:
Ceph seems OK; the issue just seems to be the pve screen:
Code:
s020  /fbc/adm # ceph --status
  cluster 63efaa45-7507-428f-9443-82a0a546b70d
  health HEALTH_OK
  monmap e3: 3 mons at {0=10.2.2.21:6789/0,1=10.2.2.10:6789/0,2=10.2.2.67:6789/0}
  election epoch 28, quorum 0,1,2 1,0,2
  osdmap e71: 6 osds: 6 up, 6 in
  flags sortbitwise,require_jewel_osds
  pgmap v794889: 192 pgs, 3 pools, 290 GB data, 75330 objects
  581 GB used, 2070 GB / 2651 GB avail
  192 active+clean
  client io 30584 B/s wr, 0 op/s rd, 2 op/s wr
The mons are OK per that.
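As a scripted version of that check, the health state can be pulled out of the status text; a minimal sketch, using the sample output captured above rather than a live `ceph --status` call:

```shell
# Sample status text copied from the output above; on a live node you
# would capture it with: status=$(ceph --status)
status='  cluster 63efaa45-7507-428f-9443-82a0a546b70d
  health HEALTH_OK'

# Extract the health field (second word of the "health" line).
health=$(printf '%s\n' "$status" | awk '$1 == "health" {print $2}')
echo "$health"
```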

For pve, try:
Code:
systemctl restart ceph
That did not fix the issue at pve.

Note: syslog has a lot of these, starting from the time of logging in to pve:
Code:
Jan 21 10:29:44 s020 pvedaemon[21409]: partial read
Jan 21 10:29:48 s020 pvedaemon[28890]: partial read
Jan 21 10:29:51 s020 pvedaemon[21409]: partial read
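To check whether a node is still throwing these, the messages can be counted in the log; a minimal sketch against a sample log file (on a live node you would point grep at /var/log/syslog instead):

```shell
# Build a small sample log from the lines above (illustration only).
cat > /tmp/sample-syslog <<'EOF'
Jan 21 10:29:44 s020 pvedaemon[21409]: partial read
Jan 21 10:29:48 s020 pvedaemon[28890]: partial read
Jan 21 10:29:51 s020 pvedaemon[21409]: partial read
EOF

# Count the "partial read" lines.
count=$(grep -c 'partial read' /tmp/sample-syslog)
echo "$count"
```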

Try:
Code:
# systemctl restart pvedaemon

On that node, pve > ceph > osd and pools are normal. Error fixed!

Not so on the other nodes. I did not wait more than a minute after fixing one node, so it is possible the other nodes could have self-fixed after a while.

On all nodes, run systemctl restart pvedaemon.

That fixed the issue. Note there was no need to restart Ceph on the other nodes.
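The per-node restarts can be scripted in one loop; a sketch, assuming hypothetical node names and root SSH access between nodes (it prints each command as a dry run; drop the echo to actually execute):

```shell
# Hypothetical node names -- replace with your cluster's hosts.
NODES="s020 s021 s022"

for node in $NODES; do
    cmd="ssh root@$node systemctl restart pvedaemon"
    # Dry run: print the command instead of running it.
    echo "$cmd"
done
```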
 
