In this case it was not a bond used for VM access, only ceph (storage). 10G interfaces -- similar to your configuration. I have removed both the bond and the bridge (both OpenVSwitch features), and it has been stable for two days. Previously we would usually last only 8-12 hours before...
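For reference, the replacement is just a plain static stanza in /etc/network/interfaces, something like this (interface name, address and MTU are placeholders, not our real values):

# /etc/network/interfaces -- ceph storage NIC without the OVS bond/bridge
auto enp130s0
iface enp130s0 inet static
        address 10.10.10.11
        netmask 255.255.255.0
        # jumbo frames only if the switches are configured for them
        mtu 9000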
I'm having the attribute type 5 error as well. For me, the temporary workaround was to remove OpenVSwitch from the interfaces that use the Mellanox mlx4 driver:
https://forum.proxmox.com/threads/pve-6-and-mellanox-4-x-drivers.56553/#post-260909
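In case it helps anyone doing the same, a quick way to confirm which interfaces are actually bound to the mlx4 driver before editing the config (enp130s0 below is a placeholder name):

# check which kernel driver is behind a given NIC
ethtool -i enp130s0 | grep ^driver
# or list the driver symlink for every physical interface at once
ls -l /sys/class/net/*/device/driver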
For what it's worth, the devices have been stable since I removed the openvswitch bonding and switching layer.
I may try adding back just the switching layer, without the LACP bond (roughly the configuration sketched below).
I haven't done any further work on compiling a newer kernel module.
Ultimately I will need all components working again as...
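If I do go the bridge-only route, the stanza in /etc/network/interfaces would look roughly like this (an untested sketch; vmbr1, enp130s0 and the address are placeholders), i.e. the OVSBridge with the NIC attached directly as an OVSPort instead of through an OVSBond:

# OVS bridge without the LACP bond -- NIC attached directly as an OVSPort
auto enp130s0
iface enp130s0 inet manual
        ovs_type OVSPort
        ovs_bridge vmbr1

auto vmbr1
iface vmbr1 inet static
        address 10.10.10.11
        netmask 255.255.255.0
        ovs_type OVSBridge
        ovs_ports enp130s0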
Thanks, Alwin. I've checked the firmware; it is current. This card was stable under PVE versions 3, 4 and 5. It also seems to be stable when not running OpenVSwitch bonding. (OpenVSwitch is running stably on adapters other than this card.)
So the failure seems to be a combination of...
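For anyone wanting to double-check their own card, the firmware version can be read either from the driver or with the mstflint package; something like this (enp130s0 is a placeholder interface name, 82:00.0 is the PCI address from the lspci output below):

# firmware version as reported by the running driver
ethtool -i enp130s0 | grep firmware-version
# or query the adapter directly via its PCI address (mstflint package)
mstflint -d 82:00.0 query | grep -i 'fw version'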
I'm having massive instability with the built-in Mellanox 4.0.0 drivers (mlx4_en). However, I don't seem to be able to compile the out-of-tree Mellanox drivers (for Debian 9.6).
Has anyone else had success or failure with this setup?
My particular cards are
lspci -k
82:00.0 Ethernet controller: Mellanox...
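For comparison with the out-of-tree build attempts, the version of the in-box driver that ships with the running kernel can be read with modinfo (output varies by kernel; this is just the check, not my actual output):

# report the in-tree mlx4 driver version for the running kernel
modinfo mlx4_en | grep -E '^(filename|version):'
modinfo mlx4_core | grep -E '^(filename|version):'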
I was able to update the kernel independently, but ceph is still stuck:
apt-get upgrade ceph
Reading package lists... Done
Building dependency tree
Reading state information... Done
Calculating upgrade... Done
Some packages could not be installed. This may mean that you have
requested an...
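In case it helps with suggestions, the sort of thing I can check and post is along these lines (package names and the repo file path are examples and may differ):

# which repo/version apt wants for the ceph packages
apt-cache policy ceph ceph-osd ceph-mon
# confirm the ceph repository entry matches the PVE release
cat /etc/apt/sources.list.d/ceph.list   # path may differ, or the entry may live in sources.list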
Fairly standard installation… and today dist-upgrade wants to remove proxmox-ve. The suggestions from other threads haven't fixed it for me. Any ideas? I'm reluctant to force it through.
apt-get dist-upgrade
Reading package lists... Done
Building dependency tree
Reading state information... Done...
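Happy to post more detail; the diagnostics I have handy are along these lines (a sketch; file paths are the defaults on my install):

# which repo/version apt would use for the core PVE packages
apt-cache policy proxmox-ve pve-manager
# confirm the PVE repository is still enabled
grep -r . /etc/apt/sources.list /etc/apt/sources.list.d/
# simulate the upgrade to see exactly what would be removed
apt-get -s dist-upgrade | grep -A20 REMOVED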
Since April 30, I've received 7 warning emails from one host, 5 from another, and 2 from a third;
each email reports the same thing: 1 currently unreadable (pending) sector, i.e. the Current_Pending_Sector count has gone to 1.
On each host it's the same model of drive, a CT1000MX500SSD1 (Crucial MX500 1TB).
Running smartctl...
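For reference, the check I run on each flagged drive is roughly this (/dev/sdX stands for whichever device the smartd mail names):

# pending/reallocated sector counters and the overall health verdict
smartctl -a /dev/sdX | grep -Ei 'pending|reallocat|overall-health'
# optionally kick off a long self-test to see whether the sector clears or gets remapped
smartctl -t long /dev/sdX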
Patrick987, I saw similar behavior after moving to luminous (on PVE 4) and again after the move to PVE 5. The OSD gets created (and is visible in the tree; see your example for osd.6). Did you ever find a good fix for this?
The manual fix for me was:
ceph osd crush add <osd.x> <weight> host=<host>...
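A concrete example of what that looks like (the OSD id, weight and host name here are purely illustrative):

# put the missing OSD back into the CRUSH map under its host bucket
ceph osd crush add osd.6 1.0 host=pve1
# verify it now appears with a weight under the host
ceph osd tree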
I've been testing PVE 5.0 with ceph 12.1.x (luminous release candidate).
• cephfs on an EC pool seems OK; I want to do a bit more testing, but performance is pretty good
• rbd on an EC pool doesn't work from the GUI yet; the CLI requires a new option, --data-pool, which specifies which EC pool to...
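For anyone who wants to try the CLI route in the meantime, it looks roughly like this (pool and image names are placeholders; the EC pool needs overwrites enabled, which requires bluestore OSDs):

# allow RBD to write to the EC pool (bluestore only)
ceph osd pool set ecpool allow_ec_overwrites true
# image metadata stays in the replicated 'rbd' pool, data objects go to the EC pool (size is in MB)
rbd create --pool rbd --size 32768 --data-pool ecpool vm-100-disk-1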