Recent content by leex12

  1. 7 OSD down across two nodes Issues Since Upgrading to v8 - HELP!

    @fiona thanks very much for the support! Switching iommu back off has stopped the crash! That helps my paranoia, as it defaulted to off in v7, so I didn't imagine the issue being related to the upgrade. I checked the BIOS on my four Dell R230 servers and it's all the same. The controllers were...
  2. 7 OSD down across two nodes Issues Since Upgrading to v8 - HELP!

    As a real dumb question - once I have made that change, how can I check that it has been applied?
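Assuming the change in question was a kernel boot parameter (e.g. `intel_iommu=off` added via `/etc/default/grub`), one way to confirm it actually took effect after the reboot is to check the running kernel's command line; a minimal sketch:

```shell
# Show the command line the running kernel actually booted with;
# after a reboot the new parameter (e.g. intel_iommu=off) should appear here.
cat /proc/cmdline

# The kernel log also reports IOMMU/DMAR state at boot, e.g.:
#   dmesg | grep -i -e DMAR -e IOMMU
# (may need root; a line like "DMAR: IOMMU enabled" means VT-d is still active)
```

If the parameter is missing from `/proc/cmdline`, the bootloader config was likely edited but `update-grub` (or the equivalent for your boot setup) was not re-run.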
  3. 7 OSD down across two nodes Issues Since Upgrading to v8 - HELP!

    @fiona really need some guidance here but I think I may have got to the bottom of this ... So to recap .. 6 server cluster, four of which are Dell R230 servers: two which have gone through the upgrade process and are all fine, and two which aren't. I upgraded my Ceph version ages ago and...
  4. 7 OSD down across two nodes Issues Since Upgrading to v8 - HELP!

    Googling around this .. a number of non-Proxmox folks saying the issue is related to NIC drivers and VT-d?
  5. Jumbo frame size set - netmtu in COROSYNC.CONF ?

    I have moved everything back to 8745
    Jun 18 18:38:44 pve01 corosync[1262]: [KNET ] pmtud: PMTUD completed for host: 2 link: 0 current link mtu: 8629
    Jun 18 18:38:44 pve01 corosync[1262]: [KNET ] pmtud: Starting PMTUD for host: 2 link: 1
    Jun 18 18:38:44 pve01 corosync[1262]: [KNET ] udp...
  6. Jumbo frame size set - netmtu in COROSYNC.CONF ?

    My question was: should I be setting netmtu to stop the messages? There is nothing exciting in my config:
    logging {
      debug: on
      to_syslog: yes
    }
    nodelist {
      node {
        name: pve01
        nodeid: 5
        quorum_votes: 1
        ring0_addr: 192.168.20.1
        ring1_addr: 10.107.0.1
      }
      node {...
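For reference, `netmtu` is a totem-section option in corosync.conf, not a nodelist one; a minimal sketch of where it would sit (the cluster name is a placeholder, and the 8745 value mirrors the figure validated between nodes elsewhere in this thread). Note that kronosnet normally discovers the usable path MTU itself via PMTUD, so setting this is optional:

```
totem {
  version: 2
  cluster_name: mycluster   # hypothetical name
  netmtu: 8745              # cap totem packets at the validated path MTU
}
```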
  7. 7 OSD down across two nodes Issues Since Upgrading to v8 - HELP!

    So I did the fresh install to see if that impacted anything .. it didn't. It had been working fine with the boot disc and an external disc. Re-added an SSD for Ceph. It worked fine for over an hour, then my console erupted with 'DMAR: ERROR: DMA PTE for vPFN'. This is in the system log and there are a lot of them...
  8. 7 OSD down across two nodes Issues Since Upgrading to v8 - HELP!

    I see a bunch of these errors on pve03 when I tried to add a new drive:
    Jun 17 09:29:54 pve03 kernel: DMAR: ERROR: DMA PTE for vPFN 0x7ee69 already set (to 7ee69003 not 262743001)
    The CRC errors look very similar to an old issue relating to the kernel. Any thoughts or suggestions? I am just...
  9. 7 OSD down across two nodes Issues Since Upgrading to v8 - HELP!

    When you look in the full log I see this: "_verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0x6706be76, expected 0xbfa2820a, device location [0x9627286000~1000], logical extent 0x100000~1000, object #-1:2c740c03:::osdmap.194823:0#"
  10. 7 OSD down across two nodes Issues Since Upgrading to v8 - HELP!

    Below is a recreated OSD .. stayed up for a few hours then died.
    grep -Hn 'ERR' /var/log/ceph/ceph-osd.9101.log
    /var/log/ceph/ceph-osd.9101.log:28764:2024-06-16T21:52:08.451+0100 754587c8a3c0 -1 ** ERROR: osd init failed: (5) Input/output error...
  11. 7 OSD down across two nodes Issues Since Upgrading to v8 - HELP!

    I have done a dirty spreadsheet across the versions, attached .. pve3 + pve4 are the nodes that have the problem
    proxmox-ve: 8.2.0 (running kernel: 6.8.4-3-pve)
    pve-manager: 8.2.2 (running version: 8.2.2/9355359cd7afbae4)
    proxmox-kernel-helper: 8.1.0
    pve-kernel-5.15: 7.4-13
    proxmox-kernel-6.8...
  12. 7 OSD down across two nodes Issues Since Upgrading to v8 - HELP!

    So I went radical .. physically removed drives from the two nodes, reformatted them and recreated new OSDs. They will work for a while then crap out. I have run three drives against a test program and they are passing, so I don't think we are looking at hard drive failure. The issue has happened...
  13. Jumbo frame size set - netmtu in COROSYNC.CONF ?

    Not sure if I was snow blind to them in v7, but since upgrading to v8 I have tons of messages 'complaining' about MTU size. Historically I have had the MTU on NICs, bridges and VLANs set to 9000. Since seeing these messages I have validated that 8745 seems to be a sweet spot working between nodes...
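One way to validate a path MTU between nodes like the 8745 figure above is to ping with the don't-fragment bit set; the ICMP payload must be the MTU minus 28 bytes of IPv4+ICMP headers (the node address below is a placeholder):

```shell
# IPv4 header (20 B) + ICMP header (8 B) = 28 bytes of overhead,
# so for a link MTU of 8745 the largest unfragmented ping payload is:
PAYLOAD=$((8745 - 28))
echo "$PAYLOAD"    # prints 8717

# Probe between nodes with the don't-fragment bit set (Linux ping);
# 192.168.20.2 is a placeholder for another node's address:
#   ping -M do -c 3 -s "$PAYLOAD" 192.168.20.2
# If that succeeds but PAYLOAD+1 fails, 8745 is the usable path MTU.
```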
  14. 7 OSD down across two nodes Issues Since Upgrading to v8 - HELP!

    14 07:00:26 pve03 systemd[1]: Starting ceph-osd@9000.service - Ceph object storage daemon osd.9000...
    Jun 14 07:00:26 pve03 systemd[1]: Started ceph-osd@9000.service - Ceph object storage daemon osd.9000.
    Jun 14 07:00:34 pve03 systemd[1]: ceph-osd@9000.service: Main process exited, code=killed...