No, never done, but is it possible to open tickets even on the free license?

Have you created a Bug Report in the pve Ticket system?
Thank you! This is basically what I went through as well. Unfortunately, it did not help. On top of that, my USB NICs were added to create a bond (active-backup), but _even then_ that didn't work: the first NIC was alive but useless, so the second NIC never came into play. The only way I could get things going was to use round-robin, and I have lots of packet loss and latency. Unplugging the first NIC 'fixed' it until I could reboot.

I have similar problems: my Proxmox host drops off the network occasionally. It had been stable for over a year, but it has just started happening, so I guess it is related to an upgrade, as mentioned earlier in this thread. I have to unplug the Ethernet cable and plug it back in to get the host back.
I am running the 6.8.12-11-pve kernel.
This is what I find if I run ethtool -i enp0s31f6
Code:
driver: e1000e
version: 6.8.12-11-pve
firmware-version: 2.3-4
expansion-rom-version:
bus-info: 0000:00:1f.6
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
It just happened again, and the logs report a hardware unit hang:
dmesg | tail -100
Code:
MAC Status <40080083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[292707.471193] e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang:
  TDH <a1>
  TDT <a6>
  next_to_use <a6>
  next_to_clean <a0>
buffer_info[next_to_clean]:
  time_stamp <111694031>
  next_to_watch <a1>
  jiffies <1116dd941>
  next_to_watch.status <0>
MAC Status <40080083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[292709.455165] e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang:
  TDH <a1>
  TDT <a6>
  next_to_use <a6>
  next_to_clean <a0>
buffer_info[next_to_clean]:
  time_stamp <111694031>
  next_to_watch <a1>
  jiffies <1116de101>
  next_to_watch.status <0>
MAC Status <40080083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[292711.439132] e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang:
  TDH <a1>
  TDT <a6>
  next_to_use <a6>
  next_to_clean <a0>
buffer_info[next_to_clean]:
  time_stamp <111694031>
  next_to_watch <a1>
  jiffies <1116de8c1>
  next_to_watch.status <0>
MAC Status <40080083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[292713.486122] e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang:
  TDH <a1>
  TDT <a6>
  next_to_use <a6>
  next_to_clean <a0>
buffer_info[next_to_clean]:
  time_stamp <111694031>
  next_to_watch <a1>
  jiffies <1116df0c0>
  next_to_watch.status <0>
MAC Status <40080083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[292715.470168] e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang:
  TDH <a1>
  TDT <a6>
  next_to_use <a6>
  next_to_clean <a0>
buffer_info[next_to_clean]:
  time_stamp <111694031>
  next_to_watch <a1>
  jiffies <1116df880>
  next_to_watch.status <0>
MAC Status <40080083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[292717.454083] e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang:
  TDH <a1>
  TDT <a6>
  next_to_use <a6>
  next_to_clean <a0>
buffer_info[next_to_clean]:
  time_stamp <111694031>
  next_to_watch <a1>
  jiffies <1116e0040>
  next_to_watch.status <0>
MAC Status <40080083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[292718.471934] e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Down
[292718.558360] vmbr0: port 1(enp0s31f6) entered disabled state
[292726.136923] e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[292726.136966] vmbr0: port 1(enp0s31f6) entered blocking state
[292726.136974] vmbr0: port 1(enp0s31f6) entered forwarding state
journalctl --since "10 minutes ago" --no-pager | grep -Ei 'network|link|enp0s31f6|vmbr0|e1000e'
Code:
Jun 05 20:31:37 pve-acer-veriton kernel: e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang:
Jun 05 20:31:39 pve-acer-veriton kernel: e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang:
...
Jun 05 20:36:45 pve-acer-veriton kernel: e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang:
Jun 05 20:36:47 pve-acer-veriton kernel: e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang:
Jun 05 20:36:48 pve-acer-veriton kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Down
Jun 05 20:36:48 pve-acer-veriton kernel: vmbr0: port 1(enp0s31f6) entered disabled state
Jun 05 20:36:56 pve-acer-veriton kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Jun 05 20:36:56 pve-acer-veriton kernel: vmbr0: port 1(enp0s31f6) entered blocking state
Jun 05 20:36:56 pve-acer-veriton kernel: vmbr0: port 1(enp0s31f6) entered forwarding state
Jun 05 20:37:43 pve-acer-veriton systemd[1252867]: Listening on dirmngr.socket - GnuPG network certificate management daemon.
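Since the only way to judge any of these fixes is to wait and watch the log, it might help to count the hang and link-down messages per boot rather than eyeballing them. A rough sketch follows. The function names `count_unit_hangs` and `count_link_downs` are made up for illustration, and the match strings are simply the messages shown in the output above.

```shell
#!/bin/sh
# Count e1000e "Detected Hardware Unit Hang" lines in log text read from stdin.
# grep -c exits non-zero when it prints 0, so "|| true" keeps the status clean.
count_unit_hangs() {
    grep -c 'Detected Hardware Unit Hang' || true
}

# Count "NIC Link is Down" lines in log text read from stdin.
count_link_downs() {
    grep -c 'NIC Link is Down' || true
}

# Possible usage on the host (kernel messages of the current boot):
#   journalctl -k -b --no-pager | count_unit_hangs
#   journalctl -k -b --no-pager | count_link_downs
```

Comparing the two numbers before and after a change gives a slightly more objective picture than "it felt stable for two days".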
ip -s link show enp0s31f6
Code:
2: enp0s31f6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vmbr0 state UP mode DEFAULT group default qlen 1000
    link/ether d4:61:37:01:c8:33 brd ff:ff:ff:ff:ff:ff
    RX:  bytes        packets   errors dropped missed mcast
         35711121667  38257968  0      9749    2413   1347507
    TX:  bytes        packets   errors dropped carrier collsns
         221063815294 157587082 0      0       0       0
After consulting ChatGPT, I did the following:
Disabled Energy Efficient Ethernet (EEE)
EEE can apparently cause link flapping or power-saving quirks.
Created a /etc/systemd/system/disable-eee.service file.
Code:
[Unit]
Description=Disable EEE on enp0s31f6
After=network.target

[Service]
ExecStart=/sbin/ethtool --set-eee enp0s31f6 eee off
Type=oneshot
RemainAfterExit=true

[Install]
WantedBy=multi-user.target
Activated with:
Code:
systemctl daemon-reexec
systemctl enable --now disable-eee.service
Tuned e1000e driver settings
Created /etc/modprobe.d/e1000e.conf and filled it with:
Code:
options e1000e InterruptThrottleRate=0,0 RxIntDelay=0 TxIntDelay=0
options e1000e enable_eee=0
Applied those changes:
Code:
update-initramfs -u -k all
reboot
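One way to sanity-check an options file like the one above is to compare the parameter names in it against what the installed module advertises via `modinfo -p`. This is only a sketch: `conf_param_names` and `check_params` are hypothetical helper names, and it assumes `modinfo -p` prints one `name:description` entry per line.

```shell
#!/bin/sh
# Print the parameter names set by "options <module> ..." lines in a
# modprobe.d-style file read from stdin, one name per line.
conf_param_names() {
    awk -v mod="$1" '
        $1 == "options" && $2 == mod {
            for (i = 3; i <= NF; i++) {
                split($i, kv, "=")   # "name=value" -> kv[1] is the name
                print kv[1]
            }
        }'
}

# For each name found in the conf file, report whether the installed
# module lists a parameter of that name (assumes modinfo -p output of
# the form "name:description").
check_params() {
    supported=$(modinfo -p "$1" | cut -d: -f1)
    conf_param_names "$1" < "$2" | while read -r name; do
        if printf '%s\n' "$supported" | grep -qx "$name"; then
            echo "ok       $name"
        else
            echo "unknown  $name"
        fi
    done
}

# Possible usage on the host:
#   check_params e1000e /etc/modprobe.d/e1000e.conf
```

Any name reported as unknown is one the driver build on that host does not list, so it is worth double-checking before relying on it. `dmesg | grep e1000e` after the reboot may also show whether the module complained about a parameter.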
That seemed to work: my host was stable for 2 days. But today it acted up again, so after finding this post I applied the ethtool fix suggested here. I put it in /etc/network/interfaces like this:
Code:
iface vmbr0 inet static
        address 192.168.X.X/24
        gateway 192.168.X.X
        bridge-ports enp0s31f6
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
        post-up ethtool -K enp0s31f6 gso off gro off tso off tx off rx off rxvlan off txvlan off sg off
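To confirm that the post-up line actually took effect after a reboot or network reload, the output of `ethtool -k` (lowercase k, which lists the features) can be filtered for anything still switched on. A minimal sketch, assuming the usual `feature-name: on|off` output format. `offloads_still_on` is a made-up helper name, and the long feature names are the ones the short flags in the post-up line map to.

```shell
#!/bin/sh
# Read `ethtool -k <iface>` output on stdin and print each offload feature
# targeted by the post-up line that still reports "on".
offloads_still_on() {
    awk -F': *' '
        BEGIN {
            n = split("rx-checksumming tx-checksumming scatter-gather " \
                      "tcp-segmentation-offload generic-segmentation-offload " \
                      "generic-receive-offload rx-vlan-offload tx-vlan-offload", w, " ")
            for (i = 1; i <= n; i++) want[w[i]] = 1
        }
        {
            name = $1;  sub(/^[ \t]+/, "", name)    # drop leading indentation
            state = $2; sub(/[ \t].*$/, "", state)  # drop suffixes like "[fixed]"
            if ((name in want) && state == "on") print name
        }'
}

# Possible usage on the host:
#   ethtool -k enp0s31f6 | offloads_still_on
```

No output would mean all eight features report off. Anything printed is a feature the post-up line did not manage to disable.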
I guess all I can do now is wait and see if this helps.
Does anyone know if what I have done is legit, or if it can have unintended consequences?
I noticed that none of you have done the EEE disabling or the driver tuning. Is this something that I should perhaps remove?
Yes please. We need your help on this. Yesterday I updated my machine. Directly after the reboot, the machine was lost on the network. Cable out, cable in, and it was back again. The next morning the device was lost on the network again.

Well, would be great if somebody from the Proxmox Team would take the time to look into this...