Proxmox VE 5.0 released!

Status
Not open for further replies.

BloodyIron

Active Member
Jan 14, 2013
193
4
38
it.lanified.com
Another bit of info.

Prod node 2, upgraded from 4.4 to 5.0, keeps the old eth0/eth1 interface naming, and it has the /etc/udev/rules.d/70-persistent-net.rules file

Prod node 1, reinstalled from scratch 5.0 release, has the "new" renaming of interfaces to enp4s0 or whatever, and I manually created the /etc/udev/rules.d/70-persistent-net.rules file to match the relevant MAC address and other info

However, Prod node 1 does not honour the udev file. My reading suggests I need to add GRUB_CMDLINE_LINUX_DEFAULT="net.ifnames=0" to grub. However, Prod node 2 does NOT have this in its grub conf file, so I am very confused as to how Prod node 2 is doing what I want without having the configuration I would expect.

I have been traipsing across the config files for quite a while now, and I'm going to hold off on adding the grub flag, but I'd like some dev direction as to how this was handled, as I want both nodes to be consistently configured.
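For anyone comparing two nodes the same way, here is a rough sketch of the checks described above: whether net.ifnames=0 is on the running kernel command line, whether the 70-persistent-net.rules file exists, and what the interfaces are currently called. The function names has_ifnames_off and report_naming_setup are made up for this example, and the sketch only reports state; it does not explain why an upgraded node keeps its old names.

```shell
#!/bin/sh
# has_ifnames_off: succeed if the given kernel command line string
# contains the net.ifnames=0 flag as a separate word.
has_ifnames_off() {
    case " $1 " in
        *" net.ifnames=0 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# report_naming_setup: print what the current node is doing.
# (Hypothetical helper; paths are the ones mentioned in the post.)
report_naming_setup() {
    cmdline=$(cat /proc/cmdline 2>/dev/null)
    if has_ifnames_off "$cmdline"; then
        echo "kernel cmdline: net.ifnames=0 is set"
    else
        echo "kernel cmdline: net.ifnames=0 is not set"
    fi

    rules=/etc/udev/rules.d/70-persistent-net.rules
    if [ -f "$rules" ]; then
        echo "udev rules: $rules exists"
    else
        echo "udev rules: $rules is missing"
    fi

    # Interface names as the kernel currently exposes them
    ls /sys/class/net 2>/dev/null
}
```

Running report_naming_setup on both nodes and comparing the output side by side should at least show where the two configurations actually differ.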
 

BloodyIron

Active Member
Jan 14, 2013
193
4
38
it.lanified.com
Another update to my quest for GLORY!

Turns out Prod node 1, which was an install from scratch of 5.0 in our last episode, had a few packages to update. I updated them, and now have the bond0 LACP working. I don't think I did anything special: apt update, apt upgrade, install the presented packages (not from the enterprise repo, btw). I then configured bond0 and such identically to Prod node 2.

The logs are not throwing the address errors any more, the switch ports report matching configuration with Prod node 2, and I was able to successfully migrate a test VM on AND off it.

It looks like it's working, except I would like to eventually get the interface naming back to eth0/eth1 stuff for consistency purposes, but I don't yet understand how Prod node 2 _can_ do that without having that grub flag.

This has been some adventure so far :S
 

PigLover

Well-Known Member
Apr 8, 2013
101
33
48
Okay so the node I upgraded from 4.4 to 5.0 has ifconfig

but the node I rebuilt from scratch 5.0 does not have ifconfig... what.. the hell... :(

EDIT: fresh installs do not get the package "net-tools", but upgraded ones retain it. This is how I got my precious ifconfig back.
This isn't really a Proxmox issue - it's Debian. ifconfig (and the rest of net-tools) has been deprecated in Stretch. The Debian community is trying to force the transition to new tools (ip, iw, etc). The old tools are still in the repos and can be installed with apt (as you noted) but are not installed by default anymore.
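If you have scripts that need to run on both upgraded and freshly installed nodes, a small wrapper along these lines can paper over the difference. This is only a sketch: pick_addr_tool and show_addrs are names invented here, and it assumes nothing beyond the ip and ifconfig commands themselves.

```shell
#!/bin/sh
# pick_addr_tool: print the name of the first available tool for listing
# addresses, preferring iproute2's "ip" over the deprecated "ifconfig".
pick_addr_tool() {
    if command -v ip >/dev/null 2>&1; then
        echo ip
    elif command -v ifconfig >/dev/null 2>&1; then
        echo ifconfig
    else
        echo none
    fi
}

# show_addrs: list interface addresses with whichever tool is installed.
show_addrs() {
    case "$(pick_addr_tool)" in
        ip)       ip addr show ;;
        ifconfig) ifconfig -a ;;
        *)        echo "neither ip nor ifconfig found; try: apt install net-tools" >&2
                  return 1 ;;
    esac
}
```

Calling show_addrs then works the same on either node, whether or not net-tools is present.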
 

gsupp

Member
Jun 27, 2017
38
14
8
TX, USA
has the "new" renaming of interfaces to enp4s0
Yeah, there seem to be a lot of changes in Debian Stretch like that. I have a server that has interfaces enp3s0, enp5s0 and enp11s0. Talk about confusing, it's not even sequential. If it wasn't for Proxmox allowing me to put a comment for each network interface in the GUI, I'd never keep them straight. Losing "ifconfig" is another one. I've been trying to get used to using "ip a" (hey, it's shorter!) but I'm not a huge fan of the way it displays info. At any rate, glad updating all system packages solved your LACP issue. I believe I read somewhere that "apt dist-upgrade" is the preferred way to make sure all the Proxmox packages are updated... but I'd probably have to dig a bit to find where I saw that.
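As a sketch of the update sequence mentioned above (package list refresh followed by a dist-upgrade), something like the following could be used. The full_upgrade function and the DRY_RUN variable are my own conventions for this example, not anything Proxmox ships; with DRY_RUN=1 the commands are only printed.

```shell
#!/bin/sh
# full_upgrade: refresh the package lists, then run a dist-upgrade.
# DRY_RUN=1 (a convention invented for this sketch) prints the commands
# instead of executing them. Real runs need root.
full_upgrade() {
    for cmd in "apt update" "apt dist-upgrade"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "would run: $cmd"
        else
            # Intentionally unquoted so the command and its argument split
            $cmd || return 1
        fi
    done
}
```

Trying DRY_RUN=1 full_upgrade first shows what would be executed before anything on the node is touched.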
 

PigLover

Well-Known Member
Apr 8, 2013
101
33
48
The NIC naming is also more of a Debian issue than a Proxmox issue (actually, more of a "mainline Linux" issue since it is currently being adopted by most all major distributions as they get onto the 4.x kernel train).

I actually got through the "new" naming convention for network interfaces last year with Ubuntu 16.04. It is confusing at first, but the reason they did it is sound. Having deterministic and predictable interface names that aren't re-evaluated on every boot is really important. And the old method of setting up a udev rule to match the MAC was a bit crude - it broke if you had to replace the NIC for maintenance or move the image to a different, though identically configured, server. The "new" method results in stable interface naming across both of those - and many other - real world scenarios.
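To make names like enp4s0 or enp11s0 a little less opaque, here is a small sketch that takes apart the PCI-path form of the name: a two-letter type prefix (en for Ethernet, wl for WLAN), then p<bus>, s<slot> and optionally f<function>. The function name decode_pci_ifname is invented for this example, and it only handles that one form, not the onboard (eno1), hotplug-slot (ens1) or MAC-based variants.

```shell
#!/bin/sh
# decode_pci_ifname: explain a predictable interface name of the form
# <en|wl>p<bus>s<slot>[f<function>]. Other naming forms are rejected.
decode_pci_ifname() {
    name=$1
    case "$name" in
        en*) kind=ethernet ;;
        wl*) kind=wlan ;;
        *)   echo "unrecognised: $name"; return 1 ;;
    esac

    rest=${name#??}                 # strip the two-letter type prefix
    case "$rest" in
        p[0-9]*s[0-9]*) ;;
        *) echo "not a PCI path name: $name"; return 1 ;;
    esac

    bus=${rest#p}                   # drop the leading "p"
    bus=${bus%%s*}                  # keep everything before the first "s"
    slot=${rest#*s}                 # keep everything after the first "s"

    case "$slot" in
        *f*)
            func=${slot#*f}
            slot=${slot%%f*}
            echo "$kind, PCI bus $bus, slot $slot, function $func"
            ;;
        *)
            echo "$kind, PCI bus $bus, slot $slot"
            ;;
    esac
}
```

For example, decode_pci_ifname enp11s0 reports bus 11, slot 0, which is why the numbers follow where the card sits on the PCI bus rather than counting up sequentially.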
 

BloodyIron

Active Member
Jan 14, 2013
193
4
38
it.lanified.com
Okay after a few hours or something the "received packet on bond0 with own address as source address" error is coming back up again. This is really frustrating and I'm just going to disable the bonding until I get some dev response here :/

I have absolutely no clue what the root cause is, but I see no evidence that it's the switch.

Perhaps, actually, the issue might arise when I put a large amount of load over the LACP, like migrating 8 VMs, but done 3 at a time. That seems to be a consistent possible cause.
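To test the load theory, it may help to count how often the message appears before and after a batch of migrations. A minimal sketch, assuming the kernel log is piped in on stdin; count_own_addr_errors is a name made up for this example.

```shell
#!/bin/sh
# count_own_addr_errors: read log text on stdin and print how many lines
# contain the "own address as source address" warning for the given
# interface (defaults to bond0).
count_own_addr_errors() {
    iface=${1:-bond0}
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so the exit status is ignored on purpose
    grep -c "received packet on $iface with own address as source address" || true
}
```

Typical use would be dmesg | count_own_addr_errors, taken once before starting the migrations and once afterwards. The bonding driver's own view of the link is in /proc/net/bonding/bond0, which is worth comparing between the two nodes as well.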
 

BloodyIron

Active Member
Jan 14, 2013
193
4
38
it.lanified.com
I don't mean to be rude, but have you read through all of my prior messages? I've been very exhaustive in my testing, and I have performed a good amount of switch-centric testing. If you haven't had a chance to review what I wrote, please do, and share your thoughts. :)

Could also be a switch problem. Bonding works by replacing the original MAC in the packet with the MAC of the actual NIC used for the transport.
 

mir

Renowned Member
Apr 14, 2012
3,489
97
68
Copenhagen, Denmark
I don't mean to be rude, but have you read through all of my prior messages? I've been very exhaustive in my testing, and I have performed a good amount of switch-centric testing. If you haven't had a chance to review what I wrote, please do, and share your thoughts. :)
I was referring to this:
"Perhaps, actually, the issue might arise when I put a large amount of load over the LACP, like migrating 8 VMs, but done 3 at a time. That seems to be a consistent possible cause."
It's a known fact that if you stress a SOHO switch it will begin to broadcast packets on every port in a LAGG. This is also true for enterprise switches if your LAGG is configured on the default VLAN (VLAN 1). The default VLAN in any switch is handled in software, so never use the default VLAN for anything except perhaps a management VLAN.
 

BloodyIron

Active Member
Jan 14, 2013
193
4
38
it.lanified.com
Yeah, I'm using an Avaya 4548GT, and the "Prod" node 2 is also using LACP in literally the exact same configuration, and has not failed once. This single node is the consistent failure point. When doing the live migration of this many VMs, it's coming from Prod 2, to Prod 1, and Prod 2 is on the same switch with the same port configuration.

Also consider this was working _before_ the 5.0 upgrade when both were on 4.4, with the same switch.


I was referring to this:
"Perhaps, actually, the issue might arise when I put a large amount of load over the LACP, like migrating 8 VMs, but done 3 at a time. That seems to be a consistent possible cause."
It's a known fact that if you stress a SOHO switch it will begin to broadcast packets on every port in a LAGG. This is also true for enterprise switches if your LAGG is configured on the default VLAN (VLAN 1). The default VLAN in any switch is handled in software, so never use the default VLAN for anything except perhaps a management VLAN.
 

Leo David

Member
Apr 25, 2017
69
1
8
40
Hi.
I'm using PVE 4.3 and an external Ceph Jewel storage. Can I safely upgrade to PVE 5.0?
 

wolfgang

Proxmox Staff Member
Staff member
Oct 1, 2014
5,105
338
103
Hi.
I'm using PVE 4.3 and an external Ceph Jewel storage. Can I safely upgrade to PVE 5.0?
If you use this as a production system, I would wait to upgrade until Ceph is no longer RC.
Sorry, I overlooked the "external" :)
By default we use the Jewel librbd, which works perfectly with an external Ceph cluster.
 

joulester

New Member
May 1, 2017
21
1
3
31
Do I still use the "sed -i 's/jessie/stretch/g' /etc/apt/sources.list.d/pve-enterprise.list" if I don't have a subscription?
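One way to see what that sed does before touching a real file is to try it on a scratch copy. The sketch below is only that: switch_release is a name invented here, the sample repository line is illustrative, and which file actually carries your Proxmox repository entry without a subscription is something to check on your own system rather than take from this example.

```shell
#!/bin/sh
# switch_release: rewrite every "jessie" to "stretch" in the given file,
# keeping a .bak copy of the original next to it.
switch_release() {
    sed -i.bak 's/jessie/stretch/g' "$1"
}

# Try it on a scratch file first (sample line only; check your real file).
scratch=$(mktemp)
echo 'deb http://download.proxmox.com/debian jessie pve-no-subscription' > "$scratch"
switch_release "$scratch"
cat "$scratch"
```

The same function can then be pointed at whichever list file under /etc/apt/ actually contains your jessie entries, and the .bak copy makes it easy to revert.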
 

Leo David

Member
Apr 25, 2017
69
1
8
40
If you use this as production system I would wait for the upgrade until ceph is no longer RC.
Thanks. So it is not recommended to use PVE 5 with an external Ceph below Luminous?
I've just reinstalled Ceph Jewel in production, and thought that I could benefit from PVE 5 features with this new Ceph cluster...
 

Rhinox

Active Member
Sep 28, 2016
272
36
28
29
From what I've been reading the naming is actually a systemd thing, not kernel or debian specific. Hence 16.04 is systemd ;)
More exactly, it is a udev thing. Even distros without systemd use these new "predictable network interface names"...
 

martin

Proxmox Staff Member
Staff member
Apr 28, 2005
645
374
83
Thanks a lot for all the feedback!

As this thread has filled up with too many topics covering a lot of different issues, it's too hard to follow, so I am closing it to further posting.

For all further questions and issues, please just open new threads!

Martin
 