Node inaccessible after upgrading from 2.3 to 3.1

Sparky · Nov 28, 2013

Hey Guys,

For two months now we have had a two-node cluster running in our noise chamber.
Both nodes were running Proxmox 2.3 and worked as well as could be expected for the way they were configured.
The problem started today.
We noticed that version 2.3 is already outdated, and we were about to set up HA clustering.
We installed a third node and added it to the cluster, then we decided to upgrade all
nodes one after another.
The third and second nodes went fine; neither was holding any virtual machines at the time.
At that point every node was still accessible through the others' web interfaces.
One of the virtual machines is our firewall, the other one our DNS, so I prepared the server
according to the wiki up to this step:

Code:

./pve-upgrade-2.3-to-3.0 --download-only

After shutting down all the virtual machines I went ahead with the upgrade.

So, this is the part, I guess, where a lot of stuff started to go wrong.
The node told me the upgrade had failed. My guess was that it just couldn't download some final updates,
since it was hosting the firewall VM I had shut down, so I rebooted it and tried to access it.

First I tried to access it from another node and got an error message
that Google hasn't turned up anything useful for so far.

Code:

Connection error 595: Connection refused

Damn it. The other node said the same.

Then I tried to access node 1 directly, but the page was not reachable.
Double Damn.

So I did some testing with pings, SSH sessions and so on, and found that node 1 is still reachable from everywhere: ping, SSH, whatever WinSCP uses, everything works.

My next attempt was to shut down all the nodes and bring them back up one after another: node one as the cluster master, then node two, then node three.
But right after booting node one I noticed that nothing had improved.

So I started to migrate all the virtual machines to node two, noticed that error 595 was still present, and wanted to restore the backups I had made on the NAS before I started. I only found one of the three backups.

Then, after seeing that the little icon for node one is red in the web interface, I found a suggestion to check several services (cluster, daemon and so on).

I got the hint to start the Apache service manually and received a message saying that the DocumentRoot is missing.

Code:

[....] Starting web server: apache2Warning: DocumentRoot [/usr/share/pve-manager/root] does not exist
[Thu Nov 28 15:27:07 2013] [warn] NameVirtualHost *:80 has no VirtualHosts
Action 'start' failed.
The Apache error log may have more information.
 failed!

That is all I got for today.
Since node one has locked itself up, I couldn't try to run the upgrade again. Getting the firewall working on node two didn't help either, as there are some unknown settings.

I would really appreciate your help; tomorrow is my last chance to get this cluster working.
Any hints on how to fix this problem?
Can I roll back the node to its state before the upgrade?
How do I get the config files of the virtual machines so I can easily restart them on the other nodes?

Thanks for reading this much text, I hope we can fix this together. My classmates already tried to hurt me because the internet was down.

greetings

m.ardito · Nov 28, 2013

Just one quick note: PVE 3.x no longer uses apache2; it has its own web server (IIRC it's the pveproxy service). If you upgraded, apache2 is still there but useless (you can also remove it).
And remember to connect to "https://ip:8006", as no redirection is made anymore if you simply enter "ip" in the web browser...
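
For anyone who wants to script that check: a minimal sketch, assuming Python 3 on any machine that can reach the node. The names `port_open` and `web_ui_ports` are made up for this example and are not part of Proxmox. Port 8006 is the one mentioned above; 443 and 80 are only included as an assumption, to see whether the leftover apache2 is still answering.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and unreachable hosts.
        return False


def web_ui_ports(host, ports=(8006, 443, 80), timeout=3.0):
    """Map each candidate web port to whether something is listening on it."""
    return {port: port_open(host, port, timeout) for port in ports}


# Example with a placeholder address:
# print(web_ui_ports("192.0.2.10"))
```

If the node answers ping and SSH but 8006 comes back False, that would point at the web service on the node rather than at the network.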

Marco

Sparky · Nov 28, 2013

Thanks for that hint, good to know Apache is useless here by now.
I did remember to connect via https instead of http and didn't forget the port either, but thanks for the kind reminder.

Any more help and ideas are appreciated.

grz

Sparky · Dec 11, 2013

Well let me give you guys a short update on this.

I actually managed to get my node working again after a night of nightmares.

Here is what I did wrong:

I followed the tutorial on how to upgrade to version 3.1 and got to this point:

Code:

./pve-upgrade-2.3-to-3.0 --download-only

I assumed this would download everything necessary, so that the upgrade could then run without an internet connection.
As you might guess, it didn't.

"There I fixed it":

Somehow two of the three backups got lost; otherwise it would have been a piece of cake to restore my firewall.
Using my notebook as a network bridge between WLAN and LAN and giving it the gateway's address, I was able to rerun the upgrade process.
After the rerun everything came up without errors. Problem solved.
I made new backups and everything is fine again, the project can go on.

Summary:
- Messed up the upgrade process
- Reconnecting the Node to the internet and rerunning the upgrade fixed everything
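
The lesson above can be turned into a small pre-flight check: before shutting down the VM that provides the node's internet access, confirm that the node can still reach its package repositories. This is a rough sketch only, assuming Python 3 and Debian-style `deb` lines. `repo_hosts` and `unreachable` are hypothetical helper names, and reading `/etc/apt/sources.list` is an assumption about where the repositories are listed.

```python
import socket
from urllib.parse import urlparse


def repo_hosts(sources_text):
    """Return the set of (host, port) pairs named by deb/deb-src lines."""
    hosts = set()
    for line in sources_text.splitlines():
        # Drop comments and surrounding whitespace.
        parts = line.split("#", 1)[0].split()
        if len(parts) < 3 or parts[0] not in ("deb", "deb-src"):
            continue
        # The URL is the first field containing "://" (skips [options]).
        url = next((p for p in parts[1:] if "://" in p), None)
        if url is None:
            continue
        parsed = urlparse(url)
        if parsed.hostname:
            default = 443 if parsed.scheme == "https" else 80
            hosts.add((parsed.hostname, parsed.port or default))
    return hosts


def unreachable(hosts, timeout=5.0):
    """Return the (host, port) pairs that do not accept a TCP connection."""
    failed = []
    for host, port in sorted(hosts):
        try:
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError:
            failed.append((host, port))
    return failed


# Example (path is an assumption about the node's APT setup):
# with open("/etc/apt/sources.list") as f:
#     print(unreachable(repo_hosts(f.read())))
```

An empty list means every repository host accepted a connection; anything else is worth fixing before starting the upgrade.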

Have a great day

m.ardito · Dec 11, 2013

Sparky said:
I followed the tutorial on how to upgrade to version 3.1 and got to this point:

Code:

./pve-upgrade-2.3-to-3.0 --download-only

I assumed this would download everything necessary, so that the upgrade could then run without an internet connection.
As you might guess, it didn't.

Hi, it should have ended with something like

Code:

Fetched 277 MB in 6min 7s (752 kB/s) 
Download complete and in download only mode 
download successful

did you notice if it did?

Marco

Sparky · Dec 11, 2013

m.ardito said:
Hi, it should have ended with something like

Code:

Fetched 277 MB in 6min 7s (752 kB/s)
Download complete and in download only mode
download successful

did you notice if it did?

Marco

Hi Marco,

yes, I remember that it said everything was downloaded successfully.
That was the reason I proceeded at all. It looked right to me at that moment.^^

The odyssey started with the upgrade process itself, which failed because of the missing internet connection.

Greetings

m.ardito · Dec 12, 2013

Yes, I tried it and the same thing happens. It still needs to download or it fails...

I've updated the wiki page: http://pve.proxmox.com/mediawiki/index.php?title=Upgrade_from_2.3_to_3.0&diff=6140&oldid=6095

Marco

Sparky · Dec 12, 2013

Glad my failure helped to improve the wiki at least.
