Custom .bashrc can break VNC and live migration between cluster nodes.

Deafboy

This was supposed to be a cry for help, but while collecting the logs I managed to find and fix the cause of the problem. I'm posting this in the hope that it saves somebody time and sanity in the future.

Shortly after upgrading to 6.2, I started to see weird behavior every time the nodes in the cluster needed to exchange data in real time, specifically during live migration between nodes and when opening a noVNC session across nodes (logged in on node A, VM running on node B).

The live migration log provides no useful info:
2020-05-31 17:36:53 starting migration of VM 883 to node 'g7' (192.168.1.151)
2020-05-31 17:36:53 starting VM 883 on remote node 'g7'
2020-05-31 17:36:56 ERROR: online migrate failure - unable to detect remote migration address
2020-05-31 17:36:56 aborting phase 2 - cleanup resources
2020-05-31 17:36:56 migrate_cancel
2020-05-31 17:36:58 ERROR: migration finished with problems (duration 00:00:06)
TASK ERROR: migration problems

Live migration with the disk on local storage provides more info:
2020-05-31 17:17:18 ERROR: online migrate failure - can't open migration tunnel - got strange reply from mtunnel (']0;G7tunnel online')

And finally, the JavaScript console in the browser (while opening noVNC) confirms the suspicion:
Failed when connecting: Invalid server version G7�RFB


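The problem was the following line, which I had added to the .bashrc on each node shortly before the upgrade. It changes the window title so you can easily distinguish between multiple SSH sessions, but it also writes the escape sequence into the output of non-interactive sessions, which is where the stray "]0;G7" in the messages above comes from.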
echo -en "\033]0;NameOfTheNode\a"
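To make the failure mode concrete, here is a minimal sketch that reproduces the effect locally, without any cluster. The function names `noisy_rc` and `remote_reply` are hypothetical stand-ins (for a .bashrc that sets the title unconditionally and for a remote command whose reply is parsed by the caller); they are not part of Proxmox.

```shell
# Hypothetical stand-in for a .bashrc that sets the window title unconditionally.
noisy_rc() {
    printf '\033]0;G7\a'
}

# Hypothetical stand-in for a remote command whose reply the caller parses.
remote_reply() {
    noisy_rc
    printf 'tunnel online\n'
}

# Capture the reply the way a calling program would read it from a pipe.
reply=$(remote_reply)

# Dump the bytes: the ESC ] 0 ; G 7 BEL prefix sits in front of the real reply.
printf '%s\n' "$reply" | od -c
```

A caller that expects the reply to be exactly "tunnel online" sees the title sequence first and rejects it, which matches the "strange reply" message above.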


I'd like to take this opportunity to ask the devs: wouldn't it make sense NOT to run bash during machine-to-machine communication? And in case it's necessary, at least NOT execute the .bashrc?
 
And in case it's necessary, at least NOT execute the .bashrc?

No, you have to develop it correctly :-p
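It is expected behavior: bash reads .bashrc even when it runs a single non-interactive command over SSH, so if you run e.g. the following, you get exactly what you asked for, including anything your .bashrc prints: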

Code:
ssh <node> id

Therefore, you need to add a guard to your .bashrc so that it only prints output in interactive shell sessions and does not break the non-interactive ones (e.g. the one above). Add the following at the point in the file where non-interactive shells should stop reading:

Code:
case $- in
    *i*)  ;;
    *)    return;;
esac
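The guard works because bash puts the letter "i" into the special parameter `$-` only for interactive shells. As a rough sketch, the same pattern match can be wrapped in a function and fed sample flag strings; the name `is_interactive` is made up for this illustration, and "himBH" and "hBc" are merely typical values of `$-` for an interactive bash and for `bash -c`, which may differ on your system.

```shell
# Hypothetical helper: succeeds if the given shell flag string contains "i".
is_interactive() {
    case $1 in
        *i*) return 0 ;;   # interactive shell
        *)   return 1 ;;   # non-interactive shell (scripts, ssh <node> command)
    esac
}

# Report the mode of the shell that is running this snippet.
if is_interactive "$-"; then
    echo "interactive shell, flags: $-"
else
    echo "non-interactive shell, flags: $-"
fi
```

In a .bashrc the function is not needed; the bare `case` block with `return` is enough, since everything below it is then skipped for non-interactive shells.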

An alternative is to wrap only the offending command in an if block that tests whether standard output is a terminal. In your case:

Code:
if [ -t 1 ]
then
    echo -en "\033]0;NameOfTheNode\a"
fi
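A quick way to convince yourself that the `[ -t 1 ]` guard keeps pipes clean is to call the guarded code with its output captured. This is only a sketch: `set_title` is a hypothetical wrapper around the guarded line, and it uses `printf` instead of `echo -en` so that it also runs in plain sh.

```shell
# Hypothetical wrapper around the guarded title line from the reply above.
set_title() {
    if [ -t 1 ]; then
        # Only reached when stdout is a terminal.
        printf '\033]0;%s\a' "$1"
    fi
}

# Inside a command substitution stdout is a pipe, not a terminal,
# so the guard suppresses the escape sequence.
captured=$(set_title NameOfTheNode)

if [ -z "$captured" ]; then
    echo "nothing written to the pipe"
else
    echo "escape sequence leaked into the pipe"
fi
```

On a real cluster, a comparable check would be to run a no-op command through `ssh <node>` and pipe it into `od -c`, expecting no bytes; that is an assumption about your setup, not something tested in this thread.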
 
