Unable to find serial interface - console problem

Tomasz Krzywicki

New Member
Apr 15, 2021
root@gda-host-vm-15:~# pveversion
pve-manager/6.3-6/2184247e (running kernel: 5.4.103-1-pve)

Hi,
I have had this problem for a month or two. Everything looks like it is working fine, but when I open the console to a virtual server from the Proxmox web interface, I see what is shown in the attached picture. At that moment, connections via SSH (Linux guests) or Remote Desktop (Windows Server 2016 guests) do not work either. Not all virtual servers on a host are affected. What helps is rebooting the host. Restarting the affected virtual server often does not work (stop has to be used instead of shutdown or restart), and a freshly started/restarted virtual server works only for a short time or shows the problem from the picture again immediately. It looks as if the virtual server keeps running correctly but Proxmox cuts off access to it (though I am not sure).
Does anybody know what this is? Thank you for any suggestions.
Regards,
Tomasz
 

Attachments

  • ProxmoxError.jpg
did you configure the display to be a serial console? did you configure a serial terminal?

can you post the vm config? (qm config ID)
 
Hi,
here is a sample config. We have never configured the display to be a serial console. When a VM is created, the graphics card is left at the default and the QEMU agent is selected. I have not configured a serial terminal because it was never needed.
Since you are asking about it ... does the latest version of Proxmox need it? I never saw this problem on Proxmox version 5.4-15 and earlier.

qm config 3111
agent: 1
balloon: 0
boot: c
bootdisk: scsi0
cores: 2
description:
memory: 16384
name: gda-phx-22
net0: virtio=D6:13:DC:7D:FE:5B,bridge=vmbr0
numa: 0
onboot: 1
ostype: win10
scsi0: sdf_pool:vm-3111-disk-0,cache=writeback,discard=on,size=228G
scsi1: sdf_pool:vm-3111-disk-1,cache=writeback,discard=on,size=700G
scsihw: virtio-scsi-pci
smbios1: uuid=82e7be30-032f-4611-ad74-a392a0a307bc
sockets: 1
 
for xtermjs you need a serial terminal configured in the vm, since it is text based (in contrast to novnc which is display based)
so if you do not want to configure a serial interface (also in the vm) just use novnc instead
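for reference, a rough sketch of how that could look (the usual 'qm set' serial config on the host plus a getty in a linux guest; vmid 3111 and ttyS0 are just examples here):

Code:
# host side: add a serial port to the vm (example vmid 3111)
qm set 3111 -serial0 socket
# optionally route the display to it as well
qm set 3111 -vga serial0

# guest side (linux): start a login getty on the serial port
systemctl enable --now serial-getty@ttyS0.service
# if you also want boot messages there, add console=ttyS0 to the kernel cmdline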
 
Do I need a serial terminal? I have never used one. Do I need noVNC?
We use PuTTY's SSH and MS Remote Desktop. The console in the Proxmox web interface is only used for admin work or when the network card does not work.
When the problem occurs, it is not possible to connect to the VM via SSH, Remote Desktop, or the built-in console. If noVNC or a serial terminal would still work, fine, then I need it, but why do the other methods stop working? What should users who use RDP or SSH do? Change their tool ...?
Should I instruct them: please use Remote Desktop, but when it stops working and the VM looks dead, use noVNC instead? Is that correct?
Where can I download noVNC to share it with users?
 
i am not completely sure what the issue here is.

if you use the web ui to connect to vms, this will use novnc (the built in web vnc client) by default. you either clicked explicitly on the xtermjs button, or changed the default in 'datacenter->options' to xtermjs.
this will not work if no serial terminal is configured in the vm.

you do not have to use xtermjs, this is a choice you have to make.

if your other methods of accessing the vm do not work, i would check the logs (especially inside the vm) to see what happened.
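you can also check which default is currently set without the gui; a quick sketch, assuming the setting lives under the 'console' key in /etc/pve/datacenter.cfg (that is where 'datacenter -> options' stores it):

Code:
# no output means the built-in default (novnc) is in effect
grep '^console' /etc/pve/datacenter.cfg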
 
Here is the syslog from the host where VM id 378 runs. This machine showed "Unable to find serial interface" in the console last Friday and again today between 8 and 9 am. I tried again today after a minute or two and the console showed the correct Linux login screen (surprise). The Linux guest has no graphical interface.
 

Attachments

  • host18syslog.TXT
Another host and VM, id 3105. It showed the problem on Friday and again today at 10:13 am when I tried before writing this. Let me try now while writing ... yes, the second try shows the correct screen in the console.

Since it now behaves as strangely as I am describing ... maybe I should write this down.

We have seen this problem since we started using version 6, but in the last few days it happens much more often. Something happened last week. Host 15 had a problem with one disk: 3 sectors were pending, the controller marked the disk as foreign and it was no longer visible in the system. This happens when the disk tries to read pending sectors and copy them to reserved sectors; sometimes this uses up all its time, the RAID card cannot communicate with it and marks it foreign (Dell R730 server with a PERC 710 controller). When this happened the first time, many VMs on host 15 showed the problems described here (but so did some on other hosts). The host was restarted, the foreign configuration was imported, and I was able to copy the VM images to other disks. Many VMs on this host started and worked fine for the first minutes but showed the described problem after that. Yesterday the disk was removed and replaced, the host was restarted, and the VMs on this host seem to work fine now.
If the problem were connected to that disk, fine, many VMs on that host could behave strangely, but why do some VMs on other hosts show the same problem?
We also see the same problem for a small number of VMs from time to time without any disk malfunction. Can it be connected to a disk problem or disk utilisation problem? I do not think so, since it can happen on different hosts at the same time.
 

Attachments

  • host13syslog.TXT
your qm config above does not show that you configured a serial terminal for the vm, did you do that?
if you did not configure a serial terminal for the vms, this can *never* work.

as for the problems with accessing the vm, check the syslog *inside* the vm to see what's going on; the host syslog does not show anything
 
I never thought I could need a serial terminal. Even now I do not see where it would be useful, since we have the console in the web interface.
Here are the log files from VMs 3105 and 378.
 

Attachments

  • logs.zip
nothing obvious in the logs (i guess the restart of the vm was you resetting/rebooting them?)

what does 'qm status ID --verbose' say when the console access does not work anymore?
 
VMs 3105 and 378 were not restarted. Neither has been restarted for 2 weeks (or more).

This time VM 383.
1. Status:
Code:
qm status 383 --verbose
Configuration file 'nodes/gda-host-vm-13/qemu-server/383.conf' does not exist
root@gda-host-vm-13:~# ssh 10.164.112.120
Linux gda-host-vm-20 5.4.78-2-pve #1 SMP PVE 5.4.78-2 (Thu, 03 Dec 2020 14:26:17 +0100) x86_64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Thu Apr 15 15:15:53 2021 from 10.164.112.101
root@gda-host-vm-20:~# qm status 383 --verbose
balloon: 17179869184
ballooninfo:
        actual: 17179869184
        free_mem: 212938752
        last_update: 1618920971
        major_page_faults: 453607588
        max_mem: 17179869184
        mem_swapped_in: 0
        mem_swapped_out: 0
        minor_page_faults: 18757144598
        total_mem: 16825561088
blockstat:
        scsi0:
                account_failed: 1
                account_invalid: 1
                failed_flush_operations: 0
                failed_rd_operations: 0
                failed_unmap_operations: 0
                failed_wr_operations: 0
                flush_operations: 2842211
                flush_total_time_ns: 42370158351
                idle_time_ns: 1198840823
                invalid_flush_operations: 0
                invalid_rd_operations: 0
                invalid_unmap_operations: 0
                invalid_wr_operations: 0
                rd_bytes: 2859259603456
                rd_merged: 0
                rd_operations: 71416103
                rd_total_time_ns: 8257386909538
                timed_stats:
                unmap_bytes: 8519680
                unmap_merged: 0
                unmap_operations: 80
                unmap_total_time_ns: 8489743
                wr_bytes: 150045777920
                wr_highest_offset: 203856613376
                wr_merged: 0
                wr_operations: 18888038
                wr_total_time_ns: 2260297401988
        scsi1:
                account_failed: 1
                account_invalid: 1
                failed_flush_operations: 0
                failed_rd_operations: 0
                failed_unmap_operations: 0
                failed_wr_operations: 0
                flush_operations: 7751101673
                flush_total_time_ns: 97391398066616
                idle_time_ns: 23612519
                invalid_flush_operations: 0
                invalid_rd_operations: 0
                invalid_unmap_operations: 0
                invalid_wr_operations: 0
                rd_bytes: 18250860209152
                rd_merged: 0
                rd_operations: 1101153756
                rd_total_time_ns: 146437007665709
                timed_stats:
                unmap_bytes: 31791870066688
                unmap_merged: 0
                unmap_operations: 31734031
                unmap_total_time_ns: 18574061242054
                wr_bytes: 95228481683456
                wr_highest_offset: 1610612736000
                wr_merged: 0
                wr_operations: 8652802833
                wr_total_time_ns: 458510219443838
cpus: 4
disk: 0
diskread: 21110119812608
diskwrite: 95378527461376
freemem: 212938752
maxdisk: 214748364800
maxmem: 17179869184
mem: 16612622336
name: gda-docker-01
netin: 19793191896167
netout: 17954091915254
nics:
        tap383i0:
                netin: 19793191896167
                netout: 17954091915254
pid: 2938
proxmox-support:
        pbs-dirty-bitmap: 1
        pbs-dirty-bitmap-migration: 1
        pbs-library-version: 1.0.2 (18d5b98ab1bec4004178a0db6f2debb83bfa9165)
        query-bitmap-info: 1
qmpstatus: running
running-machine: pc-i440fx-5.1+pve0
running-qemu: 5.1.0
status: running
template:
uptime: 5128168
vmid: 383

2. Check the console, see the picture.

3. Status
Code:
root@gda-host-vm-20:~# qm status 383 --verbose
balloon: 17179869184
ballooninfo:
        actual: 17179869184
        max_mem: 17179869184
blockstat:
        scsi0:
                account_failed: 1
                account_invalid: 1
                failed_flush_operations: 0
                failed_rd_operations: 0
                failed_unmap_operations: 0
                failed_wr_operations: 0
                flush_operations: 2842450
                flush_total_time_ns: 42373208141
                idle_time_ns: 2647535460
                invalid_flush_operations: 0
                invalid_rd_operations: 0
                invalid_unmap_operations: 0
                invalid_wr_operations: 0
                rd_bytes: 2859799280128
                rd_merged: 0
                rd_operations: 71429738
                rd_total_time_ns: 8279846092285
                timed_stats:
                unmap_bytes: 8519680
                unmap_merged: 0
                unmap_operations: 80
                unmap_total_time_ns: 8489743
                wr_bytes: 150063085568
                wr_highest_offset: 203856613376
                wr_merged: 0
                wr_operations: 18890151
                wr_total_time_ns: 2260702715254
        scsi1:
                account_failed: 1
                account_invalid: 1
                failed_flush_operations: 0
                failed_rd_operations: 0
                failed_unmap_operations: 0
                failed_wr_operations: 0
                flush_operations: 7751500080
                flush_total_time_ns: 97396461756292
                idle_time_ns: 2647651201
                invalid_flush_operations: 0
                invalid_rd_operations: 0
                invalid_unmap_operations: 0
                invalid_wr_operations: 0
                rd_bytes: 18252376310784
                rd_merged: 0
                rd_operations: 1101222030
                rd_total_time_ns: 146464173786826
                timed_stats:
                unmap_bytes: 31794579197952
                unmap_merged: 0
                unmap_operations: 31736620
                unmap_total_time_ns: 18577765209709
                wr_bytes: 95234133180416
                wr_highest_offset: 1610612736000
                wr_merged: 0
                wr_operations: 8653252356
                wr_total_time_ns: 458536422695921
cpus: 4
disk: 0
diskread: 21112175590912
diskwrite: 95384196265984
maxdisk: 214748364800
maxmem: 17179869184
mem: 12368368782
name: gda-docker-01
netin: 19794311910763
netout: 17954818165965
nics:
        tap383i0:
                netin: 19794311910763
                netout: 17954818165965
pid: 2938
proxmox-support:
        pbs-dirty-bitmap: 1
        pbs-dirty-bitmap-migration: 1
        pbs-library-version: 1.0.2 (18d5b98ab1bec4004178a0db6f2debb83bfa9165)
        query-bitmap-info: 1
qmpstatus: running
running-machine: pc-i440fx-5.1+pve0
running-qemu: 5.1.0
status: running
template:
uptime: 5128447
vmid: 383

4. Console again, and this time it works fine.
 

Attachments

  • 383.jpg
how do you open the console (which button do you press) ?

again: if you have no serial terminal for the vm configured, do not open the xtermjs console, but the novnc console (or change your default in datacenter -> options)

VMs 3105 and 378 were not restarted. Neither has been restarted for 2 weeks (or more).
i can see from the logs of 3105 that it rebooted today (Apr 20) @ 10:22:18

so maybe you cannot reach the vm because it reboots itself?
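to check that, you could look at the boot history inside the vm, e.g. something along these lines (plain systemd/util-linux commands, nothing proxmox specific):

Code:
# inside the vm: boots known to journald (needs a persistent journal to go back further)
journalctl --list-boots
# reboot/shutdown entries from wtmp
last -x reboot shutdown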
 
The Console button from the picture. As you can see, I only have noVNC enabled.
I never open a serial terminal, because I have no way to do so.
I never think about which kind of console I open. I press Console and it opens.

3105 was not restarted by me. Maybe it restarted after the Console button was pressed the first time?

"Another host and VM, id 3105. It showed the problem on Friday and again today at 10:13 am when I tried before writing this. Let me try now while writing ... yes, the second try shows the correct screen in the console."

I pressed it at 10:13, then wrote that message, then pressed it a second time before "Today at 10:34" (when the message was submitted).
Is that good evidence that opening the console caused the restart, if it happened at 10:22? I remember I did not see the result at once but only after some moments ... after the second try.
 

Attachments

  • console.jpg
ok i think i know why this happens. for some reason the 'status' api call sometimes takes longer for you, and until that is finished, the ui allows xtermjs to open on the button click

i'll send a patch to the devel list

as a workaround you can set the default console to 'novnc' in 'datacenter -> options'

though i think this will not fix the issue that you cannot reach the vm via rdp/ssh.. for that you have to investigate more inside the vms
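for completeness, the same workaround should also be possible from the cli, assuming the 'console' option of /cluster/options accepts 'html5' for novnc as in the api docs:

Code:
# set novnc (html5) as the datacenter-wide default console viewer
pvesh set /cluster/options --console html5
# this ends up as 'console: html5' in /etc/pve/datacenter.cfg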
 
Hi,
I found 'serial' set in the options and replaced it with noVNC. I will report back if I run into the problem again. Thank you.

(Isn't it possible that the problem is connected to disk timeouts and can propagate to other hosts? Even if the disk is empty after moving all the images to other disks ...)
 
