Windows Storage Issues after Migration from Xen

Aug 30, 2024
Hello everybody,

I have an issue with Windows VMs dropping disk I/O after migrating from XCP-ng to Proxmox.

Here is the output of pveversion -v on one of my PVE hosts:
Bash:
proxmox-ve: 8.2.0 (running kernel: 6.8.8-4-pve)
pve-manager: 8.2.4 (running version: 8.2.4/faa83925c9641325)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.8: 6.8.8-4
proxmox-kernel-6.8.8-4-pve-signed: 6.8.8-4
proxmox-kernel-6.8.8-3-pve-signed: 6.8.8-3
proxmox-kernel-6.8.4-2-pve-signed: 6.8.4-2
proxmox-kernel-6.5.13-5-pve: 6.5.13-5
ceph-fuse: 17.2.7-pve3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx9
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.1
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.3
libpve-access-control: 8.1.4
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.7
libpve-cluster-perl: 8.0.7
libpve-common-perl: 8.2.1
libpve-guest-common-perl: 5.1.4
libpve-http-server-perl: 5.1.0
libpve-network-perl: 0.9.8
libpve-rs-perl: 0.8.9
libpve-storage-perl: 8.2.3
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.2.7-1
proxmox-backup-file-restore: 3.2.7-1
proxmox-firewall: 0.5.0
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.6
proxmox-widget-toolkit: 4.2.3
pve-cluster: 8.0.7
pve-container: 5.1.12
pve-docs: 8.2.2
pve-edk2-firmware: 4.2023.08-4
pve-esxi-import-tools: 0.7.1
pve-firewall: 5.0.7
pve-firmware: 3.13-1
pve-ha-manager: 4.0.5
pve-i18n: 3.2.2
pve-qemu-kvm: 9.0.2-1
pve-xtermjs: 5.3.0-3
qemu-server: 8.2.3
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.4-pve1

I have done the migration in the following way:

1. I shut down the VM on the XCP-ng side
2. I exported the disk from XCP-ng to a Proxmox host in raw format
Code:
curl -u 'user:password' -X GET "ip_of_xcp-ng_host/export_raw_vdi?vdi=<uuidofxendisk>" -o new_disk.raw
3. I then created a new VM on the Proxmox side and imported the disk exported in step 2
Code:
qm importdisk <vmid> new_disk.raw MY_BLOCK_STORAGE
4. I attached the disk as an IDE device to be able to boot from it
5. I attached another disk as a SCSI device
6. I started the VM, was able to log in, and installed the VirtIO drivers
7. I shut down the VM, removed the second disk, then detached the first disk and reattached it as a SCSI drive
8. I changed the boot order accordingly, and the VM booted as expected
9. I then uninstalled the XenServer Tools and rebooted the VM again
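For reference, the export/import part of the steps above can be condensed into a sketch like the following. Host address, credentials, VDI UUID, VM ID, and storage name are the placeholders from my commands above, and DRY_RUN only prints the commands instead of running them:

```shell
#!/usr/bin/env bash
# Sketch of the disk migration described in steps 2 and 3 above.
# All names below are placeholders; adjust them for your environment.
set -euo pipefail

XCP_HOST='ip_of_xcp-ng_host'
XCP_CREDS='user:password'
VDI_UUID='<uuidofxendisk>'
VMID=145
STORAGE='MY_BLOCK_STORAGE'
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() { if [ "$DRY_RUN" -eq 1 ]; then echo "+ $*"; else "$@"; fi; }

# Step 2: export the raw disk image from the XCP-ng host
run curl -u "$XCP_CREDS" -X GET "$XCP_HOST/export_raw_vdi?vdi=$VDI_UUID" -o new_disk.raw

# Step 3: import the image into the target Proxmox storage
run qm importdisk "$VMID" new_disk.raw "$STORAGE"
```

With DRY_RUN left at 1 this just prints the two commands, which is handy for checking the placeholders before running anything.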


How did I notice the issue?
My monitoring solution started complaining about high ping latency and packet loss.
Code:
Mi 11. Sep 20:33:10 CEST 2024: 64 bytes from 10.64.10.14: icmp_seq=448 ttl=127 time=0.452 ms
Mi 11. Sep 20:33:11 CEST 2024: 64 bytes from 10.64.10.14: icmp_seq=449 ttl=127 time=0.400 ms
Mi 11. Sep 20:33:31 CEST 2024: 64 bytes from 10.64.10.14: icmp_seq=450 ttl=127 time=19148 ms
Mi 11. Sep 20:33:31 CEST 2024: 64 bytes from 10.64.10.14: icmp_seq=451 ttl=127 time=18149 ms
Mi 11. Sep 20:33:31 CEST 2024: 64 bytes from 10.64.10.14: icmp_seq=452 ttl=127 time=17149 ms
Mi 11. Sep 20:33:31 CEST 2024: 64 bytes from 10.64.10.14: icmp_seq=453 ttl=127 time=16148 ms
Mi 11. Sep 20:33:31 CEST 2024: 64 bytes from 10.64.10.14: icmp_seq=454 ttl=127 time=15148 ms
Mi 11. Sep 20:33:31 CEST 2024: 64 bytes from 10.64.10.14: icmp_seq=455 ttl=127 time=14148 ms
Mi 11. Sep 20:33:31 CEST 2024: 64 bytes from 10.64.10.14: icmp_seq=456 ttl=127 time=13147 ms
Mi 11. Sep 20:33:31 CEST 2024: 64 bytes from 10.64.10.14: icmp_seq=457 ttl=127 time=12147 ms
Mi 11. Sep 20:33:31 CEST 2024: 64 bytes from 10.64.10.14: icmp_seq=458 ttl=127 time=11147 ms
Mi 11. Sep 20:33:31 CEST 2024: 64 bytes from 10.64.10.14: icmp_seq=459 ttl=127 time=10147 ms
Mi 11. Sep 20:33:31 CEST 2024: 64 bytes from 10.64.10.14: icmp_seq=460 ttl=127 time=9146 ms
Mi 11. Sep 20:33:31 CEST 2024: 64 bytes from 10.64.10.14: icmp_seq=461 ttl=127 time=8146 ms
Mi 11. Sep 20:33:31 CEST 2024: 64 bytes from 10.64.10.14: icmp_seq=462 ttl=127 time=7146 ms
Mi 11. Sep 20:33:31 CEST 2024: 64 bytes from 10.64.10.14: icmp_seq=463 ttl=127 time=6145 ms
Mi 11. Sep 20:33:31 CEST 2024: 64 bytes from 10.64.10.14: icmp_seq=464 ttl=127 time=5146 ms
Mi 11. Sep 20:33:31 CEST 2024: 64 bytes from 10.64.10.14: icmp_seq=465 ttl=127 time=4146 ms
Mi 11. Sep 20:33:31 CEST 2024: 64 bytes from 10.64.10.14: icmp_seq=466 ttl=127 time=3146 ms
Mi 11. Sep 20:33:31 CEST 2024: 64 bytes from 10.64.10.14: icmp_seq=467 ttl=127 time=2146 ms
Mi 11. Sep 20:33:31 CEST 2024: 64 bytes from 10.64.10.14: icmp_seq=468 ttl=127 time=1146 ms
Mi 11. Sep 20:33:31 CEST 2024: 64 bytes from 10.64.10.14: icmp_seq=469 ttl=127 time=146 ms
Mi 11. Sep 20:33:32 CEST 2024: 64 bytes from 10.64.10.14: icmp_seq=470 ttl=127 time=0.377 ms
Mi 11. Sep 20:33:33 CEST 2024: 64 bytes from 10.64.10.14: icmp_seq=471 ttl=127 time=0.299 ms
Mi 11. Sep 20:33:34 CEST 2024: 64 bytes from 10.64.10.14: icmp_seq=472 ttl=127 time=0.343 ms
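To quantify how long these stalls last, a small helper can filter the log for round-trip times above a threshold. This assumes the log format shown above; the 1000 ms threshold in the usage comment is just an illustrative value:

```shell
# Print only ping log lines whose RTT exceeds a threshold in milliseconds.
# Relies on awk's numeric coercion: a field like "19148 ms" compares as 19148.
# Usage: slow_pings <threshold_ms> < ping.log
slow_pings() {
  awk -F'time=' -v t="$1" 'NF > 1 && $2 + 0 > t'
}
```

For the log above, `slow_pings 1000 < ping.log` would print exactly the burst of delayed replies around 20:33:31.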

So I started investigating my HPE Nimble, my switches, and my network cards, but I couldn't find any issues.

I then migrated a troublesome VM onto local-lvm storage and still saw the same packet loss and high latency.
I then suspected the network adapter and changed it from VirtIO (paravirtualized) to Intel E1000E, but the issue persists.
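One way I could double-check whether the stalls are storage rather than network would be to watch the kernel's pressure stall information on the PVE host during an incident: if the "full" I/O pressure rises while pings hang, the guest is blocked on disk. A small sketch for pulling the 10-second average out of /proc/pressure/io (assumes a kernel with PSI enabled, i.e. 4.20 or newer):

```shell
# Extract the 10-second "full" I/O stall average from PSI output.
# /proc/pressure/io has two lines: "some avg10=... ..." and "full avg10=... ...".
psi_io_full_avg10() {
  awk '/^full/ { sub(/^avg10=/, "", $2); print $2 }'
}

# e.g. on the host, sampled once per second during an incident:
#   while sleep 1; do psi_io_full_avg10 < /proc/pressure/io; done
```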

So my only logical conclusion is that it might be some kind of leftover Xen driver causing these issues.

I searched and found this discussion on the Proxmox forums and tried cleaning up all the Xen drivers.
Unfortunately, I still have these issues.

The issue is inconsistent and varies in severity across the different Windows VMs.

Here is the VM config of a particularly troublesome VM:
Code:
agent: 1
bios: ovmf
boot: order=scsi0;net0;ide2
cores: 8
cpu: x86-64-v3
efidisk0: CC-ST03-OS-PVE:vm-145-disk-0,efitype=4m,pre-enrolled-keys=1,size=528K
ide2: CC-ST04-ISO:iso/virtio-win-0.1.262.iso,media=cdrom,size=708140K
machine: pc-q35-9.0
memory: 32768
meta: creation-qemu=9.0.2,ctime=1724683317
name: VM NAME
net0: virtio=BC:24:11:6E:2D:43,bridge=CORCLOUD,firewall=1
numa: 0
ostype: win10
scsi0: CC-ST03-OS-DEV:vm-145-disk-0,aio=native,discard=on,iothread=1,size=100G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=2e36ec73-7afa-4c85-aa59-7fbe8cec61f0
sockets: 1
vmgenid: 184e62e4-f888-4045-a12b-d56af1215bc3

I think it is something related to the storage drivers, but I am out of ideas as to what more I could test or check.
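One storage-side variation I have not tried yet: the disk currently runs with aio=native and an IO thread. Switching to aio=threads (and optionally dropping the iothread) is sometimes suggested when guests stall on I/O, so it might be worth a test. The scsi0 line would then look like this (same disk, only the aio and iothread flags changed; this is just an idea to test, not a known fix):

```
scsi0: CC-ST03-OS-DEV:vm-145-disk-0,aio=threads,discard=on,size=100G,ssd=1
```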

Does anybody have an idea?

Thanks and best regards
 

Attachment: worst_vm.png (251.6 KB)