CEPH - Disk 0 Bytes after hard resize

Discussion in 'Proxmox VE: Installation and configuration' started by watnow101, Dec 6, 2018 at 10:59.

  1. watnow101

    watnow101 New Member

    Joined:
    Apr 9, 2018
    Messages:
    5
    Likes Received:
    0
    Hi

    I have a 7-node cluster with Ceph RBD. I have a lot of existing VMs running on the cluster and decided to try out containers. I noticed that KRBD needs to be enabled to do that, so I ticked the KRBD box and added "Container" to the content types together with "Disk Image" on the RBD storage. I could then spin up containers successfully.

    upload_2018-12-6_11-32-32.png

    All was fine until I had to resize a hard disk on one of the existing VMs. Usually you can resize the hard disk and the change only applies after you power off the VM. Once I set the size increment and saved it, the VM immediately crashed (see pic below).

    upload_2018-12-6_11-38-59.png

    The VM was unable to boot, so I looked under RBD to check whether I could still see the disk. I could, but I noticed the hard disk showed 0 B.

    upload_2018-12-6_11-41-6.png

    Did I make a fatal error by doing all that? Will I be able to recover the data?
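    Before anything drastic, it may be worth checking what Ceph itself thinks of the image. A sketch of the usual diagnostic commands (the pool name "rbd" here is a placeholder; substitute your actual RBD pool):

    ```shell
    # List images in the pool to confirm the volume still exists
    rbd -p rbd ls | grep vm-2043-disk-1
    # Show the image header: size, features, block_name_prefix
    rbd -p rbd info vm-2043-disk-1
    # Show actual provisioned vs. used space for the image
    rbd -p rbd du vm-2043-disk-1
    ```

    If `rbd info` still reports the expected size, the 0 B shown in the GUI is likely a reporting problem rather than lost data.
    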

    Code:
    root@node4:~# pveversion -v
    proxmox-ve: 5.2-3 (running kernel: 4.15.18-9-pve)
    pve-manager: 5.2-12 (running version: 5.2-12/ba196e4b)
    pve-kernel-4.15: 5.2-12
    pve-kernel-4.15.18-9-pve: 4.15.18-30
    pve-kernel-4.15.18-1-pve: 4.15.18-19
    pve-kernel-4.15.17-1-pve: 4.15.17-9
    ceph: 12.2.8-pve1
    corosync: 2.4.4-pve1
    criu: 2.11.1-1~bpo90
    glusterfs-client: 3.8.8-1
    ksm-control-daemon: 1.2-2
    libjs-extjs: 6.0.1-2
    libpve-access-control: 5.1-2
    libpve-apiclient-perl: 2.0-5
    libpve-common-perl: 5.0-42
    libpve-guest-common-perl: 2.0-18
    libpve-http-server-perl: 2.0-11
    libpve-storage-perl: 5.0-32
    libqb0: 1.0.3-1~bpo9
    lvm2: 2.02.168-pve6
    lxc-pve: 3.0.2+pve1-5
    lxcfs: 3.0.2-2
    novnc-pve: 1.0.0-2
    proxmox-widget-toolkit: 1.0-20
    pve-cluster: 5.0-30
    pve-container: 2.0-30
    pve-docs: 5.2-10
    pve-edk2-firmware: 1.20181023-1
    pve-firewall: 3.0-14
    pve-firmware: 2.0-6
    pve-ha-manager: 2.0-5
    pve-i18n: 1.0-6
    pve-libspice-server1: 0.14.1-1
    pve-qemu-kvm: 2.12.1-1
    pve-xtermjs: 1.0-5
    qemu-server: 5.0-41
    smartmontools: 6.5+svn4324-1
    spiceterm: 3.0-5
    vncterm: 1.5-3
    zfsutils-linux: 0.7.12-pve1~bpo1
     


  2. Alwin

    Alwin Proxmox Staff Member
    Staff Member

    Joined:
    Aug 1, 2017
    Messages:
    1,737
    Likes Received:
    150
    On PVE 5.3 you don't need to set KRBD for containers. When you set KRBD, every started KVM machine will use the kernel client of Ceph to connect (mapped device); otherwise it runs through QEMU with librbd.

    Did you reboot the VM in question? Or did you just change the storage to KRBD and then make the resize right away? Do you see anything regarding the resize in the syslog/journal on the PVE node?

    EDIT: please update to PVE 5.3.
     
  3. watnow101

    I changed the storage to KRBD in the morning and later that afternoon I made the resize.

    This is what the syslog says:

    Code:
    Dec  5 15:55:28 node6 pvedaemon[3924629]: <root@pam> update VM 2043: resize --disk sata0 --size +100G
    Dec  5 15:55:28 node6 kernel: [8669263.134425] rbd1: detected capacity change from 107374182400 to 214748364800
    Dec  5 15:55:31 node6 pvedaemon[3924629]: VM 2043 qmp command failed - VM 2043 qmp command 'block_resize' failed - got timeout
    At that moment the VM crashed.
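    Converting the byte counts from that kernel line suggests the RBD-level resize itself went through (100 GiB to 200 GiB); what failed afterwards is the QMP 'block_resize' call into the running guest. A quick sanity check of the numbers:

    ```shell
    # Capacities reported by the kernel when the rbd device was resized
    old=107374182400   # bytes before the resize
    new=214748364800   # bytes after the resize
    # 1 GiB = 1073741824 bytes
    echo "$((old / 1073741824)) GiB -> $((new / 1073741824)) GiB"
    # prints: 100 GiB -> 200 GiB
    ```
    
    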


    I will update to PVE5.3

    Thank you
     
  4. Alwin

    Please post the 'qm config 2043'.

    And as an aside, you only need three monitors for quorum; more than that is only needed if you have thousands of Ceph nodes or clients. You can save the resources for your VMs. ;)
     
  5. watnow101

    Noted, thank you.

    Here is 'qm config 2043' - I removed the description and name.

    Code:
    bootdisk: sata0
    cores: 2
    cpu: qemu64
    description: 
    memory: 4096
    name: 
    net0: e1000=96:5A:D3:97:23:85,bridge=vmbr0
    ostype: l26
    smbios1: uuid=7ee88c89-97a5-44de-a826-3a0cbb684eda
    sockets: 1
    unused0: Node1-3:vm-2043-disk-1
    I also detached the drive, thinking that reattaching it might fix the issue, but I was unable to reattach it. This is the error I get when trying to add it.

    upload_2018-12-7_16-54-43.png
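    If the image still exists in the pool, re-adding the orphaned disk from the CLI sometimes works where the GUI fails. A sketch, using the VMID and the storage/volume name from the `unused0:` line above:

    ```shell
    # Have PVE rescan its storages and pick up orphaned disk images for this VM
    qm rescan --vmid 2043
    # Re-attach the volume as sata0 (storage:volume taken from the unused0 entry)
    qm set 2043 --sata0 Node1-3:vm-2043-disk-1
    ```
    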
     
  6. Alwin

    Which VM has the failed image? I see two different VMIDs on the screenshots. Do all your nodes have access to the Ceph cluster?
     
  7. watnow101

    Hi

    Yes, it happened to two VMs whose hard disks I resized. Sorry, I mixed the screenshots up. Here is 2043.

    upload_2018-12-7_17-20-47.png

    I am certain that all my nodes can access the Ceph cluster, as I have a running VM using the same RBD pool on Node4.
     
  8. Alwin

    Please check in the Ceph logs what might have caused the disk to "disappear". When you run 'rbd -p <pool> info vm-2043-disk-1', what does it show?
     